VOGONS


Reply 20 of 20, by ih8registrations

Rank Oldbie

The worst case for 64-bit compared to 32-bit should be breaking even, if that. Most transfers are large enough to contain at least one 64-bit transfer, usually followed by a word or a dword, if not both. Odd byte counts are rare, and that branch is predicted as unlikely with gcc_unlikely. So the usual (or average) worst case is one conditional jump that may be mispredicted, and the savings from a single 64-bit transfer should easily negate that. That leaves most of the compares falling through, just as they would if they were iterations of the for loop. Tada:)
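To make the shape of that concrete, here is a rough sketch of the idea, not the actual patch: a qword loop followed by dword, word and byte tails, with only the odd-byte test hinted as unlikely. The names sketch_copy and SKETCH_UNLIKELY are made up for illustration, and plain memcpy stands in for the real memory handlers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a GCC/Clang-style branch hint. The macro name is illustrative only. */
#if defined(__GNUC__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

/* Hypothetical block copy: as many 64-bit moves as fit, then at most one
 * dword, one word and one byte for the remainder. */
void sketch_copy(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    while (n >= 8) {                 /* bulk of the transfer: 64 bits at a time */
        uint64_t v;
        memcpy(&v, src, 8);
        memcpy(dst, &v, 8);
        src += 8; dst += 8; n -= 8;
    }
    /* Here n < 8, so bits 2, 1 and 0 of n describe the whole tail. */
    if (n & 4) {                     /* one dword left over */
        uint32_t v;
        memcpy(&v, src, 4);
        memcpy(dst, &v, 4);
        src += 4; dst += 4;
    }
    if (n & 2) {                     /* one word left over */
        uint16_t v;
        memcpy(&v, src, 2);
        memcpy(dst, &v, 2);
        src += 2; dst += 2;
    }
    if (SKETCH_UNLIKELY(n & 1)) {    /* odd byte counts are rare */
        *dst = *src;
    }
}
```

For a 23 byte transfer this does two qword moves, then one dword, one word and one byte, so the tail is three compares at most no matter how large the transfer is.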

Just to clarify: stopping at dword would mean more handler calls, not fewer, and not really fewer changes either. For most transfers as described, it swaps one failed conditional jump for two conditional jumps that fall through, and one qword handler call for two dword handler calls. Note that a "handler call" entails "{ *writed++=phys_readd(pagemap[(pt>>12)&mask]*4096 + (pt & 4095)); pt+=4; }": shifts, adds, a multiply, ands, moves, a function call, and so on, versus the cost of one mispredicted jump.
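A toy model of the handler call count, for anyone who wants to see it rather than take my word for it. Everything prefixed sk_ is invented for the example (the tiny page map, the fake physical memory, the call counter); only the address arithmetic follows the expression quoted above. It assumes the transfer is a multiple of 8 bytes and stays inside one page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SK_PAGES = 4, SK_PAGE_SIZE = 4096 };

uint8_t  sk_phys[SK_PAGES * SK_PAGE_SIZE];       /* toy physical memory */
uint32_t sk_pagemap[SK_PAGES] = { 2, 0, 3, 1 };  /* toy linear-to-physical page map */
const uint32_t sk_mask = SK_PAGES - 1;
unsigned sk_calls;                               /* handler calls made so far */

/* Same shape as the quoted expression: page lookup, multiply, offset add. */
uint32_t sk_translate(uint32_t pt)
{
    return sk_pagemap[(pt >> 12) & sk_mask] * 4096u + (pt & 4095u);
}

/* Hypothetical dword handler: one translation per 4 bytes. */
uint32_t sk_readd(uint32_t pt)
{
    uint32_t v;
    sk_calls++;
    memcpy(&v, &sk_phys[sk_translate(pt)], 4);
    return v;
}

/* Hypothetical qword handler: one translation per 8 bytes. */
uint64_t sk_readq(uint32_t pt)
{
    uint64_t v;
    sk_calls++;
    memcpy(&v, &sk_phys[sk_translate(pt)], 8);
    return v;
}

/* Copy n bytes with dword handlers only; returns the number of handler calls. */
unsigned sk_copy_dwords(uint8_t *dst, uint32_t pt, size_t n)
{
    assert(n % 8 == 0 && (pt & 4095u) + n <= 4096u);
    sk_calls = 0;
    for (; n >= 4; n -= 4, pt += 4, dst += 4) {
        uint32_t v = sk_readd(pt);
        memcpy(dst, &v, 4);
    }
    return sk_calls;
}

/* Copy n bytes with qword handlers only; returns the number of handler calls. */
unsigned sk_copy_qwords(uint8_t *dst, uint32_t pt, size_t n)
{
    assert(n % 8 == 0 && (pt & 4095u) + n <= 4096u);
    sk_calls = 0;
    for (; n >= 8; n -= 8, pt += 8, dst += 8) {
        uint64_t v = sk_readq(pt);
        memcpy(dst, &v, 8);
    }
    return sk_calls;
}
```

For a 64 byte transfer, sk_copy_dwords makes 16 handler calls and sk_copy_qwords makes 8, and each call saved is one full round of shift, and, lookup, multiply, add and call.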