superfury wrote: Edit: What about the other (un)conditional jump instructions and call instructions? Do they delay the BIU 6 cycles as well?
Taken conditional jumps and LOOPs (including JCXZ): 6 cycles.
Near/short JMP: 6 cycles.
Indirect JMP (i.e. "JMP CX"): 3 cycles.
Indirect CALL (i.e. "CALL CX") and near CALL: 10 cycles, of which the last 4 are the prefetch of the instruction at the destination.
Far JMP: 4 cycles.
MOV [iw],accum: 2 cycles.
Far CALL: 5 cycles before the first stack store, 9 cycles before the second stack store (note that the prefetch of the destination instruction takes the last 4 of these 9 cycles).
OUT DX,accum and IN accum,DX: no delay except for the 1-cycle wait state.
PUSH rw, PUSH segreg, PUSHF: 2 cycles before stack operation.
MOVSB, MOVSW: 3 cycles between load and store.
REP MOVSB, REP MOVSW: same, also 6 cycles between each load/store pair (0 between halves of a word load/store).
REP STOSB, REP STOSW: 6 cycles between each store (0 between halves of a word store).
REP LODSB, REP LODSW: 9 cycles between each load (0 between halves of a word load).
RET: 3 cycles between stack store and first prefetch at destination.
RET iw: 2 cycles before stack store, 4 cycles between stack store and first prefetch at destination.
XLATB: 2 cycles before load, 2 cycles after.
ADD B[SI],AL: 2 cycles before read, 3 cycles before write.
ADD AL,B[SI]: 2 cycles before read.
CMP [SI],accum: 2 cycles before read.
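For anyone wiring these numbers into an emulator, here is a minimal sketch of how the single-instruction figures above could be kept in a lookup table. The key names and the biu_delays() helper are made up for illustration and are not taken from any existing emulator; the cycle counts are copied straight from the list, with multi-phase entries stored in the order the list gives them.

```python
# Hypothetical table of BIU delay cycles per instruction class.
# Values are copied from the list above; the key names are invented here.
BIU_DELAYS = {
    "jcc_taken": (6,),              # taken conditional jump
    "loop_taken": (6,),             # LOOPs, including JCXZ
    "jmp_near_short": (6,),
    "jmp_indirect": (3,),           # e.g. JMP CX
    "call_near_or_indirect": (10,), # last 4 are the destination prefetch
    "jmp_far": (4,),
    "mov_mem_accum": (2,),          # MOV [iw],accum
    "call_far": (5, 9),             # before 1st / before 2nd stack store
    "push": (2,),                   # PUSH rw, PUSH segreg, PUSHF
    "movs": (3,),                   # between load and store
    "xlat": (2, 2),                 # before load, after load
    "add_mem_reg": (2, 3),          # before read, before write
    "add_reg_mem": (2,),            # before read
    "cmp_mem_accum": (2,),          # before read
}


def biu_delays(kind):
    """Return the tuple of delay cycles recorded for an instruction class."""
    try:
        return BIU_DELAYS[kind]
    except KeyError:
        # Only the classes listed in the post are covered.
        raise ValueError(f"no measured delay for {kind!r}") from None


if __name__ == "__main__":
    for name in ("jmp_far", "call_far", "add_mem_reg"):
        print(name, biu_delays(name))
```

Keeping each entry as a tuple leaves room for instructions with more than one delay phase without guessing how the phases add up.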
I've attached a file showing sniffer logs of all these and more (but not an exhaustive list). Note that some may be different for other bus states.
The attachment sniffer_timings.txt is no longer available
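As a rough worked example for the REP string entries above, the sketch below adds up only the idle gaps between bus accesses over a given number of iterations. It assumes one reading of the list: for REP MOVS, 3 cycles between the load and store of each iteration and 6 cycles between the store of one iteration and the load of the next. It does not count the bus cycles themselves, any start-up or finish overhead of the REP, or the zero-cycle gap between the halves of a word access. The function name and the "movs"/"stos"/"lods" labels are hypothetical.

```python
def rep_gap_cycles(op, count):
    """Sum the idle cycles between bus accesses for a REP string op.

    Only the gaps quoted in the list are counted, not the accesses
    themselves. `op` is one of "movs", "stos", "lods" (invented labels).
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    if op == "movs":
        # 3 cycles inside each load/store pair, 6 between consecutive pairs.
        return 3 * count + 6 * (count - 1)
    if op == "stos":
        # 6 cycles between consecutive stores.
        return 6 * (count - 1)
    if op == "lods":
        # 9 cycles between consecutive loads.
        return 9 * (count - 1)
    raise ValueError(f"unknown string op {op!r}")


if __name__ == "__main__":
    for op in ("movs", "stos", "lods"):
        print(op, rep_gap_cycles(op, 4))
```

With 4 iterations this gives 3*4 + 6*3 = 30 gap cycles for REP MOVS, 6*3 = 18 for REP STOS and 9*3 = 27 for REP LODS, under the assumptions stated above.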