VOGONS


rep movs

First post, by ih8registrations

Rank Oldbie

Has there been any thought of optimizing the decode?

Like:

mov ax, cx
shr cx, 1
rep movsw
mov cx, ax
and cx, 1
rep movsb

or

mov ax, cx
shr cx, 1
rep movsw
and ax,1
jz itseven
movsb
itseven:

etc.

instead of rep movsb.
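
For illustration, here is a minimal C sketch of the same split outside the emulator. The helper names `copy_bytes` and `copy_words_then_byte` are made up for this example and are not DOSBox functions, and the sketch assumes the source and destination do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time forward copy, mirroring REP MOVSB with DF=0. */
static void copy_bytes(uint8_t *dst, const uint8_t *src, size_t count) {
    assert(dst != NULL && src != NULL);
    while (count--) *dst++ = *src++;
}

/* Hypothetical helper: the same copy done as count/2 word moves plus an
   optional trailing byte. Assumes the two buffers do not overlap. */
static void copy_words_then_byte(uint8_t *dst, const uint8_t *src, size_t count) {
    assert(dst != NULL && src != NULL);
    size_t words = count >> 1;          /* shr cx, 1 */
    while (words--) {                   /* rep movsw */
        uint16_t w;
        memcpy(&w, src, sizeof w);
        memcpy(dst, &w, sizeof w);
        src += 2;
        dst += 2;
    }
    if (count & 1) *dst = *src;         /* and ax, 1 / jz itseven / movsb */
}
```

With overlapping buffers (for example dst = src + 1) the byte loop propagates a single byte through the destination while the word version does not, so the two are only interchangeable when the regions are disjoint.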

Reply 1 of 7, by Harekiet

Rank DOSBox Author

Well, the question is whether DOS programmers ever used rep movsb when they wanted speed. It seems they would have used movsw or movsd as much as possible for maximum speed. Your average memcpy routine in DOS would probably already do a check like that.

Reply 3 of 7, by ih8registrations

Rank Oldbie

I'm trying a test implementation in the movsb case in decoder.h. What are the input parameters to gen_shift_byte_imm?

	/* MOVSB/W/D */
	case 0xa4:
		if (decode.rep) {
			LOG_MSG("rep str_movsb");
			dyn_push(DREG(ECX));
			gen_shift_byte_imm(?,?,?,2); // shr cx, 2
//			cache_addb(0x51); // PUSH ECX
//			gen_preloadreg(DREG(ECX));
//			DynRegs[G_ECX].flags|=DYNFLG_CHANGED;
//			cache_addw(0xe9c1); // shr ecx, 2
//			cache_addb(0x2);
			dyn_string(STR_MOVSD); // rep movsd
			dyn_pop(DREG(ECX));
//			cache_addb(0x59); // POP ECX
			gen_dop_word_imm(DOP_AND,false,DREG(ECX),3); // and ecx, 3
		}
		dyn_string(STR_MOVSB);
		break;

Reply 4 of 7, by wd

Rank DOSBox Author

You need gen_shift_word_imm (cx -> _word_imm).

Bitu op,bool dword,DynReg * dr1,Bit8u imm

op is the type of shift/rotate
dword decides if the target reg is word/dword (false here)
dr1 is the register (DREG(ECX) in your case)
imm is the shift/rotate count (2 here)
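
As a rough cross-check of the op value, the commented-out bytes in the test code (c1 e9 02) can be rebuilt from the standard x86 encoding "C1 /digit ib". The sketch below is plain C with made-up helper names (`shift_modrm`, `encode_shift_imm`); it is not DOSBox code, and whether gen_shift_word_imm expects op to be exactly this /digit value is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* x86 register numbers as used in the ModRM r/m field. */
enum { REG_EAX = 0, REG_ECX = 1, REG_EDX = 2, REG_EBX = 3 };

/* /digit values of the C1 shift/rotate group. */
enum { SH_ROL = 0, SH_ROR = 1, SH_RCL = 2, SH_RCR = 3,
       SH_SHL = 4, SH_SHR = 5, SH_SAR = 7 };

/* Hypothetical helper: ModRM byte for a register-direct "C1 /op reg, imm8". */
static uint8_t shift_modrm(unsigned op, unsigned reg) {
    assert(op < 8 && reg < 8);
    return (uint8_t)(0xC0 | (op << 3) | reg);
}

/* Hypothetical helper: writes the three-byte encoding into out[0..2]. */
static void encode_shift_imm(uint8_t out[3], unsigned op, unsigned reg, uint8_t imm) {
    out[0] = 0xC1;                  /* shift/rotate group, imm8 form */
    out[1] = shift_modrm(op, reg);  /* mod=11, reg=/digit, r/m=register */
    out[2] = imm;                   /* shift count */
}
```

Calling encode_shift_imm(buf, SH_SHR, REG_ECX, 2) gives the same three bytes that the cache_addw(0xe9c1) / cache_addb(0x2) pair would emit, which suggests SHR corresponds to 5 under that assumption.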

As Harekiet already noted, it's unlikely that rep movsb code is in time-critical
routines, and last time I checked it wasn't used a lot in the games that run
slow in DOSBox at the moment.

Reply 5 of 7, by ih8registrations

Rank Oldbie

movsw and movsd could be improved with movsq, but since that isn't implemented yet, I'm working out the rest by testing on movsb. Later games use movsb to mop up odd bytes, so if it were optimized it would need a count check added to skip the optimization when the count is small.
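
A minimal C sketch of that count check, outside the emulator. The function name `copy_with_small_check` and the cutoff of 8 bytes are hypothetical, picked only for illustration; the right threshold would have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff: below it the split is assumed not worth the setup. */
#define SMALL_COPY_THRESHOLD 8

/* Copies count bytes forward. Large counts go through a dword loop first;
   small counts fall straight through to the byte loop. Assumes the
   buffers do not overlap. */
static void copy_with_small_check(uint8_t *dst, const uint8_t *src, size_t count) {
    assert(dst != NULL && src != NULL);
    if (count >= SMALL_COPY_THRESHOLD) {
        size_t dwords = count >> 2;     /* shr ecx, 2 */
        while (dwords--) {              /* rep movsd */
            uint32_t d;
            memcpy(&d, src, sizeof d);
            memcpy(dst, &d, sizeof d);
            src += 4;
            dst += 4;
        }
        count &= 3;                     /* and ecx, 3 */
    }
    while (count--) *dst++ = *src++;    /* rep movsb for what is left */
}
```

A 3-byte mop-up copy skips the dword path entirely, while an 11-byte copy does two dword moves followed by three byte moves.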

Reply 6 of 7, by wd

Rank DOSBox Author

If you want to use the above code for more than just (speed) tests,
you should definitely get rid of the push/pop pair, as that messes up
the stack if the rep movsb triggers a page fault.