rep movs \ VOGONS

rep movs

First post, by ih8registrations

Posted on 2008-05-05, 18:22

ih8registrations Offline

Rank: Oldbie
Posts: 931
Joined: 2003-07-25, 17:20

Has there been any thought of optimizing the decode?

Like:

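    mov ax, cx      ; save the byte count
    shr cx, 1       ; count / 2 = number of words
    rep movsw       ; copy the words
    mov cx, ax      ; restore the byte count
    and cx, 1       ; 0 or 1 leftover byte
    rep movsb       ; copy the leftover byte, if any

or

    mov ax, cx      ; save the byte count
    shr cx, 1       ; count / 2 = number of words
    rep movsw       ; copy the words
    and ax, 1       ; was the count odd?
    jz itseven
    movsb           ; copy the single leftover byte
    itseven:

etc.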

instead of rep movsb.
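
For readers who prefer C++ to assembly, the split above can be sketched as follows. This is only an illustration of the idea, not DOSBox code: `copy_as_words` is a made-up name, and the sketch assumes the direction flag is clear (forward copy) and that the two buffers do not overlap.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not DOSBox code): copy `count` bytes the way the
// proposed sequence would, as count/2 word moves plus one trailing byte.
// Assumes a forward copy and non-overlapping buffers.
inline void copy_as_words(std::uint8_t* dst, const std::uint8_t* src, std::size_t count) {
    std::size_t words = count >> 1;            // shr cx, 1
    for (std::size_t i = 0; i < words; ++i) {  // rep movsw
        std::memcpy(dst, src, 2);
        dst += 2;
        src += 2;
    }
    if (count & 1) {                           // and ax, 1 / jz itseven
        *dst = *src;                           // movsb
    }
}
```

One caveat on equivalence: when the destination overlaps the source (for example destination = source + 1), rep movsb copies byte by byte and so replicates the first byte through the buffer, which a word-sized copy does not reproduce. Any such rewrite would have to rule that case out first.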

Reply 1 of 7, by Harekiet

Posted on 2008-05-06, 09:51

Harekiet Offline

Rank: DOSBox Author
Posts: 1050
Joined: 2002-07-01, 07:14
Location: Fryslan

Well, the question would be whether DOS programmers ever used rep movsb when they wanted speed. It would seem they would also use movsw or movsd as much as possible for maximum speed. Your average memcpy routine in DOS would probably already do a check like that.

Reply 2 of 7, by ih8registrations

Posted on 2008-05-06, 12:20

Could use qword movs, which they didn't have back then.

    mov ax, cx
    shr cx, 3
    rep movsq
    mov cx, ax
    and cx, 7
    rep movsb
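
The same arithmetic, written out as a rough C++ sketch: the count is divided by 8 for the wide moves and the low three bits give the byte tail. `copy_as_qwords` is a hypothetical name used only for illustration, and the sketch again assumes a forward copy over non-overlapping buffers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not DOSBox code): copy `count` bytes as count/8
// eight-byte moves followed by count%8 single-byte moves.
// Assumes a forward copy and non-overlapping buffers.
inline void copy_as_qwords(std::uint8_t* dst, const std::uint8_t* src, std::size_t count) {
    std::size_t qwords = count >> 3;             // shr cx, 3
    std::size_t tail   = count & 7;              // and cx, 7
    for (std::size_t i = 0; i < qwords; ++i) {   // rep movsq
        std::uint64_t chunk;
        std::memcpy(&chunk, src, sizeof chunk);  // alignment-safe 8-byte load
        std::memcpy(dst, &chunk, sizeof chunk);  // alignment-safe 8-byte store
        src += 8;
        dst += 8;
    }
    for (std::size_t i = 0; i < tail; ++i) {     // rep movsb
        *dst++ = *src++;
    }
}
```

For a count of 19, this does two 8-byte moves and three single-byte moves.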

Reply 3 of 7, by ih8registrations

Posted on 2008-05-08, 01:59

Trying a test implementation: the movsb case in decoder.h. What are the input variables to gen_shift_byte_imm?

    /* MOVSB/W/D */
    case 0xa4:
        if (decode.rep) {
            LOG_MSG("rep str_movsb");
            dyn_push(DREG(ECX));
            gen_shift_byte_imm(?,?,?,2);   // shr cx, 2
            // cache_addb(0x51);           // PUSH ECX
            // gen_preloadreg(DREG(ECX));
            // DynRegs[G_ECX].flags|=DYNFLG_CHANGED;
            // cache_addw(0xe9c1);         // shr ecx, 2
            // cache_addb(0x2);
            dyn_string(STR_MOVSD);         // rep movsd
            dyn_pop(DREG(ECX));
            // cache_addb(0x59);           // POP ECX
            gen_dop_word_imm(DOP_AND,false,DREG(ECX),3); // and ecx, 3
        }
        dyn_string(STR_MOVSB);
        break;

Reply 4 of 7, by wd

Posted on 2008-05-08, 07:39

wd Offline

Rank: DOSBox Author
Posts: 10813
Joined: 2003-12-03, 21:23

You need gen_shift_word_imm (cx -> _word_imm). Its parameters are:

    Bitu op, bool dword, DynReg * dr1, Bit8u imm

op is the type of shift/rotate
dword decides if the target reg is word/dword (false here)
dr1 is the register (DREG(ECX) in your case)
imm is the number of bits to shift/rotate by (2 here)
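
As a side illustration, the commented-out bytes in the test code from Reply 3 (0xe9c1 followed by 0x2) are the hand-assembled form of shr ecx, 2. The sketch below shows how that encoding is built from a shift type, a register and an immediate. It is a standalone example, not the DOSBox function: `encode_shift_imm32` and the `ShiftOp` enum are hypothetical names, and whether the op argument of gen_shift_word_imm uses the same numbering as the x86 encoding is an assumption that should be checked against the DOSBox source.

```cpp
#include <array>
#include <cstdint>

// x86 "group 2" sub-opcodes, as stored in the reg field of the ModRM byte.
enum ShiftOp : std::uint8_t { ROL = 0, ROR = 1, RCL = 2, RCR = 3, SHL = 4, SHR = 5, SAR = 7 };

// Hypothetical encoder (not the DOSBox function): emit "op r32, imm8"
// using opcode 0xC1 /op ib. `reg` is the x86 register number (ECX = 1).
inline std::array<std::uint8_t, 3> encode_shift_imm32(ShiftOp op, std::uint8_t reg, std::uint8_t imm) {
    // ModRM: mod = 11 (register operand), reg = shift type, rm = target register.
    std::uint8_t modrm = static_cast<std::uint8_t>(0xC0 | (op << 3) | (reg & 7));
    return std::array<std::uint8_t, 3>{{0xC1, modrm, imm}};
}
```

Calling `encode_shift_imm32(SHR, 1, 2)` yields the bytes C1 E9 02, matching the cache_addw(0xe9c1) / cache_addb(0x2) pair once the little-endian word is written out.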

As Harekiet already noted, it's unlikely that rep movsb code is in time-critical
routines, and last time I checked it wasn't used a lot in the games that run
slow in DOSBox at the moment.

Reply 5 of 7, by ih8registrations

Posted on 2008-05-08, 08:24

movsw and movsd can be improved with movsq, but since that's not implemented yet, I'm getting the rest worked out by testing on movsb. movsb gets used to mop up odd bytes in later games, so if it were to be optimized it'd need a count check added to skip the optimization when the count is small.
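
A count check of that kind might look roughly like this. The cutoff value and all names here are placeholders for illustration; the thread does not say what threshold would actually pay off.

```cpp
#include <cstddef>

// Illustrative cutoff only; the thread does not give a number.
constexpr std::size_t kMinWideCount = 16;

enum class CopyPath { ByteLoop, WideLoop };

// Hypothetical helper: keep the plain byte loop for short "mop-up" copies
// and use the wide-move split only when the count is large enough.
inline CopyPath choose_copy_path(std::size_t count) {
    return count < kMinWideCount ? CopyPath::ByteLoop : CopyPath::WideLoop;
}
```

With this cutoff, a 3-byte mop-up stays on the byte loop, while a 320-byte copy takes the wide path.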

Reply 6 of 7, by wd

Posted on 2008-05-08, 10:43

If you wanna use the above code for more than just (speed) tests,
you should definitely get rid of the push/pop pair as that messes up
the stack if the rep movsb triggers a page fault.

Reply 7 of 7, by ih8registrations

Posted on 2008-05-17, 22:54

Just an update. The hangup for doing this is there's something in dyn_string that causes DOSBox to lock up, perhaps from trying to auto load data for decoding.
