blit patch \ VOGONS

First post, by ih8registrations

Posted on 2008-05-21, 10:29

ih8registrations Offline

Rank: Oldbie
Posts: 931
Joined: 2003-07-25, 17:20

Memory, scaler, and DMA blitting.

Attachment: blits.diff (11.39 KiB, 344 downloads; file license: Fair use/fair dealing exception)

Reply 1 of 10, by ih8registrations

Posted on 2008-05-21, 15:00

Any issues keeping this from being added to CVS? I'm always curious why things aren't added.

Reply 2 of 10, by wd

Posted on 2008-05-21, 17:13

wd Offline

Rank: DOSBox Author
Posts: 10813
Joined: 2003-12-03, 21:23

Doesn't gain anything, adds more code, and has issues?
Like the string copy can trigger pagefaults when it shouldn't.

Reply 3 of 10, by ih8registrations

Posted on 2008-05-21, 18:10

It gains some, it adds more code, and I'm missing where the string copy triggers a pagefault.

Reply 4 of 10, by wd

Posted on 2008-05-21, 18:15

> It gains some

Not relevant amounts, so why bother.

> it adds more code

Uglifies places that are quite straightforward and readable.

> and I'm missing where the string copy triggers a pagefault.

mem_strlen, not the copy.
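
The thread does not spell out how mem_strlen could fault. One plausible reading, and it is only an assumption, is that a string scan which fetches several bytes at a time can touch bytes past the terminator, and those bytes may sit in a page that is not mapped. The sketch below shows that effect on a toy memory model; `ToyMem`, `toy_readb`, and both scan functions are hypothetical names, not DOSBox code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 16u   /* toy page size, kept tiny so the effect is easy to see */
#define NUM_PAGES 4u

/* Hypothetical toy memory, not DOSBox's paging code. */
typedef struct {
    uint8_t bytes[PAGE_SIZE * NUM_PAGES];
    int     present[NUM_PAGES];  /* 1 if the page is mapped */
    int     faults;              /* counts reads that hit an unmapped page */
} ToyMem;

static uint8_t toy_readb(ToyMem *m, size_t addr) {
    assert(addr < sizeof m->bytes);
    if (!m->present[addr / PAGE_SIZE]) m->faults++;
    return m->bytes[addr];
}

/* Byte-wise scan: never reads beyond the terminator. */
static size_t strlen_bytewise(ToyMem *m, size_t addr) {
    size_t n = 0;
    while (toy_readb(m, addr + n) != 0) n++;
    return n;
}

/* Chunked scan: fetches 8 bytes first, then looks for the terminator,
   so it can read up to 7 bytes past the end of the string. */
static size_t strlen_chunked(ToyMem *m, size_t addr) {
    size_t n = 0;
    for (;;) {
        uint8_t chunk[8];
        for (size_t i = 0; i < 8; i++) chunk[i] = toy_readb(m, addr + n + i);
        for (size_t i = 0; i < 8; i++)
            if (chunk[i] == 0) return n + i;
        n += 8;
    }
}
```

With a short string that ends just before a page boundary and an unmapped page after it, both scans return the same length, but only the chunked one touches the unmapped page.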

Reply 5 of 10, by ih8registrations

Posted on 2008-05-21, 19:18

mem_strlen could be done without; I've only seen it called at startup. I did it because I could, using the same algorithm in string copy.

Reply 6 of 10, by ih8registrations

Posted on 2008-05-21, 19:36

Profiling shows the time is spread out, so improvement is going to come from many small optimizations rather than from something like threading. I have another patch here, optimized with 64-bit decoding and gcc_unlikely path hints, which brings another small bump.
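
For readers unfamiliar with the "unlikely path" idea: GCC provides `__builtin_expect`, which lets code mark a branch as rarely taken so the compiler can lay out the common path first. The following is a minimal sketch under that assumption; the macro name `HINT_UNLIKELY` and the function `sum_until_marker` are made up for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative macro: falls back to a plain expression on other compilers. */
#if defined(__GNUC__)
#define HINT_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define HINT_UNLIKELY(x) (x)
#endif

/* Sums bytes until an 0xFF marker or the end of the buffer.
   Hitting the marker is treated as the rare case and hinted as such. */
static uint32_t sum_until_marker(const uint8_t *buf, size_t len) {
    uint32_t sum = 0;
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (HINT_UNLIKELY(buf[i] == 0xFF)) break;  /* rare exit path */
        sum += buf[i];
    }
    return sum;
}
```

The hint changes only code layout, never the result, which is why gains from it tend to be small and depend on the workload.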

Reply 7 of 10, by wd

Posted on 2008-05-21, 21:26

> which brings another small bump

Well the problem at this stage is that adding complexity for very small speed
gains makes other optimizations/rewrites/changes harder, so they're not
useful imo as they are not noticeable on regular PCs, and on low-powered
devices you got pretty much different problems anyways.
But that's only my humble opinion of course.

Reply 8 of 10, by wd

Posted on 2008-05-23, 19:54

The scaler BituMove might be interesting and is easy enough (not sure if
the 16byte alignment is fine though). Did you profile some stuff with that?
Especially default modes (320x200 games with normal2x scaler).

Reply 9 of 10, by ih8registrations

Posted on 2008-05-23, 21:38

Yeah, long ago. BTW, I'm on an Athlon XP with only 256 KB of L2 cache. In my testing, some games cycle between the three cases (size <8, size >=8, and size >=8 with a remainder in tmp), but most hit one case exclusively or most of the time. To reduce code size for CPUs with even less cache, the <8 case could be changed to a byte-repeat loop instead of the if dword, if word, if byte tests.
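
The attached patch is not reproduced in the thread, so the exact shape of the <8 case is an assumption. As a rough sketch of the trade-off described above, here are the two variants on a plain buffer; `copy_small_tests` and `copy_small_loop` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Variant A: test for a dword, then a word, then a byte.
   At most three moves, but more code. Requires n < 8. */
static void copy_small_tests(uint8_t *dst, const uint8_t *src, size_t n) {
    assert(n < 8);
    if (n & 4) { memcpy(dst, src, 4); dst += 4; src += 4; }
    if (n & 2) { memcpy(dst, src, 2); dst += 2; src += 2; }
    if (n & 1) { *dst = *src; }
}

/* Variant B: plain byte loop. Up to seven moves, but the smallest code. */
static void copy_small_loop(uint8_t *dst, const uint8_t *src, size_t n) {
    assert(n < 8);
    while (n--) *dst++ = *src++;
}
```

Both variants copy the same bytes for every n from 0 to 7; which one is faster would have to be measured on the target CPU.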

It could be further shrunk by having the remainder fall through, using the <8 loop, like so:

static void DMA_BlockRead(PhysPt pt,void * data,Bitu size) {
    Bit32u page=pt>>12;
    Bit32u * pagemap;
    Bit32u mask;
    /* mask=~0 indexes pagemap by page number; mask=0 pins the index to the single entry 'page' */
    if (page < LINK_START) { Bit32u pageend=(pt+size)>>12; pagemap=pmap[pageend < EMM_PAGEFRAME4K]; mask=~0; }
    else { pagemap=&page; mask=0; }

    Bit64u * writeq=(Bit64u *) data;
    if (size>=8) {
        Bit8u tmp=size&0x07;    /* remainder bytes after the 8-byte chunks */
        size>>=3;               /* number of 8-byte chunks */
        do {
            *writeq++=phys_readq(pagemap[(pt>>12)&mask]*4096 + (pt & 4095));
            size--; pt+=8;
        } while (size);
        if (!tmp) return;
        size=tmp;               /* fall through to the byte loop for the remainder */
    }
    Bit8u * write=(Bit8u *) writeq;
    do {
        *write++=phys_readb(pagemap[(pt>>12)&mask]*4096 + (pt & 4095));
        pt++;
    } while (--size);
}
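
To make the fall-through shape easier to try out in isolation, here is a minimal sketch on a flat byte array, with the page translation left out entirely. `block_read_fallthrough` is a hypothetical name, `memcpy` stands in for `phys_readq`, and a `while` loop replaces the final `do`/`while` so that a size of 0 is handled; none of this is the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies 'size' bytes starting at mem[pt] into 'data':
   8-byte chunks first, then the remainder through the shared byte loop. */
static void block_read_fallthrough(const uint8_t *mem, size_t pt,
                                   void *data, size_t size) {
    uint8_t *out = (uint8_t *)data;
    assert(mem != NULL && data != NULL);
    if (size >= 8) {
        size_t tail  = size & 7;   /* bytes left after the 8-byte chunks */
        size_t quads = size >> 3;  /* number of 8-byte chunks, at least 1 */
        do {
            memcpy(out, mem + pt, 8);
            out += 8; pt += 8;
        } while (--quads);
        size = tail;               /* remainder falls through to the loop below */
    }
    while (size--) *out++ = mem[pt++];
}
```

The byte loop serves both the small case and the remainder, which is where the size saving in the post comes from.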

Reply 10 of 10, by ih8registrations

Posted on 2008-05-24, 03:05

Alternately, could do something like this:

#define optimize 1
/* read a block from physical memory */
static void DMA_BlockRead(PhysPt pt,void * data,Bitu size) {
    Bit32u page=pt>>12;
    Bit32u * pagemap;
    Bit32u mask;
    /* mask=~0 indexes pagemap by page number; mask=0 pins the index to the single entry 'page' */
    if (page < LINK_START) { Bit32u pageend=(pt+size)>>12; pagemap=pmap[pageend < EMM_PAGEFRAME4K]; mask=~0; }
    else { pagemap=&page; mask=0; }

#ifdef optimize
    if (size>=8) {
        Bit64u * writeq=(Bit64u *) data;
        Bit8u tmp=size&0x07;    /* remainder bytes after the 8-byte chunks */
        size>>=3;               /* number of 8-byte chunks */
        do {
            *writeq++=phys_readq(pagemap[(pt>>12)&mask]*4096 + (pt & 4095));
            size--; pt+=8;
        } while (size);

        if (tmp) {
            /* step back so one last 8-byte read ends exactly at the end of the block */
            tmp=8-tmp; pt-=tmp; writeq=(Bit64u *)((Bit8u *)writeq-tmp);
            *writeq=phys_readq(pagemap[(pt>>12)&mask]*4096 + (pt & 4095));
        }
        return;
    }
#endif
    Bit8u * write=(Bit8u *) data;
    do {
        *write++=phys_readb(pagemap[(pt>>12)&mask]*4096 + (pt & 4095));
        pt++;
    } while (--size);
}
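
The remainder trick in this variant is an overlapping final read: instead of looping over the last few bytes, step back and do one more 8-byte read that ends exactly at the end of the block. Here is a minimal sketch of that idea on a flat byte array, with page translation omitted; `block_read_overlap` is a hypothetical name and `memcpy` stands in for `phys_readq`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies 'size' bytes starting at mem[pt] into 'data'.
   For size >= 8 the remainder is covered by one overlapping 8-byte copy. */
static void block_read_overlap(const uint8_t *mem, size_t pt,
                               void *data, size_t size) {
    uint8_t *out = (uint8_t *)data;
    assert(mem != NULL && data != NULL);
    if (size >= 8) {
        size_t tail  = size & 7;   /* bytes left after the 8-byte chunks */
        size_t quads = size >> 3;  /* number of 8-byte chunks, at least 1 */
        do {
            memcpy(out, mem + pt, 8);
            out += 8; pt += 8;
        } while (--quads);
        if (tail) {
            size_t back = 8 - tail;  /* bytes that get copied a second time */
            memcpy(out - back, mem + pt - back, 8);
        }
        return;
    }
    while (size--) *out++ = mem[pt++];  /* small blocks: byte loop */
}
```

The overlap is safe here only because at least 8 bytes have already been copied and re-reading plain memory has no side effects; it would not suit a source where reads change state.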
