VOGONS


First post, by ih8registrations

User metadata
Rank Oldbie

About the <0xffd check: shouldn't it be 0xffc? It's a dword access, so word and byte addresses would be unaligned, i.e. addresses with either of the low two bits set. 0xffd lets byte addresses pass through and only blocks word addresses. Also, why not change the check to the lighter (!(address&3))? Or am I missing something?

IINLINE Bit32u mem_readd_dyncorex86(PhysPt address) {
	if ((address & 0xfff)<0xffd) {
		Bitu index=(address>>12);

		if (paging.tlb.read[index]) return host_readd(paging.tlb.read[index]+address);
		else return paging.tlb.handler[index]->readd(address);
	} else return mem_unalignedreadd(address);
}

Reply 1 of 20, by wd

User metadata
Rank DOSBox Author

<0xffd means the worst case is a dword access starting at 0xffc,
which spans 0xffc, 0xffd, 0xffe and 0xfff, and that is fine.

!&3 is WAY heavier, as it results in mem_unalignedreadd being used a lot
more often, which splits the accesses and is thus much slower.

Not sure if i understood your question fully though, maybe you can
elaborate a bit on it.

Reply 2 of 20, by ih8registrations

Lighter referring to (!(address&3)) vs ((address & 0xfff)<0xffd), not the routine as a whole.

I'm still not grasping how ((address & 0xfff)<0xffd) translates to your description of it.

I assumed the unaligned check in this dword handler would catch word and byte addresses, but what you describe sounds like the intent is something else. I don't understand why you'd only want to block word addresses from 0xffc-0xfff.

Last edited by ih8registrations on 2007-08-24, 19:09. Edited 3 times in total.

Reply 3 of 20, by `Moe`

User metadata
Rank Oldbie

It looks like this is not a check for unaligned access, but for accesses crossing a page boundary. Which means: treat accesses starting at 0xffd (or higher) specially, as they might touch two pages.

Reply 4 of 20, by wd

Yes, the unaligned-names are a bit misleading here, their initial intent
was (using the !&3) to prevent unaligned access, but for x86 all accesses
are valid EXCEPT for those that cross a page boundary as Moe said.

Reply 5 of 20, by ih8registrations

That clears up the confusion. Adding a comment for future code perusers wouldn't hurt.

What got me to this point: I've modified blockread/blockwrite, as well as mem_memcpy, to do dword blitting. They work fine, and I'm looking to go further with 64bit, since 64bit is readily available. Doing so requires adding 64bit handlers for everything, which I did. The finishing touches are updating the boundary check for the 64bit handlers in paging.h and solving the input/output values for the read/write pagehandler classes in paging.cpp, which are only Bitu.

The 64bit boundary check would perhaps be ((address & 0xfff)<0xff9), since an 8-byte access has to start at 0xff8 or below to stay inside the page, and I'm guessing there's no way around needing to update Bitu to Bit64u for the classes in paging.cpp.

Reply 6 of 20, by wd

Be sure you check the functionality of the memory access functions in
decoder.h, which by default partly inline some of their code. Stuff like
mem_readd_dyncorex86 isn't used at all iirc.

Reply 7 of 20, by ih8registrations

What I'm modifying only uses a few of the memory functions:

	mem_memcpy, MEM_BlockWrite          memory.cpp
	mem_writed_inline                   paging.h
	host_writed                         mem.h
	paging.tlb.handler[index]->writed   paging.cpp
	mem_unalignedwrited                 memory.cpp
	mem_writeb_inline

though I did add 64bit versions for everything to be complete (unless you're suggesting adding qword handlers in decoder.h?). As for the paging.tlb.handler classes: paging.h, core_dynrec.cpp, core_dyn_x86.cpp, paging.cpp, decoder_basic.h, decoder.h and debug.cpp touch them, but it looks like only paging.h needs to be updated, and just to add size overrides in the dword accesses to quiet compiler complaints. The rest just use the returns for boolean checks and pass undersized input.

Reply 9 of 20, by ih8registrations

Used all over, video, xms, ems, dos; benefits loading for sure. A visual example would be Heretic's loading bar, which is essentially tracking MEM_BlockWrite. A heavier example would be Quake's startup and map loading. Wolf3d uses mem_memcpy for dac samples. Logging the dword count, Privateer has had the highest counts I've seen, with streams of calls with 16k counts, aka 64KB*n calls. (I don't know about in-game though; I can't start it as the new joystick code is borked and always has been: it sharply drifts the cursor offscreen. That causes this problem in many games, but I digress.) It seemed like relatively low-hanging fruit to make use of qword blits, with a little cross-the-t's and dot-the-i's pita mixed in.

Reply 10 of 20, by wd

"Used all over, video, xms, ems, dos"

Yes, but none of those is used a lot during gameplay (loading bars? well...)

Video seems to be only the cga functions; the rest of the MEM_Block calls are in
rarely called or short block transfer functions (setup stuff). The only relevant
ones seem to be the dos file reads/writes (though the memory reads/writes are most
likely only a tiny fraction of the time consumed by those?) and the xms block
move (I think that's the only way to access xms memory).

"Wolf3d uses mem_memcpy for dac samples."

That's xms or ems memory moves?

Reply 12 of 20, by wd

Maybe try to fully align the memory moves, that is use a small temporary
buffer to have both the mem_readx and mem_writex's at 32 or 64bit
boundaries (so a copy(0,2) could use 32/64bit memory functions).

Reply 14 of 20, by wd

No, the memcpy loop:

	while (size--) mem_writeb_inline(dest++,mem_readb_inline(src++));

If you change that to readd/writed it might be best to have both memory
accesses dword aligned, which is a bit tricky if say src is dword aligned
but dest isn't (like copying from 0x1234:0 to 0x5678:1).

Reply 15 of 20, by ih8registrations

Ah, not a problem. Having checked, most programs are well behaved, only rarely having odd addressed src or dest. The only one with a bad habit of that is dosbox at the dos prompt. As such, I'd stick with the way I handle it now, aka not checking for it.

Reply 16 of 20, by ih8registrations

I've gotten dosbox to compile with i/o changed to Bit64u, but getting it to run that way is still an open matter, even when the only change is switching the i/o values to Bit64u. In lieu of that, attached is a patch with the dword blit versions of the copy and dma routines.

Attachments

  • blits.diff (6.68 KiB)

Reply 18 of 20, by ih8registrations

Slightly heavier overall, but cleaner, more compact code for DMA.

Attachments

  • blits3.diff (8.5 KiB)