VOGONS


FPU bug

First post, by Folken

Rank: Newbie

I believe there is a bug with the FPU change that was submitted back on June 1. I really don't know enough about the code to say what the bug is exactly, just that I've been noticing a number of issues with CVS builds for the past while and it seems like this is the change that is causing the issues.

I've been seeing the problems with a number of DOS demos, problems that I cannot reproduce in 0.65 release, or with CVS builds from before this change. The exact problem varies depending on the program - usually graphical glitches or sound issues. I haven't noticed problems with any games yet, but I haven't tried very many.

So I decided to compile my own build, and then I basically reverted changes back until the problems went away. And they did indeed go away when I reverted those FPU changes. At that point I went back to the latest CVS, and figured out I could just comment out the "#define X86_DYNFPU" line in cpu/core_dyn_x86/decoder.h. Making that one change is the easiest way to see the difference.

One specific example is the demo "The Fulcrum". It can be downloaded here: http://www.scene.org/file.php?file=/demos/gro … ix/mtx-fulc.zip

At the very start of this demo there is a robot walking around. Using latest CVS (with those FPU changes), after the robot walks for about six or seven steps, the shadows become corrupted. Using an earlier build (or if you comment out that #define) the shadows display fine.

I'm running an AMD64 dual-core under WinXP.

Reply 1 of 27, by Srecko

Rank: Member

I haven't noticed any such problems with the "dynamic" FPU, though the distortions might be subtle, and many games don't use the FPU at all.

That option should, by the way, greatly improve FPU performance in that core for games such as Quake, since it enables dynamic core caching of FPU execution.
What about the normal/full cores: do you notice distortions with those, or only with the dynamic one? Did you also check with and without the other #define for the asm FPU (X86 ASM FPU or a similar name, AFAIR)?

Reply 2 of 27, by Folken

Rank: Newbie

The problem is only on dynamic core. Sorry, I meant to mention that in the first message. The distortions don't appear using the normal or full cores (of course they're just too darn slow for me to use, but it's good to verify that it's only the one core with the problem).

I tried disabling the x86 assembly fpu core (the C_FPU_X86 option in config.h), but that made no difference.

Out of curiosity I fired up Quake to see what sort of framerate difference I got with and without the dynamic FPU stuff. In my test I get an increase of about 3 frames per second when using the dynamic FPU code (from 25fps to 28fps, roughly). For me the extra little bit of performance in Quake is not worth all the other problems I'm seeing. But hopefully there's a better solution than having to disable it completely.

Reply 3 of 27, by evo

Rank: Newbie

It was easy to narrow this bug down to a faulty fnstsw ax instruction.
Try this patch: http://session-x.net/dosbox_fnstsw_fix.patch
The problem was simply that DOSBox zeroed out the high word of EAX, when in fact it should only alter the low word (AX).
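
To illustrate the difference described above, here is a minimal sketch in plain C++. It is not the DOSBox dynamic-core code and not the contents of the patch; the function names are hypothetical and only model what "fnstsw ax" should do to a 32-bit EAX value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of FNSTSW AX: only bits 0..15 (AX) are replaced
// by the FPU status word; bits 16..31 of EAX are preserved.
constexpr std::uint32_t store_status_word_in_ax(std::uint32_t eax,
                                                std::uint16_t fpu_status) {
    return (eax & 0xFFFF0000u) | fpu_status;
}

// Hypothetical model of the faulty behaviour: the whole of EAX is
// overwritten, so whatever was in the high word is lost.
constexpr std::uint32_t store_status_word_buggy(std::uint32_t /*eax*/,
                                                std::uint16_t fpu_status) {
    return static_cast<std::uint32_t>(fpu_status);
}
```

With EAX = 0x12345678 and a status word of 0x3800, the first function yields 0x12343800 while the second yields 0x00003800. Any program that keeps live data in the upper half of EAX across the instruction would break in the second case.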

Reply 4 of 27, by Qbix

Rank: DOSBox Author

Thanks, that instruction is used a lot.
I'm sure wd will notice it when he gets back.

Water flows down the stream
How to ask questions the smart way!

Reply 5 of 27, by wd

Rank: DOSBox Author

> For me the extra little bit of performance in Quake is not worth all the
> other problems I'm seeing

Don't use it then. And the framerate increase was somewhat higher, but
some parts have been back-ported to the regular x86 fpu core.

> It was easy to narrow this bug down to a faulty fnstsw ax instruction.

Yes you're correct. Thanks for the patch.
As the dyncore fpu is superseded by some other stuff, I'll leave it as
is for the moment.

Reply 7 of 27, by Folken

Rank: Newbie

> It was easy to narrow this bug down to a faulty fnstsw ax instruction.
> Try this patch:

Tried it, and it works like a charm! I'll let you guys know if I see any further issues, but so far it's looking pretty good. I tried a couple other programs that were giving me a bit of trouble, and the problems seem to be gone there too.

Reply 8 of 27, by Folken

Rank: Newbie

I've run into another issue. Same drill as before - bug happens with #define X86_DYNFPU, and doesn't happen without it or under normal/full core. Though I should also add that the bug happens both with and without the above fix by evo.

The bug happens in two different demos by Pulse, and pretty much same thing happens in both (so I'm assuming it's the same problem for both demos):
Sunflower: http://www.scene.org/file.php?file=/demos/gro … se/pls_sunf.zip
Tribes: http://scene.org/file.php?file=/demos/groups/ … se/tribesfi.zip

When the demos are run, they first display a static screen while the demo loads. This can take a few seconds. Then they go into a 3D scene. I get a few frames and then it simply stops rendering. If you wait in Tribes you'll get the odd frame here and there (on scene changes, I think), but it'll quickly stop rendering each time. In Sunflower, if you wait, part of the demo will display fine, but there are points where it'll stop rendering again.

Reply 9 of 27, by wd

Rank: DOSBox Author

There's a stack-wrapping missing in dyn_fpu_esc6(),
FCOMPP (that is case 0x03) should look like this:

    gen_load_host(&TOP,DREG(EA),4);
    gen_dop_word_imm(DOP_ADD,true,DREG(EA),1);
    gen_dop_word_imm(DOP_AND,true,DREG(EA),7);

( the DOP_AND line added).
I think i'll upload those fixes even though the dynfpu is obsolete,
unless you find more bugs 😀
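
As a rough illustration of why the DOP_AND line matters, the sketch below models the wrap in plain C++. It is not the generated dynamic-core code; the function names are hypothetical. The x87 TOP field is 3 bits wide (values 0..7), a pop increments it, and FCOMPP pops twice, so without the mask TOP can run past 7.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one FPU stack pop: TOP is incremented and
// masked with 7 so that it wraps from 7 back to 0.
constexpr std::uint32_t fpu_pop_top(std::uint32_t top) {
    return (top + 1u) & 7u;
}

// The same pop without the mask, i.e. the missing stack-wrapping.
constexpr std::uint32_t fpu_pop_top_unwrapped(std::uint32_t top) {
    return top + 1u;
}

// FCOMPP compares ST(0) with ST(1) and then pops the stack twice.
constexpr std::uint32_t fpu_pop_top_twice(std::uint32_t top) {
    return fpu_pop_top(fpu_pop_top(top));
}
```

Starting from TOP = 7, the masked pop gives 0, whereas the unmasked version gives 8, which is not a valid register index.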

Reply 10 of 27, by evo

Rank: Newbie

wd wrote:

There's a stack-wrapping missing in dyn_fpu_esc6(),
FCOMPP (that is case 0x03) should look like this:

    gen_load_host(&TOP,DREG(EA),4);
    gen_dop_word_imm(DOP_ADD,true,DREG(EA),1);
    gen_dop_word_imm(DOP_AND,true,DREG(EA),7);

( the DOP_AND line added).
I think i'll upload those fixes even though the dynfpu is obsolete,
unless you find more bugs 😀

Just out of curiosity, by what is it being superseded?

Reply 13 of 27, by wd

Rank: DOSBox Author

> Does DOSBox itself contain code that compiles into FPU instructions?

Yes, the timing (PIC_Ticks logic) mainly, which leads to a lot of dependent
code (sound blaster, pcspeaker, timer). That's the tricky part of it 😉

Reply 14 of 27, by evo

Rank: Newbie

wd wrote:

Yes, the timing (PIC_Ticks logic) mainly, which leads to a lot of dependent
code (sound blaster, pcspeaker, timer). That's the tricky part of it 😉

Just wondering: maybe it's possible to replace it with 64-bit integer arithmetic, but I don't know how much accuracy is needed.

Reply 15 of 27, by wd

Rank: DOSBox Author

Well it's already solved by some save/restore logic.
I tried to replace the floating point values by fixed point
Bit64s variables, but it requires changes to a very large
number of files, and didn't really work well, so that
idea was kicked.
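
For readers wondering what a 64-bit fixed-point replacement for a floating-point tick value could look like in general, here is a minimal sketch. It is an assumption-laden illustration only: the 32.32 layout, the type alias and all function names are hypothetical, and it does not reflect what was actually tried in DOSBox.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32.32 fixed-point time value: the upper 32 bits hold whole
// ticks, the lower 32 bits hold the fraction of the current tick.
using FixedTicks = std::int64_t;
constexpr std::int64_t kOne = std::int64_t{1} << 32;  // 1.0 in 32.32

// Builds a fixed-point time from whole ticks plus the cycles already
// executed in the current tick. Assumes non-negative inputs and
// cycles_done < 2^31 so that the multiplication cannot overflow.
constexpr FixedTicks make_ticks(std::int64_t whole_ticks,
                                std::int64_t cycles_done,
                                std::int64_t cycles_per_tick) {
    return whole_ticks * kOne + (cycles_done * kOne) / cycles_per_tick;
}

// Whole-tick part of a (non-negative) fixed-point time.
constexpr std::int64_t whole_part(FixedTicks t) {
    return t / kOne;
}
```

For instance, 5 whole ticks plus 1500 of 3000 cycles is 5.5 ticks, stored as 5 * 2^32 + 2^31. Every consumer of the time value has to agree on this scaling, which fits the point above about such a change touching a very large number of files.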

Reply 16 of 27, by Folken

Rank: Newbie

Got a chance to try out the latest fix, and so far it looks pretty good. Those two demos are definitely fixed, and I haven't found any other problems yet - so hopefully that was the last bug (for now at least!).

Reply 17 of 27, by franpa

Rank: Oldbie

Could someone please compile a build of the latest CVS with the patch evo made above and the change wd made above as well? I'm a n00b when it comes to compiling stuff; I only compile ZSNES because there is an ultra-handy zget program that does it for you.

I would love to test the performance difference between ykhwong's build and the official CVS.

PS: I have a...

processor: Pentium 4 CPU 630 3.00GHz
asus p5ld2 standard m/b
geforce 6600gt pcie vid card
win XP home

AMD Ryzen 3700X | ASUS Crosshair Hero VIII (WiFi) | 16GB DDR4 3600MHz RAM | MSI Geforce 1070Ti 8GB | Windows 10 Pro x64.

my website

Reply 19 of 27, by franpa

Rank: Oldbie

Are the builds from rcblanke optimized for my processor? I would like a copy optimized for my system, so as to reduce the requirements for getting games to run efficiently.

And are his builds made from official CVS only (as in no new additions or changes)?
