VOGONS

DOSBox-X branch

Reply 1760 of 2397, by TheGreatCodeholio

Rank Oldbie

I'm seeing core=dynamic fail here on the same Scale and transcendental tests, on x86_64 Linux.
DOSBox-X still uses the old dynamic core for 32-bit builds. Is that what you are compiling?
I do not have a 32-bit Linux system handy at this time.

DOSBox-X project: more emulation better accuracy.
DOSLIB and DOSLIB2: Learn how to tinker and hack hardware and software from DOS.

Reply 1762 of 2397, by TheGreatCodeholio

Rank Oldbie

Assuming you're on Linux, edit config.h and comment out C_DYNAMIC_X86 and uncomment C_DYNREC. Recompile and try that.

If you're using Microsoft C++ on Windows, make the same edit in vs2015/config.h.

EDIT: If you're compiling 32-bit, this change forces the use of the new dynrec code imported from SVN.
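
As a rough illustration of that edit, the relevant part of config.h might look like the fragment below afterwards. Only the two macro names come from the post; the comment text, the value 1 and the surrounding layout are assumptions and may differ in the real file.

```cpp
/* config.h (sketch; layout and comments are assumed, not copied from the tree) */

/* Old x86-only dynamic core: commented out. */
/* #define C_DYNAMIC_X86 1 */

/* Dynrec core imported from SVN: uncommented. */
#define C_DYNREC 1
```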

Reply 1763 of 2397, by hail-to-the-ryzen

Rank Member

I'm on 32-bit Windows 10 unfortunately. I used to have a Linux image, but I deleted it after not using it for a while. I will try the dynrec build (win32) next as suggested.

Edit: also configured with --disable-dynamic-x86 and --enable-dynrec=yes because of the mingw32 automake; otherwise it tries for dynamic x86 instead of dynrec.

Edit2: I had to edit the configure line a couple of times. I think the above is correct now. 😀

Edit3: yes, it is correct. --disable-dynamic-core disabled both, but x86 is needed so dynrec is built. I haven't built the dynrec in a long time - hopefully it builds.

Reply 1764 of 2397, by hail-to-the-ryzen

Rank Member

Built the 32-bit dynrec version but it shows the same errors in all categories of the fpu test.

Edit: tested with 0.74-2, but then remembered that it uses the x86 fpu core with the normal cpu core.

Edit2: no binaries to test fpu, but source code here:
http://www.math.utah.edu/~beebe/software/ieee/#ieee-754-soft

This looks better:
http://www.jhauser.us/arithmetic/TestFloat.html

Reply 1765 of 2397, by hail-to-the-ryzen

Rank Member

I assume the failed tests are from missing fpu flags?

Edit: I built the 32-bit dynrec against the non-x86 long double fpu emulation. That build failed the same fpu tests as dyn-x86. Next I will try the x86 fpu core with dynrec.

Edit2: built the 32-bit dynrec (windows) with the x86 fpu core and that passes all the fpu tests except for the 35 SCALE tests.

Reply 1766 of 2397, by TheGreatCodeholio

Rank Oldbie

I was wrong about ARMv7: it looks like sizeof(long double) == sizeof(double) there.

Oh well.
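
A compile-time check along these lines can tell whether the host's long double is wide enough to hold an x87 80-bit value. This is only a sketch; the constant names are made up for illustration and are not DOSBox-X identifiers.

```cpp
#include <limits>

// Significand width in bits (including the integer bit) of each host type.
constexpr int kDoubleBits = std::numeric_limits<double>::digits;
constexpr int kLongDoubleBits = std::numeric_limits<long double>::digits;

// x87 extended precision has a 64-bit significand, so only a host long double
// with exactly 64 digits can represent guest values bit for bit.
constexpr bool kHostHasX87Extended = (kLongDoubleBits == 64);

// True when long double adds nothing over double (the ARMv7 case above).
constexpr bool kLongDoubleIsDouble = (kLongDoubleBits == kDoubleBits);
```

On a host where kLongDoubleIsDouble is true, a long double based FPU core would silently compute at double precision, which is where a software library becomes interesting.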

However here's another idea on how to implement 80-bit float precision in DOSBox-X FPU core on Linux:

https://www.mpfr.org/

https://gmplib.org/

Reply 1767 of 2397, by hail-to-the-ryzen

Rank Member

I think those libraries are also used in gcc to handle 128 bit floats. That is a promising idea. Another library like that is: www.ttmath.org. Do these libraries use a common strategy to store the 128 bit floats in a single data structure?
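
For comparison, the x87 80-bit format itself fits in a small two-field structure. The sketch below says nothing about how MPFR, GMP or ttmath store their values; Ext80 and ext80_to_double are hypothetical names, and only the bit layout (64-bit significand with explicit integer bit, 15-bit exponent biased by 16383, 1 sign bit) is the documented x87 format.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical container for one x87 extended-precision value (80 bits).
struct Ext80 {
    std::uint64_t significand; // bit 63 is the explicit integer bit
    std::uint16_t sign_exp;    // bit 15 = sign, bits 14-0 = biased exponent
};

// Convert to double for inspection. Precision beyond 53 bits is lost, and
// denormal, infinity and NaN encodings get no special handling in this sketch.
inline double ext80_to_double(const Ext80& v) {
    const int exponent = (v.sign_exp & 0x7fff) - 16383;
    const double magnitude =
        std::ldexp(static_cast<double>(v.significand), exponent - 63);
    return (v.sign_exp & 0x8000) ? -magnitude : magnitude;
}
```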

Confirmed the negative result on 35 SCALE tests in the FPU test software MCPDIAG. Used the 32 bit dynarec with the x86 fpu code. It should be possible to log the FSCALE function in both the 32 bit dynarec and 32 bit dynamic-x86 cores and then compare the differences between their results while running MCPDIAG. Those differences should sum to 35 since dynamic-x86 passes all tests and the dynarec passes all but 35. It appears the issue is in the dynarec core code, but is common to both x86 and x86_64 dynarecs.
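
A sketch of what such a comparison could look like: a reference FSCALE (ST(0) scaled by 2 to the power of ST(1) truncated toward zero) plus a helper that counts differing results between two logs. FscaleSample, fscale_reference and count_mismatches are hypothetical names, and the logs are assumed to be recorded in the same order in both cores.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One logged FSCALE call: both operands and the result the core produced.
struct FscaleSample { long double st0, st1, result; };

// Reference behaviour of FSCALE: ST(0) * 2^trunc(ST(1)).
inline long double fscale_reference(long double st0, long double st1) {
    if (std::isnan(st1)) return st1;      // keep the int conversion defined
    long double n = std::trunc(st1);      // FSCALE truncates toward zero
    if (n > 65536.0L) n = 65536.0L;       // clamp: such extremes overflow or
    if (n < -65536.0L) n = -65536.0L;     // underflow any finite non-zero value
    return std::ldexp(st0, static_cast<int>(n));
}

// Count entries whose result differs between two logs (NaN counts as a
// mismatch); entries missing from the shorter log count as mismatches too.
inline std::size_t count_mismatches(const std::vector<FscaleSample>& a,
                                    const std::vector<FscaleSample>& b) {
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    std::size_t mismatches = (a.size() > b.size() ? a.size() : b.size()) - common;
    for (std::size_t i = 0; i < common; ++i) {
        if (!(a[i].result == b[i].result)) ++mismatches;
    }
    return mismatches;
}
```

If the theory above holds, running MCPDIAG under both cores and feeding the two logs to count_mismatches should report 35.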

Reply 1768 of 2397, by hail-to-the-ryzen

Rank Member

It would also be interesting to log when the FPU control word register is changed. Below is another reference for the register (mainly so I'm not tempted to calculate in binary numbers):

/* From fpu_control.h: 387 through the control word register
 *
 *   11-10   9-8          5    4    3    2    1    0
 * |  RC  |  PC  |     | PM | UM | OM | ZM | DM | IM
 *
 * IM: Invalid operation mask               0x1
 * DM: Denormalized operand mask            0x2
 * ZM: Zero-divide mask                     0x4
 * OM: Overflow mask                        0x8
 * UM: Underflow mask                       0x10
 * PM: Precision (inexact result) mask      0x20
 *
 * Mask bit is 1 means no interrupt.
 *
 * PC: Precision control
 * 11 - round to extended precision         0x300
 * 10 - round to double precision           0x200
 * 00 - round to single precision
 *
 * RC: Rounding control
 * 00 - rounding to nearest
 * 01 - rounding down (toward - infinity)   0x400
 * 10 - rounding up (toward + infinity)     0x800
 * 11 - rounding toward zero                0xC00
 *
 * The hardware default is 0x037f which we use.
 */

Assuming the current non-x86 fpu emulation uses the default flag values, then it should be possible to see which games and demos alter those flags through the fpu control word register.
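
For logging, a small decoder following the layout above would make the output readable without any binary arithmetic. This is a minimal sketch; ControlWord and decode_cw are hypothetical names, not existing DOSBox-X code.

```cpp
#include <cstdint>

// Decoded view of the x87 control word, per the fpu_control.h layout above.
struct ControlWord {
    bool im, dm, zm, om, um, pm; // exception masks, bits 0..5 (1 = masked)
    unsigned pc;                 // precision control, bits 9-8
    unsigned rc;                 // rounding control, bits 11-10
};

inline ControlWord decode_cw(std::uint16_t cw) {
    ControlWord d;
    d.im = (cw & 0x01) != 0;
    d.dm = (cw & 0x02) != 0;
    d.zm = (cw & 0x04) != 0;
    d.om = (cw & 0x08) != 0;
    d.um = (cw & 0x10) != 0;
    d.pm = (cw & 0x20) != 0;
    d.pc = (cw >> 8) & 0x3;   // 3 = extended, 2 = double, 0 = single
    d.rc = (cw >> 10) & 0x3;  // 0 = nearest, 1 = down, 2 = up, 3 = toward zero
    return d;
}
```

Decoding the hardware default 0x037f gives all six masks set, pc == 3 (extended precision) and rc == 0 (round to nearest).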

I also wonder whether there are any emulators, such as bochs, which handle these flags in a C/C++ path.

Reply 1769 of 2397, by Timbi

Rank Newbie

Stack overflow while running Windows Media Player (5.0.1) (W95) in the latest release of DOSBox-X. It crashes even if I run only that application's executable.

Core: Normal

In 0.82.14 however, I found nice improvements with handling codecs and performance. Best version I think.

Reply 1770 of 2397, by hail-to-the-ryzen

Rank Member

I think the normal cpu core and long double fpu core are not fully compatible in a Windows 95 dos box. I don't know yet whether it is related to the page fault system. Running Quake led to a SIGFPE arithmetic exception in the emulator. It seems related to use of the assembly instruction fldcw.

Reply 1771 of 2397, by jkapp976

Rank Newbie

I can't get the mouse to work in PCjr mode with the last couple of releases.

Also, if we can't have PCjr composite video mode, is it possible to allow CGA composite mode with the F12 switch? I think CGA has a better palette for most games anyway, and would be nice along with the 3-voice sound.

Reply 1772 of 2397, by hail-to-the-ryzen

Rank Member
hail-to-the-ryzen wrote:

I think the normal cpu core and long double fpu core are not fully compatible in a Windows 95 dos box. I don't know yet whether it is related to the page fault system. Running Quake led to a SIGFPE arithmetic exception in the emulator. It seems related to use of the assembly instruction fldcw.

Also occurs in regular DOS mode. Fixed by this change (although I may test whether the second block is optional):

diff -rupN dosbox-Orig//src/fpu/fpu_instructions_longdouble.h dosbox/src/fpu/fpu_instructions_longdouble.h
--- dosbox-Orig//src/fpu/fpu_instructions_longdouble.h
+++ dosbox/src/fpu/fpu_instructions_longdouble.h
@@ -26,6 +26,7 @@
# include <fpu_control.h>
# endif
static inline void FPU_SyncCW(void) {
+ fpu.cw = fpu.cw | 0x3f;
_FPU_SETCW(fpu.cw);
}
#else
@@ -528,6 +529,7 @@ static void FPU_FLDENV(PhysPt addr){
tag = static_cast<Bit16u>(tagbig);
}
FPU_SetTag(tag);
+ cw = cw | 0x3f;
FPU_SetCW(cw);
FPU_SyncCW();
TOP = FPU_GET_TOP();

Reply 1773 of 2397, by hail-to-the-ryzen

Rank Member

Or a more concise version:

@@ -26,6 +26,7 @@
# include <fpu_control.h>
# endif
static inline void FPU_SyncCW(void) {
+ fpu.cw |= 0x3f;
_FPU_SETCW(fpu.cw);
}
#else
@@ -528,7 +529,7 @@ static void FPU_FLDENV(PhysPt addr){
tag = static_cast<Bit16u>(tagbig);
}
FPU_SetTag(tag);
- FPU_SetCW(cw);
+ FPU_SetCW(cw | 0x3f);
FPU_SyncCW();
TOP = FPU_GET_TOP();
}
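
The constant 0x3f is the six exception mask bits (IM through PM) from the control word layout posted earlier; setting them keeps the host FPU from raising SIGFPE when the guest unmasks an exception. The sketch below shows the same bit operation in isolation. Unlike the patch, it returns a separate host value instead of modifying fpu.cw, and the names are hypothetical.

```cpp
#include <cstdint>

// IM | DM | ZM | OM | UM | PM: all six x87 exception mask bits.
constexpr std::uint16_t kAllExceptionMasks = 0x3f;

// Control word to load into the host FPU: the guest's precision and rounding
// bits are kept, but every exception is masked.
constexpr std::uint16_t host_safe_cw(std::uint16_t guest_cw) {
    return static_cast<std::uint16_t>(guest_cw | kAllExceptionMasks);
}
```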

Reply 1774 of 2397, by hail-to-the-ryzen

Rank Member
hail-to-the-ryzen wrote:

Confirmed the negative result on 35 SCALE tests in the FPU test software MCPDIAG. Used the 32 bit dynarec with the x86 fpu code.

That issue is also in core=normal with the x86 fpu code. Fixed it with this change:

diff -rupN dosbox-Orig//src/fpu/fpu_instructions_x86.h dosbox/src/fpu/fpu_instructions_x86.h
--- dosbox-Orig//src/fpu/fpu_instructions_x86.h
+++ dosbox/src/fpu/fpu_instructions_x86.h
@@ -802,17 +802,20 @@

// handles fprem,fprem1,fscale
#define FPUD_REMAINDER(op) \
- Bit16u new_sw; \
+ Bit16u new_sw,save_sw; \
__asm__ volatile ( \
+ "fnstcw %1 \n" \
+ "fldcw %4 \n" \
+ "fldt %3 \n" \
"fldt %2 \n" \
- "fldt %1 \n" \
"fclex \n" \
#op" \n" \
"fnstsw %0 \n" \
- "fstpt %1 \n" \
- "fstp %%st(0) " \
- : "=&am" (new_sw), "+m" (fpu.p_regs[TOP]) \
- : "m" (fpu.p_regs[(TOP+1)&7]) \
+ "fstpt %2 \n" \
+ "fstp %%st(0) \n" \
+ "fldcw %1 " \
+ : "=&am" (new_sw), "=m" (save_sw), "+m" (fpu.p_regs[TOP]) \
+ : "m" (fpu.p_regs[(TOP+1)&7]), "m" (fpu.cw_mask_all) \
); \
fpu.sw=(new_sw&0xffbf)|(fpu.sw&0x80ff);
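
The pattern in this patch is: save the host control word (fnstcw), load one with all exceptions masked (fpu.cw_mask_all), run the operation, then restore the saved word (fldcw). A portable analogue using <cfenv> is sketched below. It is not the DOSBox code and does not touch the x87 registers directly; run_with_masked_exceptions is a hypothetical helper.

```cpp
#include <cfenv>
#include <cmath>

// Run a floating-point operation with host exception traps disabled, then
// restore the caller's floating-point environment (rounding mode, flags, traps).
template <typename Op>
double run_with_masked_exceptions(Op op) {
    std::fenv_t saved;
    std::feholdexcept(&saved);  // save environment, clear flags, non-stop mode
    const double result = op();
    std::fesetenv(&saved);      // put the saved environment back
    return result;
}
```

For example, run_with_masked_exceptions([] { return std::remainder(10.0, 3.0); }) evaluates the remainder without risking a trap and leaves the caller's rounding mode as it was.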

Reply 1775 of 2397, by TheGreatCodeholio

Rank Oldbie

Another interesting bug came to my attention in src/dos/drive_fat.cpp.

When DOSBox creates a file on the FAT filesystem, it doesn't give the FAT directory entry any date or time.

The result is new files have time 00:00:00 and date 1980-00-00 (which is invalid).

Second issue is that DOSBox is using the "created" time stamp instead of the "modified" time stamp. The "modified" timestamp has existed since DOS 1.0 while the "created" timestamp did not exist until Windows 95.

Change references from crtTime/crtDate to modTime/modDate in the src/dos/*.cpp code to correct the second issue.
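
For the first issue, the values to store are the standard packed 16-bit FAT date and time words. A minimal sketch of the packing follows; pack_fat_date, pack_fat_time and fat_date_is_valid are hypothetical helpers, not functions from drive_fat.cpp.

```cpp
#include <cstdint>

// FAT date word: bits 15-9 = year - 1980, bits 8-5 = month (1-12),
// bits 4-0 = day (1-31).
constexpr std::uint16_t pack_fat_date(unsigned year, unsigned month, unsigned day) {
    return static_cast<std::uint16_t>(((year - 1980u) << 9) | (month << 5) | day);
}

// FAT time word: bits 15-11 = hour, bits 10-5 = minute, bits 4-0 = seconds / 2.
constexpr std::uint16_t pack_fat_time(unsigned hour, unsigned minute, unsigned second) {
    return static_cast<std::uint16_t>((hour << 11) | (minute << 5) | (second / 2u));
}

// An all-zero date word decodes to month 0, day 0, which is why an
// uninitialised entry shows up as the invalid date 1980-00-00.
constexpr bool fat_date_is_valid(std::uint16_t date) {
    const unsigned month = (date >> 5) & 0x0f;
    const unsigned day = date & 0x1f;
    return month >= 1 && month <= 12 && day >= 1;
}
```

The earliest valid date, 1980-01-01, packs to 0x0021.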

Reply 1776 of 2397, by hail-to-the-ryzen

Rank Member

I read your interesting list of ideas for expanding DOSBox-X. Is it possible to handle two interpreted instructions at a time instead of one and gain any performance?

Reply 1777 of 2397, by TheGreatCodeholio

Rank Oldbie
hail-to-the-ryzen wrote:

I read your interesting list of ideas for expanding DOSBox-X. Is it possible to handle two interpreted instructions at a time instead of one and gain any performance?

If you wrote the core to support that (emulating both pipelines of the Pentium, effectively), then perhaps yes. It would be very complex code though.

The idea is that one thread can prefetch opcodes and turn them into tokens so the main thread can act quickly on those tokens (in a giant switch statement), and the prefetch thread could have the freedom to combine opcodes or prepare things in advance for it to speed up emulation.

Most CPUs these days are at least dual core, so why not use two cores to do that?
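
A toy version of that split, only to show the shape: one thread decodes opcode bytes into tokens and the main thread executes them in a switch. Everything here (Token, TokenQueue, decode_opcodes, run_tokens, the three-opcode "instruction set") is made up for illustration. A real core would also have to deal with self-modifying code, branches and faults, which this sketch ignores.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

enum class Token { Inc, Dec, Nop, End };

// Minimal thread-safe queue between the prefetch thread and the main thread.
class TokenQueue {
public:
    void push(Token t) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(t);
        }
        ready_.notify_one();
    }
    Token pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        const Token t = queue_.front();
        queue_.pop();
        return t;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Token> queue_;
};

// Prefetch side: turn raw opcode bytes into tokens.
inline void decode_opcodes(const std::vector<std::uint8_t>& code, TokenQueue& out) {
    for (const std::uint8_t byte : code) {
        switch (byte) {
            case 0x40: out.push(Token::Inc); break; // inc eax
            case 0x48: out.push(Token::Dec); break; // dec eax
            default:   out.push(Token::Nop); break; // anything else: nop here
        }
    }
    out.push(Token::End);
}

// Main side: act on tokens in a switch; returns the emulated register value.
inline int run_tokens(const std::vector<std::uint8_t>& code) {
    TokenQueue queue;
    std::thread prefetch([&code, &queue] { decode_opcodes(code, queue); });
    int eax = 0;
    for (bool running = true; running;) {
        switch (queue.pop()) {
            case Token::Inc: ++eax; break;
            case Token::Dec: --eax; break;
            case Token::Nop: break;
            case Token::End: running = false; break;
        }
    }
    prefetch.join();
    return eax;
}
```

Whether this wins anything depends on the queue overhead per token staying below the cost of decoding inline.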

Reply 1778 of 2397, by hail-to-the-ryzen

Rank Member

Thank you for the interesting information. It sounds like a type of dynamic recompilation but the opcodes are recompiled at a higher level of code. I assume the benefit of the token management would exceed the cost of the threads. 😀

Reply 1779 of 2397, by jmarsh

Rank Oldbie

You could handle FPU instructions in a separate thread. The co-processor was designed to operate asynchronously; that's the whole reason why there's an fwait instruction (it makes the CPU wait until the FPU has finished everything).
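
As a rough sketch of that idea, std::async can stand in for the co-processor and std::future::get for fwait. AsyncFpu is a hypothetical class that models a single operation and a single register, nothing like a full x87 stack.

```cpp
#include <cmath>
#include <future>

// Hypothetical asynchronous FPU: an operation starts on another thread and
// the CPU side only blocks when it calls fwait().
class AsyncFpu {
public:
    // Start a square root on the "co-processor" and return immediately.
    void fsqrt(double x) {
        pending_ = std::async(std::launch::async, [x] { return std::sqrt(x); });
    }
    // Analogue of fwait: block until the pending operation has finished,
    // then return the value now held in the result register.
    double fwait() {
        if (pending_.valid()) st0_ = pending_.get();
        return st0_;
    }
private:
    std::future<double> pending_;
    double st0_ = 0.0;
};
```

Usage would be fpu.fsqrt(16.0), some unrelated CPU work, then fpu.fwait() to collect 4.0.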