DOSBox-X branch

Reply 1760 of 2419, by TheGreatCodeholio

Posted on 2019-02-04, 10:25

Rank: Oldbie
Posts: 819
Joined: 2011-08-18, 20:15
Location: Seattle, WA

I'm seeing core=dynamic fail here with the same Scale and transcendental tests, on x86_64 Linux.
DOSBox-X still uses the old dynamic core for 32-bit builds; is that what you are compiling?
I do not have a 32-bit Linux system handy at this time.

DOSBox-X project: more emulation better accuracy.
DOSLIB and DOSLIB2: Learn how to tinker and hack hardware and software from DOS.

Reply 1761 of 2419, by hail-to-the-ryzen

Posted on 2019-02-04, 10:25

Rank: Member
Posts: 441
Joined: 2017-03-09, 01:34

I'll try this next to test fp calculations: http://people.eecs.berkeley.edu/~wkahan/srtest/

Edit: never mind, that doesn't work.

Reply 1762 of 2419, by TheGreatCodeholio

Posted on 2019-02-04, 10:26

Assuming you're on Linux, edit config.h: comment out C_DYNAMIC_X86 and uncomment C_DYNREC. Then recompile and try that.

If you're using Microsoft C++ on Windows, make the same edit in vs2015/config.h.

EDIT: If you're compiling for 32-bit, this change forces the use of the new dynrec code imported from SVN.
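
As a rough illustration of that edit, the relevant part of config.h might end up looking like the fragment below. The two macro names come from the post; the value 1 and the surrounding comments are assumptions, since the generated file differs between builds.

```cpp
/* config.h (sketch only, not the real generated file) */

/* Old x86-only dynamic core: commented out, so it is not compiled in. */
/* #define C_DYNAMIC_X86 1 */

/* Recompiling core (dynrec): uncommented, so it is compiled in. */
#define C_DYNREC 1
```

Only one of the two should be defined at a time, which is why the first line is commented out rather than left alongside the second.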

Reply 1763 of 2419, by hail-to-the-ryzen

Posted on 2019-02-04, 10:29

I'm on 32-bit Windows 10, unfortunately. I used to have a Linux image, but I deleted it after not using it for a while. I will try the dynrec build (win32) next, as suggested.

Edit: also configured with --disable-dynamic-x86 and --enable-dynrec=yes because of the mingw32 automake, otherwise it tries for dynamic x86 instead of dynrec.

Edit2: I had to edit the configure line a couple of times. I think the above is correct now. 😀

Edit3: yes, it is correct. --disable-dynamic-core disables both cores, but only dynamic-x86 needs to be disabled so that dynrec is built. I haven't built the dynrec core in a long time, so hopefully it builds.

Reply 1764 of 2419, by hail-to-the-ryzen

Posted on 2019-02-04, 10:46

Built the 32-bit dynrec version but it shows the same errors in all categories of the fpu test.

Edit: tested with 0.74-2, but then remembered that it uses the x86 FPU core with the normal CPU core.

Edit2: no binaries to test fpu, but source code here:
http://www.math.utah.edu/~beebe/software/ieee/#ieee-754-soft

This looks better:
http://www.jhauser.us/arithmetic/TestFloat.html

Reply 1765 of 2419, by hail-to-the-ryzen

Posted on 2019-02-04, 11:02

I assume the failed tests are from missing fpu flags?

Edit: I built the 32-bit dynrec against the non-x86 long double FPU emulation. That build failed the same FPU tests as dyn-x86. Next I will try dynrec with the x86 FPU core.

Edit2: built the 32-bit dynrec (windows) with the x86 fpu core and that passes all the fpu tests except for the 35 SCALE tests.

Reply 1766 of 2419, by TheGreatCodeholio

Posted on 2019-02-04, 23:21

I was wrong about ARMv7; it looks like sizeof(long double) == sizeof(double) there.

Oh well.
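
For anyone who wants to check their own target, here is a minimal sketch of a compile-time test. It looks at the significand width instead of sizeof, because an x87 long double is padded to 12 or 16 bytes depending on the ABI. The function names are made up for illustration and are not DOSBox-X code.

```cpp
#include <limits>

// True when the host's long double has a 64-bit significand,
// i.e. it is the x87 80-bit extended format.
constexpr bool host_has_x87_long_double() {
    return std::numeric_limits<long double>::digits == 64;
}

// True when long double carries no more precision than double,
// which is the ARMv7 situation described above.
constexpr bool long_double_is_plain_double() {
    return std::numeric_limits<long double>::digits ==
           std::numeric_limits<double>::digits;
}
```

A long double based FPU core could only hope to match 80-bit results on hosts where the first function returns true.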

However here's another idea on how to implement 80-bit float precision in DOSBox-X FPU core on Linux:

https://www.mpfr.org/

https://gmplib.org/

Reply 1767 of 2419, by hail-to-the-ryzen

Posted on 2019-02-05, 02:18

I think those libraries are also used in gcc to handle 128-bit floats. That is a promising idea. Another library like that is www.ttmath.org. Do these libraries use a common strategy to store the 128-bit floats in a single data structure?
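
For comparison, here is a sketch of how an 80-bit x87 value can be held as plain integer fields in one structure. This is not how MPFR, GMP or ttmath lay out their numbers; the struct and function names are hypothetical, and only normal values and zero are handled.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical container for one x87 extended-precision value.
struct Ext80 {
    std::uint64_t mantissa;  // 64-bit significand with an explicit integer bit
    std::uint16_t sign_exp;  // bit 15: sign, bits 14..0: exponent (bias 16383)
};

// Convert a normal (or zero) Ext80 to double. Precision beyond 53 bits is
// lost; denormals, infinities and NaNs are not handled in this sketch.
double ext80_to_double(const Ext80& v) {
    const int exponent = v.sign_exp & 0x7fff;
    const bool negative = (v.sign_exp & 0x8000) != 0;
    if (exponent == 0 && v.mantissa == 0) return negative ? -0.0 : 0.0;
    // value = mantissa * 2^(exponent - bias - 63)
    const double magnitude =
        std::ldexp(static_cast<double>(v.mantissa), exponent - 16383 - 63);
    return negative ? -magnitude : magnitude;
}
```

For example, a mantissa of 0x8000000000000000 with sign_exp 0x3fff is exactly 1.0.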

Confirmed the negative result on 35 SCALE tests in the FPU test software MCPDIAG, using the 32-bit dynrec core with the x86 FPU code. It should be possible to log the FSCALE function in both the 32-bit dynrec and 32-bit dynamic-x86 cores and then compare their results while running MCPDIAG. There should be 35 differences, since dynamic-x86 passes all tests and dynrec passes all but 35. It appears the issue is in the dynrec core code, but is common to both the x86 and x86_64 dynrec builds.
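
When comparing such logs, a small reference for what FSCALE should return may help. The sketch below assumes the documented behaviour (ST(0) scaled by 2 raised to ST(1) truncated toward zero) and ignores status flags entirely; the function name is made up.

```cpp
#include <cmath>

// Reference FSCALE: ST(0) * 2^trunc(ST(1)), with ST(1) truncated toward zero.
// Status-word flags are not modelled in this sketch.
long double fscale_reference(long double st0, long double st1) {
    if (std::isnan(st1)) return st1;  // propagate NaN instead of casting it
    long double n = std::trunc(st1);
    // Clamp so the cast to int is well defined; exponents this large
    // already overflow or underflow any long double.
    if (n > 100000.0L) n = 100000.0L;
    if (n < -100000.0L) n = -100000.0L;
    return std::scalbn(st0, static_cast<int>(n));
}
```

Note the truncation: a scale factor of -1.5 behaves like -1, not -2, so a core that rounds ST(1) with the current rounding mode would disagree on such inputs.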

Reply 1768 of 2419, by hail-to-the-ryzen

Posted on 2019-02-05, 02:34

It would also be interesting to log when the FPU control word register is changed. Below is a reference for the register layout (mainly so I'm not tempted to calculate in binary numbers):

/* From fpu_control.h: 387 through the control word register
 *
 *  11-10  9-8     5    4    3    2    1    0
 * | RC  | PC |  | PM | UM | OM | ZM | DM | IM
 *
 * IM: Invalid operation mask 0x1
 * DM: Denormalized operand mask 0x2
 * ZM: Zero-divide mask 0x4
 * OM: Overflow mask 0x8
 * UM: Underflow mask 0x10
 * PM: Precision (inexact result) mask 0x20
 *
 * Mask bit is 1 means no interrupt.
 *
 * PC: Precision control
 * 11 - round to extended precision 0x300
 * 10 - round to double precision 0x200
 * 00 - round to single precision
 *
 * RC: Rounding control
 * 00 - rounding to nearest
 * 01 - rounding down (toward - infinity) 0x400
 * 10 - rounding up (toward + infinity) 0x800
 * 11 - rounding toward zero 0xC00
 *
 * The hardware default is 0x037f which we use.
 */

Assuming the current non-x86 fpu emulation uses the default flag values, then it should be possible to see which games and demos alter those flags through the fpu control word register.
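
A logging hook along those lines could decode the word before printing it. Here is a minimal sketch based only on the bit layout quoted above; the type and function names are hypothetical and not taken from the DOSBox-X source.

```cpp
#include <cstdint>

enum class Rounding { Nearest = 0, Down = 1, Up = 2, TowardZero = 3 };
enum class Precision { Single = 0, Reserved = 1, Double = 2, Extended = 3 };

struct ControlWord {
    std::uint8_t masks;   // bits 5..0: PM UM OM ZM DM IM
    Precision precision;  // bits 9..8
    Rounding rounding;    // bits 11..10
};

// Split a raw 16-bit control word into its three fields.
ControlWord decode_cw(std::uint16_t cw) {
    return ControlWord{
        static_cast<std::uint8_t>(cw & 0x3f),
        static_cast<Precision>((cw >> 8) & 0x3),
        static_cast<Rounding>((cw >> 10) & 0x3),
    };
}

// True when rounding or precision control differs from the 0x037f default.
bool differs_from_default(std::uint16_t cw) {
    return (cw & 0x0f00) != (0x037f & 0x0f00);
}
```

Logging only the words for which differs_from_default returns true would narrow the output to the games and demos that actually change rounding or precision.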

I also wonder whether there are any emulators, such as bochs, which handle these flags in a C/C++ path.

Reply 1769 of 2419, by Timbi

Posted on 2019-02-05, 12:02

Rank: Newbie
Posts: 21
Joined: 2016-04-11, 09:07

I get a stack overflow when running Windows Media Player 5.0.1 (Windows 95) in the latest release of DOSBox-X. It crashes even if I run only the application's executable.

Core: Normal

In 0.82.14 however, I found nice improvements with handling codecs and performance. Best version I think.

Reply 1770 of 2419, by hail-to-the-ryzen

Posted on 2019-02-07, 10:13

I think the normal cpu core and long double fpu core are not fully compatible in a Windows 95 dos box. I don't know yet whether it is related to the page fault system. Running Quake led to a SIGFPE arithmetic exception in the emulator. It seems related to use of the assembly instruction fldcw.

Reply 1771 of 2419, by jkapp976

Posted on 2019-02-07, 19:43

Rank: Newbie
Posts: 2
Joined: 2019-02-07, 19:37

I can't get the mouse to work in PCjr mode with the last couple releases.

Also, if we can't have PCjr composite video mode, is it possible to allow CGA composite mode with the F12 switch? I think CGA has a better palette for most games anyway, and would be nice along with the 3-voice sound.

Reply 1772 of 2419, by hail-to-the-ryzen

Posted on 2019-02-08, 03:49

hail-to-the-ryzen wrote:
I think the normal cpu core and long double fpu core are not fully compatible in a Windows 95 dos box. I don't know yet whether it is related to the page fault system. Running Quake led to a SIGFPE arithmetic exception in the emulator. It seems related to use of the assembly instruction fldcw.

This also occurs in regular DOS mode. It is fixed by this change (although I may still test whether the second hunk is optional):

diff -rupN dosbox-Orig//src/fpu/fpu_instructions_longdouble.h dosbox/src/fpu/fpu_instructions_longdouble.h
--- dosbox-Orig//src/fpu/fpu_instructions_longdouble.h
+++ dosbox/src/fpu/fpu_instructions_longdouble.h
@@ -26,6 +26,7 @@
 #  include <fpu_control.h>
 # endif
 static inline void FPU_SyncCW(void) {
+	fpu.cw = fpu.cw | 0x3f;
 	_FPU_SETCW(fpu.cw);
 }
 #else
@@ -528,6 +529,7 @@ static void FPU_FLDENV(PhysPt addr){
 		tag    = static_cast<Bit16u>(tagbig);
 	}
 	FPU_SetTag(tag);
+	cw = cw | 0x3f;
 	FPU_SetCW(cw);
 	FPU_SyncCW();
 	TOP = FPU_GET_TOP();

Reply 1773 of 2419, by hail-to-the-ryzen

Posted on 2019-02-08, 04:15

Or a more concise version:

@@ -26,6 +26,7 @@
 #  include <fpu_control.h>
 # endif
 static inline void FPU_SyncCW(void) {
+	fpu.cw |= 0x3f;
 	_FPU_SETCW(fpu.cw);
 }
 #else
@@ -528,7 +529,7 @@ static void FPU_FLDENV(PhysPt addr){
 		tag    = static_cast<Bit16u>(tagbig);
 	}
 	FPU_SetTag(tag);
-	FPU_SetCW(cw);
+	FPU_SetCW(cw | 0x3f);
 	FPU_SyncCW();
 	TOP = FPU_GET_TOP();
 }
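
The idea behind the 0x3f can be written out on its own. The sketch below assumes the SIGFPE comes from the guest unmasking an exception that the host FPU then raises for real; it derives a host-side value without touching the guest-visible word, which is a variation on the patch and not what the patch itself does. All names are hypothetical.

```cpp
#include <cstdint>

// Bits 5..0 of the x87 control word are the six exception masks.
constexpr std::uint16_t kAllExceptionsMasked = 0x003f;

// Value to load into the host FPU: the guest's rounding and precision bits,
// but with every exception masked so the host itself never traps.
constexpr std::uint16_t host_control_word(std::uint16_t guest_cw) {
    return static_cast<std::uint16_t>(guest_cw | kAllExceptionsMasked);
}

// True when the guest has unmasked at least one exception, i.e. the case
// that could trap on the host if the word were loaded unchanged.
constexpr bool guest_unmasks_exception(std::uint16_t guest_cw) {
    return (guest_cw & kAllExceptionsMasked) != kAllExceptionsMasked;
}
```

Logging the cases where guest_unmasks_exception is true would show which programs depend on unmasked exceptions.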

Reply 1774 of 2419, by hail-to-the-ryzen

Posted on 2019-02-09, 05:08

hail-to-the-ryzen wrote:
Confirmed the negative result on 35 SCALE tests in the FPU test software MCPDIAG. Used the 32 bit dynarec with the x86 fpu code.

That issue is also in core=normal with the x86 fpu code. Fixed it with this change:

diff -rupN dosbox-Orig//src/fpu/fpu_instructions_x86.h dosbox/src/fpu/fpu_instructions_x86.h
--- dosbox-Orig//src/fpu/fpu_instructions_x86.h
+++ dosbox/src/fpu/fpu_instructions_x86.h
@@ -802,17 +802,20 @@
 
 // handles fprem,fprem1,fscale
 #define FPUD_REMAINDER(op)					\
-		Bit16u new_sw;						\
+		Bit16u new_sw,save_sw;					\
 		__asm__ volatile (					\
+			"fnstcw		%1				\n"	\
+			"fldcw		%4				\n"	\
+			"fldt		%3				\n"	\
 			"fldt		%2				\n"	\
-			"fldt		%1				\n"	\
 			"fclex						\n"	\
 			#op" 						\n"	\
 			"fnstsw		%0				\n"	\
-			"fstpt		%1				\n"	\
-			"fstp		%%st(0)			"	\
-			:	"=&am" (new_sw), "+m" (fpu.p_regs[TOP])	\
-			:	"m" (fpu.p_regs[(TOP+1)&7])				\
+			"fstpt		%2				\n"	\
+			"fstp		%%st(0)				\n"	\
+			"fldcw		%1				"	\
+			:	"=&am" (new_sw), "=m" (save_sw), "+m" (fpu.p_regs[TOP])	\
+			:	"m" (fpu.p_regs[(TOP+1)&7]), "m" (fpu.cw_mask_all) 	\
 		);									\
 		fpu.sw=(new_sw&0xffbf)|(fpu.sw&0x80ff);
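
The patch follows a save, switch, restore pattern: fnstcw saves the host control word, fldcw loads the guest's, the operation runs, and a final fldcw puts the host word back. The same pattern can be sketched portably with the standard <cfenv> functions, although that covers only the rounding mode and not precision control. The class and function names below are made up for illustration.

```cpp
#include <cfenv>
#include <cmath>

// Saves the host rounding mode, switches to the requested one, and
// restores the saved mode when the guard goes out of scope.
class ScopedRounding {
public:
    explicit ScopedRounding(int mode) : saved_(std::fegetround()) {
        std::fesetround(mode);
    }
    ~ScopedRounding() { std::fesetround(saved_); }
    ScopedRounding(const ScopedRounding&) = delete;
    ScopedRounding& operator=(const ScopedRounding&) = delete;

private:
    int saved_;
};

// Round to an integer using the given mode, leaving the host mode untouched.
double round_with_mode(double value, int mode) {
    ScopedRounding guard(mode);
    // volatile keeps the compiler from folding or moving the operation
    // outside the region where the requested mode is active.
    volatile double input = value;
    volatile double result = std::nearbyint(input);
    return result;
}
```

For example, round_with_mode(2.5, FE_UPWARD) runs the operation under round-up, and the caller's mode is back in place afterwards.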

Reply 1775 of 2419, by TheGreatCodeholio

Posted on 2019-02-21, 20:15

Another interesting bug came to my attention in src/dos/drive_fat.cpp.

When DOSBox creates a file on the FAT filesystem, it doesn't give the FAT directory entry any date or time.

The result is new files have time 00:00:00 and date 1980-00-00 (which is invalid).

The second issue is that DOSBox uses the "created" timestamp instead of the "modified" timestamp. The "modified" timestamp has existed since DOS 1.0, while the "created" timestamp did not exist until Windows 95.

To correct the second issue, change the references from crtTime/crtDate to modTime/modDate in the src/dos/*.cpp code.
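
For the first issue, the directory entry needs the date and time packed into two 16-bit words. Below is a sketch of the standard FAT packing; the helper names are hypothetical, inputs are not range-checked, and this is not the actual drive_fat.cpp code.

```cpp
#include <cstdint>

// FAT date word: bits 15..9 year-1980, bits 8..5 month (1-12), bits 4..0 day (1-31).
constexpr std::uint16_t pack_fat_date(int year, int month, int day) {
    return static_cast<std::uint16_t>(((year - 1980) << 9) | (month << 5) | day);
}

// FAT time word: bits 15..11 hour, bits 10..5 minute, bits 4..0 seconds / 2.
constexpr std::uint16_t pack_fat_time(int hour, int minute, int second) {
    return static_cast<std::uint16_t>((hour << 11) | (minute << 5) | (second / 2));
}

// A date word is usable only if its month and day fields are non-zero.
constexpr bool fat_date_is_valid(std::uint16_t date) {
    const int month = (date >> 5) & 0x0f;
    const int day = date & 0x1f;
    return month >= 1 && month <= 12 && day >= 1;
}
```

An all-zero date word decodes to year 1980, month 0, day 0, which is why the new files show up as 1980-00-00.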

Reply 1776 of 2419, by hail-to-the-ryzen

Posted on 2019-02-28, 04:56

I read your interesting list of ideas for expanding DOSBox-X. Is it possible to handle two interpreted instructions at a time instead of one and gain any performance?

Reply 1777 of 2419, by TheGreatCodeholio

Posted on 2019-02-28, 20:19

hail-to-the-ryzen wrote:
I read your interesting list of ideas for expanding DOSBox-X. Is it possible to handle two interpreted instructions at a time instead of one and gain any performance?

If you wrote the core to support that (emulating both pipelines of the Pentium, effectively), then perhaps yes. It would be very complex code though.

The idea is that one thread can prefetch opcodes and turn them into tokens so the main thread can act quickly on those tokens (in a giant switch statement), and the prefetch thread could have the freedom to combine opcodes or prepare things in advance for it to speed up emulation.

Most CPUs these days are at least dual core, so why not use two cores to do that?
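
To make the token idea concrete, here is a single-threaded toy sketch of the two stages: a decoder that turns opcode bytes into tokens and fuses adjacent ones, and an executor that acts on the tokens in a switch statement. The threading and the queue between the stages are left out, CPU flags are ignored, only three one-byte opcodes are recognised, and every name is made up for illustration.

```cpp
#include <cstdint>
#include <vector>

enum class TokenKind { AddAx, Nop };

struct Token {
    TokenKind kind;
    int operand;  // signed amount for AddAx, unused for Nop
};

// "Prefetch" stage: turn raw opcode bytes into tokens, fusing adjacent
// INC AX (0x40) / DEC AX (0x48) bytes into a single AddAx token.
std::vector<Token> decode(const std::vector<std::uint8_t>& code) {
    std::vector<Token> tokens;
    for (std::uint8_t op : code) {
        int delta = 0;
        if (op == 0x40) delta = 1;
        else if (op == 0x48) delta = -1;
        else if (op == 0x90) { tokens.push_back({TokenKind::Nop, 0}); continue; }
        else break;  // unknown opcode: stop decoding in this sketch
        if (!tokens.empty() && tokens.back().kind == TokenKind::AddAx)
            tokens.back().operand += delta;
        else
            tokens.push_back({TokenKind::AddAx, delta});
    }
    return tokens;
}

// "Main" stage: act on the prepared tokens in one switch statement.
std::uint16_t execute(const std::vector<Token>& tokens, std::uint16_t ax) {
    for (const Token& t : tokens) {
        switch (t.kind) {
            case TokenKind::AddAx:
                ax = static_cast<std::uint16_t>(ax + t.operand);
                break;
            case TokenKind::Nop:
                break;
        }
    }
    return ax;
}
```

Three INC AX bytes in a row become one AddAx token with operand 3, so the main loop does one dispatch instead of three.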

Reply 1778 of 2419, by hail-to-the-ryzen

Posted on 2019-03-01, 00:53

Thank you for the interesting information. It sounds like a type of dynamic recompilation but the opcodes are recompiled at a higher level of code. I assume the benefit of the token management would exceed the cost of the threads. 😀

Reply 1779 of 2419, by jmarsh

Posted on 2019-03-01, 01:03

Rank: Oldbie
Posts: 1698
Joined: 2014-01-04, 09:17

You could handle FPU instructions in a separate thread. The co-processor was designed to operate asynchronously, that's the whole reason why there's an fwait instruction (makes the CPU wait until the FPU has finished everything).
