64-bit dynamic_x86 (patch)


Re: 64-bit dynamic_x86 (patch)

Postby Kerr Avon » 2019-10-15 @ 16:01

jmarsh wrote:Trying to emulate an entire 32-bit x86 CPU on a 32-bit x86 CPU is awkward because most of it is unusable by user code; segment registers, system registers, even ESP is restricted from being freely modifiable. A 64-bit CPU has an extra 8 general purpose registers to help get the job done.


I see, thanks.

Re: 64-bit dynamic_x86 (patch)

Postby latalante » 2019-10-16 @ 09:11

The LAME MP3 encoder relies very heavily on the DOSBox FPU functions (FPU_FLD_32, FPU_FST_32), so it doesn't show the difference between an Intel Xeon and a Core 2 (encoding times under 64-bit DOSBox are very similar).
The Dhrystone benchmark, however, performs much better and clearly shows the disparity.

Re: 64-bit dynamic_x86 (patch)

Postby jmarsh » 2019-10-16 @ 09:37

Maybe it's time to ruin everyone's benchmarking results by improving the 32-bit core...

Re: 64-bit dynamic_x86 (patch)

Postby Qbix » 2019-10-16 @ 14:20

Indeed :)

Re: 64-bit dynamic_x86 (patch)

Postby Kisai » 2019-10-20 @ 00:01

Compiling for x64 with MSVC 2019 does not appear to generate working binaries.


Exception thrown at 0x00007FF74F572C59 in dosbox.exe: 0xC0000005: Access violation writing location 0x000000009DAF9040.

The program '[51828] dosbox.exe' has exited with code 0 (0x0).

1>vga_memory.cpp
1>\src\hardware\vga_memory.cpp(961,55): warning C4311: 'type cast': pointer truncation from 'Bit8u *' to 'Bitu'
1>\src\hardware\vga_memory.cpp(961,55): warning C4302: 'type cast': truncation from 'Bit8u *' to 'Bitu'
1>\src\hardware\vga_memory.cpp(961,73): warning C4312: 'type cast': conversion from 'Bitu' to 'Bit8u *' of greater size
1>\src\hardware\vga_memory.cpp(965,49): warning C4311: 'type cast': pointer truncation from 'Bit8u *' to 'Bitu'
1>\src\hardware\vga_memory.cpp(965,49): warning C4302: 'type cast': truncation from 'Bit8u *' to 'Bitu'
1>\src\hardware\vga_memory.cpp(965,67): warning C4312: 'type cast': conversion from 'Bitu' to 'Bit8u *' of greater size


Code: Select all
   Bit32u vga_allocsize=vga.vmemsize;
   // Keep lower limit at 512k
   if (vga_allocsize<512*1024) vga_allocsize=512*1024;
   // We reserve extra 2K for one scan line
   vga_allocsize+=2048;
   vga.mem.linear_orgptr = new Bit8u[vga_allocsize+16];
   vga.mem.linear=(Bit8u*)(((Bitu)vga.mem.linear_orgptr + 16-1) & ~(16-1)); //line 961
   memset(vga.mem.linear,0,vga_allocsize);

   vga.fastmem_orgptr = new Bit8u[(vga.vmemsize<<1)+4096+16];
   vga.fastmem=(Bit8u*)(((Bitu)vga.fastmem_orgptr + 16-1) & ~(16-1));
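
The C4311/C4312 warnings above suggest that Bitu is narrower than a pointer in this build, so the cast to Bitu and back would drop the upper 32 bits of the address. As a rough illustration only (this is not DOSBox's code; align_up is a hypothetical helper), the same round-up-to-16 can be written with std::uintptr_t, which is pointer-sized on the usual desktop targets:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of DOSBox: round p up to the next multiple
// of `alignment`, which must be a power of two.
// std::uintptr_t can hold any object pointer, so no address bits are lost
// on 64-bit targets.
inline std::uint8_t* align_up(std::uint8_t* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<std::uint8_t*>((addr + mask) & ~mask);
}
```

As in the snippet above, the caller would over-allocate by the alignment (for example new Bit8u[size+16]) and keep the original pointer around for delete[].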


I'm not sure if I missed something.
#define C_TARGETCPU X86_64
and
#define C_FPU_X86 0
are set in config.h

OK, solution found in https://www.vogons.org/viewtopic.php?f=9&t=68982&start=20#p794900

The default config.h in SVN needs to be fixed for the x64 build to work:
Code: Select all
#ifdef _M_X64
#define C_TARGETCPU X86_64
typedef unsigned __int64      Bitu;
typedef signed __int64         Bits;
#else // _M_IX86
#define C_TARGETCPU X86
typedef unsigned int        Bitu;
typedef signed int          Bits;
#endif
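
One way to catch this kind of mismatch at compile time instead of at run time is a static_assert on the width of Bitu. The sketch below is only an illustration under stated assumptions: Bitu is aliased to std::uintptr_t as a stand-in for the config.h typedef, and pointer_to_bitu is a hypothetical helper, not something in the DOSBox tree.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the Bitu typedef in config.h (assumption: aliased to a
// pointer-sized integer here so the sketch compiles on any target).
typedef std::uintptr_t Bitu;

// Fails the build, rather than crashing later, if Bitu cannot hold a pointer.
static_assert(sizeof(Bitu) >= sizeof(void*), "Bitu must be able to hold a pointer");

// Hypothetical helper: convert a pointer to Bitu and verify the round trip.
inline Bitu pointer_to_bitu(const void* p) {
    const Bitu value = reinterpret_cast<Bitu>(p);
    assert(reinterpret_cast<const void*>(value) == p);  // must be lossless
    return value;
}
```

With a 32-bit Bitu on an x64 target, a check like this would have stopped the build at the typedef instead of producing the access violation shown earlier.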


https://sourceforge.net/p/dosbox/code-0 ... c/config.h
Attachments
buildlog.txt
(34.23 KiB) Downloaded 6 times
Last edited by Kisai on 2019-10-20 @ 01:07, edited 1 time in total.

Re: 64-bit dynamic_x86 (patch)

Postby robertmo » 2019-10-20 @ 01:06

have you read this:
viewtopic.php?f=9&t=68982&start=20#p794900

Re: 64-bit dynamic_x86 (patch)

Postby Kisai » 2019-10-20 @ 01:08

robertmo wrote:have you read this:
viewtopic.php?f=9&t=68982&start=20#p794900


That's what I linked to in the edit. I found it afterwards, while looking for other people's builds.

Edit (Oct 21)

I managed to get this to compile (VS2019), and recompiled MUNT-SVN ( >2.3.0), FluidSynth( >1.1.6-noglib), SDL2 (2.0.11-snapshot), zlib (1.2.11), libpng (1.6.38-SVN snapshot), in 64-bit in a static build.
C Preprocessor settings
Code: Select all
MT32EMU_WITH_LIBSOXR_RESAMPLER;MT32EMU_BOSS_REVERB_PRECISE_MODE;MT32EMU_USE_FLOAT_SAMPLES;ZLIB_WINAPI;DSOUND_SUPPORT;FLUIDSYNTH_NOT_A_DLL;WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)


Libraries
Code: Select all
dxguid.lib;Dsound.lib;Iphlpapi.lib;Setupapi.lib;Imm32.lib;version.lib;opengl32.lib;winmm.lib;zlibstat.lib;libpng16static.lib;sdl_net.lib;sdl2main.lib;sdl2.lib;pdcurses.lib;pdcurses-wincon.lib;fluidsynth1.lib;mt32emu.lib;libsoxr.lib;odbc32.lib;odbccp32.lib;ws2_32.lib;%(AdditionalDependencies)
Attachments
dosbox-32-r4275-20191020.zip
Dosbox 32bit r4275-20191020
(1.76 MiB) Downloaded 7 times
dosbox-64-r4275-20191020.zip
Dosbox 64bit r4275-20191020
(2.18 MiB) Downloaded 9 times

Re: 64-bit dynamic_x86 (patch)

Postby jmarsh » 2019-12-02 @ 01:49

Would be interested to see some before/after benchmarks for this patch:
Code: Select all
--- a/src/cpu/core_dyn_x86/risc_x64.h
+++ b/src/cpu/core_dyn_x86/risc_x64.h
@@ -468,7 +468,11 @@ static void gen_discardflags(void) {
 }
 
 static void gen_needcarry(void) {
-   gen_needflags();
+   if (!x64gen.flagsactive) {
+      x64gen.flagsactive=true;
+      opcode(4).setea(4,-1,0,CALLSTACK+8).setimm(0,1).Emit16(0xBA0F);  // bt [rsp+8/40], 0
+      opcode(4).set64().setea(4,-1,0,CALLSTACK+16).Emit8(0x8D);       // lea rsp, [rsp+16/48]
+   }
 }
 
 static void gen_setzeroflag(void) {

Re: 64-bit dynamic_x86 (patch)

Postby jtchip » 2019-12-03 @ 01:27

Ryzen 5 2400G, DOSBox SVNr4296 built with gcc-9.2.1 on Fedora 31 x86_64, without and with the patch (had to restore the tabs the forum software converted to spaces):

D1:
Microseconds 1 loop: 2.88 -> 2.85
Dhrystones / second: 347826 -> 350877
VAX MIPS rating: 197.97 -> 199.70

D2:
Microseconds 1 loop: 2.98 -> 2.89
Dhrystones / second: 335518 -> 345479
VAX MIPS rating: 190.96 -> 196.63

Quake 1.08 (-nosound) timedemo demo1 (windowed on X11, output=opengl):
640x480: 76.9 -> 83.1fps
800x600: 55.8 -> 61.0fps
1024x768: 37.8 -> 41.4fps
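
To compare before/after pairs like these across machines, it can help to express each as a relative gain. A minimal sketch (percent_gain is a hypothetical name, and it assumes a higher-is-better metric such as fps or Dhrystones per second):

```cpp
#include <cassert>

// Relative gain of `after` over `before`, in percent.
// Only meaningful for higher-is-better metrics; for "microseconds per loop"
// (lower is better) the two arguments would have to be swapped.
inline double percent_gain(double before, double after) {
    assert(before > 0.0);
    return (after / before - 1.0) * 100.0;
}
```

For example, percent_gain(76.9, 83.1) for the 640x480 Quake run comes out at roughly 8%.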

Re: 64-bit dynamic_x86 (patch)

Postby jmarsh » 2019-12-03 @ 03:38

Thanks. It seems the newer a CPU is, the worse performance it has for the POPF instruction (thanks a lot, speculative execution bug mitigations) so avoiding it as much as possible is a good idea.
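
For readers following the patch: going by its comments, when only the carry is needed it reads bit 0 of the saved flags word with BT and then adjusts RSP with LEA (which does not modify flags), so the full flags restore is skipped. CF is bit 0 of the x86 flags register. The snippet below is only a C++ model of that one step, not the generated code; carry_from_saved_flags is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// CF (carry) is bit 0 of the x86 flags register.
const std::uint64_t kCarryMask = 1u;
// Bit 1 of the flags register is reserved and always reads as 1.
const std::uint64_t kReservedBit1 = 2u;

// Models what `bt [saved_flags], 0` leaves in the host carry flag:
// a copy of bit 0 of the saved flags image, with nothing else restored.
inline bool carry_from_saved_flags(std::uint64_t saved_flags) {
    assert((saved_flags & kReservedBit1) != 0);  // sanity check on the image
    return (saved_flags & kCarryMask) != 0;
}
```

A saved image of 0x203 would yield carry set and 0x202 carry clear; everything other than CF is simply left alone.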

Re: 64-bit dynamic_x86 (patch)

Postby awgamer » 2019-12-03 @ 06:27

>jtchip

You didn't indicate which is which. I can guess but it'd be better if explicit.

Re: 64-bit dynamic_x86 (patch)

Postby latalante » 2019-12-03 @ 12:13

Intel(R) Core(TM) i5 CPU M 450 @ 2.40GHz
Linux 5.4.1
dosbox-svn-r4296, gcc-4.9.4, x86_64, (the second result with the patch applied)
D1:
VAX MIPS rating: 115.27 -> 116.38
PCBench: 70.6 -> 75.4
quake 1.06 with sound, demo1
320x200: 102.5 -> 107.1
800x600: 28.4 -> 32.8
Last edited by latalante on 2019-12-06 @ 11:47, edited 1 time in total.

Re: 64-bit dynamic_x86 (patch)

Postby Firtasik » 2019-12-03 @ 17:22

Firtasik wrote:
Code: Select all
timedemo demo1

My own build (MinGW64), DOSBox SVN r4267 (x64)  ~197 fps
jmarsh's SVN DOSBox                      (x64)  ~195 fps

Official DOSBox 0.74-3                   (x32)  ~153 fps
EmuCR DOSBox SVN r4267                   (x32)  ~152 fps
Yesterplay80's DOSBox SVN r4267          (x32)  ~147 fps

r4296 and the newest patch: ~206 fps

Re: 64-bit dynamic_x86 (patch)

Postby realnc » 2019-12-03 @ 18:06

i5 2500K (Sandy Bridge). Linux 64-bit.

Quake timedemo demo1 average of 5 runs each. cycles max, core dynamic.

Before: 105FPS.
After: 117FPS.
Joined: 2010-10-13 @ 11:02

Re: 64-bit dynamic_x86 (patch)

Postby latalante » 2019-12-03 @ 21:20

realnc wrote:i5 2500K (Sandy Bridge). Linux 64-bit.

Quake timedemo demo1 average of 5 runs each. cycles max, core dynamic.

Before: 105FPS.
After: 117FPS.

Could you tell us which specific system you were running your binaries on? Your results are unbelievably low compared to my older and much slower processor. The difference is only around 9%, when it should be at least in the range of 37-42% (if not greater).
https://en.wikipedia.org/wiki/List_of_I ... processors

Re: 64-bit dynamic_x86 (patch)

Postby realnc » 2019-12-03 @ 22:05

latalante wrote:Could you tell us which specific system you were running your binaries on?

Gentoo Linux AMD64.

Edit:

Hm. I was using surface output and 1920x1200 window size :P

I changed to openglnb, "original" window size and in-game res 320x200. Reran it:

Before: 121FPS
After: 131FPS.

Re: 64-bit dynamic_x86 (patch)

Postby latalante » 2019-12-09 @ 21:54

I tested the dosbox-staging version yesterday and the results were clearly lower than my build.

I've uploaded my build here in case anyone wants to compare.
https://drive.google.com/file/d/1L12wqa ... sp=sharing
It was built against a newer version of the C library (glibc-2.30) and requires it. Under Linux, compatibility only works upwards: a binary built against an older glibc runs on a newer one, but not the other way round.
On a system with an older glibc it can still be run by starting it through the bundled loader, which loads the necessary libraries.
If you have glibc-2.30, then just run like this:
Code: Select all
cd dosbox-r4296
./bin/dosbox

otherwise
Code: Select all
cd dosbox-r4296
./lib/ld-linux-x86-64.so.2 --library-path ./lib ./bin/dosbox

Re: 64-bit dynamic_x86 (patch)

Postby krcroft » 2019-12-09 @ 22:14

latalante wrote:I tested the dosbox-staging version

Do you mean dosbox-staging as in: https://github.com/dreamer/dosbox-staging? (Your shell commands show DOSBox SVN).

latalante wrote:... the results were clearly lower than my build.

"the results".. which results - realnc's posted above your post?

"my build".. which compiler, version, and optimization flags did you use?

"clearly lower".. (lower being faster? or worse? and by how much?) - can you list the actual before-and-after/A-vs-B numeric results and units for the benchmark?
Last edited by krcroft on 2019-12-10 @ 01:57, edited 1 time in total.

Re: 64-bit dynamic_x86 (patch)

Postby latalante » 2019-12-09 @ 22:30

krcroft wrote:Do you mean dosbox-staging as in: https://github.com/dreamer/dosbox-staging? (Your shell commands show DOSBox SVN).

Yes

krcroft wrote:"my build".. what architecture, compiler, and optimization flags did you use?

This thread is for 64-bit architecture.
Look here.
viewtopic.php?f=32&t=67673&start=100#p806114
I use -flto and other flags.
krcroft wrote:"clearly lower".. (lower being faster? or worse? and by how much?) - can you list the actual before-and-after/A-vs-B numeric results and units for the benchmark?

Lua, pcbench, lame, quake, ...
In each of my tests the difference was quite significant, at least 5-15% and maybe more. Maybe it's a matter of my old processor.

Re: 64-bit dynamic_x86 (patch)

Postby krcroft » 2019-12-10 @ 00:02

Thanks latalante, that helps.

I ran some tests and the two are roughly identical, within the noise of my system.
Can you try these steps? You can copy and paste them as-is into your shell.

Build the sources
Code: Select all
wget https://github.com/dreamer/dosbox-staging/archive/master.tar.gz -O - | tar -zxC /dev/shm
wget http://source.dosbox.com/dosboxsvn.tgz -O - | tar -zxC /dev/shm

for d in dosbox dosbox-staging-master; do
    cd "/dev/shm/$d"
    ./autogen.sh
    export CFLAGS="-DNDEBUG -pipe -O3 -march=native -flto=$(nproc)"
    ./configure CXXFLAGS="$CFLAGS" LDFLAGS="$CFLAGS"
    make -j$(nproc)
done


Install the benchmark
Code: Select all
mkdir -p /dev/shm/dosbench
cd /dev/shm/dosbench
curl -sL https://www.philscomputerlab.com/uploads/3/7/2/3/37231621/dosbench_v_1.4_jan_2017.zip | busybox unzip -


Download the attached dosbox.conf and save it as-is into /dev/shm/dosbench

Benchmark Prep
1. Close all applications and confirm the system is idle
2. Open a new terminal
3. Lock your processor at its maximum frequency
Code: Select all
echo performance | sudo tee /sys/devices/system/cpu/cpufreq/policy*/scaling_governor
echo 100 | sudo tee /sys/devices/system/cpu/intel_pstate/min_perf_pct


Benchmark
Launch each separately, ctrl+F9 when done, then run again.
Code: Select all
cd /dev/shm/dosbench
../dosbox-staging-master/src/dosbox DOSBENCH.BAT
../dosbox/src/dosbox DOSBENCH.BAT



I ran:
- the landmark benchmark at 96400 DOSBox cycles (it wraps beyond that)
- the PC Player 3D benchmarks at default (low) and 640x480 resolutions
- Quake at default (low) resolution
out.jpg


Curious if this changes anything for you.

Oh, you can now let your processor relax :happy:
Code: Select all
echo powersave | sudo tee /sys/devices/system/cpu/cpufreq/policy*/scaling_governor
echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/min_perf_pct
Attachments
dosbox.conf
(617 Bytes) Downloaded 1 time