VOGONS


64-bit dynamic_x86 (patch)

Reply 100 of 123, by Kerr Avon

Rank Oldbie
jmarsh wrote:

Trying to emulate an entire 32-bit x86 CPU on a 32-bit x86 CPU is awkward because most of it is unusable by user code: segment registers, system registers, and even ESP cannot be freely modified. A 64-bit CPU has an extra 8 general purpose registers to help get the job done.

I see, thanks.

Reply 101 of 123, by latalante

Rank Newbie

The LAME MP3 encoder leans very heavily on the DOSBox FPU functions (FPU_FLD_32, FPU_FST_32), so it shows no difference between an Intel Xeon and a Core 2: encoding times under 64-bit DOSBox are very similar.
The Dhrystone benchmark, however, does much better here and clearly shows the disparity.

Reply 104 of 123, by Kisai

Rank Member

Compiling for x64 with MSVC 2019 does not appear to produce working binaries.

Exception thrown at 0x00007FF74F572C59 in dosbox.exe: 0xC0000005: Access violation writing location 0x000000009DAF9040.

The program '[51828] dosbox.exe' has exited with code 0 (0x0).

1>vga_memory.cpp
1>\src\hardware\vga_memory.cpp(961,55): warning C4311: 'type cast': pointer truncation from 'Bit8u *' to 'Bitu'
1>\src\hardware\vga_memory.cpp(961,55): warning C4302: 'type cast': truncation from 'Bit8u *' to 'Bitu'
1>\src\hardware\vga_memory.cpp(961,73): warning C4312: 'type cast': conversion from 'Bitu' to 'Bit8u *' of greater size
1>\src\hardware\vga_memory.cpp(965,49): warning C4311: 'type cast': pointer truncation from 'Bit8u *' to 'Bitu'
1>\src\hardware\vga_memory.cpp(965,49): warning C4302: 'type cast': truncation from 'Bit8u *' to 'Bitu'
1>\src\hardware\vga_memory.cpp(965,67): warning C4312: 'type cast': conversion from 'Bitu' to 'Bit8u *' of greater size

	Bit32u vga_allocsize=vga.vmemsize;
// Keep lower limit at 512k
if (vga_allocsize<512*1024) vga_allocsize=512*1024;
// We reserve extra 2K for one scan line
vga_allocsize+=2048;
vga.mem.linear_orgptr = new Bit8u[vga_allocsize+16];
vga.mem.linear=(Bit8u*)(((Bitu)vga.mem.linear_orgptr + 16-1) & ~(16-1)); //line 961
memset(vga.mem.linear,0,vga_allocsize);

vga.fastmem_orgptr = new Bit8u[(vga.vmemsize<<1)+4096+16];
vga.fastmem=(Bit8u*)(((Bitu)vga.fastmem_orgptr + 16-1) & ~(16-1));

I'm not sure if I missed something.
#define C_TARGETCPU X86_64
and
#define C_FPU_X86 0
are set in config.h

OK, solution found in the thread "Benchmarking emulators on latest machines".

The default config.h in SVN needs to be fixed before an x64 build will work:

#ifdef _M_X64
#define C_TARGETCPU X86_64
typedef unsigned __int64 Bitu;
typedef signed __int64 Bits;
#else // _M_IX86
#define C_TARGETCPU X86
typedef unsigned int Bitu;
typedef signed int Bits;
#endif

https://sourceforge.net/p/dosbox/code-0/HEAD/ … isualc/config.h
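
The C4311/C4312 warnings above come from casting a pointer through an integer type that is narrower than the pointer. A minimal sketch of the same 16-byte alignment step, written so that it does not depend on how wide Bitu is, could look like the following. The helper names align_up_16 and truncate_to_32 are hypothetical and not part of DOSBox; the sketch assumes a platform that provides std::uintptr_t. The faulting address in the post, 0x000000009DAF9040, has all-zero upper 32 bits, which would be consistent with this kind of truncation, although the thread does not confirm the exact cause.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not DOSBox code): round a pointer up to the next
// 16-byte boundary. std::uintptr_t is wide enough to hold a pointer, so
// the round trip through an integer loses nothing on 32-bit or 64-bit.
inline std::uint8_t* align_up_16(std::uint8_t* p) {
    const std::uintptr_t mask = 16 - 1;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    std::uint8_t* aligned =
        reinterpret_cast<std::uint8_t*>((addr + mask) & ~mask);
    // The result never moves backwards and never skips a full 16 bytes.
    assert(aligned >= p && aligned - p < 16);
    return aligned;
}

// Illustration of the failure mode only: the value an address takes after
// being squeezed through a 32-bit integer, as happens when Bitu is a
// 32-bit type in a 64-bit build. The upper half is silently dropped.
inline std::uint64_t truncate_to_32(std::uint64_t addr) {
    return static_cast<std::uint32_t>(addr);
}
```

With a 64-bit Bitu, as in the config.h fix above, the original cast through Bitu behaves like align_up_16; with a 32-bit Bitu it behaves like truncate_to_32 followed by a cast back to a pointer, so any buffer allocated above 4 GiB ends up pointing somewhere else.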

Attachments

  • buildlog.txt (34.23 KiB)
Last edited by Kisai on 2019-10-20, 01:07. Edited 1 time in total.

Reply 106 of 123, by Kisai

Rank Member
robertmo wrote:

That's what I linked to in the edit. I found it afterwards, while looking for other people's builds.

Edit (Oct 21)

I managed to get this to compile with VS2019, and rebuilt MUNT-SVN (>2.3.0), FluidSynth (>1.1.6-noglib), SDL2 (2.0.11-snapshot), zlib (1.2.11) and libpng (1.6.38-SVN snapshot) as 64-bit for a static build.
C Preprocessor settings

MT32EMU_WITH_LIBSOXR_RESAMPLER;MT32EMU_BOSS_REVERB_PRECISE_MODE;MT32EMU_USE_FLOAT_SAMPLES;ZLIB_WINAPI;DSOUND_SUPPORT;FLUIDSYNTH_NOT_A_DLL;WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)

Libraries

dxguid.lib;Dsound.lib;Iphlpapi.lib;Setupapi.lib;Imm32.lib;version.lib;opengl32.lib;winmm.lib;zlibstat.lib;libpng16static.lib;sdl_net.lib;sdl2main.lib;sdl2.lib;pdcurses.lib;pdcurses-wincon.lib;fluidsynth1.lib;mt32emu.lib;libsoxr.lib;odbc32.lib;odbccp32.lib;ws2_32.lib;%(AdditionalDependencies)

Attachments

  • dosbox-32-r4275-20191020.zip (1.76 MiB): Dosbox 32bit r4275-20191020
  • dosbox-64-r4275-20191020.zip (2.18 MiB): Dosbox 64bit r4275-20191020

Reply 107 of 123, by jmarsh

Rank Oldbie

Would be interested to see some before/after benchmarks for this patch:

--- a/src/cpu/core_dyn_x86/risc_x64.h
+++ b/src/cpu/core_dyn_x86/risc_x64.h
@@ -468,7 +468,11 @@ static void gen_discardflags(void) {
}

static void gen_needcarry(void) {
- gen_needflags();
+ if (!x64gen.flagsactive) {
+ x64gen.flagsactive=true;
+ opcode(4).setea(4,-1,0,CALLSTACK+8).setimm(0,1).Emit16(0xBA0F); // bt [rsp+8/40], 0
+ opcode(4).set64().setea(4,-1,0,CALLSTACK+16).Emit8(0x8D); // lea rsp, [rsp+16/48]
+ }
}

static void gen_setzeroflag(void) {

Reply 108 of 123, by jtchip

Rank Member

Ryzen 5 2400G, DOSBox SVNr4296 built with gcc-9.2.1 on Fedora 31 x86_64, without and with the patch (had to restore the tabs the forum software converted to spaces):

D1:
Microseconds 1 loop: 2.88 -> 2.85
Dhrystones / second: 347826 -> 350877
VAX MIPS rating: 197.97 -> 199.70

D2:
Microseconds 1 loop: 2.98 -> 2.89
Dhrystones / second: 335518 -> 345479
VAX MIPS rating: 190.96 -> 196.63

Quake 1.08 (-nosound) timedemo demo1 (windowed on X11, output=opengl):
640x480: 76.9 -> 83.1fps
800x600: 55.8 -> 61.0fps
1024x768: 37.8 -> 41.4fps
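
To put the before -> after pairs above on a common scale, a small sketch of the relative-gain arithmetic follows. The function names percent_gain and round1 are made up for illustration. Applied to the figures in this post, it gives roughly 8.1%, 9.3% and 9.5% for the three Quake resolutions, and about 0.9% (D1) and 3.0% (D2) for Dhrystones per second.

```cpp
#include <cassert>
#include <cmath>

// Relative improvement of `after` over `before`, in percent.
// Positive means the patched build scored higher.
inline double percent_gain(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

// Round to one decimal place for reporting.
inline double round1(double x) {
    return std::round(x * 10.0) / 10.0;
}
```

For example, round1(percent_gain(76.9, 83.1)) covers the 640x480 Quake run. The same helper works for any "higher is better" figure in the thread; for "microseconds per loop", where lower is better, the sign of the result flips.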

Reply 109 of 123, by jmarsh

Rank Oldbie

Thanks. It seems the newer a CPU is, the worse performance it has for the POPF instruction (thanks a lot, speculative execution bug mitigations) so avoiding it as much as possible is a good idea.
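
Going by the comments in the patch, the idea is that when the generated code only needs the carry flag, a single "bt" on the saved flags word is enough, because x86 keeps CF in bit 0 of EFLAGS. The following is only a rough C++ model of that bit test, not the dynamic core's code; the names bit_test and carry_from_saved_flags are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// x86 EFLAGS stores the carry flag (CF) in bit 0.
constexpr std::uint32_t kCarryBit = 0;

// Model of "bt value, bit": report the selected bit. The real instruction
// copies that bit into CF and leaves the tested operand unchanged.
inline bool bit_test(std::uint32_t value, std::uint32_t bit) {
    assert(bit < 32);
    return ((value >> bit) & 1u) != 0;
}

// Recover only the carry flag from a saved flags word, instead of
// restoring every flag at once the way POPF does.
inline bool carry_from_saved_flags(std::uint32_t saved_flags) {
    return bit_test(saved_flags, kCarryBit);
}
```

For a saved flags value of 0x203 this yields true, and for 0x202 it yields false; the other flag bits in the word play no part in the result.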

Reply 111 of 123, by latalante

Rank Newbie

Intel(R) Core(TM) i5 CPU M 450 @ 2.40GHz
Linux 5.4.1
dosbox-svn-r4296, gcc-4.9.4, x86_64 (without -> with the patch applied)
D1:
VAX MIPS rating: 115.27 -> 116.38
PCBench: 70.6 -> 75.4
Quake 1.06 with sound, demo1:
320x200: 102.5 -> 107.1
800x600: 28.4 -> 32.8

Last edited by latalante on 2019-12-06, 11:47. Edited 1 time in total.

Reply 112 of 123, by Firtasik

Rank Oldbie
Firtasik wrote:
timedemo demo1

My own build (MinGW64), DOSBox SVN r4267 (x64) ~197 fps
jmarsh's SVN DOSBox (x64) ~195 fps

Official DOSBox 0.74-3 (x32) ~153 fps
EmuCR DOSBox SVN r4267 (x32) ~152 fps
Yesterplay80's DOSBox SVN r4267 (x32) ~147 fps

r4296 and the newest patch: ~206 fps


Reply 114 of 123, by latalante

Rank Newbie
realnc wrote:

i5 2500K (Sandy Bridge). Linux 64-bit.

Quake timedemo demo1 average of 5 runs each. cycles max, core dynamic.

Before: 105FPS.
After: 117FPS.

Could you tell us which specific system you were running your binaries on? Your results are unbelievably low compared to my older and much slower processor. The difference is only around 9%; it should be at least in the range of 37-42% (if not greater).
https://en.wikipedia.org/wiki/List_of_Intel_C … microprocessors

Reply 115 of 123, by realnc

Rank Oldbie
latalante wrote:

You could tell under what specific system you were running your binaries.

Gentoo Linux AMD64.

Edit:

Hm. I was using surface output and 1920x1200 window size 😜

I changed to openglnb, "original" window size and in-game res 320x200. Reran it:

Before: 121FPS
After: 131FPS.

Reply 116 of 123, by latalante

Rank Newbie

I tested the dosbox-staging version yesterday and the results were clearly lower than my build.

I have uploaded my build here in case anyone wants to compare.
https://drive.google.com/file/d/1L12wqanCgboj … iew?usp=sharing
It was built against the new version of the C library (glibc-2.30) and requires it. Under Linux, glibc compatibility only works upwards: a binary runs on the same or a newer glibc, not on an older one.
On an older system it can still be run by starting it through the bundled loader, which loads the necessary libraries from the archive.
If you have glibc-2.30, then just run it like this:

cd dosbox-r4296
./bin/dosbox

otherwise

cd dosbox-r4296
./lib/ld-linux-x86-64.so.2 --library-path ./lib ./bin/dosbox

Reply 117 of 123, by krcroft

Rank Oldbie
latalante wrote:

I tested the dosbox-staging version

Do you mean dosbox-staging as in: https://github.com/dreamer/dosbox-staging? (Your shell commands show DOSBox SVN).

latalante wrote:

... the results were clearly lower than my build.

"the results".. which results - realnc's posted above your post?

"my build".. which compiler, version, and optimization flags did you use?

"clearly lower".. (lower being faster? or worse? and by how much?) - can you list the actual before-and-after/A-vs-B numeric results and units for the benchmark?

Last edited by krcroft on 2019-12-10, 01:57. Edited 1 time in total.

Reply 118 of 123, by latalante

Rank Newbie
krcroft wrote:

Do you mean dosbox-staging as in: https://github.com/dreamer/dosbox-staging? (Your shell commands show DOSBox SVN).

Yes

krcroft wrote:

"my build".. what architecture, compiler, and optimization flags did you use?

This thread is for 64-bit architecture.
Look here.
64-bit dynamic_x86 (patch)
I use -flto and other flags.

krcroft wrote:

"clearly lower".. (lower being faster? or worse? and by how much?) - can you list the actual before-and-after/A-vs-B numeric results and units for the benchmark?

Lua, pcbench, lame, quake, ...
In each of my tests the difference was quite significant, at least 5-15% and maybe more. Maybe it's a matter of my old processor.

Reply 119 of 123, by krcroft

Rank Oldbie

Thanks latalante, that helps.

I ran some tests and the two are roughly identical, within the noise of my system.
Can you try these steps? You can copy and paste them as-is into your shell.

Build the sources

wget https://github.com/dreamer/dosbox-staging/archive/master.tar.gz -O - | tar -zxC /dev/shm
wget http://source.dosbox.com/dosboxsvn.tgz -O - | tar -zxC /dev/shm

for d in dosbox dosbox-staging-master; do
  cd "/dev/shm/$d"
  ./autogen.sh
  export CFLAGS="-DNDEBUG -pipe -O3 -march=native -flto=$(nproc)"
  ./configure CXXFLAGS="$CFLAGS" LDFLAGS="$CFLAGS"
  make -j$(nproc)
done

Install the benchmark

mkdir -p /dev/shm/dosbench
cd /dev/shm/dosbench
curl -sL https://www.philscomputerlab.com/uploads/3/7/2/3/37231621/dosbench_v_1.4_jan_2017.zip | busybox unzip -

Download the attached dosbox.conf and save it as-is into /dev/shm/dosbench

Benchmark Prep
1. Close all applications and confirm the system is idle
2. Open a new terminal
3. Lock your processor at its maximum frequency

echo performance | sudo tee /sys/devices/system/cpu/cpufreq/policy*/scaling_governor
echo 100 | sudo tee /sys/devices/system/cpu/intel_pstate/min_perf_pct

Benchmark
Launch each separately, press Ctrl+F9 when done, then run again.

cd /dev/shm/dosbench
../dosbox-staging-master/src/dosbox DOSBENCH.BAT
../dosbox/src/dosbox DOSBENCH.BAT

I ran:
- the landmark benchmark at 96400 DOSBox cycles (it wraps beyond that)
- the PC Player 3D benchmarks at default (low) and 640x480 resolutions
- Quake at default (low) resolution

Attached image: out.jpg (510.78 KiB)

Curious if this changes anything for you.

Oh, you can now let your processor relax 😀

echo powersave | sudo tee /sys/devices/system/cpu/cpufreq/policy*/scaling_governor
echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/min_perf_pct

Attachments

  • dosbox.conf (617 Bytes)