VOGONS

64bit replacement of ntvdm

Reply 40 of 47, by RayeR


VESA LFB mode: 1024x768/32 LFB

PCem 15 - 49 MB/s
DOSBox 0.74 - 978 MB/s
various QEMU - 744-934 MB/s
VirtualBox 5.2 - 18452 MB/s
Vmware 15 - 19112 MB/s
DOSEMU 1.4.0.7 - 22943 MB/s

So that is more than an order of magnitude of difference, which must come down to a different approach. In this case, though, I'm not hunting for graphics performance and don't require it from NTVDMx64. More important would be seamless execution of DPMI and other protected-mode programs. I understand that implementing that is not easy...
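
To put the spread in numbers, here is a small sketch that only re-uses the figures listed above. It assumes "MB" means 10**6 bytes (the benchmark may well mean 2**20), and the helper names are made up for illustration.

```python
# Figures copied from the list above; "MB" is assumed to mean 10**6 bytes.
RESULTS_MB_S = {
    "PCem 15": 49,
    "DOSBox 0.74": 978,
    "QEMU (low)": 744,
    "QEMU (high)": 934,
    "VirtualBox 5.2": 18452,
    "VMware 15": 19112,
    "DOSEMU 1.4.0.7": 22943,
}

FRAME_BYTES = 1024 * 768 * 4  # one 1024x768 frame at 32 bits per pixel


def ratio(fast, slow):
    """How many times faster 'fast' is than 'slow' (hypothetical helper)."""
    return RESULTS_MB_S[fast] / RESULTS_MB_S[slow]


def full_frame_fills_per_second(name, bytes_per_mb=10**6):
    """Whole-screen fills per second implied by the measured bandwidth."""
    return RESULTS_MB_S[name] * bytes_per_mb / FRAME_BYTES


if __name__ == "__main__":
    for name in RESULTS_MB_S:
        print(f"{name:16s} {ratio(name, 'DOSBox 0.74'):6.2f}x DOSBox, "
              f"{full_frame_fills_per_second(name):8.0f} fills/s")
```

Even the best unaccelerated QEMU figure is roughly 20 times below VirtualBox, which fits the "more than an order of magnitude" remark.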

Gigabyte GA-P67-DS3-B3, Core i7-2600K @4,5GHz, 8GB DDR3, 128GB SSD, GTX970(GF7900GT), SB Audigy + YMF724F + DreamBlaster combo + LPC2ISA

Reply 41 of 47, by kjliew

RayeR wrote on 2020-06-12, 23:09:


VESA LFB mode: 1024x768/32 LFB
various QEMU - 744-934 MB/s
VirtualBox 5.2 - 18452 MB/s
Vmware 15 - 19112 MB/s
DOSEMU 1.4.0.7 - 22943 MB/s

Your QEMU results do not seem to have been obtained with acceleration that leverages Intel VT-x/AMD-V. VirtualBox and VMware support Intel VT-x/AMD-V, and DOSEMU2 can make use of KVM on Linux. If you had measured QEMU with KVM/WHPX acceleration, the results should be somewhere around the REP MOVSD/STOSD memory bandwidth.

RayeR wrote on 2020-06-12, 23:09:

more important would be seamless execution of dpmi and other pmode programs, I understand that implementing that is not easy...

DOS is at a dead end. If they are just pure DJGPP DPMI command-line tools, they should be fairly easy to port to the WIN32 console with MSYS2/mingw-w64. The only exceptions are those that require direct access to legacy VGA I/O and memory resources. Otherwise, DJGPP DOS programs that depend solely on the VBE LFB can be ported to WIN32 with libraries such as SDL, which provides similar direct-access semantics for the frame buffer.

Reply 42 of 47, by RayeR


Yes, right. The only accelerated (VT-x) QEMU I got working uses HAXM (the 3Dfx build from another thread on this forum), but as was said, HAXM QEMU crashes with DPMI applications, so I cannot benchmark it. Is there another version for Windows? So I tried only unaccelerated QEMU. Under Linux I haven't tried an accelerated version yet, as it would probably need recompiling a custom kernel with KVM support. Anyway, the PC I'm targeting is a work machine with Win10; on my home PC I can use multiboot and 32-bit OSes, so no problem there. So I see NTVDMx64 as the better way due to its integration.

Reply 43 of 47, by leecher


I guess the VESA-Benchmarks cannot be compared to Standard VGA benchmarks, as the problem lies within the planar video modes that need a lot of VM-Exits due to read/write on the A000-C000. I just tried to run Commander Keen or Giana Sisters DOS-Version as a test in qemu-kvm and the performance was awful.
VESA LFB may be interesting to implement, as the Windows console provides a bitmap buffer for graphics, but I'm not sure that it is in the correct format.
Here is a short explanation of how e.g. the text buffer is currently mapped in the HAXM build:
https://github.com/leecher1337/ntvdmx64/blob/ … i386/sas.c#L741

It seems that the distributed binaries don't contain this important bugfix yet: https://github.com/leecher1337/ntvdmx64/commi … 2bc93d45c6706e0
Without it, it doesn't make sense to do DPMI tests. So I can just repeat myself: Compile it yourself.
Having said that: Don't expect better DPMI-compatibility than the original NTVDM on 32bit Windows, NTVDM DPMI-support wasn't too great.

Regarding dosemu: Yes, I meant dosemu2, correct.

Reply 44 of 47, by latalante

leecher wrote on 2020-06-13, 09:18:

I guess the VESA-Benchmarks cannot be compared to Standard VGA benchmarks, as the problem lies within the planar video modes that need a lot of VM-Exits due to read/write on the A000-C000. I just tried to run Commander Keen or Giana Sisters DOS-Version as a test in qemu-kvm and the performance was awful.

Slow KVM_EXIT_IO is not the only problem. For example, the original DOS DOOM binary does not get along well with QEMU or dosemu2.

DOS DOOM v1.9, -timedemo demo3
-dosbox-SVN 183.5fps
-qemu-tcg 30.3
-qemu-kvm 15.0
-dosemu2 15.7

chocolate-doom-2.3.0, SDL1.2, i386, linux VESA framebuffer

qemu-system-x86_64 -machine pc,accel=tcg -kernel vmlinuz -initrd doom.cfs -append 'root=/dev/ram0 rw vga=769 quiet'

-qemu-tcg 156
-qemu-kvm 607

EDIT:
dosbox-SVN-r4345, linux, cirrus framebuffer

qemu-system-x86_64 -machine pc,accel=kvm -cpu host -m 64 -net none -kernel vmlinuz -initrd dosbox.cfs -append 'ramdisk_size=6144K root=/dev/ram0 ro vga=769 quiet' -monitor stdio

-qemu-kvm 152.4
About 96% of native performance (same dosbox binaries, same configuration). Dosbox compiled with toolchain for i386 architecture.

Last edited by latalante on 2020-06-18, 22:42. Edited 4 times in total.

Reply 45 of 47, by RayeR


Why can't the VGA memory range be mapped the same way as the LFB, so that it doesn't cause a VM exit on every single access? I thought that most VM exits are caused by I/O access, but there shouldn't be that many of them for VGA rendering; they are only needed for setting the palette or switching banks in VESA banked mode. Again, from my view VMware performs fine for VGA and VESA LFB, and also for SB/FM audio emulation. So there must be a smart way to do it well.

The patch looks funny: nobody knows how it works, but it works. So it seems the problem is at the exit of a DPMI app. NTVDM DPMI support is good enough for DJGPP programs.

OK, I will read about the build process and dependencies and try to rebuild it myself. I hope it doesn't need mammoths like MSVC, the DDK, etc. Maybe it's better to set up a virtual machine for building...

Reply 46 of 47, by kjliew


Not surprising. WHPX/KVM can only accelerate when it is kept busy. Exiting and re-entering the hardware VM incurs a huge latency. A hardware VM is a batch machine: if not enough work accumulates to hide that latency, the application or game is better suited to pure emulation such as TCG or DOSBox. Planar VGA programming makes frequent accesses to VGA I/O and hence cannot take advantage of KVM/WHPX acceleration. Such programs need to be ported to use the LFB if they need the bare-metal CPU performance of hardware virtualization.
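
A back-of-the-envelope model of that trade-off, as a rough sketch only: the function names and every number fed into them are hypothetical, not measurements from KVM, WHPX or TCG. It assumes guest code runs at native speed between exits and that each exit costs a fixed number of cycles.

```python
def hw_vm_cycles(work_cycles, exits, exit_cost_cycles):
    """Cycles spent under hardware virtualization: native work plus exit overhead."""
    return work_cycles + exits * exit_cost_cycles


def emulated_cycles(work_cycles, slowdown):
    """Cycles spent under pure emulation, modelled as a constant slowdown factor."""
    return work_cycles * slowdown


def break_even_exits(work_cycles, exit_cost_cycles, slowdown):
    """Exit count at which the hardware VM is no faster than the emulator."""
    return work_cycles * (slowdown - 1) / exit_cost_cycles


if __name__ == "__main__":
    # Assumed figures for illustration: 1M cycles of guest work,
    # 2000 cycles per VM exit, emulator 10x slower than native.
    work, exit_cost, slowdown = 1_000_000, 2000, 10
    print("break-even exits:", break_even_exits(work, exit_cost, slowdown))
    for exits in (0, 1000, 10_000):
        print(exits, hw_vm_cycles(work, exits, exit_cost),
              emulated_cycles(work, slowdown))
```

Under these assumed numbers, code that traps more than a few thousand times per million cycles of useful work ends up slower in the hardware VM than in the emulator, which is the pattern planar VGA code tends to fall into.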

Reply 47 of 47, by crazyc

User metadata
Rank Member
Rank
Member
RayeR wrote on 2020-06-13, 18:14:

Why can't the VGA memory range be mapped the same way as the LFB, so that it doesn't cause a VM exit on every single access?

Real VGA requires every VRAM read and write to pass through the latch and potentially perform a logical operation on the latch with the data in VRAM. Handling this properly requires a VM exit. DOOM uses unchained mode and memory-to-memory copies through the latch, which is why it is affected too.
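
To illustrate why a plain memory mapping is not enough, here is a toy model of the latch path. It is a simplified sketch, not code from any emulator: the class and attribute names are invented, and it covers only write modes 0 and 1, leaving out data rotate, set/reset, write modes 2 and 3, and read mode 1.

```python
class PlanarVGA:
    """Toy model of 4-plane VGA memory with latches (write modes 0 and 1 only)."""

    def __init__(self, size=65536):
        self.planes = [bytearray(size) for _ in range(4)]
        self.latch = [0, 0, 0, 0]
        self.map_mask = 0x0F   # Sequencer Map Mask: planes that accept writes
        self.bit_mask = 0xFF   # Graphics Controller Bit Mask
        self.logic_op = 0      # 0 = replace, 1 = AND, 2 = OR, 3 = XOR with latch
        self.write_mode = 0
        self.read_plane = 0    # Read Map Select

    def read(self, addr):
        # Side effect: every CPU read reloads all four latches.
        self.latch = [plane[addr] for plane in self.planes]
        return self.latch[self.read_plane]

    def write(self, addr, value):
        for i, plane in enumerate(self.planes):
            if not (self.map_mask >> i) & 1:
                continue
            latched = self.latch[i]
            if self.write_mode == 1:
                # Latch copy: the CPU data is ignored entirely.
                plane[addr] = latched
                continue
            alu = (value, value & latched, value | latched,
                   value ^ latched)[self.logic_op]
            # Bit mask: 1 bits take the ALU result, 0 bits keep the latch.
            plane[addr] = (alu & self.bit_mask) | (latched & ~self.bit_mask & 0xFF)


if __name__ == "__main__":
    vga = PlanarVGA(16)
    for i in range(4):                 # fill address 0 with a different byte per plane
        vga.map_mask = 1 << i
        vga.write(0, 0x10 + i)
    vga.map_mask = 0x0F
    vga.read(0)                        # load latches
    vga.write_mode = 1
    vga.write(1, 0xAA)                 # copies 4 bytes, the 0xAA is discarded
    print([hex(p[1]) for p in vga.planes])
```

In write mode 1, one byte read plus one byte write moves four bytes, and the value the CPU wrote never reaches memory, so the hypervisor has to see each access to get the result right.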