VOGONS


First post, by kjliew

User metadata
Rank Oldbie

Greetings, my friend Dege!

Not sure if anyone is still interested in the ancient GLIDE 2.11 support, but I found that dgVoodoo2 has a broken LFB implementation in GLIDE.DLL. When _grLfbGetWritePtr is called, the returned LFB pointer seems to be fake, and none of the writes get updated to the screen. The legacy dgVoodoo 1.50Beta3 works beautifully, though. I am testing on Windows 10 Pro build 1803.

BTW, do you think it is possible to release a WIN64 build of the GLIDE and GLIDE2X DLLs? I am asking because I have been working on a Glide pass-through implementation for QEMU. The x86_64 build of QEMU TCG is 10%~15% faster than the i686 build from MSYS2/mingw-w64. At the moment, the only working native 64-bit Glide wrapper is the OpenGlide x86_64 build. PsVoodoo can be my next target; currently it has issues with 64-bit pointers.

For QEMU i686 build, most of the wrappers are working so far:
- OpenGlide (glide2x)
- PsVoodoo (glide2x)
- dgVoodoo 1.5 (glide/glide2x)
- dgVoodoo2 2.55_2 (glide/glide2x) - LFB issue with glide.dll

I would be glad to check out a dgVoodoo2 WIN64 build on the QEMU x86_64 build with Glide pass-through. This could let *ANY* legacy 3Dfx GLIDE game run on QEMU on a modern system. I also plan to make this work on Linux, but the focus is on Windows for now.

Last edited by kjliew on 2019-01-14, 23:17. Edited 7 times in total.

Reply 1 of 53, by Dege

User metadata
Rank l33t

Hi! 😀

kjliew wrote:

Not sure if anyone is still interested in the ancient GLIDE 2.11 support, but I found that dgVoodoo2 has a broken LFB implementation in GLIDE.DLL. When _grLfbGetWritePtr is called, the returned LFB pointer seems to be fake, and none of the writes get updated to the screen. The legacy dgVoodoo 1.50Beta3 works beautifully, though. I am testing on Windows 10 Pro build 1803.

I just did a quick test with Pandemonium 1 to see if it's fundamentally broken, but it worked fine. What application did you try it with, I'm interested?
One thing though, and it's not Glide 2.11 specific: Pandemonium didn't work properly with the 'Voodoo Banshee' and 'Other greater' virtual card types because:

- Voodoo Banshee/Other greater are UMA cards, meaning they work with a non-1024-pixel-wide stride for LFB locks (even for Glide 2.11)
- All other, older cards are non-UMA (tiled memory) ones, so they always provide a 2048-byte stride, which some(?) old games like Pandemonium (wrongly) assume by default

kjliew wrote:

BTW, do you think it is possible to release a WIN64 build of the GLIDE and GLIDE2X DLLs? I am asking because I have been working on a Glide pass-through implementation for QEMU. The x86_64 build of QEMU TCG is 10%~15% faster than the i686 build from MSYS2/mingw-w64. At the moment, the only working native 64-bit Glide wrapper is the OpenGlide x86_64 build. PsVoodoo can be my next target; currently it has issues with 64-bit pointers.

This sounds interesting! I could try to build x64 versions.
But, how do you pass 64bit pointers back to 32bit games for lfb locks? Copying to/from a temporary 32bit visible area can cause serious performance loss.
Also, dgVoodoo's fast video memory access is based on page-guard exceptions and I think those exceptions wouldn't get passed to the x64 Glide component (but simply cause an application crash), but maybe I'm wrong.

Reply 2 of 53, by kjliew

Dege wrote:

What application did you try it with, I'm interested?

MechWarrior 2, 3Dfx version. I found several peculiarities with MechWarrior 2. It touches the LFB even outside of _grLfbBegin/_grLfbEnd, using the LFB write pointer obtained from _grLfbGetWritePtr. It does not use _guFbReadRegion/_guFbWriteRegion for LFB operations. The GLIDE 2.11 sample tests from the 3Dfx SDK all use _guFbReadRegion/_guFbWriteRegion; none of them exercise LFB operations directly through the obtained pointers.

Dege wrote:

One thing though, and it's not Glide 2.11 specific: Pandemonium didn't work properly with the 'Voodoo Banshee' and 'Other greater' virtual card types because:
- Voodoo Banshee/Other greater are UMA cards, meaning they work with a non-1024-pixel-wide stride for LFB locks (even for Glide 2.11)
- All other, older cards are non-UMA (tiled memory) ones, so they always provide a 2048-byte stride, which some(?) old games like Pandemonium (wrongly) assume by default

Technically, Glide 2.11 does not support anything but Voodoo1, so this is expected. The API hardwires the LFB stride to 2048 bytes, i.e. a 1024-pixel width at 2 bytes per pixel. Only Voodoo1 and Voodoo2 natively use a 1024-pixel-wide stride. I believe this is something you resolved in the past for dgVoodoo 1.x. The wrapper must provide a mechanism to translate an incoming LFB offset, which presumes a 1024-pixel-wide stride, into the real LFB stride reported by _grLfbLock.
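
As a rough illustration of that translation, here is a minimal Python sketch. It is not dgVoodoo's or QEMU's actual code: the helper name, the 1280-byte example stride, and the choice to reject accesses that land in the padding of a row are all assumptions made for the example.

```python
GLIDE211_STRIDE_BYTES = 2048  # stride a Glide 2.11 game assumes (1024 pixels x 2 bytes)


def translate_lfb_offset(offset, real_stride_bytes,
                         assumed_stride_bytes=GLIDE211_STRIDE_BYTES):
    """Map a byte offset computed with the assumed stride onto a buffer
    whose rows are real_stride_bytes apart (hypothetical helper)."""
    row, byte_in_row = divmod(offset, assumed_stride_bytes)
    if byte_in_row >= real_stride_bytes:
        # The access falls past the end of a real row; nothing to map it to.
        return None
    return row * real_stride_bytes + byte_in_row


# Pixel (x=10, y=3) as a 2.11 game would address it,
# translated for a surface whose real stride is 1280 bytes.
guest_offset = 3 * 2048 + 10 * 2
print(translate_lfb_offset(guest_offset, real_stride_bytes=1280))
```

The row is recovered with the assumed stride and re-applied with the real one, so the game keeps its 2048-byte arithmetic while the write lands on the right scanline.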

Dege wrote:

But, how do you pass 64bit pointers back to 32bit games for lfb locks? Copying to/from a temporary 32bit visible area can cause serious performance loss.

Nope 😀 In the virtualized context, the guest always sees a 32-bit address for the LFB. Guest LFB accesses are managed by the host memory handlers. I can even do the offset translation on-the-fly within the memory handler to emulate GLIDE 2.11 LFB semantics on GLIDE2X.DLL. I already have this working with both dgVoodoo1 and dgVoodoo2 using just GLIDE2X.DLL. Unfortunately, OpenGlide hasn't quite worked out yet; I think it has other LFB implementation issues. DOSBox Glide pass-through works in a similar fashion, and it also works in a native x86_64 build with OpenGlide on Linux for the Glide 2.4 APIs.

Looking forward to your release of the WIN64 DLLs. Technically, only the GLIDE2X wrapper is required now, but it won't hurt to be able to cross-check or do performance analysis with a real GLIDE.DLL wrapper.

Reply 3 of 53, by Dege

kjliew wrote:

Nope. In the virtualized context, the guest always sees a 32-bit address for the LFB. Guest LFB accesses are managed by the host memory handlers. I can even do the offset translation on-the-fly within the memory handler to emulate GLIDE 2.11 LFB semantics on GLIDE2X.DLL.

Ok, so it's mapping magic.

kjliew wrote:

Looking forward to your release of the WIN64 DLLs. Technically, only the GLIDE2X wrapper is required now, but it won't hurt to be able to cross-check or do performance analysis with a real GLIDE.DLL wrapper.

For 64 bit there is only one function calling convention, so there is no function name decoration like there is for 32 bit _stdcall functions.
I mean, for example, grSstWinClose isn't exported from Glide2x.dll as '_grSstWinClose@0' like for 32 bit but simply as 'grSstWinClose'. This must be taken into consideration when calling GetProcAddress for Glide functions.
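
One way a caller could cope with both naming schemes is to try the decorated name first and fall back to the plain one. The Python sketch below only illustrates that lookup order; the function names are made up, and a plain dict stands in for the DLL export table that GetProcAddress would search.

```python
import re

# Matches 32-bit stdcall decoration: leading underscore, name, '@', byte count.
_STDCALL_RE = re.compile(r"^_(?P<name>[A-Za-z_]\w*)@(?P<bytes>\d+)$")


def candidate_export_names(decorated):
    """Names to try, in order: the 32-bit decorated form, then the plain form."""
    m = _STDCALL_RE.match(decorated)
    if m is None:
        return [decorated]
    return [decorated, m.group("name")]


def resolve(exports, decorated):
    """Look a function up in `exports`, a dict standing in for the export table."""
    for name in candidate_export_names(decorated):
        if name in exports:
            return exports[name]
    return None


print(candidate_export_names("_grSstWinClose@0"))
```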

kjliew wrote:

MechWarrior 2, 3Dfx version. I found several peculiarities with MechWarrior 2. It touches the LFB even outside of _grLfbBegin/_grLfbEnd, using the LFB write pointer obtained from _grLfbGetWritePtr. It does not use _guFbReadRegion/_guFbWriteRegion for LFB operations. The GLIDE 2.11 sample tests from the 3Dfx SDK all use _guFbReadRegion/_guFbWriteRegion; none of them exercise LFB operations directly through the obtained pointers.

Ok, I'll have a look at it later. Btw, did you try it natively on Windows? Accessing LFB outside of _grLfbBegin/_grLfbEnd shouldn't be a problem for dgVoodoo, but detecting such a behavior is based on guard-page exceptions I mentioned in my previous post.
Are those exceptions passed to dgVoodoo in an emulated Host-Guest environment?

kjliew wrote:

Technically, Glide 2.11 does not support anything but Voodoo1, so this is expected. The API hardwires the LFB stride to 2048 bytes, i.e. a 1024-pixel width at 2 bytes per pixel. Only Voodoo1 and Voodoo2 natively use a 1024-pixel-wide stride. I believe this is something you resolved in the past for dgVoodoo 1.x. The wrapper must provide a mechanism to translate an incoming LFB offset, which presumes a 1024-pixel-wide stride, into the real LFB stride reported by _grLfbLock.

Select a non-UMA card (Voodoo2 and below) in dgVoodoo and a 1024-pixel-wide stride is guaranteed. UMA cards have a non-1024-pixel-wide stride, and dgVoodoo's Glide 2.11 doesn't do any extra work to convert the LFB data to the 1024 type.
As far as I know, the native 3Dfx driver works the same way, because non-UMA hw cannot be set to tiled mode. But I'm saying this only from memory. If I'm wrong, I could modify Glide 2.11 to always provide a 1024-type stride.

Reply 4 of 53, by kjliew

Dege wrote:

For 64 bit there is only one function calling convention, so there is no function name decoration like there is for 32 bit _stdcall functions.
I mean, for example, grSstWinClose isn't exported from Glide2x.dll as '_grSstWinClose@0' like for 32 bit but simply as 'grSstWinClose'. This must be taken into consideration when calling GetProcAddress for Glide functions.

Actually, when I looked into the OpenGlide glide2x.dll x86_64 build from mingw-w64, the same __stdcall function name decoration was still intact. My existing Glide pass-through implementation on QEMU does require the __stdcall function name decoration: I employ some tricks to simplify argument passing by decoding the decoration. Anyway, if the WIN64 compiler you use has no option to retain the __stdcall function name decoration, that is fine. I can change my code to deal with it.

Dege wrote:

Ok, I'll have a look at it later. Btw, did you try it natively on Windows?

Unfortunately, MechWarrior 2 3Dfx no longer works on Windows 10, despite all the hooks and patches that used to make it work on WinXP/Win7. This is the main reason I started working on Glide pass-through on QEMU.

Dege wrote:

Accessing LFB outside of _grLfbBegin/_grLfbEnd shouldn't be a problem for dgVoodoo, but detecting such a behavior is based on guard-page exceptions I mentioned in my previous post.
Are those exceptions passed to dgVoodoo in an emulated Host-Guest environment?

OK, I see. I don't think the guest environment in QEMU would be able to reflect the exception back to the host. In fact, the guest won't see any exception, as it is only writing to a virtualized LFB. QEMU then writes to the LFB pointer obtained from dgVoodoo.

Reply 5 of 53, by Dege

kjliew wrote:

Actually, when I looked into the OpenGlide glide2x.dll x86_64 build from mingw-w64, the same __stdcall function name decoration was still intact.

That's interesting. I use VS and it doesn't decorate function names, and the reason for it is clear.
_stdcall is only a compatibility placeholder for 64 bit, since there is no callee-cleaned-stack calling convention but only the one universal 64bit calling convention (at least, on Windows).
The number after '@' in a decorated function name denotes how many bytes the callee must clean up from the stack when returning. On 64 bit, on the one hand the stack is always cleaned by the caller; on the other hand, even if the number of bytes of the parameters were encoded in the function name, they wouldn't match their 32bit counterparts whenever pointers are passed in the parameter list.
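
To put numbers on the mismatch, here is a small Python sketch. It is only an illustration: the helper name and the "ptr"/"u32" labels are made up, and it assumes every argument occupies one pointer-sized stack slot, which is how a hypothetical 64-bit suffix would have to be counted.

```python
def decorated_arg_bytes(param_kinds, pointer_size):
    """Bytes of arguments a '@N' suffix would encode (hypothetical helper).

    param_kinds: list of "ptr" (pointer) or "u32" (32-bit scalar).
    pointer_size: 4 for a 32-bit build, 8 for a 64-bit build.
    Assumption: each argument takes at least one pointer-sized slot.
    """
    sizes = {"ptr": pointer_size, "u32": 4}
    return sum(max(sizes[kind], pointer_size) for kind in param_kinds)


two_pointers = ["ptr", "ptr"]
print(decorated_arg_bytes(two_pointers, 4))  # 32-bit build
print(decorated_arg_bytes(two_pointers, 8))  # 64-bit build
```

A function taking two pointers would be '@8' on 32 bit but would need '@16' on 64 bit, so the suffix cannot be carried over unchanged.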

The only way to force the 32bit decorated names is to create a .def file for the linker and specify all of the 100+ functions one by one to alter to their 32bit equivalents.

Reply 6 of 53, by kjliew

Dege wrote:

The only way to force the 32bit decorated names is to create a .def file for the linker and specify all of the 100+ functions one by one to alter to their 32bit equivalents.

I was about to tell you about using a def file when linking the DLL, but I wasn't sure if your toolchain supports it. Well, you can generate the def file from your 32-bit DLL and write a script to convert it for the 64-bit DLL. You don't have to do that manually. I have done it for PsVoodoo. Here's a snippet of the def file from my PsVoodoo. BTW, PsVoodoo finally works on the x86_64 build.

I am more than happy to give you the def file from your dgVoodoo2.55_2 for you to link the 64-bit DLL.

LIBRARY glide2x.dll
EXPORTS
_ConvertAndDownloadRle@64 = ConvertAndDownloadRle
_grAADrawLine@8 = grAADrawLine
_grAADrawPoint@4 = grAADrawPoint
_grAADrawPolygon@12 = grAADrawPolygon
_grAADrawPolygonVertexList@8 = grAADrawPolygonVertexList
_grAADrawTriangle@24 = grAADrawTriangle
_grAlphaBlendFunction@16 = grAlphaBlendFunction
_grAlphaCombine@20 = grAlphaCombine

Reply 7 of 53, by kjliew

Attached is the def file generated from the dgVoodoo2.55_2 Glide2x.dll.

Here's how I did it from bash shell:

$ pexports Glide2x.dll | sed "s/\(^_.*\)/\1\ = \1/;s/=\ _/=\ /;s/@[0-9]*$//"
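
For anyone without sed at hand, the same rewrite can be sketched in Python. This only mirrors the three substitutions in the command above; to_def_line is a made-up name, and the input is assumed to be one line of pexports output at a time.

```python
import re


def to_def_line(line):
    """Turn a decorated 32-bit export name into a .def alias line,
    mirroring the three sed substitutions step by step."""
    if line.startswith("_"):
        line = f"{line} = {line}"          # s/\(^_.*\)/\1 = \1/
    line = line.replace("= _", "= ", 1)    # s/= _/= /
    line = re.sub(r"@[0-9]*$", "", line)   # s/@[0-9]*$//
    return line


for exported in ["LIBRARY glide2x.dll", "EXPORTS", "_grAADrawLine@8"]:
    print(to_def_line(exported))
```

Lines that do not start with an underscore, such as the LIBRARY and EXPORTS headers, pass through unchanged.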

You should really ditch VS and move to MSYS2/mingw-w64 😎
I made that transition many, many years ago....

Attachments

  • dgv2_export64.def.txt (6.07 KiB): DLL def file from dgVoodoo2.55_2

Reply 8 of 53, by Dege

Ok, thanks for that!

I've whipped up an x64 build of Glide.dll and Glide2x.dll

http://dege.fw.hu/temp/dgVodooo_2_55_2_Glide_x64.zip

I tested Glide2x with my own test app and it seemed to work, but it was just a quick test.
The test includes writing to an unlocked LFB, drawing triangles within grLfbLock/grLfbUnlock and such.
'Force emulating true PCI access' must/should be enabled for such cases but it worked too (aside from the splash dlls as those are 32 bit).

Reply 9 of 53, by kjliew

Testing Glide2x.dll at the moment:
3Dfx SDK tests that failed: TEST06, TEST22, TEST29. They ran, but produced different output from 32-bit dgVoodoo2.55_2 on QEMU i686.
3Dfx demos that failed: race, fight.
race - The 3Dfx stamp showed in the lower right corner, along with the center dividing block border. The background was faint blue. Ctrl-P shows perf stats with FPS, so LFB kinda worked.
fight - The 3Dfx stamp showed in the lower right corner, on an otherwise plain black screen. Ctrl-P shows perf stats with FPS, so LFB kinda worked.

Games that work:
- Quake1.06 w/ GLQuake 0.97 miniGL 1.49
- Mechwarrior 2 3Dfx (with glide2x emulating API 2.11)
- Titanium Mechwarrior 2 Mercenaries (glide2x)

Games that failed:
- Quake2 3.20 with miniGL 1.49. Black screen, but the game was running. Player stats were OK. Console OK.
- NFS2SEA demo. Very weird screen rendering, but the game was running. Not a black or blue screen; I could see something on the screen, just with flaky colors/textures.

How do I enable debug output from dgVoodoo2? I tried using DebugView, but I got nothing. 3Dfx SDK TEST06 is the simplest W-buffer test, and it was really strange that it failed.

Reply 10 of 53, by kjliew

Dege wrote:

'Force emulating true PCI access' must/should be enabled for such cases but it worked too (aside from the splash dlls as those are 32 bit).

The guest wrapper DLL does not export these functions. Are they really necessary? Which games won't work without them?

Reply 11 of 53, by Dege

kjliew wrote:

How do I enable debug output from dgVoodoo2? I tried using DebugView, but I got nothing. 3Dfx SDK TEST06 is the simplest W-buffer test, and it was really strange that it failed.

Set 'MaxTraceLevel' to 2 in the dgVoodoo config file and dgVoodoo with the debug layer should give you feedback in DebugView or DebugView++.

kjliew wrote:

The guest wrapper DLL does not export these functions. Are they really necessary? Which games won't work without them?

'Force emulating...' is a config item in dgVoodoo not a function. IIRC it must be enabled for Extreme Assault (DOS) along with selecting a non-UMA card. But it shouldn't be enabled for most of the games.
x64 dgVoodoo tries to load the splash dlls but it obviously fails. Not a problem, if you wrap them from 32 to 64 bit (as you did) then logo and splash screen should work.

Update: use DebugView++. DebugView doesn't work for me either with the x64 version.

Reply 12 of 53, by kjliew

TEST08 from the 3Dfx Glide 2.11 SDK is also failing, regardless of which DLL I use. I think it is obvious that GR_DEPTHBUFFER_WBUFFER is having trouble in the x86_64 build. GLIDE.DLL also has the same LFB write issue as its 32-bit counterpart. Mechwarrior 2 3Dfx HUD rendering is broken on both the i686 and x86_64 builds when using GLIDE.DLL, while emulating API 2.11 LFB semantics with GLIDE2x.DLL works on both the x86_64 and i686 versions.

Do you use C floating point computation (such as float or double variables) in dgVoodoo? I think another major difference between the i686 and x86_64 C ABIs is floating point processing: i686 defaults to the FPU while x86_64 defaults to SSE. I found that fogging is broken in both OpenGlide and PsVoodoo because _guFogGenerateExp/Exp2/Linear use float variables to construct the fog table. OpenGlide also has inlined ASM for MMXCopy. With GCC, I was able to replicate the fogging issue by compiling i686 OpenGlide to use SSE for math. I have not tried the other way round, forcing x86_64 OpenGlide to use the FPU for math. I do not know if this could also be a QEMU TCG issue that messes with the FPU/XMM states.

I am pretty sure it is GR_DEPTHBUFFER_WBUFFER. The 2 failed 3Dfx demos also set up the depth buffer mode with WBUFFER. Did you change something in WBUFFER since dgVoodoo2.55_2?
I need to double-check for potential FPU/XMM issues in the x64 build. It looks like both PsVoodoo and OpenGlide are OK with math on SSE in the i686 build.

Last edited by kjliew on 2018-06-22, 04:15. Edited 2 times in total.

Reply 13 of 53, by kjliew

Dege wrote:

x64 dgVoodoo tries to load the splash dlls but it obviously fails. Not a problem, if you wrap them from 32 to 64 bit (as you did) then logo and splash screen should work.

I think the splash DLLs are 32-bit DLLs from 3Dfx. They won't work with x64 dgvoodoo. AFAIU, the 3Dfx splash DLLs are to be used by the host wrappers, the guest wrapper DLLs simply pass through the APIs. For 3Dfx splash to work, the x64 host wrapper needs to implement this all within itself. I think that is what OpenGlide is doing on _grSplash.

Reply 14 of 53, by Dege

kjliew wrote:

I am pretty sure it is GR_DEPTHBUFFER_WBUFFER. The 2 failed 3Dfx demos also set up the depth buffer mode with WBUFFER. Did you change something in WBUFFER since dgVoodoo2.55_2?

Ok, thanks, then it must be something in dgVoodoo internals. I'll try to compile the sdk tests to 64bit and check them out. There were no changes in Glide, so I think it's related to SSE/FPU somewhere.

Update: Ok, I found the problem: missing register save/restore at some point, needed for the x64 calling convention. Test06 now passes for me on 64bit too. I reuploaded the zip file, plz try the new build.

Reply 15 of 53, by kjliew

Wonderful!!! 😀 Both 3Dfx demos, race & fight, now run beautifully.
Quake2 3.20 also works, and it finally breaks 30FPS on demo1 on QEMU.

DgVoodoo2 is the current leading wrapper in speed and rendering accuracy.
OpenGlide comes very close in speed, but there are few rendering quirks.
PsVoodoo is a less complete implementation of GLIDE APIs. For games that work, it works. Speed wise, it is only marginally slower than OpenGlide.

Alright, the last remaining issue is fogging. I found that all 3 wrappers have issues with it, and the issue only shows up in x64 builds. i686 builds are fine regardless of how the math is done, FPU or SSE; I explicitly disassembled the code for _guFogGenerateExp/Exp2/Linear to check which instructions are used for the math. This is evident in 3Dfx SDK TEST08 and the 2 demos. PsVoodoo initially didn't have the _guFogGenerate* functions, so I ported them from OpenGlide. It is puzzling that clean C float math code, as in PsVoodoo, would fail on x64 but not on i686. My initial debugging of 3Dfx TEST08 showed that the math was generating a fog table with all '0' on subsequent runs. The 1st run was always OK.
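
To make the symptom concrete, here is a small Python sketch of an exponential fog table built with plain float math, plus a check for the all-zero case. The table size, the index-to-depth mapping, and all function names are assumptions for illustration; this is not the code from OpenGlide, PsVoodoo, or dgVoodoo.

```python
import math

FOG_TABLE_SIZE = 64  # assumed number of fog table entries


def index_to_w(i):
    """Hypothetical mapping from a fog table index to a depth (w) value."""
    return 2.0 ** (3 + (i >> 2)) / (8 - (i & 3))


def fog_generate_exp(density):
    """Build an exponential fog table of 8-bit entries using plain float math."""
    # Normalize so that the farthest entry approaches full fog.
    scale = 1.0 / (1.0 - math.exp(-density * index_to_w(FOG_TABLE_SIZE - 1)))
    table = []
    for i in range(FOG_TABLE_SIZE):
        f = (1.0 - math.exp(-density * index_to_w(i))) * scale
        table.append(min(255, max(0, int(f * 255.0))))
    return table


def looks_broken(table):
    """True when every entry is zero, i.e. no fog would be applied at all."""
    return not any(table)


table = fog_generate_exp(0.01)
print(looks_broken(table), table[:4], table[-1])
```

Dumping the table on every call and running it through a check like looks_broken would show on which run the values first collapse to zero.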

I think you might have found it, too, as dgVoodoo2 x64 seems to have the _guFogGenerate* functions disabled: it produces output similar to PsVoodoo without those functions in the 2 demos.

I hope to see official release of x64 GLIDE DLLs in future dgVoodoo2 releases. Again, a big THANK YOU!

Reply 16 of 53, by Dege

Great!! Thanks!

But I still don't understand this fog-thing. _guFogGenerate* functions work just fine in the x64 build too, I didn't have to disable anything because of the x64 target (float math runs on SSE2 in the x64 build).
I tested both Test08 and Test22 and both of them have proper fog, as their 32 bit counterparts. guFogGenerateExp fills the fog table with the expected values.

Also, I'd like to debug that Glide2.11 lfb-problem with Mech2 but I have the feeling I could do it on a QEMU build, some day. 😀
The bug must be in dgVoodoo2, given that dgVoodoo1 and emulation through Glide2x work fine.

kjliew wrote:

I hope to see official release of x64 GLIDE DLLs in future dgVoodoo2 releases. Again, a big THANK YOU!

I'm about to include them in the package, my only concern is that it'll be confusing for most of the users (which one to use for native games, x86 or x64).

Reply 17 of 53, by kjliew

Dege wrote:

But I still don't understand this fog-thing. _guFogGenerate* functions work just fine in the x64 build too, I didn't have to disable anything because of the x64 target (float math runs on SSE2 in the x64 build).
I tested both Test08 and Test22 and both of them have proper fog, as their 32 bit counterparts. guFogGenerateExp fills the fog table with the expected values.

I only get the correct fog table on the 1st run under QEMU. On subsequent runs I get all '0' in the fog table. In GDB, the FPU/XMM states were not the same between the 2 runs. I don't know if this is a problem with QEMU. Perhaps the problem doesn't show up if you run natively. I can definitely see different rendering output between the 32-bit and 64-bit wrappers.

Dege wrote:

Also, I'd like to debug that Glide2.11 lfb-problem with Mech2 but I have the feeling I could do it on a QEMU build, some day. 😀

Yes, you will. I am cleaning up the code before attaching my patch for QEMU. Well, the QEMU community are serious folks. They probably don't intend to have QEMU playing ancient 3Dfx games (productivity killer) on modern OSes. 🤣 BTW, you also need MSYS2/mingw-w64 to build QEMU. This is the easiest way to get it built. I am not sure if QEMU can be built with VS, but you know, the open-source world is biased towards GCC.

Dege wrote:

I'm about to include them in the package, my only concern is that it'll be confusing for most of the users (which one to use for native games, x86 or x64).

Cool... Well, you can just put them in a separate folder in your package, labelled "qemu_x64". I think only QEMU can use them for now, unless the DOSBox dev community changes its stance on WIN64 support (which I have already asked about). Or, even better, you decide to make a native Linux Glide wrapper. 😎

Reply 18 of 53, by Dege

kjliew wrote:

I only get the correct fog table on the 1st run under QEMU. On subsequent runs I get all '0' in the fog table. In GDB, the FPU/XMM states were not the same between the 2 runs. I don't know if this is a problem with QEMU. Perhaps the problem doesn't show up if you run natively. I can definitely see different rendering output between the 32-bit and 64-bit wrappers.

It's weird. It should be debugged, too.

kjliew wrote:

Yes, you will. I am cleaning up the code before attaching my patch for QEMU. Well, the QEMU community are serious folks. They probably don't intend to have QEMU playing ancient 3Dfx games (productivity killer) on modern OSes. BTW, you also need MSYS2/mingw-w64 to build QEMU. This is the easiest way to get it built. I am not sure if QEMU can be built with VS, but you know, the open-source world is biased towards GCC.

Thanks! We'll see but I'll just use mingw if there is no VS support for QEMU.

kjliew wrote:

I think only QEMU can use them for now, unless the DOSBox dev community changes its stance on WIN64 support (which I have already asked about). Or, even better, you decide to make a native Linux Glide wrapper.

I've already been asked for an x64 version of Glide2x to use with x64 builds of DosBox, but I didn't pursue it because AFAIK (read somewhere) the x64 version of DosBox is slower than x86, so I didn't see how it would make any sense.
Native Linux Glide wrapping would require Vulkan support, like nGlide has. But maybe wrapping dgVoodoo with DXVK is enough, if someone is interested. Wish I had infinite spare time... 😀

Reply 19 of 53, by kjliew

I noticed that dgVoodoo may miscalculate the rendering window size by 1 pixel. When dgVoodoo takes over the window, the window is missing its right and bottom borders. OpenGlide and psVoodoo are OK. This applies to both 32-bit and 64-bit dgVoodoo.

Another 64-bit only issue, but this time it applies to both dgVoodoo & psVoodoo. It concerns a game demo, NFS 2 SE; the executable is nfs2sea.exe. Both dgVoodoo & psVoodoo render a completely white screen. The game was running though: I could hear the music, and keyboard input also worked, as I could exit the game through a blind keyboard sequence. OpenGlide, on the other hand, worked beautifully. OpenGlide 32-bit produced rendering similar to 32-bit dgVoodoo, including fogging, as seen in the initial foggy race scene and sky. OpenGlide 64-bit started out with a clear race scene and sky (a nice side effect of missing fogging), indicating that fogging wasn't working. I guess this could be specific to D3D9-based renderers.

Before you fixed WBUFFER for 64-bit dgVoodoo, I wasn't getting a completely white screen, just very weird colors & textures.

Another game, MDK 3Dfx, is also not rendered correctly by 64-bit dgVoodoo. 32-bit dgVoodoo is fine.