VOGONS


First post, by darry

Rank l33t++

https://hackaday.com/2026/03/22/the-3dfx-vood … ain-in-an-fpga/

https://github.com/fayalalebrun/SpinalVoodoo

Reply 1 of 12, by vstrakh

Rank Member

No hardware resources estimation shown in readme.md and no performance indicators given.
As I understood the output is produced in simulation only. The implementation could be a fully functional model which either will not fit in available fpga boards (like MiSTer), or will choke because of limited memory throughput and high latencies.
Still cool though, would be great if he redesigned the inner pipeline with the modern memories in mind - high throughput, high latency. It's always the memory bandwidth that limits the possibilities.
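The "high throughput, high latency" point can be made concrete with Little's law: the number of requests that must be in flight equals the request rate times the latency. Below is a minimal sketch; the 50 MHz request rate and the 200 ns round-trip latency are made-up numbers for illustration, not measurements of any board or of SpinalVoodoo.

```python
def requests_in_flight(rate_hz: int, latency_ns: int) -> int:
    """Little's law: requests that must be pending to sustain rate_hz."""
    # Ceiling division on integers avoids floating-point rounding surprises.
    return (rate_hz * latency_ns + 10**9 - 1) // 10**9


def sustained_rate_hz(max_outstanding: int, latency_ns: int) -> int:
    """Best-case request rate when at most max_outstanding requests are pending."""
    return max_outstanding * 10**9 // latency_ns


# Hypothetical numbers: one memory request per cycle at 50 MHz, 200 ns latency.
print(requests_in_flight(50_000_000, 200))  # 10 requests must overlap
print(sustained_rate_hz(1, 200))            # 5000000 requests/s if each one blocks
```

Under these assumptions, a pipeline that waits for each reply reaches a tenth of the target rate, which is why a design built around low-latency memory tends to need restructuring rather than tuning.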

Reply 2 of 12, by NeoG_

Rank Oldbie

vstrakh wrote on 2026-03-24, 07:02:

No hardware resources estimation shown in readme.md and no performance indicators given.
As I understood the output is produced in simulation only. The implementation could be a fully functional model which either will not fit in available fpga boards (like MiSTer), or will choke because of limited memory throughput and high latencies.
Still cool though, would be great if he redesigned the inner pipeline with the modern memories in mind - high throughput, high latency. It's always the memory bandwidth that limits the possibilities.

The designer said on Reddit that initially, on a DE10-Nano, he was only able to get 12.5 MHz cycle-accurate, but he has since optimised it to reach 50 MHz, which should be the same as an original Voodoo. I imagine there is still much room to improve.

98/DOS Rig: BabyAT AladdinV, K6-2+/550, V3 2000, 128MB PC100, 20GB HDD, 128GB SD2IDE, SB Live!, SB16-SCSI, PicoGUS, WP32 McCake, iNFRA CD, ZIP100
XP Rig: Lian Li PC-10 ATX, Gigabyte X38-DQ6, Core2Duo E6850, ATi HD5870, 2GB DDR2, 2TB HDD, X-Fi XtremeGamer

Reply 3 of 12, by vstrakh

Rank Member

Why then are no stats listed in the repository?

Also, normally it's not "much room to improve" but "too much stuff doesn't meet timing and has to be redesigned from scratch".
I'm skeptical about the use of SpinalHDL. While it lets you think at a level further removed from the underlying hardware, squeezing out the last drop of performance under tight constraints comes from relying on that hardware architecture: planning around the properties of the available hardware blocks and their relative layout on the chip.

Reply 4 of 12, by NeoG_

Rank Oldbie

vstrakh wrote on 2026-03-24, 08:17:

Why then are no stats listed in the repository?

They are available on Reddit to answer anything you need; I obviously don't know.

https://www.reddit.com/r/FPGA/comments/1s18vi … on_of_the_3dfx/


Reply 5 of 12, by noquiche

Rank Newbie

Hi! Creator here. Regarding the architecture, the Voodoo is pretty much a giant linear pipeline, so there's plenty of opportunity for retiming beyond the 50 MHz I am sitting at right now. SpinalHDL makes retiming way easier than Verilog or VHDL and gives you just as much control.
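A toy model of why a linear pipeline leaves room for retiming: the clock is limited by the slowest stage, so moving logic across register boundaries to even out the stage delays raises the achievable frequency. The stage delays below are invented for illustration and are not taken from SpinalVoodoo.

```python
def fmax_mhz(stage_delays_ns):
    """Maximum clock in MHz, limited by the slowest pipeline stage."""
    return 1000.0 / max(stage_delays_ns)


def ideal_retimed(stage_delays_ns):
    """Ideal case: the same total logic delay spread evenly over the same stages."""
    even = sum(stage_delays_ns) / len(stage_delays_ns)
    return [even] * len(stage_delays_ns)


unbalanced = [20.0, 5.0, 10.0, 5.0]        # hypothetical stage delays in ns
print(fmax_mhz(unbalanced))                # 50.0
print(fmax_mhz(ideal_retimed(unbalanced))) # 100.0
```

This ideal split ignores register setup time, routing delay and logic that cannot be divided, so a real design lands somewhere between the two figures.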

I believe the main risk right now is the memory interface. I explicitly designed it so that you can plug it into almost anything, including a single modern DDR interface. That does mean I need fill buffers for framebuffer access and a cache for the texture unit in order to guarantee speed with bursty accesses. If you were to use separate single-cycle memories for the framebuffer and textures, like the original Voodoo, then you should be able to get to the theoretical 50 Mtx/s fairly easily with this design.
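To illustrate why a texture cache helps with burst-oriented memory, here is a toy direct-mapped cache model in which every miss stands for one burst read of a whole line. The line size, line count and access pattern are assumptions chosen for the example; they do not describe the actual SpinalVoodoo cache.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: each miss models one burst read of a full line."""

    def __init__(self, num_lines: int, line_bytes: int):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines  # which memory line each slot holds
        self.hits = 0
        self.misses = 0

    def read(self, address: int) -> bool:
        """Look up one byte address; return True on a hit."""
        line_number = address // self.line_bytes
        index = line_number % self.num_lines
        if self.tags[index] == line_number:
            self.hits += 1
            return True
        self.tags[index] = line_number  # refill the slot with one burst
        self.misses += 1
        return False


# Hypothetical access pattern: walk one 64-texel row of a 16-bit texture.
cache = DirectMappedCache(num_lines=16, line_bytes=32)
for texel in range(64):
    cache.read(texel * 2)
print(cache.hits, cache.misses)  # 60 4
```

With these numbers, 64 texel fetches turn into 4 burst reads, which is the kind of access pattern a single DDR interface handles well.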

That's exactly what I'm working on right now. Tomb Raider is at 5 FPS on the DE10-Nano because of general memory contention issues. But I still have plenty of ideas. Give me a couple of weeks and hopefully it will be running at full speed.

Reply 6 of 12, by leileilol

Rank l33t++

Sadly, this is not a fresh, all-new implementation. This looks to be a port of PCem's Voodoo emulation going by the screenshots, the similar limitations, and a lot of the credit to 86box when that's really Sarah Walker's code for PCem that's being referenced (ah the joys of hostile forks)

You should reference MAME's Voodoo.cpp instead, tbh imho: they've at least got LOD dither, dither subtraction and clocks, and they don't use a big LUT as a recompiler performance optimization (that's why it was there in PCem; it used to generate the tables, but gcc couldn't optimize that well). They're not perfect on the Voodoo emulation either, though.
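For readers unfamiliar with the dithering being discussed, this is a minimal sketch of ordered dithering when an 8-bit colour channel is reduced to 5 bits, as happens on the way to a 16-bit framebuffer. It uses the textbook 4x4 Bayer matrix as an assumption; it is not the Voodoo's actual dither table and does not model LOD dither or dither subtraction.

```python
# Textbook 4x4 Bayer threshold matrix (an assumption, not the Voodoo's table).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_to_5bit(value8: int, x: int, y: int) -> int:
    """Quantize an 8-bit channel to 5 bits with a position-dependent threshold."""
    threshold = BAYER_4X4[y & 3][x & 3]   # 0..15, repeats every 4 pixels
    scaled = value8 * 31 * 16 // 255      # target level in 1/16 steps
    return min(31, (scaled + threshold) // 16)


# Over one 4x4 tile, mid-grey alternates between two neighbouring 5-bit levels.
tile = [dither_to_5bit(128, x, y) for y in range(4) for x in range(4)]
print(sorted(set(tile)))  # [15, 16]
```

Averaged over the tile, the output sits between the two levels, which is what lets a 5-bit channel approximate shades it cannot store directly.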

Also one particular weird thing...... if you don't have the gamma table CLUTs implemented, then how is there gamma in the presented screenshots?

long live PCem
FUCK "AI"

Reply 7 of 12, by noquiche

Rank Newbie

leileilol wrote on Yesterday, 03:23:

Sadly, this is not a fresh, all-new implementation. This looks to be a port of PCem's Voodoo emulation going by the screenshots, the similar limitations, and a lot of the credit to 86box when that's really Sarah Walker's code for PCem that's being referenced (ah the joys of hostile forks)

This is true. Sadly, I don't have a real Voodoo card to test with. In principle I could generate a bunch of triangles with many combinations of the available options, render all of them with a real Voodoo card, and use that as my golden reference. This would be very useful, since right now my biggest mismatch with the software model is at the triangle edges. PCem uses edge-walking while I am using Pineda-style rasterization, which is more accurate and more likely to match the real Voodoo hardware. So something like this could allow me to get to a 100% match. Maybe there is a Voodoo owner here who is willing to help?
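For anyone wondering what "Pineda-style" means in practice, below is a minimal sketch of edge-function rasterization, plus the kind of pixel-set comparison a golden-reference test could use. It samples at pixel centres and treats all edges as inclusive; the real hardware's fill rule and sub-pixel precision are not modelled here, and the function names are made up for the example.

```python
def edge(ax, ay, bx, by, px, py):
    """Pineda edge function: the sign says which side of edge a->b point p is on."""
    return (px - ax) * (by - ay) - (py - ay) * (bx - ax)


def rasterize(v0, v1, v2, width, height):
    """Return the set of (x, y) pixels whose centres lie inside the triangle."""
    area = edge(*v0, *v1, *v2)
    if area == 0:
        return set()          # degenerate triangle covers nothing
    if area < 0:
        v1, v2 = v2, v1       # normalize the winding so inside means >= 0
    covered = set()
    for y in range(height):
        for x in range(width):
            cx, cy = x + 0.5, y + 0.5
            inside = (edge(*v1, *v2, cx, cy) >= 0
                      and edge(*v2, *v0, cx, cy) >= 0
                      and edge(*v0, *v1, cx, cy) >= 0)
            if inside:
                covered.add((x, y))
    return covered


def coverage_diff(ours, reference):
    """Pixels covered by exactly one of the two renderings."""
    return ours ^ reference


pixels = rasterize((0, 0), (4, 0), (0, 4), 8, 8)
print(len(pixels))  # 10
```

Edge mismatches against a reference show up directly in `coverage_diff`, which makes it easy to see whether two implementations disagree only on boundary pixels.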

You should reference MAME's Voodoo.cpp instead, tbh imho: they've at least got LOD dither, dither subtraction and clocks, and they don't use a big LUT as a recompiler performance optimization (that's why it was there in PCem; it used to generate the tables, but gcc couldn't optimize that well). They're not perfect on the Voodoo emulation either, though.

Thanks for the tip.

I was under the impression that PCem's Voodoo implementation was originally based on an older MAME version. Is that inaccurate?

Also one particular weird thing...... if you don't have the gamma table CLUTs implemented, then how is there gamma in the presented screenshots?

Right now I just apply a gamma of 2.2 in the trace replaying code that reads out the screenshots from FB memory. This is why in general the screenshots look too bright (the Voodoo uses lower gamma than standard by default). I have not implemented this unit yet because I probably will not need it for the first user-facing integration I am working on.
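A rough sketch of what such a readout step might look like: expand each 16-bit RGB565 framebuffer pixel to 8-bit channels, then push every channel through a gamma lookup table. The direction of the curve (exponent 1/2.2, which brightens mid-tones) is an assumption based on the screenshots looking too bright, and the function names are hypothetical rather than taken from the SpinalVoodoo repository.

```python
def rgb565_to_rgb888(pixel: int):
    """Expand one 16-bit RGB565 pixel to 8-bit channels by bit replication."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


def build_gamma_lut(gamma: float = 2.2):
    """256-entry table mapping a linear 8-bit value to a gamma-encoded one."""
    return [round(255 * (i / 255) ** (1.0 / gamma)) for i in range(256)]


def apply_gamma(rgb, lut):
    """Apply the same lookup table to each channel of one pixel."""
    return tuple(lut[c] for c in rgb)


lut = build_gamma_lut(2.2)
print(apply_gamma(rgb565_to_rgb888(0xFFFF), lut))  # (255, 255, 255)
```

On real hardware the gamma CLUT would take the place of this fixed table, so swapping in a table with a lower exponent would be the natural way to match the card's default look.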

Reply 8 of 12, by leileilol

Rank l33t++

noquiche wrote on Yesterday, 06:31:

I was under the impression that PCem's Voodoo implementation was originally based on an older MAME version. Is that inaccurate?

Before she worked on Voodoo emulation she had S3 ViRGE's 3D emulation working, and the Voodoo emulation code followed a structure similar to how ViRGE was handled. MAME's code was very slow at the time, though they still did things PCem didn't (dithered subtraction and dithered LOD), and the license was not yet permissive.

It's not the only emulator with Voodoo Graphics in there: there's also Bochs, but that's been an eternally unintuitive emulator, and I think their Voodoo was MAME-derived. (Currently Bochs is working on geforce2-3-fx-6800u emulation, which I highly doubt is functional at this point. I don't think there's a consumer CPU in the world that can emulate a Geforce2 GTS rendering 1280x1024x32 with 2x FSAA.)

As for Voodoo Graphics cards I own, I skipped that one for a PowerVR PCX2 😀 (do have a v2, banshee and v3 though)


Reply 9 of 12, by SarahWalker

Rank Member

Nice to see some people are denying me credit for my work!

PCem's Voodoo emulation is indeed my own work. While I did look at MAME (as well as other sources) for one or two of the finer details, no MAME code is present in the Voodoo emulation.

Reply 10 of 12, by BitWrangler

Rank l33t++

Interesting.

Makes me wonder if I should start hunting out more examples of pre-existing FPGAs on PCI cards, in case we find one that is adaptable. My previous thread has examples: "FPGA found on PCI parallel card, possibilities???". Obviously 5000 gates isn't going to do it; however, a Spartan 6 one is mentioned in the last comment (of 2023, in case the thread gets bumped).

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 11 of 12, by wierd_w

Rank Oldbie

Since the OG author of the (PCem) emulation can be reached, why not capitalize on that, and attribute accordingly?

I'd understand if she has cold feet and politely declines, but attributing code heritage is not that hard to do once you've been made aware of it, no?

Reply 12 of 12, by BinaryDemon

Rank Oldbie

I wonder how long it will be before an FPGA as powerful as the DE10-Nano costs under $100.