First post, by pyrogx
The other day I was messing around with some 3D accelerators and I was annoyed by the fact that Voodoo 1s (SST1) didn't want to cooperate with my 1GHz Athlon test system. Web search results mostly came down to "Yeah, Voodoo1s have issues with fast systems, FSB issue, don't go above 66MHz, don't use CPUs above 500MHz, use an older system, etc, etc..."
But my inner scientist was asking the question:
Why?
What's wrong with the first Voodoo cards that they stop working properly on systems with fast CPUs or FSB speeds, although the card itself still sits on a standard PCI bus with 33MHz? What does FSB speed have to do with it? Or CPU clock speed?
So I started poking around: Different drivers, different versions of the Glide libraries, different OSs (DOS, Windows, Linux), different chipset settings affecting the PCI bus behaviour. The result was always the same:
The card starts to initialize, the VGA passthrough is turned off - and then the card hangs, displays whatever garbage is currently in the framebuffer, and most of the time also takes the rest of the system down.
Anyway, I noticed a few things:
- Sometimes, on very rare occasions, the Voodoo1 would work properly, but only once and not reproducibly.
- If I go to an 800MHz CPU, the card works more often, but still not 100% of the time.
- If I go to a 500MHz Athlon (the slowest one I have), the card works fine.
The rest of the system including the 100MHz FSB is always the same. So the whole mess is not really related to FSB speed, at least not on my system. Also, if I turn off caches or slow the system down with something like THROTTLE.EXE, the Voodoo also works with the 1GHz Athlon, although as a slideshow simulator.
So I thought: Maybe it's not the hardware but the software, i.e. the drivers? Do they contain some speed-sensitive code which breaks on faster systems?
Since I had Linux on that machine already, I grabbed one of the many Glide source code forks (https://github.com/sezero/glide), compiled it for the SST1, verified that it worked on a slow system (it does) and that it failed on a fast system (it does).
I knew that the card failed early during init, so I started single-stepping through the glide init code with a debugger. Soon I noticed that if I single-stepped the program, the Voodoo worked on my fast CPU, but if I set a breakpoint somewhere after initialization and let the code run at full speed, it failed.
And then, when crawling through the actual code, I noticed it contains something like this:
/* glide2x/sst1/init/initvg/util.c */
/* Wait for command to execute... */
for(n=0; n<25000; n++)
    sst1InitReturnStatus(sstbase);
...and also this:
/* glide2x/sst1/init/initvg/video.c */
/* Wait for video clock to stabilize */
for(n=0; n<200000; n++)
    sst1InitReturnStatus(sstbase);
...and quite a few more variants of that in different files.
This reeks of a delay loop being executed too fast for the card. The function sst1InitReturnStatus does not seem to do anything special except for returning the value of the memory-mapped status register of the SST1 FBI.
But the for-loop does not even look at the result, so it is just a delay loop running in circles around that register.
That's a recipe for failure on faster systems. It looks like if you perform the initialization steps too quickly for the SST1, it gets confused and drops into an undefined state (i.e. it hangs...).
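To make the idea concrete, here is a minimal sketch of a delay loop that keeps the original poll count but also enforces a minimum wall-clock duration, so a faster CPU cannot shorten the wait. This is not code from the Glide sources: mock_read_status(), monotonic_ns() and poll_delay() are hypothetical names, the mock variable stands in for the real status register read done by sst1InitReturnStatus(), and the use of clock_gettime() assumes a POSIX system such as Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for sst1InitReturnStatus(sstbase): a real build
 * would read the memory-mapped SST1 status register here instead. */
static volatile uint32_t mock_status_reg = 0;
static unsigned long mock_reads = 0;

static uint32_t mock_read_status(void)
{
    mock_reads++;
    return mock_status_reg;
}

/* Current monotonic time in nanoseconds. */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Poll at least min_reads times AND for at least min_ns nanoseconds.
 * The count alone is what the original loops rely on; the time floor is
 * the assumed fix that makes the delay independent of CPU speed. */
static void poll_delay(unsigned long min_reads, uint64_t min_ns)
{
    const uint64_t start = monotonic_ns();
    unsigned long n = 0;

    while (n < min_reads || monotonic_ns() - start < min_ns) {
        (void)mock_read_status();
        n++;
    }
}
```

A call such as poll_delay(25000, 1000000) would then mirror the first loop above with a floor of one millisecond. The floor value is a guess for illustration only; the delay the hardware actually needs is not known from the source.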
Just for fun, I tried to modify the source code of sst1InitReturnStatus() so that it slows down a bit by using a nanosleep() call in that function. I added a delay of just 20ns on each call of sst1InitReturnStatus(), recompiled and checked if it made any difference.
It did. The Voodoo came up with no issues using my modified glide library, every single time: no hangs and no crashes anymore. I could even play GLQuake and Quake II using FXMesa without problems.
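The modification described above can be sketched roughly as follows. This is an illustration, not the actual patch: read_status_slowed() and delay_by_polling() are hypothetical names and the mock variable replaces the real register read. One thing worth keeping in mind is that a nanosleep() call with such a small value likely takes far longer than 20ns on a typical Linux system because of system call and timer overhead, so the real delay per read was probably closer to the microsecond range.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the memory-mapped SST1 status register. */
static volatile uint32_t mock_status = 0x80u;

/* Status read with a short sleep in front of it, in the spirit of the
 * experiment described in the post. */
static uint32_t read_status_slowed(void)
{
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 20 };

    nanosleep(&pause, NULL);  /* requested 20 ns; the real pause is longer */
    return mock_status;
}

/* Same shape as the original delay loops: the result is ignored and only
 * the time spent in the reads matters. */
static void delay_by_polling(unsigned long count)
{
    unsigned long n;

    for (n = 0; n < count; n++)
        (void)read_status_slowed();
}
```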
So the whole problem (at least on my particular Athlon system; I haven't tested even faster systems yet) was just a delay loop finishing too early during hardware setup. I don't know for sure why 3Dfx did this, but I have a suspicion: the Voodoo1 (and the Voodoo2) have no active feedback channel for their current state, like an interrupt. There is just a status register that you have to poll a lot, and maybe that register doesn't even track all the states of the hardware, especially during memory and video clock setup. That's a very simple way to implement this in hardware: just fire a series of commands at different registers, wait a bit and hope that the hardware does what it needs to do in time...
I will try to compile a glide2x.dll and glide3x.dll for Win9x now, but I need to find a replacement for that nanosleep() function, which is a POSIX call and not available on Win9x.
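One possible replacement, as a rough sketch only: a busy-wait built on QueryPerformanceCounter() on Windows, with a clock_gettime() branch so the same helper builds on Linux. busy_wait_us() is a hypothetical name, and whether the performance counter is usable on a given Win9x machine is an assumption (the code falls back to Sleep(), which only has millisecond granularity, if QueryPerformanceFrequency() reports failure). The Windows branch has not been tested on real hardware here.

```c
#define _POSIX_C_SOURCE 200809L

#ifdef _WIN32
#include <windows.h>

/* Spin until at least `us` microseconds have passed (Windows branch). */
static void busy_wait_us(unsigned long us)
{
    LARGE_INTEGER freq, start, now;
    LONGLONG ticks;

    if (!QueryPerformanceFrequency(&freq) || freq.QuadPart <= 0) {
        Sleep((us + 999) / 1000);  /* coarse fallback, rounds up to ms */
        return;
    }
    /* Convert microseconds to counter ticks, rounding up. */
    ticks = (freq.QuadPart * (LONGLONG)us + 999999) / 1000000;
    QueryPerformanceCounter(&start);
    do {
        QueryPerformanceCounter(&now);
    } while (now.QuadPart - start.QuadPart < ticks);
}

#else
#include <time.h>

/* Spin until at least `us` microseconds have passed (POSIX branch). */
static void busy_wait_us(unsigned long us)
{
    struct timespec start, now;
    long long elapsed_ns;

    clock_gettime(CLOCK_MONOTONIC, &start);
    do {
        clock_gettime(CLOCK_MONOTONIC, &now);
        elapsed_ns = (long long)(now.tv_sec - start.tv_sec) * 1000000000LL
                   + (now.tv_nsec - start.tv_nsec);
    } while (elapsed_ns < (long long)us * 1000LL);
}
#endif
```

Spinning instead of sleeping burns CPU time, but during a short init sequence that is probably acceptable, and it avoids depending on the scheduler's timer resolution.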