I have an Ahead V5000 (VGA-102-1M) ISA VGA card with 1 MB of VRAM in my 386sx-40. Output looks great, and it seems to be pretty compatible with DOS games (no Keen scrolling issue). However, in Wolfenstein 3D, I'm getting what I can only describe as "snow" during the fade-in/fade-out effects. Once the transition is over, the picture is perfectly fine. It's hard to capture, but see the top few rows of randomly "light" pixels in the picture below.
I have swapped around some of the VRAM chips, and the exact same artifact occurs in the same area of the screen, so I don't think any of the VRAM is bad. Any ideas?
This is caused by a limitation of the SRAM technology chosen for the card. In short, the game is attempting to program the VGA palette during a visible part of the image, and not during vblank. Since the artifacts appear at the very top of the screen, that suggests the game starts the palette programming at vsync but runs out of time, and the programming runs over into the start of the next visible frame, causing the artifacts at the top of the screen.
Typically SRAM has some specific number of ports, which can be either read or write ports. Different logic chips can be wired to these access ports to access the information. The most common designs are "single-ported", where there is only one port that can be flipped between read and write, or "dual-ported", which means that one actor can read from the SRAM while another writes to it, simultaneously. More than two ports becomes quite rare and a complex design that costs good money.
If a hardware design requires more ports to access the SRAM than are available, then one must implement a multiplexed switching design, where the chip is dynamically served to different actors at different times; in this design, the actors cannot access the SRAM simultaneously.
The RAMDAC palette is very likely stored on such an SRAM chip. So what is happening here is that the graphics card palette SRAM is running out of ports to let the RAMDAC read the palette for visible picture scanout while the CPU writes a new palette to it (a single-ported SRAM chip). Since the card cannot drop the palette write access (or the programmed palette color would be lost permanently), it does the less disturbing thing and temporarily disables palette access for the visible picture scanout, so for a short duration (a few pixels) while the CPU is writing the new palette color, the RAMDAC scans out incorrect colors.
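To put a rough number on "a few pixels": a back-of-the-envelope sketch, assuming a 25.175 MHz Mode 13h dot clock (two dot clocks per 320-wide pixel) and a stock 8.33 MHz ISA bus with a 6-clock 8-bit I/O write cycle. These are typical textbook figures, not measurements from this card:

```python
# Estimate how many Mode 13h pixels one CPU palette write can disturb
# on a single-ported palette SRAM. All constants are assumptions
# (typical VGA/ISA values), not measurements from the Ahead card.

DOT_CLOCK_HZ = 25_175_000       # standard VGA 320x200 dot clock
DOTS_PER_LOWRES_PIXEL = 2       # each 320-wide pixel lasts 2 dot clocks
ISA_CLOCK_HZ = 8_333_000        # stock ISA bus clock
ISA_CLOCKS_PER_8BIT_IO = 6      # default 8-bit I/O write cycle length

pixel_ns = 1e9 * DOTS_PER_LOWRES_PIXEL / DOT_CLOCK_HZ      # ~79 ns
io_write_ns = 1e9 * ISA_CLOCKS_PER_8BIT_IO / ISA_CLOCK_HZ  # ~720 ns

pixels_disturbed = io_write_ns / pixel_ns
print(f"one palette write blocks scanout for ~{pixels_disturbed:.1f} pixels")
```

So each individual palette byte written over ISA can plausibly hold the SRAM port for on the order of nine low-res pixels, which matches the short horizontal streaks of snow.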
There is a test program, PALANIM, which interactively tests how the graphics card behaves when the palette is programmed synchronized to vblank, or unsynchronized during the visible picture area.
The attachment PALANIM-instructions.png is no longer available
It renders a palette color cycling animation that should look like this:
The attachment PALANIM-expected.png is no longer available
It is not only old ISA cards that are subject to this issue. For example, I have this newer PCI ATI Mach64 VT-264VT2 card from 1997:
The attachment ATI Mach64 VT-264VT2.jpg is no longer available
and when I run PALANIM on it, I instead get the following output from the Feature Connector of the card:
The attachment PALANIM-ATI-Mach64-VT2.png is no longer available
the display "snows" as PALANIM reprograms the palette repeatedly as fast as possible.
Interestingly, this behavior was so common that VESA actually made immunity to it an advertised Feature Capability: a card can declare that its palette may be programmed unsynchronized, without danger of snow.
The need for D2 = 1 "program the RAMDAC using the blank bit in Function 09h" is for older style RAMDACs, where programming the RAM values during display time causes a "snow-like" effect on the screen. Some newer style RAMDACs don't have this limitation and can easily be programmed at any time, but older RAMDACs require that they be programmed during the vertical retrace period so as not to display the snow while values change during display time.
Curiously, this ATI card of mine advertises in the VESA Feature Capabilities that it does not have "palette snow". This VESA feature capability is printed by the SNOOP program as the property "[x] no-snow" under the VESA Caps field:
The attachment SNOOP-ATI-Mach64-VT2.png is no longer available
No ATI, no, stop lying. You do have palette snow.
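For reference, the bit in question lives in the Capabilities dword that VBE Function 00h returns in the VbeInfoBlock. A minimal decoding sketch, with bit meanings taken from the VBE spec text quoted above (the helper name is made up):

```python
# Toy decoder for the VESA VBE "Capabilities" dword (VBE Function 00h,
# VbeInfoBlock). Per the spec text quoted above, D2 = 1 means the RAMDAC
# must be programmed during blank, i.e. the card is NOT snow-free.

def decode_vbe_caps(caps: int) -> dict:
    return {
        "dac_width_switchable": bool(caps & 0x01),  # D0: DAC switchable to 8 bits
        "not_vga_compatible":   bool(caps & 0x02),  # D1: controller not VGA compatible
        "no_snow":              not (caps & 0x04),  # D2 clear: program palette any time
    }

# A card advertising "no-snow" (like this ATI claims to) reports D2 = 0:
print(decode_vbe_caps(0x1))
```

SNOOP's "[x] no-snow" property is presumably derived from exactly this D2 bit, which is why the claim and the observed snow can disagree.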
Last edited by clb on 2023-09-12, 10:17. Edited 1 time in total.
One last mildly interesting thing: on this ATI Mach64 VT-264VT2 card, the palette snow is much worse on the Feature Connector output than on the VGA output. I was really surprised to see there is a difference.
Feature Connector output (the same image from previous post):
The attachment PALANIM-ATI-Mach64-VT2.png is no longer available
VGA output at the same time:
The attachment PALANIM-ATI Mach64 VT-264VT2-on-VGA.jpg is no longer available
i.e. both outputs do snow, but the Feature Connector snows considerably more than the VGA output. On the VGA output, the areas outside the vertical stripes are completely clean black. Weird.
The palette SRAM will very likely reside inside the RAMDAC chip, which on my Ahead is the KDA0476CN-66 chip.
Other Vogons people have occasionally been adventurous and swapped RAMDACs on cards with pin-compatible ones, e.g. when the original has died, but I am not sure that operation would help in this kind of case. CRT Terminator will also avoid the issue for this card at least, without mods needed 😀
yep, the palette SRAM resides inside the RAMDAC, otherwise you would need to feed 18 bits of pixel data from the VGA chip instead of 8 :)
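As a quick sanity check of that 8-vs-18-bit point, here is the arithmetic on where those widths come from (standard VGA DAC figures):

```python
# The VGA chip sends only an 8-bit palette index per pixel; the RAMDAC
# expands it to 18 bits of RGB via its internal 256-entry lookup SRAM.
# Figures below are the standard VGA DAC parameters.

ENTRIES = 256          # 2**8 addressable palette indices
BITS_PER_CHANNEL = 6   # VGA DAC depth per color channel
CHANNELS = 3           # R, G, B

bits_per_entry = BITS_PER_CHANNEL * CHANNELS  # the "18 bits" mentioned above
sram_bits = ENTRIES * bits_per_entry          # total palette SRAM capacity
print(f"{bits_per_entry}-bit entries, {sram_bits} bits of palette SRAM")
```

So the lookup only needs a tiny ~4.6 kbit SRAM inside the RAMDAC, versus more than doubling the pin count of the pixel bus if the palette lived in the VGA chip.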
Speaking of snow itself, there are three distinct RAMDAC classes:
- single-ported palette RAM with a transparent latch on the RGB DAC inputs. During CPU access to the palette, the SRAM address/data buses are routed to the RAMDAC CPU interface, but the latch continues to feed pixel data to the RGB DACs, resulting in the currently CPU-accessed palette data being displayed on the screen during CPU access time, generating much of the unpleasant snow. Early VGA RAMDACs (like the IMSG171) use this scheme.
- later (approx. from 1990 onwards) most RAMDAC vendors started to advertise their products as "snow-free". In order to save costs on dual-ported SRAM, they "masked" the problem by disabling the latch during CPU access time, so the RGB DACs drive the last pixel value to the monitor; thus, the "snow" appears as less severe image shimmering. Many RAMDACs and graphics cards (from the common KDA476/TDK8001 up to S3/NVidia integrated DACs) use this method.
- finally, using dual-ported SRAM eliminates the "snow" artifacts completely; it is used in higher-end RAMDACs.
Gona's snow table lists both graphics cards (look for the Warcraft 1 column) and discrete RAMDACs by "noise" type, but doesn't really distinguish between them, describing the perceived noise "level" instead - I suppose "big noise" refers to the first type (single-ported with pixel data always routed to the DACs).
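The three classes can be illustrated with a toy scanout model (entirely illustrative, not any real chip's timing):

```python
# Toy model of the three RAMDAC classes: scan out one 10-pixel line of
# alternating blue/red while the CPU touches the palette during pixels
# 4-5. Purely illustrative; no real chip works in whole-pixel steps.

def scanout(ramdac_class, cpu_write_pixels):
    palette = {0: "blue", 1: "red"}
    framebuffer = [x % 2 for x in range(10)]   # alternating color indices
    out, latch = [], "blue"
    for x, idx in enumerate(framebuffer):
        if x not in cpu_write_pixels:
            latch = palette[idx]       # normal palette lookup for this pixel
            out.append(latch)
        elif ramdac_class == "transparent_latch":
            out.append("garbage")      # CPU-accessed entry leaks to the DACs: snow
        elif ramdac_class == "latch_hold":
            out.append(latch)          # DACs repeat the last good pixel: shimmer
        elif ramdac_class == "dual_ported":
            out.append(palette[idx])   # true second read port: no artifact
    return out

print(scanout("transparent_latch", {4, 5}))
print(scanout("latch_hold", {4, 5}))
print(scanout("dual_ported", {4, 5}))
```

The first class emits outright wrong data, the second merely repeats the previous pixel (wrong only where the picture changes), and the dual-ported class is unaffected.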
- single-ported palette RAM with transparent latch on RGB DAC inputs.
Thank you for this great info.
Judging by the width of the glitches, CPU RAMDAC access times are much longer/slower than ordinary pixel-output RAMDAC polling. Makes me wonder why RAMDAC manufacturers didn't implement a simple FIFO in front of the palette RAM? Maybe running the RAMDAC synchronously with the pixel clock was just too much of a simplicity win to give up.
That was interesting to learn; I did not know the `outsb` instruction even existed before this. A neat feature.
Looking at the slow path, it does a traditional `out dx, al` instruction looped 256 times, unrolled 3x for R-G-B. That is a very traditional way to implement palette uploads. Also, looking at the code, it does wait for vsync start right before initiating the palette upload.
I am a bit surprised that any graphics card would be too slow to manage the palette upload even via the Wolf 3D slow path, but it looks like that must be the case.
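For illustration, here is the write sequence that traditional slow path performs, sketched in Python with the port I/O abstracted into a callable so it can be shown without hardware (the real code is of course x86 `out` instructions, and `upload_palette` is a made-up name):

```python
# Sketch of the classic VGA palette upload protocol. Port 3C8h is the
# DAC write index, 3C9h is the DAC data port; each palette entry takes
# three 6-bit writes (R, G, B), and the index auto-increments after
# every third data write.

PEL_WRITE_INDEX = 0x3C8
PEL_DATA = 0x3C9

def upload_palette(outb, palette):
    """palette: 256 (r, g, b) tuples, each component 0-63."""
    outb(PEL_WRITE_INDEX, 0)      # start at palette entry 0
    for r, g, b in palette:       # 256 iterations, unrolled 3x in the real code
        outb(PEL_DATA, r)
        outb(PEL_DATA, g)
        outb(PEL_DATA, b)

# Record the sequence with a fake port writer:
log = []
upload_palette(lambda port, val: log.append((port, val)),
               [(x % 64, 0, 63 - x % 64) for x in range(256)])
print(len(log))   # 1 index write + 768 data writes
```

Those 768 back-to-back data writes, each a full ISA I/O cycle, are exactly what has to fit inside the vblank window.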
I had to implement fastpalette for FastDoom, as some cards wouldn't update the palette correctly, especially on fast systems. The issue is that REP OUTSxx doesn't wait for the port to be ready between writes, so it causes trouble with some cards.
Is there even such a thing as a concept of palette port readiness? You can't check it. Afaik IBM never mentioned/defined any access time restrictions on those registers, like the universally ignored ones of Yamaha (OPL2: 3.3 µs after an index register write, 23 µs after a data register write; OPL3: 0.28 µs after a data register write).
Sounds like an omission leading to broken hardware; it should either be defined in the spec, or VGA should insert wait states.
Now I wonder - what happens when you read out palette really fast on such broken hardware?
There does not exist a VGA mechanism for palette port ready, but there does exist a mechanism for ISA bus ready. I wonder if that might be something that the rep outsb instruction didn't account for (although I am a bit skeptical, as that would defeat the whole purpose of this instruction). It feels more like a graphics card bug rather than a mainboard ISA bus bug.
I implemented a test for outsb/insb compatibility into SNOOP, and it just hit me how slow programming the VGA palette really is.
On my test system, 80 MHz 486 with ISA bus at 10.0 MHz, it looks like the fast rep outsb method takes about 30-31 scanlines in vblank to write new 768 bytes, whereas the slow manual out path code takes 39-40 scanlines.
The whole vertical blank in Mode 13h is just 35 scanlines long, of which the first six scanlines are the vertical front porch, commonly lost since IBM didn't provide a mechanism to wait for vblank start, only for vsync start. (The Commander Keen vblank wait strategy mitigates that a little bit though, and so will CRT Terminator 😁 )
Previously I had thought that DOS games would have had plenty of time to program the palette in vblank, and then some more for other operations, but that doesn't seem to be the case. With the 286+ rep outsb instruction sequence and ISA bus at 10.0 MHz one can just about make it.
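Putting those measured numbers together as a rough budget (scanline counts from above, taking the worst-case ends of the measured ranges):

```python
# Rough vblank budget check, using the figures measured above: Mode 13h
# vblank is 35 scanlines, the first 6 (front porch) are lost when
# waiting for vsync start instead of vblank start.

VBLANK_LINES = 35
FRONT_PORCH_LINES = 6
usable = VBLANK_LINES - FRONT_PORCH_LINES   # scanlines left after vsync start

fast_path = 31   # rep outsb, measured 30-31 scanlines
slow_path = 40   # out dx,al loop, measured 39-40 scanlines

print(f"fast path overruns vblank by {max(0, fast_path - usable)} lines")
print(f"slow path overruns vblank by {max(0, slow_path - usable)} lines")
```

The fast path overruns by only a line or two at worst (hence "just about make it"), while the slow path runs roughly ten scanlines into the visible frame, which lines up nicely with snow appearing only in the top few rows of the picture.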
I added a code path in SNOOP to diagnose whether the rep outsb/insb instructions would not work for palette programming, curious to see if any cards turn up where that'll fail.
Thanks for the pointer. I have two cards, a WD90C30:
The attachment WD90C30-LR.png is no longer available
The attachment CRTT-SCAN-WD90C30-LR.png is no longer available
and a WD90C31:
The attachment WD90C31A-LR.jpg is no longer available
The attachment CRTT-SCAN-WD90C31-LR.png is no longer available
Ran the SNOOP test code on both of them, though surprisingly it looks like the test found no fault on either.
That makes me suspect the incompatibility might have to do with the system motherboard rather than the graphics card - or maybe it depends on the BIOS in question.
Or alternatively I got something wrong in the test code...