I have an Ahead V5000 (VGA-102-1M) ISA VGA card with 1 MB of VRAM in my 386sx-40. Output looks great, and it seems to be pretty compatible with DOS games (no Keen scrolling issue). However, in Wolfenstein 3D, I'm getting what I can only describe as "snow" during the fade-in/fade-out effects. Once the transition is over, the picture is perfectly fine. It's hard to capture, but see the top few rows of randomly "light" pixels in the picture below.
I have swapped around some of the VRAM chips, and the exact same artifact occurs in the same area of the screen, so I don't think any of the VRAM is bad. Any ideas?
This is caused by a limitation of the SRAM technology chosen for the card. In short, the game is attempting to program the VGA palette during a visible part of the image, and not during vblank. Since the artifacts appear at the very top of the screen, that suggests the game starts the palette programming at vsync but runs out of time, and the programming runs over into the start of the next visible frame, causing the artifacts at the top of the screen.
Typically SRAM has some specific number of ports, which can be either read or write ports. Different logic chips can be wired to these access ports to access the information. The most common designs are "single-ported", where there is only one port that can be flipped between read and write, or "dual-ported", which means that one actor can read from the SRAM while another writes to it, simultaneously. More than two ports becomes quite rare and a complex design that costs good money.
If a hardware design requires more ports to access the SRAM than are available, then one must implement a multiplexed switching design, where the chip is dynamically served to different actors at different times; in this design, the actors cannot access the SRAM simultaneously.
The RAMDAC palette is very likely stored on such an SRAM chip. So what is happening here is that the graphics card palette SRAM is running out of ports to let the RAMDAC read the palette for visible picture scanout while the CPU writes a new palette to it (a single-ported SRAM chip). Since the card cannot drop the palette write access (or the programmed palette color would be lost permanently), it does the less disturbing thing and temporarily disables palette access for the visible picture scanout, so for a short duration (a few pixels) while the CPU is writing the new palette color, the RAMDAC scans out incorrect colors.
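To put a rough number on "a few pixels": a back-of-the-envelope sketch, assuming a 25.175 MHz Mode 13h dot clock (two dot clocks per 320-wide pixel) and a stock 8.33 MHz ISA bus with a 6-clock 8-bit I/O write cycle. These are typical textbook figures, not measurements from this card:

```python
# Estimate how many Mode 13h pixels one CPU palette write can disturb
# on a single-ported palette SRAM. All constants are assumptions
# (typical VGA/ISA values), not measurements from the Ahead card.

DOT_CLOCK_HZ = 25_175_000       # standard VGA 320x200 dot clock
DOTS_PER_LOWRES_PIXEL = 2       # each 320-wide pixel lasts 2 dot clocks
ISA_CLOCK_HZ = 8_333_000        # stock ISA bus clock
ISA_CLOCKS_PER_8BIT_IO = 6      # default 8-bit I/O write cycle length

pixel_ns = 1e9 * DOTS_PER_LOWRES_PIXEL / DOT_CLOCK_HZ      # ~79 ns
io_write_ns = 1e9 * ISA_CLOCKS_PER_8BIT_IO / ISA_CLOCK_HZ  # ~720 ns

pixels_disturbed = io_write_ns / pixel_ns
print(f"one palette write blocks scanout for ~{pixels_disturbed:.1f} pixels")
```

So each individual palette byte written over ISA can plausibly hold the SRAM port for on the order of nine low-res pixels, which matches the short horizontal streaks of snow.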
There is a test program, PALANIM, which interactively tests how the graphics card behaves when the palette is programmed synchronized to vblank, or unsynchronized during the visible picture area.
The attachment PALANIM-instructions.png is no longer available
It renders a palette color cycling animation that should look like this:
The attachment PALANIM-expected.png is no longer available
It is not only old ISA cards that are subject to this issue. For example, I have this newer PCI ATI Mach64 VT-264VT2 card from 1997:
The attachment ATI Mach64 VT-264VT2.jpg is no longer available
and when I run PALANIM on it, I instead get the following output from the Feature Connector of the card:
The attachment PALANIM-ATI-Mach64-VT2.png is no longer available
the display "snows" as PALANIM reprograms the palette repeatedly as fast as possible.
Interestingly, this behavior was so common that VESA actually made immunity to it an advertised Feature Capability: a card can declare that its palette may be programmed unsynchronized, without danger of snow.
The need for D2 = 1 "program the RAMDAC using the blank bit in Function 09h" is for older style RAMDACs, where programming the RAM values during display time causes a "snow-like" effect on the screen. Some newer style RAMDACs don't have this limitation and can easily be programmed at any time, but older RAMDACs require that they be programmed during the vertical retrace period so as not to display the snow while values change during display time.
Curiously, this ATI card of mine advertises in the VESA Feature Capabilities that it does not have "palette snow". This VESA feature capability is printed by the SNOOP program as the property "[x] no-snow" under the VESA Caps field:
The attachment SNOOP-ATI-Mach64-VT2.png is no longer available
No ATI, no, stop lying. You do have palette snow.
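For reference, the bit in question lives in the Capabilities dword that VBE Function 00h returns in the VbeInfoBlock. A minimal decoding sketch, with bit meanings taken from the VBE spec text quoted above (the helper name is made up):

```python
# Toy decoder for the VESA VBE "Capabilities" dword (VBE Function 00h,
# VbeInfoBlock). Per the spec text quoted above, D2 = 1 means the RAMDAC
# must be programmed during blank, i.e. the card is NOT snow-free.

def decode_vbe_caps(caps: int) -> dict:
    return {
        "dac_width_switchable": bool(caps & 0x01),  # D0: DAC switchable to 8 bits
        "not_vga_compatible":   bool(caps & 0x02),  # D1: controller not VGA compatible
        "no_snow":              not (caps & 0x04),  # D2 clear: program palette any time
    }

# A card advertising "no-snow" (like this ATI claims to) reports D2 = 0:
print(decode_vbe_caps(0x1))
```

SNOOP's "[x] no-snow" property is presumably derived from exactly this D2 bit, which is why the claim and the observed snow can disagree.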
Last edited by clb on 2023-09-12, 10:17. Edited 1 time in total.
One last mildly interesting thing: on this ATI Mach64 VT-264VT2 card, the palette snow is much worse on the Feature Connector output than on the VGA output. I was really surprised to see there is a difference.
Feature Connector output (the same image from previous post):
The attachment PALANIM-ATI-Mach64-VT2.png is no longer available
VGA output at the same time:
The attachment PALANIM-ATI Mach64 VT-264VT2-on-VGA.jpg is no longer available
i.e. both outputs do snow, but the Feature Connector snows considerably more than the VGA output. On the VGA output, the areas outside the vertical stripes are completely clean black. Weird.
The palette SRAM will very likely reside inside the RAMDAC chip, which on my Ahead is the KDA0476CN-66 chip.
Other Vogons people have occasionally been adventurous and swapped RAMDACs on cards with pin-compatible ones, e.g. when the original has died, but I am not sure that operation would help in this kind of case. CRT Terminator will also avoid the issue for this card at least, without mods needed 😀
yep, the palette SRAM resides inside the RAMDAC, otherwise you would need to feed 18 bits of pixel data from the VGA chip instead of 8 :)
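As a quick sanity check of that 8-vs-18-bit point, here is the arithmetic on where those widths come from (standard VGA DAC figures):

```python
# The VGA chip sends only an 8-bit palette index per pixel; the RAMDAC
# expands it to 18 bits of RGB via its internal 256-entry lookup SRAM.
# Figures below are the standard VGA DAC parameters.

ENTRIES = 256          # 2**8 addressable palette indices
BITS_PER_CHANNEL = 6   # VGA DAC depth per color channel
CHANNELS = 3           # R, G, B

bits_per_entry = BITS_PER_CHANNEL * CHANNELS  # the "18 bits" mentioned above
sram_bits = ENTRIES * bits_per_entry          # total palette SRAM capacity
print(f"{bits_per_entry}-bit entries, {sram_bits} bits of palette SRAM")
```

So the lookup only needs a tiny ~4.6 kbit SRAM inside the RAMDAC, versus more than doubling the pin count of the pixel bus if the palette lived in the VGA chip.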
Speaking of snow itself, there are three distinct RAMDAC classes:
- single-ported palette RAM with a transparent latch on the RGB DAC inputs. During CPU access to the palette, the SRAM address/data buses are routed to the RAMDAC CPU interface, but the latch continues to feed pixel data to the RGB DACs, resulting in the currently CPU-accessed palette data being displayed on the screen during CPU access time, generating much of the unpleasant snow. Early VGA RAMDACs (like the IMSG171) use this scheme.
- later (approx. from 1990 onwards) most RAMDAC vendors started to advertise their products as "snow-free". In order to save costs on dual-ported SRAM, they "masked" the problem by disabling the latch during CPU access time, so the RGB DACs drive the last pixel value to the monitor; thus, the "snow" appears as less severe image shimmering. Many RAMDACs and graphics cards (from the common KDA476/TDK8001 up to S3/NVidia integrated DACs) use this method.
- finally, using dual-ported SRAM eliminates the "snow" artifacts completely; it is used in higher-end RAMDACs.
Gona's snow table lists both graphics cards (look for the Warcraft 1 column) and discrete RAMDACs by "noise" type, but doesn't really distinguish between them, describing the perceived noise "level" instead - I suppose "big noise" refers to the first type (single-ported with pixel data always routed to the DACs).
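The three classes can be illustrated with a toy scanout model (entirely illustrative, not any real chip's timing):

```python
# Toy model of the three RAMDAC classes: scan out one 10-pixel line of
# alternating blue/red while the CPU touches the palette during pixels
# 4-5. Purely illustrative; no real chip works in whole-pixel steps.

def scanout(ramdac_class, cpu_write_pixels):
    palette = {0: "blue", 1: "red"}
    framebuffer = [x % 2 for x in range(10)]   # alternating color indices
    out, latch = [], "blue"
    for x, idx in enumerate(framebuffer):
        if x not in cpu_write_pixels:
            latch = palette[idx]       # normal palette lookup for this pixel
            out.append(latch)
        elif ramdac_class == "transparent_latch":
            out.append("garbage")      # CPU-accessed entry leaks to the DACs: snow
        elif ramdac_class == "latch_hold":
            out.append(latch)          # DACs repeat the last good pixel: shimmer
        elif ramdac_class == "dual_ported":
            out.append(palette[idx])   # true second read port: no artifact
    return out

print(scanout("transparent_latch", {4, 5}))
print(scanout("latch_hold", {4, 5}))
print(scanout("dual_ported", {4, 5}))
```

The first class emits outright wrong data, the second merely repeats the previous pixel (wrong only where the picture changes), and the dual-ported class is unaffected.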
- single-ported palette RAM with transparent latch on RGB DAC inputs.
Thank you for this great info.
Judging by the width of the glitches, CPU RAMDAC access times are much longer/slower than ordinary pixel-output RAMDAC polling. Makes me wonder why RAMDAC manufacturers didn't implement a simple FIFO in front of the palette RAM? Maybe running the RAMDAC synchronously with the pixel clock was just too much of a simplicity win to give up.
That was interesting to learn; I did not know the `outsb` instruction even existed before this. A neat feature.
Looking at the slow path, it does a traditional `out dx, al` instruction looped 256 times, unrolled 3x for R-G-B. That is a very traditional way to implement palette uploads. Also, looking at the code, it does wait for vsync start right before initiating the palette upload.
I am a bit surprised that any graphics card would be too slow to manage the palette upload even via the Wolf 3D slow path, but it looks like that must be the case.
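For illustration, here is the write sequence that traditional slow path performs, sketched in Python with the port I/O abstracted into a callable so it can be shown without hardware (the real code is of course x86 `out` instructions, and `upload_palette` is a made-up name):

```python
# Sketch of the classic VGA palette upload protocol. Port 3C8h is the
# DAC write index, 3C9h is the DAC data port; each palette entry takes
# three 6-bit writes (R, G, B), and the index auto-increments after
# every third data write.

PEL_WRITE_INDEX = 0x3C8
PEL_DATA = 0x3C9

def upload_palette(outb, palette):
    """palette: 256 (r, g, b) tuples, each component 0-63."""
    outb(PEL_WRITE_INDEX, 0)      # start at palette entry 0
    for r, g, b in palette:       # 256 iterations, unrolled 3x in the real code
        outb(PEL_DATA, r)
        outb(PEL_DATA, g)
        outb(PEL_DATA, b)

# Record the sequence with a fake port writer:
log = []
upload_palette(lambda port, val: log.append((port, val)),
               [(x % 64, 0, 63 - x % 64) for x in range(256)])
print(len(log))   # 1 index write + 768 data writes
```

Those 768 back-to-back data writes, each a full ISA I/O cycle, are exactly what has to fit inside the vblank window.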
I had to implement fastpalette for FastDoom, as some cards wouldn't update the palette correctly, especially on fast systems. The issue is that REP OUTSxx doesn't wait for the port to be ready between writes, so it causes trouble with some cards.
Is there even such a thing as a concept of palette port readiness? You can't check it. Afaik IBM never mentioned/defined any access time restrictions on those registers, like the universally ignored ones of Yamaha (OPL2: 3.3 µs after an index register write, 23 µs after a data register write; OPL3: 0.28 µs after a data register write).
Sounds like an omission leading to broken hardware; it should either be defined in the spec, or VGA should insert wait states.
Now I wonder - what happens when you read out palette really fast on such broken hardware?
There does not exist a VGA mechanism for palette port ready, but there does exist a mechanism for ISA bus ready. I wonder if that might be something that the rep outsb instruction didn't account for (although I am a bit skeptical, as that would defeat the whole purpose of this instruction). It feels more like a graphics card bug rather than a mainboard ISA bus bug.
I implemented a test for outsb/insb compatibility into SNOOP, and it just hit me how slow programming the VGA palette really is.
On my test system, 80 MHz 486 with ISA bus at 10.0 MHz, it looks like the fast rep outsb method takes about 30-31 scanlines in vblank to write new 768 bytes, whereas the slow manual out path code takes 39-40 scanlines.
The whole vertical blank in Mode 13h is just 35 scanlines long, of which the first six scanlines are the vertical front porch, commonly lost since IBM didn't provide a mechanism to wait for vblank start, only for vsync start. (The Commander Keen vblank wait strategy mitigates that a little bit though, and so will CRT Terminator 😁 )
Previously I had thought that DOS games would have had plenty of time to program the palette in vblank, and then some more for other operations, but that doesn't seem to be the case. With the 286+ rep outsb instruction sequence and ISA bus at 10.0 MHz one can just about make it.
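Putting those measured numbers together as a rough budget (scanline counts from above, taking the worst-case ends of the measured ranges):

```python
# Rough vblank budget check, using the figures measured above: Mode 13h
# vblank is 35 scanlines, the first 6 (front porch) are lost when
# waiting for vsync start instead of vblank start.

VBLANK_LINES = 35
FRONT_PORCH_LINES = 6
usable = VBLANK_LINES - FRONT_PORCH_LINES   # scanlines left after vsync start

fast_path = 31   # rep outsb, measured 30-31 scanlines
slow_path = 40   # out dx,al loop, measured 39-40 scanlines

print(f"fast path overruns vblank by {max(0, fast_path - usable)} lines")
print(f"slow path overruns vblank by {max(0, slow_path - usable)} lines")
```

The fast path overruns by only a line or two at worst (hence "just about make it"), while the slow path runs roughly ten scanlines into the visible frame, which lines up nicely with snow appearing only in the top few rows of the picture.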
I added a code path in SNOOP to diagnose whether the rep outsb/insb instructions would not work for palette programming, curious to see if any cards turn up where that'll fail.
Thanks for the pointer. I have two cards, a WD90C30:
The attachment WD90C30-LR.png is no longer available
The attachment CRTT-SCAN-WD90C30-LR.png is no longer available
and a WD90C31:
The attachment WD90C31A-LR.jpg is no longer available
The attachment CRTT-SCAN-WD90C31-LR.png is no longer available
Ran the SNOOP test code on both of them, though surprisingly it looks like the test found no fault on either.
That makes me suspect the incompatibility might have to do with the system motherboard rather than the graphics card - or maybe it depends on the BIOS in question.
Or alternatively I got something wrong in the test code...