VOGONS


First post, by vladstamate

Rank Oldbie

I am redesigning my MDA/Herc/CGA/EGA emulation and I have these questions:

1) I want to properly emulate the single port memory on CGAs where either the CPU or the 6845 can read/write to it but not at the same time. Is that single port memory also used on MDA and Hercules cards? Do they also suffer from the "snow" effect?

2) Is the memory contention address-related? In other words, if the CPU touches a different address than the 6845 but at the same time, does the memory still serve only one master?

3) What happens if the 6845 needs to read memory but the CPU is writing to the same address at the same time? Say the CPU got in first: will the 6845 wait for access, or just keep going and generate the next address?

4) When the 6845 reads/writes memory on either card, how many wait states are there?

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 1 of 9, by Scali

Rank l33t

MDA and Hercules do not suffer from snow.
CGA only suffers from snow in 80-column mode.

The reason why 'snow' occurs, as I understand it, is because in 80-column mode, the CPU is not getting locked out of the bus by wait states during all memory accesses.
As a result, the CPU may 'override' the CGA card during accesses to its memory. Which means that whatever data the CPU is accessing in CGA memory is the data being put on the bus, rather than the data the output circuitry was expecting.
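
For emulator purposes, a minimal sketch of that override could look like the following. It assumes a single shared data bus where a CPU access in the same cycle simply wins, so the display logic latches whatever the CPU put on the bus. The class and method names are made up for illustration, and real hardware timing is more subtle than one flag per cycle.

```python
class SinglePortVram:
    """Toy model of single-ported video RAM where the CPU wins the bus (assumption)."""

    def __init__(self, size=16 * 1024):
        self.mem = bytearray(size)
        self.cpu_bus_value = None  # byte the CPU moved over the bus this cycle, if any

    def cpu_write(self, addr, value):
        self.mem[addr % len(self.mem)] = value & 0xFF
        self.cpu_bus_value = value & 0xFF

    def cpu_read(self, addr):
        value = self.mem[addr % len(self.mem)]
        self.cpu_bus_value = value
        return value

    def crtc_fetch(self, addr):
        """Return the byte the display circuitry latches in this cycle."""
        if self.cpu_bus_value is not None:
            # Snow: the CPU's data is on the bus instead of the expected byte.
            return self.cpu_bus_value
        return self.mem[addr % len(self.mem)]

    def end_cycle(self):
        # Hypothetical hook: call once per emulated memory cycle.
        self.cpu_bus_value = None
```

Calling `crtc_fetch` in the same cycle as a `cpu_write` returns the written byte, which matches the observation that the snow shows the same colours the CPU is writing.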

You can see a good example of it here: https://youtu.be/XxAdJpyZ0HM?t=48s
During the time the CPU is writing to the screen to update the scroller, you see 'random' colours at the top, which should be black. Upon closer inspection, it's not that 'random' at all, you clearly see the same colours as used in the scroller. You can also see it, perhaps less obvious, in the twister part. It mainly uses red and white colours, and the snow is also showing mainly red and white.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 2 of 9, by Jo22

Rank l33t++

My XT clone also has a built-in CGA and suffers from snow occasionally.
But if my memory serves me, some CGA clones did use dual-ported memory to fix this issue.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 3 of 9, by vladstamate

Rank Oldbie

Scali wrote:

MDA and Hercules do not suffer from snow.

So that means only the CGA has the single-port memory.

Scali wrote:
CGA only suffers from snow in 80-column mode.

The reason why 'snow' occurs, as I understand it, is because in 80-column mode, the CPU is not getting locked out of the bus by wait states during all memory accesses.
As a result, the CPU may 'override' the CGA card during accesses to its memory. Which means that whatever data the CPU is accessing in CGA memory is the data being put on the bus, rather than the data the output circuitry was expecting.

You can see a good example of it here: https://youtu.be/XxAdJpyZ0HM?t=48s
During the time the CPU is writing to the screen to update the scroller, you see 'random' colours at the top, which should be black. Upon closer inspection, it's not that 'random' at all, you clearly see the same colours as used in the scroller. You can also see it, perhaps less obvious, in the twister part. It mainly uses red and white colours, and the snow is also showing mainly red and white.

Thank you Scali. So if the CPU's BIU is writing data to the CGA memory and in the same cycle the 6845 is reading something then whatever the CPU wrote (and is on the bus) gets served to the 6845. Ok, that makes sense.

The 40 vs 80 column snow difference is caused by timing, right? Basically, the 1 wait state that the CPU is subject to is not enough to cover the 6845 access, so the 6845 will indeed read in the same cycles in which the CPU is writing. In 80-column mode the 6845 generates addresses twice as often and is therefore more likely to clash with the CPU.

Reply 4 of 9, by vladstamate

Rank Oldbie

Jo22 wrote:

My XT clone also has a built-in CGA and suffers from snow occasionally.
But if my memory serves me, some CGA clones did use dual-ported memory to fix this issue.

That is interesting, I did not know about that. I wonder if any of the ATI Wonder series (prior to EGA versions) have dual-port memory.

Reply 5 of 9, by Scali

Rank l33t

vladstamate wrote:

So that means only the CGA has the single-port memory.

Not necessarily... there are other ways to get around the snow-issue. CGA itself gets around it in all but 80-column mode by inserting waitstates on the bus, to lock out the CPU.
The reason why they didn't do this for 80-column mode is probably because you'd lock out the CPU pretty much completely, given that 80-column mode requires twice as much bandwidth as any other mode: It needs to read 80 characters * 2 bytes (character and attribute) per scanline. All other modes require only 80 bytes per scanline. So they instead introduced the WRITE_ENABLE bit in the status register, so you could poll with the CPU when it is 'safe' to write to VRAM. Or you could choose not to care and access the memory a lot faster.
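
To make the bandwidth comparison concrete, here is a small sketch that works out the bytes fetched per scanline for the standard CGA modes. The function name and the mode table are only illustrative; the figures follow from the column counts and pixel depths mentioned above.

```python
def bytes_per_scanline(columns=None, width=None, bits_per_pixel=None):
    """Bytes the display logic must fetch for one scanline."""
    if columns is not None:
        # Text mode: one character byte plus one attribute byte per column.
        return columns * 2
    # Graphics mode: packed pixels.
    return width * bits_per_pixel // 8


modes = {
    "40-column text": bytes_per_scanline(columns=40),
    "80-column text": bytes_per_scanline(columns=80),
    "320x200, 4 colours": bytes_per_scanline(width=320, bits_per_pixel=2),
    "640x200, 2 colours": bytes_per_scanline(width=640, bits_per_pixel=1),
}

for name, count in modes.items():
    print(f"{name}: {count} bytes per scanline")
```

Only 80-column text comes out at 160 bytes; every other mode needs 80.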

There are other ways around it. The C64 for example caches the 'colorram' data (basically the equivalent of the attribute bytes) for 8 scanlines, since they are the same on all scanlines. The same trick could be done with CGA.
You could also use a more clever interleaved interface where you'd fetch from two banks of memory in parallel, so you would fetch twice as much data per cycle, allowing you to give up half the cycles for the CPU to use.
Or you could use memory that is twice as fast.
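
As a rough illustration of what the attribute-caching idea would save, the sketch below counts memory fetches for one character row. It assumes 80 columns, 8 scanlines per character row, and that only the attribute bytes are cached; the function and parameter names are hypothetical.

```python
def fetches_per_character_row(columns=80, scanlines_per_row=8, cache_attributes=False):
    """Count VRAM byte fetches needed to display one row of text."""
    # Character codes are fetched again on every scanline in this model.
    char_fetches = columns * scanlines_per_row
    if cache_attributes:
        # Attributes are fetched once and reused for the remaining scanlines.
        attr_fetches = columns
    else:
        attr_fetches = columns * scanlines_per_row
    return char_fetches + attr_fetches


uncached = fetches_per_character_row()
cached = fetches_per_character_row(cache_attributes=True)
print(uncached, cached)
```

Under those assumptions the cached version needs a little over half the fetches, which would leave the remaining slots for the CPU.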

I don't know how they solved it with MDA/Hercules, but there must be some 'trick' to it. With MDA/Hercules the problem is even worse than with 80-column CGA: You still have the 80 columns and 2 bytes per column. However, you now have more scanlines per frame, so the horizontal scanrate is higher (~18.5 kHz as opposed to ~15.7 kHz).
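
A quick sketch of that comparison, using the approximate scan rates quoted above and 160 bytes per scanline. This averages over the whole scanline, blanking included, so it is only a ballpark figure and not a measurement.

```python
BYTES_PER_SCANLINE = 80 * 2  # 80 columns, character + attribute


def fetch_rate_bytes_per_second(scan_rate_hz, bytes_per_scanline=BYTES_PER_SCANLINE):
    """Average display fetch rate for a given horizontal scan rate."""
    return scan_rate_hz * bytes_per_scanline


cga_rate = fetch_rate_bytes_per_second(15_700)  # ~15.7 kHz
mda_rate = fetch_rate_bytes_per_second(18_500)  # ~18.5 kHz

print(f"CGA 80-column: {cga_rate / 1e6:.2f} MB/s")
print(f"MDA/Hercules:  {mda_rate / 1e6:.2f} MB/s")
```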

vladstamate wrote:

Thank you Scali. So if the CPU's BIU is writing data to the CGA memory and in the same cycle the 6845 is reading something then whatever the CPU wrote (and is on the bus) gets served to the 6845. Ok, that makes sense.

Yes, and I think it also goes for CPU reads from CGA memory. Either way, the data has to pass through the memory bus on the CGA card.

Reply 6 of 9, by reenigne

Rank Oldbie

Scali wrote:

I don't know how they solved it with MDA/Hercules, but there must be some 'trick' to it. With MDA/Hercules the problem is even worse than with 80-column CGA: You still have the 80 columns and 2 bytes per column. However, you now have more scanlines per frame, so the horizontal scanrate is higher (~18.5 kHz as opposed to ~15.7 kHz).

The trick is (at least on the MDA) that the memory bus width is twice as high. MDA uses 2114L (static RAM) chips which are organized as 1024x4 bits each, paired up to give a 16-bit memory bus, so the entire character/attribute pair can be fetched in a single cycle, and CRTC cycles can be interleaved with CPU cycles.
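
For an emulator, the idea can be sketched roughly like this: the display side reads one 16-bit word per character, and the CPU gets its own slot in the same character clock. The names are hypothetical, and the slot order and byte order (character in the even byte, attribute in the odd byte) are assumptions of the sketch, not something taken from the schematics.

```python
class WideVram:
    """Toy model: 4 kB of video RAM seen as 2048 x 16-bit words."""

    def __init__(self, words=2048):
        self.words = [0] * words

    def crtc_fetch(self, char_index):
        """One display fetch returns the whole (character, attribute) pair."""
        word = self.words[char_index % len(self.words)]
        return word & 0xFF, word >> 8

    def cpu_write_byte(self, byte_addr, value):
        index = (byte_addr >> 1) % len(self.words)
        shift = 8 * (byte_addr & 1)
        keep = self.words[index] & ~(0xFF << shift) & 0xFFFF
        self.words[index] = keep | ((value & 0xFF) << shift)


def character_clock(vram, char_index, pending_cpu_write=None):
    """One character period: a display slot first, then a CPU slot."""
    pair = vram.crtc_fetch(char_index)
    if pending_cpu_write is not None:
        vram.cpu_write_byte(*pending_cpu_write)
    return pair
```

Because the two accesses never share a slot, the display fetch always sees memory contents and never the CPU's bus data, so no snow.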

Dual-ported RAM is something that exists but would have been expensive and was not usually used for video cards as far as I know - a much better solution (the one that was usually used in practice) was to increase the bus width or memory access rate so that CPU accesses can happen between consecutive CRTC accesses.

Reply 7 of 9, by Scali

Rank l33t

reenigne wrote:

The trick is (at least on the MDA) that the memory bus width is twice as high. MDA uses 2114L (static RAM) chips which are organized as 1024x4 bits each, paired up to give a 16-bit memory bus, so the entire character/attribute pair can be fetched in a single cycle, and CRTC cycles can be interleaved with CPU cycles.

Right, so effectively you have slightly more waitstates than on CGA I suppose. You have the same amount of waitstates per scanline (since the CGA waitstates are based on fetching 80 bytes per scanline, and MDA/Hercules will do 80 16-bit words), but your scanlines are 'faster'.

As for CGA, I take it they didn't use the MDA solution because where MDA only has 4k of memory, CGA has 16k of memory, so for CGA it would be considerably more complex/expensive to set up a 16-bit bus.

Hercules is a few years newer, which probably changed the economics considerably. A Hercules card has 64k instead of 4k, but doesn't look like it has more components than an MDA or CGA card. So I guess they just used memory chips with much higher capacity, which had become available/affordable by then.
In terms of waitstates/performance, it seems to be the same as an MDA card in my experience.

Reply 8 of 9, by reenigne

Rank Oldbie

Scali wrote:

Right, so effectively you have slightly more waitstates than on CGA I suppose.

It's hard to tell without measuring it (I don't have a genuine IBM MDA card) or doing some in-depth circuit analysis. The critical factor is probably the character clock, which is 1.81MHz on MDA and 1.79MHz on CGA so (all else being equal) the MDA is probably slightly faster on average.
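
For reference, those character clocks can be derived like this. The dot clocks (16.257 MHz for MDA, 14.31818 MHz for CGA) and character widths (9 and 8 dots) are the commonly quoted figures and are not stated in this thread, so treat them as assumptions of the sketch.

```python
def character_clock_hz(dot_clock_hz, dots_per_character):
    """Character clock = dot clock divided by the character cell width."""
    return dot_clock_hz / dots_per_character


mda_char_clock = character_clock_hz(16_257_000, 9)  # 9-dot-wide characters
cga_char_clock = character_clock_hz(14_318_180, 8)  # 80-column mode, 8 dots

print(f"MDA: {mda_char_clock / 1e6:.2f} MHz")
print(f"CGA: {cga_char_clock / 1e6:.2f} MHz")
```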

Scali wrote:

As for CGA, I take it they didn't use the MDA solution because where MDA only has 4k of memory, CGA has 16k of memory, so for CGA it would be considerably more complex/expensive to set up a 16-bit bus.

Yes, when you go from 4kB to 16kB it's no longer economical to use SRAM, and DRAMs of that period generally had one output pin per chip, so you'd need 16 chips instead of 8 to get a 16-bit data bus. That makes sense if you have 32kB of VRAM (since 16kbit was a particularly common DRAM chip size) but I guess that would have made the CGA card too expensive (and wasn't needed anyway for the 320x200x4 and 640x200x2 modes the card was specified for). This was the solution they used for PCjr though (except by that time it was 64kbit chips instead of 16kbit, and the high-bandwidth modes were just unavailable in the 64kB PCjr instead of snowy).
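
The chip-count arithmetic can be sketched as follows, assuming one data pin per DRAM chip as described above. The helper name is made up for illustration.

```python
def dram_bank(chips, kbit_per_chip, data_pins_per_chip=1):
    """Return (bus width in bits, capacity in kB) for a bank of DRAM chips."""
    bus_width = chips * data_pins_per_chip
    capacity_kb = chips * kbit_per_chip // 8
    return bus_width, capacity_kb


print(dram_bank(8, 16))   # CGA-style: eight 16kbit chips
print(dram_bank(16, 16))  # what a 16-bit bus would have needed
print(dram_bank(8, 64))   # eight 64kbit chips
```

Eight 16kbit chips give an 8-bit bus and 16kB; doubling the bus width with the same chips forces 32kB.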

Scali wrote:

Hercules is a few years newer, which probably changed the economics considerably. A Hercules card has 64k instead of 4k, but doesn't look like it has more components than an MDA or CGA card. So I guess they just used memory chips with much higher capacity, which had become available/affordable by then.

Higher capacity, yes, but the more important factor in snow avoidance was probably access time (assuming that Hercules used eight 64kbit DRAM chips). Not sure how often Hercules lets the CPU access VRAM in the 80-column text mode, but if it's once per character you'd need an access time of 1/(3*1.81MHz) ~= 180ns.
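
Spelling out that estimate as a sketch: the factor of 3 comes straight from the formula above (my reading is two display fetches plus one CPU slot per character on an 8-bit bus, but that is a guess), and the function name is hypothetical.

```python
def required_access_time_ns(char_clock_hz, accesses_per_character):
    """Longest memory cycle that still fits the given accesses per character."""
    return 1e9 / (char_clock_hz * accesses_per_character)


t = required_access_time_ns(1.81e6, 3)
print(f"{t:.0f} ns per access")
```

This lands in the same ballpark as the ~180ns figure above.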

Reply 9 of 9, by Jo22

Rank l33t++

Thank you guys, this is very interesting. I wonder, did the Plantronics card also suffer from snow?
I heard it had 32KiB of RAM, but it was also an early CGA "clone".