VOGONS


VGA's undocumented latching?


First post, by superfury

Rank l33t++

I noticed something weird when running Jazz Jackrabbit in UniPCemu (compared to DOSBox).

The bottom part of the screen (the part produced by the VGA split-screen operation) displays correctly.
But the top part of the screen shows a wholly different part of the screen than it's supposed to.

This seems to have something to do with the start address being reloaded at the start of vertical retrace, when the VSYNC bit of the Input Status 1 register (bit 3) goes high, latching the start address for the next frame to render. Reloading at vertical total instead fixes the bug (but seems to be incorrect, according to a demo I tested).

It only seems to affect the normal gameplay part of the game (the main game levels).

Last edited by superfury on 2020-12-15, 20:32. Edited 1 time in total.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 1 of 25, by mkarcher

Rank l33t
superfury wrote on 2020-12-11, 08:02:

This seems to have something to do with the start address being reloaded at the start of vertical retrace, when the VSYNC bit of the Input Status 1 register (bit 3) goes high, latching the start address for the next frame to render. Reloading at vertical total instead fixes the bug (but seems to be incorrect, according to a demo I tested).

There is a lot of time between the start of vertical blanking (when VSYNC goes high) and vertical total (when VSYNC goes low). Possibly the correct point in time to sample the start address is some scanlines before vertical total, when the card starts setting up the next frame. I recommend writing a test program that changes the start address on every scanline during vertical blanking and checking which start address takes effect on physical hardware.

Reply 2 of 25, by superfury

Rank l33t++

I don't have the physical hardware or a PC to test on, though (and doubt I ever will).

All docs say that it happens at the start of vertical retrace. Some demo (pgame, https://pcem-emulator.co.uk/phpBB3/viewtopic.php?t=3089) only works properly when the start address is latched at the start of vretrace (status register bit going from 0 to 1).


Reply 3 of 25, by superfury

Rank l33t++

The weird thing is: when Jazz Jackrabbit plays its levels, the start address is probably incorrect somehow? I see it flipping between two values (at the time vsync starts).

Perhaps it's using the display enable signal incorrectly somehow? Does it have anything to do with this?


Reply 4 of 25, by superfury

Rank l33t++

I do see something interesting with Jazz Jackrabbit when starting up a level.

It first shows the screen semi-correctly, then suddenly it doesn't (when Jazz drops into the first level?).

It starts out with the status bar (the non-scrolling fixed window) at both the top and bottom of the screen. Between them is the game screen (or a part of it, until the bottom window seems to start rendering).

Then, once Jazz starts falling to the starting point of the level, the gaming portion of the screen starts lowering itself towards the bottom of the screen, displaying another empty gaming screen seemingly layered under it?

It almost looks as if the game is shifting new data into VRAM, but the display window isn't effective at all somehow, like it's always starting both the top window and bottom window at 0?

So there might be an error in the handling of said register somehow (since it looks like it's always 0 in this case)?

I see the frame start address being 2AA8h and 6564h, toggling between the two at the start of vertical retrace.
Edit: Hmmm... I see it's being set at the start of vertical retrace.
But once vertical total hits, it's somehow zeroed?
Edit: It's indeed zeroed. The Jazz Jackrabbit VBlank handler writes some VGA registers. Those writes caused the rendering-only line parameter update function to be called, which then saw that the top window was being rendered (because the top window counts as rendering from vertical retrace through vertical total). Because it was in the top window, the function cleared the start address used for rendering, assuming that a top window rendering was starting, without taking into account that this is only supposed to be done from the renderer itself! So the renderer would set up the next frame's start address (during vertical retrace), and the handler's register writes between vertical retrace and vertical total (which starts the next frame from the software's perspective) would then clear it again!

Edit: That bugfix seems to have fixed Jazz!


Reply 5 of 25, by superfury

Rank l33t++

I do still have one little question, though: Are all panning and scrolling registers latched when starting a new frame? Or only some of them?
Does changing the byte panning, horizontal pixel panning and/or preset row scan during active display have any effect for the current frame being rendered?


Reply 6 of 25, by mkarcher

Rank l33t
superfury wrote on 2020-12-13, 11:13:

I do still have one little question, though: Are all panning and scrolling registers latched when starting a new frame? Or only some of them?
Does changing the byte panning, horizontal pixel panning and/or preset row scan during active display have any effect for the current frame being rendered?

Let's assume the hardware is built in the simplest way possible. Everything that counts through the whole frame (like the current screen address) is latched at the beginning of the frame and counted in an internal, invisible register. Everything that is reset each line is most likely re-latched each line into some internal registers. So with this assumption, I conclude that:

  • Byte panning and pixel panning are applied to each line. Every line, the VGA hardware needs to decide at what position in the 32-bit data word it needs to start scanning out. If your goal is to be able to emulate EGA in addition to VGA, be aware that EGA also has a byte panning mechanism, but it uses CR05, bit 7 instead of CR08, bits 5 and 6.
  • Preset Row Scan is applied once to the whole frame. After presetting the row counter at the beginning of the frame, the row counter is a free-running counter that does not need to be re-preset.
  • Maximum Scanline on the other hand, the value the current row counter is compared to, is not latched at all (why should it be?), and is compared just in time at the end of each raster line to the value of the current row register. It is a well-known effect, which is used in a lot of demos, that you can adjust maximum scanline mid-frame to enable double/triple/quadruple scanning in graphics mode.
  • Offset (the number of bytes from one row to the next) is also not latched at all, but it is added to the starting address of the previous line (which got latched in some internal register) to yield the starting address of the current line (which replaces the latched value after the addition). This again is a well-known effect used in a lot of demos: You can set offset to 0 mid-frame to repeat the current scanline an arbitrary number of times (even more than 32 times, the maximum repeat count you can obtain using maximum scanline).

I checked my book about programming EGA/VGA cards, and it indeed tells you that the start address is loaded at the beginning of the vertical blanking period, whereas the preset row scan register is described as being loaded during the blanking period, without specifying when exactly during the blanking period this register is latched. The book goes on to suggest adjusting the start address during the display period, so that it takes effect as soon as blanking starts, but adjusting the preset row scan register at the beginning of the blanking period. This indicates that the preset row scan register is loaded late enough in the blanking period that you don't miss the preset point when you change the register at the start of the retrace period. I remember the demo software that was supplied with the book working perfectly, so in practice, writing preset row scan at the start of the blanking period is enough to have it take effect for the subsequent frame. It also avoids depending on whatever effect reloading the preset row scan register might or might not have on the current display period.
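
For illustration, the "simplest possible hardware" assumption from this post could be modelled roughly as below. This is only a sketch of that reasoning, not a description of any real chip: the struct and function names are made up, the offset is treated as a plain address increment (the byte/word/doubleword scaling is ignored), and the offset is added on every row change, which in graphics modes with maximum scanline 0 means every line.

```c
#include <assert.h>

/* Hypothetical register file, as written by software. */
struct crtc_regs {
    unsigned start_address;   /* latched once per frame */
    unsigned preset_row_scan; /* latched once per frame */
    unsigned max_scanline;    /* not latched: compared at the end of every line */
    unsigned offset;          /* not latched: added on every row change */
};

/* Hypothetical internal, invisible counters. */
struct crtc_state {
    unsigned row_start; /* address of the first fetch of the current row */
    unsigned row;       /* free-running 5-bit row counter */
};

/* Frame start: latch everything that counts through the whole frame. */
void begin_frame(struct crtc_state *s, const struct crtc_regs *r)
{
    assert(r->preset_row_scan < 32);
    s->row_start = r->start_address;
    s->row = r->preset_row_scan;
}

/* End of a raster line: only the live register values are used here. */
void end_line(struct crtc_state *s, const struct crtc_regs *r)
{
    if (s->row == r->max_scanline) {
        s->row = 0;
        s->row_start += r->offset; /* offset 0 repeats the same line */
    } else {
        s->row = (s->row + 1) & 31;
    }
}
```

In this model, writing start_address mid-frame changes nothing until the next begin_frame, while writing max_scanline or offset mid-frame changes what the very next end_line does.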

Reply 7 of 25, by superfury

Rank l33t++
mkarcher wrote on 2020-12-13, 12:50:
Let's assume the hardware is built in the simplest way possible. Everything that counts through the whole frame (like the curren […]

Well, that's the weird thing about it: FreeVGA mentions in the special effects part of its documentation that the byte panning field is added to the start address register. But the start address register is loaded during vertical retrace, so wouldn't the byte panning field need to be loaded then as well? It functions much the same as the preset row scan and pixel panning fields, which seem to be loaded/latched when vertical total is reached (the effective end of the frame), once the new frame starts, which may be after vertical blanking and/or vertical retrace (if the clocks are set up for overscan in between). Otherwise, they would have no effect during retrace or overscan, rendering them moot?
There's also the weird matter of handling it (byte panning) during active display. Would it be applied after adding the base address of the scanline (from the offset register calculation) to the previous scanline start (which starts at 0 (split window) or the start address register)? So it would have the start address or 0 (split/top) to start with. Then add offset*2 for each scanline. Said value would be stored for the next scanline to add to. And for the start of the current scanline, add the byte panning value to it? Would that be correct?

So startaddr (or 0)+(offset*scanline) for the scanline base. And add byte panning to it for the scanline start address? Finally skip x pixels for the horizontal pel panning? Is that what happens?
Edit: Just adjusted the emulator for that. The scanline settings are now latched as follows: the start map at the start of vertical retrace only, preset row scan at vertical total only, and byte panning and the pixel shift count at each horizontal total and vertical total. They are then applied to each scanline accordingly: the start map initializes the MA counter during line #0 or top window line #0; preset row scan is applied when starting a new frame (vertical total); the offset register times 2 is added to the scanline start address at horizontal total (for the next scanline to use); and finally the current byte panning value is added to the MA counter for the current scanline only (the next scanline's start address acts as if it wasn't applied, and the value is added again for that scanline if needed).
So, for example, it goes like this:
0/startaddr+byte panning=First line. The byte panning used isn't carried over to the next scanline.
0/startaddr+(offset*2)=Second line start. Add byte panning to this for the actual start of this scanline (not stored for the next scanline).
0/startaddr+(2*(offset*2))=Third line start. Add byte panning as for the previous scanline for the effective start (not stored in the main counter).

So, if startaddr=0x100, offset=0x10, byte panning 0 for the first scanline, 1 for second, 2 for third, we get:
0x100+((0x10*2)*0)+0=First scanline (0x100 starting). 0x100+0x20 is stored in the next scanline counter, so 0x120.
0x120+1=Second scanline (0x121 starting). 0x120+0x20 is stored in the next scanline counter, so 0x140.
0x140+2=Third scanline (0x142 starting). 0x140+0x20 is stored in the next scanline counter, so 0x160.
Etc.

That's what UniPCemu is now doing with the latest commit.
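
The worked example above can be reproduced with a few lines of C. This only mirrors the scheme described in this post (byte panning added to the per-scanline start, offset times 2 carried over to the next scanline). The function name and parameters are made up for the illustration, and it makes no claim about what real hardware does.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: effective start address of each scanline under the
 * scheme from this post. byte_pan[i] is the byte panning value in effect
 * for scanline i; out[i] receives the address that scanline starts at.
 */
void scanline_starts(unsigned startaddr, unsigned offset,
                     const unsigned *byte_pan, unsigned *out, size_t lines)
{
    unsigned next_line = startaddr; /* counter carried from line to line */
    size_t i;

    assert(byte_pan != NULL && out != NULL);
    for (i = 0; i < lines; i++) {
        out[i] = next_line + byte_pan[i]; /* panning is not carried over */
        next_line += offset * 2;          /* stored for the next scanline */
    }
}
```

Called with startaddr 0x100, offset 0x10 and byte panning values 0, 1, 2, it yields 0x100, 0x121 and 0x142, matching the numbers above.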


Reply 8 of 25, by mkarcher

Rank l33t
superfury wrote on 2020-12-13, 15:28:

Well, that's the weird thing about it: FreeVGA mentions in the special effects part of its documentation that the byte panning field is added to the start address register.

While this is a nice and easily understandable interpretation of the byte panning field, I think it's a gross technical inaccuracy to describe it that way. It doesn't matter for most modes, as byte panning is only used with 4-color or 2-color EGA/VGA graphics modes. The only BIOS-supported 4-color EGA mode is the 640x350-lite mode on 64K equipped EGA cards. Not a single BIOS-supported VGA mode makes use of the hardware features that require the use of byte panning. What actually happens in 2-color graphics mode is that the VGA card fetches 32 bits from a single address (all 4 planes, a byte per plane). Usually (in 16-color mode), it scans out all four planes in parallel, yielding 8 packets of 4 bits each. In 2-color mode, it scans out the 4 planes one after the other, so it generates 32 pixels instead of 8 pixels from a single address in video RAM. 2-color mode is most likely intended to be used in conjunction with doubleword mode, in which the CRTC address is incremented by 4 every time the CRTC fetches data, and with chain-4 address mapping. That's the same addressing setup used in Mode 13h (but not in Mode X). Chain-4 address mapping spreads out the bytes of every fourth video RAM address over the VGA address space like this (assuming the graphics mode memory range A000-BFFF):

A000:0000 - CRTC address 0, plane 0
A000:0001 - CRTC address 0, plane 1
A000:0002 - CRTC address 0, plane 2
A000:0003 - CRTC address 0, plane 3
A000:0004 - CRTC address 4, plane 0
...

This enables the processor to access the 32 pixels of a single 32-bit word in video RAM as if it were spread out over four bytes, while in fact the 32 pixels are all stored at the same CRTC address, so I strongly oppose the notion that byte panning is added to the CRTC address. Instead, byte panning tells the CRTC to skip scanning out some of the planes from the first 32-bit word it loaded. It is much more like extending the pixel panning register by two extra bits than adding anything to the start address.
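
The chain-4 listing above can be written down as a tiny C function. This is only a sketch of that listing; the struct and function names are invented for the example, and the offset is taken relative to the start of the A000 segment.

```c
#include <assert.h>

/* Where a CPU byte lands in video memory under chain-4 mapping. */
struct vram_location {
    unsigned crtc_address; /* address as seen by the CRTC */
    unsigned plane;        /* 0..3 */
};

/* cpu_offset is the byte offset from A000:0000 (window A000-BFFF). */
struct vram_location chain4_map(unsigned cpu_offset)
{
    struct vram_location loc;

    assert(cpu_offset < 0x20000u);
    loc.plane = cpu_offset & 3u;         /* low two bits select the plane */
    loc.crtc_address = cpu_offset & ~3u; /* only every fourth address is used */
    return loc;
}
```

So four consecutive CPU bytes share one CRTC address and differ only in the plane, which is why panning by one byte cannot be expressed as a change of the CRTC address.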

superfury wrote on 2020-12-13, 15:28:

But the start address register is loaded during vertical retrace, so wouldn't the byte panning field need to be doing so as well?

If it were in fact added to the address, it would. I am confident that the premise of some addition happening at this point is wrong.

superfury wrote on 2020-12-13, 15:28:

It's functioning much the same as the preset row scan and pixel panning fields, which seem to be loaded/latched during vertical total being reached(effective end of frame), once the new frame starts, which may be after vertical blanking and/or vertical retrace(if clocks are setup for overscan between it). Otherwise, they would have no effect during retracing or overscan, rendering them moot?

I don't understand why you are talking about an effect the preset row scan or pixel panning fields could have during overscan. These fields only affect how pixels inside the visible area are generated. Overscan is always defined only by the overscan color register in the attribute controller, no matter how many pixels or rows the display image is shifted. The size of the (vertical) overscan is given by the display end, start blanking, end blanking and vertical total registers only.

superfury wrote on 2020-12-13, 15:28:

There's also the weird thing of handling it(byte panning) during active display. Would it be applied after adding the base address of the scanline(from the offset register calculation) to the previous scanline start(which starts at 0(split window) or the start address register)? So it would have start address or 0(split/top) to start with. Then add offset*2 for each scanline. Said value would be stored for the next scanline to add to. And for the start of the current scanline, add the byte panning value to it? Would that be correct?

As explained above, adding byte panning to the CRTC start address seems to be always incorrect. It's interesting to test whether byte panning applies for the past-scanline-compare region. You have a register bit to toggle whether pixel panning applies to the bottom half, and it would be sensible if that same bit also applies to byte panning. On the other hand, maybe the "pixel panning can affect everything" state is meant purely for EGA compatibility. As EGA doesn't have byte panning, it might be enough for VGA to only respect byte panning before the trigger scanline. As this is a feature that effectively no-one uses, it might actually vary between different VGA clones.

I am gonna hack together a demo program using 2-color mode, byte panning and scanline compare over the next few days and take reference pictures. Do you have any preference which kind of VGA card I should initially test it on? Does UniPCemu target emulating a specific brand of SuperVGA? I have a couple of different pre-PCI VGA cards at hand:

  • OAK OTI037c (simple 8-bit SVGA, up to 800x600)
  • OAK OTI077 (1MB SVGA)
  • Tseng ET4000AX
  • Tseng ET4000/W32i
  • ATI mach32
  • Realtek RTG3106
  • Cirrus Logic 510/520
  • Cirrus Logic 5426
  • S3 928

For kicks, I might also implement the EGA variant (4-color mode, as EGA doesn't have the 2-color mode) and test that one on a Genoa SuperEGA. I can also test 4-color EGA mode emulation on VGA cards that try to offer backwards compatibility, but that would be considerably more effort, as I don't currently have the required software tools for all those cards at hand. Also, I might reach out to a friend who has multiple Trident boards, as Trident cards are missing from my collection (which is an indicator of the value I attributed to them in the days they were still in use).

Reply 9 of 25, by superfury

Rank l33t++

UniPCemu currently uses the following graphics cards from EGA and up:
- EGA (wasn't posting in the past, haven't retested since)
- VGA
- ET3000AX
- ET4000AX

Although I did the Jazz Jackrabbit test on the ET4000AX running on the i440fx, it should be compatible with the methods used on the VGA for said game. It does have the Tseng way of dealing with byte mode, though (as documented many times over for the mode 13h byte mode vs doubleword mode incompatibility of Tseng graphics cards).
The EGA/VGA is handled as everyone knows (normal documented behaviour).
Interestingly enough, I see vgatest2 having some readback issues with the input status 0 register (0x70 read expected instead of 0x00/0x10 read in various cases). It also seems to have some issues with reading back text characters from video mode 0Dh (getting filled block characters instead of A-Z at least), according to the report?

Also, FreeVGA documentation on the byte panning field:

Byte Panning

The value of this field is added to the Start Address Register when calculating the display memory address for the upper left hand pixel or character of the screen. This allows for a maximum shift of 15, 31, or 35 pixels without having to reprogram the Start Address Register.

Also did a quick google on it:
https://hackaday.io/project/6150-beckman-du-6 … -a-new-vga-mode

The fields used for 'fine' panning are actually the combination of an 8-pixel coarse 'byte' panning field in CR8 and a 1 pixel fine panning field in AR13. Together they give a range of 0-31 pixels of panning, combined with the memory address word-panning, allowing full panning of any image.

That pretty much says that the 2-bit byte panning is actually added to the MA address counter, otherwise it wouldn't perform 8-pixel coarse 'byte' panning. What it literally means is that it's added to the start address of a scanline (which is the start address or 0 (top window) for the first scanline, with the offset register added for each successive scanline). So MA 0+B for the first scanline of the bottom window, MA 0+(offset*2)+B for the second, and so on, adding offset*2 for each successive scanline (as mentioned before). The byte panning is simply added for 8-pixel granular control on each scanline (independent of the start address/0 and the previous offsets being added to it). Then the horizontal shift count simply discards 0-7 pixels of said loaded dword and renders pixels from that point on.
Edit: Little sidenote: the examples that I gave were using offset*2 increments each scanline. It actually multiplies that by 1 (byte), 2 (word) or 4 (dword) before adding it to the next line's MA counter when horizontal total is reached.


Reply 10 of 25, by mkarcher

Rank l33t

First results of testing a 2-color mode (what byte panning is intended for) on the ET4000/W32i are in. I tested the byte-mode/ModeX-like variant of the two-color mode. Source/executable will be posted after some polishing. The mode is initialized like this (which is the same approach as in the hackaday blog post you linked):

  • Set up mode 0x12 using the BIOS
  • Set bit SR1[4] (1x32 Bit serialization)
  • Program the palette in a way that all even entries map to black, and all odd entries map to white (an alternative, more elegant approach, as given in the hackaday post, would be using the Color Plane Enable register).

This results in a 2-color 640x480 mode with a virtual screen size of 2560x819, because I don't change the offset register, and a 2-color mode uses 4 times less memory than a 16-color mode. Then I painted a test picture to prove that the mode actually behaves as I intended. And indeed, I get 32 pixels from CRTC address 0, then 32 pixels from CRTC address 1 like this:

  • Pixels 0-7: CRTC address 0, plane 0
  • Pixels 8-15: CRTC address 0, plane 1
  • Pixels 16-23: CRTC address 0, plane 2
  • Pixels 24-31: CRTC address 0, plane 3
  • Pixels 32-39: CRTC address 1, plane 0
  • Pixels 40-47: CRTC address 1, plane 1
  • ...

Then I ran a loop that soft-scrolls the image 33 pixels forth and back. It works like this:

  • put the low three bits of the X scrolling position into the pixel panning (during blanking)
  • put the next two bits of the X scrolling position into the byte panning register (before(!) blanking. This was a surprise to me)
  • put the high bits of the X scrolling position into the CRTC start address (before blanking. This is well known)

The conclusion is: Byte panning is not the same as incrementing the CRTC address! Consider these examples:

  PEL pan  BYTE pan  CRTC start address  first pixel
  0        0         0                   0
  1        0         0                   1
  ...
  7        0         0                   7
  0        1         0                   8
  1        1         0                   9
  ...
  7        3         0                   31
  0        0         1                   32
  1        0         1                   33

If the byte panning were indeed added to the MA address counter, setting byte panning to 1 would jump straight to pixel 32. The whole reason for byte panning to exist is that it does not jump to pixel 32, as incrementing the address would do, but instead jumps to pixel 8. And that's what the hackaday post is also trying to say: If you go down to 4-color or 2-color mode, incrementing the CRTC address by one causes a shift of more than 8 pixels, and byte panning extends the pixel panning register to be able to reach all possible starting positions inside a single CRTC address. I did not yet test the split-screen mode, so I can't tell you whether byte panning affects the lower window (address 0), but I hope it does not, at least if you set AR10[5].
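
The table above boils down to one line of arithmetic. A minimal C sketch, assuming the 2-color mode tested here (32 pixels per CRTC address, 8 pixels per plane); both function names are made up, and the second function only shows what the "added to the address" reading would predict, for comparison.

```c
#include <assert.h>

/* First displayed pixel as observed in the 2-color test mode. */
unsigned first_pixel(unsigned pel_pan, unsigned byte_pan, unsigned crtc_start)
{
    assert(pel_pan < 8 && byte_pan < 4);
    return crtc_start * 32 + byte_pan * 8 + pel_pan;
}

/* What you would get if byte panning were added to the CRTC address. */
unsigned first_pixel_if_added(unsigned pel_pan, unsigned byte_pan,
                              unsigned crtc_start)
{
    assert(pel_pan < 8 && byte_pan < 4);
    return (crtc_start + byte_pan) * 32 + pel_pan;
}
```

With PEL pan 0, byte pan 1 and start address 0, the first function gives pixel 8 (as measured), while the second gives pixel 32.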

superfury wrote on 2020-12-13, 23:36:

So MA 0+B for the first scanline of the bottom window, MA 0+(offset*2)+B for the second, and so on, adding offset*2 for each successive scanline (as mentioned before). The byte panning is simply added for 8-pixel granular control on each scanline (independent of the start address/0 and the previous offsets being added to it). Then the horizontal shift count simply discards 0-7 pixels of said loaded dword and renders pixels from that point on.

I hope it is clear by now that at least in 2-color mode (1bpp mode), your calculations would yield a wrong result. Adding 1 to "offset" makes each line 2 "32-bit words" longer, which is 64 1-bpp pixels. This means the second line in the lower window will move 64 pixels to the left when you increase offset by one. On the other hand, adding one to "B" moves all lines by 8 pixels, so adding 2 to B will move the second line by 16 pixels to the left. If your equation were accurate, adding 2 to B or adding 1 to offset should have the same effect on the second line.

It might be possible that the byte panning register *does* do what you expect and implemented in 16-color mode, though. I have only tested byte panning in 2-color mode so far.

Reply 11 of 25, by superfury

Rank l33t++

So, if I understand it correctly, the byte panning value isn't about a 'byte'? It's actually just an extension of the pixel panning register, with the byte panning register becoming its upper bits? It's an 8-pixel pixel shift count value instead? So the maximum shift is (8*3)+7=31 pixels in that way? At least, that's the case in 8-pixel modes.
What about the 9-pixel modes? Is it then a 32 pixel range? Or a (9*3)+8=35 pixel range?


Reply 12 of 25, by superfury

Rank l33t++

OK. I just modified the 'byte panning' field to become an extension of the horizontal PEL panning register.
In 8 pixel modes, it will shift the display left in multiples of 8 pixels (added to the horizontal PEL panning shift).
In 9 pixel modes, it will shift the display left in multiples of 9 pixels (undocumented behaviour, but implied if FreeVGA's documentation of 35 pixels is to be believed).
Edit: Yay. Got the update ready on my phone to verify on my PC, but the repository is down temporarily for maintenance (Bitbucket in general). I'll need to push when it's available again.
Edit: There's also still the weird case of the additional character widths on the ET4000? Those go all the way from 5 to 16 pixels wide?
How would the byte panning happen with those? Just multiples of 8 pixels?

Edit: About the shift/load rate. In UniPCemu, this simply affects how many times the memory is loaded(in half character clocks instead of whole character clocks, due to Tseng compatibility of 16-bit color modes). The character clock shift works in parallel, increasing the MA counter monotonically (except in 16-bit text mode, where it's increasing by 2).
The shift/load registers in graphics mode are implemented as a simple 1/2/4-bit tap on a 32-bit integer loaded from VRAM. So in that way, it's like the report says.
Edit: The changes are up on the repo now. The byte panning is now an 8-pixel or 9-pixel addition to the horizontal PEL panning shift.


Reply 13 of 25, by superfury

Rank l33t++

Also, should I latch the byte panning register for the frame at the same time as the start address register? If I understand correctly, it's actually latched like that from what you've said?


Reply 14 of 25, by mkarcher

Rank l33t
superfury wrote on 2020-12-14, 02:33:

Also, should I latch the byte panning register for the frame at the same time as the start address register? If I understand correctly, it's actually latched like that from what you've said?

Yes, on the ET4000/W32i, byte panning is latched like the start address. The following pseudo-code for soft-scrolling in 2-color graphics mode creates ghost frames with bad scrolling because the byte panning is updated too late:

for (x = 0; x < 40; x++)
{
    WaitForDisplayPeriod();
    SetStartAddress(x >> 5);       /* takes effect when blanking starts */
    WaitForVBlanking();
    SetPELPanning(x & 7);
    SetBytePanning((x >> 3) & 3);  /* too late: already latched for this frame */
}

When I move SetBytePanning before WaitForVBlanking, the scrolling is perfectly smooth.
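
The effect of the ordering can be sketched as follows, assuming that start address and byte panning are both sampled once per frame while PEL panning takes effect immediately. The names `split_scroll` and `displayed_offset` are hypothetical helpers for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Scroll position split into the three registers used above
   (2-color mode: 32 pixels per VGA memory address). */
struct scroll_regs {
    uint32_t start_address; /* x >> 5 */
    uint32_t byte_panning;  /* (x >> 3) & 3 */
    uint32_t pel_panning;   /* x & 7 */
};

static struct scroll_regs split_scroll(uint32_t x)
{
    struct scroll_regs r;
    r.start_address = x >> 5;
    r.byte_panning = (x >> 3) & 3;
    r.pel_panning = x & 7;
    /* The three fields must recombine to the original position. */
    assert(r.start_address * 32 + r.byte_panning * 8 + r.pel_panning == x);
    return r;
}

/* Pixel offset shown in a frame when each register was last written for
   a (possibly different) scroll position before it was sampled. */
static uint32_t displayed_offset(uint32_t x_start, uint32_t x_byte,
                                 uint32_t x_pel)
{
    return split_scroll(x_start).start_address * 32
         + split_scroll(x_byte).byte_panning * 8
         + split_scroll(x_pel).pel_panning;
}
```

For x = 8 the late write gives `displayed_offset(8, 7, 8)`, which is 0 instead of 8: the frame uses the byte panning of the previous step and jumps back by 8 pixels. With SetBytePanning moved before WaitForVBlanking all three values come from the same x and the offset equals x.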

Reply 15 of 25, by mkarcher

User metadata
Rank l33t
superfury wrote on 2020-12-14, 01:18:

So, if I understand it correctly, the byte panning value isn't about a 'byte'? It's actually just an extension of the pixel panning register, with the byte panning register becoming its upper bits? It's an 8-pixel pixel shift count value instead? So the maximum shift is (8*3)+7=31 pixels in that way? At least, that's the case in 8-pixel modes.
What about the 9-pixel modes? Is it then 32 pixel range? Or (9*3)+8=35 pixel range?

I need to test that, but I guess the combination of 9-pixel mode and color reduction (16 4-color pixels or 32 2-color pixels per VGA memory address) is rarely used. I never even tried the 9-pixel bit in graphics modes, or color reduction in text modes, because both seem to be "out of design spec" for the VGA to me.

Reply 16 of 25, by mkarcher

User metadata
Rank l33t
superfury wrote on 2020-12-14, 01:37:

Edit: About the shift/load rate. In UniPCemu, this simply affects how many times the memory is loaded(in half character clocks instead of whole character clocks, due to Tseng compatibility of 16-bit color modes). The character clock shift works in parallel, increasing the MA counter monotonically (except in 16-bit text mode, where it's increasing by 2).

I am not sure what you mean by the statement about the MA counter. Reducing the shift/load rate does not affect video timing, so the character clock runs at the usual rate, no matter what you choose for the shift/load rate. On the other hand, the MA counter is incremented on every load, which is twice per character clock in Tseng 16-bit color modes, once per character clock in normal operation, once every other character clock in 4-color mode and once every four character clocks in 2-color mode. The MA counter needs to be incremented by 1, 2 or 4 depending on the CRTC mode (byte mode, word mode, doubleword mode), not depending on any of the text/graphics mode bits the VGA provides. The most prominent example is that the MA is increased by 4 in Mode 13h setup, but just by 1 in the related mode X setup. The lower increment requires the offset register to be reduced (from 40 to 10) in 640x480 graphics mode if you want the scan lines to be contiguous in memory in the 640x480x2 mode obtained from "load every 4" mode. I didn't change the offset register, and thus obtained a 2560 pixels wide virtual screen for soft-scrolling.
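
The offset numbers can be checked with a small sketch. It assumes 32 bits (4 planes of 8 bits) are loaded per memory address and that, in byte mode, each unit of the offset register advances the line start by 2 addresses (the usual VGA rule). The function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Pixels consumed per MA address when 32 bits are loaded per address:
   16 colors (4 bpp) -> 8, 4 colors (2 bpp) -> 16, 2 colors (1 bpp) -> 32. */
static uint32_t pixels_per_address(uint32_t bits_per_pixel)
{
    assert(bits_per_pixel == 1 || bits_per_pixel == 2 || bits_per_pixel == 4);
    return 32 / bits_per_pixel;
}

/* Offset register value that makes scan lines contiguous in memory.
   Assumption: byte mode, 2 addresses per offset unit. */
static uint32_t contiguous_offset(uint32_t width_pixels,
                                  uint32_t bits_per_pixel)
{
    return width_pixels / pixels_per_address(bits_per_pixel) / 2;
}

/* Virtual screen width in pixels for a given offset register value. */
static uint32_t virtual_width(uint32_t offset, uint32_t bits_per_pixel)
{
    return offset * 2 * pixels_per_address(bits_per_pixel);
}
```

Under these assumptions `contiguous_offset(640, 4)` gives the usual 40, `contiguous_offset(640, 1)` gives 10, and leaving the offset at 40 in the 2-color mode yields `virtual_width(40, 1)`, i.e. 2560 pixels.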

Reply 17 of 25, by superfury

User metadata
Rank l33t++

UniPCemu handles it like so(both in parallel):
- During half character clocks, the character clock shift divides a counter into whole clocks through 4 clock intervals. When such a clock expires, the MA counter is increased by 1(most modes) or 2(16-bit character text mode on the ET4000).
- The video load rate divides it as well, causing the video serializers to load during half through 4 character clocks(in powers of 2).

When the video serializers load, the following happens(16-bit text mode loading 2 successive dwords from VRAM instead of 1):
1. Address wrapping is applied to the MA counter for the fetch. This applies MA13/15(word mode), word mode shift, dword mode shift(byte mode instead if not plain VGA).
2. Map 13 and 14 is applied as needed for the VRAM address.
3. Said dword(s) are fetched from VRAM.
4. Shifting out the entire dword is done all at once using taps and shifts(32-bit register) in graphics mode only.
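
Step 1 above can be sketched roughly like this. It is not UniPCemu's actual code: `map_ma`, `crtc_mode` and `wrap_ma15` are hypothetical names, and the bit selection assumes the standard VGA behaviour (word mode puts MA13 or MA15 on address bit 0, doubleword mode puts MA12/MA13 on address bits 0/1):

```c
#include <assert.h>
#include <stdint.h>

enum crtc_mode { CRTC_BYTE, CRTC_WORD, CRTC_DWORD };

/* Rotate a 16-bit MA counter value into the address used for the fetch.
   wrap_ma15 selects MA15 (nonzero) or MA13 (zero) as bit 0 in word mode. */
static uint32_t map_ma(uint32_t ma, enum crtc_mode mode, int wrap_ma15)
{
    assert(ma <= 0xFFFFu); /* the counter is modelled as 16 bits wide */
    switch (mode) {
    case CRTC_WORD: {
        uint32_t low = wrap_ma15 ? (ma >> 15) & 1u : (ma >> 13) & 1u;
        return ((ma << 1) & 0xFFFEu) | low;
    }
    case CRTC_DWORD:
        /* Assumption: MA12 and MA13 wrap around into address bits 0 and 1. */
        return ((ma << 2) & 0xFFFCu) | ((ma >> 12) & 3u);
    default:
        return ma; /* byte mode: address is the counter itself */
    }
}
```

The counter itself always advances by 1 per load; only the mapping differs between byte, word and doubleword mode.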


Reply 18 of 25, by mkarcher

User metadata
Rank l33t
superfury wrote on 2020-12-14, 08:04:

UniPCemu handles it like so(both in parallel):
- During half character clocks, the character clock shift divides a counter into whole clocks through 4 clock intervals. When such a clock expires, the MA counter is increased by 1(most modes) or 2(16-bit character text mode on the ET4000).
- The video load rate divides it as well, causing the video serializers to load during half through 4 character clocks(in powers of 2).

When the video serializers load, the following happens(16-bit text mode loading 2 successive dwords from VRAM instead of 1):
1. Address wrapping is applied to the MA counter for the fetch. This applies MA13/15(word mode), word mode shift, dword mode shift(byte mode instead if not plain VGA).
2. Map 13 and 14 is applied as needed for the VRAM address.
3. Said dword(s) are fetched from VRAM.
4. Shifting out the entire dword is done all at once using taps and shifts(32-bit register) in graphics mode only.

That sounds about right. I didn't know about the 16-bit character text mode on the ET4000 until now. I just knew about modes with around 10 to 12 bits of character codes coming at the expense of attribute bits. I'm thus not discussing that mode, but if there actually is a mode that fetches 16-bit character codes and attribute codes, and uses two successive dwords, your description sounds correct. Also, your implementation is actually closer to what probably happens in hardware than how I expressed it: You don't add 1/2/4 in byte/word/dword mode, but instead always add 1, and you rotate the counter before using it as address. This is fine.

I guess some software could try to race the beam and modify RAMDAC contents during scanout of the 32 bit shift register, so shifting it out all at once can yield emulation failures, but I don't know whether UniPCemu is actually intended to run programs that race the beam during a scanline. Not doing the scanout at once would most likely add a considerable overhead which might not be worth the result. I guess updating the RAMDAC mid-scanline is very uncommon because it adds snow on some VGA implementations.

Reply 19 of 25, by superfury

User metadata
Rank l33t++

Tried VGATEST 2. Running "vgatest error.log -t18" (w/o quotes), I see panning going weird? The M part of "Mode 10h" keeps scrolling left(at the start) and right(at the end) and reappears after that? It's like the horizontal pixel panning is used only? No byte or start address shifts?

Edit: I see that it's updating various registers during the scrolling demo:
3018=top window start
300c=start address low
300d=start address high
3008=preset row scan(bit 0-4)/byte panning(bit 5-6)
b004=Index AttributeController
4013=horizontal pixel panning count(bit 0-4)

The preset row scan/byte panning always seems to be set to 0 over and over again(even though written).
I see it's writing the start address with increased values(+1 for every 8 pixels), while for every pixel, it increases the horizontal pixel panning count, wrapping around 8.
It's a graphics mode which scrolls the top window down and up.
At the same time, I see the top window's horizontal pixel panning count having effect. It keeps scrolling the first character(8 pixels wide) off the screen, after which it reappears again as it was before scrolling off the screen.
But since the start address keeps increasing monotonically during that time, somehow the frame isn't applying it? Since the top left of the screen(which is said 8 pixels) keeps reappearing every 8 frames, shifting the display back to the right, to where it started?
Edit: Just noticed that the start map address was cleared by the time it reached vertical total. So the new frame started at address 0 always!
It was properly 'latching' said start address, but clearing the latched value because it's still rendering overscan or vertical/horizontal retrace after that.
That causes the latched value to be cleared in the top-window case(split-screen operations).
So the next frame starts with the cleared value in this case, instead of a proper start address!
Edit: After having fixed this, I see the demo display two different pages(page 1 above the split screen, page 0 at/below the split screen).
So that's fixed now.
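
A minimal sketch of the fixed behaviour, assuming the latched start address is kept separate from the running MA counter so that the split-screen reset cannot clear it. The struct and function names are hypothetical, not the real UniPCemu code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct crtc_state {
    uint32_t start_address; /* register as written by software */
    uint32_t latched_start; /* value sampled for the next frame */
    uint32_t ma;            /* running memory address counter */
};

/* Sample the start address once per frame. The exact scan line at which
   real hardware does this is the open question of this thread. */
static void latch_start_address(struct crtc_state *s)
{
    assert(s != NULL);
    s->latched_start = s->start_address;
}

/* Split screen (line compare) active: the running counter restarts at 0,
   but the latched value is left untouched. */
static void split_screen_reset(struct crtc_state *s)
{
    assert(s != NULL);
    s->ma = 0;
}

/* Vertical total: the next frame begins from the latched start address. */
static void begin_frame(struct crtc_state *s)
{
    assert(s != NULL);
    s->ma = s->latched_start;
}
```

In the broken version the equivalent of `split_screen_reset` also cleared the latched value during overscan and retrace, so `begin_frame` always started at address 0.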
One thing I still see happening, though, is that when scrolling the display left/right, the display jerks left/right for 1 frame every few frames somehow?
