Darmok wrote on 2025-12-28, 08:38:
As I understand your idea, to avoid slow register handling, you propose writing a memory-resident utility that would do the following:
1. Intercept all VBE calls made by the game.
2. Notify the game of VBE1.2 support, but actually enable VBE2.0 mode and execute VBE calls, emulating VBE1.2 for the game, switching to protected mode if necessary.
3. Using the memory manager, emulate windowing functions in video memory using a virtual video buffer. Hypothetically, this would be faster than register manipulation.
You are nearly there. But the point is not "to avoid slow register handling" during bank switching. Switching banks using the memory manager will actually be slower than just writing some values to I/O ports. Most (if not all) memory managers keep the mapping tables (the page tables) in a part of memory that is not visible in VM86 mode, so every page table manipulation has to be done from protected mode, which means every bank switch needs to transition to protected mode. On the other hand, if the FAR CALL banking method is used (in contrast to calling INT 10, AX=4F05), this allows staying in VM86 mode all the time.
The point is a different one. Some VGA chips allow faster access to the video memory if it is performed through the LFB address range than if it is performed through the VGA legacy range at A000:0. On these types of VGA cards, the increased memory performance might well make up for the extra overhead of remapping page tables in protected mode for bank switching. If your VGA card does not have considerably higher memory performance in the LFB compared to the legacy address range, my suggestion is pointless.
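To make the remapping overhead concrete, here is a rough Python model (not code for a real memory manager) of what one bank switch through the page tables would have to compute: new values for the 16 page table entries that cover the 64 KiB window at A000:0000, so that they point into the LFB. The function name `window_ptes`, the LFB base `0xE0000000` and the 64 KiB granularity are assumptions for illustration only; a real implementation would read the LFB address from the VBE mode information, write the entries from protected mode, and flush the TLB afterwards.

```python
PAGE_SIZE = 4096
WINDOW_BASE = 0xA0000      # linear address of A000:0000
WINDOW_SIZE = 0x10000      # 64 KiB legacy window

# x86 page table entry flag bits
PTE_PRESENT = 0x001
PTE_WRITABLE = 0x002
PTE_USER = 0x004           # VM86 code runs at CPL 3, so the pages must be user-accessible


def window_ptes(lfb_phys_base, bank, granularity=WINDOW_SIZE):
    """Return (pte_index, pte_value) pairs that map the A000 window onto `bank`.

    Hypothetical helper: it only computes the entries. Writing them into the
    real page table and invalidating the TLB is the part that needs
    protected mode.
    """
    target = lfb_phys_base + bank * granularity
    if target % PAGE_SIZE:
        raise ValueError("bank start must be page-aligned")
    first_index = WINDOW_BASE // PAGE_SIZE
    flags = PTE_PRESENT | PTE_WRITABLE | PTE_USER
    return [
        (first_index + i, (target + i * PAGE_SIZE) | flags)
        for i in range(WINDOW_SIZE // PAGE_SIZE)
    ]


if __name__ == "__main__":
    # Assumed LFB base; bank 2 starts 128 KiB into video memory.
    for index, value in window_ptes(0xE0000000, bank=2):
        print(f"PTE[{index:#04x}] = {value:#010x}")
```

Sixteen entry writes plus a TLB flush plus the round trip into protected mode is the price per bank switch, compared with a couple of port writes for ordinary banking. That only pays off if the LFB path itself is much faster.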
Darmok wrote on 2025-12-28, 08:38:
I don't understand what you meant by direct access to video memory, bypassing the graphics controller. Looking at a typical video card ISA block diagram, the data bus is connected to the graphics controller, there are FIFO buffers, and then the data goes to the video memory controller within the graphics controller. What does direct access mean?
OK, this explains why you suspected that I want to accelerate banking (which I do not). That point is indeed more difficult to understand. Yes, the ISA bus is connected to a FIFO buffer on any graphics card that provides reasonable ISA performance (above 2 MB/s). There was no FIFO on the original IBM VGA (and the speed was around 800 KB/s). The original VGA design is inherited from the EGA design, which is perfectly tailored towards an 8-bit bus while having 32-bit memory on the card. Any 8-bit ISA access on the EGA will access a 32-bit word on the EGA card internally (unless some parts are masked due to configuration registers). What exactly happens with an 8-bit write depends on the configuration of the graphics controller. In the simplest case, the byte is just written to all four 8-bit pieces of the 32-bit word that are currently configured as "writeable". But to get good performance in 16-color modes on an 8088 system, the graphics controller can do a lot of extra things: rotate the byte from the CPU before it is processed; forward that byte only to certain 8-bit pieces of the 32-bit word, while other 8-bit pieces get a constant 00 or FF; mix the "new content" generated this way with the contents of a 32-bit latch (aka four 8-bit latches) using logical functions (OR, AND, XOR); and take certain bit positions unchanged from the latches before finally forwarding this data to the memory subsystem of the EGA card.
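The steps described above can be sketched as a small Python model of a single 8-bit write in the basic write mode. This is a simplified illustration, not a register-accurate emulation: the class and field names (`WritePath`, `write_byte`, and so on) are made up for this sketch, the latches are assumed to have been loaded by an earlier read, and the other write modes are left out.

```python
from dataclasses import dataclass, field

# Logical functions that combine the CPU data with the latch contents
REPLACE, AND, OR, XOR = 0, 1, 2, 3


@dataclass
class WritePath:
    """Hypothetical container for the write-path configuration."""
    rotate: int = 0              # rotate CPU byte right by this many bits
    set_reset: int = 0x0         # per-plane constant: bit set -> FF, clear -> 00
    enable_set_reset: int = 0x0  # per-plane: use the constant instead of the CPU byte
    function: int = REPLACE      # how to mix with the latch
    bit_mask: int = 0xFF         # 1 = take bit from new data, 0 = keep latch bit
    map_mask: int = 0xF          # per-plane write enable
    latches: list = field(default_factory=lambda: [0, 0, 0, 0])


def write_byte(cfg, planes, cpu_byte):
    """Apply one 8-bit CPU write to a 32-bit word given as four plane bytes."""
    rotated = ((cpu_byte >> cfg.rotate) | (cpu_byte << (8 - cfg.rotate))) & 0xFF
    for p in range(4):
        if not (cfg.map_mask >> p) & 1:
            continue  # this 8-bit piece is not writeable
        if (cfg.enable_set_reset >> p) & 1:
            value = 0xFF if (cfg.set_reset >> p) & 1 else 0x00
        else:
            value = rotated
        latch = cfg.latches[p]
        if cfg.function == AND:
            value &= latch
        elif cfg.function == OR:
            value |= latch
        elif cfg.function == XOR:
            value ^= latch
        # Bit positions not selected by the mask come unchanged from the latch.
        planes[p] = (value & cfg.bit_mask) | (latch & ~cfg.bit_mask & 0xFF)
    return planes


if __name__ == "__main__":
    word = [0x11, 0x22, 0x33, 0x44]
    print([hex(b) for b in write_byte(WritePath(), word, 0xA5)])
```

With the default configuration in this model, every stage is a no-op and the CPU byte lands unchanged in every writeable plane. Each stage still sits in the write path, though, which is the part a direct path to video memory can skip.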
In the VGA 256-color mode, the graphics controller and memory subsystem are set up in a specific way that happens to make all that logic "invisible" to the normal application (the write bit mask is set to pick all bits from the newly written value, the mixing function is set to ignore latch contents and just use the ISA value, rotation is set to rotate by 0 bits, replacement of bytes by 00 or FF is disabled for all 8-bit pieces, and the 8-bit piece of the 32-bit word is selected from the low two address bits on the ISA bus). Nevertheless, these features could technically all be enabled by VGA software (although they are mostly pointless in 256-color mode except for very special edge cases). So as you see, there still is a quite complex write pipeline that furthermore is meant to work with only 8 CPU bits at a time, so the card needs to behave as if every 16-bit write were two successive 8-bit writes.
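As a minimal sketch of the last two points, assuming video memory is modelled as a plain list of 32-bit words (each a list of four bytes), the 256-color setup behaves roughly like this: the low two address bits pick the 8-bit piece, and a 16-bit write is handled as two independent 8-bit writes. The function names and the linear word indexing are simplifications chosen for this example, not a description of how any particular chip addresses its memory.

```python
def write8(vram, addr, value):
    """One 8-bit write: the low two address bits select the 8-bit piece."""
    word, piece = divmod(addr, 4)
    vram[word][piece] = value & 0xFF


def write16(vram, addr, value):
    """A 16-bit write behaves like two successive 8-bit writes (low byte first)."""
    write8(vram, addr, value & 0xFF)
    write8(vram, addr + 1, (value >> 8) & 0xFF)


if __name__ == "__main__":
    vram = [[0, 0, 0, 0] for _ in range(2)]
    # A 16-bit write at address 3 touches two different 32-bit words.
    write16(vram, 3, 0xBEEF)
    print(vram)
```

In this model a single 16-bit CPU write can end up as two separate accesses to two different 32-bit words, which is the kind of work a direct path that merges writes would avoid.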
On the other hand, writes to the LFB region can go straight from the ISA write buffer into the video memory (even merging 8-bit or 16-bit writes to 32-bit writes), bypassing the legacy logic that is not used by typical 256-color VESA applications. The huge performance gain of LFB use on some VGA chipsets thus comes from skipping the processing pipeline in the graphics controller, not just from avoiding banking. There may be SVGA chips that can detect that the graphics controller is configured in a "pass-through" way and internally just bypass it, or there might be a control bit that allows bypassing the graphics controller even for accesses at A000-BFFF, which may be set by the VESA BIOS in high-resolution 256-color modes. On these types of cards, my idea would be pointless again. But as proven by S3SPDUP, certain S3 graphics chips are indeed considerably faster on LFB access than they are on accesses to the legacy range, and that's why remapping the "direct video memory access" over the "legacy VGA memory range" can yield a speedup on those chips. My idea was, instead of relying on the graphics chip to be able to present a "direct access to video memory" range at A000:0000 (which S3 chips can do, but it's specific to S3 chips), to use the memory manager to redirect A000:0000 to the fast LFB area that provides "direct access to video memory" (which should work with any graphics card). So this would be something like a "universal S3SPDUP", but obviously only useful if other cards behave like S3 cards in that their LFB is considerably faster than their VGA range in VESA modes. I don't know whether such cards exist at all.