VOGONS


NVIDIA Kepler/Maxwell/Pascal VESA Bios Bug (workaround found)

Reply 140 of 151, by EduBat

Geforce GTX 650 - BIOS version: 80.07.35.00.60
NVIDIA Corporation GK107 [GeForce GTX 650] [10de:0fc6] (rev a1)

UNER - Unlock Nvidia Extended Regs - V0.1 (alpha)
Nvidia Key found at: C000:469B
Nvidia offset bar regs at: C000:0120
Nvidia Bar regs at: DC00
IRET found at: C000:04FB
Nvidia unlock routine found at: C000:48B4
Start Nvidia unlock ... complete

Reply 141 of 151, by Marco Pistella

NEWAX v0.3 (alpha) — Full Nvidia virtual resolution support

After completing the reverse engineering of the Nvidia VBIOS unlock sequence (documented in the UNER post), NEWAX v0.3 now implements correct VESA 4F07h support for Nvidia Kepler, Maxwell, and Pascal GPUs.

Minimal unlock — no full UNER mechanism required

The extended CRTC registers on these cards require only two port writes to unlock before programming:

CRTC index 3Fh ← 57h

This is the minimal form of the unlock discovered during UNER development. No key, no trampoline, no VBIOS routine call — just those two bytes written to the CRTC before touching the extended registers.

Extended start address registers

The reverse engineering revealed two previously undocumented Nvidia extended CRTC registers for the display start address:

CRTC 34h — high byte of extended start address (bits 5:0 used)
CRTC 35h — low byte of extended start address

These are programmed alongside the standard VGA start address registers (0Ch/0Dh) to provide the full address range required for virtual resolution panning. The retrace wait logic (BL=80h) is also corrected: Nvidia's own BIOS inverts the vertical retrace polarity, waiting for end-of-retrace instead of start-of-retrace. NEWAX fixes this.
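
To make the register description above more concrete, here is a rough sketch (in Python, only to show the arithmetic) of how a start address could be split across the four registers and turned into a list of port writes. Several things here are assumptions and not taken from the NEWAX source: the colour-mode CRTC ports 3D4h/3D5h, the idea that 34h/35h hold the 14 bits directly above the standard 16-bit start address, the write order, and all function names. The address unit (bytes, dwords, ...) is left open as well.

```python
# ASSUMPTION: standard colour-mode CRTC index/data ports; the post does not name them.
CRTC_INDEX_PORT = 0x3D4
CRTC_DATA_PORT = 0x3D5


def unlock_writes():
    """The two-byte unlock from the post: select index 3Fh, then write 57h."""
    return [(CRTC_INDEX_PORT, 0x3F), (CRTC_DATA_PORT, 0x57)]


def split_start_address(addr):
    """Split a start address into per-register byte values.

    ASSUMPTION: 0Ch/0Dh hold bits 15:0 (standard VGA) and 34h/35h hold
    the 14 bits above them (6 bits in 34h, 8 bits in 35h).
    """
    if not 0 <= addr < (1 << 30):
        raise ValueError("address does not fit in 30 bits")
    ext = addr >> 16
    return {
        0x0C: (addr >> 8) & 0xFF,  # standard start address, high byte
        0x0D: addr & 0xFF,         # standard start address, low byte
        0x34: (ext >> 8) & 0x3F,   # extended high byte, bits 5:0 only
        0x35: ext & 0xFF,          # extended low byte
    }


def start_address_writes(addr):
    """Unlock first, then one index/data pair per register (order is a guess)."""
    writes = unlock_writes()
    for index, value in split_start_address(addr).items():
        writes += [(CRTC_INDEX_PORT, index), (CRTC_DATA_PORT, value)]
    return writes
```

The functions only build (port, value) pairs, so the split can be checked on any machine before feeding the list to real port I/O under DOS.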

Tested hardware and software

Verified on Nvidia GT740 and GT1030 with:
- X-VESA (virtual resolution panning across all tested modes)
- Quake (VESA mode, full panning)
- Duke Nukem 3D (VESA mode, full panning)

All tests ran without issues up to 1280x1024 virtual resolution.

Known limitation

On the tested GPUs, the horizontal start address must be a multiple of 2 pixels. If an odd value is requested (e.g. 103 pixels), NEWAX rounds it down to the nearest even value (102 pixels). This appears to be a hardware constraint specific to Nvidia on the tested cards; behavior on other GPUs may differ and is one of the things beta testing will clarify.
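
The rounding described above amounts to something like the following sketch. The function name and the `alignment` parameter are made up for illustration; this is not how NEWAX itself is written.

```python
def align_pan_x(x_pixels, alignment=2):
    """Round a requested horizontal start position down to the alignment."""
    if x_pixels < 0:
        raise ValueError("horizontal start must be non-negative")
    return x_pixels - (x_pixels % alignment)
```

With the default alignment, a request for 103 pixels comes back as 102, and even values pass through unchanged.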

Call for beta testers

NEWAX needs testing on as many Nvidia cards as possible to build a complete compatibility map. Cards of particular interest:

- Pre-Kepler Nvidia (Fermi and older)
- Quadro and professional series
- High-end Maxwell and Pascal (GTX 900/1000 series)
- Turing and Ampere

If you have Nvidia hardware and a DOS boot environment, please test and report: card model, whether virtual resolution panning works correctly, and whether the horizontal alignment behavior matches or differs from the description above.

Beta testers from other forums are equally welcome — the more hardware covered, the more complete the compatibility picture.

Source code is included in the archive.

The attachment NEWAX03.ZIP is no longer available

Reply 142 of 151, by Falcosoft

Marco Pistella wrote on Today, 11:14:
NEWAX v0.3 (alpha) — Full Nvidia virtual resolution support […]

Thank you and congratulations!
I think this is a huge breakthrough. It works almost perfectly on my GTX 970 and GTX 960.
The only (minor) problem is that virtual resolutions at and above (1280 * 2) x (1024 * 2) have an unreachable memory region in both banked and LFB modes. The problem shows up as garbage after writing/drawing.
Here is a demo program and a screenshot of the problem.

The attachment FALSSANI.zip is no longer available
The attachment demo_garbage.jpg is no longer available

@Edit:
Here is a video demonstrating the problem:
https://youtu.be/LywuiRP7PXs

Website, Youtube
Falcosoft Soundfont Midi Player + Munt VSTi + BassMidi VSTi
VST Midi Driver Midi Mapper
x86 microarchitecture benchmark (MandelX)

Reply 143 of 151, by Marco Pistella

Falcosoft wrote on Today, 11:56:

Thank you and congratulations!
I think this is a huge brea ... [CUT]

Hi Falcosoft,

thank you very much for the quick test and for the detailed feedback — it’s extremely helpful.

About the issue you observed with virtual resolutions at and above 1280 × 1024:
this behaviour is actually normal on Nvidia cards and not caused by a bug in NEWAX.

Although VESA function 4F00h reports 14336 (or more) KB of available video memory, Nvidia Kepler+ cards expose only a much smaller contiguous framebuffer region when running under DOS.
On the Kepler/Maxwell/Pascal cards I tested, the maximum reliably accessible area is about 4160 KB.

Anything beyond that range is not mapped as visible framebuffer memory.
So when the virtual start address grows large enough to point past this limit, writes go into a non‑framebuffer region, and the result is the “garbage” pattern you observed.
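
As a quick sanity check of the numbers, here is a small sketch that compares the size of a virtual surface with the 4160 KB figure quoted above. It assumes scanlines are not padded, and the function names are only illustrative.

```python
BANKED_LIMIT_KB = 4160  # "about 4160 KB", as reported in this thread


def surface_kb(width, height, bpp):
    """Size in KB of a width x height surface, assuming no scanline padding."""
    bytes_per_pixel = (bpp + 7) // 8
    return width * height * bytes_per_pixel / 1024


def fits_visible_vram(width, height, bpp, limit_kb=BANKED_LIMIT_KB):
    """True if the whole surface lies inside the visible region."""
    return surface_kb(width, height, bpp) <= limit_kb
```

A 2560x2048 virtual screen at 8 bpp needs 5120 KB, which is past the limit, while 1280x1024 at 8 bpp needs only 1280 KB.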

X‑VESA includes a built‑in VRAM visibility test for exactly this reason.
If you start X‑VESA and press F6, then F5, and wait for the test to finish, you can see how much video memory is physically accessible in each video mode, separately for banked and LFB modes.
This confirms the real hardware limit and explains why very large virtual resolutions cannot be fully mapped.

So the good news is:
NEWAX is working correctly — the limitation comes from how Nvidia maps VRAM under DOS.

Thanks again for testing on GTX 960 and 970. Your reports are invaluable for building a complete compatibility map.

EDIT: Result on an Nvidia GT1050:

The attachment FILE0000.PNG is no longer available

Reply 144 of 151, by RayeR

This makes me wonder whether it would be possible to reprogram some MMIO register to expose more VRAM for the LFB. Probably the VBIOS initializes only a minimal space exposed through VBE, and the graphics driver then reprograms it to a larger window according to the available VRAM and memory space. Nvidia cards up to the 7xxx series exposed the full VRAM; I'm not sure about 8xxx/9xxx.

Gigabyte GA-P67-DS3-B3, Core i7-2600K @4,5GHz, 8GB DDR3, 128GB SSD, GTX970(GF7900GT), SB Audigy + YMF724F + DreamBlaster combo + LPC2ISA

Reply 145 of 151, by Falcosoft

Marco Pistella wrote on Today, 12:43:
Hi Falcosoft, […]

Hi,
OK, it's understood.
So far I have only tested 8-bit modes, since these are the really relevant modes in DOS. Now I have also tested 16/32-bit modes and the situation is much worse. Actually, none of the 16/32-bit modes work in X-VESA's dual pages/double buffering test. Even 640x480x16 and 640x480x32 give a flickering screen with artifacts:

The attachment DB_640x480x16_artifacts1.jpg is no longer available
The attachment DB_640x480x16_artifacts2.jpg is no longer available

Reply 146 of 151, by Marco Pistella

Hi everyone,
I’ve released NEWAX 0.4. This update introduces an important improvement to the handling of VESA function 4F01h (Get Video Mode Info): NEWAX now recalculates the NumberOfImagePages field dynamically, based on the actual amount of VRAM available for the current video mode.

VRAM is calculated as follows: in banked mode it is fixed to 4160 KB unless the video mode requires more. In that case — and also when the video mode is opened in linear mode — NEWAX uses the TotalMemory field from the VbeInfoBlock structure.
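
The rule above can be sketched roughly as follows. This is not the NEWAX code, just one reading of the description: the function names are invented, scanline padding is ignored, TotalMemory is taken in 64 KB units as in the VbeInfoBlock, and NumberOfImagePages is counted as "images minus one" as in the VBE specification.

```python
BANKED_WINDOW_KB = 4160  # banked-mode figure quoted in this thread


def usable_vram_kb(mode_kb, total_memory_64k, linear):
    """Pick the VRAM size used for the page count (per the rule above)."""
    if linear or mode_kb > BANKED_WINDOW_KB:
        return total_memory_64k * 64  # TotalMemory is in 64 KB blocks
    return BANKED_WINDOW_KB


def number_of_image_pages(width, height, bpp, total_memory_64k, linear):
    """Number of complete images that fit, minus one (never below zero)."""
    page_bytes = width * height * ((bpp + 7) // 8)
    mode_kb = -(-page_bytes // 1024)  # ceiling division
    vram_bytes = usable_vram_kb(mode_kb, total_memory_64k, linear) * 1024
    return max(vram_bytes // page_bytes - 1, 0)
```

For example, with TotalMemory = 256 (16384 KB), 1280x1024x32 in banked mode needs 5120 KB per page, so the TotalMemory branch is taken, while 1280x1024x16 stays inside the 4160 KB window.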

Fixed:
- Duke Nukem 3D (1280x1024)
- Nvidia extended registers are now reset when a new video mode is opened
- 15/16/32 bpp video modes

The attachment NEWAX04.ZIP is no longer available

Reply 147 of 151, by LSS10999

Just tested this on my Ampere system, and it seems NEWAX is not going to work yet.

It exits with message "VESA function 4F06h (Set/Get scanline) not supported".

So I think for that system I'll have to wait until a fix for 4F06h gets implemented.

Reply 148 of 151, by Falcosoft

Marco Pistella wrote on Today, 15:11:
Hi everyone, I’ve released NEWAX 0.4, this update introduces an important improvement to the handling of VESA function 4F01h (Ge […]

Thanks!
It seems 16/32-bit modes work properly now.
Once again, this is a huge success.
Until now these cards have been almost useless for high-resolution DOS gaming/VESA programming. But now, thanks to you, they have one of the best performing VESA implementations! 😀

Reply 149 of 151, by Marco Pistella

New version NEWAX 0.5

Fixed:

- 16/32 bpp video modes (again)

New:

- 1 Pixel horizontal scrolling for 8/16/32 bpp (unusual 3C0h/13h)

The attachment NEWAX05.ZIP is no longer available

Reply 150 of 151, by Marco Pistella

@RayeR

The idea of trying to map all the VRAM reported in VbeInfoBlock.TotalMemory is interesting, but it’s important to clearly distinguish between banked and linear modes.

In linear mode, the problem doesn’t actually exist:
the LFB is always mapped to the maximum size reported by VBE, i.e. 14336 KB or 16384 KB, depending on the card.
So the BIOS already exposes all the VRAM that VBE considers usable for that mode.

The limitation applies only to banked mode, where the classic window is about 4160 KB.
When a video mode requires more memory than that window (for example 1280×1024×32), the BIOS uses the larger amount indicated in TotalMemory (16384 KB in my case).
If the mode fits within the ~4 MB window (like 1280×1024×16), it stays within the standard banked window.

To understand exactly how the BIOS decides what to map and when, one would need to compare—via debugger—two video modes that are almost identical but have different memory requirements (in my case 1280×1024×16 vs 1280×1024×32) and see what changes the BIOS introduces during initialization.
From what little I’ve seen, it looks much more complex than just writing to a few I/O ports. I’d set this aside for now.

@LSS10999

Yes, on Ampere NEWAX cannot work yet — and the reason is technical, precise, and already known: on Nvidia Ampere GPUs the VESA function 4F06h simply no longer exists.

I would like to add a reimplementation of 4F06h as well, but to be honest: implementing 4F07h on Kepler/Maxwell/Pascal took me over 15 hours of continuous work, so I can’t promise anything.

I will still take another look at the GT 1030 BIOS (where 4F06h is still present and functional); if the logic is straightforward enough to replicate in a TSR, I will add it. Otherwise, Ampere will unfortunately remain unsupported.

@Falcosoft

I'm really glad to hear that 16/32-bit modes are now working correctly. In version 0.5 I fixed a bug in the horizontal scanline size for 16/32 bpp video modes, and horizontal scrolling is now accurate to 1 pixel in all video modes.

I will try to implement the 4F06h function on cards where it has been removed, but I cannot promise anything. I’ll take a look and see if it’s feasible.

P.S. I also tested the Mandelbrot animation (FALSSANI.zip), and it now works correctly.

Reply 151 of 151, by EduBat

Marco Pistella wrote on Today, 16:22:

New version NEWAX 0.5

The attachment NEWAX05.ZIP is no longer available

I know my GTX650 may not be a very interesting data point, given that it's bang in the middle of the range of cards under test, but here goes: NEWAX works well and makes double buffering and virtual resolutions work perfectly.

I'm running my tests from a Windows 98 boot diskette and noticed that if I run NEWAX, then NEWAX /U to remove it from memory, and then MEM /C /P, a 96-byte "memory hole" (for lack of a better term) is left behind.