Reply 40 of 62, by EduBat
Another card:
Nvidia AGP GeForce 6200
Let me know if you would like any other screenshot.
@Falcosoft
Glad to hear it. Thanks for the follow-up.
Thank you for the detailed log — this is a very clean and
informative failure pattern.
All 999 errors are exclusively MATRIX type, with a systematic
1-bit shift between expected and read value. BITS_TEST and
BURST_TEST pass cleanly. The key difference: BURST_TEST
includes a forced write buffer flush via I/O port 80h before
the readback verification. MATRIX_TEST does not.
On an IGP with shared system memory like the Radeon Xpress
1150, write-combining or caching is almost certainly active.
Without an explicit flush between write and read, MATRIX_TEST
reads stale cached data — the previous bit position, not the
one just written. The systematic shift in the XOR mask
confirms this.
This is not a VRAM defect and not an X-VESA bug — it is
documented hardware behavior caused by the memory subsystem
of integrated graphics with shared memory. Could you repeat
the test in linear mode (F10 before entering the command)
to see if the behavior changes with a different access path?
@Tiido
Thank you! No rush — the thread will still be here in a few
months.
Unreal mode is worth the effort when you get to it. The
transition is straightforward once you have the GDT setup
right; the interesting part is what you can do with 32-bit
addressing from real mode without the overhead of full
protected mode. If you have questions when you get there,
feel free to ask.
@konc
Thank you for tracking down the original reference — that
explains how version 1.x circulated. Good to know the VRAM
reliability test was already finding real use on questionable
hardware. That use case is fully covered in 2.0 as well, with
a more detailed error report and configurable test parameters.
@ajacocks
Thank you, enjoy it!
@EduBat - UMC
Thank you — clean data on all three screenshots.
σ/μ = 0.0064 at 640×400 is consistent with other ISA cards
tested so far. The distribution confirms a real analog
oscillator: single spike, N=4, 3 PIT unit spread.
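For anyone reproducing these statistics outside X-VESA: σ/μ
is a plain coefficient of variation over the raw PIT tick
samples, N is the count of distinct sample values, and the
spread is max minus min in PIT units. A minimal sketch in C
(illustrative, not X-VESA's actual code):

    #include <math.h>

    /* sigma/mu (coefficient of variation) over n PIT samples. */
    double sigma_over_mu(const unsigned *s, int n)
    {
        double mu = 0.0, var = 0.0;
        int i;
        for (i = 0; i < n; i++) mu += s[i];
        mu /= n;
        for (i = 0; i < n; i++)
            var += ((double)s[i] - mu) * ((double)s[i] - mu);
        return sqrt(var / n) / mu;
    }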
"Emulated VGA CRTC registers!" and the window function
pointer at 1344:00B5 are both expected when running with a
software VESA TSR: the TSR resides in conventional memory
and intercepts the VESA interface entirely, so the bank
switching function pointer lands in the TSR's own segment
rather than in the card's BIOS ROM at C000h, and the CRTC
registers may not be directly accessible in the same way
they would be on a card with native VESA support. X-VESA
falls back to indirect estimation for bandwidth and
interlace detection as a result.
Regarding the VRAM results from the previous screenshot:
Fail[0Eh] on the high-resolution planar and direct color
modes is expected — the card has 512 KiB, insufficient for
those modes. The TSR declares them in its mode list anyway,
leaving bit 0 of the ModeAttributes word from function 4F01h
clear ("not supported by hardware configuration"), which is a
known class of bug in several 1990s VESA implementations. X-VESA
validates that bit independently and rejects the modes
before any opening attempt.
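For reference, the independent check needs only one 4F01h
call. A sketch in Borland/Watcom-style DOS C, with offsets
per the VBE spec (ModeAttributes at 0, WinFuncPtr at 0Ch);
the function name is illustrative:

    #include <dos.h>
    #include <string.h>

    /* INT 10h, AX=4F01h: read mode info into ES:DI. Returns
       nonzero if ModeAttributes bit 0 ("supported by hardware
       configuration") is set; also reports the WinFuncPtr
       segment, which on this setup is the TSR's own segment
       (1344h) instead of C000h. */
    int mode_supported(unsigned mode, unsigned *func_seg)
    {
        static unsigned char buf[256];
        union REGS r;
        struct SREGS sr;

        memset(buf, 0, sizeof buf);
        segread(&sr);
        r.x.ax = 0x4F01;
        r.x.cx = mode;
        sr.es  = FP_SEG((void far *)buf);
        r.x.di = FP_OFF((void far *)buf);
        int86x(0x10, &r, &r, &sr);
        if (r.x.ax != 0x004F) return 0;   /* call failed */

        *func_seg = buf[0x0E] | (buf[0x0F] << 8);
        return buf[0] & 1;                /* attribute bit 0 */
    }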
The "Not supported" entries in the VRAM linear column are
also expected: linear frame buffer requires VESA 2.0 or
higher, and this TSR implements VESA 1.2.
Finally, the 256 KiB vs 512 KiB discrepancy between modes
is normal: the amount of VRAM accessible in banked mode
depends on how the TSR has configured the memory mapping
for each specific mode, not just the total physical memory
installed.
If you have other hardware without a TSR to test on, a
comparison would be interesting.
@EduBat - GeForce 6200
Thank you — GeForce 6200 is a useful data point.
σ/μ = 0.0046 at 640×400 is the lowest value seen so far in
this beta, lower than the ISA cards with real analog
oscillators. N=3, 2 PIT unit spread — very stable VGA
emulation at low resolution, quite different from the GT740
which showed extreme jitter and a trimodal distribution.
The garbled characters in the Product revision field are not
an X-VESA bug — that field contains non-ASCII bytes in the
BIOS string and X-VESA displays the raw content as-is.
128 MiB VRAM, LFB at D8000000h, VESA 3.0 — everything looks
clean. Could you run Command 4 on a higher resolution mode
(0107 or 0118) as well? I'm curious whether the jitter
remains this low at higher horizontal frequencies on this
card.
New Beta 3 - Changes from Beta 1
Beta 2
- Commands 6 and 7 (Test dual page video mode / Test virtual
resolution): the result of function 4F07h is now reported
with a distinction between "not supported" (AL ≠ 4Fh) and
"supported but failed" (AL = 4Fh, AH = 01h / return 014Fh).
Previously both conditions were reported identically.
Beta 3
- Command 8 (Test switch DAC 6/8 bit): spacebar added as an
alternative toggle key to numpad +, for compatibility with
keyboards without a dedicated numeric keypad.
- Command 7 (Test virtual resolution): the summary panel
displayed on ESC now includes the horizontal granularity
of the virtual resolution — the interval in pixels at which
function 4F06h generates a new distinct BytesPerScanLine
value, derived from the compatible virtual resolution table
built during the test.
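A sketch of how such a granularity probe can work, in
Borland/Watcom-style DOS C (illustrative, not the exact
X-VESA implementation; a real test would also restore the
original scan line length afterwards, read back via 4F06h
BL=01h):

    #include <dos.h>

    /* INT 10h, AX=4F06h, BL=00h sets the logical scan line
       length in pixels and returns BX = BytesPerScanLine.
       Walk the width up one pixel at a time and measure the
       interval between two successive jumps of BX. */
    unsigned granularity_in_pixels(unsigned start_px, unsigned end_px)
    {
        union REGS r;
        unsigned px, last_bpsl = 0, last_jump = 0;
        for (px = start_px; px <= end_px; px++) {
            r.x.ax = 0x4F06; r.h.bl = 0x00; r.x.cx = px;
            int86(0x10, &r, &r);
            if (r.x.ax != 0x004F) break;   /* call failed */
            if (last_bpsl && r.x.bx != last_bpsl) {
                if (last_jump) return px - last_jump;
                last_jump = px;            /* first jump seen */
            }
            last_bpsl = r.x.bx;
        }
        return 0;                          /* undetermined */
    }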
I have 128MB Asus Radeon A9600XT https://www.pcstats.com/articles/1495/2.html
Attached a Lavalys Everest 2005 GPU export .txt for more details.
Chip ID and memory amount are not detected correctly.
vutt wrote on Yesterday, 10:28:I have 128MB Asus Radeo ... [CUT]
Thank you for the detailed report.
Regarding memory detection: X-VESA reads the TotalMemory
field from the VbeInfoBlock, which on this card declares
16 MiB. However the direct physical probe on mode 0110
finds 32,768 KiB actually accessible — the VESA BIOS
underdeclares the available memory. This is a known
limitation in several ATI VESA implementations and is
not an X-VESA detection error. The full 128 MiB is not
accessible via the VESA interface regardless.
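For completeness, the declared figure comes straight out of
the VbeInfoBlock. A sketch of the read, in Borland/Watcom-
style DOS C, with TotalMemory at offset 12h (in 64 KiB
units) per the VBE spec:

    #include <dos.h>
    #include <string.h>

    /* INT 10h, AX=4F00h: fill the 512-byte VbeInfoBlock at
       ES:DI. Returns the BIOS-declared memory in KiB, or -1
       on failure. On this card it yields 16384 KiB while the
       direct probe reaches 32768 KiB. */
    long vbe_declared_kib(void)
    {
        static unsigned char info[512];
        union REGS r;
        struct SREGS sr;

        memcpy(info, "VBE2", 4);   /* request VBE 2.0+ info */
        segread(&sr);
        r.x.ax = 0x4F00;
        sr.es  = FP_SEG((void far *)info);
        r.x.di = FP_OFF((void far *)info);
        int86x(0x10, &r, &r, &sr);
        if (r.x.ax != 0x004F) return -1;
        return (long)(info[0x12] | (info[0x13] << 8)) * 64;
    }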
The σ/μ = 3.8587 at 640×480×15 is surprisingly high for
a card of this generation — higher than the GT740 tested
earlier. Could you repeat Command 4 on a packed 8-bit mode
(0101 or 0100) and on a higher resolution mode to see if
the jitter is consistent across modes, or specific to
this one?
Marco Pistella wrote on Yesterday, 08:01:@EduBat - GeForce 6200 […]
@EduBat - GeForce 6200
128 MiB VRAM, LFB at D8000000h, VESA 3.0 — everything looks
clean. Could you run Command 4 on a higher resolution mode
(0107 or 0118) as well? I'm curious whether the jitter
remains this low at higher horizontal frequencies on this
card.
Here goes
Here are details from one of my laptops.
Hope it helps.
Marco Pistella wrote on Yesterday, 10:41:Could you repeat Command 4 on a packed 8-bit mode (0101 or 0100) and on a higher resolution mode to see if the jitter is consi […]
Could you repeat Command 4 on a packed 8-bit mode
(0101 or 0100) and on a higher resolution mode to see if
the jitter is consistent across modes, or specific to
this one?
BTW, it might be relevant: I have a modded BIOS. I forced 85 Hz for common VESA resolutions with Radeon BIOS Editor - see: Re: ATI Radeon 9000 series cards DOS refresh rate setting options
EduBat wrote on Yesterday, 13:19:Here are details from one of my laptops.
Hope it helps.
Thank you for both sets of data.
GeForce 6200: σ/μ scales correctly with horizontal
frequency — 0.0046 at 640×400, 0.0070 at 1024×768×24,
0.0121 at 1280×1024×8. Clean progressive increase
consistent with the general pattern observed across
other cards. Good reference data.
ATI Mobility Radeon 7500: σ/μ = 0.0079 at 640×400,
distribution clean with single spike and N=3.
"Emulated VGA CRTC registers!" is expected on a mobile
GPU of this generation.
One anomaly worth noting: Video BIOS ROM size reports
0 Bytes. This field is read directly from the VbeInfoBlock
— either the mobile BIOS does not populate it, or it
deliberately reports zero. Not an X-VESA error, but an
unusual value worth documenting.
LFB declared at 80000000h with 32,454 KiB of off-screen
memory — clean VESA 2.0 implementation overall.
vutt wrote on Yesterday, 13:43:BTW, it might be relevant: I have a modded BIOS. I forced 85 Hz for common VESA resolutions with Radeon BIOS Editor - see: Re: ATI Radeon 9000 series cards DOS refresh rate setting options
Thank you for the data and for mentioning the modded BIOS
— that detail is essential for correct interpretation.
The σ/μ values show an irregular pattern that is not
consistent with a simple frequency-dependent progression:
640×400: 0.0049 — normal
640×480: 0.0083 — normal
800×600: 4.6448 — extreme jitter
1024×768: 0.0125 — normal, but CRTC registers not accessible
1280×1024: 1.3890 — high jitter, 255/256 valid samples
The extreme jitter on 800×600 and 1280×1024 is very likely
caused by the forced 85Hz timing in the modded BIOS: if the
programmed timing is not perfectly stable or not correctly
supported by the monitor at that refresh rate, the PIT
sampling will capture the instability directly. The modes
that show normal σ/μ are probably those where the forced
timing happens to be stable.
"Emulated VGA CRTC registers!" on 1024×768 is consistent
with what has been observed on other ATI cards of this
generation.
For comparison purposes it would be very useful to repeat
Command 4 on 800×600 and 1280×1024 with the original
unmodified BIOS, if you still have access to it. That
would confirm whether the jitter is BIOS-induced or
hardware-specific.
And now, a "modern" card, a GeForce GTX 650
(continuation...)
EduBat wrote on Yesterday, 16:58:(continuation...)
Thank you — Kepler (GK107) is a very useful data point.
The σ/μ values are the highest recorded in this beta so far:
640×400: 9.1737 — 1024×768×24: 10.5661 — 1280×1024: 13.8558.
Sample validity drops progressively (253/256, 250/256, 252/256).
The extreme and non-linear jitter is consistent with
what is being observed on other Kepler/Maxwell cards and
confirms a systematic issue in the VGA timing emulation
on this GPU generation.
The VRAM benchmark shows a classic write-combining
signature: Write 128b reaches 345,000 KiB/s while
Read 128b is only 7,750 KiB/s — a 45:1 asymmetry.
Writes are coalesced into burst transactions; reads
bypass the cache entirely. This is expected behavior
for a GPU framebuffer accessed from the CPU side without
explicit read-combining enabled.
AVX and AVX-512F not available on this CPU, so the
benchmark stops at 128-bit. If you have access to a
system with AVX support, the 256b and 512b results
on this card would be interesting.
Marco Pistella wrote on Yesterday, 17:18:... […]
...
The VRAM benchmark shows a classic write-combining
signature: Write 128b reaches 345,000 KiB/s while
Read 128b is only 7,750 KiB/s — a 45:1 asymmetry.
...
Hi,
I do not think that your analysis is entirely correct. The 45:1 asymmetry has nothing to do with write-combining being enabled; it's simply that reading from VRAM is that much slower than writing to it.
The real signature of enabled write-combining is that writing with byte, word and dword granularity results in similar performance because, well, writes are combined 😀
Here are my 1024x768x8-bit results from a GTX 970.
With write-combining explicitly disabled, the results are very similar to the GTX 650 result you analyzed: the read:write performance asymmetry is about ~1:25,
and word-sized writes are twice as fast as byte-sized writes and dword-sized writes are twice as fast as word-sized writes:
But here is the result when Write-combining is really enabled:
As can be seen, the read:write performance asymmetry is about ~1:3000 for byte-sized writes and about ~1:800 for dword-sized writes! And there is not much difference between the performance of byte-, word- and dword-sized writes.
Falcosoft wrote on Yesterday, 17:50:... [CUT]
You are correct and my analysis was wrong — thank you for
the precise correction.
The 45:1 asymmetry on the GTX 650 is simply the baseline
read/write speed difference for VRAM accessed from the CPU
side, not a write-combining signature. The actual signature
of enabled write-combining is what your GTX 970 screenshots
show clearly: write performance converges across all access
widths (byte, word, dword all similar), and the read/write
asymmetry reaches 1:3000 for byte-sized writes.
The GTX 650 results show linear write scaling (each doubling
of width doubles performance) which is the opposite of
write-combining behavior — it confirms write-combining is
not active on that card in that configuration.
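For anyone who wants to check their own card for the
convergence signature, here is a minimal sketch under the
same assumptions as before (Borland/Watcom-style DOS C,
writes through the banked window at A000h, crude clock()
timing; a real benchmark needs a set VESA mode and proper
calibration):

    #include <dos.h>
    #include <time.h>
    #include <stdio.h>

    #define WINDOW_BYTES 0x8000U   /* one 32 KiB window */
    #define PASSES       400

    /* Equal byte counts at 8- and 16-bit width: roughly 2:1
       timing means no write-combining; near-equal timing is
       the convergence signature. */
    static void fill8(void)
    {
        volatile unsigned char far *p =
            (volatile unsigned char far *)MK_FP(0xA000, 0);
        unsigned i, pass;
        for (pass = 0; pass < PASSES; pass++)
            for (i = 0; i < WINDOW_BYTES; i++) p[i] = 0x55;
    }

    static void fill16(void)
    {
        volatile unsigned short far *p =
            (volatile unsigned short far *)MK_FP(0xA000, 0);
        unsigned i, pass;
        for (pass = 0; pass < PASSES; pass++)
            for (i = 0; i < WINDOW_BYTES / 2; i++) p[i] = 0x5555;
    }

    int main(void)
    {
        clock_t t0 = clock(); fill8();
        printf("8-bit : %ld ticks\n", (long)(clock() - t0));
        t0 = clock(); fill16();
        printf("16-bit: %ld ticks\n", (long)(clock() - t0));
        return 0;
    }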
This is a useful calibration point for interpreting VRAM
benchmark results in X-VESA. I'll update the analysis
accordingly.
Marco Pistella wrote on Yesterday, 18:16:You are correct and my analysis was wrong — thank you for the precise correction. .... […]
Falcosoft wrote on Yesterday, 17:50:... [CUT]
You are correct and my analysis was wrong — thank you for
the precise correction.
....
To confirm the pattern, here are the results of a much older AGP GeForce 6600 (platform: Abit KT7-A + Athlon XP @ 1750 MHz).
Write combining disabled:
Write combining enabled:
It can be clearly seen that enabled write-combining could not make as much difference on this AGP system as on the PCI-E system (which was a GB Z97X-UD5H + i7-4770K @ 3500 MHz, by the way). But the pattern is very similar.
Some results (without write combining):
Trident 9440 VLB, GeForce 210 PCIe, GTX 960 PCIe
GBAJAM 2024 submission on itch: https://90soft90.itch.io/wreckage
Falcosoft wrote on Yesterday, 18:54:... [CUT]
Thank you — the AGP 6600 comparison makes the pattern
unmistakable. Write-combining disabled: linear scaling
across widths, ~1:10 read/write ratio. Write-combining
enabled: all write widths converge to 673,000 KiB/s,
~1:1000 ratio. Same signature as the GTX 970, scaled
down by the AGP/Athlon platform.
The reduced absolute difference compared to the PCIe
system is expected — the AGP bus and the Athlon memory
controller impose their own ceiling regardless of
write-combining state.
This is now a clean reference dataset for interpreting
X-VESA VRAM benchmark results: linear write scaling =
no write-combining, convergence across widths =
write-combining active.
bakemono wrote on Yesterday, 19:07:Some results (without write combining):
Trident 9440 VLB, GeForce 210 PCIe, GTX 960 PCIe
I also have a GTX 960 on a completely different platform (GB MA790X-UD4 + Phenom II X4), and for 8- to 64-bit writes I got almost exactly the same results with write-combining disabled (128-bit writes are slower, and 256-bit/AVX writes are not supported). The interesting thing is that these results are about 4x faster than the GTX 970 results on my Intel platform (see my last-but-one post), despite the fact that they are both Maxwell 2.0 cards.
I have already noticed some VGA BIOS differences, since the 2-page 640x350 VGA mode is completely unusable on the GTX 970 while it works perfectly on the GTX 960...
bakemono wrote on Yesterday, 19:07:Some results (without write combining):
Trident 9440 VLB, GeForce 210 PCIe, GTX 960 PCIe
Thank you — three very different cards, all useful data.
Trident 9440 VLB: σ/μ = 0.0054 at 640×480 and 0.0053
at 800×600 — very consistent across resolutions.
GeForce 210 (GT218): σ/μ = 0.0130 at 1280×1024×24.
AVX present and working correctly.
GTX 960 (Maxwell): σ/μ = 0.7707 at 640×480 with only
128/256 valid samples — confirms the Maxwell jitter
pattern observed on other cards in this beta. Write
results: 8b=96,800 KiB/s, 16b=193,000 KiB/s,
32b=386,000 KiB/s, 64b=773,000 KiB/s, 128b=1,510 MiB/s,
256b=2,310 MiB/s.
The GTX 960 data is particularly relevant in the context
of the 4F07h TSR project — this is the first confirmed
Maxwell result in the beta.
Here are some Ivy Bridge (ThinkPad T430 - i7-3632QM @ 2200 MHz) integrated VGA results.
Write combining disabled:
Write combining enabled:
It can be seen that reading from VRAM (which is actually reserved system RAM) is much faster on an integrated VGA than on an external one.
Also, the difference between write-combining-disabled and write-combining-enabled writes is the biggest one so far (byte-sized writes are about ~650x faster!).
There is a performance anomaly at 64-bit writes that cannot be explained easily (the results are the same across multiple runs).
@Marco Pistella: Do 64-bit reads/writes use FPU or MMX registers?
BTW, I very much like your test program 😀 I think this is the only available DOS tool that can measure such high bandwidths reliably.