VOGONS


Good VLB card benchmark

First post, by Parni

Rank Newbie

Hi,

Just wanted to drop this here (in case it's not here yet).

https://dependency-injection.com/vlb-vga-group-test/

Then an additional, maybe stupid, question from a newbie: will DOS games run faster in Windows 3.1?

Reply 1 of 24, by debs3759

Rank Oldbie

I don't see what benchmark(s) they used.

Some games that can make use of the Windows drivers might run a little faster, but I would expect the overhead of Windows back then to slow most of them down, especially if they are coded to use their own DPMI driver.

Reply 2 of 24, by firage

Rank Oldbie

It is a great writeup, by Vogons member mpe.

I haven't run many games, but I don't think any DOS games benefit from acceleration under 3.1x. In the best case they run the same, but more likely worse.

Reply 4 of 24, by douglar

Rank Member

Nice pictures and graphs! Nice collection of cards.

NexGen NxVL motherboard with NexGen Nx586-P90 CPU and 32 MB RAM.

Now that's an unexpected configuration.

VL-Bus clocked at 41 MHz on this system. Cards used 0 wait-state configurations where possible.

That's also an unusual setup. Did I miss the part that said which cards were running with 1 wait state? That seems kind of important.

EDIT: OK, I see that the Diamond Viper VLB was running with 1 wait state.

Reply 5 of 24, by mpe

Rank Oldbie

Thanks for the appreciation!

As for the reasons for using the NexGen board as the test platform: despite the 41 MHz bus clock, I found it to be a good choice for testing VL-Bus parts. Performance-wise it is essentially a fast 486, roughly matching or exceeding a DX4-100.

Unlike a typical 486 board, the bus on the NexGen is not driven directly by the CPU but by the NxVL chipset. The chipset is essentially a bridge, just like PCI. I believe this is the reason the bus is more reliable than on many 486 boards @ 40 MHz. And that's with an Adaptec 2842VL SCSI controller in the other slot in bus-master mode, which is known to be problematic.

I have yet to find a VL-Bus card that wouldn't work fine with this board. In fact, the Diamond Viper is the only one that required 1 wait state. I had many more issues with the different 486 boards I tested.

Blog|NexGen 586|S4

Reply 7 of 24, by maxtherabbit

Rank Oldbie

Disruptor wrote on 2020-09-25, 21:10:

ctcm /vid
(from heise)

Comparing
ctcm /vid /movsw
with
ctcm /vid /movsd
reveals that Cirrus Logic cards access the VL bus with 16 bits only

ALL Cirrus chipsets?

Reply 8 of 24, by mpe

Rank Oldbie

Disruptor wrote on 2020-09-25, 21:10:

reveals that Cirrus Logic cards access the VL bus with 16 bits only

I think it is unlikely (unless you happen to be on a 386SX/486SLC system).

Even the entry-level (on VL-Bus) CL-GD5424 is a 32-bit device, both towards the host and towards DRAM (with at least 1 MB). And it shows a healthy improvement when using 32-bit writes:

[Attachment: IMG_6333.jpg]

The vast majority of DOS software uses 16-bit writes, that's for sure.

The GD-5424 is slow in Windows mainly because it lacks GUI acceleration features. In DOS it is within spitting distance of other VL-Bus cards; some are just a bit more optimised than others. The ARK1000 might be the fastest, but there is really nothing to write home about. They are all within 20% of each other.

That wouldn't be the case if the CL-GD had a 16-bit host bus. Look at the Viper VLB DOS results in the test. That's a 16-bit bus in action (as its secondary VGA chip sits on the ISA bus).

Reply 11 of 24, by Parni

Rank Newbie

Disruptor wrote on 2020-09-27, 08:53:

Older VGA cards may need VESA driver.
Newer VGA cards have VESA support integrated in the BIOS.

I have the following VLB cards. Will these need DOS drivers?

Miro S3 864
Cirrus GD5429
Ali ALG2228
Trident 9400CXi

Last edited by Stiletto on 2020-09-27, 20:16. Edited 1 time in total.

Reply 12 of 24, by mpe

Rank Oldbie

Depends on the software. The majority of DOS games use plain VGA modes. No drivers are required for those, although some cards might have compatibility issues.

Most 1993+ cards have early VESA VBE 1.x in the BIOS, which is hardly used in games. Typically you can add support for VBE 2.0 using a manufacturer-supplied TSR driver or UniVBE (universal software).
Newer cards (1995+) might have VESA VBE 2.0 built in, but for a VL-Bus card that is rather the exception.

Newer, more demanding DOS games typically require VESA VBE 2.0 for high-res modes (640x480+).

I'd say that the above cards won't support VBE 2.0 out of the box and a driver will be needed. However, any computer with VL-Bus is unlikely to be able to run games at 640x480 at reasonable speed so missing VBE 2.0 is not a big deal.
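
For anyone who wants to check what a given BIOS actually reports: VBE function 4F00h fills a VbeInfoBlock that starts with the 4-byte signature "VESA" followed by a 16-bit version word (high byte major, low byte minor, so 0102h is 1.2 and 0200h is 2.0). Below is a minimal Python sketch of decoding such a block, assuming you have already dumped the buffer to a file or byte string some other way. The function names and the two sample buffers are made up for illustration, not taken from any real card.

```python
import struct

def parse_vbe_version(info_block: bytes):
    """Return (major, minor) from a VbeInfoBlock dump, or None without a VBE signature."""
    if len(info_block) < 6 or info_block[:4] != b"VESA":
        return None
    # The version word sits right after the signature, little-endian.
    (version,) = struct.unpack_from("<H", info_block, 4)
    return version >> 8, version & 0xFF

def needs_vbe2_driver(info_block: bytes) -> bool:
    """True when the BIOS reports no VBE at all or a version below 2.0."""
    version = parse_vbe_version(info_block)
    return version is None or version[0] < 2

# Hypothetical 512-byte dumps: one BIOS reporting VBE 1.2, one reporting VBE 2.0.
vbe12_dump = b"VESA" + struct.pack("<H", 0x0102) + bytes(506)
vbe20_dump = b"VESA" + struct.pack("<H", 0x0200) + bytes(506)

for name, dump in (("VBE 1.2 dump", vbe12_dump), ("VBE 2.0 dump", vbe20_dump)):
    print(name, parse_vbe_version(dump), "driver needed:", needs_vbe2_driver(dump))
```

A card whose dump decodes to version 1.x is one where a TSR or UniVBE would be the way to get VBE 2.0 modes.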

Reply 16 of 24, by mkarcher

Rank Member

mpe wrote on 2020-09-26, 17:50:
Disruptor wrote on 2020-09-25, 21:10:

reveals that Cirrus Logic cards access the VL bus with 16 bits only

I think it is unlikely (unless you happen to be on a 386SX/486SLC system).

Even the entry level (on VL-Bus) CL-GD5424 is a 32bit device. Both towards host as well as DRAM (with at least 1MB).

See for example here (sorry for the ad-ridden page): https://datasheetspdf.com/datasheet/CL-GD5426.html. The pinout of the CL-GD542x series chips in local bus mode is on page 44, and page 47 has a list of pins for the host interface in local bus mode. It lists only 16 data pins. The description of the pins is on page 61. The data sheet says:

GD542x wrote:

DATA [15:0]: These bidirectional pins are used to transfer data during any memory or I/O operation. These pins are directly connected to D[15:0] of the ’386SX or ’386DX bus. These pins are connected via four bidirectional data transceivers to the 32 data pins of the ’486 or VESA VL-Bus. The transceivers are controlled with OEH#, OEL#, and W/R#. These pads have pull-up resistors.

Notice how the data sheet requires four (8-bit) bidirectional transceivers to multiplex the 32 bits of the 486 or VESA local bus onto the 16 bits of the Cirrus Logic chip.

The benchmark by ctcm on a 486DX/2 80MHz system in mode 13h shows:

MOVSW mem (hit) => Vid: 15.9 MByte/s
MOVSD mem (hit) => Vid: 15.9 MByte/s
STOSW Reg => Vid: 16.0 MByte/s
STOSD Reg => Vid: 16.0 MByte/s
LODSW Vid => Reg: 6.0 MByte/s
LODSD Vid => Reg: 6.3 MByte/s

There is a slight improvement in LODS performance when using 32-bit instructions. This is not due to any 32-bit cycles on the bus; it is caused by the per-iteration overhead of LODSx in the Cx486 processor. The 80486 has a write buffer, so it can prepare the next iteration of a repeated store instruction while the write is still pending, and thereby saturate the front side bus. On loads, the 80486 needs one load to finish before it can complete one iteration of a repeated load, and only then does it check whether more iterations are needed and prepare the next one.
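
The argument above can be turned into a toy timing model. This is only a sketch under stated assumptions, not a measurement: it assumes the card moves 16 bits per bus cycle (so a 32-bit access costs two cycles), that stores overlap loop overhead with the bus while loads add the two together, and that read and write bus cycles may take different times. The three time constants are fitted from the ctcm figures above, and all names are made up for illustration.

```python
def store_rate(bytes_per_op, t_bus_ns, t_ovh_ns):
    """Stores: the write buffer hides loop overhead, so the slower of the two wins."""
    bus_cycles = bytes_per_op // 2          # assumed 16-bit card interface
    return bytes_per_op / max(t_ovh_ns, bus_cycles * t_bus_ns) * 1000  # MByte/s

def load_rate(bytes_per_op, t_bus_ns, t_ovh_ns):
    """Loads: each iteration waits for the data, then pays the loop overhead."""
    bus_cycles = bytes_per_op // 2
    return bytes_per_op / (t_ovh_ns + bus_cycles * t_bus_ns) * 1000    # MByte/s

# Fit the model to the ctcm numbers quoted above (MByte/s).
stosw, lodsw, lodsd = 16.0, 6.0, 6.3
t_lodsw_ns = 2 / lodsw * 1000               # time per 16-bit load iteration
t_lodsd_ns = 4 / lodsd * 1000               # time per 32-bit load iteration
t_bus_read_ns = t_lodsd_ns - t_lodsw_ns     # the extra 16-bit read cycle
t_ovh_ns = t_lodsw_ns - t_bus_read_ns       # what is left is loop overhead
t_bus_write_ns = 2 / stosw * 1000           # stores are purely bus-bound here

print(f"read cycle {t_bus_read_ns:.0f} ns, write cycle {t_bus_write_ns:.0f} ns, "
      f"overhead {t_ovh_ns:.0f} ns")
for size in (2, 4):
    print(size * 8, "bit:",
          f"store {store_rate(size, t_bus_write_ns, t_ovh_ns):.1f} MByte/s,",
          f"load {load_rate(size, t_bus_read_ns, t_ovh_ns):.1f} MByte/s")
```

With these fitted constants the model gives identical store rates for 16-bit and 32-bit operations and a small gain on loads, which is the pattern in the benchmark. It says nothing about whether the fitted overhead is physically accurate.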

mpe wrote on 2020-09-26, 17:50:

And it shows healthy improvement when using 32bit writes:

80386 with VGA 512k Vesa VBE support
-------------------------------------
3146W 1402R Bytes per millisecond 33.55KHz 87.39Hz 1024x768x256 (S-VGA)
6293W 2806R 16 bit writes/reads
8182W 3727R 32 bit writes/reads

On the Cyrix DX2/80 with the CL-GD5249 (ignore the "Paradise or Western Digital" output), the vidspeed output looks like this:

80486 with VGA Paradise or Western Digital 2meg Vesa VBE support
----------------------------------------------------------------
7572W 2408R Bytes per millisecond 33.51KHz 87.28Hz 1024x768x256 (S-VGA)
15138W 4818R 16 bit writes/reads
15138W 6072R 32 bit writes/reads

The key difference (apart from the performance) is the 80486 processor in my system compared to the 80386 processor in yours (assuming your vidspeed version isn't outdated and simply unaware of the 80486). The 80486, as mentioned above, uses its write buffer to parallelize writing to graphics memory and running the write loop (the REP part of REP STOSW/STOSD), whereas the 80386 processor can't. So the performance increase you see on reads and writes when using 32-bit instructions is caused by the overhead in the processor, not by the video card interface.
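
To put numbers on that comparison, here is a small Python sketch that computes the 32-bit over 16-bit speedup from the two vidspeed outputs quoted above. The figures are copied from the thread; the dictionary layout and names are just for illustration.

```python
# (16-bit, 32-bit) results in bytes per millisecond, copied from the vidspeed outputs above.
results = {
    "mpe (reported as 80386)": {"write": (6293, 8182), "read": (2806, 3727)},
    "mkarcher (Cyrix DX2/80)": {"write": (15138, 15138), "read": (4818, 6072)},
}

def gain(pair):
    """Speedup of the 32-bit figure over the 16-bit figure."""
    rate16, rate32 = pair
    return rate32 / rate16

gains = {
    system: {op: round(gain(pair), 2) for op, pair in ops.items()}
    for system, ops in results.items()
}

for system, per_op in gains.items():
    print(system, per_op)
```

On the system with a write buffer the write gain is exactly 1.0, while both systems show a gain on reads.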

mpe wrote on 2020-09-26, 17:50:

In DOS it is in spitting distance to other VL-Bus cards, just some are a bit more optimised than others. The ARK1000 might be the fastest, but there is really nothing to write home about. They are all within 20% from each other.

That wouldn't be the case if the CL-GD had 16bit host bus. Look in the test for Viper VLB DOS results. That's a 16bit bus in action (as its secondary VGA chip sits on the ISA).

You are indeed right that these benchmarks show that the 16-bit bus interface of the Cirrus Logic card doesn't seem to hamper real-world DOS performance, at least on the Nx586-P90 system used to benchmark the cards. The awful Viper VLB results seem to be caused mainly by the ISA bus clock, which is 8 MHz at most, about a fifth of the VL clock. Do you happen to have the VIDSPEED output of a system with an ISA graphics card in a 386DX system showing that it does not profit from 32-bit access?

Reply 18 of 24, by SodaSuccubus

Rank Member

Great article!

It just further goes to show, though, that the long-hailed TSENG Labs cards aren't worth the museum price tag a lot of people like to put on them.

A 2-3 FPS improvement in DOOM compared to the more common Cirrus GD5429/8 cards. Yeah, they technically are "the best" on the charts. But with such a small difference in frames gained, as far as pure gaming is concerned, you'd be beyond fine without a TSENG.

Just don't torture yourself with a Viper VLB 😜

Reply 19 of 24, by mpe

Rank Oldbie

mkarcher wrote on 2020-10-03, 20:25:

See for example here (sorry for the ad-ridden page): https://datasheetspdf.com/datasheet/CL-GD5426.html. The pinout of the CL-GD542x series chips in local bus mode is on page 44, on page 47 is a list of pins for the host interface in local bus mode. It lists only 16 data pins. The description of the pins is on page 61. The data sheet says:

You might be right, actually. The information I missed in the datasheet is the "16bit VL-Bus host interface". So there is a 32-bit interface to memory but only a 16-bit one to the CPU. Which is weird.

[Attachment: Screenshot 2020-10-03 at 22.07.11.png]

I will investigate this further. I mean, the bonus when using 32-bit writes is quite substantial to be just an effect of buffering. The 80386 in my vidspeed output might be related to using the NexGen CPU, which is not 100% 486-compatible and can fool detection. I thought transactions with a destination on the bus are non-cacheable and that VL-Bus doesn't have a write post buffer like PCI does. Anyway, I will try to explain this.

I am now playing with a bunch of Trident cards which seem to have the opposite bottleneck: 32-bit to the host and 32-bit to memory, but only when using the accelerator. The chip is internally divided into an accelerator engine and a VGA engine, and the VGA engine only has 16-bit access to memory according to the functional diagram.
