VOGONS


First post, by mpe

Rank Oldbie

I am working on a project comparing the performance of selected early chipsets from the Socket 4 and Socket 5 era. So far I've tested the following:

  • Intel 430LX (Mercury)
  • Intel 430NX (Neptune)
  • Intel 430FX (Triton)
  • Intel 430HX (Triton II)
  • ALI M1451 Aladdin
  • UMC UM8890
  • SiS 501

Attachment: DSC_7147.jpeg

I got the first results in and thought it might be interesting to share them. Unsurprisingly, Intel Triton runs circles around the competition. However, there are some nice third-party choices too. I was surprised by the strong result of the UMC8891BF chipset, which reportedly has a 486 architecture (I found this not to be true).

This is still WIP and I'd like to cover more chipsets from this era, specifically different revisions of UMC, Intel chipsets with different cache configurations, SiS 5501, Opti Python/Cobra/Viper and more.

Attachment: Screenshot 2020-05-03 at 19.41.42.png
Attachment: Screenshot 2020-05-03 at 20.09.19.png
Attachment: Screenshot 2020-05-03 at 19.57.21.png

More details are in this blogpost.

Blog|NexGen 586|S4

Reply 2 of 42, by feipoa

Rank l33t++

Oh nice. I had planned a similar comparison of socket 5/7 chipsets. I'd really like to see the Triton w/async cache vs. Triton w/PB on there, as well as VX and TX chipsets.

Plan your life wisely, you'll be dead before you know it.

Reply 3 of 42, by mpe

Rank Oldbie

Thanks. The FX with async cache is definitely coming, as well as a 512k pipelined-burst upgrade and even a regular synchronous burst vs pipelined burst comparison.

I originally wanted to cap it at Socket 5 level. But since I let 430HX in, I have an obligation to test TX/VX and possibly VIA chipsets...


Reply 5 of 42, by feipoa

Rank l33t++

I think my 430FX board has solder pads for the pipeline burst cache, but it uses DIP async sockets. I am wondering if sourcing the chips and upgrading the board is worth the effort. Do you know if all revisions of the FX chipset support pipeline burst, or if only later revisions did?


Reply 6 of 42, by mpe

Rank Oldbie

AFAIK it was there from the beginning. My board has the very first S-spec (SZ966) of the 82437FX with A1 stepping and it works fine with a PB stick in the COAST slot.

I don't think retrofitting SMD-mount BSRAM chips is worth the hassle. There are plenty of 430FX boards selling for cheap with either type. In fact, one with async cache makes it a bit more valuable for people like me 😀


Reply 7 of 42, by feipoa

Rank l33t++

I had a tough time finding a socket 5 board which works properly with a Cyrix 2x40 at 80 MHz and an AMD K5-PR200, so with this requirement, it might be easier to retrofit the MB.


Reply 8 of 42, by mpe

Rank Oldbie

Finally found a Triton motherboard which can run both pipelined-burst as well as asynchronous SRAMs + some 512k PBSRAM COAST modules.

Looks like the Triton chipset benefits a lot from the PB cache, much more so than the UM8891BF. According to the datasheet, the Triton can only do 4-3-3-3 writes with async cache, so it is no surprise that it can be beaten even by its predecessors. The UMC can also do both cache types and doesn't have such a big gap.
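
To put the 4-3-3-3 figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a 66.67 MHz host bus, a 64-bit data path and 32-byte cache lines moved as four-beat bursts. The 3-1-1-1 pattern used for comparison is the commonly quoted pipelined-burst timing, not a figure from this post, and the function name is made up for illustration. The results are theoretical peaks, not predictions of any benchmark score.

```python
def burst_bandwidth_mb_s(timing, bus_mhz=66.67, bytes_per_beat=8):
    """Theoretical peak bandwidth of one burst given its cycle pattern.

    timing is a string such as "4-3-3-3": clock cycles taken by each beat.
    bus_mhz and bytes_per_beat are assumptions (66 MHz bus, 64-bit path).
    """
    cycles = [int(c) for c in timing.split("-")]
    bytes_moved = bytes_per_beat * len(cycles)   # 4 beats x 8 bytes = 32-byte line
    seconds = sum(cycles) / (bus_mhz * 1e6)      # total duration of the burst
    return bytes_moved / seconds / 1e6           # MB/s, with 1 MB = 10**6 bytes


for pattern in ("3-1-1-1", "4-3-3-3"):
    print(pattern, round(burst_bandwidth_mb_s(pattern), 1), "MB/s")
```

Under these assumptions a 4-3-3-3 burst takes 13 bus cycles against 6 for 3-1-1-1, so the same line takes a bit more than twice as long to write.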

Attachment: Screenshot 2020-05-13 at 00.50.21.png
Attachment: Screenshot 2020-05-13 at 00.50.04.png
Last edited by mpe on 2020-05-13, 06:41. Edited 1 time in total.


Reply 10 of 42, by H3nrik V!

Rank Oldbie

Nice! What about with no L2 cache? Just to see if async makes a big difference over none .. 😀

Please use the "quote" option if asking questions to what I write - it will really up the chances of me noticing 😀

Reply 11 of 42, by dionb

Rank l33t++

Maybe interesting to post my results from ~15 years ago:
http://dionb.eu/chipset1.html

All tests run with RamSpeed INTmark and FLOATmark. "C" scores were with L2 off, "P" scores with L2 on and any optimizations I could perform in BIOS. CPU was P100, RAM 2x 32MB EDO (or 2x32MB FP, 1x 64MB or 2x 32MB SDRAM where applicable)

Unfortunately I have none of the hardware anymore, so can't re-run or re-validate anything :'(

Compared to these results, the big difference is the UMC chipset. In my benches it was the worst solution, bar none, with lower performance even than any of the integrated VGA solutions (although I didn't have SiS5511+6202 to compare), suggesting it was addressing memory with 32b width not 64b. Given that I tested two different boards from different vendors it can't have been a one-off issue in a single board. However it clearly is the same chipset that gives mpe quite respectable scores, actually beating i430FX in mem read and being ahead of the pack in writes too. That is obviously full 64b.

I can't test anymore, but mpe, could you perhaps run INTmark and/or FLOATmark from the RamSpeed suite (simple DOS .EXE)? If you get similar scores to mine, the issue is in how RamSpeed measures. If not, well, it might remain a mystery...

Other takeaway is that performance can differ quite a lot between boards with same chipset. It doesn't change the big picture (PLB beats async, EDO beats FP, SDRAM beats EDO), but can differ by up to 10% (or 22% in the case of the UMC chipset)

Last edited by dionb on 2020-05-13, 14:27. Edited 2 times in total.

Reply 12 of 42, by GL1zdA

Rank Oldbie

Swiego wrote on 2020-05-13, 03:50:

Very cool. It would be neat if you could source Triflex/EISA or Triflex/PCI and see how either compares.

Unfortunately I don't know what memory modules I've used for it. It's from a Deskpro XL with the Pentium 133 CPU module installed. I don't have this system anymore. Here are the results:


Cache/Memory Benchmark
┌───────────────┬─────────────┬─────────────┬─────────────┬─────────────┐
│               │ Read        │ Write       │ Move        │ Average     │
├───────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
│ Cache Level 1 │ 170.52 MB/s │  64.52 MB/s │ 507.45 MB/s │ 247.50 MB/s │
│ Cache Level 2 │ 107.88 MB/s │  64.09 MB/s │ 128.05 MB/s │ 100.00 MB/s │
│ Memory        │  58.58 MB/s │  29.13 MB/s │  39.45 MB/s │  42.39 MB/s │
└───────────────┴─────────────┴─────────────┴─────────────┴─────────────┘
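
The Average column appears to be the plain arithmetic mean of the Read, Write and Move figures. A quick check in Python, using only the numbers from the table (the names are illustrative):

```python
# Read, Write, Move and reported Average, all in MB/s, copied from the table.
results = {
    "Cache Level 1": (170.52, 64.52, 507.45, 247.50),
    "Cache Level 2": (107.88, 64.09, 128.05, 100.00),
    "Memory":        (58.58, 29.13, 39.45, 42.39),
}


def mean_of_three(read, write, move):
    """Arithmetic mean of the three throughput figures."""
    return (read + write + move) / 3


for level, (read, write, move, reported) in results.items():
    computed = mean_of_three(read, write, move)
    # Show the difference rather than assume an exact match, since the
    # tool may round or truncate its output.
    print(f"{level}: computed {computed:.4f}, reported {reported:.2f}, "
          f"difference {abs(computed - reported):.4f}")
```

All three rows agree with the reported averages to within about 0.01 MB/s, which is consistent with a simple mean plus rounding in the tool's output.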

Attachments

  • XL133.png (Deskpro XL 133 System Speed Test results)

getquake.gif | InfoWorld/PC Magazine Indices

Reply 13 of 42, by mpe

Rank Oldbie

Sadly I only have one board with the UMC8890, but I am trying to source more, especially one with the AF revision. I will see if there is anything fundamentally different. But guessing from the layout of motherboards featuring AF, it should be using the same architecture (no 32-bit path). The AF boards seem to come with async cache only, and I've seen a board with dual-banked cache.

Chipset revisions aside, I found the differences between motherboards to be subtle. Some boards have only a handful of settings in BIOS (like Intel boards) and might have some important settings disabled by default. Then it is down to chipset programming to normalise the difference, after which they perform the same. I also compared different steppings of 430LX, NX, FX and HX, but once the chipset registers are normalised, there is nothing to write home about.

Obviously I have no datasheets for the UMC8890, so I can't do much. Fortunately, Gigabyte put plenty of BIOS settings into the GA-586AS.

I have some Pentium motherboards with Opti chipsets coming. These also have a somewhat bad reputation, including claims of being 486 designs. We'll see...

Will try to run the tool to see if I can replicate at least some of your numbers.


Reply 14 of 42, by mpe

Rank Oldbie

H3nrik V! wrote on 2020-05-13, 08:11:

Nice! What about with no L2 cache? Just to see if async makes a big difference over none .. 😀

This is actually very interesting. I mean the perspective of the cache-less config. As soon as you program the 430FX chipset to use asynchronous cache the write performance tanks:

Attachment: Screenshot 2020-05-13 at 23.09.13.png

With write-heavy Quake test being hugely affected:

Attachment: Screenshot 2020-05-13 at 23.09.43.png

Fortunately, not every program is like Quake and despite async cache fighting an uphill battle on write-crippled memory bus, the marginally improved read latency of async cache can still produce some measurable gains in "normal" apps:

Attachment: Screenshot 2020-05-13 at 23.19.07.png

I've seen this on two Triton motherboards with both chipset steppings. From the Triton datasheet it is clear that when async SRAMs are used, the write latency goes up to 4-3-3-3, which is higher than on previous Intel chipsets. It can't be programmed by BIOS or registers, so this is a feature of Triton 😀

I think this must be either a chipset bug they never bothered to fix or a deliberate choice. None of the other chipsets shows this (and none of the later Intel chipsets supports async SRAMs).


Reply 15 of 42, by H3nrik V!

Rank Oldbie

mpe wrote on 2020-05-13, 22:29:
This is actually very interesting. I mean the perspective of the cache-less config. As soon as you program the 430FX chipset to […]

So, in some cases, NO cache is faster than async cache? Are the async cache tests also with EDO?


Reply 16 of 42, by mpe

Rank Oldbie

I tested all combinations. The above is all EDO. The benefit of EDO is twice as big in a cache-less system though.

This is somewhat in line with the marketing message Intel was pushing back in 1995: that EDO (or hyper page mode RAM, as they call it) is a cost-effective alternative to expensive SRAMs...

One could say they crippled async cache support in Triton to be able to make that claim 😀


Reply 17 of 42, by TheMobRules

Rank Oldbie

Great work on those benchmarks, I don't know how I missed this thread before! I am very interested in doing a similar comparison myself, albeit with a more limited number of boards...

There are two things in particular that I want to compare: 430NX vs 430FX using Async L2 on both, to see how much the chipset by itself impacts performance, and I am also interested in seeing how much better an actual 1994 P100 using the Neptune chipset is vs the best 486 of that year (say, a DX4-100WB with a fast chipset like SiS471)... I know the P100 will be better, but I wonder if the difference will be as big with 1994 Pentium parts.

I have a FIC PT-2003 (430FX) with a COAST slot that is supposed to support both Async and PB cache (confirmed by other owners of this board), but in my case it only works with the Async cache module the board came with, none of the PB modules I have tried seem to be detected. I'll keep looking into that, but it may be related to the board revision or something like that.

Another snag I've hit is my terrible luck with finding 430NX boards, all I have gotten from different hauls have been utterly dead... but I may have just found one that works, we'll see when it arrives.

Reply 18 of 42, by mpe

Rank Oldbie

TheMobRules wrote on 2020-05-28, 21:18:

There are two things in particular that I want to compare: 430NX vs 430FX using Async L2 on both, to see how much the chipset by itself impacts performance

It turns out that given the big hit 430FX suffers when using async cache, the overall performance is almost identical to 430NX (also with async cache). See - this. Poor writes negate all the improvements of the Triton.

TheMobRules wrote on 2020-05-28, 21:18:

, and I am also interested in seeing how much better an actual 1994 P100 using the Neptune chipset is vs the best 486 of that year (say, a DX4-100WB with a fast chipset like SiS471)... I know the P100 will be better, but I wonder if the difference will be as big with 1994 Pentium parts.

This is well covered in The Ultimate 486 Benchmark Comparison, which used a P100 as a baseline. There is no Neptune chipset there, though...


Reply 19 of 42, by TheMobRules

Rank Oldbie

mpe wrote on 2020-05-28, 21:37:

This is well covered in The Ultimate 486 Benchmark Comparison which used P100 as a baseline. Although no Neptune chipset there...

Right, I have seen those benchmarks, but the P100 reference in those tests uses a 430TX which is a more modern chipset, as well as 512KB of PB that makes a significant difference according to your results. So I was thinking that a more adequate comparison would be with the Pentium hardware that was available when the DX4 came out (1994), that would be a 430NX-based P100 in the best case.

Of course, I would expect FPU-dependent tests such as Quake to heavily favor the Pentium, but when only integer operations are involved the Neptune P100 may not have such an easy win.