VOGONS


Let's benchmark our systems with caches disabled

First post, by clueless1

Rank l33t

Lots of us here grew up through a variety of DOS generations, where system speed was critical to good gameplay. In the 386 days, we were lucky if the latest games ran well on our systems. But within a generation, those same games played too fast on our 486-class PCs.

Today, those of us with retro PCs try to recapture as much of that nostalgia as we can, but if you're constrained by space or budget, it's not always possible to have a 386 for 386 games, a 486 for 486 games, and a Pentium-class system for its generation of games.

Phil from philscomputerlab.com made a very nice video demonstrating how to build a system that can be many generations in one box. Phil also set up the Ultimate VGA Benchmark Database Project so that members of the community could add their own results. This benefits everyone!

Led by advice from forum members Tertz and gdjacobs, I pulled what I felt were "good representatives" for each type of old CPU out of Phil's VGA Benchmark Database, from the 100MHz 486 down to the slowest system represented, an i386DX-16. These results are now our guideposts as we determine where our Pentium-class and higher processors fall when we disable various caches to intentionally drop performance. For this purpose, I set up a Google Sheets page to add results for CPUs with all caches enabled AND with caches disabled. In the spreadsheet the guidepost results are indicated in red text.

[Attachment: cachebench.PNG]

So here's how this works: If you don't already have this on every DOS machine you own ( 😉 ), download Phil's VGA Benchmark kit here. For instructions, see his thread linked a couple of paragraphs up. Now download gerwin's Setmul utility here. Instructions are in the link. Setmul will allow you to disable the L1 cache on any x86 processor from the 486 on up and allows disabling L2 cache on K6 Mobile and VIA C3 chips.

Benchmark your systems with all caches enabled, as well as with whatever combinations of caches you can disable. In the spreadsheet, there are some results there that give examples with L1 cache disabled, L2 cache disabled and both L1 and L2 caches disabled. For this test, we are running 3DBENCH 1.0c (aka 3DBENCH2), PCPBench, Speedsys, and Doom. Speedsys is in the Speedsys folder in Phil's kit. I left out Quake because for our purposes, it is not relevant--an intentionally slow system would not be running it anyway. In the event your slowed down system is running at 286 or slower speeds, substitute 3DBENCH 1.0 (it is in the 3DBENCH folder in Phil's kit). As you can see from the Pentium II results at the bottom of the spreadsheet, 3DBENCH 1.0c does not scale well at very slow speeds. Version 1.0 should do the trick here. Thanks to Phil for providing that info.

I have the spreadsheet sorted by Doom results because as the only actual game benchmark, I feel it should be factored highest. Feel free to download a copy of the spreadsheet and sort it however else you'd like to.

You may access the spreadsheet here: https://docs.google.com/spreadsheets/d/1uKhCI … dit?usp=sharing

EDIT: If in the process of adding your results you accidentally muck up the spreadsheet, please post here ASAP so I can revert it to its previous revision. I've had an instance where the spreadsheet was mangled and I didn't discover it until months later. In the process of restoring the last good version, we lost results that people had added after the last known good version. Thanks.

When entering results, input them at the bottom of the sheet. I will periodically re-sort the list. To keep the spreadsheet tidy, just report your CPU at the speed it is running (e.g., AMD K6-2 170) rather than its rated speed (AMD K6-2 500). So if you are reporting multiple clock speeds with the same CPU, each of your entries will have a different CPU description. See my results on the spreadsheet for examples. The CPUs I tested were a K6-2 550 @ 166, 238, 350, 366, and 550, Pentium 120 @ 75-133MHz, Pentium II 333 @ 133-333MHz, a Celeron 333 (locked), and POD 200. The motherboard I used for the K6-2 includes 512KB L2, but does not support disabling the cache (so no results with L1+L2 disabled).

I'm really curious to get as many types of CPUs on this list as possible, especially some of the more exotic/rare ones. Thanks for taking the time to read!

EDIT: Now that we have a decent amount of results, I picked some common CPUs to highlight on their own spreadsheet tabs. I chose results from common and fast motherboard chipsets and graphics cards.

Last edited by clueless1 on 2020-01-23, 11:45. Edited 4 times in total.

The more I learn, the more I realize how much I don't know.
OPL3 FM vs. Roland MT-32 vs. General MIDI DOS Game Comparison
Let's benchmark our systems with cache disabled
DOS PCI Graphics Card Benchmarks

Reply 1 of 194, by clueless1

Kamerat,

Thanks for adding your Athlon Thunderbird. I like the performance with L1+L2 disabled--almost exactly at the Am386DX-40 guidepost result.

I went ahead and re-sorted the list. Much appreciated. 😀

Reply 2 of 194, by Kamerat

Rank Oldbie

No problem. 😀 If there's any interest I can benchmark at other speeds too; the CPU runs up to about 1.47GHz (1.33 is stock). The motherboard also has an option for a 5x multiplier, but the system refuses to boot with it.

DOS Sound Blaster compatibility: PCI sound cards vs. PCI chipsets
YouTube channel

Reply 3 of 194, by candle_86

Rank l33t

I'm curious why a CPU would suddenly tank to 386 performance simply by disabling cache; it should still be faster. I mean, the original Celeron 266 and Celeron 300 certainly operated closer to Pentium levels. Shouldn't a faster chip like a Pentium 3 or an Athlon perform similar to, if not faster than, a Celeron 266? Even with cache disabled, the memory access is a lot faster than even cache was for a Pentium 1 class system: 66MHz vs. say 200/266 for an Athlon system.

Reply 4 of 194, by clueless1

candle_86, are you referring specifically to the Thunderbird? What I found odd was how slow the Celeron 333 and Pentium II 333 got when L1 cache was disabled. So L1 plays an even bigger role in their performance if they start behaving like a 286 without any L1.

Kamerat, I think the faster you go on your T-bird, the higher your score with L1+L2 disabled will get, which will put it out of range of what we're looking for. But I'm not opposed to you adding more data to the sheet. 😀 Always curious about benchmark results here!

Reply 5 of 194, by melbar

Rank Oldbie

@Kamerat

I am wondering a little about your 3DBench & DOOM values with the T-Bird & GeForce FX5500. Shouldn't the FX5500 be as strong as my GF2MX-400, or even a little stronger?
When I look at the specs on wiki, the FX5500, even with 64-bit DDR, has 20% higher memory bandwidth than a stock GF2MX-400. Its raw power is also more than double (FX5500 - 1080MOperations/s, GF2MX-400 - 400MOperations/s).
Finally, the T-Bird should be a little faster than a K6-2...

@candle_86

You have to see the CPU as a complex unit with many circuits, transistors & also small caches. The different parts communicate/work together, as can be seen in the attached block diagrams of the Pentium/PII/PIII/Athlon architectures. The L1 cache sits between the 'bus interface unit' and the 'instruction decoder/control unit'. When you disable this small cache, it has a huge impact on system performance, and you cannot compare it with the speed of your RAM, which is 'outside' your CPU, connected to it at system bus level.
Let me put it this way: the whole architecture of systems >= PIII depends on the L1 cache.

Attachments

  • Intel_Pentium.png (Pentium architecture)
  • P6_func_diag.png (P6 (Pentium II) architecture)
  • amd_athlon_architektur.png (Athlon architecture)
  • intel_pentium3_architektur.png (Pentium III architecture)

#1 K6-2/500, #2 Athlon1200, #3 Celeron1000A, #4 A64-3700, #5 P4HT-3200, #6 P4-2800, #7 Am486DX2-66

Reply 6 of 194, by clueless1

melbar wrote:

@Kamerat

I am wondering a little about your 3DBench & DOOM values with the T-Bird & GeForce FX5500. Shouldn't the FX5500 be as strong as my GF2MX-400, or even a little stronger?

I have an FX5200 and in my testing it is equal to a GF2 MX and TNT2 M64 in DOS benchmarks.

Speedsys is 100% CPU dependent, so in comparison with my K6-2 550 (same clock speed), it is 4.5% faster.
PCPBench is dependent on all system parameters (CPU, graphics, chipset, RAM) but seems to be a little heavier on CPU than 3DBench. 3DBench seems to weight the graphics card more heavily. With that in mind, his PCPBench score is 53% faster than my K6-2 550 and 47% faster than your K6-2 500, while your 3DBench score is 14.7% higher than his. It is a little weird. 😕

Reply 7 of 194, by BSA Starfire

Rank Oldbie

Quick set of tests with my Socket 370 Celeron 466 MHz "Mendocino", Asus CUSL2 with Intel 815E chipset, 256 MB SDRAM, STB Nitro DVD (Chromatic Mpact2 4MB) AGP graphics card. No utilities, booted from a Windows ME boot disk.
All caches on scores:
3Dbench2 : 241.9
PCP bench 112.
DOOM realtics 1007
Speedsys 541.95

Level 2 cache off:
3Dbench2 222.4
PCP Bench 82.0
DOOM realtics 1026
Speedsys 541.95

Level1 cache off, level2 cache on.
3Dbench2 7.3
PCP Bench 1.6
DOOM realtics 27334
Speedsys 5.65

Both cache off

3Dbench2 7.3
PCPBench 1.6
DOOM realtics 27334
Speedsys 5.62

So, what we learn from this is: "Mendocino" is utterly crippled without Level 1 cache. If I'm honest, it feels less responsive in regular use than my 386 SX 25 MHz machine. Level 2 cache seems to have little impact on DOS scores. This is not a good choice for adjusting DOS performance by toggling caches. A P6 core without Level 1 is super slow.
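To put rough numbers on that conclusion, here is a minimal Python sketch that turns the scores posted above into "times slower" factors. The scores are copied from this post (the PCPBench "112." is taken as 112.0); the function and dictionary names are only illustrative, not part of any benchmark tool.

```python
# Scores copied from the Celeron 466 "Mendocino" results in this post.
# Higher is better for 3DBench2, PCPBench and Speedsys;
# lower is better for Doom realtics.
ALL_ON = {"3DBench2": 241.9, "PCPBench": 112.0, "Doom realtics": 1007, "Speedsys": 541.95}
L2_OFF = {"3DBench2": 222.4, "PCPBench": 82.0, "Doom realtics": 1026, "Speedsys": 541.95}
L1_OFF = {"3DBench2": 7.3, "PCPBench": 1.6, "Doom realtics": 27334, "Speedsys": 5.65}

LOWER_IS_BETTER = {"Doom realtics"}


def slowdown(test, baseline, slowed):
    """Return how many times slower `slowed` is than `baseline` for one test."""
    if test in LOWER_IS_BETTER:
        return slowed[test] / baseline[test]
    return baseline[test] / slowed[test]


if __name__ == "__main__":
    for test in ALL_ON:
        l2 = slowdown(test, ALL_ON, L2_OFF)
        l1 = slowdown(test, ALL_ON, L1_OFF)
        print(f"{test:14s} L2 off: {l2:5.2f}x slower   L1 off: {l1:6.1f}x slower")
```

With these inputs, disabling L2 costs well under 1.5x in every test, while disabling L1 costs roughly 27x in Doom and more in the synthetic tests, which is the pattern described above.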

286 20MHz,1MB RAM,Trident 8900B 1MB, Conner CFA-170A.SB 1350B
386SX 33MHz,ULSI 387,4MB Ram,OAK OTI077 1MB. Seagate ST1144A, MS WSS audio
Amstrad PC 9486i, DX/2 66, 16 MB RAM, Cirrus SVGA,Win 95,SB 16
Cyrix MII 333,128MB,SiS 6326 H0 rev,ESS 1869,Win ME

Reply 8 of 194, by BSA Starfire

Cyrix MII PR333 (83 MHz), ALi Aladdin V, ATI Rage Pro 8 MB AGP, 512MB SDRAM. Windows ME boot disk.

All caches on:
3Dbench2 229.9
PCP Bench 67.4
DOOM Realtics 662
Speedsys 180.97

Level 2 cache off.

3Dbench2 223
PCP Bench 63.0
DOOM Realtics 692
Speedsys 181.00

No option in this BIOS to disable L1 cache.

Reply 9 of 194, by clueless1

BSA Starfire wrote:

No option in this BIOS to disable L1 cache.

Can Setmul disable L1?

Reply 10 of 194, by BSA Starfire

My other Cyrix MII 333(75MHz x 3.5 FSB 266 MHz), SiS 5591 chipset MB(ECS P5SD-AS), Matrox Mystique MGA-1064SG 4MB PCI, 128 MB SD-RAM.

All caches on:
3DBench2 370.9
PCPBench 84.4
DOOM Realtics 753
Speedsys 187.99

Level 2 cache off

3DBench2 342.8
PCPBench 68.6
DOOM Realtics 801
Speedsys 187.97

Level 1 off Level 2 on.

3DBench2 15.6
PCPBENCH 3.7
DOOM Realtics 11658
Speedsys 13.41

All caches off.
3DBench2 16.6
PCPBENCH 3.6
DOOM Realtics 12894
Speedsys 13.39

Reply 11 of 194, by BSA Starfire

clueless1 wrote:
BSA Starfire wrote:

No option in this BIOS to disable L1 cache.

Can Setmul disable L1?

No, it did not on this particular board. It's an OEM TIME board; they were always awkward buggers! 🤣

Reply 12 of 194, by clueless1

Thanks, BSA Starfire! Those Cyrix chips are real interesting. They perform great in Doom but PCPBench and Speedsys do not like them.

I wasn't sure on the Ali Aladdin-based system, but I figured you were running at 4.0x83=333Mhz. Is that right?

I entered all three of your systems you posted today if anyone wants to check them out.

Reply 13 of 194, by BSA Starfire

clueless1 wrote:

Thanks, BSA Starfire! Those Cyrix chips are real interesting. They perform great in Doom but PCPBench and Speedsys do not like them.

I wasn't sure on the Ali Aladdin-based system, but I figured you were running at 4.0x83=333Mhz. Is that right?

I entered all three of your systems you posted today if anyone wants to check them out.

No, 3x83MHz to give 250 MHz; all Cyrix MIIs are PR-rated chips.

Reply 14 of 194, by melbar

Yes, ever since the 6x86 generation, Cyrix has labeled its chips with a P-rating. Cyrix MII 333: 250 MHz (83 MHz FSB).
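As a quick illustration of the arithmetic, the real core clock is just the bus clock times the multiplier, and the "333" is only a performance rating. The sketch below uses the two MII PR333 configurations mentioned in this thread; it assumes the "83 MHz" bus is nominally 83.33 MHz, and the function name is made up for the example.

```python
def core_clock_mhz(fsb_mhz, multiplier):
    """Actual core clock in MHz = front-side bus clock x multiplier."""
    return fsb_mhz * multiplier


# (FSB in MHz, multiplier) for the two MII PR333 setups in this thread.
# Assumption: the "83 MHz" bus setting is nominally 83.33 MHz.
CONFIGS = {
    "MII PR333 on ALi Aladdin V": (83.33, 3.0),
    "MII PR333 on SiS 5591": (75.0, 3.5),
}

if __name__ == "__main__":
    for name, (fsb, mult) in CONFIGS.items():
        print(f"{name}: {mult} x {fsb} MHz = {core_clock_mhz(fsb, mult):.1f} MHz")
```

The 3.5 x 75 setting works out to 262.5 MHz, which is usually quoted rounded up as 266 MHz.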

Regarding your results with the MII: it's really interesting that the first 3 benchmarks (OK, 3DBench was a little higher) are around PII 233 and PII 266 level, but when you look at the DOOM result you're slingshotted into the 500MHz range, and this with an ATI Rage Pro. And you can say that the PIIs were not running with the slowest graphics cards. I didn't expect that from the Rage Pro, especially when you compare the hard facts:

Rage Pro - Core clock (75MHz) - Memory clock (75MHz) - MPixels/s (75) - MTexels/s (75) - Memory Bandwidth (0.600GB/s)
GeForce2 MX - Core clock (175MHz) - Memory clock (166MHz) - MPixels/s (350) - MTexels/s (700) - Memory Bandwidth (2.656GB/s)

Edit:

Also the Doom value of your 2nd Cyrix (3.5x75=266MHz) seems really strange. OK, the 3DBench is 61% higher compared to your other Cyrix CPU, and this with only 16MHz more core clock and 8MHz less FSB???
With ~750 realtics you are at the level of a K6-2 366 with Riva TNT2 M64. And the FPS are 23% higher with the Mystique card compared to the Celeron 333 with GF2-MX.... hard to believe.

Matrox MGA-1064SG (Mystique) - Core clock (60MHz) - Memory clock (90MHz) - MPixels/s (???) - MTexels/s (25) - Memory Bandwidth (should be ~0.720GB/s)
Riva TNT2 M64 - Core clock (125MHz) - Memory clock (150MHz) - MPixels/s (250) - MTexels/s (250) - Memory Bandwidth (1.200GB/s)
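For reference, the memory bandwidth figures above can be reproduced as memory clock times bus width. This is a minimal sketch: the memory clocks are the ones listed in this post, while the bus widths (64-bit for the Rage Pro, Mystique and TNT2 M64, 128-bit for the GeForce2 MX) and single-data-rate memory are assumptions that happen to match the listed numbers.

```python
def bandwidth_gb_s(mem_clock_mhz, bus_width_bits):
    """Peak memory bandwidth in GB/s for single-data-rate memory."""
    bytes_per_transfer = bus_width_bits / 8
    return mem_clock_mhz * 1e6 * bytes_per_transfer / 1e9


# (memory clock in MHz from the post, ASSUMED bus width in bits)
CARDS = {
    "Rage Pro": (75, 64),
    "GeForce2 MX": (166, 128),
    "Matrox Mystique": (90, 64),
    "Riva TNT2 M64": (150, 64),
}

if __name__ == "__main__":
    for name, (clock, width) in CARDS.items():
        print(f"{name}: {bandwidth_gb_s(clock, width):.3f} GB/s")
```

These are theoretical peaks only; as the Doom results in this thread suggest, they say little about how a card behaves in DOS.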

Reply 15 of 194, by BSA Starfire

Both my Cyrix MII 333 systems are also in Phil's VGA database if you want to take a look (the one in the ECS SiS 5591 board is actually an IBM-branded one). There are also scores using a Matrox Millennium G450 32MB PCI & the on-board SiS 6326 AGP with 4MB VRAM on the board rather than the Mystique 220 PCI; that is very quick in DOS too. I use the Mystique 220 because of its VGA signal quality alongside a Creative DXR-2 DVD decoder (this system is my main DVD player!!!).
I also have a set of scores on the spreadsheet using a PCCHIPS M590 motherboard. This is also a SiS 5591 chipset with 6326 AGP on board, but with 8MB VRAM; scores are there with CPUs from 6x86MX PR266 (208 MHz), MII 333 (250 MHz), K6-2 300 and 450 & K6-III 400. Note that the "REAL" FSB max on the M590 is 90MHz, not 100MHz as reported by the BIOS. I'm using Jan Steunebrink's BIOS patch from 2004 on this one. The SiS 5591 seems a very good combination with the Cyrix MII/MX CPUs.

Reply 16 of 194, by melbar

I've found your entries in Phil's VGA database. Yes, the SiS 5591 chipset might be a strong player with the IBM 6x86MX (@266MHz).
I see three different rows, as you said: one with the Mystique 220, one with the G450 and finally one with the onboard SiS 6326 AGP.

All of them seem to be really strong at DOOM, but there is this situation:
With 3DBench & PCPBench, the onboard has no chance, but with DOOM it's around 17% ahead of the dedicated cards. Looks strange, right?

When I look at your further results with the SiS 5591 chipset, you have tested the IBM 6x86MX (@250MHz), K6-2 (@300MHz and 450MHz) and the K6-3 (@400MHz),
and they are all really good with DOOM.

Finally we can see, as clueless already said, that 3DBench is weighted more toward the VGA card. PCPBench is weighted more toward the CPU's raw power, and that is something you don't see at all in the DOOM benchmark.

Reply 17 of 194, by Kamerat

melbar wrote:

@Kamerat

I am wondering a little about your 3DBench & DOOM values with the T-Bird & GeForce FX5500. Shouldn't the FX5500 be as strong as my GF2MX-400, or even a little stronger?
When I look at the specs on wiki, the FX5500, even with 64-bit DDR, has 20% higher memory bandwidth than a stock GF2MX-400. Its raw power is also more than double (FX5500 - 1080MOperations/s, GF2MX-400 - 400MOperations/s).
Finally, the T-Bird should be a little faster than a K6-2...

I don't know why I got so bad results. Changing the graphic card to ATI Radeon 9700 Pro improved 3DBench and Doom somewhat, 407.8fps vs. 332.6fps and 84.49fps vs. 80.57fps. Still my Doom result is kind of shitty compared to your K6-2 500 @ 117.81fps.
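Since some posts in this thread quote Doom results in realtics and others in fps, here is a small Python sketch for converting between the two. It assumes the benchmark demo is 2134 gametics long and that Doom runs at 35 tics per second; the demo length is an assumption on my part, but it is consistent with the fps figures quoted above. The function names are just for illustration.

```python
TICRATE = 35      # Doom game tics per second
GAMETICS = 2134   # ASSUMED length of the timedemo in game tics


def realtics_to_fps(realtics, gametics=GAMETICS):
    """Convert a timedemo realtics result to average frames per second."""
    return gametics * TICRATE / realtics


def fps_to_realtics(fps, gametics=GAMETICS):
    """Convert average frames per second back to the nearest realtics value."""
    return round(gametics * TICRATE / fps)


if __name__ == "__main__":
    # fps figures quoted in this post
    for fps in (84.49, 80.57, 117.81):
        print(f"{fps} fps is about {fps_to_realtics(fps)} realtics")
    # realtics figure from the Celeron 466 results earlier in the thread
    print(f"1007 realtics is about {realtics_to_fps(1007):.2f} fps")
```

Lower realtics means faster, so when comparing against the spreadsheet keep in mind that the two scales run in opposite directions.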

Reply 18 of 194, by BSA Starfire

We had a discussion recently related to the DOOM scores on SiS VGA with the "Super 7" SiS 530 chipset; this was with the inferior SiS 530 with embedded 6326 and shared memory. You can also find my scores with this setup in the database using a K6-2 450. For anything other than DOOM, though, a discrete VGA card wins out.

Your comments on VGA core speeds and memory speeds are interesting. To me it appears that some VGA cores are just particularly good for DOS; clocks, bandwidth etc. don't seem to matter. Tseng ET6000 and ARK Logic PV2000 seem to be examples of this, as does the SiS 6326 score in DOOM.
I'd like to see Phil's spreadsheet but without the QUAKE scores(it's too Intel FPU loaded/biased), I imagine that would give some interesting and perhaps unexpected results.
Perhaps we can achieve that here with the caches on section. Would be good to see.

Reply 19 of 194, by clueless1

melbar,

I've seen certain video cards favor different benchmarks. ATI cards seem to do really well in Doom, for example. But not so well in SVGA Quake.

For this thread, I think the results with cache(s) disabled are the focus, and that's where a lot of the weirdness from the more exotic CPUs at the upper speeds goes away.

An interesting example is my Packard Bell system (the POD 200): with L1 cache on, there is a significant difference in 3DBench and Doom between the onboard Cirrus Logic graphics and the TNT2 M64. But with L1 disabled, there is practically zero difference. The CPU bottleneck takes away any advantage the video sub-system had.
