VOGONS


The Ultimate 686 Benchmark Comparison


Reply 80 of 145, by feipoa

Rank l33t++

I tried various minigl versions and picked the one with the best performance.

Those charts are of little value because they do not show the results with and without the 3DNow! patch.

Plan your life wisely, you'll be dead before you know it.

Reply 81 of 145, by falloutboy

Rank Member

The K6 450 is an overclocked plain K6, not a K6-2; he used a Vapochill LS.
K6 450 --> 57.2 fps
K6-2 450 --> 103.6 fps
You need a 3dfx card to benefit from this patch. Voodoo2 SLI would be the best in low resolutions.

Here are my numbers.

Quake2 3.20 + 3DNow! patch
For Voodoo3: to use the miniGL driver, rename the miniGL 1.49 driver to opengl32.dll and move it into the game directory. Then select "3DNow! OpenGL" as the renderer in the video options.
The "3DNow 3Dfx OpenGL" setting is for the Voodoo2; do not use it! It will not work with the Voodoo3, even if you rename the miniGL 1.49 driver to "3dfxgl.dll".
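The renaming step can also be scripted. This is only a minimal sketch in Python, assuming the miniGL driver has already been extracted somewhere on disk; the function name `install_minigl` and the example paths in the comment are hypothetical and not part of any 3dfx or id Software tool.

```python
import shutil
from pathlib import Path


def install_minigl(driver_file: str, game_dir: str) -> Path:
    """Copy a miniGL driver into the game directory as opengl32.dll."""
    source = Path(driver_file)
    target_dir = Path(game_dir)
    if not source.is_file():
        raise FileNotFoundError(f"driver not found: {source}")
    if not target_dir.is_dir():
        raise NotADirectoryError(f"game directory not found: {target_dir}")
    target = target_dir / "opengl32.dll"
    # Copy instead of moving so the original download stays untouched.
    shutil.copyfile(source, target)
    return target


# Example call (hypothetical paths, adjust to your own system):
# install_minigl(r"C:\drivers\minigl149\driver.dll", r"C:\Quake2")
```

After copying, the renderer still has to be selected by hand in the video options as described above.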

timedemo 1
map demo1.dm2

K6-III+@550 MHz
Tyan S1590 , 256 MB RAM
Voodoo3 3000 @190 MHz ; driver 1.07

320x240
44.1 fps software
56.4 fps 3DNow_software
67.3 fps def.OGL
78.8 fps 3DNow.OGL
81.9 fps miniGL 1.49
98.2 fps 3DNow + miniGL 1.47
99.1 fps 3DNow + miniGL 1.49
69.3 fps miniGL 1.46
76.3 fps 3DNow + miniGL 1.46

640x480
69.6 fps WickedGL 3.02
74.0 fps 3DNow + WickedGL 3.02

This is just a driver comparison. For the 3D-accelerated modes, 320x240 vs. 640x480 does not make much difference in performance; I think 640x480 is slightly faster on my Voodoo3.
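To make the effect of the 3DNow! patch easier to compare across renderers, the numbers above can be turned into percentage gains. A small sketch; the dictionary layout and the name `gain_percent` are only for illustration, while the fps values are the ones listed above.

```python
def gain_percent(before_fps: float, after_fps: float) -> float:
    """Relative speedup of after_fps over before_fps, in percent."""
    return (after_fps / before_fps - 1.0) * 100.0


# (fps without 3DNow! patch, fps with 3DNow! patch), K6-III+ @ 550 MHz
RESULTS = {
    "software": (44.1, 56.4),
    "default OpenGL": (67.3, 78.8),
    "miniGL 1.46": (69.3, 76.3),
    "miniGL 1.49": (81.9, 99.1),
    "WickedGL 3.02 (640x480)": (69.6, 74.0),
}

for renderer, (plain, patched) in RESULTS.items():
    print(f"{renderer:<24} {plain:6.1f} -> {patched:6.1f} fps  "
          f"(+{gain_percent(plain, patched):.1f}%)")
```

miniGL 1.47 is left out because only the patched result was posted for it.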

EDIT: Added miniGL 1.46 results

Last edited by falloutboy on 2015-10-05, 22:31. Edited 2 times in total.

Reply 83 of 145, by falloutboy

Rank Member
386SX wrote:

From default opengl to 3dnow + minigl.... it's like two different planets... THAT was low level optimization!

Yes, and it wasn't even intended to run on a Voodoo3. Like I said before, it's optimised for the Voodoo2.
An early AnandTech review shows what they got with the Voodoo2 SLI (12 MB) at 640x480:

44.2 fps K6-2 333 MHz without 3DNow! patch
76.4 fps K6-2 333 MHz with 3DNow! patch

That's an increase of just over 72%.
In a later review, with the finished 3DNow! patch, they even got 82.5 fps with the K6-2 300 MHz + Voodoo2 SLI. That is impressive 😲

http://www.anandtech.com/show/160/10
http://www.anandtech.com/show/277/9

Last edited by falloutboy on 2015-10-05, 18:38. Edited 1 time in total.

Reply 84 of 145, by 386SX

Rank l33t
falloutboy wrote:
386SX wrote:

From default opengl to 3dnow + minigl.... it's like two different planets... THAT was low level optimization!

Yes, and it wasn't even intended to run on a voodoo3. Like I said before it's optimised for the Voodoo 2.
Seeing an early anandtech review, what they got with the Voodoo2 SLI (12 MB) at 640x480.

44.2 fps K6-2 333 MHz without 3DNow! patch
76.4 fps K6-2 333 MHz with 3DNow! patch

That's just over 72% increase.
In a later review, with the finished 3DNow! patch, they got even 82.5 fps with the K6-2 300 MHz + Voodoo2 SLI. That is impressive 😲

http://www.anandtech.com/show/160/10
http://www.anandtech.com/show/277/9

It makes me wonder whether all the other "supported" games/tests were actually optimized or just compatible in some way. Actually, I don't understand how the Voodoo2 architecture could benefit that much. Probably because the CPUs of 1997/98 could not use most of its (SLI) capability?

Reply 85 of 145, by feipoa

Rank l33t++

falloutboy, from your results, you obtained a 20% increase on a Voodoo-based card. For my tests with a K6-III with non-Voodoo cards, I received a 0-10% increase, with the most benefit received from the GeForce2 MX400.

The Ultimate 686 Benchmark Comparison was run with a Matrox G200. As my results indicate a 0% increase with a G450, it is unlikely that a G200 would yield any performance benefit from the 3DNow! patch. As such, there seems little point in re-running 3DNow!-capable CPUs for this particular comparison. A sigh of relief.

You noted that the 3DNow! patch benefit was mostly for K6-II CPUs. I wonder what benefit Winchip CPUs would have if paired with a Voodoo2 in comparison to K6-II CPUs.

EDIT: In retrospect, I probably would have run this benchmark comparison using a Voodoo2, especially now that I have a Voodoo2 card running perfectly in my 66 MHz FSB-based Socket 3 system. This, however, would have given preferential treatment to the K6-2s and may have skewed the average performance comparison amongst the chips.


Reply 86 of 145, by falloutboy

Rank Member
386SX wrote:

1997/98 cpu could not use most of its (sli) capability?

Yes, check out Phil's Voodoo 2 processor scaling project: http://www.philscomputerlab.com/voodoo-2-and- … ng-project.html

feipoa wrote:

You noted that the 3DNow! patch benefit was mostly for K6-II CPUs..

This patch was made by AMD in cooperation with 3dfx to show off the capabilities of 3DNow! when they introduced the K6-2. The source code was provided by id Software.
I have seen Quake 2 benchmarks with the Athlon CPU losing performance when 3DNow! is enabled (not sure what GPU was used).
http://web.archive.org/web/200102030623/http: … 3d/drivers.html

feipoa wrote:

I wonder what benefit Winchip CPUs would have if paired with a Voodoo2 in comparison to K6-II CPUs..

I would like to see this too, but I don't have this CPU nor the Voodoo 2.

Your GeForce2 MX400 result seems too low. I have seen this card hitting 94 fps at 1024x768x16 on a K6-III+ 600 with Detonator 44.03. I think 640x480x16 with Detonator 12.90 should definitely be faster.
With a GeForce3 Ti 200 I got 91.2 fps (K6-III+ 550; not much difference without 3DNow!).

feipoa wrote:

20% increase on a Voodoo-based card.

Depends on how you look at it. I reran some tests and made a comparison with the Pentium CPU (benched earlier).
The Pentium does not benefit from the newer miniGL drivers, and the miniGL drivers are a lot faster than the default OpenGL.
On the K6-III+, the default OpenGL driver is on par with the old miniGL drivers. The performance jump happens with miniGL 1.47.
The new miniGL drivers are definitely 3DNow! optimised, starting with version 1.47.

Voodoo3 3000 AGP (driver Amigamerlin 2.9) ; directx8.2 ; Aureal Vortex2 (driver 4.06.2050.18)
***********************************************************
Pentium-S @ 200 (2.0x100 MHz)
640x480

24.4 fps default OpenGL
29.0 fps WickedGL3.02-normal
29.9 fps WickedGL3.02-hires
34.5 fps miniGL 1.46
35.4 fps miniGL 1.49
35.5 fps miniGL 1.47
35.7 fps 3Dfx OpenGL (Quake 2 default 3dfx minigl driver)
***********************************************************
K6-III+@550 (5.5x100 MHz)
320x240

67.5 fps default OpenGL
80.2 fps 3DNow OpenGL

68.1 fps 3DFX OpenGL (Quake 2 default 3dfx minigl driver)

68.1 fps miniGL 1.46
75.0 fps 3DNow + miniGL 1.46

82.2 fps miniGL 1.49
100.8 fps 3DNow + miniGL 1.49
***********************************************************
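The idea that the newer miniGL drivers carry 3DNow! optimisations can be checked against the two result blocks above by comparing miniGL 1.46 with 1.49 on each CPU. A rough sketch using only the posted numbers; the helper name `driver_gain` is made up for this example, and note that the Pentium was run at 640x480 while the K6-III+ was run at 320x240.

```python
def driver_gain(old_fps: float, new_fps: float) -> float:
    """Percentage change when going from the old driver to the new one."""
    return (new_fps - old_fps) / old_fps * 100.0


# miniGL 1.46 -> miniGL 1.49, fps taken from the tables above
pentium = driver_gain(34.5, 35.4)       # Pentium-S @ 200, no 3DNow! unit
k6_plain = driver_gain(68.1, 82.2)      # K6-III+ @ 550, game patch off
k6_patched = driver_gain(75.0, 100.8)   # K6-III+ @ 550, game patch on

print(f"Pentium-S 200:           {pentium:+.1f}%")
print(f"K6-III+ 550:             {k6_plain:+.1f}%")
print(f"K6-III+ 550 with patch:  {k6_patched:+.1f}%")
```

The Pentium gains only a few percent from the newer driver while the K6-III+ gains far more, which fits the idea that the difference comes from 3DNow! code paths in the driver rather than from general tuning.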

Reply 88 of 145, by matze79

Rank l33t

Today I got this:

[Attachment: powerleap.jpg, 173.1 KiB]

I can try a few benchmarks with a K6-2 or K6-III on a Socket 5 board soon, if the BIOS doesn't lock up 😉
ftp://cyberia.dnsalias.com/pub/filebase/hw/si … L-K6-III-98.pdf

The Socket 7 platform was impressive, with a really long lifetime.

https://www.retrokits.de - blog, retro projects, hdd clicker, diy soundcards etc
https://www.retroianer.de - german retro computer board

Reply 90 of 145, by gdjacobs

Rank l33t++

Newest official version I've found:
http://www.elhvb.com/mobokive/archive/Epox/bios/vp3c0c21.exe

This is also the latest version listed on the unofficial K6-2+/K6-III+ BIOS page:
http://web.inter.nl.net/hcc/J.Steunebrink/k6plus.htm

All hail the Great Capacitor Brand Finder

Reply 91 of 145, by g_o_n_z_o

Rank Newbie

Thank you. Unfortunately, both BIOS versions you posted are the same, and both are from 12/21/2000. This is also the BIOS version installed on my board. I have problems installing an AGP VGA card under Win98SE/ME/2000. I tried different cards (Nvidia and ATI) and spent a lot of time enabling/disabling the relevant BIOS options, but without success. The problem is: after a fresh Windows installation, the card is always recognized as "standard VGA". The northbridge driver installs without problems. After installing the VGA driver and restarting the system, every one of these Windows versions crashes with a BSOD 🙁 When starting in safe mode, the card is always marked as "a device caused problems and was therefore disabled by Windows". So it's maybe a mainboard bug (northbridge), or (I think, and hope) just a BIOS stability problem. I have heard about other 2 MB L2 boards with stability problems, too; they were solved by later BIOSes. So I would still like to test the 2A5LEPA2 from 2001-08-07. Thanks again!

Reply 92 of 145, by Imperious

Rank Oldbie

I've found it here http://biosagentplus.com/bios/2A5LEPA2?ref=82 … 8us3g1ee45883k0

If you register and download it, please upload it here as well. Thanks 😀

Atari 2600, TI994a, Vic20, c64, ZX Spectrum 128, Amstrad CPC464, Atari 65XE, Commodore Plus/4, Amiga 500
PC's from XT 8088, 486, Pentium MMX, K6, Athlon, P3, P4, 775, to current Ryzen 5600x.

Reply 93 of 145, by kool kitty89

Rank Member

Somehow I'd failed to notice the odd performance distribution in the Quake 1 tests before. The oddly high performance of the Pentium II OverDrives was noted early on after the results were posted, but there are other oddities all over the range too.

I know Quake is hand-optimized for the P5 architecture's quirks, and the P55C obviously does the best clock for clock of any of the CPUs tested, so that's not so surprising. What's weird is how well the WinChip (even the C6), K6, and 6x86 (and MII) compare to the Pentium Pro (even the 1 MB cache version) and Pentium II, and how poorly the Athlon does. The K6 family and P6 (PPro, II, III, Celeron) also seem very close in performance to each other at similar clock and bus speeds, leaving only the P5 family with a serious clock-for-clock disparity on Intel's end.

Given that the P5, P6, and K7 all have dual-issue pipelined FPUs (and only the P5 requires specific optimization for in-order execution), I'd think there'd be more consistency in performance there, assuming the typical FPU performance bottleneck in Quake. Even with some performance oddities between the P5 and P6 FPUs (like the theoretical multiplication throughput of the P5 being double that of the P6), it wouldn't explain the K7's odd performance, all the more so since the 640x480 resolution should be pretty heavy on FXCH and FDIV operations due to the way Quake texture maps, so the division performance lead the P6 and K7 have should show up more prominently. (Granted, that might help the 6x86 a bit too with its decent FDIV performance.) That, or I've been misled as far as what Quake's most finicky requirements are.

The Winchip 2 and C6 performing so well is particularly odd, often matching or slightly beating similarly clocked P6 chips. I could maybe understand that 1 MB cache SS7 results might skew some things, but that wouldn't account for the 1 MB cache Xeon and PPro performance. Plus, that advantage should disappear for Socket 5/7 CPUs tested on 512kB boards ... unless perhaps it's some odd affinity for Quake and the direct-mapped caching scheme used for board-level caches over the set-associative caches for the Socket 8, Slot 1, and SlotA examples. (except that still wouldn't explain the PII OD's high performance, unless they used a different caching scheme from other PIIs)
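For readers unfamiliar with the two caching schemes, here is a toy simulation of how a direct-mapped cache and a set-associative cache of the same size can behave differently on the same access pattern. It is only a sketch: the 32-byte line size, the LRU replacement policy, and the access pattern are assumptions chosen for illustration, and it says nothing about which scheme Quake actually prefers.

```python
from collections import OrderedDict


def count_misses(addresses, cache_bytes, line_bytes, ways):
    """Count misses for an address trace in a set-associative LRU cache.

    ways=1 gives a direct-mapped cache.
    """
    num_sets = cache_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        index = line % num_sets
        tag = line // num_sets
        entries = sets[index]
        if tag in entries:
            entries.move_to_end(tag)         # mark as most recently used
        else:
            misses += 1
            if len(entries) >= ways:
                entries.popitem(last=False)  # evict least recently used
            entries[tag] = True
    return misses


CACHE_BYTES = 512 * 1024   # 512 KB, as on the boards mentioned above
LINE_BYTES = 32            # assumed line size

# Two addresses exactly one cache size apart, touched alternately 100 times.
trace = [0, CACHE_BYTES] * 100

direct_mapped = count_misses(trace, CACHE_BYTES, LINE_BYTES, ways=1)
two_way = count_misses(trace, CACHE_BYTES, LINE_BYTES, ways=2)
print(direct_mapped, two_way)
```

With this deliberately hostile pattern the direct-mapped cache misses on all 200 accesses, while the two-way cache misses only on the first two. Real code rarely looks like this, so the sketch only shows that the mapping scheme can matter, not how much it matters for Quake.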

The affinity for 75 MHz (and some 83 MHz) bus examples seems pretty reasonable at least with the slightly overclocked PCI/AGP bus. (presumably the 83 MHz tests that don't show such disparity were using 2.5x rather than 2x PCI dividers) Though if it is the board-level cache making a significant difference in a lot of these figures, I'd think the 100 MHz bus would be favored more consistently than it is.
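The divider arithmetic behind that paragraph is simple enough to write down. A short sketch, assuming the PCI clock is just the front-side bus clock divided by the board's divider and taking 33.33 MHz as the nominal PCI clock; the function names are made up for this example.

```python
PCI_NOMINAL_MHZ = 100.0 / 3.0  # 33.33 MHz, the nominal PCI clock


def pci_clock_mhz(fsb_mhz: float, divider: float) -> float:
    """PCI clock resulting from a given bus speed and PCI divider."""
    return fsb_mhz / divider


def overclock_percent(pci_mhz: float) -> float:
    """How far a PCI clock sits above (or below) the nominal 33.33 MHz."""
    return (pci_mhz / PCI_NOMINAL_MHZ - 1.0) * 100.0


# Bus speed / divider combinations mentioned in the paragraph above
for fsb, divider in [(75.0, 2.0), (83.0, 2.0), (83.0, 2.5)]:
    pci = pci_clock_mhz(fsb, divider)
    print(f"{fsb:5.1f} MHz bus / {divider}: PCI at {pci:.1f} MHz "
          f"({overclock_percent(pci):+.1f}% vs. nominal)")
```

A 75 MHz bus with a 2x divider runs PCI at 37.5 MHz, while an 83 MHz bus with a 2.5x divider lands back at roughly the nominal clock, which is the distinction the paragraph relies on.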

Perhaps including a Duron in testing might have shed at least a little more light on things, at least as far as the cache-specific performance issues go. (would give a decent contrast to the Samuel II and Ezra with their similar cache configurations; obviously useful for comparing far more than just Quake)

Quake II very consistently favors all the pipelined-FPU CPUs and favors the more advanced P6 and K7 types over the P5, to the point that it seems to run noticeably faster than Quake 1 on most of them, while several of the other CPUs fall behind in Quake II compared to Quake 1. (And SIMD extensions aren't taken advantage of in either of them.)

Aside from the Athlon, the WinChip performance is probably the most surprising thing I noticed when sifting back through these tests; it does oddly well in Quake 1 compared to most other computationally intensive applications (FPU- or ALU-wise).

It does make me wonder how much better the VIA C3 family would've performed as Socket 7 parts with the benefit of said caches. (Given that the FPU should be pretty close to the WinChip 2's but clocked at half the ALU speed, and indeed shows just slightly better than 1:1 performance at 2x the clock speed in other FPU benchmarks, it seems like the cache is the weak link setting those C3 chips well below the 1/2 or even 1/3 clock-for-clock performance mark compared to the C6 and WinChip 2. A 500 MHz SS7 Samuel should comprehensively outperform a 250 MHz WinChip 2 by that logic, which the S370 Samuel certainly does not, even at 600 MHz.)

Reply 94 of 145, by feipoa

Rank l33t++

It was suggested that enabling fastvid would help the P6 CPUs in comparison to the P5 chips. fastvid was not enabled. Is fastvid only useful for DOS? If so, then having it enabled or not would not alter the GLQuake P5/P6 comparative nature.


Reply 95 of 145, by BSA Starfire

Rank Oldbie

I tested the IDT WinChip C6 200 MHz with Phil's VGA benchmarking suite and a variety of ISA and PCI video cards on a DFI 430TX motherboard; it was slower than both the Cyrix 6x86MX and AMD K5 PR166 CPUs on the same motherboard. By far the best results I got were with an ARK Logic PV2000 PCI, easily the best-performing DOS card I have come across, comfortably outperforming S3, Matrox and Tseng Labs.
I have also run tests with a VIA C3 "Ezra" 866 MHz; according to the Google spreadsheet data it falls between an AMD k5/2 500 and K6/3. That was with an AGP Matrox G550, a pretty swift DOS card.
One thing I will say for the WinChip C6 is that it's a very reliable and compatible chip, totally trouble-free in fact; it runs anything. The VIA C3, by contrast, I have found to be a finicky and often incompatible CPU. Both do run very cool as far as temps are concerned, though. I also have a VIA C7 desktop system; this one is a lot better as far as speed and compatibility go, but it still has stability issues with various software.

286 20MHz,1MB RAM,Trident 8900B 1MB, Conner CFA-170A.SB 1350B
386SX 33MHz,ULSI 387,4MB Ram,OAK OTI077 1MB. Seagate ST1144A, MS WSS audio
Amstrad PC 9486i, DX/2 66, 16 MB RAM, Cirrus SVGA,Win 95,SB 16
Cyrix MII 333,128MB,SiS 6326 H0 rev,ESS 1869,Win ME

Reply 96 of 145, by idspispopd

Rank Oldbie
feipoa wrote:

It was suggested that enabling fastvid would help the P6 CPUs in comparison to the P5 chips. fastvid was not enabled. Is fastvid only useful for DOS? If so, then having it enabled or not would not alter the GLQuake P5/P6 comparative nature.

Fastvid basically enables write-combining for the VESA 2.0 linear frame buffer. This means that copying the screen buffer from main memory to video memory is done faster.
For Direct3D/OpenGL this shouldn't matter. For software rendering in Windows (for example Quake II software renderer) this matters, but Windows (or the video driver) should enable write combining so you shouldn't need additional tools.
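To give a feel for why write-combining speeds up the frame copy, here is a deliberately simplified model that only counts bus write transactions. The 32-byte combining buffer, the 32-bit stores, and the 8-bit 640x480 frame are assumptions for illustration; real hardware has several buffers and more complicated flush rules.

```python
def count_bus_writes(store_addresses, wc_line_bytes=None):
    """Count bus write transactions for a sequence of CPU stores.

    wc_line_bytes=None models uncombined writes: every store is its own
    transaction. Otherwise consecutive stores that fall into the same
    aligned block of wc_line_bytes are merged into a single burst.
    """
    if wc_line_bytes is None:
        return sum(1 for _ in store_addresses)
    writes = 0
    open_line = None
    for addr in store_addresses:
        line = addr // wc_line_bytes
        if line != open_line:
            writes += 1          # a new buffer is opened and later flushed once
            open_line = line
    return writes


FRAME_BYTES = 640 * 480          # one 8-bit 640x480 frame
STORE_BYTES = 4                  # assumed: frame copied with 32-bit stores
stores = range(0, FRAME_BYTES, STORE_BYTES)

uncombined = count_bus_writes(stores)
combined = count_bus_writes(stores, wc_line_bytes=32)  # assumed buffer size
print(uncombined, combined)
```

Under these assumptions the copy drops from 76800 individual writes to 9600 bursts per frame. The model ignores timing entirely, so it shows the mechanism rather than predicting an actual fps gain.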

Reply 97 of 145, by feipoa

Rank l33t++

idspispopd, thank you for the clarification. It sounds like fastvid increases the DOS game/benchmark speeds, including DOS Quake (software), for some P6-class CPUs.


Reply 98 of 145, by kool kitty89

Rank Member
idspispopd wrote:
feipoa wrote:

It was suggested that enabling fastvid would help the P6 CPUs in comparison to the P5 chips. fastvid was not enabled. Is fastvid only useful for DOS? If so, then having it enabled or not would not alter the GLQuake P5/P6 comparative nature.

Fastvid basically enables write-combining for the VESA 2.0 linear frame buffer. This means that copying the screen buffer from main memory to video memory is done faster.
For Direct3D/OpenGL this shouldn't matter. For software rendering in Windows (for example Quake II software renderer) this matters, but Windows (or the video driver) should enable write combining so you shouldn't need additional tools.

feipoa wrote:

idspispopd, thank you for the clarification. It sounds like fastvid increases the DOS game/benchmark speeds, including DOS Quake (software), for some P6-class CPUs.

This still wouldn't explain the general range and magnitude of performance anomalies with Quake 1 at 640x480 software mode given Quake 2 software rendering shows no such oddities. (particularly with the Athlon, but also the general lack of P5 affinity)

The MMX and 3DNow! performance results for the Athlon also seem oddly low compared to the K6-2 in the Sandra tests, but not in the Passmark tests. (This could just be an oddity of Sandra's testing scheme; I know it's got other quirks, like memory bandwidth figures favoring P5-architecture chips disproportionately where some other benchmarks don't.) Sandra's Whetstone figures are also a lot more favorable for the K6-2/3 than Passmark's floating-point tests, while Passmark is more favorable for the K6 family in its memory bandwidth tests than Sandra.

Running the Quake test with board-level cache disabled could shed some light on whether that direct mapped Socket 7 cache arrangement had a big impact on things. (especially if Quake optimized its lookup tables with that sort of cache in mind) The K6-2+ and III/III+'s results without the board-level cache would be particularly interesting there given its caching arrangement.

Reply 99 of 145, by Imperious

Rank Oldbie

I have PM'd Gonzo about this, but I will also post the BIOS files for the Epox EP-MVP3G5, EP-MVP3G2 and EP-MVP3G-M from 2001 here, the latest that were ever released.
I also modified it for updated CPU microcodes and to select Y upon exit by default. The modified version has tested perfectly on my board.

It would be good if these could make it to the Vogons drivers section, as they were nearly impossible to find; it took a few hours of searching.

Attachments

  • vp3c1806.zip (122.13 KiB)
  • VP3C1806mod.zip (123.15 KiB)
