K6-3+ 550 vs early Athlon/Duron


Reply 80 of 100, by petro89

I don't understand how so many people simply write off the K6-3 and/or K6-3+ by default. Coming from someone who has had ALL of the following CPUs in varying but similar configurations:

K6-2 400
K6-2 450
K6-2 500
K6-2 550
K6-3 400
K6-3 450
K6-3+ 450 (also at 400, 500, 550, 600)
K6-2+ 450 (also at 500, 560, 600)
P3 500
P3 600
P3 700
Athlon 600
Athlon 650
Athlon 750
Athlon 800
Duron 600

I can tell you with no bias and no doubt, the K6-3/K6-3+ is AS GOOD OR BETTER CLOCK FOR CLOCK AT INTEGER OPERATIONS THAN..... YOU NAME IT (P3, Athlon, Celeron, Duron, etc.). Basic Windows tasks, Excel, etc.: it flies. If you haven't had one you shouldn't say "it can't be" or "that's not possible".

FPU is another story, as has been mentioned several times. Take off 100-150 MHz and you have the equivalent Intel CPU (P3, P2, Celery, whatever floats your boat). It's not that hard. However, a Celeron 400 four times faster than a K6-3 450? You're out of your mind. There is something wrong with that benchmark. 4% faster in games? Maybe. Not 400%. That is common sense.

But by the same token, comparing the integer performance of a K6-3+ to an Athlon XP???! Come back to earth. Let's get real here, on both sides.

*Ryzen 9 3900xt, 5700xt, Win10
*Ryzen 7 2700x, Gtx1080, Win10
*FX 9590, Vega64, Win10
*Phenom IIx6 1100T, R9 380, Win7
*QX9770, r9 270x, Win7
*FX60, hd5850, Win7
*XP2400+, ti4600, Win2k
*PPro 200 1mb, banshee, w98
*AMD 5x86, CL , DOS

Reply 82 of 100, by petro89

download/file.php?id=11657


Reply 83 of 100, by havli

Well, synthetic INT/FPU performance is not the only thing affecting actual application performance. Memory bandwidth is also important... and all Socket 7 chipsets are terrible compared to the i440BX or Athlon chipsets. In my tests the K6-III+ @ 577 is on average 2% slower than a Celeron 433 and 27% slower than a PIII 500 (Katmai).

http://www.cnews.cz/testy/test-historickych-p … 999/strana/0/15

HW museum.cz - my collection of PC hardware

Reply 85 of 100, by swaaye

havli wrote:

Well, synthetic INT/FPU performance is not the only thing affecting actual application performance. Memory bandwidth is also important... and all Socket 7 chipsets are terrible compared to the i440BX or Athlon chipsets. In my tests the K6-III+ @ 577 is on average 2% slower than a Celeron 433 and 27% slower than a PIII 500 (Katmai).

Yeah, this is something that is usually overlooked. It's beneficial to have a Super 7 board that can handle a 133 MHz FSB and improve the memory performance situation somewhat. I don't know offhand which boards will truly work at that speed though.

Reply 86 of 100, by kool kitty89

swaaye wrote:
havli wrote:

Well, synthetic INT/FPU performance is not the only thing affecting actual application performance. Memory bandwidth is also important... and all Socket 7 chipsets are terrible compared to the i440BX or Athlon chipsets. In my tests the K6-III+ @ 577 is on average 2% slower than a Celeron 433 and 27% slower than a PIII 500 (Katmai).

Yeah, this is something that is usually overlooked. It's beneficial to have a Super 7 board that can handle a 133 MHz FSB and improve the memory performance situation somewhat. I don't know offhand which boards will truly work at that speed though.

The only boards that featured that seem to be a few of the SiS-based ones, and they might even need the board-level cache disabled. A slow chipset in a slow configuration ... not really worthwhile.

You'd have better luck getting 124 MHz on an MVP3-based system, or 120 MHz (or 125 MHz, undocumented but supported by some motherboards) on an Aladdin V-based board. 112 and 110-115 MHz are a lot easier to get stable though. Gains on a K6-2 would also probably be bigger than on a 2+ or III, given the board-level cache is also being overclocked. (FIC's MVP3 boards tend to use 4 ns pipeline-burst cache too, so 124 MHz is running within spec, while most Aladdin V boards seem to use 5 ns SRAM; then again, the Aladdin V chipset itself might overclock more easily than the MVP3, and many Aladdin V boards offer more FSB steps than MVP3 boards: usually 100, 105, 110, 115, 120, and sometimes 125 MHz.)

I have a K6-2/240 that will run at 4x125 MHz on my Asus P5A-B, though it's not incredibly stable (120 MHz is a lot more solid). My K6-3+ doesn't like going over 110 MHz though.

Anything over 112 MHz is also probably better to avoid when using an AGP video card, somewhat like with 440BX overclocking except with a more finicky AGP interface to start with.
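
To put rough numbers on those bus settings: the core, PCI, and AGP clocks all scale with the FSB. Here's a quick sketch of the arithmetic, assuming a 4x multiplier and the common SS7 dividers (PCI = FSB/3, AGP = 2/3 FSB); actual dividers vary by board and BIOS setting:

```python
# Derived clocks at the FSB settings discussed above, assuming a 4x
# multiplier (e.g. a K6-2 run at 4x) and the common /3 PCI and 2/3 AGP
# dividers -- arithmetic only, not a stability guarantee.
for fsb in (100, 105, 110, 112, 115, 120, 124, 125):
    core = 4 * fsb
    pci = fsb / 3
    agp = fsb * 2 / 3
    print(f"FSB {fsb:>3} MHz: core {core} MHz, PCI {pci:.1f} MHz, AGP {agp:.1f} MHz")
```

At 112 MHz the AGP interface is already near 75 MHz, which is the numeric reason anything much past that tends to upset AGP cards.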

On that note, I remembered the more specific troubles I had with AGP on my P5A-B: aside from certain cards not POSTing at all (like the Matrox G450; TNT and GeForce cards are known for that too), there's a problem with plug-and-play recognition where an AGP card shows up as a PCI card to the OS and fails to install properly when you attempt to use the official driver installation program (as the card isn't recognized). This seems like it might be a problem with any AGP card the OS doesn't recognize with at least basic drivers included for it, but it may have been fixed with later AGP driver revisions for the chipset too. (I haven't messed around with it too much.)

The MVP3 chipset isn't as finicky there, and it's not too hard to find later driver revisions for it either. (Early-revision motherboards might be a problem though, at least without a BIOS update.)

havli wrote:

Well, synthetic INT/FPU performance is not the only thing affecting actual application performance. Memory bandwidth is also important... and all Socket 7 chipsets are terrible compared to the i440BX or Athlon chipsets. In my tests the K6-III+ @ 577 is on average 2% slower than a Celeron 433 and 27% slower than a PIII 500 (Katmai).

http://www.cnews.cz/testy/test-historickych-p … 999/strana/0/15

Oddly enough, a few memory benchmarks seem to favor Socket 7 more than others (in the 686 benchmark tests, Cachechek v7.0 had much more favorable results than Speedsys 4.78 or Sandra99). It's also a bit surprising that the Aladdin V fared so much better (in most categories) in memory bandwidth than the MVP3 did in your tests listed a couple pages back (while the 3DMark results showed the MVP3 still having an overall edge in end performance).

For general-purpose use, the 1 MB board-level cache can still be a big factor for FIC's MVP3 boards (or the less common 2 MB cache, for that matter), and it also made the difference between fast and slow DRAM less dramatic (in the asynchronous DRAM mode, using the 1.5x AGP clock divider, while running the cache at full bus speed). The MVP3 also supported EDO DRAM, and the gains from switching to SDRAM might not have been all that dramatic. I'd like to get my hands on a late-revision FIC 503+ to test that, especially since I've got 128 MB of 50 ns EDO DRAM on hand (which should be capable of running at 100 MHz using the MVP3's normal 5-2-2-2 EDO timing). That RAM may have originally been used in a 503+ as such; I'm not sure, since we don't have one kicking around at home anymore. I think we replaced it with a 503A ... or traded it in or something, not 100% sure, but I'm positive we had a SS7 board with SIMM slots in it, and the 503+ was the obvious candidate by far. It even had some funky SIMM-saver modules in it originally, presumably running at 66 MHz.
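
For a ballpark of what that 5-2-2-2 EDO timing is worth: a 4-beat burst on the 64-bit Socket 7 bus moves 32 bytes in 11 bus cycles. A back-of-the-envelope sketch (peak figure only; refresh, page misses, and chipset overhead pull real throughput well below this):

```python
# Peak burst bandwidth of the MVP3's 5-2-2-2 EDO timing at 100 MHz:
# 4 transfers x 8 bytes (64-bit bus) over 5+2+2+2 = 11 bus cycles.
fsb_hz = 100e6
cycles = 5 + 2 + 2 + 2
bytes_per_burst = 4 * 8
bandwidth = bytes_per_burst * fsb_hz / cycles  # bytes per second
print(f"~{bandwidth / 1e6:.0f} MB/s peak")     # ~291 MB/s
```

That's at least in the neighborhood of the ~300 MB/s SDRAM read figures quoted later in this thread, which is why the EDO-to-SDRAM gain might be modest.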

Reply 87 of 100, by kool kitty89

F2bnp wrote:

No way. Don't trust synthetic benchmarks or speedsys.

Just run GlQuake at 640x480 and watch the difference in performance. Or better yet, try some other, more relevant benchmarks/games!

While GLQuake consistently favors the P6 architecture, Quake's software renderer oddly seems to favor P5 chips, and it also levels the playing field between all the non-Intel S5/S7/SS7 chips and the P6. (Clock for clock, nothing beats a Pentium MMX for Quake 1, but the PPro, PII, Celeron, and PIII often end up close to or behind the K6 family, and even Cyrix chips or the WinChip 2 at times.) The Athlon did oddly poorly in the 686 benchmark tests for Quake, but that was the odd exception. (The Athlon 500 and 600 came in far below the PIII 500 and 600 or the K6-2 ... and equal to a Pentium MMX overclocked to 2.5x100 MHz.)

For Quake 2 in OpenGL, though, it's important to note the driver revision, as even with the 3DNow! patch there are drastic changes in performance between revisions. (Though I think the Voodoo2 shows the most improvement in all cases; I'm not sure if any others come close.)

Falloutboy did some tests on this:
The Ultimate 686 Benchmark Comparison

The OpenGL driver + 3DNow! actually performs better than the MiniGL driver with 3DNow! for earlier MiniGL revisions, but the best performance by far comes from MiniGL version 1.49 (which also outperforms OpenGL and earlier/default MiniGL versions even with 1.49 not using 3DNow!). It'd be nice to see how the performance scales with other CPUs, and whether those later MiniGL revisions actually helped boost the K6 (or K6-2) specifically, perhaps by rewriting the FPU code to favor the K6's specific execution advantages (which are weighted differently than the P5 or P6 FPUs). The Pentium tested there actually loses a slight amount of performance using MiniGL 1.49, so that implies a possible K6-specific affinity.

Looking at the performance figures, Phil's Voodoo 2 tests seem like they're using the default OpenGL or MiniGL drivers (or MiniGL 1.46). Without adding 3DNow! into the mix, there's still an over-20% framerate gain for a K6-3+ (and presumably the K6-2 and 2+, more or less) when running MiniGL 1.49 vs 1.46 or the default. (The default MiniGL and 1.46 are only very slightly faster than full OpenGL too.) Falloutboy was using a Voodoo 3 3000, not a Voodoo2, but the performance should approximate a pair of Voodoo 2s in SLI.

Also: The Ultimate 686 Benchmark Comparison
Anandtech's own tests with a K6-2/300 managed over 82 FPS in a Voodoo2 SLI arrangement. (That approaches the performance Phil's PII/400 managed, so it's in the ballpark of a PII/Celeron at 333-366 MHz, and nearly 2x the speed Phil managed with a K6-2/300.)

In theory, DirectX 6.x based games would benefit most frequently from 3DNow!. (DirectX 7 technically should as well, but there's some evidence that at least some Microsoft-provided drivers omit 3DNow! support in DX7, or at least disable it by default.) I'm not sure about OpenGL, and whether it requires game-specific patches or could take advantage of a universal GPU driver or patch for all OpenGL games. (Or whether OpenGL of that vintage even functions in a manner conducive to a standard geometry-transformation driver embedded in the API; if each game needs to include its own transform/lighting code that feeds vertex data to the API, then this obviously wouldn't work.)

I don't think DirectX 7 had issues with SSE getting disabled, perhaps because the PIII/P4 benefited from it more consistently than the Athlon and Duron did from 3DNow! ... particularly given there are some reports of performance drops when 3DNow! is enabled on a K7 chip. (I'd assume mostly with poorly implemented examples, the sort that add little benefit to even the K6-2.)

swaaye wrote:

I await the final discovery of how the K6-3 is a match for the Duron and P3. And yeah, not in Sandra, Speedsys or Aida/Everest. I figure a -3 can keep up with 2D games and productivity software.

Sandra and Speedsys seem like they might exaggerate the memory bandwidth disparity on Socket 7, or at least not show the complete story ... either that, or the K6 and Cyrix 6x86 families really do have crappy I/O buses compared to the P55C. (The Pentium MMX manages considerably higher marks at similar clock speeds and bus speeds on the same motherboards, though only for DRAM, not generally for the board-level cache ... which is odd given both should be similarly I/O bound.) That's going both by the 686 benchmark thread results and my own tests using my P5A-B, as well as some Celeron and PIII tests using my Apollo Pro 133 Soyo board a couple years back (which smoked the SS7 results even when clocked back to a 66 MHz FSB; I tested a Celeron 1000 @ 666.66 MHz, a Tualatin-S 1400 @ 700 MHz, and a PIII 667 @ 333.33 MHz). But I only compared Sandra99 results in my own tests.

It's entirely possible that a lot of the synthetic benchmarks actually sell these machines short, depending on what sort of tasks and working environment you actually use in the real world.

SPBHM wrote:

the problem I see with the K6-3 is that it's too inconsistent. It seems good in some tests, but when things go wrong, they seem to go really wrong, while the K7 is always good.

Except for Quake 1 in software mode ... (unless there are some other results that contradict this).

Cyrix's Socket 7 CPUs were even more dramatically polarized in performance like that: the PR ratings were valid (or even conservative, prior to the 300MX at least) in some cases, and vastly off base in others. Given that Winstone used more real-world code and situations/context in its business benchmarks than most (or any) others, and also had the best reputation in the industry at the time, it's fair to say it at least gives a decent picture of general-purpose office/home/business desktop use for CPUs of the era (and tests the general 'feel' of speed/performance visible to an average user).

matze79 wrote:

You need a 100 MHz FSB and a decent S7 board (VIA MVP3); I benchmarked this out in the past.
There is really a good chance your K6-2+ will beat an Athlon clock for clock in a Doom benchmark.
But never in Quake.

Again, this doesn't seem to be the case for Quake in software mode, at least not at 640x480 and comparing a Slot A Athlon 500 or 600. (I don't have Duron or Thunderbird figures, unfortunately.)

Reply 88 of 100, by kool kitty89

bhtooefr wrote:

I'm wondering how you'd do a fair office-productivity responsiveness benchmark, to test Red Hill's claims that the K6-3+ at 5x112 was somewhere between the performance of an Athlon Thunderbird 1000 and an Athlon XP Palomino 1700+. (Then again, there's a lot of crap on Red Hill's site, like the bit about the 386 being designed before the 286, and them trying to represent a 286 at 25 MHz as being as fast as a 386DX at 40 MHz.)

The 286-20 and -25 examples were more off-the-cuff comments (the 25 in particular was in the context of customers being shown the 286-25 machine and being asked to guess what it was running); the Red Hill guys never implied the performance was remotely close to a 386DX-40 (or -33, for that matter). They did mention they suspected (but never tested) that a good 286 system was faster than a 386SX of the same clock speed (or possibly even a step up, like a 286-20 vs a 386SX-25), which has some real-world merit, as quite a few instructions execute in fewer cycles on a 286 than they do on a 386. (This is also in the context of cacheless systems, and in that sense a 386DX-40 might not seem all that much faster than a 286-25 ... or a 386SX-25, for that matter. DX or even SX systems with board-level cache would see a lot more gain at higher clock speeds than their cacheless counterparts; early 3D games would probably be the most noticeable winners for the cacheless DX/40, particularly those relying on hardware multiply and divide more than LUT optimizations.)

The '386 was designed before the 286' remark is almost certainly a mix-up with the iAPX 432, which isn't related to the 386 but definitely fits the 'too big to make in a single chip' comments made in the Red Hill article.

Once you get to 486 (and 486DLC) era stuff, their performance commentary seems to be based on more detailed experience and actual benchmarks rather than pure anecdotes (Winstone 95 and 97 in particular once you get to Socket 5/7 processors; both of those seem to be unavailable now, given several fruitless searches for those benchmark programs by various performance comparisons here on VOGONS, while Ziff-Davis's '98 and '99 stuff seems to have survived better). Winstone 95 and 97 both relied fairly notably on popular DOS programs running under Windows shells (more so the former) and reflected Virtual 8086 mode much more heavily than later (and even most contemporary) benchmark suites, business or otherwise. (That was important for some of the PR ratings that didn't age well once code routinely got more Pentium-specific in optimization and also tended more towards 32-bit software in general, though there are some exceptions, including some 32-bit applications that favor non-Intel processors over P6-architecture CPUs, though more often the K6 family than the 6x86MX, especially going by PR value rather than clock speed.)

Edit:
To Red Hill's credit (given they consistently list processors in order of general performance), the Athlon 500 and 600, the Pentium III 550 (Coppermine), and the Duron 600 were all listed after the K6-III+/450 (or 560, as they ran it), acknowledging that the overall performance of the newer CPUs was ahead there. (The Katmai PIII and the Celeron 566-766 were listed behind the K6-III+/450, though ahead of the K6-III/450, save the Katmai 500, which got stuck between the K6-III/400 and K6-2+/500.) It's odd they didn't note the overclockability of the K6-2+ compared to the III+; granted, they also omitted stuff like how easily the Am5x86 runs at 4x40 MHz, and while they noted running the K6-300 at 3x100 instead of the rated 66 MHz bus, they made no mention of attempting that with Cyrix chips released around the same time, like running the 2.5x83 MHz PR266 or the 3x66 PR233 at 2x100, or the later PR333 (which they didn't like at 3x83) at 2.5x100. (And they mentioned the overclockability of the Mendocino Celeron, or at least the 300A, but not the Coppermine Celeron; albeit if they had, they'd probably have mentioned the Duron was still a better value ... which would be arguable in the case of pushing a Celeron 566 to 850 MHz or a 600 to 900 MHz.)

Though on the note of DOS performance and 3D math, 3DBench seems to have an extreme affinity for the Cyrix 6x86 (classic and MX/MII) family. It looks like a 486 (or 386) optimized integer polygon rasterizer (and it does do well on the K6, i486DX2, WinChip, and 5x86s tested), but it has a particular affinity for the 6x86 and MII, as well as a tendency to scale better on those than on any of the other CPUs tested. Given how well it does on the 6x86 classic (with its 16 kB unified L1 cache), it can't be a matter of fitting into the 64k cache of the MX/MII; all the P6 CPUs use 4-way associative L1 caches as well, and the PII onward has 16k data and code caches, so they should catch everything the 6x86 does and would show a similarly dramatic trend if caching key chunks of the program's code and data were the trick. So cache arrangement doesn't seem to be the main cause of that oddity. (The fact that it also scales really well points to it not being tied to the board-level cache either ... it must be something about the way the 6x86 executes real-mode 16-bit code, on top of any cache advantages. The K6 doesn't even come close clock for clock, and the 6x86 goes far beyond its posted PR ratings here too.)

It does make me wonder if there were some missed opportunities in 16-bit integer based geometry engines on the 6x86, but otherwise it's just a good example of an odd benchmark exception. (Though, oddly enough, it is one of the few benchmarks that matches Red Hill's observations of the 6x86MX PR200 (2.5x66) exceeding the K6-233 and Pentium MMX 233 in performance, even if that was specifically in the context of an FIC-502 motherboard, which wasn't tested in the 686 trials; neither was the PA-2005 using the same VPX chipset.)

I'm thinking the first thing would be to run as similar chipsets as possible for every platform, with as similar GPUs as possible. So, if you're running an MVP3 on the K6-3+, run an Apollo Pro or Apollo Pro 133 on the Intel parts, and a KX133 or KT133 on the K7 parts. Or, if you're using the MVP4 on the K6-3+, use a PLE133 for Intel, and a KLE133 for K7s. (But, for what I'm suggesting here, an external GPU with DVI would be best - this is going to require specialized hardware, and DVI makes this easier, so SS7-era IGPs aren't wanted.) This way, the CPU is what's under test, not the chipset.

The MVP4 doesn't include onboard video, it just includes provisions for a Trident Blade3D GPU to be included on the motherboard. FIC's 503A used the MVP4 but omits the onboard GPU (but includes an AGP slot), though it does use VIA's chipset embedded sound (which is in the MVP4 chipset itself, not an add-on) and likewise includes an AMR socket. (for what that's worth)

Then, perform an automated script of multitasking office tasks (this includes both mouse and keyboard input), with a system analyzing the display to know what the display should show when the task is complete. (The blinking cursor will need to be disabled, and this hardware will need to ignore the clock's region on the display).

Effectively, you'll have measured the latency of the system in those office tasks, and can then actually measure "snappiness" of a machine.

I could've sworn I'd seen some sort of business benchmark do something very much like this ... I think on Computer Chronicles, either for UNIX or OS/2, with a benchmark program opening up a variety of applications and rapidly cycling through different operations in each intermittently/concurrently. Either that, or it looked visually similar to what I'm imagining a script like you're describing would do. (No idea if any Windows benchmarks did this.)
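
If anyone wanted to build what's described above, the core of it is a small measurement loop: inject scripted input, then poll the captured display until it matches a known-good end state, and record the elapsed time. A minimal sketch, with grab_region and send_input as purely hypothetical stand-ins for whatever DVI capture hardware and input injector actually gets used:

```python
import time
import hashlib

def grab_region(x, y, w, h) -> bytes:
    """Hypothetical: read the raw pixels of a display region from the DVI grabber."""
    ...

def send_input(events) -> None:
    """Hypothetical: replay a scripted sequence of mouse/keyboard events."""
    ...

def frame_hash(region):
    # Hash the watched region so 'task complete' is a simple comparison.
    # Blinking cursor disabled and the clock region excluded, per the post above.
    return hashlib.md5(grab_region(*region)).hexdigest()

def measure_task(events, watch_region, expected_hash, timeout=10.0):
    """Return seconds from input injection until the display shows the
    known end state, or None if the task never completes in time."""
    send_input(events)
    start = time.perf_counter()
    while time.perf_counter() - start < timeout:
        if frame_hash(watch_region) == expected_hash:
            return time.perf_counter() - start
    return None
```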

One possible note to consider with the 112 MHz bus setting is that it also overclocks the onboard IDE controller and might lead to some noticeable real-world increase in hard drive DMA bandwidth. (though given the IDE interface itself is rarely the main bottleneck in HDD performance -especially in the era in question- it's probably not a factor in Red Hill's observations)

The argument that the K6-3+ can flush its pipeline in response to user activity more quickly makes some sense, but then you've also got an argument that, at least in a single-tasking environment, long-pipeline chips that clock up high are sometimes said to feel snappier than their benchmarks claim. (And let me tell you, the Samuel 2 Eden at 533 MHz I've got doesn't feel like a Tillamook Pentium MMX at all in general GUI responsiveness, even though the actual performance is in that ballpark once you actually ask it to do something.) And, ultimately, unless there are cache misses going on in the Athlon that aren't happening in the K6-3+, an Athlon at about 933 MHz should be, worst case, as responsive as the K6-3+ at 560 (6-stage vs. 10-stage pipeline, assuming all stages take 1 cycle, which is the point of pipelining, though some early x86 CPUs had slower pipelines, which is more likely to be a problem with the K6 pipeline than the K7 pipeline...). Oh, and the Athlon had twice the K6-3+'s L1, and while its L2 was slower, it was also twice as big, and far faster than the K6-3+'s L3, and its DRAM was faster than the K6-3+'s, even overclocked as in Red Hill's example.
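
That 933 MHz figure is just pipeline depth scaled by clock: the worst-case flush/refill cost is stages divided by clock rate, assuming one cycle per stage as above. A quick check of the arithmetic:

```python
# Worst-case pipeline refill latency = stages / clock (one cycle per stage).
def refill_ns(stages, mhz):
    return stages / mhz * 1_000  # nanoseconds

print(f"K6-3+ @ 560 MHz (6 stages):   {refill_ns(6, 560):.1f} ns")   # ~10.7 ns
print(f"Athlon @ 933 MHz (10 stages): {refill_ns(10, 933):.1f} ns")  # ~10.7 ns
```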

Reply 89 of 100, by kanecvr

petro89 wrote:

I don't understand how so many people simply write off the K6-3 and/or K6-3+ by default

Me neither. In these parts the K6-2 and the Super 7 platform were much appreciated: lots of us had K6 machines and were playing everything PII/PIII owners were playing, usually with better video cards (because obviously the SS7 board + CPU cost a lot less than a Slot 1 / Socket 370 board + CPU, and we had quite a bit of scratch left over for the GPU).

Oftentimes my overclocked K6-2 400 + Voodoo 2 kit would play games that a 350 MHz PII would not, because the latter was lacking hardware support (a cheap/crappy video card with no OpenGL or Glide, often onboard video with no hardware acceleration support whatsoever). From a value/performance point of view they were great CPUs. My 400 MHz K6-2 would do 450 rock solid at stock voltage, and 550 MHz at 2.4V plus a Spire Socket 370 cooler with a 72mm fan modded with the Socket 7 clip. At that speed it would rival my best friend's PII 450 + V2 in Unreal and Quake 2; I remember poking fun at him at LAN parties whenever my K6 would show even 1 fps more than his PII 😁. Needless to say, his machine cost double what mine did.

What's weirder is that another friend had a Mendocino(?) Celeron (the black Socket 370 one), and at 433 MHz it was a little faster than the aforementioned PII and my hugely overclocked K6-2. Maybe the built-in 128 KB of cache running at CPU speed was responsible... His machine was a Socket 370 micro-ATX job with said CPU. No AGP, so he ran a PCI video card; he originally ran a V2, then bought a PCI V3.

Reply 90 of 100, by gdjacobs

Celeron chips of that era were often superior to equivalently clocked Pentium IIs due to their higher-clocked on-die cache (as opposed to the PII's lower-clocked on-cartridge cache).
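
The arithmetic behind that: the Mendocino Celeron's 128 KB L2 is on-die and runs at full core clock, while the Pentium II's 512 KB cartridge-mounted L2 runs at half the core clock, so a Celeron 433's cache actually out-clocks a PII 450's:

```python
# Effective L2 clocks: Mendocino Celeron (on-die, full speed, 128 KB)
# vs. Pentium II (on-cartridge, half speed, 512 KB).
celeron_433_l2 = 433       # MHz, full core clock
pii_450_l2 = 450 // 2      # 225 MHz, half core clock
print(f"Celeron 433 L2: {celeron_433_l2} MHz vs PII 450 L2: {pii_450_l2} MHz")
```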

All hail the Great Capacitor Brand Finder

Reply 91 of 100, by kool kitty89

On the bus speed and memory bandwidth topic: as far as CPU performance is concerned, for most (or all) Socket 7 processors as well as the PII/III/Celeron and Athlon, bus speed itself makes a bigger difference than RAM speed/bandwidth (and latency is more important than bandwidth/throughput). There are around a dozen articles from the early Athlon/Duron/Pentium 4 era on Anandtech that deal with that transition, and it was mainly the Pentium 4 that benefited from high-bandwidth RAM, while latency was more important on the P6 and K7 (and possibly even more so the K6), with DDR configurations (even on the same chipset in some cases) often doing little better, and sometimes slightly worse, than PC133 and sometimes even PC100 SDRAM.

http://www.anandtech.com/show/750 (that's just one of the threads in question, but it covers a lot of examples)

This would also give more credence to Red Hill's complaints about the Celeron's 66 MHz bus vs 100 MHz SS7: the slower bus hurts latency, not just peak bandwidth.

It might also mean that using CAS-2 timing rather than CAS-3 on SS7 boards would be significant even if synthetic memory benchmarks don't show it. (Sandra99 showed almost no difference when running my P5A-B at CAS-2. On that note, I tried both 7.5 ns PC133 and 8 ns PC100 SDRAM, and there seemed to be no problem at CAS-2 even though they were rated for CAS-3; the PC100 specification seems to be very conservative with its use of 8 ns DRAM, among other things ... Intel's documents claim that 125 MHz rated memory is needed to cope with a 100 MHz bus, but that seems like overkill, and more conservative than what most video card manufacturers were doing. On the plus side, it means typical PC100 SDRAM could cope with overclocking on SS7 boards without being one of the weak points, given you really aren't getting past 120-125 MHz in any reasonable context anyway.)
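
For a rough sense of why CAS 2 vs CAS 3 matters even when bandwidth numbers barely move: each extra CAS cycle adds a fixed chunk of first-word latency that scales with the bus clock. A sketch of just that term (real access time also includes RAS-to-CAS, precharge, and chipset overhead):

```python
# First-word latency contributed by CAS alone, in nanoseconds.
def cas_ns(cas_cycles, fsb_mhz):
    return cas_cycles / fsb_mhz * 1_000

for fsb in (100, 112, 124):
    print(f"{fsb} MHz FSB: CAS 2 = {cas_ns(2, fsb):.1f} ns, "
          f"CAS 3 = {cas_ns(3, fsb):.1f} ns")

# The '8 ns' PC100 chip rating is just a 125 MHz clock period (1/8 ns),
# which is the headroom Intel's spec is leaning on.
```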

On the other hand, AGP performance can be more sensitive to raw DRAM bandwidth (which makes sense given the streaming of large blocks of texture data in a serial manner), but this only really becomes a factor beyond AGP 2x, so it's not relevant to SS7 boards or early Slot/Socket A boards (all AGP 1x or 2x), and also not a factor unless you have games and GPUs that can actually tax the AGP bus.

And on the issue of 3DNow! performance: looking over period multimedia benchmarks, the K7 (even early models) most definitely benefited significantly from 3DNow!, and in some cases gained proportionally more than SSE-enabled P6 chips managed.
http://www.xbitlabs.com/articles/cpu/display/amd-athlon.html (that example seems about as dramatic as a K6-2 with and without 3DNow!, with a 2:1 difference)

And while there were a fair number of late-'90s games still using DirectX 5.x (or not supporting Direct3D at all), a ton used DirectX 6.x, all of which should benefit significantly from 3DNow! and SSE provided the proper Direct3D drivers are installed. http://www.mobygames.com/attribute/sheet/attr … ,275/p,3/so,0a/

I'd actually forgotten that Unreal ran under DirectX 6 (I have Unreal Gold, and that makes DirectX 7 stick in my head more). And while the Direct3D engine for Unreal is a bit buggy on some DirectX 6 compatible cards (like the Riva 128 and, to a much lesser extent, the Rage Pro: messy mipmaps but non-broken lighting), it works quite well on other cards from the time, including 3dfx's Direct3D drivers and the Rage 128. (PCI versions of the TNT, GeForce, and Matrox G400 should also be fine for SS7 motherboards; AGP ones might work on VIA boards but not ALi ones, and I think AGP versions of the S3 Savage 4/Pro work fine, as do the Rage 128/128 Pro, though installing drivers can be frustrating on some systems ... to be fair, I was too lazy to really try installing Aladdin V AGP driver updates beyond what MS included in Win 98SE, so my 128 Pro experience might be exaggerated.)

For a budget gaming-capable general-purpose desktop + multimedia setup with something like a Rage Pro running Unreal (in D3D or OpenGL), unless you ran at low resolution, I'd think the bottleneck would be the video card, not the CPU (though 3DNow! would come more into play for a WinChip 2 or a K6-2 in the 266-350 range ... even 350 might be overkill though).
The Riva 128 wouldn't cover MPEG-2 acceleration like the Rage did, which matters if you remotely cared about that (for movies, or for some DirectX multimedia games using MPEG-1/2 video). It would also be interesting to see how well Quake 2's OpenGL 3DNow! patch favors the Riva 128. (The Rage 128 Pro was nice, and had mature drivers faster than the prior 128, but in 1999 it was expensive.)

Off the top of my head, the Voodoo Banshee and Savage 4 would've been the better budget options in '99, with a Voodoo2 being appealing if you already had a good 2D card (Rage Pro + Voodoo 2 would be a fairly balanced setup). SLI usually wasn't practical in Baby AT motherboards either (though a few, like the GA-5AA and P5A-B, had multiple unobstructed PCI slots, if only 3 total). Lower-end Voodoo3 models might be appealing too, definitely more so than the Rage 128 Pro, TNT, or Matrox offerings, but the Voodoo3 didn't have 32-bit color support or stuff like TV-out if you wanted a home theater setup. (High-end Voodoo 3s had TV-out options, but ... the Rage Pro was the best low-cost multimedia solution I know of at the time, honestly; specifically in the XPert@Play form factor.)

On that note, if you liked the novelty of using your PC not just as a home theater system but also as a game console through TV-out (especially on a big TV with a good sound system), then the Rage Pro's 3D performance might actually stretch a bit further too, given you wouldn't gain any visual quality above 640x480 anyway (and 512x384 might look nearly as good depending on the scaling; I forget how well the Rage's video scaler worked ... or whether you could alter the screen/overscan size, for that matter. I know we used 800x600 a lot with our Radeon + TV home theater/gaming setup, downscaled to 480i obviously and adjusted to work within TV overscan limits, but it's been too long since I tried that with the Rage).

Factor 5's Star Wars games were all DirectX 5.x based IIRC, so I don't think they benefited from 3DNow!, but I do remember them running perfectly fine on both our Celeron 366 and our K6-2/300 (which became a K6-2/550 at some point I don't recall ... obviously no earlier than 2000, but it could have been a fair bit later than that, given Dad's tendency to buy remaindered warehouse stock on clearance).

That said, a LOT of games looked like they were running fairly smoothly and looking nice if you were used to 3D consoles at the time, especially comparing N64 games to their PC counterparts (especially given the framerate drop Factor 5's N64 games took at times when running at 640x480 with the RAM expansion option ... or at the faster low-res default of 320x240, though some of those upscaled to 640x480 and applied blur as primitive AA).

Oddly enough, that SS7 system got replaced with a Tualatin Celeron 1300 system later on, though it was supposed to be a Duron 1000. (The Duron had overheating issues in my case with the heatsink being used, so I ended up swapping with one of the other new builds we had going at the time.) Oh, and on that note, it was an Apollo Pro 133 Shuttle Spacewalker board in that setup and in my anecdote a few posts back on memory bandwidth. (I'd said Soyo, but got mixed up: the Soyo board was the 440ZX with the Celeron 366.)

Hmm:
http://vintage3d.org/rage3.php#sthash.qXlI1NZj.dpbs

Given the 320x240 Unreal performance there and the framerates Phil's Voodoo2 tests managed with a K6-2/300, I don't think the Rage Pro would benefit too much from 3DNow! in Unreal even at low res (where it would probably be a tad more fillrate-bound than that page even shows, given you'd really need to use the 32-bit color mode to avoid super grainy low-res dithering at 320x240 ... unless fillrate isn't the bottleneck at that res).

Reply 92 of 100, by kanecvr

OK, so I ran my own benchmarks using my K6-III rig vs Abit BE6-II based 500 MHz Pentium III and Pentium II setups: 500MHz P II v.s. 500MHz P III v.s. 550MHz K6-III+ benchmarks!

The K6 holds its own quite well.

Reply 93 of 100, by PhilsComputerLab

Another point that is worth considering is performance consistency. I find that on a Slot 1 system it doesn't matter too much which board you have, but with Super Socket 7 there are wide performance differences, especially with various graphics cards; the V3, for example, can perform slower on some boards. It's the kind of information you only learn when you play around with a wide range of systems, so that's worth considering too, I think.

YouTube, Facebook, Website

Reply 94 of 100, by kanecvr

I agree. The AOpen AX59 Pro isn't even my fastest board; the AT Lucky Tech P5MVP3-99X has better AGP and memory performance, reaching up to 300 MB/sec in memory read tests. The AOpen, in contrast, is very, VERY stable, which is the reason I used it in the tests.

The thing is, there are performance and compatibility differences between Slot 1 boards as well. The ABIT BE6-II is by far the best 440BX board I've ever had the pleasure to use. In contrast, OEM boards are quite a bit slower (a Compaq Deskpro 1000 or HP Vectra VL, for example). The Compaq machines are also very picky about AGP cards (I can't get V3 cards or anything faster than a GeForce 2 to work in them), whereas the ABIT will take anything I throw at it.

The point of the test is to show that the K6 is not really that much slower in games than a Slot 1 setup at similar speeds. K6 machines are also more flexible: they can clock lower and emulate slower CPUs while still being able to keep up with a Slot 1 build. It is now more than ever a viable platform for DOS / early Windows games, especially Glide stuff, also allowing you to play titles like Wing Commander by simply disabling the cache and lowering the FSB / multiplier from software (if you have a K6-2+ or K6-III+). K6 CPUs are also a lot easier to overclock: in my experience the average 450 MHz K6-2 will do 550 MHz stable without a voltage boost on a good board, while Slot 1 machines won't OC as easily, usually requiring really good motherboards for stable results when going from a 66 to a 100 MHz FSB. PIIs and PIIIs are also multiplier-locked, while K6s are not.


Reply 95 of 100, by Skyscraper

PhilsComputerLab wrote:

Another point that is worth considering is performance consistency. I find that on a Slot 1 system it doesn't matter too much which board you have, but with Super Socket 7 there are wide performance differences, especially with various graphics cards; the V3, for example, can perform slower on some boards. It's the kind of information you only learn when you play around with a wide range of systems, so that's worth considering too, I think.

That is true, but there are some really slow Slot 1 BX boards as well; my Gigabyte BX2000+ is awfully slow compared to my AOpen AX6BC.

The issue with some Aladdin V boards and some Voodoo 3 cards is of course worse, but on the other hand it's easily solved by using an Nvidia card.

New PC: i9 12900K @5GHz all cores @1.2v. MSI PRO Z690-A. 32GB DDR4 3600 CL14. 3070Ti.
Old PC: Dual Xeon X5690@4.6GHz, EVGA SR-2, 48GB DDR3R@2000MHz, Intel X25-M. GTX 980ti.
Older PC: K6-3+ 400@600MHz, PC-Chips M577, 256MB SDRAM, AWE64, Voodoo Banshee.

Reply 96 of 100, by kanecvr

Skyscraper wrote:

The issue with some Aladdin V boards and some Voodoo 3 cards is of course worse, but on the other hand it's easily solved by using an Nvidia card.

You can reliably use V3 cards with almost any VIA MVP3 board. Its AGP is miles better than the Aladdin V's, and the MVP3 overclocks better too. The only thing the ALi chipsets have going in their favor is memory performance. All my V3 cards work in all my MVP3 boards, regardless of manufacturer and form factor. Some of them do have issues with faster Nvidia cards though (GF3 and higher).

Reply 97 of 100, by PhilsComputerLab

I actually got that Gigabyte board, though I haven't pitted it against the AOpen. The SS7 boards I benchmarked, though: the results are all over the place. The IWill board is the fastest one with a V3, I think.


Reply 98 of 100, by swaaye

I used to have a FIC VA503 (A or +, I don't recall anymore) with the MVP3 chipset. The board didn't overclock as well as my ASUS P5A, nor was its AGP superior in compatibility. Anecdotes are fun. It also had that fun VIA PCI and USB trouble that incited the creation of George Breese's VIA PCI Latency patch.

Not that I would call the P5A a quality product either. Maybe the AGP is equally troublesome.... But my mantra is "just use a Voodoo" when it comes to non-Intel 90s AGP.

Reply 99 of 100, by kool kitty89

swaaye wrote:

I used to have a FIC VA503 (A or +, I don't recall anymore) with the MVP3 chipset. The board didn't overclock as well as my ASUS P5A, nor was its AGP superior in compatibility. Anecdotes are fun. It also had that fun VIA PCI and USB trouble that incited the creation of George Breese's VIA PCI Latency patch.

Not that I would call the P5A a quality product either. Maybe the AGP is equally troublesome.... But my mantra is "just use a Voodoo" when it comes to non-Intel 90s AGP.

The only AGP card I haven't had trouble installing/recognizing/setting up (i.e. not stuck in VGA mode) on my P5A-B in Win98SE seems to be a Rage Pro (though I don't have an AGP Voodoo 3; the PCI one is fine, but that's expected). Others either work only after tedious manual installation or not at all. (The Diamond Stealth III -Savage 4- seems to go OK after a manual driver install, but seemed buggy in Unreal; that might just be the card/driver in general though. Some of it almost seemed like the chip was overheating and glitching, but it only showed up in Unreal's menu ... or it's more like an analog problem honestly, like the DAC is operating wrong or sync is off: a weird, out-of-alignment, scanline-staticky, jittery mess.) The Rage 128 Pro, GeForce 256, Matrox G450, GeForce 2 MX, and maybe a couple others all get recognized as basic VGA adapters and not even as unknown AGP devices (as if the video card's BIOS fails to initialize the GPU, leaving only the legacy VGA core logic enabled).

Plug and play initialization/recognition is limited to the devices in Win98SE's default archive, and anything beyond that seems to require manual installation, but only when an actual VGA device is detected.

On the plus side, that Rage Pro actually manages to run Unreal Gold in Direct3D mode with the full feature set. (Alpha textures -mostly just the light maps- look bad, but the Rage Pro is known for that problem. Oddly, the software renderer DOES interpolate/filter the low-res light maps, rather like Quake's software renderer but smoother: high/true color vs 256 color ... the Rage would probably look great if there were a high-res light map option; I wonder if I could replace the lightmap texture files with ones from one of the high-res Unreal texture mods.)

Mostly for fun/curiosity though ... and OpenGL Unreal runs OK in 16-bit color mode ... in the demo (it freezes at the menu). Quake II in OpenGL seems to be very well suited to the Rage Pro in spite of its relatively modest specs. (It seems to handle higher texture detail settings better than the Riva 128 ... granted, 8 MB vs 4 MB might have something to do with it, or the Rage Pro's texture cache might be bigger; I doubt AGP vs PCI has much to do with it.) The Rage Pro is also MUCH faster than software rendering on a K6-2/500 at all resolutions and detail settings I tried. (Unlike Unreal, which very much likes software mode for anything using high-detail textures and 32-bit color; plus the Rage's questionable sub-pixel accuracy is apparent in Unreal, especially at higher texture-res settings.)

I remember my Rage + K6-2 system running LucasArts' (really Factor 5's) Star Wars games nicely as a kid, all DirectX 5 I think, and not having the blaring alpha-texture issues of some other N64 ports (like Turok); at most it was more like Final Reality's alpha fog (unfiltered but barely noticeable). Opaque or nearly opaque alpha levels in textures are the worst, as they're usually very low res and totally unfiltered; Unreal's colored light maps with higher opacity do this, but especially the far background scenery (not the sky, but the skyline). Note the Rage Pro handles translucency fine (vertex alpha: entire polygons/textures of uniform opacity/translucency), just not formal alpha texels (i.e. textures in 8-8-8-8 or 4-4-4-4 RGBA format).

So ironically enough, all those super lazy dithered surfaces in Unreal's software renderer are nicely blended and filtered by the Rage Pro, while the light maps (and a handful of other things) which ARE filtered in the software engine (the light maps are, at least; some other stuff isn't) come out blocky and posterized due to the lack of filtering on the Rage Pro. (You can disable the lazy 'fast translucency' in Unreal's command-line preferences though, which IMO should've been the default, as the speed difference is negligible to the naked eye -at least on MMX-capable CPUs ... and non-MMX CPUs are mostly stuck at 320x240 for decent framerates anyway, where the dithering looks way worse, so it's still worth the trade IMO.)

I've seen some mention of Direct3D drivers working around that limit on the Rage Pro by pre-filtering alpha textures (something the generous 8 MB on typical Rage Pros certainly facilitates for most games of the time), but that doesn't seem to be happening in Unreal. (Maybe it's a fix specific to DirectX 5.x games rather than 6.x/7, or it's just a rumor and not actually a thing.)