Tweaks, Overclocking and Benchmark post
A Brief History
Upgrading the CPU in a Socket 7 system took an interesting twist when AMD introduced the K6-2 / K6-III (desktop) and the K6-2+ / K6-III+ (mobile) CPUs. These CPUs were socket-compatible with many systems that officially supported only Pentium and K6/Cyrix CPUs up to 233 MHz.
Sometimes manufacturers added support for these CPUs with an official BIOS update, but more often it was enthusiasts who produced an updated BIOS. One source for such BIOS files is The Unofficial AMD K6-2+ / K6-III+ page.
Many of the later K6-series CPUs were intended to run on newer so-called "Super Socket 7" boards that supported a front-side bus speed of 100 MHz. The CPU I bought to upgrade my system, for example, would use a 4x multiplier together with a 100 MHz front-side bus to reach its intended speed of 400 MHz. However, AMD cleverly included a workaround for users of older Socket 7 motherboards: if the motherboard is set to the 2x CPU multiplier, the CPU interprets that internally as 6x! This allows these faster CPUs to be used on motherboards that lack the necessary multiplier or FSB settings.
My motherboard (the ASUS TX97-XE) is one that officially only supports a 66 MHz front-side bus speed and does not include a 6x CPU multiplier setting. It also needed an unofficial BIOS update to support the AMD K6-III+ 400 MHz CPU I'm using now.
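To make the multiplier trick concrete, here is a small sketch of the clock arithmetic. It assumes the "66 MHz" bus setting is nominally 66.67 MHz, and the function and table names are just made up for the illustration:

```python
# Core clock = front-side bus speed x the multiplier the CPU actually applies.
# Assumption: the "66 MHz" bus setting is nominally 66.67 MHz (200/3).

# Jumper multiplier -> multiplier applied internally (the 2x -> 6x remap above).
MULTIPLIER_REMAP = {2.0: 6.0}


def core_clock_mhz(fsb_mhz: float, jumper_multiplier: float) -> float:
    """Return the resulting core clock in MHz."""
    applied = MULTIPLIER_REMAP.get(jumper_multiplier, jumper_multiplier)
    return fsb_mhz * applied


if __name__ == "__main__":
    settings = [
        ("Super Socket 7, 100 MHz x 4", 100.0, 4.0),
        ("TX97-XE stock, 66 MHz x 2 (read as 6x)", 200 / 3, 2.0),
        ("TX97-XE overclocked, 75 MHz x 2 (read as 6x)", 75.0, 2.0),
    ]
    for label, fsb, jumper in settings:
        print(f"{label}: {core_clock_mhz(fsb, jumper):.0f} MHz")
```

So the same 2x jumper setting gives roughly 400 MHz on a 66 MHz bus and 450 MHz on a 75 MHz bus.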
Overclocking on the TX97-XE
The K6-III+ mobile CPU is a special kind of animal. It uses a lower voltage than the regular desktop CPUs of the time. In fact, the specified 1.6V cannot be supplied by the TX97-XE motherboard I use in my configuration. The lowest supported voltage is 1.8V:
The attachment TX97-XE-CPU-jumpers-pt1.png is no longer available
However, a minor bump in the operating voltage is not necessarily an issue as long as the CPU is properly cooled, and overvolting is often used when overclocking to keep the CPU stable. Hence, I was actually already set up to attempt an overclock by raising the front-side bus from the officially supported maximum of 66 MHz to 75 MHz. speeddemon has his FSB set to 83 MHz, but besides being quite a substantial overclock, it's also a setting I couldn't find any documentation for. The 75 MHz FSB, on the other hand, is clearly outlined in the ASUS manual for the TX97-XE:
The attachment TX97-XE-CPU-jumpers-pt2.png is no longer available
Results
Methodology
I'm going to be using 3DMark2000 scores as a way to see if video performance is affected by the overclocking or the tweaks.
Note: All results are with the updated BIOS.
Note: All results are with 16-bit color depth because the Voodoo3 does not support 32-bit 3D-rendering.
Note: 3DNow! is enabled for all tests.
Besides applying the overclock, I'm going to be using Central Tweaking Unit (CTU) by Rob Muller. This handy utility can enable some performance-enhancing features (namely write allocation and write combining) that are unknown to the BIOS and thus likely disabled.
The first test is just a baseline with only the updated BIOS and the rated clock speed. No further performance enhancements.
AMD K6-III+ @ 400 MHz (6x66): 1280 points
After successfully applying the 50 MHz CPU overclock, I decided to also attempt optimization with CTU.
I applied write combining to the first reported framebuffer/video memory range based on what I read in William Jones' guide where he states that:
The 2nd MTRR1 row can be used if you have two graphics cards to enable Write Combining on that card also...
I used the memory addresses visible in Device Manager as the basis for the settings in CTU, which confusingly gave me 32 MB worth of VRAM:
The attachment ctu_writecombine_1x32m-range.png is no longer available
This led to a substantial improvement in performance with and without overclocking:
AMD K6-III+ @ 400 MHz (6x66), 1x32MB range write-combining: 1392 points
AMD K6-III+ @ 450 MHz (6x75), 1x32MB range write-combining: 1441 points
In fact, it seems that enabling write-combining brought a bigger jump in performance than the overclock did! However, it bugs me that 1) there are two memory regions listed on the Resources tab in Device Manager and 2) the indicated ranges are 32 MB each, even though my Voodoo3 only has 16 MB of VRAM. My conclusion from this was that the Voodoo3 must be allocating system RAM for textures when it runs out of local video memory. Indeed, 3DMark2000 even runs texture-rendering speed tests up to 64 MB.
Following my intuition, I decided to perform two more benchmarks. In the first one, I enabled write-combining for each of the reported 32 MB memory ranges allocated to the Voodoo3 according to Device Manager:
The attachment ctu_writecombine_2x32m-range.png is no longer available
I knew all of it would not be video memory and thus did not expect to get the result I got:
AMD K6-III+ @ 450 MHz (6x75), 2x32MB range write-combining: 1564 points
Because performance improved, the second memory range must also include local video memory on the V3. However, this would imply that I was incorrectly applying write-combining to quite a lot of system RAM (see the Wikipedia article on why this is undesirable).
In my second experiment, I made the assumption that both memory ranges would start with local VRAM, but that the remainder of the range was just system RAM allocated for textures (sort of like AGP aperture). If I was right, I could enable write-combining for only the first 8 MB of each memory range (totalling the 16MB on the Voodoo3) and still achieve the same performance as when I had write-combining turned on for the full 64 megabytes of allocated memory:
The attachment ctu_writecombine_2x8m-range.png is no longer available
...and the results are in(!):
AMD K6-III+ @ 450 MHz (6x75), 2x8MB range write-combining: 1556 points
The difference between this and the previous result is within the margin of error, which in my opinion indicates I have reached the right conclusion. This is further supported by the fact that I got a lower score when I only enabled write-combining for the first memory range, even though I had designated a complete 32MB range then. 😎
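For anyone who wants the numbers as percentages, here is a quick sketch that works them out from the scores quoted in this post. The labels and the helper name are only there for the illustration:

```python
# 3DMark2000 scores quoted in this post, in the order they were measured.
RESULTS = [
    ("400 MHz, no write-combining", 1280),
    ("400 MHz, 1x32MB write-combining", 1392),
    ("450 MHz, 1x32MB write-combining", 1441),
    ("450 MHz, 2x32MB write-combining", 1564),
    ("450 MHz, 2x8MB write-combining", 1556),
]


def pct_change(old: int, new: int) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    baseline = RESULTS[0][1]
    previous = baseline
    for label, score in RESULTS:
        step = pct_change(previous, score)      # change versus the run before
        total = pct_change(baseline, score)     # change versus the first run
        print(f"{label}: {score} ({step:+.2f}% step, {total:+.2f}% vs. baseline)")
        previous = score
```

The step from 2x32MB to 2x8MB comes out at roughly half a percent, while the write-combining step at 400 MHz is larger than the step from 400 to 450 MHz.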
Conclusion:
For optimum results on your K6-2/K6-III based system, you should enable write combining on both video memory ranges, sized so that the two ranges together add up to the total amount of local video memory on your card.*
* At least in the case of a 3dfx Voodoo3 PCI
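As a rough sketch of that rule, the snippet below splits the card's local video memory evenly across the reported memory ranges. The base addresses are made up for the example (take the real ones from the Resources tab in Device Manager), and the even split is simply what worked in my test, not something I can confirm for other cards:

```python
MB = 1024 * 1024


def write_combine_ranges(base_addresses, vram_bytes):
    """Split vram_bytes evenly across the given range base addresses.

    Returns a list of (first_address, last_address) tuples, one per range.
    """
    share, remainder = divmod(vram_bytes, len(base_addresses))
    if remainder:
        raise ValueError("video memory does not divide evenly across the ranges")
    return [(base, base + share - 1) for base in base_addresses]


if __name__ == "__main__":
    # Hypothetical base addresses, NOT the ones from my system.
    bases = [0xE0000000, 0xE4000000]
    for first, last in write_combine_ranges(bases, 16 * MB):
        size_mb = (last - first + 1) // MB
        print(f"{first:#010x} - {last:#010x} ({size_mb} MB)")
```

With a 16 MB card and two ranges this gives two 8 MB windows, which matches the 2x8MB setting I used above.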