VOGONS


AMD K5 - enable write allocate

First post, by swaaye

User metadata
Rank l33t++
Rank
l33t++

I've been playing with an AMD K5 again on my ASUS P5A and decided to look into enabling write allocate. The BIOS only supports K6 write allocate, though the K5 also supports this function. Write allocate reduces main memory traffic by better utilizing the L1 cache. The end result is improved CPU performance. I found three programs that can control the special features of the K5.

-Powerleap CPU Control Panel (https://web.archive.org/web/20020607151843/ht … roducts/ccp.htm) Checkbox for WA.
-K5.sys (http://homepage3.nifty.com/guchimasa/tools/toolsmain.htm). DOS device driver. Can configure all special AMD WA registers. The most configurable of all.
-AMD ENWA.exe (ftp://78.46.141.148/driver/CPU/amd/enwa.exe) K5 and K6 write allocate enabler for DOS. Sets WA, top of memory, and 15-16M memory hole.

AMD Documentation
http://bitsavers.trailing-edge.com/pdf/amd/K8 … _Note_Mar97.pdf

Sisoft Sandra can also determine if WA is enabled on a K5.

End results, tested on this system:
K5 PR200
ASUS P5A
128MB PC66 2-2-2
Sierra Screamin 3D Verite V1000
IDE DVDROM
IDE 80GB HDD
Pro 100M NIC
Win98SE / pure DOS 7

I ran tests with Wing Commander III, Sisoft Sandra and Wintune. I only used K5.sys for these tests.

-Wing Commander III in DOS actually benchmarks slower and I don't know why. Both video and CPU performance are reduced.
-Sisoft Sandra scores were mostly unchanged. Only the CPU ALU benchmark improved noticeably. CPU Multimedia tests curiously did not change, nor did memory performance.
-Wintune showed general improvements, especially in memory performance.

I had no stability problems.

Reply 2 of 34, by feipoa


Perhaps a BIOS hack is needed to fully utilise write allocate on the K5, or perhaps some minor hardware mod? Or maybe it just doesn't do much for the K5...

Plan your life wisely, you'll be dead before you know it.

Reply 4 of 34, by swaaye

feipoa wrote:

Perhaps a BIOS hack is needed to fully utilise write allocate on the K5, or perhaps some minor hardware mod? Or maybe it just doesn't do much for the K5...

The results seemed entirely positive in Windows. Even the OpenGL and D3D performance improved a little in Wintune.

I need to look at some other DOS games. The Wing Commander III slowdown boggles... I tried some other parameters on K5.sys too but nothing improved it. Without the write allocate enabled, WC3 actually rates the K5 PR200 + Verite setup as maximum CPU and video performance (a rating of 0 for both). I think the 3D engine is based on integer arithmetic so it should perform very well on K5.

Reply 5 of 34, by swaaye

Artex wrote:

Nice! K5 PR200 is super rare isn't it? Like never officially released?

I got in on a CPU-World group buy of some stock of the PR200 around 8 years ago. So there were some made but yeah it's not exactly commonplace.

Reply 10 of 34, by feipoa

Matth79 wrote:

Is there a thing for setting memory regions, or is that only a Cyrix tweak - I'm thinking maybe the graphics memory might be getting tangled up with the write allocate

Is that the write gathering feature(s)?


Reply 11 of 34, by swaaye

Matth79 wrote:

Is there a thing for setting memory regions, or is that only a Cyrix tweak - I'm thinking maybe the graphics memory might be getting tangled up with the write allocate

There is with the K5.sys driver.

It's all in Japanese, but take a look at the attached text file. I'm not sure what to do with the fixed region options....

There is also an official AMD document about the registers.
http://bitsavers.trailing-edge.com/pdf/amd/K8 … _Note_Mar97.pdf

Attachment: K5.TXT (6.72 KiB, 78 downloads; file license: fair use/fair dealing exception)

Reply 13 of 34, by elianda


Just to add another tool: ftp://78.46.141.148/driver/CPU/amd/enwa.exe

It would be good to know which memory ranges are configured non-cacheable without and with WA enabled. Maybe the config for VGA memory access (either windowed or LFB) isn't changed at all.
Also the primary gain of WA seems to be strongly code dependent.
However, there should be some difference in the write performance graph in e.g. speedsys. With WA on, a write miss would cause a read burst (a line fill), and then writes go to L1 until the cache line is written back. Without WA, it would be a write burst for each write. (Both cases are for a cache miss.)
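To make the difference above a bit more concrete, here is a toy model that counts bus events with and without write allocate. It is only a sketch under loud assumptions: a direct-mapped write-back cache, with `LINE`, `NUM_LINES` and the `simulate` function all made up for illustration. It is not the K5's real cache organisation and says nothing about actual timings.

```python
LINE = 32        # bytes per cache line (illustrative value)
NUM_LINES = 256  # direct-mapped toy cache, NOT the K5's real layout


def simulate(accesses, write_allocate):
    """Count bus events for a list of ('r' or 'w', address) accesses."""
    tags = [None] * NUM_LINES
    dirty = [False] * NUM_LINES
    stats = {"line_fills": 0, "write_backs": 0, "single_writes": 0}
    for op, addr in accesses:
        line_no = addr // LINE
        idx = line_no % NUM_LINES
        hit = tags[idx] == line_no
        if not hit and (op == "r" or write_allocate):
            # Allocating miss: evict the old line, then burst-read the new one.
            if tags[idx] is not None and dirty[idx]:
                stats["write_backs"] += 1
            tags[idx] = line_no
            dirty[idx] = False
            stats["line_fills"] += 1
            hit = True
        if op == "w":
            if hit:
                dirty[idx] = True             # write absorbed by the cache
            else:
                stats["single_writes"] += 1   # write miss goes out to memory
    return stats


# Case 1: fill a 1 KiB buffer dword by dword, then read it back.
reuse = [("w", a) for a in range(0, 1024, 4)] + [("r", a) for a in range(0, 1024, 4)]

# Case 2: touch one dword per line across 64 KiB and never read it again.
stream = [("w", a) for a in range(0, 64 * 1024, LINE)]

for name, pattern in (("reuse", reuse), ("stream", stream)):
    for wa in (False, True):
        print(name, "WA on " if wa else "WA off", simulate(pattern, wa))
```

In this model the "reuse" pattern trades 256 single writes for line fills that the later reads would have needed anyway, while the "stream" pattern pays a line fill (and later a write-back) for every write that would otherwise have been a single write. That is one possible reading of why the gain is so code dependent, not a measurement.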

Retronn.de - Vintage Hardware Gallery, Drivers, Guides, Videos. Now with file search
Youtube Channel
FTP Server - Driver Archive and more
DVI2PCIe alignment and 2D image quality measurement tool

Reply 14 of 34, by swaaye


I tried enwa.exe earlier but couldn't get it working for some reason. But I've tried it again and now it's working. I don't know what I was doing before...

But I tested Quake again and it is slower with enwa.exe as well. I also tried k5.sys with the fixed range exemption enabled which prevents WA in 000A_0000h-000F_FFFFh. Still slower than without WA.
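For anyone trying to picture how these settings interact, here is a small sketch of the address check they seem to imply. It is an assumption about how the options combine (top of memory, the 15-16M hole, and the 000A_0000h-000F_FFFFh exemption), and `wa_allowed` with its parameters is a made-up name. It does not touch or describe the actual K5 registers.

```python
MIB = 1024 * 1024


def wa_allowed(addr, top_of_memory, hole_15_16m=False, exempt_fixed_range=True):
    """Return True if a write miss at addr would be allowed to allocate.

    Hypothetical model of the options mentioned in this thread, not the
    real register logic.
    """
    if addr >= top_of_memory:
        return False  # above installed RAM, e.g. a PCI linear frame buffer
    if hole_15_16m and 15 * MIB <= addr < 16 * MIB:
        return False  # 15-16M memory hole enabled
    if exempt_fixed_range and 0x000A0000 <= addr <= 0x000FFFFF:
        return False  # legacy VGA window, option ROMs and BIOS area
    return True


top = 128 * MIB  # matches the 128MB in the test system
for addr in (0x000A0000, 0x00100000, 0x00F80000, 0xE0000000):
    print(hex(addr), wa_allowed(addr, top, hole_15_16m=True))
```

If the model is roughly right, neither the windowed VGA range (with the exemption on) nor a linear frame buffer above top of memory would be affected by WA, which would point away from video memory as the cause of the slowdown.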

And I also tried switching the Verite card for a Riva 128 ZX. WA is still slower. 😀

The speed drop is <1 fps, from ~32.2fps down to ~31.5fps at 320x200. But it isn't within the margin of error. It is always slower.

Reply 16 of 34, by swaaye

feipoa wrote:

How does the slow down compare in software mode vs. 3D accelerated mode?

Ok I installed VQuake and did more runs of "timedemo demo1" at 320x200 with the Verite V1000. Write allocate is faster in this case. I see 45.5fps without and 45.9-46.0fps with WA.

I used enwa.exe.
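To put the Quake and VQuake numbers from this thread on the same scale, here is a quick sketch that turns the fps figures into percent change and per-frame time. The helper names are made up for illustration, and the inputs are just the approximate figures quoted in the posts above.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def frame_time_ms(fps):
    """Average time per frame in milliseconds."""
    return 1000.0 / fps


# (label, fps without WA, fps with WA), taken from the posts above
runs = [
    ("Quake, 320x200", 32.2, 31.5),
    ("VQuake demo1, 320x200", 45.5, 45.9),
]

for label, wa_off, wa_on in runs:
    print(f"{label}: {pct_change(wa_off, wa_on):+.1f}% "
          f"({frame_time_ms(wa_off):.2f} ms -> {frame_time_ms(wa_on):.2f} ms per frame)")
```

So the Quake slowdown works out to roughly 2%, and the VQuake gain to roughly 1%.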

Reply 17 of 34, by feipoa


Not exactly awe-inspiring but at least the tides have turned. Do higher resolutions fare any better?


Reply 18 of 34, by swaaye

feipoa wrote:

Not exactly awe-inspiring but at least the tides have turned. Do higher resolutions fare any better?

Yeah it looks like write allocate is not much of a benefit.

The V1000 is most likely going to be a bottleneck much above 320x200 so I didn't do any comparisons at higher resolutions.

I did do one run of demo3 at 640x480 and scored 22.3fps. It looks like the K5 PR200 is performing like a Pentium 133 with vQuake. http://gona.mactar.hu/v1000/ (see bottom chart)