VOGONS


First post, by jakethompson1

Rank: Oldbie

The UM481 and UM491 are two 386/486 combo chipsets that support a write-back external cache of up to 256KB.
Their weakness is that they support neither 7+1 tag mode, which robs one bit of the Tag RAM for use as a dirty (alter) bit, nor switching the external cache from write-back to write-through mode, which would avoid the need for a dirty bit altogether.

As a refresher, these chipsets (and most other 486 chipsets) read in cache lines of 16 bytes. If the CPU changes just a single bit in a cache line, then when it is time to bring a different part of DRAM into that line, the entire 16 bytes must be written back to RAM first. The dirty bit records whether the line has been changed. Sometimes the board maker omits the Dirty RAM from the design altogether, and instead wires the ALT pin of the chipset to a pull-up resistor and leaves the ALTWR# pin floating. The chipset attempts to write the alter bit and it goes nowhere; when it tries to read the alter bit, it always reads a 1. So, if the CPU causes a cache line to be read into SRAM, then causes another one to be read into the same line without changing anything, the chipset has to write out those 16 bytes anyway. It is not hard to see that this hurts performance.
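To make the "always dirty" penalty concrete, here is a minimal Python sketch of a toy cache. It assumes a direct-mapped 256KB cache with 16-byte lines; the class and function names are made up for illustration, and it models only the write-back counting, not any timing or the real chipset logic.

```python
LINE_SIZE = 16             # bytes per cache line, as described above
CACHE_SIZE = 256 * 1024    # 256KB external cache
NUM_LINES = CACHE_SIZE // LINE_SIZE


class DirectMappedCache:
    """Toy direct-mapped write-back cache that counts line write-backs."""

    def __init__(self, has_dirty_ram):
        self.has_dirty_ram = has_dirty_ram
        self.tags = [None] * NUM_LINES    # None means the line is empty
        self.dirty = [False] * NUM_LINES
        self.fills = 0
        self.writebacks = 0

    def _read_alt(self, index):
        # Without a Dirty RAM the ALT pin is pulled up, so it always reads 1.
        return self.dirty[index] if self.has_dirty_ram else True

    def access(self, address, write=False):
        index = (address // LINE_SIZE) % NUM_LINES
        tag = address // CACHE_SIZE
        if self.tags[index] != tag:       # cache miss
            if self.tags[index] is not None and self._read_alt(index):
                self.writebacks += 1      # old line goes back to DRAM first
            self.tags[index] = tag
            self.dirty[index] = False
            self.fills += 1
        if write:
            self.dirty[index] = True


def read_only_sweep(cache, size):
    """Read `size` bytes sequentially, one access per cache line."""
    for address in range(0, size, LINE_SIZE):
        cache.access(address)


with_ram = DirectMappedCache(has_dirty_ram=True)
without_ram = DirectMappedCache(has_dirty_ram=False)
for cache in (with_ram, without_ram):
    read_only_sweep(cache, 512 * 1024)    # twice the cache size, reads only
    print(cache.has_dirty_ram, cache.fills, cache.writebacks)
```

In this toy model a read-only sweep never needs a write-back when the dirty bit is stored, but every eviction costs one when ALT is tied high.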

Some boards at least leave a socket to add the Dirty RAM even if it is left out at the factory, but I wanted to try PC-Engineer's modification (suggested here: Re: 486 cache/ram speed issue with write-back) on two boards that did not. It was successful.

The UM82C481 has ALT on pin 37, ALTWR# on pin 38, and TRWR# on pin 39.
The UM82C491 has ALTWR# on pin 72, ALT on pin 73, and TRWR# on pin 74.

  1. Locate the Tag RAM by determining which of the nine SRAMs has the chipset's Tag RAM write signal (TRWR#) connected to WE# (pin 27) on the SRAM.
  2. Create a makeshift Dirty RAM. Take another 32Kx8 SRAM and break off the data pins (11, 12, 13, 15, 16, 17, 18, 19) and WE# (27). Piggyback it on top of an intact SRAM and solder down the remaining pins of the top one. This combination is your new Tag RAM.
  3. There should be a pull-up resistor on your board near the ALT pin; use a multimeter to make sure. Run a wire from any one of the 8 remnants of the data pins of your Dirty RAM to the side of that pull-up resistor that has continuity to the ALT pin.
  4. This is the hard part: run a wire from the ALTWR# pin to the remnant of the WE# pin on your Dirty RAM.

I asked behind the scenes and got some good advice. Use 30 gauge wire (wire-wrap wire), a syringe of SMD291 flux, and 0.38 mm solder. Pre-tin the bare ends of your wire and make sure you solder to the top of the ALTWR# pin where it goes into the chip. You will be aligning the wire over the top of that pin and touching it with the tip of the iron for a split second. Don't try to solder to the pad or to the vertical portion of the pin. I attempted that on another board and am now having to clean it up, as all the pins are bridged together with solder and some got crossed onto the wrong pads altogether.

On the UM491, 486DX2-66, cachechk scores improved to 16 µs/KB, 24 µs/KB, and 36 µs/KB (the last figure, for accesses that miss the external cache, is down from 51).
On the UM481, 386DX-40, cachechk scores improved to 39 µs/KB and 60 µs/KB (60 down from 92). With the modification it is now on par with my ALi M1429, which uses 7+1.
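For a rough sense of scale, the before and after figures can be turned into a percentage. This is just arithmetic on the numbers quoted above; the helper name is made up for illustration.

```python
def percent_reduction(before, after):
    """Percentage by which the time per KB dropped."""
    return (before - after) / before * 100


# (before, after) time per KB for accesses that miss the external cache
results = {
    "UM491, 486DX2-66": (51, 36),
    "UM481, 386DX-40": (92, 60),
}

for board, (before, after) in results.items():
    saved = percent_reduction(before, after)
    print(f"{board}: {before} -> {after} per KB, {saved:.1f}% less time")
```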

What cachechk scores do you get on your 386DX-40 and 486DX2-66 systems?
Who else has a write-back-only chipset with no 7+1 support? I believe there were some OPTi that fall into this category also.

Attachments

  • um491.jpg
  • um481.jpg

Reply 1 of 11, by majestyk

Rank: Oldbie

When you break off the 8 data pins and the WE# pin, they will break off directly at the plastic case, so no decent soldering will be possible there afterwards.
I guess you should rather cut one data pin and the WE# pin halfway between the plastic case and the tip.

Reply 2 of 11, by jakethompson1

Rank: Oldbie
majestyk wrote on 2023-09-25, 04:45:

When you break off the 8 data pins and the WE# pin, they will break off directly at the plastic case, so no decent soldering will be possible there afterwards.
I guess you should rather cut one data pin and the WE# pin halfway between the plastic case and the tip.

That's probably a good idea, but it doesn't seem any worse than one of those Dallas mods...

Reply 3 of 11, by Eep386

Rank: Member

I did such a modification on my Aquarius MB-4D33NR, and it worked! With a DX-33, off-cache access time drops from the mid-70s to 50 uS/KB in Cachechk. With a DX2-66 it drops to about 36 uS/KB with all the wait states at their lowest setting.

Life isn't long enough to re-enable every hidden option in every BIOS on every board... 🙁

Reply 4 of 11, by Tiido

Rank: l33t

One shouldn't cut off the excess data pins, but instead wire them to pull-ups or pull-downs so that the chip will not read or write random data on those pins (unconnected CMOS inputs do not hold a specific state and are affected by any nearby activity). This reduces power consumption and gives more stable operation, which starts to make a difference once you begin to push the clocks.

DirtyCache.jpg (Dirty Bit on 420EX mobo)

This is what I did on an i420EX board. Luckily for me there was a spot for a Dirty bit chip, but not for the kind I had at hand, so I made one out of a regular 8-bit chip instead, tying all the unused data bits to ground via 1k resistors so that there's less traffic on the chip and, with it, higher dynamic performance.

T-04YBSC, a new YMF71x based sound card & Official VOGONS thread about it
Newly made 4MB 60ns 30pin SIMMs ~
mida sa loed ? nagunii aru ei saa 😜 (Estonian: "What are you reading? You won't understand it anyway.")

Reply 5 of 11, by Eep386

Rank: Member

I used a SIP packaged 4.7K bussed resistor network from a broken old board to keep the unused data lines tied to a known state. (It can be either high or low, doesn't really matter.) You can solder pin 2 and up of the resistor array directly to the unused data pins on the dirty TAG SRAM, and run pin 1 of said array to either Vcc or GND. You can let the last unused resistor array pin float without problem.
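As a rough comparison of the two resistor values mentioned in this thread (4.7K here, 1k in the post above), Ohm's law gives the current through one resistor whenever the SRAM drives that data pin to the opposite rail. The 5 V supply is an assumption, and the helper name is made up for illustration.

```python
VCC_VOLTS = 5.0  # assumption: 5 V logic supply on these boards


def pull_current_ma(vcc_volts, resistance_ohms):
    """Current in mA through one pull resistor with the full supply across it."""
    return vcc_volts / resistance_ohms * 1000


for label, ohms in (("4.7K resistor network", 4700), ("1k resistors", 1000)):
    print(f"{label}: {pull_current_ma(VCC_VOLTS, ohms):.2f} mA per pin")
```

Either value holds the pin at a known level; the larger resistance simply draws less current while the output is driven against it.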

Last edited by Eep386 on 2023-11-21, 18:07. Edited 1 time in total.

Reply 6 of 11, by majestyk

Rank: Oldbie

This is how it _should_ be done.
In a similar case, when a second TAG chip is used to enlarge the cacheable area from 64MB to 512MB on Intel 430HX mainboards, 3 additional data lines of a 32K x 8 TAG chip are needed to expand the TAG bus from 8 bits to 11 bits; the other 5 data lines of the second chip are unused.
In about 95% of the cases manufacturers of mainboards / COAST sticks just let the 5 data pins float.
I could not find any significant negative effects so far...
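The 64MB and 512MB figures follow from the tag width if you assume a direct-mapped 256KB cache (an assumption for this sketch, not something stated above): each extra tag bit doubles the memory range the cache can cover. The function name is made up for illustration.

```python
KB = 1024
MB = 1024 * KB


def cacheable_area(cache_size_bytes, tag_bits):
    """Memory range a direct-mapped cache can cover with the given tag width."""
    return cache_size_bytes * (2 ** tag_bits)


CACHE_SIZE = 256 * KB  # assumption: direct-mapped 256KB cache

for tag_bits in (7, 8, 11):
    area_mb = cacheable_area(CACHE_SIZE, tag_bits) // MB
    print(f"{tag_bits} tag bits -> {area_mb}MB cacheable")
```

Under the same assumption, giving up one of 8 tag bits for a dirty bit (7+1 mode) would halve the cacheable range.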

Reply 7 of 11, by Eep386

Rank: Member

OK, this isn't a UMC chipset, but I figured out how to add a dirty tag to an OPTi 82C495SX.
~ALT goes to pin 87, while ~ALTWE goes to pin 104.
Figured this out using a multimeter and an all-ISA board that had the same chipset but had a dirty TAG socket.

After the modification, my Shuttle HOT-409 with the OPTi '495SX chipset goes from 130 uS/KB off-cache to 86 uS/KB.

Seems the '495SX is just a slow mofo, either way 😜

Reply 8 of 11, by mockingbird

Rank: Oldbie
jakethompson1 wrote on 2023-09-25, 03:15:

On the UM481, 386DX-40, cachechk scores improved to 39 uS/KB and 60 uS/KB (60 down from 92). With the modification it is now on-par with my ALi M1429 which uses 7+1.
<snip>

Would you please confirm my result:

 CACHECHK v4 2/7/96  Copyright (c) 1995 by Ray Van Tassle. (-h for help)
CMOS reports: conv_mem= 640K, ext_mem= 15,360K, Total RAM= 16,000K
Clocked at 386 39.6 MHz
Reading from memory.
MegaByte#: --------- Memory Access Block sizes (KB)-----
1 2 4 8 16 32 64 128 256 512 1024 2048 4096 <-- KB
0: 39 39 39 39 39 39 39 39 39 116 -- -- -- µs/KB
1: 39 39 39 39 39 39 39 39 39 117 117 117 117 µs/KB
2 3 4 5 6 7 8 9 10 11 12 13 14 15 <--- same as above.

Extra tests----
Wrt 54 54 55 55 55 55 55 55 55 55 55 55 55<-Write mem
This machine seems to have one cache!? [read]
!! cache is 256KB -- 27.9 MB/s 37.6 ns/byte (296%) 5.7 clks
Main memory speed -- 9.4 MB/s 111.2 ns/byte (100%) [read] 16.9 clks
Effective RAM access time (read ) is 222ns (a RAM bank is 2 bytes wide).
Effective RAM access time (write) is 104ns (a RAM bank is 2 bytes wide).
Clocked at 386 39.6 MHz. Cache ENABLED.
Options: -t0

I ask because I have the same 386 board as you, and I seem to have the speed boost without doing the mod.

I have the UM82C481AF (yours is the UM82C481BF). I have undocumented "W7" closed (yours is open), otherwise, aside from my unlocked BIOS, everything looks the same.

EDIT:

I also have a Chicony 386-33H I tested on and it also seems to work properly in this regard:

CACHECHK V7 11/23/98  Copyright (c) 1995-98 by Ray Van Tassle. (-h for help)
CMOS reports: conv_mem= 640K, ext_mem= 15,360K, Total RAM= 16,000K
386 Clocked at 32.2 MHz
Reading from memory.
MegaByte#: --------- Memory Access Block sizes (KB)-----
1 2 4 8 16 32 64 128 256 512 1024 2048 4096 <-- KB
0: 48 48 48 48 48 48 48 48 48 141 -- -- -- us/KB
1: 48 48 48 48 48 48 48 48 48 142 142 142 142 us/KB
2 <--- same as above.
3: 48 48 48 48 48 48 48 48 48 142 142 142 142 us/KB
4 5 6 7 <--- same as above.
8: 48 48 48 48 48 48 48 48 48 142 142 142 142 us/KB
9: 48 48 48 48 48 48 48 48 48 142 142 142 142 us/KB
10: 48 48 48 48 48 48 48 48 48 142 142 142 142 us/KB
11: 48 48 48 48 48 48 48 48 48 142 142 142 142 us/KB
12: 48 48 48 48 48 48 48 48 48 142 142 142 142 us/KB
13 <--- same as above.
14: 48 48 48 48 48 48 48 48 48 142 142 142 -- us/KB
15: 48 48 48 48 48 48 48 48 48 142 142 -- -- us/KB

Extra tests----
Wrt 66 66 66 66 66 66 66 66 66 66 66 66 66<-Writing
This machine seems to have one cache!? [reading]
!! cache is 256KB-- 22.9 MB/s 45.7 ns/byte (293%)
>>>> If you think you do have L2 cache, you might have FAKE CACHE chips! <<<<
5.6 clks
Main memory speed -- 7.8 MB/s 135.1 ns/byte (100%) [reading] 16.5 clks
Effective RAM access time (read ) is 270ns (a RAM bank is 2 bytes wide).
Effective RAM access time (write) is 126ns (a RAM bank is 2 bytes wide).
386 Clocked at 32.2 MHz. Cache ENABLED.
Options: -t0

Unless I'm misreading the results, might I suggest that this issue can be resolved through BIOS with these boards and that a hardware mod may not be necessary?

Reply 9 of 11, by jakethompson1

Rank: Oldbie
mockingbird wrote on 2023-12-26, 03:15:
Would you please confirm my result: […]
Unless I'm misreading the results, might I suggest that this issue can be resolved through BIOS with these boards and that a hardware mod may not be necessary?

The 117 us/KB and 142 us/KB parts are what I am talking about, i.e., you pay the "always dirty" penalty on a cache miss, not a cache hit. With the modification you should be able to get those numbers down to 60ish. I also suspect you have not tuned the DRAM wait states on the first board, or either the DRAM or cache wait states on the second board, assuming your SIMMs are fast enough.

Reply 10 of 11, by mockingbird

Rank: Oldbie
jakethompson1 wrote on 2023-12-27, 04:51:

The 117 us/KB and 142 us/KB parts are what I am talking about. i.e., you pay the "always dirty" penalty on a cache miss, not a cache hit. With the modification you should be able to get those numbers down to 60ish. I suspect you have not tuned DRAM wait states on the first board, and both DRAM and cache wait states on the second board either, assuming your SIMMs are fast enough.

Thanks, I tightened up my cache settings to 3-1-1-1 for read and 1ws write, as well as my DRAM settings to 1ws for both.

Unfortunately, the UM82C481A cannot set the DRAM read and write wait states independently, so I set both to 1ws, though I would have liked to try 0 and 2:

IMG_20231227_144506082.jpg

Now I get results similar to yours, pre-mod, that is:

CACHECHK v4 2/7/96  Copyright (c) 1995 by Ray Van Tassle. (-h for help)
CMOS reports: conv_mem= 640K, ext_mem= 15,360K, Total RAM= 16,000K
Clocked at 386 39.7 MHz
Reading from memory.
MegaByte#: --------- Memory Access Block sizes (KB)-----
1 2 4 8 16 32 64 128 256 512 1024 2048 4096 <-- KB
0: 39 39 39 39 39 39 39 39 39 90 -- -- -- µs/KB
1: 39 39 39 39 39 39 39 39 39 90 90 90 90 µs/KB
2 3 4 5 6 7 8 9 10 11 12 13 14 15 <--- same as above.

Extra tests----
Wrt 48 48 48 48 48 48 48 48 48 48 48 48 48<-Write mem
This machine seems to have one cache!? [read]
!! cache is 256KB -- 27.9 MB/s 37.6 ns/byte (228%) 5.7 clks
Main memory speed -- 12.2 MB/s 85.9 ns/byte (100%) [read] 13.0 clks
Effective RAM access time (read ) is 171ns (a RAM bank is 2 bytes wide).
Effective RAM access time (write) is 91ns (a RAM bank is 2 bytes wide).
Clocked at 386 39.7 MHz. Cache ENABLED.
Options: -t0

I'll have to test that 1ws for read and write for stability though. My RAM is 60ns and my cache is 20ns.

I'll post my results after modding. Thanks.

Reply 11 of 11, by mockingbird

Rank: Oldbie

Mission accomplished:

IMG_20231227_175429018.jpg

CACHECHK V7 11/23/98  Copyright (c) 1995-98 by Ray Van Tassle. (-h for help)
CMOS reports: conv_mem= 640K, ext_mem= 15,360K, Total RAM= 16,000K
386 Clocked at 39.0 MHz
Reading from memory.
MegaByte#: --------- Memory Access Block sizes (KB)-----
1 2 4 8 16 32 64 128 256 512 1024 2048 4096 <-- KB
0: 39 39 39 39 39 39 39 39 39 66 -- -- -- us/KB
1: 39 39 39 39 39 39 39 39 39 66 66 66 66 us/KB
2 3 4 5 6 7 8 9 <--- same as above.
10: 39 39 39 39 39 39 39 39 39 66 66 66 66 us/KB
11: 39 39 39 39 39 39 39 39 39 66 66 66 66 us/KB
12 13 14 15 <--- same as above.

Extra tests----
Wrt 48 48 48 48 48 48 48 48 48 48 48 48 48<-Writing
This machine seems to have one cache!? [reading]
!! cache is 256KB-- 27.9 MB/s 37.6 ns/byte (167%)
>>>> If you think you do have L2 cache, you might have FAKE CACHE chips! <<<<
5.6 clks
Main memory speed -- 16.7 MB/s 62.9 ns/byte (100%) [reading] 9.3 clks
Effective RAM access time (read ) is 125ns (a RAM bank is 2 bytes wide).
Effective RAM access time (write) is 91ns (a RAM bank is 2 bytes wide).
386 Clocked at 39.0 MHz. Cache ENABLED.
Options: -t0

Pin 15 (I/O 4) is also disconnected, even though it's not splayed up like the other pins. It was broken beforehand. The purple wire looks like it's attached to pin 39, but it is in fact soldered to pin 38 (third from the top).
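Putting the three CACHECHK runs from this board side by side gives a rough idea of how much of the cache-miss time went to unnecessary write-backs. This is only arithmetic on the us/KB figures posted above; the runs used different CACHECHK versions and slightly different measured clocks, so treat it as an estimate. The labels and variable names are made up for illustration.

```python
CACHE_HIT = 39  # us/KB for block sizes that fit in the 256KB cache

# us/KB for block sizes larger than the cache, in the order the runs were posted
miss_times = {
    "default wait states, no Dirty RAM": 117,
    "tuned wait states, no Dirty RAM": 90,
    "tuned wait states, Dirty RAM added": 66,
}

for label, miss in miss_times.items():
    print(f"{label}: {miss} us/KB, {miss / CACHE_HIT:.2f}x the cache-hit time")

before = miss_times["tuned wait states, no Dirty RAM"]
after = miss_times["tuned wait states, Dirty RAM added"]
saved = before - after
share = saved / before * 100
print(f"Saved by the mod: {saved} us/KB ({share:.1f}% of the tuned miss time)")
```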

Thanks for your great work. Cheers.
