VOGONS


First post, by jakethompson1

Rank Oldbie

I recently posted about not being able to get cache read faster than 3-2-2-2 and write faster than 1 WS on my board. It's running a DX2-80 and has a UMC 491F chipset.

I just replaced all the cache chips plus the tag RAM with 12 ns, 256 kilobit (32Kx8) chips. They are from Digi-Key and therefore should be legitimate. No dice. I still can't get the timings any faster. I experimented with swapping out the VLB VGA card for an ISA one, and with setting the VLB and CPUCLK jumpers for 40 MHz operation. Doesn't help. On 3-1-1-1 I can sometimes get it to boot partially into DOS, and then it complains that it can't find command.com, etc., or AMISETUP suddenly claims the BIOS is password-protected. Very erratic operation. I can't get it to boot at all with the read timing on 2-1-1-1 or the write timing on 0 WS. The DRAM is 60ns and is set (successfully) to 0 WS.

Doesn't seem like there is any point in having cache on this board if it can't go any faster than this.
With L2 disabled, CACHECHK reports 16 us/KB up to 8 KB (L1) cache, and 22 us/KB for all bigger sizes.
With L2 enabled, CACHECHK reports 16 us/KB up to 8KB, 20 us/KB from 16 to 256 KB, and 37 us/KB on anything bigger. So I'd be trading 10% faster for things in the cache for an added 68% miss penalty on anything not in the cache. Ouch!
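Those percentages can be sanity-checked with a few lines of arithmetic. This is only a rough sketch using the CACHECHK figures quoted above. The break-even hit rate is an extra that is not in the original numbers, and it assumes every read beyond the 8 KB L1 costs either the flat L2-hit figure or the flat L2-miss figure, which is a simplification.

```python
# CACHECHK read times in us/KB, taken from the figures above.
L2_OFF = 22.0    # blocks larger than the 8 KB L1, L2 disabled
L2_HIT = 20.0    # 16-256 KB blocks, L2 enabled
L2_MISS = 37.0   # blocks larger than 256 KB, L2 enabled

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

hit_gain = -percent_change(L2_OFF, L2_HIT)      # time saved on an L2 hit
miss_penalty = percent_change(L2_OFF, L2_MISS)  # extra time on an L2 miss

# Hit rate h at which L2 on and L2 off cost the same on average:
#   h * L2_HIT + (1 - h) * L2_MISS == L2_OFF
break_even = (L2_MISS - L2_OFF) / (L2_MISS - L2_HIT)

print(f"L2 hit saves {hit_gain:.0f}% of the time versus no L2")
print(f"L2 miss costs {miss_penalty:.0f}% more time than no L2")
print(f"break-even L2 hit rate: {break_even:.0%}")
```

Under that simplified model, L2 would only pay off if the large majority of reads that miss L1 hit in L2.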

My board is maxed out at 256KB cache. Any chance it would work at 2-1-1-1 and 0 WS with a smaller cache? Problem is, the smaller cache sizes call for 8Kx8 or 16Kx8 chips. Can I jumper it as if I've swapped the chips for smaller ones, or will that not work? I'm inspired to try out smaller caches by some of the commentary in Feipoa's Biostar MB-8433UUD manual.

Presumably the chipset is just too slow? Perhaps it wasn't actually designed for 40 MHz operation? I haven't seen how much I can push it if I lower to 33 MHz but that might be an experiment for another day.

Was PC Chips onto something? haha

Thanks

Reply 1 of 14, by Anonymous Coward

Rank l33t

UMC 491F should be a pretty fast and efficient chipset. I think something must be wrong with your board. Maybe it was abused or subjected to electrostatic discharge before you got it...or just bad from the factory. I doubt a smaller cache would help, but in my experience you should be able to set your cache jumpers for smaller sizes and it should work if your SRAMs are larger than the required size.

"Will the highways on the internets become more few?" -Gee Dubya
V'Ger XT|Upgraded AT|Ultimate 386|Super VL/EISA 486|SMP VL/EISA Pentium

Reply 2 of 14, by jakethompson1

Rank Oldbie

Well now I'm in worse trouble. It won't work reliably with 2x16MB SIMMs anymore. One 16MB SIMM in either slot works better, as does a single 32MB SIMM. Sometimes the video even glitches in Speedsys while doing memory read or move testing with two SIMMs in. Also having trouble with WfW 3.11 randomly rebooting the machine. Seems I need to dismantle, possibly take out and reseat the cache, and try again. If that doesn't work guess I'm going back to the old cache chips since timing is no better anyway. The 12ns ones use a few more mA than the old ones, don't know if that could be the issue.

Reply 3 of 14, by mkarcher

Rank l33t

I wonder why you only talk about 3-2-2-2, 2-1-1-1 and 3-1-1-1. Typically the burst speed recommended by chipset vendors at 40 MHz is 2-2-2-2. That's just one cycle faster than 3-2-2-2, but it should be stable, too. Especially so if you swap the 20ns tag RAM you wrote about in your other thread for something faster, like the 12ns chips you are talking about in this thread.
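To put numbers on the burst notation, here is a small sketch. It assumes the usual 486 burst of four 32-bit transfers (one 16-byte line) and computes an ideal back-to-back figure only. The helper names are made up for illustration, and real CACHECHK results will come out lower than these peaks.

```python
BUS_MHZ = 40.0    # bus clock discussed in this thread
LINE_BYTES = 16   # assumed: one 486 burst = four 32-bit transfers

def burst_cycles(timing):
    """Total bus cycles for a burst written like '3-2-2-2'."""
    return sum(int(part) for part in timing.split("-"))

def peak_mb_per_s(timing, bus_mhz=BUS_MHZ):
    """Ideal back-to-back burst bandwidth in MB/s (10**6 bytes per second)."""
    return LINE_BYTES * bus_mhz / burst_cycles(timing)

for timing in ("3-2-2-2", "2-2-2-2", "3-1-1-1", "2-1-1-1"):
    print(timing, burst_cycles(timing), "cycles,",
          round(peak_mb_per_s(timing), 1), "MB/s peak")
```

On this count 3-2-2-2 is 9 cycles per line and 2-2-2-2 is 8, which is the one-cycle difference mentioned above.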

The abysmal RAM read performance with cache enabled sounds like your board operates L2 in write-back with an "always dirty" strategy. This strategy is known to be worse than write-through in most situations. An L2 cache in write-back mode can hold contents that are not yet in RAM. On a cache miss, the RAM needs to be updated if the L2 cache line that gets evicted has data that is not yet in RAM. This takes some time. Usually, the chipset has a single bit per cache line indicating the clean/dirty state of that line. Old chipsets required a separate dirty tag chip to store the dirty bit, whereas newer chipsets can (optionally) use one of the 8 data bits from the tag chip as the dirty bit. You can run in write-back mode without a dirty bit, but in that case the chipset has to assume the data is dirty on every cache miss, and the processor has to wait for data during the often unnecessary write-back.

Look for an option called "tag bits: 7+1"/"tag bits: 8" or "combine tag+dirty" in AMISETUP and set it to 7+1 (7 actual tag bits, 1 dirty bit) or combine:enabled. If you don't have an option like that, try to set the L2 cache in write-through mode. Read performance of main RAM with L2 disabled should not be considerably faster than with L2 enabled and a cache miss.
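The cost of the "always dirty" strategy described above can be illustrated with a toy model. This is a sketch under loud assumptions, not how the UMC chipset actually works: a direct-mapped cache that allocates a line on both read and write misses, a made-up random workload with 10% writes, and hypothetical names throughout. It only counts how many evictions trigger a write-back to RAM.

```python
import random

def count_writebacks(accesses, num_lines, has_dirty_bit):
    """Count line write-backs in a toy direct-mapped write-back cache.

    accesses: iterable of (line_address, is_write) pairs.
    """
    tags = [None] * num_lines     # which line address each slot holds
    dirty = [False] * num_lines   # only consulted when has_dirty_bit is True
    writebacks = 0
    for address, is_write in accesses:
        slot = address % num_lines
        if tags[slot] != address:             # miss: evict the old line
            if tags[slot] is not None:
                if not has_dirty_bit or dirty[slot]:
                    writebacks += 1           # old line is copied back to RAM
            tags[slot] = address
            dirty[slot] = False
        if is_write:
            dirty[slot] = True
    return writebacks

random.seed(0)  # fixed seed so the run is repeatable
# Hypothetical workload: 10% writes spread over 4x more lines than the cache holds.
trace = [(random.randrange(4096), random.random() < 0.1) for _ in range(20000)]

with_bit = count_writebacks(trace, 1024, has_dirty_bit=True)
always_dirty = count_writebacks(trace, 1024, has_dirty_bit=False)
print("write-backs with a dirty bit:", with_bit)
print("write-backs when always dirty:", always_dirty)
```

Without a dirty bit every eviction pays for a write-back, so a read-heavy workload does many more of them than it needs to.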

Reply 4 of 14, by jakethompson1

Rank Oldbie

No option for 2-2-2-2 in AMISETUP.

I don't see anything about the dirty bit, but there was an external cache WB option. I tried setting it to disabled, and it didn't improve anything as reported by cachechk. The Speedsys graph appeared the same or slightly worse. There is one hint in the manual of what you are saying about a dirty bit, though. When using eight 32K SRAM chips there is the option to use either a 32K or a 16K tag RAM. Mine has, and is jumpered for, the 32K tag.

For what it's worth, I put all the old cache chips back in and the problems I was having even with 3-2-2-2 when using two RAM SIMMs went away.

Someone elsewhere suggested testing all the SRAMs since I have a TL866II+; they all pass.

There is no wild way it would do better with 20ns srams rather than 15ns, is there? Doesn't it seem suspicious this board would have shipped with a 20ns tag and 15ns srams?

Reply 5 of 14, by amadeus777999

Rank Oldbie

Could you post a photo of your SRAMs?
The SRAM's rated latency is just that: it limits how fast the cache can be run, not whether it works. The reason they supplied a 20/15ns pairing is that nobody expects these boards to be run maxed out.

Reply 7 of 14, by The Serpent Rider

Rank l33t++

mkarcher wrote:

Look for an option called "tag bits: 7+1"/"tag bits: 8" or "combine tag+dirty" in AMISETUP and set it to 7+1 (7 actual tag bits, 1 dirty bit) or combine:enabled.

Certain AMIBIOS versions didn't have that option, even though the BIOS had a Write Back or Write Through cache policy switch.

jakethompson1 wrote:

There is no wild way it would do better with 20ns srams rather than 15ns, is there?

The board can't tell which SRAMs are installed, but quality varies wildly between different "15ns" chips, so a poor 15ns part can perform worse than a good 20ns one.

I must be some kind of standard: the anonymous gangbanger of the 21st century.

Reply 9 of 14, by jakethompson1

Rank Oldbie

So, I did some experimenting and got quite the opposite of the result I expected.

First I tried an Intel 486DX2-66 overclocked to 80 just to see if Cyrix was a factor. No difference.

Next, I dropped the bus to 33 MHz.
Doing this, I'm now able to try 3-2-2-2, 3-1-1-1, and 2-1-1-1 timings. On 2-1-1-1, the system is too unstable to even run Mscdex, but I can at least do an F5 boot and run cachechk.
I still can't do write timing of 0 WS so it's left at 1.

The cachechk numbers at a 33MHz bus are worse!
3-2-2-2: 32 us/KB for 16K-256K reads, 52 us/KB for anything larger.
3-1-1-1: 26 us/KB for 16K-256K reads, 52 us/KB for anything larger.
2-1-1-1: 24 us/KB for 16K-256K reads, 52 us/KB for anything larger.

Considering I'm at 20 us/KB cache hit and 37 us/KB cache miss at 40 MHz, looks like I should shut up, quit complaining, and accept 3-2-2-2 @ 40 MHz!
More importantly, the cache timing appears to have no effect on cache misses, so even if I could do 2-1-1-1 @ 40 MHz it would do nothing about that 37 us/KB on misses.
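One way to compare the two bus speeds is to convert us/KB into bus cycles per KB. This is a rough sketch using only the cachechk numbers above. It takes "33 MHz" at face value (the real clock may be nearer 33.3 MHz), and the CPU core clock also drops from 80 to 66 MHz, so it is not a clean comparison.

```python
# cachechk read times (us/KB) from this thread, keyed by bus clock in MHz.
# Each entry is (16K-256K reads, reads larger than 256K).
results = {
    40.0: {"3-2-2-2": (20, 37)},
    33.0: {"3-2-2-2": (32, 52), "3-1-1-1": (26, 52), "2-1-1-1": (24, 52)},
}

def cycles_per_kb(us_per_kb, bus_mhz):
    """Bus cycles spent per KB read: microseconds times cycles per microsecond."""
    return us_per_kb * bus_mhz

for bus_mhz, timings in results.items():
    for timing, (hit_us, miss_us) in timings.items():
        print(f"{bus_mhz:.0f} MHz {timing}: "
              f"hit {cycles_per_kb(hit_us, bus_mhz):.0f} cycles/KB, "
              f"miss {cycles_per_kb(miss_us, bus_mhz):.0f} cycles/KB")
```

At the same nominal 3-2-2-2 setting, the 33 MHz run spends more bus cycles per KB than the 40 MHz run on both hits and misses, so the lower clock alone doesn't account for the slowdown.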

Those of you with 486DX2/66 or 486DX2/80 and 256KB cache - what are your cachechk timings?

Reply 10 of 14, by chublord

Rank Member
jakethompson1 wrote on 2020-07-21, 18:34:

Those of you with 486DX2/66 or 486DX2/80 and 256KB cache - what are your cachechk timings?

This isn't exactly what you asked, but it might provide a point of reference at least - 486DX4/100 w/ 50 MHz FSB and 256KB cache (20ns):
10 us/KB for 1-8K
11 us/KB for 16K
22 us/KB for 32-256K
33 us/KB for >256K

Additional tests
L1 cache speed - 103.4 MB/s 10.1 ns/byte
L2 cache speed - 49.4 MB/s 21.2 ns/byte
Memory speed - 32.9 MB/s 31.9 ns/byte

I suspect my timings are super slow. Opti 802C chipset. Edit: I'm using 20 ns cache chips and 70ns memory chips.

IBM Valuepoint 486 DX4-100, Opti 802G, 50 MHz FSB, Voodoo1+S3 864, Quantum Fireball EX 4.0 GB, Seagate Medalist 1.6 GB, 128 MB FPM, 256k L2

Reply 11 of 14, by jokker

Rank Newbie

486DX2/66 8MB 70ns 256K 20ns

16 us/KB 1-8K
32 us/KB 16-256K
158 us/KB >256K

L1 cache speed - 67.9 MB/s 15.4 ns/byte
L2 cache speed - 34.7 MB/s 30.2 ns/byte
Memory - 7.0 MB/s 150.3 ns/byte

Man those numbers seem pretty bad, might have to check out the BIOS settings or something.
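The us/KB table and the ns/byte figures can be cross-checked against each other. A quick sketch, assuming 1 KB means 1024 bytes here (the function name is made up for illustration):

```python
def us_per_kb_to_ns_per_byte(us_per_kb):
    """Convert microseconds per KB to nanoseconds per byte (1 KB = 1024 bytes)."""
    return us_per_kb * 1000.0 / 1024.0

# (us/KB from the table, ns/byte from the speed lines), as posted above.
posted = {"L1": (16, 15.4), "L2": (32, 30.2), "RAM": (158, 150.3)}

for level, (us_kb, reported_ns) in posted.items():
    derived_ns = us_per_kb_to_ns_per_byte(us_kb)
    diff = (derived_ns - reported_ns) / reported_ns * 100.0
    print(f"{level}: {derived_ns:.1f} ns/byte derived vs {reported_ns} reported "
          f"({diff:+.1f}%)")
```

The two sets of figures agree to within a few percent, so the 158 us/KB memory number doesn't look like a typo.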

Reply 12 of 14, by jakethompson1

Rank Oldbie
chublord wrote on 2020-07-21, 21:14:
jakethompson1 wrote on 2020-07-21, 18:34:

Those of you with 486DX2/66 or 486DX2/80 and 256KB cache - what are your cachechk timings?

This isn't exactly what you asked, but it might provide a point of reference at least - 486DX4/100 w/ 50 MHz FSB and 256KB cache (20ns):
10 us/KB for 1-8K
11 us/KB for 16K
22 us/KB for 32-256K
33 us/KB for >256K

Additional tests
L1 cache speed - 103.4 MB/s 10.1 ns/byte
L2 cache speed - 49.4 MB/s 21.2 ns/byte
Memory speed - 32.9 MB/s 31.9 ns/byte

I suspect my timings are super slow. Opti 802C chipset. Edit: I'm using 20 ns cache chips and 70ns memory chips.

Interesting; your (larger) L1 cache is a bit faster as expected, L2 is a little slower, and memory is a little faster.
One thing about your chipset is that it's actually documented. Page 15 of this (https://datasheetspdf.com/pdf-file/523501/Opti/82C802G/1) starts explaining how the cache works.
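For reference, here is the side-by-side comparison behind the remark above, as a small sketch. The figures are the cachechk us/KB values posted in this thread, and the dictionary and function names are just for illustration.

```python
# cachechk read times in us/KB as posted in this thread.
mine = {"L1": 16, "L2": 20, "RAM": 37}    # DX2-80, 40 MHz bus, 3-2-2-2 / 1 WS
yours = {"L1": 10, "L2": 22, "RAM": 33}   # DX4-100, 50 MHz bus

def relative_speed(a_us, b_us):
    """How many times faster b is than a; above 1.0 means b is faster."""
    return a_us / b_us

for level in ("L1", "L2", "RAM"):
    ratio = relative_speed(mine[level], yours[level])
    verdict = "faster" if ratio > 1 else "slower"
    print(f"{level}: yours runs at {ratio:.2f}x the speed of mine ({verdict})")
```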

Reply 13 of 14, by pentiumspeed

Rank l33t

With the bus at 40MHz and a CPU this quick (DX2-80), faster timings are not possible. Reset the BIOS settings to default and be done with this, since this board was not designed properly in the first place.

Find another quality board that supports DX2-80.

Cheers,

Great Northern aka Canada.

Reply 14 of 14, by jakethompson1

Rank Oldbie
pentiumspeed wrote on 2020-07-22, 02:39:

bus frequency at 40MHz and CPU so quick (DX2-80), means faster timing not possible. Reset the bios setting to default and be done with this since this board is not designed properly in first place.

Find another quality board that supports DX2-80.

Cheers,

Well, considering the two other sets of timings posted so far, it's actually competitive at 3-2-2-2 read & 1 WS write, so I'm thinking I'll just live with that, especially since it's still faster than any of the settings at 33 MHz.

If I get another board it'll be for an Am5x86-133. Hoping for an MB-8433UUD or another UM8881F board w/PS/2 mouse to show up on eBay sometime. Looks like only two in the past six months.