486 board with UMC 8881E/8886B: The winner is: EDO without L2 (if your only other option is L2 at 3-2-2-2) \ VOGONS

First post, by mkarcher

Posted on 2023-03-12, 22:25

mkarcher Offline

Rank l33t

Rank: l33t
Posts: 3801
Joined: 2019-01-19, 16:29
Location: Germany

A well-known sales pitch from the mid-90s is "we use the new modern EDO RAM, so we don't need to add L2 cache", countered by "we use EDO RAM and L2 cache for even better performance". Let's perform a Quake 320x200 shootout at 120MHz on a Cyrix 5x86.

I'm using a Biostar MB-8433UUD with the latest UMC 8881/8886 chipset revision. To boost performance slightly, RAM is set to "slow refresh".
The processor is a Cyrix 5x86 (rated 100) at 120MHz, 4.0V (the Biostar board doesn't have a 3.6V setting, and I haven't bodged one in yet), PCR0=02, CCR2=D6, CCR3=1C, CCR4=38, with a cooling fan added to the original Cyrix heatsink. The register settings mean: branch prediction enabled, but loop optimization and return stack disabled; linear burst and burst writes enabled; "fast FPU" enabled; load/store serialization disabled.

These are the results I obtained. The columns mean

CPU: CPU clock of 120MHz as 2*60 or 3*40. My 5x86 doesn't have a 4x multiplier, and my board doesn't provide 30MHz FSB, so no benchmarks for 4*30. That 5x86 doesn't work at 2*66 (133MHz).
L2: amount of L2 cache installed (either chips removed from the board, or 256KB of cache in two banks; the chips are nine 15ns 32k x 8 parts)
RAM: amount of RAM installed. SIMMs were 32MB modules. EDO modules are single-sided(!) 32MB modules with 4 chips, FPM modules are double-sided 32MB modules with 16 chips.
SPD: speed of the slowest SIMM in nanoseconds
WS: RAM wait states as configured in the BIOS. (read WS/write WS)
RAM burst: 3-1-1-1 or 4-2-2-2 as configured in the BIOS if EDO mode is active, or FPM in FPM mode.
Quake: Score in fps from timedemo 1 at standard resolution (dosbench option c) with a Trio64V+ PCI graphics card. PCI clock is 40MHz at 3*40, and 30MHz at 2*60.
SPSYS: Speedsys CPU score.
remaining columns: Memory throughput as measured by speedsys (when you press M or write a report file).

CPU   L2             RAM  SPD WS  RAM burst   Quake SPSYS L1R   L1W   L1M   L2R   L2W   L2M   MR    MW    MM
2*60  none    off    96MB  60 0/0  4-2-2-2    17.5  68.18 225.2 114.7 218.7                    76.4 114.6  31.7
2*60  none    off    64MB  60 2/1  FPM        15.5  68.18 222.3  76.4 214.6                    45.8  76.3  20.5
2*60  256K/2B 3-2 WB 96MB  50 1/0  4-2-2-2    17.1  68.09 226.6  76.6 211.2  83.7  76.3  38.4  54.0  76.5  18.8
2*60  256K/2B 3-2 WT 96MB  50 1/0  4-2-2-2    17.2        226.6  76.6 219.7  83.7  75.3  35.3  54.0  76.5  24.2
2*60  256K/2B 3-2 WB 32MB  60 1/0  FPM        16.9  68.68 226.4  76.6 210.7  83.7  76.3  38.4  57.3  76.5  18.8
2*60  256K/2B 3-2 WT 32MB  60 1/0  FPM        17.2  68.58 226.6  76.6 219.7  83.7  76.3  35.3  57.3  76.5  24.6
3*40  none    off    32MB  50 0/0  3-1-1-1    17.0  68.45 225.4  76.5 218.7                    76.3  76.4  27.8
3*40  256K/2B off    32MB  50 0/0  4-2-2-2    16.0        222.4  76.5 216.6                    50.9  76.4  23.5
3*40  256K/2B off    64MB  60 0/0  FPM        16.3  68.55 223.1  76.5 217.1                    55.5  76.4  24.5
3*40  256K/2B 2-1 WB 32MB  60 0/0  4-2-2-2    16.8  68.51 227.1  76.5 216.0  87.6  76.2  40.1  50.9  76.4  18.0
3*40  256K/2B 2-1 WB 96MB  60 0/0  4-2-2-2    17.0    -- same, but partially uncached --
3*40  256K/2B 2-1 WT 32MB  60 0/0  4-2-2-2    17.1  68.46 227.1  76.5 220.1  87.6  76.2  36.0  50.9  76.4  24.5
3*40  256K/2B 2-1 WT 96MB  60 0/0  4-2-2-2    17.1    -- same as above, all memory Quake uses is cached --
3*40  256K/2B 2-1 WB 32MB  60 0/0  FPM        16.8  68.58 227.1  76.5 216.0  87.7  76.2  40.1  50.9  76.4  18.0
3*40  256K/2B 2-1 WB 64MB  60 0/0  FPM        17.0    -- same, but partially uncached --
3*40  256K/2B 2-1 WT 32MB  60 0/0  FPM        17.1  68.51 227.1  76.4 220.1  87.7  76.2  36.0  50.8  76.4  24.5

This table shows some interesting insights:

The highest Quake score is achieved at FSB60 with no L2 cache. I didn't test whether L2 chips inserted but disabled will work at 0/0 WS.
Quake does not like write-back cache at all. It's generally slower in WB mode than in WT mode. In WB mode, Quake gets faster if I install 64MB of RAM with the second half uncached, compared to 32MB fully cached. Quake is even faster in write-through mode. DOS 6 (used as OS for these tests) doesn't use more than 64MB of RAM, so installing 96MB just shows that the system is stable with that much RAM installed, but exceeding the cacheable area is not represented in these benchmark scores.
EDO at its slow burst (4-2-2-2) with 0WS/0WS is slower than FPM RAM in FPM mode.
EDO at its fast burst (3-1-1-1) performs remarkably well, but this speed barely works at 40MHz (only if just a single 50ns SIMM is installed, and L2 cache is physically removed). This configuration is likely meant for 33MHz FSB maximum.
Setting the cache leadoff cycle to 3 clocks limits RAM performance. This is not surprising, as a tag lookup is required for all memory cycles. This means not using the cache can be helpful if you have fast RAM and need 3 cycles for a tag lookup.
None of the FPM modules I used is able to provide good performance at FSB60. I can run 60ns EDOs in "EDO mode" at 0/0, but I need to run 60ns FPM modules in "FPM mode" at 2/1.

So, for Quake, optimal performance is obtained with EDO DRAMs and no L2 cache installed or active.
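
For readers unfamiliar with the burst notation in the table, here is a minimal sketch of how an "x-y-y-y" pattern translates into a theoretical peak read rate. It assumes (this is not stated in the post) that the BIOS labels are literal bus clocks per transfer, that a 486 burst moves four 4-byte transfers, and that the separately configured wait states add nothing on top. The function names are made up for illustration, and the results are upper bounds, not predictions of the Speedsys columns.

```python
# Theoretical peak burst-read rate for an "x-y-y-y" burst pattern.
# Assumptions (not from the post): one burst = 4 transfers of 4 bytes,
# and the BIOS wait-state setting adds no extra clocks.

def burst_clocks(pattern: str) -> int:
    """Total bus clocks for one burst, e.g. '3-1-1-1' -> 6."""
    return sum(int(part) for part in pattern.split("-"))

def peak_mb_per_s(pattern: str, fsb_mhz: float, bytes_per_burst: int = 16) -> float:
    """Peak rate in 10^6 bytes/s: bytes per burst divided by burst duration."""
    return bytes_per_burst * fsb_mhz / burst_clocks(pattern)

for pattern, fsb in [("3-1-1-1", 40), ("4-2-2-2", 40), ("4-2-2-2", 60)]:
    print(f"{pattern} @ {fsb} MHz: {peak_mb_per_s(pattern, fsb):.1f} MB/s peak")
```

The point of the sketch is only the ratio: at the same FSB, a 6-clock burst has two thirds more headroom than a 10-clock burst.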

I intend to automate these tests, possibly including more numbers. In the long run, I want to continue testing "overly new graphics cards" like the Radeon 9250 or even the GeForce FX 5200 with this platform. If you have suggestions for benchmarks to include, preferably ones that can be run in an automated way, feel free to mention them.

Last edited by mkarcher on 2023-06-10, 22:22. Edited 1 time in total.

Reply 1 of 170, by jakethompson1

Posted on 2023-03-12, 22:37

jakethompson1 Offline

Rank l33t

Rank: l33t
Posts: 2091
Joined: 2015-11-17, 04:16

Which BIOS revision is this?

Reply 2 of 170, by mkarcher

Posted on 2023-03-12, 22:50

jakethompson1 wrote on 2023-03-12, 22:37:

Which BIOS revision is this?

The ID string is 05/20/96-UMC-881E/886B-2A4X5B08C-00; on the other hand, the custom message above the CPU type reads "BIOSTAR MB-8433UUD v2020", so this is most likely a modded version from 2020. It might even be a patch I performed myself; I don't remember any details I needed to patch on that board - but I am confident that I didn't mess around with memory type and timings, so that part should be identical to the 05/20/96 version.

Reply 3 of 170, by rasz_pl

Posted on 2023-03-12, 23:07

rasz_pl Offline

Rank l33t

Rank: l33t
Posts: 4449
Joined: 2017-06-04, 00:57

Quake on a 486 is not a good benchmark, not to mention that Quake is a bit of an outlier when it comes to L2: even on the 430FX, SRAM cache gives worse scores than no cache, and only PB cache provides any boost. https://dependency-injection.com/intel-430fx- … riton-l2-cache/

https://github.com/raszpl/sigrok-disk FM/MFM/RLL decoder
https://github.com/raszpl/FIC-486-GAC-2-Cache-Module (AT&T Globalyst)
https://github.com/raszpl/386RC-16 ram board
https://github.com/raszpl/440BX Reference Design adapted to Kicad

Reply 4 of 170, by BitWrangler

Posted on 2023-03-12, 23:29

BitWrangler Offline

Rank l33t++

Rank: l33t++
Posts: 8669
Joined: 2017-10-11, 00:55
Location: Ontario

Thanks for sharing your results... hmmm maybe I can stop looking for the M912 cache module now 🤣 (I own one, just not sure where it is) ... good timing too, been feeling an "ultimate 486" bash of my own coming on while I know where 1 of my m912s, BEK V439, Asus 486sv2gx4 and a POD and Cyrix 5x86 are.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 5 of 170, by mockingbird

Posted on 2023-03-13, 00:07

mockingbird Offline

Rank Oldbie

Rank: Oldbie
Posts: 1535
Joined: 2013-06-17, 02:57

Fascinating tests... So far I am finding PCI 8881E/8886B (HOT-433) a lot more stable than the VLB VL/I-486SV2GX4...

(Decommissioned:)

Reply 6 of 170, by jakethompson1

Posted on 2023-03-13, 03:07

mkarcher wrote on 2023-03-12, 22:50:

jakethompson1 wrote on 2023-03-12, 22:37:

Which BIOS revision is this?

The ID string is 05/20/96-UMC-881E/886B-2A4X5B08C-00; on the other hand, the custom message above the CPU type reads "BIOSTAR MB-8433UUD v2020", so this is most likely a modded version from 2020. It might even be a patch I performed myself; I don't remember any details I needed to patch on that board - but I am confident that I didn't mess around with memory type and timings, so that part should be identical to the 05/20/96 version.

Oh, that's from feipoa from right here on vogons. With a stock BIOS, I think some of those settings you tweaked would not have been available.

Reply 7 of 170, by mkarcher

Posted on 2023-03-13, 20:35

rasz_pl wrote on 2023-03-12, 23:07:

Quake on a 486 is not a good benchmark, not to mention that Quake is a bit of an outlier when it comes to L2: even on the 430FX, SRAM cache gives worse scores than no cache, and only PB cache provides any boost. https://dependency-injection.com/intel-430fx- … riton-l2-cache/

Firstly, I frankly admit that the title is slightly click-baity, which I tried to counter using "terms and conditions apply".

It's quite obvious that Quake is mostly CPU bound. If we exclude the obviously bad configurations (not using a cache at 40MHz or using uncached non-EDO at FSB60), the scores range from 16.8 to 17.5 fps, which is less than 5% difference. Furthermore, the fact that Quake profits from hitting uncached memory in L2 WB mode clearly shows that Quake's memory access pattern is not very cache friendly. That's why I avoided drawing general conclusions from the Quake scores in the post. The general statements are based on the raw speedsys memory benchmark scores, and the interpretations of the Quake scores are clearly worded as applying to Quake only.
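
As a quick check of the "less than 5%" figure, here is the arithmetic spelled out. The two fps values are the ones quoted in the paragraph above; the helper name is just for illustration.

```python
# Relative spread between the slowest and fastest "sensible" configuration.

def relative_spread_percent(low: float, high: float) -> float:
    """Difference between high and low, expressed as a percentage of low."""
    return (high - low) / low * 100.0

spread = relative_spread_percent(16.8, 17.5)
print(f"Quake spread: {spread:.1f}% of the lowest score")
```

Measured against the highest score instead of the lowest, the percentage is slightly smaller, so the claim holds either way.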

And that's also why I was asking for different benchmark suggestions. As I intend to do a clean scientific re-run of the benchmarks with all parameters controlled and checked, I can add benchmarks that capture "real-world" performance better than Quake. The reason I used Quake is that I was trying to show the power of the Cyrix 5x86 by hitting, at 120MHz, the 18 fps barrier that AMD 5x86 users are constantly fighting at 160MHz (AFAIK 18 fps was beaten at 3*60 on a 5x86).

The test is not intended to be rigged towards "EDO beats FPM", but the fact that I was using 15ns cache (which is on the slow side) and 60ns EDO modules (which are on the fast side) is not that fair. I doubt that 10ns cache would allow tighter timing at FSB60, but it's not impossible. Actually, I have a bunch of 128K x 8 10ns SMD chips that I ordered some time ago, and I intend to build an "adapter PCB" to add them to my Biostar board, upgrading the cache to 1MB at the same time. IIRC there is (or was?) a thread by feipoa detailing where to get all the address lines required for 1MB L2 cache.

I still need to find out why the PCI 2:3 divider doesn't work reliably. I get no issues using 1:1 at 40MHz FSB clock, but I get glitches (e.g. stuck characters when scrolling in DOS) using 2:3 at 60MHz FSB clock. Both settings should result in 40 MHz PCI clock. There might be chipset limitations on using the 2:3 divider. Too bad there are no public UMC datasheets.
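
To make the divider arithmetic explicit, here is a small sketch. The 2:3 and 1:1 cases are the ones named above; the 1:2 case is my inference from the 30MHz PCI clock quoted for 2*60 in the first post, not a setting confirmed anywhere in the thread. The function name is hypothetical.

```python
from fractions import Fraction

def pci_clock_mhz(fsb_mhz: int, divider: Fraction) -> float:
    """PCI clock for a given FSB clock and PCI:FSB ratio."""
    return float(fsb_mhz * divider)

cases = [
    (40, Fraction(1, 1)),  # 1:1 at 40MHz FSB, works without issues
    (60, Fraction(2, 3)),  # 2:3 at 60MHz FSB, glitchy
    (60, Fraction(1, 2)),  # assumed 1:2, matching the 30MHz PCI figure at 2*60
]
for fsb, div in cases:
    print(f"FSB {fsb} MHz, divider {div}: PCI {pci_clock_mhz(fsb, div):.1f} MHz")
```

Both of the first two cases come out at the same 40MHz, which is why the glitches point at the divider itself rather than at the PCI clock rate.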

Another hint I figured out during the tests: you don't get reliable operation if L2 is disabled, RAM is set to 0 read WS and L2 cache leadoff is set to three clocks. This will inevitably cause freezes during POST, at least at FSB60. Setting the seemingly irrelevant L2 burst leadoff time to 2 clocks makes the thing work.

Reply 8 of 170, by mkarcher

Posted on 2023-03-13, 20:40

mockingbird wrote on 2023-03-13, 00:07:

Fascinating tests... So far I am finding PCI 8881E/8886B (HOT-433) a lot more stable than the VLB VL/I-486SV2GX4...

That's comparing apples and oranges. The 8881E and 8886B are the last revision of the UMC 8881/8886 chipset, which is likely years newer than the SiS 471 on the VL/I-486SV2GX4. The SiS competitor to the UMC8881/8886 is the SiS 496/497. Again, only the latest revision of that chipset has EDO support. You can get VL-capable boards with both the SiS 496/497 (like the Lucky Star LS486E, the Soyo 4SA(W)2, the Asus PVI-486SP3) and the UMC8881/8886 (like the Gigabyte GA-486IM).

Reply 9 of 170, by mkarcher

Posted on 2023-03-13, 20:46

jakethompson1 wrote on 2023-03-13, 03:07:

Oh, that's from feipoa from right here on vogons. With a stock BIOS, I think some of those settings you tweaked would not have been available.

Yeah, that BIOS being an adaption by feipoa makes a lot of sense. This explains why I don't remember creating that version myself.

Reply 10 of 170, by mockingbird

Posted on 2023-03-13, 21:14

mkarcher wrote on 2023-03-13, 20:40:

That's comparing apples and oranges. The 8881E and 8886B are the last revision of the UMC 8881/8886 chipset, which is likely years newer than the SiS 471 on the VL/I-486SV2GX4. The SiS competitor to the UMC8881/8886 is the SiS 496/497. Again, only the latest revision of that chipset has EDO support. You can get VL-capable boards with both the SiS 496/497 (like the Lucky Star LS486E, the Soyo 4SA(W)2, the Asus PVI-486SP3) and the UMC8881/8886 (like the Gigabyte GA-486IM).

Nothing wrong with the Asus and the 471 VLB, except that it was not Doom stable, no matter what setting in the BIOS, or however slow the memory timings were, or if the cache was set to WT or WB. Almost positive it's the VLB CL-GD5428 causing the issue, and one of these days I might build my Trio64+ VLB card and test with that (it supports 0WS). But this is for a different topic.

Reply 11 of 170, by CoffeeOne

Posted on 2023-03-13, 22:00

CoffeeOne Offline

Rank Oldbie

Rank: Oldbie
Posts: 1167
Joined: 2019-12-25, 16:12
Location: Austria

mkarcher wrote on 2023-03-13, 20:35:

.....
The reason I used Quake is that I was trying to hit the 18 fps barrier the AMD 5x86 users are constantly fighting with 160MHz (AFAIK 18 fps was beaten at 3*60 on an 5x86) at 120MHz with the Cyrix 5x86 to show its ....

Only a small comment: with an Ark Logic VLB and Am5x86 @160 I have 18.3 fps in Quake 😉 with the VL/I-486SV2GX4
User pshipkov even reported 18.8fps with the same configuration, but I don't know how he did that 😁

Reply 12 of 170, by feipoa

Posted on 2023-03-14, 10:00

feipoa Offline

Rank l33t++

Rank: l33t++
Posts: 10472
Joined: 2011-03-07, 13:54
Location: Canada

At 120 MHz, are you able to get 4.0 V out of your VRM on the 8433UUD? Did you measure it? I don't think it can go much above 3.85 V at full load with the cx5x86. At 133 MHz, 2x66, it cannot go much above 3.75 V at full CPU load.

The only BIOS mods for this board I did were v2012 and v2014. v2012 contains the EDO settings for 4-2-2-2 or 3-1-1-1. When I briefly tested this w/EDO, I saw no performance difference, so I re-hid that EDO speed option and put out v2014. If you see a v2020, it was from someone else. They probably unhid the EDO speed option again.

Having to physically remove my L2 is difficult, but it sounds like if someone wants to use EDO on 3-1-1-1 at 40 MHz, that's what they have to do. The 50 ns module requirement is difficult. Did you try a 64 MB EDO 50 ns module at 40 MHz? It also sounds like if you want to use cache, stick with FPM.

It sounds like you need better FPM modules. At 66 MHz FSB, w/IBM 5x86c, I am able to use 64 MB/60 ns FPM with 1024K cache on 3-2-2-2 and DRAM read/write at 1ws/0ws. If I increase the RAM to 128 MB, I need to do 2ws/0ws. Note that most modules cannot cope with 1ws/0ws at 66 MHz FSB and require 2ws/0ws. I managed to find a magic set. I get 19.8 fps in DOS Quake at 133 MHz. I think it dropped to 19.2 with 256K, but I'd have to double check that. PCI is set at 33 MHz.

Details on the 1024K cache mode are found in my BIOSTAR MB-8433UUD manual ver. 2. Should be located in my world's fastest 486 thread.

Plan your life wisely, you'll be dead before you know it.

Reply 13 of 170, by Disruptor

Posted on 2023-03-14, 11:41

Disruptor Offline

Rank Oldbie

Rank: Oldbie
Posts: 1921
Joined: 2018-03-22, 18:31
Location: European Union

We have an AMD 486 DX4 SV8B 120 too that we test in 60*2. It seems to be slightly more tolerant in timings.

feipoa wrote on 2023-03-14, 10:00:

The 50 ns module requirement is difficult. Did you try a 64 MB EDO 50 ns module at 40 MHz?
...
It sounds like you need better FPM modules.

I'm sorry, we don't have that many 72 pin DRAM modules.
But you're right, we need better FPM modules. Our 64 MB FPM modules have 36 chips.

Reply 14 of 170, by mkarcher

Posted on 2023-03-14, 19:24

feipoa wrote on 2023-03-14, 10:00:

At 120 MHz, are you able to get 4.0 V out of your VRM on the 8433UUD? Did you measure it? I don't think it can go much above 3.85 V at full load with the cx5x86. At 133 MHz, 2x66, it cannot go much above 3.75 V at full CPU load.

I'm going to measure that. That post was just quoting the nominal voltage from the 8433UUD jumpering scheme. On my copy of the 8433UUD, the voltage regulator is a SHARP PQ30RV21. This regulator is specified for 1.5W dissipation without heatsink. At 4V, this would allow up to 1.5A current draw. The Cx5x86-120 is specified at a peak power of 4.8W at 3.3V, which is 1.45A, so 1.5W might actually be exceeded at 4V target voltage, kicking in the "overtemperature protection", which reduces the output voltage by a significant amount, such that the dissipated power no longer causes an unacceptable chip temperature. As the voltage dropped across the regulator increases when the output voltage drops, reaching a stable operating point may require lowering the output voltage significantly. I'm going to scope the output voltage to observe the behaviour.
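
The dissipation estimate can be written down as a short sketch. It treats the regulator as an ideal linear regulator fed directly from 5.0V, so dissipation is (Vin - Vout) * I. The step that scales the CPU current linearly with core voltage is my own rough assumption for illustration; it is not a datasheet figure.

```python
# Linear regulator dissipation: P = (V_in - V_out) * I.
# Assumption: the regulator input sits at a full 5.0V.

def max_current_a(v_in: float, v_out: float, p_limit_w: float) -> float:
    """Largest load current that keeps dissipation at or below p_limit_w."""
    return p_limit_w / (v_in - v_out)

def dissipation_w(v_in: float, v_out: float, current_a: float) -> float:
    """Power burned in the regulator at a given load current."""
    return (v_in - v_out) * current_a

V_IN, V_OUT, P_LIMIT = 5.0, 4.0, 1.5

i_limit = max_current_a(V_IN, V_OUT, P_LIMIT)   # 1.5A, as stated above
i_cpu_3v3 = 4.8 / 3.3                           # peak current at 3.3V
i_cpu_4v0 = i_cpu_3v3 * V_OUT / 3.3             # ASSUMPTION: current scales with voltage
p_reg = dissipation_w(V_IN, V_OUT, i_cpu_4v0)

print(f"current limit at 4.0V out: {i_limit:.2f} A")
print(f"CPU peak current at 3.3V:  {i_cpu_3v3:.2f} A")
print(f"estimated at 4.0V:         {i_cpu_4v0:.2f} A -> {p_reg:.2f} W in the regulator")
```

Under that scaling assumption, the estimate lands above the 1.5W rating, which is the scenario described above.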

I got stable operation at 4*30MHz at the 3.6V setting of a HOT-433A, so 3.8 or 3.7V should be fine and would actually be less stressful on the CPU; for my use case, 3.7V wouldn't be bad.

feipoa wrote on 2023-03-14, 10:00:

The only BIOS mods for this board I did were v2012 and v2014. v2012 contains the EDO settings for 4-2-2-2 or 3-1-1-1. When I briefly tested this w/EDO, I saw no performance difference, so I re-hid that EDO speed option and put out v2014. If you see a v2020, it was from someone else. They probably unhid the EDO speed option again.

I only got significant performance differences with 3-1-1-1 compared to 4-2-2-2 when L2 cache was disabled, but when it worked, the speedsys RAM performance was amazing. I possibly tripped over a chipset quirk when I noted that you need to physically remove the L2 cache at FSB40, so I need to confirm that claim: the 8881E doesn't like the combination L2 off / cache burst leadoff 3 / 0 read WS RAM. You need to configure the leadoff to 2 clocks for L2 off / 0WS read.

feipoa wrote on 2023-03-14, 10:00:

Having to physically remove my L2 is difficult, but it sounds like if someone wants to use EDO on 3-1-1-1 at 40 MHz, that's what they have to do. The 50 ns module requirement is difficult. Did you try a 64 MB EDO 50 ns module at 40 MHz? It also sounds like if you want to use cache, stick with FPM.

I don't have 64MB EDO modules at hand, but plenty of 32MB EDO modules. Most of them are built from 16 chips of 4M x 4 at 60ns, which put a lot of load on the address lines. I have 4 modules using 4 chips of 8M x 8, three of them 50ns, one of them 60ns. One of the three 50ns modules exhibits a clearly located bit error in memtest, so I'm not going to use that in a general-purpose computer. It might be usable with Linux using a badram kernel parameter, and AFAIK there also is some Windows NT series hack to exclude memory areas.

feipoa wrote on 2023-03-14, 10:00:

It sounds like you need better FPM modules. At 66 MHz FSB, w/IBM 5x86c, I am able to use 64 MB/60 ns FPM with 1024K cache on 3-2-2-2 and DRAM read/write at 1ws/0ws.

It seems you must not compare memory timings with and without L2 cache enabled. My run-of-the-mill 16-chip 60ns 32MB modules work fine (at least with only one of them installed) at FSB60 1/0 if L2 is enabled, but they require 2/1 if L2 is disabled. The chipset seems to start memory addressing at the same time as it starts tag lookup. If it needs to wait for the tag compare result to continue the cycle, this seems to effectively add some margin to the addressing timing. I found that the number of chips is definitely the limiting factor for fast memory access, so obviously the address drivers in the north bridge have limited drive capability, or there are series termination resistors on the board that form an R/C lowpass filter.

feipoa wrote on 2023-03-14, 10:00:

Details on the 1024K cache mode are found in my BIOSTAR MB-8433UUD manual ver. 2. Should be located in my world's fastest 486 thread.

Thanks for the pointer! I searched for your manual the last days without success, and was afraid that you felt the need to remove that post for some reason. It's in the initial post of that thread, I found it. I'm unsure whether I skipped that thread or just missed the link because the photos are below it.

Reply 15 of 170, by mkarcher

Posted on 2023-03-14, 21:33

mkarcher wrote on 2023-03-14, 19:24:

feipoa wrote on 2023-03-14, 10:00:

At 120 MHz, are you able to get 4.0 V out of your VRM on the 8433UUD? Did you measure it? I don't think it can go much above 3.85 V at full load with the cx5x86. At 133 MHz, 2x66, it cannot go much above 3.75 V at full CPU load.

On my copy of the 8433UUD, the voltage regulator is a SHARP PQ30RV21. This regulator is specified for 1.5W dissipation without heatsink. At 4V, this would allow up to 1.5A current draw. The Cx5x86-120 is specified at a peak power of 4.8W at 3.3V, which is 1.45A, so 1.5W might actually be exceeded at 4V target voltage, kicking in the "overtemperature protection", which reduces the output voltage by a significant amount, such that the dissipated power no longer causes an unacceptable chip temperature.

Nice theory about the overtemperature protection. The real cause seems to be much simpler: the +5V line is not directly connected to the regulator chip (U24); instead, there is a series diode (D15), a 1N5400, which is a 3A silicon rectifier. At 1.5A, you might get a 0.7V drop across that diode. The regulator chip is specified at 0.5V dropout, so there is no way to obtain 4V at full load. D15 could be replaced by a 5V Schottky barrier diode, like the SB540, which should bring the drop down to 0.4V, yielding 0.3V extra headroom.
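
The headroom argument boils down to one subtraction, sketched here with the figures from this post. It assumes the 5V rail sits at exactly 5.0V under load, which a real AT supply will not guarantee, and it takes the quoted diode drops at face value. The function name is made up for illustration.

```python
# Highest regulated output the VRM can hold:
# rail voltage minus series diode drop minus regulator dropout.

def max_regulated_v(v_rail: float, diode_drop_v: float, dropout_v: float) -> float:
    return v_rail - diode_drop_v - dropout_v

V_RAIL = 5.0      # ASSUMPTION: ideal 5V rail under load
DROPOUT = 0.5     # regulator dropout as quoted above

with_1n5400 = max_regulated_v(V_RAIL, 0.7, DROPOUT)    # silicon rectifier
with_schottky = max_regulated_v(V_RAIL, 0.4, DROPOUT)  # Schottky replacement

print(f"with 1N5400:   {with_1n5400:.1f} V max")
print(f"with Schottky: {with_schottky:.1f} V max")
print(f"gain:          {with_schottky - with_1n5400:.1f} V")
```

With the silicon diode, the ceiling comes out below the 4.0V jumper setting, consistent with the observation that 4V cannot be reached at full load.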

mkarcher wrote on 2023-03-14, 19:24:

feipoa wrote on 2023-03-14, 10:00:

It sounds like you need better FPM modules. At 66 MHz FSB, w/IBM 5x86c, I am able to use 64 MB/60 ns FPM with 1024K cache on 3-2-2-2 and DRAM read/write at 1ws/0ws.

or there are series termination resistors on the board that form an R/C lowpass filter.

Of course there are series termination resistors:

10 ohms on all 12 address lines
47 ohms on all CAS lines (each CAS line shared between two slots)
22 ohms on each pair of RAS lines (each RAS line pair shared between two slots)

On a single SIMM, you have each CAS line connected to a quarter of the chips, a RAS pair connected to half of the chips and the address lines connected to all of the chips. The result is that (at least for a single SIMM) the R/C time constant is approximately the same on all lines. This actually makes a lot of sense from an electronic design perspective. If multiple slots are populated, though, all address lines get increased load, but only some of the RAS or CAS lines get increased load. This advances (some of) RAS/CAS in relation to the address lines, although I don't know whether the timing advance is significant. I wonder whether using separate termination resistors per slot could improve things.

So, let's do the calculation. The datasheet for a Siemens HYB4116405B quotes max 5pF input capacitance on the address inputs, and max 7pF input capacitance on /RAS and /CAS. A single SIMM with 16 of those chips poses a load of max 80pF address capacitance, max 56pF RAS capacitance and max 28pF CAS capacitance. This results in a time constant of 0.8ns for the address lines, around 1.2ns for the RAS lines and 1.3ns for the CAS lines. The actual capacitance is likely lower than the maximum capacitance specified in the datasheet, but there also is capacitance of the PCB, so assuming around 1ns delay per installed module sounds plausible. This shouldn't be that significant at a clock period of 15ns or 17ns, so the assumption that the timing is determined by the R/C effect alone is likely wrong, and the drive capability of the chipset is also a limiting factor. To change the level of an 80pF capacitance from 0.4V (a typical low level) to 3.5V (a typical high level) in 3ns, you need around 80 milliamperes. This is a very high current for digital outputs. Typical "high drive" outputs as I know them are specified to drive 20 to 30mA. So the time needed to charge the address lines using the available drive currents is likely a contributing factor to the RAM speed limit with multiple modules.
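
For anyone who wants to replay the numbers, here is the same calculation as a short script. The resistor values, per-pin capacitances and chip counts are the ones given above; the drive current uses I = C * dV / dt, which treats the edge as a constant-current ramp (a simplification).

```python
# R/C time constants and required drive current for one 16-chip SIMM.

PF = 1e-12
NS = 1e-9
CHIPS = 16

c_addr = CHIPS * 5 * PF          # address lines reach all 16 chips -> 80 pF
c_ras = (CHIPS // 2) * 7 * PF    # a RAS pair reaches half the chips -> 56 pF
c_cas = (CHIPS // 4) * 7 * PF    # a CAS line reaches a quarter      -> 28 pF

tau_addr_ns = 10 * c_addr / NS   # 10 ohm series resistor
tau_ras_ns = 22 * c_ras / NS     # 22 ohm series resistor
tau_cas_ns = 47 * c_cas / NS     # 47 ohm series resistor

# Current needed to swing the address load from 0.4V to 3.5V in 3ns.
drive_ma = c_addr * (3.5 - 0.4) / (3 * NS) * 1000

print(f"tau address: {tau_addr_ns:.2f} ns")
print(f"tau RAS:     {tau_ras_ns:.2f} ns")
print(f"tau CAS:     {tau_cas_ns:.2f} ns")
print(f"drive current for a 3 ns edge: {drive_ma:.0f} mA")
```

Changing `CHIPS` shows how quickly the address load shrinks on modules with fewer chips.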

Reply 16 of 170, by majestyk

Posted on 2023-03-15, 07:46

majestyk Offline

Rank Oldbie

Rank: Oldbie
Posts: 1547
Joined: 2020-12-04, 09:13

Voltage drop at the 1N5400 might be even higher @ 1.5A:

The attachment 54xx_drop.JPG is no longer available

Reply 17 of 170, by feipoa

Posted on 2023-03-15, 09:49

This certainly helps explain why I've always had the best luck with a single stick of memory, with 8 chips.

I never expected there to be a diode in series like this. It explains a lot of my previous observations. Do you know why this diode is connected in series between VCC5 and VRM's VIN? The standard connection diagram for the Sharp PQ30RV21 shows a diode, D1, being connected between VOUT and VIN, not in series like on the motherboard. Perhaps adding this diode eliminated the need for Cin? Notice the empty capacitor vias right next to D15.

Unfortunately, I don't have any 3 A Schottky diodes to swap in place of D15.

However, on the motherboard which I use in my IBM 5x86c-133/2x system, I did replace the Sharp PQ30RV21 with PQ30RV31.

On ceramic IBM 5x86c-100HF chips, I noticed to get to 133 MHz, more voltage was necessary - 3.85 V in this case, whereas my QFP IBM 5x86c-100HF only needs 3.73 V.

Will you be playing with the termination resistors in an attempt to optimise for 2 SIMM sockets filled?

I will have to play with this EDO DRAM option again at some point. However, it does seem unlikely that I can use 3-1-1-1 w/66 MHz even with L2 removed. I only want to run 64 MB or greater on my system.

By the way, the 1024K cache mod can also be adjusted such that the cache is swappable, between 256K double-banked, 512K double-banked, or 1024K double-banked. I just never got around to updating the manual for this board. I'll attach some photos of the jumper configuration I came up with.

The attachment IMG_9809.JPG is no longer available

The attachment IMG_9810.JPG is no longer available

The attachment IMG_9811.JPG is no longer available

The attachment IMG_9812.JPG is no longer available

The attachment IMG_9813.JPG is no longer available

Reply 18 of 170, by feipoa

Posted on 2023-03-15, 09:49

The rest of the photos.

The attachment IMG_9814.JPG is no longer available

The attachment IMG_9815.JPG is no longer available

The attachment IMG_9816.JPG is no longer available

Might seem a bit messy because they aren't organised yet. I took these a year ago for another member who did the mod. Looking at my notes, I'm calling the new jumper block JP99.

For 512K double-banked, JP99 all open. And JP5: 2-3, JP6: 1-2, 3-4, and JP7: 2-3

For 1024K double-banked, JP99 1-2, 3-4. And JP5: 2-3, JP6: 1-2, 3-4, and JP7: 2-3

Reply 19 of 170, by mkarcher

Posted on 2023-03-15, 15:21

feipoa wrote on 2023-03-15, 09:49:

I never expected there to be a diode in series like this. It explains a lot of my previous observations. Do you know why this diode is connected in series between VCC5 and VRM's VIN? The standard connection diagram for the Sharp PQ30RV21 shows a diode, D1, being connected between VOUT and VIN, not in series like on the motherboard. Perhaps adding this diode eliminated the need for Cin? Notice the empty capacitor vias right next to D15.

I don't know why they used this configuration, and especially, I can't explain why they would use a silicon rectifier with a "nominal" forward voltage of 0.7V combined with a voltage regulator that is specified to need 0.5V dropout voltage for proper regulation, when they are targeting to get 4.0V from 5.0V. The only explanation for this component choice I can come up with is that 4.0V is a "late requirement" that was added into a design that was already finished with a target voltage of 3.45V. At 3.45V, this design makes some sense, because the diode and the regulator both dissipate some of the excess voltage, so the regulator keeps cooler.

The topology on this board, using a series diode, prevents charge from flowing from the Vcore rail into the standard 5V rail. Otherwise, this could occur on power-down in case the +5V rail gets discharged faster than Vcore. The suggested anti-parallel diode makes sure there is no excessive current through the regulator chip in that scenario, because the current will instead flow through the diode next to the regulator. That design doesn't completely prevent current flow in that direction, though. The configuration on the MB-8433UUD, on the other hand, not only stops (excessive) reverse current through the regulator, but stops reverse current altogether, so it is different from an electronics engineering perspective.

At the moment, I have no idea why there would be any significant amount of reverse current on the board at all, so the diode for reverse current protection could likely be omitted. The capacitance of the output capacitor doesn't look that big that discharging the capacitors through the regulator would cause big trouble. As I don't see any way for reverse current to happen, I can't explain why they would need to prevent it, so the "load sharing" idea for a target voltage of 3.3V to 3.6V seems like the more likely reason for this design.

feipoa wrote on 2023-03-15, 09:49:

Unfortunately, I don't have any 3 A Schottky diodes to swap in place of D15.

Don't blindly trust that a 3A Schottky diode will cause a lower drop than a 3A silicon diode. Schottky diodes are great at low currents, but at high load currents, the forward voltage might rise even higher than on silicon diodes rated for a similar current. That's why I am considering replacing it with a 5A Schottky diode (I mistakenly wrote 5V instead of 5A in my initial post about that idea). If you consider using a 3A diode, check the datasheet carefully!

feipoa wrote on 2023-03-15, 09:49:

This certainly helps explain why I've always had the best luck with a single stick of memory, with 8 chips.

Will you be playing with the termination resistors in an attempt to optimise for 2 SIMM sockets filled?

My ramblings in the previous posts were intended to convey the idea that the termination resistors are not the prime reason for slower timings with multiple SIMMs, but the limited drive capability of the chipset is. The solution for limited drive capability is well known: you could use an extremely fast amplifier chip on each SIMM to drive the address and control lines of each memory chip. This is known as "buffered" memory modules. You could construct buffered PS/2 modules using chips like the 74ABT245 (the letter combination in the middle makes this chip "the magic solution", not the fact that it is a 74xx245), which present a very low load to the chipset and can drive memory chips like a beast, with a flat delay of around 2.5ns.

feipoa wrote on 2023-03-15, 09:49:

I will have to play with this EDO DRAM option again at some point. However, it does seem unlikely that I can use 3-1-1-1 w/66 MHz even with L2 removed. I only want to run 64 MB or greater on my system.

As I was just barely able to get 3-1-1-1 at 40MHz, getting that speed at 66 seems extremely unlikely. The 4-chip 50ns 32MB module should already be "on the fast side", so I don't see how I should get the headroom to go from "barely working at 25ns clock period" to "working fine at 15ns clock period". While you can get blazing fast EDO chips for graphics cards ("100MHz", "25 ns"), those usually are just 256K x 16, so a double-sided PS/2 SIMM built from them would be 2MB.
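
Both figures in that paragraph are easy to reproduce. The sketch below assumes a 72-pin SIMM with a 32-bit data path and no parity chips, and that "double-sided" means two such 32-bit banks; the function names are made up for illustration.

```python
# Clock period per FSB setting, and capacity of a SIMM built from x16 chips.

def clock_period_ns(fsb_mhz: float) -> float:
    return 1000.0 / fsb_mhz

def simm_capacity_bytes(words_per_chip: int, bits_per_chip: int,
                        sides: int, bus_bits: int = 32) -> int:
    """Capacity with enough chips per side to fill the bus (no parity)."""
    chips_per_side = bus_bits // bits_per_chip
    chip_bytes = words_per_chip * bits_per_chip // 8
    return chips_per_side * sides * chip_bytes

for fsb in (40, 60, 66):
    print(f"FSB {fsb} MHz -> {clock_period_ns(fsb):.1f} ns per clock")

capacity = simm_capacity_bytes(256 * 1024, 16, sides=2)
print(f"double-sided SIMM from 256K x 16 chips: {capacity // (1024 * 1024)} MB")
```

Going from 40MHz to 66MHz removes about 10ns from every clock, which is the headroom the paragraph says is missing.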

feipoa wrote on 2023-03-15, 09:49:

By the way, the 1024K cache mod can also be adjusted such that the cache is swappable, between 256K double-banked, 512K double-banked, or 1024K double-banked. I just never got around to updating the manual for this board. I'll attach some photos of the jumper configuration I came up with.

Thanks for the photos. My long-term plan with that board is to design/order a prototype PCB that I can plug into the cache sockets with nine CY7C1009BN SOJ32 chips soldered on it. Those chips are available new from the factory today for sensible prices.
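
As a sanity check on the chip count, here is a sketch of how nine x8 SRAMs map to the cache sizes mentioned in this thread. It assumes that eight of the nine chips hold data and the ninth is the tag RAM, which is the usual arrangement on 486 boards but is not spelled out in the thread. The function name is hypothetical.

```python
# Data capacity of a nine-chip cache built from x8 SRAMs.
# ASSUMPTION: one of the nine chips is the tag RAM, the other eight hold data.

def cache_data_kib(total_chips: int, kwords_per_chip: int, tag_chips: int = 1) -> int:
    """Each x8 chip stores kwords_per_chip KiB of data."""
    return (total_chips - tag_chips) * kwords_per_chip

print(f"nine 32K x 8 chips:  {cache_data_kib(9, 32)} KiB")
print(f"nine 128K x 8 chips: {cache_data_kib(9, 128)} KiB")
```

Both results line up with the 256KB currently installed and the 1MB target.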
