VOGONS


Reply 20 of 37, by CoffeeOne

User metadata
Rank Oldbie
mpe wrote on 2020-02-03, 22:19:

OK Mission accomplished. I soldered in the extra header, installed 1 MBit TAG chips and it is now working! Looks like the jumper needs to be permanently in place.

DSC_5295.jpg

ctcm is confused with anything >512k. However, cachechk and speedsys confirm the cache. The speedsys graph is a little unusual:

2048.jpg
IMG_4964.jpg

The only DOS app showing some improvement over 1024k is Quake (20.1fps vs 19.8fps). The rest seem to fit comfortably in the 1M cache already (DOOM +1 realtick, pcpbench +0.1fps, ...). I think I will need better benchmarks that use a larger data set, likely in Windows.

Will spend a little more time on this board. Currently I have the "faster" cache setting and 2T write cycles. This is the same as with 256k/512k/1024k, but I still believe that with genuine <15ns SRAM chips I should be able to run this at full speed and beat even a 75 MHz mb...

Awesome upgrade, old hardware maxed out 😀
I am a bit surprised that you used the cheap 10ns chips instead of the probably correctly labelled 15ns chips.

Reply 21 of 37, by rmay635703

User metadata
Rank Oldbie

With this much cache, one wonders what the effect would be if you could map half of it as base RAM.

Reply 22 of 37, by mpe

User metadata
Rank Oldbie
CoffeeOne wrote on 2020-02-04, 23:24:

I am a bit surprised that you used the cheap 10ns chips instead of the probably correctly labelled 15ns chips.

I didn't know they were fakes. But then how could I know that the 15ns ones aren't just re-labelled 20ns parts 😀 ?

You can't really tell these days...

Blog|NexGen 586|S4

Reply 23 of 37, by H3nrik V!

User metadata
Rank Oldbie
rmay635703 wrote on 2020-02-04, 23:59:

With this much cache, one wonders what the effect would be if you could map half of it as base RAM.

Oh yeah - or 640k .. That would be cool!

If it's dual it's kind of cool ... 😎

--- GA586DX --- P2B-DS --- BP6 ---

Please use the "quote" option if asking questions to what I write - it will really up the chances of me noticing 😀

Reply 24 of 37, by amadeus777999

User metadata
Rank Oldbie

That's awesome - I have the same board.
The upgrade to 2MiB cache will be an interesting project.
I use 16KiB 12ns tags for my board and 32KiB SRAMs for the banks, resulting in 0.5MiB, with all BIOS settings maxed out.

Reply 25 of 37, by feipoa

User metadata
Rank l33t++
mpe wrote on 2020-02-03, 22:19:

The only DOS app showing some improvement over 1024k is Quake (20.1fps vs 19.8fps). The rest seem to fit comfortably in the 1M cache already (DOOM +1 realtick, pcpbench +0.1fps, ...). I think I will need better benchmarks that use a larger data set, likely in Windows.

Are you still planning on running some Windows benchmarks? I'm quite curious to see some Windows game and synthetic benchmarks, especially if you can run the full offerings of cache from 256k up to 2 M. If you have a P133 socket 4 overdrive, that would make it even more interesting.

Plan your life wisely, you'll be dead before you know it.

Reply 26 of 37, by mpe

User metadata
Rank Oldbie

Yes. I plan to publish full tests incl. Winstone 95. The only problem is that single-bank configurations can't sustain 3-2-2-2 timings in Windows at 66 MHz. So I need to recapture the tests with more conservative timings, run at 60/120 MHz, or only consider dual-bank configs.

Also, I don't know of any synthetic tests that can capture a cache size increase. Synthetic benchmarks tend to have a very small code size/data set and easily fit into L1 cache, let alone L2.

I found Quake 320x200 to be an almost perfect CPU/L2/MEM benchmark in DOS, and Winstone 95 works well in Windows. Both see modest improvements when going from 1024k to 2048k L2.

Yes. I do have a PODP5V133.


Reply 27 of 37, by feipoa

User metadata
Rank l33t++

Ahhh, super, you have the PODP5V133! Looking forward to this.

Do not all the boards being tested have double-banked cache? If not, I'd run them each with double-banked cache using the fastest stable timings.


Reply 28 of 37, by mpe

User metadata
Rank Oldbie

Yes. Will need to redo the tests with PODP5V133 and use smaller chips in dual-bank mode to have consistent timing.

Most Socket4/5 motherboards have a single bank L2 cache.

The SiS501 is rather an exception. The UMC8891 can also have a dual-banked cache, but that's not a very common chipset. Dual banking allows using slower SRAM chips.

Technically, when you use a 512k pipelined-burst COAST module on Triton, it is also a dual-bank configuration. However, unlike async cache, dual-bank PB is actually a bit slower, as the back-to-back burst cycle becomes 3-1-1-1-2-1-1-1 instead of 3-1-1-1-1-1-1-1 because an extra wait state needs to be inserted for bank turnaround. However, the increased cache efficiency of 512k should easily win that penalty back.
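To put a number on the bank-turnaround penalty, here is a minimal sketch that just adds up the clocks in the two back-to-back burst patterns quoted above. The helper name `burst_clocks` is made up for illustration; the only inputs are the two patterns from the post.

```python
def burst_clocks(pattern: str) -> int:
    """Sum the clock counts in a burst pattern such as '3-1-1-1'."""
    return sum(int(n) for n in pattern.split("-"))

# Two back-to-back bursts without and with the extra turnaround wait state.
single_bank = burst_clocks("3-1-1-1-1-1-1-1")
dual_bank = burst_clocks("3-1-1-1-2-1-1-1")

# Relative cost of the extra wait state over the pair of bursts.
penalty = dual_bank / single_bank - 1
print(single_bank, dual_bank, f"{penalty:.0%}")
```

That is 11 clocks instead of 10 for the pair of bursts, and only when bursts really do arrive back to back, so the average cost in practice should be smaller than this worst case.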


Reply 29 of 37, by mpe

User metadata
Rank Oldbie

Adding graphs with performance results. They were not easy to capture, as some cache configurations (the single-banked ones) are not stable at specific settings with the cache chips I have. However, I managed to capture results after some trying & chip swapping.

These were all captured with PODP5V133 overdrive CPU.

DOOM / Winstone 95 results show the expected "reverse hockey stick". Most benefit is realised by having any sort of cache. Then you get diminishing improvements as you increase the size.
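For anyone who wants to play with the size-versus-hit-rate relationship without the hardware, below is a toy sketch of a direct-mapped cache with 32-byte lines swept over the same four sizes. Everything about the workload is an assumption made up for illustration (80% of accesses to a 128 KiB hot region, the rest spread over 4 MiB); it does not model DOOM, Quake or Winstone, and the direct-mapped organisation is assumed here rather than taken from the chipset documentation.

```python
import random

LINE = 32  # bytes per cache line (Pentium line size)

def hit_rate(trace, cache_bytes):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    sets = cache_bytes // LINE
    tags = [None] * sets          # block number currently held in each set
    hits = 0
    for addr in trace:
        block = addr // LINE
        index = block % sets
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block   # miss: fill the line
    return hits / len(trace)

def synthetic_trace(n=200_000, seed=1):
    """Made-up workload: mostly a small hot region, sometimes a large cold one."""
    rng = random.Random(seed)
    hot, cold = 128 * 1024, 4 * 1024 * 1024
    return [rng.randrange(hot) if rng.random() < 0.8 else rng.randrange(cold)
            for _ in range(n)]

trace = synthetic_trace()
for kib in (256, 512, 1024, 2048):
    print(f"{kib:5d}k  hit rate {hit_rate(trace, kib * 1024):.3f}")
```

Changing the hot/cold split or the region sizes in `synthetic_trace` is the quickest way to see how much the shape of the curve depends on the working set.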

The attachment Screenshot 2020-04-16 at 10.15.33.png is no longer available
The attachment Screenshot 2020-04-16 at 10.15.42.png is no longer available

Quake shows a slightly different picture. The performance scales with cache size quite nicely, and even at 2048k the gains don't seem to be tailing off that much:

The attachment Screenshot 2020-04-16 at 10.15.48.png is no longer available

I also measured more aggressive timings with a 60 MHz bus, like 3-1-1-1 or 4-1-1-1. But those were generally not stable except for a few tests. Anyway, it looks like cache timing is more important than size. It is even more important than the clock, as in my tests a well-tuned P120 can beat an average P133. I still need to find the perfect SRAM chips for this board.
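A rough back-of-the-envelope sketch of why tighter timings on a slower bus can win: it converts a burst pattern and bus clock into the time to fill one 32-byte line. The function name `line_fill_ns` is hypothetical, the bus clocks are the nominal 60 and 66 MHz, and this only looks at L2 burst reads, not at the rest of the system.

```python
LINE_BYTES = 32  # one Pentium cache line: 4 transfers of 8 bytes

def line_fill_ns(pattern: str, bus_mhz: float) -> float:
    """Time in ns to burst one cache line with the given timing pattern."""
    clocks = sum(int(n) for n in pattern.split("-"))
    return clocks * 1000.0 / bus_mhz

for pattern, mhz in (("3-2-2-2", 66.0), ("3-2-2-2", 60.0),
                     ("4-1-1-1", 60.0), ("3-1-1-1", 60.0)):
    ns = line_fill_ns(pattern, mhz)
    mb_s = LINE_BYTES * 1000 / ns  # bytes per ns is GB/s, so scale to MB/s
    print(f"{pattern} @ {mhz:.0f} MHz: {ns:.0f} ns per line, {mb_s:.0f} MB/s peak")
```

3-1-1-1 is 6 clocks, so 100 ns per line at 60 MHz, while 3-2-2-2 is 9 clocks, which is still about 136 ns even on the faster 66 MHz bus.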


Reply 30 of 37, by feipoa

User metadata
Rank l33t++

These charts are epic. I've never seen a 256K-2M cache test like this before. I saved the charts in case I can't find the thread later on. Any chance you can also do 128K, mostly for Quake? I'm surprised at how linear the response was with Quake, whereas DOOM seems more asymptotic.

Yeah, timings are most important. An issue I've come across is that using more cache, esp. the max a chipset can handle, sometimes requires slower timings for stability. So there is a sweet spot somewhere.


Reply 31 of 37, by mpe

User metadata
Rank Oldbie

Thanks. My board doesn't have a setting for 128k. The chipset supports it (even a non-interleaved 64k), so I suppose I should be able to work out the settings. However, I don't have enough (16) fast 8kx8 SRAM chips. Just setting the jumpers doesn't work: as I found, the board seems to do some active size detection.

I suppose I should be able to fool it by using 32kx8 chips and isolating the high address pins with electrical tape 😀 Will see..
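The chip arithmetic behind this, as a small sketch. It assumes 16 data SRAM sockets of 8 bits each (taken from the "16 fast 8kx8" figure above) and that each disabled address pin ends up held at a fixed level, which halves the usable words per chip; `cache_kib` is a made-up helper name.

```python
def cache_kib(chips: int, kwords: int, masked_addr_pins: int = 0) -> int:
    """Total data cache in KiB for `chips` x8 SRAMs of `kwords` K words each.

    Each address pin held at a fixed level halves the usable words per chip.
    """
    return chips * (kwords >> masked_addr_pins)

print(cache_kib(16, 8))      # 16 x 8Kx8
print(cache_kib(16, 32))     # 16 x 32Kx8
print(cache_kib(16, 128))    # 16 x 128Kx8 (1 Mbit parts)
print(cache_kib(16, 32, 2))  # 32Kx8 with the two top address pins disabled
```

So 32kx8 parts with two address pins taken out of play would look like the 8kx8 parts needed for 128k, at least as far as capacity goes.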


Reply 32 of 37, by H3nrik V!

User metadata
Rank Oldbie
feipoa wrote on 2020-04-16, 10:51:

These charts are epic. I've never seen a 256K-2M cache test like this before. I saved the charts in case I can't find the thread later on. Any chance you can also do 128K, mostly for Quake? I'm surprised at how linear the response was with Quake, whereas DOOM seems more asymptotic.

Yeah, timings are most important. An issue I've come across is that using more cache, esp. the max a chipset can handle, sometimes requires slower timings for stability. So there is a sweet spot somewhere.

Speaking of epic charts .. Did you ever consider doing something similar in the 486 comparison? 🤣


Reply 33 of 37, by feipoa

User metadata
Rank l33t++

lol, no! That would make for WAY too many variables. I have thought about doing a 64k, 128k, 256k, 512k, 1024k comparison though. I remember reading that 64k to 128k and 128k to 256k are supposed to yield a large performance benefit, and I want to see it charted.

I would like to redo all 486 benchmarks though, include more 486 chips, as well as 486 chips on 386 motherboards, and try to split the results more 50:50 with synthetics and games. There's so much to do before work can start on that though.


Reply 34 of 37, by mpe

User metadata
Rank Oldbie

I have a partial set of results from my MSI-4144 with an Am5x86 @160, but still not enough to draw graphs. The board can do 128k-1024k.

I expect the effect of cache size will be less pronounced on a 486. The cache line size is half that of the Pentium, which could improve the efficiency of smaller caches? Similarly, I expect the impact of burst timing to be lower for the same reason.
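A quick sketch of what the line-size difference means in terms of line count, assuming the usual 16-byte line on a 486 and 32-byte line on a Pentium. The helper `line_count` is hypothetical, and whether more, smaller lines actually help depends on the workload.

```python
def line_count(cache_kib: int, line_bytes: int) -> int:
    """Number of cache lines in a cache of the given size."""
    return cache_kib * 1024 // line_bytes

print("size    486 (16 B)  Pentium (32 B)")
for kib in (128, 256, 512, 1024):
    print(f"{kib:4d}k  {line_count(kib, 16):10d}  {line_count(kib, 32):14d}")
```

At any given size the 486 cache has twice as many independently replaceable lines, so a 128k 486 cache holds as many lines as a 256k Pentium one.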

Somewhat surprisingly, in absolute numbers the DOOM framerates of a 150/160 MHz 486 are roughly comparable to those of a Pentium 133. It just shows how bad DOOM is as a benchmark 😀


Reply 35 of 37, by bbuchholtz

User metadata
Rank Newbie

Super awesome thread! I am also a proud new owner of an ECS SI5PI AIO rev1.1 😀

I'm still waiting on my CPU to arrive. But, I've already performed the JP12 mod.

I was noticing that this board has an AMIKEY-2. From my experience, this chip supports PS/2. Also, this board looks to have solder pads for PS/2 keyboard and mouse:

20200527-192854.jpg

All of the componentry looks to be in place:

20200527-193021.jpg

Has anyone tested PS/2 with this board yet?

-Brian

Reply 36 of 37, by Anonymous Coward

User metadata
Rank l33t

It's not just AMIKEY-2 that supports the PS/2 mouse. Just about every 8042 I've seen made after 1990 seems to support it.

"Will the highways on the internets become more few?" -Gee Dubya
V'Ger XT|Upgraded AT|Ultimate 386|Super VL/EISA 486|SMP VL/EISA Pentium

Reply 37 of 37, by H3nrik V!

User metadata
Rank Oldbie
feipoa wrote on 2020-04-16, 12:34:

🤣, no! That would make for WAY too many variables. I have thought about doing a 64k, 128k, 256k, 512k, 1024k comparison though. I remember reading that 64k to 128k and 128k to 256k are supposed to yield a large performance benefit, and I want to see it charted.

I would like to redo all 486 benchmarks though, include more 486 chips, as well as 486 chips on 386 motherboards, and try to split the results more 50:50 with synthetics and games. There's so much to do before work can start on that though.

Yeah, that seems like a lot of work .. And it seems a LOT of effort has already been put in 😀
