MikeSG wrote on 2026-06-01, 11:44:
Cache flush is not constant, it's 10-50 microseconds
Flushing the whole cache takes pretty much constant time. With a write-through cache (like in the 486DLC) the flush itself doesn't cost anything and doesn't take any time: the CPU simply drops all of L1 on the floor and from the very next cycle acts like a CPU with an empty L1 cache again. It's the refilling of L1 that hurts.
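To make the "flush is free, refill hurts" point concrete, here is a toy direct-mapped cache model. It is only a sketch: the line count, line size and access pattern are made-up numbers, not a model of the 486DLC. The flush just invalidates every tag; the cost only shows up later as extra misses.

```python
def run(accesses, num_lines=64, line_size=16, flush_every=None):
    """Count (hits, misses) for a toy direct-mapped write-through cache.

    All sizes are hypothetical and chosen only for illustration.
    """
    tags = [None] * num_lines            # one tag per cache line
    hits = misses = 0
    for i, addr in enumerate(accesses):
        if flush_every and i and i % flush_every == 0:
            # Flush: invalidate everything. Write-through means there is
            # nothing to write back, so this step has no cost of its own.
            tags = [None] * num_lines
        line = addr // line_size
        idx = line % num_lines
        if tags[idx] == line:
            hits += 1
        else:
            misses += 1                  # the refill is where the time goes
            tags[idx] = line
    return hits, misses


# 100 passes over a 512-byte working set that fits in the toy cache
accesses = [a for _ in range(100) for a in range(0, 512, 4)]
print("no flush:       ", run(accesses))
print("flush each pass:", run(accesses, flush_every=128))
```

Without flushes only the first pass misses; with a flush before every pass, every pass pays the same cold-start misses again.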
MikeSG wrote on 2026-06-01, 11:44:
and DMA access is further apart.
Hidden Refresh (CPU not stalled during refresh) is available on late 386 boards and makes a 5-10% difference.
If you are playing even 11 kHz sound you have 11,000 cache flushes per second, which is pretty painful on boards not implementing the custom Cyrix flush pins. I'm sure you remember feipoa testing the SXL2-66 in Re: Register settings for various CPUs
DOOM w/sound = 3689 = 20.25 fps (FLUSH# enabled)
DOOM w/out sound = 3370 = 22.16 fps. BARB
DOOM w/sound = 4420 = 16.90 fps BARB. flushes cache on every sound blaster DMA access. 20% hit
DOOM w/sound = 13.57 fps. BARB with disabled hidden refresh. another 20% hit
DOOM w/out sound 11.15 fps. Disabled L1
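For anyone wanting to redo the arithmetic on those figures, here is a small sketch of the usual Doom timedemo conversion, fps = gametics * 35 / realtics. The demo length of 2134 gametics is my assumption (the post does not state it); it is used because it reproduces the fps values quoted above.

```python
TICRATE = 35      # Doom game tics per second
GAMETICS = 2134   # assumed demo length in gametics; not stated in the post


def fps(realtics, gametics=GAMETICS):
    """Convert a timedemo realtics result to frames per second."""
    return gametics * TICRATE / realtics


def extra_time_pct(base_realtics, slow_realtics):
    """How much longer the slow run took, as a percentage of the base run."""
    return (slow_realtics - base_realtics) / base_realtics * 100


results = [
    ("sound, FLUSH#", 3689),
    ("no sound, BARB", 3370),
    ("sound, BARB", 4420),
]
for label, realtics in results:
    print(f"{label:15s} {realtics} realtics = {fps(realtics):.2f} fps")

print(f"BARB vs FLUSH#, both with sound: "
      f"{extra_time_pct(3689, 4420):.1f}% more time")
```

Under that assumption the three realtics values come out at 20.25, 22.16 and 16.90 fps, and the BARB run with sound takes about 20% longer than the FLUSH# run.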
Now on a board using a 386 chipset but supporting 486 CPUs, neither of those should be a problem. I would expect full support for the 486 selective flush mechanism (flushing only on DMA writes, and only the affected cache lines). The only differences from a "true" 486 chipset should be no burst support, maybe no support for tighter timings, and maybe a less efficient memory controller?
MikeSG wrote on 2026-06-01, 11:44:
A 386DX SIS-Rabbit board with a real 486 socket running a DX4-100 ODP, versus a 486 running the same CPU. Same 12-12.5FPS in Quake.
Ah, 386 chipset + 486 CPU, perfect! Now run tests that don't bottleneck on the FPU and VGA 😀 Doom might be OK-ish, but it does 8-bit video writes when drawing columns, so VGA will still be the bottleneck on a faster CPU. FastDoom is even better, as it renders to a RAM buffer and copies to video in bulk.
I don't recall ever seeing cachechk results from such a combo either 🙁 It would be a good comparison to the 486SXL2 @ 80 MHz running on the (ALi M1429) ECS PANDA 386V, which is the opposite of the case we are wondering about: it's a 486 chipset on a 386 motherboard running a 486-like Cyrix over the 386 bus :-] 😮
Re: Custom interposer module for TI486SXL2-66 PGA168 to PGA132 - HELP!
486SXL2 @ 80 MHz + ALi M1429
L2 27 us/KB
MikeSG wrote on 2026-06-01, 11:44:
Applications are deliberately designed to only use L1 & L2, and rarely go to the memory.
How do you write such an application? In the nineties even geniuses like Abrash and Carmack didn't bother, and Quake speeds up linearly all the way to 2 MB of L2 and would probably go beyond: https://dependency-injection.com/2mb-cache-benchmarks/ The Quake codebase is not cache aware; neither total cache size nor cache line width was a factor during development. It was simply too much to worry about data locality when shipping a DOS game in the nineties. Nowadays everyone tries to fit the hot path in L1, but even then you can't predict cache sizes. Data-oriented programming only became popular with bad/convoluted console architectures, where you simply had to worry about data layout to avoid huge latency penalties.
MikeSG wrote on 2026-06-01, 11:44:
I guarantee if a 386 hybrid had a PODP5v83, versus a 486 with the same CPU they would be within 5-10% again.
I could see 10%