Deunan wrote on 2024-12-15, 13:27:
You do need FLUSH# on the SLC/DLC/SXL(C) class chips for DMA device-to-memory writes to work properly. So on most systems that only affects reading floppy disks,
Exactly. The idea is that the SLC chip may still hold "old" data in its cache when DMA updates memory "in the background". To avoid using stale data, the SLC chip must invalidate either the whole cache or at least the address range that DMA has modified. Some BIOS vendors may have put a cache flush instruction in their floppy read function, which removes the need for a working FLUSH# as long as the floppy is only ever accessed through the BIOS (and not by programming the floppy controller directly).
Deunan wrote on 2024-12-15, 13:27:
and only when the floppy is swapped.
I can't follow this reasoning. Obviously, if you read the same sector of the floppy into the same memory location, it doesn't matter whether the 486SLC uses the stale value from its cache or the new value written by DMA, because the two values are identical. If the floppy has been swapped in between, the new data written by DMA will differ from the old data in the sector buffer. I guess this is what you had in mind. On the other hand, a cache flush (either manually using the 486 INVD instruction or via the FLUSH# signal) is also required if you read a different sector from the same floppy into the same buffer area.
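To make the three cases concrete, here is a minimal Python sketch of a CPU cache sitting in front of RAM that DMA writes to behind its back. It is a toy model, not a description of the real SLC cache: the names `TinyCache`, `dma_write` and `cpu_read_buffer`, the one-byte "cache lines" and the 4-byte "sectors" are all made up for illustration, and CPU writes are not modelled at all.

```python
class TinyCache:
    """Toy read cache in front of RAM (hypothetical model, one byte per line)."""

    def __init__(self, ram):
        self.ram = ram      # main memory, shared with the DMA controller
        self.lines = {}     # address -> cached byte

    def cpu_read(self, addr):
        # On a hit the CPU never looks at RAM again; that is the whole problem.
        if addr not in self.lines:
            self.lines[addr] = self.ram[addr]
        return self.lines[addr]

    def flush(self):
        # Stands in for FLUSH# or a flush instruction: forget everything cached.
        self.lines.clear()


def dma_write(ram, base, data):
    # DMA stores directly into RAM; the cache is not informed.
    ram[base:base + len(data)] = data


def cpu_read_buffer(cache, base, length):
    return bytes(cache.cpu_read(base + i) for i in range(length))


BUF = 0x10                  # hypothetical sector buffer address
sector_a = b"AAAA"
sector_b = b"BBBB"

ram = bytearray(64)
cache = TinyCache(ram)

# First read: DMA fills the buffer, CPU reads it and caches it.
dma_write(ram, BUF, sector_a)
first = cpu_read_buffer(cache, BUF, 4)

# Same sector again: the cached copy is stale in principle, but identical.
dma_write(ram, BUF, sector_a)
same_sector = cpu_read_buffer(cache, BUF, 4)

# Different sector (swapped floppy or not), same buffer, no flush: wrong data.
dma_write(ram, BUF, sector_b)
stale = cpu_read_buffer(cache, BUF, 4)

# After a flush the CPU refetches from RAM and sees what DMA wrote.
cache.flush()
fresh = cpu_read_buffer(cache, BUF, 4)

print(first, same_sector, stale, fresh)
```

Whether the floppy was swapped does not appear anywhere in the model; all that matters is whether the bytes DMA wrote differ from the bytes that are still cached.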
Deunan wrote on 2024-12-15, 13:27:
These chips need FLUSH# or have to work in BARB mode if DMA is used.
And that's what makes MikeSG's statement about "the BIOS knows the 486" not entirely wrong. If the BIOS knows the 486SLC specifically (not just any kind of 486), and it knows that the mainboard does not signal FLUSH# properly, it might enable the SLC's BARB mode to work around the DMA problem as well (at a higher performance cost).
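A rough way to picture the performance cost, as a Python sketch under stated assumptions: as far as I understand BARB, the chip flushes its cache every time it hands the bus to another master, whether or not memory was written. The sketch compares that with a hypothetical board that pulses FLUSH# only for holds that actually wrote to memory (real boards may be less selective). The function name `count_flushes`, the policy names and the sequence of bus holds are invented for illustration and not measured on any system.

```python
def count_flushes(bus_holds, policy):
    """Count cache flushes for a sequence of bus holds.

    bus_holds: list of booleans, True if that hold wrote to memory
               (for example a device-to-memory DMA transfer).
    policy:    "barb"      -> assumed: flush on every hold
               "flush_pin" -> assumed: board pulses FLUSH# only on memory writes
    """
    if policy not in ("barb", "flush_pin"):
        raise ValueError("unknown policy: " + policy)
    flushes = 0
    for wrote_memory in bus_holds:
        if policy == "barb" or wrote_memory:
            flushes += 1
    return flushes


# Hypothetical trace: eight bus holds, only two of which wrote to memory.
holds = [False, False, True, False, False, False, True, False]

barb_flushes = count_flushes(holds, "barb")
pin_flushes = count_flushes(holds, "flush_pin")
print(barb_flushes, pin_flushes)
```

Both policies keep the CPU from seeing stale data in this model; BARB just throws the cache away more often, and every flush means the tiny cache has to be refilled over the narrow bus.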
Deunan wrote on 2024-12-15, 13:27:
Though I agree that 386 to SLC swap is not going to do much due to the narrow bus and very small cache. These CPUs are already bus limited, although there will be some minor speedup in pure 16-bit code. The 8k cache is doing better but also can't work miracles since it's difficult to feed the cache through the narrow bus, and 286/386 code optimization requires loop unrolling, the opposite of what 486+ CPUs want.
Yeah, I like to joke that the 486SLC2 waits for the bus twice as fast as the 486SLC, especially for 32-bit software. The 486 is at the tipping point between "loop unrolling is good" and "loop unrolling is bad". Huge unrolled loops waste precious cache space on the 486, which is a point against unrolled loops. Nevertheless, the 486 still has no branch prediction, and every taken jump flushes the execution pipeline, which is a point for unrolled loops. "Unrolling hurts" is much more true on the Pentium than on the 486, especially for short loops.
Deunan wrote on 2024-12-15, 13:27:
SXLC with clock doubler and cache enabled pretty much requires some form of cooling or it will slowly cook itself and everything around it.
Just as a data point: I have a Taiwanese Cx486SLC2-50 laptop, which died some time after I upgraded the hard drive from 120MB to 500MB. Possibly the in-rush current of the bigger drive killed the laptop in the long run. The hard drive needed two attempts every time it spun up, likely due to an insufficient +5V supply. That laptop did not have any kind of cooling on the SLC2-50, which didn't seem to pose a problem. I do understand that the SXLC with clock doubler is a different thing due to the bigger (much more sensibly sized) cache, which might be considerably more power hungry.