The strange memory speed losses seen on some 386DX boards earlier in this thread could also come from boards using a write-back (WB) cache without a dirty bit. Without one, every replaced line has to be written back whether or not it was modified, so main memory speed drops with the cache enabled. Also note that some 32-bit bus systems were being detected as 2 bytes wide instead of 4 bytes wide in cachecheck, so some of the statistics (like memory timing) appear 2x as fast as they really are. (I've seen this with my OPTi 495SX board.)
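To illustrate the dirty-bit point, here is a rough toy model, not a description of any particular chipset. It assumes a direct-mapped, write-allocate, write-back cache, and the function name, line count and line size are all made up for the example. It just counts how many line transfers hit DRAM during a read-only sweep over a buffer larger than the cache.

```python
def dram_line_transfers(accesses, num_lines=4, line_size=16, has_dirty_bit=True):
    """Count DRAM line transfers for a toy direct-mapped write-back cache.

    accesses: iterable of (address, is_write) pairs.
    All sizes are hypothetical and chosen only to keep the example small.
    """
    tags = [None] * num_lines     # tag held by each cache line, None = invalid
    dirty = [False] * num_lines   # only consulted when has_dirty_bit is True
    transfers = 0
    for address, is_write in accesses:
        block = address // line_size
        index = block % num_lines
        tag = block // num_lines
        if tags[index] != tag:
            # Miss: a valid line currently in this slot has to be evicted.
            if tags[index] is not None:
                # Without a dirty bit the controller cannot tell whether the
                # line was modified, so it writes it back every time.
                if dirty[index] or not has_dirty_bit:
                    transfers += 1
            transfers += 1        # line fill from DRAM
            tags[index] = tag
            dirty[index] = False
        if is_write:
            dirty[index] = True
    return transfers


# Read-only sweep over 8 lines' worth of memory with a 4-line cache.
sweep = [(addr, False) for addr in range(0, 128, 16)]
print(dram_line_transfers(sweep, has_dirty_bit=True))    # 8
print(dram_line_transfers(sweep, has_dirty_bit=False))   # 12
```

In this model the version without a dirty bit does 50% more DRAM traffic on a pure read test, which is the kind of thing that would show up as main memory getting slower once the cache is turned on.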
Some late-generation 386SX and 286 chipsets are optimized for fairly fast DRAM operation (bank interleave and page-mode optimizations), while some 386 and 486 boards seem to be lacking here, possibly because more effort went into cache performance. (I haven't seen figures from completely cacheless 386DX or 486 boards to compare, though.)
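For anyone wondering why page mode plus bank interleave matters so much, here is a very simplified sketch. Everything in it is an assumption for illustration: the cycle counts, the 2 KB page size, and the page-interleaved bank mapping are not taken from any real chipset. Each bank keeps one row open; an access to the open row is a cheap page hit, anything else is a full page miss.

```python
def dram_cycles(addresses, banks=1, page_size=2048, hit_cycles=2, miss_cycles=5):
    """Total cycles for a toy page-mode DRAM with page-interleaved banks.

    All timing and size parameters are hypothetical.
    """
    open_row = [None] * banks     # row currently held open in each bank
    total = 0
    for address in addresses:
        page = address // page_size
        bank = page % banks       # consecutive pages alternate between banks
        row = page // banks
        if open_row[bank] == row:
            total += hit_cycles   # page hit: only a column access is needed
        else:
            total += miss_cycles  # page miss: precharge and open a new row
            open_row[bank] = row
    return total


# A small memory copy: alternate reads from one page and writes to the next.
copy = []
for offset in range(0, 64, 4):
    copy.append(offset)           # read from the source page
    copy.append(2048 + offset)    # write to the destination page

print(dram_cycles(copy, banks=1))  # 160
print(dram_cycles(copy, banks=2))  # 70
```

With one bank the copy keeps kicking its own page out, so every access is a miss. With two banks the source and destination pages each stay open in their own bank, which fits with boards getting faster once the second bank is populated.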
I've got an M396F as well as an SLI-based 386SX board, and both have quite fast DRAM read/write times and also benefit from having 2 banks of RAM installed rather than 1 (all 4 SIMM slots populated). On some boards it may be necessary to close the turbo switch jumper/header to enable fast mode, since they are slow by default. On others it may be the opposite, or the BIOS may let you change how the turbo switch behaves (i.e. default fast vs. default slow).
The OPTi 495SX (and probably 495DLC) chipset seems especially slow in DRAM timing, and slower still with cache enabled. Its 0WS options are also clearly not the fastest timing possible, since you can't push the RAM into unstable conditions the way you can on some other boards at the same FSB speed. (Sometimes the 'auto config' option in the BIOS is faster than manual 0WS as well.)
Also, it looks like Sysinfo is playing with the Cyrix 486DLC registers, maybe properly enabling the cache and configuring it for a usable range. I should try that out myself at some point.
I also realize most of those posts are quite old and the users have learned a lot since then (and probably worked out most of the above already), but since this one already got bumped, I thought it wouldn't hurt to comment on a few things.