Assuming the CPU speed and number of wait states are held constant, I don't see what good having faster memory will do.
A 40MHz CPU has a cycle time of 25ns. Subtract 10ns for decoding and buffering and you would need 15ns DRAM in order to run zero wait states. This is where SRAM cache comes into play, as it can make your system feel like it has 0ws during certain operations. Not that 40MHz 386s usually have 15ns SRAM cache. If you ran 20ns SRAM cache, in theory you would need to add a cache wait state to ensure system stability.
In the case of your 40MHz system, you would probably need to use 2 DRAM wait states for your memory to operate in spec: three 25ns cycles is 75ns, minus 10ns, leaves 65ns, so you need DRAM rated at 65ns or faster. 60ns is recommended, but many people got by with 70ns just fine. Using 80ns DRAM is clearly overclocking your memory.
A 0ws or 1ws 40MHz 386 (15ns and 40ns DRAM required, respectively) is pretty much impossible, unless you are able to construct some of your own SIMMs using modern technology.
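
If you want to play with the numbers above, here is a rough Python sketch of the same arithmetic. It assumes the simple model used in this post (one 25ns cycle per access plus one more per wait state, minus a flat 10ns for decoding and buffering); the function names are made up for illustration, and real chipset timing is more involved.

```python
import math

def required_dram_ns(cpu_mhz, wait_states, overhead_ns=10.0):
    """Slowest DRAM rating (ns) that stays in spec under this post's simple model."""
    cycle_ns = 1000.0 / cpu_mhz              # 40MHz -> 25ns
    return (wait_states + 1) * cycle_ns - overhead_ns

def min_wait_states(cpu_mhz, dram_ns, overhead_ns=10.0):
    """Fewest wait states that keep a DRAM of the given rating in spec."""
    cycle_ns = 1000.0 / cpu_mhz
    # Need (ws + 1) * cycle_ns >= dram_ns + overhead_ns
    return max(0, math.ceil((dram_ns + overhead_ns) / cycle_ns) - 1)

if __name__ == "__main__":
    for ws in range(3):
        print(f"{ws}ws at 40MHz needs {required_dram_ns(40, ws):.0f}ns DRAM or faster")
    for rating in (60, 70, 80):
        print(f"{rating}ns DRAM at 40MHz needs {min_wait_states(40, rating)}ws to be in spec")
```

Under this model 60ns parts fit within 2ws, while 70ns and 80ns parts would strictly want a third wait state, which is why running them at 2ws amounts to running the memory out of spec.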
"Will the highways on the internets become more few?" -Gee Dubya
V'Ger XT|Upgraded AT|Ultimate 386|Super VL/EISA 486|SMP VL/EISA Pentium