root42 wrote: The latter part sounds interesting. Do you have any references for this fact? I assumed that the SX core was more or less the same as the DX, except for the data bus and the limited address lines.
True, the SX core is the same as the DX core, only the bus is different.
But, firstly, the 386 wasn't particularly fast to begin with when it comes to running v86 mode, adjusting page frames etc.
Secondly, setting up v86 mode and using paging to emulate EMS requires various tables in memory. Access to these tables will be slower over the 16-bit bus.
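To make the table argument a bit more concrete, here is a rough sketch in Python. It is a toy model, not EMM386's actual code. What I'm sure of: an EMS page is 16 KB, a 386 page is 4 KB, and a page table entry is 32 bits, so mapping one EMS page means touching four entries, and each entry needs two transfers over a 16-bit bus. The page frame address, the physical address and all the names are made up for illustration, and I assume aligned accesses.

```python
# Toy model of EMS emulation through 386 paging.
# Illustration only: NOT how EMM386 is actually implemented.

EMS_PAGE_SIZE = 16 * 1024   # one EMS logical page
X86_PAGE_SIZE = 4 * 1024    # one 386 page
PTE_SIZE = 4                # bytes per page table entry
PTE_FLAGS = 0x007           # present | writable | user (v86 code runs at ring 3)

PAGE_FRAME = 0xD0000        # hypothetical EMS page frame address
EMS_PHYSICAL = 0x200000     # hypothetical location of the EMS page in extended memory


def map_ems_page(page_table, window_linear, physical_base):
    """Point the four PTEs covering one 16 KB window at physical_base."""
    first = window_linear // X86_PAGE_SIZE
    count = EMS_PAGE_SIZE // X86_PAGE_SIZE
    for i in range(count):
        page_table[first + i] = (physical_base + i * X86_PAGE_SIZE) | PTE_FLAGS
    return count  # number of PTEs written


def bus_transfers(pte_accesses, bus_width_bits):
    """Bus transfers needed to move that many aligned 32-bit PTEs."""
    per_pte = PTE_SIZE // (bus_width_bits // 8)
    return pte_accesses * per_pte


page_table = [0] * 1024     # one page table covers the first 4 MB
writes = map_ems_page(page_table, PAGE_FRAME, EMS_PHYSICAL)
print(writes, bus_transfers(writes, 32), bus_transfers(writes, 16))  # 4 4 8
```

A real memory manager also has to flush the TLB after changing the entries, and the page table walks that follow go over the same bus, so the SX pays the 16-bit penalty again there.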
There is probably a 'break even' point somewhere:
EMS memory was originally developed for 808x machines, so even the fastest EMS cards probably use relatively slow memory, and they always use the ISA bus, so there is some practical 'upper bound' to EMS performance.
As CPUs got faster, the overhead of v86 mode and EMS emulation became less of an issue. The advantage is that you can use the regular system memory, which always runs at full speed. So from a certain class of machine upwards, probably a 486DX-33 or so, EMM386 will likely be faster than the fastest hardware solution.
The NEAT chipset is an interesting special case, because like EMM386 it uses the fast system memory. However, it eliminates the overhead of EMM386 (and the need to run in v86 mode at all). So its EMS performance may be faster than the fastest EMS card as well.
Anyway, as for v86 overhead, you see similar things with the TSR solutions for SoftMPU, the OPL2LPT and related interfaces. They actually use some EMM386 features for virtualization, and the overhead of this is quite high. A 386SX-16 is generally too slow to run games this way, even though it would run the same games just fine with the actual hardware.
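A rough sketch of why this gets expensive, again in Python and again only a toy model. I'm assuming the trapping goes through the I/O permission bitmap, which is the general mechanism a v86 monitor has for intercepting port accesses. I don't know the internals of these particular TSRs. The class and function names are invented. The AdLib ports (388h/389h) are the real ones.

```python
# Toy model of I/O port trapping by a v86 monitor.
# Illustration only: names are hypothetical, no real TSR is modelled here.

ADLIB_PORTS = (0x388, 0x389)    # OPL2 index and data ports


class V86Monitor:
    def __init__(self, trapped_ports):
        # One bit per I/O port; a set bit means the access faults into the monitor.
        self.io_bitmap = bytearray(65536 // 8)
        for port in trapped_ports:
            self.io_bitmap[port >> 3] |= 1 << (port & 7)
        self.traps = 0
        self.forwarded = []

    def is_trapped(self, port):
        return bool(self.io_bitmap[port >> 3] & (1 << (port & 7)))

    def out(self, port, value):
        if self.is_trapped(port):
            # Fault, switch to the monitor, decode the instruction,
            # call the handler, return to the game: all of that per write.
            self.traps += 1
            self.forwarded.append((port, value))
        # Untrapped ports go straight to the hardware; nothing to model.


def opl2_write(monitor, reg, value):
    """One OPL2 register write is two port writes: index, then data."""
    monitor.out(0x388, reg)
    monitor.out(0x389, value)


monitor = V86Monitor(ADLIB_PORTS)
for reg in range(0x20, 0x30):   # 16 register writes
    opl2_write(monitor, reg, 0)
monitor.out(0x3C8, 0)           # VGA palette index port, not trapped
print(monitor.traps)            # 32
```

So every single register write costs two round trips through the monitor, where the real card would just take two OUT instructions. A game that updates a lot of registers per frame multiplies that accordingly.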