Reply 220 of 581, by pdw
wrote: May I ask what you have changed, because at least on this machine there is now almost NO difference in speed compared to the patched version. This is a game changer.
So this is going to get a bit technical... Games program the OPL2 chip by changing the values in its internal registers. Typically this goes like this:
- Write a register number to the address port
- Delay by reading the address port 6 times (needed because the OPL2 is not a very fast chip)
- Write a new value to the data port
- Delay some more by reading the address port 35 times
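The sequence above could be sketched in C roughly as follows. This is an illustration only, not code from any particular game. 0x388 and 0x389 are the standard Adlib address and data ports; `port_read`, `port_write`, `io_ops` and `opl2_write` are hypothetical names, and the port functions are stubs that merely count accesses so the sketch runs anywhere (on real DOS hardware they would be the compiler's port I/O routines).

```c
#include <stdint.h>

#define ADLIB_ADDR 0x388  /* address/status port */
#define ADLIB_DATA 0x389  /* data port */

/* Stubs standing in for real port I/O (assumption): they only count accesses. */
unsigned io_ops;
uint8_t last_reg, last_val;

uint8_t port_read(uint16_t port)
{
    (void)port;
    io_ops++;
    return 0;
}

void port_write(uint16_t port, uint8_t value)
{
    io_ops++;
    if (port == ADLIB_ADDR)
        last_reg = value;
    else if (port == ADLIB_DATA)
        last_val = value;
}

/* One OPL2 register write, with the delays done as dummy reads. */
void opl2_write(uint8_t reg, uint8_t value)
{
    int i;

    port_write(ADLIB_ADDR, reg);        /* select the register */
    for (i = 0; i < 6; i++)             /* short delay: 6 reads */
        (void)port_read(ADLIB_ADDR);

    port_write(ADLIB_DATA, value);      /* write the new value */
    for (i = 0; i < 35; i++)            /* long delay: 35 reads */
        (void)port_read(ADLIB_ADDR);
}
```

A single call such as `opl2_write(0x20, 0x01)` performs 1 + 6 + 1 + 35 = 43 port accesses, and only the two writes change anything on the chip.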
So altogether that's 43 I/O operations (1 + 6 + 1 + 35). Each of these takes about 1 microsecond (the exact duration varies a bit from computer to computer). What's important to note is that of these 43 operations, only the two writes have an effect; the reads are solely there to provide a delay.
Now what the TSR does is to ask the CPU to generate a fault whenever a program accesses one of the Adlib ports. EMM386 receives this fault, does some work to figure out what happened, and passes control to my TSR. The TSR then does the equivalent I/O operation using the parallel port. This of course causes some overhead. On 486 and later systems the overhead is tolerable, but it turns out that on a 386 it's gigantic -- the entire fault handling takes about 70 microseconds per I/O operation. Since all 43 operations fault, what originally required about 43 microseconds now takes about 43 x 70 = 3000 microseconds. So no wonder that games run slow.
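To put numbers on that, here is a quick back-of-the-envelope calculation in C using the rough figures from above (about 1 microsecond per direct I/O operation, about 70 microseconds per trapped one on a 386). The constant and function names are made up for this illustration, and the real timings vary from machine to machine.

```c
/* Rough cost of one OPL2 register write, in microseconds. */
enum {
    IO_OPS_PER_WRITE = 43,  /* 2 writes + 6 + 35 delay reads */
    DIRECT_IO_US     = 1,   /* approximate cost of an untrapped port access */
    FAULT_386_US     = 70   /* approximate cost of a trapped access on a 386 */
};

/* Every access goes straight to the hardware. */
unsigned direct_cost_us(void)
{
    return IO_OPS_PER_WRITE * DIRECT_IO_US;
}

/* Every access faults and is handled by the TSR. */
unsigned trapped_cost_us(void)
{
    return IO_OPS_PER_WRITE * FAULT_386_US;
}
```

That works out to 43 microseconds directly versus 3010 microseconds when every access faults, which is where the "about 3000" figure comes from.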
What this experimental version does is to check which instruction caused the fault. If it's a read, it's patched out. This way we need to pay the cost of the fault only for the writes. We still need to ensure that we don't access the OPL2 chip too fast, so a delay is added to the write handling in the TSR.
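As a rough illustration of the idea (not the actual TSR code), the decision could look like this in C. The opcodes are the standard x86 encodings: `IN AL,DX` is 0xEC, `OUT DX,AL` is 0xEE and `NOP` is 0x90. Everything else is an assumption made for the sketch: the function and enum names, the return convention, and the choice of replacing the one-byte `IN` with a `NOP`.

```c
#include <stdint.h>

#define OP_IN_AL_DX  0xEC  /* IN AL,DX  */
#define OP_OUT_DX_AL 0xEE  /* OUT DX,AL */
#define OP_NOP       0x90

enum fault_action {
    FAULT_PATCHED,   /* read removed; it will not fault again */
    FAULT_FORWARD,   /* write: forward it to the parallel port, then delay */
    FAULT_UNKNOWN    /* some other instruction: leave it alone */
};

/* 'insn' points at the instruction that caused the I/O fault (hypothetical). */
enum fault_action handle_adlib_fault(uint8_t *insn)
{
    switch (*insn) {
    case OP_IN_AL_DX:
        *insn = OP_NOP;        /* same length, so nothing else moves */
        return FAULT_PATCHED;
    case OP_OUT_DX_AL:
        return FAULT_FORWARD;
    default:
        return FAULT_UNKNOWN;
    }
}
```

With the reads gone, only the two writes of each register update still fault, so the cost on a 386 drops to roughly 2 x 70 = 140 microseconds plus whatever delay the TSR adds.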
I guess that programs with sample playback (IPLAY/GLX212/Pinball Dreams) try to measure the I/O timings (to calculate what sampling rate is feasible, etc.) and get completely confused by my shenanigans.
My current feeling is that this should become an option, perhaps enabled by default only on 80386 CPUs.