VOGONS


IBM PC Speaker RC values?


Reply 40 of 66, by superfury

Rank l33t++

I've just made a little recording of Links 2 (XT 10MHz emulation on UniPCemu) to compare against the YouTube recording ( https://www.youtube.com/watch?v=m6Uoi9U3gKA ). My recording is at https://www.dropbox.com/s/gn83bp8uen2ut5t/Uni … yquist.wav?dl=0

I've modified the low-pass filter to filter at the Nyquist frequency (44.1kHz sample rate), aiming for 96dB of attenuation (assuming 6dB per octave).

//Speaker low pass filter values (if defined, it's used)! We're 96dB dynamic range, with 6dB per octave, so 16 times filtering is needed to filter it fully at the nyquist frequency.
#define SPEAKER_LOWPASS ((((float)SPEAKER_RATE)/2.0f)/16.0f)
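
As a rough sanity check on that define, here is a minimal sketch that counts how many octaves lie between the cutoff and the Nyquist frequency. It assumes a single-pole filter whose rolloff approaches 6dB per octave above the cutoff, and SPEAKER_RATE is set to 44100 here purely for illustration. Under those assumptions, dividing Nyquist by 16 puts the cutoff 4 octaves below Nyquist (about 24dB down there); 16 octaves (a division by 65536) would be what 96dB at 6dB per octave asks for.

```c
/* Assumed output rate, for illustration only. */
#define SPEAKER_RATE 44100
#define SPEAKER_LOWPASS ((((float)SPEAKER_RATE)/2.0f)/16.0f)

/* Count how often the cutoff can be doubled before it passes Nyquist. */
int octaves_below_nyquist(float cutoff, float nyquist)
{
    int octaves = 0;
    while (cutoff * 2.0f <= nyquist)
    {
        cutoff *= 2.0f;
        ++octaves;
    }
    return octaves;
}

/* Asymptotic single-pole estimate: roughly 6dB per octave above the cutoff. */
int rolloff_db_at_nyquist(float cutoff, float nyquist)
{
    return 6 * octaves_below_nyquist(cutoff, nyquist);
}
```

Calling rolloff_db_at_nyquist(SPEAKER_LOWPASS, SPEAKER_RATE/2.0f) gives the estimate for the define as posted. A steeper (higher-order) filter would change the dB-per-octave figure.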

There's some soft noise in the background of UniPCemu's recording, though. Could YouTube be filtering that away?

Edit: I've made a little recording of UniPCemu running 8088MPH (compare against the original 8088MPH recording mpg):
https://www.dropbox.com/s/k8dp8ot0nyvw9he/Uni … 088MPH.zip?dl=0

Original 8088 MPH recording for reference: https://www.dropbox.com/s/9ydrycafqdmd332/808 … %20MPH.mpg?dl=0

Sounds pretty close, although the original noise seems a bit softer (more like white noise, instead of the heavy vibe that's in UniPCemu).

Last edited by superfury on 2016-12-30, 18:05. Edited 2 times in total.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 41 of 66, by gdjacobs

Rank l33t++

I wouldn't say Youtube is an authoritative reference point for how RealSound is supposed to be. Who knows what was being used to run the game?

If you have an older machine, I'd try Links there. Otherwise, see if someone has a decent microphone they can use to record gameplay sounds on their own retro machine. Of course, you have to remember that this is not necessarily the "correct" way for PC speaker audio to sound - you'd be modeling (somewhat) the sound of one particular machine.

All hail the Great Capacitor Brand Finder

Reply 42 of 66, by superfury

Rank l33t++

Listening to the UniPCemu recordings and 8088MPH.MPG, it seems that the original recording has more noise, but UniPCemu has some kind of vibe instead of the noise? Does this imply that there's a problem with my filter? Or does it imply the PIT is producing a signal with errors?

My PIT emulation:

//Render 1.19MHz samples for the time that has passed!
length = i/MHZ14_RATE; //How many ticks to tick?
i -= length*MHZ14_RATE; //Rest the amount of ticks!
time_ticktiming = i; //Save the new count!

if (length) //Anything to tick at all?
{
for (channel=0;channel<3;channel++)
{
mode = PITchannels[channel].mode; //Current mode!
for (tickcounter = length;tickcounter;--tickcounter) //Tick all needed!
{
switch (mode) //What mode are we rendering?
{
case 0: //Interrupt on Terminal Count? Is One-Shot without Gate Input?
case 1: //One-shot mode?
switch (PITchannels[channel].status) //What status?
{
case 0: //Output goes low/high?
PITchannels[channel].channel_status = mode; //We're high when mode 1, else low with mode 0!
PITchannels[channel].reloadlistening = 1; //We're listening to reloads!
if (PITchannels[channel].reload)
{
PITchannels[channel].gatelistening = mode; //We're listening to gate with mode 1!
PITchannels[channel].status = 1; //Skip to 1: we're ready to run already!
goto mode0_1; //Skip to step 1!
}
break;
case 1: //Wait for next rising edge of gate input?
mode0_1:
if (!mode) //No wait on mode 0?
{
PITchannels[channel].status = 2;
goto mode0_2;
}
else if (PITchannels[channel].gatewenthigh) //Mode 1 waits for gate to become high!
{
PITchannels[channel].gatewenthigh = 0; //Not went high anymore!
PITchannels[channel].gatelistening = 0; //We're not listening to gate with mode 1 anymore!
PITchannels[channel].status = 2;
goto mode0_2;
}
break;
case 2: //Output goes low and we start counting to rise! After timeout we become 4(inactive) with mode 1!
mode0_2:
if (PITchannels[channel].reload)
{
PITchannels[channel].reload = 0; //Not reloading anymore!
PITchannels[channel].channel_status = 0; //Lower output!
reloadticker(channel); //Reload the counter!
}

oldvalue = PITchannels[channel].ticker; //Save old ticker for checking for overflow!
if (mode) --PITchannels[channel].ticker; //Mode 1 always ticks?
else if ((PCSpeakerPort&1) || (channel<2)) --PITchannels[channel].ticker; //Mode 0 ticks when gate is high! The other channels are tied 1!
wrapPITticker(channel); //Wrap us correctly!
if ((!PITchannels[channel].ticker) && oldvalue) //Timeout when ticking? We're done!
{
PITchannels[channel].channel_status = 1; //We're high again!
}
break;
default: //Unsupported! Ignore any input!
break;
}
break;
case 2: //Also Rate Generator mode?
case 6: //Rate Generator mode?
switch (PITchannels[channel].status) //What status?
{
case 0: //Output going high! See below! Wait for reload register to be written!
PITchannels[channel].channel_status = 1; //We're high!
PITchannels[channel].status = 1; //Skip to 1: we're ready to run already!
PITchannels[channel].reloadlistening = 1; //We're listening to reloads!
goto mode2_1; //Skip to step 1!
break;
case 1: //We're starting the count?
mode2_1:
if (PITchannels[channel].reload)
{
reload2:
PITchannels[channel].reload = 0; //Not reloading!
reloadticker(channel); //Reload the counter!
PITchannels[channel].channel_status = 1; //We're high!
PITchannels[channel].status = 2; //Start counting!
PITchannels[channel].reloadlistening = 0; //We're not listening to reloads anymore!
PITchannels[channel].gatelistening = 1; //We're listening to the gate!
}
break;
case 2: //We start counting to rise!!
if (PITchannels[channel].gatewenthigh) //Gate went high?
{
PITchannels[channel].gatewenthigh = 0; //Not anymore!
goto reload2; //Reload and execute!
}
if (((PCSpeakerPort & 1) && (channel==2)) || (channel<2)) //We're high or undefined?
{
--PITchannels[channel].ticker; //Decrement?
switch (PITchannels[channel].ticker) //Two to one? Go low!
{
case 1:
PITchannels[channel].channel_status = 0; //We're going low during this phase!
break;
case 0:
PITchannels[channel].channel_status = 1; //We're going high again during this phase!
reloadticker(channel); //Reload the counter!
break;
default: //No action taken!
break;
}
}
else //We're low? Output=High and wait for reload!
{
PITchannels[channel].channel_status = 1; //We're going high again during this phase!
}
break;
default: //Unsupported! Ignore any input!
break;
}
break; //Don't fall through into the Square Wave mode!
//mode 2==6 and mode 3==7.
case 3: //Square Wave mode?
case 7: //Also Square Wave mode?
switch (PITchannels[channel].status) //What status?
{
case 0: //Output going high! See below! Wait for reload register to be written!
PITchannels[channel].channel_status = 1; //We're high!
PITchannels[channel].reloadlistening = 1; //We're listening to reloads!
if (PITchannels[channel].reload)
{
PITchannels[channel].reload = 0; //Not reloading!
reloadticker(channel); //Reload the counter!
PITchannels[channel].status = 1; //Next status: we're loaded and ready to run!
PITchannels[channel].reloadlistening = 0; //We're not listening to reloads anymore!
PITchannels[channel].gatelistening = 1; //We're listening to the gate!
goto mode3_1; //Skip to step 1!
}
break;
case 1: //We start counting to rise!!
mode3_1:
if (PITchannels[channel].gatewenthigh)
{
PITchannels[channel].gatewenthigh = 0; //Not anymore!
PITchannels[channel].reload = 0; //Reloaded!
reloadticker(channel); //Gate going high reloads the ticker immediately!
}
if ((PCSpeakerPort&1) || (channel<2)) //To tick at all? The other channels are tied 1!
{
PITchannels[channel].ticker -= 2; //Decrement by 2 instead?
switch (PITchannels[channel].ticker)
{
case 0: //Even counts decreased to 0!
case 0xFFFF: //Odd counts decreased to -1/0xFFFF.
PITchannels[channel].channel_status ^= 1; //We're toggling during this phase!
PITchannels[channel].reload = 0; //Reloaded!
reloadticker(channel); //Reload the next value to tick!
break;
default: //No action taken!
break;
}
}
break;
default: //Unsupported! Ignore any input!
break;
}
break;
case 4: //Software Triggered Strobe?
case 5: //Hardware Triggered Strobe?
switch (PITchannels[channel].status) //What status?
{
case 0: //Output going high! See below! Wait for reload register to be written!
PITchannels[channel].channel_status = 1; //We're high!
PITchannels[channel].status = 1; //Skip to 1: we're ready to run already!
PITchannels[channel].reloadlistening = 1; //We're listening to reloads!
PITchannels[channel].gatelistening = 1; //We're listening to the gate!
goto mode4_1; //Skip to step 1!
break;
case 1: //We're starting the count or waiting for rising gate(mode 5)?
mode4_1:
if (PITchannels[channel].reload)
{
pit45_reload: //Reload PIT modes 4&5!
if ((mode == 4) || ((PITchannels[channel].gatewenthigh) && (mode == 5))) //Reload when allowed!
{
PITchannels[channel].gatewenthigh = 0; //Reset gate high flag!
PITchannels[channel].reload = 0; //Not reloading!
reloadticker(channel); //Reload the counter!
PITchannels[channel].status = 2; //Start counting!
}
}
break;
case 2: //We start counting to rise!!
case 3: //We're counting, but ignored overflow?
if (PITchannels[channel].reload || (((mode==5) && PITchannels[channel].gatewenthigh))) //We're reloaded?
{
goto pit45_reload; //Reload when allowed!
}
if (((PCSpeakerPort & 1) && (channel == 2)) || (channel<2)) //We're high or undefined?
{
--PITchannels[channel].ticker; //Decrement?
wrapPITticker(channel); //Wrap us correctly!
if (!PITchannels[channel].ticker && (PITchannels[channel].status!=3)) //One to zero? Go low when not overflown already!
{
PITchannels[channel].channel_status = 0; //We're going low during this phase!
PITchannels[channel].status = 3; //We're ignoring any further overflows from now on!
}
else
{
PITchannels[channel].channel_status = 1; //We're going high again any other phase!
}
}
else //We're low? Output=High and wait for reload!
{
PITchannels[channel].channel_status = 1; //We're going high again during this phase!
}
break;
}
break;
default: //Unsupported mode! Ignore any input!
break;
}
currentsample = PITchannels[channel].channel_status; //The current sample we're processing, prefetched!
if (channel) //Handle channel 1&2 separately too!
{
//Process the rise toggle!
if (((PITchannels[channel].lastchannel_status^currentsample)&1) && currentsample) //Raised?
{
PITchannels[channel].risetoggle ^= 1; //Toggle the bit in our output port!
}

//Now, write the (changed) output to the channel to use!
if (channel==2) //PIT2 needs a sound buffer?
{
//We're ready for the current result!
writefifobuffer(PITchannels[channel].rawsignal, currentsample&((PCSpeakerPort & 2) >> 1)); //Add the data to the raw signal! Apply the output mask too!
}
else //PIT1 is connected to an external ticker!
{
if ((PITchannels[channel].lastchannel_status^currentsample) & 1) //Changed?
{
if (PIT1Ticker) //Gotten a handler for it?
{
PIT1Ticker(currentsample); //Handle this PIT1 tick!
}
}
}
}
else //PIT0?
{
if ((PITchannels[channel].lastchannel_status^currentsample) & 1) //Changed?
{
if (currentsample) //Raised?
{
raiseirq(0); //Raise IRQ0!
}
else //Lowered?
{
lowerirq(0); //Lower IRQ0!
}
}
}
PITchannels[channel].lastchannel_status = currentsample; //Save the new status!
}
}
}

The used filter (low-pass and high-pass IIR filters):
https://bitbucket.org/superfury/unipcemu/src/ … ers.c?at=master

Edit: The filter itself might be doubled in frequency, seeing as the input signal from the PIT is scaled to -16384 to +16383 15-bit range instead of full 16-bit range to prevent overflow when filtering.

Edit: Is it just me or is the recording from UniPCemu slower than the MPG recording?


Reply 43 of 66, by reenigne

Rank Oldbie

The high-frequency noise in the Links recording is probably aliasing of the PWM carrier, but I can't tell for sure without doing some research and running a spectrogram.

The buzzing (and slowness) in the 8088MPH mod player is almost certainly due to inaccuracies in the timing of your emulated CPU causing samples to be emitted at irregular intervals instead of exactly every 72 PIT cycles. Again I'd need to do some analysis to be completely sure.
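
To put numbers on that interval, here is a minimal sketch of the arithmetic. It assumes the nominal PIT clock of 1193182 Hz and the 4:1 ratio between the 4.77MHz CPU clock and the PIT clock on the IBM PC/XT; the function names are made up for illustration. One sample every 72 PIT cycles works out to roughly 16.57kHz, or 288 CPU cycles per sample.

```c
/* Nominal PIT input clock in Hz (assumed value: 14.31818MHz / 12). */
#define PIT_HZ 1193182.0

/* Sample rate of a player that emits one sample every pit_cycles PIT ticks. */
double pwm_sample_rate(double pit_hz, unsigned pit_cycles)
{
    return pit_hz / (double)pit_cycles;
}

/* The 4.77MHz CPU clock runs at 4x the PIT clock on the IBM PC/XT. */
unsigned pit_to_cpu_cycles(unsigned pit_cycles)
{
    return pit_cycles * 4u;
}

/* Playback speed relative to real hardware when the emulated loop takes
   actual_cpu_cycles per sample instead of the expected number. */
double relative_speed(unsigned expected_cpu_cycles, unsigned actual_cpu_cycles)
{
    return (double)expected_cpu_cycles / (double)actual_cpu_cycles;
}
```

For example, relative_speed(288, 320) would describe a loop that takes 320 emulated CPU cycles per sample, which plays at 90% of the intended speed and lowers the carrier by the same factor.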

The easiest way for you to find out for sure would be to do the experiment I suggested earlier - dump the pre-filtered 1.193MHz speaker data to a file, load it into Cool Edit, resample to 44.1kHz there, listen to the result and compare the resulting spectrogram to that of the actual output from your emulator with its filter in place. Then you'll be able to see exactly what your filter is doing.

Reply 44 of 66, by Scali

Rank l33t
reenigne wrote:

The buzzing (and slowness) in the 8088MPH mod player is almost certainly due to inaccuracies in the timing of your emulated CPU causing samples to be emitted at irregular intervals instead of exactly every 72 PIT cycles.

I second that, the mod player is definitely running considerably slower than on the real thing.
As a side-effect, the carrier wave is also much lower, and therefore more apparent. There also seems to be quite some fluctuation in the carrier wave frequency. Which probably indicates that the player does not just run more slowly as a whole, but the timing of individual instructions is probably off by some random factor, causing microscopic speedups and slowdowns in the player's inner loops.

I don't think the 8088 MPH mod player is a good target for developing a PC speaker emulation routine. A prerequisite for the 8088 MPH mod player to produce correct PWM data is that your emulator is cycle-exact.
Most other PWM routines, such as in Links, are timed with the PIT, and are therefore not directly dependent on the speed of each instruction in your emulator. This will at least make the PWM data fire at roughly the correct speed.

The thing with PWM data is that it is speed-dependent of course. Every value you write to the PIT is relative to the targeted sampling rate. If you play these samples too slowly, the resulting amplitude will be too low. Play them too quickly, and you get the problem that you are trying to update the PIT counter before it even reached 0. What happens then, depends on how your emulator handles this. On real hardware, the new value will not get active until the counter has reached 0, so it gets shifted in time. You get a bad case of the jitters.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 45 of 66, by gdjacobs

Rank l33t++

Yup. Don't try to prove that your PIT is working properly while also trying to filter the output to approximate something real from a speaker.


Reply 46 of 66, by superfury

Rank l33t++

Well, the strange thing is in the recording of the raw 1.19MHz wave: my sound editor refuses to open it(other recordings at 44.1kHz load without problems), so I can't even verify that(despite 10 minutes of recording taking up over 1GB of uncompressed 16-bit sound data, at 2.38 MB/s.). Maybe the rate is too high? I'm using WavePad(free/demo) to view it(the limit's apparently 196kHz).


Reply 47 of 66, by Jepael

Rank Oldbie
superfury wrote:

Well, the strange thing is in the recording of the raw 1.19MHz wave: my sound editor refuses to open it(other recordings at 44.1kHz load without problems), so I can't even verify that(despite 10 minutes of recording taking up over 1GB of uncompressed 16-bit sound data, at 2.38 MB/s.). Maybe the rate is too high? I'm using WavePad(free/demo) to view it(the limit's apparently 196kHz).

Audacity might go higher than that but not up to 1.2 MHz sampling rate.
But you can always convert a 119318 Hz wave to a 4410 Hz wave, as the ratio is almost the same. It is an approximation, and there might be better ratios.

Reply 48 of 66, by gdjacobs

Rank l33t++
superfury wrote:

Well, the strange thing is in the recording of the raw 1.19MHz wave: my sound editor refuses to open it(other recordings at 44.1kHz load without problems), so I can't even verify that(despite 10 minutes of recording taking up over 1GB of uncompressed 16-bit sound data, at 2.38 MB/s.). Maybe the rate is too high? I'm using WavePad(free/demo) to view it(the limit's apparently 196kHz).

The 1.19MHz WAV is binary, right? Should be straightforward to whip up a converter for adding all non-zero bits in the sample window.
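
A minimal sketch of such a converter, assuming the raw stream has been unpacked to one byte per 1.19MHz tick (non-zero meaning the speaker output is high). The function name and buffer layout are hypothetical. It averages all ticks that fall into each output sample window (a plain box filter), which is not a high-quality resampler but is enough to get something a sound editor will open.

```c
#include <stddef.h>
#include <stdint.h>

/* Average the high ticks in each output window and scale to signed 16-bit.
   Returns the number of output samples written. */
size_t box_downsample(const uint8_t *bits, size_t n_in,
                      uint32_t in_rate, uint32_t out_rate,
                      int16_t *out, size_t max_out)
{
    uint32_t phase = 0;           /* fractional position, in units of 1/in_rate */
    uint32_t ones = 0, count = 0; /* high ticks and total ticks in this window */
    size_t n_out = 0;
    size_t i;

    for (i = 0; (i < n_in) && (n_out < max_out); ++i)
    {
        if (bits[i]) ++ones;
        ++count;
        phase += out_rate;
        if (phase >= in_rate) /* window complete: emit one output sample */
        {
            phase -= in_rate;
            out[n_out++] = (int16_t)((int32_t)(ones * 65535u / count) - 32768);
            ones = 0;
            count = 0;
        }
    }
    return n_out;
}
```

With in_rate set to 1193182 and out_rate to 44100, each output sample covers 27 or 28 input ticks. Any ticks left over in an unfinished final window are dropped.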


Reply 49 of 66, by superfury

Rank l33t++

It's a plain wav file with a mono 16-bit uncompressed stream at 1.19MHz(rounded down to integer Hz). It has the 0s translated to -32768 and 1s translated to 32767. Otherwise, it just adds the required RIFF, Header and Data sections, nothing more(See wave.c).
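
For reference, a minimal sketch of what such a file's 44-byte header looks like, using the canonical PCM WAV layout. This is not the code from wave.c; the function names are hypothetical and the layout is assumed to be the plain RIFF/fmt/data form with no extra chunks.

```c
#include <stdint.h>
#include <string.h>

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)(v >> 24);
}

/* Fill a canonical 44-byte header for mono 16-bit uncompressed PCM. */
void build_wav_header(uint8_t hdr[44], uint32_t sample_rate, uint32_t data_bytes)
{
    const uint16_t channels = 1, bits = 16;
    const uint16_t block_align = (uint16_t)(channels * bits / 8);

    memcpy(hdr + 0, "RIFF", 4);
    put_le32(hdr + 4, 36u + data_bytes);           /* file size minus 8 */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_le32(hdr + 16, 16);                        /* size of the fmt chunk */
    put_le16(hdr + 20, 1);                         /* format 1 = PCM */
    put_le16(hdr + 22, channels);
    put_le32(hdr + 24, sample_rate);
    put_le32(hdr + 28, sample_rate * block_align); /* bytes per second */
    put_le16(hdr + 32, block_align);
    put_le16(hdr + 34, bits);
    memcpy(hdr + 36, "data", 4);
    put_le32(hdr + 40, data_bytes);
}

/* Map a speaker bit to the full-scale samples described above. */
int16_t speaker_bit_to_sample(int bit)
{
    return bit ? 32767 : -32768;
}
```

At 1193182 Hz the byte rate field comes out at 2386364 bytes per second, in line with the roughly 2.38 MB/s mentioned earlier. An editor that rejects the file is therefore more likely objecting to the sample rate than to the header layout.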


Reply 50 of 66, by superfury

Rank l33t++

And about the 8088MPH loop's timing: only one thing has changed recently: the IN/OUT timing is now more accurate(4 vs 8 cycles, depending on chip and byte/word accesses, see my post at vladstamate's thread). Otherwise, it follows the 8086/8088 user manual 100%(Except MUL, which is more accurate).


Reply 51 of 66, by reenigne

Rank Oldbie
Jepael wrote:

Audacity might go higher than that but not up to 1.2 MHz sampling rate.

I know Cool Edit Pro can do a 1.193MHz sample rate, so Adobe Audition almost certainly can too. It might be necessary to load raw PCM samples rather than a .wav file.

Failing that, superfury could just put together a really simple command-line resampling utility using SRC or similar.

superfury wrote:

And about the 8088MPH loop's timing: only one thing has changed recently: the IN/OUT timing is now more accurate(4 vs 8 cycles, depending on chip and byte/word accesses, see my post at vladstamate's thread). Otherwise, it follows the 8086/8088 user manual 100%(Except MUL, which is more accurate).

So how many cycles does the mod player take per sample at the moment on your emulator?

Btw, the 5160 motherboard (and probably the 5150 too) inserts a cycle of wait state for any port access, so an IN or OUT instruction takes 5 cycles (10 for a word) as well as the 4 or 8 cycles to fetch the instruction itself. Though it sounds like you have a much greater error than 1 cycle in 288 at the moment.

Reply 52 of 66, by Scali

Rank l33t
reenigne wrote:

Though it sounds like you have a much greater error than 1 cycle in 288 at the moment.

You make a good point here... We *know* that the innerloop is 288 cycles (give or take 1 cycle every now and then, as we saw on the spectrum analysis).
There is exactly one out-instruction to port 0x42 in every loop.
So if superfury were to measure the number of emulated cycles between these out-instructions, he'd see exactly how far his emulation is off.
It could be a rough target for fine-tuning the emulation core. When you can get the loop to run at 288 cycles, it's accurate enough for the mod player/PWM data, even though it may not be cycle-exact for every individual instruction.
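
A minimal sketch of that measurement, assuming the emulator keeps a running total of emulated CPU cycles and can call a hook on every port write. The struct and function names are hypothetical, not part of UniPCemu.

```c
#include <stdint.h>

#define PIT2_DATA_PORT 0x42u

typedef struct
{
    uint64_t last_cycle; /* cycle count at the previous write to port 0x42 */
    int have_last;       /* set once the first write has been seen */
    uint64_t intervals;  /* number of intervals measured */
    uint64_t total;      /* sum of all intervals, for the average */
    uint64_t min;
    uint64_t max;
} OutIntervalStats;

/* Call on every OUT; only writes to port 0x42 are measured. */
void out_interval_record(OutIntervalStats *s, uint16_t port, uint64_t cycle_now)
{
    if (port != PIT2_DATA_PORT) return;
    if (s->have_last)
    {
        uint64_t delta = cycle_now - s->last_cycle;
        if ((s->intervals == 0) || (delta < s->min)) s->min = delta;
        if ((s->intervals == 0) || (delta > s->max)) s->max = delta;
        s->total += delta;
        ++s->intervals;
    }
    s->last_cycle = cycle_now;
    s->have_last = 1;
}
```

Starting from a zero-initialised OutIntervalStats, total/intervals gives the average loop length to compare against 288, and the spread between min and max shows how much the interval jitters.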


Reply 53 of 66, by superfury

Rank l33t++

I'll add logging of the CPU cycles taken on instructions in the loop later. As I said, currently most timing(except the variable range unknown timings like (I)MUL/(I)DIV) are the timings from the 8086/8088 user manual(can't remember the exact name, user manual by Intel afaik). It contains a big multipage table at the end with the used timings and started with ModR/M timings afaik(with memory timings after each instruction's variants as far as I remember).

Edit: After a quick search on Google, I seem to have found the manual I used (or one of its revisions): Intel 8086 Family User's Manual, October 1979

https://edge.edx.org/c4x/BITSPilani/EEE231/as … s_Manual_1_.pdf


Reply 55 of 66, by reenigne

Rank Oldbie
superfury wrote:

As I said, currently most timing(except the variable range unknown timings like (I)MUL/(I)DIV) are the timings from the 8086/8088 user manual(can't remember the exact name, user manual by Intel afaik). It contains a big multipage table at the end with the used timings and started with ModR/M timings afaik(with memory timings after each instruction's variants as far as I remember).

That's not the whole story, though, or there would be lots of cycle-exact 8088/8086 emulators. If I understand correctly, this table contains best-case timings - i.e. doesn't include the cycles that the EU spends waiting for the BIU. Though that suggests that your emulator should be faster than a real CPU rather than slower.

Reply 56 of 66, by superfury

Rank l33t++

Well, those timings are not all UniPCemu uses: it also applies MMU Read/Write cycles, prefetch cycles, modr/m calculations etc. This combines into one big formula(addition). That is the time spent on the current instruction. Finally, the value is copied and the memory cycles are subtracted to obtain the 'busy'(non-bus) cycles. These are divided by 4 to obtain the number of prefetch bytes to fetch from memory.

Furthermore, the renderer(CGA on VGA) keeps the CPU in semi-busy state(hardware-induced HLT) to apply CGA WaitState memory.

 if ((CPU[activeCPU].cycles_OP|CPU[activeCPU].cycles_HWOP|CPU[activeCPU].cycles_Exception) && CPU_useCycles) //cycles entered by the instruction?
{
CPU[activeCPU].cycles = CPU[activeCPU].cycles_OP+CPU[activeCPU].cycles_HWOP+CPU[activeCPU].cycles_Prefix + CPU[activeCPU].cycles_Exception + CPU[activeCPU].cycles_Prefetch + CPU[activeCPU].cycles_MMUR + CPU[activeCPU].cycles_MMUW; //Use the cycles as specified by the instruction!
}
else //Automatic cycles placeholder?
{
#endif
CPU[activeCPU].cycles = (CPU_databussize>=1)?9:8; //Use 9 with 8088MPH CPU(8088 CPU), normal 8 with 8086.
#ifdef CPU_USECYCLES

 void CPU_fillPIQ() //Fill the PIQ until it's full!
{
if (CPU[activeCPU].PIQ==0) return; //Not gotten a PIQ? Abort!
byte oldMMUCycles;
oldMMUCycles = CPU[activeCPU].cycles_MMUR; //Save the MMU cycles!
CPU[activeCPU].cycles_MMUR = 0; //Counting raw time spent retrieving memory!
writefifobuffer(CPU[activeCPU].PIQ, MMU_rb(CPU_SEGMENT_CS, CPU[activeCPU].registers->CS, CPU[activeCPU].PIQ_EIP++, 1)); //Add the next byte from memory into the buffer!
CPU[activeCPU].cycles_Prefetch += CPU[activeCPU].cycles_MMUR; //Apply the memory cycles to prefetching!
//Next data! Take 4 cycles on 8088, 2 on 8086 when loading words/4 on 8086 when loading a single byte.
CPU[activeCPU].cycles_MMUR = oldMMUCycles; //Restore the MMU cycles!
}

void CPU_tickPrefetch()
{
if (!CPU[activeCPU].PIQ) return; //Disable invalid PIQ!
byte cycles;
cycles = CPU[activeCPU].cycles; //How many cycles have been spent on the instruction?
cycles -= CPU[activeCPU].cycles_MMUR; //Don't count memory access cycles!
cycles -= CPU[activeCPU].cycles_MMUW; //Don't count memory access cycles!
cycles -= CPU[activeCPU].cycles_Prefetch; //Don't count memory access cycles by prefetching required data!
//Now we have the amount of cycles we're idling.
if (EMULATED_CPU<CPU_80286) //Old CPU?
{
for (;(cycles >= 4) && fifobuffer_freesize(CPU[activeCPU].PIQ);) //Prefetch left to fill?
{
CPU_fillPIQ(); //Add a byte to the prefetch!
cycles -= 4; //This takes four cycles to transfer!
}
}
else //286+
{
for (;(cycles >= 3) && fifobuffer_freesize(CPU[activeCPU].PIQ);) //Prefetch left to fill?
{
CPU_fillPIQ(); //Add a byte to the prefetch!
cycles -= 3; //This takes three cycles to transfer!
}
}
}


Reply 57 of 66, by reenigne

Rank Oldbie
superfury wrote:

Well, those timings are not all UniPCemu uses: it also applies MMU Read/Write cycles, prefetch cycles, modr/m calculations etc. This combines into one big formula(addition).

I think this is wrong. The BIU cycles aren't added to the EU cycles - the EU and the BIU operate concurrently. Sometimes the EU has to wait for the BIU (when the EU needs to do a bus access and the BIU is busy doing a prefetch) and sometimes the BIU has to wait for the EU (when the prefetch queue fills up during a long-running instruction, or when prefetch is suspended pending a jump). So that explains why your emulator is running slow.

Reply 58 of 66, by superfury

Rank l33t++

Then we reach the next problem: is it possible to apply the BIU/EU split while keeping the current jumptable-based emulation? Currently each step counts timings: prefetching adds to cycles_Prefetch, memory accesses add to MMUR/W (same with I/O), and miscellaneous work adds to the other timings (HW for hardware interrupts). The BIU part (prefetching and memory/IO, where MMUR/W is already added like a wait state of 4/8 cycles) currently subtracts (ignores) the timings spent on bus transactions. The remaining time (time not spent on bus transactions) is then spent on filling the prefetch queue (4 cycles per prefetch, like byte memory accesses). Thus BIU transfers to/from memory are counted serially, while prefetching for the next instructions is done in parallel (in the calculated remaining time not using the BIU).

So if a CPU instruction of 16 cycles (.cycles) spends 4 cycles on memory transactions (reading opcodes, memory I/O, I/O ports), the remaining 12 cycles are then spent fetching 3 bytes into the prefetch queue. Thus this does happen in parallel. Since memory/bus accesses stall the CPU (.cycles) and the CPU has to wait for the transaction to complete, isn't it correct to count those non-prefetch bus cycles as EU cycles?
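
The arithmetic in that example can be sketched as below. This mirrors the model described in this post (idle time after subtracting bus cycles, one prefetched byte per 4 idle cycles), not real 8088 behaviour; the function name is hypothetical, and the free-slot limit stands in for the fifobuffer_freesize() check.

```c
/* Bytes prefetched during one instruction under the model described above:
   bus cycles are subtracted first, then each fetch costs cycles_per_fetch
   idle cycles, limited by the free space left in the prefetch queue. */
unsigned prefetched_bytes(unsigned total_cycles, unsigned bus_cycles,
                          unsigned free_queue_slots, unsigned cycles_per_fetch)
{
    unsigned idle = (total_cycles > bus_cycles) ? (total_cycles - bus_cycles) : 0;
    unsigned fetched = 0;

    while ((idle >= cycles_per_fetch) && (fetched < free_queue_slots))
    {
        idle -= cycles_per_fetch; /* one byte transferred */
        ++fetched;
    }
    return fetched;
}
```

With an empty 4-byte queue (the size on the 8088), prefetched_bytes(16, 4, 4, 4) yields the 3 bytes from the example. If only 2 slots are free, the result is capped at 2 and the remaining idle cycles go unused.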


Reply 59 of 66, by vladstamate

Rank Oldbie
superfury wrote:

Then we reach the next problem: is it possible to apply the BIU/EU split while keeping the current jumptable-based emulation?

No, not really. I tried hard to achieve that. Since I already had a working emulator, I hoped I could reuse as much as possible when I started to write my cycle-based emulator. But once I did the BIU/EU split there was no way I could keep a jump table, because an instruction's execution had to be broken into parts. Somehow I had to find a way to return early from an instruction's execution if, for example, either the prefetch buffer was empty or I had to issue some memory reads. I could not do that cleanly. So in the end I had to rewrite all the instruction emulation from scratch, with the idea of working on a per-cycle basis rather than a per-instruction basis.

For example a simple instruction like MOV AX, [DI+4] has to issue 3 reads from prefetch (a byte at a time) and 2 bytes from memory (also a byte at a time for an 8088). Only when those reads are done can we execute the instruction, however some part of it is executed in the beginning, like the decoding, decoding the RM, EA calculation, and in the middle to request the memory reads once the EA has done its calculation etc.

Not only that, but that instruction (and any other) can take a variable number of cycles depending on the preceding instruction. Why? Because what is in the prefetch buffer is different (it could be empty or it could be full) depending on what happened before. So any instruction can really have a variable number of cycles.

Take a look at my emulator to see what I am doing.

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/