Reply 60 of 66, by superfury
I've made a small log of the 8088 MPH credits executing. It includes register states etc., and now also logs the cycles used by the different parts of each instruction:
https://www.dropbox.com/s/674e6s71smifkp3/deb … redits.zip?dl=0
Each instruction's information is dumped right after its disassembly.
Edit: Looking at the source code of your EU ( https://bitbucket.org/vstamate/cape-public/sr … U.cpp?at=master ), the rough workflow seems to be about the same as UniPCemu's. The main difference is that in your emulator, fetches to/from memory and delays are separated from the main execution flow, whereas UniPCemu does everything in the same order all at once. Your decode/rmmode/EA/imm cycles are handled in my CPU_readOP_prefix, combined with CPU_readOP's results. The rest of your execution (read/delay/execute/write) is handled by the function in the huge multi-CPU (8086-80586) lookup table, although only cycle-number-wise. UniPCemu then applies BIU prefetching to the unused cycles afterwards, instead of during execution/fetching.
My CPU core (close to cycle-exact; only the general CPU requirements are handled here, the rest is in the 8086, NECV30(80186), 80286 and 80386 core files, which cover the basic execution/read/write timings): https://bitbucket.org/superfury/unipcemu/src/ … cpu.c?at=master
My 808X core: https://bitbucket.org/superfury/unipcemu/src/ … 086.c?at=master
Can you see what's going wrong, vladstamate, reenigne, Jepael?
Edit: The CPU core can be adjusted to provide (semi-)exact cycle timings, although that might be wasted effort, since the CPU has to wait for the BIU transfer to complete anyway (is prefetching the only thing that can occur in parallel?). So the only difference might actually be that my emulator has 'faster' prefetch: the prefetch cycles (4 for each prefetched byte) are applied to the total time the BIU is idle during the instruction, instead of to each little idle delay between bus accesses. For example, your emulator might run an instruction as: 2 cycles decoding, 4 cycles fetching, 2 cycles delay, 2 cycles writing, 2 cycles delay. UniPCemu will simply do the equivalent of: 2 cycles decoding, 4 cycles fetching, [2 cycles writing, 4 cycles delay]. The total cycles are the same, but because UniPCemu doesn't emulate the read/write/execution phases separately, it sees a single 4-cycle delay and fetches one byte into the prefetch queue, while your emulator, with its two separate 2-cycle delays, won't prefetch at all.
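To make the example above concrete, here's a minimal sketch in C of how the two accounting schemes end up with different prefetch counts. It's not code from UniPCemu or from vladstamate's emulator: the Phase struct, count_prefetches, total_cycles and the two phase tables are hypothetical names made up for illustration. It also assumes that a prefetch needs 4 uninterrupted idle cycles and can't span an EU bus access, and that the "4 cycles fetching" is an EU memory read that keeps the bus busy, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* One phase of an instruction (hypothetical model, for illustration only). */
typedef struct {
    int cycles;   /* length of the phase in CPU cycles */
    int bus_busy; /* 1 = EU memory access owns the bus, 0 = BIU is idle */
} Phase;

/* Count the bytes the BIU could prefetch. Assumption: a prefetch needs
   cycles_per_byte uninterrupted idle cycles and cannot span a bus access. */
int count_prefetches(const Phase *phases, size_t n, int cycles_per_byte)
{
    int fetched = 0;
    int idle_run = 0;
    assert(cycles_per_byte > 0);
    for (size_t i = 0; i < n; i++) {
        if (phases[i].bus_busy) {
            /* A bus access ends the current idle run. */
            fetched += idle_run / cycles_per_byte;
            idle_run = 0;
        } else {
            idle_run += phases[i].cycles;
        }
    }
    return fetched + idle_run / cycles_per_byte;
}

/* Sum of all phase lengths, to check both models take the same total time. */
int total_cycles(const Phase *phases, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += phases[i].cycles;
    }
    return total;
}

/* Phases emulated separately: decode, fetch, delay, write, delay. */
const Phase separated_phases[] = {
    {2, 0}, {4, 1}, {2, 0}, {2, 1}, {2, 0}
};

/* Delays lumped together at the end: decode, fetch, [write, delay]. */
const Phase lumped_phases[] = {
    {2, 0}, {4, 1}, {2, 1}, {4, 0}
};
```

Calling count_prefetches(separated_phases, 5, 4) gives 0 and count_prefetches(lumped_phases, 4, 4) gives 1, while total_cycles returns 12 for both tables. So under these assumptions the instruction takes the same time either way, but the lumped version gets one byte further ahead in the prefetch queue.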
One thing that's got me wondering, though: how does the BIU know when to prefetch into the prefetch buffer while the EU is busy? It can't know whether it's safe to start fetching a byte, since it doesn't know how much longer the EU is going to be busy with the execution phase. How is this synchronized?
Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io