VOGONS


UniPCemu 8088 cycle accuracy

First post, by superfury

I've just been wondering about something.

Does the 8088 require the BIU transfer to be finished (T4 ticked, about to become T1) when an instruction ends with a write?
UniPCemu currently waits for T4 to tick in that case and then finishes the instruction on the following T1 cycle.

I know it's required for reads, but what about memory or bus (I/O address space) writes?

Another interesting issue is that overscan shows on only every other scanline of the 8088 MPH Kefrens effect. Otherwise, it's fully accurate.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 1 of 122, by GloriousCow

A microcode W instruction will indeed finish on T3/TwLast of the last byte transferred. If an instruction ends with W, RNI, then on T3, the byte for the next instruction is read out of the queue as well.

Here's a trace from MartyPC running 'push si', with 'push di' as the next instruction:

00   [F16E6] CS M:R.. I:... D:. CODE T2 <-r 00 [        ] <-q 56 |    :              |               | [F000:16E5] push si (1)
01 [F16E6] CS M:R.. I:... D:. CODE T3 <-r 57 [ ] | 028: SP -> tmpa | DEC2 tmpa |
02 [F16E6] CS M:... I:... D:. CODE T4 [57 ] | 029: SIGMA-> IND | |
03 A:[F16E7] M:... I:... D:. CODE T1 [57 ] | 02A: SIGMA-> SP | |
04 [F16E7] CS M:R.. I:... D:. CODE T2 <-r 00 [57 ] | 02B: M -> OPR | W,RNI DS,P0 | ; BUS_BEGIN
05 [F16E7] CS M:R.. I:... D:. CODE T3 <-r F3 [57 ] | 02B: M -> OPR | W,RNI DS,P0 |
06 [F16E7] CS M:... I:... D:. CODE T4 [57F3 ] | 02B: M -> OPR | W,RNI DS,P0 |
07 A:[02D48] M:... I:... D:. MEMW T1 [57F3 ] | 02B: M -> OPR | W,RNI DS,P0 |
08 [02D48] SS M:.A. I:... D:. MEMW T2 [57F3 ] | 02B: M -> OPR | W,RNI DS,P0 |
09 [02D48] SS M:.AW I:... D:. MEMW T3 w-> A0 [57F3 ] | 02B: M -> OPR | W,RNI DS,P0 |
10 [02D48] SS M:... I:... D:. MEMW T4 [57F3 ] | 02B: M -> OPR | W,RNI DS,P0 | ; BUS_BEGIN
11 A:[02D49] M:... I:... D:. MEMW T1 [57F3 ] | 02B: M -> OPR | W,RNI DS,P0 |
12 [02D49] SS M:.A. I:... D:. MEMW T2 [57F3 ] | 02B: M -> OPR | W,RNI DS,P0 |
13 [02D49] SS M:.AW I:... D:. MEMW T3 w-> 00 [F3 ] | 02B: M -> OPR | W,RNI DS,P0 | FINALIZE; FINALIZE_END
00 [02D49] SS M:... I:... D:. MEMW T4 [F3 ] <-q 57 | : | | [F000:16E6] push di (1)

Technically, a read operation ends on T3/TwLast as well, but the read is loaded into OPR. You still need to do something with it, like assign it to a register, which is generally done on T4.

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 2 of 122, by GloriousCow

here's a little test for you...

this is the CPU timing test yoinked out of 8088MPH (credits to reenigne) with 8 timer checkpoints inserted. the test disables DRAM refresh - if you're watching PIT timer channel #1 your emulator should disable DMA automatically, otherwise you'll want to turn off wait states other than the 1 on IO.

Here are the values produced on my IBM 5150:
FF43
FE59
FDC5
FD58
FD2A
FC6B
FBB7
F9A9
Elapsed ticks: 07CA

Each is timing a particular block of instructions. Let's see what you get!

Attachment: 8088tst3.zip (1.18 KiB, 56 downloads)
File comment: 8088mph cpu test v0.3
File license: Public domain

Reply 3 of 122, by superfury

UniPCemu only disables DMA if the PIT mode is set to wait (using the gate never going from low to high on channels 0 and 1, as it's tied high). Otherwise, any kind of PIT mode that requires input before starting a (new) initial counter can be stopped by never giving said input.
Is that enough for that program to work?

Does it display those results in the MS-DOS prompt?

Also, thinking about the Kefrens frame start issue: that should then be caused by the instructions that run before the heavily repeated loop, right?

Last edited by superfury on 2023-07-14, 02:23. Edited 2 times in total.

Reply 5 of 122, by superfury

https://github.com/reenigne/reenigne/blob/mas … ens/kefrens.asm
So it should be in the header block, the footer block, both, or perhaps at the initialization part?
The instructions in the inner loop should be correct then? So those can be ignored in said blocks?

Also, about the credits, it's there too:
https://github.com/reenigne/reenigne/blob/mas … 8/demo/credits/.

It may depend on what's loaded at the time of the crash: is a prefetch needed at the instruction past the POP (in other words, is the PIQ not yet filled with the "mov cl,99" instruction)?
Or perhaps "pop word [cs:bx]" is the issue? UniPCemu ticks 5 (mem) or 3 (reg) cycles after reading the stack (5 cycles in this case, starting on T1), so it ends on T2 of fetching the second byte after the pop?

Last edited by superfury on 2023-07-14, 02:47. Edited 1 time in total.

Reply 7 of 122, by superfury

Also, see my edited last message about pop word [cs:bx]. Is 5 idle EU cycles after reading RAM correct, ending at T2 of a possible second prefetch byte?

I was referring to the kefrens source code first and the credits crash after that.

I'll run the program when I have time for it.

Reply 8 of 122, by GloriousCow

superfury wrote on 2023-07-14, 02:50:

Also, see my edited last message about pop word [cs:bx]. Is 5 idle EU cycles after reading RAM correct, ending at T2 of a possible second prefetch byte?

the way i have pop rm16:
1 cycle
pop stack (timing from biu)
1 cycle
if mem operand,+2 cycles
write operand

Reply 9 of 122, by superfury

GloriousCow wrote on 2023-07-14, 03:15:
the way i have pop rm16:
1 cycle
pop stack (timing from biu)
1 cycle
if mem operand,+2 cycles
write operand

Is that 1 cycle before the pop there for all POPs? What about RETF? UniPCemu does 2 cycles before all POPs.
Also, what do you mean by "write operand"? What timing does that have, in cycles? And does it apply to a memory operand, a register operand, or both?

Edit: Also started on unifying the BIU prefetch and request based ticking.
Edit: OK. Prefetching is now moved into the BIU request handler, which checks on T1 and processes on T3 (T1 on the 286; T0 on the 486+ only, since all T-states are T0 on the 486+ as far as I know, which is how it's implemented).
The only thing left to do now is to convert all bus memory and I/O transactions that use a motherboard-specified waitstate count (1 bus waitstate on the XT architecture) so that the waitstate triggers when starting a request, before the I/O or memory access (between T2 and T3 on the 808x) instead of between T3 and T4. Luckily most of the handling is already there. I'll just need to load the bus waitstate count into the BIU as it already did (if the count is set and both flags mentioned below are cleared), set a flag (the lower bit), and perform a normal waitstate abort. When execution returns to that code with the bit set, it shifts the bit one position up in the flag to mark it as processed, performs the actual memory or I/O access, and checks for waitstates the usual way. Once the T4 cycle handler activates, it clears both flags, so future transfers (or the second access of a broken-up word transfer) perform the motherboard waitstate once again.

One nice thing is that the request-based ticking should also fix some issues from the previous commit (the BIU changes) with UniPCemu's IPS clocking mode. That mode essentially forces the BIU and EU into one-cycle-per-instruction operation, based on the execute flag instead of the cycle counter. (The counter is always 1 in cycle-accurate mode, to keep things running instead of stopping all hardware timing when an instruction is unfinished. The exception is waitstates forced by hardware, such as ET4000/W32 ACL accesses, which require ticking the hardware multiple times during an instruction if its buffers are full when writing.) The IPS mode BIU is properly handled again with these latest commits.
Another nice thing is that hardware waitstates in IPS clocking mode (unlike motherboard-wide ones, such as the 1-cycle I/O waitstate on all I/O devices, which are basically untimed in IPS clocking mode) can tick the hardware for 1 cycle (as in IPS cycle) to keep it running properly, without deadlocking the BIU through a hardware-never-ticks chicken-and-egg problem.

Edit: OK. Waitstates are fixed now. Now it hangs on an INTA? Hmmm...
Edit: OK. It's booting again now! 😁 New waitstates should be working now!

Running 8088 MPH again now...
Credits still hang? It seems to have filled the 'blank canvas' code with a 'mov sp,02db' (at 1df8:025d) and 12 NOPs.
Although I've forgotten to enable the cycle logging, this is what happens (common log format):

Attachment: debugger_8088MPH_credits_UniPCemu_20230714_1738.7z (3.64 KiB, 58 downloads)
File comment: 8088 MPH credits (cycles forgotten to log).
File license: Fair use/fair dealing exception

Perhaps the contents of the instructions and registers are a hint as to what's happening and when?

Reply 10 of 122, by superfury

One interesting thing with 8088 MPH's Kefrens effect is that I can now see what looks like a repeating pattern in the top quarter of the screen (when looking at the gaps in the overscan color). It's like a '/' staircase that keeps repeating, following the beat of the left-to-right movement of the Kefrens bars that fill the remainder of the screen. It's still limited to odd/even rows, though.

I'll try running your program for a bit now.

Reply 11 of 122, by GloriousCow

superfury wrote on 2023-07-14, 06:23:

Is that 1 cycle before pop for all POPs? What about RETF? UniPCemu does 2 cycles before all POPs.
Also, what do you mean with "write operand"? What timing is that (in cycles?)? Also, memory? Reg operand?

Nope, just that particular form (8F). Since there is a memory operand, POP rm8/16 has to stash the index value produced by the EA calculation in a temporary register before copying the stack pointer to the index for the stack read; that accounts for one cycle.

By 'write operand' I mean the instruction has to write the popped value from the stack back to either memory (BIU timings) or a register (1 cycle).
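
As a rough illustration (not MartyPC or UniPCemu code), the POP rm16 step list from this thread can be tallied as below. The function name and the default of 8 cycles per word transfer (two 4-cycle byte transfers, no wait states, no contention with prefetch) are assumptions made for the sketch; EA calculation and instruction decode are left out.

```python
def pop_rm16_cycles(mem_operand: bool, stack_read: int = 8, mem_write: int = 8) -> int:
    """Tally the steps listed in the thread for POP rm16 (opcode 8F).

    stack_read and mem_write stand in for BIU timing, which in a real
    emulator depends on wait states and prefetch activity. The defaults
    are an assumption: two back-to-back 4-cycle byte transfers.
    """
    cycles = 1                # stash the EA index in a temporary register
    cycles += stack_read      # pop the word from the stack (BIU timing)
    cycles += 1               # one cycle after the stack read
    if mem_operand:
        cycles += 2           # extra cycles for a memory operand
        cycles += mem_write   # write the popped value to memory (BIU timing)
    else:
        cycles += 1           # write the popped value to a register
    return cycles


print(pop_rm16_cycles(mem_operand=True))
print(pop_rm16_cycles(mem_operand=False))
```

The totals only reflect the assumed defaults; on real hardware the BIU portions vary with the state of the bus.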

superfury wrote on 2023-07-14, 06:23:

Edit: OK. Prefetching is now moved into the BIU request handler (which checks on T1 and processes on T3 (T1 on 286, T0(all Tstates are T0 on 486+ afaik and is implemented) only on 486+).

unclear what you mean by that exactly; but i'll mention that i make prefetch decisions on T3/TwLast. I can only speak for 8088, but the cpu makes a decision to prefetch at least 2 cycles in advance of when it will actually try to begin a CODE bus cycle; i call this prefetch scheduling. So this usually lands on T3 if no wait states. The prefetch is not always scheduled 2 cycles later; based on the length of the queue the cpu may delay the fetch an additional 2-3 cycles.

In any case, doing the logic this way means that you can prevent a prefetch from being scheduled during a given bus cycle if an EU bus request arrives before the prefetch is scheduled, and will incur a prefetch abort penalty if it arrives on or after the cycle the prefetch is scheduled. The latter is always a 2 cycle delay after T4.
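
A toy sketch of that rule, under simplifying assumptions: one 4-cycle bus cycle with no wait states, the prefetch scheduled on T3, and a hypothetical `eu_bus_start` helper that reports when the EU's own bus cycle could begin. It ignores the extra queue-length-dependent delay and is not taken from MartyPC.

```python
BUS_CYCLE_LEN = 4   # T1..T4 with no wait states (assumption)
SCHEDULE_AT = 2     # 0-based index of T3, where the prefetch gets scheduled
ABORT_PENALTY = 2   # cycles lost after T4 when a scheduled prefetch is aborted


def eu_bus_start(request_cycle: int) -> int:
    """Return the cycle, counted from T1 of the current bus cycle, on which
    the EU's bus cycle could begin, given the cycle its request arrived."""
    if not 0 <= request_cycle < BUS_CYCLE_LEN:
        raise ValueError("request must arrive during the current bus cycle")
    if request_cycle < SCHEDULE_AT:
        # Request arrived before scheduling: no prefetch is scheduled,
        # so the EU can start right after T4.
        return BUS_CYCLE_LEN
    # Request arrived on or after scheduling: the prefetch is aborted
    # and the EU pays the fixed penalty after T4.
    return BUS_CYCLE_LEN + ABORT_PENALTY


for cycle, name in enumerate(("T1", "T2", "T3", "T4")):
    print(name, eu_bus_start(cycle))
```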

superfury wrote on 2023-07-14, 06:23:

The only thing left to do now would be to convert all bus memory and i/o transactions using a motherboard-specified waitstate count (1 bus waitstate on XT architecture) to trigger when starting a request, before the i/o or memory access (between T2 and T3 on 808x) instead of between T3 and T4. Luckily most of the handling is already there. I'll just need to load the bus waitstate count into the BIU like it did already (if set and the below mentioned flags are both cleared) and set a flag(lower bit) and perform a normal waitstate abort and then when it's returns to said code with that bit set shift it 1 position up in the flag to mark it as processed and perform the actual memory or i/o access and check for waitstates the usual way. Then once the T4 cycle handler activates, it will clear both flags, causing future (or second broken up accesses for word transfers) transfers to perform the motherboard waitstate once again.

my waitstate processing is super simple; basically i have a counter for how many cycles that READY should be deasserted, it is always decremented on each cpu cycle, and if it is >0 when we reach T3 we insert wait states until it is 0. Bus operations that incur wait states just increment the counter; as does DMA.
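
A minimal sketch of that kind of counter, for illustration only. The class and method names are made up, and the exact ordering of the decrement relative to the T3 check is a guess rather than something stated in the thread.

```python
class BusCycle:
    """Toy T-state machine gated by a READY down-counter."""

    def __init__(self):
        self.ready_wait = 0   # cycles for which READY stays deasserted
        self.state = "T1"

    def add_wait(self, cycles: int) -> None:
        # Bus operations (or DMA) that incur wait states bump the counter.
        self.ready_wait += cycles

    def tick(self) -> str:
        """Advance one CPU cycle and return the T-state just executed."""
        executed = self.state
        if self.ready_wait > 0:
            self.ready_wait -= 1          # decremented on every CPU cycle
        if executed == "T1":
            self.state = "T2"
        elif executed == "T2":
            self.state = "T3"
        elif executed in ("T3", "Tw"):
            # Insert wait states until the counter reaches zero.
            self.state = "Tw" if self.ready_wait > 0 else "T4"
        else:                             # T4 wraps to the next bus cycle
            self.state = "T1"
        return executed


def run_one_bus_cycle(wait_cycles: int) -> list:
    bus = BusCycle()
    bus.add_wait(wait_cycles)
    states = []
    while not states or states[-1] != "T4":
        states.append(bus.tick())
    return states


print(run_one_bus_cycle(0))
print(run_one_bus_cycle(4))
```

In this model a wait request that expires before T3 costs nothing, because the counter keeps running down during T1 and T2.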

superfury wrote on 2023-07-14, 06:23:
Running 8088 MPH again now...
Credits still hang? It seems to have filled the 'blank canvas' code with a 'mov sp,02db' (at 1df8:025d) and 12 NOPs.
Although I've forgotten to enable the cycle logging, this is what happens (common log format):
debugger_8088MPH_credits_UniPCemu_20230714_1738.7z
Perhaps the contents of the instruction and registers is a hint as to what's happening and when?

The main thing with the end credits of 8088MPH is that it's got a bunch of self-modifying code, so your prefetch and instruction queue emulation must be spot on, or you'll end up executing the wrong instructions. it's so tight that you must fetch operands in a cycle accurate way - for example it is tempting to read an immediate operand during instruction decode, but that is too early. Instruction decode handles reading the prefixes, opcode, modrm for you, and common microcode routines handle loading the displacement and EA operand for you, but immediate operands have to be deliberately fetched by the specific opcode's main microcode. That's usually done first thing by an instruction with an immediate operand form, but not always - JCXZ and LOOP for example both wait two cycles before fetching their operands.

Reply 12 of 122, by superfury

Just modified the 8F opcode as you wrote.

Just looked at the first few instructions executing (Generic Super PC/Turbo XT BIOS v3.0, I think) when logging in cycle-accurate mode (always log, even during skipping; single-line format, simplified; state and register logs enabled; advanced logging enabled):

00:00:25:24.08614: BIU T1		
00:06:16:98.02016: BIU T2
00:06:16:98.02912: BIU T3 Physical(p):000ffff0=ea(ê); Paged(p):000ffff0=ea(ê); Normal(p):00000000=ea(ê)
00:06:16:98.03168: BIU T4
00:06:16:98.03392: BIU T1
00:06:16:98.03584: BIU T2
00:06:16:98.03776: BIU T3 Paged(p):000ffff1=5b([); Normal(p):00000001=5b([)
00:06:16:98.04000: BIU T4
00:06:16:98.04320: BIU T1
00:06:16:98.04512: BIU T2
00:06:16:98.04672: BIU T3 Paged(p):000ffff2=e0(à); Normal(p):00000002=e0(à)
00:06:16:98.04864: BIU T4
00:06:16:98.05056: BIU T1
00:06:16:98.05216: BIU T2
00:06:16:98.05408: BIU T3 Paged(p):000ffff3=00( ); Normal(p):00000003=00( )
00:06:16:98.05568: BIU T4
00:06:16:98.05728: BIU T1
00:06:16:98.05920: BIU T2
00:06:16:98.06432: BIU T3 Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)
00:06:16:98.06592: BIU T4 ffff:0000 (EA5BE000F0)JMP F000:E05B
00:06:16:98.06720: BIU T1
00:06:16:98.06848: BIU T2
00:06:16:98.06944: BIU T3 Physical(p):000fe05b=fa(ú); Paged(p):000fe05b=fa(ú); Normal(p):0000e05b=fa(ú)
00:06:16:98.07072: BIU T4 f000:e05b (FA)CLI
00:06:16:98.07200: BIU T1
00:06:16:98.07296: BIU T2
00:06:16:98.07392: BIU T3 Paged(p):000fe05c=fc(ü); Normal(p):0000e05c=fc(ü)
00:06:16:98.07520: BIU T4 f000:e05c (FC)CLD
00:06:16:98.07616: BIU T1
00:06:16:98.07744: BIU T2
00:06:16:98.07872: BIU T3 Paged(p):000fe05d=b0(°); Normal(p):0000e05d=b0(°)
00:06:16:98.08000: BIU T4
00:06:16:98.08096: BIU T1
00:06:16:98.08192: BIU T2
00:06:16:98.08320: BIU T3 Paged(p):000fe05e=00( ); Normal(p):0000e05e=00( )
00:06:16:98.08416: BIU T4 f000:e05d (B000)MOV AL,00
00:06:16:98.08544: BIU T1
00:06:16:98.08640: BIU T2
00:06:16:98.08768: BIU T3 Paged(p):000fe05f=e6(æ); Normal(p):0000e05f=e6(æ)
00:06:16:98.08896: BIU T4
00:06:16:98.08992: BIU T1
00:06:16:98.09120: BIU T2
00:06:16:98.09216: BIU T3 Physical(p):000fe060=a0( ); Paged(p):000fe060=a0( ); Normal(p):0000e060=a0( )
00:06:16:98.09344: BIU T4 f000:e05f (E6A0)OUT A0,AL
00:06:16:98.09440: BIU T1
00:06:16:98.09568: BIU T2
00:06:16:98.09664: BIU T3 Paged(p):000fe061=ba(º); Normal(p):0000e061=ba(º)
00:06:16:98.09792: BIU T4
00:06:16:98.09920: BIU T1
00:06:16:99.00016: BIU T2
00:06:16:99.00144: BIU TW
00:06:16:99.00240: BIU TW
00:06:16:99.00368: BIU TW
00:06:16:99.00464: BIU T4
00:06:16:99.00560: BIU T1
00:06:16:99.00688: BIU T2
00:06:16:99.00816: BIU T3 Paged(p):000fe062=d8(Ø); Normal(p):0000e062=d8(Ø)
00:06:16:99.00912: BIU T4
00:06:16:99.01040: BIU T1
00:06:16:99.01136: BIU T2
00:06:16:99.01840: BIU T3		Paged(p):000fe063=03(); Normal(p):0000e063=03()
00:06:16:99.01968: BIU T4 f000:e061 (BAD803)MOV DX,03D8
00:06:16:99.02096: BIU T1
00:06:16:99.02224: BIU T2
00:06:16:99.02320: BIU T3 Paged(p):000fe064=ee(î); Normal(p):0000e064=ee(î)
00:06:16:99.02448: BIU T4 f000:e064 (EE)OUT DX,AL
00:06:16:99.02544: BIU T1
00:06:16:99.02640: BIU T2
00:06:16:99.02768: BIU T3 Paged(p):000fe065=b2(²); Normal(p):0000e065=b2(²)
00:06:16:99.02864: BIU T4
00:06:16:99.02992: BIU T1
00:06:16:99.03120: BIU T2
00:06:16:99.03216: BIU T3 Paged(p):000fe066=b8(¸); Normal(p):0000e066=b8(¸)
00:06:16:99.03344: BIU T4
00:06:16:99.03440: BIU T1
00:06:16:99.03536: BIU T2
00:06:16:99.03664: BIU TW
00:06:16:99.03760: BIU TW
00:06:16:99.03888: BIU TW
00:06:16:99.04016: BIU T4
00:06:16:99.04144: BIU T1 f000:e065 (B2B8)MOV DL,B8

That would indicate that the first instruction starts at T4 (that is when it fills the executed opcode). But the 4th byte should become available at T1 instead of T3? Or is that correct behaviour?
The BIU sets a flag that is only cleared when ticking T4 on the BIU (The BIU always ticks after the EU), preventing reading a byte before it's finished.
Perhaps it's the fault of the prefetching itself? Since the BIU fills the prefetch and the EU simply empties it (it doesn't take the flag that indicates that Tw finished into account). Or modify the BIU to only fill it at T4... Hmmm...
Edit: I guess the solution is simple:
1. When filling the PIQ with data on T3, store it in an intermediate buffer instead.
2. When ticking T4 and checking the completed type, flush the intermediate buffer and write it to the prefetch buffer instead.

That way the intermediate buffer acts like the waitstate and allows 1/2/4 bytes to be randomly buffered for every prefetch (compatible with all CPUs emulated).
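
The two steps above could be sketched roughly like this. It is an illustration in Python rather than UniPCemu's actual code; the class name, method names and the 4-byte staging capacity are assumptions.

```python
from collections import deque


class StagedPrefetch:
    """Bytes read on T3 are staged and only become visible to the EU on T4."""

    def __init__(self, capacity: int = 4):
        self.capacity = capacity      # most bytes one fetch can stage (assumed)
        self.staged = bytearray()     # filled on T3, invisible to the EU
        self.queue = deque()          # prefetch queue the EU reads from

    def on_t3(self, data: bytes) -> None:
        if len(self.staged) + len(data) > self.capacity:
            raise OverflowError("staging buffer full")
        self.staged += data

    def on_t4(self) -> None:
        # Flush the staged bytes into the prefetch queue.
        self.queue.extend(self.staged)
        self.staged.clear()

    def eu_read(self):
        # Returns None when nothing has been committed yet.
        return self.queue.popleft() if self.queue else None


piq = StagedPrefetch()
piq.on_t3(b"\xea")
print(piq.eu_read())   # nothing committed yet
piq.on_t4()
print(hex(piq.eu_read()))
```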

Reply 13 of 122, by GloriousCow

superfury wrote on 2023-07-14, 18:18:
Just modified the 8F opcode as you wrote.

00:00:25:24.08614: BIU T1		
00:06:16:98.02016: BIU T2
00:06:16:98.02912: BIU T3 Physical(p):000ffff0=ea(ê); Paged(p):000ffff0=ea(ê); Normal(p):00000000=ea(ê)
00:06:16:98.03168: BIU T4
00:06:16:98.03392: BIU T1
00:06:16:98.03584: BIU T2
00:06:16:98.03776: BIU T3 Paged(p):000ffff1=5b([); Normal(p):00000001=5b([)
00:06:16:98.04000: BIU T4
00:06:16:98.04320: BIU T1
00:06:16:98.04512: BIU T2
00:06:16:98.04672: BIU T3 Paged(p):000ffff2=e0(à); Normal(p):00000002=e0(à)
00:06:16:98.04864: BIU T4
00:06:16:98.05056: BIU T1
00:06:16:98.05216: BIU T2
00:06:16:98.05408: BIU T3 Paged(p):000ffff3=00( ); Normal(p):00000003=00( )
00:06:16:98.05568: BIU T4
00:06:16:98.05728: BIU T1
00:06:16:98.05920: BIU T2
00:06:16:98.06432: BIU T3 Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)
00:06:16:98.06592: BIU T4 ffff:0000 (EA5BE000F0)JMP F000:E05B

What are all those bus cycles doing?

It should only take 5 cycles to execute the JMP at the reset vector after the CPU resets - 4 to fetch the first opcode, then 1 cycle to read it out, then the JMP executes.

00   [00000]    M:... I:... D:. PASV T1        |  0R [        ]        | 
01 [00000] M:... I:... D:. PASV T1 | 0R [ ] | ; SUSP
02 [00000] M:... I:... D:. PASV T1 | 0R [ ] |
03 [00000] M:... I:... D:. PASV T1 | 0R [ ] | ; FLUSH
04 [00000] M:... I:... D:. PASV T1 | E0R [ ] |
05 [00000] M:... I:... D:. PASV T1 | 0R [ ] |
CPU RESET!
00 A:[FFFF0] M:... I:... D:. CODE T1 | 0R [ ] |
01 [FFFF0] CS M:R.. I:... D:. CODE T2 <-r 00 | 0R [ ] |
02 [FFFF0] CS M:R.. I:... D:. CODE T3 <-r EA | 0R [ ] |
03 [FFFF0] CS M:... I:... D:. CODE T4 | 0W [EA ] |
04 A:[FFFF1] M:... I:... D:. CODE T1 | 1R [ ] |
05 [FFFF1] CS M:R.. I:... D:. CODE T2 <-r 00 | F0R [ ] <-q EA | [FFFF:0000] jmpf far 0xF000:0xE05B (5) ; EXECUTE

Cycles 1-5 here are the actual reset procedure (it's microcoded!)

Reply 14 of 122, by superfury

GloriousCow wrote on 2023-07-14, 19:22:
What are all those bus cycles doing?

It should only take 5 cycles to execute the JMP at the reset vector after the CPU resets - 4 to fetch the first opcode, then 1 cycle to read it out, then the JMP executes.

Actually, just 5 cycles is impossible. The JMP opcode (opcode EAh) is 5 bytes long, so it requires 5x4 cycles to prefetch first. That's exactly what happens in UniPCemu now with the T4 cycle result bugfix I just implemented (using a simple 4-byte FIFO buffer for everything the BIU can fetch in one go).
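
A quick back-of-the-envelope check of the two figures mentioned in this exchange, assuming back-to-back 4-cycle fetches with no wait states. The helper name is made up for the sketch.

```python
BUS_CYCLE = 4      # T1..T4 per fetched byte on the 8088, no wait states
JMP_FAR_LEN = 5    # EA 5B E0 00 F0


def fetch_cycles(n_bytes: int, wait_states: int = 0) -> int:
    """Cycles needed to fetch n_bytes back to back over the 8-bit bus."""
    return n_bytes * (BUS_CYCLE + wait_states)


# Figure from the quoted post: 4 cycles to fetch the first opcode byte,
# plus 1 cycle to read it out of the queue.
first_opcode_read = fetch_cycles(1) + 1

# Figure from this post: all five instruction bytes fetched first.
all_bytes_fetched = fetch_cycles(JMP_FAR_LEN)

print(first_opcode_read, all_bytes_fetched)
```

The first number is the point at which the opcode byte can leave the queue; the second is the point at which the last of the five bytes has arrived.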

Result:

00:00:05:21.00735: BIU T1		
00:00:05:21.01989: BIU T2
00:00:05:21.02203: BIU T3 Physical(p):000ffff0=ea(ê); Paged(p):000ffff0=ea(ê); Normal(p):00000000=ea(ê)
00:00:05:21.02391: BIU T4
00:00:05:21.02572: BIU T1
00:00:05:21.02748: BIU T2
00:00:05:21.02929: BIU T3 Paged(p):000ffff1=5b([); Normal(p):00000001=5b([)
00:00:05:21.03104: BIU T4
00:00:05:21.03275: BIU T1
00:00:05:21.03445: BIU T2
00:00:05:21.03618: BIU T3 Paged(p):000ffff2=e0(à); Normal(p):00000002=e0(à)
00:00:05:21.03786: BIU T4
00:00:05:21.03953: BIU T1
00:00:05:21.04123: BIU T2
00:00:05:21.04296: BIU T3 Paged(p):000ffff3=00( ); Normal(p):00000003=00( )
00:00:05:21.04462: BIU T4
00:00:05:21.04629: BIU T1
00:00:05:21.04795: BIU T2
00:00:05:21.05183: BIU T3 Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)
00:00:05:21.05297: BIU T4
00:00:05:21.05408: BIU T1
00:00:05:21.05519: BIU T2
00:00:05:21.05635: BIU T3 Paged(p):000ffff5=31(1); Normal(p):00000005=31(1)
00:00:05:21.05746: BIU T4
00:00:05:21.05857: BIU T1
00:00:05:21.05967: BIU T2
00:00:05:21.06124: BIU T3 Paged(p):000ffff6=31(1); Normal(p):00000006=31(1)
00:00:05:21.06253: BIU T4
00:00:05:21.06367: BIU T1
00:00:05:21.06478: BIU T2
00:00:05:21.06593: BIU T3 Paged(p):000ffff7=2f(/); Normal(p):00000007=2f(/)
00:00:05:21.06704: BIU T4
00:00:05:21.06857: BIU T1 ffff:0000 (EA5BE000F0)JMP F000:E05B
00:00:05:21.06969: BIU T2
00:00:05:21.07094: BIU T3 Physical(p):000fe05b=fa(ú); Paged(p):000fe05b=fa(ú); Normal(p):0000e05b=fa(ú)
00:00:05:21.07209: BIU T4
00:00:05:21.07319: BIU T1
00:00:05:21.07430: BIU T2
00:00:05:21.07545: BIU T3 Paged(p):000fe05c=fc(ü); Normal(p):0000e05c=fc(ü)
00:00:05:21.07655: BIU T4
00:00:05:21.07765: BIU T1
00:00:05:21.07874: BIU T2
00:00:05:21.07989: BIU T3 Paged(p):000fe05d=b0(°); Normal(p):0000e05d=b0(°)
00:00:05:21.08128: BIU T4
00:00:05:21.08283: BIU T1
00:00:05:21.08401: BIU T2
00:00:05:21.08517: BIU T3 Paged(p):000fe05e=00( ); Normal(p):0000e05e=00( )
00:00:05:21.08627: BIU T4
00:00:05:21.08747: BIU T1 f000:e05b (FA)CLI
00:00:05:21.08867: BIU T2 f000:e05c (FC)CLD
00:00:05:21.08991: BIU T3 f000:e05d (B000)MOV AL,00 Paged(p):000fe05f=e6(æ); Normal(p):0000e05f=e6(æ)
00:00:05:21.09106: BIU T4
00:00:05:21.09218: BIU T1
00:00:05:21.09328: BIU T2
00:00:05:21.09446: BIU T3 Physical(p):000fe060=a0( ); Paged(p):000fe060=a0( ); Normal(p):0000e060=a0( )
00:00:05:21.09557: BIU T4
00:00:05:21.09667: BIU T1
00:00:05:21.09777: BIU T2
00:00:05:21.09891: BIU T3 Paged(p):000fe061=ba(º); Normal(p):0000e061=ba(º)
00:00:05:22.00001: BIU T4
00:00:05:22.00115: BIU T1		
00:00:05:22.00225: BIU T2
00:00:05:22.00339: BIU T3 Paged(p):000fe062=d8(Ø); Normal(p):0000e062=d8(Ø)
00:00:05:22.00450: BIU T4
00:00:05:22.00568: BIU T1 f000:e05f (E6A0)OUT A0,AL
00:00:05:22.00679: BIU T2
00:00:05:22.00793: BIU T3 Paged(p):000fe063=03(); Normal(p):0000e063=03()
00:00:05:22.00903: BIU T4
00:00:05:22.01013: BIU --
00:00:05:22.01125: BIU --
00:00:05:22.01236: BIU T1
00:00:05:22.01347: BIU T2
00:00:05:22.01461: BIU T3 Paged(p):000fe064=ee(î); Normal(p):0000e064=ee(î)
00:00:05:22.02162: BIU T4
00:00:05:22.02274: BIU T1
00:00:05:22.02385: BIU T2
00:00:05:22.02495: BIU TW
00:00:05:22.02605: BIU TW
00:00:05:22.02725: BIU TW
00:00:05:22.02835: BIU T4
00:00:05:22.02957: BIU T1 f000:e061 (BAD803)MOV DX,03D8
00:00:05:22.03071: BIU T2
00:00:05:22.03195: BIU T3 f000:e064 (EE)OUT DX,AL Paged(p):000fe065=b2(²); Normal(p):0000e065=b2(²)
00:00:05:22.03306: BIU T4
00:00:05:22.03416: BIU T1
00:00:05:22.03526: BIU T2
00:00:05:22.03641: BIU T3 Paged(p):000fe066=b8(¸); Normal(p):0000e066=b8(¸)
00:00:05:22.03750: BIU T4
00:00:05:22.03861: BIU T1
00:00:05:22.03971: BIU T2
00:00:05:22.04085: BIU TW
00:00:05:22.04196: BIU TW
00:00:05:22.04308: BIU TW
00:00:05:22.04418: BIU T4
00:00:05:22.04529: BIU T1
00:00:05:22.04640: BIU T2
00:00:05:22.04755: BIU T3 Paged(p):000fe067=fe(þ); Normal(p):0000e067=fe(þ)
00:00:05:22.04921: BIU T4
00:00:05:22.05039: BIU T1
00:00:05:22.05150: BIU T2
00:00:05:22.05265: BIU T3 Paged(p):000fe068=c0(À); Normal(p):0000e068=c0(À)
00:00:05:22.05375: BIU T4
00:00:05:22.05494: BIU T1 f000:e065 (B2B8)MOV DL,B8
00:00:05:22.05605: BIU T2
00:00:05:22.05736: BIU T3 f000:e067 (FEC0)INC AL Paged(p):000fe069=ee(î); Normal(p):0000e069=ee(î)
00:00:05:22.05847: BIU T4
00:00:05:22.05958: BIU T1
00:00:05:22.06072: BIU T2
00:00:05:22.06187: BIU T3 Paged(p):000fe06a=b0(°); Normal(p):0000e06a=b0(°)
00:00:05:22.06299: BIU T4
00:00:05:22.06409: BIU T1
00:00:05:22.06519: BIU T2
00:00:05:22.06633: BIU T3 Paged(p):000fe06b=99(™); Normal(p):0000e06b=99(™)
00:00:05:22.06743: BIU T4
00:00:05:22.06853: BIU T1
00:00:05:22.06963: BIU T2
00:00:05:22.07081: BIU T3 Paged(p):000fe06c=e6(æ); Normal(p):0000e06c=e6(æ)
00:00:05:22.07193: BIU T4
00:00:05:22.07311: BIU T1 f000:e069 (EE)OUT DX,AL
00:00:05:22.07421: BIU T2
00:00:05:22.07535: BIU T3 Paged(p):000fe06d=63(c); Normal(p):0000e06d=63(c)
00:00:05:22.07645: BIU T4
00:00:05:22.07755: BIU --
00:00:05:22.07865: BIU T1
00:00:05:22.07976: BIU T2
00:00:05:22.08090: BIU TW
00:00:05:22.08201: BIU TW
00:00:05:22.08313: BIU TW
00:00:05:22.08423: BIU T4
00:00:05:22.08543: BIU T1 f000:e06a (B099)MOV AL,99
00:00:05:22.08653: BIU T2
00:00:05:22.08777: BIU T3 f000:e06c (E663)OUT 63,AL Paged(p):000fe06e=b0(°); Normal(p):0000e06e=b0(°)
00:00:05:22.08887: BIU T4
00:00:05:22.08997: BIU T1
00:00:05:22.09111: BIU T2
00:00:05:22.09225: BIU T3 Paged(p):000fe06f=a5(¥); Normal(p):0000e06f=a5(¥)
00:00:05:22.09336: BIU T4
00:00:05:22.09446: BIU T1
00:00:05:22.09556: BIU T2
00:00:05:22.09666: BIU TW
00:00:05:22.09776: BIU TW
00:00:05:22.09887: BIU TW
00:00:05:22.09997: BIU T4
00:00:05:23.00115: BIU T1
00:00:05:23.00227: BIU T2
00:00:05:23.00345: BIU T3 Physical(p):000fe070=e6(æ); Paged(p):000fe070=e6(æ); Normal(p):0000e070=e6(æ)
00:00:05:23.00458: BIU T4
00:00:05:23.00569: BIU T1
00:00:05:23.00679: BIU T2
00:00:05:23.00793: BIU T3 Paged(p):000fe071=61(a); Normal(p):0000e071=61(a)
00:00:05:23.00903: BIU T4
00:00:05:23.01021: BIU T1 f000:e06e (B0A5)MOV AL,A5
00:00:05:23.01136: BIU T2
00:00:05:23.01259: BIU T3 f000:e070 (E661)OUT 61,AL Paged(p):000fe072=b0(°); Normal(p):0000e072=b0(°)
00:00:05:23.01369: BIU T4
00:00:05:23.01479: BIU T1
00:00:05:23.01589: BIU T2
00:00:05:23.01704: BIU T3 Paged(p):000fe073=54(T); Normal(p):0000e073=54(T)
00:00:05:23.01814: BIU T4
00:00:05:23.01924: BIU T1
00:00:05:23.02037: BIU T2
00:00:05:23.02149: BIU TW
00:00:05:23.02258: BIU TW
00:00:05:23.02370: BIU TW
00:00:05:23.02481: BIU T4
00:00:05:23.02592: BIU T1
00:00:05:23.02730: BIU T2
00:00:05:23.02850: BIU T3 Paged(p):000fe074=e6(æ); Normal(p):0000e074=e6(æ)
00:00:05:23.02961: BIU T4
00:00:05:23.03081: BIU T1
00:00:05:23.03192: BIU T2
00:00:05:23.03307: BIU T3 Paged(p):000fe075=43(C); Normal(p):0000e075=43(C)
00:00:05:23.03417: BIU T4
00:00:05:23.03537: BIU T1 f000:e072 (B054)MOV AL,54
00:00:05:23.03647: BIU T2
00:00:05:23.03770: BIU T3 f000:e074 (E643)OUT 43,AL Paged(p):000fe076=b0(°); Normal(p):0000e076=b0(°)
00:00:05:23.03881: BIU T4
00:00:05:23.03991: BIU T1
00:00:05:23.04151: BIU T2
00:00:05:23.06450: BIU T3 Paged(p):000fe077=12(); Normal(p):0000e077=12()
00:00:05:23.06569: BIU T4
00:00:05:23.06685: BIU T1
00:00:05:23.06797: BIU T2
00:00:05:23.06907: BIU TW
00:00:05:23.07017: BIU TW
00:00:05:23.07135: BIU TW
00:00:05:23.07245: BIU T4
00:00:05:23.07357: BIU T1
00:00:05:23.07468: BIU T2
00:00:05:23.07583: BIU T3 Paged(p):000fe078=e6(æ); Normal(p):0000e078=e6(æ)
00:00:05:23.07693: BIU T4
00:00:05:23.07803: BIU T1
00:00:05:23.07914: BIU T2
00:00:05:23.08029: BIU T3 Paged(p):000fe079=41(A); Normal(p):0000e079=41(A)
00:00:05:23.08142: BIU T4
00:00:05:23.08261: BIU T1 f000:e076 (B012)MOV AL,12
00:00:05:23.08372: BIU T2
00:00:05:23.08495: BIU T3 f000:e078 (E641)OUT 41,AL Paged(p):000fe07a=b0(°); Normal(p):0000e07a=b0(°)
00:00:05:23.08679: BIU T4
00:00:05:23.08790: BIU T1
00:00:05:23.08901: BIU T2
00:00:05:23.09015: BIU T3 Paged(p):000fe07b=40(@); Normal(p):0000e07b=40(@)
00:00:05:23.09129: BIU T4
00:00:05:23.09239: BIU T1
00:00:05:23.09349: BIU T2
00:00:05:23.09461: BIU TW
00:00:05:23.09570: BIU TW
00:00:05:23.09682: BIU TW
00:00:05:23.09793: BIU T4
00:00:05:23.09905: BIU T1
00:00:05:24.00015: BIU T2
00:00:05:24.00134: BIU T3 Paged(p):000fe07c=e6(æ); Normal(p):0000e07c=e6(æ)
00:00:05:24.00244: BIU T4
00:00:05:24.00354: BIU T1
00:00:05:24.00465: BIU T2
00:00:05:24.00579: BIU T3 Paged(p):000fe07d=43(C); Normal(p):0000e07d=43(C)
00:00:05:24.00689: BIU T4
00:00:05:24.00809: BIU T1 f000:e07a (B040)MOV AL,40
00:00:05:24.00919: BIU T2
00:00:05:24.01048: BIU T3 f000:e07c (E643)OUT 43,AL Paged(p):000fe07e=b0(°); Normal(p):0000e07e=b0(°)
00:00:05:24.01159: BIU T4
00:00:05:24.01269: BIU T1
00:00:05:24.01380: BIU T2
00:00:05:24.01494: BIU T3 Paged(p):000fe07f=00( ); Normal(p):0000e07f=00( )
00:00:05:24.01605: BIU T4
00:00:05:24.01715: BIU T1
00:00:05:24.01825: BIU T2
00:00:05:24.01935: BIU TW
00:00:05:24.02048: BIU TW
00:00:05:24.02161: BIU TW
00:00:05:24.02271: BIU T4
00:00:05:24.02382: BIU T1
00:00:05:24.02492: BIU T2
00:00:05:24.02610: BIU T3 Physical(p):000fe080=e6(æ); Paged(p):000fe080=e6(æ); Normal(p):0000e080=e6(æ)
00:00:05:24.02720: BIU T4
00:00:05:24.02830: BIU T1
00:00:05:24.02941: BIU T2
00:00:05:24.03059: BIU T3 Paged(p):000fe081=81(); Normal(p):0000e081=81()
00:00:05:24.03170: BIU T4
00:00:05:24.03288: BIU T1 f000:e07e (B000)MOV AL,00

The instruction is logged at the first cycle of the EU execution, once decoding completes.
It should start executing 1 cycle after the fetch, though, since T4 completes the final byte (there's only one of those in the first instruction, fetching F0h):

00:00:05:21.05183: BIU T3		Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)

The T4 after that should have it fetched, but perhaps the EU isn't fetching it on time?
Edit: It might be some decoding logic that's taking too long here? That happens after the parts are fetched, just before the EU starts the EU phase (that causes the instruction disassembly to be logged on that cycle).
So that's at:

00:00:05:21.06857: BIU T1

Edit: Just checked. At clock 'T25' (if not taking the modulo) the first instruction is fully loaded into the BIU prefetch. So it's after the 6th fetch completes.
So that's at 00:00:05:21.05857: BIU T1 it seems, if that's the 25th clock (24 if 0-based).

Last edited by superfury on 2023-07-14, 20:21. Edited 1 time in total.


Reply 15 of 122, by GloriousCow

User metadata
Rank Member
superfury wrote on 2023-07-14, 19:38:

Actually just 5 cycles is impossible. The JMP opcode (opcode EAh) is 5 bytes long, so it requires 5x4 cycles to prefetch first. That's exactly what happens in UniPCemu now with the T4 cycle result bugfix I just implemented (using a simple 4-byte FIFO buffer for everything the BIU can fetch in one go).

We will have to disagree a bit...

This goes back to what I was saying about the SMC in 8088MPH end credits. The far pointer operand for opcode EA is not read during instruction decode, it is explicitly fetched during EA's microcode program. Therefore it's correct that JMP is executing after the first byte is fetched - it is actively executing the microcode program for EA at that point. True, the first thing the microcode does is fetch four bytes, so you may consider this splitting hairs, but I pointed out two other opcodes (JCXZ and LOOP) where this decode/fetch distinction matters.

Modelling the microcode behavior does present a challenge for logging - you don't have all the info to disassemble the instruction after just fetching the opcode, so my decode 'sniffs' ahead: it reads the opcode stream but does not fetch it through the queue. I plug the 'sniffed' values into the instruction structure anyway, and hand it off to the EU, which has to explicitly fetch those operands through the actual queue this time. It may seem inefficient, but this has the bonus of actually letting me detect self-modifying code: that's the case whenever the 'sniffed' value differs from the microcode-fetched value. And it happens a lot in the 8088MPH end credits.

This is how my cycle tracelog can print the full disassembly of EA before EA has even fetched the address yet.
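
To make the 'sniff' versus 'fetch' distinction concrete, here is a minimal Python sketch. It is not MartyPC's actual code: the PrefetchQueue class, sniff() and fetch_operand() are hypothetical names, and the queue model ignores bus timing entirely. It only shows how comparing the peeked operand bytes with the bytes later pulled through the queue can flag self-modifying code.

```python
from collections import deque


class PrefetchQueue:
    """Toy 4-byte instruction queue (hypothetical, timing not modelled)."""

    def __init__(self, memory, start):
        self.memory = memory
        self.fetch_addr = start
        self.queue = deque()

    def fill(self, size=4):
        # Prefetch sequential bytes until the queue holds `size` bytes.
        while len(self.queue) < size:
            self.queue.append(self.memory[self.fetch_addr])
            self.fetch_addr += 1

    def pop(self):
        # Consume one byte, refilling first if the queue ran dry.
        if not self.queue:
            self.fill()
        return self.queue.popleft()


def sniff(memory, addr, count):
    """Peek at operand bytes for the disassembly without touching the queue."""
    return bytes(memory[addr:addr + count])


def fetch_operand(queue, count):
    """Pull operand bytes through the queue, as the microcode would."""
    return bytes(queue.pop() for _ in range(count))


# JMP F000:E05B at offset 0, padded with NOPs.
memory = bytearray([0xEA, 0x5B, 0xE0, 0x00, 0xF0] + [0x90] * 11)
queue = PrefetchQueue(memory, 0)
queue.fill()                       # queue now holds EA 5B E0 00

opcode = queue.pop()               # decode consumes only the opcode byte
sniffed = sniff(memory, 1, 4)      # logging peeks at the far pointer

memory[4] = 0x12                   # a write lands on a byte not yet prefetched

fetched = fetch_operand(queue, 4)  # microcode fetches the real operand
smc_detected = fetched != sniffed
print(hex(opcode), sniffed.hex(), fetched.hex(), smc_detected)
```

In this toy run the last operand byte is modified after the sniff but before the queue reaches it, so the two views disagree and the mismatch is reported.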

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 16 of 122, by GloriousCow

User metadata
Rank Member

Let me expound a bit. The decode phase of the 8088 will fetch prefixes, the opcode byte itself, and the modrm byte if there is one. That's it. Anything else the instruction needs, like reading the EA operand, reading a displacement, or any immediates, is done by the EU in microcode. The first two are handled by microcode routines common to all opcodes; the latter must be done by the opcode-specific microcode routine itself.

Have you ever wondered why [BX+DI] is a cycle longer than [BP+DI]? It's because the former calls the latter - and the jump costs a cycle.
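
For reference, the effective-address clock counts that Intel documents for the 8086/8088 show the same one-cycle split between the base+index forms. The sketch below just tabulates those four published values; the EA_CLOCKS name and the helper are illustrative, and the table by itself says nothing about how the microcode is organised.

```python
# Documented 8086/8088 effective-address calculation clocks for the
# base+index addressing modes without displacement.
EA_CLOCKS = {
    "BX+SI": 7,
    "BP+DI": 7,
    "BX+DI": 8,
    "BP+SI": 8,
}


def extra_cycles(mode, baseline="BP+DI"):
    """Cycles `mode` costs beyond `baseline` (illustrative helper)."""
    return EA_CLOCKS[mode] - EA_CLOCKS[baseline]


print(extra_cycles("BX+DI"))
```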


Reply 17 of 122, by superfury

User metadata
Rank l33t++

OK. So UniPCemu logs on the first EU-timed part of the instruction (immediately after fetching and decoding), while you do external decoding (as if some external device was listening, reading all bytes in 1 cycle, then decoding and logging on said cycle).

The log of UniPCemu shows correct results now (proper execution start after 20 cycles):

00:00:12:45.09091: BIU T1		
00:00:12:46.00407: BIU T2
00:00:12:46.00607: BIU T3 Physical(p):000ffff0=ea(ê); Paged(p):000ffff0=ea(ê); Normal(p):00000000=ea(ê)
00:00:12:46.00777: BIU T4
00:00:12:46.00945: BIU T1
00:00:12:46.01112: BIU T2
00:00:12:46.01286: BIU T3 Paged(p):000ffff1=5b([); Normal(p):00000001=5b([)
00:00:12:46.01454: BIU T4
00:00:12:46.02224: BIU T1
00:00:12:46.02403: BIU T2
00:00:12:46.02576: BIU T3 Paged(p):000ffff2=e0(à); Normal(p):00000002=e0(à)
00:00:12:46.02745: BIU T4
00:00:12:46.02912: BIU T1
00:00:12:46.03078: BIU T2
00:00:12:46.03252: BIU T3 Paged(p):000ffff3=00( ); Normal(p):00000003=00( )
00:00:12:46.03419: BIU T4
00:00:12:46.03585: BIU T1
00:00:12:46.03751: BIU T2
00:00:12:47.09188: BIU T3 Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)
00:00:12:47.09367: BIU T4
00:00:12:47.09539: BIU T1 ffff:0000 (EA5BE000F0)JMP F000:E05B
00:00:12:47.09655: BIU T2
00:00:12:47.09779: BIU T3 Physical(p):000fe05b=fa(ú); Paged(p):000fe05b=fa(ú); Normal(p):0000e05b=fa(ú)
00:00:12:47.09894: BIU T4
00:00:12:48.00015: BIU T1 f000:e05b (FA)CLI
00:00:12:48.00126: BIU T2
00:00:12:48.00243: BIU T3 Paged(p):000fe05c=fc(ü); Normal(p):0000e05c=fc(ü)
00:00:12:48.00355: BIU T4
00:00:12:48.00473: BIU T1 f000:e05c (FC)CLD
00:00:12:48.00585: BIU T2
00:00:12:48.00700: BIU T3 Paged(p):000fe05d=b0(°); Normal(p):0000e05d=b0(°)
00:00:12:48.00811: BIU T4
00:00:12:48.00923: BIU T1
00:00:12:48.01034: BIU T2
00:00:12:48.01149: BIU T3 Paged(p):000fe05e=00( ); Normal(p):0000e05e=00( )
00:00:12:48.01263: BIU T4
00:00:12:48.01384: BIU T1 f000:e05d (B000)MOV AL,00
00:00:12:48.01495: BIU T2
00:00:12:48.01612: BIU T3 Paged(p):000fe05f=e6(æ); Normal(p):0000e05f=e6(æ)
00:00:12:48.01723: BIU T4
00:00:12:48.01835: BIU T1
00:00:12:48.01945: BIU T2
00:00:12:48.02064: BIU T3 Physical(p):000fe060=a0( ); Paged(p):000fe060=a0( ); Normal(p):0000e060=a0( )
00:00:12:48.02175: BIU T4
00:00:12:48.02295: BIU T1 f000:e05f (E6A0)OUT A0,AL
00:00:12:48.02407: BIU T2
00:00:12:48.02522: BIU T3 Paged(p):000fe061=ba(º); Normal(p):0000e061=ba(º)
00:00:12:48.02631: BIU T4
00:00:12:48.02743: BIU --
00:00:12:48.02853: BIU --
00:00:12:48.02963: BIU T1
00:00:12:48.03075: BIU T2
00:00:12:48.03191: BIU T3 Paged(p):000fe062=d8(Ø); Normal(p):0000e062=d8(Ø)
00:00:12:48.03302: BIU T4
00:00:12:48.03414: BIU T1
00:00:12:48.03526: BIU T2
00:00:12:48.03637: BIU TW
00:00:12:48.03748: BIU TW
00:00:12:48.03882: BIU TW
00:00:12:48.03995: BIU T4
00:00:12:48.04111: BIU T1		
00:00:12:48.04223: BIU T2
00:00:12:48.04339: BIU T3 Paged(p):000fe063=03(); Normal(p):0000e063=03()
00:00:12:48.04449: BIU T4
00:00:12:48.04571: BIU T1 f000:e061 (BAD803)MOV DX,03D8
00:00:12:48.04682: BIU T2
00:00:12:48.04799: BIU T3 Paged(p):000fe064=ee(î); Normal(p):0000e064=ee(î)
00:00:12:48.04909: BIU T4
00:00:12:48.05028: BIU T1 f000:e064 (EE)OUT DX,AL
00:00:12:48.05139: BIU T2
00:00:12:48.05255: BIU T3 Paged(p):000fe065=b2(²); Normal(p):0000e065=b2(²)
00:00:12:48.05366: BIU T4
00:00:12:48.05477: BIU T1
00:00:12:48.05589: BIU T2
00:00:12:48.05704: BIU T3 Paged(p):000fe066=b8(¸); Normal(p):0000e066=b8(¸)
00:00:12:48.05816: BIU T4
00:00:12:48.05927: BIU T1
00:00:12:48.06038: BIU T2
00:00:12:48.06149: BIU TW
00:00:12:48.06292: BIU TW
00:00:12:48.06429: BIU TW
00:00:12:48.06551: BIU T4
00:00:12:48.06682: BIU T1 f000:e065 (B2B8)MOV DL,B8
00:00:12:48.06795: BIU T2
00:00:12:48.06911: BIU T3 Paged(p):000fe067=fe(þ); Normal(p):0000e067=fe(þ)
00:00:12:48.07021: BIU T4
00:00:12:48.07134: BIU T1
00:00:12:48.07250: BIU T2
00:00:12:48.07365: BIU T3 Paged(p):000fe068=c0(À); Normal(p):0000e068=c0(À)
00:00:12:48.07476: BIU T4
00:00:12:48.09642: BIU T1 f000:e067 (FEC0)INC AL
00:00:12:48.09782: BIU T2
00:00:12:48.09903: BIU T3 Paged(p):000fe069=ee(î); Normal(p):0000e069=ee(î)
00:00:12:49.00014: BIU T4
00:00:12:49.00134: BIU T1 f000:e069 (EE)OUT DX,AL
00:00:12:49.00251: BIU T2
00:00:12:49.00368: BIU T3 Paged(p):000fe06a=b0(°); Normal(p):0000e06a=b0(°)
00:00:12:49.00551: BIU T4
00:00:12:49.00663: BIU T1
00:00:12:49.00775: BIU T2
00:00:12:49.00891: BIU T3 Paged(p):000fe06b=99(™); Normal(p):0000e06b=99(™)
00:00:12:49.01003: BIU T4
00:00:12:49.01115: BIU T1
00:00:12:49.01227: BIU T2
00:00:12:49.01339: BIU TW
00:00:12:49.01450: BIU TW
00:00:12:49.01566: BIU TW
00:00:12:49.01677: BIU T4
00:00:12:49.01801: BIU T1 f000:e06a (B099)MOV AL,99
00:00:12:49.01914: BIU T2
00:00:12:49.02030: BIU T3 Paged(p):000fe06c=e6(æ); Normal(p):0000e06c=e6(æ)
00:00:12:49.02142: BIU T4
00:00:12:49.02255: BIU T1
00:00:12:49.02366: BIU T2
00:00:12:49.02481: BIU T3 Paged(p):000fe06d=63(c); Normal(p):0000e06d=63(c)
00:00:12:49.02594: BIU T4
00:00:12:49.02718: BIU T1 f000:e06c (E663)OUT 63,AL
00:00:12:49.02829: BIU T2
00:00:12:49.02945: BIU T3 Paged(p):000fe06e=b0(°); Normal(p):0000e06e=b0(°)
00:00:12:49.03056: BIU T4
00:00:12:49.03167: BIU --
00:00:12:49.03279: BIU --
00:00:12:49.03390: BIU T1
00:00:12:49.03501: BIU T2
00:00:12:49.03616: BIU T3 Paged(p):000fe06f=a5(¥); Normal(p):0000e06f=a5(¥)
00:00:12:49.03727: BIU T4
00:00:12:49.03839: BIU T1
00:00:12:49.03949: BIU T2
00:00:12:49.04060: BIU TW
00:00:12:49.04171: BIU TW
00:00:12:49.04386: BIU TW
00:00:12:49.04595: BIU T4
00:00:12:49.04790: BIU T1 f000:e06e (B0A5)MOV AL,A5
00:00:12:49.04948: BIU T2
00:00:12:49.05209: BIU T3 Physical(p):000fe070=e6(æ); Paged(p):000fe070=e6(æ); Normal(p):0000e070=e6(æ)
00:00:12:49.05405: BIU T4
00:00:12:49.05612: BIU T1
00:00:12:49.05818: BIU T2
00:00:12:49.06031: BIU T3 Paged(p):000fe071=61(a); Normal(p):0000e071=61(a)
00:00:12:49.06192: BIU T4
00:00:12:49.06459: BIU T1 f000:e070 (E661)OUT 61,AL
00:00:12:49.06668: BIU T2
00:00:12:49.06893: BIU T3 Paged(p):000fe072=b0(°); Normal(p):0000e072=b0(°)
00:00:12:49.07104: BIU T4
00:00:12:49.07312: BIU --
00:00:12:49.07470: BIU --
00:00:12:49.07691: BIU T1
00:00:12:49.07927: BIU T2
00:00:12:49.08143: BIU T3 Paged(p):000fe073=54(T); Normal(p):0000e073=54(T)
00:00:12:49.08350: BIU T4
00:00:12:49.08546: BIU T1
00:00:12:49.08740: BIU T2
00:00:12:49.08895: BIU TW
00:00:12:49.09043: BIU TW
00:00:12:49.09299: BIU TW
00:00:12:49.09505: BIU T4
00:00:12:49.09755: BIU T1 f000:e072 (B054)MOV AL,54
00:00:12:49.09961: BIU T2
00:00:12:50.00181: BIU T3 Paged(p):000fe074=e6(æ); Normal(p):0000e074=e6(æ)
00:00:12:50.00335: BIU T4
00:00:12:50.00559: BIU T1
00:00:12:50.00777: BIU T2
00:00:12:50.00995: BIU T3 Paged(p):000fe075=43(C); Normal(p):0000e075=43(C)
00:00:12:50.01197: BIU T4
00:00:12:50.01416: BIU T1 f000:e074 (E643)OUT 43,AL
00:00:12:50.01574: BIU T2
00:00:12:50.01810: BIU T3 Paged(p):000fe076=b0(°); Normal(p):0000e076=b0(°)
00:00:12:50.02029: BIU T4
00:00:12:50.02237: BIU --
00:00:12:50.02451: BIU --
00:00:12:50.02659: BIU T1
00:00:12:50.02835: BIU T2
00:00:12:50.03037: BIU T3 Paged(p):000fe077=12(); Normal(p):0000e077=12()
00:00:12:50.03256: BIU T4
00:00:12:50.03447: BIU T1
00:00:12:50.03651: BIU T2
00:00:12:50.03859: BIU TW
00:00:12:50.04063: BIU TW
00:00:12:50.04223: BIU TW
00:00:12:50.04475: BIU T4
00:00:12:50.04709: BIU T1 f000:e076 (B012)MOV AL,12
00:00:12:50.04915: BIU T2
00:00:12:50.05134: BIU T3 Paged(p):000fe078=e6(æ); Normal(p):0000e078=e6(æ)
00:00:12:50.05338: BIU T4
00:00:12:50.05495: BIU T1
00:00:12:50.05750: BIU T2
00:00:12:50.05974: BIU T3 Paged(p):000fe079=41(A); Normal(p):0000e079=41(A)
00:00:12:50.06181: BIU T4
00:00:12:50.06405: BIU T1 f000:e078 (E641)OUT 41,AL
00:00:12:50.06610: BIU T2
00:00:12:50.06773: BIU T3 Paged(p):000fe07a=b0(°); Normal(p):0000e07a=b0(°)
00:00:12:50.07024: BIU T4
00:00:12:50.07218: BIU --
00:00:12:50.07425: BIU --
00:00:12:50.07632: BIU T1
00:00:12:50.07836: BIU T2
00:00:12:50.08003: BIU T3 Paged(p):000fe07b=40(@); Normal(p):0000e07b=40(@)
00:00:12:50.08181: BIU T4
00:00:12:50.08416: BIU T1
00:00:12:50.08615: BIU T2
00:00:12:50.08923: BIU TW
00:00:12:50.09123: BIU TW
00:00:12:50.09309: BIU TW
00:00:12:50.09471: BIU T4
00:00:12:50.09696: BIU T1 f000:e07a (B040)MOV AL,40
00:00:12:50.09931: BIU T2
00:00:12:51.00135: BIU T3 Paged(p):000fe07c=e6(æ); Normal(p):0000e07c=e6(æ)
00:00:12:51.00335: BIU T4
00:00:12:51.00545: BIU T1
00:00:12:51.00749: BIU T2
00:00:12:51.00914: BIU T3 Paged(p):000fe07d=43(C); Normal(p):0000e07d=43(C)
00:00:12:51.01128: BIU T4
00:00:12:51.01379: BIU T1 f000:e07c (E643)OUT 43,AL
00:00:12:51.01589: BIU T2
00:00:12:51.01805: BIU T3 Paged(p):000fe07e=b0(°); Normal(p):0000e07e=b0(°)
00:00:12:51.02017: BIU T4
00:00:12:51.02195: BIU --
00:00:12:51.02352: BIU --
00:00:12:51.02603: BIU T1
00:00:12:51.02807: BIU T2
00:00:12:51.03023: BIU T3 Paged(p):000fe07f=00( ); Normal(p):0000e07f=00( )
00:00:12:51.03225: BIU T4
00:00:12:51.03437: BIU T1
00:00:12:51.03593: BIU T2
00:00:12:51.03854: BIU TW
00:00:12:51.04063: BIU TW
00:00:12:51.04275: BIU TW
00:00:12:51.04484: BIU T4
00:00:12:51.04711: BIU T1 f000:e07e (B000)MOV AL,00

The disassembled instruction is displayed on the cycle where the EU instruction timing starts (once fetching and decoding have been completed).

So the above

00:00:12:47.09539: BIU T1

actually is the point where the instruction is fully decoded (including all parameters, which are fetched in 1-cycle reads from the PIQ).

That's also the point where the first 'EU' part of UniPCemu's instruction specific handler starts. It will generate a decoded instruction for the debugger to display (displayed on said cycle) and start the first part or request of the timings for that specific instruction.

In the JMP case, it will take CS and (E)IP from the decoded instruction (stored as a 32-bit dword for the far pointer) and write them to CS:(E)IP. That happens on T1-T3. T4 is then the next instruction starting to fetch (or delaying the EU in 1-cycle intervals if it's not buffered yet).
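
The '20 cycles before execution' figure can be checked mechanically from a log like the one above. The following is a small Python sketch, assuming the log format shown in this post; the regular expressions and the function name are illustrative, and the excerpt is abridged to the timestamp and T-state columns (the bus data is dropped).

```python
import re

# "<timestamp>: BIU <T-state> [rest of line]"
LINE = re.compile(r"^(\S+): BIU (\S+)\s*(.*)$")
# "ssss:oooo (HEXBYTES)MNEMONIC ..." marks the cycle an instruction is logged.
INSTR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{4} \([0-9A-F]+\)(.+)$")

LOG = """
00:00:12:45.09091: BIU T1
00:00:12:46.00407: BIU T2
00:00:12:46.00607: BIU T3
00:00:12:46.00777: BIU T4
00:00:12:46.00945: BIU T1
00:00:12:46.01112: BIU T2
00:00:12:46.01286: BIU T3
00:00:12:46.01454: BIU T4
00:00:12:46.02224: BIU T1
00:00:12:46.02403: BIU T2
00:00:12:46.02576: BIU T3
00:00:12:46.02745: BIU T4
00:00:12:46.02912: BIU T1
00:00:12:46.03078: BIU T2
00:00:12:46.03252: BIU T3
00:00:12:46.03419: BIU T4
00:00:12:46.03585: BIU T1
00:00:12:46.03751: BIU T2
00:00:12:47.09188: BIU T3
00:00:12:47.09367: BIU T4
00:00:12:47.09539: BIU T1 ffff:0000 (EA5BE000F0)JMP F000:E05B
"""


def cycles_before_first_instruction(log_text):
    """Count BIU cycles logged before the first disassembled instruction."""
    cycles = 0
    for raw in log_text.strip().splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        instr = INSTR.match(match.group(3))
        if instr:
            return cycles, instr.group(1)
        cycles += 1
    return cycles, None


print(cycles_before_first_instruction(LOG))
```

Five 4-cycle fetches precede the JMP line in this excerpt, which matches the 5x4 prefetch argument made earlier in the thread.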


Reply 18 of 122, by superfury

User metadata
Rank l33t++

Unfortunately, somehow my OS (within UniPCemu, which is MS-DOS 6.22) became unbootable.
Once the hard drive BIOS starts, it somehow craps out?
I see it throwing a single step exception somehow? Weird?
FLAGS is F702h, so it's throwing a single-step exception, which shouldn't happen?
That never happened before, so something is going very wrong now.
Edit: After some more testing, now somehow the BIOS (during the first part of POST, pretty much after clearing all RAM) ends up in the woods, with all RAM being zeroed and executed (instruction 0000h).

Edit: Managed to fix it. The cause was the delayed PIQ fill (on T4 while fetching at T3). What would happen is that if on T4 the EU would flush the PIQ (because of any kind of jump), the flushing would happen on T4, but after that T4 would fill the PIQ with the result from T3, which was from the old PIQ address. Weird that I didn't think about that.
Is that what happens on a real CPU as well?
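
The ordering problem described above can be reproduced in a few lines. This is a hypothetical Python model, not UniPCemu's code: the ToyBIU class, its `pending` latch and the `drop_pending` switch are invented for illustration, and the model makes no claim about what real silicon does.

```python
class ToyBIU:
    """Minimal model: T3 latches a byte, T4 commits it to the PIQ."""

    def __init__(self, memory):
        self.memory = memory
        self.fetch_addr = 0
        self.piq = []
        self.pending = None

    def t3(self):
        # Latch the byte read from the current prefetch address.
        self.pending = self.memory[self.fetch_addr]
        self.fetch_addr += 1

    def flush(self, new_addr, drop_pending):
        # A jump empties the queue and redirects prefetching.
        self.piq.clear()
        self.fetch_addr = new_addr
        if drop_pending:
            self.pending = None  # the fix: discard the in-flight byte too

    def t4(self):
        # Commit the latched byte, if one is still in flight.
        if self.pending is not None:
            self.piq.append(self.pending)
            self.pending = None


def jump_during_t4(drop_pending):
    memory = bytearray([0xAA] * 8 + [0xFA] * 8)  # old stream, then jump target
    biu = ToyBIU(memory)
    biu.t3()                    # byte from the old address is in flight
    biu.flush(8, drop_pending)  # EU jumps before T4 commits the byte
    biu.t4()
    return list(biu.piq)


print(jump_during_t4(False))  # stale byte survives the flush
print(jump_during_t4(True))   # queue stays empty until the new fetch
```

Without dropping the latched byte, the queue ends up holding a byte from the old address after the flush, which is the symptom described above.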

Edit: 8088 MPH improved its metric cycle count. It's 1575 now! 😁

Edit: The noise pattern at 256 colors moved slightly earlier:

Filename
1707_UniPCemu_20230715_1327_8088MPH_16-256 color noise pattern moved earlier.png
File size
6.28 KiB
Downloads
No downloads
File comment
Snow moved earlier at 16/256 colors part of 8088 MPH.
File license
Fair use/fair dealing exception

Edit: Although vertical timing should be correct in total (as in the frames are a solid resolution), the vertical timings seemingly are not anymore. The start of each scanline for the background layer of the raster bars is all over the place again now.

Filename
captures_8088MPH_rasterbars_UniPCemu_20230715_1327.7z
File size
87.52 KiB
Downloads
56 downloads
File comment
Kefrens effect captures. Eventually probably on frame timings (done using keyboard keys).
File license
Fair use/fair dealing exception


Reply 19 of 122, by superfury

User metadata
Rank l33t++

Just ran 8088tst3 on UniPCemu's latest commit (although there's still a 1-cycle I/O wait state):

Filename
1747_8088tst3_results.png
File size
4.82 KiB
Downloads
No downloads
File comment
8088tst3 results.
File license
Fair use/fair dealing exception

Everything seems too high?

Edit: The same, but written to the disk:

FF36
FE3C
FDA0
FD26
FCF7
FC31
FB6D
F93C
CPU test complete. Elapsed timer ticks: 07F2

Edit: A little comparison chart:

disk:	real:	comp:	disp:	comp:
FF36 FF43 < FF36 <
FE3C FE59 < FE3D <
FDA0 FDC5 < FDA1 <
FD26 FD58 < FD27 <
FCF7 FD2A < FCF8 <
FC31 FC6B < FC31 <
FB6D FBB7 < FB6E <
F93C F9A9 < F93D <
CPU test complete. Elapsed timer ticks:
07F1 07CA > 07F2 >

So the disk column is the output redirected to disk (executing 8088tst3>8088tst3.txt in MS-DOS). The real column is of course your results. The first comp column is the comparison between disk and real (disk is lower than real for every counter; only the elapsed timer ticks are higher). Then the disp column is the count displayed on the screen (no redirection to file). And the final comp column is the same as for disk's comp, but for disp instead.

Edit: Added a little diff column to the comp columns (decimal instead):

disk:	real:	comp:	disp:	comp:
FF36 FF43 <(-13) FF36 <(-13)
FE3C FE59 <(-29) FE3D <(-28)
FDA0 FDC5 <(-37) FDA1 <(-36)
FD26 FD58 <(-50) FD27 <(-49)
FCF7 FD2A <(-51) FCF8 <(-50)
FC31 FC6B <(-58) FC31 <(-58)
FB6D FBB7 <(-74) FB6E <(-73)
F93C F9A9 <(-109) F93D <(-108)
CPU test complete. Elapsed timer ticks:
07F1 07CA >(+39) 07F2 >(+40)
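
The decimal differences can be recomputed from the hex columns with a short script. This is only a sketch: the value pairs are copied from the disk and real columns above, and signed_diff is an illustrative helper name.

```python
def signed_diff(emulated_hex, real_hex):
    """Emulated minus real, as a signed decimal number."""
    return int(emulated_hex, 16) - int(real_hex, 16)


# (disk, real) pairs from the chart above.
counters = [
    ("FF36", "FF43"), ("FE3C", "FE59"), ("FDA0", "FDC5"), ("FD26", "FD58"),
    ("FCF7", "FD2A"), ("FC31", "FC6B"), ("FB6D", "FBB7"), ("F93C", "F9A9"),
]
ticks = ("07F1", "07CA")

diffs = [signed_diff(disk, real) for disk, real in counters]
tick_diff = signed_diff(*ticks)

for (disk, real), diff in zip(counters, diffs):
    print(f"{disk} {real} {diff:+d}")
print(f"{ticks[0]} {ticks[1]} {tick_diff:+d}")
```

The same function works for the disp column by swapping in those values.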
