VOGONS


First post, by GloriousCow

Rank: Member

In the ongoing pursuit of emulator cycle accuracy, I'm investigating the cycle timing for interrupts. Specifically, waking from a HALT.

I have an 8088 on a microcontroller and what I have discovered is that there is a somewhat significant delay between INTR being asserted and the first INTA bus cycle.

Here I assert INTR while halted and step the CPU:

2023-05-20T15:26:34Z TRACE cpu_client::remote_cpu Setting INTR high to recover from halt...
00000070 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]
00000071 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]
00000072 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]
00000073 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]
00000074 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]
00000075 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]
00000076 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]
00000077 A:[00195] M:... I:... Q:.. IRQA T1 | 1 [90 ]

I see either a 7 or 8 cycle delay. The additional cycle delay is less frequent, but frequent enough to see.
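For anyone who wants to measure this from their own logs, here is a minimal sketch that counts the passive cycles between the INTR assertion and the first INTA bus cycle. It assumes the trace format shown above, where the bus status appears as a whitespace-separated `PASV` or `IRQA` field; the function name `cycles_until_inta` and the counting convention (PASV lines logged before the first IRQA line) are my own assumptions, not part of any tool.

```python
# Trace lines copied from the log above (after INTR was set high).
TRACE = [
    "00000070 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]",
    "00000071 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]",
    "00000072 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]",
    "00000073 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]",
    "00000074 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]",
    "00000075 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]",
    "00000076 [60104] M:... I:... Q:.. PASV T1 | 1 [90 ]",
    "00000077 A:[00195] M:... I:... Q:.. IRQA T1 | 1 [90 ]",
]


def cycles_until_inta(trace_lines):
    """Count PASV cycles logged before the first IRQA cycle.

    Returns None if no IRQA cycle appears in the trace.
    """
    passive = 0
    for line in trace_lines:
        fields = line.split()
        if "IRQA" in fields:
            return passive
        if "PASV" in fields:
            passive += 1
    return None


delay = cycles_until_inta(TRACE)
print(delay)
```

Running this over many captures and tallying the results would show how often the 8-cycle case turns up relative to the 7-cycle case.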

I confirmed this with a scope; see the attached capture. Yellow is the PIT channel 0 output. Interestingly, its rise time is slow enough to effectively delay INTR by half a cycle.
I also put a probe on HOLDA to make sure the delay wasn't caused by DMA. My Arduino-controlled 8088 has no DMA controller, so I had pretty much ruled that out anyway.

Anyone know the underlying rules or logic for interrupt acknowledgement on the 8088?

Attachments

  • wake_from_halt_timing.png (24.17 KiB, public domain): oscilloscope measurement of 8088 interrupt processing

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 1 of 6, by kdr

Rank: Member

I thought it might be due to a microcode loop (the similar WAIT instruction is explicitly documented as checking the /TEST pin every 5 clocks), but apparently not. reenigne's page on the disassembly of the 8086 microcode has this to say:

There is no microcode for the segment override prefixes (CS:, SS:, DS: and ES:). Nor for the other prefixes (REP, REPNE and LOCK), nor the instructions CLC, STC, CLI, STI, CLD, STD, CMC, and HLT. The "group" opcodes 0xf6, 0xf7, 0xfe and 0xff do not have top level microcode instructions. So none of the instructions with 0xf in the high nybble of the opcode are initially handled by the microcode. Most of these instruction are very simple and probably better done by random logic. HLT is a little surprising - I really thought I'd find a microcode loop for that one since it only seems to check for interrupts every other cycle.

If you haven't read it already, Ken's very comprehensive page on reverse-engineering the 8086 interrupt hardware might give you some ideas.

Reply 2 of 6, by reenigne

Rank: Oldbie

I no longer think it's checking every other cycle. Here's the logic I use in my microcode-based emulator: https://github.com/reenigne/reenigne/blob/mas … rocode.h#LL2522 . This seems to work in my tests, but the cause is very mysterious! I asked Ken about it and he couldn't see any circuitry that might be responsible for it, so it might be 8088-specific. I'm hoping it will become clear once I work out the logic for interrupt timing for non-HLT, non-WAIT scenarios.

Reply 3 of 6, by GloriousCow

Rank: Member

I was able to rule out any sort of two-cycle phase: I can wait 10, 11, or 12 cycles after the halt before asserting INTR and see the same delay value each time. Interesting that, from your code, it appears to be related to the last bus transfer type?
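To spell out the reasoning, here is a minimal sketch of what an every-other-cycle check would predict. The function `delay_with_polling`, the 2-cycle period, and the 7-cycle base are all assumptions for illustration (the base is borrowed from the trace in the first post); this is not how any real 8088 or emulator is known to work.

```python
def delay_with_polling(assert_cycle, period=2, base=7):
    """Hypothetical delay from INTR assertion to the first INTA cycle
    if INTR were only sampled on cycles that are multiples of `period`.
    """
    # Cycles spent waiting for the next sample point.
    wait = (-assert_cycle) % period
    return wait + base


# Assert INTR 10, 11, and 12 cycles after the halt.
predicted = [delay_with_polling(c) for c in (10, 11, 12)]
print(predicted)  # [7, 8, 7]
```

Under that hypothesis the delay would alternate with the assertion offset. Seeing the same delay at all three offsets is what rules the two-cycle phase out.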

Do you ever get the feeling that the logic for prefetching, bus delays, stalls, aborts, etc, feels a bit too complicated for such an old chip? I was hoping one of Ken's blogs might uncover some evidence for simpler unifying rules. I have a pet theory that would simplify things greatly but I've yet to be able to shoehorn it into every delay scenario.


Reply 4 of 6, by reenigne

Rank: Oldbie

Yes, the last bus transfer type does seem to be related - at least that's my best guess. But there must be some bit of state (a flip-flop somewhere) which keeps track of whether to do that extra cycle of delay during the HLT, as its output is used long after that bus transfer is complete. It being 8088-specific hints at possibly being related to breaking up a 16-bit transfer into two 8-bit transfers.
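As a rough illustration of that "bit of state" idea, here is a minimal sketch of a one-bit latch that is updated on each bus transfer and only consulted when waking from HLT. Everything in it is hypothetical: the class `HaltWakeModel`, the 7-cycle base (taken from the trace in the first post), and especially the caller-supplied predicate, since which transfer types actually set the bit is exactly the open question.

```python
class HaltWakeModel:
    """Toy model: a latch remembers whether the last bus transfer
    should add one cycle to the HLT wake-up delay."""

    BASE_DELAY = 7  # assumption: shortest delay seen in the trace

    def __init__(self, triggers_extra_delay):
        # triggers_extra_delay: hypothetical rule mapping a transfer
        # type to True/False; the real rule is not known here.
        self.triggers_extra_delay = triggers_extra_delay
        self.extra_halt_delay = False

    def on_bus_transfer(self, transfer_type):
        # The latch is only updated when a bus transfer happens...
        self.extra_halt_delay = bool(self.triggers_extra_delay(transfer_type))

    def wake_delay(self):
        # ...and is read later, however many idle cycles have passed.
        return self.BASE_DELAY + (1 if self.extra_halt_delay else 0)


# Placeholder rule, purely for demonstration.
model = HaltWakeModel(lambda kind: kind != "code_fetch")
model.on_bus_transfer("code_fetch")
short = model.wake_delay()
model.on_bus_transfer("memory_write")
long = model.wake_delay()
print(short, long)  # 7 8
```

The point of the sketch is only that the delay depends on state captured at the last transfer, not on when INTR arrives; swapping in a different predicate is how you would test competing guesses against hardware traces.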

With my first attempt at XTCE I definitely felt that the logic I had come up with was far too complicated to be implemented in 29,000 transistors, and that there was a simpler underlying rule. With the microcode version, almost all of it makes sense. One big mystery was cleared up by this blog post of Ken's: http://www.righto.com/2023/01/inside-8086-pro … nstruction.html (see my comment on there). The _extraHaltDelay is the most mysterious puzzle piece remaining.

I'm interested to hear about your pet theory!

Reply 5 of 6, by GloriousCow

Rank: Member

I was working on a blog post to try to describe the prefetching algorithm, at least my mental model of it. It would be interesting to compare notes. Although every time I think I have all the rules fully understood, some new bus delay scenario seems to crop up. The last ones were uncovered in my work to finally emulate the Lake effect in Area 5150.

I did see your comment on Ken's blog (I've tried to comment there myself and just get a browser error, which is frustrating). I had some further observations about that delay; there's a bit more to it than Ken went into. It's not as simple as "if queue_len == 3 then delay", as you've probably found. I actually handle two different states when the queue length is 3, one for code fetches and one for all other bus transfers, as they seem to differ by a cycle in length, and I believe the CPU somehow keeps track of whether the last queue operation was a read or a write. But that implies a flip-flop I don't think Ken has ever mentioned...


Reply 6 of 6, by reenigne

Rank: Oldbie

Yes, I think you're right about it being more complicated than "queue_len == 3". The logic I have for it is https://github.com/reenigne/reenigne/blob/mas … rocode.h#LL3185 - that doesn't (directly) take into account whether the last queue operation was a read or a write, but it does take into account whether the last IO was a prefetch or not (which may amount to the same thing). I'd be interested to see whether that makes your logic any simpler than mine!