Just implemented a generic framework for handling the protected mode stack and descriptors using the usual stepping mechanism (just like the CPU uses for normal instructions and, since recent commits, also for parts of the privilege-switching call gates).
Now I just need to hook that into the protected mode stack and descriptor fetches/reads and writes, so that they handle the stack and descriptors through the BIU.
Also completed the BIU 64-bit memory reads for this (since descriptors fetch 64 bits from RAM, the BIU will need to support such reads).
These newly added steps work using 3 different variables instead of just the two that normal instructions use.
One counter specifies where in the current attempt (or repeated attempt, as all earlier attempts are performed again until the current one is reached) it's checking. This counter resets whenever the entire mechanism is reset (which is when the next instruction starts or when a read/write operation is confirmed to be pending).
One counter specifies where in the attempts of the current instruction (actually various reads/writes are combined into one) it left off. This increases by two for every read or write operation: the first value (even) is used for requests to the BIU, the second for reading the result (which is either that the write completed or the read data).
One counter specifies the read cache position. This is used to keep the read data in a cache so that it's read only once per request; later requests (restarts) cause the cached value to be read instead of performing the memory request again. This counter, like the current attempt counter, is also required to keep in sync for detecting which specific read to perform (either from the cache (when already done before) or from the memory read at a completed memory cycle).
All counters of course reset whenever the instruction state is reset (a new instruction or a fault handler is started), so each new instruction starts with this behaviour reset as well.
Those 3 counters exist in two different copies: one for descriptor fetches and saves (a save writes only 1 byte, unlike the 8-byte read operation) and one copy for stack pushes and pops.
Thus, just like the CPU execution, it allows the reads to cache their data for repeated attempts. Most of the instruction state is reset whenever the pending condition is triggered, causing the entire set of checks to restart, but one of the counters is used to return to the point that was interrupted: execution continues from there once the current counter matches the interrupt point counter. It's basically just executing coroutines using counters, when I think about it.
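To make the three counters a bit more concrete, here is a minimal sketch in C of how such a restartable read could work. Everything in it is an assumption for illustration, not the emulator's real code: the names (`StepState`, `FakeBIU`, `step_read`, `pop_two`, `run_until_done`), the fake memory contents and the way the resume counter is bumped once for the request and once for the result (so two per operation in total).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SLOTS 64 /* assumed size, matching the stack read cache */

/* Hypothetical names throughout; only the idea follows the description. */
typedef struct {
    uint8_t current;   /* position within this (re)attempt, reset on restart */
    uint8_t resume;    /* where the instruction left off, +2 per operation */
    uint8_t cachepos;  /* next read-cache slot within this (re)attempt */
    uint32_t cache[CACHE_SLOTS];
} StepState;

/* Stand-in for the BIU: one outstanding request, completed by a tick. */
typedef struct {
    bool busy, done;
    uint32_t addr, result;
    unsigned requests; /* how many real memory requests were issued */
} FakeBIU;

static uint32_t fake_memory(uint32_t addr) { return addr * 3u + 1u; }

static void biu_tick(FakeBIU *b)
{
    if (b->busy) { b->result = fake_memory(b->addr); b->busy = false; b->done = true; }
}

/* Returns true when *out is valid, false when the caller must abort and
   restart the whole routine later (the "pending" condition). */
static bool step_read(StepState *s, FakeBIU *b, uint32_t addr, uint32_t *out)
{
    uint8_t req = s->current;      /* even: request step, req+1: result step */
    s->current += 2;
    if (s->resume > req + 1) {     /* done in an earlier attempt: use cache */
        assert(s->cachepos < CACHE_SLOTS);
        *out = s->cache[s->cachepos++];
        return true;
    }
    if (s->resume == req) {        /* request step: hand the read to the BIU */
        b->addr = addr; b->busy = true; b->done = false; b->requests++;
        s->resume++;               /* now odd: waiting for the result */
        return false;
    }
    if (!b->done) return false;    /* result step, but memory cycle not done */
    b->done = false;
    assert(s->cachepos < CACHE_SLOTS);
    s->cache[s->cachepos] = b->result;
    *out = s->cache[s->cachepos++];
    s->resume++;                   /* even again: operation finished */
    return true;
}

/* Example "instruction": two stack reads combined into one result. */
static bool pop_two(StepState *s, FakeBIU *b, uint32_t *sum)
{
    uint32_t lo, hi;
    if (!step_read(s, b, 0x10, &lo)) return false;
    if (!step_read(s, b, 0x14, &hi)) return false;
    *sum = lo + hi;
    return true;
}

/* Restart the routine from the top until it completes, ticking in between. */
static uint32_t run_until_done(StepState *s, FakeBIU *b, unsigned *attempts)
{
    uint32_t sum = 0;
    *attempts = 0;
    for (;;) {
        s->current = 0; s->cachepos = 0;   /* pending: restart the attempt */
        ++*attempts;
        if (pop_two(s, b, &sum)) return sum;
        biu_tick(b);
    }
}
```

In this sketch `pop_two` gets restarted from the top twice, yet only two memory requests ever reach the fake BIU: on the third attempt the first read comes straight out of the cache and only the second read consumes a fresh result.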
The interruption itself is just the BIU reading or writing data from/to memory. If it's not in cycle-accurate mode, this interruption almost never occurs, as the read/write handlers simply tick the BIU immediately and retrieve the result. But in cycle-accurate mode, the BIU isn't ticked that way, thus causing the BIU to interrupt the EU routine that is handling those protected-mode tasks.
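The difference between the two modes can be sketched roughly like this in C. The struct, the `cycle_accurate` flag and the function names are made up for the example, and the "memory" is just a dummy function; the only point is that the request handler ticks the BIU itself when cycle accuracy is off, so the result is there immediately and the calling routine never gets interrupted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical BIU model with a single outstanding request. */
typedef struct {
    bool cycle_accurate; /* assumed mode flag */
    bool busy, done;
    uint32_t addr, result;
} Biu;

static uint32_t mem_read(uint32_t addr) { return addr ^ 0xA5A5u; } /* dummy RAM */

/* One BIU tick: completes the pending request, if any. */
static void biu_tick(Biu *b)
{
    if (b->busy) { b->result = mem_read(b->addr); b->busy = false; b->done = true; }
}

/* Request a read. Returns true if the result is already available,
   false if the caller has to bail out and retry after a later tick. */
static bool biu_request_read(Biu *b, uint32_t addr)
{
    assert(!b->busy);                     /* only one request at a time */
    b->addr = addr; b->busy = true; b->done = false;
    if (!b->cycle_accurate) biu_tick(b);  /* finish at once: no interruption */
    return b->done;
}
```

With `cycle_accurate` set, the same call returns false and the result only shows up after the main loop has ticked the BIU, which is exactly the point where the EU routine gets interrupted and restarted.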
Of course, the small counters (only byte-sized index pointers) and small read caches (16 descriptors and 64 stack doublewords; 16-bit reads take 32 bits in the cache anyway, with the top 16 bits ignored during readback) could in theory overflow with 128 reads/writes, but to my knowledge there isn't a single instruction that performs that many descriptor or stack reads (or writes, though those don't take any space in those buffers). So 254 steps is more than enough to handle all protected mode tasks.
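As a quick sanity check of those limits, here is a small C sketch of the arithmetic. The constant and function names are made up for the example; the numbers (byte-sized counter, two steps per operation, 16 and 64 cache slots) are the ones mentioned above.

```c
#include <assert.h>
#include <stdint.h>

enum { STEPS_PER_OP = 2, DESC_CACHE_SLOTS = 16, STACK_CACHE_SLOTS = 64 };

/* Count how many operations a byte-sized step counter can record
   before adding another two steps would wrap it around. */
static unsigned ops_before_wrap(void)
{
    uint8_t counter = 0;
    unsigned ops = 0;
    for (;;) {
        uint8_t next = (uint8_t)(counter + STEPS_PER_OP);
        if (next < counter) return ops; /* wrapped past 254 */
        counter = next;
        ++ops;
    }
}

/* True if an instruction with this many reads and writes fits.
   Only reads take a cache slot; both take two counter steps. */
static int fits(unsigned reads, unsigned writes, unsigned cache_slots)
{
    assert(cache_slots > 0);
    return reads <= cache_slots && reads + writes <= ops_before_wrap();
}
```

The counter tops out at 254 after 127 operations, so the 128th read or write is the one that would wrap it. In practice the read caches are the tighter limit for reads (16 descriptors or 64 stack doublewords).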