Debugging an emulated CPU can be the most frustrating debugging experience ever, with so many moving pieces that can interfere with the instruction flow. You'll end up reading huge trace logs with millions of executed instructions in the hope of finding that damn bug that keeps Windows from booting.
To try to tame the problem I created a 386 CPU tester, written in NASM assembly, which started as a derivative work of PCjs.
In order to test the CPU in a repeatable and predictable way, test386.asm is used as a BIOS replacement, so it does not depend on any OS.
It executes a series of specific tests, both in real and protected mode, reporting a diagnostic code that can be used to determine any possible problem.
Suggestions, issue reports, and code contributions are very welcome.
Please note that this program has been tested only on Bochs and on my emulator (IBMulator) and it's still incomplete, so it currently tests only a small subset of every possible instruction combination. The full list of tested opcodes is included in the repo.
@superfury
answering your questions from the other thread, the SDL crash is definitely not related to test386.asm, which doesn't even use the video output.
The COM port is not mandatory and can be disabled.
The LPT port is only used by writing ASCII bytes to 3BCh (unless you changed the address); see print_p.asm
If your CPU gets halted with diagnostic code 00, you should check your emulator's trace log because it could be a problem with the very first test, which is related to conditional jumps and loops.
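On the emulator side, capturing both outputs can be as simple as hooking 8-bit port writes. The following is only a sketch under assumptions: the names `DiagCapture`, `diag_init` and `diag_io_write` are made up for illustration, and the diagnostic port is left as a parameter because it depends on how the ROM was built. Only the 3BCh LPT address comes from the post above. The `code_seen` flag is there so that "code 00 was written" can be told apart from "nothing was written yet".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LPT_DATA_PORT 0x3BCu /* LPT address mentioned above; adjust if the ROM uses another */
#define LPT_LOG_SIZE  256u

typedef struct {
    uint16_t post_port;            /* ASSUMPTION: whichever port your build reports codes on */
    uint8_t  last_code;            /* last diagnostic code written by the ROM */
    int      code_seen;            /* stays 0 until a code is actually written */
    char     lpt_log[LPT_LOG_SIZE];/* ASCII bytes sent to the LPT data port */
    size_t   lpt_len;
} DiagCapture;

static void diag_init(DiagCapture *d, uint16_t post_port)
{
    assert(d != NULL);
    d->post_port = post_port;
    d->last_code = 0;
    d->code_seen = 0;
    d->lpt_len = 0;
    d->lpt_log[0] = '\0';
}

/* Call this from the emulator's 8-bit OUT handler. */
static void diag_io_write(DiagCapture *d, uint16_t port, uint8_t value)
{
    if (port == d->post_port) {
        d->last_code = value;
        d->code_seen = 1;
    } else if (port == LPT_DATA_PORT && d->lpt_len + 1 < LPT_LOG_SIZE) {
        d->lpt_log[d->lpt_len++] = (char)value; /* keep the text NUL-terminated */
        d->lpt_log[d->lpt_len] = '\0';
    }
}
```

When the CPU halts, printing `last_code` together with `code_seen` and `lpt_log` gives a quick summary before digging into the full trace log.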
It seems to be even simpler: the first diagnostic code (00h) isn't ever written to the I/O port, thus it goes wrong even before that (the 00h is the default value that's initialized when the emulation starts). I've already tested and verified that the first two instructions (the JMP at F000:FFF0 and the CLI it jumps to) are correctly decoded. Somewhere after that, SDL 2.0.6 crashes in its own thread (my application itself doesn't seem to be the cause, afaik).
I am running it now too on CAPE. I've managed to compile it with no trouble with NASM (with some warnings). Seems to be fine for me, as in it already showed issues, so I am fixing stuff now.
Thank you for this hottbar, this is super useful. Superfury and I were looking for something like this for a long time.
it already showed issues, so I am fixing stuff now.
Happy to hear! This tool helped me to fix many bugs as well.
I hope some day it'll be able to test much more because my CPU still has some problems that this program is not able to detect.
Edit: Whoops, removed the ROM directory, causing the internal BIOS to go haywire instead of loading a valid ROM...
It seems to go wrong with the JCXZ check?
Edit: It also doesn't seem the POST macro is being assembled, it resolves to no code being inserted? The conditional jump tests start immediately after CLI, the code "POST 0" isn't being assembled at all?
Can anyone see what's invalid behaviour about that code? It takes 0x80000001 (unsigned, which is -2147483647 when used with IMULD) and multiplies it with itself? Thus resulting in EDX=C0000000 EAX=00000001? Then it loads EAX with 44332211h and copies that to EBX. Then it loads ECX with 88776655h and multiplies it with itself (unsigned multiply), resulting in EDX=48BF146A EAX=9BEDD839. Then it divides 48BF146A9BEDD839h by 88776655h, which overflows for some unknown reason? That shouldn't happen?
//Universal DIV instruction for x86 DIV instructions!
/*

Parameters:
    val: The value to divide
    divisor: The value to divide by
    quotient: Quotient result container
    remainder: Remainder result container
    error: 1 on error(DIV0), 0 when valid.
    resultbits: The amount of bits the result contains(16 or 8 on 8086) of quotient and remainder.
    SHLcycle: The amount of cycles for each SHL.
    ADDSUBcycle: The amount of cycles for ADD&SUB instruction to execute.

*/
void CPU80386_internal_DIV(uint_64 val, uint_32 divisor, uint_32 *quotient, uint_32 *remainder, byte *error, byte resultbits, byte SHLcycle, byte ADDSUBcycle, byte *applycycles)
{
    uint_64 temp, temp2, currentquotient; //Remaining value and current divisor!
    byte shift; //The shift to apply! No match on 0 shift is done!
    temp = val; //Load the value to divide!
    *applycycles = 1; //Default: apply the cycles normally!
    if (divisor==0) //Not able to divide?
    {
        *quotient = 0;
        *remainder = temp; //Unable to comply!
        *error = 1; //Divide by 0 error!
        return; //Abort: division by 0!
    }

    if (CPU_apply286cycles()) /* No 80286+ cycles instead? */
    {
        SHLcycle = ADDSUBcycle = 0; //Don't apply the cycle counts for this instruction!
        *applycycles = 0; //Don't apply the cycles anymore!
    }

    temp = val; //Load the remainder to use!
    *quotient = 0; //Default: we have nothing after division!
    nextstep:
    //First step: calculate shift so that (divisor<<shift)<=remainder and ((divisor<<(shift+1))>remainder)
    temp2 = divisor; //Load the default divisor for x1!
    if (temp2>temp) //Not enough to divide? We're done!
    {
        goto gotresult; //We've gotten a result!
    }
    currentquotient = 1; //We're starting with x1 factor!
    for (shift=0;shift<(resultbits+1);++shift) //Check for the biggest factor to apply(we're going from bit 0 to maxbit)!
    {
        if ((temp2<=temp) && ((temp2<<1)>temp)) //Found our value to divide?
        {
            CPU[activeCPU].cycles_OP += SHLcycle; //We're taking 1 more SHL cycle for this!
            break; //We've found our shift!
        }
        temp2 <<= 1; //Shift to the next position!
        currentquotient <<= 1; //Shift to the next result!
        CPU[activeCPU].cycles_OP += SHLcycle; //We're taking 1 SHL cycle for this! Assuming parallel shifting!
    }
    if (shift==(resultbits+1)) //We've overflown? We're too large to divide!
    {
        *error = 1; //Raise divide by 0 error due to overflow!
        return; //Abort!
    }
    //Second step: substract divisor<<n from remainder and increase result with 1<<n.
    temp -= temp2; //Substract divisor<<n from remainder!
    *quotient += currentquotient; //Increase result(divided value) with the found power of 2 (1<<n).
    CPU[activeCPU].cycles_OP += ADDSUBcycle; //We're taking 1 substract and 1 addition cycle for this(ADD/SUB register take 3 cycles)!
    goto nextstep; //Start the next step!
    //Finished when remainder<divisor or remainder==0.
    gotresult: //We've gotten a result!
    if (temp>((1<<resultbits)-1)) //Modulo overflow?
    {
        *error = 1; //Raise divide by 0 error due to overflow!
        return; //Abort!
    }
    if (*quotient>((1<<resultbits)-1)) //Quotient overflow?
    {
        *error = 1; //Raise divide by 0 error due to overflow!
        return; //Abort!
    }
    *remainder = temp; //Give the modulo! The result is already calculated!
    *error = 0; //We're having a valid result!
}
Can anyone see what's going wrong here?
Edit: After some checking, I've found various problems that were causing those (I)MUL and (I)DIV instructions to fail:
- Overflow due to 32-bit shifting off the range of the variable used to check for overflow, causing an invalid overflow detection flag to be set when dividing. It incorrectly caused a Divide by 0 fault: the result would actually fit within 32 bits, but the value to check against (0xFFFFFFFF) was truncated to 00000000h due to wrong typing of the shifted number ("1" needing to be "1ULL" to prevent shifting off the 32-bit boundary and becoming truncated to 0).
- Truncation after multiply due to type conversions being 32-bit instead of 64-bit(required for getting a 64-bit result out of the multiply operation).
Having fixed this, the BIOS now continues on to the next step(Diagnostics code 02h).
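The first fix can be illustrated in isolation. This is a minimal sketch, assuming a helper of my own naming (`max_for_bits`, `overflows`) rather than the emulator's actual code: the limit for an N-bit result is computed with a 64-bit literal so that N = 32 still yields FFFFFFFFh.

```c
#include <assert.h>
#include <stdint.h>

/* Largest unsigned value that fits in `bits` bits (8, 16 or 32 in practice).
   The literal has to be 64 bits wide: with a plain `1` (an int), a shift
   by 32 goes past the width of the type and the limit is no longer valid. */
static uint64_t max_for_bits(unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    return (1ULL << bits) - 1;
}

/* Returns 1 when `value` does not fit in a `bits`-bit result. */
static int overflows(uint64_t value, unsigned bits)
{
    return value > max_for_bits(bits);
}
```

For example, `overflows(0x88776655, 32)` is 0, while a quotient of 100000000h with 32 result bits is reported as an overflow.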
Edit: It also doesn't seem the POST macro is being assembled, it resolves to no code being inserted? The conditional jump tests start immediately after CLI, the code "POST 0" isn't being assembled at all?
Can you upload your intermediate assembler source-listing file? This can be generated by NASM with the -l command line switch (see here).
This file can be used to see the assembler result in a readable format. With it you can compare your CPU log with what the ROM is supposed to execute.
It seems like you have an old version of the src/test386.asm file. I've added the POST 0 macro only recently, after you noticed that it wasn't emitting a 00h diagnostic code for the first test.
Here's mine so you can compare.
The attachment test386.lst.zip is no longer available
After fixing the MOV Sw instructions to allow 32-bit register reading/writing besides 16-bit Segment registers, now it advanced to Diagnostics code 04h(Near CALL/Loading pointers?).
After fixing the stack PUSH/POP operations to depend on the situation(instructions based on operand size and hardware-based(interrupts/task switching etc) being specific), it now properly continues on to POST 09h(Initialize and enter protected mode), which seems to go wrong somewhere.
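For the instruction-driven case, the two sizes involved can be kept apart as in the sketch below. It is a toy model under my own assumptions, not the emulator's code: `ToyStack`, `toy_push` and `toy_pop` are hypothetical names, the number of bytes moved follows the operand size, and the choice between SP and ESP follows the stack segment size. Hardware-driven pushes (interrupts, task switches) are not modelled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t esp;      /* full ESP; only the low 16 bits are used on a 16-bit stack */
    int      stack32;  /* 1 when the stack segment is 32-bit, 0 when 16-bit */
    uint8_t  mem[256]; /* toy memory; addresses are taken modulo 256 */
} ToyStack;

static uint32_t stack_pointer(const ToyStack *s)
{
    return s->stack32 ? s->esp : (s->esp & 0xFFFFu);
}

static void set_stack_pointer(ToyStack *s, uint32_t value)
{
    if (s->stack32)
        s->esp = value;
    else /* 16-bit stack: the upper half of ESP is left untouched */
        s->esp = (s->esp & 0xFFFF0000u) | (value & 0xFFFFu);
}

/* opsize is 2 or 4 bytes, chosen by the instruction's operand size. */
static void toy_push(ToyStack *s, uint32_t value, unsigned opsize)
{
    assert(opsize == 2 || opsize == 4);
    set_stack_pointer(s, stack_pointer(s) - opsize);
    uint32_t sp = stack_pointer(s); /* re-read so a 16-bit SP wraps at 64 KiB */
    for (unsigned i = 0; i < opsize; i++)
        s->mem[(sp + i) % sizeof s->mem] = (uint8_t)(value >> (8 * i));
}

static uint32_t toy_pop(ToyStack *s, unsigned opsize)
{
    assert(opsize == 2 || opsize == 4);
    uint32_t sp = stack_pointer(s);
    uint32_t value = 0;
    for (unsigned i = 0; i < opsize; i++)
        value |= (uint32_t)s->mem[(sp + i) % sizeof s->mem] << (8 * i);
    set_stack_pointer(s, sp + opsize);
    return value;
}
```

Pushing a dword on a 16-bit stack with ESP=12340010h leaves ESP=1234000Ch in this model: four bytes are written, but only SP is decremented.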
Edit: Having fixed the 0F01 opcode, it now enters protected mode and immediately(the BIOS itself) crashes after loading CR0 with paging, triple faulting.
It's a problem with the PDE/PTE(Paging)? The linear address being 0xFA079(the JMP instruction's start?). This translates to PDE containing 0x00002007, which points to a PTE containing 0x56781234? This fires the Page Fault handler? Is that correct?
Edit: Found a bug in the Paging unit that caused all addresses taken from the PDE/PTE to be shifted right by 12 instead of not being shifted at all (the entry already holds the address, so masking is enough to get the physical address). Now Paging is working properly and it continues into normal protected mode.
Now it reaches the "lea ebp, [esp-4]" instruction, which causes an #UD because it thinks the [ESP] base of the SIB byte is illegal? Isn't that illegal?
Edit: Fixing that to allow that, I now see it translating the linear address 0x1000 to PDE 0, PTE 1, which is past 2MB(0x200000 being the PTE address)?
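The two-level lookup can be written down as a few small helpers, which makes it easier to check numbers like these by hand. This is a sketch with hypothetical function names, covering only 4 KiB pages and the present bit; access rights and the accessed/dirty bits are left out.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_FRAME_MASK 0xFFFFF000u /* bits 31..12 of a PDE/PTE hold the frame address */
#define PAGE_PRESENT    0x00000001u /* bit 0: entry is present */

static uint32_t pde_index(uint32_t linear)   { return linear >> 22; }
static uint32_t pte_index(uint32_t linear)   { return (linear >> 12) & 0x3FFu; }
static uint32_t page_offset(uint32_t linear) { return linear & 0xFFFu; }

/* Physical address of the PDE for `linear`, given CR3. */
static uint32_t pde_address(uint32_t cr3, uint32_t linear)
{
    return (cr3 & PAGE_FRAME_MASK) + pde_index(linear) * 4u;
}

/* Physical address of the PTE, given the PDE value: mask only, no shift. */
static uint32_t pte_address(uint32_t pde, uint32_t linear)
{
    assert(pde & PAGE_PRESENT);
    return (pde & PAGE_FRAME_MASK) + pte_index(linear) * 4u;
}

/* Final physical address, given the PTE value. */
static uint32_t physical_address(uint32_t pte, uint32_t linear)
{
    assert(pte & PAGE_PRESENT);
    return (pte & PAGE_FRAME_MASK) | page_offset(linear);
}
```

Taking the PDE value 0x00002007 quoted earlier, the page table sits at 0x2000. Linear address 0x1000 then uses PDE 0 and PTE 1, whose entry is read from 0x2004, and linear address 0xFA079 uses PTE 0xFA, read from 0x23E8.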
Edit: Having fixed the Paging unit and patching the use of ESP-4 for SIB parameters, it now continues to the next test: code 0Bh.
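For reference, "lea ebp, [esp-4]" assembles to 8D 6C 24 FC: the ModR/M byte 6Ch selects a SIB byte with an 8-bit displacement, and the SIB byte 24h has ESP as base and no index. In the SIB byte it is the index field, not the base field, where the value 100b is special. The decoder below is a minimal sketch with names of my own choosing (`Sib`, `decode_sib`, `ea_sib_disp8`); it handles only the mod = 01 form and ignores the "base 101b with mod 00 means disp32 only" case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned scale;     /* multiplier: 1, 2, 4 or 8 */
    int      has_index; /* 0 when the index field is 100b (no index register) */
    unsigned index;     /* register number, meaningful only if has_index */
    unsigned base;      /* register number; 4 = ESP, which is a valid base */
} Sib;

static Sib decode_sib(uint8_t sib)
{
    Sib out;
    out.scale = 1u << (sib >> 6);
    out.index = (sib >> 3) & 7u;
    out.has_index = (out.index != 4u);
    out.base = sib & 7u;
    return out;
}

/* Effective address for mod = 01 (SIB + disp8); regs[] is indexed by register number. */
static uint32_t ea_sib_disp8(const uint32_t regs[8], uint8_t sib, int8_t disp8)
{
    assert(regs != NULL);
    Sib s = decode_sib(sib);
    uint32_t ea = regs[s.base];
    if (s.has_index)
        ea += regs[s.index] * s.scale;
    return ea + (uint32_t)(int32_t)disp8; /* sign-extend the displacement */
}
```

With ESP=1000h, `ea_sib_disp8(regs, 0x24, -4)` evaluates to 0FFCh, which is the value the LEA should place in EBP.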
Edit: Found a strange bug: Using opcode 0x8C (MOV Ev,Sw), moves to 32-bit registers are zero-extended to 32 bits, while moves to memory seem to only write the lower half (leaving the upper 16 bits unchanged)?
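The behaviour described here can be modelled as two separate paths. This is only a sketch of what the post reports, with hypothetical function names; whether the upper half of a 32-bit register destination is guaranteed to be zero on real hardware is something to confirm against the processor manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register destination with 32-bit operand size: selector zero-extended. */
static uint32_t mov_r32_sreg(uint16_t selector)
{
    return (uint32_t)selector;
}

/* Memory destination: only two bytes are stored, whatever the operand size. */
static void mov_m16_sreg(uint8_t *mem, uint16_t selector)
{
    assert(mem != NULL);
    mem[0] = (uint8_t)(selector & 0xFFu); /* little-endian low byte */
    mem[1] = (uint8_t)(selector >> 8);    /* high byte; mem[2] and mem[3] are not touched */
}
```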
Edit: Now it reaches test 0xC, which goes wrong with the MOVZX/MOVSX instructions?
Edit: Having fixed those instructions properly (after consulting the 80386 manual and seeing the 16-bit/32-bit bugs, fixing the timing table contents and rewriting modr/m parsing for those instructions), it now gets to the next phase (phase 0xD or onwards).
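The extension itself is small enough to check against a reference. The sketch below uses made-up helper names and only models the data path: the source is always 8 or 16 bits wide, and the operand size decides whether the low word or the whole destination register is written.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension: upper bits are cleared. */
static uint32_t movzx_8(uint8_t src)   { return (uint32_t)src; }
static uint32_t movzx_16(uint16_t src) { return (uint32_t)src; }

/* Sign extension: upper bits are copies of the source's top bit. */
static uint32_t movsx_8(uint8_t src)
{
    return (src & 0x80u) ? (0xFFFFFF00u | src) : src;
}

static uint32_t movsx_16(uint16_t src)
{
    return (src & 0x8000u) ? (0xFFFF0000u | src) : src;
}

/* With a 16-bit operand size only the low word of the destination changes. */
static uint32_t write_dest(uint32_t old_dest, uint32_t result, unsigned opsize_bits)
{
    assert(opsize_bits == 16 || opsize_bits == 32);
    if (opsize_bits == 16)
        return (old_dest & 0xFFFF0000u) | (result & 0xFFFFu);
    return result;
}
```

For instance, sign-extending 80h into a 16-bit destination that previously held 12345678h leaves 1234FF80h in the full register.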
Edit: It now gets to Diagnostics code 0xE, which crashes due to a fault.
Edit: It seems to do strange things (correct output, though) with this:
testLEA32 [eax * 2], 0x00000002
Then it executes the following, which fails the test(also extremely strange code not matching the assembly in the repository):
It seems test 10h fails because of a #GP fault on the REP STOSB instruction(the first subtest's first string instruction), errorcode 0? That shouldn't happen?