VOGONS


Finding bugs in 8086-80386 emulation core?

Reply 120 of 142, by vladstamate

User metadata
Rank Oldbie

While that is true, we can definitely help each other. Let's come up with a log format that we can all adopt, which will make it trivial to compare execution paths. I'll open a new thread about this.

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 121 of 142, by peterferrie

User metadata
Rank Oldbie
superfury wrote:

That's exactly the difficult part: the method of execution is a cycle-based one, whereas most emulators (except mine and vladstamate's, afaik) run at instruction level at most (like DOSBox).

So that would mean I need to rebuild some of the hardware synchronization at instruction level instead of cycle level to compare with those emulators?

We are interested in seeing the list of instruction addresses in a working path versus a non-working one, of the kind that a single-stepping debugger would show.
When they diverge, we can go back to see what was the specific instruction condition that caused it, backwards from there until we find the test that resulted in that condition, and backwards from there until we find the instructions that set the parameters for the test.
It probably comes down to just a few register values that differ.
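
A minimal sketch of that comparison, assuming each emulator can dump one `CS:IP` pair per executed instruction as a line of text. The line format and the name `first_divergence` are hypothetical; no log format has been agreed on in this thread yet.

```python
from itertools import zip_longest


def first_divergence(working, failing, context=3):
    """Return (index, preceding lines, working line, failing line) or None.

    Each trace is a list of strings such as "F000:E05B", one per executed
    instruction. This line format is an assumption made for the example.
    """
    for index, (good, bad) in enumerate(zip_longest(working, failing)):
        if good != bad:
            start = max(0, index - context)
            return index, list(working[start:index]), good, bad
    return None


# Made-up traces that agree for three instructions and then split.
working = ["F000:E05B", "F000:E05E", "F000:E060", "F000:E062"]
failing = ["F000:E05B", "F000:E05E", "F000:E060", "F000:E070"]
print(first_divergence(working, failing))
```

The lines returned as context are the place to start walking backwards: the last common instruction is usually the conditional jump whose inputs differed.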

Reply 122 of 142, by superfury

User metadata
Rank l33t++

With the latest bugfixes, HIMEM.SYS no longer detects erroneous high memory :D So the driver should be working properly now :D

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 123 of 142, by superfury

Just tried to run "EMM386.EXE NOEMS" as a device driver (after loading HIMEM.SYS successfully). Now it seems to crash and hang somewhere in protected mode? Virtual 8086 mode isn't running, just plain protected mode, jumping to the same starting point and storing CR2? Can anyone tell me what's going wrong there? It's the MS-DOS 6.22 EMM386.EXE, loaded directly after HIMEM.SYS (which should run, with memory testing succeeding).

Reply 124 of 142, by superfury

Now all instructions except the rotate instructions (RCL, RCR, ROL, ROR) SHOULD be working without problems. Oddly enough, the Windows 95 setup still crashes due to a BOUND instruction that's faulting because its offset is outside the specified range? I also notice that the offset overflows? Is that supposed to happen?

Reply 125 of 142, by superfury

Yay! Good news: floppy disk booting is working now on the Compaq Deskpro 386! 😁 I can finally boot the Windows 95 floppy disks with the required stuff for Windows 95 setup :D (instead of the plain MS-DOS already installed on the hard disk)!

Attachment: 460-Windows 95a boot disk booting on Compaq Deskpro 386.jpg (49.45 KiB)
File comment: First-time Windows 95a boot disk booting on Compaq Deskpro 386!
File license: Fair use/fair dealing exception

Reply 126 of 142, by vladstamate

Congrats dude!

Reply 127 of 142, by superfury

That also means that I can finally run the Compaq Diagnostics boot floppy and (hopefully) get rid of uninitialized BIOS settings messages etc. 😁

Edit: It doesn't seem to boot, even though the Windows 95a boot floppy booted without problems?

Reply 128 of 142, by superfury

For some odd reason, Windows 95a MS-DOS is insanely slow? The screen starts flashing (updates?) and response time becomes extremely long (typed characters show up after a few seconds instead of instantaneously)? DMA isn't running as far as I can see. Maybe some other cause?

Tried checking stuff like interrupts, but there don't seem to be ridiculously many (e.g. hardware interrupts spawning too fast etc.). The PIT is at a normal rate. Stuff like the Sound Blaster isn't used either, so that can't be it. Maybe some heavy instruction being used many times for some odd reason with MS-DOS 7.0? Looking at the screen, with the CPU running at 20% of an 80386 at 16MHz, it looks like it takes about 3 seconds to just recognise a single key press? It didn't do that with older MS-DOS versions?

Reply 129 of 142, by vladstamate

superfury wrote:

For some odd reason, Windows 95a MS-DOS is insanely slow? The screen starts flashing (updates?) and response time becomes extremely long (typed characters show up after a few seconds instead of instantaneously)? DMA isn't running as far as I can see. Maybe some other cause?

Tried checking stuff like interrupts, but there don't seem to be ridiculously many (e.g. hardware interrupts spawning too fast etc.). The PIT is at a normal rate. Stuff like the Sound Blaster isn't used either, so that can't be it. Maybe some heavy instruction being used many times for some odd reason with MS-DOS 7.0? Looking at the screen, with the CPU running at 20% of an 80386 at 16MHz, it looks like it takes about 3 seconds to just recognise a single key press? It didn't do that with older MS-DOS versions?

When that happens to me I use a simple poor-man's statistical sampler: I break into the debugger (pause key in Visual Studio or Xcode) and see where I am, then let it run and break again, and so on. After a while I tend to have a good idea of where the code is spending its time. Of course you can also use CodeXL or some other tool, but when it suddenly behaves this slowly, that is not a bad technique.
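
If you write down where each break lands, tallying the notes is trivial. A rough sketch, where the function names in `samples` are invented placeholders for whatever the host debugger shows at the top of the call stack:

```python
from collections import Counter


def hottest_locations(samples, top=3):
    """Count how often each sampled location was seen, most frequent first."""
    return Counter(samples).most_common(top)


# Hypothetical notes from ten manual debugger breaks.
samples = [
    "render_screen", "render_screen", "execute_cpu", "render_screen",
    "tick_dma", "render_screen", "execute_cpu", "render_screen",
    "execute_cpu", "render_screen",
]
for location, hits in hottest_locations(samples):
    print(f"{location}: {hits}/{len(samples)} samples")
```

Ten or twenty samples are usually enough: anything that shows up in more than half of them is where the time goes.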

Reply 130 of 142, by superfury

That shouldn't be the case? The 20% reported as UniPCemu's CPU speed is a direct once-per-second rendering of emulated time (nanoseconds emulated) vs realtime (nanoseconds UniPCemu has been running, using gettimeofday). It ticks about once each second, retrieves the current Unix-style timestamp from the OS, divides the emulated time accumulated so far by the real time elapsed (clearing its value as well, to count the next second), then multiplies the result by 100 to obtain the speed, in percent, at which the emulation is running. That speed is at a constant 20%, so it's not that the CPU vs hardware is slowing down compared to usual (the same speed applies to the BIOS, which doesn't have a delay in input). So there must be something within the emulated hardware or CPU that's somehow slowing things down while the emulated CPU is waiting for something (excessive interrupts eating cycles on the 80386?). Looking at fired interrupts I don't see excessive INT 08h (IRQ0), which indeed is running at 18.2Hz (65536 PIT cycles per interrupt). DMA maybe? Since the floppy drive is working properly now, maybe other DMA problems are showing?
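
The two numbers in that post can be checked with a few lines of arithmetic. This is only an illustration of the calculation as described, not UniPCemu's actual code; the function names are made up, and the PIT input clock of 1,193,182 Hz is the standard PC value.

```python
PIT_CLOCK_HZ = 1_193_182  # standard PC PIT input clock (about 1.193182 MHz)


def speed_percent(emulated_ns, realtime_ns):
    """Emulated time as a percentage of the real time that passed."""
    return emulated_ns / realtime_ns * 100.0


def irq0_rate_hz(reload_value):
    """IRQ0 frequency for a PIT channel 0 reload value (0 means 65536)."""
    divisor = reload_value if reload_value else 65536
    return PIT_CLOCK_HZ / divisor


# 0.2 s of emulated time during 1 s of real time is the 20% from the post.
print(f"{speed_percent(200_000_000, 1_000_000_000):.0f}%")
print(f"{irq0_rate_hz(0):.1f} Hz")
```

Note that a constant 20% means everything inside the emulated machine runs five times slower than real time, but it does not by itself explain a 3 second keypress delay, since 3 seconds of real time is still 0.6 seconds of emulated time.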

Reply 131 of 142, by superfury

I've made a little log of the BIOS trying to boot the floppy disk (which should also show it programming the DMA controller incorrectly, due to the index being incorrect):

https://www.dropbox.com/s/ylom43hnhqpy3oe/deb … 22_1427.7z?dl=0

Can you see what's going wrong? Why is the DMA controller mode being programmed with a 0x42 value (which is a self test mode instead of normal read mode)? The index that's loaded is incorrect, so somewhere up the stack an invalid value is being pushed?
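
For reference, here is how a mode byte splits into fields according to the 8237 mode register layout as I read the datasheet (the datasheet calls the transfer type with both bits clear "verify"). The helper name is hypothetical, and 0x46 is included only as the value commonly used for a floppy read, for comparison.

```python
TRANSFER_TYPES = {0: "verify", 1: "write to memory", 2: "read from memory", 3: "illegal"}
MODES = {0: "demand", 1: "single", 2: "block", 3: "cascade"}


def decode_dma_mode(value):
    """Split an 8237 mode register byte into its fields."""
    return {
        "channel": value & 0x03,                         # bits 0-1
        "transfer": TRANSFER_TYPES[(value >> 2) & 0x03],  # bits 2-3
        "auto_init": bool(value & 0x10),                 # bit 4
        "address_decrement": bool(value & 0x20),         # bit 5
        "mode": MODES[(value >> 6) & 0x03],              # bits 6-7
    }


print(decode_dma_mode(0x42))  # the value seen in the log
print(decode_dma_mode(0x46))  # single mode, channel 2, device-to-memory
```

The only difference between the two bytes is the transfer type in bits 2-3, so the bad index only has to corrupt that one field to turn a disk read into a verify cycle.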

Reply 132 of 142, by superfury

I've just taken a look and backtraced it to the source of the seemingly invalid value:
Error location: f000:0000ed6f
Source: 08h needing to be 04h, which is SHL 1 to obtain the correct DMA mode register write.
At SS:BP+01
BP=7EB8
7EB9=04h instead of 02h(required for correct FDC read instruction).
Written 04h at f000:000090aa using opcode C6 46 01 04(mov byte ss:[bp+01],04)?

Can anyone tell me why it loads a self test operation into the FDC DMA Mode control register?
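
As a sanity check on the disassembly of the opcode bytes quoted above, the ModRM byte can be decoded by hand. The sketch below only handles the 16-bit `mod=01` form (8-bit displacement) that occurs here, and the helper name is made up for the example.

```python
RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"]


def decode_modrm_disp8(modrm, disp8):
    """Decode a 16-bit ModRM byte with mod=01 (8-bit signed displacement)."""
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1:
        raise ValueError("this sketch only handles mod=01")
    disp = disp8 - 0x100 if disp8 & 0x80 else disp8
    sign = "-" if disp < 0 else "+"
    # BP-based forms default to the SS segment, the others to DS.
    segment = "ss" if rm in (2, 3, 6) else "ds"
    return reg, f"{segment}:[{RM16[rm]}{sign}{abs(disp):02x}h]"


code = bytes([0xC6, 0x46, 0x01, 0x04])
opcode, modrm, disp8, imm8 = code
reg, operand = decode_modrm_disp8(modrm, disp8)
print(f"opcode {opcode:02X} /{reg}: mov byte {operand},{imm8:02x}h")
```

So the decode itself matches the logged instruction: the 04h is an immediate baked into the BIOS code at that address, which suggests the question is why this code path is taken rather than whether the store is emulated correctly.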

Reply 133 of 142, by superfury

Just found a bug of unknown cause that could be called the Caps lock of death: pressing caps lock makes the keyboard unresponsive? Not a single key has any effect after pressing caps lock?

Reply 134 of 142, by superfury

Just did some more tests on the Caps lock issue. It seems that it affects Num lock and Scroll lock as well. Pressing any of those three modifier keys makes the BIOS send a 0xED (Set LEDs) command, but never the second parameter required to finish the command? That causes the PS/2 keyboard controller to continue to block any keys pressed (putting them in the response buffer), causing it to be unresponsive to any following keys pressed.

Can anyone tell me why the BIOS isn't sending the actual data byte (the LED state to be applied for Num/Caps/Scroll lock)?
Edit: Managed to find and fix the bug: the PS/2 keyboard was giving infinite 0xFA (ACK) responses when receiving the Set/Reset LEDs (0xED) command. The same problem applied to 0xF0 (Set/Get active scancode set). That was causing the BIOS to finish the command before sending the actual parameter required to finish the command, causing the PS/2 keyboard to think it still needs to inhibit keyboard input (because input needs to be inhibited while commands are being sent).

Now the Scroll lock works: I see it (re)setting the LED. But Caps Lock doesn't set any LEDs and Num lock sets Caps Lock and Num lock together. Although that might be a bug in the LED display routine instead (unlikely). Probably another bug in the PS/2 keyboard controller.

Edit: After some simple debugging I found out that the LEDs were being displayed incorrectly: Num Lock and Scroll Lock were working properly, but Caps Lock had its light turned on based on the Num Lock LED instead of the Caps Lock LED (bit 1 instead of bit 2 in the LEDs being displayed).

Now the keyboard should be working without problems again 😁
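
For anyone hitting the same thing, the intended behaviour can be modelled in a few lines. This is a toy model, not UniPCemu's code: the class and attribute names are hypothetical, and only the 0xED command is handled.

```python
class Ps2Keyboard:
    """Toy model of the Set LEDs (0xED) command; names are hypothetical."""

    ACK = 0xFA
    SET_LEDS = 0xED

    def __init__(self):
        self.awaiting_led_byte = False
        self.leds = {"scroll": False, "num": False, "caps": False}

    def write(self, byte):
        """Handle one byte from the host and return the reply bytes."""
        if self.awaiting_led_byte:
            # Second byte of the command: the LED state itself.
            self.awaiting_led_byte = False
            self.leds = {
                "scroll": bool(byte & 0x01),  # bit 0
                "num": bool(byte & 0x02),     # bit 1
                "caps": bool(byte & 0x04),    # bit 2
            }
            return [self.ACK]
        if byte == self.SET_LEDS:
            self.awaiting_led_byte = True
            return [self.ACK]  # exactly one ACK, then wait for the parameter
        return [self.ACK]      # every other command is just acknowledged here


keyboard = Ps2Keyboard()
print(keyboard.write(0xED), keyboard.write(0x04), keyboard.leds)
```

Both bugs from this post map onto the model: the command byte must produce a single ACK rather than a stream of them, and Caps Lock must be read from bit 2, not bit 1.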

Reply 135 of 142, by superfury

It's odd that, even though the math instructions should be working according to the test386.asm testsuite (https://github.com/barotto/test386.asm), with most jump/call instructions verified as well, the Windows 95 setup still produces an unexpected Bound exception (it's handled by the BIOS, which jumps right back) before setup/Windows starts, and the VM monitor of EMM386.EXE (MS-DOS 6.22)/SYS (Windows 3.0) crashes into a #UD instruction (which shows the error and asks to press enter to reboot)?

I've manually checked the JMP and CALL instructions, which should be working correctly? Arithmetic instructions SHOULD also be working flawlessly(according to the testsuite results logged into a log file(port E9 log)). Conditional jumps should be correct as well, according to any found documentation.

Then WHY are those two 386-specific pieces of software crashing? Why the odd BOUND exception on Windows 95 setup/booting?

Something odd happens with MS-DOS 7.0 (used with Windows 95) as well: the entire system response becomes VERY slow (as in taking several seconds for each keypress to show up on the screen, in emulated time)?

Reply 136 of 142, by superfury

Just tried running Wolfenstein 3D again on the newer CPU emulation. It seems to hang waiting for VGA retrace, infinitely jumping back on itself? It first waits for it to enter retrace, then to leave retrace, then jumps back to the first wait?

Reply 137 of 142, by vladstamate

superfury wrote:

Just tried running Wolfenstein 3D again on the newer CPU emulation. It seems to hang waiting for VGA retrace, infinitely jumping back on itself? It first waits for it to enter retrace, then to leave retrace, then jumps back to the first wait?

Vertical retrace or horizontal? If it is horizontal then that is ok, it just means it is waiting for all lines to be rendered (effectively waiting for a vertical retrace). That is how the EGA and VGA BIOS do their checks: counting the number of horizontal retraces between vertical retraces, so there is a lot of wait-and-then-do-it-again action.

If it is waiting for the vertical retrace then maybe it is counting frames, for some kind of delay?

Reply 138 of 142, by superfury

That's the strange thing: it waits for the bit to toggle two times and then executes an unconditional jump back to the first loop's start, thus never finishing?

Reply 139 of 142, by superfury

Just tried it again (wolf3d.exe) with the 80386 (Compaq Deskpro 386). It seems to cause #GP faults all the time on reaching a certain point (16b7:0000fffd)? That's quite strange, as it shouldn't happen normally in real mode, even with HIMEM.SYS loaded (which will load a 0xFFFF limit, so it shouldn't trigger)?

Edit: After some debugging, I see it throwing a pseudo-protection fault at 16B7:FFFD, which seems to overflow into 16B7:10000 fetching operands, which throws a pseudo #GP fault? The instruction used is 0x83, 0x7D, 0x0E? It seems to cross 64K boundaries, thus faulting? That isn't supposed to happen in normal code execution, thus it's an incorrect location that's faulting already? The cause of the error then must be earlier in execution, before the #GP is reached?
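
The boundary arithmetic is easy to check. A small sketch, assuming 16-bit addressing and no prefixes, so that the three bytes quoted above (opcode 83, ModRM 7D, displacement 0E) are followed by the one immediate byte that opcode 83 always carries; the function name is made up.

```python
def crosses_limit(offset, length, limit=0xFFFF):
    """True if any byte of an instruction lies beyond the segment limit."""
    return offset + length - 1 > limit


# 83 7D 0E ib: opcode, ModRM, disp8 and imm8, so four bytes in total.
instruction_length = 4
print(crosses_limit(0xFFFD, instruction_length))  # last byte would be at 10000h
print(crosses_limit(0xFFFC, instruction_length))  # last byte is exactly at FFFFh
```

With a 0xFFFF limit the fourth byte falls outside the segment, so the fault itself is consistent. Real code is very unlikely to place an instruction there, which fits the suspicion that execution went wrong earlier.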

Attachment: debugger.log (1.95 MiB)
File comment: Debugger log of infinite loop(#GP fault?).
File license: Fair use/fair dealing exception
