VOGONS


Finding bugs in 8086-80386 emulation core?

First post, by superfury

Rank l33t++

When I try things like installing Windows 95 (setup.exe) or starting Megarace (the demo version), execution somehow keeps returning to a REP MOVSB instruction, which faults with #GP indefinitely because it reaches past the segment limit (the limit field is 0xFFFF, and the offset used is far beyond that).

I also tried installing MS-DOS 6.22 on the 8086 and 80386 emulation, but both inexplicably corrupt the hard disk image, making it unbootable (the installed MS-DOS 5.0 boots without any problems on all available CPU emulations, 8086 through 80386).

Does anyone know what might cause this? When I try to run the Megarace demo on the 80186 emulation, it starts the intro animation with lots of junk pixels on the screen, interleaved with what seems to be the helmet of the Cryo logo.

Last edited by superfury on 2017-08-17, 20:02. Edited 2 times in total.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 2 of 142, by superfury

One thing I know for sure (without firing up the debugger on the currently very slow 2.0GHz CPU, which runs the 16MHz 80386 at 2-4% speed) is that the REP MOVSB attempts to move memory past the 1MB barrier, somewhere between 1MB and 2MB (DS/ES offset), causing infinite general protection faults that return to the offending instruction. The 0x1xxxxx offset raises the fault because it violates the DS segment limit of 0xFFFF (64KB). That is, unless the granularity bit is misinterpreted, but after checking, the code tests the correct bit in the descriptor (according to the OSDev wiki).
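
To make the limit check concrete, here is a minimal sketch in C of how an effective limit can be derived from a raw 8-byte segment descriptor and tested against an access. It is not UniPCemu's actual code; the function names `effective_limit` and `access_within_limit` are made up for illustration, and it only covers expand-up data segments.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: decode the effective limit from a raw descriptor.
 * Limit bits 15..0 are in bytes 0-1, bits 19..16 in the low nibble of
 * byte 6, and the granularity (G) bit is bit 7 of byte 6. */
uint32_t effective_limit(const uint8_t desc[8])
{
    uint32_t limit = (uint32_t)desc[0]
                   | ((uint32_t)desc[1] << 8)
                   | ((uint32_t)(desc[6] & 0x0F) << 16);
    if (desc[6] & 0x80)                 /* G=1: limit counts 4KB pages */
        limit = (limit << 12) | 0xFFF;
    return limit;
}

/* Hypothetical helper: true when an access of `size` bytes (size >= 1)
 * starting at `offset` stays inside an expand-up segment. */
bool access_within_limit(uint32_t limit, uint32_t offset, uint32_t size)
{
    uint64_t last = (uint64_t)offset + size - 1;  /* last byte touched */
    return last <= limit;
}
```

With a limit of 0xFFFF and G clear, any offset of the form 0x1xxxxx fails this check, which matches the #GP described above. The same offset passes once the limit field is 0xFFFFF with G set.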

The #GP handler does nothing in protected mode and seems to stay in real mode, IRETing back to the offending REP MOVSB instruction.

As far as I know, the #GP handler should enter protected mode, load DS or ES, then return to real mode and IRET back to the instruction, which then moves data to/from extended memory, at least in the case of himem.sys?

Reply 3 of 142, by vladstamate

Rank Oldbie

Is the first general protection fault warranted? As in, do you think it should happen? What happens if you run this code on a 286 CPU?

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 4 of 142, by superfury

Whoops. Thinking about it: it won't work. The instruction is a REP MOVSD, not a REP MOVSB (my typo). So it'll only run on an 80386+ (the faulting instruction is 32-bit). I'll have to check the fault, but that's going to take a while at 4% speed. Also, the ESI/EDI registers are loaded with 32-bit values, which break on 16-bit CPUs.

Reply 5 of 142, by vladstamate

So who is executing REP MOVSD, is it the BIOS or the game? My question was more to find out if the GPF is supposed to happen in the first place. Maybe you have some segment problems, especially if the code is executed in the VESA BIOS.

Have you tried running Megarace with other graphics cards? Like EGA (even though it is a VGA game, you just want to know whether it causes a GPF), or plain VGA or ET4000.

Reply 6 of 142, by superfury

Just tried the Windows 95 setup again (no parameters). It ends up with a BOUND instruction faulting back to itself (the bound exception handler IRETs back to the BOUND instruction, which faults again, producing the same kind of loop). It's a BOUND SI,[SS:BP+DI+74] instruction at 0FDE:C36A, with SI being 4778h.

Reply 7 of 142, by superfury

Now trying Megarace on the Compaq Deskpro 386 again. Audio and video seem to work now? Although slowly, at only 16% speed on a 4GHz Intel i7 4790K.

Edit: The Megarace demo now seems to run all the way into the gameplay itself 😁 Windows 95 setup still crashes on that BOUND instruction, though, for some unknown reason (the BOUND instruction faults and the handler returns to the same BOUND instruction without anything having changed).

Recording of the Megarace audio when running the demo, as well as screen captures taken throughout gameplay in UniPCemu (using RALT+F5 to make a screen capture), on the Compaq Deskpro 386 machine emulation, with an 80386 CPU at 16MHz (at only 16% speed).
https://www.dropbox.com/s/riszsakskvt8cdg/Uni … ceDemo.zip?dl=0

380.jpg (30.24 KiB): Megarace demo starting
382.jpg (30.36 KiB): Cryo logo
384.jpg (29.28 KiB): Megarace!
386.jpg (52.04 KiB): City display
393.jpg (75.11 KiB): Lance!

Reply 8 of 142, by superfury

394.jpg (91.13 KiB): Selecting a car
396.jpg (101.9 KiB): Gameplay starting
398.jpg (85.82 KiB): And just before stopping the recording of audio.

Reply 9 of 142, by vladstamate

Very nice!

Reply 10 of 142, by superfury

Also, during gameplay of Megarace, it won't respond to any keyboard input?

Isn't the Windows 95 setup.exe supposed to install its own bound exception handler? I see the BIOS exception handler modifying AH after some checks, before IRETing.

Reply 11 of 142, by superfury

Looking at the BOUND instruction that's executed again: it's executing with a segment of SS, which has the value 1BFDh, and an EA offset of 10AA0h. The 10AA0h value is truncated to 16 bits, since it's using 16-bit offsets. I'll disable this truncation and try again.

Edit: That causes the Compaq Deskpro 386 BIOS POST itself to fail right when getting to POST code 06h (so pretty much at the beginning), so the 16-bit wraparound is required. Only 32-bit offsets (using 32-bit registers and 32-bit address size for memory) can get past 64K.
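
The wraparound rule can be sketched in C as follows. This is an illustration only: the function names are invented, and the BP and DI values in the usage note are hypothetical ones chosen so that the sum comes out at the 10AA0h from the post (assuming the displacement 74 is hexadecimal).

```c
#include <stdint.h>

/* Hypothetical helper: sum base + index + displacement and apply the
 * address-size rule. With 16-bit addressing the sum wraps at 64K;
 * with 32-bit addressing the full value is kept. */
uint32_t effective_address(uint32_t base, uint32_t index, uint32_t disp,
                           int addr32)
{
    uint32_t sum = base + index + disp;
    return addr32 ? sum : (sum & 0xFFFFu);
}

/* Hypothetical helper: real-mode linear address, segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint32_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, with BP=FFF0h, DI=0A3Ch and a displacement of 74h, `effective_address(0xFFF0, 0x0A3C, 0x74, 0)` gives 0AA0h rather than 10AA0h, so with SS=1BFDh the bounds would be read from linear address 1CA70h.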

So looking at the executed BOUND instruction again: it's comparing the value against a minimum and maximum of 0? That doesn't make sense? Why would you have an array with only one entry?
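
For reference, the check that BOUND performs can be sketched like this in C (a simplified illustration with an invented function name, not the emulator's code). The index and both bounds are compared as signed values, and the instruction faults when the index lies outside the inclusive range.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: 16-bit BOUND check. The two bounds are stored in
 * memory as lower bound first, then upper bound. Returns true when the
 * instruction should raise the bound range exceeded exception. */
bool bound16_faults(int16_t index, int16_t lower, int16_t upper)
{
    return index < lower || index > upper;
}
```

With both bounds read as 0, only an index of 0 passes, so SI=4778h faults every time. Because the exception is a fault, the saved return address points at the BOUND instruction itself, so a handler that IRETs without changing SI or the bounds re-executes it and faults again.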

Reply 12 of 142, by superfury

Just tried a Windows 95 hard disk image (from crazyc if I remember correctly, it was a long time ago) with UniPCemu. First I fixed a small bug: when an error (like a general protection fault or page fault) occurred while fetching, the instruction fetcher would keep fetching the ModR/M parameters (or any data following the opcode in general) forever. With that fixed, I see that it's trying to execute some instruction (which can't be fully fetched, due to the fault) at location 0000:FFFF. So that instruction is actually causing a general protection fault, which used to hang the CPU because of that bug. Now it handles the fault somewhat, until it finally returns to the faulting instruction at 0000:FFFF, and this repeats forever. So something is causing it to get to that point instead of running like it's supposed to (it's still in real mode afaik).

The image is probably the one from some tutorial on running Windows 95 on a PSP. I don't know whether it's legal to mention here on the forums (the disk image itself might be illegal, but I'm only using it to test my CPU core in this case). As far as I can see, it's a stripped-for-PSP or minimally installed version of Windows 95, which I installed myself (or not, I don't remember exactly where I got it from).

Strangely enough, that 0000:FFFF sounds like a misinterpreted far JMP (opcode EA) instruction jumping to FFFF:0000 (which is a soft-reset vector)?
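
To show what such a misinterpretation would look like, here is a small C sketch of decoding the operand of a 16-bit far JMP (opcode EA). The struct and function names are invented for illustration. The encoding stores the offset first and the segment second, both little-endian.

```c
#include <stdint.h>

/* Hypothetical type for a 16:16 far pointer. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} FarPointer;

/* Hypothetical helper: decode EA oo oo ss ss (16-bit operand size).
 * insn[0] is the opcode byte, bytes 1-2 hold the offset and
 * bytes 3-4 hold the segment. */
FarPointer decode_jmp_far16(const uint8_t *insn)
{
    FarPointer target;
    target.offset  = (uint16_t)(insn[1] | (insn[2] << 8));
    target.segment = (uint16_t)(insn[3] | (insn[4] << 8));
    return target;
}
```

The bytes EA 00 00 FF FF decode to FFFF:0000. A decoder that swapped the two fields would land on 0000:FFFF instead, which is the kind of mix-up suspected here.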

Edit: I don't see any jumps to 0000:FFFF being emitted, nor any jump instruction that looks like that. So it's perhaps some error handler? Maybe some interrupt?

Reply 13 of 142, by superfury

Stepping through the boot loader process, I see it loading two sectors from the hard disk, then eventually continuing somewhere into zeroed RAM, which eventually ends up at 0000:FFFF. So it's the second stage of the boot loader that's going wrong (the part loaded at 0000:7C00 after the first stage of the boot loader, which relocates itself to another part of RAM, has executed).

Reply 14 of 142, by superfury

I've made a little dump of the boot sector, then the second stage of the boot sector after relocation and loading the second sector from disk:

Attachment: debugger.log (341.55 KiB). UniPCemu running the second stage of the Windows 95 boot loader process (after relocation and loading the sector at 0000:7C00).

Can anyone see what's going wrong?

Reply 15 of 142, by superfury

Hmmm... Strange: when looking at the log of the REP MOVSB that copies some table from the BIOS ROM (at F000:EFC7), I only see two bytes (0xDF and 0x02) being transferred (addresses FEFC7=>522 and FEFC8=>523), interleaved with lots of (I assume) instruction fetches from RAM. Nothing is logged about any of the other data being transferred. Finally it ends with the PUSH ES instruction, which neatly pushes ES to the stack. So is most of the data transferred at all? Seeing as the CX register is zero at the PUSH ES instruction, it should have been transferred, but nothing is logged. I'm recreating a simple log which includes the instructions that were skipped (with the logging mode set to Log Always, transfers made while skipping instructions using the debugger's skip functionality aren't logged).
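
The reasoning about CX can be illustrated with a minimal C sketch of REP MOVSB in real mode. This is a simplification with invented names: it assumes the direction flag is clear, a 1MB memory array, and addresses wrapping at 20 bits.

```c
#include <stdint.h>

/* Hypothetical register subset used by the string move. */
typedef struct {
    uint16_t ds, es, si, di, cx;
} StringRegs;

/* Hypothetical helper: REP MOVSB with DF=0. `mem` must point to a
 * 1MB buffer. Each iteration copies one byte from DS:SI to ES:DI,
 * advances SI and DI and decrements CX. Returns the bytes copied. */
unsigned rep_movsb(uint8_t *mem, StringRegs *r)
{
    unsigned copied = 0;
    while (r->cx != 0) {
        uint32_t src = (((uint32_t)r->ds << 4) + r->si) & 0xFFFFFu;
        uint32_t dst = (((uint32_t)r->es << 4) + r->di) & 0xFFFFFu;
        mem[dst] = mem[src];
        r->si++;
        r->di++;
        r->cx--;
        copied++;
    }
    return copied;
}
```

Since CX only reaches zero after every iteration has run, a zero CX at the following PUSH ES suggests the copy completed and that the missing lines are a logging gap rather than missing transfers. With DS=F000h and SI=EFC7h the first source byte is at linear address FEFC7h, and with ES=0000h and DI=0522h the first destination is 522h, matching the two logged transfers.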

Reply 16 of 142, by superfury

A quick upload with all data, including the missing skipped executions(INT13 calls etc.):

Attachment: debugger_20170716_2225_complete.zip (564.19 KiB). Debugger log, including interrupt 13h executing (full log, including skipped instructions, from the second arrival at 0000:7C00 (volume boot sector) until the crash at 0000:FFFF).

The instruction(s) that have the error should be in there somewhere.

Edit: The code that's crashing and burning (in the log) is the volume boot sector, not IO.SYS. It's supposed to load the first part of (or the entire) IO.SYS file, which (seeing as MSDOS.SYS is a text file) loads the Windows 95 OS kernel itself. So why is it crashing halfway through the volume boot sector's execution? What's going wrong? MS-DOS 5.0a boots without problems. MS-DOS 6.22 crashes as well.

Reply 17 of 142, by superfury

I've been looking at the point where it examines the BPB table that's in the volume boot sector at address 24h. What's strange is: it reads the value (80h, the hard disk number) and subtracts zero from it (CMP 80h,00h), but the JGE/JNL isn't taken, leading into some dead code, according to http://iks.cs.ovgu.de/~elkner/fat-amorgana/vb … t32-mswin41.txt ? 80h>=00h, so the jump (taken when Sign Flag==Overflow Flag) should be taken. But the Sign Flag isn't equal to the Overflow Flag, thus leading into the dead code block?

: 38 4e 40 cmp [BP+0x40],CL # Check, whether the drive number in the
# BPB (7C40h) denotes a valid device,
# i.e. >= 0. Since this is always the
# case,
7C8E: 7d 25 jge 0x7CB5 # jump to prepare_read (0x7C8E+0x25+2)
; end of set_disk_params

dead_code_fillbytes:

The equivalent of that jump isn't being taken, due to some odd CMP flag error(SF vs OF problem)?

So, CL needs to be 80h(the booted disk number), but is 00h(BIOS Floppy disk number) instead?

Reply 18 of 142, by superfury

Edit: Thinking about it: JGE is correct, but only for signed operands. The 80h (C drive) and 00h (A drive) are unsigned operands! Why would it treat the BIOS disk number as a signed comparison? With C<A, the dead code is only skipped for floppy disks or emulated floppy disks (CD-ROM floppy boot?). It doesn't make sense, unless you want to execute the 'dead code' block for (emulated) hard disks and CD-ROMs (with emulated hard disk images) only?
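
The signed versus unsigned difference can be checked with a short C sketch of an 8-bit CMP and the two jump conditions. The names are made up for illustration, and only the flags relevant here are modelled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical subset of the flags register. */
typedef struct {
    bool cf, zf, sf, of;
} Flags;

/* Hypothetical helper: flags produced by CMP dst,src on 8-bit operands
 * (a subtraction whose result is discarded). */
Flags cmp8(uint8_t dst, uint8_t src)
{
    uint8_t res = (uint8_t)(dst - src);
    Flags f;
    f.cf = dst < src;                                  /* unsigned borrow */
    f.zf = (res == 0);
    f.sf = (res & 0x80) != 0;                          /* sign of result */
    f.of = (((dst ^ src) & (dst ^ res)) & 0x80) != 0;  /* signed overflow */
    return f;
}

/* JGE/JNL: signed greater-or-equal, taken when SF == OF. */
bool jge_taken(Flags f) { return f.sf == f.of; }

/* JAE/JNB: unsigned above-or-equal, taken when CF == 0. */
bool jae_taken(Flags f) { return !f.cf; }
```

For CMP 80h,00h the result is 80h, so SF=1 and OF=0 and JGE is not taken, because 80h is -128 as a signed byte. An unsigned JAE would be taken for the same operands. A drive number of 00h gives SF=0 and OF=0, so JGE is taken.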
