Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-12 @ 14:35

When I try things like installing Windows 95 (setup.exe) or starting Megarace (the demo version), execution somehow keeps returning to a REP MOVSB instruction, which faults with #GP forever because it reaches past the segment limit (the limit field is 0xFFFF, and the offset used is way past that).

I also tried installing MS-DOS 6.22 on the 8086 and 80386 emulations, but both inexplicably corrupt the hard disk image, making it unbootable (the MS-DOS 5.0 that's already installed boots without any problems on all available CPU emulations, 8086 through 80386).

Does anyone know what might cause this? When I try to run the Megarace demo on the 80186 emulation, it starts the intro scene animation, with lots of junk pixels on the screen, interleaved with what seems to be the helmet of the Cryo logo.
superfury
Oldbie
 
Posts: 1681
Joined: 2014-3-08 @ 11:25
Location: Netherlands

Re: Finding bugs in 8086-80386 emulation core?

Postby peterferrie » 2017-7-13 @ 17:00

Is it writing to graphics memory at the time? If so, then that description sounds to me like the VESA technique, where page-faulting is used to know when to switch the planes.
peterferrie
Oldbie
 
Posts: 561
Joined: 2008-5-08 @ 21:54

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-14 @ 21:23

One thing I know for sure (without firing up the debugger on the currently very slow 2.0GHz CPU running the 16MHz 80386 at 2-4% speed) is that the REP MOVSB attempts to move memory past the 1MB barrier, somewhere between 1MB and 2MB (DS/ES offset), causing endless General Protection faults that return to the offending instruction. The 0x1xxxxx offset raises the fault because it violates the DS segment's limit of 0xFFFF (64KB). That is, unless the granularity bit is misinterpreted, but after checking, the code does seem to test the correct bit in the descriptor (according to the OSDev wiki).
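For reference, a minimal sketch of the limit check being described, assuming a raw 8-byte descriptor and an expand-up data segment. The function names are made up for illustration and are not taken from UniPCemu.

```c
#include <stdbool.h>
#include <stdint.h>

/* Effective byte limit of a segment, computed from its raw 8-byte descriptor. */
uint32_t descriptor_limit(const uint8_t desc[8])
{
    /* Limit bits 0-15 live in bytes 0-1, bits 16-19 in the low nibble of byte 6. */
    uint32_t limit = (uint32_t)desc[0] | ((uint32_t)desc[1] << 8) |
                     ((uint32_t)(desc[6] & 0x0F) << 16);
    /* G is bit 7 of byte 6: when set, the limit is counted in 4KB pages. */
    if (desc[6] & 0x80)
        limit = (limit << 12) | 0xFFF;
    return limit;
}

/* True if an access of `size` bytes (size >= 1) at `offset` fits inside
   an expand-up segment with the given effective limit. */
bool access_within_limit(uint32_t limit, uint32_t offset, uint32_t size)
{
    uint32_t last = offset + (size - 1);
    if (last < offset)          /* the access wrapped past 4GB */
        return false;
    return last <= limit;
}
```

With a byte-granular limit of 0xFFFF, any access at an offset of 0x100000 or higher fails this check, which matches the #GP seen here. The same access passes once the limit is a page-granular 0xFFFFF.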

The #GP handler does nothing in protected mode and seems to stay in real mode, IRETing back to the offending REP MOVSB instruction.

Afaik, the #GP handler should enter protected mode, load DS or ES, then return to real mode and IRET back to the instruction, which then moves data to/from extended memory (at least in the case of himem.sys, I think)?

Re: Finding bugs in 8086-80386 emulation core?

Postby vladstamate » 2017-7-14 @ 23:48

Is the first general protection fault warranted? As in, do you think it should happen? What happens if you run this code on a 286 CPU?
vladstamate
Oldbie
 
Posts: 524
Joined: 2015-8-23 @ 01:43

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-15 @ 08:03

Whoops. Thinking about it: it won't work. The instruction is a REP MOVSD instead of a REP MOVSB (typo). So it'll run on an 80386+ only (it's a 32-bit instruction that's faulting). I'll have to check the fault, but it's going to take a while to run it at 4% speed. Also, the ESI/EDI registers are loaded with 32-bit values, which would get messed up on 16-bit CPUs.

Re: Finding bugs in 8086-80386 emulation core?

Postby vladstamate » 2017-7-15 @ 14:50

So who is executing REP MOVSD, is it the BIOS or the game? My question was more to find out if the GPF is supposed to happen in the first place. Maybe you have some segment problems, especially if the code is executed in the VESA BIOS.

Have you tried running Megarace with other graphics cards? Like EGA (even if it is a VGA game, you just want to know if it causes a GPF), or plain VGA or ET4000.

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-15 @ 16:04

Just tried the Windows 95 setup again (no parameters). It ends up with a BOUND instruction faulting back to itself (the BOUND exception handler IRETs back to the BOUND instruction, which produces the same kind of loop). It's a BOUND SI,[SS:BP+DI+74] instruction at 0FDE:C36A, with SI being 4778h.

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-15 @ 16:17

Now trying Megarace on the Compaq Deskpro 386 again. Audio and video seem to work now? Although slowly, at only 16% speed on a 4GHz Intel i7 4790K.

Edit: The Megarace demo now seems to run all the way into the gameplay itself :D Windows 95 setup still crashes on that BOUND instruction, though, for some unknown reason (the BOUND instruction faults and the handler returns to the BOUND instruction without anything having changed).

Recording of the Megarace audio when running the demo, as well as screen captures taken throughout gameplay in UniPCemu (using RALT+F5 to make a screen capture), on the Compaq Deskpro 386 machine emulation, with an 80386 CPU at 16MHz (only at 16% speed).
https://www.dropbox.com/s/riszsakskvt8c ... o.zip?dl=0

380.jpg
Megarace demo starting

382.jpg
Cryo logo

384.jpg
Megarace!

386.jpg
City display

393.jpg
Lance!

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-15 @ 16:46

394.jpg
Selecting a car

396.jpg
Gameplay starting

398.jpg
And just before stopping the recording of audio.


Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-16 @ 10:46

Also, during gameplay of Megarace, it won't respond to any keyboard input?

Isn't the Windows 95 setup.exe supposed to install its own BOUND exception handler? I see the BIOS exception handler modifying AH after some checks, before IRETing.

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-16 @ 13:29

Looking at the BOUND instruction that's executed again: it's executing with a segment of SS, which has the value 1BFDh, with an EA offset of 10AA0h. The 10AA0h value is truncated to 16 bits, since it's using 16-bit offsets. I'll disable this truncation and try again.

Edit: That causes the Compaq Deskpro 386 BIOS POST itself to fail right when getting to POST code 06h (so pretty much at the beginning), so the 16-bit wraparound is required. Only 32-bit offsets (using 32-bit registers and a 32-bit address size for memory) can get past 64K.
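A minimal sketch of that 16-bit wraparound, for a [base+index+disp] operand with a 16-bit address size. The function name and the register values in the usage note are made up for illustration; only the 10AA0h sum comes from the log.

```c
#include <stdint.h>

/* Effective address of a 16-bit [base+index+disp] operand.
   The sum is computed in 32 bits and then wrapped at 64KB. */
uint16_t ea16(uint16_t base, uint16_t index, uint16_t disp)
{
    uint32_t sum = (uint32_t)base + index + disp;
    return (uint16_t)(sum & 0xFFFF);
}
```

For example, hypothetical values BP=FFF0h, DI=0A3Ch and a displacement of 74h add up to 10AA0h, which `ea16` wraps to 0AA0h.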

So looking at the executed BOUND instruction again: it's comparing the value against a minimum and maximum of 0? That doesn't make sense? Why would you have an array with only one entry?
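For comparison, a minimal sketch of the 16-bit BOUND check: the index and both bounds are treated as signed words, and the fault is raised when the index lies outside the inclusive range. The helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interpret a 16-bit value as a signed (two's complement) number. */
static int32_t sign16(uint16_t v)
{
    return (v & 0x8000) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* True when BOUND would raise #BR (interrupt 5): the signed index
   lies outside the inclusive range [lower, upper]. */
bool bound16_faults(uint16_t index, uint16_t lower, uint16_t upper)
{
    int32_t i = sign16(index);
    return i < sign16(lower) || i > sign16(upper);
}
```

With both bounds read as 0, an index of 4778h is out of range, so the fault itself is consistent with these operands. Since #BR is reported as a fault, the saved return address points at the BOUND instruction again, which would explain the loop when the handler changes nothing.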

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-16 @ 15:04

Just tried a Windows 95 hard disk image (from crazyc if I remember correctly, it was a long time ago) with UniPCemu. I first fixed a little bug that made the instruction fetcher keep fetching ModR/M parameters (or any data following the opcode in general) forever when an error (like a general protection fault or page fault) occurred. Now I see that it's trying to execute some instruction (which can't be fully fetched, due to the fault) at location 0000:FFFF. So that instruction is actually causing a General Protection fault, which used to hang the CPU because of that bug. With the bug fixed, I see it handling the fault somewhat, until it finally returns to the faulting instruction at 0000:FFFF, and this repeats forever. So something is causing it to get to that point, instead of running like it's supposed to (it's still in real mode afaik).

The image is probably the one from some tutorial on running Windows 95 on a PSP. I don't know whether it's legal to mention here on the forums (the disk image itself might be illegal, but I'm only using it to test my CPU core in this case). As far as I can see, it's a version of Windows 95 that's stripped down for the PSP or minimally installed, which I installed myself (or not, I don't remember exactly where I got it from).
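To illustrate the fetcher fix mentioned above, here is a rough sketch of a fetch routine that gives up at the first fault instead of continuing to fetch operand bytes. `FetchState`, `fetch_byte` and `fetch_opcode_modrm` are hypothetical names, not UniPCemu's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fetch state; none of these names come from UniPCemu. */
typedef struct {
    const uint8_t *code;   /* bytes of the code segment */
    size_t limit;          /* last valid offset in the code segment */
    size_t ip;             /* next offset to fetch from */
    bool faulted;          /* set once a fetch has raised a fault */
} FetchState;

/* Fetch one byte. On a limit violation, record the fault and fetch nothing. */
bool fetch_byte(FetchState *f, uint8_t *out)
{
    if (f->faulted || f->ip > f->limit) {
        f->faulted = true;
        return false;
    }
    *out = f->code[f->ip++];
    return true;
}

/* Fetch an opcode and its ModR/M byte, stopping at the first fault
   rather than retrying the remaining fetches. */
bool fetch_opcode_modrm(FetchState *f, uint8_t *opcode, uint8_t *modrm)
{
    if (!fetch_byte(f, opcode))
        return false;      /* the caller delivers the fault */
    if (!fetch_byte(f, modrm))
        return false;
    return true;
}
```

An instruction starting at the last valid offset behaves like the one at 0000:FFFF: the opcode byte is fetched, the next byte is past the limit, and the fetch stops there with the fault recorded.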

Strangely enough, that 0000:FFFF looks like a misinterpreted EA (far JMP opcode) instruction jumping to FFFF:0000 (which is the soft-reset vector)?

Edit: I don't see any jumps to 0000:FFFF being emitted, nor any jump instruction that looks like that. So it's perhaps some error handler? Maybe some interrupt?
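A small sketch of the two addresses involved, assuming plain real-mode addressing and the standard encoding of the far JMP (opcode EAh followed by a 16-bit offset and then a 16-bit segment, both little-endian). The names are made up for illustration, and the swapped-halves idea is only a guess that the log doesn't confirm.

```c
#include <stdint.h>

typedef struct { uint16_t segment, offset; } FarPtr;

/* Real-mode linear address of segment:offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Decode the ptr16:16 operand that follows opcode EAh:
   offset first, then segment, both little-endian. */
FarPtr decode_far_ptr(const uint8_t operand[4])
{
    FarPtr p;
    p.offset  = (uint16_t)(operand[0] | (operand[1] << 8));
    p.segment = (uint16_t)(operand[2] | (operand[3] << 8));
    return p;
}
```

The operand bytes 00 00 FF FF decode to FFFF:0000, which is linear FFFF0h. Reading the two words in the wrong order would give 0000:FFFF instead, which is linear 0FFFFh.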

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-16 @ 17:25

Stepping through the boot loader process, I see it loading two sectors from the hard disk, then eventually continuing on somewhere into zeroed RAM, which eventually ends up at 0000:FFFF. So it's the second stage of the boot loader that's going wrong (the part of the loader that's loaded at 0000:7C00 after executing the first stage of the boot loader, which is relocated to another part of RAM).

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-16 @ 17:39

I've made a little dump of the boot sector, then the second stage of the boot sector after relocation and loading the second sector from disk:
debugger.log
UniPCemu running the second stage of the Windows 95 boot loader process(after relocation and loading the sector at 0000:7C00).


Can anyone see what's going wrong?

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-16 @ 20:18

Hmmm... Strange: when looking at the log of the REP MOVSB that copies some table from the BIOS ROM (at F000:EFC7), I only see two bytes (0xDF and 0x02) being transferred (addresses: FEFC7=>522 and FEFC8=>523), interleaved with lots of (I assume) instruction fetches from RAM. Nothing's logged about any of the other data being transferred. Finally it ends with the PUSH ES instruction, which neatly pushes ES onto the stack. So is most of the data transferred at all? Since the CX register is zero at the PUSH ES instruction, the bytes should have been transferred, but nothing's logged. I'm recreating a simple log that includes the instructions that were skipped (with the logging mode set to Log Always, transfers made while skipping instructions with the debugger's skip functionality aren't logged).
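As a reference for what the log should show, here is a minimal sketch of REP MOVSB with the direction flag clear, on a flat 1MB memory array. The `Regs` struct, the helper names, and the wrap at 1MB are simplifying assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical subset of the registers used by REP MOVSB. */
typedef struct { uint16_t ds, es, si, di, cx; } Regs;

/* Real-mode linear address, wrapped at 1MB (a simplification). */
static uint32_t lin(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFF;
}

/* REP MOVSB with DF=0: copy CX bytes from DS:SI to ES:DI, one per
   iteration, and return how many bytes were copied. */
uint32_t rep_movsb(uint8_t *mem, Regs *r)
{
    uint32_t copied = 0;
    while (r->cx != 0) {
        mem[lin(r->es, r->di)] = mem[lin(r->ds, r->si)];
        r->si++;
        r->di++;
        r->cx--;
        copied++;
    }
    return copied;
}
```

Every iteration performs one read and one write, so with CX ending at zero a complete log would contain one source/destination pair per byte, not just the first two (FEFC7=>522 and FEFC8=>523).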

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-16 @ 20:28

A quick upload with all data, including the missing skipped executions(INT13 calls etc.):
debugger_20170716_2225_complete.zip
Debugger log, including interrupt 13h executing(Full log, including skipped instructions, from second arrival at 0000:7C00(volume boot sector) until crash at 0000:FFFF).


The instruction(s) that have the error should be in there somewhere.

Edit: The code that's crashing and burning (in the log) is the volume boot sector, not IO.SYS. It's supposed to load the first part of (or the entire) IO.SYS file, which (seeing as MSDOS.SYS is a text file) loads the Windows 95 OS kernel itself. So why is it crashing halfway through the volume boot sector's execution? What's going wrong? MS-DOS 5.0a boots without problems. MS-DOS 6.22 crashes as well.

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-17 @ 00:50

I've been looking at the point where it examines the BPB that's in the volume boot sector at address 24h. What's strange is: it reads the value (80h, the hard disk number) and subtracts zero from it (CMP 80h,00h), but the JGE/JNL isn't taken, leading into some dead code, according to http://iks.cs.ovgu.de/~elkner/fat-amorg ... swin41.txt ? 80h>=00h, so the jump (taken when Sign Flag==Overflow Flag) should be taken. But the Sign Flag isn't equal to the Overflow Flag, thus leading into the dead code block?

    : 38 4e 40    cmp [BP+0x40],CL    # Check, whether the drive number in the
                                      # BPB (7C40h) denotes a valid device,
                                      # i.e. >= 0. Since this is always the
                                      # case,
7C8E: 7d 25       jge 0x7CB5          # jump to prepare_read (0x7C8E+0x25+2)
; end of set_disk_params

dead_code_fillbytes:


The equivalent of that jump isn't being taken, due to some odd CMP flag error (an SF vs OF problem)?

So, CL needs to be 80h (the booted disk number), but is 00h (the BIOS floppy disk number) instead?

Re: Finding bugs in 8086-80386 emulation core?

Postby superfury » 2017-7-17 @ 11:42

Edit: Thinking about it: JGE is correct, but only for signed operands. The 80h (C drive) and 00h (A drive) are unsigned operands! Why would it treat the BIOS disk number as a signed comparison (C<A), so that the dead code is only skipped for floppy disks or emulated floppy disks (CD-ROM floppy boot?)? It doesn't make sense, unless you want to execute the 'dead code' block only for (emulated) hard disks and CD-ROMs (with emulated hard disk images)?
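A minimal sketch of the flags an 8-bit CMP produces and of the two "greater or equal" jump conditions, to show the signed versus unsigned difference. The struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool cf, zf, sf, of; } Flags;

/* Flags produced by an 8-bit CMP dest,src (dest - src, result discarded). */
Flags cmp8(uint8_t dest, uint8_t src)
{
    uint8_t res = (uint8_t)(dest - src);
    Flags f;
    f.cf = dest < src;                 /* unsigned borrow */
    f.zf = res == 0;
    f.sf = (res & 0x80) != 0;
    /* Signed overflow: the operands differ in sign and the result's
       sign differs from the sign of dest. */
    f.of = (((dest ^ src) & (dest ^ res)) & 0x80) != 0;
    return f;
}

bool jge_taken(Flags f) { return f.sf == f.of; }   /* signed >= */
bool jae_taken(Flags f) { return !f.cf; }          /* unsigned >= */
```

For CMP 80h,00h this gives SF=1 and OF=0, so JGE is not taken (80h is -128 as a signed byte), while the unsigned JAE would be taken. An emulator that falls through here is behaving correctly.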

Re: Finding bugs in 8086-80386 emulation core?

Postby peterferrie » 2017-7-17 @ 19:11

It's used to find the first active partition, which might not be the first entry in the list.