Writing a 386/486 emulator, having some issues

Reply 20 of 31, by UselessSoftware

Posted on 2025-04-09, 17:45

UselessSoftware Offline

Rank: Newbie
Posts: 93
Joined: 2020-06-15, 00:20
Location: United States

superfury wrote on 2025-04-08, 21:42:
UselessSoftware wrote on 2025-04-08, 18:09:
superfury wrote on 2025-04-08, 10:45:
Have you tried test386.asm yet?

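Also regarding the BT* instructions, isn't the bit offset supposed to be signed? The memory displacement is the bit offset shifted right by 4 (16-bit operand) or 5 (32-bit operand), then shifted left by one (16-bit) or two (32-bit). So a 16-bit bit offset of 8000h actually addresses bit 0 of the word at r/m offset -4096 bytes (2048 words below the base), read and written as a 16-bit word?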
So for R/M offset 10000h, it's at a lower address.
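
To make the signed arithmetic concrete, here is a minimal C sketch of the displacement calculation for the register-offset memory forms of BT/BTS/BTR/BTC. The function names are made up for illustration, and it assumes the documented behaviour: the bit offset is sign-extended and floor-divided by the operand width.

```c
#include <assert.h>
#include <stdint.h>

/* Floor division (rounds toward negative infinity), unlike C's '/'. */
static int64_t floor_div(int64_t a, int64_t b)
{
    int64_t q = a / b;
    if ((a % b != 0) && ((a < 0) != (b < 0)))
        q--;
    return q;
}

/* Byte displacement to add to the r/m effective address.
   reg_value is the raw register holding the bit offset,
   operand_bits is 16 or 32 (hypothetical helper, not from any emulator). */
static int64_t bt_byte_displacement(uint32_t reg_value, int operand_bits)
{
    assert(operand_bits == 16 || operand_bits == 32);
    int64_t offset;
    if (operand_bits == 16) {
        uint32_t v = reg_value & 0xFFFFu;
        offset = (v & 0x8000u) ? (int64_t)v - 0x10000 : (int64_t)v;
    } else {
        offset = (reg_value & 0x80000000u)
                     ? (int64_t)reg_value - 0x100000000LL
                     : (int64_t)reg_value;
    }
    /* 2 * (offset DIV 16) or 4 * (offset DIV 32) */
    return (operand_bits / 8) * floor_div(offset, operand_bits);
}

/* Bit number inside the addressed word or doubleword. */
static unsigned bt_bit_index(uint32_t reg_value, int operand_bits)
{
    return reg_value & (uint32_t)(operand_bits - 1);
}
```

With this, a 16-bit offset of FFFFh (-1) lands on bit 15 of the word two bytes below the base, which is the case a debug check on the sign bit would need to catch.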

Edit: Added a simple MS-DOS based testsuite for bit test string instruction (it tests both 16-bit and 32-bit versions on a 3 doubleword bit string in memory, with the pointer on the middle doubleword).
It will test the first doubleword (positive addresses, at the base address), then the second doubleword (base+1), then the previous doubleword (base-1).

It's fully running in 16-bit MS-DOS mode, but the 32-bit addresses will use operand and address size overrides (it uses EDX for addressing easily).

In UniPCemu's current commits, it seems to run properly at least (with additional bugfixes in the emulator performed). Oddly enough, the test386.asm testsuite doesn't verify that the positive and negative ranges are functioning properly (it just tests the register version of those opcodes for some odd reason).

I found a Youtube video that seems to explain it nicely:
https://www.youtube.com/watch?v=en_7DtfT8Cg

Ah! Didn't realize that, thanks. I'll get that fixed.

I did quickly put in a printf debug line that tells me if any BT opcodes are operating on an offset with the sign bit set but it never triggered. So that's not the cause of my current problems, but definitely still need to fix.

I did run test386 before, it runs successfully up into some of the protected mode tests that fail just because I haven't implemented a number of protections yet. I guess it's time to do those. Or comment out those tests and re-compile so it continues and get to them later.

I actually wonder if my FPU is just extremely broken and that's causing the problems. I've barely worked on it, and I do see that Linux executes a few FPU instructions as it loads. Maybe some errors there are tripping it up, I'm just not sure how much it relies on it for the boot process.

Afaik, (at least older) Linux should simply work without an FPU, falling back to FPU emulation by trapping those opcodes when no x87 is present. That uses a specific fault handler, enabled through a CR0 bit (EM), for opcodes D8h-DFh. Those opcodes can be safely ignored (a NOP, except that it has a ModR/M byte and no immediate) to emulate a machine without an FPU. The OS will usually execute FNINIT and FNSTSW:
FNINIT
FNSTSW WORD PTR [FPU_STATUS]
My emulator simply does the following, for example:
- FNINIT/FNSTSW: disassemble, behave like a NOP.
- Any other FPU (D8-DF) instruction without EM set: NOP, but disassemble as an 'unimplemented FPU instruction'. Also fetch the instruction's ModR/M byte, but ignore it (to continue onwards to the next instruction).
- Any FPU instruction with EM set: trap to the OS using the emulation exception (#NM), like other CPU faults (in this case just like #UD, except that the ModR/M byte is still fetched, as D8-DF instruction fetching is handled first).
In a way, it's like 0F18-0F1F, all behaving like 0F1F, except optionally throwing an exception on execution (#NM) depending on the EM and TS bits in CR0.
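
Skipping an ignored ESC instruction means consuming the ModR/M byte plus whatever SIB and displacement bytes it implies. A minimal C sketch of that length calculation follows; the helper name is hypothetical, prefixes are assumed to be handled by the caller, and `addr32` is whatever the effective address size is after any override.

```c
#include <stdint.h>

/* Returns the number of bytes following the D8-DF opcode byte:
   ModR/M, optional SIB and optional displacement.
   p points at the ModR/M byte; addr32 is nonzero for 32-bit addressing. */
static unsigned esc_tail_length(const uint8_t *p, int addr32)
{
    uint8_t modrm = p[0];
    unsigned mod = modrm >> 6;
    unsigned rm = modrm & 7;
    unsigned len = 1;                    /* the ModR/M byte itself */

    if (mod == 3)
        return len;                      /* register form, nothing else */

    if (addr32) {
        if (rm == 4) {                   /* SIB byte follows */
            uint8_t sib = p[1];
            len += 1;
            if (mod == 0 && (sib & 7) == 5)
                len += 4;                /* no base register: disp32 */
        }
        if (mod == 0 && rm == 5)
            len += 4;                    /* disp32 only */
        else if (mod == 1)
            len += 1;                    /* disp8 */
        else if (mod == 2)
            len += 4;                    /* disp32 */
    } else {
        if (mod == 0 && rm == 6)
            len += 2;                    /* disp16 only */
        else if (mod == 1)
            len += 1;                    /* disp8 */
        else if (mod == 2)
            len += 2;                    /* disp16 */
    }
    return len;
}
```

For example, FNINIT (DB E3) has a register-form ModR/M, so only one byte follows the opcode, while FNSTSW with a 16-bit direct address (DD 3E lo hi) has three.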

I figured the old kernels shouldn't care, but they were still running a couple of FPU ops so it made me suspicious.

That's close to what I've been doing. When my CPU core is in no-FPU mode, it sets EM in CR0, moves past the ModRM byte, and does an #NM but I was doing it for every FPU op including FNINIT/FNSTSW.

Reply 21 of 31, by superfury

Posted on 2025-04-10, 17:08

superfury Offline

Rank: l33t++
Posts: 5818
Joined: 2014-03-08, 11:25
Location: Netherlands

The behaviour is actually pretty simple for a non-FPU case:

Basically, fetch the ModR/M byte, then, when executing the 'instruction', check for the fault cases. UniPCemu performs a #NM fault (exception 7, with the return EIP on the stack pointing to the faulting instruction, like #GP etc.) when:
- The CR0 EM bit is set and an ESC opcode is executed (ESC being a non-FWAIT FPU instruction, so the D8-DF opcode range, not MMX or the like).
- The CR0 TS bit is set and either the MP bit is set or the opcode is an ESC opcode.

If it's not faulting, just apply NOP timing: 1 instruction in IPS clocking mode (like DOSBox/UniPCemu uses), or in cycle-accurate mode 8 cycles (ModR/M pointing to memory) or 2 cycles (ModR/M pointing to a register).

The fault handling is just the fault with the return point being the FPU instruction. Don't handle anything extra there.
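
The two fault conditions can be written down as a small predicate. This is only a sketch of the rules as described in this post, not UniPCemu's actual code; the function name is made up, and the CR0 bit positions (MP = bit 1, EM = bit 2, TS = bit 3) are the architectural ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_MP (1u << 1)   /* monitor coprocessor */
#define CR0_EM (1u << 2)   /* emulate coprocessor */
#define CR0_TS (1u << 3)   /* task switched */

/* Returns true when executing 'opcode' (first opcode byte) should raise
   #NM (vector 7) on a CPU core without an FPU. Hypothetical helper. */
static bool fpu_opcode_raises_nm(uint32_t cr0, uint8_t opcode)
{
    bool esc = (opcode >= 0xD8 && opcode <= 0xDF);
    bool fwait = (opcode == 0x9B);

    if (!esc && !fwait)
        return false;                    /* not an FPU-related opcode */
    if (esc && (cr0 & CR0_EM))
        return true;                     /* EM set + ESC opcode */
    if ((cr0 & CR0_TS) && (esc || (cr0 & CR0_MP)))
        return true;                     /* TS set + (ESC or MP set) */
    return false;                        /* otherwise behave like a NOP */
}
```

Note that FWAIT only faults when both MP and TS are set, so EM alone never traps it.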

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 22 of 31, by UselessSoftware

Posted on 2025-04-16, 04:57

Any idea why a lot of late 90's Award BIOSes might be showing this? I don't know why it might throw a checksum error. I'm loading the full 128 KB BIOS at 0xE0000.

One of my machine memory layouts that is doing it:

	//FIC PT-2000
	{
		{ MACHINE_MEM_RAM, 0x00000, 0xA0000, MACHINE_ROM_ISNOTROM, NULL },
		{ MACHINE_MEM_ROM, 0xE0000, 0x20000, MACHINE_ROM_REQUIRED, "roms/machine/ficpt2000/PT2000_v1.01.BIN" },
		{ MACHINE_MEM_RAM, 0x100000, 0xF00000, MACHINE_ROM_ISNOTROM, NULL },
		{ MACHINE_MEM_ENDLIST, 0, 0, 0, NULL }
	},

Looks right to me.

Reply 23 of 31, by superfury

Posted on 2025-04-16, 06:52

A BIOS checksum is usually a series of ADD instructions in a loop that iterates over the entire (compressed, if applicable) BIOS ROM, followed by a CMP instruction against a checksum byte.
The BIOS ROM might also be shadowed into RAM, causing issues if not properly implemented (chipset-dependent).
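
As a rough illustration of that ADD loop, here is a minimal C sketch of an 8-bit additive checksum. It is an assumption that the Award BIOS in question uses exactly this scheme over exactly this range; the point is only that any byte the emulator returns differently from the ROM image (wrong mapping, shadowing, a flash chip stuck in status mode) changes the sum.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all bytes, truncated to 8 bits (what an ADD AL,[mem] loop yields). */
static uint8_t rom_sum8(const uint8_t *rom, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + rom[i]);
    return sum;
}

/* Compare against the expected checksum byte (hypothetical helper). */
static int rom_checksum_ok(const uint8_t *rom, size_t len, uint8_t expected)
{
    return rom_sum8(rom, len) == expected;
}
```

Running such a sum over what the emulated CPU actually reads at E0000-FFFFF and comparing it with the sum of the file on disk is a quick way to tell a mapping problem from a CPU bug.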

The i440fx/i430fx uses PCI for this, for example: it maps reads in the upper memory area to RAM and writes to PCI (essentially shadowing the uncompressed ROM) for the 64 KB memory block at F0000. This is possible down to C0000 even, in 16 KB chunks if I remember correctly. It's done through the 440fx PCI northbridge and PIIX/PIIX3 southbridge, configured using PCI accesses.

Reply 24 of 31, by superfury

Posted on 2025-04-16, 15:11

Just looked up your BIOS's motherboard:
https://theretroweb.com/motherboards/s/fic-pt-2000#chips

It looks like a plain i430fx northbridge and 82371FB (PIIX) southbridge.

That's the same chipset my emulator emulates on its i430fx motherboard (although with a different BIOS).

Look into the 82437FX/82438FX chipset's TSC, which is located at PCI B:D:F address 0:0:0. There, registers 59h-5Fh are the PAM registers; in this case 59h holds PAM0, whose bits [7:4] cover the F0000h memory area. Each nibble field directs its area as follows:
- bit 4 or 0 (depending on the area, as they're nibble fields): set=reads from RAM, cleared=reads from PCI (where the BIOS ROM might respond).
- bit 5 or 1: set=writes to RAM, cleared=writes to PCI (the BIOS ROM might respond if mapped there and not write-protected; that's configured in the PIIX PCI space, at B:D:F 0:1:0 byte 4Eh bit 2: set=writes to the BIOS flash are handled by the chip (BIOSCS# asserted), cleared=not asserted, so the chip ignores writes).
- bit 6 or 2: enables caching of reads.
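
Decoding one of those nibble fields could look like the following C sketch. The struct and function names are hypothetical, and the bit meanings are taken from the list above (read enable, write enable, cache enable in bits 0-2 of each nibble).

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool read_from_ram;   /* set: reads hit RAM, clear: reads go to PCI (ROM) */
    bool write_to_ram;    /* set: writes hit RAM, clear: writes go to PCI */
    bool cache_enable;    /* set: reads of this area may be cached */
} pam_attr;

/* Decode one nibble of a PAM register (59h-5Fh).
   high_nibble selects bits 7:4, otherwise bits 3:0. */
static pam_attr pam_decode(uint8_t pam_reg, bool high_nibble)
{
    uint8_t n = high_nibble ? (uint8_t)(pam_reg >> 4) : (uint8_t)(pam_reg & 0x0F);
    pam_attr a;
    a.read_from_ram = (n & 0x1) != 0;
    a.write_to_ram  = (n & 0x2) != 0;
    a.cache_enable  = (n & 0x4) != 0;
    return a;
}
```

A BIOS that is shadowing itself typically passes through the state "reads from PCI, writes to RAM" (value 2 in the nibble) while copying, then switches to "reads from RAM, writes to PCI" (value 1), so a memory handler that ignores these bits will either read stale RAM or lose the copy.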

The flash ROM that stores the BIOS might differ depending on the motherboard though. You can look at 86Box for more information on the known implementations (there are 4 of them apparently). Basically it amounts to writes setting commands or data for commands, and reads returning results. The write address selects the area the command applies to (the block to write to or clear, i.e. reset to FFh). The FFh command is simple: return to normal read mode. There's also an ID command that reports the chip's ID for flash ROM identification (ID byte selected by the A0 address line, from what I remember).

The BIOS ROM is basically copied to low memory, then decompressed and put into RAM at F0000-FFFFFh. Later, it might update the ESCD area in the ROM by flashing it: a command erases two blocks (at 1C000 and 1D000), followed by writing the data onto part of those blocks. The chip basically has four commands: read data (normal), erase block followed by confirm erase on the block, program followed by the 1 byte to flash, and an alternative program command (acts the same as normal program). During the commands (except read status, read ID and read data (default)), reads give the status of the flashing operation.
The exact process for all is in the chip's documentation. UniPCemu simply implements a 28F001BX-T chip.
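
A very reduced C sketch of such a command state machine is shown below. It is not UniPCemu's or 86Box's code. The command bytes (FFh, 90h, 70h, 50h, 20h/D0h, 40h/10h), the ID bytes (89h/94h) and the block layout are written from memory of the Intel 28F001BX-T command set and should be treated as assumptions to check against the datasheet; boot block locking and erase suspend are left out.

```c
#include <stdint.h>
#include <string.h>

#define FLASH_SIZE 0x20000u              /* 128 KB part */

typedef enum {
    MODE_READ_ARRAY, MODE_READ_ID, MODE_STATUS,
    MODE_ERASE_SETUP, MODE_PROGRAM_SETUP
} flash_mode;

typedef struct {
    uint8_t data[FLASH_SIZE];
    flash_mode mode;
    uint8_t status;                      /* bit 7 = ready */
} flash_chip;

/* Assumed BX-T layout: 112K main, two 4K parameter blocks, 8K boot block. */
static void flash_block(uint32_t addr, uint32_t *start, uint32_t *len)
{
    if (addr < 0x1C000u)      { *start = 0x00000u; *len = 0x1C000u; }
    else if (addr < 0x1D000u) { *start = 0x1C000u; *len = 0x1000u; }
    else if (addr < 0x1E000u) { *start = 0x1D000u; *len = 0x1000u; }
    else                      { *start = 0x1E000u; *len = 0x2000u; }
}

static void flash_write(flash_chip *f, uint32_t addr, uint8_t val)
{
    uint32_t start, len;
    addr &= FLASH_SIZE - 1;
    if (f->mode == MODE_ERASE_SETUP) {
        if (val == 0xD0) {               /* confirm: clear block to FFh */
            flash_block(addr, &start, &len);
            memset(&f->data[start], 0xFF, len);
            f->status = 0x80;
        } else {
            f->status = 0x80 | 0x30;     /* command sequence error */
        }
        f->mode = MODE_STATUS;
        return;
    }
    if (f->mode == MODE_PROGRAM_SETUP) {
        f->data[addr] &= val;            /* programming only clears bits */
        f->status = 0x80;
        f->mode = MODE_STATUS;
        return;
    }
    switch (val) {
    case 0xFF: f->mode = MODE_READ_ARRAY; break;
    case 0x90: f->mode = MODE_READ_ID; break;
    case 0x70: f->mode = MODE_STATUS; break;
    case 0x50: f->status = 0x80; break;  /* clear status register */
    case 0x20: f->mode = MODE_ERASE_SETUP; break;
    case 0x40:
    case 0x10: f->mode = MODE_PROGRAM_SETUP; break;
    default: break;
    }
}

static uint8_t flash_read(const flash_chip *f, uint32_t addr)
{
    addr &= FLASH_SIZE - 1;
    switch (f->mode) {
    case MODE_READ_ARRAY: return f->data[addr];
    case MODE_READ_ID:    return (addr & 1) ? 0x94 : 0x89;  /* A0 selects byte */
    default:              return f->status;
    }
}
```

The relevant detail for a checksum failure is the last function: until the BIOS writes FFh, reads return status or ID bytes rather than ROM contents, so a chip model that starts in the wrong mode corrupts every byte the checksum loop sees.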

Reply 25 of 31, by UselessSoftware

Posted on 2025-05-22, 15:27

Thanks again for the info. Sorry I haven't been back to this thread, life got busy.

I have managed to spend some time working on this more though. I'm now able to boot Debian 2.2 (Potato) but have various issues with every other distro I've tried. Still, it's nice progress.

The main issue was that I was letting an instruction that faults modify CPU state or memory. This broke things badly when Linux entered ring 3, because the kernel loves to use demand paging in user mode.
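
One simple way to get that property for registers is to snapshot the state before each instruction and restore it when the handler reports a fault, so the instruction can be restarted after the page-fault handler returns. The C sketch below is illustrative only: the struct, the handler signature and the demo handler are all made up, and memory writes need separate care (for example checking every page an access touches before writing anything).

```c
#include <stdint.h>

#define NO_FAULT (-1)

typedef struct {
    uint32_t gpr[8];      /* EAX..EDI, index 4 = ESP */
    uint32_t eip;
    uint32_t eflags;
} cpu_regs;

/* An instruction handler returns NO_FAULT or an exception vector. */
typedef int (*insn_handler)(cpu_regs *cpu);

/* Execute one instruction; on a fault, roll the registers back so that
   EIP and ESP still describe the state before the instruction. */
static int cpu_step(cpu_regs *cpu, insn_handler execute)
{
    cpu_regs snapshot = *cpu;
    int vector = execute(cpu);
    if (vector != NO_FAULT)
        *cpu = snapshot;
    return vector;
}

/* Demo handler: a PUSH that adjusts ESP and EIP, then hits a page fault
   (vector 14) on the stack write. */
static int demo_push_that_page_faults(cpu_regs *cpu)
{
    cpu->gpr[4] -= 4;
    cpu->eip += 1;
    return 14;
}
```

Without the rollback, the retried PUSH would decrement ESP a second time, which is exactly the kind of corruption demand paging exposes.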

I'm even able to use the emulated NE2000 from it and do network stuff. I've telnetted into it and browsed web pages served up from Apache in the emulator. There is still a weird issue where the rm command works and does its job, but gives a segmentation fault as it exits. Running gcc in it also doesn't work at all and just segfaults immediately.

Unfortunately every Windows version I've tried still doesn't work, but I think I'm getting close with NT 4. I get an INACCESSIBLE_BOOT_DEVICE after it does an IDENTIFY on my disks. It doesn't like something about my ATA implementation details. Linux is much more forgiving of being out of spec somewhere.

It's not the IDENTIFY response itself. I've cloned the response 86Box gives for the same image and reported it to NT, and it gave the same error when I did that. It's probably a status byte or interrupt timing thing.

Reply 26 of 31, by superfury

Posted on 2025-06-16, 09:04

I have the same kind of issue with 2000/XP. But NT 4.0 runs correctly.

I think the issue is in the CPU emulation somewhere.

Reply 27 of 31, by UselessSoftware

Posted on 2025-06-19, 03:36

It'll be tough to track down.

I figured out what the issue is with rm doing a segfault. It's because the FPU is very broken. If I boot Linux with "no387" so that it uses FPU emulation, it works fine.

I had hacked in the FPU from Halfix, but apparently I was in too much of a rush and didn't do it right. I need to either fix that, or roll my own FPU. I'd prefer to write my own, I was just borrowing that one so that I could put it off for a while.

I'm going to start another thread for further posts about the emulator with a better title.

Reply 28 of 31, by superfury

Posted on 2025-06-19, 10:48

FPU instructions shouldn't be triggering segmentation faults, unless perhaps the code isn't paged in. (That's a segmentation fault in the Linux sense: the CPU just triggers a #PF fault, which the OS translates into a 'segmentation' fault.)
When a CPU has no FPU, it will simply either:
- Throw a #NM fault (when FPU emulation is enabled in CR0, depending on what FPU instruction (ESC vs FWAIT) is executed).
- Do a simple NOP (when FPU emulation is disabled).

Neither should usually throw a segmentation fault, unless of course CS's segment descriptor causes it (out-of-bounds instruction address part) or a page fault occurs when fetching the FPU instruction.

Trying to execute rm didn't throw any faults in my emulator from what I remember. So if it's throwing faults without FPU instruction support, there's probably either an error in your exception handling (#NM) or another thing related to FPU instructions is somehow being executed when it shouldn't (indicating an error in your x86 instruction set or protected mode emulation).

I'm still trying to get the Windows 2000/XP kernel to work properly (CD-ROM setup boot). So far I've established the issue might be in the CD-ROM or IDE driver loading causing issues based on the PCI device (i440fx, but i430fx probably has the same problem). It does detect the PCI device for the hard drive/cd-rom controller, but fails to properly load the CD-ROM driver it seems. Somewhere when passing control to the kernel for reaching down into the driver, the kernel seems to not start up the driver for some reason, thus the dreaded 7B error occurs. I see it's enumerated the PCI device, but it doesn't load the driver for it for some weird reason, which should be happening in theory (according to the NT kernel's source code, which is leaked).
I did manage (with the right kernel options in the CD-ROM image's NTLDR ini) to hook a modern PC's WinDbg interface onto the OS running inside UniPCemu, which is why I know it reaches that point to begin with. I can single-step through the kernel with debug symbols loaded, so I at least roughly know where it is (in assembly however, not the original C source code): I basically just have the assembly and the function names from the debug symbols to know where it's executing at any point in time. And I can set breakpoints on different virtual addresses of course (from the point the debugger attaches onwards), which is reasonably early in the boot process.

Reply 29 of 31, by UselessSoftware

Posted on 2025-06-19, 14:39

Well I *have* an FPU and report it being there in CPUID, it's just broken. I tried to haphazardly plug in the one from Halfix. I need to re-do all of that or just write one from scratch.

I think what's happening is the effective address for mem-based FPU ops isn't getting reported to it properly when I execute an FPU instruction, so it tries to access a memory location it shouldn't, causing a segfault.

When I pass "no387" to the kernel so that it ignores the FPU and flips on the EM bit in CR0, then yes the CPU throws an #NM and the kernel's x87 emulator handles it. Everything works fine then.

Reply 30 of 31, by UselessSoftware

Posted on 2025-06-19, 21:25

Today I pulled the FPU code from Blink and adapted it. That seems to be working. I can play Duke 3D now and no longer need to pass no387 to Linux. No more segfaults there when using FPU.

It's also much smaller and faster than the FPU from Halfix, but just uses double precision math. Halfix FPU used the softfloat library for full 80-bit precision. This should be fine though.

Reply 31 of 31, by danoon

Posted on 2025-07-01, 16:53

danoon Offline

Rank: Member
Posts: 227
Joined: 2011-01-04, 19:12

I have gone back and forth on my FPU emulation. The only thing I can think of when going to a 64-bit precision FPU is to make sure this test works:

finit;
fild data;
fisttp result;

where data is a large 64-bit int, then verify result is the same number (not rounded to the 53-bit precision of a 64-bit double).

I have seen some old memcpy implementations use fpu registers to copy 64-bit ints.
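
The precision loss behind that test can be reproduced on the host without any emulator. The C sketch below assumes the emulated FPU stores register values in a host double; the function name is made up for illustration.

```c
#include <stdint.h>

/* Models FILD m64 followed by an integer store when the emulated
   register is a host double (53-bit significand). */
static int64_t roundtrip_via_double(int64_t value)
{
    volatile double st0 = (double)value;  /* volatile forces real double rounding */
    return (int64_t)st0;
}
```

Any value that needs more than 53 significant bits, such as 2^53 + 1, comes back changed, so a memcpy that moves 8-byte chunks through FILD/FISTP would silently corrupt data on such an FPU. Keeping the full 64-bit integer significand for these loads and stores avoids that.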

https://github.com/danoon2/Boxedwine
