VOGONS


First post, by superfury

User metadata
Rank l33t++

Can 32-bit x86 programs even explicitly use IP instead of EIP (which masks off the high 16 bits when loaded this way) during fetching (like SP vs ESP)?

Or is it always 32-bit on such platforms (16-bit loads simply set the top 16 bits to 0, and 32-bit increments always carry into the upper 16 bits)?

Also, do 16-bit ModR/M offsets overflow? Take [BX+SI] with BX being 9000h and SI being 7000h: will the CPU check the segment descriptor for address 10000h (with up to 3 bytes after that, for normal x86 instructions, depending on the operand size)? Which memory address is validated for faults: 10000h and up, or truncated to 0000h and up? I'd assume it's truncated to 16 bits in the 16-bit addressing case (a 16-bit address, including when using an address size override in 32-bit mode)?

UniPCemu simply always uses EIP (or IP on the 8086) for all operations, regardless of the D-bit (the D-bit only affects operand size), and chooses SP vs ESP based on the operand size (for pushes/pops of (E)SP itself) or on the B-bit (when the stack use is implicit; when the instruction uses ModR/M, that addressing is not affected by the SS B-bit or CS D-bit).
Only the B-bit is used for implicit stack operations (like POP DS and the like, regardless of prefixes), causing 16-bit SP wraparound if the stack is 16-bit (B=0). A minimal C sketch of that rule follows below (hypothetical names, not actual UniPCemu code).
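
/* Sketch only: implicit stack accesses use the SS descriptor-cache B bit to
   decide between SP and ESP, so a 16-bit stack wraps within 64K.
   Names are hypothetical, not taken from UniPCemu. */
#include <stdint.h>

typedef struct {
    uint32_t esp;
    int ss_big;       /* B bit of the cached SS descriptor: 0 = 16-bit stack */
} cpu_state;

/* Decrement the stack pointer for an implicit push (e.g. PUSH DS). */
static void stack_push_adjust(cpu_state *cpu, uint32_t bytes)
{
    if (cpu->ss_big) {
        cpu->esp -= bytes;                         /* full 32-bit ESP */
    } else {
        uint16_t sp = (uint16_t)cpu->esp;          /* only SP participates */
        sp -= (uint16_t)bytes;                     /* 0000h - 2 wraps to FFFEh */
        cpu->esp = (cpu->esp & 0xFFFF0000u) | sp;  /* upper half untouched */
    }
}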

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 1 of 3, by mkarcher

User metadata
Rank l33t
superfury wrote on 2024-04-07, 20:28:

Can 32-bit x86 programs even explicitly use IP instead of EIP (which masks off the high 16 bits when loaded this way) during fetching (like SP vs ESP)?

The decision whether IP or EIP is used as the offset into CS when fetching instructions is made by the "D/B" bit of the code segment descriptor. If you call a program a "32-bit x86 program" iff that bit is set, then a 32-bit x86 program cannot choose to use IP instead of EIP.

superfury wrote on 2024-04-07, 20:28:

Also, do 16-bit ModR/M offsets overflow? Take [BX+SI] with BX being 9000h and SI being 7000h: will the CPU check the segment descriptor for address 10000h (with up to 3 bytes after that, for normal x86 instructions, depending on the operand size)? Which memory address is validated for faults: 10000h and up, or truncated to 0000h and up? I'd assume it's truncated to 16 bits in the 16-bit addressing case (a 16-bit address, including when using an address size override in 32-bit mode)?

You can only write [BX+SI] if you are in 16-bit address mode (either in a 16-bit code segment without a prefix or in a 32-bit code segment with a 67h prefix). In 32-bit address mode you can only use something like [EBX+ESI]. In 16-bit address mode, any overflow is discarded, so [BX+SI] in your example will access offset 0 in any case. The limit check is performed after the address is calculated, so the value 0 is checked against the segment limit, not the value 10000h.

This also implies that, using an address size prefix in real mode, you can generate offsets above 64K, which will fail the segment limit check and cause INT 0Dh, which ends up in the IRQ5 handler on PC-compatible systems. That's the point of "flat mode" AKA "unreal mode", which initializes a segment limit of 4G using a protected mode segment descriptor and then switches back to real mode, keeping that limit. If the processor is initialized that way (violating Intel's specification on how to switch from protected mode to real mode), limit checks in real mode are performed against the 4G segment limit, and the complete address space can be used with 32-bit addressing prefixes.
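
As a rough C sketch of that ordering (hypothetical names, not meant as a definitive implementation): the 16-bit effective address wraps first, and only the wrapped offset is checked against the cached limit.

#include <stdint.h>

typedef struct {
    uint32_t base;
    uint32_t limit;   /* cached limit in bytes, e.g. 0xFFFF, or 0xFFFFFFFF in unreal mode */
} seg_cache;

/* [BX+SI] style addressing: the sum wraps within 16 bits. */
static uint32_t ea16(uint16_t bx, uint16_t si)
{
    return (uint16_t)(bx + si);   /* 9000h + 7000h -> 0000h, not 10000h */
}

/* The limit check happens after the offset is final. Returns 0 on a fault. */
static int check_limit(const seg_cache *seg, uint32_t offset, uint32_t len)
{
    uint64_t last = (uint64_t)offset + len - 1;   /* last byte touched */
    return last <= seg->limit;
}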

Reply 2 of 3, by superfury

User metadata
Rank l33t++
mkarcher wrote on 2024-04-07, 21:15:
superfury wrote on 2024-04-07, 20:28:

Can 32-bit x86 programs even explicitly use IP instead of EIP (which masks off the high 16 bits when loaded this way) during fetching (like SP vs ESP)?

The decision whether IP or EIP is used as the offset into CS when fetching instructions is made by the "D/B" bit of the code segment descriptor. If you call a program a "32-bit x86 program" iff that bit is set, then a 32-bit x86 program cannot choose to use IP instead of EIP.

superfury wrote on 2024-04-07, 20:28:

Also, do 16-bit ModR/M offsets overflow? Take [BX+SI] with BX being 9000h and SI being 7000h: will the CPU check the segment descriptor for address 10000h (with up to 3 bytes after that, for normal x86 instructions, depending on the operand size)? Which memory address is validated for faults: 10000h and up, or truncated to 0000h and up? I'd assume it's truncated to 16 bits in the 16-bit addressing case (a 16-bit address, including when using an address size override in 32-bit mode)?

You can only write [BX+SI] if you are in 16-bit address mode (either in a 16-bit code segment without a prefix or in a 32-bit code segment with a 67h prefix). In 32-bit address mode you can only use something like [EBX+ESI]. In 16-bit address mode, any overflow is discarded, so [BX+SI] in your example will access offset 0 in any case. The limit check is performed after the address is calculated, so the value 0 is checked against the segment limit, not the value 10000h. This also implies that, using an address size prefix in real mode, you can generate offsets above 64K, which will fail the segment limit check and cause INT 0Dh, which ends up in the IRQ5 handler on PC-compatible systems. That's the point of "flat mode" AKA "unreal mode", which initializes a segment limit of 4G using a protected mode segment descriptor and then switches back to real mode, keeping that limit. If the processor is initialized that way (violating Intel's specification on how to switch from protected mode to real mode), limit checks in real mode are performed against the 4G segment limit, and the complete address space can be used with 32-bit addressing prefixes.

What if a program loads CS with a descriptor that has D/B cleared in the descriptor cache? Does the CPU still use EIP (the only difference being that loads clear its upper 16 bits), or does it really use IP (only the low 16 bits, with 64KB wraparound included when placing, say, INC AX at offset FFFFh)? In that case, will it fault first at EIP=10000h (the start of the next instruction), or does it use the upper 16 bits of EIP at all (does it mask them to 0000h)?
Say I have a 16-bit MS-DOS program (thus a D-bit of 0 in the descriptor cache and a descriptor limit of FFFFFh (maximum allowed)) and perform a 32-bit JMP to 0:10000h (with an operand size prefix). Will the next fetch happen at 10000h (using EIP in all cases) or at 0h (using IP in all cases, including 16-bit wraparound)? And what if the upper 16 bits are non-zero during a code fetch with the D-bit cleared in the CS segment descriptor cache: are they used, or ignored (assumed 0 for all code fetching and protection checks)?

All documentation I can find only talks about the linear addresses generated (segment descriptor base + offset), but says nothing about the offset itself for 16-bit ModR/M addressing, or about IP vs EIP (whether a real 16-bit IP register even exists at all, given that the upper 16 bits of EIP are cleared on all loads (mask 0000FFFFh on EIP)).
UniPCemu currently always loads EIP as a 32-bit value, clearing the upper 16 bits in 16-bit operand mode. Fetching is always done using the full 32-bit EIP register (only on the 80(1)86 does it actually use a 16-bit IP register with 16-bit wraparound, except for the 0x10000 special case of operands at FFFFh).
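
A minimal C sketch of that model (hypothetical names, not actual UniPCemu code): loads mask EIP according to the operand size, while fetching always advances the full 32-bit register.

#include <stdint.h>

static uint32_t eip;

/* Control transfer: the operand size decides whether the target is masked. */
static void load_eip(uint32_t target, int operand_size_32)
{
    eip = operand_size_32 ? target : (target & 0xFFFFu);
}

/* Sequential fetch on a 386+: no 64K wraparound; FFFFh plus an instruction
   length becomes 10000h or more, which the segment-limit check may reject. */
static void advance_fetch(uint32_t instruction_length)
{
    eip += instruction_length;
}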

https://www.os2museum.com/wp/does-eip-wrap-ar … 6-bit-segments/ would seem to suggest that there's no 16-bit IP register on 80386+ CPUs: there's only a 32-bit EIP register that's loaded as I described. Since the instruction after INC EAX at CS:FFFF ends with EIP=10000h, a fault will happen if the limit of the descriptor is exceeded (returning in real mode causes EIP to be loaded again, which clears the upper 16 bits both on IRET and when entering the IVT routine), while protected mode code will do the same (either executing at CS:10000 or faulting at that point because it exceeds the limit in the segment descriptor cache).
So only the 8086, 80186 and 80286, which actually have a 16-bit IP, will wrap around to an instruction starting at CS:0 instead of CS:10000.
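
A small C sketch of that difference, assuming the above is correct (hypothetical names):

#include <stdint.h>

typedef enum { CPU_8086, CPU_80186, CPU_80286, CPU_80386_PLUS } cpu_model;

static uint32_t next_ip(cpu_model model, uint32_t ip, uint32_t len)
{
    if (model <= CPU_80286)
        return (uint16_t)(ip + len);   /* real 16-bit IP: FFFFh + 1 wraps to 0000h */
    return ip + len;                   /* 386+: becomes 10000h and may fault on the limit */
}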

So in effect, you cannot choose to use a real IP (and get real 64K offset wraparound) in any case, no matter how the D/B-bit is set in the segment descriptor cache. It's always a 16-bit IP up to the 80286, and always a 32-bit EIP on the 80386 and later processors.
The only 16-bit offsets that wrap around on an 80386 and newer are the 16-bit ModR/M addresses and the SP register as selected by the B-bit of the SS segment descriptor cache (which is documented as doing so, and is required for things like decrementing from 0h to FFFEh properly). The D-bit has no effect on any of those.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 3 of 3, by superfury

User metadata
Rank l33t++

Slightly related, since it applies to EIP or IP:

What is the return point when the handling of an interrupt throws a fault (for example #GP) during the INT instruction itself (opcode CDh)? UniPCemu currently treats all faults the same (resetting EIP to the start of the instruction), but is that also true for the INT instruction? Does that also apply to INTO etc.?
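
For clarity, a minimal C sketch of what UniPCemu-style "restart at the instruction start" behaviour means (hypothetical names, not actual UniPCemu code); whether this return point is also correct for faults raised during INT/INTO is exactly the question:

#include <stdint.h>

static uint32_t eip;         /* current instruction pointer */
static uint32_t fault_eip;   /* value to restore when a fault is taken */

static void begin_instruction(void)
{
    fault_eip = eip;         /* remember the start of the instruction */
}

static void handle_fault(void)
{
    eip = fault_eip;         /* the fault handler sees the faulting instruction's start */
    /* ...then deliver the exception through the IDT/IVT... */
}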

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io