VOGONS


First post, by GloriousCow

I've released a CPU test suite for 286 emulators, similar to my previous tests for the 8088, 8086, and V20.

https://github.com/SingleStepTests/80286

Again, these are hardware-generated tests, made by controlling a Harris 80C286 CPU via an Arduino microcontroller.
Currently, only real mode tests are present. The test suite covers 326 instruction forms, with nearly 1.5 million instruction executions and over 32 million cycle states captured.

This test suite is released in a chunked binary format I call MOO. A script is included to convert the files into the traditional JSON format if you'd prefer.
The binary format was designed to be easier for emulators written in languages like C to use, and example C code for reading MOO files is provided here:

https://github.com/SingleStepTests/8088/blob/ … ls/moo_parser.c

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 2 of 7, by xsaveopt

Very cool!

If you're planning on doing protected mode tests as well, I've found `((random() << random_range(0..32)) >> random_range(0..32)) ^ (random_bool() ? 0xffffffff : 0)` to be quite helpful in (quickly) generating test cases that cover all behavior. And for cmpxchg(8/16b) you'll also need some percentage of test cases to have random registers/memory set to equal values.
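A minimal Python sketch of that expression, for illustration only. The function name `shifted_random` and the `width` parameter are assumptions (the expression above is written for 32-bit values); the left shift is masked to emulate fixed-width wraparound.

```python
import random

def shifted_random(width=32, rng=random):
    """Random value biased toward runs of zero or one bits (hypothetical helper)."""
    mask = (1 << width) - 1
    value = rng.getrandbits(width)
    # Shift left by 0..width-1, discarding bits that fall off the top.
    value = (value << rng.randrange(width)) & mask
    # Shift right by an independently chosen 0..width-1.
    value >>= rng.randrange(width)
    # Half the time, invert every bit so runs of zeros become runs of ones.
    if rng.random() < 0.5:
        value ^= mask
    return value

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(8):
        print(f"{shifted_random(16, rng):016b}")
```

Because the two shift amounts are independent, many outputs have long runs of leading or trailing zeros (or ones, after the XOR), so small-magnitude positive and negative operands tend to appear far more often than with uniform sampling.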

Reply 3 of 7, by GloriousCow

xsaveopt wrote on 2025-07-26, 13:55:

Very cool!

Thanks!

xsaveopt wrote on 2025-07-26, 13:55:

If you're planning on doing protected mode tests as well, I've found `((random() << random_range(0..32)) >> random_range(0..32)) ^ (random_bool() ? 0xffffffff : 0)` to be quite helpful in (quickly) generating test cases that cover all behavior. And for cmpxchg(8/16b) you'll also need some percentage of test cases to have random registers/memory set to equal values.

I do inject all-zeros and all-ones at around 5% odds for registers, memory and immediate operands, but I like that shifting idea a lot too.
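As a rough illustration of that kind of injection (not the actual generator code), here is a small Python sketch. The name `with_extremes`, the even split between all-zeros and all-ones, and treating 5% as the combined odds are all assumptions.

```python
import random

def with_extremes(value, width=16, odds=0.05, rng=random):
    """Occasionally replace `value` with all-zeros or all-ones (hypothetical helper)."""
    mask = (1 << width) - 1
    if rng.random() < odds:
        # Assumed: zeros and ones are equally likely once injection triggers.
        return rng.choice((0, mask))
    return value & mask
```

The same wrapper could be applied to register, memory, and immediate operands after the base value has been drawn from whatever distribution the generator uses.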
The main thing I need to figure out for protected mode tests is how to heuristically produce descriptor table values that provide a reasonable percentage of success.

cmpxchg was introduced on the 486, was it not?

Reply 4 of 7, by xsaveopt

GloriousCow wrote on 2025-07-26, 15:45:

I do inject all-zeros and all-ones at around 5% odds for registers, memory and immediate operands, but I like that shifting idea a lot too.

There are some instructions where you're statistically just not going to be able to cover all behavior if you use only uniformly distributed random numbers. For example, for the two and three operand imul instructions you're virtually only going to see overflows.
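To put a rough number on this, here is a small Python sketch that estimates how often a signed 16-bit multiply of two uniformly random operands fails to fit in a signed 16-bit destination. The helper names are made up for illustration, and the estimate is by sampling rather than exhaustive enumeration.

```python
import random

def to_signed(value, width=16):
    """Reinterpret an unsigned `width`-bit value as two's complement."""
    sign_bit = 1 << (width - 1)
    return (value ^ sign_bit) - sign_bit

def imul_overflows(a, b, width=16):
    """True if the signed product does not fit in `width` bits."""
    product = to_signed(a, width) * to_signed(b, width)
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    return not (lo <= product <= hi)

def overflow_rate(samples=20000, width=16, seed=0):
    """Fraction of uniformly random operand pairs whose product overflows."""
    rng = random.Random(seed)
    hits = sum(
        imul_overflows(rng.getrandbits(width), rng.getrandbits(width), width)
        for _ in range(samples)
    )
    return hits / samples

if __name__ == "__main__":
    print(f"uniform operands overflow in {overflow_rate():.2%} of samples")
```

With uniform operands the product only fits when at least one operand is very small in magnitude, which is why the non-overflow case is so rarely sampled.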

GloriousCow wrote on 2025-07-26, 15:45:

The main thing I need to figure out for protected mode tests is how to heuristically produce descriptor table values that provide a reasonable percentage of success.

Maybe you could do this iteratively? From the segment registers you know which ~6 or so descriptors could be accessed by an instruction, so you could start by filling those with the least restricting descriptors possible. If that executes successfully, you can do a few mutations making the descriptors more restrictive until it eventually does give you a fault.
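A very simplified Python sketch of that loop, to make the idea concrete. Everything here is hypothetical: the `Descriptor` class models only a few fields (limit, present bit, DPL, writable bit), and `faults` is a stand-in callback for "run the instruction on the real CPU with this descriptor and report whether it faulted".

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Descriptor:
    # Simplified subset of a segment descriptor; defaults are the most permissive.
    limit: int = 0xFFFF
    present: bool = True
    dpl: int = 3
    writable: bool = True

def restrict(desc, rng):
    """Return a copy of `desc` that is equally or more restrictive in one field."""
    choice = rng.randrange(4)
    if choice == 0:
        return replace(desc, limit=rng.randrange(desc.limit + 1))
    if choice == 1:
        return replace(desc, present=False)
    if choice == 2:
        return replace(desc, dpl=rng.randrange(desc.dpl + 1))
    return replace(desc, writable=False)

def tighten_until_fault(faults, rng, max_steps=64):
    """Mutate from a permissive descriptor until `faults(desc)` reports a fault.

    Returns (last_passing, first_faulting); either may be None.
    """
    desc = Descriptor()
    if faults(desc):
        return None, desc  # even the permissive descriptor faults
    for _ in range(max_steps):
        candidate = restrict(desc, rng)
        if faults(candidate):
            return desc, candidate
        desc = candidate
    return desc, None  # never faulted within max_steps
```

Both returned descriptors are potentially useful as test cases: the last passing one exercises the success path with a fairly tight descriptor, and the first faulting one sits right at the boundary.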

GloriousCow wrote on 2025-07-26, 15:45:

cmpxchg was introduced on the 486, was it not?

My bad, you're right. I've mostly been looking at recent CPUs so I didn't realize that instruction wasn't available on the 286. I think it might also be relevant for the flags of the cmp/sub instructions and maybe the termination behavior of rep cmps/scas, but I'd have to double-check that.

Reply 5 of 7, by GloriousCow

xsaveopt wrote on 2025-07-26, 16:23:

There are some instructions where you're statistically just not going to be able to cover all behavior if you use only uniformly distributed random numbers. For example, for the two and three operand imul instructions you're virtually only going to see overflows.

I use a beta distribution instead of a uniform distribution, which helps a bit. I don't believe imul faults; idiv, of course, throws exception 0. I have the stats on this: 16-bit IDIV throws exception 0 in 3224 of the 5000 tests. Not ideal.

The worst performer is BOUND, which "fails" 4351/5000 times. The chance of one random 16-bit value falling between two other random 16-bit values is just not that great.
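As a back-of-the-envelope check, assuming uniformly distributed operands (the suite uses a beta distribution, so its numbers will differ): with three independent values, the index passes only when lower <= index <= upper, which is roughly one ordering out of six. The Python sketch below estimates that rate by sampling and shows one possible way to construct in-range cases deliberately. All function names are hypothetical.

```python
import random

def to_signed16(value):
    """Reinterpret an unsigned 16-bit value as two's complement."""
    return (value ^ 0x8000) - 0x8000

def bound_passes(index, lower, upper):
    """BOUND-style signed check: no exception when lower <= index <= upper."""
    return to_signed16(lower) <= to_signed16(index) <= to_signed16(upper)

def pass_rate(samples=20000, seed=0):
    """Fraction of uniformly random (index, lower, upper) triples that pass."""
    rng = random.Random(seed)
    passes = sum(
        bound_passes(rng.getrandbits(16), rng.getrandbits(16), rng.getrandbits(16))
        for _ in range(samples)
    )
    return passes / samples

def in_range_case(rng):
    """Build a triple that is guaranteed to pass: sort two bounds, pick inside."""
    lower, upper = sorted(to_signed16(rng.getrandbits(16)) for _ in range(2))
    index = rng.randint(lower, upper)  # inclusive on both ends
    return index & 0xFFFF, lower & 0xFFFF, upper & 0xFFFF

if __name__ == "__main__":
    print(f"uniform pass rate: {pass_rate():.1%}")
```

Mixing some fraction of `in_range_case` triples into the random ones would be one way to raise the share of non-faulting BOUND tests.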

xsaveopt wrote on 2025-07-26, 16:23:

Maybe you could do this iteratively? From the segment registers you know which ~6 or so descriptors could be accessed by an instruction, so you could start by filling those with the least restricting descriptors possible. If that executes successfully, you can do a few mutations making the descriptors more restrictive until it eventually does give you a fault.

One idea I had was to leverage my emulator once it has a protected-mode implementation: dump test candidates for various opcodes as the emulator runs real protected-mode software. These could then be run through the CPU to capture the actual cycle states.

Reply 6 of 7, by xsaveopt

GloriousCow wrote on 2025-07-26, 17:17:

I use a beta distribution instead of a uniform distribution, which helps a bit. I don't believe imul faults; idiv, of course, throws exception 0. I have the stats on this: 16-bit IDIV throws exception 0 in 3224 of the 5000 tests. Not ideal.

It doesn't fault, but it sets CF/OF based on whether the result overflowed or not. So you would end up with only outputs where CF/OF are set. Whether that's relevant depends on what you're aiming to do with the tests, I guess. For my use case it was, which is how I ended up with the bitshift.

Reply 7 of 7, by GloriousCow

xsaveopt wrote on 2025-07-26, 18:34:
GloriousCow wrote on 2025-07-26, 17:17:

I use a beta distribution instead of a uniform distribution, which helps a bit. I don't believe imul faults; idiv, of course, throws exception 0. I have the stats on this: 16-bit IDIV throws exception 0 in 3224 of the 5000 tests. Not ideal.

It doesn't fault, but it sets CF/OF based on whether the result overflowed or not. So you would end up with only outputs where CF/OF are set. Whether that's relevant depends on what you're aiming to do with the tests, I guess. For my use case it was, which is how I ended up with the bitshift.

I'll give the shift method a try.
I do track flags that are always-set or always-cleared in a test set and validate that they match documented values. For the 286 test suites, I do not see IMUL always setting OF, but it could be 99.9% I suppose.
Some flags don't get there, though: for example, the Z flag never gets set in the DEC tests because I don't inject 1.
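For readers wondering what that kind of tracking might look like, here is a minimal Python sketch (not the suite's actual tooling). It folds the final FLAGS word of every test in a set with AND and OR to find bits that never varied. The name `flag_summary` is an assumption.

```python
ZF = 1 << 6  # zero flag is bit 6 of FLAGS

def flag_summary(flag_words, width=16):
    """Return (always_set, always_clear) bit masks over a non-empty set of FLAGS words."""
    mask = (1 << width) - 1
    always_set = mask   # bits that were 1 in every word seen so far
    ever_set = 0        # bits that were 1 in at least one word
    for flags in flag_words:
        always_set &= flags
        ever_set |= flags
    always_clear = ~ever_set & mask
    return always_set, always_clear

if __name__ == "__main__":
    # Made-up FLAGS values purely for demonstration.
    always_set, always_clear = flag_summary([0x0002, 0x0803, 0x0046])
    print(f"always set: {always_set:#06x}, always clear: {always_clear:#06x}")
```

In this scheme, ZF showing up in `always_clear` for a DEC test set would point at exactly the gap described above: no input value of 1 was ever generated.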
