VOGONS


test386.asm CPU tester


Reply 101 of 178, by superfury

Rank: l33t++

So OF is only affected when maskcnt==1, and CF is always set to the last shifted bit? The exception would be numcnt==0, which leaves CF untouched, except with ROL/ROR, which still load CF even when the count is e.g. 8 or 16 and nothing is actually shifted because of the modulo (numcnt==0)?
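
The two counts in that question can be sketched as follows. This is a minimal illustration, assuming the 80386-style masking that the emulator code later in this thread applies; `rot_counts` and `effective_count` are hypothetical names, not UniPCemu identifiers.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from UniPCemu: splits a raw rotate count
   into the two values discussed in the post. */
typedef struct {
    uint8_t maskcnt; /* count after the 5-bit mask */
    uint8_t numcnt;  /* number of single-bit steps actually performed */
} rot_counts;

/* size is the operand width in bits (8, 16 or 32).
   through_carry is nonzero for RCL/RCR, which rotate through CF and
   therefore wrap at size+1 instead of size. */
static rot_counts effective_count(uint8_t cnt, unsigned size, int through_carry)
{
    rot_counts c;
    c.maskcnt = cnt & 0x1F;
    c.numcnt = (uint8_t)(c.maskcnt % (through_carry ? size + 1 : size));
    return c;
}
```

For an 8-bit ROL with a count of 8 this yields maskcnt==8 but numcnt==0: no bit moves, yet the masked count is nonzero, which is the case the question is about.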

tempCF ← LSB(SRC);
DEST ← (DEST / 2) + (CF * 2^SIZE);

Is SRC in the pseudocode the same as DEST? It would be strange to shift from another source, since only r/m is available as an operand (reg is used to select the instruction: rol/ror/rcl/rcr/shl/shr/sal(=shl)/sar)?
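
If SRC is read as DEST, one RCR step on a byte might look like the sketch below. `rcr8_result` and `rcr8_step` are hypothetical names for illustration only, and the incoming carry is placed in bit 7, which is how the emulator code later in this thread does it.

```c
#include <stdint.h>

typedef struct {
    uint8_t value; /* DEST after the step */
    uint8_t cf;    /* carry flag after the step (0 or 1) */
} rcr8_result;

/* One single-bit rotate-through-carry-right step on an 8-bit operand. */
static rcr8_result rcr8_step(uint8_t dest, uint8_t cf)
{
    rcr8_result r;
    uint8_t temp_cf = dest & 1u;                          /* tempCF <- LSB(DEST) */
    r.value = (uint8_t)((dest >> 1) | ((cf & 1u) << 7));  /* old CF enters at the top bit */
    r.cf = temp_cf;                                       /* CF <- tempCF */
    return r;
}
```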

The OF flag is defined only for the 1-bit rotates;

So 1-bit rotates mean both CL=1/9/17, IMM8=1/9/17 and the ,1 variant?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 102 of 178, by superfury


Adjusting the code based on the manual (except the src argument, which probably isn't correct?) results in the following log:

Attachment: porte9.log (2.57 MiB)
File comment: Port EE log in with the latest ROL/ROR/RCL/RCR improvements.

It still errors out, even though it fully matches the documentation? OF and CF are updated as the documentation describes (except that with ROL/ROR, CF is updated within the loop as well as after the loop).


Reply 103 of 178, by superfury


Just looked at IBMulator's code. It seems to do things like Bochs and DOSBox, and with the same results the carry flag should match, but it doesn't? Overflow is always affected when maskcnt!=0 (and also when maskcnt==8 (byte), 16 (byte/word) or 24 (word) for rol/ror)?

Does IBMulator's output match real hardware (which would mean the manuals are incorrect)?


Reply 104 of 178, by peterferrie

Rank: Oldbie

superfury wrote:

The OF flag is defined only for the 1-bit rotates;

So 1-bit rotates mean both CL=1/9/17, IMM8=1/9/17 and the ,1 variant?

No, that's absolutely untrue.
mov al,80
mov cl,2
rol al,cl
will set OF.
You take the xor of the top two bits before and after the rotate, and set OF accordingly.
The rotate count is irrelevant.

Reply 105 of 178, by superfury


So shifting anything (count!=0) will (re)set the overflow flag according to the old and new sign flags (old xor new sign flag)?

XOR of the TOP two bits before AND after? Don't you mean the XOR of the top bit (sign bit before xor sign bit after)?

Or do you mean: OF = (MSB(oldval) xor SMSB(oldval)) and (MSB(newval) xor SMSB(newval))?

Edit: Or do I, after each single-bit rotate, do OF = OF or (MSB(s) xor SMSB(s)), clearing OF before starting the rotation loop when numcnt!=0 (to prevent affecting OF when rotating 0 bits)?

Also, should the same method, with the accompanying logic (see documentation), be applied to all rotate instructions (rol, ror, rcl, rcr)?

Edit: I'd assume I handle the overflow flag in the way that's documented for 1-bit shifts, but instead of overwriting it on every step, I initialize a local state to 0 before starting the shift/rotate loop and OR (|=) each step's calculated overflow into it (just like the carry flag). After the loop, that ORed status (overflow occurred during any of the single shifts in the loop) is written to the overflow flag, but ONLY if the loop was executed (maskcnt!=0)?


Reply 106 of 178, by superfury


Just tried converting the logic for the single-bit shifts to multi-bit shifts:

8-bit shift/rotate:

byte op_grp2_8(byte cnt, byte varshift) {
//word d,
INLINEREGISTER word s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1b;
switch (thereg) {
case 0: //ROL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x80)>>7); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow |= (FLAG_CF^((s >> 7) & 1)); //Only when not using CL?
}
FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7F) | (FLAG_CF << 7);
overflow |= ((s >> 7) ^ ((s >> 6) & 1)); //Only when not using CL?
}
FLAGW_CF(s>>7); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x80)>>7); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow |= (FLAG_CF^((s >> 7) & 1)); //OF=MSB^CF, only when not using CL?
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow |= ( FLAG_CF^(s >> 7));
tempCF = FLAG_CF;
FLAGW_CF(s&1); //Save LSB!
			s = ((s >> 1)&0x7F) | (tempCF << 7);
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
if (s & 0x80) FLAGW_CF(1); else FLAGW_CF(0);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFF;
overflow |= (FLAG_CF^(s>>7));
}
if (numcnt) flag_szp8((uint8_t)(s&0xFF));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow |= FLAGW_OF(s>>7);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (numcnt) flag_szp8((uint8_t)(s & 0xFF));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp8((uint8_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp8((uint8_t)s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFF);
}

16-bit shift/rotate:

word op_grp2_16(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_32 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1;
switch (thereg) {
case 0: //ROL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x8000)>>15); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow |= (FLAG_CF^((s >> 15) & 1)); //Only when not using CL?
}
FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFF) | (FLAG_CF << 15);
overflow |= FLAGW_OF((s >> 15) ^ ((s >> 14) & 1)); //Only when not using CL?
}
FLAGW_CF(s>>15); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x8000)>>15); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow |= (FLAG_CF^((s >> 15) & 1)); //OF=MSB^CF, only when not using CL?
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
for (shift = 1; shift <= numcnt; shift++) {
overflow |= (FLAG_CF^(s >> 15));
tempCF = FLAG_CF;
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFF) | (tempCF << 15);
		}
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
if (s & 0x8000) FLAGW_CF(1); else FLAGW_CF(0);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFF;
overflow |= (FLAG_CF^(s>>15));
}
if (numcnt) flag_szp16((uint16_t)(s&0xFFFF));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow |= FLAGW_OF(s>>15);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (numcnt) flag_szp16((uint16_t)(s & 0xFFFF));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x8000;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp16(s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp16(s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFFF);
}

32-bit shift/rotate:

uint_32 op_grp2_32(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_64 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt,maskcnt,overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1d;
switch (thereg) {
case 0: //ROL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x80000000)>>31); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow |= (((s >> 31) & 1)^FLAG_CF);
}
FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 1: //ROR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
numcnt = maskcnt;
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFFFFFF) | (FLAG_CF << 31);
overflow |= FLAGW_OF((s >> 31) ^ ((s >> 30) & 1));
}
FLAGW_CF(s>>31); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 2: //RCL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x80000000)>>31); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow |= (((s >> 31) & 1)^FLAG_CF); //OF=MSB^CF
}
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 3: //RCR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow |= (((s >> 31)&1)^FLAG_CF);
tempCF = FLAG_CF;
			FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFFFFFF) | (tempCF << 31);
}
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 4: case 6: //SHL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
if (s & 0x80000000) FLAGW_CF(1); else FLAGW_CF(0);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFFFFF;
overflow |= (FLAG_CF^(s>>31));
}
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s&0xFFFFFFFF));
break;

case 5: //SHR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow |= FLAGW_OF(s>>31);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s & 0xFFFFFFFF));
break;

case 7: //SAR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80000000;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp32((uint32_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp32((uint32_t)s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles32(numcnt, varshift);
return (s & 0xFFFFFFFF);
}

Now it gives problems with overflow and carry flags in the following sections:
SHR1 B/W/D (overflow)
ROLi B/W/D (overflow)
ROLr B/W/D (carry)
ROR1 W/D (overflow)
RORi B/W/D (overflow)
RORr B/W/D (carry)
RCLi B/W/D (overflow)
RCRi B/W/D (overflow)
RCRr B/W/D (carry)

Can anyone shed some light on these problems? Why isn't the carry flag working properly (which is odd, because the results are correct)? What's going wrong with the overflow flag calculations?

Attachment: porte9.log (2.57 MiB)
File comment: Result of the test386.asm code.

Edit: Fixing the shift instructions was easy: replace the "|=" with "=" so that each step overwrites the resulting flag instead of only ever setting it (and never resetting it). That fixed the SHR1 instructions. All other instructions still give errors, though?
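
The difference between accumulating and overwriting shows up on a two-step SHR. The sketch below is illustrative only: `shr8_of` is a hypothetical function that follows the per-step rule used in the code above (OF takes the MSB of the value before each step), and its `accumulate` parameter switches between the two variants.

```c
#include <stdint.h>

/* Returns the OF value after shifting an 8-bit value right `count` times.
   accumulate != 0: OR every step's overflow together (the "|=" variant).
   accumulate == 0: keep only the last step's overflow (the "=" variant). */
static int shr8_of(uint8_t s, unsigned count, int accumulate)
{
    int overflow = 0;
    for (unsigned i = 0; i < count; i++) {
        int step = (s >> 7) & 1; /* MSB before this step */
        overflow = accumulate ? (overflow | step) : step;
        s >>= 1;
    }
    return overflow;
}
```

For 0x80 shifted by 2, the accumulating variant reports 1 (the first step saw the MSB set), while the overwriting variant reports 0, because the second step starts from 0x40. For a count of 1 the two variants agree.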


Reply 107 of 178, by superfury


Adjusting the flags to always be updated after each 1-bit shift (even the sub-shifts of a multi-bit CL or imm8 shift) gives the following errors:
ROLr B/W: carry flag
RORr B/W: carry flag
RCRr B/W: carry flag

Finally, progress! 😁

So finally, no overflow flag problems anymore :D The only (strange) problem left is the carry flag with ROLr/RORr/RCRr B/W? The results don't show any problems?
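
Stripped of the emulator macros, the per-step update described here can be sketched for an 8-bit ROL as below. This is a standalone illustration, not UniPCemu code: `rol8_result` and `rol8_steps` are hypothetical names, and the flag rule simply mirrors the listing that follows (CF takes the bit rotated out, OF is CF xor the new MSB, both recomputed on every step).

```c
#include <stdint.h>

typedef struct {
    uint8_t value; /* operand after all steps */
    uint8_t cf;    /* carry flag after the last step */
    uint8_t of;    /* overflow flag after the last step */
} rol8_result;

/* Rotates an 8-bit value left `count` times, recomputing CF and OF after
   every single-bit step. For count == 0 the returned flags are just the
   zero placeholders and should be ignored by the caller. */
static rol8_result rol8_steps(uint8_t s, unsigned count)
{
    rol8_result r = { s, 0, 0 };
    for (unsigned i = 0; i < count; i++) {
        r.cf = (r.value >> 7) & 1u;                 /* bit about to leave the top */
        r.value = (uint8_t)((r.value << 1) | r.cf); /* it re-enters at bit 0 */
        r.of = r.cf ^ ((r.value >> 7) & 1u);        /* overwritten, not ORed */
    }
    return r;
}
```

Under this rule, rotating 0x80 left by 2 ends with value 0x02, CF 0 and OF 0, while rotating it by 1 ends with value 0x01, CF 1 and OF 1.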

8-bit:

byte op_grp2_8(byte cnt, byte varshift) {
//word d,
INLINEREGISTER word s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1b;
switch (thereg) {
case 0: //ROL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x80U)>>7); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (FLAG_CF^((s >> 7) & 1)); //Only when not using CL?
}
//FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FU) | (FLAG_CF << 7);
overflow = ((s >> 7) ^ ((s >> 6) & 1)); //Only when not using CL?
}
//FLAGW_CF(s>>7); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x80U)>>7); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (FLAG_CF^((s >> 7) & 1)); //OF=MSB^CF, only when not using CL?
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (FLAG_CF^(s >> 7));
tempCF = FLAG_CF;
FLAGW_CF(s&1); //Save LSB!
			s = ((s >> 1)&0x7FU) | (tempCF << 7);
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>7);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFU;
overflow = (FLAG_CF^(s>>7));
}
if (numcnt) flag_szp8((uint8_t)(s&0xFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>7);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (numcnt) flag_szp8((uint8_t)(s & 0xFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp8((uint8_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp8((uint8_t)s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFU);
}

16-bit:

word op_grp2_16(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_32 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1;
switch (thereg) {
case 0: //ROL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x8000U)>>15); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (FLAG_CF^((s >> 15) & 1)); //Only when not using CL?
}
//FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFFU) | (FLAG_CF << 15);
overflow = ((s >> 15) ^ ((s >> 14) & 1)); //Only when not using CL?
}
//FLAGW_CF(s>>15); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x8000U)>>15); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (FLAG_CF^((s >> 15) & 1)); //OF=MSB^CF, only when not using CL?
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
for (shift = 1; shift <= numcnt; shift++) {
overflow = (FLAG_CF^(s >> 15));
tempCF = FLAG_CF;
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFFU) | (tempCF << 15);
		}
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>15);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFU;
overflow = (FLAG_CF^(s>>15));
}
if (numcnt) flag_szp16((uint16_t)(s&0xFFFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>15);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (numcnt) flag_szp16((uint16_t)(s & 0xFFFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x8000U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp16(s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp16(s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFFFU);
}

32-bit(working):

uint_32 op_grp2_32(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_64 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt,maskcnt,overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1d;
switch (thereg) {
case 0: //ROL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x80000000U)>>31); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 31) & 1)^FLAG_CF);
}
//FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 1: //ROR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFFFFFFU) | (FLAG_CF << 31);
overflow = ((s >> 31) ^ ((s >> 30) & 1));
}
//FLAGW_CF(s>>31); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 2: //RCL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x80000000U)>>31); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 31) & 1)^FLAG_CF); //OF=MSB^CF
}
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 3: //RCR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow = (((s >> 31)&1)^FLAG_CF);
tempCF = FLAG_CF;
FLAGW_CF(s&1); //Save LSB!
			s = ((s >> 1)&0x7FFFFFFFU) | (tempCF << 31);
}
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 4: case 6: //SHL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>31);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFFFFFU;
overflow = (FLAG_CF^(s>>31));
}
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s&0xFFFFFFFFU));
break;

case 5: //SHR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>31);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s & 0xFFFFFFFFU));
break;

case 7: //SAR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80000000U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp32((uint32_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp32((uint32_t)s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles32(numcnt, varshift);
return (s & 0xFFFFFFFFU);
}

Can anyone see what's going wrong?

Attachment: porte9.log (2.57 MiB)
File comment: POST EE output


Reply 108 of 178, by superfury


I've adjusted the formulas in the 8-bit and 16-bit variants to match their (correct) 32-bit counterpart, but the carry flag still fails? All other flags give no errors.

The carry flag fails on ROL B/W, ROR B/W and RCR B/W only?
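
One thing that may be worth checking, given that only the byte and word forms fail: with an 8-bit or 16-bit ROL/ROR, a masked count that is a nonzero multiple of the operand size leaves numcnt at 0, so the loop never runs and CF is never written. A 32-bit operand cannot reach that state, because the 5-bit mask already limits the count to 31. The sketch below is an assumption, not a confirmed fix: `rol8_cf_reload` is a hypothetical function that reloads CF from the result whenever the masked count is nonzero, which is how reply 101 reads the manual. It would not account for the RCR failures.

```c
#include <stdint.h>

/* 8-bit ROL; *cf is the carry flag (0 or 1), updated in place. */
static uint8_t rol8_cf_reload(uint8_t s, uint8_t cnt, uint8_t *cf)
{
    uint8_t maskcnt = cnt & 0x1F; /* 5-bit count mask */
    uint8_t numcnt = maskcnt & 7; /* operand size wrap for a byte */
    for (uint8_t i = 0; i < numcnt; i++) {
        s = (uint8_t)((s << 1) | (s >> 7)); /* rotate left by one bit */
    }
    /* Assumption: CF is reloaded from the LSB of the result whenever the
       masked count is nonzero, even if no bit actually moved. */
    if (maskcnt) *cf = s & 1u;
    return s;
}
```

With a count of 8 the value comes back unchanged but CF is still loaded from bit 0, whereas a count of 0 leaves CF alone.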

Attachment: porte9.log (2.57 MiB)
File comment: Results with adjusted overflow flags. Carry flag fails mysteriously?

8-bit:

byte op_grp2_8(byte cnt, byte varshift) {
//word d,
INLINEREGISTER word s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1b;
switch (thereg) {
case 0: //ROL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x80U)>>7); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 7) & 1)^FLAG_CF); //Only when not using CL?
}
//FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FU) | (FLAG_CF << 7);
overflow = ((s >> 7) ^ ((s >> 6) & 1)); //Only when not using CL?
}
//FLAGW_CF(s>>7); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x80U)>>7); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 7) & 1)^FLAG_CF); //OF=MSB^CF, only when not using CL?
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (((s >> 7)&1)^FLAG_CF);
tempCF = FLAG_CF;
FLAGW_CF(s&1); //Save LSB!
			s = ((s >> 1)&0x7FU) | (tempCF << 7);
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>7);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFU;
overflow = (FLAG_CF^(s>>7));
}
if (numcnt) flag_szp8((uint8_t)(s&0xFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>7);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (numcnt) flag_szp8((uint8_t)(s & 0xFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp8((uint8_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp8((uint8_t)s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFU);
}

16-bit:

word op_grp2_16(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_32 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1;
switch (thereg) {
case 0: //ROL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x8000U)>>15); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 15) & 1)^FLAG_CF); //Only when not using CL?
}
//FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFFU) | (FLAG_CF << 15);
overflow = ((s >> 15) ^ ((s >> 14) & 1)); //Only when not using CL?
}
//FLAGW_CF(s>>15); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x8000U)>>15); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 15) & 1)^FLAG_CF); //OF=MSB^CF, only when not using CL?
}
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
for (shift = 1; shift <= numcnt; shift++) {
overflow = ((s >> 15)^FLAG_CF);
tempCF = FLAG_CF;
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFFU) | (tempCF << 15);
		}
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>15);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFU;
overflow = (FLAG_CF^(s>>15));
}
if (numcnt) flag_szp16((uint16_t)(s&0xFFFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>15);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (numcnt) flag_szp16((uint16_t)(s & 0xFFFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x8000U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp16(s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp16(s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFFFU);
}

32-bit(working without problems):

uint_32 op_grp2_32(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_64 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt,maskcnt,overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1d;
switch (thereg) {
case 0: //ROL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF((s&0x80000000U)>>31); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 31) & 1)^FLAG_CF);
}
//FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 1: //ROR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s&1); //Save LSB!
s = ((s >> 1)&0x7FFFFFFFU) | (FLAG_CF << 31);
overflow = ((s >> 31) ^ ((s >> 30) & 1));
}
//FLAGW_CF(s>>31); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 2: //RCL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF((s&0x80000000U)>>31); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 31) & 1)^FLAG_CF); //OF=MSB^CF
}
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 3: //RCR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = 0; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow = (((s >> 31)&1)^FLAG_CF);
tempCF = FLAG_CF;
FLAGW_CF(s&1); //Save LSB!
			s = ((s >> 1)&0x7FFFFFFFU) | (tempCF << 31);
}
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 4: case 6: //SHL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>31);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFFFFFU;
overflow = (FLAG_CF^(s>>31));
}
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s&0xFFFFFFFFU));
break;

case 5: //SHR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = 0;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>31);
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s & 0xFFFFFFFFU));
break;

case 7: //SAR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80000000U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s & 1);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp32((uint32_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp32((uint32_t)s); //Affect sign as well!
if (EMULATED_CPU<=CPU_NECV30) //Valid to update OF?
{
FLAGW_OF(0); //Cleared with count as well?
}
}
break;
}
op_grp2_cycles32(numcnt, varshift);
return (s & 0xFFFFFFFFU);
}

Can anyone see what's going wrong? Why is the carry flag failing with those instructions?


Reply 109 of 178, by peterferrie

User metadata
Rank Oldbie
Rank
Oldbie

Your 32-bit RCL/RCR have a redundant line with the numcnt assignment since maskcnt is already masked.
They also should be %33 not &0x1F.
Overflow should not be cleared if count==0.
For SAR, overflow flag is cleared for all counts > 0.
Carry is a copy of the last bit rotated, regardless of the count, so:
mov al,1
rol al,8
will have carry set.
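
The carry rule above can be sketched in C. This is only an illustration for a 386-class CPU, following the published ROL pseudocode; rol_sketch and its parameters are hypothetical names, not UniPCemu code. The 5-bit masked count decides whether CF is written, while the count modulo the operand size decides how far the bits move.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not UniPCemu code: ROL on an 8-, 16- or 32-bit value.
   *cf is the carry flag, read and updated in place. */
static uint32_t rol_sketch(uint32_t value, unsigned size, uint8_t count, uint8_t *cf)
{
    assert(size == 8 || size == 16 || size == 32);
    assert(cf);
    uint32_t mask = (size == 32) ? 0xFFFFFFFFu : ((1u << size) - 1u);
    unsigned masked = count & 0x1Fu; /* 5-bit count mask (assumes a 386-class CPU) */
    unsigned rot = masked % size;    /* number of positions the bits actually move */
    value &= mask;
    if (rot) {
        value = ((value << rot) | (value >> (size - rot))) & mask;
    }
    if (masked) {
        /* The last bit rotated out of the MSB lands in bit 0, so CF = LSB.
           This also applies when rot == 0 but masked != 0 (e.g. rol al,8). */
        *cf = (uint8_t)(value & 1u);
    }
    return value;
}
```

With value 1, size 8 and count 8, the value is unchanged and CF ends up set, which is the rol al,8 case from the post. With count 0, CF is left alone.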

Reply 110 of 178, by superfury

User metadata
Rank l33t++
Rank
l33t++

The documentation says to mask with 0x1F, even for 32-bit RCL/RCR?

rol al,8 reduces the count modulo 8, which gives 0, so nothing is rotated and the carry flag isn't modified? Or is it always set to bit 0 when the count is 8/16?

With count(s), do you mean cnt, maskcnt or numcnt in those statements?
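
To keep the two counts apart, here is a minimal sketch of how the number of rotate steps follows from the raw count, assuming the count is masked to 5 bits first, as the manual describes for the 386. rotate_steps is a hypothetical helper, not a function from the emulator. If the mask is applied first, a modulo 33 on the 32-bit count changes nothing, because the masked value is already 0..31.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of UniPCemu: number of single-bit steps a
   rotate performs. through_carry is nonzero for RCL/RCR, zero for ROL/ROR. */
static unsigned rotate_steps(int through_carry, unsigned size, uint8_t count)
{
    assert(size == 8 || size == 16 || size == 32);
    unsigned masked = count & 0x1Fu;          /* what the code above calls maskcnt */
    if (!through_carry) return masked % size; /* ROL/ROR wrap at the operand size */
    if (size == 32) return masked;            /* 0..31 is already below 33 */
    return masked % (size + 1u);              /* RCL/RCR: 9 or 17 positions, CF included */
}
```

For example, a raw count of 0x20 on a 32-bit RCL gives 0 steps under this assumption, while a count of 17 on an 8-bit RCL gives 8.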


Reply 111 of 178, by superfury

User metadata
Rank l33t++
Rank
l33t++

I've modified the code to wrap around 33 bits, but after some bugfixes, it seems that the RCL r/m32 instruction doesn't shift enough sometimes?
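
For comparison, a 32-bit RCL can be written as a single rotate of a 33-bit quantity (CF above bit 31). This is a sketch under the assumption that the count is masked with 0x1F before use, so a raw count of 0x20 rotates nothing; rcl32_sketch is a hypothetical name and not the emulator's function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not UniPCemu code: RCL r/m32 as a 33-bit rotate.
   *cf is the carry flag, read and updated in place. */
static uint32_t rcl32_sketch(uint32_t value, uint8_t count, uint8_t *cf)
{
    assert(cf);
    unsigned n = count & 0x1Fu;                            /* 0..31 after masking */
    uint64_t wide = ((uint64_t)(*cf & 1u) << 32) | value;  /* CF sits above bit 31 */
    if (n) {
        /* Rotate left within 33 bits: bits leaving the top re-enter at the bottom. */
        wide = ((wide << n) | (wide >> (33u - n))) & 0x1FFFFFFFFull;
    }
    *cf = (uint8_t)(wide >> 32);                           /* new CF is bit 32 */
    return (uint32_t)wide;
}
```

Starting with CF set, a value of 0 and a count of 0x1F gives 40000000 with CF clear, and a count of 0x20 leaves both the value and CF untouched.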

8-bits:

byte op_grp2_8(byte cnt, byte varshift) {
//word d,
INLINEREGISTER word s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1b;
switch (thereg) {
case 0: //ROL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>7); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 7) & 1)^FLAG_CF); //Only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FU) | (FLAG_CF << 7);
overflow = ((s >> 7) ^ ((s >> 6) & 1)); //Only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>7); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF(s>>7); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 7) & 1)^FLAG_CF); //OF=MSB^CF, only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (((s >> 7)&1)^FLAG_CF);
tempCF = FLAG_CF;
			FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FU) | (tempCF << 7);
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>7);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFU;
overflow = (FLAG_CF^(s>>7));
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>7); //Always sets CF, according to various sources?
if (numcnt) flag_szp8((uint8_t)(s&0xFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>7);
FLAGW_CF(s);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (numcnt) flag_szp8((uint8_t)(s & 0xFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp8((uint8_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp8((uint8_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared with count as well?
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFU);
}

16-bits:

word op_grp2_16(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_32 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1;
switch (thereg) {
case 0: //ROL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>15); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 15) & 1)^FLAG_CF); //Only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FFFU) | (FLAG_CF << 15);
overflow = ((s >> 15) ^ ((s >> 14) & 1)); //Only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>15); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF(s>>15); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 15) & 1)^FLAG_CF); //OF=MSB^CF, only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow = ((s >> 15)^FLAG_CF);
tempCF = FLAG_CF;
			FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FFFU) | (tempCF << 15);
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>15);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFU;
overflow = (FLAG_CF^(s>>15));
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>15); //Always sets CF, according to various sources?
if (numcnt) flag_szp16((uint16_t)(s&0xFFFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>15);
FLAGW_CF(s);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (numcnt) flag_szp16((uint16_t)(s & 0xFFFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x8000U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp16(s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp16(s); //Affect sign as well!
FLAGW_OF(0); //Cleared with count as well?
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFFFU);
}

32-bits:

uint_32 op_grp2_32(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_64 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt,maskcnt,overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1d;
switch (thereg) {
case 0: //ROL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>31); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 31) & 1)^FLAG_CF);
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 1: //ROR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FFFFFFFU) | (FLAG_CF << 31);
overflow = ((s >> 31) ^ ((s >> 30) & 1));
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>31); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 2: //RCL r/m32
if (EMULATED_CPU >= CPU_80386) maskcnt %= 33; //Wrap around 33 bits (operand plus CF)!
numcnt = maskcnt;
//if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF(s>>31); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 31) & 1)^FLAG_CF); //OF=MSB^CF
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 3: //RCR r/m32
if (EMULATED_CPU >= CPU_80386) maskcnt %= 33; //Wrap around 33 bits (operand plus CF)!
numcnt = maskcnt;
//if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow = (((s >> 31)&1)^FLAG_CF);
tempCF = FLAG_CF;
			FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FFFFFFFU) | (tempCF << 31);
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 4: case 6: //SHL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>31);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFFFFFU;
overflow = (FLAG_CF^(s>>31));
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>31); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s&0xFFFFFFFFU));
break;

case 5: //SHR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>31);
FLAGW_CF(s);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s & 0xFFFFFFFFU));
break;

case 7: //SAR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80000000U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp32((uint32_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp32((uint32_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared with count as well?
}
break;
}
op_grp2_cycles32(numcnt, varshift);
return (s & 0xFFFFFFFFU);
}

This results in the following diff:

+++ porte9.log	2017-11-14 14:19:04.844247200 +0100
@@ -33912,8 +33912,8 @@
RCLr D EAX=00000000 EDX=00000001 PS=0045 EAX=00000001 EDX=00000001 PS=0044
RCLr D EAX=00000000 EDX=00000002 PS=0044 EAX=00000002 EDX=00000002 PS=0044
RCLr D EAX=00000000 EDX=0000001F PS=0044 EAX=40000000 EDX=0000001F PS=0044
-RCLr D EAX=00000000 EDX=00000020 PS=0044 EAX=00000000 EDX=00000020 PS=0045
-RCLr D EAX=00000001 EDX=00000000 PS=0045 EAX=00000001 EDX=00000000 PS=0045
+RCLr D EAX=00000000 EDX=00000020 PS=0044 EAX=80000000 EDX=00000020 PS=0044
+RCLr D EAX=00000001 EDX=00000000 PS=0044 EAX=00000001 EDX=00000000 PS=0045
RCLr D EAX=00000001 EDX=00000001 PS=0045 EAX=00000003 EDX=00000001 PS=0044
RCLr D EAX=00000001 EDX=00000002 PS=0044 EAX=00000006 EDX=00000002 PS=0044
RCLr D EAX=00000001 EDX=00000008 PS=0044 EAX=00000180 EDX=00000008 PS=0044
@@ -33925,7 +33925,7 @@
RCLr D EAX=00000001 EDX=00000001 PS=0045 EAX=00000003 EDX=00000001 PS=0044
RCLr D EAX=00000001 EDX=00000002 PS=0044 EAX=00000006 EDX=00000002 PS=0044
RCLr D EAX=00000001 EDX=0000001F PS=0044 EAX=C0000000 EDX=0000001F PS=0044
-RCLr D EAX=00000001 EDX=00000020 PS=0044 EAX=00000001 EDX=00000020 PS=0045
+RCLr D EAX=00000001 EDX=00000020 PS=0044 EAX=80000000 EDX=00000020 PS=0045
RCLr D EAX=00000002 EDX=00000000 PS=0045 EAX=00000002 EDX=00000000 PS=0045
RCLr D EAX=00000002 EDX=00000001 PS=0045 EAX=00000005 EDX=00000001 PS=0044
RCLr D EAX=00000002 EDX=00000002 PS=0044 EAX=0000000A EDX=00000002 PS=0044
@@ -33938,8 +33938,8 @@
RCLr D EAX=00000002 EDX=00000001 PS=0045 EAX=00000005 EDX=00000001 PS=0044
RCLr D EAX=00000002 EDX=00000002 PS=0044 EAX=0000000A EDX=00000002 PS=0044
RCLr D EAX=00000002 EDX=0000001F PS=0044 EAX=40000000 EDX=0000001F PS=0045
-RCLr D EAX=00000002 EDX=00000020 PS=0045 EAX=00000002 EDX=00000020 PS=0045
-RCLr D EAX=0000007E EDX=00000000 PS=0045 EAX=0000007E EDX=00000000 PS=0045
+RCLr D EAX=00000002 EDX=00000020 PS=0045 EAX=80000001 EDX=00000020 PS=0044
+RCLr D EAX=0000007E EDX=00000000 PS=0044 EAX=0000007E EDX=00000000 PS=0045
RCLr D EAX=0000007E EDX=00000001 PS=0045 EAX=000000FD EDX=00000001 PS=0044
RCLr D EAX=0000007E EDX=00000002 PS=0044 EAX=000001FA EDX=00000002 PS=0044
RCLr D EAX=0000007E EDX=00000008 PS=0044 EAX=00007E80 EDX=00000008 PS=0044
@@ -33951,8 +33951,8 @@
RCLr D EAX=0000007E EDX=00000001 PS=0045 EAX=000000FD EDX=00000001 PS=0044
RCLr D EAX=0000007E EDX=00000002 PS=0044 EAX=000001FA EDX=00000002 PS=0044
RCLr D EAX=0000007E EDX=0000001F PS=0044 EAX=4000001F EDX=0000001F PS=0045
-RCLr D EAX=0000007E EDX=00000020 PS=0045 EAX=0000007E EDX=00000020 PS=0045
-RCLr D EAX=0000007F EDX=00000000 PS=0045 EAX=0000007F EDX=00000000 PS=0045
+RCLr D EAX=0000007E EDX=00000020 PS=0045 EAX=8000003F EDX=00000020 PS=0044
+RCLr D EAX=0000007F EDX=00000000 PS=0044 EAX=0000007F EDX=00000000 PS=0045
RCLr D EAX=0000007F EDX=00000001 PS=0045 EAX=000000FF EDX=00000001 PS=0044
RCLr D EAX=0000007F EDX=00000002 PS=0044 EAX=000001FE EDX=00000002 PS=0044
RCLr D EAX=0000007F EDX=00000008 PS=0044 EAX=00007F80 EDX=00000008 PS=0044
@@ -33964,7 +33964,7 @@
RCLr D EAX=0000007F EDX=00000001 PS=0045 EAX=000000FF EDX=00000001 PS=0044
RCLr D EAX=0000007F EDX=00000002 PS=0044 EAX=000001FE EDX=00000002 PS=0044
RCLr D EAX=0000007F EDX=0000001F PS=0044 EAX=C000001F EDX=0000001F PS=0045
-RCLr D EAX=0000007F EDX=00000020 PS=0045 EAX=0000007F EDX=00000020 PS=0045
+RCLr D EAX=0000007F EDX=00000020 PS=0045 EAX=8000003F EDX=00000020 PS=0045
RCLr D EAX=00000080 EDX=00000000 PS=0045 EAX=00000080 EDX=00000000 PS=0045
RCLr D EAX=00000080 EDX=00000001 PS=0045 EAX=00000101 EDX=00000001 PS=0044
RCLr D EAX=00000080 EDX=00000002 PS=0044 EAX=00000202 EDX=00000002 PS=0044
@@ -33977,8 +33977,8 @@
RCLr D EAX=00000080 EDX=00000001 PS=0045 EAX=00000101 EDX=00000001 PS=0044
RCLr D EAX=00000080 EDX=00000002 PS=0044 EAX=00000202 EDX=00000002 PS=0044
RCLr D EAX=00000080 EDX=0000001F PS=0044 EAX=40000020 EDX=0000001F PS=0044
-RCLr D EAX=00000080 EDX=00000020 PS=0044 EAX=00000080 EDX=00000020 PS=0045
-RCLr D EAX=00000081 EDX=00000000 PS=0045 EAX=00000081 EDX=00000000 PS=0045
+RCLr D EAX=00000080 EDX=00000020 PS=0044 EAX=80000040 EDX=00000020 PS=0044
+RCLr D EAX=00000081 EDX=00000000 PS=0044 EAX=00000081 EDX=00000000 PS=0045
 RCLr D EAX=00000081 EDX=00000001 PS=0045 EAX=00000103 EDX=00000001 PS=0044 
RCLr D EAX=00000081 EDX=00000002 PS=0044 EAX=00000206 EDX=00000002 PS=0044
RCLr D EAX=00000081 EDX=00000008 PS=0044 EAX=00008180 EDX=00000008 PS=0044
@@ -33990,7 +33990,7 @@
RCLr D EAX=00000081 EDX=00000001 PS=0045 EAX=00000103 EDX=00000001 PS=0044
RCLr D EAX=00000081 EDX=00000002 PS=0044 EAX=00000206 EDX=00000002 PS=0044
RCLr D EAX=00000081 EDX=0000001F PS=0044 EAX=C0000020 EDX=0000001F PS=0044
-RCLr D EAX=00000081 EDX=00000020 PS=0044 EAX=00000081 EDX=00000020 PS=0045
+RCLr D EAX=00000081 EDX=00000020 PS=0044 EAX=80000040 EDX=00000020 PS=0045
RCLr D EAX=000000FE EDX=00000000 PS=0045 EAX=000000FE EDX=00000000 PS=0045
RCLr D EAX=000000FE EDX=00000001 PS=0045 EAX=000001FD EDX=00000001 PS=0044
RCLr D EAX=000000FE EDX=00000002 PS=0044 EAX=000003FA EDX=00000002 PS=0044
@@ -34003,8 +34003,8 @@
RCLr D EAX=000000FE EDX=00000001 PS=0045 EAX=000001FD EDX=00000001 PS=0044
RCLr D EAX=000000FE EDX=00000002 PS=0044 EAX=000003FA EDX=00000002 PS=0044
RCLr D EAX=000000FE EDX=0000001F PS=0044 EAX=4000003F EDX=0000001F PS=0045
-RCLr D EAX=000000FE EDX=00000020 PS=0045 EAX=000000FE EDX=00000020 PS=0045
-RCLr D EAX=000000FF EDX=00000000 PS=0045 EAX=000000FF EDX=00000000 PS=0045
+RCLr D EAX=000000FE EDX=00000020 PS=0045 EAX=8000007F EDX=00000020 PS=0044
+RCLr D EAX=000000FF EDX=00000000 PS=0044 EAX=000000FF EDX=00000000 PS=0045
RCLr D EAX=000000FF EDX=00000001 PS=0045 EAX=000001FF EDX=00000001 PS=0044
RCLr D EAX=000000FF EDX=00000002 PS=0044 EAX=000003FE EDX=00000002 PS=0044
RCLr D EAX=000000FF EDX=00000008 PS=0044 EAX=0000FF80 EDX=00000008 PS=0044
@@ -34016,7 +34016,7 @@
RCLr D EAX=000000FF EDX=00000001 PS=0045 EAX=000001FF EDX=00000001 PS=0044
RCLr D EAX=000000FF EDX=00000002 PS=0044 EAX=000003FE EDX=00000002 PS=0044
RCLr D EAX=000000FF EDX=0000001F PS=0044 EAX=C000003F EDX=0000001F PS=0045
-RCLr D EAX=000000FF EDX=00000020 PS=0045 EAX=000000FF EDX=00000020 PS=0045
+RCLr D EAX=000000FF EDX=00000020 PS=0045 EAX=8000007F EDX=00000020 PS=0045
RCLr D EAX=00000000 EDX=00000000 PS=0045 EAX=00000000 EDX=00000000 PS=0045
RCLr D EAX=00000000 EDX=00000001 PS=0045 EAX=00000001 EDX=00000001 PS=0044
RCLr D EAX=00000000 EDX=00000002 PS=0044 EAX=00000002 EDX=00000002 PS=0044
@@ -34029,8 +34029,8 @@
RCLr D EAX=00000000 EDX=00000001 PS=0045 EAX=00000001 EDX=00000001 PS=0044
RCLr D EAX=00000000 EDX=00000002 PS=0044 EAX=00000002 EDX=00000002 PS=0044
RCLr D EAX=00000000 EDX=0000001F PS=0044 EAX=40000000 EDX=0000001F PS=0044
-RCLr D EAX=00000000 EDX=00000020 PS=0044 EAX=00000000 EDX=00000020 PS=0045
-RCLr D EAX=00000001 EDX=00000000 PS=0045 EAX=00000001 EDX=00000000 PS=0045
+RCLr D EAX=00000000 EDX=00000020 PS=0044 EAX=80000000 EDX=00000020 PS=0044
+RCLr D EAX=00000001 EDX=00000000 PS=0044 EAX=00000001 EDX=00000000 PS=0045
RCLr D EAX=00000001 EDX=00000001 PS=0045 EAX=00000003 EDX=00000001 PS=0044
RCLr D EAX=00000001 EDX=00000002 PS=0044 EAX=00000006 EDX=00000002 PS=0044
RCLr D EAX=00000001 EDX=00000008 PS=0044 EAX=00000180 EDX=00000008 PS=0044
@@ -34042,7 +34042,7 @@
RCLr D EAX=00000001 EDX=00000001 PS=0045 EAX=00000003 EDX=00000001 PS=0044
RCLr D EAX=00000001 EDX=00000002 PS=0044 EAX=00000006 EDX=00000002 PS=0044
RCLr D EAX=00000001 EDX=0000001F PS=0044 EAX=C0000000 EDX=0000001F PS=0044
-RCLr D EAX=00000001 EDX=00000020 PS=0044 EAX=00000001 EDX=00000020 PS=0045
+RCLr D EAX=00000001 EDX=00000020 PS=0044 EAX=80000000 EDX=00000020 PS=0045
RCLr D EAX=00000181 EDX=00000000 PS=0045 EAX=00000181 EDX=00000000 PS=0045
RCLr D EAX=00000181 EDX=00000001 PS=0045 EAX=00000303 EDX=00000001 PS=0044
RCLr D EAX=00000181 EDX=00000002 PS=0044 EAX=00000606 EDX=00000002 PS=0044
@@ -34055,7 +34055,7 @@
RCLr D EAX=00000181 EDX=00000001 PS=0045 EAX=00000303 EDX=00000001 PS=0044
RCLr D EAX=00000181 EDX=00000002 PS=0044 EAX=00000606 EDX=00000002 PS=0044
RCLr D EAX=00000181 EDX=0000001F PS=0044 EAX=C0000060 EDX=0000001F PS=0044
-RCLr D EAX=00000181 EDX=00000020 PS=0044 EAX=00000181 EDX=00000020 PS=0045
+RCLr D EAX=00000181 EDX=00000020 PS=0044 EAX=800000C0 EDX=00000020 PS=0045
RCLr D EAX=00007FFE EDX=00000000 PS=0045 EAX=00007FFE EDX=00000000 PS=0045
RCLr D EAX=00007FFE EDX=00000001 PS=0045 EAX=0000FFFD EDX=00000001 PS=0044
RCLr D EAX=00007FFE EDX=00000002 PS=0044 EAX=0001FFFA EDX=00000002 PS=0044
@@ -34068,8 +34068,8 @@
RCLr D EAX=00007FFE EDX=00000001 PS=0045 EAX=0000FFFD EDX=00000001 PS=0044
RCLr D EAX=00007FFE EDX=00000002 PS=0044 EAX=0001FFFA EDX=00000002 PS=0044
RCLr D EAX=00007FFE EDX=0000001F PS=0044 EAX=40001FFF EDX=0000001F PS=0045
-RCLr D EAX=00007FFE EDX=00000020 PS=0045 EAX=00007FFE EDX=00000020 PS=0045
-RCLr D EAX=00007FFF EDX=00000000 PS=0045 EAX=00007FFF EDX=00000000 PS=0045
+RCLr D EAX=00007FFE EDX=00000020 PS=0045 EAX=80003FFF EDX=00000020 PS=0044
+RCLr D EAX=00007FFF EDX=00000000 PS=0044 EAX=00007FFF EDX=00000000 PS=0045
RCLr D EAX=00007FFF EDX=00000001 PS=0045 EAX=0000FFFF EDX=00000001 PS=0044
RCLr D EAX=00007FFF EDX=00000002 PS=0044 EAX=0001FFFE EDX=00000002 PS=0044
RCLr D EAX=00007FFF EDX=00000008 PS=0044 EAX=007FFF80 EDX=00000008 PS=0044
@@ -34081,7 +34081,7 @@
RCLr D EAX=00007FFF EDX=00000001 PS=0045 EAX=0000FFFF EDX=00000001 PS=0044
RCLr D EAX=00007FFF EDX=00000002 PS=0044 EAX=0001FFFE EDX=00000002 PS=0044
RCLr D EAX=00007FFF EDX=0000001F PS=0044 EAX=C0001FFF EDX=0000001F PS=0045
-RCLr D EAX=00007FFF EDX=00000020 PS=0045 EAX=00007FFF EDX=00000020 PS=0045
+RCLr D EAX=00007FFF EDX=00000020 PS=0045 EAX=80003FFF EDX=00000020 PS=0045
RCLr D EAX=00008000 EDX=00000000 PS=0045 EAX=00008000 EDX=00000000 PS=0045
RCLr D EAX=00008000 EDX=00000001 PS=0045 EAX=00010001 EDX=00000001 PS=0044
RCLr D EAX=00008000 EDX=00000002 PS=0044 EAX=00020002 EDX=00000002 PS=0044
@@ -34094,8 +34094,8 @@
RCLr D EAX=00008000 EDX=00000001 PS=0045 EAX=00010001 EDX=00000001 PS=0044
RCLr D EAX=00008000 EDX=00000002 PS=0044 EAX=00020002 EDX=00000002 PS=0044
RCLr D EAX=00008000 EDX=0000001F PS=0044 EAX=40002000 EDX=0000001F PS=0044
-RCLr D EAX=00008000 EDX=00000020 PS=0044 EAX=00008000 EDX=00000020 PS=0045
-RCLr D EAX=00008001 EDX=00000000 PS=0045 EAX=00008001 EDX=00000000 PS=0045
+RCLr D EAX=00008000 EDX=00000020 PS=0044 EAX=80004000 EDX=00000020 PS=0044
+RCLr D EAX=00008001 EDX=00000000 PS=0044 EAX=00008001 EDX=00000000 PS=0045
RCLr D EAX=00008001 EDX=00000001 PS=0045 EAX=00010003 EDX=00000001 PS=0044
RCLr D EAX=00008001 EDX=00000002 PS=0044 EAX=00020006 EDX=00000002 PS=0044
RCLr D EAX=00008001 EDX=00000008 PS=0044 EAX=00800180 EDX=00000008 PS=0044
@@ -34107,7 +34107,7 @@
RCLr D EAX=00008001 EDX=00000001 PS=0045 EAX=00010003 EDX=00000001 PS=0044
RCLr D EAX=00008001 EDX=00000002 PS=0044 EAX=00020006 EDX=00000002 PS=0044
RCLr D EAX=00008001 EDX=0000001F PS=0044 EAX=C0002000 EDX=0000001F PS=0044
-RCLr D EAX=00008001 EDX=00000020 PS=0044 EAX=00008001 EDX=00000020 PS=0045
+RCLr D EAX=00008001 EDX=00000020 PS=0044 EAX=80004000 EDX=00000020 PS=0045
RCLr D EAX=0000FFFE EDX=00000000 PS=0045 EAX=0000FFFE EDX=00000000 PS=0045
RCLr D EAX=0000FFFE EDX=00000001 PS=0045 EAX=0001FFFD EDX=00000001 PS=0044
RCLr D EAX=0000FFFE EDX=00000002 PS=0044 EAX=0003FFFA EDX=00000002 PS=0044
@@ -34120,8 +34120,8 @@
RCLr D EAX=0000FFFE EDX=00000001 PS=0045 EAX=0001FFFD EDX=00000001 PS=0044
RCLr D EAX=0000FFFE EDX=00000002 PS=0044 EAX=0003FFFA EDX=00000002 PS=0044
RCLr D EAX=0000FFFE EDX=0000001F PS=0044 EAX=40003FFF EDX=0000001F PS=0045
-RCLr D EAX=0000FFFE EDX=00000020 PS=0045 EAX=0000FFFE EDX=00000020 PS=0045
-RCLr D EAX=0000FFFF EDX=00000000 PS=0045 EAX=0000FFFF EDX=00000000 PS=0045
+RCLr D EAX=0000FFFE EDX=00000020 PS=0045 EAX=80007FFF EDX=00000020 PS=0044
+RCLr D EAX=0000FFFF EDX=00000000 PS=0044 EAX=0000FFFF EDX=00000000 PS=0045
RCLr D EAX=0000FFFF EDX=00000001 PS=0045 EAX=0001FFFF EDX=00000001 PS=0044
RCLr D EAX=0000FFFF EDX=00000002 PS=0044 EAX=0003FFFE EDX=00000002 PS=0044
RCLr D EAX=0000FFFF EDX=00000008 PS=0044 EAX=00FFFF80 EDX=00000008 PS=0044
@@ -34133,7 +34133,7 @@
RCLr D EAX=0000FFFF EDX=00000001 PS=0045 EAX=0001FFFF EDX=00000001 PS=0044
RCLr D EAX=0000FFFF EDX=00000002 PS=0044 EAX=0003FFFE EDX=00000002 PS=0044
RCLr D EAX=0000FFFF EDX=0000001F PS=0044 EAX=C0003FFF EDX=0000001F PS=0045
-RCLr D EAX=0000FFFF EDX=00000020 PS=0045 EAX=0000FFFF EDX=00000020 PS=0045
+RCLr D EAX=0000FFFF EDX=00000020 PS=0045 EAX=80007FFF EDX=00000020 PS=0045
RCLr D EAX=00000000 EDX=00000000 PS=0045 EAX=00000000 EDX=00000000 PS=0045
RCLr D EAX=00000000 EDX=00000001 PS=0045 EAX=00000001 EDX=00000001 PS=0044
RCLr D EAX=00000000 EDX=00000002 PS=0044 EAX=00000002 EDX=00000002 PS=0044
@@ -34146,8 +34146,8 @@
RCLr D EAX=00000000 EDX=00000001 PS=0045 EAX=00000001 EDX=00000001 PS=0044
RCLr D EAX=00000000 EDX=00000002 PS=0044 EAX=00000002 EDX=00000002 PS=0044
RCLr D EAX=00000000 EDX=0000001F PS=0044 EAX=40000000 EDX=0000001F PS=0044
-RCLr D EAX=00000000 EDX=00000020 PS=0044 EAX=00000000 EDX=00000020 PS=0045
-RCLr D EAX=00000001 EDX=00000000 PS=0045 EAX=00000001 EDX=00000000 PS=0045
+RCLr D EAX=00000000 EDX=00000020 PS=0044 EAX=80000000 EDX=00000020 PS=0044
+RCLr D EAX=00000001 EDX=00000000 PS=0044 EAX=00000001 EDX=00000000 PS=0045
RCLr D EAX=00000001 EDX=00000001 PS=0045 EAX=00000003 EDX=00000001 PS=0044
RCLr D EAX=00000001 EDX=00000002 PS=0044 EAX=00000006 EDX=00000002 PS=0044
RCLr D EAX=00000001 EDX=00000008 PS=0044 EAX=00000180 EDX=00000008 PS=0044
@@ -34159,7 +34159,7 @@
RCLr D EAX=00000001 EDX=00000001 PS=0045 EAX=00000003 EDX=00000001 PS=0044
RCLr D EAX=00000001 EDX=00000002 PS=0044 EAX=00000006 EDX=00000002 PS=0044
RCLr D EAX=00000001 EDX=0000001F PS=0044 EAX=C0000000 EDX=0000001F PS=0044
-RCLr D EAX=00000001 EDX=00000020 PS=0044 EAX=00000001 EDX=00000020 PS=0045
+RCLr D EAX=00000001 EDX=00000020 PS=0044 EAX=80000000 EDX=00000020 PS=0045
RCLr D EAX=00018001 EDX=00000000 PS=0045 EAX=00018001 EDX=00000000 PS=0045
RCLr D EAX=00018001 EDX=00000001 PS=0045 EAX=00030003 EDX=00000001 PS=0044
RCLr D EAX=00018001 EDX=00000002 PS=0044 EAX=00060006 EDX=00000002 PS=0044
@@ -34172,7 +34172,7 @@
RCLr D EAX=00018001 EDX=00000001 PS=0045 EAX=00030003 EDX=00000001 PS=0044
RCLr D EAX=00018001 EDX=00000002 PS=0044 EAX=00060006 EDX=00000002 PS=0044
RCLr D EAX=00018001 EDX=0000001F PS=0044 EAX=C0006000 EDX=0000001F PS=0044
-RCLr D EAX=00018001 EDX=00000020 PS=0044 EAX=00018001 EDX=00000020 PS=0045
+RCLr D EAX=00018001 EDX=00000020 PS=0044 EAX=8000C000 EDX=00000020 PS=0045
RCLr D EAX=7FFFFFFE EDX=00000000 PS=0045 EAX=7FFFFFFE EDX=00000000 PS=0045
RCLr D EAX=7FFFFFFE EDX=00000001 PS=0045 EAX=FFFFFFFD EDX=00000001 PS=0044
RCLr D EAX=7FFFFFFE EDX=00000002 PS=0044 EAX=FFFFFFFA EDX=00000002 PS=0045
@@ -34185,8 +34185,8 @@
RCLr D EAX=7FFFFFFE EDX=00000001 PS=0045 EAX=FFFFFFFD EDX=00000001 PS=0044
RCLr D EAX=7FFFFFFE EDX=00000002 PS=0044 EAX=FFFFFFFA EDX=00000002 PS=0045
RCLr D EAX=7FFFFFFE EDX=0000001F PS=0045 EAX=5FFFFFFF EDX=0000001F PS=0045
-RCLr D EAX=7FFFFFFE EDX=00000020 PS=0045 EAX=7FFFFFFE EDX=00000020 PS=0045
-RCLr D EAX=7FFFFFFF EDX=00000000 PS=0045 EAX=7FFFFFFF EDX=00000000 PS=0045
+RCLr D EAX=7FFFFFFE EDX=00000020 PS=0045 EAX=BFFFFFFF EDX=00000020 PS=0044
+RCLr D EAX=7FFFFFFF EDX=00000000 PS=0044 EAX=7FFFFFFF EDX=00000000 PS=0045
RCLr D EAX=7FFFFFFF EDX=00000001 PS=0045 EAX=FFFFFFFF EDX=00000001 PS=0044
RCLr D EAX=7FFFFFFF EDX=00000002 PS=0044 EAX=FFFFFFFE EDX=00000002 PS=0045
RCLr D EAX=7FFFFFFF EDX=00000008 PS=0045 EAX=FFFFFFBF EDX=00000008 PS=0045
@@ -34198,7 +34198,7 @@
RCLr D EAX=7FFFFFFF EDX=00000001 PS=0045 EAX=FFFFFFFF EDX=00000001 PS=0044
RCLr D EAX=7FFFFFFF EDX=00000002 PS=0044 EAX=FFFFFFFE EDX=00000002 PS=0045
RCLr D EAX=7FFFFFFF EDX=0000001F PS=0045 EAX=DFFFFFFF EDX=0000001F PS=0045
-RCLr D EAX=7FFFFFFF EDX=00000020 PS=0045 EAX=7FFFFFFF EDX=00000020 PS=0045
+RCLr D EAX=7FFFFFFF EDX=00000020 PS=0045 EAX=BFFFFFFF EDX=00000020 PS=0045
RCLr D EAX=80000000 EDX=00000000 PS=0045 EAX=80000000 EDX=00000000 PS=0045
RCLr D EAX=80000000 EDX=00000001 PS=0045 EAX=00000001 EDX=00000001 PS=0045
RCLr D EAX=80000000 EDX=00000002 PS=0045 EAX=00000003 EDX=00000002 PS=0044
@@ -34211,8 +34211,8 @@
RCLr D EAX=80000000 EDX=00000001 PS=0045 EAX=00000001 EDX=00000001 PS=0045
RCLr D EAX=80000000 EDX=00000002 PS=0045 EAX=00000003 EDX=00000002 PS=0044
RCLr D EAX=80000000 EDX=0000001F PS=0044 EAX=60000000 EDX=0000001F PS=0044
-RCLr D EAX=80000000 EDX=00000020 PS=0044 EAX=80000000 EDX=00000020 PS=0045
-RCLr D EAX=80000001 EDX=00000000 PS=0045 EAX=80000001 EDX=00000000 PS=0045
+RCLr D EAX=80000000 EDX=00000020 PS=0044 EAX=C0000000 EDX=00000020 PS=0044
+RCLr D EAX=80000001 EDX=00000000 PS=0044 EAX=80000001 EDX=00000000 PS=0045
RCLr D EAX=80000001 EDX=00000001 PS=0045 EAX=00000003 EDX=00000001 PS=0045
RCLr D EAX=80000001 EDX=00000002 PS=0045 EAX=00000007 EDX=00000002 PS=0044
RCLr D EAX=80000001 EDX=00000008 PS=0044 EAX=000001C0 EDX=00000008 PS=0044
@@ -34224,7 +34224,7 @@
RCLr D EAX=80000001 EDX=00000001 PS=0045 EAX=00000003 EDX=00000001 PS=0045
RCLr D EAX=80000001 EDX=00000002 PS=0045 EAX=00000007 EDX=00000002 PS=0044
RCLr D EAX=80000001 EDX=0000001F PS=0044 EAX=E0000000 EDX=0000001F PS=0044
-RCLr D EAX=80000001 EDX=00000020 PS=0044 EAX=80000001 EDX=00000020 PS=0045
+RCLr D EAX=80000001 EDX=00000020 PS=0044 EAX=C0000000 EDX=00000020 PS=0045
RCLr D EAX=FFFFFFFE EDX=00000000 PS=0045 EAX=FFFFFFFE EDX=00000000 PS=0045
RCLr D EAX=FFFFFFFE EDX=00000001 PS=0045 EAX=FFFFFFFD EDX=00000001 PS=0045
RCLr D EAX=FFFFFFFE EDX=00000002 PS=0045 EAX=FFFFFFFB EDX=00000002 PS=0045
@@ -34237,8 +34237,8 @@
RCLr D EAX=FFFFFFFE EDX=00000001 PS=0045 EAX=FFFFFFFD EDX=00000001 PS=0045
RCLr D EAX=FFFFFFFE EDX=00000002 PS=0045 EAX=FFFFFFFB EDX=00000002 PS=0045
RCLr D EAX=FFFFFFFE EDX=0000001F PS=0045 EAX=7FFFFFFF EDX=0000001F PS=0045
-RCLr D EAX=FFFFFFFE EDX=00000020 PS=0045 EAX=FFFFFFFE EDX=00000020 PS=0045
-RCLr D EAX=FFFFFFFF EDX=00000000 PS=0045 EAX=FFFFFFFF EDX=00000000 PS=0045
+RCLr D EAX=FFFFFFFE EDX=00000020 PS=0045 EAX=FFFFFFFF EDX=00000020 PS=0044
+RCLr D EAX=FFFFFFFF EDX=00000000 PS=0044 EAX=FFFFFFFF EDX=00000000 PS=0045
RCLr D EAX=FFFFFFFF EDX=00000001 PS=0045 EAX=FFFFFFFF EDX=00000001 PS=0045
RCLr D EAX=FFFFFFFF EDX=00000002 PS=0045 EAX=FFFFFFFF EDX=00000002 PS=0045
RCLr D EAX=FFFFFFFF EDX=00000008 PS=0045 EAX=FFFFFFFF EDX=00000008 PS=0045

Oddly enough, the RCLr instruction has bugs now? Looking at the first shift that goes wrong, it seems to be off by one shift. So maybe RCL actually wraps around 32 instead of 33?
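As a rough illustration of the rows with EDX=00000020 above, here is a minimal sketch of a 32-bit rotate-through-carry done one bit at a time. The function name rcl32_steps is made up for this example and is not UniPCemu code. It assumes the '-' rows are the reference and the '+' rows the emulator output, and that bit 0 of PS is CF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value left through carry,
 * one bit per step. *cf holds the carry flag (0 or 1) on entry and exit. */
static uint32_t rcl32_steps(uint32_t value, unsigned count, unsigned *cf)
{
    assert(*cf <= 1u);
    while (count--) {
        unsigned msb = (unsigned)(value >> 31) & 1u; /* bit leaving the top */
        value = (value << 1) | *cf;                  /* old CF enters bit 0 */
        *cf = msb;                                   /* MSB becomes new CF */
    }
    return value;
}
```

With CF set and EAX=0000007F, a count of 32 % 33 = 32 gives 8000003F with CF still set, which is the value in the '+' rows. A count of 32 & 0x1F = 0 leaves EAX at 0000007F, as in the '-' rows. Rotating the 33-bit CF:EAX value left by 32 is the same as rotating it right by one, which is why the result looks one shift off.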

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 112 of 178, by superfury

User metadata
Rank l33t++
Rank
l33t++

I've applied all your comments to the code, except for RCL, which only gives correct (error-free) results when it wraps the count around like the other instructions (&0x1F instead of %33). When it uses %33 (for the 32-bit rotate), the results become entirely wrong in some edge cases. When reverted to &0x1F, ALL shift/rotate instructions match the port EE logs that were generated by hottobar (who, I assume, used a completely error-free CPU for that).

So the wrap around 33 applies only to RCR, while all other shift/rotate instructions wrap around 0x1F (5 bits)? peterferrie?
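For comparison, this is a small sketch of the count reduction as the Intel manual's RCL/RCR pseudocode gives it for 8/16/32-bit operands on the 386 and later: mask with 1FH first, then take the result modulo 9 or 17 for bytes and words, with no further modulo for dwords. The function name rcx_effective_count is hypothetical and not taken from any emulator.

```c
#include <assert.h>

/* Hypothetical helper: number of single-bit rotate steps RCL/RCR perform
 * for a raw count, following the documented pseudocode (assumption:
 * 80386+ behaviour, operand size of 8, 16 or 32 bits). */
static unsigned rcx_effective_count(unsigned count, unsigned size_bits)
{
    assert(size_bits == 8 || size_bits == 16 || size_bits == 32);
    unsigned masked = count & 0x1Fu;      /* 5-bit mask is applied first */
    if (size_bits == 32)
        return masked;                    /* at most 31, so %33 changes nothing */
    return masked % (size_bits + 1u);     /* 9 for bytes, 17 for words */
}
```

Read this way, &0x1F and %33 only disagree when the 5-bit mask is skipped: once the count is masked it is at most 31, so a following %33 is a no-op, and a raw count of 32 becomes 0.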

With this 'bug'(modulo 32 instead of 33 with RCL) implemented in the 32-bit shift/rotate instructions, the entire POST EE output 100% matches the reference file in the test386.asm repository 😁

The only remaining bugs (according to the test386.asm testsuite) might be in the undocumented, processor-specific part.

Reply 113 of 178, by superfury

User metadata
Rank l33t++
Rank
l33t++

Just tried running the Windows 95a boot disk again. With the latest shift/rotate fixes, the boot image palette scrolling works without problems now! (instead of just inserting white at the right side, scrolling in white) It now properly rotates and adjusts the palette! 😁 It's still slow for some unknown reason, though.

Reply 114 of 178, by superfury

User metadata
Rank l33t++
Rank
l33t++

So, let's assume the testsuite (without the CPU-specific part) validates OK. Then why does Windows 95 setup still crash while loading (no output) with a BOUND exception, caused by either a ModR/M offset overflow or invalid bounds in memory? Are there any known opcodes that the testsuite doesn't test?

Reply 115 of 178, by peterferrie

User metadata
Rank Oldbie
Rank
Oldbie
superfury wrote:

Documentation says mask with 0x1F, even with 32-bit rcl/rcr?

rol al,8 will modulo 8 with 8, becoming 0, thus not rotating anything, thus no carry flag modification? Or is it set to bit 0 with 8/16 shifts always?

With count(s), do you mean cnt, maskcnt or numcnt in those statements?

The modulo here is 32, not 8. rol al,8 will perform rol al,8 (and rol ax,16 will perform rol ax,16), and if al bit 0=1 at the time, then the carry will be set as a result.
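The carry behaviour described here can be sketched as follows. This is only an illustration based on the Intel manual's ROL pseudocode (CF is loaded from bit 0 of the result whenever the count masked with 1FH is nonzero), and the function name rol8 is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 8-bit ROL with a 5-bit count mask. *cf is only
 * written when the masked count is nonzero, even if no bits move. */
static uint8_t rol8(uint8_t value, unsigned count, unsigned *cf)
{
    assert(*cf <= 1u);
    unsigned masked = count & 0x1Fu;  /* 5-bit count mask */
    unsigned rot = masked % 8u;       /* bit positions actually moved */
    if (rot != 0)
        value = (uint8_t)((value << rot) | (value >> (8u - rot)));
    if (masked != 0)
        *cf = value & 1u;             /* CF mirrors bit 0 of the result */
    return value;
}
```

So rol al,8 with AL=81h leaves AL at 81h but sets CF, while a count of 0 leaves CF untouched.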

Reply 116 of 178, by superfury

User metadata
Rank l33t++
Rank
l33t++

Doesn't the documentation say that 8-bit ROL/ROR mask the count with 0x1F and then take it modulo 8 (which is the same as &0x7, as is done in my emulator for performance)? Thus (8&0x1F)%8=(8&0x1F)&7=0, so nothing is shifted/rotated? Only with RCL/RCR is the modulo 9 instead, so "rcl al,8" and "rcr al,8" actually shift. UniPCemu simply optimizes %8, %16 and %32 into &7, &0xF and &0x1F; the result is the same either way. Thus rol al,8 actually performs rol al,0? The rol operation itself doesn't set the carry flag in that case, but UniPCemu makes an exception (by checking if (maskcnt && (numcnt==0))) to catch the "al,8" (8-bit) and "al,16" (8/16-bit) versions and set CF to bit 0 with RCL and ROL, or to the sign bit (bit 7 (byte) or 15 (word)) with ROR/RCR. That way it handles those edge cases correctly.

8-bit:

byte op_grp2_8(byte cnt, byte varshift) {
//word d,
INLINEREGISTER word s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1b;
switch (thereg) {
case 0: //ROL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>7); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 7) & 1)^FLAG_CF); //Only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 7; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FU) | (FLAG_CF << 7);
overflow = ((s >> 7) ^ ((s >> 6) & 1)); //Only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>7); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF(s>>7); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 7) & 1)^FLAG_CF); //OF=MSB^CF, only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 9; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (((s >> 7)&1)^FLAG_CF);
tempCF = FLAG_CF;
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FU) | (tempCF << 7);
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>7);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFU;
overflow = (FLAG_CF^(s>>7));
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>7); //Always sets CF, according to various sources?
if (numcnt) flag_szp8((uint8_t)(s&0xFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>7);
FLAGW_CF(s);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (numcnt) flag_szp8((uint8_t)(s & 0xFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m8
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp8((uint8_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp8((uint8_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared with count as well?
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFU);
}

16-bit:

word op_grp2_16(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_32 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt, maskcnt, overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1;
switch (thereg) {
case 0: //ROL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>15); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 15) & 1)^FLAG_CF); //Only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 1: //ROR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0xF; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FFFU) | (FLAG_CF << 15);
overflow = ((s >> 15) ^ ((s >> 14) & 1)); //Only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>15); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 2: //RCL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF(s>>15); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 15) & 1)^FLAG_CF); //OF=MSB^CF, only when not using CL?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 3: //RCR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt %= 17; //Operand size wrap!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow = ((s >> 15)^FLAG_CF);
tempCF = FLAG_CF;
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FFFU) | (tempCF << 15);
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
break;

case 4: case 6: //SHL r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>15);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFU;
overflow = (FLAG_CF^(s>>15));
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>15); //Always sets CF, according to various sources?
if (numcnt) flag_szp16((uint16_t)(s&0xFFFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 5: //SHR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>15);
FLAGW_CF(s);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (numcnt) flag_szp16((uint16_t)(s & 0xFFFFU));
if (maskcnt) FLAGW_OF(overflow);
break;

case 7: //SAR r/m16
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x8000U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp16(s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp16(s); //Affect sign as well!
FLAGW_OF(0); //Cleared with count as well?
}
break;
}
op_grp2_cycles(numcnt, varshift);
return (s & 0xFFFFU);
}

32-bit:

uint_32 op_grp2_32(byte cnt, byte varshift) {
//word d,
INLINEREGISTER uint_64 s, shift, tempCF, msb;
INLINEREGISTER byte numcnt,maskcnt,overflow;
//word backup;
//if (cnt>0x8) return (oper1b); //NEC V20/V30+ limits shift count
numcnt = maskcnt = cnt; //Save count!
s = oper1d;
switch (thereg) {
case 0: //ROL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>31); //Save MSB!
s = (s << 1)|FLAG_CF;
overflow = (((s >> 31) & 1)^FLAG_CF);
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 1: //ROR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
if (EMULATED_CPU>=CPU_80386) numcnt &= 0x1F; //Operand size wrap!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FFFFFFFU) | (FLAG_CF << 31);
overflow = ((s >> 31) ^ ((s >> 30) & 1));
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>31); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 2: //RCL r/m32
if (EMULATED_CPU >= CPU_80386) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
tempCF = FLAG_CF;
FLAGW_CF(s>>31); //Save MSB!
s = (s << 1)|tempCF; //Shift and set CF!
overflow = (((s >> 31) & 1)^FLAG_CF); //OF=MSB^CF
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 3: //RCR r/m32
if (EMULATED_CPU >= CPU_80386) maskcnt %= 33; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//if (EMULATED_CPU >= CPU_NECV30) numcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
overflow = numcnt?0:FLAG_OF; //Default: no overflow!
for (shift = 1; shift <= numcnt; shift++) {
overflow = (((s >> 31)&1)^FLAG_CF);
tempCF = FLAG_CF;
FLAGW_CF(s); //Save LSB!
s = ((s >> 1)&0x7FFFFFFFU) | (tempCF << 31);
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow); //Overflow?
break;

case 4: case 6: //SHL r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s>>31);
//if (s & 0x8) FLAGW_AF(1); //Auxiliary carry?
s = (s << 1) & 0xFFFFFFFFU;
overflow = (FLAG_CF^(s>>31));
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s>>31); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s&0xFFFFFFFFU));
break;

case 5: //SHR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
//FLAGW_AF(0);
overflow = numcnt?0:FLAG_OF;
for (shift = 1; shift <= numcnt; shift++) {
overflow = (s>>31);
FLAGW_CF(s);
//backup = s; //Save backup!
s = s >> 1;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
if (maskcnt) FLAGW_OF(overflow);
if (numcnt) flag_szp32((uint32_t)(s & 0xFFFFFFFFU));
break;

case 7: //SAR r/m32
if (EMULATED_CPU >= CPU_NECV30) maskcnt &= 0x1F; //Clear the upper 3 bits to become a NEC V20/V30+!
numcnt = maskcnt;
msb = s & 0x80000000U;
//FLAGW_AF(0);
for (shift = 1; shift <= numcnt; shift++) {
FLAGW_CF(s);
//backup = s; //Save backup!
s = (s >> 1) | msb;
//if (((backup^s)&0x10)) FLAGW_AF(1); //Auxiliary carry?
}
if (maskcnt && (numcnt==0)) FLAGW_CF(s); //Always sets CF, according to various sources?
byte tempSF;
tempSF = FLAG_SF; //Save the SF!
/*flag_szp8((uint8_t)(s & 0xFF));*/
//http://www.electronics.dit.ie/staff/tscarff/8086_instruction_set/8086_instruction_set.html#SAR says only C and O flags!
if (!maskcnt) //Nothing done?
{
FLAGW_SF(tempSF); //We don't update when nothing's done!
}
else if (maskcnt==1) //Overflow is cleared on all 1-bit shifts!
{
flag_szp32((uint32_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared!
}
else if (numcnt) //Anything shifted at all?
{
flag_szp32((uint32_t)s); //Affect sign as well!
FLAGW_OF(0); //Cleared with count as well?
}
break;
}
op_grp2_cycles32(numcnt, varshift);
return (s & 0xFFFFFFFFU);
}

These versions seem to work without problems, according to the testsuite?

The only odd thing, compared to the documentation, is that the 32-bit RCR wraps around 33 (using %33), but the RCL instruction does follow the documentation, wrapping around 32. If %33 is used for RCL instead of &0x1F, the results differ in some cases, coming up one shift short: the result becomes 0x40000000 instead of 0x80000000, along with those other final errors found when comparing the real chip against my emulator. My emulator performs the 0x1F wrap instead of modulo 33 on RCL, which makes it produce the exact same results as the real chip (POST EE reference file). Executing %33 like RCR makes it execute one shift short, producing invalid results (both in the carry flag and in the result dword).

So that 'bug' is either a real bug in the actual chip the POST EE reference was created on, or the documentation of that exact instruction (RCL r/m32) is invalid and only applies to RCR?
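
For reference, the &0x1F behaviour described above can be sketched as a 33-bit rotate through CF. This is only an illustration: rcl_result and rcl32 are made-up names, not UniPCemu code, and the sketch assumes the count is reduced by the 5-bit mask alone, with no modulo 33.

```c
#include <stdint.h>

/* Hypothetical result type for illustration: rotated value plus the new CF. */
typedef struct {
	uint32_t value;
	unsigned cf;
} rcl_result;

/* Sketch of RCL r/m32: rotate left through carry, one bit per iteration.
   Assumption: the count is only masked with 0x1F (no modulo 33). */
static rcl_result rcl32(uint32_t value, unsigned cf, unsigned count)
{
	rcl_result r = { value, cf & 1u };
	count &= 0x1F; /* 5-bit mask only */
	while (count--) {
		unsigned msb = (unsigned)(r.value >> 31) & 1u; /* bit leaving the register */
		r.value = (r.value << 1) | r.cf; /* old CF enters at bit 0 */
		r.cf = msb; /* the bit that left becomes the new CF */
	}
	return r;
}
```

With this sketch, a count of 33 behaves like a count of 1, and a count of 32 leaves both the value and CF untouched.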

Reply 117 of 178, by superfury

Oddly enough, with the current fixes based on the test386.asm testsuite, the Compaq BIOS once again fails to set up DMA correctly. After acknowledging the floppy disk controller, it sets the DMA mode control register to self-test mode (transfer type 0, verify) instead of memory write mode (transfer type 1; type 2 is memory read and type 3 is invalid). Memory write mode is required to properly store floppy disk data in memory, instead of performing a dummy memory access (a read, probably).

I've tried reverting the 8086/80386 cores to their state at the commit of 20171106_1539, but the error still persists. Maybe it's an error in one of the other units?

Edit: Just tried the commit from that day, which also fails setting up DMA correctly? I guess it's time to dive into the Compaq BIOS again and find out why it's programming it incorrectly...
Edit: Whoops. I have the disassembly of the BIOS, but no clue where it even sets that up?
https://www.pcjs.org/devices/pcx86/rom/compaq … /1988-01-28.asm

Edit: This seems to be the cause of the invalid load of the DMA controller, when starting to initialise the FDC for booting(by my hard disk BIOS):

jnz xedd7             ; 0000ED61 7574 'ut'
xor bh,bh             ; 0000ED63 32FF '2.'
mov bl,[bp+0x1]       ; 0000ED65 8A5E01 '.^.'
shl bx,1              ; 0000ED68 D1E3 '..'
mov ax,[cs:bx+0xefd1] ; 0000ED6A 2E8B87D1EF '.....'
out 0xb,al            ; 0000ED6F E60B '..'

BP is 7EB8, so it loads BL from SS(0):BP+1 = address 7EB9, which holds the value 04h. It then looks up the table, resulting in F000:[EFD1+08] = F000:EFD9. That reads E642, of which 42h is written to the DMA mode control register for the FDC? That cannot be correct?

The value at F000:EFD5 contains the correct value to load (46h). So the value of BX should be 04h, which means that, unshifted, BL should have read 02h instead of 04h. So the value at BP+1 is 04h where it should be 02h?
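
The arithmetic above can be checked with a small sketch. The helper names (mode_table_offset, dma_channel, dma_transfer_type) are hypothetical and not taken from the BIOS or from UniPCemu; the field layout assumed is the standard 8237 mode register, with bits 0-1 selecting the channel and bits 2-3 the transfer type.

```c
#include <stdint.h>

/* Offset read by "mov ax,[cs:bx+0xefd1]" after "xor bh,bh",
   "mov bl,[bp+0x1]" and "shl bx,1". index is the byte at BP+1. */
static uint16_t mode_table_offset(uint8_t index)
{
	uint16_t bx = (uint16_t)(index << 1); /* shl bx,1 with BH cleared */
	return (uint16_t)(0xEFD1u + bx);
}

/* 8237 mode register, bits 0-1: channel select. */
static unsigned dma_channel(uint8_t mode)
{
	return mode & 3u;
}

/* 8237 mode register, bits 2-3: transfer type.
   0 = verify, 1 = write to memory, 2 = read from memory, 3 = invalid. */
static unsigned dma_transfer_type(uint8_t mode)
{
	return (mode >> 2) & 3u;
}
```

An index of 04h gives offset EFD9, where 42h decodes to channel 2 with transfer type 0 (verify). An index of 02h gives offset EFD5, where 46h decodes to channel 2 with transfer type 1 (write to memory).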

So the problem is further back? The current POST code is 10h. So it's starting at F000:9BD7.

Reply 118 of 178, by superfury

I've logged a complete log file, which should contain the problem. (I first had to find and fix a bug that stopped logging when the debugger is logging in the "common log format, always log even during skipping" mode and the Settings menu is loaded.) It's a log from the start/middle of the HDD BIOS (XT-IDE AT BIOS) waiting until it continues to boot the first specified disk (which is C by default in the ROM, but can be changed by pressing A or C, or F8 to let the original BIOS (Compaq BIOS) handle it).

The problem should be somewhere in the log, but it's quite big (5GB for just those 5 seconds the BIOS runs and starts booting, plus a handful of extra instructions logged by me after the breakpoint (see the code in the previous post) was hit).

The log(compressed with 256MB dictionary for better compression): https://www.dropbox.com/s/88yacsijqxmh1jf/deb … 16_1601.7z?dl=0

Reply 119 of 178, by peterferrie

superfury wrote:

Doesn't the documentation all say that 8-bit ROL/ROR mask with 0x1F, then modulo 8(which is the same as &0x7

Regardless of what the documentation says, the only mask is 0x1F. A rol al,8 will really rol al,8, and the carry will be affected.
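
A minimal sketch of that behaviour, assuming the count is masked with 0x1F only and that CF ends up as the low bit of the result (the bit last rotated in). rol8_result and rol8 are illustrative names, not code from any of the emulators discussed here, and OF is left out.

```c
#include <stdint.h>

/* Hypothetical result type for illustration: rotated value plus the new CF. */
typedef struct {
	uint8_t value;
	unsigned cf;
} rol8_result;

/* Sketch of ROL r/m8: the only mask is 0x1F, so a count of 8 still
   performs 8 single-bit rotates and updates CF. */
static rol8_result rol8(uint8_t value, unsigned cf, unsigned count)
{
	rol8_result r = { value, cf & 1u };
	count &= 0x1F; /* no additional modulo 8 */
	while (count--) {
		unsigned msb = (r.value >> 7) & 1u; /* bit rotated out of bit 7 */
		r.value = (uint8_t)((r.value << 1) | msb); /* ...and back into bit 0 */
		r.cf = msb; /* CF follows the bit rotated around */
	}
	return r;
}
```

In this sketch rol8(0x81, 0, 8) returns the unchanged value 0x81 with CF set, while a count of 32 masks to 0 and leaves both the value and CF alone.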