VOGONS


UniPCemu cycle accurate 8088 implementation

Topic actions

Reply 180 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

I've just adjusted the logging method during the cycle logging(simplified, common log format, single-line, state logging enabled). The results are... interesting(just when the XT BIOS starts running):

Details

00:00:33:92.08636: BIU T1 Physical(p):000ffff0=ea(ê); Paged(p):000ffff0=ea(ê); Normal(p):00000000=ea(ê)
00:00:33:92.09388: BIU T2
00:00:33:92.09536: BIU T3
00:00:33:92.09676: BIU T4
00:00:33:92.09828: BIU T1 Physical(p):000ffff1=5b([); Paged(p):000ffff1=5b([); Normal(p):00000001=5b([)
00:00:33:92.09972: BIU T2
00:00:33:93.00112: BIU T3
00:00:33:93.00252: BIU T4
00:00:33:93.00400: BIU T1 Physical(p):000ffff2=e0(à); Paged(p):000ffff2=e0(à); Normal(p):00000002=e0(à)
00:00:33:93.00540: BIU T2
00:00:33:93.00684: BIU T3
00:00:33:93.00820: BIU T4
00:00:33:93.00972: BIU T1 Physical(p):000ffff3=00( ); Paged(p):000ffff3=00( ); Normal(p):00000003=00( )
00:00:33:93.01112: BIU T2
00:00:33:93.01252: BIU T3
00:00:33:93.01628: BIU T4
00:00:33:93.01740: BIU T1 Physical(p):000ffff4=f0(ð); Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)
00:00:33:93.01836: BIU T2
00:00:33:93.01936: BIU T3
00:00:33:93.02032: BIU T4
00:00:33:93.02180: BIU T1 ffff:0000 (EA5BE000F0)JMP F000:E05B Physical(p):000fe05b=fa(ú); Paged(p):000fe05b=fa(ú); Normal(p):0000e05b=fa(ú)
00:00:33:93.02276: BIU T2
00:00:33:93.02376: BIU T3
00:00:33:93.02960: BIU T4
00:00:33:93.03092: BIU T1 Physical(p):000fe05c=fc(ü); Paged(p):000fe05c=fc(ü); Normal(p):0000e05c=fc(ü)
00:00:33:93.03188: BIU T2
00:00:33:93.03284: BIU T3
00:00:33:93.03388: BIU T4
00:00:33:93.03496: BIU T1 Physical(p):000fe05d=b0(°); Paged(p):000fe05d=b0(°); Normal(p):0000e05d=b0(°)
00:00:33:93.03596: BIU T2
00:00:33:93.03700: BIU T3
00:00:33:93.03804: BIU T4
00:00:33:93.03916: BIU T1 Physical(p):000fe05e=00( ); Paged(p):000fe05e=00( ); Normal(p):0000e05e=00( )
00:00:33:93.04028: BIU T2
00:00:33:93.04148: BIU T3
00:00:33:93.04268: BIU T4
00:00:33:93.04392: BIU T1 Physical(p):000fe05f=e6(æ); Paged(p):000fe05f=e6(æ); Normal(p):0000e05f=e6(æ)
00:00:33:93.04508: BIU T2 f000:e05b (FA)CLI
00:00:33:93.04644: BIU T3
00:00:33:93.04812: BIU T4
00:00:33:93.04928: BIU T1 Physical(p):000fe060=a0( ); Paged(p):000fe060=a0( ); Normal(p):0000e060=a0( )
00:00:33:93.05100: BIU T2 f000:e05c (FC)CLD
00:00:33:93.05244: BIU T3
00:00:33:93.05420: BIU T4
00:00:33:93.05584: BIU T1 Physical(p):000fe061=ba(º); Paged(p):000fe061=ba(º); Normal(p):0000e061=ba(º)
00:00:33:93.05712: BIU T2
00:00:33:93.05880: BIU T3 f000:e05d (B000)MOV AL,00
00:00:33:93.06004: BIU T4
00:00:33:93.06140: BIU T1 Physical(p):000fe062=d8(Ø); Paged(p):000fe062=d8(Ø); Normal(p):0000e062=d8(Ø)
00:00:33:93.06260: BIU T2
00:00:33:93.06360: BIU T3
00:00:33:93.06460: BIU T4
00:00:33:93.06572: BIU T1 Physical(p):000fe063=03(); Paged(p):000fe063=03(); Normal(p):0000e063=03()
00:00:33:93.06672: BIU T1
00:00:33:93.06772: BIU T2
00:00:33:93.06872: BIU T3
00:00:33:93.06972: BIU T4
00:00:33:93.07080: BIU T1
00:00:33:93.07180: BIU T2
00:00:33:93.07276: BIU T3
00:00:33:93.07376: BIU W
00:00:33:93.07476: BIU T4
00:00:33:93.07576: BIU T4
00:00:33:93.07688: BIU T1 Physical(p):000fe064=ee(î); Paged(p):000fe064=ee(î); Normal(p):0000e064=ee(î)
00:00:33:93.07788: BIU T2
00:00:33:93.07884: BIU T3
00:00:33:93.07988: BIU T4
00:00:33:93.08096: BIU T1 Physical(p):000fe065=b2(²); Paged(p):000fe065=b2(²); Normal(p):0000e065=b2(²)
00:00:33:93.08204: BIU T2 f000:e061 (BAD803)MOV DX,03D8
00:00:33:93.08304: BIU T3
00:00:33:93.08408: BIU T4
00:00:33:93.08516: BIU T1 Physical(p):000fe066=b8(¸); Paged(p):000fe066=b8(¸); Normal(p):0000e066=b8(¸)
00:00:33:93.08620: BIU T2
00:00:33:93.08720: BIU T3
00:00:33:93.08820: BIU T4
00:00:33:93.08920: BIU T1
00:00:33:93.09020: BIU T2
00:00:33:93.09132: BIU T3
00:00:33:93.09236: BIU W
00:00:33:93.09336: BIU T4
00:00:33:93.09436: BIU T4
00:00:33:93.09548: BIU T1 Physical(p):000fe067=fe(þ); Paged(p):000fe067=fe(þ); Normal(p):0000e067=fe(þ)
00:00:33:93.09644: BIU T2
00:00:33:93.09744: BIU T3
00:00:33:93.09844: BIU T4
00:00:33:93.09964: BIU T1 f000:e065 (B2B8)MOV DL,B8 Physical(p):000fe068=c0(À); Paged(p):000fe068=c0(À); Normal(p):0000e068=c0(À)
00:00:33:94.00112: BIU T2
00:00:33:94.00220: BIU T3
00:00:33:94.00320: BIU T4
00:00:33:94.00428: BIU T1 Physical(p):000fe069=ee(î); Paged(p):000fe069=ee(î); Normal(p):0000e069=ee(î)
00:00:33:94.00528: BIU T2
00:00:33:94.00640: BIU T3 f000:e067 (FEC0)INC AL
00:00:33:94.00740: BIU T4
00:00:33:94.00848: BIU T1 Physical(p):000fe06a=b0(°); Paged(p):000fe06a=b0(°); Normal(p):0000e06a=b0(°)
00:00:33:94.00952: BIU T2
00:00:33:94.01056: BIU T3
00:00:33:94.01160: BIU T4
00:00:33:94.01260: BIU T1
00:00:33:94.01360: BIU T2
00:00:33:94.01464: BIU T3
00:00:33:94.01704: BIU W
00:00:33:94.01808: BIU T4
00:00:33:94.01908: BIU T4
00:00:33:94.02020: BIU T1 Physical(p):000fe06b=99(™); Paged(p):000fe06b=99(™); Normal(p):0000e06b=99(™)
00:00:33:94.02120: BIU T2
00:00:33:94.02220: BIU T3
00:00:33:94.02316: BIU T4
00:00:33:94.02428: BIU T1 Physical(p):000fe06c=e6(æ); Paged(p):000fe06c=e6(æ); Normal(p):0000e06c=e6(æ)
00:00:33:94.02540: BIU T2 f000:e06a (B099)MOV AL,99
00:00:33:94.02636: BIU T3
00:00:33:94.02736: BIU T4
00:00:33:94.02844: BIU T1 Physical(p):000fe06d=63(c); Paged(p):000fe06d=63(c); Normal(p):0000e06d=63(c)
00:00:33:94.02944: BIU T2
00:00:33:94.03048: BIU T3
00:00:33:94.03148: BIU T4
00:00:33:94.03260: BIU T1 Physical(p):000fe06e=b0(°); Paged(p):000fe06e=b0(°); Normal(p):0000e06e=b0(°)
00:00:33:94.03360: BIU T1
00:00:33:94.03460: BIU T2
00:00:33:94.03560: BIU T3
00:00:33:94.03668: BIU T4
00:00:33:94.03768: BIU T1
00:00:33:94.03864: BIU T2
00:00:33:94.03996: BIU T3
00:00:33:94.04096: BIU W
00:00:33:94.04192: BIU T4
00:00:33:94.04288: BIU T4
00:00:33:94.04396: BIU T1 Physical(p):000fe06f=a5(¥); Paged(p):000fe06f=a5(¥); Normal(p):0000e06f=a5(¥)
00:00:33:94.04488: BIU T2
00:00:33:94.04584: BIU T3
00:00:33:94.04700: BIU T4
00:00:33:94.04828: BIU T1 Physical(p):000fe070=e6(æ); Paged(p):000fe070=e6(æ); Normal(p):0000e070=e6(æ)
00:00:33:94.04956: BIU T2 f000:e06e (B0A5)MOV AL,A5
00:00:33:94.05068: BIU T3
00:00:33:94.05164: BIU T4
00:00:33:94.05276: BIU T1 Physical(p):000fe071=61(a); Paged(p):000fe071=61(a); Normal(p):0000e071=61(a)
00:00:33:94.05372: BIU T2
00:00:33:94.05468: BIU T3
00:00:33:94.05564: BIU T4
00:00:33:94.05676: BIU T1 Physical(p):000fe072=b0(°); Paged(p):000fe072=b0(°); Normal(p):0000e072=b0(°)
00:00:33:94.05772: BIU T1
00:00:33:94.05864: BIU T2
00:00:33:94.05992: BIU T3
00:00:33:94.06128: BIU T4
00:00:33:94.06228: BIU T1
00:00:33:94.06344: BIU T2
00:00:33:94.06444: BIU T3
00:00:33:94.06540: BIU W
00:00:33:94.06636: BIU T4
00:00:33:94.06732: BIU T4
00:00:33:94.06840: BIU T1 Physical(p):000fe073=54(T); Paged(p):000fe073=54(T); Normal(p):0000e073=54(T)
00:00:33:94.06940: BIU T2
00:00:33:94.07036: BIU T3
00:00:33:94.07132: BIU T4
00:00:33:94.07240: BIU T1 Physical(p):000fe074=e6(æ); Paged(p):000fe074=e6(æ); Normal(p):0000e074=e6(æ)
00:00:33:94.07348: BIU T2 f000:e072 (B054)MOV AL,54
00:00:33:94.07444: BIU T3
00:00:33:94.07540: BIU T4
00:00:33:94.07648: BIU T1 Physical(p):000fe075=43(C); Paged(p):000fe075=43(C); Normal(p):0000e075=43(C)
00:00:33:94.07744: BIU T2
00:00:33:94.07840: BIU T3
00:00:33:94.07948: BIU T4
00:00:33:94.08092: BIU T1 Physical(p):000fe076=b0(°); Paged(p):000fe076=b0(°); Normal(p):0000e076=b0(°)
00:00:33:94.08204: BIU T1
00:00:33:94.08316: BIU T2
00:00:33:94.08420: BIU T3
00:00:33:94.08524: BIU T4
00:00:33:94.08648: BIU T1
00:00:33:94.08776: BIU T2
00:00:33:94.08888: BIU T3
00:00:33:94.09040: BIU W
00:00:33:94.09164: BIU T4
00:00:33:94.09264: BIU T4
00:00:33:94.09376: BIU T1 Physical(p):000fe077=12(); Paged(p):000fe077=12(); Normal(p):0000e077=12()
00:00:33:94.09476: BIU T2
00:00:33:94.09604: BIU T3
00:00:33:94.09712: BIU T4
00:00:33:94.09868: BIU T1 Physical(p):000fe078=e6(æ); Paged(p):000fe078=e6(æ); Normal(p):0000e078=e6(æ)
00:00:33:95.00012: BIU T2 f000:e076 (B012)MOV AL,12

It's now actually logging the instruction line on the actual cycle the instruction starts executing, instead of the line when the instruction finishes executing(for cycle logging only, all other logs should be unaffected and log when the instruction is finished).
One interesting thing I see is that on some times, the emulated bus is sometimes applying waitstates(the "BIU W" lines). But when said waitstate (RAM probably) access is performed, somehow the actual memory access is somehow missing from the T1 state?

Edit: Just confirmed. Those are IN/OUT instructions executing. But they aren't logged somehow?

Edit: Some debugger bugfixing later(still checking executed while it shouldn't):

Details

00:00:07:13.05335: BIU T1 Physical(p):000ffff0=ea(ê); Paged(p):000ffff0=ea(ê); Normal(p):00000000=ea(ê)
00:00:07:13.06876: BIU T2
00:00:07:13.07019: BIU T3
00:00:07:13.07157: BIU T4
00:00:07:13.07337: BIU T1 Physical(p):000ffff1=5b([); Paged(p):000ffff1=5b([); Normal(p):00000001=5b([)
00:00:07:13.07475: BIU T2
00:00:07:13.07612: BIU T3
00:00:07:13.07749: BIU T4
00:00:07:13.07926: BIU T1 Physical(p):000ffff2=e0(à); Paged(p):000ffff2=e0(à); Normal(p):00000002=e0(à)
00:00:07:13.08073: BIU T2
00:00:07:13.08231: BIU T3
00:00:07:13.08368: BIU T4
00:00:07:13.08515: BIU T1 Physical(p):000ffff3=00( ); Paged(p):000ffff3=00( ); Normal(p):00000003=00( )
00:00:07:13.08652: BIU T2
00:00:07:13.08788: BIU T3
00:00:07:13.09116: BIU T4
00:00:07:13.09248: BIU T1 Physical(p):000ffff4=f0(ð); Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)
00:00:07:13.09346: BIU T2
00:00:07:13.09445: BIU T3
00:00:07:13.09541: BIU T4
00:00:07:13.09697: BIU T1 ffff:0000 (EA5BE000F0)JMP F000:E05B Physical(p):000fe05b=fa(ú); Paged(p):000fe05b=fa(ú); Normal(p):0000e05b=fa(ú)
00:00:07:13.09796: BIU T2
00:00:07:13.09893: BIU T3
00:00:07:13.09990: BIU T4
00:00:07:14.00098: BIU T1 Physical(p):000fe05c=fc(ü); Paged(p):000fe05c=fc(ü); Normal(p):0000e05c=fc(ü)
00:00:07:14.00224: BIU T2
00:00:07:14.00330: BIU T3
00:00:07:14.00446: BIU T4
00:00:07:14.00641: BIU T1 Physical(p):000fe05d=b0(°); Paged(p):000fe05d=b0(°); Normal(p):0000e05d=b0(°)
00:00:07:14.00750: BIU T2
00:00:07:14.00858: BIU T3
00:00:07:14.00977: BIU T4
00:00:07:14.01151: BIU T1 Physical(p):000fe05e=00( ); Paged(p):000fe05e=00( ); Normal(p):0000e05e=00( )
00:00:07:14.01282: BIU T2
00:00:07:14.01383: BIU T3
00:00:07:14.01504: BIU T4
00:00:07:14.01641: BIU T1 Physical(p):000fe05f=e6(æ); Paged(p):000fe05f=e6(æ); Normal(p):0000e05f=e6(æ)
00:00:07:14.01769: BIU T2 f000:e05b (FA)CLI
00:00:07:14.01867: BIU T3
00:00:07:14.01965: BIU T4
00:00:07:14.02072: BIU T1 Physical(p):000fe060=a0( ); Paged(p):000fe060=a0( ); Normal(p):0000e060=a0( )
00:00:07:14.02193: BIU T2 f000:e05c (FC)CLD
00:00:07:14.02464: BIU T3
00:00:07:14.02564: BIU T4
00:00:07:14.02672: BIU T1 Physical(p):000fe061=ba(º); Paged(p):000fe061=ba(º); Normal(p):0000e061=ba(º)
00:00:07:14.02769: BIU T2
00:00:07:14.02876: BIU T3 f000:e05d (B000)MOV AL,00
00:00:07:14.02973: BIU T4
00:00:07:14.03082: BIU T1 Physical(p):000fe062=d8(Ø); Paged(p):000fe062=d8(Ø); Normal(p):0000e062=d8(Ø)
00:00:07:14.03200: BIU T2
00:00:07:14.03297: BIU T3
00:00:07:14.03395: BIU T4
00:01:59:54.09208: BIU T1 f000:e05f (E6A0)OUT A0,AL Physical(p):000fe063=03(); Paged(p):000fe063=03(); Normal(p):0000e063=03()
00:04:06:34.00144: BIU T1
00:05:36:98.02816: BIU T2
00:05:49:13.05808: BIU T3
00:05:50:61.00688: BIU T4
00:06:56:58.04352: BIU T1
00:07:10:39.09040: BIU T2
00:07:10:39.09296: BIU T3
00:07:10:39.09552: BIU W
00:07:10:39.09744: BIU T4
00:07:10:39.09968: BIU T4
00:07:14:70.06592: BIU T1 Physical(p):000fe064=ee(î); Paged(p):000fe064=ee(î); Normal(p):0000e064=ee(î)
00:07:14:70.06784: BIU T2
00:07:14:70.06912: BIU T3
00:07:14:70.07008: BIU T4
00:07:14:70.07104: BIU T1 Physical(p):000fe065=b2(²); Paged(p):000fe065=b2(²); Normal(p):0000e065=b2(²)
00:07:14:70.07264: BIU T2 f000:e061 (BAD803)MOV DX,03D8
00:07:14:70.07360: BIU T3
00:07:15:87.05936: BIU T4
00:07:15:87.06160: BIU T1 Physical(p):000fe066=b8(¸); Paged(p):000fe066=b8(¸); Normal(p):0000e066=b8(¸)
00:07:15:87.06256: BIU T2 f000:e064 (EE)OUT DX,AL
00:07:15:87.06352: BIU T3
00:07:15:87.07024: BIU T4
00:07:15:87.07120: BIU T1
00:07:15:87.07216: BIU T2
00:07:15:87.07344: BIU T3
00:07:15:87.07440: BIU W
00:07:15:87.07536: BIU T4
00:07:15:87.07632: BIU T4
00:07:16:44.05536: BIU T1 Physical(p):000fe067=fe(þ); Paged(p):000fe067=fe(þ); Normal(p):0000e067=fe(þ)
00:07:16:44.05760: BIU T2
00:07:16:44.05856: BIU T3
00:07:16:44.05952: BIU T4
00:07:16:44.06176: BIU T1 f000:e065 (B2B8)MOV DL,B8 Physical(p):000fe068=c0(À); Paged(p):000fe068=c0(À); Normal(p):0000e068=c0(À)

Now that makes more sense. Still, why would it be logging T1 twice when starting the first OUT instruction. And again double T4 when ending that last OUT DX,AL instruction... Weird?
Edit: OK. After looking at the BIU's code I now see what's happening: that's one cycle active(T1?) then two cycles stalling. But it looks like it's somehow misbehaving? It looks like a 1 cycle stall only and no cycles ticked?

Edit: OK. T1 is ticked normally. On T2, a stall for 2 cycles is requested. The CPU records it, but doesn't see that the BIU needs to be ticked in that case!

Edit: Ah. Bug found in the CPU. When a bus stall was requested(the 2nd cycle of the OUT instructions), it's now waiting for T1 again(when not T1 already), then stalls the bus for 2 cycles, then starts the port transfer.

Details

00:00:55:09.08580: BIU T1 Physical(p):000ffff0=ea(ê); Paged(p):000ffff0=ea(ê); Normal(p):00000000=ea(ê)
00:00:55:09.09980: BIU T2
00:00:55:10.00124: BIU T3
00:00:55:10.00788: BIU T4
00:00:55:10.00940: BIU T1 Physical(p):000ffff1=5b([); Paged(p):000ffff1=5b([); Normal(p):00000001=5b([)
00:00:55:10.01076: BIU T2
00:00:55:10.01216: BIU T3
00:00:55:10.01376: BIU T4
00:00:55:10.01524: BIU T1 Physical(p):000ffff2=e0(à); Paged(p):000ffff2=e0(à); Normal(p):00000002=e0(à)
00:00:55:10.01664: BIU T2
00:00:55:10.01800: BIU T3
00:00:55:10.01948: BIU T4
00:00:55:10.02100: BIU T1 Physical(p):000ffff3=00( ); Paged(p):000ffff3=00( ); Normal(p):00000003=00( )
00:00:55:10.02240: BIU T2
00:00:55:10.02396: BIU T3
00:00:55:10.02748: BIU T4
00:00:55:10.02860: BIU T1 Physical(p):000ffff4=f0(ð); Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)
00:00:55:10.02956: BIU T2
00:00:55:10.03072: BIU T3
00:00:55:10.03176: BIU T4
00:00:55:10.03328: BIU T1 ffff:0000 (EA5BE000F0)JMP F000:E05B Physical(p):000fe05b=fa(ú); Paged(p):000fe05b=fa(ú); Normal(p):0000e05b=fa(ú)
00:00:55:10.03444: BIU T2
00:00:55:10.03636: BIU T3
00:00:55:10.03740: BIU T4
00:00:55:10.03848: BIU T1 Physical(p):000fe05c=fc(ü); Paged(p):000fe05c=fc(ü); Normal(p):0000e05c=fc(ü)
00:00:55:10.03948: BIU T2
00:00:55:10.04084: BIU T3
00:00:55:10.04188: BIU T4
00:00:55:10.04296: BIU T1 Physical(p):000fe05d=b0(°); Paged(p):000fe05d=b0(°); Normal(p):0000e05d=b0(°)
00:00:55:10.04396: BIU T2
00:00:55:10.04492: BIU T3
00:00:55:10.04604: BIU T4
00:00:55:10.04712: BIU T1 Physical(p):000fe05e=00( ); Paged(p):000fe05e=00( ); Normal(p):0000e05e=00( )
00:00:55:10.04808: BIU T2
00:00:55:10.04904: BIU T3
00:00:55:10.05004: BIU T4
00:00:55:10.05116: BIU T1 Physical(p):000fe05f=e6(æ); Paged(p):000fe05f=e6(æ); Normal(p):0000e05f=e6(æ)
00:00:55:10.05216: BIU T2 f000:e05b (FA)CLI
00:00:55:10.05312: BIU T3
00:00:55:10.05600: BIU T4
00:00:55:10.05708: BIU T1 Physical(p):000fe060=a0( ); Paged(p):000fe060=a0( ); Normal(p):0000e060=a0( )
00:00:55:10.05820: BIU T2 f000:e05c (FC)CLD
00:00:55:10.05920: BIU T3
00:00:55:10.06012: BIU T4
00:00:55:10.06120: BIU T1 Physical(p):000fe061=ba(º); Paged(p):000fe061=ba(º); Normal(p):0000e061=ba(º)
00:00:55:10.06216: BIU T2
00:00:55:10.06316: BIU T3 f000:e05d (B000)MOV AL,00
00:00:55:10.06508: BIU T4
00:00:55:10.06632: BIU T1 Physical(p):000fe062=d8(Ø); Paged(p):000fe062=d8(Ø); Normal(p):0000e062=d8(Ø)
00:00:55:10.06728: BIU T2
00:00:55:10.06816: BIU T3
00:00:55:10.06912: BIU T4
00:00:55:10.07028: BIU T1 f000:e05f (E6A0)OUT A0,AL Physical(p):000fe063=03(); Paged(p):000fe063=03(); Normal(p):0000e063=03()
00:02:27:55.06272: BIU T2
00:02:35:86.05984: BIU T3
00:03:11:41.07088: BIU T4
00:03:20:61.05120: BIU --
00:04:16:45.04432: BIU --
00:04:37:01.09904: BIU T1
00:04:39:49.08272: BIU T2
00:04:39:49.08592: BIU T3
00:04:39:49.08848: BIU W
00:04:39:49.09104: BIU T4
00:04:39:49.09360: BIU T4
00:04:39:49.09648: BIU T1 Physical(p):000fe064=ee(î); Paged(p):000fe064=ee(î); Normal(p):0000e064=ee(î)
00:04:39:49.09904: BIU T2
00:04:39:50.00192: BIU T3
00:04:39:50.00512: BIU T4
00:04:39:50.00896: BIU T1 Physical(p):000fe065=b2(²); Paged(p):000fe065=b2(²); Normal(p):0000e065=b2(²)
00:04:39:50.01216: BIU T2 f000:e061 (BAD803)MOV DX,03D8
00:04:39:50.01472: BIU T3
00:04:39:50.01728: BIU T4
00:04:39:50.01984: BIU T1 Physical(p):000fe066=b8(¸); Paged(p):000fe066=b8(¸); Normal(p):0000e066=b8(¸)
00:04:39:50.02272: BIU T2 f000:e064 (EE)OUT DX,AL
00:04:39:50.02496: BIU T3
00:04:39:50.02752: BIU T4
00:04:39:50.03008: BIU T1
00:04:39:50.03232: BIU T2
00:04:39:50.03488: BIU T3
00:04:39:50.03744: BIU W
00:04:39:50.03968: BIU T4
00:04:39:50.04224: BIU T4
00:04:39:50.04576: BIU T1 Physical(p):000fe067=fe(þ); Paged(p):000fe067=fe(þ); Normal(p):0000e067=fe(þ)
00:04:39:50.04896: BIU T2
00:04:39:50.05152: BIU T3
00:04:39:50.05408: BIU T4
00:04:39:50.05728: BIU T1 f000:e065 (B2B8)MOV DL,B8 Physical(p):000fe068=c0(À); Paged(p):000fe068=c0(À); Normal(p):0000e068=c0(À)

So now the bus stall is working properly again.
Now there's still those strange double T4 states there, at the end of the OUT instructions?

Edit: After some investigation, it's a new case that wasn't there before. This is the specific case where the execution state handler of the OUT instruction is called once more for making it terminate the instruction normally. But since it does nothing more than read the result from the BIU and finishing the instruction, said operation ends in the same cycle as the last cycle that was executed(only UniPCemu has to return to the instruction handler because it doesn't know it's finished doing it's job). And once the instruction handler finishes up(reading the result from the BIU), it returns to the BIU without any cycles spent(it doesn't emulate any cycles in that case).
Then the CPU core sees that and finishes up without calling the BIU cycle timing(since there are no cycles to spend on it). The debugger then came by to log whatever the CPU/BIU was doing(in this case the BIU) and blindly logged the cycle that the BIU was executing the last time(which was the T4 cycle). So it just logs a second T4 cycle, while in effect only one T4 cycle has occured(which was the previous cycle, before the instruction was in it's finish up and be terminated state).

Now, adjusting the CPU for that was easy(simply keep a flag if the BIU is actually ticked or not). Then the debugger can act based on that, sees that it didn't actually tick anything, thus don't log anything invalid.

This is the new result:

Details

00:00:03:87.00358: BIU T1 Physical(p):000ffff0=ea(ê); Paged(p):000ffff0=ea(ê); Normal(p):00000000=ea(ê)
00:00:03:87.01457: BIU T2
00:00:03:87.01599: BIU T3
00:00:03:87.01736: BIU T4
00:00:03:87.01935: BIU T1 Physical(p):000ffff1=5b([); Paged(p):000ffff1=5b([); Normal(p):00000001=5b([)
00:00:03:87.02159: BIU T2
00:00:03:87.02319: BIU T3
00:00:03:87.02456: BIU T4
00:00:03:87.02604: BIU T1 Physical(p):000ffff2=e0(à); Paged(p):000ffff2=e0(à); Normal(p):00000002=e0(à)
00:00:03:87.02741: BIU T2
00:00:03:87.02877: BIU T3
00:00:03:87.03013: BIU T4
00:00:03:87.03180: BIU T1 Physical(p):000ffff3=00( ); Paged(p):000ffff3=00( ); Normal(p):00000003=00( )
00:00:03:87.03317: BIU T2
00:00:03:87.03453: BIU T3
00:00:03:87.03789: BIU T4
00:00:03:87.03901: BIU T1 Physical(p):000ffff4=f0(ð); Paged(p):000ffff4=f0(ð); Normal(p):00000004=f0(ð)
00:00:03:87.04000: BIU T2
00:00:03:87.04098: BIU T3
00:00:03:87.04223: BIU T4
00:00:03:87.04367: BIU T1 ffff:0000 (EA5BE000F0)JMP F000:E05B Physical(p):000fe05b=fa(ú); Paged(p):000fe05b=fa(ú); Normal(p):0000e05b=fa(ú)
00:00:03:87.04465: BIU T2
00:00:03:87.04562: BIU T3
00:00:03:87.04659: BIU T4
00:00:03:87.04766: BIU T1 Physical(p):000fe05c=fc(ü); Paged(p):000fe05c=fc(ü); Normal(p):0000e05c=fc(ü)
00:00:03:87.04863: BIU T2
00:00:03:87.05436: BIU T3
00:00:03:87.05533: BIU T4
00:00:03:87.05640: BIU T1 Physical(p):000fe05d=b0(°); Paged(p):000fe05d=b0(°); Normal(p):0000e05d=b0(°)
00:00:03:87.05737: BIU T2
00:00:03:87.05834: BIU T3
00:00:03:87.05931: BIU T4
00:00:03:87.06038: BIU T1 Physical(p):000fe05e=00( ); Paged(p):000fe05e=00( ); Normal(p):0000e05e=00( )
00:00:03:87.06136: BIU T2
00:00:03:87.06232: BIU T3
00:00:03:87.06332: BIU T4
00:00:03:87.06438: BIU T1 Physical(p):000fe05f=e6(æ); Paged(p):000fe05f=e6(æ); Normal(p):0000e05f=e6(æ)
00:00:03:87.06542: BIU T2 f000:e05b (FA)CLI
00:00:03:87.06640: BIU T3
00:00:03:87.06737: BIU T4
00:00:03:87.06844: BIU T1 Physical(p):000fe060=a0( ); Paged(p):000fe060=a0( ); Normal(p):0000e060=a0( )
00:00:03:87.06947: BIU T2 f000:e05c (FC)CLD
00:00:03:87.07046: BIU T3
00:00:03:87.07144: BIU T4
00:00:03:87.07250: BIU T1 Physical(p):000fe061=ba(º); Paged(p):000fe061=ba(º); Normal(p):0000e061=ba(º)
00:00:03:87.07347: BIU T2
00:00:03:87.07452: BIU T3 f000:e05d (B000)MOV AL,00
00:00:03:87.07549: BIU T4
00:00:03:87.07656: BIU T1 Physical(p):000fe062=d8(Ø); Paged(p):000fe062=d8(Ø); Normal(p):0000e062=d8(Ø)
00:00:03:87.07753: BIU T2
00:00:03:87.07850: BIU T3
00:00:03:87.07947: BIU T4
00:00:03:87.08063: BIU T1 f000:e05f (E6A0)OUT A0,AL Physical(p):000fe063=03(); Paged(p):000fe063=03(); Normal(p):0000e063=03()
00:00:03:87.08188: BIU T2
00:00:03:87.08284: BIU T3
00:00:03:87.08381: BIU T4
00:00:03:87.08478: BIU --
00:00:03:87.08575: BIU --
00:00:03:87.08677: BIU T1
00:00:03:87.08774: BIU T2
00:00:03:87.08871: BIU T3
00:00:03:87.08968: BIU W
00:00:03:87.09065: BIU T4
00:01:00:61.01596: BIU T1 Physical(p):000fe064=ee(î); Paged(p):000fe064=ee(î); Normal(p):0000e064=ee(î)
00:01:00:61.02060: BIU T2
00:01:00:61.02392: BIU T3
00:01:00:61.02712: BIU T4
00:01:00:61.03064: BIU T1 Physical(p):000fe065=b2(²); Paged(p):000fe065=b2(²); Normal(p):0000e065=b2(²)
00:01:00:61.03448: BIU T2 f000:e061 (BAD803)MOV DX,03D8
00:01:00:61.03772: BIU T3
00:01:00:61.04100: BIU T4
00:01:00:61.04524: BIU T1 Physical(p):000fe066=b8(¸); Paged(p):000fe066=b8(¸); Normal(p):0000e066=b8(¸)
00:01:00:61.04888: BIU T2 f000:e064 (EE)OUT DX,AL
00:01:00:61.05212: BIU T3
00:01:00:61.05564: BIU T4
00:01:00:61.05904: BIU T1
00:01:00:61.06224: BIU T2
00:01:00:61.06556: BIU T3
00:01:00:61.06876: BIU W
00:01:00:61.07196: BIU T4
00:01:00:61.07568: BIU T1 Physical(p):000fe067=fe(þ); Paged(p):000fe067=fe(þ); Normal(p):0000e067=fe(þ)
00:01:00:61.07888: BIU T2
00:01:00:61.08208: BIU T3
00:01:00:61.08536: BIU T4
00:01:00:61.08920: BIU T1 f000:e065 (B2B8)MOV DL,B8 Physical(p):000fe068=c0(À); Paged(p):000fe068=c0(À); Normal(p):0000e068=c0(À)

As can be seen, the OUT A0,AL instruction starts executing, lets the BIU tick one clock(starting a memory read into the PIQ), then waits for T1, performs 2 idle bus cycles and finally executes the normal port access on the bus.
Is that correct behaviour?

Edit: 8088 MPH now reports 1588 metric cycles.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 181 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

A little thing to get back to: what does the BIU actually do on the second cycle and up, just before the T1 state of the OUT(E6h) transfer? UniPCemu has done a T1 state if it was idle on the first cycle, any other T-state or W-state otherwise.
On that second cycle, UniPCemu requests a buss stall for 2 cycles, which makes the BIU finish the current operation(ticking until and including T4 if not T1 yet), then idles the bus for 2 cycles before starting the port I/O cycles. Is that correct? Or is it supposed to tick the BIU normally like the first cycle or count the cycles until T4 as part of the --- cycles but not acting on it(so if those happen on T3, for those two cycles requested, it will tick T3,T4, then start the port transfer T1)?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 182 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

Just for confirmation, what happens when the opcode E6 is executed?

Your log says:

10A89 IC...  10A89 EB 00 FC 7 .......  T1                      I
60A89 .C... 10A89 FF 00 FC 7 ..r.... T2
60A00 Sp... 10A89 00 00 FC 6 ..r.... T3 00 <-f [ 10A89] S E6E0 OUT E0, AL
60A00 .C... 10A89 00 00 FC 6 ....... T4
10A8A .C... 10A8A 00 00 FC 7 ....... T1
60A8A .C... 10A8A FD 00 FC 7 ..r.... T2
60ACD .p... 10A8A CD 00 FC 6 ..r.... T3 CD <-f [ 10A8A]
60ACD .W... 10A8A CD 00 FC 6 ....... T4
000E0 .W... 000E0 CD 00 FC 7 ....... T1
60000 .W... 000E0 00 00 FC 7 .W..... T2
60000 .W... 000E0 00 00 FC 6 .W..... T3
60000 .p... 000E0 00 00 FC 6 .W..... Tw 00 --> port[00E0]
60000 IC... 000E0 00 00 FC 7 ....... T4 I
10A8B .C... 10A8B 00 00 FC 7 ....... T1
60A8B SC... 10A8B DD 00 FC 6 ..r.... T2 S EB00 JMP IP+00
60AFF .p... 10A8B FF 00 FC 6 ..r.... T3 FF <-f [ 10A8B]
60AFF .p... 10A8B FF 00 FC 7 ....... T4
60AFF .p... 10A8B FF 00 FC 7 .......
60AFF .p... 10A8B FF 00 FC 6 .......
400AF .p... 1008B FF 00 FC 6 .......
400AF .p... 1008B FF 00 FC 7 .......
400AF Ep... 1008B FF 00 FC 7 ....... E

So:

10A89 IC...  10A89 EB 00 FC 7 .......  T1                      I
60A89 .C... 10A89 FF 00 FC 7 ..r.... T2 <- This is where UniPCemu fetches the last byte of the instruction from the PIQ into the EU.
60A00 Sp... 10A89 00 00 FC 6 ..r.... T3 00 <-f [ 10A89] S E6E0 OUT E0, AL <- This is UniPCemu's first cycle being run normally? This is the point where UniPCemu now logs the instruction.
60A00 .C... 10A89 00 00 FC 6 ....... T4 <- Here, UniPCemu waits for T1 first(part of the BIU delay algorithm)... This T1 waiting always happens. If it's already T1 ready(idle), the idle states start here instead of below.
10A8A .C... 10A8A 00 00 FC 7 ....... T1 <- UniPCemu reaches T1 and starts it's 2 cycle BIU delay. This is the first cycle of that. This is the --- cycle.
60A8A .C... 10A8A FD 00 FC 7 ..r.... T2 <- The 2nd cycle of UniPCemu's BIU delay. Another --- cycle.
60ACD .p... 10A8A CD 00 FC 6 ..r.... T3 CD <-f [ 10A8A] <- Here, UniPCemu requests the port write, which will happen when it reaches T1 again.
60ACD .W... 10A8A CD 00 FC 6 ....... T4
000E0 .W... 000E0 CD 00 FC 7 ....... T1
60000 .W... 000E0 00 00 FC 7 .W..... T2
60000 .W... 000E0 00 00 FC 6 .W..... T3
60000 .p... 000E0 00 00 FC 6 .W..... Tw 00 --> port[00E0]
60000 IC... 000E0 00 00 FC 7 ....... T4 I
10A8B .C... 10A8B 00 00 FC 7 ....... T1
60A8B SC... 10A8B DD 00 FC 6 ..r.... T2 S EB00 JMP IP+00
60AFF .p... 10A8B FF 00 FC 6 ..r.... T3 FF <-f [ 10A8B]
60AFF .p... 10A8B FF 00 FC 7 ....... T4
60AFF .p... 10A8B FF 00 FC 7 .......
60AFF .p... 10A8B FF 00 FC 6 .......
400AF .p... 1008B FF 00 FC 6 .......
400AF .p... 1008B FF 00 FC 7 .......
400AF Ep... 1008B FF 00 FC 7 ....... E

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 183 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

The double T1 and T4 are actually the BIU delays(NOT bus delays) that are doing their magic. UniPCemu has 2 kinds of delays regarding the BIU and bus: BIU stalls(make the BIU not start a T1 cycle for n cycles) and bus delays(the --- cycles, for n cycles). Bus delays in effect make it wait for T1(if the last wasn't T4 already) and on T1 starts the requested amount of idle bus cycles. Of course, DMA can actually start on these cycles (as the CPU has released the bus after T4).

So the following does seem to have a 2-cycle bus stall:

60AFF Ip...  10A87 FF 00 FC 6 .......                          I
60AF0 .C... 10A87 FF 00 FC 6 .......
10A88 SC... 10A88 FF 00 FC 7 ....... T1 S E6E0 OUT E0, AL <- UniPCemu has a 1-cycle normal execution here. Starting a prefetch here...
60A88 .C... 10A88 FF 00 FC 7 ..r.... T2 <- UniPCemu starts waiting for T1
60ACD .p... 10A88 CD 00 FC 6 ..r.... T3 CD <-f [ 10A88] <- Still waiting for T1
60ACD .p... 10A88 CD 00 FC 6 ....... T4 <- Still waiting for T1
50A89 .p... 10A88 CD 00 FC 7 ....... <- UniPCemu stalling here 1/2 because T1 was reached.
50A89 .W... 10A88 CD 00 FC 7 ....... <- another stall with UniPCemu 2/2
000E0 .W... 000E0 CD 00 FC 6 ....... T1 <- The I/O port access after the stall starts.
60000 .W... 000E0 00 00 FC 6 .W..... T2
60000 .W... 000E0 00 00 FC 7 .W..... T3
60000 .p... 000E0 00 00 FC 7 .W..... Tw 00 --> port[00E0]
60000 IC... 000E0 00 00 FC 6 ....... T4 I
10A89 .C... 10A89 00 00 FC 6 ....... T1
60A89 SC... 10A89 DD 00 FC 7 ..r.... T2 S EB00 JMP IP+00
60AFF .p... 10A89 FF 00 FC 7 ..r.... T3 FF <-f [ 10A89]
60AFF .p... 10A89 FF 00 FC 6 ....... T4
60AFF .p... 10A89 FF 00 FC 6 .......
60AFF .p... 10A89 FF 00 FC 7 .......
4008F .p... 10089 FF 00 FC 7 .......
4008F .p... 10089 FF 00 FC 6 .......
4008F Ep... 10089 FF 00 FC 6 ....... E

But another doesn't?

10A89 IC...  10A89 EB 00 FC 7 .......  T1                      I
60A89 .C... 10A89 FF 00 FC 7 ..r.... T2
60A00 Sp... 10A89 00 00 FC 6 ..r.... T3 00 <-f [ 10A89] S E6E0 OUT E0, AL <- first cycle normally. This is parsed normally by UniPCemu?
60A00 .C... 10A89 00 00 FC 6 ....... T4 <- waiting for stall on T1? Same as UniPCemu would.
10A8A .C... 10A8A 00 00 FC 7 ....... T1 <- Huh? No 2-cycle bus idle here? UniPCemu would have the bus idlehere(1/2)?
60A8A .C... 10A8A FD 00 FC 7 ..r.... T2 <- Same with this. This should be another bus stall?
60ACD .p... 10A8A CD 00 FC 6 ..r.... T3 CD <-f [ 10A8A] <- The T1 of the port I/O should happen here with UniPCemu?
60ACD .W... 10A8A CD 00 FC 6 ....... T4
000E0 .W... 000E0 CD 00 FC 7 ....... T1
60000 .W... 000E0 00 00 FC 7 .W..... T2
60000 .W... 000E0 00 00 FC 6 .W..... T3
60000 .p... 000E0 00 00 FC 6 .W..... Tw 00 --> port[00E0]
60000 IC... 000E0 00 00 FC 7 ....... T4 I
10A8B .C... 10A8B 00 00 FC 7 ....... T1
60A8B SC... 10A8B DD 00 FC 6 ..r.... T2 S EB00 JMP IP+00
60AFF .p... 10A8B FF 00 FC 6 ..r.... T3 FF <-f [ 10A8B]
60AFF .p... 10A8B FF 00 FC 7 ....... T4

Can you explain what's happening here, Reenigne?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 184 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

Just have been looking at xtce's code. What's the whole deal with the tFirstIdle and tSecondIdle cycles?
Does it have something to do with this current UniPCemu idle bus cycle problem?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 185 of 198, by reenigne

User metadata
Rank Oldbie
Rank
Oldbie
superfury wrote on 2020-05-24, 13:19:

Just have been looking at xtce's code. What's the whole deal with the tFirstIdle and tSecondIdle cycles?
Does it have something to do with this current UniPCemu idle bus cycle problem?

I don't remember the exact details, but in the process of testing XTCE against XTServer+ISA bus sniffer, I must have found a set of testcases which required distinguishing between the first, second and subsequent idle cases. It is quite possible that there is a simpler model of the 8088 which gives the same output but doesn't require distinguishing between these states. Almost certainly there is a simpler model because the code stuff in busInit() doesn't make any sense for a real CPU to be doing (I'd expect that function to be a few lines which don't depend on _accessNumber, plus some wait() calls scattered around in other places, but haven't yet found a set of changes which allows such simplifications to be made without changing the timing).

Reply 186 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++
reenigne wrote on 2020-05-24, 19:53:
superfury wrote on 2020-05-24, 13:19:

Just have been looking at xtce's code. What's the whole deal with the tFirstIdle and tSecondIdle cycles?
Does it have something to do with this current UniPCemu idle bus cycle problem?

I don't remember the exact details, but in the process of testing XTCE against XTServer+ISA bus sniffer, I must have found a set of testcases which required distinguishing between the first, second and subsequent idle cases. It is quite possible that there is a simpler model of the 8088 which gives the same output but doesn't require distinguishing between these states. Almost certainly there is a simpler model because the code stuff in busInit() doesn't make any sense for a real CPU to be doing (I'd expect that function to be a few lines which don't depend on _accessNumber, plus some wait() calls scattered around in other places, but haven't yet found a set of changes which allows such simplifications to be made without changing the timing).

OK.

Also, can you explain what happens during the first 2/3 cycles of that OUT instruction? As UniPCemu currently runs it, that weird case without the idle bus cycles will never happen. The first cycle ticks normally. Then, it waits for T1 and on that cycle executes two idle bus states, before starting the T1-T3-Tw-T4 cycles of the actual port write.
My question now is: why doesn't those idle bus cycles happen after T4 in your sniffer log? What's going on there? In your log, the idle states don't happen in one of the two cases, as well (my last post explanation with my comments added) as it being a normal T1-T4 second memory prefetch access?

Edit: So when starting at T1, the first cycle ticks it to T2. Then it waits for T4 to finish before ticking 2 idle bus cycles(an abort?) and starts the port write cycles.

But when at T3(or perhaps any non-T1 cycle starting), the first cycle ticks it to T4(perhaps even T3 as well as T4 does this?), so it decides to let it prefetch instead of two idle cycles, before performing the bus write cycles?
So, it decides that on the second cycle? Either wait for T4 to finish, 2 idle cycles and then start the transfer, or when not at T2, wait for T4 to finish, 1 active cycle and then start the transfer?

Last edited by superfury on 2020-05-24, 22:04. Edited 2 times in total.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 187 of 198, by reenigne

User metadata
Rank Oldbie
Rank
Oldbie
superfury wrote on 2020-05-24, 21:15:
Also, can you explain what happens during the first 2/3 cycles of that OUT instruction? As UniPCemu currently runs it, that weir […]
Show full quote

Also, can you explain what happens during the first 2/3 cycles of that OUT instruction? As UniPCemu currently runs it, that weird case without the idle bus cycles will never happen. The first cycle ticks normally. Then, it waits for T1 and on that cycle executes two idle bus states, before starting the T1-T3-Tw-T4 cycles of the actual port write.
My question now is: why doesn't those idle bus cycles happen after T4 in your sniffer log? What's going on there? In your log, the idle states don't happen in one of the two cases, as well (my last post explanation with my comments added) as it being a normal T1-T4 second memory prefetch access?

Edit: So when starting at T1, the first cycle ticks it to T2. Then it waits for T4 to finish before ticking 2 idle bus cycles(an abort?) and starts the port write cycles.

But when at T3(or perhaps any non-T1 cycle starting), the first cycle ticks it to T4(perhaps even T3 as well as T4 does this?), so it decides to let it prefetch instead of two idle cycles, before performing the bus write cycles?
So, it decides that on the second cycle? Either wait for T4 to finish, 2 idle cycles and then start the transfer, or when not at T2, wait for T4 to finish, 1 active cycle and then start the transfer?

That sounds right to me. Though I don't remember the details. I don't think I knew about the abort behaviour when writing XTCE, but I did notice that there never seemed to be 1 cycle of idle between T4 and T1, only 0 or 2 or more. I now think that is probably the abort effect, and it might simplify XTCE if I model it as an abort.

Reply 188 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

Opcode EE seems even more weird: wait for T4 to finish, then 5 idle bus states, then transfer?
Edit: Perhaps if we add the abort effect(already T1 after the first cycle, so immediately idle for 2 cycles on the second cycle), that leaves 3 unknown idle cycles that happen for some unknown reason... Perhaps a repeat of the previous 2 steps(1 cycle active(idle because prefetch filled?, 2 cycles idle)? Or perhaps just 3 always-idle cycles?

Edit: Hmmm... The first 2 steps are unchanging it seems(1 cycle wait, 2 cycles wait). But after that it gets weird?

Edit: After some observation, I see the following seeming to happen:
T4 starting instruction: takes 2 cycles. Then 4 cycles delay on the EU before starting the transfer.
non-T4 starting instruction: takes 1 cycle. Then 4 cycles delay on the EU before starting the transfer?
So combining the two:
So only the first delay is 2 cycles with T4(1 cycle otherwise). Then a 4 cycle delay(allowing prefetching) before starting the transfer?

Opcode EC seems to be a weird one with it's delays? It seems to use the same start as the OUT version, but can have a case of 3 idle cycles? Although more T-state examples are needed to be sure(not enough sniffer logs). To be exact, what happens when it's T1-T4?

Edit: Just implemented said behaviour in UniPCemu, except the opcode EC/ED behaviour, which is still unknown.
Edit: It now drops to 1585 metric cycles(12 CPU cycles less again).

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 189 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

All that currently might be cycle-inaccurate are the JMP(non-conditional only)/CALL opcodes afaik. All other opcodes should actually match Reenigne's emulation.

Edit: Implemented all those JMP/CALL opcode timings now(with the same timings, derived from Reenigne's xtce.h).

8088 MPH now says it's at 1585/1586 metric cycles.

Edit: Thinking about it, the only thing that might be left now is the string instructions themselves, which have the total timings at the end of the instruction always. Although those timings were actually derrived from xtce.h, they are implemented as a single wait(n) instruction at the end of the command, not in between transfers(as in your emulation).
Those timings only depend on whether it's already repeating, starting a repeat or not repeating(those 3 cases each have an unchanging cycle count).

Edit: Just implemented the string instruction timings according to xtce.h. Now the metric cycle count jumps way down to 1503? It looks that the more accuracy I add, the lower the metric cycle count gets?
Edit: The credits now crash with those timings?
Edit: After fixing the CMPS/SCAS instruction timing to parse the final timing correctly, it still crashes the credits?
Edit: Found a bug in the 808X MOVSW timing...
Edit: But that doesn't fix the credits either. It has some EU timing shortage on the credits now it seems?
Edit: Then fixed the CMPS/SCAS final timings to check the REP prefix correctly, but it still crashes?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 190 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

Does the credits of 8088 MPH use a REP prefix?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 191 of 198, by reenigne

User metadata
Rank Oldbie
Rank
Oldbie
superfury wrote on 2020-05-26, 16:36:

Does the credits of 8088 MPH use a REP prefix?

Not during execution of the actual effect. It does in the init code, but timings being off there shouldn't affect anything. There must be something else broken.

Reply 192 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

So the issue is probably that the non-REP string instructions take less time than what they did before. Although, looking at your code, I can't see anything taking much less than the original UniPCemu timings...
The original timings were:
MOVS: 4. Now 3 after writing the data to memory. This might not be enough for anything to get prefetched?
CMPS: 8. Now 7 +1 if idle bus cycle after the memory write. Maybe slightly less, but still worth 1 prefetch.
STOS: 4. Now 4 cycles after the write. So the same.
LODS: 5. Now 4 for non-REP after the read. Still 1 prefetch cycle.
SCAS: 7. Now 5 after the read. But still a memory prefetch cycle.
With after the read/write I mean the T1(or idle bus) cycle after the read/write completes on the BIU.

But all of them still pretty much force 1 or maybe 2(CMPS) prefetch cycle(s) because they're idling the BIU at least 1 cycle(and in CMPS another cycle after that). MOVS is the only one out of those that doesn't guarantee 1 prefetch time in total, but it should, since the prefetch starts immediately after the T4 cycle of it's memory access.

Then why is it still crashing with the new timings? Is perhaps another unknown instruction not taking enough time to make it prefetch enough?

Of course, said timings don't take the pre-instruction timings for the REP handling into account.

This is the pre-instruction timing:

	if ((CPU[activeCPU].is0Fopcode == 0) && newREP) //REP instruction affected?
{
switch (OP) //Check for string instructions!
{
case 0xA4:
case 0xA5: //MOVS
case 0xAC:
case 0xAD: //LODS
if ((CPU[activeCPU].preinstructiontimingnotready == 0) && (CPU[activeCPU].repeating == 0)) //To start ready timing?
{
CPU[activeCPU].cycles_OP += 1; //1 cycle for starting REP MOVS!
}
if ((CPU[activeCPU].preinstructiontimingnotready==0) && (CPU[activeCPU].repeating==0)) //To start ready timing?
{
CPU[activeCPU].preinstructiontimingnotready = 1; //Timing not ready for the instruction to execute?
CPU[activeCPU].executed = 0; //Not executed yet!
goto BIUWaiting; //Wait for the BIU to finish!
}
CPU[activeCPU].preinstructiontimingnotready = 2; //Finished and ready to check for cycles now!
if (BIU_getcycle() == 0) CPU[activeCPU].cycles_OP += 1; //1 cycle for idle bus?
if (((OP == 0xA4) || (OP == 0xA5)) && gotREP && (CPU[activeCPU].repeating==0)) //REP MOVS starting?
{
CPU[activeCPU].cycles_OP += 1; //1 cycle for starting REP MOVS!
}
break;
case 0xA6:
case 0xA7: //CMPS
case 0xAE:
case 0xAF: //SCAS
if (CPU[activeCPU].repeating == 0) CPU[activeCPU].cycles_OP += 1; //1 cycle for non-REP!
break;
case 0xAA:
case 0xAB: //STOS
if ((CPU[activeCPU].preinstructiontimingnotready == 0) && (CPU[activeCPU].repeating == 0)) //To start ready timing?
{
CPU[activeCPU].cycles_OP += 1; //1 cycle for non-REP!
}
if ((CPU[activeCPU].preinstructiontimingnotready == 0) && (CPU[activeCPU].repeating==0)) //To start ready timing?
{
CPU[activeCPU].preinstructiontimingnotready = 1; //Timing not ready for the instruction to execute?
CPU[activeCPU].executed = 0; //Not executed yet!
goto BIUWaiting; //Wait for the BIU to finish!
}
CPU[activeCPU].preinstructiontimingnotready = 2; //Finished and ready to check for cycles now!
if (BIU_getcycle() == 0) CPU[activeCPU].cycles_OP += 1; //1 cycle for idle bus?
if (gotREP && (CPU[activeCPU].repeating==0)) CPU[activeCPU].cycles_OP += 1; //1 cycle for REP prefix used!
break;
default: //Not a string instruction?
break;
}
}

//Perform REPaction timing before instructions!
if (gotREP) //Starting/continuing a REP instruction? Not finishing?
{
//Don't apply finishing timing, this is done automatically!
CPU[activeCPU].cycles_OP += 2; //rep active!
//Check for pending interrupts could be done here?
if (unlikely(blockREP)) ++CPU[activeCPU].cycles_OP; //1 cycle for CX=0 or interrupt!
else //Normally repeating?
Show last 25 lines
		{
CPU[activeCPU].cycles_OP += 2; //CX used!
if (newREP) CPU[activeCPU].cycles_OP += 2; //New REP takes two cycles!
switch (OP)
{
case 0xAC:
case 0xAD: //LODS
CPU[activeCPU].cycles_OP += 2; //2 cycles!
case 0xA4:
case 0xA5: //MOVS
case 0xAA:
case 0xAB: //STOS
CPU[activeCPU].cycles_OP += 1; //1 cycle!
break;
case 0xA6:
case 0xA7: //CMPS
case 0xAE:
case 0xAF: //SCAS
CPU[activeCPU].cycles_OP += 2; //2 cycles!
break;
default: //Unknown timings?
break;
}
}
}

That's the code handling the pre-execution state(as far as I could infer from your code, that's everyting until and including the repAction() function call).

gotREP is a REP/REPZ/REPNZ prefix being used. blockREP is set when (E)CX is zero. newREP is a pretty much the inverse of CPU[activeCPU].repeating (only instead of being controlled by the instruction starting/resetting, it gets set when determining whether or not to fetch from the buffer, which in effect is controlled by the CPU[activeCPU].repeating, which clears it when set, sets it otherwise(so it's an inversed value(0=>1, 1=>0))). CPU[activeCPU].repeating is a flag indicating that the current instruction wasn't fetched from the prefetch(in other words: it's an instruction that's already re-executing because of a REP prefix).

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 193 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

Hmmm... I see something interesting happening on the CGA with UniPCemu running the 8088 XT configuration: when shifting the display upwards(e.g. through the BIOS shifting text up one row for MS-DOS), I see that the Generic Super PC/Turbo XT BIOS keeps swapping bits 2 and 3 of the CGA Mode Control Register with each other. Bit 3 being cleared and set makes sense(Disabling video output during the update), but bit 2 is also toggled, causing the color burst to be enabled and disabled each time it shifts the text upwards(like when pressing carriage return at a command prompt)?

Edit: OK. So the BIOS disables the color burst and video output while it's shifting the display up/down(as it says it's to avoid snow).

@@wait: in	al, dx				; Read CGA status register
test al, 00001000b ; vertical retrace?
jz @@wait ; wait until it is
mov dx, 3D8h ; Then go and
mov al, 25h ; turn the display
out dx, al ; off to avoid snow

That's at line 3487 of the Generic Super PC/Turbo XT BIOS v3.1 (phatcode.net)'s pcxtbios.asm file.
So, it waits for vertical retrace, disables the display enable signal and color burst, shifts the text on the screen, then restores the CGA Mode Control register after that. That causes the color burst to somehow be disabled for a little while while it's updating the VRAM(to 'prevent snow'), but causes the display to render in black and white instead because of the disabled color burst?

UniPCemu converts said bit 3 and puts it in the VGA Clocking Mode Register bit 5 in an inverted state(so setting it clears said bit in the Clocking Mode Register). UniPCemu interprets that bit in the clocking mode register to cause blanking of the display. So why do I see the color burst disabled for a short while when said update is happening(with the color burst disabled), causing the text to become white?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 194 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++

It makes me wonder, though... Can color burst even be disabled in text mode by clearing bit 2 of the CGA Mode Control register? Or does it have no effect in text mode?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 195 of 198, by reenigne

User metadata
Rank Oldbie
Rank
Oldbie
superfury wrote on 2020-06-04, 21:56:

It makes me wonder, though... Can color burst even be disabled in text mode by clearing bit 2 of the CGA Mode Control register? Or does it have no effect in text mode?

It has the same effect in text and graphics modes.

Reply 196 of 198, by superfury

User metadata
Rank l33t++
Rank
l33t++
reenigne wrote on 2020-06-05, 06:20:
superfury wrote on 2020-06-04, 21:56:

It makes me wonder, though... Can color burst even be disabled in text mode by clearing bit 2 of the CGA Mode Control register? Or does it have no effect in text mode?

It has the same effect in text and graphics modes.

So the toggling of the color burst(bit 2 in the CGA Mode Control Register) being toggled off(before shifting the display up/down) and on afterwards is actually a Generic Super PC/Turbo XT BIOS bug? That BIOS is toggling it off(color burst disable) before starting to shift up the display, then shifts the display up(in the case of MS-DOS, one line) which shows up on the rendering as a very fast flash of white text, then toggles it back on(which causes the display to revert back to the reddish text with color bleed).

Looking at https://www.seasip.info/VintagePC/cga.html , it looks like shift routine sets it up for mode 2(BW80), while the current video mode for MS-DOS during boot is actually uses mode 3(CO80)? And the weird static, unchanging loading in the 80 column mode text shifting routine actually doesn't take that into account and thus toggles it on together with setting the mode to effectively become BW80 instead of CO80, causing the weird display issue when shifting text up or down using the BIOS routine?

Edit: Hmmm... Said routine clears bit 3 of the CGA Mode Control Register while doing that, but somehow it seems like the rendering continues on as usual and doesn't perform the blanking(what said bit being cleared is supposed to do)? Although UniPCemu maps said bit to the VGA Sequencer Clocking Mode Register's Screen Disable bit to the bit 3 of the CGA Mode Control Register inverted(sets clears said bit, clear sets said bit), causing the renderer to be supposed to use it.

UniPCemu adds said input to the blanking signal(OR'ed together with the 3C0 bit 5 of the index and the CRTC don't render flag(used for VGA when the registers 0x30-0x3F are accessed to disable display until a valid register is written(according to VGA undocumented information)).

Perhaps the renderer isn't applying the blanking signal correctly on CGA? Hmmm...
Edit: Hmmm... It's supposed to work correctly in that case. It should use the normal CGA/VGA rendering handler, which uses the blanking rendering instead of the non-blanking rendering. Said blanking renderer sets the pixels to color #0, so resulting in the direct color #0 being rendered in those cases(unaffecting the normal fetching and rendering from VRAM, just forcing the output of that stage to be the 16-color value of 0 instead of it's normal 16-color value(depending on the set video mode(and color register translation happens like on the EGA/VGA and up, but using static values loaded from the color register(some precalculated tables) instead))).

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 197 of 198, by reenigne

User metadata
Rank Oldbie
Rank
Oldbie
reenigne wrote on 2020-03-12, 13:55:

However, bear in mind that the 8088MPH speed test was never meant to be an emulator torture test - it was just sufficiently sensitive to tell IBM PC/XTs from contemporary machines with similar (but not identical) timings. Getting the speed test to pass doesn't mean that all instructions are correct, nor will even guarantee that the rest of the demo will run correctly. These logs of carefully curated patterns are much more of a torture test. I plan to make an actual emulator torture test out of them soon.

I have now released the first version of this torture test: https://www.reenigne.org/software/acid88.zip . I will be interested to see how many of the tests UniPCemu passes!

Reply 198 of 198, by Stiletto

User metadata
Rank l33t++
Rank
l33t++
reenigne wrote on 2020-09-26, 15:18:
reenigne wrote on 2020-03-12, 13:55:

However, bear in mind that the 8088MPH speed test was never meant to be an emulator torture test - it was just sufficiently sensitive to tell IBM PC/XTs from contemporary machines with similar (but not identical) timings. Getting the speed test to pass doesn't mean that all instructions are correct, nor will even guarantee that the rest of the demo will run correctly. These logs of carefully curated patterns are much more of a torture test. I plan to make an actual emulator torture test out of them soon.

I have now released the first version of this torture test: https://www.reenigne.org/software/acid88.zip . I will be interested to see how many of the tests UniPCemu passes!

Awesome!

"I see a little silhouette-o of a man, Scaramouche, Scaramouche, will you
do the Fandango!" - Queen

Stiletto