VOGONS


First post, by vladstamate

Rank Oldbie

I started a separate thread because the first one diverged from the initial question. I wanted to give an update on where I am. This is a log of the first 4 instructions in the BIOS.

JMP 0xf000:0xe05b
MOV AX, 0x40
MOV DS, AX
MOV WORD [0x472], 0x0
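
For readers following the bus addresses in the log below, here is a minimal sketch of the standard 8086/8088 real-mode address calculation (segment shifted left by four bits, plus offset, wrapped to 20 bits). The function name `physical_address` is made up for illustration and is not taken from the emulator.

```cpp
#include <cstdint>

// 8086/8088 real-mode address translation: shift the 16-bit segment left by
// four bits, add the 16-bit offset, and keep the low 20 bits (the 8088 has
// 20 address lines, so the sum wraps at 1 MiB).
// Hypothetical helper name, used only for this example.
constexpr std::uint32_t physical_address(std::uint16_t segment, std::uint16_t offset) {
    return ((static_cast<std::uint32_t>(segment) << 4) + offset) & 0xFFFFFu;
}

// After reset the 8088 starts at FFFF:0000, the first address in the log.
static_assert(physical_address(0xFFFF, 0x0000) == 0xFFFF0, "reset vector");
// The far JMP target F000:E05B is where the next opcode fetch happens.
static_assert(physical_address(0xF000, 0xE05B) == 0xFE05B, "JMP target");
```

The two `static_assert` lines correspond to the fetch addresses at cycle 4 and cycle 24 of the log.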

It took me a while to get here because I am building a system that I can later expand to other CPUs. So I built a very simple stack-based scripting language (very simple!) that controls the emulation of the 3 main units: EU, EA and BIU. To see how this works, let's look at a line from the log below:

Cycle 12 T4	 0xffff2 	 0xe0 	     >P 	[ 0xe0 xxxx xxxx xxxx ]	 waiting for data 	 {  imm  execute  delay  } 

This means we are in cycle 12 (bus state T4), reading from the bus at memory address 0xffff2, and we have just read the value 0xe0. We put that into the prefetch buffer (>P), which then looks like this: [ 0xe0 xxxx xxxx xxxx ] (xxxx means an empty prefetch slot). The state of the EU is "waiting for data", and the stack-based script I am executing for this instruction has the following steps: imm, execute and delay. It gets more interesting at cycle 52: once the EU has decoded the rmmode byte (cycle 53 in the log), it adds a few more steps, so the script becomes { ea read imm execute delay }.
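
The way a script can grow while it runs might be sketched as follows. This is only an illustration under assumptions: the `Script` type, the function `run_rmmode_step`, and the rule "a memory operand adds ea and read" are inferred from the two transitions visible in the log (opcodes 0x8e and 0xc7), not taken from the actual emulator source.

```cpp
#include <cstdint>
#include <deque>
#include <string>

// Hypothetical step list for one instruction; the front is the next step to run.
using Script = std::deque<std::string>;

// Consume the "rmmode" step. If the mod field (top two bits of the byte)
// selects a memory operand, schedule "ea" and "read" ahead of the remaining
// steps. Returns false if the script is not currently at an "rmmode" step.
bool run_rmmode_step(Script& script, std::uint8_t modrm) {
    if (script.empty() || script.front() != "rmmode") return false;
    script.pop_front();
    const bool memory_operand = (modrm >> 6) != 3;  // mod == 11b means register
    if (memory_operand) {
        script.push_front("read");
        script.push_front("ea");
    }
    return true;
}
```

Starting from { rmmode imm execute delay } with the byte 0x06, this leaves { ea read imm execute delay }; starting from { rmmode execute delay } with 0xd8 (a register operand), it leaves { execute delay }.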

This system allows me to describe all instructions and properly execute them as the 8088 would do.

Sure, there are some differences from the real 8088 right now: the prefetch is locked halfway through the JMP rather than at the end, and the real chip may read a byte from the prefetch buffer one cycle later after it is filled than I do. But those are details I can fix later, as long as my system is sound and solid.

Oh, and I do not think this is too slow. It sounds complicated, but the code is actually quite simple.

<SYSTEM HARD RESET>
Cycle 1 T1 [ xxxx xxxx xxxx xxxx ] { decode }
Cycle 2 T2 [ xxxx xxxx xxxx xxxx ] { decode }
Cycle 3 T3 [ xxxx xxxx xxxx xxxx ] { decode }
Cycle 4 T4 0xffff0 0xea >P <P [ xxxx xxxx xxxx xxxx ] new opcode 0xea { imm execute delay }
Cycle 5 T1 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 6 T2 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 7 T3 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 8 T4 0xffff1 0x5b >P [ 0x5b xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 9 T1 <P [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 10 T2 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 11 T3 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 12 T4 0xffff2 0xe0 >P [ 0xe0 xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 13 T1 <P [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 14 T2 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 15 T3 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 16 T4 0xffff3 0x 0 >P [ 0x00 xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 17 T1 <P [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 18 T2 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 19 T3 [ xxxx xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 20 T4 0xffff4 0xf0 >P [ 0xf0 xxxx xxxx xxxx ] waiting for data { imm execute delay }
Cycle 21 T1 <P [ xxxx xxxx xxxx xxxx ] { execute delay }
Cycle 22 T2 [ xxxx xxxx xxxx xxxx ] { delay }
Cycle 23 T3 [ xxxx xxxx xxxx xxxx ] executing { delay }
Cycle 24 T4 0xfe05b 0xb8 >P [ 0xb8 xxxx xxxx xxxx ] executing { delay }
Cycle 25 T1 [ 0xb8 xxxx xxxx xxxx ] executing { delay }
Cycle 26 T2 [ 0xb8 xxxx xxxx xxxx ] executing { delay }
Cycle 27 T3 [ 0xb8 xxxx xxxx xxxx ] executing { delay }
Cycle 28 T4 0xfe05c 0x40 >P [ 0xb8 0x40 xxxx xxxx ] executing { delay }
Cycle 29 T1 [ 0xb8 0x40 xxxx xxxx ] executing { delay }
Cycle 30 T2 [ 0xb8 0x40 xxxx xxxx ] executing { delay }
Cycle 31 T3 [ 0xb8 0x40 xxxx xxxx ] executing { delay }
Cycle 32 T4 0xfe05d 0x 0 >P [ 0xb8 0x40 0x00 xxxx ] executing { delay }
Cycle 33 T1 [ 0xb8 0x40 0x00 xxxx ] executing { delay }
Cycle 34 T2 [ 0xb8 0x40 0x00 xxxx ] executing { delay }
Cycle 35 T3 [ 0xb8 0x40 0x00 xxxx ] executing { delay }
Cycle 36 T4 0xfe05e 0x8e >P [ 0xb8 0x40 0x00 0x8e ] executing { delay }
Cycle 37 T1 [ 0xb8 0x40 0x00 0x8e ] executing { delay }
Cycle 38 T2 [ 0xb8 0x40 0x00 0x8e ] executing { decode }
Cycle 39 T3 <P [ 0x40 0x00 0x8e xxxx ] new opcode 0xb8 { imm execute delay }
Cycle 40 T4 0xfe05f 0xd8 <P >P [ 0x00 0x8e 0xd8 xxxx ] waiting for data { imm execute delay }
Cycle 41 T1 <P [ 0x8e 0xd8 xxxx xxxx ] { execute delay }
Cycle 42 T2 [ 0x8e 0xd8 xxxx xxxx ] { delay }
Cycle 43 T3 [ 0x8e 0xd8 xxxx xxxx ] executing { delay }
Cycle 44 T4 0xfe060 0xc7 >P [ 0x8e 0xd8 0xc7 xxxx ] executing { delay }
Cycle 45 T1 [ 0x8e 0xd8 0xc7 xxxx ] executing { delay }
Cycle 46 T2 [ 0x8e 0xd8 0xc7 xxxx ] executing { decode }
Cycle 47 T3 <P [ 0xd8 0xc7 xxxx xxxx ] new opcode 0x8e { rmmode execute delay }
Cycle 48 T4 0xfe061 0x 6 >P <P [ 0xc7 0x06 xxxx xxxx ] decode rmmode 0xd8 { execute delay }
Cycle 49 T1 [ 0xc7 0x06 xxxx xxxx ] { delay }
Cycle 50 T2 [ 0xc7 0x06 xxxx xxxx ] executing { delay }
Cycle 51 T3 [ 0xc7 0x06 xxxx xxxx ] executing { decode }
Cycle 52 T4 0xfe062 0x72 >P <P [ 0x06 0x72 xxxx xxxx ] new opcode 0xc7 { rmmode imm execute delay }
Cycle 53 T1 <P [ 0x72 xxxx xxxx xxxx ] decode rmmode 0x6 { ea read imm execute delay }
Cycle 54 T2 <P [ xxxx xxxx xxxx xxxx ] EA waiting for data EA calculation { ea read imm execute delay }
Cycle 55 T3 [ xxxx xxxx xxxx xxxx ] EA waiting for data EA calculation { ea read imm execute delay }
Cycle 56 T4 0xfe063 0x 0 >P [ 0x00 xxxx xxxx xxxx ] EA waiting for data EA calculation { ea read imm execute delay }
Cycle 57 T1 <P [ xxxx xxxx xxxx xxxx ] EA calculation { ea read imm execute delay }
Cycle 58 T2 [ xxxx xxxx xxxx xxxx ] EA calculation { ea read imm execute delay }
Cycle 59 T3 [ xxxx xxxx xxxx xxxx ] EA calculation { ea read imm execute delay }
Cycle 60 T4 0xfe064 0x 0 >P [ 0x00 xxxx xxxx xxxx ] EA calculation { ea read imm execute delay }
Cycle 61 T1 [ 0x00 xxxx xxxx xxxx ] EA calculation { ea read imm execute delay }
Cycle 62 T2 [ 0x00 xxxx xxxx xxxx ] EA calculation { ea read imm execute delay }
Cycle 63 T3 [ 0x00 xxxx xxxx xxxx ] EA calculation { ea read imm execute delay }
Cycle 64 T4 0xfe065 0x 0 >P [ 0x00 0x00 xxxx xxxx ] { read imm execute delay }
Cycle 65 T1 [ 0x00 0x00 xxxx xxxx ] waiting for data { read imm execute delay }
Cycle 66 T2 [ 0x00 0x00 xxxx xxxx ] waiting for data { read imm execute delay }
Cycle 67 T3 [ 0x00 0x00 xxxx xxxx ] waiting for data { read imm execute delay }
Cycle 68 T4 0x00472 0x 0 <M [ 0x00 0x00 xxxx xxxx ] waiting for data { read imm execute delay }
Cycle 69 T1 [ 0x00 0x00 xxxx xxxx ] waiting for data { read imm execute delay }
Cycle 70 T2 [ 0x00 0x00 xxxx xxxx ] waiting for data { read imm execute delay }
Cycle 71 T3 [ 0x00 0x00 xxxx xxxx ] waiting for data { read imm execute delay }
Cycle 72 T4 0x00473 0x 0 <M [ 0x00 0x00 xxxx xxxx ] { imm execute delay }
Cycle 73 T1 [ 0x00 0x00 xxxx xxxx ] { execute delay }
Cycle 74 T2 [ 0x00 0x00 xxxx xxxx ] { delay }
Cycle 75 T3 [ 0x00 0x00 xxxx xxxx ] executing { delay }
Cycle 76 T4 0xfe066 0xfa >P [ 0x00 0x00 0xfa xxxx ] executing { decode }

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 1 of 6, by reenigne

Rank Oldbie

Sounds good. One question, though: what is the EA unit? If you're talking about effective address calculations (as in instructions that use the mod/rm byte), I don't think that's a separate thing from the EU. EA calculations are synchronous inside the EU, so it's all one unit. The BIU and EU run independently of each other (except when the EU turns off prefetching to execute its own bus operations, or when the EU grabs a byte from the prefetch queue), so they need to be separate units, but the 8088 only has two independent units.

If that's just your way of breaking down the code and isn't supposed to reflect the internal architecture of the CPU, then fair enough.
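
The one point of contact between the two units described above, the prefetch queue, could be modelled roughly like this. It is a sketch under assumptions: the class name `PrefetchQueue` and its methods are invented for the example, and the four-slot size is taken from the log format in the first post.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-byte prefetch queue shared by the two units: the BIU pushes
// fetched bytes, the EU pops them, and a jump flushes whatever is left.
class PrefetchQueue {
public:
    static constexpr std::size_t kCapacity = 4;  // four slots, as in the log

    bool full() const { return count_ == kCapacity; }
    bool empty() const { return count_ == 0; }
    std::size_t size() const { return count_; }

    // BIU side: returns false (and stores nothing) when the queue is full.
    bool push(std::uint8_t byte) {
        if (full()) return false;
        bytes_[(head_ + count_) % kCapacity] = byte;
        ++count_;
        return true;
    }

    // EU side: returns false when no byte is available, i.e. the EU must wait.
    bool pop(std::uint8_t& out) {
        if (empty()) return false;
        out = bytes_[head_];
        head_ = (head_ + 1) % kCapacity;
        --count_;
        return true;
    }

    // Control transfer: discard everything fetched past the jump.
    void flush() {
        head_ = 0;
        count_ = 0;
    }

private:
    std::array<std::uint8_t, kCapacity> bytes_{};
    std::size_t head_ = 0;   // index of the oldest byte
    std::size_t count_ = 0;  // number of valid bytes
};
```

A failed `pop` corresponds to the "waiting for data" state in the log, and a failed `push` to the BIU having nothing to prefetch because the queue is full.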

Reply 2 of 6, by vladstamate

Rank Oldbie

Yes, that is correct; it is just my own code separation. The CPU loop looks like this:

for (int i = 0; i < times; i++)
{
    // tick the bus part of the CPU
    m_pProcessor->m_pBIU->tick();

    // tick the execution unit
    m_pProcessor->execution_unit_tick();

    m_pProcessor->m_kPrefetchBuffer.print();

    PRINT_CYCLES_LINE;
}

As you can see, and as you said, only the BIU and EU are real units. EA is just my own code separation, and it runs as part of the EU emulation.
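
To show the shape of such a loop in a form that compiles on its own, here is a heavily simplified sketch. Everything in it is an assumption made for illustration (`Biu`, `Eu`, `run_cycles`, a BIU that delivers one byte at the end of every four-cycle bus cycle, an EU that simply consumes one byte per tick); it is not the real emulator code and it ignores instruction timing entirely.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical BIU: one byte arrives at the end of every four-cycle bus cycle
// (T1..T4). It idles when the four-byte queue is full or memory is exhausted.
struct Biu {
    Biu(const std::vector<std::uint8_t>& mem, std::deque<std::uint8_t>& q)
        : memory(mem), queue(q) {}

    void tick() {
        const bool nothing_to_do = queue.size() >= 4 || address >= memory.size();
        if (t_state == 0 && nothing_to_do) return;             // idle cycle
        if (t_state == 3) queue.push_back(memory[address++]);  // T4: data in
        t_state = (t_state + 1) % 4;
    }

    const std::vector<std::uint8_t>& memory;
    std::deque<std::uint8_t>& queue;
    std::size_t address = 0;
    int t_state = 0;  // 0..3 stands for T1..T4
};

// Hypothetical EU: takes one byte per tick when one is available.
struct Eu {
    explicit Eu(std::deque<std::uint8_t>& q) : queue(q) {}

    void tick() {
        if (queue.empty()) return;  // waiting for data
        consumed.push_back(queue.front());
        queue.pop_front();
    }

    std::deque<std::uint8_t>& queue;
    std::vector<std::uint8_t> consumed;
};

// Tick both units once per cycle, BIU first, and return what the EU consumed.
std::vector<std::uint8_t> run_cycles(const std::vector<std::uint8_t>& memory, int cycles) {
    std::deque<std::uint8_t> queue;
    Biu biu(memory, queue);
    Eu eu(queue);
    for (int i = 0; i < cycles; ++i) {
        biu.tick();
        eu.tick();
    }
    return eu.consumed;
}
```

Because the BIU is ticked before the EU, a byte that arrives at T4 can be consumed in the same cycle, which is the `>P <P` pattern on the cycle 4 line of the log.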


Reply 3 of 6, by Scali

Rank l33t

I think the most important thing here is that you're making an emulator that actually tries to emulate the whole CPU, rather than just interpreting instructions.
It may take a while to get the timing 100% correct, but with the right design and just using the timings as they are known today, you're probably already 95% there, and you can add small tweaks to the code as and when more specific cases become known.

What is important though, and not yet clear to me at this point, is that the other devices that use the bus should also hook into the emulation at a cycle-exact level (this includes the FPU). They need to be able to add wait states to the CPU when they access the bus.
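
One possible shape for such a hook, as a rough sketch only: each device reports how many wait states it needs, and the bus cycle is stretched by inserting that many Tw states between T3 and T4. The names `BusDevice`, `FixedWaitDevice` and `bus_cycle` are invented for this example, and a real device would compute its wait states from its own timing rather than return a constant.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical interface: a bus device reports how many wait states (Tw) it
// needs for an access to the given address.
struct BusDevice {
    virtual ~BusDevice() = default;
    virtual int wait_states(std::uint32_t address) const = 0;
};

// Example device with a fixed number of wait states for every access.
struct FixedWaitDevice : BusDevice {
    explicit FixedWaitDevice(int waits) : waits_(waits) {}
    int wait_states(std::uint32_t) const override { return waits_; }

private:
    int waits_;
};

// T-state sequence for one bus cycle: T1 T2 T3, one Tw per wait state, then T4.
std::vector<std::string> bus_cycle(const BusDevice& device, std::uint32_t address) {
    std::vector<std::string> states{"T1", "T2", "T3"};
    const int waits = device.wait_states(address);
    for (int i = 0; i < waits; ++i) states.push_back("Tw");
    states.push_back("T4");
    return states;
}
```

With zero wait states this gives the plain four-cycle T1 to T4 pattern seen in the log; a device asking for one wait state turns it into a five-cycle access.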

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 4 of 6, by vladstamate

Rank Oldbie

An update on this. I am still working on it, although life has been keeping me busy, so I had to slow down a bit. I can now execute the first second (more or less 4.7 million cycles) after power-on. It is going well, and it will go faster soon, as most of the time is taken up by actually writing new instruction execution code.


Reply 5 of 6, by vladstamate

Rank Oldbie

VICTORY!! Well, partial, but still worth reporting. I now have the entire PC/Turbo XT BIOS booting (and the beginning of MS-DOS) on my cycle-accurate 8088 implementation. I am sure I am off by a cycle here or there, but the entire CPU pipeline is correctly emulated (including ALL bus operations, of course, and memory wait states). CGA is not cycle-accurate yet; that is my next effort. There are also a few instructions not emulated yet, simply because nothing has executed them.

I am trying to clean up the code and figure out how to put it up so that people can have a look.


Reply 6 of 6, by vladstamate

Rank Oldbie

Now I have DOS booting too. This is a screenshot of the emulator running DOS 5.0 with the launcher.

QjBMgLz.jpg
