VOGONS


Common log format

Topic actions

Reply 80 of 90, by superfury

User metadata
Rank l33t++
Rank
l33t++

I do notice that on your page, two things aren't mentioned yet:
- Usage of segment prefixes in disassembly(e.g. ds:[bx+02] vs [ds:bx+02]). Already had trouble comparing logs using those two. UniPCemu uses the former.
- Memory transactions is still unimplemented? I've added it as an option to UniPCemu for now(unreleased).
- What about information about descriptor loading and storing? UniPCemu can log these(currently in a different, line-per-byte format), but how do we log those(In UniPCemu this also logs TSS loads/stores(for task switches) and exception details(e.g. #PF(0000)). Although stuff like stack switches aren't logged(yet).

Atm memory transactions start with a tab, so you can easily compare logs by filtering those out? (Not yet the case for descriptors and task switch information).

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 81 of 90, by vladstamate

User metadata
Rank Oldbie
Rank
Oldbie
superfury wrote:

I do notice that on your page, two things aren't mentioned yet:
- Usage of segment prefixes in disassembly(e.g. ds:[bx+02] vs [ds:bx+02]). Already had trouble comparing logs using those two. UniPCemu uses the former.

Yes, lets use the former, I agree it is cleaner. I'll update the page.

superfury wrote:

- Memory transactions is still unimplemented? I've added it as an option to UniPCemu for now(unreleased).

We do need to spec that, I agree. If you have something already implemented we can use that. Let me take a look at it.

superfury wrote:

- What about information about descriptor loading and storing? UniPCemu can log these(currently in a different, line-per-byte format), but how do we log those(In UniPCemu this also logs TSS loads/stores(for task switches) and exception details(e.g. #PF(0000)). Although stuff like stack switches aren't logged(yet).

Do you have an example of descriptor manipulation ?

Exceptions are a different beast. In CAPE exceptions are an instruction in themselves which undergoes decoding and execution. That is probably not how the HW really works underneath but it made it easier to have exception in the context of the scripting language that all instructions are broken into in CAPE.

superfury wrote:

Atm memory transactions start with a tab, so you can easily compare logs by filtering those out? (Not yet the case for descriptors and task switch information).

That is fine, and fairly clear yes. I'll update the page.

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 82 of 90, by superfury

User metadata
Rank l33t++
Rank
l33t++

I should have older logs with that manipulation already(look at about two Windows 3.0 standard mode logs back). There's a bit of extra information that's dumped when in protected mode(like bytes read/written for descriptors(except IDT atm), which use a different line-based log instead of the merged BIU I/O log(since they don't use the BIU at all and just directly map memory through the Paging Unit to logical and the physical layer, bypassing the logical to physical part that's normally done by the BIU itself in a cycle-accurate way).

The extra information is mainly reporting of #GP(XXXX) faults etc. and the descriptor bytes being read/written on each layer in the same order as the single-line format, but on mutliple lines and of course a different format. Although they could be logged with 2 tabs in front of them for filtering purposes(to differentiate them from the BIU I/O).

The normal BIU memory transactions we've already talked about, it's that format like "physical(w):00000000=00" we've talked about earlier.

The transactions initiated by the descriptor caches and IDT lookups(as well as Paging lookups) use a slightly different format:
https://bitbucket.org/superfury/unipcemu/src/ … gger/debugger.c
Look at line 219+ for the exact logging method. The case that's used when accessing descriptors and paging information is the advancedlog==1(which also enables the format I've mentioned above, see lines 249+ and those other layer cases for the same on the other layers. The order is the transfer itself, as it passes through the layers(writes go from normal(segmented) all the way down to physical(for hardware, like the VGA) or RAM(for RAM). When no device responds to the layer, the layer simply isn't logged at all(so when the log starts/stops at the last response, hardware itself not responding(RAM does respond though)). So when you write to 0000:A0000 value 20h(space) which maps to physical A0000 which maps to VGA RAM, you'll get the following:

	Normal(w):000A0000=20( ); Paged(w):000A0000=20( ); Physical(w):000A000=20( )

But when you do the same to a RAM location at that location(same address for convenience, assume there's RAM there instead of adapter RAM):

	Normal(w):000A0000=20( ); Paged(w):000A0000=20( ); Physical(w):000A0000=12( ); RAM(w):000A0000=20( ); RealRAM(w):000A0000=20( )

Reading logs the same, but in reverse order, since data is received in opposite direction:

	RealRAM(r):000A0000=20( ); RAM(r):000A0000=20( ); Physical(r):000A0000=12( ); Paged(r):000A0000=20( ); Normal(r):000A0000=20( )

The prefetch is handled just like those RAM, but the (r) is (p) instead, indicating it's a prefetch operation.

That's what happens when it's being executed by the BIU itself.

Currently, memory accesses to/from any memory through the memory access unit for any kind of descriptor and paging data uses the other format(the above is only done for the BIU, since it's synchronized with memory cycles, which are timing-dependant and depending on the CPU).

UniPCemu has those memory accesses for descriptors done through the same layered system, but it's logging in a different way:
1. Memory accesses are logged while processing the memory layers directly, currently bypassing the whole BIU.
2. The memory accesses are logged immediately(see the dolog function calls, which uses sprintf-style parameters(only using a filename instead of a variable to print formatted text to)), instead of buffered in debugger_memoryaccess_text and debugger_memoryaccess_line, since that's only used for cycle-accurate logs(it's logged together with any cycle that's currently being handled, on a cycle-by-cycle basis(thus performing in the cycle-accurate way of logging), see debugger_autolog() where DEBUGGERLOG equals DEBUGGERLOG_DEBUGGING_COMMONLOGFORMAT.
3. As you can see, the dolog function logs one line of text to the log file, although it adds a timestamp(which is actual emulator runtime) for debugging purposes (which can easily be parsed or stripped using a simple parser, like sed).

The reading and writing of the descriptors and Paging tables uses plain memory accesses that use either the logical layer(descriptors) or physical layer(Page tables). They all read or write only one byte at a time through all used layers. No indicator is currently logged that it's accessing said tables.
Although UniPCemu still has a simple buffer that's buffering physical memory writes at the physical layer until the BIU has finished ticking the current cycle(this happens at the LOGMEMORYACCESS_DIRECT layer, so the physical layer buffers it's writes(not it's reads) until the current cycle has finished writing all data it needs to. After the cycle is terminated(at the end of the main CPU execution function, before invoking debugger when finished or ticking the hardware), said buffer is flushed to RAM or hardware, passing through the remaining layers(this happens for write operations only). That currently only has noticable effect when running in the Dosbox-style IPS-cycle mode, where it can be seen that the memory writes follow the prefetch buffer reads(the IPS-cycle mode tries to imitate Dosbox by completely filling the BIU prefetch buffer as much as possible to remain compatible with Self-modifying code when in IPS-cycle mode). So that's a side effect of the simple hack I've implemented to allow Dosbox-style cycles to be used on the otherwise cycle-accurate method of emulation core.

You're right that exceptions are instructions in and of itself. That happens on UniPCemu as well(it has three type of 'execution modes': instruction mode(normal instructions), interrupt mode(handling any kind of interrupt) and task switching mode(handles task switches)). The interrupt and task switching modes terminate the instruction mode, thus also terminating the whole processing the instruction handler was doing. It also terminates those other modes when switching between them.

Currently the advancedlog being zeroed allows me to block any memory access from being logged to be easily compatible with the logs that are currently being compared with IBMulator(see the Windows 3.0/3.1 thread Windows 3.0/3.1 enhanced mode and 3.0 standard mode issues in UniPCemu?).

Currently exceptions are a single-line log, which allows their identification of them being started(since IDT entries, Paging accesses and IDT reading the used GDT/LDT descriptor for the interrupt isn't logged yet). It's a simple log of #XX(errorcode), like this:

void CPU_GP(int_64 errorcode)
{
if (debugger_logging() && advancedlog) //Are we logging?
{
if (errorcode>=0)
{
dolog("debugger","#GP fault(%08X)!",errorcode);
}
else
{
dolog("debugger","#GP fault(-1)!");
}
}

The other faults use the same pattern, using their official 2-character code for identification together with the error code.

For some actual examples(before I've disabled logging them), look at the Windows 3.0 logs of the UniPCemu logs folder. One of the more recent ones should still have that setting unimplemented and logging the descriptor information together with memory accesses(don't remember which one, though). https://www.dropbox.com/sh/xy7ou1ag8n001mb/AA … KpHZC7K-sa?dl=0

Edit: Found it, although probably directly uploaded here due to it's size being small enough:

Filename
debugger_REPMOVSB_ES0000.7z
File size
1.5 MiB
Downloads
49 downloads
File comment
Example log with the descriptor loading and exceptions used.
File license
Fair use/fair dealing exception

The second instruction already shows a descriptor being load and stored back(to write the accessed bit). Further on you can find the #GP fault(just search for #GP). You can also see the memory accesses being performed in the common log format mentioned about above, which directly follows it.
I believe it was the old bug of ES being zeroed due to various CPU emulation bugs(the RETF/IRET and ARPL bugs).

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 83 of 90, by vladstamate

User metadata
Rank Oldbie
Rank
Oldbie

Ok, I've updated the page to mention how to log memory operands in instructions.

For exceptions that looks good. I assume something like this:

#GP fault(34)

For memory accesses. Physical and Paged I think I know what they are. But what is "Normal", "RAM" and "RealRAM" ?

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 84 of 90, by superfury

User metadata
Rank l33t++
Rank
l33t++

Normal is the layer using offsets(the offset part to be combined with the segment, e.g. the result of EBP+05(+1,+2,+3 depending on byte to access). RAM is the installed RAM actually receiving/sending the value. RealRAM(according to my RAM module) are writes that get to the RAM layer, but (when RAM is missing due to not having enough installed, memory holes or limited address space in any form) might not be written to actual RAM due to their properties(obvious) or the special Compaq addresses intercepting(at physical address 80C00000). That might actually be a bit backwards and needs reversing, thinking about that.

Edit: Just fixed the order of the two layers and their reporting. The following is the order for writes:
First is the RAM layer. If this responds, it's either RAM or a memory-mapped RAM hardware register(at Compaq 80C00000, see the bottom of Bochs' ioports.lst).
Second(if it responds) is the RealRAM layer. This is the installed RAM loading or saving data in said memory cells.

Hardware that has memory-mapped RAM responds like neither of the above responds(it's hardware-specific, thus unloggable). It could be ROM, Flash memory or video card RAM or maybe even PCI address space or something mapped by PCI or other hardware boards added.

Of course, the above two layers are reversed for reads in the log(opposite data transfer direction). So first RealRAM(if it exists) and then the RAM layer following it.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 85 of 90, by superfury

User metadata
Rank l33t++
Rank
l33t++

Also, I still see "mov dword ptr [0], eax". Shouldn't that be "mov dword ds:[00000000],eax"? Adhering to the dword(no ptr), operand size for the instruction(0000 or 00000000 depending on address size) and missing segment, also no spaces between operands(only after the opcode name and operand size(if using memory modr/m))?
Edit: Also, "cs:[di+03]".

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 86 of 90, by vladstamate

User metadata
Rank Oldbie
Rank
Oldbie
superfury wrote:

Also, I still see "mov dword ptr [0], eax". Shouldn't that be "mov dword ds:[00000000],eax"? Adhering to the dword(no ptr), operand size for the instruction(0000 or 00000000 depending on address size) and missing segment, also no spaces between operands(only after the opcode name and operand size(if using memory modr/m))?
Edit: Also, "cs:[di+03]".

Fixed.

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 87 of 90, by vladstamate

User metadata
Rank Oldbie
Rank
Oldbie
superfury wrote:

Normal is the layer using offsets(the offset part to be combined with the segment, e.g. the result of EBP+05(+1,+2,+3 depending on byte to access). RAM is the installed RAM actually receiving/sending the value. RealRAM(according to my RAM module) are writes that get to the RAM layer, but (when RAM is missing due to not having enough installed, memory holes or limited address space in any form) might not be written to actual RAM due to their properties(obvious) or the special Compaq addresses intercepting(at physical address 80C00000). That might actually be a bit backwards and needs reversing, thinking about that.

That sounds way too complicated and unneeded. Also way too verbose making the log files (which are already too large) even bigger. Not to mention that most of the time (read 99.99%) all those addresses are the same, as most accesses in normal running mode are to just normal memory. This is also implementation (and architecture) dependent.

I vote we just have Physical and Paged (linear) addresses. As most computers really just work with that.

Plus remember each emulator can have its flags to turn on extra verbosity on logs. We are just establishing a common denominator.

What about IN/OUT ? How do you log port accesses?

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 88 of 90, by superfury

User metadata
Rank l33t++
Rank
l33t++

Well, the only added layer is the Normal layer when not counting RAM/special RAM(80C00000) accesses. It also displays the actual descriptor base and offset being used by comparing the two(offset=Normal, paged already has the offset combined with the segment base(added)).
So disable the RAM layers in common log format and including the Normal(offset) layer? Said Normal(maybe renaming it "offset" instead?) layer is informative differentiating between normal memory accesses(e.g. mov and LDT) and things using the Paged layer(IDT/GDT descriptor table lookups)? It is a translation being applied after all(Especially useful in 286 protected mode).

The IN/OUT instructions currently aren't logged at all. I can enable the byte stages of port processing to log each byte i/o(independantly of hardware responding). Maybe naming it "bus", e.g. "bus(w):0064=ff( )"? And don't differentiate between hardware responding or not(like RAM)? So you can see byte/word/dword accesses based on their amounts on a row(also depending on alignment)?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 89 of 90, by superfury

User metadata
Rank l33t++
Rank
l33t++

I've just improved the logging of memory accesses to log in lower case(so everything is in lower case except register names, the "Registers:" line, Layer names (e.g. "Normal", "Paged", "Physical", "RAM", "RealRAM" etc.), FLAGSINFO and the opcodes themselves, which are Capitalized(and lower case) and upper case(opcodes) respectively.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 90 of 90, by superfury

User metadata
Rank l33t++
Rank
l33t++

I've just been thinking. What about the descriptor caches themselves? Should we log them as well(as they're affecting all cpu operations through descriptors)? Of course only for 80286+, while FS and GS are only for 80386+?

So, like this(this is the 80486's first instruction when starting emulation, directly after the reset signal):

f000:0000fff0 EA 05 F9 00 F0 jmp f000:0000f905	Physical(p):fffffff0=ea(ê); Paged(p):fffffff0=ea(ê); Normal(p):0000fff0=ea(ê); Physical(p):fffffff1=05(); Paged(p):fffffff1=05(); Normal(p):0000fff1=05(); Physical(p):fffffff2=f9(ù); Paged(p):fffffff2=f9(ù); Normal(p):0000fff2=f9(ù); Physical(p):fffffff3=00( ); Paged(p):fffffff3=00( ); Normal(p):0000fff3=00( ); Physical(p):fffffff4=f0(ð); Paged(p):fffffff4=f0(ð); Normal(p):0000fff4=f0(ð); Physical(p):fffffff5=20( ); Paged(p):fffffff5=20( ); Normal(p):0000fff5=20( ); Physical(p):fffffff6=30(0); Paged(p):fffffff6=30(0); Normal(p):0000fff6=30(0); Physical(p):fffffff7=31(1); Paged(p):fffffff7=31(1); Normal(p):0000fff7=31(1); Physical(p):fffffff8=2f(/); Paged(p):fffffff8=2f(/); Normal(p):0000fff8=2f(/); Physical(p):fffffff9=32(2); Paged(p):fffffff9=32(2); Normal(p):0000fff9=32(2); Physical(p):fffffffa=38(8); Paged(p):fffffffa=38(8); Normal(p):0000fffa=38(8); Physical(p):fffffffb=2f(/); Paged(p):fffffffb=2f(/); Normal(p):0000fffb=2f(/); Physical(p):fffffffc=38(8); Paged(p):fffffffc=38(8); Normal(p):0000fffc=38(8); Physical(p):fffffffd=38(8); Paged(p):fffffffd=38(8); Normal(p):0000fffd=38(8); Physical(p):fffffffe=fc(ü); Paged(p):fffffffe=fc(ü); Normal(p):0000fffe=fc(ü); Physical(p):ffffffff=98(˜); Paged(p):ffffffff=98(˜); Normal(p):0000ffff=98(˜)
Registers:
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000421
ESP: 00000000 EBP: 00000000 ESI: 00000000 EDI: 00000000
CS: f000 DS: 0000 ES: 0000 FS: 0000 GS: 0000 SS: 0000 TR: 0000 LDTR: 0000
EIP: 0000fff0 EFLAGS: 00000002
CR0: 60000010 CR1: 00000000 CR2: 00000000 CR3: 00000000
CR4: 00000000
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: 00000000 DR7: 00000000
GDTR: 000000000000ffff IDTR: 00000000000003ff
CS descriptor: FF0093FF0000FFFF
DS descriptor: 000093000000FFFF
ES descriptor: 000093000000FFFF
FS descriptor: 000093000000FFFF
GS descriptor: 000093000000FFFF
SS descriptor: 000093000000FFFF
TR descriptor: 000082000000FFFF
LDTR descriptor: 000082000000FFFF
FLAGSINFO: 0000000000i00avr0n00oditsz0a0p1c

(although I enabled the advanced logging to show the new descriptor information, the memory accesses not being included when advanced logging is disabled(for compatibility with you emulators atm. The descriptor information falls under that category as well atm.)

You thoughts?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io