VOGONS


Reply 80 of 198, by superfury

User metadata
Rank l33t++
vladstamate wrote:

No, there is no VGA emulation in CAPE. However, I am working on a rewrite of all MDA/Herc/CGA/EGA parts using a 6845 emulator. Most of the current issues in CAPE with 8088 MPH are related to CGA inaccuracies, which this rewrite should solve.

I'm looking forward to the result when it's done (although all those things already work in UniPCemu, which uses a general patch-style VGA renderer at its core, with customizations/extensions for all CGA/MDA/SVGA accesses). The only thing left needing accuracy in UniPCemu is the CPU itself (timing only), although the 80286 still has some execution bugs (mostly protection fault errors afaik, seeing as it otherwise runs without problems). The exceptions are the keyboard (which errors out when resetting it using command 0xFF), the CMOS (why is unknown, seeing as it should be giving the correct results, albeit using the high-resolution clock combined with the actual Windows time as its relative RTC time source, with a displacement added to the actual time to obtain the emulated timestamp, instead of running at the same speed as the emulated hardware) and the FDC (not 100% accurate yet: seek timings are emulated, but read/write times aren't yet).

All other hardware besides the floppy (read/write timings only) and the CMOS (RTC timestamp only) is already emulated cycle-accurately (afaik). So only the FDC and CPU still need work to become cycle-accurate (although the OPL2 emulation still needs fixing on the drum kit, which oddly enough seems to fail using the formulas I've found in OPL3EMU).

The RTC won't be changed anymore (time-wise), because it's built to keep time accurately (high-resolution clock), synchronized with the actual time reported by the host OS (gettimeofday-based or equivalent). So setting the time in the emulator to a certain date/time, terminating the emulator and starting it up again exactly 5 hours later will result in the emulated RTC reporting a time 5 hours later as well (so it acts just as if an actual RTC were installed, except when the host system time is changed, which shifts the emulated time by the same amount due to the same synchronization).
So essentially it keeps time like DOSBox, but a time change made in the emulated RTC is kept consistent automatically, because UniPCemu only stores the time divergence relative to the host OS timestamp.
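
The "store only the divergence" idea can be sketched roughly as follows. This is an illustration of the description above, not UniPCemu's actual code; the class name `OffsetRTC`, its methods and the fake host clock are all hypothetical.

```python
import time


class OffsetRTC:
    """Emulated RTC that stores only its divergence from the host clock."""

    def __init__(self, host_clock=time.time, offset=0.0):
        self.host_clock = host_clock  # callable returning host seconds since the epoch
        self.offset = offset          # the only value that needs to be persisted

    def set_time(self, emulated_seconds):
        # Setting the guest clock just records how far it is from the host clock.
        self.offset = emulated_seconds - self.host_clock()

    def get_time(self):
        # Emulated time = host time + stored divergence.
        return self.host_clock() + self.offset


# Usage with a fake host clock, so the behaviour is easy to follow.
fake_now = [1_000_000.0]
rtc = OffsetRTC(host_clock=lambda: fake_now[0])
rtc.set_time(500.0)          # the guest sets its own date/time
saved_offset = rtc.offset    # what would be saved when the emulator exits

fake_now[0] += 5 * 3600      # the emulator is started again 5 hours later
restarted = OffsetRTC(host_clock=lambda: fake_now[0], offset=saved_offset)
elapsed = restarted.get_time() - 500.0   # 5 hours, in seconds
```

Because only the offset is stored, changing the host clock moves the emulated clock by the same amount, which matches the exception mentioned above.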

vladstamate wrote:

Currently I have 4 individual emulations, one for each card; however, in the new scheme I have a 6845 emulator that each card can use. It's still early, but I am hoping to iron out the CGA issues in 8088 MPH with this.

That sounds about the same as my VGA renderer emulation, which isn't exactly emulating a 6845, but rather a generic VGA framework (with general functions performing cycle-exact rendering at a specified pixel rate), which is extended or modified by special CGA, MDA or SVGA handlers that modify its functionality and extend it to fully support the specific hardware.

The renderer, to be exact, just renders pixels according to stored precalculated display parameters (retrace, blanking, bit depth etc.; just the VGA precalculated values) at the set pixel rate. The I/O, MMU handling etc. is handled by the hardware-specific CGA, MDA or VGA (all other video cards) handlers in their separate units (or by the general handlers in the case of the SVGA emulation, which uses the IBM VGA hardware layer). Most of the emulation is split according to the VGA documentation (Attribute controller, CRT controller, Sequencer, MMU controller, I/O controller).

So essentially, the current UniPCemu video card emulation can be explained like this:
VGA base+CRT emulation is the base of all.
CGA/MDA emulation overrides the VGA I/O and precalcs base emulation only, leaving the CRT emulation unmodified.
EGA overrides part of the VGA emulation only by masking off parts of the registers and custom code in the VGA base emulation.
SVGA emulation extends the VGA base and precalcs with its new values only. Otherwise, it's still normal VGA emulation, giving full compatibility with the VGA emulation.

CGA/MDA emulation core, which overrides the VGA emulation core for the most part: https://bitbucket.org/superfury/unipcemu/src/ … mda.c?at=master

The NTSC/RGBI conversion routines are hardcoded in the renderer itself, but are toggled on/off using various flags in the VGA base emulation itself(read by the general renderer to decide how to render a scanline or pixel).

Base renderer: https://bitbucket.org/superfury/unipcemu/src/ … rer.c?at=master

The extensions (CGA/MDA/SVGA) essentially work by hooking the VGA emulation and handling part of the I/O and precalcs by overriding them with their own handlers or precalculated CRT values. See the VGA_registerExtension function call at the bottom of the CGA/MDA emulation unit. If this function and its related functions aren't called, just a normal IBM VGA is emulated.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 81 of 198, by vladstamate

User metadata
Rank Oldbie

Started to look at the timings again. I can explain why CAPE had 12 cycles for INT.

For INT (and INT 3), Intel says 51 clocks (52 for INT 3). But that includes the memory transfers (which are not just simple writes, as they involve stack address changes).

So for INT x (0xcd imm) you have to fetch the immediate, then read 2 16-bit values (the jump address) and then write out 2 16-bit values (the current CS:IP). That is 8 byte reads/writes at 4 clocks each, which is 32. But it is actually more than that, as I expect the microcode for INT to internally resemble a PUSH CS + PUSH IP + more stuff.

Also, the distribution of cycles that I would like to use is this:

1 cycle fetch opcode
1 cycle fetch immediate <-- this cycle also disables the prefetch (same as JMP instructions)
11 cycles to write out IP (PUSH reg)
10 cycles to write out CS (PUSH seg-reg)
12 cycles to calculate offset inside interrupt table <-- Actual execution time as CAPE understands it
16 cycles to read new CS:IP
----
51 cycles total.

NOTE: that being said, this is not 100% how CAPE deals with it right now. I need to break out PUSH reg/PUSH seg as actual push opcodes and then it will be correct.

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/

Reply 82 of 198, by Scali

User metadata
Rank l33t
superfury wrote:

So essentially, the current UniPCemu video card emulation can be explained like this:
VGA base+CRT emulation is the base of all.
CGA/MDA emulation overrides the VGA I/O and precalcs base emulation only, leaving the CRT emulation unmodified.
EGA overrides part of the VGA emulation only by masking off parts of the registers and custom code in the VGA base emulation.
SVGA emulation extends the VGA base and precalcs with its new values only. Otherwise, it's still normal VGA emulation, giving full compatibility with the VGA emulation.

Does this mean that the CGA/EGA/MDA emulation in VGA/SVGA mode is different from these CGA/EGA/MDA modes?
Because that's how it is on real hardware: EGA and VGA aren't fully backward compatible, and some things will work differently, even when a CGA/EGA/MDA mode is selected.
If I run an emulator in VGA mode, I expect it to behave like this, and not be 100% CGA/EGA compatible in the same way as real hardware.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 83 of 198, by vladstamate


Also more timings:

INC/DEC word: 15** for memory(***7), 2 for reg.
INC/DEC byte: 15** for memory(***7), 3 for reg(***2).

If ** means subtract 8, then we both execute INC/DEC for memory in 7 cycles. This is execution only. Yes, the whole instruction takes 17 cycles for an 8-bit INC/DEC, but only 7 of those are the actual math operations. The instruction will actually take 23 cycles for a 16-bit memory INC/DEC, while still taking only 7 cycles for the actual decrement/increment. The timeline is like this:

INC [DI]

1 cycle decode
x cycles for EA calculation
8 cycles for reading [DI]
6 cycles for increment and flag update
8 cycles for writing new [DI]
---
23+x cycles
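
As a quick check of the timeline above, the stages can be summed in a few lines. This is only a sketch of the arithmetic in this post; the function names are hypothetical, and `ea_cycles` stands for the unspecified "x".

```python
def inc_mem_word_stages(ea_cycles):
    """Stage list for a 16-bit memory INC/DEC, following the timeline in the post."""
    return [
        ("decode", 1),
        ("EA calculation", ea_cycles),        # the "x" in the post
        ("read [DI]", 8),                     # one word read
        ("increment and flag update", 6),
        ("write new [DI]", 8),                # one word write
    ]


def total_cycles(stages):
    return sum(cycles for _, cycles in stages)


def execution_only(stages):
    # Execution in the post's sense: decode plus the actual increment.
    wanted = {"decode", "increment and flag update"}
    return sum(cycles for name, cycles in stages if name in wanted)


stages = inc_mem_word_stages(ea_cycles=0)
total = total_cycles(stages)        # 23, plus x once the EA time is filled in
exec_part = execution_only(stages)  # 7
```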


Reply 84 of 198, by superfury


Vladstamate: I think you're forgetting something. You say to subtract 32 to get the actual cycle count, but isn't it supposed to be 16? The timings mentioned in the manuals are the 8086 timings afaik, and thus don't include the extra 4 cycles/word of the 8088 (the manual says to add 4 cycles per word transfer to the 51/52 cycles to obtain 8088 timings instead of 8086 timings). Thus the 4 word accesses only need 4x4=16 cycles subtracted from the manual's timings to obtain the actual EU cycles: 51-16=35 EU cycles. The 51 cycles are based on 4 cycles/word (8086 timing), not 4 cycles/byte (8088 timing).
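
The arithmetic in this post can be written out as a small sketch. It assumes, as the thread does, 4 clocks per bus access and 4 word accesses for INT; the function and constant names are made up for illustration.

```python
BUS_CYCLE = 4  # clocks per bus access, as used in this thread


def eu_cycles(manual_cycles, word_accesses):
    """Strip the 8086 bus time (one 4-clock access per word) from a manual timing."""
    return manual_cycles - word_accesses * BUS_CYCLE


def total_8088(manual_cycles, word_accesses):
    """Add 4 clocks per word: the 8-bit bus needs a second access for each word."""
    return manual_cycles + word_accesses * BUS_CYCLE


INT_MANUAL = 51  # INT n timing quoted from the manual in this thread
INT_WORDS = 4    # word accesses counted in this thread (new CS:IP read, old CS:IP written)

int_eu = eu_cycles(INT_MANUAL, INT_WORDS)     # 51 - 16
int_8088 = total_8088(INT_MANUAL, INT_WORDS)  # 51 + 16
```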


Reply 85 of 198, by vladstamate

superfury wrote:

Vladstamate: I think you're forgetting something. You say to subtract 32 to get the actual cycle count, but isn't it supposed to be 16? The timings mentioned in the manuals are the 8086 timings afaik, and thus don't include the extra 4 cycles/word of the 8088 (the manual says to add 4 cycles per word transfer to the 51/52 cycles to obtain 8088 timings instead of 8086 timings). Thus the 4 word accesses only need 4x4=16 cycles subtracted from the manual's timings to obtain the actual EU cycles: 51-16=35 EU cycles. The 51 cycles are based on 4 cycles/word (8086 timing), not 4 cycles/byte (8088 timing).

Good point. So now it looks like this:

8088

1 cycle fetch opcode
1 cycle fetch immediate <-- this cycle also disables the prefetch (same as JMP instructions)
15 cycles to write out IP (PUSH reg)
14 cycles to write out CS (PUSH seg-reg)
20 cycles to calculate offset inside interrupt table <-- Actual execution time as CAPE understands it
16 cycles to read new CS:IP
----
67 cycles total.

8086

1 cycle fetch opcode
1 cycle fetch immediate <-- this cycle also disables the prefetch (same as JMP instructions)
11 cycles to write out IP (PUSH reg)
10 cycles to write out CS (PUSH seg-reg)
20 cycles to calculate offset inside interrupt table <-- Actual execution time as CAPE understands it
8 cycles to read new CS:IP
----
51 cycles total.
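
A short sketch that re-adds the two breakdowns and shows where the 16 extra 8088 cycles come from. The numbers are copied from the lists above; the table and helper names are hypothetical.

```python
INT_8088 = [
    ("fetch opcode", 1),
    ("fetch immediate", 1),
    ("write out IP (PUSH reg)", 15),
    ("write out CS (PUSH seg-reg)", 14),
    ("calculate offset inside interrupt table", 20),
    ("read new CS:IP", 16),
]

INT_8086 = [
    ("fetch opcode", 1),
    ("fetch immediate", 1),
    ("write out IP (PUSH reg)", 11),
    ("write out CS (PUSH seg-reg)", 10),
    ("calculate offset inside interrupt table", 20),
    ("read new CS:IP", 8),
]


def total(stages):
    return sum(cycles for _, cycles in stages)


def extra_on_8088():
    """Per-stage difference between the two lists, keeping only stages that differ."""
    return {
        name: c88 - c86
        for (name, c88), (_, c86) in zip(INT_8088, INT_8086)
        if c88 != c86
    }


total_8088_cycles = total(INT_8088)
total_8086_cycles = total(INT_8086)
differences = extra_on_8088()
```

Only the stages that touch memory differ: 4 extra cycles for each pushed word and 8 for the two words read from the interrupt table.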


Reply 86 of 198, by superfury

Scali wrote:
superfury wrote:

So essentially, the current UniPCemu video card emulation can be explained like this:
VGA base+CRT emulation is the base of all.
CGA/MDA emulation overrides the VGA I/O and precalcs base emulation only, leaving the CRT emulation unmodified.
EGA overrides part of the VGA emulation only by masking off parts of the registers and custom code in the VGA base emulation.
SVGA emulation extends the VGA base and precalcs with its new values only. Otherwise, it's still normal VGA emulation, giving full compatibility with the VGA emulation.

Does this mean that the CGA/EGA/MDA emulation in VGA/SVGA mode is different from these CGA/EGA/MDA modes?
Because that's how it is on real hardware: EGA and VGA aren't fully backward compatible, and some things will work differently, even when a CGA/EGA/MDA mode is selected.
If I run an emulator in VGA mode, I expect it to behave like this, and not be 100% CGA/EGA compatible in the same way as real hardware.

It's exactly as I've described:
- CGA and MDA register extension handlers. These handlers override all of the precalcs that affect rendering, applying the separate CGA registers to the CRTC rendering emulation so that it behaves like a CGA. The VGA precalcs are effectively overridden with CGA-compatible values based on the stored CGA registers. The I/O handler disables the base VGA handling completely by returning the value 2. All I/O is redirected to the CGA registers this way (with the VGA registers being unreachable). The same is done for all other registered handlers (clock select). Part of the CGA emulation is handled by modifying the VGA registers that are actually used (e.g. everything related to the MMU and part of the rendering parameters, such as serial vs 4-bit, text mode vs graphics mode etc.), exploiting the VGA registers used for the original precalcs. This proves to work correctly, as the software tested (BIOS, MS-DOS, Ultima II and 8088 MPH) gives correct output everywhere (all except the Kefrens Bars, but that's due to CPU timing itself, not due to faulty CGA emulation).
- EGA uses mostly the same register layout as the VGA. Enabling EGA emulation has the effect of redirecting EGA-specific bits from the VGA emulation and enabling the masks on all VGA registers, effectively forcing the undefined EGA bits in the VGA to 0 when written to (reading back like on the VGA is disabled, giving an undefined port result (all bits set)). The MUX bits read back are modified in the VGA core to behave either like an EGA or like a VGA, depending on the emulated card.
- SVGA registers extension handlers like CGA/MDA does, but these instead extend registers with more bits or options to provide full VGA compatibility (by falling back to the base VGA emulation by returning 0) plus newer functionality, and extend the precalcs and clocks for the new functionality used by the SVGA cards.

This way, UniPCemu still uses the base for rendering based on the precalcs and (S)VGA VRAM mapping, while the extension units handle all I/O specific stuff without the running software noticing anything. It just sees a CGA, MDA, ET3000, ET4000 or EGA and doesn't notice anything of the translation process handled by the subunits of those(tseng.c, cgamda core or parts of the VGA core modified to switch between EGA and VGA using register masks and overridden handling(MUX bits)).
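
The hook mechanism described above can be sketched like this. Only the return values 0 (fall back to the base VGA) and 2 (base VGA handling disabled) come from the description; the class, the handler factories and the idea that a handler is a plain callable are assumptions made for illustration, and this is not UniPCemu's real `VGA_registerExtension` interface.

```python
FALL_BACK_TO_BASE = 0  # extension lets the base VGA handle the access
SKIP_BASE = 2          # extension handled the access; base VGA must not see it


class VGABase:
    """Toy base VGA with an optional I/O extension hook (hypothetical API)."""

    def __init__(self):
        self.registers = {}
        self.io_extension = None

    def register_extension(self, handler):
        self.io_extension = handler

    def write_port(self, port, value):
        if self.io_extension is not None:
            if self.io_extension(port, value) == SKIP_BASE:
                return  # VGA registers stay unreachable
        self.registers[port] = value


def make_cga_extension(cga_registers):
    def handler(port, value):
        cga_registers[port] = value  # all I/O goes to the CGA register set
        return SKIP_BASE
    return handler


def make_svga_extension(seen_writes):
    def handler(port, value):
        seen_writes.append((port, value))  # place to track extended bits
        return FALL_BACK_TO_BASE           # base VGA still processes the write
    return handler


cga_regs = {}
cga_card = VGABase()
cga_card.register_extension(make_cga_extension(cga_regs))
cga_card.write_port(0x3D8, 0x29)   # CGA mode control register

svga_seen = []
svga_card = VGABase()
svga_card.register_extension(make_svga_extension(svga_seen))
svga_card.write_port(0x3C4, 0x01)  # VGA sequencer index register
```

With no extension registered, `write_port` behaves as a plain VGA, which mirrors the statement that a normal IBM VGA is emulated when the registration call is skipped.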

Edit: Btw, the screen shifts to b/w and normal NTSC output and back when scrolling the window(e.g. making MS-DOS print past the bottom of the screen by teletyping past the bottom), but that is because it's writing invalid values to the CGA mode control register, toggling on/off the b/w bit(bit 1 or 2, don't remember which one) each time it scrolls the text up one row, for some odd unknown reason.


Reply 87 of 198, by superfury


I've just implemented cycle-accurate interrupt timings, but it now gets 1632 cycles, so it's jumping down again with more accurate interrupt timings?
Edit: The Delorean car messes up again with its even/odd row addressing, it seems.


Reply 88 of 198, by superfury


After thinking some about the timings: The jump instructions take time on the EU but don't stall the BIU currently. So I'll need to modify the BIU to stall when the EU is executing those, which should slow down things again on that part.
Edit: Implementing a stall on the BIU with all jumps increases it into the 1640 cycle range (2% divergence).
Edit: These stalls seem to fix the Delorean car issue.

Edit: Btw, reenigne, I notice one strange thing during the flower girl water drop fade in effect(has been there since the beginning): The top row isn't changing. Is this a bug in the 8088 MPH code?
Edit: The remaining parts of the demo needing accuracy still need fixing, as well as the credits outright crashing for some unknown reason. It's probably due to the cycle (in)accuracy causing the self-modifying code to fail for some reason?
Edit: Also, do any of you guys know the address of the beginning of the MOD playback loop? Then I can dump the timings and maybe find out what's causing the SMC to crash?


Reply 89 of 198, by reenigne

User metadata
Rank Oldbie
superfury wrote:

Edit: Btw, reenigne, I notice one strange thing during the flower girl water drop fade in effect(has been there since the beginning): The top row isn't changing. Is this a bug in the 8088 MPH code?

Looks like it - it's in the video capture too.

superfury wrote:

Edit: The remaining parts of the demo needing accuracy still need fixing, as well as the credits outright crashing for some unknown reason. It's probably due to the cycle (in)accuracy causing the self-modifying code to fail for some reason?

Almost certainly the cx load instruction in the prefetch queue, as I've mentioned before.

superfury wrote:

Edit: Also, do any of you guys know the address of the beginning of the MOD playback loop? Then I can dump the timings and maybe find out what's causing the SMC to crash?

Not offhand, but just break into it when the credits are rolling and you'll be right there - there's no other code that runs at the same time (I think this is true even when it's crashed, assuming it is the prefetch queue problem).

Reply 90 of 198, by superfury


Well, that's exactly the problem: the program can be executing at any point, since it's the SMC that's hanging. It could be jumping to the start of the NOP block, to the main loop or to a random location in memory, depending on the contents that are overwritten vs what's in the prefetch. Thus the start location of the NOP block can't be found once it has already overwritten part of the code (making it jump to any random location depending on the combination of popped vs prefetched data).

The only way to find out what's going wrong is by finding the start of the executable block in memory, which only gives the results we need during the very first loop (which is corrupting itself). The mov cl instruction can be anything (mov cl,xx or a completely corrupted instruction, depending on how the popped data and the prefetch mix together). The same could be said about the jump after it, which might be jumping to an invalid place when interpreted incorrectly due to the same problem with the popw [cs:bx] instruction. It can jump anywhere, which may or may not be what we need (it could jump to a point outside the loop or do something entirely different).


Reply 91 of 198, by reenigne

superfury wrote:

Well, that's exactly the problem: the program can be executing at any point, since it's the SMC that's hanging. It could be jumping to the start of the NOP block, to the main loop or to a random location in memory, depending on the contents that are overwritten vs what's in the prefetch. Thus the start location of the NOP block can't be found once it has already overwritten part of the code (making it jump to any random location depending on the combination of popped vs prefetched data).

The only way to find out what's going wrong is by finding the start of the executable block in memory, which only gives the results we need during the very first loop (which is corrupting itself). The mov cl instruction can be anything (mov cl,xx or a completely corrupted instruction, depending on how the popped data and the prefetch mix together). The same could be said about the jump after it, which might be jumping to an invalid place when interpreted incorrectly due to the same problem with the popw [cs:bx] instruction. It can jump anywhere, which may or may not be what we need (it could jump to a point outside the loop or do something entirely different).

The usual problem (what I was seeing in DOSBox) is that it loads CL with the new (just-patched) value rather than the one in the prefetch queue. That makes it run so slowly it appears to be stalled, but doesn't make the instruction pointer go off into the weeds if I remember correctly. The "mov cl,xx" instruction is only patched to change xx, not to a different instruction altogether.
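
To make the difference concrete, here is a toy model of the situation described in this paragraph. It is not code from 8088 MPH or from any emulator: the function, the sample byte values and the simplification that the whole instruction is already sitting in the 4-byte queue when the patch lands are all assumptions for illustration.

```python
MOV_CL_IMM8 = 0xB1  # opcode byte of "mov cl, imm8"


def executed_cl_value(code, patch_addr, patch_value, model_prefetch):
    """Return the value CL receives after the operand byte is patched in memory.

    With model_prefetch=True the instruction bytes were already copied into a
    4-byte prefetch queue before the patch, so the old operand is executed.
    With model_prefetch=False every fetch reads memory directly, so the
    freshly patched operand is executed instead.
    """
    memory = bytearray(code)
    queue = list(memory[:4]) if model_prefetch else []

    memory[patch_addr] = patch_value  # self-modifying write hits memory only

    def fetch(ip):
        return queue[ip] if model_prefetch else memory[ip]

    if fetch(0) != MOV_CL_IMM8:
        raise ValueError("expected a mov cl, imm8 instruction")
    return fetch(1)


code = bytes([MOV_CL_IMM8, 0x10, 0x90, 0x90])  # mov cl,0x10 ; nop ; nop
with_queue = executed_cl_value(code, patch_addr=1, patch_value=0x20, model_prefetch=True)
without_queue = executed_cl_value(code, patch_addr=1, patch_value=0x20, model_prefetch=False)
```

The second case corresponds to the behaviour described for DOSBox: the instruction itself stays intact, only the loaded value is the new one.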

However, if you tried breaking in and IP isn't pointing into the mixing loop, you could instead set a breakpoint in the part of your emulator that handles writes to port 0x42. There are only a few different code locations that do this in all of 8088 MPH, so you should be able to find the one in the mixing loop fairly quickly.
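
One way to do this without a debugger breakpoint is to count the CS:IP of every write to the port from inside the emulator's port-write path. The sketch below is generic: `PortWriteTracer`, its callback signature and the synthetic trace are hypothetical and not part of UniPCemu or CAPE.

```python
from collections import Counter


class PortWriteTracer:
    """Counts which code locations write to one I/O port."""

    def __init__(self, port):
        self.port = port
        self.hits = Counter()

    def on_port_write(self, cs, ip, port, value):
        # Call this from the emulator's OUT handler (hypothetical hook point).
        if port == self.port:
            self.hits[(cs, ip)] += 1

    def report(self):
        return [
            f"{cs:04X}:{ip:04X} x{count}"
            for (cs, ip), count in self.hits.most_common()
        ]


tracer = PortWriteTracer(0x42)

# Synthetic events (cs, ip, port, value); the addresses are made up.
synthetic_trace = [
    (0x1000, 0x0100, 0x42, 0x10),
    (0x1000, 0x0200, 0x42, 0x20),
    (0x1000, 0x0200, 0x42, 0x21),
    (0x1000, 0x0200, 0x61, 0x03),  # different port, ignored
    (0x1000, 0x0200, 0x42, 0x22),
]
for event in synthetic_trace:
    tracer.on_port_write(*event)

summary = tracer.report()
```

The location that dominates the count while the credits are running would be the candidate for the mixing loop.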

Reply 92 of 198, by superfury


I'm currently searching a huge log file (1.93GB) for the POPW [CS:BX] instruction. I think it might be in the woods, as it's executing at 2004:013D. It was executing at segment 4E2F:0080 before the crash(during the final image of you(the creators of the demo) before the credits).

If only I knew where to place a breakpoint(execution breakpoint CS:IP address)...

Btw, port 42h is unreliable: This is used all throughout the demo, so it can't be used properly that easily(without needing to insert fully custom code in UniPCemu).


Reply 93 of 198, by reenigne


Yeah, 013D seems too low. Looking at the assembler log file from the last time I assembled mod.asm (which may or may not have been the final version) the "out 0x42,al" instruction ended up at IP==0x276 (no idea what CS will be since it depends on how much free conventional RAM you had when starting the demo).

Reply 94 of 198, by superfury


Any idea what file in the 8088 MPH files this is? Could the offset be obtained by examining the 8088 MPH executable or data files?
Edit: So far, excluded offsets 0x1418 and 0x141B during the start of the demo(the SALC requiring part).

Also, once I've found the credits' OUTB instruction, how many bytes should be subtracted to get the start-of-the-loop address (I'm using the 8088 MPH final version)?
Edit: (Man... Conditional breakpoints in Visual Studio 2015 are slow as hell, slowing the application down quite a lot.) It's running at 3% during the Delorean scene already. The sprite is getting even/odd becoming background every other frame it seems.
Edit: Temporarily disabling the breakpoint during that scene (re-enabling it afterwards) speeds the application back up to the normal ~37% speed.
Edit: Flower girl 45%...
Edit: Racing the beam 5%. Turning it off again...
Edit: Credits starting: 2004:002B. Then 2004:0294 lots of times.

Edit: So it seems to be at 2004:0294?


Reply 95 of 198, by superfury


Could 2004:0294 be the correct location of the OUTB instruction? It's the last one seen before the crash.

Attachment: 360_Delorean sprite, even-oddframes.zip (41.4 KiB, 51 downloads)
File comment: Delorean sprites, first and second frame captured (easy, since it was running so slow due to the conditional breakpoint of Visual Studio).
File license: Fair use/fair dealing exception

Edit: Little warning: it's also an XLAT during the Intel logo part.
Edit: Also, MOV AL,54 during the next part.
Edit: Probably best to only enable it once I get to the image of the 8088 MPH creators' faces.

Edit: Thinking about the code again, I just need to look at the LOOP instruction offset following it: it points directly to the start of the loop(the v, which, after 14 more bytes, is loopTop):P According to http://www.reenigne.org/blog/8088-pc-speaker- … r-how-its-done/

Edit: Gotcha, got the breakpoint on the correct spot. (Credits initial screen is already visible though, is that correct?)
Edit: The next instruction is LOOP 025D, so at 2004:025D is the initial v-instruction, as you called it.

Last edited by superfury on 2017-04-28, 14:50. Edited 1 time in total.


Reply 96 of 198, by reenigne

superfury wrote:

Could 2004:0294 be the correct location of the OUTB instruction? It's the last one seen before the crash.

Yes, that looks right. In that case the label "v" will be at 025D and "mixPatch" will be at 026C.
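
The address arithmetic behind this can be sketched as follows. It assumes the LOOP sits directly after the 2-byte `out 0x42,al` at 0294 (so at 0296), which fits the "next instruction is LOOP 025D" observation earlier in the thread; the helper names are made up.

```python
def linear_address(segment, offset):
    """Real-mode physical address for segment:offset (20-bit wrap)."""
    return ((segment << 4) + offset) & 0xFFFFF


def short_branch_target(ip, rel8, length=2):
    """Target of a 2-byte short branch (such as LOOP rel8) located at ip."""
    next_ip = (ip + length) & 0xFFFF
    displacement = rel8 - 0x100 if rel8 & 0x80 else rel8  # sign-extend
    return (next_ip + displacement) & 0xFFFF


def rel8_for(ip, target, length=2):
    """Displacement byte needed for a short branch at ip to reach target."""
    return (target - (ip + length)) & 0xFF


OUT_IP = 0x0294          # "out 0x42,al", 2 bytes long
LOOP_IP = OUT_IP + 2     # assumed position of the LOOP instruction
V_LABEL = 0x025D

loop_rel8 = rel8_for(LOOP_IP, V_LABEL)
loop_target = short_branch_target(LOOP_IP, loop_rel8)
out_linear = linear_address(0x2004, OUT_IP)
```

So a breakpoint on the OUT is enough: decoding the displacement of the LOOP that follows gives the address of "v" without needing the assembler listing.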

Reply 97 of 198, by superfury


A simple question now: At which point does the execution of that block begin? At the v-label(025D) or at loopTop(026C)? I'd think 025D(Totalling 288 cycles)?
Edit: Now running to that point(1641 metric cycles)...
Edit: Will still need to enable the breakpoint once I get to the final parts.
Edit: For some strange reason, there's no even/odd problem at the Delorean car anymore now(as far as I can see at the normal framerate)?
Edit: Bingo! It starts out as a NOP. So it's working correctly at that point. Now just to switch to always log mode and wait for quite a few loops to accumulate. I'll post it once I've got it.

Last edited by superfury on 2017-04-28, 15:15. Edited 1 time in total.


Reply 98 of 198, by reenigne

superfury wrote:

A simple question now: At which point does the execution of that block begin? At the v-label(025D) or at loopTop(026C)? I'd think 025D(Totalling 288 cycles)?

Both! The "loop" instruction loops back to v and the final "jmp" instruction jumps back to loopTop, so that the routine always takes 288 cycles whether or not the patching code is run. The first time through it starts at v.

Reply 99 of 198, by superfury


So I was right: it starts at the v-location. It's now logging the data...
Edit: Got the data. Now compressing and uploading...
