VOGONS


First post, by superfury


Does anyone know the exact timing of the AdLib status port reads?
As in, does it signal _READY to the CPU? How many cycles does that take? Is it based on 14.31818MHz divided by 288?

The documentation says the following (https://bochs.sourceforge.io/techspec/adlib_sb.txt):

After writing to the register port, you must wait twelve cycles before
sending the data; after writing the data, eighty-four cycles must elapse
before any other sound card operation may be performed.

| The AdLib manual gives the wait times in microseconds: three point three
| (3.3) microseconds for the address, and twenty-three (23) microseconds
| for the data.
|
| The most accurate method of producing the delay is to read the register
| port six times after writing to the register port, and read the register
| port thirty-five times after writing to the data port.

Edit: The documentation says 6 and 36 status register reads? https://www.vgmpf.com/Wiki/images/4/48/AdLib_ … mming_Guide.pdf

Because of the nature of the card, you must wait 3.3 µsec after a register
select write, and 23 µsec after a data write. Best way to handle this is to read
ALMSC status register in loop, because bus speed is always the same regardless
of processor speed. 6 reads after register select write and 36 reads after data
write should do the job.

So:
3.3 µsec is 6 reads.
23 µsec is 36 reads.

So how many µsec is 1 status port read? The chip's sample period (14.31818MHz divided by 288) is roughly 20.114 µsec?

What does this say about read timings on the AdLib status port? It clearly isn't the speed of the 288 division of 14MHz?
If 6 reads is a 3.3 µsec delay and 36 reads is a 23 µsec delay, what read timing does the status register use?
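
The arithmetic behind the question can be sketched as follows. This is only an illustration using the figures quoted above. The helper names are made up, and it assumes every status read takes the same amount of time.

```python
# Sketch: average time per status read implied by the quoted guideline.
# All figures come from the post above; the helper names are hypothetical.

OSC_HZ = 14_318_180        # 14.31818 MHz oscillator
SAMPLE_DIVIDER = 288       # divider mentioned for the chip's sample rate

# (required delay in microseconds, suggested number of status reads)
GUIDELINE = [(3.3, 6), (23.0, 36)]


def implied_read_time_us(delay_us, reads):
    """Average duration of one read if `reads` reads must cover `delay_us`."""
    return delay_us / reads


def sample_period_us():
    """Period of the oscillator divided by 288, in microseconds."""
    return SAMPLE_DIVIDER / OSC_HZ * 1e6


if __name__ == "__main__":
    for delay_us, reads in GUIDELINE:
        per_read = implied_read_time_us(delay_us, reads)
        print(f"{delay_us} us / {reads} reads = {per_read:.3f} us per read")
    print(f"sample period = {sample_period_us():.3f} us")
```

The two guideline entries imply different per-read times, and both are far shorter than the sample period, which is why the 288 divider cannot be what the read count is based on.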

Last edited by superfury on 2025-06-03, 17:17. Edited 1 time in total.

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 1 of 6, by mkarcher


I don't remember seeing a sound card that drives IOCHRDY for OPL2/OPL3 access. The status read is usually performed as a default wait-state 8-bit I/O read. Assuming that 35 reads is around 23 microseconds, the suggestion is based on a duration of around 660ns for a single status read. On the original PC/XT platform, an ISA cycle without extra wait states took 838ns, so this suggestion would be overly conservative. The "Intel ISA bus specification 2.01" specifies at least 530ns active time for the read command and 187ns recovery time before the next command, which is 717ns minimum cycle time. A conforming mainboard has to add that many wait states to 8-bit read/write cycles, unless the card interacts with /0WS.

As the specified timing for the ISA bus of 717ns period is still (around 10%) slower than the required 660ns, there is some margin in that suggestion. Even more margin if we consider that a loop would encounter extra overhead, and 23 "IN AL, DX" instructions do not fit the prefetch queue, so (up to the 386) there will be extra bus overhead to fetch instructions if the loop is unrolled.
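
A minimal sketch of the numbers in this reply, using only the figures given here (four bus clocks per XT cycle at 14.31818MHz / 3, and 530ns + 187ns from the ISA specification). The function names are hypothetical.

```python
# Sketch: compare the read time the guideline needs with actual bus cycle times.

OSC_HZ = 14_318_180
XT_CLOCK_HZ = OSC_HZ / 3          # 4.77 MHz PC/XT clock
XT_CLOCKS_PER_IO = 4              # I/O cycle without extra wait states

ISA_MIN_ACTIVE_NS = 530           # minimum read command active time
ISA_MIN_RECOVERY_NS = 187         # minimum recovery time before next command


def xt_cycle_ns():
    """Duration of one I/O cycle on the original PC/XT, in nanoseconds."""
    return XT_CLOCKS_PER_IO / XT_CLOCK_HZ * 1e9


def isa_min_cycle_ns():
    """Minimum 8-bit cycle time according to the quoted specification."""
    return ISA_MIN_ACTIVE_NS + ISA_MIN_RECOVERY_NS


def required_read_ns(delay_us, reads):
    """Per-read duration needed so that `reads` reads cover `delay_us`."""
    return delay_us * 1000 / reads


def margin(cycle_ns, required_ns):
    """Fraction by which a real cycle exceeds the required per-read time."""
    return cycle_ns / required_ns - 1


if __name__ == "__main__":
    need = required_read_ns(23.0, 35)
    print(f"required per read: {need:.0f} ns")
    print(f"XT cycle: {xt_cycle_ns():.0f} ns, margin {margin(xt_cycle_ns(), need):.1%}")
    print(f"ISA minimum: {isa_min_cycle_ns()} ns, margin {margin(isa_min_cycle_ns(), need):.1%}")
```

Loop and instruction-fetch overhead would only add to these margins; the sketch ignores it.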

OTOH, the CMS chips on the GameBlaster provide a RDY output which is used to drive the IOCHRDY line, but they don't take nearly as long as the OPL2 to be ready again. Blocking the ISA bus for 23µs is out of spec anyway (the ISA specification claims 15.6µs max).

Reply 2 of 6, by superfury


But it's 23µs using 36 reads (and 3.3µs with 6 reads apparently), so that wouldn't be out-of-spec? Roughly 0.63µs (for each of the 36 reads) or 0.55µs (for each of the 6 reads) on average?
I'd assume it's based on the 14.31818MHz bus clock, as the Adlib doesn't have any other sources of clocks? I know that its sample rendering clock is that clock divided by 288 (roughly 49716Hz), but how would it be used to drive IOCHRDY if it did?
Or was it just based on the ISA bus timings used for any I/O port?

The XT ISA bus is a division of 3 on the 14.31818MHz clock, so that's roughly 0.2µs each tick, so that's 15-16 ticks for 3.3µs or 109-110 ticks for 23µs.
The base bus cycle on an 8088 is 4 CPU clocks, so combined with the division by 3 that's 12 cycles of the 14.31818MHz clock, excluding any waitstates, if they were added at all.
Although the CPU itself also adds some extra clocks for the instruction and prefetching. But that would make the specification too irregular to use usually (unless you'd start cycle-counting).
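
The tick counts above can be reproduced with a short sketch. It assumes the 4.77MHz clock is exactly the oscillator divided by 3 and that a bus cycle is 4 CPU clocks with no wait states, as stated in the post; the names are hypothetical.

```python
# Sketch: express the AdLib delays in 4.77 MHz CPU ticks.

OSC_HZ = 14_318_180
CPU_DIVIDER = 3                    # 14.31818 MHz / 3 = 4.77 MHz


def cpu_tick_us():
    """Length of one CPU clock tick in microseconds."""
    return CPU_DIVIDER / OSC_HZ * 1e6


def ticks_for(delay_us):
    """Number of CPU ticks (possibly fractional) that span `delay_us`."""
    return delay_us / cpu_tick_us()


def osc_clocks_per_bus_cycle(cpu_clocks=4):
    """Oscillator clocks in one bus cycle of `cpu_clocks` CPU clocks."""
    return cpu_clocks * CPU_DIVIDER


if __name__ == "__main__":
    print(f"tick = {cpu_tick_us():.4f} us")
    for delay_us in (3.3, 23.0):
        print(f"{delay_us} us = {ticks_for(delay_us):.2f} ticks")
    print(f"oscillator clocks per bus cycle: {osc_clocks_per_bus_cycle()}")
```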

My implemented hardware in theory (at the CPU side) supports IOCHRDY to implement waitstates, but right now the CGA is the only chip that actually uses it.
It's basically just like the real pin, keeping the CPU from progressing to T4 (or T2 on 286+) state until a certain clock modulus is reached (on the CGA that is).
The EGA and newer chips don't implement any waitstates at all, always clearing IOCHRDY.
Other hardware like the SAA-1099 (Game Blaster or Sound Blaster) also don't implement any waitstates yet.
Neither does the OPL2. Any delay comes only from the CPU clock running at its specified speed (instruction timings are implemented up to the 80386; 80486 and newer instructions all execute in 1 cycle, excluding BIU timings, which are 1 cycle anyway on 486+).

Edit: My bad, 14.31818MHz is actually incorrect as it's used in my emulator. It's actually (15.75/1.1) MHz exactly.
The Game Blaster for example simply divides it by 2 for its base rate. That's divided by 256, 512 or 1024 for noise and 512 for melodic voices.
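
As a rough illustration of the dividers named in this post, the exact clock and the derived rates can be computed with rational arithmetic. The divider values are taken from the post as stated and are not checked against a datasheet here; the names are hypothetical.

```python
# Sketch: exact oscillator frequency and the rates derived from it.
from fractions import Fraction

# (15.75 / 1.1) MHz expressed exactly in Hz
OSC_HZ = Fraction(157_500_000, 11)

OPL2_SAMPLE_DIVIDER = 288          # sample rate divider mentioned earlier
GAME_BLASTER_BASE_DIVIDER = 2      # base rate divider, as stated in the post
NOISE_DIVIDERS = (256, 512, 1024)  # noise dividers, as stated in the post


def opl2_sample_rate_hz():
    """Oscillator divided by 288."""
    return OSC_HZ / OPL2_SAMPLE_DIVIDER


def game_blaster_base_hz():
    """Oscillator divided by 2."""
    return OSC_HZ / GAME_BLASTER_BASE_DIVIDER


def noise_rates_hz():
    """Base rate divided by each noise divider."""
    return [game_blaster_base_hz() / d for d in NOISE_DIVIDERS]


if __name__ == "__main__":
    print(f"oscillator: {float(OSC_HZ):.3f} Hz")
    print(f"OPL2 sample rate: {float(opl2_sample_rate_hz()):.3f} Hz")
    for d, rate in zip(NOISE_DIVIDERS, noise_rates_hz()):
        print(f"noise /{d}: {float(rate):.1f} Hz")
```

Using `Fraction` keeps the repeating decimal of the oscillator frequency exact until the final conversion to a float.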


Reply 3 of 6, by mkarcher

superfury wrote on 2025-06-03, 17:48:

But it's 23µs using 36 reads (and 3.3µs with 6 reads apparently), so that wouldn't be out-of-spec? Roughly 0.63µs (for each of the 36 reads) or 0.55µs (for each of the 6 reads) on average?

I'm sorry, I don't understand what you mean. Your calculations are correct. I cited the ISA specification that claims that a read needs to take at least 0.71µs, so following the advice to do 36 or 6 reads, this will be in spec. The requirement is not to wait exactly 23µs, but at least 23µs.

superfury wrote on 2025-06-03, 17:48:

I'd assume it's based on the 14.31818MHz bus clock, as the Adlib doesn't have any other sources of clocks?

The timing requirements of the OPL2 chip (23µs / 3.3µs) are definitely based on a certain number of cycles of the 14.318 MHz clock.
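
One way to connect the cycle counts from the first post (twelve and eighty-four cycles) to the microsecond figures is sketched below. It ASSUMES that those are cycles of the sound chip's input clock and that this clock is the 14.31818MHz oscillator divided by 4; neither is stated in this thread, so treat the divider as a guess.

```python
# Sketch: convert the quoted cycle counts to microseconds.

OSC_HZ = 14_318_180
CHIP_CLOCK_DIVIDER = 4    # ASSUMPTION: chip clock = oscillator / 4 (not from the thread)

ADDRESS_WAIT_CYCLES = 12  # "twelve cycles" after a register port write
DATA_WAIT_CYCLES = 84     # "eighty-four cycles" after a data port write


def chip_cycles_to_us(cycles, divider=CHIP_CLOCK_DIVIDER):
    """Duration of `cycles` chip clock cycles in microseconds."""
    return cycles * divider / OSC_HZ * 1e6


def chip_cycles_to_osc_cycles(cycles, divider=CHIP_CLOCK_DIVIDER):
    """The same duration expressed in oscillator cycles."""
    return cycles * divider


if __name__ == "__main__":
    for name, cycles in (("address", ADDRESS_WAIT_CYCLES), ("data", DATA_WAIT_CYCLES)):
        print(f"{name}: {cycles} chip cycles = "
              f"{chip_cycles_to_osc_cycles(cycles)} oscillator cycles = "
              f"{chip_cycles_to_us(cycles):.2f} us")
```

Under that assumption the results land close to the 3.3µs and 23µs given in the manual.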

superfury wrote on 2025-06-03, 17:48:

but how would it be used to drive IOCHRDY if it did? Or was it just based on the ISA bus timings used for any I/O port?

You can find lots of photos of the original AdLib card online. IOCHRDY is at position A1 on the ISA bus, and that contact is not populated on the AdLib card. The quoted advice to perform 36 / 6 reads is indeed just based on the default wait state amount generated by the main board for any ISA port.

superfury wrote on 2025-06-03, 17:48:

My implemented hardware in theory (at the CPU side) supports IOCHRDY to implement waitstates, but right now the CGA is the only chip that actually uses it.
It's basically just like the real pin, keeping the CPU from progressing to T4 (or T2 on 286+) state until a certain clock modulus is reached (on the CGA that is).
The EGA and newer chips don't implement any waitstates at all, always clearing IOCHRDY.

If EGA doesn't implement wait states, you will get way faster EGA performance in your emulator than you do on real hardware. Actually, EGA is not significantly faster than CGA regarding video memory access.

superfury wrote on 2025-06-03, 17:48:

Other hardware like the SAA-1099 (Game Blaster or Sound Blaster) also don't implement any waitstates yet.

For general-purpose emulators, emulating the wait states is not required. Unless CRTC and Hornet decide to write a successor to Area 5150 that includes GameBlaster music ("we break all your emulators, AGAIN!"), there will likely be no software that requires SAA-1099 wait states to be executed properly. The physical SAA-1099 chips generate a ready signal and they require that this signal is obeyed.

Reply 4 of 6, by superfury


Then how does ISA timing work? I know the bus is apparently running at 14MHz, but is anything known how timing behaves with inserted waitstates on 808x/Vx0 or 286+ CPUs running at various speeds? Simply wait for the next tick or tick multiple (as in modulo 3 being zero for example to get 808x timings to finish, otherwise wait for next finish on the ISA bus)?


Reply 5 of 6, by GloriousCow

superfury wrote on 2025-06-09, 17:17:

Then how does ISA timing work? I know the bus is apparently running at 14MHz, but is anything known how timing behaves with inserted waitstates on 808x/Vx0 or 286+ CPUs running at various speeds? Simply wait for the next tick or tick multiple (as in modulo 3 being zero for example to get 808x timings to finish, otherwise wait for next finish on the ISA bus)?

It likely depends on the system you're emulating. The 5150 and 5160 enforce one wait state for all I/O, but also respect the IO_READY line, which comes off the ISA bus.

MartyPC: A cycle-accurate IBM PC/XT emulator | https://github.com/dbalsom/martypc

Reply 6 of 6, by mkarcher

superfury wrote on 2025-06-09, 17:17:

Then how does ISA timing work? I know the bus is apparently running at 14MHz,

No, not at all. There is the 14.318MHz oscillator signal on the ISA bus, but that signal (while originally being thrice the processor clock) is not defined to relate to any bus timing, it's just there in case you think a 14.318MHz clock is useful for your card, like the CGA card, which "just happens" to use this frequency as dot clock.

The clock that is used to derive the ISA timings is also on the bus, it is called "CLK". Standard ISA cards do not need to care about it, though. For example, on a port read, the port address is placed on the bus, and after some setup time passed, the /IOR line is asserted. The device is then supposed to place the value on the bus within a specified amount of nanoseconds (IIRC around 400), or, if it is unable to do so, pull IOCHRDY low until the data is present. The mainboard may sample the data as soon as the minimum time is over and IOCHRDY is not pulled low. After the mainboard has sampled the data from the bus (actually, the chip doing the sampling might be the CPU), it stops asserting /IOR. This causes the card to stop driving the data to the bus.

Please note that this description of a typical ISA cycle has exactly zero occurrences of the word "clock" or "clock cycles" or even "wait states". While it is true that there are clock cycles on the mainboard and the processor will perform wait states until the data may be read, that's not how the specification for the card works. In fact, there is a minimum amount of wait states (zero on the XT bus at 4.77MHz, but be aware that a 0WS XT cycle takes four bus clocks, while a 0WS AT cycle takes two bus clocks), and after the minimum number of wait states have expired, IOCHRDY is polled once per clock. Lowering IOCHRDY thus prolongs the cycle by an integer number of bus clocks, which is seen as extra wait states by the CPU.
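
For an emulator, the mainboard-side view described above might be modelled roughly like this. It is a simplified sketch, not a cycle-exact model: it assumes IOCHRDY is sampled on every clock once the minimum cycle length has elapsed and ignores setup times. All names and parameters are hypothetical.

```python
# Sketch: length of one I/O bus cycle as seen by the mainboard.


def bus_cycle_clocks(base_clocks, min_wait_states, ready_at_clock=0):
    """Return the total number of bus clocks for one I/O cycle.

    base_clocks      -- clocks of a zero-wait-state cycle (4 on XT, 2 on AT)
    min_wait_states  -- wait states the mainboard always inserts
    ready_at_clock   -- clock (counted from cycle start) at which the card
                        stops pulling IOCHRDY low; 0 means it never does
    """
    clocks = base_clocks + min_wait_states
    # After the minimum length, IOCHRDY is polled once per clock; each poll
    # that still finds it low adds exactly one more wait state.
    while clocks < ready_at_clock:
        clocks += 1
    return clocks


def cycle_ns(clocks, bus_hz):
    """Convert a clock count to nanoseconds for a given bus clock."""
    return clocks / bus_hz * 1e9


if __name__ == "__main__":
    xt_hz = 14_318_180 / 3
    plain = bus_cycle_clocks(base_clocks=4, min_wait_states=0)
    stretched = bus_cycle_clocks(base_clocks=4, min_wait_states=0, ready_at_clock=7)
    print(f"XT, card never pulls IOCHRDY: {plain} clocks, {cycle_ns(plain, xt_hz):.0f} ns")
    print(f"XT, card ready at clock 7: {stretched} clocks, {cycle_ns(stretched, xt_hz):.0f} ns")
```

A card that never touches IOCHRDY, like the AdLib, always gets the default cycle length; only cards that pull the line low stretch the cycle, and then only in whole bus clocks.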

but is anything known how timing behaves with inserted waitstates on 808x/Vx0 or 286+ CPUs running at various speeds?

It's different depending on the system. A 4.77MHz 8088/V20 generally operates with no wait states (the bus is still quite slow...), my Turbo XT at 10MHz inserts 1WS for cycles that don't target system RAM. 80286 systems generally insert wait states for XT compatibility, as the 80286 needs fewer clocks for a bus cycle, and usually runs at least at 6MHz. ATs at 6 and 8MHz operate the ISA bus at processor clock frequency. There also is a new signal to cancel the default wait states, which is usually named /0WS or /NOWS. This signal is actually the only signal which has timing specifications relative to CLK. With system clocks at 12MHz or higher, different solutions emerged. My first 286 system dropped the processor clock from 12MHz to 8MHz if a bus cycle targeting an ISA card was detected. Other systems have a dedicated bus clock that is either derived from the system clock (e.g. divide-by-2) or the ISA bus is coupled entirely asynchronously.

Then there is the Deskpro 386/20 (you can find the service manual on pcjs.org), which has a "pseudo-synchronous" 8MHz ISA bus. The ISA bus usually runs at 8MHz, but if the mainboard logic recognizes a pending ISA cycle on the FSB, a clock might be slightly delayed to not miss the clock, and avoid excessive synchronization delays.