feipoa wrote on 2024-05-04, 02:29:
Is there any performance benefit in using single-mode DMA vs., presumably, multi-mode DMA?
There is a performance drawback in using single-mode DMA. The full name would be "single-cycle mode DMA". Single mode is the simplest DMA mode, and using it for high-bandwidth applications is not desirable. For an expansion card, the modes look very similar, and many sound chips are likely designed to work in both the "single cycle" and the "demand" DMA mode. Your presumption "multi-mode DMA" is not that far off if you are thinking of IDE Multi-Word DMA: the IDE DMA mode (later called single-word DMA mode) is based on the 8237 "single-cycle mode", and the IDE multi-word DMA mode is based on the 8237 "demand mode".
There is one key difference between the single-cycle and the demand mode: The original 8237 is "too stupid" to interrupt an ongoing DMA transaction, even if a higher-priority DMA request comes in. So once a device has its DMA request granted, the DMA controller will only serve this device until "the transaction is over". The difference between "single-cycle mode" and "demand mode" is how it is decided whether "the transaction is over": In single-cycle mode, as the name says, every grant performs exactly one cycle (8 bits on DMA 0-3, 16 bits on DMA 5-7), and then "the transaction is over". If the device still requests DMA, a new arbitration happens, and unless a higher-priority request is pending, another single cycle is granted to that device. This re-arbitration is not just inside the DMA controller: at the end of every DMA transaction, the DMA controller relinquishes bus ownership, and needs to re-arbitrate for the bus itself. As an arbitration happens after every cycle, this is the slowest possible DMA mode.

Demand mode behaves slightly differently: It works just like single-cycle mode, but at some point before the end of that single cycle, the DMA request line of the device currently in service is checked, and if it is still asserted, another DMA cycle is performed without re-arbitration.
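To make the two modes concrete, here is a minimal sketch of composing the 8237 mode-register byte that selects between them. The port numbers and bit layout are the standard 8237/PC-AT assignments; the helper function itself is hypothetical, not any particular driver's code.

```python
# Sketch: composing 8237 mode-register bytes for single vs. demand mode.
# Standard 8237 mode byte layout: bits 7-6 = mode select (00 demand,
# 01 single, 10 block, 11 cascade), bit 4 = auto-initialize,
# bits 3-2 = transfer type, bits 1-0 = channel within the controller.

DEMAND_MODE = 0b00 << 6
SINGLE_MODE = 0b01 << 6
WRITE_TRANSFER = 0b01 << 2   # device writes to memory (e.g. recording)
READ_TRANSFER  = 0b10 << 2   # device reads from memory (e.g. playback)
AUTOINIT       = 1 << 4

def mode_byte(channel, mode, transfer, autoinit=True):
    """Mode byte to write to the 8237 mode register (port 0x0B for
    channels 0-3, port 0xD6 for channels 4-7, which use channel & 3)."""
    return mode | transfer | (AUTOINIT if autoinit else 0) | (channel & 3)

# DMA channel 1 playback, single mode vs. demand mode:
print(hex(mode_byte(1, SINGLE_MODE, READ_TRANSFER)))  # 0x59
print(hex(mode_byte(1, DEMAND_MODE, READ_TRANSFER)))  # 0x19
```

The only difference between the two configurations is in bits 7-6; everything else about the channel setup stays the same, which is why many chips can work in either mode.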
To understand why there is an issue about (re-)arbitration, and why the preceding paragraph is relevant for the PC architecture, you need to understand that on the classic PC/AT architecture, memory refresh is reflected on the ISA bus, to assist ISA-based memory expansion cards. A memory refresh cycle is requested every 15.6µs, but it can't take place while the bus is owned by the DMA controller. On the PC, memory refresh uses a DMA request to channel 0, the highest-priority channel on the 8237, to claim bus ownership. On the AT, the memory refresh logic is dedicated, and can only take ownership if no DMA transaction is active. In either case, there won't be a memory refresh cycle interrupting a DMA transaction, so any ISA device using DMA has to make sure that no refresh cycles get lost due to blocking the bus for too long. While I don't have any hard specification at hand right now, the accepted maximum DMA transaction time is around 13µs to 15µs. With single-mode DMA, this is not an issue at all: no single cycle may take that long (you can make cycles take that long by adding enough wait-states, but that is also forbidden, independent of DMA design), and once that single cycle is over, re-arbitration of the bus will pass it over to DMA0 (on the PC) or stall the DMA controller's re-grant of the bus (on the AT) until the refresh cycle completes. On the other hand, in demand mode, the device is responsible for not keeping its request line active for more than 15µs.
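To put numbers on that budget, here is a back-of-the-envelope calculation. The 4MHz DMA clock and 4 clocks per cycle used below are typical AT-class figures (and match the figures in the FIFO discussion further down), not a hard specification:

```python
# Back-of-the-envelope: how many demand-mode DMA cycles fit between
# two DRAM refresh requests?
REFRESH_PERIOD_US = 15.6      # refresh cycle requested every 15.6 microseconds
DMA_CLOCK_MHZ = 4.0           # typical AT-class DMA clock (assumption)
CLOCKS_PER_CYCLE = 4          # typical 8237 cycle length without wait-states

cycle_time_us = CLOCKS_PER_CYCLE / DMA_CLOCK_MHZ    # time per DMA cycle
max_burst = int(REFRESH_PERIOD_US / cycle_time_us)  # cycles before refresh is due

print(cycle_time_us)  # 1.0 (microseconds per cycle)
print(max_burst)      # 15
```

At roughly 1µs per cycle, a demand-mode burst of about 15 cycles already fills the whole refresh interval, which is exactly why a 16-deep FIFO (discussed next) is borderline.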
In the case of sound cards, the 15µs limit is not an issue, because the sound card only uses demand-mode bursts to fill an internal FIFO of typically 16 items. Transferring 16 items at a 4MHz DMA clock takes 16µs if every cycle takes 4 clocks, which would exceed the limit, but usually you don't hit a completely empty FIFO every time DMA is requested, so this works out in practice. The idea of sound cards using a FIFO is that you rely less on DMA latency: the higher the sample rate, the more important it is to have a FIFO to be able to continue playback while you can't own the bus. As soon as a sound card has a FIFO (e.g. on Creative Labs cards, starting with the SB16), it makes sense to run the DMA controller in demand mode to reduce bus load by avoiding frequent re-arbitrations. This is especially important if a card plays 16-bit stereo samples through an 8-bit DMA channel. You get corrupted sound playback in two cases:
- The DMA controller is set to single mode, and the bus overhead caused by using single mode DMA prevents the card from getting the required throughput. Many cards don't handle buffer underflow gracefully, and especially with 16-bit samples, if a single 8-bit transfer is missed, the card starts mixing up high bytes and low bytes, causing a lot of static-like noise. This is an issue for WSS-type cards (AD1848/CS423x).
- The DMA controller is set to demand mode, but the timing for relinquishing the DMA request is off, such that the DMA controller performs an extra cycle while the card has already tried to end the transaction. This extra cycle on a full FIFO may cause a single byte to get lost, causing the same misalignment issue as described above. The root cause for "the timing is off" might be either on the mainboard side (sampling DRQ too early) or on the card side (negating DRQ too late), and the DMA controller clock clearly influences how early it samples DRQ.
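The byte-misalignment failure mode from both cases above is easy to demonstrate with a toy simulation (the sample values are made up, and this is a model of the data stream, not of any real card):

```python
# Toy demo: losing one byte of a 16-bit-over-8-bit DMA stream swaps
# high and low bytes for the rest of the buffer.
import struct

samples = [100, 200, 300, 400, 500]               # 16-bit samples
stream = bytearray(struct.pack('<5h', *samples))  # as the 8-bit DMA byte stream

# Simulate one lost 8-bit transfer (missed cycle / extra cycle on full FIFO):
del stream[4]                                     # drop low byte of sample 3
stream.append(0)                                  # buffer length stays the same

garbled = list(struct.unpack('<5h', bytes(stream)))
print(garbled)  # samples before the lost byte are fine, the rest is noise
```

Everything up to the lost byte decodes correctly; from that point on, every sample is reassembled from one sample's high byte and the next sample's low byte, which is exactly the static-like noise described above.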
While there are a lot of appropriate criticisms of the architecture of Creative Labs SoundBlaster cards, this is one thing Creative seems to have gotten right with their SoundBlaster 16 architecture: it handles DMA underruns during playback gracefully by simply repeating the last sample.
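A minimal sketch of that graceful-underrun behavior, modelled in software (the FIFO class is hypothetical; only the repeat-on-underrun policy comes from the description above):

```python
from collections import deque

class PlaybackFifo:
    """Toy model: the DAC side pulls one sample per tick; on an empty
    FIFO it repeats the last sample (SB16-style) instead of glitching."""
    def __init__(self):
        self.fifo = deque()
        self.last = 0

    def dma_fill(self, samples):
        """Bus side: a demand-mode burst refills the FIFO."""
        self.fifo.extend(samples)

    def dac_tick(self):
        """DAC side: one sample per sample period."""
        if self.fifo:
            self.last = self.fifo.popleft()
        # else: underrun -> keep outputting self.last
        return self.last

f = PlaybackFifo()
f.dma_fill([10, 20, 30])
out = [f.dac_tick() for _ in range(5)]  # the last two ticks run dry
print(out)  # [10, 20, 30, 30, 30]
```

Repeating the last sample holds the DAC output steady through an underrun, which is far less audible than emitting whatever garbage happens to be next in memory.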
So, what is the take-away for you? If a driver doesn't offer a choice between single-cycle mode and demand mode, the only way to use the other mode is to patch the driver at the point where it requests ISA DMA (either by programming the DMA controller directly, or by requesting it from a kernel-level DMA driver). Also, if you have issues with demand-mode DMA that vanish when using single-mode DMA, an overclocked DMA controller (tied to the ISA clock!) or a bad contact on the DRQ line hampering signal response time might be the root cause.