douglar wrote on 2024-03-01, 13:25: This guy blames IBM, not Intel-- […]
rasz_pl wrote on 2024-03-01, 12:03:
Thank you for the info on MFM controllers. So after all there were three legitimate users of ISA DMA - floppy, MFM until the AT, and sound cards because Creative was lazy 😀.
...
Intel was, and is again with Gelsinger back, famously run by engineers. Can't play ignorance with Andy Grove at the wheel 😀
This guy blames IBM, not Intel--
[...]
In the beginning there was a PC, but the PC was slow. IBM looked down from the heavens and said "Slap on a DMA controller -- that should speed it up." IBM's heart was in the right place; its collective brains were elsewhere as the DMA controller never met the needs of the system. The PC/AT standard contains 2 Intel 8237A DMA chips, connected as Master/Slave. The second chip is Master, and its first line (Channel 4) is used by the first chip, which is Slave. (This is unlike the interrupt controller, where the first chip is Master.) The 8237A was designed for the old 8080 8-bit processor and this is probably the main reason for so many DMA problems. The 8088 and 8086 processors chosen by IBM for its PC were too advanced for the DMA controller.
Reading this comment, you should keep in mind that this paragraph applies to the AT. DMA in the PC/XT was a mostly adequate solution considering the general performance of the system; the DMA controller in the PC/XT did actually speed it up. The 64K pagination was annoying on the XT, but nothing that couldn't be worked around in software. I guess (pure speculation) that IBM originally intended to run the DMA subsystem in the AT at the system clock rate of 6MHz, but discovered that some XT hardware designed for 4.77 MHz did not work reliably with the DMA timings that result from clocking the DMA controller at 6MHz, or that IBM could not source 8237A chips specified for more than 5 MHz. Faster chips did exist at some point, called 82C37A (CMOS process instead of NMOS process), but I didn't research the market availability of these chips at the introduction of the AT.
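To illustrate what "worked around in software" means for the 64K pagination: before programming a channel, a driver just has to check whether the physical buffer crosses a 64 KiB page. A minimal sketch (the function name and the bounce-buffer remark are mine, not from any specific BIOS or driver):

```python
def crosses_64k_page(phys_addr: int, length: int) -> bool:
    """Check whether a DMA buffer would cross a 64 KiB page boundary.

    The 8237A only increments address bits A0..A15; the upper address
    bits come from a separate page register, so one transfer must stay
    inside a single 64 KiB page.
    """
    return (phys_addr & 0xFFFF) + length > 0x10000

# A driver would typically fall back to a "bounce buffer" that is known
# not to cross a page boundary when this check fires.
print(crosses_64k_page(0xFFF0, 0x20))     # straddles a page -> True
print(crosses_64k_page(0x20000, 0x10000)) # exactly one full page -> False
```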
The primary issue of the AT is not that the 8088 is "too advanced" for the 8237 (while this statement is a valid opinion, it is not the primary issue). Rather, the 80286 (not the 8086) is not sufficiently assisted by a DMA controller operating at half the bus clock: with REP INSW as a "DMA substitute", its frontside bus allows a bus cycle every 2 clocks (3 clocks in the AT, which inserts 1 wait state), whereas the 8088 required 4 clocks per bus cycle. Furthermore, with the advancement of technology, typical data set sizes rose too, and the 64K/128K barrier of the DMA pagination started to hurt harder than it did in the PC. Starting with the 80386 and its virtual memory / paging support, the statement that "the 8237(A) is not advanced enough to be useful for more demanding applications than playing back digitized sound" became true - but that is around 7 years after IBM designed the 8237 into the IBM PC.
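To put rough numbers on that comparison (a sketch using the cycle counts from this paragraph; the 6 MHz clock and the assumption of one wait state on the I/O cycle as well are mine, real I/O cycles may carry more wait states):

```python
# Back-of-the-envelope: REP INSW vs. 8237A demand mode on a 6 MHz AT.
BUS_MHZ = 6.0

def cycle_ns(clock_mhz: float, clocks: int) -> float:
    return clocks / clock_mhz * 1000.0

# 80286: 2 clocks + 1 wait state = 3 clocks per bus cycle. One REP INSW
# iteration needs two bus cycles (an I/O read plus a memory write).
insw_word_ns = 2 * cycle_ns(BUS_MHZ, 3)   # 1000 ns per word

# 8237A demand mode: 4 DMA clocks per word, DMA clock = half the bus clock.
dma_word_ns = cycle_ns(BUS_MHZ / 2, 4)    # ~1333 ns per word

print(f"REP INSW: {insw_word_ns:.0f} ns/word, DMA: {dma_word_ns:.0f} ns/word")
```

Even under these simplifying assumptions, programmed I/O on the 286 beats the half-clocked 8237A per word, which is exactly why the DMA controller stopped being a speed-up on the AT.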
douglar wrote on 2024-03-01, 21:56:
Seems like in order to get an IDE drive working in multi-word DMA mode, you would have to have a VLB or PCI controller, yes?
https://en.wikipedia.org/wiki/WDMA_(computer)
While it looks like "DMA 0" could be implemented on an ISA bus, it seems unlikely that you could get multi-word DMA 0 working through ISA in practice: you would need either a fancy ISA controller that can bus master (which are so rare that they might as well not exist), or perhaps an IDE drive that can work as an ISA bus master - and if someone ever made one of those, they didn't leave a written record about it.
IDE drives could work perfectly well as targets for 3rd-party DMA driven by the 8237A DMA controller. The "single-word DMA" mode is specified to be compatible with the "single" transfer mode of the 8237A, and the "multi-word DMA" mode is specified to be compatible with the "demand" transfer mode of the 8237A. A shortcoming of the classic 8237A design is that once a device has arbitrated for DMA, the 8237A doesn't give up the bus for exchanging data with this device until the "transfer" ends. While in single mode "the transfer ends" after every transferred byte (or word), in block and demand mode the transfer doesn't end before the block size is exhausted or (in the case of demand mode) the requesting device voluntarily relinquishes the bus request. On the other hand, the AT needs to arbitrate the bus to the refresh controller (which is no longer part of the DMA controller, as it was on the PC/XT) every 16µs. Even at the maximum permitted MWDMA0 rate, a single sector takes around 120µs, so the classic ISA bus requires that a DMA-capable I/O device periodically drops DRQ in order not to disturb RAM refresh. This requirement might be "missing" from the IDE specification.
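The 120µs figure follows directly from the MWDMA0 timing; a quick check (480 ns is the ATA minimum cycle time for multi-word DMA mode 0, the 16µs refresh interval is the figure used above):

```python
# One 512-byte sector at the fastest MWDMA0 timing vs. the AT refresh interval.
MWDMA0_CYCLE_NS = 480
WORDS_PER_SECTOR = 512 // 2            # 16-bit transfers

sector_us = WORDS_PER_SECTOR * MWDMA0_CYCLE_NS / 1000.0
blocked_refresh_slots = sector_us / 16  # refresh is due every ~16 us

print(f"sector transfer: {sector_us:.1f} us")
print(f"refresh slots spanned: {blocked_refresh_slots:.1f}")
```

So a drive that held DRQ for a whole sector in one demand burst would run over roughly seven or eight refresh slots, hence the need to drop DRQ periodically.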
The point of 3rd-party DMA is that the coordination of bus signals is performed by the DMA controller, not by the I/O device. This in turn implies that the DMA controller can not be "too slow" to cooperate with a device, unless we are talking about real-time transfers and buffer over-/underruns in the DMA target. At a 4MHz DMA clock with compressed timing disabled (the typical AT timing), demand mode results in a cycle time of 1µs. The "official maximum" for the ISA bus, 8.33MHz, with the DMA controller clocked at 4.16MHz, would result in a cycle time of 960ns, which is still twice as long as permitted by the IDE specification. Nevertheless, a 2.1MB/s transfer between an IDE drive and an ISA mainboard at 8.33MHz ISA / 4.16MHz DMA clock would have been possible with the drive configured to "MWDMA 0".
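The cycle times and the resulting throughput can be reproduced from the demand-mode timing (4 DMA clocks per transfer, as used in this paragraph; 2 bytes per 16-bit transfer):

```python
# Demand-mode cycle time of the 8237A and the resulting 16-bit throughput.
def demand_cycle_ns(dma_mhz: float) -> float:
    return 4 / dma_mhz * 1000.0        # 4 DMA clocks per transfer

def throughput_mb_s(cycle_ns: float) -> float:
    return 2 / cycle_ns * 1000.0       # 2 bytes per 16-bit transfer

print(demand_cycle_ns(4.0))                               # 1000 ns, typical AT
print(round(demand_cycle_ns(4.16)))                       # ≈ 962 ns at 8.33 MHz ISA
print(round(throughput_mb_s(demand_cycle_ns(4.16)), 2))   # ≈ 2.08 MB/s
```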
EISA improved a lot on the ISA DMA controller: it provides modes with tighter timing (as long as the "memory" affected by the DMA transfer is standard system memory), scatter/gather lists, addressing of the whole 4G address space, and elimination of the 64K/128K page boundaries. All of these enhancements are also available for transfers with ISA cards; the only mode of the EISA DMA controller that requires an EISA target is the burst mode, because that mode uses burst handshaking that is not part of the ISA protocol. The EISA implementation of the ISA DMA controller could have helped ISA DMA a lot, but it seems it never got enough market penetration for it to be deemed useful to design ISA cards or ISA card drivers specifically to take advantage of the EISA DMA controller.

One of the curiosities around EISA DMA is the ISA south bridge used in the Saturn / Saturn II chipsets: the Intel 82378ZB System I/O supports all the EISA DMA enhancements except burst mode in a completely non-EISA system. Contrary to the original 8237A, the new DMA timing modes introduced with EISA allow a higher-priority master to take over during a demand-mode transfer, so the device is no longer required to repeatedly relinquish DMA itself to keep the system alive. For a well-implemented DMA target, temporarily losing arbitration during a demand-mode transfer just looks like extra wait states, so no extra measures need to be implemented on the device to make use of the interruptibility. On the other hand, in the enhanced EISA modes, a device must not depend on a maximum time between transfers after it has won DMA arbitration.
Newer PCI-to-EISA bridges still implement "quick DMA timings" on the ISA bus (called "type-F DMA"), which would yield a cycle time of 360ns at 8.33MHz ISA clock, but they skip all the niceties that were implemented in the EISA DMA controller. Type-F DMA is actually faster than the two ISA-compatible accelerated DMA modes provided by an EISA DMA controller (called type-A and type-B), and the Saturn ISA south bridge also supports this type-F mode instead of the EISA burst mode.
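As a sanity check on the 360ns figure: assuming 3 BCLK cycles per type-F transfer (which is the assumption that reproduces the quoted number), the 16-bit throughput ceiling comes out like this:

```python
# Type-F DMA ceiling at the official maximum ISA clock.
BCLK_MHZ = 8.33
TYPE_F_CLOCKS = 3                        # assumed BCLKs per transfer

cycle_ns = TYPE_F_CLOCKS / BCLK_MHZ * 1000.0   # ≈ 360 ns
mb_s = 2 / cycle_ns * 1000.0                   # ≈ 5.6 MB/s for 16-bit transfers

print(f"type-F cycle: {cycle_ns:.0f} ns -> {mb_s:.1f} MB/s")
```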
rasz_pl wrote on 2024-03-02, 03:24:
When ISA yields the bus to a bus master it also halts the CPU; the CPU can't even run from its internal cache. That was the case all the way to the 486 and maybe even the Pentium? This is the tragedy of allowing external bus masters on your computer's main data artery: you have to stop everything. No sweat on the XT, big problem on xx-xxx MHz platforms.
Do you have any reliable source for this claim? It sounds wrong. It is true that neither the CPU nor the refresh controller can actively get hold of the ISA bus while a master owns it (although a master can trigger a refresh cycle while it owns the bus, temporarily relinquishing control to the refresh controller), and it is also true that most AT/386/486 chipsets claimed the CPU frontside bus while ISA bus mastering was in progress. But the claim that the CPU can not run code from L1 cache seems wrong, and I don't remember the 486 databook even describing a way to inhibit execution of cached instructions, short of stopping the CPU clock (which is not permitted on early 486 processors at all). As 286 and 386 processors do not have any kind of L1 cache, those processors will indeed stop doing anything useful during ISA bus mastering as soon as they perform any kind of memory access or the prefetch queue is drained.
When the (ISA) bus is owned by a non-CPU master, the CPU needs to be informed about memory addresses touched by that master to maintain cache coherency. The complexity of cache coherency grew even higher as CPUs started to have write-back L1 caches. As the cache coherency protocol uses the same address lines as the CPU uses for addressing memory, performing cache coherency ("snoop cycles") obviously claims some bandwidth on the frontside bus. But running cache coherency cycles just requires the CPU to give up the address bus, not a full bus arbitration, so this is a "lightweight" operation, and at least for classic ISA DMA timings it would make sense not to block the frontside bus for the full duration of an ISA DMA transaction.