VOGONS


List of VLB IDE Controllers

Reply 280 of 289, by mkarcher

Rank l33t
douglar wrote on 2024-11-19, 13:31:

The Promise 20630 controller can do MWDMA2 transfers in DOS if you have a storage device that's agreeable to that mode, but I've had little luck getting that to work as DMA in Windows.

The Promise 20630 can indeed use the DMA transfer protocol between the drive and the Promise chip, but instead of doing DMA, the host has to wait until the device asserts /DRQ, then run a PIO block IN or OUT instruction to get the data from/to the PDC20630 chip. As you don't get an IRQ when DRQ is asserted by the drive, but only when the DMA transfer is done, the operation for read requests is like this: The host issues the command, then the drive locates the sector, reads the sector, verifies the CRC, possibly applies ECC correction and then asserts DRQ to get the data to the host. The host needs to run a busy polling loop on the DRQ status bit reported by the 20630, and then start the data transfer, which looks like a DMA transfer to the drive. Finally, when the transfer is done, the drive issues an interrupt for "DMA complete". This means that you not only lose the advantage of DMA, namely that the CPU can perform useful operations while the data is transferred from the drive to memory, but it is even worse than PIO in that the CPU is already busy watching the DMA status bit while the drive is not yet ready to transfer. In "real" PIO modes, the drive will issue an IRQ before the transfer happens, so the CPU can do other tasks until the sector is ready. Bottom line: The Pseudo-DMA feature of the 20630 is even worse than PIO for multi-tasking operating systems. It only makes sense if you happen to have a drive that can achieve decent DMA speeds, but fails to do anything faster than PIO0 in PIO mode.
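
To put rough numbers on that comparison, here is a minimal Python sketch. The timing values and all names are made up for illustration (nothing here was measured on a 20630), and interrupt overhead is ignored. It only counts how long the CPU is tied up per sector read in each scheme described above.

```python
from dataclasses import dataclass


@dataclass
class SectorRead:
    prepare_us: float   # drive locates/reads/checks the sector (assumed value)
    transfer_us: float  # host moves 512 bytes with a block IN (assumed value)


def cpu_busy_irq_pio(r: SectorRead) -> float:
    # "Real" PIO: the drive raises an IRQ once the sector is ready, so the
    # CPU is only tied up for the block IN itself.
    return r.transfer_us


def cpu_busy_pseudo_dma(r: SectorRead) -> float:
    # 20630 pseudo-DMA: no IRQ when DRQ is asserted, so the CPU polls the
    # DRQ status bit for the whole preparation phase, then runs the block IN.
    return r.prepare_us + r.transfer_us


def cpu_busy_true_dma(r: SectorRead) -> float:
    # Real DMA, for comparison: the CPU is free in both phases
    # (interrupt handling overhead is ignored in this sketch).
    return 0.0


read = SectorRead(prepare_us=10_000.0, transfer_us=80.0)  # hypothetical numbers
for name, fn in [("IRQ-driven PIO", cpu_busy_irq_pio),
                 ("pseudo-DMA", cpu_busy_pseudo_dma),
                 ("true DMA", cpu_busy_true_dma)]:
    print(f"{name:15s} CPU busy for {fn(read):8.1f} us")
```

A faster block transfer in pseudo-DMA mode would shrink transfer_us, but it does nothing about the polling time, which is the part that hurts a multi-tasking OS.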

Reply 281 of 289, by jakethompson1

Rank Oldbie
mkarcher wrote on 2025-01-22, 18:17:
douglar wrote on 2024-11-19, 13:31:

The Promise 20630 controller can do MWDMA2 transfers in DOS if you have a storage device that's agreeable to that mode, but I've had little luck getting that to work as DMA in Windows.

The Promise 20630 can indeed use the DMA transfer protocol between the drive and the Promise chip, but instead of doing DMA, the host has to wait until the device asserts /DRQ, then run a PIO block IN or OUT instruction to get the data from/to the PDC20630 chip. As you don't get an IRQ when DRQ is asserted by the drive, but only when the DMA transfer is done, the operation for read requests is like this: The host issues the command, then the drive locates the sector, reads the sector, verifies the CRC, possibly applies ECC correction and then asserts DRQ to get the data to the host. The host needs to run a busy polling loop on the DRQ status bit reported by the 20630, and then start the data transfer, which looks like a DMA transfer to the drive. Finally, when the transfer is done, the drive issues an interrupt for "DMA complete". This means that you not only lose the advantage of DMA, namely that the CPU can perform useful operations while the data is transferred from the drive to memory, but it is even worse than PIO in that the CPU is already busy watching the DMA status bit while the drive is not yet ready to transfer. In "real" PIO modes, the drive will issue an IRQ before the transfer happens, so the CPU can do other tasks until the sector is ready. Bottom line: The Pseudo-DMA feature of the 20630 is even worse than PIO for multi-tasking operating systems. It only makes sense if you happen to have a drive that can achieve decent DMA speeds, but fails to do anything faster than PIO0 in PIO mode.

That sounds kind of like the UM8886B which we've talked about privately, but in reverse. The UM8886B can do bus mastering, but it still talks PIO to the drive. Because the IRQ comes in before the bus mastering happens instead of after, the host has to sit and spin waiting for the transfer to complete.

As we've also talked about: why wasn't it obvious to these VLB/early PCI IDE manufacturers to do it 'right', the way the Intel Triton design does? Was the whole scatter/gather structure too complicated for smaller chip companies without the clout to make it the standard?

Reply 282 of 289, by douglar

Rank l33t
mkarcher wrote on 2025-01-22, 18:17:

The Promise 20630 can indeed use the DMA transfer protocol between the drive and the Promise chip, but instead of doing DMA, the host has to wait until the device asserts /DRQ, then run a PIO block IN or OUT instruction to get the data from/to the PDC20630 chip.

Very very interesting. Many thanks for sharing. How did you learn this?

Generally speaking, my results with the 20630 & the Promise DOS driver are that reasonably modern solid-state storage devices do reads & writes at 5500-6000 KB/s in Turbo PIO and IRDY modes, but get 6000-9200 KB/s in Turbo DMA mode, depending on the device and transfer direction.

Curiously, I got some outlier results on 20230 w/ XUB r625 instead of the Promise driver that had transfer rates > 9400 KB/s. I'll have to go back and see if I can replicate them with r630 and isolate what events lead to the high transfer rate & low latency results. I hope it wasn't a cut and paste error.

Reply 283 of 289, by mkarcher

Rank l33t
douglar wrote on 2025-01-22, 19:47:
mkarcher wrote on 2025-01-22, 18:17:

The Promise 20630 can indeed use the DMA transfer protocol between the drive and the Promise chip, but instead of doing DMA, the host has to wait until the device asserts /DRQ, then run a PIO block IN or OUT instruction to get the data from/to the PDC20630 chip.

Very very interesting. Many thanks for sharing. How did you learn this?

I reverse engineered the Promise DOS driver around 22 years ago.

Reply 284 of 289, by douglar

Rank l33t
mkarcher wrote on 2025-01-22, 18:17:

The Promise 20630 can indeed use the DMA transfer protocol between the drive and the Promise chip, but instead of doing DMA, the host has to wait until the device asserts /DRQ, then run a PIO block IN or OUT instruction to get the data from/to the PDC20630 chip.

I suppose that's why the 20630 never found its way onto a caching controller. There was probably no benefit to it since most of them can do memory mapped transfers.

P.S. I finally found this thread that has a lot of info: "Is there a VLB card that supports > Mode 3 or UDMA support?"

Reply 286 of 289, by Dorunkāku

Rank Member
douglar wrote on 2025-04-24, 12:23:

Found a new old one -- I completely forgot about this one, which is funny because I used to own it until about 5 years ago.

Although technically, it's on a card with a proprietary local bus, not VLB

Appian P928

https://theretroweb.com/chips/10619
https://theretroweb.com/expansioncards/s/tekr … ost-4000-3-plus

I don't think the Appian P928 is an IDE controller.

Have a look at ISA cards with the Winbond W83787F. The pin headers on the Tekram card match those on the ISA Winbond W83787F cards. If the Appian chip provided IDE it would make sense to swap the IDE and floppy pin headers for easier routing.
Now have a look at VLB Tseng Labs ET4000AX cards. There are 74LS244 or 74LS245 buffers and 74LS373 latches where the Tekram card has the Appian P928 chip. Based on this I think the P928 is just the 74LS buffers and latches condensed in a single chip.

Also: https://www.techmonitor.ai/technology/appian_ … _local_bus_chip

Reply 287 of 289, by douglar

Rank l33t
Dorunkāku wrote on 2025-04-24, 15:49:

I don't think the Appian P928 is an IDE controller.

Have a look at ISA cards with the Winbond W83787F. The pin headers on the Tekram card match those on the ISA Winbond W83787F cards. If the Appian chip provided IDE it would make sense to swap the IDE and floppy pin headers for easier routing.
Now have a look at VLB Tseng Labs ET4000AX cards. There are 74LS244 or 74LS245 buffers and 74LS373 latches where the Tekram card has the Appian P928 chip. Based on this I think the P928 is just the 74LS buffers and latches condensed in a single chip.

Also: https://www.techmonitor.ai/technology/appian_ … _local_bus_chip

OK, nifty. Looks like it perhaps provides a faster AT-style bus over the 486 local bus, if I'm reading that correctly.

Appian Technology Inc is shipping its P928 FAST Local Bus Peripheral Interface chip: it links a local 80486 CPU bus to AT peripherals

Reply 288 of 289, by jakethompson1

Rank Oldbie
mkarcher wrote on 2023-04-23, 15:37:

So a read-ahead buffer of 32 bits is enough: It takes two IDE words of data, waiting for the 486 to perform the next REP INSD iteration, which can then be answered without any wait states. The buffer is then refilled as soon as possible. The same is true for the post-write buffer and REP OUTSD. REP OUTSD can be performed faster (at 5 clocks per cycle, that is 150ns cycle time), if the data to output is currently cached in L1, though.
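
As a quick check of the 150 ns figure, here is the arithmetic in Python. The 33.3 MHz clock is an assumption (it is the clock at which 5 clocks come out at 150 ns), and the resulting rate is only an upper bound for the host side of a REP OUTSD loop, not a measured transfer rate.

```python
BUS_CLOCK_HZ = 100e6 / 3       # assumed 33.3 MHz clock
CLOCKS_PER_ITERATION = 5       # from the post: REP OUTSD at 5 clocks per cycle
BYTES_PER_ITERATION = 4        # one DWORD per OUTSD iteration

# Time for one OUTSD iteration, in nanoseconds.
cycle_ns = CLOCKS_PER_ITERATION / BUS_CLOCK_HZ * 1e9

# Host-side upper bound on throughput, in MB/s (1 MB = 10**6 bytes).
peak_mb_per_s = BYTES_PER_ITERATION / (cycle_ns * 1e-9) / 1e6

print(f"{cycle_ns:.0f} ns per DWORD, {peak_mb_per_s:.1f} MB/s upper bound")
```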

I'm about to post about the ADS# signal and a search led me here and I was thinking anyway: I suppose these chips have to track the last command written to the command register to avoid triggering readahead on ATAPI devices (when the transfer length is going to be something other than 256 words)? Would readahead start as soon as the IRQ comes in and the "last command written to the command register" is an appropriate one, or would the first rep insd trigger buffer filling, and subsequent ones then read from the buffer? I'm thinking about multi-sector reads, and whether the controller has to be smart enough to track the current number of sectors for read multiple sectors, or if a premature stop to readahead once every 256 words wouldn't hurt anything, since the total read is still going to be a multiple of 256 words.

Reply 289 of 289, by mkarcher

Rank l33t
jakethompson1 wrote on 2025-07-22, 01:46:

I'm about to post about the ADS# signal and a search led me here and I was thinking anyway: I suppose these chips have to track the last command written to the command register to avoid triggering readahead on ATAPI devices (when the transfer length is going to be something other than 256 words)? Would readahead start as soon as the IRQ comes in and the "last command written to the command register" is an appropriate one, or would the first rep insd trigger buffer filling, and subsequent ones then read from the buffer?

I'm thinking about multi-sector reads, and whether the controller has to be smart enough to track the current number of sectors for read multiple sectors, or if a premature stop to readahead once every 256 words wouldn't hurt anything, since the total read is still going to be a multiple of 256 words.

We are talking about consumer VL IDE stuff, not professional SCSI EISA stuff, so if there is a perfect way to do things and a cheap way that is good enough, you can bet on them using the cheap way. I think tracking commands is way too complicated. Doing the first 32-bit I/O read of a sector without read-ahead wouldn't noticeably hit performance, so that would be fine. Read-ahead for the second DWORD is triggered as soon as the first DWORD was read.
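
Here is a toy Python model of that "cheap way", just to show how little the missing first read-ahead costs. It is a sketch of my guess at the behaviour, not of any real chip: the class and attribute names are invented, and "slow" simply means the host has to wait for two 16-bit IDE cycles.

```python
class ReadAheadPort:
    """Hypothetical 32-bit read-ahead buffer in front of a 16-bit IDE data port."""

    def __init__(self, ide_words):
        self._words = iter(ide_words)  # 16-bit words the drive will deliver
        self._buffer = None            # prefetched DWORD, or None if empty
        self.slow_reads = 0            # reads that had to wait for the drive
        self.fast_reads = 0            # reads answered from the buffer

    def _fetch_dword(self):
        # Two 16-bit IDE reads form one DWORD; the first word is the low half.
        try:
            low = next(self._words)
            high = next(self._words)
        except StopIteration:
            return None                # drive has no more data
        return low | (high << 16)

    def read_dword(self):
        if self._buffer is None:
            value = self._fetch_dword()    # no read-ahead yet: host waits
            self.slow_reads += 1
        else:
            value = self._buffer           # answered without wait states
            self.fast_reads += 1
        self._buffer = self._fetch_dword() # this read triggers the next read-ahead
        return value


port = ReadAheadPort(range(256))           # one 512-byte sector as 256 words
data = [port.read_dword() for _ in range(128)]
print(port.slow_reads, "slow read,", port.fast_reads, "fast reads")
```

Note that this model only stops prefetching because the word list runs out. A real chip needs some other way to know where the sector ends.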

There is a "magic synchronization sequence" in some VL IDE drivers: You read the sector count (or sector number?) register three times in a row before doing a 32-bit block transfer. I bet this is resetting a 7-bit DWORD / 8-bit WORD counter to know the "end of sector". For block transfers ("READ MULTIPLE"), I guess one skipped read-ahead is good enough, as you already suspected. So the final question is about ATAPI, and again I think the answer is easy: If you only do read-ahead for 32-bit reads, you just don't REP INSD in the ATAPI driver. CD-ROMs were double-speed at best when these VL IDE chips were designed, and usually equipped with a non-IDE interface anyway, so who cares about read-ahead?
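
A sketch of how such a counter could behave, in Python. Everything here is my guess turned into code: the class, the method names and the exact reset rule are hypothetical, and only the "three reads of the sector count register" and the 7-bit DWORD counter come from the description above.

```python
class SectorCounter:
    """Hypothetical 7-bit DWORD counter that marks the end of a 512-byte sector."""

    def __init__(self):
        self.count = 0          # DWORDs read since the last sector boundary
        self._sync_reads = 0    # consecutive sector count register reads seen

    def on_sector_count_read(self):
        # The "magic synchronization sequence": three reads in a row reset
        # the counter (assumed behaviour).
        self._sync_reads += 1
        if self._sync_reads == 3:
            self.count = 0
            self._sync_reads = 0

    def on_data_dword_read(self):
        """Count one 32-bit data read; return True if read-ahead should follow."""
        self._sync_reads = 0                   # any data read breaks the sequence
        self.count = (self.count + 1) & 0x7F   # 7 bits: wraps after 128 DWORDs
        return self.count != 0                 # wrapped to 0: sector done, no read-ahead


counter = SectorCounter()
decisions = [counter.on_data_dword_read() for _ in range(128)]
print(decisions.count(True), "read-aheads,", decisions.count(False), "skipped")
```

With READ MULTIPLE the same counter just wraps once per sector, which gives the one skipped read-ahead per 128 DWORDs mentioned above.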

In practice, I heard of few problems from just enabling 32-bit I/O in Linux (hdparm -c 1, if I remember correctly) instead of enabling 32-bit I/O with sync (hdparm -c 3), so synchronization apparently wasn't that important. You can think of other triggers to clear the 256-word (or 128-dword) counter, like an IRQ or a write to the command register. Yeah, that can fail if a sector transfer from device 0 is interrupted and a request to device 1 is performed during that transfer, but I guess that's one of the reasons Linux 2.x disabled interrupts during IDE transfers by default.
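
The failure case can be replayed with a toy counter in Python. This is a sketch under my own assumptions: the counter is cleared only by a command register write (one of the triggers suggested above), and the event names and DWORD counts are invented for illustration.

```python
DWORDS_PER_SECTOR = 128  # 512 bytes / 4


def replay(events):
    """Return the counter value after replaying (kind, dwords) events."""
    count = 0
    for kind, dwords in events:
        if kind == "command_write":
            count = 0                                   # assumed reset trigger
        elif kind == "data_reads":
            count = (count + dwords) % DWORDS_PER_SECTOR
    return count


# Device 0 reads a whole sector without being disturbed.
uninterrupted = [("command_write", 0), ("data_reads", 128)]

# Device 0 is interrupted after 40 DWORDs, device 1 gets a command and
# transfers a full sector, then device 0 finishes its remaining 88 DWORDs.
interrupted = [("command_write", 0), ("data_reads", 40),
               ("command_write", 0), ("data_reads", 128),
               ("data_reads", 88)]

print("uninterrupted:", replay(uninterrupted))
print("interrupted:  ", replay(interrupted))
```

A result of 0 means the counter sits on a sector boundary when device 0 finishes. In the interrupted case it does not, so the chip would believe it is mid-sector at the real end of device 0's sector.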