VOGONS


List of VLB IDE Controllers

Reply 40 of 262, by douglar

Rank Oldbie

Thanks for those explanations. Great stuff.

Here's the cards I got lined up to test:

I was going to test using a WinTech MV008 mobo that I found recently. Seemed like a great choice because it supports 386 & 486 chips, but the BIOS has been real touchy about attaching drives > 528MB, even with XUB or MR BIOS. I can get some to work if I use SIIG Advanced BIOS and EZ-Drive together, but that was weird. I have a couple of other boards to try if I can't get this working, but all in all it chewed up my morning. I suppose I could switch to drives < 528MB for testing.

So far I've collected DOS drivers for Lion 3+ and UM8672F.
I'm looking for drivers for the ADI/2, Atronics, Holtek, Promise and Vision chips. Any uploads or links would be appreciated.
I suppose I could try to use the XUB Advanced support for the Promise and Vision chips if nothing else turns up.

Reply 41 of 262, by rasz_pl

Rank l33t
mkarcher wrote on 2023-04-23, 15:37:

you obviously are not that deep into how VL IDE interface chips actually operate internally

yes! that's why I was asking out loud 😀

mkarcher wrote on 2023-04-23, 15:37:

A VL transaction takes at least two cycles.
So the fastest cycle on the VL bus is 60ns
At this point, it's obvious that a dumb VL/IDE adapter can't perform a transaction every 30ns, as you assumed

in theory there are also burst transfers: "Inside of a burst, data transfers can occur at the rate of one per clock period."

mkarcher wrote on 2023-04-23, 15:37:

You are typically reading data using REP INSW ...
...The net result is: Using a dumb VL controller, you get 300ns cycle time for a PIO3 drive, and 330ns cycle time for a PIO2 drive. This is less than spectacular.

Great explanation (as always), thank you.
That's still above the ~5MB/s of contemporary 1994 drives like https://stason.org/TULARC/pc/hard-drives-hdd/ … -ATA2-FAST.html
or even this insanely high-end SCSI drive https://stason.org/TULARC/pc/hard-drives-hdd/ … SCSI2-DIFF.html
Most people had to make do with something like the Western Digital WDAC2340 http://bk0010.narod.ru/DRIVESPECS/WD/4752.txt maxing out at 1.2MB/s.
Hell, in 1996 the UDMA-2 capable AC-22100 https://stason.org/TULARC/pc/hard-drives-hdd/ … -5-SL-ATA3.html was still at 5MB/s.
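
For comparison, the quoted cycle times can be turned into transfer rates with a quick back-of-the-envelope calculation. This is only a sketch: it assumes decimal megabytes and one 16-bit word (2 bytes) per REP INSW cycle, and `throughput_mb_s` is a made-up helper name.

```python
def throughput_mb_s(cycle_ns: float, bytes_per_cycle: int = 2) -> float:
    """Transfer rate in MB/s (10**6 bytes/s) for one transfer every cycle_ns."""
    bytes_per_second = bytes_per_cycle / (cycle_ns * 1e-9)
    return bytes_per_second / 1e6


# Cycle times for a "dumb" VL controller as quoted above (16-bit transfers).
for label, cycle_ns in [("PIO3 drive", 300), ("PIO2 drive", 330)]:
    print(f"{label}: {throughput_mb_s(cycle_ns):.1f} MB/s")
```

Under those assumptions the dumb controller lands somewhere between 6 and 7 MB/s, which is the basis for the comparison with the ~5MB/s drives.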

mkarcher wrote on 2023-04-23, 15:37:

The 486DX-33 is bottlenecked to a 240ns cycle time REP INSW.

I think best case is even worse at 300ns, and worst case (V86 mode) 500ns

mkarcher wrote on 2023-04-23, 15:37:

  • Read-ahead and posted writes tackle the first observation: During a sector transfer, it is expected that 256 words are being transferred. So, while the processor handles the first word, the controller chip can already initiate a 16-bit IDE read cycle to the drive, before the next I/O read happens on the VLB. This is "read ahead". Also, the controller chip can acknowledge a write cycle on the VL bus, and still drive the data and the /IOW or /IOR signal on the IDE cable. This is called "posted writes".
  • 32-bit I/O tackles the second observation: The 486DX-33 can also perform REP INSD at a cycle time of 240ns! If the VL IDE chip manages to pack two IDE transfers into one VL transfer, we get 240ns per double-transfer, so an effective rate of 120ns per 16-bit IDE word.

So a read-ahead buffer of 32 bits is enough: It takes two IDE words of data, waiting for the 486 to perform the next REP INSD iteration, which can then be answered without any wait states. The buffer is then refilled as soon as possible. The same is true for the post-write buffer and REP OUTSD. REP OUTSD can be performed faster (at 5 clocks per cycle, that is 150ns cycle time), if the data to output is currently cached in L1, though.

I think REP OUTSD is 10 cycles in the best-case scenario on a 486, not that it makes much difference. IMO a one-transfer buffer is both useless with the drives available at the time VLB was relevant (92-94), and also too small to be useful on really fast drives or in multitasked systems, where you could get some benefit from slamming a whole sector at a time. If anything, memory mapping the VLB controller would make the biggest difference: with even a one-sector built-in buffer you could expect 10-15MB/s (typical vidspeed write for a DX2-66 in 16-bit mode) to 30MB/s (32-bit) write speeds, letting the CPU get back to computation tasks much faster after writes. It would still be useless for single-sector reads (100 microseconds is too short to do anything in between).

Open Source AT&T Globalyst/NCR/FIC 486-GAC-2 proprietary Cache Module reproduction

Reply 42 of 262, by mkarcher

Rank l33t
rasz_pl wrote on 2023-04-23, 18:09:

in theory there are also burst transfers "Inside of a burst, data transfers can occur at the rate of one per clock period."

Well, yeah, in theory there are, but there are no bursts for I/O access. Furthermore, the other end of the VL bus is the 486 processor. That processor only performs bursts on cache line fills, when splitting unaligned access and possibly on 64-bit floating point reads/writes. I have yet to see a card that offers cacheable memory over VL or supports bursts for the other fringe cases. Later 486 processors would also support bursts on copying back dirty cache lines from L1, but this again happens only into address ranges that are deemed cacheable (i.e. usually not on VL devices).

rasz_pl wrote on 2023-04-23, 18:09:

Imo one transfer buffer is both useless with drives available at the time VLB was relevant (92-94), and also too small to be useful on really fast drives or in multi tasked systems where you could get some benefit from slamming whole sector at a time.

It is useful on multi tasked systems. As you correctly identified, drives were too slow to keep up with PIO3 anyway. But the drive has a buffer for a complete sector (or even multiple sectors if you use IDE Block Mode aka IDE READ/WRITE MULTIPLE). You can fill that buffer with REP OUTSD or read it with REP INSD at 10MB/s to 13MB/s if you run at PIO3-like timings, and then you can continue executing tasks that are not starved on disk access, until the next IRQ requests the next data burst at PIO3. This is also one reason why IDE block mode is considered a good thing: You get longer transfer bursts and fewer interrupts when doing large reads and writes.

So basically your conception that you would need a sector buffer at least for good speedup is true. It's just that this buffer is in the drive, not in the VL IDE controller. On the other hand, the two-word (or single DWORD) buffer in the VL controller is what makes it possible to talk to the drive at PIO3 speed.
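
To put rough numbers on why the small buffer matters: with read-ahead the CPU side and the IDE side overlap, so the slower of the two sets the pace. The sketch below assumes a 180ns minimum PIO3 cycle per 16-bit word (the ATA-2 figure, not something stated in this thread), the 240ns REP INSW/INSD iteration time quoted earlier, and perfect overlap between the two sides. `effective_mb_s` is a hypothetical helper.

```python
PIO3_CYCLE_NS = 180    # assumed ATA-2 minimum PIO mode 3 cycle per 16-bit word
CPU_IO_CYCLE_NS = 240  # 486DX-33 time per REP INSW / REP INSD iteration (quoted above)


def effective_mb_s(words_per_cpu_access: int,
                   ide_cycle_ns: float = PIO3_CYCLE_NS,
                   cpu_cycle_ns: float = CPU_IO_CYCLE_NS) -> float:
    """Sustained rate in MB/s when CPU and IDE cycles overlap perfectly."""
    cpu_ns_per_word = cpu_cycle_ns / words_per_cpu_access
    ns_per_word = max(cpu_ns_per_word, ide_cycle_ns)  # the slower side wins
    return 2 / ns_per_word * 1000  # 2 bytes per word; bytes/ns * 1000 = MB/s


print(f"16-bit I/O: {effective_mb_s(1):.1f} MB/s")  # CPU-bound at 240ns per word
print(f"32-bit I/O: {effective_mb_s(2):.1f} MB/s")  # drive-bound at 180ns per word
```

With 32-bit I/O the model is limited by the drive timing rather than by the CPU, which is consistent with the 10MB/s to 13MB/s range mentioned above.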

You are in fact correct that REP MOVSDing into an on-controller MMIO buffer will be even faster for the multi-tasking OS use case, but AFAIK no one implemented that for IDE. For SCSI, at least in theory, VL bus mastering is better, because the data doesn't have to travel the FSB twice (device to processor, then processor to memory), but can directly get from device to memory. In practice, my experiments with an Adaptec 2842VL were disappointing. It doesn't get close to 10MB/s (different VL chipsets), whereas the 2742W gets quite close to 20MB/s on an 8MHz EISA bus (SiS 411 chipset).

Reply 43 of 262, by pshipkov

Rank Oldbie

Attached drivers for the requested controllers. Included are multiple drivers for Promise (all known models, I think) and QDI.

Can you share the driver for Lion 3+?
Also, do you have a driver for Compass Lab CL3202 - GoldStar GM82C712 - Chips 82C711/712 ? Cannot locate this guy.

That motherboard is kind of slow. Pick something faster for the testing.

Attachments

  • vld_ide_drivers.zip (3.11 MiB, 41 downloads, license: public domain)

retro bits and bytes

Reply 44 of 262, by douglar

Rank Oldbie
pshipkov wrote on 2023-04-24, 01:40:

Attached drivers for the requested controllers. Included are multiple drivers for Promise (all known models, I think) and QDI.

Can you share the driver for Lion 3+?
Also, do you have a driver for Compass Lab CL3202 - GoldStar GM82C712 - Chips 82C711/712 ? Cannot locate this guy.

That motherboard is kind of slow. Pick something faster for the testing.

Thanks for all the drivers. I uploaded:
Lion 3+ drivers http://vogonsdrivers.com/getfile.php?fileid=1 … menustate=56,55
Appian ADI/2 http://www.vogonsdrivers.com/getfile.php?file … &menustate=55,0
Holtek http://www.vogonsdrivers.com/getfile.php?file … &menustate=55,0
UM8672 http://www.vogonsdrivers.com/getfile.php?file … 999&menustate=0
QD6580 http://www.vogonsdrivers.com/getfile.php?file … &menustate=55,0

I'll get the rest uploaded tomorrow

I don't have a Compass Lab CL3202 - GoldStar GM82C712 - Chips 82C711/712 driver yet but I'll post it if I find it.

I have three other 486 VLB boards that I'd consider for testing.
PC Chips M919
FIC 486 VIP
Shuttle HOT 419 v1.0

I'll start with the M919 and see how that goes.

Reply 45 of 262, by rasz_pl

Rank l33t
mkarcher wrote on 2023-04-23, 18:43:

It is useful on multi tasked systems. As you correctly identified, drives were too slow to keep up with PIO3 anyway. But the drive has a buffer for a complete sector (or even multiple sectors if you use IDE Block Mode aka IDE READ/WRITE MULTIPLE).

right

mkarcher wrote on 2023-04-23, 18:43:

You can fill that buffer with REP OUTSD

sadly IO writes won't be that fast, I'm afraid, even on fast CPUs. 32-bit access might help a lot here

mkarcher wrote on 2023-04-23, 18:43:

or read it with REP INSD at 10MB/s to 13MB/s if you run at PIO3-like timings, and then you can continue executing tasks that are not starved on disk access

I have a feeling no one attempted that, at least not in Windows 3.x. One sector at 5MB/s is ~100 microseconds? A seek can be up to tens of ms, but you can never be sure if it's going to be long or very quick. All too quick to even start thinking about switching tasks, not to mention interrupt overhead. I'm guessing all Windows drivers will do the simple, stupid, reliable thing of issuing a read and spinning on IO until completion. Win3 uses cooperative multitasking. A system Read handler yielding to other tasks would mean every Read call would automagically turn into a yield. Would that even work? Wouldn't that slow disk operations tremendously?
Maybe some really clever drivers have a provision for doing that in case of long multisector reads? I hope to be disproved 😀

edit: maybe easy to verify - does grabbing and waving windows (well, frames, since it's Win3) around get very choppy during heavy random reads? Lots of long seeks that in theory could be yielded.
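
The time scales in question can be checked with a few lines of arithmetic. A minimal sketch, assuming 512-byte sectors, decimal megabytes, and a 33MHz CPU clock for the cycle count (the helper names are invented for illustration):

```python
SECTOR_BYTES = 512


def sector_transfer_us(rate_mb_s: float, sectors: int = 1) -> float:
    """Microseconds needed to move `sectors` sectors at rate_mb_s (10**6 bytes/s)."""
    return sectors * SECTOR_BYTES / rate_mb_s  # bytes / (bytes per microsecond)


def cpu_clocks(duration_us: float, cpu_mhz: float = 33.0) -> float:
    """CPU clock cycles that elapse during duration_us."""
    return duration_us * cpu_mhz


one_sector = sector_transfer_us(5.0)
print(f"one sector at 5 MB/s: {one_sector:.1f} us, "
      f"about {cpu_clocks(one_sector):.0f} clocks at 33 MHz")
print(f"a 10 ms seek: about {cpu_clocks(10_000):.0f} clocks at 33 MHz")
```

A single-sector transfer leaves only a few thousand clocks, while a multi-millisecond seek leaves a few hundred thousand, so the seek is the only window where switching tasks could plausibly pay off.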

mkarcher wrote on 2023-04-23, 18:43:

In practice, my experiments with an Adaptec 2842VL were disappointing. It doesn't get close to 10MB/s (different VL chipsets), whereas the 2742W gets quite close to 20MB/s on an 8MHz EISA bus (SiS 411 chipset).

On one hand that's weird, considering Adaptec was a powerhouse of ASIC design. They even designed chips that went into the drives themselves for Seagate/Maxtor/Conner. Grant Saviers, CEO of Adaptec: https://archive.computerhistory.org/resources … 7-05-01-acc.pdf
"we were designing our own chips and having them fabbed. TSMC, Taiwan Semiconductor Manufacturing was the company that made them for us, most of them. We had a couple of other suppliers. We had a good relationship with TSMC. We had a couple of other businesses but the SCSI business was the biggest and most profitable we had. We were making disk drive controller chips for Maxtor and Conner and for Seagate. So we had a pretty capable not drive development team but everything but the mechanics development, doing all the data paths and the SCSI interface or whatever the drive interface was. So that business, it was a dogfood business. It was really, really difficult building custom ASICS."
On the other hand, Adaptec was very much all in on SCSI and didn't see IDE as something serious.

Reply 46 of 262, by pshipkov

Rank Oldbie

Windows 3.1 is a cooperative multitasking system.
So it takes just one process not properly yielding control back to the OS and we kiss the word "multi" goodbye.
That's why most HDD operations are kind of hard blocking.
Even the native File Manager paired with decent drivers (Promise EIDE2300Plus, for example) makes the system very choppy.
It is usually worse with many third-party applications from that time.
I never explored which apps handle this stuff well, but the system never feels responsive during background local storage load.
My point is that it will be difficult to judge any of the discussed points by simply observing the behavior of arbitrary closed-source programs.
One has to code drivers and a set of unit tests to quantify that stuff.

Reply 47 of 262, by pshipkov

Rank Oldbie

Btw, added one more controller to the other thread - DTC 2177A based on Atronics IDEC-2020L silicon.

In general, synthetic tests like Coretest or Speedsys are quite misleading and in that way kind of meaningless.
For example, that DTC controller shows pretty good numbers there, but if I test by copying Quake 1 from Phil's test suite (that's about 21MB of content), it takes 40+ seconds with the DTC 2177A, while the Promise EIDE2300Plus completes the task in about 10 seconds.

Reply 48 of 262, by rasz_pl

Rank l33t

Copying from where to where? Is the bottleneck reading and writing on the same drive for that controller/its driver?
Would separately testing how much SmartDrive helps be reasonable?
WB versus WT cache CPU?
So many variables and combinations 😀

Reply 49 of 262, by pshipkov

Rank Oldbie

Couple of notes for clarity:
SmartDrive not used.
WB/WT L1 cache is a minor factor in general.
Testing with modern-day CF cards. Their perf upper bound is way above what these old controllers can do. But also, not all CF cards are equally applicable. This was discussed in other places.
R/W on the same CF card, or between 2 CF cards (primary master and secondary master), does not make much (if any) difference, as far as I can remember.
It does make a difference for period-correct mechanical HDDs.

But you are right - this stuff is messy and fuzzy.

Reply 50 of 262, by douglar

Rank Oldbie
pshipkov wrote on 2023-04-24, 01:40:

That motherboard is kind of slow. Pick something faster for the testing.

Just curious, slow in what respect?

I'm using a 133MHz Kingston turbo chip in the board.

Reply 51 of 262, by mkarcher

Rank l33t
rasz_pl wrote on 2023-04-24, 02:44:

I have a feeling no one attempted that, at least not in Windows 3.x. One sector at 5MB/s is ~100 microseconds? A seek can be up to tens of ms, but you can never be sure if it's going to be long or very quick. All too quick to even start thinking about switching tasks, not to mention interrupt overhead. I'm guessing all Windows drivers will do the simple, stupid, reliable thing of issuing a read and spinning on IO until completion. Win3 uses cooperative multitasking. A system Read handler yielding to other tasks would mean every Read call would automagically turn into a yield. Would that even work? Wouldn't that slow disk operations tremendously?

Not talking about WfW 3.11 with 32-bit file access, but just about Win 3.1 with 32-bit disk access, that kind of access is only used for accessing a "permanent swap file". So indeed, switching tasks from within the 32-bit disk access driver likely isn't going to work. It would be even more aggressive than making read and write possible yields, it would make every memory access a possible yield, completely subverting the cooperative multitasking environment created by Windows 3.x. On the other hand, every DOS box in Windows 3.x is a separate preemptively multitasked VM, so suspending just one DOS box or the primary Windows VM while keeping all other VMs running would be consistent with the Windows 3.x execution model, but I don't know whether it is actually performed that way.

Also, in a preemptive multitasking system, yielding on disk access doesn't mean the task can be immediately resumed as soon as the hard drive IRQ is triggered. Having the hard drive IRQ just setting a flag, and then waiting for a yield to continue the disk access is not going to fly, that's correct.

Did you know that the original IBM AT BIOS provided some hooks for multitasking environments? The standard BIOS implementation for IDE reads sends the read command to the hard drive, and then calls INT 15 with AX=9001, before entering a spin loop waiting for the flag "IRQ 14 seen" to be set. A multitasking environment (think DESQview) may hook that vector, and switch to a different task at that point. The IRQ14 handler subsequently calls INT 15 with AX=9101, asking the environment to switch back to the task that called INT 15/AX=9001 and return from that call. Returning from that call, the spin loop sees the IRQ14 flag already set and directly reads the sector buffer from the hard drive using REP INSW.
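
The control flow of that wait/post pair can be illustrated with a toy simulation. This is not BIOS code and makes no claim about actual register usage: time is modelled as abstract ticks, and `ToyAT`, `wait_hook`, `post_hook` and `background_task` are invented names standing in for the two INT 15h calls and for whatever a multitasker would run in the meantime.

```python
class ToyAT:
    """Toy model of the wait/post hooks; time advances in abstract ticks."""

    def __init__(self, ticks_until_irq, background_task=None):
        self.ticks_until_irq = ticks_until_irq
        self.irq14_seen = False
        self.background_task = background_task  # None: nobody hooked the vector
        self.log = []

    def tick(self):
        # One unit of time passes; the drive raises IRQ 14 once data is ready.
        if self.irq14_seen:
            return
        self.ticks_until_irq -= 1
        if self.ticks_until_irq <= 0:
            self.irq14_handler()

    def irq14_handler(self):
        self.irq14_seen = True
        self.post_hook()

    def wait_hook(self):
        # Stand-in for the "device busy" call made before the spin loop.
        if self.background_task is None:
            return  # default handler returns at once, so the caller spins
        while not self.irq14_seen:
            self.background_task()  # a multitasker runs other work meanwhile
            self.tick()

    def post_hook(self):
        # Stand-in for the "interrupt complete" call made by the IRQ handler.
        self.log.append("post: disk task may continue")

    def read_sector(self):
        self.log.append("READ command sent")
        self.wait_hook()
        spins = 0
        while not self.irq14_seen:  # spin loop on the "IRQ 14 seen" flag
            spins += 1
            self.tick()
        self.log.append("REP INSW")
        return spins


plain = ToyAT(ticks_until_irq=5)
print(plain.read_sector())  # 5: the whole wait is burned in the spin loop

work_done = []
hooked = ToyAT(ticks_until_irq=5, background_task=lambda: work_done.append(1))
print(hooked.read_sector(), len(work_done))  # 0 5: the wait went to other work
```

In the hooked case the spin loop finds the flag already set when the wait call returns, which matches the sequence described above.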

rasz_pl wrote on 2023-04-24, 02:44:

Maybe some really clever drivers have a provision for doing that in case of long multisector reads? I hope to be disproved 😀

edit: maybe easy to verify - does grabbing and waving windows (well, frames, since it's Win3) around get very choppy during heavy random reads? Lots of long seeks that in theory could be yielded.

As detailed above: This cannot happen on Windows 3.1. On the other hand, preemptive multitasking systems like OS/2 or some Unix variant (Linux wasn't that popular in 1992, though) definitely did suspend tasks while waiting for the hard disk IRQ. I don't know about NetWare, but it would make sense for NetWare to handle cached requests while the drive is seeking or just reading sectors, too.

rasz_pl wrote on 2023-04-24, 02:44:
mkarcher wrote on 2023-04-23, 18:43:

In practice, my experiments with an Adaptec 2842VL were disappointing. It doesn't get close to 10MB/s (different VL chipsets), whereas the 2742W gets quite close to 20MB/s on an 8MHz EISA bus (SiS 411 chipset).

On one hand weird considering Adaptec was a powerhouse of ASIC design. [...]
On the other hand, Adaptec was very much all in on SCSI and didn't see IDE as something serious.

Probably it's the context of this thread that set a wrong expectation. The Adaptec controller models I cited are in fact SCSI controllers. The 2842VL is a fast narrow SCSI controller doing VL mastering, which should easily handle the 10MB/s you can transfer over the SCSI cable, but IIRC it capped out at 6.5 to 7.5 MB/s. The 2742W is a fast wide SCSI controller doing EISA mastering, and reading from a sufficiently fast Cheetah got well above 18MB/s.

Reply 52 of 262, by rasz_pl

Rank l33t
mkarcher wrote on 2023-04-24, 15:21:

Did you know that the original IBM AT BIOS provided some hooks for multitasking environments? ... INT 15 with AX=9001...INT15 with AX=9101

Surely you meant ST506. Interesting: https://stanislavs.org/helppc/int_15-90.html https://stanislavs.org/helppc/int_15-91.html Looks like this was introduced in the XT for floppy drives? I can't find it in the 5150 BIOS, and can't find a good 5160 decompilation. Maybe it made some theoretical sense with head seeks in the hundreds of milliseconds, maybe IBM requested it. Even hard drives weren't much better seek-wise than floppies at that time https://stason.org/TULARC/pc/hard-drives-hdd/ … -MFM-ST506.html. In theory you could do some work in ~100ms if staying in real mode. Did anything use it? Like the aforementioned DESQview? OS/2?

mkarcher wrote on 2023-04-24, 15:21:

As detailed above: This can not happen on Windows 3.1. On the other hand, preemptive multitasking systems like OS/2 or some Unix variant (Linux wasn't that popular in 1992, though) definitely did suspend tasks while waiting for the hard disk IRQ. I don't know about NetWare, but it would make sense for NetWare to handle cached requests while the drive is seeking or just reading sectors, too.

Linux 1.2 indeed fires the ATA command and goes on with its life until the disk interrupt.

maybe relevant https://jeffpar.github.io/kbarchive/kb/141/Q141591/
Q. Does Windows NT use 32-bit I/O accesses (also known as HDD Block Mode)?
A. To date, this has been seen to corrupt data in some cases. Therefore, it is
not used.
😳 Microsoft made an oopsie boopsie

mkarcher wrote on 2023-04-23, 18:43:

In practice, my experiments with an Adaptec 2842VL were disappointing.
..
Probably it's the context of this thread that set a wrong expectation. The Adaptec controller models I cited are in fact SCSI controllers.

I should have googled the model 😀

mkarcher wrote on 2023-04-24, 15:21:

The 2842VL is a fast narrow SCSI controller doing VL mastering, which should easily handle the 10MB/s you can transfer over the SCSI cable, but IIRC it capped out at 6.5 to 7.5 MB/s. The 2742W is a fast wide SCSI controller doing EISA mastering, and reading from a sufficiently fast Cheetah got well above 18MB/s.

Did Adaptec ever release a fast VLB controller? I don't think anyone ever found any bus-mastering IDE ones. Were there ever any bus-mastering VLB cards? Maybe it was always broken, or sucked?

Reply 53 of 262, by douglar

Rank Oldbie
pshipkov wrote on 2023-04-24, 01:40:

Also, do you have a driver for Compass Lab CL3202 - GoldStar GM82C712 - Chips 82C711/712 ? Cannot locate this guy.

I found this: https://www.vogonsdrivers.com/getfile.php?fil … 025&menustate=0

It's a self-extracting zip, so if you don't want to run the binary, you can rename it and open it as a zip file.

They're not drivers exactly. They're INF files that add extra config settings to the default Windows drivers.

Goldstar EIDE Drivers Win95 INF
-Follow these instruction to install the information file into Windows 95
-This file is intented for use with the E-IDE controller card.
-This file allows the Port Address and IRQ of the controller card to be manually set if the standard IDE Controller does not work with E-IDE card.
-Refer to your system information. ( You can know your system information, used IRQ,I/O address in properties under device mannager. Easy test method : run the program fv2.exe in dos mode under not install CD-ROM )
-6/19/96 L.G. Electronics Inc.

Reply 54 of 262, by douglar

Rank Oldbie

The ADI/2 driver made me laugh: https://www.vogonsdrivers.com/getfile.php?fil … menustate=56,55
The author padded out the DOS driver file with the repeated text "So why are you poking around inside my driver???"

The CMD driver had an interesting note. https://www.vogonsdrivers.com/getfile.php?fil … menustate=56,55

NOTE: Some IDE drives have firmware bugs which cause them to declare
themselves as Mode 2 drives even though they are really slower.
As a result, the following exception table has been implemented in
all CMD drivers:

1. Samsung Mode 2 drives are slowed down to Mode 1.
2. Maxtor Mode 2 drives are slowed down to Mode 1.
3. Quantum Mode 2 drives are slowed down to Mode 0.


Might explain why I have trouble trying to get old EIDE drives to work on newer controllers but they work fine on ATA-0.
PS: I can't believe that their VLB chipset is named "PCI". Maybe it was a dual use chip or something.

Reply 55 of 262, by douglar

Rank Oldbie

Still looking for drivers for these chipsets:
Atronics IDE-2015PL
Atronics IDEC-2020L
Chips 82C711
Chips 82C712
OPTI 82C611A
OPTI 82C46X
PiC 12885A-125
SIS 83C411
Tans TS8310
VIA AV150
VIC HAOI-0221
VIDE VIDE-1
Winbond W83759F
Winbond W83759AF
Tekram ST300ALI

Reply 56 of 262, by mkarcher

Rank l33t
rasz_pl wrote on 2023-04-25, 00:37:
mkarcher wrote on 2023-04-24, 15:21:

Did you know that the original IBM AT BIOS provided some hooks for multitasking environments? ... INT 15 with AX=9001...INT15 with AX=9101

Surely you meant ST506. Interesting: https://stanislavs.org/helppc/int_15-90.html https://stanislavs.org/helppc/int_15-91.html Looks like this was introduced in the XT for floppy drives? I can't find it in the 5150 BIOS, and can't find a good 5160 decompilation.

I have a DTK/ERSO XT BIOS 2.42 at hand: The floppy handler doesn't call int 15/90. The same is true for the Phoenix XT BIOS 2.27. I doubt that the 5160 BIOS is really different in this regard. If I understand correctly, HelpPC documents what the default handler of Int 15 does on the PC/PCjr (returns unspecific error code 80) or on the XT (returns specific error code 86 which means "function not implemented"). I also checked the BIOS of the WD1002-27X RLL hard disk controller. Again, it doesn't call INT 15/90 or INT 15/91.

The IBM PC was originally planned as a "home computer", with the original base model having 16KB RAM, no floppy drive, a cassette port and ROM BASIC. This base model never made it to market, as IBM couldn't compete with other computers that already occupied that market segment when the IBM PC was ready, so the minimum model actually sold was 32KB RAM and one floppy drive. The XT is mostly a PC with 8 instead of 5 slots and a beefier power supply, so a hard drive could be built into the main case. The XT (5160) is meant to be the PC (5150) + its expansion unit (5161), reduced to a single unit of the same size. Furthermore, XT boards were built to support more on-board memory as bigger memory chips became available, but the architecture didn't change significantly compared to the PC.

With the AT, IBM tried to enter the "serious computing" market, and IBM considered BIOS-level support of multitasking a good idea as foundation. This is why the Int 15/90 and Int 15/91 callbacks were added. That's also why IBM introduced the SysRq (system request) key in the AT: This key was meant to request to suspend the current task and get to the operating system. So I really don't mean ST-506. I do mean the IBM AT BIOS, which included BIOS support for the AT variant of the WD MFM/RLL controllers in the mainboard BIOS. I am not aware that these calls were ever used in mainstream software, though.

rasz_pl wrote on 2023-04-25, 00:37:

maybe relevant https://jeffpar.github.io/kbarchive/kb/141/Q141591/
Q. Does Windows NT use 32-bit I/O accesses (also known as HDD Block Mode)?
A. To date, this has been seen to corrupt data in some cases. Therefore, it is
not used.
😳 Microsoft made an oopsie boopsie

This FAQ conflates two different concepts: 32-bit I/O (which is a thing between the CPU and the IDE interface chip) and read/write multiple (which is a thing between the IDE driver and the hard disk). Both of these things usually work, but some hardware has buggy implementations of these features that can corrupt data. So Microsoft plays it safe and uses the same features other legacy operating systems also use, to avoid entering "uncharted terrain". At that time, Linux also defaulted to not using those features unless explicitly enabled using hdparm. I wouldn't call that an oopsie by Microsoft.
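
To keep the two features apart: 32-bit I/O changes how many port accesses the CPU needs per sector, while read/write multiple changes how often the drive interrupts. A rough sketch under simplifying assumptions (256 words per sector, one interrupt per block on reads; `transfer_cost` is a made-up helper, not any driver's API):

```python
import math

WORDS_PER_SECTOR = 256  # 512-byte sectors, 16-bit IDE words


def transfer_cost(sectors: int, io_width_bits: int = 16, block_size: int = 1):
    """Return (port_accesses, interrupts) for a PIO read of `sectors` sectors.

    io_width_bits: 16 or 32, the width of each CPU access to the data port.
    block_size:    sectors per interrupt; 1 means read/write multiple is off.
    """
    words_per_access = io_width_bits // 16
    port_accesses = sectors * WORDS_PER_SECTOR // words_per_access
    interrupts = math.ceil(sectors / block_size)
    return port_accesses, interrupts


for width, block in [(16, 1), (32, 1), (16, 16), (32, 16)]:
    accesses, irqs = transfer_cost(64, io_width_bits=width, block_size=block)
    print(f"{width}-bit I/O, block size {block:2d}: "
          f"{accesses} port accesses, {irqs} interrupts")
```

The two knobs are independent: the I/O width only affects the first number and the block size only the second, which is why either one can be buggy on a given piece of hardware without the other being involved.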

rasz_pl wrote on 2023-04-25, 00:37:

I don't think anyone ever found any bus-mastering IDE ones. Were there ever any bus-mastering VLB cards? Maybe it was always broken, or sucked?

VL mastering is little used, and thus more prone to "interesting issues". As we already discussed in this thread (basically your point), the mainstream consumer operating systems (DOS, Windows 3.x) at that time were blocking during disk access anyway. If the OS blocks during access, it doesn't matter whether you use PIO or DMA, if PIO is fast enough to not stall the disk. So a DMA IDE controller will not significantly increase performance in DOS / Win3.1, but might have many more compatibility issues. If you were serious about disk drive performance at that time, you bought into SCSI anyway.

Reply 57 of 262, by jakethompson1

Rank Oldbie
mkarcher wrote on 2023-04-23, 15:37:

And that's where the features of all good VL IDE chips get relevant:

  • Read-ahead and posted writes tackle the first observation: During a sector transfer, it is expected that 256 words are being transferred. So, while the processor handles the first word, the controller chip can already initiate a 16-bit IDE read cycle to the drive, before the next I/O read happens on the VLB. This is "read ahead". Also, the controller chip can acknowledge a write cycle on the VL bus, and still drive the data and the /IOW or /IOR signal on the IDE cable. This is called "posted writes".
  • 32-bit I/O tackles the second observation: The 486DX-33 can also perform REP INSD at a cycle time of 240ns! If the VL IDE chip manages to pack two IDE transfers into one VL transfer, we get 240ns per double-transfer, so an effective rate of 120ns per 16-bit IDE word.

So a read-ahead buffer of 32 bits is enough: It takes two IDE words of data, waiting for the 486 to perform the next REP INSD iteration, which can then be answered without any wait states. The buffer is then refilled as soon as possible. The same is true for the post-write buffer and REP OUTSD. REP OUTSD can be performed faster (at 5 clocks per cycle, that is 150ns cycle time), if the data to output is currently cached in L1, though.

I'm curious how ATAPI devices were dealt with for read-ahead since the 256 word assumption doesn't hold.
We've talked about the UMC 8886A and 8886B which while being PCI instead of VLB, are similar in capability to VLB IDE chips, and those kept the FIFO disabled and required the driver to manually flip it on before doing a transfer, then flip it to draining mode at the right moment so that the controller won't over-read words from the drive, then disable it before returning from the IRQ handler.
I don't think the UM8672 (their VLB IDE chip) had any read-ahead capability aside from assembling words into dwords.
Did "better" chips just have their driver scan the bus for any ATAPI devices and de-tune it if they were found? I believe one of the drivers for a PDC or DTC chip did something like that. Were any smart enough to watch the command being sent to the drive and automatically enable the FIFO for read/write [multiple] sectors and keep it disabled for other/ATAPI commands?

The read ahead mode is the Achilles heel of the infamous CMD640/RZ1000, right?

Reply 58 of 262, by mkarcher

Rank l33t
douglar wrote on 2023-04-25, 12:27:

Tekram ST300ALI

That's the custom chip of the Tekram DC-680T caching IDE controller. The matching drivers can be found at http://files.mpoli.fi/hardware/HDD/TEKMAR/ for example (search for DC-6x0). It seems that in some countries, Tekram used the brand name Tekmar.

Don't expect good performance of the DC-680T unless you have a great amount of (onboard) cache hits. The IDE interface on that card is awfully slow (PIO0!). The firmware measures VL clock at boot-up and chooses one of four possible active/recovery time combinations for the drive interface to be that slow. I made a custom firmware that allows faster IDE timings by choosing the values designed for low VL clocks also at higher VL clocks, but you still don't get faster than PIO2 at 50MHz VL. As long as your mainboard can support more RAM, you are likely much better off adding RAM to the main board than using RAM on the Tekram controller.

Also, don't expect good IDE compatibility of the DC-680T controllers: The 1.x firmware doesn't even support IDE IDENTIFY DRIVE! The 2.x firmware does, but it doesn't support IDE block mode on either the host side or the drive side. The RAID implementation of the 1.x firmware is also extremely basic, the one in the 2.0 firmware is kind of OK. The issue with the Tekram controllers is in my opinion that they might be a great card to improve the performance of 1989 IDE drives by adding a cache to them, and (in the 1.x firmware) some kind of defect remapping, but that concept was obsolete as soon as these controllers hit the market. While it feels kind of nice to reboot your DOS machine without hearing any seeking noise from the hard drive (because the DOS kernel and the programs called from autoexec.bat are all in the DC-6x0 cache), for practical purposes, just use smartdrive. Other VL IDE controller vendors also found out that smartdrive with host memory is a simple yet effective caching solution.

Reply 59 of 262, by mkarcher

Rank l33t
jakethompson1 wrote on 2023-04-25, 20:27:

The read ahead mode is the Achilles heel of the infamous CMD640/RZ1000, right?

Yes. Those chips used a single FIFO (possibly just 2 words) for read-ahead for both IDE channels. This means that an IDE data transfer on the primary channel getting interrupted by a different task that tries to issue a command on the second channel will mess up the FIFO state. Do you remember the generic Windows 95 PCI IDE driver calling the supported PCI IDE interface chips "dual FIFO"? This basically means: "Not like the CMD640/RZ1000". The usual workarounds to deal with the CMD640 or RZ1000 chips were to either make sure that only one IDE channel is ever in use at the same time, or possibly just to make sure a REP INSW/INSD does not get interrupted.
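
The failure mode can be shown with a toy model. This is a sketch of the general idea only and does not claim to reproduce the internals of the CMD640 or RZ1000: the FIFO is reduced to a single prefetched word, the drives are plain lists of labelled words, and `ReadAheadController` and `interrupted_transfer` are invented names.

```python
class ReadAheadController:
    """Toy two-channel controller with a one-word read-ahead FIFO."""

    def __init__(self, drives, shared_fifo):
        self.drives = [iter(words) for words in drives]  # one stream per channel
        self.shared = shared_fifo
        self.fifo = [None, None]  # slot 0 serves both channels when shared

    def read_data(self, channel):
        slot = 0 if self.shared else channel
        if self.fifo[slot] is None:
            self.fifo[slot] = next(self.drives[channel])
        word = self.fifo[slot]
        # Read-ahead: prefetch the next word from the channel just accessed.
        self.fifo[slot] = next(self.drives[channel], None)
        return word


def interrupted_transfer(shared_fifo):
    ctrl = ReadAheadController(
        [["A0", "A1", "A2", "A3"], ["B0", "B1", "B2", "B3"]], shared_fifo)
    seen = [ctrl.read_data(0), ctrl.read_data(0)]  # transfer on channel 0 starts
    other = ctrl.read_data(1)                      # another task touches channel 1
    seen.append(ctrl.read_data(0))                 # first transfer resumes
    return seen, other


print(interrupted_transfer(shared_fifo=True))   # (['A0', 'A1', 'B0'], 'A2')
print(interrupted_transfer(shared_fifo=False))  # (['A0', 'A1', 'A2'], 'B0')
```

With the shared slot, both tasks end up with a word that belongs to the other channel. With one slot per channel ("dual FIFO" in this toy sense) the interleaving is harmless, and so is the shared slot as long as the second access never happens in the middle of a transfer.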