[*]Switching to a caching ISA controller will do memory mapped transfers instead of PIO transfers which can get you 4MB/s with some latency overhead when accessing data that's not in the cache.
[/list]
Sorry to get back on this. Do you have any proof of that memory mapped transfer? I couldn’t find one sign of that 😒 thank you
Thanks for posting the BIOS. This is a standard EIDE LBA BIOS, no signs of DMA anywhere. It also includes some kind of setup/low-level format program which can be invoked by calling offset 9 in that BIOS. Does the card ship with a driver disk that contains a DMA driver?
My ISA graphics - on normal, non-overclocked bus - does achieve about 4 MB/s: Re: Fast Ethernet on ISA
Right, the ISA Caching controllers transfer in a similar way to the ISA graphics cards, and that's how the ISA caching controllers get 4MB/s when reading from the cache. I can post some screen shots if you like.
It's also how FreddyV can get close to 2MB/s on his 8 bit cards or how the VLB Caching controllers get 16MB/s when reading from the cache.
The EIDE Master ISA is the only ISA IDE card available on the current market that provides Multiword DMA transfer hardware (up to 8.33 MB/sec). It breaks the bottleneck of IDE performance.
Furthermore, that just seems to be a BIOS kind of thing. When you use XTIDE, all ATA analysis tools show that my current IDE HDD is using MWDMA 2, at least IIRC.
Sure. Not quite sure it’s the latest result, but here we go. This is not with the CL1050, but I achieved the same results with that card, both times with XTIDE. ISA bus @ 16 MHz.
Thanks also for the memmap explanation and the non-DMA usage of the listed controllers. I did not receive DOS drivers for that card.
I had hoped to use this card: https://theretroweb.com/expansioncards/s/prom … se-eide4030plus but my Rev B1 card gave me a tantalum light show half way through the testing. Very exciting, but undesirable. The B2 version is finicky and I ran out of time.
The attachment ISACACH2_result.png is no longer available
Buslogic BT-410cd
The attachment VLBCACH2_result.png is no longer available
Promise Eide2300+ v2.0
The attachment PROM_result.png is no longer available
Here are the results in text form:
SiS 471, 8 MHz ISA      Seek Time   Cache Read    Linear Read
Buslogic BT-510a        0.88 ms     3548 KB/s     1140 KB/s
Buslogic BT-410cd       0.67 ms     21303 KB/s    2247 KB/s
Promise Eide2300+       0.38 ms     11949 KB/s    11775 KB/s

OPTi 391, 12 MHz ISA    Seek Time   Cache Read    Linear Read
Buslogic BT-510a        0.73 ms     6517 KB/s     1520 KB/s
Buslogic BT-410cd       0.65 ms     18275 KB/s    2217 KB/s
Promise Eide2300+       0.36 ms     12028 KB/s    11900 KB/s
Notes:
The Buslogic BT-410cd appears to be limited by the main memory read throughput of the OPTi chipset. The SiS motherboard, on the other hand, seems to reach max VLB speeds. I tried to max out the memory settings in the BIOS for both boards.
I loaded the driver for the Promise 2300+ using the "M8" flag, which it calls "turbo MWDMA mode". It's the fastest DMA speed that the Promise 20C630 controller does. It's my fastest all-around VLB performer, but it can be finicky about which storage devices work in that mode. It is curious that it looks like it has a "ramp up" period on the linear read test.
I don't know why speedsys started treating the CF on the Promise Eide2300 like a spinning disk. Usually that happens only if I forget to turn on BIOS shadowing and I double checked that. But I was out of time at that point, which is why I skipped the CPU and memory tests. The numbers matched expectations, so I didn't worry about it.
Each buslogic controller had one episode where they wouldn't write the PCX report to the gotek device after completing the tests. Thus is the way of retro computing, I guess. Sometimes weird stuff happens.
You did get a DOS driver for your CL-1050. It was just on an EPROM instead of a floppy disk. XTIDE is also a DOS storage driver.
Your 4,072 KB/s linear read score is about what I'd expect to see for PIO running on a 16 MHz ISA bus with the XTIDE ROM shadowed.
I'd expect 30-50% faster if you have block transfers enabled. Are you using the 386l XTIDE? Did you enable block transfers?
p.s. I loved my WD drive with the 8MB cache back in the early XP days. I remember hunting diligently for that "JB" suffix while nervously running an IBM deathstar drive.
Wouldn't ISA IDE controllers with busmaster support be something redundant/not useful anyway?
My understanding was that busmastering for the most part isn't possible on 486 anyway (when it comes to pci, not sure about ISA- is that even a thing?), but then anything newer has the possibility of busmaster on PCI which sort of negates the need to have an awfully slow controller card sitting on the ISA bus
486 PCI should have bus mastering; even if the onboard PCI IDE doesn't use it, PCI ethernet cards (even early ones like EtherLink III PCI) would expect it to be there and work, not to mention SCSI cards.
ISA has bus mastering capability added with the AT. This isn't the same as 8237 ISA DMA (as in floppy or Sound Blaster), but as mkarcher pointed out on another thread, the 8237 is still used to acquire and release the bus.
ISA bus mastering is used by SCSI cards. That means a few issues, including the 16MB memory boundary, and the DOS VDS specification having to be added, so that the SCSI option ROM can lock and read/write to the right physical memory when EMM386 or Windows protected mode is in use. If that is broken or the buffer is above 16MB, you have to use "double buffering" to copy to and from a buffer fixed in low physical memory, then copy from there to the desired buffer, defeating the point of the bus mastering. I suspect all this is part of why SCSI was considered difficult or complicated, beyond SCSI itself.
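To make the double-buffering rule above concrete, here is a minimal sketch of the decision a driver has to make before handing a buffer to an ISA bus master. The function name and the `vds_available` flag are hypothetical, made up for illustration only; real drivers get the physical address through VDS calls, which are not modelled here.

```python
# ISA has 24 address lines, so a bus master can only reach the first 16 MB.
ISA_ADDRESS_LIMIT = 16 * 1024 * 1024


def needs_double_buffer(phys_addr: int, length: int, vds_available: bool = True) -> bool:
    """Return True when the transfer must go through a low-memory bounce buffer.

    phys_addr     -- physical start address of the caller's buffer (hypothetical input)
    length        -- transfer length in bytes
    vds_available -- assumption: False stands for "VDS is missing or broken", so the
                     physical address cannot be trusted under EMM386 or Windows
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not vds_available:
        # Without a reliable physical address the card cannot be pointed at the buffer.
        return True
    # Any byte at or above 16 MB is unreachable for the ISA bus master.
    return phys_addr + length > ISA_ADDRESS_LIMIT


if __name__ == "__main__":
    for addr, size in [(0x100000, 4096), (0xFFF000, 8192), (0x2000000, 512)]:
        print(hex(addr), size, needs_double_buffer(addr, size))
```

Every transfer that comes back True costs an extra CPU copy, which is the "defeating the point" part.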
Yes block transfer is enabled. 8 or 16 not sure. So no further room for 30-50% unfortunately
Can you verify if it is working? When the XTIDE BIOS initializes, do you see “Block Mode” in the IDE device summary line? Or after running speedsys, do the report at the end and check out the ATA section in the txt file.
Do you have the XTIDE bios area shadowed into RAM?
Yes. 16 blocks enabled. Sorry for the photo, it's hard to read. It’s from my gallery, as I am not in front of the retro PC. I use the 8 KB BIOS build for the 386, not the full extended one, so it shows less information. I will check speedsys tomorrow.
This 4100 KB/s really seems to be at the upper end of what is possible, judging by what I have seen and heard here so far.
Bus master DMA:
Default transfer rate is 5 MB/s - actual observable speed in this mode is about 2.5 MB/s - download/file.php?id=230173&mode=view
Some systems can run in 8 MB/s mode - actual observable speed is about 6 MB/s - Re: Fast Ethernet on ISA
Adaptec BIOS even has the 10 MB/s option - but has anybody got it to work?
in win9x, you'd often have to go out of your way to disable dblbuff.sys, as apparently just using a fat32 partition will cause it to be loaded by default. this source actually claims that double buffering can improve disk i/o performance under realmode DOS, although with just a 486 and ISA busmastering SCSI controller that perhaps could be debatable: https://www.mdgx.com/newtip5.htm#DBLBUFF
the other thing is that the DoubleBuffer=1 line in msdos.sys is conditional and is supposed to only actually enable double buffering when the driver deems it necessary. did anyone actually test this, for instance whether it would enable double buffering for a PCI SCSI controller as well?
So, what's the maximum observable throughput of ISA at 8 MHz ?
That's a very nice summary. I'd like to add that for IDE over ISA, while the hardware on the motherboard side is almost all the same, finding the right combination of Int13h handler (either BIOS, driver, or option ROM) and storage device firmware is necessary to get close to the theoretical limit. That's the maddening part of doing storage benchmarks. A storage device that does fine in one system might underperform in another for reasons that are just opaque to most system utilities out there.
I see MR-Bios v1.6x has options to mess with the cycles for the onboard DMA handler. Shouldn't affect anything IDE related unless you get one of those unicorn DMA IDE controllers
DMA SETUP
This menu selection appears only in systems containing an 82C206 Integrated Peripheral Controller (IPC), or equivalent part. In such systems, this Utility permits optional fine tuning of the DMA subsystem. Usually, no adjustments need be made, and you should simply confirm that an asterisk (*) appears next to each field (default settings).

All PC computers contain a Direct Memory Access (DMA) controller, either as a discrete component or incorporated in large-scale logic. This device allows various peripherals to bypass the system CPU and access memory directly. In typical installations, DMA is used only by the floppy disk controller. Add-on cards that might use DMA services include Networks, SCSI, tape drives, and special feature adapters like music synthesizers.

In certain cases, you will be able to improve throughput to/from an add-on card that uses DMA by adjusting parameters here. Or, an adapter card that refuses to operate correctly may be made to work by adjusting DMA timing more to its liking. As a general rule, high performance cards may benefit from a fast DMA Clock, and slower cards may require more Wait States. The other parameters are extremely technical, and you will need to consult timing diagrams in order to fully understand them.

DMA Clock - The clock to the DMA controller is derived from ATCLK (AT-Bus Clock). ATCLK divided by 2 is the industry standard (default) setting. You can select the unreduced clock, ATCLK divided by 1, for faster operation.
* DMA Clock ... ATCLK/2 standard DMA clock
  DMA Clock ... ATCLK/1 fast DMA clock

8-Bit Waits - The number of Wait States in an 8-bit DMA cycle is programmable from 1 (the default) to a maximum of 4. This field affects floppy operation.
* 8-Bit Waits ... 1 WS standard wait state
  8-Bit Waits ... 4 WS maximum wait states

16-Bit Waits - The number of Wait States in a 16-bit DMA cycle is programmable from 1 (the default) to a maximum of 4. This field does not affect floppy operation.
* 16-Bit Waits ... 1 WS standard wait state
  16-Bit Waits ... 4 WS maximum wait states

Command Width - A "Normal" DMA transfer cycle requires 4 clocks to complete, but can be optionally "Compressed" to 3 clocks.
* Command Width ... Normal default width
  Command Width ... Compress compressed width

MEMR# Signal - The MEMORY READ control signal can be configured to start one clock cycle earlier than its normal position.
* MEMR# Signal ... Normal standard MEM READ
  MEMR# Signal ... Early start MEM READ early

MEMW# Signal - The MEMORY WRITE control signal can be configured to start one clock cycle earlier than its normal position.
* MEMW# Signal ... Normal standard MEM WRITE
  MEMW# Signal ... Early start MEM WRITE early
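For a rough feel of what those knobs could mean in throughput terms, here is a back-of-the-envelope sketch. It assumes that wait states simply add to the 4 (or 3, compressed) clocks per transfer, which the manual excerpt does not actually state, and it ignores bus arbitration and refresh entirely. Treat the result as a ceiling estimate under those assumptions, not a measurement; the function name and parameters are made up for illustration.

```python
def dma_ceiling_mb_s(atclk_mhz: float,
                     clock_divider: int = 2,
                     clocks_per_cycle: int = 4,
                     wait_states: int = 1,
                     bytes_per_transfer: int = 2) -> float:
    """Estimate the 8237-style DMA ceiling in MB/s (1 MB = 10**6 bytes).

    ASSUMPTION: total clocks per transfer = clocks_per_cycle + wait_states.
    The quoted manual gives the individual numbers but not how they combine.
    """
    if clock_divider not in (1, 2):
        raise ValueError("the setup menu only offers ATCLK/1 and ATCLK/2")
    if clocks_per_cycle not in (3, 4):
        raise ValueError("cycle is either Normal (4) or Compressed (3)")
    if not 1 <= wait_states <= 4:
        raise ValueError("wait states are programmable from 1 to 4")
    dma_clock_hz = atclk_mhz * 1_000_000 / clock_divider
    transfers_per_second = dma_clock_hz / (clocks_per_cycle + wait_states)
    return transfers_per_second * bytes_per_transfer / 1_000_000


if __name__ == "__main__":
    # Defaults from the menu (asterisk entries) on an 8 MHz AT bus, 16-bit channel.
    print(dma_ceiling_mb_s(8.0))
    # Fast clock plus compressed command width, same wait states.
    print(dma_ceiling_mb_s(8.0, clock_divider=1, clocks_per_cycle=3))
```

Even with everything set to the fast options, this model stays well below what PIO already achieves in the benchmarks posted in this thread.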
What are DMA modes?
DMA or Direct Memory Access means that the data is transferred directly between drive and memory without using the CPU as an intermediary, in contrast to PIO. In true multitasking operating systems like OS/2 or Linux, DMA leaves the CPU free to do something useful during disk transfers. In a DOS/Windows environment the CPU will have to wait for the transfer to finish anyway, so in these cases DMA isn't terribly useful.
There are two distinct types of direct memory access: third-party DMA and first-party or busmastering DMA. Third-party DMA relies on the DMA controller on the system's mainboard to perform the complex task of arbitration, grabbing the system bus and transferring the data. In the case of first-party DMA, all this is done by logic on the interface card itself. Of course, this adds considerably to the complexity and the price of a busmastering interface.
Unfortunately, the DMA controller on ISA systems is ancient and slow, and out of the question for use with a modern harddisk. VLB cards cannot be used as DMA targets at all and can only do busmastering DMA. It is only on EISA- and PCI-based interfaces that non-busmastering DMA is viable: EISA type 'B' DMA will transfer 4MB/s, PCI type 'F' DMA between 6 and 8MB/s.
Today, all modern chipsets, including the ubiquitous Triton chipsets, incorporate a busmastering DMA capable ATA interface. Efforts to standardize the DMA hardware will ensure stable and reliable software support.
DMA Mode       Cycle time   Transfer rate   Standard
                  (ns)         (MB/s)
Single word
  0               960           2.1         ATA
  1               480           4.2         ATA
  2               240           8.3         ATA
Multiword
  0               480           4.2         ATA
  1               150          13.3         ATA-2
  2               120          16.6         ATA-2
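The transfer rates in that table follow directly from the cycle times, since the ATA data bus moves one 16-bit word per cycle. A small sketch of the arithmetic (the function name is made up for illustration, and MB here means 10^6 bytes):

```python
BYTES_PER_CYCLE = 2  # ATA transfers one 16-bit word per DMA cycle


def ata_dma_rate_mb_s(cycle_time_ns: float) -> float:
    """Peak transfer rate in MB/s (10**6 bytes) for a given DMA cycle time."""
    if cycle_time_ns <= 0:
        raise ValueError("cycle time must be positive")
    # bytes per nanosecond times 1000 gives bytes per microsecond, i.e. MB/s
    return BYTES_PER_CYCLE / cycle_time_ns * 1000


if __name__ == "__main__":
    for label, cycle_ns in [("SW 0", 960), ("SW 1", 480), ("SW 2", 240),
                            ("MW 0", 480), ("MW 1", 150), ("MW 2", 120)]:
        print(f"{label}: {cycle_ns} ns -> {ata_dma_rate_mb_s(cycle_ns):.2f} MB/s")
```

For the 120 ns case the exact value is 16.67 MB/s; the FAQ's 16.6 is the same number truncated rather than rounded.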
The single word DMA modes are hardly useful and are obsoleted in ATA-3. Note that some older interfaces are able to use these DMA modes as a way to communicate with the drive, without actually doing direct memory access at all. In these cases, the DMA modes are just used as glorified PIO modes.
There's no reason PCI IDE interfaces couldn't speak PIO4 to the drive yet still bus master the data to memory. The UMC UM8886B can do this. The problem is that the IRQ14 comes at the wrong time: before the sectors are bus mastered to RAM and not after. So once the IRQ happens, the driver has to poll the 8886B until its 'remaining DWORDs to transfer' count reaches zero. There could be error handling reasons why the 8886B couldn't "absorb" the IRQ when it comes in, go ahead and bus master the transfer, then issue the IRQ14 to the host once the transfer is done. The UM8886B requires the driver to manage the state of the FIFO between filling and draining, can't scatter-gather, and has alignment restrictions so those could be more reasons not to bother. As MWDMA already issues the IRQ14 at the right time for both error and success conditions, perhaps that's why Intel resurrected MWDMA with the PIIX.