VOGONS


First post, by rasz_pl

User metadata
Rank Oldbie

Plenty of EISA bus masters exist, but why bother when the third-party DMA controller on the EISA bus was apparently full speed, with none of the stupid ISA 8237 limitations (3 MHz, 2 cycles per byte)? A ton of PCI chips supported bus mastering (for example S3 ViRGE), but a lot of CPU chipsets didn't (SiS/VIA/AMD Athlon ones?), so it was rarely used. The only ISA bus mastering card families I found are Adaptec AHA-154x and Novell NE2100/AMD PCNET Am2100/AMD LANCE Am79C9xx. Issues I know of:

- 16 MB addressable system RAM limitation
- feipoa mentioned in Re: What is 2nd-party DMA on SCSI controllers? that ISA bus mastering has a chance of screwing up cache coherency when used on 386 boards in combination with the DLC/SXL. This sounds like a problem with motherboards that lack direct Cyrix CPU support.
- requires a cascaded DMA channel configured on the first 8237, used to secure the bus isolation (HOLD/HLDA) signal. Intel 386 documentation at least suggests (but doesn't outright spell out) that HOLD just controls access to the bus and will not disrupt execution of an already loaded instruction.
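
The 16 MB limit in the first bullet follows from the 24 address lines (A0..A23) on the ISA bus. A minimal sketch of the check a driver would have to make before handing a buffer to an ISA bus master. The function name is made up for illustration and is not taken from any Adaptec or AMD driver:

```python
ISA_ADDRESS_LINES = 24                 # ISA exposes address lines A0..A23
ISA_LIMIT = 1 << ISA_ADDRESS_LINES     # 16 MiB


def reachable_by_isa_master(phys_addr: int, length: int) -> bool:
    """Return True if the whole buffer lies below the 16 MiB ISA limit.

    Hypothetical helper: a real driver would fall back to a bounce
    buffer in low memory when this returns False.
    """
    if phys_addr < 0 or length <= 0:
        raise ValueError("address must be >= 0 and length > 0")
    return phys_addr + length <= ISA_LIMIT
```

A buffer that starts below 16 MB but ends above it is rejected as well, since the card cannot generate the upper addresses.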

Speed: the AHA-1542 brochure claims "Host Bus Burst Data Rate: 10 MByte/sec", which does sound sus, while the manual https://www.philscomputerlab.com/uploads/3/7/ … aha1540b_um.pdf describes jumper selection between 5, 5.7, 6.7 and 8 MB/s. Considering a standard ~8.33 MHz 16-bit ISA transfer can be performed in 2 cycles minimum, this might actually be correct, but where do 5, 5.7 and 6.7 come from? I would understand 5.3 MB/s, as that's 3-cycle transfers on a standard 8 MHz bus. Maybe they meant MHz, which would make "8 MB/s" translate to a 125 ns bus cycle setting?
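
The arithmetic behind those figures, as a small Python sketch. It assumes back-to-back 16-bit transfers that each take a whole number of bus clocks, and MB meaning 10^6 bytes. The function name is made up for illustration:

```python
def isa_rate_mb_s(bus_mhz: float, clocks_per_transfer: int,
                  bytes_per_transfer: int = 2) -> float:
    """Peak rate in MB/s (10**6 bytes) for back-to-back transfers."""
    if clocks_per_transfer <= 0:
        raise ValueError("clocks_per_transfer must be positive")
    return bus_mhz * bytes_per_transfer / clocks_per_transfer


if __name__ == "__main__":
    for bus_mhz in (8.0, 8.33):
        for clocks in (2, 3, 4):
            rate = isa_rate_mb_s(bus_mhz, clocks)
            print(f"{bus_mhz} MHz, {clocks} clocks/word: {rate:.2f} MB/s")
```

At 8 MHz this gives 8.00, 5.33 and 4.00 MB/s for 2, 3 and 4 clocks per word, so 5.7 and 6.7 MB/s do not fall out of whole-clock cycle counts.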

Were there any other ISA Bus Mastering cards?
Is it possible to perform an ISA bus master transfer from one card to another? Let's say directly to VGA card memory? What about potential wait states (ISA System Architecture, Third Ed. doesn't mention such a possibility, or wait states at all)?

Reply 1 of 13, by pshipkov

User metadata
Rank Oldbie
rasz_pl wrote on 2022-05-14, 03:47:

but where do 5, 5.7 and 6.7 come from?

Didn't bother figuring out the exact math, but I take it as wait states added somewhere in the I/O pipeline to accommodate different grades of hardware.

rasz_pl wrote on 2022-05-14, 03:47:

Is it possible to perform an ISA bus master transfer from one card to another? Let's say directly to VGA card memory? What about potential wait states (ISA System Architecture, Third Ed. doesn't mention such a possibility, or wait states at all)?

P2P DMA is possible, but requires hardware/software alignment on both sides - there must be silicon that does something with the data and you need 2 layers of drivers - source/client and orchestrator.
I cannot think of PC hardware like this, but if we search online there is a good chance some specialized systems from back then will show up.
The storage to VGA example does not make sense because the (common at least) graphics adapters from back then just didn't have silicon/drivers to handle such scenarios.

retro bits and bytes

Reply 2 of 13, by Disruptor

User metadata
Rank Oldbie

Well, I'd like to compare 8 bit DMA from XT with 16 bit DMA from AT and ISA busmaster DMA and EISA busmaster DMA.

EISA busmaster DMA is really fast and a class of its own. With an Adaptec 2742W it is no problem to get transfer rates of over 18 MB/s.

XT DMA is the fastest way to connect a hard drive to a PC XT. The DMA controller runs at the full 4.77 MHz.
AT DMA is not the ideal way for fast transfers because the DMA controller does not run at bus frequency. However, it may be used when you do not need maximum speed but 8-bit DMA transfers are not fast enough. An example is playback and recording of 16-bit sound on a Sound Blaster 16. On AT PCs, disk transfers don't use DMA at all but use PIO (programmed I/O) instead, because it is faster to avoid the legacy DMA chips.
ISA busmaster DMA is the fastest way. Adaptec's 1542 controllers have a diagnostic program that lets you find out the maximum stable transfer rate in your particular computer. The configurable rates are: 3.3, 5.0, 5.7, 6.7, 8.0, 10 MB/s. In my 386 SX/20 the maximum stable rate is 5.7 MB/s. The busmaster controller 'misuses' the legacy DMA chip just to reserve the bus, but it does the transfers itself, resulting in a faster speed.

Last edited by Disruptor on 2022-05-14, 15:44. Edited 1 time in total.

Reply 3 of 13, by maxtherabbit

User metadata
Rank l33t

The Adaptec 154x cards can perform bus master DMA to ANY memory address in the first 16 MB. This includes video RAM and any other RAM mapped to the UMA on peripheral cards.

Reply 4 of 13, by pshipkov

User metadata
Rank Oldbie

The question was if DMA can be performed without routing data through system memory.
Similar to how NVIDIA's RDMA works now-a-days.

As i said above - cannot think of ordinary retro hardware that can do it.


Reply 5 of 13, by mkarcher

User metadata
Rank Oldbie
pshipkov wrote on 2022-05-14, 18:54:

The question was if DMA can be performed without routing data through system memory.
Similar to how NVIDIA's RDMA works now-a-days.

As i said above - cannot think of ordinary retro hardware that can do it.

maxtherabbit is probably correct here. On a classic ISA system (ordinary retro hardware), an Adaptec 1542 (that's ordinary retro hardware, too) will act as ISA bus master. Once the 1542 has arbitrated for the bus (using DREQ/DACK/MASTER), it will drive the ISA bus just the same way as the chipset would drive the ISA bus if the processor did a memory access cycle. So the 1542 can issue bus cycles to the VGA card just like the processor does. The VGA card has no way of knowing whether it is a processor-initiated or an Adaptec-initiated ISA cycle.

There might be stuff in the fine print spoiling direct ISA-to-ISA DMA, though: ISA dynamically negotiates bus width (8 or 16 bits), and I don't know whether the Adaptec host adapter performs negotiation. Possibly, the Adaptec host adapter blindly assumes that the ISA target is able to perform 16-bit cycles and doesn't bother to check MEMCS16. In that case, bus mastering to (and especially from) VGA cards might or might not work, depending on the card. Another point in the fine print of ISA bus mastering: there is nothing in the specification mandating the mainboard to drive IOCHRDY or 0WS when you bus master to/from memory. So a bus master issues a memory cycle to the ISA bus and then waits for "an appropriate amount of time" for the cycle to actually complete. That's why you need to set the rate on the Adaptec 1542: you choose the "appropriate time".

Keep in mind that the ISA bus is not specified as a synchronous bus with cycles taking an integer number of bus clocks and signals being evaluated at specific phases of the clock signal. Nearly everything on the ISA bus is asynchronous. A memory write cycle starts when /MEMW is pulled low. It takes a "default duration", but will be extended when IOCHRDY is pulled low. It can be shortened when /0WS is pulled low (on an AT or newer; there is no /0WS on a PC/XT). And here's the point why I wrote nearly everything is asynchronous: the /0WS signal is the only signal that's specified to be sampled in a specific clock phase.
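
To put rough numbers on the "default duration" and how it is stretched or shortened, here is a simplified model in Python. It counts whole bus clocks, which is only an approximation given that the bus is specified asynchronously. The 3-clock default and 2-clock zero-wait-state figures for a 16-bit memory cycle are assumptions based on the cycle counts used elsewhere in this thread, and the function names are made up for illustration:

```python
def cycle_clocks(default_clocks: int = 3, zero_ws: bool = False,
                 extra_wait_clocks: int = 0) -> int:
    """Length of one 16-bit memory cycle in bus clocks (simplified).

    zero_ws           -- target pulled /0WS low, shortening the cycle
    extra_wait_clocks -- clocks added while IOCHRDY is held low
    """
    if extra_wait_clocks < 0:
        raise ValueError("extra_wait_clocks must be >= 0")
    if zero_ws and extra_wait_clocks:
        raise ValueError("a cycle is either shortened or extended, not both")
    if zero_ws:
        return 2                      # assumed zero-wait-state length
    return default_clocks + extra_wait_clocks


def cycle_ns(bus_mhz: float, clocks: int) -> float:
    """Convert a clock count to nanoseconds at the given bus clock."""
    return clocks * 1000.0 / bus_mhz
```

At 8 MHz the model gives 375 ns for a default cycle and 250 ns for a zero-wait-state one. A bus master that gets neither IOCHRDY nor /0WS from the mainboard has to pick such a duration on its own.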

Reply 6 of 13, by maxtherabbit

User metadata
Rank l33t
mkarcher wrote on 2022-05-14, 22:25:
pshipkov wrote on 2022-05-14, 18:54:

The question was if DMA can be performed without routing data through system memory.
Similar to how NVIDIA's RDMA works now-a-days.

As i said above - cannot think of ordinary retro hardware that can do it.

maxtherabbit is probably correct here. On a classic ISA system (ordinary retro hardware), an Adaptec 1542 (that's ordinary retro hardware, too) will act as ISA bus master. Once the 1542 has arbitrated for the bus (using DREQ/DACK/MASTER), it will drive the ISA bus just the same way as the chipset would drive the ISA bus if the processor did a memory access cycle. So the 1542 can issue bus cycles to the VGA card just like the processor does. The VGA card has no way of knowing whether it is a processor-initiated or an Adaptec-initiated ISA cycle.

There might be stuff in the fine print spoiling direct ISA-to-ISA DMA, though: ISA dynamically negotiates bus width (8 or 16 bits), and I don't know whether the Adaptec host adapter performs negotiation. Possibly, the Adaptec host adapter blindly assumes that the ISA target is able to perform 16-bit cycles and doesn't bother to check MEMCS16. In that case, bus mastering to (and especially from) VGA cards might or might not work, depending on the card. Another point in the fine print of ISA bus mastering: there is nothing in the specification mandating the mainboard to drive IOCHRDY or 0WS when you bus master to/from memory. So a bus master issues a memory cycle to the ISA bus and then waits for "an appropriate amount of time" for the cycle to actually complete. That's why you need to set the rate on the Adaptec 1542: you choose the "appropriate time".

Keep in mind that the ISA bus is not specified as a synchronous bus with cycles taking an integer number of bus clocks and signals being evaluated at specific phases of the clock signal. Nearly everything on the ISA bus is asynchronous. A memory write cycle starts when /MEMW is pulled low. It takes a "default duration", but will be extended when IOCHRDY is pulled low. It can be shortened when /0WS is pulled low (on an AT or newer; there is no /0WS on a PC/XT). And here's the point why I wrote nearly everything is asynchronous: the /0WS signal is the only signal that's specified to be sampled in a specific clock phase.

2.1.3 8 Bit Memory

During normal DMA operations, nearly all transfers to and from memory are 16
bit transfers. At the very end, or the very beginning of an odd address
boundary, an 8 bit transfer on the upper data bits (D8-D15) will occur
according to the AT bus architecture. Some memory in the I/O space, such as
video RAM, is 8 bits only and always transfers data only on the lower data bits
(D0-D7). The AHA-1540A/1542A will transfer 16 bit or 8 bit memory in the
address space between 0A0000 hex and 0BFFFF hex depending on the signal line
MEM16 on the AT bus. If this signal is active, 16 bit memory is assumed, and if
inactive 8 bit memory is assumed. Outside of this address space 16 bit memory
is always assumed.

Attachments

  • 1540A.DOC (252.22 KiB, file license: fair use/fair dealing exception)

Reply 7 of 13, by pshipkov

User metadata
Rank Oldbie

Thanks @mkarcher and @maxtherabbit for the info.
Looks like you are familiar with the matter.
I checked online about what others say.
There is plenty of info related to peer-to-peer DMA on a PCI bus.
Providing two links:

link 1

and especially this one
https://lwn.net/Articles/767281/

I was not too far off in my previous replies, but I have no idea how applicable this is to ISA.
Looks like in a non-fixed-function pipeline you cannot just move data between devices in a meaningful way (or at all), unless the end points are designed to work together and/or there is software orchestration and/or BIOS support.

I am not sure what mechanism would be used in an ISA system to interoperate between devices.
How would one device discover, let's say, the VGA card (its memory), write to that memory, then instruct the client device to handle the transaction, immediately or deferred, without affecting the integrity of the system as a whole?
Some of that can be implemented as firmware, but the rest of the system needs to be aware of it and respect it.
I don't know; I should probably read up on all this before talking here.

What do you think?


Reply 8 of 13, by rasz_pl

User metadata
Rank Oldbie
maxtherabbit wrote on 2022-05-14, 14:26:

The Adaptec 154x cards can perform bus master DMA to ANY memory address in the first 16 MB. This includes video RAM and any other RAM mapped to the UMA on peripheral cards.

That's the theory; the nitty-gritty details are where I can't find any information.

mkarcher wrote on 2022-05-14, 22:25:

There might be stuff in the fine print spoiling direct ISA-to-ISA DMA, though: ISA dynamically negotiates bus width (8 or 16 bits), and I don't know whether the Adaptec host adapter performs negotiation. Possibly, the Adaptec host adapter blindly assumes that the ISA target is able to perform 16-bit cycles and doesn't bother to check MEMCS16. In that case, bus mastering to (and especially from) VGA cards might or might not work, depending on the card.

The 1540 documentation mentions LA lines (address pipelining), which might suggest it strictly expects to be in a 16-bit capable slot and nothing else. The document linked by maxtherabbit claims it shouldn't have trouble talking to 8-bit devices:

1540A.DOC wrote:

16- and 8-bit transfers Odd and Even starting address transfers and odd or even data lengths

On the other hand, a very fun topic, again by maxtherabbit, Re: Why does EMS suck on my 286?, discovered otherwise.

mkarcher wrote on 2022-05-14, 22:25:

Another point in the fine print of ISA bus mastering: There is nothing in the specification mandating the mainboard to drive IOCHRDY or 0WS when you bus master to/from memory.

Seems Adaptec thought otherwise? :

1540A.DOC wrote:

The I/O Channel Ready signal automatically slows the system further if required by the host memory.

But there is still no mention of handling the NOWS signal, or any hint whether motherboards generate one.

mkarcher wrote on 2022-05-14, 22:25:

So a bus master issues a memory cycle to the ISA bus and then waits for "an appropriate amount of time" for the cycle to actually complete. That's why you need to set the rate on the Adaptec 1542, you choose the "appropriate time".

This bugs me. The ISA clock is fixed, so how would a bus master wait anything between 2 and 3/4/5/etc. clocks? It's either 8 or 5.3 or 4 MB/s etc.; other numbers could only be derived by dropping bus mastering every x transfers, something the AMD LANCE manual mentions. 1540A.DOC also states a default 11 µs bus-on time and 4 µs bus-off time. That translates to ~70% bus utilization. 8 MHz * 70% = ~5 MB/s, sort of. This would explain the default "5.0 MB/s". On the other hand, a 2-clock cycle is only possible when the target generates NOWS; do motherboards do that? With 3-clock bus transfers the numbers again stop making sense 🙁 Unless Adaptec DGAF and just blasts 2-clock transfers even at the slowest 250 ns setting.
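
The bus-on/bus-off estimate can be written out as a small Python sketch. It assumes the card transfers at its peak rate for the whole bus-on time and not at all during the bus-off time, and it ignores arbitration overhead. The function names are made up for illustration:

```python
def bus_utilization(on_us: float, off_us: float) -> float:
    """Fraction of time the bus master holds the bus."""
    if on_us <= 0 or off_us < 0:
        raise ValueError("on_us must be > 0 and off_us >= 0")
    return on_us / (on_us + off_us)


def effective_rate_mb_s(peak_mb_s: float, on_us: float, off_us: float) -> float:
    """Average rate when bursts at peak_mb_s alternate with idle gaps."""
    return peak_mb_s * bus_utilization(on_us, off_us)


if __name__ == "__main__":
    # Defaults quoted from 1540A.DOC: 11 us bus-on, 4 us bus-off.
    for peak in (8.0, 16.0 / 3):      # 2-clock and 3-clock words at 8 MHz
        print(f"peak {peak:.2f} MB/s -> "
              f"{effective_rate_mb_s(peak, 11, 4):.2f} MB/s average")
```

With 11 µs on and 4 µs off the utilization is about 73%, which turns an 8 MB/s peak into roughly 5.9 MB/s and a 5.33 MB/s peak into roughly 3.9 MB/s.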

1540A.DOC wrote:

AHA-1540A/1542A MEMORY CYCLE TIMING

STANDARD SPEEDS   tRR/tWW (ns)   TIMING
5.0 MB            250            A
5.7 MB            200            A
6.7 MB            200            A
8.0 MB            150            A
10.0 MB           100            B

Wait, what? 5.7 and 6.7 are the same; a placebo setting? 😀
The fastest standard ISA bus cycle takes two clocks, 250 ns at 8 MHz. So do those higher settings simply mean the 154x could potentially work at up to a 20 MHz ISA clock without wait states? That would finally make some sense!
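
That reading of the table can be checked with a few lines of Python. The sketch assumes the tRR/tWW values are in nanoseconds and describe one complete 2-clock cycle, neither of which the quoted table states outright. The dictionary and function names are made up for illustration:

```python
# tRR/tWW values from the 1540A.DOC table, keyed by the advertised speed.
CYCLE_NS = {"5.0 MB": 250, "5.7 MB": 200, "6.7 MB": 200,
            "8.0 MB": 150, "10.0 MB": 100}


def implied_bus_clock_mhz(cycle_ns: float, clocks_per_cycle: int = 2) -> float:
    """Bus clock at which clocks_per_cycle clocks last exactly cycle_ns."""
    if cycle_ns <= 0:
        raise ValueError("cycle_ns must be positive")
    return clocks_per_cycle * 1000.0 / cycle_ns


if __name__ == "__main__":
    for setting, ns in CYCLE_NS.items():
        print(f"{setting}: {ns} ns -> {implied_bus_clock_mhz(ns):.2f} MHz")
```

Under those assumptions 250 ns corresponds to an 8 MHz bus clock, 200 ns to 10 MHz, 150 ns to about 13.3 MHz and 100 ns to 20 MHz.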

mkarcher wrote on 2022-05-14, 22:25:

Keep in mind that the ISA bus is not specified as a synchronous bus with cycles taking an integer number of bus clocks and signals being evaluated at specific phases of the clock signal. Nearly everything on the ISA bus is asynchronous.

That doesn't sound legit 😀

mkarcher wrote on 2022-05-14, 22:25:

A memory write cycle starts when /MEMW is pulled low. It takes a "default duration", but will be extended when IOCHRDY is pulled low. It can be shortened when /0WS is pulled low (on an AT or newer; there is no /0WS on a PC/XT). And here's the point why I wrote nearly everything is asynchronous: the /0WS signal is the only signal that's specified to be sampled in a specific clock phase.

From what I'm seeing in ISA System Architecture, Third Ed., everything (but CHRDY?) happens at specific clock edges and is very much counted in integer bus clocks.

1540A.DOC wrote:

2.1.3 8 Bit Memory
During normal DMA operations, nearly all transfers to and from memory are 16
bit transfers. At the very end, or the very beginning of an odd address
boundary, an 8 bit transfer on the upper data bits (D8-D15) will occur
according to the AT bus architecture. Some memory in the I/O space, such as
video RAM, is 8 bits only and always transfers data only on the lower data bits
(D0-D7). The AHA-1540A/1542A will transfer 16 bit or 8 bit memory in the
address space between 0A0000 hex and 0BFFFF hex depending on the signal line
MEM16 on the AT bus. If this signal is active, 16 bit memory is assumed, and if
inactive 8 bit memory is assumed. Outside of this address space 16 bit memory
is always assumed.

now this is weird and just adds to the confusion 😀

>Some memory in the I/O space, such as video RAM, is 8 bits only and always transfers data only on the lower data bits (D0-D7).

This must have been about specific IBM models with specific video adapters? Maybe it was still true in 1989 when it was written.

>Outside of this address space 16 bit memory is always assumed.

And this would mean Adaptec obeys M16 only in the 0A0000-0BFFFF range?!?! 😀 See Re: Making my first ISA card, as pain-free as possible and Re: Why does EMS suck on my 286? Amazing 😮
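
The rule from section 2.1.3, written out as a Python sketch so the consequence is easier to see. It only restates the quoted text and is not taken from Adaptec firmware. The function name is made up for illustration:

```python
# Address window in which the quoted manual says MEM16 is honoured.
VIDEO_WINDOW = range(0x0A0000, 0x0C0000)     # 0A0000h..0BFFFFh inclusive


def assumed_width_bits(addr: int, mem16_active: bool) -> int:
    """Bus width the AHA-1540A/1542A would assume, per the quoted section."""
    if addr in VIDEO_WINDOW:
        return 16 if mem16_active else 8
    return 16                                # MEM16 is not consulted here


if __name__ == "__main__":
    # An 8-bit memory window at D0000h (for example an EMS page frame)
    # does not assert MEM16, yet is still treated as 16-bit memory.
    print(assumed_width_bits(0xD0000, mem16_active=False))
```

An 8-bit target outside the video window is therefore accessed with 16-bit cycles regardless of what it signals.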

The whole reason for this topic is my curiosity whether it was at all possible to release a 2D sprite accelerator in the late-80s 286 to very-early-90s slow 386 era, a PowerVR PCX1/2 of its time 😀. One alternate-history timeline could see Atari introducing the Lynx Suzy blitter (1989) on an ISA card. Another option is a basic DMA engine. Something like the 1984 Commodore REU can only perform ~1 MB/s block transfers, but is able to deliver Sonic on the C64: https://www.youtube.com/watch?v=XAc-em-Kugk
30 fps sprite games would be solidly within reach of such a contraption: 2 MB/s of VGA writes + 2 MB/s of reads from RAM. Maybe even 60 fps with its own buffer/cache on board.
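
Where the ~2 MB/s figure comes from, as a Python sketch. The 320x200 resolution at one byte per pixel (VGA mode 13h) is an assumption made for this example and is not stated above. The function name is made up for illustration:

```python
def frame_bandwidth_mb_s(width: int, height: int,
                         bytes_per_pixel: int, fps: int) -> float:
    """Bandwidth in MB/s (10**6 bytes) needed to rewrite every pixel each frame."""
    return width * height * bytes_per_pixel * fps / 1_000_000


if __name__ == "__main__":
    # Assumed mode: 320x200, 256 colours, 1 byte per pixel.
    for fps in (30, 60):
        print(f"{fps} fps: {frame_bandwidth_mb_s(320, 200, 1, fps):.2f} MB/s")
```

A full-screen rewrite costs 1.92 MB/s at 30 fps and 3.84 MB/s at 60 fps, which is why 60 fps only looks plausible if most of the traffic stays on the card.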
A card with a dedicated 128 KB for a work framebuffer and some sprites, and only two block commands:
- copy to/from main RAM
- internal-only copy, conditional on source value (transparency), optionally conditional on destination value range (could define custom planes by assigning color palette ranges), with an optional trigger on destination value range (collision detection)
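
The second command can be sketched in software to pin down its semantics. This is a hypothetical model in Python, not a description of any existing hardware. The parameter names `transparent`, `protect` and `trigger` are made up for illustration:

```python
from typing import Optional


def conditional_copy(src: bytes, dst: bytearray, dst_offset: int,
                     transparent: int = 0,
                     protect: Optional[range] = None,
                     trigger: Optional[range] = None) -> bool:
    """Copy src into dst starting at dst_offset.

    transparent -- source value that is skipped (sprite transparency)
    protect     -- destination values that must not be overwritten
                   (a "plane" defined by a palette range)
    trigger     -- destination values that raise the collision flag

    Returns True if any written-to position held a trigger value.
    """
    if dst_offset < 0 or dst_offset + len(src) > len(dst):
        raise ValueError("copy would fall outside the destination buffer")
    collision = False
    for i, value in enumerate(src):
        if value == transparent:
            continue                      # transparent source pixel
        j = dst_offset + i
        if trigger is not None and dst[j] in trigger:
            collision = True              # sprite touched a trigger pixel
        if protect is not None and dst[j] in protect:
            continue                      # destination pixel stays on top
        dst[j] = value
    return collision
```

For example, copying `bytes([5, 0, 6, 7])` over `bytearray([1, 1, 200, 1])` with `protect=range(192, 256)` and `trigger=range(192, 256)` leaves the 200 in place, writes 5 and 7, and reports a collision.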

Reply 9 of 13, by mkarcher

User metadata
Rank Oldbie
rasz_pl wrote on 2022-05-15, 09:57:
mkarcher wrote on 2022-05-14, 22:25:

A memory write cycle starts when /MEMW is pulled low. It takes a "default duration", but will be extended when IOCHRDY is pulled low. It can be shortened when /0WS is pulled low (on an AT or newer; there is no /0WS on a PC/XT). And here's the point why I wrote nearly everything is asynchronous: the /0WS signal is the only signal that's specified to be sampled in a specific clock phase.

From what I'm seeing in ISA System Architecture, Third Ed., everything (but CHRDY?) happens at specific clock edges and is very much counted in integer bus clocks.

I'm looking at the Intel ISA Bus Specification and Application Notes, V2.01. Chapter 8.0 begins like this (Note that SRDY* is /0WS)

Intel ISA Bus Specification wrote:

ISA bus cycles are asynchronous in that bus activities are independent of the SYSCLK. Some signals enable and disable at anytime; others respond within minimum and maximum times of other signals being enabled or disabled. The only exception is the SRDY* signal which is synchronized with SYSCLK.

On the other hand, you are right that some times do refer to the ISA clock period, e.g. it says in chapter 8.1.3 about IOCHRDY-extended cycles:

Intel ISA Bus Specification wrote:

The amount that the command line is lengthened is a multiple of the bus clock, even though none of the functions are synchronized to it.

1540A.DOC wrote:

The I/O Channel Ready signal automatically slows the system further if required by the host memory.

That's fine, if the mainboard actually drives IOCHRDY. The ISA Bus Specification I'm reading shows the system memory as a separate resource on the ISA bus, which will add wait states just like any other ISA target, so in that case it will be fine. Newer boards don't have the platform memory located on the ISA bus, and generate timing for memory access independent of the ISA bus clock (but instead dependent on the host processor clock). While the specification might still mandate that IOCHRDY is properly driven even in this case, AFAIK it isn't on all ISA bridges.

Reply 10 of 13, by bakemono

User metadata
Rank Oldbie
rasz_pl wrote on 2022-05-15, 09:57:

The whole reason for this topic is my curiosity whether it was at all possible to release a 2D sprite accelerator in the late-80s 286 to very-early-90s slow 386 era, a PowerVR PCX1/2 of its time 😀. One alternate-history timeline could see Atari introducing the Lynx Suzy blitter (1989) on an ISA card. Another option is a basic DMA engine. Something like the 1984 Commodore REU can only perform ~1 MB/s block transfers, but is able to deliver Sonic on the C64: https://www.youtube.com/watch?v=XAc-em-Kugk
30 fps sprite games would be solidly within reach of such a contraption: 2 MB/s of VGA writes + 2 MB/s of reads from RAM. Maybe even 60 fps with its own buffer/cache on board.
A card with a dedicated 128 KB for a work framebuffer and some sprites, and only two block commands:
- copy to/from main RAM
- internal-only copy, conditional on source value (transparency), optionally conditional on destination value range (could define custom planes by assigning color palette ranges), with an optional trigger on destination value range (collision detection)

I'd say 2D graphics accelerators were around but probably considered overly expensive for playing games at the time. I believe e.g. the IBM 8514 could do bus mastering, and there were ISA bus clones of it (and perhaps ISA prototypes at IBM as well: http://www.os2museum.com/wp/isa-bus-8514a/).

new retro game on itch: https://90soft90.itch.io/glamorous-zombie-flakes

Reply 11 of 13, by maxtherabbit

User metadata
Rank l33t
rasz_pl wrote on 2022-05-15, 09:57:

>Outside of this address space 16 bit memory is always assumed.

And this would mean Adaptec obeys M16 only in the 0A0000-0BFFFF range?!?! 😀 See Re: Making my first ISA card, as pain-free as possible and Re: Why does EMS suck on my 286? Amazing 😮

Yep, that thread (where I had an EMS card with only 8-bit access window) is where I discovered that fact experimentally

Reply 12 of 13, by mkarcher

User metadata
Rank Oldbie
bakemono wrote on 2022-05-15, 16:04:

I'd say 2D graphics accelerators were around but probably considered overly expensive for playing games at the time. I believe e.g. the IBM 8514 could do bus mastering, and there were ISA bus clones of it (and perhaps ISA prototypes at IBM as well: http://www.os2museum.com/wp/isa-bus-8514a/).

The 8514/A (mind the "/A": the IBM 8514 is a monitor, the IBM 8514/A is a video card) doesn't do bus mastering. The 8514/A doesn't even have a memory-mapped frame buffer. If you want to paint a bitmapped picture to the 8514/A memory, you set up the accelerator using port I/O and then you can REP OUTSW the image data to that card. The 8514/A is from the same era as IDE, when REP INSW / REP OUTSW was considered adequately fast for bulk data transfer. The 8514/A wasn't meant for high-performance CPU-to-framebuffer transfer. It was meant as a CAD accelerator, mainly drawing lines or bitmapped symbols (like text characters) that were already in off-screen memory.

Reply 13 of 13, by bakemono

User metadata
Rank Oldbie
mkarcher wrote on 2022-05-15, 20:10:

The 8514/A (mind the "/A", the IBM 8514 is a monitor, the IBM 8514/A is a video card) doesn't do bus mastering.

Oh? Must have been the XGA and not 8514/A http://bitsavers.trailing-edge.com/pdf/ibm/pc … anual_May92.pdf

and ISA version of XGA? https://www.computer.org/publications/tech-ne … chips-ibms-xga/

In any case, if someone wanted to make the PC into more of a games machine back in the '80s, IMO the best thing to do would have been to duct tape on an NEC/Hudson HuC6270, overlaying the sprites and tilemap onto VGA/MCGA video. (NEC did this with two of the chips in the PC-FX, and they are able to handle different pixel clocks / scanrates)
