VOGONS


EXMS86 (XMS for your 8086)


Reply 60 of 74, by wierd_w

Rank: Oldbie

1) only needs to be considered when an XMS program requests a locked address.

2) is already an issue when an XMS function is called at the same time as an EMS one. Logically, this XMS emulator allocates these EMS pages, so as far as the EMS driver is concerned it already owns them, and it is already paging them in and out as needed to satisfy these requests. The only magic here would be clever misuse of virtual addresses, such that any 'above 1MB location with requested lock' is allocated from these 'clever' locations that 'just happen' to become the address of the page frame when the upper bits are omitted. There's 4GB of logical address space, so being 'wasteful' to pull off this trick is maybe OK.
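To make the "XMS backed by EMS pages" idea concrete, here is a minimal sketch of the bookkeeping such an emulator would need. It is not EXMS86's actual code; the function names are hypothetical, and the only hard fact used is that EMS logical pages are 16 KiB.

```python
EMS_PAGE_SIZE = 16 * 1024  # EMS logical pages are 16 KiB each


def xms_offset_to_ems(offset):
    """Split a linear offset inside an emulated XMS block into
    (EMS logical page index, byte offset within that page).
    Hypothetical helper, for illustration only."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, EMS_PAGE_SIZE)


def pages_touched(offset, length):
    """Return the EMS logical page indices that a copy of `length`
    bytes starting at `offset` would have to map into the page frame."""
    if length <= 0:
        return []
    first = offset // EMS_PAGE_SIZE
    last = (offset + length - 1) // EMS_PAGE_SIZE
    return list(range(first, last + 1))


if __name__ == "__main__":
    print(xms_offset_to_ems(40000))
    print(pages_touched(16000, 1000))
```

A copy that spans several pages has to be done page by page through the page frame, which is why a locked, fixed physical address is the awkward case.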

Reply 61 of 74, by digger

Rank: Oldbie
mateusz.viste wrote on 2025-07-30, 18:14:

It would be nice to know exactly which applications need a kludge in the first place, i.e. applications that are 8086-compatible AND XMS-capable AND EMS-unaware AND use DMA-over-XMS.
The "Legend of Kyrandia 2" has been mentioned earlier, but it's a 200M download so I somehow doubt it's 8086-compatible. I've checked the first part though, and it works well with EXMS86 (sound included).

As mentioned, Dune II is also 8086-compatible, XMS-capable, EMS-unaware and highly likely also uses DMA-over-XMS, since the installer mentions that digital sound effects will only be available if (enough) XMS memory is present in the system. Dune II is also a game by Westwood Studios, like Legend of Kyrandia 2, but it's a lot smaller than 200M. If I'm not mistaken, the installer is only about 5MB or so.

Dune II uses Miles ADV drivers (but can be swapped with DIGPAK drivers). That might help you debug exactly how digital audio playback is performed in the game.

Oh wait, I just realized that the game likely does not use DMA-over-XMS, if you can swap the digital audio driver with a driver that plays back the digital audio on a non-DMA sound device, which I described successfully doing in that linked thread. 🤔

Reply 62 of 74, by mateusz.viste

Rank: Member
digger wrote on 2025-07-31, 08:31:

As mentioned, Dune II is also 8086-compatible, XMS-capable, EMS-unaware and highly likely also uses DMA-over-XMS, since the installer mentions that digital sound effects will only be available if (enough) XMS memory is present in the system. Dune II is also a game by Westwood Studios, like Legend of Kyrandia 2, but it's a lot smaller than 200M. If I'm not mistaken, the installer is only about 5MB or so.

I tried Dune 2. Without EXMS86 it plays music and sfx, but no speech. With EXMS86 it plays music, sfx and speech (music is disabled during the intro, but I guess this is normal, it's either music or speech there - later in the game the music works fine). I tested this on 86Box with an emulated 8088, a LoTech card and a Sound Blaster Pro card. I had to set the SB card to IRQ 7; for some reason speech wasn't working when the card was set to IRQ 5 (surely unrelated to EXMS86).

The game is totally playable on an 8088, very cool.

digger wrote on 2025-07-31, 08:31:

Oh wait, I just realized that the game likely does not use DMA-over-XMS, if you can swap the digital audio driver with a driver that plays back the digital audio on a non-DMA sound device

Either that, or the game has a graceful fallback to a non-DMA mode when the XMS driver refuses to expose the physical addresses of XMS regions.

http://mateusz.fr

Reply 63 of 74, by DosFreak

Rank: l33t++
mateusz.viste wrote on 2025-07-17, 12:22:
zb10948 wrote on 2025-07-16, 21:51:

mem shows no XMS, Wolf3D shows no XMS?

Checked these two.

It appears that MS MEM doesn't even try to detect XMS on machines that are not at least a 286:
LINK REMOVED TO POSSIBLY LEAKED MICROSOFT CODE

As for Wolf3D, the starting screen doesn't actually show the amount of available memory, but rather the amount of memory that it was able to allocate. First, it looks for EMS and allocates as much as possible (ie. everything). Then, it looks for XMS and asks the XMM driver about how much memory is available - and my driver answers "0 bytes", because there is no longer any EMS memory left that I could use to back my XMS emulation.

If you'd like to play WOLF3D with EXMS86, then you need to instruct the game not to reserve all the EMS memory. It's actually easy: just run "WOLF3D NOEMS".

Removed link to Microsoft source code. If the op is using this code to work on their projects then likely this thread will need to be closed.

How To Ask Questions The Smart Way
Make your games work offline

Reply 64 of 74, by wierd_w

Rank: Oldbie

Was it from the released DOS 4 sources?

It would be period correct, and kosher.

Reply 65 of 74, by GemCookie

Rank: Member

No, it was a repository that supposedly contained the MS-DOS 6.0 source code.

Gigabyte GA-8I915P Duo Pro | P4 530J | GF 6600 | 2GiB | 120G HDD | 2k/Vista/10
MSI MS-5169 | K6-2/350 | TNT2 M64 | 384MiB | 120G HDD | DR-/MS-DOS/NT/2k/XP/Ubuntu
Dell Precision M6400 | C2D T9600 | FX 2700M | 16GiB | 128G SSD | 2k/Vista/11/Arch/OBSD

Reply 66 of 74, by mateusz.viste

Rank: Member
GemCookie wrote on 2025-07-31, 21:04:

No, it was a repository that supposedly contained the MS-DOS 6.0 source code.

It's on github, easy to find.

GemCookie wrote on 2025-07-31, 21:04:

If the op is using this code to work on their projects then likely this thread will need to be closed.

You need to define "using".

http://mateusz.fr

Reply 67 of 74, by mkarcher

Rank: l33t
digger wrote on 2025-07-30, 07:48:

That's the cool thing about this forum. I keep learning new stuff from the DOS PC era. Thanks. 🙂

Nevertheless, this thread is about EXMS86, not the design(?) of the DMA system of IBM-compatible computers, so we shouldn't derail it too much with off-topic posts. If you are interested in a more detailed discussion, I suggest you open a new thread for further questions. Feel free to send me a PM with a link if you are afraid I might miss it otherwise.

digger wrote on 2025-07-30, 07:48:

Wasn't it still the case that ISA DMA was kind of a bottleneck, especially on anything faster than a 286, since it always had to access the system RAM at 5MHz?

While you have a point in that the transfer rate of ISA DMA is quite low, and the base frequency of the DMA controller in AT computers is half the ISA clock, i.e. around 4MHz, which is even less than the 4.77MHz in the PC/XT, calling it a "bottleneck" is exaggerating it.

digger wrote on 2025-07-30, 07:48:

It's kind of disappointing that it ended up being the primary I/O method for playing back digitized audio samples on popular sound cards back in the day, causing major headaches w.r.t. DOS sound compatibility once PCs moved beyond ISA slots.

For ISA sound cards, ISA DMA is the method of choice for data transfer. Having a central multi-channel DMA controller on the mainboard instead of some DMA technique on each sound card is also a smart choice, as it makes card design easier. The issue with non-ISA sound cards is that there was no standard for how PCI cards can be the target of the central (ISA) DMA controller, which was a design choice made when PCI was specified.

digger wrote on 2025-07-30, 07:48:

If I understand correctly, from the 386 and up, this became such a bottleneck that even driving a Covox-like dumb LPT DAC directly would take up less overhead than relying on ISA DMA for digital sound playback. Or does that deserve more nuance?

This paragraph mirrors the common sentiment about ISA DMA, which clearly is deficient, and in my opinion was barely adequate for the original PC and never kept up. Yet the claim is so exaggerated that one has to consider it factually false. So, let's put the rants aside and get back to the facts.

ISA DMA is inconvenient to use, as the DMA controller used in the original IBM PC was actually designed for 8080 or 8085 systems with a total address space of 64K. IBM added a "page register" that supplies four extra address bits per channel, but these page registers are not incremented when the address counter in the DMA controller wraps around, so a transfer always stays inside one 64K block. You can't do a transfer starting at physical address 60K (which is in the first 64K block) and ending at address 70K (which is in the second 64K block). The IBM mainboard BIOS rejects a floppy operation like this with error code 9 ("DMA segment overrun"). It is quite likely you never knew this, even if you were writing programs that directly access the floppy drive using INT 13h. This is because DOS installs a shim layer above the BIOS which transparently works around this limitation, so the disk operating system actually did care about making disk operations easier. As the second DMA controller in the AT is wired to address 64K words (128K bytes, but each transfer needs to start at an even address), you get barriers every 128K instead of every 64K, but the issue persists, as the AT still uses the 8237A meant for 8-bit systems. If the AT hard drive controller had used a 16-bit DMA channel instead of port I/O, all hard disk transfers would have required word alignment, which is something XT software didn't care about, and that is likely a contributing factor to why the AT hard drive controller did not use DMA.
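The boundary rule above is easy to check in software before programming the controller. The following is a minimal sketch, not taken from any real driver; the function names are made up for illustration, and it only encodes the 64K (8-bit channel) and 128K (16-bit channel) windows described in the paragraph.

```python
def real_mode_physical(segment, offset):
    """Physical address of a real-mode segment:offset pair on an 8086
    (20 address bits, so the result wraps at 1 MiB)."""
    return ((segment << 4) + offset) & 0xFFFFF


def crosses_dma_boundary(phys_addr, length, sixteen_bit=False):
    """Return True if the buffer [phys_addr, phys_addr + length) straddles
    a DMA page boundary: every 64 KiB for 8-bit channels, every 128 KiB
    for the 16-bit channels of the AT."""
    if length <= 0:
        raise ValueError("length must be positive")
    if sixteen_bit and phys_addr % 2:
        raise ValueError("16-bit DMA needs an even start address")
    window = 0x20000 if sixteen_bit else 0x10000
    return phys_addr // window != (phys_addr + length - 1) // window


if __name__ == "__main__":
    # The 60K..70K example from the text, on an 8-bit channel.
    print(crosses_dma_boundary(60 * 1024, 10 * 1024))
```

A caller that gets `True` would typically pick another buffer or split the transfer in two at the boundary, which is the kind of workaround the DOS shim layer mentioned above performs.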

Indeed, the clock speed of the 8237A is limited to 5 MHz, and the controller is operated at 3MHz in the original 6MHz AT, and at 4 to 4.2 MHz in later systems. The standard configuration of the ISA DMA controller requires 4 clock cycles per transfer (+ some overhead), so at 4MHz, a transfer rate of 1 mega transfer per second was theoretically obtainable, which is 2MB/s on a 16-bit channel. This is faster than the transfer rate of an MFM hard drive. Typical controllers first transferred a sector from the drive surface into controller RAM, and only after getting the signal from the hardware CRC comparator that the checksum is OK, the sector is transferred from the controller to the host. The hardware CRC comparator checks the CRC while the data is read from the drive, so there is no notable latency for the CRC check, and the transfer to the host can begin immediately after reading the sector and its CRC. In case the CRC mismatched, the correction was a slow process performed on the controller, but let's omit this fringe case for now. A desirable design uses interleave 2, which records the sectors in the order 1 - 10 - 2 - 11 - 3 - 12 - 4 - 13 - 5 - 14 - 6 - 15 - 7 - 16 - 8 - 17 - 9. This means that the data from sector 1 can be transferred from the controller to the host while sector 10 passes the drive head. Thus 50% of the time is used for reading data from the disk to the controller and the other 50% is used to transfer data from the controller to the host. At 3600rpm (60 rotations per second), we can read 17 sectors in two revolutions, so 30 * 17 = 510 sectors can be read in a second, yielding a net transfer rate of 261 KB/s. As only 50% of the time is spent on transferring data from the controller to the host, that transfer would require twice the speed, i.e. around 520KB/s. This is way below the 2MB/s theoretical (and likely 1.6MB/s practical) limit of ISA DMA, so contrary to what some people say, when IBM designed the AT, ISA DMA was not prohibitively slow for hard drive transfers.
Even if you used very modern MFM drives that have 50% higher transfer rates, you still end up at around 780KB/s required DMA rate. While faster hard drives did exist in 1984, they were not available in a form factor or at a price point that would have been a good fit for the IBM AT. So while the transfer rate is limited, disregard any claim that the rates are unusable for any practical purpose.
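For anyone who wants to re-run the arithmetic above, here is a small sketch. All figures come from the post (512-byte sectors are the one assumption added here, as is usual for MFM drives); the function names are hypothetical.

```python
SECTOR_BYTES = 512        # assumed sector size (standard for MFM drives)
SECTORS_PER_TRACK = 17
RPM = 3600
INTERLEAVE = 2            # one full track takes two revolutions


def disk_net_rate(rpm=RPM, spt=SECTORS_PER_TRACK, interleave=INTERLEAVE):
    """Net bytes per second delivered by the drive at the given interleave."""
    revs_per_second = rpm / 60
    tracks_per_second = revs_per_second / interleave
    return tracks_per_second * spt * SECTOR_BYTES


def required_host_rate(net_rate, host_time_fraction=0.5):
    """Burst rate needed when only part of the time is available
    for the controller-to-host transfer."""
    return net_rate / host_time_fraction


def dma_peak_rate(dma_clock_hz, cycles_per_transfer=4, bytes_per_transfer=2):
    """Theoretical ISA DMA rate in bytes per second, ignoring overhead."""
    return dma_clock_hz / cycles_per_transfer * bytes_per_transfer


if __name__ == "__main__":
    net = disk_net_rate()
    print("net disk rate :", net)
    print("host rate     :", required_host_rate(net))
    print("50% faster    :", required_host_rate(net * 1.5))
    print("DMA peak      :", dma_peak_rate(4_000_000))
```

The numbers match the post: about 261 KB/s net, about 520 KB/s required burst rate, about 780 KB/s for a drive that is 50% faster, against a 2 MB/s theoretical DMA peak.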

Furthermore, it is true that DMA transfers, even at their quite low rate, put a burden on the ISA bus. If you run a WSS-type card in the less efficient "single mode DMA" variant at 48kHz 16-bit stereo (around 200KB/s), expect ISA transfer rates to go down by 33%. This is because WSS-type cards use an 8-bit channel, and "single mode" is the least efficient of the three available modes. Basically, you have:

  • Block mode: The device requests DMA, the DMA controller obtains bus ownership and then issues as many DMA read or write cycles from/to the device as the DMA controller is set up for. It doesn't matter whether the device stops requesting DMA during the transfer; the transfer will continue until the programmed number of bytes/words has been transferred.
  • Single mode: The device requests DMA, the DMA controller obtains bus ownership, transfers a single byte or word and then releases bus ownership. If the device is still requesting DMA after bus ownership has been released, the DMA controller again enters bus arbitration to obtain the bus for the next byte or word.
  • Demand mode: This mode is a cross-over of the previous two modes: If the device requests DMA, the DMA controller obtains the bus and starts sending/requesting multiple bytes/words just like in block mode, but the device can interrupt the transfer any time it likes by ceasing to request DMA. In that case, the current transfer gets finished and the DMA controller releases the bus, even if it still has some bytes to transfer.

So, assuming the device wants to transfer a certain number of bytes (or words, but ignore that for now), the DMA controller is set up for exactly that size, and the device keeps its DMA request line active all the time, it will work flawlessly with all modes, even without knowing what mode the DMA controller is in! The DMA controller is quite primitive. It cannot interrupt a transfer if a higher-priority request comes in. As long as bus ownership for one DMA channel is established, the DMA controller will keep the bus assigned to that channel. This effectively means block transfers cannot be interrupted, even for memory refresh! Also, demand mode transfers can only be interrupted if the device temporarily releases the DMA request line. Single mode transfers can be interrupted at any time. As we know that we can get around 1 transfer per microsecond, the information that you may not "own" the ISA bus for longer than 15µs immediately shows that transfers exceeding 15 bytes or words may not be performed in block mode, and in demand mode, the device needs to "cooperate" by periodically releasing the bus. In single mode, re-arbitration happens after every byte/word transferred, so there is no issue with RAM refresh in single mode.
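The 15-transfer limit follows directly from the two figures in the paragraph above (about 1 µs per transfer, at most 15 µs of uninterrupted bus ownership). A tiny sketch of that reasoning, with hypothetical function names:

```python
def max_uninterrupted_transfers(bus_hold_limit_us=15.0, transfer_time_us=1.0):
    """How many DMA transfers fit into one period of bus ownership
    before memory refresh must get its turn."""
    return int(bus_hold_limit_us // transfer_time_us)


def block_mode_is_safe(transfer_count, bus_hold_limit_us=15.0,
                       transfer_time_us=1.0):
    """Block mode holds the bus for the whole programmed count, so the
    count must fit inside the refresh window."""
    limit = max_uninterrupted_transfers(bus_hold_limit_us, transfer_time_us)
    return transfer_count <= limit


if __name__ == "__main__":
    print(max_uninterrupted_transfers())
    print(block_mode_is_safe(512))  # a whole disk sector in one block transfer
```

Both default values are the approximate figures from the post, not measured numbers, so treat the result as an order-of-magnitude bound.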

WSS-type cards are meant to be used in demand mode. The AD1848 (or later compatible chips) and the SB16 have a FIFO on the card: they transfer a couple of bytes or words in demand mode until the FIFO is full, and then release the bus until the FIFO is nearly empty. This is more efficient than arbitrating for the ISA bus for every byte/word. A sound chip that is meant to work with demand mode will not notice if the DMA controller is in single mode instead - except for the worse performance delivered in single mode.
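To illustrate why demand mode is cheaper than single mode, the sketch below simply counts how often the DMA controller has to arbitrate for the bus. It is a toy model: the burst size of 8 is an assumption chosen for illustration, not the FIFO depth of any particular chip, and the function name is made up.

```python
import math


def bus_arbitrations(total_transfers, mode, burst=1):
    """Number of times the DMA controller must win the bus to move
    `total_transfers` bytes/words in the given mode.
    `burst` is the (assumed) number of transfers the device requests
    per FIFO refill in demand mode."""
    if total_transfers < 0 or burst < 1:
        raise ValueError("invalid transfer count or burst size")
    if mode == "single":
        return total_transfers              # re-arbitrate after every transfer
    if mode == "demand":
        return math.ceil(total_transfers / burst)
    if mode == "block":
        return 1 if total_transfers else 0  # one grant for the whole block
    raise ValueError("unknown mode: " + mode)


if __name__ == "__main__":
    # 48 kHz, 16-bit, stereo on an 8-bit channel: 48000 * 2 * 2 bytes/s
    bytes_per_second = 48000 * 2 * 2
    print(bus_arbitrations(bytes_per_second, "single"))
    print(bus_arbitrations(bytes_per_second, "demand", burst=8))
```

With the assumed burst of 8, one second of 48 kHz 16-bit stereo audio needs 192000 arbitrations in single mode but only 24000 in demand mode.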

So, ISA DMA does put a clearly relevant burden on the ISA bus, which will reduce the bandwidth available for writes to an ISA graphics card (which is not good for games). On the other hand, the comparison with simple LPT DACs is nonsense. Using a device like that in the background means that for every sample the processor gets interrupted, needs to save the current execution address and processor state, look up the timer interrupt handler, jump there, push some registers to the stack, output a sample to the DAC, pop the manually stored registers and then return to the interrupted task. This is clearly more overhead than even DMA in single byte mode for 16-bit stereo sound. Things might turn against DMA if the parallel port sound device has a FIFO, so you don't have to feed every single byte individually. The Disney Sound Source works this way, but as the parallel port usually uses 8-bit I/O at default wait states (which are usually set quite high to be PC compatible), I don't think you can beat DMA.

Now, if EMM386 is loaded, things look even worse for non-DMA sound playback, because EMM386 is basically a virtual machine monitor (or hypervisor) that runs the DOS tasks. As the monitor program, EMM386 receives all interrupts and handles them in protected mode. When a DOS program is interrupted, the processor switches from virtual-8086 mode (a sub-mode of protected mode) to standard protected mode (which takes ~100 clock cycles) to have EMM386 handle the interrupt. EMM386 is then supposed to "forward" the interrupt to the virtualized DOS task, which means it has to switch back to virtual-8086 mode. I assume it can set up the processor in a way that returning from the interrupt (for example the timer interrupt used to send samples to a parallel-port DAC) does not require another round trip through standard protected mode. Every kind of DMA is more efficient than having a high-frequency timer interrupt with EMM386 loaded.

In the end, (non-busmaster) ISA DMA was only used in cases where high throughput wasn't required, but background transfers were important. Sound cards fit this pattern perfectly. The programming model of ISA DMA is awful, and if EMM386 virtualizes RAM, it also needs to virtualize DMA, which is a very cumbersome operation. So nobody in the software industry liked ISA DMA. This explains why system designers were happy to get rid of the obsolete ISA DMA scheme when they designed PCI systems. They did not expect the use case of "ISA-compatible sound cards" to be that important. Only later were some standards (e.g. PC/PCI) designed to give PCI sound cards access to the ISA DMA/IRQ system, and that's why PCI cards generally can't offer a nice Plug&Play experience that includes Sound Blaster compatibility.

Reply 68 of 74, by digger

Rank: Oldbie

Thank you for the very extensive and informative answer, mkarcher. Your level of knowledge about this topic is quite impressive.

Reply 69 of 74, by digger

Rank: Oldbie
mateusz.viste wrote on 2025-07-31, 13:09:

I tried Dune 2. Without EXMS86 it plays music and sfx, but no speech. With EXMS86 it plays music, sfx and speech (music is disabled during the intro, but I guess this is normal, it's either music or speech there - later in the game the music works fine).

It's not supposed to be "either/or" in the intro. On 286 and higher systems with sufficient XMS memory, the intro should play with music, sound effects and speech. No idea why you experienced this limitation in the intro. What version of the game did you download? And how much emulated XMS was available to the game? Maybe this happened due to a lack of conventional memory?

I tested this on 86Box with an emulated 8088, a LoTech card and a Sound Blaster Pro card. I had to set the SB card to IRQ 7; for some reason speech wasn't working when the card was set to IRQ 5 (surely unrelated to EXMS86).

The game is totally playable on an 8088, very cool.

That is indeed amazing. 😄 It must run a bit on the sluggish side, though. Or is it not that bad? It might get a lot worse later in the game, with more units moving around.

Either that, or the game has a graceful fallback to a non-DMA mode when the XMS driver refuses to expose the physical addresses of XMS regions.

Maybe. But why go through the trouble of implementing such a workaround if the XMS drivers back in the day did expose those physical addresses in most cases? Or was this limitation quite common in XMS drivers?

Reply 70 of 74, by mateusz.viste

Rank: Member
digger wrote on Yesterday, 15:11:

It's not supposed to be "either/or" in the intro. On 286 and higher systems with sufficient XMS memory, the intro should play with music, sound effects and speech. No idea why you experienced this limitation in the intro. What version of the game did you download? And how much emulated XMS was available to the game? Maybe this happened due to a lack of conventional memory?

I retested on a stronger VM (486) and speech is playing at the same time as music indeed.
When I tested on the 8088 VM it had 2MB of XMS (from an emulated LoTech 2 MB EMS board), but I think the problem is either the sound card (on the 486 I tested with an SB16, while the 8088 could only have an SB Pro) or the amount of conventional RAM. I had only about 570K of conventional RAM available, while the game's setup program was telling me that I need 602K... Unfortunately I wasn't able to get more on this 86Box setup without HMA.

But why go through the trouble of implementing such a workaround if the XMS drivers back in the day did expose those physical addresses in most cases? Or was this limitation quite common in XMS drivers?

I checked now - Dune2 does not even try to lock XMS regions. It does not care about the XMS version either. The only functions it calls are "query available XMS", "allocate XMS", "move XMS" and "free XMS".

http://mateusz.fr

Reply 71 of 74, by mkarcher

Rank: l33t
mateusz.viste wrote on Yesterday, 21:23:

I checked now - Dune2 does not even try to lock XMS regions. It does not care about the XMS version either. The only functions it calls are "query available XMS", "allocate XMS", "move XMS" and "free XMS".

So it just uses the XMS as swap space / RAM drive to be able to load the speech samples quickly to conventional RAM when they are supposed to be played. Programs like that are the perfect use case for EXMS86.
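For reference, the "move XMS" call used for this kind of swapping takes a small parameter block. The sketch below packs that block as laid out in the XMS specification (function 0Bh, "Move Extended Memory Block"); it only builds the bytes and does not call any driver, and the helper names are hypothetical.

```python
import struct

# XMS function numbers (passed in AH) for the four calls mentioned above.
XMS_QUERY_FREE = 0x08
XMS_ALLOCATE = 0x09
XMS_FREE = 0x0A
XMS_MOVE = 0x0B

# Move structure: Length (dword), SourceHandle (word), SourceOffset (dword),
# DestHandle (word), DestOffset (dword); little-endian, no padding.
MOVE_STRUCT = struct.Struct("<IHIHI")


def far_pointer(segment, offset):
    """With handle 0, the 32-bit offset field holds a real-mode
    segment:offset pointer (offset in the low word)."""
    return ((segment & 0xFFFF) << 16) | (offset & 0xFFFF)


def build_move_struct(length, src_handle, src_offset, dst_handle, dst_offset):
    """Pack the parameter block for XMS function 0Bh."""
    if length % 2:
        raise ValueError("XMS move length must be even")
    return MOVE_STRUCT.pack(length, src_handle, src_offset,
                            dst_handle, dst_offset)


if __name__ == "__main__":
    # Copy 4 KiB from XMS handle 5 (offset 0) to conventional memory 2000:0010.
    block = build_move_struct(4096, 5, 0, 0, far_pointer(0x2000, 0x0010))
    print(block.hex())
```

Handle 5 and the destination address are arbitrary example values. Since a program like this never asks for a physical address, an emulator is free to keep the data wherever it likes, for example in EMS pages.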

Reply 72 of 74, by Jo22

Rank: l33t++

But why go through the trouble of implementing such a workaround if the XMS drivers back in the day did expose those physical addresses in most cases? Or was this limitation quite common in XMS drivers?

Did they all expose it? I mean, less popular DOSes such as ROM-DOS, X-DOS or PTS/Paragon DOS had their own Himem.sys substitutes.
Then there are the synthetic DOS environments of Unixes or OSes such as L3.
If they had XMS support, they maybe had to keep it abstract in order to not clash with the limits of the, um, sandbox.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 73 of 74, by mkarcher

Rank: l33t
Jo22 wrote on Yesterday, 22:33:

But why go through the trouble of implementing such a workaround if the XMS drivers back in the day did expose those physical addresses in most cases? Or was this limitation quite common in XMS drivers?

Did they all expose it?

They had to. XMS was not only used for real-mode software like disk caches or RAM drives, but also as a hardware abstraction layer for Windows 3.0 in standard mode (or other protected mode software). Such software uses the XMS driver to allocate extended memory and lock all the blocks before entering protected mode. Then the locked physical addresses of the XMS blocks can be used to set up descriptors pointing into memory allocations managed by Windows 3.1.

This use case will break with EXMS86. Protected mode software requires "real XMS", but as long as you don't use EXMS86 on a 286 or newer, you can't run protected mode software anyway.

Reply 74 of 74, by Jo22

Rank: l33t++

They had to.

It makes sense in principle. Had this ever been checked in practice, though?
I vaguely remember that some himem.sys alternatives are Windows 3.x incompatible (also including public domain XMS managers).
The PTS/Paragon version is quite tiny and may lack certain features.
X-DOS had a compatibility switch in config.sys for making Windows 3.x run, I vaguely remember.
That being said, I often had to use Microsoft's or IBM's himem.sys to run Windows 3.1x.

Edit: I don't mean to argue, I just wonder if any, um, empirical evidence has been collected regarding various XMS managers.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//