VOGONS


First post, by MarkDastedt

Rank Newbie

Hello, I'd like to share the source of an old DOS multicore demo.

https://pastebin.com/9sgW9dVu
and its assembler startup code
https://pastebin.com/3rG2eiWD

I found this on an old DVD that was part of some company assignments. It seems to be a demo of multicore use in plain DOS.
I'm not a programmer, so I don't know where the proper nerd place to post this is, but the DVD also contained a binary, and it at least does not crash and prints a heartbeat.
Enjoy
Mark Dastedt

Reply 1 of 15, by myne

Rank l33t

Very interesting.
I wonder if sbemu/other quasi-emulators might find this useful...?

I built:
Convert old ASUS ASC boardviews to KICAD PCB!
Re: A comprehensive guide to install and play MechWarrior 2 on new versions on Windows.
Dos+Windows 3.11+tcp+vbe_svga auto-install iso template
Script to backup Win9x\ME drivers from a working install
Re: The thing no one asked for: KICAD 440bx reference schematic

Reply 2 of 15, by MarkDastedt

Rank Newbie

Let's hope it will be useful for someone. Multicore use in DOS would be a great feature...

Reply 3 of 15, by BitWrangler

Rank l33t++

Quick, buy up all the slot 1, 370 and socket A dual boards before someone figures out how to do Voodoo emulation on 2nd CPU. 🤣

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 4 of 15, by myne

Rank l33t

I'm running an i3 with reasonable success in W98.
At 3 GHz I probably don't really need to offload the sound, but I wouldn't say no to a virtual Voodoo if it has the grunt.
That said, if PicoGUS could magically become firmware of sorts that talks IRQs, I/O etc. and to the HDA, it might drag the "too new" curve up a few years.

Reply 5 of 15, by dartfrog

Rank Newbie

I think RayeR talked about something similar somewhere. Or maybe I'm going crazy.

Edit: Comments say original by "bloodwych". They cannot be talking about Bloodwych for DOS, right?
Edit 2: It's Bloodwych from the EAB (English Amiga Board)

Interesting stuff nonetheless. A worker core is running with no interrupt handlers, no page tables, no memory protection, and no OS. That's about as close to bare metal as you can get, meanwhile the other core is still running DOS. Fascinating.

Potential PCIe-to-PCI-to-ISA pathway repository: https://github.com/DartFrogTek/PCIe-PCI-ISA
Using KMDF driver on Win10 PicoGUS PLAYS DOOM SAMPLES VIA PORT IO & DMA!

Reply 6 of 15, by wierd_w

Rank Oldbie

Using additional cores on anachronistic boxes (thin clients, laptops running DOS with VSBHDA/SBEMU, etc.) might be a good application here, though cache contention on multicore, as opposed to old-school SMP, might crop up.

Reply 7 of 15, by MarkDastedt

Rank Newbie

@dartfrog nice analysis! This reads quite cool. It reminds me of "everybody thought it was impossible, until somebody came along and simply did it."

Reply 8 of 15, by myne

Rank l33t

dartfrog wrote on 2026-05-28, 19:17:

Interesting stuff nonetheless. A worker core is running with no interrupt handlers, no page tables, no memory protection, and no OS. That's about as close to bare metal as you can get, meanwhile the other core is still running DOS. Fascinating.

So...
If we imagine it as a blind, deaf, dumb math machine not that different to punch cards...
How does it get fed? (Where do you load the punch cards?)
Can you assign it a block of memory to put a "main" loop (load all the punch cards)?
Where does it put its work? (Where's the printer?)

Is my thought above about using it as a coprocessor for emulationy things eg picogus plausible?

I wish people commented code better. Idiots like me have nfi what's going on.

I assume this is pointers directly to memory addresses to update...?
2x 8bit integers and a 32bit int.
state = (volatile uint8_t *)0x90100;
alive = (volatile uint8_t *)0x90101;
cpuHz = *((volatile uint32_t *)0x90104); // bogomips
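
To make the three lines above concrete: they look like fixed physical addresses in conventional memory that both cores can read and write, i.e. a tiny mailbox. That would also answer the "punch card" questions: code and input are placed in memory, and results are written back to memory. Below is a minimal sketch in C under that reading. The struct, the function names, the meaning given to each field, and the plain buffer standing in for physical memory are all illustrative assumptions, not taken from the pastebin source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets relative to 0x90000, matching the addresses quoted above:
   0x90100 (state), 0x90101 (alive), 0x90104 (cpuHz). */
#define MAILBOX_BASE  0x90000u
#define OFF_STATE     0x100u
#define OFF_ALIVE     0x101u
#define OFF_CPUHZ     0x104u

/* Hypothetical view of the mailbox. On real hardware `mem` would be the
   physical address MAILBOX_BASE; here it is any 4-byte-aligned buffer so
   the sketch can run under a normal OS. */
typedef struct {
    volatile uint8_t  *state;   /* written by one side, polled by the other */
    volatile uint8_t  *alive;   /* heartbeat counter (assumed meaning) */
    volatile uint32_t *cpu_hz;  /* 32-bit value at a 4-byte-aligned offset */
} mailbox_t;

static mailbox_t mailbox_map(uint8_t *mem)
{
    mailbox_t mb;
    assert(mem != NULL);
    mb.state  = (volatile uint8_t *)(mem + OFF_STATE);
    mb.alive  = (volatile uint8_t *)(mem + OFF_ALIVE);
    mb.cpu_hz = (volatile uint32_t *)(mem + OFF_CPUHZ);
    return mb;
}

/* What a worker core might do on each pass of its loop. */
static void worker_heartbeat(mailbox_t *mb)
{
    (*mb->alive)++;
}
```

The DOS side would map the same addresses and simply watch `alive` change. Note that `volatile` only stops the compiler from caching the value; it is not a general substitute for proper synchronization between cores.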

Reply 9 of 15, by MarkDastedt

Rank Newbie

Meanwhile I sent it to a former fellow student and phoned him. He explained the two-core thingy like a shared apartment: if there are only two people, then you'll always know who is and was in charge of the housekeeping. You only need plans if there are more dudes (or bras) in there. But of course you have separate rooms, and you send text messages to each other.
Only if you have food to share do you have to bring it over. But remember, your colleague has no teeth, so he can't eat many of the tasty things you like! I love such analogies.

Reply 10 of 15, by RayeR

Rank Oldbie

I already noted this on various DOS forums: Michael Chourdakis was one of the pioneers of multi-core on bare metal/DOS. His articles vanished over time, but thankfully we have the archive:
https://web.archive.org/web/20221007133135/ht … ore-Programming
He also did interesting work on HW virtualization.
The posted source is from somebody else...
So we have had some multi-core demos available for maybe ~10 years, but no real application for it yet...

Gigabyte GA-P67-DS3-B3, Core i7-2600K @4,5GHz, 8GB DDR3, 128GB SSD, GTX970(GF7900GT), SB Audigy + YMF724F + DreamBlaster combo + LPC2ISA

Reply 11 of 15, by BitWrangler

Rank l33t++

RayeR wrote on Today, 00:23:

but no real application for it yet...

But imagine the multitasking abilities of Gem if you could run 3 apps per core on a dual core, or 2 apps per core on a triple, or 1 app per core on a hexacore.... because there's only about 6 apps worth running 🤣

Reply 12 of 15, by RayeR

Rank Oldbie

Sure, but just to mention, it would require more hard work to bring about some real applications. As the demo shows, it's not so hard to start other cores and run some very simple code on them (code that doesn't use interrupts, BIOS services, etc.), but it's a far step to turn it into a universal DOS multitasker. I don't know how much the single task DOS would have to be modified to support this. So I think it's much easier for a single app to create some computing threads on other cores, when it makes sense and where the app architecture allows it. I can imagine that e.g. some encoders or packers could easily offload some routines and data blocks to other cores. A function that just copies an offscreen buffer to VRAM could also be offloaded easily this way. But multicore systems running DOS usually have enough power even on a single core, so there was no special need to do this. Simply put, the output wouldn't be worth the programming effort, otherwise someone would have already programmed it...
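
The offload idea above, e.g. copying an offscreen buffer to VRAM on another core, could be sketched as a one-slot job mailbox that the worker polls. This is only a sketch under assumptions: `job_t`, `job_submit` and `worker_poll_once` are hypothetical names, and the worker is called as an ordinary function so the example runs anywhere. On a real second core it would be the body of an endless polling loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { JOB_IDLE = 0, JOB_READY = 1, JOB_DONE = 2 };

/* One-slot job descriptor shared between the DOS core and the worker. */
typedef struct {
    atomic_int  status;   /* JOB_IDLE -> JOB_READY -> JOB_DONE */
    const void *src;
    void       *dst;
    size_t      len;
} job_t;

/* DOS side: fill in the job, then publish it with a release store so the
   worker sees src/dst/len before it sees JOB_READY. */
static void job_submit(job_t *job, void *dst, const void *src, size_t len)
{
    assert(atomic_load(&job->status) != JOB_READY);
    job->src = src;
    job->dst = dst;
    job->len = len;
    atomic_store_explicit(&job->status, JOB_READY, memory_order_release);
}

/* Worker side: one iteration of the polling loop. Returns 1 if a job ran. */
static int worker_poll_once(job_t *job)
{
    if (atomic_load_explicit(&job->status, memory_order_acquire) != JOB_READY)
        return 0;
    memcpy(job->dst, job->src, job->len);
    atomic_store_explicit(&job->status, JOB_DONE, memory_order_release);
    return 1;
}
```

The DOS side submits a job, carries on with its own work, and later checks for `JOB_DONE`. Since the worker has no interrupts or BIOS services, polling a memory location like this is about the only signalling it needs.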

Reply 13 of 15, by MarkDastedt

Rank Newbie

Thanks for clarifying, RayeR. But at least it looks promising. From what I can tell, it could be used to connect to GeForce Now or such stuff. This would be cool!

Reply 14 of 15, by dartfrog

Rank Newbie

RayeR wrote on Today, 13:40:

I don't know how much the single task DOS would have to be modified to support this.

I actually looked into this around when Microsoft released the MS DOS 4.0 source and the Ozzie binaries, and the answer depends heavily on what someone means by multicore DOS. I see two main projects, each a ton of work in its own right:

  • A native SMP DOS like OS using the DOS 4.0 source as a base/reference.
  • A modern SMP/VM host that runs legacy DOS programs inside VM guests, with a DOS personality layer handling DOS services.

These sound similar from the outside, but internally they are very different. I've included both lists for anyone who wants a starting point or an idea of how much work is required. FWIW I probably missed a bunch of stuff, so it's likely not exhaustive.


Native SMP DOS like OS using DOS 4.0 code wrote:

This is the "make DOS itself multicore aware" route. It is closest in spirit to modifying DOS, but it is also the hardest to make compatible because DOS was built around global state, single tasking assumptions, and non reentrant services.

Core requirements

  • Build a new boot path: UEFI, Multiboot2, or a custom loader.
  • Bring the BSP into protected mode or long mode with early paging.
  • Parse ACPI tables, especially RSDP/XSDT/MADT, to enumerate Local APICs and IO APICs.
  • Wake APs using INIT-SIPI-SIPI.
  • Set up per CPU GDT, IDT, TSS/IST or equivalent stacks, and per CPU data.
  • Implement spinlocks, atomics, memory barriers, and IPI plumbing.
  • Implement a scheduler and context switching.
  • Add APIC timer support for scheduling ticks (periodic, one-shot, or TSC deadline mode).
  • Add a monotonic time source such as HPET or a calibrated invariant TSC.
  • Add a real memory manager: physical page allocator, virtual memory manager, kernel heap, and low memory allocator.
  • Decide what DOS process means in the new kernel.
  • Preserve visible DOS structures where compatibility requires it: PSP, MCB, SFT, CDS, DPB, DTA, file handles, environment blocks, etc.
  • Rework the DOS global state problem. The SDA is a major one; either replicate it per CPU/task or serialize entry into DOS services.
  • Start with a coarse INT 21h global lock, then gradually split it into finer locks around MCBs, SFTs, CDS, DPBs, device chains, and buffer cache.
  • Port or reimplement the FAT filesystem layer.
  • Reimplement the INT 21h service layer while preserving DOS ABI behavior, rather than literally porting old 16bit code to 64bit.
  • Provide compatibility policy for old programs: by default, assume old binaries are not SMP safe.
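
As a rough illustration of the "Wake APs using INIT-SIPI-SIPI" item, here is a sketch in C of how the Local APIC Interrupt Command Register values could be built in xAPIC mode. The helper names are made up for this example, and nothing here touches real hardware; it only computes the two 32-bit values that would be written to the ICR.

```c
#include <assert.h>
#include <stdint.h>

/* xAPIC register offsets from the Local APIC base (commonly 0xFEE00000). */
#define LAPIC_ICR_LOW   0x300u
#define LAPIC_ICR_HIGH  0x310u

#define ICR_DELIVERY_INIT     (5u << 8)   /* delivery mode 101b */
#define ICR_DELIVERY_STARTUP  (6u << 8)   /* delivery mode 110b */
#define ICR_LEVEL_ASSERT      (1u << 14)

/* ICR high dword: destination APIC ID in bits 31:24 (xAPIC mode). */
static uint32_t icr_high_for(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* ICR low dword for the INIT IPI. */
static uint32_t icr_low_init(void)
{
    return ICR_DELIVERY_INIT | ICR_LEVEL_ASSERT;
}

/* ICR low dword for a Startup IPI. The AP begins executing in real mode at
   physical address `entry`, which must be 4 KiB aligned and below 1 MiB;
   the vector field carries entry >> 12. */
static uint32_t icr_low_sipi(uint32_t entry)
{
    assert((entry & 0xFFFu) == 0 && entry < 0x100000u);
    return ICR_DELIVERY_STARTUP | ICR_LEVEL_ASSERT | (entry >> 12);
}
```

For a trampoline at 0x8000 this gives 0x4500 for INIT and 0x4608 for each SIPI. On hardware the high dword is written first, because writing the low dword is what sends the IPI. The commonly cited sequence is INIT, a delay of about 10 ms, then two SIPIs roughly 200 µs apart.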

Important caveat

If the kernel is in 64bit long mode, normal 16bit DOS programs cannot just run directly as ordinary code. Long mode does not support virtual 8086 mode. So a native long mode DOS like kernel would still need one of these:

  • VMX/SVM guests for real mode programs.
  • A separate compatibility mode execution environment.
  • A software emulator or dynamic translator.
  • Or a non long mode 32bit protected mode kernel if you want v86 style execution.

Even the native SMP DOS route runs into the old program execution problem very quickly.

Desirables

  • Modern build system using NASM/Clang/GCC instead of MASM 5.10 and old Microsoft C.
  • Retire or replace BIOS/, BOOT/, MEMM/, and MAPPER/ runtime paths after mining them for compatibility behavior.
  • Strip 8086/286 baggage from the host kernel while preserving 8086/286 visible behavior for DOS programs.
  • PCI/PCIe enumeration, ideally including MCFG.
  • AHCI and NVMe storage drivers.
  • xHCI USB driver and USB HID keyboard/mouse.
  • Framebuffer output through UEFI GOP.
  • Audio backend such as Intel HDA.
  • Network backend such as e1000, rtl8139, or virtio net.
  • XMS, EMS, DPMI, and UMB/HMA compatibility layers if real DOS software compatibility matters.
  • A metadata flag or wrapper format marking new binaries as SMP safe.
  • A CLI tool to set/clear that SMP safe flag.
  • Documentation for what an SMP safe DOS program is allowed to do.

Verification for this route

  • Boot to a command shell.
  • Run basic internal and external commands.
  • Run FORMAT and CHKDSK against a test FAT volume.
  • Run two CPU bound DOS like tasks on separate cores and prove actual parallel execution.
  • Stress the filesystem with concurrent opens, reads, writes, deletes, and directory operations.
  • Stress INT 21h reentrancy and locking.
  • Test PSP/MCB/SFT/CDS/DPB visibility against programs that inspect DOS internals.
  • Test old timing sensitive programs.
Legacy DOS programs in VM guests with a DOS personality layer wrote:

This is the more modern and probably cleaner architecture. In this model, the host is not really DOS internally. It is a small SMP hypervisor/kernel that runs DOS programs in isolated real mode or v86 like guests. The DOS API is provided by a host side DOS personality layer. In other words, MS DOS 4.0 becomes reference material for behavior, structures, and filesystem/API semantics, not necessarily the literal kernel you keep running.

Core requirements

  • Build a modern boot harness: UEFI or Multiboot2.
  • Bring the BSP into long mode with identity mapped early paging.
  • Parse ACPI RSDP/XSDT/MADT for CPU and interrupt controller discovery.
  • Initialize Local APICs and IO APICs.
  • Wake APs with INIT-SIPI-SIPI.
  • Set up per CPU GDT, IDT, stacks, and per CPU data.
  • Implement SMP basics: locks, atomics, IPIs, scheduler, timers, and context switching.
  • Implement a physical and virtual memory manager.
  • Enable VMX or SVM on each CPU.
  • Use unrestricted guest mode, if available, for real mode DOS guests.
  • Create a per guest VMCS/VMCB.
  • Use EPT/NPT to give each guest a private low memory address space.
  • Build a DOS .COM/.EXE loader that creates a guest environment: PSP, environment block, command tail, initial registers, memory layout, etc.
  • Virtualize key PC hardware: PIC, PIT, RTC/CMOS, keyboard controller, VGA text/graphics policy, and optionally DMA.
  • Intercept HLT exits and treat them as scheduler yields.
  • Intercept port I/O exits and route them to emulation, trap queues, or real hardware policy.
  • Intercept software interrupts such as INT 20h, INT 21h, INT 25h, INT 26h, INT 2Fh, and possibly BIOS interrupts like INT 10h, INT 13h, INT 16h, and INT 1Ah.
  • Route DOS service calls to the DOS personality layer.
  • Preserve DOS visible structures: PSP, SFT, MCB, CDS, DPB, DTA, file handles, environment blocks.
  • Implement FAT/filesystem behavior in the host personality layer.
  • Make guests independently schedulable across cores.
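
To give a feel for the ".COM/.EXE loader" item, here is a small C sketch that fills in a few well-known PSP fields before a .COM image is placed at offset 0x100. It is deliberately incomplete: `psp_init` is a hypothetical helper and only sets the INT 20h stub, the environment segment, and the command tail. A real loader has many more fields to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PSP_SIZE       0x100u  /* a .COM image is loaded right after it */
#define PSP_INT20      0x00u   /* bytes CD 20: INT 20h (terminate) */
#define PSP_ENV_SEG    0x2Cu   /* segment of the environment block */
#define PSP_TAIL_LEN   0x80u   /* command tail length byte */
#define PSP_TAIL_TEXT  0x81u   /* command tail text, CR-terminated */
#define PSP_TAIL_MAX   126u    /* leaves room for the trailing 0x0D */

/* Fill a 256-byte PSP inside guest memory. Returns 0 on success, or -1 if
   the command tail does not fit. */
static int psp_init(uint8_t *psp, uint16_t env_seg, const char *tail)
{
    size_t n;
    assert(psp != NULL && tail != NULL);
    n = strlen(tail);
    if (n > PSP_TAIL_MAX)
        return -1;

    memset(psp, 0, PSP_SIZE);
    psp[PSP_INT20]       = 0xCD;
    psp[PSP_INT20 + 1]   = 0x20;
    psp[PSP_ENV_SEG]     = (uint8_t)(env_seg & 0xFF);  /* little-endian */
    psp[PSP_ENV_SEG + 1] = (uint8_t)(env_seg >> 8);
    psp[PSP_TAIL_LEN]    = (uint8_t)n;   /* length excludes the CR */
    memcpy(&psp[PSP_TAIL_TEXT], tail, n);
    psp[PSP_TAIL_TEXT + n] = 0x0D;
    return 0;
}
```

For a .COM guest the loader would then copy the program image to offset 0x100 of the same segment, point CS, DS, ES and SS at the PSP segment, and start execution at IP = 0x100.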

Desirables

  • Per guest 1 MB conventional memory model, plus optional EMS/XMS/DPMI services.
  • Virtual disks backed by host files.
  • Virtual packet driver feeding a modern NIC.
  • Sound Blaster DSP emulation at 0x220-0x22F.
  • Adlib/OPL emulation at 0x388-0x389.
  • Nuked OPL3 or equivalent FM synthesis.
  • PCM mixer feeding HDA, USB audio, or another modern audio backend.
  • MT-32 emulation as an optional service.
  • VGA/SVGA emulation or a software rasterizer.
  • Pinned core emulation threads for audio, video, networking, or other timing sensitive services.
  • Lock free SPSC rings for low latency event delivery between guest exits and emulation threads.
  • Guest affinity policy: default old programs to one core, allow safe guests to migrate.
  • Wrapper metadata or config files describing guest hardware expectations.
  • Debug tools for VM exits, INT calls, port I/O, filesystem calls, and timing behavior.
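
The "lock free SPSC rings" item can be sketched with C11 atomics as below. This is a generic single-producer/single-consumer ring, not code from any existing project; `io_event_t` and the function names are assumptions made for the example, with the producer imagined as a VM exit handler and the consumer as a pinned emulation thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u   /* must be a power of two */

/* Hypothetical event: e.g. a trapped port write from a guest exit. */
typedef struct {
    uint16_t port;
    uint8_t  value;
} io_event_t;

typedef struct {
    io_event_t  slots[RING_SIZE];
    atomic_uint head;   /* next slot to read, written only by the consumer */
    atomic_uint tail;   /* next slot to write, written only by the producer */
} spsc_ring_t;

static void ring_init(spsc_ring_t *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side. Returns 0 if the ring is full, 1 on success. */
static int ring_push(spsc_ring_t *r, io_event_t ev)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_SIZE)
        return 0;
    r->slots[tail & (RING_SIZE - 1)] = ev;
    /* Release: the slot contents become visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 1;
}

/* Consumer side. Returns 0 if the ring is empty, 1 on success. */
static int ring_pop(spsc_ring_t *r, io_event_t *out)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    assert(out != NULL);
    if (head == tail)
        return 0;
    *out = r->slots[head & (RING_SIZE - 1)];
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 1;
}
```

Because each index has exactly one writer, no lock or compare-and-swap is needed. That property is what makes this shape attractive for passing Sound Blaster or OPL port writes to an audio thread with low latency.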

Verification for this route

  • Boot or launch COMMAND.COM inside a guest.
  • Run .COM and .EXE programs with correct PSP and command tail behavior.
  • Run FORMAT and CHKDSK against a virtual FAT disk.
  • Run two CPU bound DOS guests on different host cores and prove real parallel execution.
  • Run multiple guests doing file I/O at the same time and test SFT/filesystem locking.
  • Run a Sound Blaster game and confirm audio exits/emulation/mixing work.
  • Run Adlib/OPL software and confirm FM output.
  • Run timer sensitive software to validate PIT/PIC/RTC behavior.
  • Stress eight or more concurrent guests for fairness, latency, and lock contention.

So it is quite a bit of work, but the amount of work depends on which target you mean. Both are substantial projects in their own right and both are basically building an OS from the ground up.

If you mean make MS DOS 4.0 itself into a real SMP OS, then the hard part is making ancient single tasking DOS internals safe in a concurrent kernel. The DOS global state, INT 21h reentrancy, MCB/SFT/CDS/DPB locking, and memory model are the big problems.
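
The simplest answer to the reentrancy problem is the coarse global lock mentioned in the list: only one CPU at a time may be inside DOS. A minimal C11 sketch follows, assuming a plain spinlock. All names are hypothetical, and `in_dos` merely imitates the role of DOS's InDOS flag to show what the lock protects.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* One global lock guarding all DOS kernel state (the coarse first step). */
static atomic_flag dos_lock = ATOMIC_FLAG_INIT;

/* Stand-ins for non-reentrant global state; only touched under the lock. */
static uint8_t  in_dos;
static uint32_t dos_calls_served;

static void dos_enter(void)
{
    /* Spin until the flag was previously clear. A real kernel would
       pause or yield inside this loop. */
    while (atomic_flag_test_and_set_explicit(&dos_lock, memory_order_acquire))
        ;
    assert(in_dos == 0);   /* nobody else may be inside DOS */
    in_dos = 1;
}

static void dos_leave(void)
{
    in_dos = 0;
    atomic_flag_clear_explicit(&dos_lock, memory_order_release);
}

/* Hypothetical INT 21h entry: every call from any CPU is serialized.
   Returns a serial number so the ordering of calls is observable. */
static uint32_t int21_serialized(uint8_t ah)
{
    uint32_t serial;
    (void)ah;              /* the function number is ignored in this sketch */
    dos_enter();
    serial = ++dos_calls_served;
    dos_leave();
    return serial;
}
```

This keeps DOS correct but gives no parallelism inside DOS services, which is why the list suggests later splitting the lock per structure (MCBs, SFTs, CDS, DPBs and so on).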

If you mean run legacy DOS programs across multiple cores, then a VM guest approach is cleaner imo. Build a modern SMP host/hypervisor, run each DOS program in a guest, and route DOS calls and hardware accesses into a DOS personality/emulation layer. At that point you are not really just adding multicore to DOS. You are building a new SMP DOS compatible system, using the DOS 4.0 source as reference material for behavior, ABI, filesystem semantics, and compatibility.
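
Routing DOS calls into a personality layer might look roughly like the C sketch below: the host traps INT 21h, looks at AH in the saved guest registers, and emulates the service. The register struct and function name are invented for the example, only two functions are handled, and the fallback for unknown functions (carry set, AX = 1) is one assumed policy among several.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of guest registers captured at the VM exit. */
typedef struct {
    uint16_t ax, bx, cx, dx;
    uint8_t  carry;       /* guest CF to report on return */
    uint8_t  terminated;  /* set when the guest asked to exit */
    uint8_t  exit_code;
} guest_regs_t;

/* Handle one trapped INT 21h. Sketched functions:
   AH=30h (get DOS version) and AH=4Ch (terminate with return code). */
static void dos_personality_int21(guest_regs_t *r)
{
    uint8_t ah, al;
    assert(r != NULL);
    ah = (uint8_t)(r->ax >> 8);
    al = (uint8_t)(r->ax & 0xFF);
    r->carry = 0;

    switch (ah) {
    case 0x30:            /* returns AL = major version, AH = minor version */
        r->ax = 0x0004;   /* report DOS 4.00 */
        break;
    case 0x4C:            /* AL = return code */
        r->terminated = 1;
        r->exit_code = al;
        break;
    default:              /* assumed policy: CF=1, AX=1 (invalid function) */
        r->carry = 1;
        r->ax = 0x0001;
        break;
    }
}
```

After the handler returns, the host would write the registers back into the guest state and resume it, or tear the guest down if `terminated` is set.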

Reply 15 of 15, by BitWrangler

Rank l33t++

RayeR wrote on Today, 13:40:

But multicore systems running DOS usually have enough power even on a single core, so there was no special need to do this. Simply put, the output wouldn't be worth the programming effort, otherwise someone would have already programmed it...

That's really the nub of it: by the time you get a couple hundred MHz in your box, DOS task switching is fine. If you wanted to use your dual PPro you were on NT, and by the time consumers were buying dual cores, DOS was forgotten.

It might have come about if the MHz race had stalled out for technical reasons in late 1998, maybe if Coppermine wasn't tried, or failed, but DOS as a gaming platform was about over by then. So if the pause button had been hit when the only way to get near a GHz was a dual PII or Celeron, then it might have happened.

It also could have been a path if Win98 and DirectX gaming had really dropped the ball badly (didn't work, crashed every time, that sort of thing) and games had gone BACK to DOS around that period, leaving it important to develop for DOS on DOS. Then programmers might have opened up one core as a compile core on the dual boards, and things would have developed a bit from there.

But all in all it was probably a chicken and egg situation: customers weren't interested in dual core because no killer app took real advantage, so no duals were sold, and because no duals were sold, no dual core stuff was written. Look how long gaming took to get away from single core performance on Windows, when everyone had been getting dual cores for a number of years.

edit: that's a relative "no duals were sold" not a literal, meaning not enough were in installed user base of mass market consumers.
