CPU Accelerators - Why not put SRAM on them \ VOGONS

First post, by Phido

Posted on 2025-01-27, 03:49

Rank: Newbie
Posts: 76
Joined: 2018-06-21, 02:23

Ok, I have a question.

Machines like the Macintosh LC/LC II/Classic/Colour Classic/Classic II have 32-bit CPUs on 16-bit buses, and there are also PCs with slow, soldered-on memory. For example, I have an IBM PS/2 with 1 MB of RAM and an MCA bus, for which it is pretty much impossible to find useful memory boards at reasonable prices. With the Macintoshes, the LC and Colour Classic are limited to 10 MB of RAM anyway. 4 MB of 32-bit RAM would make them literally twice as fast.

Why not make a CPU accelerator with, say, 2 MB or 4 MB of SRAM on it? SRAM is pretty cheap, fast, and easy to integrate; you can just hook the address lines straight up to the chips. You could also run the CPU at, say, twice (or three times or more) the bus speed.

So on a 10 MHz 286 you could strap on a 20 MHz Harris 286, or a 386SX (or 486SLC) at, say, 20, 30 or 40 MHz, with 2 or 4 MB of RAM straight on the PCB of the accelerator. No cache costs or complexity. While 2 MB or 4 MB isn't much, for retro purposes it's probably enough. Those with ISA slots can add additional expanded memory on ISA or similar. Disk access is generally much quicker these days anyway.

On a Colour Classic with a 16 MHz 16-bit bus you could go to 32-bit at 33 MHz, increasing memory bandwidth by roughly a factor of four. Clock-doubling accelerators seem quite common, but they are hamstrung by the narrow 16-bit, 16 MHz bus.

The 68030 has an integrated MMU, but I assume you can just hook it straight up to the address lines anyway.
DMA access could be an issue.
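
As a rough check on the "factor of four" figure, here is a back-of-envelope sketch in Python. It only computes peak bus bandwidth as width times clock, and it assumes one transfer per clock, which real 68030 and 286 buses do not achieve (a bus cycle takes several clocks plus any wait states), so treat the ratio as an upper bound. The function name is made up for illustration.

```python
def peak_bandwidth_mb_s(bus_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth in MB/s, assuming one transfer per clock (optimistic)."""
    return bus_bits / 8 * clock_mhz


# Figures taken from the post above; everything else is an assumption.
stock = peak_bandwidth_mb_s(16, 16)   # Colour Classic: 16-bit bus at 16 MHz
fast = peak_bandwidth_mb_s(32, 33)    # proposed: 32-bit local SRAM at 33 MHz
ratio = fast / stock

print(f"stock {stock:.0f} MB/s, proposed {fast:.0f} MB/s, ratio {ratio:.3f}x")
```

The ratio only says how much faster memory could be fed to the CPU; how much of that shows up in real software depends on how memory-bound the code is.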

Reply 1 of 12, by rmay635703

Posted on 2025-01-27, 04:40

Rank: Oldbie
Posts: 1814
Joined: 2019-01-19, 19:32

Phido wrote on 2025-01-27, 03:49:
Ok, I have a question. […]

Buffalo Memco did just that on HKN x86 accelerators. Sadly, the RAM isn't on the south bridge, which puts it up in a weird address range requiring a custom means of access.

Because it's bespoke and doesn't "just work" with existing software, its usefulness is limited.

Reply 2 of 12, by Phido

Posted on 2025-01-27, 05:00

Rank: Newbie
Posts: 76
Joined: 2018-06-21, 02:23

rmay635703 wrote on 2025-01-27, 04:40:

Buffalo Memco did just that on HKN x86 accelerators. Sadly, the RAM isn't on the south bridge, which puts it up in a weird address range requiring a custom means of access.

Because it's bespoke and doesn't "just work" with existing software, its usefulness is limited.

Oh, I mean replacing the current CPU with a board carrying a CPU and RAM. Software wouldn't have to do anything. You pop out your old CPU and put in a new CPU on a card that plugs into the same socket.
The concept is quite popular on Macs and Amigas. Originally, back in the day, you could add processor cache to a system with no cache this way. But now that SRAM is cheap, why not just make main memory out of SRAM?

Like the 286 to 386 cards. This replaces your PGA68 80286 CPU with a 386SX on a card.
https://www.rehsdonline.com/post/386sx-upgrade

Not putting a CPU on a card on a bus, which I think is what you're referring to. Certainly with more modern systems it could be problematic. The memory would be mapped into normal addresses, probably replacing any soldered-in memory on the board, or those chips could be removed or disabled.

A CPU has cache, it is in the CPU and sits between the CPU and memory. I know on 386, on die cache was problematic, and on 486s WB cache was problematic because they really needed more signaling particularly when using weird IO devices etc. But we are talking about a 286. It really has nothing going on.

Reply 3 of 12, by Phido

Posted on 2025-01-27, 07:16

Rank: Newbie
Posts: 76
Joined: 2018-06-21, 02:23

rmay635703 wrote on 2025-01-27, 04:40:

Buffalo Memco did just that on HKN x86 accelerators. Sadly, the RAM isn't on the south bridge, which puts it up in a weird address range requiring a custom means of access.
Because it's bespoke and doesn't "just work" with existing software, its usefulness is limited.

Oh, those Buffalo HSP-16DR 286-to-486 25 MHz Cyrix CX486SLC PLCC CPU upgrade card things..
They don't have RAM, just cache.
https://www.os2museum.com/wp/386-cache-coherency/

It's the same problem if you have local SRAM as main memory connected to your address lines: stuff can get into RAM from bus masters without the CPU knowing about it, or in this case, the stuff would never get there at all.

So the options are:
*Map additional RAM higher than the existing RAM after boot.
*Have the first 64 KB of each 1 MB run on local RAM, but then you have to sync everything anyway, so there is no point.
*Have an FPGA copy everything into local RAM when A20 is triggered. I assume this is what the Terrible Fire accelerator does. It has 4 MB of on-board RAM.
https://amigastore.eu/702-terrible-fire-534.html
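
To make the bus-master problem concrete, here is a toy Python model. It is not how any real accelerator works, and the class and method names are invented. The CPU reads and writes a private SRAM array, while a DMA device writes only to motherboard RAM, so the CPU never sees the transferred data unless something copies it across.

```python
class ToyMachine:
    """Toy model: CPU uses local SRAM, DMA devices see only motherboard RAM."""

    def __init__(self, size: int) -> None:
        self.board_ram = bytearray(size)   # RAM on the motherboard
        self.local_sram = bytearray(size)  # SRAM on the accelerator card

    def cpu_write(self, addr: int, value: int) -> None:
        self.local_sram[addr] = value      # never reaches the motherboard

    def cpu_read(self, addr: int) -> int:
        return self.local_sram[addr]

    def dma_write(self, addr: int, data: bytes) -> None:
        # e.g. a floppy controller transferring a sector into memory
        self.board_ram[addr:addr + len(data)] = data

    def sync_from_board(self, addr: int, length: int) -> None:
        # what extra logic (or a driver) would have to do after each transfer
        self.local_sram[addr:addr + length] = self.board_ram[addr:addr + length]


m = ToyMachine(1024)
m.dma_write(0x100, b"\xAA\x55")
stale = m.cpu_read(0x100)      # CPU still sees the old contents
m.sync_from_board(0x100, 2)
fresh = m.cpu_read(0x100)      # now the transferred byte is visible
print(hex(stale), hex(fresh))
```

The same mismatch exists in the other direction: anything the CPU writes into local SRAM is invisible to a device that reads motherboard RAM.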

Maybe I am too niche.. Just strap a 486SXLC onto a 386/286 board and be done.

For a Mac however, maybe?

Reply 4 of 12, by bakemono

Posted on 2025-01-27, 19:12

Rank: Oldbie
Posts: 839
Joined: 2018-01-15, 06:56

IIRC the Cyrix upgrade chips with on-chip cache also snooped IO port access so they could invalidate the cache if it looked like software was doing a DMA transfer.

SRAM has never really been fast+big+cheap all at the same time. Hence it was usually used as cache instead of RAM. The NeoGeo CD console maybe would be an exception, that used a lot of SRAM.

Putting fast memory on a CPU board was pretty common in the Amiga world. In many cases it was DMA-capable, because the CPU slot provided the necessary signals to make that possible. But it was also possible to simply flag memory regions as non-DMA-capable and Amiga OS could deal with that.
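
The idea of flagging regions as non-DMA-capable can be sketched like this. It is an illustration in Python, not Amiga OS code: the region names, sizes, priorities, and the `allocate` helper are all invented, and the real memory lists work differently in detail.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    free: int          # bytes still available
    dma_capable: bool  # can bus masters reach this memory?
    priority: int      # higher is tried first (faster memory)


def allocate(regions: list[Region], size: int, need_dma: bool) -> str:
    """Return the name of the region the request is satisfied from."""
    for r in sorted(regions, key=lambda r: -r.priority):
        if need_dma and not r.dma_capable:
            continue   # skip fast RAM that devices cannot see
        if r.free >= size:
            r.free -= size
            return r.name
    raise MemoryError("no suitable region")


regions = [
    Region("motherboard RAM", 1 << 20, dma_capable=True, priority=0),
    Region("accelerator RAM", 4 << 20, dma_capable=False, priority=10),
]
print(allocate(regions, 64 * 1024, need_dma=False))  # program code or data
print(allocate(regions, 512, need_dma=True))         # disk buffer
```

Ordinary allocations land in the fast accelerator RAM, and only buffers that a device must reach are forced into the slower motherboard RAM.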

GBAJAM 2024 submission on itch: https://90soft90.itch.io/wreckage

Reply 5 of 12, by Jo22

Posted on 2025-01-27, 20:10

Rank: l33t++
Posts: 11176
Joined: 2009-12-13, 07:06
Location: Europe

The Sega Mega Drive had used SRAM, in principle.
Though the actual RAM on the mainboard was PSRAM, pseudo-static RAM.
It behaved like SRAM, though, no external DRAM refresher was needed.
Modders replaced PSRAM by true SRAM and it worked. Was nice for overclocking.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 6 of 12, by Phido

Posted on 2025-01-28, 00:17

Rank: Newbie
Posts: 76
Joined: 2018-06-21, 02:23

bakemono wrote on 2025-01-27, 19:12:

IIRC the Cyrix upgrade chips with on-chip cache also snooped IO port access so they could invalidate the cache if it looked like software was doing a DMA transfer.

SRAM has never really been fast+big+cheap all at the same time. Hence it was usually used as cache instead of RAM. The NeoGeo CD console maybe would be an exception, that used a lot of SRAM.

Putting fast memory on a CPU board was pretty common in the Amiga world. In many cases it was DMA-capable, because the CPU slot provided the necessary signals to make that possible. But it was also possible to simply flag memory regions as non-DMA-capable and Amiga OS could deal with that.

Yeah, looking into it, the Amiga Workbench and BIOS are much more flexible and capable. Amigas can support many different types of RAM, accessible in different ways, and many different CPUs. Terriblefire has quite an advanced FPGA doing a lot of work underneath to make everything work as it does. He seemed to make it Amiga-compatible. When asked about Macintoshes, he said he would never touch them.

The Macintosh accelerators are often very complex and require an extension in the OS to make them work. This extension is loaded at boot and basically enables everything and patches the OS to make it work. Some of these have on-board RAM and even SIMM slots on the accelerator card. Even then, cache and on-board memory can be problematic with DMA transfers and certain software. You are also limited in OS compatibility by the extension.

The Cyrix CPUs invalidating their cache is really quite nifty, and the CPU guys have some really good strategies for that, and even then it's problematic. But the fact that you can get a 486SXLC with a 16-bit bus and 1 KB or 8 KB of cache means that for basically no effort you get quite a good outcome. It's neat, it's compact, it's simple, it's cheap.

The 68030 has 512 bytes of cache (256 + 256) built in as well. So while that is pretty small, it's not nothing. I might have to look into the Mac thing a bit more. There are a lot of systems hamstrung by slow 16 MHz buses and a 16-bit RAM bus. The potential is huge if you can fit SRAM on the CPU socket board. Lifting it from a 16 MHz 68030 on 16-bit to a 33 MHz 68030 on 32-bit would be a four-fold improvement. Overclock it and you can probably get a five- or six-fold improvement in CPU/memory speed. The Mac LC/Colour Classic accelerators already clock-double the CPU to 33 MHz, but that memory speed really limits performance.

Looking at it, you would probably need to unmap the first 64 KB and the BIOS. This complicates things. I want simple, and simple would be hooking SRAM straight up to the address lines. That isn't going to work. Not only do you need a way to have the memory sit in a different address space, but the CPU bus signals also have to switch from the on-board RAM to the ROM chips on the motherboard, plus the I/O device mapped ranges for VGA and, say, EMS. Something is going to have to do a lot of translation.
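
As a sketch of what that "something" has to decide on every bus cycle, here is a small address decoder in Python. The ranges follow the usual PC layout below 1 MB (conventional memory to 640 KB, video at A0000h, adapter ROM space at C0000h, system BIOS at F0000h). Which ranges stay on the motherboard is an assumption for illustration, and real hardware would do this in a PLD or CPLD rather than software.

```python
LOCAL, BOARD = "local SRAM", "motherboard"

# (start, end_exclusive, target, description)
DECODE_MAP = [
    (0x00000, 0x10000, BOARD, "first 64 KB, left on the motherboard"),
    (0x10000, 0xA0000, LOCAL, "rest of conventional memory"),
    (0xA0000, 0xC0000, BOARD, "video memory"),
    (0xC0000, 0xF0000, BOARD, "adapter ROMs, EMS page frame"),
    (0xF0000, 0x100000, BOARD, "system BIOS"),
]


def decode(addr: int) -> str:
    """Decide which side of the CPU socket services a physical address."""
    for start, end, target, _description in DECODE_MAP:
        if start <= addr < end:
            return target
    return BOARD  # anything above 1 MB goes to the motherboard in this sketch


for a in (0x00400, 0x20000, 0xB8000, 0xFFFF0):
    print(f"{a:05X} -> {decode(a)}")
```

Even this simple table shows why the SRAM cannot just hang off the address lines: the card has to suppress the motherboard cycle for local hits and pass everything else through unchanged.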

On a 16-bit CPU, maybe, but on a 32-bit CPU with essentially two 32-bit buses, that is a whole lot of pinout. You would need quite a capable FPGA or CPU (like a PiStorm), and it would be the size of a motherboard. Plus, Macs have weird memory management and ROMs that do all sorts of wacky things. The 68030 has dynamic bus sizing, so that is going on as well: the Apple ROMs are accessed through a 32-bit bus, but the RAM is accessed over a 16-bit bus. I'm looking at the LC reloaded and other motherboard schematics to get a feel. It looks like the CPU address lines need to go into the chipset, so again, without a lot of work it doesn't look easily solvable. It looks like it would be more solvable if you made a proper 32-bit motherboard and used the LC parts to build it. Even then, I'm not sure it would work at the higher frequency. Sure, your memory might, but nothing else will. It would be like sticking 60 ns SIMMs in and clocking it up: it would just be the motherboard that falls over. But again, 32-bit computers already exist, so it's just reinventing the wheel.

You could probably build a CPU adapter that had SRAM on it but operated at the same speed and bus type as the existing setup. You would then have to get the OS/ROM to recognise this new RAM, which is probably doable on x86 but not easily on a Mac.

Existing accelerators like the Thunder PDS, or the 286-to-486 type upgrades, already do everything pretty efficiently. They have a clock-doubled CPU talking to the regular bus.

I get now why things like the ThunderCachePro accelerator were so complex and expensive, and as big as a motherboard. They plugged into the PDS and basically replaced the motherboard.
Maybe with the x86 stuff it's doable. But it would be expensive, and you would need to be smarter and more knowledgeable than me.

TLDR: They don't do it because it gets complicated and expensive, and it would need to run at the same speed, thus gaining no performance.

Reply 7 of 12, by Tiido

Posted on 2025-01-28, 00:28

Rank: l33t
Posts: 3494
Joined: 2018-01-14, 04:40
Location: Norway (used to be Estonia)

bakemono wrote on 2025-01-27, 19:12:
The NeoGeo CD console maybe would be an exception, that used a lot of SRAM.

NGCD uses a whole bunch of FPM DRAM for holding stuff that would normally live in a cartridge. But the SRAMs it does use are the same as in the AES and MVS machines: work RAM, video parameters, palettes etc.

But as far as Amiga stuff goes, I always wondered why nobody has tried to do cache on them. The problems are fundamentally the same as on PCs with DRAM, and the solutions should work just the same way too. A 486 with its 8 KB internal cache still runs a whole lot faster with 256 KB of L2 than without, not to mention a vanilla 386. A 68040 or 68060 should absolutely benefit from some L2 in a DRAM system. I'm not fully familiar with the 68020/030 bus; it might benefit from L2 too, but a vanilla 68000 is not going to, just as a 286 wouldn't (assuming a bus that isn't inserting wait states on every cycle).
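
A rough way to see why an L2 helps even when the CPU already has a small internal cache is the textbook average memory access time formula, hit time plus miss rate times miss penalty. The sketch below applies it in Python. Every number in it (latencies and miss rates) is invented purely for illustration and is not a measurement of any 486 or 680x0 system.

```python
def amat(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time: hit time plus the expected miss cost."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# All figures below are made-up placeholders.
DRAM_NS = 200.0       # cost of going all the way to DRAM
L1_HIT_NS = 30.0      # internal cache hit
L1_MISS_RATE = 0.10
L2_HIT_NS = 60.0      # external SRAM cache hit
L2_MISS_RATE = 0.20

l1_only = amat(L1_HIT_NS, L1_MISS_RATE, DRAM_NS)
l2_level = amat(L2_HIT_NS, L2_MISS_RATE, DRAM_NS)   # cost of an L1 miss with L2 present
with_l2 = amat(L1_HIT_NS, L1_MISS_RATE, l2_level)

print(f"L1 only: {l1_only:.1f} ns, L1 + L2: {with_l2:.1f} ns")
```

The gain shrinks as the miss penalty to DRAM shrinks, which fits the point that a CPU already running without wait states has little to gain from a cache.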

T-04YBSC, a new YMF71x based sound card & Official VOGONS thread about it
Newly made 4MB 60ns 30pin SIMMs ~
mida sa loed ? nagunii aru ei saa 😜

Reply 8 of 12, by Jo22

Posted on 2025-01-28, 01:11

Rank: l33t++
Posts: 11176
Joined: 2009-12-13, 07:06
Location: Europe

Phido wrote on 2025-01-27, 05:00:

A CPU has cache, it is in the CPU and sits between the CPU and memory. I know on 386, on die cache was problematic, and on 486s WB cache was problematic because they really needed more signaling particularly when using weird IO devices etc. But we are talking about a 286. It really has nothing going on.

The Microsoft MACH 20 accelerator card had an 80286 and a cache memory of "16K".
Microsoft MACH 20 leaflet (PDF)

Reply 9 of 12, by rmay635703

Posted on 2025-01-28, 03:50

Rank: Oldbie
Posts: 1814
Joined: 2019-01-19, 19:32

Jo22 wrote on 2025-01-28, 01:11:

Phido wrote on 2025-01-27, 05:00:

A CPU has cache, it is in the CPU and sits between the CPU and memory. I know on 386, on die cache was problematic, and on 486s WB cache was problematic because they really needed more signaling particularly when using weird IO devices etc. But we are talking about a 286. It really has nothing going on.

The Microsoft MACH 20 accelerator card had an 80286 and a cache memory of "16K".
Microsoft MACH 20 leaflet (PDF)

Those MS accelerators must be rare, as I've never seen anyone with one.

They are a sort of inboard 286 but with other interesting expansions. The lack of extended memory, with only expanded memory available, is a bummer, and I doubt OS/2 would have worked.

Reply 10 of 12, by Phido

Posted on 2025-01-28, 04:55

Rank: Newbie
Posts: 76
Joined: 2018-06-21, 02:23

Jo22 wrote on 2025-01-28, 01:11:

Phido wrote on 2025-01-27, 05:00:

A CPU has cache, it is in the CPU and sits between the CPU and memory. I know on 386, on die cache was problematic, and on 486s WB cache was problematic because they really needed more signaling particularly when using weird IO devices etc. But we are talking about a 286. It really has nothing going on.

The Microsoft MACH 20 accelerator card had an 80286 and a cache memory of "16K".
Microsoft MACH 20 leaflet (PDF)

I wasn't aware of such a Microsoft upgrade.

Also, here is an Amiga accelerator with SRAM:
https://gitlab.com/MHeinrichs/68030tk
https://gitlab.com/MHeinrichs/68030tk-SRAM_IDE-interface
They use a Xilinx CPLD to map the memory.

So it's possible, but it's much more involved to design and execute.

Looking across the projects that are happening:

*Amiga accelerators already have on-board memory; it's very popular on the new/cloned accelerators.
*There is a Mac accelerator based off clone accelerators, with 32 KB of 32-bit cache. Maxx over at 68k seems to be very active.
*PC land is rather boring: just buy a new motherboard, a CPU upgrade, a new PC or a clock crystal. Many 386 motherboards support cache, and there are very few reasons why you would hot-rod a 286/386SX system.

For the Mac and Amiga (and Atari or Sharp 68000 machines), the machine is very much tied to that particular speed and processor. So expensive accelerators make sense.

I think I will focus my effort on making a 286 PGA to 386/486SXLC adapter that has an external clock multiplier, to take a slow 286 bus and allow the CPU to clock up significantly. For example, on a PS/2 286, take the 10 MHz bus clock and multiply it by 2, which then gets internally clock-doubled by the 486SXLC. So you can drop it into a machine and run it at 40 MHz. The cache will hopefully do a lot of heavy lifting and make it perform like a 20-30 MHz 386SX CPU. Not fast enough for DOOM, but fast enough for Wolf3D, 2D games and Win 3.1.

If you disable the cache and the built-in multiplier, which can be done in software, you basically get standard performance. So it is possibly of interest to those with a PC/AT, early PS/2s, or chonky 286s.

Reply 11 of 12, by Jo22

Posted on 2025-01-28, 06:46

Rank: l33t++
Posts: 11176
Joined: 2009-12-13, 07:06
Location: Europe

^I like the Amiga 2000 here; it has a processor slot for accelerator cards.
Cards with a 68020 or 68030 were available in the late 80s.

Also, there was the 68010 as a drop-in replacement for 68000.
WHDLoad uses it, for example.

I often think that the 68010 was in same role as the NEC V30 (and V20) on PC here.

They both had certain advantages and not just in terms of speed.

The NECs used CMOS, were responsive and could tri-state, that's why they were so popular on PC emulator boards that went into 68000 machines.

Reply 12 of 12, by Phido

Posted on 2025-02-10, 05:40

Rank: Newbie
Posts: 76
Joined: 2018-06-21, 02:23

I am looking at this much closer. I think I am ready to build a prototype.

Harris 286 at 25 MHz. I am aiming to plug this into a 10 MHz PS/2, so a 2.5x multiplier. The PS/2 sucks for these experiments, though.

1 MB of 45 ns SRAM. However, to save complications, I will just wire up 640 KB. I think I will latch memory bus accesses in the 0-640 KB range so they never leave the CPU board to go to the motherboard's socket.
Floppy and SCSI won't work with the memory connected that way, although I guess I could make the first 64 KB a hole to main memory, much like the TI/Cyrix 486SLC CPUs do.
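
Whether 45 ns SRAM can keep up with a 25 MHz 286 without wait states can be estimated with a simple timing budget. The Python sketch below assumes a zero-wait-state 286 bus cycle of two processor clocks. The address delay, decode delay and data setup figures are placeholder guesses, not datasheet values, and the model ignores the 286's pipelined address timing, so the real margin should come from the Harris and SRAM datasheets.

```python
def sram_margin_ns(
    cpu_mhz: float,
    sram_access_ns: float,
    clocks_per_bus_cycle: int = 2,   # zero-wait-state 286 bus cycle
    addr_delay_ns: float = 20.0,     # assumed: CPU address valid delay
    decode_ns: float = 10.0,         # assumed: address decode/latch on the card
    setup_ns: float = 5.0,           # assumed: data setup before the CPU samples
) -> float:
    """Slack left after the SRAM access time; negative means a wait state is needed."""
    cycle_ns = clocks_per_bus_cycle * 1000.0 / cpu_mhz
    return cycle_ns - addr_delay_ns - decode_ns - setup_ns - sram_access_ns


for mhz in (20.0, 25.0):
    print(f"{mhz:.0f} MHz with 45 ns SRAM: margin {sram_margin_ns(mhz, 45.0):.1f} ns")
```

With these guessed delays the 25 MHz case has no slack at all, which suggests either a faster SRAM grade or allowing for one wait state in the design.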

Digging around I found this guy who built a 286 motherboard with SRAM and ran some benchmarks.
My 286 Mainboard build
