Can a faster ISA Graphics card be built?

Reply 40 of 64, by BitWrangler

Posted on 2024-04-09, 19:31

BitWrangler Online

Rank l33t++

Rank: l33t++
Posts: 7685
Joined: 2017-10-11, 00:55
Location: Ontario

megatron-uk wrote on 2024-04-09, 19:21:

You would probably want a design where you could pass over the rendering in its entirety to the card.

Maybe this happens in theory, but in effect everything from the late 90s up is a graphics decelerator for anything below half of it's minimum system requirements due to needing lots of CPU for driver overhead.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 41 of 64, by Jo22

Posted on 2024-04-09, 21:20

Jo22 Offline

Rank l33t++

Rank: l33t++
Posts: 10078
Joined: 2009-12-13, 07:06
Location: Europe

Tiido wrote on 2024-04-09, 09:55:
A number of retro computers did similar stuff but they were very slow to begin with, and this sort of memory access interleaving […]
Show full quote

Jo22 wrote on 2024-04-09, 02:56:
A shared memory architecture, isn't that something the PC Jr did use once ? But without fast, dual-ported chips? […]
Show full quote
A shared memory architecture, isn't that something the PC Jr did use once ?
But without fast, dual-ported chips?

Generaly speaking, assuming that the CPU and graphics chip would be on same die or chip, like with Cyrix MediaGX (?) or Mx Macs, couldn't this be also an advantage at some point ?

I remember that a shared memory concept always had been seen as something rather negative,
but if all components could access the same RAM like in an efficient network topology..

Edit: These are just some thoughts (it's night time here), it's more of a rhetorical thing.

A number of retro computers did similar stuff but they were very slow to begin with, and this sort of memory access interleaving they did easily halved actual CPU throughput. Many of the video capability limitations also stemmed from such things, you only had so many accesses per frame into which the entire image had to be able to be composed out of...

Modern systems are largely able to do this stuff due to ginormous caches, main memory has been far too slow for a long time for the CPU to run anything out of without being choked. Much of the operation is done in the caches of the CPU, leaving memory bus free for other stuff such as accesses from GPU and other bus masters.

[..]

Um, I was thinking of using a shared memory that's not a bottle neck, rather than ordinary SDRAM/DDRx RAM.
Something like video RAM, static RAM, dual-ported RAM, or something based on a new, experimental technology.

Edit: Caches are using static RAM technology, still, I assume.
So if memory could be build using static RAM, overall performance would improve, maybe.

The GPU being physically close to CPU would also reduce latency, so there wouldn't be so much caching stages needed being spread across all over the mainboard, maybe.

Now that I think of it, I once had an AMD E-350 APU based motherboard and it wasn't that bad, actually, performance wise.
I wasn't so much into 3D games at the time, though. I was simply happy that the system incl. Aero Glass was responsive.

..if there just wasn't the ever occurring complaint about static RAM being expensive and hard to produce.
I sometimes wonder how technology would be now if industry had focused on producing static RAM instead of cheap, high-capacity DRAM.

In my opinion, DRAM was a mistake. It shouldn't have been allowed to spread like fire throughout industry. Or like sugar in groceries.
Dynamic RAM held back development just like non-SLC based SSD technology does now.

Last edited by Jo22 on 2024-04-09, 21:34. Edited 1 time in total.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 42 of 64, by BitWrangler

Posted on 2024-04-09, 21:22

BitWrangler Online

Rank l33t++

Rank: l33t++
Posts: 7685
Joined: 2017-10-11, 00:55
Location: Ontario

Bubble memory, that's the memory of the future, Texas Instruments themselves told me so in 1980.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 43 of 64, by Tiido

Posted on 2024-04-09, 21:28

Tiido Offline

Rank l33t

Rank: l33t
Posts: 3176
Joined: 2018-01-14, 04:40
Location: Norway (used to be Estonia)

mkarcher wrote on 2024-04-09, 17:45:
Not really. On the PC, there is no "line cache", so each text line needs to be retrieved for each scan line of the character. For example, on CGA, in high-res text mode, a character takes 8 pixels at 14.318MHz pixel clock, which is 1.8 megacharacters/second.......

I primarly wrote about the idea of having framebuffer bits lie in the system memory part, and a system that is meant to work in such a way while not necessarily abiding to the way the actual video standards being emulated work. What I wrote would roughly apply for just the character+attribs/framebuffer part the programmer etc. directly manipulates, while character lookup etc. would stay on the "video card" side and not also be stolen from main memory. That would absolutely destroy any performance possible unless the character bits can say in another bank and leverage page mode accesses so that there is no need to swap between two unrelated rows every character and get a huge DRAM induced stalls in the process.

In the same vein, VGA's double scan wouldn't be an obstacle, it is the video controller side that will do that part and spare the CPU side from lost bandwidth. I did forget about the blanking periods but these are parts that the any possible main memory video controller won't have to waste any memory bandwidth on and leave it all to the CPU side as guaranteed access windows. Of course it will require a line buffer etc. in the video card side and all the practical implementations do work that way, and it gets more critical as CPU speed increases. Every video line the controller has to open at least one a new row depending on mode/resolution and page size (since CPU/cache/busmasters destroy any previous state anyway), do a page mode burst within those rows and surrender the memory bus back to CPU, which will then have to suffer through opening any rows it was previously working in. It is not going to be viable without page mode accesses. In the end quite many µs of time will be lost on just these row openings and in a fast system that can be very many thousands of instructions that coulnd't execute, even when not including the time to actually read the actual video data necessary. At higher resolutions the video data reads together will take a substantial amount of time away from the CPU and if the cache of CPU isn't enough, it will have to get stalled until video process finishes, giving a potentially huge loss in performance yet again. Ability to have multiple DRAM banks and have things distributed between them is key to any success here.

My experience with integrated video without their own local memory ended at roughly P3 era, but I do remember that using onboard video vs external card of any sort, showed a very dramatic performance difference which should be explainable by all the DRAM related stalls that such a method has to produce. One motherboard I had could use local DRAMs for integrated video and it showed much better CPU performance, similar to dedicated video card (although video performance itself sucked but that's another story 🤣). I except modern things to fare a lot better, mostly because of much bigger caches, larger DRAM pages and much faster page mode accesses, it definitely seems to work well enough for the modern game consoles, despite opening a new row not being much faster if at all compared to the old memory chips.

I do very much appreciate the writeups on how the actual CGA, MDA etc. controllers do their business, very detailed and easy to understand ! It isn't something I have looked very deeply into myself but if I ever revive my FPGA VLB+ISA video card, I'll definitely get intimately familiar 🤣. The ISA+VLB means that while it is primarly a VLB card, it does connect entire ISA bus and is supposed to be usable from just an ISA slot on a 286 or something older even. I planned to have several video BIOSes too, to leverage capabilities of better CPUs, if there's room (I have not really tried to see what video BIOSes do and how much room is there for 32bit data manipulation, perhaps there isn't at all). I did plan to play as fast and loose as possible without trying to reimplement exactly how the standards work, apart from when that way is absolutely necessary to maintain compatibility with existing software. Access interfaces will definitely look the same to the programmer as the real deals, but what goes behind the scenes and timings of things are gonna be something else entirely...

Jo22 wrote on 2024-04-09, 21:20:
Um, I was thinking of using a shared memory that's not a bottle neck, rather than ordinary SDRAM/DDRx RAM.
Something like video RAM, static RAM, dual-ported RAM, or something based on a new, experimental technology.

It can work on the olden computer but won't really carry on into the future and quite possibly will kill the computer because the memory just cannot scale to many tens of Mbytes or even hundreds and beyond... In the end we are stuck with DRAMs (be it vanilla SDRAM, DDR or GDDR flavors). I would love SRAM based main memory in current times but we can only do tens of Mbytes at obscene cost...

T-04YBSC, a new YMF71x based sound card & Official VOGONS thread about it
Newly made 4MB 60ns 30pin SIMMs ~
mida sa loed ? nagunii aru ei saa 😜

Reply 44 of 64, by Sphere478

Posted on 2024-04-10, 01:54

Sphere478 Offline

Rank l33t++

Rank: l33t++
Posts: 5835
Joined: 2021-01-13, 04:45

Well, there is one way to make an isa card that is fast.

Give it a socket on a ribbon cable that plugs into an interposer on 486 socket

Quasi VL bus

What was the most capable VL GPU?

Sphere's PCB projects.
-
Sphere’s socket 5/7 cpu collection.
-
SUCCESSFUL K6-2+ to K6-3+ Full Cache Enable Mod
-
Tyan S1564S to S1564D single to dual processor conversion (also s1563 and s1562)

Reply 45 of 64, by darry

Posted on 2024-04-10, 02:21

darry Offline

Rank l33t++

Rank: l33t++
Posts: 6133
Joined: 2014-01-20, 06:27
Location: Canada

Sphere478 wrote on 2024-04-10, 01:54:
Well, there is one way to make an isa card that is fast. […]
Show full quote

Well, there is one way to make an isa card that is fast.

Give it a socket on a ribbon cable that plugs into an interposer on 486 socket

Quasi VL bus

What was the most capable VL GPU?

Drop it into the grand canyon, it will be very fast as long as your favorite game is terminal velocity. 😉

Reply 46 of 64, by Sphere478

Posted on 2024-04-10, 14:02

Sphere478 Offline

Rank l33t++

Rank: l33t++
Posts: 5835
Joined: 2021-01-13, 04:45

Are there VL bus to PCI adapter chips?

Sphere's PCB projects.
-
Sphere’s socket 5/7 cpu collection.
-
SUCCESSFUL K6-2+ to K6-3+ Full Cache Enable Mod
-
Tyan S1564S to S1564D single to dual processor conversion (also s1563 and s1562)

Reply 47 of 64, by BitWrangler

Posted on 2024-04-10, 14:53

BitWrangler Online

Rank l33t++

Rank: l33t++
Posts: 7685
Joined: 2017-10-11, 00:55
Location: Ontario

Sort of... in that a PCI bridge chip oughta do it.... also there was the IBM selectabus in PC330 systems where you could use the VLB riser or the PCI riser, and I am fairly sure the slot was more or less VLB plus ISA and the PCI addon had most of the necessities on the PCI riser.

edit: that might be PC350 in other references, I dunno if IBM even got it straight what they were called, and also annoyingly referred to them as IBM Personal Computer range.... which is super un-unique.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 48 of 64, by Sphere478

Posted on 2024-04-10, 15:02

Sphere478 Offline

Rank l33t++

Rank: l33t++
Posts: 5835
Joined: 2021-01-13, 04:45

Okay so a isa, plus 486 socket interposer, plus PCI-VL bridge chip, to pci slot may be possible.

Sphere's PCB projects.
-
Sphere’s socket 5/7 cpu collection.
-
SUCCESSFUL K6-2+ to K6-3+ Full Cache Enable Mod
-
Tyan S1564S to S1564D single to dual processor conversion (also s1563 and s1562)

Reply 49 of 64, by BitWrangler

Posted on 2024-04-10, 15:08

BitWrangler Online

Rank l33t++

Rank: l33t++
Posts: 7685
Joined: 2017-10-11, 00:55
Location: Ontario

At that point you'll get to wondering if the ISA board is just an ISA backplane for a new PCI board or vice versa.... also I don't think it would end up costing much less than what Sergei K. is brewing up in the way of new 486 boards.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 50 of 64, by rasz_pl

Posted on 2024-04-11, 12:02

rasz_pl Offline

Rank l33t

Rank: l33t
Posts: 4031
Joined: 2017-06-04, 00:57

clb wrote on 2024-04-07, 20:40:

These best ISA graphics cards (and motherboards) had perfected the minimum number of ISA bus cycles needed to undertake 8-bit and 16-bit bus writes

Absolute ISA maximum (2 Bytes x clock MHz)/2 cycles for 0WS.

bakemono wrote on 2024-04-08, 21:02:

Using only two clocks for every word transaction would reach ~8MB/s, which would be better than most if not all existing ISA cards. Maybe there are some cards around that can do it, but I've only seen cards that do ~5MB/s (three clocks per word) at best, and most are slower.

There are gaps between transfers, memory refresh etc, all chipset dependent.

No amount of magic is going to make standard software run significantly faster with ISA attached peripheral.

bakemono wrote on 2024-04-08, 21:02:

BTW, If someone wanted to go nuts, they could design a video card with a ribbon cable connecting it to a 72-pin SIMM daughter card, providing a linear framebuffer at the full system memory bandwidth.

damn, you and BitWrangler beat me posting first 😀 There are two possible ways to give 386/486 fastest graphics possible, and you covered both:

1 Simm form factor FPGA emulating DRAM

Could emulate fastest supported ram possible. This would require SRAM or old school DRAM on the other side, never ram types (DDR2/3) might not have the latency?
Or alternatively simpler implementation using interposed with normal SIMM plugged in and FPGA just intercepting accesses to 128KB window.

It should be quite easy to simulate potential effect using CGA/Hercules card and VGA bios stub. Benchmark 13h games (means not Doom, but fastdoom_13h or Heretic/Hexen yes) on a 386/486 computer without VGA card and VGA stub pretending to initialize 13h properly, make sure BIOS doesnt cache VGA window. If this gives us double the DOOM framerate then it is indeed something worth pursuing.

BitWrangler wrote on 2024-04-08, 22:07:

I was thinking of an ATA100 cable from an interposer between the CPU and socket, and a daughterboard that goes on the feature connector, but it's hardly an ISA card then, more like jury rigged local bus.

2 CPU socket interposer adding VLB on VLBless motherboards.

Sphere478 wrote on 2024-04-10, 01:54:

Give it a socket on a ribbon cable that plugs into an interposer on 486 socket
Quasi VL bus
What was the most capable VL GPU?

Re: VLB with 386DX-40 ?
Re: Diy VL bus

but results from 386 boars with VLB arent too encouraging, merely couple fps better results over ISA 🙁 contemporary VGA chips are the bottleneck with slow reads and writes topping out at ~20MB/s.

AT&T Globalyst/FIC 486-GAC-2 Cache Module reproduction
Zenith Data Systems (ZDS) ZBIOS 'MFM-300 Monitor' reverse engineering

Reply 51 of 64, by ViTi95

Posted on 2024-04-11, 14:22

ViTi95 Offline

Rank Oldbie

Rank: Oldbie
Posts: 520
Joined: 2017-02-14, 22:18

I really want to see a FPGA based ISA video card with lot's of VRAM, 3D acceleration and sound acceleration. I'm pretty sure games like Doom could run at full speed even on a slow 386 with such card. Most of the CPU time is wasted on texture mapping and scaling, which can be accelerated quite easily on the video card. Same for the sound, positional audio + channel mixing takes quite some CPU cycles.

https://www.youtube.com/@viti95

Reply 52 of 64, by Shagittarius

Posted on 2024-04-11, 14:40

Shagittarius Offline

Rank Oldbie

Rank: Oldbie
Posts: 1630
Joined: 2007-12-20, 06:49
Location: California, USA

ViTi95 wrote on 2024-04-11, 14:22:

I really want to see a FPGA based ISA video card with lot's of VRAM, 3D acceleration and sound acceleration. I'm pretty sure games like Doom could run at full speed even on a slow 386 with such card. Most of the CPU time is wasted on texture mapping and scaling, which can be accelerated quite easily on the video card. Same for the sound, positional audio + channel mixing takes quite some CPU cycles.

It would be nice to have an FPGA video card that you could swap out cores to turn into different hardware. That way you could also have all the experimental cores that people came up with, perhaps even game specific cores.

Reply 53 of 64, by BitWrangler

Posted on 2024-04-11, 14:54

BitWrangler Online

Rank l33t++

Rank: l33t++
Posts: 7685
Joined: 2017-10-11, 00:55
Location: Ontario

Shagittarius wrote on 2024-04-11, 14:40:

ViTi95 wrote on 2024-04-11, 14:22:

I really want to see a FPGA based ISA video card with lot's of VRAM, 3D acceleration and sound acceleration. I'm pretty sure games like Doom could run at full speed even on a slow 386 with such card. Most of the CPU time is wasted on texture mapping and scaling, which can be accelerated quite easily on the video card. Same for the sound, positional audio + channel mixing takes quite some CPU cycles.

It would be nice to have an FPGA video card that you could swap out cores to turn into different hardware. That way you could also have all the experimental cores that people came up with, perhaps even game specific cores.

Well you could see if you get anywhere with one of these, to see if it's worth the effort doing it on ISA FPGA found on PCI parallel card, possibilities??? Many other Lava i/o cards based on that chip not just the parallel ones.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 54 of 64, by Dothan Burger

Posted on 2024-04-11, 14:57

Dothan Burger Offline

Rank Member

Rank: Member
Posts: 102
Joined: 2023-02-21, 16:40

Shagittarius wrote on 2024-04-11, 14:40:

ViTi95 wrote on 2024-04-11, 14:22:

I really want to see a FPGA based ISA video card with lot's of VRAM, 3D acceleration and sound acceleration. I'm pretty sure games like Doom could run at full speed even on a slow 386 with such card. Most of the CPU time is wasted on texture mapping and scaling, which can be accelerated quite easily on the video card. Same for the sound, positional audio + channel mixing takes quite some CPU cycles.

It would be nice to have an FPGA video card that you could swap out cores to turn into different hardware. That way you could also have all the experimental cores that people came up with, perhaps even game specific cores.

Gamers Nexus recently featured an FGPA graphics card that the creator stated could play XP era games.

Edit: Sorry I forgot that was the basis for the whole thread.

Last edited by Dothan Burger on 2024-04-11, 15:17. Edited 1 time in total.

Reply 55 of 64, by BitWrangler

Posted on 2024-04-11, 14:59

BitWrangler Online

Rank l33t++

Rank: l33t++
Posts: 7685
Joined: 2017-10-11, 00:55
Location: Ontario

Does anyone with a P233 MMX and what everyone would regard as a "proper" 3D card, want to use the phil recipe for getting high 386 speeds and then benchmark quake in software and glquake to show us the awesome (yes I said that sarcastically) performance potential of bringing 3D acceleration to ISA class systems.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 56 of 64, by BitWrangler

Posted on 2024-04-11, 15:00

BitWrangler Online

Rank l33t++

Rank: l33t++
Posts: 7685
Joined: 2017-10-11, 00:55
Location: Ontario

Dothan Burger wrote on 2024-04-11, 14:57:

Shagittarius wrote on 2024-04-11, 14:40:

ViTi95 wrote on 2024-04-11, 14:22:

I really want to see a FPGA based ISA video card with lot's of VRAM, 3D acceleration and sound acceleration. I'm pretty sure games like Doom could run at full speed even on a slow 386 with such card. Most of the CPU time is wasted on texture mapping and scaling, which can be accelerated quite easily on the video card. Same for the sound, positional audio + channel mixing takes quite some CPU cycles.

It would be nice to have an FPGA video card that you could swap out cores to turn into different hardware. That way you could also have all the experimental cores that people came up with, perhaps even game specific cores.

Gamers Nexus recently featured an FGPA graphics card that the creator stated could play XP era games.

First post first sentence 🤣

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 57 of 64, by rasz_pl

Posted on 2024-04-11, 15:52

rasz_pl Offline

Rank l33t

Rank: l33t
Posts: 4031
Joined: 2017-06-04, 00:57

ViTi95 wrote on 2024-04-11, 14:22:

I really want to see a FPGA based ISA video card with lot's of VRAM, 3D acceleration and sound acceleration. I'm pretty sure games like Doom could run at full speed even on a slow 386 with such card.

no, games like doom would run just like they do run now. You would have to write New games or patches for old games to make them use your accelerator, and at that point those games would be running on your FPGA and not 386
you end up with [Doom runing on an unmodified Nintendo Entertainment System] https://www.youtube.com/watch?v=FzVN9kIUNxw

AT&T Globalyst/FIC 486-GAC-2 Cache Module reproduction
Zenith Data Systems (ZDS) ZBIOS 'MFM-300 Monitor' reverse engineering

Reply 58 of 64, by BitWrangler

Posted on 2024-04-11, 16:04

BitWrangler Online

Rank l33t++

Rank: l33t++
Posts: 7685
Joined: 2017-10-11, 00:55
Location: Ontario

rasz_pl wrote on 2024-04-11, 15:52:

ViTi95 wrote on 2024-04-11, 14:22:

I really want to see a FPGA based ISA video card with lot's of VRAM, 3D acceleration and sound acceleration. I'm pretty sure games like Doom could run at full speed even on a slow 386 with such card.

no, games like doom would run just like they do run now. You would have to write New games or patches for old games to make them use your accelerator, and at that point those games would be running on your FPGA and not 386
you end up with [Doom runing on an unmodified Nintendo Entertainment System] https://www.youtube.com/watch?v=FzVN9kIUNxw

Anyone else I would have said was full of shit but this is "Mr FastDoom" so I assume he would have the ability and knowledge to rejig Doom to take advantage of some hardware support.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 59 of 64, by bakemono

Posted on 2024-04-11, 18:35

bakemono Offline

Rank Oldbie

Rank: Oldbie
Posts: 758
Joined: 2018-01-15, 06:56

Shagittarius wrote on 2024-04-11, 14:40:

It would be nice to have an FPGA video card that you could swap out cores to turn into different hardware. That way you could also have all the experimental cores that people came up with, perhaps even game specific cores.

It would be nifty if there was an ISA card with FPGA, level shifters, RAM, RGB DAC, and audio DAC. Maybe put a PCI edge on the other end so the card can be flipped over to use in either type of slot. Then everyone who wants to do an FPGA retro project can use the same board.

Only problem is that if you put a beefy FPGA on there, nobody would buy one because of the price, while if you put a cheap FPGA on there then the people who want to make their own GeForce 3 will complain it's too weak and nobody would buy one because of that...

I have a Chinese board with an EP4CE15 but I have to run a ribbon cable to connect it to anything. That's not too bad a solution but mounting it inside a PC would be a bit ugly.

GBAJAM 2024 submission on itch: https://90soft90.itch.io/wreckage

Main menu