ATI Graphics Solution

Reply 40 of 127, by reenigne

Posted on 2019-06-01, 14:59

reenigne Offline

Rank Oldbie

Rank: Oldbie
Posts: 649
Joined: 2006-11-30, 05:13
Location: Cornwall, UK

Deunan wrote:
For "VGA": the "smart DMA" that would make the card pull and convert or interpret the data on-the-fly rather than just dumb RAM to VRAM copy would be nice. That would allow you to prepare a pixel group in system RAM, without having to worry about packing - just a pixel value per byte or word, and then let the card handle it.

That would indeed have been nice with the benefit of hindsight. I suspect that would have been way down on IBM's list of priorities for VGA, though, for a couple of reasons. One is that system RAM to VRAM blitting speed was probably not very important for the kind of workloads that they were envisioning (business graphics, desktop GUIs and perhaps a little bit of 2D scrolling games). One of the more demanding applications at the time was CAD so if they were adding hardware features to accelerate anything, then drawing Bresenham lines would have been it (and indeed I suspect some of the ALU/read-mode/write-mode stuff on the card was actually designed with that in mind).

Deunan wrote:
Way I see it, we didn't need that clunky bitplane stuff to make the early CPU go fast, it could've been done the right way from the start.

You see bitplanes as clunky because (as an interface from software to hardware) they are a lot more complicated and difficult to understand than a linear framebuffer of packed pixels. But (and I guess this was my point when talking about the Wolfenstein 3D code) once you understand them, they increase the flexibility of the hardware and can drastically accelerate a number of common operations (by a factor approaching 4) by performing operations in parallel. So they were a good design given the constraints of the time.
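
To make the "factor approaching 4" concrete, here is a minimal Python sketch that models unchained 256-colour VGA memory as four byte planes and counts CPU writes for a solid horizontal fill. The `PlanarVram` class and its method names are made up for illustration, and the model ignores latches, write modes, and the cost of the port write that sets the map mask.

```python
class PlanarVram:
    """Toy model of unchained 256-colour VGA memory: 4 planes of bytes."""

    def __init__(self, width, height):
        self.stride = width // 4          # bytes per scanline in each plane
        self.planes = [bytearray(self.stride * height) for _ in range(4)]
        self.map_mask = 0x0F              # which planes a CPU write reaches
        self.writes = 0                   # CPU byte writes to video memory

    def write(self, offset, value):
        """One CPU write lands in every plane enabled by the map mask."""
        self.writes += 1
        for p in range(4):
            if self.map_mask & (1 << p):
                self.planes[p][offset] = value

    def pixel(self, x, y):
        """Read back the pixel at (x, y): plane x mod 4, byte x div 4."""
        return self.planes[x & 3][y * self.stride + (x >> 2)]


def fill_row(vram, y, colour, width):
    """Fill one scanline with all four planes enabled: 4 pixels per write."""
    vram.map_mask = 0x0F
    start = y * vram.stride
    for offset in range(start, start + width // 4):
        vram.write(offset, colour)


vram = PlanarVram(320, 200)
fill_row(vram, 10, 0x2A, 320)
print(vram.writes)   # 80 writes for 320 pixels; a packed layout needs 320
```

The saving only applies when neighbouring pixels get the same value (fills, clears, copies through the latches), which is why the factor approaches 4 rather than always reaching it.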

Deunan wrote:
And if that took any changes to the IBM PC to make it happen? So be it. We did move from XT to AT, increased the bus size, IRQ/DMA channel count, then we had PCI - it was all about incremental changes for the better.

Perhaps. But part of the reason the PC was so successful was incremental upgrades and backwards compatibility. You could upgrade from a CGA to an EGA without changing your monitor. You could upgrade to VGA without getting a new motherboard as well, and so on. If VGA had required upgrading other parts of the system as well it might not have taken off nearly as well as it did.

Deunan wrote:
Now, a DMA can saturate the bus, true, but if that is happening then you are trying to push too much data through it. How is that different from CPU not being fast enough to push all that by itself in the same amount of time?

It's not. Apart from DMA adding more complexity for hardware and programming. Which is rather my point - when you have a framebuffer that the CPU can copy to at approaching bus-saturation speeds, it rather negates the need for DMA.

Deunan wrote:
If it turned out that CPU writes are faster than PC DMA, then this DMA could've simply been upgraded with another controller that is faster and could do microbursts, and pace itself rather than transfer everything in one go. Then you just need your code to work on a pixel group the size of the burst, while DMA is transferring the previous group. That would steal some cycles from you but not halt the CPU completely, so this works faster than having the CPU do everything.

Throughput is maximised when the bus is saturated, regardless of whether it's the CPU or the DMA controller that is saturating it.

Deunan wrote:
For "CGA": I'd give it 32k RAM so that it could do 320x200 in 16 colors, and 640x200 in 4 colors. Simple 2 or 4 pixels per byte. Ideally it would use EGA-like 2-bit per pixel output but I don't want to sound like I'm just replacing CGA with EGA-lite. So let's stick to original RGBI. Then I would add palette LUT,

So far this is sounding exactly like PCjr graphics! Which did use a ULA (or something very similar).

Deunan wrote:
or 2 in fact. Even and odd pixel columns would use LUT0 and LUT1, and those would be independent. This would also be swapped every row, like this (row (y), column (x)):

0,0: L0; 0,1: L1; 0,2: L0; 0,3: L1; ...
1,0: L1; 1,1: L0; 1,2: L1; 1,3: L0; ...

That's easy to do, a couple of XOR gates on the counters' lowest bits to drive the enable signal from the correct LUT to the output amps. Those LUTs would be small enough to use SRAM cells inside the GPU chip itself, but external SRAM also works. This way, not only are you not forced to use "blue or not blue" colors, but you could very cheaply do dithering - if nothing else, to be used while showing static images.

Clever! I like that idea a lot. The "dither clock" doesn't even need to be tied to the pixel clock - it could have been sub-pixel dithering. I wonder what the availability and cost of suitable SRAMs (like the 7489?) would have been in 1981, and if the designers of the CGA considered using them instead of the palette logic that they ended up with.
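
The alternating-LUT scheme quoted above can be sketched in a few lines of Python. This is only an illustration of the XOR selection from the row/column table; the function names and the example palette entries (RGBI values 4 and 14) are assumptions chosen for the demo, not anything from a real card.

```python
def lut_select(x, y):
    """0 selects LUT0, 1 selects LUT1: XOR of the low bits of column and row."""
    return (x ^ y) & 1


def resolve(pixel, x, y, lut0, lut1):
    """Look up a framebuffer pixel value in whichever LUT owns position (x, y)."""
    return (lut0, lut1)[lut_select(x, y)][pixel]


# Two independent 16-entry palettes, starting out as identity mappings.
lut0 = list(range(16))
lut1 = list(range(16))

# Map pixel value 1 to two different RGBI colours (hypothetical choice).
lut0[1] = 4    # red
lut1[1] = 14   # yellow

# A solid area of pixel value 1 comes out as a red/yellow checkerboard.
rows = [[resolve(1, x, y, lut0, lut1) for x in range(4)] for y in range(2)]
for row in rows:
    print(row)
```

A flat region in video memory thus turns into a dither pattern for free, with no extra work for the CPU.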

Deunan wrote:
But it should be fast enough to use in games and having palettes and HSYNC interrupt allows for all kinds of cool color increasing tricks - if you like demos.

HSYNC interrupts were possible with the original PC and CGA (we used one in the final version of 8088 MPH). I'm not sure why they weren't used more. One game that I know used them (California Games) had some bugs which made the effect kind of fragile, so I guess it was a bit of a black art back in the day, without the benefit of the internet to share ideas and documentation. By the time people had figured it out, CGA had been superseded already and was only interesting as a fallback for customers who didn't have EGA/VGA.

Deunan wrote:
Ha, I suppose I just don't see that "38 years later" as a positive thing. It should've been so easy to use that people could utilize 99% of its performance a year after it was released.

I guess my point was that a good design works well for the use cases it was designed for, and a great design is sufficiently flexible to accommodate use cases the designer never thought of. The important use cases at the time (drawing graphs and simple games where the speed of the display adapter wasn't critical) were well documented and performed adequately for the time. With modern eyes we can envisage much more difficult cases, and (surprisingly) the CGA can rise to the challenge more often than we felt entitled to expect it to. So yes, in hindsight we can see some ways in which CGA's design could have been done better, but overall it's remarkably good given the constraints!

Reply 41 of 127, by Scali

Posted on 2019-06-01, 15:13

Scali Offline

Rank l33t

Rank: l33t
Posts: 4873
Joined: 2014-12-13, 14:24

reenigne wrote:
That would indeed have been nice with the benefit of hindsight. I suspect that would have been way down on IBM's list of priorities for VGA, though, for a couple of reasons. One is that system RAM to VRAM blitting speed was probably not very important for the kind of workloads that they were envisioning (business graphics, desktop GUIs and perhaps a little bit of 2D scrolling games). One of the more demanding applications at the time was CAD so if they were adding hardware features to accelerate anything, then drawing Bresenham lines would have been it (and indeed I suspect some of the ALU/read-mode/write-mode stuff on the card was actually designed with that in mind).

Well, all that sort of 'fancy' stuff went into the 8514, which was released at about the same time as VGA. So they had the technology available, I suppose the feature set of VGA was mostly based on what it was supposed to cost vs what the average buyer would want.

reenigne wrote:
Throughput is maximised when the bus is saturated, regardless of whether it's the CPU or the DMA controller that is saturating it.

Yes, that's an interesting thing.
The Amiga blitter is so efficient because you can send a single command, and then it goes about its business, in a 'block copy' fashion. The 68000 does not have such a feature, so any copy operation would also require you to read the instruction from memory, which costs additional bandwidth, so you can never get the full theoretical bandwidth out of the CPU.
I'm not entirely sure how an x86 handles that with a rep movsw. It might not actually touch the memory after the first iteration, but it may still 'waste' cycles by setting up each next copy.
I believe DMA also has a special optimization to start the next transfer a cycle early, or something to that effect.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 42 of 127, by Deunan

Posted on 2019-06-01, 17:26

Deunan Offline

Rank l33t

Rank: l33t
Posts: 2088
Joined: 2018-05-29, 12:32

reenigne wrote:
they increase the flexibility of the hardware and can drastically accelerate a number of common operations (by a factor approaching 4) by performing operations in parallel

Yes, but not all cases. And again, was this a clever solution from the start or rather a clever hack for a silly hardware design we got. I argue it's the latter.

reenigne wrote:
Throughput is maximised when the bus is saturated, regardless of whether it's the CPU or the DMA controller that is saturating it.

That is true but only if you have one common bus. With 386 came L2 cache and 486 had L1 on-chip. And at this point you can have the CPU work entirely on the cached data (and code) and DMA doing its job. And best of all - it's a "free" performance upgrade. Not a single code change would be required, all previously written software benefits from it.

On the other hand, when 486 showed up it often had slower bus than 386, especially for writes since those are not bursted (except if you had a Cyrix and proper mobo support). So a CPU-based VRAM transfer was now slowed down from 33-40 to 25MHz. It was partially offset by the code being cached but still slower for any REP-based stores.

reenigne wrote:
So far this is sounding exactly like PCjr graphics!

And here I thought I was being clever. I admit I know nothing about PCjr - other than that it didn't quite take off. But possibly because the PC was already there and was different? Though it makes me happy that someone had the same idea, it sort of validates mine.

reenigne wrote:
HSYNC interrupts were possible with the original PC and CGA (we used one in the final version of 8088 MPH). I'm not sure why they weren't used more.

Yet another thing that was promising but then got broken. I think more VGAs didn't have proper support for IRQ than did. Good for emulators though, high-speed interrupts are a PITA.

reenigne wrote:
I guess my point was that a good design works well for the use cases it was designed for, and a great design is sufficiently flexible to accommodate use cases the designer never thought of.

That I fully agree with. But in my eyes CGA was barely acceptable when it came out, and got quickly ridiculed by pretty much every other computer on the market. Perhaps it was faster, but that's not much to go on when your available colors are red, green and poo. And that's the better palette.

Well, thanks for all the input. This was only supposed to be a what-if and my curiosity is sated now. I better let go before this turns into beating a dead horse.

Reply 43 of 127, by Scali

Posted on 2019-06-01, 17:43

Scali Offline

Rank l33t

Rank: l33t
Posts: 4873
Joined: 2014-12-13, 14:24

Deunan wrote:
That is true but only if you have one common bus. With 386 came L2 cache and 486 had L1 on-chip. And at this point you can have the CPU work entirely on the cached data (and code) and DMA doing its job. And best of all - it's a "free" performance upgrade. Not a single code change would be required, all previously written software benefits from it.

Because of limitations named earlier, you can't use the DMA controller to copy memory from system ram to vram or vice versa.
Ergo, no software has ever used DMA.
So no software benefits from any kind of DMA controller at all, and as such, the upgrade would not affect anything, and could only be used in code specifically written for it.

Deunan wrote:
On the other hand, when 486 showed up it often had slower bus than 386, especially for writes since those are not bursted (except if you had a Cyrix and proper mobo support). So a CPU-based VRAM transfer was now slowed down from 33-40 to 25MHz. It was partially offset by the code being cached but still slower for any REP-based stores.

How is that? When the 486 was launched, the fastest 386 was 33 MHz. Common 386es would be 25 MHz or slower, so only a few 386es would actually have a faster bus than the slowest 486es out there.
But this is moot, since 386 never had a localbus, and was limited to the 16-bit ISA bus at ~8 MHz, regardless of CPU speed. It's a huge bottleneck for either system, and in fact the 486 has the advantage because it has bigger caches, which allow it to work around the bottleneck better than a 386 can.
Even a low-end 486 at 25 MHz with a localbus would be way faster.

Deunan wrote:
Yet another thing that was promising but then got broken. I think more VGAs didn't have proper support for IRQ than did. Good for emulators though, high-speed interrupts are a PITA.

We are talking about HSYNC, not VSYNC. The HSYNC interrupt that reenigne is referring to, is created by synchronizing the system PIT to the vertical blank. Since the system PIT and the CGA run on the same clock, they remain in sync, which means you can calculate positions on the screen based on the PIT count.
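
As a rough worked example of that PIT arithmetic, here is a Python sketch. It assumes the usual figures for a PC with CGA, none of which are stated in this thread: a 14.31818 MHz master clock, the PIT running at 1/12 of it, and a CGA frame of 912 master-clock dots per scanline by 262 scanlines. The function name is made up for illustration.

```python
from fractions import Fraction

MASTER_HZ = Fraction(315_000_000, 22)   # 14.31818... MHz crystal (assumed)
PIT_DIVIDER = 12                        # PIT input = master clock / 12 (assumed)
HDOTS_PER_LINE = 912                    # CGA scanline length in master clocks (assumed)
LINES_PER_FRAME = 262                   # CGA scanlines per frame (assumed)

# Because both counters derive from the same crystal, these are exact integers.
pit_ticks_per_line = Fraction(HDOTS_PER_LINE, PIT_DIVIDER)
pit_ticks_per_frame = pit_ticks_per_line * LINES_PER_FRAME
frame_rate_hz = MASTER_HZ / (HDOTS_PER_LINE * LINES_PER_FRAME)


def beam_position(elapsed_ticks):
    """Map PIT ticks elapsed since the sync point to (scanline, tick in line)."""
    ticks = elapsed_ticks % int(pit_ticks_per_frame)
    return divmod(ticks, int(pit_ticks_per_line))


print(pit_ticks_per_line, pit_ticks_per_frame, float(frame_rate_hz))
print(beam_position(76 * 100 + 5))
```

Under these assumptions a scanline is a whole number of PIT ticks, which is what lets a timer interrupt stand in for a per-line HSYNC interrupt without drifting.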

There was only a VSYNC interrupt in hardware, introduced in EGA. But this never seemed to work very well, and most VGA cards, including IBM's own, have the line disconnected.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 44 of 127, by Grzyb

Posted on 2019-06-07, 17:00

Grzyb Offline

Rank l33t

Rank: l33t
Posts: 2462
Joined: 2019-05-08, 13:47
Location: CET-1CEST-2

I have one question after reading thru the entire thread...

Back in the Wolfenstein/Doom era, it was commonly believed that one of the major reasons why Amiga can't into 3D is the fact it only has bitplanes, and not chunky.
Now I've learned that both Wolfenstein and Doom, on a PC, use... bitplanes, exactly!

So what's the real problem with Amiga here?

And don't tell me that there's no problem...
https://www.youtube.com/watch?v=_jrMuesQQqk
Note the comments:
"it runs much better, silky smooth, on a 386dx20. it runs like this video on a 286"
"So this is about 2/3 the performance of a 286 at 10MHz. That's not good, and given that a 68030 at 28mhz is doing 10 MIPS and the 286 is doing about 1.5"

Sausage tastes best when you fry it with a laser!

Reply 45 of 127, by rasz_pl

Posted on 2019-06-07, 17:10

rasz_pl Offline

Rank l33t

Rank: l33t
Posts: 4208
Joined: 2017-06-04, 00:57

Grzyb wrote:
I have one question after reading thru the entire thread...

Back in the Wolfenstein/Doom era, it was commonly believed that one of the major reasons why Amiga can't into 3D is the fact it only has bitplanes, and not chunky.
Now I've learned that both Wolfenstein and Doom, on a PC, use... bitplanes, exactly!

Different bitplanes 😀 Mode X/Y stores one pixel in a single byte. The Amiga spreads one pixel over what, 5 bytes (one bit in each bitplane), with 8 pixels sharing each byte?

https://github.com/raszpl/FIC-486-GAC-2-Cache-Module for AT&T Globalyst
https://github.com/raszpl/386RC-16 memory board
https://github.com/raszpl/440BX Reference Design adapted to Kicad
https://github.com/raszpl/Zenith_ZBIOS MFM-300 Monitor

Reply 46 of 127, by Grzyb

Posted on 2019-06-07, 17:26

Grzyb Offline

Rank l33t

Rank: l33t
Posts: 2462
Joined: 2019-05-08, 13:47
Location: CET-1CEST-2

OK, thanks, I guess this explains it.
Still, the difference in performance is shocking.

Sausage tastes best when you fry it with a laser!

Reply 47 of 127, by Scali

Posted on 2019-06-07, 17:27

Scali Offline

Rank l33t

Rank: l33t
Posts: 4873
Joined: 2014-12-13, 14:24

Grzyb wrote:
Back in the Wolfenstein/Doom era, it was commonly believed that one of the major reasons why Amiga can't into 3D is the fact it only has bitplanes, and not chunky.
Now I've learned that both Wolfenstein and Doom, on a PC, use... bitplanes, exactly!

The PC version uses 'byteplanes', for lack of a better word.
This is a quirky undocumented mode, where you use 256 colour mode, while disabling the 'chaining' addressing mode, so the memory layout is the 4 bitplanes of EGA.
I wrote a blog about that some years ago, explaining how you can create a polygon filler exploiting this:
https://scalibq.wordpress.com/2011/11/23/just … ld-skool-style/
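
For readers who have not met the unchained layout before, this small Python sketch contrasts how a pixel is addressed in chained mode 13h and in the unchained "byteplane" layout. It assumes a 320-pixel-wide screen; the function names are invented for the example, and the last helper only computes the value one would program into the Map Mask register, it does not touch any hardware.

```python
WIDTH = 320  # assumed screen width in pixels


def chained_address(x, y):
    """Mode 13h as the CPU sees it: one linear byte per pixel."""
    return y * WIDTH + x


def unchained_address(x, y):
    """Unchained layout: returns (plane, byte offset within that plane)."""
    return x & 3, y * (WIDTH // 4) + (x >> 2)


def map_mask_for(x):
    """Map Mask value that enables only the plane holding column x."""
    return 1 << (x & 3)


print(chained_address(5, 2))     # linear offset
print(unchained_address(5, 2))   # (plane, offset)
print(map_mask_for(5))
```

Four horizontally adjacent pixels share one offset and differ only in plane, so enabling several planes at once writes several pixels with a single byte store.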

Grzyb wrote:
So what's the real problem with Amiga here?

Nothing really... The technique of raycasting for Amiga's bitplane mode was just not perfected yet.
Here's a preview of Cyberwolf, shown at Revision 2019:
https://www.pouet.net/prod.php?which=81051
Video: https://youtu.be/MyYCdd0Z88w?t=405
This runs on a stock 7 MHz Amiga 500.
So it certainly was possible, programmers simply didn't know how yet.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 48 of 127, by Grzyb

Posted on 2019-06-07, 17:47

Grzyb Offline

Rank l33t

Rank: l33t
Posts: 2462
Joined: 2019-05-08, 13:47
Location: CET-1CEST-2

Scali wrote:
So it certainly was possible, programmers simply didn't know how yet.

But what exactly is possible?
Are you saying that Amiga bitplanes can be as good in 3D as VGA mode X "byteplanes"?
I won't believe it until I see stock A1200 running Wolfenstein 3D as smoothly as low-end 386s do.
That Cyberwolf is sure impressive, but still, not quite Wolfenstein...

Sausage tastes best when you fry it with a laser!

Reply 49 of 127, by Scali

Posted on 2019-06-07, 17:57

Scali Offline

Rank l33t

Rank: l33t
Posts: 4873
Joined: 2014-12-13, 14:24

Grzyb wrote:
Are you saying that Amiga bitplanes can be as good in 3D as VGA mode X "byteplanes"?

Depends on what you want to do exactly. Some stuff is easier to do with bitplanes, some stuff easier with mode X, some stuff easier with chunky mode. Also, what exactly is 'VGA'? I mean, there are so many different PCs that support VGA, that this really is a meaningless statement.
The fastest PCs with VGA-compatible hardware are so fast that they can emulate a full Amiga with its bitplane hardware faster than any real Amiga can run.

Grzyb wrote:
I won't believe it until I see stock A1200 running Wolfenstein 3D as smoothly as low-end 386s do.

Is that a fair comparison then?
An Amiga 1200 runs on an 68EC020 CPU at 14 MHz, that's a low-cost version of a CPU from 1984.
A low-end 286 would be a more fair comparison I'd say. A 286-12 or a 286-16 perhaps (which would still be much more expensive than an Amiga 1200, but hey).

Grzyb wrote:
That Cyberwolf is sure impressive, but still, not quite Wolfenstein...

Why not? You have raycast walls, you have sprite-based enemies. All the 'difficult' stuff is already there. Adding animation for opening doors, and some HUD to show score and other info isn't drastically going to change the performance.
This is basically a machine from 1985, 2 years before VGA even existed. Imagine what Wolf3D would look like on a PC from 1985.
Any comparison between Amiga and VGA is basically unfair to the Amiga.
If you'd want to do a comparison with VGA at all, at least you could compare with the original IBM VGA, not a much later, faster SVGA variation.

I think the biggest problem with Wolf3D ports like the one from your video, is that they tried to shoehorn the PC-based rendering algorithm onto the Amiga. This required an additional chunky2planar conversion step.
Cyberwolf performs its raycasting 'natively' in a bitplane-friendly format, reducing c2p overhead.
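
To show what that chunky2planar step involves, here is a deliberately naive Python sketch that converts 8 chunky pixels into one byte per bitplane, with the leftmost pixel in the most significant bit. The function name is made up, and real Amiga c2p routines use far faster bit-merging tricks (or the blitter); this only illustrates the data shuffling that a renderer working natively in bitplanes avoids.

```python
def chunky_to_planar(pixels, depth):
    """Convert 8 chunky pixel values into `depth` bitplane bytes.

    Plane n collects bit n of every pixel; the leftmost pixel ends up
    in bit 7 of each output byte.
    """
    if len(pixels) != 8:
        raise ValueError("expected exactly 8 pixels")
    planes = []
    for plane in range(depth):
        byte = 0
        for value in pixels:
            byte = (byte << 1) | ((value >> plane) & 1)
        planes.append(byte)
    return planes


# Eight pixels with colour indices 0..7, stored in 3 bitplanes.
result = chunky_to_planar([0, 1, 2, 3, 4, 5, 6, 7], 3)
print([hex(b) for b in result])
```

Every output byte depends on all 8 input pixels, so the conversion touches each pixel once per bitplane on top of the rendering work itself.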

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 50 of 127, by Grzyb

Posted on 2019-06-07, 18:38

Grzyb Offline

Rank l33t

Rank: l33t
Posts: 2462
Joined: 2019-05-08, 13:47
Location: CET-1CEST-2

Scali wrote:
Grzyb wrote:
I won't believe it until I see stock A1200 running Wolfenstein 3D as smoothly as low-end 386s do.

Is that a fair comparison then?
An Amiga 1200 runs on an 68EC020 CPU at 14 MHz, that's a low-cost version of a CPU from 1984.
A low-end 286 would be a more fair comparison I'd say. A 286-12 or a 286-16 perhaps (which would still be much more expensive than an Amiga 1200, but hey).

A1200: 32-bit bus, 14 MHz
386DX: 32-bit bus, 16 MHz (see eg. the original Compaq Deskpro 386 from 1986)
386SX: 16-bit bus, 25/33 MHz
They should all have similar performance, right?

Any comparison between Amiga and VGA is basically unfair to the Amiga.

I don't expect Wolfenstein (1992) to run on any 1985 hardware, I expect it to run on a 1992 PC (where it runs fine, even on a 386SX, which was already low-end), and on a 1992 A1200 (where it sucks)...

I think the biggest problem with Wolf3D ports like the one from your video, is that they tried to shoehorn the PC-based rendering algorithm onto the Amiga. This required an additional chunky2planar conversion step.
Cyberwolf performs its raycasting 'natively' in a bitplane-friendly format, reducing c2p overhead.

...so, there's still some room for improving Wolfenstein on A1200. Well, we shall see...

Sausage tastes best when you fry it with a laser!

Reply 51 of 127, by Scali

Posted on 2019-06-07, 19:17

Scali Offline

Rank l33t

Rank: l33t
Posts: 4873
Joined: 2014-12-13, 14:24

Grzyb wrote:
A1200: 32-bit bus, 14 MHz
386DX: 32-bit bus, 16 MHz (see eg. the original Compaq Deskpro 386 from 1986)
386SX: 16-bit bus, 25/33 MHz
They should all have similar performance, right?

Well no.
Why not take the A4000?

Grzyb wrote:
I don't expect Wolfenstein (1992) to run on any 1985 hardware, I expect it to run on a 1992 PC (where it runs fine, even on a 386SX, which was already low-end), and on a 1992 A1200 (where it sucks)...

What do you mean "where it sucks"?
There never was any official Amiga version of Wolfenstein 3D. As I said, the unofficial Wolfenstein 3D ports for Amiga suffer from trying to shoehorn the PC rendering algo onto the Amiga hardware. So they're not optimized at all. While the PC version of Wolfenstein 3D was written by some legendary names in game development.

Also, 'low-end' is relative. A 386SX may have been a 'low-end' PC, but still far more expensive than an Amiga 1200, and the 386SX is a more modern and sophisticated CPU than the 68EC020 budget CPU in an Amiga 1200.
When Wolfenstein 3D came out, a 286 was far more common than a 386. And even 286 machines were more expensive than an Amiga 1200.

Grzyb wrote:
...so, there's still some room for improving Wolfenstein on A1200. Well, we shall see...

Sure. Cyberwolf is done by Britelite, a demo coder who really understands 68000 assembly and the Amiga bitplane layout and chipset.
He can optimize a raycaster about as well as the original coders did for x86/VGA. But apparently he thought Amiga 500 was a more interesting target than Amiga 1200.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 52 of 127, by Grzyb

Posted on 2019-06-07, 20:42

Grzyb Offline

Rank l33t

Rank: l33t
Posts: 2462
Joined: 2019-05-08, 13:47
Location: CET-1CEST-2

Scali wrote:
Why not take the A4000?

We can take the A4000, even with 68040 - but this is supposed to be similar to a 486, or even faster.
Similar CPUs, different video memory layout.
Now, how do the above machines compare in Doom?

386SX is a more modern and sophisticated CPU than the 68EC020 budget CPU in an Amiga 1200.

OK, 386SX has MMU, but this doesn't affect processing power.
http://www.faqs.org/faqs/motorola/68k-chips-faq/ - 68EC020@16 MHz has 7559 Dhrystones
386DX25 has about 5000 Dhrystones, 386SX can only be slower

However...
http://eab.abime.net/showthread.php?t=33243 - they claim stock A1200 has only 1283 Dhrystones...
But why? Does the sharing of RAM bandwidth with the chipset slow down the CPU so much?

Sausage tastes best when you fry it with a laser!

Reply 53 of 127, by Scali

Posted on 2019-06-07, 22:07

Scali Offline

Rank l33t

Rank: l33t
Posts: 4873
Joined: 2014-12-13, 14:24

Grzyb wrote:
Now, how do the above machines compare in Doom?

I thought this was about Wolfenstein 3D?
Anyway, the A4000 with the 030 is probably closest to a 386 system.
The 020 is closer to a 286. Just because all 68k have a 32-bit arch doesn't mean that they can be compared to 386+ only. They are built on older, less powerful technology. Besides, in the case of Wolfenstein 3D, 32-bit is mostly a moot point, since the game is only 16-bit. There probably isn't that much of a difference between a 386SX and 386DX at the same clockspeed (assuming they both have the same cache config: most 386SX systems are cache-less because they were budget machines, whereas most 386DX systems have cache because they were high-end).

Grzyb wrote:
http://www.faqs.org/faqs/motorola/68k-chips-faq/ - 68EC020@16 MHz has 7559 Dhrystones
386DX25 has about 5000 Dhrystones, 386SX can only be slower

Ehhh... I have no idea where those numbers come from, but pretty sure the 020 is nowhere near that fast.
It certainly isn't faster than a 386DX at 25 MHz. Probably not even as fast as a 386 at 16 MHz.
According to Wikipedia, the 68020 has ~190000 transistors and the 386DX has ~275000 transistors.
The 020 would have to be a helluva CPU design if it is faster than a 386 at all, with only about 2/3 of the transistor count.
The 286 has ~134000 transistors according to Wikipedia, by the way.
And the 030 has ~273000 transistors, so that's a very good match to the 386DX.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 54 of 127, by rasz_pl

Posted on 2019-06-08, 07:00

rasz_pl Offline

Rank l33t

Rank: l33t
Posts: 4208
Joined: 2017-06-04, 00:57

Scali wrote:
Nothing really... The technique of raycasting for Amiga's bitplane mode was just not perfected yet.
Here's a preview of Cyberwolf, shown at Revision 2019:
https://www.pouet.net/prod.php?which=81051
Video: https://youtu.be/MyYCdd0Z88w?t=405
This runs on a stock 7 MHz Amiga 500.
So it certainly was possible, programmers simply didn't know how yet.

Yep, saw it after the compo, very clever design. Looks like it's able to stay smooth by sticking to 2 bitplanes for walls. I saw a similar thing on C64 back in the nineties 😉
You could play some form of FPS on non PC platforms in 1993. SNES Jurassic Park was playable, even supported mouse apparently
https://www.youtube.com/watch?v=DBwIq77omRA
Jurassic Park had an A500 port, no idea how "smooth" tho; the AGA version is OK in a small window.

https://github.com/raszpl/FIC-486-GAC-2-Cache-Module for AT&T Globalyst
https://github.com/raszpl/386RC-16 memory board
https://github.com/raszpl/440BX Reference Design adapted to Kicad
https://github.com/raszpl/Zenith_ZBIOS MFM-300 Monitor

Reply 55 of 127, by Grzyb

Posted on 2019-06-08, 07:01

Grzyb Offline

Rank l33t

Rank: l33t
Posts: 2462
Joined: 2019-05-08, 13:47
Location: CET-1CEST-2

Scali wrote:
I thought this was about Wolfenstein 3D?

Not necessarily.
My primary goal is to answer the following question:

How does the video memory layout affect performance in 3D games?

To answer that, we need to run the same game on machines with similar CPUs, but different video circuitry.
I see that Wolfenstein may be too x86-centric (or even 286-centric), so perhaps Doom is a better benchmark?
After all, Doom was ported to various architectures already in the era.

The 020 is closer to a 286.

Well, according to the Amiga fanboys' propaganda I heard back in the era, A1200 was supposed to be a match to 386... heh, I guess one more myth busted...

Sausage tastes best when you fry it with a laser!

Reply 56 of 127, by rasz_pl

Posted on 2019-06-08, 08:26

rasz_pl Offline

Rank l33t

Rank: l33t
Posts: 4208
Joined: 2017-06-04, 00:57

Grzyb wrote:
Scali wrote:
I thought this was about Wolfenstein 3D?

Not necessarily.
My primary goal is to answer the following question:

How does the video memory layout affect performance in 3D games?

To answer that, we need to run the same game on machines with similar CPUs, but different video circuitry.
I see that Wolfenstein may be too x86-centric (or even 286-centric), so perhaps Doom is a better benchmark?
After all, Doom was ported to various architectures already in the era.

You can do a direct comparison using the already mentioned Jurassic Park. The DOS version is a 256-color, fully textured, 1/2-screen game. The Amiga AGA (A1200) one is ~16? colors, 1/4 screen, no floor/ceiling
https://www.youtube.com/watch?v=cB6wnSL9eVA
https://www.youtube.com/watch?v=HIqbR2YT5b8
no clue how bad the A500 version was.

Grzyb wrote:

Well, according to the Amiga fanboys' propaganda I heard back in the era, A1200 was supposed to be a match to 386... heh, I guess one more myth busted...

You must have read the same delusional people dealing with their cognitive dissonance by publishing total balderdash in Bajtek/Top Secret around 1994. Page 4: Commodore bankruptcy; 3 pages further, a 6-page spread about choosing your dream Amiga computer! 😕 😵 Amiga dedicated publications were even worse 😒

https://github.com/raszpl/FIC-486-GAC-2-Cache-Module for AT&T Globalyst
https://github.com/raszpl/386RC-16 memory board
https://github.com/raszpl/440BX Reference Design adapted to Kicad
https://github.com/raszpl/Zenith_ZBIOS MFM-300 Monitor

Reply 57 of 127, by Grzyb

Posted on 2019-06-08, 09:00

Grzyb Offline

Rank l33t

Rank: l33t
Posts: 2462
Joined: 2019-05-08, 13:47
Location: CET-1CEST-2

rasz_pl wrote:
You can do a direct comparison using the already mentioned Jurassic Park. The DOS version is a 256-color, fully textured, 1/2-screen game. The Amiga AGA (A1200) one is ~16? colors, 1/4 screen, no floor/ceiling

No info on the PC specs there.
If 68020 is really that bad, it seems we need to compare it against 386SX/16, maybe even in de-turbo?

Amiga dedicated publications were even worse 😒

Oh yes, can't forget about Marek Pampuch 😁

Sausage tastes best when you fry it with a laser!

Reply 58 of 127, by Scali

Posted on 2019-06-08, 09:35

Scali Offline

Rank l33t

Rank: l33t
Posts: 4873
Joined: 2014-12-13, 14:24

Grzyb wrote:
My primary goal is to answer the following question:

How does the video memory layout affect performance in 3D games?

To answer that, we need to run the same game on machines with similar CPUs, but different video circuitry.
I see that Wolfenstein may be too x86-centric (or even 286-centric), so perhaps Doom is a better benchmark?
After all, Doom was ported to various architectures already in the era.

As I already said, it depends on what you do, and how you do it. If you bother to read my blogs, I explain how to write flatshaded polyfillers on CGA, EGA, VGA and Amiga. The EGA and Amiga versions are the most efficient in that case. The Amiga wins because it can use the blitter to maximize video ram bandwidth. Which is why stock Amiga 500s could do fullscreen flatshaded poly graphics at the full framerate. PCs didn't have enough fillrate to do that until the VLB era.

As for Wolfenstein 3D vs DOOM, I would think that DOOM is even more x86-centric than Wolfenstein 3D was.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 59 of 127, by Grzyb

Posted on 2019-06-08, 11:20

Grzyb Offline

Rank l33t

Rank: l33t
Posts: 2462
Joined: 2019-05-08, 13:47
Location: CET-1CEST-2

Scali wrote:
As I already said, it depends on what you do, and how you do it.

That's why I insist on some real-world software, not just some synthetic flatshaded polyfillers.
And I think Doom would be a good example: it's been around for decades, source code is available, and it's been hacked by countless people.

Perhaps it would be best to forget about the PC at all, and consider the following:
- Amiga with AGA (bitplanes)
- the same Amiga with some SVGA card, using mode X (byteplanes)
- the same Amiga with the same SVGA card, using mode 13h (chunky)

So, which would win?

Sausage tastes best when you fry it with a laser!
