My thought on integrated video chipsets

Reply 20 of 74, by Bruninho

Posted on 2020-03-09, 20:11

Rank: Oldbie
Posts: 1911
Joined: 2017-09-07, 02:39
Location: Back to Rio de Janeiro :(

Intel GPUs compared to Honda Civics? Lol. That was true back in the day, but now I think they compare to the Le Mans GTE Pro class, meaning they are like Ford's early attempts at fighting the Ferraris. 😜

They are still far from the Le Mans P1 class though, which is where Nvidia and AMD are...

Last edited by Bruninho on 2020-03-09, 21:14. Edited 2 times in total.

"Design isn't just what it looks like and feels like. Design is how it works."
JOBS, Steve.
READ: Right to Repair sucks and is illegal!

Reply 21 of 74, by kjliew

Posted on 2020-03-09, 20:48

Rank: Oldbie
Posts: 1304
Joined: 2004-01-08, 03:03

Intel has been the biggest GPU vendor since 2002, but many regarded that market dominance as meaningless, since no one cared about the Intel GPUs built into chipsets and CPUs.

Reply 22 of 74, by SPBHM

Posted on 2020-03-09, 21:37

Rank: Oldbie
Posts: 876
Joined: 2012-10-26, 15:59
Location: Brazil

Standard Def Steve wrote on 2020-03-09, 16:39:

Out of sheer curiosity, does anyone know when (and how) IGPs stopped robbing CPUs of half the available memory bandwidth? Right now I only have two machines with onboard graphics, a Core 2 E8600/dual channel DDR3-1333 with GMA x4500 HD, and an i7-4790/dual channel DDR3-1600 with HD 4600. Both machines have dedicated GPUs installed. However, I ran an AIDA64 memory benchmark on both machines before and after installing the video cards. On both machines, the read/write/copy bandwidth didn't change at all, but latency was slightly better, dropping a few nanoseconds after the video card install.

Memtest86 bandwidth numbers were exactly the same. I remember back in the P4/Athlon XP days, installing an AGP graphics card resulted in significantly higher RAM bandwidth in memtest and AIDA64.

On a side note, I remember seeing an i815 based machine with the 4MB display cache built right into the motherboard. Think it was an SFF Deskpro.

The performance impact on 2002's 845GE already doesn't look bad: you would gain 15% bandwidth by disabling the IGP.

https://techreport.com/review/4186/intels-845 … 845ge-chipsets/

It would be interesting to look at the i810.

Reply 23 of 74, by dionb

Posted on 2020-03-09, 22:50

Rank: l33t++
Posts: 6483
Joined: 2017-12-23, 15:35
Location: Amsterdam

Standard Def Steve wrote on 2020-03-09, 16:39:

Out of sheer curiosity, does anyone know when (and how) IGPs stopped robbing CPUs of half the available memory bandwidth? Right now I only have two machines with onboard graphics, a Core 2 E8600/dual channel DDR3-1333 with GMA x4500 HD, and an i7-4790/dual channel DDR3-1600 with HD 4600. Both machines have dedicated GPUs installed. However, I ran an AIDA64 memory benchmark on both machines before and after installing the video cards. On both machines, the read/write/copy bandwidth didn't change at all, but latency was slightly better, dropping a few nanoseconds after the video card install.

Memtest86 bandwidth numbers were exactly the same. I remember back in the P4/Athlon XP days, installing an AGP graphics card resulted in significantly higher RAM bandwidth in memtest and AIDA64.

They didn't.

As mentioned above, the memory controllers have become much smarter, only taking bandwidth when needed - but if you stress both CPU and GPU at the same time, CPU will still only have half. What has changed is the dependence of the CPU on memory bandwidth - huge caches have made the impact smaller.

On a side note, I remember seeing an i815 based machine with the 4MB display cache built right into the motherboard. Think it was an SFF Deskpro.

Compaq had some creative designs. Never saw this one (and I have two Compaq i815 boards), but did see the 4MB-on-an-AGP-module they also used.

Reply 24 of 74, by ragefury32

Posted on 2020-03-10, 15:37

Rank: Oldbie
Posts: 571
Joined: 2019-10-18, 13:59

dionb wrote on 2020-03-07, 21:38:

A lot of the big innovations in integrated VGA were also done by others. The first actually shipping dual channel chipset was ALi's Aladdin 7, with dual-channel SDR-SDRAM to allow both CPU and integrated ArtX (later bought by ATi and turned into the Radeon) VGA to get decent memory performance and bandwidth. A good discrete card still performed better, but it blew away all other 2000-era integrated solutions. ATi was also first to allow integrated VGA and discrete VGA to cooperate so as to get maximum performance when needed and delivering maximum power efficiency when not.

Intel has just chugged along with the 'good enough' mantra- for non-gaming purposes (gamers may be noisy, but are and were a small minority) anything that reliably gets pictures onto a screen is good enough, and by integrating that into entry-level chipsets and later CPUs they've pushed all but the most high-end competitors out of the market. Great business acumen, but hardly a technical innovation.

I haven't seen any graphics benchmarks based on that ArtX core, and the 2000-era integrated stuff includes graphics cores like the Ali CyberAladdin TNT2 and the Via PM/KM133 with the ProSavage core - both of them were "decent" in their own right if you were using them for DirectX6 applications only. The original nForce with the Geforce2 MX based IGP came out in 2001, so even if the ArtX came out ahead, it would not have been able to beat the nForce IGP. I also remember accusations that ArtX either fudged or faked their tech demos at Comdex and then engaged in some sketchiness on the Ars Technica forum trying to manipulate public opinion.

As for that ATi integrated/discrete cooperation thing, it's, what, the Hybrid Crossfire thing? It works on an ATi/AMD Chipset in conjunction with an ATi discrete GPU, and that came out much later (...and I am sure some enterprising fellow tried to get an nVidia GPU to work with it).

Last edited by ragefury32 on 2020-03-11, 16:12. Edited 2 times in total.

Reply 25 of 74, by Standard Def Steve

Posted on 2020-03-10, 19:02

Rank: Oldbie
Posts: 1483
Joined: 2012-09-15, 08:04
Location: Canada!

dionb wrote on 2020-03-09, 22:50:

They didn't.

As mentioned above, the memory controllers have become much smarter, only taking bandwidth when needed - but if you stress both CPU and GPU at the same time, CPU will still only have half. What has changed is the dependence of the CPU on memory bandwidth - huge caches have made the impact smaller.

Good to know, good to know. So I guess that when I was running the AIDA64 bench with the IGP still active, the static screen was allowing the mem controller to allocate all bandwidth to the CPU. It would be interesting to see how dragging around a Chrome window would affect the bandwidth numbers - IGP vs PCIe. But I'm far too lazy to yank out the video card, change drivers, etc. just to test this. 😜

Looking at some of my old SuperPi results, the SiS IGP was definitely hurting my P4's FPU performance. 1M completion times were around 12% shorter with an AGP card installed.

On an i7-4790, SuperPi scores w/ the IGP enabled are identical to those generated with a GTX-1660 Ti installed. I guess most of SuperPi is able to squeeze itself into an 8 meg L3.
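To put rough numbers on the static-screen point above, here is a minimal sketch comparing display refresh (scanout) traffic with theoretical peak memory bandwidth. The resolutions, refresh rates and the single-channel DDR-266 setup for the P4-era box are illustrative assumptions, not measurements from the machines in this thread, and a real IGP adds rendering traffic on top of scanout.

```python
def scanout_bandwidth(width, height, bytes_per_pixel, refresh_hz):
    """Bytes per second read from RAM just to refresh the screen."""
    return width * height * bytes_per_pixel * refresh_hz


def peak_memory_bandwidth(mega_transfers_per_s, bus_width_bits, channels=1):
    """Theoretical peak in bytes per second (sustained rates are lower)."""
    return mega_transfers_per_s * 1_000_000 * (bus_width_bits // 8) * channels


# Assumed P4-era setup: 1024x768, 32bpp, 85 Hz on single-channel DDR-266.
old_scanout = scanout_bandwidth(1024, 768, 4, 85)
old_peak = peak_memory_bandwidth(266, 64)

# Assumed modern setup: 1920x1080, 32bpp, 60 Hz on dual-channel DDR3-1600.
new_scanout = scanout_bandwidth(1920, 1080, 4, 60)
new_peak = peak_memory_bandwidth(1600, 64, channels=2)

for label, scanout, peak in (
    ("P4-era IGP", old_scanout, old_peak),
    ("Haswell IGP", new_scanout, new_peak),
):
    share = scanout / peak * 100
    print(f"{label}: scanout {scanout / 1e6:.0f} MB/s = {share:.1f}% of peak")
```

Under these assumptions, scanout alone takes roughly an eighth of the old machine's peak bandwidth but only about 2% on the dual-channel DDR3-1600 system, which would fit with the IGP being barely visible in a memory benchmark while the screen is static.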

"A little sign-in here, a touch of WiFi there..."

Reply 26 of 74, by pentiumspeed

Posted on 2020-03-10, 19:41

Rank: l33t
Posts: 3197
Joined: 2017-05-17, 23:17
Location: Great Northern: Canada.

On any i815 motherboard that has both the IGP and an AGP slot, you can purchase a 4MB AIMM module and plug it into the AGP slot; it's not just a Compaq thing.

Cheers,

Great Northern aka Canada.

Reply 27 of 74, by Swiego

Posted on 2020-03-10, 21:21

Rank: Member
Posts: 166
Joined: 2019-09-03, 05:34
Location: SF Bay Area

FWIW I have a Deskpro XE 560 which has Compaq's QVision 1280 integrated onto the motherboard. It's blazing fast in DOS and Windows. I found a couple of old magazine articles suggesting it was connected to the CPU via some proprietary bus different from VLB. It's too bad there's so little information easily attainable about the design of some of these old systems.

Reply 28 of 74, by dionb

Posted on 2020-03-11, 14:22

Rank: l33t++
Posts: 6483
Joined: 2017-12-23, 15:35
Location: Amsterdam

Swiego wrote on 2020-03-10, 21:21:

FWIW I have a Deskpro XE 560 which has Compaq's QVision 1280 integrated onto the motherboard. It's blazing fast in DOS and Windows. I found a couple of old magazine articles suggesting it was connected to the CPU via some proprietary bus different from VLB. It's too bad there's so little information easily attainable about the design of some of these old systems.

Onboard != integrated.

Exactly where you draw the line depends on whether you focus on shared memory access (UMA) or on actually being on the same die. Either way, those QVision chips had their own memory and were separate chips, so not integrated according to any definition. IIRC the XE 560 had an EISA bus, and the QVision chips were originally EISA chips (PCI versions came later), so I guess that's how it was connected. EISA is anything but proprietary; it was the first truly open standard after ISA.

Reply 29 of 74, by pentiumspeed

Posted on 2020-03-11, 20:13

Rank: l33t
Posts: 3197
Joined: 2017-05-17, 23:17
Location: Great Northern: Canada.

Correct,

An integrated GPU is part of a northbridge "chipset" that sits between memory and the CPU. The other way is for the GPU to be integrated into the CPU itself.
Keep in mind that AMD integrated a real GPU into their CPUs, somewhat decent if you buy a high-end APU (FM1/FM2), like the A10-5800K or 6800K for example.

An onboard GPU is an actual GPU IC that would usually sit on an add-on card but is soldered onto the motherboard instead. That saves the BOM cost of a metal bracket and a separate PCB.

Cheers,

Reply 30 of 74, by swaaye

Posted on 2020-03-12, 06:01

Rank: l33t++
Posts: 8317
Joined: 2002-07-22, 21:24

I think the modern consoles are the most interesting IGPs. Though I feel like the PS4 and XBox One are more like fairly beefy GPUs with little tablet CPUs built-in. 🤣

The PC chips have different market realities and priorities. I do enjoy seeing what they can do though. More than adequate for older games most of the time. Even Intel IGPs are excellent.

Reply 31 of 74, by diagon_swarm

Posted on 2020-03-12, 11:03

Rank: Newbie
Posts: 51
Joined: 2014-01-17, 10:02
Location: Czech Republic

kainiakaria wrote on 2020-03-05, 17:09:

Before ATI got into making mobile laptop video chipsets you had video chipsets like the Cirrus Logic GD7543, Cirrus Logic GD7548 or Trident Cyber9397 which were standard fare for integrated video chipsets on desktop computers from 1996-1997. These chipsets usually had 1 MB to 2 MB of VRAM available. Socket 7 based HP Vectra Intel Pentium-Pentium MMX based machines had Cirrus Logic and S3 integrated video chipsets. ATI 3D Rage LT PRO was one of ATI's greatest contributions to integrated video chipsets back in 1997-1998. Wikipedia says that the 3D Rage LT (aka Mach64 LT) was often implemented on motherboards and in mobile applications like notebook computers. This late 1996 chip was very similar to the 3D Rage II and supported the same application coding. It integrated a low-voltage differential signaling (LVDS) transmitter for notebook LCDs and advanced power management (block-by-block power control). The 3D RAGE LT PRO, based on the 3D RAGE PRO, was the very first mobile GPU to use AGP.

The 3D Rage LT Pro offered Filtered Ratiometric Expansion, which automatically adjusted images to full-screen size. ATI's ImpacTV2+ is integrated with the 3D RAGE LT PRO chip to support multi-screen viewing; i.e., simultaneous outputs to TV, CRT and LCD. In addition, the RAGE LT PRO can drive two displays with different images and/or refresh rates with the use of integrated dual, independent CRT controllers. The 3D Rage LT Pro was often used in desktop video cards that had a VESA Digital Flat Panel port to drive some desktop LCD monitors digitally.

There is a lot of confusion in your text. As was stated already - most of the chipsets you mentioned are not IGPs.

Another issue is the dates from Wikipedia. It was very common to introduce a chip far sooner than real hardware was available. 3D Rage LT (Rage II for mobile use; definitely not integrated) was introduced in late 1996, but it was not available before 1998. In May 1998, Apple used this chip for a few months and then replaced it quietly with Rage LT Pro (Q4 1998). I'm not sure if there was any other use of 3D Rage LT.

Rage LT Pro improved 3D performance in laptops. It was better (2.5x) than S3 Virge MX - the previous king in this segment of the market. It was also the first card to support filtered scaling. This chip was really a big contribution to laptop GPUs, but not to integrated GPUs (there were no consumer PC laptop/mobo 3D accelerators with shared memory yet). In Q4 1999, there were laptops with S3's new mobile version of Savage4, which was the fastest mobile GPU. For a short time period... In Q1 2000, ATI released Mobility 128/M4 (Rage Pro 128) and the real battle started (GeForce2Go was released in 2001).

IGPs were, however, a completely different segment from all of this - they were a low-cost, low-end alternative to these mobile chips. Thus, it is obvious that they were even slower.

Vintage computers / SGI / PC and UNIX workstation OpenGL performance comparison

Reply 32 of 74, by diagon_swarm

Posted on 2020-03-12, 11:26

Rank: Newbie
Posts: 51
Joined: 2014-01-17, 10:02
Location: Czech Republic

swaaye wrote on 2020-03-12, 06:01:

I think the modern consoles are the most interesting IGPs. Though I feel like the PS4 and XBox One are more like fairly beefy GPUs with little tablet CPUs built-in. 🤣

Even certain old consoles were a good example of fast IGPs.

As the main priority for PC IGPs was to make the design as cheap as possible, it is obvious that they were slow (being the performance baseline for other solutions). However, even in the 1990s, there were fast high-end IGP solutions... even for Windows (NT). Meet the SGI 320 mainboard from late 1998 / early 1999 😀 ...

The attachment sgi-320-mainboard.jpg is no longer available

This was a fast UMA (unified memory architecture) solution, where you could do zero-copy things like using TV input as a texture on a 3D model and sending it to TV output. The graphics core was integrated in the main chip together with a high-throughput memory switch. Btw the core had two pixel pipelines (each with one TMU), worked natively in 32bpp and was able to render 180 Mpix/s on textured & smooth-shaded polygons (for comparison: Riva TNT did 150 Mpix/s in high-color). It even had a geometry unit able to process 3.5 million vertices/s.
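For a feel of what those peak rates mean per frame, here is a back-of-the-envelope sketch. The 180 Mpix/s, two-pipeline and 3.5 million vertices/s figures come from the paragraph above; the 1280x1024 resolution and 60 fps target are assumptions picked purely for illustration, and peak rates are rarely sustained in practice.

```python
# Figures quoted in the post for the SGI 320 graphics core.
FILL_RATE_PIX_PER_S = 180_000_000
VERTEX_RATE_PER_S = 3_500_000
PIXEL_PIPELINES = 2

# Assumed display target (not from the post).
WIDTH, HEIGHT, FPS = 1280, 1024, 60

# Peak pixels per second delivered by each pipeline.
per_pipeline = FILL_RATE_PIX_PER_S // PIXEL_PIPELINES

# Peak vertices available to each frame at the assumed frame rate.
vertices_per_frame = VERTEX_RATE_PER_S / FPS

# How many times the whole screen could be redrawn per frame at peak fill rate.
overdraw = FILL_RATE_PIX_PER_S / (WIDTH * HEIGHT * FPS)

print(f"{per_pipeline / 1e6:.0f} Mpix/s per pipeline")
print(f"{vertices_per_frame:.0f} vertices per frame at {FPS} fps")
print(f"{overdraw:.2f}x full-screen overdraw at {WIDTH}x{HEIGHT}")
```

So at peak rates the core could touch every pixel of an assumed 1280x1024 frame a little over twice per frame at 60 fps, with a budget of under 60,000 vertices per frame.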

Btw John Carmack had one during the development of Quake III

Reply 33 of 74, by ragefury32

Posted on 2020-03-19, 03:49

Rank: Oldbie
Posts: 571
Joined: 2019-10-18, 13:59

diagon_swarm wrote on 2020-03-12, 11:26:

swaaye wrote on 2020-03-12, 06:01:

I think the modern consoles are the most interesting IGPs. Though I feel like the PS4 and XBox One are more like fairly beefy GPUs with little tablet CPUs built-in. 🤣

Even certain old consoles were a good example of fast IGPs.

As the main priority for PC IGPs was to make the design as cheap as possible, it is obvious that they were slow (being the performance baseline for other solutions). However, even in the 1990s, there were fast high-end IGP solutions... even for Windows (NT). Meet the SGI 320 mainboard from late 1998 / early 1999 😀 ...
sgi-320-mainboard.jpg

This was a fast UMA (unified memory architecture) solution, where you could do zero-copy things like using TV input as a texture on a 3D model and sending it to TV output. The graphics core was integrated in the main chip together with a high-throughput memory switch. Btw the core had two pixel pipelines (each with one TMU), worked natively in 32bpp and was able to render 180 Mpix/s on textured & smooth-shaded polygons (for comparison: Riva TNT did 150 Mpix/s in high-color). It even had a geometry unit able to process 3.5 million vertices/s.

Btw John Carmack had one during the development of Quake III

Oh yeah - the SGI Visual Workstation 320/540 - the very definition of selling people expensive dead-end solutions. I still remember an old gig trying to give a bunch of SGI VW 320s away to anyone who wanted one back in 2004, and finding only one taker.
Reasons?
a) They are big, rather heavy and power-hungry machines. Well, at least for a Coppermine PIII. Their expansion parts were also (at that time) rather expensive and proprietary.
b) They are not really PC compatible - they will run Windows 2000 but via the ARC HAL...which was not carried over to production copies of Windows XP. So Windows 2000 or Linux.
c) They work okay in Linux, but the Cobalt UMA video chipset (similar in concept to the one on the SGI O2...which was a nice little MIPS box) is not supported, so what you have is this awkward Pentium III that takes up too much room and consumes too much power for what it can do. You could slap a Geforce 2 or 3 in there, but then, what's the point of having the 320/540?
d) An Athlon with a Geforce3 will outdo it, performance-wise (at least for assets that can fit in its VRAM), and I wager that even an Athlon with an nForce IGP 2 years later will be somewhat competitive against it. If I remember Carmack's .plan back when he had the SGI VW320, he lamented how quickly non-SGI tech caught up to it.
SGI was supposed to do a tech refresh every 18 months and push them as the x86 outgrowth of the O2...too bad it ran late, and by the time they delivered, the fast-moving state of the art in 3D graphics had already blown by it. SGI supposedly cancelled the entire product line after a single iteration and laid off the engineers...who moved on to nVidia. The subsequent SGI Visual Workstation offerings were essentially standard PCs with nVidia cards.

Oh yeah, one of my buddies did pick up a 320 (at the aforementioned corporate giveaway) along with an option card to interface with a 1600SW LCD panel, which uses this dead standard called OpenLDI. When he tried to use that 1600SW monitor with a standard PC...well, cue the hilarity of looking for a card that supported the wrong side of the OpenLDI/DVI standards fight. The monitor was nice. The need to pick up a rare device that was on the wrong side of obsolete...wasn't.

In terms of good UMA implementation, one can look at game consoles - the Nintendo 64 was a UMA design, as was the Microsoft X-Box, and the PS4.

Eh...technically the CPU on the PS4 is more powerful than merely "a tablet processor" (I also don't recall AMD having much luck pushing their Jaguar APUs to tablet vendors)....the same CPU core family was used in 40 Gigabit data center switches from Arista as their supervisory engines. They were certainly competitive against the Intel Atom C2000 series used in other designs.

If you want to play with AMD's UMA/Zero-copy DMA tech, you can pick up an HP t620 Plus or t730 thin client - they make for compelling and curiously competent XP/Win7 gaming machines on the cheap.

Last edited by ragefury32 on 2020-03-20, 00:59. Edited 3 times in total.

Reply 34 of 74, by SPBHM

Posted on 2020-03-19, 07:30

Rank: Oldbie
Posts: 876
Joined: 2012-10-26, 15:59
Location: Brazil

One difference is that in a PS4 the CPU portion can only access something around 20GB/s max, while the GPU always has 100+GB/s.
That's not like the behavior of current PC IGPs.

Reply 35 of 74, by ragefury32

Posted on 2020-03-19, 15:58

Rank: Oldbie
Posts: 571
Joined: 2019-10-18, 13:59

SPBHM wrote on 2020-03-19, 07:30:

One difference is that in a PS4 the CPU portion can only access something around 20GB/s max, while the GPU always has 100+GB/s.
That's not like the behavior of current PC IGPs.

Well, not entirely, and that's not the only difference -
- The Liverpool SoC of the original PS4 came out in 2013, and it's something like this:
8 Jaguar cores, a GCN 2 based GPU with 18 Compute Units, which gives you 1152 Unified Shader Processors, 72 Texture Mapping Units, 32 Render Output Units, all accounting for around 1800 Gigaflops, and 8GB of GDDR5 unified RAM on a 256 bit data path with a potential max bandwidth of 176GBytes/sec.
I have no idea what the CPU interconnect with the GDDR5 RAM looks like, but it should not be something as low as 30GByte/sec. I don't know the thermal envelope either, but I am guessing at least 65w.

The AMD "stock" APU from 2013 closest to the Liverpool is something like a GX-420CA, which looks something like this:
4 Jaguar cores, a GCN 1 based GPU with 2 Compute Units, which gives you 128 Unified Shader Processors, 8 Texture Mapping Units, 4 Render Output Units, all accounting for around 150 Gigaflops, and it can be connected to a maximum of 32GB DDR3-1600 unified RAM via a single channel, giving you approximately 12.8GBytes/sec
The entire thing runs on a 35w TDP envelope.

Now, for a fairer comparison (since the GPU on that GX-420CA is kinda old and the CU count is like 1/10th of the Liverpool), we can step one year ahead and look at an embedded SoC from AMD, the RX-427BB (it's also sold as an FX-7600P and made it into a few Asus laptops), which is the closest to Liverpool.
4 Steamroller cores, a GCN 2 based GPU with 8 Compute Units, which gives you 512 Unified Shader Processors, 32 Texture Mapping Units, 8 Render Output Units, all accounting for around 615 Gigaflops, and it can be connected to a maximum of 32GB DDR3-2133 unified RAM via dual channels, giving you approximately 34 GBytes/sec
The entire thing runs on a 35w TDP envelope.

If you look at the RX-427BB, the overall GPU output is about even compared to (or slightly better than) the Intel Iris Pro 5200 on my Haswell machine (that Iris Pro also has a 128MB L4 cache that works rather like the high bandwidth VRAM in most GPUs). To get embedded graphics close to that of the original PS4 just on a pure compute horsepower standpoint, you'll need to look at, say, the Ryzen embedded V1807B with the Vega 11 embedded GPU. That's 11 Compute Units, which gives you 704 Unified Shader Processors, 44 Texture Mapping Units, 16 Render Output Units, all accounting for around 1850 Gigaflops, and it can be connected to a maximum of 64GB DDR4-2400 unified RAM via dual channels, giving you approximately 40 GBytes/sec. It's still nowhere near the memory bandwidth of the PS4 Liverpool, though.
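As a sanity check on the shader and TMU counts listed for the four chips above, here is a small sketch. It assumes the usual GCN layout of 64 shader processors and 4 TMUs per compute unit, and 2 FLOPs per shader per clock (one fused multiply-add); the implied clocks are back-derived from the rounded Gigaflops figures in the post, so treat them as rough estimates rather than official specs.

```python
# Assumed GCN layout (not stated in the post): 64 shaders and 4 TMUs per
# compute unit, 2 FLOPs per shader per clock (one fused multiply-add).
SHADERS_PER_CU = 64
TMUS_PER_CU = 4
FLOPS_PER_SHADER_PER_CLOCK = 2

# (compute units, approximate GFLOPS) as quoted in the post.
CHIPS = {
    "PS4 Liverpool": (18, 1800),
    "GX-420CA": (2, 150),
    "RX-427BB": (8, 615),
    "V1807B": (11, 1850),
}


def derive(compute_units, gflops):
    """Return (shaders, TMUs, implied GPU clock in MHz) for a GCN part."""
    shaders = compute_units * SHADERS_PER_CU
    tmus = compute_units * TMUS_PER_CU
    implied_clock_mhz = gflops * 1000 / (shaders * FLOPS_PER_SHADER_PER_CLOCK)
    return shaders, tmus, implied_clock_mhz


for name, (cus, gflops) in CHIPS.items():
    shaders, tmus, mhz = derive(cus, gflops)
    print(f"{name}: {shaders} shaders, {tmus} TMUs, ~{mhz:.0f} MHz implied")
```

The shader and TMU counts come out the same as the ones quoted; the implied clocks are only as good as the rounded Gigaflops values they are derived from.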

Why yes, the PS4 is overpowered as a UMA design, but it's designed that way - it is after all a gaming oriented platform. You could also say the same about the 4GB of HBM2 memory embedded on-package in my Kaby Lake-G machine (that one is a Polaris 22 with some Vega features).
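The bandwidth gap is easy to reproduce with the standard peak formula: transfer rate times bus width times channel count. A minimal sketch follows; the 64-bit width per DDR3/DDR4 channel is standard, while the 5500 MT/s effective GDDR5 rate for the PS4 is an assumption chosen because it matches the 176 GBytes/sec quoted above. These are theoretical peaks, so sustained figures will be lower.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits, channels=1):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * (bus_width_bits / 8) * channels / 1000


# (MT/s, bus width in bits per channel, channels)
CONFIGS = {
    "GX-420CA, DDR3-1600 single channel": (1600, 64, 1),
    "RX-427BB, DDR3-2133 dual channel": (2133, 64, 2),
    "V1807B, DDR4-2400 dual channel": (2400, 64, 2),
    # Assumed effective data rate for the PS4's 256-bit GDDR5.
    "PS4 Liverpool, GDDR5 256-bit": (5500, 256, 1),
}

for name, (rate, width, channels) in CONFIGS.items():
    print(f"{name}: {peak_bandwidth_gb_s(rate, width, channels):.1f} GB/s")
```

Even the dual-channel DDR4-2400 setup lands at well under a quarter of the PS4's peak under these assumptions.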

Reply 36 of 74, by diagon_swarm

Posted on 2020-03-21, 11:07

Rank: Newbie
Posts: 51
Joined: 2014-01-17, 10:02
Location: Czech Republic

ragefury32 wrote on 2020-03-19, 03:49:

Oh yeah - the SGI Visual Workstation 320/540 - the very definition of selling people expensive dead-end solutions. I still remember an old gig trying to give a bunch of SGI VW 320s away to anyone who wanted one back in 2004, and finding only one taker.

Well, any 1998 workstation was useless in 2004 as a general purpose computer. The technology advanced so much during those years. SGI VW320 was in fact not expensive when compared with other true workstations on the market, and it offered the same or better performance than, for example, Intergraph Intense 3D Wildcat 4000 (with the Lynx5 geometry card).

GeForce 256 (and the first Quadro based on it) was the game changer, but that happened a year later. Anyone who really needed good workstation performance had to go with special workstations in the late 90s. SGI VW320 was released half a year before NVIDIA Riva TNT2.

A big problem with Visual Workstations was that they were an unwanted baby inside SGI. It was soon obvious that there would be very little effort put in improving current models and developing a true successor. With the overall declining trust in SGI by the market, this was not a good starting point for the product. Anyway, it didn't matter much as NVIDIA killed all others not much later with the Quadro (after Windows NT and Pentium Pro/II, this was the last missing piece for generic hardware on the way to dominate the workstation market).

Reply 37 of 74, by ragefury32

Posted on 2020-03-21, 16:47

Rank: Oldbie
Posts: 571
Joined: 2019-10-18, 13:59

diagon_swarm wrote on 2020-03-21, 11:07:

ragefury32 wrote on 2020-03-19, 03:49:

Oh yeah - the SGI Visual Workstation 320/540 - the very definition of selling people expensive dead-end solutions. I still remember an old gig trying to give a bunch of SGI VW 320s away to anyone who wanted one back in 2004, and finding only one taker.

Well, any 1998 workstation was useless in 2004 as a general purpose computer. The technology advanced so much during those years. SGI VW320 was in fact not expensive when compared with other true workstations on the market, and it offered the same or better performance than, for example, Intergraph Intense 3D Wildcat 4000 (with the Lynx5 geometry card).

GeForce 256 (and the first Quadro based on it) was the game changer, but that happened a year later. Anyone who really needed good workstation performance had to go with special workstations in the late 90s. SGI VW320 was released half a year before NVIDIA Riva TNT2.

A big problem with Visual Workstations was that they were an unwanted baby inside SGI. It was soon obvious that there would be very little effort put in improving current models and developing a true successor. With the overall declining trust in SGI by the market, this was not a good starting point for the product. Anyway, it didn't matter much as NVIDIA killed all others not much later with the Quadro (after Windows NT and Pentium Pro/II, this was the last missing piece for generic hardware on the way to dominate the workstation market).

Well, the problem facing SGI was that they spent millions of dollars developing the Cobalt chipset (and delivered it late), only to see PC makers with commodity parts match or beat them at their own game (see contemporary review here: https://books.google.cz/books?id=93nBwQ5XIAwC … epage&q&f=false) - the reviews did also mention that when compared to a dedicated discrete card, the SGI Cobalt chipset came up short on apps like Solidworks and/or Pro/E, but complimented its ability to do videos-as-textures well (which is probably more for broadcast graphics, a rather niche role). Note that the reviews compared the SGI 540 versus its contemporaries, and it was just barely on parity, price-wise. I am not sure how a comparison between the 320 and cut-down versions of its competitors might look, but I am willing to bet that the SGI won't come out looking all that compelling, either.
The Visual Workstations were supposed to be the replacement for the O2, but the market just kinda shrugged.

Then of course, there is the entire debacle of nVidia versus SGI - Jensen Huang had a tendency to snipe talent away from SGI, and SGI sued nVidia in '98 for patent infringement...which, due to SGI gathering a mountain of debt, was concluded in nVidia's favor (the companies decided to swap patent portfolios and SGI transferred engineers from the Visual Workstation team to nVidia)...which was described here https://www.eetimes.com/sgi-graphics-team-moves-to-nvidia/. It does also explain why the successors to the Visual Workstation line were a bunch of generic PCs with nVidia Quadros - that business was essentially cut and management at that time decided that it was easier and cheaper just to offer PCs with nVidia cards and keep engineering to a minimum. The Visual Workstation 320/540 line was like a red-headed stepchild to the rest of SGI. It was not quite as custom as their hardcore MIPS based hardware (like the Octane 2 MXE/VPro or the Origins), SGI was about to abandon MIPS for Itanium (yeah, for a while it looked as if 64-bit x86 would be VLIW), the 320/540 hardware wasn't commodity enough to allow easy compatibility/upgrades (those weird credit-card-like DIMM modules were already a source of gritting and gnashing of teeth, and anyone who dealt with SGI knows about their custom nature) and no one knew why SGI was going Windows NT instead of improving IRIX. Well, they were hemorrhaging money on R&D and not making enough back, and the entire operation was sinking fast.

You can say that nVidia became SGI on the bones of Wintel, and I am willing to say that the engineers who worked on NV2A (the UMA chipset for the original X-Box) and nForce (the nVidia Athlon chipsets with the Geforce2 GPU integrated) probably cut their teeth at SGI working on O2s and Visual Workstations, much like how ArtX (the guys who did the Flipper chip for the Nintendo GameCube, later acquired by ATi, and who went on to work on the Radeon 9700) had SGI engineering roots as well.

Reply 38 of 74, by ragefury32

Posted on 2020-03-21, 17:19

Rank: Oldbie
Posts: 571
Joined: 2019-10-18, 13:59

diagon_swarm wrote on 2020-03-12, 11:03:

It was very common to introduce the chip far sooner than real hardware was available. 3D Rage LT (Rage II for mobile use; definitely not integrated) was introduced in late 1996, but it was not available before 1998. In May 1998, Apple used this chip for a few months and then replaced it quietly with Rage LT Pro (Q4 1998). I'm not sure if there was any other use of 3D Rage LT.

Rage LT Pro improved 3D performance in laptops. It was better (2.5x) than S3 Virge MX - the previous king in this segment of the market. It was also the first card to support filtered scaling. This chip was really a big contribution to laptop GPUs, but not to integrated GPUs (there were no consumer PC laptop/mobo 3D accelerators with shared memory yet). In Q4 1999, there were laptops with S3's new mobile version of Savage4, which was the fastest mobile GPU. For a short time period... In Q1 2000, ATI released Mobility 128/M4 (Rage Pro 128) and the real battle started (GeForce2Go was released in 2001).

IGPs were, however, a completely different segment from all of this - they were a low-cost, low-end alternative to these mobile chips. Thus, it is obvious that they were even slower.

As far as I am aware, the Rage LT was only used in a single volume-produced laptop - the Mainstreet/Wallstreet PowerBook G3. Apple did replace it with the Rage Pro LT for the Lombard PowerBook, but due to its use of the old Grackle northbridge, it was connected via PCI rather than AGP. The first clamshell iBook also used the Rage Pro LT, but since it used the newer Uninorth northbridge, it was connected to the system via an AGP 2x port. Of course, the problem with ATi GPUs back then was that the Rage Pro LT/Mobility M/M1/P names were all used rather interchangeably in the trade press, so unless you saw a photo of one on the logic board, you were never really sure which one you were dealing with. They are broadly similar in terms of 3D performance; most differences are in the IDCT compensation/video circuitry.

As for filtered scaling, I am not 100% sure but I think the Neomagic 128/256 lines also feature that option. The Virge MX wasn't really based on the original Virge but on the VX2. Yeah, it was not good whatsoever in 3D (the same can be said about the ATi Rage Pro line...mediocre at best), but its strength was in DOS compatibility. Its successor, the SavageIX/MX line, was not based on the Savage4; it was more of a bugfixed Savage3D. Those chips offered okay performance for 1999, and good DOS support. The laptop chipset based on the Savage4 was the ProSavage, but that one was found only on a single volume-produced machine - the IBM Thinkpad T23 (which was a purely AC97 machine). The Rage 128 Mobility came in 2 variants, the M3 (2x AGP) and the M4 (4x AGP) - most M3s only feature 8MB of VRAM, while M4s tend to come with 16MB or above. Then came the Geforce2Go, followed by the Radeon Mobility M6 in 2001.

Last edited by ragefury32 on 2020-03-22, 01:24. Edited 1 time in total.

Reply 39 of 74, by diagon_swarm

Posted on 2020-03-21, 19:45

diagon_swarm Offline

Rank Newbie

Rank: Newbie
Posts: 51
Joined: 2014-01-17, 10:02
Location: Czech Republic

ragefury32 wrote on 2020-03-21, 16:47:

Well, the problem facing SGI was that they spent millions of dollars developing the Cobalt chipset (delivered late), just to see PC makers with commodity parts match or beat them at their own game (see contemporary review here: https://books.google.cz/books?id=93nBwQ5XIAwC … epage&q&f=false) - the reviews did also mention that when compared to a dedicated discrete card, the SGI Cobalt chipset came up short on apps like Solidworks and/or Pro/E, but complimented its ability to do videos-as-textures well (which is probably more for broadcast graphics, a rather niche role). Note that the reviews compared the SGI 540 versus its contemporaries, and it was just barely on parity, price-wise. I am not sure how a comparison between the 320 and cut-down versions of its competitors might look, but I am willing to bet that the SGI won't come up all that compelling, either.

The VW 320 was a far more interesting product than the 540 (the 540 just had more PCI-X slots and double the number of CPU slots). You could get a VW 320 for under $4,000 and get 3D and geometry performance comparable to $10,000 workstations (the 3D core is exactly the same in the 320 and 540). That was not a bad deal. In addition, the texture fill rate was superior to any PC workstation card available at that time. Btw, I've measured many professional cards from the 90s - http://swarm.cz/gpubench/_GPUbench-results.htm (it's not in an easily readable form but provides multiple useful details) ... the hi-res texture performance was bad with Intergraph and 3Dlabs products.

ragefury32 wrote on 2020-03-21, 16:47:

...and no one knows why SGI is going Windows NT instead of improving IRIX.

When I spoke to my colleagues who had worked at SGI since the 90s, they saw it differently. Customers liked IRIX a lot, but they bought Windows NT machines instead. The shift was quick once multiple 3D/CAD/CAE packages were ported to NT and IRIX lost its exclusivity. Another problem for IRIX was Oracle - they decided to support only, let's say, the five most popular UNIX systems and stopped supporting the others (IRIX was ~8th). About 40% of SGI's large systems were sold to run Oracle software, so it was clear to SGI that the days of IRIX would soon be over. The idea of shifting to industry-standard operating systems was accepted better by customers than by SGI itself.

ragefury32 wrote on 2020-03-21, 16:47:

As far as I am aware, the Rage LT was only used in a single volume-produced laptop - the Mainstreet/Wallstreet PowerBook G3. Apple did replace it with the Rage Pro LT for the Lombard PowerBook, but due to its use of the old Grackle northbridge, it was connected via PCI rather than AGP. The first clamshell powerbook also used the Rage Pro LT, but since it uses the newer Uninorth northbridge, it was connected to the system via an AGP 2x port. Of course, the problem with ATi GPUs back then was that the Rage Pro LT/Mobility M/M1/P names were all used rather interchangeably in the trade press, so unless you saw a photo of one, you could not really be sure which one you were dealing with. They are generally similar in terms of 3D performance; most differences are in the IDCT compensation/video circuitry.

As for filtered scaling, I am not 100% sure but I think the Neomagic 128/256 lines also feature that option. The Virge MX wasn't really based on the original Virge but on the VX2. Yeah, it was not good whatsoever in 3D (the same can be said about the ATi Rage Pro line...mediocre at best), but its strength was in DOS compatibility. Its successor, the SavageIX/MX line, was not based on the Savage4; it was more of a bugfixed Savage3D. Those chips offered okay performance for 1999, and good DOS support. The laptop chipset based on the Savage4 was the ProSavage, but that one was found only on a single volume-produced machine - the IBM Thinkpad T23 (which was a purely AC97 machine). The Rage 128 Mobility came in 2 variants, the M3 (2x AGP) and the M4 (4x AGP) - most M3s only feature 8MB of VRAM, while M4s tend to come with 16MB or above. Then came the Geforce2Go, followed by the Radeon Mobility M6 in 2001.

Rage LT - That's what I think, but I have never found evidence to support it (other than that I haven't found any other laptop with this chip... but that could be like EGA-equipped laptops... I thought that only a few were made and then I found a lot of them from both well-known brands and OEMs).

Rage LT Pro and Mobility - Not too long ago, I checked the behavior of the first-gen ATI Mobility chips and I was surprised that some 3D parts behave differently (minor things in blending), so it seems they fixed some issues (not the bad mip-mapping though). While the LT Pro ran at 75MHz core and 100MHz memory, the Mobility M1 ran at 83MHz for both (it was faster without textures and slower with hi-res textures & blending).
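
To put rough numbers on that clock trade-off, here is a small Python sketch. It only compares the clock figures quoted above; the idea that untextured fill follows the core clock while hi-res texturing and blending follow the memory clock is a simplifying assumption on my part, not something measured here.

```python
# Clocks quoted in the post above, in MHz.
LT_PRO = {"core": 75.0, "mem": 100.0}
MOBILITY_M1 = {"core": 83.0, "mem": 83.0}


def clock_ratio(new, old, domain):
    """Return the new/old clock ratio for one domain ('core' or 'mem')."""
    return new[domain] / old[domain]


# Assumption: untextured work is core-bound, textured/blended work is memory-bound.
core_gain = clock_ratio(MOBILITY_M1, LT_PRO, "core")
mem_gain = clock_ratio(MOBILITY_M1, LT_PRO, "mem")

print(f"core: {core_gain:.2f}x, mem: {mem_gain:.2f}x")
# prints "core: 1.11x, mem: 0.83x"
```

So on clocks alone the M1 would have roughly 11% more core throughput and 17% less memory throughput than the LT Pro, which at least points in the same direction as the behavior described.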

I understand that the press didn't care about M/P/... models of Mobility chips. The main difference was just in the package type - some needed separate memory chips to be connected, some had small memory chips on the graphics chip package, and some had half of the memory integrated in the package (64-bit), allowing the manufacturer of the laptop to connect additional (optional) memory chips and increase the bus width to 128-bit (Apple laptops and business laptops often used the simple version with no extra memory chips and just a 64-bit memory interface during the Rage 128 Mobility era).
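
A quick sketch of why the 64-bit vs 128-bit choice matters, in Python. The 100MHz memory clock and the single transfer per clock (SDR-style memory) are placeholder assumptions for illustration only, not the actual spec of any particular Mobility package.

```python
def peak_bandwidth_mb_s(bus_width_bits, mem_clock_mhz, transfers_per_clock=1):
    """Theoretical peak memory bandwidth in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mem_clock_mhz * transfers_per_clock


# Hypothetical clock, chosen only to make the comparison concrete.
ASSUMED_MEM_CLOCK_MHZ = 100

narrow = peak_bandwidth_mb_s(64, ASSUMED_MEM_CLOCK_MHZ)   # on-package memory only
wide = peak_bandwidth_mb_s(128, ASSUMED_MEM_CLOCK_MHZ)    # with the optional extra chips

print(f"64-bit: {narrow:.0f} MB/s, 128-bit: {wide:.0f} MB/s")
# prints "64-bit: 800 MB/s, 128-bit: 1600 MB/s"
```

Whatever the real clock is, the point is the same: at a given memory clock, leaving the extra chips off halves the theoretical peak bandwidth.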

Savage4 - I'm not sure about that. The info I found was always very fuzzy. I just know that when I tested the chip myself, the per-clock performance was perfectly comparable with the desktop Savage4. If you have relevant sources, I would be happy to read about this.

Virge/MX - You mean GX2. And yes, they share the same architecture with improved video and dual-head support. Btw, any Virge DX/GX/GX2/MX has exactly the same per-clock 3D performance if they have the same type of memory (EDO is always faster than synchronous memory chips).

NeoMagic 256 doesn't support filtered scaling. I had it in a Toshiba T8000 a long time ago and scaling was super-ugly, with certain rows/columns of pixels simply doubled. Based on my findings, Rage LT Pro should be the first chip with this feature.
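
For anyone unsure what the difference looks like, here is a minimal Python sketch on a single row of pixels. It only illustrates the general idea of pixel doubling versus filtered (linearly interpolated) scaling; it is not how either chip implements it in hardware, and the function names and sample values are made up for the example.

```python
def scale_nearest(row, out_len):
    """Unfiltered scaling: every output pixel copies one source pixel,
    so some source pixels simply appear twice."""
    n = len(row)
    return [row[i * n // out_len] for i in range(out_len)]


def scale_linear(row, out_len):
    """Filtered scaling: every output pixel blends its two nearest source pixels."""
    n = len(row)
    out = []
    for i in range(out_len):
        # Map the output index onto the source row's coordinate range.
        pos = i * (n - 1) / (out_len - 1) if out_len > 1 else 0.0
        left = int(pos)
        right = min(left + 1, n - 1)
        frac = pos - left
        out.append(row[left] * (1 - frac) + row[right] * frac)
    return out


# Alternating dark/bright pixels, stretched from 4 to 5 pixels (a 1.25x upscale).
row = [0, 100, 0, 100]
doubled = scale_nearest(row, 5)
filtered = scale_linear(row, 5)

print(doubled)   # [0, 0, 100, 0, 100]
print(filtered)  # [0.0, 75.0, 50.0, 25.0, 100.0]
```

In the unfiltered result one pixel is just repeated, which is the uneven doubled row/column look described above; the filtered result spreads the stretch across the whole row instead.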

Vintage computers / SGI / PC and UNIX workstation OpenGL performance comparison
