VOGONS


First post, by Ozzuneoj

Rank: l33t

Everywhere I look I see NV3x-based Quadro FX cards listed as having twice the ROPs (Render Output Units) of their equivalent GeForce FX cousins. It seems like that has to be wrong.

Compare:
https://www.techpowerup.com/gpu-specs/?genera … %20FX&sort=name
https://www.techpowerup.com/gpu-specs/?genera … %20FX&sort=name

https://en.wikipedia.org/wiki/List_of_N ... xx)_series
https://en.wikipedia.org/wiki/List_of_Nvidia_ … uadro_FX_series

Obviously, if Nvidia had professional cards based on nearly identical chips from the GeForce range, they would have to be nearly identical in core configuration. In fact, they could be modded from one to the other! If a softmod could bump the GeForce FX 5800 Ultra from 4 to 8 ROPs (meaning twice the pixel fill rate), don't you think people would have DONE that, and made the card actually good? Why wouldn't Nvidia have released a GeForce FX with twice the pixel fill rate in the first place? Derp... because they never created an FX chip with those capabilities. 😵

Reviews generally didn't get into the core configuration of Quadro cards back then, but they certainly did for their Geforce equivalents:
https://techreport.com/review/5797/nvidia-gef … 950-ultra-gpu/2
https://techreport.com/review/4966/nvidia-gef … 800-ultra-gpu/4

Clearly, the pixel fill rate is half the texel fill rate (4 ROPs with 2 texture mapping units each). Which is what the wiki and the TechPowerUp database show... so why are the nearly identical Quadro FX cards shown as having the same pixel and texel fill rate? Back in the day this was a huge deal, so it seems odd that the numbers would be incorrect basically everywhere online now.
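Just to pin down the arithmetic at stake, here's a minimal sketch (Python). The 500 MHz / 4 ROP / 2 TMU-per-pipe numbers are the FX 5800 Ultra figures from the reviews above; the 8-ROP case is the hypothetical configuration the Quadro FX listings imply:

```python
def fill_rates(core_mhz, rops, tmus_per_rop):
    """Return (pixel, texel) fill rates in Mpixels/s and Mtexels/s."""
    pixel = core_mhz * rops                  # one pixel per ROP per clock
    texel = core_mhz * rops * tmus_per_rop   # one texel per TMU per clock
    return pixel, texel

# GeForce FX 5800 Ultra as a 4x2 design:
print(fill_rates(500, 4, 2))  # (2000, 4000) -> texel rate is twice the pixel rate

# A hypothetical "8 ROP" NV30, as the Quadro FX 2000 listings would imply:
print(fill_rates(500, 8, 2))  # (4000, 8000)
```

If the 8-ROP listings were right, the Quadro FX 2000 would have double the pixel fill rate of the 5800 Ultra at the same clock, which is exactly what seems implausible here.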

Then I find this:
https://h10057.www1.hp.com/ecomcat/hpcatalog/ … tion_xw8000.pdf

There's a section for the FX 2000 (GeForce FX 5800 equivalent, which has 4 ROPs), and while it doesn't break down the specs in as much detail as other cards in the document, it says "8 parallel pixel pipelines". So... what's the deal? Either the pipeline numbers were grossly misrepresented (see the 5800 TechReport article above) to the point that even corporate customers like HP were duped, or these things should actually have twice the pixel fill rate of their GeForce equivalents. I'm 90% sure it's the former, but I can't dig up any more specific information than this.

Does anyone have a Quadro FX 2000 or 3000? Some fill rate tests may help clarify this a bit.

Last edited by Ozzuneoj on 2019-03-10, 04:55. Edited 3 times in total.

Now for some blitting from the back buffer.

Reply 1 of 26, by agent_x007

Rank: Oldbie

[AIDA64 screenshot attachment]
AIDA64 :
ROPs : 4
TMUs/Pixel Shader : 2
Vertex Shaders (v2.0) : 3
Pixel Shaders (v2.0) : 4
I know it's using a forced 5950 Ultra driver, but it's the best screenshot I got.

Last edited by agent_x007 on 2019-03-09, 17:07. Edited 1 time in total.


Reply 2 of 26, by Ozzuneoj

Rank: l33t

Thanks for the information, I appreciate the data!

While the data available online seems to be mostly goofed up for the higher end cards, I tested the only NV3x-based Quadro FX I own, the 1100, which is based on the GeForce FX 5700. Everest shows 4 pixel pipelines with 1 TMU each. A higher end model should have the same number of pixel pipelines but with 2 TMUs each. In the 3DMark 2001 SE single- and multi-texture tests it gets 1040 and 1459.

In 3dmark 2000 Pro I get 1150 and 1038 for single and multi. With a real Geforce FX 5950 Ultra I get 1834 and 2363 for single and multi.

Maybe there's a better test to run, as I'm pretty sure memory bandwidth is having a significant impact on these tests.

Anyone know of a very basic but reliable pixel fill rate test?


Reply 3 of 26, by The Serpent Rider

Rank: l33t++

Everywhere I look I see NV3x based Quadro FX cards listed as having twice the ROPs (Render Output Units) of their equivalent Geforce FX cousins.

That's impossible; they are tied to the pixel pipelines.

I must be some kind of standard: the anonymous gangbanger of the 21st century.

Reply 4 of 26, by havli

Rank: Oldbie

I am by no means an expert in this matter, but perhaps some specs can be explained by the flexible ROP/TMU ratio of the NV3x architecture? IIRC it can be altered by the driver for some chips, perhaps only the high-end ones (NV30/35/38), so it can act like 8x1 or 4x2. But I am not really sure.

Btw - if someone here has a deeper understanding of the NV3x architecture, I would like to know the reason for the very different performance of the mainstream FX GPUs -> FX 5200, 5600, 5700.

Let's take these three cards as a reference:

FX 5200 => 128MB, 128bit, 250/400MHz, 4 ROPS, 4 TMU, 4ps, 1vs
FX 5600 => 128MB, 128bit, 325/550MHz, 4 ROPS, 4 TMU, 4ps, 1vs
FX 5700 => 128MB, 128bit, 425/650MHz, 4 ROPS, 4 TMU, 4ps, 3vs

and performance:
http://abload.de/img/fxetuq5.png

Most of the time the performance gain is bigger than the simple frequency increase. Also, these games aren't using DX9 (in which the second-generation FX should be better), and I doubt the improved vertex shader is the key either. So what might be the reason?
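For reference, here's a quick sketch of the clock-speed ratios between those three cards (core/memory MHz taken from the list above), which is the baseline any "performance gain bigger than frequency increase" claim gets measured against:

```python
# Core/memory clocks (MHz) from the card list quoted above.
cards = {
    "FX 5200": (250, 400),
    "FX 5600": (325, 550),
    "FX 5700": (425, 650),
}

base_core, base_mem = cards["FX 5200"]
for name, (core, mem) in cards.items():
    # Ratio of each card's clocks to the FX 5200 baseline.
    print(f"{name}: core x{core / base_core:.2f}, mem x{mem / base_mem:.2f}")

# FX 5700 core is only 1.70x the FX 5200's, memory 1.63x. If the benchmark
# gap is wider than that, something architectural (caches, bandwidth-saving
# features, vertex throughput) has to account for the rest.
```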

HW museum.cz - my collection of PC hardware

Reply 5 of 26, by The Serpent Rider

Rank: l33t++

So it can act like 8x1 or 4x2.

It can "act" on some operations with performance similar to true 8x1 pipes, while still being only 4x2. But that's Nvidia marketing bull, and we all know how that really turned out.

I would like to know what is the reason of very different performance of mainstream FX GPUs -> FX 5200, 5600, 5700.

5200 - features to save memory bandwidth are disabled/not implemented, weak vertex performance.
5600 - scaled down NV30, weak vertex performance.
5700 - scaled down NV35 with improved internal caches, UltraShadow, vertex performance is identical to full NV35.


Reply 7 of 26, by bakemono

Rank: Oldbie

3DMark01 is pretty good for testing fill rate. The single-texture number is usually limited by memory bandwidth, while the multi-texture number is usually limited by the GPU. So the latter will be close to what you could theoretically get by multiplying the chip clock speed by the number of TMUs. On the other hand, if you over/underclock the memory you will see the single-texture result change significantly, whereas the multi-texture result won't change much.

I recall that on a GeForce FX 5700 LE (250MHz core and 200/400 mem) I got 700Mtex and 900Mtex. And on a 7600GT I got 2Gtex and 6Gtex.

I'm sure you're right that the Quadro FX 1000/2000 has 4 ROPs just like the FX5800. There are a lot of other suspect numbers there and on wiki. Quadro FX 1800M - 550MHz GDDR5? G73 8 ROPs or 12? Did NV44 ever come with 128-bit bus? Is there a variant of NV34 with 2 vertex shaders? For that matter, does NV34 have 2 or 4 ROPs? One thing I read said that it could only use 4 ROPs under certain conditions, and certainly in practice the fill rate is closer to NV17/18 than it is to NV31/36. Although NV34 has lower multi-texture fill rate results as well, when compared to NV36, so maybe it is limited in the number of textures it can buffer at one time.

Reply 9 of 26, by Ozzuneoj

Rank: l33t

The TechPowerUp GPU database has fixed it after I contacted them. 😀

I've fixed a few things on the wiki but it's still a mess.

Does anyone know what's up with the Quadro FX 700? That one was labeled as 4 ROPs and 4 TMUs in both places, but it's said to be NV35-based, and it is passively cooled.

Last edited by Ozzuneoj on 2019-03-10, 05:06. Edited 1 time in total.


Reply 10 of 26, by meljor

Rank: Oldbie

I use the gpureview.com website as the database on old GPUs, and that one seems pretty accurate.

asus tx97-e, 233mmx, voodoo1, s3 virge ,sb16
asus p5a, k6-3+ @ 550mhz, voodoo2 12mb sli, gf2 gts, awe32
asus p3b-f, p3-700, voodoo3 3500TV agp, awe64
asus tusl2-c, p3-S 1,4ghz, voodoo5 5500, live!
asus a7n8x DL, barton cpu, 6800ultra, Voodoo3 pci, audigy1

Reply 11 of 26, by swaaye

Rank: l33t++

NV3x is a bit complicated when it comes to fillrate. Some of the chips have double rate for Z/stencil ops. This got called 8x0 mode. The number of textures applied also affects fillrate and chips vary in this respect. There are also effects from memory bandwidth. It's hard to just give it a simple specification.

https://www.beyond3d.com/content/reviews/10/24
http://ixbtlabs.com/articles2/gffx/gffx-ref-p3.html#p6

NV30 is pretty competitive with 9700 Pro as long as you don't go into DirectX 9 territory.

Reply 12 of 26, by Ozzuneoj

Rank: l33t
meljor wrote:

I use the gpureview.com website as the database on old gpu's and that one seems pretty accurate.

Ah! I'd forgotten about that one. The name makes it blend into the sea of fake "review" sites with useless benchmarks, but it is indeed a very informative website. It definitely looks like they are a bit more accurate, though there is some other info missing (like vertex shader count).


Reply 13 of 26, by swaaye

Rank: l33t++

The exact "vertex shader" count was never really defined for NV3x.

By the way, Putas has put together a comprehensive table. Tries to abstract everything to an extreme so you can kinda compare all the cards.
http://vintage3d.org/dbn.php#sthash.Il6om7Lq.dpbs

Reply 14 of 26, by The Serpent Rider

Rank: l33t++

NV30 is pretty competitive with 9700 Pro as long as you don't go into DirectX 9 territory.

Or with heavy AA modes, or on the same clock speeds, or...


Reply 15 of 26, by havli

Rank: Oldbie

R300 can't get anywhere near 500 MHz...
And in some games the FX 5800 U and R9700 Pro are pretty close in performance. A few examples:

COD 1600x1200 -> noAA/noAF, 2xAA/4xAF, 4xAA/8xAF
R9700 Pro = 47, 38, 33
FX 5800 U = 77, 49, 30

Serious Sam SE 1600x1200 -> noAA/noAF, 2xAA/4xAF, 4xAA/8xAF
R9700 Pro = 59, 34, 29
FX 5800 U = 80, 45, 26

Max Payne 2 1600x1200 -> noAA/noAF
R9700 Pro = 58
FX 5800 U = 64

UT 2004 1600x1200 -> noAA/noAF, 2xAA/4xAF, 4xAA/8xAF
R9700 Pro = 45, 23, 18
FX 5800 U = 54, 33, 24

GTA 3 1280x1024 -> noAA/noAF, 2xAA/4xAF, 4xAA/8xAF
R9700 Pro = 88, 55, 40
FX 5800 U = 60, 45, 33

Far Cry DX8 1280x1024 -> noAA/noAF, 2xAA/4xAF, 4xAA/8xAF
R9700 Pro = 72, 47, 35
FX 5800 U = 50, 36, 27


Reply 16 of 26, by agent_x007

Rank: Oldbie
swaaye wrote:

By the way, Putas has put together a comprehensive table. Tries to abstract everything to an extreme so you can kinda compare all the cards.
http://vintage3d.org/dbn.php#sthash.Il6om7Lq.dpbs

How did he get to a 48 pixel-pipes number compared to 96 ROPs on newer NV cards (Maxwell/Pascal/Turing)?
Does he have a spec sheet with the number of pixels per rasterizer (or per SM)?


Reply 17 of 26, by The Serpent Rider

Rank: l33t++

R300 can't get anywhere near 500 MHz...

The FX 5800 is 400 MHz.

Also: the Radeon 9800 Pro vs GeForce 5900 XT are perfect for a clock-for-clock comparison.


Reply 18 of 26, by havli

Rank: Oldbie

Yes... but FX 5800 Ultra runs at 500 MHz.

Also, I don't see the point of a clock-for-clock comparison when the FX series was always designed to be clocked higher.
