VOGONS


Reply 20 of 27, by The Serpent Rider

User metadata
Rank l33t++

I can confirm. In my experience with the Radeon 9700, D3D tessellation was much faster in Serious Sam SE. I think D3D makes heavy use of SSE, while OGL is pure x87.

I must be some kind of standard: the anonymous gangbanger of the 21st century.

Reply 21 of 27, by Scali

User metadata
Rank l33t
The Serpent Rider wrote:

I think D3D makes heavy use of SSE, while OGL is pure x87.

Not really. D3D is set up like this:
Application -> Microsoft Direct3D runtime -> vendor-provided low-level driver -> GPU

OpenGL is like this:
Application -> vendor-provided runtime/driver -> GPU

So in D3D, the application talks to common code, provided by Microsoft, and the driver only implements some basic low-level functionality.
The runtime can provide SSE/3DNow! optimized code, and this is shared for all GPUs and drivers.
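
To illustrate the idea (just a minimal sketch of the technique, nothing like the actual D3D runtime code; all names here are made up): the runtime can detect the CPU's capabilities once at startup, and then dispatch every vertex batch through the fastest transform routine available, no matter which vendor's driver sits below it:

```c
#include <xmmintrin.h>   /* SSE intrinsics */
#include <cpuid.h>       /* GCC/Clang; MSVC would use __cpuid from <intrin.h> */

/* Column-major 4x4 matrix; input positions are (x, y, z), output is (x, y, z, w). */

static void transform_x87(float *dst, const float *src, const float *m, int n)
{
    /* Plain C: the compiler emits x87 (or scalar) code for this. */
    for (int i = 0; i < n; i++, src += 3, dst += 4)
        for (int r = 0; r < 4; r++)
            dst[r] = m[r] * src[0] + m[4 + r] * src[1]
                   + m[8 + r] * src[2] + m[12 + r];
}

static void transform_sse(float *dst, const float *src, const float *m, int n)
{
    /* Load the four matrix columns once, outside the loop. */
    __m128 c0 = _mm_loadu_ps(m + 0), c1 = _mm_loadu_ps(m + 4);
    __m128 c2 = _mm_loadu_ps(m + 8), c3 = _mm_loadu_ps(m + 12);

    for (int i = 0; i < n; i++, src += 3, dst += 4) {
        /* One full (x, y, z, w) result per iteration, four lanes at a time. */
        __m128 r = _mm_add_ps(
            _mm_add_ps(_mm_mul_ps(c0, _mm_set1_ps(src[0])),
                       _mm_mul_ps(c1, _mm_set1_ps(src[1]))),
            _mm_add_ps(_mm_mul_ps(c2, _mm_set1_ps(src[2])), c3));
        _mm_storeu_ps(dst, r);
    }
}

typedef void (*transform_fn)(float *, const float *, const float *, int);

/* Picked once at startup; every draw call then uses the best path. */
static transform_fn pick_transform(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (edx & (1u << 25)))
        return transform_sse;    /* CPUID leaf 1, EDX bit 25 = SSE */
    return transform_x87;
}
```

The point is that this one piece of optimized code sits above every driver, so every card benefits from it.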

With OpenGL, the entire implementation is made by the vendor. This means that the vendor is also responsible for any optimizations, including SSE/3DNow! and whatnot.
In practice, most vendors didn't deliver very highly optimized OpenGL drivers. NVIDIA stood out because they did. Their driver includes SSE/3DNow! optimizations, and is generally the fastest driver by far (even on vanilla x87 machines).
What you're seeing is the result of ATi's poor OpenGL implementation.
Try the same game on an NVIDIA card from the same era, and you're likely to see much less difference between D3D and OpenGL. In fact, it wasn't uncommon for NV cards to perform better in OpenGL than in D3D in the days before Direct3D 7, because NV found ways to exploit the hardware T&L on GeForce 256 and newer cards, even in games that weren't specifically designed for it, through clever optimizations in their OpenGL driver. In D3D, T&L wasn't supported before D3D 7, and unlike in OpenGL, the driver couldn't really make use of it on its own either, because of the different design of the API and driver interface (the runtime was basically in the way).
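
To give an idea of why that works (a minimal fixed-function GL sketch, not NVIDIA driver code; it assumes a GL context is already current): in OpenGL the application hands the driver the matrices and the untransformed vertex arrays, so the driver decides where T&L runs, on a GeForce 256's T&L unit, or in its own SSE/x87 software pipeline on older hardware:

```c
#include <GL/gl.h>

void draw_mesh(const float *projection,       /* 4x4, column-major */
               const float *modelview,        /* 4x4, column-major */
               const float *positions,        /* xyz per vertex */
               const unsigned short *indices, int index_count)
{
    glMatrixMode(GL_PROJECTION);
    glLoadMatrixf(projection);
    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixf(modelview);

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, positions);

    /* The driver sees untransformed vertices plus the matrices, so it
       chooses where transform & lighting happens; the API never forces
       the CPU to do it. */
    glDrawElements(GL_TRIANGLES, index_count, GL_UNSIGNED_SHORT, indices);

    glDisableClientState(GL_VERTEX_ARRAY);
}
```

Before D3D 7, the Direct3D runtime did the vertex processing itself before the driver ever saw the data, so there was no equivalent opportunity there.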

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 22 of 27, by lost77

User metadata
Rank Member
an81 wrote:

Also, which one is the Memphis demo?

The one called Memphis Suburbs; it's the middle one on the selection screen. There are three reasons I use that one for testing:

1. It has a good length. Long enough to give useful results, short enough that I don't go do something else and forget about it for hours 😉

2. It has a lot of different types of enemies on screen at once, so it should in theory use a lot of texture memory.

3. It features pretty constant solo combat, so it's a good indicator of how the game will play for me.

Reply 23 of 27, by an81

User metadata
Rank Newbie

Here are my updated results for the PCIe system using the Memphis Suburbs demo. The E8400 is at 3.6 GHz this time.

[benchmark graph: BCyNGaZ.png]

Adjusted for CPU frequency (assuming the scaling is linear), lost77's system at 1024x768 in D3D with High TruForm seems only some 5% faster in average fps in this demo, and 7% faster in minimum fps. But this comparison of CPU performance would only really be valid with both systems running the same GPU.
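
For clarity, the frequency adjustment I mean is just linear scaling (a rough assumption; the numbers below show the game is far from fully CPU-bound):

\[ \mathrm{fps}_{\mathrm{adjusted}} = \mathrm{fps}_{\mathrm{measured}} \times \frac{f_{\mathrm{target}}}{f_{\mathrm{measured}}} \]

e.g. a result measured at 3.6 GHz is multiplied by 3.0/3.6 ≈ 0.83 to compare it against a 3.0 GHz run.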

By the way, overclocking the E8400 from 3 to 3.6 GHz has yielded only a ~4% fps increase in D3D with High TruForm in the Memphis Suburbs demo.

OK, did some more investigating, this time with the GPU core and memory overclocked to 450 MHz from their default of 400.

[benchmark graph: H3DRdNf.png]


Reply 24 of 27, by an81

User metadata
Rank Newbie
lost77 wrote:
an81 wrote:

Also, which one is the Memphis demo?

The one called Memphis Suburbs; it's the middle one on the selection screen. There are three reasons I use that one for testing:

1. It has a good length. Long enough to give useful results, short enough that I don't go do something else and forget about it for hours 😉

2. It has a lot of different types of enemies on screen at once, so it should in theory use a lot of texture memory.

3. It features pretty constant solo combat, so it's a good indicator of how the game will play for me.

Could you perhaps try benching it in 1024x768 with the auto-demoMP0001 (second from the top) as well? It seems about 30% slower in OpenGL. It's really meaty.

Reply 26 of 27, by an81

User metadata
Rank Newbie

The R200 is faster at Serious Sam TruForm in OpenGL than in D3D, by about 20%, both on a P4 2.8 GHz and on a C2D 6420 (the C2D is ~3% faster). It's almost exactly as fast as the above graph shows for the 2500K and X850, at 1024x768 in OGL with High TruForm. Forgot to mention: that's with no AA; the Radeon 9100 @ 275/275 MHz yields ~43 fps average.

Reply 27 of 27, by The Serpent Rider

User metadata
Rank l33t++

Good to know that we "just" need an overclocked Sandy Bridge to emulate R200 OpenGL tessellation at full speed 😵

I must be some kind of standard: the anonymous gangbanger of the 21st century.