VOGONS


Reply 20 of 38, by Darkstorm

Rank Newbie

I have no idea what's causing this. This is a Ryzen 7 5800X with 8c/16t. It looks like the game runs on 2 cores/4 threads: cores 7+8 and 11+12.

The attachment meh.png is no longer available

On the Ryzen 5 3600, by contrast, it sits at a solid 100% most of the time on core 1 and core 3 (so I think it is really running on 1 core/2 threads). Why it works like that, I have no idea. I can boot the other PC and take a screenshot of that graph. As for Visual Studio, it's not installed there, so it would take a while to set up profiling, but I think I can do that too. I can also launch the files there to see if the solid texture bug is gone, and maybe install EverQuest to see if it's the same bug.

As for profiling, I tried DX12, but it fails to produce any results after closing the game.
Only DX11 works, and this is the report: https://www.dropbox.com/scl/fi/on8m9iau8zcmsu … msjprxlm7h&dl=0

Reply 21 of 38, by Dege

Rank l33t

TBH, the only thing that I can see is that the game spends about one and a half times as much time drawing primitives (the characters) for you as for me.
In return, other game logic and drawing (unwalkable/broken path) takes half as much time for you as for me: 39-38% for me and 59-18% for you, see the attached image.

sacrifice-profiling.png

The Draw method in question is GPU driver-independent; most of it is pure dgVoodoo code. So, is it possible that the game does not draw the same number of characters/polygons under all circumstances?
I don't have another explanation for the slowdown ATM. 😐

Reply 22 of 38, by appiah4

Rank l33t++

Maybe the game patch levels are different?

Reply 23 of 38, by Darkstorm

Dege wrote on 2023-08-09, 19:01:

The Draw method in question is GPU driver-independent; most of it is pure dgVoodoo code. So, is it possible that the game does not draw the same number of characters/polygons under all circumstances?
I don't have another explanation for the slowdown ATM. 😐

It draws 900k polys; if you press F8 twice, it shows the poly count. When I change the detail level to, let's say, "high", it suddenly keeps redrawing them: it goes from 200k to 700k polys and from 40 FPS to 10 FPS, and that repeats over and over. It starts to behave normally around the normal/medium preset: https://www.youtube.com/watch?v=jimc8NE0raM
You can see in the video that once I set it back to "Insane" it's a solid 900k.
It's weird that without dgVoodoo it uses the same cores the same way and gets twice as many FPS.

I was trying to find another game that could potentially have a similar issue, and I found Toon Car. "Natively" it runs at 144 FPS and has this awful clipping issue:

The attachment Toon car.png is no longer available

dgVoodoo fixes the issue, but the framerate is around 80-90 FPS and sometimes drops to about 45. With dgVoodoo, CPU usage is around 70% on one thread, and it switches to the next thread every few seconds.

I guess it's no dgvoodoo for me on this CPU 🙁

Reply 24 of 38, by Darkstorm

I did a GPU swap and reinstalled the drivers, now for an AMD GPU. Performance is the same, so it's 100% not NVIDIA as I assumed at the beginning... just to confirm it.

The Ryzen 7 5800X and the Ryzen 5 3600 are on the same motherboard model, but they have different BIOSes. I installed W10 on the R5 platform, but in the past I used W11 there and saw only about a 2 FPS increase, so I don't think Windows matters. It's either the BIOS or the CPUs themselves. A friend of mine tried it earlier (I mentioned it somewhere above) on an R7 3800X, which is the same generation as my R5 but the same "class" as my R7, and he also gets worse performance with wrappers (however, while force d3d9on12 is also 6 FPS, dgVoodoo gives him 19 FPS in DX11).

I assume it's not the BIOS specifically but the CPU itself. Maybe some setting could fix it, but I don't really know what could help (I tried running in single-core mode, and besides being super slow while launching and at idle, the in-game performance didn't change). Maybe I should ask on the AMD forums or something; maybe they would figure out why it's happening.

It feels to me like the R5 3600 is giving it its all to run the game on one core (2 threads): its temperature quickly goes up by 20 degrees, from 40 to 60. The R7 5800X, meanwhile, doesn't care: it looks like it's running on 2 cores/4 threads but uses only one of them at any given moment, and the temperature doesn't change.

Reply 25 of 38, by Dege

Thanks for the report, but I was completely sure that it's not a driver/GPU thing. The profiling results show that 98% of the critical code is running inside dgVoodoo, with no calls into the driver at all.
That code is the vertex data copying code; that's what runs so poorly on your CPU (Toon Car also runs at a steady 60 fps for me, it's not a demanding game).
So, I modified it to not use certain CPU instructions, to see if that restores the performance or not.

Try this: http://dege.fw.hu/temp/dgv2_81_1_vtxcpy.zip

In fact, I once did the same thing for another part of the vertex processing, when I measured that avoiding those CPU instructions yields better results on a weak Pentium N CPU.

Another thing I forgot to mention last time: dgVoodoo currently has no support for hw vertex buffers in D3D7. Maybe your native NV driver provides hw D3D7 vertex buffers or just uses efficient CPU code for the data copying.
Maybe I'll backport the D3D8/9 vertex buffer implementation into D3D7.

Reply 26 of 38, by Darkstorm

30 FPS on my benchmark map! WOW! Thanks Dege

The attachment 30 FPS.png is no longer available

Reply 27 of 38, by Dege

Ok, good news, thanks!
But then it's just incredible how poorly "rep movsd" performs on a Ryzen, at least for small memory ranges.
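
For illustration, here is a minimal C++ sketch of the kind of swap being discussed: copying small runs of 32-bit words with a plain loop or `std::memcpy` instead of a string-move instruction. This is not dgVoodoo's code. The function names, the 8-dword vertex size and the one-vertex-at-a-time copy pattern are all assumptions made for the example, and which machine instructions each path ends up using depends on the compiler and its settings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed vertex size for this sketch: 8 dwords (32 bytes) per vertex.
constexpr std::size_t kDwordsPerVertex = 8;

// Plain element-by-element copy of `count` 32-bit words (hypothetical helper).
inline void copy_dwords_loop(std::uint32_t* dst, const std::uint32_t* src,
                             std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
}

// The same copy via std::memcpy; the library/compiler chooses the instructions.
inline void copy_dwords_memcpy(std::uint32_t* dst, const std::uint32_t* src,
                               std::size_t count)
{
    std::memcpy(dst, src, count * sizeof(std::uint32_t));
}

// Copies a vertex stream one vertex at a time, i.e. as many small copies,
// which is the "small memory ranges" case mentioned above.
inline std::vector<std::uint32_t> copy_vertices(const std::vector<std::uint32_t>& src,
                                                bool useLoop)
{
    assert(src.size() % kDwordsPerVertex == 0);
    std::vector<std::uint32_t> dst(src.size());
    for (std::size_t v = 0; v < src.size(); v += kDwordsPerVertex) {
        if (useLoop) {
            copy_dwords_loop(dst.data() + v, src.data() + v, kDwordsPerVertex);
        } else {
            copy_dwords_memcpy(dst.data() + v, src.data() + v, kDwordsPerVertex);
        }
    }
    return dst;
}
```

Both paths produce identical output; only timing them over many small copies on each CPU would show whether one is faster, and no timings are claimed here.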

Reply 28 of 38, by Darkstorm

I tried it on the Ryzen 5 3600, and it's still 32 FPS. My friend with the Ryzen 7 3800X got 19 FPS on DX11 with 2.81.1 (he didn't play with the config to check DX12). I have now tried DX11 with this version on the 5800X and I also get 19 FPS (before, there was no difference between DX11 and DX12: it was 10-11 FPS no matter what). So I guess if my friend set DX12, he would also get some performance gain, and it's just the Ryzen 7 5800X that sucks 😀

I also tried Toon Car on this version. I think the FPS is now more stable (but maybe it's just me, and I would need to do proper logging to confirm it); however, it still drops to 45 in one particular place on that lava map. 3 laps, and 3 times it goes down to 45.

My issue is fixed, so I'm not asking for anything else (but I can still test some things if needed) 😁
Again, thank you so much Dege

Reply 29 of 38, by Dege

Ok, I'll include it in the next version. The modified version works with the same performance on my CPU too, so it can be a general solution.

I will look into the Lava map; it must be something else then.

Reply 30 of 38, by Darkstorm

Dege wrote on 2023-08-15, 18:55:

Ok, I'll include it in the next version. The modified version works with the same performance on my CPU too, so it can be a general solution.

I will look into the Lava map; it must be something else then.

Hopefully it won't affect other CPUs, fingers crossed. As for Toon Car, I guess some profiling would be required? According to PCGamingWiki (https://www.pcgamingwiki.com/wiki/ToonCar), the game has a 60 FPS cap, which isn't the case for me: without the wrapper I get 144 FPS (with broken graphics), and with it about 90 FPS with some drops to 45 FPS here and there.

Edit: I tried profiling DX12 but it fails. I also think that DX11 is better for this game. However, they both lose to "native" performance.
DX11

The attachment Zrzut ekranu 2023-08-16 085154.png is no longer available

DX12

The attachment Zrzut ekranu 2023-08-16 085455.png is no longer available

No wrapper

The attachment Zrzut ekranu 2023-08-16 090655.png is no longer available

I tried "Forced3d9on12". It crashes a few seconds into the race; however, it looks like it offers slightly better FPS.

Reply 31 of 38, by Dege

How can I unlock all the tracks? I guess it's bound to the player data, could you plz share yours?

Reply 32 of 38, by Darkstorm

Dege wrote on 2023-08-16, 17:20:

How can I unlock all the tracks? I guess it's bound to the player data, could you plz share yours?

I have no idea where the save is located. I pretty much didn't unlock anything (I think); the lava map that I use as an example is the second one after the default moon map (if you go the other way, you will find the locked maps instead). I use the single race option. I will try to google a way to unlock everything.

Edit: Drop the Players folder from the attached archive into the RData directory located within the game installation folder. Choose "Player" when running the game to get everything unlocked. Btw, the clipping issue happens when using Direct3D T&L HAL, and the game is fine when using Direct3D HAL. Unfortunately I cannot monitor FPS with that option (but I guess it's 144); I probably need FRAPS for that.

Reply 33 of 38, by Dege

Thanks, I found the same location in the game that is on your screenshot.

I can't think of anything other than some instructions in the vertex data copying being inefficient on your CPU. I explored all the places in the related code that have "suspicious" intrinsics and modified/optimized them.
I updated the .zip, try again plz:
http://dege.fw.hu/temp/dgv2_81_1_vtxcpy.zip

If it still has fps drops, then it should be profiled with D3D11 (the output API doesn't really matter) to see the cause, because I have a stable 60fps all along the game (even with the unmodified 2.81.1):

R3-Dgl-Revistronic-Aug-7-2001-2023-08-16-19-46-35.png

Reply 34 of 38, by Darkstorm

I tried it; it's not really different for Toon Car (it drops to 45 FPS around that place while using DX12). For Sacrifice it went down from 30 to 28 FPS. I did the profiling with the version you sent me in DM previously (I don't know if these .pdb files are compatible with other versions, so I didn't mess with that and just used the entire package, which means it's from before you made the changes that fixed Sacrifice for me).

This is me playing 3 laps normally: https://www.dropbox.com/scl/fi/5e41re6h6udo1p … wscxvxvj5l&dl=0
This is me driving to the spot and idling there: https://www.dropbox.com/scl/fi/n18xi5fq345uuv … bykblubtl1&dl=0

The "native" Direct3D HAL also drops FPS around this place, but it doesn't drop under 100 FPS (I captured FPS with FRAPS).

The attachment Zrzut ekranu 2023-08-17 162707.png is no longer available

Direct3D T&L HAL just doesn't care and it's a solid 144 FPS, but the game is unplayable due to insane clipping.

The attachment Zrzut ekranu 2023-08-17 173059.png is no longer available

I picked Toon Car because it showed similar behavior to Sacrifice (worse FPS compared to launching without a wrapper), so I thought it could help diagnose why this is happening. 30 FPS in Sacrifice is pretty awesome, so there is no need to investigate this any further, unless out of curiosity.

Reply 35 of 38, by Dege

Then this must be some out-of-sync problem instead. I updated the .zip again (reverted the modified code), but now v-sync is disabled to get max performance. I get 150fps at the point I posted on my screenshot, 200+ at other places.
Could you try it again plz? Also, if it still produces low fps, could you profile it at this point again? (and tell me the time range of idling with the car)

Btw, HAL does all the vertex processing on the CPU, while TnL HAL does it on the GPU. That's why there is a difference with native DX.
With dgVoodoo it's all the same: vertex processing is always done on the GPU from Draw calls.
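
To make "vertex processing on CPU" concrete, here is a minimal C++ sketch of the core step, a per-vertex matrix transform done in a loop on the CPU. It is an illustration only, not code from the game, a driver or dgVoodoo. The `Vec4`/`Mat4` types, the function names and the row-vector convention (v' = v * M) are assumptions, and lighting and clipping are left out.

```cpp
#include <vector>

// Homogeneous vertex position (hypothetical type for this sketch).
struct Vec4 { float x, y, z, w; };

// 4x4 matrix, row-major, used with row vectors: v' = v * M (assumed convention).
struct Mat4 { float m[4][4]; };

// Transforms one vertex by the matrix.
inline Vec4 transform(const Vec4& v, const Mat4& M)
{
    return Vec4{
        v.x * M.m[0][0] + v.y * M.m[1][0] + v.z * M.m[2][0] + v.w * M.m[3][0],
        v.x * M.m[0][1] + v.y * M.m[1][1] + v.z * M.m[2][1] + v.w * M.m[3][1],
        v.x * M.m[0][2] + v.y * M.m[1][2] + v.z * M.m[2][2] + v.w * M.m[3][2],
        v.x * M.m[0][3] + v.y * M.m[1][3] + v.z * M.m[2][3] + v.w * M.m[3][3],
    };
}

// CPU path: every vertex of every draw goes through this loop each frame.
inline std::vector<Vec4> transform_on_cpu(const std::vector<Vec4>& in, const Mat4& M)
{
    std::vector<Vec4> out;
    out.reserve(in.size());
    for (const Vec4& v : in) {
        out.push_back(transform(v, M));
    }
    return out;
}
```

On the GPU path the same arithmetic runs on the graphics hardware instead, so the CPU only has to hand over the untransformed vertices.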

Reply 37 of 38, by Dege

Ok, it's not vertex data. This game calls surface GetDC/Lock per frame, which is expensive, especially at large application resolutions. Try the option DirectX\FastVideoMemoryAccess for such games, or select a low resolution in the game (for example 640x480, like I did) and scale the resolution by forcing it through dgVoodoo.
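
A rough way to see why resolution matters here is to count the bytes in one full surface. The C++ sketch below does only that arithmetic. It assumes 4 bytes per pixel and that a per-frame access touches the whole surface once; neither is confirmed for this game, the function names are made up for the example, and 1920x1080 is used as a typical high resolution for comparison.

```cpp
#include <cstdint>

// Bytes in one full surface; 4 bytes per pixel (32-bit colour) is an assumption.
inline std::uint64_t surface_bytes(std::uint32_t width, std::uint32_t height,
                                   std::uint32_t bytesPerPixel = 4)
{
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}

// Bytes per second if the whole surface is moved once per frame (assumed pattern).
inline std::uint64_t bytes_per_second(std::uint32_t width, std::uint32_t height,
                                      std::uint32_t framesPerSecond,
                                      std::uint32_t bytesPerPixel = 4)
{
    return surface_bytes(width, height, bytesPerPixel) * framesPerSecond;
}
```

Under these assumptions a 1920x1080 surface is 8,294,400 bytes and a 640x480 surface is 1,228,800 bytes, so each full-surface access at the higher resolution touches 6.75 times as much data.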

Reply 38 of 38, by Darkstorm

Yes, fast memory access helps a bit, I think about a 30% increase in that spot in DX12, but lowering the resolution helps a lot (I used FHD since I thought it's not a demanding game: 6 cars and something occasionally happening on the screen, so I figured I could just max out the settings and that's it). So it's nowhere close to the Sacrifice case, and I guess that wraps it up.