VOGONS


First post, by superfury

User metadata
Rank l33t++

When I try to profile my emulator using Visual Studio Community 2015, the profiler tells me 57.41% of the time is spent in ntdll.dll. Clicking on the [ntdll.dll] entry simply gives me the Function Details which tells me:
57.4% is called by ntdll.dll
11.7% is called by KernelBase.dll
0.9% is called by SDL_SemPost
< 0.1% is called by getrealtickspassed

Can anyone tell me why most calls are recorded from ntdll.dll? (The application is also only using up to 20% of total CPU time according to the Windows Task Manager.)

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 1 of 6, by calvin

User metadata
Rank Member

My guess is threading. Have you not looked at the function calls?

We aren't here to spoonfeed and make your emulator for you.

2xP2 450, 512 MB SDR, GeForce DDR, Asus P2B-D, Windows 2000
P3 866, 512 MB RDRAM, Radeon X1650, Dell Dimension XPS B866, Windows 7
M2 @ 250 MHz, 64 MB SDE, SiS5598, Compaq Presario 2286, Windows 98

Reply 2 of 6, by Stiletto

User metadata
Rank l33t++
calvin wrote:

We aren't here to spoonfeed and make your emulator for you.

... Agreed...

"I see a little silhouette-o of a man, Scaramouche, Scaramouche, will you
do the Fandango!" - Queen

Stiletto

Reply 3 of 6, by superfury

User metadata
Rank l33t++

The difficult part is that it says 57.4% is spent in ntdll.dll, and clicking on that entry just brings up the very same screen. Why would it be spending 57.4% of CPU time in ntdll.dll? SDL_SemPost does indeed have something to do with multithreading (it releases a locked part of the CPU, Timer, BIOS Menu or Debugger thread), but that only takes up 0.9% of total time, which is small compared to the 57.4% spent in ntdll.dll. The CPU is only interrupted every 3000 CPU instructions (I usually set it to 30000 to get good performance on my 4.0GHz Intel i7 PC). The problem is I still don't know why 57.4% is spent inside ntdll.dll. Do simple delays (the SDL_Delay() function) also count towards this percentage? The CPU thread calls SDL_Delay(0) every X instruction cycles (X can be set in the CPU menu in the BIOS). I usually leave this at the default setting (3000 cycles) or crank it up to 30000 cycles for good performance on my fast PC (the slow 2.2GHz dual core laptop I use can't handle such a speed for some reason).

What the main CPU thread does is simply execute X instructions/cycles (3000 or 30000 in this case), then add 1 ms to the CPU time (the cycles are kept at ms resolution this way), and finally delay 0 ms using SDL_Delay(0) until the CPU time (the destination time of the CPU) matches or surpasses the real time (obtained from SDL or any available high resolution timer, depending on the target platform (Windows, Linux or PSP)). This way up to X instructions are executed each ms. This can generate 1000+ zero-length delays each second (depending on how long the CPU thread takes to execute those instructions). Could this be the source of this?
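To make the loop described above concrete, here is a minimal sketch of one reading of it: run a slice, push the CPU's destination time forward by 1 ms, then issue zero-length delays while the real clock is still behind that destination. It is not UniPCemu's actual code. The `Sim` struct and the `sim_*` helpers are hypothetical stand-ins so the sketch runs without SDL; in a real build `sim_now()` would be something like `SDL_GetTicks()` and `sim_yield()` would be `SDL_Delay(0)`.

```c
#include <stdint.h>

/* Simulated environment (all names hypothetical, not from UniPCemu). */
typedef struct {
    uint32_t real_ms;       /* simulated wall clock, in ms */
    uint32_t yields_per_ms; /* zero-length delays that fit into one ms */
    uint32_t pending;       /* delays issued since the clock last ticked */
    uint64_t executed;      /* emulated instructions run so far */
    uint32_t yields;        /* total zero-length delays issued */
} Sim;

/* Stand-in for reading the real clock, e.g. SDL_GetTicks(). */
static uint32_t sim_now(const Sim *s) { return s->real_ms; }

/* Stand-in for SDL_Delay(0): counts the call and lets simulated time pass. */
static void sim_yield(Sim *s)
{
    s->yields++;
    if (++s->pending >= s->yields_per_ms) {
        s->pending = 0;
        s->real_ms++;
    }
}

/* Stand-in for executing one slice of emulated instructions. */
static void sim_run(Sim *s, uint32_t instructions) { s->executed += instructions; }

/* Run `slices` slices of `per_slice` instructions, pacing each to 1 ms. */
static void run_paced(Sim *s, uint32_t slices, uint32_t per_slice)
{
    uint32_t cpu_ms = sim_now(s); /* the CPU's destination time */
    for (uint32_t i = 0; i < slices; ++i) {
        sim_run(s, per_slice);
        cpu_ms += 1;
        /* The signed difference keeps the comparison valid if the
           32-bit tick counter wraps around. */
        while ((int32_t)(sim_now(s) - cpu_ms) < 0)
            sim_yield(s);
    }
}
```

In this simulation, if five zero-length delays fit into each millisecond, 1000 slices of 3000 instructions issue 5000 delay calls. On a fast host the real count could be far higher, since every one of those calls enters the OS scheduler, which may be part of what the profiler attributes to system DLLs.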

The Timer thread also executes delays, but it executes a minimum delay of 0 us after each timer step to provide maximum accuracy. The timer step length depends on the timers themselves, which are kept synchronized using the Timer thread's high resolution timer. That timer is read only once after each delay; the elapsed time is then added to all available timers, and after that the timers are updated up to the current time as dictated by the high resolution timer.
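The "read the clock once, add the elapsed time to every timer, then catch each timer up" scheme could look roughly like the sketch below. The `EmuTimer` type and `timers_add_elapsed()` are hypothetical names for illustration, and using nanoseconds as the unit is an assumption, not something taken from the emulator.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timer record: fires once every period_ns of elapsed time. */
typedef struct {
    uint64_t period_ns;  /* interval between ticks; 0 means disabled */
    uint64_t pending_ns; /* elapsed time not yet consumed by a tick */
    uint64_t ticks;      /* total ticks delivered so far */
} EmuTimer;

/* Add one shared elapsed-time measurement to every timer and catch each
   timer up to the current time. */
static void timers_add_elapsed(EmuTimer *timers, size_t count, uint64_t elapsed_ns)
{
    for (size_t i = 0; i < count; ++i) {
        EmuTimer *t = &timers[i];
        if (t->period_ns == 0)
            continue; /* skip disabled timers, avoids division by zero */
        t->pending_ns += elapsed_ns;
        t->ticks += t->pending_ns / t->period_ns; /* whole periods elapsed */
        t->pending_ns %= t->period_ns;            /* carry the remainder */
    }
}
```

Because every timer receives the same elapsed value from a single clock read, the timers cannot drift relative to each other, and the remainder carried in `pending_ns` keeps each one accurate across steps.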


Reply 4 of 6, by calvin

User metadata
Rank Member

Have you installed debug symbols/set up the symbol server? You aren't getting debug info without it.


Reply 5 of 6, by aqrit

User metadata
Rank Member

Also, SDL_Delay(0) maps to Sleep(0) on Windows, which causes the thread to yield the rest of its time slice...
IMO this call should be replaced with something that just spins.

http://www.windowstimestamp.com/
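As a rough illustration of "something that just spins", the sketch below busy-waits on a clock instead of yielding. The helper names `now_ns()` and `spin_until()` are hypothetical. It uses C11 `timespec_get()` only to stay portable; that reads the wall clock, so a Windows build would more likely poll `QueryPerformanceCounter()` instead.

```c
#include <stdint.h>
#include <time.h>

/* Current time in nanoseconds (wall clock, via C11 timespec_get). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Busy-wait until the clock reaches deadline_ns, never giving up the
   time slice. Returns how many times the clock was found to be short
   of the deadline. */
static uint64_t spin_until(uint64_t deadline_ns)
{
    uint64_t polls = 0;
    while (now_ns() < deadline_ns)
        ++polls;
    return polls;
}
```

The trade-off is that a spinning thread keeps one core fully busy for the whole wait, so this only makes sense where a core can be dedicated to the emulated CPU.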

Reply 6 of 6, by superfury

User metadata
Rank l33t++

The problem is that other threads also need to run on a single-threading OS (like the PSP), and SDL needs the delay to update its status (input, and Windows itself needs time too). So not delaying at all would simply waste valuable time on Windows and/or hang the OS or the PSP.
