VOGONS


First post, by zuldan

Rank Oldbie

After 8 hours of trying to diagnose this issue, I've come to Vogons as a last resort in case someone can think of something I haven't. I'm going nuts 🤣

Installing the driver for any PCI-E graphics card (Nvidia / ATI) in a LANParty UT nF4 Ultra-D running Windows XP causes the CPU to sit above 80% usage in the performance tab of Task Manager while each process shows 0% usage.

I've tried...

- Installed brand new 850W PSU
- AMD 3200+ and a 4400+ CPUs
- DFI bios versions NF4 v3.10 and NF4 v4.06 (resetting the BIOS each time)
- Nforce v6.86 and Nforce v15.23 chipset drivers (without the IDE SW driver and Network Access Manager)
- Disabled Cool N Quiet in the BIOS.
- Set the PCI-E frequency to 100 MHz (and tried 101 to 105)
- Disabled all onboard devices including USB
- Disabled MSI (Message Signaled Interrupts) in the registry
- Changed PCI-E Maximum Payload Size from 4096 to 128
- Multiple Windows XP reinstalls, including disabling SATA and installing on an IDE disk
- Double checked all the motherboard jumpers and confirmed the SLI jumpers are in single VGA mode
- Plugged in power to the 4 pin Molex connector on the motherboard

As soon as the graphics card drivers are uninstalled (Nvidia or ATI), CPU usage goes back down to 0%.

FYI, when the 4200+ is installed, one core sits at 90%.

The attachment 2.png is no longer available
The attachment 1.png is no longer available

Reply 1 of 39, by Ozzuneoj

Rank l33t

That is definitely a head scratcher!

Since task manager isn't showing what process is causing the issue, I would give Process Explorer a try. Google tells me this version from archive.org is the most recent one that works in XP:
https://web.archive.org/web/20140508010446/ht … ernalsSuite.zip

That might help you at least narrow down which process is using the CPU.

Now for some blitting from the back buffer.

Reply 2 of 39, by zuldan

Rank Oldbie

Ozzuneoj wrote on 2026-03-15, 09:36:

That is definitely a head scratcher!

Since task manager isn't showing what process is causing the issue, I would give Process Explorer a try. Google tells me this version from archive.org is the most recent one that works in XP:
https://web.archive.org/web/20140508010446/ht … ernalsSuite.zip

That might help you at least narrow down which process is using the CPU.

Thanks for the suggestion, Ozzuneoj. This is 100% a head scratcher.

I forgot to mention that I tried Process Explorer earlier today. I’ve come to the conclusion that this is not a user-mode process eating cycles; the kernel is spending all its time servicing hardware interrupts from the PCI-E graphics card once the proper driver loads (Process Explorer confirmed this). The basic Microsoft VGA driver doesn't trigger it because it doesn't fully initialize the x16 link or enable advanced interrupt handling.

I suspect bad capacitors are causing noise/interference which manifests as excessive interrupt requests from the PCI-E system. I had already replaced some of the caps (the ones that looked bad) before testing the motherboard. I think I’ll replace the rest and see if the problem persists.

Reply 3 of 39, by PD2JK

Rank Oldbie

How is the CPU temperature when the graphics driver is installed? And when not installed?

Maybe you can rule out any CPU load reading errors. My gut feeling tells me it's some kind of hardware error.
What about underclocking the PCIe bus, if possible?

i386 16 ⇒ i486 DX4 100 ⇒ Pentium MMX 200 ⇒ Athlon Pluto 700 ⇒ AthlonXP 1700+ ⇒ Opteron 165 ⇒ Dual Opteron 856

Reply 4 of 39, by zuldan

Rank Oldbie

PD2JK wrote on 2026-03-15, 10:23:

How is the CPU temperature when the graphics driver is installed? And when not installed?

Maybe you can rule out any CPU load reading errors, gut feeling tells me some kind of hardware error.
What about underclocking the PCIe bus, if possible.

CPU temp sits at 40°C in every situation and the chipset at 42°C. I had a large fan blowing on the entire system to make sure everything stayed cool and to rule out heat issues.

Unfortunately I can’t go any slower than 100 MHz on the PCI-E bus.

I agree this is leaning towards a hardware problem. My first guess is caps.

Reply 5 of 39, by red-ray

Rank Oldbie

zuldan wrote on 2026-03-15, 09:54:

This is 100% a head scratcher.

What are the % User Mode and % Kernel Mode figures? Right-click on the plot and select Show Kernel Times.

I suspect the % Kernel Mode will be very high. If it is, then it would be good to know whether this is Kernel/passive, DPC/Dispatch or Interrupt level time. I know my SIV utility will report this; there may be others.
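
For anyone following along, here is a rough Python sketch of the breakdown being asked about. It assumes four figures have been read off a tool such as SIV or Performance Monitor (Windows exposes `% User Time`, `% Privileged Time`, `% DPC Time` and `% Interrupt Time` on the Processor object), and that the privileged figure already includes DPC and interrupt time. The function name and the sample numbers are made up for illustration; they are not taken from the screenshots in this thread.

```python
def classify_cpu_time(user, privileged, dpc, interrupt):
    """Split busy CPU time (all values in percent) into four buckets.

    Assumption: 'privileged' already includes DPC and interrupt time,
    so passive-level kernel time is whatever is left over.
    """
    passive = max(privileged - dpc - interrupt, 0.0)
    buckets = {
        "user": user,
        "kernel/passive": passive,
        "DPC/dispatch": dpc,
        "interrupt": interrupt,
    }
    # The bucket with the largest share tells you where to look next.
    dominant = max(buckets, key=buckets.get)
    return buckets, dominant


# Hypothetical readings for a machine stuck servicing interrupts.
buckets, dominant = classify_cpu_time(user=1.0, privileged=85.0, dpc=0.5, interrupt=83.0)
print(buckets)
print("Dominant:", dominant)
```

If the dominant bucket is "interrupt", no single process will show the load in Task Manager, which would match the symptom in the first post.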

file.php?id=238348

If you left-click on the plot, a bigger plot should pop up.

Reply 6 of 39, by Living

Rank Member

zuldan wrote on 2026-03-15, 09:54:

I suspect bad capacitors are causing noise/interference which manifests as excessive interrupt requests from the PCI-E system. I have partly replaced bad caps (the ones that looked bad) before testing the motherboard. I think I’ll replace the rest and see if the problem persists.

Same behavior as when you have a bad SATA cable: the controller keeps trying to communicate and shows 100% usage when in reality the SSD/HDD is idle, waiting for a command.

In this case it's the CPU that has to do the heavy lifting, since PCI Express is directly connected.

Reply 7 of 39, by douglar

Rank l33t

I’ve seen that behavior when a PC is getting garbage signals on the cable.

Most notorious example in my experience was when I was trying to troubleshoot a PC that kept locking up. Replaced everything in the PC methodically and it was still locking up from time to time. And then I unplugged the network cable and it started working. Turns out that the cable run went over a light bulb ballast on its way over from the closet. Any time someone turned on the light switch down the hall, the PC would get saturated with bad packets and pretty much hit 100% CPU usage.

Reply 8 of 39, by zuldan

Rank Oldbie

red-ray wrote on 2026-03-15, 10:32:
zuldan wrote on 2026-03-15, 09:54:

This is 100% a head scratcher.

What is the % is User Mode and Kernel Mode ? Right/Click on the plot and select Show Kernel times.

I suspect the % Kernel Mode will be very high if it is then it would be good to know if this is Kernel/passive, DPC/Dispatch or Interrupt level time. I know my SIV utility will report this, there may be others.

file.php?id=238348

If you Left/Click on the plot a bigger plot should popup.

Here you go!

The attachment CPU.png is no longer available
The attachment 3.png is no longer available

Reply 9 of 39, by zuldan

Rank Oldbie

Further developments...

Some good information on caps for this board: https://www.badcaps.net/forum/troubleshooting … ra-d-recap-help

I took the board out of the PC for another inspection and found a leaking capacitor right next to the PCI-E slot. The leak must have happened recently, as the board had been thoroughly washed. It was a 470 µF 16 V KZG (yes, I know. Yuck 🤣). I then replaced all 470 µF 16 V caps with Panasonics. Unfortunately, the issue still persists 🙁

The attachment IMG_0199.JPG is no longer available

A bit out of spec but probably still functioning to some degree.

The attachment IMG_0200.JPG is no longer available
Last edited by zuldan on 2026-03-16, 10:16. Edited 1 time in total.

Reply 10 of 39, by red-ray

Rank Oldbie

zuldan wrote on 2026-03-16, 09:30:

Here you go!

It's almost all interrupt time with no DPC/Dispatch time, so I suspect the GPU interrupts are not being processed at all: if they were being processed/cleared, I would expect noticeable DPC time.

It could be down to edge- vs. level-triggered interrupts. Is there any way to know or change which is used?
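
To illustrate why the trigger type matters, here is a toy Python model (a sketch only, not how the nForce4 or any real interrupt controller is programmed). It assumes the interrupt line is sampled once per time step: an edge-triggered input fires once per low-to-high transition, while a level-triggered input keeps firing for as long as the line stays high. The function name and the sample data are hypothetical.

```python
def count_interrupts(line_samples, mode):
    """Count how many times the handler would run for a sampled IRQ line.

    line_samples: sequence of 0/1 values, one per time step.
    mode: "edge" fires on each 0 -> 1 transition,
          "level" fires on every step where the line is high.
    """
    if mode not in ("edge", "level"):
        raise ValueError("mode must be 'edge' or 'level'")
    fired = 0
    previous = 0
    for level in line_samples:
        if mode == "edge" and previous == 0 and level == 1:
            fired += 1
        elif mode == "level" and level == 1:
            fired += 1
        previous = level
    return fired


# Hypothetical case: the device raises its line once and nothing ever clears it.
stuck_line = [0, 1, 1, 1, 1, 1, 1, 1]
print(count_interrupts(stuck_line, "edge"))   # 1
print(count_interrupts(stuck_line, "level"))  # 7
```

In the level-triggered case, a request that is never acknowledged or cleared re-enters the handler on every step, which is the kind of pattern that would show up as pure interrupt time with little or no DPC time.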

Which GPU is it, or which GPUs have you tried? I recall you said you have tried both AMD and NVidia GPUs; assuming so, I don't think it's a driver issue.

I noticed SIV reported the memory speed as 0 MHz. Clearly this is incorrect, so please will you send me the two Menu->File->Save Local files so I can figure out why and fix it?

Last edited by red-ray on 2026-03-16, 10:07. Edited 1 time in total.

Reply 11 of 39, by zuldan

Rank Oldbie

red-ray wrote on 2026-03-16, 09:52:
zuldan wrote on 2026-03-16, 09:30:

Here you go!

It's almost all interrupt time, no DPC/Dispatch, so I expect that when the GPU interrupts are not being processed at all as were being processed/cleared I would expect noticeable DPC time.

It could be down to edge vs. level triggered interrupts, is there any way to know/change what is used?

Which GPU is it/which GPUs have you tried? I recall you said you have tried both AMD + NVidia GPUs assuming so I don't think it's a driver issue.

I noticed SIV reported the memory speed as 0MHz, clearly this is incorrect, please will you send me the two Menu->File->Save Local files so I can figure out why and fix it?

I’ve tried an 8600 GT and an X850 XT PE. Tried multiple driver versions with each.

I’ll generate that file for ya tomorrow 👍

Reply 12 of 39, by zuldan

Rank Oldbie

Some more information…

While inspecting the board I noticed the 32 kHz crystal oscillator had been glued down to the motherboard with green solder mask. Some of the solder mask was wiped on the board next to it. I’m not sure if this is from the factory. I can’t see any kind of tampering with the board under the microscope (I’ve done a lot of repairs, so I usually pick up on other people’s repairs quickly).

The attachment C90447AE-5077-4848-9B57-A8FBA4481183.jpeg is no longer available

I measured it with the oscilloscope and it’s measuring correctly but the signal looks a bit weird. That might be because I’m using the shield on a USB port for the oscilloscope ground.

The attachment 503FDAE8-1379-4181-AA7D-F5C35D3DFD97.jpeg is no longer available

I also measured the 25 MHz crystal oscillator near it (SMD format) and it measured fine.

The attachment A8B7FF30-FEFF-43E8-8A0E-B9E8147FCFDA.jpeg is no longer available

Reply 13 of 39, by zuldan

Rank Oldbie

Here’s something interesting. When the card is running at x2 speed, there are zero CPU issues.

The attachment 31BBBE11-38F7-4101-9E7D-7FCD42A6676F.jpeg is no longer available

Reply 14 of 39, by red-ray

Rank Oldbie

zuldan wrote on 2026-03-16, 10:15:

Here’s something interesting. When the card is running at x2 speed, there are zero CPU issues.

The x2 is not the speed, it's the number of PCIe lanes. Have a look at Menu->Machine->GPU Info and you should see something similar to:

file.php?id=238473

Looking at https://www.techpowerup.com/gpu-specs/geforce-8600-gt.c198 you can see the Bus Interface is PCIe 1.0 x16, so it's impossible for the GPU to run at PCIe 2.0 speed.
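
Since lane count and PCIe generation keep getting mixed up, a quick back-of-the-envelope calculation in Python may help. It uses the standard per-lane figures (2.5 GT/s for PCIe 1.0 and 5.0 GT/s for PCIe 2.0, both with 8b/10b encoding) and ignores protocol overhead, so treat the results as upper bounds rather than measured throughput. The function and table names are just for this sketch.

```python
# Per-lane figures for the first two PCIe generations (standard values).
GENERATIONS = {
    "1.0": {"gigatransfers_per_s": 2.5, "encoding_efficiency": 8 / 10},
    "2.0": {"gigatransfers_per_s": 5.0, "encoding_efficiency": 8 / 10},
}


def link_bandwidth_mb_s(generation, lanes):
    """Approximate usable bandwidth in MB/s, per direction."""
    gen = GENERATIONS[generation]
    bits_per_s = gen["gigatransfers_per_s"] * 1e9 * gen["encoding_efficiency"]
    return bits_per_s / 8 / 1e6 * lanes


for lanes in (2, 16):
    print(f"PCIe 1.0 x{lanes}: {link_bandwidth_mb_s('1.0', lanes):.0f} MB/s")
```

This prints 500 MB/s for x2 and 4000 MB/s for x16. A PCIe 1.0 card in an x2 slot therefore gets one eighth of the x16 bandwidth, which is a separate thing from the link running at "PCIe 2.0" signalling speed.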

Update: Looking at https://theretroweb.com/motherboards/s/dfi-la … arty-nf4-sli-dr I can see your board has two PCIe x16 slots. What happens when you move the GPU to the other slot and set it to use x16 lanes?

I would change the Task Manager setup to show kernel times.

file.php?id=238382

Last edited by red-ray on 2026-03-18, 09:01. Edited 2 times in total.

Reply 15 of 39, by red-ray

Rank Oldbie

zuldan wrote on 2026-03-16, 09:54:

I’ll generate that file for ya tomorrow 👍

Great. I just looked at a SIV save file from 12-Sep-2005: the memory speed is correct in that file, and also correct when I use it to run SIV in test mode, so I suspect the issue is triggered by the speed of your memory. What speed does your memory run at, please? I suspect it's something other than 400 MHz.

file.php?id=238383

I also noted SIV reported Dual Channel rather than Single Channel for your board, and wonder: are your two DIMMs plugged into slots on the same memory channel? With a save file I could tell, and I suspect [Machine] would be enough.

file.php?id=238375

Reply 16 of 39, by tehsiggi

Rank Member

red-ray wrote on 2026-03-16, 10:29:

The x2 is not the speed, it's the number of PCIe lanes. Have a look at Menu->Machine->GPU Info.

you can see Bus Interface PCIe 1.0 x16 so it's impossible for the GPU to run @ PCIe 2 speed.

It can be the lane count the card is running on:

https://theretroweb.com/motherboard/manual/54 … 24505190694.pdf

I think zuldan just meant that, not the PCIe version.

AGP Card Real Power Consumption
AGP Power monitor - diagnostic hardware tool
Graphics card repair collection

Reply 17 of 39, by zuldan

Rank Oldbie

Yep, I'm not talking about the PCI-E version; I mean the PCI-E bandwidth. The second PCI-E x16 slot runs at x2 because this is not the SLI version of the board. The BIOS and GPU-Z both confirm the card is running at x2 when plugged into the second PCI-E x16 slot. This is normal for this DFI board. I can change the slot to x16, but that requires a 0 ohm resistor to be installed on the chipset. I've attached the report for you.

The attachment SIV_DFI_LP_NF4.zip is no longer available

I've now tested every capacitor with an ESR meter and all of them were well within spec. I also desoldered random capacitors and their capacitance was in spec. It could just be a failing chipset, and I could replace it, but because it's such a rare board (well, nowadays it is) I can't risk destroying it. Not sure where to go from this point. I was hoping to use an FX-55 but might end up leaving the 4400+ in so that one core can be devoted to dealing with the PCI-E storm. Sounds terrible, but in 2005 games were only using one core anyway...

Reply 18 of 39, by tehsiggi

Rank Member

zuldan wrote on 2026-03-16, 10:05:

While inspecting the board I noticed the 32KHz crystal oscillator had been glued down to the motherboard with green solder mask. Some of the solder mask was wiped on the board next to it. Now I’m not sure if this is from the factory. I can’t see any kind of tampering with the board under the microscope (I’ve done a lot of repairs so I usually pickup other people’s repairs quickly).

The clock looks fine. That 32kHz crystal is for the RTC most likely, so that should not harm any PCIe communications.
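
As a side note on why a crystal like this points at the RTC: watch crystals are normally 32.768 kHz, which is exactly 2^15 Hz, so fifteen divide-by-two stages give a 1 Hz tick. The small Python sketch below assumes that nominal value (the post only says "32KHz") and also estimates how much a clock would drift for a hypothetical measured frequency. The function names and the 32768.5 Hz reading are illustrative, not measurements from this board.

```python
RTC_CRYSTAL_HZ = 32_768  # assumed nominal value; the post only says "32KHz"


def divider_stages_to_1hz(freq_hz):
    """Return how many divide-by-two stages bring freq_hz down to exactly 1 Hz."""
    stages = 0
    while freq_hz > 1:
        if freq_hz % 2:
            raise ValueError("frequency is not a power of two")
        freq_hz //= 2
        stages += 1
    return stages


def drift_seconds_per_day(measured_hz, nominal_hz=RTC_CRYSTAL_HZ):
    """Clock drift per day if the crystal runs at measured_hz instead of nominal."""
    return (measured_hz - nominal_hz) / nominal_hz * 86_400


print(divider_stages_to_1hz(RTC_CRYSTAL_HZ))      # 15
print(round(drift_seconds_per_day(32_768.5), 2))  # seconds gained per day
```

A slightly off crystal of this kind would show up as a clock that gains or loses time, not as PCIe trouble.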

zuldan wrote on 2026-03-18, 09:30:

Yep, not talking about the PCI-E version. I mean the PCI-E bandwidth. The second PCI-E x16 slots runs at 2x because this is not the SLI version of the board. The BIOS and GPU-Z both confirm the card is running at x2 when plugged into the second PCI-E x16 slot. This is normal for this DFI board. I can change the slot to x16 but that requires a 0 ohm resistor installed on the chipset. I've attached the report for you.

The attachment SIV_DFI_LP_NF4.zip is no longer available

I've now tested every capacitor with a ESR meter and all of them were well within spec. Desoldered random capacitors and they were in spec with capacitance. It could just be a failing chipset, and I could replace it, however because it's such a rare board (well now days it is) I can't risk destroying it. Not sure where to go from this point. I was hoping to use a FX-55 but might end up leaving the 4400+ so that 1 core can be devoted to dealing with the PCI-E storm. Sounds terrible but in 2005 games were only using 1 core anyway...

Have you checked the traces going from the PCIe slot to the chipset? Kind of feels like the PCIe slot/lanes are having some issues.


Reply 19 of 39, by zuldan

Rank Oldbie

tehsiggi wrote on 2026-03-18, 09:34:

Have you checked the traces going from the PCIe slot to the chipset? Kind of feels like the PCIe slot/lanes are having some issues.

Yep, double-checked them under the microscope.

It could be as simple as a failing SMD capacitor or resistor between the chipset and PCI-E but as you know, without a boardview or schematics it would be almost impossible to diagnose.