VOGONS


Reply 380 of 1037, by CoffeeOne

User metadata
Rank Oldbie
Rank
Oldbie
red-ray wrote on 2020-03-04, 09:17:
CoffeeOne wrote on 2020-03-03, 21:26:

stc info:

Thank you, I can see the Sensor Thread Load is now 57% which is what I was hoping for. Looking in the save file at [ DIMM Status ] then I don't see any wacky temperatures, but SIV had only been running for 28 seconds and done 14 updates. Please will you post a screen shot of [ DIMM Status ] after SIV has been running for at least 5 minutes and the sample counts are >= 150. I don't need a save file.

I am wondering if I need to get SIV to delay for a few ms after switching the SMBus mux, so please try the attached SIV64X 5.47 Slug-02, which does this. Again, post [ DIMM Status ] after SIV has been running for at least 5 minutes and then generate a new save file.
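
The "delay after switching the mux" idea can be sketched roughly as follows. This is only an illustration, not SIV code: `FakeMux`, `select_and_settle` and the 5 ms default are hypothetical names and values standing in for the real SMBus hardware and whatever delay SIV actually uses.

```python
import time


class FakeMux:
    """Hypothetical stand-in for an SMBus mux; not SIV code."""

    def __init__(self):
        self.channel = None
        self.switched_at = 0.0

    def select(self, channel):
        # Record which channel is active and when the switch happened.
        self.channel = channel
        self.switched_at = time.perf_counter()


def select_and_settle(mux, channel, settle_s=0.005):
    """Switch the mux, then wait a few ms before any bus access.

    Returns the time that elapsed since the switch, in seconds.
    """
    mux.select(channel)
    time.sleep(settle_s)  # assumed settle time; "a few ms"
    return time.perf_counter() - mux.switched_at


mux = FakeMux()
elapsed = select_and_settle(mux, 2)
```

Any SPD or temperature read would then only be issued after `select_and_settle` returns.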

Sure.

Attachments

  • dimm-status1.png (85.14 KiB, 799 views; file license: Fair use/fair dealing exception)
  • dimm-status2.png (80.32 KiB, 799 views; file license: Fair use/fair dealing exception)
  • SIV64Xslug02-save.zip (353.48 KiB, 41 downloads; file license: Fair use/fair dealing exception)

Reply 381 of 1037, by red-ray

CoffeeOne wrote on 2020-03-04, 21:09:

Sure.

Thank you and given there are retries I expect some other software is accessing the SMBuses at the same time as SIV 🙁

Looking at what is running it may be C:\Program Files\HP\Cissesrv\cissesrv.exe so please will you stop the CISSERV WIN32 service, exit then restart SIV and see if the retries go away?

There are also C:\Program Files\HP IO Accelerator Management Tool\Utils\fio-agent.exe + C:\Program Files\HP IO Accelerator Management Tool\Utils\fio-msrv.exe that may be triggering the retries.

Reply 382 of 1037, by CoffeeOne

red-ray wrote on 2020-03-04, 22:02:
CoffeeOne wrote on 2020-03-04, 21:09:

Sure.

Thank you and given there are retries I expect some other software is accessing the SMBuses at the same time as SIV 🙁

Looking at what is running it may be C:\Program Files\HP\Cissesrv\cissesrv.exe so please will you stop the CISSERV WIN32 service, exit then restart SIV and see if the retries go away?

There are also C:\Program Files\HP IO Accelerator Management Tool\Utils\fio-agent.exe + C:\Program Files\HP IO Accelerator Management Tool\Utils\fio-msrv.exe that may be triggering the retries.

Hello,
OK, I will disable all 3 services and re-run SIV, but one question first: what about the DIMMs (only one in the last screenshot) that say temperature support is not available?
It is not always the same DIMM(s); in the current run there are 2 of them.

Reply 383 of 1037, by CoffeeOne

One more remark:
There must still be bugs in the display part, see screenshot:

main-2020-03-04.png (172.37 KiB, 787 views; file license: Fair use/fair dealing exception)

Reply 384 of 1037, by red-ray

CoffeeOne wrote on 2020-03-04, 22:23:

There must still be bugs in the display part

From the SIV main screen I can't tell, and to comment sensibly I need at a minimum to see what [ DIMM Status ] reports.

As Martin has already said, with these types of issue it's rather tricky to know what is happening without having a system here. What I really need to do is enable the SIV driver SMBus trace so I can see what is in the SMBus control register when it's not the value SIV set it to.

Reply 385 of 1037, by CoffeeOne

red-ray wrote on 2020-03-04, 23:02:
CoffeeOne wrote on 2020-03-04, 22:23:

There must still be bugs in the display part

From the SIV main screen I can't tell, and to comment sensibly I need at a minimum to see what [ DIMM Status ] reports.

As Martin has already said, with these types of issue it's rather tricky to know what is happening without having a system here. What I really need to do is enable the SIV driver SMBus trace so I can see what is in the SMBus control register when it's not the value SIV set it to.

Sure.

OK, here is the latest run (4 services stopped: ProliantMonitor and the other 3 you listed in your last post).

dimm-status3.png (84.87 KiB, 776 views; file license: Fair use/fair dealing exception)

I fully support the idea of stopping the investigation of this issue now; the effort to display SPD info and temperatures of the DIMMs is way too high.

Attachments

  • SIV64Xslug02-save2.zip (353.92 KiB, 60 downloads; file license: Fair use/fair dealing exception)

Reply 386 of 1037, by red-ray

CoffeeOne wrote on 2020-03-04, 23:23:

I fully support the idea of stopping the investigation of this issue now; the effort to display SPD info and temperatures of the DIMMs is way too high.

Thank you, and I also feel we need to leave this for a while. I expect I will look out for a cheap DL580 G7 on eBay, and once I have one I suspect I will make some progress.

I suspect the DL585 G6 will be rather easier to deal with, but it's possible the DIMM SPD will be hidden just as it is on the DL585 G7 and I guess time will tell.

Reply 387 of 1037, by red-ray

CoffeeOne wrote on 2020-03-04, 23:23:

dimm-status3.png

I have been trying to figure out why only SMBus-A was getting retries, and I think I have figured it out. Basically, the SMBus-B retries were being counted as SMBus-A retries because the context was wonky.

I suspect I have fixed this in the attached SIV64X 5.47 Oops-02 which also counts write retries when the mux switch is not as expected.
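
To illustrate the kind of accounting involved, here is a minimal sketch that keys retry counts on the bus that actually retried and counts a write retry when the mux is not where it was set. It is not SIV's implementation; `RetryStats`, its methods and the bus labels used as keys are hypothetical.

```python
from collections import Counter


class RetryStats:
    """Hypothetical per-bus retry tally; not SIV's actual data structure."""

    def __init__(self):
        self.read_retries = Counter()
        self.write_retries = Counter()

    def read_retry(self, bus):
        # Count against the bus passed in by the caller, never against a
        # shared "current bus" variable that may be stale.
        self.read_retries[bus] += 1

    def mux_write(self, bus, expected, actual):
        # A mux that is not where it was set counts as a write retry.
        if actual != expected:
            self.write_retries[bus] += 1
            return False
        return True


stats = RetryStats()
stats.read_retry("SMBus-A")
stats.read_retry("SMBus-B")
stats.read_retry("SMBus-B")
mux_ok = stats.mux_write("SMBus-B", expected=2, actual=0)
```

With the counts keyed this way, SMBus-B retries can no longer be attributed to SMBus-A by accident.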

Please will you test this, and assuming it works as expected I will stop making changes. As I don't like to leave things half working, I now plan to get a DL580 G7 so I can test things here. I have been pondering one for a while, as there are several things I would like to check out on such a system; the 36 CPUs I have do not really come close to what SIV is designed to handle.

Last edited by red-ray on 2020-03-07, 07:57. Edited 1 time in total.

Reply 388 of 1037, by CoffeeOne

red-ray wrote on 2020-03-05, 21:47:
CoffeeOne wrote on 2020-03-04, 23:23:

dimm-status3.png

I have been trying to figure out why only SMBus-A was getting retries, and I think I have figured it out. Basically, the SMBus-B retries were being counted as SMBus-A retries because the context was wonky.

I suspect I have fixed this in the attached SIV64X 5.47 Oops-02 which also counts write retries when the mux switch is not as expected.

Please will you test this, and assuming it works as expected I will stop making changes. As I don't like to leave things half working, I now plan to get a DL580 G7 so I can test things here. I have been pondering one for a while, as there are several things I would like to check out on such a system; the 36 CPUs I have do not really come close to what SIV is designed to handle.

Hello
After starting, it crashed the PC, even when all 4 HP services are stopped.

siv-Oops.png (268.29 KiB, 747 views; file license: Fair use/fair dealing exception)

Reply 389 of 1037, by red-ray

CoffeeOne wrote on 2020-03-06, 21:12:

After starting, it crashed the PC, even when all 4 HP services are stopped.

Wow, I was not expecting that, and apologies. Do you get the same effect with Beta-02? If so, please will you post the minidump file that should be in C:\Windows\Minidump?

Reply 390 of 1037, by CoffeeOne

red-ray wrote on 2020-03-06, 23:35:
CoffeeOne wrote on 2020-03-06, 21:12:

After starting, it crashed the PC, even when all 4 HP services are stopped.

Wow, I was not expecting that, and apologies. Do you get the same effect with Beta-02? If so, please will you post the minidump file that should be in C:\Windows\Minidump?

Hello,

It's the same.

There is no Minidump file.
I do have a MEMORY.DMP, but it's 8.5 GB; compressed with 7-Zip it is still ~140 MB.

Reply 391 of 1037, by red-ray

CoffeeOne wrote on 2020-03-07, 00:45:

I do have a MEMORY.DMP, but it's 8.5 GB; compressed with 7-Zip it is still ~140 MB.

Thank you for checking out Beta-02 and 140MB is still a bit on the big side.

After pondering this I decided I should be able to recreate the issue on my Dell 490 by building a driver that pretended it had a PCA9544 SMBus mux so I did and tracked down the issue. I am pretty sure it was that an incorrect/invalid lock was being used and this caused KeWaitForMutexObject() to get into a tizz. Once I found this it was trivial to fix so have done this for Beta-03.
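
As a rough illustration of the "pretend mux" idea, here is a minimal Python model of a PCA9544-style control register. It is not the SIV driver, which is Windows kernel code, and the class and function names are hypothetical. The bit layout (bit 2 enables, bits 1-0 pick one of four channels) is an assumption taken from the PCA9544A datasheet and should be checked against it.

```python
class FakePCA9544:
    """Hypothetical in-memory model of a 4-channel SMBus mux."""

    ENABLE = 0x04        # assumed: bit 2 enables the selected channel
    CHANNEL_MASK = 0x03  # assumed: bits 1-0 hold the channel number

    def __init__(self):
        self.control = 0x00  # start with no channel selected

    def write_control(self, value):
        # Only the low three bits are modelled as writable here.
        self.control = value & (self.ENABLE | self.CHANNEL_MASK)

    def selected_channel(self):
        """Return the active channel (0-3), or None if disabled."""
        if self.control & self.ENABLE:
            return self.control & self.CHANNEL_MASK
        return None


def mux_is_as_expected(mux, expected_channel):
    """True when the mux still points at the channel the caller set."""
    return mux.selected_channel() == expected_channel


mux = FakePCA9544()
mux.write_control(FakePCA9544.ENABLE | 2)  # caller selects channel 2
ok_before = mux_is_as_expected(mux, 2)
mux.write_control(FakePCA9544.ENABLE | 0)  # "some other entity" switches it
ok_after = mux_is_as_expected(mux, 2)
```

A fake like this lets the mux-handling paths run on a machine that has no mux at all, which is the same trick as the test driver on the Dell 490.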

Please will you try SIV 5.47 Beta-03, let it run for 5 minutes then post Menu->Hardware->DIMM Status and finally generate a new save file.

Reply 392 of 1037, by CoffeeOne

red-ray wrote on 2020-03-07, 08:44:
CoffeeOne wrote on 2020-03-07, 00:45:

I do have a MEMORY.DMP, but it's 8.5 GB; compressed with 7-Zip it is still ~140 MB.

Thank you for checking out Beta-02 and 140MB is still a bit on the big side.

After pondering this I decided I should be able to recreate the issue on my Dell 490 by building a driver that pretended it had a PCA9544 SMBus mux so I did and tracked down the issue. I am pretty sure it was that an incorrect/invalid lock was being used and this caused KeWaitForMutexObject() to get into a tizz. Once I found this it was trivial to fix so have done this for Beta-03.

Please will you try SIV 5.47 Beta-03, let it run for 5 minutes then post Menu->Hardware->DIMM Status and finally generate a new save file.

Looks very good to me!

beta3-dimm-status.png (79.55 KiB, 719 views; file license: Fair use/fair dealing exception)

Remark: I did NOT stop any HP services this time 😀

Attachments

  • SIV64Xbeta3-save.zip (353.6 KiB, 40 downloads; file license: Fair use/fair dealing exception)

Reply 393 of 1037, by red-ray

CoffeeOne wrote on 2020-03-07, 19:19:

Remark: I did NOT stop any HP services this time 😀

Thank you for "hanging on in there" and trying Beta-03. I have fixed the SIV bug and SIV is dealing better with the collisions, but I feel I need to know why there are collisions.

To make progress I need to enable "SIV Driver Retry Tracing", would you like to try this? If so then get and run-as-admin https://docs.microsoft.com/en-us/sysinternals … loads/debugview then do SIV64X -drvdbg=30000000 | more and post the debugview log after SIV has been running for >= 120 seconds. On this system I just get as below, but you would also see the SMBus retries. Once I check this log then I should be able to see what is being accessed when SIV decides it needs to retry and may be able to deduce what might be doing this.

I have been looking at what I could run on a DL580 G7 and W10 does not seem to be listed as supported, have you tried W10/2019? W10 Professional only allows 2 sockets, but W10 Enterprise allows 4 (see [Windows]) so I suspect it will run OK and I am inclined to try it, what do you think?

file.php?id=78489

Attachments

  • DebugView.png (23.72 KiB, 716 views; file comment: DebugView on this system; file license: Public domain)

Reply 394 of 1037, by xjas

User metadata
Rank l33t
Rank
l33t

Here's an updated report from my Shuttle XPC running 5.47 Beta 03 showing that the [menus] crash is indeed fixed. Nice work!

It still doesn't read much info about the GF3 in that system under Win98 but I understand there isn't much that can be done about it.

I have some other platforms and GPUs to get you some test results from, but I'm waiting for a few eBay parts to come in. Will add them when I can. 😀

Attachments

  • SIV_ASTRONAUT.7z (45.94 KiB, 42 downloads; file license: Public domain)

twitch.tv/oldskooljay - playing the obscure, forgotten & weird - most Tuesdays & Thursdays @ 6:30 PM PDT. Bonus streams elsewhen!

Reply 395 of 1037, by CoffeeOne

red-ray wrote on 2020-03-07, 21:24:
CoffeeOne wrote on 2020-03-07, 19:19:

Remark: I did NOT stop any HP services this time 😀

Thank you for "hanging on in there" and trying Beta-03. I have fixed the SIV bug and SIV is dealing better with the collisions, but I feel I need to know why there are collisions.

To make progress I need to enable "SIV Driver Retry Tracing", would you like to try this? If so then get and run-as-admin https://docs.microsoft.com/en-us/sysinternals … loads/debugview then do SIV64X -drvdbg=30000000 | more and post the debugview log after SIV has been running for >= 120 seconds. On this system I just get as below, but you would also see the SMBus retries. Once I check this log then I should be able to see what is being accessed when SIV decides it needs to retry and may be able to deduce what might be doing this.

I have been looking at what I could run on a DL580 G7 and W10 does not seem to be listed as supported, have you tried W10/2019? W10 Professional only allows 2 sockets, but W10 Enterprise allows 4 (see [Windows]) so I suspect it will run OK and I am inclined to try it, what do you think?

file.php?id=78489

Hello, I tried: I opened DebugView and then started SIV from the command line with the line you posted, but not a single line appeared in the DebugView window.
Did I do something wrong?

About running Windows 10 on an HP DL580 G7:
I did not know that Windows 10 Enterprise supports 4 sockets, but it actually does since a 17xx version.
So yes, it is a nice option, but you need to check whether there is Windows 10 support for the RAID controller (built-in P410i); that's essential.

EDIT:
The driver should be this one:
https://support.hpe.com/hpsc/swd/public/detai … a4358a34bfffd39
but I am not 100% sure that it will work under Windows 10.
ONE MORE EDIT: I guess the chances are good 😁

Reply 396 of 1037, by red-ray

xjas wrote on 2020-03-07, 22:06:

It still doesn't read much info about the GF3 in that system under Win98 but I understand there isn't much that can be done about it.

Thank you for confirming that the crash has been resolved. I spotted that the fan speeds were silly and should have fixed this in the attached SIV32L 5.47 Fans-04, have I?

If you can boot NT4/W2K/WXP then much more should be reported for the GF3 and GPUs in general, that said Fans-04 should do better with the GF3 on W98.

I can see the SMART reporting that I recently added for W9x worked OK. Out of interest, did you get the Reboot Windows to Enable SMARTVSD panel? If not, I guess it was already set up.

The Mobile AMD Athl is down to what CPUID 80000002 reports, see Menu->Hardware->CPUID->CPU-0, which I suspect is down to what the BIOS sets up. Do you know why the name is strange?

Last edited by red-ray on 2020-04-25, 20:52. Edited 5 times in total.

Reply 397 of 1037, by red-ray

CoffeeOne wrote on 2020-03-07, 23:17:

I tried: I opened DebugView and then started SIV from the command line with the line you posted, but not a single line appeared in the DebugView window.

Thank you for trying, but I would need to see the Debug View screen shot to know for sure.

Was there a red line through the yellow cog-like icon, the 5th from the left in the toolbar? If so, then at a guess you did not do run-as-admin, so Debug View was unable to load its driver.

I am wondering if I could plug an SSD into the ICH10 disk controller and boot W10 from that.

As later HP servers also have the P410i and do support W10/2019 then I expect there will be a driver that runs on W10/2019.

Last edited by red-ray on 2020-03-08, 16:18. Edited 1 time in total.

Reply 398 of 1037, by CoffeeOne

red-ray wrote on 2020-03-08, 11:07:
CoffeeOne wrote on 2020-03-07, 23:17:

I tried: I opened DebugView and then started SIV from the command line with the line you posted, but not a single line appeared in the DebugView window.

Thank you for trying, but I would need to see the Debug View screen shot to know for sure.

Was there a red line through the yellow cog-like icon, the 5th from the left in the toolbar? If so, then at a guess you did not do run-as-admin, so Debug View was unable to load its driver.

It would also be worth checking Menu->Windows->Services->SIV Driver to confirm the traces are enabled. You can also use this panel to disable the traces. I advise against enabling other traces as doing this could generate vast amounts of trace data.

I am wondering if I could plug an SSD into the ICH10 disk controller and boot W10 from that.

As later HP servers also have the P410i and do support W10/2019 then I expect there will be a driver that runs on W10/2019.

OK, kernel logging was simply disabled, so I had to click on the icon, that was all that was missing.

debug.png (92.28 KiB, 687 views; file license: Fair use/fair dealing exception)

About SSDs: You can simply put an SSD in a SAS hard-drive caddy (the older type used in G2, G3, ... G7; they changed it in G8).
Basically every SATA disk works. I tried 4 normal SATA notebook disks, and that did NOT work well with a P410.
But I have very good experience with, for example, "normal" Samsung SSD Pro drives in G6 and G7 HP servers. It's important that the SSD has a working temperature sensor.
The only negative point is that you only have 3 Gb/s SATA bandwidth, but in 99% of the cases that's still OK 😁

Reply 399 of 1037, by red-ray

CoffeeOne wrote on 2020-03-08, 14:49:

OK, kernel logging was simply disabled, so I had to click on the icon, that was all that was missing.

OK, and it's much as I expected. I suspect the HP software was active when you did this. If the HP software is inactive, is the retry rate lower?

file.php?id=78525

  1. When act starts with 3x, this is some other entity reading the DIMM temperature; the first 4 are when SIV is reading the SPD data.
  2. When act starts with E8, this is some other entity switching the PCA9544 mux.
  3. When act starts with C8, this is some other entity accessing the slave 0x60 memory buffer.
  4. When act starts with C9, this is some other entity accessing the slave 0x61 memory buffer.
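
The list above can be turned into a small lookup for sorting a long DebugView log. This is only a sketch: the function name and the assumption that `act` arrives as a hex string are hypothetical, and the descriptions are copied from the list rather than from any SIV documentation.

```python
def classify_act(act):
    """Map the leading hex digits of a trace 'act' value to a description.

    Assumes 'act' is a hex string such as "E8" or "3A"; the categories
    mirror the four cases listed in the post.
    """
    act = act.upper()
    if act.startswith("3"):
        return "DIMM temperature read"
    if act.startswith("E8"):
        return "PCA9544 mux switch"
    if act.startswith("C8"):
        return "slave 0x60 memory buffer access"
    if act.startswith("C9"):
        return "slave 0x61 memory buffer access"
    return "unclassified"


# Example: tally a handful of made-up act values by category.
sample = ["3A", "E8", "c9", "C8", "00"]
labels = [classify_act(a) for a in sample]
```

Counting the labels per category would show at a glance which kind of outside access dominates the retries.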

It would be interesting to get a driver log when all the HP software is inactive to see if any of the above go away.

I am also wondering if the HP software uses any locks. From Menu->Windows->Processes, right-click on the PID button and select Locks to find out. Attached is what I just got for SIV64X so you know what to expect.

Attachments

  • RED.png (104.09 KiB, 679 views; file comment: SIV64X Locks; file license: Public domain)