VOGONS


First post, by AndrewBSSC4

Hi folks,

Hoping for some help. I'm still trying to get this Commodore PC10-III fixed. Having resolved the BIOS issues and swapped out the bad RAM, I'm now seeing intermittent RAM problems.
- Sometimes the computer detects all RAM and runs fine. All memory tests pass and software runs with no issues. A few times it has run for 9+ and even 12+ hours straight with no issues, even with warm and cold boots mixed in. But this has been very rare.
- Sometimes during boot it detects bad memory. That might be after 512KB, 384KB, 256KB, or even less. The machine lets me continue and run with the detected memory, but usually not for long: eventually it either detects more bad memory during memory tests or freezes.
- Often when I get the bad memory error above I can reset the machine and the memory test result will be different. So I might reset from a 512 and get a 384. Reset again and be back at 512. Reset and get the full 640. Reset and get another change.
- And then sometimes it'll just get "stuck" with complete memory failure in the critical 16KB region. At that point I change the BIOS to the Supersoft-Landmark BIOS and the memory problems are visible. And again the memory errors fluctuate. One test cycle might show 2 bad chips. Next might be 1. Next might be no bad chips and all tests are fine. Next might be all chips bad.
- Often when it enters the worst-case scenario it never comes back, regardless of resets. I often have to leave it for a few hours. I've noticed that it most often starts up, detects all RAM, and runs fine for the longest first thing in the morning after it has sat all night. But even then it might start having problems after running for half an hour to 2 hours.
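For reference, the detected sizes in the list above correspond to the address boundaries below on an 8088-class machine. This is a minimal sketch in Python, pure arithmetic, assuming only that the boot-time RAM count stops at the first location it considers bad. The function name is illustrative, and nothing here maps addresses to specific chips or banks on the PC10-III board.

```python
def boundary(detected_kb):
    """Return (physical address, real-mode segment) where the RAM count stopped."""
    physical = detected_kb * 1024   # number of bytes counted as good
    segment = physical >> 4         # real-mode segment = physical address / 16
    return physical, segment


if __name__ == "__main__":
    # Sizes reported in the post above.
    for kb in (256, 384, 512, 640):
        physical, segment = boundary(kb)
        print(f"{kb:>3} KB -> count stopped at {physical:#07x} (segment {segment:#06x})")
```

Since the stopping point moves between resets, the failure does not look tied to one fixed address, which may be worth keeping in mind when deciding what to probe.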

The memory chips have been tested each individually and in another system. I've tried 3 different brands. All react the same. And as mentioned the few times where it all checks out and runs - it is rock solid. So I don't believe it's the memory chips or my socketing work.

I checked power to the chips and all looks fine.

I've even tried freezing the chips and board to see if temperature would play a role. No effect.

It's driving me crazy because it's so unpredictable. Looking at the schematic, I'm leaning towards one of the LS chips failing. From what I can see there's:
- One 74LS245
- 3 x 74LS158
- One 74LS04
- One 74LS00

I'm wondering if it might be the 74LS04 or 74LS00. Unfortunately I don't have a scope as part of my tool arsenal.

So I was wondering if anyone ever encountered this and might point me in the direction to debug and fix this. Am I going in the right direction in thinking it's an LS chip failing? Or should I be looking at something else? Any simple tests I can try with a multimeter and logic probe?

Any input would be appreciated.
Thanks.

Attachments

  • PC10III_Memory.jpg (619.47 KiB, public domain)

Reply 1 of 6, by weedeewee

Is your power supply stable ? especially the +5v line.

Right to repair is fundamental. You own it, you're allowed to fix it.
How To Ask Questions The Smart Way
Do not ask Why !
https://www.vogonswiki.com/index.php/Serial_port

Reply 2 of 6, by AndrewBSSC4

I'm pretty confident the power supply is stable.

I'm not sure any power fluctuations would account for the alternating memory errors during testing cycles. When I test the power at the chips they all show the same voltage around 4.87V, including the ones that would be in error.

Reply 3 of 6, by weedeewee

Just an FYI: while 4.87 V is within the usual ±5% tolerance for a 5 V rail (4.75 V to 5.25 V), it can still be a cause of intermittent trouble.
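To put the 4.87 V reading in context, here is a minimal sketch of the margin arithmetic in Python. It assumes the common ±5% tolerance for a 5 V logic rail (4.75 V to 5.25 V); the function name and defaults are illustrative and not taken from any Commodore documentation.

```python
NOMINAL_V = 5.0
TOLERANCE = 0.05  # assumption: the common +/-5 % window for 5 V logic


def rail_margin(measured_v, nominal_v=NOMINAL_V, tolerance=TOLERANCE):
    """Report where a DC reading sits inside the allowed supply window."""
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)
    return {
        "low": low,
        "high": high,
        "in_spec": low <= measured_v <= high,
        # How far the rail can dip before leaving the window.
        "headroom_to_low": measured_v - low,
        "deviation_pct": (measured_v - nominal_v) / nominal_v * 100,
    }


if __name__ == "__main__":
    info = rail_margin(4.87)
    print(f"in spec: {info['in_spec']}")
    print(f"headroom above lower limit: {info['headroom_to_low']:.2f} V")
    print(f"deviation from nominal: {info['deviation_pct']:.1f} %")
```

A multimeter shows an averaged DC value, so a reading of 4.87 V leaves roughly 0.12 V of headroom and says nothing about short dips or ripple below that; those would need a scope or a known-good supply to rule out.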
Have the PSU and/or mainboard ever had their electrolytic capacitors replaced, that you know of?

Reply 4 of 6, by AndrewBSSC4

I've had to replace some of the ceramic caps that popped or were running hot on the board. They were at the head of the memory chips. I did not change them all as the others weren't running hot or showed any signs of concern. I guess I could change all of these.

For the power supply, I visually inspected it before power on and checked the outputs and all seemed fine so I didn't re-cap anything in the power supply.

Reply 5 of 6, by mkarcher

I'm afraid that maybe focussing on the memory system is too narrow. I agree with your conclusion that the memory chips are unlikely culprits if they work reliably in a different system, and the problem is not specific to the address range of a single bank. On the other hand, this means that anything connected to the system bus (ROM, ISA slots, on-board components) might interfere with the data during memory access.
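One way to check whether the failures really are spread across all chips rather than tied to one bank is to write down what the diagnostic ROM flags on each pass and tally it. The following is a minimal sketch in Python; the chip labels and the sample data are hypothetical placeholders, not readings from this board.

```python
from collections import Counter


def summarize(cycles, all_chips):
    """Tally which chip positions were flagged across diagnostic passes.

    cycles    -- list of sets, one per pass, holding the chip labels flagged bad
    all_chips -- every chip label the diagnostic can report
    """
    per_chip = Counter()
    for flagged in cycles:
        per_chip.update(flagged)
    return {
        "cycles": len(cycles),
        "clean_cycles": sum(1 for flagged in cycles if not flagged),
        "all_bad_cycles": sum(1 for flagged in cycles if set(flagged) == set(all_chips)),
        "never_flagged": sorted(c for c in all_chips if per_chip[c] == 0),
        "per_chip": dict(per_chip),
    }


if __name__ == "__main__":
    chips = ["U1", "U2", "U3", "U4"]   # hypothetical labels
    log = [                            # placeholder values, not measurements
        {"U1", "U3"},
        {"U2"},
        set(),
        {"U1", "U2", "U3", "U4"},
    ]
    for key, value in summarize(log, chips).items():
        print(key, value)
```

If every chip gets flagged about equally often and some passes come back clean, that points away from individual chips and towards something they share, such as a buffer, the address multiplexing, a connection, or another device on the bus.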

The intermittent problem you observe is a typical symptom of a broken or bad connection (but there may be other causes, too). Often, it is stress induced by thermal expansion of PCBs and/or chips that makes the difference between a working and a non-working system. Another plausible cause is probing with thick wires in classic DIP sockets. It might bend the springs far enough that they no longer have enough pressure to make proper contact with an inserted IC. Another possible cause is soldering such sockets with an IC inserted. If you heat up the socket while an IC is inserted, the clamping force might deform the socket when the plastic gets soft due to heat, resulting in too little clamping force after cooling down. If you soldered all sockets in the same way (for example with chips inserted), you might have damaged a lot of your sockets, causing issues with different random memory chips.

Reply 6 of 6, by AndrewBSSC4

Oh definitely I didn't solder in the new sockets with the chips in them. And also didn't push my probe into the socket when checking the sockets.

I don't think thermals are playing a role in this. As I mentioned, sometimes it will happen soon after first power on, after 12+ hours of sitting. Sometimes the machine runs for more than 9 hours with no issues. And when the board did mess up, I even tried to freeze the board and it had no effect.

Also, I'm not sure thermals would apply when you consider that the memory test loops are only about 1-2 minutes apart. I don't see how thermals over that short a period could affect 1 chip on one cycle, then 6 chips the next, then 3 chips the next, then no faults the next, etc. I've also used a thermal probe to check the temperatures of components on the board. There are no longer any areas that are warmer than expected: no more warm caps, and no memory chips running warmer than the others.

I'm still worried it might be some of those LS chips that are "aging" or may have been damaged by past problems on the board, since 11 of 16 memory chips were dead, as well as a few popped ceramic caps. The other piece you mention is the possible bus issues. Floppy controller, HDD controller, video, serial, parallel, and RTC are all on board. But those issues might not be evident without a scope. I've run the system with and without the floppy and/or XTIDE and the same issues remain.