VOGONS


First post, by tony359

Rank: Member

Hi all,

Purely for fun, I'm curious whether it's possible to work out which IC is bad on an SDRAM stick.

I've got some PC150 sticks I was sent by Sphere - they show errors on memtest.

One is 256MB and double-sided, which makes it harder, as the module's 64 data bits would be shared between the chips on the front and back.
One is 128MB - basically half of the 256MB one.

I found a schematic for the 512MB suggesting that the data lines are all connected sequentially.

Now, memtest seems to report issues always in the same area and with same or similar bits. See below.
If I understand correctly, "Err-Bits" is in hex, so the 02000000 on the first line means 0000 0010 0000 0000 0000 0000 0000 0000, i.e. only bit 25 (counting from 0 at the right) differs. It is 32 bits wide because memtest uses 32-bit patterns.
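
A minimal Python sketch of that decoding, in case it helps when reading a long list of errors. The function name is made up for illustration; it only assumes that Err-Bits is a 32-bit hex mask in which each set bit marks a data bit that read back wrong.

```python
def error_bit_positions(err_bits_hex: str) -> list[int]:
    """Return the positions (0 = least significant) of the bits set in an Err-Bits value."""
    mask = int(err_bits_hex, 16)
    return [bit for bit in range(32) if mask & (1 << bit)]


# The value from the first line of the memtest report:
print(error_bit_positions("02000000"))  # [25]
```

If several lines of the report decode to the same position, that points at a single data bit rather than a random scatter.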

I also understand that it's not a given that the ICs and the memory controller are wired 1:1 - the data lines may be swapped around to suit PCB routing. If so, it's more or less impossible to track down the faulty IC from the error bits alone. Is that correct?
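
To make the question concrete, here is a rough Python sketch of what the mapping would look like if the wiring really were 1:1. It assumes a 64-bit data bus with little-endian byte lanes (the byte at address A travels on lane A % 8), so a 32-bit word at an address ending in 0 or 8 sits on lines 0-31 and one ending in 4 or C sits on lines 32-63. The function name, the `swizzle` table and the example addresses are hypothetical; the real routing of any given board or module would have to come from a schematic or from probing.

```python
def module_data_line(address: int, bit_in_word: int, swizzle=None) -> int:
    """Map a failing bit of a 32-bit word to a data line (0-63) of a 64-bit module.

    Assumes little-endian byte lanes: the byte at address A uses lane A % 8.
    `swizzle` is an optional, hypothetical 64-entry list modelling routing
    that is not 1:1 (swizzle[cpu_line] -> module_line).
    """
    if address % 4 != 0 or not 0 <= bit_in_word < 32:
        raise ValueError("expected a 4-byte-aligned address and a bit index 0-31")
    line = (address % 8) * 8 + bit_in_word  # lower or upper half of the bus
    return line if swizzle is None else swizzle[line]


# Hypothetical addresses, bit 25 as in the Err-Bits value 02000000:
print(module_data_line(0x00100000, 25))  # lower half of the bus
print(module_data_line(0x00100004, 25))  # upper half of the bus
```

Under these assumptions the same Err-Bits value can mean two different data lines depending on the address, so the address column of the report matters as much as the bit mask.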

I believe one IC behaves differently when warmed up - the error count goes up roughly 10x and memtest even crashes at some point. So heat can be used as a test, but I'm wondering if there is another way.

Thanks!

My Youtube channel: https://www.youtube.com/@tony359

Reply 1 of 5, by PC@LIVE

Rank: Oldbie

tony359 wrote on 2025-01-13, 17:43:
Hi all, […]

Hi Tony,
This is a very interesting topic 🧐
Honestly, the only thing I can think of is to check all the traces for breaks; the SMD components could also be defective. That assumes the memory chips themselves are all OK.
Another option, with the chips removed, is to take measurements and look for an anomalous value on any chip; the chip's datasheet (PDF) might help here. Since your module is double-sided, you could try removing the chips on the side that is normally left unpopulated on single-sided modules. If the module then works, one or more of the removed chips is bad. If not, remove the remaining chips, fit the ones you took off first, and try again. I don't know whether there is a tool for testing individual chips.
Watching your YouTube video I saw that you don't have a Socket 370 board. If you're interested I can give you a VIA chipset one for free (I don't have time to work on it, and I have others just like it), and if you want I can send you one or two motherboards with other sockets. From what I remember the 370 one has a non-working IDE, but I've never tested it.
If so, send me a message 💬

AMD 286-16 287-10 4MB HD 45MB VGA 256KB
AMD 386DX-40 Intel 387 8MB HD 81MB VGA 256KB
Cyrix 486DLC-40 IIT387-40 8MB VGA 512KB
AMD 5X86-133 16MB VGA VLB CL5428 2MB and many others
AMD K62+ 550 SOYO 5EMA+ and many others
AST Pentium Pro 200 MHz L2 256KB

Reply 2 of 5, by tony359

Yes, I need to check those SMD resistors, though I think the stick wouldn't work at all with a failed one. But yes, it's on my list.
The ICs are BGA, so a bit of a pain to move around!

I know Alex of Bits und Bolts already tried to identify the bad ICs with memtest and failed.
One thing I'd like to try is to pull some data lines to ground while memtest is running and see if the outcome changes, gets worse or stays the same. If it stays the same, it might mean that that data line is the one connected to the bad IC?
I tried with a 10K resistor but nothing changed; I'd imagine I need a stronger pull-down (a lower resistance) to actually pull the line down. I'll do some tests. If someone has ideas, I am all ears 😀

The socket 370: you are very kind but the truth is that I don't need one! If it's faulty, by all means, the fun is in repairing them! Same for the other boards 😀 The non-working IDE might be interesting though 😀

Reply 3 of 5, by PC@LIVE

tony359 wrote on 2025-01-13, 23:20:
Yes, I need to check those SMD resistors though I think with a failed one the stick wouldn't work at all. But yes, it's on my li […]

Well, the chips being BGA is indeed a problem. With normal (non-BGA) chips I think it could have been done; not simple, of course, and impossible for me. I saw in your recent videos that you used a trick with a shaped copper wire, which may be the way to go without the proper tools. Unfortunately 😣 with BGAs, if I'm not mistaken, the only way to remove them is hot air. I don't know if you've already tried this, but if you heat them while memtest is running and some of the errors disappear, could it be that some of the BGA solder joints need reflowing?
Besides your channel I follow other channels on old hardware, including Alex's from BuB (and others I think you follow too, such as Phil's Computer Labs), and honestly I don't remember whether he made a video on hunting for failed memory chips with memtest. I know there are other diagnostic programs, perhaps even better ones, that might make it clearer where exactly the bytes end up. If I'm not mistaken, Quick Tech Pro includes a memory diagnostic that is quite good at finding memory faults. I used it several years ago to find defective DDR RAM sticks, but those threw so many errors that I knew immediately they were bad and didn't wait for the test to finish.
One more thing I thought of that you may be able to check easily: some motherboards, ASUS and the like, have a jumper to raise the RAM voltage 😬. I don't know whether anything like that exists for SDRAM, but if you could raise the voltage from 3.3V to 3.5V, might the errors disappear 🫥?
As for the 370 board, no problem 😉. Honestly I've never tested it, because I have two other identical ones; I remember it's an ECS board with a VIA chipset. If you're curious, I'll post pictures of it here on Vogons (in Test troubleshooting MB PC@LIVE), and if it looks OK visually I may give it a quick test.
I forgot: if the double-sided module has the power pins for each side's chips on that same side, you could insulate one side with adhesive tape, a bit like you did with the pins of the slot 1 adapters. Maybe 🤔 you'll get lucky 🍀 and half of the RAM will work?

Reply 4 of 5, by tony359

Here is Alex's video: https://www.youtube.com/watch?v=ufwwHd_c2Ho

I found that heating up a section of the RAM stick would completely freeze the system, so I have a lead. But I was curious whether there is a more scientific way to do it - the same as you'd do with an old system, where the RAM is much simpler to follow via schematics and a specific bit is simple to track down.

The picture above clearly shows that the issue happens at the beginning of the RAM, on the same bits. I might want to experiment a bit 😀

Reply 5 of 5, by PC@LIVE

tony359 wrote on 2025-01-14, 10:36:

here is Alex's video: https://www.youtube.com/watch?v=ufwwHd_c2Ho

I found that heating up a section of the RAM stick would completely freeze the system so I have a lead. But I was curious if there was a more scientific way to do it - the same as you'd do with an old system where the RAM is much simpler to follow via schematics and a specific bit is simple to track down.

The picture above clearly says the issue happens at the beginning of the RAM on the same bits. I might want to experiment a bit 😀

Thanks ☺️ for the link,
I watched the three videos, and I admit it is quite complicated to work out exactly where the errors are, that is, which chip they occur in 🤔. I don't know if there is anything that shows where a given byte ends up. To simplify: if the RAM is 32 MB and there are 16 chips, in practice each chip stores 2 MB. In theory, though, any given 2 MB would be spread across all 16 chips rather than sitting in just one; if it did sit in a single chip, finding where an error is would not be too difficult 😩
In that case, if for example the error was at the beginning, it would simply be in chip 1; if it was at the end, it would be chip 16 causing the error. And if it was at some point X, looking at the map 🗺 divided into 16 would immediately show where X is, that is, which chip it is in.
With the RAM you have, the main problem is the BGA chips. Alex found the defective chip because he was able to remove some chips; by the end of the video he had narrowed it down to just two, and then, by testing the one he thought was OK (which turned out to be the case), he found the fault. If I understood correctly 😌, he plans to run some tests on that chip so he can see which chips work and which are faulty.
Also bear in mind that a double-sided SDRAM module corresponds to two 72-pin SIMMs, so ideally half an SDRAM module would be one 72-pin SIMM. In short, it would be harder to tell which side the defective chip is on. Maybe 🤔 to understand more you could make a spreadsheet, enter the errors, and see from the chart which chip they fall in. I have never done anything like this, but assuming there are 16 chips, you could make 16 columns and, for example, 20 rows if they are 2 MB each.
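
The spreadsheet idea could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a verified model of these modules: it takes the 32 MB / 16 chip example above, assumes each chip supplies an equal slice of a 64-bit data bus (4 bits per chip here) with little-endian byte lanes, and uses made-up error entries. Rows are address ranges and columns are the assumed chips; the function name and the sample data are hypothetical.

```python
from collections import Counter


def tally_errors(errors, module_bytes, chips=16, rows=20, bus_bits=64):
    """Count failing bits per (address-range row, chip column).

    errors: iterable of (address, err_bits) pairs with 4-byte-aligned
            addresses and 32-bit masks, as memtest reports them.
    Assumes chip k supplies data bits k*w .. k*w + w - 1, w = bus_bits // chips.
    """
    bits_per_chip = bus_bits // chips
    table = Counter()
    for address, err_bits in errors:
        row = address * rows // module_bytes
        base = (address % (bus_bits // 8)) * 8  # 0 or 32 on a 64-bit bus
        for bit in range(32):
            if err_bits & (1 << bit):
                table[(row, (base + bit) // bits_per_chip)] += 1
    return table


# Made-up sample entries, not taken from the memtest screenshot:
sample = [(0x00100000, 0x02000000), (0x00100008, 0x02000000), (0x00200004, 0x00000001)]
for (row, chip), count in sorted(tally_errors(sample, 32 * 1024 * 1024).items()):
    print(f"row {row:2d}  chip {chip:2d}  errors {count}")
```

If the errors pile up in one column whatever the row, that would suggest a data-line (chip) fault rather than an address-dependent one. On a double-sided module each column could still correspond to one chip on each side, so this alone would not say which side is bad.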
