VOGONS


386 board dead?


Reply 40 of 51, by Guld

Rank Member
myne wrote on 2025-02-12, 03:26:
Completely different generation and symptoms, but an athlon I had, had bad caps. The only symptom I noticed was that the mosfet area got super hot.
Once changed, they were quite cool.
I believe they were leaking too much.
Are any caps noticeably lower in resistance or hotter than the others?

I realise mine were electrolytic, and yours are tantalums, but caps do tend to fail short. I don't know the speed of this process, but it's possible it could be slow.

Not that I've found yet, I replaced all of the tantalums on the 5V today and checked the old ones once out and got:
9.15 - 11.47 uF (should be 10uF, so, nothing odd there)
1.2-1.6 ohm ...with one of them at 2.2 ohm. maybe a little high, but honestly I didn't think it was that bad, just higher than the others.
loss at 0.4-0.6%
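
For reference, the spread of those capacitance readings around the nominal 10 uF can be worked out as in the sketch below. The 20% tolerance is an assumed figure for typical tantalums, not something read off these parts, and the helper names are made up for illustration.

```python
NOMINAL_UF = 10.0
ASSUMED_TOLERANCE_PCT = 20.0  # assumption: typical tantalum tolerance, not from a datasheet


def deviation_percent(measured_uf, nominal_uf=NOMINAL_UF):
    """Return how far a reading is from nominal, in percent."""
    return (measured_uf - nominal_uf) / nominal_uf * 100.0


def within_tolerance(measured_uf, tolerance_pct=ASSUMED_TOLERANCE_PCT):
    """True if the reading is inside the assumed tolerance band."""
    return abs(deviation_percent(measured_uf)) <= tolerance_pct


if __name__ == "__main__":
    for reading in (9.15, 11.47):  # lowest and highest readings from the post
        print(f"{reading} uF: {deviation_percent(reading):+.1f}% "
              f"(within assumed tolerance: {within_tolerance(reading)})")
```

Under that assumption both extremes sit inside the band, which matches the "nothing odd there" reading of the numbers.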

The one with slightly higher resistance appears to be the one for the CMOS voltage. Replaced it, no real change though.

Starting to wonder if the system is really having issues with memory beyond the 3712kB config that seems to work so well. It seems to be unstable to varying degrees with any of my configs above that (with cache enabled; with cache disabled they all work).

Another odd thing I happened to notice today is that when it's running normally, it's pulling 2.6-2.7 amps on the 5V line, but immediately drops to 2.1 amps when it freezes. Could just be that several chips are disabled, etc. and might just be a result of the freeze, but I found it at least noteworthy. It was measured with the ATX2AT I'm using, which shows the current draw while the system is running.

Is there any reason the tantalums on the -5, -12, or +12 could cause an issue? I didn't think so and I don't have any cards in that are using those so I haven't touched them.

Reply 41 of 51, by DaveDDS

Rank Oldbie
Guld wrote on 2025-02-12, 00:56:

But it's still fairly unstable and I'm still not sure why the fan is helping. It doesn't always help, but clearly there's a heat related issue somewhere.

So it's unclear to me right now if adding the additional capacitors has helped the system just through adding additional capacitance?

I wouldn't discount some kind of bad-solder/microcrack connection issue...

Two not terribly obvious things that can affect such an issue are temperature/expansion,
and small/specific physical strain (like added parts hanging off the board)

A technique I've used more than once to find this kind of thing is to mount the board vertically
(so you can access both sides) and securely (so it can't easily flex/bend), then use a small non-conductive
but hard tool (like a stiff plastic rod). With it running, tap around the board/components to see where it
stops running... You may be able to narrow it down with gentler tapping until you get very close to
the failing connection... (and use external fans as needed to keep temps constant during the test).

At one time I wrote a test program to generate a continuous fast two-tone (where my program had to
reprogram the frequency) beep from the on-board speaker circuit (and some resistance can keep this to
a tolerable sound level) - so I could tell when the system "died" without having to keep glancing at a screen...
(and you don't need "lots of stuff" to run something like this - a video display and a floppy disk (hopefully
these are built into the mainboard - otherwise add only the cards that are needed) - you can boot DOS from a floppy)
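
A rough sketch of the arithmetic behind such a two-tone beeper, assuming the standard PC speaker arrangement (timer channel 2 of the 8253/8254, clocked at 1,193,182 Hz). This is not the original test program: the tone pair and the function names are assumptions, and the code only lists the port writes rather than touching any hardware.

```python
PIT_CLOCK_HZ = 1_193_182      # input clock of the PC's 8253/8254 timer
PIT_COMMAND_PORT = 0x43
PIT_CHANNEL2_PORT = 0x42      # channel 2 drives the speaker
SPEAKER_GATE_PORT = 0x61      # bits 0 and 1 gate the timer onto the speaker


def pit_divisor(freq_hz):
    """Divisor to load into PIT channel 2 for the requested tone."""
    divisor = round(PIT_CLOCK_HZ / freq_hz)
    if not 1 <= divisor <= 0xFFFF:
        raise ValueError("frequency out of range for a 16-bit divisor")
    return divisor


def tone_port_writes(freq_hz):
    """List of (port, value) writes that reprogram the tone frequency."""
    divisor = pit_divisor(freq_hz)
    return [
        (PIT_COMMAND_PORT, 0xB6),             # channel 2, low/high byte, square wave
        (PIT_CHANNEL2_PORT, divisor & 0xFF),  # low byte first
        (PIT_CHANNEL2_PORT, divisor >> 8),    # then high byte
    ]


if __name__ == "__main__":
    for freq in (1000, 1500):  # assumed tone pair, not from the original program
        for port, value in tone_port_writes(freq):
            print(f"out 0x{port:02X}, 0x{value:02X}")
```

On the real machine the speaker would also need bits 0 and 1 of port 0x61 set. Because the CPU has to keep rewriting the divisor to alternate the tones, a freeze shows up as the beep sticking on one pitch.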

Dave ::: https://dunfield.themindfactory.com ::: "Daves Old Computers"->Personal


Reply 42 of 51, by Guld

Rank Member
DaveDDS wrote on 2025-02-14, 21:08:
I wouldn't discount some kind of bad-solder/microcrack connection issue...

Two not terribly obvious things that can affect such an issue are temperature/expansion,
and small/specific physical strain (like added parts hanging off the board)

A technique I've used more than once to find this kind of thing is to mount the board vertically
(so you can access both sides) and securely (so it can't easily flex/bend), then use a small non-conductive
but hard tool (like a stiff plastic rod). With it running, tap around the board/components to see where it
stops running... You may be able to narrow it down with gentler tapping until you get very close to
the failing connection... (and use external fans as needed to keep temps constant during the test).

Yeah, I haven't discounted it yet. Thanks for the suggestions!

So what I've basically determined so far is:
1) with external cache disabled, the system is stable in all memory configs I've tried.
2) with external cache enabled, the only stable configuration is with only bank 0 filled with 4MB or less of total RAM. This config works for long periods with no fan at all.
3) with external cache enabled, all other memory configs run for varying lengths, but none more than an hour or so (fan or no fan)
4) Having the fan on does seem to generally increase the length of time the system works under #3.
5) Also, filling bank 0 with 4MB SIMMs (16MB total) does not work with cache enabled.

So..from this it seems to me that I can conclude:
A) the external cache chips and the TAG themselves are working fine (due to #2 above)...with a small caveat (due to only working with up to 1MB memory, see notes below).
B) the CPU to DRAM interface is fine (due to #1/#2 above)

So I'm focusing on the area of the cache/tag to DRAM traces and possibly the chipset 82C495XLC chip to cache/tag/SRAM.

The fact that problems start when I go > 4MB of memory does make me suspicious that a trace that is not used in the <= 4MB configuration might be the one causing issues with > 4 MB. #5 above seems to point in that direction.

Since each SIMM slot can address up to 4 MB, I wonder if address lines A10 and A11 are unused when I only have 1 MB SIMMs installed (as per #2 above). Which could mean focusing on how the cache interacts with those specific address lines could be a clue. But I'm still trying to understand exactly how the system addresses the memory on the system and through the cache/TAG, so could be that my sphere of ignorance is too large right now 😀.

But looking at my 1MB SIMMs, I can see that pins 19 and 24 (address lines 10/11) are not connected. So...could be a pointer towards a bad trace on those lines.
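
The address-line reasoning above can be sketched as follows, assuming byte-wide 30-pin SIMMs with equal row and column address widths and address pins A0..A11 on the connector. The function names are made up for illustration.

```python
MIB = 1024 * 1024
SIMM_ADDRESS_PINS = 12  # A0..A11, assuming the usual 30-pin SIMM pinout


def multiplexed_lines_needed(capacity_bytes):
    """Address pins needed by a byte-wide DRAM SIMM with row/column multiplexing.

    Assumes the capacity is a power of two and rows and columns share the pins.
    """
    address_bits = capacity_bytes.bit_length() - 1  # log2 of the capacity
    return (address_bits + 1) // 2                  # half the bits per phase, rounded up


def unused_lines(capacity_bytes):
    """Names of the SIMM address pins a module of this size leaves unconnected."""
    used = multiplexed_lines_needed(capacity_bytes)
    return [f"A{n}" for n in range(used, SIMM_ADDRESS_PINS)]


if __name__ == "__main__":
    for size_mib in (1, 4, 16):
        print(f"{size_mib} MB SIMM: uses A0..A{multiplexed_lines_needed(size_mib * MIB) - 1}, "
              f"unused: {unused_lines(size_mib * MIB)}")
```

Under these assumptions a 1 MB SIMM leaves both A10 and A11 unconnected, a 4 MB SIMM uses A10, and only a 16 MB SIMM uses A11.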

If anyone has any recommendations for good books to read up on how exactly the memory/cache/TAG/CPU/chipset interact, I'd like to know how I can learn more! The chipset documentation is somewhat helpful but doesn't go into details.

Reply 43 of 51, by Nexxen

Rank l33t

Have you checked for any smd cap to be too hot?
Maybe the ones that are connected to bank 1, if it is the one having trouble.
Just a hunch.

You could put some IPA on those and observe if any one evaporates sooner.

PC#1 Pentium 233 MMX - 98SE
PC#2 PIII-1Ghz - 98SE/W2K

"One hates the specialty unobtainium parts, the other laughs in greed listing them under a ridiculous price" - kotel studios

Reply 44 of 51, by Guld

Rank Member

SOLVED
TLDR: the 32kx8 SRAM in the TAG which only requires 8kx8, pin 1 floating.

Alright, so as suggested above, I tried several more things. And eventually had the board so I could tap on it and see if I could isolate the issue.

When I tapped on the TAG, specifically on the end where pin 1 is...I could reliably get it to freeze.

So, I went to the back of the board and tried to isolate/confirm more details, and first discovered that tapping on pin 2 (A12) would cause it to fail...but eventually determined that pin 1 (NC or A14...details coming) and pin 3 (A7) could sometimes cause it too. Eventually I settled on pin 1 being the one to most reliably cause the issue.

As noted earlier in this thread, the TAG on this system only requires an 8kx8 SRAM, but I received it with a 32kx8 SRAM. With the differences being:
pin 1: NC on 8kx8, A14 on 32kx8
pin 26: CS on 8kx8, A13 on 32kx8
All other pins are the same.

As noted earlier, pin 26 is always high on this board...so..not really an issue, it just holds A13 high on the 32kx8, so only one half of the chip is ever addressed (effectively making it 16kx8).
My error was that I previously looked at pin 1 with my scope and noted that it was always 0 volts, so I assumed it was grounded. In which case the 32kx8 effectively becomes an 8kx8 and we're all good.

But....I tried to verify and pin 1 is truly NOT connected to anything, which works fine for the 8k x 8 SRAM, but leaves the A14 line on the 32k x 8 SRAM floating.
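
A toy model of why a floating A14 matters, assuming the pin mapping described above (A13 held high via pin 26, A14 on pin 1). Treating the floating pin as a coin flip on every access is a deliberate simplification, and `TagSram32k` and `count_mismatches` are hypothetical names for illustration only.

```python
import random


class TagSram32k:
    """Toy model of a 32kx8 SRAM sitting in a socket wired for an 8kx8 TAG."""

    def __init__(self, a14_source):
        self.cells = [0] * (32 * 1024)
        self.a14_source = a14_source  # callable returning 0 or 1 for pin 1 (A14)

    def _physical_address(self, tag_index):
        a13 = 1                       # pin 26 is always high on this board
        a14 = self.a14_source()       # sampled on every access
        return (tag_index & 0x1FFF) | (a13 << 13) | (a14 << 14)

    def write(self, tag_index, value):
        self.cells[self._physical_address(tag_index)] = value & 0xFF

    def read(self, tag_index):
        return self.cells[self._physical_address(tag_index)]


def count_mismatches(a14_source, trials=1000, seed=0):
    """Write a tag byte, read it straight back, and count wrong read-backs."""
    rng = random.Random(seed)
    sram = TagSram32k(a14_source)
    mismatches = 0
    for _ in range(trials):
        index = rng.randrange(8 * 1024)
        value = rng.randrange(1, 256)
        sram.write(index, value)
        if sram.read(index) != value:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    noise = random.Random(1)
    print("grounded:", count_mismatches(lambda: 0))
    print("tied high:", count_mismatches(lambda: 1))
    print("floating (coin flip):", count_mismatches(lambda: noise.getrandbits(1)))
```

With A14 held at either level the same 8k window is always used and every read matches its write. When A14 can change between the write and the read, the read lands in a different 8k window and returns stale tag data.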

I tried to wiggle pin 1 to see if it was originally connected to ground or 5V, either of which would work for a 32kx8, but floating apparently DOES NOT. Somehow that pin floating causes errant behavior on the chip from time to time, although it is unclear to me why the different memory configs behave differently; perhaps pin 2 (A12) is more active in some configs and induces a voltage on pin 1 somehow? I'd be interested if anyone has any ideas.

So I grounded pin 1 and the following happened:
Code 0x13 no longer appears at boot! Even in configs where it ALWAYS appeared.
The system works for hours now with all the memory configs I've tried!

Moral:
Don't trust that 0V on the scope means ground! 😁

I have no idea why the fan made some difference in the behavior either. Possibly just a change in EMI affecting pin 1?
And I'm a little surprised that it's so sensitive, but it seems conclusive.

I plan to add a jumper to ground on pin 1 to keep the system stable; it would work with both 8k and 32k SRAM. I do not have any 8k x 8 SRAM to try on it, but that would also likely fix it.

Thanks for the help and the tips everyone! Will post more if I see anything else but it appears to be working!

Reply 45 of 51, by DaveDDS

Rank Oldbie
Guld wrote on 2025-02-17, 17:29:
... but floating apparently DOES NOT. Somehow that pin floating causes errant behavior on the chip from time to time, although it is unclear to me why the different memory configs behave differently; perhaps pin 2 (A12) is more active in some configs and induces a voltage on pin 1 somehow? I'd be interested if anyone has any ideas.
...
I have no idea why the fan made some difference in the behavior either. Possibly just a change in EMI affecting pin 1?
And I'm a little surprised that it's so sensitive, but it seems conclusive.

Unlike TTL, where a floating input pretty much always reads as high, CMOS pins are very high-impedance and can toggle high/low at the slightest
external noise (like an adjacent pin toggling or slightly different electrical noise, either in lines or radiated somewhere).
Makes perfect sense to me that the system running different software, or a different hardware/power config (like the fan), could make a difference.

A useful tool you can make is two high value resistors (a meg or more), one to Gnd, one to Vcc .. set up so you can clip the
arrangement to a scope probe with the center going to the probe input and a test pin to probe... this will sit at Vcc/2 if
unloaded, but will show Gnd/Vcc if actually driven.
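
A quick sketch of what that probe adapter would read, assuming 1 Mohm resistors, a 5 V rail and a driven pin with roughly 100 ohm output resistance. Those values and the function name are assumptions, and the scope probe's own input resistance is ignored here (in practice it pulls the unloaded reading somewhat below Vcc/2).

```python
def node_voltage(vcc=5.0, r_pullup=1e6, r_pulldown=1e6,
                 driver_voltage=None, driver_resistance=100.0):
    """Voltage at the test point, found by balancing currents into the node.

    driver_voltage=None models a pin that is not driven at all (floating).
    """
    conductances = [1.0 / r_pullup, 1.0 / r_pulldown]
    sources = [vcc, 0.0]
    if driver_voltage is not None:
        conductances.append(1.0 / driver_resistance)
        sources.append(driver_voltage)
    weighted = sum(v * g for v, g in zip(sources, conductances))
    return weighted / sum(conductances)


if __name__ == "__main__":
    print(f"floating pin: {node_voltage():.3f} V")
    print(f"driven low:   {node_voltage(driver_voltage=0.0):.4f} V")
    print(f"driven high:  {node_voltage(driver_voltage=5.0):.4f} V")
```

A floating pin sits at mid-rail, while a driven pin swamps the weak divider and reads essentially Gnd or Vcc, which is what makes a true 0 V distinguishable from "nothing connected".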

Dave ::: https://dunfield.themindfactory.com ::: "Daves Old Computers"->Personal


Reply 46 of 51, by Deunan

Rank Oldbie
Guld wrote on 2025-02-17, 17:29:

SOLVED
TLDR: the 32kx8 SRAM in the TAG which only requires 8kx8, pin 1 floating.

And I did say to put a good chip in there. On the bright side you've learned much more this way. Good job.

Guld wrote on 2025-02-17, 17:29:

I have no idea why the fan made some difference in the behavior either. Possibly just a change in EMI affecting pin 1?

CMOS inputs are like very small capacitors (a few pF at most; the better the production process, the smaller it is). The capacitance is from the gate to the channel. I/O pins on pretty much any modern chip will also have protection diodes, otherwise the gate insulation would be blown and the input permanently damaged by ESD. These diodes have very, very small leakage currents, but not zero, and the leakage depends on the temperature. So at any given temperature one diode might have higher leakage than the other and be able to drive the input to L or H level. But if the "stable" floating voltage ends up in the switching region then all bets are off. And it doesn't take much to influence that voltage because of the very small currents.
The fan moves air molecules, which can be charged, and there will be moisture in it. So it might help drive the input one way or the other, in addition to temperature. Long story short floating CMOS inputs are good sources of randomness.
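
To put rough numbers on how little it takes, here is a sketch of how long a floating input would need to drift by a given voltage, using t = C * dV / I. The 5 pF and 10 pA figures are illustrative assumptions only, not measurements from this board or values from a datasheet.

```python
def drift_time_seconds(capacitance_f, net_leakage_a, delta_v):
    """Time for a floating input to drift by delta_v volts: t = C * dV / I."""
    return capacitance_f * delta_v / net_leakage_a


if __name__ == "__main__":
    c_in = 5e-12        # assumed input capacitance: 5 pF
    leakage = 10e-12    # assumed net leakage imbalance: 10 pA
    swing = 2.5         # volts from one rail to mid-supply
    print(f"{drift_time_seconds(c_in, leakage, swing):.2f} s to drift {swing} V")
    # If the imbalance doubles (for example with temperature), the time halves.
    print(f"{drift_time_seconds(c_in, 2 * leakage, swing):.3f} s at twice the leakage")
```

With numbers in that range the pin can wander across the switching region within seconds, and a small change in leakage or stray charge shifts when and whether it does.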

Reply 47 of 51, by Guld

Rank Member

Thanks for the explanation Dave and Deunan, always good to learn something new.

And thanks Deunan, you were right to ask about the chips, I should have just done a better job of verifying pin 1 was connected to something constant and not floating for the TAG 😁.

Reply 48 of 51, by nocash

Rank Newbie
Guld wrote on 2025-01-22, 00:52:

I'm not sure if pin 1 on the TAG is actually connected to anything. It's possible, but I haven't found anything yet. I should probably probe it for good measure and make sure it's at least holding either ground or 5v.

Yes, make sure. Did you?

Reply 49 of 51, by Guld

Rank Member
nocash wrote on 2025-02-20, 02:21:
Guld wrote on 2025-01-22, 00:52:

I'm not sure if pin 1 on the TAG is actually connected to anything. It's possible, but I haven't found anything yet. I should probably probe it for good measure and make sure it's at least holding either ground or 5v.

Yes, make sure. Did you?

Yeah, I have. As best I can tell, I've not found a connection to anything, not even temporarily if there was a bad connection. For the original board spec for an 8k x 8 SRAM it would not normally be connected unless they planned for someone to be able to use a 32k x 8 in its place.

So, unless it's somehow completely disconnected via a break, it was never connected; I can see no evidence of a break, and the solder connection seems solid. No signs of traces going to it on the top or bottom of the board, although all the chip sockets do make it very difficult to see much, and of course I can't see the middle board layer(s). I tried shining a light around it and couldn't see much through the board unfortunately.

I am running an ATX2AT on it, so if I ever see that indicate a shutdown due to an over current, I'll know what to suspect. It's run for about 8 hours and hasn't shown any issues yet.

Reply 50 of 51, by nocash

Rank Newbie

Ouch, sorry, I had missed reading page 3 of this thread, where you had already solved the pin 1 issue.

Reply 51 of 51, by SWZSSR

Rank Newbie

I have fixed dozens of small 386 boards like this one, I had similar cache \ tag issues as above on a few.

I'd highly recommend checking the legs on the chipset. They become loose due to the board flex of inserting ram\transporting etc.

Find yourself a digital scope and start probing each leg to see if there is any movement.

M919(3.4bf)/5x86@180/Banshee/SoundscapeElite/DOS6.22
5TH/Dual233MMX/MGA2164W/Voodoo2/AWE64Gold/NT4.0
P5A(1.6)/K6-3+550/Ti500/EWS64XL/WIN9X
P6S5AT/Tualatin1.4/980XGL/DMX6Fire/XPLite
CUV4XDLS/Dual1.4TUAL/HD4670AGP/XONAR/XP
680i/QX6800/3x8800U/XiFi/VISTAx64