VOGONS


286 motherboard repairs

Reply 40 of 66, by Deunan

Skip94 wrote on 2022-02-22, 06:46:

The modified POST 4 seems to work exactly as expected, holding on C0,--, then 00,C0, before running the test as before.

Nope, that actually failed the sanity check. The first code should be C003, not C0--, which means the first write failed to register on the POST card. That really should not happen; I've been using similar code on 386 and 486 mobos without such issues. But I do know that POST cards can be slow to process I/O writes for some reason, perhaps due to oversimplified signal decoding.

So here's yet another POST4 variant. This time I've added ~20ms delay loops before and after every access to the POST card. It could be down to the problems your mobo has, but we need reliable readouts; without them the results will always be confusing. If we can't use the POST card, maybe we can take advantage of the serial ports on your mobo. I assume you have another PC with a serial port, or an RS232 to USB dongle? For that I need to brush up on low-level 8250 init code, it's been quite some time since I last did anything of the sort.
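
As a rough illustration of one part of 8250 init: the baud rate is set by a 16-bit divisor loaded through the divisor latch. The sketch below is Python rather than the assembly a ROM test would need, and it assumes the usual PC serial port clock of 1.8432 MHz. The function name is made up for illustration and is not part of any of the POST test files.

```python
UART_CLOCK_HZ = 1_843_200  # assumed: standard PC serial port crystal


def divisor_latch_bytes(baud):
    """Return the (low, high) divisor latch bytes for the requested baud rate."""
    divisor, remainder = divmod(UART_CLOCK_HZ, 16 * baud)
    if remainder != 0 or not 1 <= divisor <= 0xFFFF:
        raise ValueError(f"baud rate {baud} is not exactly reachable")
    return divisor & 0xFF, divisor >> 8


low, high = divisor_latch_bytes(9600)
print(f"low byte={low:#04x} high byte={high:#04x}")
```

With that clock, 9600 baud works out to a divisor of 12, and 115200 baud to a divisor of 1.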

Attachments

  • POST4.7z (433 bytes, 29 downloads, public domain)

Reply 41 of 66, by Skip94

Deunan wrote on 2022-02-22, 11:06:

Nope, that actually failed the sanity check. The first code should be C003, not C0--, which means the first write failed to register on the POST card. That really should not happen; I've been using similar code on 386 and 486 mobos without such issues. But I do know that POST cards can be slow to process I/O writes for some reason, perhaps due to oversimplified signal decoding.

Oops, sorry, I think I misunderstood slightly.
I've run it a bunch more times. Perhaps 1 in 10 times, I do get C0,03.

Apologies, I forgot to post the POST 5 video before I went to bed! Here it is https://youtu.be/9Z4MxB8AQgc
The latest POST 4 however gives C0,03 every time.
Andrew

Reply 42 of 66, by Deunan

Well, the POST5 test is inconclusive due to the same issue: some (most, actually) of the results are not displayed. Please make a video of the new POST4 that does display C003 every time; it also has the fixes in the main testing loop.
Here's another version of POST5 test with the same modifications (longer delays to give the POST card ample time to capture it). Please try that too, and make sure to have the RTC chip pulled out when running it.

I still have some more ideas on why missing DMA chips would cause the system to not boot, but that is only for my curiosity, and we can maybe revisit that later if necessary. Since you tested them on another mobo and they don't mess up the bus there, I would at least assume that it is safe to have them installed. And they will be needed to run RAM refresh eventually. As for the missing reset button, well, you could try those two headers near the PCB edge with the memory banks. Try shorting each pin there to GND through a 1k resistor. A value of 1k will be enough to prevent any outputs (or even 5V pins) from becoming shorted, and it should still pull low enough to cause a reset if that input is somewhere there. Otherwise some mobo modification would be needed, like maybe buffering the PWR_GOOD signal from the PSU. All that being said, it's not strictly needed for any of these tests, just an idea for the future.

Attachments

  • POST5.7z (419 bytes, 28 downloads, public domain)

Reply 43 of 66, by weedeewee

Deunan, DMA channel 0 is normally used for memory refresh. Googled it, found the song... and it's mentioned on the wiki page for the Intel 8237.

Reply 44 of 66, by Skip94

This is the latest version of POST 4
https://youtu.be/5PRavSbwt9w

And the latest version of POST 5
https://youtu.be/L-G-y5c7Cz4

I had a quick play with a 1k resistor and the unidentified pin headers. No joy so far, but I'll have a better look some time.

I should also have said in response to your previous post, yes, I do have access to a PC with a serial port.

Cheers
Andrew

Reply 45 of 66, by Deunan

weedeewee wrote on 2022-02-22, 21:55:

Deunan, DMA channel 0 is normally used for memory refresh.

I'm aware, but we are not even getting past a basic bus test, there's a long way from here to getting DRAM refresh working. This test is run from ROM with nothing but CPU registers.

Reply 46 of 66, by Deunan

Skip94 wrote on 2022-02-23, 07:09:

This is the latest version of POST 4
And the latest version of POST 5

The good news - now the tests are consistent, both before and after values are shown on the POST card. Also, the way I've constructed the tests proves the 16-bit data bus from BIOS ROM to CPU is intact.
The bad - it seems that on every NVRAM pattern bits 2 and 3 are stuck high and bit 6 is stuck low. The DMA register test is completely wrong.
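
For anyone following along, here is a minimal Python sketch of how stuck bits can be read off a set of write/readback pairs. It is not the POST test code itself, and the sample data is simulated to mimic the fault described above (bits 2 and 3 forced high, bit 6 forced low) rather than taken from the videos. The names `find_stuck_bits` and `faulty_read` are invented for this example.

```python
def find_stuck_bits(pairs, width=8):
    """Return (stuck_high, stuck_low) bit masks from (written, read_back) pairs."""
    mask = (1 << width) - 1
    always_one = mask    # bits that read back as 1 in every sample
    always_zero = mask   # bits that read back as 0 in every sample
    wrote_zero = 0       # bits that were written as 0 at least once
    wrote_one = 0        # bits that were written as 1 at least once
    for written, read_back in pairs:
        always_one &= read_back
        always_zero &= ~read_back & mask
        wrote_zero |= ~written & mask
        wrote_one |= written & mask
    # A bit only counts as stuck if the test tried to drive it the other way.
    return always_one & wrote_zero, always_zero & wrote_one


# Simulated fault for illustration only (not the readings from the videos):
# bits 2 and 3 forced high, bit 6 forced low.
def faulty_read(value):
    return (value | 0x0C) & ~0x40 & 0xFF


samples = [(p, faulty_read(p)) for p in (0x00, 0xFF, 0xAA, 0x55)]
stuck_high, stuck_low = find_stuck_bits(samples)
print(f"stuck high: {stuck_high:#04x}, stuck low: {stuck_low:#04x}")
```

A bit is only reported when the patterns actually exercised it in both states, which is why alternating and all-ones/all-zeros patterns are used together.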

So that got me thinking, perhaps the problem is not data but address - we could be writing to different registers rather than NVRAM, and at the same time completely miss the DMA register. This would also explain why the speaker is silent, and nothing else really works. We are already lucky enough that the POST card works as well as it does. Let me see if I can figure out where the write goes to, and/or come up with a test program for that. I assume you do not have an oscilloscope?

UPDATE: Let's try to approach this Apollo 13 style. What do we have that's good? Writes to port 0x80 (POST card) seem to work fine.
It so happens that on systems with a 74612 chip that I/O port should be an 8-bit R/W register that's otherwise unused. So put the 74612 back into its socket and try this POST6 test code. It will read back the byte it wrote to port 0x80, negate it and write it again. So just like in previous tests you should (after the initial 0xC003) see a value and its binary negation on the POST card. If it kinda works, like with NVRAM, but with glitched bits, then my bet would be that the 74ALS245 is bad. I know it tests good in the programmer but those are almost static tests, at very low switching speeds. Perhaps it's not completely dead, just degraded. If you have a spare, try it, though ALS chips can be hard to come by these days. AHCT or ACT might be good modern substitutes, depending on how good the PCB ISA design is. HCT or LS will most likely not be fast enough (not if it has to drive multiple ISA cards) but perhaps could be used for this test.

Attachments

  • POST6.7z (416 bytes, 27 downloads, public domain)

Reply 47 of 66, by Skip94

Right, I've tried POST 6.
I don't get a code and its binary opposite on the card at the same time, but it does seem to be a regular, repeatable pattern now.
The code on the 2nd pair is always the code that was on the 1st pair the previous time, i.e. AA,00 - FF,AA - 55,FF etc.

So, now the codes in order are
00 00000000
AA 10101010
FF 11111111
55 01010101
FF 11111111
55 01010101
00 00000000
AA 10101010
01 00000001
02 00000010
04 00000100
08 00001000
10 00010000
20 00100000
40 01000000
80 10000000
FE 11111110
FD 11111101
FB 11111011
F7 11110111
EF 11101111
DF 11011111
BF 10111111
7F 01111111
00 00000000 (This is the start of it looping again)

This looks like a predictable pattern to me, which feels right, but then I'm no expert!

I did try swapping the ALS245's around, as there are two, but it made no difference. I also tried replacing each one in turn with a regular LS245 from my XT clone board.

I also had a new P82C201 turn up today, simply because it was cheap on eBay. Unfortunately it seems dead, as with that in, absolutely nothing.

Andrew

Reply 48 of 66, by Deunan

Skip94 wrote on 2022-02-23, 18:44:

I don't get a code and its binary opposite on the card at the same time

Sigh, and no wonder you don't, I messed it up. I was testing the code in DOS and did away with negation to make it easier to compare pattern and readout. So there isn't any binary opposite on the POST card in the test... and the code is not doubled because as I've said POST cards will remember last seen code and not output it again if it matches.

You can try this corrected version if you wish, just so that you'll know what it was supposed to look like, but the result you got is pretty good as-is. Reason being, if the second write was actually different from the first, the pattern you got would look different. And you have exactly 24 patterns in a loop, which is what the code tries. Therefore writes to and reads from the 74612 work. So that tells me the data bus is OK, forget the '245 for now. I think we have a stuck address bit.
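
The 24-pattern count can be cross-checked against the list Skip94 posted. This Python sketch rebuilds the sequence from that list: 8 fixed patterns, then a walking one, then a walking zero. It is reconstructed from the observed codes, not from the POST6 source, so the grouping and the variable names are assumptions.

```python
# Fixed patterns in the order they appeared on the POST card.
FIXED = [0x00, 0xAA, 0xFF, 0x55, 0xFF, 0x55, 0x00, 0xAA]

# Walking one: a single 1 bit moving from bit 0 up to bit 7.
walking_ones = [1 << bit for bit in range(8)]

# Walking zero: a single 0 bit moving from bit 0 up to bit 7.
walking_zeros = [~(1 << bit) & 0xFF for bit in range(8)]

patterns = FIXED + walking_ones + walking_zeros

for value in patterns:
    print(f"{value:02X} {value:08b}")
print(len(patterns), "patterns per loop")
```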

We can maybe use the 74612 a bit more, it actually has 16 bytes of memory. That can be used to test lower 4 bits of the ISA address bus (since address 0x80 works I assume something is stuck low, or bit 7 is stuck high). I'll have another test program ready soon.

UPDATE: Here's POST7 test code. Run with 74612 in socket. Let's see if that gets us any further.

Attachments

  • POST7.7z (388 bytes, 29 downloads, public domain)
  • POST6.7z (418 bytes, 28 downloads, public domain)

Reply 52 of 66, by Deunan

OK, so it's not bits 0-3. I suspect bit 4 but testing that might be more problematic. So here's another variant of POST7, this one should start like the previous ones but then extend the address range. Since the read results might be somewhat random, and there will be more, it's probably best to video this one. In order to let you know when the loop actually restarts I've made it return to the C003 code. So once you see that again the test should be done with at least one full pass.

BTW if bit 4 is stuck at zero then results from 0x90 range should be the same as from 0x80. We'll see.
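
The aliasing argument can be sketched in a few lines of Python. This only models the hypothesis (one address line held at 0), not the real chipset decoding, and `effective_address` is a made-up helper name.

```python
STUCK_LOW_BIT = 4  # hypothesis under test: address bit 4 never goes high


def effective_address(addr, stuck_low_bit=STUCK_LOW_BIT):
    """Address the hardware would see if one address line is stuck at 0."""
    return addr & ~(1 << stuck_low_bit)


for port in (0x80, 0x8F, 0x90, 0x9F, 0x70, 0x71):
    print(f"write to {port:#04x} lands on {effective_address(port):#04x}")
```

Under that assumption every port in the 0x90 range falls onto its 0x80 twin. The usual AT NVRAM ports 0x70/0x71 would likewise land on 0x60/0x61, which would fit the earlier suspicion that the NVRAM writes were going somewhere else.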

Attachments

  • POST7.7z (420 bytes, 28 downloads, public domain)

Reply 54 of 66, by Deunan

That looks promising so it's worth further investigation. So it would appear we do have a stuck bit on XADDR bus, XA4 is always zero. I think ISA slots are on SADDR bus so we can't use that as reference, and XADDR bus is kinda complicated in how it works because the chipset does at least some address decoding. I need to properly read the datasheet now to see where XA4 goes to, and will provide an update.

In the meantime you can inspect the socket of the '204 chip, pins #50 and #49 too. Note that this is one of those weird chips that count the pins starting in the middle of one side, there should be a dot there. Inspect for solder cracks, cut traces, etc. If you have time try to trace these two signals around the motherboard and make sure there is continuity from chip to chip.

EDIT: The first thing to check would be connection between '204 #50 and '202 #28. That's the most suspect connection.

Reply 55 of 66, by Skip94

Deunan wrote on 2022-02-23, 22:22:

That looks promising so it's worth further investigation. So it would appear we do have a stuck bit on XADDR bus, XA4 is always zero. I think ISA slots are on SADDR bus so we can't use that as reference, and XADDR bus is kinda complicated in how it works because the chipset does at least some address decoding. I need to properly read the datasheet now to see where XA4 goes to, and will provide an update.

In the meantime you can inspect the socket of the '204 chip, pins #50 and #49 too. Note that this is one of those weird chips that count the pins starting in the middle of one side, there should be a dot there. Inspect for solder cracks, cut traces, etc. If you have time try to trace these two signals around the motherboard and make sure there is continuity from chip to chip.

EDIT: The first thing to check would be connection between '204 #50 and '202 #28. That's the most suspect connection.

This is sounding promising!
I've checked between pin 50 on 204 and pin 28 on 202, all seems good there. I did pull 204 out, give the pins on the chip another clean and put a little more tension on the pin in the socket. To be honest, the more I look at these PLCC sockets, the less I like them, they really are rubbish. I've ordered some decent ones and will replace them when they arrive.
Putting the logic probe on that line shows it pulsing, but mostly high, dropping low every time the code changes on the POST card.
Where should I be looking in the string of codes for changes? Running it now shows different codes in places. It goes through and does 8*,0* then 9*,0*, then A*,BC as before, but after that, a few of the 2nd codes change. I will try and video this later.
Andrew

Reply 56 of 66, by Skip94

OK, so I'm probably getting ahead of myself here... But I found a pair of good PLCC sockets. So I socketed 204 and 203. Just for laughs I decided to try the original BIOS. I now get 0A,09 on the POST card, but more excitingly, beeps! It's well past my bedtime, so I'll do some proper investigation soon.
Andrew

Reply 57 of 66, by Deunan

I think we can conclude the issue is most likely with missing mobo signals, for whatever reason, and not one of the removed chips breaking the bus. So at this point it's probably best to re-populate the mobo, including some RAM (it takes 18 chips to have one full bank: 16 data bits + 2 for parity, one per byte) and try booting it. I'll check what the 0A code is for exactly but most likely KBC or RAM. Could also be not detected video RAM of any kind, so maybe add some VGA card to ISA. You could also try the diagnostic BIOS since now that XA4 seems to work you should get at least some beeps. Do keep in mind the ROM size jumpers, those might need to be changed for that.
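
The chip count per bank is simple arithmetic. Here it is as a short Python sketch, assuming 1-bit-wide DRAM chips and one parity bit per byte; the function name is just for illustration.

```python
def chips_per_bank(data_bits):
    """Number of 1-bit-wide DRAM chips for one bank, with one parity bit per byte."""
    parity_bits = data_bits // 8
    return data_bits + parity_bits


print("16-bit bus (286):", chips_per_bank(16), "chips")
print("8-bit bus (XT):", chips_per_bank(8), "chips")
```

So a 16-bit bank needs twice the 9 chips that fill a bank on an 8-bit XT board.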

Reply 58 of 66, by Skip94

User metadata
Rank Newbie
Rank
Newbie
Deunan wrote on 2022-02-24, 08:50:

I think we can conclude the issue is most likely with missing mobo signals, for whatever reason, and not one of the removed chips breaking the bus. So at this point it's probably best to re-populate the mobo, including some RAM (it takes 18 chips to have one full bank: 16 data bits + 2 for parity, one per byte) and try booting it. I'll check what the 0A code is for exactly but most likely KBC or RAM. Could also be not detected video RAM of any kind, so maybe add some VGA card to ISA. You could also try the diagnostic BIOS since now that XA4 seems to work you should get at least some beeps. Do keep in mind the ROM size jumpers, those might need to be changed for that.

You are quite correct, 0A and the beep code 1-3-3 both point to first 64K RAM failure. The reason? Well, I've been playing mostly with XTs over the last few months... So I had populated the far left and far right banks of 9 RAM chips, completely overlooking the fact that a 286 is 16-bit!
So, it is now fully populated with RAM and I have flashed the original BIOS onto the EEPROMs. I think the BIOS sockets are on their way out, as with the original chips in there I get a mixture of errors, but the EEPROM adapters are a much tighter fit. So now I get a POST code of 27 and a beep code of 3-2-4, all of which points towards KBC failure. Well, that's not surprising as the KBC is on the desk next to the board. However, as soon as a KBC is inserted, I get nothing at all. So I shall do some digging around there. It's looking like a lot of my issues are down to poor connections, so I've taken the decision to order a complete set of new sockets for the board. Lots of desoldering practice for me! Just about every DIP socket is the nasty single wipe type.
I'll change all these, then have another go.
Andrew

Reply 59 of 66, by Deunan

I was rather hoping it would be something simple like a broken connection between chips, but if the sockets are misbehaving then anything is possible. That being said, now that you seem to have proper addressing and chips responding, I could maybe write a test for the KBC as well. This would have to wait a bit though, I have other things to take care of first.

You should investigate the PWR_GOOD signal from the PSU and see where it goes. Probably some TTL logic first, then the chipset. If your KBC requires that signal to toggle before it will report a cold boot to the BIOS then you will not be able to run the system until it's fixed.