VOGONS


First post, by GeorgeMan

Rank Oldbie

Hello forum!

I have a QDI Titanium IIIB Socket 7 motherboard. It suddenly stopped working: the monitor showed "no signal" right after the POST screen, and it has never given a picture again.
I stripped it down to the bare minimum (CPU, RAM, GPU), and the PCI analyzer card shows error code 0b0a.

From what I've found, the first 2 digits are where it hung and the last 2 are the previous code. So it seems that I'm stuck at:

1. Verify the RTC time is valid or not
2. Detect bad battery
3. Read CMOS data into BIOS stack area
4. PnP initializations including (PnP BIOS only)
  • Assign CSN to PnP ISA card
  • Create resource map from ESCD
5. Assign IO & Memory for PCI devices (PCI BIOS only)

or

Test CMOS RAM Checksum if bad or Insert key depressed; load defaults.
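The reading rule described above (first two hex digits are the code where POST hung, last two are the code before it) can be sketched in a few lines of Python. The function name is made up for illustration, and the digit order is taken from the post rather than from the analyzer card's documentation, so treat it as an assumption.

```python
def split_post_display(display: str) -> tuple[int, int]:
    """Split a 4-digit POST card reading into (current, previous) codes.

    Assumes the convention described in the post: the first two hex
    digits are the code where POST hung, the last two the code before it.
    """
    text = display.strip().lower()
    if len(text) != 4:
        raise ValueError("expected four hex digits, e.g. '0b0a'")
    current = int(text[:2], 16)
    previous = int(text[2:], 16)
    return current, previous


current, previous = split_post_display("0b0a")
print(f"hung at POST 0x{current:02X}, previous code 0x{previous:02X}")
# prints: hung at POST 0x0B, previous code 0x0A
```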

That happens (correctly) before GPU initialization, so the error code stays the same with other PCI GPUs or without any PCI device. What else I have tried so far:

  • measured the voltage regulator outputs: they all seem fine and change accordingly when I change the CPU (it's a jumperless motherboard)
  • visually inspected the motherboard for leaking capacitors or other damage, although I hadn't even moved it when it broke
  • changed the RAM
  • changed the CPU: tried AMD K5-100, Intel Pentium 100 and Intel Pentium MMX 233

No matter what, the code stays the same, except when I remove the CPU or RAM.

I don't really understand the error code and I'm stuck. Is the motherboard toast? Should I just abandon fixing it? I cannot accept that it went bust this way without any VRM-capacitor issues...

Can anyone help me a bit please?

Acer Helios Neo 16 | i7-13700HX | 64G DDR5 | RTX 4070M | 32" AOC 75Hz 2K IPS + 17" DEC CRT 1024x768 @ 85Hz
Win11 + Virtualization => Emudeck @consoles | pcem @DOS~Win95 | Virtualbox @Win98SE & softGPU | VMware @2K&XP | ΕΧΟDΟS

Reply 1 of 16, by ratfink

Rank Oldbie

According to this it's probably Award 4.51G
https://theretroweb.com/motherboards/s/qdi-p5 … anium-iiib#bios

And the 4.51 table here -

https://blog.theretroweb.com/2024/01/20/award … ost-codes-list/

indicates 0B is "Test CMOS RAM checksum" and further down the Test sequence table indicates that what that means is "CMOS RAM checksum tested and BIOS defaults loaded if invalid. Failure would indicate CMOS RAM failure".

Corrupted bios settings? Try clearing them?
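For readers unfamiliar with the check behind code 0B, here is a minimal sketch of a CMOS checksum test. It assumes the classic AT layout (bytes 0x10 to 0x2D summed, with the 16-bit result stored at 0x2E/0x2F); the range this particular Award BIOS actually covers is not confirmed in the thread, and the CMOS image below is made up.

```python
CHECKSUM_START = 0x10  # first byte covered (classic AT layout, assumed here)
CHECKSUM_END = 0x2D    # last byte covered
CHECKSUM_HIGH = 0x2E   # stored checksum, high byte
CHECKSUM_LOW = 0x2F    # stored checksum, low byte


def cmos_checksum(cmos: bytes) -> int:
    """Return the 16-bit sum of the checksummed CMOS bytes."""
    return sum(cmos[CHECKSUM_START:CHECKSUM_END + 1]) & 0xFFFF


def checksum_is_valid(cmos: bytes) -> bool:
    """Compare the stored checksum with a freshly computed one."""
    stored = (cmos[CHECKSUM_HIGH] << 8) | cmos[CHECKSUM_LOW]
    return stored == cmos_checksum(cmos)


# A made-up 128-byte CMOS image for demonstration.
image = bytearray(128)
image[0x10] = 0x40
image[0x14] = 0x2F
total = cmos_checksum(image)
image[CHECKSUM_HIGH] = total >> 8
image[CHECKSUM_LOW] = total & 0xFF
print(checksum_is_valid(image))  # image with a matching checksum

image[0x14] ^= 0x01              # flip one bit in the covered range
print(checksum_is_valid(image))  # checksum no longer matches
```

A mismatch is what would normally make the BIOS fall back to its defaults, which is why clearing the CMOS is a reasonable first test.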

Reply 2 of 16, by mkarcher

Rank l33t
GeorgeMan wrote on 2024-10-06, 15:27:

Test CMOS RAM Checksum if bad or Insert key depressed; load defaults.

"if insert key depressed" might be a hint: This requires communication with the keyboard controller. If your keyboard controller is socketed, re-seat it.

Reply 3 of 16, by GeorgeMan

Rank Oldbie
ratfink wrote on 2024-10-06, 19:49:

According to this it's probably Award 4.51G
https://theretroweb.com/motherboards/s/qdi-p5 … anium-iiib#bios

And the 4.51 table here -

https://blog.theretroweb.com/2024/01/20/award … ost-codes-list/

indicates 0B is "Test CMOS RAM checksum" and further down the Test sequence table indicates that what that means is "CMOS RAM checksum tested and BIOS defaults loaded if invalid. Failure would indicate CMOS RAM failure".

Corrupted bios settings? Try clearing them?

Oh, I forgot to write that: it was the first thing I tried.
Actually, I removed the battery, so it should be cleared automatically on every try.


Reply 4 of 16, by GeorgeMan

Rank Oldbie
mkarcher wrote on 2024-10-07, 06:25:
GeorgeMan wrote on 2024-10-06, 15:27:

Test CMOS RAM Checksum if bad or Insert key depressed; load defaults.

"if insert key depressed" might be a hint: This requires communication with the keyboard controller. If your keyboard controller is socketed, re-seat it.

Will try that and let you know!


Reply 5 of 16, by ratfink

Rank Oldbie

Worth trying, but bear in mind that the next check after 0B is 0C "Initialize keyboard; Detect the type of keyboard controller", so it may not have got that far.

If that doesn't help - I wonder whether the CMOS is getting cleared, maybe try removing the power cord and shorting the CMOS reset if it has one. Failing that, leave it unplugged without a battery for a few hours.

Reply 6 of 16, by kmeaw

Rank Member

The 2A59IQ1DC BIOS emits code 0x0B at E000:61AB, inside a function (E000:61A6 .. E000:61C0) that takes the address of a table and a starting index, then runs every function from that table, reporting to the POST card the number of each function offset by the specified starting index.

This function is called 3 times but only once with a starting index low enough to cover your POST code of 0x0B - it happens at 0x619B: the starting index is 0x3 and the table is located at E000:60F2.

So E000:60F2 is POST 0x03, E000:60F4 is POST 0x04, ..., E000:6102 is POST 0x0B. The pointer at E000:6102 points to E000:1BE0.
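The table lookup above is simple pointer arithmetic, sketched here in Python. The table offset and starting index are taken from the post; the 2-byte entry size is inferred from consecutive entries sitting at 60F2 and 60F4, and the function name is made up for illustration.

```python
TABLE_OFFSET = 0x60F2  # table start within segment E000 (from the post)
START_INDEX = 0x03     # POST code of the first table entry (from the post)
ENTRY_SIZE = 2         # 16-bit near pointers, inferred from 60F2 -> 60F4


def table_entry_offset(post_code: int) -> int:
    """Return the offset of the table entry that handles a POST code."""
    if post_code < START_INDEX:
        raise ValueError("POST code is below the table's starting index")
    return TABLE_OFFSET + (post_code - START_INDEX) * ENTRY_SIZE


print(f"E000:{table_entry_offset(0x0B):04X}")  # prints: E000:6102
```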

The function at E000:1BE0 iterates over the table (E000:1BC1 .. E000:1BDD) which maps CMOS locations holding RTC data to default, min and max values. If the value read from CMOS is not a valid BCD digit or goes off limits, the defaults are loaded.
Then (E000:1C23) it checks if CMOS location 0x8D has the highest bit set. If it does, it assumes the battery is dead and invalidates the CMOS checksum.
It stores (E000:1C40) some default value for the CPU frequency in the BIOS data area's RAM inventory list based on the CPU class, then calls a subroutine (at E000:1C58 to E000:5B3D over trampolines at F000:EC30 and F000:0F75) that reads 128 bytes of CMOS data to the stack.
It calls another subroutine (at E000:1C69 to E000:1D23) that goes over a table of possible boot media to update CMOS boot drive settings.
It checks (E000:1C6C) if the CMOS checksum is good - if it is then it checks CMOS location 0x8B at bit 5 (probably alarm interrupt enable bit). If the bit is cleared or the checksum is bad, it clears (E000:1C7E) 6 lower bits of CMOS location 0x8D.
Then it updates (E000:1C8C) floppy settings in CMOS and clears the FPU presence bit, then (at E000:1C98) calls into E000:1D69 which checks (by calling E000:1D76 -> F000:EC30 -> F000:4F7F and then running some no-wait FPU tests) if FPU is present and sets the bit back if it is. Then (E000:1C9B) the BIOS data area is updated to reflect that.
Then it checks (E000:1CA6) the number of floppies configured, if the swap floppy setting is enabled and also updates the BIOS data area.
Finally, it calls a subroutine (from E000:1CC2 to F000:6E34) with DI=0x3 to initialize the chipset (it goes over every available BIOS setting that has an associated chipset register and writes it to the hardware) and does PnP Init (E000:4BE8, which resets the bus and does the isolate-and-fetch-CSN loop, and E000:4C90, which does ESCD-related stuff) and (from E000:1CDC to F000:4E14) shadows ROM into RAM.
The failure bit (which is CF) is cleared and the function returns to E000:61B9, reporting success.
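The first step of this routine, validating RTC values against a table of default, min and max, could look roughly like the sketch below. The table contents are hypothetical (standard RTC register numbers with plausible BCD limits, not the values at E000:1BC1), and this version resets only the offending field, which may differ from what the BIOS does.

```python
from typing import NamedTuple


class RtcField(NamedTuple):
    register: int  # CMOS index of the RTC field
    default: int   # value loaded when the field is invalid (BCD)
    minimum: int   # lowest accepted value (BCD)
    maximum: int   # highest accepted value (BCD)


# Hypothetical table; the real one lives at E000:1BC1 .. E000:1BDD.
RTC_FIELDS = [
    RtcField(0x00, 0x00, 0x00, 0x59),  # seconds
    RtcField(0x02, 0x00, 0x00, 0x59),  # minutes
    RtcField(0x04, 0x00, 0x00, 0x23),  # hours (24-hour mode)
    RtcField(0x07, 0x01, 0x01, 0x31),  # day of month
    RtcField(0x08, 0x01, 0x01, 0x12),  # month
]


def is_bcd(value: int) -> bool:
    """True if both nibbles are decimal digits (0-9)."""
    return (value & 0x0F) <= 9 and (value >> 4) <= 9


def sanitize_rtc(cmos: bytearray) -> list[int]:
    """Load defaults into invalid fields; return the registers that were reset."""
    reset = []
    for field in RTC_FIELDS:
        value = cmos[field.register]
        # Valid BCD values compare in the same order as their decimal meaning.
        if not is_bcd(value) or not field.minimum <= value <= field.maximum:
            cmos[field.register] = field.default
            reset.append(field.register)
    return reset


cmos = bytearray(128)
cmos[0x00] = 0x61  # valid BCD digits, but 61 seconds is out of range
cmos[0x02] = 0x3A  # low nibble 0xA is not a BCD digit
cmos[0x04] = 0x12
cmos[0x07] = 0x15
cmos[0x08] = 0x10
print([hex(r) for r in sanitize_rtc(cmos)])  # prints: ['0x0', '0x2']
```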

Reply 7 of 16, by Chkcpu

Rank Oldbie

Hi kmeaw,

Thanks for your detailed analysis of the QDI Titanium IIIB 2A59IQ1DC BIOS and explanation of the POST_B code. I’m impressed! 😀

Recently I patched this Titanium IIIB v1.2S BIOS for K6-2(+)/III(+) support, so I have a disassembly listing of the CPU detection and SpeedEasy jumperless control parts of this BIOS.
Looking at the POST_B routine you described, I totally agree with your analysis but didn’t understand the PnP part. Can you elaborate on what you mean by the “isolate-and-fetch-CSN loop”?

Anyway, to help @GeorgeMan further, can you think of any hardware failure that would cause this hang at POST_B?

@GeorgeMan reported that the VRM outputs are all fine and change according to the installed CPU. This makes sense because the CPU detection for the Auto Vcore control in this jumperless BIOS is done very early during POST steps C0 and C1 in the Bootblock, at every boot-up.
As the BIOS doesn't hang on a decompression integrity check and starts the main module (original.tmp) POST routines normally at POST_3, I’m convinced it is fine and there is no BIOS corruption.
A lucky circumstance on a board with soldered-on BIOS flash chip. 😉

GeorgeMan, I hope you can fix this nice board, or find someone to fix it for you. I’m presently out of ideas where to look for the fault.

Cheers, Jan

CPU Identification utility
The Unofficial K6-2+ / K6-III+ page

Reply 8 of 16, by kmeaw

Rank Member

Let's look into the PnP Init procedure (E000:4BE8) in more detail.

1. It calls (over a trampoline at E000:C7A0) a stub at F000:C6A4 which does nothing - probably a place where the vendor could put some kind of hook.
2. Another stub at E000:3F95 does nothing.
3. Call to E000:7808 - it writes 0x02 to the PnP Config Control register, which forces every ISAPnP compliant card to enter the "Wait for key" state.
4. Call to E000:79EC - issue ~1ms delay before accessing the auto-configuration ports.
5. Call to E000:7814 - initializes the LFSR to generate the Initiation Key and sends it to the PnP devices, which places their logic into configuration mode.
6. Call to E000:77FC - it writes 0x07 to the PnP Config Control register. All possible config control bits are set, which means: reset the CSN to 0, return to the Wait for Key state, reset all logical devices and restore configuration registers to their power-up values.
7. Call to E000:79EC - another delay.
8. Call to E000:7814 - send Initiation Key again once the devices have been reset.
9. Iterate (E000:4C18 .. E000:4C25) over the table at E000:4BC4 which lists possible RD_DATA ports to find a non-conflicting one. To verify the address, it calls into E000:783C which isolates a single PnP device.
10. Now comes the loop I was talking about (E000:4C2D .. E000:4C35). It calls E000:79A0 and E000:783C until the last PnP device has been isolated. E000:79A0 assigns the loop counter as the CSN of the last isolated device. Now every PnP device has its own unique CSN.
11. Store the last assigned CSN to 0x2002 and the RD_DATA port which worked fine to 0x2000.
12. Inactivation loop (E000:4C54 .. E000:4C84) - iterate over every detected card, over every logical device of each card, reset bit 0 (deactivate logical device) of register 0x30 (Activate register).
13. Call to E000:79AC - issue Wake[CSN] with zero write data to force all cards without a CSN to enter isolation mode.
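The Initiation Key from steps 5 and 8 is a fixed 32-byte sequence produced by an 8-bit LFSR. The sketch below generates it the way the ISA PnP specification describes (seed 0x6A, shift right, feed bit 0 XOR bit 1 into bit 7). It is not a transcription of the code at E000:7814, and the function name is made up. Firmware normally writes 0x00 twice to the ADDRESS port first to reset the cards' LFSR, then writes these 32 bytes to the same port.

```python
def pnp_initiation_key() -> list[int]:
    """Generate the 32-byte ISA PnP initiation key."""
    key = []
    value = 0x6A  # initial LFSR value defined by the ISA PnP specification
    for _ in range(32):
        key.append(value)
        feedback = (value ^ (value >> 1)) & 1   # bit 0 XOR bit 1
        value = (value >> 1) | (feedback << 7)  # shift right, feedback into bit 7
    return key


key = pnp_initiation_key()
print(" ".join(f"{byte:02X}" for byte in key[:8]))
# prints: 6A B5 DA ED F6 FB 7D BE
```

Every card on the bus runs the same LFSR in hardware and compares each write against it, so a single wrong byte sends the card back to the start of the sequence.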

As you can see, the PnP init procedure involves a lot of cooperation between the devices and the firmware - cards share the same ports, and a single misbehaving card could cause others to fail.
Another risky part is E000:4C90 - it is a large function that allocates IRQ and DMA resources, storing the result in the ESCD buffer. Parts of it can be bypassed if the "PNP OS Installed" and "Reset ESCD configuration" setup items are set to disabled and "Resources Controlled By" is set to "Manual" (which is surely hard to do on a non-booting machine). If the ESCD table in flash is bad, things might go wrong - rewriting the flash with an external programmer would help here.

Another thing to try - hook up a floppy drive and short the highest address lines on the flash chip, then power up the machine and remove the short. BIOS would fail the checksum test and would be forced to perform an emergency boot from the bootblock, skipping most of the hardware initialization and attempting to boot from a floppy.

Reply 9 of 16, by Horun

Rank l33t++
kmeaw wrote on 2024-10-13, 01:28:

If the ESCD table in flash is bad, things might go wrong - rewriting the flash with an external programmer would help here.

Not the exact same issue as here, but I had an odd one on a KA31 long ago where it would get halfway through POST and halt. I tried all types of hardware, thinking it was something I had installed, but got the same thing. I took the socketed EEPROM out and read it, and it appeared fine. When I tried to reprogram it, the programming halted on a write error at a point about 2/3 of the way down, in a non-code area that held some data. It was a 4k block that was bad and would not take a write, so I programmed a different PROM chip and it worked.
Even though the BIOS had not reached the typical "updating ESCD" part near the end of POST, it was reading corrupt data somewhere before that. I figured it was the ESCD area, like you say; all the other code (main and boot block) looked good.

Hate posting a reply and then have to edit it because it made no sense 😁 First computer was an IBM 3270 workstation with CGA monitor. Stuff: https://archive.org/details/@horun

Reply 10 of 16, by Chkcpu

Rank Oldbie

Hi kmeaw,

Thanks for another detailed BIOS analysis!
I thought I understood most of the inner workings of the socket 3/5/7 Award BIOS, but obviously my PnP knowledge was lacking. 😉
So I have been brushing-up on my PnP knowledge and I could now follow your PnP-init story without problem. However, it is hard to see if and how the CPU would hang on this PnP code if all but the videocard is removed.

I get the feeling that the BIOS hang is more likely to happen near the end of POST_B, where you indicated another “risky” function at E000:4C90 about resource allocation and updating the ESCD block in the BIOS flashchip.
Because OP already cleared the CMOS data, the BIOS will now boot with its Setup defaults. For “PNP OS Installed” this is ”No” so the BIOS will try to enumerate all PnP devices, and not leave this task to the OS. The “Force Updating ESCD” is obviously Disabled by default, and I’ve found the “Resources Controlled By” option defaults to “Manual”.
This causes the assignment of IRQ’s 3, 4, 14, and 15 to be set at “Legacy ISA” so they will not be shared by PnP. All other IRQ’s and all DMA channels default to “PCI/ISA PnP” so those are available for the PnP init and resources allocation routines.

I assume the resource allocation function has to read the ESCD data to check if an update is required. So my main focus will be to find out what this function does when the ESCD data is missing or corrupted.

@Horun, thanks for adding your experience with a similar BIOS hang because of a bad block in the flashchip at the location of the ESCD. Got me thinking if it is the same with this Titanium IIIB board. I know from the OP that he was testing various CPUs and just installed a K6-2/500 that ran fine at the BIOS’s default 175MHz speed (= x3.5 / 50MHz after changing to a K6). But when he changed the FSB and multiplier setting in the BIOS to x6.0 / 66MHz to run the K6-2 at 400MHz and rebooted, the fault occurred.
With each CPU change, the ESCD block has to be updated so maybe his flashchip was at its write limit as well. Who knows…
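The clock figures quoted above follow from core clock = multiplier × FSB. A minimal sketch, with a made-up function name, and assuming the nominal "66MHz" bus is really 66.67 MHz (200/3), which is why x6.0 is reported as 400MHz:

```python
def core_clock_mhz(multiplier: float, fsb_mhz: float) -> float:
    """Core clock is the bus clock times the CPU multiplier."""
    return multiplier * fsb_mhz


print(core_clock_mhz(3.5, 50.0))            # prints: 175.0
print(round(core_clock_mhz(6.0, 200 / 3)))  # prints: 400
```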

@GeorgeMan, this will be a tough one to solve, but we will try.

Cheers, Jan


Reply 11 of 16, by GeorgeMan

Rank Oldbie

I've been thoroughly reading your replies and enjoying your knowledge.
There is no rush, we'll do whatever we can to solve it. 😀 But I definitely cannot desolder the BIOS chip and/or program it or another one externally.

I also want to add that the "no signal" occurred right after the last time it POSTed. I mean, it POSTed normally, got to the final POST information screen, and right after that I got "no signal" and never got a signal again.
It did not occur after a power-up or after saving changes and restarting, etc. I.e., I saw it operate and report K6-2@400MHz just once, and right after that the problem occurred.

Last edited by GeorgeMan on 2024-10-15, 20:26. Edited 1 time in total.


Reply 13 of 16, by PC@LIVE

Rank Oldbie

As it happens, I just wrote something about this. I included a couple of images that show which pins to connect, but I don't know whether the pins are different if your chip is a different type.
Re: Test and troubleshoot PC@LIVE motherboards

AMD 286-16 287-10 4MB HD 45MB VGA 256KB
AMD 386DX-40 Intel 387 8MB HD 81MB VGA 256KB
Cyrix 486DLC-40 IIT387-40 8MB VGA 512KB
AMD 5X86-133 16MB VGA VLB CL5428 2MB and many others
AMD K62+ 550 SOYO 5EMA+ and many others
AST Pentium Pro 200 MHz L2 256KB

Reply 14 of 16, by GeorgeMan

Rank Oldbie

Success!

Thanks to everyone who helped or even tried to help! 😁

I installed an ISA GPU, shorted the last 2 data pins of the soldered small BIOS chip while powering up the system, and I was able to see the bootblock screen! So the whole analysis was spot on!
I then loaded a boot floppy, reflashed the (modded) BIOS successfully and right after that I had a working board.


Reply 15 of 16, by Chkcpu

Rank Oldbie
GeorgeMan wrote on 2024-10-23, 17:02:

Success!

Thanks to everyone who helped or even tried to help! 😁

I installed an ISA GPU, shorted the last 2 data pins of the soldered small BIOS chip while powering up the system, and I was able to see the bootblock screen! So the whole analysis was spot on!
I then loaded a boot floppy, reflashed the (modded) BIOS successfully and right after that I had a working board.

Hi GeorgeMan,

Wow, this is great news!!
I’m happy to hear that this risky procedure paid off. 😀

Seeing that you used Uniflash and flashed INCLUDING bootblock, the ESCD and DMI blocks were erased as well. It is still a guess, but if the corruption was in the ESCD block, this action will have cleared it away.

I hope this very nice Titanium IIIB keeps working fine now, but if this problem should re-occur, it may be a good idea to relax the memory timing a bit or use 75MHz FSB as maximum.
BIOS code runs from shadow RAM, so it is sensitive to fast RAM timings too.

Cheers, Jan


Reply 16 of 16, by GeorgeMan

Rank Oldbie

Yes, it seems unstable even at the most relaxed settings, but maybe the culprit is the old RAM DIMM I used. I'll have to fiddle more with hardware and settings, but that's fun now that the board is back to life!
Oh, and thank you for making this possible in the first place through your BIOS modding!
Now that it's confirmed to work, maybe you'd want to add it to your website too.
