VOGONS


Reply 20 of 38, by CoffeeOne

Disruptor wrote on 2023-10-29, 09:56:

That depends on the RAM and cache timings you can achieve.
If you have a board that can achieve 2-1-1-1 cache timing and good RAM timings at 50 MHz, do it. But first you have to find such a board.
It is more likely that you'll find a board that will run those settings at 40 MHz.

But, to be honest, checking stability is not easy, as you can read in this topic.

I can achieve 2-1-1-1 at 40 MHz, but never at 50 MHz.
So yes, that is the reason 40 MHz is always faster than 50 MHz for me.
What works for me as a first stability check is Doom and Quake under DOS.
Afterwards I install Windows 98SE and run a longer Lightwave 3D rendering session in it (about 20 minutes). After that I am confident the system is stable.

Reply 21 of 38, by bakemono


I don't think running old games is really proof of stability. If it was rendering a few wrong pixels here and there, are you certain you would notice that? Or would you think it looks normal and your system must be "stable"? At least with modern games, they tend to glitch in more spectacular fashion when there are problems, like big flickering triangles and wrong textures.

again another retro game on itch: https://90soft90.itch.io/shmup-salad

Reply 22 of 38, by BitWrangler


Sure, run Crysis for a stability test on your 486. Specifically for 386/486/Pentium, nothing is better than Doom for filtering out a huge chunk of timing issues. There might be games that are a bit more sensitive, but they may only be available as a retail version, may not have automated demo modes that you can loop incessantly, and may not have a community of experience behind them, so you can't tell the difference between the game not liking your mouse driver or QEMM and actual hardware glitches, and so on. Over 200 MHz or so, utils like Prime95 under Windows become a lot more relevant. memtest86 is invaluable for stressing your memory to find problems there, but it doesn't filter out CPU issues unless they're so bad that it's a wonder the machine boots. But in that situation the first filter is usually that it locks up booting DOS and you don't get a prompt.

FP units do indeed make subtle errors; that's how Intel managed to ship CPUs with FPU faults several times. Integer units don't: they start having ones where there should be zeros or vice versa, and stuff crashes. Fast integer doesn't do spit for Quake, therefore we know that it must be twiddling its thumbs waiting on the FPU most of the time. Therefore Quake tells you very little about the essential parts of the system, even though it looks like it works it hard. It might find faults in the FPU if you get real lucky, but last I heard it ran on FDIV-bugged Pentiums, so what does that tell you? The FPU is very much a bolt-on optional extra until very late in the day as far as 90s software is concerned. It's like testing the air conditioning in your car and then being surprised that the brakes are locked or it won't go into gear, because you checked things. No you didn't.
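
For reference, the FDIV flaw is the textbook case of a subtle FPU error: the machine keeps running and only the numbers come out slightly wrong. Below is a minimal sketch of the widely circulated division check, written in Python purely for illustration (the function name is made up here, and nobody would run this on a 486). On a correct FPU the residual is 0; affected Pentiums were widely reported to return 256 for this operand pair.

```python
def fdiv_residual(x: float = 4195835.0, y: float = 3145727.0) -> float:
    """Return x - (x / y) * y for the classic FDIV test operands.

    A correct FPU gives 0 for this pair; a large residual means the
    division result itself was wrong.
    """
    return x - (x / y) * y


if __name__ == "__main__":
    residual = fdiv_residual()
    verdict = "division looks fine" if abs(residual) < 1.0 else "FDIV-style error"
    print(f"residual = {residual}: {verdict}")
```

Nothing in that calculation crashes when the answer is wrong, which is why a game can run happily on top of it.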

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 23 of 38, by Disruptor

bakemono wrote on 2023-10-29, 12:47:

I don't think running old games is really proof of stability. If it was rendering a few wrong pixels here and there, are you certain you would notice that? Or would you think it looks normal and your system must be "stable"? At least with modern games, they tend to glitch in more spectacular fashion when there are problems, like big flickering triangles and wrong textures.

I guess we aren't talking about pixel glitches.
We are just checking whether we get a general protection fault when the timings are too tight.

Reply 24 of 38, by rasz_pl

bakemono wrote on 2023-10-29, 12:47:

I don't think running old games is really proof of stability. If it was rendering a few wrong pixels here and there, are you certain you would notice that? Or would you think it looks normal and your system must be "stable"? At least with modern games, they tend to glitch in more spectacular fashion when there are problems, like big flickering triangles and wrong textures.

Flickering triangles and bad textures in modern games are symptoms of a modern GPU failing to perform its internal calculations without errors = math errors, wrong values and writes to wrong memory locations.
In the case of a 486 running Quake, all those calculations are performed by the CPU, and a failure will result in an instant computer crash.

BitWrangler wrote on 2023-10-29, 13:34:

Fast integer doesn't do spit for Quake, therefore we know that it must be twiddling its thumbs waiting on the FPU most of the time. Therefore Quake tells you very little about the essential parts of the system

Quake interleaves FPU and integer instructions, hammering both simultaneously.

Open Source AT&T Globalyst/NCR/FIC 486-GAC-2 proprietary Cache Module reproduction

Reply 25 of 38, by alvaro84

rasz_pl wrote on 2023-10-29, 17:38:

Quake interleaves FPU and integer instructions, hammering both simultaneously.

At least on Pentiums. A 486 will really wait for its FPU to finish before going on with those integer operations. But I don't really know how good a stability test it is for them. On Pentiums it's quite nitpicky; it crashes more easily than most other games or demos. Though I can't say that it's a bulletproof method for testing those systems, it's just one remarkable data point. And I can't be dead sure about the 486/Doom relation either, though I remember that it caught cache issues quite quickly. On one of my 486 boards, I don't even know which one...

On a side note, I don't like to test such old systems with Windows installs, as installing it can bring other uncertainties and, foremost, I find it a huge, ugly chore.

Shame on us, doomed from the start
May God have mercy on our dirty little hearts

Reply 26 of 38, by bakemono

rasz_pl wrote on 2023-10-29, 17:38:

Flickering triangles and bad textures in modern games are symptoms of a modern GPU failing to perform its internal calculations without errors = math errors, wrong values and writes to wrong memory locations.

Not necessarily. I induced this behavior by running northbridge voltage too low. (And usually IME it is buggy video drivers and not even a hardware issue.)

In the case of a 486 running Quake, all those calculations are performed by the CPU, and a failure will result in an instant computer crash.

How do you know it would crash?


Reply 27 of 38, by BitWrangler


Yes, you can have a system in a state where Quake crashes, but minimally tweaking things until it doesn't only means it's stable enough to play Quake. In my experience, dating back to the first shareware episode release on contemporary hardware of the time, it is not all that sensitive in comparison to Windows 3.x or 9.x or many other games. If I seem a bit feisty on this point, it's because I got tired of this argument sometime in 1998 and here it is, going on another 25 years.


Reply 28 of 38, by rasz_pl

bakemono wrote on 2023-10-30, 12:05:

Not necessarily. I induced this behavior by running northbridge voltage too low. (And usually IME it is buggy video drivers and not even a hardware issue.)

That would suggest data being corrupted in flight to the graphics card.

bakemono wrote on 2023-10-30, 12:05:

In the case of a 486 running Quake, all those calculations are performed by the CPU, and a failure will result in an instant computer crash.

How do you know it would crash?

If the CPU jumps to a wrong address it's all over. There is no gentle recovery like in the case of GPUs, where minor visual glitches can go unnoticed and the driver can attempt aggressive failure recovery with clean source data uploaded from main RAM.
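
To illustrate the difference, here is a toy sketch in Python (not anything a 486 would run; the helper names and the example values are made up). A single flipped bit in the low mantissa of a floating-point coordinate moves the number by a hair, while a single flipped bit in a 32-bit jump target sends execution a megabyte away.

```python
import struct


def flip_bit_in_float(value: float, bit: int) -> float:
    """Flip one bit in the 64-bit IEEE 754 encoding of value."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped


def flip_bit_in_address(address: int, bit: int) -> int:
    """Flip one bit in a 32-bit address."""
    return (address ^ (1 << bit)) & 0xFFFFFFFF


if __name__ == "__main__":
    coord = 123.456                      # hypothetical vertex coordinate
    bad_coord = flip_bit_in_float(coord, 0)
    print(f"coordinate error: {abs(bad_coord - coord):.3e}")

    target = 0x00102340                  # hypothetical jump target
    bad_target = flip_bit_in_address(target, 20)
    print(f"jump lands {abs(bad_target - target)} bytes away, at {bad_target:#010x}")
```

The first kind of error is a wrong pixel at worst; the second one executes whatever happens to live at the new address.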

BitWrangler wrote on 2023-10-30, 12:39:

Yes, you can have a system in a state where Quake crashes, but minimally tweaking things until it doesn't only means it's stable enough to play Quake. In my experience, dating back to the first shareware episode release on contemporary hardware of the time, it is not all that sensitive in comparison to Windows 3.x or 9.x or many other games. If I seem a bit feisty on this point, it's because I got tired of this argument sometime in 1998 and here it is, going on another 25 years.

I would say stable Quake means a reliable CPU, memory subsystem and basic HDD I/O in DOS. With Windows there is hardware task switching, chipset reprogramming, interrupt rerouting/sharing/coalescing and often buggy drivers: lots more opportunity for things to go bad even with stable CPU/RAM operation.


Reply 29 of 38, by BitWrangler


I am not talking subtle here. I'm saying you can have a system that is 24+ hours "Quake stable", tuned to the highest framerate, and inside an hour, if it will even boot, you will have a mangled Windows install that is a nuke-and-pave event, because things can still go THAT wrong... and until you fix your "Quake stable" system it will likely not manage to reinstall Windows either. If you are lucky, it didn't nuke your FAT or partition table and corrupt the whole disk. Giving examples of having run Quake and then having run Windows for hours afterwards is of course not the point here. It's simply that Quake can give you an extremely false impression of system stability: it only catches something like the worst 70% of problems, while Doom catches more like 95%. As you get to system speeds above 233 MHz, Doom becomes not quite enough load, and some motherboards configure processor features in strange ways by default, which means Doom can give the out-of-sync error. Even a 70% filter like Quake then becomes useful, but you have to put the system through multiple filters, because 70% is still about like playing Russian Roulette with 2 bullets in the revolver.
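
A rough way to put numbers on the "multiple filters" idea, as a Python sketch. The 70% and 95% figures are only the ballpark estimates from this post, the six-chamber revolver is assumed, and treating the tests as independent is an assumption made purely for the example (in practice the faults Quake and Doom catch probably overlap a lot).

```python
from math import prod


def combined_catch_rate(catch_rates):
    """Chance that at least one test flags an unstable system,
    assuming (hypothetically) that the tests miss independently."""
    return 1.0 - prod(1.0 - rate for rate in catch_rates)


QUAKE = 0.70  # ballpark figure from the post, not a measurement
DOOM = 0.95   # ballpark figure from the post, not a measurement

if __name__ == "__main__":
    quake_miss = 1.0 - combined_catch_rate([QUAKE])
    both_miss = 1.0 - combined_catch_rate([DOOM, QUAKE])
    revolver = 2 / 6  # two bullets, six chambers assumed

    print(f"Quake alone misses about {quake_miss:.0%} of bad systems")
    print(f"Two bullets in a revolver: {revolver:.0%}")
    print(f"Doom plus Quake together miss about {both_miss:.1%}")
```

Under those assumptions a single 70% filter leaves odds close to the revolver, and stacking a second filter is what pulls the miss rate down.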


Reply 30 of 38, by rasz_pl


A corrupted drive means faulty disk I/O when using the 32-bit Windows driver. No amount of running DOS Quake is going to expose this problem. I would look for the source of the problems in the high FSB and the disk controller.


Reply 31 of 38, by BitWrangler


All kinds of shit is gonna come up faulty if the code and data are getting corrupted in the RAM/cache or CPU because timings are too tight. You are approaching the problem from the wrong end, like "The paint on my window sill is blistering rapidly, I better get out the paint pot." No, put out the frigging fire that's right next to it first.


Reply 32 of 38, by douglar


If I feel like there is a CPU/RAM issue, I use memtest86
If I feel like there's a problem with storage, I'd use speedsys
If I feel like there is an issue with the bus or IRQs, maybe I'd use CheckIt
If I feel like there is an issue with the VRAM, it's because I can see it on the screen, so I've already seen it, no need to check

Reply 33 of 38, by GigAHerZ

douglar wrote on 2023-10-30, 19:59:

If I feel like there is a CPU/RAM issue, I use memtest86

Do you have experience with memtest86 detecting all kinds of memory issues? L1 WB/WT, L2 WB/WT and RAM? All of this is part of "memory subsystem", yet is that the right tool to detect issues in all of those components mentioned?

"640K ought to be enough for anybody." - And i intend to get every last bit out of it even after loading every damn driver!

Reply 35 of 38, by douglar

GigAHerZ wrote on 2023-11-01, 08:09:
douglar wrote on 2023-10-30, 19:59:

If I feel like there is a CPU/RAM issue, I use memtest86

Do you have experience with memtest86 detecting all kinds of memory issues? L1 WB/WT, L2 WB/WT and RAM? All of this is part of "memory subsystem", yet is that the right tool to detect issues in all of those components mentioned?

Maybe I'm just a little too simple, but if memtest starts showing RAM errors on a 486 with SIMMs that work on a known-good motherboard, first I relax the BIOS memory timings, then I go to a single bank, and next I disable the motherboard cache to see if the issues go away. The one time that the issue went away after disabling the motherboard cache, I reconfigured the cache from 256KB to 128KB and started swapping chips to identify the bad chip. Usually when I've had L2 cache incompatibilities, the motherboard would not POST with the cache enabled, so I didn't need memtest to identify those issues. The only time I had on-chip cache issues (Cyrix 486DLC on a 386 board), the stability issues were clear and showed up quickly when I tried to run Doom. I was ready for those issues because they are common, and I expected that I'd need to tweak the configuration of the cyrix.exe tool, so that was a special case. (Although the floppy drive not working caught me by surprise, and that was more of a DMA issue than a cache issue, I suppose.)

Sometimes memtest needs to run for 12+ hours before the issues show up.
Often you need to get Memtest86+ v4.00 or older if you want to run it on a 486.
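
The order of operations I described for chasing memtest errors, written out as a small Python sketch. The `still_failing` callback is hypothetical: it stands in for a person making the change, rerunning memtest, and reporting whether errors remain. The step wording is paraphrased.

```python
from typing import Callable, Optional

# Changes are tried one at a time, in this order.
STEPS = [
    "relax BIOS memory timings",
    "run with a single bank of SIMMs",
    "disable the motherboard (L2) cache",
]


def first_fix(still_failing: Callable[[str], bool]) -> Optional[str]:
    """Return the first step after which memtest errors stop, or None."""
    for step in STEPS:
        if not still_failing(step):
            return step
    return None


if __name__ == "__main__":
    # Hypothetical outcome: errors persist until the L2 cache is switched off.
    observed = {STEPS[0]: True, STEPS[1]: True, STEPS[2]: False}
    print(first_fix(lambda step: observed[step]))
```

If the cache step is the one that clears the errors, the follow-up is the 256KB to 128KB reconfiguration and chip swapping to find the bad chip.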

Reply 36 of 38, by BitWrangler


Yah, memtest is great for ferreting out specific memory errors, kinda thing I use when I know there's a problem then go hunting for the problem. That floppy issue is a DMA thing I think, though if you're running off hard drives or substitutes you may notice sound issues first with DMA failing to work correctly for the digitized sound.

If you have a perfectly behaved system, and change one thing, like you've had slackarse memory timings on 80ns SIMM and get some shiny new 60ns to put in, then it is most efficient to go straight to something like memtest that is highly specific to the changed component to get it tuned to perfection and get it running right. Likewise if you have a perfectly behaved system running great for months, then a problem occurs, it is appropriate to use tools to delve into that subsystem individually, start at the leaf and work down the branch. Most efficient use of time.

However, should you have just assembled a system or done many part swaps at the same time, you want a highly general test to see how it's all working, preferably one that works the core of the system, so that you know the trunk of your tree is sound, before you go worrying about what symptoms various leaves and twigs are showing. Unless you do this, you might be swapping floppy drives and controllers, going through a whole pile of soundcards, because of the DMA problems and never realise you've got garbage generated in the core of the system flaking everything out, and not multiple issues with random peripherals.

So you can say things like "If a floppy doesn't read the disk is bad, if three floppies don't read the drive is bad" etc and these may be true in certain cases on a system otherwise known to be stable, specific diagnoses of problems. But you sound like a complete clown if you start from this end before you have established the "otherwise known to be stable" part.


Reply 37 of 38, by douglar

BitWrangler wrote on 2023-11-01, 14:33:

But you sound like a complete clown if you start from this end before you have established the "otherwise known to be stable" part.

I don't just sound like a clown when I forget the "otherwise known to be stable" check. I feel like one too!

Sage advice to only change one thing at a time btw.

A long time ago back in school, someone was trying to explain the meaning of the word "Hubris" to me. He was trying to use flowery terms like "Shaking your fist at the gods!" and I wasn't getting it. Then he said "Hubris is changing three pieces of hardware in your computer, putting all the case screws back in, and then expecting it to work when you power it on". I understood that last analogy. "Damn you, the god of PC interrupts!! Damn you the deity who created unkeyed IDE cables!"

Reply 38 of 38, by BitWrangler


Heh, I am naturally suspicious. I think the few times when I have assembled something complex, and it went together really easily, and it worked first time, I have actually ended up spending longer checking, rechecking, testing and retesting the darn thing than if I had hit an initial problem and solved it. It's like "That never happens" 🤣
