VOGONS


3 (+3 more) retro battle stations

Reply 2480 of 2488, by sqpat

Rank Newbie

I actually tried the FPU for some RealDOOM stuff a couple of weeks ago. The obvious use case is replacing 48:32-bit division (a single instruction on x86-32, but a complicated mess on x86-16 that in the worst case might take multiple div/mul instructions to replicate). Even in that worst case I could not get a performance increase; in fact it was generally a significant loss. Something could probably be built from the ground up to use the 287 properly as a coprocessor rather than blocking on each result, but that is hard to integrate into Doom. I don't think the FPU is fast enough to be doing a FIDIV for every column drawn in the Doom engine. There are a handful of other operations done to prepare the column for drawing, but I think even if they were done in parallel you'd sit there waiting a long time for the FIDIV to complete. And if it's not even fast enough to do this, it's hard to think of something it could do often and at a high frame rate to be useful.
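To make the division concrete, here is a minimal Python sketch of a 16.16 fixed-point divide modelled on the FixedDiv in Doom's source, next to a bit-serial (shift-and-subtract) divide that stands in for the kind of multi-step fallback a 16-bit CPU needs. The names `fixed_div` and `shift_subtract_div` are made up for illustration, and this is not RealDOOM's actual routine.

```python
FRACBITS = 16
INT_MAX = 0x7FFFFFFF
INT_MIN = -0x80000000


def fixed_div(a, b):
    """16.16 fixed-point a / b, modelled on the FixedDiv in Doom's source."""
    # Guard: if the quotient cannot fit in 32 bits, saturate instead of dividing.
    if (abs(a) >> 14) >= abs(b):
        return INT_MIN if (a ^ b) < 0 else INT_MAX
    # The dividend is a 32-bit value shifted left by 16, so up to 48 bits wide.
    q = (abs(a) << FRACBITS) // abs(b)
    return -q if (a ^ b) < 0 else q


def shift_subtract_div(dividend, divisor, bits=48):
    """Unsigned restoring division producing one quotient bit per step."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    quotient = 0
    remainder = 0
    for i in range(bits - 1, -1, -1):
        # Bring down the next dividend bit, then try to subtract the divisor.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder
```

The bit-serial loop runs 48 iterations for one quotient, which hints at why a single 32-bit divide instruction (or a fast enough FPU) looks attractive for a per-column operation.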

That said, it was my first time using 287s, and when I tried to bench various programs it seemed the performance of both my FPUs was halved. At 25 MHz I scored ~521 KWhetstones on Navratil (some of the other benches were not working or were 'off the scale' low), while it seems you got 1097 KWhetstones with the same FPU and speed. I tried both a 20 MHz IIT and a Cyrix CX-82S87-NP-SV (both ran at 25 MHz just fine) and both got about half your scores. I fiddled around with bus clocks and things like that, but the performance was consistently half of what I expected. The screen did say the FPU was running at 25 MHz and there was no sign it was a fake chip or anything. I could imagine something like wait states existing in 287 communication, but most of the cycles are spent with the FPU doing its work, so that alone shouldn't explain such a large drop. This was the SCAT router board; I can try another board later. I scoured all the chipset settings and docs and couldn't find anything to suggest the speed should be halved. If the FPU performance really was compromised in some way, I might have to revisit those RealDOOM FPU div tests at some point.

Outside of benchmarks, is there anything actually 'cool' using the FPU? I guess I am mostly aware of CAD, Mandelbrot/fractal programs and flight sims. I never really messed around with this stuff either, so I have some learning to do. Perhaps a raytracer could use an FPU as an "accelerator" that traces other pixels in parallel or something. Might make a neat demo/benchmark itself.

Reply 2482 of 2488, by pshipkov

Rank l33t

Yeah, for this class of hardware it is unfair to compare FPUs against specialized fixed-point math in on-rails runtimes such as video games.
FPUs make a significant difference for general-purpose math computations, usually found in CAD systems, offline graphics software (image and video editing, vector and 3D rendering, etc.), spreadsheets, and research/modeling (MATLAB, etc.).
In the late '80s and early '90s, compatible with 286/287, I can think of a handful of "cool" things such as AutoCAD, TurboCAD, MicroStation, POV-Ray, Chaos (a cool set of fractal generators published by Autodesk), CorelDRAW, and driving/flight simulators with vector graphics, but that's about it.
From what I can see online, people tend to run a game or two on their 286es once in a while for the good old memories, so the obscurities above are largely a non-factor.

VLSI SCAMP is brutally slow on a clock-to-clock basis. It is incomparable to VLSI VL82CPCAT-16QC or Headland HT-18.
I was able to squeeze out of it ~750 kwhetstones/s in NSSI at 20MHz CPU and 10MHz FPU but that's about it. Link here. This is long-term stable.
Hope these notes help somewhat.

I prefer to test things with real software. For benchmarking 287 FPUs I use Autodesk Chaos. It greatly stresses the system and also makes nice images (for its time).
Take a look at this post - the last chart. It gives a glimpse into how things are for the 287. Time is measured in seconds. Only a handful of chipsets/motherboards are worth FPU-testing, as the rest are way WAY too slow.
The fastest chipset and motherboard for 287 FPUs that I have seen so far is the Protech PM286, based on Headland HT-18/C. The thing is a beast in this area.

(btw, this conversation reminds me that it is time to post info about a few more 286 chipsets - obscure ones)

retro bits and bytes | DOS media library

Reply 2483 of 2488, by pshipkov

Rank l33t

Ever since the very successful outcome with the Chicony CH-471B rev 2.0, I have been wondering how Chico Jr. (the A models) would do in terms of performance and overclocking. Not long ago I managed to obtain rev 3.0 of the assembly and inspected it.

Chicony CH-471A rev:3.0 based on SiS 85C471, 85C407

motherboard_486_chicony_ch-471a_ver_3.0.jpg

Classic ISA/VLB layout. Nothing much to say really.

There was some minor corrosion from a leaked battery around the lower right corner of the memory slots. Cleaned it for good.

Upgraded it to 1MB of level 2 cache. The board is not very picky about level 2 cache chips; I was able to quickly bin the right set.
I see this class of hardware as mostly DOS-bound, so my preferred card is the Ark1000VL, since it is the VGA blaster (the fastest DOS interactive graphics). The card can be fussy in some motherboards with tight BIOS timings; this is a known issue. Instead of relaxing the wait states and slowing the system down, I switched to a Diamond Stealth 64 DRAM T VLB REV B2 (S3 Trio64), which is more resilient than the Ark1000VL in such situations. The S3 Trio64 is slightly slower clock-to-clock than the Ark1000VL at DOS interactive graphics, but it handles the tightest BIOS settings, which results in better overall performance.

Local storage through Promise EIDE2300Plus with CF card attached to it.

--- Am5x86 at 160MHz (4x40)

All BIOS settings on max.
1MB level 2 cache - achieved with a mix of 10/15 ns chips.
32MB (2x16) RAM, 60ns.

Level 1 cache policy is always in write-through mode regardless of what the corresponding BIOS setting says.

Nothing much to add really. Things just work. System is fully stable. Intermediate performance.
performance results

chicony_ch_471a_ver_3.0_speedsys_160.png

--- Am5x86 at 180MHz (3x60)

3.6V to CPU, 5V Peltier element for cooling.

All BIOS settings on max, except:
DRAM SPEED = "SLOWER" (best is FASTEST)
DRAM WRITE WS = 1 WS (best is 0 WS)
CACHE BURST READ = 1T (best is 0T)
LOCAL BUS READY = SYNCHRONIZE (best is TRANSPARENT)

Was not able to achieve fully stable system.
All is good in the simple DOS interactive graphics (Wolf3D, Doom, Quake 1, a bunch of other benchmarks / games) and standard Windows usability tests, but some of the heavy offline compute tests fail no matter what. Tried all CPU voltages. Tried a 12V Peltier for a deep processor freeze. Tried all sorts of BIOS setting combinations, from the most conservative to different grades of wait state tightening. Rotated CPUs, L2 cache chips, video and IDE controllers, RAM modules. Used multiple sets of trusted components. Nothing helped.
For DOS gaming and casual Windows activities, the motherboard totally cuts it. For more intricate usage - there can be trouble.

Performance is lacking - below average.
performance results

chicony_ch_471a_ver_3.0_speedsys_180.png

--- Am5x86 at 200MHz (3x66)

System is unstable no matter what.

--- P24T (POD100) at 100MHz (2.5 x 40)

Similar to Chicony CH-471B ver 2.0, the system hangs at boot time no matter what. Tried everything - components, frequencies, BIOS settings - no luck.

---

All in all - a slightly disappointing motherboard.

Reply 2484 of 2488, by feipoa

Rank l33t++

The Chicony CH-471A rev 3.0 looks similar to Chicony CH-471B rev 2.0, except that the B-rev2 had to make space for onboard IDE/floppy/IO (UM82C865F, SMC FDC37C666GT, Appian ADI/2). The B-rev2 being the better board, I wonder if having these components integrated onto the motherboard somehow was the magic difference. Everything is hit or miss at 180 MHz.

Plan your life wisely, you'll be dead before you know it.

Reply 2485 of 2488, by pshipkov

Rank l33t

I expected that the simpler, more compact A model would do better than the B, but as you said, when operating out of specification, assumptions are meaningless.

Reply 2486 of 2488, by sqpat

Rank Newbie

OK - here is my first crack at the 5434 BIOS for the Diamond Speedstar64. It works in 86Box on a 286. I am on a plane right now and cannot test on my real hardware for a day or two. Use at your own risk etc.

https://github.com/sqpat/5434bios/blob/main/5 … 6-speedstar.BIN

I think they probably used this on the PCI 5434. There are a lot of 32-bit instructions used in the PCI bus tests. The asm returns with the carry flag set when the card is not found on the PCI bus, and this call is made in a lot of places before running other PCI checks. Instead I just set carry and return. Then there are some spots that check all the memory on the card. They were using DWORD (32-bit) string copies, which had to be changed to 16-bit copies to make it work. It's a pretty naive but safe implementation. I have a feeling the initialization could be slow on some machines.
Until I have hardware access I won't really be able to tell the performance of the card on 286 systems but I hope to be able to check in a couple of days.
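As a rough illustration of the two changes described above, here is a small Python model. It is not the BIOS code itself: `pci_probe_carry`, `copy_dwords` and `copy_words` are hypothetical names, the carry flag is modelled as a bool, and byte strings stand in for video memory.

```python
def pci_probe_carry():
    """Stubbed PCI probe: always report 'card not found on the PCI bus'.

    Returns the modelled carry flag; True means carry set.
    """
    return True


def copy_dwords(src, count):
    """Model of a 32-bit string copy moving `count` 4-byte elements."""
    dst = bytearray()
    for i in range(count):
        dst += src[4 * i:4 * i + 4]
    return bytes(dst)


def copy_words(src, count):
    """Model of a 16-bit string copy moving `count` 2-byte elements."""
    dst = bytearray()
    for i in range(count):
        dst += src[2 * i:2 * i + 2]
    return bytes(dst)
```

The 16-bit version moves the same bytes but needs twice the element count, which is one plausible reason the memory checks could feel slow on some machines.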

I worked hard on RealDOOM the past few months, but over the last day or two I got to binning 286 chips and overclocking a little. I actually went through about 200 chips, a lot of them from new sources among Chinese sellers. I am traveling and will have hardware access again in a few weeks; then I will finish binning and I think I will have some interesting learnings. One interesting result so far: if you go through real, legitimate early-90s Harris 12/16 MHz stock you get a lot of 30 MHz chips, so maybe the fake 25s that are sometimes capable of low to mid 30s aren't even fakes of legitimate 20s; maybe they are fakes of 12s or 16s. Of course, I really want to confirm date codes and batch IDs, since it might just be specific good batches here and there.

I managed a 3DBench run of 14.4 at 38.4 MHz (though there are some missing pixels if you look closely, so I'm not sure how legit it is). At 38.0 MHz I got a RealDOOM timedemo to complete at a little over 26 fps, and a 14.2 3DBench run. These involved chucking my RAM in the freezer pre-run, so I had limited run time. Mostly I wanted to get the practice down. It's like a lazy man's Peltier.

38.0 mhz
https://www.youtube.com/watch?v=zUrvGz0fMz0
38.4 mhz
https://www.youtube.com/watch?v=Tab0VChx9UQ

As usual these are all on my SCAMP board, this time with an ET4000AX/W32i. I experimented at slightly higher clocks. I can definitely boot with 1 ws at 40 MHz, for example, so I really think I am DRAM limited. I wanted to try 2 populated banks but was struggling to do better than 33-34 MHz. I will have to try again later.

I think I'm going to take a crack at a topcat BIOS next. VLSI topcat as you may know is a 286/386sx BIOS, and Rodney over on the vcfed forums is refining his topcat 286 design, but he says the working 286 BIOS kind of sucks while the 386 one is much more tunable. If it's just a matter of ripping out some small 386 protected mode tests, I may be able to get that working for him.

I think a lot of the performance difference between 286 chipsets potentially comes down to wait states and clock speeds. Of course, if one is not capable of high clocks it's very limited, but I think wait states are a more complicated subject: usually the effective wait states are not 0 or 1 but something in between, depending on factors like interleaving.
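A back-of-the-envelope way to put numbers on this, as a sketch only: it assumes the usual 286 timing where a zero-wait-state bus cycle takes 2 processor clocks and each wait state adds one more, and it uses a toy model in which some fraction of accesses take the fast path (for example thanks to interleaving). The function names and the fraction parameter are assumptions for illustration, not measurements from any chipset.

```python
def bus_cycle_ns(cpu_mhz, wait_states):
    """Length of one 286 memory bus cycle in nanoseconds.

    Assumes 2 processor clocks at zero wait states, plus one per wait state.
    """
    return (2 + wait_states) * 1000.0 / cpu_mhz


def effective_wait_states(fast_fraction, fast_ws=0, slow_ws=1):
    """Average wait states when `fast_fraction` of accesses take the fast path."""
    return fast_fraction * fast_ws + (1.0 - fast_fraction) * slow_ws


# At 40 MHz the DRAM has 50 ns per cycle at 0 ws and 75 ns at 1 ws.
print(bus_cycle_ns(40, 0), bus_cycle_ns(40, 1))
```

Under these assumptions a chipset whose accesses hit the fast path three quarters of the time would behave like a 0.25 ws machine, which is the kind of in-between figure meant above.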

At some point I would like to just have a stable 32 or 35 mhz machine, but even that is asking a lot of my DRAM.

Reply 2487 of 2488, by JonnyAmps

Rank Newbie

I very excitedly gave this BIOS a shot on several GD-5434 cards in an Amptron VLSI Turbo (VL82CPCAT) and then a PM286 (HT18C). Both boards gave the "no video detected" beeps. I tested the BIOS on Speedstar64s Rev A3 and A3-A along with an STB Nitro ISA.

Reply 2488 of 2488, by sqpat

Rank Newbie

The behavior of my board:

Working video card: normal POST
No video card: 1 long, 2 short beeps.
Unmodified GD5434: failed post. No beeps.
5434 with new BIOS: Worked!

Mine was a Diamond Speedstar 64 with 2 MB installed, and the original BIOS was rev 2.02. Interestingly it had a one-time-programmable ROM on it, with no window (ATMEL AT27C256R). Maybe make sure you are not accidentally writing to a one-time-programmable EPROM originally on the card (though your programmer should say it failed if there was a verify step). But if you still had the old BIOS it should crash the processor and not do any beeps anyway. So hmm, that's interesting. I wonder if it could be related to memory size or jumpers... or a hardware revision.
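For anyone wanting to rule out a bad burn, a verify step amounts to reading the chip back and comparing it against the image. Below is a minimal sketch of that comparison, plus a check of the conventional PC option ROM header (55 AA signature, length byte in 512-byte blocks, bytes summing to zero mod 256). The function names are hypothetical, and the header check is the general PC convention rather than anything established in this thread about this particular image.

```python
def first_mismatch(image, dump):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(image, dump)):
        if a != b:
            return offset
    if len(image) != len(dump):
        # One buffer is a strict prefix of the other.
        return min(len(image), len(dump))
    return None


def looks_like_option_rom(image):
    """Check the conventional option ROM signature, length byte and checksum."""
    if len(image) < 3 or image[0] != 0x55 or image[1] != 0xAA:
        return False
    size = image[2] * 512  # third byte is the ROM length in 512-byte blocks
    if size == 0 or size > len(image):
        return False
    return sum(image[:size]) % 256 == 0
```

A read-back that differs from the image at some offset would point at the chip or the programmer rather than at the card or motherboard.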

I'd hope other people can share their experience with whether or not the card works. I think if the card beeps at you, that sounds like the video memory test maybe failed..? I'd like some more data points, so I will post about this in other spots. I just thought the 286 nerds here might like to know first.

Some benches on this vs a speedstar pro:

5429 (Speedstar PRO):

landmark: 6261.40 / 39.72
3dbench: 10.5
topbench: 235/113/90/171/97 total 719 (about) = 70
realdoom (current/0.87): 8293

5434 (Speedstar 64):

landmark: 7336.12 / 39.72 (faster)
3dbench: 10.7 (faster)
topbench: 235/113/90/179/97 total 718 (about) = 69 (slower)
realdoom (current/0.87): 8088 (about 2.5% faster)
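To compare the two cards on a common scale, the figures above can be turned into percentages. This is only a sketch: `percent_faster` is a made-up helper, and it assumes the realdoom number is lower-is-better (the lower figure is the one marked faster above) while the landmark and 3dbench figures are higher-is-better.

```python
def percent_faster(old, new, lower_is_better=False):
    """How much faster `new` is than `old`, as a percentage."""
    ratio = old / new if lower_is_better else new / old
    return (ratio - 1.0) * 100.0


# (5429 result, 5434 result, lower_is_better)
results = {
    "landmark (first figure)": (6261.40, 7336.12, False),
    "3dbench": (10.5, 10.7, False),
    "realdoom": (8293, 8088, True),
}

for name, (old, new, lower) in results.items():
    print(f"{name}: {percent_faster(old, new, lower):+.1f}%")
```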

So that's interesting. Topbench says video memory is faster and gives the 5429 a higher score, but the 5434 wins at everything else, including raw throughput. I double checked the 0 ws jumpers and such.

I'm currently away from all my hardware and unable to see if the card will bench at high clocks. I think my 5426 outperforms my ET4000AX/W32i a little bit in clock-per-clock synthetic benchmarks but struggles beyond 35-37 MHz, while the Tseng will do 41-43. If this card did more than that, it could probably get big chr/ms and topbench scores, but it wouldn't affect stuff like 3DBench or Doom scores.

It's possible a card like this can enable higher-end Windows experiences on a 286 than before. I wonder if the drivers are 286 compatible... I recall someone doing driver development for a Cirrus card recently on vcfed. I will follow up over there...

What a time to be alive, running doom on my 5434 on my 286 right now.