VOGONS


First post, by BitWrangler

Rank l33t++

Hi Vogons,

I've seen some passing references of the "maybe I've heard of it" type on this forum from years back. Can anyone confirm or deny the existence of a software emulator that implements 387 (or other x87) functions on Weitek hardware? Presumably it would be faster than emulators that run purely on integer hardware.

Y'all have probably seen me referring to x87s as "dust caps for the socket" or to 387s making your 386 fast enough to be too slow to play Quake, so you may be wondering why I'm bothering. Well, the FPU for non-technical uses came into its own at the upper end of 486 into Pentium levels of performance, which is where I have a fringe use case and would like some FPU support, but as you may know, a 487 ain't it. I do have boards with a 4167 hole that could potentially support the integer-only CPU in question.

Asking because I've got a shot at getting a 4167, and this is the only possible use I would have for one at all... I can tape over sockets to keep dust out if it bugs me.

TIA for any hints

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 1 of 4, by Deunan

User metadata
Rank Oldbie

The way x87 is emulated is by trapping the CPU exception raised when the CPU encounters a floating-point instruction (and later also MMX, SSE, etc). This mechanism was always meant to serve that purpose and is also useful for things like lazy FPU state save/restore in multitasking OSes. Such emulation is not as fast as code that was compiled with emulated x87 in mind, but it works well and (if done right) is transparent to running programs, unless those actually look for such emulation.
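To make the trap-and-emulate idea concrete, here is a toy model in Python. It is only a sketch: `ToyCPU`, `DeviceNotAvailable` and the three-instruction "program" are made up for illustration, and the exception merely stands in for the real x86 "device not available" trap (interrupt 7). The point it shows is that the program gets the same answer either way and never sees the handler.

```python
FP_OPS = {"fld", "fadd", "fmul"}


class DeviceNotAvailable(Exception):
    """Hypothetical stand-in for the x86 'device not available' trap."""


def fp_semantics(stack, op, arg):
    """What each toy FP instruction does to the register stack."""
    if op == "fld":
        stack.append(float(arg))
    elif op == "fadd":
        stack.append(stack.pop() + stack.pop())
    elif op == "fmul":
        stack.append(stack.pop() * stack.pop())


class ToyCPU:
    def __init__(self, has_fpu):
        self.has_fpu = has_fpu
        self.stack = []
        self.traps = 0  # how many times the emulator had to step in

    def step(self, op, arg=None):
        if op not in FP_OPS:
            raise ValueError(f"unknown instruction: {op}")
        if not self.has_fpu:
            # No coprocessor: the instruction faults instead of executing.
            raise DeviceNotAvailable(op)
        fp_semantics(self.stack, op, arg)


def run(cpu, program):
    """Run a program; the except branch plays the role of the OS emulator."""
    for op, arg in program:
        try:
            cpu.step(op, arg)
        except DeviceNotAvailable:
            cpu.traps += 1
            fp_semantics(cpu.stack, op, arg)  # emulate, then resume
    return cpu.stack[-1]


program = [("fld", 1.5), ("fld", 2.0), ("fadd", None),
           ("fld", 4.0), ("fmul", None)]

with_fpu = ToyCPU(has_fpu=True)
without_fpu = ToyCPU(has_fpu=False)
result_hw = run(with_fpu, program)
result_emu = run(without_fpu, program)
```

Both runs end with the same value on top of the stack; the only difference is that the FPU-less run takes one trap per floating-point instruction, which is where the speed penalty of this style of emulation comes from.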

Weiteks are, IIRC, mostly memory-mapped. Accesses to them can be intercepted in protected mode (or virtual 8086 mode) through paging, but not easily, and doing so would surely clash with most 32-bit OSes. So any such emulator would be quite limited as far as software compatibility is concerned.

As for speed, or lack of it, when it comes to early x87 - that's only partly due to silicon limitations. The whole implementation of the NPU/FPU in the 8086 family is just kinda broken. The team that did it is not to blame, they just made a chip according to Intel specs; it's those specs that were rather inadequate already for the original 8086, and things didn't get much better until the 486, in which the FPU is bolted to L1 directly and can also make use of the integer ALU to speed up certain operations. That being said, if you were using CAD on a PC back in the late 80's you would most certainly want a 287 for your 286, and would switch to a 386 as soon as it was possible. These are all slow by modern standards, but Quake was the effect, not the cause, of fast FPUs.

Reply 2 of 4, by BitWrangler

User metadata
Rank l33t++

Well yes and the installed user base becoming large enough for games using FPU to be commercially viable when enough 486DX and above had been sold. If you were relying on presence of an FPU for your games before ~1993, you'd better hope they appealed to a lot of CAD users, accountants etc. Prior to that it was only a few things like MS Flight Simulator, some games by Moraffware and one game (The Hunt or something like that) that used it to help generate the random environment at the beginning of a game, if installed, then ignored it. And those were optional speedups, not required to run.

Then even in the 80s the 8087 and 80287 were not the only games in town. I was looking through a Micro Cornucopia issue from the late 80s and they were describing 68020/68881 math processor boards for the PC, billing them as more like a scientific minicomputer on a board. (Though even if those had been massively popular, I doubt there would have been much gaming use, as they had to communicate through 8-bit ISA.) I knew a cadre of PhD chemists who refused to give up on VAX clustering for simulation and analysis until Pentiums hit the desktop.

But anyway, I did try a 387 emu for 386 back in the day for a game where there was a 387 speedup, and was a little surprised that with the emulator it also sped up just a little. I guess better math routines than the ones in C libraries are mostly the cause of that. I am aware that Weiteks work a lot differently, but have also heard that the biggest speedups / slowdowns come from multiplication and they are particularly good at that. So given that 387 emulation actually works on integer hardware, just kicking a few of the functions that the Weitek is super fast at out to a Weitek seems a plausible thing, even though it doesn't have a feature-for-feature match with a 387 and was an entirely different beast.

So anyone seen it? The mythical Weitek-boosted 387 emulator?


Reply 3 of 4, by Deunan

User metadata
Rank Oldbie

I'd say a lot of Weitek's speed came not from superiority at any particular math operations but from the way it was meant to work. Easy memory-mapped access to multiple registers (and AFAIR Weitek had lots of those) is just a lot faster than the I/O channel method both 287 and 387 used.

Example: a 387 needs 24-32 cycles to execute FADD with a 32-bit float memory argument, usually about 30. And that's the original i387; the later 387DX and other NPUs would take some 20-25. Of those, 2 cycles are for the CPU to fetch the operand address, another 2 to read it (assuming a 32-bit bus and aligned data), another 2 to transfer that to the NPU, and a few more to transfer the instruction code and set up the I/O for all those transfers in the first place. A 486 FPU would need on average about 10 cycles (8-20) for the same job. As low as 5 if the integer ALU was free and could be commandeered for FPU needs, but without that it's 10, and if you add all the extra cycles a 386DX needs to even get the 387 to do anything, it comes close to 20. So there isn't that massive a difference between them. My point here being, the guts of a numeric co-processor do matter, but HOW it's bolted to the system can matter even more.
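The cycle arithmetic above can be laid out as a quick Python sketch. The cycle counts are the rough figures quoted in this post; the 33 MHz clock is purely an assumption so the numbers can be turned into time, and all the variable names are made up for illustration.

```python
CLOCK_MHZ = 33  # assumed clock, only used to convert cycles into time


def cycles_to_ns(cycles, clock_mhz=CLOCK_MHZ):
    """Convert a cycle count into nanoseconds at the given clock."""
    return cycles * 1000 / clock_mhz


# Rough figures quoted in the post (FADD with a 32-bit memory operand).
i387_typical = 30            # original i387, "usually about 30"
i486_typical = 10            # 486 on-chip FPU, average case
i486_core_on_386_bus = 20    # "comes close to 20" once 386/387 overhead is added

# Overhead items the post spells out: address fetch, operand read,
# transfer to the NPU (2 cycles each).
itemised_overhead = 2 + 2 + 2

# Overhead implied by the "close to 20" estimate, and the part of it that
# is only described as "a few more" cycles for opcode transfer and I/O setup.
implied_overhead = i486_core_on_386_bus - i486_typical
unitemised_overhead = implied_overhead - itemised_overhead

for label, cycles in [("i387 typical", i387_typical),
                      ("486 FPU typical", i486_typical),
                      ("486 core behind 387-style coupling", i486_core_on_386_bus)]:
    print(f"{label}: {cycles} cycles, about {cycles_to_ns(cycles):.0f} ns")
```

Read this way, roughly half of the hypothetical 20-cycle figure is spent just getting the instruction and operand across to the coprocessor, which is the "how it's bolted on" cost rather than the math itself.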

As for the original question, I've never heard of such an emulator; even the actual chips were somewhat rare (and by now have stupid "collector" prices).

Reply 4 of 4, by Jo22

User metadata
Rank l33t++

The Weiteks were also less accurate than x87.

80-bit as a temporary format was handy for chaining multiple calculations together, and for adding/multiplying large numbers.
That's what people forget, I think, when they say 80 bits are unnecessary.
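A small Python sketch of why the wider temporary format helps. It uses exact fractions and a made-up helper, `round_to_bits`, to imitate rounding every intermediate result to a given significand width: 53 bits for a plain double, 64 bits for the x87 80-bit format. Exponent range is ignored, so this only models precision, not overflow.

```python
from fractions import Fraction


def round_to_bits(x, bits):
    """Round x to `bits` significant binary digits (ties to even)."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    scale = Fraction(2) ** (bits - 1 - e)
    return sign * Fraction(round(x * scale)) / scale


def chained_sum(values, intermediate_bits):
    """Add values one by one, rounding each partial sum to the given width,
    then store the final result as a 53-bit double."""
    acc = Fraction(0)
    for v in values:
        acc = round_to_bits(acc + v, intermediate_bits)
    return round_to_bits(acc, 53)


# A large number, a small one, then the large one taken away again.
values = [Fraction(2) ** 53, Fraction(1), -(Fraction(2) ** 53)]

double_only = chained_sum(values, 53)    # small term is lost on the way
with_extended = chained_sum(values, 64)  # small term survives
```

With 53-bit intermediates the 1 is rounded away when it is added to 2^53, so the final answer is 0; with 64-bit intermediates the sum is held exactly and the answer is 1, even though the stored result is only a double in both cases.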

For CAD/CAM or graphics in general, Weitek accuracy was sufficient, however, I suppose.

It's a bit like CISC vs RISC, I think.
RISC is faster for simple things, CISC is faster for certain complex things.

Too bad OS/2 and Windows never supported the Weiteks, IMHO.
They could have assisted GDI nicely years before the Windows accelerator cards.

Without messing up x87 registers, leaving them available for user applications (Excel, SimCity, etc).
No need to save/restore the x87 registers so often during context switching.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel
