VOGONS


Math Coprocessor speed question....


Reply 20 of 29, by Horun

User metadata
Rank l33t++

Never mind, you found it. There is some good info, and "it", here: Re: Are all IIT 387 FPUs the same? 3C87 =/= 4C87DLC after all?

Hate posting a reply and then have to edit it because it made no sense 😁 First computer was an IBM 3270 workstation with CGA monitor. https://archive.org/details/@horun

Reply 21 of 29, by mattw

User metadata
Rank Oldbie
Horun wrote on 2023-02-01, 23:37:

Never mind, you found it. There is some good info, and "it", here: Re: Are all IIT 387 FPUs the same? 3C87 =/= 4C87DLC after all?

thank you, this one is a different and maybe even more complete version, because it contains more files, including the source code of the "Biomorph/Mandelbrot Fractal Generator" in BIOM.ARC. So it's still a very useful comment!

[EDIT] Wow, ran 'IEEETEST' in 86Box and honestly I did not expect 14303 results to fail. So even in 2023, the code used there to emulate the 387 FPU is very inaccurate compared to real hardware.

Attachments

  • ieeetest_86box.png
    Filename
    ieeetest_86box.png
    File size
    8.81 KiB
    File comment
    86box IEEETEST results screenshot
    File license
    Public domain

Reply 22 of 29, by Jo22

User metadata
Rank l33t++
mattw wrote on 2023-02-01, 23:47:

[EDIT] Wow, ran 'IEEETEST' in 86Box and honestly I did not expect 14303 results to fail. So even in 2023, the code used there to emulate the 387 FPU is very inaccurate compared to real hardware.

This isn't exactly a solution perhaps, but Franke 387 was a good x87 emulator.
http://icfs.de/franke387.html

Some more x87 emulators to test:
Re: Coprocessors for the AMD 386DX 40MHz CPU

In the end, it's emulation, anyway. And it must be done by the CPU. 😉
Unless there's some weird FPU support integrated into the PC emulator.
DOSBox does this in its Win32 builds, I think.
It uses the x87 on the host side. Not sure if it requires the dynarec core, though.

Edit: Beware, there are several revisions of IEEE 754. The original is from 1985, but the international version is from 1989 (IEEE 754-1989)!
It came out pretty much in the 80486 days.
Older 8087, 80287, and 80387 FPUs maybe can't comply with this release.
Further revisions came out in 2008 and 2019, also.

https://de.wikipedia.org/wiki/IEEE_754

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 23 of 29, by mattw

User metadata
Rank Oldbie
Jo22 wrote on 2023-02-02, 05:14:

Beware, there are several revisions of IEEE 754. The original is from 1985, but the international version is from 1989 (IEEE 754-1989)!
It came out pretty much in the 80486 days.
Older 8087, 80287, and 80387 FPUs maybe can't comply with this release.

thanks, but do you know where the 1985 version can be found? My best guess is that what I am using is the 1989 version, since the Cyrix test suite was released in 1991.

Reply 24 of 29, by Jo22

User metadata
Rank l33t++
mattw wrote on 2023-02-02, 09:02:
Jo22 wrote on 2023-02-02, 05:14:

Beware, there are several revisions of IEEE 754. The original is from 1985, but the international version is from 1989 (IEEE 754-1989)!
It came out pretty much in the 80486 days.
Older 8087, 80287, and 80387 FPUs maybe can't comply with this release.

thanks, but do you know where the 1985 version can be found? My best guess is that what I am using is the 1989 version, since the Cyrix test suite was released in 1991.

Sadly no. 🙁 I merely read about the existence of different versions. Perhaps they're not that different after all, and the emulation really is faulty.
I didn't mean to say that it was a user fault or something, either. I just stumbled over something (as usual, hah 😅) and thought I should let you know.
What I remember, though, is that the original Intel 80287 differed from its successors in the way it handled infinities (I think?).
It complied with an older IEEE standard, I think. A few older programs didn't like that change and had compatibility or accuracy issues with later FPUs.
That's another reason why I thought I should mention it.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 25 of 29, by mattw

User metadata
Rank Oldbie

@Jo22: thank you. Frankly, until 2 days ago I wasn't even aware what a big mess there is with math co-processors. In fact, those "tech demos" or "test suites" or whatever would be very useful if someday someone decides to make an exact and precise emulator - I already found a few different versions of both the Cyrix and Intel test suites. So, Intel chips are supposed to pass all IEEE tests, while Cyrix chips are the most accurate (fewer rounding and approximation errors). In any case, it's a little bit shocking to me that for so many years, no one actually made an emulator anywhere close to real Intel and/or Cyrix chips.

Reply 26 of 29, by HanSolo

User metadata
Rank Member
mattw wrote on 2023-02-02, 11:13:

In any case, it's a little bit shocking to me that for so many years, no one actually made an emulator anywhere close to real Intel and/or Cyrix chips.

Those emulators were meant to allow people to use software that requires a coprocessor. I assume that a 100% exact emulation would have been a lot slower than one with less precision. So with an exact emulator such software would run in theory, but it would probably be so slow as to be practically unusable.

Internally, an x87 coprocessor uses 80-bit wide registers, but hardly anyone really needs such precision. My best guess would be that emulators limit calculations to 64 bits.

Reply 27 of 29, by Deunan

User metadata
Rank Oldbie

Intel x87 are _a_ reference, not _the_ reference. Granted, on the PC, where Intel was the assumed default, you'd want any competing design to be 100% compatible. But not because it's correct, rather because it causes fewer issues if every device gives the exact same result.

IIRC, IEEE 754 is not worded so exactly as to precisely define what should happen to the least significant bit in each and every case of ambiguity. And non-x86 FPUs often just do things differently, like for example flushing denormals to zero rather than trying to recover all the possible bits of the result. This is, BTW, also an option on modern x86 FPUs.

So, there are going to be some minor differences due to rounding or finite precision (like built-in lookup tables for trig functions, etc.). A lot of these testing programs were written not to test against IEEE 754, which might not even define some behaviour, but to "prove" that this or that particular math coprocessor is superior to other products. Some IIT chips, for example, were not fully IEEE compliant because they did not support reduced precision and ignored those control bits, always using all 80 bits. This is clearly not correct behaviour, but it also can't be called "less precise", and the chip was faster than Intel's.

TL;DR: It's OK to label some chips as not fully (that is, 100%) Intel-compatible, in the sense of every bit of the result being always the same in every scenario. But be careful when using words like "broken", "incorrect" or "inexact" - there are such cases, but mostly it's open for discussion what exactly is exact or not, Intel products included.

Reply 28 of 29, by Jo22

User metadata
Rank l33t++
HanSolo wrote on 2023-02-02, 13:50:

Internally, an x87 coprocessor uses 80-bit wide registers, but hardly anyone really needs such precision.

80 bits weren't that bad, actually. Some calculators used even higher accuracy, or so I heard.
The creator(s) of the x87 FPU said that the FPU was most useful for practical users, not math fans.

"It is a common misconception that the more esoteric features of the IEEE 754 standard discussed here, such as extended formats,
NaN, infinities, subnormals etc., are only of interest to numerical analysts, or for advanced numerical applications.
In fact the opposite is true: these features are designed to give safe robust defaults for numerically unsophisticated programmers,
in addition to supporting sophisticated numerical libraries by experts.
The key designer of IEEE 754, William Kahan notes that it is incorrect to "... [deem] features of IEEE Standard 754
for Binary Floating-Point Arithmetic that ...[are] not appreciated to be features usable by none but numerical experts.
The facts are quite the opposite. In 1977 those features were designed into the Intel 8087 to serve the widest possible market...
Error-analysis tells us how to design floating-point arithmetic, like IEEE Standard 754,
moderately tolerant of well-meaning ignorance among programmers".[40]"

Source: https://en.wikipedia.org/wiki/IEEE_754

The underlying problem, I think, was rather portability and the dependency on the 64-bit limitation of the C/C++ language.
This subservience to the C languages still exists, I think. Everyone has to bow down before C syntax, conventions, etc.
Users of, say, BASIC and PASCAL always had to deal with the C dependencies inside operating systems.
Except DOS and CP/M, maybe, which had a notable chunk of ASM. 😀

Edit:

HanSolo wrote on 2023-02-02, 13:50:

My best guess would be that emulators limit calculations to 64 bits.

Yes, that seems plausible. Math libraries and such used to use single precision, AFAIK.
Turbo Pascal had the ability to include both x87 code and math libraries in executables.
So if an x87 (either real, or as a software emulator) was around, it used x87 instructions;
otherwise, it used the math library (non-x87, just math algorithms).
Provided that both options were set by the programmer at compile/link time.

Edit: Correction (regarding Turbo Pascal):
"Floating point
There were several floating point types, including single (the 4-byte (EEE 754) representation) double (the 8-byte IEEE 754 representation),
extended (a 10-byte IEEE 754 representation used mostly internally by numeric coprocessors) and Real (a 6-byte representation).

In the early days, Real was the most popular. Most PCs of the era did not have a floating point coprocessor so all FP had to be done in software.
Borland's own FP algorithms on Real were quicker than using the other types, though its library also emulated the other types in software. "

Source: https://en.wikipedia.org/wiki/Turbo_Pascal

(*My father had one, since he had to have one. 😁 He worked as a programmer/developer in the 80s and had to test the software..
Also, Turbo Pascal 3 was very popular; it still existed on the CP/M platform, too. We had an older version on 8" floppy, I vaguely remember..)

Edit: Perhaps it's worth remembering in which times single or double precision was originally used.
In the days of mainframes, 8-bit and 16-bit computers, 64 bits seemed huge. Hence the use of 64 bits in the C/C++ language, I suppose.
Now 64-bit is considered very small or just "normal" (like 16-bit or 32-bit a few decades before).
Yes, we use 64-bit CPUs, but the SIMD units in them are capable of 128-bit, 256-bit and 512-bit accuracy.
SSE itself used 128 bits; AVX uses 256 and 512 bits, depending on the generation..

Source: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions

Edit: My apologies for the many edits. The typos somehow didn't end. 😅

Edit: This link is also interesting. I had no idea the Motorola 68000 family of microchips had an Extended Precision counterpart (pun intended)!
https://en.wikipedia.org/wiki/Extended_precis … recision_Format

Edit: Sorry for the many edits. I have trouble concentrating; too many mistakes and typos here. Even for my taste.

Edit: Never mind. Too much text.. Please just ignore this post and go on.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 29 of 29, by HanSolo

User metadata
Rank Member
Jo22 wrote on 2023-02-02, 17:53:

Edit: Perhaps it's worth remembering in which times single or double precision was originally used.
In the days of mainframes, 8-bit and 16-bit computers, 64 bits seemed huge. Hence the use of 64 bits in the C/C++ language, I suppose.
Now 64-bit is considered very small or just "normal" (like 16-bit or 32-bit a few decades before).
Yes, we use 64-bit CPUs, but the SIMD units in them are capable of 128-bit, 256-bit and 512-bit accuracy.
SSE itself used 128 bits; AVX uses 256 and 512 bits, depending on the generation..

64-bit floats are not considered 'small'. In fact, they are the standard for everything where high precision is required in real-world applications. (Things might be different when building space rockets or in large-scale physical simulations.)
SSE and AVX support 32- and 64-bit calculations; they only perform the same instruction on a vector of multiple 32- or 64-bit registers. AVX-512 only means that this vector is 512 bits wide (i.e. eight 64-bit lanes). It does not do 512-bit arithmetic.