@elianda
Thank you for sharing those documents; however, I failed to see the requirement for a Cx487DLC-40 FPU when using a Cyrix 486DLC CPU. That article mentions that a Cx83D87-40 is the FPU of choice for a 486DLC. I have bolded the section of interest. It was suggested to me that the Cyrix 487DLC was a rename of the Cx83D87 (Cyrix 387) for marketing reasons. If you have other information, please let me know.
With the 486SLC/DLC, one buys a 387 compatible coprocessor to add floating-point capabilities. It is recommended to get a Cyrix coprocessor for this purpose, since these are the fastest 387 compatible coprocessors available. Also, Cyrix sells kits consisting of a 486SLC/DLC and a coprocessor that have a favourable value for money ratio. The floating-point performance of a Cyrix 486DLC + Cyrix 83D87 combination is about 50% of that of an Intel 486DX running at the same frequency.
Here is another neat little blurb from that article:
The Cyrix 486SLC/DLC have a RISC-like execution unit with a flexible five stage pipeline, just as the 80486SX has. Unlike the Intel 80486, which has an 8 kB, 4-way associative cache on chip, the Cx486SLC/DLC only have a 1 kB, 2-way associative cache (the cache on the Cyrix chips can also be configured to be of the direct mapped type). The 486DLC provides up to 80% more integer performance than a 386DX at the same clock frequency, with the average performance gain being about 35%. With the 1 kB on-chip cache enabled, the 486DLC provides about 75% of the integer performance of a 486SX at the same clock frequency. With the cache disabled, the 486DLC provides about 65% of the integer performance of a 486SX. The lower performance of the Cyrix 486DLC as compared to the Intel 80486SX is mostly due to the slow 386DX bus interface the 486DLC uses, which is up to 2 times slower than the 486 bus interface. Some additional performance penalty is imposed by the smaller cache on the 486SLC/DLC, which provides significantly lower hit rates than the 8 kB cache of the 80486SX.
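As a rough sanity check, the article's percentages can be chained together. The sketch below is only back-of-the-envelope arithmetic on the quoted figures. The function and constant names are made up for illustration, and it assumes the 35% average gain over the 386DX was measured with the on-chip cache enabled, which the article does not state.

```python
# Figures quoted from the article, all at equal clock frequency.
DLC_VS_SX_CACHE_ON = 0.75   # 486DLC integer speed as a fraction of a 486SX, cache enabled
DLC_VS_SX_CACHE_OFF = 0.65  # same, cache disabled
DLC_OVER_386DX_AVG = 1.35   # 486DLC is ~35% faster than a 386DX on average
                            # (assumption: this figure is with the cache enabled)


def cache_gain(cache_on: float, cache_off: float) -> float:
    """Relative speedup from enabling the 1 kB cache (0.15 means +15%)."""
    return cache_on / cache_off - 1.0


def implied_386dx_vs_486sx(dlc_vs_sx: float, dlc_over_386dx: float) -> float:
    """386DX integer speed as a fraction of a 486SX, implied by the two ratios."""
    return dlc_vs_sx / dlc_over_386dx


if __name__ == "__main__":
    gain = cache_gain(DLC_VS_SX_CACHE_ON, DLC_VS_SX_CACHE_OFF)
    ratio = implied_386dx_vs_486sx(DLC_VS_SX_CACHE_ON, DLC_OVER_386DX_AVG)
    print(f"Gain from enabling the 1 kB cache: {gain:.1%}")
    print(f"Implied 386DX speed relative to a 486SX: {ratio:.1%}")
```

Taken at face value, the 1 kB cache is worth roughly 15% on its own, and a 386DX would land at a bit over half the integer speed of a 486SX at the same clock. These are derived from rounded figures, so treat them as ballpark only.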
The figure of up to 80% more integer performance than a 386DX seems well in line with the benchmarks I ran against a 386DX-40 (shown above), which showed an average integer performance benefit of 65%. Those are only synthetic benchmarks, so perhaps in real-life applications the average benefit is only 35%. I'd like to see these results from someone with a Ti486SXL, which has 8 KB of L1 cache. Unfortunately, my system didn't show any benefit when using a Ti486SXL with 8 KB of cache compared to the 1 KB of the 486DLC, which may be a limitation of my motherboard.
However, Vogons member Anonymous Coward has a Cyrix/Ti 486SXL setup. Attached is his Speedsys result, which is double that of my Ti 486DLC. I wonder which FPU he had installed? None?
Plan your life wisely, you'll be dead before you know it.