VOGONS


Intel RapidCad compatibility issues

First post, by MSxyz

Rank Member

I was able to buy a working Intel RapidCad processor (with its companion chip) and tested it in a few 386 motherboards, one of which had a BIOS dated 1993, the others 1991 and 1992.

While testing it with the usual benchmark programs, I noticed several issues with some of them.

Let's start with the motherboard(s): they all recognize it as a 386SX with an FPU. Now, I would understand if it was identified as a 386DX + 387, but it seems odd that a BIOS made for DX processors would recognize it as a 386SX. The test program NSSI also detects it as a 386SX but, at least, it correctly identifies the FPU as an 'Intel RapidCad'.

Other issues I found with specific benchmarks / software:

-Checkit hangs at math coprocessor detection
-Alex Savelev's PC Info hangs at the general information screen
-Quake v108 has some odd framerate fluctuations. Sometimes the demo hangs for a few seconds before resuming. Quake v106 doesn't have this problem and the RapidCad predictably outperforms a 486DLC 40 MHz with a Fastmath FPU.
-vspeed hangs

I found these issues with all the 386 motherboards I tested, so I tend to rule out a board-specific compatibility issue. The chip is even able to run at 40MHz. Since it gets quite hot, I placed an 80mm fan blowing air onto it to keep it cool, and I don't experience other issues aside from the ones mentioned.

It strikes me as odd that Intel would release a product with so many 'quirks' with specific programs. I also wonder what these programs have in common that causes the system to crash or, in the case of Quake, to behave erratically.

Do other people have similar experiences?

Last edited by MSxyz on 2024-07-10, 16:51. Edited 1 time in total.

Reply 1 of 41, by mkarcher

Rank l33t
MSxyz wrote on 2024-07-10, 15:06:

Let's start with the motherboard(s): they all recognize it as a 386SX with an FPU. Now, I would understand if it was identified as a 386DX + 387, but it seems odd that a BIOS made for DX processors would recognize it as a 386SX. The test program NSSI also detects it as a 386SX but, at least, it correctly identifies the FPU as an 'Intel RapidCad'.

This is likely an artifact caused by the common way to tell the 386DX from the 386SX. Usually, these tests do not test how many bits the CPU uses to interface with the mainboard, but rely on the fact that the 386DX has two coprocessor interface modes (a 16-bit mode for the 287, and a 32-bit mode for the 387), whereas the 386SX only has one coprocessor interface mode, targeting the 16-bit 387SX. The RapidCAD chip pair does not need to implement two different coprocessor interface modes, so there is no need for the bit that selects the mode (ET in CR0) to be writable. If that bit is not writable, the classic interpretation is 386SX.
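To make the detection logic concrete, here is a minimal sketch in Python. It is only a simulation: real detection code runs in assembly and touches CR0 directly. The `SimulatedCpu` class and `classify_386` function are hypothetical names for illustration, and the assumption that a non-writable ET bit simply keeps its old value is mine.

```python
CR0_ET = 1 << 4  # ET (Extension Type) is bit 4 of CR0


class SimulatedCpu:
    """Toy model that only tracks CR0 and whether ET can be changed."""

    def __init__(self, cr0, et_writable):
        self.cr0 = cr0
        self.et_writable = et_writable

    def read_cr0(self):
        return self.cr0

    def write_cr0(self, value):
        if self.et_writable:
            self.cr0 = value
        else:
            # Assumption: a hardwired ET bit silently keeps its old value.
            self.cr0 = (value & ~CR0_ET) | (self.cr0 & CR0_ET)


def classify_386(cpu):
    """Classic DX/SX test: try to flip ET and see whether the change sticks."""
    original = cpu.read_cr0()
    cpu.write_cr0(original ^ CR0_ET)
    et_changed = ((cpu.read_cr0() ^ original) & CR0_ET) != 0
    cpu.write_cr0(original)  # restore the previous state
    return "386DX" if et_changed else "386SX"


print(classify_386(SimulatedCpu(cr0=CR0_ET, et_writable=True)))
print(classify_386(SimulatedCpu(cr0=CR0_ET, et_writable=False)))
```

A chip with a fixed ET bit ends up in the "386SX" branch regardless of its bus width, which would explain what the BIOS and NSSI report for the RapidCAD.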

Reply 2 of 41, by MSxyz

Rank Member

Thanks for the explanation. A little off topic, if I remember correctly, the 386DX was made compatible with the 287 because the design of the 387 wasn't ready by the time it was released. Right?

As for the rest of the problems, what could be the explanation? I'm perplexed by Vspeed and Quake 1.08. The former is just a simple program that pumps data as fast as possible to the video card framebuffer and measures the transfer speed. As for Quake v108, it could be a problem related to compiler-specific optimizations. In theory the RapidCad has all extra 486 instructions disabled, but what happens if a program somehow detects it as a 486 and tries to use some 486-specific opcodes? I would presume an 'illegal instruction' exception is triggered. Did Intel really invest some time in reworking portions of the 486 circuitry, or did it simply deactivate some features, SX-style?

Reply 3 of 41, by mkarcher

Rank l33t
MSxyz wrote on 2024-07-10, 17:16:

A little off topic, if I remember correctly, the 386DX was made compatible with the 287 because the design of the 387 wasn't ready by the time it was released. Right?

That's correct.

MSxyz wrote on 2024-07-10, 17:16:

As for the rest of the problems, what could be the explanation? I'm perplexed by Vspeed and Quake 1.08.

One of the key performance features of the Quake engine is the ability to push data to the video card as fast as possible. Quake uses the FPU to copy data to the screen, because that proved very advantageous on typical Pentium chipsets, but even if the key point of the RapidCad is the FPU, we should take care to not focus too much on the FPU. The point I want to stress instead is that both Quake and Vspeed push the graphics card to the limit. You tried different motherboards, but did you try a different video card?

MSxyz wrote on 2024-07-10, 17:16:

what happens if a program somehow detects it as a 486 and tries to use some 486 specific opcodes? I would presume an 'illegal instruction' exception is triggered.

That's what I expect, too. Quake uses a well-behaved DOS extender, which will catch the exception and try to shut down the application. I like to use Quake as a memory stability test on overclocked 486 machines, and my experience is that the exception handling and clean exiting works fine in Quake. Strange behaviour is likely unrelated to exceptions.

Reply 4 of 41, by H3nrik V!

Rank Oldbie

What clock speed are you running? Isn't the RapidCAD only 25 MHz rated, or am I confusing it with the 487?

If it's dual it's kind of cool ... 😎

--- GA586DX --- P2B-DS --- BP6 ---

Please use the "quote" option if asking questions to what I write - it will really up the chances of me noticing 😀

Reply 5 of 41, by MSxyz

Rank Member
H3nrik V! wrote on 2024-07-10, 21:43:

What clock speed are you running? Isn't the RapidCAD only 25 MHz rated, or am I confusing it with the 487?

I'm running it at 33MHz and 40MHz OC. Today I've tried running it at 25MHz but nothing changed.

Chip is stable. I left Quake running demos in a loop for an hour at 40MHz and it never crashed. With the fan blowing fresh air onto the chip its surface is barely warm to the touch.

mkarcher wrote on 2024-07-10, 21:04:

You tried different motherboards, but did you try a different video card?

Good suggestion. I'm always using a Genoa 7900B for my tests because it's the fastest ISA video card in my inventory and likely one of the fastest ever manufactured. I also have other ISA video cards, I will try with them.

Reply 6 of 41, by kixs

Rank l33t

I had no issues with my RapidCAD. Try a different motherboard, if you can, of course.


Reply 7 of 41, by mkarcher

Rank l33t
MSxyz wrote on 2024-07-11, 05:33:

Good suggestion. I'm always using a Genoa 7900B for my tests because it's the fastest ISA video card in my inventory and likely one of the fastest ever manufactured. I also have other ISA video cards, I will try with them.

I've looked at a photo of that card. You can try removing JP2, which disables 0WS operation. This will obviously slow the card down severely, but it might well solve the compatibility issues. A childhood friend had a Cyrix 486DLC system (so a 386 system with a 486-class processor core, just like your system) that worked quite well, but crashed with certain software. He also had a 0WS ET4000 card in that system. As I understand it, ET4000 cards typically just place the output of the address decoder in the ET4000 chip on the 0WS line, activating it on every cycle hitting the card, no matter whether the card is ready to handle the cycle in time. If the ET4000 can't handle a cycle fast enough, it will use the IOCHRDY signal to extend the cycle. This can lead to the confusing situation that both the 0WS signal, trying to shorten a cycle, and the IOCHRDY signal, trying to extend a cycle, are active at the same time. If a chipset behaves according to the Intel ISA specification (footnote) I have at hand, it would ignore IOCHRDY and have 0WS take precedence, which is obviously not what is required for this type of ET4000 card.

footnote: See page 41 of "ISA Bus Specification and Application Notes Rev. 2.01". It clearly says: "The enabling of SRDY* signal line does not require IOCHRDY to be enabled, in fact it is ignored by the bus owner.". Note that this specification calls the /0WS aka /NOWS signal SRDY* instead, and calls the high state of IOCHRDY "active", which makes sense according to the name, as high means "ready". As this is the default state of the signal (it has a pull-up), most other sources interpret that line as if it were defined as "/IOWAIT" and call the state where it is driven low to request extra wait states "active".
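The conflict between the two signals can be sketched as a toy model in Python. This is not a bus simulation: the two policy names, the `granted_clocks` function, and the 2-clock length of a 0WS cycle are assumptions chosen for illustration only.

```python
ZERO_WS_CLOCKS = 2  # assumed length of a zero-wait-state cycle, in bus clocks


def granted_clocks(needed_clocks, policy):
    """Length of one cycle to a card that asserts 0WS on every access.

    needed_clocks: how long the card really needs for this access.
    policy "spec":       0WS takes precedence, IOCHRDY is ignored.
    policy "ready_wins": IOCHRDY may still stretch a 0WS cycle.
    """
    card_requests_wait = needed_clocks > ZERO_WS_CLOCKS  # card pulls IOCHRDY low
    if policy == "spec":
        return ZERO_WS_CLOCKS
    if policy == "ready_wins":
        return needed_clocks if card_requests_wait else ZERO_WS_CLOCKS
    raise ValueError(f"unknown policy: {policy}")


for policy in ("spec", "ready_wins"):
    clocks = granted_clocks(needed_clocks=5, policy=policy)
    status = "ok" if clocks >= 5 else "cycle ends before the card is ready"
    print(f"{policy}: {clocks} clocks, {status}")
```

Under the "spec" policy the slow access is cut short, which is the failure mode suspected here; removing JP2 takes 0WS out of the picture entirely.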

Reply 8 of 41, by Jo22

Rank l33t++

"Intel ISA Specification"?
Intel? Where does this obedience to authority come from?
Isn't it a bit presumptuous to believe that Intel is always the reference when it comes to x86?

To my knowledge, both EISA and ISA had been specified by The Gang of Nine, which Intel wasn't part of? 🤷‍♂️

https://en.wikipedia.org/wiki/Extended_Indust … he_Gang_of_Nine

Edit: No offense, though. If there's one company that matters, though, wouldn't it be rather IBM? 🤔
IBM's documents about the PC/AT bus might be relevant, too.

Edit: The only relevance of intel I can think of is ISA Plug&Play, maybe.
In the 90s, intel had put its fingers into about anything.

Edit: Never mind. I suppose there was a misunderstanding.
Maybe I should have read it as "Intel's take on ISA specification" or something along these lines. My bad.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 9 of 41, by MSxyz

Rank Member

Changing the video card didn't solve the issue. My 7900B never gave me any issue with other CPUs and it happily runs even with ISA bus speeds of 16MHz at 0WS. I can't try this CPU on a different MB at the moment, but at some point in the future I will definitely do it. Thanks for the help.

Reply 10 of 41, by mkarcher

Rank l33t
Jo22 wrote on 2024-07-11, 07:17:

"Intel ISA Specification"?
Intel? Where does this obedience to authority come from?
Isn't it a bit presumptuous to believe that Intel is always the reference when it comes to x86?

To my knowledge, both EISA and ISA had been specified by The Gang of Nine, which Intel wasn't part of? 🤷‍♂️

To be exact, I am talking about this document: http://www.bitsavers.org/pdf/intel/_busSpec/I … c2.01_Sep89.pdf . As this is already named "version 2" (the .01 is likely just minor changes), the Intel ISA Specification likely predates the Gang-of-Nine efforts. The success of x86-based computers was clearly helping Intel sell CPUs, and in the late 80s, most of them were ISA based. Having more companies build ISA cards that work reliably will increase PC clone sales and Intel's processor revenue, so it makes some sense for Intel to sit down and clearly document how a card needs to behave in an AT clone system. I am not aware that the Gang of Nine ever produced a dedicated ISA specification, but it is true that the subset of the EISA specification that deals with ISA cycles is deemed one of the most official specifications of the ISA bus.

Jo22 wrote on 2024-07-11, 07:17:

Edit: No offense, though. If there's one company that matters, though, wouldn't it be rather IBM? 🤔

At that time, IBM was trying to convince everyone that ISA is outdated and you should buy expensive licenses to use the MCA bus instead. I don't think IBM ever documented ISA to the level an actual bus specification should do, i.e. including all the setup and hold time requirements.

Jo22 wrote on 2024-07-11, 07:17:

Maybe I should have read it as "Intel's take on ISA specification" or something along these lines. My bad.

I just quoted the title of that document. How you interpret the fact that Intel published a document by this name, and how much authority you assign to it, is up to you, of course. On the other hand, this document is likely to be the most elaborate ISA design resource early 386 chipset manufacturers had at hand. That's why I quoted that document in a thread about 386 mainboards.

Reply 11 of 41, by jakethompson1

Rank Oldbie
mkarcher wrote on 2024-07-11, 18:56:

the Intel ISA Specification likely predates the Gang-of-Nine efforts.

I thought the name ISA was itself an outcome of the Gang of Nine?

Reply 12 of 41, by mkarcher

Rank l33t
jakethompson1 wrote on 2024-07-11, 18:59:
mkarcher wrote on 2024-07-11, 18:56:

the Intel ISA Specification likely predates the Gang-of-Nine efforts.

I thought the name ISA was itself an outcome of the Gang of Nine?

Seems you are right. According to https://en.wikipedia.org/wiki/Industry_Standa … tecture#History , EISA was already proposed in 1988, which is earlier than the publication date of the "Intel ISA Specification 2.01". Wikipedia further claims that the name ISA was coined by Compaq, a member of the Gang of Nine, to avoid the need of calling it AT bus, as AT was an IBM trademark. Trying to find the time the name ISA was introduced, I looked at The COMPAQ DeskPro 386 technical reference guide, dated September 1986, which just uses the term "8/16 bit expansion bus". That manual contains a quite extensive specification of the bus timing; the same is true for The COMPAQ DeskPro 386/20 technical reference guide dated October 1987 and the DeskPro 386/25 guide dated August 1988. So it indeed looks like the name ISA was created retroactively when EISA was specified.

Reply 13 of 41, by rasz_pl

Rank l33t
mkarcher wrote on 2024-07-10, 21:04:

One of the key performance features of the Quake engine is the ability to push data to the video card as fast as possible. Quake uses the FPU to copy data to the screen,

Actual data copying from internal framebuffer to VGA memory is done using ordinary memset https://github.com/id-Software/Quake/blob/bf4 … _svgalib.c#L735

Quake uses the FPU to perspective correct every 16 pixels, hardcoded for the Pentium FPU implementation (~30 integer cycles between FPU divides). Then you have all the code exploiting the Pentium FPU's 8-stage pipeline with zero-cycle-cost fxch; see "Re: FastDoom. A new Doom port for DOS, optimized to be as fast as possible for 386/486 personal computers!"
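For readers unfamiliar with the trick, here is a rough Python sketch of perspective correction every 16 pixels: one exact divide per 16-pixel run, with linear interpolation in between. It is not Quake's code; the function names, the single texture coordinate, and the sample gradients are all made up for illustration.

```python
def span_exact(u_over_z0, inv_z0, du_over_z, d_inv_z, width):
    """Reference version: one divide per pixel."""
    return [(u_over_z0 + x * du_over_z) / (inv_z0 + x * d_inv_z)
            for x in range(width)]


def span_subdivided(u_over_z0, inv_z0, du_over_z, d_inv_z, width, step=16):
    """One divide per `step` pixels, linear interpolation in between."""
    out = []
    x = 0
    u_left = u_over_z0 / inv_z0  # exact value at the start of the span
    while x < width:
        n = min(step, width - x)
        x_right = x + n
        # The only divide for this run: exact value at its right end.
        u_right = (u_over_z0 + x_right * du_over_z) / (inv_z0 + x_right * d_inv_z)
        for i in range(n):
            out.append(u_left + (u_right - u_left) * i / n)
        u_left = u_right
        x = x_right
    return out


exact = span_exact(0.0, 1.0, 1.0, 0.01, 40)
approx = span_subdivided(0.0, 1.0, 1.0, 0.01, 40)
worst = max(abs(a - e) for a, e in zip(approx, exact))
print(f"worst error over the span: {worst:.3f} texels")
```

The inner loop needs no divide at all, so the integer unit can keep drawing pixels while the FPU works on the next divide. That overlap is what the Pentium tuning relies on.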

https://github.com/raszpl/FIC-486-GAC-2-Cache-Module for AT&T Globalyst
https://github.com/raszpl/386RC-16 memory board
https://github.com/raszpl/440BX Reference Design adapted to Kicad
https://github.com/raszpl/Zenith_ZBIOS MFM-300 Monitor

Reply 14 of 41, by mkarcher

Rank l33t
rasz_pl wrote on 2024-07-13, 02:29:
mkarcher wrote on 2024-07-10, 21:04:

One of the key performance features of the Quake engine is the ability to push data to the video card as fast as possible. Quake uses the FPU to copy data to the screen,

Actual data copying from internal framebuffer to VGA memory is done using ordinary memset https://github.com/id-Software/Quake/blob/bf4 … _svgalib.c#L735

Wow, so either I believed in misinformation the last 30 years, or this source code does not match the original Quake for DOS executable, or the FPU usage hides inside the implementation of memcpy. (You did not mean memset)

Independent of whether Quake did use FPU-based memcpy, using the FPU at that point makes a lot of sense, as a 64-bit floating point store is the only way to make the Pentium processor do a 64-bit write into uncached memory (video memory is uncached). On a 64-bit write, the chipset will do a 2-DWord write burst into the VGA framebuffer memory. While the PCI bus can handle 32 bits per clock cycle during a burst, the setup of a PCI cycle is quite slow, so for optimal performance, bursts are essential. The Pentium Pro mitigated this issue by introducing a way to mark certain memory areas as "uncacheable, yet write combining is allowed", so that the processor may merge two consecutive 32-bit writes into one 64-bit write before it is sent over the FSB. This is what FASTVID is doing.
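A back-of-the-envelope cost model shows why the burst matters. The setup cost of 3 clocks per bus transaction below is an invented figure purely for illustration, as is the `bus_clocks` helper; only the "one DWord per clock during a burst" part comes from the explanation above.

```python
def bus_clocks(total_bytes, bytes_per_write, setup_clocks, clocks_per_dword=1):
    """Clocks needed to push total_bytes using writes of a fixed size."""
    transactions = -(-total_bytes // bytes_per_write)  # ceiling division
    dwords_per_write = bytes_per_write // 4
    return transactions * (setup_clocks + dwords_per_write * clocks_per_dword)


FRAME_BYTES = 320 * 200  # one 8-bit 320x200 frame
SETUP = 3                # assumed per-transaction overhead, not a measured value

dword_writes = bus_clocks(FRAME_BYTES, 4, SETUP)
qword_writes = bus_clocks(FRAME_BYTES, 8, SETUP)
print(f"32-bit writes: {dword_writes} clocks")
print(f"64-bit writes: {qword_writes} clocks")
print(f"ratio: {dword_writes / qword_writes:.2f}")
```

The higher the real setup cost is relative to the data phase, the closer the gain from 64-bit writes gets to a factor of two.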

I'm gonna take a peek into a Quake for DOS executable to verify whether that Quake edition uses FPU memcopy. More specifically, I will try to find version 1.08 which is troublesome on the RapidCad system mentioned in the original post.

rasz_pl wrote on 2024-07-13, 02:29:

Quake uses the FPU to perspective correct every 16 pixels, hardcoded for the Pentium FPU implementation (~30 integer cycles between FPU divides). Then you have all the code exploiting the Pentium FPU's 8-stage pipeline with zero-cycle-cost fxch; see "Re: FastDoom. A new Doom port for DOS, optimized to be as fast as possible for 386/486 personal computers!"

This is clearly true, but it does not prove whether the FPU is also used for memcpy or not.

Reply 15 of 41, by rasz_pl

Rank l33t
mkarcher wrote on 2024-07-13, 07:26:

Wow, so either I believed in misinformation the last 30 years, or this source code does not match the original Quake for DOS executable, or the FPU usage hides inside the implementation of memcpy.

It's even weirder. They use standard memcpy for around half the stuff, while the other half randomly uses a hand-rolled quad copy https://github.com/id-Software/Quake/blob/bf4 … e/common.c#L154
then there is also never used MGL_memcpy from scitech https://github.com/id-Software/Quake/blob/bf4 … /mgraph.h#L1581
and movedata just for network handling.

and there is also this hand rolled assembly https://github.com/id-Software/Quake/blob/bf4 … ke/d_copy.s#L27
movl for planar, rep/movsl linear

but in general, by the time the Pentium ~120 rolled around, copying the framebuffer wasn't a major bottleneck for the game; the really heavy stuff was geometry and texturing.


Reply 16 of 41, by MSxyz

Rank Member

Just a quick update.

I've tested the same chip on a different motherboard, a Panda 386V. I'm having the same exact issues as before with those three programs: Vspeed and PCInfo hang, Quake 1.08 stutters badly.

For context, the original motherboard is a Caching Tech BC3486UL based on the UMC 480/481 chipset. As VGA I'm using a Genoa 7900B (Tseng ET4000) ISA. I also tried a WD90C30 and a Trident 9000B (also ISA cards), but the issues remained.

The Panda 386V is a VLB motherboard for 386. It has an ALi 1429 chipset. I've tried both with a Tseng ET4000/W32p video card and with the same ISA cards as above.

Maybe I have a defective chip?

Reply 18 of 41, by MSxyz

Rank Member
rasz_pl wrote on 2024-07-13, 14:15:

Maybe you just expect too much out of slow 486 in quake?

I use it as a benchmark to put the technical progress in perspective.

With Quake 1.06, a 40MHz Am386DX + Cyrix 83D87 is capable of 2.0 frames per second on average. Swap the 386 with a Cyrix 486DLC and the framerate increases to 2.5. With the Intel RapidCAD, also running at 40 MHz, the framerate increases to 3.1, since having the FPU integral to the CPU saves a lot of cycles. The CPU itself, however, is only a little faster than a 386DX in other benchmarks.
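Put as relative speedups, using only the three framerates quoted above (the dictionary labels and the `relative_speed` helper are just names for this snippet):

```python
def relative_speed(fps_by_system, baseline):
    """Express each framerate as a multiple of the baseline system's."""
    base = fps_by_system[baseline]
    return {name: fps / base for name, fps in fps_by_system.items()}


# Quake 1.06 averages at 40 MHz, as reported in this post.
QUAKE_106_FPS = {
    "Am386DX + Cyrix 83D87": 2.0,
    "Cyrix 486DLC": 2.5,  # FPU assumed unchanged from the first setup
    "Intel RapidCAD": 3.1,
}

speedups = relative_speed(QUAKE_106_FPS, "Am386DX + Cyrix 83D87")
for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

So the RapidCAD comes out roughly 55% ahead of the Am386DX pair and about 24% ahead of the 486DLC in this one test.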

To me this stuff is fascinating...

Reply 19 of 41, by mkarcher

Rank l33t
rasz_pl wrote on 2024-07-13, 09:07:
mkarcher wrote on 2024-07-13, 07:26:

Wow, so either I believed in misinformation the last 30 years, or this source code does not match the original Quake for DOS executable, or the FPU usage hides inside the implementation of memcpy.

It's even weirder. They use standard memcpy for around half the stuff, while the other half randomly uses a hand-rolled quad copy https://github.com/id-Software/Quake/blob/bf4 … e/common.c#L154

Q_memcpy at that line is inside an #if 0 block, so it is not compiled.

rasz_pl wrote on 2024-07-13, 09:07:

and there is also this hand rolled assembly https://github.com/id-Software/Quake/blob/bf4 … ke/d_copy.s#L27
movl for planar, rep/movsl linear

but in general, by the time the Pentium ~120 rolled around, copying the framebuffer wasn't a major bottleneck for the game; the really heavy stuff was geometry and texturing.

I can confirm that the functions from d_copy.s are used by Quake 1.08 for DOS, and they are identical or very similar to what is found in the repo. The VID stuff in the original DOS version differs from the svgalib port, but it is quite similar in many ways. I could not find any traces of FPU memcpy in Quake 1.08 for DOS, especially not for VID_Update, so the use of FPU memcpy in Quake is likely a myth.