VOGONS


Reply 60 of 72, by aquishix

Rank: Member
Scali wrote:
aquishix wrote:

I would if I could, but I don't have a separate FPU installed (yet -- I've ordered a Cyrix FasMath one, but it hasn't arrived). This is just the integrated FPU circuitry of the Am386DX-40.

The Am386DX-40 doesn't have a built-in FPU.
The Am386DX is basically just a 'carbon copy' of the Intel 386DX, and the 40 MHz version is just a 33 MHz one with slightly more modern manufacturing, which allows the higher clock speed (some late Intel 33 MHz ones could also run at 40 MHz, if you cherry-pick them).
As such, there is no internal FPU.

So if your machine insists that it has an FPU, that is strange... And if performance is very low, chances are it is not a physical CPU, but some software emulation?

I knew that the Am386DX-40 didn't have a built-in floating point co-processor. Up until now, I thought that the terminology was consistent with, say, ALUs:

https://en.wikipedia.org/wiki/Arithmetic_logic_unit

And since an ALU is always integrated into the circuitry of the CPU, I thought that some kind of FPU always was as well. From what you just said and what I found on the web, the industry uses the term "FPU" to mean "co-processor", because x86 series processors without co-processors (prior to the 486DX) also didn't have any internal floating point circuitry of any kind. I thought that floating point opcodes were implemented in the base x86 models, but just inefficiently. I thought there was an analogy with, say, an i7 CPU with a (crappy) integrated GPU, as distinct from a (powerful) external GPU. I see now that it's more like the old days when "software rendering" was an option in 3D games -- i.e., the CPU was doing, via specialized software, all the calculations that a GPU now handles in the stream processors / CUDA "cores".

So in other words, the terms "FPU" and "floating point co-processor" mean the same thing. That clears that up. Thanks!

In fact, that makes me think that the floating point performance has nothing to do with my overall system performance problems, since if Warcraft I can run at all in a system with no FPU installed, it cannot even be trying to use those FPU opcodes.

Clear something else up for me -- when a game or other program is compiled to use an FPU, will it simply crash if an FPU is not present? If that's the case, how do programmers leverage the FPU's abilities when it's not known in advance that an FPU is present? Seems like they'd have to compile it at least twice -- once for systems without an FPU, and once for systems with an FPU...and then run whichever executable makes sense. Or use dynamically linked libraries...or simply only call special floating point functions based on a boolean value that's set when the FPU is detected at the time the program is loaded.
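
For what it's worth, the last option (a flag set once at load time that decides which routines get called) might look roughly like the sketch below. It is Python purely for readability, and every name in it (detect_fpu, mul_fpu, mul_soft, select_mul) is made up for illustration. A real DOS program would probe the hardware instead of hard-coding the answer, and the fallback here is a simple 16.16 fixed-point multiply, not a full floating point emulator.

```python
# Hypothetical sketch: choose the floating point routines once, at load time.
# Nothing here talks to real hardware; detect_fpu() is a placeholder.

FRAC_BITS = 16  # 16.16 fixed point for the software fallback


def detect_fpu():
    """Placeholder for a hardware probe; hard-coded to 'no FPU' here."""
    return False


def mul_fpu(a, b):
    """Path used when an FPU is present: a plain float multiply."""
    return a * b


def mul_soft(a, b):
    """Fallback path: the multiply itself is done on integers only.

    The conversion to and from fixed point uses floats just to keep the
    demo short; a real no-FPU build would keep values in fixed point.
    """
    fa = int(round(a * (1 << FRAC_BITS)))
    fb = int(round(b * (1 << FRAC_BITS)))
    product = (fa * fb) >> FRAC_BITS
    return product / (1 << FRAC_BITS)


def select_mul(has_fpu):
    """Return the multiply routine matching the detected hardware."""
    return mul_fpu if has_fpu else mul_soft


# Done once when the program starts; the rest of the code only calls multiply().
HAS_FPU = detect_fpu()
multiply = select_mul(HAS_FPU)
```

The point of the design is that the check happens once, and the rest of the program never needs to know which path it got.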

Reply 61 of 72, by konc

Rank: l33t

@Scali: NSSI reports "internal emulation", so it's clear that the system doesn't think there's an FPU present

@aquishix: First of all I'd like to make clear that all is good, I'm not trying to attack you or anything, hope it wasn't received that way. Genuinely trying to help. That said though, your FPU result is what it is because... you don't have one, and that is perfectly normal. Now check out how this little setting can cripple the performance:

Cache read at 3-1-1-1

[Attachment: 3111.jpg, 280.63 KiB]

Cache read at 2-2-2-2 (only change)

[Attachment: 2222.jpg, 274.22 KiB]

As for Warcraft, I don't have the means to properly record gameplay. Would a shaky mobile phone show what you're looking for? And if yes, what should I do, just move around? I'm not sure I know how to do anything else in these games.
I propose that you find something else, like some DOS benchmark with graphics that gives you terrible results, and I'll gladly run it as well and provide the numbers with both settings.

Last edited by konc on 2018-08-30, 15:36. Edited 1 time in total.

Reply 62 of 72, by aquishix

Rank: Member
konc wrote:

@Scali: NSSI reports "internal emulation" and from the results it's clear that the system doesn't think there's an FPU present

@aquishix: First of all I'd like to make clear that all is good, I'm not trying to attack you or anything, hope it wasn't received that way. Genuinely trying to help. That said though, your FPU results are what they are because... you don't have one, and they are perfectly normal. Check out how this little setting can cripple the performance:

Cache read at 3-1-1-1

3111.jpg

Cache read at 2-2-2-2 (only change)

2222.jpg

As for Warcraft, I don't have the means to properly record gameplay. Would a shaky mobile phone show what you're looking for?
I propose that you find something else, like some DOS benchmark with graphics that gives you terrible results, and I'll gladly run it as well and provide the numbers with both settings.

Cool.

But -- WTH? Look at the images you just posted: Why would NSSI specifically list the Am386DX-40 in that list of reference benchmarks -- and at a decent level of performance -- and then show that your Am386DX-40 severely underperforms compared to that? That makes absolutely no sense to me. For a minute I thought it might be a typo and that NSSI meant to say "Am387DX-40", but a quick Google search reveals that no such FPU ever existed.

This is genuinely confusing.

Anyway, I'm pretty sure my system can't handle 2-2-2-2 timings, and given that...well...time to get a different motherboard. I've already got two of them on the way: one without VLB slots, and one with two VLB slots. I'll try both to see how the performance compares.

Re: Warcraft I footage on a smartphone camera -- yes, that would be plenty good enough. All you'd have to do is instruct a unit to move across the screen and show it happening so it could be timed. Believe me, the difference between what I'd see on your system (with the two different BIOS settings) and what I'm seeing on my own system would be COMPLETELY obvious. I might just post a little YouTube video of what I'm witnessing so no one here thinks I'm crazy. You won't believe how bad it is. 😒

Reply 63 of 72, by konc

Rank: l33t
aquishix wrote:

Re: Warcraft I footage on a smartphone camera -- yes, that would be plenty good enough. All you'd have to do is instruct a unit to move across the screen and show it happening so it could be timed. Believe me, the difference between what I'd see on your system (with the two different BIOS settings) and what I'm seeing on my own system would be COMPLETELY obvious. I might just post a little YouTube video of what I'm witnessing so no one here thinks I'm crazy. You won't believe how bad it is. 😒

Yes, that would be interesting (with the TSENG). For me there is a difference, but it's really small. It's by no means unplayable; you can only see the difference if you are looking for it.

btw, idea, if/when you find a m/b that can handle lower than 3-x-x-x cache read timing (most of them can, I'm really surprised that yours can't), switch the chips to conclude if this m/b is garbage and it's not the chips.

Reply 64 of 72, by aquishix

Rank: Member
konc wrote:

Yes, that would be interesting (with the TSENG). For me there is a difference, but it's really small. It's by no means unplayable; you can only see the difference if you are looking for it.

Exactly; this is what I strongly suspected would be the case. =)

konc wrote:

btw, idea, if/when you find a m/b that can handle lower than 3-x-x-x cache read timing (most of them can, I'm really surprised that yours can't), switch the chips to conclude if this m/b is garbage and it's not the chips.

Just to help clear up the mystery, I'll do this for you. Actually, I've already ordered an IC puller tool from Amazon because pulling those small cache RAM chips is really, really frustrating using a pocket knife or screwdriver. I almost broke off one of the poor pins doing that. (It's especially bad with a lot of these old, crusty and/or corroded pins.)

Common story, I'm sure.

Never again! There's no substitute for having the right tool(s) for the job.

Reply 65 of 72, by konc

Rank: l33t

Here are the videos (kept the size really low since you're only interested in how fast that dude is moving):

[Attachment: warvideos.zip, 2.3 MiB]

Reply 66 of 72, by aquishix

Rank: Member
konc wrote:

Here are the videos (kept the size really low since you're only interested in how fast that dude is moving):

warvideos.zip

Thanks!

Which speed setting did you have the game set to when you recorded these videos? Please tell me it was "Normal"...

In any case, I want to see what "Fastest" looks like on your system.

Reply 67 of 72, by konc

Rank: l33t

Didn't change anything at all, just ran the game and moved a guy, so I guess it's "normal"? I'll give it a quick try and edit this post to let you know how it looks on fastest

Edit: well no, I just saw that the default value is "fastest", this is what you saw.
So what are your thoughts about it?

Reply 68 of 72, by Scali

Rank: l33t
aquishix wrote:

And since an ALU is always integrated into the circuitry of the CPU, I thought that some kind of FPU always was as well. From what you just said and what I found on the web, the industry uses the term "FPU" to mean "co-processor", because x86 series processors without co-processors (prior to the 486DX) also didn't have any internal floating point circuitry of any kind. I thought that floating point opcodes were implemented in the base x86 models, but just inefficiently. I thought there was an analogy with, say, an i7 CPU with a (crappy) integrated GPU, as distinct from a (powerful) external GPU. I see now that it's more like the old days when "software rendering" was an option in 3D games -- i.e., the CPU was doing, via specialized software, all the calculations that a GPU now handles in the stream processors / CUDA "cores".

Yea, the terminology is not always clear. That is, FPU stands for Floating-Point Unit.
ALU is just 'arithmetic', and technically an FPU is also built around ALUs, but not all ALUs are floating-point capable.
As for co-processor, similarly, the FPU was a co-processor in the early days of x86, and many people used the terms interchangeably, but not all co-processors are necessarily FPUs.
As a counter-example, the Commodore Amiga had various processing units in its custom chipsets, with one of them affectionately known as the 'copper', which was taken from 'co-processor'. However, this copper has nothing to do with floating point numbers at all, and in fact cannot perform any arithmetic at all. It is more of a 'display list' processor, and its main task is to write values to chipset registers, synchronized with the screen position.

Anyway, in the world of x86 processors, FPU is equivalent to a unit that can process x87 floating point instructions (modern x86 CPUs also have other ways of processing floating point, such as SSE and 3DNow!, but this is not generally referred to as 'FPU').

aquishix wrote:

Clear something else up for me -- when a game or other program is compiled to use an FPU, will it simply crash if an FPU is not present?

That depends a bit on which CPU exactly. Each CPU has its own interface with the FPU. An 8086/8088 can simply hang when an fwait instruction is executed. In some cases, they may silently ignore FPU instructions. That might not actually lead to a crash, but it will give random results for FPU calculations.
I believe a 386 will generate an invalid instruction exception by default, so the program will indeed crash as soon as an FPU instruction is executed. There are FPU emulation TSRs which hook into this, and emulate the instructions in software.

aquishix wrote:

If that's the case, how do programmers leverage the FPU's abilities when it's not known in advance that an FPU is present? Seems like they'd have to compile it at least twice -- once for systems without an FPU, and once for systems with an FPU...and then run whichever executable makes sense. Or use dynamically linked libraries...or simply only call special floating point functions based on a boolean value that's set when the FPU is detected at the time the program is loaded.

That is a long and complicated subject... Indeed, for CPUs prior to the 386, it was not possible to emulate instructions in software transparently.
There are various approaches (just play around with OpenWatcom for example, there are no less than 3 ways to compile floating point code). A popular one was to prepend every FPU instruction with an interrupt call. The compiler's runtime library would contain all the FPU emulation routines. Upon startup, the runtime would also detect an FPU.
When an interrupt was hit, the interrupt handler would either execute the following instruction in software (no FPU present), or it would patch out the interrupt instruction with a nop and return to the code (FPU present). So after the code had been run once, all the interrupts were removed, and the code was basically 'native' FPU code, save for a few nops.
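
To make the patching idea a bit more concrete, here is a toy model of it in Python. It is only a sketch: the "opcodes", the program layout and the handler are invented for illustration and do not correspond to any real compiler runtime or to actual x86 encodings. The part it does mirror is the control flow described above: with no FPU, the handler emulates the instruction that follows the interrupt and skips over it; with an FPU, the handler overwrites the interrupt with a nop so the next pass runs the instruction directly.

```python
# Toy model of the "interrupt before every FPU instruction" scheme.
# All opcode names and the program format are hypothetical.

INT_FP, NOP, FADD, FMUL, HALT = "INT_FP", "NOP", "FADD", "FMUL", "HALT"


def fp_op(op, acc, operand):
    """Arithmetic shared by the 'hardware' path and the emulation path."""
    return acc + operand if op == FADD else acc * operand


def run(program, has_fpu, acc=1.0):
    """Run `program` in place; return (result, interrupts taken)."""
    pc = 0
    interrupts = 0
    while program[pc][0] != HALT:
        op, operand = program[pc]
        if op == INT_FP:
            interrupts += 1
            if has_fpu:
                # FPU present: patch the interrupt out with a nop and
                # return, so the real instruction executes next.
                program[pc] = (NOP, None)
                pc += 1
            else:
                # No FPU: emulate the following instruction in software,
                # then resume after it.
                next_op, next_operand = program[pc + 1]
                acc = fp_op(next_op, acc, next_operand)
                pc += 2
        elif op == NOP:
            pc += 1
        else:
            # 'Native' FPU instruction executed directly.
            acc = fp_op(op, acc, operand)
            pc += 1
    return acc, interrupts


def make_program():
    """(1.0 + 2.0) * 4.0, with an interrupt in front of each FPU op."""
    return [(INT_FP, None), (FADD, 2.0),
            (INT_FP, None), (FMUL, 4.0),
            (HALT, None)]
```

Running the same program twice with has_fpu=True takes two interrupts on the first pass and none on the second, because the first pass patched them out. With has_fpu=False every pass takes both interrupts and the program is left untouched.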

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 69 of 72, by aquishix

Rank: Member
konc wrote:

Didn't change anything at all, just ran the game and moved a guy, so I guess it's "normal"? I'll give it a quick try and edit this post to let you know how it looks on fastest

Edit: well no, I just saw that the default value is "fastest", this is what you saw.
So what are your thoughts about it?

Well, what you showed me is definitely faster than what my 386 is doing, but I would still regard it as unacceptably slow. If you play Warcraft I on a 486DX2-66 (let alone something as monstrous as a Pentium-II), it's way, WAY faster than that. But if memory serves, Warcraft I is still speed-limited on the "fastest" setting. It's not like King's Quest I-IV, for instance, where if you run the game on an arbitrarily fast system, the character animations and other events in the game will happen arbitrarily fast. There's a cap. And you're nowhere near that cap, contrary to what I expected.

I really should be blogging about this junk. Other people are going to face the same issues as they dig through their memories and closets to play these old games on authentic hardware.

Reply 70 of 72, by amadeus777999

Rank: Oldbie

Warcraft is most likely synchronised to a certain speed besides the water/tile animation and some other things one can miss when not looking closely. It would be funny if the game speed settings were sensitive to CPU speed.

Reply 71 of 72, by aquishix

Rank: Member
konc wrote:

@Scali: NSSI reports "internal emulation", it's clear that the system doesn't think there's an FPU present

@aquishix: First of all I'd like to make clear that all is good, I'm not trying to attack you or anything, hope it wasn't received that way. Genuinely trying to help. That said though, your FPU result is what it is because... you don't have one, and that is perfectly normal. Now check out how this little setting can cripple the performance:

Cache read at 3-1-1-1

The attachment 3111.jpg is no longer available

Cache read at 2-2-2-2 (only change)

The attachment 2222.jpg is no longer available

As for Warcraft, I don't have the means to properly record gameplay. Would a shaky mobile phone show what you're looking for? And if yes, what should I do, just move around? I'm not sure I know how to do anything else in these games.
I propose that you find something else, like some DOS benchmark with graphics that gives you terrible results, and I'll gladly run it as well and provide the numbers with both settings.

Here's what I see, post-POST, when I set the timing to 2-2-2-2.

[Attachment: 2-2-2-2.jpg, 170.05 KiB]

Reply 72 of 72, by mcfly

Rank: Newbie

Have any of the owners of this board actually tried to attach a coprocessor to it? I ordered an IIT C87-40MHz and when I mounted it on the board, the system started and shut down immediately. According to the board's specs it supports an NPU; this https://stason.org/TULARC/pc/motherboards/I/I … OPTI-495SX.html says 80387? Will an Intel 16-33 FPU work on this? I see a 14MHz quartz nearby, or is there any secret jumper that should be changed to make it work? I don't see an option in the BIOS to enable/disable it, so I assume it does everything automatically. Two BIOSes exist for the board, 1.2 and 2.0, and I tried both of them with the same results. Anyone got lucky with this?