VOGONS


Quake without FPU

Reply 80 of 118, by kixs

Rank l33t

I think I've already tried it with SX2 but it didn't work... will test again to be certain.

---

Yes. It just freezes Quake:

mVvVCfvm.jpg

Requests are also possible... /msg kixs

Reply 81 of 118, by georgel

Rank Member

Thanks for your tests. To simplify them, your run code is 522-340-416. Very odd. This must be some kind of bug. If I understood your tests correctly, on processors that have an FPU, Q87 runs as fast as the physical coprocessor, and it halts Quake when there is no physical FPU. If this is the case and only Q87X works, you should definitely try an earlier version of Q87...

Last edited by georgel on 2019-09-27, 11:12. Edited 1 time in total.

Reply 82 of 118, by Scali

Rank l33t
georgel wrote:

I am not a game expert but as far as I know Doom and Quake use similar engines and have common roots. If Doom needs only integer CPU for 20 FPS then this statement is false:

mpe wrote:

So Quake needs the FPU for a reason 😀

DOOM and Quake aren't all that similar.
Quake is a full polygonal 3D engine with 'conventional' texturemapping, with 6 degrees of freedom.
DOOM is somewhere between Wolf3D and Quake. Like Wolf3D it only has limited movement and orientation. DOOM can basically only draw floors/ceilings and walls, at fixed angles. Its texturemapping is optimized for these cases. It doesn't need any FPU because of this.

Quake indeed needs an FPU for a reason, and the reason is that it was designed that way. It uses conventional perspective-corrected texturemapping, which requires 1/z calculations at specific intervals (in their case they approximate it with once every 16 pixels). This 1/z is done with the FPU.
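
To make the "exact divide only every 16 pixels" idea concrete, here is a minimal C sketch. It is not Quake's actual span code: the function name `span_uv`, its parameters and the `SPAN_STEP` constant are made up for illustration. It steps u/z, v/z and 1/z linearly along a scanline, evaluates the real 1/z divide only at sub-span boundaries, and interpolates u and v linearly in between.

```c
#include <stdint.h>

#define SPAN_STEP 16  /* pixels between exact perspective divides (assumed) */

/* Hypothetical helper: writes `count` texture coordinates for one scanline.
 * u/z, v/z and 1/z are linear in screen space, so they are stepped with
 * multiplies/adds; the true (u, v) is recovered with a divide only at the
 * end of each sub-span of up to SPAN_STEP pixels. */
void span_uv(float uoz, float voz, float ooz,     /* u/z, v/z, 1/z at pixel 0 */
             float duoz, float dvoz, float dooz,  /* per-pixel steps */
             int count, float *out_u, float *out_v)
{
    float z = 1.0f / ooz;               /* exact divide at the span start */
    float u0 = uoz * z, v0 = voz * z;
    int x = 0;

    while (x < count) {
        int n = (count - x < SPAN_STEP) ? (count - x) : SPAN_STEP;

        /* Advance the linear terms to the end of this sub-span. */
        uoz += duoz * n;
        voz += dvoz * n;
        ooz += dooz * n;

        z = 1.0f / ooz;                 /* one perspective divide per sub-span */
        float u1 = uoz * z, v1 = voz * z;

        /* Plain linear (affine) interpolation inside the sub-span. */
        float du = (u1 - u0) / n, dv = (v1 - v0) / n;
        for (int i = 0; i < n; i++) {
            out_u[x + i] = u0 + du * i;
            out_v[x + i] = v0 + dv * i;
        }

        x += n;
        u0 = u1;
        v0 = v1;
    }
}
```

With a step of 16, a 320-pixel scanline evaluates 1/z about 20 times instead of 320. The pixels between sub-span boundaries are only approximately perspective-correct, which is the trade-off described above.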

You wouldn't necessarily need an FPU for this kind of 3D engine, but then you'd need to redesign the code.
Descent is the closest to this. It also supports full 6 degrees of freedom in fully textured rooms. Descent however does not use the FPU for this, and as a result, it does not need a Pentium. It runs fine on a fast 486.
Its texturemapping is slightly lower quality/precision than Quake though.

But in theory you could build a Quake-like game on a Descent-like engine, resulting in a playable Quake on a 486. Possibly trading in a bit of quality, but not too much.
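
As an illustration of the integer-only route, a renderer can carry coordinates in fixed point and do its multiplies and divides on the integer unit. The 16.16 format and the helper names below (`fixed_t`, `fix_mul`, `fix_div` and so on) are assumptions chosen for this sketch, not Descent's actual code.

```c
#include <stdint.h>

typedef int32_t fixed_t;             /* 16.16 fixed point (assumed format) */
#define FIX_ONE ((fixed_t)1 << 16)   /* the value 1.0 in 16.16 */

/* Convert between plain integers and 16.16 fixed point. */
fixed_t fix_from_int(int v) { return (fixed_t)(v * FIX_ONE); }
int fix_to_int(fixed_t f)   { return (int)(f / FIX_ONE); }  /* truncates toward zero */

/* Multiply: widen to 64 bits so the intermediate product cannot overflow,
 * then drop the extra 16 fractional bits. */
fixed_t fix_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * b) / FIX_ONE);
}

/* Divide: pre-scale the numerator by 1.0 so the quotient keeps 16
 * fractional bits. The caller must make sure b is not zero. */
fixed_t fix_div(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * FIX_ONE) / b);
}
```

A 1/z step then becomes `fix_div(FIX_ONE, z)` with `z` in 16.16. The price is range and precision: 16 fractional bits are far coarser than a float mantissa, which fits the slightly lower texturemapping quality mentioned above.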

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 83 of 118, by BinaryDemon

Rank Oldbie

georgel misunderstood mpe's text.

mpe wrote:

At integer load (Doom) this CPU is roughly equivalent to P66-P75. These get about 20fps in Quake. 486DX-33 gets about 4fps.

mpe is saying that his NexGen CPU is roughly equal to a P66-P75 in Doom (no fps numbers given). Then he says a P66-P75 gets roughly 20fps in Quake. If mpe were getting 20fps in Doom, something would be massively wrong with his system; other NexGen Doom benchmarks I've seen basically max it out (~34.5 fps).

Scali, ever look at the Quake console ports for N64 / Saturn / PS1? They look like you describe - like someone made a Quake mod for a simpler 3d engine.

Check out DOSBox Distro:

https://sites.google.com/site/dosboxdistro/ [*]

a lightweight Linux distro (tinycore) which boots off a usb flash drive and goes straight to DOSBox.

Make your dos retrogaming experience portable!

Reply 84 of 118, by mpe

Rank Oldbie
BinaryDemon wrote:

mpe is saying that his NexGen CPU is roughly equal to a P66-P75 in Doom (no fps numbers given). Then he says a P66-P75 gets roughly 20fps in Quake. If mpe were getting 20fps in Doom, something would be massively wrong with his system; other NexGen Doom benchmarks I've seen basically max it out (~34.5 fps).

Exactly. My actual results are:

Doom results:
P60: 35.82fps
P66: 39.49fps
P75: 42.68fps
NexGen: 36.81fps

Quake results:
P60: 15.3fps
P66: 17fps
P75: 21.3fps
NexGen: 0.9fps 😀

The NexGen is still a bit disadvantaged in Doom due to its comparatively slower VL-Bus card (ET4000w32p), as opposed to the faster PCI card (Matrox Mystique) used in my Pentium systems. I suppose that once this is normalised it would get close to, or beat, the P66/P75.

I'm still puzzled why I only get 0.9fps when others are reporting better results with SX2 + Q87X.

Blog|NexGen 586|S4

Reply 85 of 118, by georgel

Rank Member
kixs wrote:

I did it the same way as always. Will try again...

---

Tried again... unless it's as quick as FPU, Q87 doesn't disable it.

---

In NSSI, Comptest... it shows the FPU being emulated. Benchmarks show that also. But in Quake it runs as fast as the native FPU. On the other hand, Q87X really disables the FPU in Quake too.

I tried Q87 on an i486DX. It effectively disables the FPU: the benchmark falls from 2524K Whetstones to 231K Whetstones. Quake was not run, but I don't believe Quake is software that disables Q87 or distinguishes between Q87 and Q87X. If you were using some form of virtual machine/emulation box, this might explain why Q87 was unable to activate and gain control over trap #7. But the same behavior should have been exhibited with Q87X... This is a mystery to me.

Reply 87 of 118, by mpe

Rank Oldbie

Wow, your 5x86-160 scored 15fps with FPU. That's almost Pentium-like performance. Mine barely gets to 12 fps (most likely due to a worse MB).

Reply 89 of 118, by georgel

Rank Member
kixs wrote:

Made a quick video on Am486-160 with Q87 and Q87X.

https://youtu.be/u_e_O8sbewI

And your video shows exactly that: when you run Q87, the FPU benchmark drops by a factor of about 5 because the real FPU is disabled. Not as you described in previous messages.

Reply 92 of 118, by lolo799

Rank Oldbie

I tried the no fpu version from Quake without FPU on this laptop:

ast_ascentia_700_n_4_33.jpg

It has a 486SX33, 8MB of RAM and a CT65530 graphics chipset, and got about 0.1fps in the demo1 in safe mode.

20190927_232700.jpg

PCMCIA Sound, Storage & Graphics

Reply 93 of 118, by The Serpent Rider

Rank l33t++

got about 0.1fps in the demo1 in safe mode

Gentlemen, we've officially hit rock bottom.

I must be some kind of standard: the anonymous gangbanger of the 21st century.

Reply 94 of 118, by SSTV2

Rank Oldbie
lolo799 wrote:

got about 0.1fps in the demo1 in safe mode

Nice. Type "r_drawviewmodel 0" and "r_drawentities 0" into the console and rerun the bench; FPS might increase to 0.2 or 0.3 🤣

Reply 96 of 118, by Tronix

Rank Member

By the way, I compiled the soft-fp library from the glibc sources with the help of DJGPP. I used VirtualPC with a Windows 98 environment for long file names. I took sfp-machine.h from \gclib\sysdeps\x86\fpu and included it in soft-fp.h. strong_alias is used everywhere in the *.c files, so I added this macro to soft-fp.h:

/* Define ALIASNAME as a strong alias for NAME.  */
# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
# define _strong_alias(name, aliasname) \
extern __typeof (name) aliasname __attribute__ ((alias (#name))) ;
// __attribute_copy__ (name);

I commented out "__attribute_copy__ (name);" because DJGPP produces an error with it. I do not know how correct this is. The library eventually compiled.

So I recompiled Quake with -msoft-float and linked it with the -lsoft library. It compiled fine, with no unresolved function calls.
But...

hn3c3mhkxwwuv-wd-0zhdroxc8i.png

"Bad fov: -nan" - a floating-point number conversion error. No way 😵

https://github.com/Tronix286/

Reply 98 of 118, by Scali

Rank l33t
georgel wrote:

This FP library will be several times slower than Q87 for IA32.

Why do you say that?
Logic would dictate that a library is faster.
Because:
1) The emulation code can be injected inline by the compiler, where Q87 can only respond to an interrupt generated by the CPU, which is more overhead.
2) Since the code can be injected inline, it can also be optimized by the compiler.

Reply 99 of 118, by georgel

Rank Member

Because:
1. Q87 is not coded in C but in low-level assembly, and is not optimized by a compiler at all.
2. Responding to and returning from a trap #7 is insignificant overhead in comparison to the hundreds of instructions required to emulate every single FPU instruction. BTW, if Q87 detects consecutive FPU instructions, it executes them without an IRET between them, thus avoiding the trap overhead you are considering.
3. Q87 uses faster (large table assisted) computations. The memory footprint of Q87 is some 30 times larger than that of the other emulators/libs. It's amazing that even on a 386 25MHz it has better performance than a HARDWARE 80287-6. AFAIK no other software FPU emulation/library has achieved this.
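
The overhead argument in point 2 can be put as simple arithmetic. The sketch below is only a cost model: the function name `trap_overhead_fraction` and all cycle counts fed into it are hypothetical, not measurements of Q87 or of any real CPU.

```c
/* Fraction of total emulation time spent entering and leaving the trap.
 * trap_cycles:         assumed cost of one trap #7 entry plus IRET
 * emu_cycles_per_insn: assumed cost of emulating one FPU instruction
 * insns_per_trap:      consecutive FPU instructions handled per trap
 * All inputs are hypothetical values supplied by the caller. */
double trap_overhead_fraction(double trap_cycles,
                              double emu_cycles_per_insn,
                              int insns_per_trap)
{
    double work = emu_cycles_per_insn * insns_per_trap;
    return trap_cycles / (trap_cycles + work);
}
```

For example, with a made-up 100 cycles per trap and 400 cycles per emulated instruction, the trap accounts for 20% of the time when every instruction traps, and for about 6% when four consecutive instructions are handled in one trap. How small the real figure is depends on cycle counts that this thread does not give.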

I'm still interested in how it was designed and who the developers were.

Last edited by georgel on 2019-09-29, 12:55. Edited 1 time in total.