Recently I got caught in a discussion about the behavior of the Build engine, and specifically Duke Nukem 3D, on 486SX processors. I used to have one of those in the mid-90s, and while Duke 3D ran fine on it for the most part, it would turn into a right slideshow whenever I neared a sloping floor in the game. On a friend's 486DX computer the same problem did not occur.
I always assumed this was because the Build engine used some floating point calculations on sloping floors, and had to fall back on software FP emulation on the FPU-less 486SX, slowing the game down to a crawl. I read about similar experiences from other 486SX owners, which only reinforced my assumption.
Then recently I got into this discussion, where someone vehemently denied the Build engine using any floating point calculations at all. This got me doubting my own story, and curious for an official explanation, so I went looking for a reliable source. That led me to re-read your code review of the Build engine, and I noticed you also mentioned that Build uses integers exclusively ("If you search for 'float' in the source code of Build you will not get a single hit"). Reading that only puzzled me more.
To get a definitive answer on why slopes are so slow on 486SX processors, I turned to the most reliable source I can think of: the original Duke 3D source code released in 2003. After looking around for slope-related code, I finally found what I was looking for: the Build engine does in fact use floating point assembly instructions in its slope rendering routines (setupslopevlin_ and slopevlin_) inside the A.ASM source file. It's just a few short, simple instructions (fild, fadd, fst, fstp), hardly worth mentioning, but it's enough to bring a non-FPU processor to its knees when invoked.
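For readers who don't speak x87, here is a rough C analogue of what a fild / fadd / fstp sequence does. This is only a sketch of the general meaning of those instructions: the function name and operands are made up for illustration and are not taken from A.ASM, and I'm assuming the FPU is running at its extended internal precision.

```c
#include <stdint.h>

/* Hypothetical illustration of an x87 sequence such as:
 *   fild dword ptr [a]    ; load a 32-bit integer, converting it to floating point
 *   fadd dword ptr [b]    ; add a 32-bit float from memory to the top of the FPU stack
 *   fstp dword ptr [out]  ; store the result as a 32-bit float and pop the stack
 * The names a, b and out are assumptions, not Build's actual operands. */
float fild_fadd_fstp(int32_t a, float b)
{
    long double st0 = (long double)a; /* fild: the integer converts exactly */
    st0 += b;                         /* fadd: add the memory operand */
    return (float)st0;                /* fstp: round to single precision on store */
}
```

On a 486DX each of those three steps is a single hardware instruction. On a 486SX each one has to be replaced by a software routine.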
I also looked into how the software floating point emulation works. Turns out it's very simple: the original Build engine source is written for the Watcom C/C++ compiler, as you know, and the Watcom assembler tool (wasm.exe) used to assemble the rendering routines "by default, generates code with support for 8087 software emulation". That quote comes from the codebase of what is now the Open Watcom project.
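To give an idea of why emulation hurts so much, here is a minimal sketch of the kind of work an emulator has to do for just the integer-to-float conversion behind fild, using nothing but integer operations. This is not Watcom's emulator code: the function names are hypothetical, and I'm building a 64-bit IEEE-754 double for simplicity where the real x87 uses an 80-bit internal format.

```c
#include <stdint.h>
#include <string.h>

/* Build the IEEE-754 double bit pattern for a 32-bit integer using only
 * integer operations. Hypothetical illustration, not Watcom's emulator. */
uint64_t soft_int32_to_double_bits(int32_t value)
{
    if (value == 0)
        return 0; /* +0.0 is all zero bits */

    uint64_t sign = value < 0 ? 1ULL : 0ULL;
    /* Widen before negating so that INT32_MIN does not overflow. */
    uint64_t mag = value < 0 ? (uint64_t)(-(int64_t)value) : (uint64_t)value;

    /* Find the position of the highest set bit (the normalisation step). */
    int top = 31;
    while (((mag >> top) & 1) == 0)
        top--;

    uint64_t exponent = 1023 + (uint64_t)top;                      /* biased exponent */
    uint64_t fraction = (mag << (52 - top)) & ((1ULL << 52) - 1);  /* drop implicit 1 */

    return (sign << 63) | (exponent << 52) | fraction;
}

/* Reference: let the compiler (and the FPU, if present) do the conversion.
 * Assumes the platform's double is an IEEE-754 binary64. */
uint64_t hard_int32_to_double_bits(int32_t value)
{
    double d = (double)value;
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

Both functions should return identical bit patterns for any input. The point is that a single instruction on the DX becomes a loop, several shifts and masks on the SX, and that is before counting whatever it costs to dispatch each floating point instruction to the emulator in the first place.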
Curiously enough, I did encounter an alternative implementation of the slope rendering routines that uses only integer instructions (setupslopevlin2_ and slopevlin2_), but those functions do not appear to be used anywhere in the code.