VOGONS


Windows 95, 3Dnow, and SSE

Reply 41 of 46, by Orkay

I ran a single-frame render in 3ds MAX R3 on both a 450MHz Pentium II and III under Windows 98, and they finished in pretty much exactly the same time, so it does seem that the program doesn't use SSE instructions. I can't really think of any programs using SSE that would run in Windows 95, with or without the SSE optimizations enabled, other than 3DMark 99 MAX and a few open source goodies like old versions of Blender and VLC. I tried VLC earlier, but it had a tendency to crash a lot in 95B and NT4. Even then, when I toggled the various CPU optimization options in VLC's preferences, I didn't notice any change in CPU usage when playing back a small x264 video.

On the flip side, I've confirmed that 3DNow is supported in Windows 95 by running AMD's patched Quake II with and without the 3DNow extensions. Maybe I should consider getting Thunderbird-compatible hardware, having already reeled in plenty of 440BX boards.

Reply 42 of 46, by Falcosoft

Hi,
I have made a video about using SSE/3DNow under the first edition of Win95. For testing I used my MandelX fractal generator and SoftIce:
https://youtu.be/ivJxALS7JyA
Conclusion:
If you start sse.com in autoexec.bat (without loading EMM386 in config.sys), then you can use SSE/SSE2/3 under Win95, but only with one program at a time. Running more than one SSE-capable program simultaneously can cause problems or crashes. 3DNow can be used without any restrictions, and you do not need special tools like sse.com: 3DNow uses the MMX registers, which are aliased onto the FPU registers, so if the OS supports saving/restoring the FPU registers (Win95 does), it automatically handles them as well.
To confirm whether a program really uses a given instruction set under Win95, I think SoftIce is the best tool.
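As a rough illustration of the distinction above, here is a minimal Python sketch that decodes CPUID feature bits and flags which extensions need extra OS support. The bit positions are the documented ones (leaf 1 EDX/ECX and AMD's extended leaf 0x80000001); the function names and the sample register values are made up for the example, and the register values are assumed to have been read by some other means, for instance in a debugger. That sse.com works by switching on the CPU's SSE enable bit (CR4.OSFXSR) is an assumption, not something verified here.

```python
# Documented CPUID feature bit positions.
LEAF1_EDX_BITS = {"FPU": 0, "MMX": 23, "FXSR": 24, "SSE": 25, "SSE2": 26}
LEAF1_ECX_BITS = {"SSE3": 0}
EXT_EDX_BITS = {"3DNow": 31}  # AMD extended leaf 0x80000001


def decode_features(leaf1_edx, leaf1_ecx=0, ext_edx=0):
    """Return the set of feature names whose CPUID bits are set.

    Hypothetical helper: it only decodes values passed in, it does not
    execute CPUID itself.
    """
    found = set()
    for table, reg in ((LEAF1_EDX_BITS, leaf1_edx),
                       (LEAF1_ECX_BITS, leaf1_ecx),
                       (EXT_EDX_BITS, ext_edx)):
        for name, bit in table.items():
            if (reg >> bit) & 1:
                found.add(name)
    return found


def needs_os_support(feature):
    """True for extensions whose state lives in the XMM registers.

    XMM state is not covered by FSAVE/FRSTOR, so the OS has to save it
    with FXSAVE/FXRSTOR. MMX and 3DNow state sits in the x87 registers
    and is saved along with the FPU state.
    """
    return feature in {"SSE", "SSE2", "SSE3"}


# Illustrative (not measured) EDX value with FPU, MMX, FXSR and SSE set.
sample_edx = (1 << 0) | (1 << 23) | (1 << 24) | (1 << 25)
for name in sorted(decode_features(sample_edx)):
    print(name, "needs OS support" if needs_os_support(name) else "works as-is")
```

The split in needs_os_support mirrors the conclusion above: anything that lives in the FPU/MMX register file rides along with the existing FPU save/restore, while the SSE family does not.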

Last edited by Falcosoft on 2019-05-03, 07:56. Edited 1 time in total.

Website, Facebook, Youtube
Falcosoft Midi Player + Munt VSTi + BassMidi VSTi topic

Reply 43 of 46, by BinaryDemon

Wow, I didn’t realize Falcosoft had a Vogons account and was still active. What a cool turn of events.

Check out DOSBox Distro:
https://sites.google.com/site/dosboxdistro/
A lightweight Linux distro (tinycore) that boots off a USB flash drive and goes straight to DOSBox. Make your DOS retrogaming experience portable!

Reply 44 of 46, by Orkay

Falcosoft wrote:

Hi,
I have made a video about using SSE/3DNow under the first edition of Win95. For testing I used my MandelX fractal generator and SoftIce:
https://youtu.be/ivJxALS7JyA
Conclusion:
If you start sse.com in autoexec.bat (without loading EMM386 in config.sys), then you can use SSE/SSE2/3 under Win95, but only with one program at a time. Running more than one SSE-capable program simultaneously can cause a crash. 3DNow can be used without any restrictions, and you do not need special tools like sse.com. If the OS supports saving/restoring the FPU registers (Win95 does), then it automatically also supports the MMX registers that 3DNow uses.
To confirm whether a program really uses a given instruction set under Win95, I think SoftIce is the best tool.

I like this program a lot; it's really useful for quick benchmarks! I ran MandelX on Windows 95A on a real Pentium III computer just now. Without SSE.COM loaded, it doesn't start the render at all, and running two instances doesn't cause any abnormalities, since SSE isn't usable without the program. This indicates that installing Windows 95 on a computer with any SSE-capable CPU is perfectly safe, assuming no program tries to enable access to the SSE registers itself.

When I ran two instances of MandelX with SSE.COM loaded in AUTOEXEC.BAT, I got more interesting results. Windows 95 didn't crash, but its lack of awareness of the SSE registers when switching tasks shows up in the render results: there are plenty of differently colored spots in the center that obviously shouldn't be there. Disabling SSE renders the images correctly, although it takes at least twice as long.
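The effect can be sketched with a toy model in Python. This is only an illustration of the mechanism, not how Windows 95 is actually implemented: the ToyCPU and ToyTask classes, the switch function, and the register values are all invented for the example. The save_xmm flag stands in for whether the OS saves and restores the SSE registers on a task switch.

```python
class ToyCPU:
    """Very small model: one x87/MMX register file and one XMM register file."""

    def __init__(self):
        self.fpu = [0.0] * 8
        self.xmm = [0.0] * 8


class ToyTask:
    """Holds the register state the OS saved for a task."""

    def __init__(self, name):
        self.name = name
        self.saved_fpu = [0.0] * 8
        self.saved_xmm = [0.0] * 8


def switch(cpu, old, new, save_xmm):
    """Switch from task old to task new.

    save_xmm=False models an OS that only knows about x87/MMX state.
    """
    old.saved_fpu = list(cpu.fpu)
    cpu.fpu = list(new.saved_fpu)
    if save_xmm:
        old.saved_xmm = list(cpu.xmm)
        cpu.xmm = list(new.saved_xmm)
    # Otherwise the XMM registers keep whatever the previous task left there.


def run(save_xmm):
    """Return (fpu[0], xmm[0]) as task A sees them after B has run in between."""
    cpu, a, b = ToyCPU(), ToyTask("A"), ToyTask("B")
    # Task A runs first and keeps a value in both register files.
    cpu.fpu[0] = 1.5
    cpu.xmm[0] = 1.5
    switch(cpu, a, b, save_xmm)
    # Task B overwrites the same registers with its own data.
    cpu.fpu[0] = -7.0
    cpu.xmm[0] = -7.0
    switch(cpu, b, a, save_xmm)
    return cpu.fpu[0], cpu.xmm[0]


print("OS unaware of XMM:", run(save_xmm=False))  # (1.5, -7.0)
print("OS saves XMM:     ", run(save_xmm=True))   # (1.5, 1.5)
```

In the first case task A gets its FPU value back but finds task B's data in the XMM register, which is the kind of silent corruption that would show up as wrong pixels rather than a crash.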

I wasn't able to render anything with MandelX under Windows NT 4.0 as it uses a later version of DirectX (unless I'm missing something), but according to a page on BearWindows, SSE support can be added to NT4 by installing Service Pack 5 or later; a driver called INTLFXSR.SYS handles the SSE instruction set, and I expect it should handle context switching properly.

Either way, thanks for helping with clearing up all doubts about SSE handling in Windows 95. I'll be sure to note this in my 440BX build guide I've been working on.

Attachments

  • 95sse2.png (78.38 KiB): MandelX in Windows 95A with SSE disabled
  • 95sse1.png (78.6 KiB): MandelX in Windows 95A with SSE enabled

Reply 46 of 46, by Falcosoft

Scali wrote:

Thanks Falcosoft! Saves me the trouble of making a proof-of-concept myself 😀

You're welcome! The pioneer helps where he can, and volunteers for the community. (It's a real slogan from communist-era Hungary: the 5th of the 12 points of the Pioneers.)
https://translate.google.hu/translate?hl=en&t … C3%25A9t_pontja
😀

Orkay wrote:

Either way, thanks for helping with clearing up all doubts about SSE handling in Windows 95.

I'm glad it helped. I saw that you have a 450MHz K6-2, so just for fun try MandelX on it in both FPU and 3DNow mode. I think you will be as surprised as I was when I had finished the code and tried it. It's expected that 3DNow would be faster, but not that optimized 3DNow code could be more than 5x faster than equally well-optimized FPU code on a K6-2/3. The SIMD nature of 3DNow cannot explain this difference on its own, since a 3DNow instruction works on only two packed single-precision floats at a time. So I think it's rather down to the dual pipelined 3DNow execution units vs. the non-pipelined FPU of the K6-2. On an Athlon, 3DNow execution is proportionally slower and the FPU is much faster, so the difference is not even ~2x.
One can only imagine how the K6-2/3 could have performed with float-intensive software if there had been more hand-optimized 3DNow code at the time.
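The arithmetic behind that argument can be written down in a few lines of Python. This is just a back-of-the-envelope sketch: residual_speedup is a name made up for the example, and it assumes the FPU and 3DNow code paths do the same amount of arithmetic, which need not hold exactly.

```python
LANES_3DNOW = 2  # two packed single-precision floats per 64-bit MMX register


def residual_speedup(observed_speedup, simd_lanes):
    """Part of an observed speedup that SIMD width alone cannot account for."""
    if simd_lanes <= 0:
        raise ValueError("simd_lanes must be positive")
    return observed_speedup / simd_lanes


# With the roughly 5x figure reported above for the K6-2/3:
print(residual_speedup(5.0, LANES_3DNOW))  # 2.5
```

So even if the two-wide SIMD were used perfectly, a factor of about 2.5 would still have to come from somewhere else, such as the pipelining difference suggested above.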

BinaryDemon wrote:

Wow, I didn’t realize Falcosoft had a Vogons account and was still active.

Yep, FSMP and related software are still actively developed and the 'support forum' is here on Vogons:
Falcosoft Soundfont Midi Player + Munt VSTi + BassMidi VSTi