VOGONS


First post, by Concerto

Rank Newbie

Hello, I have some old PC builds that I am interested in developing on, so I am curious to know what kind of programs you guys use to write software and compile code, what libraries you use, that kind of thing. I am mostly focusing on DOS/Windows 9x stuff.

Reply 1 of 38, by GL1zdA

Rank Oldbie

If you want something recent, then use the Open Watcom compiler and an IDE of your choice - it lets you build any target from real mode DOS to Win32.

getquake.gif | InfoWorld/PC Magazine Indices

Reply 2 of 38, by gca

Rank Member

On my old boxes I usually stick to Borland compilers like Turbo C++ v3 or Turbo Pascal (usually 5.5 or 7 because it's what I have access to). For quick and ugly hacks to kill time, or a one-off just to get something running, I use QBasic or its big brother QuickBasic.

I never did much Windows development, but on the rare occasions that I do, I use GNU C/C++ with SDL for portability, with Notepad++ as the editor. Even then the code usually needs some fettling to get Windows to co-operate, since it is being ported from my usual Linux Mint dev platform.

Reply 3 of 38, by Concerto

Rank Newbie

Okay. Thanks for the info. I already have some of the software mentioned. I wonder what leileilol and Scali use in terms of software and libraries? I would like to know how they build their programs, like engoo and the retro programming projects.

Reply 5 of 38, by vvbee

Rank Oldbie

Someone should benchmark the compilers' perf optimizations. For dos, could include borland, digital mars, and watcom. Maybe for win 9x the latter two and gcc. Use a software rasterizer or whatever.

Reply 6 of 38, by Shponglefan

Rank l33t

Back in the DOS days I used to use Turbo Pascal and Borland C++. For Windows 95, I used Visual C++ with various early DirectX libraries.

Pentium 4 Multi-OS Build
486 DX4-100 with 6 sound cards
486 DX-33 with 5 sound cards

Reply 7 of 38, by beastlike

Rank Member

My fave combo for 16 bit DOS was Borland's bcc and tasm.

Now that I've actually got some time-appropriate machines I am going to try to get some of my old source code running again 😀

You may try an old version of Lazarus if you're into the pascal thing - it's free but dropped win9x support in v1.6 - still pretty cool free IDE though.

Delphi was pretty cool for Windows development back in the mid 90's to early 2000's. I believe executables compiled with some of the older versions (D6, maybe D7) will run on Win9x. They also had Turbo C++ and Turbo Delphi, which were free versions; those had some sort of activation mechanism, so I'm not sure if you can still get them. Delphi was great because you could randomly break out into assembly in the middle of a Pascal function if need be.

You could also do this in Turbo Pascal for DOS - many times to the dismay of my Pascal teacher, which - f that guy, my code still met spec 😜

Turbo Pascal for DOS was originally written by Anders Hejlsberg. If you have never used it, seriously drop everything you're doing and try this IDE just for fun. IMHO it's one of the most amazing IDEs for DOS that ever was. Anyway, he's (still) a legendary programmer who Microsoft sniped from Borland (he was their chief architect of Delphi at the time) in the late 90's to create the .NET framework (probably a big part of what led to Borland's downfall). He's still the lead architect of C#, and since like 2010 he's been the lead architect of TypeScript. One of my all-time heroes in the programming world.

Reply 8 of 38, by Azarien

Rank Oldbie

Free Pascal (aka FPC) is a good Pascal compiler that has a DOS version (32-bit, based on DJGPP). Recent versions can generate 16-bit DOS executables too.
The Windows version should still work on 9x, or 98 at least.

Reply 10 of 38, by vvbee

Rank Oldbie
vvbee wrote:

Someone should benchmark the compilers' perf optimizations. For dos, could include borland, digital mars, and watcom. Maybe for win 9x the latter two and gcc. Use a software rasterizer or whatever.

I compiled a modified version of the smallpt path tracer using a few win32 compilers to see whether there would be notable differences in execution speed. The program rendered a 40 x 30 image with a ray depth of at most 4. Floats were double-precision. I was going to repeat the test for dos, but the dos version of the code stopped working at some point and I couldn't be bothered.

Compilers: borland c++ 5.5 (-5 -P -f -O2 -O -tWC), digital mars c++ probably the newest version (-mn -a8 -5 -cpp -c -f -o+speed), gcc 4.7 (-O3), and open watcom 1.9 (-oneatx -oh -oi+ -ei -zp8 -5 -fpi87 -fp5). The optimizations were meant to target the pentium level (not ppro), use hardware x87, and not enable ieee-breaking fp. None of these are guaranteed to have happened. I especially didn't look at the open watcom settings too closely and just copied them from a website. Didn't care at all about gcc's. It's unclear whether -f for the borland allows true x87 or just emulation. No doubt you can devise faster/better options, so let loose and do a repeat.

In general, gcc produced the fastest executable, followed by the digital mars compiler. Borland's was the slowest.

Machine: 300 mhz k6-2, windows 98. 512 samples per pixel.
- dmc: 49 seconds to render
- gcc: 50
- ow: 55
- bc: 62

Machine: 1.4 ghz athlon 64, windows 98. 2048 samples per pixel.
- gcc: 21
- dmc: 30
- ow: 38
- bc: 43

Machine: 3.3 ghz haswell xeon, linux + wine. 4096 samples per pixel.
- gcc: 9
- dmc: 12
- ow: 20
- bc: 22

I ran each test twice, and the results were about the same every time. The render times scale with the number of samples per pixel - the more samples, the slower it goes - and each machine used a different sample count, so you can't directly compare the cpus.
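If you do want a rough cross-machine number, one option is to normalise each run to samples traced per second. This is only a sketch: it assumes render time grows linearly with samples per pixel, which the runs above did not verify, and the function names are made up for illustration.

```cpp
#include <cassert>

// Samples traced per second for one run. Assumes render time grows
// linearly with samples per pixel (an assumption, not a measured fact).
double samples_per_second(int width, int height, int spp, double seconds)
{
    assert(seconds > 0.0);
    return static_cast<double>(width) * height * spp / seconds;
}

// How many times more throughput run A achieved than run B.
double relative_throughput(double a, double b)
{
    assert(b > 0.0);
    return a / b;
}
```

With the gcc figures above (40 x 30 image), the k6-2 run at 512 spp in 50 seconds works out to 12288 samples per second, and the athlon 64 run at 2048 spp in 21 seconds to roughly 117000, so about 9.5 times the throughput if the linear assumption holds.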

It's worth noting that while dmc produced the second-fastest executable, it was also the only one to produce an artifact in the rendering, a white line at the top of the image. Shown here are the renderings at 2048 spp (bc, dmc, gcc, ow). I briefly debugged this by varying the compile options, including disabling optimizations, but no change. It may or may not be an fp issue.
c1024.png

Reply 11 of 38, by spiroyster

Rank Oldbie
vvbee wrote:
vvbee wrote:

Someone should benchmark the compilers' perf optimizations. For dos, could include borland, digital mars, and watcom. Maybe for win 9x the latter two and gcc. Use a software rasterizer or whatever.

I compiled a modified version of the smallpt path tracer using a few win32 compilers to see whether there would be notable differences in execution speed. The program rendered a 40 x 30 image with a ray depth of at most 4. Floats were double-precision. I was going to repeat the test for dos, but the dos version of the code stopped working at some point and I couldn't be bothered.

Compilers: borland c++ 5.5 (-5 -P -f -O2 -O -tWC), digital mars c++ probably the newest version (-mn -a8 -5 -cpp -c -f -o+speed), gcc 4.7 (-O3), and open watcom 1.9 (-oneatx -oh -oi+ -ei -zp8 -5 -fpi87 -fp5). The optimizations were meant to target the pentium level (not ppro), use hardware x87, and not enable ieee-breaking fp. None of these are guaranteed to have happened. I especially didn't look at the open watcom settings too closely and just copied them from a website. Didn't care at all about gcc's. It's unclear whether -f for the borland allows true x87 or just emulation. No doubt you can devise faster/better options, so let loose and do a repeat.

In general, gcc produced the fastest executable, followed by the digital mars compiler. Borland's was the slowest.

Machine: 300 mhz k6-2, windows 98. 512 samples per pixel.
- dmc: 49 seconds to render
- gcc: 50
- ow: 55
- bc: 62

Machine: 1.4 ghz athlon 64, windows 98. 2048 samples per pixel.
- gcc: 21
- dmc: 30
- ow: 38
- bc: 43

Machine: 3.3 ghz haswell xeon, linux + wine. 4096 samples per pixel.
- gcc: 9
- dmc: 12
- ow: 20
- bc: 22

I ran each test twice, and the results were about the same every time. The render times scale with the number of samples per pixel - the more samples, the slower it goes - and each machine used a different sample count, so you can't directly compare the cpus.

It's worth noting that while dmc produced the second-fastest executable, it was also the only one to produce an artifact in the rendering, a white line at the top of the image. Shown here are the renderings at 2048 spp (bc, dmc, gcc, ow). I briefly debugged this by varying the compile options, including disabling optimizations, but no change. It may or may not be an fp issue.

That band is interesting: there is colour bleed from both walls in it, which means it WAS part of the render (or at least for some period of time during the render it was considered correctly). Usually 'white outs' like this occur when range checks don't go according to plan (especially around trig functions). It's at the top, so maybe a precision error at the extremes of the near clipping plane (my money's on this issue having something to do with line 92 o.0)? And it's the same when compiling with no optimisations?... ouch 😵 .

For the dos version, I assume memory is an issue? I doubt you would get a big framebuffer into 640k o.0. You would need to modify smallpt to persist the ray result to file rather than keeping it in the stack. Interesting idea getting smallpt to run on dos...so in theory it could run with a 387?

P.S. You should use '-fopenmp' in gcc for the haswell. For funzzies like!
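A rough sketch of the "persist to file" idea: render one scanline at a time and append it to a PPM file, so only a single row of pixels sits in memory. The shade() function here is a made-up stand-in for the path tracer's per-pixel routine, and the 320-pixel row limit is an arbitrary choice for the example, not something taken from smallpt.

```cpp
#include <cassert>
#include <cstdio>

const int MAX_ROW_WIDTH = 320;  // arbitrary limit chosen for this sketch

// Hypothetical stand-in for the path tracer's per-pixel routine;
// it returns a grey level so the example stays self-contained.
static unsigned char shade(int x, int y)
{
    return static_cast<unsigned char>((x + y) & 0xFF);
}

// Renders one scanline at a time and appends it to a binary PPM file,
// so only width * 3 bytes of pixel data are held in memory at once.
// Returns true on success.
bool render_to_ppm(const char *path, int width, int height)
{
    assert(path != 0);
    if (width <= 0 || width > MAX_ROW_WIDTH || height <= 0) return false;

    unsigned char row[MAX_ROW_WIDTH * 3];
    std::FILE *f = std::fopen(path, "wb");
    if (!f) return false;

    std::fprintf(f, "P6\n%d %d\n255\n", width, height);
    bool ok = true;
    for (int y = 0; y < height && ok; ++y) {
        for (int x = 0; x < width; ++x) {
            const unsigned char v = shade(x, y);
            row[x * 3 + 0] = v;  // red
            row[x * 3 + 1] = v;  // green
            row[x * 3 + 2] = v;  // blue
        }
        ok = std::fwrite(row, 3, width, f) == static_cast<std::size_t>(width);
    }
    // Always close the file, then report whether every write succeeded.
    return std::fclose(f) == 0 && ok;
}
```

The trade-off is that each pixel has to be fully sampled before its row is written, so you lose the option of accumulating samples across passes.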

Reply 12 of 38, by vvbee

Rank Oldbie
spiroyster wrote:

That band is interesting: there is colour bleed from both walls in it, which means it WAS part of the render (or at least for some period of time during the render it was considered correctly). Usually 'white outs' like this occur when range checks don't go according to plan (especially around trig functions). It's at the top, so maybe a precision error at the extremes of the near clipping plane (my money's on this issue having something to do with line 92 o.0)? And it's the same when compiling with no optimisations?... ouch 😵 .

For the dos version, I assume memory is an issue? I doubt you would get a big framebuffer into 640k o.0. You would need to modify smallpt to persist the ray result to file rather than keeping it in the stack. Interesting idea getting smallpt to run on dos...so in theory it could run with a 387?

P.S. You should use '-fopenmp' in gcc for the haswell. For funzzies like!

I had an earlier test version that ran in dosbox, though I didn't test it on real hardware. And having recalled that, I remember that it was compiled with dmc and produced a fine image. So something in the current code may be an issue. I recompiled it on gcc and bc earlier to make sure, and no prob. Check out the code and see if you can spot something: http://personal.inet.fi/muoti/eimuoti/tmp/spt.cpp. I hacked it up as I went to get it to compile on everything, so a good strategy might be to just start from scratch with the smallpt source. It's not too many changes that need to be made.

Reply 13 of 38, by vvbee

Rank Oldbie

It turns out the dmc executable has issues when built for w32 but not when built for dos. Not up to me to debug it luckily.

I ran a few more benches with a dos version of the program. I updated the code above if you intend to run your own tests, which is probably unlikely. To cut down on recursion I removed ray reflection and disabled all but diffuse surfaces. For compiling, I used borland c++ 5.2 (-3 -f87 -O2) and digital mars c++ (-mld -a2 -5 -f -o+speed). Borland couldn't inline the intersection routine.

pallo.png

Machine: 300 mhz k6-2, dos. 256 samples per pixel.
- dmc: 38 seconds to render
- bc: 52

Machine: 1.2 ghz athlon 64, dos. 1024 samples per pixel.
- bc: 24
- dmc: 38

It's interesting that on the newer cpu the older borland compiler produced faster code than the newer dmc one. But I think the older cpu is more important for this purpose, and dmc was clearly faster there.

Open watcom 1.9's executable couldn't produce a valid rendering. It would either fail completely or at best make this: vituiks.png.

Reply 14 of 38, by vvbee

Rank Oldbie

The open watcom issue was quick to debug, as manually increasing stack size allowed it to run. I'll leave testing that with the other compilers to someone else.

Scratch this: [It doesn't perform well. On the athlon 64, render time is 27 seconds, above the dmc but below the bc. On the k6-2, render time is 59 seconds, the slowest of the three compilers. A double-precision path tracer isn't what you'd normally write for dos, but come now.]

The above was in fact not true, as I realized I ran ow's exe with 512 samples per pixel on the k6-2, even though I was supposed to use 256 in dos. With the correct number of samples, the open watcom executable performs quite well, as it's now the fastest of the three compilers at 29 seconds. That's more in line with their perf rep. The athlon 64 result is still valid from what I recall, and curious for the bc.

Reply 15 of 38, by Azarien

Rank Oldbie

There are many hacky expressions in the code, such as

if((d=spheres[i].intersect(r))&&d<t)

It's possible that one of these lines causes artifacts due to undefined behaviour.
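For what it's worth, that particular line on its own should be well-defined: the built-in && finishes evaluating its left operand, including the assignment to d, before it touches the right operand. The sketch below compares the compact form with a spelled-out one; fake_intersect() and its hit distances are invented for the example and are not from the posted code.

```cpp
#include <cassert>

// Hypothetical stand-in for Sphere::intersect(): returns the hit
// distance, or 0 when the ray misses. The values are made up.
static double fake_intersect(int i)
{
    static const double hits[4] = { 0.0, 7.5, 3.0, 9.0 };
    assert(i >= 0 && i < 4);
    return hits[i];
}

// Compact form, as in the quoted line: assign inside the condition.
// The assignment to d completes before d < t is evaluated.
int nearest_compact(int n, double &t)
{
    int id = -1;
    double d;
    t = 1e20;
    for (int i = n; i--;)
        if ((d = fake_intersect(i)) && d < t) { t = d; id = i; }
    return id;
}

// Explicit form: assign first, then test.
int nearest_explicit(int n, double &t)
{
    int id = -1;
    t = 1e20;
    for (int i = n; i--;) {
        const double d = fake_intersect(i);
        if (d != 0.0 && d < t) { t = d; id = i; }
    }
    return id;
}
```

Both functions pick index 2 with a distance of 3.0 for the made-up data. The compiler warnings on such lines are usually about an assignment being used as a condition, which is a style complaint rather than a sign of undefined behaviour, so the artifact may well come from somewhere else.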

Reply 16 of 38, by vvbee

Rank Oldbie

I had edited that particular code to be d = spheres[i].intersect(r); if (d && d<t){t=d;id=i;} in the code I ran the win32 tests on. In the code I'm currently sharing and used for dos I left it as it is in the original. Some of the compilers warn about that line and some don't. It's of course the case that smallpt aims for compactness, and it's clear that you can't comfortably fit a feature-packed path tracer into 99 lines of normal-length c++ code. I aim to test with one of my oldschool rasterizers later.

Reply 17 of 38, by vvbee

Rank Oldbie

I tested the above three compilers with a c++ vga 13h software rasterizer that uses floats/ints for transforms and mainly ints for rendering. The engine has no notable optimizations and it's up to the compiler to make it fast. The scene was a 320-tri textured sphere spinning around, variably covering 10-40% of the screen area.
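For anyone unfamiliar with mode 13h, here is a minimal sketch of the integer-only pixel and span writes such a rasterizer ends up doing. It draws into an ordinary memory buffer standing in for video memory, and all names are made up for the example; this is not the engine used in the test.

```cpp
#include <cassert>
#include <cstring>

// Mode 13h is 320x200 with one byte (a palette index) per pixel.
const int SCREEN_W = 320;
const int SCREEN_H = 200;

// Off-screen buffer standing in for video memory at segment A000h.
// A real DOS build would copy the finished frame there.
static unsigned char framebuffer[SCREEN_W * SCREEN_H];

void clear_screen(unsigned char color)
{
    std::memset(framebuffer, color, sizeof framebuffer);
}

// Plots one pixel with integer math only; out-of-range points are clipped.
void put_pixel(int x, int y, unsigned char color)
{
    if (x < 0 || x >= SCREEN_W || y < 0 || y >= SCREEN_H) return;
    // y * 320 written as shifts (256 + 64), a common trick on older cpus.
    framebuffer[(y << 8) + (y << 6) + x] = color;
}

// Fills a horizontal span, the inner loop of a scanline rasterizer.
void hline(int x0, int x1, int y, unsigned char color)
{
    assert(x0 <= x1);
    if (y < 0 || y >= SCREEN_H) return;
    if (x0 < 0) x0 = 0;
    if (x1 >= SCREEN_W) x1 = SCREEN_W - 1;
    if (x0 > x1) return;  // span lies entirely off-screen
    std::memset(framebuffer + y * SCREEN_W + x0, color, x1 - x0 + 1);
}
```

Whether the shift form of y * 320 beats a plain multiply depends on the compiler and cpu, which is exactly the kind of thing a compiler comparison like this one would show.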

Machine: k6-2 300, matrox millennium, dos.
- ow: 73 fps
- dmc: 62
- bc: 55

Machine: athlon 64 1.2 ghz, matrox g400, dos.
- ow: 241 fps
- dmc: 171
- bc: 157

The open watcom compiler is 40-50% faster than the other two when using the newer cpu, and finds just under 20% of extra speed over dmc on the older k6-2.
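Those percentages come straight from the frame rates. A tiny helper for redoing the arithmetic on your own numbers (the function name is made up):

```cpp
#include <cassert>

// Percentage by which `fast` exceeds `slow`, e.g. 241 fps vs 171 fps.
double percent_faster(double fast, double slow)
{
    assert(slow > 0.0);
    return (fast / slow - 1.0) * 100.0;
}
```

On the athlon 64 this gives about 41% for ow over dmc (241 vs 171) and about 54% over bc (241 vs 157). On the k6-2, ow over dmc (73 vs 62) comes to just under 18%.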

Since the earlier path tracing test may have shown ow struggling in win32, I want to port the rasterizer over and re-run the test. Not very difficult to do so I probably will. I also want to run the dos test on my pentium 90 and 486/50, especially the latter to probe 486 optimizations.

Reply 18 of 38, by vvbee

Rank Oldbie

I toggled some compiler switches to get a headless win32 console version of the rasterizer. Output was validated by testing a particular frame against a reference image - all compilers passed. The program ran for 20 seconds and spat out the number of frames it had generated in that time. The numbers below are an average over three runs.
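The harness described above can be sketched roughly like this: check one frame against a reference, then count frames for a fixed time. The render_frame() function is a made-up stand-in for the rasterizer, and the timing uses clock() only because it is available on all of these old compilers; this is not the actual test code.

```cpp
#include <cassert>
#include <cstring>
#include <ctime>

const int FRAME_BYTES = 320 * 200;

// Hypothetical stand-in for the rasterizer: fills the buffer with a
// pattern that depends only on the frame number, so output is repeatable.
static void render_frame(unsigned char *buf, long frame)
{
    for (int i = 0; i < FRAME_BYTES; ++i)
        buf[i] = static_cast<unsigned char>((i + frame) & 0xFF);
}

// Renders a chosen frame and compares it byte for byte with a reference.
bool frame_matches(long frame, const unsigned char *reference)
{
    static unsigned char buf[FRAME_BYTES];
    render_frame(buf, frame);
    return std::memcmp(buf, reference, FRAME_BYTES) == 0;
}

// Renders as many frames as possible in the given time and returns the
// count. clock() can be coarse on old runtimes, so longer runs give
// steadier numbers.
long count_frames(double seconds)
{
    assert(seconds > 0.0);
    static unsigned char buf[FRAME_BYTES];
    const std::clock_t start = std::clock();
    const std::clock_t span =
        static_cast<std::clock_t>(seconds * CLOCKS_PER_SEC);
    long frames = 0;
    while (std::clock() - start < span) {
        render_frame(buf, frames);
        ++frames;
    }
    return frames;
}
```

Calling count_frames(20.0) would mirror the 20-second runs, and averaging several calls mirrors the three-run average.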

This time I also included the optimizing compiler from vc++ 6.0 and the nonoptimizing one from vc++ 7.0. Gcc was set to generate for arch=pentium-mmx, others for the pentium, these being correspondingly the highest available options below the ppro. Gcc was v4.7.1 from mingw. I used borland c++ 5.5 here, as I did in the last win32 test. Overall, the results here agree with that test.

Machine: k6-2 300, windows 98.
- gcc: 3122 frames
- dmc: 2848
- vc7nop: 2529
- vc6: 2374
- bc: 2353
- ow: crash

Machine: athlon 64 1.2 ghz, windows 98.
- gcc: 26175 frames
- dmc: 22691
- vc6: 18515
- bc: 14926
- vc7nop: 14535
- ow: crash

Machine: haswell xeon 3.3 ghz, linux + wine.
- gcc: 164180 frames
- vc6: 140655
- dmc: 133807
- vc7nop: 104534
- ow: 101733
- bc: 70054

Gcc killed it, no surprise, with dmc being the second pick for win32 performance. The open watcom compiler had issues here, as the exe it made crashed (page fault) under win 98 on both machines and wasn't notably fast otherwise. I may debug the crash but possibly not; I'm not planning to use ow but am interested in whether it's an issue in the code. No benefit from vc6's optimizations over the nonoptimizing vc7 on the older cpu (vc7 was in fact about 6% ahead there), but about 30% elsewhere.

Reply 19 of 38, by vladstamate

Rank Oldbie

I just wanted to say that this is very good research you are doing. I wish you had something lower than a K6-2 300 MHz, like a Pentium (MMX) or even a 486. If you have Win98 executables I might be able to run them for you. Will Win95 be ok, or do your tests require 98-specific functions?

YouTube channel: https://www.youtube.com/channel/UC7HbC_nq8t1S9l7qGYL0mTA
Collection: http://www.digiloguemuseum.com/index.html
Emulator: https://sites.google.com/site/capex86/
Raytracer: https://sites.google.com/site/opaqueraytracer/