VOGONS

First post, by avatar_58

Rank Oldbie

In the words of Tim Taylor......EEUUURUUUGGGHH? 😳

How come I can now run DOSBox at amazingly high cycle counts without it even using my full CPU? I was running Duke3D at 50000 cycles without going over 50% CPU usage most of the time. Is this normal?

I could have sworn that before version 0.65 '34000' was my safe maximum, and I'm confused as to what has been optimized to allow such a large leap! I was running Blood in 800x600, for crying out loud.... 😳

I mean....was I doing something wrong before? Were the builds I was using not running at their full potential? This is incredible.....

--Hmm.... Build engine games can go well into the 50000s, while System Shock peaks at 40000.

Reply 1 of 8, by gulikoza

Rank Oldbie

What builds are you referring to? I always try to optimize mine as much as possible, but maybe Qbix has done a better job? But if you're comparing to 0.63.... 0.65 is an almost totally different beast 😀

http://www.si-gamer.net/gulikoza

Reply 2 of 8, by `Moe`

Rank Oldbie

Different games have different maximum values, that much is true. It depends on what instructions they use - some emulated instructions are slower than others, so games that use these "slower" instructions max out earlier.

Moreover, there are a lot of optimizations in 0.65 that weren't present before.

Finally, for anyone compiling the sources themselves: gcc-4.1 is pretty amazing, as far as I have seen so far. If you can, use it for compiling DOSBox.

Reply 3 of 8, by TeaRex

Rank Member
`Moe` wrote:

Finally, for anyone compiling the sources themselves: gcc-4.1 is pretty amazing, as far as I have seen so far. If you can, use it for compiling DOSBox.

I'm using 4.0.2. With that version, I found that profile-guided optimization adds a lot of speed. Run "configure" with CFLAGS="-O4 -fomit-frame-pointer -fprofile-generate", compile, run a bunch of your favourite games for a while (exercising all the cores if possible), then reconfigure with CFLAGS="-O4 -fomit-frame-pointer -fprofile-use", "make clean" and recompile. DESCENT, for example, is super smooth now, which it never was before on this 2.8 GHz P4 system.
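For anyone who wants to script that two-pass build, here is a minimal sketch, assuming it is run from the top of a DOSBox source tree with a working "configure". The helper names pgo_flags and pgo_build are made up for illustration. Since DOSBox is mostly C++, the sketch also passes the same flags as CXXFLAGS, which goes beyond the recipe as written above. It uses -O3, because gcc treats higher levels the same way.

```shell
#!/bin/sh
# Two-pass profile-guided build, following the recipe in this post.
# Assumptions: run from the top of a DOSBox source tree; pgo_flags and
# pgo_build are hypothetical helper names, not part of DOSBox or gcc.

BASE_FLAGS="-O3 -fomit-frame-pointer"

# Print the compiler flags for one pass; $1 is "generate" or "use".
pgo_flags() {
    printf '%s -fprofile-%s' "$BASE_FLAGS" "$1"
}

# Reconfigure with the flags for the given pass, then rebuild from scratch.
pgo_build() {
    flags=$(pgo_flags "$1")
    ./configure CFLAGS="$flags" CXXFLAGS="$flags" && make clean && make
}

# Pass 1 (instrumented build):   pgo_build generate
# ...then play a few games so the profile data gets written on exit...
# Pass 2 (optimized build):      pgo_build use
```

The two passes are left commented out so that nothing is built by accident; run them by hand, with a play session in between.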

tearex

Reply 4 of 8, by avatar_58

Rank Oldbie

I tried many different builds, including yours, gulikoza, and ykhwong's, but 0.65 seems to hit the jackpot with my PC for some reason. It's almost like I overclocked my PC. 😮

Reply 5 of 8, by Qbix

Rank DOSBox Author

To compile the Windows version, I use a rather old version of GCC, to be honest: 3.4.4.

It seems to work fine for me. I might upgrade it when we start testing a new version.

The -O4 option... I did hear about that, but didn't find documentation on what it does.
My build flags are a bit longer, though.

Water flows down the stream
How to ask questions the smart way!

Reply 6 of 8, by TeaRex

Rank Member
Qbix wrote:

The -O4 option... I did hear about that, but didn't find documentation on what it does.

Actually it's just a future-proof variant of "-O3", the highest optimization level that gcc currently supports. Anything higher than 3 gets treated like -O3.

tearex

Reply 7 of 8, by `Moe`

Rank Oldbie
TeaRex wrote:

CFLAGS="-O4 -fomit-frame-pointer -fprofile-generate"

Don't forget to set "-march=<your-cpu-type>". The resulting binary might not be usable on other machines, but it will include CPU-specific optimizations, which can make quite a difference. Moreover, "-O4" is useless: gcc only has "-O3" as its maximum. "-ffast-math" is also useful, and you might want to check if "-mfpmath=sse" improves things (CPU-dependent). For gcc-4.1, "-ftree-vectorize" is a new flag that is worth testing, although it might not even work; it is quite experimental.
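As a rough sketch of how these flags could be combined, the helper below builds a flag string from a CPU type plus any optional extras. The name tuned_flags is made up for illustration, and "pentium4" is only an example value matching the P4 mentioned earlier in the thread; substitute whatever your gcc accepts for your own CPU.

```shell
#!/bin/sh
# Compose CPU-specific compiler flags.
# Assumption: tuned_flags is a hypothetical helper, not part of gcc or DOSBox.
# $1 is the value for -march; any further arguments are appended as-is.
tuned_flags() {
    cpu=${1:?usage: tuned_flags <cpu-type> [extra flags...]}
    shift
    flags="-O3 -fomit-frame-pointer -march=$cpu"
    for extra in "$@"; do
        flags="$flags $extra"
    done
    printf '%s\n' "$flags"
}

# Example (not run here):
#   FLAGS=$(tuned_flags pentium4 -mfpmath=sse)
#   ./configure CFLAGS="$FLAGS" CXXFLAGS="$FLAGS"
```

Adding the optional flags one at a time and comparing the cycle count you can reach after each build is probably the easiest way to tell which of them actually help on a given machine.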

Reply 8 of 8, by `Moe`

Rank Oldbie
TeaRex wrote:

Actually it's just a future-proof variant of "-O3", the highest optimization level that gcc currently supports. Anything higher than 3 gets treated like -O3.

From the docs:
`-O2'
Optimize even more. GCC performs nearly all supported
optimizations that do not involve a space-speed tradeoff.

Which means "-O3" enables _all_ speed-related optimizations. There won't be any (future) optimization level -O4, since -O3 is constantly modified to include the latest-and-greatest (and stable) optimizations.
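If you would rather check this on your own compiler than take anyone's word for it, one possible approach is to ask gcc which optimizer flags it enables at each level and compare the two reports. This is only a sketch: it assumes a gcc recent enough to support "-Q --help=optimizers" (the versions discussed in this thread are too old for that), and opt_report and same_opt_level are made-up helper names.

```shell
#!/bin/sh
# Compare the optimizer settings gcc reports for two -O levels.
# Assumptions: a gcc that supports "-Q --help=optimizers"; opt_report and
# same_opt_level are hypothetical helper names.

# Print the per-flag lines (-f...) with their enabled/disabled state.
opt_report() {
    "${CC:-gcc}" -Q --help=optimizers "-O$1" 2>/dev/null | grep -e '-f'
}

# Returns 0 if both levels report the same settings, 1 if they differ,
# and 2 if no usable report could be produced (gcc missing or too old).
same_opt_level() {
    a=$(opt_report "$1") || return 2
    b=$(opt_report "$2") || return 2
    [ "$a" = "$b" ]
}

# Example (not run here):
#   same_opt_level 3 4 && echo "-O4 enables the same flags as -O3"
```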