Reply 20 of 28, by gulikoza
- Rank
- Oldbie
Doesn't -msse have to be enabled as well for -mfpmath=sse to work? But in my tests, sse math doesn't improve performance much (if at all)...dosbox is not that fpu intensive.
you don't have to have -msse if you use -march with an appropriate cpu w/sse since it will be automatically turned on.
as for -mfpmath, i've seen improvements with other stuff so it's become a standard flag with anything i compile.
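One way to check the -march / -msse point yourself is to ask gcc which macros it predefines for a given set of flags. This is only a sketch: it assumes a gcc that can target 32-bit x86 is on your PATH, and `sse_macros` is a made-up helper name, not part of any tool.

```shell
# Hypothetical helper: list the SSE-related macros gcc predefines for the
# given flags. -dM -E dumps the predefined macros without compiling anything.
sse_macros() {
    gcc "$@" -dM -E -x c /dev/null | grep -E '__SSE(_MATH)?__' \
        || echo "no SSE macros defined"
}

# A CPU without SSE, a CPU with SSE, and the same CPU with SSE fpu math.
sse_macros -m32 -march=i586
sse_macros -m32 -march=athlon-xp
sse_macros -m32 -march=athlon-xp -mfpmath=sse
```

If `__SSE__` shows up for the -march line without -msse being passed, the arch flag has turned SSE on by itself. `__SSE_MATH__` should only appear once -mfpmath=sse is actually in effect.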
hey thnx for the heads up on dosbox & the sse math, it's been too long since i've tried using other cflag options. since i had some free time this evening i decided to rebench with some different cflag options and the one that produced the fastest code was:
configure CFLAGS="-g -O3 -pipe -fomit-frame-pointer -funroll-loops -ffast-math" CXXFLAGS="-g -O3 -pipe -fomit-frame-pointer -funroll-loops -ffast-math"
i used the .65 source, and this gave about a ~10% performance increase over the official release as i dropped the .exe from the official release into my testfolder for comparison.
of course my benching is unscientific, and should be taken with a grain of salt, a lime, and a shot of tequila.
which version of gcc did you use ?
-pipe doesn't influence speed.
You didn't specify an arch or a cpu
try: -march=i586 -mtune=i686 as additional flags
It's a bit odd that you get 10 % increase from your own judgement as the flags are quite similar to ones I used (I have only 1 flag different except for the arch/tune flag)
Be sure to configure dosbox like this
./configure --enable-core-inline
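Put together with the flags from the earlier build, the whole invocation might look something like the sketch below. Nobody in the thread posted it in this exact form, so treat the combination as an assumption; `OPT_FLAGS` is just a variable name picked for the example.

```shell
# Flags from the earlier post (without -pipe, which only changes how the
# compiler passes data between stages) plus the suggested arch/tune pair.
# This exact combination is an assumption, not a tested recommendation.
OPT_FLAGS="-g -O3 -fomit-frame-pointer -funroll-loops -ffast-math -march=i586 -mtune=i686"

# Print the command for review. Drop the leading 'echo' to really run it
# from the top of the dosbox source tree.
echo ./configure --enable-core-inline CFLAGS="$OPT_FLAGS" CXXFLAGS="$OPT_FLAGS"
```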
Water flows down the stream
How to ask questions the smart way!
for dosbox i use gcc-3.4.2 as mentioned in an earlier post as it produces the fastest version of dosbox for me (i've benched dosbox with gcc-3.4.2, 3.4.5, 4.0.2, 4.0.3, and 4.1.0).
i used -march=athlon-xp as one of my flags when i rebenched that time around, but with -march=athlon-xp the speed was about the same as the official binary. i'll try to compile and bench another build with the march/mtune you mentioned instead of athlon-xp and see what that does.
using "--enable-core-inline" put my builds using gcc-3.4.2 and gcc-4.1.0 and yours all within close to the same speed. i guess forgetting to add that is what gave mine the ~10% speed increase. i guess i can now ditch gcc-3.4.2 and stick with only using 4.1.0 since i only kept 3.4.2 just for dosbox, thanx for the info.
hmm well that is interesting. maybe your cpu has a small cache, and the larger cpu core (larger because of the inlined memory functions) made it slower.
We might need to benchmark that again.
i don't think it's my cache:
i've deleted the builds already, but if you are interested i could rebuild them and send them to you.