VIDEO Patch for pixel-perfect scaling (SDL1)

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-06 @ 08:39

krcroft wrote:Unther,

Does your build solve the issue mentioned here?

http://www.vogons.org/viewtopic.php?f=4 ... 20#p592089

My build was based on r2025 and the current pixel perfect patch at the time, and I haven't checked to see if performance has improved since then (I'm running profile-feedback built with gcc 7.x)
You must have mistyped the revision. DOSBox was already well past revision 3000 at the inception of the patch. The patch has become faster, but the speedup may not be sufficient for your situation, so please compile it with -O3 or higher and test. Edit: Implementing pixel-perfect scaling as a pixel shader or via the hardware-accelerated scaling functionality of SDL 2.0 will solve all performance problems.
Last edited by Ant_222 on 2017-9-06 @ 08:53, edited 3 times in total.
Ant_222
Member
 
Posts: 348
Joined: 2010-7-24 @ 21:29

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-06 @ 08:50

lukeman3000 wrote:
Ant_222 wrote:
Giuliano wrote:What I thought was wrong was the resulting aspect ratio of the games with 640x400 graphics. But now I see that, for surfacepp to bring those graphics closer to 4:3, it would require a 3200x2400 full screen area.
Yes, except that 3200 x 2400 will give you precisely a 4:3 aspect ratio, whereas 2560 x 2000 will be nearly perfect.
Wouldn't 1600x1200 give him a unity PAR as well as a 4:3 aspect ratio?
Yes, it would, but 640x400 cannot be scaled to this resolution in a pixel-perfect manner:
Code: Select all
640 x 400 -> [2.5 x 3] -> 1600 x 1200,
where the horizontal scale 2.5 is not an integer.
I was under the impression that 1600x1200 is the lowest resolution for which you will have a truly perfect PAR and aspect ratio for games with a native resolution of 320x200 and 1.20 PAR. To my understanding, it is as follows:

For games with native resolution of 320x200 and 1.20 PAR
960x800 = 89% perfect PAR; 92% perfect aspect ratio
1280x1000 = 96% perfect PAR; 99% perfect aspect ratio
1600x1200 = 100% perfect PAR; 100% perfect aspect ratio
Correct.
Assuming I'm understanding all this correctly so far, I am still confused by something.
You first quote my explanation about 640x400 but then switch to 320x200.
Why is 4:3 the correct aspect ratio when the native resolution's aspect ratio (320x200) is 1.6?
Because some games are intended to be displayed using rectangular pixels. Some other games running at 320x200, such as Lure of the Temptress, are designed for square pixels. These should be played without aspect-ratio correction at 1600x1000 or at another proportional resolution.
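
The arithmetic in this exchange can be checked with a short script. The following is a minimal sketch, not code from the patch; the helper name `integer_scale` is hypothetical. It tests whether a source mode fits a target resolution with whole-number scale factors and, if so, reports the resulting pixel aspect ratio (taken here as pixel height over pixel width).

```python
from fractions import Fraction

def integer_scale(src_w, src_h, dst_w, dst_h):
    """Return (sx, sy, par) if dst is an exact integer multiple of src, else None.

    par is the pixel aspect ratio, taken here as pixel height / pixel width,
    which equals sy / sx for integer scaling.
    """
    if dst_w % src_w or dst_h % src_h:
        return None  # at least one scale factor would be fractional
    sx, sy = dst_w // src_w, dst_h // src_h
    return sx, sy, Fraction(sy, sx)

print(integer_scale(640, 400, 1600, 1200))  # None: horizontal scale would be 2.5
print(integer_scale(320, 200, 1600, 1200))  # (5, 6, Fraction(6, 5)): PAR 1.2
print(integer_scale(320, 200, 1280, 1000))  # (4, 5, Fraction(5, 4)): PAR 1.25
print(integer_scale(320, 200, 1600, 1000))  # (5, 5, Fraction(1, 1)): square pixels
```

Exact fractions are used so that a PAR of 6/5 compares equal to 1.2 without floating-point rounding.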

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-06 @ 09:03

lukeman3000 wrote:Assuming I'm understanding all this correctly so far, I am still confused by something. Why is 4:3 the correct aspect ratio when the native resolution's aspect ratio (320x200) is 1.6?
Note that 4:3 is the ratio of the width and height of the image as displayed on the monitor, whereas 1.6 (or 8:5) is the ratio of the pixel dimensions, which in the case of non-square pixels differs from that of the image dimensions. For a PAR of 1.2:
Code: Select all
(320 / 200) / 1.2 = 4 / 3
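
The same relation can be written as a small function. This is only an illustrative sketch; `display_aspect` is a hypothetical name, and the PAR is taken as pixel height over pixel width, as in the formula above.

```python
from fractions import Fraction

def display_aspect(width_px, height_px, par):
    """Aspect ratio of the displayed image for a given pixel aspect ratio.

    par is pixel height / pixel width, so taller pixels make the image
    narrower relative to its height.
    """
    return Fraction(width_px, height_px) / Fraction(par)

print(display_aspect(320, 200, Fraction(6, 5)))  # 4/3
print(display_aspect(320, 200, 1))               # 8/5
```

With square pixels (PAR 1) the displayed ratio is simply the 8:5 ratio of the pixel dimensions.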

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-06 @ 10:15

unther wrote:After testing on my desktop, I wanted to get it running on my Raspberry Pi 3 (RPI3). However, I ran into compiling problems related to OpenGL calls. From what I understand, the RPI3 GPU driver doesn't support standard OpenGL, so SDL functions related to OpenGL aren't available.

Normally, when compiling dosbox for use on an RPI3, the lack of OpenGL is worked around by passing "--disable-opengl" to the dosbox configure script, which uses the C_OPENGL preprocessor directive to disable OpenGL calls. Unfortunately, your new code for handling opengl surfaces is not wrapped in C_OPENGL #ifdefs, so it's not being disabled with "--disable-opengl" and hence causes compile-time errors.
I will fix it and ask you to test the patch again.

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-06 @ 20:05

unther, please test whether the conditional compilation of OpenGL support is correct in the attached patch.

Re: Patch for pixel-perfect scaling

Postby unther » 2017-9-07 @ 02:32

krcroft wrote:Unther,

Does your build solve the issue mentioned here?

http://www.vogons.org/viewtopic.php?f=4 ... 20#p592089


krcroft, unfortunately the wolf3d menu fade effect is still causing audio stutter on my build when using output=surfacepp. The game itself seems to run fine though as long as you fix the cycles to around 4000. (At around 5000 cycles and higher the dosbox process starts to max out one of the cores.)

I was curious about this, so I did a bit more testing. The first thing I noticed was that even when using output=surface, my build of dosbox was far slower than the RetroPie packaged dosbox (i.e. my dosbox build would max out a CPU core at a fixed 10000 cycles with wolf3d, whereas the RetroPie dosbox binary could do 20000 cycles or higher). From looking at the RetroPie dosbox build script (https://github.com/RetroPie/RetroPie-Setup/blob/master/scriptmodules/emulators/dosbox.sh), I noticed the following changes were being made to config.h:

Code: Select all
    if isPlatform "arm"; then
        # enable dynamic recompilation for armv4
        sed -i 's|/\* #undef C_DYNREC \*/|#define C_DYNREC 1|' config.h
        if isPlatform "armv6"; then
            <snip>
        else
            sed -i 's/C_TARGETCPU.*/C_TARGETCPU ARMV7LE/g' config.h
            sed -i 's|/\* #undef C_UNALIGNED_MEMORY \*/|#define C_UNALIGNED_MEMORY 1|' config.h
        fi
    fi


After making those three changes to my config.h and rebuilding, my dosbox build performance was roughly doubled when using output=surface (to basically the same performance as the original RetroPie build). However, these config.h changes had no noticeable effect when using output=surfacepp - it was still limited to about 4000 cycles and still had audio stuttering during fade effects.

My next thought was that maybe the RPI3's CPU simply can't handle drawing a 1280x1000 surface in a single thread along with all the other emulation demands of dosbox. The default RetroPie dosbox config just draws the original 320x200 surface and lets the hardware scaler do the rest of the work, so using output=surfacepp on a 1080p display is by comparison drawing 20x (4x5y) the number of pixels. I thought a good way to confirm this was to patch the normal5x scaler into my build and see if it had similar performance issues (as far as I know, normal5x is done purely in software and would actually draw even more pixels than the previous test: 25x vs 20x).

To my surprise, the normal5x scaler runs wolf3d perfectly fine on an RPI3 without any audio stutter at all. I was even able to bump the cycles up to 15000+ without maxing out a CPU core. I'm not sure why surfacepp's 4x5y scaling would run so much slower than normal5x's 5x5y scaling - is this even a valid comparison? Ant, would you expect your 5x5y algorithm to perform similarly to normal5x?
unther
Newbie
 
Posts: 5
Joined: 2017-9-05 @ 15:59

Re: Patch for pixel-perfect scaling

Postby unther » 2017-9-07 @ 03:06

Ant_222 wrote:unther, please test whether the conditional compilation of OpenGL support is correct in the attached patch.


Ant, the attached patch works correctly with --disable-opengl - thanks!

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-07 @ 08:57

unther wrote:Ant, would you expect your 5x5y algorithm to perform similarly to normal5x?
I know it doesn't, but I do not understand the implementation of the normalnx scalers so I cannot tell you now whence the difference. I will think about it.

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-07 @ 21:45

I have measured the performance of my scaling routine on my PC with AMD A4 3400 at the scaling of a 320x200 image to 1600x1200. Here are the results for different optimisation levels:
Code: Select all
-O0: 134 fps
-O1: 201 fps
-O2: 395 fps
-O3: 504 fps
They look pretty good for practical purposes. I wonder why it works slower when compiled as part of DOSBox. Whosoever shall desire to try the test on his own machine, let him download the source from the attachment to this post. The program outputs the result in this form:
Code: Select all
Scaling 320x200 -> 1600x1200 at 504 fps.
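
For readers without access to the attachment, the kind of operation being timed can be sketched as follows. This is not the author's routine or test program (neither is reproduced in this thread); it is a plain nearest-neighbour integer upscale with hypothetical names, written in Python for brevity, so its speed says nothing about the compiled figures above.

```python
def scale_integer(src, sx, sy):
    """Upscale a 2-D list of pixels by integer factors sx (horizontal) and sy (vertical).

    Each source pixel becomes an sx-by-sy block of identical pixels, so no
    interpolation or blurring is introduced.
    """
    out = []
    for row in src:
        # Repeat every pixel sx times to build one widened row.
        wide = [pixel for pixel in row for _ in range(sx)]
        # Emit that widened row sy times (copies, so rows stay independent).
        out.extend(list(wide) for _ in range(sy))
    return out

frame = [[1, 2],
         [3, 4]]
for row in scale_integer(frame, 2, 3):
    print(row)
```

Scaling 320x200 to 1600x1200 corresponds to `sx = 5` and `sy = 6` in this sketch.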

Re: Patch for pixel-perfect scaling

Postby unther » 2017-9-08 @ 01:57

Ant_222 wrote:Whosoever shall desire to try the test on his own machine, let him download the source from the attachment to this post.


Ant,

I'm getting the following results with your test program running on a Raspberry Pi 3 (Quad-core ARM Cortex-A53 @ 1.2 GHz):

Code: Select all
  -O0: Scaling 320x200 -> 1600x1200 at 24 fps.
  -O1: Scaling 320x200 -> 1600x1200 at 31 fps.
  -O2: Scaling 320x200 -> 1600x1200 at 32 fps.
  -O3: Scaling 320x200 -> 1600x1200 at 32 fps.


I think DOSBox normally wants to render at 60 or 70 fps (depending on the video mode), so I guess that explains the slowdowns on full-screen fade effects.

Re: Patch for pixel-perfect scaling

Postby ZakMcKracken » 2017-9-23 @ 11:15

Hi, I recently searched for better rendering output and stumbled on this thread. I have been building my own debug version for Linux (Arch) from SVN for some time now (http://svn.code.sf.net/p/dosbox/code-0/dosbox rev. 4052); patch 14 applied OK and looks beautiful!

Are there any plans to integrate this patch into other builds (or even the official ones; is anyone there still alive? :dead: )?
I am dreaming of a working git repo with continuous builds in the future; the only ones I found did not want to compile because of reasons :depressed:

Running the benchmark yields the following results on my Core i5-4690 (3.9 GHz Turbo) (had to append the -lm flag in your Makefile or it would not build)

Code: Select all
-O0 668 fps
-O1 902 fps
-O2 1506 fps (3 run average)
-O3 1322 fps (3 run average)


so building dosbox with default -O2 it is :-D

Thanks again for this great scaler!

EDIT:
I also have a Raspberry Pi 3 (Arch ARMv7 kernel, stock clock of 1.2 GHz). Benchmarks:
Code: Select all
-O0 43 fps
-O1 74 fps
-O2 80 fps
-O3 81 fps
ZakMcKracken
Newbie
 
Posts: 9
Joined: 2005-4-30 @ 23:46

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-23 @ 12:09

ZakMcKracken wrote:Are there any plans to integrate this patch into other builds (or even the official ones; is anyone there still alive?)
I should be glad if there were and should help all I could.
Running the benchmark yields the following results on my Core i5-4690 (3.9 GHz Turbo) (had to append the -lm flag in your Makefile or it would not build)
What does the flag do?—I did not find it in the documentation.
Code: Select all
-O0 668 fps
-O1 902 fps
-O2 1506 fps (3 run average)
-O3 1322 fps (3 run average)
so building dosbox with default -O2 it is :-D
Strange deterioration—seems to depend upon the CPU type. But your numbers are heart-warming.
I also have a Raspberry Pi 3 (Arch ARMv7 kernel, stock clock of 1.2 GHz). Benchmarks:
Code: Select all
-O0 43 fps
-O1 74 fps
-O2 80 fps
-O3 81 fps
What makes your RPi so much faster than unther's?

Everybody is welcome to help me optimise the scaling algorithm. Should anybody know how the built-in normalnx scalers work and why they perform many times better than my routine, I will be grateful for an explanation.

Re: Patch for pixel-perfect scaling

Postby ZakMcKracken » 2017-9-23 @ 17:32

Ant_222 wrote:What does the flag do?—I did not find it in the documentation.


When searching for why some math.h functions caused build errors, I only found:
"That's a linker option. It tells the linker to link with (-l) the m library (libm.so/dll). That's the math library. You often need it if you #include <math.h>."
So I have no idea why it's needed on my setup :blah:

What makes your RPi so much faster than unther's?


Arch Linux has very recent kernels (maybe with optimizations for ARMv7 cores?). I am also running with the VC4 video core enabled, so maybe it is offloading some work from the CPU and freeing up resources. But dosbox is not very fast to begin with: benchmarking Doom and Quake (super-small-window benchmarks using Phil's Computer Lab benchmark package) shows that it's less than 1/2 the speed, ugh.

BUT on my Core i5 it's actually faster:

Code: Select all
normal3x / overlay stock 0.74 dosbox
Doom (35*gameticks/realticks formula): 93 fps
Quake: 31.1 fps
PC Player Benchmark (640x480): 43.2

surface/no aspect and scaler none stock 0.74 dosbox build (fastest code path?):
Doom: 139 fps
Quake: 41.5 fps
PC Player: 53 fps

surfacepp:
Doom: 121 fps
Quake: 39.2
PC Player: 46.1 fps


So it's a faster renderer for me :cool: , almost as fast as doing nothing at all to the picture. Did you by any chance write code that is super-optimized for some SSE instruction set by accident? :lol:

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-24 @ 16:02

Thanks for the testing, Zak. I am glad my patch performs well on your machine. To make the results more accurate, could you use surface instead of overlay and make sure that the scaler uses the same scale as the pixel-perfect mode? Comparing normal3x with, say, 4x5 pixel-perfect scaling is wrong without the introduction of a correction coefficient of 9/20. In other words, we are interested in pixels per second, rather than frames per second.
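
The correction can be made explicit by converting each frame rate to output pixels per second. The following is a minimal sketch with hypothetical function names; the frame rates in the usage lines are made-up placeholders, not measurements from this thread.

```python
from fractions import Fraction

def output_pixels_per_second(fps, src_w, src_h, sx, sy):
    """Output pixels written per second when a src_w x src_h frame is scaled by sx x sy."""
    return fps * src_w * sx * src_h * sy

def correction_coefficient(scale_a, scale_b):
    """Ratio of output pixels per frame for scale_a versus scale_b, each an (sx, sy) pair."""
    return Fraction(scale_a[0] * scale_a[1], scale_b[0] * scale_b[1])

# normal3x writes 3*3 = 9 output pixels per source pixel; 4x5 writes 20.
print(correction_coefficient((3, 3), (4, 5)))  # 9/20

# Placeholder frame rates: at equal fps the 4x5 mode moves more pixels.
print(output_pixels_per_second(100, 320, 200, 3, 3))  # 57600000
print(output_pixels_per_second(100, 320, 200, 4, 5))  # 128000000
```

Two scalers are doing comparable work only when their pixels-per-second figures match, which is why equal scale factors make the frame rates directly comparable.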

Re: Patch for pixel-perfect scaling

Postby ZakMcKracken » 2017-9-29 @ 03:21

Ant_222 wrote:To make the results more accurate, could you use surface instead of overlay and make sure that the scaler uses the same scale as the pixel-perfect mode


Here you go. I also included an adjusted PC Player benchmark with 2x2 scale / unscaled for raw throughput comparison.

Code: Select all
surface normal3x aspect=false
Doom 133fps
Quake 40.6
PC Player 52.6 fps (runs at 640x480 and is not scaled)

scalerpp windowsize=960x600
Doom 130 fps (320x200 -> 960x600) 3x3
Quake 39.7 (320x200 -> 960x600) 3x3
Pc Player 50.0 (640x480 ->  640x480) 1x1

surface normal2x forced
PC Player 50.1 fps (1280x960)

scalerpp windowsize=desktop (1920x1080)
PC Player 46 fps 640x480 --> 1280x960 2x2 (same resulting scale and result as in my previous run)
Image
User avatar
ZakMcKracken
Newbie
 
Posts: 9
Joined: 2005-4-30 @ 23:46

Re: Patch for pixel-perfect scaling

Postby Ant_222 » 2017-9-29 @ 08:59

Those are excellent results, Zak, thank you. I still fail to understand, however, why on unther's RPi wolf3d works fast with normal5x but stutters with surfacepp...

Re: Patch for pixel-perfect scaling

Postby ZakMcKracken » 2017-10-01 @ 12:34

Ant_222 wrote:Those are excellent results, Zak, thank you. I still fail to understand, however, why on unther's RPi wolf3d works fast with normal5x but stutters with surfacepp...


Playing around with the CFLAGS and -O3, I only seem to get decent performance when using some form of OpenGL output so far, but I didn't patch anything for SDL2 or similar. I have to check what the best output for stock dosbox is first. Performance is all over the place, and overlay completely locks up dosbox, or is so slow that I never see the start of any benchmark...

Re: VIDEO Patch for pixel-perfect scaling (SDL1)

Postby MastaG » 2017-11-01 @ 10:19

Can anyone update the alpha 14 patch to apply against r4063?
There are too many rejects for me to fix.
MastaG
Newbie
 
Posts: 10
Joined: 2010-12-23 @ 12:22

Re: VIDEO Patch for pixel-perfect scaling (SDL1)

Postby Ant_222 » 2017-11-01 @ 10:29

This is what I feared. I will look into it at the weekend. Edit: Yesterplay80 has managed to apply it to 4063.

Re: VIDEO Patch for pixel-perfect scaling (SDL1)

Postby Yesterplay80 » 2017-11-02 @ 07:11

Ant_222 wrote:This is what I feared. I will look into it at the weekend. Edit: Yesterplay80 has managed to apply it to 4063.

Actually, not that much has changed. It's just a matter of replacing all instances of "#if (HAVE_DDRAW_H) && defined(WIN32)" with "#if C_DDRAW". That's because of the change already introduced in r4056 that moved the ddraw detection to a configure option. Apart from that, only two line definitions for dosbox.cpp changed, but the patch command should get around that by itself. However, here's a fixed patch that works flawlessly with r4063:
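
The substitution described here could be scripted. The sketch below is an illustration under assumptions, not the procedure actually used to produce the fixed patch: it performs a literal string replacement on the text of a patch or source file, and the function name is hypothetical.

```python
OLD_GUARD = "#if (HAVE_DDRAW_H) && defined(WIN32)"
NEW_GUARD = "#if C_DDRAW"

def update_ddraw_guard(text):
    """Replace every occurrence of the old DirectDraw guard with the new one.

    Returns the updated text and the number of replacements made.
    """
    count = text.count(OLD_GUARD)
    return text.replace(OLD_GUARD, NEW_GUARD), count

# A made-up fragment in unified-diff style, used only to exercise the function.
sample = (
    "+#if (HAVE_DDRAW_H) && defined(WIN32)\n"
    "+\t/* DirectDraw-specific code */\n"
    "+#endif\n"
)
updated, n = update_ddraw_guard(sample)
print(n)  # 1
print(updated)
```

The sketch does not touch hunk line numbers; as noted above, the patch command is expected to cope with the shifted lines in dosbox.cpp on its own.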

pixel-perfect-alpha14_fixed.diff
My full-featured DOSBox SVN builds (without debugger) for Windows: Vanilla DOSBox and DOSBox ECE (Enhanced Community Edition)
Yesterplay80
Member
 
Posts: 261
Joined: 2016-2-23 @ 11:02
Location: Germany
