VIDEO Patch for pixel-perfect scaling (SDL1)

Reply 520 of 733, by pantercat

Posted on 2018-09-23, 17:14

Rank: Newbie
Posts: 60
Joined: 2018-09-06, 17:22

It works!!

I've downloaded DOSBox SVN r4163, applied the latest pixel-perfect patch ([2018-09-22] Alpha 14), and compiled DOSBox with:

patch -p0 <pixel-perfect-alpha14-4163.patch && cd src && ./autogen.sh && ./configure && make

Now I can run dosbox with surfacepp 😀 Thank you

Reply 521 of 733, by Ant_222

Posted on 2018-09-23, 21:00

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

pantercat wrote:
It works!!

I've downloaded DOSBox SVN r4163, applied the latest pixel-perfect patch ([2018-09-22] Alpha 14), and compiled DOSBox with:

patch -p0 <pixel-perfect-alpha14-4163.patch && cd src && ./autogen.sh && ./configure && make

Now I can run dosbox with surfacepp :) Thank you

I am glad it does, but we should thank aaronp for finding the bug and blame me for introducing it :-)

Reply 522 of 733, by Yesterplay80

Posted on 2018-09-25, 06:42

Rank: Oldbie
Posts: 540
Joined: 2016-02-23, 11:02
Location: Germany

A new ECE build including the fixed patch is available as well; please check whether it now works as it should.

My full-featured DOSBox SVN builds for Windows & Linux: Vanilla DOSBox and DOSBox ECE (Google Drive Mirror)

Reply 523 of 733, by pantercat

Posted on 2018-09-25, 21:33

Rank: Newbie
Posts: 60
Joined: 2018-09-06, 17:22

I've compiled ECE r4163 (Linux source) and it works for me. Thank you all!

Reply 524 of 733, by unther

Posted on 2018-11-13, 17:18

Rank: Newbie
Posts: 8
Joined: 2017-09-05, 15:59

Ant_222 wrote:
unther wrote:
Ant, would you expect your 5x5y algorithm to perform similarly to normal5x?
I know it doesn't, but I do not understand the implementation of the normalnx scalers so I cannot tell you now whence the difference. I will think about it.

Ant_222, just a follow-up to our performance discussion a year ago: I created a patch (attached) that adds a normal4x5y scaler to DOSBox, which uses the same scaling technique as the built-in normal2x/normal3x. Using this normal4x5y scaler along with 'output=surface' is much faster on an RPi3 than using 'output=surfacepp', while producing exactly the same pixel output @1080p. I don't have an effective way to benchmark the two methods, but normal4x5y can handle the Wolf3D fade effect without stutter on the RPi3, even at 2 or 3 times the cycle count that surfacepp stutters at.

I'm not sure how the normalNx scalers differ from the way surfacepp does scaling, but maybe you'll be able to see something here that'll give you a hint.

Note: I actually created this patch a year ago against r4025 and just never got around to posting it, but it still applies and works with r4163.

Reply 525 of 733, by krcroft

Posted on 2018-11-13, 18:17

Rank: Oldbie
Posts: 589
Joined: 2017-04-29, 15:07

This is an excellent addition for those on single-GHz systems (early Pentium 3s, RPi3, etc.) to get as close as possible to crisp pixels on modern displays. Thank you!

Reply 526 of 733, by Ant_222

Posted on 2018-11-13, 20:01

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

unther, I never understood how those built-in scalers worked. Can you explain, perhaps?

Reply 527 of 733, by unther

Posted on 2018-11-14, 00:47

Rank: Newbie
Posts: 8
Joined: 2017-09-05, 15:59

krcroft wrote:
This is an excellent addition for those on single-GHz systems (early Pentium 3s, RPi3, etc.) to get as close as possible to crisp pixels on modern displays. Thank you!

Not a problem, but just to clarify for others, because 'normal4x5y' is a fixed scaler, it's only optimal for running 320x200 games on a display with a height of just over 1000 pixels (e.g. 1280x1024, 1680x1050, 1920x1080). For displays with a pixel height of 1200 or more, you'd want to create a patch to add a 'normal5x6y' scaler.

Since I run RetroPie on my RPi3, I just use game-specific configs to set the optimal scaler. I have my RPi3 connected to a 1280x1024 monitor, so for 320x200 games I'll use 'normal4x5y', and for 640x480 games I'll use 'normal2x'. (And I just use Ant's pixel-perfect patch on my desktop, which is connected to a 1920x1200 monitor.)
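
To make the arithmetic behind those choices concrete, here is a minimal sketch, assuming the picture should end up roughly 4:3 on a square-pixel display. It is not code from DOSBox or from either patch; `ScaleFactors` and `pick_factors` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct ScaleFactors { int x; int y; };

// Largest integer factors for a src_w x src_h mode that should appear as a
// 4:3 picture on a dst_w x dst_h display. Hypothetical helper, not DOSBox code.
ScaleFactors pick_factors(int src_w, int src_h, int dst_w, int dst_h) {
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
    // Height-to-width ratio of one source pixel when the mode fills 4:3
    // (1.2 for 320x200, 1.0 for 640x480).
    const double pixel_aspect = (3.0 * src_w) / (4.0 * src_h);
    // Start from the tallest factor that fits and step down until the
    // matching horizontal factor fits the display width as well.
    for (int sy = std::max(1, dst_h / src_h); sy >= 1; --sy) {
        const int sx = std::max(1, static_cast<int>(std::lround(sy / pixel_aspect)));
        if (src_w * sx <= dst_w) return {sx, sy};
    }
    return {1, 1};
}
```

With these assumptions, 320x200 on a 1280x1024 or 1920x1080 display gives 4x5, 320x200 on 1920x1200 gives 5x6, and 640x480 on 1280x1024 gives 2x2, which matches the scalers named above.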

Reply 528 of 733, by unther

Posted on 2018-11-14, 02:01

Rank: Newbie
Posts: 8
Joined: 2017-09-05, 15:59

Ant_222 wrote:
unther, I never understood how those built-in scalers worked. Can you explain, perhaps?

Unfortunately, I don't have any real understanding of how they work either (or really any familiarity with the DOSBox code base). I just used the existing normalNx scalers as a template to create a normal4x5y without delving into how these scalers actually push pixels to the display.

I just took a look now, but the code is hard for me to follow due to the nested includes and conditional preprocessor directives being used to reduce code duplication. It looks like the actual scaling code is in render_simple.h, which is itself included multiple times by render_templates.h, once in each scaler definition. render_templates.h is in turn included multiple times by render_scalers.cpp, apparently once for each variation of color bit depth.

At its heart, it looks like these scalers just copy a source pixel into a grid/block for output. Here's the definition from render_templates.h for the built-in normal3x and the normal4x5y that I created from it. Note that SCALERFUNC is then inserted into the code included from render_simple.h.

#define SCALERNAME      Normal3x
#define SCALERWIDTH     3
#define SCALERHEIGHT    3
#define SCALERFUNC              \
        line0[0] = P;           \
        line0[1] = P;           \
        line0[2] = P;           \
        line1[0] = P;           \
        line1[1] = P;           \
        line1[2] = P;           \
        line2[0] = P;           \
        line2[1] = P;           \
        line2[2] = P;
#include "render_simple.h"
#undef SCALERNAME
#undef SCALERWIDTH
#undef SCALERHEIGHT
#undef SCALERFUNC

#define SCALERNAME      Normal4x5y
#define SCALERWIDTH     4
#define SCALERHEIGHT    5
#define SCALERFUNC              \
        line0[0] = P;           \
        line0[1] = P;           \
        line0[2] = P;           \
        line0[3] = P;           \
        line1[0] = P;           \
        line1[1] = P;           \
        line1[2] = P;           \
        line1[3] = P;           \
        line2[0] = P;           \
        line2[1] = P;           \
        line2[2] = P;           \
        line2[3] = P;           \
        line3[0] = P;           \
        line3[1] = P;           \
        line3[2] = P;           \
        line3[3] = P;           \
        line4[0] = P;           \
        line4[1] = P;           \
        line4[2] = P;           \
        line4[3] = P;
#include "render_simple.h"
#undef SCALERNAME
#undef SCALERWIDTH
#undef SCALERHEIGHT
#undef SCALERFUNC

That's as far as I went - hopefully that points you in the right direction.
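
For readers who want the idea without the macro layers, the following is a rough stand-alone sketch of what those unrolled assignments appear to amount to: every source pixel P is written into a W x H block of the output. This is an illustration only, assuming 32-bit pixels in a flat buffer; `scale_block` is a hypothetical name, and the real DOSBox code is generated through render_simple.h and may do more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Replicate every source pixel into a W x H block of the output image.
// Hypothetical stand-alone helper; DOSBox builds its variants with macros.
template <int W, int H>
std::vector<std::uint32_t> scale_block(const std::vector<std::uint32_t>& src,
                                       int src_w, int src_h) {
    assert(src_w > 0 && src_h > 0);
    assert(src.size() == static_cast<std::size_t>(src_w) * src_h);
    const std::size_t dst_w = static_cast<std::size_t>(src_w) * W;
    std::vector<std::uint32_t> dst(dst_w * src_h * H);
    for (int y = 0; y < src_h; ++y) {
        for (int x = 0; x < src_w; ++x) {
            const std::uint32_t P = src[static_cast<std::size_t>(y) * src_w + x];
            for (int j = 0; j < H; ++j) {  // plays the role of line0 .. line(H-1)
                std::uint32_t* line =
                    &dst[(static_cast<std::size_t>(y) * H + j) * dst_w +
                         static_cast<std::size_t>(x) * W];
                for (int i = 0; i < W; ++i) line[i] = P;  // lineJ[i] = P
            }
        }
    }
    return dst;
}
```

Calling `scale_block<4, 5>(frame, 320, 200)` would produce a 1280x1000 buffer. Because W and H are compile-time constants, a compiler is free to unroll the two inner loops, which is roughly what the hand-written macro bodies spell out.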

BTW, if you can't figure out how to get your scaler to run as fast as these built-in ones, another approach might be to just patch in all the scaler variants you need (normal4x5y, normal5x6y, etc.) and just change to them on the fly after you've calculated the optimal one from the PAR. (You can change the scaler on the fly from the command line, so it might be doable - you might not even need surfacepp anymore?)
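
Switching on the fly would need a scaler name for each pair of factors. Here is a small sketch of that mapping, following the naming pattern seen in this thread. `scaler_name` is a hypothetical helper; only normal2x, normal3x and normal4x5y are confirmed above, so any other name it produces would have to be patched in first.

```cpp
#include <cassert>
#include <string>

// Build a scaler name in the style used in this thread:
// "normal3x" for equal factors, "normal4x5y" for unequal ones.
std::string scaler_name(int sx, int sy) {
    assert(sx >= 1 && sy >= 1);
    std::string name = "normal" + std::to_string(sx) + "x";
    if (sx != sy) name += std::to_string(sy) + "y";
    return name;
}
```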

Reply 529 of 733, by Ant_222

Posted on 2018-11-14, 20:21

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

Thanks for the explanation, unther.

unther wrote:
BTW, if you can't figure out how to get your scaler to run as fast as these built-in ones

I do have a couple of ideas. One is to simplify my overcomplicated code by removing surfacenb (which is already available as openglnb) and surfacenp (which is too slow and hardly different from nearest neighbor), and then to analyse the simplified code carefully. The second idea is to parallelise the scaling.
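
One possible reading of the second idea, sketched under the assumption that each source row can be scaled independently of the others: split the frame into horizontal bands and give each band to its own worker. The code below only computes the bands. The threading itself is left out, and `split_rows` is a hypothetical name that does not come from the patch.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Divide `rows` source rows into at most `workers` contiguous bands
// [begin, end) whose sizes differ by at most one row.
std::vector<std::pair<int, int>> split_rows(int rows, int workers) {
    assert(rows >= 0 && workers > 0);
    std::vector<std::pair<int, int>> bands;
    const int base = rows / workers;
    const int extra = rows % workers;  // the first `extra` bands get one more row
    int begin = 0;
    for (int w = 0; w < workers && begin < rows; ++w) {
        const int end = begin + base + (w < extra ? 1 : 0);
        bands.emplace_back(begin, end);
        begin = end;
    }
    return bands;
}
```

Since the output rows produced from different source rows never overlap, the bands could be scaled concurrently without locking. Whether that helps on a particular machine would still need measuring.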

another approach might be to just patch in all the scaler variants you need (normal4x5y, normal5x6y, etc.) and just change to them on the fly after you've calculated the optimal one from the PAR. (You can change the scaler on the fly from the command line, so it might be doable - you might not even need surfacepp anymore?)

Indeed, but this is so ugly that I will let someone else do it :-) I will help with the selection of the optimal scaling factors.

Reply 530 of 733, by Yesterplay80

Posted on 2018-12-18, 06:53

Rank: Oldbie
Posts: 540
Joined: 2016-02-23, 11:02
Location: Germany

FYI: r4178 once again breaks compatibility with your patch. As I had to adapt the changes to the modified patch I use, I quickly did the same with the original patch.

The attachment pixel-perfect-alpha14-4178.zip is no longer available

Reply 531 of 733, by Ant_222

Posted on 2018-12-22, 14:00

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

A quick notice that I will try to fix the problem as time permits.

Reply 532 of 733, by Ant_222

Posted on 2019-01-07, 17:32

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

Yesterplay80 wrote:
FYI: r4178 once again breaks compatibility with your patch

I have fixed it in alpha 15 and also introduced a minor change to the implementation. Let me know if it breaks anything.

Reply 533 of 733, by Ant_222

Posted on 2019-01-07, 23:32

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

unther wrote:
Ant_222, just a follow-up to our performance discussion a year ago: I created a patch (attached) that adds a normal4x5y scaler to DOSBox, which uses the same scaling technique as the built-in normal2x/normal3x. Using this normal4x5y scaler along with 'output=surface' is much faster on an RPi3 than using 'output=surfacepp', while producing exactly the same pixel output @1080p. I don't have an effective way to benchmark the two methods, but normal4x5y can handle the Wolf3D fade effect without stutter on the RPi3, even at 2 or 3 times the cycle count that surfacepp stutters at.

I have tested the scaling algorithm from alpha 15 separately from DOSBox:

320x200 -> [4x5] -> 1280x1000
  FPS: 438
  MPS: 561

320x200 -> [8x10] -> 2560x2000
  FPS: 145
  MPS: 743

where MPS stands for output megapixels per second. As you see, the results are more than sufficient even on my ancient PC with AMD A4-3400 APU. Does anyone have an idea how to determine the bottleneck of this algorithm when it works as part of DOSBox?
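
For anyone checking the figures, the relation between the two numbers can be sketched as below. `megapixels_per_second` is a hypothetical helper, not part of the test program used above.

```cpp
#include <cassert>

// Output megapixels per second for a scaler that produces `fps` frames
// of out_w x out_h pixels each second.
double megapixels_per_second(double fps, int out_w, int out_h) {
    assert(fps >= 0.0 && out_w > 0 && out_h > 0);
    return fps * out_w * out_h / 1e6;
}
```

At 438 FPS and 1280x1000 this gives about 560.6, in line with the posted 561. At 145 FPS and 2560x2000 it gives about 742.4 rather than 743, presumably because the posted FPS value is itself rounded. For comparison, and assuming a 70 Hz source refresh rate for 320x200 VGA modes, a 1280x1000 output would need roughly 89.6 MPS.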

Reply 534 of 733, by Ant_222

Posted on 2019-01-08, 20:12

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

If you experience slow performance in full-screen mode with my patch, please try with:

fulldouble=false

Reply 535 of 733, by Yesterplay80

Posted on 2019-01-09, 08:43

Rank: Oldbie
Posts: 540
Joined: 2016-02-23, 11:02
Location: Germany

Ant_222 wrote:
Yesterplay80 wrote:
FYI: r4178 once again breaks compatibility with your patch

I have fixed it in alpha 15 and also introduced a minor change to the implementation. Let me know if it breaks anything.

Nope, seems to work so far. I also updated DOSBox ECE with A15 of your patch, available in ECE r4180.2.

Reply 536 of 733, by Ant_222

Posted on 2019-01-10, 21:07

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

unther, can you please check the performance of the latest version of the patch with

fulldouble=false

on your RPi? Does it look the same as with alpha 14?

Reply 537 of 733, by Ant_222

Posted on 2019-01-18, 23:12

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

I have just uploaded an update that adds a hardware-accelerated pixel-perfect scaling mode, openglpp.

Reply 538 of 733, by KainXVIII

Posted on 2019-01-19, 09:39

Rank: Oldbie
Posts: 552
Joined: 2015-05-20, 15:04
Location: Yaroslavl

Ant_222 wrote:
I have just uploaded an update that adds a hardware-accelerated pixel-perfect scaling mode, openglpp.

Does it perform better? 😲

Reply 539 of 733, by Ant_222

Posted on 2019-01-19, 09:41

Rank: Oldbie
Posts: 528
Joined: 2010-07-24, 21:29

KainXVIII wrote:
Ant_222 wrote:
Does it perform better?

It does!
