VOGONS

Reply 20 of 46, by MusicallyInspired

Rank: Oldbie
xjas wrote:

Pedantic note: 320x200 is 16:10 with square pixels. 😉

Sorry, yes. I did know that. My bad.

Yamaha FB-01/IMFC SCI tools thread
My Github
Roland SC-55 Music Packs - Duke Nukem 3D, Doom, and more.

Reply 21 of 46, by MobyGamer

Rank: Member

VileR notified me about this thread, as I've been working on "perfect" composite color captures from real (hardware) CGA for a few months, and I thought I'd chime in on the 4:3 vs. 16:9 pillarboxing conversation. My stance: I deliberately pillarbox the footage I upload and distribute because I consider it future-proofing. Here is my reasoning:

When YT was trying to figure out 60p, they went through many stages of support for 60p uploads; there was a time when 1440x1080 (a perfectly valid HDTV, broadcast, MPEG-2 transport stream resolution) was accepted and interpreted as 60p, then it wasn't, then it was treated as anamorphic 16:9 (which is valid for AVCHD footage), and so on. I don't know what they do with 1440x1080 these days, but it was enough for me to realize they don't always know what they're doing. The containers I use (1280x720, 1920x1080) are 16:9 with square pixels at 59.94 frames per second -- these are HD-spec (and, in the case of 720p, broadcast-spec) containers that cannot be misinterpreted: they are unambiguously 16:9 and 60p.

Supplying a pillarboxed video also makes it very clear what the aspect ratio is supposed to be, because it is baked into the presentation. I always cringe when I see 4:3 videos stretched out to 16:9. Providing a pillarboxed video will drop into any professional or amateur video editing software, or player software, and play with the correct aspect ratio every time. In other words, I'm protecting the end user from themselves (and crappy software).

Besides, if anyone wants to use my videos in their own work, it is trivial to crop out the pillarboxing in any video editing software.

Last edited by MobyGamer on 2016-05-22, 05:52. Edited 1 time in total.

Reply 22 of 46, by PhilsComputerLab

Rank: l33t++

Seeing as most displays these days are 16:9, I don't have issues with pillar boxing. Some, though, take 16:10 footage and pillar box that 😊

I never had issues with 1440 x 1080 footage, and it's still supported of course. YouTube actually has a page about aspect ratios: https://support.google.com/youtube/answer/6375112

Most videos these days include other footage from a modern HD camera, so you'll mostly get a mix of 16:9 and 4:3 footage, making pillar boxing necessary anyway.

I can name another good reason for pillar boxing: GPU aspect ratio scaling. If you want to show anything that involves changing resolutions, it's easiest to use GPU aspect ratio scaling and record everything in one hit, which is very convenient.

YouTube, Facebook, Website

Reply 23 of 46, by VileR

Rank: l33t
VileRancour wrote:

How about just scaling up by a factor of 8 (2560x1600) and specifying a Display Aspect Ratio of 4:3?

Okay, after further experiments, the latter part does not produce optimal results. Manually specifying a 4:3 DAR *does* result in correct aspect on youtube, but the way YT achieves this is by stretching it vertically during conversion, rather than squeezing it horizontally, and at some resolutions (especially when further scaling is involved during playback) that looks pretty awful.

So yes, it would appear that pure integer nearest-neighbor scaling (to e.g. 1600x1200, possibly with some added borders) is the way to go. I'm still not really sold on making the container 16:9 though. Following Trixter's reply, I realize that there may be some good reasons for doing that, but since this hampers non-16:9 viewers (including myself 😉) it's still a compromise. If only youtube (and web playback standards, in general) were more consistent...
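For reference, here's the arithmetic behind the two ways a player (or YouTube's converter) can honour a 4:3 DAR on a 2560x1600 file -- a small Python sketch; only the 2560x1600 and 4:3 figures come from the experiment above, the rest is illustration:

from fractions import Fraction

storage_w, storage_h = 2560, 1600           # 8x nearest-neighbor scale of 320x200
dar = Fraction(4, 3)                        # requested display aspect ratio

par = dar / Fraction(storage_w, storage_h)  # pixel aspect ratio = 5/6
squeezed_w = storage_w * par                # keep the height: 2560 * 5/6 ~ 2133.3
stretched_h = storage_h / par               # keep the width:  1600 * 6/5 = 1920

print(par, float(squeezed_w), float(stretched_h))

Either way the displayed aspect ends up 4:3; the difference is only which dimension gets resampled -- squeezing would land on a non-integer ~2133x1600, while stretching gives 2560x1920.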

[ WEB ] - [ BLOG ] - [ TUBE ] - [ CODE ]

Reply 24 of 46, by NewRisingSun

Rank: Oldbie

What do you think of this test video, made with VirtualDub based on a DOSBox-captured AVI:

https://youtu.be/TS4yr9gch0U

  1. resize 320x200 to 640x400 nearest neighbor
  2. gamma correct to linear
  3. resize 640x400 to 1600x1200 precise bilinear
  4. gamma correct to sRGB

The idea behind steps 2 and 4 is to make the resizing gamma-aware. Also, when using DOSBox, it's a good idea to use "machine=ega" or "machine=tandy" when the game allows it, because the original frame rate will then be 60 rather than 70 frames per second. But I suppose you already knew that.
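For anyone who wants to reproduce those four steps outside VirtualDub, here's a minimal sketch in Python with Pillow and NumPy -- the same idea, not the actual VirtualDub filter chain; the file name and the plain 2.2 power law standing in for the sRGB curve are just placeholders:

import numpy as np
from PIL import Image

GAMMA = 2.2  # simple power-law stand-in for the sRGB transfer curve

frame = Image.open("capture_320x200.png").convert("RGB")

# Step 1: nearest-neighbor to 640x400 (replicates VGA double-scanning).
frame = frame.resize((640, 400), Image.NEAREST)

# Step 2: decode gamma, so the interpolation below averages linear light.
linear = (np.asarray(frame, dtype=np.float32) / 255.0) ** GAMMA

# Step 3: bilinear resize to 1600x1200, one float-precision channel at a time.
channels = []
for c in range(3):
    chan = Image.fromarray(linear[:, :, c].astype(np.float32), mode="F")
    channels.append(np.asarray(chan.resize((1600, 1200), Image.BILINEAR)))
resized = np.dstack(channels)

# Step 4: re-encode gamma and save.
out = (np.clip(resized, 0.0, 1.0) ** (1.0 / GAMMA) * 255.0 + 0.5).astype(np.uint8)
Image.fromarray(out).save("capture_1600x1200.png")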

Reply 25 of 46, by VileR

Rank: l33t

Thanks for the demonstration, that does look excellent.

I have to ask though: I see why any interpolation method that blends adjacent source pixels (e.g. bilinear) would be vulnerable to gamma errors, as described in the article. However, if nearest-neighbor scaling were used across the board, then I don't see where such errors would be introduced in the first place. Where does the improvement stem from, then? And what's the significance of choosing 640x400 as a cutoff point between the two methods?

[ WEB ] - [ BLOG ] - [ TUBE ] - [ CODE ]

Reply 26 of 46, by NewRisingSun

Rank: Oldbie

You are correct: when only using nearest-neighbor scaling, gamma is irrelevant. I don't like nearest-neighbor scaling all the way up to 1600x1200 though because it looks really harsh, and the edges tend to strain video codecs other than ZMBV. That's why I bilinear scale to 1600x1200 to soften the edges (and bilinear does require gamma-awareness), pre-scaling via nearest neighbor to 640x400 to prevent it from becoming too blurry. "Bilinear" is better for this than bicubic since edge-enhancement is a no-go for these applications. (The filter that would perfectly capture CRT bluriness would be a gamma-aware Gaussian filter, which VirtualDub does not seem to support.) The pre-scale is not just to make it look less blurry, but to accurately replicate the fact that VGA monitors double-scan in 320x200 modes, that is, they display every scanline twice. That's where the significance of 640x400 comes from: you're basically doing what a real CRT monitor did back in the day.
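To illustrate why the bilinear step needs to be gamma-aware: a bilinear filter averages neighbouring pixels, and averaging the encoded values instead of linear light comes out too dark. A tiny Python example, assuming a plain 2.2 power law:

black, white = 0 / 255.0, 255 / 255.0                  # a one-pixel checkerboard pair

naive = (black + white) / 2                            # average of the encoded values
aware = ((black**2.2 + white**2.2) / 2) ** (1 / 2.2)   # average of linear light

print(round(naive * 255))  # 128 -- noticeably darker than the real mix
print(round(aware * 255))  # 186 -- about the brightness the blend should have

Nearest neighbor never mixes pixels, which is why gamma is irrelevant there.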

Last edited by NewRisingSun on 2016-05-26, 12:13. Edited 1 time in total.

Reply 27 of 46, by Scali

Rank: l33t
NewRisingSun wrote:

That's where the significance of 640x400 comes from, you're basically doing what a real CRT monitor did back in the day.

Or an LCD for that matter.
LCDs normally scale up to their native resolution as well, and will generally apply some form of bilinear filtering. But in 320x200 mode on VGA, they start from a source image of 640x400. Scaling 320x200 directly would look far worse (think lousy 240p YT videos at full 1600x1200 res).

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 28 of 46, by NewRisingSun

Rank: Oldbie

Yes, because the double-scanning is actually triggered by the VGA card itself, not the monitor. (For a VGA monitor to display 320x200 without double-scanning, it would need to support a 15 kHz horizontal frequency, which is entirely out of spec for VGA.) You can see this for yourself when switching DOSBox to machine=vgaonly, which emulates the VGA more accurately than the default machine type (svga_s3, I think). All mode 13h screenshots will be saved as 640x400 instead of 320x200. (You should not use machine=vgaonly unless you really have to though, because it only uses a 16 bit output surface, causing horrendous grayscale tracking errors.)

Reply 29 of 46, by Scali

Rank: l33t
NewRisingSun wrote:

Also when using DOSBox, it's a good idea to use "machine=ega" or "machine=tandy" when the game allows it because the original frame rate will be 60 rather than 70 frames per second.

Alternatively you can cheat 😀
This was captured from machine=vgaonly (it doesn't work in svga-modes, because they don't render per scanline for some reason), at 70 fps: https://youtu.be/4ClrU-ne2Us
I just slowed it down to 60 fps in my editing software.
Since the music is not synchronized to the video, I did not have to slow down the audio track.
The result is a perfectly smooth scroller, just running slightly slower than it actually should 😀

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 30 of 46, by Scali

Rank: l33t
NewRisingSun wrote:

(For a VGA monitor to display 320x200 without double-scanning, it would need to support a 15 kHz horizontal frequency, which is entirely out of spec for VGA.)

Yes, VGA only supports two resolutions really:
640x400@70 Hz
640x480@60 Hz

Other modes are derived from these by doubling the pixels horizontally and/or vertically, e.g.:
320x200, 320x240, 320x400, 320x480, 640x200, 640x240.

I'm not entirely sure how VGA does 640x350 though (EGA hires modes).

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 31 of 46, by VileR

Rank: l33t
NewRisingSun wrote:

You are correct: when only using nearest-neighbor scaling, gamma is irrelevant. I don't like nearest-neighbor scaling all the way up to 1600x1200 though because it looks really harsh, and the edges tend to strain video codecs other than ZMBV. That's why I bilinear scale to 1600x1200 to soften the edges (and bilinear does require gamma-awareness), pre-scaling via nearest neighbor to 640x400 to prevent it from becoming too blurry. "Bilinear" is better for this than bicubic since edge-enhancement is a no-go for these applications. (The filter that would perfectly capture CRT bluriness would be a gamma-aware Gaussian filter, which VirtualDub does not seem to support.) The pre-scale is not just to make it look less blurry, but to accurately replicate the fact that VGA monitors double-scan in 320x200 modes, that is, they display every scanline twice. That's where the significance of 640x400 comes from: you're basically doing what a real CRT monitor did back in the day.

Makes sense. I've always preferred nearest-neighbor to bilinear scaling of game footage, but seeing a proper (gamma-aware) implementation of the latter, it's not nearly as bad - especially with the preliminary 640x400 step.
I guess there's still the issue that the final result usually gets scaled once again during playback, to fit the YT player's dimensions, and this step can't be gamma-corrected AFAICT. Still, I'll want to have a play with this method... perhaps some judicious use of extra borders (as suggested earlier) could mitigate that problem a little more.

[ WEB ] - [ BLOG ] - [ TUBE ] - [ CODE ]

Reply 32 of 46, by Scali

Rank: l33t
VileRancour wrote:

I guess there's still the issue that the final result usually gets scaled once again during playback, to fit the YT player's dimensions, and this step can't be gamma-corrected AFAICT.

Well, in theory it can be (the hardware is capable of doing it in realtime, no problem), but we don't have control over it. It may even depend on what GPU/driver/browser you're using. I have no idea if some of them apply gamma correction to be honest.

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 33 of 46, by NewRisingSun

Rank: Oldbie

All hardware scalers that I have seen have always been gamma-unaware. (Although I must admit that I haven't seen many.)

Provided that you use gamma-aware bilinear resizing with a target resolution at least twice the original (pre-scaled) resolution, rescaling to 1080 rather than 1200 lines should not introduce any additional artifacts. Consider this video. It's from a NES game rather than a PC game, so the scale factor is from 240x2 to 1080 lines (2.25) rather than 200x2 to 1080 (2.7), but the checkerboard background along with the 1-frame scrolling would expose any artifacts that existed: https://youtu.be/rihtkfUBoQQ

If you wanted to replicate the image of a non-double-scanning 15 kHz monitor, something in the range of 70-80% scanlines would be appropriate. The scanline effect replaces the nearest neighbor x2 scale vertically, and when combined with a gamma-aware bilinear or Gaussian filter, the result looks quite nice, and the gamma-aware resize will even make the scanlines look right. These two images very accurately replicate my Commodore 1084S-P:
http://www.symphoniae.com/nrs/vogons/scanlines_kq1.png
http://www.symphoniae.com/nrs/vogons/scanlines_smb3.png
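As a rough sketch of the scanline idea in Python/NumPy (reading "75% scanlines" as the darkened line keeping 25% of the light -- the exact figure is a matter of taste and of the monitor being imitated), this would replace the nearest-neighbor x2 vertical step of the pipeline discussed earlier and run on the frame after conversion to linear light:

import numpy as np

def add_scanlines(linear_frame, strength=0.75):
    """Line-double a (200, W, 3) linear-light frame, darkening every second line."""
    doubled = np.repeat(linear_frame, 2, axis=0)   # 200 -> 400 lines
    doubled[1::2, :, :] *= 1.0 - strength          # the attenuated copy of each line
    return doubled

Because the attenuation happens in linear light, the following gamma-aware resize blends the bright and dark lines the way they would actually add up on screen, which is what keeps the scanlines looking right after scaling.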

Reply 34 of 46, by Jepael

Rank: Oldbie
Scali wrote:

Yes, VGA only supports two resolutions really:
640x400@70 Hz
640x480@60 Hz

Other modes are derived from this by doubling the pixels horizontally and/or vertically, eg:
320x200, 320x240, 320x400, 320x480, 640x200, 640x240.

Yes, and in addition, if you change from 25.175 MHz clock to 28.322 MHz clock, with identical scan rates, you can have 360 or 720 pixels instead of 320 or 640. 720 is used in text mode (plus maybe in 16-color tweak modes), and 360 in some 256-color tweak modes.

Scali wrote:

I'm not entirely sure how VGA does 640x350 though (EGA hires modes).

It's really the same mode as 640x400@70 Hz: 449 total lines, but with only 350 of them active. The H/V sync polarities are different, to tell the monitor that it should adjust its vertical size to show only 350 active lines. The monitor only knows the number of active lines it should display (350, 400 or 480) from the sync polarities.
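Putting numbers on that (only the two pixel clocks and the 449-line total come from the posts above; the 800/900 total pixels per line and the 525-line total for 480-line modes are the usual VGA figures, added for illustration):

PIXEL_CLOCKS_HZ = {"25.175 MHz": 25_175_000, "28.322 MHz": 28_322_000}
TOTAL_PIXELS_PER_LINE = {"25.175 MHz": 800, "28.322 MHz": 900}    # incl. blanking
TOTAL_LINES = {"350/400-line modes": 449, "480-line modes": 525}  # incl. blanking

for name, clock_hz in PIXEL_CLOCKS_HZ.items():
    h_khz = clock_hz / TOTAL_PIXELS_PER_LINE[name] / 1000
    print(f"{name}: horizontal rate {h_khz:.3f} kHz")
    for mode, lines in TOTAL_LINES.items():
        print(f"  {mode}: {h_khz * 1000 / lines:.3f} Hz vertical refresh")

Both clocks land on the same ~31.47 kHz line rate, so the monitor only needs the sync polarities to pick the vertical size, and 31468.75 / 449 is exactly where the ~70.086 Hz frame rate mentioned earlier in the thread comes from. A single-scanned 200-line mode at 70 Hz would need roughly half that line rate, i.e. the ~15 kHz territory mentioned above.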

Reply 35 of 46, by MobyGamer

Rank: Member
NewRisingSun wrote:

If you wanted to replicate the image of a non-double-scanning 15 kHz monitor, something in the range of 70-80% scanlines would be appropriate.

The problem with putting scanlines in your video is that they will artifact if the video is resized. Not only that, but different monitors have different scanline bleed. My 5153 matches your 1084S-P, more or less, but my NTSC broadcast monitor has a completely black line between each displayed line. I think that scanlines should be something like a "profile" that can be applied to a video player if you want to simulate various output devices. (That said, your processed output looks great; what program or process did you follow to generate the sample screenshots?)

My view on processing emulator output is that it represents the ideal, rather than the actual output. My preference for 320x200 emulator output is nearest neighbor 1600x1200 to preserve pixels as well as fit to a 1:1 aspect ratio as previously discussed, then I resize downward as necessary. So far I haven't had to put emulator output in a 4K video, but if I do, I will likely NN resize to 3200x2400 before resizing downward, to preserve as much sharpness as possible.

My focus this year has been on analog captures from real hardware (CGA primarily, both composite and RGBI). My setup is still being tweaked; when it's finished, I'll likely make a video about all of the challenges and solutions. Since the capture results in already-analog-resized output, I don't attempt to fit it to any sort of 1600x1200 ideal -- and besides, that would crop out the overscan/border which I want to preserve anyway.

Reply 36 of 46, by elianda

Rank: l33t

My approach is the following:

For my own archiving I leave the source resolution as it is and set the aspect ratio in the container, e.g. with mkvmerge.

For YouTube I have to consider whether the pixel shape should be preserved or whether some blur is acceptable, as for C64 graphics, where some blur between the pixels is recommended. In the first case (preserving the pixel shape) I upscale by e.g. 1000% with nearest-neighbor point sampling, so that the resolution is much higher than 1080 lines vertically. Then I downscale to X x 1080 at the aspect ratio of the original viewport (e.g. 4:3), using Lanczos3. In a third step I pillarbox to 1920x1080. I leave the container without a specific aspect-ratio setting, since by default players as well as YouTube assume square pixels.
If some blur is acceptable, I can directly upscale to X x 1080 with bicubic 0.75 at the aspect ratio of the source viewport and pillarbox afterwards.
(Of course, X x 1080 is just an example for Full HD; it can also be UHD or any other resolution.)

As for deliberate, well-defined blurring for sources like C64 output, it can be rather difficult to hit the right amount. This approach also assumes that the viewer does not resize the output again. However, that is not as critical if the content is already well upscaled from the original low-res source: in that case an additional upscale, even a bilinear one, keeps the pixels sharp, because the bilinear scaling only takes effect at the pixel borders.
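For reference, a rough Pillow equivalent of this workflow, for anyone who wants to script it (not the exact tools I use; the file names and the 1440x1080 viewport are example values for a 4:3 source at Full HD):

from PIL import Image

src = Image.open("capture_320x200.png").convert("RGB")

# 1000% nearest-neighbor upscale, so the result is well above 1080 lines.
big = src.resize((src.width * 10, src.height * 10), Image.NEAREST)

# Downscale to the 4:3 viewport at 1080 lines (1440x1080) with Lanczos3.
scaled = big.resize((1440, 1080), Image.LANCZOS)

# Pad to a 1920x1080 square-pixel container with black bars at the sides.
canvas = Image.new("RGB", (1920, 1080), "black")
canvas.paste(scaled, ((1920 - scaled.width) // 2, 0))
canvas.save("youtube_1920x1080.png")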

Retronn.de - Vintage Hardware Gallery, Drivers, Guides, Videos. Now with file search
Youtube Channel
FTP Server - Driver Archive and more
DVI2PCIe alignment and 2D image quality measurement tool

Reply 37 of 46, by Stretch

Rank: Member
NewRisingSun wrote:

...(The filter that would perfectly capture CRT bluriness would be a gamma-aware Gaussian filter, which VirtualDub does not seem to support.)...

ResampleHQ for AviSynth (which you can use with VirtualDub) has a Gaussian kernel.

Win 11 - Intel i7-1360p - 32 GB - Intel Iris Xe - Sound BlasterX G5

Reply 38 of 46, by NY00123

Rank: Member

Bumping here, mainly to show what worked for me last night for YouTube (and also a simpler command that could probably have been used).

Note that I've run avconv in Ubuntu 14.04. This assumes 320x200 input pixels with an aspect ratio of 4:3. Further note that the input sample rate was 49716 Hz and the frame rate was ~70.086304 fps (although the "-r" argument is probably redundant; see below).

Following is the command used for generating the output for YouTube. While its output was 1600x1200 (with a frame-rate a bit above 70fps), YouTube seems to cap this to a maximum of 1080p60, at least for me. It does pillar-box the output as expected, though.

avconv -i bm1patch_000.avi -sws_flags neighbor -vcodec libx264 -preset slow -crf 18 -acodec copy -pix_fmt yuv420p -s 1600x1200 -r 70.086304 bm1patch_000_conv.mkv

Note that the above command is what I got after making a few experiments, beginning from a few commands I prepared beforehand for Twitch (Preparing a DOSBox capture for uploading/streaming to a site like Twitch.tv) and also checking at least one other command for reference (http://ingomar.wesp.name/2011/04/dosbox-gamep … eo-capture.html).

There's a good chance you don't need most of the arguments at all. In particular, no frame rate is specified. Here's a simpler form (currently untested with YouTube):

avconv -i bm1patch_000.avi -sws_flags neighbor -vcodec libx264 -acodec copy -s 1600x1200 bm1patch_000_conv_test.mkv

Reply 39 of 46, by dha5448

Rank: Newbie

I'm bumping this thread in 2020 to solicit comments and critiques about my results with capturing and converting 320x200 VGA for YouTube. I am getting ready to convert a number of films from the 1992 game Stunt Island, and I want the best quality possible. Here is the link:

https://www.youtube.com/watch?v=ODkUlma183E

I did not use any resampling; I wanted the pixels as clear as possible, without the blurry look that resampling usually gives. Part of that could be personal preference, though.

Here is another video with some gameplay:
https://www.youtube.com/watch?v=XoZ28ZfyJ44

Thanks for any feedback! If anyone is interested, I can share my workflow.