First post, by M-HT
I tried speeding up hq2x/hq3x, mainly by using SIMD instructions to speed up the diffYUV function (on x86/x64/armv6/armv7/armv8 CPUs).
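For readers who don't know the hot spot: hq2x/hq3x decide whether two neighbouring pixels "differ" by comparing their Y, U and V components against per-channel thresholds. Below is a minimal sketch of that idea, not the code from the patches. It assumes pixels packed as 0x00YYUUVV and the thresholds 0x30/0x07/0x06 as used in the reference hq2x code; all names (diff_yuv_scalar, diff_yuv_simd, SKETCH_HAVE_SSE2, check_diff_yuv_variants) are made up for illustration. The SSE2 path is only compiled where the compiler reports SSE2 support, otherwise it falls back to the scalar version.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical feature switch for this sketch only. */
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

/* Assumed thresholds (reference hq2x values), one per channel. */
enum { TR_Y = 0x30, TR_U = 0x07, TR_V = 0x06 };

/* Scalar version: unpack each channel and compare |difference| to its threshold. */
static int diff_yuv_scalar(uint32_t a, uint32_t b)
{
    int dy = (int)((a >> 16) & 0xFF) - (int)((b >> 16) & 0xFF);
    int du = (int)((a >> 8) & 0xFF) - (int)((b >> 8) & 0xFF);
    int dv = (int)(a & 0xFF) - (int)(b & 0xFF);
    return abs(dy) > TR_Y || abs(du) > TR_U || abs(dv) > TR_V;
}

/* SIMD version: handle all three channels with byte-wise saturating arithmetic. */
static int diff_yuv_simd(uint32_t a, uint32_t b)
{
#if SKETCH_HAVE_SSE2
    __m128i va  = _mm_cvtsi32_si128((int)(a & 0x00FFFFFF));
    __m128i vb  = _mm_cvtsi32_si128((int)(b & 0x00FFFFFF));
    __m128i thr = _mm_cvtsi32_si128((TR_Y << 16) | (TR_U << 8) | TR_V);
    /* |a - b| per byte: one of the two saturating differences is always zero. */
    __m128i ad  = _mm_or_si128(_mm_subs_epu8(va, vb), _mm_subs_epu8(vb, va));
    /* Saturating subtract of the thresholds: a nonzero byte means "over threshold". */
    __m128i over = _mm_subs_epu8(ad, thr);
    return _mm_cvtsi128_si32(over) != 0;
#else
    return diff_yuv_scalar(a, b);
#endif
}

/* Compare both versions on pseudo-random pixel pairs, including near-equal ones. */
static void check_diff_yuv_variants(void)
{
    uint32_t state = 12345u;
    for (int i = 0; i < 100000; i++) {
        state = state * 1664525u + 1013904223u;
        uint32_t a = state >> 8;                     /* random 24-bit pixel */
        state = state * 1664525u + 1013904223u;
        uint32_t far_b  = state >> 8;                /* unrelated pixel */
        uint32_t near_b = a ^ ((state >> 8) & 0x0F0F0F); /* small per-channel change */
        assert(diff_yuv_scalar(a, far_b) == diff_yuv_simd(a, far_b));
        assert(diff_yuv_scalar(a, near_b) == diff_yuv_simd(a, near_b));
    }
}
```

The point of the SIMD form is that the three per-channel comparisons and their branches collapse into a handful of byte-wise operations and a single test.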
When I tested just hq2x/hq3x outside of DOSBox, I measured a speed increase of 55%-87% (x86/x64 versions). But when I tested DOSBox as a whole, the maximum speed increase was 2%.
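One way to read these two numbers together is Amdahl's law: if only a fraction p of the total run time is spent in the scaler, a local speedup s gives an overall speedup of 1 / ((1 - p) + p / s). The sketch below solves that for p. It assumes "55%-87% faster" means speedup factors of 1.55 to 1.87 and "2%" means a factor of 1.02; the function name scaler_time_fraction is made up for illustration, and the result is a rough estimate, not a measurement.

```c
#include <assert.h>

/* Estimate the fraction of total run time spent in the optimized code,
 * given the local speedup of that code and the observed overall speedup
 * (Amdahl's law solved for the fraction p). */
static double scaler_time_fraction(double local_speedup, double overall_speedup)
{
    assert(local_speedup > 1.0 && overall_speedup >= 1.0);
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / local_speedup);
}
```

Under those assumptions, scaler_time_fraction(1.87, 1.02) and scaler_time_fraction(1.55, 1.02) come out at roughly 4% and 5.5%, which would mean hq2x/hq3x accounted for only about a twentieth of DOSBox's run time in that test.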
I'm leaving the patches here in case someone is interested in them.
The first patch only changes existing code. The second patch adds (a lot of) new code to gain some more speed.
The SIMD instructions are enabled by default on x64/armv6/armv8 CPUs. On x86/armv7 they are disabled (because not all CPUs support them) and need to be enabled in the file src/gui/render_templates_hq.h.
When using the Microsoft Visual C++ compiler, the SIMD instructions can only be enabled on x86 CPUs (inline asm is not supported on other CPUs).