VOGONS

First post, by NY00123

Rank: Member

Update (May 5th, 2013): Please see a later post for an up-to-date patch.
In particular, the issue with "Threaded optimization" was really a bug in the patch.

Hey all,

I am attaching a patch that attempts to add support for host VSync without significant emulation slowdowns. I did start a thread on a similar topic before, but the patch there is, in my view, vastly different.

=== How to use this ===

The patch makes a few changes to the DOSBox settings:
- "fulldouble" is renamed to "vsync", for reasons of my own. (OK, one of them is related to a flag from SDL2, although there is nothing SDL2-related in this patch.)
- "vsync" now applies to OpenGL output, in addition to Surface and DirectDraw.
- A new "threading" setting has been added to the [sdl] section of the configuration file. You probably want to keep it set to "auto", although you may change that if you wish.
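
As a rough illustration only, the [sdl] section of the configuration file might look like this with the patched build. The values are just one possible combination: "output=opengl" is one of the outputs "vsync" applies to, and "vsync=true" assumes the setting keeps the boolean form of the old "fulldouble".

```
[sdl]
output=opengl
vsync=true
threading=auto
```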

=== Known issues ===

If any of you tries the OpenGL path with an Nvidia GPU, you may want to disable "Threaded optimization" in the driver settings, or else you may not benefit from the patch at all. I really don't know why this is the case, but I'm mentioning it so you know.

Furthermore, it should not be a surprise at all if new bugs turn up. As usual, use at your own risk!

=== Does it imply threaded rendering? (The answer is *no*) ===

Unfortunately, for now I have decided to keep the rendering itself (e.g. the work of the scalers) in the emulator's thread. With a few(?) modifications to this patch it can probably be done, although one needs to take care of proper video capturing and more.

=== Some more details ===

OK, some forum readers may not be sure what the big deal is. Maybe you can just enable VSync in some way, as done for lots of games, and leave it at that. However, chances are it is not going to work well here:
- Many DOS games may output 70 distinct frames per second at some point, while today's displays often refresh at a lower rate (e.g. 60 Hz). This means that noticeable slowdowns (really a bit of slow motion) can be experienced.
- Even if that weren't a problem, the CPU will probably need to wait for host vertical sync more often than not. These are periods of time when the emulated machine can't run, which can have an impact on heavier DOS games and/or lighter (host) machines.
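
To put rough numbers on the first point, here is a small sketch in Python (it is not part of the patch). It assumes the naive case where every emulated frame is presented and blocks until the next host refresh; the function name is made up for illustration.

```python
from fractions import Fraction

def naive_vsync_speed(guest_fps, host_hz):
    """Fraction of real-time speed when each guest frame waits for one host refresh.

    Assumes the emulator presents every frame and blocks on host VSync each time.
    """
    if guest_fps <= host_hz:
        return Fraction(1)  # the host display keeps up; no slowdown from this effect
    return Fraction(host_hz, guest_fps)

speed = naive_vsync_speed(70, 60)   # 6/7 of full speed
print(f"relative speed: {float(speed):.3f}")
print(f"one emulated second takes {float(1 / speed):.3f} s of real time")
```

Under that assumption, a game producing 70 frames per second on a 60 Hz display runs at 6/7 of its intended speed.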

As hinted before, I did have an earlier attempt, based on time measurements. However, it is a bit too sensitive and may easily fail, it works better only if one manually specifies the monitor's refresh rate, and a bit of potential CPU time (for emulation) is still lost.

In this patch, a different approach is attempted: Let one thread wait for vertical sync, while the emulator may continue to run in a different thread.

It cannot just work as-is by naively calling an existing screen update function with a few modifications, though. For one, many SDL functions related to video and event handling should be called from the thread where SDL_SetVideoMode is called for the very first time. It is also safer to let this thread be the main one.
And then, there are synchronization issues to take care of.

So, after a somewhat earlier attempt which semi-worked but was imperfect and possibly a bit messy, I have arrived at the given patch. Basically, rather than referring to specific functions, like SDL_Flip, pointers to such functions are used. Without threading they simply point to what you'd expect (e.g. SDL_Flip). Otherwise they point to wrapper functions, so the secondary thread can push calls to the main thread. There are a few kinds of such calls in use:
- Synchronous calls: The secondary thread waits until the main thread is done with a call to some function. It may be a void function, or one that returns a value. In both cases it is synchronous.
- Asynchronous calls: The secondary thread schedules a function for the main thread and then returns immediately with no wait. The main thread will execute such a function (possibly with different arguments) a bit later.
- Special calls with unique handling.
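
The first two kinds of calls can be sketched as follows. This is a minimal Python illustration of the idea, not the code from the patch (which uses C++ function pointers and SDL 1.2 primitives); the class and method names are hypothetical, and error handling is omitted.

```python
import queue
import threading

class MainThreadCalls:
    """Hypothetical helper: queues calls so that one designated thread runs them."""

    def __init__(self):
        self._calls = queue.Queue()

    def call_sync(self, func, *args):
        """Schedule func, block until the main thread has run it, return its result."""
        done = threading.Event()
        box = {}

        def job():
            box["result"] = func(*args)
            done.set()

        self._calls.put(job)
        done.wait()
        return box["result"]

    def call_async(self, func, *args):
        """Schedule func and return immediately without waiting."""
        self._calls.put(lambda: func(*args))

    def run_pending(self, stop):
        """Main-thread loop: run queued calls until stop is set and the queue is empty."""
        while not (stop.is_set() and self._calls.empty()):
            try:
                job = self._calls.get(timeout=0.01)
            except queue.Empty:
                continue
            job()

def demo():
    calls = MainThreadCalls()
    stop = threading.Event()
    log = []

    def emulator():
        # Runs in the secondary thread; display work is pushed to the main thread.
        calls.call_async(log.append, "flip")                  # fire and forget
        size = calls.call_sync(lambda w, h: w * h, 320, 200)  # wait for the result
        log.append(size)
        stop.set()

    worker = threading.Thread(target=emulator)
    worker.start()
    calls.run_pending(stop)  # the main thread services the queue
    worker.join()
    return log

print(demo())
```

Without threading, the same call sites would simply invoke the target function directly, which is what the function pointers make possible.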

As you can see in the patch, locks are used often for synchronization. C++11 adds std::atomic and SDL 2.0 also has a few functions for atomic data accesses, but with this patch you don't need more than the usual SDL 1.2 setup and a compiler with no C++11 support.
Oh, and if you think that it's possible to avoid *both* locks and (hardware) atomic accesses in some way, I'm afraid it may not work as expected. Sure, one may think that if one thread is waiting for a boolean to become "true", it is sufficient that another thread sets it to "true" with no protection. However, the two threads may run on two distinct CPUs with their own caches, and one of them may still hold the old boolean value in its cache, even after the update. See where this is heading?
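
For illustration, here is the general shape of a lock-protected flag, sketched in Python with a condition variable. Python's interpreter hides the hardware effect described above, so this only shows the pattern; the class name is hypothetical and the patch itself relies on SDL 1.2 primitives instead.

```python
import threading

class FrameReadyFlag:
    """Hypothetical flag shared by two threads; every access goes through a lock."""

    def __init__(self):
        self._cond = threading.Condition()  # a lock plus wait/notify
        self._ready = False

    def set(self):
        """Set the flag and wake any waiting thread."""
        with self._cond:
            self._ready = True
            self._cond.notify_all()

    def wait_and_clear(self, timeout=None):
        """Block until the flag is set, then reset it. Returns False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(lambda: self._ready, timeout)
            if ok:
                self._ready = False
            return ok

def demo():
    flag = FrameReadyFlag()
    setter = threading.Thread(target=flag.set)
    setter.start()
    got = flag.wait_and_clear(timeout=5.0)
    setter.join()
    return got

print(demo())
```

The waiting thread never reads the boolean outside the lock, so it cannot miss the update or act on a stale value.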

So, yes, locks and more have their costs. I haven't seen a large performance degradation, though, assuming there is any noticeable one at all.

Attachments

  • dosbox_trunk_threaded_vsync_20130504_win32.zip (1.04 MiB, 177 downloads)
    Threaded VSync, Win32 binaries
    * The exe may require Visual C++ 2010 Redistributable, and screen overlays may not work as expected.
    File license: Fair use/fair dealing exception
  • dosbox_trunk_threaded_vsync_20130504.diff (54.65 KiB, 170 downloads)
    Threaded VSync patch against r3827
    File license: Fair use/fair dealing exception
Last edited by NY00123 on 2013-05-05, 09:37. Edited 2 times in total.

Reply 1 of 8, by Mau1wurf1977

Rank: l33t++

Hi!

I wanted to try this out and see if it makes the scrolling of Pinball Dreams smoother. I replaced the DOSBox that the GOG.com version comes with by your version, but now when I launch the game through the shortcuts, DOSBox just opens and closes right away.

Question, as I'm not a Windows programmer. Can an application request custom refresh rates? Or if you create a custom refresh rate in Windows, access it?

So basically just have DOSBox run at 72 Hz and scale it to a lower resolution that the LCD can handle 72 Hz at...


Reply 2 of 8, by bloodbat

Rank: Oldbie
Mau1wurf1977 wrote:

Question, as I'm not a Windows programmer. Can an application request custom refresh rates? Or if you create a custom refresh rate in Windows, access it?

Theoretically yes...using DirectX

Reply 3 of 8, by NY00123

Rank: Member
Mau1wurf1977 wrote:

Hi!

I wanted to try this out and see if it makes the scrolling of Pinball Dreams smoother. I replaced the DOSBox that the GOG.com version comes with by your version, but now when I launch the game through the shortcuts, DOSBox just opens and closes right away.

Hey, thanks for trying this out!
The Visual C++ 2010 Redistributable Package should be downloaded and installed, if it is not already. If that is not the problem, I'm afraid that you also need to copy the bundled DLLs. The SDL version I compiled the EXE against is 1.2.14 (not the latest in the 1.2 branch), since I had it ready somewhere.
Most of the time I worked on this using a GNU/Linux distribution.

===

BTW (did I really forget to mention this?), if someone wants a different EXE (like one linked to SDL_sound), it should be possible to apply the patch to the sources and compile, at least in theory. It worked for me with the Autotools and GCC on GNU/Linux, and with Visual C++ Express Edition on Windows.
If there are unexpected problems *only* with this patch, it may be the result of the two new source files added to the src/gui subdirectory: threading.h and threading.cpp.

Reply 4 of 8, by aqrit

Rank: Member

Apologies in advance...
I don't fully understand your explanation and I haven't gone through the code.

Is the solution you've presented above better than what is detailed below?

1. Triple Buffer the primary surface.

2. Create a worker thread that loops on "wait for vertical blank". When vertical blank starts, increment [vsync_tick].

3. Every time the main thread wants to Flip(), compare [vsync_tick] to [prev_vtick].
If not equal, set [prev_vtick] equal to [vsync_tick] then call Flip().
Else drop the current frame ( don't call Flip() just return ).

That would seem to let DOSBox run at its own speed, with vsync, without the main thread waiting for vsync (most of the time).
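
A minimal Python sketch of steps 2 and 3, for illustration only. The class and method names are made up; the fields follow the names used above ([vsync_tick], [prev_vtick]). The worker thread that waits for the vertical blank is simulated here by direct calls, since there is no portable way to do that wait in plain Python.

```python
import threading

class VsyncTickGate:
    """Sketch of the tick comparison; the actual Flip() call is left to the caller."""

    def __init__(self):
        self._lock = threading.Lock()
        self._vsync_tick = 0  # incremented by the worker at each vertical blank
        self._prev_vtick = 0  # tick at which the last frame was flipped

    def on_vertical_blank(self):
        """Called by the worker thread each time a vertical blank starts."""
        with self._lock:
            self._vsync_tick += 1

    def should_flip(self):
        """Called by the main thread before Flip(); False means drop this frame."""
        with self._lock:
            if self._vsync_tick == self._prev_vtick:
                return False
            self._prev_vtick = self._vsync_tick
            return True

gate = VsyncTickGate()
results = []
gate.on_vertical_blank()            # simulated vertical blank
results.append(gate.should_flip())  # new tick since the last flip: flip
results.append(gate.should_flip())  # same tick: drop the frame
gate.on_vertical_blank()
gate.on_vertical_blank()            # two blanks pass without a frame
results.append(gate.should_flip())  # flip once; missed ticks are not queued up
print(results)                      # [True, False, True]
```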

Reply 5 of 8, by NY00123

Rank: Member
aqrit wrote:

Apologies in advance...
I don't fully understand your explanation and I haven't gone through the code.

It roughly goes like this:
1. The secondary thread is doing most of the usual job: Run the emulated machine, as well as do the rendering (scaling included).
2. The main thread actually updates the display contents (e.g. by swapping buffers). It may wait for VSync, but it shouldn't halt the secondary thread, at least in theory.
3. If there's no new frame then the main thread just sleeps briefly instead.
4. Furthermore, the main thread is responsible for other function calls that are better done from there, rather than from the secondary thread (like calls to SDL_PollEvent).
5. The secondary thread passes such calls to the main thread.

Is the solution you've presented above better than what is detailed below?

1. Triple Buffer the primary surface.

2. Create a worker thread that loops on "wait for vertical blank". When vertical blank starts, increment [vsync_tick].

3. Every time the main thread wants to Flip(), compare [vsync_tick] to [prev_vtick].
If not equal, set [prev_vtick] equal to [vsync_tick] then call Flip().
Else drop the current frame ( don't call Flip() just return ).

That would seem to let DOSBox run at its own speed, with vsync, without the main thread waiting for vsync (most of the time).

OK, I see what you're hinting at. I'm not sure about the need for point 1, but I can already tell that the patch can work with and without it.
I see a problem in point 2: How can the worker thread know when to increment the given counter?
I'm afraid the most portable way is by doing a Flip() kind of call. Yeah, at least on X11 you may have the function glXGetVideoSyncSGI (available with the extension GLX_SGI_video_sync), and Direct3D may have its own approach. Maybe someone knows of a portable way I'm not aware of, but I have a feeling there is no such thing. Even if there were, though, I guess a lot of modern games wouldn't use it these days (although it could possibly help when it comes to input lag).

Last edited by NY00123 on 2013-05-05, 09:35. Edited 4 times in total.

Reply 6 of 8, by NY00123

Rank: Member

Guess what? The issue with "Threaded optimization" was really *my* fault, after all. 😀

I have attached a fixed patch (and Windows EXE). The few changes are:
- A fix for the cause of the above issue: The shared graphics buffer's mutex is now locked only when there is a need to copy some contents from/to it. Furthermore, as before, it is locked when one checks if there is any display update to do.
- On a side note, there is also a minor fix in the function that swaps buffers, when OpenGL is used with threaded updates but without ARB_pixel_buffer_object.

Basically, the simple rule is: *Never* halt the secondary thread (say with a mutex) while trying to access video RAM in some way.
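
To illustrate the rule, here is a minimal Python sketch of a shared buffer whose mutex is held only while copying or checking for an update, never during the slow buffer swap. It is not the code from the patch; all names are hypothetical.

```python
import threading

class SharedFrame:
    """Hypothetical shared graphics buffer guarded by a mutex."""

    def __init__(self, size):
        self._lock = threading.Lock()
        self._pixels = bytes(size)
        self._dirty = False

    def publish(self, pixels):
        """Secondary (emulator) thread: copy a finished frame in, then move on."""
        with self._lock:
            self._pixels = bytes(pixels)
            self._dirty = True

    def take_if_dirty(self):
        """Main thread: check for an update and take it; None if nothing is new."""
        with self._lock:
            if not self._dirty:
                return None
            self._dirty = False
            return self._pixels  # bytes are immutable, so sharing the reference is safe

def present(frame, swap_buffers):
    """Main thread: the swap (which may wait for VSync) runs with the lock released."""
    pixels = frame.take_if_dirty()
    if pixels is None:
        return False
    swap_buffers(pixels)
    return True

frame = SharedFrame(4)
frame.publish(b"\x01\x02\x03\x04")
shown = []

def swap_buffers(pixels):
    # Stand-in for the real swap. Publishing from inside it would deadlock
    # if present() still held the (non-reentrant) lock at this point.
    frame.publish(b"\x05\x06\x07\x08")
    shown.append(pixels)

present(frame, swap_buffers)
present(frame, shown.append)
print(shown)
```

Because the lock is released before the swap, the emulator side can publish a new frame even while the main thread is still busy presenting the previous one.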

P.S. The reason I didn't spot this earlier is that, by default, "Threaded optimization" is currently disabled on GNU/Linux, while it is enabled on Windows.

Attachments

Reply 7 of 8, by Arrakhad

Rank: Newbie

When I try to run your Win32 build with vsync=true I always get a crash no matter what output I have set.

The error message suggests the culprit is MSVCR100.dll, which I assume is the main DLL for VC++ 2010, so I made sure to reinstall both the x86 and x64 redists of VC++ 2010. The problem persists.

The exact error message is here:

Problem signature:
Problem Event Name: APPCRASH
Application Name: dosbox.exe
Application Version: 0.0.0.0
Application Timestamp: 51861e38
Fault Module Name: MSVCR100.dll
Fault Module Version: 10.0.40219.325
Fault Module Timestamp: 4df2be1e
Exception Code: c0000005
Exception Offset: 00010a4a
OS Version: 6.1.7601.2.1.0.768.3
Locale ID: 1053
Additional Information 1: 0a9e
Additional Information 2: 0a9e372d3b4ad19135b953a78882e789
Additional Information 3: 0a9e
Additional Information 4: 0a9e372d3b4ad19135b953a78882e789

Any suggestions on what I could try besides reinstalling VC++2010 redists since I have already done that?

Thanks for any help!

Reply 8 of 8, by NY00123

Rank: Member

Hey,

Some things which you can try:
- In case you have missed it, try the most recent patch/build. Although I doubt it can fix such crashes, who knows.
- Make a backup of dosbox-SVN.conf and let the modified build re-create a new configuration file.
- In addition to the vsync setting, also play with the threading setting added by the patch. (Note that while vsync replaces fulldouble, it also applies with opengl/openglnb output.)
- Finally, fiddle with some more settings like fullscreen, fullresolution and windowresolution.

By nature, though, this patch can be highly unstable. In particular (I should have mentioned this earlier), one weak point I haven't fully covered is game controller support. It may work, but it may also fail. On the other hand, I have used this with a few games for some time with no apparent added instability (although on a GNU/Linux desktop and with no game controller).

EDIT: Oh yeah, I don't recall testing this with any kind of networking feature (say for multiplayer), either.