VOGONS


Multithreaded video capturing

First post, by kekko

Rank Oldbie

Hi,
yesterday I was trying to capture a video of Flight Unlimited with the built-in video capture function.
Sadly I noticed that, even on a not-so-weak machine (E8400), capturing a video slows a heavy game down so much that it becomes unplayable (and I can't film my stunts...). At the same time, only one CPU core was in use, while the other one was twiddling its thumbs.
So I decided to experiment with multithreading, and my first try has been encouraging: capturing a video doesn't make the game stutter anymore and both cores work 😜 I also tried other games like Tomb Raider and Screamer Rally, and it helps a lot.
What I've done is just a quick try; many things are not handled at all: "deferred" capturing, audio syncing, shared access to memory areas...
Do you think it's worth pursuing? Does anyone want to help?

Edit: latest, fully working version is here.

Last edited by kekko on 2009-10-25, 12:33. Edited 2 times in total.

Reply 1 of 47, by wd

Rank DOSBox Author

SDL-threads may indeed be a nice way to have this cross-platform
(though keep an eye on having stuff embraced by cute defines).

Reply 2 of 47, by kekko

Rank Oldbie
wd wrote:

(though keep an eye on having stuff embraced by cute defines).

That's obvious 😜 but there's still a lot of work to do before worrying about the code "cuteness"

Reply 3 of 47, by wd

Rank DOSBox Author

Well, the first thing is to make it have an effect, and you've already achieved
that in a nice way, which is cool 😀
I don't have much experience with threads/SDL threading, more with OpenMP,
so I can't really help (this stuff is tricky at times, so "better not touch" is
quite an easy rule for safety.....).

Reply 4 of 47, by kekko

Rank Oldbie

The idea is to turn the capture.video structure into a FIFO queue.
The DOSBox main thread populates the queue with audio/video chunks, while a parallel thread sequentially fetches, compresses and writes the entries to the AVI file.
Unfortunately I don't know if I'll have enough time to keep working on this... anyway, there's some code and ideas above for anyone interested 😀
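
A minimal sketch of the queue idea described in this post, not the actual patch. It uses standard C++ primitives (std::mutex, std::condition_variable) instead of SDL threads, and CaptureChunk, ChunkQueue and writer_loop are hypothetical names, not DOSBox identifiers. The "compress and write" step is replaced by a stand-in that only records chunk sizes.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical chunk type: one captured audio or video buffer.
struct CaptureChunk {
    bool is_video;
    std::vector<std::uint8_t> data;
};

// Hypothetical FIFO shared by the emulation thread and the writer thread.
class ChunkQueue {
public:
    // Emulation (producer) side: append a chunk and wake the writer.
    void push(CaptureChunk chunk) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);  // pushing after close() is a usage error
            queue_.push(std::move(chunk));
        }
        ready_.notify_one();
    }

    // Writer (consumer) side: block until a chunk is available.
    // Returns false once the queue has been closed and fully drained.
    bool pop(CaptureChunk& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

    // Called when the capture stops; lets the writer finish what is queued.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<CaptureChunk> queue_;
    bool closed_ = false;
};

// Stand-in for the "fetch, compress, write to AVI" loop: it only records
// the size of each chunk, in the order the chunks were fetched.
std::vector<std::size_t> writer_loop(ChunkQueue& queue) {
    std::vector<std::size_t> written_sizes;
    CaptureChunk chunk;
    while (queue.pop(chunk)) {
        written_sizes.push_back(chunk.data.size());
    }
    return written_sizes;
}
```

In a real setup writer_loop would run on its own thread (for example via std::thread or SDL's thread API) while the emulation thread keeps calling push(); close() followed by joining the writer corresponds to stopping the capture.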

Last edited by kekko on 2009-05-25, 06:57. Edited 1 time in total.

Reply 5 of 47, by kekko

Rank Oldbie

Sorry, I meant FIFO *corrected*

Reply 6 of 47, by Harekiet

Rank DOSBox Author

It's a nice idea. Video capturing could do with an update either way, to allow capturing the vgaonly tricks when using a line-by-line rendering approach.

Reply 7 of 47, by wd

Rank DOSBox Author

-> responsibility moved to harekiet FUCK YEAH!

Reply 8 of 47, by kekko

Rank Oldbie

I've made a little patch to fix AVI header writing when DOSBox is closed without stopping the video capture first (fixes issues #2810637 and #2819328). It simply writes the header every time a chunk is written.
This implementation isn't the best way to do it but, honestly, the whole capture code needs to be improved imho 😉
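
To illustrate the "write the header on every chunk" idea, here is a rough sketch under loud assumptions: ToyHeader is a made-up two-field header, not the real AVI/RIFF layout, and ToyCaptureFile is a hypothetical class, not DOSBox code. It only shows the seek-back-and-rewrite pattern that keeps the file consistent if the program exits before the capture is stopped.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical, simplified header. NOT the real AVI/RIFF layout.
struct ToyHeader {
    std::uint32_t chunk_count;
    std::uint32_t payload_bytes;
};

class ToyCaptureFile {
public:
    explicit ToyCaptureFile(const char* path) : file_(std::fopen(path, "wb")) {
        if (file_) write_header();  // reserve space for the header at offset 0
    }
    ~ToyCaptureFile() {
        if (file_) std::fclose(file_);
    }
    ToyCaptureFile(const ToyCaptureFile&) = delete;
    ToyCaptureFile& operator=(const ToyCaptureFile&) = delete;

    bool ok() const { return file_ != nullptr; }

    // Append one chunk at the end of the file, then refresh the header so
    // the counts on disk always match the data written so far.
    bool add_chunk(const std::vector<std::uint8_t>& data) {
        if (!file_) return false;
        if (std::fseek(file_, 0, SEEK_END) != 0) return false;
        if (!data.empty() &&
            std::fwrite(data.data(), 1, data.size(), file_) != data.size()) {
            return false;
        }
        header_.chunk_count += 1;
        header_.payload_bytes += static_cast<std::uint32_t>(data.size());
        return write_header() && std::fflush(file_) == 0;
    }

private:
    // Overwrite the header in place at the start of the file.
    bool write_header() {
        assert(file_ != nullptr);
        if (std::fseek(file_, 0, SEEK_SET) != 0) return false;
        return std::fwrite(&header_, sizeof header_, 1, file_) == 1;
    }

    std::FILE* file_;
    ToyHeader header_{0, 0};
};
```

The cost of this pattern is one extra seek, header write and flush per chunk, which is why it is worth measuring whether it affects recording speed.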

I'd like to continue to work on threaded capturing, one day...

Reply 10 of 47, by wd

Rank DOSBox Author

Does it affect recording speed?

Reply 12 of 47, by kekko

Rank Oldbie

I was stuck at home with a cold yesterday, so I thought I'd work on this. The attached file implements the asynchronous capturing as I described it a few posts above. I think it came out quite nicely 😀
It uses the STL for queue handling and SDL for threading; synchronous capturing is still available and can be toggled with the usual "cute" defines, so wd won't complain 😉
There are still a few issues I have to deal with, mostly related to cpu-intensive capturing ("cic" 😜), meaning capturing high-res frames and/or high frame rates: the compression thread can't keep up with the DOSBox main thread and the chunk queue fills up rapidly.
This implementation takes advantage of only one parallel thread; a generic multithreaded capture would give better performance with >2 CPUs (not my case) and might avoid the "cic" related issues.
Some of the issues are:
- on "cic", memory fills up quickly; something like frame skipping or fall-back to synchronous capturing when the queue reaches a limit may be possible solutions;
- on long queues, when stopping a capture, I set it up so that you can't start a new capture until all the chunks left over from the previous capture are written to disk and the queue is flushed;
- the palette is not part of the a/v chunk: every chunk uses a single global palette, which is bad if the game has "cic" and changes palette, because old chunks still waiting in the queue will use the new palette instead of their own; I need to find an efficient way to detect palette changes, so that I can avoid saving and attaching a palette to every chunk;
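
One possible way to handle the palette issue in the last point, sketched under assumptions: the palette is taken to be 256 entries of 4 bytes, and Palette and PaletteTracker are hypothetical names, not DOSBox identifiers. The tracker compares the live palette with the last snapshot and only allocates a new copy when the contents differ, so queued chunks can share one snapshot through a pointer instead of each carrying a full copy.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>

// Assumed palette layout: 256 entries x 4 bytes each.
using Palette = std::array<std::uint8_t, 256 * 4>;

// Hypothetical helper owned by the emulation thread.
class PaletteTracker {
public:
    // Returns an immutable snapshot matching `current`. A new snapshot is
    // allocated only when the palette differs from the previous one, so
    // consecutive chunks with the same palette share a single copy.
    std::shared_ptr<const Palette> snapshot(const Palette& current) {
        if (!last_ || *last_ != current) {
            last_ = std::make_shared<Palette>(current);
        }
        assert(*last_ == current);  // postcondition: snapshot is up to date
        return last_;
    }

private:
    std::shared_ptr<const Palette> last_;
};
```

Each queued chunk would then store the returned shared_ptr. Chunks captured before a palette change keep pointing at the old snapshot, which is freed automatically once the last of them has been written.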

Please post comments, feedback and suggestions on this.

Last edited by kekko on 2009-10-25, 12:33. Edited 1 time in total.

Reply 13 of 47, by wd

Rank DOSBox Author

- on "cic", memory fills up quickly; something like frame skipping or fall-back to synchronous capturing when the queue reaches a limit may be possible solutions;

You'll need something to empty the queue though, right? So there'll be a stop
either way (which may be bad for the dosbox cycles stuff, but well), so maybe
just halt the main thread, wait until the queue is empty, then continue (with
any strategy you like, for example check how often the queue overflows and
then fall back to synchronous writing etc.)

Thanks for working on it, still hoping you already got rid of your cold 😀

Reply 14 of 47, by kekko

Rank Oldbie

Halting the main thread until the whole queue is empty may mean a halt of several seconds.
I tried to add something like this in the main thread:
if queue.size > max_size delay(few ms);
in order to stabilize the queue size to a value, but even very few ms make the game stutter. I guess I need a different approach.
Could you please have a look at it when you have some time? I'd be happy to hear some comments and suggestions; maybe there's something obvious to fix or improve. I left a printf of the queue size in the capture thread; studying the behavior of parallel processing has been quite interesting 😀

Reply 15 of 47, by wd

Rank DOSBox Author

but even very few ms make the game stutter

But the recorded video should be fine, right? So even a several-seconds-halt
would be ok, it simply tells that "recording is a bit too heavy for the selected
game with the current pc".

Reply 16 of 47, by kekko

Rank Oldbie
wd wrote:

but even very few ms make the game stutter

But the recorded video should be fine, right? So even a several-seconds-halt
would be ok, it simply tells that "recording is a bit too heavy for the selected
game with the current pc".

I followed your advice. The stop is fine: you are recording a high-resolution video at 70fps after all, so it's a very small price to pay. In return you can keep playing without any slowdown, and the recorded video will be smooth.
I'm attaching a new version; these are the major updates:
- defined a limit for the queue size (currently 400 chunks): when the queue reaches the limit, DOSBox waits for the capturing thread to empty the queue and then continues emulation
- fixed the ability to automatically start/stop on resolution change
- added better error handling
- added some new log messages when dosbox pauses to empty the queue
- more code cleanup, added some comments
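
The queue limit in the first point could be sketched roughly like this. It is an illustration, not the attached patch: it uses std::mutex and std::condition_variable instead of SDL, an int stands in for a real a/v chunk, and DrainingQueue is a hypothetical name. When the queue reaches the limit, push() stalls the emulation thread until the writer has emptied the queue completely, then continues.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>  // std::thread, used by the caller to run the writer

class DrainingQueue {
public:
    explicit DrainingQueue(std::size_t limit) : limit_(limit) {
        assert(limit > 0);
    }

    // Emulation thread: enqueue a chunk. If the queue has reached the
    // limit, wait until the writer has emptied it completely.
    void push(int chunk) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (queue_.size() >= limit_) {
            ++stalls_;  // counts how often emulation had to pause
            emptied_.wait(lock, [this] { return queue_.empty(); });
        }
        queue_.push_back(chunk);
    }

    // Writer thread: take the oldest chunk, if there is one.
    bool try_pop(int& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        if (queue_.empty()) emptied_.notify_all();  // release a stalled push
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.size();
    }

    std::size_t stalls() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return stalls_;
    }

private:
    const std::size_t limit_;
    mutable std::mutex mutex_;
    std::condition_variable emptied_;
    std::deque<int> queue_;
    std::size_t stalls_ = 0;
};
```

With a limit of 400 this would be constructed as DrainingQueue queue(400). The stalls() counter is an addition for illustration; it could feed a log message, or a fallback to synchronous writing if the queue fills up too often.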

From the tests I've made it seems reliable: I switched resolution many times, started and stopped capturing quickly, played many games during the same recording and played heavy games for a long time, and I didn't run into any big problems.
Although the C_THREADED_CAPTURE define lets you disable threaded capturing, running the threaded version on a single core shows more or less the same performance as synchronous capturing.
It's quite nice to finally film and see my captures of Flight Unlimited at full frame rate 😀

Reply 17 of 47, by lightmaster

Rank Oldbie

Congrats kekko!!! lovely work there.

Reply 18 of 47, by wd

Rank DOSBox Author

Thank you very much, code looks fine to me.

Reply 19 of 47, by xcomcmdr

Rank Oldbie

Thank you very much! A wonderful patch indeed!

I had tried to hack it myself before checking back on this thread, but I'm new to threads and DOSBox's source code, so it was very hard for me... Thank you again. =)