VOGONS


First post, by Warlord

Rank Oldbie

Will this run munt without dropping frames? I have one in my closet.
http://cdn.viaembedded.com/eol_products/docs/ … eet_v120508.pdf

1.5GHz VIA C7
VIA CX700M Unified Digital Media IGP chipset
1 DDR2 533 DIMM socket
Up to 1GB memory size

Reply 1 of 15, by ragefury32

Rank Member
Warlord wrote on 2020-08-17, 02:25:

Will this run munt without dropping frames? I have one in my closet.
http://cdn.viaembedded.com/eol_products/docs/ … eet_v120508.pdf

1.5GHz VIA C7
VIA CX700M Unified Digital Media IGP chipset
1 DDR2 533 DIMM socket
Up to 1GB memory size

Just MUNT + whatever OS you plan to run on it, right? You are not planning to run any actual game emulation (like DOSBox) on it, correct?

The C7s are weak CPUs: the 1.6GHz version barely kept up with the Celeron-M 900 (Dothan-512) found in the original Asus Eee PC, which was designed back in 2005 as a budget chip. It's not nearly as performant as, say, even the slowest Intel Atoms.
It'll run standalone, but barely, and only as long as you have MIDI messages coming in via USB, take the output off the audio-out on the C7 board, and keep the game emulation on another machine.

Last edited by ragefury32 on 2020-08-25, 18:56. Edited 2 times in total.

Reply 3 of 15, by Jo22

Rank l33t++

Well, I used to run an unofficial DOSBox build with internal Munt on an Eee PC years ago.
From what I remember, it did not work extraordinarily well, but games were playable.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 4 of 15, by gdjacobs

Rank l33t++
Warlord wrote on 2020-08-17, 02:25:

Will this run munt without dropping frames? I have one in my closet.
http://cdn.viaembedded.com/eol_products/docs/ … eet_v120508.pdf

1.5GHz VIA C7
VIA CX700M Unified Digital Media IGP chipset
1 DDR2 533 DIMM socket
Up to 1GB memory size

Questionable, I think, but probably worth testing. It has slightly more DMIPS performance than a single core from the Pi 2, but the Pi 2 handles all the other operating system tasks on the other cores. I'd start with an OS that's been cut down to the minimum to avoid CPU contention. I use MIDI from Loom as a load test as it's notably more demanding than the other scores I have around.

All hail the Great Capacitor Brand Finder

Reply 5 of 15, by matze79

Rank l33t

Does MUNT utilize multiple cores?

https://dosreloaded.de - The German Retro DOS PC Community
https://www.retroianer.de - under constructing since ever

Co2 - for a endless Summer

Reply 6 of 15, by kjliew

Rank Oldbie

I also have DOSBox with MT-32 emulation running on an Eee PC 701. At the stock clock, where the 900MHz Celeron M is underclocked to 630MHz, Munt does not run well. With the clock raised back to 900MHz, the Sierra adventure games are fine, with DOSBox running at 3000 cycles on the normal core. X-Wing does not run well at all, even if I overclock the CPU to nearly 1GHz and run DOSBox at max cycles with the dynamic core. The Munt documentation says it needs a CPU of nearly 1GHz for MT-32 emulation. So I guess it would be OK for games that don't require a 486 CPU.

Reply 7 of 15, by sergm

Rank Oldbie
matze79 wrote on 2020-08-24, 12:36:

Does MUNT utilize multiple cores?

Define MUNT 😀
To be serious, the mt32emu library itself is nearly agnostic to multithreading, but applications that utilise the library are known to be thread-aware. For example, DOSBox (with the mt32emu patch) can render MT-32 audio in one thread and emulate the rest of the world in another.
As we've found, there is not much sense in adding SMP support to the library per se. Newer CPUs with many cores available run mt32emu just fine even in a single thread of execution, whereas older CPUs with less performant cores rarely have more than two of them. With two threads, the thread synchronisation overhead looks rather unacceptable.
A much better performance gain is achieved by loop vectorisation, etc. I have also considered playing with GPU/APU cores, which are rather numerous 😀

Reply 8 of 15, by gdjacobs

Rank l33t++
sergm wrote on 2020-08-25, 06:33:

Define MUNT 😀
To be serious, the mt32emu library itself is nearly agnostic to multithreading, but applications that utilise the library are known to be thread-aware. For example, DOSBox (with the mt32emu patch) can render MT-32 audio in one thread and emulate the rest of the world in another.
As we've found, there is not much sense in adding SMP support to the library per se. Newer CPUs with many cores available run mt32emu just fine even in a single thread of execution, whereas older CPUs with less performant cores rarely have more than two of them. With two threads, the thread synchronisation overhead looks rather unacceptable.
A much better performance gain is achieved by loop vectorisation, etc. I have also considered playing with GPU/APU cores, which are rather numerous 😀

For Munt on a Pi 2, multiple cores mean the mt32emu core doesn't get preempted by other processes and can run more or less full blast all the time (assuming the scheduler is doing its job right). A single core would need extra margin to account for operating system overhead, context switching, and caching penalties, among other things. I usually also run the JACK sound server, which operates in another process. Even without multiprocessing in Munt itself, multiple cores do have value for this kind of thing.

All hail the Great Capacitor Brand Finder

Reply 9 of 15, by sergm

Rank Oldbie

Hmm, I don't fully understand the advantage of splitting work between threads when it can be done in a single thread. To avoid unwanted preemption of the thread doing audio rendering with mt32emu (which is by nature a realtime process), we'd rather need to boost the priority of that thread, as the mentioned JACK server does. Note that its authors also see no such advantage, so JACK 1 is purely single-threaded, and only JACK 2 has limited support for multi-threading when the audio processing graph permits it.

However, I do clearly see disadvantages of doing multi-threading on a uni-processor system. All those penalties like context switching still apply. Yet an even more significant thing to take into account is exactly the thread synchronisation penalty. The measurements we did at an early stage indicated that the trade-off was unsatisfactory. Sure, CPU load did increase, but nothing like reliable rendering was achieved as a result. Therefore, I doubt that e.g. 2 rendering threads would help at all. But feel free to test on the particular device. I can't guarantee anything when it comes to performance, and the Pi 2 has more than two cores anyway...

On the other hand, multi-threading works just great when the thread synchronisation is weak, e.g. for the Falcosoft VSTi plugin that runs two semi-independent mt32emu cores in the Dual Synth mode. However, when it comes to offloading the rendering of several partials per thread, the synchronisation is no longer so cheap, and the rest of the emulation takes significantly less CPU time. Dunno, maybe it's actually worth it if you configure, say, 256 partials to render. With such a high load I'd expect a performance boost. But 256 partials have nothing to do with the MT-32, to be honest 😉

So, for any uni-processor system, I'd concentrate on improving thread affinity and boosting the priority of the rendering thread to the maximum. We currently have a cool interface with JACK in mt32emu-qt which is about lock-free rendering in the JACK realtime thread (in another git branch), so this may already help with fighting against the dropouts.
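
For anyone who wants to try the affinity and priority idea on a Linux box, here is a minimal sketch in Python (not mt32emu's own C++). The helper names pin_to_cpu and boost_priority are made up for illustration; the os.sched_* calls are Python's standard wrappers around the Linux scheduler interface. Real-time scheduling normally needs root or CAP_SYS_NICE, so both helpers report failure instead of raising.

```python
import os


def pin_to_cpu():
    """Pin the calling thread to one CPU it is already allowed to use.

    Returns True on success, False if the platform or the OS refuses.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False  # not Linux: nothing to do in this sketch
    try:
        allowed = os.sched_getaffinity(0)        # CPUs we may run on
        os.sched_setaffinity(0, {min(allowed)})  # keep just one of them
        return True
    except OSError:
        return False


def boost_priority(priority=None):
    """Request SCHED_FIFO real-time scheduling for the calling thread.

    With priority=None the maximum FIFO priority is requested, which can
    starve the rest of the system if the thread never sleeps.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False
    try:
        if priority is None:
            priority = os.sched_get_priority_max(os.SCHED_FIFO)
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except OSError:  # includes PermissionError without CAP_SYS_NICE
        return False
```

Both would be called once at the start of the rendering thread. Whether this is enough to stop dropouts on a given board is something only testing on that board can tell.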

But for 4-8 cores, e.g. the Pi 2, multi-threading at the partial level may also work fine. There is another thing to consider, though: compatibility. Damn C++98, which we have to stick with for now, has no notion of threads, so an external dependency like OpenMP needs to be involved. But I'm secretly hoping to move on to C++11 very soon, and then this problem will disappear. So, I'm not passionate about investing time into this area right now.

Reply 10 of 15, by gdjacobs

Rank l33t++

From the user side, I generally agree with you and don't think there's a lot of cause for multithreading the render code TBH. We can run Munt on very inexpensive SBCs, old Intel Atom netbooks/nettops, and such in real time.

In my earlier post, I merely wanted to point out (mostly for the OP) that multi-core CPUs have some utility with the current code base: the OS scheduler can allocate a dedicated core for Munt (as a thread spawned from DOSBox or a standalone program) to render with. Single-core CPUs have to shuffle it all on a single set of caches and execution units, so Munt effectively gets less than 1.0 CPU cores, with additional cache pressure and context switch penalties added on from any other processes running concurrently.
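
The "less than 1.0 cores" point can be put as a simple budget check. This is only a back-of-the-envelope sketch: fits_realtime and core_share are hypothetical names, the 32000 Hz rate and the timings are illustrative assumptions, and a real measurement would also have to cover cache effects, which a single fraction cannot capture.

```python
def buffer_seconds(frames, sample_rate):
    """Playback time covered by one audio buffer."""
    return frames / sample_rate


def fits_realtime(render_seconds, frames, sample_rate, core_share=1.0):
    """True if rendering one buffer fits into the CPU time available.

    core_share is the fraction of one core the renderer actually gets:
    about 1.0 on a dedicated core, less when it shares a single core.
    """
    return render_seconds < buffer_seconds(frames, sample_rate) * core_share


# A 1024-frame buffer at 32000 Hz covers 32 ms of audio.
print(fits_realtime(0.020, 1024, 32000))                  # dedicated core
print(fits_realtime(0.020, 1024, 32000, core_share=0.5))  # shared core
```

With these made-up numbers, a render that takes 20 ms per buffer keeps up on a dedicated core but not when the renderer only gets half of a single core.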

All hail the Great Capacitor Brand Finder

Reply 11 of 15, by sergm

Rank Oldbie

That's all correct, and the OP should choose the OS wisely, so that it provides good task scheduling and thus minimises the influence of context switching overhead on the high-priority thread. Of course, increasing the renderer buffer size may significantly lower the overhead, but it also increases the latency. We do not seem to be able to do anything about it other than that. Just get a better system 😀
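
The buffer size trade-off is plain arithmetic, sketched below. The function names are hypothetical and the 32000 Hz default is an assumption chosen for illustration; substitute whatever rate and buffer sizes the audio setup really uses.

```python
def buffer_latency_ms(frames, sample_rate=32000):
    """Latency added by one buffer of the given size, in milliseconds."""
    return frames * 1000 / sample_rate


def wakeups_per_second(frames, sample_rate=32000):
    """How often the rendering thread has to be scheduled per second."""
    return sample_rate / frames


for frames in (256, 1024, 4096):
    print(frames, buffer_latency_ms(frames), wakeups_per_second(frames))
```

A larger buffer means fewer wakeups (so less context switching overhead per second of audio) at the cost of proportionally more delay between a MIDI message and the sound.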

Reply 12 of 15, by gdjacobs

Rank l33t++

Munt in a single process, multi-threaded environment on FreeRTOS, RTEMS, or similar? I can dream, right?

All hail the Great Capacitor Brand Finder

Reply 13 of 15, by sergm

Rank Oldbie

Unfortunately, there are really not many possibilities for parallelisation in mt32emu. The synth changes its state on a per-sample basis, and it's very difficult (if possible at all) to predict all the internal parameters in advance, so rendering different partials in parallel seems to be the only option. But as the demand grows, we surely can implement that in the (hopefully near) future. I'm just not interested in that personally so far... It appears that a full-blown MCU emulator with an accurate reverb implementation using the reverb ROM works just fine on a single core in real time for me. And today's devices are so powerful that I see little need at this point.
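
To show the shape of the problem, here is a toy sketch in Python. It is not mt32emu code: render_partial is a made-up stand-in whose only realistic property is that each sample depends on the previous state, and it assumes the partials are fully independent, which is a simplification. Python threads will not make this faster; the point is only where the work can be split and where the threads have to meet again.

```python
from concurrent.futures import ThreadPoolExecutor


def render_partial(seed, n_samples):
    """Toy partial: every sample is computed from the previous state."""
    state = seed
    out = []
    for _ in range(n_samples):
        # The next state needs the current one, so this loop cannot be
        # split across threads without predicting the state in advance.
        state = (state * 1103515245 + 12345) % 65536
        out.append(state - 32768)  # centre around zero
    return out


def mix(partials):
    """Sum the partials sample by sample."""
    return [sum(frame) for frame in zip(*partials)]


def render_serial(seeds, n_samples):
    return mix([render_partial(s, n_samples) for s in seeds])


def render_parallel(seeds, n_samples, workers=4):
    # Each partial is rendered whole in a worker; the mix is the point
    # where all workers must be finished (the synchronisation cost).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rendered = list(pool.map(lambda s: render_partial(s, n_samples), seeds))
    return mix(rendered)
```

Calling render_parallel(list(range(1, 33)), 64) gives the same samples as render_serial with the same arguments. The smaller the block rendered between two mixes, the more often the threads have to wait for each other.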

Reply 14 of 15, by gdjacobs

Rank l33t++
sergm wrote on 2020-08-28, 08:16:

Unfortunately, there are really not many possibilities for parallelisation in mt32emu. The synth changes its state on a per-sample basis, and it's very difficult (if possible at all) to predict all the internal parameters in advance, so rendering different partials in parallel seems to be the only option. But as the demand grows, we surely can implement that in the (hopefully near) future. I'm just not interested in that personally so far... It appears that a full-blown MCU emulator with an accurate reverb implementation using the reverb ROM works just fine on a single core in real time for me. And today's devices are so powerful that I see little need at this point.

Well, I was thinking in terms of a much thinner OS layer with reduced overhead and much lower, predictable latency in which to run mt32emu. Hell, even a single process executive isn't outside the realm of possibility.

All hail the Great Capacitor Brand Finder

Reply 15 of 15, by sergm

Rank Oldbie

Yeah, we've even considered an EFI module with DOSBox and stuff instead of a bootable flash powered by Windows 95 😀
Many things are possible, but only if you spend time on them, which is often hardly available. So sad...