First post, by superfury
I currently have a main thread (CPU and hardware) that renders audio after each CPU instruction. The audio is double buffered: there is one buffer on the CPU thread, and once it is filled far enough its contents are moved to the buffer of the rendering thread (locked with a semaphore while copying, using the FIFOBuffer functionality I've written). The rendering thread reads this buffer (locking it on every sample) and passes the samples to the mixer (which is also written specifically for my emulator), which in turn passes them back to SDL (the audio thread routines of SDL).
It works correctly, but I constantly hear plops in the sound: many plops each second, at a low frequency (I think at least once every 4096 samples of the 44.1kHz signal), at least 8 plops each second. It sounds crackly, but the audio is still recognizable (testing it with 8088 MPH atm, hearing only drums (with varying tone frequency) at the credits, running on a 2GHz processor).
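The two numbers in that estimate can be related with a bit of arithmetic: a glitch once every 4096 samples at 44.1kHz would repeat about 10.8 times per second, while a glitch at 8Hz corresponds to roughly 5512 samples between plops. A minimal sketch of that calculation follows; the helper names are made up for illustration and it says nothing about the actual cause.

```c
#include <assert.h>

/* How often per second an event repeats if it happens once every
   `period_samples` samples at `sample_rate` Hz. */
static double events_per_second(double sample_rate, double period_samples)
{
    assert(period_samples > 0.0);
    return sample_rate / period_samples;
}

/* The inverse: how many samples lie between two events that repeat
   `events_hz` times per second. */
static double samples_per_event(double sample_rate, double events_hz)
{
    assert(events_hz > 0.0);
    return sample_rate / events_hz;
}
```

For example, events_per_second(44100.0, 4096.0) is about 10.77 and samples_per_event(44100.0, 8.0) is 5512.5.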
Running Bill & Ted's Excellent Adventure gives me correct sound, but with a stutter of about 8Hz (like an audience shouting).
All sound is buffered by the CPU (main) thread before it is sent to its respective rendering thread (PC speaker, Adlib, Disney Sound Source/Covox Speech Thing). All these virtual devices render directly in the CPU (main) thread to their respective rendering buffers. The only one that renders on the rendering thread itself is the MIDI SF2 synthesizer (it renders all its audio directly in the rendering thread).
The PIT emulation (PC speaker) first calculates its 1.19MHz signal for all 3 channels. Then it processes channel 0 into up to 1 IRQ0 interrupt. It discards channel 1 data (not connected to anything we need).
After that, channel 2 (the speaker channel) is sampled 44100 times a second, using 60us PIT output samples at the PIT rate. This generates the proper 44.1kHz audio stream of the PC speaker.
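The step from the PIT clock down to 44.1kHz can be sketched with an integer accumulator that decides, per PIT tick, whether an output sample is due. This is only an illustration and not the emulator's code: the names TickResampler and resampler_tick are hypothetical, the exact clock of 1193182Hz is assumed for the "1.19MHz" mentioned above, and the sketch simply point-samples the speaker output instead of filtering it.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_RATE 1193182u /* assumed PIT input clock in Hz (14.31818MHz / 12) */
#define OUT_RATE 44100u   /* output sample rate in Hz */

typedef struct {
    uint32_t acc; /* progress towards the next output sample, in units of 1/PIT_RATE */
} TickResampler;

/* Call once per emulated PIT tick. Returns 1 when an output sample is due
   on this tick (the caller then stores the current speaker level),
   otherwise 0. No drift accumulates because the remainder is carried. */
static int resampler_tick(TickResampler *r)
{
    assert(r != 0);
    r->acc += OUT_RATE;
    if (r->acc >= PIT_RATE) {
        r->acc -= PIT_RATE;
        return 1;
    }
    return 0;
}
```

Fed with exactly one second of PIT ticks, this yields exactly 44100 output samples and the accumulator returns to zero, so the stream neither runs ahead of nor falls behind the emulated clock.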
This 44.1kHz signal is written to the first buffer. Once the first buffer passes a threshold, threshold samples are moved from the first buffer to the rendering buffer (which is locked with a semaphore) in one go (lock, move, unlock).
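The lock, move, unlock step described above could look roughly like the sketch below. It is not the FIFOBuffer code from the emulator: the Fifo type and the fifo_push, fifo_pop and move_block functions are hypothetical, and a pthread mutex stands in for the semaphore. The capacity of 512 and a threshold of 256 match the PC speaker values.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_CAP 512u /* PC speaker rendering buffer size */

typedef struct {
    int16_t data[FIFO_CAP];
    size_t head, count; /* index of the oldest sample, samples stored */
} Fifo;

static int fifo_push(Fifo *f, int16_t s)
{
    if (f->count == FIFO_CAP) return 0; /* full */
    f->data[(f->head + f->count) % FIFO_CAP] = s;
    f->count++;
    return 1;
}

static int fifo_pop(Fifo *f, int16_t *s)
{
    if (f->count == 0) return 0; /* empty */
    *s = f->data[f->head];
    f->head = (f->head + 1) % FIFO_CAP;
    f->count--;
    return 1;
}

/* CPU thread: once `staging` holds at least `threshold` samples, move
   exactly that many into `render` under a single lock. Returns the number
   of samples moved (0 if the threshold is not reached or there is no room). */
static size_t move_block(Fifo *staging, Fifo *render,
                         pthread_mutex_t *lock, size_t threshold)
{
    size_t moved = 0;
    int16_t s;
    assert(threshold > 0 && threshold <= FIFO_CAP);
    if (staging->count < threshold) return 0;
    pthread_mutex_lock(lock);
    if (FIFO_CAP - render->count >= threshold) {
        while (moved < threshold && fifo_pop(staging, &s)) {
            fifo_push(render, s);
            moved++;
        }
    }
    pthread_mutex_unlock(lock);
    return moved;
}
```

Only the rendering buffer is touched under the lock; the staging buffer belongs to the CPU thread alone, so it needs no locking in this sketch.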
The Adlib emulation renders its sound in about the same way, but at a frequency of ~49kHz, clocked by the CPU like the PIT counters and output. The rendering first processes the 80us/320us/CSM timers for the CPU time passed. Then it renders its Adlib output to its first buffer. When the first buffer is filled up to the threshold, its contents are moved to the rendering buffer, as the other channels also do in their final rendering step.
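Processing a timer "for the CPU time passed" can be sketched as an accumulator that turns elapsed emulated time into timer ticks. The OplTimer type and opl_timer_advance are hypothetical names, the behavior (an 8-bit counter that counts up, sets a flag and reloads its preset on overflow) is the usual description of the OPL2 timers and is assumed here, and CSM handling is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t  preset;     /* value reloaded after an overflow */
    uint8_t  counter;    /* counts up; wrapping past 255 is the overflow */
    uint32_t period_ns;  /* 80000 for the 80us timer, 320000 for the 320us timer */
    uint64_t acc_ns;     /* emulated time not yet turned into timer ticks */
    int      overflowed; /* sticky status flag, cleared by the caller */
} OplTimer;

/* Advance the timer by the emulated time of the last CPU instruction(s).
   Returns how many overflows happened during that time. */
static unsigned opl_timer_advance(OplTimer *t, uint64_t elapsed_ns)
{
    unsigned overflows = 0;
    assert(t != 0 && t->period_ns > 0);
    t->acc_ns += elapsed_ns;
    while (t->acc_ns >= t->period_ns) {
        t->acc_ns -= t->period_ns;
        if (++t->counter == 0) {     /* wrapped from 255 to 0 */
            t->counter = t->preset;  /* reload and flag the overflow */
            t->overflowed = 1;
            ++overflows;
        }
    }
    return overflows;
}
```

Carrying the remainder in acc_ns keeps the timer exact even when single instructions are much shorter than one 80us tick.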
The Sound Source emulation renders its Sound Source input to its primary FIFO buffer (containing up to 16 samples, for accuracy and detection). Covox output by the CPU emulation simply sets a Covox left and right channel value (byte values). The Sound Source and Covox rendering routine handles Sound Source output first (for the CPU time elapsed), then the Covox output.
The Sound Source rendering (which happens at a 7kHz rate) first moves the Sound Source input (from the 16-sample buffer) to its secondary buffer (this also enables accurate detection by the CPU, which checks for an empty/full primary FIFO buffer). When the secondary buffer threshold is exceeded, it moves threshold samples from the secondary buffer to the renderer buffer (which is locked, copied to, then unlocked).
The Covox rendering writes the current values of the left and right channels (two 8-bit values set by the CPU emulation, combined into one 16-bit stereo value) to its primary FIFO buffer. Then it moves the buffered content in blocks, like the Sound Source, to its rendering buffer (which is locked, copied to, then unlocked), with the block size set by the threshold like the other channels.
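Combining the two 8-bit Covox channel values into one 16-bit FIFO entry could be done as in the sketch below. The function names are hypothetical, and putting the left channel in the low byte is an assumption; the post does not say which byte holds which channel.

```c
#include <stdint.h>

/* Pack the current left/right DAC bytes into one 16-bit stereo entry.
   Assumed layout: left in bits 0-7, right in bits 8-15. */
static uint16_t covox_pack(uint8_t left, uint8_t right)
{
    return (uint16_t)(left | ((uint16_t)right << 8));
}

/* Unpack on the renderer side. */
static uint8_t covox_left(uint16_t v)  { return (uint8_t)(v & 0xFFu); }
static uint8_t covox_right(uint16_t v) { return (uint8_t)(v >> 8); }
```

Storing both channels in a single entry keeps left and right in step, since they can only enter and leave the FIFO together.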
Information about buffer sizes, thresholds before rendering and rendering frequencies:
- PC speaker:
Rendering frequency: 44.1kHz
Double buffer threshold: 256 samples.
Rendering buffer size: 512 samples.
- Adlib:
Rendering frequency: ~49kHz.
Double buffer threshold: 16 samples.
Rendering buffer size: 4096 samples.
- Disney Sound Source:
Rendering frequency: 7kHz
Primary buffer size: 16 samples.
Double buffer threshold: 4 samples.
Rendering buffer size: 1024 samples.
- Covox (Speech Thing, based on information of Dosbox's Sound Source/Covox emulation):
Rendering frequency: 44.1kHz
Primary buffer size: Same as double buffer threshold.
Double buffer threshold: 409 samples.
Rendering buffer size: 4096 samples.
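To compare these settings, sample counts can be converted to time. The sketch below does that for the values listed above; the DeviceBuf type and function names are hypothetical, and 49716Hz is assumed for the "~49kHz" Adlib rate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    const char *name;
    double rate_hz;       /* rendering frequency */
    unsigned threshold;   /* double buffer threshold, in samples */
    unsigned render_size; /* rendering buffer size, in samples */
} DeviceBuf;

/* Duration of `samples` samples at `rate_hz`, in milliseconds. */
static double samples_to_ms(unsigned samples, double rate_hz)
{
    assert(rate_hz > 0.0);
    return 1000.0 * samples / rate_hz;
}

static const DeviceBuf devices[] = {
    { "PC speaker",          44100.0, 256,  512 },
    { "Adlib",               49716.0,  16, 4096 }, /* "~49kHz"; exact rate assumed */
    { "Disney Sound Source",  7000.0,   4, 1024 },
    { "Covox",               44100.0, 409, 4096 },
};

/* Print how much audio each threshold and rendering buffer represents. */
static void print_latencies(void)
{
    for (size_t i = 0; i < sizeof devices / sizeof devices[0]; ++i) {
        const DeviceBuf *d = &devices[i];
        printf("%-20s threshold %7.2f ms, rendering buffer %7.2f ms\n",
               d->name,
               samples_to_ms(d->threshold, d->rate_hz),
               samples_to_ms(d->render_size, d->rate_hz));
    }
}
```

With these values the PC speaker threshold of 256 samples is about 5.8ms of audio and its 512-sample rendering buffer about 11.6ms, so the devices hand over audio in blocks of quite different durations.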
Is this efficient enough? All transfers from the running buffers (the last buffer before moving to the renderer, which depends on the device, as described above) to the rendering buffer go through a locked FIFO on the renderer's end (which the rendering callback reads one sample at a time: lock, read sample, unlock).
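For comparison with the per-sample locking on the callback side, the sketch below reads a whole block under a single lock and counts how often the FIFO ran dry. It is an illustration only, not the emulator's code: RenderRing, ring_write and ring_read_block are hypothetical, a pthread mutex stands in for the semaphore, and padding with silence on an underrun is an assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAP 4096u

typedef struct {
    pthread_mutex_t lock;
    int16_t data[RING_CAP];
    size_t head, count;      /* oldest sample index, samples stored */
    unsigned long underruns; /* callbacks that found too few samples */
} RenderRing;

/* CPU thread: append up to n samples; returns how many fitted. */
static size_t ring_write(RenderRing *r, const int16_t *src, size_t n)
{
    size_t i;
    pthread_mutex_lock(&r->lock);
    for (i = 0; i < n && r->count < RING_CAP; ++i) {
        r->data[(r->head + r->count) % RING_CAP] = src[i];
        r->count++;
    }
    pthread_mutex_unlock(&r->lock);
    return i;
}

/* Audio callback: fetch n samples with one lock/unlock pair. Missing
   samples are filled with silence and the underrun is counted. */
static size_t ring_read_block(RenderRing *r, int16_t *dst, size_t n)
{
    size_t got;
    assert(r != 0 && dst != 0);
    pthread_mutex_lock(&r->lock);
    got = (n < r->count) ? n : r->count;
    for (size_t i = 0; i < got; ++i) {
        dst[i] = r->data[r->head];
        r->head = (r->head + 1) % RING_CAP;
    }
    r->count -= got;
    if (got < n) r->underruns++;
    pthread_mutex_unlock(&r->lock);
    for (size_t i = got; i < n; ++i) dst[i] = 0; /* pad outside the lock */
    return got;
}
```

A counter like underruns makes it possible to check whether the callback regularly finds the rendering buffer empty, which would be heard as a gap in the output.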
The rendering still gives plops (at about 8Hz) all the time for all audio output (except MIDI, since it doesn't use the FIFO buffers and renders directly instead), even with the double buffering before passing it to the renderer (that makes the overall quality better, with fewer plops, but they still happen).
To hear the current result, simply run the release of my x86EMU emulator with software or a BIOS that actually produces sound on the device in question (the BIOS boot beep for the PC speaker, or games playing sound using the PC speaker/Adlib/Sound Source/Covox/MIDI; MIDI requires a soundfont (.sf2) selected in the BIOS).
Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io