VOGONS


IBM PC Speaker RC values?

Reply 20 of 66, by reenigne

Rank Oldbie
superfury wrote:

So, if the cutoff frequency is set to 22kHz, won't there be higher frequencies still in the mix after filtering, although 6dB quieter for each octave above 22050Hz?

That's an artifact of a particular filter implementation, not of a particular cutoff frequency. I was talking about an ideal filter, not a 6dB per octave one.

superfury wrote:

Also, won't that mess up the PWM used at 72 PWM samples, since the calculated response is much faster?

It will yield an audible carrier frequency at 16.6kHz, just as real hardware does.
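
The 16.6kHz figure follows from the PIT clock and the 72-tick PWM period mentioned in this thread. A minimal sketch of that arithmetic, assuming the usual 14.31818MHz / 12 PIT clock; the helper name is hypothetical and not taken from any emulator:

```c
#include <assert.h>

/* PIT input clock in Hz: 14.31818 MHz crystal divided by 12. */
#define PIT_CLOCK_HZ (14318180.0 / 12.0)

/* Carrier frequency of a PWM stream that emits one PWM sample every
   `ticks_per_sample` PIT ticks. Hypothetical helper, for illustration only. */
double pwm_carrier_hz(unsigned ticks_per_sample)
{
    assert(ticks_per_sample > 0);
    return PIT_CLOCK_HZ / (double)ticks_per_sample;
}
```

With 72 ticks per PWM sample this gives roughly 16572Hz, which is below the 22050Hz Nyquist limit of 44.1kHz output, so an ideal anti-aliasing filter would leave the carrier in place.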

superfury wrote:

So you would need (22050/44100)*~1190000(1.19MHz) PIT samples instead of 72 in order for the PWM to give valid effects?

I'm not sure what you mean by that.

superfury wrote:

Or is the only requirement that there is a low-pass filter present, no matter what its cutoff frequency?

As I said above, the low pass filter is necessary because you're doing downsampling, not because of the way the PC speaker works or because of any particular trick that any particular PC speaker software uses.

superfury wrote:

Would an IIR low-pass filter at 22050Hz still allow PWM using a division of x out of 72 (73 depths) to give correctly audible results?

A low-pass filter at 22050Hz (not sure about an IIR one specifically) would still allow PWM to work at any PWM sample rate. Forget about the particular PWM sample rate that 8088 MPH uses - other PC speaker software uses PWM at different rates, and a good emulator should work just as well for any of them.

superfury wrote:

Or will that destroy the entire trick, since the frequency response is off(by a lot)?

The point of the low-pass filter in the emulator is to eliminate aliasing caused by the downsampling (which doesn't occur on real hardware), not to eliminate PWM carrier waves (which do).

Reply 21 of 66, by Jepael

Rank Oldbie

Filter cutoff is where only 6dB is cut, and 16-bit audio has roughly 96dB of dynamic range. So if you use a filter with a wide transition band (a gentle slope), you have to set the cutoff much lower to remove everything at half the sample rate. If you use an insanely perfect filter with a very narrow transition band, you can set the cutoff frequency to 22049Hz. IIRC early CD players used 6th order filters at 20kHz to remove everything at 22050Hz.

An IIR filter is a type of filter, not a specific one. But a biquad is a specific 2nd order IIR filter. In audio and signal processing, stable higher-order IIR filters are usually difficult to design, so biquads are chained instead. To design, say, a 4th order filter with a certain response (Bessel type, Chebyshev type, etc.), you can use lookup tables to set the cutoff of each section.

And setting the same cutoff on two filters in a row means the total cutoff is lower.

And again, the cutoff to set depends on how steep the filter is (the filter's order and type).

Reply 22 of 66, by superfury

Rank l33t++

So I would simply do something like this(each filter cutting off 6dB out of 96dB):

for (f=0;f<16;f++) applyFilter(&filters[f],&currentoutput);

For each of the outputted samples at 1.19MHz. The filter type is IIR low-pass filter at 22050Hz.

Is that correct?

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 23 of 66, by gdjacobs

Rank l33t++

The smoothing results from a combination of the characteristic electrical and physical parameters of the speaker itself. The response of a piezo will differ from the response of a full range cone speaker, so there's really no standard device or "correct" response to emulate.

However, you probably want a "typical" device response. If I had to guess, I suspect the low f_s will be around 200Hz; the high -3dB point should be around 10kHz. These are very rough numbers. I suggest using an SPL meter and a sig gen for more accurate response data of a 1.5" cone or the 2.25" PC speaker in the original PC, whichever you choose to model.

All hail the Great Capacitor Brand Finder

Reply 24 of 66, by Jepael

Rank Oldbie

Sorry, my mistake: the cutoff point is defined as the -3dB point, not the -6dB point.

And another thing that occurred to me is that the speaker itself has resistance and inductance, so it is a first-order LR filter as well, even without external RC filtering 😀

superfury wrote:

So I would simply do something like this(each filter cutting off 6dB out of 96dB):

for (f=0;f<16;f++) applyFilter(&filters[f],&currentoutput);

For each of the outputted samples at 1.19MHz. The filter type is IIR low-pass filter at 22050Hz.

Is that correct?

OK, it's impossible to answer that without knowing the IIR filter order, but I'll assume first order. Most likely, combining 16 first-order IIR filters with identical cutoff frequencies does not make any sense. That probably takes so much CPU horsepower that it would be better to use a FIR filter with enough taps. And 16 of them most likely bring the -3dB cutoff point down to some very low frequency, so setting 22050Hz as the -3dB cutoff point for every filter makes no sense either. You set the -3dB cutoff point of the total filter lower, so that given the filter's slope (defined by the order of the filter), the attenuation is high enough at 22050Hz (you seem to want -96dB).

So, given that 1st order filters attenuate 6dB per octave, you do need 16 of them to get 96dB of attenuation per octave, but then you need to set the total filter's -3dB cutoff point one octave below 22kHz, i.e. to 11kHz, to get 96dB at 22kHz.

As I explained in my previous post, you usually build a second order IIR filter called a biquad, and when combining these you look up in a table how to set the cutoff frequency of each biquad section so that the -3dB point lands at the cutoff frequency you want. You would still need 8 biquads to get 96dB per octave of attenuation.

So to me this sounds like using 16 RC filters versus a sampling rate conversion library, a sampling rate library does the conversion faster and better than 16 RC filters.

Reply 25 of 66, by reenigne

Rank Oldbie
Jepael wrote:

So to me this sounds like using 16 RC filters versus a sampling rate conversion library, a sampling rate library does the conversion faster and better than 16 RC filters.

I believe the sample rate libraries use FIR filters.

I'm also wondering if using FFTs is a sensible alternative. FFTs (at least the nice fast power-of-2 ones) won't let you do arbitrary sample rate changes at the same time as filtering like FIR filters will, but perhaps all the filtering could be done at the 1.193MHz rate, and then the actual resampling just done with nearest-neighbour or linear interpolation (which shouldn't cause any aliasing because the filtered wave is lacking those frequencies). Different set of tradeoffs I suppose. I'll look into this some more when I finally get around to writing that part of my emulator.

Reply 26 of 66, by superfury

Rank l33t++

Essentially what UniPCemu does now is low-pass the 1.19MHz square wave at 72 PIT sample IIR into a 1.19MHz wave, then nearest-neighbor(rounded down) resample to 44.1kHz.

Reply 27 of 66, by Jepael

Rank Oldbie
superfury wrote:

Essentially what UniPCemu does now is low-pass the 1.19MHz square wave at 72 PIT sample IIR into a 1.19MHz wave, then nearest-neighbor(rounded down) resample to 44.1kHz.

And I still don't understand what that means. Low-pass filtering the 1.19MHz 1-bit stream into a 16-bit stream with frequencies above 22050Hz removed and frequencies below 16kHz passed, then downsampling to 44100Hz by taking approximately every 27th sample: that I understand. Nothing else is needed. What is the "72 PIT sample IIR"?

Reply 28 of 66, by gdjacobs

Rank l33t++

Are you outputting on a device roughly equivalent to a PC speaker, or are you using a PCM device for output? If you're not using a small full range cone for output, PWM encoded PC speaker output on your emulator will have additional harmonics which the original device would not be capable of reproducing, but the PCM output from your emulator is fully capable of representing and the output device (I use a good stereo system) is capable of reproducing.

If you wish to output PC speaker audio this way (and do so accurately), you're going to have to apply some kind of low fidelity speaker response in software.

Reply 29 of 66, by superfury

Rank l33t++

It currently does exactly as I tell it to (one step at a time):
1. Create a PWM sample stream (1-bit) in the FIFO buffer at 1.19MHz.
2. Convert the 1-bit sample stream to 16-bit depth (0 becomes -32768, 1 becomes 32767).
3. Low-pass filter the resulting 16-bit PWM sample stream using an IIR filter set to the ~16kHz PWM sample rate (14318180/12/72 Hz).
4. Resample to 44.1kHz using round-down nearest-neighbour: skip samples until the time matches the next output sample of the 1.19MHz to 44.1kHz conversion, and use that sample as output to the rendering FIFO. So the first output sample is the floor(1.19MHz/44.1kHz)th input sample (the ratio is stored as floating point and rounded down to find the index), the next one is the floor(2 x ratio)th sample, the third is the floor(3 x ratio)th sample, etc.

Steps 2+ are done all at once in the final inner loop, generating 44.1kHz samples.

https://bitbucket.org/superfury/unipcemu/src/ … pit.c?at=master

//PC speaker output!
speaker_ticktiming += timepassed; //Get the amount of time passed for the PC speaker (current emulated time passed according to set speed)!
if ((speaker_ticktiming >= speaker_tick) && enablespeaker) //Enough time passed to render the physical PC speaker and enabled?
{
    length = (uint_32)floor(SAFEDIV(speaker_ticktiming, speaker_tick)); //How many ticks to tick?
    speaker_ticktiming -= (length*speaker_tick); //Rest the amount of ticks!

    //Ticks the speaker when needed!
    i = 0; //Init counter!
    //Generate the samples from the output signal!
    for (;;) //Generate samples!
    {
        //Average our input ticks!
        PITchannels[2].samplesleft += ticklength; //Add our time to the sample time processed!
        tempf = floor(PITchannels[2].samplesleft); //Take the rounded number of samples to process!
        PITchannels[2].samplesleft -= tempf; //Take off the samples we've processed!
        render_ticks = (uint_32)tempf; //The ticks to render!

        //render_ticks contains the output samples to process! Calculate the duty cycle by low pass filter and use it to generate a sample!
        for (dutycyclei = render_ticks;dutycyclei;--dutycyclei) //Process each tick to render once!
        {
            if (!readfifobuffer(PITchannels[2].rawsignal, &currentsample)) break; //Failed to read the sample? Stop counting!
            speaker_currentsample = currentsample?(SHRT_MAX*SPEAKER_LOWPASSVOLUME):(SHRT_MIN*SPEAKER_LOWPASSVOLUME); //Convert the current result to the 16-bit data, signed instead of unsigned! <-- Step 2
            #ifdef SPEAKER_LOGRAW
            writeWAVMonoSample(speakerlograw,(short)speaker_currentsample); //Log the mono sample to the WAV file, converted as needed!
            #endif
            #ifdef SPEAKER_LOWPASS
            //We're applying the low pass filter for the speaker!
            applySoundFilter(&PCSpeakerFilter, &speaker_currentsample); //<-- STEP 3
            #endif
            #ifdef SPEAKER_LOGDUTY
            writeWAVMonoSample(speakerlogduty,(short)speaker_currentsample); //Log the mono sample to the WAV file, converted as needed!
            #endif
        }

        //Add the result to our buffer!
        writeDoubleBufferedSound16(&pcspeaker_soundbuffer, (short)speaker_currentsample); //Write the sample to the buffer (mono buffer)! <-- STEP 4
        ++i; //Add time!
        if (i == length) //Fully rendered?
        {
            return; //Next item!
        }
    }
}

Reply 30 of 66, by Scali

Rank l33t

I haven't studied the topic in-depth, but my intuition would say that steps 3 and 4 should be done as a single step.
Resampling by skipping samples ('nearest neighbour' sampling) is not a very good method, and will introduce aliasing.

I would say that the low-pass filter should be a kind of 'scale down' filter, which takes a 'window' of samples as input, and calculates the average into a single value. The displacement of the window from one sample to the next determines the amount of downsampling applied, and you will construct it in a way that the result is exactly 44100 samples for every second of input data.
The windows of neighbouring output samples can overlap.
So basically each output sample is some kind of linear combination of the input samples (a 'weighted average', the weights will be your 'filter kernel'). A form of mathematical convolution.
(I suppose in audio terms, that would be a form of Finite Impulse Response filtering).

Or at least, I am a graphics guy, and this is how I would downsample an image (except I would do it in two directions, so the window will be rectangular). I would expect that downsampling audio would be done in a similar way, where you want each input sample to have some kind of weighted effect on the output, rather than just skipping samples altogether.

I suppose IIR would be some kind of variation on the above theme, where you re-feed the generated output samples into the input as well.
I see it as somewhat equivalent to motion blur in graphics.

Edit: I found this page, which seems quite informative: http://www.dspguru.com/dsp/faqs/multirate/decimation
Interestingly it says this: "The fact that only the outputs which will be used have to be calculated explains why decimating filters are almost always implemented using FIR filters!"
I suppose since we're decimating by a rather large ratio (1193182 / 44100 = ~27), perhaps FIR is really the way to go?

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 31 of 66, by superfury

Rank l33t++

So, based on that article, the PC Speaker (hardware/pit.c) in UniPCemu is decimating (although using an IIR filter instead of a FIR filter), while the renderer (emu/io/sound.c) is only downsampling or upsampling (depending on the channel's samplerate and the samplerate provided by SDL).

Reply 32 of 66, by Jepael

Rank Oldbie
Scali wrote:

Resampling by skipping samples ('nearest neighbour' sampling) is not a very good method, and will introduce aliasing.

There is no aliasing if the frequencies that would get aliased do not exist, that's why they are first filtered away with a low pass filter before decimating (throwing away unneeded samples).

And you are correct that when using a FIR filter, it is possible to calculate the other way around, given an output sample, which input samples affect it and how much. But since the resampling ratio is 27 (Assume integer for now), you need 27 different phases of the filter. It's called polyphase filtering.

The worst part is that the ratio is not an integer. Some emulators (UADE?) use a neat trick: they do not use the impulse response directly, but precalculate it into a step response with, say, a few thousand phases. That way, all the things affecting the sound on the digital and analog path are emulated. The emitted pulses or steps are called BLIPs or BLEPs (band-limited impulses and band-limited steps).

Reply 33 of 66, by superfury

Rank l33t++

One thing I'm still confused about is impulse response (~60us) vs low-pass cutoff frequency. How do the two relate to each other, if at all? Does increasing/decreasing the cutoff frequency change the emulated impulse response? If so, what should the cutoff frequency be (using an IIR filter) to achieve the correct impulse response for a PC Speaker to properly convert PWM input (as a formula to be loaded at runtime for accuracy, floating point format)?

Reply 34 of 66, by Jepael

Rank Oldbie
superfury wrote:

One thing I'm still confused about is impulse response(~60us) vs low-pass cutoff frequency.

And I am still confused why you insist on a "60us impulse response". What do you mean by that, and why is 60us so important?

An RC filter with a 60us time constant (tau) has its cutoff at 2652Hz, but that means that in 60us the speaker cone has moved only 63% of the way to its final position.
But if you mean that in 60us the speaker has already reached 99% of its final position, then 60us is 5*tau, so the cutoff is 13263Hz.
If 95%, then 60us is 3*tau, so 7958Hz.
Those are all -3dB cutoff frequency points.
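
Those three figures all come from fc = 1 / (2 * pi * tau), with tau taken as 60us divided by the number of time constants. A small sketch of the arithmetic (the helper name is hypothetical):

```c
#include <assert.h>

#define PI 3.14159265358979323846

/* -3 dB cutoff of a first-order RC low-pass whose step response is treated
   as "settled" after settle_s seconds, where settling is defined as n_tau
   time constants (1 -> ~63%, 3 -> ~95%, 5 -> ~99%). */
double rc_cutoff_hz(double settle_s, double n_tau)
{
    assert(settle_s > 0.0 && n_tau > 0.0);
    double tau = settle_s / n_tau;
    return 1.0 / (2.0 * PI * tau);
}
```

`rc_cutoff_hz(60e-6, 1)`, `rc_cutoff_hz(60e-6, 3)` and `rc_cutoff_hz(60e-6, 5)` reproduce the 2652Hz, 7958Hz and 13263Hz values to within rounding.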

But you don't know the shape how the cone moves, you only seem to care it takes 60us to move.
Does it move linearly in 60us?
Does it have RC shape and it moves in 60us?
Does it move in 40us and keep ringing around the end position for the rest of the 20us?

superfury wrote:

How do the two relate to each other, if at all? Does increasing/decreasing the cutoff frequency change the emulated impulse response?

If you think of an RC filter's step response (not the impulse response for now!): if you increase the capacitor value, the cutoff frequency comes down, and with a step input the capacitor takes longer to charge, so its response to an input step is longer.

superfury wrote:

If so, what should the cutoff frequency be (using an IIR filter) to achieve the correct impulse response for a PC Speaker to properly convert PWM input (as a formula to be loaded at runtime for accuracy, floating point format)?

First of all, I don't think anybody has measured PC speaker output frequency response, and it will of course depend on if you want to know the response of real IBM PC 5150 speaker or any other computer that has a different speaker, different circuitry and different acoustics because of the case, so it's arbitrary.

The main point of the filtering is really to prevent aliasing when downsampling. It will have the correct response to any PIT output, square wave or PWM. The more quality you want, the more complex the filtering needed to achieve it. It is a tradeoff between the CPU time used to filter and the sound quality (how much aliasing is allowed to be heard), and with a simple IIR filter the quality target also defines how early you must start filtering away sounds that would otherwise be heard.

Reply 35 of 66, by Scali

Rank l33t
Jepael wrote:

There is no aliasing if the frequencies that would get aliased do not exist, that's why they are first filtered away with a low pass filter before decimating (throwing away unneeded samples).

That feels a bit counter-intuitive... because your filter is not 'synchronized' with the output, and therefore, can you really say that all samples are equally 'important', and it doesn't matter which ones you discard? I would say depending on which one you pick, you'd get a sort of phase-shift.
Intuitively I would say it would make more sense to take the average of all values, rather than just discarding them completely.

But I suppose what you're saying is that the filter would have removed any high-frequency data, so if you were to take 27 adjacent samples, their standard deviation would be extremely small, and averaging has virtually no effect on the resulting value.

Jepael wrote:

And you are correct that when using a FIR filter, it is possible to calculate the other way around, given an output sample, which input samples affect it and how much. But since the resampling ratio is 27 (Assume integer for now), you need 27 different phases of the filter. It's called polyphase filtering.

I suppose in theory you are right... but if I look at graphics filtering... we have a huge difference between theoretically 'perfect' filters, and what we actually apply in realtime graphics. Realtime filtering of textures for example, is done with anisotropic filtering and mipmapping these days. It's not perfect, but it's a reasonable approximation. And it can be implemented mostly with simple bilinear interpolation circuits.
I suppose it all depends on how much processing power you are willing to spend on it, and how close you want to get to the 'perfect' filter.
I guess rule of thumb is: if it sounds good, it is good.

Jepael wrote:

The worst part is that the ratio is not an integer. Some emulators (UADE?) use a neat trick: they do not use the impulse response directly, but precalculate it into a step response with, say, a few thousand phases. That way, all the things affecting the sound on the digital and analog path are emulated. The emitted pulses or steps are called BLIPs or BLEPs (band-limited impulses and band-limited steps).

Ah yes, I've seen something like this in the ProTracker Win32 clone. I've only seen the source, so I can't really make much of it. It's a bunch of 'magic' constants, but there's a method behind the madness I suppose: https://github.com/pachuco/fl-pt2play/blob/ma … pt2play/Blep.as
The goal is indeed to try to model a physical Amiga's entire signal path.

Reply 36 of 66, by Jepael

Rank Oldbie
Scali wrote:

That feels a bit counter-intuitive... because your filter is not 'synchronized' with the output, and therefore, can you really say that all samples are equally 'important', and it doesn't matter which ones you discard? I would say depending on which one you pick, you'd get a sort of phase-shift.
Intuitively I would say it would make more sense to take the average of all values, rather than just discarding them completely.

But I suppose what you're saying is that the filter would have removed any high-frequency data, so if you were to take 27 adjacent samples, their standard deviation would be extremely small, and averaging has virtually no effect on the resulting value.

The input data is already filtered (think of blurred/smeared pixels in image processing) so high frequencies are removed and each input sample (pixel) already depends on many previous input samples (surrounding input pixels). Then only excess samples (pixels) can be thrown out because no information is lost (in the sense of information theory).

Sure, there is a phase difference, but the amplitude data is there. If you downsample an image, say by a factor of 27, most likely you want to select the pixels from the center of each 27x27 area or you'll see that the image is shifted; especially if there are, say, 13 pixel borders, it will definitely look shifted if the center is not used. In the same way, an edge of a square wave can happen at any of the 27 PIT samples during one output sample, but humans can't hear the phase difference. The rising edge still happens, and there should be 27 possible phases in the resulting downsampled data for how the edge rises. Averaging is not correct, because an edge in the PIT samples would always result in a downsampled stream of 0,X,27, where X is from 0 to 26 depending on the phase where the edge happened. Just plot the frequency response of a FIR filter with 27 taps of value 1/27; the problem is already apparent when averaging just two input samples (FIR filter, 2 taps, both of value 0.5).

Another example is to record a 1kHz pure sine wave with a sound card. Record it at 44100Hz, throw away every other sample, and you still have a 1kHz pure sine wave if you play it back at 22050Hz. It does not matter if you throw out the even or the odd samples. You can't tell either one apart from a recording sampled directly at 22050Hz. Well, of course there is some noise between 11kHz and 22kHz that will alias down, but I don't think anyone would hear it.

Reply 37 of 66, by superfury

Rank l33t++

As far as I understand it, the PC speaker takes about 60us(which equals, when rounded down to PIT samples, 72 PIT samples used in known applications) to move the cone fully from one side to another(from 0V to 5V position or from 5V to 0V position). As far as I understand it, besides making the signal fit for downsampling, it also transforms it in a way which simulates the frequency response (~60us) of the speaker by moving the cone gradually instead of instantly(instead of producing a pure square wave in PCM, it's converted to something that resembles, at the least, the samples output on the PC Speaker's slow response), which is a side effect of the filtering process.

Reply 38 of 66, by Scali

Rank l33t

As mentioned, I think the '60us' figure is pretty nonsensical.
Firstly, at best it's a ballpark figure, since there are tons of different PCs out there with tons of different speakers, not to mention different motherboards with different circuitry driving it.
I think this figure just popped up somewhere at one time as someone's guesstimate, and started to lead a life of its own, implying that it's far more accurate and important than it really is.

Secondly, the frequency response is not purely dependent on how quickly the cone moves from one side to the other. Modeling a speaker's response is a very complex operation.
Using a properly configured low-pass filter might get you a reasonable approximation, because speakers tend to have gradual dropoff towards the top of their frequency range.
However, that totally ignores any peaks or drops in the lower and mid ranges.
The PC speaker is very small, so its bass response is not that strong either. And, if you factor in the case and speaker holder, you also have quite some resonant peaks going on in the midrange.

So I think you have to be realistic here: Either you're not modeling a speaker, and shouldn't bother with 'exact' figures like the 60us, but rather find a simple low-pass filter that 'sounds good'.
Or you're modeling a speaker, and your model needs to be far more advanced than what you have now.

Last edited by Scali on 2016-12-30, 08:15. Edited 1 time in total.

Reply 39 of 66, by gdjacobs

Rank l33t++

Exactly. Adopt an approximate method to generate PCM output with reasonable efficiency or properly model the response curve of a target PC speaker (from the 5150, for example).
