VOGONS

Common searches


DOSBox-X branch

Topic actions

Reply 2180 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

Given the overflow does not occur, would this work:

    if (sizeof(Type) == 1) {
const Bit8u xr = signeddata ? 0x00 : 0x80;
+ Bit8s d;

len--;
- current[0] = ((Bit8s)((*data++) ^ xr)) << 8;
+ d = (Bit8s)((*data++) ^ xr);
+ current[0] = d > xr ? (d << 8) | (2 * d + 1) : d << 8;
- if (stereo)
+ if (stereo) {
- current[1] = ((Bit8s)((*data++) ^ xr)) << 8;
+ d = (Bit8s)((*data++) ^ xr);
+ current[1] = d > xr ? (d << 8) | (2 * d + 1) : d << 8;
+ }
else
current[1] = current[0];

This is based on kcgen's work at https://github.com/dosbox-staging/dosbox-staging/pull/1005.

Tested with modified test code from kcgen:

// gcc convert-8bit-16bit.c -o convert-8bit-16bit.exe
short scale_uto16(unsigned char val) {
if (val > 128)
return (((char)(val ^0x80)) << 8) | (2 * ((char)(val ^0x80)) + 1);
else
return (char)(val ^0x80) << 8;
}

short scale_sto16(char val) {
if (val > 0)
return (val << 8) | (2 * val + 1);
else
return (val << 8);
}

void main()
{
int x;

printf("%8s %12s\n", "value", "int-shift");

unsigned char unsigned_val[] = { 0, 15, 16, 31, 32, 33, 96, 127, 128, 129, 168, 254, 255 };

for (x = 0; x < 13; x++)
printf("%8u %12d\n", unsigned_val[x], scale_uto16(unsigned_val[x]));

printf("\n%8s %12s\n", "value", "int-shift");

char signed_val[] = { -128, -96, -63, -64, -65, -33, -32, -1, 0, 1, 32, 33, 63, 64, 65, 96, 126, 127 };

for (x = 0; x < 18; x++)
printf("%8d %12d\n", signed_val[x], scale_sto16(signed_val[x]));
}

Reply 2183 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

That is a powerful tool, krcroft. Thank you for the instructions, too.

A look-up table is a great idea. Below is a modification of krcroft's code to output the look-up table in human readable format.

// gcc convert-8bit-16bit.c -o convert-8bit-16bit.exe
short scale_uto16(unsigned char val) {
if (val > 128)
return (((char)(val ^0x80)) << 8) | (2 * ((char)(val ^0x80)) + 1);
else
return (char)(val ^0x80) << 8;
}

short scale_sto16(char val) {
if (val > 0)
return (val << 8) | (2 * val + 1);
else
return (val << 8);
}

void main()
{
int x;

printf("0 to 255\n");

for (x = 0; x < 256; x++)
printf("%u=%d\n", x, scale_uto16(x));

printf("\n-128 to 127\n");

for (x = -128; x < 128; x++)
printf("%d=%d\n", x, scale_sto16(x));
}

Reply 2184 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

This code is untested, but it seems like it is structured correctly for inclusion in the mixer.cpp file:

static Bit16s Sample_16_Table[256];

// call function during mixer init
void Mixer_SetSample16Table(void) {
for (Bitu i=0;i<256;i++) {
if (i > 128)
Sample_16_Table[i]=((i-128) << 8) | (2 * (i-128) + 1);
else
Sample_16_Table[i]=(i-128) << 8;
}
}

if (sizeof(Type) == 1) {
const Bit8u xr = signeddata ? 0x00 : 0x80;
+ Bit8s d;

len--;
- current[0] = ((Bit8s)((*data++) ^ xr)) << 8;
+ d = (Bit8s)((*data++) ^ xr);
+ current[0] = Sample_16_Table[d+128-xr];
- if (stereo)
+ if (stereo) {
- current[1] = ((Bit8s)((*data++) ^ xr)) << 8;
+ d = (Bit8s)((*data++) ^ xr);
+ current[1] = Sample_16_Table[d+128-xr];
+ }
else
current[1] = current[0];

Could save a calculation for the stereo case by setting a variable to d+128-xr.

Reply 2185 of 2397, by jmarsh

User metadata
Rank Oldbie
Rank
Oldbie

I think you could make the table have 384 entries with the first 128 entries being duplicated in the last 128, then access it through a pointer to either element 0 or element 128 depending if the source is signed/unsigned.

Reply 2187 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

This is not yet compiled, but I followed your suggestion as best that I could and tried to fix the calculations. Sample_16_Table[] holds the sorted values, from low to high of the "scaled" 16-bit signed integer.

static Bit16s Sample_16_Table[256];

// call function during mixer init
void Mixer_SetSample16Table(void) {
for (Bitu i=0;i<256;i++) {
if (i > 128)
Sample_16_Table[i]=((i-128) << 8) | (2 * (i-128) + 1);
else
Sample_16_Table[i]=(i-128) << 8;
}
}

if (sizeof(Type) == 1) {
+ Bit8s d;
+
const Bit8u xr = signeddata ? 0x00 : 0x80;
+ Bit16s *tablePtr;
+ tablePtr = &Sample_16_Table[128];

len--;
- current[0] = ((Bit8s)((*data++) ^ xr)) << 8;
+ d = (Bit8s)((*data++) ^ xr);
+ current[0] = *tablePtr[d];
- if (stereo)
+ if (stereo) {
- current[1] = ((Bit8s)((*data++) ^ xr)) << 8;
+ d = (Bit8s)((*data++) ^ xr);
+ current[1] = *tablePtr[d];
+ }
else
current[1] = current[0];

If this seems correct, then I guess I could extend the array as suggested? That would save a couple more instructions during runtime?

Last edited by hail-to-the-ryzen on 2021-05-13, 23:38. Edited 1 time in total.

Reply 2188 of 2397, by Wengier

User metadata
Rank Member
Rank
Member

K.A.R.R.: I assume you used the SDL1 version. Try the SDL2 version instead in your case which will likely solve the problem you encountered.

As for the gustype, the DOSBox SVN version of gustype is never fixed, i.e. there does not exist a single "dosboxsvn" gustype I think.

Reply 2190 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

Thank you. I think that the code could be verified and possibly extended to save a few more instructions at the machine level. It is simpler now to read, but I think that xr can be removed and another set of table entries added (as per jmarsh's recommendation), but I think that it requires an independent table of entries. The xor operation in the unsigned operation creates values that I can't confirm as fitting with the current above table. I think that it is incompatible. I should actually test it with sample code first.

Reply 2191 of 2397, by krcroft

User metadata
Rank Oldbie
Rank
Oldbie

jmarsh, hail-to-the-ryzen - the lookup table suggestion is flying and generating unbiased results to boot. Thanks all round!

Code Explorer: https://godbolt.org/z/f6xddKarf

Google Benchmarks: https://quick-bench.com/q/FO-kOkUG-atr8vmaoJ6KMyW9srM

2021-05-13_18-12.png
Filename
2021-05-13_18-12.png
File size
42.62 KiB
Views
2218 views
File comment
GCC 10.2 build with -O2
File license
CC-BY-4.0
2021-05-13_18-15.png
Filename
2021-05-13_18-15.png
File size
45.53 KiB
Views
2218 views
File comment
Clang v11.0 build with -O2
File license
CC-BY-4.0
Last edited by krcroft on 2021-05-16, 01:12. Edited 5 times in total.

Reply 2192 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

That is great news, krcroft. Thank you for your work. I think there is a possible way, as I think jmarsh suggested, to avoid the xor operation on the data values. Something like this:

tablePtr = signeddata ? &Sample_16_Table[128] : &Sample_16_Table[0];

I verified that the values are correct, but I believe the Bit8s cast will truncate the unsigned values in the code, unless there are additional changes to the code. Here is another possibility, but I don't know if the change to the cast will cause any issue:

static Bit16s Sample_16_Table[256];

// call function during mixer init
void Mixer_SetSample16Table(void) {
for (Bitu i=0;i<256;i++) {
if (i > 128)
Sample_16_Table[i]=((i-128) << 8) | (2 * (i-128) + 1);
else
Sample_16_Table[i]=(i-128) << 8;
}
}

if (sizeof(Type) == 1) {
+ Bit16s d;
+
- const Bit8u xr = signeddata ? 0x00 : 0x80;
+ Bit16s *tablePtr;
+ tablePtr = signeddata ? &Sample_16_Table[128] : &Sample_16_Table[0];

len--;
- current[0] = ((Bit8s)((*data++) ^ xr)) << 8;
+ d = (Bit16s)(*data++);
+ current[0] = *tablePtr[d];
- if (stereo)
+ if (stereo) {
- current[1] = ((Bit8s)((*data++) ^ xr)) << 8;
+ d = (Bit16s)(*data++);
+ current[1] = *tablePtr[d];
+ }
else
current[1] = current[0];

Reply 2193 of 2397, by krcroft

User metadata
Rank Oldbie
Rank
Oldbie

hail-to-the-ryzen, right, with two lookup tables defined (one for unsigned and the other signed), the need to adjust the 8-bit values (with xor or offsetting by 128) can be avoided and directly looked up as jmash suggested.

Ref: lines 369 and 370 https://github.com/dosbox-staging/dosbox-stag … f965b7ae43a590d#

Reply 2194 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

Thank you, krcroft, for the help with the use of a second array to handle the 8-bit values. I wonder whether that would show better performance than the use of the xor operations that are used in the version below.

I think that this code is now functional. I had to fix the dereferencing of the pointer since the previous versions were untested and incorrect. This code shows the unbiased values for the conversion from 8-bit to signed 16-bit, but I have not yet verified the boundary cases for the values inside the emulator (-128 and 127). I don't see why the boundary cases would not work. For incorporation in dosbox-x, should just require a change of variable type names, such as from Bit32u to uint32_t. Also, the initialization of the array by Mixer_SetSample16Table() is in Mixer Init, but that may be moved (use extern if called that array table from outside the mixer.cpp file). I credit jmarsh and krcroft for the calculations and references, and I can only claim credit for any errors or mistakes in the code. 😀

diff -rupN dosbox-original/src/hardware/mixer.cpp dosbox/src/hardware/mixer.cpp
--- dosbox-original/src/hardware/mixer.cpp
+++ dosbox/src/hardware/mixer.cpp
@@ -54,6 +54,8 @@
#define MIXER_SSIZE 4
#define MIXER_VOLSHIFT 13

+static Bit16s Sample_16_Table[256];
+
static INLINE Bit16s MIXER_CLIP(Bits SAMP) {
if (SAMP < MAX_AUDIO) {
if (SAMP > MIN_AUDIO)
@@ -379,12 +381,20 @@ inline void MixerChannel::loadCurrentSam
last[1] = current[1];

if (sizeof(Type) == 1) {
+ Bit8s d;
+
+ Bit16s *tablePtr = &Sample_16_Table[128];
const Bit8u xr = signeddata ? 0x00 : 0x80;

len--;
- current[0] = ((Bit8s)((*data++) ^ xr)) << 8;
- if (stereo)
- current[1] = ((Bit8s)((*data++) ^ xr)) << 8;
+
+ d = (Bit8s)((*data++) ^ xr);
+ current[0] = *(tablePtr + d);
+
+ if (stereo) {
+ d = (Bit8s)((*data++) ^ xr);
+ current[1] = *(tablePtr + d);
+ }
else
current[1] = current[0];
}
@@ -888,6 +898,15 @@ void MIXER_Controls_Init() {
MAPPER_AddHandler(MAPPER_RecVolumeDown,MK_kpminus,MMOD1|MMOD2,"recvoldown","RecVolDn");
}

+void Mixer_SetSample16Table(void) {
+ for (Bitu i=0;i<256;i++) {
+ if (i > 128)
+ Sample_16_Table[i]=((i-128) << 8) | (2 * (i-128) + 1);
+ else
+ Sample_16_Table[i]=(i-128) << 8;
+ }
+}
+
void MIXER_Init(Section* sec) {
sec->AddDestroyFunction(&MIXER_Stop);

@@ -913,6 +932,8 @@ void MIXER_Init(Section* sec) {
mixer.recordvol[0]=1.0f;
mixer.recordvol[1]=1.0f;

+ Mixer_SetSample16Table();
+
/* Start the Mixer using SDL Sound at 22 khz */
SDL_AudioSpec spec;
Show last 2 lines
 	SDL_AudioSpec obtained;

Edit: alternatively, it should be possible to declare the pointer outside the function:

@@ -54,6 +54,9 @@ #define MIXER_SSIZE 4 #define MIXER_VOLSHIFT 13 +static Bit16s Sample_16_Table[256]; +Bit16s *tablePtr = & […]
Show full quote

@@ -54,6 +54,9 @@
#define MIXER_SSIZE 4
#define MIXER_VOLSHIFT 13

+static Bit16s Sample_16_Table[256];
+Bit16s *tablePtr = &Sample_16_Table[128];
+
static INLINE Bit16s MIXER_CLIP(Bits SAMP) {
if (SAMP < MAX_AUDIO) {
if (SAMP > MIN_AUDIO)

Reply 2195 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

Here is an updated version because the previous version of the code seemed to have lower performance than expected. I noticed this where patching pcem with the same. This version was additionally tested against the boundary values of -128, 127, and all testing was for the signed 8-bit mono sound format.

--- dosbox-original/src/hardware/mixer.cpp
+++ dosbox/src/hardware/mixer.cpp
@@ -54,6 +54,9 @@
#define MIXER_SSIZE 4
#define MIXER_VOLSHIFT 13

+static Bit16s Sample_16_Table[256];
+Bit16s *tablePtr = Sample_16_Table + 128;
+
static INLINE Bit16s MIXER_CLIP(Bits SAMP) {
if (SAMP < MAX_AUDIO) {
if (SAMP > MIN_AUDIO)
@@ -382,9 +385,9 @@ inline void MixerChannel::loadCurrentSam
const Bit8u xr = signeddata ? 0x00 : 0x80;

len--;
- current[0] = ((Bit8s)((*data++) ^ xr)) << 8;
+ current[0] = *(tablePtr + (Bit8s)((*data++) ^ xr));
if (stereo)
- current[1] = ((Bit8s)((*data++) ^ xr)) << 8;
+ current[1] = *(tablePtr + (Bit8s)((*data++) ^ xr));
else
current[1] = current[0];
}
@@ -888,6 +891,15 @@ void MIXER_Controls_Init() {
MAPPER_AddHandler(MAPPER_RecVolumeDown,MK_kpminus,MMOD1|MMOD2,"recvoldown","RecVolDn");
}

+void Mixer_SetSample16Table(void) {
+ for (Bitu i=0;i<256;i++) {
+ if (i > 128)
+ Sample_16_Table[i]=((i-128) << 8) | (2 * (i-128) + 1);
+ else
+ Sample_16_Table[i]=(i-128) << 8;
+ }
+}
+
void MIXER_Init(Section* sec) {
sec->AddDestroyFunction(&MIXER_Stop);

@@ -913,6 +925,8 @@ void MIXER_Init(Section* sec) {
mixer.recordvol[0]=1.0f;
mixer.recordvol[1]=1.0f;

+ Mixer_SetSample16Table();
+
/* Start the Mixer using SDL Sound at 22 khz */
SDL_AudioSpec spec;
SDL_AudioSpec obtained;

Reply 2196 of 2397, by awgamer

User metadata
Rank Oldbie
Rank
Oldbie

Curious, would the compiler optimize the if statement out of the for loop or would the following optimization be needed to do so? Also, the original >128 if compare makes the split lopsided by 1, 0-128, 129-256, intentional?

void Mixer_SetSample16Table(void) {
for (Bitu i=0;i<=128;i++)
Sample_16_Table[i]=(i-128) << 8;

for (Bitu i=129;i<256;i++)
Sample_16_Table[i]=((i-128) << 8) | (2 * (i-128) + 1);
}

Reply 2197 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

That is a good analysis. The table function runs at the start of emulation only, so it is not as crucial to optimize there. And as you said, the compiler may help where optimizing. I also had verified that function for accuracy in its calculation and then moved to the other part. That avoids the additional uncertainty when verifying the calculations.

On the other question, in the for loop, where i=128, then the table holds a value of 0. From 0 to 127 are the first set of non-zero values. And then from 129 to 255 are the second set of non-zero values. That is the 8-bit range here, from 0 to 255. Those end up as -128 to -1, and 1 to 127 (and 0). At least from memory that seems correct.

Reply 2199 of 2397, by hail-to-the-ryzen

User metadata
Rank Member
Rank
Member

If that was set to >127 instead of 128, then i = 128 would access the path for correcting the bias in the conversion. The way it is currently written, the i = 128 results in a table value of 0 which is expected.