VOGONS


First post, by mkarcher

Rank Oldbie

I'm working on reverse engineering the optional signal processor chip (called ASP = Advanced Signal Processor or CSP = Creative Signal Processor), the CT1748A. I have gotten quite far with it already. It seems the SB16 card I am using is slightly broken, though, most likely inside the ASP. As a testbed, I used the DOS tools: I installed the most recent SBBASIC, installed the SB CSP upgrade disk over it (it is older and replaces some files with older versions) and finally re-installed SBBASIC over that directory. I encountered the same pattern of working / non-working wave playback in Windows 3.1 and Windows 95.

What works (mostly) fine on that card

  • µlaw/alaw playback (mono and stereo)
  • ctadpcm playback (mono)
  • QSound

What fails (very noisy):

  • ctadpcm playback (stereo)

I don't remember the IMA ADPCM results anymore; I would have to recheck them in Windows. The Creative DOS tools do not support IMA ADPCM wave files. I used Sound Blaster 16 Audio Compression Comparison (DOS) as the source for wave files. I actually managed to find out which aspect of the CSP misbehaves, and as I can also understand most of the CSP instructions quite well, I was able to create a modified ctadpcm playback CSP program that works fine on my card.

Attachment: wo0200.zip (2.48 KiB, 12 downloads)
File comment: CTADPCM playback CSP program - BAK = original; CSP = patched
File license: Fair use/fair dealing exception

The ZIP file contains WO0200.bak (the wo0200.csp as shipped by Creative Labs) and WO0200.CSP (the patched program that works fine on the test card). This file is located in %SOUND%\CSP and is loaded dynamically by Creative's PLAY.EXE when a compressed wave file is played. I'm very interested to hear whether the issue I am facing is limited to the test card I have at hand, or whether other cards are affected, too. Thus, I would like to hear reports on whether other SB16 ASP owners also face issues with playing back compressed stereo files, and if so, whether my patched program helps (as I only provide WO0200.CSP and not WFM0200A.CSP yet, you can only test in DOS, not in Windows).

Furthermore, detailed documentation for the ST18932 DSP core would be helpful. I already have (and thus don't need pointers to):

  • An ST databook containing the full ST18930 datasheet including opcode assignments with all bit patterns.
  • The ST18933 datasheet (which refers to the 18932 core datasheet for core details); it has some example code in the appendix. The example code helps in understanding the architecture.
  • The ST18940 datasheet (an edition that describes mnemonics and coarse assembler syntax, but doesn't contain the bit assignments).
  • A block diagram of the ST18932 core taken from patent US5710934.
  • The Sound Blaster Hardware Programming Guide (it contains moderately helpful block diagrams of the card and the specification highlights of the ASP).

EDIT: A CT2760 with ASP as a second card to compare to, as well as PLCC68 sockets, are already ordered and should arrive within the next few days. I should have three ASP-capable cards and two ASP chips once everything has arrived.

Reply 1 of 10, by Stiletto

Rank l33t++

I always knew my digging up the ASP's identity out of USENET archives (and probably QuestStudios Forums archives too) would come in handy one day!
[EDIT: Let's give credit for digging it up out of USENET archives to Cloudshatze: https://web.archive.org/web/20140909172843/ht … hp?topic=2421.0 . I just was one of the ones who made sure that info did not get lost after QuestStudios Forums went offline 😀 ]

I'll take a look around for documentation again. The last time I went looking, I did not turn up any more than you...

"I see a little silhouette-o of a man, Scaramouche, Scaramouche, will you
do the Fandango!" - Queen

Stiletto

Reply 2 of 10, by mkarcher

Rank Oldbie
Stiletto wrote on 2022-03-07, 00:20:

I always knew my digging up the ASP's identity out of USENET archives (and probably QuestStudios Forums archives too) would come in handy one day!

EDIT: Let's give credit for digging it up out of USENET archives to Cloudshatze: https://web.archive.org/web/20140909172843/ht … hp?topic=2421.0 . I just was one of the ones who made sure that info did not get lost after QuestStudios Forums went offline 😀

Yes, that post was extremely helpful in getting started with the ASP stuff at all. Without the model number ST18932, I don't think I would be able to reverse the stuff Creative implemented on their cards. Sorry for not citing it here, but be assured that in my draft write-up about "what is currently known about the CSP", I had quoted that post already before you mentioned it; I still need to check whether I included proper attribution.

I'm going to publish info about CSP development soon, I hope.

Reply 3 of 10, by mkarcher

Rank Oldbie

The CT2760 with CSP arrived. As I already expected, CTADPCM playback works out-of-the-box on that card, both with the original CSP program by Creative Labs as well as with my patched program. The programs differ slightly in how they time data transfer from the PC to the CSP. On the CT2830 I used before, the original Creative Labs CSP program creates a lot of noise. This is strong evidence that something on that CT2830 is not working correctly.

The CT1971 (= EMU8000) gets quite warm (I guess around 50 to 60 °C), but as far as I know, this is normal on old revisions of the EMU8000 chip.

Reply 4 of 10, by ViTi95

Rank Member

I'm also very interested in this topic, and wonder if it's possible to program the CSP directly and use it as a coprocessor. A few days ago I saw someone selling an ISA devkit card for the ST18932, but it's not very cheap.

https://www.ebay.es/itm/202676573054?ssPageNa … K%3AMEBIDX%3AIT

EDIT: You can also find information about the TS68390; it's the first implementation of the ST18 core. A full datasheet of that model is available online.
EDIT2: How are you reverse engineering the firmware? Neither IDA Pro nor Ghidra supports the ST18. And I've been searching for the SDK and it's nowhere to be found.


Reply 5 of 10, by mkarcher

Rank Oldbie
ViTi95 wrote on 2022-03-10, 08:55:

I'm also very interested in this topic, and wonder if it's possible to program the CSP directly and use it as a coprocessor. A few days ago I saw someone selling an ISA devkit card for the ST18932, but it's not very cheap.

Yes, that's possible. You are quite limited by the program space of just 512 (32-bit) words and the ERAM size of just 512 (16-bit) words. If my first estimates are correct, that wouldn't even be enough for an MPEG-1 accelerator (which would be an awesome use of the CSP if it could do that).

ViTi95 wrote on 2022-03-10, 08:55:

Yeah, I've seen that offer too, because another board member pointed me to it in a PM. I don't think I need an ST18932 devkit card at that price, especially if it comes without any software and documentation. I already have an underdocumented ST18932 development card including basic program upload software. The hardware is called the SB16 ASP, and it's public knowledge how to upload "fully prepared" CSP programs (e.g. the ALSA Linux drivers know how to upload CSP programs). I might have considered buying that card if software and documentation were included, because that's the stuff I don't already have.

ViTi95 wrote on 2022-03-10, 08:55:

EDIT: You can also find information about the TS68390; it's the first implementation of the ST18 core. A full datasheet of that model is available online.
EDIT2: How are you reverse engineering the firmware? Neither IDA Pro nor Ghidra supports the ST18. And I've been searching for the SDK and it's nowhere to be found.

The TS68930 (you have the digits mixed up) is the NMOS predecessor of the ST18930, which is manufactured in CMOS. A full datasheet (including detailed opcode layout) is available in the databook http://bitsavers.informatik.uni-stuttgart.de/ … pplications.pdf starting at page 554. That's the most comprehensive documentation I have about the ST18 architecture. I used that documentation to roll my own disassembler. It turns out that the ST18932 core is (mostly) binary compatible with the ST18930 core, but has some extra capabilities.

As Creative Labs is Creative Labs, they had to do something funny: They mix up the bits in the instructions to obfuscate them. You can (de)obfuscate the DSP code bytes using the formula

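/* Swaps bit pairs 7<->2, 6<->3, 5<->0 and 4<->1. The mapping is its own
   inverse, so the same function obfuscates and deobfuscates. */
unsigned char obfuscate_byte(unsigned char x)
{
    return ((x & 0xA0) >> 5) |   /* bits 7,5 -> bits 2,0 */
           ((x & 0x50) >> 3) |   /* bits 6,4 -> bits 3,1 */
           ((x & 0x0A) << 3) |   /* bits 3,1 -> bits 6,4 */
           ((x & 0x05) << 5);    /* bits 2,0 -> bits 7,5 */
}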
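
Because the formula only swaps bit pairs (7 with 2, 6 with 3, 5 with 0, 4 with 1), applying it twice gives back the original byte, which is why one routine is enough for both directions. A minimal sketch that applies it to a whole buffer and checks the self-inverse property for all 256 byte values; the names csp_scramble_byte, csp_scramble_buffer and csp_scramble_is_involution are made up for this example:

```c
#include <stddef.h>

/* Same bit permutation as the formula above; the name is illustrative. */
static unsigned char csp_scramble_byte(unsigned char x)
{
    return (unsigned char)(((x & 0xA0) >> 5) |
                           ((x & 0x50) >> 3) |
                           ((x & 0x0A) << 3) |
                           ((x & 0x05) << 5));
}

/* Apply the permutation in place to every byte of a buffer. */
static void csp_scramble_buffer(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = csp_scramble_byte(buf[i]);
}

/* Returns 1 if scrambling twice is the identity for all byte values. */
static int csp_scramble_is_involution(void)
{
    for (unsigned v = 0; v < 256; v++) {
        unsigned char once = csp_scramble_byte((unsigned char)v);
        if (csp_scramble_byte(once) != v)
            return 0;
    }
    return 1;
}
```

Calling csp_scramble_buffer on the raw file bytes should give the plain instruction bytes, and calling it again should restore the file contents.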

A full CSP program consists of little-endian formatted obfuscated 32-bit instruction words, a 16-bit checksum over the obfuscated bytes, and a 16-bit program ID which is just stored inside the 8052 DSP RAM and can be read back later to identify the program currently uploaded to the CSP. The DOS files "WOxxxx.CSP" and "WIxxxx.CSP" are plain CSP programs (obfuscated, including checksum and ID). The Windows WFMxxxxA.CSP files are RIFF containers that hold CSP programs, describe the supported sample rates and effects, and provide the possibility to upload a "setup" program before the "main" program. Uploading one or more setup programs makes sense, because you can only access the code space of the CSP using the DSP. To initialize the data memory (that chip has a whopping 192 + 128 + 512 words in the three RAM areas, and there is likely constant coefficient ROM on the CSP) it makes sense to first upload some code that writes initialization values or filter coefficient tables into the RAM, and then replace that code with the main program that uses the initialized RAM.
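
The layout of a plain WOxxxx.CSP / WIxxxx.CSP file described above could be split up roughly like this. This is only a sketch under assumptions: that the checksum and the ID are the last four bytes in that order, and that both fields are stored little-endian like the instruction words. The checksum algorithm is not given here, so the field is only read and not verified. All names (csp_image, csp_parse, csp_word, csp_unscramble) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CSP_MAX_WORDS 512u  /* program space size mentioned in the thread */

/* Illustrative summary of a plain CSP program file. */
struct csp_image {
    size_t   word_count;  /* number of 32-bit instruction words */
    uint16_t checksum;    /* raw field, NOT verified (algorithm unknown here) */
    uint16_t program_id;  /* raw field */
};

/* Bit permutation from the thread; it is its own inverse. */
static uint8_t csp_unscramble(uint8_t x)
{
    return (uint8_t)(((x & 0xA0) >> 5) | ((x & 0x50) >> 3) |
                     ((x & 0x0A) << 3) | ((x & 0x05) << 5));
}

/* Splits the image into its fields. Returns 0 on success, -1 if the size
   does not fit "N * 4 bytes of code + 2 bytes checksum + 2 bytes ID". */
static int csp_parse(const uint8_t *data, size_t len, struct csp_image *out)
{
    if (len < 4 || (len - 4) % 4 != 0 || (len - 4) / 4 > CSP_MAX_WORDS)
        return -1;
    out->word_count = (len - 4) / 4;
    /* ASSUMPTION: trailer byte order is little-endian. */
    out->checksum   = (uint16_t)(data[len - 4] | (data[len - 3] << 8));
    out->program_id = (uint16_t)(data[len - 2] | (data[len - 1] << 8));
    return 0;
}

/* Returns the deobfuscated instruction word at index idx.
   The caller must make sure idx < word_count. */
static uint32_t csp_word(const uint8_t *data, size_t idx)
{
    const uint8_t *p = data + 4 * idx;
    return (uint32_t)csp_unscramble(p[0]) |
           ((uint32_t)csp_unscramble(p[1]) << 8) |
           ((uint32_t)csp_unscramble(p[2]) << 16) |
           ((uint32_t)csp_unscramble(p[3]) << 24);
}
```

The RIFF-based WFMxxxxA.CSP files would need an extra container parser in front of this, which is left out here.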

More details will be published (hopefully) soon, once I have polished my notes and filled in some blank spots.

Reply 6 of 10, by ViTi95

Rank Member

Wow, this is insane. I have some SB cards with the CSP, and can provide testing if required. My idea is to accelerate the sound engine in FastDoom, mainly the stereo separation and volume calculations. Maybe this is the first time an ISA card could be used as an accelerator on old systems.

Reply 7 of 10, by Disruptor

Rank Oldbie
ViTi95 wrote on 2022-03-10, 21:47:

Wow, this is insane. I have some SB cards with the CSP, and can provide testing if required. My idea is to accelerate the sound engine in FastDoom, mainly the stereo separation and volume calculations. Maybe this is the first time an ISA card could be used as an accelerator on old systems.

Well, have you ever tried QSound ?

Reply 8 of 10, by mkarcher

Rank Oldbie
ViTi95 wrote on 2022-03-10, 21:47:

Wow, this is insane. I have some SB cards with the CSP, and can provide testing if required. My idea is to accelerate the sound engine in FastDoom, mainly the stereo separation and volume calculations. Maybe this is the first time an ISA card could be used as an accelerator on old systems.

Don't get your hopes up too high: The CSP does not have enough RAM to store samples on the card, so you have to stream sample data through the normal playback DMA channel. While the CSP can temporarily halt playback DMA (and does so when in decompression mode to reduce the data rate to 50% in aLaw/µLaw or 25% in ctadpcm/IMA ADPCM), it cannot run playback DMA at a higher speed than 32 bit (16 bit stereo) per sample interval. The CSP also doesn't have the possibility to serve two DMA channels (like getting one stream from DMA1 and one stream from DMA5). So there is no hardware mixing of arbitrary multiple streams.
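
To put numbers on that limit, here is a small sketch of the resulting host-side DMA data rate. It assumes the upper bound of 32 bits (4 bytes) per sample interval from above and takes the percentage of the uncompressed rate as a parameter; the function and macro names are invented for the example.

```c
#include <stdint.h>

/* Upper bound from the thread: 32 bits (one 16-bit stereo frame)
   per sample interval. */
#define CSP_MAX_DMA_BYTES_PER_INTERVAL 4u

/* Bytes per second the host has to deliver via playback DMA.
   percent_of_pcm: 100 for uncompressed 16-bit stereo, 50 for aLaw/uLaw,
   25 for ctadpcm/IMA ADPCM (figures as given in the post above). */
static uint32_t csp_dma_bytes_per_second(uint32_t sample_rate_hz,
                                         uint32_t percent_of_pcm)
{
    return sample_rate_hz * CSP_MAX_DMA_BYTES_PER_INTERVAL
           * percent_of_pcm / 100u;
}
```

At 44100 Hz this gives 176400 bytes/s uncompressed, 88200 bytes/s at 50% and 44100 bytes/s at 25%. There is no setting above 100%, which is the point of the paragraph above.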

I think the QSound program for the DSP is the closest you can get with uncompressed digital audio: It contains an "effect chain" of a digital volume control, a routing module that pans the left channel playback DMA data, the right playback DMA data, the left ADC input and the right ADC input into arbitrary positions of an artificially widened stereo space. The widening happens by mixing a negative low-pass filtered edition of the left channel into the right channel (i.e. adding anti-left sound to the right channel, so the crosstalk from the left speaker to the right ear is cancelled to a certain amount), and mixing a low-pass filtered edition (with a slightly different filter window function, so it sounds less artificial) of the right channel into the left channel. IIRC this stereo enhancement only kicks in if you pan the four sources outside of the normal left-to-right spectrum. The third effect is a digital VU meter that feeds back the output level after the filter chain.
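
The widening step can be illustrated with a rough sketch. This is not the actual QSound code: the one-pole low-pass, the float arithmetic, the single "amount" parameter and the assumption that both cross-feeds are subtracted are all stand-ins chosen for the example, and every name is hypothetical. Giving the two filters different alpha values mimics the "slightly different filter" for the second channel.

```c
/* One-pole low-pass; the real filter shape is not known here. */
struct onepole {
    float state;  /* last output */
    float alpha;  /* 0..1, higher = less filtering */
};

static float onepole_step(struct onepole *f, float in)
{
    f->state += f->alpha * (in - f->state);
    return f->state;
}

struct widener {
    struct onepole lp_left;   /* filters left before it goes to right */
    struct onepole lp_right;  /* filters right before it goes to left */
    float amount;             /* cross-feed strength (assumed parameter) */
};

/* Processes one stereo frame: each output gets the opposite channel's
   low-passed signal subtracted ("anti-left" into right and vice versa). */
static void widen_frame(struct widener *w, float in_l, float in_r,
                        float *out_l, float *out_r)
{
    float lp_l = onepole_step(&w->lp_left, in_l);
    float lp_r = onepole_step(&w->lp_right, in_r);
    *out_r = in_r - w->amount * lp_l;
    *out_l = in_l - w->amount * lp_r;
}
```

With amount set to 0 the frame passes through unchanged, which roughly corresponds to panning inside the normal left-to-right range.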

So what you can do using the QSound program: mix two mono samples, which you have already interleaved on your CPU, at the playback rate you configured using commands 41/42, to arbitrary positions at arbitrary volumes in the widened stereo space. Doing the same with four 8-bit samples at four arbitrary positions should be possible, too, if you design your own custom program.
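
The CPU-side interleaving mentioned above is simple. A minimal sketch, assuming 16-bit samples and that sample A goes into the first (normally "left") slot of each frame; the function name is made up:

```c
#include <stddef.h>
#include <stdint.h>

/* Packs two independent mono streams into one 16-bit stereo DMA stream.
   The card would then treat the two slots as separate sources to be
   positioned, instead of as a left/right pair. */
static void interleave_mono_pair(const int16_t *a, const int16_t *b,
                                 int16_t *out, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        out[2 * i]     = a[i];  /* slot 0: sample stream A */
        out[2 * i + 1] = b[i];  /* slot 1: sample stream B */
    }
}
```

The output buffer needs room for 2 * frames samples. Both streams have to be resampled to the same playback rate beforehand, since there is only one DMA stream.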

Reply 9 of 10, by Roxor

User metadata
Rank Newbie
Rank
Newbie

I had a SB16 ASP back in the day (still got the box, manuals and software, but not the actual card), and I've often wondered about this little-used feature.

It's surprising how limited the chip is. 512 words of memory is tiny. It's like those programming games from Zachtronics. I have to wonder if perhaps Creative shot themselves in the foot by picking such a limited chip, or not providing it with a decent amount of external RAM (is it even supported?).

I think you're right about MPEG decoding. It could have been a huge selling point. Imagine boasting that the AWE32 could do MPEG audio decoding on the card and bundling a Video CD playback program with it. They'd have been flying off the shelves.

With findings this interesting I'll definitely be keeping watch on this.

Reply 10 of 10, by mkarcher

Rank Oldbie
Roxor wrote on 2022-03-12, 06:27:

I had a SB16 ASP back in the day (still got the box, manuals and software, but not the actual card), and I've often wondered about this little-used feature.

It's surprising how limited the chip is. 512 words of memory is tiny.

The 512 words of "external RAM" are supplemented by two banks of internal RAM: the X-RAM (192 words) and the Y-RAM (128 words). So the 512 words are "just external working space", but even combined with the internal memory, the amount is too small for complex transform-based synthesis (like MPEG) or delay-based effects like chorus/reverb, especially if you run at 44.1 kHz stereo. 512 words is 256 stereo samples, which makes a maximum delay of about 5.8 ms at 44.1 kHz (11.6 ms would only be reachable with a mono signal).
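
The delay figure follows directly from buffer size, channel count and sample rate. A minimal sketch of that arithmetic (the function name is made up, and it assumes one 16-bit word per sample per channel): 512 words hold 512 mono samples or 256 stereo frames, so at 44.1 kHz the buffer spans roughly 11.6 ms of mono or 5.8 ms of stereo audio.

```c
#include <stdint.h>

/* Longest delay (in microseconds, rounded down) that fits into a buffer
   of `words` 16-bit words, one word per sample per channel. */
static uint32_t max_delay_us(uint32_t words, uint32_t channels,
                             uint32_t sample_rate_hz)
{
    uint64_t frames_times_1e6 = (uint64_t)words * 1000000u / channels;
    return (uint32_t)(frames_times_1e6 / sample_rate_hz);
}
```

Even when the X-RAM and Y-RAM (192 + 128 words) are added on top, the total stays far below what a reverb tank needs.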

Compare that to the AWE32. The EMU8000 chip has a chorus/reverb engine requiring external "tank RAM". The original AWE32 had a 64Kx4 RAM chip (16 kWords), but later editions downgraded that to a 16Kx4 RAM chip (4 kWords), so even the cheap cards have 8 times the capacity of the SB16 ASP. I don't think the drivers ever used the full 16 kWords, because the AWE programming guide does not contain any hints on RAM size distinctions for their magic "initialization words". Actually, the reverb/chorus engine of the EMU8000 is a second DSP-type engine using a branchless "endless loop" execution model. When I am done with reverse engineering the CSP, I might try to take a shot at reverse engineering the EMU8000 effect engine DSP. While we don't have any documentation for it, the programming model of that effect engine seems to be inspired by the Ensoniq ES5510 effects processor, which was used on the effects add-on of the Soundscape Elite, for example (and can be emulated by MAME). Likely that programming model was state-of-the-art at that time.
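
The word counts above come from converting the "depth x width" chip organization into 16-bit words. A small sketch of that conversion, with a made-up function name:

```c
#include <stdint.h>

/* Converts a RAM chip organization (depth x width in bits) into the
   number of 16-bit words it can hold. */
static uint32_t ram_words16(uint32_t depth, uint32_t width_bits)
{
    return depth * width_bits / 16u;
}
```

For 64Kx4 this gives 16384 words, for 16Kx4 it gives 4096 words, and 4096 / 512 is the factor of 8 compared to the ERAM of the CSP.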

Considering this statement from ST,

ST advertisement wrote:

Creative Technology SB16's ASP is a digital signal processor based ASIC from SGS-Thomson's ST18932. This design consists of five memory elements - four 512 words x 8bits program RAMs, a 512x16 data RAM - plus the DSP megacell.

Power consumption is 350mW and 0.5mW standby. The 13 MIPS application specific DSP core has been used for data compression techniques. The customized algorithm implemented in the ST18932 assembler language succeeded in saving 75% of equivalent disk space over competitors' solution then decoding a full 16-bit stereo recording.

Development time was five months, from concept to finished prototypes.

my current impression is that the key point of the ASP was to replace the ADPCM compression feature of the Soundblaster 1.0 by something more modern, with the primary constraint being time-to-market.

The classic Soundblaster series can use the 8051 microcontroller to decode 8-bit ADPCM compressed data. This system doesn't support stereo data. Due to the limited processing power of the 8051, it is only capable of handling 11 to 13 kHz. Monaural sound at that sample rate seems unfit for a card that puts "CD quality stereo sound" as its primary feature all over the box. Being able to play back "2:1, 3:1 and 4:1 compressed audio" in hardware is a feature that was heavily advertised on earlier sound cards. The 512-word ERAM is integrated in the CT1748 chip itself, and it needs to be accessible in a single processor clock (i.e. 80ns). I don't know how much (moderately) fast SRAM could be integrated in the ST ASICs at that time, but the ASIC size might be the root cause for the low RAM amount.
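
The 80ns clock also gives a feeling for how much work fits between two output samples. A rough sketch, assuming (and this is only an assumption) that one instruction completes per 80ns clock, which is in the same ballpark as the 13 MIPS from the ST statement; the function name is invented:

```c
#include <stdint.h>

/* Approximate number of clock periods (and, under the one-instruction-
   per-clock assumption, instructions) available per sample interval. */
static uint32_t cycles_per_sample(uint32_t clock_period_ns,
                                  uint32_t sample_rate_hz)
{
    return (uint32_t)(1000000000ull /
                      ((uint64_t)clock_period_ns * sample_rate_hz));
}
```

At 80ns and 44100 Hz that is about 283 clocks per sample interval, which has to cover both channels of a stereo stream.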