ih8registrations, I hope you're not offended, but as you were posting those optimizations to my envelope caching routines, I was actually in the process of removing the envelope caching from my code. It was detrimental to the sound (not your improvements, my envelope cacher). You see, it was based on my incorrect assumption that the envelope times were linear (i.e. 50 = 50 samples, 100 = 100 samples, etc.). That was not the case. As with a lot of things on the MT-32, the timing is logarithmic: 50 = .297 seconds, whereas 100 = 22 seconds. Because I was limiting my envelope to just 400 "samples", the smaller timing numbers like 10 or 20 came out as 0, whereas on the MT-32 they produce a very fast, though noticeable, envelope (of only 20-50 actual PCM samples). So tonight I've spent my time eliminating the envelope caching code and replacing it with an on-the-fly envelope calculator. After some heavy thinking, it has actually ended up more streamlined than the original table-based envelope code.
During sample generation, the "cache code" required several branches, multiplications, and arithmetic shifts. My "on-the-fly code", on the other hand, normally requires one multiplication (imul) and one divide (idiv). I feel like such an idiot for leaving that code in there as long as I did. Thanks for your improvements. They weren't for nothing. Really. You got me looking at the caching code again (which I had ignored in favor of the filter code), and I realized it needed a replacement. Furthermore, you inspired me to think in a more optimized manner (i.e. no more unnecessary branching).
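
To give a rough idea of what a one-multiply, one-divide envelope step can look like, here is a minimal sketch in C. It is not the actual emulator code: the `EnvSegment` struct, its field names, and `env_next` are all hypothetical, and it assumes a plain linear ramp between two levels over a segment length that has already been converted to PCM samples.

```c
#include <stdint.h>

/* Hypothetical envelope segment state. The names are illustrative only
 * and are not taken from the original code. */
typedef struct {
    int32_t start_level;   /* level at the start of the segment */
    int32_t target_level;  /* level reached at the end of the segment */
    int32_t length;        /* segment length in PCM samples, must be > 0 */
    int32_t position;      /* samples elapsed so far in this segment */
} EnvSegment;

/* Returns the level for the current sample and advances one sample.
 * Inside a segment this costs one multiply and one divide; the single
 * branch only handles the end of the segment. */
static int32_t env_next(EnvSegment *seg)
{
    if (seg->position >= seg->length)
        return seg->target_level;   /* segment finished: hold the target */

    int32_t delta = seg->target_level - seg->start_level;

    /* 64-bit intermediate so delta * position cannot overflow. */
    int32_t level = seg->start_level
                  + (int32_t)(((int64_t)delta * seg->position) / seg->length);

    seg->position++;
    return level;
}
```

Because the level is computed directly from the sample position, a very short segment (a few dozen samples) is handled exactly like a long one, with no table size limit to clip it to zero.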