Actually, I am still documenting how the chip works internally, and that work is not complete yet (I can't find as much time for it as I would like). So I can't really say yes or no to whether you are emulating it right. Most of your questions are not about the emulation itself but about bugs in the code and why it does not work, and it is pretty time-consuming to figure out from code alone what it is supposed to do versus what it actually does. Perhaps in this thread we should talk about how the chip works and how it should be emulated, instead of what's wrong in your implementation?
As to your noise+tone question: I am not 100% sure right now, but I think it goes like this. When the tone bit is high, the output is continuous 16-unit idle slots, and noise has no effect. When the tone bit is low, the output alternates between a 16-unit slot for tone and a 16-unit slot for noise, and so on. So if both tone and noise are low, you just get continuous 16-unit slots of amplitude; if tone is low and noise is high, you get a 16-unit amplitude slot for tone followed by a 16-unit idle slot for noise.
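To make that guess concrete, here is a minimal Python sketch of the slot sequence as I described it. The function name `slot_sequence`, the labels "amp" and "idle", and the choice that the tone slot comes first are all my own assumptions for illustration, not something confirmed from the chip.

```python
def slot_sequence(tone_bit, noise_bit, n_slots):
    """Return a list of 16-unit slot kinds: "amp" (amplitude pattern) or "idle".

    Assumption: slots alternate tone, noise, tone, noise, ... starting
    with a tone slot. This follows the unverified description above.
    """
    slots = []
    for i in range(n_slots):
        if tone_bit:
            # Tone bit high: everything is idle, noise has no effect.
            slots.append("idle")
        elif i % 2 == 0:
            # Tone slot while the tone bit is low: amplitude is output.
            slots.append("amp")
        else:
            # Noise slot: amplitude if noise is low, idle if noise is high.
            slots.append("idle" if noise_bit else "amp")
    return slots


if __name__ == "__main__":
    for tone in (0, 1):
        for noise in (0, 1):
            print(tone, noise, slot_sequence(tone, noise, 4))
```

If the real chip turns out to order or gate the slots differently, only the three branches inside the loop need to change.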
But anyway, I did try to look through your code, and I would like to comment on chip emulation aspects in general.
First, about the chip output. When no sound is generated, the chip output sinks no current, so the output floats at 5V, set by an external resistor. Each of the six channels has one current sink, which is either inactive (sinks no current and does not change the output voltage) or active (sinks current and pulls the voltage down). A square wave on one channel therefore enables and disables its current sink, so the output toggles between two voltage levels. With six channels playing different tones, anywhere between 0 and 6 current sinks can be enabled at any moment, so the signal coming out of the chip has only 7 analog levels.
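As a rough illustration of the 7-level output, the mixing can be sketched as simply counting active sinks. The names `active_sinks` and `output_voltage` are hypothetical, and so is the 0.5V drop per sink; the real step depends on the sink current and the external resistor, and I am assuming here that the sinks add up linearly.

```python
from itertools import product

NUM_CHANNELS = 6


def active_sinks(sink_states):
    """Count how many of the six current sinks are enabled right now."""
    return sum(1 for active in sink_states if active)


def output_voltage(sink_states, v_idle=5.0, step=0.5):
    """Map the sink count to a voltage.

    Assumption: each active sink pulls the output down by the same
    amount `step` (hypothetical value), starting from the 5V idle level.
    """
    return v_idle - step * active_sinks(sink_states)


# Enumerate every on/off combination of the six sinks and collect
# the distinct output levels that can occur.
levels = sorted({active_sinks(s) for s in product([False, True], repeat=NUM_CHANNELS)})

if __name__ == "__main__":
    print(levels)
    print(output_voltage([True, False, True, False, False, False]))
```

Whatever the actual voltage step is, the set of distinct levels only has seven entries, one for each possible count of active sinks.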
Then about the square waves. A square wave is a binary signal, so it has only two states: 1 and 0, high and low, +1 and -1, or active and inactive. This is what is fed to the output current sink, so the output voltage toggles between two voltages when tone is enabled. When you have a control that enables or disables the square wave generation (by means of the tone enable bit or by setting the amplitude), the signal still has only two states; it just stays idle in one of them when not running. So if your square wave has a peak amplitude of X and toggles between +X and -X (its peak-to-peak amplitude being 2X), there is no third state where its amplitude is zero, even if you disable it. It must be either +X or -X, which means a DC offset. Every time the signal changes state, it must jump by 2X in amplitude, whether the jump was due to enabling or disabling a channel or just the square wave itself toggling. Also, the idle state of the square wave is the one where the current sink is inactive, so the signal normally floats high. The first edge on a channel is a negative-going edge when the current sink gets enabled, and the last edge when disabling a tone is a positive-going edge.
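A small sketch of a gated two-state square wave along those lines. The function `render_channel` and its inputs are made up for illustration: `enables` is the per-sample tone enable and `phase_bits` is the per-sample state of the free-running square wave, both hypothetical ways of slicing the problem.

```python
def render_channel(enables, phase_bits, x=1.0):
    """Render one channel as samples that are always +x or -x.

    The current sink is active only when the channel is enabled AND the
    square wave is in its active half. Active pulls the output low (-x);
    otherwise the output idles high (+x). There is no zero state.
    """
    samples = []
    for enabled, phase in zip(enables, phase_bits):
        sink_active = enabled and phase
        samples.append(-x if sink_active else +x)
    return samples


# Channel is off for two samples, on for three, then off again.
enables = [False, False, True, True, True, False, False, False]
phase_bits = [True, False, True, False, True, False, True, False]
samples = render_channel(enables, phase_bits)

# Size and direction of every state change in the rendered signal.
jumps = [b - a for a, b in zip(samples, samples[1:]) if b != a]

if __name__ == "__main__":
    print(samples)
    print(jumps)
```

The point of writing it this way is that disabling a channel never produces a sample of 0; it only parks the signal at the idle level, so every jump is the full 2X.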
Then about the volume. The actual chip does not implement volume as PWM in the way most people understand PWM; it is more like PDM. It is true that a single period of chopped volume control is 16 units long, but the "active" portion of volume X is not packed into the first X units. It is somewhat cleverly distributed throughout the 16 units. So, for instance, a volume of 8 (50%) is not 8 units active + 8 units inactive, but rather 4H+4L+4H+4L, and a volume of 7 is not 7 units active + 9 inactive, but 1H+3L+4H+4L+4H. I believe the distribution of highs and lows was carefully selected for at least two reasons. First, it has more random-looking and higher-frequency content than plain PWM modulating a carrier, so the pathological cases are easier to filter out. Second, it appears that if you manage to turn on two square waves with identical phase and their amplitudes sum to 15, the result matches the waveform of a single square wave with a volume of 15.
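To compare the two patterns quoted above against plain packed PWM, here is a short sketch. It only encodes the two patterns I gave (volumes 8 and 7); I am not filling in the other volumes. Treating "L" as the active unit is my reading of the patterns (the volume 7 pattern contains seven L units), and `naive_pwm` is just the packed layout for comparison, not what the chip does.

```python
SLOT_UNITS = 16


def expand(runs):
    """Expand run lengths like [(4, "H"), (4, "L")] into a string of units."""
    return "".join(level * count for count, level in runs)


# The two distributed patterns quoted in the text.
VOLUME_8 = expand([(4, "H"), (4, "L"), (4, "H"), (4, "L")])
VOLUME_7 = expand([(1, "H"), (3, "L"), (4, "H"), (4, "L"), (4, "H")])


def naive_pwm(volume):
    """Packed PWM for comparison: all active (L) units first, then idle (H)."""
    return "L" * volume + "H" * (SLOT_UNITS - volume)


def edges_per_period(pattern):
    """Count level changes when the pattern repeats back to back.

    Index -1 wraps to the last unit, so the seam between two
    consecutive periods is included in the count.
    """
    return sum(1 for i in range(len(pattern)) if pattern[i] != pattern[i - 1])


if __name__ == "__main__":
    for name, pat in (("vol 8", VOLUME_8), ("vol 7", VOLUME_7)):
        print(name, pat, pat.count("L"), edges_per_period(pat))
    print("pwm 8", naive_pwm(8), edges_per_period(naive_pwm(8)))
```

Both distributed patterns have twice as many edges per 16-unit period as the packed version, which fits the idea that the energy is pushed to higher frequencies.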
I know, I should really be gathering and posting my findings.