There were two main problems:
- word_buffer was being filled as though with unsigned samples,
but during mixing all samples are kept in signed mode
- If the first buffer was stopped, the voices_active flag got set
anyway, even though the output buffer wasn't initialized yet,
so the samples were mixed with indeterminate data
We also cover the case where no buffer was playing, and ensure
the output buffer is filled.
This now works much better. Tested on neotrellis m4 playing back
4 mp3 streams at a time in signed-16, 22050Hz.
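
The two fixes above can be sketched roughly as follows. This is a minimal
illustration, not the actual mixer code: u16_to_s16, sat_add16, mix_voices
and the out_initialized flag are hypothetical names, and a stopped voice is
modelled as a NULL pointer. The point is that the "output has data" flag is
set only once a voice has actually written to the buffer, and that silence
is written when nothing played.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: re-center an unsigned 16-bit sample so that the
 * mixer only ever sees signed data. */
static inline int16_t u16_to_s16(uint16_t s) {
    return (int16_t)((int32_t)s - 32768);
}

/* Add two signed 16-bit samples, clamping instead of wrapping. */
static inline int16_t sat_add16(int16_t a, int16_t b) {
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Mix n_voices signed buffers into out. A NULL entry stands for a
 * stopped voice (an assumption made for this sketch). */
static void mix_voices(int16_t *out, size_t n_samples,
                       const int16_t *const *voices, size_t n_voices) {
    assert(out != NULL);
    bool out_initialized = false;

    for (size_t v = 0; v < n_voices; v++) {
        if (voices[v] == NULL) {
            continue; /* stopped voice: must not mark out as initialized */
        }
        if (!out_initialized) {
            /* First active voice: copy rather than add to stale data. */
            memcpy(out, voices[v], n_samples * sizeof *out);
            out_initialized = true;
        } else {
            for (size_t i = 0; i < n_samples; i++) {
                out[i] = sat_add16(out[i], voices[v][i]);
            }
        }
    }

    if (!out_initialized) {
        /* No voice was playing: still fill the buffer, with silence. */
        memset(out, 0, n_samples * sizeof *out);
    }
}
```

With this shape, a stopped first voice followed by an active second voice
yields exactly the second voice's samples, whatever was in out beforehand.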
This removes downscaling (halving-add) when multiple voices are
being mixed. To avoid clipping, and get similar behavior to before,
set the "level" of each voice to (1/voice_count).
Slow paths that were applicable to only M0 chips were removed.
As a side effect, the internal volume representation is now 0 ..
0x8000 (inclusive), which additionally makes a level of exactly 0.5
representable.
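
As an illustration of the 0 .. 0x8000 level range, here is a minimal sketch
of how a float level might be converted and applied. LEVEL_MAX,
level_from_float and apply_level are hypothetical names chosen for this
example, not the identifiers used in the mixer. With an inclusive maximum
of 0x8000, a level of 0.5 maps to exactly 0x4000 and a level of 1.0 leaves
samples unchanged.

```c
#include <assert.h>
#include <stdint.h>

#define LEVEL_MAX 0x8000u /* fixed-point representation of 1.0 */

/* Convert a float level to the 0 .. 0x8000 range, clamping out-of-range
 * input and rounding to the nearest step. */
static inline uint16_t level_from_float(float level) {
    if (level <= 0.0f) return 0;
    if (level >= 1.0f) return LEVEL_MAX;
    return (uint16_t)(level * (float)LEVEL_MAX + 0.5f);
}

/* Scale one signed 16-bit sample by a fixed-point level. The product
 * fits in 32 bits because |sample| <= 0x8000 and level <= 0x8000. */
static inline int16_t apply_level(int16_t sample, uint16_t level) {
    assert(level <= LEVEL_MAX);
    return (int16_t)(((int32_t)sample * (int32_t)level) / 32768);
}
```

For example, with four voices each set to level 1/4 (0x2000), a full-scale
sample of 32767 scales to 8191 per voice, so the four-voice sum of 32764
stays inside the signed 16-bit range without the old halving step.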
Testing performed, on PyGamer: For all 4 data cases, for stereo and
mono, for 1 and 2 voices, play pure sine waves represented as
RawSamples and view the result on a scope and through headphones.
Also, scope the amount of time spent in background tasks.
Code size: growth of +272 bytes
Performance (time in background task when mixing 2 stereo 16-bit voices):
76us, down from 135us (once per ~2.9ms long term average)
(Decrease from 4.7% to 2.4% of all CPU time)
These arguments are constrained to be compile-time constants, a fact
that gcc complains about under "-Og" optimization, but not in normal
builds. Declare them as enumerated types.