This removes downscaling (halving-add) when multiple voices are
being mixed. To avoid clipping, and get similar behavior to before,
set the "level" of each voice to (1/voice_count).
Slow paths that were applicable to only M0 chips were removed.
As a side effect, the internal volume representation is now 0 ..
0x8000 (inclusive), which additionally makes a level of exactly 0.5
representable.
Testing performed, on PyGamer: For all 4 data cases, for stereo and
mono, for 1 and 2 voices, play pure sign waves represented as
RawSamples and view the result on a scope and through headphones.
Also, scope the amount of time spent in background tasks.
Code size: growth of +272 bytes
Performance (time in background task when mixing 2 stereo 16-bit voices):
76us per down from 135us (once per ~2.9ms long term average)
(Decrease from 4.7% to 2.4% of all CPU time)
These arguments are constrained to be compile-time constants, a fact
that gcc complains about under "-Og" optimization, but not in normal
builds. Declare them as enumerated types