if you didn't have access to floating point math and you wanted to mix a stereo u8 PCM sample down to a mono u8 PCM sample and have it sound ok how would you do it?
i can imagine doing some kind of smart scaling/compression to keep the apparent gain higher but i sort of dread figuring out how to do it with integer math

@d6 nope, simple sum like you've done is all there is to it. You might also first check for clipping before deciding to do the div by 2. If nothing clipped, then skip it.

@hyc i don't think that approach is great because when you are near clipping you're unscaled then suddenly you drop down by 2 once you exceed the peak. i have seen people talk about clamping at the max value but in my test cases that sounds too distorted. thanks for confirming my intuition about the sum!

@d6 no I meant, scan the entire audio stream. If nothing clips, skip the div for the whole thing. Clamping just a few samples would distort the hell out of it, yeah.

@d6 what you’re doing is fine if you need both channels preserved. Different people do different things here (l+r)*.5 is one sensible option and will preserve the level of correlated content in the left and right channels. This makes sense for most music. If the channels are not correlated, you will observe a level dip - but if you wanted to fix this, you would need a peak limiter (and clipping is one peak limiter) and that’s a whole other can of worms.

@d6 @voxel alternating samples like this is like halving the sample rate of each signal without filtering. This will cause all the high frequencies to alias. The correct way to go this route is to interleave the two signals to get a new signal at twice the rate, then down sample using a low pass filter to get the original sample rate. However, this will cause some phase distortion in the second signal as it was shifted slightly in time.

@d6 averaging is the correct way to do this. The problem you’re experiencing is non-linear human perception. (And maybe some cancellation of out of phase frequencies) You’ll just have to fudge in a volume boost… which will probably take you out of 8-bit range.
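For reference, the "scan the entire audio stream" idea from earlier in the thread might look roughly like the sketch below. It assumes interleaved stereo (L, R, L, R, ...) and unsigned 8-bit PCM where silence is 128, so one 128 bias is subtracted from the undivided sum. The function name `mix_sum_if_no_clip_u8` and its return value are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Two-pass mixdown: keep the full-level sum only if no frame would clip.
 * Assumes interleaved stereo u8 PCM (silence = 128).
 * Returns 1 if the undivided sum was used, 0 if it fell back to averaging. */
int mix_sum_if_no_clip_u8(const uint8_t *stereo, uint8_t *mono, size_t frames)
{
    int fits = 1;

    /* Pass 1: does l + r - 128 stay inside 0..255 for every frame? */
    for (size_t i = 0; i < frames && fits; i++) {
        int s = (int)stereo[2 * i] + (int)stereo[2 * i + 1] - 128;
        if (s < 0 || s > 255)
            fits = 0;
    }

    /* Pass 2: one decision for the whole stream, so the gain never jumps. */
    for (size_t i = 0; i < frames; i++) {
        int l = stereo[2 * i];
        int r = stereo[2 * i + 1];
        mono[i] = (uint8_t)(fits ? l + r - 128 : (l + r + 1) >> 1);
    }
    return fits;
}
```

Because the choice is made once per stream, this avoids the sudden drop in level near the peak, but a single loud frame still forces the whole file back to the quieter average.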
@d6 when thinking about the volume, also keep in mind that you’re going from two speakers at full volume to one speaker at full volume. Try this: add the samples, don’t divide, and clip/saturate them instead. This will cause distortion, but it should be roughly the right loudness.

@d6 Another thing to consider in your divide step is that you’re introducing aharmonic quantization error. If you want to get fancy, add one bit of noise to your 16-bit sum before dividing. This will cause your error to be white noise instead, which sounds a bit better.
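A rough sketch of the last two suggestions, integer-only: a saturating sum, and an average with one random bit added before the divide. Both assume interleaved stereo u8 PCM with silence at 128. The function names are hypothetical, and xorshift32 is only one possible noise source picked for illustration, not something anyone in the thread specified.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum without dividing, then saturate. u8 PCM is offset binary
 * (silence = 128), so one 128 bias is removed to keep the result centred. */
void mix_sat_u8(const uint8_t *stereo, uint8_t *mono, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        int s = (int)stereo[2 * i] + (int)stereo[2 * i + 1] - 128;
        if (s < 0) s = 0;
        if (s > 255) s = 255;
        mono[i] = (uint8_t)s;
    }
}

/* xorshift32: a small integer-only PRNG; *state must be nonzero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Average with one random bit added to the sum before the shift,
 * so odd sums round up or down at random instead of always truncating. */
void mix_avg_dither_u8(const uint8_t *stereo, uint8_t *mono,
                       size_t frames, uint32_t seed)
{
    uint32_t state = seed ? seed : 1u; /* avoid the all-zero state */
    for (size_t i = 0; i < frames; i++) {
        unsigned sum = (unsigned)stereo[2 * i] + stereo[2 * i + 1];
        unsigned noise = (xorshift32(&state) >> 16) & 1u; /* 0 or 1 */
        mono[i] = (uint8_t)((sum + noise) >> 1); /* max is 511 >> 1 = 255 */
    }
}
```

The dithered version cannot overflow, since the largest possible value before the shift is 511. Whether the noise is an audible improvement at 8 bits is something to judge by ear.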
currently i'm promoting each 8-bit sample to an unsigned short and averaging them, e.g. (left+right)/2.
it sounds ok but not great: as expected it gets quieter and it maybe also introduces some artifacts.
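A minimal sketch of that averaging approach, assuming the stereo data is interleaved (L, R, L, R, ...) and unsigned 8-bit with silence at 128. The function name `mix_avg_u8` is made up, and the `+ 1` before the shift is an optional tweak that rounds half up instead of truncating.

```c
#include <stddef.h>
#include <stdint.h>

/* Average interleaved stereo u8 frames into mono using integer math only.
 * Each sample is widened first so l + r (up to 510) cannot overflow. */
void mix_avg_u8(const uint8_t *stereo, uint8_t *mono, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        unsigned l = stereo[2 * i];
        unsigned r = stereo[2 * i + 1];
        mono[i] = (uint8_t)((l + r + 1) >> 1); /* +1 rounds half up */
    }
}
```

Since both inputs carry the same 128 offset, averaging the raw unsigned values keeps silence at 128 without any extra bias handling.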