Now I need to overlay several short voice recordings on the same long music file, near the end of the music. Ideally, I'd also lower the volume of the background music slightly while the voices are playing and, if the voices run longer than the remaining music, stretch the audio so that all the voice recordings fit.
Fortunately, I don't need to do this in real time: I can combine the audio in advance, before it's played in the voice channel. But I don't know how to do this mixing automatically.
As far as I understand, I need to call #ffmpeg from my #Go code?