@fell @sponsorblock my understanding is that this would be technically pretty easy - the encoded video streams have I frames every now and then, and these frames don't depend on having existing decoder state¹, so all² you need to do server-side is slot in your ad stream before an I frame and update the stream-container metadata to make it coherent again.

¹: it's how seeking works - going straight to 1:30:45 in a video doesn't require decoding 1½ hrs of video, just looking up where the nearest I frame at or before that point is and decoding from there.
²: heh. “All”
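
For the curious, the lookup in ¹ is basically a binary search over keyframe timestamps. A rough sketch, assuming you already have a sorted list of I frame times in seconds; the `keyframes` list, the 10-second spacing, and the function name are all made up for illustration, and real containers keep this information in their own index structures:

```python
from bisect import bisect_right


def nearest_keyframe_before(keyframe_times, target):
    """Return the timestamp of the last keyframe at or before `target` (seconds).

    `keyframe_times` must be sorted in ascending order.
    """
    i = bisect_right(keyframe_times, target)
    if i == 0:
        raise ValueError("target is before the first keyframe")
    return keyframe_times[i - 1]


# Hypothetical index: one I frame every 10 seconds in a 2-hour video.
keyframes = [float(t) for t in range(0, 7200, 10)]

target = 1 * 3600 + 30 * 60 + 45  # 1:30:45 as seconds
start = nearest_keyframe_before(keyframes, target)
print(start)  # 5440.0, so the decoder only has to chew through 5 seconds
```

The same lookup would tell you where an ad could be slotted in, since a splice point has to sit right before one of those frames. The container-metadata rewriting from ² is the part this sketch doesn't touch.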