@baldur IMO it's something users can easily notice: I've integrated a distilled French-targeted Whisper model into my day-to-day usage to understand what people tell me in voice messages, and even during tests I could SEE that it was making some stuff up.
I'm not talking about mistakes, like mistaking one word for another. I mean straight-up sentences that appear out of thin air if the speaker sighs too loudly.
Thankfully I don't rely solely on the transcription...
@Poslovitch @baldur
o.0
'In an example they uncovered, a speaker said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.”
But the transcription software added: “He took a big piece of a cross, a teeny, small piece ... I’m sure he didn’t have a terror knife so he killed a number of people.”'