Baldur Bjarnason

“Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said - ABC News”

abcnews.go.com/US/wireStory/re

56 comments
Kévin ⏚

@baldur AI just making things up in a critical area where there are real world consequences, well I never 😲

Shaun Dyer

@baldur It’s almost like it can’t be trusted to do anything useful and it’s just a toy that is slowly melting the planet…

Corpomancer

@baldur "the health system complies with state and federal privacy laws."

Case closed. Don't worry about it.

Angua :spinny_fox_disability:

@baldur@toot.cafe

First a failure to perfect dumb voice recognition, then refusing to pay stenographers, and now this.
Add to ADM...

Hospitals invent enough by themselves as it is.

Jamie Knight

@baldur well no one saw that coming, did they?
My wife used to work in medical transcription - you CANNOT trust this job to AI, it will never, ever work.

Mike "piñata economy" Sims

@baldur Literally insane. That is the purpose of a LLM: to make stuff up. That's the PURPOSE. That's what it does. There are hundreds of voice transcription tools that work well, why would anyone ever want one that makes stuff up?

John Harris

@baldur "AI" hype is certainly eye opening regarding how far well-funded PR can push obviously bad things into public discourse. I am forced to look askance at people who still uncritically post positive news about ChatGPT.

Patty Kimura

@baldur Five years from now people will think we were nuts to place so much public trust in and corporate money on unreliable AI.

Mx. Eddie R

@baldur
What I don't get is, audio transcription is a solved problem. We have software that does a pretty good job, with reasonable energy and resource use and basic human supervision; my partner used to have a job proofreading auto-transcriptions.
Best part is, when it can't make something out, the legit transcription software goes "{UNINTELLIGIBLE}" and the human proofreader fixes it. It never invents stuff.
We don't need a way to do it worse with bigger resource use and sneaky errors.
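The workflow Eddie describes can be sketched in a few lines. This is a hypothetical illustration, not any real product's API: given per-token confidence scores from a conventional ASR engine, low-confidence spans are collapsed into an explicit {UNINTELLIGIBLE} marker for a human proofreader to fix, instead of being guessed at. The function name and the 0.5 threshold are illustrative assumptions.

```python
def render_transcript(words, threshold=0.5):
    """Render (token, confidence) pairs as text, replacing uncertain
    spans with an explicit {UNINTELLIGIBLE} marker for human review."""
    out = []
    in_unintelligible = False
    for token, conf in words:
        if conf >= threshold:
            out.append(token)
            in_unintelligible = False
        elif not in_unintelligible:
            # Collapse consecutive low-confidence tokens into one marker.
            out.append("{UNINTELLIGIBLE}")
            in_unintelligible = True
    return " ".join(out)
```

The point of the design is that the tool admits uncertainty rather than inventing content: a marker is cheap for a proofreader to resolve, while a confident-looking fabrication is not.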

Tariq

@baldur

Didn't they test it before using it?

potpie

@rzeta0 @baldur I found a summary of the test results: ¯\_(ツ)_/¯

StaringAtClouds

@baldur [In an example they uncovered, a speaker said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.”

But the transcription software added: “He took a big piece of a cross, a teeny, small piece ... I’m sure he didn’t have a terror knife so he killed a number of people.”]

"Take the umbrella" to "terror knife so he killed a number of people" is a hell of a mistranscription

And this is used to transcribe medical notes !?

And it deletes the recording "for privacy" riiight🤦‍♀️

Nini

@staringatclouds @baldur Probably a good idea not to delete the recordings, someone will need to go listen to verify that the people involved didn't suddenly start talking nonsense.

databit.me

@baldur
But what exactly? Maybe it shows how conformity is kidding us. We need to stay creative... But it's not the best way to be in charge...

Thomas Traynor

@baldur I wish they would stop using the word 'hallucination'. These systems are spouting garbage. Transcribing voice recordings is hard (I do it at least bi-weekly), and the output the automated systems produce for the meetings I run isn't even close to what was said; some passages are complete inventions. Technical terms and acronyms are 95%+ wrong in the generated transcripts.

William Canna-bass

@baldur "he found hallucinations in 8 out of every 10 audio transcriptions he inspected,"
Well then that isn't a transcription tool, it is a #BullshitEngine

William Canna-bass

@baldur
" A third developer said he found hallucinations in nearly every one of the 26,000 transcripts he created with Whisper."

William Canna-bass

@baldur
"In an example they uncovered, a speaker said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.”

But the transcription software added: “He took a big piece of a cross, a teeny, small piece ... I’m sure he didn’t have a terror knife so he killed a number of people.”"

#BullshitEngine #OpenAI

Kate Nyhan

@baldur
I've read that (fascinating) piece twice, and I see there are tons of links for the people and institutions mentioned, but the underlying studies they mention will be a real hassle to try and find without links.

Kate Nyhan

@baldur
"A University of Michigan researcher conducting a study of public meetings, for example, said he found hallucinations in 8 out of every 10 audio transcriptions he inspected, before he started trying to improve the model."
That sentence has a link but it just points to umich.edu/ - not helpful AP

(Edit, btw I say AP because the website is ABC News but the authors are two journalists from the Associated Press)

Kate Nyhan

@baldur
"A recent study by computer scientists uncovered 187 hallucinations in more than 13,000 clear audio snippets they examined."
That is the academic library equivalent of the reference question "Please help me find a book I read once about goats with a blue cover."

Kate Nyhan

@baldur
(Ok, *maybe* I could find news coverage or a full-text searchable article indexed somewhere that mentions 187 hallucinations. But come on, why does the reader have to go searching?)

Kate Nyhan

@baldur
"Professors Allison Koenecke of Cornell University and Mona Sloane of the University of Virginia examined thousands of short snippets they obtained from TalkBank, a research repository hosted at Carnegie Mellon University." Again, the sentence has links, but they don't go to the study under discussion

Kate Nyhan

@baldur
Most frustrating of all:
"This story was produced in partnership with the Pulitzer Center’s AI Accountability Network, which also partially supported the academic Whisper study."

WHICH ACADEMIC WHISPER STUDY?
YOU PAID FOR IT BUT YOU DON'T WANT ME TO READ IT?

Poslovitch

@baldur IMO it's something that can easily be noticed by users: I have integrated a distilled French-targeted Whisper model in my day-to-day usage to be able to understand what people tell me in voice messages, and even during tests I could SEE that it was making some stuff up.

I'm not talking about mistakes like mistaking one word for another. I mean whole sentences that would appear out of thin air if the speaker sighed too loudly.

Thankfully I don't rely solely on the transcription...
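Invented sentences like the ones Poslovitch describes tend to show up where the model had little real speech to work with. A hedged sketch of one mitigation, written against the per-segment confidence fields (`no_speech_prob`, `avg_logprob`) that the open-source openai-whisper package reports in its results; the thresholds mirror whisper's own defaults, but treating them as a post-hoc filter like this is an assumption, not official guidance:

```python
# Thresholds borrowed from whisper's defaults (assumption: reused here
# as a post-processing filter over already-produced segments).
NO_SPEECH_THRESHOLD = 0.6
LOGPROB_THRESHOLD = -1.0

def drop_likely_hallucinations(segments):
    """Keep only segments where the model was reasonably confident
    that speech was present and that the text fits the audio."""
    kept = []
    for seg in segments:
        probably_silence = seg["no_speech_prob"] > NO_SPEECH_THRESHOLD
        low_confidence = seg["avg_logprob"] < LOGPROB_THRESHOLD
        if probably_silence and low_confidence:
            continue  # likely text invented over a sigh or a pause
        kept.append(seg)
    return kept
```

A filter like this reduces, but does not eliminate, the problem, which is why keeping the original audio for human verification still matters.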

Alejandro Gaita-Ariño

@Poslovitch @baldur
o.0
'In an example they uncovered, a speaker said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.”

But the transcription software added: “He took a big piece of a cross, a teeny, small piece ... I’m sure he didn’t have a terror knife so he killed a number of people.”'

Danny Boling ☮️

@Poslovitch

Wow. How does something so bad get integrated into so many business processes? It's incredible what they're getting away with.

@baldur

Poslovitch

@IAmDannyBoling @baldur I don't know. And if I were to tear my hair out about this, I'd be bald by now 😅

To me, it's just common sense: use your tool wisely and knowingly. And knowing when to use a tool is actually more about knowing when *not* to use it: knowing its shortcomings and where it could fail you.

Well. People seem to consider all this AI tech to be pure magic. Even more than "computers" used to be considered magical. It's magic. It feels like magic. And magic's never wrong.

GolfNovemberUniform

@baldur AI always does and will always do things like this. It must not be used in any critical infrastructure under any circumstances.

Tony Novak CPA

@baldur same thing in ordinary business and legal transcription. That’s why we must keep both the audio recording and the transcript.

nwrocla

@baldur Too many vendors' public-facing folks frame this "hallucination" defect of their commercial #ai as a training issue for the subscriber --- while the vendor continues to enrich itself by training its ai on medical work product & #patient health care records.

#hipaa #publichealth #ethics #fedilaw #opensource #opendata

Nini

@baldur Of course it did, it only exists to make shit up, and poorly at that. Just... look, do some speech-to-text, get a human to do a cleanup, and consider that good enough. Someone will die from this nonsense if they haven't already.

Petesmom

@baldur So itll be GREAT for use in healthcare 🤡💩⚰️

Tessie for Harris

@baldur

I've seen a lot of errors when I've read AI output. At this point it's not viable

Otto 🇺🇦🇦🇲🌻🐘

@baldur Using "AI" algorithms for anything unchecked is like thinking Bonaqua is more than a plastic bottle.

wbpeckham

@baldur I keep telling people it's not real AI, it's real BG. It's a large language model with algorithm engines that takes the accumulated wisdom and beauty and art of the entire history of the human race and churns out garbage. (BS Generator)

Keith Ivey

@baldur "It’s impossible to compare Nabla’s AI-generated transcript to the original recording because Nabla’s tool erases the original audio for 'data safety reasons,' Raison said."

So Nabla is taking the irresponsibility to a new level.

Michael Busch

@baldur @inthehands I recently talked with a web developer who had been working on an application for recording biological lab values.

Their group was unaccountably told to try using ChatGPT.

They stopped when it immediately inserted impossible values into the data tables.

Urzl

@baldur Yes but it's way faster to do it wrong than right and that's the only metric that matters somehow.

Zuri (he/him) 🕐 CET

@baldur When will people stop
– expecting software to solve human problems
– describing software errors in human terms
?
I wouldn't use AI for this either, but I'm pretty sure the AI doesn't "consider" this "making things up"; rather, it's doing something to the best of its ability, which (and this has to be emphasized far more, so people don't form these incorrect expectations in the first place) is quite limited.

1/

Zuri (he/him) 🕐 CET

@baldur I would assume that the AI doesn't really "understand" most of it, as this isn't what the AI's core task is. It usually gets structured textual input. Voice recognition is a separate step that one would do before passing stuff to the AI. So it's more likely that this AI just has a somewhat good number of lucky guesses, plus some misses by a mile.

2/2

Jonathan Mesiano-Crookston

@baldur why do they even need AI to transcribe? To detect who is talking? Is it supposed to improve accuracy? (I'm just musing out loud)

Jimbo93

@baldur

The researchers should be replaced by AI too!