bedast

The enshittification of AI has led to VLC's choice to use AI being groaned at. I even saw a post cross my feed from someone looking for a replacement for VLC.

VLC is working on on-device realtime captioning. This has nothing to do with generating images or video using AI. This has nothing to do with LLMs.

(edit: There are claims VLC is using a local LLM. It will use whisper.cpp, and not be using OpenAI's models. I don't know which models they will be using. I cannot find any reference to VLC using an LLM.)

While it would be preferred to use human generated captions for better accuracy, this is not always possible. This means a lot of video media is inaccessible to those with hearing impairment.

What VLC is doing is something that will contribute to accessibility in a big way.

AI transcription is still not perfect. It has its problems. But this is one of those things that we should be hoping to advance.

I'm not looking to replace humans in creating captions. I think we're very far from ever being able to do this correctly without humans. But as I said, there's a ton of video content that simply does not have captions available, human generated or not.

So long as they're not trying to manipulate the transcription using GenAI means, this is the wrong one to demonize.

#AI #Transcription #VLC #HearingImpaired #Deaf #Accessibility

102 comments
Hops the sausage dog

@bedast I would also add I find it quite helpful to start with a set of automatically generated captions, and then correct them. I don't do this often, but it saves me loads of time in a part-time job.

Is this a bit like people being annoyed at Mozilla using AI for on-device browser translation, even though that's very useful? I'm not sure if that's generative, but I'd guess not.

bedast

@howisyourdog I'm not a Firefox user so I haven't really dug into the latest in being upset with Firefox making an AI plugin, but it seemed like they were making an LLM to summarize pages. These have been known to get things very wrong. I don’t know if it's on-device or if it uses ChatGPT.

Hops the sausage dog

@bedast oooh, I hadn't heard about it summarising pages, that's useful to know.

bedast

@howisyourdog It’s a plugin/addon so opt-in for now. So there’s at least that. But Mozilla has a history of eventually forcing stuff on users.

Hops the sausage dog

@bedast I was thinking of this I believe, and I'm not sure if it is ML/AI

fr0g

@howisyourdog @bedast

Firefox also has a local AI translation thingy that is different from the plug-in being talked about here.

⏚ ȺՀղöɾէհ 🍉 βօӀìçҽ ժմ βօղƓօûէ

@frog_reborn @howisyourdog @bedast
Yep, and this Firefox translation module, which uses offline AI, gets things really wrong too. When I tested it, it translated the German AFD party as the Ministry of Defence, which is kind of absurd, you know, and totally wrong.

Alex Rock

@bedast @howisyourdog all browsers end up forcing stuff on users anyway. Chrome is the leader in forcing stupid things though, especially regarding privacy infringement.

Infoseepage #StopGazaGenocide

@bedast @howisyourdog It's off device. They say it is privacy preserving, but that's fundamentally questionable when you are sending stuff off device and can always change at any point. It's basically a "trust us, we won't be evil" statement and a lot of people are not willing to trust in that any longer given how they've behaved of late.

Dr. Angus Andrea Grieve-Smith

@howisyourdog @bedast I groan every time I see unsupervised automated captions or machine translation. They're simply not ready for prime time.

I know some Deaf people find them useful, so I understand the push to integrate them. But this should not be bundled with VLC; it should be an optional plugin, if it isn't already.

Hops the sausage dog

@grvsmth @bedast it's certainly a tricky one. I would go further and say people with hearing loss, particularly those who can't lip read (me), find them more than just useful.

Their accuracy is definitely a problem to be solved, so having it as a plugin is a good compromise as long as people know that. On the other hand it's VLC, so you're getting a pretty amazing piece of software for free, and this is coming from a good place, not trying to inflate stock price with a fad.

Certainly not something to rely on if you're producing videos professionally, but I can also see e.g. a solo YouTuber won't have time to transcribe all their videos.

Flaky

@howisyourdog I dunno about the on-device translation, but Mozilla has also been messing with LLMs, starting with the AI sidebar (which could've been just a regular web panel) and the Orbit summariser extension, which is why people have gotten angry (alongside the "privacy preserving" tracking ad-tech).

@bedast

Hops the sausage dog

@Flaky @bedast that's good to know. I'll avoid the LLM stuff and try and disable it in about config if it becomes mandatory

Flaky

@howisyourdog ATM it's not, but you might also want to disable the "privacy preserving advertising" stuff if you don't want Mozilla to track you. Unless you disabled Mozilla telemetry outright, in which case the adtech gets disabled too.

@bedast

Moss Wizard

@bedast It’s tricky because you’re certainly right about the amount of video with no captions, and the unfair inaccessibility of that. But translation AI is exactly the same tech as “generative” or “LLM”, it is statistical modeling. It is not different in any way, including errors and fuel and water demands. It’s like vehicle engines and tires: they do a tremendous amount of good every day, including for accessibility, but they also have terrible side effects that warrant complaints.

sbszine

@bedast @Moss If it's done on device that should address the water issue at least.

bedast

@sbszine @Moss Honestly, in my opinion, any AI inference that is not able to use on-device or edge compute is not ready for mass usage by the public.

There’s multiple AI and AI-adjacent tools that I use that have no reliance on cloud compute for inference or decision making. For example, my insulin pump’s operation to keep my blood glucose near target. This runs on a device the size of a pager.

th4

@sbszine @bedast @Moss as a rule of thumb, if it can run locally it's probably not too outrageously wasteful

🍞

@sbszine @bedast @Moss where do you think your electricity comes from?

nytpu ‮

@sbszine @bedast @Moss The issue has pretty much never been the energy cost of using the model, but the energy cost of training it. And there's also the ethics of the sourcing of the training data as well.

small circle 🕊 in calmness

@Moss @bedast

Also agree on its tremendous value for a11y.

There is however also the surveillance capitalism aspect. Imagine every device with a microphone or a camera able to phone home tiny compressed and encrypted trickles of plaintext data, containing our conversations and descriptions of social settings. Nightmarishly dystopian, especially given all the dystopian stuff we already have.

So fundamental freedoms may be in the balance: a11y is a directly addressable need, versus a long-term externality.

v0idness

@smallcircles @Moss @bedast This has nothing to do with the above post. It's also untrue. This is just useless fearmongering.

Brent Pruitt :: Artist

@bedast

I will stick to Open Subtitles as it is more reliable, & will provide better accuracy for slang and other contextual factors

there is no #Enshitification of #AI, when AI is shit to begin with

bedast

@brentpruitt This is a gross hot take built on gross ignorance. And if you think this makes me an AI apologist, you haven’t seen any of my prior posts about AI.

Brent Pruitt :: Artist

@bedast

no, i just find the phrase ‘enshitification of AI’ to be paradoxical / funny

Xavier Jacques Côté

@bedast the worry I do have regarding this feature is that it will provide an excuse to some (and that will grow over time) to stop investing in producing quality captioning. Why spend money/resources when there is an AI that will generate some [crappy, or just basic, if not erroneous] captions automatically?

I believe that in the long run there will be an inevitable drop in quality, in exchange for availability.

Damned if you do, damned if you don’t, as they say.

FediThing 🏳️‍🌈

@bedast

Maybe this needs to be called "voice recognition" instead of AI?

Using a term that nowadays means something awful is going to make misunderstandings more likely?

(When I read the news about VLC using AI I wrongly assumed it meant generative AI, as that has totally dominated discourse.)

Sami Määttä

@FediThing @bedast One of GenAI's well poisoning aspects has been tarnishing the term "AI". It has lost its meaning now.

Ben Ramsey

@SamiMaatta @FediThing @bedast In the case of automatic transcription, it’s using machine-learning models, which are similar enough to LLMs that it muddies the water, as far as terminology goes.

FediThing 🏳️‍🌈

@ramsey @SamiMaatta @bedast

Whatever it's called, perhaps it needs to get across the ethics of its technology if it wants to avoid misunderstandings?

If it's using massive amounts of energy and/or stolen data for training, then it's probably unethical.

If it's using reasonable amounts of energy and hasn't stolen any data, then it might be ethical.

(I think? Just a layperson here, might be a lot of stuff I'm missing...)

Henrik Pauli

@SamiMaatta @FediThing @bedast And so every developer or group with a sense of marketing should have started avoiding the word for like a year now.

TSRBerry

@bedast I think you completely missed the point of that post asking for different player recommendations. She is well aware that they are implementing STT and not GenAI.
She explains that automatic captions are often bad and that their availability leads many people to stop creating proper subtitles.

See: tech.lgbt/@nina_kali_nina/1137
and
tech.lgbt/@nina_kali_nina/1137

Rich Felker

@bedast Then don't call it AI. Call it speech to text. But if it uses a language model to more effectively predict words based on context rather than doing an analyzable mechanical local transformation, it is at least partly the "bad kind of AI" - it has the capacity to introduce biases from training data making output that "sounds right" but means the wrong thing, which is much worse than substituting nonsensical homophones now and then (which the reader will immediately recognize as mistakes). Same principle as why autocorrected text is worse than text with typos.

Rich Felker

@bedast Enthusiastically calling new functionality "AI" signals to your audience that you're aligned with the scams and makes them distrust you.

This is not hard.

If you have privacy respecting, on-device, non-plagiarized, ethically built statistical model based processing, DON'T CALL IT "AI".

...the heck?!!!??! 🍉

@dalias @bedast I agree. This is why "AI" transcription is a downgrade from previous technologies. It's contributing to the plausible disinformation slop we've still been drowning in lately.

I think automated captions have a place but I'm wary of using genai to do it.

A.V.

@dalias @bedast speech recognition has used language models for decades now. It was one of the original applications of language models, way before they scaled up to aping Shakespeare.

But even without language models, the act of transcription is very close to generative AI, as it's the task of predicting the next text token, given the previous tokens and an encoded audio sequence.

Rich Felker

@varavs @bedast Then don't call it "AI".

But also, question what harms are coming out of the predictive models. The more they force the output to sound natural and fix misrecognitions, the greater the chance they're altering meaning. Same as autocorrect vs typed text with typos and misspellings.

Rich Felker

@varavs @bedast Also ask if the model is ethically and legally sound. Was it produced from professional training material with compatible license terms? Or stolen from millions of movies or YouTube videos?

LisPi

@dalias @bedast @varavs Aren't basically all the embeddable models that don't have absurd spec requirements sourced & produced by university projects?

LisPi

@dalias @bedast Didn't mathematical/rule-based language modeling start showing massively diminishing returns back like... two~three decades ago or is my information wrong?

As far as I'm aware it would be preferable to start from a rule-based language, and then be able to specifically train a small model on a different captioned sample set of the speaker(s) to eliminate its flakiness.

Koen Hufkens, PhD

@bedast Hear hear, this is why I'm against people labeling anything ML as AI.

ManniCalavera

@koen_hufkens @bedast One of my pet peeves is that most of the time anyone talks about AI (positive or negative) they mean genai. Media is adding anything algorithm-based into the mix as AI, so nobody (me) knows whats talked about when talking AI anymore. Is machine learning the correct umbrella term for "nongenai algorithm-based systems" like automatic captioning? Can I adopt that, or are there more variants of "AI" which would be falsely labeled?

Koen Hufkens, PhD

@ManniCalavera @bedast AI mimics cognitive functioning. So whenever you have a chat interface that would be AI. Most GenAI is prompt driven, so AI. Inpainting apps might technically not be AI but ML, although using generative models.

cloud.google.com/learn/artific

Jcrabapple

@bedast @chriscz Thank you. I really dislike the immediate hate and overreaction for anything that even mentions AI.

F. Maury ⏚

@bedast We do not want AI in any product whatsoever.
No GenAI. No specialized AI. No "it's for the good of the people" AI. Certainly no AI trained by OpenAI.
Is it so hard to understand?
Create tools to manually write subtitles more easily. Yes. Please.
#humanmade #noai

bedast

@x_cli There is AI involved in my survival. It’s not genAI. It’s not transcription modeling. It’s not sexy. But it allows me to live.

It’s a light weight system. Sure.

It’s my insulin pump when connected to a CGM.

Stop demonizing actually useful AI.

Eugene :emacs: :freebsd:

@bedast @x_cli I think it is the choice of the word "AI" that triggered so many people. I remember the days when programs like the one in your insulin pump were called "neural network", "smart control system", etc. And AI was something cool, futuristic and unattainable, because we (still) don't know what human consciousness is or how the human brain works (the way we know how computers work, from machine code to RTL and tricky transistor placement on the silicon die).

(1/3)

Eugene :emacs: :freebsd:

@bedast @x_cli But around 2020 the hype train and the word substitution started. As a result of all this speculation, it is completely understandable that people are outraged when they hear about "AI", because for now AI means not the cool thing from Star Trek, but a stochastic parrot fed with stolen data and forcibly inserted into already good things like washing machines to do stupid things like "chat about socks with the washing machine" instead of simply pushing a button.

(2/3)

Eugene :emacs: :freebsd:

@bedast @x_cli Neural networks are still OK: they are a good technology, started in 1970, which helps people. AI (I mean real AI) will be cool and futuristic too. But the current "AI" (LLM) hype train is not cool; it is not the future we wanted (who wants to live in cyberpunk with "high tech, low life"?). The disadvantages we already know. The advantages may become visible when the hype train stops (improved natural language processing? help in decoding old and lost languages?).

(3/3)

Forbearance

@bedast @x_cli i thought when it was normal it was "machine learning"

v0idness

@x_cli @bedast ai is good actually (in certain situations)

Cinnamon

@bedast@beige.party As much as I hate AI in general (especially generative AI and LLMs), I think I agree this is a fair usage for it, together with OCR and automatic translation. ​:celredcrystalheart:​​:vlc:​

ENIGMATICO :heartbleed:

@bedast@beige.party I don't think generative AI is the only problem. I don't even think generative AI is the problem by itself.

Steffo 🐲

@bedast I think the biggest issue currently is: AI is way too overused in marketing. In fedi, many bubbles mostly don't like AI at all, because for them, AI = GenAI.

Of course, you could look at the automated subtitles on youtube, where swears are censored (which is stupid, lol), or are just plain bad in languages other than English. (e.g. in German it's quite useless.)

So what is the correct thing to do? I'd say: Look at how the subtitles perform. How much performance do they cost? Are they enabled by default (aka. opt out via an options / context menu)? How well do they work in other languages? Do they censor anything? How's the delay?

I mean, I don't know anyone who actively says voice assistants / voice transcription functions in programs are bad *because* they use AI for Speech to Text. And, if I may say... The text transcription on my Pixel phone is working fully locally, in German, without any issues. It's possible, It's not bad. AND: It's not actively advertised as "we have the best AI to do transcription".

Yes, I know... Transcription is not the same as subtitles, but it's still more close to it than having nothing.

Even though I don't need subtitles and can't hear the word "AI" one more time, I'm interested to see, what VLC does there.

Though, I'd love to see a released version 4.0, which would fix some issues I have with VLC, but... eh, we can't have everything I suppose.

Heathen ➡️ AnthrOhio

@SteffoSpieler @bedast
"Yes, I know... Transcription is not the same as subtitles, but it's still more close to it than having nothing."

In fact, as someone who is hard of hearing, inaccurate subtitles and transcription are usually worse than trying to figure out what little I'm hearing from context cues.

Accessibility requires work, disabled people require more than just half assed machine learning. If creators cannot put human created captions on media they should question whether creating media is the right job for them. "Something is better than nothing" is how we end up with unsafe wheelchairs kitbashed out of bicycle parts, printed flat dots where there should be braille, and inaccessible captchas.

So if you want to pat VLC on the back for this go ahead, but I'll still refuse to use media without human created captions because everything else is unusable garbage.

Flaky

@bedast I don't get the fuss tbh? VLC is just adding what this app on Linux has been doing for years. flathub.org/apps/net.sapples.L

Maxi 10x 💉

@bedast #VLC didn't invent this, Windows 10 (!) has had this AI captions for two years at an OS level.

🍞

@bedast it does use a local LLM for transcription

ZanaGB

@bedast so long as the model is outsourced to OpenAI and the like, you can always be certain everything you ever watch on VLC will be beamed to a third party "for improvement". Auto-generated subtitles might be better than no subtitles, but not at the cost of constantly feeding third parties with your data.

And of course, if we are talking of OpenAI's models, they are known to outright invent nonsense phrases when they tried audio transcription a few months ago.

I'd not trust a hallucinating liar.

Piko Starsider :verified_paw:

@zanagb @bedast VLC will do it in-device, not sending anything anywhere.

Whisper models are terrible at transcribing casual conversations of doctors and patients because the training data doesn't reflect that kind of speech and environments. But it excels at transcribing movies etc. because a lot of its training data are closed captions. So this would actually work reasonably well. One can put some text with the names of characters, places, etc. as context and that makes it transcribe those names very well. (source: I've been using whisper models at work, and occasionally I've been putting the mic towards the speaker with some show I'm watching to test) (also: I haven't sent any data to openai nor paid them anything)

ZanaGB

@starsider @bedast the CES demo makes it clear the transcription is **off-device**, ie, syphoning data. And besides, there are already many built in tools for that on macOS and linux.

If i wanted fucked-up nonsense on my videos i would watch a raunchy youtube poop from the early 2010s

I'd rather have a phoneme-based system where at least you can tell what the gibberish came from and you can tell it's an error, and even reconstruct the sentence back.

We do not need this.

Piko Starsider :verified_paw:

@zanagb @bedast What makes it clear that it's off-device? Can you provide a link?

What tools are you talking about? I use Linux, what should I search? I would like to compare it with the tool I'm doing as part of my day job (for which I compile the *whole* source code incl. all dependencies so I know for a fact that nothing is ever syphoned).

About fucked-up nonsense, what I see in youtube all the time: Youtube's automatic subtitles are beyond terrible. With automatic translations to my native language they're even worse. Family members use it and I can't fathom how can they get anything out of it. No pauses, no punctuation, full of mistakes.

Using whisper is a 1000x improvement over youtube's. It adds all the correct punctuation and everything. It only fails with proper names (unless it's given a context) and with speech with a lot of background noise. In all the 4 languages I've been testing it.

For regular casual speech it doesn't work _that_ well but my work's project takes that into account by marking all the dubious words. It also discards whole sentences with too many dubious words because they tend to be gibberish from random noise. Which makes me shudder when I read about the model being used as-is for conversations without regard for confidence levels, without using the context feature, and using naive stitching (since it can only transcribe 30 seconds at a time). Results are awful as I would have expected.
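
(A minimal sketch of the kind of confidence filtering described above, assuming each transcribed word comes with a probability score. The `Word` type, both thresholds, and the `[word?]` marking are hypothetical illustrations, not the actual project's code and not a real whisper API.)

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # assumed per-word probability in [0, 1]


def filter_sentences(sentences, word_threshold=0.5, max_dubious_ratio=0.4):
    """Mark low-confidence words and drop sentences that are mostly dubious.

    `sentences` is a list of lists of Word. Both thresholds are
    illustrative values, not tuned numbers from any real system.
    """
    kept = []
    for words in sentences:
        if not words:
            continue
        # A word is "dubious" when its confidence is below the threshold.
        dubious = [w.confidence < word_threshold for w in words]
        # Discard the whole sentence if too many of its words are dubious.
        if sum(dubious) / len(words) > max_dubious_ratio:
            continue
        # Otherwise keep it, flagging each dubious word for the reader.
        kept.append(
            " ".join(f"[{w.text}?]" if d else w.text for w, d in zip(words, dubious))
        )
    return kept
```

With these example thresholds, a sentence with one dubious word out of three is kept with that word flagged, while a sentence with two dubious words out of three is dropped entirely.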

ZanaGB

@starsider @bedast and... If you think whisper is anywhere near adequate for the job, clearly you do not rely on subtitles to hear, nor consume information and media through foreign sources. The pitfalls are very apparent and very damaging for the actual purpose of "understanding what is actually happening". Random Hindi-speaking people with tutorials about the weird obscure software you are trying to debug are always an incredibly easy test that these... abominations always fail.

Lars Marowsky-Brée 😷

@bedast I'm not surprised. Everybody also loves the fediverse not having any algorithms at all, after all.
Broad brushes ...

Da Red Gobo Darven Dissek ✅

@bedast
It's also great when it's a video in a foreign language.

For instance, when it's an English video, and you're French. This can be really useful. Especially when there is a strong accent which is difficult to understand.

And I heard there is also automatic translation too, which is awesome.

Oh, by the way, that's what youtube does on its video nowadays...

Remença

@bedast Uhm, if it generates text from video or sound then it's genAI? Maybe the problem is not whether it is genAI or not, but what it is used for?

MxFraud

@bedast you seem to be pretty aware of the technical details of this particular AI.

Do you have any reading/links on this that allow people to know the technical details?

The best article I can find, had no technical details:
tomshardware.com/software/vlc-

Mans R

@bedast They should've called it Accessibility Interface.

cute lily :cat_attack: :neocat_floof_cute:

@bedast@beige.party just curious personally, are they using whisper or something else?

Alice Carroll

@bedast I treat both people who mindlessly promote AI (compare with so-called crypto bros) and Luddites stating AI=evil as fanatics. Like, neither point has a lot of thought in it.

Tofu Golem

@bedast
I don't know what the solution is.

The problem is that it will be used to replace humans.

Eric Curtin

@bedast are VLC considering using whisper.cpp or another GGML based project? I think it would be neat if yes

Dom 🦻

@bedast I am almost deaf and rely on captions. I'd rather have no captions than auto-generated. Auto-generated captions, however they are made, are awful. It is an insult to have to deal with them. It will also encourage folks making media to skip putting any effort into captioning because auto-generated is "good enough". But, they are only good enough for folks who can also hear what is being said. I will bring up this point and the response is always, "Have you tried them lately?"

I try them every day, whether I want to or not.

Hearing-abled folks love to tell us what we should be grateful for, though.

bob.php :veritrek_gold:

@bedast i agree that it could be useful but take a look at how whisper works.

lnkr

@bedast I would like to ignore all the AI debates and specifically address the claim that "This is not generative AI".
Directly quoting https://x.com/videolan/status/1877072497146781946, "VLC automatic subtitles generation and translation based on local and open source AI models": according to the Wikipedia definition at least, it is a generative AI.

Not sure what is exactly a point I'm trying to make with it but there surely is some

kurtseifried (he/him)

@bedast @carol what amuses me in all this is remembering the exact same conversation happening with respect to watching video on a computer versus on a TV or in a theater. The original potato quality video on computers was decried as terrible. And it was terrible, but then it got better.

Saoirse Dulip

@bedast I am happy to see this kind of take as I feel the same way. I can understand the weariness of people when they hear the term AI because the term has been poisoned by bad actors.

But this, I think, is a good benefit of "AI". It's allowing accessibility for those who cannot hear to be able to enjoy things like everyone else. Yes, we could have people responsible for adding closed captions to most things, but this will help with the home made videos, etc..

Kinetix

@bedast@beige.party I'm curious then, why use the AI moniker at all? Computerized speech recognition has been around for numerous years, and every product getting an "AI" label slapped on it now is turning people off, rightly or wrongly. Generative or not, as soon as you say AI I'm thinking big waste of power, probably more marketing than substance, etc.

Shanie

@bedast This is the same issue that #VRChat had with attempting to hire a Blockchain expert; every inept had a brain seizure over the word while VRChat was likely looking for a way to database avatar ownership and creation that was open.

There's no such thing as a shade of gray for many people, and that's real sour.

monkee :foxmonkeeshark:

@bedast@beige.party Maybe not call it AI Captions but just Automatic Captions like in the "olden" days?

Mx Amber Alex

@bedast so tldr: it's basically YouTube auto captions, but local?

Bethany Fannin

@bedast I appreciate this take. I am a huge AI skeptic, but if everything is demonized it all becomes alarmist noise. I love the idea of AI being able to help people and support its usage for that purpose. I just don’t know if it will be put to such purposes (in any meaningful way) in the current tech environment, where endless growth and shareholder supremacy are practically tenets of a new religion. There’s no incentive to be thoughtful, slow down, and add the necessary guardrails.

Jeffrey Hayes

@bedast Thank you for this clarification. A coworker of mine who deals with oral histories has been using AI for transcription, and I'd be lying if I said I didn't initially wince on hearing that.

Paul Sutton

@bedast

Sounds like a good idea to me: the tool can take a video and create captions. Your comment about humans being more accurate is also good, as surely once those captions have been created, a human can go through them. I would assume captions are stored in an external file; if this can be edited then the human job would be to simply edit the file and correct any minor errors.

Any tool that can make life a little easier is surely welcome. Perhaps the important point though is also transparency: if you have used a tool to transcribe, this should be clearly stated, so people know how the captions have been generated.
