Simon Willison

A YouTube comment asked about the price difference between Gemini 1.5 Flash and OpenAI's Whisper for transcribing audio

Whisper API is $0.006 / minute, so an hour of audio = 36 cents

Gemini 1.5 Flash is $0.075 for 1 million tokens and audio is 25 tokens/second, so an hour (3,600 seconds = 90,000 tokens) is 0.675 cents

Over 50x cheaper!
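
The arithmetic above can be checked with a short sketch. The prices and the 25 tokens/second rate are the figures quoted in this post; the constant and function names are made up for illustration, and the calculation ignores any output-token cost for the transcript itself.

```python
# Figures quoted in the post; names here are illustrative only.
WHISPER_USD_PER_MINUTE = 0.006
GEMINI_FLASH_USD_PER_MILLION_TOKENS = 0.075
GEMINI_AUDIO_TOKENS_PER_SECOND = 25

SECONDS_PER_HOUR = 3600
MINUTES_PER_HOUR = 60


def whisper_cost_per_hour() -> float:
    """Cost in USD to transcribe one hour of audio at the per-minute rate."""
    return WHISPER_USD_PER_MINUTE * MINUTES_PER_HOUR


def gemini_cost_per_hour() -> float:
    """Cost in USD to send one hour of audio as input tokens."""
    tokens = GEMINI_AUDIO_TOKENS_PER_SECOND * SECONDS_PER_HOUR  # 90,000 tokens
    return tokens / 1_000_000 * GEMINI_FLASH_USD_PER_MILLION_TOKENS


if __name__ == "__main__":
    whisper = whisper_cost_per_hour()
    gemini = gemini_cost_per_hour()
    print(f"Whisper: {whisper * 100:.2f} cents/hour")
    print(f"Gemini 1.5 Flash: {gemini * 100:.3f} cents/hour")
    print(f"Ratio: {whisper / gemini:.1f}x")
```

The ratio works out to roughly 53x, which matches the "over 50x" claim.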

4 comments
Daniel Erenrich

@simon that Whisper price quote isn't competitive (see groq.com/pricing/), and I'd be curious about the accuracy differential

Simon Willison

@derenrich problem with Groq is they haven't actually launched their billed API yet, so you're stuck with whatever their free tier will let you do

[Image: pricing tier card reading "Developer: Scale up and pay as you go. Pay per Token. Coming Soon. High Rate Limits, Priority Support"]

Xing Shi Cai

@simon Are Gemini and Whisper on the same level of speech-to-text quality, though?

Simon Willison

@xsc from what I've seen so far they do feel similar in quality - and Gemini can do extra tricks like diarization and tone-of-voice analysis that Whisper can't

I remain paranoid about the risk of Gemini accidentally acting on instructions within the audio, but I've not (yet) seen that happen - so possibly more of a risk with deliberately malicious audio
