I'm hoping to turn this into a series of YouTube interviews with people building cool data projects where we nerd out about what they've built and how they built it, so I'm optimistically thinking of this as episode one! https://www.youtube.com/watch?v=t_S-loWDGE0

7 November at 18:50 | Open on fedi.simonwillison.net

9 comments

Simon Willison

The VERDAD prompts are pretty complex - Rajiv shared this example of a conversation he had with Claude 3.5 Sonnet to further iterate on the existing prompt used with Gemini 1.5 Pro https://gist.github.com/rajivsinclair/8fb0371f6eda25f9e5cc515cd77abd62

7 November at 19:01 | Open on fedi.simonwillison.net

Simon Willison

A YouTube comment asked about the price difference between Gemini 1.5 Flash and OpenAI's Whisper

Whisper API is $0.006 / minute, so an hour of audio = 36 cents

Gemini 1.5 Flash is $0.075 per 1 million tokens, and audio is 25 tokens/second, so an hour (90,000 tokens) is 0.675 cents

Over 50x cheaper!
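
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the prices and the 25 tokens/second rate are the figures quoted in this post, not values fetched from either provider, and the function names are made up for illustration.

```python
# Figures as quoted in the post; treat them as assumptions, not current pricing.
WHISPER_USD_PER_MINUTE = 0.006
GEMINI_FLASH_USD_PER_MILLION_TOKENS = 0.075
GEMINI_AUDIO_TOKENS_PER_SECOND = 25


def whisper_cost_usd(seconds: float) -> float:
    """Cost of transcribing `seconds` of audio at the quoted per-minute rate."""
    return seconds / 60 * WHISPER_USD_PER_MINUTE


def gemini_flash_cost_usd(seconds: float) -> float:
    """Cost of the audio input tokens for `seconds` of audio at the quoted rates."""
    tokens = seconds * GEMINI_AUDIO_TOKENS_PER_SECOND
    return tokens / 1_000_000 * GEMINI_FLASH_USD_PER_MILLION_TOKENS


one_hour = 60 * 60
whisper = whisper_cost_usd(one_hour)
gemini = gemini_flash_cost_usd(one_hour)

print(f"Whisper:          {whisper * 100:.3f} cents/hour")
print(f"Gemini 1.5 Flash: {gemini * 100:.3f} cents/hour")
print(f"Ratio:            {whisper / gemini:.1f}x")
```

Note that this only counts audio input tokens; the prompt and the transcript that comes back as output tokens are billed separately and are not included here.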

8 November at 6:15 | Open on fedi.simonwillison.net

Daniel Erenrich

@simon that Whisper price quote isn't competitive https://groq.com/pricing/ and I'd be curious about the accuracy differential

8 November at 7:11 | Open on techhub.social

Simon Willison

@derenrich problem with Groq is they haven't actually launched their billed API yet, so you're stuck with whatever their free tier will let you do

[Screenshot of Groq's pricing page: the "Developer" tier ("Scale up and pay as you go", Pay per Token, High Rate Limits, Priority Support) is marked "Coming Soon"]

8 November at 7:25 | Open on fedi.simonwillison.net

Xing Shi Cai

@simon Is the speech-to-text quality of Gemini and Whisper on the same level though?

8 November at 11:52 | Open on mathstodon.xyz

Simon Willison

@xsc from what I've seen so far they do feel similar in quality - and Gemini can do extra tricks like diarization and tone-of-voice analysis that Whisper can't

I remain paranoid about the risk of Gemini accidentally acting on instructions within the audio, but I've not (yet) seen that happen - so possibly more of a risk with deliberately malicious audio

8 November at 12:02 | Open on fedi.simonwillison.net

phildini

@simon I’d love to talk about https://civic.band ✨

8 November at 2:30 | Open on wandering.shop

Simon Willison

@phildini YES let's do it!

8 November at 3:46 | Open on fedi.simonwillison.net

phildini

@simon how do we get started? Wanna dm me on discord?

8 November at 18:55 | Open on wandering.shop