Blogged some notes on the new (still MIT licensed) Whisper Turbo model, quietly released by OpenAI yesterday
It’s both smaller and 8x faster than their previous Whisper Large https://simonwillison.net/2024/Oct/1/whisper-large-v3-turbo-model/
And you can run it on a Mac with “pip install mlx-whisper” and then:
import mlx_whisper
print(mlx_whisper.transcribe(
    "path/to/audio",
    path_or_hf_repo="mlx-community/whisper-turbo"
)["text"])
@simon interesting that they are releasing speech-to-text models. I wonder if anyone will be releasing improved text-to-speech models anytime soon?