Some notes on mlx-whisper - it's now really easy to run transcriptions through an Apple Silicon (and GPU) optimized Whisper model using Python on macOS https://simonwillison.net/2024/Aug/13/mlx-whisper/
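For context, a minimal usage sketch along the lines of the linked post (assuming `pip install mlx-whisper` and a local file named `audio.mp3`; the `mlx-community/distil-whisper-large-v3` model repo name is taken from the post itself):

```python
import mlx_whisper

# Transcribe a local audio file with an Apple Silicon-optimized
# Whisper model fetched from the mlx-community Hugging Face org.
result = mlx_whisper.transcribe(
    "audio.mp3",  # hypothetical local file
    path_or_hf_repo="mlx-community/distil-whisper-large-v3",
)
print(result["text"])
```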
@saurabhs anecdotally it sounds like it’s a lot faster: https://twitter.com/awnihannun/status/1822744609241682077 says “distil-large-v3 runs 40X faster than realtime on my M1 Max (transcribes 12 minutes in 18 seconds)”

@simon I wanted to know how the speed compares to `whisper.cpp`, since OpenAI's whisper is very slow on my Mac, so I ran a test: https://notes.billmill.org/link_blog/2024/08/mlx-whisper.html mlx ran almost 3x faster than whisper.cpp with a model of the same size, and both were using the GPU. I would love to know why it's so much faster!
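A rough way to reproduce the mlx side of that comparison (a sketch, assuming the same `mlx_whisper.transcribe` API as above; `sample.mp3` and the `mlx-community/whisper-large-v3-mlx` repo name are illustrative, and whisper.cpp would be timed separately from its own CLI on the same file):

```python
import time
import mlx_whisper

# Time one full transcription run to compare wall-clock speed
# against whisper.cpp on the same audio file and model size.
start = time.perf_counter()
result = mlx_whisper.transcribe(
    "sample.mp3",  # hypothetical test file
    path_or_hf_repo="mlx-community/whisper-large-v3-mlx",
)
elapsed = time.perf_counter() - start
print(f"Transcribed in {elapsed:.1f}s")
print(result["text"][:200])
```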
@simon Nitpick: Whisper is speech-to-text. Text-to-speech is speech synthesis.