@simon I wanted to know how the speed compares to `whisper.cpp`, since the OpenAI Whisper implementation is very slow on my Mac, so I ran a test: notes.billmill.org/link_blog/2

MLX ran almost 3x faster than `whisper.cpp` with a model of the same size, and both were using the GPU. I would love to know why it's so much faster!
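
For anyone who wants to try a rough comparison themselves, here's a minimal sketch of the kind of timing I did (the exact details are in the linked note). It assumes the `mlx-whisper` Python package and a local `whisper.cpp` build; the audio file name and model repo below are just placeholders:

```python
import time

import mlx_whisper  # pip install mlx-whisper

audio = "sample.wav"  # placeholder: use any local audio file

# Time a transcription with mlx-whisper (model repo is an assumption;
# pick whichever mlx-community Whisper model matches your whisper.cpp model size)
start = time.perf_counter()
result = mlx_whisper.transcribe(
    audio,
    path_or_hf_repo="mlx-community/whisper-large-v3-mlx",
)
elapsed = time.perf_counter() - start

print(f"mlx-whisper took {elapsed:.1f}s")
print(result["text"][:200])

# For the whisper.cpp side, time its CLI separately from a shell,
# e.g. (binary/flags depend on your build and model file):
#   time ./main -m models/ggml-large-v3.bin -f sample.wav
```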