David Edmiston

@simon Your post mentioned a ~20GB quantized file via Ollama; did that take up 20GB of RAM, or 32GB?

I'm waiting on delivery of a 48GB M4 Pro later this week or early next, which is why I'm kinda curious.

Simon Willison

@edmistond I just tried running a prompt through the Ollama qwen2.5-coder:32b model and, to my surprise, it appeared to peak at just 2GB of RAM usage while using 95% of my GPU.

I thought GPU and system RAM were shared on macOS, so I don't entirely understand what happened there; I would have expected more like 20GB of RAM use.
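
A rough way to sanity-check that reading is a minimal sketch like the one below: it sums the resident memory of the Ollama processes before and after a generation request, using Ollama's documented HTTP API on localhost:11434. It assumes the third-party psutil and requests packages are installed, and matching processes by the name "ollama" is an assumption. Note that on Apple Silicon, model weights loaded by the GPU may not be attributed to the process's RSS, which could account for a low reading.

```python
# Sketch: compare the Ollama server's resident memory (RSS) before and
# after running a prompt. Assumes Ollama is running locally with its
# HTTP API on port 11434; matching processes by the name "ollama" is
# an assumption.
import psutil
import requests

def ollama_rss_gb() -> float:
    # Sum RSS across all processes whose name contains "ollama".
    total = 0
    for p in psutil.process_iter(["name", "memory_info"]):
        name = (p.info["name"] or "").lower()
        if "ollama" in name:
            total += p.info["memory_info"].rss
    return total / 1024**3

before = ollama_rss_gb()
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "qwen2.5-coder:32b", "prompt": "Hello", "stream": False},
    timeout=600,
)
resp.raise_for_status()
after = ollama_rss_gb()
print(f"Ollama RSS before: {before:.1f} GB, after: {after:.1f} GB")
```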

David Edmiston

@simon Interesting, thanks for checking! Either way, since I currently work on a 16GB M1 with no problems for my day-to-day tools, I know I should have enough RAM to run my normal tools plus that model for experimentation. 🙂
