@simon the 90B (55GB) might confuse people.
You need ~88GB of RAM just to run the 90B model, not counting your context window. So realistically 128GB of RAM, or else you are going to get 1 token every 30 to 45 seconds (or worse) while everything swaps around.
That small model is going to run very, very well on any M-series Mac with enough RAM.
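For anyone who wants to sanity-check their own machine, here is a rough sketch of the arithmetic. The ~88GB figure is the one from this thread; the `context_gb` and `os_reserve_gb` values are placeholder assumptions, not measurements, and `fits_in_ram` is just a hypothetical helper name.

```python
def fits_in_ram(required_gb, installed_gb, context_gb=0.0, os_reserve_gb=8.0):
    """Return True if model + context + OS headroom fits in installed RAM.

    os_reserve_gb is an assumed allowance for the OS and other apps;
    adjust it for your own setup.
    """
    return required_gb + context_gb + os_reserve_gb <= installed_gb


MODEL_90B_RAM_GB = 88  # estimate quoted in this thread, excludes context window

if __name__ == "__main__":
    # context_gb=4 is an illustrative guess, not a measured value
    for installed in (64, 96, 128):
        ok = fits_in_ram(MODEL_90B_RAM_GB, installed, context_gb=4)
        print(f"{installed}GB RAM: {'should fit' if ok else 'expect heavy swapping'}")
```

With those assumptions, 64GB and 96GB both come up short and 128GB has room to spare, which matches the recommendation above.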
@webology thanks, updated that to say "Or the larger 90B model (55GB download, likely needs ~88GB of RAM) like this:"