Here's what it said when I asked it to count those pelicans
Here are all of the experiments with full transcripts: https://gist.github.com/simonw/6c296f4b9323736dc77978447b6368fc

I got QvQ running on my (M2 64GB) laptop!

uv run --with 'numpy<2.0' --with mlx-vlm python \

The other major Chinese AI lab, DeepSeek, just dropped their own last-minute entry into the 2024 model race: DeepSeek v3 is a HUGE model (685B parameters) which showed up, mostly undocumented, on Hugging Face this morning. My notes so far: https://simonwillison.net/2024/Dec/25/deepseek-v3/

@simon The split into 256 experts is interesting: with 8 active per token, I'm assuming the compute will be ~20B params (plus the router model, I guess?), which is pretty lightweight for the performance in Aider. Having all experts in memory is a very high bar, though.

The DeepSeek v3 paper came out this morning; I added a few notes about that here: https://simonwillison.net/2024/Dec/26/deepseek-v3/
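The ~20B figure above is a back-of-envelope estimate, which can be sketched as follows. This assumes, as the commenter does, that essentially all of the 685B parameters live in the 256 routed experts (ignoring shared layers and the router, which would shift the number somewhat):

```python
# Rough estimate of active parameters per token in a mixture-of-experts
# model, following the commenter's reasoning about DeepSeek v3.
total_params = 685e9   # reported total parameter count
num_experts = 256      # routed experts
active_experts = 8     # experts activated per token

# If parameters were spread evenly across experts, activating 8 of 256
# touches 8/256 of the total on each token.
active_params = total_params * active_experts / num_experts
print(f"~{active_params / 1e9:.0f}B active params per token")
```

This lands at roughly 21B, in line with the commenter's "~20B" guess; the real number depends on how many parameters are shared rather than routed.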
@simon This is why robot captchas have partial hits in the frame: the robots won't count them as having a target match 😆