My favorite test with them is sending them decompiled code and asking them to explain what it does. So far, they have at least figured out what the code does.
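(As a rough illustration of that test, here is a minimal sketch, assuming the model is served behind a local OpenAI-compatible endpoint such as an Ollama or llama.cpp server at http://localhost:11434/v1; the model tag deepseek-r1:14b and the file name decompiled_function.c are placeholders, not details from this thread.)

```python
# Minimal sketch of the "explain this decompiled code" test.
# Assumptions (placeholders, not from the thread): a local OpenAI-compatible
# server at http://localhost:11434/v1 and a model tag "deepseek-r1:14b".
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

# Read whatever the decompiler produced (hypothetical file name).
decompiled = open("decompiled_function.c").read()

response = client.chat.completions.create(
    model="deepseek-r1:14b",
    messages=[
        {"role": "system", "content": "You are a reverse-engineering assistant."},
        {"role": "user", "content": "Explain what this decompiled function does:\n\n" + decompiled},
    ],
)
print(response.choices[0].message.content)
```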
Top-level
@rayslava have you tried that new deepseek?
@rayslava @a1ba No, you didn't. 14b means that you have tried Qwen 14b, distilled by Deepseek R1 (the reasoning one). It became much better than the original Qwen, but it's still not Deepseek. Unless you have an insane amount of VRAM, you can't run Deepseek V3 or R1. They really screwed up the model naming by publishing the other distilled models as subversions of Deepseek itself.
@burbilog @a1ba Okay, I tried chat.deepseek.com and it's a whole new level. Given that this model is released as open source, I can only join all the people calling this a new stage in the LLM race.
@a1ba Tried a 14B version locally; it worked much better than codellamas for simple cases. Do they have a cloud version with a large model too?
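(On the cloud-version question: besides chat.deepseek.com, DeepSeek also offers an API. Below is a minimal sketch of the same test against the hosted model, assuming an OpenAI-compatible endpoint at https://api.deepseek.com, the model name deepseek-reasoner for R1, and an API key in a DEEPSEEK_API_KEY environment variable; none of these details come from the thread itself.)

```python
# Same test against the hosted DeepSeek API instead of a local distill.
# Assumptions (not from the thread): OpenAI-compatible endpoint at
# https://api.deepseek.com, model name "deepseek-reasoner" for R1,
# and an API key in the DEEPSEEK_API_KEY environment variable.
import os
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com",
                api_key=os.environ["DEEPSEEK_API_KEY"])

decompiled = open("decompiled_function.c").read()  # hypothetical decompiler output

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "user", "content": "Explain what this decompiled function does:\n\n" + decompiled},
    ],
)
print(response.choices[0].message.content)
```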