@simon About running models locally, I've experimented with that, but large context windows take up lots of RAM, right? Like, isn't it O(n^2) where n is the number of tokens? Or do you not depend on large context windows?
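(Side note for anyone following along: the KV cache that dominates inference memory actually grows linearly with context length; it's the naive attention *computation* that scales as O(n^2), and FlashAttention-style kernels avoid materializing the full n×n matrix anyway. A minimal back-of-the-envelope sketch, assuming a hypothetical ~7B-class model shape with fp16 activations:)

```python
# Back-of-the-envelope KV-cache memory estimate for a transformer.
# The cache stores one key and one value vector per token, per layer,
# so it grows linearly with context length n.
# Model shape below is a hypothetical ~7B-class configuration.

def kv_cache_bytes(n_tokens, n_layers=32, n_heads=32, head_dim=128,
                   bytes_per_value=2):  # fp16 = 2 bytes
    # Factor of 2 accounts for storing both keys and values
    return 2 * n_layers * n_heads * head_dim * n_tokens * bytes_per_value

for n in (2_048, 8_192, 32_768):
    gib = kv_cache_bytes(n) / 2**30
    print(f"{n:>6} tokens -> {gib:.1f} GiB of KV cache")
```

(At that shape the cache costs roughly 0.5 MiB per token, so a 32k context alone wants ~16 GiB on top of the weights - linear in n, but still a real constraint on a laptop.)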
@matt @glyph the models I can run on my laptop today are leagues ahead of the models I ran on the exact same hardware a year ago - improvements in that space have been significant
This new trick from Microsoft looks like it could be a huge leap forward too - I've not dug into it properly yet though https://github.com/microsoft/BitNet