@leo_song I took the title from the Lemmy post, but haven't actually followed up to investigate it: https://lemmy.ml/post/747098
@leo_song it could be that all 175B parameters are used during training but not during actual usage (inference)