Released a new version of my llm-gguf LLM plugin adding support for GGUF embedding models - which means you can use models like the bafflingly small (30.8MB in its smallest quantization) mxbai-embed-xsmall-v1 model with LLM
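Once you have embedding vectors from a model like this, the usual downstream step is comparing them with cosine similarity. A minimal sketch of that comparison (the short vectors below are made-up placeholders, not real mxbai-embed-xsmall-v1 output — the real model returns much longer vectors):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors:
    # dot(a, b) / (|a| * |b|). Values near 1.0 mean "similar".
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Illustrative placeholder vectors standing in for model output.
v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.25, 0.55]
print(round(cosine_similarity(v1, v2), 3))
```

The appeal of a 30.8MB model is that this whole pipeline — embed, store, compare — can run locally with very little memory.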
@prem_k: @simon do you see any convergence being possible on the file format for the model weights? GGUF, ONNX, etc.?

@simon: @prem_k I haven't really thought about that. For the moment there's so much space for innovation that I think having separate formats may be a net positive: it makes it easier for the larger community to try out new ideas.

@prem_k: @simon true that. It's just that it adds overhead for enterprises. You need something like LiteLLM if you want the flexibility of using different models without dealing with each one's individual API format, so you can use a single OpenAI-compatible API for all of them. That reduces training/skilling complexity (for the humans), but adds another component to the IT landscape.