Released a new version of my llm-gguf LLM plugin adding support for GGUF embedding models - which means you can use models like the bafflingly small (30.8MB in its smallest quantization) mxbai-embed-xsmall-v1 model with LLM
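Once you have embedding vectors from a model like this, the usual downstream step is comparing them with cosine similarity. A minimal sketch of that comparison (the short vectors below are made-up placeholders, not real mxbai-embed-xsmall-v1 output — the real model returns much longer vectors):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors:
    # dot(a, b) / (|a| * |b|). Values near 1.0 mean "similar".
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Illustrative placeholder vectors standing in for model output.
v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.25, 0.55]
print(round(cosine_similarity(v1, v2), 3))
```

The appeal of a 30.8MB model is that this whole pipeline — embed, store, compare — can run locally with very little memory.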
@prem_k: @simon do you see any convergence being possible on the file format for the model weights? GGUF, ONNX, etc.?

@simon: @prem_k I haven't really thought about that. For the moment there's so much space for innovation that I think having separate formats may be a net positive: it makes it easier for the larger community to try out new ideas.

@prem_k: @simon true that. It's just that it adds overhead for enterprises. You need something like LiteLLM if you want the flexibility of using different models without dealing with each one's individual API format, so you can use a single OpenAI-compatible API for all of them. That reduces training/skilling complexity (for the humans), but adds another component to the IT landscape.