Understanding GPT tokenizers

Large language models such as GPT-3/4, LLaMA and PaLM work in terms of tokens. They take text, convert it into tokens (integers), then predict which tokens should come next. Playing around with these tokens is an interesting way to get a better idea of how this stuff actually works under the hood.
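
To make the "text in, integers out" step concrete, here is a minimal toy sketch in Python. It is not the tokenizer any GPT model uses: the vocabulary (`MERGED_TOKENS`) is made up for illustration, and greedy longest-match stands in for the byte-pair-encoding merges that real tokenizers learn from data. The one idea borrowed from byte-level tokenizers is that every single byte has its own ID, so any input can be encoded.

```python
# Hypothetical multi-byte tokens, invented for this example.
# Real GPT vocabularies hold tens of thousands of learned entries.
MERGED_TOKENS = [b"The", b" dog", b" eats", b" the", b" apples"]

# IDs 0-255 cover every possible single byte; merged tokens start at 256.
VOCAB = {bytes([i]): i for i in range(256)}
for offset, token in enumerate(MERGED_TOKENS):
    VOCAB[token] = 256 + offset

ID_TO_BYTES = {token_id: token for token, token_id in VOCAB.items()}
MAX_TOKEN_LEN = max(len(token) for token in VOCAB)


def encode(text):
    """Convert text to a list of integer token IDs (greedy longest match)."""
    data = text.encode("utf-8")
    ids = []
    pos = 0
    while pos < len(data):
        # Try the longest candidate first; a single byte always matches.
        for length in range(min(MAX_TOKEN_LEN, len(data) - pos), 0, -1):
            piece = data[pos:pos + length]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos += length
                break
    return ids


def decode(ids):
    """Convert a list of token IDs back to text."""
    return b"".join(ID_TO_BYTES[i] for i in ids).decode("utf-8")


tokens = encode("The dog eats the apples")
print(tokens)          # [256, 257, 258, 259, 260]
print(decode(tokens))  # The dog eats the apples
print(encode("dog"))   # [100, 111, 103]
```

In this toy vocabulary the token is " dog" with a leading space, so a bare "dog" falls back to three single-byte tokens. A language model never sees the characters themselves, only integer sequences like these, and its output is a prediction of which integer comes next.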

https://simonwillison.net/2023/Jun/8/gpt-tokenizers/

21 Jun 2023 at 14:15