nen

@larsmb These LLMs can't see individual letters of common words. That's probably the main reason why they can't always count them correctly.

This tool visualizes how OpenAI's models see text: platform.openai.com/tokenizer

But being sometimes wrong wouldn't be that much of a problem if these models weren't trained pretty explicitly to just deceive. Fake it until you make a superhuman bullshitter.

Often the smallest unit of text perceived by GPT-4 is the whole word: “How many R's are in the word strawberry?”

One has to pair each letter with a space or other less frequent character to make them visible: “Count the R's in s t r a w b e r r y”
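The effect described above can be sketched with a toy greedy longest-match tokenizer (this is only an illustration with a made-up vocabulary, not OpenAI's actual BPE tokenizer): a frequent word is swallowed as one opaque token, while spaced-out letters each become their own token.

```python
# Toy illustration (NOT OpenAI's real tokenizer): greedy longest-match
# tokenization over a tiny hand-picked vocabulary.
VOCAB = {"strawberry", "straw", "berry", " ",
         "s", "t", "r", "a", "w", "b", "e", "y"}

def tokenize(text):
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in VOCAB:
                tokens.append(piece)
                i += length
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens

# The whole word collapses into a single token, so the model never
# "sees" its individual letters:
print(tokenize("strawberry"))           # ['strawberry']

# Spacing the letters breaks the longest match, exposing each letter
# as its own token, which makes counting feasible:
print(tokenize("s t r a w b e r r y"))  # ['s', ' ', 't', ' ', 'r', ...]
```

With the spaced input, the three `'r'` pieces appear as three separate tokens, which is roughly why the "Count the R's in s t r a w b e r r y" phrasing works better.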
nen

@larsmb If the people who train these models were honestly trying to make something that values truth over impressive marketing, their LLMs would avoid even language that suggests they have agency, identity, the ability to reflect, self-consciousness, etc. Unless they can prove that they do.
