Xe :verified:

@atoponce This is actually a tokenization error. 9.11 looks larger than 9.9 because 11 tokenizes as a single unit and 11 is usually larger than 9.
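To make the mechanism above concrete, here is a toy sketch in Python. It does not run a real tokenizer. It only assumes, for illustration, that the digits after the decimal point end up being weighed as a standalone integer, which is the failure mode described above. The function names `piecewise_larger` and `numeric_larger` are hypothetical and chosen just for this example.

```python
from decimal import Decimal


def piecewise_larger(a: str, b: str) -> str:
    """Pick the 'larger' string by comparing the parts around the dot as
    separate integers (illustrative assumption, not real model behavior)."""
    a_int, a_frac = a.split(".")
    b_int, b_frac = b.split(".")
    # "11" is read as eleven and "9" as nine, ignoring place value.
    key_a = (int(a_int), int(a_frac))
    key_b = (int(b_int), int(b_frac))
    return a if key_a > key_b else b


def numeric_larger(a: str, b: str) -> str:
    """Pick the larger string by comparing the actual decimal values."""
    return a if Decimal(a) > Decimal(b) else b


print(piecewise_larger("9.11", "9.9"))  # 9.11, because 11 > 9
print(numeric_larger("9.11", "9.9"))    # 9.9, because 9.11 < 9.90
```

The piecewise comparison is also how version numbers are ordered (9.11 comes after 9.9), which may be part of why the wrong answer looks plausible.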

2 comments
Shiri Bailem
@cadey @atoponce I'm really curious about this because frankly I'm surprised it was even conceptually close
Gustavo

@cadey @atoponce In other words, despite all the efforts to make math work better with LLMs, like adding Python support, they're still bad at it. They also inherited the overconfidence from the dataset, which presumably includes Reddit.
