"If we can just get our LLM to stop hallucinating, then we could <whatever>..."
"<whoever>, do you have any idea how a LLM works?"
Yeah. And I get it: it's an easy trap to fall into, and generative AI certainly has a lot of properties that make it one. I might have fallen into that trap myself at some point. Then I spent some time reading an article on how generative AI (specifically LLMs, in that case) works.
@mkj @eniko
Yes! I don't see how an LLM can reason.
I tried feeding a logic puzzle from a grocery-store puzzle book into CoPilot. I thought that might be a good minimum threshold for showing "reasoning," assuming that puzzle doesn't exist online.
It did very poorly; the response didn't actually make sense. A friend put it into ChatGPT-4, and it did better, solving two categories but getting the third wrong.
What do you think of logic puzzles as a test?
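For context, the kind of three-category grid puzzle I mean can be brute-forced in a handful of lines of ordinary code. The names, categories, and clues below are made up for illustration (this is not the puzzle from the book), but the shape is the same:

```python
# Brute-force solver for a made-up three-category logic-grid puzzle.
# The people, pets, drinks, and clues are invented for illustration;
# this is not the puzzle from the grocery-store book.
from itertools import permutations

people = ["Ana", "Ben", "Cat"]
pets = ["dog", "cat", "fish"]
drinks = ["tea", "coffee", "juice"]

def consistent(pet_of, drink_of):
    # Clue 1: Ana does not own the dog.
    if pet_of["Ana"] == "dog":
        return False
    # Clue 2: Whoever owns the dog drinks coffee.
    dog_owner = next(p for p in people if pet_of[p] == "dog")
    if drink_of[dog_owner] != "coffee":
        return False
    # Clue 3: Ben drinks tea.
    if drink_of["Ben"] != "tea":
        return False
    # Clue 4: Ben keeps the fish.
    if pet_of["Ben"] != "fish":
        return False
    return True

# Try every assignment of pets and drinks to people; keep the consistent ones.
solutions = []
for pet_perm in permutations(pets):
    for drink_perm in permutations(drinks):
        pet_of = dict(zip(people, pet_perm))
        drink_of = dict(zip(people, drink_perm))
        if consistent(pet_of, drink_of):
            solutions.append((pet_of, drink_of))

for pet_of, drink_of in solutions:
    for person in people:
        print(f"{person}: {pet_of[person]}, {drink_of[person]}")
```

Exhaustive search over the permutations obviously isn't "reasoning" either, but it shows the answer is mechanically checkable, which is part of why these puzzles feel like a reasonable minimum bar.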