@vampirdaddy The (supposed) idea behind them, though, is that with enough context and tokens, they can infer "some" logic from the language encoded in their models.
And it _might_ even work one day, but it ... definitely doesn't yet.
@vampirdaddy The idea seems to be that the very large data set allows them to encode a certain level of "reasoning and understanding", and thus correctly predict the next words given the current context. That ... might even work eventually. The point is that even one of the currently largest and most advanced models can't do it (yet?) for a rather trivial task. But please don't reply with very basic fundamentals as one-liners, which comes across as somewhat condescending :) Thanks!

@larsmb The current models have presumably ingested >90% of all internet-available text, so the extra order of magnitude of data that would presumably be needed simply won't ever exist. Plus, as the algorithm only picks probable next words, it won't deduce. It won't learn either, since neural nets usually have to be (more or less) completely re-trained for each "learning" step, still without understanding.
@larsmb
again:
LLMs do not understand,
LLMs do not reason,
they just guess the next words based on a laaarge data set.
Their programming does not allow anything else.
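For reference, "guess the next words" here means autoregressive generation: the model picks one probable continuation at a time and feeds it back in as context. Below is a minimal Python sketch of that loop; the lookup table and the names (`NEXT_WORD_PROBS`, `sample_next`, `generate`) are made up for illustration and stand in for the learned transformer weights a real LLM would use.

```python
import random
from typing import Optional

# Toy "language model": maps the last word of the context to probable next
# words. Entirely hand-written for illustration; a real LLM computes this
# distribution from learned weights over the whole token sequence.
NEXT_WORD_PROBS = {
    "the": [("cat", 0.5), ("dog", 0.3), ("idea", 0.2)],
    "cat": [("sat", 0.6), ("ran", 0.4)],
    "dog": [("barked", 0.7), ("ran", 0.3)],
    "sat": [("quietly", 1.0)],
}

def sample_next(word: str) -> Optional[str]:
    """Sample the next word from the toy probability distribution."""
    choices = NEXT_WORD_PROBS.get(word)
    if not choices:
        return None
    words, weights = zip(*choices)
    return random.choices(words, weights=weights, k=1)[0]

def generate(prompt: str, max_words: int = 10) -> str:
    """Autoregressive loop: repeatedly append a sampled next word."""
    words = prompt.split()
    for _ in range(max_words):
        nxt = sample_next(words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

print(generate("the"))  # e.g. "the cat sat quietly"
```

The point of the sketch is that nothing in the loop checks whether the output is true or follows logically; it only picks statistically plausible continuations.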