@vampirdaddy The idea seems to be that the very large data set allows them to encode a certain level of "reasoning and understanding", and thus correctly predict the next words given the current context. That ... might even work eventually.
The point is that even one of the currently largest and most advanced models can't do it (yet?) for a rather trivial task.
But please don't reply with very basic fundamentals as one liners, which comes across as somewhat condescending :) Thanks!
@larsmb
Sorry, condescension was not intended. Just emphasis. Sorry for the wrong messaging!
The current models have presumably ingested >90% of all internet-available text. Thus the order of magnitude more data that would presumably be needed simply won't ever exist.
Plus, as the algorithm only picks probable next words, it won't deduce. It won't learn either, as neural nets usually have to be (more or less) completely re-trained for each "learning" step, still without understanding.
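To make "picks probable next words" concrete, here's a minimal toy sketch of greedy next-word selection. The two-word contexts and probabilities are made up purely for illustration, not taken from any real model:

```python
# Toy sketch of "picking the most probable next word" (hypothetical
# probabilities, not any real model): the "model" is just a lookup
# table mapping a two-word context to next-word probabilities.
toy_model = {
    ("the", "cat"): {"sat": 0.6, "ran": 0.3, "flew": 0.1},
    ("cat", "sat"): {"on": 0.8, "quietly": 0.2},
}

def next_word(context):
    """Greedily pick the most probable continuation -- no deduction involved."""
    dist = toy_model[tuple(context[-2:])]  # condition on the recent context only
    return max(dist, key=dist.get)         # highest-probability word wins

text = ["the", "cat"]
for _ in range(2):
    text.append(next_word(text))
print(" ".join(text))  # -> "the cat sat on"
```

A real model computes these probabilities with a neural net over a huge vocabulary instead of a lookup table, but the selection step is the same in spirit: the statistically likely continuation wins, with no deduction happening anywhere.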