@dingodog19 @mkj @eniko I did precisely that, early in the hype cycle. I gave ‘em simple logic puzzles of a form like “all A’s are B’s and some C’s are D’s. You see an E which is not a B. What else can you tell me about it?” — but with the placeholder names replaced by either fake English words (phonotactically reasonable but not meaningful otherwise) or strings of emoji.
A human with any knowledge of logic could answer “this particular E is not an A” for *any* replacement of the placeholder names, without missing a beat.
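For anyone who wants to try the same probe, here is a minimal sketch of how such puzzle instances can be generated (the `pseudoword` helper and the CV-syllable scheme are my own illustrative assumptions, not the exact method described above):

```python
import random

CONSONANTS = "bdfgklmnprstvz"
VOWELS = "aeiou"

def pseudoword(syllables=2, rng=random):
    # Phonotactically plausible but meaningless nonce word,
    # built from consonant-vowel syllables (always starts with
    # a consonant, so the article "a" is always correct below).
    return "".join(rng.choice(CONSONANTS) + rng.choice(VOWELS)
                   for _ in range(syllables))

def make_puzzle(rng=random):
    # Five distinct nonce names stand in for the placeholders A..E.
    names = set()
    while len(names) < 5:
        names.add(pseudoword(rng=rng))
    a, b, c, d, e = sorted(names)
    puzzle = (f"All {a}s are {b}s and some {c}s are {d}s. "
              f"You see a {e} which is not a {b}. "
              f"What else can you tell me about it?")
    # By contraposition of "all A's are B's": not-B implies not-A.
    answer = f"This particular {e} is not a {a}."
    return puzzle, answer

puzzle, answer = make_puzzle(random.Random(0))
print(puzzle)
print(answer)
```

Swapping `pseudoword` for a function that returns random emoji strings reproduces the second variant; the logical form, and hence the correct answer, is unchanged.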
ChatGPT 3.5 just babbled nonsense; on the occasions it did get the logic right, it was obviously by chance.
ChatGPT 4.0 — you know, the one gullible journalists gushed over as “so incredibly smart” — wrote somewhat more polished nonsense, but still often failed to get the logic right.
A recent (a few months ago) Gemini model spent a lot of time trying to analyze the emoji for significance and completely lost the thread — consistent with observations that feeding these silicon buffoons lookalike puzzles with a twist causes them to get stuck badly.