165 comments
@timdesuyo @atoponce

@locksmithprime @atoponce i don't know that i am capable of looking clever, but i swear the beard scratch increases processing capability by at least 10%

@atoponce This sounds like talking about the economics of the Cybertruck with fanboys

@atoponce ... so you prove "someone" lacks all intelligence by challenging their math disability? It only has a handful of cognitive functions, none of which are good at math. Honestly, it does better than I expected.

Aha! But ChatGPT is playing 4D chess, you see. Because in Semantic Versioning, 9.11 IS bigger than 9.9 :blobfoxsmug: NEVER FORGET

@ethanjstark @atoponce Ah! This is why I was confused for a moment. I knew the right answer but my brain derp-hesitated for half a second; now I understand why 😅

@atoponce@fosstodon.org And then ChatGPT said "it's bullshitting time" and bullshitted all over the place, generating smooth-sounding garbage

@atoponce@fosstodon.org LMAO what's next, a quarter pounder being larger than a third? 💀

@atoponce@fosstodon.org You should have asked it to express both of them as fractions and compare them, but I guess that too would fail catastrophically.

@atoponce i think it does exactly what it is supposed to and mimics our intelligence to a T.

@janet_catcus Hah, I've said this, too. The most "human" thing ChatGPT does is never admit it doesn't have an answer.

@atoponce @GhostOnTheHalfShell

@violet @atoponce this thread evaporated two liters of water in AI queries alone

I asked Gemini. It was wrong too.

@PhilipVWilson @violet @atoponce I see it's offering no explanation for why the correct answer is correct, though. That's interesting.

@violet @atoponce It's not "garbage" exactly. It's not correct maths, but for example Michael Rosen's "Hairy Tales and Nursery Crimes" is full of "factual errors", and no-one should think that is "garbage".
The only issue here is if you try to use generative AI to give you correct answers, which is like trying to get an oboe to tell you who composed Handel's Messiah.

@atoponce

@atoponce maybe one day we can invent a machine that's good at doing math, but until then this is the state of the art!

@atoponce

@atoponce This is actually a tokenization error. 9.11 looks larger than 9.9 because 11 tokenizes as a single unit, and 11 is usually larger than 9.

@albnelson @atoponce yeah, why would we expect a huge, optimized linear algebra machine to be able to do arithmetic?

@atoponce They want to hook this up to gene sequencing machines. Why do these tech bros exist? Nature may try to eliminate them by eliminating us all.

@toriver @atoponce I've not used Copilot, but I assume its numerical analysis output is not directly via the attention mechanism of an LLM. E.g. it could be using an LLM to predict the context to data which is then fed into routines, or using LLMs to offer code suggestions. None of these things are LLMs directly doing maths.

@atoponce I wonder: you know how virtual assistants are given feminine names and voices (Siri, Alexa)? And you know how there is a persistent false belief that women are somehow worse at math than men? I have to wonder whether that combination of biases has any influence on the programmers who create these LLMs? I mean, on top of all of the other biases and misunderstandings they already have about neuroscience and language? Are they creating their own stereotype of a ditzy secretary?

@UncivilServant @atoponce It would help if "virtual assistants" used a name and voice appropriate to a 5-year-old child.

@UncivilServant this has nothing to do with biases; LLMs don't produce correct answers, they produce statistically probable text completion.
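The tokenization comment above can be illustrated with a toy sketch. To be clear, this is not how any actual model tokenizes text; `naive_chunk_compare` is a hypothetical function that just mimics the failure mode the commenter describes, where "9.11" splits into chunks and the chunk 11 is compared to the chunk 9:

```python
# Hypothetical illustration: if "9.11" is seen as the chunks 9 / . / 11,
# comparing the chunks after the dot as whole numbers gives the wrong answer.
def naive_chunk_compare(a, b):
    a_frac = int(a.split(".")[1])  # "9.11" -> 11
    b_frac = int(b.split(".")[1])  # "9.9"  -> 9
    return a_frac > b_frac

print(naive_chunk_compare("9.11", "9.9"))  # True -- the ChatGPT-style mistake
print(float("9.11") > float("9.9"))        # False -- actual numeric comparison
```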
@atoponce

@atoponce wow, you are either faking it or a really bad prompt engineer: https://chatgpt.com/share/15cf4411-f272-4ebe-a90f-4ddfc93a78bc

@Hexa there's always one promptfondler in the thread that doesn't understand that you can't get fully repeatable answers from the confabulation engine, and that any answer to that question is a valid answer within the LLM paradigm, no matter if it's incorrect or not. (there's also another promptfondler who thinks that the problem is just in one particular LLM, not in the way LLMs work)

Its initial response is "correct", but only if the items being compared are version strings.

Funny that a lot of people are trying to justify this as "well, it's not made for that"... But other people are relying on it for these things, so it *IS* a problem. We have real-world evidence that the majority of people are not understanding any of this, plus it's being marketed as "a tool that you can use for anything" (in your OS, in your phone, in your browser, etc.).

@atoponce i read this entire thing thinking it was going to be a funny joke about versioning :menheraSob:
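The version-string reading that several commenters mention, and the fraction suggestion from earlier in the thread, can both be checked in a few lines of Python. This is just a minimal sketch of the three interpretations, not anything ChatGPT actually runs:

```python
from fractions import Fraction

# As plain numbers, 9.11 is smaller than 9.9.
print(9.11 < 9.9)  # True

# As version strings (compare dot-separated components numerically),
# 9.11 really is "bigger" than 9.9 -- the Semantic Versioning reading.
def version_key(v):
    return tuple(int(part) for part in v.split("."))

print(version_key("9.11") > version_key("9.9"))  # True: (9, 11) > (9, 9)

# As exact fractions (911/100 vs 99/10), the comparison is unambiguous.
print(Fraction("9.11") < Fraction("9.9"))  # True
```

So the model's answer is only defensible if you silently swap in the version-string interpretation, which is not what the question asked.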
@atoponce For the example - ChatGPT botching arithmetic - it actually passed the Turing test. Once in a store, I ordered 2.2 lb of some deli item, and the scale registered 2.02. The guy behind the counter called 2.20 "two point twenty" and 2.02 "two point two". The scale always showed two digits past the decimal point. This guy basically made the same mistake as ChatGPT.

I believe the economists who are telling us that the LFI program will be a catastrophe did their research with Chat Gépété.

@atoponce Oh great, we have a tool that uses energy to simulate stupid. As if we hadn't enough.

In the meantime, they have fixed this issue. But I think we only have to dig a little bit deeper now.

@RAlpenstern @atoponce This is a frackin' IL-Series Cylon from the old Battlestar Galactica series.

@ErikJonker the authors, apparently, and it is being sold to the public as a universal answering and search engine.

@atoponce

@trzyglow @atoponce I'm pretty sure it does not interface with Python. There must just be some content in its training data about subtractions in Python, maybe with these numbers, or maybe it is able to replace the numbers in an example with other numbers and redo the math. If it were actually executing Python code (or any language?), I'm sure someone would have already broken it by asking for the result of malicious code.

@atoponce As someone else on Mastodon pointed out: companies have spent BILLIONS to make a program that can simulate a computer that cannot do math.

@atoponce well, Linux 6.10 is newer than Linux 6.9. So that's probably where the confusion stems from, somewhat? That sometimes it does get counted up that way? I mean, in the end it's somewhat of a parrot, just evaluating the most likely reasonable answer and responding with that. Doesn't have to make sense to you or me, it just has to make a good way through the neural net. The Python thing is pretty hilarious.

I know I'm right, even if I'm wrong...
and here's why

@otto @atoponce it certainly doesn't have math talent. Like many people I know.

@atoponce fun fact: when devs are creating releases with numbers, 9.11 is actually newer/bigger than 9.9. Yeah, I know, right? Often including a patch number as well, it would be: 9.11.0 and 9.9.0.

@atoponce well done, OpenAI, you've developed a computer that cannot compute.

@atoponce AI is just Mr. Know-It-All. He has a lot of mansplaining to do. Which sometimes is cute.

This AI is quite human in the way it simultaneously rationalises and doubles down on mistakes.

@atoponce the answer also differs if one uses "or" instead of "and". With the "or", the system seemingly corrects itself in the explanation.

@atoponce I gave your screenshot to GPT-4o. Fun aside, I think it's nice of them to write "ChatGPT can make mistakes. Check important info." at the bottom.

@atoponce This is most likely an issue with tokenization, rather than something fundamental. That is, the model can't see the structure of the numbers the way we do. If you write the numbers more unconventionally, they don't get tokenized, and the model can perform the task.

@atoponce You can tell ChatGPT is right because it set the math in Computer Modern! I bet Python doesn't do that.

@atoponce I love that it justifies it as a "small precision error". Floating point isn't that bad!

So they made an algorithm with the whole web at its disposal to solve problems, and it is not able to function as a calculator.
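The "small precision error" excuse above is worth checking: neither 9.11 nor 9.9 is exactly representable as a binary double, but the rounding error is around 1e-16 — nowhere near enough to flip a comparison separated by 0.79. A quick sketch with the standard `decimal` module:

```python
from decimal import Decimal

# Decimal(9.11) shows the exact binary-double value actually stored;
# Decimal("9.11") is the exact decimal. Their gap is the representation error.
err = abs(Decimal(9.11) - Decimal("9.11"))
print(err < Decimal("1e-14"))  # True -- the error is tiny

# Far too small to matter here: floating point gets this comparison right.
print(9.11 < 9.9)  # True
```

So floating point is not the culprit; the model simply is not doing arithmetic at all.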
@atoponce probably getting messed up by the Yosemite Decimal System.