165 comments
A Person

@atoponce probably getting messed up by the Yosemite Decimal System.

Locksmith

@timdesuyo @atoponce
And this statement must be said while making a clever frown and scratching the chin with the fingers.

A Person

@locksmithprime @atoponce I don't know that I'm capable of looking clever, but I swear the beard scratch increases processing capability by at least 10%

Kierkrampusgaanks regretfully

@atoponce This sounds like talking about the economics of cybertruck with fanboys

Estarriol, Cat owned Dragon

@atoponce that is dangerous garbage, next it will apply BODMAS...

Becky

@atoponce CWGAAS: Confident White Guy As A Service.

Shiri Bailem

@atoponce ... so you prove "someone" lacks all intelligence by challenging their math disability?

It only has a handful of cognitive functions, none of which are good at math. Honestly, it does better than I expected.

DELETED

@atoponce

Aha! But ChatGPT is playing 4D chess, you see.

Because in Semantic Versioning, 9.11 IS bigger than 9.9

:blobfoxsmug:

NEVER FORGET
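(An illustrative aside, not part of the thread: the versioning joke checks out. A minimal Python sketch, assuming the third-party packaging library is installed, shows the same two strings ordering opposite ways as decimals and as version numbers.)

```python
from packaging.version import Version  # third-party: pip install packaging

a, b = "9.11", "9.9"

# As decimal numbers: 9.11 < 9.9, because 0.11 < 0.90
print(float(a) > float(b))            # False

# As version numbers: 9.11 > 9.9, because the component 11 > 9
print(Version(a) > Version(b))        # True

# Dependency-free equivalent: compare dot-separated components as integers
print(tuple(map(int, a.split("."))) > tuple(map(int, b.split("."))))  # True
```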

L

@ethanjstark @atoponce Ah! This is why I was confused for a moment.

I knew the right answer but my brain derp-hesitated for half a second, now I understand why 😅

Gersonzao

@atoponce@fosstodon.org And then ChatGPT said "it's bullshitting time" and bullshitted all over the place, generating smooth-sounding garbage

Alex The Silly Kitty :neocat_happy_blep:

@atoponce@fosstodon.org LMAO what's next, a quarter pounder being larger than a third? 💀

ENIGMATICO :heartbleed:

@atoponce@fosstodon.org You should have asked to express both of them as a fraction and compare them, but I guess that too would fail catastrophically.

TSource Engine Query
@atoponce of course that glorified Markov chain can't do math
janet_catcus

@atoponce I think it does exactly what it is supposed to do and mimics our intelligence to a T.

DELETED

@janet_catcus Hah, I've said this, too. The most "human" thing ChatGPT does is never admit it doesn't have an answer.

Violet Rose

@atoponce
And an answer from Copilot just now:

A screenshot of Microsoft Copilot answering the question "9.9 and 9.11 - which is bigger". It says 9.11 is bigger, the logic being "The decimal portion in 9.11 (the 0.11) is larger than the decimal portion in 9.9 (which is just 0.9)."

It also describes 9.11 as "nine point eleven."

It appears to be comparing numerical value, and doing so incorrectly.
Violet Rose

@atoponce
This time I asked which number has the greater numerical value. It gives a correct explanation, but stalls out before concluding that 9.9 is greater, almost as if it doesn't want to contradict its previous answer.

A screenshot of Microsoft Copilot answering whether 9.9 or 9.11 has the greater numerical value. It provides the equivalent values in terms of fractions of 100, but stops before giving a final answer.
GhostOnTheHalfShell

@violet @atoponce

Had to pause before burning down another forest.

xyhhx :PunkFelix:

@GhostOnTheHalfShell @violet @atoponce this thread evaporated two liters of water in ai queries alone

Mᴀʀᴋ VᴀɴᴅᴇWᴇᴛᴛᴇʀɪɴɢ

@violet @atoponce Truly, we have been freed from the labor of demonstrating our own ignorance, biases, and stupidity, and the end must be near.

Philip Wilson

@violet @atoponce

I asked Gemini. It was wrong too.
I asked it for its drafts. It got it right in 1 of 3.

Image of Google's Gemini AI answering the question, "9.11 and 9.9 -- which is bigger?" and showing 3 draft answers, of which only the 3rd is correct.
Iwillyeah

@PhilipVWilson @violet @atoponce I see it's offering no explanation for why the correct answer is correct though. That's interesting.

Ian McGecKobold (Octobold)

@violet
Ah, the classic "Ah, the classic" that instantly gives away AI slop

tocisz

@atoponce I agree, 9.11 is a big deal. Never heard of 9.9. You haven't specified which metric to use when comparing.

tocisz

@atoponce but it got lost in the explanation for sure

tocisz

@atoponce but at least it's politically correct

johnaldis

@atoponce It’s not “garbage” exactly. It’s not correct maths, but for example Michael Rosen’s “Hairy Tales and Nursery Crimes” is full of “factual errors”, and no-one should think that is “garbage”. The only issue here is if you try to use generative AI to give you correct answers, which is like trying to get an oboe to tell you who composed Handel’s Messiah.

DearFox

@atoponce
I'm also trash, I'm bad at math and don't know the multiplication table, let alone multiplying and dividing fractions :blobfoxcomfysmirk: :blobfoxgooglymlem:

DearFox

@atoponce
Well, for the sake of truth, they write it themselves "ChatGPT can make mistakes. Check important info."
:blobfox_w_:

TheClothesHaveNoEmperor

@atoponce
Thinks exactly like a human. But with massively more power consumption per error.

Alessandro Corazza

@atoponce maybe one day we can invent a machine that's good at doing math, but until then this is the state of the art!

Michael ☕️

@atoponce
I don't know what you're talking about.
9.11 is clearly bigger than 9.9.
9.11 is 4 characters, and 9.9 is only 3.
#EverythingIsAString
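(Illustrative aside: the string view of the joke, in plain Python with nothing beyond the standard library.)

```python
a, b = "9.11", "9.9"

print(len(a) > len(b))      # True  - "9.11" has more characters
print(a > b)                # False - lexicographically '1' < '9' at the third position
print(float(a) > float(b))  # False - the actual numeric comparison
```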

Xe :verified:

@atoponce This is actually a tokenization error. 9.11 looks larger than 9.9 because 11 tokenizes as a single unit and 11 is usually larger than 9.
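(Illustrative aside: the split can be inspected with OpenAI's tiktoken library. The exact pieces depend on the model's vocabulary, so the expected output in the comments is an assumption rather than a guarantee.)

```python
import tiktoken  # third-party: pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-3.5/GPT-4 era models

for s in ("9.9", "9.11"):
    pieces = [enc.decode([t]) for t in enc.encode(s)]
    print(s, "->", pieces)

# Plausibly something like:
#   9.9  -> ['9', '.', '9']
#   9.11 -> ['9', '.', '11']
# i.e. the fractional parts surface as the standalone tokens "9" and "11",
# which fits the "11 looks bigger than 9" confusion described above.
```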

Shiri Bailem
@cadey @atoponce I'm really curious about this because frankly I'm surprised it was even conceptually close
Gustavo

@cadey @atoponce In other words, despite all efforts to make math work better with LLMs, like adding Python support, it's still bad at it. Also it inherited the overconfidence from the dataset, which should include Reddit.

Andrew Benedict-Nelson

@atoponce hey look it’s a word calculator not a calculator calculator

Kelsey Jordahl

@albnelson @atoponce yeah, why would we expect a huge, optimized linear algebra machine to be able to do arithmetic?

Andrew Benedict-Nelson

@kajord @atoponce see, algebra is where we went wrong in the first place. Never should have mixed up letters and numbers.

Darrin West

@atoponce They want to hook this up to gene sequencing machines. Why do these tech bros exist? Nature may try to eliminate them by eliminating us all.

Lewis Edwards

@atoponce "Gee, I wonder why our probe slammed into the surface of the planet?"

ry_

@atoponce it’s not garbage, it’s not intelligent. It has useful applications, but apparently arithmetic isn’t one of them. It’s not wholly surprising: LLMs model natural language, and arithmetic isn’t natural language.

Tor Iver Wilhelmsen

@ry_ @atoponce That would basically mean Copilot in Excel is nearly pointless, unless it does something smarter than the rest of Copilot for M365

ry_

@toriver @atoponce I’ve not used Copilot, but I assume its numerical output is not produced directly by the attention mechanism of an LLM. E.g. it could use an LLM to extract context from the data, which is then fed into conventional routines, or use an LLM to offer code suggestions. None of these is an LLM directly doing maths.
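(Illustrative aside: a minimal, entirely hypothetical sketch of the "LLM picks the operation, a routine does the arithmetic" pattern described above; this is not how Copilot is actually implemented.)

```python
import operator

# The model only emits a structured call; deterministic code does the math.
OPS = {"add": operator.add, "sub": operator.sub, "gt": operator.gt}

def run_tool_call(call: dict):
    """call is what the model might emit, e.g. {"op": "gt", "args": [9.9, 9.11]}."""
    return OPS[call["op"]](*call["args"])

model_output = {"op": "gt", "args": [9.9, 9.11]}  # hypothetical model output
print(run_tool_call(model_output))  # True - computed by Python, not predicted by the LLM
```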

Noah Cook

@atoponce I wonder: you know how virtual assistants are given feminine names and voices (Siri, Alexa)? And you know how there is a persistent false belief that women are somehow worse at math than men?

I have to wonder whether that combination of biases has any influence on the programmers who create these LLMs? I mean on top of all of the other biases and misunderstandings they already have about neuroscience and language? Are they creating their own stereotype of a ditzy secretary?

Bill Zaumen

@UncivilServant @atoponce It would help if "virtual assistants" used a name and voice appropriate to a 5 year old child.
There was an old TV program called "Kids Say the Darndest Things", with clips of it shown on The Bill Cosby Show (well before he was arrested):
youtube.com/watch?v=G1voLZyI0S

flere-imsaho

@UncivilServant this has nothing to do with biases; LLMs don't produce correct answers, they produce statistically probable text completions. @atoponce

masukomi

@atoponce i had my head in Semantic Versioning land just before reading this so i was like "yup. 9.11 is bigger than 9.9" while simultaneously thinking "something's not right here"

flere-imsaho

@Hexa there's always one promptfondler in the thread who doesn't understand that you can't get fully repeatable answers from the confabulation engine, and that any answer to that question is a valid answer within the LLM paradigm, no matter whether it's correct or not.

(there's also another promptfondler who thinks that the problem is just in one particular LLM, not in the way LLMs work)

@atoponce

Ryan Dormanesh

@mawhrin @atoponce fair point. I apologize. Also “promptfondler” 😆 I’ve never heard that one.

Emma (has_many_books of old)

@mawhrin @Hexa @atoponce came for the promptfondler, stayed for the confabulation engine

squalouJenkins

@atoponce well, of course, if it was trained on students' papers ...

lp0 on fire :unverified:

Its initial response is ‘correct’, but only if the items being compared are version strings.

DELETED

@atoponce

Funny that a lot of people are trying to justify this as "well, it's not made for that"...

But other people are relying on it for these things, so it *IS* a problem. We have real-world evidence that the majority of people do not understand any of this, plus it's being marketed as "a tool that you can use for anything" (in your OS, in your phone, in your browser, etc.).

unusual zone of infecundity
@atoponce i read this entire thing thinking it was going to be a funny joke about versioning :menheraSob:
Bill Zaumen

@atoponce For the example - ChatGPT botching arithmetic - it actually passed the Turing test. Once in a store, I ordered 2.2 lb of some deli item, and the scale registered 2.02. The guy behind the counter called 2.20 "two point twenty" and 2.02 "two point two". The scale always showed two digits past the decimal point. This guy basically made the same mistake as ChatGPT.

flere-imsaho

@bzdev nah, don't anthropomorphise a statistical engine.

AVerRoA

@atoponce

I think the economists who tell us that LFI's programme will be a catastrophe got their information from ChatGPT.

Johnny Peligro 🍅
@atoponce in this case maybe, but as a chatbot and as an art generator it's pretty good
Ritschi Alpenstern :antifa:

@atoponce Oh great, we have a tool that uses energy to simulate stupid. As if we hadn’t enough.

echopapa ☑️

@RAlpenstern @atoponce

in the meantime, they have fixed this issue. But I think we only have to dig a little bit deeper now.

Ritschi Alpenstern :antifa:

@echopapa @atoponce I just tried it on ChatGPT yesterday, asking it to calculate 9.9 - 9.10, and it tried to convince me the answer was -0.2

Stefan Baur 6 * 💉

@RAlpenstern @atoponce This is a frackin' IL-Series Cylon from the old Battlestar Galactica series.

Mathieu Génois

@atoponce
This is exactly what I got when some students of mine tried to do their math homework with ChatGPT.
@dlbasile

shironeko
@atoponce this has been a hot meme in China apparently; lots of people can't tell which is bigger, 13.11 or 13.8
qwertzalotl

@atoponce

This is gonna be soo much fun....

br00t4c

@atoponce Pseudo-profound bullshit is the scourge of our age.

Erik Jonker

@atoponce why would anybody think a large language (!) model should be able to do math?

flere-imsaho

@ErikJonker the authors, apparently, and it is being sold to the public as a universal answering and search engine. @atoponce

Erik Jonker

@mawhrin @atoponce By using ChatGPT-4 a lot you quickly find out what it's good for and what not; in my experience it's limited, but it does have added value.

youen

@trzyglow @atoponce I'm pretty sure it does not interface with Python. There must just be some content about subtraction in Python in its training data, maybe with these numbers, or maybe it is able to replace the numbers in an example with other numbers and redo the math.

If it was actually executing python code (or any language?), I'm sure someone would have already broken it by asking for the result of malicious code.

magic💫

@keithpjolley @atoponce well yeah how else would 1.9 be greater than 9.1?

:blahaj: Why Not Zoidberg? 🦑

@atoponce As someone else on Mastodon pointed out: companies have spent BILLIONS to make a program that can simulate a computer that cannot do math.

sharky5740

@atoponce Well, Python version 3.11 is a later version than 3.9.
Q.E.D.

Lina

@atoponce no logical rules, no mathematical rules, just stupid statistics for the result :blobcatfacepalm:

Jim Carr

@atoponce I'm seeing lots of "but that's not what generative AI is good at" rebuttals, but AI is marketed as a be-all, end-all tool, so this is a real danger. I use it quite a bit for programming assistance, but I always double-check my results.

Jim Carr

@atoponce And for what it's worth, I tried the same exercise in Claude 3 (via the DDG AI Chat) and I got the same wrong answer.

Screenshot of AI chatbot being bad at math.
Leonard

@atoponce well, Linux 6.10 is newer than Linux 6.9.

So that's probably where the confusion stems from, somewhat? That sometimes it does get counted up that way?

I mean in the end it's somewhat of a parrot just evaluating the most likely to be reasonable answer and responding with that. Doesn't have to make sense to you or me, just gotta make a good way through the neural net.

The python-thing is pretty hilarious. I know I'm right, even if I'm wrong... and here's why

Nazo

@atoponce I think the problem here is it may be considering that like a program version number. Which is one thing I *HATE* about the way version numbers are done. In program versions that's actually 9 and 11 versus 9 and 9 rather than just over 9.1 versus 9.9. I wish that would go away as a thing.

That said, yeah, it's still messing that up and should have figured it out better than that anyway. (There are some models that try to produce better math results, but since they're not actually doing math, results still vary.)

I find it incredibly ironic that 300+ watts are being spent to do things a calculator could do at 0.1 watts and still not getting it right a huge percentage of the time when all they truly need to do is pass it through a simple mathematical function.

Jona Joachim

@atoponce
It's a large *language* model, not a math model
@otto

otto@openbsd

@jaj @atoponce it's a large language model that claims to understand the language of math

Jona Joachim

@otto
Does it? 🤔 I thought that was a well-known limitation. It does know about theorems of course, but numerical calculation I wouldn't think so
@atoponce

otto@openbsd

@jaj @atoponce Look at the example. It claims to understand the computation and even "explains" the Python result.

Jona Joachim

@otto
I'm not sure you can say it claims anything; it's just an algorithm that spits out numbers which get mapped to letters and words, and neural networks can be very wrong when used to extrapolate outside of the training domain. We have to stop treating these algorithms as if they were sentient. And we need to find a way to put confidence intervals on the outputs of the model
@atoponce

marinheiro

@atoponce The 'like' button is not enough, I really want a 'lol' button, cos I did

Ken S

@atoponce … until AI masters methodology, it should not be responsible for generating.

Erik

@atoponce I don't understand that python bit, but I'm quite sure chatty doesn't either

Acvaristul_Lenes

@atoponce it certainly doesn't have math talent. Like many people I know.
It helps me a lot with nutrition, sports physiology and performance.
It also saves me time on some sysadmin and PC technician tasks.
It has spotless grammar in all 3 languages.
Aaron, had you not known who you were chatting with, what would your reaction have been?

Melroy van den Berg

@atoponce fun fact: when devs are creating releases with numbers, 9.11 is actually newer / bigger than 9.9. Yeah, I know, right? Often a patch number is included as well, which would make it 9.11.0 and 9.9.0.

Mike Farley :verified:

@atoponce well done, OpenAI, you've developed a computer that cannot compute.

KielKontrovers

@atoponce AI is just Mr. Know-it-all. He has a lot of mansplaining to do. Which sometimes is cute.

Dan O'Ginnec

@atoponce

hessen.social/@atoponce@fossto

If one uses an LLM (like #chatgpt) to do math (or any other "logic"-based problem) it will fail. It generates answers not by thinking (or #logic generally) but by searching a text database.

Wrong tool.

Look at this: I let ChatGPT create a #fantasy #map for a roleplaying game:

While it knows a lot about maps, and it does create a map, the mountains are in the sea, the river flows through the sea, and there's also a marine forest.

Surreal Map Created by using ChatGPT to make a fantasy Map
Dan O'Ginnec

It can, by the way, create a simple #house (as svg-xml), and even isn't THAT bad at creating a #human figure (a #venus) if prompted correctly:

#chatgpt #xml #svg #graphic #design #art

@atoponce

fedithom

@atoponce

Kewl. Now, can we collectively stop using this shit that's burning our planet?

social.saarland/@fedithom/1124

Gianluca

@atoponce Generative AI (in this case an LLM) is not garbage, it is what it is: a large language model (LLM), not a calculator. It does what it is supposed to do - create text.

Gianluca

@atoponce It's like using a calculator hoping to get a letter done and then calling it garbage because it does not spit out words and sentences.

Fardels Bear

@atoponce

This AI is quite human in the way it simultaneously rationalises and doubles down on mistakes.

Jochen Garcke

@atoponce the answer also differs if one uses an "or" instead of "and". With the "or", the system seemingly corrects itself in the explanation

Simon Brooke

@atoponce sadly, the whole of #ChatGPT has been deleted, after someone forgot to turn off the garbage collector.

Abhinav Tushar

@atoponce I gave your screenshot to GPT4o.

Fun aside, I think it's nice of them to write 'ChatGPT can make mistakes. Check important info.' at the bottom.

Peter Bloem

@atoponce This is most likely an issue with tokenization, rather than something fundamental. That is, the model can't see the structure of the numbers the way we do.

If you write the numbers more unconventionally, they don't get tokenized the same way, and the model can perform the task.

The same task, but written in emoji.
The same problem written out in words.
The same problem, with the numbers written out as lists.
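(Illustrative aside: a rough sketch of what the "numbers written out as lists" framing forces the comparison into - aligning the fractional digits before comparing - not code from the original post.)

```python
def compare_decimals(a: str, b: str) -> str:
    """Return the larger of two decimal strings by comparing aligned digits."""
    ai, af = (a.split(".") + [""])[:2]
    bi, bf = (b.split(".") + [""])[:2]
    width = max(len(af), len(bf))          # pad so 9.9 is treated as 9.90
    ka = (int(ai), af.ljust(width, "0"))
    kb = (int(bi), bf.ljust(width, "0"))
    return a if ka > kb else b

print(compare_decimals("9.9", "9.11"))  # 9.9
```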
Maarten Sneep

@atoponce @Marrekoo Apple “threatening” to delay or not make available “Apple Intelligence” and ChatGPT in the EU is not the threat it thinks it is.
I can do without a bullshit generator in my pocket.

JDS

@atoponce @ai6yr gigawatt-powered magic 8 balls that are less accurate than 50-year-old portable calculators 👏

Taylor R Campbell

@atoponce You can tell ChatGPT is right because it set the math in Computer Modern! I bet Python doesn't do that.

ToucanIan

@atoponce oddly enough initially I was thinking of the definition/theorem numbering format where 9.11 is bigger than 9.9.

Max (Lyrenhex)

@atoponce I love that it justifies it as a "small precision error".

Floating point isn't that bad!

dbread

"Much to learn, you still have." #ai

@atoponce

tom jennings

@atoponce @tante

strlen ("9.11") is .GT. strlen ("9.9") however.

Uddelhexe

@atoponce

So they made an algorithm with the www at its disposal to solve problems, and it is not able to function as a calculator
😂😂😂

sollat

@atoponce
I want to gouge my eyes out so I never see anything that stupid again.
