Alyssa Rosenzweig 💜

Trusting LLMs threatens your credibility.

I read a bogus claim about GPU instruction sets, cited to GPT-4 and an anonymous "expert". This is my area of expertise; I know the claim is demonstrably false. And now I know the author is relying on bullshit generators. Now I doubt every other claim the author makes, because with egregious errors in the parts I know about, how could I trust the parts I don't?

(Edit: narrowed the scope of the lead.)

mnl

@alyssa in a way, using LLMs for facts in earnest is going to expose a lot of people who got the benefit of the doubt before. Now everybody is on the lookout, and the easiest to fool are the fools themselves

Hayley
@mnl @alyssa I have to correct first-years who were convinced ChatGPT got the right answers for their math homework, when it's consistently dead wrong. "But it gave a different answer" is really unconvincing with that in mind.

(Unrelated gripe: these are first-years in CS, so writing a program to do it would be much less likely to fail.)
Janne Moren

@alyssa
This is something I was already running into frequently as a researcher years and years ago, way before LLMs:

Some publication or columnist confidently spouts absolute nonsense about stuff in my own field; and now I can never trust what they say about any other subject.

Sheep

@jannem @alyssa

There is a term that relates to your experience called "Gell-Mann Amnesia"

Sindastra♀️✅

@alyssa Reminds me a bit of "journalism" in general. The amount of nonsense I read in credible newspapers when they write about "tech(nology)".

Makes me wonder how much nonsense they report in areas I'm not very knowledgeable about.

Anafabula

@sindastra @alyssa There is a name for that: Gell-Mann Amnesia

Gen X-Wing

@sindastra @alyssa I have the same feeling. Basically, LLMs are simply amplifying what was already there and making it even more apparent.

Ezekiel :swift:

@alyssa what are your thoughts on isolated decision-making behind a fixed input/output layer? For example, imagine if Siri etc. used an LLM to interpret and parse language but NOT to build the response

Alex Celeste

@ezekiel

cool but you could make it better by getting rid of the bit where it does language parsing

(serious answer, i cannot think of a justification to myself for why this is an improvement over just _not_ having that step)

Ezekiel :swift:

@erisceleste I mean, isn't one of the most common issues with virtual assistants that they misinterpret a query? For example, consider the following exchange with ChatGPT that Siri (and probably others??) would've completely failed at.

Note that it properly parsed out the request AND corrected my spelling.

— Query —
Pretend to be a virtual assistant. You will receive a query for adding things to reminder lists. Consider the available lists: [Groceries, To-do, Packing]. The user may specify a location trigger, time trigger, and will specify the text to add to the list. Please respond with the list they're trying to add to, the time and/or location if specified, and the text to add.

Prompt: add at two oclock apples to my groceirries list

— Response —
Sure, I can help you with that. Based on your request, you would like to add "apples" to your "Groceries" list at 2 o'clock. Is that correct?
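
A minimal sketch of that split, in Python (the `llm_parse` call, the schema, and the canned output are hypothetical stand-ins, not any real assistant's API): the model only maps text to a structured intent, everything it returns is validated against a fixed schema, and the app alone builds the response and performs the action.

```python
import json

ALLOWED_LISTS = {"Groceries", "To-do", "Packing"}

def llm_parse(utterance: str) -> str:
    """Stand-in for a real model call; a deployed version would prompt
    an LLM to emit JSON only. Canned output here, for illustration."""
    return '{"list": "Groceries", "time": "14:00", "item": "apples"}'

def handle(utterance: str) -> dict | None:
    """Fixed input/output layer: the model maps text to an intent, but
    it never composes the reply or touches the reminder store itself."""
    try:
        intent = json.loads(llm_parse(utterance))
    except json.JSONDecodeError:
        return None  # model emitted garbage; fail closed
    # Validate against the fixed schema before acting on anything.
    if intent.get("list") not in ALLOWED_LISTS:
        return None
    if not isinstance(intent.get("item"), str):
        return None
    return intent  # the app, not the model, adds the reminder

print(handle("add at two oclock apples to my groceirries list"))
```
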
Alex Celeste

@ezekiel

perhaps i speak from too much of a cultural bubble when i say that

groceries:append(apples, 1400)

seems better to me than using english to describe a task that doesn't need to be described in english, in the first place

this can be three button-presses and a swipe of a drop-down

Ezekiel :swift:

@erisceleste I appreciate why you would feel that way, but I hope you understand that the general population just wants to be able to make a request and have it understood, not to follow a syntax

Omega Jimes

@alyssa It's a little disheartening that the mass response to public LLM availability seems to be "Well now I get to think EVEN LESS!".

Autumn

@omegajimes @alyssa
Especially with the fact that they can make automation much more accessible

They are an excellent tool for those who know how to use them, and unfortunately, misunderstood by most

Konrad Kołakowski

@alyssa LLMs work best for popular topics with a huge corpus of data. GPU driver programming is for sure faaar from that. They might be useful, as a tool, for helping with some tedious, repetitive work, but for sure not for such niche work.

They should always be used as a tool - with a big amount of scrutiny.

The huge problem is that if they "don't know" something, LLMs simply confabulate 🙃 Extremely dangerous for beginners or less ML-literate people 🫤

Jennifer Kayla | Theogrin 🦊

@kkolakowski @alyssa

Even with a huge corpus of data, LLMs are useless, and here's why:

They generate text which looks like, but is not, equivalent to a researched and reviewed paper.

They will take bits and pieces from the entire set of articles and chunk them together into something which is functionally meaningless but looks acceptable at a casual glance.

And I mean individual words! Sentence fragments! Syllables!

They don't know ANYTHING. But they give the illusion of doing so.

Autumn

@theogrin @kkolakowski @alyssa LLMs don't necessarily need to generate stuff

There are signs of promise for LLMs that avoid hallucination by paraphrasing and permuting instead.

I recommend checking out perplexity.ai

LLMs are also quite helpful for automation; the base training data is just to get the relations right in the first place, then constraints, checks, temperature, and human validation can help vet things out
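
A sketch of that vetting loop in Python (the `generate` call, validator, and canned output are hypothetical stand-ins): sample at a given temperature, run the output through a programmatic check, tighten sampling on failure, and escalate to a human rather than guess.

```python
def generate(prompt: str, temperature: float) -> str:
    """Hypothetical stand-in for an LLM sampling call."""
    return "42"  # canned output, for illustration

def is_valid(output: str) -> bool:
    """Programmatic constraint check; here: output must be an integer."""
    return output.strip().isdigit()

def vetted_generate(prompt: str, retries: int = 3) -> str | None:
    temperature = 0.8
    for _ in range(retries):
        candidate = generate(prompt, temperature)
        if is_valid(candidate):
            return candidate  # still deserves human review
        temperature = max(0.0, temperature - 0.3)  # tighten sampling
    return None  # escalate to a human instead of guessing

print(vetted_generate("How many GPUs fit in this rack?"))
```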

Autumn

@theogrin @kkolakowski @alyssa
Base LLMs like GPT are useless to your average Joe but great for developers (i.e., you need to know prompt engineering and AI model tooling); ChatGPT is fun for conversations and the public but useless otherwise, and Perplexity is only good for taking multiple raw articles and quoting them directly

Autumn

@theogrin @kkolakowski @alyssa
I might be a tad bit quick to defend LLM development because I believe the best way to go about adopting AI and general technology is to educate the public and encourage them to understand the tools they use, both their devices and their software

Yes, I am a Linux nerd who wishes people wanted to know why and how their systems work

But the curiosity AI has garnered from people could be a great opportunity, even if it's not particularly helpful for most people

Ryan

on a slightly related note: the annoyance of hearing clearly unqualified people tell you your job is worthless, your passion is not worth pursuing, and your work is unnecessary because of LLMs and AIs in general...

you know they're wrong, but it's disheartening still.

Kofi Loves Efia

@ryanc be prepared to be even more depressed. LLMs are very exciting to the Venture Capital set.

Ryan

@Seruko at least i'll be in school during the AI bubble, my condolences to everyone who's about to get/already got replaced by AI because of corporate

cbsnews.com/news/eating-disord

Demi Marie Obenour

@alyssa Could LLMs be suitable for generating proofs which are then checked by a _sound_ proof checker that is secure against malicious input? If the proof is wrong, the checker will catch it, so no harm has been done.
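
For what it's worth, a minimal sketch of that pipeline in Lean 4 (the theorem and both candidate proofs are invented for illustration): the kernel accepts a valid term and rejects a bogus one, so a wrong generated proof cannot slip through.

```lean
-- A statement a generator might be asked to prove.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b   -- a valid proof term: the kernel accepts it

-- A bogus "proof" of the same statement, the kind a model might emit.
-- Uncommenting it fails to check: `a + b` and `b + a` are not
-- definitionally equal for variables, so `rfl` is rejected.
-- theorem my_add_comm' (a b : Nat) : a + b = b + a := rfl
```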

Jennifer Kayla | Theogrin 🦊

@alwayscurious @alyssa

Depends how you define harm. The requirement for checkers becomes exponentially greater with the use of bots and large language models. Of course, one should make one's best effort to check the validity of any resource, but the need for more robust and cautious checks increases the time requirements greatly.

Also, it feels like generating noise at random and then checking for anything which could be a sonnet.

ShadSterling

@alwayscurious @alyssa only if
1. The work required for repeated checking is less than the work required to get the same result without the LLM, and
2. Default access only includes post-check results

But even if those conditions can be met without undermining the viability of the product, it could only be used in contexts for which an ~infallible checker has been integrated, which could not include general use

morgan

@alwayscurious @alyssa There are automated systems to generate mathematical proofs, but I don't think those work anything like LLMs.

Alex Celeste

@alwayscurious

if you know you're trying to generate something specific like proofs, you probably don't need an overwhelmingly overpowered tool like GPT-4 to do that

the use of LLMs is that, through gigantic amounts of wasted computing power, they can appear to emulate a near-infinity of simple tasks; but if you have a specific task in mind the chances are _extremely_ good that you don't need the Swiss-Army-Chainsaw to do it

22

@alyssa do you find the code it emits too buggy to use, or the chat interface too time-consuming to use?

Dawn

@22 it's usually utterly irrelevant. I talked with someone trying to use an LLM to write a mod for a game. The code the LLM offered up was nonsensical and irrelevant to the game; it mashed together multiple game engines and in no way would've interacted with the existing systems.

22

@funky7monkey intriguing, thank you. In my experience as a JavaScript and Python dev, it is extremely helpful with those languages, plus things like ImageMagick and jq and command-line things, so I wonder if it’s better at those than game dev because of maybe more training data or whatever? I value others’ counterexamples so thank you for this.

Matt Hodges

@alyssa @anildash “cited to GPT-4” … ooof … reminds me of spam blogs that put “Source: Reddit” on their content regurgitation mill.

Jennifer Kayla | Theogrin 🦊

@alyssa

One of the best articles I've read isn't about specific types of highly specialized work, or even loosely specialized. It's something a five-year-old can typically explain:

Tic-Tac-Toe.

And ChatGPT is excellent at coming up with a seemingly convincing explanation for its tactics. Long-winded and verbose. But it's pants at playing, and I think that perfectly illustrates the difference between the illusion of intelligence, and actual brains.

aiweirdness.com/optimum-tic-ta

Aris Adamantiadis :verified:💲Paid

@alyssa LLMs are an excellent way of getting quick answers to some problems and, if you're open, of learning a thing or two you haven't considered. It's useful. But taking any LLM output at face value without double-checking means you're foolish and naive.

BlueWinds

@aris

They're not actually useful for that either. They *look* useful for that, but they're actually just as garbage at that as they are at other tasks beyond "stringing together reasonable-seeming English text."

Elliott

@alyssa I asked ChatGPT some fairly basic math questions (not computations) and it very confidently gave wrong answers. Have to be really careful with it.

coupland

@alyssa

Me: "In what year was the Battle of Hastings?"

AI: "The Battle of Hastings took place on October 14, 1066."

I'm sorry but it's your credibility on the subject that's suspect. Wild general statements like "LLMs are only for entertainment" are ridiculous. Here are some *reasonable* statements:

"Be skeptical of everything you read, and when it really matters always verify."

"Don't use a hammer to fix your plumbing. Every tool has an ideal use, choose wisely."

Lotus

@coupland But what's the point in using ChatGPT to answer questions like these? I don't know when that battle took place; I have no way of telling if it's making things up.

The uncertainty isn't worth it for me. I would rather just make a few Google searches and use websites I know are reliable.

coupland

@LotusHopper Because for questions that have a simple, deterministic answer LLMs are generally quite reliable and it's WAY FASTER than doing a web search.

As I said, right tool for the job. A search engine isn't really the best tool for simple questions with a deterministic answer anymore. There's a new tool in town that's great if you use it right.

Lotus

@coupland What is a deterministic answer? Things that don't change with time, like demographic numbers?

coupland

@LotusHopper Questions that can be definitively answered and that require no interpretation nor change depending on your perspective.

"What's 2+2?" or "What year was the Battle of Hastings" or "What is Miley Cyrus' birthday?"

Questions like "what's the best pizza recipe" or "why is Iran/America/Russia so evil" are not well suited to LLMs.

Tröglödÿt

@coupland @LotusHopper

what year something happened is a statement that requires a lot of interpretation and presuppositions

like, what is an event in the context? what calendar is used? what are the commonly agreed upon limits of the type of event?

just because interpretation seems easy to you and you don't recognise that it happens, doesn't mean it isn't necessary

there are no simple facts, because human language is quite complicated

coupland

@troglodyt @LotusHopper Sorry Troglodyt but that's a whole pile of pseudo-intellectual horseshit. There is zero... ZERO... ambiguity to asking what Miley Cyrus' birthday is or what year man landed on the moon. Come back to earth space man.

Tröglödÿt

@coupland @LotusHopper

ok, looking at your feed and your nick i must make a jump and conclude that trying to make conversation with you is an absolute waste of time, you're much too gullible and already totally occupied by less sophisticated forces than my mind

good luck with your faiths little fellow

flere-imsaho

@coupland lol. you're coming from a crypto-dorks instance and have .eth in your display name.

wakame

@alyssa
I read a text (a blog entry? a rant?) a few years ago that annoyed me (a lot).

It was about a researcher who basically stated that people shouldn't criticize or review his papers, because he was "right".
Paraphrasing: "The probability that someone reviewing one of my papers is not understanding it or getting it wrong is vastly higher than the probability of me making a mistake."

Maybe LLMs will finally have the effect that people stop taking everything at face value. In the past, a text was very likely written by a human. We can't say that anymore.

(Of course, the effect could be the opposite: "Our new FactGPT makes sure to tell only 'the truth'. If you see a text with the green FactGPT checkmark™️, you can be sure that it only contains 'truth'.")

Panegyr 🤡🎪

@alyssa I find they can be occasionally useful if you want exactly what they generate, which is a regression to the mean of subjective answers. Queries like "what is a typical naming scheme for a node in a Kubernetes cluster" have a very low likelihood of causing actual harm; just don't trust them for anything more complicated than incredibly general questions that don't have wrong answers. Which, to be clear, is most things. You shouldn't trust them for most things

jz.tusk

@alyssa

Is this the first instance of someone being "chatsplained" to?

Wendell Bell

@alyssa I ‘almost’ told the parties recently that ‘no AI was used in the preparation of this (arbitration) Award,’ which was true, but I finally figured it would be worse to say it: most wouldn’t yet get it, and those who did would think I was making a joke that might not land right.

Kara Goldfinch

@alyssa Yeah. I've used it for daft things like "write me a Dire Straits song about Macbeth". I thought, considering they did one about Romeo and Juliet, it'd be interesting to see what it'd do.
Using it for anything serious? Not a chance.

BrianOnBarrington

@alyssa Gosh, I think what you really need right now is an overconfident straight white guy who got his online learning certificate in GPT4 from LinkedIn to wade into the topic and “educate” you. 🤣

Mark - Ottawa on Tundra 🇨🇦 :mstdnca: :flag_ON:

@alyssa “In a time of deceit telling the truth is a revolutionary act.”
― George Orwell

Alexis :verifiedtransbian:

@alyssa This 1000%. ChatGPT should never be used for serious work, especially anything like legal defense. I'm sorry you have a bunch of ChatGPT apologists in your replies, so I'd like to let you know, there are a lot of us who fully agree with you

Robert Buchberger

They're also good any time the output is easily tested/verified. I've used GPT for little scripts in unfamiliar languages for example.

They're good for manipulating information you give them, but can't be trusted to go out and find it in the first place.

Steffen Christensen

@alyssa I use LLMs for research, for economics, for summarizing, and for programming. It's all fine. LLMs are highly useful tools.

Publishing LLM-produced output without extensive checking and editing is dumb.

Slayerranger/Crackamphetamine

@alyssa Yeah I had a friend mess with OpenAI to write a fake vulnerability report. It lied to him multiple times, and as he continued correcting it, ChatGPT started making dead links to non-existent vulnerabilities so he LOL’d hard at it because it was just citing unrelated CVEs in his troll report 😂

Val Packett

@alyssa@social.treehouse.systems in 👏 this 👏 house 👏 we 👏 only 👏 trust 👏 LLVMs

Gecko

@alyssa I have to admit I find it quite useful for generating initial PoC code.

Though usually I still have to do edits before the code even runs.

That being said, one should never use LLMs as a knowledge source. Today I had it tell me that `let` in Rust is used to declare mutable variables xD

Gecko

@bluewinds @alyssa I'm well aware, hence I try to only use the generated code when I fully understand it.

The heads-up is still very much appreciated nevertheless <3

BlueWinds

@gecko

That's my real point: it's not actually good for anything. It's all hype. If you're understanding it fully before using it, you'd have been better off just doing the work yourself to begin with!

Anything that chatgpt *seems* to be good at, it's more likely to be harmful than helpful.

DELETED

@alyssa Feel like a whole looooooooooooota mothafuckers are about to learn in real time how trust / journalistic integrity works

Brian Grinter

@alyssa LLM is mansplaining-as-a-service - confidently giving completely wrong answers 🤣

Space Cowboy

@alyssa Yeah, I just use it like Google. For some reason, when people click a link on Google they understand the information can be unreliable. With ChatGPT they implicitly trust it.

Where do people think ChatGPT gets its data from?

In this case it's probably just quoting something from a Stack Overflow question from someone who didn't know what they were doing (which is why they were there), but sounding really confident while doing it.

Sean :nivenly: 🦬

@alyssa I've heard some interesting arguments for using it to come up with project names or to expand on an email you're having trouble writing.

Of course, the asterisks to that are: you need to edit it and make it your own after GPT goes at it, and you need to verify everything contained within is true.

aebrer - Andrew E. Brereton

@alyssa I mean I use copilot for coding and it's an LLM and in that context I find it very helpful. It's not doing the decision making though, mostly just remembering obscure syntax for me

Feoh

@alyssa Respectfully, I don't agree. LLMs are super at helping out when you can *know* without the tiniest sliver of doubt that the results are correct, and when you treat the results like a suggestion to be vetted, corrected and massaged and not a completed final deliverable.

SnoopJ

@feoh @alyssa Respectfully, the list of people I trust to actually do this post-facto vetting when using one is very short.

Feoh

@SnoopJ @alyssa I don't wish to argue, but let me give you a very concrete example:

"Write pytest unit tests for this code".

It spews out a page full of code, including all the necessary boilerplate for test setup, database setup, etc. etc.

I then take that and add the higher value tests that the LLM doesn't write.

For another example, I am a bit of a windbag. I take a block of business prose, pass it to the LLM, and say "Rewrite this for conciseness and professional tone."

If you *know english* you can validate the correctness of the prose it generates in terms of conveying intent, and if you care you can even use other tools to validate grammatical correctness.
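
As an illustration of that first workflow (the function and tests below are invented for this sketch, not from the thread): the generated tests cover the obvious cases, and the human reads the code and adds the higher-value one.

```python
# Code under test (illustrative).
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

# The kind of boilerplate an LLM tends to produce:
def test_slugify_basic():
    assert slugify("Hello World") == "hello-world"

def test_slugify_single_word():
    assert slugify("Hello") == "hello"

# The higher-value test a human adds after reading the code:
def test_slugify_collapses_whitespace():
    assert slugify("  Hello\t World ") == "hello-world"
```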

Alyssa Rosenzweig 💜

@feoh @SnoopJ Personally, I am uncomfortable using (current) LLMs for those.

For boilerplate - If a system requires large amounts of boilerplate, that's a red flag to me (as an Opinionated developer) about the solution. I would prefer to improve the ergonomics than repeat boilerplate. I realize that's not always possible, but there's enough crap software out there, I'd rather we didn't generate more. The affordance of IntelliJ proliferation is variable and function names becoming more verbose (that may or may not be good). I suspect the affordance of boilerplate generating tools is... systems requiring more boilerplate. ("It's so easy to generate, what's wrong? You don't like code audits that are needlessly difficult? Upset that defect counts are roughly proportional to quantity of code?")

For both - the issue @SnoopJ raises - the current UIs and marketing work together to discourage vetting and instead trust the generated output. Would you catch a subtle bug in the generated boilerplate that caused tests to pass unconditionally? Would you catch a subtle shift in message from the professionalized text?

For both - would you catch plagiarism or open source license violations in the unattributed generated output?

Maybe your eye is more keen than mine. But I suspect with Copilot my brain would be on Autopilot.

I can't trust the output of these tools, the way I can trust YouCompleteMe and proselint. That's reason enough for me to stay away. If I can't trust them for my own work, I don't know how I could trust what people who do trust the output claim / commit / send.

It's tempting to say the problem is misuse. As an expert on GPUs (but not LLMs), I know that the query in question is unanswerable for current LLMs. The honest response I'd expect asking a human is "I don't know, sorry". Instead, apparently GPT confidently spewed wrong info. Was the asker misusing the LLM? Maybe, but it seems that's what the UX encourages.

The point of this thread isn't a moral judgement. It's just that, looking at other people's use of the tools (and the creative ways it can go terribly wrong), it's becoming clear to me that the emperor has no clothes.

SnoopJ

@alyssa @feoh to me, the larger UX threat is the knowing misrepresentation of LLMs as expert systems for every use case.

I do see the use-case @feoh is talking about, and I've given it a try a few times at the encouragement of others. It's… fine.

But I agree that the overall effect of these systems is corrosive on trust, because as you say, it only takes one such failure to cast a shadow on everything else, even the stuff that isn't LLM output.

Feoh

@SnoopJ @alyssa Oh I totally agree, but I think the onus for that falls squarely at the feet of the people using and relying on these tools in WILDLY inappropriate contexts where they have no business.

I suspect you folks might agree with that :)

Autumn

@alyssa There are some LLMs specifically designed for permutative writing instead of generation, such as Perplexity.ai, which do show promise as a viable AI search-engine replacement, but I feel AI is currently misunderstood and misused by the public, especially those who confuse chatbots for base models and general AI

tl;dr: we need better AI education for the general public to solve for false credibility

TheDoctor

@alyssa maybe I'm wrong here, but I think it's not bad to use LLMs to ask questions, as long as you order them to give you their sources too, so that you can double-check what they tell you.
Did that when asking for dog facts for a friend. The specific question I had wasn't answered properly by any other search engine. So I asked ChatGPT, but also asked it to provide its sources.

Jacob Rowe-Lane

@alyssa Had a similar experience a couple of times. ChatGPT straight up doesn't understand pointers - I was debugging some code and ran it through to see if it could find the error and it very confidently told me that actually I should allocate space for a double pointer to a data structure, and assign the result (a pointer) to a double pointer - in which case I'm returning a pointer to a double pointer and assigning that pointer to a double pointer and ending up with a triple pointer

Minty

@alyssa Thank you for the post. This sort of thing is going to be a huge issue going forward.

DrYak

@alyssa Yes! That!

Trusting what boils down to "autocomplete on steroids" to give you accurate information is completely asinine.

At best, use it to nicely reformulate information that you already know and are feeding to it.

Or don't use it in a scientific context at all.

Sigma

@alyssa@social.treehouse.systems
I sort of agree.
The issue is that some people think of LLMs as knowledge systems, which they aren't.
But I don't think this means that they're just for entertainment either. There are legitimate use cases, for making sense of garbled data for example. There is also emergent behavior, like problem solving, that will be really useful in the future, I think.