Thomas 🔭🕹️

@datarama LLMs categorically can’t do the jobs they want to use them for so good luck with that

27 January at 18:02 | Open on hachyderm.io

11 comments

Jernej Simončič

@thomasfuchs @datarama It's telling that OpenAI isn't using ChatGPT for their support.

27 January at 18:32 | Open on infosec.exchange

datarama

@jernej__s @thomasfuchs It is! Ironically, while customer support was the originally envisioned "killer app" for chatbots, LLMs are actually *worse* at it than old-school chatbots were. Old-school chatbots don't hallucinate (and potentially mislead the customer), and they're not vulnerable to prompt-injection trickery (so you can't, e.g., get them to promise to sell you a car for 10 dollars).

27 January at 18:39 | Open on hachyderm.io

datarama

@jernej__s @thomasfuchs ...but you also couldn't get an old-school customer support chatbot to write you an algorithm that implements Floyd-Steinberg dithering in Python, so there's that.

27 January at 18:46 | Open on hachyderm.io

Dragon-sided D

@datarama @jernej__s @thomasfuchs Most corporates that offer AI support bots are deploying RAG capabilities.

That basically solves the hallucination issue.

27 January at 20:21 | Open on sciencemastodon.com

Nacho

@thomasfuchs
Can it do the job? No. But can it convince enough people that it can so that their objective (devaluing human labor) is achieved? Well, that's what they're betting on.
@datarama

27 January at 18:46 | Open on mastodon.uy

Dragon-sided D

@nachof @thomasfuchs @datarama you seem quite convinced “it can’t do the job”

Seems a bit premature to me. Right now is the least capable this technology will ever be.

27 January at 20:23 | Open on sciencemastodon.com

datarama

@thomasfuchs They don't necessarily need to. In some contexts, the promise (which may or may not work) is that they can be used to replace expensive human experts with cheaper "proofreader" roles. Instead of solving interesting technical problems, the human's role degrades to verifying that the LLM didn't screw it up.

And in some contexts, that latter part can be at least partially done using non-LLM software (e.g. formal verification tools in programming).

27 January at 18:51 | Open on hachyderm.io

datarama

@thomasfuchs The former part of this was *exactly* what the Hollywood writers went on strike over. They *absolutely didn't* want that to happen to their trade.

(Both because of the pay cut, and the likelihood that they'd end up rewriting the whole thing from scratch anyway - but also because even when it *does* work, they didn't want to cede the *creative* part of their job to LLMs, leaving them only with drudgery.)

27 January at 18:53 | Open on hachyderm.io

Magnus Ahltorp

@datarama @thomasfuchs Has anyone put any thought at all into how you maintain or change these LLM-generated sourcecodeless systems, even *when* it produces somewhat usable code?

“Let’s throw out everything we know about software development, that will probably not cause any problems”

27 January at 19:07 | Open on mastodon.nu

datarama

@ahltorp @thomasfuchs I doubt it's going to work for large-scale systems anytime soon. Imagine the kind of "natural language" specification you'd need to produce something like, e.g., Firefox, Unreal Engine, or the Linux kernel.

But for the small LLM-generated apps people are producing today (where the LLM iterates based on error messages from the compiler), you change them by changing the natural-language prompt that generated them.

...

27 January at 19:14 | Open on hachyderm.io

datarama

@ahltorp @thomasfuchs ...but at least right now, this has the major problem that you can't be sure it didn't also change something else.

(And we *know* that they don't currently work well for larger-scale system maintenance: their performance on the SWE-bench benchmark, where they're given actual GitHub issues on actual GitHub repos rather than leetcode problems, is *abysmal*, with a 0-4% success rate.)

27 January at 19:16 | Open on hachyderm.io