Thomas 🔭🕹️

@datarama LLMs categorically can’t do the jobs they want to use them for so good luck with that

11 comments
Jernej Simončič

@thomasfuchs @datarama It's telling that OpenAI isn't using ChatGPT for their support.

datarama

@jernej__s @thomasfuchs It is! Ironically, while customer support was the originally envisioned "killer app" for chatbots, LLMs are actually *worse* at it than old-school chatbots were. Old-school chatbots don't hallucinate (and potentially mislead the customer), and they're not vulnerable to prompt-injection trickery (so you can't, e.g., get them to promise to sell you a car for 10 dollars).

datarama

@jernej__s @thomasfuchs ...but you also couldn't get an old-school customer support chatbot to write you an algorithm that implements Floyd-Steinberg dithering in Python, so there's that.
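For reference, the kind of thing being asked for there is small. A minimal sketch of Floyd-Steinberg dithering in plain Python, assuming a grayscale image given as a list of rows with values in 0..255 and a fixed threshold of 128 (the function name `floyd_steinberg` and the input format are illustrative choices, not taken from the thread):

```python
def floyd_steinberg(pixels):
    """Dither a grayscale image (rows of 0..255 values) to pure black and white.

    Assumption: `pixels` is a rectangular list of lists; output uses 0 and 255.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    # Work on a float copy so diffused error can accumulate between pixels.
    buf = [[float(v) for v in row] for row in pixels]
    out = [[0] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            old = buf[y][x]
            new = 255 if old >= 128 else 0  # quantize to the nearest of two levels
            out[y][x] = new
            err = old - new  # quantization error to push onto unvisited neighbours

            # Classic weights: 7/16 right, 3/16 below-left, 5/16 below, 1/16 below-right.
            if x + 1 < width:
                buf[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    buf[y + 1][x - 1] += err * 3 / 16
                buf[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    buf[y + 1][x + 1] += err * 1 / 16
    return out


if __name__ == "__main__":
    gray = [[128] * 8 for _ in range(4)]
    for row in floyd_steinberg(gray):
        print("".join("#" if v else "." for v in row))
```

Error that would fall outside the image edges is simply dropped in this sketch; other edge policies exist.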

Dragon-sided D

@datarama @jernej__s @thomasfuchs Most corporates that offer AI support bots are deploying RAG capabilities.

That basically solves the hallucination issue.

Nacho

@thomasfuchs @datarama Can it do the job? No. But can it convince enough people that it can so that their objective (devaluing human labor) is achieved? Well, that's what they're betting on.

Dragon-sided D

@nachof @thomasfuchs @datarama you seem quite convinced “it can’t do the job”

Seems a bit premature to me. Right now is the least capable this technology will ever be.

datarama

@thomasfuchs They don't necessarily need to. In some contexts, the promise (which may or may not work) is that they can be used to replace expensive human experts with cheaper "proofreader" roles. Instead of solving interesting technical problems, the human's role degrades to verifying that the LLM didn't screw it up.

And in some contexts, that latter part can be at least partially done using non-LLM software (e.g. formal verification tools in programming).

datarama

@thomasfuchs The former part of this was *exactly* what the Hollywood writers went on strike over. They *absolutely didn't* want that to happen to their trade.

(Both because of the pay cut, and the likelihood that they'd end up rewriting the whole thing from scratch anyway - but also because even when it *does* work, they didn't want to cede the *creative* part of their job to LLMs, leaving them only with drudgery.)

Magnus Ahltorp

@datarama @thomasfuchs Has anyone put any thought at all into how you maintain or change these LLM-generated sourcecodeless systems, even *when* the LLM produces somewhat usable code?

“Let’s throw out everything we know about software development, that will probably not cause any problems”

datarama

@ahltorp @thomasfuchs I doubt it's going to work for large-scale systems anytime soon. Imagine the kind of "natural language" specification you'd need to produce something like, e.g., Firefox, Unreal Engine, or the Linux kernel.

But for the small LLM-generated apps people are producing today (where the LLM iterates based on error messages from the compiler), you change them by changing the natural-language prompt that generated them.

...

datarama

@ahltorp @thomasfuchs ...but at least right now, this has the major problem that you can't be sure it didn't also change something else.

(And we *know* that they don't currently work well for larger-scale system maintenance: their performance on the SWE-bench benchmark, where they're given actual GitHub issues on actual GitHub repos rather than leetcode problems, is *abysmal*: a 0-4% success rate.)
