Dan Sugalski

Turns out that LLM summaries are actually useful.

Not for *summarizing* text -- they're horrible for that. They're weighted statistical models and by their very nature they'll drop the least common or most unusual bits of things. Y'know, the parts of a message that are actually important.

No, where they're great is as a writing check. If an LLM summary of your work is accurate, that's a sign that what you wrote doesn't have much interesting information in it, and maybe you should try harder.
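
A minimal sketch of that writing check, assuming the openai Python client and an API key (the model name and the three-sentence limit are placeholder choices, not anything from the thread):

```python
# Sketch: summarize a draft, then eyeball how much of what you meant
# to say survives. If the summary captures everything, the draft may
# not be carrying much information.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summary_check(draft: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Summarize the following text in three sentences."},
            {"role": "user", "content": draft},
        ],
    )
    return response.choices[0].message.content

print(summary_check(open("draft.txt").read()))
```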

23 comments
Dan Sugalski

I suspect, in the most cynical way I can muster (which, tbh, is *fantastically* and enthusiastically cynical), that this is why upper management loves them so much and thinks they're awesome for summarizing emails -- the overwhelming majority of C-suite/VP+ level communication is performative and essentially information-free.

Paul Hewson

@wordshaper I don't do cynicism, but there is an element of truth there. Don't pass details upwards, ever.

Dan Sugalski

@texhewson Luckily LLM summaries will tend to scrub those details out, so that's nice.

Josh Grant

@wordshaper I've wondered quite a bit whether the exec suites overall are a major driver of LLM adoption because it appeals to them specifically

Dan Sugalski

@joshin4colours I suspect there was some (probably unintentional) training of the LLMs to favor making executive-suite folks happy. The model training does have some amount of directed feedback in it, so answers (and token weights) that appeal to decision makers get reinforced as part of the feedback process, while answers that sound bad to the C-suite get negative reinforcement.

softwarecrisis.dev/letters/llm covers this, more or less.

rk: not a typewriter

@wordshaper

“LLM summaries are a proxy measurement for Shannon entropy”

Corbin

@rk @wordshaper Indeed, when using local inference, it's possible to get perplexity on a per-token basis as well as for the entire context window. Perplexity is merely the exponential of the Shannon entropy, so a lower perplexity indicates a less-surprising text. The reason to prefer this over reading generated summaries is that perplexity is numeric and quantitative.
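
A sketch of the arithmetic Corbin describes: perplexity is the exponential of the average negative log-probability per token, so it can be computed directly from whatever per-token log-probabilities a local runtime reports (the numbers below are invented for illustration):

```python
import math

def perplexity(token_logprobs):
    # Perplexity = exp(-(1/N) * sum(log p_i)): the exponential of the
    # average per-token cross-entropy. Lower = less surprising text.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical natural-log probabilities for five tokens, the kind of
# per-token figures a local inference runtime can expose.
logprobs = [-0.2, -1.5, -0.05, -3.2, -0.7]
print(f"perplexity: {perplexity(logprobs):.2f}")  # -> perplexity: 3.10
```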

Shiri Bailem

@wordshaper I disagree, they're great for summarizing. It's akin to high-powered skimming, so yeah, it's not good for little unusual details, but if you're using an AI to summarize text and have any sense, you're not looking at text where you're worried about those little exceptions.

Prime example that I've leaned on often is that Amazon now provides an AI summarization at the top of reviews. Odd exceptions are just that with reviews, but it'll definitely highlight the general patterns and let you know if you should dig deeper.

Also, I've honestly used it like the above once or twice with the intent of just seeing if my point comes across... most conversations aren't really about the fine details or unusual bits?

Hippo 🍉

@wordshaper ooh I should try that on #Snipette articles! We try to make them quirky and add all sorts of interesting tangents that go all over the place (in a coherent way for a human, but I'd like to see how AI fares) 👀️

Dan Sugalski

@badrihippo It could be fun to see exactly how different the LLM-generated summary for an article is compared to the article itself. (I am *not* tempted to build some kind of web-based writing game where you try to get the most-wrong-possible summary. I'm not, definitely, really not...)

Steve Jones

@wordshaper It draws the lines you must color outside of.

Steve

@wordshaper I'm a recruiter and briefly tried using an LLM (Claude) to summarize transcripts of screening calls with candidates. I asked it to summarize a candidate's responses to specific questions and points. It absolutely left out little details I would have highlighted to the hiring manager, and even hallucinated on more than one occasion, injecting its own personality judgements. Maybe it's gotten better since, but it seems useless when any kind of nuance is required.

CaveDave

@wordshaper that's why they're great for cover letters. No interesting information is a requirement.

Dan Sugalski

@engravecavedave True! One of the things about generative LLMs that I like is how they basically kill some of the privilege-based gatekeeping activities, like cover letters or college entrance essays.

Stuff like this from anyone outside a certain band of resources or privilege is useless because it doesn't reflect the person submitting it. Either it reflects that someone had no resources and was never taught how to do the thing, or had resources and basically bought the thing.

Bram Diederik

@wordshaper Oh yeah, I do that.

Letting AI summarise something is like a box of chocolates.

Just generated a blog and a newsbrief on the bus on my way to Amsterdam.

zellyn

@wordshaper My rule of thumb: if a piece of writing is useless bullshit, then an LLM can *definitely* do it.

tqwhite

@wordshaper @Walrus word!! I have a standard prompt that I use. It gives great results.
