Dan Sugalski

Turns out that LLM summaries are actually useful.

Not for *summarizing* text -- they're horrible for that. They're weighted statistical models and by their very nature they'll drop the least common or most unusual bits of things. Y'know, the parts of a message that are actually important.

No, where they're great is as a writing check. If an LLM summary of your work is accurate, that's a sign that what you wrote doesn't have much interesting information in it, and maybe you should try harder.
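
A minimal sketch of that writing check, assuming the openai Python client and an API key (the model name and the three-sentence limit are placeholder choices, not anything from the thread):

```python
# Sketch: summarize a draft, then eyeball how much of what you meant
# to say survives. If the summary captures everything, the draft may
# not be carrying much information.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summary_check(draft: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Summarize the following text in three sentences."},
            {"role": "user", "content": draft},
        ],
    )
    return response.choices[0].message.content

print(summary_check(open("draft.txt").read()))
```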

23 comments
Dan Sugalski

I suspect, in the most cynical way I can muster (which, tbh, is *fantastically* and enthusiastically cynical), that this is why upper management loves them so much and thinks they're awesome for summarizing emails -- the overwhelming majority of C-suite/VP+ level communication is performative and essentially information-free.

Paul Hewson

@wordshaper I don't do cynicism, but there is an element of truth there. Don't pass details upwards, ever.

Dan Sugalski

@texhewson Luckily LLM summaries will tend to scrub those details out, so that's nice.

Josh Grant

@wordshaper I've wondered quite a bit whether the exec suites overall are a major driver of LLM adoption because it appeals to them specifically

Dan Sugalski

@joshin4colours I suspect there was some (probably unintentional) training of the LLMs to favor making executive-suite folks happy. The model training does have some amount of directed feedback in it, so answers (and token weights) that appeal to decision makers get reinforced as part of the feedback process, while answers that sound bad to the C-suite get negative reinforcement.

softwarecrisis.dev/letters/llm covers this, more or less.

rk: not a typewriter

@wordshaper

“LLM summaries are a proxy measurement for Shannon entropy”

Corbin

@rk @wordshaper Indeed, when using local inference, it's possible to get perplexity on a per-token basis as well as for the entire context window. Perplexity is merely the exponential of the Shannon entropy, so a lower perplexity indicates a less-surprising text. The reason to prefer this over reading generated summaries is that perplexity is numeric and quantitative.
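
A sketch of the arithmetic Corbin describes: perplexity is the exponential of the average negative log-probability per token, so it can be computed directly from whatever per-token log-probabilities a local runtime reports (the numbers below are invented for illustration):

```python
import math

def perplexity(token_logprobs):
    # Perplexity = exp(-(1/N) * sum(log p_i)): the exponential of the
    # average per-token cross-entropy. Lower = less surprising text.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical natural-log probabilities for five tokens, the kind of
# per-token figures a local inference runtime can expose.
logprobs = [-0.2, -1.5, -0.05, -3.2, -0.7]
print(f"perplexity: {perplexity(logprobs):.2f}")  # -> perplexity: 3.10
```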

Shiri Bailem

@wordshaper I disagree, they're great for summarizing. It's akin to high-powered skimming, so yeah, it's not good for little unusual details, but if you're using an AI to summarize text and have any sense, you're not looking at text where you're worried about those little exceptions.

Prime example that I've leaned on often is that Amazon now provides an AI summarization at the top of reviews. Odd exceptions are just that with reviews, but it'll definitely highlight the general patterns and let you know if you should dig deeper.

Also, I've honestly used it like the above once or twice with the intent of just seeing if my point comes across... most conversations aren't really about the fine details or unusual bits?

Hippo 🍉

@wordshaper ooh I should try that on #Snipette articles! We try to make them quirky and add all sorts of interesting tangents that go all over the place (in a coherent way for a human, but I'd like to see how AI fares) 👀️

Dan Sugalski

@badrihippo It could be fun to see exactly how different the LLM-generated summary for an article is compared to the article itself. (I am *not* tempted to build some kind of web-based writing game where you try to get the most-wrong-possible summary. I'm not, definitely, really not...)

Steve Jones

@wordshaper It draws the lines you must color outside of.

Steve

@wordshaper I'm a recruiter and briefly tried using an LLM (Claude) to summarize transcripts of screening calls with candidates. I asked it to summarize a candidate's responses to specific questions and points. It absolutely left out little details I would have highlighted to the hiring manager, and even hallucinated on more than one occasion, injecting its own personality judgements. Maybe it's gotten better since, but it seems useless when any kind of nuance is required.

CaveDave

@wordshaper that's why they're great for cover letters. No interesting information is a requirement.

Dan Sugalski

@engravecavedave True! One of the things about generative LLMs that I like is how they basically kill some of the privilege-based gatekeeping activities, like cover letters or college entrance essays.

Stuff like this from anyone outside a certain band of resources or privilege is useless because it doesn't reflect the person submitting it. Either it reflects that someone had no resources and was never taught how to do the thing, or had resources and basically bought the thing.

Bram Diederik

@wordshaper Oh yeah, I do that.

Letting AI summarise something is like a box of chocolates.

Just generated a blog and a newsbrief on the bus on my way to Amsterdam.

zellyn

@wordshaper My rule of thumb: if a piece of writing is useless bullshit, then an LLM can *definitely* do it.

tqwhite

@wordshaper @Walrus word!! I have a standard prompt that I use. It gives great results.
