Dan Sugalski

@joshin4colours I suspect there was some (probably unintentional) training of the LLMs to favor making executive-suite folks happy. Model training does involve some amount of directed feedback, so answers (and token weights) that appeal to decision makers get reinforced as part of the feedback process, while answers that don't sound great to the C-suite get negative reinforcement.

softwarecrisis.dev/letters/llm covers this, more or less.
