@joshin4colours I suspect the LLMs were trained (probably unintentionally) to favor keeping executive-suite folks happy. Model training includes some amount of directed feedback, so answers (and token weights) that appeal to decision makers get reinforced during the feedback process, while answers that sound bad to the C-suite get negative reinforcement.

https://softwarecrisis.dev/letters/llmentalist/ covers this, more or less.

23 Oct 2024 at 17:54