@ahltorp @thomasfuchs I doubt it's going to work for large-scale systems anytime soon. Imagine the kind of "natural language" specification you'd need to produce something like, e.g., Firefox, Unreal Engine, or the Linux kernel.
But for the small LLM-generated apps people are producing today (where the LLM iterates based on error messages from the compiler), you change them by changing the natural-language prompt that generated them.
...
@ahltorp @thomasfuchs ...but at least right now, this has the major problem that you can't be sure the regenerated app didn't also change somewhere else.
(And we *know* that LLMs don't currently work well for larger-scale system maintenance: their performance on the SWE-bench benchmark, where they're given actual GitHub issues on actual GitHub repos rather than LeetCode-style problems, is *abysmal*, a 0-4% success rate.)
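For the small-app case, one rough way to see whether a prompt change also touched something else is to diff the output generated before and after. This is only a sketch: `changed_lines` is a made-up helper name, and it assumes you kept both generated versions around as plain text.

```python
import difflib


def changed_lines(old_source: str, new_source: str) -> list[str]:
    """Return only the removed ("- ") and added ("+ ") lines between two versions.

    Hypothetical helper for illustration: old_source and new_source are the
    program text generated before and after the prompt was edited.
    """
    diff = difflib.ndiff(old_source.splitlines(), new_source.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    before = "a = 1\nb = 2\nc = 3\n"
    # Suppose the prompt change was only meant to affect `b`.
    after = "a = 1\nb = 20\nc = 30\n"
    for line in changed_lines(before, after):
        print(line)
```

A textual diff only shows *that* something else changed, not whether the change matters, so it doesn't remove the need to actually read or test the result.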