@simon this is where I had gotten to as well. It seems very limiting for using LLMs as part of automated processes, especially if it’s hard to detect final states and the LLMs are apparently bad at managing length on variable-length tasks.