Simon Willison

Do you ever use LLM tools like Claude or ChatGPT to help code up exploratory prototypes?

(Specifically asking about prototyping here, because I'm beginning to think it's a particularly valuable application of this tech)

Anonymous poll

No, I've not tried that: 96 votes (22.7%)
No, I've tried and found it didn't help me: 59 votes (14%)
No, I disagree with the ethics of it: 82 votes (19.4%)
Yes: 185 votes (43.8%)
422 people voted.
Voting ended 29 August at 17:49.
16 comments
Chris (Master of Potate) 🥔

@simon talking to a chat tends not to fit into my workflow. I often work in an exploratory way, so my plan forms during programming. I like code completion though.

Anand Philip

@simon I don't know enough about app deployment and front end to do this without LLM help. I've published two experiments that got a lot of eyeballs and gave me a great deal of insight, purely based on Gemini-generated code. Love it.

ijm

@simon I'm surprised by the % saying they disagree with the ethics when *prototyping*; that's interesting.

Simon Willison

@ijm my guess is that a lot of that could be about the energy usage and environmental impact, as illustrated by this recent thread toot.cafe/@baldur/113017028974

Simon Willison

@ijm and to be honest, “it’s just for the prototype” as an ethical workaround doesn’t hold up when, if the prototype is any good, the chances are quite a bit of it will end up used in the final product

ijm

@simon Ah, I was thinking of the workforce ethics - many don't have the resources to expend on a 'risky' prototype.
Interesting thread, which probably also relates to a paper I just read on where the compute is being done (doi.org/10.31235/osf.io/8yp7z)
Now I'm wondering if anyone has yet tried to do a full CO2 footprint for a slow-traditional-prototype vs a fast-with-ai-prototype.

John Feminella 🌠

@simon Routinely. Some of the biggest (eight-figure USD deals) applications I've sold over the last 18 months have been about building things that crank out prototypes for clients dealing in bespoke software spaces. It's become particularly good with Claude Sonnet and I'm hoping Q*/Strawberry proves to be an even further improvement.

casraf :typescript: 🇮🇱

@simon
Yep, I also use it to get a clue about new languages I am not familiar with. Usually I need to do a function or two here and there to get me to understand some of the stdlibs

Tom Phillips

@simon For me I find a lot of the value of spikes and prototypes comes from the process, e.g. discovering that things work differently than I expected. Even if an LLM can give me a working prototype I am worried about the loss of that learning and discovery. I might be wrong though. I'll try it next time and see.

Simon Willison

@twp I'm finding that LLM prototypes are accelerating that process for me too - it's much quicker to try a different approach ("what if I do this with a subprocess instead of threads, how about if I use a SQL UNION here, could AppleScript get this done for me better?") and I'm still reading the code so I'm still learning from what works and what doesn't.

Ian Wagner

@simon yes, but to be honest it is only well suited to specific domains; usually the ones with poor dev tools and a lot of ceremony and boilerplate which also have a lot of users 😂 But it can speed things up there sometimes.

Janne Moren

@simon
In my brief exploration of it (and based on others' experience) it seems to be a direct replacement for Stack Overflow.

That is, if you use reasonably mainstream technology, and you want help in solving a common problem or implementing a standard solution, perhaps with a small twist, then it's helpful and generally correct.

But as you veer off the mainstream path, the suggestions rapidly become misleading and wrong, and it's faster figuring it out for yourself.

Simon Willison

@jannem I’ve not been finding that myself - sure, it’s best at Python and JavaScript and SQL but I’ve been getting great results for languages I don’t know well (or at all) like Go and AppleScript

You gotta get good at testing what it produces, but that’s a similar skill to reviewing code by other people

Janne Moren

@simon
I was trying to get help on Chapel, but that failed pretty badly.

And if I try with GDScript the models will tend to give me Python code instead. It looks quite similar, and with orders of magnitude more Python training examples that's perhaps not unexpected.

Oh, and I did try getting help on writing lock-free parallel code in Julia but that failure may honestly be a Julia problem more than the models'.

Simon Willison

@jannem hah, yeah your choice of programming languages is a whole lot less mainstream than mine!

Janne Moren

@simon
Which is not a problem, to be clear. Handling big languages with common problems is the high impact case.

Also, I just asked Gemini for a parallel prefix-sum in Chapel and it gave me a credible-looking piece of code. Now, prefix-sum is Baby's First Parallel Algorithm, and there are bound to be Chapel examples out there to draw from, but that's still a lot better than what I got with an earlier model.
