@hynek I thought that too, but the more work I get done with LLMs myself the less worried I am about that

I have a Go project I wrote from scratch in production now, despite not being remotely fluent in Go. It has comprehensive test coverage and even implements continuous integration and continuous deployment, which is why I’m confident it’s not a spectacularly bad idea

Would other people YOLO something like that to production without tests? Maybe, and that would definitely be a bad idea!

31 August at 13:44 | Open on fedi.simonwillison.net

10 comments

Timo Zimmermann

@simon @hynek there’s IMHO still a significant difference between writing some code that passes happy path tests and operating a service in production when something goes wrong the first time.

More projects falling apart when looking at them the wrong way and no one around understanding the tooling is IMHO not the solution.

That being said it’s obviously easier with a few decades of experience, knowing exactly what to look for and which questions to ask. But this extrapolates poorly to most devs.

31 August at 14:04 | Open on social.screamingatmyscreen.com

Hynek Schlawack

@simon Yeah but that’s exactly what’s gonna happen once you work under economic constraints and middle managers pining for promotions. My point is exactly what you’re accidentally implying: they’re amazing for tinkering but a time bomb in prod envs. 🤷‍♂️

31 August at 14:05 | Open on mastodon.social

Simon Willison

@hynek I certainly won’t deny that there is an incredible new array of footguns now available to anyone who wants them

31 August at 14:07 | Open on fedi.simonwillison.net

Matthew Martin

@simon @hynek re: rate of adoption for new programming practices at the office
I first saw unit tests in 1998. First project where the entire org was fighting for unit tests rather than deliberately misunderstanding, ignoring or fighting against them: 2020.

It will be 22 years before there is widespread encouragement of AI aided coding. My current client bans AI through the entire org for all purposes. We're all talking about scifi futures for most people.

31 August at 14:19 | Open on mastodon.social

Shauna GM

@simon @hynek do you know Go enough to assess the tests? I have had a number of contributors to a project use AI and often their tests pass but don't actually test the right thing.

31 August at 14:36 | Open on social.coop

Simon Willison

@shauna @hynek I think I know enough about programming to assess the tests: I use tricks like changing the implementation, confirming the test breaks, then fixing the implementation and confirming the test passes

31 August at 14:40 | Open on fedi.simonwillison.net
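
The check described in the comment above (break the implementation, confirm the test fails, restore it, confirm the test passes) can be sketched roughly as follows. This is a minimal illustration in Python, not the Go project from the thread; `slugify`, `slugify_broken`, `check_slugify`, `vacuous_check`, `passes` and `detects_breakage` are all hypothetical names invented for the example.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical correct implementation: lowercase, hyphen-separated slug."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def slugify_broken(title: str) -> str:
    """Deliberately broken variant: it forgets to lowercase the input."""
    return re.sub(r"[^a-zA-Z0-9]+", "-", title).strip("-")


def check_slugify(impl) -> None:
    """The test under scrutiny, parameterized over the implementation."""
    assert impl("Hello, World!") == "hello-world"
    assert impl("  Go in Production  ") == "go-in-production"


def vacuous_check(impl) -> None:
    """A test that passes but checks almost nothing about behavior."""
    assert isinstance(impl("Hello, World!"), str)


def passes(test, impl) -> bool:
    """Run a test against an implementation and report whether it passed."""
    try:
        test(impl)
    except AssertionError:
        return False
    return True


def detects_breakage(test, good_impl, broken_impl) -> bool:
    """A useful test passes on the good code and fails on the broken code."""
    return passes(test, good_impl) and not passes(test, broken_impl)


if __name__ == "__main__":
    print(detects_breakage(check_slugify, slugify, slugify_broken))
    print(detects_breakage(vacuous_check, slugify, slugify_broken))
```

In this sketch `check_slugify` notices the broken variant while `vacuous_check` passes either way, which is the kind of test that passes but does not test the right thing. In practice the "broken" version is usually a temporary manual edit to the real code rather than a second function.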

Simon Willison

@shauna @hynek I have 20+ years of programming to rely on here though - I don’t think “shipping production code in a language you don’t know” is something that’s a great idea without a LOT of that existing experience

31 August at 14:41 | Open on fedi.simonwillison.net

Hynek Schlawack

@simon @shauna Yes, that’s a HUGE qualifier. Given how careers typically work in IT, I’m guessing that’s top 1 percentile.

31 August at 14:44 | Open on mastodon.social

Matthew Martin

@simon @shauna @hynek - only frontier models routinely find bugs with unit tests. 3.5 wrote vacuous tests in comparison to 4 or 4o
- once it fixed the bug via monkey patching before the test ran to make it pass (malicious compliance!)
- the bots write so many unit tests that after a while quantity becomes a quality all of its own, and the value comes with the next change I make: I'll see how sensitive the rest of the app was to a change in any part of the app (which points out design flaws)

31 August at 14:45 | Open on mastodon.social

Zane Selvans

@simon @hynek Thus far Copilot has made me more likely to write tests -- I've always found them tedious (even though necessary!) and so have been lazy about it, but now it feels rewarding, and once the scaffolding is there, it's not too bad to add extra cases either by hand or with the LLM. I don't think I would have taken it seriously as an option without your posts Simon.

31 August at 15:14 | Open on social.coop