A study of codebases over a 4-year period shows GitHub Copilot lowers the quality of code over time by increasing the likelihood of bugs being introduced and the amount of copy-pasted code.

This is a healthy counterpoint to studies that show improved productivity.

Fast. Cheap. Good. Pick Two.

https://visualstudiomagazine.com/articles/2024/01/25/copilot-research.aspx

29 January at 11:23 | Open on mas.to

19 comments

Kierkrampusgaanks regretfully

@carnage4life I’ll have fast and cheap please

- venture capitalists

29 January at 11:25 | Open on beige.party

thomastc

@carnage4life I'm as sceptical about "AI" as anyone, but it's worth noting that this is not an impartial peer-reviewed study, but rather a whitepaper published by a company that sells a product for code review by humans. They have skin in this game.

29 January at 11:47 | Open on mastodon.gamedev.place

glipari

@carnage4life Why am I not surprised?

29 January at 12:04 | Open on social.sciences.re

Chris

@carnage4life
Oddly, I’m not at all startled by this. 🙄

29 January at 12:45 | Open on mastodon.online

Robin Syl 🌸:blobcatreach:

@carnage4life "it easy to be fast if every step wrong" - uncle roger

29 January at 13:01 | Open on meow.social

DeManiak 🇿🇦 🐧

@carnage4life Shocked.

Absolutely shocked.

Well..not THAT shocked.

29 January at 13:34 | Open on social.oevents.co.za

Thomas 🔭🕹️

@carnage4life "Save 5 minutes of developer time now, pay for a class action suit later!"

29 January at 14:34 | Open on hachyderm.io

Jason Sando

@thomasfuchs @carnage4life Ugh. Even with all of us screaming "don't trust the output from AI!" some company will undoubtedly try to blame their own negligence on it. You just know it's coming at some point.

29 January at 15:46 | Open on hachyderm.io

Feoh

@carnage4life It's interesting. I wonder how much of that is the advertised feature where you add a comment and Copilot writes the entirety of the code for you.

I use Tabnine - https://www.tabnine.com/ and really appreciate its workflow. It just sits there silently while I code, helps with boilerplate, and bases its suggestions on the patterns that already exist within my code base and repository.

I find it rather helpful, but I'm also fluent enough with Python that I know where the potholes are.

I think there's room for tools like this to be used in a common sense way to boost productivity while maintaining quality, but I realize that's a minority opinion.

29 January at 14:43 | Open on oldbytes.space

Dave

@carnage4life Has GitHub Copilot been available for 4 years? It looks like it was released in 2021. Did the folks in this study have access to it before that?

29 January at 15:01 | Open on universeodon.com

San Wu

@carnage4life Even if that's true, I'm sure the statistics are very different per tiers of engineering quality. It's almost a certainty that bug rate increase with Copilot is way, way higher with low-quality software engineers (boot camp graduates) than with SDEs from top 25 schools.

29 January at 15:22 | Open on mastodon.social

Sassinake! - ⊃∪∩⪽

@carnage4life

When did MS buy GitHub again?

29 January at 16:06 | Open on mastodon.social

Peter Amstutz

@carnage4life
Save two hours writing boilerplate, spend ten hours debugging incredibly subtle bugs in production, sounds like a win!

Yeah, the fact that LLMs inherently produce "plausible" results rather than "correct" results means they produce bugs that are that much harder to spot.

29 January at 17:41 | Open on hachyderm.io

Alan Miller :verified_paw:

@tetron @carnage4life 'Truthy' code

30 January at 5:00 | Open on infosec.exchange

Lafncow :blobcatcoffee:

@carnage4life I would not be surprised if these conclusions are true, but this "study" is very flimsy. They compare total metrics across all measured codebases between 2023 and the prior years, assume that all differences are due to Copilot use, and then extrapolate conclusions. I am definitely an AI skeptic, but this is a marketing piece, not a research paper.

29 January at 17:58 | Open on mastodon.social

Patrick Lam :tinoflag:

@lafncow @carnage4life the research just hasn't been done yet (and should be!) The press write-up does quote a workshop paper I was involved in but that's also not strong evidence yet either.

29 January at 22:21 | Open on mastodon.nz

Lafncow :blobcatcoffee:

@va2lam @carnage4life I agree, this is an area that really should be researched! Thank you for jumping in, I hope you or others get to research these questions deeper!

29 January at 22:54 | Open on mastodon.social

Patrick Lam :tinoflag:

@lafncow @carnage4life alas, the general productivity question is not one that I'm well equipped to investigate, but yes, specific properties of (generated) code are up my alley.

29 January at 22:57 | Open on mastodon.nz

Todd Knarr

@carnage4life LLMs for coding: the new cargo cult.

30 January at 5:42 | Open on mstdn.social