I got sucked into GitClear’s “Coding on Copilot: 2023 Data Shows Downward Pressure on Code Quality” after seeing it posted on Mastodon and then talking about it with a couple of co-workers. The abstract didn’t pass the sniff test, so I ended up reading the whole paper. Here’s the summary, plus some comments afterward.
I reordered the presentation to better match academic style, and to bring some important details up out of the appendices.
They start with almost a billion lines of diff from their customers and GitHub data, filtered down to 153 million non-noop lines. They classify changes into
They calculate the time between a line being added and its first delete/move/update. They label a time of less than 2 weeks “churn”. Then they calculate the total lines changed.
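To make the churn definition concrete, here is a minimal sketch of how I read it. This is not GitClear’s code: the function names, the data layout, and the choice of “all added lines” as the denominator are my assumptions, and the example timestamps are made up.

```python
from datetime import datetime, timedelta

# The 2-week threshold comes from the paper; everything else is assumed.
CHURN_WINDOW = timedelta(weeks=2)


def time_to_first_change(added_at, change_times):
    """Return the delay until the first later delete/move/update,
    or None if the line was never changed."""
    later = [t for t in change_times if t >= added_at]
    return min(later) - added_at if later else None


def churn_rate(lines):
    """Fraction of added lines whose first change came within CHURN_WINDOW.

    `lines` is a list of (added_at, [change timestamps]) pairs.
    Assumption: the denominator is every added line, changed or not.
    """
    if not lines:
        return 0.0
    churned = 0
    for added_at, change_times in lines:
        delay = time_to_first_change(added_at, change_times)
        if delay is not None and delay < CHURN_WINDOW:
            churned += 1
    return churned / len(lines)


# Hypothetical lines, all added on the same day.
added = datetime(2023, 1, 1)
example_lines = [
    (added, [datetime(2023, 1, 5)]),   # changed after 4 days: churn
    (added, [datetime(2023, 3, 1)]),   # changed after ~2 months: not churn
    (added, []),                       # never changed: not churn
    (added, [datetime(2023, 1, 15)]),  # exactly 14 days: not "less than" 2 weeks
]
print(churn_rate(example_lines))
```

Note how much the result depends on the denominator: counting only lines that were eventually changed, instead of all added lines, would give a different number from the same data.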
I’ve omitted all the 2024 projections (which, ironically, they obtained by asking ChatGPT to do a regression instead of doing it themselves), and have replaced the double chart crimes with better charts that show the scale of changes.
Notably, the Time To Churn chart says that time to churn was less than 2 weeks for 65% of code in 2020, but the separate Churn value is only 3.3%. I don’t know what the difference between these measures is.
There’s a good deal of discussion about why it’s bad to have less moved code and more copy/pasted code.
They note that GitHub’s own survey reports that developers have concerns about the quality of AI-generated code.
The biggest threats to their interpretations are:
In their appendix, you can see that the number of repos and the number of commits they use both almost triple. Any analysis that fails to normalise between years is flawed, especially if there’s no attempt to control for other causes of change. Specifically, the tech industry went through a rapid boom/bust/boom cycle in 2020–2023, where many new programmers were hired, then fired, then hired again.
For example, commits per repo is overall flat from 2020–2023, but committers per repo goes down. An equally plausible explanation for the changes they report is that fewer programmers are working on each project, because tech companies hired thousands of new employees and started them working on hundreds of new projects.
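Here is a rough sketch of the kind of per-repo normalisation I mean. The totals are invented purely for illustration and are not the paper’s figures; the `per_repo` helper and its field names are likewise hypothetical.

```python
def per_repo(yearly_totals):
    """Convert raw yearly totals into per-repo rates so that years
    with very different repo counts can be compared."""
    rates = {}
    for year, totals in sorted(yearly_totals.items()):
        repos = totals["repos"]
        rates[year] = {
            "commits_per_repo": totals["commits"] / repos,
            "committers_per_repo": totals["committers"] / repos,
        }
    return rates


# Made-up totals: repos and commits triple, committers grow more slowly.
example_totals = {
    2020: {"repos": 100, "commits": 5000, "committers": 400},
    2023: {"repos": 300, "commits": 15000, "committers": 900},
}

example_rates = per_repo(example_totals)
for year, rate in example_rates.items():
    print(year, rate)
```

With these invented numbers the raw commit count triples, yet commits per repo stays at 50 while committers per repo drops from 4 to 3. That is a staffing pattern, and it shows up without saying anything about code quality.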
There are lots of minor problems with their discussion as well, mostly having to do with trying to bend statistics and charts to make their point stronger. This is a white paper written by a company selling code quality tools, not an academic paper.