The world is being quietly rearranged by people who write very long documents.


April 7, 2026
arXiv
The title they went with
How AI Coding Agents Modify Code: A Large-Scale Study of GitHub Pull Requests
Noisy translates that to

AI code generators write more commits but describe their changes slightly more accurately than human developers


Researchers analyzed 24,000 pull requests from AI coding agents and 5,000 from human developers on GitHub. AI agents produce more commits per pull request and are slightly better at matching their descriptions to what the code actually does: a measurable but narrow behavioral difference.
This is the first large-scale empirical measurement of how AI coding agents actually behave in production development workflows, not in controlled benchmarks. Until now, claims about AI code quality have been theoretical or anecdotal.

The finding cuts both ways. AI agents are noticeably verbose in commits (an effect size of 0.54), which could mean either that they work in more granular steps or that they add noise to the repository. The slight accuracy improvement in PR descriptions is genuine but modest: it does not mean AI agents are more reliable, only that their documentation is slightly more consistent with their code. What matters is that we now have a dataset large enough to stop guessing and start measuring whether AI contributions degrade development workflows or integrate seamlessly. The next question is whether that verbosity compounds over time in large codebases.
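For readers unfamiliar with effect sizes: figures like the 0.54 above are typically Cohen's d, the difference between two group means divided by their pooled standard deviation, so a value around 0.5 is conventionally read as a "medium" difference. A minimal sketch of the calculation, using invented commits-per-PR samples rather than the paper's actual data:

```python
import statistics

def cohens_d(a, b):
    """Cohen's d: difference in means scaled by the pooled standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

# Hypothetical samples for illustration only (not from the study)
agent_commits = [3, 4, 5, 4, 6, 5, 4, 3]
human_commits = [2, 3, 3, 2, 4, 3, 2, 3]
d = cohens_d(agent_commits, human_commits)
```

Because d is standardized, it lets the paper compare "more commits" across repositories whose absolute commit counts differ wildly.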
What happens next
Track whether open source projects that accept AI-generated PRs see measurable increases in technical debt, test failures, or security vulnerabilities in their next annual audits, compared to projects that remain human-only.

If you insist
Read the original →