The world is being quietly rearranged by people who write very long documents.


April 6, 2026
arXiv
The title they went with
Automatic Textbook Formalization
Noisy further explains

AI formalized a graduate math textbook in one week for less than it costs to hire one expert

The task that required human experts because machines could not be trusted with precision was completed by 30,000 machines operating simultaneously on a shared codebase.

An AI system automatically converted a 500-page graduate textbook on algebraic combinatorics into formal mathematical code (Lean) in a single week, producing 130,000 lines of verified code. The cost of the computation matched what you'd pay a team of human mathematicians for the same work, suggesting that formal verification of mathematical knowledge — a task that has required expert labor for decades — may now be automatable at scale.
500 pages of graduate-level textbook formalized
130,000 lines of formal code generated
5,900 Lean declarations in formalized code
30,000 parallel AI agents used
1 week to complete formalization
Formal verification is the process of translating human mathematics into machine-checkable code so that proofs cannot contain hidden errors. Until now, this has been a bottleneck: converting even small textbooks took years and required specialists. If a single textbook can be formalized in a week at commodity cost, the constraint that has kept formal mathematics a niche activity has just evaporated. Universities, publishers, and research institutions can now treat formal verification as a routine step rather than a luxury. That changes which mathematical claims can be checked at scale, and which mathematicians become valuable: those who can work with formal systems, not those who do the conversion work.
30,000 workers finished a job in one week, for less than the salary of the one person who could have done it alone.
Who wins: Any organization that needs mathematical proofs verified but previously could not afford the labor. For them, verification has been quietly reclassified from luxury to infrastructure.
Who loses: The small community of human experts whose value rested on formal verification being too difficult and too expensive to outsource, and who were counting on that scarcity to last.
Also: Chip designers, cryptographers, and anyone whose product ships with a proof attached.
Lean: a computer programming language designed for writing and verifying mathematical proofs
Formalization: converting mathematical proofs into code a computer can check for errors
Multi-agent software engineering: many AI systems working together on shared code at the same time
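
To make those definitions concrete, here is a minimal sketch of what a formalized statement looks like in Lean 4. It is not taken from the textbook project. The theorem name `add_comm_example` is made up for illustration, and `Nat.add_comm` is a lemma from Lean's core library.

```lean
-- Statement: for all natural numbers a and b, a + b = b + a.
-- The proof cites an existing library lemma. Lean accepts the file
-- only if the proof term really establishes the stated claim.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete instance, checked by direct computation.
example : 2 + 3 = 3 + 2 := rfl
```

Each `theorem` or `example` above is a single declaration, presumably the kind of unit behind the 5,900 figure quoted earlier.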
Why this hasn't landed yet
The result requires understanding what Lean is and why formalization is hard. Those are two prerequisites most science journalists do not have, so the story gets filed under 'AI does math thing' rather than 'the economics of mathematical certainty just changed.'
What happens next
Mathematics publishers and university presses are next. Once one graduate textbook is formalized open-source in a week, the question of why the rest of the catalog has not been formalized becomes difficult to answer politely. Expect the first institutional formalization mandate within two to three years.
The catch
The benchmarking community will spend the next six months locating the specific theorems where the 30,000-agent system produced formally valid but mathematically trivial proofs, which is how every large-scale automated verification claim has been walked back since the early interactive theorem prover era.
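
As a hypothetical sketch of what "formally valid but mathematically trivial" can mean (written for illustration in Lean 4, not drawn from the project's output): the theorem below type-checks, yet its hypothesis can never hold, so it proves nothing of substance. The name `vacuous_example` is invented. `Nat.not_lt_zero` and `absurd` are from Lean's core library.

```lean
-- The hypothesis `h : n < 0` is unsatisfiable for a natural number,
-- so the conclusion `n = 42` follows vacuously.
-- Lean accepts this proof even though the statement says nothing useful.
theorem vacuous_example (n : Nat) (h : n < 0) : n = 42 :=
  absurd h (Nat.not_lt_zero n)
```

A proof checker guarantees that the proof matches the statement. It does not guarantee that the statement captures what the textbook meant, which is the gap reviewers would be looking for.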
The longer arc
Automated theorem proving has been a formal research program since the 1950s, with milestones arriving roughly once per decade. The four-color theorem machine proof in 1976 and the Flyspeck project's complete formalization of the Kepler conjecture in 2014 both drew the same reaction: impressive but narrow. This is the first result at textbook scale, which is a different category of claim.
Part of a pattern
This is part of a pattern of multi-agent AI systems crossing labor-cost thresholds in high-precision domains previously considered safe from automation, following similar cost crossovers in legal document review and clinical coding over the past eighteen months. The pattern is: the domain held until the cost comparison became impossible to ignore.

The Sendoff
Textbooks. Maybe 30,000 AI agents working together really can do anything.