The world is being quietly rearranged by people who write very long documents.


April 6, 2026
arXiv
The title they went with:
BibTeX Citation Hallucinations in Scientific Publishing Agents: Evaluation and Mitigation

Noisy translates that to:

AI-powered citation tools hallucinate papers — deterministic retrieval cuts errors in half


Large language models embedded in scientific writing tools get citation details wrong about 50% of the time, even when they can search the web. A two-stage process that first generates citations, then revises them against real bibliographic databases, cuts that failure rate to 22% — meaning most citations then check out completely.
Scientists increasingly use AI to write papers and generate reference lists. Right now, those AI systems confidently cite papers that don't exist, cite real papers with wrong volume numbers or page ranges, or cite the wrong paper entirely — mistakes that would make the paper itself unreliable or unpublishable.

This paper demonstrates the problem at scale (nearly 23,000 citation attempts) and shows a concrete fix: pipe AI outputs through a verification step against CrossRef and Zotero, the databases librarians use. The fix is mechanical, not architectural — it doesn't require smarter AI, just deterministic lookup after generation. This matters because it suggests a whole class of AI-in-publishing problems might be solvable not by training better models but by building better handoffs between AI and databases that already have the truth.
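The handoff described above can be sketched as a post-generation check: look up each generated citation in an authoritative index and, when a confident match is found, overwrite the model's fields with the record's. This is a hypothetical minimal sketch, not the paper's actual pipeline — the `Citation` type, the injectable `lookup` function (standing in for a CrossRef or Zotero query), and the title-similarity threshold are all my assumptions:

```python
import difflib
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class Citation:
    # Hypothetical record type; a real pipeline would carry authors,
    # venue, volume, pages, etc.
    title: str
    year: int
    doi: str = ""

def verify(candidate: Citation,
           lookup: Callable[[str], Optional[Citation]],
           threshold: float = 0.9) -> tuple[Citation, str]:
    """Stage 2: revise a generated citation against an authoritative record.

    `lookup` stands in for a deterministic database query (e.g. CrossRef's
    bibliographic search). Returns the citation to emit plus a status tag.
    """
    record = lookup(candidate.title)
    if record is None:
        # Nothing in the database resembles this citation: likely a
        # hallucinated paper, so flag it rather than silently keep it.
        return candidate, "unverified"
    similarity = difflib.SequenceMatcher(
        None, candidate.title.lower(), record.title.lower()).ratio()
    if similarity < threshold:
        # The best match is a different paper: flag for human review.
        return candidate, "mismatch"
    if candidate == record:
        return candidate, "ok"
    # Titles agree but details differ (wrong year, missing DOI, ...):
    # deterministically overwrite with the database's ground truth.
    return record, "corrected"
```

The key design choice is that stage 2 never trusts the model's metadata: once a title match clears the threshold, every field comes from the database, which is what makes the lookup deterministic rather than another generation step.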
What happens next
Track whether journals and preprint servers actually adopt two-stage verification pipelines in their submission systems — the tool exists, but adoption decides whether this stays academic or becomes infrastructure.

If you insist
Read the original →