Ban for authors submitting AI content ‘welcome but unenforceable’

Research integrity experts commend arXiv’s crackdown on bogus AI-written citations but warn it may be impossible to police at scale

Published on May 20, 2026
Last updated May 20, 2026

A major scientific repository’s decision to ban authors whose work contains “hallucinated” references written by generative artificial intelligence (AI) has been welcomed by research integrity campaigners despite concerns about how the policy can be properly enforced.

In a landmark move, the popular preprint platform arXiv has said it will impose an immediate one-year ban if it finds “incontrovertible evidence” that submissions contain “inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content” written by large language models (LLMs).

“If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can’t trust anything in the paper,” explained Thomas Dietterich, who chairs arXiv’s computing section, as he announced the policy on the social media platform X.

Examples of incontrovertible evidence would include “hallucinated references” and “meta-comments from the LLM”, continued Dietterich, who gave examples of a researcher failing to delete phrases such as “here is a 200 word summary; would you like me to make any changes?” or “the data in this table is illustrative, fill it in with the real numbers from your experiments”.

The one-year ban from arXiv will be “followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue,” said Dietterich, an emeritus professor at Oregon State University whose research focuses on machine learning and AI.

The new policy is widely viewed as an attempt to stem the rising number of AI-assisted submissions to the preprint server for mathematics, physics and computing, which received more than 30,000 submissions for the first time in March – more than double the 15,000 papers received in 2020 and six times the 5,000 papers received in 2015.

While some scholars have suggested that the one-year ban for a first offence is too harsh, research integrity campaigner Anna Abalkina, based at the Free University of Berlin, welcomed the “countermeasure against paper-mill submissions and low-quality manuscripts”.

“We have a lot of rules but not so many enforcement mechanisms,” said Abalkina on the decisive nature of the ban.

“I welcome this measure because it is aimed mainly at paper mills rather than legitimate authors. Anyway, authors still have a chance to appeal,” she added.

Abalkina’s own research has shown an “increase in submissions to preprint servers that were also made to increase citations,” she explained, with a recent preprint showing that many papers submitted to scientific conferences had previously been offered for sale by paper mills.

Reese Richardson, a postdoctoral research fellow at Northwestern University’s Center for Science of Science and Innovation (CSSI) who focuses on research integrity, said the policy change “undoubtedly follows” from the recent study by Zhenyue Zhao and others, including arXiv founder Paul Ginsparg, which found that nearly 150,000 “hallucinated” references were present in papers posted on four preprint servers in 2025 alone.

Although he commended the preprint server’s efforts to “disincentivise submitting AI-generated bullshit”, Richardson questioned whether these punitive measures would work given the scale of dubious submissions.

“Zhao et al. estimate that thousands of manuscripts containing hallucinated references will be posted on arXiv every year. Does arXiv plan to apply bans for all of these submissions? Issuing these bans requires arXiv staff to adjudicate each case as well as respond to appeals, which I imagine will wind up being quite onerous, even if they only prosecute a small fraction of the offending cases,” he said.

“If arXiv only selectively enforces this policy, they may save resources on adjudication and enforcement. However, if the risk of being punished is too low even if you are caught, there is little incentive against authors continuing to submit content with unverified references,” Richardson added.

There are similar concerns about how arXiv staff will crack down on submissions with falsifications, plagiarism and nonsensical content, said Richardson. “While we’d all like to see a lot less bullshit, the same concerns about scalability apply.”

The new policy, however, also raised questions about the evolving role of preprints within scientific literature, he continued, noting that “arXiv is in a tricky situation” given that its stated mission is “to share research papers and facilitate scientific discovery quickly and freely” rather than check their quality.

“Its assumed role in the information ecosystem is to host scientific works that have not undergone traditional gatekeeping procedures like formal peer review and editorial acceptance for publication,” he explained.

“However, in certain fields such as computer science or AI, arXiv is arguably displacing the role traditionally occupied by competitive outlets like journals and conferences, such that having a complete paper on arXiv can itself be a status-conferring accomplishment,” he said.

“Despite now being firmly in their fourth decade around, preprint platforms like arXiv remain an ongoing experiment in scholarly communication. As this experiment continues, how verification and gatekeeping should factor into this mission will remain the subject of considerable debate,” said Richardson.

jack.grove@timeshighereducation.com
