
From model collapse to citation collapse: risks of over-reliance on AI in the academy

The way GenAI surfaces sources for literature reviews risks exacerbating the citation Matthew effect, writes David Joyner. Here, he offers ways to prevent AI-driven search from blunting the impact of new research
15 Jan 2026



There is a phenomenon in artificial intelligence and deep learning called model collapse: the slow erosion of a generative AI system’s grounding in reality as it learns more and more from machine-generated data rather than from human-generated content.

As a result of model collapse, the AI model loses diversity in its outputs, reinforces its misconceptions, increases its confidence in its hallucinations and amplifies its biases. The analogy I usually use is a microphone held up to a speaker; it reinforces a portion of the signal until all you get is noise. But perhaps a better analogy is to groupthink: models trained on themselves become like individuals within a cult, constantly reinforcing their own beliefs until they’re so different from mainstream society that they have trouble coexisting with it.

In seemingly unrelated news, a trend has developed in academic writing over the past couple of decades related to how researchers write literature reviews. Tools such as Google Scholar have taken on an increasing role in helping researchers write about related work.

In some ways, this is beneficial; I suspect that Google Scholar has led researchers to draw on a more varied body of literature when contextualising their work. Before the internet, researchers had to scour the journals and proceedings of their own community to stay up to date, which kept fields largely insular. The rise of online search, however, lets authors easily draw on sources they may never have heard of before, directing them straight to the relevant paper rather than requiring them to read and digest decades of research first.

While I have not found any research specifically supporting that suspicion (despite rigorous searching of my own), research does point to a separate downside of the rise of online search in writing literature reviews: researchers have found that the proportion of citations to older articles has been rising in recent years. They hypothesise a “first-page results” syndrome, whereby anything listed on the first page of Google Scholar is likely to garner more attention and citations from writers looking to rapidly fill out the sources in their literature reviews. This creates a sort of citation Matthew effect, whereby already-cited papers attract still more citations regardless of their relevance.

So, why do I bring up these two seemingly unrelated phenomena?

Among all the writing tasks involved in research, GenAI appears to be disproportionately good at writing literature reviews. ChatGPT and Google Gemini both have deep research features that try to take a deep dive into the literature on a topic, returning heavily sourced and relatively accurate syntheses of the related research while largely avoiding the well-documented tendency to hallucinate sources. In some ways, it should not be too surprising that these technologies thrive here, because literature reviews are exactly the sort of thing GenAI should be good at: textual summaries that stay close to the source material.

But here is my major concern: while nothing is fundamentally wrong with the way GenAI surfaces sources for literature reviews, it risks exacerbating the citation Matthew effect that tools such as Google Scholar have caused. Modern AI models are trained largely on a snapshot of the internet circa 2022. In fact, I suspect that verifiably pre-2022 datasets will become prized sources for future models, largely untainted by AI-generated content, in much the same way that low-background steel produced before the first nuclear tests is prized for its lack of radioactive contamination. But that means these models captured a snapshot of the state of the research community in 2022, one that already reflected an increasing prioritisation of older, heavily cited sources.

This is where model collapse and “first-page results syndrome” collide into something I would dub citation collapse. Researchers are increasingly leaning on AI to support the literature reviews that contextualise new research. AI – whether in the form of deep research tools or search engine optimisation – prioritises sources that are already well cited. As a result, that research becomes more cited still, reinforcing the same feedback cycle that drives model collapse. In this case, the loop does have a filter in the form of the author, who can judge whether decades-old research is still the best source to cite. However, this relies on authors actually filtering the results, an increasingly challenging task given the explosion in the number of articles published every year and the increasing ease with which they can be found.
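
To make that feedback loop concrete, here is a small, purely illustrative sketch of my own (not drawn from any study): a toy corpus in which each new paper picks its references with probability proportional to existing citation counts, mimicking a first-page-results bias. The paper counts, citation head start and five-references-per-paper figure are arbitrary assumptions chosen only to expose the dynamic.

    import random

    random.seed(0)

    # Illustrative toy model only; all numbers are arbitrary assumptions.
    OLD_PAPERS = 20                 # "pre-2022" papers that start out well cited
    citations = [10] * OLD_PAPERS   # citation head start for the established literature
    REFS_PER_PAPER = 5              # each new paper cites five sources

    for _ in range(5_000):          # 5,000 new papers arrive over time
        # Pick references with probability proportional to citations + 1,
        # mimicking a first-page-results bias towards already-cited work.
        refs = random.choices(range(len(citations)),
                              weights=[c + 1 for c in citations],
                              k=REFS_PER_PAPER)
        for r in refs:
            citations[r] += 1
        citations.append(0)         # the new paper enters the corpus uncited

    old_share = sum(citations[:OLD_PAPERS]) / sum(citations)
    print(f"The original {OLD_PAPERS} papers are {OLD_PAPERS / len(citations):.1%} "
          f"of the corpus but hold {old_share:.0%} of all citations")

In runs of this toy model, the handful of papers that began with a citation head start ends up holding a heavily disproportionate share of all citations despite being a vanishingly small fraction of the corpus, which is exactly the dynamic the author-as-filter would need to counteract.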

The potential result is a world that builds on the research community of 2022 rather than the evolving research community of the future. We risk crystallising human knowledge in 2022 as the “gold standard” and continuing to refer to it rather than developing new theories. In the near term, this may not be particularly noticeable, but what will we think in 2030 and 2040 when most citations are from 2020 or earlier, all because those papers were already well-cited when AI began to play a prominent role in seeking related research?

Multiple stakeholders have the potential to mitigate this AI-driven crystallisation. 

First, authors must resist the temptation to lean too heavily on the readily surfaced results of AI-driven literature reviews. We must be willing to reference recent discoveries rather than falling back on comfortable, well-cited research from previous decades.

Second, the peer review process needs to be alert to AI use in literature reviews. Those of us who participate as peer reviewers must be prepared to go the proverbial extra mile and ensure that new research is contextualised within contemporary work rather than the comfortable theories of yesteryear. That is a tall order, which is why organisations that rely on peer review must confront the peer review crisis head-on and properly incentivise the work required, likely through direct compensation of reviewers. If reviewing for a respected venue offers no more benefit than a line item on a CV, the mission of peer review is unsustainable.

In the absence of these efforts, the result is clear: a future in which AI-based literature reviews reinforce the view that the research world circa 2022 is the benchmark against which all later developments should be compared. Novel insights are still possible, sure, but those insights are less likely to build on one another, given the extreme priority placed on articles that were already well cited in 2022.

David Joyner is executive director of online education and the online master of science in computer science (OMSCS) in the College of Computing, principal research associate and inaugural holder of the Zvi Galil PEACE (pervasive equitable access to computing education) chair at Georgia Tech.

