Publishing Ethics

Fabricated Citations in Medical Research: What the Lancet Audit Means for Authors

A 12-fold rise in fake references since 2023. A Lancet audit of 2.5 million papers. And a clear link to the widespread use of AI writing tools. Here is what authors publishing in any biomedical journal need to understand before their next submission.

Dr. Meng Zhao | Physician-Scientist · Founder, LabCat AI
Published: May 2026 · 17 min read · Publishing Ethics

On May 7, 2026, The Lancet published a short research letter with a number in its headline that made editors and research integrity officers take notice. One in 277 PubMed-indexed papers published in the first seven weeks of this year contained at least one fabricated reference, according to a team led by Maxim Topaz of Columbia University's School of Nursing and Data Science Institute. That single figure represents the end point of a trend line that has climbed steeply since 2023, when the same researchers found that roughly one paper in 2,828 carried a fictional citation. The change is not gradual drift. It is a 12-fold increase in two years.

The study, part of a project Topaz's group calls CITADEL, scanned PubMed Central's open access subset from January 2023 through February 2026, covering approximately 2.5 million papers and 126 million structured references. After automated verification against CrossRef and other registries, they identified 4,046 fabricated citations scattered across 2,810 published papers. The methodology was designed to distinguish genuine fabrications from mundane formatting errors, which makes the resulting count more conservative and therefore more credible than numbers produced by simpler text-matching approaches.

Why This Matters Clinically

Topaz noted in published commentary that medical professionals make treatment decisions based on clinical guidelines, and that those guidelines are built on cited evidence. A fabricated reference does not merely fail a bibliographic accuracy check. It inserts a fictional evidence base into the chain of reasoning that eventually reaches patients.

How the Columbia Team Found What Peer Reviewers Missed

Manual reference checking is not part of peer review at most journals. Reviewers evaluate methodology, interpret findings, and assess the logic of arguments. They are not expected to systematically open each citation and verify that the described paper exists, was authored by those named, and was published in that volume and issue. That gap has always existed. What changed is the speed at which it is now being exploited.

The CITADEL system queried CrossRef, which maintains metadata for more than 150 million scholarly works from over 20,000 publishers, along with other identifier registries. For each reference in the corpus, the system attempted to resolve the cited identifiers and cross-check author names, journal titles, volume and issue numbers, and publication years. A citation was flagged as fabricated only when no matching record could be found after multiple verification passes and after ruling out common transcription errors such as transposed digits in DOIs or minor spelling variants in author names.
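The letter does not include CITADEL's code, but the registry cross-check it describes can be approximated with CrossRef's public REST API. The sketch below is a simplified illustration of that logic, not the team's implementation: the function names, the choice of fields to compare, and the off-by-one tolerance on publication year are my own assumptions.

```python
# Sketch of a CITADEL-style registry check (illustrative, not the actual
# pipeline). Requires the `requests` package; error handling is minimal.
import requests

def crossref_record(doi: str) -> dict | None:
    """Fetch the registered metadata for a DOI, or None if unregistered."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.json()["message"] if resp.status_code == 200 else None

def looks_fabricated(doi: str, cited_year: int, cited_first_author: str) -> bool:
    """Flag a reference when no record exists or key fields disagree."""
    record = crossref_record(doi)
    if record is None:
        return True  # nothing registered under this DOI
    issued = record.get("issued", {}).get("date-parts", [[None]])[0][0]
    authors = record.get("author", [])
    family = authors[0].get("family", "").lower() if authors else ""
    # Tolerate an off-by-one year (print vs. online-first dates) before flagging.
    year_ok = issued is not None and abs(issued - cited_year) <= 1
    author_ok = family == cited_first_author.strip().lower()
    return not (year_ok and author_ok)
```

A real pipeline would also have to rule out transcription errors before calling a mismatch a fabrication, which is exactly the step that makes the CITADEL count conservative.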

The researchers then ran a sample of flagged citations through a secondary review, including manual inspection, to estimate their false positive rate. The approach is not perfect, and the team acknowledges that some genuine papers may not yet be registered in CrossRef, particularly older works from small publishers in some low- and middle-income country journals. But the directional signal is strong. A 12-fold increase in verified fabrications over two years is not attributable to indexing delays.

The Rate Progression and What It Tells Us About Timing

The year-by-year numbers carry more information than the headline statistic alone. In 2023, roughly one paper in 2,828 contained a fabricated reference. By 2025, the rate had reached one in 458. In the first seven weeks of 2026, it reached one in 277. The trajectory is not linear. The sharpest acceleration occurred in mid-2024, and the CITADEL team notes explicitly that this timing coincides with the period when widely available AI writing assistants moved from novelty tools used by early adopters to routine workflow additions used across research environments worldwide.

That correlation is not proof that AI tools are the sole cause. Citation fabrication by humans, whether deliberate or careless, predates large language models. Paper mills have long inserted fictional references as bibliographic filler when their ghostwritten manuscripts needed to reach a minimum citation count. But the scale of the post-2024 acceleration, and the pattern of how these citations look, points clearly to AI assistance as the primary accelerant in the most recent data.

Rate of fabricated citations in PubMed-indexed papers

  • 2023: 1 in 2,828 papers contained at least one fabricated reference
  • 2025: 1 in 458 papers (roughly 6× the 2023 rate)
  • 2026: 1 in 277 papers in the first seven weeks (12× the 2023 rate)

Source: Topaz et al., The Lancet, May 7, 2026. Based on 97.1 million verified references across 2.5 million papers.

Three Types of Hallucinated Citation (and Why They Are Hard to Spot)

Not all fabricated citations look the same. Researchers who have studied how AI models generate references describe three recurring patterns, each with different detection difficulty.

The first and most common type is the fully invented citation. The tool generates a plausible author name, a real journal title, a correctly structured DOI, and a publication year that fits, but the paper does not exist. This is called a phantom citation. The formatting is indistinguishable from a real reference, and because the journal name is genuine, the citation does not look suspicious at a glance. The DOI, if entered at doi.org, simply returns a not-found error, but few authors or reviewers check that systematically.
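Checking for a phantom is mechanically simple, which makes the rarity of systematic checking all the more striking. Below is a minimal sketch of the doi.org lookup described above, assuming the Python requests library; network failures and rate limits are ignored for brevity.

```python
# Minimal phantom-citation check: does doi.org know this DOI at all?
import requests

def doi_resolves(doi: str) -> bool:
    """Registered DOIs answer with a redirect; unregistered ones return 404."""
    resp = requests.head(f"https://doi.org/{doi}", allow_redirects=False, timeout=10)
    return resp.status_code in (301, 302, 303)

# A made-up DOI for illustration; expect False.
print(doi_resolves("10.1234/made.up.2026.001"))
```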

The second type is the chimera citation. Here, the AI combines real elements from different papers into a single fictional entry. The first author is a real researcher who does publish in the named journal. The volume and issue numbers are real. But this specific paper, with this specific title, does not exist. Chimera citations are harder to catch because the components are individually verifiable. Only a search for the exact title or DOI reveals the problem.
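Because a chimera's DOI often resolves to a real but different paper, a DOI-existence check alone is not enough. A rough second pass, sketched below against CrossRef's public API, is to compare the cited title with the title the registry actually holds for that DOI; the 0.9 similarity threshold is an arbitrary assumption, not a validated cutoff.

```python
# Sketch of a chimera check: the DOI resolves, but is it the cited paper?
from difflib import SequenceMatcher

import requests

def title_matches_doi(doi: str, cited_title: str, threshold: float = 0.9) -> bool:
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code != 200:
        return False  # phantom: nothing registered under this DOI
    registered = resp.json()["message"].get("title", [])
    if not registered:
        return False
    ratio = SequenceMatcher(None, cited_title.lower(), registered[0].lower()).ratio()
    return ratio >= threshold
```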

The third type is the corrupted citation. A real paper exists, but one or more details are wrong: the year is off by one, the page numbers are from a different article in the same issue, or a co-author's name is misspelled in a way that breaks automated lookup. These may be genuine AI errors rather than deliberate fabrication, but they still create references that cannot be reliably traced by a reader trying to verify a claim. In clinical literature, that breakdown has real consequences.

An illustrative example from the CITADEL findings

Among the papers flagged in the audit, one 2025 article on surgical anastomotic techniques published in an open access oncology journal contained 18 fabricated references out of the 30 that could be checked, a fabrication rate of 60 percent in a single bibliography. Papers with three or more fabricated citations were common in the dataset, suggesting that in many cases the problem was not incidental but systematic throughout the writing process.

Where the Problem Is Concentrated

The CITADEL data shows that fabricated citations are not evenly distributed across publishers and journal types. More than a third of all identified fabrications came from a small number of large open access publishers, specifically those operating on high-volume article processing charge models, where revenue depends on publishing at scale. The highest-rate publisher in the dataset produced fabricated citations at more than fourteen times the rate of the most selective journals in the corpus.

That concentration pattern tracks two things simultaneously: the intensity of peer review and the economic incentives operating on authors and editors. Journals that publish thousands of papers per year across broad subject areas cannot practically replicate the reference checking depth that a specialist editor at a low-volume journal might apply when reviewing manuscripts in a field they know well. The problem is structural, not simply a matter of carelessness.

This does not mean that high-impact, selective journals are immune. The CITADEL team audited PubMed Central's open access subset, which skews toward OA publishers and may underrepresent the major subscription journals. Whether fabricated citations are reaching journals like JAMA, the New England Journal of Medicine, or the Lancet at meaningful rates remains uncertain because those journals are not fully indexed in the open access corpus the team could systematically audit. What is clear is that the journals most exposed to this problem are the ones that already face the most pressure from paper mills and high-volume manuscript farming.

What Major Journals Are Doing in Response

The response from major publishers has been uneven, reflecting the difficulty of adding systematic reference verification to a submission pipeline that was never designed with that step in mind. The Science family of journals told reporters that it uses an automated reference checking tool at submission, and a spokesperson noted they had not yet encountered a fabricated citation that passed through to print. NEJM and JAMA each indicated they have validation tools in place and that authors who publish in those journals attest to responsibility for the accuracy of their citations.

Those responses are reassuring at the headline level. But the CITADEL data shows that the fabrication problem is concentrated elsewhere in the ecosystem, not in the dozen or so flagship journals with the deepest editorial resources. The real challenge is for the hundreds of PubMed-indexed journals with smaller teams and higher volume, where reference verification has historically been the author's responsibility rather than the editorial office's.

Some publishers are exploring integration with CrossRef's reference linking services, which can flag DOIs that do not resolve at submission rather than post-publication. CrossRef itself has been expanding its Cited-by and reference verification services. But uptake is inconsistent, and verification at the DOI level alone would not catch chimera citations where the DOI points to a real paper that is simply different from the one the author intended to cite.

A related signal from NEJM in May 2026

On May 1, 2026, the New England Journal of Medicine retracted a clinical image case study after the authors acknowledged using an AI tool to alter the photograph included in the submission. The authors stated they were unaware of the journal's image manipulation policy. The incident, the journal's first retraction since 2020, illustrates that AI-related integrity problems in medical publishing are not limited to text and references. The same tools that hallucinate citations can also manipulate images in ways that authors may not recognize as policy violations.

Why This Is an Author Problem, Not Just a Technology Problem

The standard response from anyone caught with a fabricated citation is that the AI tool produced it without warning, and that the author did not know the reference was invented. That explanation may be technically accurate. It is not a defense. The author's name on a paper carries the obligation to verify what the paper claims, including what the bibliography asserts exists. Editors and publishers are not yet holding authors to this responsibility consistently, but the Lancet audit makes it likely that norms will tighten.

The underlying problem is that large language models do not retrieve information from a database in the way a search engine does. When you ask an LLM to suggest references on a topic, it draws on patterns in its training data and generates plausible-looking outputs. Sometimes those outputs correspond to real papers. Sometimes they represent papers the model has seen but with details reconstructed from memory, which is imperfect. Sometimes they are entirely invented because the training data contained references that looked like what you asked for, and the model optimized for apparent plausibility rather than factual accuracy. The model has no awareness of the difference. From its perspective, the citation looks correct.

Authors who use AI to draft a literature review section and then include those references without checking each one are accepting the model's output as truth without verification. That is the core problem, and it is identical in structure to the mistake of citing a paper you remember reading but have not actually opened since graduate school. The scale at which AI tools enable this error is what makes the current situation qualitatively different from citation carelessness in earlier decades.

A Practical Reference Verification Protocol for Authors

Verification does not have to be slow. The goal is to confirm that every reference in your bibliography resolves to a real, retrievable paper before submission. The following workflow handles most cases efficiently.

For any reference that came from an AI tool or that you cannot directly link to a PDF you have read, start with the DOI. Resolve it at doi.org or through CrossRef's free search interface. If the DOI returns a not-found error, the citation is either fabricated or corrupted and must be replaced or removed. Do not assume a resolution failure is a database lag. If the paper exists and is registered, the DOI will resolve.

For references without a DOI, search the exact title in quotation marks on Google Scholar, PubMed, or Semantic Scholar. Zero results from all three simultaneously is strong evidence of a phantom citation. One result that matches the title but differs in authorship or journal from what you cited is a chimera citation. In either case, do not use the reference as originally written.

Reference verification checklist before submission

  1. Resolve every DOI at doi.org. Flag any that return errors.
  2. Search the exact title (in quotes) on PubMed or Google Scholar for any reference without a DOI.
  3. For references from AI tools, verify the first author's publication record on ORCID or Google Scholar to confirm they authored this specific paper.
  4. Cross-check volume, issue, and page numbers against the journal's own website or PMC record.
  5. For any reference you cannot verify independently, do not include it unless you have read and can produce the actual paper.
  6. Consider using tools such as Citely, RefChecker, or CrossRef's reference resolver to batch-check bibliographies from AI-assisted drafts (a minimal scripted version follows this list).
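For authors who would rather script step 6 than rely on a third-party tool, the following is a deliberately naive batch-check sketch against CrossRef's public API. It assumes a simple list of (doi, title) pairs exported from a reference manager, and it is a first-pass filter only, not a substitute for reading the papers you cite.

```python
# Naive bibliography batch check (illustrative sketch, not a product).
import requests

def check_reference(doi: str | None, title: str) -> str:
    if doi:
        resp = requests.head(f"https://doi.org/{doi}", allow_redirects=False, timeout=10)
        if resp.status_code == 404:
            return "FLAG: DOI does not resolve (possible phantom)"
    # No DOI, or DOI resolved: look for the cited title in CrossRef's index.
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 1},
        timeout=10,
    )
    items = resp.json()["message"]["items"] if resp.status_code == 200 else []
    top_title = items[0].get("title", [""])[0].lower() if items else ""
    if title.lower() not in top_title:
        return "FLAG: no close title match (verify manually)"
    return "ok"

# Hypothetical entries for illustration only.
refs = [(None, "An entirely hypothetical article title")]
for doi, title in refs:
    print(f"{title[:40]!r}: {check_reference(doi, title)}")
```

Exact-substring matching on the top hit will miss near-matches and flag harmless capitalization differences, which is the point: anything flagged goes to a human.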

If you used an AI tool to draft any part of your manuscript and that tool suggested references, treat the entire bibliography as unverified until you have done the checks above. This is more work than many authors expect, but it is the same standard that journals will eventually enforce systematically. Doing it yourself before submission is both more efficient and less professionally costly than discovering a fabricated citation after acceptance.

How This Is Likely to Change Submission Requirements

The Lancet audit will almost certainly accelerate changes that were already being discussed in editorial circles. Several publishers were already piloting automated reference checking at submission as of early 2026. After a study this visible, those pilots are likely to become standard features rather than experiments. Authors should expect submission systems to begin flagging unresolvable DOIs as submission errors rather than warnings within the next publication cycle or two.

Journals that already require authors to attest to the accuracy of their reference lists in submission checklists may begin making that attestation more specific, asking authors to confirm that they have personally verified each cited source rather than accepting a boilerplate agreement. Some journals may introduce a separate field asking whether any AI tool was used to generate or suggest references, analogous to the AI use disclosure fields that proliferated for manuscript text in 2024 and 2025.

For systematic reviews and meta-analyses, where the bibliography is the central scientific output rather than supporting context, verification requirements are likely to become more explicit. Journals that publish systematic reviews under PRISMA guidelines already require thorough search documentation. Adding reference integrity attestation to that documentation is a natural extension, and one that some editors are already requesting informally.

The Practical Takeaway for Researchers

The CITADEL findings do not mean you should stop using AI tools for research. They mean you should stop treating AI-generated content, including references, as production-ready output rather than as first-pass material that requires human verification. The distinction matters because it changes where you spend your attention during manuscript preparation.

If you are using a chatbot or AI writing assistant to help draft a literature review, build your bibliography separately from a source you control: PubMed, Scopus, Web of Science, or your reference manager with records you imported from those databases. Do not let the AI populate references directly. Use it to help you understand what the literature says and where to look, then find and verify the actual papers yourself.

If your co-authors are using AI tools to contribute sections of a paper, make reference verification part of your joint pre-submission checklist the same way you check word count or figure resolution. Corresponding authors bear responsibility for the entire submission package, including sections they did not personally write. A fabricated citation in a methods section contributed by a co-author will be attributed to the corresponding author on the published retraction notice.

The underlying point from the Topaz audit is that fabricated citations have crossed from an edge case into a measurable feature of the biomedical literature at a rate that makes them statistically likely to appear in almost any corpus of papers a reader or guideline developer uses. That is a problem for the integrity of evidence-based medicine at scale. It is also a problem that individual authors can solve completely at the manuscript level, by doing what careful researchers have always been supposed to do: read the papers they cite.

Written by Dr. Meng Zhao

Physician-Scientist · Founder, LabCat AI

MD · Former Neurosurgeon · Medical AI Researcher

Dr. Meng Zhao is a former neurosurgeon turned medical-AI researcher. After years in the operating room, he moved into applied AI for clinical workflows and now leads LabCat AI, a medical-AI company working on decision support and research tooling for clinicians. He built Journal Metrics as a free resource for researchers who need reliable journal metrics without paid database subscriptions.
