
The NIH Data Sharing Plan Overhaul: What the May 2026 Format Change Means for Researchers

On May 25, 2026, NIH retires its old narrative Data Management and Sharing Plan format and replaces it with a structured checklist. Here is what the new format contains, why NIH made the change, and how your DMS Plan connects to the data availability statement your journal will ask for at submission.

Dr. Meng Zhao · Physician-Scientist · Founder, LabCat AI
Published: May 2026 · 16 min read · Publishing Guide

Most NIH-funded researchers know, at least in the abstract, that the agency has required a Data Management and Sharing Plan since January 2023. Fewer have internalized what that plan actually needs to contain, and fewer still have thought carefully about how the plan they wrote for their grant application connects to the data availability statement their target journal will ask for at manuscript submission. A format change taking effect on May 25, 2026, is a good reason to think through all of it at once.

The change itself is administrative rather than a policy expansion. NIH is not asking more of researchers in terms of what data gets shared. What it is doing is replacing the old freeform two-page narrative with a structured document built around yes-or-no questions and a short table. The goal, according to NIH's own notice (NOT-OD-26-046), is to reduce confusion and the persistent problem of DMS Plans that ran too long, included extraneous detail, and still failed to answer the questions NIH reviewers were actually asking. After evaluating over 1,100 plans submitted since the policy launched in 2023, NIH concluded the old narrative format was producing more noise than signal.

Key Date

May 25, 2026: The old NIH DMS Plan templates are deprecated and removed. All applications with due dates on or after that date must use the new 2026 Pilot format. Applications already submitted before May 25 are not affected, but any new submission after that date requires the restructured document.

What the Old Format Required and Why It Caused Problems

The original 2023 DMS Plan format asked applicants to address several elements in continuous narrative prose: what scientific data would be generated, how that data would be made findable and accessible, which standards and formats would be used, what repository or repositories were planned, how long data would be preserved, and whether access controls were needed for human subject data. Each element made sense individually. In practice, many researchers responded by padding their plans with repository descriptions copied from institutional templates, vague statements about future intentions, and methodological detail that belonged in the research plan rather than the data plan. Plans frequently exceeded the informal two-page norm without actually clarifying what data would be deposited, when, or where.

NIH program officers reviewing those plans found that even a well-intentioned reviewer could not quickly answer the four questions that actually mattered: Will this researcher share the data? When? In what repository? And if human subject data is involved, what protections are in place? Burying those answers in prose forced reviewers to read carefully for information that should have been immediately apparent. The 2026 format is an attempt to fix that by separating the signal from the surrounding explanation.

What the New 2026 Format Looks Like

The new format, published as a Pilot in April 2026 and required from May 25 onward, structures the DMS Plan around a small number of direct questions and a single repository table. The intent is that a reviewer should be able to assess compliance at a glance, with any explanations confined to short text fields that appear only when a "No" answer requires justification.

Core elements of the 2026 NIH DMS Plan format

  1. Will the scientific data underlying peer-reviewed publications and other findings be shared? (YES/NO, with explanation required if NO)
  2. Will sharing occur within the timelines required by NIH policy? (YES/NO, with explanation required if NO)
  3. Will shared data remain available for as long as repository or journal policies require? (YES/NO, with explanation required if NO)
  4. If human participant data will be shared, will privacy, rights, and confidentiality be protected as specified in NIH notice NOT-OD-22-213, including whether access controls are needed? (YES/NO, plus description of controls if applicable)
  5. A data types and repositories table (100-word maximum) listing the key scientific data expected to be generated, the species and modality if known, and the repository where the data may be managed and shared.
  6. For projects generating large-scale human genomic data: will the Genomic Data Sharing Policy's Institutional Certification expectations be met? (YES/NO/Not Applicable)

Notice that the plan no longer asks for an extended description of repository metadata standards, file formats, or preservation infrastructure. That information has not become unimportant, but NIH concluded it was generating length without improving accountability. If a reviewer wants to know what format a researcher plans to use for neuroimaging data, that question can be answered in the data types table, not in three paragraphs about BIDS compliance and NIfTI conventions.

For most straightforward biomedical research, the new plan will genuinely be shorter. A project generating a single type of human clinical data and depositing it in a controlled-access repository like the database of Genotypes and Phenotypes (dbGaP) can answer questions 1 through 4 with four "yes" answers, fill in one row of the table, and answer "not applicable" to the genomics question. The whole document might run half a page. For projects generating multiple heterogeneous data types, the table will need proportionally more rows, and some questions may require short explanations. But the era of three-page DMS Plans that still left reviewers guessing should be over.
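To make that concrete, here is a purely illustrative sketch of how the dbGaP example might be filled in; the exact field labels on the final NIH form may differ.

    Q1 Data shared: YES
    Q2 Within NIH timelines: YES
    Q3 Retained per repository/journal policy: YES
    Q4 Participant protections: YES (controlled access via the dbGaP data access request process)
    Table: Deidentified human clinical data | dbGaP (controlled access)
    Q6 GDS Institutional Certification: Not Applicable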

Choosing the Right Repository: Domain-Specific vs. General-Purpose

The repository table is where most researchers will spend the most thought. NIH strongly prefers domain-specific repositories where they exist, because they provide field-appropriate metadata standards, better discoverability, and more durable preservation than generalist options. The logic is straightforward: a dataset deposited in a repository that understands its format and can validate its structure is more likely to be reused than one sitting as a zip file on a general platform.

For genomic and genetic data, the dominant NIH-designated repositories are the Sequence Read Archive (SRA) for raw sequencing reads, the Gene Expression Omnibus (GEO) for transcriptomic and epigenomic data, and dbGaP for studies involving human subjects where access controls are needed. Clinical trial data from NIH-funded interventional studies is registered, and has summary results reported, at ClinicalTrials.gov, while deidentified patient-level datasets go to domain-specific platforms such as the National Heart, Lung, and Blood Institute's BioLINCC.

Neuroimaging data has an established infrastructure in OpenNeuro, which accepts BIDS-formatted datasets and appears on NIH's list of supported data repositories. Behavioral and cognitive research data often goes to the Open Science Framework (OSF) or a field-specific repository. When no field-specific repository is appropriate, generalist repositories like Zenodo, Dryad, or Figshare are acceptable, though NIH asks that researchers document why no domain-specific option was suitable. Zenodo is maintained by CERN and is widely accepted; Dryad is particularly common in the life sciences and has a curation step that helps with metadata quality.
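For orientation, a minimal BIDS-style layout of the kind OpenNeuro validates looks roughly like the following. This is a simplified sketch; real datasets carry additional required metadata, and the BIDS specification is the authority on naming.

    dataset_description.json (required dataset-level metadata)
    participants.tsv (one row per participant)
    sub-01/
      anat/sub-01_T1w.nii.gz (structural scan, named per BIDS convention)
      func/sub-01_task-rest_bold.nii.gz (functional run)
      func/sub-01_task-rest_bold.json (sidecar with acquisition parameters)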

Repository selection: questions to answer before you write the table

  • Does NIH have a designated repository for this data type? (Check the NIH-supported data sharing resources page.)
  • Does your institution have a data repository that integrates with NIH systems or provides DOIs?
  • Does your data involve human subjects and require access controls? If so, open repositories like Zenodo are not appropriate.
  • Does your target journal have a preferred or required repository? Some journals will not accept "available upon request" and specify platforms.
  • Is the repository FAIR-compliant? NIH expects repositories to support Findable, Accessible, Interoperable, and Reusable standards.

One underappreciated constraint is that the repository you name in your DMS Plan should match or plausibly align with what you write in your journal's data availability statement at submission. If your DMS Plan committed to depositing neuroimaging data in OpenNeuro, writing "data available upon reasonable request" in your manuscript is not consistent. Reviewers and program officers can and do check for coherence between stated plans and actual publication behavior.

The NIH Public Access Policy Change That Already Took Effect

The DMS Plan format change is the more immediate news, but it is the second significant NIH data policy change in less than a year. On July 1, 2025, NIH removed the 12-month embargo that had previously allowed NIH-funded authors to delay public access to their accepted manuscripts. Under the revised NIH Public Access Policy, any author accepted manuscript (AAM) from NIH-funded research accepted for publication on or after July 1, 2025, must be submitted to PubMed Central (PMC) at acceptance and made publicly available at the time of official publication, rather than after a 12-month delay.

This matters for manuscript preparation because it changes how authors need to navigate journal publication agreements. Many journals have standard author agreements that grant the publisher exclusive rights during the embargo period, and those agreements now conflict with NIH policy for funded researchers. NIH-funded authors need to choose journals that have signed a public access compliance agreement with NIH (which allows immediate PMC deposit as part of the publication workflow), rely on a journal's CHORUS participation, or explicitly invoke their right as a federally funded author to retain sufficient rights for PMC deposit. Some journals make this straightforward. Others require a rights addendum or direct negotiation with the editorial office.

If you are not sure whether your target journal complies with the current NIH Public Access requirements, the most reliable way to check is to consult PMC's journal participation page or ask your institution's library. Do not assume that a well-known journal automatically handles PMC deposit for you. Some flagship journals do it seamlessly. Others have not updated their workflows and may require you to handle the deposit yourself after acceptance.

How Your DMS Plan Connects to Your Journal's Data Availability Statement

The DMS Plan and the journal data availability statement are not the same document and serve different audiences. The DMS Plan is a commitment made to NIH at the time of grant application. The data availability statement is a short declaration that appears in your published manuscript and tells readers where your data is, what they can access, and how. But they should be consistent, because both are part of the same chain of accountability running from the research funder to the public reader.

The ICMJE updated its recommendations in January 2026 to reinforce this link. The updated text underscores that authors must have adequate access to the underlying data to take responsibility for the accuracy and integrity of the work, and it tightens the requirement for clinical trial data sharing statements to be specific about what data will be available, to whom, and when. A statement that says "the authors confirm that the data supporting the findings of this study are available within the paper and its supplementary materials" was already inadequate for many journals. After the 2026 ICMJE update, journals aligned with those recommendations are under more pressure to require meaningful specificity.

What a credible data availability statement looks like in 2026

The deidentified participant-level dataset supporting the main findings of this study has been deposited in the NIMH Data Archive under accession number NDAR-XXXXX and is available to qualified investigators upon submission of a data access request. Statistical analysis code is available at https://github.com/[author]/[project] under an MIT license. Aggregate summary statistics underlying Figures 2 through 5 are provided in Supplementary Tables 1 through 4.

Notice what this statement does: it names the specific repository and accession number, describes access conditions precisely ("qualified investigators" via access request), provides the code location separately, and explains what is directly available in the paper itself.

The gap between what authors say they will share and what actually happens at publication has been well documented. Studies examining ICMJE-aligned clinical trial publications found that a substantial fraction of papers that declared an intent to share data never resulted in accessible datasets. In several audits, fewer than one percent of papers that had committed to sharing individual participant data had actually made that data retrievable at the time of review. Journals are aware of this gap. Several, including BMJ, PLOS Medicine, and Nature Medicine, have moved toward requiring evidence of actual data deposition rather than a forward-looking statement of intent.

What Different Major Publishers Now Require

Journal data sharing requirements are not uniform, and they have been tightening at different rates across publishers. Understanding what your target journal actually mandates before you finalize your data plans is more practical than assuming a general norm applies everywhere.

Nature Portfolio journals and Cell Press journals now require data availability statements for all original research as a default. Both publisher groups have lists of approved repositories and, for certain data types like crystallography, genomics, and proteomics, mandatory deposition in specified databases is a precondition for acceptance rather than a post-publication expectation. PLOS journals have a similar approach: all research articles must include a data availability statement, and reviewers are expected to assess whether the data described in the statement is actually accessible, not just promised.

BMJ has been pushing toward mandatory data sharing for clinical trials for several years and has been explicit that "available upon reasonable request" does not constitute a satisfactory data availability statement for randomized controlled trials. The reasoning is that "upon request" language gives the appearance of openness while making actual access dependent on author discretion and responsiveness, neither of which is auditable or reliably enforced.

JAMA and its family of specialty journals require data sharing statements for all clinical trials and have mechanisms for requesting data sharing as part of peer review. The New England Journal of Medicine maintains a data sharing requirement for clinical trials and was one of the first major medical journals to formalize this through the ICMJE process when the requirement was first introduced in 2018. What has changed in the intervening years is the specificity of what counts as compliance and the willingness of journals to enforce it rather than simply note its absence.

Elsevier and Springer Nature have both published general data sharing frameworks that apply across their journal portfolios, with individual journals allowed to add stricter requirements on top of the baseline. The group-level policy sets a floor: a Springer Nature journal such as Nature Medicine, or an Elsevier title such as The Lancet, may require considerably more. Check the journal-specific author instructions rather than the publisher-level policy alone.

What to Do If Your Data Cannot Be Fully Shared

The assumption that all research data can simply be deposited in a public repository reflects how the mandate looks in policy documents, not how it works in practice for a large proportion of clinical research. Data involving human participants with rare conditions, small geographic populations, sensitive behavioral measures, commercial sponsor ownership restrictions, or materials governed by site-specific consent agreements often cannot be shared openly without meaningful privacy or contractual risk.

NIH policy has always accommodated this. The DMS Policy does not require that all data be publicly accessible; it requires a plan that accounts for what can and cannot be shared and explains the constraints for anything that cannot. The new 2026 format handles this through the "No" answer pathway: if data cannot be shared within required timelines or at all, the researcher provides a brief explanation. Legitimate reasons include participant consent limitations, proprietary data from commercial sponsors, legal restrictions under a data use agreement, and risk of deductive identification in small or rare populations.

The journal side is similar in principle, though different in application. A data availability statement that says "data cannot be shared due to ethical restrictions imposed by [institution] IRB under protocol [number], as individual patient data from this rare disease cohort would present re-identification risks even after deidentification" is a legitimate, specific, and reviewable statement. It is not the same as a vague "available upon reasonable request" that leaves access to author discretion. If your data genuinely cannot be shared, explain why with enough specificity that a reader can assess the constraint rather than simply accept your conclusion.

Acceptable reasons to limit data sharing under NIH policy

  • Informed consent obtained from participants did not include authorization for broader data sharing, and re-consent is not feasible.
  • Data contains information that could identify individuals, particularly in rare disease populations where deidentification provides insufficient protection.
  • Data is subject to a data use agreement or licensing restriction held by a commercial sponsor that limits redistribution.
  • National security, export control, or other regulatory restriction prevents deposition in publicly accessible repositories.
  • The data is not scientific data as defined by NIH (administrative, financial, or performance data used to manage the grant).

Compliance Consequences and What NIH Actually Enforces

Three years into the DMS Policy, the practical enforcement picture is clearer than it was at launch. NIH has not been conducting systematic post-publication audits to verify that every funded researcher deposited their data. What has happened is that progress reports and renewal applications increasingly ask investigators to demonstrate that data sharing commitments were met. Failure to document compliance is noted in progress reviews, and for competitive renewals, a track record of non-compliance can factor into funding decisions at the institute level.

The more immediate enforcement mechanism is at the journal end. Journals that require data availability statements and are serious about them will now often send a manuscript back to authors before peer review if the data availability statement is missing, vague, or clearly in conflict with the content of the methods section. For clinical trials, several journals now require evidence of data deposition before they will initiate review or before they will issue a final acceptance. That is a meaningful change from the earlier era when a promise of future sharing was treated as equivalent to actual sharing.

There is also a reputational dimension. Post-publication data requests have increased substantially, and researchers who declared they would share data but cannot produce it when asked are creating documented problems for themselves. The infrastructure around data sharing integrity, including watchdog commentary and systematic audits by research groups studying reproducibility, has grown to the point where failed data-sharing commitments are more visible than they were five years ago.

A Pre-Submission Checklist for the New Landscape

The most practical thing a researcher writing a manuscript in mid-2026 can do is align three documents before submission: the DMS Plan from the grant application, the actual state of data deposition, and the data availability statement being written for the manuscript. Those three should tell a consistent story.

Before you submit: data alignment questions

  • Does your manuscript's data availability statement name the same repository you committed to in your DMS Plan?
  • Has the data actually been deposited, or are you describing a future intent? (Many journals now require completed deposition.)
  • Does your data availability statement include the accession number or DOI of the deposited dataset? (A quick way to check that identifiers resolve is sketched after this list.)
  • For human subject data: does the statement describe the access mechanism clearly (open access, controlled access via application, available via data use agreement)?
  • Does the analysis code used to generate the reported results have a public location with a stable identifier?
  • If your NIH grant expires or your institution changes, will the deposited data remain accessible? (This is what the persistence question in the new DMS format is asking.)
  • Have you checked the target journal's current data sharing requirements, not just your publisher's general policy?
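One way to make the accession-number check mechanical rather than hopeful is to confirm, before submission, that every identifier in your draft statement actually resolves. A minimal sketch in Python, with placeholder DOIs and assuming the third-party requests library is installed:

    # Check that every DOI cited in a draft data availability statement
    # resolves before the manuscript is submitted.
    # Placeholder DOIs below; substitute the ones from your own statement.
    import requests

    dois = [
        "10.5281/zenodo.1234567",              # hypothetical dataset DOI
        "10.18112/openneuro.ds000001.v1.0.0",  # hypothetical OpenNeuro DOI
    ]

    for doi in dois:
        # doi.org redirects a valid DOI to its landing page; a 404 means
        # the identifier will not resolve for your readers either.
        resp = requests.head(f"https://doi.org/{doi}", allow_redirects=True, timeout=10)
        status = "resolves" if resp.ok else f"FAILED (HTTP {resp.status_code})"
        print(f"{doi}: {status}")

The same check works for repository accessions that have DOI forms; for accessions without DOIs, most repositories expose a landing URL you can test the same way.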

One common mistake is treating the data availability statement as a formality to be completed in the last thirty minutes before submission. It is worth drafting it early, at the same time as the methods section, when you still have clear visibility into what data exists, in what form, and what its sharing constraints are. Trying to reconstruct that picture under deadline pressure, after co-authors have already signed off on the manuscript, is when vague or misleading statements tend to get written.

What Comes Next

The May 25 format change is the most immediate item, but the broader direction of travel is toward tighter integration between grant commitments, publication records, and actual data deposits. Several initiatives are working toward machine-readable data availability statements that would allow automated checking against repository records at the time of submission, similar to how some journals now automatically verify clinical trial registration numbers. That infrastructure does not exist yet at scale, but the ecosystem is moving toward it.
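No shared schema for such statements exists yet, so any concrete example is speculative, but a machine-readable statement would presumably pair each dataset with its repository, identifier, and access condition. A hypothetical sketch, expressed as a Python dictionary with illustrative field names drawn from the example statement earlier in this article:

    # Hypothetical machine-readable data availability statement.
    # No standard schema exists yet; every field name here is illustrative.
    import json

    statement = {
        "datasets": [{
            "description": "Deidentified participant-level dataset",
            "repository": "NIMH Data Archive",
            "identifier": "NDAR-XXXXX",  # accession from the example statement above
            "access": "controlled",      # e.g. open, controlled, or restricted
            "mechanism": "data access request by qualified investigators",
        }],
        "code": {
            "location": "https://github.com/[author]/[project]",
            "license": "MIT",
        },
    }

    # A submission system could validate records like this against a
    # repository's API, much as trial registration numbers are checked today.
    print(json.dumps(statement, indent=2))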

NIH's ongoing evaluation of DMS Plan quality will also likely produce further guidance about common problems. The 2026 format change was itself driven by observing what researchers were writing under the old format. As NIH accumulates more data on what the simplified format produces and where new confusion emerges, expect further refinements. If your institution has a research data management librarian or a sponsored programs office that tracks these changes, their guidance will be more current than any single document can be.

For researchers submitting manuscripts right now, the practical implication is this: do not treat your DMS Plan as a grant formality that ends at submission. It is a commitment that follows your research through publication and beyond. The journal's data availability statement is where that commitment becomes visible to the reader. The simplest way to write that statement well is to have already done what it describes.


Written by Dr. Meng Zhao

Physician-Scientist · Founder, LabCat AI

MD · Former Neurosurgeon · Medical AI Researcher

Dr. Meng Zhao is a former neurosurgeon turned medical-AI researcher. After years in the operating room, he moved into applied AI for clinical workflows and now leads LabCat AI, a medical-AI company working on decision support and research tooling for clinicians. He built Journal Metrics as a free resource for researchers who need reliable journal metrics without paid database subscriptions.
