Use Case 7: National death registry for outcomes research

From Demand-Driven Open Data for HHS


Use case summary


  • PCORnet-funded outcomes research often relies on NDI (National Death Index) for mortality events not captured within the medical system
  • Example: "Aspirin trial": Optimal maintenance dose of aspirin for secondary prevention of coronary artery disease (CAD)
  • Example: Comparative Effectiveness of Liver Transplant Strategies for End-Stage Liver Disease Patients on Renal Replacement Therapy


  • Value to customer: Complete data is necessary to conduct the research. Mortality events outside of a healthcare setting don't always get captured.
  • Value to public: Useful for many comparative effectiveness research studies, regardless of whether they are funded by PCORI


  • Northwestern Transplant Outcomes Research Collaborative (NUTORC)
  • Others involved in conversation
    • Abel Kho at Northwestern University
    • Lesley Curtis at Duke University
    • Maryan Zirkle at PCORI
    • Christine Dymek at HHS/ASPE
    • Delton Atkinson, Director at NCHS Division of Vital Statistics
    • Paula Braun, Entrepreneur-in-Residence at CDC
  • Patient record matching hackathon held at HIMSS, Aug 2015

Current data and limitation

  • Data source: NDI is a program of NCHS (National Center for Health Statistics), which is part of CDC
  • Challenge:
    • Long lag time to update, infrequent refresh rate
    • A previous project -- FDA Mini-Sentinel -- has found obtaining NDI data to be a lengthy and expensive process.


  • Frequency: Data should be loaded at 6 month intervals
  • Latency: 60-90 day lag maximum
  • Linking: Need ability to link to EHR records on name, birthdate, gender
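
As a rough illustration of the linking requirement, the sketch below matches EHR records to death records on an exact key built from name, birthdate, and gender. It is a minimal deterministic approach under stated assumptions, not the matching method NCHS uses. The record layout (`id`, `first`, `last`, `birthdate`, `gender`) and all function names are hypothetical.

```python
import unicodedata
from datetime import date


def normalize_name(name: str) -> str:
    """Uppercase, strip accents, and drop everything except letters."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if ch.isalpha()).upper()


def match_key(first: str, last: str, birthdate: date, gender: str) -> tuple:
    """Build an exact-match key from name, birthdate, and gender."""
    return (
        normalize_name(first),
        normalize_name(last),
        birthdate.isoformat(),
        gender.strip().upper()[:1],  # "male" and "M" both become "M"
    )


def link_records(ehr_records, death_records):
    """Return (ehr_id, death_id) pairs whose match keys are identical."""
    deaths_by_key = {}
    for rec in death_records:
        key = match_key(rec["first"], rec["last"], rec["birthdate"], rec["gender"])
        deaths_by_key.setdefault(key, []).append(rec["id"])

    links = []
    for rec in ehr_records:
        key = match_key(rec["first"], rec["last"], rec["birthdate"], rec["gender"])
        for death_id in deaths_by_key.get(key, []):
            links.append((rec["id"], death_id))
    return links
```

Exact keys miss misspellings and nicknames and can produce false matches for common names, so a real linkage would likely need additional identifiers or probabilistic scoring.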

For the purpose of outcomes research

  1. Completeness: Partial data vs. complete data
    • For investigations where death is the primary outcome, complete data is typically important, so researchers need to know the timeframe for receiving 100% of the data.
    • Partial data are still useful for:
      • Studying the drivers for that outcome, despite being underreported.
      • Investigations where death is a secondary outcome, as long as the approximate percent of data collected is known.
  2. Accuracy: Cause of death
    • Obtaining cause of death is less important for most PCORnet or CER (comparative effectiveness research) studies, because it cannot be relied upon given the variation in reporting methods and the subjectivity involved.


Issue #1: Access to the data

There are two options for identifying death events at record level: (1) NDI record matching by NCHS and (2) direct access to source dataset

  • Option #1: NDI record matching by NCHS: Once approved by the IRB (Institutional Review Board for the Protection of Human Subjects), researchers can send in their datasets to NCHS for linking. (The IRB is made up of states and subject matter experts.) There are 2 versions of NDI match files available throughout the year:
    • Early release file: Available mid-March: 90% of the deaths for prior calendar year, but often cause of death is missing
    • Final file: Available October/November: 100% of the deaths for prior calendar year, with updated cause of death

  • Option #2: Access to source dataset: In order to get direct access to the data, researchers need to negotiate a Data Use Agreement (DUA) with each of the relevant jurisdictions. This is not a practical option unless research is limited to a few jurisdictions.
    • There are 57 jurisdictions for vital statistics: 50 states, 5 territories (Puerto Rico, U.S. Virgin Islands, Guam, American Samoa, and the Northern Mariana Islands), District of Columbia, New York City.
    • Restrictions on releasing data are due to State laws, to which CDC must adhere. States own the data, and CDC purchases it for the purpose of aggregating it to the national level.

Issue #2: Lag time

There are a couple of ways to mitigate the lag time challenge.

  • Make 2 requests / year: The most obvious approach is to time the NDI matching request with the Early Release (mid-March) and Final Release (October/November).
    • You'll typically get results back within 5-10 business days, along with the percent of records received by jurisdiction. For a first-time request, add about 2 weeks for the Advisory Committee to approve your request.
    • For example, for the 2014 calendar year, the Early Release File became available mid-March 2015 and the Final Release File will be available October/November 2015. (Source: Delton 3/3/2015)
    • Although jurisdictions are contractually required to have their data in by May 15, that often doesn't happen. So if completeness is important, it's better to target the Final Release File later in the year.
    • Until recently, lag times for 100% reporting used to be as long as 3-4 years. Now, the Final Release File is available in less than 12 months.
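
The release timing described above can be sketched as a small helper that reports which NDI match file a request could draw on for a given calendar year of deaths. This is only an illustration: the source gives "mid-March" and "October/November" rather than exact dates, so March 15 and November 1 are assumed placeholders, and the function name is hypothetical.

```python
from datetime import date

# Assumed placeholder dates; the source only says "mid-March" and
# "October/November" of the year following the deaths.
EARLY_RELEASE = (3, 15)
FINAL_RELEASE = (11, 1)


def available_release(death_year: int, today: date) -> str:
    """Return 'final', 'early', or 'none' for deaths in `death_year`."""
    early = date(death_year + 1, *EARLY_RELEASE)
    final = date(death_year + 1, *FINAL_RELEASE)
    if today >= final:
        return "final"   # all deaths, with updated cause of death
    if today >= early:
        return "early"   # most deaths, cause of death often missing
    return "none"        # neither file has been released yet


# Example: deaths from calendar year 2014, checked on April 1, 2015
print(available_release(2014, date(2015, 4, 1)))
```

A study that needs complete data would wait until the helper returns "final" before submitting its request.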

  • More than 2 requests per year: Depending on the tradeoff between reporting latency and completeness, researchers could consider requesting data more frequently.
    • 30% of death records are in NDI within 10 days of death.
      • But the percent reported within 10 days varies significantly by jurisdiction and the sophistication of its reporting capabilities. Therefore, adjustments need to be made to the data to accommodate the bias inherent in early reporting.
    • Consider the budgetary impact of this approach, since each request incurs a cost.
    • Consider the overhead associated with reconciling changes that happen throughout the year until the file is officially "closed" for the year.
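
One simple way to think about the adjustment for early-reporting bias is to scale each jurisdiction's observed deaths by the fraction of records that jurisdiction has reported so far. The sketch below assumes the unreported deaths resemble the reported ones within a jurisdiction, which may not hold in practice. The function name and input layout are hypothetical.

```python
def adjust_for_completeness(observed: dict, completeness: dict) -> dict:
    """Estimate total deaths per jurisdiction from partially reported counts.

    observed:     jurisdiction -> deaths matched so far
    completeness: jurisdiction -> fraction of records received (0 < f <= 1)
    """
    adjusted = {}
    for jurisdiction, count in observed.items():
        fraction = completeness.get(jurisdiction)
        if fraction is None or not 0 < fraction <= 1:
            raise ValueError(f"no usable completeness for {jurisdiction!r}")
        # Inverse weighting: assumes missing records are like reported ones.
        adjusted[jurisdiction] = count / fraction
    return adjusted
```

Jurisdictions with very low completeness yield unstable estimates, so a study may prefer to exclude them or wait for a later file.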

Workaround for limited geographies

There's a potential workaround to minimize lag time when only data from limited geographies (such as state, city or county) are needed. In such cases, it might make sense to try working directly with relevant jurisdictions.

  • Review the table indicating NDI Early Release completion percent by state. For jurisdictions with low percent received, if reduction in reporting lag time is important, consider contacting the jurisdiction directly. In such cases, researchers need to negotiate a Data Use Agreement (DUA) with each of the relevant jurisdictions.

Issue #3: Cost reduction

There are several strategies that could be taken to reduce the cost of NDI record matching:

  • Consider consolidating the lists of trial participants across multiple researchers and submitting them as a single request
  • If cause of death isn't important or considered unreliable, exclude it from the NDI request
  • Lillian Ingster at NCHS has expertise in cost reduction strategies
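
The consolidation idea can be sketched as merging several studies' participant lists into one de-duplicated submission while remembering which studies each person belongs to, so results can be routed back afterwards. This assumes cost depends on the number of records submitted, which the source does not spell out. The field names and function name are hypothetical.

```python
def consolidate_requests(study_lists: dict):
    """Merge participant lists from several studies into one submission.

    study_lists: study name -> list of dicts with 'first', 'last',
                 'birthdate' (ISO string), and 'gender'.
    Returns (submission, studies_by_person).
    """
    studies_by_person = {}
    for study, participants in study_lists.items():
        for person in participants:
            key = (
                person["last"].strip().upper(),
                person["first"].strip().upper(),
                person["birthdate"],
                person["gender"].strip().upper(),
            )
            # A person enrolled in two studies is submitted only once.
            studies_by_person.setdefault(key, set()).add(study)
    submission = sorted(studies_by_person)
    return submission, studies_by_person
```

After the match results come back, `studies_by_person` tells the coordinating team which studies should receive each result.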

Long term implementation


NCHS reporting data flow
  • Many jurisdictions are in the process of modernizing their mortality data management
    • CDC/NCHS is working with jurisdictions to provide assistance, standardization and funding for this modernization
    • The table indicating NDI Early Release completion percent by state could be used as a proxy of reporting capabilities. But it should be noted that this report doesn't indicate reporting availability and accuracy on cause of death.
    • At this time, there are no published target release dates for modernization by jurisdiction
    • NAPHSIS (National Association for Public Health Statistics and Information Systems) published a paper, "Strategies for Improving the Timeliness of Vital Statistics". It includes the data flow, factors that cause delays, and short-term and long-term strategies for addressing the challenges.


  • There are currently ongoing experiments to find algorithms that enable patient-level record matching between clinical systems (like EHRs and HIEs) and non-clinical datasets (like a state's vital statistics registries, death registries, and even obituaries). Assuming you're interested in data from the states or regions where this is done, it's possible to get more up-to-date and accurate datasets.
    • For example, here's an exercise from a hackathon focused on patient record matching, held at the HIMSS Innovation Center in Cleveland, Ohio in August 2015:

Develop applications that allow EHRs to easily update the status of patients who are deceased. A synthetic centralized mortality database, such as the National Death Index or a state’s vital statistics registry, will be made available through a FHIR interface. External data sources, such as EHRs, will be matched against this repository to flag decedents. The applications should be tailored to deliver data to decision makers. This scenario will focus on how different use cases drive different requirements for matching.
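
To make the hackathon scenario concrete, the sketch below flags decedents among EHR patients by comparing them with patients from a mortality repository, with both sides represented as FHIR Patient resources already parsed into Python dicts. It uses the standard Patient fields `name`, `birthDate`, `gender`, `deceasedBoolean`, and `deceasedDateTime`. The exact-match key and the function names are assumptions for illustration, and fetching resources from a FHIR server is left out.

```python
def patient_key(patient: dict) -> tuple:
    """Build a simple match key from a FHIR Patient resource (as a dict)."""
    name = (patient.get("name") or [{}])[0]
    given = (name.get("given") or [""])[0]
    return (
        name.get("family", "").upper(),
        given.upper(),
        patient.get("birthDate", ""),
        patient.get("gender", ""),
    )


def flag_decedents(ehr_patients, registry_patients):
    """Return copies of EHR patients that match a deceased registry patient."""
    deceased = {}
    for p in registry_patients:
        if p.get("deceasedBoolean") or p.get("deceasedDateTime"):
            deceased[patient_key(p)] = p.get("deceasedDateTime")

    flagged = []
    for p in ehr_patients:
        key = patient_key(p)
        if key not in deceased:
            continue
        updated = dict(p)
        # deceased[x] is a choice type, so keep only one of the two fields.
        updated.pop("deceasedBoolean", None)
        updated.pop("deceasedDateTime", None)
        if deceased[key]:
            updated["deceasedDateTime"] = deceased[key]
        else:
            updated["deceasedBoolean"] = True
        flagged.append(updated)
    return flagged
```

Different use cases would tune the key: a clinical workflow might demand stricter matching than a population-level study that can tolerate some false matches.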


DDOD would like to acknowledge contributors to this use case: