Use Case 29: Link Open Payments dataset to NPI

From Demand-Driven Open Data for HHS

Use Case

Use case summary


  • Open Payments records use Physician IDs and Physician Profile IDs. These appear to be sequential integers unrelated to the NPIs used in other datasets.
  • Need: NPIs should be provided so that records can be matched across other datasets.
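One way to tell an NPI apart from an internal sequential ID is its check digit: an NPI is a 10-digit number whose last digit is a Luhn check digit computed over the number prefixed with "80840". The following is a minimal sketch of that check; the function name `is_valid_npi` is illustrative only and is not part of any Open Payments or NPPES tooling.

```python
def is_valid_npi(value: str) -> bool:
    """Return True if value is a 10-digit string that passes the NPI Luhn check."""
    if len(value) != 10 or not (value.isascii() and value.isdigit()):
        return False
    # The check digit is computed over the NPI prefixed with "80840".
    digits = [int(ch) for ch in "80840" + value]
    total = 0
    # Walk right to left, doubling every second digit (Luhn algorithm).
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A passing check only shows that a value is well formed; it does not show that the NPI has been assigned to a provider.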


  • Value to customer:
  • Value to industry/public: Provider cost/quality analytics by provider across datasets


Current data and limitation

  • Data source #1: Open Payments Data:
    • See Data model section "Physician Profile Supplemental File"
    • Downloads are available for the same fields:
    • Challenge:
      • Missing NPI. (It looks like the data uses an abstract ID from a Ruby on Rails app.) Matching on name, location, and specialty taxonomy may be hard: NPPES entries are often outdated, so name and location may have changed, and specialty taxonomy has not been consistently specified.
      • It's not clear where the data is sourced from. For example, does it come from NPPES, PECOS, etc.? Knowing the source could help indicate how to retrieve the NPI.

  • Data source #2: Medicare Individual Provider List
    • Challenge:
      • On one hand, this data comes from PECOS and should be more up to date than NPPES data.
      • On the other hand, only a few fields are provided. It is missing NPI and much other useful information.


  • Fields: NPI
  • Update frequency: Monthly
  • Joins between datasets: NPI
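To illustrate why NPI is the requested join key, here is a minimal sketch of combining payment records with another NPI-keyed dataset. All field names (`npi`, `amount_usd`, `specialty`) and all sample values are hypothetical; Open Payments does not currently publish an NPI column, which is the gap this use case describes.

```python
from collections import defaultdict

# Hypothetical payment rows, assuming an NPI column were available.
payments = [
    {"npi": "1234567893", "amount_usd": 125.00},
    {"npi": "1234567893", "amount_usd": 40.50},
    {"npi": "0000000001", "amount_usd": 300.00},  # made-up placeholder NPI
]

# Hypothetical rows from another provider dataset keyed on NPI.
providers = [
    {"npi": "1234567893", "specialty": "Cardiology"},
]


def total_payments_by_npi(rows):
    """Sum payment amounts per NPI."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["npi"]] += row["amount_usd"]
    return dict(totals)


def join_on_npi(provider_rows, totals):
    """Attach payment totals to provider rows (a left join on NPI)."""
    return [
        {**row, "total_payments_usd": totals.get(row["npi"], 0.0)}
        for row in provider_rows
    ]


joined = join_on_npi(providers, total_payments_by_npi(payments))
```

With a shared key, the join is a dictionary lookup; without it, each dataset pair needs its own fuzzy matching on name and address.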



SEC. 1128G. [42 U.S.C. 1320a-7h] (a) TRANSPARENCY REPORTS.—
(A) IN GENERAL.—On March 31, 2013, and on the 90th day of each calendar year beginning thereafter, any applicable manufacturer that provides a payment or other transfer of value to a covered recipient (or to an entity or individual at the request of or designated on behalf of a covered recipient), shall submit to the Secretary, in such electronic form as the Secretary shall require, the following information with respect to the preceding calendar year:
(i) The name of the covered recipient.
(ii) The business address of the covered recipient and, in the case of a covered recipient who is a physician, the specialty and National Provider Identifier of the covered recipient.
  • For publication
Except as provided in subparagraph (E), the procedures established under subparagraph (A)(ii) shall ensure that, not later than September 30, 2013, and on June 30 of each calendar year beginning thereafter, the information submitted under subsection (a) with respect to the preceding calendar year is made available through an Internet website that —
(viii) does not contain the National Provider Identifier of the covered recipient, ...

Data Quality & Usefulness

CMS's "Annual Report to Congress on the Open Payments Program for Fiscal Year 2014" pointed out some challenges with the data that made it difficult for Open Payments to accomplish its goals. These included:

  1. Records that could not be matched to a single doctor or hospital due to missing or inconsistent information.
    • These were published as "de-identified records"
    • The total value of payments associated with de-identified records is about 1.6 times that of identified records ($2.3 billion versus $1.4 billion)
  2. Records that could not be published due to delays

Here are the actual numbers reported for the 2013 program year:

                  | Identified Records | De-Identified | Total Published | Not Published Per Delay Request
Number of Records | 2.7 million        | 1.8 million   | 4.45 million    | 190,000
Value of payments | $1.4 billion       | $2.3 billion  | $3.7 billion    | $551 million
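As a quick worked check of what these figures imply, the sketch below computes the de-identified share of published records and of published payment value. It uses only the rounded numbers reported above, so the results are approximate; the variable names are illustrative.

```python
# Rounded figures from the 2013 program year table.
identified_records = 2.7e6
deidentified_records = 1.8e6
identified_usd = 1.4e9
deidentified_usd = 2.3e9

# Share of published records that could not be tied to a single recipient.
record_share = deidentified_records / (identified_records + deidentified_records)

# Share of published payment value attached to de-identified records.
value_share = deidentified_usd / (identified_usd + deidentified_usd)

# De-identified value relative to identified value.
value_ratio = deidentified_usd / identified_usd

print(f"De-identified share of records: {record_share:.0%}")
print(f"De-identified share of value:   {value_share:.0%}")
print(f"De-identified / identified value: {value_ratio:.2f}x")
```

On these rounded inputs, roughly 40% of published records but about 62% of published payment value are de-identified.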



  • NPI can be derived from a combination of identifying fields once the source data is downloaded: first and last name, specialty, city, license states, and address at time of transaction.
    • Downloads with or without filters are possible using the Data Explorer app, which runs on the Socrata hosting platform
  • The fields Open Payments has available for matching against the NPPES database are:
    • First, middle, last name, suffix
    • Full business address (street, city, zip, county)
    • Physician type
    • Physician specialty
    • States where licensed to practice
  • Although Physician Profile ID is not the same as NPI, it is constant from year to year, making it easier to match to NPI or other external identifier over time.
  • For cases where the address in the NPPES database is out of date or the specialty in NPPES is not accurately represented, consider pulling from more up-to-date databases (such as the AMA Physician Masterfile) or consumer-facing applications (such as BetterDoctor or Vitals) that are likely to have more up-to-date information
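The matching approach outlined above can be sketched as follows. This is a simplified illustration under stated assumptions, not a tested production matcher: the field names (`physician_profile_id`, `first_name`, `last_name`, `state`, `city`, `npi`) are hypothetical stand-ins for the actual Open Payments and NPPES column names, the sample rows are invented, and a real matcher would also use specialty, suffix, and license states. Because Physician Profile ID is constant from year to year, the resulting crosswalk could be reused across program years.

```python
from collections import defaultdict


def normalize(text):
    """Uppercase, drop periods and commas, and collapse whitespace."""
    cleaned = text.upper().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def match_key(row):
    """Blocking key: normalized first name, last name, and state."""
    return (normalize(row["first_name"]), normalize(row["last_name"]), normalize(row["state"]))


def build_crosswalk(open_payments_rows, nppes_rows):
    """Map Physician Profile ID to NPI where exactly one candidate remains."""
    index = defaultdict(list)
    for row in nppes_rows:
        index[match_key(row)].append(row)

    crosswalk = {}
    for row in open_payments_rows:
        candidates = index.get(match_key(row), [])
        if len(candidates) > 1:
            # Ambiguous on name and state: narrow by city.
            city = normalize(row["city"])
            candidates = [c for c in candidates if normalize(c["city"]) == city]
        if len(candidates) == 1:
            crosswalk[row["physician_profile_id"]] = candidates[0]["npi"]
    return crosswalk


# Invented sample rows; the NPIs starting with 000 are made-up placeholders.
open_payments_sample = [
    {"physician_profile_id": 1001, "first_name": "John", "last_name": "Smith", "state": "MA", "city": "Boston"},
    {"physician_profile_id": 1002, "first_name": "Jane", "last_name": "Doe", "state": "NY", "city": "Albany"},
    {"physician_profile_id": 1003, "first_name": "Alex", "last_name": "Nguyen", "state": "CA", "city": "Fresno"},
]
nppes_sample = [
    {"npi": "1234567893", "first_name": "JOHN", "last_name": "SMITH", "state": "MA", "city": "BOSTON"},
    {"npi": "0000000001", "first_name": "John", "last_name": "Smith", "state": "MA", "city": "Worcester"},
    {"npi": "0000000002", "first_name": "Jane", "last_name": "Doe.", "state": "NY", "city": "Buffalo"},
]

crosswalk = build_crosswalk(open_payments_sample, nppes_sample)
```

Records that stay ambiguous or find no candidate are left out of the crosswalk rather than guessed, so they can be routed to manual review or to a secondary source.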

  1. Huge thanks to Fred Trotter for pointing out the Open Payments report to Congress, related problems and independent efforts.