Use Case 56: Leverage provider network standard for better insurance plan selection

From Demand-Driven Open Data for HHS


Use case summary


Since the launch of health insurance marketplaces as part of the Patient Protection and Affordable Care Act (PPACA), millions of Americans have obtained new health care coverage. There have, however, been widespread consumer complaints; in particular, people tend to choose the wrong plan because relevant information is unavailable or inaccurate. Often, patients don't discover until after a purchase that their physician isn't in-network, or that an in-network specialist they need isn't taking new patients.

In November 2015, the Centers for Medicare & Medicaid Services (CMS) enacted a new regulatory requirement for health insurers who list plans on insurance marketplaces. They must now publish a machine-readable version of their provider network directory that conforms to a specified JSON standard, and update it at least monthly. Finally, this data is becoming accessible.

But data that computers and engineers can access is not necessarily accessible to the general market of health care consumers. The new challenge, then, is to transform this vast directory of provider data into insights that can guide individuals to the health care they're paying for, that they deserve, and that they often badly need.



  • Value to requester: Facilitates creation of applications for the annual codathon.
  • Value to industry/public: Makes it possible to perform analytics on provider directories and drug formularies across all commercial health insurance companies that offer plans on state or federal insurance exchanges.


Current data and limitation

  • CMS has a provider lookup tool, but it is available only at the time of plan signup, so it is not easy for consumers to compare provider networks across plans.


  • Fields: All fields identified in the QHP Schema for fall 2015
  • Update frequency: Updated monthly (Regulation states insurers are supposed to update the datasets at least every 30 days)
  • Joins between datasets: plan_id, npi
  • Lag time: By updating monthly, lag time from updates provided by insurers is no more than one month
  • History: Publication has been mandated by regulation only since the open enrollment period starting in fall 2015
  • Delivery mechanism:
    • Download of tab-delimited files for analytics (to load into a relational database or analytics tool)
    • Download of JSON files for non-analytical web/mobile applications (to load into a NoSQL document database)
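As a rough illustration of how the plan_id and npi join keys listed above might be used on the tab-delimited downloads, the sketch below finds every plan that includes a given provider. The file contents, the column names other than plan_id and npi, and the function names are assumptions made up for this example; the real extracts may be laid out differently.

```python
import csv
import io

# Hypothetical sample extracts. Only plan_id and npi are named on this page;
# the other columns and all values are invented for illustration.
PLANS_TSV = "\n".join([
    "plan_id\tmarketing_name",
    "12345XX0010001\tSilver Example",
    "67890YY0020002\tBronze Example",
]) + "\n"

PROVIDER_PLANS_TSV = "\n".join([
    "npi\tplan_id\tnetwork_tier",
    "1234567890\t12345XX0010001\tPREFERRED",
    "1234567890\t67890YY0020002\tPREFERRED",
    "1987654321\t12345XX0010001\tPREFERRED",
]) + "\n"


def read_tsv(text):
    """Parse tab-delimited text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def plans_for_npi(npi, provider_plan_rows, plan_rows):
    """Join provider-plan rows to plan rows on plan_id for one provider (npi)."""
    plans_by_id = {row["plan_id"]: row for row in plan_rows}
    return sorted(
        plans_by_id[row["plan_id"]]["marketing_name"]
        for row in provider_plan_rows
        if row["npi"] == npi and row["plan_id"] in plans_by_id
    )


if __name__ == "__main__":
    names = plans_for_npi(
        "1234567890", read_tsv(PROVIDER_PLANS_TSV), read_tsv(PLANS_TSV)
    )
    print(names)
```

In a relational database the same lookup would be an ordinary join on plan_id filtered by npi; the in-memory dictionary here only stands in for that.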


  • The thousands of URLs being provided by health insurance companies are being crawled and aggregated into easily accessible datasets.
  • For analytical applications, the data is being transformed into a tabular format.
  • As can be seen from the GitHub discussion, credit goes to the community for creating the solution.
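The transformation from JSON into a tabular format mentioned above could look roughly like the following. This is a minimal sketch, not the community's actual pipeline: the record layout (npi, name, accepting, plans with plan_id and network_tier) is an assumption loosely modeled on the provider directory JSON, and the sample record and function names are hypothetical.

```python
import csv
import io
import json

# Hypothetical provider record; field names are assumed, values are invented.
SAMPLE_JSON = """
[
  {"npi": "1234567890",
   "type": "INDIVIDUAL",
   "name": {"first": "Jane", "last": "Doe"},
   "accepting": "accepting",
   "plans": [
     {"plan_id": "12345XX0010001", "network_tier": "PREFERRED"},
     {"plan_id": "67890YY0020002", "network_tier": "PREFERRED"}
   ]}
]
"""

FIELDS = ["npi", "last_name", "first_name", "accepting", "plan_id", "network_tier"]


def flatten_providers(records):
    """Emit one flat row per (provider, plan) pair from nested records."""
    rows = []
    for rec in records:
        name = rec.get("name")
        if not isinstance(name, dict):
            name = {}  # tolerate records without a structured name
        for plan in rec.get("plans", []):
            rows.append({
                "npi": rec.get("npi", ""),
                "last_name": name.get("last", ""),
                "first_name": name.get("first", ""),
                "accepting": rec.get("accepting", ""),
                "plan_id": plan.get("plan_id", ""),
                "network_tier": plan.get("network_tier", ""),
            })
    return rows


def to_tsv(rows):
    """Serialize flat rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_tsv(flatten_providers(json.loads(SAMPLE_JSON))))
```

Writing one row per provider and plan pair keeps both join keys, plan_id and npi, on every row, which is what makes the tabular files convenient to load into a relational database.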