The ProPublica Data Store

ProPublica is making available the datasets that power our data journalism. Raw data that we received through FOIA requests is available for free, and datasets that reflect substantial cleaning and processing by our staff are available for a one-time fee. By clicking the "Purchase" button on any dataset, journalists and academic researchers can purchase premium datasets, and interested commercial users can contact us for pricing. We also provide a pass-through link when a data download is available on another site.

Premium Datasets (Purchase)

Cleaned up, categorized and often created from multiple sources, these Premium datasets are unique to ProPublica and sold for a nominal fee.

FOIA Data (Free)

Because ProPublica received these datasets through FOIA requests, we're posting the original, raw datasets free for download.

External Data

ProPublica frequently uses datasets that are free and available online. Instead of hosting copies for you to download from us, we send you straight to the source.

Health Datasets

Source JOURN ($) ACAD ($)

Premium: Dollars For Doctors Data (Per State)

ProPublica’s Dollars for Docs data. The data include about $2 billion in payments to doctors, other medical providers and health care institutions that have been disclosed by 15 pharmaceutical companies from 2009 to 2012.
Size: Row count varies, Date Released: 03/2014
Source: Pharmaceutical Company Disclosures, JOURN: $200, ACAD: $2,000, Purchase

Try a Sample!

Premium: Combined Dollars for Docs Dataset (National)

ProPublica’s Dollars for Docs data. The data include more than $2 billion in payments to doctors, other medical providers and health care institutions that have been disclosed by 15 pharmaceutical companies from 2009 to 2012.
Size: 2,111,118 rows, Date Released: 03/2014
Source: Pharmaceutical Company Disclosures, JOURN: $1,000, ACAD: $10,000, Purchase

Try a Sample!

Premium: Prescriber Checkup Dataset 2011

ProPublica's Prescriber Checkup data for 2011. The data have been cleaned and joined with other tables to include providers' names, addresses, specialties and contact information, as well as additional information on doctors' prescribing habits. There are seven files in total.
Size: Row count varies, Date Received: 04/2013
Source: Centers for Medicare & Medicaid Services, JOURN: $200, ACAD: $2,000, Purchase

Try a Sample!

Medicare Part D Prescribing Data 2011

Medicare Part D prescriptions for 2011. The data include all drugs prescribed by doctors 11 or more times that year to Part D patients, including those 65 and older. A lookup file is provided to match unique prescriber ID to a practitioner's DEA or NPI number or other identifier. ProPublica used this data in Prescriber Checkup.
Size: 72,921,353 rows, Date Received: 04/2013
Centers for Medicare & Medicaid Services Download
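
As a rough illustration of how the lookup file might be used, the sketch below attaches NPI and DEA identifiers to prescription rows by prescriber ID. The column names (`prescriber_id`, `npi`, `dea`, `drug_name`, `claim_count`) and all sample values are assumptions made up for this example; check the headers in the downloaded files before adapting it.

```python
import csv
import io

# Made-up sample rows with hypothetical column names.
# The real lookup and prescription files may use different headers.
LOOKUP_CSV = """prescriber_id,npi,dea
P001,1234567890,AB1234567
P002,1987654321,
"""

PRESCRIPTIONS_CSV = """prescriber_id,drug_name,claim_count
P001,DRUG A,42
P002,DRUG B,15
P003,DRUG A,11
"""


def load_lookup(lines):
    """Build a dict mapping each prescriber ID to its lookup row."""
    return {row["prescriber_id"]: row for row in csv.DictReader(lines)}


def attach_identifiers(prescription_lines, lookup):
    """Yield each prescription row with 'npi' and 'dea' fields added.

    Rows are streamed one at a time, so a very large prescriptions
    file does not need to fit in memory. Prescriber IDs that are
    missing from the lookup get empty strings.
    """
    for row in csv.DictReader(prescription_lines):
        match = lookup.get(row["prescriber_id"], {})
        row["npi"] = match.get("npi", "")
        row["dea"] = match.get("dea", "")
        yield row


lookup = load_lookup(io.StringIO(LOOKUP_CSV))
joined = list(attach_identifiers(io.StringIO(PRESCRIPTIONS_CSV), lookup))

for row in joined:
    print(row["prescriber_id"], row["drug_name"], row["npi"] or "(no NPI)")
```

With real files, the `io.StringIO` objects would be replaced by file handles opened with `open(path, newline="")`. Only the lookup table is held in memory, which matters given the size of the prescriptions file.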

Medicare Part D Prescribing Data (Patients 65 or Older) 2011

Medicare Part D prescriptions written only for patients 65 or older in 2011. The data include all drugs prescribed by doctors 11 or more times to these patients in 2011. A lookup file is provided to match unique prescriber ID to a practitioner's DEA or NPI number or other identifier. ProPublica used this data in Prescriber Checkup.
Size: 58,790,544 rows, Date Received: 04/2013
Centers for Medicare & Medicaid Services Download

CDC Mortality Data

CDC's mortality and cause-of-death data. ProPublica used this data in our Tylenol overdose story.
Centers for Disease Control and Prevention Link

Nursing Home Compare Data

The Centers for Medicare and Medicaid Services' Nursing Home Inspection data, including general information about nursing homes, health deficiencies, and penalties, updated monthly. ProPublica used this data in Nursing Home Inspect. Specifically, we used the Health Deficiencies, Penalties, and Provider Info datasets.
Centers for Medicare & Medicaid Services Link

Nursing Home Deficiencies Data

The Centers for Medicare and Medicaid Services also makes the full-text statements of nursing home deficiencies publicly available. Scroll down to "Related Links"; the dataset is called "Full Text of Statements of Deficiencies -- [Current Month & Year]". It contains 10 Excel files for each region of the country. ProPublica used this data in Nursing Home Inspect.
Centers for Medicare & Medicaid Services Link

Business Datasets

Source JOURN ($) ACAD ($)

Premium: Recovery Tracker Data

This dataset combines records from the recipient-reported data on Recovery.gov and Recovery Act grants and loans reported by agencies on USAspending.gov. ProPublica used this data in our Recovery Tracker.
Size: 472,059 rows, Date Received: 2013
Source: Recovery.gov and USAspending.gov, JOURN: $200, ACAD: $2,000, Purchase

Try a Sample!

Transportation Datasets

Source JOURN ($) ACAD ($)

Pipeline Safety Data

This dataset contains all reported oil or gas pipeline incidents documented by the Pipeline & Hazardous Materials Safety Administration. ProPublica used this data in our Pipeline Safety Tracker.
Pipeline & Hazardous Materials Safety Administration Link

Coming Soon Datasets

Source JOURN ($) ACAD ($)

Medicare Part D Custom Data Runs 2011

This dataset includes additional Medicare files ProPublica used to create the Prescriber Checkup app. The data include drug costs, drug counts and narcotic and antipsychotic drug use.
Size: Row count varies, Date Received: 04/2013
Centers for Medicare & Medicaid Services Download

Cloud download icon by Ugur Akdemir. Shopping cart icon by Icomatic.