The ProPublica Data Store

ProPublica is making available the datasets that power our data journalism. The raw data we received as the result of a FOIA request is available for free, and datasets that reflect substantial cleaning and processing by our staff are available for a one-time fee. Journalists and academic researchers can purchase premium datasets, and interested commercial users can contact us for pricing, by clicking the "Purchase" button on any dataset. We also provide a pass-through link when a data download is available on another site.

Premium Datasets (Purchase)

Cleaned up, categorized and often created from multiple sources, these Premium datasets are unique to ProPublica and sold for a nominal fee.

FOIA Data (Free)

Because ProPublica obtained these datasets through FOIA requests, we're posting the original, raw datasets as free downloads.

External Data

ProPublica frequently uses datasets that are free and available online. So instead of offering copies for download, we send you straight to the source.

Health Datasets

Source Journalists ($) Academics ($)

Premium: Prescriber Checkup Dataset 2012

ProPublica's Prescriber Checkup data for 2012. The data has been cleaned and joined with other tables to include providers' names, addresses, specialties and contact information, as well as additional information on doctors' prescribing habits. There are six total files.
Size: Varies, Date Released: January 2015
Centers for Medicare & Medicaid Services $200 $2,000 Purchase

Try a Sample!

Premium: ProPublica's Open Payments Explorer Data

This data is a cleaned version of CMS's Open Payments data. The cleaned data allows for aggregation by drug, device and company. ProPublica used this data for its Open Payments Explorer app.
Size: Varies, Date Released: January 2015
Centers for Medicare & Medicaid Services $200 $2,000 Purchase

Try a Sample!

Premium: Dollars For Doctors Data (Per State)

ProPublica’s Dollars for Docs data. The data include about $4 billion in payments to doctors, other medical providers and health care institutions that have been disclosed by 17 pharmaceutical companies from 2009 to 2013.
Size: Varies, Date Released: October 2014
Pharmaceutical Company Disclosures $200 $2,000 Purchase

Try a Sample!

Premium: Combined Dollars for Docs Dataset (National)

ProPublica’s Dollars for Docs data. The data include more than $4 billion in payments to doctors, other medical providers and health care institutions that have been disclosed by 17 pharmaceutical companies from 2009 to 2013.
Size: 3,362,932 rows, Date Released: October 2014
Pharmaceutical Company Disclosures $1,000 $10,000 Purchase

Try a Sample!

Premium: Prescriber Checkup Dataset 2011

ProPublica's Prescriber Checkup data for 2011. The data has been cleaned and joined with other tables to include providers' names, addresses, specialties and contact information, as well as additional information on doctors' prescribing habits. There are six total files.
Size: Varies, Date Received: April 2013
Centers for Medicare & Medicaid Services $200 $2,000 Purchase

Try a Sample!

Medicare Part D Prescribing Data 2012

Medicare Part D prescriptions for 2012. The data include all drugs prescribed by doctors 11 or more times that year to Part D patients, including those 65 and older. A lookup file is provided to match unique prescriber ID to a practitioner's DEA or NPI number or other identifier. ProPublica used this data in Prescriber Checkup.
Size: 21,970,751 rows, Date Released: July 2014
Centers for Medicare & Medicaid Services Download
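
For readers who want to work with the lookup file described above, here is a minimal sketch of how prescribing rows might be matched to a practitioner identifier. The column names (`prescriber_id`, `drug_name`, `claim_count`, `npi`) and the sample rows are hypothetical and chosen only for illustration; the actual files may use different headers and layouts.

```python
import csv
import io

# Hypothetical column names and made-up rows; the real files may differ.
PRESCRIPTIONS_CSV = """prescriber_id,drug_name,claim_count
P001,DRUG A,25
P001,DRUG B,11
P002,DRUG A,40
"""

LOOKUP_CSV = """prescriber_id,npi
P001,1111111111
P002,2222222222
"""


def load_lookup(handle):
    """Map each unique prescriber ID to its identifier (an NPI in this sketch)."""
    return {row["prescriber_id"]: row["npi"] for row in csv.DictReader(handle)}


def attach_identifier(prescriptions, lookup):
    """Yield each prescribing row with the matching NPI added (None if unmatched)."""
    for row in csv.DictReader(prescriptions):
        joined = dict(row)
        joined["npi"] = lookup.get(row["prescriber_id"])
        yield joined


lookup = load_lookup(io.StringIO(LOOKUP_CSV))
rows = list(attach_identifier(io.StringIO(PRESCRIPTIONS_CSV), lookup))
for row in rows:
    print(row["npi"], row["drug_name"], row["claim_count"])
```

With the real downloads, the in-memory strings would be replaced by opened files. Loading the lookup into a dictionary first keeps memory use modest while the much larger prescribing file is streamed row by row.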

Medicare Part D Prescribing Data (Patients 65 or Older) 2012

Medicare Part D prescriptions written only for patients 65 or older in 2012. The data include all drugs prescribed by doctors 11 or more times to these patients in 2012. A lookup file is provided to match unique prescriber ID to a practitioner's DEA or NPI number or other identifier. ProPublica used this data in Prescriber Checkup.
Size: 16,966,011 rows, Date Received: July 2014
Centers for Medicare & Medicaid Services Download

Medicare Part D Prescribing Data 2011

Medicare Part D prescriptions for 2011. The data include all drugs prescribed by doctors 11 or more times that year to Part D patients, including those 65 and older. A lookup file is provided to match unique prescriber ID to a practitioner's DEA or NPI number or other identifier. ProPublica used this data in Prescriber Checkup.
Size: 21,150,242 rows, Date Received: June 2014
Centers for Medicare & Medicaid Services Download

Medicare Part D Prescribing Data (Patients 65 or Older) 2011

Medicare Part D prescriptions written only for patients 65 or older in 2011. The data include all drugs prescribed by doctors 11 or more times to these patients in 2011. A lookup file is provided to match unique prescriber ID to a practitioner's DEA or NPI number or other identifier. ProPublica used this data in Prescriber Checkup.
Size: 16,366,282 rows, Date Received: June 2014
Centers for Medicare & Medicaid Services Download

Medicare Part D Custom Data Runs 2011

This dataset includes additional Medicare files ProPublica used to create the Prescriber Checkup app. The data include drug costs, drug counts and narcotic and antipsychotic drug use.
Size: Varies, Date Received: June 2014
Centers for Medicare & Medicaid Services Download

Medicare Part D Prescribing Data 2010

Medicare Part D prescriptions for 2010. The data include all drugs prescribed by doctors 11 or more times that year to Part D patients, including those 65 and older. A lookup file is provided to match unique prescriber ID to a practitioner's DEA or NPI number or other identifier. ProPublica used this data in Prescriber Checkup.
Size: 20,758,453 rows, Date Received: June 2014
Centers for Medicare & Medicaid Services Download

New ACA Plan Compare 2014-2015 Data

This data compares differences between 2014 and 2015 Affordable Care Act insurance plans. The data comes already joined through a crosswalk file and includes fields that indicate if a plan changed, and by how much. ProPublica used this data to create the "Will My Obamacare Health Care Costs Go Up?" app.
Size: 79,279 rows, Date Released: December 2014
Centers for Medicare & Medicaid Services Download

CMS Open Payments Data

CMS Open Payments data. ProPublica cleaned this data to use in the Open Payments Explorer app.
Centers for Medicare & Medicaid Services Link

CDC Mortality Data

CDC's mortality and cause-of-death data. ProPublica used this data in our Tylenol overdose story.
Centers for Disease Control and Prevention Link

Nursing Home Compare Data

Nursing home inspection data from the Centers for Medicare and Medicaid Services, including general information about nursing homes, health deficiencies, and penalties, updated monthly. ProPublica used this data in our Nursing Home Inspect. We used three data sets in particular: Health Deficiencies, Penalties, and Provider Info.
Centers for Medicare & Medicaid Services Link

Nursing Home Deficiencies Data

The Centers for Medicare and Medicaid Services also makes publicly available the full-text statements of nursing home deficiencies. Scroll down to "Related Links" and the dataset is called "Full Text of Statements of Deficiencies -- [Current Month & Year]." It contains 10 Excel files for each region of the country. ProPublica used this data in our Nursing Home Inspect.
Centers for Medicare & Medicaid Services Link

Medicare Part B Provider Utilization and Payment Data

Medicare Part B service data for 2012. The data include all services performed by doctors 11 or more times that year for Part B patients. ProPublica used the data to create the Treatment Tracker app and stories.
Centers for Medicare & Medicaid Services Link

Education Datasets

Source Journalists ($) Academics ($)

School Desegregation Orders Data

This is a dataset of school desegregation orders. The data files include information about school desegregation orders mandated by federal courts and open school desegregation orders that resulted from voluntary agreements between school districts and the U.S. Department of Education’s Office for Civil Rights. ProPublica used this data in our desegregation app.
Size: Varies, Date Uploaded: December 2014
U.S. Department of Justice; Stanford University Download

Restraint and Seclusion Data

This data contains all instances of restraints and seclusions that public schools self-reported during the 2011-2012 school year. It is broken down by state, district, and school. This is the first time the federal government has attempted to collect this data from all schools, though beware: many school districts did not report. ProPublica used this data in our story on the use of restraints at school. Read our reporting recipe for tips on how you can report this story.
Size: 95,635 rows, Date Uploaded: June 2014
Office for Civil Rights, U.S. Department of Education Download

Campaign Finance Datasets

Source Journalists ($) Academics ($)

Free the Files Filing Data

ProPublica's Free the Files data. ProPublica curated this data from TV stations' political ad filings in swing markets in 2012.
Size: 66,225 rows, Date Released: January 2015
ProPublica, Federal Communications Commission Download

Business Datasets

Source Journalists ($) Academics ($)

Rating Agency Document Review Data

This data contains a summary of comments investment bankers made about credit rating agencies while pitching their underwriting services for tobacco bonds from 1999 onward. The comments come from 140 underwriting pitches ProPublica collected under public records requests in more than a dozen states. ProPublica used this data in "Bankers Brought Rating Agencies ‘To Their Knees’ On Tobacco Bonds".
Size: 265 rows, Date Released: December 2014
Bond underwriters' responses Download

Premium: Recovery Tracker Data

This dataset combines records from the recipient-reported data on Recovery.gov and Recovery Act grants and loans reported by agencies on USAspending.gov. ProPublica used this data in our Recovery Tracker.
Size: 472,059 rows, Date Received: 2013
Recovery.gov, USAspending.gov $200 $2,000 Purchase

Try a Sample!

Premium: ProPublica's Bailout Data

This data includes expenditures by the Treasury Department via both the broader $700 billion TARP bill (later reduced to $475 billion) and the separate bailout of Fannie Mae and Freddie Mac. ProPublica used this data in our bailout coverage.
Size: Varies, Date Released: October 2014
Treasury Department; SEC filings $200 $2,000 Purchase

Try a Sample!

Transportation Datasets

Source Journalists ($) Academics ($)

Pipeline Safety Data

This data contains all reported oil or gas pipeline incidents documented by the Pipeline & Hazardous Materials Safety Administration. ProPublica used this data in our Pipeline Safety Tracker.
Pipeline & Hazardous Materials Safety Administration Link

Cloud download icon by Ugur Akdemir. Shopping cart icon by Icomatic.