How We Compared Clinical Trial and Cancer Incidence Data

An in-depth look at newly approved cancer drugs, who participates in their clinical trials and who is affected by those cancers.


Introduction

For our story, "Black Patients Miss Out On Promising Cancer Drugs," ProPublica compared participant pools for clinical trials of new cancer treatments to the populations most at risk for the types of cancers targeted by these drugs. To conduct this analysis, we compiled two main data sets: clinical trial data from the FDA and cancer incidence data from the National Cancer Institute.

Clinical Trial Data

FDA Drug Trials Snapshots

In 2012, as part of the FDA Safety and Innovation Act, Congress asked the FDA to report clinical trial participation by demographic subgroup. In 2013, the agency found minorities were often underrepresented, noting that, for many of the drugs under consideration, “there were too few African American or Black patients in the trials to enable meaningful subset analysis.”

For every new drug approved starting in 2015, the FDA published a “Drug Trials Snapshot,” which includes the demographic breakdown for the clinical trial participants by sex, race, and age subgroups. ProPublica has compiled this data for all FDA-approved drugs from January 2015 to mid-August 2018 into a single dataset. Download this dataset at ProPublica's Data Store.

Snapshots included clinical trials run in the United States and internationally, but did not begin reporting what percentage of trials were conducted in the U.S. until 2017. Though Asians appear to be well represented in most trials, many of these trials were likely based outside of the United States. Analysis of 2017 data shows that, for drugs with at least 70 percent of trials conducted within the U.S., Asians make up only 1.7 percent of participants. Furthermore, the “Asian” category does not say whether participants are of East Asian, South Asian, Southeast Asian, or Pacific Islander descent.
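The U.S.-share filter described above can be sketched as follows. This is a minimal illustration rather than ProPublica's actual code: the record fields (`us_share`, `participants`, `asian`), the drug names, and every number are hypothetical, and the sketch assumes the percentage is pooled across all participants in the qualifying drugs rather than averaged per drug.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One drug's trial summary. All field names are hypothetical."""
    drug: str
    us_share: float    # fraction of the drug's trials conducted in the U.S. (0 to 1)
    participants: int  # total trial participants
    asian: int         # participants recorded as Asian


def pooled_asian_share(snapshots, min_us_share=0.70):
    """Share of Asian participants across drugs meeting the U.S.-share cutoff.

    Returns None when no drug meets the cutoff.
    """
    kept = [s for s in snapshots if s.us_share >= min_us_share]
    total = sum(s.participants for s in kept)
    if total == 0:
        return None
    return sum(s.asian for s in kept) / total


# Made-up records for illustration only; these are not FDA figures.
SNAPSHOTS = [
    Snapshot("drug_a", 0.85, 400, 6),
    Snapshot("drug_b", 0.72, 600, 12),
    Snapshot("drug_c", 0.30, 500, 150),  # mostly non-U.S. trials, filtered out
]

share = pooled_asian_share(SNAPSHOTS)
print(f"Asian share in mostly-U.S. trials: {share:.1%}")
```

Pooling participants weights large trials more heavily than small ones; averaging the per-drug percentages instead would give each drug equal weight and could produce a different figure.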

Reports did not include a Hispanic ethnicity category until 2017, and do not distinguish between white and non-white Hispanics, or between Hispanics of European or Latin American descent.

Cancer Trials Data

From the FDA Drug Trials Snapshots data, ProPublica identified 32 drugs that are primarily used to treat cancer and were approved by the FDA between January 2015 and June 2018. To obtain more detailed racial breakdowns for these specific drugs, ProPublica manually compiled demographic information from individual FDA Snapshot reports into one dataset, shown in the table below.

Five drugs were approved more than once by the FDA to treat different types of cancers. For example, Lenvima was first approved in 2015 for differentiated thyroid cancer, then in 2016 for renal cell carcinoma and again in 2018 for hepatocellular carcinoma. Similarly, Imfinzi was approved in 2017 for urothelial carcinoma and in 2018 for non-small cell lung cancer. The FDA did not publish Snapshot data for clinical trials related to additional approvals, so we included data only on each drug's first approval.

Our final analysis for the story excluded one of the 32 drugs, Rydapt, which was approved for two uses. In its first approval, for acute myeloid leukemia, 57 percent of the racial demographic data for the trial was unreported. Its second approval was for aggressive systemic mastocytosis, systemic mastocytosis with associated hematological neoplasm, and mast cell leukemia. Since the first two conditions are not cancers, and we could not separate out data for patients with mast cell leukemia, we did not include Snapshot data related to the second approval.

Clinical Trial Demographics by Race, for FDA-Approved Cancer Drugs

Brand Name Maker Cancer Type White African American Asian Other: American Indian or Alaska Native Other: Native Hawaiian or Other Pacific Islander Other: Multiple/Mixed Other: Unreported Other: Other Other: Unknown/Missing Other: Aggregate United States (2017 Only) Year
COTELLIC Roche melanoma with a BRAF V600E or V600K mutation 93.0% N/A N/A 7.0% 7.0% N/A 2015
ODOMZO Sun Pharma (NVS developed) basal cell carcinoma 94.0% <1% 0.0% 6.0% N/A 2015
LONSURF Taiho Oncology colorectal cancer 58.0% 1.0% 35.0% 6.0% N/A 2015
PORTRAZZA Eli Lilly squamous non-small cell lung cancer 83.0% 1.0% 8.0% <1% <1% 8.0% N/A 2015
TAGRISSO AstraZeneca EGFR T790M mutation-positive non-small cell lung cancer 36.0% 1.0% 60.0% <1% 2.0% 3.0% N/A 2015
IBRANCE Pfizer HR-positive, HER2-negative breast cancer 89.7% 1.2% 6.1% 3.0% 3.0% N/A 2015
ALECENSA Genentech ALK-positive non-small cell lung cancer 73.5% 1.6% 18.2% 0.4% 0.0% 0.4% 4.3% 1.6% 7.0% N/A 2015
NINLARO Takeda multiple myeloma 84.6% 1.8% 8.9% 1.7% 3.0% 4.7% N/A 2015
LENVIMA Merck differentiated thyroid cancer 79.3% 2.0% 17.9% 0.3% 0.5% <1% N/A 2015
FARYDAK Novartis multiple myeloma 63.2% 3.1% 32.6% 1.0% 1.0% N/A 2015
EMPLICITI BMS & Abbvie multiple myeloma 84.0% 4.0% 10.0% <1% 2.0% <1% 2.0% N/A 2015
UNITUXIN United Therapeutics neuroblastoma 81.9% 7.1% 2.7% 0.9% 1.3% 0.4% 5.8% 8.4% N/A 2015
DARZALEX J&J multiple myeloma 76.0% 10.0% 6.0% 8.0% 8.0% N/A 2015
YONDELIS J&J liposarcoma or leiomyosarcoma 76.0% 12.0% 4.0% 1.0% 3.0% 1.0% 3.0% 8.0% N/A 2015
TECENTRIQ Genentech urothelial carcinoma 91.0% 2.0% 2.0% <1% <1% 2.0% 2.0% 5.0% N/A 2016
VENCLEXTA AbbVie chronic lymphocytic leukemia with 17p deletion 94.0% 3.0% <1% <1% 2.0% 1.0% 3.0% N/A 2016
RUBRACA Clovis Oncology deleterious BRCA mutation associated ovarian cancer 78.0% 4.0% 7.0% 2.0% 9.0% 11.0% N/A 2016
LARTRUVO Eli Lilly soft tissue sarcoma 86.0% 8.0% 3.0% 2.0% 2.0% N/A 2016
ALIQOPA Bayer follicular lymphoma 83.0% 0.0% 9.0% 8.0% 8.0% 13% 2017
BAVENCIO Merck KGaA & Pfizer Merkel cell carcinoma 92.0% 0.0% 3.0% 3.0% 1.0% 4.0% 58% 2017
ZEJULA Tesaro epithelial ovarian, fallopian tube, or primary peritoneal cancer 87.0% 1.0% 3.0% <1% 9.0% 9.0% 70% 2017
ALUNBRIG Takeda ALK-positive non-small cell lung cancer 67.0% 1.0% 31.0% 1.0% 1.0% NR 2017
RYDAPT Novartis FLT3-positive acute myeloid leukemia 38.0% 2.0% 2.0% <1% <1% <1% 57.0% 58.0% 33% 2017
BESPONSA Pfizer B-cell precursor acute lymphoblastic leukemia 71.0% 2.0% 17.0% 10.0% 10.0% 47.0% 2017
VERZENIO Eli Lilly HR-positive, HER2-negative breast cancer 60.6% 2.5% 27.0% 3.2% 0.2% 6.5% 19.0% 2017
KISQALI Novartis HR-positive, HER2-negative breast cancer 82.0% 3.0% 8.0% <1% <1% 3.0% 4.0% 7.0% 32% 2017
NERLYNX Puma HER2-overexpressed/amplified breast cancer 81.0% 3.0% 13.0% 3.0% 3.0% 32.0% 2017
CALQUENCE AstraZeneca mantle cell lymphoma 74.0% 3.0% 0.0% 23.0% 23.0% 36.0% 2017
IMFINZI AstraZeneca urothelial carcinoma 64.0% 3.0% 20.0% 3.0% 9.0% 13.0% 49.0% 2017
IDHIFA Agios & Celgene acute myeloid leukemia 77.0% 6.0% <1% <1% 16.0% <1% 17.0% 83.0% 2017
BRAFTOVI+MEKTOVI Array Biopharma melanoma with a BRAF V600E or V600K mutation 91.0% 0.0% 3.0% 0.5% 1.0% 3.5% 6.0% 9.0% 2018
ERLEADA J&J prostate cancer 66.0% 6.0% 12.0% <1% 16.0% <1% 16.0% 28.0% 2018

Source: U.S. Food and Drug Administration; ProPublica analysis

Credit: Riley Wong/ProPublica

Cancer Incidence Data

SEER Database

The National Cancer Institute runs the Surveillance, Epidemiology, and End Results (SEER) Program for tracking cancer statistics within the United States. For some of the most common cancer types, SEER provides Cancer Stat Facts, summary reports that contain incidence and mortality rates by race and binary gender, based on SEER 2011-2015 data. For other cancer types, the SEER Cancer Query Systems allow users to query the database for incidence and mortality statistics by cancer type, race, and gender.

The SEER age-adjusted incidence rate for a cancer type is the number of new cases of that cancer per 100,000 people, weighted by the age distribution of the U.S. standard population.
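A rate defined this way can be computed by direct standardization: take the incidence rate within each age group, weight it by that group's share of the standard population, and sum. The sketch below illustrates the arithmetic only. The three age groups, case counts, populations, and weights are made-up numbers, not SEER data, and the function name is hypothetical.

```python
def age_adjusted_rate(cases, population, standard_weights, per=100_000):
    """Directly standardized incidence rate per `per` people.

    cases[i] and population[i] describe age group i in the study population;
    standard_weights[i] is that group's share of the standard population.
    Weights are normalized here, so they do not need to sum to exactly 1.
    """
    if not (len(cases) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age groups")
    total_weight = sum(standard_weights)
    rate = 0.0
    for c, p, w in zip(cases, population, standard_weights):
        rate += (c / p) * (w / total_weight)  # age-specific rate times weight
    return rate * per


# Hypothetical inputs for three age groups (young, middle, old).
cases = [10, 50, 200]
population = [1_000_000, 500_000, 250_000]
standard_weights = [0.5, 0.3, 0.2]

adjusted = age_adjusted_rate(cases, population, standard_weights)
crude = sum(cases) / sum(population) * 100_000
print(f"age-adjusted: {adjusted:.1f} per 100,000; crude: {crude:.1f} per 100,000")
```

In this made-up example the adjusted rate is higher than the crude rate because the study population is younger than the standard population, which is the kind of distortion age adjustment is meant to remove when comparing groups.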

Finally, SEER groups “Asian or Pacific Islander” into one category and does not provide disaggregated data for patients of East Asian, South Asian, Southeast Asian, or Pacific Islander descent.

Cancer Incidence Rates Per 100,000 People by Race and Per Year

Cancer Type | White | African American | Asian | Native American
acute myeloid leukemia | 4.6 | 3.9 | 3.5 | 2.5
ALK-positive non-small cell lung cancer | 54.2 | 62.5 | 35.2 | 36.3
B-cell precursor acute lymphoblastic leukemia | 1.0 | 0.3 | 0.5 | N/A
basal cell carcinoma | 22.0 | 1.0 | N/A | N/A
chronic lymphocytic leukemia with 17p deletion | 5.3 | 3.9 | 1.1 | 1.6
colorectal cancer | 39.2 | 48.7 | 33.7 | 42.2
deleterious BRCA mutation associated ovarian cancer | 12.1 | 9.3 | 9.6 | 9.0
differentiated thyroid cancer | 12.4 | 7.2 | 7.8 | 11.9
EGFR T790M mutation-positive non-small cell lung cancer | 54.2 | 62.5 | 35.2 | 36.3
epithelial ovarian, fallopian tube, or primary peritoneal cancer | 12.1 | 9.3 | 9.6 | 9.0
FLT3-positive acute myeloid leukemia | 4.6 | 3.9 | 3.5 | 2.5
follicular lymphoma | 4.1 | 1.6 | 1.6 | N/A
HER2-overexpressed/amplified breast cancer | 17.7 | 22.1 | 19.3 | N/A
HR-positive, HER2-negative breast cancer | 97.1 | 76.4 | 71.5 | N/A
leiomyosarcoma | 0.1 | 0.1 | N/A | N/A
liposarcoma | 0.1 | 0.1 | N/A | N/A
mantle cell lymphoma | 1.2 | 0.5 | N/A | N/A
melanoma with a BRAF V600E or V600K mutation | 28.4 | 1.1 | 1.5 | 5.3
Merkel cell carcinoma | 0.7 | 0.0 | N/A | N/A
multiple myeloma | 6.3 | 13.8 | 4.0 | 5.9
neuroblastoma | 0.3 | 0.2 | 0.2 | 0.2
prostate cancer | 105.7 | 178.3 | 59.1 | 54.8
soft tissue sarcoma | 4.5 | 5.1 | 2.8 | 2.8
squamous non-small cell lung cancer | 12.2 | 15.5 | 5.6 | 10.0
urothelial carcinoma | 23.3 | 13.7 | 9.5 | 9.6

Notes:

  • Native American figures include both American Indians and Alaska Natives. Decimals have been rounded to the nearest 0.1 for standardization.

  • For urothelial carcinoma, we used the incidence rates for bladder cancer. Additional research found that urothelial carcinoma accounts for 90 percent of all bladder cancers.

  • For differentiated thyroid cancer (DTC), we used the incidence rates for thyroid cancer. Additional research found that DTC accounts for 90 percent of thyroid cancers.

  • For epithelial ovarian cancer, we used the incidence rates for ovarian cancer. Additional research found that epithelial ovarian cancer accounts for 90 percent of ovarian cancers.

  • For acute myeloid leukemia (AML) that is FLT3 mutation-positive, we used the incidence rates for AML without the mutation, as additional research found that the frequency of FLT3 mutations did not differ between races.

  • For ALK-positive non-small cell lung cancer (NSCLC), we used the incidence rates for NSCLC without the mutation, as additional research found that race was not significantly associated with ALK rearrangement status.

  • For EGFR-mutation non-small cell lung cancer (NSCLC), we used the incidence rates for NSCLC without the mutation. Though additional research found that the EGFR mutant rate has been noted to be higher in Asian populations, data was not available to translate this finding into incidence rates by race.

  • For chronic lymphocytic leukemia (CLL) with 17p deletion, we used the incidence rates for CLL without the mutation. Though additional research has found that, out of those with CLL, black patients have a greater frequency of the 17p deletion compared to non-black patients, data was not available to translate this finding into incidence rates by race.

Source: U.S. Food and Drug Administration; ProPublica analysis

Credit: Riley Wong/ProPublica
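To show how the two data sets can be lined up, here is a minimal sketch that joins a few rows of the trial table to the incidence table by cancer type. It is an illustration, not the code behind the story. The values are copied from the two tables above, while the variable names, the function name, and the choice of a Black-to-white incidence ratio as a summary are assumptions made for this example.

```python
# Incidence per 100,000 people, copied from the incidence table above.
INCIDENCE = {
    "multiple myeloma": {"white": 6.3, "african_american": 13.8},
    "prostate cancer": {"white": 105.7, "african_american": 178.3},
}

# (brand name, cancer type, African American share of trial participants in %),
# copied from the clinical trial table above.
TRIALS = [
    ("NINLARO", "multiple myeloma", 1.8),
    ("FARYDAK", "multiple myeloma", 3.1),
    ("EMPLICITI", "multiple myeloma", 4.0),
    ("DARZALEX", "multiple myeloma", 10.0),
    ("ERLEADA", "prostate cancer", 6.0),
]


def join_trials_to_incidence(trials, incidence):
    """Attach incidence rates for the matching cancer type to each trial row."""
    rows = []
    for drug, cancer_type, aa_share_pct in trials:
        rates = incidence[cancer_type]
        ratio = rates["african_american"] / rates["white"]
        rows.append({
            "drug": drug,
            "cancer_type": cancer_type,
            "trial_share_african_american_pct": aa_share_pct,
            "incidence_ratio_black_to_white": round(ratio, 2),
        })
    return rows


rows = join_trials_to_incidence(TRIALS, INCIDENCE)
for row in rows:
    print(row)
```

The incidence ratio is a per-capita comparison and the trial figure is a share of participants, so the two numbers are not on the same scale. The join only places them side by side for each drug.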

Additional Research

Some of the cancer drugs in our compiled dataset treat a specific subset of a cancer, e.g. HR-positive, HER2-negative breast cancer. Since these more specific incidence rates were not available in SEER, we looked at additional research on these specific subsets to calculate incidence rates:

Caroline Chen contributed to this methodology.
