
REVIEW ARTICLES
ARTICLE CITATION
DOI: 10.1200/PO.20.00150 JCO Precision Oncology no. 4 (2020) 1196-1206. Published online October 5, 2020.
PMID: 35050777
Meta-Analysis of PD-L1 Expression As a Predictor of Survival After Checkpoint Blockade




2Department of Data Science, Dana-Farber Cancer Institute, Boston, MA
3Foundation Medicine, Cambridge, MA
4Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA
5Department of Pathology, Dana-Farber Cancer Institute, Boston, MA
6Department of Radiation Oncology, Dana-Farber Cancer Institute, Boston, MA
Programmed cell death receptor ligand 1 (PD-L1) expression is the most studied biomarker to predict the efficacy of immune checkpoint inhibitors (ICIs), but its clinical significance is controversial. We estimated the distribution of PD-L1 expression scores (ie, tumor proportion score or combined proportion score) and the relationship between PD-L1 levels and ICIs’ impact on overall survival (OS).
We reconstructed, pooled, and analyzed individual-level data on 7,617 patients with cancer from 14 randomized clinical trials. The effects of ICIs were quantified using differences in 24-month restricted mean survival times (ΔRMSTs; ie, the increase in life expectancy truncated at 2 years associated with ICI therapy). In a simulation study, we compared standard randomized clinical trial designs with a trial design that leverages meta-analytic results like ours.
Approximately 93% of patients had a PD-L1 expression ≤ 5% (66% of patients) or > 50% (27% of patients). OS improves with ICIs regardless of PD-L1 expression level, which predicts the benefits’ magnitude. For patients with non–small-cell lung cancer (NSCLC), ΔRMSTs ranged from 1.4 months (95% probability interval [PI], 0.7 to 2.2 months) for PD-L1 expression ≤ 1% to 4.1 months (95% PI, 3.2 to 5.2 months) for PD-L1 expression > 80%. For patients with non-NSCLC tumors, ΔRMSTs ranged from 0.8 months (95% PI, −0.1 to 1.7 months) to 2.3 months (95% PI, 1.3 to 4.4 months), again for PD-L1 expression levels of ≤ 1% and > 80%, respectively. Simulations suggested that designs tailored to meta-analytic results can detect the effects of ICIs in PD-L1 subgroups with higher probability (> 15%) than standard designs.
Immune checkpoint inhibitors (ICIs) have become a standard treatment of many metastatic cancers and are actively being studied in earlier-stage disease.1 As of April 2019, the US Food and Drug Administration approved three programmed cell death protein 1 (PD-1) inhibitors (pembrolizumab, nivolumab, and cemiplimab) and three programmed cell death receptor ligand 1 (PD-L1) inhibitors (atezolizumab, avelumab, and durvalumab) for the treatment of diverse cancers, such as non–small-cell lung cancer (NSCLC), melanoma, head and neck carcinoma, and others.2,3
Key Objective
To estimate the relationship between PD-L1 expression and the effect of ICIs on overall survival using data summaries from a collection of published clinical trials.
Knowledge Generated
Treatment effect estimates in PD-L1 subgroups suggest that most patients benefit from ICIs, including those typically classified as PD-L1 negative. The magnitude of survival benefit varies with the tumors’ PD-L1 expression level.
Relevance
The practice of dichotomizing the range of PD-L1 expression scores is suboptimal for patient stratification. Meta-analytic estimates of the distribution of PD-L1 scores and subgroup-specific treatment effects can improve the designs of future clinical trials of ICIs.
Expression of PD-L1 is one of the most studied biomarkers to predict the efficacy of ICIs,4-7 but several factors limited its study in clinical trials. First, all trials that have compared ICIs with chemotherapy reported survival outcomes in the form of hazard ratios, a metric of treatment effects whose use in immuno-oncology studies has been put into question.8,9 Second, different trials have used different cutoff values to dichotomize PD-L1 expression levels, complicating meta-analytic studies.7,10 Third, different immunohistochemistry (IHC) assays were used to measure PD-L1 expression, and some studies factor immune cell PD-L1 expression into a combined score, further complicating comparisons across studies.11,12 Finally, there are few data available on the prevalence of PD-L1 expression scores in clinically relevant populations, making it difficult to assess the power of individual trials to detect treatment effects in PD-L1–defined subgroups.13,14
To address these gaps, the objectives of this pooled analysis were to estimate the distribution of PD-L1 expression scores in clinically relevant patient populations and assess the relationship between PD-L1 levels and the effects of ICIs on overall survival (OS) using an alternative to hazard ratios (ie, differences in restricted mean survival times [RMSTs]).8,15,16 To do so, we analyzed individual-level data from 7,617 patients with cancer. Individual-level data included information on the PD-L1 expression level and survival time of each patient. This information was extracted and reconstructed from the publications of 14 randomized clinical trials of PD-1/PD-L1 inhibitors.
We describe the selection of clinical trial publications in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.17 We searched the PubMed database to identify all English-language publications of randomized clinical trials that compared PD-1- or PD-L1–targeting ICIs against standard chemotherapy or placebo in adults (≥ 18 years) with solid tumors. Our PubMed query (restricted to trials published before October 31, 2018) is available in the Data Supplement. Our inclusion and exclusion criteria for study selection are also reported in the Data Supplement.
Two independent reviewers (A.A. and G.F.) extracted relevant information from the selected publications and their online supplements (conflicts were resolved in collaboration). Collected data included sample size, tumor histology, and IHC assay used to measure PD-L1 expression.11,12
We also reconstructed individual-level patient data (IPD), which included follow-up times and censoring indicators, from Kaplan-Meier curves pictured in the publications and their online supplements. We used the DigitizeIt software (version 2.3) and the IPD extraction algorithm of Guyot et al.18,19 Details of the reconstruction process are given in the Data Supplement.
Our end point of interest was OS (ie, time from randomization until death by any cause). OS is the gold standard for evaluating the long-term clinical outcomes of oncology treatments.20,21
Techniques for the quantification of PD-L1 expression from tissue samples are described elsewhere.12,22 Common measures include the tumor proportion score (TPS; ie, the percentage of tumor cells with partial or complete membrane staining in the tissue sample) and the combined proportion score (CPS; alternatively known as combined positive score), which counts both tumor and infiltrating immune cells with partial or complete staining. Although the latter is not technically a proportion, its values are expressed as a percentage score, like those of TPS.23
To evaluate PD-L1 as a predictive marker for ICIs, each trial subdivided patients into PD-L1–positive and –negative subgroups, formed by all individuals with a PD-L1 expression higher or lower than a study-specific cutoff (eg, 1%, 5%). We recorded all cut points used in each publication.
To estimate the distribution of PD-L1 levels, we used the collected PD-L1 cut points (Table 1) to define the following classes of PD-L1 expression: 0%-1%, 1%-5%, 5%-10%, 10%-50%, 50%-80%, and 80%-100% (each class includes its upper limit; except for the first class, each excludes its lower limit).
We measured the effects of ICIs on OS in each PD-L1 class according to the difference (Δ) in RMSTs between patients treated with ICIs and chemotherapy or placebo. The RMST is the mean survival time (in months since randomization) up to a prespecified time point,15 which we fixed at 24 months (the point where the Kaplan-Meier curve obtained by pooling all patients treated with chemotherapy or placebo crossed the 80% probability level; in a sensitivity analysis, we considered an alternative 30-month cutoff). Contrary to hazard ratios, ΔRMSTs are interpretable even when hazards are not proportional—as is common in ICI trials—and better suited to quantify the survival benefits of immunotherapies.8,15,16,24,25
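To make the ΔRMST metric concrete, the following minimal Python sketch computes an RMST as the area under a Kaplan-Meier curve up to a truncation time and then takes the difference between two arms. It is an illustration only, not the authors' implementation; the function name `km_rmst` and the toy follow-up data are hypothetical.

```python
def km_rmst(times, events, tau):
    """Area under the Kaplan-Meier curve from 0 to tau (the RMST).

    times  : follow-up times (months)
    events : 1 = death observed, 0 = censored
    tau    : truncation time (24 months in the main analysis)
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, last_t, area = 1.0, 0.0, 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        area += surv * (t - last_t)      # rectangle under the current step
        last_t = t
        if deaths:
            surv *= 1 - deaths / n_at_risk
        n_at_risk -= removed
    area += surv * (tau - last_t)        # last rectangle, up to tau
    return area


# Toy data (hypothetical, not taken from any trial).
ici_rmst = km_rmst([6, 12, 18, 30], [1, 1, 1, 0], tau=24)
control_rmst = km_rmst([3, 6, 9, 12], [1, 1, 1, 1], tau=24)
delta_rmst = ici_rmst - control_rmst     # about 7.5 months for this toy data
```

Because the RMST is an area under the survival curve, the difference between arms remains interpretable as months of life gained within the truncation window even when the two curves cross or the hazards are not proportional.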
To combine the information provided by individual trials and estimate the quantities of interest from the reconstructed IPD, we built a joint model of the distribution of PD-L1 expression levels and the treatment-specific survival distribution in different PD-L1 classes. First, we assumed that the prevalence of each class was the same in all trials, regardless of both the type of assay used to measure PD-L1 expression and the type of tumor (ie, NSCLC or other tumors). We subsequently assessed this assumption in a stratified analysis comparing prevalence estimates among trials using TPS or CPS, trials using different IHC assays, and trials of NSCLC or other tumors.
Second, we modeled the association between OS, PD-L1 scores, and tumor type (the technical details are provided in the Data Supplement). We assumed the distribution of survival times was the same for all patients within each subgroup defined by PD-L1 class, therapy received (ie, ICIs v standard chemotherapy or placebo), and tumor type (ie, NSCLC or other tumors). We evaluated this assumption using meta-analytic methods to compare pooled and study-specific estimates.26 In addition, we used piecewise exponential models to describe subgroup-specific survival distributions and compute RMSTs.27 In accordance with ICIs’ mechanism of action,28 we imposed that ICIs’ benefits on OS could not decrease as the PD-L1 expression level increases (eg, that the ΔRMST in the 0%-1% PD-L1 class can only be equal to or lower than that in the 1%-5% class; we assessed this assumption by estimating trends in study-specific ΔRMST).
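Under a piecewise exponential model, the RMST has a closed form: within each interval the survival function decays exponentially, so the area under the curve is a sum of simple terms. The sketch below shows this calculation in Python. The function name, the break points, and the rates are hypothetical; the actual specification used in the analysis is given in the Data Supplement.

```python
import math


def piecewise_exp_rmst(breaks, rates, tau):
    """RMST up to tau for a piecewise exponential survival model.

    breaks : ascending interior break points (months); may be empty
    rates  : hazard rate in each interval, len(rates) == len(breaks) + 1
    tau    : truncation time (months)
    """
    edges = [0.0] + list(breaks) + [math.inf]
    surv, area = 1.0, 0.0
    for lo, hi, lam in zip(edges[:-1], edges[1:], rates):
        if lo >= tau:
            break
        d = min(hi, tau) - lo            # time spent in this interval before tau
        if lam > 0:
            # Integral of exp(-lam * t) over [0, d] is (1 - exp(-lam * d)) / lam.
            area += surv * (-math.expm1(-lam * d) / lam)
        else:
            area += surv * d             # zero hazard: survival stays flat
        surv *= math.exp(-lam * d)       # survival at the end of the interval
    return area


# Hypothetical rates: the ICI arm has a lower hazard after month 6.
control = piecewise_exp_rmst([6, 12], [0.06, 0.06, 0.06], tau=24)
ici = piecewise_exp_rmst([6, 12], [0.06, 0.04, 0.03], tau=24)
delta_rmst = ici - control
```

With a single constant rate, the function reduces to the exponential RMST, (1 − exp(−λτ))/λ, which provides a quick consistency check.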
Our information-pooling approach was based on a Bayesian data augmentation algorithm.29 Patients’ membership PD-L1 classes were not available in the reconstructed IPD. Rather, for each patient, it was only known whether the PD-L1 level fell above or below a specific cut point (Table 1). On the basis of our model, data augmentation performs computations by repeatedly imputing individuals’ PD-L1 classes compatibly with available data. We provide more details on data extraction, modeling, and data augmentation in the Data Supplement.
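The imputation step can be sketched as follows. For a patient known only to fall above or below a trial's cut point, the compatible PD-L1 classes are identified and one is drawn with probability proportional to the current prevalence values. This Python sketch is a deliberate simplification: a full data augmentation step would also weigh each class by the likelihood of the patient's survival time, and the convention that a cut point belongs to the lower group is an assumption. All names and the prevalence values are hypothetical.

```python
import random

# Upper limits (%) of the six PD-L1 classes; each class includes its upper limit.
CLASS_UPPER = [1, 5, 10, 50, 80, 100]


def compatible_classes(cut, above):
    """Indices of the classes consistent with 'PD-L1 > cut' (above=True)
    or 'PD-L1 <= cut' (above=False)."""
    return [k for k, upper in enumerate(CLASS_UPPER) if (upper > cut) == above]


def impute_class(cut, above, prevalence, rng):
    """Draw one compatible class, weighted by the current prevalence values.

    Simplification: survival information is ignored in this sketch.
    """
    candidates = compatible_classes(cut, above)
    weights = [prevalence[k] for k in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical current prevalence values for the six classes (sum to 1).
prevalence = [0.40, 0.25, 0.05, 0.05, 0.15, 0.10]
rng = random.Random(2020)
imputed = impute_class(cut=50, above=True, prevalence=prevalence, rng=rng)
```

In a full sampler, such imputations would alternate with updates of the prevalence and survival parameters given the completed data.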
Following the Bayesian approach,30 we obtained the posterior distribution of model parameters (ie, their conditional distribution given the reconstructed IPD). To implement this approach, we specified a prior distribution for the prevalence of each PD-L1 class and the rates defining the piecewise exponential model. We used a Dirichlet prior distribution for the prevalence of each PD-L1 class. The prior mean matched the PD-L1 intervals’ length (eg, 4% for the 1%-5% class). We also specified independent gamma prior distributions with mean log(2)/12 and variance log(2)/120 for the piecewise exponential rates. This is a weakly informative choice of priors.30,31 The Data Supplement provides details and a graphical illustration of these prior distributions.
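The stated prior moments translate directly into hyperparameters. For a gamma distribution with shape a and rate b, the mean is a/b and the variance is a/b², so b = mean/variance and a = mean²/variance. The short Python calculation below works this out, together with the Dirichlet prior means implied by the class widths. Variable names are hypothetical, and the overall concentration of the Dirichlet prior is not stated in the text, so only its means are shown.

```python
import math

# Gamma prior on each piecewise exponential rate (per month).
prior_mean = math.log(2) / 12     # rate of an exponential with a 12-month median
prior_var = math.log(2) / 120

rate_param = prior_mean / prior_var         # b = mean / variance, equals 10
shape_param = prior_mean ** 2 / prior_var   # a = mean^2 / variance, about 0.58

# Dirichlet prior means proportional to the width of each PD-L1 class:
# 0%-1%, 1%-5%, 5%-10%, 10%-50%, 50%-80%, 80%-100%.
class_edges = [0, 1, 5, 10, 50, 80, 100]
prior_means = [(hi - lo) / 100 for lo, hi in zip(class_edges[:-1], class_edges[1:])]
```

A shape parameter below 1 spreads prior mass over a wide range of rates, which is consistent with the description of these priors as weakly informative.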
From the posterior distribution, we computed estimates (posterior means) of the prevalence of each PD-L1 class and the ΔRMST for the effect of ICIs in each PD-L1 class, separately by tumor type. For all these quantities, we also computed their 95% probability intervals (PIs), the range that contains the true value with 95% (posterior) probability.
The Data Supplement includes a sensitivity evaluation of the estimates with respect to the choice of the prior distribution. Specifically, we recomputed posterior estimates and PIs using a uniform prior—a common noninformative prior for proportions30,31—with mean = 1/6 for the prevalence of each PD-L1 class. We also increased (10-fold) the variance of the prior distributions on the piecewise-exponential rates.
We compared two designs for a randomized trial whose aim is to provide evidence of an ICI effect on OS. In both designs, patients are enrolled regardless of PD-L1 status (positive or negative). The primary null hypothesis is that the ICI has no effect on patients’ OS, regardless of PD-L1 status.
The compared designs are as follows. Design 1 (D1) is a design that includes two separate tests for nonnull ΔRMSTs among PD-L1–positive and –negative patients, according to the common 1% PD-L1 cutoff (Table 1). At the end of the trial, we test for the presence of treatment effects in the PD-L1–positive and –negative subgroups. Within each group, the relevant null hypothesis (ΔRMST = 0) is tested using the rmst2 R function,32 adjusting P values with the Bonferroni-Holm procedure33 and a 5% significance level. P+ and P− are the adjusted P values in the PD-L1 groups. The treatment is declared effective in either the PD-L1–positive or –negative group if min(P+,P−) ≤ 5% and in both if max(P+,P−) ≤ 5%.
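The decision rule of D1 can be illustrated with a short Python sketch of the Bonferroni-Holm adjustment applied to two subgroup P values. The raw P values here are hypothetical inputs; in the design described above they would come from the subgroup-specific ΔRMST tests. Function names are illustrative.

```python
def holm_adjust(pvals):
    """Bonferroni-Holm adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest P value is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


def d1_decision(p_pos, p_neg, alpha=0.05):
    """Apply the D1 rule to the raw P values of the two PD-L1 subgroups."""
    adj_pos, adj_neg = holm_adjust([p_pos, p_neg])
    return {
        "effective_in_either": min(adj_pos, adj_neg) <= alpha,
        "effective_in_both": max(adj_pos, adj_neg) <= alpha,
    }


# Hypothetical raw P values for the PD-L1-positive and -negative subgroups.
decision = d1_decision(p_pos=0.01, p_neg=0.30)
```

With these inputs, the adjusted P values are 0.02 and 0.30, so the treatment would be declared effective in one subgroup but not in both.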
Design 2 (D2) is a design that leverages meta-analytic estimates using the testing procedure introduced in Arfé et al,9 which targets an optimal power level. We use the 10% PD-L1 cutoff, an inflection point in the estimated relationship between PD-L1 levels and ICIs’ effects on OS (see Results).
The D2 design is similar to D1, except for the different PD-L1 cutoff and the testing procedure implemented in the PD-L1–positive and –negative groups. The testing procedure9 used to detect ICIs’ benefits in the PD-L1 subgroups controls the false-positive error rate at the prespecified 5% level and maximizes the probability of detecting ICIs’ subgroup-specific effects on OS. Technical details are provided in the Data Supplement.
To compare designs, we simulated 10,000 trials (1:1 randomization ratio; sample size, 500 patients) that contrast an ICI with standard chemotherapy in NSCLC. From simulation results, we compared D1 and D2 according to their probability of detecting treatment effects at the end of the trial, in either the PD-L1–positive or –negative group (P1) or in both groups (P2).
To generate patient data in our main simulation scenario, we fixed the prevalence of each PD-L1 class identical to that estimated in the analyses (Fig 1A). We assumed an exponential distribution for the event times in each PD-L1 class, with censoring after 30 months of follow-up. Finally, we selected OS parameters so that the treatment-specific RMSTs in each PD-L1 class were equal to those estimated from the reconstructed IPD for NSCLC (Fig 1B).
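One way to choose exponential parameters that reproduce a target RMST is sketched below in Python. For an exponential distribution with rate λ, the RMST at τ is (1 − exp(−λτ))/λ, which decreases as λ increases, so the matching rate can be found by bisection. The target RMST values, function names, and arm size are hypothetical and are not the estimates reported in Figure 1B.

```python
import math
import random


def exp_rmst(rate, tau):
    """RMST of an exponential distribution: (1 - exp(-rate * tau)) / rate."""
    return -math.expm1(-rate * tau) / rate


def rate_for_rmst(target, tau=24.0, lo=1e-9, hi=10.0):
    """Bisection for the rate whose RMST at tau equals target.

    Assumes exp_rmst(hi, tau) < target < tau.
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if exp_rmst(mid, tau) > target:
            lo = mid      # RMST too large, so the rate must increase
        else:
            hi = mid
    return (lo + hi) / 2


def simulate_arm(n, rate, rng, max_follow_up=30.0):
    """Return (time, event) pairs with censoring at max_follow_up months."""
    patients = []
    for _ in range(n):
        t = rng.expovariate(rate)
        patients.append((min(t, max_follow_up), 1 if t <= max_follow_up else 0))
    return patients


# Hypothetical 24-month RMST targets (months) for one PD-L1 class.
rng = random.Random(7)
control_rate = rate_for_rmst(12.0)
ici_rate = rate_for_rmst(14.0)
control_arm = simulate_arm(250, control_rate, rng)
ici_arm = simulate_arm(250, ici_rate, rng)
```

Repeating this for each PD-L1 class, with class membership drawn according to the assumed prevalence values, would yield one simulated trial.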

FIG 1. (A) Estimated prevalence (with 95% probability intervals [PIs]) of individuals with a specific PD-L1 expression level. (B) Estimates (with 95% PIs) of checkpoint inhibitor effects on overall survival (differences in restricted mean survival times [RMSTs]) in increasing classes of PD-L1 expression for non–small-cell lung cancer (NSCLC; blue dots) and other tumors (red triangles).
As a sensitivity analysis, we considered the following additional simulation scenarios. Scenario 1S was structured as described in the paragraph above, but we assumed that ICIs had no effect on OS. Specifically, we set the OS distribution in the ICI arm identical to that estimated in the chemotherapy arm (Fig 1B). In scenario 2S, we assumed that the OS distribution in both arms was different from that predicted by meta-analytic results. To do so, we fixed OS parameters so that the arm-specific RMSTs in each PD-L1 class were equal to those estimated from the reconstructed IPD for other tumors instead of NSCLC (Fig 1B). Finally, in scenario 3S, we changed the distribution of the PD-L1 scores in the trial. In particular, the prevalence of the PD-L1 classes in the enrolled population matched the estimates for the 22C3 assay (Fig 2B) and differed from the pooled predictions (Fig 1A).

FIG 2. (A) Prevalence estimates obtained from trials that used combined proportion score (CPS)– (blue) and tumor proportion score (TPS)–based (red) PD-L1 scores. (B) Prevalence estimates obtained separately from trials that used the following immunohistochemistry (IHC) assays: 22C3 (blue), 28-8 (green), 73-10 (red), and SP142 (orange). (C) Prevalence estimates for different PD-L1 expression levels obtained separately from trials of non–small-cell lung cancer (NSCLC; blue) or other tumors (red).
A total of 356 abstracts were identified from PubMed. Of these, we excluded one because we could not recover the full text of the article. For the remaining 355 abstracts, we obtained the associated article and online supplements. A total of 341 publications satisfied our exclusion criteria (the PRISMA flowchart of inclusion and exclusion criteria is provided in the Data Supplement). Hence, we included 14 publications in our analyses (Table 1).
Table 1 lists the characteristics of the 14 clinical trials included in the analyses. A total of five PD-L1 expression cut points (1%, 5%, 10%, 50%, and 80%) were used to define PD-L1–positive and –negative subgroups in different trials. In the reconstructed IPD, 6,199 patients died during a total of 8,641 person-years of follow-up.
The estimated prevalence of individuals with a specific PD-L1 expression level is shown in Figure 1A. The estimated distribution of PD-L1 expression is U shaped, with most patients presenting a low or high expression; 66% (95% PI, 65% to 68%) of patients in this population had a PD-L1 expression either in the 0%-1% or 1%-5% range, whereas 27% (95% PI, 26% to 28%) had an expression in the 50%-80% or 80%-100% range, and only an estimated 7% (95% PI, 5% to 8%) had an expression in the 5%-10% or 10%-50% range. As shown in the Data Supplement, we obtained similar results when we evaluated the sensitivity of our information-pooling approach with respect to different Bayesian prior distributions used to implement it. We also obtained similar results after excluding from the analysis the two trials that used placebo as comparator (Data Supplement).
In Figure 2A, we report the prevalence estimates obtained by repeating the analysis considering only trials that used TPS or CPS, separately. Overall, although they were lower in the PD-L1 0%-1% range and 80%-100% range, CPS-based estimates were similar to the corresponding TPS-based estimates, as well as to those reported in Figure 1A.
Figure 2B shows prevalence estimates obtained by repeating our analysis separately for all trials that used the same IHC assay type. Results are comparable with those in Figure 1A, although the 22C3 and SP142 IHC assays were associated with higher prevalence estimates in the 1%-5% and 5%-10% ranges.
Figure 2C shows prevalence estimates obtained restricting the analysis to either only NSCLC trials or only trials of other tumors. Again, results are comparable with those in Figure 1A, although PD-L1 levels are slightly less concentrated in the 0%-5% range for NSCLC.
We estimated the effects of ICIs on OS in increasing classes of PD-L1 expression and separately for NSCLC and other tumors (Fig 1B). ICIs appear to provide an OS benefit in both patients with NSCLC and patients with other types of tumors, regardless of their PD-L1 expression level, which importantly predicts the magnitude of OS benefits. Specifically, for NSCLC, ΔRMSTs ranged from 1.4 months (95% PI, 0.7 to 2.2 months) to 4.1 months (95% PI, 3.2 to 5.2 months) for patients in the 0%-1% and 80%-100% PD-L1 classes, respectively. For other tumors, ΔRMSTs ranged from 0.8 months (95% PI, −0.1 to 1.7 months) to 2.3 months (95% PI, 1.3 to 4.4 months) for patients in the 0%-1% and 80%-100% PD-L1 classes, respectively. We estimated a similar trend in treatment effect estimates along PD-L1 classes using an RMST cutoff of 30 months (Data Supplement). We also obtained nearly identical results using different prior distributions (Data Supplement) and when excluding the two placebo-controlled trials from the analysis (Data Supplement).
Study-specific estimates suggest a nondecreasing trend in OS benefits along increasing PD-L1 classes. Figure 3 shows the PD-L1– and study-specific estimates of the effects of ICIs on OS, obtained separately for NSCLC trials and trials of other tumors. Estimates were computed (using the survRM2 R package32) from the reconstructed IPD after weighting each patient by his or her estimated (posterior) probability of belonging to the considered PD-L1 classes. These probabilities were obtained from our model but without using survival data or imposing ICI effect estimates to be monotone across PD-L1 classes. Smoothed trends (Fig 3) in ΔRMST estimates support the hypothesis of monotonicity in ICI effects used in our main analysis (Fig 1B). In addition, we found little evidence of heterogeneity in weighted class-specific ΔRMST estimates across trials (Data Supplement).

FIG 3. Study-specific difference in restricted mean survival time (ΔRMST) estimates (in months). Estimates were obtained without assuming nondecreasing effects along PD-L1 classes. For each PD-L1 class, distinct symbols represent separate clinical trials (blue squares, non–small-cell lung cancer [NSCLC] trials; red bubbles, trials on other tumors). Symbol size increases with the precision (inverse of standard error [SE]) of the associated estimate. Solid lines represent smooth trends. ΔRMSTs were estimated using the rmst2 function from the survRM2 R library, modified to allow for subject-specific frequency weights. Smooth trends were obtained by LOESS curve fitting.56
For the design that was not tailored to pretrial meta-analytic summaries (D1), with 10,000 trial simulations, we estimated that the probability of detecting a treatment effect (ie, rejecting the null hypothesis that the ICI has no effects on OS) in PD-L1–positive or –negative patients was P1 = 80%. In P2 = 22% of the simulated trials, a positive treatment effect was detected in both groups. Instead, using a design (D2) that incorporates meta-analytic estimates on the distribution of PD-L1 expression and its association with ICI effects on OS, the estimated frequencies were P1 = 93% and P2 = 47%.
In the additional scenario 1S (null treatment effects), simulation-based estimates were P1 = 6% and P2 = 0.2% for D1 and P1 = 5% and P2 = 0.2% for D2. These results confirm that D2 controls the type I error rate at the prespecified 5% level.9 In scenario 2S (different OS distribution than predicted by meta-analytic results), we estimated P1 = 41% and P2 = 5% for D1 and P1 = 53% and P2 = 16% for D2. In scenario 3S (different distribution of PD-L1 classes), we estimated P1 = 84% and P2 = 18% for D1 and P1 = 94% and P2 = 55% for D2. These results highlight how the operating characteristics of design D2 depend on the level of agreement between the distributions of OS and PD-L1 expression and pretrial predictions based on meta-analytic summaries.
Several factors contributed to the current controversies over the use of PD-L1 expression as a predictive biomarker.23-25 Past trials of ICIs used potentially inappropriate statistical methods8,34 and were not powered to detect treatment effects within PD-L1–specific strata.23-25
Other investigations have quantified the prevalence of PD-L1 levels in clinical patient populations.35,36 Our PD-L1 prevalence estimates are largely compatible with those reported by the Blueprint project for patients with lung cancer (Data Supplement).11 These estimates suggest that a large proportion of patients have a PD-L1 level in the 0%-1%, 1%-5%, 50%-80%, and 80%-100% ranges, regardless of the use of TPS or CPS metrics or tumor types analyzed. However, these estimates may also reflect complexities of manual assessment of PD-L1 by pathologists using different assays.11
Recent meta-analyses evaluated PD-L1 expression scores as a predictor of ICI efficacy.5-7 Although these studies concluded that PD-L1 positivity, defined according to several cut points, can predict the survival benefits of ICIs, none quantified this relation over a range of PD-L1 scores. Our treatment effect estimates for PD-L1–specific subgroups indicate that most patients benefit from ICIs, including, as suggested by previous analyses,7,37,38 those typically defined as PD-L1 negative. Moreover, the magnitude of survival benefits seems to vary with the tumors’ PD-L1 expression level, especially for patients with lung cancer.
Currently, a common strategy to stratify patients is to classify them as PD-L1 negative or positive according to a single threshold for a specific measure of PD-L1 expression (eg, TPS or CPS). Although this practice has the advantage of being simple, it ignores the extent of heterogeneity in treatment response within the resulting subgroups.39 This variation may depend on the tumor type. Our results highlight how the effects of ICIs on OS may vary substantially with the PD-L1 expression levels in patients with lung cancer.
To increase the usefulness of stratification based on PD-L1 measures, a better strategy than simple dichotomization would be to consider multiple cut points to define several PD-L1 classes, each characterized internally by treatment effects of approximately homogeneous magnitude. These classes may need to depend on tumor type. Given the estimated U-shaped distribution of PD-L1 scores and increasing OS benefits along PD-L1 scores (Fig 1), our meta-analysis suggests two cut points (5% and 50%) may be adequate for NSCLC.
Meta-analyses of PD-L1–specific trial results can help design future studies to confirm the survival benefits of ICIs. Estimates such as ours can be used to predict the prevalence of different PD-L1–defined classes and the magnitude of subgroup-specific treatment effects.
Study limitations were as follows. First, individual-level data on survival duration and quantitative PD-L1 measurements collected in the included trials were not publicly available. Thus, we used trial publications to reconstruct, impute, and pool OS data for individual patients within specific PD-L1 expression categories. This allowed us to estimate the prevalence of six PD-L1 classes and their association with the OS benefits of ICIs.
Second, we assumed that the prevalence of different PD-L1 classes did not vary with the type of IHC assay or between lung cancer and other tumors. This assumption may not always be true.11,40 Our stratified analyses (Figs 2A and 2B) suggest potential differences in PD-L1 score distributions assessed by some assays, although prevalence estimates did not change substantially across TPS- or CPS-based assays or tumor types. Estimating the differences between assays in capturing tumor PD-L1 expression with our collection of data sets extracted from publications is complicated by potential confounding. Differences in PD-L1 distributions observed across assays may be confounded by the type of tumor or other patient characteristics not available in the data (eg, differences in sex or age distributions, previous treatments). To address this issue, a comparative approach based on actual patient-level data, accounting for all potential confounders, could be adopted to estimate differences and reproducibility across PD-L1 assays used in ICI trials.
Third, there were limited data to assess differences in the association between PD-L1 levels and treatment effects across most tumor histology types. More data are required to address this issue, and we cannot exclude differences in the relationship between PD-L1 levels and treatment effects across nonlung tumors.
Fourth, OS may have been impacted by patients randomly assigned to standard treatment who then received an ICI off-protocol or because of crossover between arms, especially as use of ICIs has become widespread over time. This may have affected our treatment effects estimates, but it most likely resulted in an underestimation of ICI effects on OS.
Fifth, we assumed that OS benefits could not decrease along the considered PD-L1 classes. Our study-specific estimates were compatible with and supported this assumption in a robustness analysis.
Sixth, in our literature search, we identified only one PD-L1–enriched trial that met our other inclusion criteria, the Keynote 10 trial (Data Supplement).41 Because it did not enroll patients with lower scores, Keynote 10 provides information only on the distribution of PD-L1 expression levels among patients with a score ≥ 1%. For this reason, we have excluded this study from our analyses.
Seventh, in our analyses, we did not adjust for differences in cytotoxic chemotherapy regimens between studies. Control chemotherapy regimens may have varied depending on each study’s considered line of therapy (first or second) or tumor type and stage. Despite these potential differences of the comparator arms, we found little heterogeneity in study-specific treatment effect estimates.
Finally, most clinical trials included in our study assessed patients’ tumor PD-L1 expression using not just fresh tissue samples, but also archival samples. Some studies of PD-L1 expression in fresh surgical tissue sections and matched lung biopsies from patients with NSCLC found that archival biopsy specimens tended to underestimate PD-L1 expression levels.42,43
In conclusion, our results support the notion that ICIs can prolong survival even among patients with PD-L1–negative tumors and that PD-L1 expression levels predict the magnitude of ICI effects. To better guide clinical decisions, future trials should move beyond the practice of dichotomizing the range of potential predictive markers based on seemingly arbitrary cut points, as is currently the norm for PD-L1 expression.
Presented at the 34th Annual Meeting of the Society for Immunotherapy of Cancer, National Harbor, MD, November 6-10, 2019.
L.T. and J.D.S. contributed equally to this work.
Conception and design: Andrea Arfè, Geoffrey Fell, Brian Alexander, Lorenzo Trippa, Jonathan D. Schoenfeld
Financial support: Brian Alexander
Administrative support: Brian Alexander
Collection and assembly of data: Andrea Arfè, Geoffrey Fell, Lorenzo Trippa
Data analysis and interpretation: All authors
Manuscript writing: All authors
Final approval of manuscript: All authors
Accountable for all aspects of the work: All authors
The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/po/author-center.
Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians (Open Payments).
Employment: Foundation Medicine
Leadership: Foundation Medicine
Stock and Other Ownership Interests: Roche
Research Funding: Eli Lilly (Inst), Puma (Inst), Celgene (Inst)
Open Payments Link: https://openpaymentsdata.cms.gov/physician/854258/summary
Consulting or Advisory Role: Genentech, Merck, Pfizer, Boehringer Ingelheim, Abbvie, AstraZeneca/MedImmune, Clovis Oncology, Nektar, Bristol Myers Squibb, ARIAD, Foundation Medicine, Syndax, Novartis, Blueprint Medicines, Maverick Therapeutics, Achilles Therapeutics, Neon Therapeutics, Hengrui Therapeutics, Gritstone Oncology
Research Funding: Genentech (Inst), Eli Lilly (Inst), AstraZeneca (Inst), Bristol Myers Squibb (Inst), Bristol Myers Squibb
Leadership: Immunitas
Stock and Other Ownership Interests: Immunitas
Honoraria: Perkin Elmer, Bristol Myers Squibb
Consulting or Advisory Role: Bristol Myers Squibb
Research Funding: Bristol Myers Squibb, Merck, Affimed Therapeutics, Kite Pharma
Patents, Royalties, Other Intellectual Property: Patent pending for use of anti-galectin 1 antibodies for diagnostic use
Travel, Accommodations, Expenses: Roche, Bristol Myers Squibb
Consulting or Advisory Role: Galera Therapeutics
Consulting or Advisory Role: Tilos Therapeutics, LEK, Catenion, ACI Clinical, Debiopharm Group, Immunitas
Research Funding: Bristol Myers Squibb, Merck, Regeneron
Expert Testimony: Heidell, Pittoni, Murphy, and Bach, Kline & Specter
No other potential conflicts of interest were reported.
ACKNOWLEDGMENT
We thank Tianqi Chen (Dana-Farber Cancer Institute) for her help in data collection and Donna Neuberg (Dana-Farber Cancer Institute) for her useful comments.
REFERENCES
1. Ribas A, Wolchok JD: Cancer immunotherapy using checkpoint blockade. Science 359:1350-1355, 2018
2. Gong J, Chehrazi-Raffle A, Reddi S, et al: Development of PD-1 and PD-L1 inhibitors as a form of cancer immunotherapy: A comprehensive review of registration trials and future considerations. J Immunother Cancer 6:8, 2018
3. Markham A, Duggan S: Cemiplimab: First global approval. Drugs 78:1841-1846, 2018
4. Yi M, Jiao D, Xu H, et al: Biomarkers for predicting efficacy of PD-1/PD-L1 inhibitors. Mol Cancer 17:129, 2018
5. Lu S, Stein JE, Rimm DL, et al: Comparison of biomarker modalities for predicting response to PD-1/PD-L1 checkpoint blockade: A systematic review and meta-analysis. JAMA Oncol 5:1195, 2019
6. Yu Y, Zeng D, Ou Q, et al: Association of survival and immune-related biomarkers with immunotherapy in patients with non-small cell lung cancer: A meta-analysis and individual patient-level analysis. JAMA Netw Open 2:e196879, 2019
7. Shen X, Zhao B: Efficacy of PD-1 or PD-L1 inhibitors and PD-L1 expression status in cancer: Meta-analysis. BMJ 362:k3529, 2018
8. Ferrara R, Pilotto S, Caccese M, et al: Do immune checkpoint inhibitors need new studies methodology? J Thorac Dis 10:S1564-S1580, 2018 (suppl 13)
9. Arfè A, Alexander BM, Trippa L: Optimality of testing procedures for survival data in the non-proportional hazards setting. Biometrics 10.1111/biom.13315 [epub ahead of print on June 14, 2020]
10. Liu W, Chen S, Yang W: The diverse cutoff of PD-L1 positivity and negativity in studies regarding head and neck squamous cell carcinoma. Oral Oncol 87:199-200, 2018
11. Tsao MS, Kerr KM, Kockx M, et al: PD-L1 immunohistochemistry comparability study in real-life clinical samples: Results of Blueprint phase 2 project. J Thorac Oncol 13:1302-1311, 2018
12. Büttner R, Gosney JR, Skov BG, et al: Programmed death-ligand 1 immunohistochemistry testing: A review of analytical assays and clinical implementation in non-small-cell lung cancer. J Clin Oncol 35:3867-3876, 2017
13. Mandrekar SJ, Sargent DJ: Clinical trial designs for predictive biomarker validation: Theoretical considerations and practical challenges. J Clin Oncol 27:4027-4034, 2009
14. Mandrekar SJ, Sargent DJ: Clinical trial designs for predictive biomarker validation: One size does not fit all. J Biopharm Stat 19:530-542, 2009
15. Royston P, Parmar MK: Restricted mean survival time: An alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Med Res Methodol 13:152, 2013
16. Pak K, Uno H, Kim DH, et al: Interpretability of cancer clinical trial results using restricted mean survival time as an alternative to the hazard ratio. JAMA Oncol 3:1692-1696, 2017
17. Moher D, Liberati A, Tetzlaff J, et al: Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. BMJ 339:b2535, 2009
18. Guyot P, Ades AE, Ouwens MJ, et al: Enhanced secondary analysis of survival data: Reconstructing the data from published Kaplan-Meier survival curves. BMC Med Res Methodol 12:9, 2012
19. Wan X, Peng L, Li Y: A review and comparison of methods for recreating individual patient data from published Kaplan-Meier survival curves for economic evaluations: A simulation study. PLoS One 10:e0121353, 2015
20. Anagnostou V, Yarchoan M, Hansen AR, et al: Immuno-oncology trial endpoints: Capturing clinically meaningful activity. Clin Cancer Res 23:4959-4969, 2017
21. Mushti SL, Mulkey F, Sridhara R: Evaluation of overall response rate and progression-free survival as potential surrogate endpoints for overall survival in immunotherapy trials. Clin Cancer Res 24:2268-2275, 2018
22. Udall M, Rizzo M, Kenny J, et al: PD-L1 diagnostic tests: A systematic literature review of scoring algorithms and test-validation metrics. Diagn Pathol 13:12, 2018
23. Kulangara K, Zhang N, Corigliano E, et al: Clinical utility of the combined positive score for programmed death ligand-1 expression and the approval of pembrolizumab for treatment of gastric cancer. Arch Pathol Lab Med 143:330-337, 2019
24. Liang F, Zhang S, Wang Q, et al: Treatment effects measured by restricted mean survival time in trials of immune checkpoint inhibitors for cancer. Ann Oncol 29:1320-1324, 2018
25. Trinquart L, Jacot J, Conner SC, et al: Comparison of treatment effects measured by the hazard ratio and by the ratio of restricted mean survival times in oncology randomized controlled trials. J Clin Oncol 34:1813-1819, 2016
26. Carlin JB: Tutorial in biostatistics. Meta-analysis: Formulating, evaluating, combining, and reporting by S-L. T. Normand, Statistics in Medicine, 18, 321-359 (1999). Stat Med 19:753-759, 2000
27. Crowther MJ, Riley RD, Staessen JA, et al: Individual patient data meta-analysis of survival data using Poisson regression models. BMC Med Res Methodol 12:34, 2012
28. Pardoll DM: The blockade of immune checkpoints in cancer immunotherapy. Nat Rev Cancer 12:252-264, 2012
29. Tanner MA, Wong WH: The calculation of posterior distributions by data augmentation. J Am Stat Assoc 82:528-540, 1987
30. Gelman A, Carlin J, Stern H, et al: Bayesian Data Analysis (ed 3). Boca Raton, FL, Taylor & Francis, 2013
31. Schafer JL: Multiple imputation: A primer. Stat Methods Med Res 8:3-15, 1999
32. Uno H, Tian L, Horiguchi M, et al: survRM2: Comparing restricted mean survival time. 2017. https://CRAN.R-project.org/package=survRM2
33. Holm S: A simple sequentially rejective multiple test procedure. Scand J Stat 6:65-70, 1979
34. Alexander BM, Schoenfeld JD, Trippa L: Hazards of hazard ratios: Deviations from model assumptions in immunotherapy. N Engl J Med 378:1158-1159, 2018
35. Chan AWH, Tong JHM, Kwan JSH, et al: Assessment of programmed cell death ligand-1 expression by 4 diagnostic assays and its clinicopathological correlation in a large cohort of surgical resected non-small cell lung carcinoma. Mod Pathol 31:1381-1390, 2018
36. Aggarwal C, Rodriguez Abreu D, Felip E, et al: Prevalence of PD-L1 expression in patients with non-small cell lung cancer screened for enrollment in KEYNOTE-001,-010, and-024. Ann Oncol 27:359-378, 2016
37. Brahmer J, Reckamp KL, Baas P, et al: Nivolumab versus docetaxel in advanced squamous-cell non-small-cell lung cancer. N Engl J Med 373:123-135, 2015
38. Borghaei H, Paz-Ares L, Horn L, et al: Nivolumab versus docetaxel in advanced nonsquamous non-small-cell lung cancer. N Engl J Med 373:1627-1639, 2015
39. Altman DG, Royston P: The cost of dichotomising continuous variables. BMJ 332:1080, 2006
40. Patel SP, Kurzrock R: PD-L1 expression as a predictive biomarker in cancer immunotherapy. Mol Cancer Ther 14:847-856, 2015
41. Herbst RS, Baas P, Kim DW, et al: Pembrolizumab versus docetaxel for previously treated, PD-L1-positive, advanced non-small-cell lung cancer (KEYNOTE-010): A randomised controlled trial. Lancet 387:1540-1550, 2016
42. Ilie M, Long-Mira E, Bence C, et al: Comparative study of the PD-L1 status between surgically resected specimens and matched biopsies of NSCLC patients reveal major discordances: A potential issue for anti-PD-L1 therapeutic strategies. Ann Oncol 27:147-153, 2016
43. Giunchi F, Degiovanni A, Daddi N, et al: Fading with time of PD-L1 immunoreactivity in non-small cells lung cancer tissues: A methodological study. Appl Immunohistochem Mol Morphol 26:489-494, 2018
44. Rittmeyer A, Barlesi F, Waterkamp D, et al: Atezolizumab versus docetaxel in patients with previously treated non-small-cell lung cancer (OAK): A phase 3, open-label, multicentre randomised controlled trial. Lancet 389:255-265, 2017
45. Fehrenbacher L, Spira A, Ballinger M, et al: Atezolizumab versus docetaxel for patients with previously treated non-small-cell lung cancer (POPLAR): A multicentre, open-label, phase 2 randomised controlled trial. Lancet 387:1837-1846, 2016
46. Barlesi F, Vansteenkiste J, Spigel D, et al: Avelumab versus docetaxel in patients with platinum-treated advanced non-small-cell lung cancer (JAVELIN Lung 200): An open-label, randomised, phase 3 study. Lancet Oncol 19:1468-1479, 2018
47. Gandhi L, Rodríguez-Abreu D, Gadgeel S, et al: Pembrolizumab plus chemotherapy in metastatic non-small-cell lung cancer. N Engl J Med 378:2078-2092, 2018
48. Shitara K, Özgüroğlu M, Bang YJ, et al: Pembrolizumab versus paclitaxel for previously treated, advanced gastric or gastro-oesophageal junction cancer (KEYNOTE-061): A randomised, open-label, controlled, phase 3 trial. Lancet 392:123-133, 2018
49. Kang YK, Boku N, Satoh T, et al: Nivolumab in patients with advanced gastric or gastro-oesophageal junction cancer refractory to, or intolerant of, at least two previous chemotherapy regimens (ONO-4538-12, ATTRACTION-2): A randomised, double-blind, placebo-controlled, phase 3 trial. Lancet 390:2461-2471, 2017
50. Bang YJ, Ruiz EY, Van Cutsem E, et al: Phase III, randomised trial of avelumab versus physician’s choice of chemotherapy as third-line treatment of patients with advanced gastric or gastro-oesophageal junction cancer: Primary analysis of JAVELIN Gastric 300. Ann Oncol 29:2052-2060, 2018
51. Powles T, Durán I, van der Heijden MS, et al: Atezolizumab versus chemotherapy in patients with platinum-treated locally advanced or metastatic urothelial carcinoma (IMvigor211): A multicentre, open-label, phase 3 randomised controlled trial. Lancet 391:748-757, 2018
52. Bellmunt J, de Wit R, Vaughn DJ, et al: Pembrolizumab as second-line therapy for advanced urothelial carcinoma. N Engl J Med 376:1015-1026, 2017
53. Ferris RL, Blumenschein G Jr, Fayette J, et al: Nivolumab vs investigator’s choice in recurrent or metastatic squamous cell carcinoma of the head and neck: 2-year long-term survival update of CheckMate 141 with analyses by tumor PD-L1 expression. Oral Oncol 81:45-51, 2018
54. Motzer RJ, Escudier B, McDermott DF, et al: Nivolumab versus everolimus in advanced renal-cell carcinoma. N Engl J Med 373:1803-1813, 2015
55. Robert C, Long GV, Brady B, et al: Nivolumab in previously untreated melanoma without BRAF mutation. N Engl J Med 372:320-330, 2015
56. Cleveland WS, Grosse E, Shyu WM: Local regression models, in Hastie TJ (ed): Statistical Models in S. New York, NY, Taylor & Francis, 2017, pp 309-377