
The New England Journal of Medicine

original article

A Genomic Strategy to Refine Prognosis in Early-Stage Non–Small-Cell Lung Cancer
Anil Potti, M.D., Sayan Mukherjee, Ph.D., Rebecca Petersen, M.D.,
Holly K. Dressman, Ph.D., Andrea Bild, Ph.D., Jason Koontz, M.D.,
Robert Kratzke, M.D., Mark A. Watson, M.D., Ph.D., Michael Kelley, M.D.,
Geoffrey S. Ginsburg, M.D., Ph.D., Mike West, Ph.D., David H. Harpole, Jr., M.D.,
and Joseph R. Nevins, Ph.D.

Abstract

Background
Clinical trials have indicated a benefit of adjuvant chemotherapy for patients with stage IB, II, or IIIA (but not stage IA) non–small-cell lung cancer (NSCLC). This classification scheme is probably an imprecise predictor of the prognosis of an individual patient. Indeed, approximately 25 percent of patients with stage IA disease have a recurrence after surgery, suggesting the need to identify patients in this subgroup for more effective therapy.

Methods
We identified gene-expression profiles that predicted the risk of recurrence in a cohort of 89 patients with early-stage NSCLC (the lung metagene model). We evaluated the predictor in two independent groups of 25 patients from the American College of Surgeons Oncology Group (ACOSOG) Z0030 study and 84 patients from the Cancer and Leukemia Group B (CALGB) 9761 study.

Results
The lung metagene model predicted recurrence for individual patients significantly better than did clinical prognostic factors and was consistent across all early stages of NSCLC. Applied to the cohorts from the ACOSOG Z0030 trial and the CALGB 9761 trial, the lung metagene model had an overall predictive accuracy of 72 percent and 79 percent, respectively. The predictor also identified a subgroup of patients with stage IA disease who were at high risk for recurrence and who might be best treated by adjuvant chemotherapy.

Conclusions
The lung metagene model provides a potential mechanism to refine the estimation of a patient’s risk of disease recurrence and, in principle, to alter decisions regarding the use of adjuvant chemotherapy in early-stage NSCLC.

From the Institute for Genome Sciences and Policy (A.P., S.M., H.K.D., A.B., J.K., G.S.G., M.W., J.R.N.) and the Institute of Statistics and Decision Sciences (S.M., M.W.), Duke University; and the Departments of Medicine (A.P., J.K., M.K., G.S.G.), Surgery (R.P., D.H.H.), and Molecular Genetics and Microbiology (H.K.D., A.B., J.R.N.), Duke University Medical Center, both in Durham, N.C.; the Department of Medicine, University of Minnesota, Minneapolis (R.K.); and the Department of Pathology and Immunology, Washington University School of Medicine, St. Louis (M.A.W.). Address reprint requests to Dr. Nevins at the Duke Institute for Genome Sciences and Policy, Duke University, 101 Science Dr., Box 3382, Durham, NC 27708, or at nevin001@mc.duke.edu.

N Engl J Med 2006;355:570-80.
Copyright © 2006 Massachusetts Medical Society.

570 n engl j med 355;6 www.nejm.org august 10, 2006

Downloaded from www.nejm.org by Amirah Shaleha on November 12, 2010. For personal use only. No other uses without permission.
Copyright © 2006 Massachusetts Medical Society. All rights reserved.

Lung cancer is the leading cause of death from cancer among both men and women in the United States, and non–small-cell lung cancer (NSCLC) accounts for almost 80 percent of such deaths.1,2 The clinical staging system has been the standard for determining lung-cancer prognosis.3-5 Although other clinical and biochemical markers have prognostic significance,6,7 none are more accurate than the clinicopathological stage.8

The current standard of treatment for patients with stage I NSCLC is surgical resection, despite the observation that nearly 30 to 35 percent will relapse after the initial surgery and thus have a poor prognosis,2,4 indicating that a subgroup of these patients might benefit from adjuvant chemotherapy. Similarly, as a population, patients with clinical stage IB, IIA or IIB, or IIIA NSCLC receive adjuvant chemotherapy,9-13 but some may receive potentially toxic chemotherapy unnecessarily. Thus, the ability to identify subgroups of patients more accurately may improve health outcomes across the spectrum of disease.

Previous studies have described the development of gene-expression, protein, and messenger RNA profiles that are associated in some cases with the outcome of lung cancer.14-24 However, the extent to which these profiles can be used to refine the clinical prognosis and the context in which improved prognostic capability could be used to alter a clinical treatment decision were not clear. Thus, we evaluated the use of gene-expression patterns as a means of stratifying risk and treatment in NSCLC.

Methods

Patients and Tumor Samples
We analyzed 198 tumor samples from three cohorts of patients with NSCLC. The training cohort consisted of 89 patients enrolled through the Duke Lung Cancer Prognostic Laboratory. The independent validation cohorts included patients in two multicenter cooperative group trials: 25 patients from the American College of Surgeons Oncology Group (ACOSOG) Z0030 study and 84 from the prospective Cancer and Leukemia Group B (CALGB) 9761 trial. Table 1 lists the clinical and demographic characteristics of the patients in each cohort and their tumors, and complete details are listed in Table 1 of the Supplementary Appendix, available with the full text of this article at www.nejm.org. All patients were enrolled according to protocols approved by the institutional review board of Duke University, after written informed consent had been obtained.

Histopathological Evaluation
For each cohort, a single pathologist reviewed all slides to determine whether they met the histopathological criteria for NSCLC of the World Health Organization, including the subtype of adenocarcinoma and the degrees of differentiation, lymphatic invasion, and vascular invasion. Only samples with a tumor-cell content of more than 50 percent were used in the analysis.

Gene-Expression Arrays
Total RNA was extracted from the tumor tissue with RNeasy Kits (Qiagen). The RNA quality was assessed with the use of a bioanalyzer (model 2100, Agilent). Hybridization targets were prepared from the total RNA according to standard Affymetrix protocols (described in detail in the Supplementary Appendix, along with the methods involved in the scanning of the arrays and the normalization of the resulting data). The microarray assays were carried out with Affymetrix GeneChips (U133 Plus2). All raw data and data transformed with the use of the robust multiarray average expression measure for the Duke, ACOSOG, and CALGB data sets are available elsewhere (accession number GSE3593 in the Gene Expression Omnibus database at www.ncbi.nlm.nih.gov/geo).

Statistical Analysis
We performed statistical analyses using the metagene construction and binary prediction tree analysis, as described previously25-29 and in detail in the Supplementary Appendix. The metagene for a cluster of genes is the dominant singular factor (principal component), as computed with the use of a singular value decomposition of gene-expression levels in the gene cluster in all samples. The metagene represents the dominant average pattern of expression of the gene cluster across the tumor samples.25

We then used the set of metagenes and the clinical variables previously shown to be of prognostic value (age, sex, tumor diameter, stage of


Table 1. Characteristics of Patients and Tumors.*

Characteristic                             Duke Training   ACOSOG Z0030        CALGB 9761
                                           Cohort          Validation Cohort   Validation Cohort
                                           (N = 89)        (N = 25)            (N = 84)
Age (yr)
  Median                                   67              67                  66
  Range                                    32–83           41–80               33–82
  Mean                                     65±10           65±5                65±10
Sex, no. (%)
  Male†                                    56 (63)         16 (64)             56 (67)
  Female                                   33 (37)         9 (36)              28 (33)
Race, no. (%)‡                                             -                   -
  White                                    78 (88)
  Black                                    8 (9)
  Other                                    3 (3)
Tobacco history, no. (%)                                   -                   -
  Cigarette smoking
    None                                   7 (8)
    ≤20 yr                                 10 (11)
    21–49 yr                               36 (40)
    ≥50 yr                                 34 (38)
  Heavy cigar smoking, ≥40 yr              2 (2)
Cancer-cell type, no. (%)
  Adenocarcinoma                           45 (51)         11 (44)             84 (100)
  Squamous                                 44 (49)         14 (56)             0
Stage, no. (%)
  IA                                       39 (44)         5 (20)              24 (29)
  IB                                       30 (34)         13 (52)             28 (33)
  IIA                                      4 (5)           2 (8)               7 (8)
  IIB                                      10 (11)         5 (20)              8 (10)
  IIIA                                     6 (7)           -                   9 (11)
  IIIB                                     -               -                   8 (10)
Tumor diameter (cm)                        4±2             3±2                 3±2
Tumor stage, no. (%)
  1                                        37 (42)         10 (40)             33 (39)
  2                                        49 (55)         14 (56)             38 (45)
  3                                        3 (3)           1 (4)               5 (6)
  4                                        -               -                   8 (10)
Nodal status, no. (%)
  Negative                                 66 (74)         19 (76)             60 (71)
  Positive                                 23 (26)         6 (24)              24 (29)
Accuracy of the lung metagene model (%)§   93              72                  79

* Plus–minus values are means ±SD. Percentages may not total 100, because of rounding. A hyphen indicates that no value was given.
† There were more men in the study cohorts, since one of the principal sites involved was a Veterans Affairs medical center.
‡ Race was self-reported.
§ For the ACOSOG and CALGB data sets, the accuracy was predicted with the use of the Duke cohort as the training cohort. Recurrence was defined with the use of a probability of 0.5 as a cutoff.
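As an illustration of the 0.5 cutoff rule stated in the footnote, and not the authors' code, the following Python sketch turns predicted probabilities of recurrence into high-risk or low-risk calls and summarizes how well they agree with observed outcomes. The function names and the example probabilities are hypothetical.

```python
def classify(probability, cutoff=0.5):
    """Return 1 (high risk) if probability exceeds the cutoff, else 0 (low risk)."""
    return 1 if probability > cutoff else 0


def confusion_counts(probabilities, outcomes, cutoff=0.5):
    """Count (tp, fp, tn, fn); outcomes are 1 (recurrence) or 0 (no recurrence)."""
    tp = fp = tn = fn = 0
    for probability, outcome in zip(probabilities, outcomes):
        predicted = classify(probability, cutoff)
        if predicted == 1 and outcome == 1:
            tp += 1
        elif predicted == 1 and outcome == 0:
            fp += 1
        elif predicted == 0 and outcome == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Proportion, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def summarize(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, and predictive values from the counts."""
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical predicted probabilities and observed outcomes (not study data).
probabilities = [0.9, 0.7, 0.5, 0.2, 0.6, 0.1]
outcomes = [1, 1, 1, 0, 0, 0]
counts = confusion_counts(probabilities, outcomes)
print(counts, summarize(*counts))
```

A probability of exactly 0.5 is assigned to the low-risk group, following the definition in the Methods. With the counts reported in the Results for the additional 15-patient squamous-cell cohort (all 5 recurrences and 7 of 10 non-recurrences predicted correctly), `summarize(tp=5, fp=3, tn=7, fn=0)` gives an accuracy of 12/15, in line with the reported 80 percent.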


disease, histologic subtype, and smoking history) in a binary classification-tree analysis to partition the samples recursively into smaller subgroups. Within these subgroups, predictions of recurrence (with 0 representing 5-year disease-free survival and 1 representing death within 2.5 years after the initial diagnosis of NSCLC) were made in terms of the estimated relative probabilities.26,30,31 In the analysis, many classification trees were computed, weighed, and integrated to provide overall risk predictions for each patient. The dominant metagenes that constituted the final model are described in the Supplementary Appendix.

To compare the prognostic efficacy of the metagene and clinical strategies, the clinical variables were treated as factors or principal components (similar to the treatment of metagenes in the lung metagene model) in a classification-tree analysis to generate a clinical model. The end result was the probability of recurrence, which represents the conglomerate prognostic value of the individual clinical variables. Using GraphPad software, we computed a C statistic (comparable to the area under the curve in a receiver-operating-characteristic curve in the prediction of binary outcomes) for the model that included just the clinical variables, a C statistic for a model that included just the metagenes, and a C statistic for a model that included both the clinical and genomic variables.

The accuracy of each model was defined with the use of a probability of 0.5 as a cutoff. An estimated probability of recurrence of more than 0.5 was classified as a high risk of recurrence; an estimated probability of recurrence of 0.5 or less was classified as a low risk of recurrence.

Simple univariate and multivariate logistic regressions for recurrence (with and without the metagene-based assessment of the risk) were also computed to assess the baseline prognostic value of each clinical variable (age, sex, tumor diameter, stage of disease, histologic subtype, and smoking history) in the cohorts. We also calculated the sensitivity, specificity, and positive and negative predictive values using a probability of recurrence of 0.5 as the cutoff value. Standard Kaplan–Meier survival curves were generated for the high-risk and low-risk groups of patients with the use of GraphPad software; the survival curves were compared with the use of the log-rank test. This test generates a two-tailed P value that tests the null hypothesis, which was that the survival curves were identical among the cohorts.

Results

Patient Characteristics
Table 1 lists the demographic and clinical characteristics of the patients (and their tumors) used to develop and test the prognostic model (Fig. 1).

Figure 1. Development and Validation of the Lung Metagene Model.
[Flow diagram. Duke training cohort (n=91): 89 patients analyzed with the lung metagene model, 2 excluded from analysis. Validation cohorts: 44 patients in the ACOSOG Z0030 trial assessed for eligibility, 25 analyzed with the lung metagene model, 19 excluded from analysis; 91 patients in the CALGB 9761 trial assessed for eligibility, 84 analyzed with the lung metagene model, 7 excluded from analysis.]
Samples were excluded from analyses on the basis of inadequate quality of the messenger RNA.

Use of Gene-Expression Profiles to Improve Prognosis
Lung cancer is a heterogeneous disease resulting from the acquisition of multiple somatic mutations; given this complexity, it would be surprising if a single gene-expression pattern could effectively describe and ultimately predict the clinical course of the disease for all patients. Recognizing the importance of addressing this complexity, we have previously described methods to integrate various forms of data, including clinical variables and multiple gene-expression profiles, to build robust predictive models for the individual patient.25,26 There are two critical components of this methodologic approach. First, we generated a collection of gene-expression profiles, termed “metagenes” (an example is given in Fig. 2A), that provide the basis for the predictive models. Second, we used clas-


Figure 2. Clinical and Genomic Prediction of the Risk of Recurrence of NSCLC.
[Figure. Panel A, Metagene 79: expression profile of a key metagene. Panel B, Classification Tree (n=89), summarized below. Panel C, Lung Metagene Model, and Panel D, Clinical Model: predicted probability of recurrence (0.00 to 1.00) plotted against number of samples, for patients without recurrence and patients with recurrence.]

Panel B node values, given as patients alive without recurrence / patients with disease recurrence or death, followed by the estimated probability of recurrence:
Root: 48/41, 0.46
  mgene79 >0.09: 18/0, 0.06
  mgene79 ≤0.09: 30/41, 0.61
    mgene40 >0.5: 1/19, 0.93
    mgene40 ≤0.5: 29/22, 0.47
      mgene5 >0.01: 18/2, 0.14
      mgene5 ≤0.01: 11/20, 0.67
        mgene45 ≤0.04: 6/0, 0.07
        mgene45 >0.04: 5/20, 0.82
          mgene31 ≤0.07: 1/18, 0.93
          mgene31 >0.09: 4/2, 0.44

Panel A shows an example of a key metagene profile used in the lung metagene model, with blue and red representing the two extremes of gene expression. Panel B shows an example of a classification tree illustrating the incorporation of metagenes (mgenes) at various levels to predict survival in the Duke training cohort. Numbers and lines in red indicate patients who survived less than 2.5 years after the initial diagnosis of NSCLC, and those in blue represent patients who survived more than 5 years after the initial diagnosis of NSCLC. The left-hand box at each node of the tree shows the number of patients and the total number of patients, and the right-hand box gives (as a percentage) the corresponding model-based point estimate of the probability of recurrence within 2.5 years based on the tree-model predictions for that group. The mean probabilities of recurrence predicted by the lung metagene model (Panel C) and by the clinical model generated with data on age, sex, tumor diameter, stage of disease, histologic subtype, and smoking history (Panel D) in the Duke cohort are also shown. For each patient, the probability of recurrent disease was predicted in an out-of-sample cross-validation based on a model completely regenerated from the data for the remaining patients. I bars represent 95 percent confidence intervals.
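To make the metagene shown in Panel A more concrete, here is a minimal Python sketch of extracting the dominant singular factor of a gene cluster, as defined in the Statistical Analysis section. It is not the authors' implementation. It assumes a genes-by-samples matrix, centers each gene across samples, uses power iteration as a stand-in for a full singular value decomposition, and fixes the arbitrary sign of the factor by a convention chosen here. The expression values are invented.

```python
import math


def center_rows(matrix):
    """Subtract each gene's mean across samples (an assumed preprocessing step)."""
    return [[x - sum(row) / len(row) for x in row] for row in matrix]


def dominant_factor(matrix, iterations=200):
    """Largest singular value and its right singular vector (one value per sample).

    Power iteration on X^T X is used here in place of a full SVD routine.
    """
    n_genes, n_samples = len(matrix), len(matrix[0])
    v = [1.0 + 0.01 * j for j in range(n_samples)]  # arbitrary non-constant start
    for _ in range(iterations):
        u = [sum(matrix[i][j] * v[j] for j in range(n_samples)) for i in range(n_genes)]
        w = [sum(matrix[i][j] * u[i] for i in range(n_genes)) for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("no variation found along the start vector")
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(n_samples)) for i in range(n_genes)]
    sigma = math.sqrt(sum(x * x for x in u))
    return sigma, v


def metagene(cluster):
    """Dominant expression pattern of a gene cluster across samples."""
    centered = center_rows(cluster)
    sigma, v = dominant_factor(centered)
    # Singular vectors are defined only up to sign; orient toward the mean profile.
    mean_profile = [sum(col) / len(col) for col in zip(*centered)]
    if sum(m * x for m, x in zip(mean_profile, v)) < 0:
        v = [-x for x in v]
    return sigma, v


# Invented cluster: 3 genes (rows) measured in 4 tumor samples (columns).
cluster = [
    [3.0, 3.0, 7.0, 7.0],
    [2.0, 2.0, 4.0, 4.0],
    [1.0, 1.0, 7.0, 7.0],
]
sigma, profile = metagene(cluster)
print(sigma, profile)
```

In this toy cluster all three genes are low in the first two samples and high in the last two, so the single factor separates the samples into those two groups. In practice a library SVD routine would normally be used instead of hand-written power iteration.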


sification- and regression-tree analysis to sample these metagenes and build prognostic models; this approach mines the collection of profiles to predict the clinical outcome best. An example tree (one of many generated in the analysis) is depicted in Figure 2B.

The predictive accuracy of each model was initially assessed with the use of leave-one-out cross-validation, in which the analysis is performed repeatedly, one sample is removed each time, and the probability of recurrence is predicted for that sample. Because the entire model-building process is repeated for each prediction, the reproducibility of the approach is also evaluated. As a measure of model stability, we generated multiple iterations of randomly split training and validation sets from within the Duke cohort; the resulting accuracy of prognostic capability exceeded 85 percent (data not shown).

The lung metagene model for the prediction of recurrence was superior to a predictive model generated with the same methods but that included clinical data alone (including age, sex, tumor diameter, stage of disease, histologic subtype, and smoking history). In the Duke cohort, the lung metagene model predicted disease recurrence with an overall accuracy of 93 percent (Fig. 2C). The model built with clinical data had an accuracy of only 64 percent (Fig. 2D). Inclusion of the clinical data with the genomic data did not further improve the accuracy of the prediction of recurrence over that of the genomic data alone.

The outperformance of the clinical model by the lung metagene model in identifying patients at risk for recurrence was also supported by the results of Kaplan–Meier analyses. The lung metagene model identified two distinct groups of patients with respect to survival (Fig. 3A). In contrast, the distinction was less clear for each of the models based on clinical predictions (one that combined the clinical variables in a manner similar to the lung metagene model, and another that was based on individual clinical prognostic factors [tumor diameter and stage of disease are shown]) (Fig. 3B). Univariate and multivariate analyses (with and without the genome-based assessment of the risk of recurrence) to assess the relative prognostic value of the individual clinical variables and the lung metagene model showed that the lung metagene model performed significantly better (P<0.001 by multivariate analysis) than stage of disease, tumor diameter, nodal status, age, sex, histologic subtype, or smoking history (Table 3 in the Supplementary Appendix).

Finally, further confirmation that the lung metagene model represents the biology of the tumor was provided by the finding that the metagenes with the greatest discriminatory capability in the model included genes that have previously been shown to have clinical relevance in NSCLC. In some instances, a metagene represented a single molecular process such as angiogenesis (metagene 19), which is a proven target for therapy in NSCLC. Other key metagenes, such as metagene 41, represented a combination of biologic processes (for example, the BRAF, phosphatidylinositol 3 kinase, TP53, and MYC signaling pathways).

Validation of the Metagene Prognostic Model

Validation across Early Stages and Subtypes of NSCLC
The samples used to devise the prognostic model represented both the major histologic subtypes of NSCLC (adenocarcinoma and squamous-cell carcinoma) and all the early stages of disease. To assess the general robustness of the prognostic model in the Duke cohort, we examined the predictions of risk as a function of these variables. The lung metagene model was consistently accurate across all the early stages of NSCLC (Fig. 1 in the Supplementary Appendix) and between the major histologic subtypes (Fig. 2 in the Supplementary Appendix), not only in the estimated risk of recurrence but also in the results of the Kaplan–Meier survival analysis for each stage or subtype.

Validation across Data from Two Multicenter Studies
For a new prognostic model that assesses the risk of recurrence to be used to inform the decision of whether to administer adjuvant chemotherapy, the model must be shown to be robust when applied to independent, heterogeneous populations of patients and conditions of sample acquisition. We therefore evaluated the ability of the metagene model generated from the Duke training cohort to predict the risk of recurrence by using samples from two multicenter, cooperative group studies (ACOSOG Z0030 and CALGB 9761) (Fig. 1). These sets of samples represented the full spectrum of clinical outcomes; the samples were not selected with respect to the duration of survival.
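The internal check that precedes this external validation, leave-one-out cross-validation in which the whole model is rebuilt for every held-out sample, can be sketched generically in Python. The `fit_midpoint` and `predict_midpoint` functions below are a deliberately trivial, hypothetical stand-in for the model-building step and are not the authors' classification-tree method; the scores and outcomes are invented.

```python
def leave_one_out(features, outcomes, fit, predict):
    """Predict each sample from a model rebuilt without that sample."""
    predictions = []
    for held_out in range(len(features)):
        train_x = features[:held_out] + features[held_out + 1:]
        train_y = outcomes[:held_out] + outcomes[held_out + 1:]
        model = fit(train_x, train_y)  # entire model-building step is repeated
        predictions.append(predict(model, features[held_out]))
    return predictions


def fit_midpoint(xs, ys):
    """Toy model: threshold halfway between the two class means.

    Assumes both outcome classes are present in the training data.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    return {
        "threshold": (mean_pos + mean_neg) / 2,
        "high_is_recurrence": mean_pos > mean_neg,
    }


def predict_midpoint(model, x):
    """Return 1 (recurrence) or 0 (no recurrence) for a single score."""
    above = x > model["threshold"]
    return 1 if above == model["high_is_recurrence"] else 0


# Invented one-dimensional scores and outcomes (1 = recurrence), not study data.
scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 0.35]
outcomes = [0, 0, 0, 1, 1, 1, 1]
predictions = leave_one_out(scores, outcomes, fit_midpoint, predict_midpoint)
accuracy = sum(p == y for p, y in zip(predictions, outcomes)) / len(outcomes)
print(predictions, accuracy)
```

External validation differs only in where the held-out samples come from: the model is fitted once on the training cohort and then applied unchanged to samples from a separate study.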


Figure 3. Kaplan–Meier Survival Estimates for the Duke Training Cohort.
[Figure. Kaplan–Meier curves of survival (%) against months. Panel A, Lung Metagene Model: low risk versus high risk of recurrence, P<0.001. Panel B, Clinical Model: low risk versus high risk of recurrence, P=0.04; tumor diameter ≤3.0 cm versus >3.0 cm, P=0.04; disease stage ≤1 versus >1, P=0.03.]
Estimates based on predictions from the lung metagene model demonstrate the value of that approach (Panel A). Panel B shows the estimates based on the clinical model of prognosis, as well as those based on individual clinical characteristics (here, tumor diameter and stage of disease). A high risk of recurrence was defined as a probability of recurrence of more than 0.5, and a low risk of recurrence was defined as a risk of 0.5 or less. P values were obtained with the use of a log-rank test. Tick marks indicate patients whose data were censored by the time of last follow-up or owing to death.

We analyzed 25 samples from the ACOSOG Z0030 trial to validate the performance of the predictive model of recurrence based on the Duke training cohort. As was the case with the Duke cohort, for the ACOSOG Z0030 cohort, univariate and multivariate analyses showed that the metagene model was a significantly more accurate predictor (P<0.001 by multivariate analysis) than stage of disease, tumor diameter, nodal status, age, sex, histologic subtype, or smoking history (Table 3 in the Supplementary Appendix). The accuracy of the prediction of recurrence in the ACOSOG samples was approximately 72 percent (sensitivity, 85 percent; specificity, 58 percent; positive predictive value, 69 percent; and negative predictive value, 78 percent) (Fig. 4A). The level of accuracy provides an assessment of the robustness of the risk predictions and is substantial, particularly given the heterogeneity of the cohort and the fact that the clinical outcomes among the patients in the ACOSOG cohort are prospective. The Kaplan–Meier survival curves, stratified according to the risk predictions based on the lung metagene model, provide strong evidence of the reliability of those predictions (Fig. 4A). In addition, a multivariate analysis showed that in this cohort, the patients predicted by the lung metagene model to have a probability of recurrence of more than 0.5 were more likely to have a recurrence than those with a predicted probability of recurrence of 0.5 or less (adjusted odds ratio, 35.9; 95 percent confidence interval, 2.8 to 46.3).

We analyzed 84 samples from the CALGB 9761 trial as a second independent validation cohort. The investigators applying the predictive model were unaware of the outcomes among these patients; thus, the genome-based predictions of re-


Figure 4. Independent Validation of the Lung Metagene Model with the Use of Data from the ACOSOG Z0030 Study and the CALGB 9761 Study.
[Figure. Panel A, ACOSOG Validation Cohort (N=25), and Panel B, CALGB Validation Cohort (N=84). In each panel, the left-hand plot shows the predicted probability of recurrence (0.00 to 1.00) against number of samples for patients without recurrence and patients with recurrence, and the right-hand plot shows Kaplan–Meier curves of survival (%) against months for the low-risk and high-risk groups, with P<0.001 in both panels.]
The lung metagene model was used to estimate the probabilities of recurrence for the ACOSOG samples (Panel A) and the CALGB samples (Panel B) and to estimate the Kaplan–Meier survival estimates according to the predicted risk of recurrence. For the CALGB cohort, investigators were unaware of the clinical outcomes, and the predictive results were submitted to the CALGB statistical center for the evaluation of performance. I bars represent 95 percent confidence intervals. A high risk of recurrence was defined as a risk of more than 0.5, and a low risk of recurrence was defined as a risk of 0.5 or less. P values were obtained with the use of a log-rank test. Tick marks indicate patients whose data were censored by the time of last follow-up or owing to death.
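For readers unfamiliar with how survival curves such as those in Figure 4 are built, the following Python sketch computes a Kaplan–Meier (product-limit) estimate from follow-up times and event indicators. It is a generic illustration with invented data, not the GraphPad procedure the authors used; the function name and data layout are assumptions made here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    events[i] is 1 if the event (death) was observed at times[i],
    and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients both leave the risk set
    return curve


# Invented follow-up in months (not study data); 0 marks a censored patient.
times = [6, 12, 12, 20, 30, 40]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for month, s in curve:
    print(month, round(s, 3))
```

Censored patients do not lower the curve themselves, but they shrink the number at risk, so later events produce larger steps.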

currence were submitted to a CALGB statistician for comparison with the true outcomes. Once again, univariate and multivariate analyses showed that the lung metagene model predicted outcome significantly better (P<0.001 by multivariate analysis) than the stage of disease, tumor diameter, nodal status, age, sex, histologic subtype, or smoking history (Table 3 in the Supplementary Appendix). The overall predictive accuracy of the model for the CALGB samples was 79 percent (sensitivity, 68 percent; specificity, 88 percent; positive predictive value, 79 percent; and negative predictive value, 80 percent) (Fig. 4B). Again, the Kaplan–Meier analysis showed a significant difference in the survival rates of patients with a probability of recurrence of greater than 0.5 as compared with 0.5 or less, according to the lung metagene model (Fig. 4B). Similar to the results seen for the Duke and ACOSOG data, the adjusted odds ratio for disease recurrence in the CALGB cohort was 16.6 (95 percent confidence interval, 4.4 to 62.8) when the model estimate for recurrence was greater than 0.5 (Table 3 in the Supplementary Appendix).

We also applied the lung metagene model to another cohort of 15 patients with surgically resected stage I squamous-cell lung cancer. Using the lung metagene model, we were able to predict the outcome accurately in all 5 patients with recurrence and in 7 of 10 patients without recur-


rence, for an overall accuracy of 80 percent (Fig. 3 in the Supplementary Appendix).

Finally, to evaluate the extent to which the metagene model could increase the ability of clinicians to estimate prognosis, we computed a C statistic as a measure of the capacity of the clinical or genomic information to identify patients according to the risk of recurrence. For the ACOSOG cohort, the C statistic based on clinical variables alone was 0.67; this value was increased to 0.84 by the inclusion of genomic data. For the CALGB cohort, inclusion of the genomic data increased the value from 0.73 to 0.87. Clearly, the genomic data transformed a limited clinical-based prognosis to one with substantial capacity to identify patients who were likely to have disease recurrence.

Application of the Refined Prognosis
Previous studies have shown that 25 percent of patients with stage IA NSCLC will have disease recurrence within five years. Thus, some patients with stage IA NSCLC might be more appropriately categorized as being at higher risk than others and might be candidates for adjuvant chemotherapy. We therefore focused on the 68 patients from the Duke, ACOSOG, and CALGB cohorts who were classified clinically as having stage IA disease. Kaplan–Meier survival curves were generated for the group as a whole, as well as for the subgroups predicted to be at high or low risk for recurrence by the lung metagene model. Although the survival rate for the group was approximately 70 percent at four years, the survival rate for those predicted to be at high risk was less than 10 percent (Fig. 5A), thus identifying the subgroup of patients with stage IA NSCLC at risk for recurrence.

Discussion

Although gene-expression profiles that can classify patients with cancer according to their risk of recurrence have been described in many instances, the prognostic tool we devised could be used to change a clinical decision. In particular, the guidelines for the treatment of patients with stage I NSCLC provide an opportunity to use an improved prognostic model to refine the currently imprecise assessment of risk and the decision re-

Figure 5. Application of the Lung Metagene Model to Refine the Assessment of Risk and Guide the Use of Adjuvant Chemotherapy in Stage IA NSCLC.
[Figure. Panel A: Kaplan–Meier curves of survival (%) against months for all patients with stage IA disease (n=68), those with stage IA disease and a predicted low risk of recurrence (n=47), and those with stage IA disease and a predicted high risk of recurrence (n=21); P<0.001. Panel B: trial schema. Patients with stage IA NSCLC undergo surgery and gene-expression analysis, followed by application of the lung metagene model. Patients at low risk of recurrence are assigned to observation; patients at high risk of recurrence undergo randomization to observation or chemotherapy.]
Panel A shows the Kaplan–Meier survival estimates for a group of patients with stage IA disease from the Duke, ACOSOG, and CALGB cohorts and the subgroups predicted to have either a high probability (>0.5) or a low probability (≤0.5) of recurrence. P values were obtained with the use of a log-rank test. Tick marks indicate patients whose data were censored by the time of last follow-up or owing to death. Panel B illustrates the possible design of a planned prospective, phase 3 clinical trial involving patients with stage IA NSCLC to evaluate the performance of the metagene model.
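The log-rank comparison behind P values like the one in Panel A can be sketched as follows. This is a textbook two-group log-rank test written in plain Python with invented data; it is not the GraphPad routine the authors used, and the function name and inputs are assumptions made for illustration. The two-sided P value uses the fact that, for a chi-square statistic with one degree of freedom, the upper-tail probability equals erfc(sqrt(x / 2)).

```python
import math


def log_rank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, two_sided_p).

    Times are follow-up durations; events are 1 (death observed) or 0 (censored).
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # still at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # still at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # chi-square, 1 degree of freedom
    return chi_square, p_value


# Invented follow-up in months (not study data).
high_risk_times, high_risk_events = [2, 3, 4, 5], [1, 1, 1, 1]
low_risk_times, low_risk_events = [20, 25, 30, 35], [0, 0, 0, 0]
chi_square, p_value = log_rank(
    high_risk_times, high_risk_events, low_risk_times, low_risk_events
)
print(round(chi_square, 3), round(p_value, 4))
```

When the two groups have identical follow-up and events, the observed and expected counts agree at every event time, the statistic is 0, and the P value is 1, which corresponds to the null hypothesis of identical survival curves described in the Methods.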


garding whom to treat, and thus potentially lead- first step in the use of genomic tools as a strategy
ing to more personalized cancer treatment. In to refine the prognosis and improve the selection of
this case, the refinement of prognosis with the patients appropriate for adjuvant chemotherapy.
use of the metagene model provides the opportu- Drs. Nevins, West, and Dressman report holding equity in
nity for a prospective, randomized, phase 3 clin- Expression Analysis, a DNA microarray service provider estab-
ical trial that would evaluate the benefit of the lished by Duke University. Drs. Nevins, West, Dressman, and
Ginsburg report having served on the advisory board of Expres-
identification of a subgroup of patients with sion Analysis. Dr. Dressman reports having served as a paid
stage IA disease estimated to be at high risk for consultant to Expression Analysis, which carried out the micro-
recurrence (Fig. 5B). Patients initially classified as array assays with Affymetrix GeneChips (U133 Plus2). Dr. Har-
pole reports having served on the advisory board of Genentech
having clinical stage IA disease would undergo (OSI Pharmaceuticals). No other potential conflict of interest
surgery, and the metagene model would then be relevant to this article was reported.
applied to identify the patients predicted to be at We are indebted to the participants of the ACOSOG Z0030
high risk for recurrence. Patients at high risk and CALGB 9761 studies; to Mark Allen, principal investigator
of the ACOSOG Z0030 study; to Michael Maddaus, principal in-
would then be randomly assigned to observation vestigator of the CALGB 9761 study; to Xiaofei Wang, statistician
(the current standard of care for stage IA disease) for the CALGB 9761 study, who was also responsible for the
or adjuvant chemotherapy, in order to evaluate blinded validation of the model predictions; to David Beer, at the
University of Michigan, for the array data on the CALGB 9761
the extent to which the use of genomic reclassi- data set; and to Kaye Culler for her assistance with the prepara-
fication improves survival. Our study is a critical tion of the manuscript.
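The stratification described in the legend to Figure 5 (splitting patients with stage IA disease at a predicted recurrence probability of 0.5 and estimating survival in each subgroup) can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names `split_by_risk` and `kaplan_meier` are hypothetical, the sample values are invented for illustration only, and the log-rank comparison used in the study is not implemented here.

```python
from typing import List, Sequence, Tuple


def split_by_risk(probabilities: Sequence[float],
                  threshold: float = 0.5) -> Tuple[List[int], List[int]]:
    """Return (high, low) index lists: high if p > threshold, low if p <= threshold."""
    high = [i for i, p in enumerate(probabilities) if p > threshold]
    low = [i for i, p in enumerate(probabilities) if p <= threshold]
    return high, low


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate as a list of (time, survival probability) steps.

    events[i] is True when the event was observed at times[i] and False
    when the patient's data were censored at times[i].
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        observed = 0   # events observed exactly at time t
        leaving = 0    # patients leaving the risk set at time t
        while i < len(order) and times[order[i]] == t:
            observed += 1 if events[order[i]] else 0
            leaving += 1
            i += 1
        if observed:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Invented illustrative data, not taken from the study cohorts.
    predicted = [0.9, 0.2, 0.7, 0.5, 0.1]
    months = [5.0, 10.0, 10.0, 20.0, 30.0]
    event_observed = [True, True, False, True, False]

    high, low = split_by_risk(predicted)
    for label, idx in (("high risk", high), ("low risk", low)):
        curve = kaplan_meier([months[i] for i in idx],
                             [event_observed[i] for i in idx])
        print(label, curve)
```

Patients whose data are censored leave the risk set without lowering the survival estimate, which is why censoring is tracked separately from observed events. A probability of exactly 0.5 falls in the low-risk group, matching the (>0.5) and (≤0.5) definitions in the figure legend.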


