Article 5
Cancer Research

Deep Learning Predicts Lung Cancer Treatment Response from Serial Medical Imaging

Yiwen Xu1, Ahmed Hosny1,2, Roman Zeleznik1,2, Chintan Parmar1, Thibaud Coroller1, Idalid Franco1, Raymond H. Mak1, and Hugo J.W.L. Aerts1,2,3
Abstract
Purpose: Tumors are continuously evolving biological systems, and medical imaging is uniquely positioned to monitor changes throughout treatment. Although qualitatively tracking lesions over space and time may be trivial, the development […]

[…] patients with NSCLC treated with chemoradiation and surgery (178 scans).

Results: Deep learning models using time series scans were significantly predictive of survival and cancer-specific outcomes […]
Introduction

Lung cancer is one of the most common cancers worldwide and the highest contributor to cancer death in both the developed and developing worlds (1). Among these patients, most are diagnosed with non–small cell lung cancer (NSCLC) and have a 5-year survival rate of only 18% (1, 2). Despite recent advancements in medicine spurring a large increase in overall cancer survival rates, this improvement is less consequential in lung cancer, as most symptomatic and diagnosed patients have late-stage disease (3). These late-stage lesions are often treated with nonsurgical approaches, including radiation, chemotherapy, targeted, or immunotherapies. This signals the dire need for monitoring therapy response using follow-up imaging and tracking radiographic changes of tumors over time (4). Clinical response assessment criteria, such as RECIST (5), analyze time series data using simple size-based measures such as the axial diameter of lesions.

Artificial intelligence (AI) allows for a quantitative, instead of a qualitative, assessment of radiographic tumor characteristics, a process also referred to as "radiomics" (6). Indeed, several studies have demonstrated the ability to noninvasively describe tumor phenotypes with more predictive power than routine clinical measures (7–10). Traditional machine learning techniques involved the derivation of engineered features for the quantitative description of images, with success in detecting biomarkers for response assessment and clinical outcome prediction (11–15). Recent advancements in deep learning (6) have demonstrated successful applications in image analysis without human feature definition (16). The use of convolutional neural networks (CNN)
allows for the automated extraction of imaging features and the identification of nonlinear relationships in complex data. CNNs trained on millions of photographic images can be applied to medical images through transfer learning (17). This has been demonstrated in cancer research with regard to tumor detection and staging (18). AI developments can be clinically applicable to enhance patient care by providing accurate and efficient decision support (6, 11).

The majority of quantitative imaging studies have focused on the development of imaging biomarkers for a single timepoint (19, 20). However, the tumor is a dynamic biological system with vascular and stem cell contributions, which may respond to treatment; thus, the phenotype may not be completely captured at a single timepoint (21, 22). It may be beneficial to incorporate posttreatment […]

1 Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts. 2 Radiology and Nuclear Medicine, GROW, Maastricht University Medical Centre, Maastricht, the Netherlands. 3 Department of Radiology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts.

Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).

Corresponding Author: Hugo J.W.L. Aerts, Harvard–Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02115. Phone: 617-525-7156; Fax: 617-525-7156; E-mail: hugo_aerts@dfci.harvard.edu

Clin Cancer Res 2019;25:3266–75
doi: 10.1158/1078-0432.CCR-18-2495
©2019 American Association for Cancer Research.
Figure 1.
Serial patient scans. Representative CT images of patients with stage III nonsurgical NSCLC before radiation […]
apart; the center slice is on the same axial slice as the seed point. 5 mm was the maximum slice thickness of the CT images. A transfer learning approach was applied using the pretrained ResNet CNN that was trained on natural RGB images. The three axial slices were used as input to the CNN. Using three 2D slices gives the network information to learn from while keeping the number of features lower than a full 3D approach, reducing GPU memory usage and training time and limiting overfitting. Image augmentation was performed on the training data and involved image flipping, translation, rotation, and deformation, which is conventional good practice and has been shown to improve performance (26). The same augmentation was performed on the pretreatment and follow-up images, such that the network generates a mapping for the entire input series of images. The deformation was on the order of millimeters and did not noticeably change the morphology of the tumor or surrounding tissues.

Neural network structure

The network structure was implemented in Python, using Keras with a TensorFlow backend (Python 2.7, Keras 2.0.8, TensorFlow 1.3.0). The proposed network has a base ResNet CNN trained on the ImageNet database containing over 14 million natural images (Fig. 3). One CNN was defined for each timepoint input, such that an input with scans at three timepoints would involve input into three CNNs. The output of the pretrained network model was then input into recurrent layers with gated recurrent units (GRU), which take the time domain into account. To ensure the network was able to handle missing scans (27, 28),
[Figure 2 schematic: ImageNet (n = 1.4m) → ResNet + RNN; dataset A, chemoRT (n = 107 training, n = 72 test); dataset B, chemoRT + surgery (n = 89); compared against volume, diameter, and clinical models.]
Figure 2.
Analysis design. Depiction of the deep learning–based workflow with two datasets and additional comparative models. Dataset A included patients treated with
chemotherapy and definitive radiation therapy, and was used to train and fine-tune a ResNet CNN combined with an RNN for predictions of survival. A separate
test set from this cohort was used to assess performance and compared with the performance of radiographic and clinical features. Dataset B included patients
treated with chemotherapy and surgery. This cohort was used as an additional test set to predict pathologic response, and the model predictions were compared
with the change in volume.
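The dataset division underlying this design — a held-out test split of dataset A plus repeated 3:2 training:tuning resamples for Monte Carlo cross-validation, as detailed under Transfer learning — can be sketched in plain Python. The function below is a hypothetical illustration, not the authors' code; the split fractions are parameters:

```python
import random

def monte_carlo_splits(patient_ids, n_splits=10, test_frac=1/3, tune_frac=2/5, seed=0):
    """Hold out a test set once, then repeatedly re-split the remaining
    pool into training and tuning subsets (Monte Carlo cross-validation)."""
    rng = random.Random(seed)
    ids = list(patient_ids)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_frac)
    test, pool = ids[:n_test], ids[n_test:]
    rounds = []
    for _ in range(n_splits):
        resample = pool[:]
        rng.shuffle(resample)
        n_tune = round(len(resample) * tune_frac)
        rounds.append({"tune": resample[:n_tune], "train": resample[n_tune:]})
    return test, rounds

# 179 hypothetical patient IDs, standing in for dataset A:
test, rounds = monte_carlo_splits(range(179), n_splits=10)
```

Each of the 10 rounds trains on its "train" subset and selects models on its "tune" subset; the test patients never enter training, mirroring the independent evaluation described in the text.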
3268 Clin Cancer Res; 25(11) June 1, 2019 Clinical Cancer Research
Longitudinal Deep Learning to Track Treatment Response
Figure 3.
[Network schematic: per-timepoint inputs (pretreatment and follow-ups 1–3, with missed scans masked) feed pretrained CNNs, followed by an RNN, average pooling, fully connected layers, and a softmax output.]
RNN algorithms were used, which allowed for the amalgamation of several timepoints and the ability to learn from samples with missed patient scans at certain timepoints. The output of the pretrained network was masked to skip the timepoint when a scan was not available. Averaging and fully connected layers are then applied after the GRU, with batch normalization (29) and dropout (30) after each fully connected layer to prevent overfitting. The final softmax layer allows for a binary classification output. To test a model without the input of follow-up scans, the pretreatment image alone was input into the proposed model, with the recurrent and average pooling layers replaced by a fully connected layer, as there was only one input timepoint.

Transfer learning

Weights trained with ImageNet (26, 31), a set of 14 million 2D color images, were used for the ResNet (31) CNN, and the additional weights following the CNN were randomized at initialization for transfer learning. Dataset A was randomly split 2:1 into training/tuning and test sets. Training was performed with Monte Carlo cross-validation, using 10 different splits (a further 3:2 split of training:tuning) on 107 patients with class weight balancing for up to 300 epochs. The model was evaluated on an independent test set of 72 patients who were not used in the training process. The surviving fractions for the training/tuning (n = 107) and test (n = 72) sets were comparable (Supplementary Table S1). Only the pretreatment image was input into the proposed model, and the recurrent and average pooling layers were replaced with a fully connected layer.

Statistical analysis

Statistical analyses were performed in Python version 2.7. All predictions were evaluated on the independent test set of dataset A for survival and for prognostic factors after definitive radiation therapy. The clinical endpoints included distant metastasis, progression, and locoregional recurrence, as well as overall survival at 1 and 2 years following radiation therapy. The analyses were compared with a random forest clinical model with features of stage, gender, age, tumor grade, performance, smoking status, and clinical tumor size (primary maximum axial diameter).

Statistical differences between positive and negative survival groups in dataset A were assessed using the area under the receiver operating characteristic curve (AUC) and the Wilcoxon rank-sum test (also known as the Mann–Whitney U test). Prognostic and survival estimates were calculated using the Kaplan–Meier method between low and high mortality risk groups, stratified at the median prediction probability of the training set and controlled using a log-rank test. Hazard ratios were calculated through the Cox proportional-hazards model.

An additional test was performed on dataset B, the trimodality cohort, using the 1-year survival model from the definitive radiation cohort with two timepoints. Survival predictions were made from the 1-year survival model trained on dataset A. The model predictions were used to stratify the trimodality patients based on survival and tumor response to radiation therapy prior to surgery. The groups were assessed using their respective AUCs and were tested with the Wilcoxon rank-sum test. This was compared with the volume change after radiation therapy and a random forest clinical model with the same features used for dataset A.

Results

Clinical characteristics

To evaluate the value of deep learning–based biomarkers to predict overall survival using patient images prior to and post radiation therapy (Fig. 1), a total of 268 patients with stage III NSCLC with 739 CT scans were analyzed (Fig. 2). Dataset A consisted of 179 patients treated with definitive radiation therapy and was used as a cohort to train and test deep learning biomarkers (Supplementary Table S2). There was no significant difference between the patient parameters in the training and test sets of dataset A (P > 0.1; group summary values in Supplementary Table S2). The patients were 52.8% female (median age of 63 years; age range 32–93 years) and were predominantly diagnosed as having stage IIIA (58.9%) NSCLC at the time of diagnosis, with 58.1% in the adenocarcinoma histology category. The median radiation
Figure 4.
Performance of deep learning biomarkers on validation datasets. The deep learning models were evaluated on an independent test set for performance. The 2-year overall survival Kaplan–Meier curves were performed with median stratification (derived from the training set) of the low and high […] [Panels C and D: pretreatment + follow-up 1–2 and pretreatment + follow-up 1–3; survival probability versus time (months), stratified at >median vs. ≤median.]
dose was 66 Gy for the definitive radiation cohort (range 45–70 Gy; median follow-up of 31.4 months). Another cohort of 89 patients treated with trimodality therapy served as an external test set (dataset B). The median radiation dose for the trimodality patients was lower, at 54 Gy (range 50–70 Gy; median follow-up of 37.1 months).

Deep learning–based prognostic biomarker development and evaluation

To develop deep learning–based biomarkers for overall survival, distant metastasis, disease progression, and locoregional recurrence, training was performed using the discovery part of dataset A (Fig. 2). To leverage the information from millions of photographic images, the ResNet CNN model was pretrained on ImageNet and then applied to our dataset using transfer learning. The CNN-extracted features of the CT images at each timepoint were fed into a recurrent network for longitudinal analysis. We observed that the baseline model with only pretreatment scans demonstrated low performance for predicting 2-year overall survival (AUC = 0.58; P = 0.3; Wilcoxon test). Improved performance in predicting 2-year overall survival was observed with the addition of each follow-up scan: at 1 month (AUC = 0.64, P = 0.04), 3 months (AUC = 0.69, P = 0.007), and 6 months (AUC = 0.74, P = 0.001; Supplementary Fig. S2). We also observed a similar trend in performance for other clinical endpoints, that is, 1-year survival, metastasis, progression, and locoregional recurrence-free survival (Supplementary Fig. S3). A clinical model incorporating stage, gender, age, tumor grade, performance, smoking status, and clinical tumor size did not yield a statistically significant prediction of survival (2-year survival AUC = 0.51, P = 0.93) or treatment response (Supplementary Table S3).

Further survival analyses were performed with Kaplan–Meier estimates for low and high mortality risk groups based on median stratification of patient prediction scores (Fig. 4). The models for 2-year overall survival yielded significant differences between the groups with two (P = 0.023, log-rank test) and three (P = 0.027, log-rank test) follow-up scans. Comparable results were found for the following predictions, with their respective hazard ratios: 1-year overall survival (6.16; 95% CI, 2.17–17.44; P = 0.0004), distant metastasis-free (3.99; 95% CI, 1.31–12.13; P = 0.01), progression-free (3.20; 95% CI, 1.16–8.87; P = 0.02), and no locoregional recurrence (2.74; 95% CI, 1.18–6.34; P = 0.02), each with significant differences at three follow-up timepoint scans.

Predicting pathologic response

As an additional independent validation and to evaluate the relationship between delta imaging analysis and pathologic response, the trimodality pre-radiation therapy and post-radiation therapy (prior to surgery) scans were input into the neural network model trained on dataset A. First, for survival prediction evaluation, the model was tested on dataset B. To match the number of input timepoints, the 1-year survival model with the pretreatment and first follow-up at 1 month was used. The model significantly predicted distant metastasis, progression, and locoregional recurrence (Supplementary Table S4). Although, for […]
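The median-stratified Kaplan–Meier comparison used throughout these results can be illustrated with a minimal, self-contained sketch. The helper names and toy data below are hypothetical, not the authors' code; the study additionally used log-rank tests and Cox proportional-hazards models, which are not reproduced here:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve S(t) from follow-up times (months)
    and event indicators (1 = death observed, 0 = censored)."""
    # Distinct observed event times, in increasing order
    event_times = sorted({t for t, e in zip(times, events) if e})
    curve, s = [], 1.0
    for t in event_times:
        at_risk = sum(ti >= t for ti in times)
        deaths = sum(ti == t and ei for ti, ei in zip(times, events))
        s *= 1.0 - deaths / at_risk          # product-limit update
        curve.append((t, s))
    return curve

def median_stratify(scores, threshold):
    """Assign patients to the high mortality-risk group when their
    prediction probability exceeds the training-set median."""
    return [s > threshold for s in scores]

# Toy illustration (hypothetical follow-up data, not the study cohort):
times = [6, 12, 18, 24, 30, 36, 42, 48]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
groups = median_stratify([0.2, 0.8, 0.5], threshold=0.5)
```

In practice, one such curve would be estimated per risk group and the two curves compared with a log-rank test, as described under Statistical analysis.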
[…] the scan performed, and thus decrease the ability to predict survival.

Survival is associated with tumor pathologic response (34, 35). Thus, we tested the relationship between the probabilities of the survival network model on similar patients with stage III NSCLC who were in different treatment cohorts (definitive radiation therapy and trimodality). Dataset B included the follow-up timepoint after radiation therapy and prior to surgery, for the prediction of response and for further validation of our model. This also serves as a test for generalizability in locally advanced NSCLC patients treated with different standard-of-care treatment protocols. To match the number of input timepoints, the 1-year overall survival model with the pretreatment and first follow-up at 1 month was used. The model was able to separate the pathologic responders from those with gross residual disease in the trimodality cohort. This was the case even though the model development was completely blinded to this cohort.

[…] outcome the network was trained to predict. The use of transfer learning has demonstrated its effectiveness in improving the performance of lung nodule detection in CT images (18). Our study contained a sample size not on the order of studies based on photographic images, but the current performance was made possible with the incorporation of networks pretrained on ImageNet. Transfer learning may also be used to test the feasibility of clinically applicable utilities prior to the collection of a full cohort for analysis.

The incorporation of follow-up timepoints to capture dynamic tumor changes was key to the prediction of survival and tumor prognosis. This was feasible with the use of RNNs, which allowed for the amalgamation of several timepoints and the ability to learn from samples with missed patient scans at a certain timepoint, which is inevitable in retrospective studies such as this one. Although this type of network has not been applied to medical images, similar network architectures have […]
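The skip-on-missing behavior described in the Methods (the pretrained network's output is masked so the recurrent state is carried forward unchanged when a scan is absent) can be made concrete with a small NumPy sketch. This is illustrative only: random weights and toy dimensions stand in for the trained Keras model, and `run_masked_gru` is a hypothetical helper, not the authors' code:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(h, x, params):
    """One gated recurrent unit (GRU) update (Cho et al., ref. 45)."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(x @ Wz + h @ Uz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde

def run_masked_gru(features, mask, params, hidden=4):
    """Fold per-timepoint CNN feature vectors into one hidden state,
    carrying the state forward unchanged wherever mask is 0
    (i.e., the scan for that timepoint is missing)."""
    h = np.zeros(hidden)
    for x, m in zip(features, mask):
        h = gru_step(h, x, params) if m else h  # skip masked timepoint
    return h

# Toy dimensions: 3 CNN features per timepoint, hidden size 4.
rng = np.random.default_rng(0)
d, hidden = 3, 4
params = [rng.normal(size=(d, hidden)) if i % 2 == 0
          else rng.normal(size=(hidden, hidden)) for i in range(6)]
seq = rng.normal(size=(4, d))                  # pretreatment + 3 follow-ups

h_full = run_masked_gru(seq, [1, 1, 1, 1], params)
h_masked = run_masked_gru(seq, [1, 0, 1, 1], params)   # follow-up 1 missing
h_skip = run_masked_gru(seq[[0, 2, 3]], [1, 1, 1], params)
# h_masked and h_skip are identical: masking a timepoint is equivalent
# to dropping it from the input sequence.
```

This equivalence is what lets one trained model accept patients with different numbers of available follow-up scans.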
Ideally, after training on a larger, diverse population and after extensive external validation and benchmarking against current clinical standards, quantitative prognostic prediction models can be implemented in the clinic (48). There are several lung nodule detection algorithms available in the literature, and with the aid of the pretreatment tumor contours routinely delineated by the radiation oncologist, the location of the tumor on the follow-up images can be detected automatically (49). The input of our model would simply be the bounding box surrounding the detected tumor, which can be cropped automatically as well. The trained network can generate probabilities of prognosis within a few seconds and thus would not hinder current clinical efficiency. The probabilities can then be presented to the physician along with other clinical images and measures, such as the RECIST criteria (5), to aid in the process of patient assessment.

This proof-of-principle study has its limitations, one of which is the sample size of the study cohorts. Thus, a pretrained CNN was […]

[…] addressed. Further research in this direction could make these automatically learned feature representations more interpretable.

Conclusions

This study demonstrated the impact of deep learning on tumor phenotype tracking before and after definitive radiation therapy through pretreatment and follow-up CT scans. There were increases in the performance of survival and prognosis prediction with the incorporation of additional timepoints using CNN and RNN networks. This was compared with the performance of clinical factors, which were not significant. The survival neural network model could predict pathologic response in a separate cohort with trimodality treatment after radiation therapy. Although the input of this model consisted of a single seed point at the center of the lesion, without the need for volumetric segmentation, our model had predictive power comparable with tumor volume, acquired through time-consuming manual contours.
References

1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin 2015;65:87–108.
2. Ettinger DS, Akerley W, Borghaei H, Chang AC, Cheney RT, Chirieac LR, et al. Non-small cell lung cancer. J Natl Compr Canc Netw 2012;10:1236–71.
3. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA Cancer J Clin 2016;66:7–30.
4. Goldstraw P, Chansky K, Crowley J, Rami-Porta R, Asamura H, Eberhardt WEE, et al. The IASLC lung cancer staging project: proposals for revision of the TNM stage groupings in the forthcoming (eighth) edition of the TNM Classification for Lung Cancer. J Thorac Oncol 2016;11:39–51.
5. Eisenhauer EA, Therasse P, Bogaerts J, Schwartz LH, Sargent D, Ford R, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer 2009;45:228–47.
6. Hosny A, Parmar C, Quackenbush J, Schwartz LH, Aerts HJWL. Artificial intelligence in radiology. Nat Rev Cancer 2018;18:500–10.
7. Parmar C, Grossmann P, Bussink J, Lambin P, Aerts HJWL. Machine learning methods for quantitative radiomic biomarkers. Sci Rep 2015;5:13087.
8. Aerts HJWL. Data science in radiology: a path forward. Clin Cancer Res 2018;24:532–4.
9. Aerts HJWL, Velazquez ER, Leijenaar RTH, Parmar C, Grossmann P, Carvalho S, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 2014;5:4006.
[…]
25. Fedorov A, Beichel R, Kalpathy-Cramer J, Finet J, Fillion-Robin J-C, Pujol S, et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magn Reson Imaging 2012;30:1323–41.
26. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM 2017;60:84–90.
27. Rubins J, Unger M, Colice GL. Follow-up and surveillance of the lung cancer patient following curative intent therapy. Chest 2007;132:355S–367S.
28. Calman L, Beaver K, Hind D, Lorigan P, Roberts C, Lloyd-Jones M. Survival benefits from follow-up of patients with lung cancer: a systematic review and meta-analysis. J Thorac Oncol 2011;6:1993–2004.
29. Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint 2015. arxiv.org/abs/1502.03167.
30. Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 2014;15:1929–58.
31. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings: 29th IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2016; 2016 Jun 26–Jul 1; Las […]
[…] 23–28; Columbus, OH. Washington (DC): IEEE Computer Society; 2014. p. 1725–32.
44. Che Z, Purushotham S, Cho K, Sontag D, Liu Y. Recurrent neural networks for multivariate time series with missing values. Sci Rep 2018;8:6085.
45. Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); 2014 Oct 25–29; Doha, Qatar. Stroudsburg (PA): Association for Computational Linguistics; 2014. p. 1724–34.
46. Kumar D, Wong A, Clausi DA. Lung nodule classification using deep features in CT images. In: Proceedings: 2015 12th Conference on Computer and Robot Vision. CRV 2015; 2015 Jun 3–5; Halifax, Nova Scotia, Canada. Washington (DC): IEEE Computer Society; 2015. p. 133–8.
47. Hua K-L, Hsu C-H, Hidayati SC, Cheng W-H, Chen Y-J. Computer-aided classification of lung nodules on computed tomography images via deep learning technique. Onco Targets Ther 2015;8:2015–22.
48. Lehman CD, Yala A, Schuster T, Dontchos B, Bahl M, Swanson K, et al. Mammographic breast density assessment using deep learning: clinical implementation. Radiology 2018;180694.
49. Valente IRS, Cortez PC, Neto EC, Soares JM, de Albuquerque VHC, Tavares JMRS. Automatic 3D pulmonary nodule detection in CT images: a survey. Comput Methods Programs Biomed 2016;124:91–107.
50. Wang G. A perspective on deep imaging. IEEE Access 2016;4:8914–24.