Matsuo 2018
GYNECOLOGY
Survival outcome prediction in cervical cancer: Cox
models vs deep-learning model
Koji Matsuo, MD, PhD; Sanjay Purushotham, PhD; Bo Jiang, MS; Rachel S. Mandelbaum, MD; Tsuyoshi Takiuchi, MD, PhD;
Yan Liu, PhD; Lynda D. Roman, MD
BACKGROUND: Historically, the Cox proportional hazard regression model has been the mainstay for survival analyses in oncologic research. The Cox proportional hazard regression model generally is used based on an assumption of linear association. However, it is likely that, in reality, there are many clinicopathologic features that exhibit a nonlinear association in biomedicine.
OBJECTIVE: The purpose of this study was to compare the deep-learning neural network model and the Cox proportional hazard regression model in the prediction of survival in women with cervical cancer.
STUDY DESIGN: This was a retrospective pilot study of consecutive cases of newly diagnosed stage I–IV cervical cancer from 2000–2014. A total of 40 features that included patient demographics, vital signs, laboratory test results, tumor characteristics, and treatment types were assessed for analysis and grouped into 3 feature sets. The deep-learning neural network model was compared with the Cox proportional hazard regression model and 3 other survival analysis models for progression-free survival and overall survival. Mean absolute error and concordance index were used to assess the performance of these 5 models.
RESULTS: There were 768 women included in the analysis. The median age was 49 years, and the majority were Hispanic (71.7%). The majority of tumors were squamous (75.3%) and stage I (48.7%). The median follow-up time was 40.2 months; there were 241 events for recurrence and progression and 170 deaths during the follow-up period. The deep-learning model showed promising results in the prediction of progression-free survival when compared with the Cox proportional hazard regression model (mean absolute error, 29.3 vs 316.2). The deep-learning model also outperformed all the other models, including the Cox proportional hazard regression model, for overall survival (mean absolute error, Cox proportional hazard regression vs deep-learning, 43.6 vs 30.7). The performance of the deep-learning model further improved when more features were included (concordance index for progression-free survival: 0.695 for 20 features, 0.787 for 36 features, and 0.795 for 40 features). There were 10 features for progression-free survival and 3 features for overall survival that demonstrated significance only in the deep-learning model, but not in the Cox proportional hazard regression model. There were no features for progression-free survival and 3 features for overall survival that demonstrated significance only in the Cox proportional hazard regression model, but not in the deep-learning model.
CONCLUSION: Our study suggests that the deep-learning neural network model may be a useful analytic tool for survival prediction in women with cervical cancer because it exhibited superior performance compared with the Cox proportional hazard regression model. This novel analytic approach may provide clinicians with meaningful survival information that potentially could be integrated into treatment decision-making and planning. Further validation studies are necessary to support this pilot study.

Key words: Cox proportional hazard, cervical cancer, deep learning, survival prediction
TABLE 1
Patient demographics (N=768)

Features set 1 (20 features)          Measure
Age, y^c                              49 (41–58)
Race/ethnicity, n (%)
  White                               64 (8.3)
  Black                               48 (6.3)
  Hispanic                            551 (71.7)
  Asian                               101 (13.2)
  Others                              4 (0.5)
Body mass index, kg/m^2 ^c,d          28.0 (24.3–32.8)
Hypertension, n (%)^e
  No                                  569 (74.5)
  Yes                                 195 (25.5)
Diabetes mellitus, n (%)^e
  No                                  654 (85.6)
  Yes                                 110 (14.4)
Hypercholesterolemia, n (%)^e
  No                                  701 (91.8)
  Yes                                 63 (8.2)
Vital signs at diagnosis^c
  Systolic blood pressure, mm Hg      125 (112–140)
  Diastolic blood pressure, mm Hg     72 (65–80)
  Heart rate, beats/min               79 (70–89)
Laboratory test^c
  White blood cell, 10^9/L            8.3 (6.6–10.0)
  Platelet, 10^9/L                    300 (244–451)
  Hemoglobin, g/dL                    12.3 (10.2–13.4)
  Blood urea nitrogen, mg/dL          12.0 (9.0–15.0)
  Creatinine, mg/dL                   0.6 (0.5–0.8)
  Bicarbonate, mEq/L                  25 (23–27)
  Albumin, g/dL                       4.1 (3.7–4.4)

Features set 2^a, n (%)               Measure
Histologic condition
  Squamous cell                       578 (75.3)
  Adenocarcinoma                      137 (17.8)
  Adenosquamous                       31 (4.0)
  Other                               22 (2.9)
Stage
  I                                   372 (48.7)
    IA1                               91 (11.8)
    IA2                               21 (2.7)
    IB1                               160 (20.8)
    IB2                               100 (13.0)
  II                                  167 (21.9)
    IIA                               30 (3.9)
    IIB                               136 (17.7)
    II NOS                            1 (0.1)
  III                                 156 (20.4)
    IIIA                              7 (0.9)
    IIIB                              149 (19.4)
  IV                                  69 (9.0)
    IVA                               11 (1.4)
    IVB                               58 (7.6)
  Unknown                             4 (0.5)

Features set 3^b, n (%)               Measure
Beta-blocker use
  No                                  710 (92.9)
  Yes                                 54 (7.1)
Primary hysterectomy
  No                                  559 (72.8)
  Yes                                 209 (27.2)
Radiotherapy
  No                                  297 (38.7)
  Yes                                 471 (61.3)
Chemotherapy
  No                                  699 (91.0)
  Yes                                 69 (9.0)

^a Included the demographics for features set 1 and the 16 listed tumor features; ^b Included the demographics for features set 1, the 16 tumor features for features set 2, and the listed treatment features; ^c Data are given as median (interquartile range); ^d Missing 20 data; ^e Missing 4 data.
Matsuo et al. Survival outcome prediction in cervical cancer. Am J Obstet Gynecol 2019.
characteristics (20 features), including age, race/ethnicity, body mass index, vital signs, comorbidities, and pretreatment laboratory results; FS2 represents FS1 and tumor characteristics (20+16=36 features), including histologic type and cancer stage; FS3 represents FS2 and treatment type (36+4=40 features). The rationale of this sequential split-grouping strategy is to examine the association between the extent of features and survival prediction in various analytic approaches.

Our proposed deep-learning model has a hierarchic structure and uses fully connected feed-forward neural networks in the lower layers of the model and 2 subnetworks (fully connected layers) to optimize jointly the concordance index and mean absolute error evaluation metrics. In other words, our deep-learning model targets both mean absolute error and concordance index by jointly optimizing the 2 subnetworks, each of which optimizes one of these metrics separately. We compared baseline models (CPH,19 CoxLasso,20 Random Survival Forest,21 and CoxBoost22) to our proposed deep-learning model (Figure) on the provided dataset for 2 tasks (PFS/OS predictions) with 3 different sets of features (FS1–3). The results shown here are an average of 10 test folds (from cross validation) in terms of concordance index and mean absolute error.

Mean absolute error is the absolute difference between the original survival time (ground truth) and the model’s predicted survival time measured in months. Lower mean absolute error means a better performing model. Concordance index can be interpreted as the fraction of all pairs of subjects whose predicted survival times are ordered correctly among all subjects that can actually be ordered. In other words, it is the probability of concordance between the predicted and the observed survival. Higher concordance index means a better performing model.23

Our proposed deep-learning model for survival analysis uses a subnetwork of deep neural networks with a single output node to estimate the survival risks $h_\theta(X_i)$ of patients $i$ by the optimization of the negative log-partial likelihood function, which is measured by the concordance index score. In addition, our model uses another subnetwork of deep neural networks to minimize the mean absolute error between the actual survival time and the predicted survival time for individual patients. Thus, our proposed model jointly optimizes the concordance index score and mean absolute error simultaneously to accurately predict survival of individual patients.

CPH is a popular semiparametric model for survival analysis. It estimates the risk function $h(X_i)$ of the event occurring (eg, died of cancer) for patient $i$ based on observed covariates/features $X_i$ with the use of a linear function: $h(X_i) = X_i \beta$, where $\beta$ is the coefficient of $X_i$.19 It measures the impact of the covariates and assumes that the log-hazard of every patient is a linear combination of the patient’s features.

In addition to standard CPH, other modeling variants such as CoxBoost24 and CoxLasso24 have been proposed in literature. Although these modeling approaches are not used frequently, we have included them for comparison to see how they perform with respect to proposed models. CoxBoost is a semiparametric survival model that is designed to handle high-dimensional datasets by fitting the Cox models with likelihood-based boosting for competing risks.24 CoxLasso, a semiparametric survival model, is another variant of the Cox model and is regularized with the Lasso L1 penalty.25–27 It treats the number of non-zero coefficients as a tuning parameter and simultaneously selects with the regularization parameter. Also, it fits a varying coefficient Cox model by kernel smoothing, with the aforementioned penalties. Random Survival Forest is a popular nonlinear machine learning model for survival analysis.21 It is used to estimate the risk function of patients. Random Survival Forest is a tree model that is based on the random forest method, and it can

FIGURE
Study schema for survival analysis
Patient baseline characteristics were entered in various analytic models that included the deep-learning neural network model to examine survival outcome.
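The two training targets described in the methods (a negative log-partial likelihood for the risk output and a mean absolute error for the predicted survival time) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy data, the `mae_weight` parameter, and the choice to combine the terms as a weighted sum are all assumptions made here, since the text does not state how the two subnetwork losses are combined. The risk scores use the linear CPH form $h(X_i) = X_i \beta$; in the deep-learning model they would come from the network output $h_\theta(X_i)$ instead.

```python
import numpy as np

def neg_log_partial_likelihood(risk, time, event):
    """Cox negative log-partial likelihood for risk scores h(X_i).

    Assumes no tied event times (ties would need Breslow or Efron handling).
    """
    order = np.argsort(-time)                 # longest follow-up first
    risk, event = risk[order], event[order]
    # log of the summed exp(risk) over each patient's risk set {j: time_j >= time_i}
    log_risk_set = np.logaddexp.accumulate(risk)
    return float(-np.sum((risk - log_risk_set)[event == 1]))

def mean_absolute_error(true_months, pred_months):
    """Mean absolute difference between observed and predicted survival time (months)."""
    return float(np.mean(np.abs(true_months - pred_months)))

def joint_loss(risk, pred_months, time, event, mae_weight=1.0):
    """Hypothetical joint objective: partial-likelihood term plus weighted MAE term.

    Assumption: MAE is taken over patients with an observed event only, because
    the true survival time of a censored patient is unknown.
    """
    observed = event == 1
    mae = mean_absolute_error(time[observed], pred_months[observed])
    return neg_log_partial_likelihood(risk, time, event) + mae_weight * mae

# Toy data (synthetic, not from the study): 4 patients, 2 features.
X = np.array([[0.5, 1.0], [1.5, 0.2], [0.1, 0.3], [2.0, 1.1]])
beta = np.array([0.8, -0.4])                  # illustrative coefficients
time = np.array([12.0, 30.0, 45.0, 8.0])      # follow-up in months
event = np.array([1, 1, 0, 1])                # 1 = event observed, 0 = censored
risk = X @ beta                               # CPH linear risk h(X_i) = X_i beta
pred_months = np.array([15.0, 28.0, 50.0, 10.0])

print(joint_loss(risk, pred_months, time, event))
```

Setting `mae_weight=0.0` recovers the plain Cox objective, which makes it easy to see how much each term contributes on a given dataset.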
TABLE 4
Multivariable analysis for Cox proportional hazard regression models for survival

                                  Progression-free survival           Overall survival
Features                          HR (95% CI)           P value       HR (95% CI)           P value
Histologic condition                                    .001^a,b
  Squamous cell                   1
  Adenocarcinoma                  1.61 (1.12–2.32)      .011^a
  Adenosquamous                   2.82 (1.45–5.44)      .002^a
  Other                           1.77 (0.96–3.26)      .07
Stage                                                   <.001^a,b                           <.001^a,b
  I                               1                                   1
  II                              2.81 (1.77–4.46)      <.001^a       1.80 (1.04–3.10)      .034^a
  III                             5.15 (3.22–8.25)      <.001^a       2.97 (1.75–5.02)      <.001^a
  IV                              12.1 (7.49–19.4)      <.001^a       10.2 (5.86–17.7)      <.001^a
Primary hysterectomy
  No                              1                                   1
  Yes                             0.17 (0.10–0.31)      <.001^a       0.26 (0.12–0.56)      .001^a
Radiotherapy
  No                              1
  Yes                             0.24 (0.16–0.36)      <.001^a
Laboratory test (per unit)
  Platelet (10^9/L)                                                   1.002 (1.001–1.003)   .007^a
  Blood urea nitrogen (mg/dL)     1.02 (1.01–1.04)      .006^a        1.03 (1.01–1.05)      .006^a
  Creatinine (mg/dL)              0.84 (0.72–0.98)      .024^a        0.83 (0.71–0.98)      .024^a
  Albumin (g/dL)                  0.59 (0.46–0.75)      <.001^a       0.46 (0.35–0.62)      <.001^a
Vital signs
  Heart rate (bpm)                1.02 (1.01–1.03)      .001^a        1.01 (1.01–1.02)      .021^a

HR, hazard ratio; CI, confidence interval.
^a All significant covariates (P<.05) on univariable analysis that are shown in Supplemental Table 4 were entered in the initial full model, and conditional backward method was used to retain only significant covariates (P<.05) in the final model; ^b P value for interaction.
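The per-unit hazard ratios in Table 4 can look deceptively small. Because a Cox coefficient is the log of the hazard ratio, a per-unit ratio compounds multiplicatively over larger increments. The sketch below, with a helper name chosen here for illustration, rescales a published ratio and its confidence limits to a clinically larger step, assuming the log-linear effect that the Cox model itself assumes. Results are approximate because the published values are already rounded.

```python
import math

def rescale_hazard_ratio(hr, lower, upper, units):
    """Hazard ratio and 95% CI for a `units`-unit increase, given per-unit values.

    Uses HR_k = exp(k * beta) = HR_1 ** k, which holds under the Cox model's
    log-linear assumption. Inputs are rounded published values, so outputs
    are approximations.
    """
    return tuple(round(value ** units, 2) for value in (hr, lower, upper))

# Heart rate, progression-free survival (Table 4): 1.02 (1.01-1.03) per beat/min.
per_10_bpm = rescale_hazard_ratio(1.02, 1.01, 1.03, units=10)
print(per_10_bpm)  # (1.22, 1.1, 1.34)

# The underlying Cox coefficient is the natural log of the hazard ratio.
beta_albumin_os = math.log(0.46)  # albumin, overall survival: negative, i.e. protective
print(round(beta_albumin_os, 3))
```

Read this way, a 10 beats/min higher heart rate at diagnosis corresponds to roughly a 22% higher hazard of progression in the multivariable model.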
approaches in oncologic research. Most of these studies are related to either diagnostic work-up, such as radiographic image analysis and cytopathologic interpretation, or genomic/molecular analysis for biomarker discovery; studies that use deep-learning models for survival prediction in oncology patients remain limited to date. Specifically in the area of cervical cancer research, deep learning has been used for interpretation of cervical cytologic testing,32,33 human papillomavirus-related risk algorithm development,34 colposcopy interpretation,35 tumor tissue identification,36,37 cervical cancer screening algorithm evaluation,38 radiographic testing efficacy,39,40 genomic analysis,41,42 and early symptoms,43 but there have been only a few studies that have examined oncologic outcome.14,44

In an analysis of surgically treated women with early-stage cervical cancer (n=102), various deep-learning models were tested for 5-year OS prediction with the use of clinicopathologic features mainly from surgical-pathologic specimens.44 Their main finding was that certain neural network algorithms are superior for survival prediction compared with conventional linear regression models. Because their study population was limited only to surgical cases, generalizability to other cervical cancer populations was not possible. In the current study, a larger number of unselected consecutive cases of women with cervical cancer, which included nonsurgical cases, were included for analysis, which provided more meaningful results for interpretation.

We previously have examined the performance of deep-learning neural network models in the prediction of survival of women with recurrent cervical cancer who have a limited life-expectancy (3 and 6 months).14 An enormous amount of data points (>5000) that included patient demographics, symptoms, vital signs, laboratory test
TABLE 5
Summary of clinical-pathologic factors between Cox proportional hazard regression model and deep-learning model

                          Progression-free survival                                          Overall survival
Features                  Concordant            Deep-learning only    Cox regression only    Concordant            Deep-learning only    Cox regression only
Patient demographics      –                     Age                   –                      –                     –                     –
                          –                     Body mass index       –                      –                     Race/ethnicity        –
                          –                     Race/ethnicity        –                      –                     –                     –
                          –                     Hypertension          –                      –                     –                     –
Vital signs               Heart rate            –                     –                      Heart rate            –                     –
Laboratory test results   Blood urea nitrogen   White blood cell      –                      Blood urea nitrogen   Bicarbonate           Platelet
                          Creatinine            Platelet              –                      –                     –                     Creatinine
                          Albumin               Bicarbonate           –                      –                     –                     Albumin
                          –                     Hemoglobin            –                      –                     –                     –
Tumor characteristics     Cancer stage          –                     –                      Cancer stage          Radiotherapy          –
                          Histology type        –                     –                      –                     –                     –
Treatment type            Hysterectomy          Chemotherapy          –                      Hysterectomy          –                     –
                          Radiotherapy          Beta-blocker use      –                      –                     –                     –

Cox regression, Cox proportional hazard regression. A dash (–) indicates no feature.
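The three columns per outcome in Table 5 are a set comparison between the features each model flags as significant. As a small sketch of that bookkeeping, the overall-survival columns can be rebuilt from two sets. The set contents below are transcribed from Table 5; the variable names are illustrative only.

```python
# Features flagged as significant for overall survival, transcribed from Table 5.
deep_learning_os = {
    "Heart rate", "Blood urea nitrogen", "Cancer stage", "Hysterectomy",
    "Race/ethnicity", "Bicarbonate", "Radiotherapy",
}
cox_os = {
    "Heart rate", "Blood urea nitrogen", "Cancer stage", "Hysterectomy",
    "Platelet", "Creatinine", "Albumin",
}

concordant = deep_learning_os & cox_os          # significant in both models
deep_learning_only = deep_learning_os - cox_os  # significant only in the deep-learning model
cox_only = cox_os - deep_learning_os            # significant only in the Cox model

print(sorted(concordant))
print(sorted(deep_learning_only))
print(sorted(cox_only))
```

The counts match the abstract: 3 features were significant only in the deep-learning model and 3 only in the Cox model for overall survival.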
results, tumor characteristics, and treatment types were time-sequentially examined after recurrence. The deep-learning model was compared with a linear regression model for survival prediction. The analysis found that the deep-learning model is superior to the CPH model in identifying women with limited life-expectancy. Taken together, all 3 studies, which included the current study, have shown consistently that deep-learning neural network models may be useful analytic tools for survival prediction in women with cervical cancer, given their superior performance compared with the linear regression model.

Strengths of the deep-learning neural network model for survival analyses in oncology research are threefold. First, as described earlier, this model exhibits an improved fit for variables with a nonlinear relationship, which is applicable when examining real-life factors. Unlike CPH and its variants, deep-learning approaches can model nonlinear risk functions that are present in survival data. Of note, in our previous study, we found that a number of clinical-laboratory factors demonstrated a nonlinear association with survival, which implied that use of a neural network model would be more appropriate than a linear regression model in clinical medicine.14

Second, deep-learning models are able to not only automatically learn feature representations from raw clinical data without explicit feature engineering but also can fit censored survival data with the use of nonlinear risk functions. In other words, deep-learning models are powerful at learning nonlinear relationships that are present in the data, and they easily can handle censoring in survival data. Thus, selection bias because of the process of demographic grouping can be eliminated in the deep-learning model. For instance, there were several features that were not identified as survival predictors in the conventional analytic approach but were found to be significant prognosticators in the deep-learning model (Table 5). The ability to highlight these features without explicit feature engineering may represent an example of this benefit of deep-learning models.

Third, our study suggests that the deep-learning neural network model performs better when large feature sets are used. The strength of the deep-learning model in handling large feature sets, because of its ability to learn feature representation, may be beneficial particularly in biomedical research because inclusion of many variables in conventional linear regression models may result in overfitting.

A limitation of deep-learning models is that these models are computationally expensive to train, and usually their predictions might be hard to interpret. For instance, in our analysis of OS, there were some features that were identified as significant prognostic factors only in the CPH model, but not in the deep-learning model (Table 5). Albumin level, for
example, is a well-recognized prognostic factor in oncology patients that reflects general nutritional status, and our previous analysis of recurrent cervical cancer demonstrated that albumin level was the strongest predictor for limited life expectancy.14 Thus, the fact that this feature was not significant in the deep-learning model is a concern in terms of reliability of the modeling in the current study; further validation and model development are necessary to ensure the reliability of these deep-learning models.

In addition, the challenge and uncertainty in training the CPH models may result in much higher mean absolute error compared with other models. For example, when the coefficients of the CPH model for the PFS prediction task were estimated, the model did not completely converge. One potential reason is the Newton-Raphson algorithm that was used in the estimation of coefficients in the CPH model: this likely caused convergence failure.45 Last, mean absolute errors were similar between the deep-learning models and the other baseline models. One explanation of this observation is that the deep-learning model might need more fine-tuning for PFS prediction.

Another weakness of the current study is that the limited amount of data makes it challenging to train deep-learning models in our experiments. More investigation is needed to study the performance of deep-learning models in limited data settings. We examined only 40 features in the analysis, and there may be various confounders that were not examined. For example, performance status was not examined in the model but is known to be a prognostic indicator in oncology patients. Moreover, we examined features only at the initial cancer diagnosis; features after initial diagnosis were not assessed.

Although we likely examined one of the largest sample sizes among studies of this nature, the total number remains relatively small, which makes the analysis challenging in the deep-learning model. Follow-up time is also relatively short (<5 years), and there may be the possibility that late survival events were missed. Most of the study population was Hispanic, and generalizability to different populations is not known.

The deep-learning neural network model is a new analytic tool that has been adopted recently in clinical decision-making, and its utility will likely become more widespread in the near future. Our battery of pilot studies in cervical cancer (new diagnosis and recurrent disease) endorses the exploration of the deep-learning approach, with promising results in survival analysis. This analytic approach is particularly useful in biomedicine where complexity and uncertainty exist; therefore, further study is warranted to establish its role in survival analysis. For future direction of study, an investigation of how to obtain feature importance scores directly from deep-learning models and how to provide clinically meaningful interpretations from deep-learning models will be of value.

References
1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin 2015;65:87–108.
2. National Cancer Institute. Surveillance, Epidemiology, and End Results Program. Available at: https://seer.cancer.gov/statfacts/html/cervix.html. Accessed September 7, 2018.
3. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436–44.
4. Che Z, Purushotham S, Khemani R, Liu Y. Interpretable deep models for ICU outcome prediction. AMIA Annu Symp Proc 2017;2016:371–80.
5. Izadyyazdanabadi M, Belykh E, Mooney MA, et al. Prospects for theranostics in neurosurgical imaging: empowering confocal laser endomicroscopy diagnostics via deep learning. Front Oncol 2018;8:240.
6. Scheeder C, Heigwer F, Boutros M. Machine learning and image-based profiling in drug discovery. Curr Opin Syst Biol 2018;10:43–52.
7. Wang J, Yang X, Cai H, Tan W, Jin C, Li L. Discrimination of breast cancer with microcalcifications on mammography by deep learning. Sci Rep 2016;6:27327.
8. Liang M, Li Z, Chen T, Zeng J. Integrative data analysis of multi-platform cancer data with a multimodal deep learning approach. IEEE/ACM Trans Comput Biol Bioinform 2015;12:928–37.
9. Ertosun MG, Rubin DL. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc 2015;2015:1899–908.
10. Saltz J, Gupta R, Hou L, et al. Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images. Cell Rep 2018;23:181–193.e7.
11. Long NP, Jung KH, Yoon SJ, et al. Systematic assessment of cervical cancer initiation and progression uncovers genetic panels for deep learning-based early diagnosis and proposes novel diagnostic and prognostic biomarkers. Oncotarget 2018;8:109436–56.
12. Chaudhary K, Poirion OB, Lu L, Garmire LX. Deep learning-based multi-omics integration robustly predicts survival in liver cancer. Clin Cancer Res 2018;24:1248–59.
13. Haenssle HA, Fink C, Schneiderbauer R, et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol 2018;29:1836–42.
14. Matsuo K, Purushotham S, Moeini A, et al. A pilot study in using deep learning to predict limited life expectancy in women with recurrent cervical cancer. Am J Obstet Gynecol 2017;217:703–5.
15. Matsuo K, Moeini A, Machida H, et al. Significance of venous thromboembolism in women with cervical cancer. Gynecol Oncol 2016;142:405–12.
16. FIGO staging for carcinoma of the vulva, cervix, and corpus uteri. Int J Gynaecol Obstet 2014;125:97–8.
17. Machida H, Moeini A, Ciccone MA, et al. Efficacy of modified dose-dense paclitaxel in recurrent cervical cancer. Am J Clin Oncol 2018;41:851–60.
18. Fauci J, Schneider K, Walters C, et al. The utilization of palliative care in gynecologic oncology patients near the end of life. Gynecol Oncol 2012;127:175–9.
19. Cox DR. Regression models and life-tables. J R Stat Soc Series B Stat Methodol 1972;34:187–220.
20. Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med 1997;16:385–95.
21. Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS. Random survival forests. Ann Appl Statist 2008;2:841–60.
22. Binder H. “CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks.” Version 1: R package; 2013.
23. Steck H, Krishnapuram B, Dehing-Oberije C, Lambin P, Raykar VC. On ranking in survival analysis: Bounds on the concordance index. Advances in Neural Information Processing Systems. Available at: https://papers.nips.cc/paper/3375-on-ranking-in-survival-analysis-bounds-on-the-concordance-index. Accessed December 1, 2018.
24. Binder H, Schumacher M. Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinformatics 2008;9:14.
25. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw 2010;33:1–22.
26. Simon N, Friedman J, Hastie T, Tibshirani R. Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent. J Stat Softw 2011;39:1–13.
27. Sun H, Lin W, Feng R, Li H. Network-regularized high-dimensional Cox regression for analysis of genomic data. Stat Sin 2014;24:1433–59.
28. Lawless JF, Singhal K. Efficient screening of nonnormal regression models. Biometrics 1978;34:318–27.
29. CamDavidsonPilon/lifelines: v0.13 (2017). Available at: https://doi.org/10.5281/zenodo.1127755/. Accessed September 7, 2018.
30. Keras: The Python Deep Learning library. Available at: https://keras.io. Accessed September 7, 2018.
31. Von Elm E, Altman DG, Egger M, Pocock SJ, Gotzsche PC, Vandenbroucke JP. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. BMJ 2007;335:806–8.
32. Komagata H, Ichimura T, Matsuta Y, et al. Feature analysis of cell nuclear chromatin distribution in support of cervical cytology. J Med Imaging (Bellingham) 2017;4:047501.
33. Mariarputham EJ, Stephen A. Nominated texture based cervical cancer classification. Comput Math Methods Med 2015;2015:586928.
34. Kahng J, Kim EH, Kim HG, Lee W. Development of a cervical cancer progress prediction tool for human papillomavirus-positive Koreans: a support vector machine-based approach. J Int Med Res 2015;43:518–25.
35. Sato M, Horie K, Hara A, et al. Application of deep learning to the classification of images from colposcopy. Oncol Lett 2018;15:3518–23.
36. Wang J, Li L, Yang P, et al. Identification of cervical cancer using laser-induced breakdown spectroscopy coupled with principal component analysis and support vector machine. Lasers Med Sci 2018;33:1381–6.
37. Gu J, Fu CY, Ng BK, Liu LB, Lim-Tan SK, Lee CG. Enhancement of early cervical cancer diagnosis with epithelial layer analysis of fluorescence lifetime images. PLoS One 2015;10:e0125706.
38. Baltzer N, Sundstrom K, Nygard JF, Dillner J, Komorowski J. Risk stratification in cervical cancer screening by complete screening history: applying bioinformatics to a general screening population. Int J Cancer 2017;141:200–9.
39. Torheim T, Malinen E, Hole KH, et al. Autodelineation of cervical cancers using multiparametric magnetic resonance imaging and machine learning. Acta Oncol 2017;56:806–12.
40. Mu W, Chen Z, Liang Y, et al. Staging of cervical cancer based on tumor heterogeneity characterized by texture features on (18)F-FDG PET images. Phys Med Biol 2015;60:5123–39.
41. Tan MS, Chang SW, Cheah PL, Yap HJ. Integrative machine learning analysis of multiple gene expression profiles in cervical cancer. PeerJ 2018;6:e5285.
42. Wilhelm T. Phenotype prediction based on genome-wide DNA methylation data. BMC Bioinformatics 2014;15:193.
43. Weegar R, Kvist M, Sundstrom K, Brunak S, Dalianis H. Finding cervical cancer symptoms in Swedish clinical text using a machine learning approach and NegEx. AMIA Annu Symp Proc 2015;2015:1296–305.
44. Obrzut B, Kusy M, Semczuk A, Obrzut M, Kluska J. Prediction of 5-year overall survival in cervical cancer patients treated with radical hysterectomy using computational intelligence methods. BMC Cancer 2017;17:840.
45. Problems with convergence in the Cox Proportional Hazard Model. Available at: https://lifelines.readthedocs.io/en/latest/Examples.html#problems-with-convergence-in-the-cox-proportional-hazard-model. Accessed December 4, 2018.

Author and article information
From the Division of Gynecologic Oncology, Departments of Obstetrics and Gynecology (Drs Matsuo, Mandelbaum, Takiuchi, and Roman) and Computer Science (Drs Purushotham and Liu and Mr Jiang), and the Norris Comprehensive Cancer Center (Drs Matsuo and Roman), University of Southern California, Los Angeles, CA.
Received Oct. 5, 2018; revised Dec. 6, 2018; accepted Dec. 17, 2018.
Supported by Ensign Endowment for Gynecologic Cancer Research (K.M.)
L.D.R. is a consultant for Tempus Lab (Chicago, IL), and K.M. received an honorarium from Chugai (Tokyo, Japan); the remaining authors report no conflict of interest.
Corresponding author: Koji Matsuo, MD, PhD. koji.[email protected]
SUPPLEMENTAL TABLE 1
Comparison of Cox proportional hazard regression model and deep-learning neural network model for survival

Features   Outcome                      Model                                  Concordance index^a   Mean absolute error^b
Set 1      Progression-free survival    Cox proportional hazard regression     0.696 ± 0.072         322.2 ± 129.7
                                        CoxBoost                               0.696 ± 0.072         30.2 ± 3.8
                                        CoxLasso                               0.699 ± 0.070         30.4 ± 2.5
                                        Random Survival Forest                 0.694 ± 0.073         30.0 ± 4.2
                                        Deep learning                          0.695 ± 0.080         29.3 ± 3.4
           Overall survival             Cox proportional hazard regression     0.520 ± 0.059         37.2 ± 2.7
                                        CoxBoost                               0.520 ± 0.059         40.9 ± 4.0
                                        CoxLasso                               0.521 ± 0.059         40.8 ± 5.0
                                        Random Survival Forest                 0.538 ± 0.051         33.6 ± 4.3
                                        Deep learning                          0.538 ± 0.042         30.9 ± 3.7
Set 2      Progression-free survival    Cox proportional hazard regression     0.785 ± 0.064         329.5 ± 133.4
                                        CoxBoost                               0.785 ± 0.063         28.8 ± 4.1
                                        CoxLasso                               0.785 ± 0.064         29.9 ± 3.8
                                        Random Survival Forest                 0.771 ± 0.070         29.8 ± 4.3
                                        Deep learning                          0.787 ± 0.063         29.7 ± 3.5
           Overall survival             Cox proportional hazard regression     0.511 ± 0.045         37.6 ± 3.0
                                        CoxBoost                               0.511 ± 0.045         39.5 ± 2.8
                                        CoxLasso                               0.512 ± 0.049         38.9 ± 2.5
                                        Random Survival Forest                 0.527 ± 0.053         33.6 ± 4.3
                                        Deep learning                          0.534 ± 0.051         30.7 ± 3.7

There were 241 events for progression-free survival and 170 events for overall survival among 768 cases.
^a A higher concordance index means a better performing model; ^b A lower mean absolute error means a better performing model.
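The concordance index reported in this table follows the pairwise definition given in the methods: among all pairs of subjects that can be ordered, the fraction whose predicted survival times are in the correct order. The sketch below implements that definition directly and then summarizes per-fold scores. It is an illustration under stated assumptions, not the study's code: the toy data are synthetic, tied predictions are scored as half-concordant by convention, and reading the second number in each table cell as a spread across the 10 test folds (computed here as a sample standard deviation) is an assumption, since the table does not label it.

```python
import numpy as np

def concordance_index(time, event, pred_time):
    """Fraction of orderable pairs whose predicted survival times are ordered correctly.

    A pair (i, j) is orderable when subject i had an observed event before
    subject j's follow-up time; a censored subject cannot be the earlier member.
    """
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if event[i] != 1:
            continue
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if pred_time[i] < pred_time[j]:
                    concordant += 1.0
                elif pred_time[i] == pred_time[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable

def summarize_folds(scores):
    """Mean and sample standard deviation of a metric across cross-validation folds."""
    scores = np.asarray(scores, dtype=float)
    return float(scores.mean()), float(scores.std(ddof=1))

# Synthetic example: 4 subjects, follow-up in months, 1 = event observed.
time = np.array([5.0, 12.0, 20.0, 33.0])
event = np.array([1, 0, 1, 1])
pred = np.array([6.0, 25.0, 45.0, 40.0])

print(concordance_index(time, event, pred))    # 0.75
print(summarize_folds([0.70, 0.72, 0.68]))     # hypothetical per-fold scores
```

In the toy data, three of the four orderable pairs are ranked correctly; the third subject is predicted to outlive the fourth but does not.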
SUPPLEMENTAL TABLE 2
Survival predictors in deep-learning model (features set 1)

Progression-free survival                    Overall survival
Features                     P value         Features                     P value
Albumin^a                    1.57E-60        Diabetes mellitus^a          2.35E-04
Hemoglobin^a                 3.08E-54        Hispanic^a                   1.39E-03
Heart rate^a                 1.35E-31        White blood cell^a           3.29E-03
Platelet^a                   1.06E-22        Creatinine^a                 5.95E-03
Age^a                        4.18E-10        White^a                      1.25E-02
Creatinine^a                 1.41E-08        Black^a                      1.64E-02
White blood cell^a           4.69E-08        Hypercholesterolemia         5.36E-02
Bicarbonate^a                6.81E-08        Body mass index              5.38E-02
Blood urea nitrogen^a        2.81E-07        Albumin                      6.44E-02
Black^a                      2.71E-05        Blood urea nitrogen          7.02E-02
Hispanic^a                   4.35E-03        Diastolic blood pressure     8.40E-02
Hypertension^a               5.35E-03        Bicarbonate                  8.98E-02
Asian                        5.30E-02        Hypertension                 1.37E-01
Body mass index              6.62E-02        Platelet                     1.46E-01
Systolic blood pressure      1.01E-01        Asian                        1.63E-01
Hypercholesterolemia         2.13E-01        Other race                   1.84E-01
Other race                   2.25E-01        Heart rate                   1.94E-01
Diastolic blood pressure     3.14E-01        Systolic blood pressure      2.21E-01
White                        4.28E-01        Age                          2.29E-01
Diabetes mellitus            4.30E-01        Hemoglobin                   3.50E-01

Covariates are listed based on the statistical significance.
^a Significant covariates (P<.05).
SUPPLEMENTAL TABLE 3
Survival predictors in deep-learning model (features set 2)

Progression-free survival                      Overall survival
Features                       P value         Features                       P value
Hemoglobin^a                   2.15E-38        Albumin^a                      3.37E-06
Albumin^a                      3.87E-38        Bicarbonate^a                  1.59E-05
Stage IVB^a                    1.53E-37        Hispanic^a                     1.69E-03
Stage IIIB^a                   7.00E-30        Stage IIA^a                    5.83E-03
Stage IB1^a                    7.62E-30        Body mass index^a              6.10E-03
Stage IA1^a                    4.75E-27        Stage IA1^a                    2.50E-02
Heart rate^a                   8.12E-19        White blood cell^a             4.52E-02
Platelet^a                     4.00E-15        Diastolic blood pressure       5.68E-02
Creatinine^a                   3.06E-08        Hemoglobin                     6.10E-02
White blood cell^a             4.21E-08        White                          6.18E-02
Blood urea nitrogen^a          1.10E-07        Diabetes mellitus              6.64E-02
Age^a                          1.02E-06        Heart rate                     7.42E-02
Bicarbonate^a                  1.39E-05        Black                          8.59E-02
Stage IVA^a                    8.64E-05        Asian                          9.46E-02
Black^a                        8.58E-04        Other histologic condition     1.11E-01
Hypertension^a                 1.44E-03        Creatinine                     1.17E-01
Hispanic^a                     1.51E-03        Blood urea nitrogen            1.35E-01
Stage IIB^a                    6.82E-03        Platelet                       1.63E-01
Other histologic condition^a   9.65E-03        Systolic blood pressure        1.66E-01
Body mass index^a              1.47E-02        Hypertension                   1.66E-01
Asian^a                        2.10E-02        Stage IIIA                     1.68E-01
Adenocarcinoma^a               3.56E-02        Stage IIIB                     1.83E-01
Stage IB2                      2.12E-01        Stage IIB                      1.84E-01
Systolic blood pressure        2.27E-01        Age                            1.91E-01
Hypercholesterolemia           3.08E-01        Squamous                       1.99E-01
Squamous                       3.19E-01        Adenosquamous                  2.16E-01
Diastolic blood pressure       3.65E-01        Stage IB1                      2.24E-01
Stage IIA                      4.58E-01        Hypercholesterolemia           2.36E-01
Diabetes mellitus              4.78E-01        Adenocarcinoma                 2.54E-01
White                          4.84E-01        Stage IVB                      3.28E-01
Stage IIIA                     4.87E-01        Stage IVA                      3.73E-01
Adenosquamous                  5.88E-01        Stage IB2                      3.95E-01

Covariates are listed based on the statistical significance.
^a Significant covariates (P<.05).
SUPPLEMENTAL TABLE 4
Univariable analysis for survival outcome

                                  Progression-free survival           Overall survival
Features                          HR (95% CI)           P value^a     HR (95% CI)           P value^a
Age, y                            1.02 (1.01–1.03)      <.001         1.02 (1.01–1.03)      .003
Ethnicity                                               .001                                <.001
  White                           1                                   1
  Black                           1.97 (1.05–3.67)                    2.34 (1.16–4.70)
  Hispanic                        0.84 (0.52–1.37)                    0.72 (0.40–1.28)
  Asian                           1.27 (0.72–2.23)                    1.29 (0.67–2.49)
Histologic condition                                    .043                                .002
  Squamous cell                   1                                   1
  Adenocarcinoma                  0.89 (0.63–1.26)                    0.72 (0.46–1.11)
  Adenosquamous                   1.02 (0.54–1.93)                    0.72 (0.29–1.75)
  Other                           2.23 (1.24–4.00)                    2.80 (1.51–5.19)
Stage                                                   <.001                               <.001
  I                               1                                   1
  II                              3.25 (2.22–4.78)                    3.98 (2.39–6.62)
  III                             6.15 (4.27–8.87)                    8.15 (5.06–13.1)
  IV                              20.9 (14.0–31.1)                    30.0 (18.2–49.4)
Primary hysterectomy                                    <.001                               <.001
  No                              1                                   1
  Yes                             0.15 (0.10–0.25)                    0.10 (0.05–0.19)
Radiotherapy                                            .003
  No                              1
  Yes                             1.57 (1.17–2.11)
Primary chemotherapy                                    .001                                <.001
  No                              1                                   1
  Yes                             5.17 (3.82–7.00)                    6.37 (4.53–8.95)
Laboratory test
  White blood cell, 10^9/L        1.06 (1.03–1.10)      .001          1.07 (1.03–1.11)      .001
  Platelet, 10^9/L                1.003 (1.002–1.004)   <.001         1.004 (1.003–1.005)   <.001
  Hemoglobin, g/dL                0.82 (0.78–0.86)      <.001         0.81 (0.76–0.85)      <.001
  Blood urea nitrogen, mg/dL      1.02 (1.01–1.03)      <.001         1.02 (1.01–1.03)      <.001
  Creatinine, mg/dL               1.17 (1.11–1.24)      <.001         1.18 (1.11–1.25)      <.001
  Bicarbonate, mEq/L              0.92 (0.88–0.95)      <.001         0.89 (0.85–0.93)      <.001
  Albumin, g/dL                   0.39 (0.31–0.48)      <.001         0.30 (0.23–0.38)      <.001
Heart rate, beats/min             1.03 (1.02–1.04)      <.001         1.03 (1.02–1.04)      <.001

HR, hazard ratio; CI, confidence interval.
All the covariates that are shown in Table 1 were examined; the covariates with P<.05 are shown in this Table.
^a Considered significant.
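A hazard ratio and its 95% confidence interval carry enough information to approximate the underlying Cox coefficient, its standard error, and a Wald-type P value, which is a handy consistency check when reading a table like this one. The sketch below assumes the interval is symmetric on the log scale and was built with the usual normal critical value of about 1.96. The helper name is illustrative, and because the published values are rounded, the recovered P value is only approximate.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate Cox coefficient, standard error, z statistic, and two-sided
    P value from a hazard ratio and its 95% confidence interval.

    Assumes a CI of the form exp(beta +/- z_crit * se).
    """
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P value under the normal distribution
    return beta, se, z, p

# Albumin, progression-free survival: 0.39 (0.31-0.48), reported P < .001.
beta, se, z, p = wald_from_ci(0.39, 0.31, 0.48)
print(f"beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.2e}")
```

For albumin the recovered P value falls far below .001, in line with the table. For covariates whose hazard ratio is close to 1 and reported to only 2 decimals (age, for example), rounding dominates and the recovered value should be treated as a rough check only.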