- Schmid, Sabine;
- Jiang, Mei;
- Brown, M Catherine;
- Fares, Aline;
- Garcia, Miguel;
- Soriano, Joelle;
- Dong, Mei;
- Thomas, Sera;
- Kohno, Takashi;
- Leal, Leticia Ferro;
- Diao, Nancy;
- Xie, Juntao;
- Wang, Zhichao;
- Zaridze, David;
- Holcatova, Ivana;
- Lissowska, Jolanta;
- Świątkowska, Beata;
- Mates, Dana;
- Savic, Milan;
- Wenzlaff, Angela S;
- Harris, Curtis C;
- Caporaso, Neil E;
- Ma, Hongxia;
- Fernandez-Tardon, Guillermo;
- Barnett, Matthew J;
- Goodman, Gary;
- Davies, Michael PA;
- Pérez-Ríos, Mónica;
- Taylor, Fiona;
- Duell, Eric J;
- Schoettker, Ben;
- Brenner, Hermann;
- Andrew, Angeline;
- Cox, Angela;
- Ruano-Ravina, Alberto;
- Field, John K;
- Marchand, Loic Le;
- Wang, Ying;
- Chen, Chu;
- Tardon, Adonina;
- Shete, Sanjay;
- Schabath, Matthew B;
- Shen, Hongbing;
- Landi, Maria Teresa;
- Ryan, Brid M;
- Schwartz, Ann G;
- Qi, Lihong;
- Sakoda, Lori C;
- Brennan, Paul;
- Yang, Ping;
- Zhang, Jie;
- Christiani, David C;
- Reis, Rui Manuel;
- Shiraishi, Kouya;
- Hung, Rayjean J;
- Xu, Wei;
- Liu, Geoffrey
Background
Somatic EGFR mutations define a subset of non-small cell lung cancers (NSCLC) with clinical impact on NSCLC risk and outcome. However, EGFR-mutation-status is often missing in epidemiologic datasets. We developed and tested pragmatic approaches to account for EGFR-mutation-status based on variables commonly included in epidemiologic datasets and evaluated the clinical utility of these approaches.
Methods
Through analysis of the International Lung Cancer Consortium (ILCCO) epidemiologic datasets, we developed a regression model for EGFR-status. We then applied two approaches to ILCCO survival analyses that did and did not account for EGFR-status: a clinical-restriction approach using the optimal cut-point, and an epidemiologic multiple-imputation approach.
Results
Of 35,356 ILCCO patients with NSCLC, EGFR-mutation-status was available in 4,231 patients. A model regressing known EGFR-mutation-status on clinical and demographic variables achieved a concordance index of 0.75 (95% CI, 0.74-0.77) in the training dataset and 0.77 (95% CI, 0.74-0.79) in the testing dataset. At an optimal cut-point of probability-score = 0.335, sensitivity was 69% and specificity was 72.5% for determining EGFR-wildtype status. In both restriction-based and imputation-based regression analyses of the role of BMI in overall survival of patients with NSCLC, similar results were observed between overall and EGFR-mutation-negative cohort analyses of patients of all ancestries. However, our approach identified some differences: EGFR-mutated Asian patients did not derive a survival benefit from being obese, whereas EGFR-wildtype Asian patients did.
Conclusions
We introduce a pragmatic method to evaluate the potential impact of EGFR-status on epidemiological analyses of NSCLC.
Impact
The proposed method is generalizable to the common situation in which EGFR-status data are missing.
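Illustrative sketch
The cut-point classification reported in Results can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the probability-score is the model's predicted probability of an EGFR mutation, that patients scoring below the cut-point are called EGFR-wildtype, and that wildtype is treated as the positive class; the abstract does not state these details. The function name and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def wildtype_sens_spec(
    scores: Sequence[float],
    is_mutant: Sequence[bool],
    cutoff: float = 0.335,
) -> Tuple[float, float]:
    """Sensitivity and specificity for calling EGFR-wildtype status.

    Assumption (not stated in the abstract): `scores` are predicted
    probabilities of an EGFR mutation, and a score below `cutoff`
    is called wildtype. Wildtype is the positive class.
    """
    tp = fn = tn = fp = 0
    for score, mutant in zip(scores, is_mutant):
        called_wildtype = score < cutoff
        if not mutant:
            # Truly wildtype: correct call is a true positive.
            if called_wildtype:
                tp += 1
            else:
                fn += 1
        else:
            # Truly mutated: a wildtype call is a false positive.
            if called_wildtype:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one wildtype and one mutated patient")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: four wildtype and four mutated patients.
toy_scores = [0.10, 0.20, 0.40, 0.30, 0.50, 0.80, 0.25, 0.60]
toy_is_mutant = [False, False, False, False, True, True, True, True]

sensitivity, specificity = wildtype_sens_spec(toy_scores, toy_is_mutant)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Under the restriction approach, patients called wildtype at the chosen cut-point would form the EGFR-mutation-negative cohort for the survival analysis; how the optimal cut-point itself was selected is not described in the abstract and is not modeled here.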