Mythological Medical Machine Learning: Boosting the Performance of a Deep Learning Medical Data Classifier Using Realistic Physiological Models

Sadiq, Ismail; Perez-Alday, Erick A.; Shah, Amit J.; Rad, Ali Bahrami; Sameni, Reza; Clifford, Gari D.

Computer Science > Machine Learning

arXiv:2112.15442 (cs)

[Submitted on 28 Dec 2021]


Authors:Ismail Sadiq (1), Erick A. Perez-Alday (2), Amit J. Shah (2), Ali Bahrami Rad (2), Reza Sameni (2), Gari D. Clifford (1,2)


Abstract: Objective: To determine whether a realistic but computationally efficient model of the electrocardiogram (ECG) can be used to pre-train a deep neural network (DNN) with a wide range of morphologies and abnormalities specific to a given condition - T-wave Alternans (TWA) as a result of Post-Traumatic Stress Disorder, or PTSD - and significantly boost performance on a small database of rare individuals.
Approach: Using a previously validated artificial ECG model, we generated 180,000 artificial ECGs with or without significant TWA, with varying heart rate, breathing rate, TWA amplitude, and ECG morphology. The output layer of a DNN, trained on over 70,000 patients to classify 25 different rhythms, was modified to a binary class (TWA or no-TWA, or equivalently, PTSD or no-PTSD), and transfer learning was performed on the artificial ECGs. In a final transfer learning step, the DNN was trained and cross-validated on ECGs from 12 individuals with PTSD and 24 controls, for all combinations of the three databases.
Main results: The best performing approach (AUROC = 0.77, Accuracy = 0.72, F1-score = 0.64) was found by performing both transfer learning steps, using the pre-trained arrhythmia DNN, the artificial data and the real PTSD-related ECG data. Removing the artificial data from training led to the largest drop in performance. Removing the arrhythmia data from training led to a modest but significant drop in performance. The final model showed no significant drop in performance on the artificial data, indicating no overfitting.
Significance: In healthcare, it is common to only have a small collection of high-quality data and labels, or a larger database with much lower quality (and less relevant) labels. The paradigm presented here, involving model-based performance boosting, provides a solution through transfer learning on a large realistic artificial database, and a partially relevant real database.
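
The data-generation idea in the Approach can be illustrated with a minimal sketch. This is not the authors' validated ECG model: it assumes a simple sum-of-Gaussians beat template, and every function name, wave parameter, and sampling range below is hypothetical and chosen only for illustration. TWA is imposed as an ABAB alternation of the T-wave amplitude between consecutive beats, and the label is 1 when a non-zero alternans amplitude is used. Breathing-rate modulation and morphology variation, which the abstract mentions, are left out for brevity.

```python
import math
import random


def gaussian(t, center, width, amplitude):
    """Gaussian bump used as a building block for each ECG wave."""
    return amplitude * math.exp(-((t - center) ** 2) / (2.0 * width ** 2))


def synth_beat(n_samples, t_amp_mv):
    """One beat on a normalized time axis [0, 1), built from five Gaussians."""
    # (center, width, amplitude in mV): illustrative values, not from the paper.
    waves = [
        (0.20, 0.025, 0.15),      # P wave
        (0.37, 0.010, -0.10),     # Q wave
        (0.40, 0.012, 1.00),      # R wave
        (0.43, 0.010, -0.20),     # S wave
        (0.70, 0.050, t_amp_mv),  # T wave (amplitude alternates under TWA)
    ]
    return [
        sum(gaussian(i / n_samples, c, w, a) for c, w, a in waves)
        for i in range(n_samples)
    ]


def synth_ecg(n_beats=8, heart_rate_bpm=60.0, fs=250, twa_uv=0.0, t_amp_mv=0.30):
    """Concatenate beats; even beats get +twa/2 and odd beats -twa/2 on the T wave."""
    n_samples = round(fs * 60.0 / heart_rate_bpm)
    delta_mv = twa_uv / 1000.0  # microvolts to millivolts
    signal = []
    for beat in range(n_beats):
        offset = delta_mv / 2 if beat % 2 == 0 else -delta_mv / 2
        signal.extend(synth_beat(n_samples, t_amp_mv + offset))
    return signal, n_samples


def twa_estimate_uv(signal, n_samples):
    """Even-minus-odd difference of the mean T-wave peak, in microvolts."""
    t_idx = round(0.70 * n_samples)
    peaks = [signal[b * n_samples + t_idx] for b in range(len(signal) // n_samples)]
    even, odd = peaks[0::2], peaks[1::2]
    return 1000.0 * abs(sum(even) / len(even) - sum(odd) / len(odd))


def make_dataset(n_records, seed=0):
    """Labeled records: label 1 = TWA present, 0 = absent (assumed ranges)."""
    rng = random.Random(seed)
    records = []
    for k in range(n_records):
        label = k % 2
        twa_uv = rng.uniform(20.0, 60.0) if label else 0.0  # hypothetical range
        heart_rate = rng.uniform(50.0, 100.0)                # hypothetical range
        signal, _ = synth_ecg(heart_rate_bpm=heart_rate, twa_uv=twa_uv)
        records.append((signal, label))
    return records
```

Calling `synth_ecg(twa_uv=40.0)` and passing the result to `twa_estimate_uv` should recover an alternans amplitude close to 40 microvolts, which gives a quick sanity check that the labels match the injected signal before any records are used for pre-training.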

Comments:	Presented at the University of Chicago Data Science Institute Dec 6th 2021.
Subjects:	Machine Learning (cs.LG)
MSC classes:	92C30, 92C32, 03H10, 62H30, 68Q07, 68T07, 78-10, 92-10, 62R07, 68T09, 68T10
ACM classes:	I.5.1; I.5.2; I.5.4; I.6.3; I.2.1; J.3
Cite as:	arXiv:2112.15442 [cs.LG]
	(or arXiv:2112.15442v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2112.15442

Submission history

From: Gari Clifford
[v1] Tue, 28 Dec 2021 17:55:37 UTC (3,021 KB)
