Learnable Prompt as Pseudo-Imputation: Reassessing the Necessity of Traditional EHR Data Imputation in Downstream Clinical Prediction

Liao, Weibin; Zhu, Yinghao; Wang, Zixiang; Chu, Xu; Wang, Yasha; Ma, Liantao

Computer Science > Machine Learning

arXiv:2401.16796 (cs)

[Submitted on 30 Jan 2024]

Title:Learnable Prompt as Pseudo-Imputation: Reassessing the Necessity of Traditional EHR Data Imputation in Downstream Clinical Prediction

Authors:Weibin Liao, Yinghao Zhu, Zixiang Wang, Xu Chu, Yasha Wang, Liantao Ma

View PDF

Abstract:Analyzing the health status of patients based on Electronic Health Records (EHR) is a fundamental research problem in medical informatics. The presence of extensive missing values in EHR makes it challenging for deep neural networks to directly model the patient's health status based on EHR. Existing deep learning training protocols require the use of statistical information or imputation models to reconstruct missing values; however, the protocols inject non-realistic data into downstream EHR analysis models, significantly limiting model performance. This paper introduces Learnable Prompt as Pseudo Imputation (PAI) as a new training protocol. PAI no longer introduces any imputed data but constructs a learnable prompt to model the implicit preferences of the downstream model for missing values, resulting in a significant performance improvement for all EHR analysis models. Additionally, our experiments show that PAI exhibits higher robustness in situations of data insufficiency and high missing rates. More importantly, in a real-world application involving cross-institutional data with zero-shot evaluation, PAI demonstrates stronger model generalization capabilities for non-overlapping features.

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2401.16796 [cs.LG]
	(or arXiv:2401.16796v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2401.16796

Submission history

From: Yinghao Zhu [view email]
[v1] Tue, 30 Jan 2024 07:19:36 UTC (3,871 KB)

Computer Science > Machine Learning

Title:Learnable Prompt as Pseudo-Imputation: Reassessing the Necessity of Traditional EHR Data Imputation in Downstream Clinical Prediction

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Machine Learning

Title:Learnable Prompt as Pseudo-Imputation: Reassessing the Necessity of Traditional EHR Data Imputation in Downstream Clinical Prediction

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators