A prospective randomized clinical trial for measuring radiology study reporting time on Artificial Intelligence-based detection of intracranial hemorrhage in emergent …

A Wismüller, L Stockmaster - Medical Imaging 2020 …, 2020 - spiedigitallibrary.org

Medical Imaging 2020: Biomedical Applications in Molecular …, 2020•spiedigitallibrary.org

The quantitative evaluation of Artificial Intelligence (AI) systems in a clinical context is challenging, and the development and implementation of meaningful performance metrics are still in their infancy. Here, we propose a scientific concept, Artificial Intelligence Prospective Randomized Observer Blinding Evaluation (AI-PROBE), for quantitative clinical performance evaluation of radiology AI systems within prospective randomized clinical trials. Our evaluation workflow encompasses a study design and a corresponding radiology Information Technology (IT) infrastructure that randomly blinds radiologists with regard to the presence of positive reads as provided by AI-based image analysis systems. To demonstrate the applicability of our AI-evaluation framework, we present a first prospective randomized clinical trial investigating the effect of automatic identification of Intra-Cranial Hemorrhage (ICH) in emergent care head CT scans on radiology study Turn-Around Time (TAT) in a clinical environment. We acquired 620 consecutive non-contrast head CT scans from CT scanners used for inpatient and emergency room patients at a large academic hospital over 14 consecutive days. Immediately following image acquisition, scans were automatically analyzed for the presence of ICH using commercially available software (Aidoc, Tel Aviv, Israel). Cases identified as positive for ICH by AI (ICH-AI+) were automatically flagged in the radiologists' reading worklists, and this flagging was randomly switched off with a probability of 50%. Study TAT was measured automatically as the time difference between study completion and first clinically communicated study reporting, with time stamps for these events automatically retrieved from various radiology IT systems.
TATs for flagged cases (73 ± 143 min) were significantly lower than TATs for non-flagged cases (132 ± 193 min) (p<0.05, one-sided t-test). Of the 122 ICH-AI+ cases, 105 were true positive reads. Overall sensitivity, specificity, and accuracy over all analyzed cases were 95.0%, 96.7%, and 96.4%, respectively. We conclude that automatic identification of ICH reduces study TAT for ICH in emergent care head CT settings, which carries the potential for improving clinical management of ICH by accelerating clinically indicated therapeutic interventions. In a broader context, our results suggest that our AI-PROBE framework can contribute to a systematic quantitative evaluation of AI systems in a clinical workflow environment with regard to clinically meaningful performance measures, such as TAT or diagnostic accuracy metrics.
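
The randomized flagging and TAT measurement described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`show_flag`, `tat_minutes`, `welch_t`, `ppv`) are hypothetical, and the abstract does not say which variant of the one-sided t-test was used, so the Welch statistic below is an assumption. Only the 50% flagging probability, the TAT definition, and the counts 105 and 122 come from the abstract.

```python
import random
from datetime import datetime
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def show_flag(rng: random.Random, p_flag: float = 0.5) -> bool:
    """Decide whether an AI-positive case is flagged in the reading worklist.

    The abstract states flagging was switched off with a probability of 50%.
    """
    return rng.random() < p_flag


def tat_minutes(study_completed: datetime, first_report: datetime) -> float:
    """Turn-around time: minutes from study completion to first communicated report."""
    return (first_report - study_completed).total_seconds() / 60.0


def welch_t(flagged: Sequence[float], unflagged: Sequence[float]) -> float:
    """Welch t statistic (assumed test variant); negative means flagged TATs are lower."""
    se = sqrt(variance(flagged) / len(flagged) + variance(unflagged) / len(unflagged))
    return (mean(flagged) - mean(unflagged)) / se


def ppv(true_positives: int, ai_positives: int) -> float:
    """Positive predictive value of the AI reads."""
    return true_positives / ai_positives


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make this sketch repeatable
    print(show_flag(rng))   # True or False, each with probability 0.5
    print(tat_minutes(datetime(2020, 1, 1, 10, 0), datetime(2020, 1, 1, 11, 13)))  # 73.0
    print(round(ppv(105, 122), 3))  # 0.861, from the counts reported in the abstract
```

The per-group sample sizes are not reported in the abstract, so the sketch does not attempt to reproduce the published p-value; `welch_t` only shows the form such a comparison could take once individual TATs are available.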
