Enhancing explainability in clinical deep-learning models: Latent-space variable decoding is superior to gradient-weighted class activation mapping

Abstract

BACKGROUND: Deep-learning models designed to assist with clinical decision making abound in cardiology. However, the "black box" nature of these models limits physicians' ability to cross-check model predictions against clinical gestalt. Analytical techniques such as the popular gradient-weighted class activation mapping (Grad-CAM) may provide insight into model explainability, but the reliability and reproducibility of these techniques have not been studied.

OBJECTIVE: To perform a rigorous assessment of the explainability offered by Grad-CAM, in comparison with alternative saliency methods provided by intrinsically explainable deep-learning models.

METHODS: We examined a well-phenotyped cohort of 1930 patients with hypertrophic cardiomyopathy (HCM) and available electrocardiographic waveform data. Novel deep-learning models were developed for the prediction of 2 high-risk HCM features: left ventricular (LV) apical aneurysm and massive LV hypertrophy. Saliency analysis was performed using (1) Grad-CAM and (2) latent-space variable decoding (LSVD).

RESULTS: Deep-learning models amenable to Grad-CAM- and LSVD-based saliency analysis demonstrated comparable performance in the identification of LV apical aneurysm (C statistic 0.95 vs 0.93) and massive LV hypertrophy (C statistic 0.82 vs 0.83) during holdout testing. However, while Grad-CAM produced highly variable visual assessments of model attention and offered little insight into the models' underlying decision-making processes, LSVD allowed direct visualization of the electrocardiographic characteristics that differentiated patients with and without the high-risk HCM features of interest. In addition, Kolmogorov-Smirnov goodness-of-fit testing of latent-space variables offered a method for prospectively assessing the likelihood of deep-learning model overfitting.

CONCLUSION: Deep-learning models amenable to LSVD analysis offered more robust explainability than did models amenable to the popular Grad-CAM analytical technique, while offering comparable predictive performance.
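
For readers unfamiliar with the Grad-CAM procedure referenced above, the following is a minimal sketch of how a Grad-CAM saliency trace can be computed for a 1-D convolutional ECG classifier. The network (ECGNet), its layer sizes, and the 12-lead, 5000-sample input are illustrative assumptions, not the authors' model; PyTorch is assumed.

import torch
import torch.nn as nn

class ECGNet(nn.Module):
    """Toy 1-D CNN standing in for an ECG waveform classifier (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(12, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(64, 1)

    def forward(self, x):
        a = self.features(x)                        # (batch, channels, time)
        logit = self.fc(self.pool(a).squeeze(-1))   # (batch, 1)
        return logit, a

model = ECGNet().eval()
ecg = torch.randn(1, 12, 5000)     # one simulated 12-lead, 5000-sample recording

logit, activations = model(ecg)
activations.retain_grad()          # keep gradients for the non-leaf feature maps
logit.sum().backward()

# Grad-CAM: weight each feature map by its time-averaged gradient, sum, then ReLU.
weights = activations.grad.mean(dim=2, keepdim=True)   # (1, 64, 1)
cam = torch.relu((weights * activations).sum(dim=1))   # (1, 5000)
cam = cam / (cam.max() + 1e-8)                         # normalize to [0, 1]
print(cam.shape)                                       # saliency over the waveform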
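
The Kolmogorov-Smirnov goodness-of-fit testing of latent-space variables mentioned in the results can likewise be sketched in a few lines. The sketch assumes (the abstract does not specify this) that each latent dimension is expected to follow a standard-normal prior, as in a variational autoencoder; the function name ks_test_latents and the simulated data are hypothetical.

import numpy as np
from scipy import stats

def ks_test_latents(z, alpha=0.05):
    """z: (n_samples, n_latent_dims) array of encoded latent values."""
    flagged = []
    for d in range(z.shape[1]):
        stat, p = stats.kstest(z[:, d], "norm")   # compare with N(0, 1)
        if p < alpha:
            flagged.append(d)
    # Dimensions departing strongly from the prior may signal overfitting.
    return flagged

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 16))   # simulated well-behaved latents
z[:, 3] += 2.0                   # one deliberately mis-specified dimension
print("Dimensions deviating from N(0, 1):", ks_test_latents(z))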
