Explainable text-tabular models for predicting mortality risk in companion animals

用于预测伴侣动物死亡风险的可解释文本表格模型

阅读:1

Abstract

As interest in using machine learning models to support clinical decision-making increases, explainability is an unequivocal priority for clinicians, researchers and regulators to comprehend and trust their results. With many clinical datasets containing a range of modalities, from the free-text of clinician notes to structured tabular data entries, there is a need for frameworks capable of providing comprehensive explanation values across diverse modalities. Here, we present a multimodal masking framework to extend the reach of SHapley Additive exPlanations (SHAP) to text and tabular datasets to identify risk factors for companion animal mortality in first-opinion veterinary electronic health records (EHRs) from across the United Kingdom. The framework is designed to treat each modality consistently, ensuring uniform and consistent treatment of features and thereby fostering predictability in unimodal and multimodal contexts. We present five multimodality approaches, with the best-performing method utilising PetBERT, a language model pre-trained on a veterinary dataset. Utilising our framework, we shed light for the first time on the reasons each model makes its decision and identify the inclination of PetBERT towards a more pronounced engagement with free-text narratives compared to BERT-base's predominant emphasis on tabular data. The investigation also explores the important features on a more granular level, identifying distinct words and phrases that substantially influenced an animal's life status prediction. PetBERT showcased a heightened ability to grasp phrases associated with veterinary clinical nomenclature, signalling the productivity of additional pre-training of language models.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。