Multimodal AI for Home Wound Patient Referral Decisions From Images With Specialist Annotations



Abstract

Chronic wounds affect 8.5 million Americans, particularly the elderly and patients with diabetes. Because regular care is critical for proper healing, many patients receive care at home from visiting nurses and caregivers with variable wound expertise. Problematic, non-healing wounds should be referred to experts at wound clinics to avoid adverse outcomes such as limb amputation. Unfortunately, owing to this lack of wound expertise, referral decisions made in non-clinical settings can be erroneous, delayed, or unnecessary. This paper proposes the Deep Multimodal Wound Assessment Tool (DM-WAT), a novel machine learning framework that supports visiting nurses by recommending wound referral decisions from smartphone-captured wound images and associated clinical notes. DM-WAT extracts visual features from wound images using DeiT-Base-Distilled, a Vision Transformer (ViT) architecture. Distillation-based training facilitates representation learning and knowledge transfer from a larger teacher model to DeiT-Base, enabling robust performance on our small dataset of 205 wound images. DM-WAT extracts text features from clinical notes using DeBERTa-base, which captures context by disentangling content and position information in the notes. Visual and text features are combined using an intermediate fusion approach. To overcome the challenges posed by a small, imbalanced dataset, DM-WAT combines image and text augmentation with transfer learning via pre-trained feature extractors. In rigorous evaluation, DM-WAT achieved an accuracy of 77% [Formula: see text]% and an F1 score of 70% [Formula: see text]%, outperforming the prior state of the art and all single-modality and multimodal baselines. Finally, to interpret DM-WAT's recommendations, the Score-CAM and Captum interpretation algorithms provide insight into the specific parts of the image and text inputs the model focused on when making decisions.
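The intermediate fusion described above concatenates the image and text feature vectors before a shared classification head. A minimal PyTorch sketch of this pattern is shown below; note that the `image_encoder` and `text_encoder` here are hypothetical linear stand-ins for the pre-trained DeiT-Base-Distilled and DeBERTa-base backbones (both of which emit 768-dimensional embeddings), and all dimensions and layer sizes are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn


class IntermediateFusionClassifier(nn.Module):
    """Sketch of feature-level (intermediate) fusion of image and text embeddings."""

    def __init__(self, img_dim=768, txt_dim=768, hidden=256, n_classes=2):
        super().__init__()
        # Stand-in encoders: in DM-WAT these would be DeiT-Base-Distilled
        # (image) and DeBERTa-base (text), each producing a 768-d embedding.
        self.image_encoder = nn.Linear(3 * 224 * 224, img_dim)
        self.text_encoder = nn.Linear(512, txt_dim)
        # Shared head operating on the concatenated multimodal feature vector.
        self.fusion_head = nn.Sequential(
            nn.Linear(img_dim + txt_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, image, text_feats):
        f_img = self.image_encoder(image.flatten(1))   # (batch, img_dim)
        f_txt = self.text_encoder(text_feats)          # (batch, txt_dim)
        fused = torch.cat([f_img, f_txt], dim=1)       # intermediate fusion
        return self.fusion_head(fused)                 # (batch, n_classes)


model = IntermediateFusionClassifier()
logits = model(torch.randn(4, 3, 224, 224), torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 2])
```

Fusing at the feature level (rather than averaging per-modality predictions) lets the classifier learn cross-modal interactions, e.g. a note mentioning "increased drainage" reinforcing visual signs of a deteriorating wound.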
