MRI-based brain stroke classification using a hybrid vision transformer-BiLSTM architecture

Abstract

INTRODUCTION: Stroke is a leading cause of long-term disability and affects patients' daily lives and socioeconomic status. Hemorrhagic and ischemic strokes vary in size, shape, and location, which makes automated detection challenging. Magnetic resonance imaging (MRI), particularly diffusion-weighted imaging (DWI), reveals changes in fluid balance and thereby enables early detection. Because of this greater sensitivity, MRI scans are more accurate than computed tomography (CT) scans for early detection.
METHODS: To classify brain strokes, a hybrid model combining a vision transformer (ViT) with bidirectional long short-term memory (BiLSTM) was developed using an MRI dataset from a private source. The ViT extracts features from the MRI scans, capturing global contextual and spatial representations through patch-based self-attention (16×16 patches, 256-dimensional projections, four transformer encoder layers with eight attention heads), whereas the BiLSTM network (128 and 64 units) models dependencies within the transformer-encoded features. The hybrid architecture was compared with other deep learning models: a baseline convolutional neural network (CNN, 85.5%), VGG16 (87.8%), ResNet50 (89.2%), ViT (91.3%), and BiLSTM (88.6%).
RESULTS: The hybrid ViT-BiLSTM model achieved a precision of 97.35%, recall of 93.04%, accuracy of 95.21%, F1-score of 95.15%, and ROC-AUC of 99.36%, outperforming the comparison models. The standalone ViT achieved an accuracy of 91.3%, exceeding the CNN-based methods. In 5-fold cross-validation, the hybrid ViT-BiLSTM model achieved an average accuracy of 96.61% with a standard deviation of 0.78, indicating stable performance across folds. These findings support combining bidirectional sequence modeling with transformer-based feature extraction.
CONCLUSION: By capturing global spatial context through self-attention and bidirectional dependencies through recurrent processing, the ViT-BiLSTM model improves stroke classification from MRI data. It is a promising approach for clinical decision support systems in early stroke diagnosis. Future research will use federated learning (FL) to protect privacy and to assess model generalizability across multi-institutional MRI datasets.
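
The architecture figures and reported metrics in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is not the authors' implementation. It traces the feature shapes implied by the stated hyperparameters and recomputes the F1-score from the reported precision and recall. The abstract does not give the input resolution, channel count, use of a class token, or whether the second BiLSTM layer returns a full sequence, so `image_size=224`, `channels=1`, the absence of a class token, and sequence outputs from both BiLSTM layers are assumptions made only for illustration. The names `HybridConfig`, `trace_shapes`, and `f1_score` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridConfig:
    # Values stated in the abstract
    patch_size: int = 16
    embed_dim: int = 256
    num_layers: int = 4
    num_heads: int = 8
    lstm_units: tuple = (128, 64)
    # Assumptions for illustration only (not given in the abstract)
    image_size: int = 224
    channels: int = 1


def trace_shapes(cfg: HybridConfig) -> dict:
    """Return per-stage feature shapes for a single input image."""
    if cfg.image_size % cfg.patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if cfg.embed_dim % cfg.num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")

    per_side = cfg.image_size // cfg.patch_size
    num_patches = per_side * per_side  # assumes no extra class token

    shapes = {
        # Each patch is flattened before the linear projection
        "patches": (num_patches, cfg.patch_size * cfg.patch_size * cfg.channels),
        # Encoder layers keep the (sequence length, embedding size) shape
        "tokens": (num_patches, cfg.embed_dim),
        # Width of each attention head
        "head_dim": cfg.embed_dim // cfg.num_heads,
    }
    # A bidirectional layer concatenates forward and backward states,
    # so its output width is twice the unit count (assumes sequence output)
    for i, units in enumerate(cfg.lstm_units, start=1):
        shapes[f"bilstm_{i}"] = (num_patches, 2 * units)
    return shapes


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


shapes = trace_shapes(HybridConfig())
for stage, shape in shapes.items():
    print(stage, shape)

# Reported precision and recall of the hybrid model, in percent
reported_f1 = round(f1_score(97.35, 93.04), 2)
print("F1 from reported precision and recall:", reported_f1)
```

Under these assumptions each image yields 196 patch tokens of width 256, each attention head is 32 wide, and the two BiLSTM layers emit features of width 256 and 128. The recomputed F1-score rounds to 95.15, which agrees with the value reported in the abstract.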
