Efficient Ensemble Learning with Curriculum-Based Masked Autoencoders for Retinal OCT Classification


Abstract

Background/Objectives: Retinal optical coherence tomography (OCT) is essential for diagnosing ocular diseases, yet developing high-performing multiclass classifiers remains challenging due to limited labeled data and the computational cost of self-supervised pretraining. This study aims to address these limitations by introducing a curriculum-based self-supervised framework to improve representation learning and reduce computational burden for OCT classification. Methods: Two ensemble strategies were developed using progressive masked autoencoder (MAE) pretraining. We refer to this curriculum-based MAE framework as CurriMAE (curriculum-based masked autoencoder). CurriMAE-Soup merges multiple curriculum-aware pretrained checkpoints using weight averaging, producing a single model for fine-tuning and inference. CurriMAE-Greedy selects top-performing fine-tuned models from different pretraining stages and ensembles their predictions. Both approaches rely on a single curriculum-guided MAE pretraining run, avoiding repeated training with fixed masking ratios. Experiments were conducted on two publicly available retinal OCT datasets: the Kermany dataset for self-supervised pretraining and the OCTDL dataset for downstream evaluation. The OCTDL dataset comprises seven clinically relevant retinal classes: normal retina, age-related macular degeneration (AMD), diabetic macular edema (DME), epiretinal membrane (ERM), retinal vein occlusion (RVO), retinal artery occlusion (RAO), and vitreomacular interface disease (VID). The proposed methods were compared against standard MAE variants and supervised baselines, including ResNet-34 and ViT-S. Results: Both CurriMAE methods outperformed standard MAE models and supervised baselines.
CurriMAE-Greedy achieved the highest performance with an area under the receiver operating characteristic curve (AUC) of 0.995 and accuracy of 93.32%, while CurriMAE-Soup provided competitive accuracy with substantially lower inference complexity. Compared with MAE models trained at fixed masking ratios, the proposed methods improved accuracy while requiring fewer pretraining runs and reduced model storage for inference. Conclusions: The proposed curriculum-based self-supervised ensemble framework offers an effective and resource-efficient solution for multiclass retinal OCT classification. By integrating progressive masking with snapshot-based model fusion, CurriMAE methods provide high performance with reduced computational cost, supporting their potential for real-world ophthalmic imaging applications where labeled data and computational resources are limited.
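To make the three mechanisms named in the abstract concrete, the sketch below illustrates (not the authors' implementation) how a curriculum masking schedule, CurriMAE-Soup-style checkpoint averaging, and CurriMAE-Greedy-style ensemble selection could look. All function names, the linear schedule, and the masking-ratio endpoints (0.5 to 0.9) are hypothetical choices for illustration; the paper does not specify them here.

```python
def masking_schedule(epoch, total_epochs, start=0.5, end=0.9):
    """Curriculum pretraining: progressively raise the MAE masking ratio
    from an easy setting (start) to a hard one (end). Linear ramp and the
    0.5/0.9 endpoints are assumptions, not values from the paper."""
    frac = epoch / max(total_epochs - 1, 1)
    return start + (end - start) * frac

def average_checkpoints(state_dicts):
    """CurriMAE-Soup style: average parameters across curriculum
    checkpoints, yielding a single model for fine-tuning and inference.
    Checkpoints are represented as {param_name: value} dicts."""
    n = len(state_dicts)
    return {k: sum(sd[k] for sd in state_dicts) / n
            for k in state_dicts[0]}

def greedy_ensemble(candidates, score_fn, max_size=3):
    """CurriMAE-Greedy style: greedily add the fine-tuned model whose
    inclusion most improves a validation score of the ensemble; stop
    when no candidate improves it or max_size is reached."""
    selected, pool = [], list(candidates)
    best = float("-inf")
    while pool and len(selected) < max_size:
        top_score, top = max(
            (score_fn(selected + [c]), c) for c in pool
        )
        if top_score <= best:
            break  # no candidate improves the ensemble
        best = top_score
        selected.append(top)
        pool.remove(top)
    return selected, best
```

With a toy score function (the mean "accuracy" of the members), the greedy loop picks the strongest model first and stops once averaging in a weaker one would lower the score, which matches the abstract's point that CurriMAE-Soup keeps inference cheap (one averaged model) while CurriMAE-Greedy trades extra inference cost for higher accuracy.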
