Abstract
BACKGROUND: Optical coherence tomography (OCT) enables high-resolution, non-invasive visualization of retinal structures. Recent evidence suggests that retinal layer alterations may reflect central nervous system changes associated with psychiatric disorders such as schizophrenia (SZ).
AIM: To develop a deep learning model that classifies OCT images to distinguish patients with SZ from healthy controls on the basis of retinal biomarkers.
METHODS: A novel convolutional neural network, Self-AttentionNeXt, was designed by integrating grouped self-attention mechanisms, residual and inverted bottleneck blocks, and a final 1 × 1 convolution for feature refinement. The model was trained and tested on both a custom OCT dataset collected from patients with SZ and a publicly available OCT dataset (OCT2017).
RESULTS: Self-AttentionNeXt achieved 97.0% accuracy on the collected SZ OCT dataset and over 95% accuracy on the public OCT2017 dataset. Gradient-weighted class activation mapping (Grad-CAM) visualizations showed that the model attends to clinically relevant retinal regions, suggesting effective feature localization.
CONCLUSION: Self-AttentionNeXt effectively combines transformer-inspired attention mechanisms with a convolutional neural network architecture to support the early and accurate detection of SZ from OCT images. This approach offers a promising direction for artificial intelligence-assisted psychiatric diagnostics and clinical decision support.
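
The gradient-weighted class activation mapping step reported in the results can be illustrated with a minimal sketch of the standard Grad-CAM computation. It is not the authors' implementation. It assumes that the feature maps of one convolutional layer and the gradients of the class score with respect to those maps have already been extracted from a trained network. The function name `grad_cam` and the nested-list input format are hypothetical choices made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(activations: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Standard Grad-CAM heatmap for one image and one target class.

    activations: K feature maps (each H x W) from a chosen convolutional layer.
    gradients:   K maps (each H x W) holding d(class score)/d(activation).
    Both inputs are assumed to be precomputed; this sketch does not run a model.
    """
    if not activations or len(activations) != len(gradients):
        raise ValueError("activations and gradients must be non-empty and equal in length")

    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global average of the gradients over spatial positions.
    weights = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]

    # Weighted sum of the feature maps, followed by ReLU to keep only
    # locations that contribute positively to the class score.
    heatmap = [
        [
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be overlaid on the input image.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap
```

In practice the resulting map would be upsampled to the resolution of the OCT scan and overlaid on it to inspect which retinal regions drive the prediction.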