Abstract
INTRODUCTION: Schizophrenia (SZ) is a complex psychiatric disorder whose neural mechanisms remain unclear. Functional connectivity (FC) offers a unique perspective on its pathology, but its high dimensionality poses significant challenges for feature selection and model interpretation. Traditional feature selection methods, while predictive, do not integrate prior neuroscience knowledge, which limits their clinical relevance. METHODS: To address this, we propose a framework that combines feature selection guided by a large language model (LLM) with counterfactual explanation. The framework leverages the brain disease knowledge encoded in the LLM to guide dimensionality reduction of high-dimensional FC, so that the selected features are both statistically significant and biologically plausible. Counterfactual explanations are then used to generate causal intervention examples, which the LLM translates into intuitive natural-language explanations, providing understandable and actionable clinical insights for individual patients or physicians. RESULTS: We validate our approach on five real-world SZ datasets and demonstrate that it not only improves classification performance but also provides new insights into SZ analysis. DISCUSSION: The LLM-based FC analysis method proposed in this study achieves effective feature selection and interpretability on multiple SZ datasets. Its main advantage is its ability to effectively screen key FC features for brain regions. However, the method has several limitations: it is difficult to apply directly in clinical settings because of data heterogeneity, it cannot accurately locate individual FC abnormalities, and the hyperparameters for counterfactual generation have not yet been optimized.