Abstract
Autism spectrum disorder (ASD) is a neurodevelopmental condition characterized by genetic heterogeneity. Post-transcriptional regulation, particularly alternative polyadenylation (APA), plays a critical role in the pathogenesis of ASD. APA controls mRNA stability, translational efficiency, and subcellular localization by modulating the length of the 3' untranslated region (3' UTR) of mRNA. APA profiling can uncover functionally relevant post-transcriptional alterations that conventional gene expression analyses often miss. However, current ASD analyses still rely largely on differential gene expression or on the detection of individual APA events, and thus overlook the collective explanatory power of ASD risk genes and co-dysregulated functional gene modules within specific cell types. In this study, we present an integrative computational framework that combines matrix factorization and machine learning to identify APA-driven, ASD-associated gene modules and to predict ASD-related cells in a cell-type-specific manner. Applied to human brain single-nucleus RNA sequencing (snRNA-seq) data, our approach systematically uncovers APA regulatory patterns in ASD that are specific to cell type, brain region, and sex. The identified APA modules are significantly enriched in pathways related to synaptic function, neurodevelopment, and immune response, with the strongest signals observed in excitatory neurons of the prefrontal cortex. Using APA genes from these modules as features, we built a classification model that effectively distinguishes ASD cells from normal cells. Moreover, we found that integrating APA with gene expression, two complementary modalities, substantially improves prediction accuracy, underscoring APA as an independent and biologically informative regulatory layer. Our work delineates a high-resolution APA regulatory landscape in ASD, offering novel insights and potential therapeutic avenues beyond transcriptional abundance.