Abstract
BACKGROUND: APOE-ε4, the strongest genetic risk factor for Alzheimer's disease (AD), is linked to early motor vulnerability, including subtle changes in speech control. Because speech integrates fine neuromotor processes, acoustic analysis could offer a sensitive, noninvasive marker of preclinical effects.
OBJECTIVES: To determine whether speech acoustics distinguish cognitively healthy APOE-ε4 carriers from non-carriers, and to assess which speech tasks provide the best classification performance.
DESIGN: A cross-sectional observational study employing supervised machine learning to analyze acoustic features extracted from multiple speech tasks. Genetic algorithms (GAs) were used for feature selection, and model performance was compared across task contexts.
SETTING: All assessments and recordings were conducted in a sound-attenuated laboratory at the MGH Institute of Health Professions.
PARTICIPANTS: Forty-four cognitively healthy adults (19 APOE-ε4 carriers, 25 non-carriers), aged 57-79 years, with no history of neurological or psychiatric conditions.
MEASUREMENTS: Digitized speech was analyzed for 88 eGeMAPS acoustic features. Random Forest classifiers were trained to distinguish genotypes; model optimization employed GAs and stratified cross-validation. Performance was evaluated using F1 scores, with subgroup analyses for sex effects.
RESULTS: Random Forest classification of spontaneous speech achieved F1 scores above 0.90 for distinguishing APOE-ε4 carrier status, outperforming classification based on structured tasks. GA-based feature selection consistently improved classification. Accuracy was highest among female participants. Analyses of the combined speech dataset confirmed the robustness and generalizability of the results.
CONCLUSIONS: Automated analysis of speech acoustics, especially from spontaneous speech, detects APOE-ε4 carrier status in asymptomatic adults, supporting speech as a scalable digital biomarker for early Alzheimer's risk.
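As an illustration of the evaluation terms used above, the following is a minimal sketch of stratified fold assignment and the F1 score, written in plain Python. It is not the study's implementation: the fold count (`k=5`), the choice of carriers as the positive class, and the helper names `stratified_folds` and `f1_score` are assumptions made for the example, and no classifier is trained. Only the group sizes (19 carriers, 25 non-carriers) come from the abstract.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so each fold keeps roughly the
    same class proportions as the full sample (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)  # randomize which participants land in which fold
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)  # deal out round-robin within each class
    return folds


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Group sizes from the abstract: 1 = carrier (assumed positive class), 0 = non-carrier.
labels = [1] * 19 + [0] * 25
folds = stratified_folds(labels, k=5)  # k=5 is an assumption, not stated in the abstract
carriers_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(carriers_per_fold)  # [4, 4, 4, 4, 3]

# Toy predictions, purely illustrative: 2 true positives, 1 false positive, 1 false negative.
print(round(f1_score([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]), 3))  # 0.667
```

With a sample this small, stratification matters: it keeps carriers spread across the folds (three or four per fold here) rather than leaving the split to chance, and F1 is less easily inflated than raw accuracy when the two groups are of unequal size.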