SpeechDx: A gold‐standard speech‐and‐language dataset for prognostic AD biomarker development



Abstract

BACKGROUND: Blood biomarkers are highly effective in identifying individuals with amyloid pathology, but not all patients with amyloid will go on to develop symptomatic Alzheimer's disease. Prognostic biomarkers are needed to predict who will experience future cognitive decline and who is most likely to benefit from early intervention. Speech‐derived biomarkers, built from acoustic and linguistic features of speech, hold significant potential as scalable prognostic biomarkers that complement blood biomarkers and other markers of pathology. However, their development has been hindered by the lack of large, clinically annotated speech datasets needed to train machine learning models.

METHOD: SpeechDx is a longitudinal observational study that collects and harmonizes speech, biomarker, and clinical data from up to 3,000 participants across clinical sites in the U.S., Australia, and Spain. The study population was selected to capture data from participants who may experience either stable cognition or cognitive decline during the 3 years of SpeechDx data collection. Participants span the full cognitive spectrum, including cognitively normal (CN), subjective cognitive decline (SCD), mild cognitive impairment (MCI), and Alzheimer's disease (AD), but the majority of the study population is enrolled as CN or SCD. Each quarter, participants remotely complete a brief battery of speech‐ and language‐eliciting tasks via a custom‐built SpeechDx app on a study‐provided tablet. Tasks are designed to elicit semi‐constrained and unconstrained speech, and include picture description, story recall, storytelling, and open‐ended questions. Concurrently, clinical sites collect high‐quality clinical and biomarker data from participants, including longitudinal blood AD biomarkers, MRI, and neuropsychological assessments. Clinical and biomarker data are paired with individual speech samples, de‐identified, and harmonized across all sites to form the unified SpeechDx Dataset. The Dataset is hosted at the AD Data Initiative Workbench, and access is managed by the Data Access Committee.

RESULT: SpeechDx is currently enrolling participants across clinical sites in the U.S., Australia, and Spain, with interim data releases to SpeechDx partners starting in 2025. Full dataset completion is anticipated by the end of 2028.

CONCLUSION: SpeechDx facilitates the development of prognostic AD speech and language biomarkers through the creation of a harmonized database of longitudinal speech, biomarker, and clinical data.
