Leveraging Clinical Record Geolocation for Improved Alzheimer's Disease Diagnosis Using DMV Framework

Abstract

Background: Early detection of Alzheimer's disease (AD) is critical for timely intervention, but clinical assessments and neuroimaging are often costly and resource intensive. Natural language processing (NLP) of clinical records offers a scalable alternative, and integrating geolocation may capture complementary environmental risk signals.

Methods: We propose the DMV (Data processing, Model training, Validation) framework, which frames early AD detection as a regression task predicting a continuous risk score ("data_value") from clinical text and structured features. We evaluated embeddings from Llama3-70B, GPT-4o (via text-embedding-ada-002), and GPT-5 (text-embedding-3-large) combined with a Random Forest regressor on a CDC-derived dataset (≈284k records). Models were trained and assessed using 10-fold cross-validation. Performance metrics included Mean Squared Error (MSE), Mean Absolute Error (MAE), and R²; paired t-tests and Wilcoxon signed-rank tests assessed statistical significance.

Results: Including geolocation (latitude and longitude) consistently improved performance across models. For the Random Forest baseline, MSE decreased by 48.6% when geolocation was added. Embedding-based models showed larger gains; GPT-5 with geolocation achieved the best results (MSE = 14.0339, MAE = 2.3715, R² = 0.9783), and the reduction in error from adding geolocation was statistically significant (p < 0.001, paired tests).

Conclusions: Combining high-quality text embeddings with patient geolocation yields substantial and statistically significant improvements in AD risk estimation. Incorporating spatial context alongside clinical text may help clinicians account for environmental and regional risk factors and improve early detection in scalable, data-driven workflows.
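The evaluation quantities named in the abstract (MSE, MAE, R², the relative MSE reduction, and a paired comparison across cross-validation folds) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names are hypothetical, the formulas are the standard textbook definitions, and only the paired t statistic is computed (a p-value would additionally require the t distribution with n - 1 degrees of freedom).

```python
import math
import statistics


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_reduction(before, after):
    """Percentage decrease of a metric, e.g. MSE without vs. with geolocation."""
    return 100.0 * (before - after) / before


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic over matched per-fold scores.

    Assumes both lists come from the same folds in the same order and that
    the per-fold differences are not all identical (nonzero std deviation).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
```

In a 10-fold setup as described above, `scores_a` and `scores_b` would be the ten per-fold MSE values of a model without and with latitude and longitude; a large positive statistic would indicate that the geolocation variant has consistently lower error.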
