De Novo exposomic geospatial assembly of chronic disease regions with machine learning & network analysis

Abstract

BACKGROUND: Determining spatial relationships between diseases and the exposome is limited by available methodologies. aPEER (algorithm for Projection of Exposome and Epidemiological Relationships) uses machine learning (ML) and network analysis to find spatial relationships between diseases and the exposome in the United States.

METHODS: Using aPEER, we examined the relationship between 12 chronic diseases and 186 pollutants. PCA, K-means clustering, and map projection produced clusters of counties derived from pollutants, and the Jaccard correlation between these clusters and chronic disease geography (defined as groups of counties with high chronic disease prevalence rates) was calculated. Disease-pollution correlation matrices were used together with network analysis to identify the strongest disease-pollution relationships. Results were compared to LISA, Moran's I, univariate, elastic net, and random forest regression.

FINDINGS: aPEER produced 68,820 human-interpretable maps with distinct pollution-derived regions. Acetaldehyde/benzo(a)pyrene was strongly associated with hypertension (J = 0.5316, p = 3.89 × 10^-208), stroke (J = 0.4517, p = 1.15 × 10^-127), and diabetes mellitus (J = 0.4425, p = 2.34 × 10^-127); formaldehyde/glycol ethers with COPD (J = 0.4545, p = 8.27 × 10^-131); and acetaldehyde/formaldehyde with stroke mortality (J = 0.4445, p = 4.28 × 10^-125). Methanol, acetaldehyde, and formaldehyde formed distinct regions in the southeast United States that correlated with both the Stroke and Diabetes Belts and were strongly associated with multiple chronic diseases. Pollutants predicted chronic disease geography with similar or superior areas under the curve compared to social determinants of health (SDOH) and preventive healthcare models (determined with random forest and elastic net methods). Conventional geospatial analysis methods did not identify these geospatial relationships, highlighting aPEER's utility.

INTERPRETATION: aPEER identified a pollution-defined geographical region associated with chronic disease, highlighting the role of aPEER in epidemiological and geospatial analysis, and of exposomics in understanding chronic disease geography.

FUNDING: This work was primarily funded by the BPHC, NHLBI (R03 HL157890), and the CDC, and in part by grants from the NIH (U01 HG007691, R01 HL155107, and HL166137), the American Heart Association (AHA24MERIT1185447), and the EU (HorizonHealth 2021 101057619) to JL.
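
The comparison step described in METHODS, scoring how well a pollution-derived county cluster overlaps a high-prevalence disease region, can be illustrated with a minimal sketch. It assumes that the "Jaccard correlation" is the standard Jaccard index (intersection over union of two county sets); the abstract does not spell out the exact definition or how the p-values were obtained, so no significance test is shown. The function names, cluster labels, and county codes below are hypothetical and are not taken from aPEER.

```python
def jaccard(a, b):
    """Standard Jaccard index |A ∩ B| / |A ∪ B| of two collections of county IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets: define the overlap as 0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


def best_matching_cluster(clusters, disease_counties):
    """Return (cluster_label, J) for the cluster that best overlaps the disease region."""
    scores = {
        label: jaccard(members, disease_counties)
        for label, members in clusters.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy data: county codes grouped by pollution-derived cluster label
# (in the paper these clusters come from PCA followed by K-means).
clusters = {
    0: {"01001", "01003", "13001", "28001"},
    1: {"06001", "06003", "41001"},
    2: {"36001", "36003"},
}

# Hypothetical set of counties with high prevalence of one chronic disease.
high_prevalence = {"01001", "01003", "13001", "45001"}

label, score = best_matching_cluster(clusters, high_prevalence)
print(label, round(score, 3))
```

In this toy case cluster 0 shares three counties with the disease region out of five counties in their union, so it is selected with J = 0.6. Repeating the calculation for every disease and every pollutant-derived clustering would yield the kind of disease-pollution matrix that the abstract describes as the input to network analysis.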
