Knowledge graph-augmented large language models for reconstructing life course risk pathways: a gestational diabetes mellitus-to-dementia case study



Abstract

OBJECTIVES: To develop and evaluate a knowledge graph-augmented large language model (LLM) framework that synthesizes epidemiological evidence to infer life-course exposure-outcome pathways, using gestational diabetes mellitus (GDM) and dementia as a case study.

MATERIALS AND METHODS: We constructed a causal knowledge graph by extracting empirical epidemiological associations from scientific literature, excluding hypothetical assertions. The graph was integrated with GPT-4 through four graph retrieval-augmented generation (GRAG) strategies to infer bridging variables between early-life exposure (GDM) and later-life outcome (dementia). Semantic triples served as structured inputs to support LLM reasoning. Each GRAG strategy was evaluated by human clinical experts and three LLM-based reviewers (GPT-4o, Llama 3-70B, and Gemini Advanced), assessing scientific reliability, novelty, and clinical relevance.

RESULTS: The GRAG strategy using a minimal set of abstracts specifically related to GDM-dementia bridging variables performed comparably to the strategy using broader sub-community abstracts, and both significantly outperformed approaches using the full GDM- or dementia-related corpus or baseline GPT-4 without external augmentation. The knowledge graph-augmented LLM identified 108 maternal candidate mediators, including validated risk factors such as chronic kidney disease and physical inactivity. The structured approach improved accuracy and reduced confabulation compared to standard LLM outputs.

DISCUSSION: Our findings suggest that augmenting LLMs with epidemiological knowledge graphs enables effective reasoning over fragmented literature and supports the reconstruction of progressive risk pathways. Expert assessments revealed that LLMs may overestimate clinical relevance, highlighting the need for human-AI collaboration in interpretation and application.

CONCLUSION: Integrating semantic epidemiological knowledge with LLMs via GRAG strategies provides a promising framework for life-course epidemiology, enabling early detection of modifiable risk factors and guiding variable selection in cohort study design.
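
The idea of inferring bridging variables from semantic triples can be illustrated with a minimal sketch. The abstract does not describe the graph schema, the retrieval code, or the prompt format, so everything below is an assumption for illustration: the triple layout, the `associated_with` predicate, the function names `bridging_variables` and `to_prompt_context`, and the toy triples themselves (which reuse two mediators named in the abstract plus a placeholder, and are not data from the study). The sketch only finds two-hop paths from exposure to outcome and formats them as text that could be passed to an LLM as context.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object). Illustrative assumptions only,
# NOT data extracted in the study.
TRIPLES = [
    ("gestational diabetes mellitus", "associated_with", "chronic kidney disease"),
    ("gestational diabetes mellitus", "associated_with", "physical inactivity"),
    ("gestational diabetes mellitus", "associated_with", "placeholder variable"),
    ("chronic kidney disease", "associated_with", "dementia"),
    ("physical inactivity", "associated_with", "dementia"),
]


def bridging_variables(triples, exposure, outcome):
    """Return nodes B such that exposure -> B and B -> outcome both exist."""
    successors = defaultdict(set)
    for subj, _pred, obj in triples:
        successors[subj].add(obj)
    return sorted(
        node
        for node in successors.get(exposure, set())
        if outcome in successors.get(node, set())
    )


def to_prompt_context(triples, exposure, outcome):
    """Format only the triples that lie on a two-hop exposure-outcome path."""
    bridges = set(bridging_variables(triples, exposure, outcome))
    on_path = [
        (s, p, o)
        for s, p, o in triples
        if (s == exposure and o in bridges) or (s in bridges and o == outcome)
    ]
    return "\n".join(f"({s}) -[{p}]-> ({o})" for s, p, o in on_path)


if __name__ == "__main__":
    print(bridging_variables(TRIPLES, "gestational diabetes mellitus", "dementia"))
    print(to_prompt_context(TRIPLES, "gestational diabetes mellitus", "dementia"))
```

Restricting the context to triples on an exposure-outcome path loosely mirrors the reported finding that a minimal, bridging-specific evidence set performed comparably to broader retrieval; a real pipeline would presumably handle longer paths, edge direction, and evidence provenance.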
