Causal Discovery in Observational Medical Research: Scoping Review



Abstract

BACKGROUND: Observational data are fundamental to medical research but present formidable challenges for causal inference. Machine learning-based causal discovery algorithms have emerged as a promising way to identify causal structures directly from such data. However, the current literature is skewed toward theoretical and methodological innovation. Systematic assessments of performance in medical research settings are lacking, as is practical guidance for clinicians and researchers on selecting and applying these algorithms in specific medical contexts.

OBJECTIVE: This study aimed to systematically map and synthesize the application of causal discovery methods in observational medical research, detailing the methodologies used, their application domains, the robustness of the findings, and the practical challenges encountered.

METHODS: Following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines, we conducted a systematic search of Scopus, Web of Science, PubMed, MEDLINE, Embase, and CINAHL from inception to May 2025. We included studies that applied any causal discovery algorithm within a medical research context, encompassing both analyses of real-world observational data and method-validation studies using synthetic or benchmark datasets with a clear medical focus. Purely methodological papers and studies based solely on experimental data were excluded. Data were extracted and synthesized using a descriptive analysis focused on study characteristics, algorithm types, application domains, reported numerical results, and implementation challenges.

RESULTS: Of 2296 identified publications, 72 (3.1%) met the inclusion criteria. Our synthesis revealed three key themes. The first theme was the methodological landscape: constraint-based algorithms were the most prevalent (38/72, 52.8%), with the fast causal inference (10/72, 13.9%) and Peter-Clark (9/72, 12.5%) algorithms being the most common. Score-based (19/72, 26.4%) and hybrid (14/72, 19.4%) methods also represented significant and growing segments (method categories were not mutually exclusive). The second theme was application domains and findings: the majority of studies (54/72, 75%) were in clinical research, with a strong focus on mental health (19/72, 26.4%; eg, identifying symptom networks in schizophrenia and posttraumatic stress disorder) and chronic diseases (19/72, 26.4%; eg, elucidating progression pathways in Alzheimer disease and diabetes). Etiological research was the most common primary objective (28/72, 38.9%). Public health applications (18/72, 25%) frequently assessed the causal impacts of behavioral interventions. The third theme was implementation challenges and innovations: common challenges included pervasive unmeasured confounding, limited sample sizes (noted in more than 20% of studies), and reliance on unvalidated causal assumptions. Emerging innovations focused on longitudinal data frameworks and the integration of multimodal data sources to strengthen causal claims.

CONCLUSIONS: This review underscores the growing application of causal discovery algorithms in medical research while also highlighting challenges such as the lack of standardized validation frameworks and persistent confounding. Future efforts must focus on developing evaluation standards and fostering interdisciplinary collaboration to translate these computational techniques into reliable tools for medical research and practice.
