Abstract
Identifying causal relationships in omics data is essential for understanding underlying biological processes. However, detecting these relationships remains challenging because of the complexity of molecular networks and the limitations of observational data. To guide researchers, we conducted a systematic literature review of data-driven causal omics analysis methods that use structured prior knowledge from regulatory and interaction databases. We grouped methods into three approaches based on the extent of prior knowledge integration: regulon-level (direct regulator-target links; straightforward to interpret, but at risk of oversimplification), flow-level (multi-step propagation from regulators to targets; explains broader mechanisms, but lacks uncertainty modeling), and network-level (system-wide interactions and crosstalk; most comprehensive, but computationally more complex and requiring particularly careful interpretation). These methods have demonstrated utility across diverse applications, including identification of therapeutic targets in acute myeloid leukemia, elucidation of mechanisms in IgA nephropathy, and detection of regulatory perturbations in Alzheimer's disease. We discuss the strengths, limitations, and representative use cases of each approach, address general limitations, and outline future research directions. This review serves as a practical guide for the entire analysis process, from selecting prior knowledge databases (PKDBs) to choosing and applying causal analysis methods for different research questions.