Abstract
BACKGROUND: Understanding the causal relationships between gene expression levels in breast mammary tissue and breast cancer susceptibility is crucial for identifying therapeutic targets and developing prevention strategies. However, traditional observational studies are limited by confounding and reverse causation.

METHODS: We combined three complementary approaches, Mendelian randomization (MR), summary-based Mendelian randomization (SMR), and a transcriptome-wide association study (TWAS), to investigate causal relationships between breast mammary tissue gene expression and breast cancer risk. We used large-scale genome-wide association study (GWAS) summary statistics and expression quantitative trait loci (eQTL) data to identify genes with significant causal associations.

RESULTS: MR identified three genes with significant protective effects: APOBEC3B (OR = 0.992, 95% CI: 0.988-0.995), SLC22A5 (OR = 0.983, 95% CI: 0.976-0.991), and CRLF3 (OR = 0.984, 95% CI: 0.976-0.991). TWAS identified SLC4A7 and NEGR1 as the most significant risk-associated genes, while ZBTB38, RGPD1, and CCDC91 showed protective effects. SMR confirmed the robustness of these associations and identified additional genes across the genome, some with protective and some with risk-increasing effects.

CONCLUSIONS: This integrative genomic analysis provides robust evidence for causal relationships between the expression of specific genes in breast mammary tissue and breast cancer risk.
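As a reading aid for the MR estimates reported above, the following minimal sketch recovers the log odds ratio, an approximate standard error, and a z-statistic from each reported OR and 95% CI. It assumes the intervals are Wald-type (symmetric on the log scale) and uses a normal approximation. The helper name `summarize_or` is hypothetical and not part of the study's pipeline. Because the published values are rounded to three decimals, the recovered quantities are approximate.

```python
import math

def summarize_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate log-OR, SE, z, and two-sided p from an OR and its 95% CI.

    Assumes a Wald-type interval: log(OR) +/- z_crit * SE.
    """
    beta = math.log(odds_ratio)                      # effect on the log-odds scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))             # two-sided normal p-value
    return beta, se, z, p

# ORs and 95% CIs exactly as reported in the RESULTS section.
reported = {
    "APOBEC3B": (0.992, 0.988, 0.995),
    "SLC22A5": (0.983, 0.976, 0.991),
    "CRLF3": (0.984, 0.976, 0.991),
}

for gene, (or_value, low, high) in reported.items():
    beta, se, z, p = summarize_or(or_value, low, high)
    print(f"{gene}: log(OR)={beta:.4f}, SE~{se:.4f}, z~{z:.2f}, p~{p:.1e}")
```

All three intervals exclude 1, so each log odds ratio is negative, consistent with the protective direction stated in the abstract. The effect sizes are small in absolute terms, so the strength of evidence comes from the narrow intervals rather than from large per-unit effects.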