Abstract
Diverse genetic structures can lead to heterogeneity among GWAS summary datasets from distinct populations, which makes it more difficult to infer the causal effects of exposures on an outcome when multiple GWAS summary datasets are integrated. Here, we propose a Mendelian randomization method called MR-EILLS, which leverages environment invariant linear least squares to establish whether a causal relationship is invariant across all heterogeneous populations. The MR-EILLS model works in both univariate and multivariate scenarios and allows for invalid instrumental variables that violate the exchangeability and exclusion restriction assumptions. In addition, MR-EILLS yields unbiased causal effect estimates for one or multiple exposures on the outcome, whether or not invalid instrumental variables are present. Compared to traditional Mendelian randomization and meta-analysis methods, MR-EILLS yields the highest estimation accuracy, the most stable type I error rates, and the highest statistical power. Finally, we apply MR-EILLS to explore the independent causal relationships between 11 blood cell traits and 20 disease-related outcomes, using GWAS summary statistics from five ancestries (African, East Asian, South Asian, Hispanic/Latino and European). The results cover most of the expected causal links that have biological interpretations, as well as additional links supported by previous observational studies.