Abstract
BACKGROUND: Microbiome-wide association studies have shown links between colorectal cancer (CRC) and the gut microbiota. However, the clinical application of gut microbiota in CRC prevention has been hindered by the diversity of study populations and by technical variation. We aimed to determine CRC-related gut microbial signatures based on cross-regional, cross-population, and cross-cohort metagenomic datasets, and to elucidate their value in CRC risk assessment. METHODS: We used the MMUPHin tool to perform a meta-analysis of our own cohort and seven publicly available metagenomic datasets, comprising 570 CRC cases and 557 controls, to identify gut microbial species associated with CRC across cohorts. Based on the sets of differential species, we constructed microbial risk scores (MRS) using the α-diversity of the sub-community (MRS(α)), weighted and unweighted summation methods, and machine learning algorithms. Cohort-to-cohort training and validation were performed to demonstrate transferability. RESULTS: MRS(α) of the core species was better validated and more interpretable than the scores constructed with summation methods or machine learning algorithms. Six species, Parvimonas micra, Clostridium symbiosum, Peptostreptococcus stomatis, Bacteroides fragilis, Gemella morbillorum, and Fusobacterium nucleatum, were included in the MRS(α) constructed from half or more of the cohorts. The AUC of MRS(α), calculated on the sub-community of these six species, ranged from 0.619 to 0.824 across the eight cohorts. CONCLUSION: We identified six CRC-related species across regions, populations, and cohorts. The constructed MRS(α) could contribute to CRC risk prediction in different populations.
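
As an illustration of how a score of the MRS(α) type could be computed and evaluated, the following is a minimal sketch and not the authors' implementation. It assumes that the α-diversity measure is the Shannon index and that abundances are renormalised within the six-species sub-community; the abstract specifies neither. The function names `shannon_mrs` and `auc` are hypothetical, and the AUC is computed as the fraction of case-control pairs in which the case has the higher score.

```python
import math

# The six species named in the abstract, used here as the sub-community.
CORE_SPECIES = [
    "Parvimonas micra",
    "Clostridium symbiosum",
    "Peptostreptococcus stomatis",
    "Bacteroides fragilis",
    "Gemella morbillorum",
    "Fusobacterium nucleatum",
]


def shannon_mrs(abundances, species=CORE_SPECIES):
    """Shannon index of the sub-community for one sample.

    Assumption: Shannon is the alpha-diversity measure, and abundances
    are renormalised within the sub-community. `abundances` maps a
    species name to its abundance; missing species count as 0.
    A sample containing none of the species scores 0.0.
    """
    values = [abundances.get(s, 0.0) for s in species]
    total = sum(values)
    if total <= 0:
        return 0.0
    props = [v / total for v in values if v > 0]
    return sum(-p * math.log(p) for p in props)


def auc(scores, labels):
    """AUC as the share of case-control pairs where the case scores higher.

    Labels are 1 for cases and 0 for controls; tied pairs count 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))
```

In use, one would compute `shannon_mrs` for each sample in a validation cohort and pass the resulting scores, together with the case/control labels, to `auc`. Because the score uses no cohort-specific fitted weights, it can be applied to a new cohort unchanged, which may be one reason such a score transfers more readily than a trained model.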