Abstract
BACKGROUND: The complex relationship between the gut microbiome and immune system development during infancy is considered a key factor in the rising rates of pediatric allergic diseases. Food protein-induced allergic proctocolitis (AP), the earliest identified form of non-IgE-mediated food allergy in infants, occurs at the mucosal surface where dietary proteins, intestinal microbes, and immune cells interact directly. AP also increases the risk of life-threatening IgE-mediated food allergy, making it an important model for understanding the early development of food allergic disease. How specific microbial compositions and functional pathways contribute to AP development and progression remains poorly understood. METHODS: We performed metagenomic sequencing on 740 longitudinal stool samples from 163 infants (84 with AP, 79 without AP) enrolled in the prospective GMAP cohort. Taxonomic profiling, functional pathway analysis, strain-level characterization, and machine learning-based classification were applied to identify microbial differences across disease stages. RESULTS: Infants with AP exhibited distinct microbial compositions, characterized by enrichment of Escherichia coli and Bifidobacterium bifidum during early life, including pre-symptomatic stages, whereas Bifidobacterium breve and Klebsiella species were more abundant in infants without AP. These findings suggest that microbial signatures may be detectable before clinical symptoms emerge. They also indicate that strain-level differences within E. coli populations may represent previously unrecognized AP-associated lineages with distinct gene content profiles. For example, biofilm formation and cell adhesion genes in E. coli were particularly enriched in AP-associated clades.
Short-chain fatty acid (SCFA) and other functional pathways were also associated with AP: SCFA production was reduced during the symptomatic phase and then increased, potentially as compensation, following AP resolution. CONCLUSIONS: Our results provide the first comprehensive strain-level characterization of the gut microbiome in AP and its functional implications. They generate testable hypotheses about candidate microbial features associated with AP that could serve in future biomarker discovery or as intervention targets. This work advances our understanding of how specific microbial taxa and functional pathways may contribute to non-IgE-mediated food allergies, and it opens new avenues for microbiome-targeted therapeutic approaches as well as novel prevention targets for IgE-mediated food allergies. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13073-026-01646-6.