Abstract
The genomes of anaerobic gut fungi (AGF) encode a diverse array of carbohydrate-active enzymes (CAZymes), yet exceedingly few of these enzymes have been experimentally validated or expressed in heterologous systems. Here, we developed a predictive bioinformatic pipeline to annotate novel putative CAZymes from anaerobic fungi and validated their activity through large-scale heterologous expression in Escherichia coli. A total of 173 fungal proteins from Piromyces finnis associated with biomass degradation were synthesized and expressed in E. coli. High-throughput proteomic screening showed that 9.8% of them (17 proteins) were soluble, with expression levels exceeding 5% of the total proteome. Among these 17 heterologously expressed proteins, analysis with AlphaFold and FoldSeek predicted 13 multi-functional proteins containing catalytic domains fused with repetitive fungal dockerins, and half of the substrate predictions were experimentally validated. One promising enzyme, celsome_012, exhibited robust and specific activity against beechwood xylan at 37°C and pH 6.4, with titers fivefold higher than those of the other recombinant proteins screened here. Both Michaelis-Menten kinetics and the linearized Lineweaver-Burk equation yielded consistent values for Km, and the enzyme's activation energy was estimated at 51.9 kJ/mol based on the Arrhenius model. This work supports the industrial translation of anaerobic fungal CAZymes, given their robust lignocellulolytic activity, and provides a framework for prioritizing AGF proteins for efficient heterologous expression in E. coli.
IMPORTANCE
Efficient breakdown of plant biomass is crucial for producing high-value bio-based products, but identifying enzymes that reduce deconstruction costs remains a challenge. In this study, we used high-throughput proteomic screening of novel CAZymes encoded in the AGF genome to identify promising fungal enzymes suitable for large-scale production in E. coli.
Additionally, we leveraged cutting-edge computational tools to predict enzyme structure and function, accelerating the screening process beyond traditional methods. Experimental validation confirmed the accuracy of these predictions and revealed a highly active novel xylanase, expanding the available enzyme toolbox for biomass conversion. Overall, this study represents a comprehensive large-scale screening campaign of putative AGF CAZymes, highlighting proteins amenable to E. coli overexpression, integrating advanced sequence and structural annotation, and identifying a robust, novel fungal xylanase for detailed biochemical characterization.
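As an illustration of the kinetic analysis summarized in the abstract, the following is a minimal sketch of how Km can be estimated both by a direct Michaelis-Menten fit and from the Lineweaver-Burk linearization, and how an activation energy can be read from an Arrhenius plot. It is not the authors' code. The function names, the substrate concentrations, Vmax = 10, Km = 3, the pre-exponential factor, and the temperatures are all hypothetical; the data are synthetic and noise-free. The assumed activation energy is set to the 51.9 kJ/mol quoted in the abstract only so that the fit can be checked against a known input.

```python
import numpy as np

R = 8.314  # universal gas constant, J/(mol*K)


def michaelis_menten(s, vmax, km):
    """Initial rate v at substrate concentration s."""
    return vmax * s / (km + s)


def fit_lineweaver_burk(s, v):
    """Fit 1/v = (Km/Vmax) * (1/s) + 1/Vmax by linear regression."""
    slope, intercept = np.polyfit(1.0 / s, 1.0 / v, 1)
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


def fit_direct(s, v, km_grid):
    """Least-squares Michaelis-Menten fit by scanning Km.

    For a fixed Km the model is linear in Vmax, so the best Vmax
    has a closed form; the Km with the smallest residual wins.
    """
    best = None
    for km in km_grid:
        x = s / (km + s)
        vmax = np.dot(x, v) / np.dot(x, x)
        sse = np.sum((v - vmax * x) ** 2)
        if best is None or sse < best[0]:
            best = (sse, vmax, km)
    return best[1], best[2]


def arrhenius_ea(temps_k, rates):
    """Fit ln k = ln A - Ea/(R*T); return Ea in J/mol."""
    slope, _ = np.polyfit(1.0 / temps_k, np.log(rates), 1)
    return -slope * R


# Synthetic, hypothetical data (not measurements from the study).
s = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
v = michaelis_menten(s, vmax=10.0, km=3.0)

vmax_lb, km_lb = fit_lineweaver_burk(s, v)
vmax_d, km_d = fit_direct(s, v, np.linspace(0.1, 10.0, 991))

temps = np.array([298.15, 303.15, 310.15, 318.15])  # kelvin
rates = 1e9 * np.exp(-51900.0 / (R * temps))         # assumed Ea = 51.9 kJ/mol
ea = arrhenius_ea(temps, rates)

print("Km (Lineweaver-Burk):", km_lb)
print("Km (direct fit):", km_d)
print("Ea (kJ/mol):", ea / 1000.0)
```

With real, noisy measurements the two Km estimates generally differ somewhat, because the double-reciprocal transform gives disproportionate weight to the low-concentration points. Agreement between the two fits, as the abstract reports, is therefore a useful consistency check.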