Abstract
Through the functional genes they carry, microbial communities in the Earth's biosphere mediate biogeochemical cycling processes that are essential for maintaining ecosystem multifunctionality and stability. Although progress is being made, obtaining high-quality sequence data sets for microbial functional genes in complex environments remains challenging, both technically and in terms of the computational resources required. In this study, using the amo gene family encoding ammonia monooxygenase as an example, we aimed to recover important microbial functional genes from shotgun metagenomes via targeted assembly. Compared with conventional assembly approaches such as single-sample and multi-sample assembly, targeted assembly recovered much higher amo gene diversity while requiring substantially fewer computational resources and less running time. In addition, the amo genes recovered by targeted assembly contained fewer chimeras, and more amo operons were recovered. Not only were the commonly known amoABC subunits observed, but also less commonly found subunits such as amoX and amoE. Notably, the archaeal amoA subunits recovered by targeted assembly represented the largest number of "super-clades" for ammonia monooxygenase, including NT-α, NT-γ, NP-γ, NP-ζ, and NP-η, demonstrating the advantage of targeted assembly over conventional approaches. Comparable spatial patterns, such as taxa-area and distance-decay relationships, were also observed for the recovered amo assemblages. This study demonstrates an efficient route to recover microbial functional genes from shotgun metagenomes with minimal computational resources and running time.
IMPORTANCE
Microbial communities play critical roles in the Earth's biosphere by mediating biogeochemical cycles of essential elements and maintaining ecosystem stability and multifunctionality through the functional genes they carry. However, recovering key functional genes from such complex communities remains challenging, and each available technology has its own advantages and limitations. In this study, using the amo gene family as an example, we show that targeted assembly enables accurate and rapid recovery of high-quality amo sequences from shotgun metagenomes while consuming minimal computational resources and running time. Compared with conventional full-assembly approaches, the amo sequences recovered by targeted assembly include more operons, show higher (phylo)genetic diversity, and contain fewer chimeras. This study provides an efficient alternative route for recovering microbial functional genes, particularly when computational resources are limited.