Abstract
The COVID-19 pandemic has substantially altered respiratory pathogen circulation, so pre-pandemic baseline data are needed to interpret current epidemiological trends. To establish such a baseline, we used metatranscriptomic sequencing to characterize the etiology of community-acquired pneumonia (CAP) in 20 adult patients hospitalized in Wuxi, China, during 2018-2019. Following ribosomal RNA depletion, sequencing data were analyzed with a stringent dual-filter strategy (reads per million [RPM] ≥ 100 and Z-score ≥ 2) to identify high-confidence pathogens. The analysis revealed a complex, polymicrobial landscape. Bacterial pathogens predominated, with Streptococcus species detected in 25% of cases. The frequent co-occurrence of oral anaerobes (e.g., Prevotella, Veillonella, Rothia) suggested that aspiration-driven polymicrobial infection was a key pathogenic mechanism. Viral pathogens were also prominent, with Orthorubulavirus hominis and Human respirovirus 1 showing significant transcriptional activity. Notably, this approach enabled the discovery and characterization of two divergent viral strains: a novel Rhinovirus B strain (AP81) with only 90.52% nucleotide identity to its closest relative, and a picobirnavirus phylogenetically distinct from human strains (94.90% identity to a simian-derived virus). Fungal detection was minimal, with only Candida albicans meeting the detection criteria, in a single case. In conclusion, this study provides a pre-pandemic baseline of CAP etiology in Wuxi. It shows that metatranscriptomics can not only define common etiological agents but also uncover novel viral diversity and reveal the polymicrobial complexity of respiratory infections, offering insights for future surveillance and clinical management.
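
The dual-filter criterion stated above (RPM ≥ 100 and Z-score ≥ 2) can be illustrated with a minimal sketch. The abstract does not specify how the Z-score was computed, so the code assumes, purely for illustration, that a taxon's RPM in a sample is compared against the RPMs of the same taxon in a set of background or control samples. The function names and the example values are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def rpm(taxon_reads: int, total_reads: int) -> float:
    """Reads per million: taxon reads normalized to one million total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return taxon_reads * 1_000_000 / total_reads


def z_score(value: float, background: list[float]) -> float:
    """Z-score of `value` against background RPMs (sample standard deviation).

    Assumption: the background is a set of control/other-sample RPMs for the
    same taxon; the abstract does not state the reference distribution used.
    """
    if len(background) < 2:
        raise ValueError("need at least two background values")
    mu = mean(background)
    sigma = stdev(background)
    if sigma == 0:
        # Degenerate background: treat any excess over the mean as extreme.
        return float("inf") if value > mu else 0.0
    return (value - mu) / sigma


def passes_dual_filter(
    taxon_reads: int,
    total_reads: int,
    background_rpms: list[float],
    rpm_min: float = 100.0,  # threshold stated in the abstract
    z_min: float = 2.0,      # threshold stated in the abstract
) -> bool:
    """Return True only if BOTH the RPM and the Z-score thresholds are met."""
    sample_rpm = rpm(taxon_reads, total_reads)
    return sample_rpm >= rpm_min and z_score(sample_rpm, background_rpms) >= z_min


# Hypothetical values for illustration only (not data from the study).
print(passes_dual_filter(2000, 10_000_000, [10.0, 20.0, 30.0]))
print(passes_dual_filter(2000, 10_000_000, [150.0, 200.0, 250.0]))
```

Requiring both conditions means that a taxon abundant in absolute terms but equally abundant in the background (the second call) is rejected, as is a taxon that stands out statistically but has very few reads.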