Abstract
Metatranscriptomics has transformed our view of RNA bacteriophage diversity, revealing vast numbers of single-stranded RNA (ssRNA) phages whose protein capsids can be engineered for biotechnology applications. However, many ssRNA phages remain hidden from current detection methods, which require protein-level similarity to known phages. Here we show that RNA structure provides an additional signal for the detection of ssRNA phages in metatranscriptomes, including hidden phages missed by prior protein-based methods. By computationally folding each contig and screening for exceptionally stable RNA secondary structures, we find evidence of thousands of previously unrecognized phages encoding novel coat proteins. We express a library of 12,000 such coat proteins in E. coli and find that most assemble into nuclease-resistant capsids. We determine the 3D structure of one such capsid by cryo-electron microscopy and demonstrate that it can be disassembled and reassembled in vitro to package heterologous RNA, a key step toward repurposing these particles as RNA delivery vehicles. We compile the newly discovered ssRNA phages with previously known ones into a database that contains sequence and structural information for over 460,000 unique RNA molecules and over 100,000 distinct coat proteins, providing a comprehensive resource for microbiology and nanomaterials research.