Abstract
Copyback viral genomes (cbVGs) are truncated viral genomes with complementary ends, produced when the polymerase of a negative-sense RNA virus detaches from the replication template and resumes elongation on the nascent strand. Despite advances in methods that identify cbVGs from the sites of polymerase break and rejoin, PCR-based tools cannot provide full-length sequences of most cbVGs and can introduce errors and artifacts during cbVG amplification. As a result, our picture of the diverse population of cbVGs generated during infection remains incomplete. To improve our ability to obtain native full-length cbVG sequences, we optimized Direct RNA Sequencing (DRS) as a fast and simple tool for sequencing full-length cbVGs and designed a BLAST-based analysis approach to identify cbVGs in long-read sequencing data. We analyzed the DRS outputs of multiple Sendai virus (SeV) stocks to highlight both the utility and the limitations of this tool. We found that capturing the dominant 546-nucleotide cbVG produced by SeV strain Cantell is optimal when the length of complementarity between the virus trailer and the DRS oligonucleotide is increased, up to 32 nucleotides. We also demonstrate comparable quality of cbVG sequences obtained by DRS from as little as 17.6 ng of RNA from the media fraction or 50 ng of RNA from the cellular fraction of SeV-infected cells, in contrast to the recommended 1000 ng. Importantly, we validated different cbVG species from two recombinant SeV stocks, including, for the first time, cbVGs whose break positions occurred at or near position one in the reference genome.
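To make the break-and-rejoin description concrete, the following is a minimal toy sketch, not the analysis approach used in this work. It assumes 1-based positions on a positive-sense reference and a polymerase that copies the template from its 3' end down to the break position, then copies the nascent strand from the rejoin position back to its 5' end. The function names, the coordinate convention, and the 20-nucleotide reference are all illustrative assumptions.

```python
# Toy model of copyback genome formation (illustrative assumptions only).
# Positions are 1-based on a positive-sense reference sequence.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def copyback(reference: str, break_pos: int, rejoin_pos: int) -> str:
    """Build the product of one break-and-rejoin event, written 5' to 3'."""
    n = len(reference)
    if not (1 <= break_pos <= n and 1 <= rejoin_pos <= n):
        raise ValueError("break_pos and rejoin_pos must lie within the reference")
    # First pass: the template is copied from position n down to break_pos.
    first_pass = reverse_complement(reference[break_pos - 1:])
    # Copyback: the nascent strand is copied from rejoin_pos back to its 5' end,
    # which regenerates the reference sequence from rejoin_pos to n.
    copied_back = reference[rejoin_pos - 1:]
    return first_pass + copied_back


def stem_length(reference: str, break_pos: int, rejoin_pos: int) -> int:
    """Length of the complementary ends under the same toy convention."""
    return len(reference) - max(break_pos, rejoin_pos) + 1


# Hypothetical 20-nt reference, used only to exercise the functions.
reference = "AUGGCUAACGUUCGAUCCGA"
cbvg = copyback(reference, break_pos=9, rejoin_pos=15)
stem = stem_length(reference, break_pos=9, rejoin_pos=15)
print(len(cbvg), stem)
print(cbvg[:stem], reverse_complement(cbvg[-stem:]))
```

Under this convention the product length is (n - break + 1) + (n - rejoin + 1), and the first and last `stem` nucleotides are reverse complements of each other, which is the "complementary ends" property described above.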