Abstract
Analysis of transcript function is greatly aided by knowledge of the full-length RNA sequence. New long-read sequencing enabled by Oxford Nanopore and PacBio devices have the potential to provide full-length transcript information; however, standard methods still lack the ability to capture true RNA 5' ends and select for polyadenylated (pA+) transcripts only. Here, we present a method that, by utilizing cap trapping and 3'-end adapter ligation, sequences transcripts between their exact 5' and 3' ends regardless of polyadenylation status and without the need for ribosomal RNA depletion, with the ability to characterize polyadenylation length of RNAs, if any. The method shows high reproducibility, can faithfully detect 5' ends, 3' ends and splice junctions, and produces gene-expression estimates that are highly correlated to those of short-read sequencing techniques. We also demonstrate that the method can detect and sequence full-length nonadenylated (pA-) RNAs, including long noncoding RNAs, promoter upstream transcripts, and enhancer RNAs, and present cases where pA+ and pA- RNAs show preferences for different but closely located transcription start sites. Our method is therefore useful for the characterization of diverse capped RNA species and analysis of relationships between transcription initiation, termination, and RNA processing.
