Abstract
Nanopore direct RNA sequencing (DRS) offers distinct advantages for transcriptome analysis over traditional high-throughput RNA sequencing methods: it preserves native RNA modifications, eliminates polymerase chain reaction bias, and simplifies the workflow. However, its high basecalling error rate remains a significant hurdle. Here we introduce Coral, a dual context-aware nanopore DRS basecaller that uses a Transformer-based encoder-decoder architecture to capture contextual dependencies at both the signal and sequence levels, substantially improving accuracy. Coral achieves up to a 6.17% improvement in accuracy on human RNA samples compared with Oxford Nanopore Technologies' Dorado basecaller. This improved accuracy enables the detection of 26% more annotated transcript isoforms. Coral also improves downstream haplotype phasing, reducing switch errors by up to 78.8% and Hamming errors by 76%, while phasing 36% more single nucleotide polymorphisms.
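
The two phasing metrics named in the abstract, switch errors and Hamming errors, measure different things, and a small example may make the distinction concrete. The sketch below uses common textbook definitions and is not taken from the paper's evaluation pipeline, which may follow a specific tool's conventions. It assumes a single phase block of heterozygous sites, with each site encoded as 0 or 1 according to which allele sits on haplotype 1; the function names and the toy data are hypothetical.

```python
from typing import Sequence


def _flip_mask(truth: Sequence[int], pred: Sequence[int]) -> list:
    """Per site: True if the predicted allele on haplotype 1 differs from the truth."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must cover the same sites")
    return [t != p for t, p in zip(truth, pred)]


def switch_errors(truth: Sequence[int], pred: Sequence[int]) -> int:
    """Count adjacent site pairs whose relative phase disagrees with the truth.

    Assumed definition: a switch occurs wherever the flip mask changes
    between two consecutive sites.
    """
    mask = _flip_mask(truth, pred)
    return sum(a != b for a, b in zip(mask, mask[1:]))


def hamming_errors(truth: Sequence[int], pred: Sequence[int]) -> int:
    """Count wrongly assigned sites, allowing for a global haplotype swap.

    Assumed definition: haplotype labels are arbitrary, so the better of
    the two possible labelings is taken.
    """
    mask = _flip_mask(truth, pred)
    mismatches = sum(mask)
    return min(mismatches, len(mask) - mismatches)


# Toy phase block of eight heterozygous sites (made-up data).
truth = [0, 1, 0, 0, 1, 1, 0, 1]
long_switch = [0, 1, 0, 1, 0, 0, 1, 0]  # phase inverted from the 4th site onward
point_flip = [0, 1, 1, 0, 1, 1, 0, 1]   # only the 3rd site is wrong

print(switch_errors(truth, long_switch), hamming_errors(truth, long_switch))  # 1 3
print(switch_errors(truth, point_flip), hamming_errors(truth, point_flip))    # 2 1
```

Under these assumed definitions, one long-range switch yields a single switch error but several Hamming errors, while an isolated wrong site yields two switch errors and one Hamming error, which is why the two metrics are usually reported together.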