Abstract
Leishmania spp. regulate gene expression post-transcriptionally, yet the untranslated regions (UTRs) that can affect mRNA stability and translation remain poorly delineated. We generated a de novo genome assembly for Leishmania donovani strain 1S2D (Ld1S) using PacBio HiFi sequencing and characterized the transcriptomes of promastigotes and axenic amastigotes with Oxford Nanopore direct RNA sequencing. The genome assembly consists of 65 scaffolds totaling ~33.3 Mb. Structural comparisons to LdBPK282A1 revealed numerous rearrangements, including genes reshuffled among polycistronic transcription units, which were validated by polycistronic RNA sequencing reads. Promastigote and amastigote RNA sequencing produced 469,010 and 46,729 monocistronic reads, respectively, containing both a spliced-leader sequence and a poly(A) tail. These reads defined 8,479 transcripts and supported 7,415 of the 7,969 annotated protein-coding genes, as well as 604 putative long non-coding RNAs. We annotated UTRs for 4,921 genes and observed that putative RNA G-quadruplexes were markedly enriched in these regions. We also noted that 31.9% and 11.5% of genes were expressed as multiple isoforms in promastigotes and amastigotes, respectively. Collectively, these data provide a genome-wide annotation of L. donovani genes and their UTRs, reveal widespread and stage-specific UTR length polymorphisms, and, overall, point to an important role of 3' UTRs in post-transcriptional regulation in L. donovani.