Abstract
Central Africa is the largest region of human T-cell leukaemia virus type 1 (HTLV-1) endemicity, with several million people estimated to be infected. Based on studies of the LTR region, it is also the region with the highest HTLV-1 diversity, with genotypes a-b and d-g present. However, complete genomic sequences are still lacking for Central African genotypes. Here, we report the first large collection of complete HTLV-1 sequences for genotypes b, d and f from Central Africa and neighbouring countries. We identified substantial diversity within the HTLV-1b genotype, including a newly defined clade that we designated HTLV-1b-del. This clade mainly comprises strains from the Democratic Republic of the Congo (COD) and neighbouring countries and is characterized by a distinctive 12-bp deletion. We also generated the complete sequence of the STLV-1 strain from Allenopithecus nigroviridis from the COD. This strain belongs to the PTLV-1b genotype and carries a 12-bp duplication in the pX region. Lastly, we found that, except for HTLV-1a strains, HTLV-1 genomes generally lack an open reading frame (ORF) encoding the canonical accessory protein p12; instead, they encode either shorter versions of the protein or an ORF lacking an ATG start codon. This work substantially expands the genomic landscape of HTLV-1 in Central Africa and provides a critical resource for understanding viral diversity.