Abstract
Trypanosoma cruzi, the causative agent of Chagas disease (CD), exhibits remarkable genetic diversity and is classified into six discrete typing units (DTUs; TcI-TcVI) plus one additional DTU, TcBat, primarily associated with bats. These DTUs are distributed differentially across CD-endemic regions, posing significant challenges for molecular and serological diagnosis, as test performance often varies geographically. Identifying conserved genomic regions shared among parasites circulating in distinct endemic areas is therefore essential. However, complete or near-complete genome assemblies are available for only a limited number of strains and do not adequately capture intra- and inter-DTU variability, particularly within repetitive multigene families. A wealth of raw T. cruzi genomic reads is publicly available, offering an opportunity to investigate highly repetitive, high-copy-number sequences that are difficult to assemble but potentially valuable for improving diagnostic sensitivity. In this study, we applied a read-based bioinformatics pipeline to analyze data from the six DTUs TcI-TcVI, generating 80-mer fragments and clustering them to identify conserved sequences. Consensus sequences from conserved clusters were used to design synthetic peptides, which were evaluated serologically with samples from chronically infected individuals from Brazil, Bolivia, and Peru. Four peptides from the conserved C-terminal region of mucin family proteins demonstrated robust diagnostic performance (AUC: 0.8783-0.9353), with particularly high values obtained with sera from Brazilian and Bolivian patients. Overall, our results demonstrate that k-mer-based, assembly-free approaches can successfully identify conserved antigens across genetically diverse T. cruzi populations, underscoring their value as discovery tools for potential serological markers.
While the peptides identified here represent promising candidates, validation in larger and more geographically diverse cohorts will be essential to establish their broader diagnostic applicability. Importantly, similar genome-informed strategies may also be leveraged to guide the discovery of diagnostic targets for other infectious diseases.
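The core idea of the read-based step, fragmenting reads into 80-mers and retaining those found in every DTU, can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function names `kmers` and `conserved_kmers` and the input dictionary are hypothetical, exact-match set membership stands in for the clustering step, and reverse complements and sequencing errors are ignored.

```python
from collections import defaultdict

def kmers(seq, k=80):
    """Yield every k-mer of a read, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield kmer

def conserved_kmers(reads_by_dtu, k=80):
    """Return the k-mers observed in at least one read of every DTU.

    reads_by_dtu: hypothetical mapping of DTU label -> list of read strings.
    Exact matching is a simplification; it replaces the clustering
    of similar fragments with strict identity.
    """
    seen_in = defaultdict(set)  # k-mer -> set of DTUs in which it occurs
    for dtu, reads in reads_by_dtu.items():
        for read in reads:
            for kmer in kmers(read, k):
                seen_in[kmer].add(dtu)
    n_dtus = len(reads_by_dtu)
    return {kmer for kmer, dtus in seen_in.items() if len(dtus) == n_dtus}

# Toy example with k=5 so the reads stay short and readable.
toy = {"TcI": ["ACGTACGTAA"], "TcII": ["TTACGTACGG"]}
shared = conserved_kmers(toy, k=5)
```

On real data, an in-memory dictionary of this kind would not scale to full sequencing runs; a dedicated k-mer counter and a similarity-based clustering tool would take its place.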