Abstract
Short tandem repeats (STRs) are associated with 70 genetic diseases. Because exome sequencing (ES) produces short reads, STR analysis is not routinely performed in clinical ES. To date, few studies have used large-scale clinical ES data to systematically evaluate the diagnostic yield of pathogenic STR expansions. This study retrospectively analyzed 9580 exomes referred to our genetic laboratory between July 2019 and June 2024. The samples were divided into two groups: a genetically undiagnosed cohort (n = 4692) and a reference cohort with a low probability of carrying pathogenic STR expansions (n = 4888). An analysis pipeline combining multiple algorithms was developed to analyze STRs at 30 known disease-related loci, achieving a precision of 54.9% and a sensitivity of 100%. Verification by capillary electrophoresis confirmed pathogenic STR expansions at known disease-related loci in 28 cases (0.6%). Fourteen of these cases (0.3%) could be explained by the STR findings, including seven neonates with DMPK expansions. The pipeline also showed the potential to identify abnormal STR expansions at novel sites. In conclusion, this study demonstrates the clinical utility of ES-based STR analysis and supports its incorporation into the clinical ES workflow in genetic laboratories.
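As a reading aid, the short Python sketch below spells out how the reported metrics are conventionally defined and reproduces the cohort arithmetic. It is not the study's pipeline. The function names are hypothetical, and the use of the undiagnosed cohort (n = 4692) as the denominator for the 0.6% and 0.3% figures is an assumption inferred from the numbers in the abstract.

```python
# Illustrative only: standard metric definitions, not the authors' code.

def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of pipeline-flagged expansions that were confirmed."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of true expansions that the pipeline flagged."""
    return true_pos / (true_pos + false_neg)


def yield_percent(cases: int, cohort_size: int) -> float:
    """Diagnostic yield as a percentage of the cohort."""
    return 100.0 * cases / cohort_size


# Cohort sizes as stated in the abstract.
UNDIAGNOSED = 4692
REFERENCE = 4888
TOTAL = UNDIAGNOSED + REFERENCE

# Assumption: percentages are relative to the undiagnosed cohort.
confirmed_yield = yield_percent(28, UNDIAGNOSED)
explained_yield = yield_percent(14, UNDIAGNOSED)

print(f"total exomes: {TOTAL}")
print(f"confirmed expansions: {confirmed_yield:.1f}%")
print(f"cases explained: {explained_yield:.1f}%")
```

A sensitivity of 100% corresponds to zero false negatives among the known positives tested, whereas a precision of 54.9% means that roughly half of the flagged calls were not confirmed, which is why orthogonal verification remains necessary.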