Abstract
All life depends on accurate and efficient protein synthesis. The aminoacyl-tRNA synthetases (aaRSs) are a family of proteins that play an essential role in protein translation: they catalyze the esterification reaction that charges a transfer RNA (tRNA) with its cognate amino acid. In eukaryotes, however, new domains added to the aaRSs over the course of evolution confer novel functions unrelated to protein translation. To date, damaging variants in aaRS-encoding genes have been linked to over 50 human diseases. In this study, we leverage the evolutionary history of the aaRS proteins to better understand the distribution of disease-causing missense variants in human cytosolic aaRSs. We hypothesized that disease-causing missense variants in human aaRSs are more likely to be located in the ancient domains, which are essential for the aminoacylation reaction, than in the evolutionarily more recent domains found in eukaryotes. We determined the locations of the modern and ancient domains in each human aaRS protein. We then statistically assessed positional conservation across each domain and examined the distribution of pathogenic and benign/unknown human missense variants across these domains. We establish that pathogenic missense variants in the human aaRS proteins are enriched in the evolutionarily ancient domains, whereas benign/unknown missense variants are enriched in the modern domains. In addition to defining the evolutionary history of human aaRS proteins through domain identification, this work will, we anticipate, improve the ability to diagnose patients affected by damaging genetic variants in the aaRS protein family.