Abstract
PURPOSE: Publicly available genomic databases are critical to understanding human genetic variation. They also provide unique insights into patterns of genetic constraint and their relationship with human disease.

METHODS: We used one of the largest publicly available databases, the Genome Aggregation Database (gnomAD), to identify genes that are highly constrained for loss-of-function variants only, for missense variants only, and for both loss-of-function and missense variants. We identified the unique signatures of these gene groups and explored their causal relationship with human disease. The genes were also evaluated for chromosomal location, tissue-level expression, Gene Ontology analysis, and gene family categorization using multiple publicly available databases.

RESULTS: Among constrained genes associated with human disease, we identified unique patterns of inheritance, protein size, and enrichment in distinct molecular pathways. In addition, we identified constrained genes that are not currently known to cause human disease and that may be excellent candidates for gene discovery.

CONCLUSION: We elucidate biological pathways of highly constrained genes, expanding our understanding of critical cellular proteins. These findings can also advance research in rare diseases.