geneEX: An Integrated Phenotype-Driven Algorithm for Rapid Identification of Causative Variants in Monogenic Disorders

Abstract

BACKGROUND: Identifying the pathogenic variant is a crucial step in diagnosing monogenic disorders. The widespread adoption of next-generation sequencing (NGS) has substantially improved diagnostic efficiency. However, as clinical demands on diagnostic accuracy grow, quickly and accurately pinpointing the pathogenic variant among many candidates remains a significant challenge, and the complexity of data analysis and interpretation continues to limit both the efficiency and the accuracy of diagnosis.

METHODS: We developed geneEX, a phenotype-driven algorithm. It uses a large language model to extract phenotypes from clinical information and a semantic vector representation model to map them automatically to Human Phenotype Ontology (HPO) terms, from which HPO-associated genes are identified. It also supports semantic matching between patients' free-text phenotypic descriptions and disease phenotypes, which further improves the identification of pathogenic genes. The algorithm ranks candidate causative variants, enabling rapid and precise identification of potential pathogenic variants in rare genetic disorders.

RESULTS: geneEX performed well in ranking pathogenic variants on both virtual and clinical datasets. Supplementary matching of free-text phenotypes significantly improved the precision of candidate variant prioritization.

CONCLUSION: With its independently developed phenotype extraction and standardization methods, geneEX automates HPO acquisition and, with it, the full process from clinical sample to pathogenic variant identification. Integrating free-text phenotypic descriptions with disease phenotype matching further improves the accuracy of pathogenic gene identification. This approach significantly improves the precision and efficiency of identifying pathogenic variants in rare genetic disorders and provides robust support for the diagnosis of monogenic diseases.
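
The phenotype-to-gene step described in METHODS can be illustrated with a minimal sketch. This is not the geneEX implementation: the abstract does not specify the embedding model, the scoring function, or any API, so everything below is an assumption made for illustration. A bag-of-words vector stands in for the semantic vector representation model, the gene names (`GENE_A`, `GENE_B`, `GENE_C`) and their HPO associations are hypothetical, and the function names `match_hpo` and `rank_genes` and the 0.5 similarity threshold are invented for this example.

```python
import math
from collections import Counter

# Small illustrative subset of HPO terms (ID -> label).
HPO_TERMS = {
    "HP:0001250": "seizure",
    "HP:0001263": "global developmental delay",
    "HP:0000252": "microcephaly",
}

# Hypothetical gene -> HPO associations (placeholders, not real annotations).
GENE_TO_HPO = {
    "GENE_A": {"HP:0001250", "HP:0001263", "HP:0000252"},
    "GENE_B": {"HP:0001250"},
    "GENE_C": set(),
}


def embed(text):
    """Toy stand-in for a semantic embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_hpo(phrases, threshold=0.5):
    """Map each extracted phenotype phrase to its most similar HPO term,
    keeping the match only if the similarity reaches the threshold."""
    matched = set()
    for phrase in phrases:
        vec = embed(phrase)
        best_id, best_sim = max(
            ((hpo_id, cosine(vec, embed(label))) for hpo_id, label in HPO_TERMS.items()),
            key=lambda pair: pair[1],
        )
        if best_sim >= threshold:
            matched.add(best_id)
    return matched


def rank_genes(candidate_genes, patient_hpo):
    """Rank candidate genes by the number of patient HPO terms they share
    (ties broken alphabetically)."""
    scores = {g: len(GENE_TO_HPO.get(g, set()) & patient_hpo) for g in candidate_genes}
    return sorted(candidate_genes, key=lambda g: (-scores[g], g))


# Phrases as they might come out of the phenotype extraction step.
patient_hpo = match_hpo(["recurrent seizure", "developmental delay", "microcephaly"])
ranking = rank_genes(["GENE_C", "GENE_B", "GENE_A"], patient_hpo)
print(ranking)  # ['GENE_A', 'GENE_B', 'GENE_C']
```

In a real pipeline the overlap count would presumably be replaced by a similarity-weighted score combined with variant-level evidence, but the abstract gives no detail on how geneEX does this, so the sketch stops at the gene ranking.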
