Assessing Artificial Intelligence (AI) Implementation for Assisting Gene Linking (at the National Library of Medicine)

Abstract

OBJECTIVES: The National Library of Medicine (NLM) currently indexes close to a million articles each year from more than 5300 medicine and life sciences journals. A significant number of these articles contain critical information about the structure, genetics, and function of genes and proteins in normal and disease states. NLM curators identify these articles and manually link each one to the corresponding gene records in the NCBI Gene database. The information is thereby interconnected with all NLM resources and services, which brings considerable value to the life sciences. The NLM aims to provide timely access to all metadata, which requires article indexing to scale with the volume of published literature. However, although automatic information extraction methods have been shown to achieve accurate results in biomedical text mining research, it remains difficult to evaluate them on established pipelines and integrate them into daily workflows.
MATERIALS AND METHODS: Here, we demonstrate how our machine learning model, GNorm2, which achieved state-of-the-art performance in identifying genes and their corresponding species while handling innate textual ambiguities, could be integrated into the established daily workflow at the NLM and evaluated for its performance in this new environment.
RESULTS: We worked with 8 biomedical curator experts and evaluated the integration on four parameters: (1) gene identification accuracy, (2) interannotator agreement with and without GNorm2, (3) GNorm2 potential bias, and (4) indexing consistency and efficiency. We identified key interface changes that significantly helped the curators to maximize the GNorm2 benefit and, based on the biocurator expert survey, further improved the GNorm2 algorithm to cover genes from 135 species, including viral and bacterial genes.
CONCLUSION: GNorm2 is currently in the process of being fully integrated into the curators' regular workflow.
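
The abstract does not say how evaluation parameters (1) and (2) were computed. As a rough illustration only, the Python sketch below assumes gene identification accuracy is scored as set-based precision, recall, and F1 against curator-created links, and that interannotator agreement is the mean pairwise Jaccard overlap between curators' gene sets. The function names and the placeholder identifiers (`g1`, `g2`, ...) are hypothetical and are not taken from GNorm2 or from the study.

```python
from itertools import combinations


def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for one article's gene IDs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def mean_pairwise_jaccard(annotations):
    """Average Jaccard overlap over every pair of annotators' gene-ID sets."""
    sets = [set(a) for a in annotations]
    scores = []
    for a, b in combinations(sets, 2):
        union = a | b
        # Two empty annotations are treated as full agreement (an assumption).
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores) if scores else 1.0


# Placeholder identifiers for illustration only; not data from the study.
gold_links = {"g1", "g2", "g3"}    # links created by a curator
model_links = {"g1", "g2", "g4"}   # links suggested by a model
print(precision_recall_f1(model_links, gold_links))

curator_sets = [{"g1", "g2"}, {"g1", "g2", "g3"}, {"g1"}]
print(mean_pairwise_jaccard(curator_sets))
```

Running the agreement function once on annotations made without model suggestions and once on annotations made with them would give a simple with/without comparison of the kind named in parameter (2). Chance-corrected statistics are an alternative if plain overlap is too lenient.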
