Discovery of novel biomarkers and phenotypes by semantic technologies

利用语义技术发现新型生物标志物和表型

阅读:1

Abstract

BACKGROUND: Biomarkers and target-specific phenotypes are important to targeted drug design and individualized medicine, thus constituting an important aspect of modern pharmaceutical research and development. More and more, the discovery of relevant biomarkers is aided by in silico techniques based on applying data mining and computational chemistry on large molecular databases. However, there is an even larger source of valuable information available that can potentially be tapped for such discoveries: repositories constituted by research documents. RESULTS: This paper reports on a pilot experiment to discover potential novel biomarkers and phenotypes for diabetes and obesity by self-organized text mining of about 120,000 PubMed abstracts, public clinical trial summaries, and internal Merck research documents. These documents were directly analyzed by the InfoCodex semantic engine, without prior human manipulations such as parsing. Recall and precision against established, but different benchmarks lie in ranges up to 30% and 50% respectively. Retrieval of known entities missed by other traditional approaches could be demonstrated. Finally, the InfoCodex semantic engine was shown to discover new diabetes and obesity biomarkers and phenotypes. Amongst these were many interesting candidates with a high potential, although noticeable noise (uninteresting or obvious terms) was generated. CONCLUSIONS: The reported approach of employing autonomous self-organising semantic engines to aid biomarker discovery, supplemented by appropriate manual curation processes, shows promise and has potential to impact, conservatively, a faster alternative to vocabulary processes dependent on humans having to read and analyze all the texts. More optimistically, it could impact pharmaceutical research, for example to shorten time-to-market of novel drugs, or speed up early recognition of dead ends and adverse reactions.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。