Data-driven information extraction and enrichment of molecular profiling data for cancer cell lines

Authors: Ellery Smith, Rahel Paloots, Dimitris Giagkos, Michael Baudis, Kurt Stockinger

Journal: Bioinformatics Advances
Impact factor: 2.800
Year: 2024
Citation: 2024;4(1):vbae045
DOI: 10.1093/bioadv/vbae045
Research areas: cell biology, oncology

Abstract

MOTIVATION: With the proliferation of research means and computational methodologies, the published biomedical literature is growing exponentially in both number and volume. Cancer cell lines are frequently used models in biological and medical research and are currently applied for a wide range of purposes, from studies of cellular mechanisms to drug development, which has led to a wealth of related data and publications. Sifting through large quantities of text to gather relevant information on cell lines of interest is tedious and extremely slow when performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction.

RESULTS: In this work, we present the design, implementation, and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data concerning cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variant plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidence, allowing for deep, yet rapid, literature search, using existing structured data as a springboard.

AVAILABILITY AND IMPLEMENTATION: Our system is publicly available on the web at https://cancercelllines.org.
