Prioritizing transcriptomic and epigenomic experiments using an optimization strategy that leverages imputed data



Abstract

MOTIVATION: Successful science often involves not only performing experiments well, but also choosing well among many possible experiments. In a hypothesis generation setting, choosing an experiment well means choosing one whose results are interesting or novel. In this work, we formalize this selection procedure in the context of genomics and epigenomics data generation. Specifically, we consider the task faced by a scientific consortium such as the National Institutes of Health ENCODE Consortium, whose goal is to characterize all of the functional elements in the human genome. Given a list of possible cell types or tissue types ('biosamples') and a list of possible high-throughput sequencing assays, where at least one experiment has been performed in each biosample and for each assay, we ask 'Which experiments should ENCODE perform next?'

RESULTS: We demonstrate how to represent this task as a submodular optimization problem, where the goal is to choose a panel of experiments that maximizes the facility location function. A key aspect of our approach is that we use imputed data, rather than experimental data, to directly answer the posed question. We find that, across several evaluations, our method chooses a panel of experiments that spans a diversity of biochemical activity. Finally, we propose two modifications of the facility location function, including a novel submodular-supermodular function, that allow incorporation of domain knowledge or constraints into the optimization procedure.

AVAILABILITY AND IMPLEMENTATION: Our method is available as a Python package at https://github.com/jmschrei/kiwano and can be installed using the command pip install kiwano. The source code used here and the similarity matrix can be found at http://doi.org/10.5281/zenodo.3708538.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
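To make the optimization in RESULTS concrete, the following is a minimal sketch of greedy maximization of the facility location function f(S) = sum over i of max over j in S of sim(i, j), given a pairwise similarity matrix between candidate experiments. It is an illustration only and does not use or reproduce the kiwano API: the function names `facility_location` and `greedy_select`, the toy 6 x 6 similarity matrix, and the assumption that similarities are non-negative are all hypothetical choices made for this example. In the paper's setting, the similarities would be computed from imputed data rather than constructed by hand.

```python
import numpy as np


def facility_location(similarity, selected):
    """Facility location value of a panel: each experiment is scored by its
    similarity to the closest selected experiment, and scores are summed.
    Assumes non-negative similarities, so the empty panel scores 0."""
    if len(selected) == 0:
        return 0.0
    return float(similarity[:, selected].max(axis=1).sum())


def greedy_select(similarity, k):
    """Greedily pick k experiments, each time adding the one with the
    largest marginal gain in the facility location function."""
    n = similarity.shape[0]
    best_cover = np.zeros(n)  # best similarity of each item to the panel so far
    selected = []
    for _ in range(k):
        # Marginal gain of candidate j: total improvement in coverage.
        gains = np.maximum(similarity - best_cover[:, None], 0.0).sum(axis=0)
        if selected:
            gains[selected] = -np.inf  # never pick the same experiment twice
        j = int(np.argmax(gains))
        selected.append(j)
        best_cover = np.maximum(best_cover, similarity[:, j])
    return selected


# Hypothetical toy data: two groups of three mutually similar experiments.
sim = np.full((6, 6), 0.1)
sim[:3, :3] = 0.9
sim[3:, 3:] = 0.9
np.fill_diagonal(sim, 1.0)

panel = greedy_select(sim, k=2)
print(panel, facility_location(sim, panel))
```

On this toy matrix the greedy procedure picks one experiment from each group, since a second pick from an already covered group adds almost nothing to the objective. This is the sense in which maximizing the function favors a diverse panel. Because the facility location function is monotone and submodular, the greedy solution is guaranteed to reach at least a (1 - 1/e) fraction of the optimal value.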
