Pharmacovariome scanning using whole pharmacogene resequencing coupled with deep computational analysis and machine learning for clinical pharmacogenomics

利用全药理基因重测序结合深度计算分析和机器学习进行药理变异组扫描,用于临床药理基因组学研究

阅读:2

Abstract

BACKGROUND: This pilot study aims to identify and functionally assess pharmacovariants in whole exome sequencing data. While detection of known variants has benefited from pharmacogenomic-dedicated bioinformatics tools before, in this paper we have tested novel deep computational analysis in addition to artificial intelligence as possible approaches for functional analysis of unknown markers within less studied drug-related genes. METHODS: Pharmacovariants from 1800 drug-related genes from 100 WES data files underwent (a) deep computational analysis by eight bioinformatic algorithms (overall containing 23 tools) and (b) random forest (RF) classifier as the machine learning (ML) approach separately. ML model efficiency was calculated by internal and external cross-validation during recursive feature elimination. Protein modelling was also performed for predicted highly damaging variants with lower frequencies. Genotype-phenotype correlations were implemented for top selected variants in terms of highest possibility of being damaging. RESULTS: Five deleterious pharmacovariants in the RYR1, POLG, ANXA11, CCNH, and CDH23 genes identified in step (a) and subsequent analysis displayed high impact on drug-related phenotypes. Also, the utilization of recursive feature elimination achieved a subset of 175 malfunction pharmacovariants in 135 drug-related genes that were used by the RF model with fivefold internal cross-validation, resulting in an area under the curve of 0.9736842 with an average accuracy of 0.9818 (95% CI: 0.89, 0.99) on predicting whether a carrying individuals will develop adverse drug reactions or not. However, the external cross-validation of the same model indicated a possible false positive result when dealing with a low number of observations, as only 60 important variants in 49 genes were displayed, giving an AUC of 0.5384848 with an average accuracy of 0.9512 (95% CI: 0.83, 0.99). CONCLUSION: While there are some technologies for functionally assess not-interpreted pharmacovariants, there is still an essential need for the development of tools, methods, and algorithms which are able to provide a functional prediction for every single pharmacovariant in both large-scale datasets and small cohorts. Our approaches may bring new insights for choosing the right computational assessment algorithms out of high throughput DNA sequencing data from small cohorts to be used for personalized drug therapy implementation.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。