PeakDetective: A Semisupervised Deep Learning-Based Approach for Peak Curation in Untargeted Metabolomics

Abstract

Peak-detection algorithms currently used to process untargeted metabolomics data were designed to maximize sensitivity at the expense of selectivity. Peak lists returned by conventional software tools therefore contain a high density of artifacts that do not represent real chemical analytes, which, in turn, hinder downstream analyses. Although some innovative approaches to remove artifacts have recently been introduced, they require extensive user intervention because of the diversity of peak shapes present within and across metabolomics data sets. To address this bottleneck in metabolomics data processing, we developed a semisupervised deep learning-based approach, PeakDetective, for classification of detected peaks as artifacts or true peaks. Our approach utilizes two techniques for artifact removal. First, an unsupervised autoencoder is used to extract a low-dimensional, latent representation of each peak. Second, a classifier is trained with active learning to discriminate between artifacts and true peaks. Through active learning, the classifier is trained with fewer than 100 user-labeled peaks in a matter of minutes. Given the speed of its training, PeakDetective can be rapidly tailored to specific LC/MS methods and sample types to maximize performance on each type of data set. In addition to curation, the trained models can also be utilized for peak detection to immediately detect peaks with both high sensitivity and selectivity. We validated PeakDetective on five diverse LC/MS data sets, where PeakDetective showed greater accuracy than current approaches. When applied to a SARS-CoV-2 data set, PeakDetective enabled more statistically significant metabolites to be detected. PeakDetective is open source and available as a Python package at https://github.com/pattilab/PeakDetective.
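
The second step, training a classifier with active learning on latent peak representations, can be illustrated with a minimal sketch. This is not the PeakDetective API or the authors' implementation. It assumes uncertainty sampling as the query strategy and a plain logistic-regression classifier, neither of which the abstract specifies. The autoencoder is not implemented: synthetic two-dimensional vectors stand in for its latent output, and a hypothetical `oracle` function stands in for the user who labels each queried peak. All function names are illustrative.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, epochs=500, lr=0.1):
    """Fit logistic regression with batch gradient descent; return (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(model, x):
    """Probability that latent vector x is a true peak (label 1)."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def active_learning(latent, oracle, n_seed=10, n_rounds=10, batch=5, seed=0):
    """Uncertainty sampling: repeatedly label the peaks the model is least sure of."""
    rng = random.Random(seed)
    # Start from a small random set of labeled peaks.
    labeled = {i: oracle(i) for i in rng.sample(range(len(latent)), n_seed)}
    for _ in range(n_rounds):
        idx = list(labeled)
        model = fit_logistic([latent[i] for i in idx], [labeled[i] for i in idx])
        pool = [i for i in range(len(latent)) if i not in labeled]
        # Most uncertain = predicted probability closest to 0.5.
        pool.sort(key=lambda i: abs(predict_proba(model, latent[i]) - 0.5))
        for i in pool[:batch]:
            labeled[i] = oracle(i)  # in practice, a user would inspect the peak here
    # Refit once more so the final model uses every collected label.
    idx = list(labeled)
    model = fit_logistic([latent[i] for i in idx], [labeled[i] for i in idx])
    return model, labeled


def demo(n_per_class=200, seed=1):
    """Run the loop on synthetic latent vectors; return (labels used, accuracy)."""
    rng = random.Random(seed)
    # Hypothetical stand-in for autoencoder output: two Gaussian clusters.
    latent = [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(n_per_class)]
    latent += [[rng.gauss(-2, 1), rng.gauss(-2, 1)] for _ in range(n_per_class)]
    truth = [1] * n_per_class + [0] * n_per_class  # 1 = true peak, 0 = artifact
    model, labeled = active_learning(latent, oracle=lambda i: truth[i])
    correct = sum(
        (predict_proba(model, x) >= 0.5) == bool(t) for x, t in zip(latent, truth)
    )
    return len(labeled), correct / len(latent)


if __name__ == "__main__":
    n_labels, accuracy = demo()
    print(f"labels used: {n_labels}, accuracy: {accuracy:.3f}")
```

With the default settings the loop requests 10 seed labels plus 10 rounds of 5 queries, 60 labels in total, which is in the same regime as the fewer than 100 user-labeled peaks reported in the abstract. The synthetic clusters are far easier to separate than real peak shapes, so the accuracy printed here says nothing about performance on LC/MS data.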
