Development of Peptide Identification System for ToF-SIMS Spectra Using Supervised Machine Learning



Abstract

Interpreting time-of-flight secondary ion mass spectrometry (ToF-SIMS) data for organic materials is complicated because each molecule produces many fragment ions and because mass peaks from different molecules can overlap. Fragmentation mechanisms in SIMS are complex because different sputtering and ionization processes can occur simultaneously. A prediction system that can identify the materials in a sample is therefore required. We developed a novel prediction system for peptides based on ToF-SIMS spectra and amino-acid-based teaching information (labels) for supervised machine learning. To extend such a system to general organic materials, annotating the materials is crucial for creating effective labels for supervised learning. Peptides are built from 20 kinds of amino acid residues, which can be used as labels. We previously developed a peptide prediction system using Random Forest, a supervised machine-learning method. However, that system predicted only which amino acids the target peptide contained and could not infer the amino acid sequence. In this study, the amino acid sequence of the test peptide was determined by adding information on pairs of adjacent amino acids to the labels. Once the prediction system had learned the spectra of the target peptides, those peptides could be identified in newly obtained ToF-SIMS spectra. The new prediction system also provides useful information for identifying unknown peptides. The prediction results indicate that ordered pairs (permutations) of two adjacent amino acids are effective teaching information for expressing the amino acid sequence of a peptide.
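
The labeling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the label layout, so the names `ALL_LABELS`, `encode_peptide`, and `active_labels`, the example sequence `GLYK`, and the choice to count all 400 ordered pairs (including repeated residues such as `GG`) are assumptions made for illustration. The sketch only builds the label vector that a supervised learner such as Random Forest could be trained against; it does not process spectra.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed label layout: 20 single-residue labels followed by
# 400 ordered-pair labels (repeated residues such as "GG" included).
SINGLE_LABELS = list(AMINO_ACIDS)
PAIR_LABELS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
ALL_LABELS = SINGLE_LABELS + PAIR_LABELS


def encode_peptide(sequence):
    """Return a 0/1 vector marking residues and adjacent ordered pairs present."""
    present = set(sequence)
    present.update(sequence[i:i + 2] for i in range(len(sequence) - 1))
    return [1 if label in present else 0 for label in ALL_LABELS]


def active_labels(vector):
    """List the label names whose flag is set, in label order."""
    return [label for label, flag in zip(ALL_LABELS, vector) if flag]


# Hypothetical peptide and its reversal: same residues, different pair labels.
print(active_labels(encode_peptide("GLYK")))
print(active_labels(encode_peptide("KYLG")))
```

Residue-only labels cannot distinguish a peptide from its reversal, whereas the ordered-pair labels differ, which is why adjacent pairs carry sequence information.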
