Interpretable dimensionality reduction and classification of mass spectrometry imaging data in a visceral pain model via non-negative matrix factorization

通过非负矩阵分解对内脏痛模型中的质谱成像数据进行可解释的降维和分类

阅读:6
作者:Kasun Pathirage, Aman Virmani, Alison J Scott, Richard J Traub, Robert K Ernst, Reza Ghodssi, Behtash Babadi, Pamela Ann Abshire

Abstract

Mass spectrometry imaging (MSI) is a powerful scientific tool for understanding the spatial distribution of biochemical compounds in tissue structures. In this paper, we introduce three novel approaches in MSI data processing to perform the tasks of data augmentation, feature ranking, and image registration. We use these approaches in conjunction with non-negative matrix factorization (NMF) to resolve two of the biggest challenges in MSI data analysis, namely: 1) the large file sizes and associated computational resource requirements and 2) the complexity of interpreting the very high dimensional raw spectral data. There are many dimensionality reduction techniques that address the first challenge but do not necessarily result in readily interpretable features, leaving the second challenge unaddressed. We demonstrate that NMF is an effective dimensionality reduction algorithm that reduces the size of MSI datasets by three orders of magnitude with limited loss of information, yielding spatial and spectral components with meaningful correlation to tissue structure that may be used directly for subsequent data analysis without the need for additional clustering steps. This analysis is demonstrated on an MSI dataset from female Sprague-Dawley rats for an animal model of comorbid visceral pain hypersensitivity (CPH). We find that high-dimensional MSI data (∼ 100,000 ions per pixel) can be reduced to 20 spectral NMF components with < 20% loss in reconstruction accuracy. The resulting spatial NMF components are reproducible and correlate well with H&E-stained tissue images. These components may also be used to generate images with enhanced specificity for different tissue types. Small patches of NMF data (i.e., 20 spatial NMF components over 20 × 20 pixels) provide an accuracy of ∼ 87% in classifying CPH vs naïve control subjects. This paper presents the novel data processing methodologies that were used to produce these results, encompassing novel data processing pipelines for data augmentation to support training for classification, ranking of features according to their contribution to classification, and image registration to enhance tissue-specific imaging.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。