Abstract
INTRODUCTION: Image segmentation is an important challenge in mass spectrometry imaging data processing. Here, we report an unsupervised topological segmentation method adapted to the specific nature of mass spectrometry data. Unlike machine-learning clustering algorithms, the proposed method retains the physical and chemical integrity of the mass spectrum, as no dimensionality reduction is required.

METHODS: Using the cosine similarity measure, we discard outliers, detect spectrally homogeneous regions, and filter pixels with mixed cell origin at the borders between different tissue subtypes. Then, we evaluate the actual data manifold dimensionality to determine spectrally homogeneous regions within samples. The method was applied to discriminate regions in sections of aggressive human glial tumours analysed by MALDI-TOF mass spectrometry.

RESULTS: Analysis of parallel sections reveals correlated region allocation throughout the sample. The presence of tumour cells decreases progressively from the tumour core toward the sample edge. Filtering pixels with mixed cellular content is essential for investigating highly heterogeneous tumour tissues and their infiltration regions. Therefore, only homogeneous regions were selected using topological segmentation, as identifying metabolic alterations associated with tumour infiltration and metastasis in the native microenvironment is critical for cancer biology.

CONCLUSIONS: Topological segmentation helps filter pixels from transition zones where cells of different types contribute comparably to the resulting signal. Consequently, the regions identified by spectral similarity are homogeneous data clusters that represent the characteristic molecular composition of the analysed cells while preserving their natural variability.
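
The cosine similarity filtering mentioned in the methods can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold of 0.9, the choice of a single reference spectrum, and the toy intensities are all assumptions made for illustration. The sketch compares each pixel spectrum with a reference spectrum on a shared m/z axis and separates pixels that fall below the similarity threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors on a shared m/z axis."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def split_by_similarity(spectra, reference, threshold=0.9):
    """Return (kept, rejected) pixel indices.

    The threshold is a hypothetical value chosen for this example only.
    """
    kept, rejected = [], []
    for idx, spectrum in enumerate(spectra):
        if cosine_similarity(spectrum, reference) >= threshold:
            kept.append(idx)
        else:
            rejected.append(idx)
    return kept, rejected


# Toy data: three "pixels" with four m/z bins each (made-up intensities).
pixels = [
    [10.0, 0.0, 5.0, 1.0],   # same profile as the reference
    [20.0, 0.0, 10.0, 2.0],  # same profile at twice the intensity
    [0.0, 8.0, 0.0, 6.0],    # different profile
]
reference = [10.0, 0.0, 5.0, 1.0]

kept, rejected = split_by_similarity(pixels, reference)
print("kept:", kept, "rejected:", rejected)
```

Because cosine similarity depends only on the angle between spectra, the second pixel is retained despite its higher overall intensity, while the third pixel, whose signal sits in different m/z bins, is rejected. The measure is computed on the full spectrum, so no dimensionality reduction is involved in this step.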