Abstract
Single-molecule localization microscopy (SMLM) data can reveal differences in protein organization between disease types or samples. Classifying samples is an important task because it allows data to be recognized and grouped automatically by sample type for downstream analysis. However, methods for classifying structures larger than single clusters of localizations in SMLM point-cloud datasets are not well developed. A graph-based deep learning pipeline is presented for classifying SMLM point-cloud data over a field of view of any size. The pipeline combines features of individual clusters (calculated from their constituent localizations) with the structure formed by the positions of multiple clusters (supracluster structure). This method reaches 99% accuracy on a model open-source DNA-PAINT dataset, outperforming previous classification results. It is also applied to a challenging new SMLM dataset from colorectal cancer tissue. The explainability tools Uniform Manifold Approximation and Projection (UMAP) and SubgraphX allow exploration of how spatial features and structures influence classification results, and demonstrate the importance of supracluster structure in classification.
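The two ingredients the pipeline combines, per-cluster features and supracluster structure, can be illustrated with a minimal sketch. This is not the pipeline's implementation: the choice of features (localization count and radius of gyration), the k-nearest-neighbor graph over cluster centroids, and the names `cluster_features` and `knn_edges` are all assumptions made for illustration, and the localizations are assumed to be already assigned to clusters.

```python
import numpy as np

def cluster_features(locs, labels):
    """Summarize each cluster from its constituent localizations.

    locs: (N, 2) array of localization coordinates.
    labels: (N,) integer cluster id per localization (assumed given).
    Returns cluster ids, centroids (C, 2), and features (C, 2) holding
    [localization count, radius of gyration] (illustrative feature choice).
    """
    locs = np.asarray(locs, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.zeros((len(ids), 2))
    feats = np.zeros((len(ids), 2))
    for i, c in enumerate(ids):
        pts = locs[labels == c]
        centroids[i] = pts.mean(axis=0)
        # Root-mean-square distance of localizations from the centroid.
        rg = np.sqrt(((pts - centroids[i]) ** 2).sum(axis=1).mean())
        feats[i] = (len(pts), rg)
    return ids, centroids, feats

def knn_edges(centroids, k):
    """Connect each cluster to its k nearest clusters (assumed graph rule).

    Returns a (2, E) integer array of directed edges [source; target].
    """
    n = len(centroids)
    k = min(k, n - 1)
    # Pairwise Euclidean distances between centroids.
    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]
    src = np.repeat(np.arange(n), k)
    return np.stack([src, nbrs.ravel()])

# Toy field of view with three clusters of two localizations each.
locs = np.array([[0, 0], [0, 2], [10, 0], [10, 2], [100, 0], [100, 2]], dtype=float)
labels = np.array([0, 0, 1, 1, 2, 2])
ids, centroids, feats = cluster_features(locs, labels)
edges = knn_edges(centroids, k=1)
```

Here `feats` would serve as node features and `edges` as the supracluster graph passed to a graph neural network. The (2, E) edge layout matches the `edge_index` convention used by PyTorch Geometric, although nothing in this sketch depends on that library.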