scHiClassifier: a deep learning framework for cell type prediction by fusing multiple feature sets from single-cell Hi-C data

Abstract

Single-cell high-throughput chromosome conformation capture (Hi-C) technology captures the spatial structure of chromosomes at the level of individual cells. To investigate how chromosomal structure changes across cell types, however, methods are needed that can identify cell types from single-cell Hi-C data. Existing frameworks for cell type prediction from single-cell Hi-C data are limited: their features often lack interpretability and biological significance, and their classification performance has not been validated convincingly or robustly. In this study, we propose four new feature sets derived from the contact matrix, each with clear interpretability and biological significance. We further develop scHiClassifier, a novel deep learning framework based on a multi-head self-attention encoder, 1D convolution and feature fusion, which integrates information from these four feature sets to predict cell types accurately. Through comprehensive comparison experiments with benchmark frameworks on six datasets, we demonstrate the superior classification performance and the universality of scHiClassifier. We assess its robustness through data perturbation and data dropout experiments. Comparisons of different feature set combinations show that using all four feature sets yields the best performance. Comparison with four unsupervised dimensionality reduction methods demonstrates the effectiveness and superiority of the multiple feature set extraction. Additionally, we analyze the importance of individual feature sets and chromosomes using the SHapley Additive exPlanations (SHAP) method. Finally, enrichment analysis supports the accuracy and reliability of scHiClassifier in classifying cells from single-cell Hi-C data.
The source code of scHiClassifier is freely available at https://github.com/HaoWuLab-Bioinformatics/scHiClassifier.
