Facilitate integrated analysis of single cell multiomic data by binarizing gene expression values



Abstract

A cell type's identity can be revealed by its transcriptome and epigenome profiles, both of which can be in flux temporally and spatially, leading to distinct cell states or subtypes. The standard workflow for single cell RNA-seq (scRNA-seq) data analysis applies feature selection, dimensionality reduction, and clustering to gene expression values quantified by read counts. Alternative approaches that simply classify each gene as "on" or "off" (i.e., binarization of gene expression) have also been proposed for clustering cells and for other downstream analyses. Here, we demonstrate that, when the two data modalities are collected using a paired multiomic technology, a direct concatenation of the binarized scRNA-seq data and the standard single cell ATAC-seq (scATAC-seq) data is sufficient and effective for vertical integrated clustering analysis. The combined data are processed with term frequency-inverse document frequency (TF-IDF) weighting followed by singular value decomposition (SVD), a combination also called latent semantic indexing (LSI). The proposed approach avoids the need to convert scATAC-seq data to gene activity scores for combined analysis. Furthermore, it enables a direct investigation of the contribution of each data type to resolving cell type or subtype identity.
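The pipeline described in the abstract (binarize, concatenate, TF-IDF, SVD) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the "count > 0" rule for calling a gene "on", the particular TF-IDF variant, and the random toy matrices are all assumptions made for illustration only.

```python
import numpy as np

def binarize(counts):
    """Call a feature 'on' (1.0) when its count is above zero, else 'off' (0.0).
    The zero threshold is an assumption; the abstract does not state one."""
    return (np.asarray(counts) > 0).astype(float)

def tfidf(binary):
    """TF-IDF weighting of a cells x features binary matrix.
    This is one common variant, chosen here for illustration."""
    row_sums = binary.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0          # avoid division by zero for empty cells
    tf = binary / row_sums                 # term frequency within each cell
    n_cells = binary.shape[0]
    df = np.maximum(binary.sum(axis=0), 1.0)  # number of cells with the feature on
    idf = np.log(1.0 + n_cells / df)       # inverse document frequency
    return tf * idf

def lsi_embedding(rna_counts, atac_counts, n_components=2):
    """Concatenate binarized RNA and ATAC features for the same cells,
    apply TF-IDF, and keep the leading SVD components (LSI)."""
    combined = np.hstack([binarize(rna_counts), binarize(atac_counts)])
    weighted = tfidf(combined)
    u, s, _ = np.linalg.svd(weighted, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Hypothetical toy data: 6 paired cells, 8 genes, 10 accessibility peaks.
rng = np.random.default_rng(0)
rna = rng.poisson(1.0, size=(6, 8))
atac = rng.poisson(0.5, size=(6, 10))
embedding = lsi_embedding(rna, atac, n_components=2)
print(embedding.shape)  # (6, 2)
```

Because both modalities become binary matrices over the same cells, they can share a single TF-IDF and SVD step, and the resulting low-dimensional embedding could then be passed to any standard clustering method.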
