GUIDING CLUSTERING AND ANNOTATION IN SINGLE-CELL RNA SEQUENCING USING THE AVERAGE OVERLAP METRIC

Abstract

Defining cell types using unsupervised clustering algorithms based on transcriptional similarity is a powerful application of single-cell RNA sequencing. A single clustering resolution may not yield clusters that simultaneously represent both broad, well-defined populations and smaller subpopulations. Therefore, when cell identities are not known prior to sequencing, robust comparison and annotation of clusters inferred de novo remains a challenge. In this work, we define the distance between single-cell clusters using the average overlap metric, which compares ranked lists of differentially expressed genes in a top-weighted manner. We first benchmark our approach on a dataset with known ground truth, composed of highly similar yet distinct T-cell populations, and show that evaluating clusters with average overlap yields a consistent, precise, and biologically meaningful recapitulation of true cell identities. We then apply our approach to data from unsorted mouse thymocytes and characterize stages of T-cell development in the thymus, including minor populations of double-negative (CD4-CD8-) T-cells that are notoriously difficult to detect with confidence in unsorted single-cell data. We demonstrate that measuring cluster similarity with average overlap of marker gene rankings enables robust, reproducible characterization of single cells and clarifies biological interpretation of their underlying identities in highly homogeneous populations.
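
To make the top-weighted comparison concrete, the following is a minimal sketch of the average overlap between two ranked gene lists, using the standard definition: the mean, over depths d = 1..k, of the fraction of genes shared by the two top-d prefixes. The function names, the choice of evaluation depth, the use of 1 minus average overlap as a distance, and the example gene lists are illustrative assumptions, not details taken from the paper.

```python
from typing import Optional, Sequence


def average_overlap(ranked_a: Sequence[str],
                    ranked_b: Sequence[str],
                    depth: Optional[int] = None) -> float:
    """Average overlap of two ranked lists, evaluated down to `depth`.

    At each depth d, the agreement is |top-d(a) & top-d(b)| / d.
    The average of these agreements weights the top of the lists most,
    because top-ranked items contribute to every depth.
    Assumes each list contains no duplicate entries.
    """
    if depth is None:
        # Assumption: default to the length of the shorter list.
        depth = min(len(ranked_a), len(ranked_b))
    if depth <= 0:
        raise ValueError("depth must be positive and both lists non-empty")

    total = 0.0
    for d in range(1, depth + 1):
        shared = set(ranked_a[:d]) & set(ranked_b[:d])
        total += len(shared) / d
    return total / depth


def average_overlap_distance(ranked_a: Sequence[str],
                             ranked_b: Sequence[str],
                             depth: Optional[int] = None) -> float:
    """Hypothetical distance: 1 - average overlap (0 = identical rankings)."""
    return 1.0 - average_overlap(ranked_a, ranked_b, depth)


# Illustrative marker rankings for three hypothetical clusters.
cluster_1 = ["Cd3e", "Cd4", "Cd8a", "Il7r"]
cluster_2 = ["Cd4", "Cd3e", "Cd8a", "Il7r"]   # swap at the top
cluster_3 = ["Cd3e", "Cd4", "Il7r", "Cd8a"]   # swap at the bottom

top_swap = average_overlap(cluster_1, cluster_2)
bottom_swap = average_overlap(cluster_1, cluster_3)
```

In this toy example the swap at the top of the ranking lowers the score more than the same swap at the bottom, which is the top-weighted behavior the abstract refers to. The per-depth set intersection is quadratic in the depth; it is kept here for readability rather than speed.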
