Identification of new marker genes from plant single-cell RNA-seq data using interpretable machine learning methods

Abstract

An essential step in the analysis of single-cell RNA sequencing (scRNA-seq) data is to classify cells into specific cell types using marker genes. In this study, we developed a machine learning pipeline called single-cell predictive marker (SPmarker) to identify novel cell-type marker genes in the Arabidopsis root. Unlike traditional approaches, our method uses interpretable machine learning models to select marker genes. We demonstrate that our method can (1) assign cell types based on cells that were labelled using published methods; (2) project cell types identified by trajectory analysis from one data set to other data sets; and (3) assign cell types based on internal GFP markers. Using SPmarker, we identified hundreds of marker genes that had not been reported previously. Compared with known marker genes, the new marker genes have more orthologous genes identifiable in the corresponding rice single-cell clusters. The new root hair marker genes also include 172 genes with orthologs expressed in root hair cells in five non-Arabidopsis species, which expands the number of marker genes for this cell type by 35-154%. Our results represent a new approach to identifying cell-type marker genes from scRNA-seq data and pave the way for cross-species mapping of scRNA-seq data in plants.
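
The idea of selecting marker genes with an interpretable model can be illustrated with a minimal sketch. The abstract does not say which models SPmarker uses, so this is a stand-in and not the published pipeline: each gene is scored by a one-level decision tree (a stump), and genes are ranked by the Gini impurity decrease obtained when separating one cell type from all others. The gene names (`geneA` to `geneD`), the expression values, and the function names are hypothetical and were invented for illustration; they are not data from the study.

```python
from typing import Dict, List, Tuple


def gini(labels: List[bool]) -> float:
    """Gini impurity of a binary label list (0.0 for an empty or pure list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump_gain(values: List[float], is_target: List[bool]) -> float:
    """Largest Gini impurity decrease achievable with one threshold on one gene."""
    parent = gini(is_target)
    n = len(values)
    best = 0.0
    levels = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(levels, levels[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for v, y in zip(values, is_target) if v <= threshold]
        right = [y for v, y in zip(values, is_target) if v > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_marker_candidates(
    expr: Dict[str, List[float]], cell_types: List[str], target: str
) -> List[Tuple[str, float]]:
    """Rank genes by how well a single split separates `target` from the rest."""
    is_target = [c == target for c in cell_types]
    scores = [(gene, best_stump_gain(vals, is_target)) for gene, vals in expr.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 8 cells, 4 genes (values invented for illustration).
cell_types = ["root_hair"] * 3 + ["cortex"] * 3 + ["endodermis"] * 2
expr = {
    "geneA": [9.0, 8.0, 7.5, 0.5, 1.0, 0.0, 0.2, 0.8],  # high only in root hair
    "geneB": [1.0, 5.0, 2.0, 6.0, 1.5, 5.5, 2.5, 4.0],  # no clear pattern
    "geneC": [0.0, 0.5, 0.2, 7.0, 8.0, 6.5, 0.1, 0.3],  # high only in cortex
    "geneD": [3.0] * 8,                                  # constant expression
}

ranking = rank_marker_candidates(expr, cell_types, "root_hair")
cortex_ranking = rank_marker_candidates(expr, cell_types, "cortex")
for gene, gain in ranking:
    print(f"{gene}\t{gain:.5f}")
```

Impurity decrease is the quantity that tree ensembles aggregate into feature importances, so a full model would replace the single stump with many trees. In this toy example `geneA` ranks first for root hair cells because one threshold separates them perfectly, while the constant `geneD` offers no split and scores zero.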
