In case-control single-cell RNA-seq studies, sample-level labels are transferred onto individual cells, labeling all case cells as affected, when in reality only a small fraction of them may actually be perturbed. Here, using simulations, we demonstrate that the standard approach to single-cell analysis fails to isolate the subset of affected case cells and their markers when either the affected subset is small or the perturbation is mild. To address this fundamental limitation, we introduce HiDDEN, a computational method that refines the case-control labels to accurately reflect the perturbation status of each cell. We show HiDDEN's superior ability to recover biological signals missed by the standard analysis workflow in simulated ground-truth datasets of cell type mixtures. When applied to a dataset of human multiple myeloma precursor conditions, HiDDEN recapitulates the expert manual annotation and discovers malignancy in early-stage samples missed in the original analysis. When applied to a mouse model of demyelination, HiDDEN identifies an endothelial subpopulation playing a role in early-stage blood-brain barrier dysfunction. We anticipate that HiDDEN should find wide usage in contexts that require the detection of subtle transcriptional changes in cell types across conditions.
HiDDEN: a machine learning method for detection of disease-relevant populations in case-control single-cell transcriptomics data.
Authors: Goeva Aleksandrina, Dolan Michael-John, Luu Judy, Garcia Eric, Boiarsky Rebecca, Gupta Rajat M, Macosko Evan
| Journal: | Nature Communications | Impact factor: | 15.700 |
| Year: | 2024 | Citation: | 2024 Nov 2; 15(1):9468 |
| DOI: | 10.1038/s41467-024-53666-8 | | |
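
The abstract describes refining sample-level case-control labels into per-cell labels but does not spell out the steps. The following is a minimal sketch of one plausible way to do this, not the authors' implementation: it assumes a pipeline of dimensionality reduction, a classifier trained on the sample-level labels to give each cell a continuous score, and a two-group split of the case-cell scores. The function name `refine_labels`, its parameters, and the simulated data are all hypothetical, and plain NumPy stand-ins are used for PCA, logistic regression, and two-means clustering.

```python
import numpy as np


def refine_labels(X, sample_label, n_pcs=5, n_iter=500, lr=0.1):
    """Hypothetical sketch: turn sample-level labels into per-cell labels.

    X: (cells x genes) expression matrix.
    sample_label: 0 for cells from control samples, 1 for case samples.
    Returns (score, refined); refined is 1 only for case cells that fall
    in the higher-scoring of two score groups.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(sample_label, dtype=float)

    # 1) Dimensionality reduction: PCA via SVD of the centered matrix.
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    Z = U[:, :n_pcs] * S[:n_pcs]
    Z = Z / (Z.std(axis=0) + 1e-12)  # put components on a common scale

    # 2) Logistic regression on the sample-level labels (gradient descent).
    A = np.column_stack([np.ones(len(Z)), Z])  # intercept + components
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ w))
        w -= lr * A.T @ (p - y) / len(y)
    score = 1.0 / (1.0 + np.exp(-A @ w))  # continuous per-cell score

    # 3) Split the case-cell scores into two groups (1-D two-means).
    case = y == 1
    s = score[case]
    lo, hi = s.min(), s.max()
    for _ in range(100):
        high = np.abs(s - hi) < np.abs(s - lo)
        if high.all() or not high.any():
            break
        lo, hi = s[~high].mean(), s[high].mean()

    refined = np.zeros(len(y), dtype=int)
    refined[np.flatnonzero(case)[high]] = 1
    return score, refined


# Simulated example (all numbers are illustrative assumptions):
# 200 control cells, 200 case cells, of which only 40 are truly perturbed.
rng = np.random.default_rng(0)
n_ctrl, n_case, n_hit, n_genes = 200, 200, 40, 20
X = rng.normal(size=(n_ctrl + n_case, n_genes))
truth = np.zeros(n_ctrl + n_case, dtype=int)
truth[n_ctrl:n_ctrl + n_hit] = 1
X[truth == 1, :5] += 3.0  # perturbation shifts the first five genes
sample_label = np.r_[np.zeros(n_ctrl, dtype=int), np.ones(n_case, dtype=int)]

score, refined = refine_labels(X, sample_label)
print("case cells flagged as affected:", refined.sum(), "of", n_case)
```

In this toy setting the classifier cannot separate unperturbed case cells from control cells, so their scores stay near the middle, while truly perturbed cells receive high scores; the two-group split then flags only the latter. Control cells are never relabeled in this sketch.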
