Hierarchical classification-based pan-cancer methylation analysis to classify primary cancer

基于分层分类的泛癌甲基化分析用于对原发性癌症进行分类

阅读:1

Abstract

Hierarchical classification offers a more specific categorization of data and breaks down large classification problems into subproblems, providing improved prediction accuracy and predictive power for undefined categories, while also mitigating the impact of poor-quality data. Despite these advantages, its application in predicting primary cancer is rare. To leverage the similarity of cancers and the specificity of methylation patterns among them, we developed the Cancer Hierarchy Classification Tool (CHCT) using the idea of hierarchical classification, with methylation data from 30 cancer types and 8239 methylome samples downloaded from publicly available databases (The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO)). We used unsupervised clustering to divide the classification subproblems and screened differentially methylated sites using Analysis of variance (ANOVA) test, Tukey-kramer test, and Boruta algorithms to construct models for each classifier module. After validation, CHCT accurately classified 1568 out of 1660 cases in the test set, with an average accuracy of 94.46%. We further curated an independent validation cohort of 677 cancer samples from GEO and assigned a diagnosis using CHCT, which showed high diagnostic potential with generally high accuracies (an average accuracy of 91.40%). Moreover, CHCT demonstrates predictive capability for additional cancer types beyond its original classifier scope as demonstrated in the medulloblastoma and pituitary tumor datasets. In summary, CHCT can hierarchically classify primary cancer by methylation profile, by splitting a large-scale classification of 30 cancer types into ten smaller classification problems. These results indicate that cancer hierarchical classification has the potential to be an accurate and robust cancer classification method.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。