An algorithm for classifying tumors based on genomic aberrations and selecting representative tumor models

一种基于基因组畸变对肿瘤进行分类并选择代表性肿瘤模型的算法

阅读:1

Abstract

BACKGROUND: Cancer is a heterogeneous disease caused by genomic aberrations and characterized by significant variability in clinical outcomes and response to therapies. Several subtypes of common cancers have been identified based on alterations of individual cancer genes, such as HER2, EGFR, and others. However, cancer is a complex disease driven by the interaction of multiple genes, so the copy number status of individual genes is not sufficient to define cancer subtypes and predict responses to treatments. A classification based on genome-wide copy number patterns would be better suited for this purpose. METHOD: To develop a more comprehensive cancer taxonomy based on genome-wide patterns of copy number abnormalities, we designed an unsupervised classification algorithm that identifies genomic subgroups of tumors. This algorithm is based on a modified genomic Non-negative Matrix Factorization (gNMF) algorithm and includes several additional components, namely a pilot hierarchical clustering procedure to determine the number of clusters, a multiple random initiation scheme, a new stop criterion for the core gNMF, as well as a 10-fold cross-validation stability test for quality assessment. RESULT: We applied our algorithm to identify genomic subgroups of three major cancer types: non-small cell lung carcinoma (NSCLC), colorectal cancer (CRC), and malignant melanoma. High-density SNP array datasets for patient tumors and established cell lines were used to define genomic subclasses of the diseases and identify cell lines representative of each genomic subtype. The algorithm was compared with several traditional clustering methods and showed improved performance. To validate our genomic taxonomy of NSCLC, we correlated the genomic classification with disease outcomes. Overall survival time and time to recurrence were shown to differ significantly between the genomic subtypes. CONCLUSIONS: We developed an algorithm for cancer classification based on genome-wide patterns of copy number aberrations and demonstrated its superiority to existing clustering methods. The algorithm was applied to define genomic subgroups of three cancer types and identify cell lines representative of these subgroups. Our data enabled the assembly of representative cell line panels for testing drug candidates.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。