Refining of cancer-specific genes in microsatellite-unstable colon and endometrial cancers using modified partial least square discriminant analysis



Abstract

Although colon and endometrial cancers share microsatellite instability (MSI), they differ in many clinically important organ-specific features. The molecular differences between these two MSI cancers are underexplored because conventional differentially expressed gene (DEG) analysis yields too many genes that are not cancer-specific and merely reflect normal tissue expression. We aimed to identify cancer-specific genes in MSI colorectal adenocarcinoma (CRC) and MSI endometrial carcinoma (EC) using a modified partial least squares discriminant analysis (PLS-DA). We obtained a list of cancer-specific genes in MSI CRC and EC by taking the intersection of the genes obtained from tumor samples and those obtained from normal samples. Specifically, we obtained 1319 publicly available RNA sequencing profiles comprising MSI CRCs, MSI ECs, normal colon (including the rectum), and normal endometrium from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) project. To reduce the dimensionality of the gene space, we retained only 3924 genes from the original data by performing conventional DEG screening on the tumor samples with DESeq2. Standard PLS-DA was performed on the tumor samples, producing 625 genes, whereas for the normal samples, projection vectors with zero covariance were sampled, their weights were squared and summed, and genes with sufficiently high values were selected. Gene ontology (GO) term enrichment, protein-protein interaction, and survival analyses were performed for functional and clinical validation. We identified 30 cancer-specific, normal-invariant genes, including Zic family members (ZIC1, ZIC4, and ZIC5), DPPA2, PRSS56, ELF5, and FGF18, most of which are cancer-associated genes. Although no GO term reached statistical significance in the enrichment analysis, cell differentiation emerged as potentially significant.
In the protein-protein interaction analysis, 17 of the 30 genes had at least one connection, and when first-degree neighbors were added to the network, many cancer-related pathways, including MAPK, Ras, and PI3K-Akt, were enriched. In the survival analysis, 16 genes showed statistically significant differences between the lower- and higher-expression groups (3 in CRC and 15 in EC). We developed a novel approach for selecting cancer-specific, normal-invariant genes from relevant gene expression data. Although we believe that tissue-specific reactivation of embryonic genes might explain the cancer-specific differences between MSI CRC and EC, further studies are needed for validation.
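
The selection idea described in the abstract can be illustrated with a small NumPy sketch. This is one possible reading of the method, not the authors' implementation: it assumes that "zero covariance" means projection vectors orthogonal to the covariance direction X^T y between normal-tissue expression and the tissue label, and it replaces full PLS-DA on the tumor side with the first PLS weight vector only. The function names, the toy data, and both thresholds are hypothetical choices made for illustration.

```python
import numpy as np

def covariance_direction(X, y):
    """Unit vector proportional to X^T y after centering (first PLS weight vector)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    c = Xc.T @ yc
    return c / np.linalg.norm(c)

def zero_covariance_scores(X, y, n_vectors=2000, seed=0):
    """Sum of squared weights over random unit vectors w with cov(Xw, y) = 0.

    Assumption: a gene that separates the two classes gets small weights in
    such vectors, so a high score suggests the gene is class-invariant.
    """
    rng = np.random.default_rng(seed)
    c_unit = covariance_direction(X, y)
    W = rng.standard_normal((n_vectors, X.shape[1]))
    W -= np.outer(W @ c_unit, c_unit)              # project out the covariance direction
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # rescale each vector to unit length
    return (W ** 2).sum(axis=0)

def make_toy(shifted_genes, seed):
    """Toy data: 40 samples x 6 genes; the listed genes are shifted in class 1."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0.0, 1.0], 20)
    shift = np.zeros(6)
    shift[shifted_genes] = 5.0
    X = rng.standard_normal((40, 6)) + np.outer(y, shift)
    return X, y

# Tumors: genes 0 and 3 differ between the two cancer types.
X_tumor, y_tumor = make_toy([0, 3], seed=1)
# Normal tissues: genes 0 and 1 differ between the two organs.
X_normal, y_normal = make_toy([0, 1], seed=2)

# Tumor side: keep genes with a large squared weight (hypothetical threshold 1/p).
w_tumor = covariance_direction(X_tumor, y_tumor)
tumor_genes = set(np.flatnonzero(w_tumor ** 2 > 1.0 / w_tumor.size))

# Normal side: keep genes with an above-average score (hypothetical threshold).
scores = zero_covariance_scores(X_normal, y_normal)
invariant_genes = set(np.flatnonzero(scores > scores.mean()))

# Cancer-specific, normal-invariant genes are the intersection of both sets.
cancer_specific = sorted(int(g) for g in tumor_genes & invariant_genes)
print(cancer_specific)
```

In this toy setup, gene 0 separates both the tumors and the normal tissues, so it is discarded as a tissue-of-origin effect, while gene 3 separates only the tumors and is kept. Real expression data would need normalization and data-driven thresholds, which the abstract does not specify.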
