A contrastive adversarial encoder for multi-omics data integration

Abstract

Early and accurate cancer detection is crucial for effective treatment, prognosis, and the advancement of precision medicine, and the analysis of omics data is central to cancer research. A single type of omics data offers only a limited perspective, whereas integrating multiple omics modalities allows a more comprehensive understanding of cancer. Current deep models struggle to reduce dimensionality efficiently while preserving global information and integrating multi-omics data. The result is often feature redundancy or information loss, and the synergies among modalities are overlooked. To address this challenge, this paper proposes a contrastive adversarial encoder (CAEncoder) for multi-omics data integration. The proposed model combines a Vision Transformer (ViT) and a CycleGAN, trained end to end in a contrastive manner. The ViT serves as the encoder and relies on self-attention, while the CycleGAN employs adversarial learning to make the latent space embeddings more discriminative and invariant. Contrastive adversarial training improves representation quality by preventing information loss, eliminating redundancy, and capturing the synergies among different omics modalities. Training is driven by a composite loss function, a weighted combination of an Adversarial Loss (Hinge Loss), a Cycle Consistency Loss, and a Triplet Margin Loss. The Adversarial Loss and Cycle Consistency Loss provide feedback from the CycleGAN and drive the adversarial learning. The Triplet Margin Loss promotes contrastive learning by pulling similar samples together and pushing dissimilar samples apart in the latent space. The CAEncoder is evaluated on downstream classification tasks, both binary and multi-class, covering five different cancer types. The model achieved a classification accuracy of up to 93.33% and an F1 score of 92.81%, outperforming existing advanced models. These findings demonstrate the potential of our method to enhance precision medicine for cancer through improved multi-omics data integration.
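
The composite loss described in the abstract can be illustrated with a minimal sketch. The abstract names the three terms but gives neither their exact formulations nor their weights, so everything below is an assumption: the standard hinge GAN losses, an L1 cycle consistency term, a Euclidean triplet margin loss, and the hypothetical weights `w_adv`, `w_cyc`, and `w_tri`. It uses plain Python lists rather than tensors and is not the authors' implementation.

```python
import math


def hinge_generator_loss(fake_scores):
    """Generator-side hinge loss: negative mean discriminator score on generated samples."""
    return -sum(fake_scores) / len(fake_scores)


def hinge_discriminator_loss(real_scores, fake_scores):
    """Discriminator-side hinge loss: push real scores above +1 and fake scores below -1."""
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def cycle_consistency_loss(x, x_reconstructed):
    """Mean absolute (L1) error between an input and its round-trip reconstruction."""
    return sum(abs(a - b) for a, b in zip(x, x_reconstructed)) / len(x)


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def composite_loss(adv, cyc, tri, w_adv=1.0, w_cyc=10.0, w_tri=1.0):
    """Weighted sum of the three terms; the default weights are illustrative assumptions."""
    return w_adv * adv + w_cyc * cyc + w_tri * tri


# Toy values standing in for discriminator scores, inputs, and latent embeddings.
adv = hinge_generator_loss([1.0, 3.0])
cyc = cycle_consistency_loss([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
tri = triplet_margin_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])
total = composite_loss(adv, cyc, tri)
```

In a real training loop these terms would be computed on batches of tensors with an autodiff framework, and the weights would be tuned as hyperparameters. The sketch only shows how the three signals combine into a single objective for the encoder and generator.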
