Cross modality learning of cell painting and transcriptomics data improves mechanism of action clustering and bioactivity modelling

细胞染色和转录组学数据的跨模态学习可改进作用机制聚类和生物活性建模

阅读:1

Abstract

In drug discovery, different data modalities (chemical structure, cell biology, quantum mechanics, etc.) are abundant, and their integration can help with understanding aspects of chemistry, biology, and their interactions. Within cell biology, cell painting (CP) and transcriptomics RNA-Seq (TX) screens are powerful tools in early drug discovery, as they are complementary views of the biological effect of compounds on a population of cells post-treatment. While multimodal learning of chemical structure-cell painting, or different omics data has been experimented; a cell painting-bulk transcriptomics multimodal model is still unexplored. In this work, we benchmark two representation learning methods: contrastive learning and bimodal autoencoder. We use the setting of cross modality learning where representation learning is performed with two modalities (CP and TX), but only cell painting is available for new compounds embeddings generation and downstream task. This is because for new compounds, we would only have CP data and not TX, due to high data generation cost of the RNA-Seq screen. We show that in the absence of TX features for new compounds, using learned embeddings like those obtained from Constrastive Learning enhances performance of CP features on tasks where TX features excels but CP features does not. Additionally, we observed that learned representation improves cluster quality for clustering of CP replicates and different mechanisms of action (MoA), as well as improves performance on several subsets of bioactivity tasks grouped by protein target families.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。