Abstract
Molecular property prediction is the task of inferring the properties of a molecule from its representation. It is of great significance in fields such as drug design and has attracted widespread attention from researchers. For this task, the quality of the learned features largely determines model performance. Although existing molecular graph models can extract effective feature representations from graph structures, how to better exploit these features across different learning tasks remains an important challenge. This paper proposes a subgraph-optimized graph autoencoder (TurboGAE) together with several multimodal fusion strategies. By introducing a subgraph-level graph tokenizer, TurboGAE more effectively captures the influence of substructures within a molecule on its properties. For cross-modal molecular features, a well-designed multimodal fusion strategy aligns features across modalities during pretraining, exploiting the complementary strengths of each modality. The proposed methods achieve strong performance on downstream tasks.