Abstract
The rapid discovery and design of new molecules drives innovation in science and technology, advancing energy storage, catalysis, and drug development. Traditionally, exploring chemical space relies on costly quantum-chemical calculations or slow experimental screening, which limits how quickly promising candidates can be identified. Machine learning has emerged as a powerful approach to accelerate molecular discovery by predicting key properties directly from molecular structures. In many screening tasks, however, the exact value of a molecular property is not needed; it is enough to rank candidate structures correctly, so a ranking model can be sufficient for molecular screening. In this work, we develop a deep learning model that ranks molecular structures using a Siamese network trained with a pairwise learning-to-rank objective. Across several properties from the QM7x and QO2Mol data sets, the learning-to-rank Siamese architecture outperforms standard pointwise regression for absolute energetic properties, such as total and orbital energies, while pointwise regression remains effective for derived properties (e.g., the HOMO-LUMO gap) and nonenergy properties (e.g., the dipole moment). To further validate the robustness of the proposed framework, we extend our evaluation to the Uni-Mol molecular representation model. Experiments with Uni-Mol V1 and V2 across model sizes from 84 M to 1.1 B parameters confirm that the pairwise learning-to-rank objective consistently outperforms standard pointwise regression, even with highly expressive pretrained Transformer backbones.
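The pairwise learning-to-rank idea summarized above can be illustrated with a minimal sketch. It is not the model used in this work: the abstract does not specify the loss or the scorer, so the linear scorer, the RankNet-style logistic loss, and the names `score`, `pairwise_logistic_loss`, and `pair_accuracy` are assumptions chosen for illustration. The sketch also assumes that a larger target value means a higher rank.

```python
import numpy as np

def score(features, weights):
    """Shared scorer applied to both members of a pair (the 'Siamese' part).

    A linear map stands in for the neural network; this is an assumption.
    """
    return features @ weights

def pairwise_logistic_loss(s_a, s_b, a_ranks_higher):
    """RankNet-style loss on the score difference (assumed loss, for illustration).

    Computes mean(log(1 + exp(-y * (s_a - s_b)))) with y = +1 when molecule a
    should rank above molecule b and y = -1 otherwise.
    """
    y = np.where(a_ranks_higher, 1.0, -1.0)
    # logaddexp(0, x) evaluates log(1 + exp(x)) without overflow
    return float(np.mean(np.logaddexp(0.0, -y * (s_a - s_b))))

def pair_accuracy(scores, targets):
    """Fraction of non-tied molecule pairs whose predicted order matches the true order."""
    i, j = np.triu_indices(len(scores), k=1)  # all unordered pairs (i < j)
    keep = targets[i] != targets[j]           # ignore pairs with tied targets
    agree = np.sign(scores[i] - scores[j]) == np.sign(targets[i] - targets[j])
    return float(np.mean(agree[keep]))
```

Because the loss depends only on score differences, a model trained this way can order molecules correctly without its scores matching the absolute property values, which is why an order-based measure such as `pair_accuracy` is the natural check for screening.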