Abstract
Accurate estimation of the strength of protein-ligand interactions is important in drug discovery. Binding strength can be measured with experimental binding affinity assays, but these are time-consuming, labor-intensive, and costly. Predicting the binding affinity or binding energy in silico is an alternative, particularly for virtual screening of large data sets. In general, distance-based terms such as electrostatic and van der Waals interactions are among the key determinants of binding energy. In this work, distance-binding energy relationships of the form $E \propto -d^{-k}$ are further explored, extended, and developed for protein-ligand binding affinity prediction, with the contributions of different atom-type pairs considered jointly. Contact number-energy relationships of the form $E \propto -n^{k}$ were also explored for the same purpose. Significantly, the power exponents of the distances or contact numbers in the energy functions are not restricted to those prescribed by the existing theories of van der Waals and electrostatic energies (expressed as $ar^{-6} - br^{-12}$ and $cr^{-1}$, respectively). The new distance-based and contact number-based models outperform previously developed, sophisticated non-machine-learning scoring functions. Exploring and extending the distance-energy and contact number-energy relationships may offer insights into the development of more effective methods for accurately predicting protein-ligand binding affinity and rationally analyzing protein-ligand interactions.
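As a rough illustration of the two functional forms named above, the following minimal Python sketch accumulates a distance term (the sum of d^(-k)) and a contact count n for each protein-ligand atom-type pair, then combines them into scores with a negative sign. Everything specific here is an assumption made for illustration and is not taken from this work: the function names, the use of element symbols as atom types, the 12 Å cutoff, the example exponents, and the per-pair weights, which would in practice be fitted to experimental affinities.

```python
import math
from collections import defaultdict


def pair_features(protein_atoms, ligand_atoms, k=2.0, cutoff=12.0):
    """Accumulate per-atom-type-pair terms for one complex.

    Each atom is a (type, (x, y, z)) tuple. Returns two dicts keyed by
    (protein_type, ligand_type): the sum of d**(-k) over atom pairs within
    the cutoff, and the number of such pairs (the contact number n).
    The exponent k and the cutoff are illustrative assumptions.
    """
    inv_dist = defaultdict(float)
    contacts = defaultdict(int)
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            d = math.dist(p_xyz, l_xyz)
            if 0.0 < d <= cutoff:
                key = (p_type, l_type)
                inv_dist[key] += d ** (-k)
                contacts[key] += 1
    return dict(inv_dist), dict(contacts)


def distance_score(inv_dist, weights):
    """Distance-based score: E = -sum_pairs w_pair * sum(d**(-k))."""
    return -sum(weights.get(key, 0.0) * v for key, v in inv_dist.items())


def contact_score(contacts, weights, k=1.0):
    """Contact-number-based score: E = -sum_pairs w_pair * n**k."""
    return -sum(weights.get(key, 0.0) * n ** k for key, n in contacts.items())


if __name__ == "__main__":
    # Toy coordinates in angstroms; not taken from any real structure.
    protein = [("C", (0.0, 0.0, 0.0)), ("N", (3.0, 0.0, 0.0))]
    ligand = [("O", (0.0, 0.0, 2.0)), ("C", (0.0, 4.0, 0.0))]
    inv_dist, contacts = pair_features(protein, ligand, k=2.0)
    unit_weights = {key: 1.0 for key in inv_dist}  # hypothetical weights
    print(distance_score(inv_dist, unit_weights))
    print(contact_score(contacts, unit_weights, k=1.0))
```

Because k is an ordinary parameter rather than a fixed 6, 12, or 1, the same code can be rerun over a range of exponents, which mirrors the idea that the exponent need not follow the classical van der Waals or Coulomb forms.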