Abstract
Predicting drug solubility is a fundamental challenge in pharmaceutical development, and traditional experimental methods are both time-intensive and resource-demanding. This study evaluates Graph Convolutional Networks (GCNs) for predicting drug solubility in binary solvent mixtures across a range of temperatures. We used a dataset of 27,000 solubility measurements covering 123 small-molecule solutes, 44 solvents, and 110 binary solvent combinations at temperatures of 273-373 K. Our GCN architecture incorporates multi-head attention mechanisms, hierarchical molecular representation learning, and pooling strategies designed to capture complex molecular interactions. The proposed GCN model achieved a mean absolute error (MAE) of 0.28 [Formula: see text] units, a 15% improvement over traditional machine learning approaches. Through ablation studies and attention visualization analyses, we show that GCNs are particularly effective at modeling structure-solubility relationships for pharmaceutically relevant compounds. Prospective validation on four drug molecules supported the model's predictive reliability, with experimental verification yielding MAE < 0.5 [Formula: see text] S for compounds structurally similar to the training data. These results establish GCNs as useful tools for accelerating pharmaceutical formulation development, potentially reducing experimental requirements by 60-80% while providing interpretable molecular insights through attention mechanisms. More broadly, the work shows that careful integration of established graph neural network techniques, optimized specifically for binary solvent systems, can substantially advance computational solubility prediction for pharmaceutical applications.
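For readers unfamiliar with graph convolution, the following is a minimal sketch of a single generic GCN layer followed by mean pooling, written in plain Python. It assumes the widely used symmetric-normalization propagation rule, ReLU(D^{-1/2} (A + I) D^{-1/2} H W), and is not the architecture evaluated in this study: it omits the multi-head attention and hierarchical pooling mentioned above. The toy graph, feature values, weights, and all function names are hypothetical and chosen only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each atom keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalized_adjacency(adj), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, v) for v in row] for row in transformed]


def mean_pool(node_embeddings):
    """Average node embeddings into one fixed-length graph-level vector."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]


# Hypothetical toy molecule: a three-atom chain with two features per atom.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0],
            [1.0, 1.0],
            [1.0, 0.0]]
weights = [[0.5, -1.0],
           [0.25, 1.0]]  # illustrative values; learned during training in practice

graph_vector = mean_pool(gcn_layer(adj, features, weights))
```

In a full solubility model, a pooled vector of this kind for the solute would typically be combined with solvent representations, mixture composition, and temperature before a regression head predicts solubility; how this study combines them is not specified in the abstract.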