Abstract
Machine learning (ML) has emerged as a powerful tool across multiple stages of breast cancer drug discovery, from target identification to compound prioritization and patient stratification. This article presents a narrative review of recent advances in ML-driven strategies applied to breast cancer drug discovery, with a focus on methods, data resources, and translational relevance. We synthesize representative studies employing supervised and unsupervised learning, deep neural networks, generative models, and multi-omics integration to address key challenges in breast cancer therapeutics. Particular attention is given to ML approaches for biomarker discovery, drug-target interaction prediction, molecular design, and drug response modeling across breast cancer subtypes. The review also summarizes widely used public datasets, including the genomic, transcriptomic, pharmacological, and chemical repositories that underpin these approaches. In addition, we discuss reported translational applications, emerging industrial efforts, and critical limitations related to data bias, model generalizability, and clinical applicability. Finally, we outline future directions for improving the robustness, interpretability, and clinical integration of ML-based drug discovery frameworks, with the aim of bridging the gap between computational prediction and clinically applicable breast cancer therapies.