Abstract
The design of selective kinase inhibitors remains a formidable challenge due to the high structural conservation of the ATP-binding site across the kinome. While modern generative AI has enabled rapid exploration of chemical space, many advanced models operate as black boxes, obscuring the chemical rationale behind their design choices and limiting interpretability. To examine these bottlenecks, we present a modular generative framework for de novo design of SRC kinase inhibitors that integrates ChemVAE-based latent space modeling, a chemically interpretable Kinase Inhibition Likelihood (KIL) scoring function, Bayesian optimization, and cluster-guided local neighborhood sampling. Our results show that kinase inhibitors spontaneously organize into a coherent, low-dimensional manifold in latent space, with SRC acting as a structural "hub" that enables rational scaffold transformation. Local neighborhood sampling successfully converts inhibitors from other kinase families (notably LCK) into novel SRC-like chemotypes, with LCK-derived molecules accounting for approximately 40% of high-similarity outputs. Critically, we expose a fundamental representation gap: although aromatic ring count is a top-ranked KIL feature, SMILES-based generation systematically fails to access the multi-ring pharmacophores characteristic of clinical kinase inhibitors. This limitation cannot be overcome by scoring refinement alone and instead calls for topology-aware representations. Our framework also demonstrates that unbiased exploration paired with cluster-guided sampling outperforms active-biased optimization, which traps the search in narrow local optima. By exposing representational gaps and showcasing scaffold-aware navigation of latent space, this study argues for hybrid systems that combine the diagnostic transparency of interpretable machine learning frameworks with the generative power of modern architectures.
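As an illustration only, one possible reading of cluster-guided local neighborhood sampling is sketched below. The abstract does not specify the procedure, so everything here is an assumption: the latent dimension, the Gaussian perturbation with width `sigma`, the random stand-ins for encoded molecules, and the rule of keeping only candidates that move closer to a target cluster centroid. The function names `sample_local_neighborhood` and `filter_toward_centroid` are hypothetical, and decoding latent points back to SMILES is not shown.

```python
import numpy as np


def sample_local_neighborhood(seed, n_samples=100, sigma=0.1, rng=None):
    """Draw latent points from an isotropic Gaussian centred on `seed`.

    Assumption: the neighborhood is Gaussian; the source does not say.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=(n_samples, seed.shape[0]))
    return seed + noise


def filter_toward_centroid(candidates, seed, centroid):
    """Keep candidates that lie closer to `centroid` than the seed does.

    Assumption: "cluster-guided" is modelled here as a simple distance filter.
    """
    seed_dist = np.linalg.norm(seed - centroid)
    cand_dist = np.linalg.norm(candidates - centroid, axis=1)
    return candidates[cand_dist < seed_dist]


# Hypothetical stand-ins for encoder outputs (random vectors, not real data).
rng = np.random.default_rng(0)
latent_dim = 8
seed = rng.normal(size=latent_dim)      # e.g. an encoded LCK inhibitor
centroid = rng.normal(size=latent_dim)  # e.g. centroid of an SRC cluster

candidates = sample_local_neighborhood(seed, n_samples=200, sigma=0.2, rng=rng)
kept = filter_toward_centroid(candidates, seed, centroid)
print(f"kept {kept.shape[0]} of {candidates.shape[0]} sampled latent points")
```

In a full pipeline, the retained latent points would be passed to the VAE decoder and the decoded molecules ranked with a scoring function such as KIL; both steps are omitted because their interfaces are not described here.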