Abstract
Compared with single-drug therapy, combination therapy uses two or more drugs together and can reduce drug dosage, decrease drug toxicity, and improve treatment efficacy. We developed an extreme gradient boosting (XGBoost)-based drug-drug cell line prediction model (XDDC) to predict synergistic drug combinations. XDDC was developed using NCI-ALMANAC, one of the largest drug combination datasets. In XDDC, drug chemical structures, adverse drug reactions, and target information served as drug features; gene expression, methylation, mutation, copy number variation, and RNA interference data served as cell line features; and pathway information was incorporated to link the drug features and the cell line features. XDDC improved the interpretability of drug combination features and outperformed other machine learning methods. In cross-validation on NCI-ALMANAC data, it achieved an area under the curve (AUC) of 0.966 ± 0.002 and an area under the precision-recall curve (AUPR) of 0.957 ± 0.002. Different types of omics data were evaluated and compared in the model. Some of our predictions were confirmed by the literature and by experimental verification. XDDC could help medical professionals rapidly screen synergistic drug combinations against specific cancer cell lines.
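To make the two reported metrics concrete, the following is a minimal sketch of how an AUC and an AUPR can be computed from predicted synergy scores and binary synergy labels. It assumes that AUC refers to the area under the receiver operating characteristic curve and that AUPR is estimated as average precision; the abstract does not state which estimators were used. The function names `roc_auc` and `average_precision` and the toy labels and scores are hypothetical and are not taken from XDDC or NCI-ALMANAC.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive example
    is scored above a random negative example (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, a step-wise estimate of the area under the
    precision-recall curve. This simple version assumes no tied scores."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


# Toy example (hypothetical values): 1 = synergistic, 0 = not synergistic.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_aupr = average_precision(toy_labels, toy_scores)
```

AUPR ignores true negatives, so it is usually the more sensitive of the two when synergistic combinations are a small fraction of all tested combinations, which is a common reason to report both values.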