Abstract
Assessing cytotoxicity towards human cells is a critical step in preclinical drug development. In preclinical toxicology, human cell lines allow both general and organ-specific toxicity to be analyzed, thereby helping to reduce development time and costs. Predicting cytotoxic IC50 and GI50 values facilitates the early evaluation of new pharmaceutical agents by allowing the possible therapeutic window to be assessed. Ten non-tumor and ten tumor cell lines commonly used in toxicology were selected to develop QSAR models using GUSAR software and ChEMBL data. GUSAR employs atom-centric electrotopological QNA and substructural MNA descriptors to encode molecular structure and uses the RBF-SCR algorithm to train QSAR models. The best-performing models (R² > 0.5, RMSE < 0.8; mean R² = 0.691, mean RMSE = 0.584) were selected using 5-fold cross-validation. These models were implemented in the freely available web application CLC-Pred 2.0 (Cell Line Cytotoxicity Predictor), which was initially developed for qualitative prediction of cytotoxicity in human cell lines.
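The selection criterion stated above (R² > 0.5 and RMSE < 0.8 under 5-fold cross-validation) can be illustrated with a minimal sketch. Everything here beyond the two thresholds and the number of folds is an assumption made for illustration: the data are synthetic, the learner is an ordinary least-squares line standing in for RBF-SCR (which is not implemented here), the metrics are computed on pooled out-of-fold predictions (the study may have averaged per-fold values instead), and the function names are hypothetical.

```python
import math
import random


def r2_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for paired observed and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


def fit_line(xs, ys):
    """Least-squares line: a stand-in learner, NOT the RBF-SCR algorithm."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def cross_validate(xs, ys, k=5, seed=0):
    """k-fold cross-validation; metrics use pooled out-of-fold predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all samples
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = model(xs[i])  # predict only samples unseen in training
    return r2_rmse(ys, preds)


def passes_threshold(r2, rmse, min_r2=0.5, max_rmse=0.8):
    """Selection rule from the abstract: R^2 > 0.5 and RMSE < 0.8."""
    return r2 > min_r2 and rmse < max_rmse


# Synthetic one-descriptor data set (assumption; not ChEMBL data).
rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.3) for x in xs]

r2, rmse = cross_validate(xs, ys, k=5)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}, selected = {passes_threshold(r2, rmse)}")
```

In this sketch a model is retained only if both conditions hold at once, so a model with a good R² but a large RMSE (or the reverse) would be discarded.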