Abstract
This study predicts the solubility of paracetamol and the density of the solvent from temperature (T) and pressure (P). The drug is produced by a supercritical technique, and the work is a theoretical investigation of drug solubility and solvent density in that process. Machine learning models with a two-input, two-output structure were developed and validated against experimental data on paracetamol solubility and solvent density. Four ensemble models with decision trees as base learners, namely Extra Trees (ETR), Random Forest (RFR), Gradient Boosting (GBR), and Quantile Gradient Boosting (QGB), were fitted to predict the two outputs. The hyper-parameters of the ensemble models and the parameters of the underlying decision trees were tuned with the WOA algorithm, separately for each output. For mole fraction (drug solubility), the Quantile Gradient Boosting model performed best, with an R² score of 0.985. For solvent density, the Extra Trees model performed best, with an R² of 0.997. The results are useful for evaluating the feasibility of the process for improving the efficacy of the drug, i.e., enhancing its bioavailability.
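The two-input, two-output setup described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name, and takes Quantile Gradient Boosting to mean gradient boosting with a quantile (median) loss. The data are synthetic placeholders with assumed T and P ranges, not the experimental measurements. WOA tuning is omitted, so the hyper-parameters are untuned defaults and the printed scores say nothing about the reported results.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic placeholder data (NOT the experimental measurements).
n_samples = 200
T = rng.uniform(310.0, 340.0, n_samples)  # temperature, assumed range
P = rng.uniform(100.0, 250.0, n_samples)  # pressure, assumed range
X = np.column_stack([T, P])               # two inputs: T and P

# Hypothetical smooth responses, for illustration only.
density = (900.0 + 1.5 * (P - 100.0) - 4.0 * (T - 310.0)
           + rng.normal(0.0, 5.0, n_samples))
solubility = (1e-3 * density + 0.02 * (T - 310.0)
              + rng.normal(0.0, 0.02, n_samples))
targets = {"solubility": solubility, "density": density}


def make_models():
    """Return fresh, untuned instances of the four tree-based ensembles."""
    return {
        "ETR": ExtraTreesRegressor(n_estimators=200, random_state=0),
        "RFR": RandomForestRegressor(n_estimators=200, random_state=0),
        "GBR": GradientBoostingRegressor(random_state=0),
        # Assumption: QGB is gradient boosting with a quantile loss (median).
        "QGB": GradientBoostingRegressor(loss="quantile", alpha=0.5,
                                         random_state=0),
    }


# Fit every model separately for each output and score it on held-out data.
scores = {}
for target_name, y in targets.items():
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    scores[target_name] = {}
    for model_name, model in make_models().items():
        model.fit(X_train, y_train)
        scores[target_name][model_name] = r2_score(y_test, model.predict(X_test))

for target_name, per_model in scores.items():
    for model_name, r2 in per_model.items():
        print(f"{target_name:10s} {model_name}: R2 = {r2:.3f}")
```

Fitting one model per output mirrors the statement that tuning was done separately for both outputs. A tuning step, whether WOA or any other optimizer, would wrap the `model.fit` call and search over the arguments passed in `make_models`.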