Abstract
Accurate outcome prediction often requires modeling interactions between input features and context-specific modifiers. The pliable lasso is a flexible regression framework that lets the feature coefficients vary as a function of such modifiers. In many real-world applications, however, the modifiers are unobserved at test time and must be estimated. This study investigates the performance of eight supervised machine learning algorithms for estimating the modifier matrix Z in a pliable lasso model under a known-to-unknown scenario, in which the modifiers are observed during training but not at test time. The analysis considers both classification performance for modifier estimation and regression accuracy for the final response prediction, using simulated data and two real-world datasets: the Superconductivity dataset and the Mice Protein Expression dataset. Results indicate that tree-based models (e.g., XGBoost, Random Forest, and Decision Tree) deliver superior modifier classification (AUC > 0.99), while regularized models such as Lasso and Elastic Net achieve the best regression performance. The findings support a hybrid modeling approach in which tree-based classifiers first estimate the modifying variables and regularized regression then produces accurate and interpretable predictions.