Abstract
Identifying important factors is a fundamental and technically demanding step with numerous applications in recent scientific research. This study focuses on improving factor selection techniques based on partial least squares for classification; to that end, traditional, recent, and proposed approaches are evaluated in terms of efficiency. All considered techniques are applied to a real data set of sexually transmitted infections (STIs) among men in Balochistan (Pakistan) using the Monte Carlo simulation method. The optimal model, selected by linear discriminant analysis and the area under the receiver operating characteristic curve (AUC-ROC), is used to determine the significant factors associated with STIs among men. The signal-to-noise ratio index, coupled with Yule's Q-partial least squares, is found to be the most accurate approach in terms of efficiency and frequency of the selected subset of factors. The suggested predictors provide important information about STIs and could be useful in related research. The findings also identify areas where further research is needed, such as understanding the drivers of STI transmission in rural areas using large data sets with multiple categories of STIs.