Abstract
INTRODUCTION: Substance use continues to evolve as a multidimensional public health challenge influenced by traditional behavioral triggers and emerging digital interactions. This study investigates how demographic factors, psychological states, and patterns of digital engagement shape substance use behaviors using multiple behavioral data sources.
METHODS: Quantitative analyses were conducted using the NHANES dataset and a Kaggle social media psychology dataset to identify statistical relationships and train predictive machine learning models for substance use indicators and digital behavioral patterns. Random Forest, XGBoost, AdaBoost, Support Vector Regression (SVR), and Logistic Regression models were evaluated, with hyperparameter tuning applied to improve predictive performance. In addition, a supplementary survey (N = 236) was collected and used as a qualitative interpretive layer to contextualize the relationship between digital behavior and substance use risk.
RESULTS: The analysis revealed nonlinear relationships between social media engagement, anxiety, and loneliness. Contrary to the widely cited linear dose-response assumption, anxiety scores plateaued at higher levels of digital engagement, suggesting that the qualitative nature of online interactions may exert greater influence on psychological distress than usage duration alone. Machine learning models demonstrated improved predictive performance after hyperparameter tuning across both datasets.
DISCUSSION: These findings highlight the importance of considering digital engagement patterns alongside traditional behavioral and demographic factors in substance use research. The results support the development of platform-specific digital well-being strategies, nuanced behavioral modeling approaches, and culturally sensitive interventions that integrate both objective behavioral data and subjective user experiences. The proposed multi-source evidence framework provides a foundation for future exploratory behavioral risk profiling and prevention systems.
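As an illustration of the plateau finding reported under RESULTS, the following minimal sketch shows one way a linear dose-response model could be compared against a plateau (capped) model. It is not the analysis pipeline used in this study: the data are synthetic and noise-free, and the names `ols_sse`, `plateau_fit`, `hours`, and `anxiety` are hypothetical, introduced only for this example.

```python
def ols_sse(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def plateau_fit(xs, ys, knots):
    """Fit y = a + b*min(x, knot) for each candidate knot.

    Returns (knot, a, b, sse) for the knot with the smallest squared error.
    """
    best = None
    lowest_x = min(xs)
    for k in knots:
        if k <= lowest_x:
            continue  # capped predictor would be constant, so no slope exists
        capped = [min(x, k) for x in xs]
        a, b, sse = ols_sse(capped, ys)
        if best is None or sse < best[3]:
            best = (k, a, b, sse)
    return best


# Synthetic, illustrative data (NOT from NHANES, Kaggle, or the survey):
# the score rises with daily hours of use up to 4 hours, then stays flat.
hours = list(range(11))
anxiety = [2.0 + 1.5 * min(h, 4) for h in hours]

_, _, linear_sse = ols_sse(hours, anxiety)
best_knot, _, _, plateau_sse = plateau_fit(hours, anxiety, knots=range(1, 10))

print("linear SSE:", linear_sse)
print("plateau SSE:", plateau_sse, "at knot", best_knot)
```

On data of this shape, the capped model recovers the saturation point and leaves far less residual error than the straight line, which is the pattern a plateau in anxiety scores would produce. With real, noisy data the comparison would need a penalty for the extra knot parameter or out-of-sample validation.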