Abstract
Social media behavior is a promising source of early indicators of psychological distress; however, predictive models often lack transparency, limiting their adoption in mental health settings. This paper describes an explainable machine learning framework for predicting self-reported depression risk from behavioral features collected from 481 anonymized social media users. Three supervised learning models were evaluated under a nested 5 × 5 cross-validation strategy, with Random Forest yielding the strongest performance (accuracy = 84.2%, AUC = 0.88). Calibration analysis using reliability curves and Expected Calibration Error (ECE) showed that Random Forest produces well-calibrated probability estimates suitable for binary High/Low risk assessment. Explainability was integrated using SHAP (SHapley Additive exPlanations) to identify key behavioral markers, including screen time, passive scrolling, nighttime usage, and stress-driven engagement. Stability testing across multiple random seeds revealed consistent feature rankings, supporting the reliability of the explanations. To demonstrate real-world applicability, we outline a prototype XAI-driven digital intervention workflow and present a simulation across representative user profiles, illustrating how interpreted model outputs can inform personalized behavioral recommendations. Generalizability is limited by the moderately sized dataset, the reliance on self-reported measures, and the cross-sectional design. Future work will incorporate multimodal behavioral signals, larger cohorts, and clinically validated mental-health assessments. Overall, the study presents a transparent, computationally grounded approach to interpretable depression-risk prediction from social media behavior, bridging the gap between predictive performance and practical explainability.
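For readers unfamiliar with the calibration metric named above, the following is a minimal sketch of Expected Calibration Error (ECE) with equal-width probability bins; the paper does not specify its binning scheme, so the bin count and the example inputs here are illustrative assumptions, not the study's actual configuration.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """ECE = sum over bins b of (|B_b| / N) * |accuracy(B_b) - confidence(B_b)|,
    using equal-width bins over the predicted probability of the positive class."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        conf = sum(p for p, _ in b) / len(b)  # mean predicted probability in bin
        acc = sum(y for _, y in b) / len(b)   # empirical positive rate in bin
        ece += (len(b) / n) * abs(acc - conf)
    return ece

# Illustrative toy inputs: 10 predictions at 0.8 with 8 positives is perfectly
# calibrated (ECE = 0); 10 predictions at 0.9 with 5 positives yields ECE = 0.4.
print(expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2))
print(expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5))
```

A low ECE, as reported for the Random Forest model, means the predicted probabilities can be read roughly at face value when thresholding into High/Low risk groups.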