Abstract
OBJECTIVE: To develop and validate a hierarchical machine learning model that integrates static clinical features with dynamic behavioral assessments to predict violent behavior among hospitalized patients with schizophrenia. METHODS: This retrospective study included 346 patients with schizophrenia hospitalized in Liaoning Province between July 2021 and July 2024. Patients were categorized into violent (n = 123) and non-violent (n = 223) groups based on documented aggressive incidents. Eighteen static clinical variables (e.g., age, gender, history of violence, manic symptoms) were extracted from electronic medical records, and 39 dynamic behavioral indicators (e.g., anger expression, insomnia, auditory hallucinations) were assessed weekly using the Psychiatric Patient Nursing Observation Scale. Six machine learning algorithms were compared separately on the static and dynamic feature sets: regularized logistic regression (LR), support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), multi-layer perceptron (MLP), and k-nearest neighbor (KNN). Regularized logistic regression achieved the highest area under the curve (AUC) in both the static baseline and dynamic behavioral models and was therefore selected as the final algorithm. A hierarchical model was then built by fitting separate regularized logistic regression models for static baseline risk and dynamic risk fluctuations and combining their outputs through weighted fusion. RESULTS: The integrated hierarchical model achieved an AUC of 0.8741, surpassing both the static baseline model (AUC = 0.7953) and the dynamic model (AUC = 0.8003) alone. This performance was obtained with a fusion parameter (α) of 0.37, which yielded a sensitivity of 0.7838, a specificity of 0.8358, and an accuracy of 0.8173.
Key independent predictors included the static factors history of violence (odds ratio [OR] = 4.638), manic symptoms (OR = 7.801), younger age (OR for age = 0.966), and high-risk command hallucinations (OR = 2.602), and the dynamic features anger expression (OR = 4.649), insomnia (OR = 7.422), and auditory hallucinations (OR = 2.092). CONCLUSION: A hierarchical machine learning model that integrates clinical history with dynamic nursing observations improves the prediction of violent behavior in schizophrenia inpatients over either component alone, giving clinicians a tool for timely risk assessment and personalized preventive interventions.
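
The weighted fusion step described in METHODS can be illustrated with a minimal sketch. The abstract does not state the exact fusion formula or the criterion used to select α, so the sketch assumes a convex combination in which α weights the static probability and (1 − α) weights the dynamic probability, and it assumes that α is chosen by a grid search maximizing AUC. The function names `fuse_risk`, `auc`, and `best_alpha` and the toy probabilities are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def fuse_risk(p_static: float, p_dynamic: float, alpha: float) -> float:
    """Combine two predicted probabilities as a convex combination.

    Assumption (not stated in the abstract): alpha weights the static
    baseline model and (1 - alpha) weights the dynamic model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * p_static + (1.0 - alpha) * p_dynamic


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def best_alpha(
    p_static: Sequence[float],
    p_dynamic: Sequence[float],
    labels: Sequence[int],
    n_steps: int = 100,
) -> Tuple[float, float]:
    """Grid-search alpha over [0, 1] and return (alpha, AUC) with the highest AUC.

    Assumption: AUC is the selection criterion; the study may have used another.
    """
    best = (0.0, -1.0)
    for i in range(n_steps + 1):
        alpha = i / n_steps
        fused: List[float] = [
            fuse_risk(s, d, alpha) for s, d in zip(p_static, p_dynamic)
        ]
        score = auc(fused, labels)
        if score > best[1]:
            best = (alpha, score)
    return best


if __name__ == "__main__":
    # Toy values for illustration only; these are not study data.
    toy_static = [0.2, 0.7, 0.4, 0.9, 0.3, 0.6]
    toy_dynamic = [0.3, 0.4, 0.8, 0.7, 0.1, 0.2]
    toy_labels = [0, 1, 1, 1, 0, 0]
    print(best_alpha(toy_static, toy_dynamic, toy_labels))
```

Because the grid includes α = 0 and α = 1, the selected fusion can never score below either single model on the data used for the search. An unbiased comparison would therefore require selecting α on held-out or cross-validated predictions.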