Abstract
OBJECTIVE: Rural Hispanic Veterans experience higher suicide rates than their urban counterparts, yet the differences between these groups remain poorly understood. This study evaluates a rurality-stratified sample of Hispanic Veterans Affairs (VA) patients, leveraging unstructured electronic health record (EHR) data to refine population-specific suicide risk prediction metrics. METHOD: The study used a dataset of rural and urban Hispanic VA patients that included all suicide decedents from 2015-2018 (cases). Each case was matched with four living patients who shared demographics and treatment year (controls). After all unstructured EHR text data were extracted and preprocessed, the corpus was analyzed using a semantic analysis package with 500+ variables. Least Absolute Shrinkage and Selection Operator (LASSO) and logistic regression were used to develop prediction models, and the area under the receiver operating characteristic curve (AUC) was used to examine the models' predictive accuracy. RESULTS: The final datasets included 39 rural cases and 148 controls, alongside 273 urban cases and 1090 controls. The predictive models offered considerable accuracy (rural AUC = 0.86; urban AUC = 0.67). While rural models emphasized dislocation from community and communal resources, urban models emphasized alienation and identity challenges. CONCLUSIONS: This study enhances understanding of rural and urban Hispanic suicide decedents and could inform suicide prediction and preventive services.
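
As a reading aid for the AUC values reported above, the following minimal sketch shows one common way to compute AUC: the proportion of case-control pairs in which the case receives the higher predicted risk, with ties counted as half. The function name `auc` and the toy scores and labels are illustrative assumptions, not the study's code or data.

```python
def auc(scores, labels):
    """Pairwise AUC: share of (case, control) pairs where the case scores higher.

    Ties contribute 0.5. `labels` uses 1 for cases and 0 for controls.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 2 cases, each followed by 4 controls (1:4 ratio).
labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.2, 0.4, 0.1, 0.3, 0.35, 0.5, 0.2, 0.1, 0.3]

print(auc(scores, labels))  # 0.875
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to every case being ranked above every control.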