Household Clustering of High-Risk Contacts in Smear-Positive TB Patient Families: Evidence for Hotspot Households and Risk Stratification in Rural Eastern Cape

Abstract

BACKGROUND: Household contacts of smear-positive tuberculosis (TB) patients face an elevated risk of infection and disease progression, particularly young children and individuals living in overcrowded households. Despite WHO recommendations for systematic contact screening and provision of TB preventive therapy (TPT), implementation remains suboptimal in high-burden rural areas. This study aimed to develop a practical framework for identifying and prioritizing high-risk families by examining demographic predictors, household clustering, and machine learning-based risk models.

METHODS: A total of 437 household contacts linked to smear-positive index cases were assessed and classified as high or low risk. Statistical analyses included descriptive measures, χ² tests, Z-tests for age-group differences, and multivariable logistic regression. Household-level vulnerability patterns were explored using network visualizations, clustered heatmaps, and risk-ranking charts. Three machine learning models (logistic regression, random forest, and gradient boosting) were trained on demographic and household variables with 5-fold cross-validation and an 80/20 hold-out test split. Model performance was evaluated using AUROC, AUPRC, accuracy, F1-score, calibration curves, and decision curve analysis.

RESULTS: Of the 437 contacts, 290 (66.4%) were classified as high risk. Younger age was strongly associated with high-risk status (χ² = 16.61, p = 0.005), with children aged 0-4 years significantly more likely to fall in the high-risk category (Z = 2.706). Gender showed no significant association (p = 0.523). Logistic regression identified younger age (aOR = 2.41, 95% CI: 1.48-3.94) and larger household size (aOR = 1.12 per additional member, 95% CI: 1.01-1.25) as independent predictors of high-risk status. Visual analytics revealed apparent clustering of high-risk individuals within "hotspot families," enabling prioritization through composite risk scores. Gradient boosting achieved the strongest performance (AUROC = 0.65; AUPRC = 0.76), with acceptable calibration (Brier score = 0.21) and a positive net clinical benefit in the decision curve analysis.

CONCLUSIONS: TB risk is highly clustered at the household level, with large families and young children carrying disproportionate vulnerability. Combining demographic risk assessment, household-level visualization, and predictive modeling provides a practical, data-driven approach to prioritizing households during contact investigation. These findings support the WHO's family-centered strategy and underscore the need to strengthen clinical governance and community-engaged education to optimize TB prevention in resource-limited rural settings.
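
The abstract reports a Brier score and a decision curve analysis without defining either. The sketch below shows the standard definitions of both in plain Python; it is an illustration, not the authors' code. The function names and the toy labels and probabilities are hypothetical. Only the counts 437 and 290 come from the abstract, and the prevalence in the study's hold-out split may differ from the overall figure used here.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging every contact whose predicted risk is >= threshold.

    Standard decision-curve definition: TP/n - FP/n * threshold / (1 - threshold).
    """
    n = len(y_true)
    flagged = [(y, p) for y, p in zip(y_true, y_prob) if p >= threshold]
    tp = sum(1 for y, _ in flagged if y == 1)
    fp = sum(1 for y, _ in flagged if y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of flagging everyone, the usual comparison curve."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Counts reported in the abstract.
N_CONTACTS = 437
N_HIGH_RISK = 290
prevalence = N_HIGH_RISK / N_CONTACTS

# Reference point: the Brier score of a model that predicts the overall
# prevalence for every contact (assumes hold-out prevalence is similar).
constant_model_brier = prevalence * (1 - prevalence)

# Hypothetical toy data, NOT from the study: 1 = high risk, 0 = low risk.
toy_labels = [1, 1, 1, 0, 1, 0, 0, 1]
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.55]

toy_brier = brier_score(toy_labels, toy_probs)
toy_nb_model = net_benefit(toy_labels, toy_probs, threshold=0.5)
toy_nb_all = net_benefit_treat_all(sum(toy_labels) / len(toy_labels), threshold=0.5)

if __name__ == "__main__":
    print(f"overall prevalence of high-risk status: {prevalence:.3f}")
    print(f"Brier score of a constant-prevalence prediction: {constant_model_brier:.3f}")
    print(f"toy Brier score: {toy_brier:.3f}")
    print(f"toy net benefit at 0.5 (model vs. flag-all): {toy_nb_model:.3f} vs. {toy_nb_all:.3f}")
```

A model shows positive net clinical benefit at a given threshold when its net benefit exceeds both zero (flag no one) and the flag-all value. The abstract does not state which thresholds were examined.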
