Forecasting incidence of infectious diarrhea using random forest in Jiangsu Province, China

利用随机森林算法预测中国江苏省感染性腹泻的发病率

阅读:1

Abstract

BACKGROUND: Infectious diarrhea can lead to a considerable global disease burden. Thus, the accurate prediction of an infectious diarrhea epidemic is crucial for public health authorities. This study was aimed at developing an optimal random forest (RF) model, considering meteorological factors used to predict an incidence of infectious diarrhea in Jiangsu Province, China. METHODS: An RF model was developed and compared with classical autoregressive integrated moving average (ARIMA)/X models. Morbidity and meteorological data from 2012 to 2016 were used to construct the models and the data from 2017 were used for testing. RESULTS: The RF model considered atmospheric pressure, precipitation, relative humidity, and their lagged terms, as well as 1-4 week lag morbidity and time variable as the predictors. Meanwhile, a univariate model ARIMA (1,0,1)(1,0,0)(52) (AIC = - 575.92, BIC = - 558.14) and a multivariable model ARIMAX (1,0,1)(1,0,0)(52) with 0-1 week lag precipitation (AIC = - 578.58, BIC = - 578.13) were developed as benchmarks. The RF model outperformed the ARIMA/X models with a mean absolute percentage error (MAPE) of approximately 20%. The performance of the ARIMAX model was comparable to that of the ARIMA model with a MAPE reaching approximately 30%. CONCLUSIONS: The RF model fitted the dynamic nature of an infectious diarrhea epidemic well and delivered an ideal prediction accuracy. It comprehensively combined the synchronous and lagged effects of meteorological factors; it also integrated the autocorrelation and seasonality of the morbidity. The RF model can be used to predict the epidemic level and has a high potential for practical implementation.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。