Wastewater-based prediction of COVID-19 cases using a random forest algorithm with strain prevalence data: A case study of five municipalities in Latvia

Authors: Brigita Dejus, Pāvels Cacivkins, Dita Gudra, Sandis Dejus, Maija Ustinova, Ance Roga, Martins Strods, Juris Kibilds, Guntis Boikmanis, Karina Ortlova, Laura Krivko, Liga Birzniece, Edmunds Skinderskis, Aivars Berzins, Davids Fridmanis, Talis Juhna

Abstract

Wastewater-based epidemiology (WBE) is a rapid and cost-effective method that detects SARS-CoV-2 genomic components in wastewater and can provide an early warning of possible COVID-19 outbreaks one to two weeks in advance. However, the quantitative relationship between the intensity of the epidemic and the possible progression of the pandemic is still unclear and requires further research. This study investigates the use of WBE to rapidly monitor SARS-CoV-2 at five municipal wastewater treatment plants in Latvia and to forecast cumulative COVID-19 cases two weeks in advance. For this purpose, real-time quantitative PCR was used to monitor the SARS-CoV-2 nucleocapsid 1 (N1), nucleocapsid 2 (N2), and E genes in municipal wastewater. The RNA signals in the wastewater were compared with the reported COVID-19 cases, and SARS-CoV-2 strain prevalence was determined by targeted next-generation sequencing of the receptor binding domain (RBD) and furin cleavage site (FCS) regions. A linear model and a random forest model were built to relate the cumulative cases to the strain prevalence data and the RNA concentration in the wastewater, with the aim of predicting COVID-19 outbreaks and their scale. The factors that affect prediction accuracy were also investigated and compared between the linear and random forest models. Cross-validated model metrics showed that the random forest model is more effective at predicting cumulative COVID-19 cases two weeks in advance when strain prevalence data are included. The results of this research help inform WBE and public health recommendations by providing insights into the impact of environmental exposures on health outcomes.
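The two-week-ahead setup described in the abstract can be illustrated with a minimal sketch of how such a dataset might be arranged before any model is fitted. This is not the authors' pipeline: the field names (`n1_copies_per_l`, `variant_share`, and so on), the weekly sampling interval, the helper `make_lead_dataset`, and all numbers are assumptions invented for illustration. The sketch only shows the alignment step, in which wastewater and strain prevalence features observed in one week are paired with the cumulative cases reported two weeks later.

```python
from typing import Dict, List, Tuple

# Hypothetical feature names; the abstract names the N1, N2 and E gene targets
# and strain prevalence, but not how the authors encoded them.
FEATURES = ["n1_copies_per_l", "n2_copies_per_l", "e_copies_per_l", "variant_share"]


def make_lead_dataset(
    weeks: List[Dict[str, float]], horizon: int = 2
) -> Tuple[List[List[float]], List[float]]:
    """Pair features observed in week t with cumulative cases in week t + horizon."""
    X: List[List[float]] = []
    y: List[float] = []
    # The last `horizon` weeks have no future target yet, so they are skipped.
    for t in range(len(weeks) - horizon):
        X.append([weeks[t][name] for name in FEATURES])
        y.append(weeks[t + horizon]["cumulative_cases"])
    return X, y


# Invented weekly records for a single municipality (assumed weekly sampling).
weeks = [
    {"n1_copies_per_l": 10000.0, "n2_copies_per_l": 9000.0, "e_copies_per_l": 8000.0,
     "variant_share": 0.10, "cumulative_cases": 120.0},
    {"n1_copies_per_l": 15000.0, "n2_copies_per_l": 14000.0, "e_copies_per_l": 12000.0,
     "variant_share": 0.25, "cumulative_cases": 180.0},
    {"n1_copies_per_l": 22000.0, "n2_copies_per_l": 20000.0, "e_copies_per_l": 18000.0,
     "variant_share": 0.45, "cumulative_cases": 290.0},
    {"n1_copies_per_l": 30000.0, "n2_copies_per_l": 28000.0, "e_copies_per_l": 25000.0,
     "variant_share": 0.70, "cumulative_cases": 470.0},
    {"n1_copies_per_l": 26000.0, "n2_copies_per_l": 25000.0, "e_copies_per_l": 21000.0,
     "variant_share": 0.85, "cumulative_cases": 690.0},
]

X, y = make_lead_dataset(weeks, horizon=2)
for features, target in zip(X, y):
    print(features, "->", target)
```

Rows arranged this way could then be passed to a linear model and a random forest regressor and compared with cross-validated error metrics, which is the kind of comparison the abstract reports. Dropping the `variant_share` column from `FEATURES` would give the corresponding comparison without strain prevalence data.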
