A Deep Learning Framework for Using Search Engine Data to Predict Influenza-Like Illness and Distinguish Epidemic and Nonepidemic Seasons: Multifeature Time Series Analysis



Authors: Li, Ji; Yan, Xiangyu; Chu, Xingjie; Zhang, Ying; Liu, Guoliang; Li, Lin; Li, Yue; Dong, Xiaochun; Mei, Zihan; Liu, Zhengkun; Yuan, Jinyue; Sun, Xiaohan; Cao, Chunxia

Journal:	Journal of Medical Internet Research	Impact factor:	6.000
Year:	2025	Volume/pages:	2025 Aug 11;27:e71786
DOI:	10.2196/71786	Research area:	Microbiology
Disease type:	Influenza

Abstract

BACKGROUND: Seasonal influenza epidemics pose a persistent and severe threat to global public health. Web-based search data are recognized as a valuable source for forecasting epidemics of influenza and other respiratory tract infections. Current influenza prediction studies typically focus on seasonal trends in traditional surveillance data and neglect how sensitive different web-based search terms are to seasonal changes, which makes prediction harder.

OBJECTIVE: This study aimed to propose a deep learning framework for predicting influenza under different epidemic states, based on the Baidu index and the percentage of influenza-like illness (ILI%).

METHODS: Official weekly ILI% data from 2013 to 2024 were extracted from the Chinese National Notifiable Infectious Disease Reporting System (NIDRIS). Influenza-related search indexes for the corresponding periods were obtained from the Baidu index. A cross-correlation analysis was conducted to explore the association between influenza-related search queries and ILI%. The study period was divided into influenza epidemic and nonepidemic periods. A convolutional long short-term memory (CLSTM) network framework was then used to predict influenza epidemics 1-3 weeks ahead, both for the all-time period and for the epidemic + nonepidemic periods. The evaluation metrics included a model stability metric, accuracy metrics, and an explanatory power metric.

RESULTS: ILI% in China showed regular seasonal peaks. Predicting ILI% after dividing the data into epidemic and nonepidemic seasons (mean absolute percentage error [MAPE]=10.730%, mean square error [MSE]=0.884, mean absolute error [MAE]=0.649, root-mean-square error [RMSE]=0.940, and R2=0.877) performed better than predicting over the all-time period (MAPE=12.784%, MSE=1.513, MAE=0.744, RMSE=1.230, and R2=0.786). In addition, ILI% combined with the Baidu search index predicted better than ILI% alone, regardless of the time period and lag time studied. Comparative analysis with long short-term memory (LSTM) and transformer models showed that CLSTM achieved superior performance in 1-week-ahead ILI% predictions using ILI% + Baidu index data in the epidemic + nonepidemic periods (MAPE=11.824%, MSE=1.243, MAE=0.723, RMSE=1.115, and R2=0.827). Furthermore, CLSTM surpassed LSTM in computational efficiency, complexity, extrapolation capability, and stability, and partially outperformed transformer models.

CONCLUSIONS: Combining Baidu index data with traditional surveillance data, and using specific keywords for epidemic and nonepidemic seasons, shows strong potential for influenza prediction. The approach provides a new perspective for public health preparedness and is expected to support early warning systems for influenza and other diseases. Future work will further optimize these models for more timely and accurate predictions, enhancing public health responses.
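The abstract names a cross-correlation analysis and a set of accuracy metrics but does not spell out how either was computed. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`pearson`, `lagged_correlations`, `forecast_metrics`) are hypothetical, the cross-correlation is assumed to be a Pearson correlation between the search index and ILI% shifted by 0 to 3 weeks, R2 is assumed to be the usual coefficient of determination, and the toy series are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlations(search, ili, max_lag=3):
    """Correlate search[t] with ili[t + lag] for lag = 0..max_lag weeks."""
    result = {}
    for lag in range(max_lag + 1):
        earlier = search[: len(search) - lag]  # search index in week t
        later = ili[lag:]                      # ILI% in week t + lag
        result[lag] = pearson(earlier, later)
    return result


def forecast_metrics(actual, predicted):
    """MAPE (%), MSE, MAE, RMSE, and R2 for paired weekly values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "MSE": mse,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(mse),
        "R2": 1 - (mse * n) / ss_tot,  # assumed definition: 1 - SS_res / SS_tot
    }


# Toy data, invented for illustration: the search series is the ILI% series
# shifted one week earlier, so the correlation should peak at lag 1.
ili = [2.0, 2.5, 3.0, 4.0, 5.0, 4.5, 3.5, 3.0]
search = [2.5, 3.0, 4.0, 5.0, 4.5, 3.5, 3.0, 2.8]

correlations = lagged_correlations(search, ili)
best_lag = max(correlations, key=correlations.get)
metrics = forecast_metrics([2.0, 4.0], [1.0, 5.0])
```

In this sketch RMSE is the square root of MSE, which agrees with the reported figures (for example, the square root of 0.884 is approximately 0.940). The lag with the highest correlation indicates how many weeks a search term leads ILI%, which is the kind of signal a 1-3 week-ahead forecast would rely on.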
