Rapid Epidemiological Data Collection on Social Media for COVID-19: Comparative Study Between Online Surveys and Conventional Cohorts

Abstract

BACKGROUND: After COVID-19 was declared a pandemic by the World Health Organization (WHO) in March 2020, global responses relied on nonpharmaceutical interventions such as physical distancing and mask mandates. These measures were guided by mathematical models built on empirical data. Although traditional methods such as surveys and observational studies provide high-quality data, they are often slow and resource-intensive. Social media polls (SMPs) offer a faster, more cost-effective alternative.

OBJECTIVE: This study aims to evaluate the reliability and biases of SMPs as a rapid supplementary tool for epidemiological data collection and to compare their representativeness and data quality with conventional approaches.

METHODS: In this cross-sectional observational study in Germany, we used SMPs to collect data on infections and demographic attributes via Twitter and Mastodon. We collected data directly on the social media platforms as well as through forwarding to an external survey via post. The time frame covered was from 2019 to 2024. Data were analyzed for infection rates, sociodemographic representativeness, and overall data quality.

RESULTS: SMPs demonstrated viability as a rapid data collection tool. Based on a sample of 6127 answers on social media and 867 responses from the external survey, the self-reported frequency of infection aligned well with conventional sources. Across all 4 studies, approximately one-third of respondents reported having never been infected, half reported having had 1 infection, and one-sixth reported having had 2 or more infections. Statistical analyses of differences between data from Twitter, Mastodon, the external survey, and conventional data showed only small effect sizes (Cohen w=0.105-0.188). Spearman rank correlation demonstrated strong positive associations between infection dates in the external survey and conventional data (ρ=0.883, P<.001), as well as between the external survey and the Robert Koch Institute (ρ=0.640, P<.001). However, demographic analyses revealed biases in the external survey. By design, SMPs do not provide detailed demographic data, limiting options for subgroup analyses.

CONCLUSIONS: We found SMPs to be a practical and cost-effective method for quickly gathering epidemiological insights. In particular, self-reported infection frequency can aid during periods of high availability of self-testing during epidemics. We demonstrate that, even with a nonrepresentative and biased sample, we were able to closely match infection numbers with Multilocal and Serial Prevalence Study of Antibodies Against Respiratory Infectious Diseases in Germany data and produce incidence trends comparable to those in official Robert Koch Institute data. One can argue that SMPs alone are insufficient for public health modeling, as they do not allow real-time monitoring of, for example, population infection rates based on serological data. They are also limited with regard to inherent demographic bias related to recruitment and the inability to collect individual-level covariates. However, they can complement traditional approaches by offering rapid, low-cost insights.
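
For readers unfamiliar with the effect size cited in the results, the sketch below shows how Cohen w is commonly computed when comparing two categorical distributions, here the share of respondents reporting 0, 1, or 2 or more infections. It is an illustration only and is not taken from the study: the function names `cohen_w` and `to_proportions` are hypothetical, the survey shares are the rounded fractions quoted in the abstract, and the reference shares are made-up placeholder values, not data from any cohort.

```python
import math


def to_proportions(counts):
    """Convert raw category counts into proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def cohen_w(observed, expected):
    """Cohen's w: sqrt of the sum of (p_obs - p_exp)^2 / p_exp over categories.

    Both arguments are lists of proportions over the same categories.
    """
    if len(observed) != len(expected):
        raise ValueError("both distributions need the same number of categories")
    if any(p <= 0 for p in expected):
        raise ValueError("expected proportions must be strictly positive")
    return math.sqrt(
        sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    )


if __name__ == "__main__":
    # Rounded shares from the abstract: never infected, 1 infection, 2 or more.
    survey_shares = [1 / 3, 1 / 2, 1 / 6]
    # ASSUMPTION: placeholder reference shares, invented for illustration only.
    reference_shares = [0.30, 0.50, 0.20]
    print(f"Cohen w = {cohen_w(survey_shares, reference_shares):.3f}")
```

By Cohen's conventional benchmarks, w of about 0.1 is a small effect, 0.3 medium, and 0.5 large, which is why the reported range of 0.105 to 0.188 is described as small.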
