Detecting Adverse Drug Events in Social Media: A Brief Literature Review



Authors: Guellil, Imane; Berrachedi, Yousra; Chenni, Nidhal Eddine; Abboud, Massi-Nissa; Wu, Jinge; Wu, Honghan; Alex, Beatrice

Year: 2026
Citation: 2026;7(2):199
DOI: 10.1007/s42979-026-04752-9

Abstract

Adverse drug events (ADEs) remain a significant burden to public health and a persistent challenge for pharmacovigilance. The proliferation of patient-generated discourse on social media offers a complementary, real-time signal for ADE surveillance. This article provides a concise yet comprehensive review of recent natural language processing (NLP) research on identifying ADEs in social media text. We systematically reviewed 100 peer-reviewed studies (2017-2025) on NLP/AI for detecting or analysing ADEs in social media. Searches in Google Scholar targeted English-language journal and conference papers; patents and protocols were excluded. Of 130 records screened, 6 were protocols and 24 were excluded because the full text could not be located or the item was a conference abstract lacking methodological detail (i.e., no description of approaches or experiments), yielding a final sample of 100 studies. One reviewer performed screening, with full-text eligibility verified by a second. We extracted objectives, data sources/languages, preprocessing and annotation practices, datasets, model families, evaluation metrics, and stated limitations. Studies were grouped into five task categories (classification, extraction, normalization, corpus creation, and broader analytical work), with evidence tables summarizing contributions, toolchains, datasets, and performance. Recurrent challenges include noisy/imbalanced data, multilingual and code-mixed content, and variability in annotation standards. Twitter remains the primary data source: 60% of studies analyse Twitter alone and a further 18% combine Twitter with other platforms (78% in total). English overwhelmingly dominates; only about 5% of studies draw on non-English sources (e.g., French, Chinese, Arabic). Standard pre-processing (URL removal, tokenisation, and lowercasing) is near-universal.
Transformer-based models predominate, with BERT and its biomedical or "tweet" variants (e.g., RoBERTa, BioBERT, BERTweet) used in more than 60% of approaches. Persistent obstacles include severe class imbalance and ambiguous or implicit drug-event expressions. Although shared tasks such as SMM4H provide widely used benchmarks, comprehensive annotation guidelines remain uncommon (12% of papers). Recent work increasingly incorporates multimodal inputs and integrates structured biomedical knowledge, yet gaps persist in multilingual coverage, temporal/longitudinal modelling, and real-world deployment. To our knowledge, this is the first review to synthesise findings from a corpus of 100 peer-reviewed studies on ADE detection in social media using NLP. By organising the literature by task type and tracing methodological trends and limitations, it provides practical guidance for researchers and practitioners. The review also outlines actionable directions for future work, including model explainability, support for low-resource languages, and closer collaboration with regulatory authorities to enable real-world deployment.
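
As an illustration of the standard pre-processing steps named in the abstract (URL removal, lowercasing, tokenisation), the following is a minimal Python sketch. It is not taken from any of the reviewed studies: the function name `preprocess`, the two regular expressions, and the example post are assumptions made here for illustration, and real pipelines typically use a dedicated tweet tokeniser.

```python
import re
from typing import List

# Hypothetical patterns chosen for this sketch, not from the reviewed papers.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# A token is a word, optionally prefixed by @ or # and optionally
# containing one apostrophe (e.g. "don't").
TOKEN_PATTERN = re.compile(r"[@#]?\w+(?:'\w+)?")


def preprocess(post: str) -> List[str]:
    """Apply URL removal, lowercasing, and simple regex tokenisation."""
    text = URL_PATTERN.sub(" ", post)   # 1. strip URLs
    text = text.lower()                 # 2. lowercase
    return TOKEN_PATTERN.findall(text)  # 3. tokenise


# Invented example post, for illustration only.
example = "Started Lipitor last week and my legs ache SO much https://t.co/abc123 #statins"
tokens = preprocess(example)
print(tokens)
```

Keeping the `#` and `@` prefixes attached to their tokens is a design choice in this sketch: hashtags and mentions often carry drug or condition names in social media text, so discarding the prefix would lose a signal that downstream models can use.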
