The Use of Natural Language Processing Methods in Reddit to Investigate Opioid Use: Scoping Review

Abstract

BACKGROUND: The growing availability of big data spontaneously generated by social media platforms allows us to leverage natural language processing (NLP) methods as valuable tools to understand the opioid crisis.

OBJECTIVE: We aimed to understand how NLP has been applied to Reddit (Reddit Inc) data to study opioid use.

METHODS: We systematically searched for peer-reviewed studies and conference abstracts in PubMed, Scopus, PsycINFO, ACL Anthology, IEEE Xplore, and Association for Computing Machinery data repositories up to July 19, 2022. Studies were included if they investigated opioid use, used NLP techniques to analyze the textual corpora, and used Reddit as the social media data source. We were specifically interested in mapping studies' overarching goals and findings, methodologies and software used, and main limitations.

RESULTS: In total, 30 studies were included and classified into 4 non-mutually exclusive overarching goal categories: methodological (n=6, 20% of studies), infodemiology (n=22, 73% of studies), infoveillance (n=7, 23% of studies), and pharmacovigilance (n=3, 10% of studies). NLP methods were used to identify content relevant to opioid use among vast quantities of textual data, to establish potential relationships between opioid use patterns or profiles and contextual factors or comorbidities, and to anticipate individuals' transitions between different opioid-related subreddits, likely revealing progression through opioid use stages. The most common techniques were embedding (12/30, 40%), prediction or classification (12/30, 40%), topic modeling (9/30, 30%), and sentiment analysis (6/30, 20%). The most frequently used programming languages were Python (20/30, 67%) and R (2/30, 7%). Among the studies that reported limitations (20/30, 67%), the most cited was the uncertainty regarding whether redditors participating in these forums were representative of people who use opioids (8/20, 40%). Most papers were recent, with 28 of 30 (93%) published between 2019 and 2022, and their authors came from a range of disciplines.

CONCLUSIONS: This scoping review identified a wide variety of NLP techniques and applications used to support surveillance and social media interventions addressing the opioid crisis. Despite the clear potential of these methods to enable the identification and analysis of opioid-relevant content in Reddit, there are limits to the degree of interpretive meaning that they can provide. Moreover, we identified the need for standardized ethical guidelines to govern the use of Reddit data to safeguard the anonymity and privacy of people using these forums.
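As an illustration of the prediction or classification approach mentioned in the results, the following is a minimal sketch of how posts might be flagged as opioid-relevant. It is not taken from any of the reviewed studies. The multinomial Naive Bayes model, the function names, the labels, and the toy posts are all assumptions made up for this example. A real study would use a much larger annotated corpus and, most likely, an established library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token frequencies
    label_counts = Counter()            # label -> number of posts
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy posts, invented for illustration only (not real Reddit data).
TOY_POSTS = [
    ("tapering off oxycodone and the withdrawal is rough", "relevant"),
    ("day ten without heroin, cravings are easing", "relevant"),
    ("my doctor switched me from fentanyl patches to methadone", "relevant"),
    ("best pizza toppings for a game night", "other"),
    ("which laptop should I buy for college", "other"),
    ("my cat keeps knocking cups off the table", "other"),
]

model = train_naive_bayes(TOY_POSTS)
print(predict(model, "struggling with heroin withdrawal"))
print(predict(model, "need a new laptop for game night"))
```

A classifier of this kind only separates relevant from irrelevant text. As the conclusions note, it says little about the interpretive meaning of what people write.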
