Natural language processing for automated triage and prioritization of individual case safety reports for case-by-case assessment



Abstract

Objective: To improve a previously developed prediction model that could assist in the triage of individual case safety reports (ICSRs) by adding features derived from free-text fields using natural language processing (NLP). Methods: Structured features and NLP features were used to train a bagging classifier model. NLP features were extracted from free-text fields using a bag-of-words model. Stop words were removed, and words whose distribution differed significantly between case and non-case reports were used in the training data. Besides NLP features from free-text fields, the data also included a list of signal words deemed important by expert report assessors. Lastly, variables with multiple categories were transformed to numerical variables using the weight of evidence method. Results: The model, a bagging classifier of decision trees, had an AUC of 0.921 (95% CI = 0.918-0.925). Generic drug name, info text length, ATC code, BMI, and patient age were the most important features in classification. Conclusion: This predictive model using natural language processing could be used to assist assessors in prioritizing which future ICSRs to assess first, based on the probability that a report is a case requiring clinical review.
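
The weight of evidence transformation mentioned in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the common definition WoE = ln(P(category | case) / P(category | non-case)). The function name `weight_of_evidence`, the additive `smoothing` constant, and the toy data are hypothetical, and the abstract does not state which variables were encoded this way or how empty categories were handled.

```python
import math
from collections import Counter


def weight_of_evidence(categories, labels, smoothing=0.5):
    """Map each category to ln(P(category | case) / P(category | non-case)).

    `labels` holds 1 for a case report and 0 for a non-case report.
    `smoothing` is an assumed additive constant that keeps the ratio
    finite when a category occurs in only one of the two classes.
    """
    case_counts = Counter(c for c, y in zip(categories, labels) if y == 1)
    non_case_counts = Counter(c for c, y in zip(categories, labels) if y == 0)
    levels = sorted(set(categories))

    # Smoothed class totals, so the smoothed proportions still sum to 1.
    n_case = sum(case_counts.values()) + smoothing * len(levels)
    n_non_case = sum(non_case_counts.values()) + smoothing * len(levels)

    woe = {}
    for level in levels:
        p_case = (case_counts[level] + smoothing) / n_case
        p_non_case = (non_case_counts[level] + smoothing) / n_non_case
        woe[level] = math.log(p_case / p_non_case)
    return woe


# Hypothetical toy data: one categorical code per report and its case label.
codes = ["A", "A", "A", "B", "B", "C", "C", "C"]
is_case = [1, 1, 0, 0, 0, 1, 0, 0]

woe_table = weight_of_evidence(codes, is_case)

# Replace each category with its numeric WoE value before model training.
encoded = [woe_table[code] for code in codes]
```

Under this sign convention, a positive value marks a category that is relatively more frequent among case reports, a negative value marks one more frequent among non-case reports, and a value near zero carries little information either way.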
