Harnessing Psycho-lingual and Crowd-Sourced Dictionaries for Predicting Taboos in Written Emotional Disclosure in Anonymous Confession Boards

Abstract

Over the last decade, the health informatics community has made many efforts to develop systems that automatically recognize and predict disclosures on social media. However, most of these efforts have focused on simple topic prediction or sentiment classification. Taboo disclosures, which people are not comfortable discussing with their friends, represent a more abstract theme that depends on context and background. Recent research has demonstrated the efficacy of injecting concepts into the learning model to improve prediction. We present a vectorization scheme that combines corpus- and lexicon-based approaches for predicting taboo topics from anonymous social media datasets. The proposed scheme exploits two context-rich lexicons, LIWC and Urban Dictionary. Our methodology achieves cross-validation accuracies of up to 78.1% for the supervised learning task on the Facebook Confessions dataset, and 70.5% for the transfer learning task on the YikYak dataset. For both tasks, supervised algorithms trained with features generated by the proposed vectorizer perform better than those trained on a vanilla tf-idf representation. This work presents a novel methodology for predicting taboos from anonymous emotional disclosures on confession boards.
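The abstract does not spell out how the corpus-based and lexicon-based features are combined, so the following is only a minimal sketch under stated assumptions: it concatenates a smoothed tf-idf vector with per-category lexicon hit rates. The `TOY_LEXICON` categories and words, and all function names, are hypothetical stand-ins for illustration; they are not the actual LIWC or Urban Dictionary resources, nor the authors' vectorizer.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; a stand-in for LIWC / Urban Dictionary categories.
TOY_LEXICON = {
    "negemo": {"hate", "ashamed", "guilty"},
    "family": {"mom", "dad", "brother"},
}


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocab, vectors) using relative term frequency and smoothed idf."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def lexicon_features(text, lexicon):
    """Fraction of tokens that fall in each lexicon category (sorted by name)."""
    toks = tokenize(text)
    total = len(toks) or 1
    return [
        sum(t in words for t in toks) / total
        for _, words in sorted(lexicon.items())
    ]


def combined_vectors(docs, lexicon=TOY_LEXICON):
    """Concatenate tf-idf features with lexicon category features."""
    vocab, tfidf = tfidf_vectors(docs)
    names = vocab + ["lex:" + c for c in sorted(lexicon)]
    rows = [tv + lexicon_features(d, lexicon) for d, tv in zip(docs, tfidf)]
    return names, rows


if __name__ == "__main__":
    names, rows = combined_vectors(["i hate my dad", "i feel guilty"])
    for row in rows:
        print(dict(zip(names, (round(v, 3) for v in row))))
```

The resulting rows could be passed to any supervised classifier; the comparison reported in the abstract would correspond to training once on the tf-idf columns alone and once on the combined columns.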
