Analyzing differences between discursive communities using dialectograms

利用方言图分析话语社群之间的差异

阅读:1

Abstract

Word embeddings provide an unsupervised way to understand differences in word usage between discursive communities. A number of papers have focused on identifying words that are used differently by two or more communities. But word embeddings are complex, high-dimensional spaces and a focus on identifying differences only captures a fraction of their richness. Here, we take a step towards leveraging the richness of the full embedding space, by using word embeddings to map out how words are used differently. Specifically, we describe the construction of dialectograms, an unsupervised way to visually explore the characteristic ways in which each community uses a focal word. Based on these dialectograms, we provide a new measure of the degree to which words are used differently that overcomes the tendency for existing measures to pick out low-frequency or polysemous words. We apply our methods to explore the discourses of two US political subreddits and show how our methods identify stark affective polarisation of politicians and political entities, differences in the assessment of proper political action as well as disagreement about whether certain issues require political intervention at all.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。