Syntactic language change in English and German: Metrics, parsers, and convergences

Abstract

Syntactic language change has gained increasing attention in recent years. Previous computational work based on dependency relations has focused on diachronic trends in dependency distance, which measures the linear distance between a word and its syntactic head, using dependency trees automatically predicted by a dependency parser (mostly the Stanford CoreNLP parser). In this work, we introduce a set of 15 syntax metrics that extend the analysis beyond linear distance by incorporating both linear and tree graph properties of dependency trees, such as tree height and degree. In addition, we propose a multi-parser approach that reduces the dependence on any specific parser, thereby increasing the robustness of the detected language changes. Through a cross-lingual investigation of English and German parliamentary debates from the last 160 years, using 6 different parsers (CoreNLP and five newer alternatives), we demonstrate that: (1) Relying on a single parser can be problematic, as agreement on predicted trends can be low across parsers. (2) Our set of metrics can capture subtle patterns of syntactic change. Our analysis shows that syntactic change over the inspected period is largely similar between English and German, with only 2.2% of cases yielding opposite trends in these metrics. (3) Changes in syntactic metrics seem to be more frequent at the tails of the sentence length distribution and often move in opposite directions for short and long sentences. To the best of our knowledge, ours is the most comprehensive computational analysis of syntactic language change using modern NLP technology in recent corpora of English and German.
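To make the kinds of measurements named in the abstract concrete, the following is a minimal sketch that computes three of them (mean dependency distance, tree height, and maximum degree) for a single sentence. It is an illustration under stated assumptions, not the paper's implementation: the function name `tree_metrics`, the head-index input format, and the exact definitions (height counted in edges from the root, degree counted as the number of dependents of a word) are hypothetical choices, and the abstract does not list the full set of 15 metrics.

```python
from collections import defaultdict


def tree_metrics(heads):
    """Compute simple dependency-tree metrics for one sentence.

    heads[i] is the 1-based index of the head of token i+1; 0 marks the root.
    This input format is an assumption (it mirrors the HEAD column of CoNLL-U).
    """
    n = len(heads)

    # Map each head to the list of its dependents (key 0 holds the root token).
    children = defaultdict(list)
    for dep, head in enumerate(heads, start=1):
        children[head].append(dep)

    # Linear property: absolute position difference between a word and its
    # head, averaged over all non-root arcs.
    distances = [abs(head - dep)
                 for dep, head in enumerate(heads, start=1) if head != 0]
    mean_dd = sum(distances) / len(distances) if distances else 0.0

    # Tree property: height as the number of edges on the longest
    # root-to-leaf path (assumed definition).
    def depth(node):
        kids = children[node]
        return 0 if not kids else 1 + max(depth(k) for k in kids)

    root = children[0][0]
    height = depth(root)

    # Tree property: largest number of dependents attached to any one word
    # (assumed definition of degree).
    max_degree = max(len(children[i]) for i in range(1, n + 1))

    return {
        "mean_dependency_distance": mean_dd,
        "height": height,
        "max_degree": max_degree,
    }


# "The quick fox jumps": The -> fox, quick -> fox, fox -> jumps, jumps = root
example = tree_metrics([3, 3, 4, 0])
```

For a diachronic study, a function like this would be applied to every parsed sentence and the values aggregated per year; repeating that for the output of each parser is what allows trend agreement across parsers to be compared.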
