Sequence structure in children's speech reveals non-linear development of relations between word categories

Abstract

Why do children learn some words earlier than others? Can children's speech patterns reveal how their evolving models of language determine what they learn? This study presents a systematic analysis of children's speech using low-dimensional embeddings to examine how the contextual knowledge reflected in their utterances reorganizes as linguistic experience increases. We analyzed age-stratified samples from the CHILDES database (18-36 months: n = 1,693,641 tokens; 3-6 years: n = 1,750,007; 5-12 years: n = 1,721,828) and adult speech from the SUBS2VEC subtitle corpus (n = 1,742,885). Our results suggest that the order and position of words in sequences produced by children from different age groups reflect changes in the way they represent categories of words. Rather than being ungrammatical, children's utterances appear to be structured by temporary grammars that optimize the distribution of information in sequences. The results point to shifts in how words are organized in semantic space, reflecting the gradual alignment of lexical categories during learning; this restructuring appears to draw on functionally ambiguous (multipurpose) categories in English. These findings are somewhat counterintuitive, as they suggest that not knowing the exact meaning of words can facilitate both learning and communication.
