Abstract
Sound symbolism, referring to the resemblance between the sound structure of words and their meaning, is commonly studied using auditory pseudowords. Companion studies across seven meaning domains demonstrated systematic relationships, varying by domain, between the perceptual ratings, phonetic features, and acoustic parameters of a set of 537 pseudowords (Lacey et al. 2024a, 2024b). Here we employed a k-nearest-neighbor (KNN) machine-learning algorithm to compare 4094 combinations of twelve acoustic parameters (3 spectro-temporal and 9 characterizing vocal quality) and identify the parameter combination that best predicted perceptual ratings in each domain. Using multiple regression, we then examined the relative contributions of the parameters comprising the best-performing acoustic model for each domain. Finally, we used the KNN approach to generate sound-symbolic ratings, in the shape domain, for 160 real words and compared these predicted ratings with corresponding perceptual ratings. We found that sound-symbolic mappings rely on domain-specific combinations and weights of acoustic parameters. One spectro-temporal parameter, the fast Fourier transform, and one vocal parameter, the fraction of unvoiced frames, were both present in the best-performing model for each meaning domain studied, indicating the general importance of these two parameters for sound-symbolic judgments. The predicted and perceptual ratings of the real words were strongly correlated, indicating the value of this approach for measuring the degree of sound-symbolic mapping in natural languages, unconfounded by semantic bias. Our findings support the proposed relevance of sound symbolism to language.
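To make the model-comparison step concrete, the following minimal Python sketch shows one way a KNN-based search over parameter combinations could be organized. It is an illustration under stated assumptions, not the procedure used in the study: the data are synthetic, only four hypothetical parameters are used, and the Euclidean distance, the value of k, the leave-one-out scoring, the squared-error measure, and the enumeration of all non-empty subsets are all choices made for the example. The function names (`knn_predict`, `loo_error`, `best_subset`) are likewise hypothetical.

```python
import itertools
import math
import random


def knn_predict(train_X, train_y, query, k):
    """Mean rating of the k training items closest to `query` (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def loo_error(X, y, columns, k):
    """Leave-one-out mean squared error using only the chosen parameter columns."""
    sub = [[row[c] for c in columns] for row in X]
    total = 0.0
    for i in range(len(sub)):
        # Hold out item i and predict its rating from all remaining items.
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        total += (knn_predict(train_X, train_y, sub[i], k) - y[i]) ** 2
    return total / len(sub)


def best_subset(X, y, k=3):
    """Score every non-empty subset of parameter columns; return the best and the count."""
    n_params = len(X[0])
    candidates = [
        combo
        for size in range(1, n_params + 1)
        for combo in itertools.combinations(range(n_params), size)
    ]
    scored = [(loo_error(X, y, combo, k), combo) for combo in candidates]
    return min(scored), len(candidates)


# Synthetic stand-in data (assumption): 40 items, 4 parameters scaled to [0, 1],
# with ratings that depend on parameters 0 and 2 plus a little noise.
random.seed(0)
X = [[random.random() for _ in range(4)] for _ in range(40)]
y = [3.0 * row[0] - 2.0 * row[2] + random.gauss(0, 0.05) for row in X]

(best_err, best_cols), n_tried = best_subset(X, y, k=3)
print(f"subsets tried: {n_tried}, best columns: {best_cols}, LOO MSE: {best_err:.4f}")
```

Because KNN relies on distances, parameters measured on different scales would normally need to be standardized before such a search; the synthetic parameters here already share a common range. The same `knn_predict` call, given pseudoword parameters and ratings as training data, could in principle be applied to the acoustic parameters of real words to produce predicted ratings for comparison with perceptual ones.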