Abstract
Humans organise their social worlds into social and nonsocial events. Social event segmentation refers to the ability to parse environmental content into social and nonsocial events or units. Here, we investigated the role that perceptual information from the visual and auditory modalities, in isolation and in conjunction, played in social event segmentation. Participants viewed a video clip depicting an interaction between two actors and marked the boundaries of social and nonsocial events. Depending on the condition, the clip was first presented with only auditory or only visual information. The clip was then presented with both auditory and visual information. Overall group consensus and response consistency in parsing the clip were higher for social segmentation and when both auditory and visual information were available. Presenting the clip in the visual domain only benefitted group agreement in social segmentation, while the inclusion of auditory information (in the audiovisual condition) also improved response consistency in nonsocial segmentation. Thus, social segmentation utilises information from the visual modality, with auditory cues contributing under ambiguous or uncertain conditions and during the segmentation of nonsocial content.