Abstract
Social visual processing in vertebrates employs sophisticated neural mechanisms ranging from categorical face cells to distributed sparse coding systems. In primates, recent evidence supports a "tuning landscape" model in which neurons signal distances to prototypes in a high-dimensional feature space rather than functioning as simple category detectors. However, social visual processing in non-mammalian animals remains poorly understood. We recorded single-unit activity from three functionally distinct pigeon brain regions, the mesopallium ventrolaterale (MVL), the visual Wulst, and the nidopallium caudolaterale (NCL), while birds viewed dynamic videos of conspecifics and control shapes performing courtship, eating, flying, and walking behaviors. Although we found visually responsive neurons in all three regions, we observed no categorical distinction between conspecific and control stimuli. Instead, population analyses revealed discrete temporal modulations corresponding to specific motion features (bowing, wing-flapping, head-bobbing), suggesting feature-based rather than categorical encoding of visual information. Sound-modulated visual units were significantly more prevalent in MVL than in the Wulst, indicating earlier multimodal integration in the tectofugal pathway than previously recognized. The absence of differential responses in NCL during passive viewing, in contrast with clear modulation in the visual areas, suggests that this region is less involved in the automatic analysis of visual features. These findings suggest that avian visual structures use sparse coding principles similar to those of the mammalian visual cortex, where populations encode specific features through coordinated but brief neural responses rather than sustained categorical signals.