Abstract
The human auditory system extracts meaning from sounds in the environment by transforming acoustic input signals into semantic categories, such as speech and music. Although distinct acoustic features give rise to these categorical percepts and to preferential responses in spatially segregated regions of the auditory cortex, the nature of the internal representations underlying this transformation remains poorly understood. Here, we combined neuroimaging, a deep neural network (DNN), brain-based sound synthesis, and psychophysical testing in human participants of either sex to investigate the internal sound features encoded in speech- and music-selective regions of the auditory cortex and their functional role in sound categorization. We found that sounds synthesized from cortical activity patterns, though acoustically dissimilar to natural speech and music sounds, nonetheless elicited similar categorical cortical and behavioral responses. These results suggest that the auditory cortex relies on internal, abstracted representations of category structure that are not reducible to the natural acoustic properties of speech and music. Our findings provide new insights into the intermediate sound features, as captured by DNNs, that may support categorization in the human auditory system.