Abstract
Touch perception is an inherently multisensory process in which vision plays an essential role. However, our understanding of how the visual system encodes the sensory and emotional-affective aspects of observed touch, and of the timing of these processes, remains limited. To address this gap, we investigated the neural dynamics of visual touch perception using electroencephalographic (EEG) recordings from participants who viewed videos depicting detailed tactile hand interactions from the Validated Touch-Video Database. We examined how the brain encodes basic body cues, such as hand orientation and viewing perspective, as well as sensory aspects, including the type of touch (e.g., stroking vs. pressing; hand vs. object touch) and the object involved (e.g., knife, brush), and emotional-affective dimensions. Using multivariate decoding, we found that information about body cues emerged within approximately 60 ms, and information about sensory details and valence emerged around 110-160 ms, demonstrating efficient early visual encoding. Information about arousal, threat, and pain was most clearly identified by approximately 260 ms, suggesting that such evaluations require slightly more extended neural processing. Frequency decoding revealed that body cues were processed across a broad spectral range, with the strongest contributions in the theta, alpha, and low beta bands (~6-20 Hz), whereas sensory and emotional-affective features were primarily reflected in delta, theta, and alpha frequencies (~1-13 Hz). Our findings reveal that bottom-up, automatic visual processing is integral to complex tactile assessments, enabling the rapid extraction of both the personal relevance and the sensory and emotional dimensions of visually observed touch.