Abstract
Recent studies have suggested that statistical image features are important for both natural scene and object recognition, although spatial layout and shape information also remain important. In the present study, to investigate the roles of low- and high-level statistical image features in natural scene and object recognition, we conducted categorization tasks using a wide variety of natural scene and object images, along with two types of synthesized images: Portilla-Simoncelli (PS) synthesized images, which preserve low-level statistical features, and style-synthesized (SS) images, which retain higher-level statistical features. Behavioral experiments revealed that human observers (of either sex) could categorize SS versions of natural scene and object images with high accuracy. Furthermore, we recorded visual evoked potentials (VEPs) for the original, SS, and PS images and decoded natural scene and object categories using a support vector machine. Consistent with the behavioral results, natural scene categories were decoded with high accuracy within 200 ms after stimulus onset. In contrast, object categories were successfully decoded only from VEPs for original images, and only at later latencies. Finally, we examined whether style features could be used to classify natural scene and object categories. The classification accuracy for natural scene categories showed a trend similar to the behavioral data, whereas that for object categories did not align with the behavioral results. Taken together, these findings suggest that although natural scene and object categories can be recognized relatively easily even when layout information is disrupted, the extent to which statistical features contribute to categorization differs between natural scenes and objects.