Abstract
Environmental perception is an essential task for autonomous driving and is typically based on LiDAR or camera sensors. In recent years, 4D mm-Wave radar, which acquires 3D point clouds together with point-wise Doppler velocities, has drawn substantial attention owing to its robust performance under adverse weather conditions. Nonetheless, due to the high sparsity and substantial noise inherent in radar measurements, most radar perception studies are limited to object-level tasks, and point-level tasks such as semantic segmentation remain largely underexplored. This paper explores the feasibility of semantic segmentation with 4D radar. We construct the ZJUSSet dataset, which contains accurate point-wise class labels for both radar and LiDAR. We then propose RaSS, a cross-modal distillation framework, to perform the task. We also design an adaptive Doppler compensation module to facilitate segmentation. Experimental results on the ZJUSSet and VoD datasets demonstrate that our RaSS model significantly outperforms the baselines and competing methods. Code and dataset will be made available upon paper acceptance.