Abstract
Within the scope of knowledge distillation research, contrastive representation distillation has achieved remarkable results by introducing the Contrastive Representation Distillation Loss. However, previous work has paid relatively little attention to factors at the input-sample level. We observe that the large number of negative sample pairs involved in the knowledge transfer process leads to information redundancy. To mitigate this issue, we propose a representation normalization method and apply it to contrastive representation distillation, reducing the redundancy introduced by negative sample pairs. Meanwhile, drawing on the idea of the Triplet Loss in contrastive learning, we construct a loss function and integrate it into the Contrastive Representation Distillation Loss to form the Contrast Enhanced Representation Normalization Distillation Loss. This new loss function enhances the similarity between positive sample pairs and increases the distance between negative sample pairs. Experimental results demonstrate that the Contrast Enhanced Representation Normalization Distillation algorithm outperforms the Contrastive Representation Distillation algorithm on the CIFAR-100 and ImageNet datasets, and performs strongly against other state-of-the-art knowledge distillation methods. This not only enables the deployment of models on resource-constrained devices, but also shows broad potential for applications such as image segmentation, providing strong support for related research and practical use.
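For intuition, the following is a minimal PyTorch-style sketch of the idea summarized above: student and teacher representations are normalized, and a triplet-style term pulls each positive (same-sample) student-teacher pair together while pushing it away from negative pairs in the batch. The function name, margin value, and batch-wise hardest-negative choice are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def triplet_style_term(f_s, f_t, margin=0.5):
    """Illustrative triplet-style term (hypothetical, not the paper's exact loss).

    f_s: student features, shape (batch, dim)
    f_t: teacher features, shape (batch, dim)
    """
    # Representation normalization: project both feature sets onto the unit sphere.
    f_s = F.normalize(f_s, dim=1)
    f_t = F.normalize(f_t, dim=1)

    # Cosine similarity of positive pairs (same sample, student vs. teacher).
    pos = (f_s * f_t).sum(dim=1)

    # All student-teacher similarities; off-diagonal entries are negative pairs.
    sim = f_s @ f_t.t()
    diag = torch.eye(len(f_s), dtype=torch.bool, device=f_s.device)
    neg = sim.masked_fill(diag, float('-inf')).max(dim=1).values  # hardest negative per sample

    # Encourage pos to exceed neg by at least the margin.
    return F.relu(neg - pos + margin).mean()
```

In the method described by the abstract, a term of this kind would be weighted and added to the contrastive distillation objective; the weighting and exact combination are not specified here and would follow the paper's formulation.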