Abstract
Sketch face synthesis aims to generate sketch images from photos. Contrastive learning, which maps and aligns information across modalities, has recently been widely applied to image translation. However, when conventional contrastive learning is applied to sketch face synthesis, its random sampling strategy and the imbalance between positive and negative samples cause the synthesized sketches to lose local detail. To address these challenges, we propose a facial structure sampling contrastive learning method for sketch face synthesis. First, we propose a region-constrained sampling module that uses the facial structure distribution map produced by a dual-branch attention mechanism to segment the input photo into regions, thereby providing regional constraints for sample selection. Second, we propose a dynamic sampling strategy that adjusts the sampling frequency according to the feature density in the distribution map, thereby alleviating sample imbalance. Third, to reduce the influence of the background and sharpen the contours of the subject, we feed a mask derived from the input photo to the model as an additional input. Finally, to further improve the quality of the synthesized sketches, we add a pixel-wise loss and a perceptual loss. Experiments on the CUFS dataset show that our method generates high-quality sketch images and outperforms existing state-of-the-art methods in both subjective and objective evaluations.
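To make the idea of region-constrained, density-driven sampling concrete, the following is a minimal sketch and not the method's actual implementation. It assumes a precomputed density map and a region label map (in the method these come from the dual-branch attention mechanism, which is not reproduced here), and it assumes that the sampling budget is split across regions in proportion to their total density. The names `allocate_samples` and `sample_locations` are hypothetical.

```python
import random


def allocate_samples(density, regions, total_samples):
    """Split a sampling budget across regions in proportion to density mass.

    density: 2D list of non-negative floats (assumed facial-structure map).
    regions: 2D list of integer region labels with the same shape.
    Proportional allocation is an assumption made for illustration.
    """
    mass = {}
    for d_row, r_row in zip(density, regions):
        for d, r in zip(d_row, r_row):
            mass[r] = mass.get(r, 0.0) + d
    total = sum(mass.values())
    if total <= 0:
        raise ValueError("density map must contain positive values")
    share = {r: m / total * total_samples for r, m in mass.items()}
    counts = {r: int(s) for r, s in share.items()}
    # Give leftover samples to the regions with the largest fractional share.
    leftover = total_samples - sum(counts.values())
    by_fraction = sorted(share, key=lambda r: share[r] - counts[r], reverse=True)
    for r in by_fraction[:leftover]:
        counts[r] += 1
    return counts


def sample_locations(density, regions, total_samples, rng):
    """Draw (row, col) sample locations uniformly within each region."""
    counts = allocate_samples(density, regions, total_samples)
    pixels = {}
    for i, r_row in enumerate(regions):
        for j, r in enumerate(r_row):
            pixels.setdefault(r, []).append((i, j))
    locations = []
    for r, count in counts.items():
        # Sampling with replacement keeps the sketch valid for tiny regions.
        locations.extend(rng.choices(pixels[r], k=count))
    return locations


# Toy example: the right half of a 4x4 map is three times denser than the left.
density = [[1.0, 1.0, 3.0, 3.0] for _ in range(4)]
regions = [[0, 0, 1, 1] for _ in range(4)]
locations = sample_locations(density, regions, 8, random.Random(0))
```

In this toy setup the denser region receives three times as many sample locations as the sparser one, which is the kind of rebalancing the dynamic sampling strategy is meant to provide; a contrastive loss would then draw its patches at these locations instead of at uniformly random ones.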