Abstract
The intelligent generation of Peking Opera facial makeup poses a distinctive challenge in stylized image generation, connecting artificial intelligence with the preservation of traditional culture. This study addresses two gaps: (1) the need for high-fidelity generation that captures intricate artistic details, and (2) the scarcity of labeled datasets for training. We propose an enhanced Stable Diffusion framework that integrates region-specific noise scaling and attention-augmented U-Net models to improve contrast and fine-grained detail synthesis. LoRA fine-tuning further improves generation efficiency without compromising quality. In our evaluations, the proposed model outperforms state-of-the-art (SOTA) models, achieving an FID of 16.34, a KID of 9.44, and an SSIM of 0.4912. These results support the model's effectiveness in preserving cultural authenticity while accelerating production.
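To make the LoRA component concrete, the following is a minimal sketch of the standard low-rank update, in which a frozen weight W is adapted as W + (alpha / r) * B A. It illustrates the general technique only and is not the configuration used in this study: the function names, matrix shapes, rank, and scaling value are all illustrative assumptions.

```python
# Minimal LoRA sketch in pure Python (no external dependencies).
# Assumption: all names, shapes, and values here are illustrative,
# not the settings used in the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    w: frozen base weight, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    r = len(a)                      # rank = number of rows in A
    scale = alpha / r
    delta = matmul(b, a)            # low-rank update, shape (d_out, d_in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, r):
    """Return (adapter parameter count, full fine-tuning parameter count)."""
    return r * (d_in + d_out), d_out * d_in
```

With B initialized to zero, as is conventional, the adapted weight equals the frozen weight at the start of training. For a hypothetical 320 x 320 projection with rank 4, the adapter trains 2,560 parameters instead of 102,400, which is the source of LoRA's fine-tuning efficiency.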