Abstract
This research combines artificial intelligence-generated content technology with style transfer methods to significantly improve the efficiency of generating and transmitting Yongju opera language content. To address the challenges of traditional content creation, we first analyzed Yongju opera’s linguistic characteristics and built a multi-source dataset comprising 100 classical scripts, 50 contemporary scripts, and 500 lyric audio clips collected from the Ningbo Drama Research Institute and various opera platforms. The proposed framework has a dual-model architecture in which a transformer-based conditional probabilistic generation (TFCPG) model acts as the content generator and a conditional variational autoencoder (CVAE) functions as the style processor. The TFCPG transforms modern Chinese input into text that complies with Yongju opera’s grammatical standards, and the CVAE enhances that text by incorporating dialect vocabulary and rhythmic patterns through manipulation of style latent variables. The models were implemented in TensorFlow 1.4 with multi-task learning (batch size 32, Adam optimizer, learning rate 0.01). Experimental results show that the TFCPG completes text generation in 33.32 min, a 56.6% reduction compared with the Transformer baseline, and achieves a bilingual evaluation understudy (BLEU) score of 45.55 ± 1.32 (p < 0.01). In human evaluations, the system scored 4.35 for dialect authenticity, 4.18 for artistic expression, and 4.27 for cultural relevance. The CVAE component attained a BLEU score of 44.26 with 97.03% style transfer accuracy, exceeding the sequence-to-sequence (Seq2Seq) baseline by 10.28%. These results support the effectiveness of our approach for Yongju opera language generation and style adaptation.