Abstract
BACKGROUND: The influenza virus (IV) is responsible for seasonal flu epidemics. Constant mutation of the virus produces new strains and widespread reinfection across the globe, posing major challenges for disease prevention and control. Research has demonstrated that barcoding technology differentiates closely related species efficiently and cost-effectively on a large scale. We screened and validated species-specific RNA barcode segments based on the genetic relationships among the four types of IVs, to facilitate their precise identification in viral samples analyzed by high-throughput sequencing.

RESULTS: Through analysis of single-nucleotide polymorphisms, population genetic characteristics, and phylogenetic relationships in the training set, 7 IVA, 29 IVB, 40 IVC, and 5 IVD barcode segments were selected. In the testing set, the nucleotide-level recall rate across all barcode segments reached 96.86%, the average nucleotide-level specificity was approximately 55.27%, the precision rate was 100%, and the false omission rate was 0%, demonstrating high accuracy, specificity, and generalization capability for species identification. Ultimately, all four types of IVs were visualized as a combination of one-dimensional and two-dimensional codes and stored in an online database, the Influenza Virus Barcode Database (FluBarDB, http://virusbarcodedatabase.top/database/index.html).

CONCLUSION: This study validates the effective application of RNA barcoding technology to the detection of IVs and establishes criteria and procedures for selecting species-specific molecular markers. These advances improve the understanding of the genetic and epidemiological characteristics of IVs and enable rapid responses to viral genetic mutations.
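The four evaluation metrics reported in RESULTS (recall, specificity, precision, and false omission rate) are not defined in the abstract. As a reference point only, the sketch below computes them from a confusion matrix using the standard textbook definitions. The nucleotide-level definitions used in the study may differ from these, and the names `ConfusionCounts` and `classification_metrics`, as well as the example counts, are hypothetical and not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for confusion-matrix counts (not from the study)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c: ConfusionCounts) -> dict:
    """Standard textbook definitions; the study's own definitions may differ."""
    return {
        "recall": _ratio(c.tp, c.tp + c.fn),               # TP / (TP + FN)
        "specificity": _ratio(c.tn, c.tn + c.fp),          # TN / (TN + FP)
        "precision": _ratio(c.tp, c.tp + c.fp),            # TP / (TP + FP)
        "false_omission_rate": _ratio(c.fn, c.fn + c.tn),  # FN / (FN + TN)
    }


# Made-up counts for illustration only; these are not data from the study.
example = classification_metrics(ConfusionCounts(tp=90, fp=10, tn=80, fn=20))
```

With these illustrative counts, recall is 90/110, specificity is 80/90, precision is 90/100, and the false omission rate is 20/100. Denominators of zero yield NaN rather than raising an error, which keeps the helper usable on segments with no calls in a given class.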