Abstract
Coronaviruses evolve rapidly and are prone to giving rise to new viruses. Human coronavirus (HCoV)-229E is one of the seven coronaviruses (alongside HCoV-OC43, HCoV-HKU1, HCoV-NL63, SARS-CoV, MERS-CoV, and SARS-CoV-2) that cause respiratory infections in humans. Genomic data are very scarce for this virus. We implemented an in-house multiplex PCR strategy to amplify HCoV-229E genomes from nasopharyngeal samples prior to next-generation sequencing with Nanopore or Illumina technologies. HCoV-229E genomes were assembled and analyzed using the MAFFT, MEGA, iTOL, Nextstrain, and Nextclade software. Thirty-one PCR primer pairs designed to amplify overlapping fragments of the HCoV-229E genome allowed us to obtain 123 genomes, which were classified in an emerging HCoV-229E lineage first reported in China and within which two sublineages were delineated. Relative to genome NC_002645.1 (2001), 1167 substitutions, 72 insertions, and 34 deletions were detected at the nucleotide level, and 415 substitutions, 39 deletions, and 14 insertions at the amino acid level. The greatest diversity was found in the spike protein-encoding gene, followed by Nsp3. The two sublineages harbored signature mutations. We almost doubled the HCoV-229E genome set available worldwide and provided the first French genomes. Further studies are needed to strengthen knowledge of this virus's phylogenomics and evolutionary dynamics, which may provide clues that improve our understanding of coronaviruses.