Abstract
Nonribosomal peptides are assembled by large enzymes that contain multiple active sites, which function in a modular manner. The adenylation (A) domains present within typical nonribosomal peptide synthetase (NRPS) modules contain specificity-conferring codes or signature sequences (SNSs). In this study, we obtained 2051 A domain sequences from 67 bacterial species. Their alignment and clustering identified 508 SNSs. Over 80% of the SNSs displayed distinct specificity for 36 proteinogenic and nonproteinogenic α-amino acid moieties (α-AAMs). Furthermore, modifications such as N-methylation, monooxygenase activity, and oxidation contributed to the elongation of the A domains, while conferring pronounced affinities for certain α-AAMs. Notably, β-hydroxylation demonstrated particular preferences. Specifically, ornithine, threonine, tyrosine, and phenylalanine moieties frequently underwent atypical covalent modifications, and 41 modules were used iteratively. These insights significantly facilitate the identification of uncharacterized NRPS systems and expedite traditional identification processes, although novel modifications, unusual domain organizations, and dormant domains pose challenges for their accurate prediction.