Abstract
BACKGROUND: The recurrent evolution of the C(4) photosynthetic pathway in angiosperms represents one of the most extraordinary examples of convergent evolution of a complex trait. Comparative genomic analyses have unveiled some of the molecular changes associated with the C(4) pathway. For instance, several key enzymes involved in the transition from C(3) to C(4) photosynthesis share convergent amino acid replacements along C(4) lineages. However, the extent of convergent replacements potentially associated with the emergence of C(4) plants remains to be fully assessed. Here, we conducted an organelle-wide analysis to determine whether convergent evolution occurred in multiple chloroplast proteins besides the well-known case of the large RuBisCO subunit encoded by the chloroplast gene rbcL.
METHODS: Our study was based on the comparative analysis of 43 C(4) and 21 C(3) grass species belonging to the PACMAD clade, a focal taxonomic group in many investigations of C(4) evolution. We first used protein sequences of 67 orthologous chloroplast genes to build an accurate phylogeny of these species. Then, we inferred amino acid replacements along 13 C(4) lineages and 9 C(3) lineages using reconstructed protein sequences of their reference branches, corresponding to the branches containing the most recent common ancestors of C(4)-only clades and C(3)-only clades. Pairwise comparisons between reference branches allowed us to identify both convergent and non-convergent amino acid replacements between C(4):C(4), C(3):C(3) and C(3):C(4) lineages.
RESULTS: The reconstructed phylogenetic tree of 64 PACMAD grasses showed strong support at all nodes used for analyses of convergence. We identified 217 convergent replacements and 201 non-convergent replacements in 45 of the 67 chloroplast proteins across C(4) and C(3) reference branches. C(4):C(4) branches showed higher levels of convergent replacements than C(3):C(3) and C(3):C(4) branches.
Furthermore, we found that more proteins shared unique convergent replacements in C(4) lineages, with both RbcL and RpoC1 (the RNA polymerase beta' subunit 1) showing a significantly higher ratio of convergent to non-convergent replacements in C(4) branches. Notably, convergent replacements outnumbered non-convergent replacements in more C(4):C(4) reference branch pairs than in C(3):C(3) or C(3):C(4) pairs. Our results suggest that, in the PACMAD clade, C(4) grasses experienced higher levels of molecular convergence than C(3) species across multiple chloroplast genes. These findings have important implications for our understanding of the evolution of the C(4) photosynthetic pathway.
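The pairwise comparison of reference branches described in the METHODS can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. It assumes that each reference branch is given as an aligned pair of ancestral and derived protein sequences, that a convergent replacement is a site replaced on both branches with the same derived amino acid, and that a non-convergent replacement is a site replaced on both branches with different derived amino acids. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def replacements(ancestral, derived):
    """Return {site: derived_aa} for sites that changed along one branch.

    Assumes both sequences come from the same alignment; gap positions
    ("-") are skipped.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    return {
        site: d
        for site, (a, d) in enumerate(zip(ancestral, derived))
        if a != d and "-" not in (a, d)
    }


def compare_branches(branch_x, branch_y):
    """Classify sites replaced on both branches.

    Each branch is an (ancestral, derived) tuple. Returns two sorted
    lists of 0-based sites: (convergent, non_convergent), using the
    assumed definitions stated in the lead-in.
    """
    rx = replacements(*branch_x)
    ry = replacements(*branch_y)
    shared = sorted(rx.keys() & ry.keys())
    convergent = [s for s in shared if rx[s] == ry[s]]
    non_convergent = [s for s in shared if rx[s] != ry[s]]
    return convergent, non_convergent


def pairwise_counts(branches):
    """Sum [convergent, non_convergent] counts per pair type.

    `branches` maps a branch name to (pathway, ancestral, derived),
    where pathway is "C3" or "C4". Keys of the result are "C3:C3",
    "C3:C4" or "C4:C4"; only pair types that occur are included.
    """
    counts = {}
    for (_, bx), (_, by) in combinations(branches.items(), 2):
        label = ":".join(sorted([bx[0], by[0]]))
        conv, nonconv = compare_branches(bx[1:], by[1:])
        entry = counts.setdefault(label, [0, 0])
        entry[0] += len(conv)
        entry[1] += len(nonconv)
    return counts


# Toy data (hypothetical sequences, not from the study).
toy_branches = {
    "c4_a": ("C4", "MKTAL", "MRTAV"),
    "c4_b": ("C4", "MKTAL", "MRTAI"),
    "c3_a": ("C3", "MKTAL", "MKSAL"),
}
print(pairwise_counts(toy_branches))
```

In the toy data, the two C(4) branches share one convergent site (K to R) and one non-convergent site (L to V versus L to I), while neither shares a replaced site with the C(3) branch. Per-protein counts produced this way could then feed a ratio comparison between pair types.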