Abstract
Protein domains are fundamental units determining protein functions. This study identified all protein domains and domain combinations from 446 genomes across all major plant lineages. We discovered more domains and domain combinations in land plants than in algae. Many novel "core" protein domains were acquired in the early evolution of streptophytes, substantially enriching the genomic toolkit that enabled plants to shift from unicellular to multicellular organization and to adapt to terrestrial life. After plants colonized the land, the number of ancestral core domains continued to decrease in land plants; in contrast, an increasing number of non-core domains were acquired, which, together with enhanced domain shuffling, generated various novel domain combinations and expanded protein diversity. We speculate that losing existing genetic elements (core domains) is not always detrimental, as it may have reduced evolutionary constraints on species, paving the way for biological innovation (speciation) and adaptation to changing environments.