Abstract
Adeno-associated viruses (AAVs) are nonpathogenic DNA viruses with potent gene delivery capabilities, making them essential tools in gene therapy and biomedical research. Despite their therapeutic importance, key aspects of AAV natural biology remain obscure, complicating efforts to explain rare AAV-associated diseases and optimize gene therapy vectors. By analyzing sequence data from virus isolates and endogenous viral elements (EVEs), I reveal a striking evolutionary pattern: While AAV sublineages, defined by the replication-associated (rep) gene, have broadly codiverged with host groups over millions of years, capsid (cap) diversity has been shaped by extensive recombination. In particular, one capsid lineage, Mammalian-wide (M-wide), has spread horizontally across diverse rep lineages and host taxa through multiple recombination events. Furthermore, several AAVs with M-wide capsids, including AAV-4, AAV-12, and bovine AAV (BAAV), originate from historical adenovirus (Ad) stocks, raising the possibility that laboratory conditions contributed to capsid transfer. Distinguishing natural from laboratory-driven recombination is essential for understanding AAV ecology and its implications for gene therapy. A systematic sequencing effort in human and primate populations is needed to assess the extent of recombinant capsid acquisition, determine the impact of laboratory-driven recombination on circulating AAV diversity, and track ongoing recombination events that could affect vector safety and efficacy.