Abstract
The transmission history and adaptive evolution of the Mycobacterium tuberculosis complex (MTBC) in China remain underexplored despite its remarkably low genetic diversity and enduring public health burden. Here, we analyzed 23,873 MTBC whole-genome sequences to reconstruct the timeline and routes of its spread, its population dynamics, and its patterns of host adaptation in China. Bayesian coalescent models revealed recurrent introductions of four MTBC sub-lineages (L2.2, L4.2, L4.4, and L4.5) between 1,000 and 400 years ago via European-Chinese transmission networks, which might have triggered the formation of local genetic clades in China. These local clades underwent three rapid population expansions that coincided with historical climate cooling events and warfare, and displayed convergent adaptation, including shared mutations and structural variations in macrophage-resistance genes and enhanced genetic diversity in T cell epitopes. Historical mutation rate estimates and neutral mutation simulations indicated that these local clades experienced pronounced macrophage-induced selection pressures, likely operating over the past two centuries. DNA information entropy analysis further revealed that their signatures of adaptive evolution clustered within macrophage-resistance pathways. Rv0801, a gene of uncharacterized function that was computationally predicted to carry the most prominent adaptive signatures, was confirmed through recombinant strain infection to enhance intracellular survival in human macrophages. This integrative approach reveals the evolutionary trajectory of MTBC in China, provides novel methods for quantifying selection pressures and detecting adaptive evolution signals in prokaryotes, and establishes a comprehensive framework for investigating pathogen transmission dynamics and host adaptation mechanisms.
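
As an illustration of the kind of quantity a DNA information entropy analysis may build on, the following minimal sketch computes per-site Shannon entropy over a set of aligned sequences. This is an assumption made for illustration only: the abstract does not specify the entropy measure used, and the function names `column_entropy` and `site_entropies` are hypothetical, not taken from the study.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column.

    Assumption: only unambiguous bases A/C/G/T are counted;
    gaps and ambiguity codes are ignored.
    """
    counts = Counter(base for base in column.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over bases of -p * log2(p)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def site_entropies(aligned_seqs):
    """Per-site entropy for equal-length aligned sequences."""
    return [column_entropy("".join(col)) for col in zip(*aligned_seqs)]


# Toy alignment (made-up sequences, not real MTBC data).
toy_alignment = ["ACGT", "ACGA", "ACTA", "AGTA"]
entropies = site_entropies(toy_alignment)
```

Under this reading, invariant sites have zero entropy and sites with more evenly distributed alleles score higher, so regions enriched for high-entropy sites could be candidates for closer inspection. How the study aggregates or tests such values is not stated here.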