Abstract
Horizontal transfer of mitochondrial DNA into the nuclear genome generates nuclear mitochondrial sequences (NUMTs), which serve as molecular fossils reflecting long-term mitochondrial-nuclear interactions and genome evolution. However, the biological mechanisms governing NUMT integration, retention, and evolutionary fate remain incompletely understood in domesticated animals. Here, using the latest pig reference genome assembly (Sscrofa11.1), we present a comprehensive genome-wide characterization of NUMTs in pigs and provide new insights into their genomic distribution and evolutionary constraints. We identified 513 high-confidence NUMTs, of which 460 were chromosomally mapped, accounting for 0.0106% of the nuclear genome. Beyond increased detection, our analyses reveal that pig NUMTs originate non-randomly from the mitochondrial genome, preferentially integrate into genomic regions under weak selective constraint, and are frequently associated with repetitive elements, consistent with a DNA repair-mediated insertion mechanism. NUMTs predominantly occur as short, fragmented sequences and show signatures of long-term neutral evolution, whereas insertions that disrupt coding sequences are strongly selected against. Synteny-based analyses further identified clustered NUMT regions and duplicated NUMTs, suggesting secondary genomic duplication events following initial integration. Comparative analysis with the earlier Sscrofa10.2 assembly demonstrates that improved genome quality substantially enhances NUMT detection, particularly in repetitive and GC-rich regions, and clarifies previously ambiguous sequence-context associations. Together, this high-quality pig NUMT map provides a robust foundation for future functional, evolutionary, and population-level investigations and contributes to the conservation and utilization of pig genetic resources.