Abstract
BACKGROUND: Fractionation, the loss of duplicate genes after whole genome doubling (WGD), is the subject of a debate as to whether it proceeds gene by gene or through deletion of multi-gene chromosomal segments. RESULTS: WGD produces two copies of every chromosome, that is, two identical copies of a sequence of genes. We assume that deletion events excise a geometrically distributed number of consecutive genes with mean µ ≥ 1, and that these events can combine to produce single-copy runs of length l. If µ = 1, the process is gene-by-gene. If µ > 1, the process at least occasionally excises more than one gene at a time. In the latter case, if deletions overlap, the later one simply extends the existing run of single-copy genes. We explore aspects of the predicted distribution of the lengths of single-copy regions analytically, but resort to simulations to show how observing run lengths l allows us to discriminate between the two hypotheses. CONCLUSIONS: Deletion run length distributions can discriminate between gene-by-gene fractionation and deletion of segments of geometrically distributed length, even if µ is only slightly larger than 1, as long as the genome is large enough and fractionation has not proceeded too far towards completion.
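
The deletion model described above can be illustrated with a minimal simulation sketch in Python. It is not the authors' implementation and rests on several assumptions: deletion start positions are uniform over the genome, deletion lengths follow a geometric distribution on 1, 2, ... with success probability p = 1/µ (so the mean is µ), deletions that run past the end of the chromosome are truncated, and no distinction is made between the two duplicated copies. The function names and parameters (`geometric_length`, `simulate_fractionation`, `run_lengths`, `target_fraction`) are hypothetical.

```python
import random
from collections import Counter


def geometric_length(mu, rng):
    """Draw a deletion length from a geometric distribution on 1, 2, ...
    with mean mu (assumed parametrisation: success probability p = 1/mu)."""
    p = 1.0 / mu
    length = 1
    while rng.random() > p:  # never true when mu == 1, so length stays 1
        length += 1
    return length


def simulate_fractionation(n_genes, mu, target_fraction, seed=0):
    """Apply deletion events until at least target_fraction of the gene
    pairs are single-copy. Returns a list of booleans, True = single-copy."""
    rng = random.Random(seed)
    single = [False] * n_genes
    n_single = 0
    while n_single < target_fraction * n_genes:
        start = rng.randrange(n_genes)          # assumed uniform start site
        length = geometric_length(mu, rng)
        # Overlap with an earlier deletion simply extends the existing run.
        for i in range(start, min(start + length, n_genes)):
            if not single[i]:
                single[i] = True
                n_single += 1
    return single


def run_lengths(single):
    """Count maximal runs of consecutive single-copy genes by length l."""
    runs = Counter()
    current = 0
    for flag in single:
        if flag:
            current += 1
        elif current:
            runs[current] += 1
            current = 0
    if current:
        runs[current] += 1
    return runs


if __name__ == "__main__":
    for mu in (1.0, 1.5):
        state = simulate_fractionation(n_genes=10000, mu=mu, target_fraction=0.3)
        runs = run_lengths(state)
        mean_run = sum(l * c for l, c in runs.items()) / sum(runs.values())
        print(f"mu={mu}: {sum(runs.values())} runs, mean length {mean_run:.3f}")
```

Comparing the run-length counts returned for µ = 1 and for a value of µ slightly above 1, at the same fraction of single-copy genes, gives a rough picture of how the two hypotheses might be told apart. Note that even with µ = 1, adjacent single-gene deletions merge into longer runs, so the comparison has to be made on the whole distribution of l and not on the mere presence of runs longer than 1.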