Abstract
The multispecies coalescent (MSC) model provides a framework for detecting gene flow using genomic data, including between sister species. However, the robustness of the inference to violations of model assumptions is poorly understood. Here, we use simulation to study the false positive rate of a Bayesian test of gene flow under the MSC, considering multiple influencing factors including recombination, natural selection, discrete versus continuous gene flow, variable species divergence time, and gene flow involving sister versus nonsister lineages. We find that in almost all scenarios examined the test has a very low false positive rate. However, the test of gene flow between sister lineages may be prone to a high false positive rate in cases of very recent species divergence and very high recombination rate. At low recombination rates, the test is robust to selective sweeps, background selection, and balancing selection, although prolonged balancing selection can lead to false signals of gene flow between sister lineages. The impact of excessive recombination on the test of gene flow between sister lineages may be assessed by using a smaller number of sequences for each species and by considering shorter sequences at each locus. Recent species divergence alone (with no recombination) does not cause false positives in tests of gene flow, contrary to previous claims. The test of gene flow between nonsister lineages is robust to recombination at all divergence levels. Our findings provide guidance for reliable inference of gene flow using coalescent methods and highlight the need for care in conducting and interpreting simulation experiments.