What is the most important factor that holds a gene pool together and prevents speciation?

The role of natural selection in the origin of species has been controversial ever since Darwin published his great work in 1859; as can be seen from the papers in this volume, it remains so. Darwin (at least, in the first edition of The origin of species) relied on selection as the main cause of evolutionary change, but saw that hybrid sterility could not be directly selected; instead, he argued that it arises as a side-effect of divergence. In contrast, Wallace's (1889) enthusiasm for selection led him to argue that not only could it strengthen prezygotic isolation, by what we now call reinforcement, but that group selection could even cause hybrid sterility (Cronin 1991, ch. 16). Then, as now, ecological divergence that allows distinct species to live together in sympatry received less attention than reproductive isolation. However, Darwin (1859, ch. 4) did attach great importance to diversifying selection in driving speciation.

This paper begins by briefly reviewing the various roles that selection plays in speciation, but then focuses on how it leads to sympatry, by selecting for reduced recombination and for the use of different limiting resources. Specifically, it argues that the condensation of a single population into two reproductively isolated clusters, coexisting in different niches, is a process distinct from the traditional view of reinforcement and that this process is much more likely to occur in parapatry than in sympatry.

Selection against hybrids is essential to the definition of biological species. Alleles will be eliminated from a population if they find themselves in unfit heterozygotes or recombinants, and similarly, sexual selection will act against alleles that make males unattractive, or make it harder for females to find a mate. The fundamental opposition of selection to the evolution of reproductive isolation has long been seen as the major obstacle to speciation (Darwin 1859, ch. 8; Huxley 1860; Mayr 1963, ch. 17); in Wright's (1932) metaphor of an ‘adaptive landscape’, reproductive isolation corresponds to a valley of reduced mean fitness that cannot be crossed by selection alone. This view motivated a variety of models in which random drift overcomes selection, to knock populations onto new fitness peaks: various models of founder-effect speciation (Mayr 1963, ch. 17; Carson & Templeton 1984), chromosomal speciation (Wright 1941; White 1978) and Wright's (1932) ‘shifting balance’.

In fact, selection will not oppose the evolution of reproductive isolation if that isolation is not expressed during divergence. Separate populations will diverge, even if they experience identical environments: they will not fix the same alleles from the ancestral population and will not pick up the same set of mutations. So, two populations will come to differ at many sites, some fraction of which will affect fitness. Novel genotypes will be produced from crosses between populations that have been separated by only a few thousand generations. A priori, we expect that genotypes that have never been tested by selection would have lower fitness, on average, and that the average hybrid fitness would decrease with divergence. It is remarkable that, in fact, organisms that differ by thousands of amino acid substitutions often freely hybridize, and that even where they do not, relatively few incompatibilities may be responsible for hybrid unfitness (Orr & Turelli 2001). The slow accumulation of reproductive isolation reflects the robustness of organisms to genetic change.

The first explicit models in which reproductive isolation evolves as a side effect of divergence were made by Poulton (1903) and Bateson (1909), and rediscovered by Dobzhansky (1937) and Muller (1942); see Orr (1996). Two alleles arise, one in each lineage, and although each allele is favourable on the ancestral background, they are incompatible with each other. Orr (1995) and Orr & Turelli (2001) generalized these models by supposing that there is some small probability that any pair of alleles will substantially reduce fitness. (Note that there may be incompatibilities between derived alleles in different lineages, or between ancestral and derived alleles in the same lineage; the root could lie anywhere on the path connecting the present-day populations; figure 1.) Orr & Turelli (2001) assume that there is a highly skewed distribution of fitness effects, with a very small fraction of allelic combinations showing large detrimental effects. This is consistent with the results of Drosophila speciation genetics, but remains to be firmly demonstrated more generally, for a wider range of traits and taxa.
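As a rough illustration of the generalized model described above, the following sketch counts the expected number of pairwise incompatibilities when every pair of diverged loci has the same small, independent chance of being incompatible. The function name, the parameter names `k_substitutions` and `p_pair`, and the independence assumption are choices made for this example; it considers pairwise interactions only and does not model a distribution of fitness effects.

```python
from math import comb


def expected_incompatibilities(k_substitutions: int, p_pair: float) -> float:
    """Expected number of pairwise incompatibilities between two populations.

    Assumption (for illustration only): the populations differ at
    k_substitutions loci, and each of the k*(k-1)/2 pairs of diverged loci
    is independently incompatible with probability p_pair.
    """
    if k_substitutions < 0 or not 0.0 <= p_pair <= 1.0:
        raise ValueError("need k_substitutions >= 0 and 0 <= p_pair <= 1")
    # comb(k, 2) is the number of distinct pairs among k diverged loci.
    return p_pair * comb(k_substitutions, 2)


if __name__ == "__main__":
    for k in (10, 20, 40, 80):
        print(k, expected_incompatibilities(k, p_pair=1e-3))
```

Under these assumptions the expected count grows with the number of pairs, that is, roughly with the square of the number of substitutions, rather than linearly with divergence.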


Figure 1. (a) Mutations A and B arise in one lineage, and C and D in another. The descendant populations have genotype ABcd and abCD, and are separated by two Dobzhansky–Muller incompatibilities (DMIs), indicated by thin lines: allele A is incompatible with D, and B with C. Both incompatibilities are between two derived alleles. (b) If the ancestral genotype were ABcd, and four mutations occurred in one lineage (B → b, A → a, c → C, d → D), then the descendant populations would have the same genotypes, but both incompatibilities would be between a derived and an ancestral allele. All that has changed is the position of the root.

It is unfortunate that the term ‘Dobzhansky–Muller incompatibility’ (DMI) is used in a variety of ways: to refer to Dobzhansky and Muller's specific two-locus model; to Orr and Turelli's generalization of it; or to the broad idea that reproductive isolation need not be expressed during divergence. It is also not clear whether the term refers to the process of divergence (in which selection does not oppose the evolution of reproductive isolation) or to the outcome (in which a small number of incompatibilities are involved in hybrid breakdown). This paper uses the term broadly, to refer to an incompatibility that has evolved without ever having been expressed in the ancestral lineages, but avoids making any restrictive assumptions about the distribution of fitness effects.

In these models, although there is no direct selection for or against reproductive isolation, selection may nevertheless drive divergence in a variety of ways. There may be selection for different favourable mutations in a uniform environment, or different environments may fix different alleles. Both the physical and biological environments may differ, the latter including coevolution between host and pathogen, or selfish genetic elements. The isolation that ensues may be related to the process that was selected or may be due to an entirely different pathway (e.g. autoimmunity in plants (Bomblies & Weigel 2010), nucleoporins in Drosophila, melanomas in fish (Orr & Presgraves 2000)). Divergence might be due to random drift across a neutral adaptive landscape (Gavrilets & Gravner 1997; Gavrilets 2004), or in opposition to weak selection (e.g. compensatory evolution; Innan & Stephan 2001). However, we might expect strongly selected changes to be more likely to cause reproductive isolation as a by-product. It is striking that in all the examples of ‘speciation genes’ discovered to date, there is evidence for positive selection where this has been tested (mainly, by finding excess rates of amino acid substitution). Under the Dobzhansky–Muller model, whether selection drives biological speciation really depends primarily on the extent to which evolution in general is due to selection.

To some extent, it does not matter what causes divergence: the key issue is what fraction of allelic combinations cause incompatibility. However, the cause of divergence is relevant to the geography of speciation. If divergence is due to moderately strong selection, then it can occur despite gene flow. Thus, divergence in parapatry seems just as likely as in allopatry. With discrete demes, selection dominates if it is faster than migration (i.e. s ≫ m), and in a spatially extended population, selection must favour an allele over a spatial scale that is sufficiently large (Slatkin 1973). As can be seen from the many examples of local adaptation across narrow clines, gene flow does not prevent divergence in a heterogeneous environment. Parapatric divergence is more difficult in a uniform environment, but will still occur if the species covers a broad enough range. Then, favourable alleles will arise at different loci and spread through the range at the same time. If they prove to be incompatible with each other, then they will remain separated by a narrow cline that can be the nucleus for further divergence (figure 2; Kondrashov 2002; Navarro & Barton 2003). The rate of parapatric divergence depends on how long it takes for a favourable allele to sweep through the whole population: the longer it takes, the greater the opportunity for incompatible alleles to meet each other.
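The condition for discrete demes (selection faster than migration, s ≫ m) can be explored with a minimal sketch. The model below is an assumption made for illustration, not the formulation used in this paper: two demes, a haploid locus, an allele with advantage s in deme 1 and the alternative allele with the same advantage in deme 2, and symmetric migration at rate m. The function name and the default number of generations are likewise hypothetical.

```python
def two_deme_divergence(s: float, m: float, generations: int = 20000) -> float:
    """Difference in allele frequency between two demes after many generations.

    Illustrative assumptions: haploid selection, allele A has relative
    fitness 1 + s in deme 1, allele a has relative fitness 1 + s in deme 2,
    and a fraction m of each deme is exchanged every generation.
    """
    p1 = p2 = 0.5  # frequency of allele A in deme 1 and in deme 2
    for _ in range(generations):
        # Selection within each deme (A favoured in deme 1, a in deme 2).
        q1 = p1 * (1 + s) / (1 + s * p1)
        q2 = p2 / (p2 + (1 - p2) * (1 + s))
        # Symmetric migration between the demes.
        p1 = (1 - m) * q1 + m * q2
        p2 = (1 - m) * q2 + m * q1
    return p1 - p2


if __name__ == "__main__":
    print("s >> m:", two_deme_divergence(s=0.1, m=0.001))
    print("s << m:", two_deme_divergence(s=0.001, m=0.1))
```

In this toy model the demes stay strongly differentiated when s is much larger than m, and become nearly identical in allele frequency when migration is much faster than selection.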


Figure 2. DMIs can accumulate in parapatry as well as in allopatry. (a) Alleles A and D arise at different places, and at different loci, and both begin to spread. (b) If they meet, and are incompatible, then they will remain separated by a stable pair of clines (double lines). New alleles may then arise (B, C); if these are incompatible with each other, or with one of the alleles that are already established, then they will strengthen the isolation, leading to a set of four clines.

Selection plays a much more direct role in the evolution of sympatry—i.e. in determining how divergent populations come to coexist. Yet, this question has been somewhat neglected, relative to the evolution of reproductive isolation. This may be partly because biological species are defined by reproductive isolation, so that on this definition, speciation is identical with the evolution of isolation. It may also be because sympatry requires ecological divergence, so that understanding its evolution depends on combining genetics with ecology and on work in the field rather than the laboratory. Yet, sympatry is essential for the long-term survival of species: otherwise, a species' range will be fragmented into ever smaller areas, until extinction is inevitable. Examples of parapatric distributions, such as chromosomal races in rodents, represent a balance between the accumulation of partial reproductive isolation in parapatry, and the extinction of local races (Patton & Sherwood 1983; Searle 1993).

Alternative combinations of alleles can coexist in sympatry if they use different limiting resources and if they are, to some degree, reproductively isolated, so that recombination is reduced. Neither ecological divergence nor reproductive isolation need be complete, but the net strength of these two factors must exceed some threshold if disruptive selection, favouring the alternative genotypes, is to overcome recombination. (Indeed, in simple models, the sum of these factors is precisely what determines whether coexistence is possible (Udovic 1980; Gavrilets 2004, and see below).)

Recombination can be reduced by assortative mating, by preference for different niches and mating within niches, by selection against intermediate genotypes, or by chromosomal rearrangements. Divergence to use different resources requires some combination of preference and of specialization to better exploit particular resources. So, we need to understand the joint evolution of ecological divergence, postzygotic isolation and prezygotic isolation to give clusters that can ultimately continue their divergence to give full biological species with no gene flow.

This paper refers to ‘habitats’ in a broad sense, to mean the exploitation of a distinct limiting resource; it need not imply a physical location or microhabitat, but could include, e.g. mimicry of different unpalatable model species.

This paper analyses the transition to sympatry by using a generalization of the Levene (1953) model, which assumes soft selection in two habitats; i.e. each habitat produces a fixed proportion of the population, regardless of the number that choose to live in each, or the proportion that survive there. Assuming soft selection is a drastic simplification. Ideally, one would allow for separate density-dependent regulation in each habitat, with survival being a decreasing function of numbers. This would allow study of extinction and range limits, but makes the analysis substantially more complicated.
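To make the soft-selection assumption concrete, here is a minimal sketch of one generation of a Levene-type model. It is deliberately simpler than the generalization analysed in this paper: it assumes a haploid population with two genotypes (labelled A and a), no habitat preference, and fixed habitat contributions `c`. The function names and the example viabilities are hypothetical.

```python
from typing import Sequence


def levene_step(p: float, c: Sequence[float],
                v_A: Sequence[float], v_a: Sequence[float]) -> float:
    """Frequency of genotype A after one generation of soft selection.

    Illustrative assumptions: haploid genotypes A and a settle in the
    habitats at random; habitat i contributes a fixed proportion c[i] of
    the next generation, whatever the survival within it; v_A[i] and
    v_a[i] are the viabilities of A and a in habitat i.
    """
    # Within each habitat, A's share of survivors is weighted by viability;
    # the habitat's total output is then fixed at c[i] (soft selection).
    return sum(
        ci * p * wa / (p * wa + (1 - p) * wb)
        for ci, wa, wb in zip(c, v_A, v_a)
    )


def iterate_levene(p0: float, c: Sequence[float], v_A: Sequence[float],
                   v_a: Sequence[float], generations: int = 500) -> float:
    """Apply levene_step repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = levene_step(p, c, v_A, v_a)
    return p


if __name__ == "__main__":
    # Two symmetric specialists: A does well in habitat 0, a in habitat 1.
    print(iterate_levene(0.1, c=(0.5, 0.5), v_A=(1.0, 0.5), v_a=(0.5, 1.0)))
```

With these symmetric example values, each specialist increases when rare, so the two genotypes are maintained together; with other values of `c` and the viabilities one genotype can instead be lost.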

The Levene (1953) model is the basis for a variety of models of sympatric speciation (Maynard Smith 1966; Maynard Smith & Hoekstra 1980; Felsenstein 1981; Diehl & Bush 1989; see Gavrilets 2004, for a review). In particular, Geritz et al. (1998) and Geritz & Kisdi (2000) give an ‘adaptive dynamics’ analysis which is close to that set out here: they assume that viabilities in each niche are a Gaussian function of an underlying trait, and assume a single locus in a diploid sexual population. Recently, Nagylaki (2009) and Bürger (in press) have made a detailed analysis of the multilocus Levene model, but assuming no epistasis.

This paper will lay out the model in a slightly more general way that allows multiple loci to interact in determining habitat preference and viability in each habitat. It will first describe the evolutionarily stable strategy (ESS) for preference and for viability separately, then show how genes for either may couple together in a single population and, finally, extend the analysis to the parapatric case, to ask how initially disjunct populations come to coexist.


(a) The possible viabilities in each niche are limited by a trade-off curve (equation (2.3)). For β = 1, there is a linear trade-off (v0 + v1 ≤ 1; middle line), and any distribution on this line that has mean v̄ = 1 is an ESS. For β = 0.5, the trade-off curve is convex (upper curve), and there is a unique monomorphic ESS for viability (shown at upper right, for c0 = c1 = 0.5). Conversely, for β = 2, the trade-off curve is concave, and there is a unique polymorphic ESS, with a mixture of specialist genotypes (dots at upper left, lower right). (b) If both preference and viability vary, and if there is no recombination between them, then what matters is the constraint on the product α_γ v_γ. The curves here show the combined trade-off (equation (2.5)) for the viabilities shown in (a), and habitat preference (α0 + α1 = 1; equation (2.4)). These are concave, for all β, and so the ESS is always for a polymorphism between two extreme specialists (dots).