Why Have Sex?
The evolution of anisogamy
Here, I replicate Maire et al.’s 2001 paper, which established that anisogamy can evolve without requiring either pre-existing mating types or very large mutations. I give a gentle explanation of the dynamics of evolutionary branching—how attractor points can be fitness minima rather than maxima—using an intuitive case and a geometric / visual explanation of the mathematics. I find that the sigmoid model of the original paper is significantly less robust than claimed: very large zygotes must have at least a 91% chance of surviving to reproduce, rather than at least a 50% chance of doing so. However, a cleaner parameterisation of the survival function removes the dependency on this probability altogether, making the model much more robust and realistic. You can play with the simulation here.
Introduction
A male organism is one with a reproductive function of fathering offspring, and a female organism is one with a reproductive function of mothering offspring.12 In the relevant sense, to father is to contribute a small gamete (like a sperm), and to mother is to contribute a large gamete (like an egg).3
Okay, but hang on: size is obviously a spectrum. If the female/male distinction is ultimately based on gamete size, does this mean that sex ultimately rests on a spectrum? Can we draw a clear line between small and large gametes?
In some species, the answer is no. A species is iso-gamous (‘same-gamete’) if its gametes come in only one size. The distinction between male and female gametes—which gives rise to the distinction between male and female organisms—is only well-defined under conditions of an-iso-gamy (‘not-same-gamete’), when its gametes come in two very distinct sizes.4
But why should we ever expect anisogamy to evolve? That is, why have sex?
The Informal Story
Before breaking out the equations, pictures, and videos, the story can be told pretty informally. It rests on two key assumptions:
(1) Larger zygotes (formed from two gametes) are likelier to survive.
(2) Smaller gametes can be produced more prolifically.
This gives rise to a trade-off between quality and quantity. Producing larger gametes optimises for quality at the expense of quantity, while producing smaller gametes optimises for quantity at the expense of quality. Producing middle-sized gametes can be the worst of both worlds: there aren’t that many of them, and they aren’t that likely to survive. The practical consequence of this trade-off is that some individuals optimise hard for quantity, while others optimise hard for quality.
Okay. If that’s right, then under some conditions, anisogamy is more optimal than isogamy. But that doesn’t solve the puzzle of how a species could go from isogamy to anisogamy.
Let’s start with a parable. Suppose that a species of bird feeds on nuts (which larger beaks make easier to eat) and berries (which smaller beaks make easier to eat). Suppose the species starts off with very large beaks. Then a mutant gene which slightly increases beak size will make individuals that carry it slightly better at cracking nuts, but slightly worse at picking berries. Since there’s more competition for their favored food (as normal individuals already have very large beaks), these mutants have a harder time than normal individuals, and so the mutation will spread at a slower rate than the normal gene.
On the other hand, a mutant gene which slightly decreases beak size will make individuals that carry it slightly worse at cracking nuts, but slightly better at picking berries. Since there’s less competition for their favored food (as normal individuals have very large beaks), these mutants have a slightly easier time than normal individuals, and so the mutation will spread at a faster rate than the normal gene.
Since mutations which increase beak size are relatively less fit, and mutations which decrease beak size are relatively more fit, the species will evolve to have smaller beaks.
But suppose, on the other hand, that the species starts off with very small beaks. Then the story flips: mutations which increase beak size are relatively more fit (since they make you better at cracking nuts than picking berries, and there’s less competition for nuts when most individuals have small beaks), and mutations which decrease beak size are relatively less fit (since they make you better at picking berries, and there’s more competition for berries when most individuals have small beaks). So, the species will evolve to have larger beaks.
This means that there’s some intermediate beak size which is an attractor point: if the species has a beak larger than this size, its beak will get smaller; if it has a beak smaller than this size, its beak will get larger. What happens, though, once the species reaches this attractor point? Will it necessarily stay there?
Suppose the species has a beak size at this intermediate attractor point. Individuals are not very good at cracking nuts, but also not very good at picking berries. Mutations which increase beak size allow their carriers to outcompete others at cracking nuts, and mutations that decrease beak size allow their carriers to outcompete others at picking berries. It’s possible that specialising in either direction is better than specialising in neither direction. That is, both larger-beak mutations and smaller-beak mutations will spread at a faster rate than the normal individuals. This pulls the population in two directions at once, so there’s a subpopulation of large-beaked nut-eaters and another subpopulation of small-beaked berry-eaters. Although each subpopulation will be off of the attractor point, they won’t be pulled back towards it: a mutant berry-eater with a slightly larger beak than usual won’t be able to compete at all with the specialised nut-eaters, and a mutant nut-eater with a slightly smaller beak than usual won’t be able to compete at all with the specialised berry-eaters. So, the branching is stable.
So, more abstractly, it’s possible that an attractor point is nevertheless a fitness minimum. Populations will be pulled towards the attractor point, but once they reach it, they might branch into two subpopulations that are each able to keep the other from being pulled towards the attractor point. (Of course, this means that if one subpopulation somehow gets wiped out, then the other will evolve back towards the attractor point, and then branch again.)5 We’d like to construct a formal model under which a similar thing happens with gamete size: no matter where the population starts, it will come to produce mid-sized gametes, and then branch into a subpopulation which produces many small gametes and another which produces few large gametes.
A Formal Model
To be more precise, one needs to bring out some math (following Maire et al. 2001). If you skim the following section, I’ll trust that it’s because I was just so compelling above.
Anyway, suppose that we start off with some isogamous population. The size of the gametes an individual produces is determined by the two copies of a gene it carries. Since the population is isogamous, all individuals carry an equivalent version x of the gene, which says to produce gametes of size x. Since all individuals have genotype (x, x), they produce gametes of size x that carry the x gene. Two x-gametes of size x combine into an (x, x) zygote of size x + x.
Now, suppose that there is some very rare mutant version y of the gene determining gamete size. An (x, y) individual will produce gametes of size ½(x + y), averaging out the instructions to produce gametes of size x and to produce gametes of size y. Half of these gametes will have a copy of the x gene, and the other half will have a copy of the y gene. (We may suppose that the mutation is so rare that (y, y) individuals will not be produced.)
Okay. Suppose that each individual has a total amount A of material with which to produce gametes, and the chance a zygote survives is a function S of its size z. What is the relative fecundity of the y mutation? That is, what is the ratio of the fecundity f(y) of y to the fecundity f(x) of x, when most copies of the gene are of type x?
Consider some particular x gene. It’s most likely on an (x, x) individual—let’s say that it’s the copy on the right—which spends A resources producing gametes of size x, and so produces A / x gametes. Half of these carry the x gene that was on the right. Each of these most likely combines with another x-gamete of size x to form a zygote of size x + x, which then has an S(x + x) chance of survival. So, each x gene begets approximately this many copies of itself in the next generation.
f(x) = (A / x) ⋅ ½ ⋅ S(x + x)
Now, consider some particular y gene. It’s most likely on an (x, y) individual, which spends A resources producing gametes of size ½(y + x), and so produces A / [½(y + x)] gametes. Half of these carry the y gene at which we’re looking. Each of these most likely combines with an x-gamete of size x to form a zygote of size ½(y + x) + x, which then has an S(½(y + x) + x) chance of survival. So, each y gene begets approximately this many copies of itself in the next generation.
f(y) = (A / [½(x + y)]) ⋅ ½ ⋅ S(x + ½(x + y))
Assuming a fixed total population, what matters is the relative fitness R(y, x) of the y gene with respect to the x gene: the rate at which y gametes beget themselves, divided by the rate at which x gametes beget themselves, when almost all copies of the gene are of type x. It is easy to check that this works out to be the following.
R(y, x) = f(y) / f(x) = 2x ⋅ S(1.5x + 0.5y) / ((x + y) ⋅ S(2x))
The important thing to notice is that the A has cancelled itself out, and so won’t affect things going forward.
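The cancellation is easy to confirm numerically. Below is a minimal sketch that computes f(x) and f(y) directly and compares their ratio with the closed form for R(y, x). The function names and the `placeholder_survival` curve are my own illustrative assumptions; any survival function that is positive at the sizes tested would do.

```python
import math

def placeholder_survival(z):
    # Hypothetical stand-in for S (not the survival function used later).
    return 1.0 - math.exp(-z)

def f_resident(x, A, S):
    # An x gene on an (x, x) individual: A / x gametes, half carry this copy,
    # and each fuses with another x-gamete into a zygote of size x + x.
    return (A / x) * 0.5 * S(x + x)

def f_mutant(y, x, A, S):
    # A y gene on an (x, y) individual: gametes have size (x + y) / 2,
    # half carry the y copy, and each fuses with a resident x-gamete.
    gamete = 0.5 * (x + y)
    return (A / gamete) * 0.5 * S(gamete + x)

def relative_fitness(y, x, S):
    # Closed form: R(y, x) = 2x S(1.5x + 0.5y) / ((x + y) S(2x)); no A appears.
    return 2.0 * x * S(1.5 * x + 0.5 * y) / ((x + y) * S(2.0 * x))

x, y = 0.5, 0.6
ratios = [
    f_mutant(y, x, A, placeholder_survival) / f_resident(x, A, placeholder_survival)
    for A in (1.0, 10.0, 1000.0)
]
closed_form = relative_fitness(y, x, placeholder_survival)
print(ratios, closed_form)
```

Whatever budget A is chosen, the three ratios agree with the closed form up to floating-point error.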
Now, let’s consider what the survival function S might look like. We want some positive size threshold below which zygotes are nonviable (after all, you can’t have subatomic zygotes), and we don’t want to hit this threshold too sharply. A functional form that fits these needs is as follows. It is a variant of the one in Maire et al. 2001, which I discuss in a footnote.6
S(z) = c ⋅ (1 – k ⋅ exp{ –z² })
Here, c is the ceiling on survival probability, which gets approached as zygotes get arbitrarily large. There’s always an effective ceiling of 100%, however, which gets hit when c is greater than 1. Meanwhile, k is a smoothness parameter that effectively determines where exactly the minimum size threshold is (it sits at z = √(ln k), where S(z) = 0), and so how sharply it gets hit. To maintain a positive minimum viability threshold, k must be larger than 1, and to maintain a smooth approach to that threshold, k must be smaller than 1.1.
For example, we might let c = 95%, and k = 1.05. That produces a graph which looks something like this (the size units are arbitrary).
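As a rough sketch of this curve in code: the clipping into [0, 1] is my own assumption about how to treat sizes below the threshold (where the raw formula goes negative) and ceilings above 100%, and the names `survival` and `viability_threshold` are illustrative.

```python
import math

def survival(z, c=0.95, k=1.05):
    # S(z) = c * (1 - k * exp(-z^2)), clipped into [0, 1] (assumed convention).
    raw = c * (1.0 - k * math.exp(-z * z))
    return min(1.0, max(0.0, raw))

def viability_threshold(k):
    # S(z) = 0 exactly when k * exp(-z^2) = 1, i.e. z = sqrt(ln k).
    return math.sqrt(math.log(k))

threshold = viability_threshold(1.05)
for z in (0.1, threshold, 0.5, 1.0, 2.0, 4.0):
    print(f"z = {z:.3f}  S(z) = {survival(z):.4f}")
```

With k = 1.05 the threshold comes out at a zygote size of roughly 0.22, and survival climbs towards the 95% ceiling as z grows.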
Plugging in such a survival function, our relative fitness function becomes this.
R(y, x) = 2x ⋅ (1 – k ⋅ exp{ –(1.5x + 0.5y)² }) / ((x + y) ⋅ (1 – k ⋅ exp{ –(2x)² }))
Notice that c has canceled itself out! So the shape of the relative fitness landscapes depends solely on our smoothness constant k. When k = 1.05, as we had, the fitness landscape looks something like this. The x-axis is the size for which the normal x-gene codes, the y-axis is the size for which the mutant y-gene codes, and the z-axis is the relative fitness of the mutant y-gene versus the normal x-gene.
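A small sketch to check that c really does drop out: it evaluates R(y, x) on a coarse grid for two very different ceilings and reports the largest discrepancy. The grid values and function names are arbitrary choices of mine, and the unclipped formula is used, so this assumes c ≤ 1 and zygote sizes above the viability threshold.

```python
import math

def survival(z, c, k):
    # Unclipped S(z) = c * (1 - k * exp(-z^2)); assumes c <= 1 and viable z.
    return c * (1.0 - k * math.exp(-z * z))

def relative_fitness(y, x, c, k=1.05):
    # R(y, x) = 2x S(1.5x + 0.5y) / ((x + y) S(2x))
    return (2.0 * x * survival(1.5 * x + 0.5 * y, c, k)
            / ((x + y) * survival(2.0 * x, c, k)))

sizes = (0.2, 0.3, 0.6, 1.0)
grid = [(x, y) for x in sizes for y in sizes]
max_gap = max(
    abs(relative_fitness(y, x, 0.95) - relative_fitness(y, x, 0.10))
    for x, y in grid
)
print(max_gap)  # floating-point noise only
```

Even a ceiling as low as 10% gives the same landscape, which is the sense in which only k matters.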
Notice that when x is fixed at some large value, z increases when y decreases (see the faraway horizontal blue lines, which curve up and to the right). Inversely, when x is fixed at some very small value, z increases up to a peak, before decreasing again, as y increases (see the closest horizontal blue line).
Now, suppose that y only mutates a little bit away from x. Then, we stay approximately along the red line where y = x. (Note that the perspective has changed, and we have zoomed in.)
As one should expect, z = 1 at every point along the red line y = x: the relative fitness of the y mutant versus the normal x-gene, when y is actually still equivalent to x, should be exactly 1. Now, our relative fitness landscape has an interesting property. When x is fixed to the right of the middle blue line, then z > 1 when y is a bit smaller than x, but z < 1 when y is a bit larger than x (see the blue line on the right). That is, if x is large, then mutations which decrease gamete size will be fitter than usual. Thus, the population’s gamete size will slowly evolve downwards towards the middle blue line. Inversely, when x is fixed to the left of the middle blue line, then z > 1 when y is a bit larger than x, but z < 1 when y is a bit smaller than x (see the blue line on the left). That is, if x is small, then mutations which increase gamete size will be fitter than usual. Thus, the population’s gamete size will slowly evolve upwards towards the middle blue line. We’ve found an attractor state: wherever the population starts, it will slowly evolve towards the middle blue line.
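To see the pull towards the middle line numerically, here is a deterministic caricature (not the simulation discussed later): gamete size x is nudged in whichever direction nearby mutants are fitter, in proportion to the slope of R in the y direction at y = x. The closed-form slope comes from differentiating R(y, x) with the survival function above; the step size `rate`, the number of steps, and the starting points are hypothetical choices of mine.

```python
import math

K = 1.05  # smoothness parameter from the text

def selection_gradient(x, k=K):
    # dR/dy evaluated at y = x, for S(z) = c * (1 - k * exp(-z^2)).
    # Valid only while the zygote size 2x is above the viability threshold.
    e = k * math.exp(-4.0 * x * x)
    return -1.0 / (2.0 * x) + 2.0 * x * e / (1.0 - e)

def evolve(x, steps=20000, rate=0.001):
    # Move the resident gamete size a small step along the selection gradient.
    for _ in range(steps):
        x += rate * selection_gradient(x)
    return x

from_above = evolve(1.0)   # start with large gametes
from_below = evolve(0.2)   # start with small (but viable) gametes
print(from_above, from_below)
```

Both runs settle at the same gamete size, a little under 0.3 in these units.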
Formally, write Ry for the partial derivative of R with respect to y, and Ry↾y=x for its values along the line y = x. We’ve found a constant p where Ry (evaluated at y = x) is negative along x = p + ε and positive along x = p – ε, for any sufficiently small positive ε. Equivalently:
p is an attractor iff (Ry↾y=x)(p) = 0 and (Ry↾y=x)′(p) < 0
But what happens when the population reaches the attractor state? Curiously, the middle blue line curves upwards in both directions: z > 1 both when y is a bit smaller than x and when y is a bit larger than x. So, our attractor state is actually a fitness minimum! Mutations in either direction are relatively more fit—so the population gets pulled apart, evolving in both directions. We thus have evolved anisogamy.
Formally:
p is a fitness minimum iff (Ry↾y=x)(p) = 0 and (Ryy↾y=x)(p) > 0.
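Both conditions can be checked numerically. The sketch below uses finite differences on R(y, x) with k = 1.05, locates p by bisection, and then estimates the slope of Ry along y = x and the second derivative Ryy at p. The step sizes, bracketing interval, and helper names are my own assumptions.

```python
import math

K = 1.05

def S(z, k=K):
    # Survival with c omitted, since c cancels in R.
    return 1.0 - k * math.exp(-z * z)

def R(y, x):
    return 2.0 * x * S(1.5 * x + 0.5 * y) / ((x + y) * S(2.0 * x))

def R_y(x, h=1e-5):
    # Central difference for dR/dy at y = x.
    return (R(x + h, x) - R(x - h, x)) / (2.0 * h)

def R_yy(x, h=1e-4):
    # Central difference for the second derivative in y at y = x.
    return (R(x + h, x) - 2.0 * R(x, x) + R(x - h, x)) / (h * h)

def bisect(f, lo, hi, tol=1e-10):
    # Assumes f(lo) > 0 > f(hi) with a single sign change in between.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p = bisect(R_y, 0.15, 1.0)
slope_along_diagonal = (R_y(p + 1e-4) - R_y(p - 1e-4)) / 2e-4
curvature_in_y = R_yy(p)
print(p, slope_along_diagonal, curvature_in_y)
```

A negative `slope_along_diagonal` is the attractor condition, and a positive `curvature_in_y` is the fitness-minimum condition; with k = 1.05 both hold at the same p.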
It’s now clear how attractor states can nevertheless be fitness minima: the directional derivative of Ry can be negative along the direction y = x but positive along the direction of the y-axis. This also suggests the formal possibility of repellors which are nevertheless fitness maxima: states which you never want to reach, but you’d also never want to leave.
A Simulation
Above, we made some idealising assumptions: for instance, we ignored the presence of mutants when calculating relative fitness (which makes the model outright inconsistent, since we also relied on their presence). But worry not: we can watch anisogamy evolve in simulation! The x-axis is gamete size; top is the histogram of the current generation, while below is the history of generations.
Notice how the population first evolves downwards towards the attractor, and then gets pulled apart, into three groups (homozygous proto-males with two smaller-size genes, heterozygous proto-females with one smaller-size gene and one larger-size gene, and homozygous proto-superfemales with two larger-size genes), eventually collapsing into anisogamy with two sexes, as the proto-superfemales die off. This is because the males eventually produce so many gametes that it becomes extremely improbable that two gametes from females, both carrying larger-size genes, can meet to produce a superfemale.7 Note that the axis displaying gamete size is evenly spaced on a logarithmic scale (and so is also evenly spaced in terms of how many gametes are produced, where further left is more: a male produces about ten thousand gametes for every gamete that a female produces). However, the colour scaling is roughly linear in size. The video is about 40 seconds long.
(Thanks to my friend Claude for implementing this! I’ve hosted the simulation here, if you want to play with it.)
Recall our survival function from before: below around 0.2, zygotes are nonviable. Since the males stabilised well below gamete size 0.1, two male gametes cannot combine into a viable zygote. The males are completely dependent on the females to reproduce. And although two female gametes can also combine in principle, there are just so many male gametes floating around that two female gametes will almost never find one another. As Parker, Baker, and Smith put it in their seminal (sorry) paper that pioneered models like this, …
A bit harsh!
Thus, an organism with both reproductive functions is both male and female. Some plants are both at the same time (simultaneous hermaphrodites), and some fish are each in sequence (sequential hermaphrodites). With a few exceptions, humans are not hermaphroditic. You might have heard the claim that 2% of humans are born intersex, but this is misleading terminology: such individuals are biologically either male or female. The real number of plausible exceptions is closer to two, each due to exceptionally rare chimerism (where it’s not even obvious that the individual shouldn’t count as two organisms).
There are some apparent counterexamples: for instance, female honeybees (other than the queen) are sterile, so don’t seem to have the reproductive function of mothering offspring. I think these can be dealt with, though, without overcomplicating the account.
I’m actually not entirely convinced that anisogamy is the right way to draw the line. For suppose that smaller gametes make the resultant zygotes healthier, but larger ones can be produced in higher quantity. Then the large-gamete producers are the ones that optimise for quantity, and the small-gamete producers are the ones that optimise for quality. In this case, I think we’d want to say that the large-gamete producers are male, and the small-gamete producers are female. One way of stating the underlying generalisation—which might be what’s behind such intuitions, perhaps sexistly—is that to mother is to ensure the quality of offspring, and to father is to ensure the quantity of offspring. But such a generalisation seems to give the wrong results in many cases—for instance, in bees again—and tuning the generalisation to overcome such problems looks like bad overfitting.
There is another feature that often co-occurs with anisogamy: mating types. Each gamete has a type, and two gametes can only combine into a zygote if they are of different types. Some mushrooms have tens of thousands of mating types. But if a species has two mating types which are nevertheless of the same size, then these types don’t correspond to male and female. Further, we can predict the emergence of psychological and physical differences between individuals who optimise for producing many low-quality gametes, on the one hand, and individuals who optimise for producing few high-quality gametes, on the other hand.
If we want to reason verbally about the case of gamete size, it might look something like this. Larger gametes are healthier, but smaller gametes are easier to make. If a species produces very large gametes, then individuals who produce slightly smaller gametes will be able to produce more gametes than usual, most of which will join up with a very large gamete to produce a healthy zygote anyway. So, gamete size will evolve downwards. Meanwhile, if a species produces very small gametes, then individuals who produce slightly larger gametes will produce slightly fewer gametes than usual, but most of them will join up with a very small gamete to produce a zygote which is healthier than usual. If the latter effect is large enough to compensate for the former, gamete size will evolve upwards. So, there will be some intermediate gamete size which is an attractor point. But such gametes of this size are neither very healthy nor very easy to make. It’s possible that mutants which optimise for gamete quality and mutants which optimise for gamete quantity will be better off—just like the case of the nut-eaters and berry-eaters—and so the population will be pulled into a subpopulation of proto-females (which produce a few large gametes) and proto-males (which produce many small gametes). A mutant proto-male which produces slightly larger gametes than usual won’t be able to compete with the proto-females on gamete quality, and a mutant proto-female which produces slightly smaller gametes than usual won’t be able to compete with the proto-males on gamete quantity, so each subpopulation is kept off of the attractor point by the other.
Maire et al. use the following for their sigmoid.
S(z) = c – exp{ –z² }
They claim that 50% < c < 100% suffices to yield an attractor which is a fitness minimum. However, one actually needs 91% < c < 100%, a much tighter range. But by separating out the smoothness parameter k from the ceiling parameter c, we can allow any c whatsoever, as it cancels itself out. This is important, because in realistic cases even the largest zygotes might have low odds of surviving to reproduce!
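The 91% figure can be probed with a short numerical sketch. For each ceiling c it finds the gamete size at which the slope of R in the y direction vanishes along y = x, then evaluates the second derivative in y there; a positive value means a fitness minimum. The derivative formulas are my own, obtained by differentiating R(y, x) with S(z) = c – exp{ –z² }, and the search brackets are assumptions.

```python
import math

def singular_point(c):
    # Gamete size where dR/dy = 0 along y = x, for S(z) = c - exp(-z^2).
    def gradient(x):
        z = 2.0 * x
        e = math.exp(-z * z)
        return -1.0 / (2.0 * x) + 0.5 * (2.0 * z * e) / (c - e)
    lo = 0.5 * math.sqrt(math.log(1.0 / c)) + 1e-3  # just above viability
    hi = 5.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gradient(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def curvature_at_singular_point(c):
    # Second derivative of R in y at y = x = p; positive means fitness minimum.
    x = singular_point(c)
    z = 2.0 * x
    e = math.exp(-z * z)
    s, s1, s2 = c - e, 2.0 * z * e, (2.0 - 4.0 * z * z) * e
    return s2 / (4.0 * s) - s1 / (2.0 * x * s) + 1.0 / (2.0 * x * x)

# Bisect on c for the ceiling at which the curvature changes sign.
lo_c, hi_c = 0.5, 0.99
for _ in range(60):
    mid_c = 0.5 * (lo_c + hi_c)
    if curvature_at_singular_point(mid_c) > 0:
        hi_c = mid_c
    else:
        lo_c = mid_c
critical_c = 0.5 * (lo_c + hi_c)
print(critical_c)
```

Under these assumptions the sign change lands at a ceiling of about 0.91. That matches the k < 1.1 bound in the main text, since Maire et al.’s form is the same curve with k = 1/c.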
This sigmoid is basically an upside-down bell curve. A variant of a more classic sigmoid (to get a small positive survivability threshold which isn’t approached too sharply) is as follows.
(1 + exp{ –m(x – 1) })⁻¹ – 0.05
For some values of m (between around 3.0 and 3.7), this exhibits branching; for m around 3.3, it even exhibits two branching points with a repeller in between.
How things stabilise also depends on the minimum viable gamete size. (Recall that our discussion for the survival function was about the minimum viable zygote size.) Here, the minimum viable gamete size was set to 0.0001, implemented by restricting the minimum size for which the gamete size gene can code. But, notably, if one sets this to 0.1 instead (which is still less than half of our minimum viable zygote size), the population stabilises with three sexes. This is because the smallest gametes aren’t so relatively prolific that the large gametes carrying the large variant effectively never meet, as happens with 0.0001. In any case, the males crowd around this threshold; they have two copies of the ‘small’ gene variant. Meanwhile, the females have one copy of the ‘small’ gene and one of the ‘large’ gene. Even with no maximum size, where the females stabilise depends on how it’s optimal to resolve the quality/quantity trade-off, because eventually there are diminishing returns to quality (because there is a ceiling, possibly of 100%, and once one is very close to the ceiling it’s better to just produce more gametes).





