diff --git a/chapter-05.tex b/chapter-05.tex index 540d983..bdd8093 100644 --- a/chapter-05.tex +++ b/chapter-05.tex @@ -687,6 +687,10 @@ \subsection{Migration--selection balance} migration-selection balance (at least under strong selection) is analogous to mutation selection balance.\\ +We can use this same model by analogy for the case of +migration-selection balance in a diploid model, in that case we replace +our haploid $s$ by the cost to heterozygotes $hs$. + \begin{tcolorbox} \begin{question} You are investigating a small river population of sticklebacks, which receives infrequent migrants from a very large marine population. At a set of (putatively) neutral biallelic markers the freshwater population has frequencies: diff --git a/chapter-06.tex b/chapter-06.tex index bfb885a..2843b32 100644 --- a/chapter-06.tex +++ b/chapter-06.tex @@ -24,6 +24,7 @@ \subsection{Stochastic loss of strongly selected alleles} P_i= \frac{(1+s)^i e^{-(1+s)}}{i!} \end{equation} + Consider starting from a single individual with the selected allele, and ask about the probability of eventual loss of our selected allele starting from this single copy ($p_L$). To derive this we'll make use of a @@ -33,16 +34,18 @@ \subsection{Stochastic loss of strongly selected alleles} \begin{enumerate} \item In our first generation with probability $P_0$ our individual leaves no copies of itself to -the next generation, in which case our allele is lost. +the next generation, in which case our allele is lost (Figure \ref{fig:Proof_of_pL_2s}A). \item Alternatively it could leave one copy of itself to the next generation (with probability $P_1$), in which -case with probability $p_L$ this copy eventually goes extinct. +case with probability $p_L$ this copy eventually goes extinct (Figure \ref{fig:Proof_of_pL_2s}B). \item It could leave two copies of itself to the next generation (with probability $P_2$), in which -case with probability $p_L^2$ both of these copies eventually goes extinct. 
+case with probability $p_L^2$ both of these copies eventually goes +extinct (Figure \ref{fig:Proof_of_pL_2s}C). \item More generally it could leave could leave $k$ copies ($k>0$) of itself to the next generation (with -probability $P_k$), in which case with probability $p_L^k$ all of these copies eventually go extinct. +probability $P_k$), in which case with probability $p_L^k$ all of +these copies eventually go extinct (e.g. Figure \ref{fig:Proof_of_pL_2s}D). \end{enumerate} summing over this probabilities we see that \begin{eqnarray} @@ -78,6 +81,13 @@ \subsection{Stochastic loss of strongly selected alleles} probability of being lost when it is first introduced into the population by mutation. \\ +\begin{figure} +\begin{center} +\includegraphics[width=\textwidth]{figures/Proof_of_pL_2s} +\end{center} +\caption{} \label{fig:Proof_of_pL_2s} +\end{figure} + %%consider reparameterizing 1+(1-hs)s We can also adapt this result to a diploid setting. Assuming that heterozygotes for the $1$ allele have $1+(1-h)s$ children, the diff --git a/figures/Proof_of_pL_2s.pdf b/figures/Proof_of_pL_2s.pdf new file mode 100644 index 0000000..2642f5a Binary files /dev/null and b/figures/Proof_of_pL_2s.pdf differ diff --git a/figures/additive_effect.pdf b/figures/additive_effect.pdf index 2157cb6..5393c0a 100644 Binary files a/figures/additive_effect.pdf and b/figures/additive_effect.pdf differ diff --git a/figures/additive_effect_OverDom.pdf b/figures/additive_effect_OverDom.pdf index 4211cc7..ff8dc3f 100644 Binary files a/figures/additive_effect_OverDom.pdf and b/figures/additive_effect_OverDom.pdf differ diff --git a/html/index.html b/html/index.html index f473767..e00faa7 100644 --- a/html/index.html +++ b/html/index.html @@ -847,7 +847,7 @@

Diploid fluctuating fitness

B) What conditions do you need for a polymorphic equilibrium to be maintained? What is the equilibrium frequency of this balanced polymorphism?
C) Imagine the cost of the driver were additive \(w_{dd}=1\), \(w_{Dd}=1-e\), \(w_{DD}=1-2e\). Under what conditions can the driver invade the population? Can a polymorphic equilibrium be maintained?

Mutation–selection balance

-

Mutation is constantly introducing new alleles into the population. Therefore, variation can be maintained within a population not only if selection is balancing (e.g. through heterozygote advantage or fluctuating selection over time, as we have seen in the previous section), but also due to a balance between mutation and selection. A case of particular interest is when mutation introduces deleterious alleles and selection acts against these alleles. To study this balance, we return to the model of directional selection, where allele \(A_1\) is advantageous, i.e.

+

Mutation is constantly introducing new alleles into the population. Therefore, variation can be maintained within a population not only if selection is balancing (e.g. through heterozygote advantage or fluctuating selection over time, as we have seen in the previous section), but also due to a balance between mutation and selection. A case of particular interest is when mutation introduces deleterious alleles and selection acts against these alleles. To study this balance, we return to the model of directional selection, where allele \(A_1\) is advantageous, i.e.

@@ -872,7 +872,7 @@

Mutation–selection balance


-

For a start, we consider the case where allele \(A_2\) is not completely recessive (\(h>0\)), so that the heterozygotes suffer at least some disadvantage. We denote by \(\mu = \mu_{1\rightarrow2}\) the mutation rate per generation from \(A_1\) to the deleterious allele \(A_2\), and assume that there is no reverse mutation (\(\mu_{2\rightarrow1} = 0\)). Let us assume that selection against \(A_2\) is relatively strong compared to the mutation rate, so that it is justified to assume that \(A_2\) is always rare, i.e. \(q_t = 1-p_t \ll 1\). Compared to previous sections, for mathematical clarity, we also switch from following the frequency \(p_t\) of \(A_1\) to following the frequency \(q_t\) of \(A_2\). Of course, this is without loss of generality. The change in frequency of \(A_2\) due to selection can be written as \[\Delta_S q_t = \frac{{\overline{w}}_2 - {\overline{w}}_1}{{\overline{w}}} p_t q_t \approx -hs q_t. +

For a start, we consider the case where allele \(A_2\) is not completely recessive (\(h>0\)), so that the heterozygotes suffer at least some disadvantage. We denote by \(\mu = \mu_{1\rightarrow2}\) the mutation rate per generation from \(A_1\) to the deleterious allele \(A_2\), and assume that there is no reverse mutation (\(\mu_{2\rightarrow1} = 0\)). Let us assume that selection against \(A_2\) is relatively strong compared to the mutation rate, so that it is justified to assume that \(A_2\) is always rare, i.e. \(q_t = 1-p_t \ll 1\). Compared to previous sections, for mathematical clarity, we also switch from following the frequency \(p_t\) of \(A_1\) to following the frequency \(q_t\) of \(A_2\). Of course, this is without loss of generality. The change in frequency of \(A_2\) due to selection can be written as \[\Delta_S q_t = \frac{{\overline{w}}_2 - {\overline{w}}_1}{{\overline{w}}} p_t q_t \approx -hs q_t. \label{eq:dirSelApprox}\] This approximation can be found by assuming that \(q^2 \approx 0\), \(p \approx 1\), and that \({\overline{w}}\approx w_1\).

All of these assumptions make sense if \(q \ll 1\). From the approximation for \(\Delta_S q_t\) above we see that selection acts to reduce the frequency of \(A_2\) (as both \(h\) and \(s\) are positive), and it does so geometrically across the generations. That is, if the initial frequency of \(A_2\) is \(q_0\), then its frequency at time \(t\) is approximately \[q_t = q_0 (1 - hs)^t. \label{eq:dirSelExplApprox}\]
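As a rough numerical check of the geometric approximation, the sketch below iterates a full diploid selection recursion and compares it with \(q_0 (1 - hs)^t\). The fitnesses \(w_{11}=1\), \(w_{12}=1-hs\), \(w_{22}=1-s\) and the function names are assumptions made for illustration; they are not taken from the text above.

```python
def exact_next_q(q, h, s):
    """One generation of diploid viability selection against allele A_2.

    Assumed fitnesses (illustrative): w11 = 1, w12 = 1 - h*s, w22 = 1 - s.
    """
    p = 1.0 - q
    w_bar = p * p + 2 * p * q * (1 - h * s) + q * q * (1 - s)  # mean fitness
    w_2 = p * (1 - h * s) + q * (1 - s)                        # marginal fitness of A_2
    return q * w_2 / w_bar


def compare_decay(q0, h, s, generations):
    """Return (exact, approximate) frequency of A_2 after the given number of generations."""
    q = q0
    for _ in range(generations):
        q = exact_next_q(q, h, s)
    approx = q0 * (1 - h * s) ** generations  # geometric approximation
    return q, approx


if __name__ == "__main__":
    exact, approx = compare_decay(q0=0.01, h=0.5, s=0.1, generations=50)
    print(f"exact recursion: {exact:.6g}, geometric approximation: {approx:.6g}")
```

When \(q_0\) is small the two values should agree closely; the agreement is expected to degrade as \(q_0\) grows, since the approximation drops terms of order \(q^2\).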

@@ -924,7 +924,7 @@

Migration–selection balance

As a simple model of migration, let's suppose that within a population a fraction \(m\) of individuals are migrants from the other population, and a fraction \(1-m\) are from the same deme.
To quickly sketch a solution to this we'll set up a situation analogous to our mutation-selection balance model. To do this, let's assume that selection is strong compared to migration (\(s \gg m\)), so that allele \(1\) will be almost fixed in population \(1\) and allele \(2\) will be almost fixed in population \(2\). If that is the case, migration changes the frequency of allele \(2\) in population \(1\) (\(q_1\)) by \[\Delta_{Mig.} q_1 \approx m\] while as noted above \(\Delta_{S} q_1= -sq_1\), so that migration and selection are at an equilibrium when \(\Delta_{S} q_1+ \Delta_{Mig.}q_1 = 0\), i.e. at an equilibrium frequency of allele \(2\) in population \(1\) of \[q_{e,1} = \frac{m}{s}\] so that migration is playing the role of mutation, and so migration-selection balance (at least under strong selection) is analogous to mutation-selection balance.
-

+We can use this same model by analogy for the case of migration-selection balance in a diploid model; in that case we replace our haploid \(s\) by the cost to heterozygotes \(hs\).
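The equilibrium \(q_{e,1} = m/s\) can be checked numerically with a minimal sketch. It assumes one particular life cycle (haploid selection followed by migration, with every migrant carrying allele \(2\)); the function name and parameter values are hypothetical and chosen only for illustration.

```python
def migration_selection_equilibrium(m, cost, generations=20_000):
    """Iterate selection then migration and return the frequency of allele 2 in population 1.

    m    : fraction of individuals that are migrants each generation
    cost : selection against allele 2 (s for haploids; h*s by analogy for diploids)
    Assumes migrants are fixed for allele 2 (illustrative simplification).
    """
    q = 0.0
    for _ in range(generations):
        q_sel = q * (1.0 - cost) / (1.0 - cost * q)  # haploid selection against allele 2
        q = (1.0 - m) * q_sel + m                    # migrants replace a fraction m
    return q


if __name__ == "__main__":
    m, s, h = 0.001, 0.1, 0.5
    print("haploid:", migration_selection_equilibrium(m, s), "vs m/s =", m / s)
    print("diploid analogy:", migration_selection_equilibrium(m, h * s), "vs m/(hs) =", m / (h * s))
```

Passing \(hs\) in place of \(s\) mirrors the diploid analogy described above; it is only a heuristic, since a full diploid recursion would also track homozygotes.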

You are investigating a small river population of sticklebacks, which receives infrequent migrants from a very large marine population. At a set of (putatively) neutral biallelic markers the freshwater population has frequencies 0.2, 0.7 and 0.8; at the same markers the marine population has frequencies 0.4, 0.5 and 0.7. From studying patterns of heterozygosity at a large collection of markers, you have estimated that the long-term effective size of your freshwater population is 2000 individuals.
A) What is \(F_{ST}\) across these neutral markers in the freshwater population, with respect to the large marine population (i.e. treat the marine population as the total)?
B) You are also studying an unlinked locus involved in the regulation of salt uptake. In the marine population the ancestral allele is close to fixation, but in your river population the derived allele is at 0.99 frequency. Estimate the selective disadvantage of the ancestral allele in your river population. [Hint: how can you use neutral differentiation to estimate the migration rate?]

@@ -951,28 +951,35 @@

An equilibrium cline in allele frequency. Our individuals disperse an average distance of \(\sigma=1\)km per generation, and our allele \(2\) has a relative fitness of \(1+s\) and \(1-s\) on either side of the environmental change at \(x=0\).
+

The cline in allele frequency associated with a sharp environmental transition.

To make progress, let's consider a simple model of local adaptation where the environment abruptly changes. Specifically we assume that \(\gamma(x)= 1\) for \(x<0\) and \(\gamma(x)= -1\) for \(x \geq 0\), i.e. our allele \(2\) has a selective advantage at locations to the left of zero, while this allele is at a disadvantage to the right of zero. In this case we can get an equilibrium distribution of our two alleles where to the left of zero our allele \(2\) is at higher frequency, while to the right of zero allele \(1\) predominates. As we cross from the left to the right side of our range the frequency of our allele \(2\) decreases in a smooth cline.
Our equilibrium spatial distribution of allele frequencies can be found by setting the LHS of eqn. to zero to arrive at \[s\gamma(x) q(x) \left( 1 - q(x) \right) = - \frac{\sigma^2}{2} \frac{d^2q(x)}{dx^2}\] We then could solve this differential equation with appropriate boundary conditions (\(q(-\infty)=1\) and \(q(\infty) = 0\)) to arrive at the appropriate functional form to our cline. While we won’t go into the solution of this equation here, we can note that by dividing our distance \(x\) by \(\ell=\sigma/\sqrt{s}\) we can remove the effect of our parameters from the above equation. This compound parameter \(\ell\) is the characteristic length of our cline, and it is this parameter which determines over what geographic scale we change from allele \(2\) predominating to allele \(1\) predominating as we move across our environmental shift.
The width of our cline, i.e. over what distance do we make this shift from allele \(2\) predominating to allele \(1\), can be defined in a number of different ways. One simple way to define the cline width, which is easy to define but perhaps hard to measure accurately, is the slope (i.e. the tangent) of \(q(x)\) at \(x=0\). Under this definition the cline width is approximately \(0.6 \sigma/\sqrt{s}\).

+

The rate of spatial spread of a beneficial allele.

+

Consider a beneficial mutation that has arisen in a specific spatial location and has begun to spread geographically.

Stochasticity and Genetic Drift in allele frequencies

Stochastic loss of strongly selected alleles

Even strongly selected alleles can be lost from the population when they are sufficiently rare. This is because the number of offspring left by individuals to the next generation is fundamentally stochastic. A selection coefficient of \(s=1\%\) is a strong selection coefficient, which can drive an allele through the population in a few hundred generations once the allele is established. However, if individuals have on average a small number of offspring per generation, the first individual to carry our allele, who has on average \(1\%\) more children, could easily have zero offspring, leading to the loss of our allele before it ever gets a chance to spread.
To take a first stab at this problem, let's think of a very large haploid population, and in order for this population to stay constant in size we’ll assume that individuals without the selected mutation have on average one offspring per generation, while individuals with our selected allele have on average \(1+s\) offspring per generation. We’ll assume that the number of offspring of an individual is Poisson distributed with this mean, i.e. the probability that an individual with the selected allele has \(i\) children is \[P_i= \frac{(1+s)^i e^{-(1+s)}}{i!}\]

Consider starting from a single individual with the selected allele, and ask about the probability of eventual loss of our selected allele starting from this single copy (\(p_L\)). To derive this we’ll make use of a simple argument (derived from branching processes). Our selected allele will be eventually lost from the population if every individual with the allele fails to leave descendants.

    -
  1. In our first generation with probability \(P_0\) our individual leaves no copies of itself to the next generation, in which case our allele is lost.

  2. -
  3. Alternatively it could leave one copy of itself to the next generation (with probability \(P_1\)), in which case with probability \(p_L\) this copy eventually goes extinct.

  4. -
  5. It could leave two copies of itself to the next generation (with probability \(P_2\)), in which case with probability \(p_L^2\) both of these copies eventually goes extinct.

  6. -
  7. More generally it could leave could leave \(k\) copies (\(k>0\)) of itself to the next generation (with probability \(P_k\)), in which case with probability \(p_L^k\) all of these copies eventually go extinct.

  8. +
  9. In our first generation with probability \(P_0\) our individual leaves no copies of itself to the next generation, in which case our allele is lost (Figure [fig:Proof_of_pL_2s]A).

  10. +
  11. Alternatively it could leave one copy of itself to the next generation (with probability \(P_1\)), in which case with probability \(p_L\) this copy eventually goes extinct (Figure [fig:Proof_of_pL_2s]B).

  12. +
  13. It could leave two copies of itself to the next generation (with probability \(P_2\)), in which case with probability \(p_L^2\) both of these copies eventually go extinct (Figure [fig:Proof_of_pL_2s]C).

  14. +
  15. More generally it could leave \(k\) copies (\(k>0\)) of itself to the next generation (with probability \(P_k\)), in which case with probability \(p_L^k\) all of these copies eventually go extinct (e.g. Figure [fig:Proof_of_pL_2s]D).

Summing over these probabilities we see that \[\begin{aligned} p_L &= \sum_{k=0}^{\infty} P_k p_L^{k} \nonumber \\ &= \sum_{k=0}^{\infty} \frac{(1+s)^ke^{-(1+s)}}{k!} p_L^{k} \nonumber \\ &= e^{-(1+s)} \left( \sum_{k=0}^{\infty} \frac{\left(p_L(1+s) \right)^k}{k!} \right)\end{aligned}\] The term in the brackets is itself the series expansion of the exponential \(e^{p_L(1+s)}\), so we can rewrite this as \[p_L = e^{(1+s)(p_L-1)} \label{prob_loss}\] Solving this would give us our probability of loss for any selection coefficient. Let's rewrite this in terms of the probability of escaping loss, \(p_F = 1-p_L\). We can rewrite the equation above as \[1-p_F = e^{-p_F(1+s)}\] To gain an approximation to this, let's consider a small selection coefficient \(s \ll 1\), such that \(p_F \ll 1\), and expand out the exponential on the right hand side (ignoring terms of higher order than \(s^2\) and \(p_F^2\)), so that \[1-p_F \approx 1-p_F(1+s)+p_F^2(1+s)^2/2\] Cancelling terms and dividing through by \(p_F\) gives \(p_F \approx 2s/(1+s)^2\), so to leading order in \(s\) we find that \[p_F \approx 2s.\] Thus even an allele with a \(1\%\) selection coefficient has a \(98\%\) probability of being lost when it is first introduced into the population by mutation.
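The fixed-point equation \(p_L = e^{(1+s)(p_L-1)}\) has no closed-form solution, but it can be solved numerically and compared with the \(2s\) approximation. The sketch below uses simple fixed-point iteration started from \(p_L=0\); the function name, tolerance, and iteration cap are arbitrary choices made for illustration.

```python
import math


def prob_loss(s, tol=1e-12, max_iter=100_000):
    """Solve p_L = exp((1 + s) * (p_L - 1)) by fixed-point iteration.

    Starting from 0 steers the iteration towards the root below 1
    (p_L = 1 is always a trivial solution of the equation).
    """
    p = 0.0
    for _ in range(max_iter):
        p_next = math.exp((1.0 + s) * (p - 1.0))
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p  # not converged within max_iter; return the last iterate


if __name__ == "__main__":
    for s in (0.01, 0.05, 0.1):
        p_F = 1.0 - prob_loss(s)  # probability of escaping loss
        print(f"s = {s}: numerical p_F = {p_F:.4f}, approximation 2s = {2 * s:.4f}")
```

For small \(s\) the numerical \(p_F\) should sit close to \(2s\), with the gap widening as \(s\) increases and the neglected higher-order terms start to matter.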
-We can also adapt this result to a diploid setting. Assuming that heterozygotes for the \(1\) allele have \(1+(1-h)s\) children, the probability of allele \(1\) is not lost, starting from a single copy in the population, is \[p_F = 2 (1-h)s \label{eqn:diploid_escape}\] for \(h>0\).
+

+
+
+
+

We can also adapt this result to a diploid setting. Assuming that heterozygotes for the \(1\) allele have \(1+(1-h)s\) children, the probability that allele \(1\) is not lost, starting from a single copy in the population, is \[p_F = 2 (1-h)s \label{eqn:diploid_escape}\] for \(h>0\).

The interaction between genetic drift and weak selection.

For a strongly selected allele, once it has escaped initial loss at low frequency, its path will be determined deterministically by its selection coefficient. However, if selection is weak, the stochasticity of reproduction can play a role in the trajectory an allele takes even when it is common in the population.
diff --git a/popgen_notes.pdf b/popgen_notes.pdf index c2af4d0..a488d00 100644 Binary files a/popgen_notes.pdf and b/popgen_notes.pdf differ