Hardy Weinberg Equilibrium Lectures 4-11: …...9/17/18 1 Hardy Weinberg Equilibrium Wilhem Weinberg...

Post on 01-Feb-2020


9/17/18


Hardy Weinberg Equilibrium

Wilhelm Weinberg (1862–1937)

Gregor Mendel (1822–1884)

G. H. Hardy (1877–1947)

Lectures 4-11: Mechanisms of Evolution (Microevolution)

• Hardy Weinberg Principle (Mendelian Inheritance)

• Genetic Drift

• Mutation

• Sex: Recombination and Random Mating

• Epigenetic Inheritance

• Natural Selection

These are mechanisms acting WITHIN populations, hence called “population genetics”, EXCEPT for epigenetic modifications, which act on individuals in a Lamarckian manner

Evolution acts through changes in allele frequency at each generation

Leads to average change in characteristic of the population

Recall from Lecture on History of Evolutionary Thought

Darwin’s Observation

HOWEVER, Darwin did not understand how genetic variation was passed on from generation to generation

Gregor Mendel, “Father of Modern Genetics”

• Mendel presented a mechanism for how traits got passed on

“Individuals pass alleles on to their offspring intact”

(the idea of particulate inheritance, i.e. genes)

Gregor Mendel

(1822-1884)

http://www.biography.com/people/gregor-mendel-39282#synopsis

Gregor Mendel, “Father of Modern Genetics”

Mendel’s Laws of Inheritance

• Law of Segregation: only one allele passes from each parent on to an offspring

• Law of Independent Assortment: different pairs of alleles are passed to offspring independently of each other


Gregor Mendel

• In cross-pollinating plants with either yellow or green peas, Mendel found that the first generation (F1) always had yellow seeds (dominance). However, the following generation (F2) consistently had a 3:1 ratio of yellow to green.

Using 29,000 pea plants, Mendel discovered the 3:1 ratio of phenotypes, due to dominant vs. recessive alleles

• Mendel uncovered the underlying mechanism, that there are dominant and recessive alleles
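The 3:1 ratio can be reproduced by listing every equally likely pairing of gametes in an F1 x F1 cross. The following is a minimal Python sketch, assuming a single locus with a dominant allele written `Y` (yellow) and a recessive allele written `y` (green); the allele symbols and the helper name `cross` are illustrative and not from the lecture.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Count offspring genotypes from every equally likely gamete pairing."""
    offspring = Counter()
    for g1, g2 in product(parent1, parent2):
        # Sort the two alleles so that "yY" and "Yy" count as one genotype.
        offspring["".join(sorted(g1 + g2))] += 1
    return offspring

# F1 x F1: both parents are heterozygous (hypothetical symbols Y and y).
f2 = cross("Yy", "Yy")

# Any genotype with at least one dominant allele shows the yellow phenotype.
yellow = sum(n for genotype, n in f2.items() if "Y" in genotype)
green = f2["yy"]
print(dict(f2), f"yellow:green = {yellow}:{green}")
```

The genotype counts come out 1:2:1, and because the heterozygotes look like the dominant homozygotes, the phenotypes come out 3:1.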

Hardy-Weinberg Principle

• Mathematical description of Mendelian inheritance

Godfrey Hardy (1877–1947)

Wilhelm Weinberg (1862–1937)

Testing for Hardy-Weinberg equilibrium can be used to assess whether a population is evolving

The Hardy-Weinberg Principle

• A population that is not evolving shows allele and genotypic frequencies that are in Hardy Weinberg equilibrium

• If a population is not in Hardy-Weinberg equilibrium, it can be concluded that the population is evolving


Evolutionary Mechanisms (will put population out of HW Equilibrium):

• Genetic Drift

• Natural Selection

• Mutation

• Migration

*Epigenetic modifications change expression of alleles but not the frequency of alleles themselves, so they won’t affect the actual inheritance of alleles

However, if you count phenotype frequencies rather than genotype frequencies, you might see phenotypic frequencies out of HW Equilibrium due to epigenetic silencing of alleles (epigenetic modifications can change phenotype, not genotype).

An evolving population is one that violates Hardy-Weinberg Assumptions

Requirements of HW        Evolution (Violation)
Large population size     Genetic drift
Random Mating             Inbreeding & other
No Mutations              Mutations
No Natural Selection      Natural Selection
No Migration              Migration

Fig. 23-5a: Map of the Porcupine herd range and the Fortymile herd range (map labels: Beaufort Sea, Northwest Territories, Alaska, Yukon, Canada)

• What is a “population”?

A group of individuals within a species that is capable of interbreeding and producing fertile offspring (definition for sexual species)

In the absence of Evolution…

Patterns of inheritance should always be in “Hardy Weinberg Equilibrium”, following the transmission rules of Mendel

Hardy-Weinberg Equilibrium

• According to the Hardy-Weinberg principle, frequencies of alleles and genotypes in a population remain constant from generation to generation

• Also, the genotype frequencies you see in a population should be the Hardy-Weinberg expectations, given the allele frequencies

“Null Model”

• No Evolution: the Null Model to test whether evolution is happening should simply be a population in Hardy-Weinberg Equilibrium

• No Selection: the Null Model to test whether Natural Selection is occurring should have no selection, but should include Genetic Drift

– This is because Genetic Drift is operating even when there is no Natural Selection


Example: Is this population in Hardy Weinberg Equilibrium?

              AA    Aa    aa
Generation 1  0.25  0.50  0.25
Generation 2  0.20  0.60  0.20
Generation 3  0.10  0.80  0.10

Hardy-Weinberg Theorem

In a non-evolving population, the frequencies of alleles and genotypes remain constant over generations

You should be able to predict the genotype frequencies, given the allele frequencies

Important concepts

• gene: A region of genome sequence (DNA or RNA) that is the unit of inheritance, the product of which contributes to phenotype

• locus: Location in a genome (used interchangeably with “gene,” if the location is at a gene… but, locus can be anywhere, so meaning is broader than gene)

• loci: Plural of locus

• allele: Variant forms of a gene (e.g. alleles for different eye colors, BRCA1 breast cancer allele, etc.)

• genotype: The combination of alleles at a locus (gene)

• phenotype: The expression of a trait, as a result of the genotype and regulation of genes (green eyes, brown hair, body size, finger length, cystic fibrosis, etc.)

Important concepts

• allele: Variant forms of a gene (e.g. alleles for different eye colors, BRCA1 breast cancer allele, etc.)

• We are diploid (2 sets of chromosomes), so we have 2 alleles at a locus (any location in the genome)

• However, there can be many alleles at a locus in a population.

– For example, you might have inherited a blue eye allele from your mom and a brown eye allele from your dad… you can’t have more alleles than that (only 2 copies of each chromosome, one from each parent)

– BUT, there could be many alleles at this locus in the population: blue, green, grey, brown, etc.

• Alleles in a population of diploid organisms

[Figure: sperm and eggs carrying alleles A1, A2, A3, and A4 combine through Random Mating (Sex) to form zygotes with genotypes such as A1A1, A1A3, A2A4, A3A1, A2A1, and A3A4]

So then can we predict the % of alleles and genotypes in the population at each generation?


Hardy-Weinberg Theorem

In a non-evolving population, the frequencies of alleles and genotypes remain constant over generations

Fig. 23-6: Frequencies of alleles in the population and gametes produced

p = frequency of CR allele = 0.8

q = frequency of CW allele = 0.2

Each egg and each sperm: 80% chance of carrying CR, 20% chance of carrying CW

Hardy-Weinberg proportions indicate the expected allele and genotype frequencies, given the starting frequencies

• By convention, if there are 2 alleles at a locus, p and q are used to represent their frequencies

• The frequency of all alleles in a population will add up to 1

– For example, p + q = 1

If p and q represent the relative frequencies of the only two possible alleles in a population at a particular locus, then for a diploid organism (2 sets of chromosomes),

$(p + q)^2 = p^2 + 2pq + q^2 = 1$

– where $p^2$ and $q^2$ represent the frequencies of the homozygous genotypes and $2pq$ represents the frequency of the heterozygous genotype

What about for a triploid organism?

• $(p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3 = 1$

Potential offspring: ppp, ppq, pqp, qpp, qqp, pqq, qpq, qqq

How about tetraploid? You work it out.


Hardy Weinberg Theorem

ALLELES

Probability of A = p

Probability of a = q

$p + q = 1$

GENOTYPES

AA: $p \times p = p^2$

Aa: $p \times q + q \times p = 2pq$

aa: $q \times q = q^2$

$p^2 + 2pq + q^2 = 1$
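The two-allele case above can be written as a small function. This is a minimal Python sketch, assuming a single diploid locus with alleles A and a; the function name `hw_genotype_frequencies` is illustrative.

```python
def hw_genotype_frequencies(p):
    """Expected genotype frequencies (AA, Aa, aa) given the frequency p of allele A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p  # two alleles only, so p + q = 1
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

expected = hw_genotype_frequencies(0.8)
print({genotype: round(freq, 4) for genotype, freq in expected.items()})
print("sum =", round(sum(expected.values()), 4))
```

For any valid p the three frequencies sum to 1, which is the expansion of $(p + q)^2$ term by term.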

More General HW Equations

• One locus, three alleles: $(p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr$

• One locus, n alleles: $(p_1 + p_2 + p_3 + p_4 + \dots + p_n)^2 = p_1^2 + p_2^2 + p_3^2 + p_4^2 + \dots + p_n^2 + 2p_1p_2 + 2p_1p_3 + 2p_2p_3 + 2p_1p_4 + 2p_1p_5 + \dots + 2p_{n-1}p_n$

• For a polyploid (more than two sets of chromosomes): $(p + q)^c$, where c = number of chromosome sets (ploidy)

• If multiple loci (genes) code for a trait, each locus follows the HW principle independently, and then the alleles at each locus interact to influence the trait
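The general expansions above can be checked numerically. The sketch below is one possible Python implementation, assuming random union of gametes at a single locus; the function name `genotype_frequencies` and the example allele frequencies are illustrative. Each unordered genotype gets a multinomial coefficient (the number of orderings of its alleles) times the product of its allele frequencies.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, prod

def genotype_frequencies(allele_freqs, ploidy=2):
    """Expand (p1 + p2 + ... + pn)^ploidy into unordered genotype frequencies."""
    result = {}
    for genotype in combinations_with_replacement(sorted(allele_freqs), ploidy):
        # Multinomial coefficient: ploidy! divided by the factorial of each allele's copy number.
        coeff = factorial(ploidy)
        for copies in Counter(genotype).values():
            coeff //= factorial(copies)
        result[genotype] = coeff * prod(allele_freqs[a] for a in genotype)
    return result

# Tetraploid with two alleles: the terms of (p + q)^4.
tetraploid = genotype_frequencies({"p": 0.8, "q": 0.2}, ploidy=4)
# Diploid with three alleles: the terms of (p + q + r)^2.
three_alleles = genotype_frequencies({"p": 0.5, "q": 0.3, "r": 0.2})

for genotype, freq in tetraploid.items():
    print("".join(genotype), round(freq, 4))
```

For the tetraploid case the coefficients come out 1, 4, 6, 4, 1, and in every case the genotype frequencies sum to 1.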

ALLELE Frequencies

Frequency of A = p = 0.8

Frequency of a = q = 0.2

$p + q = 1$

Expected GENOTYPE Frequencies

AA: $p \times p = p^2 = 0.8 \times 0.8 = 0.64$

Aa: $p \times q + q \times p = 2pq = 2 \times (0.8 \times 0.2) = 0.32$

aa: $q \times q = q^2 = 0.2 \times 0.2 = 0.04$

$p^2 + 2pq + q^2 = 0.64 + 0.32 + 0.04 = 1$

Expected Allele Frequencies at 2nd Generation

p = AA + Aa/2 = 0.64 + (0.32/2) = 0.8

q = aa + Aa/2 = 0.04 + (0.32/2) = 0.2

Allele frequencies remain the same at next generation
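The same bookkeeping can be iterated over several generations. This is a minimal Python sketch, assuming all Hardy-Weinberg conditions hold (no drift, selection, mutation, or migration, and random mating); the helper name `next_generation_allele_freq` is illustrative.

```python
def next_generation_allele_freq(genotypes):
    """Frequency of A among gametes: all alleles from AA plus half of those from Aa."""
    return genotypes["AA"] + genotypes["Aa"] / 2

p = 0.8  # starting frequency of allele A, as in the example above
history = []
for generation in range(1, 6):
    q = 1 - p
    # Random mating produces Hardy-Weinberg genotype proportions.
    genotypes = {"AA": p ** 2, "Aa": 2 * p * q, "aa": q ** 2}
    p = next_generation_allele_freq(genotypes)
    history.append(p)
    print(generation, round(p, 6))
```

Under these assumptions p stays at 0.8 in every generation, up to floating-point rounding.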


Similar example, but with different starting allele frequencies

[Figure labels: p, q, $p^2$, $2pq$, $q^2$]

Calculating Allele Frequencies from # of Individuals

• The frequency of an allele in a population can be calculated from # of individuals:

– For diploid organisms, the total number of alleles at a locus is the total number of individuals x 2

– The total number of dominant alleles at a locus is 2 alleles for each homozygous dominant individual plus 1 allele for each heterozygous individual; the same logic applies for recessive alleles

AA    Aa    aa
120   60    35    (# of individuals)

#A = (2 x AA) + Aa = 240 + 60 = 300

#a = (2 x aa) + Aa = 70 + 60 = 130

Proportion A = 300/total = 300/430 = 0.70

Proportion a = 130/total = 130/430 = 0.30

A + a = 0.70 + 0.30 = 1

Proportion AA = 120/215 = 0.56

Proportion Aa = 60/215 = 0.28

Proportion aa = 35/215 = 0.16

AA + Aa + aa = 0.56 + 0.28 + 0.16 = 1
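The counting procedure above can be sketched in Python as follows. The function name `frequencies_from_counts` and its argument names are illustrative; the counts are the ones from the example.

```python
def frequencies_from_counts(n_AA, n_Aa, n_aa):
    """Allele and genotype frequencies from genotype counts at a diploid locus."""
    n_individuals = n_AA + n_Aa + n_aa
    n_alleles = 2 * n_individuals            # each diploid individual carries 2 alleles
    p = (2 * n_AA + n_Aa) / n_alleles        # copies of A
    q = (2 * n_aa + n_Aa) / n_alleles        # copies of a
    genotype_freqs = {
        "AA": n_AA / n_individuals,
        "Aa": n_Aa / n_individuals,
        "aa": n_aa / n_individuals,
    }
    return p, q, genotype_freqs

p, q, genotype_freqs = frequencies_from_counts(120, 60, 35)
print(round(p, 2), round(q, 2))
print({genotype: round(freq, 2) for genotype, freq in genotype_freqs.items()})
```

Note that allele frequencies are divided by the number of alleles (430), while genotype frequencies are divided by the number of individuals (215).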

Calculating Allele and Genotype Frequencies from # of Individuals

Applying the Hardy-Weinberg Principle

• Example: estimate frequency of a disease allele in a population

• Phenylketonuria (PKU) is a metabolic disorder that results from homozygosity for a recessive allele

• Individuals that are homozygous for the deleterious recessive allele cannot break down phenylalanine, which results in a build-up → intellectual disability

• The occurrence of PKU is 1 per 10,000 births

• How many carriers of this disease are in the population?

– Rare deleterious recessives often remain in a population because they are hidden in the heterozygous state (the “carriers”)

– Natural selection can only act on the homozygous individuals where the phenotype is exposed (individuals who show symptoms of PKU)

– We can assume HW equilibrium if:

• There is no migration from a population with different allele frequency

• Random mating

• No genetic drift

• Etc.


• The occurrence of PKU is 1 per 10,000 births (the frequency of the homozygous recessive genotype), which gives the frequency of the disease allele:

$q^2 = 0.0001$

$q = \sqrt{q^2} = \sqrt{0.0001} = 0.01$

• The frequency of normal alleles is:

$p = 1 - q = 1 - 0.01 = 0.99$

• The frequency of carriers (heterozygotes) of the deleterious allele is:

$2pq = 2 \times 0.99 \times 0.01 = 0.0198$

or approximately 2% of the U.S. population
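The carrier calculation generalizes to any rare recessive disorder. This is a minimal Python sketch, assuming the population is in Hardy-Weinberg equilibrium at this locus so that the incidence of affected births equals $q^2$; the function name `carrier_frequency` is illustrative.

```python
from math import sqrt

def carrier_frequency(incidence):
    """Heterozygote frequency 2pq, assuming incidence = q^2 under HW equilibrium."""
    q = sqrt(incidence)   # frequency of the recessive disease allele
    p = 1 - q             # frequency of the normal allele
    return 2 * p * q

carriers = carrier_frequency(1 / 10_000)   # PKU: 1 per 10,000 births
print(f"carrier frequency = {carriers:.4f}")
print(f"carriers per affected individual = {carriers / (1 / 10_000):.0f}")
```

The second line of output makes the point about hidden recessives: carriers outnumber affected individuals by roughly two hundred to one.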

So, let’s calculate HW frequencies

Conditions for Hardy-Weinberg Equilibrium

• The Hardy-Weinberg theorem describes a hypothetical population

• The five conditions for nonevolving populations are rarely met in nature:

– No mutations

– Random mating

– No natural selection

– Extremely large population size

– No gene flow

• So, in real populations, allele and genotype frequencies do change over time

DEVIATION from Hardy-Weinberg Equilibrium indicates that EVOLUTION is happening

• In natural populations, some loci might be out of HW equilibrium, while other loci remain in Hardy-Weinberg equilibrium

• For example, some loci might be undergoing natural selection and go out of HW equilibrium, while the rest of the genome remains in HW equilibrium

Hardy-Weinberg across a Genome

Allele A1 Demo

How can you tell whether a population is out of HW Equilibrium?


• Perform HW calculations to see if it looks like the population is out of HW equilibrium

• Then apply statistical tests to see if the deviation is significantly different from what you would expect by random chance

Example: Does this population remain in Hardy Weinberg Equilibrium across Generations?

              AA    Aa    aa
Generation 1  0.25  0.50  0.25
Generation 2  0.20  0.60  0.20
Generation 3  0.10  0.80  0.10

■ In this case, allele frequencies (of A and a) did not change.

■ ***However, the population did go out of HW equilibrium because you can no longer predict genotypic frequencies from allele frequencies

■ For example, p = 0.5, so the expected frequency of AA is $p^2 = 0.25$, but in Generation 3 the observed frequency of AA is 0.10

How can you tell whether a population is out of HW Equilibrium?

1. When allele frequencies are changing across generations

2. When you cannot predict genotype frequencies from allele frequencies (meaning there is an excess or deficit of some genotypes relative to what would be expected given the allele frequencies)
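Both checks can be applied to the three-generation table from the example. The sketch below is a minimal Python illustration; the helper name `hw_check` and the tolerance `tol` are assumptions, and the tolerance only absorbs floating-point rounding, so this is not a statistical test.

```python
# Genotype frequencies (AA, Aa, aa) per generation, from the example table.
generations = {
    1: (0.25, 0.50, 0.25),
    2: (0.20, 0.60, 0.20),
    3: (0.10, 0.80, 0.10),
}

def hw_check(f_AA, f_Aa, f_aa, tol=1e-6):
    """Return (p, expected genotype frequencies, whether observed matches expected)."""
    p = f_AA + f_Aa / 2
    q = 1 - p
    expected = (p * p, 2 * p * q, q * q)
    observed = (f_AA, f_Aa, f_aa)
    matches = all(abs(o - e) <= tol for o, e in zip(observed, expected))
    return p, expected, matches

results = {gen: hw_check(*freqs) for gen, freqs in generations.items()}
for gen, (p, expected, matches) in results.items():
    print(gen, f"p = {p:.2f}", [round(e, 2) for e in expected], "HW proportions" if matches else "NOT HW proportions")
```

Here p stays at 0.5 in every generation (check 1 passes), but only Generation 1 has the genotype proportions predicted from p (check 2 fails for Generations 2 and 3).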

Testing for Deviation from Hardy-Weinberg Expectations

• A $\chi^2$ goodness-of-fit test can be used to determine if a population is significantly different from the expectations of Hardy-Weinberg equilibrium.

• If we have a series of genotype counts from a population, then we can compare these counts to the ones predicted by the Hardy-Weinberg model.

• $\chi^2 = \sum (O - E)^2 / E$, where O = observed counts, E = expected counts, and the sum is across genotypes

Example

• Genotype Count: AA 30, Aa 55, aa 15

• Calculate the $\chi^2$ value:

Genotype  Observed  Expected  $(O-E)^2/E$
AA        30        33        0.27
Aa        55        49        0.73
aa        15        18        0.50
Total     100       100       1.50

• Since $\chi^2 = 1.50 < 3.841$ (from Chi-square table, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.
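The worked example can be reproduced with a few lines of Python. This is a sketch using only the standard library; the function name `hw_chi_square` is illustrative, and the critical value 3.841 (alpha = 0.05, 1 degree of freedom) is taken from the example rather than computed. Because the code keeps unrounded expected counts, its statistic is slightly larger than the 1.50 obtained from the rounded table values.

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed genotype counts to HW expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # allele frequency estimated from the sample
    q = 1 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_AA, n_Aa, n_aa]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

CRITICAL_VALUE = 3.841  # from the chi-square table: alpha = 0.05, 1 degree of freedom

chi2, expected = hw_chi_square(30, 55, 15)
reject_hw = chi2 > CRITICAL_VALUE
print([round(e, 1) for e in expected], round(chi2, 2), "reject HW" if reject_hw else "cannot reject HW")
```

The conclusion matches the example: the statistic is well below the critical value, so Hardy-Weinberg equilibrium cannot be rejected.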


Testing for Deviation from Hardy-Weinberg Expectations

• O = observed counts, E = expected counts, sum across genotypes

• We test our $\chi^2$ value against the Chi-square distribution (the distribution of a sum of squared standard normal variables), which represents the theoretical distribution of sample values under HW equilibrium

• And determine how likely it is to get our result simply by chance (e.g. due to sampling error); i.e., do our Observed values differ from our Expected values more than what we would expect by chance (= significantly different)?

→ Less likely to get these values by chance

Test for Deviation from HW equilibrium

• Genotype Count Generation 4: AA 65, Aa 31, aa 4

• Calculate the $\chi^2$ value:

Genotype  Observed  Expected  $(O-E)^2/E$
AA        65        64.8      0.00062
Aa        31        31.4      0.0051
aa        4         3.8       0.0105
Total     100       100       0.016

• Since $\chi^2 = 0.016 < 3.841$ (from Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.

• The chi-squared distribution is used because it is the distribution of a sum of squared standard normal variables

• Calculate the Chi-squared test statistic

• Figure out the degrees of freedom

• Select the significance level (alpha, e.g. 0.05)

• Compare your Chi-squared value to the critical value from the theoretical distribution (from the table), and reject or fail to reject the null hypothesis.

– If the test statistic > the critical value, the null hypothesis (H0 = there is no difference between the distributions) can be rejected at the selected significance level, and the alternative hypothesis (H1 = there is a difference between the distributions) can be accepted.

– If the test statistic < the critical value, the null hypothesis cannot be rejected

Test for Significance of Deviation from HW Equilibrium

Degrees of Freedom = n – 1 = 2 alleles (p, q) – 1 = 1

Testing for significance

• The results come out not significantly different from HW equilibrium

• This does not necessarily mean that genetic drift is not happening, but that we cannot conclude that genetic drift is happening

• Either we do not have enough power (not enough data, small sample size), or genetic drift is not happening

• Sometimes it is difficult to test whether evolution is happening, even when it is happening... The signal needs to be sufficiently large to be sure that you can’t get the results by chance (like by sampling error)


Test for Deviation from HW equilibrium

• Genotype Count Generation 4 → increase sample size: AA 65000, Aa 31000, aa 4000

• Calculate the $\chi^2$ value:

Genotype  Observed  Expected  $(O-E)^2/E$
AA        65000     64800     0.617
Aa        31000     31400     5.10
aa        4000      3800      10.53
Total     100,000   100,000   16.24

• Since $\chi^2 = 16.24 > 3.841$ (from Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population ARE significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.
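The effect of sample size can be seen by holding the genotype proportions fixed and scaling the counts. This is a minimal Python sketch; the observed proportions (0.65, 0.31, 0.04) and expected proportions (0.648, 0.314, 0.038) are the Generation 4 values used in the two examples, and the function name `chi_square_stat` is illustrative.

```python
def chi_square_stat(observed, expected):
    """Sum of (O - E)^2 / E across genotype classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed_props = (0.65, 0.31, 0.04)      # AA, Aa, aa proportions in Generation 4
expected_props = (0.648, 0.314, 0.038)   # HW expectations used in the examples
CRITICAL_VALUE = 3.841                   # alpha = 0.05, 1 degree of freedom

stats = {}
for n in (100, 1_000, 10_000, 100_000):
    observed = [prop * n for prop in observed_props]
    expected = [prop * n for prop in expected_props]
    stats[n] = chi_square_stat(observed, expected)
    verdict = "significant" if stats[n] > CRITICAL_VALUE else "not significant"
    print(n, round(stats[n], 3), verdict)
```

With fixed proportions the statistic grows in direct proportion to the sample size, so the same small deviation is undetectable with 100 individuals but clearly significant with 100,000.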


• One generation of Random Mating could put a population back into Hardy Weinberg Equilibrium

Examples of Deviation from Hardy-Weinberg Equilibrium

What would Genetic Drift look like?

• Most populations are experiencing some level of genetic drift, unless they are incredibly large

              AA    Aa     aa
Generation 1  0.64  0.32   0.04
Generation 2  0.63  0.33   0.04
Generation 3  0.64  0.315  0.045
Generation 4  0.65  0.31   0.04

Is this population in HW equilibrium?

If not, how does it deviate?

What could be the reason?


This is a case of Genetic Drift, where allele frequencies are fluctuating randomly across generations
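The fluctuation is easier to see after converting each row of the table to an allele frequency. A short Python sketch, using the genotype frequencies from the four generations above (the variable names are illustrative):

```python
# Genotype frequencies (AA, Aa, aa) for Generations 1 to 4, from the table above.
drift_table = [
    (0.64, 0.32, 0.04),
    (0.63, 0.33, 0.04),
    (0.64, 0.315, 0.045),
    (0.65, 0.31, 0.04),
]

# Frequency of A in each generation: all of AA plus half of Aa.
p_by_generation = [f_AA + f_Aa / 2 for f_AA, f_Aa, _ in drift_table]

for generation, p in enumerate(p_by_generation, start=1):
    print(f"Generation {generation}: p = {p:.4f}")
```

The frequency of A moves down and then up by small amounts with no consistent direction, which is the pattern the slide attributes to drift.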

Examples of Deviation from Hardy-Weinberg Equilibrium

AA    Aa    aa
0.64  0.36  0

Is this population in HW equilibrium?

If not, how does it deviate?

What could be the reason?

Here this appears to be Directional Selection favoring AA

Or… Negative Selection disfavoring aa

Examples of Deviation from Hardy-Weinberg Equilibrium

AA    Aa    aa
0.25  0.70  0.05

Is this population in HW equilibrium?

If not, how does it deviate?

What could be the reason?

This appears to be a case of Heterozygote Advantage (or Overdominance)

Examples of Deviation from Hardy-Weinberg Equilibrium

AA    Aa    aa
0.10  0.10  0.80

Is this population in HW equilibrium?

If not, how does it deviate?

What could be the reason?

Selection appears to be favoring aa
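For each of the three single-generation examples, the question "how does it deviate?" can be answered by comparing the observed heterozygote frequency with the expected $2pq$. The sketch below is a minimal Python illustration; the function name `heterozygote_deviation` is an assumption, and the output only describes the direction of the deviation, not its cause.

```python
def heterozygote_deviation(f_AA, f_Aa, f_aa):
    """Return (p, expected 2pq, observed minus expected heterozygote frequency)."""
    if abs(f_AA + f_Aa + f_aa - 1) > 1e-9:
        raise ValueError("genotype frequencies must sum to 1")
    p = f_AA + f_Aa / 2
    expected_het = 2 * p * (1 - p)
    return p, expected_het, f_Aa - expected_het

# The three example populations (AA, Aa, aa) from the slides above.
examples = [(0.64, 0.36, 0.00), (0.25, 0.70, 0.05), (0.10, 0.10, 0.80)]
deviations = [heterozygote_deviation(*freqs) for freqs in examples]

for freqs, (p, expected_het, diff) in zip(examples, deviations):
    label = "excess" if diff > 0 else "deficit"
    print(freqs, f"p = {p:.2f}, expected Aa = {expected_het:.4f}, heterozygote {label} of {abs(diff):.4f}")
```

The first two populations show more heterozygotes than expected and the third shows fewer; deciding which mechanism is responsible still requires the biological reasoning given on the slides.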

Summary

(1) A nonevolving population is in HW Equilibrium

(2) Evolution occurs when the requirements for HW Equilibrium are not met

(3) HW Equilibrium is violated when there is Genetic Drift, Migration, Mutations, Natural Selection, or Nonrandom Mating


Fig. 23-7: Perform the same calculations using percentages

Gametes of this generation (sperm and eggs): 80% CR (p = 0.8) and 20% CW (q = 0.2)

Genotypes after random mating: 64% ($p^2$) CRCR, 16% ($pq$) + 16% ($qp$) = 32% CRCW, and 4% ($q^2$) CWCW

Gametes of the next generation:

64% CR + 16% CR = 80% CR = 0.8 = p

4% CW + 16% CW = 20% CW = 0.2 = q

Genotypes in the next generation: 64% CRCR, 32% CRCW, and 4% CWCW plants

1. Nabila is a Saudi Princess who is arranged to marry her first cousin. Many in her family have died of a rare blood disease, which sometimes skips generations, and thus appears to be recessive. Nabila thinks that she is a carrier of this disease. If her fiancé is also a carrier, what is the probability that her offspring will have (be afflicted with) the disease?

(A) 1/4

(B) 1/3

(C) 1/2

(D) 3/4

(E) zero

The following are numbers of pink and white flowers in a population.

               Pink  White
Generation 1:  901   302
Generation 2:  1204  403
Generation 3:  1510  504

2. Which of the following is most likely to be TRUE?

(A) The heterozygotes are probably pink

(B) The recessive allele here (probably white) is clearly deleterious

(C) Evolution is occurring, as allele frequencies are changing greatly over time

(D) Clearly there is a heterozygote advantage

(E) The frequencies above violate Hardy-Weinberg expectations

The following are numbers of purple and white peas in a population.

               (A1A1)  (A1A2)  (A2A2)
               Purple  Purple  White
Generation 1:  360     480     160
Generation 2:  100     200     200
Generation 3:  0       100     300

3. What are the genotype frequencies at each generation?

(A) Generation 1: 0.30, 0.50, 0.20
    Generation 2: 0.20, 0.40, 0.40
    Generation 3: 0, 0.333, 0.666

(B) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.10, 0.20, 0.20
    Generation 3: 0, 0.10, 0.30

(C) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.20, 0.40, 0.40
    Generation 3: 0, 0.25, 0.75

(D) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.36, 0.48, 0.16
    Generation 3: 0.36, 0.48, 0.16

4. From the example on the previous slide, what are the frequencies of alleles at each generation?

(A) Generation 1: Dominant allele (A1) = 0.6, Recessive allele (A2) = 0.4
    Generation 2: Dominant allele = 0.4, Recessive allele = 0.6
    Generation 3: Dominant allele = 0.125, Recessive allele = 0.875

(B) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 2: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 3: Dominant allele = 0.6, Recessive allele = 0.4

(C) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 2: Dominant allele = 0.5, Recessive allele = 0.5
    Generation 3: Dominant allele = 0.25, Recessive allele = 0.75

(D) Generation 1: Dominant allele = 0.4, Recessive allele = 0.6
    Generation 2: Dominant allele = 0.5, Recessive allele = 0.5
    Generation 3: Dominant allele = 0.25, Recessive allele = 0.75

5. From the example two slides ago, which evolutionary mechanism might be operating across generations?

(A) Mutation

(B) Selection favoring A1

(C) Heterozygote advantage

(D) Selection favoring A2

(E) Inbreeding


Answers:

1. A (Parents: Aa x Aa = Offspring: AA (25%), Aa (50%), aa (25%))

2. A

3. C

4. A

5. D
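The answers to questions 3 and 4 can be checked by computing the frequencies directly from the pea counts. This is a short Python sketch; the dictionary names `counts` and `summary` are illustrative, and the counts are the ones given in the question.

```python
# Counts of (A1A1, A1A2, A2A2) individuals per generation, from question 3.
counts = {
    1: (360, 480, 160),
    2: (100, 200, 200),
    3: (0, 100, 300),
}

summary = {}
for generation, (n11, n12, n22) in counts.items():
    n = n11 + n12 + n22
    genotype_freqs = (n11 / n, n12 / n, n22 / n)
    p_A1 = (2 * n11 + n12) / (2 * n)   # frequency of the dominant allele A1
    summary[generation] = (genotype_freqs, p_A1, 1 - p_A1)
    print(generation, [round(f, 3) for f in genotype_freqs], f"A1 = {p_A1:.3f}", f"A2 = {1 - p_A1:.3f}")
```

The frequency of A1 falls in every generation while A2 rises, which is consistent with answer 5.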