Population Genetics is the study of the genetic composition of populations rather than individuals...

Population Genetics is the study of the genetic composition of populations rather than individuals

Population geneticists study Mendelian populations of sexually reproducing organisms in which there is random mating

A Mendelian population may be considered to be a group of closely related, sexually reproducing organisms residing within defined geographical boundaries, within which each member has an equal chance of mating with any other member of the population

Terminology

Geneticists often use decimals (e.g. p = 0.5) to represent the relative frequency of alleles within populations. The term ‘frequency’ is often used instead of the more accurate term ‘relative frequency’, e.g. “the frequency of allele B is 0.32”

In this presentation, we have chosen the more accurate form of phrasing, e.g. “the relative frequency of allele B is 0.32”

Relative Frequency is the frequency of a particular variable divided by the total of all the variables in the sample, also known as the proportion. Relative frequencies (proportions) are always between 0 and 1

Frequency is the total count of a particular variable in a sample or the estimated count for that variable in the population; frequencies are whole numbers, e.g. 105 or 23

A sample is a representative selection from a parent population

A parent population (usually referred to as just population) is the total group from which the sample is selected

This population of 20 fruit flies was bred in culture bottles in a school laboratory

Some of the flies have normal, long wings governed by the dominant allele vg+ whilst others possess vestigial wings governed by the recessive allele vg

The different genotypes within this population are vg+vg+, vg+vg and vgvg

Each individual is described as possessing TWO alleles for wing length and thus the total number of alleles for this characteristic within the population is 40 (20 x 2)

The total number of alleles of a particular gene that are present within a population at a particular time is defined as the GENE POOL

The 20 fruit flies in this population are all heterozygous for normal, long wing length and possess the genotype vg+vg

In this example there is a total of 40 alleles within the population, where 50% of the alleles govern long wing (vg+) and 50% govern vestigial wing (vg)

In this example, the relative frequency of the dominant long wing allele is 0.5 and the relative frequency of the recessive vestigial wing allele is 0.5

Relative frequencies of alleles always provide a total value of ONE, where the value ONE represents the whole population (100%)

If, instead, the relative frequency of the dominant, long wing allele within the population were 0.6, then the relative frequency of the vestigial wing allele would be 0.4

A general formula is used to represent the relative frequencies of alleles of a single gene within a population:

p + q = 1

where:
p = the relative frequency of the dominant allele and
q = the relative frequency of the recessive allele
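The allele counting behind p + q = 1 can be sketched in a few lines of Python. This is only an illustration: the function name `allele_relative_frequencies` and the choice to represent each fly as a tuple of two allele labels are assumptions made for this sketch, not part of the original presentation.

```python
from collections import Counter

def allele_relative_frequencies(genotypes):
    """Return {allele: relative frequency} for a list of diploid genotypes.

    Each genotype is a tuple of two allele labels, e.g. ("vg+", "vg").
    """
    # Every individual contributes TWO alleles to the gene pool
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = sum(allele_counts.values())
    return {allele: count / total_alleles for allele, count in allele_counts.items()}

# 20 heterozygous flies (vg+vg): 40 alleles in the gene pool
flies = [("vg+", "vg")] * 20
freqs = allele_relative_frequencies(flies)
p = freqs["vg+"]  # relative frequency of the dominant allele
q = freqs["vg"]   # relative frequency of the recessive allele
print(p, q, p + q)  # 0.5 0.5 1.0
```

Counting from individual genotypes like this is only possible when every genotype is known, as it is for the all-heterozygous population of 20 flies.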

Allele frequencies are essential for analysing the genetics of populations but they cannot be observed directly

In 1908, the English mathematician G.H. Hardy and the German physician W. Weinberg independently developed a mathematical approach in which allele frequencies, genotype frequencies and phenotype frequencies could be used to study the genetic composition of populations

Hardy and Weinberg considered a population with a hypothetical gene that has two alleles (A and a) in which the A allele is dominant to the recessive allele a

Given the frequencies of the A and a alleles, we can calculate the EXPECTED frequencies of the progeny genotypes and phenotypes

Gametes      A (p)       a (q)
A (p)        AA (p²)     Aa (pq)
a (q)        Aa (pq)     aa (q²)

If
p = relative frequency of A alleles in the population and
q = relative frequency of a alleles in the population,
then all chance combinations of gametes containing these alleles can be calculated

The expected genotypic relative frequencies in the next generation may be summarised as follows:

p²  +  2pq  +  q²  =  1
AA     Aa      aa

Thus, p² is the proportion of the next generation EXPECTED to be homozygous dominant (AA)

2pq is the proportion of the next generation EXPECTED to be heterozygous (Aa)

q² is the proportion of the next generation EXPECTED to be homozygous recessive (aa)

This is the Hardy-Weinberg Equation:

p²  +  2pq  +  q²  =  1
AA     Aa      aa
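The equation follows from the allele relation p + q = 1: each offspring receives one allele from each of two randomly combining gametes, so the genotype proportions are the terms of the expansion of (p + q) squared. A short derivation, written here in LaTeX notation as an aid rather than as part of the original slides:

```latex
\begin{align*}
(p + q)^2 &= 1^2 \\
p^2 + pq + qp + q^2 &= 1 \\
p^2 + 2pq + q^2 &= 1
\end{align*}
```

The two pq terms correspond to the two ways of forming a heterozygote (A from one gamete and a from the other, or the reverse).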

Within a population of sheep, white wool is determined by a dominant allele B, and black wool by its recessive allele b

Within this population there are 891 white sheep and 9 black sheep

We can use this information to estimate the allele frequencies and the frequency and number of heterozygous white sheep

Let p = relative frequency of dominant allele (B)
Let q = relative frequency of recessive allele (b)

According to the Hardy-Weinberg Equation:

p²  +  2pq  +  q²  =  1
BB     Bb      bb      (genotypes)

All the black sheep are homozygous recessive with the genotype bb; it is not possible to determine from the phenotype which of the white sheep are homozygous and which are heterozygous

We can calculate the frequency of the black wool allele by first determining the frequency of homozygous recessive black sheep within the population

891 white sheep
9 black sheep
Total Population = 900 sheep

q² = relative frequency of homozygous recessive individuals, i.e. bb genotypes

Therefore, q² = 9/900 = 1/100 = 0.01

Therefore, q = √0.01 = 0.1

Therefore the relative frequency of the b allele is 0.1 (q)

Since p + q = 1, then p = 0.9

Therefore the relative frequency of the B allele is 0.9 (p)

Relative frequencies of alleles: p = 0.9 q = 0.1

According to the Hardy-Weinberg equation, the relative frequency of heterozygotes is 2pq

Therefore the relative frequency of Bb individuals is 2 x 0.9 x 0.1 = 0.18

The estimated NUMBER of heterozygotes (Bb) is therefore 0.18 x 900 = 162
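The sheep calculation can be retraced with a short Python sketch. The function name `estimate_from_recessives` and its parameter names are hypothetical, chosen for this illustration; the sketch assumes, as the worked example does, that the population is in Hardy-Weinberg proportions.

```python
import math

def estimate_from_recessives(n_recessive, n_total):
    """Estimate allele and heterozygote relative frequencies from the count of
    homozygous recessive individuals, assuming Hardy-Weinberg proportions."""
    q_squared = n_recessive / n_total   # relative frequency of bb
    q = math.sqrt(q_squared)            # relative frequency of allele b
    p = 1 - q                           # relative frequency of allele B
    heterozygotes = 2 * p * q           # relative frequency of Bb
    return p, q, heterozygotes

p, q, het = estimate_from_recessives(n_recessive=9, n_total=900)
print(round(p, 2), round(q, 2), round(het, 2))  # 0.9 0.1 0.18
print(round(het * 900))                         # 162
```

The values are rounded before printing because floating-point arithmetic can leave tiny errors in the last decimal places.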

p² + 2pq + q² = 1
p² = relative frequency of homozygous dominants
2pq = relative frequency of heterozygotes
q² = relative frequency of homozygous recessives

The Hardy-Weinberg Law states that:

In a large randomly mating population, the allele and genotype frequencies remain constant from generation to generation

If a population conforms to the expectations predicted by the law, then the population is said to be in Hardy-Weinberg Equilibrium

However, Hardy and Weinberg based their law on a number of conditions; if any of these conditions are not met then allele and genotype frequencies will change over the generations – the process of evolution

The Hardy-Weinberg equilibrium is maintained if all the following conditions are met:

• The population is large

• There is random mating within the population

• No selection pressures are operating – each genotype has equal fitness and there is no differential mortality

• The population is closed – there is no immigration of individuals from, or emigration of individuals into, another population

• There is no mutation from one allele into another or, if mutation does occur, the rates of forward and back mutation are equal

• There is no genetic drift, i.e. changes in allele frequencies resulting from random, chance processes (most likely to affect small populations)

• The character being studied is not sex linked

In the example of the sheep population, the allele frequencies were determined by estimating the frequency of the recessive allele from the number of homozygous recessive individuals

In cases of co-dominance, the heterozygotes may be phenotypically distinguishable from the homozygotes, e.g. both red and white snapdragons are homozygous (RR & rr) whereas only pink snapdragons are heterozygous (Rr)

In these cases, the frequencies of the genotypes can be determined by counting and thus provide observed rather than estimated values

In a species of bean, seed colour is determined by a pair of alleles C and c, where the allele C determines golden seeds and the allele c determines dark green seeds; there is co-dominance

The heterozygote (Cc) has light green seeds and is thus distinguishable from the other two genotypes (CC and cc)

500 seeds of this species were randomly selected ‘in the field’ and the following phenotypes were recorded:

golden 100

light green 250

dark green 150

In this situation the ACTUAL allele frequencies can be determined by counting the alleles, as the heterozygotes (Cc) are distinguishable from both homozygotes (CC & cc)

golden CC 100

light green Cc 250

dark green cc 150

Each individual is described as possessing TWO alleles for seed colour such that the total number of alleles for seed colour in this sample is (100 + 250 + 150) x 2 = 1000 alleles

In this situation let:
p = the relative frequency of the C allele and
q = the relative frequency of the c allele

• Total number of alleles in the sample = 1000
• 100 homozygous golden seeds (CC) represent 200 C alleles
• 250 heterozygous seeds represent 250 C alleles as the genotype is Cc and each individual carries only ONE C allele
• Therefore the total number of C alleles in the sample is 200 + 250 = 450
• The proportion (relative frequency) of C alleles in the sample is therefore 450/1000 = 0.45
• Therefore p = 0.45

• Total number of alleles in the sample = 1000
• 150 homozygous dark green seeds (cc) represent 300 c alleles
• 250 heterozygous seeds (Cc) represent 250 c alleles
• Therefore the total number of c alleles in the sample is 300 + 250 = 550
• The proportion (relative frequency) of c alleles in the sample is therefore 550/1000 = 0.55
• Therefore q = 0.55 (N.B. this simply confirms p + q = 1)

p = 0.45   q = 0.55

These values of p and q are ACTUAL values determined from the sample and not ESTIMATES of the parent population as are determined by the Hardy-Weinberg equation
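The direct allele count for the bean sample can be written as a short Python sketch. The function name `allele_frequencies_from_genotype_counts` and its parameter names are hypothetical, introduced only for this illustration.

```python
def allele_frequencies_from_genotype_counts(n_CC, n_Cc, n_cc):
    """Count alleles directly when all three genotypes are distinguishable."""
    total_alleles = 2 * (n_CC + n_Cc + n_cc)  # two alleles per individual
    count_C = 2 * n_CC + n_Cc                 # CC carries two C, Cc carries one
    count_c = 2 * n_cc + n_Cc                 # cc carries two c, Cc carries one
    return count_C / total_alleles, count_c / total_alleles

p, q = allele_frequencies_from_genotype_counts(n_CC=100, n_Cc=250, n_cc=150)
print(p, q)  # 0.45 0.55
```

No square root is needed here, in contrast to the sheep example, because no genotype is hidden behind a dominant phenotype.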

These actual values of p and q can be applied to the Hardy-Weinberg equation in order to determine whether the parent population is in Hardy-Weinberg equilibrium

According to the Hardy-Weinberg Law:

• p² represents the EXPECTED proportion of golden homozygotes (CC)

• 2pq represents the EXPECTED proportion of light green heterozygotes (Cc)

• q² represents the EXPECTED proportion of dark green homozygotes (cc)

Determine the expected frequencies and hence the expected numbers of the different genotypes in the population

• p² = (0.45)² = 0.2025 = expected proportion of CC individuals
• Therefore expected number of CC individuals = 0.2025 x 500 = 101.25 (500 = total number of seeds in the sample)
• 2pq = 2 x 0.45 x 0.55 = 0.495 = expected proportion of Cc individuals
• Therefore expected number of Cc individuals = 0.495 x 500 = 247.5
• q² = (0.55)² = 0.3025 = expected proportion of cc individuals
• Therefore expected number of cc individuals = 0.3025 x 500 = 151.25
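The expected numbers can be reproduced with a small Python sketch. The function name `expected_genotype_numbers` is an assumption made for this illustration; the inputs are the values of p, q and sample size used in the bullets above.

```python
def expected_genotype_numbers(p, q, sample_size):
    """Expected Hardy-Weinberg genotype numbers for a sample of a given size."""
    proportions = {"CC": p ** 2, "Cc": 2 * p * q, "cc": q ** 2}
    return {genotype: proportion * sample_size
            for genotype, proportion in proportions.items()}

expected = expected_genotype_numbers(p=0.45, q=0.55, sample_size=500)
for genotype, number in expected.items():
    # Rounded for display only; expected numbers need not be whole numbers
    print(genotype, round(number, 2))
```

Expected numbers are left as decimals (e.g. 101.25) rather than rounded to whole seeds, because they are theoretical values used for comparison with the observed counts.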

The observed results can be compared with those expected from the Hardy-Weinberg equation in order to determine whether genotype frequencies (CC, Cc and cc) are in equilibrium

The significance of differences between the OBSERVED and EXPECTED results can be assessed using the Chi-squared test (χ²)

The basis of the Chi-squared test is the difference between the observed results (O) and the expected results (E) predicted by the ‘null hypothesis’:

χ² = Σ (O – E)² / E

Genotype   Observed Result (O)   Expected Result (E)   O – E    (O – E)²    (O – E)²/E
CC         100                   101.25
Cc         250                   247.5
cc         150                   151.25
                                                                            Σ =

• State the ‘null hypothesis’ and ‘alternative hypothesis’
• Complete the table above to obtain the χ² value
• Use the χ² value and a critical values table to determine χ²crit at the 5% significance level for one degree of freedom (number of alleles - 1)
• Comment on the result

Genotype   Observed Result (O)   Expected Result (E)   O – E    (O – E)²    (O – E)²/E
CC         100                   101.25                -1.25    1.5625      0.01543
Cc         250                   247.5                 +2.5     6.25        0.02525
cc         150                   151.25                -1.25    1.5625      0.01033
                                                                            Σ = 0.05101

• Null Hypothesis: there is no difference between the observed results and those expected from the Hardy-Weinberg equation

• Alternative Hypothesis: there is a difference between the observed results and those expected from the Hardy-Weinberg equation

• The χ² value is 0.05

• For a χ² value of 0.05 and 1 degree of freedom there is between an 80% and 90% probability that chance alone has caused the difference between the observed values and those expected if the null hypothesis is true, i.e. the probability (p) lies between 0.80 and 0.90; there is NO significant difference between observed and expected results

• The χ² value (0.05) is less than the critical value, χ²crit (3.84), for a 5% (0.05) level of significance and thus we accept the null hypothesis and reject the alternative hypothesis

• The observed and expected results are NOT significantly different and therefore this bean population is in Hardy-Weinberg equilibrium
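The test can be checked with a minimal Python sketch that uses only the standard library. The function name `chi_squared` is hypothetical. The probability is computed with `math.erfc`, a shortcut that is valid for ONE degree of freedom only; this is an addition for illustration, not the critical values table method used in the presentation.

```python
import math

def chi_squared(observed, expected):
    """Chi-squared statistic: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [100, 250, 150]          # CC, Cc, cc (counted)
expected = [101.25, 247.5, 151.25]  # from the Hardy-Weinberg equation

chi2 = chi_squared(observed, expected)

# For ONE degree of freedom only, the probability of a chi-squared value at
# least this large under the null hypothesis equals erfc(sqrt(chi2 / 2))
probability = math.erfc(math.sqrt(chi2 / 2))

critical_value = 3.84  # 5% significance level, 1 degree of freedom

print(round(chi2, 3))         # chi-squared statistic
print(round(probability, 2))  # probability under the null hypothesis
print(chi2 < critical_value)  # True means no significant difference
```

The statistic comes out at about 0.05 and the probability falls between 0.80 and 0.90, which agrees with the figures quoted in the bullets above.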

If the observed and expected results had shown a significant difference, then factors other than chance would have caused this difference and the population would not be in Hardy-Weinberg equilibrium

In such a case, one or more of the conditions assumed by the law must not be met in the population, e.g. there is mutation from one allele into another

Past Question

• A species of beetle, only found on guano island, has a characteristic controlled by a pair of codominant alleles, CM and CN.

• Define codominance
• There are 500 beetles in the total population: 300 individuals have the genotype CM CM, 150 have the genotype CM CN, and 50 have the genotype CN CN. Calculate the actual frequency of the allele CN. Show your working.

Answer

• 0.25 or 25% (2 marks)

• CN = 250/1000 (1 mark)

Cont..

• Use your answer and the Hardy-Weinberg equation to calculate the number of beetles that you would expect to have the genotype CN CN

Answer

• p² = (0.25)² / 0.0625 / square of calculated figure for CN (2 marks)

• Formula = 1 (1 mark)
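Both parts of the beetle question can be worked through with a short Python sketch. The function name `expected_homozygote_count` and its parameter names are hypothetical. The mark scheme above lists only the proportion 0.0625; multiplying it by the population of 500 to reach a number of beetles is a step added in this sketch, because the question asks for a number.

```python
def expected_homozygote_count(n_MM, n_MN, n_NN):
    """Actual CN allele frequency and expected number of CN CN beetles."""
    n_total = n_MM + n_MN + n_NN
    # Each CM CN beetle carries one CN allele; each CN CN beetle carries two
    freq_N = (n_MN + 2 * n_NN) / (2 * n_total)
    # Hardy-Weinberg: expected proportion of CN CN is the square of freq_N
    expected_NN = freq_N ** 2 * n_total
    return freq_N, expected_NN

freq_N, expected_NN = expected_homozygote_count(n_MM=300, n_MN=150, n_NN=50)
print(freq_N, expected_NN)  # 0.25 31.25
```

The expected number (31.25, i.e. about 31 beetles) is lower than the 50 CN CN beetles actually counted; whether that difference is significant would need a Chi-squared test like the one applied to the bean sample.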