Population Genetics

Page 1: Population Genetics

Population Genetics

Page 2: Population Genetics

Evolution by Natural Selection

• Unlike Mendel, Charles Darwin made a big splash when his defining work, "On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life" (which we refer to as “The Origin of Species”), was published in 1859.

• Darwin set forth a scientific theory that described how one species could give rise to another species, given sufficient time. It was heavily attacked at the time (and is still attacked today) by people who thought that it contradicted their religious beliefs. Nevertheless, the basic theory has survived and flourished, and today it is one of the main pillars of biological theory.

Page 3: Population Genetics

Fitness

• A fundamental concept in evolutionary theory is “fitness”, which can be defined as the ability to survive and reproduce. Reproduction is key: to be evolutionarily fit, an organism must pass its genes on to future generations.

• Basic idea behind evolution by natural selection: the more fit individuals contribute more to future generations than less fit individuals. Thus, the genes found in more fit individuals ultimately take over the population.

• Natural selection requires 3 basic conditions:

– 1. There must be inherited traits.

– 2. There must be variation in these traits among members of the species.

– 3. Some inherited traits must affect fitness.

Page 4: Population Genetics

Genetics of Populations

• Darwin didn’t understand how inheritance worked: Mendel’s work was still in the future. It wasn’t until the 1930’s that Mendelian genetics was incorporated into evolutionary theory, in what is called the “Neo-Darwinian synthesis”.

• Translated into Mendelian terms, the basis for natural selection is that alleles that increase fitness will increase in frequency in a population.

• Thus, the main object of study in evolutionary genetics is the frequency of alleles within a population.

• A “population” is a group of organisms of the same species that reproduce with each other. There is only one human population: we all interbreed.

• The “gene pool” is the collection of all the alleles present within a population.

• We are mostly going to look at frequencies of a single gene, but population geneticists generally examine many different genes simultaneously.

Page 5: Population Genetics

Allele and Genotype Frequencies

• Each diploid individual in the population has 2 copies of each gene. The allele frequency is the proportion of all copies of the gene in the population that are a particular allele.

• The genotype frequency is the proportion of a population that is a particular genotype.

• For example: consider the MN blood group. In a certain population there are 60 MM individuals, 120 MN individuals, and 20 NN individuals, a total of 200 people.

• The genotype frequency of MM is 60/200 = 0.3.

• The genotype frequency of MN is 120/200 = 0.6.

• The genotype frequency of NN is 20/200 = 0.1.

• The allele frequencies can be determined by adding the frequency of the homozygote to 1/2 the frequency of the heterozygote.

• The allele frequency of M is 0.3 (freq of MM) + 1/2 * 0.6 (freq of MN) = 0.6.

• The allele frequency of N is 0.1 + 1/2 * 0.6 = 0.4.

• Note that since there are only 2 alleles here, the frequency of N is 1 - freq(M).
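The MN blood group arithmetic above can be sketched in a few lines of Python. The function name and dictionary keys are illustrative choices, not part of the original slides.

```python
def genotype_and_allele_frequencies(n_mm, n_mn, n_nn):
    """Return (genotype frequencies, allele frequencies) from genotype counts."""
    total = n_mm + n_mn + n_nn
    genotype_freqs = {
        "MM": n_mm / total,
        "MN": n_mn / total,
        "NN": n_nn / total,
    }
    # Allele frequency = homozygote frequency + 1/2 heterozygote frequency.
    freq_m = genotype_freqs["MM"] + 0.5 * genotype_freqs["MN"]
    # With only 2 alleles, the frequency of N is 1 - freq(M).
    allele_freqs = {"M": freq_m, "N": 1.0 - freq_m}
    return genotype_freqs, allele_freqs


genotypes, alleles = genotype_and_allele_frequencies(60, 120, 20)
print(genotypes)
print(alleles)
```

For the 60 MM, 120 MN, 20 NN population this gives genotype frequencies of 0.3, 0.6, and 0.1 and allele frequencies of about 0.6 for M and 0.4 for N.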

Page 6: Population Genetics

Heterozygosity and Polymorphism

• A gene is called “polymorphic” if there is more than 1 allele present in at least 1% of the population. Genes with only 1 allele in the population are called “monomorphic”. Some genes have 2 alleles: they are “dimorphic”.

• In a study of white people from New England, 122 human genes that produced enzymes were examined. Of these, 51 were monomorphic and 71 were polymorphic. On the DNA level, a higher percentage of genes are polymorphic.

• Heterozygosity is the proportion of heterozygotes in a population. Averaged over the 71 polymorphic genes mentioned above, the heterozygosity of this population of humans was 0.067.

Page 7: Population Genetics

Hardy-Weinberg Equilibrium

• Early in the 20th century G.H. Hardy and Wilhelm Weinberg independently pointed out that under ideal conditions you could easily predict genotype frequencies from allele frequencies, at least for a diploid sexually reproducing species such as humans.

• For a dimorphic gene (two alleles, which we will call A and a), the Hardy-Weinberg equation comes from the binomial expansion of (p + q)^2:

p^2 + 2pq + q^2 = 1, where p = frequency of A and q = frequency of a, with p + q = 1.

• p^2 is the frequency of AA homozygotes

• 2pq is the frequency of Aa heterozygotes

• q^2 is the frequency of aa homozygotes

• H-W can be viewed as an extension of the Punnett square, using frequencies other than 0.5 for the gamete (allele) frequencies.

Page 8: Population Genetics

Hardy-Weinberg Example

• Take our previous example population, where the frequency of M was 0.6 and the frequency of N was 0.4.

• p^2 = freq of MM = (0.6)^2 = 0.36

• 2pq = freq of MN = 2 * 0.6 * 0.4 = 0.48

• q^2 = freq of NN = (0.4)^2 = 0.16

• These H-W expected frequencies don’t match the observed frequencies (0.3, 0.6, and 0.1). We will examine the reasons for this soon.
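As a rough sketch, the expected frequencies can be computed directly from the allele frequency and set beside the observed ones. The function name is hypothetical; the numbers are those of the MN example.

```python
def hardy_weinberg_expected(p):
    """Expected genotype frequencies (p^2, 2pq, q^2) for allele frequency p."""
    q = 1.0 - p
    return (p * p, 2 * p * q, q * q)


expected = hardy_weinberg_expected(0.6)   # frequency of M = 0.6
observed = (0.3, 0.6, 0.1)                # observed MM, MN, NN frequencies

for name, exp, obs in zip(("MM", "MN", "NN"), expected, observed):
    print(name, "expected", round(exp, 2), "observed", obs)
```

The three expected frequencies always sum to 1, whatever the value of p.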

Page 9: Population Genetics

Rare Alleles and Eugenics

• A popular idea early in the 20th century was “eugenics”, improving the human population through selective breeding. The idea has been widely discredited, largely due to the evils of “forced eugenics” practiced in certain countries before and during World War 2. We no longer force “genetically defective” people to be sterilized.

• However, note that positive eugenics, encouraging people to breed with superior partners, is still practiced in places.

• The problem with sterilizing “defectives” is that most alleles that produce notable genetic diseases are recessive: only expressed in homozygotes. If you only sterilize the homozygotes, you are missing the vast majority of people who carry the allele.

• For example, assume that the frequency of an allele for a recessive genetic disease is 0.001, a very typical figure. Thus p = 0.999 and q = 0.001. Thus p^2 = 0.998, 2pq = 0.002, and q^2 = 0.000001. The ratio of heterozygotes (undetected carriers) to homozygotes (people with the disease) is about 2000 to 1: you are sterilizing only about 1/2000 of the people who carry the defective allele. This is simply not a workable strategy for improving the gene pool.
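The carrier-to-affected ratio in this example is 2pq / q^2, which simplifies to 2p / q. A minimal sketch, with an illustrative function name and assuming H-W proportions:

```python
def carriers_per_affected(q):
    """Ratio of heterozygous carriers (2pq) to affected homozygotes (q^2)."""
    p = 1.0 - q
    return (2 * p * q) / (q * q)


ratio = carriers_per_affected(0.001)
print(round(ratio))  # 1998, i.e. roughly 2000 carriers per affected person

# The rarer the allele, the larger the share of copies hidden in carriers.
for q in (0.1, 0.01, 0.001):
    print(q, round(carriers_per_affected(q)))
```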

Page 10: Population Genetics

Nazi Eugenics

"The Threat of the Underman. It looks like this: Male criminals had an average of 4.9 children, criminal marriage, 4.4 children, parents of slow learners, 3.5 children, a German family 2.2 children, and a marriage from the educated circles, 1.9 children."

Page 11: Population Genetics

Estimating Allele Frequencies from Recessive Homozygote Frequency

• If Hardy-Weinberg equilibrium is assumed (an assumption we will examine shortly), it is possible to estimate the allele frequencies for a gene that shows complete dominance even though heterozygotes can’t be distinguished from the dominant homozygotes.

• The frequency of recessive homozygotes is q^2. Thus, the frequency of the recessive allele is the square root of this. Very simple.

• For example, the recessive genetic disease PKU has a frequency in the population of about 1 in 10,000. q^2 thus equals 0.0001 (10^-4). The square root of this is 0.01 (10^-2), which implies that the frequency of the PKU allele is 0.01 and the frequency of the normal allele is 0.99. Thus the frequency of the heterozygous genotype is 2 * 0.99 * 0.01 = 0.0198. About 2% of the population is a carrier of the PKU allele.

• Note again: this ASSUMES H-W equilibrium, and this assumption is not always true.
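The PKU estimate can be sketched as follows. The function name is illustrative, and the calculation is only valid under the H-W assumption stated above.

```python
import math


def allele_and_carrier_frequency(recessive_homozygote_freq):
    """Estimate q and the carrier frequency 2pq from q^2, assuming H-W equilibrium."""
    q = math.sqrt(recessive_homozygote_freq)
    p = 1.0 - q
    return q, 2 * p * q


q, carriers = allele_and_carrier_frequency(1 / 10000)
print(round(q, 4))         # 0.01
print(round(carriers, 4))  # 0.0198, about 2% of the population
```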

Page 12: Population Genetics

Necessary Conditions for Hardy-Weinberg Equilibrium

• The relationship between allele frequencies and genotype frequencies expressed by the H-W equation only holds if these 5 conditions are met. None of them is completely realistic, but all are met approximately in many populations.

• If a population is not in equilibrium, it takes only 1 generation of meeting these conditions to bring it into equilibrium. Once in equilibrium, a population will stay there as long as these conditions continue to be met.

– 1. no new mutations

– 2. no migration in or out of the population

– 3. no selection (all genotypes have equal fitness)

– 4. random mating

– 5. very large population
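The claim that a single generation of random mating restores equilibrium can be checked numerically. This is a minimal sketch that assumes all 5 conditions hold; the starting frequencies (0.3, 0.6, 0.1) are borrowed from the earlier MN example and the function name is hypothetical.

```python
def next_generation(genotype_freqs):
    """One round of random mating: (AA, Aa, aa) frequencies -> (p^2, 2pq, q^2)."""
    freq_hom_dominant, freq_het, _freq_hom_recessive = genotype_freqs
    p = freq_hom_dominant + 0.5 * freq_het
    q = 1.0 - p
    return (p * p, 2 * p * q, q * q)


start = (0.3, 0.6, 0.1)               # not in H-W proportions
generation_1 = next_generation(start)
generation_2 = next_generation(generation_1)

print([round(f, 2) for f in generation_1])  # [0.36, 0.48, 0.16]
print([round(f, 2) for f in generation_2])  # [0.36, 0.48, 0.16]
```

The allele frequency never changes here; only the way alleles are packaged into genotypes does, and that settles after one generation.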

Page 13: Population Genetics

Testing for H-W Equilibrium

• If we have a population where we can distinguish all three genotypes, we can use the chi-square test once again to see if the population is in H-W equilibrium. The basic steps:

– 1. Count the numbers of each genotype to get the observed genotype numbers, then calculate the observed genotype frequencies.

– 2. Calculate the allele frequencies from the observed genotype frequencies.

– 3. Calculate the expected genotype frequencies based on the H-W equation, then multiply by the total number of individuals to get expected genotype numbers.

– 4. Calculate the chi-square value using the observed and expected genotype numbers.

– 5. Use 1 degree of freedom (with 2 alleles there are 3 genotype classes; subtract 1, and 1 more because the allele frequency was estimated from the data).

Page 14: Population Genetics

Example

• Data: 26 MM, 68 MN, 106 NN, with a total population of 200 individuals.

• 1. Observed genotype frequencies:

– MM: 26/200 = 0.13

– MN: 68/200 = 0.34

– NN: 106/200 = 0.53

• 2. Allele frequencies:

– M: 0.13 + 1/2 * 0.34 = 0.30

– N: 0.53 + 1/2 * 0.34 = 0.70

• 3. Expected genotype frequencies and numbers:

– MM: p^2 = (0.30)^2 = 0.09 (freq) * 200 = 18

– MN: 2pq = 2 * 0.3 * 0.7 = 0.42 (freq) * 200 = 84

– NN: q^2 = (0.70)^2 = 0.49 (freq) * 200 = 98

• 4. Chi-square value:

– (26 - 18)^2 / 18 + (68 - 84)^2 / 84 + (106 - 98)^2 / 98

– = 3.56 + 3.05 + 0.65

– = 7.26

• 5. Conclusion: The critical chi-square value for 1 degree of freedom is 3.841. Since 7.26 is greater than this, we reject the null hypothesis that the population is in Hardy-Weinberg equilibrium.
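The five steps of this example can be sketched as one function. The names are illustrative, and the critical value 3.841 is simply the figure quoted above for 1 degree of freedom, hard-coded rather than looked up from a statistics library.

```python
CRITICAL_VALUE_1_DF = 3.841  # critical chi-square value quoted in the example


def hardy_weinberg_chi_square(n_hom_1, n_het, n_hom_2):
    """Return (chi-square value, expected genotype numbers) for three genotype counts."""
    total = n_hom_1 + n_het + n_hom_2
    # Allele frequency by direct gene counting (same as freq(hom) + 1/2 freq(het)).
    p = (2 * n_hom_1 + n_het) / (2 * total)
    q = 1.0 - p
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_hom_1, n_het, n_hom_2)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, expected


chi_square, expected = hardy_weinberg_chi_square(26, 68, 106)
print([round(e, 1) for e in expected])   # [18.0, 84.0, 98.0]
print(round(chi_square, 2))              # 7.26
print("reject H-W equilibrium:", chi_square > CRITICAL_VALUE_1_DF)
```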

Page 15: Population Genetics

Relaxing the H-W Conditions: Random Mating

• The fullest meaning of “random mating” implies that any gamete has an equal probability of fertilizing any other gamete, including itself. In a sexual population, this is impossible because male gametes can only fertilize female gametes.

• More or less random mating in a sexual population is achieved in some species of sea urchin, which gather in one place and squirt all of their gametes, male and female, out into the open sea. The gametes then find each other and fuse together to become zygotes.

• In animal species, mate selection is far more common than random fertilization. A very general rule is “assortative mating”, that like tends to mate with like: tall people with tall people, short people with short people, etc. This rule is true for externally detectable phenotypes such as appearance, but invisible traits like blood groups are usually close to H-W equilibrium in the population.

• Assortative mating is most easily analyzed as a tendency for inbreeding. You are more like your relatives than you are like random strangers. Thus you are somewhat more likely to mate with a distant relative than would be expected by chance alone.

Page 16: Population Genetics

My Boyfriend is Type B (in Korean, written and directed by Choi Seok-Won)

Japanese Blood Type Personality Chart

Type A

– Best traits: Conservative, reserved, patient, punctual, perfectionist and good with plants.

– Worst traits: Introverted, obsessive, stubborn, self conscious, and uptight.

Type B

– Best traits: Creative and passionate. Animal loving. Optimistic and flexible.

– Worst traits: Forgetful, irresponsible, individualist.

Type AB

– Best traits: Cool, controlled, rational. Sociable and popular. Empathic.

– Worst traits: Aloof, critical, indecisive and unforgiving.

Type O

– Best traits: Ambitious, athletic, robust and self-confident. Natural leaders.

– Worst traits: Arrogant, vain and insensitive. Ruthless.

Page 17: Population Genetics

Measuring Inbreeding

• Recall that inbreeding decreases the number of heterozygotes in the population: each generation of selfing decreases the number of heterozygotes by 1/2.

• By comparing the number of heterozygotes observed to the number expected for a population in H-W equilibrium, we can estimate the degree of inbreeding.

• A measure of inbreeding is the “inbreeding coefficient”, F.

F = 1 - (obs hets) / (exp hets)

• If F = 0, the observed number of heterozygotes is equal to the expected number, meaning that the population is in H-W equilibrium.

• If F = 1, there are no heterozygotes, implying a completely inbred population.

• Thus, the higher F is, the more inbred the population is.

Page 18: Population Genetics

Example

• Wild oats is a common plant in California, the cause of the golden-brown hillsides all summer out there.

• Wild oats can pollinate itself, but the pollen also blows in the wind so it can cross fertilize. The task is to estimate the relative proportions of these two types of mating.

• Data for the phosphoglucomutase (Pgm) gene:

– 104 AA, 9 AB, 42 BB = 155 total individuals

• H-W calculations:

– freq of A = (104 + 1/2 * 9) / 155 = 108.5 / 155 = 0.7

– freq of B = 1 - freq(A) = 0.3

– exp heterozygotes = 2pq = 2 * 0.7 * 0.3 = 0.42 (freq) * 155 = 65.1

– F = 1 - (obs hets) / (exp hets) = 1 - 9 / 65.1 = 1 - 0.14

– F = 0.86

– This is a very inbred population: most matings are self-pollination.
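A short sketch of the inbreeding coefficient calculation for the wild oats data. The function name is hypothetical; the formula is F = 1 - (obs hets) / (exp hets) as defined earlier.

```python
def inbreeding_coefficient(n_hom_1, n_het, n_hom_2):
    """F = 1 - (observed heterozygotes) / (heterozygotes expected under H-W)."""
    total = n_hom_1 + n_het + n_hom_2
    p = (2 * n_hom_1 + n_het) / (2 * total)
    q = 1.0 - p
    expected_hets = 2 * p * q * total
    return 1.0 - n_het / expected_hets


f_wild_oats = inbreeding_coefficient(104, 9, 42)
print(round(f_wild_oats, 2))  # about 0.86

# A population in exact H-W proportions gives F = 0.
print(round(inbreeding_coefficient(18, 84, 98), 6))
```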

Page 19: Population Genetics

Inbreeding Depression and Genetic Load

• For most species, including humans, too much inbreeding leads to weak and sickly individuals, as seen in this example of mice inbred by brother-sister matings.

• Inbreeding depression is caused by homozygosity for recessive alleles that have slight deleterious effects. It has been estimated that on the average, each human carries 3 recessive lethal alleles. These are not expressed because they are covered up by dominant wild type alleles. This concept is called the “genetic load”.

• However, it has been argued that some amount of inbreeding is good, because it allows the expression of recessive genes with positive effects. The level of inbreeding in the US has been estimated (from Roman Catholic parish records) at about F = 0.0001, which is approximately equivalent to each person mating with a fifth cousin.

generation    litter size    % dead by 4 weeks

0             7.50           3.9

6             7.14           4.4

12            7.71           5.0

18            6.58           8.7

24            4.58           36.4

30            3.20           45.5

Page 20: Population Genetics

Mutation

• Mutation is unavoidable. It happens as a result of radiation in the environment: cosmic rays, radioactive elements in rocks and soil, etc., as well as mutagenic chemical compounds, both natural and artificially made, and just as a chance event inherent in the process of DNA replication.

• However, the rate of mutation is quite low: for any given gene, about 1 copy in 10^4 to 10^6 is a new mutation.

• Mutations provide the necessary raw material for evolutionary change, but by themselves new mutations do not have a measurable effect on allele or genotype frequencies.

Page 21: Population Genetics

Migration

• Migration is the movement of individuals in or out of a population. Migration is necessary to keep a species from fragmenting into several different species. Even as low a level as one individual per generation moving between populations is enough to keep a species unified.

• Migration can be thought of as combining two populations with different allele frequencies and different numbers together into a single population. After one generation of random mating, the combined population will once again be in H-W equilibrium.

Page 22: Population Genetics

Migration Examples

• Population X has 20 individuals with frequency of the A allele = 0.8. Population Y has 10 individuals with frequency of the A allele = 0.2. The two populations mix. What is the frequency of A in the final population?

• There are 20 + 10 = 30 individuals in the final population, for a total of 60 copies of the gene.

– For population X (40 copies), 40 * 0.8 = 32 copies are A, and 8 are a.

– For population Y (20 copies), 20 * 0.2 = 4 copies are A, and 16 are a.

– Adding these together, the final population has 32 + 4 = 36 A alleles and 8 + 16 = 24 a alleles. Out of 60 alleles, the frequency of A is 36/60 = 0.6.

• A real example: African Americans have a large proportion of African ancestry, but also some European ancestry. The Duffy blood group has an allele with a frequency of 0 among West African populations, and an average frequency of 0.43 among European populations. Because the allele is absent in West Africa, its frequency in an African American population, divided by 0.43, estimates the proportion of European ancestry. Other blood groups can also be used in this technique: very little assortative mating occurs on the basis of blood group.

– In Oakland CA, African-Americans are reported to have about 22% European ancestry

– In Charleston South Carolina, the proportion is about 3.7%
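The mixing calculation for populations X and Y earlier on this slide can be sketched as a weighted average over gene copies. The second function runs the logic in reverse, which is the idea behind the ancestry estimates; its formula is the standard two-source admixture estimate and is an assumption here, since the slides do not state it. Both function names are hypothetical.

```python
def mixed_allele_frequency(populations):
    """populations: list of (number of diploid individuals, frequency of allele A)."""
    total_copies = sum(2 * n for n, _ in populations)
    a_copies = sum(2 * n * freq for n, freq in populations)
    return a_copies / total_copies


def admixture_proportion(freq_mixed, freq_source_1, freq_source_2):
    """Fraction of the mixed gene pool that came from source 2 (assumed formula)."""
    return (freq_mixed - freq_source_1) / (freq_source_2 - freq_source_1)


freq_a = mixed_allele_frequency([(20, 0.8), (10, 0.2)])
print(round(freq_a, 2))  # 0.6

# Population Y contributed 10 of the 30 individuals, so about 1/3 is recovered.
y_share = admixture_proportion(freq_a, 0.8, 0.2)
print(round(y_share, 3))  # 0.333
```

When the allele has a frequency of 0 in one source, as with the Duffy allele in West Africa, the second function reduces to the mixed frequency divided by the frequency in the other source.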

Page 23: Population Genetics

Selection

• Selection is the primary factor driving evolution. Genes that confer increased fitness tend to take over a population. Note that random events also play a big role: sometimes a “good” gene is lost due to chance events. Also, a gene that confers increased fitness in one environment may confer decreased fitness in another environment.

• Selection can occur at many places in the life cycle: the embryo might be defective, the fetus might not survive to birth, the immature offspring might be killed, the individual might not be able to find a mate or might be sterile.

• We will simplify all of this by assuming that the gametes are produced at random and combine at random, to produce a population of zygotes in H-W equilibrium. Then, we will apply selection to the zygotes, killing off different proportions of the different genotypes.

• Fitness is a function of the genotype. We will define the “relative fitness” of the best genotype as equal to 1.0, and the fitnesses of the two other genotypes as equal to or less than 1.

Page 24: Population Genetics

Selection Against Recessive Homozygote

• This situation is what happens with a recessive genetic disease. Heterozygotes and dominant homozygotes are indistinguishable and have the same relative fitness: 1.0. The recessive homozygote has the genetic disease and a fitness less than 1. The exact fitness depends on the nature of the disease.

• Start with a population where p = 0.6 and q = 0.4, and assume that the aa homozygote has a relative fitness of 0.1 (i.e. 90% of the aa offspring die without reproducing).

• The zygotes produced (in H-W equilibrium) are 0.36 AA, 0.48 Aa, and 0.16 aa.

• Selection on the zygotes reduces the aa’s by 90%, to 0.016.

• However, proportions must add to 1.0, so we divide each proportion by a correction factor. The correction factor is the sum of the remaining proportions: 0.36 + 0.48 + 0.016 = 0.856.

• So, after selection, the frequency of AA is 0.36 / 0.856 = 0.42. The frequency of Aa is 0.48 / 0.856 = 0.56. The frequency of aa is 0.016 / 0.856 = 0.019.

• Final allele frequencies: A = 0.42 + 1/2 * 0.56 = 0.70. a = 1 - freq(A) = 0.3.
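One generation of this selection scheme can be sketched as below. The function and parameter names are illustrative; the "correction factor" in the text appears here as mean_fitness.

```python
def select_one_generation(p, w_hom_dominant=1.0, w_het=1.0, w_hom_recessive=0.1):
    """Apply selection to H-W zygotes; return (genotype freqs after selection, new p)."""
    q = 1.0 - p
    zygotes = (p * p, 2 * p * q, q * q)
    fitnesses = (w_hom_dominant, w_het, w_hom_recessive)
    survivors = [freq * w for freq, w in zip(zygotes, fitnesses)]
    mean_fitness = sum(survivors)  # the correction factor (0.856 in the example)
    after = [s / mean_fitness for s in survivors]
    new_p = after[0] + 0.5 * after[1]
    return after, new_p


after_selection, new_p = select_one_generation(0.6)
print([round(f, 3) for f in after_selection])
print(round(new_p, 2))  # 0.7

# Repeating the step shows the recessive allele declining generation by generation.
p = 0.6
for generation in range(5):
    _, p = select_one_generation(p)
    print(generation + 1, round(1.0 - p, 3))
```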

Page 25: Population Genetics

Selection Favoring the Heterozygote

• Some genes maintain 2 alleles in the population by having the heterozygote more fit than either homozygote.

• An example is HbS, the sickle cell hemoglobin allele. In rural West Africa, where malaria is endemic and medical support is rudimentary, the relative fitness of the HbA homozygote is estimated at 0.85, due to susceptibility to malaria. The relative fitness of the HbS homozygote is estimated at approximately 0, with almost none reaching reproductive age due to sickle cell disease. The heterozygote is the most fit, so it is given a relative fitness of 1.0. Under these conditions, it is possible to predict an equilibrium frequency of the HbS allele of about 0.13. This is approximately what is seen in various West African countries.
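The predicted equilibrium of about 0.13 can be reproduced with the standard heterozygote-advantage result q = s_A / (s_A + s_S), where each s is 1 minus the fitness of the corresponding homozygote. That formula is not stated on the slide, so treat it as an assumption; the sketch also iterates selection generation by generation as a cross-check. Function names are hypothetical.

```python
def equilibrium_frequency(w_hom_a, w_hom_s):
    """Equilibrium frequency of allele S when the heterozygote has fitness 1.0."""
    s_a = 1.0 - w_hom_a   # selection against the HbA homozygote
    s_s = 1.0 - w_hom_s   # selection against the HbS homozygote
    return s_a / (s_a + s_s)


def iterate_selection(q, w_hom_a, w_hom_s, generations):
    """Follow the frequency q of allele S through repeated rounds of selection."""
    for _ in range(generations):
        p = 1.0 - q
        mean_fitness = p * p * w_hom_a + 2 * p * q * 1.0 + q * q * w_hom_s
        q = (p * q * 1.0 + q * q * w_hom_s) / mean_fitness
    return q


predicted = equilibrium_frequency(0.85, 0.0)
simulated = iterate_selection(0.01, 0.85, 0.0, generations=200)
print(round(predicted, 2))  # 0.13
print(round(simulated, 2))
```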

Page 26: Population Genetics

Genetic Drift

• Genetic drift is random change in allele frequencies. Genetic drift occurs in all populations, but it has a major effect on small populations.

• For Darwin and the neo-Darwinians, selection was the only force that had a significant effect on evolution. More recently it has been recognized that random changes, genetic drift, can also significantly influence evolutionary change. It is thought that most major evolutionary events occur in small isolated populations.

• Simple example: A population of 1 female and 2 males, where the female chooses only 1 male to mate with. Assume that the female has the Aa genotype, male #1 is AA, and male #2 is aa.

– Initially the allele frequencies are 0.5 A and 0.5 a.

– If male #1 gets to mate, the offspring will have allele frequencies of 0.75 A and 0.25 a.

– If male #2 mates, the offspring will be 0.25 A and 0.75 a.

Page 27: Population Genetics

Fixation of Alleles

• Genetic drift causes allele frequencies to fluctuate randomly each generation. However, if the frequency of an allele ever reaches zero, it is permanently eliminated from the population. The other allele, whose frequency is now 1.0, is “fixed”, which means that all individuals in the population will be homozygous for that allele. This continues for all future generations (in the absence of mutation).

• The average rate at which alleles become fixed is a function of the population size. The larger the population, the longer it takes for fixation to occur.
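The link between population size and fixation time can be illustrated with a small simulation. This sketch assumes a simple Wright-Fisher style model (each gene copy in the next generation is drawn at random from the current gene pool), which the slides do not name; the function names, replicate count, and seed are arbitrary choices.

```python
import random


def generations_to_fixation(n_individuals, p_start=0.5, rng=None):
    """Count generations until one of two alleles is lost by drift alone."""
    if rng is None:
        rng = random.Random()
    n_copies = 2 * n_individuals          # diploid: 2 gene copies per individual
    count_a = round(p_start * n_copies)
    generations = 0
    while 0 < count_a < n_copies:
        p = count_a / n_copies
        # Draw every gene copy of the next generation at random from the pool.
        count_a = sum(rng.random() < p for _ in range(n_copies))
        generations += 1
    return generations


def mean_fixation_time(n_individuals, replicates=30, seed=1):
    """Average fixation time over several independent replicate populations."""
    rng = random.Random(seed)
    times = [generations_to_fixation(n_individuals, rng=rng) for _ in range(replicates)]
    return sum(times) / len(times)


small_population = mean_fixation_time(5)
large_population = mean_fixation_time(50)
print("N = 5: ", small_population, "generations on average")
print("N = 50:", large_population, "generations on average")
```

Individual runs vary widely, which is why the comparison averages over replicates.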

Page 28: Population Genetics

Population Bottlenecks and Founder Effect

• Bottlenecks and the founder effect are closely related phenomena.

• Founder effect: If a small group of individuals leaves a larger population and develops into a separate, isolated population, the allele frequencies in the new population are determined by the allele frequencies in the founders. Since these frequencies are probably different from those found in the general population, the new population will have a different set of frequencies.

• This is especially true for rare alleles, which can suddenly become prominent if one of the founders has the rare allele.

Page 29: Population Genetics

Founder Effect Example

• The Amish are a group descended from 30 Swiss founders who renounced technological progress. Most Amish mate within the group. One of the founders had Ellis-van Creveld syndrome, which causes short stature, extra fingers and toes, and heart defects. Today about 1 in 200 Amish are homozygous for this syndrome, which is very rare in the larger US population.

• Note the effect inbreeding has here: the problem comes from this recessive condition becoming homozygous due to the mating of closely related people.

Page 30: Population Genetics

Bottlenecks

• A population bottleneck is essentially the same phenomenon as the founder effect, except that in a bottleneck, the entire species is wiped out except for a small group of survivors. The allele frequencies in the survivors determine the allele frequencies in the population after it grows large once again.

• Example: Pingelap atoll is an island in the South Pacific. A typhoon in 1780 killed all but 30 people. One of the survivors was a man who was heterozygous for the recessive genetic disease achromatopsia. This condition causes complete color blindness. Today the island has about 2000 people on it, nearly all descended from these 30 survivors. About 10% of the population is homozygous for achromatopsia. Assuming H-W equilibrium, this implies an allele frequency of about 0.32 (the square root of 0.10).

Page 31: Population Genetics

Human Bottleneck

• The human population is thought to have gone through a population bottleneck about 100,000 years ago. There is more genetic variation among chimpanzees living within 30 miles of each other in central Africa than there is in the entire human species.

• The tree represents mutational differences in mitochondrial DNA for various members of the Great Apes (including humans).