BASIC of GENETICS: WHAT YOU NEED TO KNOW

Ahmed Rebai

Ahmed.rebai@cbs.rnrt.tn

DNA: THE CODE OF LIFE

DNA is a molecule made of four building blocks (nucleotides)

Living cells and organisms carry DNA within them

DNA contains the ‘text’ of life

DNA

FROM DNA TO PROTEIN

DNA

Parts of DNA are CODING (they give proteins): only about 3% of the human genome, but roughly 70% of the yeast genome

Parts of DNA are NON-CODING:

Introns

Regulatory regions of genes

Other (so-called junk DNA)

GENE

Gene: a section of DNA that codes for a protein; the protein in turn contributes to a trait

A chromosome is a ‘chunk’ of DNA and genes are parts of chromosomes

GENES … ALLELES

Because we have a pair of each chromosome, we have two copies of each gene

These two copies can be identical in sequence or different: the alternative forms are called ALLELES

Alleles can yield different phenotypes

ALLELE

Allele: the different ‘options’ for a gene

Example: attached or unattached earlobes are the alleles for the gene for earlobe shape

DOMINANT/RECESSIVE

Dominant: an allele that blocks or hides a recessive allele

Recessive: an allele that is blocked by or hidden by a dominant allele

GENOTYPE

Genotype: A person’s set of alleles (gene options)

Genotypes can be noted by:

Two letters denoting alleles: AA, AB, BB or, for single-nucleotide variations, for example AA, AG, GG

A digit: 1, 2, 3 or 0, 1, 2 (the number of copies of a chosen reference allele)

HOMOZYGOUS/HETEROZYGOUS

Homozygous: When a person’s two alleles for a gene are the same

Heterozygous: When a person’s two alleles for a gene are different

You get one allele from your mom and one from your dad.

If you get the same alleles from your mom and dad, you are homozygous for that gene.

If your mom gave you a different allele than your dad, you are heterozygous for that gene

PHENOTYPE

Phenotype: A person’s observable features, which result from their genotype

What you look like (your phenotype) is based on what your genotype is (your genes)

SEGREGATION: LESSONS FROM PEAS

Mendel (1822-1884) worked in the monastery of St. Thomas in the town of Brno (Brünn), in what is now the Czech Republic. Through a series of experiments on garden peas in 1856-1863, he discovered the laws of inheritance

SEXUAL REPRODUCTION

MENDELIAN GENETICS: THE LAWS

SEGREGATION

SEGREGATION RULES

1. Genes come in pairs, which means that a cell or individual has two copies (alleles) of each gene.

2. For each pair of genes, the alleles may be identical (homozygous WW or homozygous ww), or they may be different (heterozygous Ww).

3. Each reproductive cell (gamete) produced by an individual contains only one allele of each gene (that is, either W or w).

4. In the formation of gametes, any particular gamete is equally likely to include either allele (hence, from a heterozygous Ww genotype, half the gametes contain W and the other half contain w).

5. The union of male and female reproductive cells is a random process that reunites the alleles in pairs.

MENDEL’S FIRST LAW

The Principle of Segregation: In the formation of gametes, the paired hereditary determinants separate (segregate) in such a way that each gamete is equally likely to contain either member of the pair.

RECOMBINATION

Mendel studied co-segregation of two genes by crossing: Wrinkled and Green x Round and Yellow

MENDEL’S SECOND LAW

The Principle of Independent Assortment: Segregation of the members of any pair of alleles is independent of the segregation of other pairs in the formation of reproductive cells.

This is of course valid only for unlinked genes

RECOMBINATION

When two genes are linked (close on the same chromosome) they do not segregate independently; frequencies of genotypes in progeny depend on the distance between genes

MULTIPLE GENES FOR A PHENOTYPE: POLYGENIC TRAITS

CONTINUOUS SCALE FOR A PHENOTYPE

LET US EXERCISE

What are the genotypes produced by the following matings, and their frequencies?

AA x AA, AA x Aa, AA x aa, Aa x Aa, Aa x aa, aa x aa

What are the frequencies of two-gene genotypes from this mating: AABb x AaBB?
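The matings in this exercise can be enumerated mechanically from the segregation rules. The sketch below is only one way to do it: the function names `gametes` and `cross` are illustrative, genotypes are assumed to be written as two letters per gene (for example `AaBb`), and the genes are assumed unlinked so that they assort independently.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """Gametes of a genotype such as 'Aa' or 'AaBb', with their probabilities.

    Assumption: two consecutive letters per gene, and unlinked genes.
    """
    genes = [genotype[i:i + 2] for i in range(0, len(genotype), 2)]
    # Each gamete receives one allele of each gene, each with probability 1/2.
    counts = Counter("".join(choice) for choice in product(*genes))
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


def cross(parent1, parent2):
    """Offspring genotype frequencies for the mating parent1 x parent2."""
    offspring = Counter()
    pairs = product(gametes(parent1).items(), gametes(parent2).items())
    for (g1, p1), (g2, p2) in pairs:
        # Reunite the alleles gene by gene, uppercase allele written first.
        genotype = "".join("".join(sorted(a + b)) for a, b in zip(g1, g2))
        offspring[genotype] += p1 * p2
    return dict(offspring)


for mating in [("AA", "Aa"), ("Aa", "Aa"), ("Aa", "aa"), ("AABb", "AaBB")]:
    print(mating, cross(*mating))
```

Exact fractions are used so that the results can be compared directly with the hand calculation.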

POPULATION GENETICS: Basic concepts and theories

PROBABILITY IN POPULATION GENETICS

Consider the offspring of the mating Aa x Aa

The addition rule:

Pr(an offspring has at least one A allele) = Pr(A-) = Pr(AA or Aa) = Pr(AA) + Pr(Aa) = 1/4 + 1/2 = 3/4

For any two mutually exclusive events A and B: Pr(A or B) = Pr(A) + Pr(B)

The multiplication rule:

Pr(two offspring each having at least one A allele) = Pr(A- and A-) = Pr(A-) x Pr(A-) = 3/4 x 3/4 = 9/16

For any two independent events A and B: Pr(A and B) = Pr(A) x Pr(B)

EXERCISE

Two individuals with genotypes Aa and Aa married and had three children; what is the probability that exactly one of their children has the genotype aa?

Pr(aa and (AA or Aa) and (AA or Aa)) = Pr(aa) x Pr(A-) x Pr(A-) = 1/4 x 3/4 x 3/4 = 9/64

But since the aa child can occupy any of three birth orders, we multiply by 3, which gives 27/64.

Compute the same probability for two children (answer: 6/16; for four children it is again 27/64).
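The birth-order argument is the binomial formula in disguise: the number of possible positions for the aa child is a binomial coefficient. A minimal sketch, assuming independent births and reading the question as "exactly one aa child" (the helper name `prob_exactly_k` is illustrative):

```python
from fractions import Fraction
from math import comb


def prob_exactly_k(n_children, k, p_event):
    """Probability that exactly k of n independent children show the event."""
    return comb(n_children, k) * p_event**k * (1 - p_event) ** (n_children - k)


p_aa = Fraction(1, 4)  # Pr(aa) for each child of an Aa x Aa mating
for n in (2, 3, 4):
    print(n, "children:", prob_exactly_k(n, 1, p_aa))
```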

ORGANIZATION OF GENETIC VARIATION

A population is a group of organisms of the same species living within a sufficiently restricted geographical area that any member can potentially mate with any other member (of the opposite sex)

Population subdivision can be due to geographic constraints as well as to social behaviour

Local populations (by country, town, ...): groups of individuals that can interbreed, also called subpopulations or Mendelian populations

GENETIC VARIATION

Phenotypic diversity in natural populations is impressive and is due to genetic variation: multiple alleles for many genes affecting the phenotype

Population genetics is concerned with describing how alleles are organized into genotypes, and with determining whether alleles of the same or different genes are associated at random

ALLELE FREQUENCIES IN POPULATIONS

Allele frequency is the proportion in the population of all alleles of the gene that are of the specified type

Since populations are large, allele frequencies are estimated from a population sample

Consider a gene with genotypes AA, Aa and aa, and a sample of N individuals

We count the number of individuals that have the AA, Aa and aa genotypes (denoted NAA, NAa and Naa, respectively) and we estimate the frequency of allele A by the proportion of A alleles among all alleles in the sample, that is:

pA = (2NAA + NAa)/(2N)

and then pa = 1 - pA

EXAMPLE

In a sample of 1000 individuals, 298 were of genotype MM, 489 MN and 213 NN, so the frequency of allele M is

pM = (2*298 + 489)/(2*1000) = 0.54

We can compute a 95% confidence interval for the frequency based on the binomial law and its normal approximation:

$\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/(2N)}$

This approximation is only valid for frequencies that are neither small (>0.1) nor high (<0.9)

In this example we get [0.52 ; 0.56]
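The estimate and its interval can be reproduced in a few lines. This is a sketch under one assumption: the interval is taken to be the usual normal-approximation (Wald) form, p ± 1.96·sqrt(p(1−p)/(2N)), with 2N the number of sampled alleles. The function names are illustrative.

```python
from math import sqrt


def allele_frequency(n_AA, n_Aa, n_aa):
    """Frequency of allele A estimated from genotype counts."""
    n_individuals = n_AA + n_Aa + n_aa
    return (2 * n_AA + n_Aa) / (2 * n_individuals)


def wald_interval(p, n_individuals, z=1.96):
    """Normal-approximation confidence interval based on 2N sampled alleles."""
    half_width = z * sqrt(p * (1 - p) / (2 * n_individuals))
    return p - half_width, p + half_width


p_M = allele_frequency(298, 489, 213)
low, high = wald_interval(p_M, 1000)
print(f"pM = {p_M:.4f}, 95% CI = [{low:.2f} ; {high:.2f}]")
```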

FOR RARE ALLELES

For rare alleles (frequency below 1%) there is a chance that the sample does not contain any carrier of the allele, so the estimated frequency will be 0

An alternative is to use Empirical Bayes estimation: p=(k+2)/(n+4), where k is the observed number of copies of the allele in the sample and n the total number of alleles (this corresponds to a Beta(2,2) prior; a strictly uniform prior gives (k+1)/(n+2))

RANDOM MATING

Random mating means that any two individuals (of opposite sex) have the same probability of mating

This means that genotypes meet each other with the same probability as if they were formed by random collision of genotypes

Random mating can apply to some genes, like those controlling blood groups or neutral polymorphisms, but not to others, like those controlling skin color or height

NON-OVERLAPPING GENERATIONS

Formally this means that the cycle of birth, maturation and death includes the death of all individuals present in each generation before the next generation matures

This is only an approximation (simplistic in humans) but it works well as far as genotype frequencies are concerned

THE HARDY-WEINBERG PRINCIPLE

If we assume that:

The organism is diploid
Reproduction is sexual
Generations are non-overlapping
Allele frequencies are identical in males and females
The population is of large size
Mating is random
Migration and mutation are negligible
Natural selection does not affect the alleles

THEN...

Genotype frequencies can be deduced from allele frequencies (p is the frequency of allele A, q=1-p that of allele a):

AA: p²    Aa: 2pq    aa: q²

These frequencies (allelic and genotypic) remain the same over generations: we say that the population is in Hardy-Weinberg Equilibrium (HWE)
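A short numerical check makes the equilibrium claim concrete: starting from arbitrary genotype frequencies, one round of random mating gives p², 2pq, q², and the allele frequency computed from those proportions is unchanged. The function names and the starting frequencies below are illustrative choices, not taken from the slides.

```python
def allele_frequency_from_genotypes(freqs):
    """Frequency of allele A from genotype frequencies AA, Aa, aa."""
    return freqs["AA"] + freqs["Aa"] / 2


def hardy_weinberg(p):
    """Genotype frequencies after one generation of random mating."""
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Arbitrary starting population that is not in Hardy-Weinberg proportions.
generation = {"AA": 0.5, "Aa": 0.2, "aa": 0.3}
for t in range(3):
    p = allele_frequency_from_genotypes(generation)
    print(t, round(p, 4), {g: round(f, 4) for g, f in generation.items()})
    generation = hardy_weinberg(p)
```

Genotype frequencies change once (from generation 0 to 1) and then stay fixed, while p never changes.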

WHY?

IMPLICATION OF HWE

Despite its very restrictive and often unrealistic assumptions, HWE offers a reference model in which there are no evolutionary forces at work other than those imposed by the process of reproduction itself (like a mechanical model of a falling object with no force in action other than gravity)

The HW model separates the life cycle into two phases: gametes -> zygote and zygote -> adult

Even if the assumption of non-overlapping generations does not hold, HWE will be attained gradually

Applies also to multiallelic genes

IMPLICATION OF HWE

APPLICATION OF HWE

We can calculate the number of carriers of a rare mutation in the population

Example: the incidence of cystic fibrosis in the European population is known to be about 1 in 1700 (q²=1/1700, so q=0.024), so the frequency of heterozygous carriers (under HWE) is 2pq, about 5%

So when an allele is very rare, most genotypes containing this allele are heterozygous:

Show that for a rare allele of frequency 1/1000 there are about 2000 times more heterozygotes than recessive homozygotes.
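Both calculations follow directly from the Hardy-Weinberg proportions. The sketch below assumes a recessive disease whose incidence equals q²; the function names are illustrative.

```python
from math import sqrt


def carrier_frequency(disease_incidence):
    """Heterozygote frequency 2pq when incidence = q**2 (recessive disease)."""
    q = sqrt(disease_incidence)
    return 2 * (1 - q) * q


def het_to_hom_ratio(q):
    """Ratio of heterozygotes (2pq) to recessive homozygotes (q**2)."""
    return 2 * (1 - q) * q / q**2


print(f"carriers: {carrier_frequency(1 / 1700):.3f}")
print(f"heterozygotes per homozygote: {het_to_hom_ratio(1 / 1000):.0f}")
```

The ratio simplifies to 2(1−q)/q, which is close to 2/q when q is small.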

HWE DEVIATION

Deviation from HWE can be due to inbreeding, population stratification, selection, gender-dependent allele frequencies, non-random (assortative) mating

The principle does not apply directly to X-linked or Y-linked genes

TESTS OF HWE

Compare observed to expected genotype counts using Pearson's chi-square goodness-of-fit test: with 3 genotypes and 1 estimated parameter (p), the test has 1 df

This test is inappropriate for rare variants (low genotype counts): use Fisher's Exact Test (FET) instead

Other exact tests are available in the R language (e.g. Genetics package, ...)

PEARSON CHI-SQUARE THROUGH D

Let $D_A = P_{AA} - p^2$

Testing HWE is testing $D_A = 0$

$\chi^2 = \frac{N D_A^2}{(p(1-p))^2}$

Compute p-value = $\Pr(\chi^2_{1\,df} > \chi^2_{obs})$

If p-value < 0.05 (or 0.0001), then there is deviation from HWE

TESTS OF HWE: LET’S DO IT!

Example: In a sample of 1000 individuals, 298 were of genotype MM, 489 MN and 213 NN, so the frequency of allele M is pM=0.54

Genotypes:         MM      MN      NN
Observed counts:   298     489     213
Expected counts:   294.3   496.4   209.3

PMM=0.294 so D=0.298-0.294=0.004

χ²=N D²/(p(1-p))²=1000*(0.004/(0.54*0.46))²

χ²≈0.26<3.84; p-value=0.61
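The same test can be run from the genotype counts alone. This is a minimal sketch with an illustrative function name; it uses the Pearson statistic over the three genotypes, which for two alleles equals N·D²/(p(1−p))², and obtains the 1-df p-value from the complementary error function. Because it works with unrounded frequencies, it returns a slightly smaller statistic than a hand calculation with D rounded to 0.004.

```python
from math import erfc, sqrt


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Pearson chi-square test of Hardy-Weinberg proportions (1 df)."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: Pr(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = hwe_chi_square(298, 489, 213)
print(f"chi2 = {chi2:.3f}, p-value = {p_value:.2f}")
```

For low genotype counts an exact test remains the better choice; this sketch covers only the chi-square version.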

HAPLOTYPES FROM GENOTYPES

If we study many genes, they can be linked and one can use haplotypes

A haplotype (haploid genotype) is the set of alleles carried by one chromosome for several genes

Consider two genes (A,a) and (B,b) with allele frequencies (pA, pa) and (pB, pb)

If gametic frequencies are the products of allele frequencies:

AB: pA x pB, Ab: pA x pb, aB: pa x pB, ab: pa x pb

we say that the genes are in random association, or in Linkage Equilibrium

LINKAGE DISEQUILIBRIUM

If the observed frequency of a gamete (e.g. PAB) differs from that expected under linkage equilibrium (pA x pB), we say that the genes are in Linkage Disequilibrium (LD)

To measure and test LD we need to know the haplotype frequencies

LINKAGE DISEQUILIBRIUM

[Figure: two SNPs with allele frequencies 40% / 60% (SNP1, alleles A and a) and 30% / 70% (SNP2, alleles B and b). With no LD, the four haplotype frequencies are the products of the allele frequencies: 12%, 28%, 18% and 42%. With LD, only three haplotypes are observed, with frequencies 60%, 30% and 10%.]

LD MEASURES: D

The difference between observed and expected haplotype frequency:

$D = P_{AB} - p_A p_B$

It is also equal to:

$D = P_{AB} P_{ab} - P_{Ab} P_{aB}$

D is bounded between Dmin and Dmax

D’: STANDARDIZED D

In practice, choose alleles A and B such that D>0 and pA>pB

A standardized measure of LD is thus:

$D' = \frac{D}{D_{max}} = \frac{D}{(1 - p_A)\,p_B}$

D’=1 denotes complete LD

THE R² MEASURE: MORE PRACTICAL

This is the squared correlation from the 2x2 contingency table of haplotype counts:

$r^2 = \frac{D^2}{P_A P_a P_B P_b}$

Or:

$r^2 = (D')^2 \, \frac{P_a P_B}{P_A P_b}$

TESTING LD

We can show that N r² is a chi-square test statistic of LD (1 df)

Exercise: two blood group systems, M/N and S/s, gave the following haplotype counts (1000 individuals):

MS: 474    Ms: 611    NS: 142    Ns: 733

Allele frequencies are M: 0.54, S: 0.31

Compute D, D’ and r², then test LD

Solution: D=0.07, D’=0.50, r²≈0.09, X²=N r²≈170 (N being the number of haplotypes), p<10^-30
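The three LD measures can be computed directly from the four haplotype counts. This is a sketch with an illustrative function name; it takes N to be the number of haplotypes actually counted, and uses the general form of Dmax, which handles both signs of D, so the alleles do not need to be relabelled first.

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """D, D', r2 and the chi-square statistic N*r2 from haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = p_AB - p_A * p_B
    # Largest |D| compatible with the allele frequencies, for the sign of D.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return {"D": d, "D_prime": d_prime, "r2": r2, "chi2": n * r2}


# Haplotype counts from the M/N and S/s exercise: MS, Ms, NS, Ns.
for name, value in ld_measures(474, 611, 142, 733).items():
    print(f"{name} = {value:.3f}")
```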

CAUSES OF LD

LD is ‘created by linkage’

If r is the recombination rate between two genes, then we can show that LD at generation t is given by

$D_t = (1-r)^t D_0$

If r is small (genes very close on the chromosome), the decay is very slow and LD can persist for hundreds of generations
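To get a feel for how slow the decay is, one can compute how many generations are needed for D to fall to half of its initial value under Dt = (1−r)^t D0. A small sketch; the function names and the example recombination rates are illustrative.

```python
from math import ceil, log


def ld_after(d0, r, t):
    """LD after t generations of random mating with recombination rate r."""
    return (1 - r) ** t * d0


def generations_to_fraction(r, fraction=0.5):
    """Smallest t such that (1 - r)**t <= fraction."""
    return ceil(log(fraction) / log(1 - r))


for r in (0.5, 0.1, 0.01, 0.001):
    print(f"r = {r}: LD halves after {generations_to_fraction(r)} generations")
```

Even for unlinked genes (r = 0.5), LD is only halved each generation rather than removed at once.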

RECOMBINATION AND LD

[Figure: gametes produced by a double heterozygote; each non-recombinant gamete has frequency (1-r)/2 and each recombinant gamete r/2]

DECAY OF LD OVER GENERATIONS

ADMIXTURE OF POPULATIONS

LD can be created by the merging of populations having different gametic frequencies

Consider two populations and two genes in linkage equilibrium in both, where alleles A and B both have frequency 0.05 in the first population and 0.95 in the second population

A new population is formed by an equal mixture of the two populations; show that LD is high in that population (D=0.2 and D’=0.81).
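The exercise can be checked numerically by pooling the haplotype frequencies of the two source populations. A minimal sketch, assuming linkage equilibrium within each source population and a simple weighted mixture; the function name and parameters are illustrative.

```python
def ld_in_mixture(p_A1, p_B1, p_A2, p_B2, weight1=0.5):
    """D and D' in a mixture of two populations, each in linkage equilibrium."""
    w1, w2 = weight1, 1 - weight1
    p_A = w1 * p_A1 + w2 * p_A2
    p_B = w1 * p_B1 + w2 * p_B2
    # Within each source population the AB haplotype frequency is pA * pB.
    p_AB = w1 * p_A1 * p_B1 + w2 * p_A2 * p_B2
    d = p_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return d, abs(d) / d_max


d, d_prime = ld_in_mixture(0.05, 0.05, 0.95, 0.95)
print(f"D = {d:.4f}, D' = {d_prime:.2f}")
```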

ADMIXTURE

NATURAL (DARWINIAN) SELECTION

Individuals differ in their ability to survive and reproduce owing in part to their genotype

The selective advantage/disadvantage is measured by fitness

Selection results in a change of allele frequencies over generations and deviation from HWE

EFFECT OF SELECTION

RANDOM GENETIC DRIFT

In each generation, chance plays a role in the drawing of the gametes that will unite to form the next generation

This chance can result in a random change in allele frequency and may ultimately lead to the fixation or elimination of some alleles

SIMPLY SAYING

MATHEMATICAL MODELS OF DRIFT

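Wright-Fisher model (1930): in a population of N diploid individuals (2N gene copies), the probability of obtaining k copies of an allele that had frequency p in the last generation is:

$\Pr(k) = \binom{2N}{k} p^k (1-p)^{2N-k}$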

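The expected time (in generations) before a neutral allele of initial frequency p becomes fixed through genetic drift is given by:

$\bar{T}_{fixed} = \frac{-4 N_e (1-p) \ln(1-p)}{p}$

where $N_e$ is the effective population size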
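The Wright-Fisher sampling step is easy to evaluate numerically. The sketch below assumes the standard form of the model, in which the 2N gene copies of the next generation are drawn binomially from the current allele frequency p; the function name and the example values of N and p are illustrative.

```python
from math import comb


def wright_fisher_transition(k, n_diploid, p):
    """Pr(k copies in the next generation) under binomial sampling of 2N copies."""
    copies = 2 * n_diploid
    return comb(copies, k) * p**k * (1 - p) ** (copies - k)


n_diploid, p = 10, 0.1
lost = wright_fisher_transition(0, n_diploid, p)
fixed = wright_fisher_transition(2 * n_diploid, n_diploid, p)
print(f"Pr(allele lost in one generation)  = {lost:.4f}")
print(f"Pr(allele fixed in one generation) = {fixed:.2e}")
```

With only 10 individuals, an allele at frequency 0.1 already has a sizeable chance of being lost in a single generation, which is the essence of drift in small populations.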

POPULATION BOTTLENECK

FOUNDER EFFECT

POPULATION SUBSTRUCTURE

Substructure occurs when a population is organized in several subpopulations having different genetic compositions (allele frequencies)

Substructure generally results in a reduction of the heterozygote frequency relative to that expected with random mating (Wahlund principle)

Several measures are used to assess population substructure: the F-statistics

F-STATISTICS

Defined by Wright (1921)

(1-FIT)=(1-FIS)(1-FST)

ANOTHER FORMULATION

The most useful statistic to test substructure is FST, an index that measures the level of genetic divergence among subpopulations

FST=(HT-HS)/HT

HS: average heterozygosity among individuals within subpopulations

HT: average heterozygosity among individuals within the total population

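In terms of the variance $\sigma_p^2$ of allele frequencies among subpopulations (with mean frequency $\bar{p}$):

$F_{ST} = \frac{\sigma_p^2}{\bar{p}(1-\bar{p})}$

FST can be calculated with the R package hierfstat

FST is not a genetic distance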

HOW TO USE IT?

FST=1 means total divergence by fixation of alternative alleles in subpopulations

FST<0.05: little differentiation
0.05<FST<0.15: moderate
0.15<FST<0.25: high
FST>0.25: very high

Chi-square test with 1 df: X² = (k-1) N FST

Examples:

Europeans vs sub-Saharan Africans: 0.15
Japanese vs Africans: 0.19
Europeans: 0.11

EXAMPLE

Two populations where the allele frequency is 0.5 and 0.3
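For this example, FST can be computed from the definition FST = (HT − HS)/HT. The sketch below makes several assumptions that the slide does not state: a biallelic gene, subpopulations of equal size, and heterozygosities taken as the expected values 2p(1−p). The function name is illustrative.

```python
def fst_biallelic(subpop_freqs):
    """FST = (HT - HS) / HT for one biallelic gene, equal-sized subpopulations."""
    n = len(subpop_freqs)
    # HS: mean expected heterozygosity within subpopulations.
    h_s = sum(2 * p * (1 - p) for p in subpop_freqs) / n
    # HT: expected heterozygosity of the pooled population.
    p_bar = sum(subpop_freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)
    return (h_t - h_s) / h_t


print(f"FST = {fst_biallelic([0.5, 0.3]):.3f}")
```

The same value is obtained from the variance form, since the variance of the two frequencies is 0.01 and the mean frequency is 0.4.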

ADMIXTURE

Genetic admixture occurs when individuals from two or more previously separated populations begin interbreeding.

Admixture results in the introduction of new genetic lineages into a population.

Most human populations are a product of mixture of genetically distinct groups that intermixed within the last 4,000 years.

ADMIXTURE DETECTION

By testing HWE

Standard statistical methods applied to data on genotypes and allele/haplotype frequencies: Principal Component Analysis (PCA); clustering (K-means, hierarchical, ...)

Advanced methods:

Maximum likelihood (psmix R package)
Bayesian methods
Wavelet analysis (adwave R package)
STRUCTURE

PRINCIPAL COMPONENT ANALYSIS

CLUSTERING

STRUCTURE

Used for inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed.

http://pritchardlab.stanford.edu/structure.html

ADMIXTURE

https://www.genetics.ucla.edu/software/admixture/

R PACKAGES

Genetics: Classes and methods for handling genetic data. Includes classes to represent genotypes and haplotypes at single markers up to multiple markers on multiple chromosomes. Functions include allele frequencies, flagging homo/heterozygotes, flagging carriers of certain alleles, estimating and testing for Hardy-Weinberg disequilibrium, estimating and testing for linkage disequilibrium, ...

Adegenet: Classes and functions for genetic data analysis within the multivariate framework

Hierfstat: Estimation of hierarchical F-statistics from haploid or diploid genetic data with any number of levels in the hierarchy, following the algorithm. Functions are also given to test via randomisation the significance of each F and variance component

RECOMMENDED READINGS