![Page 1: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/1.jpg)
Lecture 3
Molecular Evolution and Phylogeny
![Page 2: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/2.jpg)
Facts on the molecular basis of life
• Every life form is genome based
• Genomes evolve
• There are large numbers of apparently homologous intra-genomic (paralog) and inter-genomic (ortholog) genes
• Some genes, especially those related to transcription and translation, are common to ALL life forms
• The closer two organisms are phylogenetically, the more similar their genomes and corresponding genes are
![Page 3: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/3.jpg)
Central dogma of molecular biology
DNA → RNA → Protein
![Page 4: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/4.jpg)
Basic assumptions of molecular evolution
• More closely related organisms have more similar genomes
• Highly similar genes are homologs (have the same ancestor)
• A universal ancestor exists for all life forms
• Molecular differences in homologous genes (or protein sequences) are positively correlated with evolution time
• Phylogenetic relations can be expressed by a dendrogram (a “tree”)
![Page 5: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/5.jpg)
The five steps in phylogenetics dancing
Modified from Hillis et al. (1993). Methods in Enzymology 224, 456-487
[Flowchart]
1. Sequence data
2. Align sequences (Phylogenetic signal? Patterns → evolutionary processes?)
3. Choose a method
• Distance methods: distance calculation (which model?); LS, ME, NJ
• Character-based methods: MP (weighting of sites, changes?); ML, MB (model?)
4. Calculate or estimate best-fit tree (single tree or optimality criterion)
5. Test phylogenetic reliability
![Page 6: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/6.jpg)
Why protein phylogenies?
• For historical reasons - first sequences...
• Most genes encode proteins...
• To study protein structure, function and evolution
• Comparing DNA and protein based phylogenies can be useful
  • Different genes - e.g. 18S rRNA versus EF-2 protein
  • Protein encoding gene - codons versus amino acids
![Page 7: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/7.jpg)
Proteins were the first molecular sequences to be used for phylogenetic inference
Fitch and Margoliash (1967)
Construction of phylogenetic trees.
Science 155, 279-284.
![Page 8: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/8.jpg)
Most of what follows is taken from:
Statistical Physics and Biological Information
Institute of Theoretical Physics, University of California at Santa Barbara
2001 May 7
![Page 9: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/9.jpg)
Understanding trees
[Figure: tree drawn against a time axis; labels: Root, 30 Mya, 22 Mya, 7 Mya; two drawings marked "same as"]
![Page 10: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/10.jpg)
Understanding trees #2
![Page 11: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/11.jpg)
Understanding trees #3
![Page 12: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/12.jpg)
Difference in homologous sequences is a measure of evolution time
Part of multiple sequence alignment of Mitochondrial Small Sub-Unit rRNA
Full length is ~ 950
11 primate species (order Primates) with mouse as outgroup
Change similarity matrix to distance matrix: d = 1 - S
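As a minimal sketch of the conversion d = 1 - S, the Python below takes S to be the fraction of identical residues over the aligned columns where neither sequence has a gap. The toy sequences, the gap-skipping rule, and the function names are assumptions for illustration; they are not taken from the lecture's rRNA alignment.

```python
def similarity(seq_a, seq_b, gap="-"):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return matches / compared


def distance_matrix(aligned):
    """Return {(name_x, name_y): d} with d = 1 - S for every ordered pair."""
    names = list(aligned)
    return {
        (x, y): 1.0 - similarity(aligned[x], aligned[y])
        for x in names
        for y in names
    }


# Hypothetical toy alignment (not the primate data from the slides).
toy_alignment = {
    "sp1": "ACGTACGT",
    "sp2": "ACGTACGA",
    "sp3": "AC-TTCGA",
}
toy_distances = distance_matrix(toy_alignment)
```

This raw distance is only the uncorrected fraction of differing sites; it does not account for multiple substitutions at the same site.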
![Page 13: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/13.jpg)
![Page 14: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/14.jpg)
From alignment construct pairwise distance
Note: Alignment is not the only way to compute distance
![Page 15: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/15.jpg)
Models of sequence evolution
![Page 16: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/16.jpg)
Jukes-Cantor (minimal) Model
All substitution rates = $\alpha$; all base frequencies = 1/4
Probability that a site differs between two sequences separated by total time $2t$: $3P_{ij}(2t)$
![Page 17: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/17.jpg)
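Derivation of Jukes-Cantor formula
• Let the probability of a site being a given base at time $t$ be $P(t)$, and let $\alpha$ be the substitution rate
• After an elapsed time $\Delta t$, the loss from mutation to the other three bases is $-3\alpha\,\Delta t\,P(t)$; the gain from the other bases is $\alpha\,\Delta t\,(1 - P(t))$
• Hence $P(t + \Delta t) = P(t) - 3\alpha\,\Delta t\,P(t) + \alpha\,\Delta t\,(1 - P(t))$, so $dP(t)/dt = \alpha - 4\alpha P(t)$
• Write $P(t) = a\exp(-bt) + c$; the solution is $b = 4\alpha$, $c = 1/4$, so $P(t) = a\exp(-4\alpha t) + 1/4$
• If $P(0) = 1$, then $a = 3/4$. If $P(0) = 0$, then $a = -1/4$
• Finally $P_{\text{same}}(t) = 1/4 + (3/4)\exp(-4\alpha t)$ and $P_{\text{change}}(t) = 1/4 - (1/4)\exp(-4\alpha t)$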
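The Jukes-Cantor probabilities can be checked numerically with a short Python sketch. It assumes the rate symbol lost from the transcript is a single per-base substitution rate, written `alpha` here, so that the exponent is 4·alpha·t. The distance correction d = -(3/4) ln(1 - 4D/3) is the standard Jukes-Cantor formula and is not spelled out on the slide; the function names are illustrative.

```python
import math


def p_same(t, alpha):
    """Probability that a site holds its starting base after time t."""
    return 0.25 + 0.75 * math.exp(-4.0 * alpha * t)


def p_change(t, alpha):
    """Probability that a site holds one specific other base after time t."""
    return 0.25 - 0.25 * math.exp(-4.0 * alpha * t)


def jc_distance(frac_diff):
    """Expected substitutions per site, given the observed fraction of differing sites D."""
    if not 0.0 <= frac_diff < 0.75:
        raise ValueError("D must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * frac_diff / 3.0)


# Two sequences that each evolved for time t from a common ancestor are
# separated by 2t, so the expected fraction of differing sites is 3 * p_change(2t).
alpha, t = 0.01, 10.0  # hypothetical values for illustration
expected_diff = 3.0 * p_change(2.0 * t, alpha)
corrected = jc_distance(expected_diff)  # recovers 6 * alpha * t substitutions per site
```

The four probabilities sum to one (`p_same + 3 * p_change`), and the corrected distance grows without bound as D approaches 3/4, which is why saturated sequences give unreliable distances.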
![Page 18: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/18.jpg)
Hasegawa-Kishino-Yano model
Has a more general substitution rate
Transition: A ↔ G or C ↔ T
Transversion: purine ↔ pyrimidine, e.g. A ↔ T or C ↔ G
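The transition/transversion distinction can be sketched in a few lines of Python. The function names and the rule of skipping any column that is not A, C, G or T in both sequences are assumptions for illustration, and the sketch only counts the two kinds of change; it does not implement the Hasegawa-Kishino-Yano model itself.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(x, y):
    """Return 'identical', 'transition' or 'transversion' for two DNA bases."""
    x, y = x.upper(), y.upper()
    if x not in BASES or y not in BASES:
        raise ValueError("expected one of A, C, G, T")
    if x == y:
        return "identical"
    pair = {x, y}
    # Transition: both purines (A, G) or both pyrimidines (C, T).
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    # Everything else swaps a purine with a pyrimidine.
    return "transversion"


def count_ts_tv(seq_a, seq_b):
    """Count (transitions, transversions) between two aligned sequences."""
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue  # assumption: gaps and ambiguity codes are skipped
        kind = classify_substitution(a, b)
        if kind == "transition":
            transitions += 1
        elif kind == "transversion":
            transversions += 1
    return transitions, transversions
```

Counting the two classes separately is what lets a model assign them different rates instead of the single rate used by Jukes-Cantor.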
![Page 19: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/19.jpg)
![Page 20: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/20.jpg)
Part of Jukes-Cantor distance matrix for primate examples
(distances are much larger for the outgroup)
Matrix will be used for clustering methods
![Page 21: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/21.jpg)
Clustering
![Page 22: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/22.jpg)
UPGMA
![Page 23: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/23.jpg)
Neighbor-Joining Method
![Page 24: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/24.jpg)
N-J Method produces an Unrooted, Additive tree
![Page 25: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/25.jpg)
Neighbor-Joining Method: An Example
0. Distance Matrix
What is required for the Neighbour joining method? A distance matrix.

| PAM | Spinach | Rice | Mosquito | Monkey | Human |
|---|---|---|---|---|---|
| Spinach | 0.0 | 84.9 | 105.6 | 90.8 | 86.3 |
| Rice | 84.9 | 0.0 | 117.8 | 122.4 | 122.6 |
| Mosquito | 105.6 | 117.8 | 0.0 | 84.7 | 80.8 |
| Monkey | 90.8 | 122.4 | 84.7 | 0.0 | 3.3 |
| Human | 86.3 | 122.6 | 80.8 | 3.3 | 0.0 |
![Page 26: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/26.jpg)
1. First Step
PAM distance 3.3 (Human - Monkey) is the minimum, so we join Human and Monkey into a new node, MonHum, and then calculate the new distances.
[Figure: Monkey and Human joined at node Mon-Hum; Spinach, Mosquito and Rice not yet joined]
![Page 27: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/27.jpg)
2. Calculation of New Distances
After we have joined two species in a subtree, we have to compute the distances from every other node to the new subtree. We do this with a simple average of distances:
Dist[Spinach, MonHum] = (Dist[Spinach, Monkey] + Dist[Spinach, Human])/2 = (90.8 + 86.3)/2 = 88.55
[Figure: Monkey and Human joined at node Mon-Hum; distance from Spinach to Mon-Hum]
![Page 28: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/28.jpg)
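3. Next Cycle

| PAM | Spinach | Rice | Mosquito | MonHum |
|---|---|---|---|---|
| Spinach | 0.0 | 84.9 | 105.6 | 88.6 |
| Rice | 84.9 | 0.0 | 117.8 | 122.5 |
| Mosquito | 105.6 | 117.8 | 0.0 | 82.8 |
| MonHum | 88.6 | 122.5 | 82.8 | 0.0 |

The minimum is now 82.8 (Mosquito - MonHum), so Mosquito is joined to MonHum.
[Figure: Mosquito joined to Mon-Hum at node Mos-(Mon-Hum); Spinach and Rice not yet joined]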
![Page 29: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/29.jpg)
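4. Penultimate Cycle

| PAM | Spinach | Rice | MosMonHum |
|---|---|---|---|
| Spinach | 0.0 | 84.9 | 97.1 |
| Rice | 84.9 | 0.0 | 120.2 |
| MosMonHum | 97.1 | 120.2 | 0.0 |

The minimum is now 84.9 (Spinach - Rice), so Spinach and Rice are joined.
[Figure: nodes Mon-Hum, Mos-(Mon-Hum) and Spin-Rice]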
![Page 30: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/30.jpg)
5. Last Joining

| PAM | SpinRice | MosMonHum |
|---|---|---|
| SpinRice | 0.0 | 108.7 |
| MosMonHum | 108.7 | 0.0 |

[Figure: nodes Mon-Hum, Mos-(Mon-Hum), Spin-Rice and the final node (Spin-Rice)-(Mos-(Mon-Hum))]
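The joining cycles above can be reproduced with a short Python sketch that follows the slides literally: join the pair with the smallest distance, then set the distance from every other node to the new subtree to the simple average of its distances to the two joined nodes. The function and variable names are hypothetical. Note that this update rule is average-linkage clustering in the WPGMA style; the standard neighbor-joining criterion of Saitou and Nei instead selects the pair using a rate-corrected matrix, which this sketch does not implement.

```python
from itertools import combinations

NAMES = ["Spinach", "Rice", "Mosquito", "Monkey", "Human"]
PAM_ROWS = [
    [0.0, 84.9, 105.6, 90.8, 86.3],
    [84.9, 0.0, 117.8, 122.4, 122.6],
    [105.6, 117.8, 0.0, 84.7, 80.8],
    [90.8, 122.4, 84.7, 0.0, 3.3],
    [86.3, 122.6, 80.8, 3.3, 0.0],
]


def to_pair_dict(names, rows):
    """Store the symmetric matrix as {frozenset({a, b}): distance}."""
    return {
        frozenset((names[i], names[j])): rows[i][j]
        for i, j in combinations(range(len(names)), 2)
    }


def cluster_by_simple_average(names, pair_dist):
    """Repeatedly join the closest pair, averaging distances to the new node."""
    clusters = list(names)
    d = dict(pair_dist)
    merges = []  # (left, right, distance at which they were joined)
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = f"({a},{b})"
        merges.append((a, b, d[frozenset((a, b))]))
        others = [c for c in clusters if c != a and c != b]
        for c in others:
            # Simple average, as in Dist[Spinach, MonHum] on the slide.
            d[frozenset((merged, c))] = (
                d[frozenset((a, c))] + d[frozenset((b, c))]
            ) / 2.0
        clusters = others + [merged]
    return clusters[0], merges


tree, merges = cluster_by_simple_average(NAMES, to_pair_dict(NAMES, PAM_ROWS))
for left, right, dist in merges:
    print(f"join {left} + {right} at {dist:.2f}")
print(tree)
```

The join order matches the slides: Monkey with Human, then Mosquito, then Spinach with Rice, then the two subtrees. The final distance comes out near 108.6 rather than 108.7 because the sketch keeps unrounded intermediate values, whereas the slides round each matrix to one decimal.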
![Page 31: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/31.jpg)
The result: Unrooted Neighbor-Joining Tree
[Figure: unrooted tree with leaves Human, Monkey, Mosquito, Rice and Spinach]
![Page 32: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/32.jpg)
![Page 33: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/33.jpg)
Bootstrapping
![Page 34: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/34.jpg)
Why are trees not exact?
![Page 35: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/35.jpg)
Pairwise distances usually not tree-like
![Page 36: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/36.jpg)
![Page 37: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/37.jpg)
Searching tree space
![Page 38: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/38.jpg)
Maximum likelihood criterion
![Page 39: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/39.jpg)
Parsimony criterion
![Page 40: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/40.jpg)
Parsimony with molecular data
![Page 41: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/41.jpg)
Parsimony criterion
Paul Higgs:
![Page 42: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/42.jpg)
![Page 43: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/43.jpg)
Is the best tree much better than others?
L: likelihood at nodes
![Page 44: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/44.jpg)
Use Maximum Likelihood to rank alternate trees
[Figure annotations: same topology; NJ tree is 2nd best]
![Page 45: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/45.jpg)
Use Parsimony to rank alternate trees
[Figure annotations: different topology; parsimony differentiates weakly]
![Page 46: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/46.jpg)
Quartet puzzling
![Page 47: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/47.jpg)
![Page 48: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/48.jpg)
MCMC: Markov chain Monte Carlo
![Page 49: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/49.jpg)
Topology probabilities according to MCMC
![Page 50: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/50.jpg)
Clade probability compared from tree methods
NJ method is very fast and close to being the best
![Page 51: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/51.jpg)
Lecture and Book
• Lecture by Paul Higgs
  • online.itp.ucsb.edu/online/infobio01/higgs/
  • see online.itp.ucsb.edu/online/infobio01/ for many lectures
• Book by Wen-Hsiong Li 李文雄
  • “Molecular Evolution” (Sinauer Associates, 1997)
![Page 52: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/52.jpg)
Some web sites on Molecular Evolution
• CMS Molecular Biology Resource: www.unl.edu/stc-95/ResTools/cmshp.html
• Phylogeny - Molecular Evolution: www.unl.edu/stc-95/ResTools/biotools/biotools2.html
• The Tree of Life Web Project: tolweb.org/tree/phylogeny.html
• Web Resources in Molecular Evolution and Systematics: darwin.eeb.uconn.edu/molecular-evolution.html
![Page 53: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/53.jpg)
Some web sites on ClustalW
• On-line service
  • www.ebi.ac.uk/clustalw/
  • clustalw.genome.ad.jp/
• Software
  • ftp-igbmc.u-strasbg.fr/pub/ClustalX/
  • ftp-igbmc.u-strasbg.fr/pub/ClustalW/
![Page 54: Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.](https://reader035.fdocuments.net/reader035/viewer/2022062322/56649d025503460f949d5a92/html5/thumbnails/54.jpg)