A (short) introduction to...

96
Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more... A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PRStatistics, Millport Field Station 16 Aug 2016 1/39

Transcript of A (short) introduction to...

Page 1: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

A (short) introduction to phylogenetics

Thibaut Jombart, Marie-Pauline Beugin

MRC Centre for Outbreak Analysis and ModellingImperial College London

Genetic data analysis withPR∼Statistics, Millport Field Station

16 Aug 2016

1/39

Page 2: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

2/39

Page 3: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

3/39

Page 4: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Phylogenetics: from the origins...

C. Darwin, Notebook, 1837.

’From the first growth of the tree, manya limb and branch has decayed anddropped off; and these fallen branches ofvarious sizes may represent those wholeorders, families, and genera which havenow no living representatives, and whichare known to us only in a fossil state.’

4/39

Page 5: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Phylogenetics: ...to the present

Bininda-Emonds et al., 2007,Nature.

• phylogenetic trees are part of thestandard toolbox of genetic dataanalysis

• represent the evolutionary historyof a set of (sampled) taxa

5/39

Page 6: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

And the main difference is...

Current trees look better!

(and some other minor differences)

6/39

Page 7: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

And the main difference is...

Current trees look better!

(and some other minor differences)

6/39

Page 8: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

And the main difference is...

Current trees look better!

(and some other minor differences)

6/39

Page 9: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

About the minor differences...

• DNA sequencing revolution

• huge data banks freely available(e.g. GenBank)

• easier, cheaper, faster to obtainDNA sequences

• increasing number of full genomesavailable

Different ways to exploit this information.

7/39

Page 10: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

About the minor differences...

• DNA sequencing revolution

• huge data banks freely available(e.g. GenBank)

• easier, cheaper, faster to obtainDNA sequences

• increasing number of full genomesavailable

Different ways to exploit this information.

7/39

Page 11: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

8/39

Page 12: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Genetic changes (substitution) accumulate over time

Substitution: replacement of a nucleotide (e.g. a → t)

...attgtacgtaggatctagg...

...attgaacgtaggatctagg......attgtacgtaccatctagg...

...atcgtacgtaccatctggg...

...attgaacgtagttgctagg...

Tim

e

Substitution patterns reflect the evolutionary history

9/39

Page 13: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Genetic changes (substitution) accumulate over time

Substitution: replacement of a nucleotide (e.g. a → t)

...attgaacgtagttgctagg...

...atcgtacgtaccatctggg...

...attgaacgtaggatctagg......attgtacgtaccatctagg...

Tim

e...attgtacgtaggatctagg...

Substitution patterns reflect the evolutionary history

9/39

Page 14: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Genetic changes (substitution) accumulate over time

Substitution: replacement of a nucleotide (e.g. a → t)

...attgaacgtagttgctagg...

...atcgtacgtaccatctggg...Tim

e...attgtacgtaggatctagg...

...attgaacgtaggatctagg......attgtacgtaccatctagg...

Substitution patterns reflect the evolutionary history

9/39

Page 15: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Genetic changes (substitution) accumulate over time

Substitution: replacement of a nucleotide (e.g. a → t)

Tim

e...attgtacgtaggatctagg...

...attgaacgtaggatctagg......attgtacgtaccatctagg...

...atcgtacgtaccatctggg...

...attgaacgtagttgctagg...

Substitution patterns reflect the evolutionary history

9/39

Page 16: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Genetic changes (substitution) accumulate over time

Substitution: replacement of a nucleotide (e.g. a → t)

Tim

e...attgtacgtaggatctagg...

...attgaacgtaggatctagg......attgtacgtaccatctagg...

...atcgtacgtaccatctggg...

...attgaacgtagttgctagg...

Substitution patterns reflect the evolutionary history

9/39

Page 17: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Genetic changes (substitution) accumulate over time

Substitution: replacement of a nucleotide (e.g. a → t)

Tim

e...attgtacgtaggatctagg...

...attgaacgtaggatctagg......attgtacgtaccatctagg...

...atcgtacgtaccatctggg...

...attgaacgtagttgctagg...

Substitution patterns reflect the evolutionary history

9/39

Page 18: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using substitution patterns to reconstruct the evolutionaryhistory

Reconstructed phylogeny

...attaaacgtaggatctagg...

...attaaacgtaggatctagg...

...attcatacgtaggatcagg...

...attgtacgtaggatctttt...

...attgtacgtaggatctttt...

...attgcatgtaggatctttt...

Phylogenetics aim to reconstruct evolutionary trees (phylogenies)from genetic sequence data.

10/39

Page 19: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using substitution patterns to reconstruct the evolutionaryhistory

...attaaacgtaggatctagg...

...attaaacgtaggatctagg...

...attcatacgtaggatcagg...

...attgtacgtaggatctttt...

...attgtacgtaggatctttt...

...attgcatgtaggatctttt... Reconstructed phylogeny

Phylogenetics aim to reconstruct evolutionary trees (phylogenies)from genetic sequence data.

10/39

Page 20: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using substitution patterns to reconstruct the evolutionaryhistory

...attaaacgtaggatctagg...

...attaaacgtaggatctagg...

...attcatacgtaggatcagg...

...attgtacgtaggatctttt...

...attgtacgtaggatctttt...

...attgcatgtaggatctttt... Reconstructed phylogeny

Phylogenetics aim to reconstruct evolutionary trees (phylogenies)from genetic sequence data.

10/39

Page 21: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using trees to represent the evolutionary history

How to we build them?

11/39

Page 22: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using trees to represent the evolutionary history

- "taxa"- "tips"- "leaves"- "Operational Taxonomic Units (OTUs)"

How to we build them?

11/39

Page 23: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using trees to represent the evolutionary historyMost Recent Common Ancestors

(MRCA)

- "nodes"- "Hypothetical Taxonomic Units (HTUs)"

How to we build them?

11/39

Page 24: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using trees to represent the evolutionary history

- "edge"- length = amount of evolution (not time, as a rule)- length is optional

branch

How to we build them?

11/39

Page 25: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using trees to represent the evolutionary history

- "patristic" distance: sum of branch lengths- other measures of distance/dissimilarity- vertical axis meaningless

distances between tips

How to we build them?

11/39

Page 26: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using trees to represent the evolutionary history

Root

- oldest part of the tree- defined using an outgroup- optional

How to we build them?

11/39

Page 27: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Using trees to represent the evolutionary history

Root

- oldest part of the tree- defined using an outgroup- optional

How to we build them?

11/39

Page 28: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

WorkflowPrepare data

• align sequences: alignment software + manual refinement

Build the tree

• distance-based methods

• maximum parsimony

• likelihood-based methods (ML, Bayesian)

Analyse the tree

• assess uncertainty

• test phylogenetic signal

• model trait evolution

• ... 12/39

Page 29: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

WorkflowPrepare data

• align sequences: alignment software + manual refinement

Build the tree

• distance-based methods

• maximum parsimony

• likelihood-based methods (ML, Bayesian)

Analyse the tree

• assess uncertainty

• test phylogenetic signal

• model trait evolution

• ... 12/39

Page 30: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

WorkflowPrepare data

• align sequences: alignment software + manual refinement

Build the tree

• distance-based methods

• maximum parsimony

• likelihood-based methods (ML, Bayesian)

Analyse the tree

• assess uncertainty

• test phylogenetic signal

• model trait evolution

• ... 12/39

Page 31: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

WorkflowPrepare data

• align sequences: alignment software + manual refinement

Build the tree

• distance-based methods

• maximum parsimony

• likelihood-based methods (ML, Bayesian)

Analyse the tree

• assess uncertainty

• test phylogenetic signal

• model trait evolution

• ... 12/39

Page 32: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

13/39

Page 33: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

Approaches relying on agglomerative clustering algorithms(e.g. Single linkage, UPGMA, Neighbor-Joining)

Rationale

1. compute pairwise genetic distances D

2. group closest sequences

3. update D

4. go back to 2) until all sequences are grouped

14/39

Page 34: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

15/39

Page 35: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

15/39

Page 36: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

15/39

Page 37: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

15/39

Page 38: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

15/39

Page 39: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

15/39

Page 40: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

What is the distance between a node and tips?

Hierarchical clustering:

k

i jDi,jg

D =...k,g

• single linkage: Dk,g = min(Dk,i, Dk,j)

• complete linkage: Dk,g = max(Dk,i, Dk,j)

• UPGMA: Dk,g =Dk,i+Dk,j

2

Neighbor joining:Transforms original distances to account for heterogeneous rates ofevolution.

16/39

Page 41: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

What is the distance between a node and tips?

Hierarchical clustering:

k

i jDi,jg

D =...k,g

• single linkage: Dk,g = min(Dk,i, Dk,j)

• complete linkage: Dk,g = max(Dk,i, Dk,j)

• UPGMA: Dk,g =Dk,i+Dk,j

2

Neighbor joining:Transforms original distances to account for heterogeneous rates ofevolution.

16/39

Page 42: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

What is the distance between a node and tips?

Hierarchical clustering:

k

i jDi,jg

D =...k,g

• single linkage: Dk,g = min(Dk,i, Dk,j)

• complete linkage: Dk,g = max(Dk,i, Dk,j)

• UPGMA: Dk,g =Dk,i+Dk,j

2

Neighbor joining:Transforms original distances to account for heterogeneous rates ofevolution.

16/39

Page 43: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

What is the distance between a node and tips?

Hierarchical clustering:

k

i jDi,jg

D =...k,g

• single linkage: Dk,g = min(Dk,i, Dk,j)

• complete linkage: Dk,g = max(Dk,i, Dk,j)

• UPGMA: Dk,g =Dk,i+Dk,j

2

Neighbor joining:Transforms original distances to account for heterogeneous rates ofevolution.

16/39

Page 44: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

What is the distance between a node and tips?

Hierarchical clustering:

k

i jDi,jg

D =...k,g

• single linkage: Dk,g = min(Dk,i, Dk,j)

• complete linkage: Dk,g = max(Dk,i, Dk,j)

• UPGMA: Dk,g =Dk,i+Dk,j

2

Neighbor joining:Transforms original distances to account for heterogeneous rates ofevolution.

16/39

Page 45: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

Advantages

• simple

• flexible (many distances and clustering algorithms)

• fast and scalable (applicable to large datasets)

Limitations

• sensitive to distance/clustering chosen

• evolutionary rates are not estimated

• no measure of uncertainty for the tree obtained

17/39

Page 46: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Distance-based phylogenetic reconstruction

Advantages

• simple

• flexible (many distances and clustering algorithms)

• fast and scalable (applicable to large datasets)

Limitations

• sensitive to distance/clustering chosen

• evolutionary rates are not estimated

• no measure of uncertainty for the tree obtained

17/39

Page 47: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

18/39

Page 48: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Maximum parsimony phylogenies

Approaches relying on finding the tree with the smallest numberof character changes (substitutions)

Rationale

1. start from a pre-defined tree

2. compute initial parsimony score

3. permute branches and compute parsimony score

4. accept new tree if the parsimony score is improved

5. go back to 3) until convergence

19/39

Page 49: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Maximum parsimony phylogenies

score: 8

score: 5

score: 6

Initial tree

20/39

Page 50: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Maximum parsimony phylogenies

score: 5

score: 6

Initial tree

score: 8

20/39

Page 51: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Maximum parsimony phylogenies

score: 5

score: 6

Initial tree

score: 8

20/39

Page 52: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Maximum parsimony phylogenies

score: 6

Initial tree

score: 5

score: 8

20/39

Page 53: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Maximum parsimony phylogenies

score: 6

Initial tree

score: 5

score: 8

20/39

Page 54: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Maximum parsimony phylogenies

Advantages

• applicable to any discontinuous characters (not just DNA)

• intuitive explanation: ‘simplest’ evolutionary scenario

Limitations

• evolutionary rates are not estimated

• no measure of uncertainty for the tree obtained

• computer-intensive

• different types of substitutions ignored

• evolution not necessarily parsimonious

• sensitive to heterogeneous rates of evolution (long branchattraction)

21/39

Page 55: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Maximum parsimony phylogenies

Advantages

• applicable to any discontinuous characters (not just DNA)

• intuitive explanation: ‘simplest’ evolutionary scenario

Limitations

• evolutionary rates are not estimated

• no measure of uncertainty for the tree obtained

• computer-intensive

• different types of substitutions ignored

• evolution not necessarily parsimonious

• sensitive to heterogeneous rates of evolution (long branchattraction)

21/39

Page 56: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

22/39

Page 57: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Likelihood-based phylogenies (ML / Bayesian)

Approaches relying on a model of sequence evolution:

• ML: find tree and evolutionary rates with highest likelihood

• Bayesian: find tree and evolutionary rates to posteriorprobability

Rationale

1. start from a pre-defined tree

2. compute initial likelihood/posterior

3. permute branches, sample new parameters and computelikelihood/posterior

4. accept new tree and parameters based on likelihood/posteriorimprovement

5. go back to 3) until convergence

23/39

Page 58: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Likelihood-based phylogenies (ML / Bayesian)

Approaches relying on a model of sequence evolution:

• ML: find tree and evolutionary rates with highest likelihood

• Bayesian: find tree and evolutionary rates to posteriorprobability

Rationale

1. start from a pre-defined tree

2. compute initial likelihood/posterior

3. permute branches, sample new parameters and computelikelihood/posterior

4. accept new tree and parameters based on likelihood/posteriorimprovement

5. go back to 3) until convergence

23/39

Page 59: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Likelihood-based phylogenies (ML / Bayesian)

Den

sity

Likelihood / Posterior

24/39

Page 60: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Likelihood-based phylogenies (ML / Bayesian)

Advantages

• very flexible

• consistent with a model of evolution

• statistically consistent (model comparison)

• parameter estimation

• (Bayesian) several trees → measure of uncertainty

Limitations

• computer-intensive

• choice of the model of evolution

• (ML) no measure of uncertainty for the tree obtained

• (Bayesian) need to find a consensus tree

25/39

Page 61: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Likelihood-based phylogenies (ML / Bayesian)

Advantages

• very flexible

• consistent with a model of evolution

• statistically consistent (model comparison)

• parameter estimation

• (Bayesian) several trees → measure of uncertainty

Limitations

• computer-intensive

• choice of the model of evolution

• (ML) no measure of uncertainty for the tree obtained

• (Bayesian) need to find a consensus tree

25/39

Page 62: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

26/39

Page 63: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

How do we know the tree is robust?

Main issue: assess the uncertainty of the tree topology /individual nodes

Approaches

• ML: model selection to comparetrees (whole tree)

• Bayesian methods:between-samples variability(individual nodes)

• any method:bootstrap (individual nodes)

27/39

Page 64: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

How do we know the tree is robust?

Main issue: assess the uncertainty of the tree topology /individual nodes

Approaches

• ML: model selection to comparetrees (whole tree)

• Bayesian methods:between-samples variability(individual nodes)

• any method:bootstrap (individual nodes)

27/39

Page 65: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

How do we know the tree is robust?

Main issue: assess the uncertainty of the tree topology /individual nodes

Approaches

• ML: model selection to comparetrees (whole tree)

• Bayesian methods:between-samples variability(individual nodes)

• any method:bootstrap (individual nodes)

27/39

Page 66: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

How do we know the tree is robust?

Main issue: assess the uncertainty of the tree topology /individual nodes

Approaches

• ML: model selection to comparetrees (whole tree)

• Bayesian methods:between-samples variability(individual nodes)

• any method:bootstrap (individual nodes)

27/39

Page 67: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

How do we know the tree is robust?

Main issue: assess the uncertainty of the tree topology /individual nodes

Approaches

• ML: model selection to comparetrees (whole tree)

• Bayesian methods:between-samples variability(individual nodes)

• any method:bootstrap (individual nodes)

27/39

Page 68: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Bootstrapping phylogenies

• assess variability due to sampling the genome andconflicting signals

• relies on analysing resampled datasets

Rationale

1. obtain a reference tree

2. resample the sites with replacement

3. obtain a tree for the resampled dataset

4. go back to 2) until the desired number of bootstrapped treesis attained

5. compute the frequency of each bifurcation of the referencetree occuring in bootstrapped trees

28/39

Page 69: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Bootstrapping phylogenies

• assess variability due to sampling the genome andconflicting signals

• relies on analysing resampled datasets

Rationale

1. obtain a reference tree

2. resample the sites with replacement

3. obtain a tree for the resampled dataset

4. go back to 2) until the desired number of bootstrapped treesis attained

5. compute the frequency of each bifurcation of the referencetree occuring in bootstrapped trees

28/39

Page 70: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Bootstrapping phylogenies

...ttttaaaccatggatctagg...

...ttttaaaccatggatctagg...

...atttcatacgtaggatcagg...

...aaatgtaccatggatcttgt...

...aaatgtaccatggatcttgt...

...aaatgcatcatggatcttgt...

...attaaacgtaggatctagg...

...attaaacgtaggatctagg...

...attcatacgtaggatcagg...

...attgtacgtaggatctttt...

...attgtacgtaggatctttt...

...attgcatgtaggatctttt... Reconstructed phylogeny

...ttttaaaccatggatctagg...

...ttttaaaccatggatctagg...

...atttcatacgtaggatcagg...

...aaatgtaccatggatcttgt...

...aaatgtaccatggatcttgt...

...aaatgcatcatggatcttgt...Reconstructed phylogeny

Reconstructed phylogenyReconstructed phylogeny

Reconstructed phylogenyReconstructed phylogeny

sampling sites withreplacement

compare topologies

...ttttaaaccatggatctagg...

...ttttaaaccatggatctagg...

...atttcatacgtaggatcagg...

...aaatgtaccatggatcttgt...

...aaatgtaccatggatcttgt...

...aaatgcatcatggatcttgt...

...ttttaaaccatggatctagg...

...ttttaaaccatggatctagg...

...atttcatacgtaggatcagg...

...aaatgtaccatggatcttgt...

...aaatgtaccatggatcttgt...

...aaatgcatcatggatcttgt...

...ttttaaaccatggatctagg...

...ttttaaaccatggatctagg...

...atttcatacgtaggatcagg...

...aaatgtaccatggatcttgt...

...aaatgtaccatggatcttgt...

...aaatgcatcatggatcttgt...

29/39

Page 71: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Bootstrapping phylogenies

Advantages

• standard

• simple to implement

Limitations

• possibly computer-intensive

• assumes that the genome has been sampled randomly (oftenwrong)

30/39

Page 72: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Bootstrapping phylogenies

Advantages

• standard

• simple to implement

Limitations

• possibly computer-intensive

• assumes that the genome has been sampled randomly (oftenwrong)

30/39

Page 73: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

31/39

Page 74: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Plotting trees as rooted

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1

23

4

5

6

7

8

9

10

11

12

13

14

15

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1

23

4

5

6

7

89

10

11

12

13

14

15

1

234

5

6

7

8

9

10

11

12

13

14

15

Never plot an unrooted tree as rooted.

32/39

Page 75: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Plotting trees as rooted

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1

23

4

5

6

7

8

9

10

11

12

13

14

15

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1

23

4

5

6

7

89

10

11

12

13

14

15

1

234

5

6

7

8

9

10

11

12

13

14

15

Never plot an unrooted tree as rooted.

32/39

Page 76: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Interpreting distances

33/39

Page 77: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Interpreting distances

33/39

Page 78: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Interpreting distances

meaningful distance = sum of branch lengths

meanin

gle

ss

33/39

Page 79: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

The paradox of divergent clusters

MRCA and genetic distances may give different information.

34/39

Page 80: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

The paradox of divergent clusters

D(red, red)>D(red,blue)

MRCA and genetic distances may give different information.

34/39

Page 81: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

The paradox of divergent clusters

D(red, red)>D(red,blue)

MRCA and genetic distances may give different information.

34/39

Page 82: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Taking uncertainty into account

193 HIV−1 sequences from DRC(Strimmer & Pybus 2001)

0.2 0.15 0.1 0.05 0

At best, the tree is an estimate of the likely evolutionary history ofthe taxa studied.

35/39

Page 83: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Taking uncertainty into account

193 HIV−1 sequences from DRC(Strimmer & Pybus 2001)

0.2 0.15 0.1 0.05 0

Collapsed tree (threshold length 0.01)

0.2 0.15 0.1 0.05 0

At best, the tree is an estimate of the likely evolutionary history ofthe taxa studied.

35/39

Page 84: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Taking uncertainty into account

193 HIV−1 sequences from DRC(Strimmer & Pybus 2001)

0.2 0.15 0.1 0.05 0

Collapsed tree (threshold length 0.01)

0.2 0.15 0.1 0.05 0

At best, the tree is an estimate of the likely evolutionary history ofthe taxa studied.

35/39

Page 85: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

(Over, Mis)Interpreting temporal trends

t30t46t33t64t25

t99 t73t8t81t31

t76t91t100

t42t77

t74t98 t65t79t36t58

t32t26t59t16t54t10

t18t11t94t5t45

t60t53t67t88

t27t52t38t40t22

t90t80t37t12t93t78

t49t82

t50 t57t71t21t7t1t84

t70

t20t97t41t43t19t47t28t17t66t55 t34t61t35

t3t2t89t75t86

t15t13t23

t56

t96t24t44

t39t63t69

t83t72t29t14t62t6

t48t4

t87t92t9t95 t68t51t85

12 10 8 6 4 2 0

“Time trees” only make sense under a near-perfect molecular clock.

36/39

Page 86: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

(Over, Mis)Interpreting temporal trends

t30t46t33t64t25t99 t73t8t81

t31

t76 t91t100t42t77

t74t98 t65t79t36t58

t32t26t59t16t54t10

t18t11t94t5t45t60 t53t67t88 t27t52t38t40t22

t90t80t37t12t93t78

t49t82

t50 t57t71t21t7t1t84

t70

t20t97t41t43t19t47t28t17 t66t55 t34t61t35t3

t2t89t75t86t15t13t23

t56

t96 t24t44t39 t63t69t83t72t29t14

t62t6t48t4

t87t92t9t95 t68t51t85

12 10 8 6 4 2 0 0 20 40 60 80

02

46

810

12

Time to the root

Mut

atio

ns to

the

root

> anova(lm(d.root~-1+t.root))

anova(lm(d.root~-1+t.root))Analysis of Variance Table

Response: d.root Df Sum Sq Mean Sq F value Pr(>F) t.root 1 5434.7 5434.7 214.66 < 2.2e-16 ***Residuals 99 2506.4 25.3

“Time trees” only make sense under a near-perfect molecular clock.

36/39

Page 87: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

(Over, Mis)Interpreting temporal trends

t30t46t33t64t25t99 t73t8t81

t31

t76 t91t100t42t77

t74t98 t65t79t36t58

t32t26t59t16t54t10

t18t11t94t5t45t60 t53t67t88 t27t52t38t40t22

t90t80t37t12t93t78

t49t82

t50 t57t71t21t7t1t84

t70

t20t97t41t43t19t47t28t17 t66t55 t34t61t35t3

t2t89t75t86t15t13t23

t56

t96 t24t44t39 t63t69t83t72t29t14

t62t6t48t4

t87t92t9t95 t68t51t85

12 10 8 6 4 2 0 0 20 40 60 80

02

46

810

12

Time to the root

Mut

atio

ns to

the

root

> anova(lm(d.root~-1+t.root))

anova(lm(d.root~-1+t.root))Analysis of Variance Table

Response: d.root Df Sum Sq Mean Sq F value Pr(>F) t.root 1 5434.7 5434.7 214.66 < 2.2e-16 ***Residuals 99 2506.4 25.3

“Time trees” only make sense under a near-perfect molecular clock.

36/39

Page 88: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

OutlineContext

Phylogenies...

Distance trees

Parsimony

Likelihood/Bayesian

Uncertainty

Pitfalls & best practices

And more...

37/39

Page 89: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

This is only the beginning

Many things can be done with trees

• estimate divergence time

• model trait evolution (phylogeneticcomparative method)

• reconstruct ancestral states

• measure diversity

• infer past demographics/effectivepopulation size (coalescence)

• ...

• and also, other approaches thanphylogenetics to analyse geneticdata

38/39

Page 90: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

This is only the beginning

Many things can be done with trees

• estimate divergence time

• model trait evolution (phylogeneticcomparative method)

• reconstruct ancestral states

• measure diversity

• infer past demographics/effectivepopulation size (coalescence)

• ...

• and also, other approaches thanphylogenetics to analyse geneticdata

38/39

Page 91: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

This is only the beginning

Many things can be done with trees

• estimate divergence time

• model trait evolution (phylogeneticcomparative method)

• reconstruct ancestral states

• measure diversity

• infer past demographics/effectivepopulation size (coalescence)

• ...

• and also, other approaches thanphylogenetics to analyse geneticdata

38/39

Page 92: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

This is only the beginning

Many things can be done with trees

• estimate divergence time

• model trait evolution (phylogeneticcomparative method)

• reconstruct ancestral states

• measure diversity

• infer past demographics/effectivepopulation size (coalescence)

• ...

• and also, other approaches thanphylogenetics to analyse geneticdata

38/39

Page 93: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

This is only the beginning

Many things can be done with trees

• estimate divergence time

• model trait evolution (phylogeneticcomparative method)

• reconstruct ancestral states

• measure diversity

• infer past demographics/effectivepopulation size (coalescence)

• ...

• and also, other approaches thanphylogenetics to analyse geneticdata

38/39

Page 94: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

This is only the beginning

Many things can be done with trees

• estimate divergence time

• model trait evolution (phylogeneticcomparative method)

• reconstruct ancestral states

• measure diversity

• infer past demographics/effectivepopulation size (coalescence)

• ...

• and also, other approaches thanphylogenetics to analyse geneticdata

38/39

Page 95: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

This is only the beginning

Many things can be done with trees

• estimate divergence time

• model trait evolution (phylogeneticcomparative method)

• reconstruct ancestral states

• measure diversity

• infer past demographics/effectivepopulation size (coalescence)

• ...

• and also, other approaches thanphylogenetics to analyse geneticdata

38/39

Page 96: A (short) introduction to phylogeneticsadegenet.r-forge.r-project.org/files/PRstats/lecture-phylo.1.0.pdfPhylogenetics: ...to the present Bininda-Emonds et al., 2007, Nature. phylogenetic

Context Phylogenies... Distance trees Parsimony Likelihood/Bayesian Uncertainty Pitfalls & best practices And more...

Time to get your hands dirty!

The pdf of the practical is online:

http://adegenet.r-forge.r-project.org/

or

Google → adegenet → documents → “GDAR August 2016”

39/39