An Introduction to Multiple Sequence Alignments
Copyright Cédric Notredame (2000-2003) All rights reserved
An Introduction to Multiple Sequence Alignments
Cédric Notredame
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
***. ::: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
* : .* . :
Mangel M., Samaniego F.J., Abraham Wald's Work on Aircraft Survivability, J. American Statistical Association, 79, 259-270 (1984)
Our Scope
How Can I Use My Alignment?
How Does The Computer Align The Sequences?
How Can I Assemble a Mult. Aln?
What are the Difficulties?
Outline
-Why Do We Need Multiple Sequence Alignment ?
-The progressive Alignment Algorithm
-A possible Strategy…
-Potential Difficulties
Pre-requisite
-How Do Sequences Evolve?
-How can We COMPARE Sequences ?
-How can We ALIGN Sequences ?
Why Do We Need Multiple Sequence Alignment?
Sometimes Two Sequences Are Not Enough…
The man with TWO watches NEVER knows the time
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
***. ::: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
* : .* . :

What is A Multiple Sequence Alignment?
Structural Criteria: Residues are arranged so that those playing a similar role end up in the same column.
Evolution Criteria: Residues are arranged so that those having the same ancestor end up in the same column.
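The column view can be made concrete with a small sketch. It takes the first 12 columns of the alignment above and lists the columns in which every sequence carries the same residue (the positions marked `*` in the consensus line). The function name `conserved_columns` is hypothetical, chosen only for this illustration.

```python
# First 12 columns of the four-sequence alignment shown above.
ROWS = [
    "---ADKPKRPLS",  # chite
    "--DPNKPKRAPS",  # wheat
    "KKDSNAPKRAMT",  # trybr
    "-----KPKRPRS",  # mouse
]


def conserved_columns(rows):
    """Return the 0-based indices of gap-free columns holding a single residue."""
    return [
        index
        for index, column in enumerate(zip(*rows))
        if "-" not in column and len(set(column)) == 1
    ]


print(conserved_columns(ROWS))
```

On this fragment the conserved columns are the P-K-R triplet shared by all four sequences.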
Phylogenetic Relation
Functional Relation
chite   ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat   --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr   KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
unknown -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
***. ::: .: .. . : . . * . *: *

chite   AATAKQNYIRALQEYERNGG-
wheat   ANKLKGEYNKAIAAYNKGESA
trybr   AEKDKERYKREM---------
unknown AKDDRIRYDNEMKSWEEQMAE
* : .* . :
How Can I Use A Multiple Sequence Alignment?
Extrapolation Beyond The Twilight Zone
SwissProt
Unknown Sequence
Homology?
Less Than 30% id BUT
Conserved where it MATTERS
How Can I Use A Multiple Sequence Alignment?
Extrapolation
Prosite Patterns
How Can I Use A Multiple Sequence Alignment?
Extrapolation
Prosite Patterns: P-K-R-[PA]-x(1)-[ST]…
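A PROSITE-style pattern maps naturally onto a regular expression. The sketch below is a minimal translator covering only the elements that appear in the visible part of the pattern (single residues, `[..]` classes, `x` and `x(n)`), plus `{..}` exclusions and `x(n,m)` ranges; it is not the official PROSITE scanner. The function name `prosite_to_regex` is hypothetical, and only the portion of the pattern shown on the slide is used, since the rest is elided.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    pieces = []
    for element in pattern.split("-"):
        core, _, repeat = element.partition("(")
        if core == "x":
            piece = "."                      # any residue
        elif core.startswith("{"):
            piece = "[^" + core[1:-1] + "]"  # any residue except those listed
        else:
            piece = core                     # single residue or [ABC] class
        if repeat:
            piece += "{" + repeat.rstrip(")") + "}"
        pieces.append(piece)
    return "".join(pieces)


# Ungapped N-terminal fragments of the four aligned sequences.
FRAGMENTS = {
    "chite": "ADKPKRPLSAYMLW",
    "wheat": "DPNKPKRAPSAFFVF",
    "trybr": "KKDSNAPKRAMTSFMFF",
    "mouse": "KPKRPRSAYNIYV",
}

regex = prosite_to_regex("P-K-R-[PA]-x(1)-[ST]")
for name, fragment in FRAGMENTS.items():
    print(name, bool(re.search(regex, fragment)))
```

All four fragments match, which is the sense in which the pattern summarises the conserved columns of the alignment.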
How Can I Use A Multiple Sequence Alignment?
Extrapolation
Prosite Patterns
SwissProt
Uncharacterised Signature
Match?
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-IQGKLKLVNEAWKNLSP
***. ::: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
* : .* . :
How Can I Use A Multiple Sequence Alignment?
Extrapolation
Prosite Patterns
Profiles And HMMs
-More Sensitive
-More Specific
A PROSITE PROFILE
A Substitution Cost For Every Amino Acid, At Every Position
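The idea of one score per amino acid per position can be sketched as a simple log-odds table built from alignment columns. This is only an illustration: the pseudocount of 1 and the uniform background frequency are assumptions made here, and real PROSITE profiles are built with more elaborate weighting. The names `build_profile` and `score_segment` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(rows, pseudocount=1.0):
    """Return one {residue: log-odds score} table per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    profile = []
    for column in zip(*rows):
        counts = Counter(residue for residue in column if residue != "-")
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score_segment(profile, segment):
    """Sum the position-specific scores of an ungapped segment."""
    return sum(column[aa] for column, aa in zip(profile, segment))


# Six conserved columns taken from the chite/wheat/trybr/mouse alignment.
BLOCK = ["PKRPLS", "PKRAPS", "PKRAMT", "PKRPRS"]
profile = build_profile(BLOCK)
print(score_segment(profile, "PKRPLS"), score_segment(profile, "GGGGGG"))
```

A family member scores well above an unrelated segment, because every position rewards the residues actually observed in that column.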
How Can I Use A Multiple Sequence Alignment?
Extrapolation
Motifs/Patterns
Phylogeny
[Figure: phylogenetic tree of chite, wheat, trybr and mouse]
-Evolution
-Paralogy/Orthology
Profiles
How Can I Use A Multiple Sequence Alignment?
Extrapolation
Motifs/Patterns
Phylogeny
Profiles
Struc. Prediction
Column Constraint
Evolution Constraint
Structure Constraint
How Can I Use A Multiple Sequence Alignment?
Extrapolation
Motifs/Patterns
Phylogeny
Profiles
Struc. Prediction
PsiPred or PHD for Secondary Structure Prediction: 75% Accurate.
Threading: improving, but not yet as good.
How Can I Use A Multiple Sequence Alignment?
Automatic Multiple Sequence Alignment methods are not always perfect…
You know better… With your big BRAIN
Why Is It Difficult To Compute A multiple Sequence Alignment?
A CROSSROAD PROBLEM
BIOLOGY: What is A Good Alignment?
COMPUTATION: What is THE Good Alignment?
Why Is It Difficult To Compute A multiple Sequence Alignment ?
BIOLOGY
CIRCULAR PROBLEM....
Good Sequences
Good Alignment
COMPUTATION
The Biological Problem.
Same as PairWise Alignment Problem
We do NOT know how Sequences Evolve.
We do NOT understand the Relation Between Structures and Sequences.
We would NOT recognize the Correct Alignment if we had it IN FRONT of our eyes…
The Biological Problem.
The Charlie Chaplin Paradox
The Biological Problem.
How to Evaluate an Alignment
-Substitution Matrix (Blosum)
-Gap Penalties.
-A nice set of Sequences
-An Evaluation Function
Sums of Pairs: Cost=6 for the column A, A, A, C, C
-Over-estimation of the Substitutions
-Easy to compute
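The cost of 6 for the column A, A, A, C, C can be checked with a few lines. The sketch below counts one unit for every mismatching pair of residues in a column, which is an assumed unit-cost version of sums of pairs; a real evaluation would take the pair costs from a substitution matrix. The function name `sum_of_pairs_cost` is hypothetical.

```python
from itertools import combinations


def sum_of_pairs_cost(column):
    """Count the residue pairs in a column that differ (unit mismatch cost)."""
    return sum(1 for a, b in combinations(column, 2) if a != b)


print(sum_of_pairs_cost("AAACC"))
```

Three A's against two C's give 3 x 2 = 6 mismatching pairs, although a single substitution on a tree would be enough to explain the column. That gap is the over-estimation the slide refers to.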
The COMPUTATIONAL Problem.
Producing the Alignment
-Substitution Matrix (Blosum)
-An Evaluation Function
-Gap Penalties.
-A nice set of Sequences
-An Alignment Algorithm
GLOBAL Alignment
Will It Work?
HOW CAN I ALIGN MANY SEQUENCES
2 Globins => 1 Min
3 Globins => 2 hours
4 Globins => 10 days
5 Globins => 3 years
6 Globins => 300 years
7 Globins => 30,000 years
8 Globins => 3 Million years
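The explosion in running time follows from the size of the dynamic programming lattice. As a rough sketch, assuming exact alignment of N sequences of length L fills a lattice with (L+1)^N cells, each extra sequence multiplies the work by about L+1. The length of 150 residues used below is an assumption standing in for a typical globin, and the function name `lattice_cells` is hypothetical.

```python
def lattice_cells(length, n_sequences):
    """Number of cells in an exact N-dimensional alignment lattice."""
    return (length + 1) ** n_sequences


GLOBIN_LENGTH = 150  # assumed typical length, for illustration only

for n in range(2, 9):
    growth = lattice_cells(GLOBIN_LENGTH, n) // lattice_cells(GLOBIN_LENGTH, n - 1)
    print(n, "sequences:", lattice_cells(GLOBIN_LENGTH, n), "cells, x", growth)
```

The step-to-step factor of roughly a hundred matches the order of magnitude of the jumps in the timings listed above.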
The Progressive Multiple Alignment Algorithm (ClustalW)
Making An Alignment
Any Exact Method would be TOO SLOW
We will use a Heuristic Algorithm.
Progressive Alignment Algorithm is the most Popular
-Fast
-ClustalW
-Greedy Heuristic (No Guarantee).
Progressive Alignment
Feng and Doolittle, 1988; Taylor 1989
Clustering
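The clustering step decides the order in which sequences are merged. The sketch below is one possible illustration, not ClustalW's own procedure: it uses Python's `difflib.SequenceMatcher` as a stand-in for a pairwise alignment distance and a simple average-linkage clustering, both of which are assumptions made here. The names `distance` and `guide_tree_order` are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

SEQS = {
    "SeqA": "GARFIELD THE LAST FAT CAT",
    "SeqB": "GARFIELD THE FAST CAT",
    "SeqC": "GARFIELD THE VERY FAST CAT",
    "SeqD": "THE FAT CAT",
}


def distance(a, b):
    """Crude dissimilarity in [0, 1]; a stand-in for an alignment distance."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def guide_tree_order(seqs):
    """Return the list of cluster merges, closest clusters first."""
    def linkage(pair):
        left, right = pair
        dists = [distance(seqs[x], seqs[y]) for x in left for y in right]
        return sum(dists) / len(dists)  # average linkage

    clusters = [(name,) for name in seqs]
    merges = []
    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=linkage)
        clusters.remove(left)
        clusters.remove(right)
        clusters.append(left + right)
        merges.append((left, right))
    return merges


for left, right in guide_tree_order(SEQS):
    print(left, "+", right)
```

Each merge corresponds to one alignment step of the progressive procedure: two sequences, a sequence and a group, or two groups.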
Dynamic Programming Using A Substitution Matrix
Progressive Alignment
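Each merge in the progressive procedure is a pairwise dynamic programming alignment. A minimal sketch of the global (Needleman-Wunsch style) score computation is given below. It assumes an identity scoring scheme (match 1, mismatch 0) and a linear gap cost of -1 in place of a real substitution matrix and gap opening/extension penalties; the function name `nw_score` is hypothetical.

```python
def nw_score(a, b, match=1, mismatch=0, gap=-1):
    """Best global alignment score of a and b, keeping one DP row at a time."""
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current.append(max(
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            ))
        previous = current
    return previous[-1]


print(nw_score("FAT", "FAST"))
```

Aligning FAT with FAST needs one gap, so three matches and one gap give a score of 2 under these assumed parameters.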
Progressive Alignment
-Depends on the ORDER of the sequences (Tree).
-Depends on the CHOICE of the sequences.
-Depends on the PARAMETERS:
•Substitution Matrix.
•Penalties (Gop, Gep).
•Sequence Weight.
•Tree making Algorithm.
Progressive Alignment
When Does It Work
Works Well When Phylogeny is Dense
No outlier Sequence.
Image: River Crossing
Progressive Alignment
When Doesn’t It Work
CLUSTALW (Score=20, Gop=-1, Gep=0, M=1)
SeqA GARFIELD THE LAST FA-T CAT
SeqB GARFIELD THE FAST CA-T ---
SeqC GARFIELD THE VERY FAST CAT
SeqD -------- THE ---- FA-T CAT
CORRECT (Score=24)
SeqA GARFIELD THE LAST FA-T CAT
SeqB GARFIELD THE FAST ---- CAT
SeqC GARFIELD THE VERY FAST CAT
SeqD -------- THE ---- FA-T CAT
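The comparison between the two Garfield alignments can be reproduced qualitatively with a sum-of-pairs score. The sketch below assumes match = 1, mismatch = 0, a gap opening penalty of -1 and a gap extension penalty of 0, scored on every pairwise projection with blank columns removed. The slide does not state its exact scoring convention, so the absolute values need not equal 20 and 24; only the ranking is expected to agree. The names `pair_score` and `sp_score` are hypothetical.

```python
from itertools import combinations

CLUSTALW = [
    "GARFIELD THE LAST FA-T CAT",
    "GARFIELD THE FAST CA-T ---",
    "GARFIELD THE VERY FAST CAT",
    "-------- THE ---- FA-T CAT",
]
CORRECT = [
    "GARFIELD THE LAST FA-T CAT",
    "GARFIELD THE FAST ---- CAT",
    "GARFIELD THE VERY FAST CAT",
    "-------- THE ---- FA-T CAT",
]


def pair_score(a, b, match=1, gop=-1, gep=0):
    """Score one pairwise projection of a multiple alignment."""
    score, open_side = 0, None  # open_side: row that currently has an open gap
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue            # column is empty in this projection
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            score += gep if open_side == side else gop
            open_side = side
        else:
            score += match if x == y else 0
            open_side = None
    return score


def sp_score(rows):
    """Sum of pairs over all row pairs, ignoring the blanks between words."""
    rows = [row.replace(" ", "") for row in rows]
    return sum(pair_score(a, b) for a, b in combinations(rows, 2))


print(sp_score(CLUSTALW), sp_score(CORRECT))
```

Under these assumptions the hand-corrected alignment scores higher than the progressive one, which illustrates that the greedy procedure can miss the better solution.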
GARFIELD THE LAST FAT CAT
GARFIELD THE FAST CAT
GARFIELD THE VERY FAST CAT
THE FAT CAT

GARFIELD THE LAST FAT CAT
GARFIELD THE FAST CAT ---

GARFIELD THE VERY FAST CAT
-------- THE ---- FA-T CAT

GARFIELD THE LAST FA-T CAT
GARFIELD THE FAST CA-T ---
GARFIELD THE VERY FAST CAT
-------- THE ---- FA-T CAT
Building the Right Multiple Sequence Alignment.
Recognizing The Right Sequences When you Meet Them…
Gathering Sequences: BLAST
Common Mistake: Sequences Too Closely Related
PRVA_MACFU SMTDLLNAEDIKKAVGAFSAIDSFDHKKFFQMVGLKKKSADDVKKVFHILDKDKSGFIEE
PRVA_HUMAN SMTDLLNAEDIKKAVGAFSATDSFDHKKFFQMVGLKKKSADDVKKVFHMLDKDKSGFIEE
PRVA_GERSP SMTDLLSAEDIKKAIGAFAAADSFDHKKFFQMVGLKKKTPDDVKKVFHILDKDKSGFIEE
PRVA_MOUSE SMTDVLSAEDIKKAIGAFAAADSFDHKKFFQMVGLKKKNPDEVKKVFHILDKDKSGFIEE
PRVA_RAT   SMTDLLSAEDIKKAIGAFTAADSFDHKKFFQMVGLKKKSADDVKKVFHILDKDKSGFIEE
PRVA_RABIT AMTELLNAEDIKKAIGAFAAAESFDHKKFFQMVGLKKKSTEDVKKVFHILDKDKSGFIEE
:**::*.*******:***:* :****************..::******:***********

PRVA_MACFU DELGFILKGFSPDARDLSAKETKTLMAAGDKDGDGKIGVDEFSTLVAES
PRVA_HUMAN DELGFILKGFSPDARDLSAKETKMLMAAGDKDGDGKIGVDEFSTLVAES
PRVA_GERSP DELGFILKGFSSDARDLSAKETKTLLAAGDKDGDGKIGVEEFSTLVSES
PRVA_MOUSE DELGSILKGFSSDARDLSAKETKTLLAAGDKDGDGKIGVEEFSTLVAES
PRVA_RAT   DELGSILKGFSSDARDLSAKETKTLMAAGDKDGDGKIGVEEFSTLVAES
PRVA_RABIT EELGFILKGFSPDARDLSVKETKTLMAAGDKDGDGKIGADEFSTLVSES
:*** ******.******.**** *:************.:******:**
-IDENTICAL SEQUENCES BRING NO INFORMATION FOR THE MULTIPLE SEQUENCE ALIGNMENT
-MULTIPLE SEQUENCE ALIGNMENTS THRIVE ON DIVERSITY…
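One way to act on this advice is to measure pairwise identity and discard near-duplicates before aligning. The sketch below is an illustration only: the 90% cut-off is an arbitrary assumption, not a value given in the slides, and the 23-residue fragments are the N-terminal stretches of three parvalbumin rows. The names `percent_identity` and `drop_redundant` are hypothetical.

```python
def percent_identity(a, b):
    """Percent identical residues over the columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    same = sum(1 for x, y in pairs if x == y)
    return 100.0 * same / len(pairs)


def drop_redundant(rows, max_identity=90.0):
    """Greedily keep a row only if it is not too similar to any row kept so far."""
    kept = {}
    for name, row in rows.items():
        if all(percent_identity(row, other) <= max_identity for other in kept.values()):
            kept[name] = row
    return kept


FRAGMENTS = {
    "PRVA_MACFU": "SMTDLLNAEDIKKAVGAFSAIDS",
    "PRVA_HUMAN": "SMTDLLNAEDIKKAVGAFSATDS",
    "PRVB_CYPCA": "AFAGVLNDADIAAALEACKAADS",
}

print(list(drop_redundant(FRAGMENTS)))
```

On these fragments the human sequence is dropped as a near-copy of the macaque one, while the more distant carp sequence is kept and adds information.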
Sequence Weighting Within ClustalW
Selecting Diverse Sequences (Opus II)
Respect Information!
This Alignment Is not Informative about the relation between TPCC_MOUSE and the rest of the sequences.
PRVA_MACFU ------------------------------------------SMTDLLN----AEDIKKA
PRVA_HUMAN ------------------------------------------SMTDLLN----AEDIKKA
PRVA_GERSP ------------------------------------------SMTDLLS----AEDIKKA
PRVA_MOUSE ------------------------------------------SMTDVLS----AEDIKKA
PRVA_RAT   ------------------------------------------SMTDLLS----AEDIKKA
PRVA_RABIT ------------------------------------------AMTELLN----AEDIKKA
TPCC_MOUSE MDDIYKAAVEQLTEEQKNEFKAAFDIFVLGAEDGCISTKELGKVMRMLGQNPTPEELQEM
: :*. .*::::

PRVA_MACFU VGAFSAIDS--FDHKKFFQMVG------LKKKSADDVKKVFHILDKDKSGFIEEDELGFI
PRVA_HUMAN VGAFSATDS--FDHKKFFQMVG------LKKKSADDVKKVFHMLDKDKSGFIEEDELGFI
PRVA_GERSP IGAFAAADS--FDHKKFFQMVG------LKKKTPDDVKKVFHILDKDKSGFIEEDELGFI
PRVA_MOUSE IGAFAAADS--FDHKKFFQMVG------LKKKNPDEVKKVFHILDKDKSGFIEEDELGSI
PRVA_RAT   IGAFTAADS--FDHKKFFQMVG------LKKKSADDVKKVFHILDKDKSGFIEEDELGSI
PRVA_RABIT IGAFAAAES--FDHKKFFQMVG------LKKKSTEDVKKVFHILDKDKSGFIEEEELGFI
TPCC_MOUSE IDEVDEDGSGTVDFDEFLVMMVRCMKDDSKGKSEEELSDLFRMFDKNADGYIDLDELKMM
-A better Spread of the Sequences is needed
Selecting Diverse Sequences (Opus II)
PRVB_CYPCA -AFAGVLNDADIAAALEACKAADSFNHKAFFAKVGLTSKSADDVKKAFAIIDQDKSGFIE
PRVB_BOACO -AFAGILSDADIAAGLQSCQAADSFSCKTFFAKSGLHSKSKDQLTKVFGVIDRDKSGYIE
PRV1_SALSA MACAHLCKEADIKTALEACKAADTFSFKTFFHTIGFASKSADDVKKAFKVIDQDASGFIE
PRVB_LATCH -AVAKLLAAADVTAALEGCKADDSFNHKVFFQKTGLAKKSNEELEAIFKILDQDKSGFIE
PRVB_RANES -SITDIVSEKDIDAALESVKAAGSFNYKIFFQKVGLAGKSAADAKKVFEILDRDKSGFIE
PRVA_MACFU -SMTDLLNAEDIKKAVGAFSAIDSFDHKKFFQMVGLKKKSADDVKKVFHILDKDKSGFIE
PRVA_ESOLU --AKDLLKADDIKKALDAVKAEGSFNHKKFFALVGLKAMSANDVKKVFKAIDADASGFIE
: *: .: . .* .:*. * ** *: * : * :* * **:**

PRVB_CYPCA EDELKLFLQNFKADARALTDGETKTFLKAGDSDGDGKIGVDEFTALVKA-
PRVB_BOACO EDELKKFLQNFDGKARDLTDKETAEFLKEGDTDGDGKIGVEEFVVLVTKG
PRV1_SALSA VEELKLFLQNFCPKARELTDAETKAFLKAGDADGDGMIGIDEFAVLVKQ-
PRVB_LATCH DEELELFLQNFSAGARTLTKTETETFLKAGDSDGDGKIGVDEFQKLVKA-
PRVB_RANES QDELGLFLQNFRASARVLSDAETSAFLKAGDSDGDGKIGVEEFQALVKA-
PRVA_MACFU EDELGFILKGFSPDARDLSAKETKTLMAAGDKDGDGKIGVDEFSTLVAES
PRVA_ESOLU EEELKFVLKSFAADGRDLTDAETKAFLKAADKDGDGKIGIDEFETLVHEA
:** .*:.* .* *: ** :: .* **** **::** **

-A REASONABLE Model Now Exists.
-Going Further: Remote Homologues.
Aligning Remote Homologues
PRVA_MACFU ------------------------------------------SMTDLLNA----EDIKKA
PRVA_ESOLU -------------------------------------------AKDLLKA----DDIKKA
PRVB_CYPCA ------------------------------------------AFAGVLND----ADIAAA
PRVB_BOACO ------------------------------------------AFAGILSD----ADIAAG
PRV1_SALSA -----------------------------------------MACAHLCKE----ADIKTA
PRVB_LATCH ------------------------------------------AVAKLLAA----ADVTAA
PRVB_RANES ------------------------------------------SITDIVSE----KDIDAA
TPCS_RABIT -TDQQAEARSYLSEEMIAEFKAAFDMFDADGG-GDISVKELGTVMRMLGQTPTKEELDAI
TPCS_PIG   -TDQQAEARSYLSEEMIAEFKAAFDMFDADGG-GDISVKELGTVMRMLGQTPTKEELDAI
TPCC_MOUSE MDDIYKAAVEQLTEEQKNEFKAAFDIFVLGAEDGCISTKELGKVMRMLGQNPTPEELQEM
: ::

PRVA_MACFU VGAFSAIDS--FDHKKFFQMVG------LKKKSADDVKKVFHILDKDKSGFIEEDELGFI
PRVA_ESOLU LDAVKAEGS--FNHKKFFALVG------LKAMSANDVKKVFKAIDADASGFIEEEELKFV
PRVB_CYPCA LEACKAADS--FNHKAFFAKVG------LTSKSADDVKKAFAIIDQDKSGFIEEDELKLF
PRVB_BOACO LQSCQAADS--FSCKTFFAKSG------LHSKSKDQLTKVFGVIDRDKSGYIEEDELKKF
PRV1_SALSA LEACKAADT--FSFKTFFHTIG------FASKSADDVKKAFKVIDQDASGFIEVEELKLF
PRVB_LATCH LEGCKADDS--FNHKVFFQKTG------LAKKSNEELEAIFKILDQDKSGFIEDEELELF
PRVB_RANES LESVKAAGS--FNYKIFFQKVG------LAGKSAADAKKVFEILDRDKSGFIEQDELGLF
TPCS_RABIT IEEVDEDGSGTIDFEEFLVMMVRQMKEDAKGKSEEELAECFRIFDRNADGYIDAEELAEI
TPCS_PIG   IEEVDEDGSGTIDFEEFLVMMVRQMKEDAKGKSEEELAECFRIFDRNMDGYIDAEELAEI
TPCC_MOUSE IDEVDEDGSGTVDFDEFLVMMVRCMKDDSKGKSEEELSDLFRMFDKNADGYIDLDELKMM
: . .: .. . *: * : * :* : .*:*: :** .

PRVA_MACFU LKGFSPDARDLSAKETKTLMAAGDKDGDGKIGVDEFSTLVAES-
PRVA_ESOLU LKSFAADGRDLTDAETKAFLKAADKDGDGKIGIDEFETLVHEA-
PRVB_CYPCA LQNFKADARALTDGETKTFLKAGDSDGDGKIGVDEFTALVKA--
PRVB_BOACO LQNFDGKARDLTDKETAEFLKEGDTDGDGKIGVEEFVVLVTKG-
PRV1_SALSA LQNFCPKARELTDAETKAFLKAGDADGDGMIGIDEFAVLVKQ--
PRVB_LATCH LQNFSAGARTLTKTETETFLKAGDSDGDGKIGVDEFQKLVKA--
PRVB_RANES LQNFRASARVLSDAETSAFLKAGDSDGDGKIGVEEFQALVKA--
TPCS_RABIT FR---ASGEHVTDEEIESLMKDGDKNNDGRIDFDEFLKMMEGVQ
TPCS_PIG   FR---ASGEHVTDEEIESIMKDGDKNNDGRIDFDEFLKMMEGVQ
TPCC_MOUSE LQ---ATGETITEDDIEELMKDGDKNNDGRIDYDEFLEFMKGVE
:: .. :: : :: .* :.** *. :** ::
Some Guidelines…
Do Not Use Too Many Sequences…
Reading Your Alignment
Going Further…
PRVA_MACFU VGAFSAIDS--FDHKKFFQMVG------LKKKSADDVKKVFHILDKDKSGFIEEDELGFI
PRVB_BOACO LQSCQAADS--FSCKTFFAKSG------LHSKSKDQLTKVFGVIDRDKSGYIEEDELKKF
PRV1_SALSA LEACKAADT--FSFKTFFHTIG------FASKSADDVKKAFKVIDQDASGFIEVEELKLF
TPCS_RABIT IEEVDEDGSGTIDFEEFLVMMVRQMKEDAKGKSEEELAECFRIFDRNADGYIDAEELAEI
TPCS_PIG   IEEVDEDGSGTIDFEEFLVMMVRQMKEDAKGKSEEELAECFRIFDRNMDGYIDAEELAEI
TPCC_MOUSE IDEVDEDGSGTVDFDEFLVMMVRCMKDDSKGKSEEELSDLFRMFDKNADGYIDLDELKMM
TPC_PATYE  SDEMDEEATGRLNCDAWIQLFER---KLKEDLDERELKEAFRVLDKEKKGVIKVDVLRWI
. : .. . :: . : * :* : .* *. : * .

PRVA_MACFU LKGFSPDARDLSAKETKTLMAAGDKDGDGKIGVDEFSTLVAES--
PRVB_BOACO LQNFDGKARDLTDKETAEFLKEGDTDGDGKIGVEEFVVLVTKG--
PRV1_SALSA LQNFCPKARELTDAETKAFLKAGDADGDGMIGIDEFAVLVKQ---
TPCS_RABIT FR---ASGEHVTDEEIESLMKDGDKNNDGRIDFDEFLKMMEGVQ-
TPCS_PIG   FR---ASGEHVTDEEIESIMKDGDKNNDGRIDFDEFLKMMEGVQ-
TPCC_MOUSE LQ---ATGETITEDDIEELMKDGDKNNDGRIDYDEFLEFMKGVE-
TPC_PATYE  LS---SLGDELTEEEIENMIAETDTDGSGTVDYEEFKCLMMSSDA
: . :: : :: * :..* :. :** ::
WHAT MAKES A GOOD ALIGNMENT…
-THE MORE DIVERGENT THE SEQUENCES, THE BETTER
-THE FEWER INDELS, THE BETTER
-NICE UNGAPPED BLOCKS SEPARATED WITH INDELS
-DIFFERENT CLASSES OF RESIDUES WITHIN A BLOCK:
•Completely Conserved
•Conserved For Size and Hydropathy
•Conserved For Size or Hydropathy
-THE ULTIMATE EVALUATION IS A MATTER OF PERSONAL JUDGEMENT AND KNOWLEDGE.
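The "ungapped blocks separated with indels" criterion can be inspected automatically. The sketch below lists the runs of consecutive columns that contain no gap in any sequence, using the first 12 columns of the chite/wheat/trybr/mouse alignment as input. The function name `ungapped_blocks` and the `min_length` parameter are hypothetical, introduced only for this illustration.

```python
ROWS = [
    "---ADKPKRPLS",  # chite
    "--DPNKPKRAPS",  # wheat
    "KKDSNAPKRAMT",  # trybr
    "-----KPKRPRS",  # mouse
]


def ungapped_blocks(rows, min_length=1):
    """Return (start, end) column ranges, end exclusive, that contain no gaps."""
    blocks, start = [], None
    for index, column in enumerate(zip(*rows)):
        if "-" in column:
            if start is not None and index - start >= min_length:
                blocks.append((start, index))
            start = None
        elif start is None:
            start = index
    width = len(rows[0])
    if start is not None and width - start >= min_length:
        blocks.append((start, width))
    return blocks


print(ungapped_blocks(ROWS))
```

Long blocks of this kind are the regions where column-by-column conservation can be read with some confidence; judging whether the blocks make biological sense remains a matter for the reader.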
Potential Difficulties
DO NOT OVERTUNE!!!
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
***. ::: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
* : .* . :
DO NOT PLAY WITH PARAMETERS IF YOU KNOW THE ALIGNMENT YOU WANT: MAKE IT YOURSELF!
chite ---ADKPKRPL-SAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAP-SAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPR-SAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
***. :*: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
* : .* . :
![Page 66: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/66.jpg)
TUNING or NOT TUNING!!!
-MOST METHODS ARE TUNED TO WORK WELL ON AVERAGE.
-PARAMETER BEHAVIOUR DOES NOT NECESSARILY FOLLOW THE THEORY (e.g. SUBSTITUTION MATRICES).
-A GOOD ALIGNMENT IS USUALLY ROBUST (i.e. IT CHANGES LITTLE WHEN THE PARAMETERS CHANGE).
-TUNE IF YOU WANT TO CONVINCE YOURSELF.
-PARAMETERS TO TUNE USUALLY INCLUDE:
•GOP / GEP (GAP OPENING / GAP EXTENSION PENALTIES)
•MATRIX
•SENSITIVITY Vs SPEED
GOP
GEP
Substitution Matrices (Etzold et al. 1993)
Gonnet 61.7 %
Blosum50 59.7 %
Pam250 59.2 %
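GOP and GEP are usually combined into an affine gap cost. The sketch below is an illustration, not the scoring scheme of any particular program: it assumes the convention cost = GOP + GEP × gap length (some programs charge GEP only from the second gapped position onwards), and the default values and function names are hypothetical.

```python
import re


def affine_gap_cost(gap_length, gop=10.0, gep=0.5):
    """Cost of one gap under the assumed convention GOP + GEP * length."""
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_length == 0:
        return 0.0
    return gop + gep * gap_length


def row_gap_cost(row, gop=10.0, gep=0.5):
    """Sum the affine cost of every run of '-' in one aligned row."""
    return sum(
        affine_gap_cost(len(m.group()), gop, gep)
        for m in re.finditer(r"-+", row)
    )


# One long indel is much cheaper than the same number of gapped
# positions scattered over three separate indels.
print(row_gap_cost("ACD---EFGH"))    # 11.5
print(row_gap_cost("A-CD-EF-GH"))    # 31.5
```

Raising GOP relative to GEP favours a few long indels; lowering it lets the aligner scatter short gaps, which is one way over-tuning produces alignments like the "wrong alignment" example.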
![Page 67: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/67.jpg)
![Page 68: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/68.jpg)
KEEP A BIOLOGICAL PERSPECTIVE
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
***. ::: .: .. . : . . * . *: *

chite AD--K----PKR-PLYMLWLNS-ARESIKRENPDFK-VT-EVAKKGGELWRGL-
wheat -DPNK----PKRAP-FFVFMGE-FREEFKQKNPKNKSVA-AVGKAAGERWKSLS
trybr -K--KDSNAPKR-AMT-MFFSSDFR-S-KH-S-DLS-IV-EMSKAAGAAWKELG
mouse ----K----PKR-PRYNIYVSESFQEA-K--D-D-S-AQGKL-KLVNEAWKNLS
* *** .:: ::... : * . . . : * . *: *
DIFFERENT PARAMETERS
WRONG ALIGNMENT !!!
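One way to see how much an alignment changes when the parameters change is to compare which residue pairs end up in the same column in each version. The following is a rough Python sketch under stated assumptions: both alignments contain the same sequences in the same order, `-` is the gap character, and the names `aligned_pairs` and `shared_fraction` are made up for illustration.

```python
from itertools import combinations


def aligned_pairs(rows):
    """Return the set of residue pairs that share a column.
    A residue is identified as (sequence index, position in the
    ungapped sequence)."""
    counters = [0] * len(rows)  # next residue index for each sequence
    pairs = set()
    for col in range(len(rows[0])):
        present = []
        for i, row in enumerate(rows):
            if row[col] != "-":
                present.append((i, counters[i]))
                counters[i] += 1
        pairs.update(combinations(present, 2))
    return pairs


def shared_fraction(aln_a, aln_b):
    """Fraction of the aligned pairs of aln_a that also occur in aln_b."""
    pairs_a = aligned_pairs(aln_a)
    if not pairs_a:
        return 0.0
    return len(pairs_a & aligned_pairs(aln_b)) / len(pairs_a)


# Hypothetical toy example: the same two sequences aligned two ways.
run_1 = ["ACGT", "A-GT"]
run_2 = ["ACGT", "-AGT"]
print(shared_fraction(run_1, run_1))  # 1.0
print(shared_fraction(run_1, run_2))  # two of three pairs are shared
```

A value close to 1 across several parameter settings suggests a robust alignment; a low value only says the two runs disagree, not which one is biologically right.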
![Page 69: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/69.jpg)
REPEATS
THERE IS A PROBLEM WHEN TWO SEQUENCES DO NOT CONTAIN THE SAME NUMBER OF REPEATS.
IT IS THEN BETTER TO EXTRACT THE REPEATS MANUALLY AND ALIGN THEM. INDIVIDUAL REPEATS CAN BE RECOGNIZED USING DOTTER.
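The idea behind recognizing repeats with a dot plot can be sketched in a few lines. This is not Dotter and does not reproduce its scoring; it is a simplified exact-word-match dot plot, with the function names, the word size and the toy sequence all chosen for illustration. Comparing a sequence with itself, repeats show up as runs of hits at a constant offset from the main diagonal.

```python
def dot_plot(seq_a, seq_b, word=3):
    """Return the set of (i, j) such that the words of length `word`
    starting at seq_a[i] and seq_b[j] are identical."""
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    hits = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            hits.add((i, j))
    return hits


def repeat_offsets(seq, word=3):
    """Self-comparison: count off-diagonal hits for each offset j - i > 0.
    A large count at one offset suggests a repeat at that spacing."""
    counts = {}
    for i, j in dot_plot(seq, seq, word):
        if j > i:
            counts[j - i] = counts.get(j - i, 0) + 1
    return counts


# Hypothetical sequence containing the segment ACDEFG twice, 9 residues apart.
print(repeat_offsets("MKTACDEFGLLPACDEFGW"))  # {9: 4}
```

Real repeats are rarely identical, so a tool such as Dotter scores similar words rather than requiring exact matches; the sketch only shows where the diagonal signal comes from.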
![Page 70: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/70.jpg)
![Page 71: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/71.jpg)
Naming Your Sequences The Right Way
![Page 72: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/72.jpg)
What Are The Available Methods???
![Page 73: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/73.jpg)
Simultaneous Alignments : MSA
1) Set Bounds on each pair of sequences (Carrillo and Lipman)
2) Compute the Mult. Aln within the Hyperspace
-Few Small Closely Related Sequences.
-Do Well When They Can Run.
-Memory and CPU hungry
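Why simultaneous alignment is memory and CPU hungry can be seen by counting the cells of the full dynamic-programming hyperspace, which has one dimension per sequence. The snippet below is a back-of-the-envelope sketch: `lattice_cells` is a hypothetical helper, and it counts the unbounded lattice, whereas the Carrillo and Lipman bounds exist precisely to restrict the search to a smaller region of it.

```python
from math import prod


def lattice_cells(lengths):
    """Number of cells in the full N-dimensional alignment lattice:
    each sequence of length L contributes L + 1 positions."""
    return prod(n + 1 for n in lengths)


# Growth for sequences of 100 residues each.
for n_seqs in (2, 3, 4, 5):
    print(n_seqs, lattice_cells([100] * n_seqs))
```

Every added sequence multiplies the lattice by roughly its length, which is why these methods only run on a few small, closely related sequences.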
![Page 74: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/74.jpg)
Simultaneous Alignments : DCA
-Few Small Closely Related Sequences, but less limited than MSA
-Do Well When They Can Run.
-Memory and CPU hungry, but less than MSA
![Page 75: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/75.jpg)
Dialign
![Page 76: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/76.jpg)
Dialign II
1) Identify the best chain of segments on each pair of sequences. Assign a P-value to each segment pair.
2) Re-evaluate each segment pair according to its consistency with the others.
3) Assemble the alignment according to the segment pairs.
![Page 77: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/77.jpg)
Dialign II
-May Align Too Few Residues
-No Gap Penalty
-Does well with ESTs
![Page 78: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/78.jpg)
Dialign II
bibiserv.techfak.uni-bielefeld.de/dialign/submission.html
![Page 79: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/79.jpg)
Muscle
![Page 80: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/80.jpg)
7.16.1 Progressive
Iterative Methods
-HMMs, HMMER, SAM, MUSCLE
-Slow, Sometimes Inaccurate
-Good Profile Generators
![Page 81: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/81.jpg)
7.16.1 Progressive
MUSCLE
![Page 82: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/82.jpg)
7.16.1 Progressive
MUSCLE
phylogenomics.berkeley.edu/cgi-bin/muscle/input_muscle.py
![Page 83: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/83.jpg)
7.16.1 Progressive
MUSCLE
phylogenomics.berkeley.edu/cgi-bin/muscle/input_muscle.py
![Page 84: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/84.jpg)
T-Coffee
![Page 85: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/85.jpg)
Mixing Local and Global Alignments
Local Alignment Global Alignment
Extension
Multiple Sequence Alignment
![Page 86: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/86.jpg)
Mixing Heterogeneous Data With T-Coffee
Local Alignment Global Alignment
Multiple Sequence Alignment
Multiple Alignment
Structural
Specialist
![Page 87: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/87.jpg)
Mixing Sequences and Structures with T-Coffee
Struct Vs Struct: Superpose
Seq Vs Struct: Thread
Seq Vs Seq: Local, Global
Evaluation on Homstrad
![Page 88: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/88.jpg)
What is the Local Quality of my Alignment?
II
I
![Page 89: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/89.jpg)
T-Coffee
igs-server.cnrs-mrs.fr/Tcoffee/
![Page 90: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/90.jpg)
DBClustal
![Page 91: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/91.jpg)
DBClustal
BlastP
![Page 92: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/92.jpg)
DBClustal
![Page 93: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/93.jpg)
DBClustal
![Page 94: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/94.jpg)
Expasy BLAST
![Page 95: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/95.jpg)
Expasy BLAST
www.expasy.org/tools/blast/
![Page 96: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/96.jpg)
Expasy BLAST
![Page 97: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/97.jpg)
Choosing the right method
![Page 98: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/98.jpg)
Situation Solution
![Page 99: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/99.jpg)
Priority Solution
Method
Priority
Trees Profile 2D-Pred 3D-Pred Func-Pred
Accuracy
Speed
![Page 100: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/100.jpg)
Purpose Solution
![Page 101: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/101.jpg)
Conclusion
![Page 102: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/102.jpg)
-The BEST Alignment Method: Your Brain and The Right Data
-Beware of repeated elements
Multiple Alignment
-The Best Evaluation Procedure: Experimental Data (SwissProt)
-Choosing The Sequences Well is Important
![Page 103: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/103.jpg)
Know Your Problem: What do you want to do with your MSA?
Multiple Alignment
![Page 104: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/104.jpg)
Addresses
MAFFT Progressive/Iterative www.biophys.kyoto-u.jp/katoh
POA Progressive/Simultaneous www.bioinformatics.ucla.edu/poa
MUSCLE Progressive/Iterative www.drive5.com/muscle
![Page 105: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/105.jpg)
What Is BaliBase
PROBLEM Description
Source: BaliBase, Thompson et al., NAR, 1999
Even Phylogenetic Spread.
One Outlier Sequence
Two Distantly related Groups
Long Internal Indel
Long Terminal Indel
![Page 106: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/106.jpg)
What Is BaliBase
Which Method?
PROBLEM
Source: BaliBase, Thompson et al., NAR, 1999
Strategy
Strategy
ClustalW, T-Coffee, MSA, DCA
PrrP, T-Coffee
Dialign
T-Coffee
T-Coffee
Dialign
T-Coffee
![Page 107: An Introduction to Multiple Sequence Alignments](https://reader030.fdocuments.net/reader030/viewer/2022032607/568130f1550346895d9714a1/html5/thumbnails/107.jpg)
Methods / Situations
1-Carrillo and Lipman:
-MSA, DCA.
-Few Small Closely Related Sequences.
-Do Well When They Can Run.
2-Segment Based:
-DIALIGN, MACAW.
-May Align Too Few Residues
-Good For Long Indels
3-Iterative:
-HMMs, HMMER, SAM.
-Slow, Sometimes Inaccurate
-Good Profile Generators
4-Progressive:
-ClustalW, Pileup, Multalign…
-Fast and Sensitive