Multiple Sequence Alignment


Page 1: Multiple Sequence Alignment

Multiple Sequence Alignment

ClustalW
TCoffee
Ka, Ks, and Ka/Ks
Anchored alignment

Page 2: Multiple Sequence Alignment

ClustalW

http://www.ebi.ac.uk/clustalw/

Page 3: Multiple Sequence Alignment

ClustalW

Paste your sequences

Multiple sequence alignment options

Submit

Page 4: Multiple Sequence Alignment

Exercise

HomoloGene is a system for automated detection of homologs among annotated genes of several completely sequenced eukaryotic genomes.

Download the FASTA sequences of HomoloGene:5276 and align them with ClustalW.

Page 5: Multiple Sequence Alignment

Download protein sequences


Page 6: Multiple Sequence Alignment

Result

Alignment

Guide Tree

Page 7: Multiple Sequence Alignment

TCoffee
http://tcoffee.crg.cat/

TCoffee computes its alignments by combining a collection of smaller alignments.

Page 8: Multiple Sequence Alignment

Alignment at the DNA level based on an alignment at the protein level

The 18-kDa protein plays an important role in fertilization of several abalone species.

Build a multiple sequence alignment using the following sequences.

Page 9: Multiple Sequence Alignment

Sequences

>gi|604533|gb|AAC37231.1| fertilization protein
MRSLVLLCVLLMAICAADKKTSVSKENEAAMKVAMMKFLDMKAGVFKEIIEDMGYPITPPQWTTLLYYNR
ERLIEFCRSFLALSKKIILLGGNKLNKANFARMGRILGWKSQWAVRQRQWGMVRVSRRHTSTAIAKRIVA
MKVADLPCN

>gi|604531|gb|AAC37233.1| fertilization protein
MRFLLLLCVLMGAVSQAVCRKRPNVWGKIVVKEKNKAAMKIGFMEYLDAKLVKFKRHWLVGANWKLQKFE
TDEMRYLAIKRLIKVCHGYTIWSQRLIMLKYRPLNEKYFKKVGRYLAWRNYLIVFRMWIGVLKKNLKRSE
ITKPMQKLLDTKDGELPCPVRKIHG

>gi|604529|gb|AAC37232.1| fertilization protein
MRSLVLLCVLMAVGCVAFDDVVVSRQEQSYVQRGMVNFLDEEMHKLVKRFRDMRWNLGPGFVFLLKKVNR
ERMMRYCMDYARYSKKILQLKHLPVNKKTLTKMGRFVGYRNYGVIRELYADVFRDVQGFRGPKMTAAMRK
YSSKDPGTFPCKNEKRRG

>gi|604527|gb|AAC37230.1| fertilization protein
MRSLVLLCVLLMAICAADKKTTVSKENAAAMKIAMIKFLDARAGKFKKRVENMGYPITPPQWTTLLYYNR
QRLMEWCHTYVEFSKKIILMGGNKLNKKNFTRMGRIIGWKNQWVLKRRQWEMVRVMRRYKSTAIAKKIVA
MKVADLPCN

>gi|604525|gb|AAC37229.1| fertilization protein
MRSLVLLCVLLMAICAADKKSTVSKENAAAMKVAMIKFLDSRTDRFKKRIEKIGYPITPPQYTTLLYYNR
ERLMDWCHNYVEVSKKIILLGGNKLNKKNFARMGRIIGWKNQWILKRRQWHMVRVMRRYKASAIAKKIVA
MKVADLPCN

Page 10: Multiple Sequence Alignment

Choose TCoffee Regular, paste the sequences in the data box, and press submit


Page 11: Multiple Sequence Alignment

Download formats

Guide tree


Page 12: Multiple Sequence Alignment

Codon Alignment

In order to study selection patterns, you will need to have the corresponding DNA alignment.

Using PROTOGENE (Protein-to-Gene) in TCoffee, the amino-acid alignment will be transformed into a codon alignment. The actual procedure involves tBLASTn.
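The core idea of a codon alignment can be sketched in a few lines of Python. This is not PROTOGENE, which must also locate the matching DNA sequence via tBLASTn; it is a minimal illustration that assumes you already hold the coding sequence (CDS) for each protein. The function name `thread_codons` and the toy sequences are hypothetical.

```python
def thread_codons(aligned_protein, cds):
    """Project a gapped protein sequence onto its ungapped coding DNA.

    Each residue consumes one codon (3 nt) from `cds`; each gap '-'
    becomes a codon-sized gap '---'. A trailing stop codon in `cds`,
    if present, is appended after the last alignment column.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) not in (3 * n_residues, 3 * n_residues + 3):
        raise ValueError("CDS length does not match the protein length")
    out = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")           # gap in protein -> gap of one codon
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    out.append(cds[pos:])               # stop codon, or an empty string
    return "".join(out)


# Toy example (made-up data): protein M R - S L, CDS without a stop codon.
print(thread_codons("MR-SL", "ATGAGGTCTTTG"))  # ATGAGG---TCTTTG
```

Because gaps are always inserted in multiples of three, the reading frame is preserved, which is what the Ka/Ks step later relies on.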

Page 13: Multiple Sequence Alignment

• PROTOGENE (in TCoffee) is time consuming. Please submit your email address, and the results will be emailed to you.
• PROTOGENE may return more than one DNA sequence for any given protein sequence. For your homework assignment, please choose one sequence for each species.

Page 14: Multiple Sequence Alignment

(Result) Codon alignment

>gi|604533|gb|AAC37231.1|_G_L36554 _S_ AAC37231 _DESC_ fertilization protein MATCHES_ON Haliotis assimilis fertilization protein mRNA, complete cds
ATGAGGTCTTTGGTGCTTCTCTGTGTTTTGCTGATGGCAATATGTGCGGCGGAC------
------------------AAAAAAACCTCGGTCTCGAAGGAAAATGAAGCCGCAATGAAG
GTAGCGATGATGAAGTTTTTGGATATGAAGGCGGGTGTATTCAAAGAAATC---ATTGAG
GATATGGGATATCCAATAACCCCTCCGCAATGGACAACTCTACTGTACTACAACAGAGAG
AGATTGATTGAATTTTGCCGTTCCTTCCTTGCATTGTCCAAAAAGATTATATTGCTGGGA
GGTAACAAATTAAATAAGGCGAACTTCGCTAGGATGGGTCGAATCCTTGGCTGGAAAAGC
CAGTGGGCTGTGAGACAGAGGCAATGGGGGATGGTCAGA---------GTGTCGAGGCGC
CATACAAGTACTGCAATAGCTAAAAGGATCGTCGCCATGAAAGTTGCTGACCTACCCTGT
AAC------------------TAG

>gi|604531|gb|AAC37233.1|_G_L36590 _S_ AAC37233 _DESC_ fertilization protein MATCHES_ON Haliotis corrugata fertilization protein mRNA, complete cds
ATGAGGTTTTTGCTGCTTCTCTGTGTTTTGATGGGGGCAGTATCTCAGGCAGTATGCAGA
AAAAGACCTAATGTCTGGGGGAAAATCGTGGTCAAGGAGAAAAATAAAGCCGCAATGAAG
ATAGGGTTTATGGAATATTTGGATGCAAAGTTGGTAAAGTTTAAAAGGCACTGGCTTGTT
GGAGCCAATTGGAAACTTCAAAAATTTGAAACGGATGAAATGAGATACCTCGCCATAAAG
AGACTGATAAAAGTTTGCCATGGATACACTATTTGGTCCCAACGACTAATAATGTTAAAA
TATCGACCATTGAATGAGAAATACTTCAAAAAGGTGGGTCGATACCTTGCCTGGCGAAAC
TACCTCATAGTTTTTCGGATGTGGATCGGCGTTTTG------AAGAAAAATCTTAAAAGA
TCGGAAATAACGAAACCCATGCAAAAACTCCTCGACACAAAGGATGGTGAGTTGCCCTGC
CCTGTTAGAAAGATACATGGATAA

>gi|604529|gb|AAC37232.1|_G_L36589 _S_ AAC37232 _DESC_ fertilization protein MATCHES_ON Haliotis fulgens fertilization protein mRNA, complete cds
ATGAGGTCTTTGGTGCTTCTCTGTGTTTTGATGGCGGTAGGATGTGTGGCGTTT------
------------------GATGATGTGGTGGTCTCAAGGCAAGAGCAATCTTATGTGCAG
AGAGGGATGGTCAACTTTTTGGATGAAGAAATGCATAAACTGGTTAAACGG---TTTAGA
GATATGCGATGGAATTTAGGGCCAGGCTTTGTATTCCTTCTAAAAAAAGTCAACAGAGAG
AGAATGATGCGCTACTGCATGGATTACGCCAGATATTCCAAAAAGATTTTACAGCTAAAA
CATCTTCCAGTAAATAAGAAGACCCTCACTAAAATGGGTAGATTCGTTGGATATCGAAAC
TATGGGGTCATCAGGGAGTTGTACGCCGACGTATTCAGAGACGTTCAAGGATTTAGGGGG
CCTAAAATGACTGCAGCCATGAGGAAGTACAGCAGCAAGGATCCTGGTACATTTCCTTGC
AAGAACGAGAAACGCCGCGGATGA

>gi|604527|gb|AAC37230.1|_G_L36553 _S_ AAC37230 _DESC_ fertilization protein MATCHES_ON Haliotis sorenseni fertilization protein mRNA, complete cds
ATGAGGTCTTTGGTGCTTCTCTGTGTTTTGCTGATGGCAATATGTGCGGCGGAC------
------------------AAAAAAACCACGGTCTCGAAGGAAAATGCAGCCGCAATGAAG
ATAGCTATGATAAAGTTTTTGGATGCGAGGGCGGGTAAATTCAAAAAACGC---GTTGAG
AATATGGGATATCCAATAACCCCTCCGCAATGGACAACTCTACTATACTACAACAGACAG
AGATTGATGGAATGGTGCCATACCTACGTTGAATTTTCCAAAAAGATTATATTGATGGGA
GGTAACAAATTAAATAAGAAGAACTTCACTAGGATGGGTCGAATCATTGGCTGGAAAAAC
CAGTGGGTTTTGAAAAGGAGGCAATGGGAGATGGTCAGA---------GTGATGAGGCGC
TATAAAAGTACTGCAATAGCTAAAAAGATCGTCGCCATGAAAGTTGCTGACCTACCCTGT
AAC------------------TAG

>gi|604525|gb|AAC37229.1|_G_L36552 _S_ AAC37229 _DESC_ fertilization protein MATCHES_ON Haliotis rufescens fertilization protein mRNA, complete cds
ATGAGGTCTTTGGTGCTTCTCTGTGTTTTGCTGATGGCAATATGTGCGGCGGAC------
------------------AAAAAATCCACGGTCTCGAAGGAAAATGCAGCCGCAATGAAG
GTAGCGATGATAAAGTTTTTGGATTCGAGGACGGATAGATTCAAAAAACGC---ATTGAG
AAGATTGGATATCCAATAACCCCTCCGCAATATACAACTCTACTATACTACAACAGAGAG
AGATTGATGGATTGGTGCCATAACTACGTTGAAGTATCCAAAAAGATTATATTGTTGGGA
GGTAACAAATTAAATAAGAAGAACTTCGCTAGGATGGGTCGAATCATTGGCTGGAAAAAC
CAGTGGATTTTGAAAAGGAGGCAATGGCACATGGTCAGA---------GTGATGAGGCGC
TATAAAGCTTCTGCAATAGCTAAAAAGATCGTCGCCATGAAAGTTGCTGACCTACCCTGT
AAC------------------TAG

Page 15: Multiple Sequence Alignment

SNAP - Ds/Dn Calculation Tool
http://hcv.lanl.gov/content/sequence/SNAP/SNAP.html

Calculates synonymous and nonsynonymous substitution rates based on codon alignments according to the Nei and Gojobori (1986) method.
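To make the Nei and Gojobori counting concrete, here is a small pairwise sketch in Python. It is not SNAP's code and may differ from it in details: it assumes the standard genetic code, counts a change to a stop codon as nonsynonymous when tallying sites, drops mutation pathways that pass through a stop codon, skips codons containing gaps, and applies the Jukes-Cantor correction. All function names are hypothetical.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, ...).
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    s = 0.0
    for i in range(3):
        for b in BASES:
            if b != codon[i]:
                mutant = codon[:i] + b + codon[i + 1:]
                if CODE[mutant] == CODE[codon]:
                    s += 1 / 3
    return s


def count_diffs(c1, c2):
    """(synonymous, nonsynonymous) differences, averaged over pathways."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    sd = nd = 0.0
    n_paths = 0
    for order in permutations(diff):      # every order of the changes
        cur, s, n, ok = c1, 0, 0, True
        for i in order:
            nxt = cur[:i] + c2[i] + cur[i + 1:]
            if CODE[nxt] == "*":          # pathway passes through a stop
                ok = False
                break
            if CODE[nxt] == CODE[cur]:
                s += 1
            else:
                n += 1
            cur = nxt
        if ok:
            sd, nd, n_paths = sd + s, nd + n, n_paths + 1
    if n_paths == 0:
        return 0.0, 0.0
    return sd / n_paths, nd / n_paths


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differences."""
    if p <= 0:
        return 0.0
    if p >= 0.75:
        return float("inf")               # correction is undefined here
    return -0.75 * log(1 - 4 * p / 3)


def ng86(seq1, seq2):
    """Return (dN, dS) for two in-frame, aligned coding sequences."""
    S = Sd = Nd = 0.0
    n_codons = 0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 not in CODE or c2 not in CODE:
            continue                      # gap or ambiguous base
        if CODE[c1] == "*" or CODE[c2] == "*":
            continue                      # ignore stop codons
        S += (syn_sites(c1) + syn_sites(c2)) / 2
        sd, nd = count_diffs(c1, c2)
        Sd, Nd, n_codons = Sd + sd, Nd + nd, n_codons + 1
    N = 3 * n_codons - S
    if S == 0 or N == 0:
        raise ValueError("not enough comparable codons")
    return jukes_cantor(Nd / N), jukes_cantor(Sd / S)


# One 60-column slice of the codon alignment (H. assimilis vs. H. rufescens).
a = "GTAGCGATGATGAAGTTTTTGGATATGAAGGCGGGTGTATTCAAAGAAATC---ATTGAG"
b = "GTAGCGATGATAAAGTTTTTGGATTCGAGGACGGATAGATTCAAAAAACGC---ATTGAG"
dn, ds = ng86(a, b)
print(f"dN = {dn:.3f}, dS = {ds:.3f}")
if ds > 0:
    print(f"dN/dS = {dn / ds:.2f}")
```

A dN/dS ratio above 1 is the usual signal of positive selection. On a slice this short the ratio is often undefined because no synonymous difference is observed, so run the full alignment through SNAP for the actual exercise.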

Page 16: Multiple Sequence Alignment

Input codon alignment

Select output statistics


Page 17: Multiple Sequence Alignment

SNAP - Ds/Dn Calculation Tool

Conclusion: We detect positive selection in six of the comparisons. So did Swanson and Vacquier (1998).

Page 18: Multiple Sequence Alignment

Distmat
http://emboss.bioinformatics.nl/cgi-bin/emboss/distmat

Distmat calculates the evolutionary distances between every pair of sequences in a multiple alignment.

The distances are expressed as the number of substitutions per 100 nucleotides or the number of replacements per 100 amino acids.

Page 19: Multiple Sequence Alignment

Distmat

Feed the DNA alignment of the 18-kDa protein into distmat.

Calculate separately the distances between the sequences for codon positions 1 and 2, and for codon position 3.

Are the results in agreement with those from the dn/ds analysis?
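The split by codon position can be sketched in Python as follows. This is an illustration only, not distmat's code: it computes the uncorrected distance (differences per 100 compared positions, gaps skipped), whereas distmat also offers corrected distances. The function names and toy sequences are hypothetical.

```python
def split_codon_positions(aligned_dna):
    """Return (positions 1+2, position 3) of an in-frame aligned sequence."""
    pos12 = "".join(b for i, b in enumerate(aligned_dna) if i % 3 != 2)
    pos3 = aligned_dna[2::3]
    return pos12, pos3


def distance_per_100(a, b):
    """Uncorrected differences per 100 compared positions, skipping gaps."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    mismatches = sum(x != y for x, y in pairs)
    return 100.0 * mismatches / len(pairs)


# Toy example (made-up data): both differences fall on third positions.
a = "ATGAAATTT"
b = "ATGAAGTTC"
a12, a3 = split_codon_positions(a)
b12, b3 = split_codon_positions(b)
print(f"positions 1+2: {distance_per_100(a12, b12):.1f} per 100")
print(f"position 3:    {distance_per_100(a3, b3):.1f} per 100")
```

Third positions are mostly synonymous, so under purifying selection they tend to diverge faster than positions 1 and 2. Seeing the opposite pattern would be consistent with the positive selection detected in the dn/ds analysis.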

Page 20: Multiple Sequence Alignment

Distmat

Page 21: Multiple Sequence Alignment

Distmat

Page 22: Multiple Sequence Alignment

Anchored multiple-sequence alignment with DIALIGN

http://dialign.gobics.de/anchor/submission.php

User manual: http://dialign.gobics.de/anchor/manual

Page 23: Multiple Sequence Alignment

Align the following sequences (use the file dalign_sequences.txt):

>seq1
WKKNADAPKRAMTSFMKAAY
>seq2
WNLDTNSPEEKQAYIQLAKDDRIRYD
>seq3
WRMDSNQKNPDSNNPKAAYNKGDANAPK

Page 24: Multiple Sequence Alignment

Results

DIALIGN makes alignments from fragments.

Page 25: Multiple Sequence Alignment

Results

Numbers below the alignment reflect some rough degree of local similarity among the sequences.

Page 26: Multiple Sequence Alignment

Anchored alignment

Now, let us assume that the user has some expert knowledge concerning a certain domain that is present in all the input sequences.

The domains marked in red in the three sequences are thought to be homologous to one another.

>seq1
WKKNADAPKRAMTSFMKAAY
>seq2
WNLDTNSPEEKQAYIQLAKDDRIRYD
>seq3
WRMDSNQKNPDSNNPKAAYNKGDANAPK

Page 27: Multiple Sequence Alignment

Therefore, the user wants to define this domain as an anchor and align the rest of the sequences automatically.

To specify a set of anchor points, define each anchor point as an equal-length segment pair involving two of the input sequences.

Page 28: Multiple Sequence Alignment

first sequence involved
second sequence involved
start of anchor in first sequence
start of anchor in second sequence
length of anchor
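Before submitting anchor points, it can help to check that each one really selects the segments you intend. The Python sketch below does that for the five fields listed above. It assumes positions are counted from 1, and the anchor shown is purely illustrative (it pairs the shared "KAAY" motif of seq1 and seq3, which is not necessarily the domain marked on the slide). Check the user manual for the exact input format that DIALIGN expects. The function name is hypothetical.

```python
seqs = {
    1: "WKKNADAPKRAMTSFMKAAY",
    2: "WNLDTNSPEEKQAYIQLAKDDRIRYD",
    3: "WRMDSNQKNPDSNNPKAAYNKGDANAPK",
}


def anchor_segments(seqs, anchor):
    """Return the two equal-length segments selected by one anchor point.

    anchor = (first sequence, second sequence,
              start in first, start in second, length),
    with start positions counted from 1 (an assumption of this sketch).
    """
    first, second, start1, start2, length = anchor
    segments = []
    for name, start in ((first, start1), (second, start2)):
        seq = seqs[name]
        if length < 1 or start < 1 or start + length - 1 > len(seq):
            raise ValueError(f"anchor runs outside sequence {name}")
        segments.append(seq[start - 1:start - 1 + length])
    return tuple(segments)


# Illustrative anchor: seq1 positions 17-20 paired with seq3 positions 16-19.
anchor = (1, 3, 17, 16, 4)
print(" ".join(str(field) for field in anchor))
print(anchor_segments(seqs, anchor))
```

An anchor whose two segments look unrelated when printed this way usually means a start position is off by one.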

Page 29: Multiple Sequence Alignment

Results

The specified domain is aligned and the remainder of the sequences is aligned automatically, respecting the constraints given by the anchor points:

Page 30: Multiple Sequence Alignment

Guidance/HoT

Page 31: Multiple Sequence Alignment

>seq1
WKKNADAPKRAMTSFMKAAY
>seq2
WNLDTNSPEEKQAYIQLAKDDRIRYD
>seq3
WRMDSNQKNPDSNNPKAAYNKGDANAPK
>seq4
WRMDSNQKNPNNPKAAYNKGDANAPK

Page 32: Multiple Sequence Alignment