Functional Motifs - people.uleth.ca

Biochemistry 4000, Lecture 7, Slide 1

Functional Motifs

Various Sources

Slide 2

Functional Motifs

Functional Motif Definition(s):

Originally: a structural motif that performs a biological function

● Short continuous stretch of primary sequence
● Defined in terms of protein architecture

Bioinformatics era: any primary sequence pattern that is associated with a biological function.

● Sequence fragments from anywhere in the primary sequence
● Defined in terms of primary sequence

Why has the term 'evolved' (opinion)?

● Bioinformatic sequence annotation relies heavily (and successfully) on primary sequence pattern searches
● Use of primary sequence information is far more widespread than the use of structural information (e.g. majority rules)

[Figures: Helix-loop-helix; Catalytic triad of serine proteases (residues 57, 102 and 195 labelled)]

Slide 3

Helix-loop-helix (EF-hand) (Ca2+ binding)

Helix-loop-helix: Ca2+ binding motif composed of two orthogonal helices and a connecting loop of 12 residues (~30 residues in total)

● Helices within a single motif make few contacts (left)
● Typically occur in pairs forming a 4-helix orthogonal bundle (right)

EF-hand: Older name derived from original structural studies on parvalbumin

● Helix E & F form the helix-loop-helix

Intended to describe 3D shape

Slide 4

Ca2+ binding loop

Ca2+ binding site: 5 equatorial and 2 axial ligands coordinate a central Ca2+

● Pentagonal bipyramid coordination
● All contacts are from the 12 residue loop

Highly conserved Asp and Glu side-chains make 5 contacts with the Ca2+ (remaining contacts are from the main chain and/or a bridging water)

Numbering represents sequence position within the 12 residue loop.

Positions 7 and 9 contribute main-chain contacts and residues at these positions are not conserved

Slide 5

Sequence Conservation in Helix-turn-Helix Motifs

Primary Sequence Logo for known Helix-turn-Helix Motifs

Represents all known residues at each position in an alignment of all helix-turn-helix motifs.

Schematic representation of a helix-turn-helix motif (n stands for non-polar)

Conservation (y-axis) vs sequence position (x-axis) is shown by the height of each position, and the frequency of residues at each sequence position is shown by the height of the individual one-letter codes
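The two heights in a sequence logo can be computed directly from one column of an alignment. The sketch below is a minimal illustration, assuming the common convention that conservation is the information content in bits (log2(20) minus the Shannon entropy for a 20-letter protein alphabet) and leaving out the small-sample correction that logo programs usually apply. The function names and the toy columns are hypothetical.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, no small-sample correction."""
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def letter_heights(column, alphabet_size=20):
    """Height of each one-letter code: its frequency times the column's information."""
    counts = Counter(column)
    total = sum(counts.values())
    info = column_information(column, alphabet_size)
    return {residue: (n / total) * info for residue, n in counts.items()}


# Toy columns (hypothetical data): an invariant Asp and a mixed position.
invariant = "DDDDDDDD"
mixed = "DDDDNNSS"
print(round(column_information(invariant), 2))
print({res: round(h, 2) for res, h in letter_heights(mixed).items()})
```

An invariant column reaches the maximum stack height, while a column in which all 20 residues are equally frequent has a height of zero.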

Slide 6

Sequence Conservation in Helix-loop-Helix Motifs

Primary Sequence Logo and Helix-loop-helix structure

[Figure: Calmodulin helix-loop-helix]

Position 1 is invariant: Asp coordinates axial position of Ca2+

Positions 3 and 5: two Asp (or oxygen containing residues) coordinate equatorial positions of Ca2+

Position 12: Glu that forms bidentate (two) interactions with equatorial positions of Ca2+

Position 6: Gly due to main-chain conformation

Remaining highly conserved positions contain non-polar residues that stabilize the helix-loop-helix structure

Slide 7

Structural comparison of Helix-loop-helix motifs

Structural superposition: Least-squares minimization of the differences between atomic coordinates allows structures to be superposed (positioned relative to a common origin)

Graphical representation of structural similarities and differences

Superposition of several structures from families of Helix-loop-helix Ca2+ binding proteins

A) Calmodulin family
B) Parvalbumin family
C) Troponin family

In all cases, the helix-loop-helix motifs adopt similar structures (with small root mean-square deviations)

Slide 8

Structural comparison of Helix-loop-helix motifs

Structural superposition: another example

Structural superposition of Calmodulin bound to specific inhibitors:

Grey: Calmodulin bound to four trifluoperazine (TFP) molecules

Blue/Red: Calmodulin bound to KAR-2

The structural differences observed for calmodulin bound to different inhibitors are comparable to the differences between Calmodulin family members (previous slide, panel A)

Slide 9

Conformational Change upon substrate binding

Ca2+ free (apo): Helices of Helix-loop-Helix are roughly parallel and the loop directs conserved Asp/Glu residues into bulk solvent

Ca2+ bound (holo): Helices of Helix-loop-Helix are roughly orthogonal and the loop wraps around Ca2+, directing the Asp/Glu residues at the ion
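The "roughly parallel" versus "roughly orthogonal" description can be expressed as an interhelical angle. The sketch below shows one simple way to estimate it, assuming each helix axis is approximated by the vector between the centroids of its first and last four Cα atoms. That approximation, the function names and the idealized coordinates are assumptions made for illustration, not the method behind the slide.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def helix_axis(ca_coords, window=4):
    """Approximate helix axis: centroid of the last `window` Cα atoms minus the first."""
    if len(ca_coords) < 2 * window:
        raise ValueError("helix is too short for the chosen window")
    start = centroid(ca_coords[:window])
    end = centroid(ca_coords[-window:])
    return tuple(e - s for s, e in zip(start, end))


def interhelical_angle(ca_helix_a, ca_helix_b):
    """Angle in degrees between the approximate axes of two helices."""
    a = helix_axis(ca_helix_a)
    b = helix_axis(ca_helix_b)
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp rounding error
    return math.degrees(math.acos(cos_angle))


# Idealized straight "helices" (hypothetical coordinates, 1.5 Å rise per residue).
along_x = [(1.5 * i, 0.0, 0.0) for i in range(10)]
along_y = [(0.0, 1.5 * i, 5.0) for i in range(10)]
print(interhelical_angle(along_x, along_x))  # same direction
print(interhelical_angle(along_x, along_y))  # perpendicular axes
```

Because each axis runs from the N-terminal to the C-terminal end of its helix, two helices that pack side by side in opposite directions give an angle near 180° rather than 0°.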

Conformation change upon Ca2+ binding uncovers a hydrophobic surface that is a protein and peptide binding site

Facilitates role in signal transduction as the hydrophobic surface modulates the activity of proteins that it binds

e.g. Calmodulin dependent protein kinase

[Figure label: Exposed non-polar surface]

Slide 10

Helix-loop-Helix Family

Canonical EF-hands: Helix-loop-helix motifs with 12 residue loops that bind Ca2+ using conserved Asp/Glu residues at positions 1, 3, 5 and 12. Note: this includes all previously discussed Helix-loop-Helix proteins

Pseudo EF-hands: Helix-loop-Helix motif of the N-terminus of S100 proteins. Loop of 14 residues that binds Ca2+ using carbonyl oxygens at positions 1, 4, 6 and 9

[Figure label: S100 Apo]

Slide 11

Helix-loop-Helix Proteins

Functional Roles: Two classes of Helix-loop-Helix proteins

1) Signaling proteins
Larger group that includes calmodulin, troponin and S100
- all undergo Ca2+ dependent conformational change

2) Transport/Buffering proteins
Calbindin D9K only
- does not undergo conformational change

Phylogenetic tree for Helix-loop-Helix family proteins
- circles=canonical, squares=pseudo, solid are known to bind Ca2+

Slide 12

Helix-loop-Helix Proteins (Humans)

More than 100 human proteins contain a Helix-loop-Helix functional motif (2009)

Slide 13

Helix-loop-Helix Diversity

Eucaryotes: Helix-loop-helix proteins have primary roles in signal transduction.

Q: Do procaryotes have helix-loop-helix proteins?
A: Yes

Procaryotic helix-loop-helix proteins are more diverse than eucaryotic proteins.
Role in maintaining Ca2+ homeostasis and signaling in bacteria

Procaryotic helix-loop-helix proteins have a greater diversity of loop sizes (9, 10, 12, 15) and interhelical packing

[Figure: Divergent procaryotic EF-hand like protein. Different loop size and interhelical packing with same Ca2+ coordination]

Slide 14

Detecting Helix-loop-Helix Proteins

Sequence pattern: Derived from structural studies and primary sequence alignments

Automated identification of Helix-loop-Helix Proteins is successful in > 80% of cases

ProSite

Automated annotation of primary sequence based upon known functional motifs identified from sequence
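A sequence pattern of this kind can be applied with an ordinary regular expression. The sketch below converts a pattern written in PROSITE-style syntax (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats) into a Python regex and scans a sequence with it. The helper name `prosite_to_regex` is hypothetical, only this subset of the syntax is handled, and the EF-hand pattern string is transcribed from memory as an assumption, so it should be checked against the database entry before any real use. The test sequence is the first Ca2+ binding loop of calmodulin with a few flanking residues added for illustration.

```python
import re

# One pattern element: optional '<' anchor, a residue specification,
# an optional repeat count, and an optional '>' anchor.
_ELEMENT = re.compile(
    r"(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (subset of the syntax) into a Python regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, low, high, end = match.groups()
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # excluded residues
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)


# Assumed EF-hand loop pattern (verify against the database before relying on it).
EF_HAND = "D-{W}-[DNS]-{ILVFYW}-[DENSTG]-[DNQGHRK]-{GP}-[LIVMC]-[DENQSTAGC]-x(2)-[DE]-[LIVMFYW]."
ef_hand_regex = prosite_to_regex(EF_HAND)

sequence = "AAAADKDGDGTITTKELGTV"  # illustrative fragment, not a full protein
for hit in re.finditer(ef_hand_regex, sequence):
    print(hit.start() + 1, hit.group())  # 1-based start position and matched loop
```

The pattern encodes the conserved positions discussed earlier in the lecture: Asp at position 1, oxygen-containing residues at positions 3 and 5, and Asp or Glu at position 12. Less constrained positions are written as exclusions or broad residue sets.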

Slide 15

Comparing Protein Structures

Quantifying Structural Similarity: Domain fold classifications (e.g. SCOP, CATH, etc.) are based upon backbone 'structural' similarity between proteins of known structure

Difficult to quantify!
Virtually impossible to come up with a single value that represents 3D structural similarity

Techniques for quantifying structural similarity: Most (all?) approaches are based upon the superposition or structural alignment of 2 (or more) structures.

Superpositions or structural alignments are typically calculated by minimizing the RMSD (root mean square deviation) of equivalent atomic coordinates

$$\mathrm{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \lVert x_i - y_i \rVert^2}$$

where x_i and y_i are the coordinates of equivalent atoms in the two structures (x and y) being superposed, and N is the number of equivalent atom pairs

Note: there are many different algorithms for calculating superpositions that primarily differ with respect to the amount of user input required and the underlying mathematics
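As one concrete instance, the sketch below uses the Kabsch method, a widely used algorithm that finds the rotation minimizing the RMSD between two sets of equivalent atoms after both have been centred on the origin. It is offered as an illustration only and is not necessarily the algorithm behind the figures in these slides. It assumes NumPy is available and that the user has already decided which atoms are equivalent; the function name and the test coordinates are hypothetical.

```python
import numpy as np


def kabsch_rmsd(x, y):
    """Minimum RMSD between two (N, 3) arrays of equivalent atoms after superposition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 2 or x.shape[1] != 3:
        raise ValueError("x and y must both have shape (N, 3)")

    # Remove the translation by moving both centroids to the origin.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)

    # The SVD of the 3x3 covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(xc.T @ yc)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate x onto y and apply the RMSD definition.
    diff = xc @ rotation.T - yc
    return float(np.sqrt((diff ** 2).sum() / len(x)))


# Hypothetical coordinates: y is x rotated 90 degrees about z and then translated.
x = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
y = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
y = y + np.array([5.0, -2.0, 3.0])
print(kabsch_rmsd(x, y))
```

Since the two coordinate sets differ only by a rigid-body motion, the printed value should be zero to within rounding error. Real structures leave a residual RMSD that depends on which atoms were paired.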

Slide 16

Comparing Protein Structures

RMSD from superposition:

RMSD values are strongly dependent upon:
1) atoms used for superposition (main chain, side chain, domain, …)
2) size of protein (and resolution of structure)
3) large outliers (i.e. regions of structure that are far apart in the superposition)
4) insertions and deletions in primary sequence

Without information regarding the number and identity of atoms used in the superposition, the RMSD values are largely meaningless
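A small arithmetic example shows how strongly a single outlier can move the value. The sketch below computes the RMSD from per-atom deviations and compares a well-superposed core with the same core plus one distant atom. All numbers are hypothetical, and the deviations are assumed to be fixed, whereas in a real calculation the outlier would also shift the superposition itself.

```python
import math


def rmsd_from_deviations(deviations):
    """RMSD from the per-atom distances (in Å) between equivalent atoms."""
    if not deviations:
        raise ValueError("at least one deviation is required")
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Hypothetical case: 100 Cα pairs that each deviate by 0.5 Å.
well_superposed = [0.5] * 100
# Same structures, but one atom in a flexible loop ends up 20 Å away.
one_outlier = [0.5] * 99 + [20.0]
# Restricting the calculation to the 99 core atoms removes the outlier.
core_only = one_outlier[:99]

for label, devs in [
    ("all atoms, no outlier", well_superposed),
    ("all atoms, one outlier", one_outlier),
    ("core atoms only", core_only),
]:
    print(f"{label}: {len(devs)} atoms, RMSD = {rmsd_from_deviations(devs):.2f} Å")
```

One atom out of a hundred is enough to raise the RMSD roughly fourfold, which is why the atom selection has to be reported alongside the number.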

Examples
Identical protein structures (300 residues) in different space groups (i.e. independent X-ray structure determinations); homologs (same length) sharing 80%, 50% and 30% sequence identity

Structures compared     Superposition   Residues   ~RMSD
Identical protein       Cα atoms        All        0.5 Å
Identical protein       All atoms       All        1.5 Å
80% identity homolog    Cα atoms        All        1.0 Å
50% identity homolog    Cα atoms        All        1.5 Å
30% identity homolog    Cα atoms        All        2.0 Å

Note: Superposition of NMR and X-ray structures generally produces (slightly) larger RMSDs than superpositions of two NMR or two X-ray structures
- Experimental differences account for this observation

Slide 17

Comparing Protein Structures

Entire polypeptide (141 residues):
RMSD (all Cα atoms) 2.21 Å

N-terminal domain (74 residues):
RMSD (all Cα atoms) 1.35 Å

In virtually all cases, loops (and straps) are sites of greatest divergence in the superposed structures.

Superposition of Calmodulin (blue) and Troponin C (red)
- 45% identity with 2 insertions / deletions

Two views of the superposition of the N-terminal domains of Calmodulin (blue) and Troponin C (red)

[Figure label: Divergent loop]

Slide 18

Comparing Protein Structures

S100A and TNC (72 residues):
RMSD (all Cα atoms) 3.02 Å

Parvalbumin and TNC (100 residues):
RMSD (all Cα atoms) 2.27 Å

Superposition of Parvalbumin (blue) and Troponin C (red)
- 23% sequence identity and 4 insertions / deletions

Superposition of the pseudo-EF hand S100A (blue) and Troponin C (red)
- 22% sequence identity and 2 insertions / deletions

[Figure label: Divergent segments]