Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does...
-
Upload
dayna-powers -
Category
Documents
-
view
219 -
download
0
Transcript of Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does...
Folds and motifs
•Using SCOP, CATH, CE to find structural homologs.
•What class/architecture, etc does your molecule fall in?
•Use CE to identify structural homologs for your protein of interest.
•Motifs: classic turn types. Extended turn types.
•Use Rasmol to find the glycines in your structure. That are the phi angles? What type of turn is it?
The SCOP database
• Contains information about classification of protein structures and within that classification, their sequences
• http://scop.berkeley.edu
SCOP classification heirarchy
(1) class
(2) fold
(3) superfamily
(4) family
(5) protein
(6) species
global characteristics (no evolutionary relation)Similar “topology” . Distant evolutionary cousins?Clear structural homologyClear sequence
homologyfunctionally identical
unique sequences
protein classes
1. all 2. all 5. multidomain (21)6. membrane (21)7. small (10)8. coiled coil (4)9. low-resolution (4)10. peptides (61)11. designed proteins (17)
number of sub-categories
possibly not complete, or erroneous
class: proteinsFolds:
TIM-barrel (22)
swivelling beta/beta/alpha domain (5)
spoIIaa-like (2)
flavodoxin-like (10)
restriction endonuclease-like (2)
ribokinase-like (2)
chelatase-like (2)
Mainly parallel beta sheets (beta-alpha-beta units)
Many folds have historical names. “TIM” barrel was first seen in TIM. These classifications are done by eye, mostly.
fold: flavodoxin-like
Superfamilies:
1.Catalase, C-terminal domain (1)
2.CheY-like (1)
3.Succinyl-CoA synthetase domains (1)
4.Flavoproteins (3)
5.Cobalamin (vitamin B12)-binding domain (1)
6.Ornithine decarboxylase N-terminal "wing" domain (1)
7.Cutinase-like (1)
8.Esterase/acetylhydrolase (2)
9.Formate/glycerate dehydrogenase catalytic domain-like (3)
10.Type II 3-dehydroquinate dehydratase (1)
3 layers, ; parallel beta-sheet of 5 strand, order 21345
Note the term: “layers”
These are not domains. No implication of structural independence.
Note how beta sheets are described: number of strands, order (N->C)
fold-level similaritycommon topological features
catalase flavodoxin
At the fold level, a common core of secondary structure is conserved. Outer secondary structure units may not be conserved.
Superfamily:Flavoproteins
Flavodoxin-related (7)
NADPH-cytochrome p450 reductase, N-terminal domain
Quinone reductase
These molecules do not superimpose well, but side-by-side you can easily see the similar topology. Sec struct’s align 1-to-1, mostly.
Family: quinone reductasesbinds FAD
NADPH quinone reductase
quinone reductase type 2
Proteins:
Different members of the same family superimpose well. At this level, a structure may be used as a molecular replacement model.
CATH
• Class• Architecture• Topology• Homology
http://www.biochem.ucl.ac.uk/bsm/cath_new/index.html
Remote homologs are more likely than close
ones
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
0 20 40 60 80 100
percent identity for structural homologs
likelihood
Structural conservation is
stronger than sequence conservation
The existence of large numbers of remote homologs shows us that true structural similarity is hard to see in the amino acid sequence.
beta-catenin versus clathrin (from FSSP database)
Structural homology is sometimes difficult to see
Rasmol practice
Using your protein of interest:
Find all glycines.
select gly
Label each alpha-carbon by number
label %r
Measure the phi-angle by eye (align N and CA. R-handed is positive) Write down the residue number and phi angle.
Rasmol exercises
(1) Color the ribbon drawing of 1qajA.pdb (it’s in your home directory) from the N-terminus (blue) to C-terminus (red)
(2) Divide 1qajA.pdb into domains of different color.
(3) In 1qajA.pdb, display only residues 66-87. What are the first and last residues with helical backbone angles?
(4) Find all of the atoms that H-bond to the sidechain of Ser92.
Secondary structure
Alpha helix
Right-handed Overall dipole N+->C- 3.6 residues/turn i->i+4 H-bonds
3 types of Alpha helix
ae
b
f c
g
dTwo ways to display position of sidechain on a helix.
non-polar
amphipathic
polar
For amphipathic and non-polar, sidechains line up in a favorable way.
Helix caps
Please go to http://isites.bio.rpi.edu/
Click on:Glycine alpha (Schellman) C-cap Type 1
Glycine alpha (Schellman) C-cap Type 2
Proline alpha C-cap
Frayed alpha helix
Serine alpha N-cap (capping box)
Exercise: Find a helix cap in 1QAJ. What type is it?
Capping motifs stabilize helices.
Proline helix C-capProline helix C-cap
Sequence pattern=...nppnnpp[HNYF]P[DE]n
structural variability
Locations of non-polar (magenta) and polar (green) sidechains
note: hydrophobic cluster
Pro blocks helixD or E stabilizes tight turn
“n”=non-polar“p”=polar[...]=alternative aa’s
Two Helix motifs
corner(helix-turn-helix)
EF-hand
(binds Ca2+)
corner motif
motif pattern (summary):
nnpp[nH]Gn[PDS][Px]pn
Exercise: Find a corner motif in myoglobin.
“n”=non-polar“p”=polar[...]=alternative aa’s
red=favorableblue = unfavorablegreen=neutral
backbone angles:green=psired=phi
I-sites motif(Also described by A. Efimov)
Find a corner in 1DWT
Download 1dwt from www.rcsb.orgOpen using Rasmol
Rasmol commands:
Display->Cartoons
Color->structure
select alpha
label %m
color labels yellow
Find the corner pattern (sequence and structure).
Can you see the motif in the sequence?
1 GLSDGEWQQV LNVWGKVEAD IAGHGQEVLI RLFTGHPETL EKFDKFKHLK HHHHHHH HHHHHHHHHT HHHHHHHHHH HHHHHTHHHH HTTTTTTT
51 TEAEMKASED LKKHGTVVLT ALGGILKKKG HHEAELKPLA QSHATKHKIP SHHHHHTTHH HHHHHHHHHH HHHHHHHTTT HHHHHHHH HHHHHTS
101 IKYLEFISDA IIHVLHSKHP GDFGADAQGA MTKALELFRN DIAAKYKELG HHHHHHHHHH HHHHHHHHST TSS HHHHHH HHHHHHHHHH HHHHHHHHHT
151 FQG
EF-hand
Exercise: Rasmol 1CLL. Follow instructions on next page
1CLL exercise. How does the EF-hand bind
calcium?restrict hetero and CA (comments in blue...) spacefill (this displays only calciums)click to get residue # (x = residue number)restrict within (10., x)center selected (so you can rotate it easily)Display->ball-and-stick (from menu item. you see atoms and bonds)select protein and within (10., x)Display->sticks (now protein part has think bonds)select hoh and within (10., x) ( selects nearby waters)spacefillcolor cyan (or your favorite water color...)select within (3., x) ( atoms within bonding distance)label ( these are the atoms bonded to calcium x)color labels yellow (..so you can see them)
Calcium causes a conformational change
in the EF hand.red=1CLL
green=1cfd
Create a Rasmol scriptWrite a file using jot or vi (for example: vi script.ras)Put keyboard commands in the script.Start RasmolInvoke the script using the ‘script’ command(for example: script script.ras)
Useful keyboard commands for scripts:
load xxx.pdb (loads a new PDB file)zap (closes PDB file, blanks the display)pause (stops the script until you hit a key)
Create a Rasmol script to survey many
structuresList the pdb files in the xtal directory (ls xtal/*.pdb)
Create a script file (myscript.ras) as follows:
load xtal/file1.pdb (use one of the PDB files listed)backbone 0.6color blueselect hohspacefill 1.0color redset unitcell onpausezapload xtal/file2.pdb
(repeat as before for each file listed)
Run the script: script myscript.ras
Next, change the script and run it again (i.e. change the colors to green and magenta)
Nested scripts
Tired of having to edit the script 10 times over??Write a “nested” script.
Create a file called “mynestedscript.ras”Put in all the commands except the ‘load’ line. Just one copy.The last line is ‘zap’.
Create a file called “mymainscript.ras” containing:
load file1.pdbscript mynestedscript.rasload file2.pdbscript mynestedscript.rasetc. etc. etc
Further exercises: just a few examples!
Select atoms by temperature factor: select temperature > 5000 (Rasmol uses B*100, so this selects all atoms with B > 50.00)
Find crystal contacts: select :B & within (8., :A)(This gets all atoms of chain B that are close to chain A. Your chain IDs may vary.)
Can you determine the crystal form (triclinic, orthorhombic, etc) just by looking at the unit cell? (set unitcell on)
Survey the B-factors of waters in several structures using a nested script. (Use: color temperature)
Use logic operators! Find alpha carbons within 10.A or residue x but not within 6.A. (select within (10.,x) and not within (6., x) )
Practice stereo viewing. (stereo -7 for walleyed viewing)
Proline helix C-capProline helix C-cap
Sequence pattern=...nppnnpp[HNYF]P[DE]n
structural variability
Locations of non-polar (magenta) and polar (green) sidechains
note: hydrophobic cluster
Pro blocks helixD or E stabilizes tight turn
“n”=non-polar“p”=polar[...]=alternative aa’s
Two Helix motifs
corner(helix-turn-helix)
EF-hand
(binds Ca2+)
beta-strandAntiparallel:
Parallel:
middle strand
edge strand
Sequence motifs for beta strands
Amphipathic
Found at the edges of a sheet, or when one side of the sheet is exposed to solvent (i.e. 2-layer proteins).
Found in the buried middle strands of sheets in 3-layer proteins.
Hydrophobic
Note preference for beta-branched aa’s: I,V,T
beta hairpins
Two adjacent antiparallel beta strands = a beta hairpin
Shown are “tight turns”, 2 residues in the loop region (shaded). Hairpins can have as many as 20 residues in the loop region.
hairpin motifs
“Serine beta-hairpin” (my naming). A specific pattern (DPESG) forms an incomplete alpha-helical turn 4-residues long.
“Extended Type-1 hairpin”. A type-1 “tight turn” has only 2 residues in the turn. This motif, more common than the tight turn, has an additional Pro or polar sidechain. Pattern: PDG.
diverging turn
“Diverging turns” have a Type-2 beta turn and two strands that do not pair. The consensus sequence pattern is PDG. The residue before G can be anything polar, but not a D or an N.
Exercise: Find the diverging turn in 1CSK.
Greek key motif
2 3 4 1
beta meander
One of the most common arrangements of four strands.
Two permutations. 2341 and 3214
Exercise: Download 2PLT. Find the Greek key motifs.
motifWhen a helix occurs between two strands, they are often paired in a parallel sheet.
The cross-over from one strand to the next is almost always right-handed. It is not known why this is true.
(NOTE: the cross-over is always right-handed even when the two strands are not paired!!)
TOPS diagramsbeta strand pointing up
beta strand pointing down
alpha helix
NC motif
Note R-handed crossover!
motif1 2 3
12 3
31 2
Only 3 are possible.
(with R-handed crossovers)
Efimov’s theory
A. Efimov showed that almost all protein structures can be classified as being one of 7 trees, each starting with a motif and “growing” by one secondary structure unit at time.
Does this recapitulate evolution? the folding pathway?
Domains
Proteins fold heirarchically.
secondary structure --> motifs --> subdomains-->domains --> multidomain proteins --> complexes
Most domains fold independently of the rest of the chain.
All alpha folds
3-helix bundle 1CUK 156-203
4-helix bundle 1NFN
Globin 1DWT
...many many other
alpha/beta folds
“TIM barrel” 8TIM
“Rossman fold” 1HET 176-318
many others...
Exercise: in 1HET, find the “crevice” between two motifs where NAD binding commonly occurs.
Exercise: Draw TOPS diagram for 1HET 176-318.
all beta folds
“Jelly roll” 1JMU 371-500:D, 2PLT
barrels 1EMC
propellers
beta-helixExercise: Draw TOPS diagrams for the two domains above.
Term projects
Look at the description online.
Choose a molecule to study. Check with me before you start.
Look at “Molecule of the Month” at RCSB.org for inspiration. But don’t pick one of these molecules.