Protein Folding Programs

By Asım OKUR, CSE 549, November 14, 2002


Page 1: Protein Folding Programs

Protein Folding Programs

By Asım OKUR

CSE 549, November 14, 2002

Page 2: Protein Folding Programs

Protein Structure

DNA Sequence → Protein Sequence → Structure → (Mis)function

It is believed that all the information necessary to determine the structure of a protein is present in its primary sequence.

Page 3: Protein Folding Programs

Protein Folding Programs

• Protein folding is one of the biggest computational challenges

• Different types of folding and structure prediction programs:
  • Simulations
  • Homology Modeling Approaches

Page 4: Protein Folding Programs

Simulations

• Simulate the real behavior of proteins

• High detail, short time scales

• 2 main simulation types:
  • Molecular Dynamics
  • Monte Carlo
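As a minimal sketch of the two simulation types, the Python below applies each to a toy one-dimensional harmonic potential. The potential, the step sizes, and the function names `velocity_verlet` and `metropolis` are illustrative assumptions, not taken from any particular folding program: Molecular Dynamics integrates Newton's equations in time, while Monte Carlo accepts or rejects random trial moves.

```python
import math
import random

def velocity_verlet(force, x, v, mass, dt, n_steps):
    """Molecular dynamics: integrate Newton's equations for one coordinate."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
    return x, v

def metropolis(energy, x, n_steps, max_move=1.0, kT=1.0, seed=0):
    """Monte Carlo: Metropolis sampling of one coordinate at temperature kT."""
    rng = random.Random(seed)
    e = energy(x)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_move, max_move)  # random trial move
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / kT).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
        samples.append(x)
    return samples

# Toy system (assumption): E(x) = x^2 / 2, so the force is -x.
x_md, v_md = velocity_verlet(lambda x: -x, x=1.0, v=0.0,
                             mass=1.0, dt=0.01, n_steps=1000)
samples = metropolis(lambda x: 0.5 * x * x, x=3.0, n_steps=20000)
```

The small time step `dt` in the dynamics loop is what limits such simulations to short time scales: each step advances the system by only a tiny interval, so long folding events need very many steps.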

Page 5: Protein Folding Programs

The Energy Function

• Calculate energies for each particle

• Since long-range interactions are important, the pair-wise interactions should be calculated for each pair of particles

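$$
E_{\text{pair}} = \sum_{\text{bonds}} K_r (r - r_{eq})^2 + \sum_{\text{angles}} K_\theta (\theta - \theta_{eq})^2 + \sum_{\text{torsions}} \frac{V_n}{2} \left[ 1 + \cos(n\phi - \gamma) \right] + \sum_{i<j} \left[ \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}} + \frac{q_i q_j}{\epsilon R_{ij}} \right]
$$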
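The terms of such an energy function can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a single pair of Lennard-Jones coefficients `a` and `b` is used for every pair of particles (real force fields use per-pair values), the Coulomb conversion constant is taken as 1, and all function names are hypothetical.

```python
import math
from itertools import combinations

def bond_energy(k_r, r, r_eq):
    """Harmonic bond-stretching term K_r (r - r_eq)^2."""
    return k_r * (r - r_eq) ** 2

def angle_energy(k_theta, theta, theta_eq):
    """Harmonic angle-bending term K_theta (theta - theta_eq)^2."""
    return k_theta * (theta - theta_eq) ** 2

def torsion_energy(v_n, n, phi, gamma):
    """Periodic torsion term (V_n / 2) [1 + cos(n phi - gamma)]."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))

def nonbonded_energy(coords, charges, a, b, eps=1.0):
    """Sum A/R^12 - B/R^6 + q_i q_j / (eps R) over all pairs i < j.

    Assumption: the same a and b apply to every pair, and units are
    chosen so that the Coulomb constant is 1.
    """
    total = 0.0
    for i, j in combinations(range(len(coords)), 2):
        r = math.dist(coords[i], coords[j])
        total += a / r ** 12 - b / r ** 6 + charges[i] * charges[j] / (eps * r)
    return total

# Two neutral particles at the Lennard-Jones minimum for a=1, b=2 (R = 1).
e_pair = nonbonded_energy([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                          charges=[0.0, 0.0], a=1.0, b=2.0)
```

The double loop over pairs is the expensive part: for N particles it visits N(N-1)/2 pairs, which is why the pair-wise term dominates the cost of a simulation.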

Page 6: Protein Folding Programs

Homology Modeling

• Template Selection and Fold Assignment

• Target – Template Alignment

• Model Building
  • Loop Modeling
  • Sidechain Modeling

• Model Evaluation

Page 7: Protein Folding Programs

Fold Assignment and Template Selection

• Identify all protein structures with sequences related to the target, then select templates

• 3 main classes of comparison methods:
  • Compare the target sequence with each database sequence independently: pair-wise sequence – sequence comparison (BLAST and FASTA)
  • Multiple sequence comparisons to improve sensitivity (PSI-BLAST)
  • Threading or 3-D template matching methods
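The first class of methods, independent pair-wise comparison against every database sequence, can be illustrated with a short Python sketch. It is not BLAST or FASTA: it uses a plain Smith-Waterman local alignment score, the exact dynamic program that those tools approximate heuristically. The scoring values, the tiny `database`, and the template names are all made-up assumptions; real searches use substitution matrices such as BLOSUM62.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

def rank_templates(target, database):
    """Score the target against each database sequence independently."""
    scores = {name: smith_waterman_score(target, seq)
              for name, seq in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical target and template sequences, for illustration only.
target = "MKTAYIAKQR"
database = {
    "template_1": "MKTAYIAKQR",
    "template_2": "GGGGGGGGGG",
    "template_3": "MKTAYLAKQR",
}
ranking = rank_templates(target, database)
```

Here `ranking` lists the identical sequence first, the single-mutation sequence second, and the unrelated sequence last, which is the ordering a template search is meant to produce.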

Page 8: Protein Folding Programs

Target – Template Alignment

• Most important step in Homology Modeling

• A specialized method should be used for alignment
  • Above 40% sequence identity, the alignment is likely to be correct.
  • Regions of low local sequence similarity become common when overall sequence identity is under 40%. (Saqi et al., Protein Eng. 1999)
  • The alignment becomes difficult below 30% sequence identity. (Rost, Protein Eng. 1999)
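The identity thresholds above can be applied to an existing alignment with a small Python sketch. Definitions of percent identity vary; this version assumes identity is counted over columns where neither sequence has a gap, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over aligned columns that contain no gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_target, aligned_template)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

def alignment_outlook(identity):
    """Map percent identity onto the rough regimes quoted on the slide."""
    if identity > 40.0:
        return "alignment likely correct"
    if identity >= 30.0:
        return "regions of low local similarity are common"
    return "alignment is difficult"

# Hypothetical aligned pair: 8 gap-free columns, 7 of them identical.
identity = percent_identity("ACDEFG-HIK", "ACDQFGAH-K")
outlook = alignment_outlook(identity)
```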

Page 9: Protein Folding Programs

Model Building

• Construct a 3-D model of the target sequence based on its alignment to template structures

• Three different model building approaches:
  • Modeling by rigid body assembly
  • Modeling by segment matching
  • Modeling by satisfaction of spatial restraints

• Accuracies of these models are similar

• Template selection and alignment have a larger impact on the model

Page 10: Protein Folding Programs

SWISS-MODEL Web Server

Screenshots from the Homology Modeling Server Swiss-Model

• Construct a framework using known protein structures

• Generate the location of the target amino acids on the framework

• If loop regions are not determined, use an additional database search or short simulations

Page 11: Protein Folding Programs

Procedure of the MODELLER program

• After obtaining restraints, run a geometry optimization or real-space optimization to satisfy them
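The idea of optimizing coordinates until restraints are satisfied can be sketched with a toy example. This is not MODELLER's API or its objective function: the Python below assumes simple harmonic distance restraints between three points and minimizes the total violation by plain gradient descent. The restraint values, starting coordinates, and function names are invented for illustration.

```python
import math

def restraint_violation(coords, restraints):
    """Sum of squared deviations from the target distances."""
    return sum((math.dist(coords[i], coords[j]) - d) ** 2
               for i, j, d in restraints)

def satisfy_restraints(coords, restraints, n_iter=2000, step=0.05):
    """Gradient descent on the violation; returns the optimized coordinates."""
    coords = [list(p) for p in coords]
    for _ in range(n_iter):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, d in restraints:
            diff = [a - b for a, b in zip(coords[i], coords[j])]
            r = math.sqrt(sum(c * c for c in diff)) or 1e-9
            coeff = 2.0 * (r - d) / r  # derivative of (r - d)^2 along diff
            for k in range(3):
                grads[i][k] += coeff * diff[k]
                grads[j][k] -= coeff * diff[k]
        for p, g in zip(coords, grads):
            for k in range(3):
                p[k] -= step * g[k]
    return coords

# Hypothetical restraints (i, j, target distance) on three points.
restraints = [(0, 1, 3.8), (1, 2, 3.8), (0, 2, 6.0)]
start = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, 0.0, 0.3)]
before = restraint_violation(start, restraints)
model = satisfy_restraints(start, restraints)
after = restraint_violation(model, restraints)
```

In this toy case all three restraints can be met exactly, so `after` falls close to zero. Restraints derived from real templates usually conflict to some degree, and the optimization then settles on a compromise.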

Page 12: Protein Folding Programs

Errors in Homology Models

a. Errors in sidechain packing

b. Distortions and shifts in correctly aligned regions

c. Errors in regions without a template

Page 13: Protein Folding Programs

d. Errors due to misalignment

e. Incorrect templates

Page 14: Protein Folding Programs

Model Building Programs

Name         Type  URL
COMPOSER     P     www-cryst.bioc.cam.ac.uk
CONGEN       P     www.congenomics.com/congen/congen.html
CPH models   S     www.cbs.dtu.dk/services/CPHmodels/
DRAGON       P     www.nimr.mrc.ac.uk/~mathbio/a-aszodi/dragon.html
ICM          P     www.molsoft.com
InsightII    P     www.msi.com
MODELLER     P     guitar.rockefeller.edu/modeller/modeller.html
LOOK         P     www.mag.com
QUANTA       P     www.msi.com
SYBYL        P     www.tripos.com
SCWRL        P     www.cmpharm.ucsf.edu/~bower/scrwl/scrwl.html
SWISS-MOD    S     www.expasy.ch/swissmod
WHAT IF      P     www.sander.embl-heidelberg.de/whatif/

Type: P = program, S = web server

Page 15: Protein Folding Programs
Page 16: Protein Folding Programs

Applications

Page 17: Protein Folding Programs

Critical Assessment of protein Structure Prediction (CASP)

Venclovas et al. Proteins, 2001

Page 18: Protein Folding Programs

Critical Assessment of protein Structure Prediction (CASP)

Venclovas et al. Proteins, 2001

Page 19: Protein Folding Programs

Conclusions

• Computer simulations are powerful for showing detailed motions, but they cannot cover long enough time spans to simulate folding for large systems

• Homology Modeling techniques can be successful if the target protein has a known fold
  • The higher the sequence similarity, the more likely the model will be successful
  • With the implementation of better techniques, the errors in fold assignment, alignment, and sidechain and loop modeling are decreasing
  • Theoretically, if at least one member of every possible fold is known, it is possible to predict the structure of every coding sequence to within a certain accuracy