p3d @EuroSciPy2010 by C. Fufezan

High-throughput structural bioinformatics using Python & p3d. Dr. C. Fufezan, Institute for Biochemistry and Biotechnology of Plants (IBBP)

Transcript of p3d @EuroSciPy2010 by C. Fufezan

Page 1: p3d @EuroSciPy2010 by C. Fufezan

High-throughput structural bioinformatics using Python & p3d

Dr. C. Fufezan, Institute for Biochemistry and Biotechnology of Plants (IBBP)

Page 2: p3d @EuroSciPy2010 by C. Fufezan

Overview

Background

p3d overview

example ATP binding site

Fufezan, C. and Specht, M. (2009) BMC Bioinformatics 10, 258

http://p3d.fufezan.net

http://github.com/fu/p3d clone us - fork us!

Page 3: p3d @EuroSciPy2010 by C. Fufezan

Background

Page 4: p3d @EuroSciPy2010 by C. Fufezan

Background

chain(s) of amino acids ...N D R P A I M K

... form proteins

Figure legend: oxygen, nitrogen, carbon

Page 6: p3d @EuroSciPy2010 by C. Fufezan

Background

chain(s) of amino acids ...N D R P A I M K

... form proteins

and some bind cofactors, e.g. ATP (adenosine triphosphate)

Figure legend: oxygen, nitrogen, carbon

Page 7: p3d @EuroSciPy2010 by C. Fufezan

Background

Knowledge-based approaches to elucidate structural factors that are essential for cofactor binding:
- protein engineering
- protein folding
- cofactor tuning

Journal cover: Proteins: Structure, Function, and Bioinformatics (ISSN 0887-3585), Volume 73, Number 3, November 15, 2008, pages 527–794, featuring "A PDB Survey of Heme Ligands in Proteins"

Morozov et al. (2004) PNAS, 101, 6946-
Huang et al. (2004) PNAS, 101, 5536-
Fufezan et al. (2008) Proteins, 73, 690-
Negron et al. (2009) Proteins, 74, 400-
Fufezan (2010) Proteins, in press

Page 8: p3d @EuroSciPy2010 by C. Fufezan

p3d

Page 9: p3d @EuroSciPy2010 by C. Fufezan

p3d overview

A Python module that allows you to
- access and manipulate protein structure files
- rapidly develop new screening tools
- easily incorporate complex queries

Fufezan, C. and Specht, M. (2009) BMC Bioinformatics 10, 258

http://p3d.fufezan.net

Page 10: p3d @EuroSciPy2010 by C. Fufezan

Figure: atoms of one residue, labelled N, CA, CB, CG1, CG2, C, O

ATOM Object attributes:
- x, y, z
- idx
- atype (atom type)
- aa (amino acid)
- chain
- resid
- beta (beta or temperature factor)
- user
- protein-object
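To make the attribute list concrete, here is a minimal sketch of how one fixed-width ATOM record of a PDB file could be turned into such an object. It is not p3d's own parser: the `Atom` dataclass and `parse_atom_line` are hypothetical names, the coordinates are made up, and filling `user` from the occupancy column is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    idx: int      # atom serial number
    atype: str    # atom type, e.g. "CA"
    aa: str       # residue name, e.g. "PHE"
    chain: str    # chain identifier
    resid: int    # residue number
    x: float
    y: float
    z: float
    user: float   # assumption: taken from the occupancy column
    beta: float   # beta / temperature factor


def parse_atom_line(line):
    """Parse one fixed-width ATOM/HETATM record (PDB column layout)."""
    return Atom(
        idx=int(line[6:11]),
        atype=line[12:16].strip(),
        aa=line[17:20].strip(),
        chain=line[21],
        resid=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        user=float(line[54:60]),
        beta=float(line[60:66]),
    )


# Example record assembled field by field so the column widths stay visible.
line = (
    "ATOM  " "    2" " " " CA " " " "PHE" " " "A" "   2" "    "
    "  11.104" "  13.207" "   2.100" "  1.00" " 20.00"
)
atom = parse_atom_line(line)
print(atom.atype, atom.aa, atom.chain, atom.resid)
```

Because the record is fixed-width, slicing by column is more robust than splitting on whitespace, which breaks when neighbouring fields touch.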

Page 11: p3d @EuroSciPy2010 by C. Fufezan

ATOM Object: idx, atype, aa, chain, x, y, z, user, beta, protein-object

Protein Object: hash with sets such as
- protein
- not-proteinogenic
- chain['A']
- oxygen
- nitrogen
- backbone
- atype['CA']
- ...

Sets: residues, backbone, alpha, oxygen, protein, not protein
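The idea of pre-sorting atoms into named sets can be sketched with plain Python sets. This is an illustration only, not p3d's internal layout: the atom tuples, the shortened `PROTEINOGENIC` list, the key names in `hashes`, and the crude "name starts with O" oxygen test are all assumptions.

```python
from collections import defaultdict

# Hypothetical atoms as (idx, atype, aa, chain); the values are made up.
atoms = [
    (0, "N", "PHE", "A"),
    (1, "CA", "PHE", "A"),
    (2, "O", "PHE", "A"),
    (3, "O", "HOH", "A"),
    (4, "CA", "ASP", "B"),
]

PROTEINOGENIC = {"PHE", "ASP"}     # shortened list, for illustration only
BACKBONE = {"N", "CA", "C", "O"}

# Sort every atom index into the named sets it belongs to.
hashes = defaultdict(set)
for idx, atype, aa, chain in atoms:
    hashes[("chain", chain)].add(idx)
    hashes[("atype", atype)].add(idx)
    if aa in PROTEINOGENIC:
        hashes["protein"].add(idx)
        if atype in BACKBONE:
            hashes["backbone"].add(idx)
    if atype.startswith("O"):      # crude oxygen test, assumption
        hashes["oxygen"].add(idx)

everything = {idx for idx, *_ in atoms}

# Selections become set algebra on the prepared sets.
alpha_carbons_in_a = hashes[("chain", "A")] & hashes[("atype", "CA")]
oxygen_not_protein = hashes["oxygen"] & (everything - hashes["protein"])
print(sorted(alpha_carbons_in_a), sorted(oxygen_not_protein))
```

Building the sets once when the structure is loaded means later selections cost only set intersections and differences rather than a pass over all atoms.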

Page 16: p3d @EuroSciPy2010 by C. Fufezan

ATOM Object: idx, atype, aa, chain, x, y, z, user, beta, protein-object

Tree Object

Page 19: p3d @EuroSciPy2010 by C. Fufezan

Tree Object: query( Vector1, radius )

Vectors do not have to be atoms!
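A radius query on a space-partitioning tree can be sketched as follows. This is a simplified stand-in, assuming axis-aligned splits and plain coordinate tuples; the `Node` class and its parameters are hypothetical, and p3d's own tree may be built differently.

```python
import math


class Node:
    """One node of a small binary space partitioning tree over 3D points."""

    def __init__(self, points, depth=0, max_per_leaf=2):
        self.points = None        # only leaves store points
        self.axis = depth % 3     # split along x, y, z in turn
        if len(points) <= max_per_leaf:
            self.points = points
            return
        points = sorted(points, key=lambda p: p[self.axis])
        middle = len(points) // 2
        self.split = points[middle][self.axis]
        self.low = Node(points[:middle], depth + 1, max_per_leaf)
        self.high = Node(points[middle:], depth + 1, max_per_leaf)

    def query(self, centre, radius):
        """Return all stored points within `radius` of `centre`."""
        if self.points is not None:
            return [p for p in self.points if math.dist(p, centre) <= radius]
        found = []
        # Descend only into the half-spaces the query sphere can reach.
        if centre[self.axis] - radius <= self.split:
            found += self.low.query(centre, radius)
        if centre[self.axis] + radius >= self.split:
            found += self.high.query(centre, radius)
        return found


points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5), (3, 0, 0), (0, 0, 4)]
tree = Node(points)
centre = (0.5, 0.0, 0.0)          # an arbitrary position, not an atom
near = tree.query(centre, 1.0)
print(sorted(near))
```

The query centre is just a coordinate, which is the point of the slide: any vector, such as a geometric centre or a grid point, can serve as the origin of a search.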

Page 20: p3d @EuroSciPy2010 by C. Fufezan

Protein class

List of atom objects (vectors), e.g.
  O1 FME A 1
  CA PHE A 2

BSP Tree

sets (hashes): residues, backbone, alpha, oxygen, protein, not protein

Page 28: p3d @EuroSciPy2010 by C. Fufezan

Protein class

Query function using human-readable syntax, e.g.:

pdb.query('backbone and resid 5..12 and within 5 of resname FME')

pdb.query('resname HOH and within 4 of resname ASP')

pdb.query('oxygen and not protein')

pdb.query('protein and within 4 of', p3d.vector.Vector(x, y, z))

for residueName in pdb.hash['non-aa-resname']:
    targets = pdb.query('protein and within 4 of ( resname ' + residueName + ' and oxygen )')
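How a human-readable query can map onto set operations is sketched below. This toy evaluator is not p3d's parser: it only understands set names joined by `and`, `or` and `not`, evaluated left to right, and it ignores ranges, `within` and parentheses. The function name `evaluate` and the example sets are hypothetical.

```python
def evaluate(expression, named_sets, universe):
    """Evaluate e.g. 'oxygen and not protein' left to right on sets."""
    result = None
    operator = "and"
    negate = False
    for token in expression.split():
        if token in ("and", "or"):
            operator = token
        elif token == "not":
            negate = not negate
        else:
            current = named_sets[token]
            if negate:
                # 'not X' is the complement within all known atoms.
                current = universe - current
                negate = False
            if result is None:
                result = set(current)
            elif operator == "and":
                result &= current
            else:
                result |= current
    return result


# Made-up atom indices sorted into named sets.
universe = {0, 1, 2, 3, 4, 5}
named_sets = {
    "protein": {0, 1, 2, 3},
    "oxygen": {1, 3, 5},
    "backbone": {0, 1},
}

print(evaluate("oxygen and not protein", named_sets, universe))
```

Each keyword only selects a prepared set, so chaining several conditions stays cheap even for large structures.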

Page 29: p3d @EuroSciPy2010 by C. Fufezan

Example ATP binding

Page 30: p3d @EuroSciPy2010 by C. Fufezan

The ATP binding sites

Adenosine triphosphate (ATP)

ΔG°' = -30 kJ mol⁻¹

40 kg / day
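The two numbers on this slide can be combined in a short back-of-the-envelope calculation. It assumes that 40 kg / day refers to the mass of ATP turned over per day (an interpretation of the slide) and uses the molar mass of the ATP free acid, 507.18 g/mol, which is not on the slide.

```python
ATP_MOLAR_MASS = 507.18   # g/mol, free acid C10H16N5O13P3 (not from the slide)
DELTA_G = -30.0           # kJ/mol, from the slide
TURNOVER = 40_000.0       # g/day, i.e. 40 kg / day from the slide

# Amount of ATP hydrolysed per day and the free energy released by it.
moles_per_day = TURNOVER / ATP_MOLAR_MASS
energy_per_day = moles_per_day * DELTA_G
print(f"{moles_per_day:.1f} mol/day, {energy_per_day:.0f} kJ/day")
```

Under these assumptions the turnover is roughly 79 mol of ATP per day, corresponding to about 2400 kJ of free energy.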

Page 31: p3d @EuroSciPy2010 by C. Fufezan

The ATP binding sites

non-redundant set of proteins

24 binding sites

Page 36: p3d @EuroSciPy2010 by C. Fufezan

The ATP binding site

Page 38: p3d @EuroSciPy2010 by C. Fufezan

The ATP binding site

Figure scales: hydropathy index (-4.5 to +4.5); observations (0, 1, 10)

non-redundant set of proteins

24 binding sites
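A hydropathy index running from -4.5 to +4.5 matches the Kyte-Doolittle scale, so the sketch below assumes that scale; the slide does not name it. It averages the index over a set of residues, reusing the sequence fragment N D R P A I M K from the Background slides purely as sample input, not as an ATP binding site.

```python
# Kyte-Doolittle hydropathy values per one-letter amino acid code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(residues):
    """Average hydropathy index of a sequence of one-letter codes."""
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


sequence = "NDRPAIMK"     # sample input taken from the Background slides
score = mean_hydropathy(sequence)
print(f"{score:.2f}")
```

A negative mean indicates a predominantly hydrophilic environment, a positive one a hydrophobic environment.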

Page 39: p3d @EuroSciPy2010 by C. Fufezan

Summary

p3d allows rapid development of Python scripts to screen protein structures

p3d combines vectors, sets and a BSP tree

p3d allows flexible and complex queries using human-readable language

Page 40: p3d @EuroSciPy2010 by C. Fufezan

Acknowledgements

M. Specht

Prof. Dr. M. Hippler

Funding by the DFG and the Alexander von Humboldt Stiftung