Protein docking by LZerD, KiharaLab at CAPRI meeting 2016
Human and Server CAPRI Protein Docking Prediction Using LZerD with Combined Scoring Functions
Daisuke Kihara
Department of Biological Sciences, Department of Computer Science
Purdue University, Indiana, USA
http://kiharalab.org
CAPRI Round 30 Results
(Lensink et al., CAPRI30 group paper, 2016)
Overview of Protein Docking Prediction Using LZerD in CAPRI
Single Chain Modeling: HHPred, SparksX, MUFold, TASSER, Phyre2, TASSERlite, MultiCom
PRESCO
Sub-unit models
LZerD
~50,000 docking models
Clustering, RMSD < 5 Å
Re-ranking with scoring functions
10 models
MD relaxation
Submit
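The clustering step (RMSD < 5 Å) could look roughly like the sketch below. The slides do not say which clustering algorithm is used, so this is only an assumed greedy scheme: models are taken best-first, and a model starts a new cluster unless it is within the cutoff of an existing representative. The function names and the toy coordinates are hypothetical, and the models are assumed to share a common frame so that no superposition is needed.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def greedy_cluster(models, cutoff=5.0):
    """Return indices of cluster representatives.

    `models` is assumed to be ordered best-first (an assumption, not stated in
    the slides). A model becomes a representative only if it is at least
    `cutoff` Å away from every representative kept so far.
    """
    reps = []
    for idx, coords in enumerate(models):
        if all(rmsd(coords, models[r]) >= cutoff for r in reps):
            reps.append(idx)
    return reps

# Toy example: three two-atom "ligand" poses.
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],    # 0.5 Å from model 0
    [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)],  # 10 Å from model 0
]
representatives = greedy_cluster(models)
print(representatives)
```

With ~50,000 models the all-against-representatives comparison is the cost to watch; the greedy form keeps it proportional to the number of clusters rather than the number of models squared.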
LZerD (Local 3D Zernike descriptor-based Docking program) (Lizard)
Figure labels: normal vector, 3D Zernike descriptor, 6 Å, interface area
(Venkatraman, Yang, Sael, & Kihara, BMC Bioinformatics, 2009)
3D Zernike Descriptors (3DZD)
An extension of spherical harmonics based descriptors
A 3D object can be represented by a series of orthogonal functions, thus practically represented by a series of coefficients as a feature vector
Compact
Rotation invariant
Figure: A surface representation of 1ew0A (A) is reconstructed from its 3D Zernike invariants of the order 5, 10, 15, 20, and 25 (B-F). (Sael & Kihara, 2009)
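$Z_{nl}^{m}(r, \vartheta, \varphi) = R_{nl}(r)\, Y_{l}^{m}(\vartheta, \varphi)$
$Y_{l}^{m}(\vartheta, \varphi)$: spherical harmonics, $R_{nl}(r)$: radial functions
$Z_{nl}^{m}(r, \vartheta, \varphi)$: polynomials in Cartesian coordinates
Zernike moments: $\Omega_{nl}^{m} = \frac{3}{4\pi} \int_{|\mathbf{x}| \le 1} f(\mathbf{x})\, \overline{Z_{nl}^{m}(\mathbf{x})}\, d\mathbf{x}$
Zernike Descriptor: $F_{nl} = \sqrt{\sum_{m=-l}^{m=l} \left(\Omega_{nl}^{m}\right)^{2}}$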
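The Zernike descriptor collapses the moments over the index m into one rotation-invariant number per (n, l) pair. A minimal sketch of that last step follows; it assumes the complex moments are already computed, uses the squared magnitude of each moment (an assumption for complex values), and relies on the usual 3DZD convention that l runs from 0 to n with (n - l) even. The function names are hypothetical.

```python
import math

def zernike_invariants(moments):
    """Collapse moments over m into one invariant per (n, l).

    `moments` maps (n, l) to a list of 2l+1 complex moments (m = -l..l).
    Each invariant is the Euclidean norm of that list.
    """
    return {
        nl: math.sqrt(sum(abs(w) ** 2 for w in ws))
        for nl, ws in moments.items()
    }

def num_invariants(order):
    """Number of (n, l) pairs with 0 <= l <= n <= order and (n - l) even."""
    return sum(n // 2 + 1 for n in range(order + 1))

# Toy moments for a single (n, l) = (1, 1) term; the values are made up.
toy = {(1, 1): [3 + 0j, 0j, 4j]}
invariants = zernike_invariants(toy)
print(invariants[(1, 1)], num_invariants(20))
```

Under that convention an expansion of order 20 gives a feature vector of 121 numbers, which is what makes the descriptor compact.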
Protein Residue Environment SCOre (PRESCO)
Figure labels: Center; along the main-chain; within a sphere of 6 or 8 Å
(Kim & Kihara, Proteins 2014)
Finding Similar Side-Chain Depth Environment (SDE) from a database
Structure Database: 2536 proteins
500 lowest RMSD fragments of 9 side-chain centroids; superimposed with the query fragment
Select SDE with the same number of side-chain centroids in the sphere of 8.0 Å
Query SDE
Compute RMSD of residue-depth for corresponding side-chain centroids
Sort by depth RMSD to the query
Figure label: surface
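The last three steps of the SDE search (keep candidates with the same number of centroids, compute the residue-depth RMSD, sort) can be sketched as follows. This is only an illustration: it assumes the fragment retrieval and superposition are already done, that corresponding centroids appear in the same order in both lists, and that depths are plain floats in Å. The function names and candidate labels are hypothetical.

```python
import math

def depth_rmsd(query_depths, cand_depths):
    """RMSD between residue depths of corresponding side-chain centroids."""
    n = len(query_depths)
    return math.sqrt(
        sum((q - c) ** 2 for q, c in zip(query_depths, cand_depths)) / n
    )

def rank_similar_sde(query_depths, candidates):
    """Rank candidate environments by depth RMSD to the query.

    `candidates` is a list of (name, depths). Candidates whose number of
    centroids differs from the query are discarded first.
    """
    same_size = [(name, d) for name, d in candidates if len(d) == len(query_depths)]
    return sorted((depth_rmsd(query_depths, d), name) for name, d in same_size)

query = [1.0, 2.0, 3.0]
candidates = [
    ("b", [2.0, 3.0, 4.0]),  # every depth off by 1 Å
    ("a", [1.0, 2.0, 3.0]),  # identical to the query
    ("c", [1.0, 2.0]),       # different centroid count, filtered out
]
ranked = rank_similar_sde(query, candidates)
print(ranked)
```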
CASP11 Free Modeling Category Ranking (Model 1)
(http://www.predictioncenter.org/casp11/zscores_final.cgi?formula=assessors)
(Kim & Kihara, Proteins 2015)
DFIRE, GOAP, ITScore Scoring Functions
DFIRE (Yaoqi Zhou): statistical distance-dependent atom contact potential using the finite ideal-gas reference state
GOAP (Jeff Skolnick): DFIRE * orientation dependent term
ITScore (Xiaoqin Zou): iteratively refined statistical distance-dependent atom contact potential
The BindML Algorithm
(La D, & Kihara D, Proteins 2012)
Generating Substitution Models
iPfam (505 Families)
iPfam Dataset Benchmark
ROC based on 449 Protein Complexes
BindML Webserver
http://kiharalab.org/bindml
(Wei Q, La D, & Kihara D, Methods in Mol.Biol. In press 2016)
T79 (Round 30)
(Interface 2) Kihara: 3 hits; LZerD: 1 hit
Homodimer
LZerD runs:
  No-interface prediction
  With BindML-consPPISP prediction
LZerD selection strategy: Consensus of ITScore and GOAP; 5 from no-interface, 5 from BindML-consPPISP
Kihara selection strategy: Manual combination of ITScore, GOAP, DFIRE, and PRESCO; 10 from no-interface
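The slides do not say how the consensus of ITScore and GOAP is formed for the server selection. One plausible reading, shown here purely as an assumption, is a rank-sum consensus: rank the models under each score, add the ranks, and take the models with the smallest totals. The function names, model ids, and score values are hypothetical, and lower scores are assumed to be better.

```python
def ranks(scores):
    """Map model id -> rank (1 = best), assuming a lower score is better."""
    ordered = sorted(scores, key=scores.get)
    return {model: i + 1 for i, model in enumerate(ordered)}

def consensus_top(score_tables, k):
    """Pick the k models with the smallest summed rank across all tables.

    Only models present in every table are considered; ties are broken by
    model id so the result is deterministic.
    """
    rank_tables = [ranks(t) for t in score_tables]
    common = set.intersection(*(set(t) for t in score_tables))
    total = {m: sum(rt[m] for rt in rank_tables) for m in common}
    return sorted(total, key=lambda m: (total[m], m))[:k]

# Made-up scores for three docking models.
itscore = {"m1": -10.0, "m2": -8.0, "m3": -5.0}
goap = {"m1": -3.0, "m2": -9.0, "m3": -1.0}
selected = consensus_top([itscore, goap], k=2)
print(selected)
```

Running the same selection once per docking run and taking five from each would give a ten-model submission split between the two runs.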
T79 Subunit Model Quality
Chain A RMSD: 4.0 Å
Chain B RMSD: 4.0 Å
Figure legend: native, model
T79 Human Selected Model
fnat 0.16, L-RMSD 14.1 Å, I-RMSD 3.8 Å
Figure legend: native, model
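The three numbers reported for the selected model (fnat, L-RMSD, I-RMSD) are the quantities CAPRI uses to grade a prediction. The thresholds below are the commonly cited CAPRI criteria; they are not stated in the slides and are included here as an assumption for illustration. The function name is hypothetical.

```python
def capri_quality(fnat, l_rmsd, i_rmsd):
    """Grade a docking model from fnat and L-RMSD / I-RMSD (both in Å).

    Thresholds follow the commonly cited CAPRI criteria (assumed here),
    checked from the strictest class downwards.
    """
    if fnat >= 0.5 and (l_rmsd <= 1.0 or i_rmsd <= 1.0):
        return "high"
    if fnat >= 0.3 and (l_rmsd <= 5.0 or i_rmsd <= 2.0):
        return "medium"
    if fnat >= 0.1 and (l_rmsd <= 10.0 or i_rmsd <= 4.0):
        return "acceptable"
    return "incorrect"

# Values reported for the T79 human selected model.
t79_quality = capri_quality(fnat=0.16, l_rmsd=14.1, i_rmsd=3.8)
print(t79_quality)
```

Under these thresholds the T79 model qualifies through its I-RMSD rather than its L-RMSD.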
T79 Interface Prediction
Method       Precision   Recall   F-Score
BindML       0           0        NA
Cons-PPISP   0.10        0.18     0.12
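For reference, precision, recall, and F-score for an interface prediction can be computed from sets of residue identifiers as in this sketch. The residue numbers are invented, the function name is hypothetical, and the F-score is returned as None when precision and recall are both zero, matching the "NA" entries in the table.

```python
def interface_metrics(predicted, actual):
    """Return (precision, recall, f_score) for predicted interface residues.

    `predicted` and `actual` are collections of residue identifiers.
    f_score is None when precision + recall is zero (reported as NA).
    """
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, None
    return precision, recall, 2 * precision * recall / (precision + recall)

# Made-up residue numbers: 2 of 4 predictions are correct, 6 true residues.
p, r, f = interface_metrics({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})
print(p, r, f)
```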
T79 Scores (no-interface prediction)
Figure: panels ITScore, GOAP, DFIRE; axes L-RMSD, fnat, I-RMSD
T79 Score Comparison
Figure: pairwise comparison of ITScore, GOAP, and DFIRE
T79 PRESCO scores
Figure: axes L-RMSD, PRESCO; panels With Interface Prediction, Without Interface Prediction
T79 Score performance summary
Run                Score     RFH       Hits in top 10
nointerface        ITScore   1 (62)    3
nointerface        GOAP      1 (72)    3
nointerface        DFIRE     1 (111)   5
BindML-consPPISP   all       -         -
RFH: rank of first acceptable (medium) hit
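The two columns of the summary table, RFH and hits in the top 10, can be derived from a score list and a set of known hits as sketched below. Lower scores are assumed to rank first, and the function names, model ids, and values are hypothetical. Calling `rank_of_first_hit` once with the acceptable-or-better hits and once with the medium-or-better hits would presumably give the two numbers shown as, for example, "1 (62)".

```python
def rank_of_first_hit(scores, hits):
    """1-based rank of the best-scored model that is a hit, or None."""
    ordered = sorted(scores, key=scores.get)  # lower score first (assumed)
    for rank, model in enumerate(ordered, start=1):
        if model in hits:
            return rank
    return None

def hits_in_top(scores, hits, n=10):
    """Number of hits among the n best-scored models."""
    ordered = sorted(scores, key=scores.get)
    return sum(1 for model in ordered[:n] if model in hits)

# Made-up scores for five models; "c" and "e" are the hits.
scores = {"a": -9.0, "b": -7.5, "c": -7.0, "d": -3.0, "e": -1.0}
hits = {"c", "e"}
rfh = rank_of_first_hit(scores, hits)
top3 = hits_in_top(scores, hits, n=3)
print(rfh, top3)
```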
T91 (Round 30)
Kihara: 8 hits; LZerD: 2 hits
Homodimer
LZerD runs:
  No-interface prediction (with our monomer model)
  With BindML+consPPISP interface prediction
  Zhang1 CASP server model, no-interface prediction
Server selection strategy: 10 from no-interface
Human selection strategy: Consensus of ITScore, GOAP, PRESCO, and visual inspection; 5 from no-interface, 5 from Zhang1
T91 Subunit Models
Chain C: Our model: RMSD 6.0 Å; Zhang: RMSD 4.9 Å
Chain D: Our model: RMSD 6.5 Å; Zhang: RMSD 5.7 Å
Figure legend: native, Our model, Zhang1
T91 Human Selected Model
Figure legend: model, native
fnat 0.33, L-RMSD 9.0 Å, I-RMSD 4.2 Å
T91 Interface Prediction
Method       Precision   Recall   F-Score
BindML       0.64        0.20     0.30
Cons-PPISP   0.50        0.28     0.36
T91 Score (no interface prediction)
Figure: panels ITScore, GOAP, DFIRE; axes L-RMSD, fnat, I-RMSD
T91 Scores (With Interface prediction)
Figure: panels ITScore, GOAP, DFIRE; axes L-RMSD, fnat, I-RMSD
T91 Scores (Zhang models)
Figure: panels ITScore, GOAP, DFIRE; axes L-RMSD, fnat, I-RMSD
T91 Zhang1 Score Comparison
Figure: pairwise comparison of ITScore, GOAP, and DFIRE
T91 PRESCO Scores
Figure: axes L-RMSD, PRESCO; panels Without Interface Prediction, Docking with Zhang models
Top 5 models selected from each
T91 Score Performance Summary
Run           Score     RFH      Hits in top 10
nointerface   ITScore   2        2
nointerface   GOAP      2        1
nointerface   DFIRE     1        2
interface     ITScore   1042     0
interface     GOAP      165      0
interface     DFIRE     116      0
zhang1        ITScore   1 (4)    5
zhang1        GOAP      2 (16)   5
zhang1        DFIRE     1 (6)    6
RFH: rank of first acceptable (medium) hit
T96 (Round 31)
Heterodimer
Predictor hits: 0 (5 by other groups)
Scorer hits: human 1, server 0 (1 by other group)
Human: 6 selected by PRESCO, 4 selected using the predicted interface, ITScore, GOAP, and DFIRE
No PDB file for the native structure available: metrics computed using two scorer hits (average L-RMSD/I-RMSD, max fnat)
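Because no native structure was available, each model is evaluated against the two scorer hits and the results are combined as stated: the average for L-RMSD and I-RMSD, the maximum for fnat. A small sketch of that combination step, with a hypothetical function name and invented input values:

```python
def combine_reference_metrics(per_reference):
    """Combine metrics of one model measured against several reference hits.

    `per_reference` is a list of (fnat, l_rmsd, i_rmsd) tuples, one per
    reference. Returns (max fnat, mean L-RMSD, mean I-RMSD).
    """
    n = len(per_reference)
    if n == 0:
        raise ValueError("at least one reference is required")
    fnat = max(m[0] for m in per_reference)
    l_rmsd = sum(m[1] for m in per_reference) / n
    i_rmsd = sum(m[2] for m in per_reference) / n
    return fnat, l_rmsd, i_rmsd

# Invented values for one model compared against two references.
combined = combine_reference_metrics([(0.2, 6.0, 2.0), (0.4, 8.0, 3.0)])
print(combined)
```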
T96 scorer hits
S39.M03 (Haliloglu): fnat 0.22, L-RMSD 5.68 Å, I-RMSD 2.44 Å
S31.M06 (Kihara): fnat 0.32, L-RMSD 7.99 Å, I-RMSD 2.67 Å
Figure labels: Chain A, Chain B
T96 interface prediction
Chain   Method        Precision   Recall   F-score
A       BindML        0.15        0.2      0.17
A       Cons-PPISP    0           0        NA
B       BindML        0.12        0.11     0.12
B       Cons-PPISP*   NA          NA       NA
*Cons-PPISP predictions were only for the N-terminal tail; visual inspection suggests that the N-terminal tail is not likely a binding site, so these predictions were not used.
T96 Scorer-Models Scores
Figure: panels ITScore, GOAP, DFIRE; axes L-RMSD, fnat, I-RMSD
T96 Score Performance Summary
Score     RFH   Hits in top 10
ITScore   529   0
GOAP      6     1
DFIRE     125   0
RFH: rank of first acceptable hit
• The hit for GOAP/DFIRE is the same model picked by PRESCO
Summary
Our docking prediction procedure runs LZerD, and decoys were selected by combining DFIRE, ITScore, GOAP, and PRESCO. Binding sites were predicted by BindML and cons-PPISP.
On the examples shown, PRESCO's performance was not as spectacular as we expected from its performance on single-chain structure prediction.
DFIRE, ITScore, and GOAP showed similar, reasonably good performance.
Scoring function performance depends on subunit model quality.
The way to use BindML prediction needs to be improved.
Lab Members
@kiharalab
Lenna Peterson
Hyung-Rae Kim