drug target prediction using semantic linked data
DRUG TARGET PREDICTION USING SEMANTIC LINKED DATA
Bin Chen, Ph.D. candidate, School of Informatics and Computing, Indiana University
http://cheminfo.informatics.indiana.edu/~binchen
CSHALS, Feb 23, 2012
Beyond Data Integration
Chem2Bio2RDF.ORG
What is a drug target?
Why drug target prediction?
How to predict drug targets?
From the ligand (drug) perspective
• Substructure
• Side effect
• Chemical ontology
• Gene expression profile
[Diagram: Drug 1 binds Target 1; the link between Drug 2 and Target 1 is marked "?"]
From the target perspective
• Sequence
• 3D structure
• Gene Ontology
• Ligand
[Diagram: Drug 1 binds Target 1; the link between Drug 1 and Target 2 is marked "?"]
[Diagram: network around troglitazone. Nodes: Troglitazone, Rosiglitazone, Pioglitazone, Eicosapentaenoic Acid, PPARG, PPARA, ACSL4, hypoglycemic drug, PPAR signaling pathway, Response to nutrient. Edge labels: bind, chemical ontology, pathway, GO.]
Semantic Linked Network
Topology is important for association
[Diagrams: example paths that associate Cmpd 1 with Protein 1 through intermediate nodes such as Cmpd 2 and Protein 2. Edge labels: hasSubstructure, bind, hasGO, PPI, hasSideEffect.]
Semantics is important for association
[Diagram: paths through different kinds of intermediate node (GO:00001, hypertension, substructure1) alongside a path made of bind edges only]
SLAP (Semantic Link Association Prediction)
Data schema
Path finding
Drug: Troglitazone
Target: PPARG
>300k nodes, >1 million edges
Statistical Model---edge weight
[Diagram: Target I with edges labeled PPI (three, one of them to Target J), hasGO, hasPathway and bind]
P(i → j) = 1/3
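One way to read the 1/3 in the edge-weight slide is as the chance of reaching one particular neighbor when stepping from a node over an edge of a given type. The slides do not state the formula, so the sketch below is an assumption: the function name `edge_weights` and the neighbors `Target K`, `Target L` and `Pathway X` are hypothetical.

```python
from collections import defaultdict

def edge_weights(edges):
    """Weight each (source, edge_type, target) edge by
    1 / (number of edges of that type leaving the same source).
    This is an assumed reading of the slide, not the published formula."""
    edges = list(edges)
    out_degree = defaultdict(int)
    for source, edge_type, _ in edges:
        out_degree[(source, edge_type)] += 1
    return {
        (source, edge_type, target): 1.0 / out_degree[(source, edge_type)]
        for source, edge_type, target in edges
    }

# Toy neighborhood: Target I has three PPI edges, one of which reaches Target J.
toy_edges = [
    ("Target I", "PPI", "Target J"),
    ("Target I", "PPI", "Target K"),          # hypothetical neighbor
    ("Target I", "PPI", "Target L"),          # hypothetical neighbor
    ("Target I", "hasPathway", "Pathway X"),  # hypothetical neighbor
]
weights = edge_weights(toy_edges)
print(weights[("Target I", "PPI", "Target J")])
```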
Statistical Model---path raw score
[Diagram: paths from Cmpd1 (s) to Target 1 (t), including one through Target2, over edges e1, e2, e3, e4]
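The raw-score formula itself is not in the transcript. A minimal sketch, assuming the raw score of a path is the product of the weights of its edges (e1, e2, ...), so that long paths and paths through heavily connected nodes score lower. The name `path_raw_score` and the toy weights are hypothetical.

```python
from math import prod

def path_raw_score(weights_on_path):
    """Multiply the edge weights along one path from source s to target t.
    Assumption: the product is used; the slides do not give the formula."""
    return prod(weights_on_path)

# Toy path with three edges whose weights are 1/2, 1/3 and 1.
raw = path_raw_score([0.5, 1.0 / 3.0, 1.0])
print(raw)
```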
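Path pattern
[Diagram: a path pattern is the sequence of node and edge types along a path, here Cmpd and Protein nodes joined by three bind edges. Path examples are concrete paths over specific nodes, with labels such as Cmpd1, Protein2 and Protein1, or Cmpd3, Protein1, Cmpd4 and Protein5.]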
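To make the difference between a path and its pattern concrete, here is a small sketch that replaces every node on a path by its type and keeps the edge labels. The list layout (node, edge, node, ...), the function name `path_pattern` and the node order in the example are assumptions for illustration.

```python
def path_pattern(path, node_type):
    """Map a concrete path [node, edge, node, edge, ...] to its pattern by
    swapping each node (even positions) for its type; edge labels stay."""
    return tuple(
        node_type[item] if index % 2 == 0 else item
        for index, item in enumerate(path)
    )

# Hypothetical type lookup and an example path with three bind edges.
node_type = {
    "Cmpd3": "Cmpd", "Cmpd4": "Cmpd",
    "Protein1": "Protein", "Protein5": "Protein",
}
example = ["Cmpd3", "bind", "Protein1", "bind", "Cmpd4", "bind", "Protein5"]
pattern = path_pattern(example, node_type)
print(pattern)
```

Two different paths share a pattern exactly when this function returns the same tuple for both.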
• Randomly sample 100,000 drug-target pairs, yielding 453,087 paths in 35 patterns
• Plot the score distribution for each path pattern
• Convert each path score to a path z score
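The conversion step can be sketched as a per-pattern standardization: collect the scores of sampled paths by pattern, then express a new path score in standard deviations from its pattern's mean. Whether the original work standardizes raw or transformed scores is not stated here, so treat this as an assumption; `pattern_stats` and `path_z_score` are hypothetical names.

```python
from collections import defaultdict
from statistics import mean, pstdev

def pattern_stats(scored_paths):
    """scored_paths: iterable of (pattern, score) from randomly sampled pairs.
    Returns {pattern: (mean, population standard deviation)}."""
    by_pattern = defaultdict(list)
    for pattern, score in scored_paths:
        by_pattern[pattern].append(score)
    return {p: (mean(v), pstdev(v)) for p, v in by_pattern.items()}

def path_z_score(pattern, score, stats):
    """Standardize one path score against the distribution of its pattern."""
    mu, sigma = stats[pattern]
    if sigma == 0:
        return 0.0  # degenerate pattern: every sampled path scored the same
    return (score - mu) / sigma

# Toy sample: three paths of a pattern "A" and one path of a pattern "B".
stats = pattern_stats([("A", 1.0), ("A", 2.0), ("A", 3.0), ("B", 5.0)])
z = path_z_score("A", 3.0, stats)
print(z)
```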
Path Pattern Score Distribution
Statistical Model---path z score
Statistical Model---association score
Fit association scores of random pairs to normal distribution
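A sketch of how a fitted normal distribution turns an association score into a significance value. Two assumptions are made because the formula slide did not survive extraction: the association score of a pair is taken as the sum of its path z scores, and significance is the upper-tail probability under the normal fitted to random pairs. All names and numbers are hypothetical.

```python
from statistics import NormalDist

def association_score(path_z_scores):
    """Assumed aggregation: sum the z scores of all paths between a pair."""
    return sum(path_z_scores)

def fit_null(random_pair_scores):
    """Fit a normal distribution to association scores of random pairs."""
    return NormalDist.from_samples(random_pair_scores)

def p_value(score, null):
    """Upper-tail probability of seeing a score at least this large."""
    return 1.0 - null.cdf(score)

# Toy null sample with mean 0 and sample standard deviation 1.
null = fit_null([-1.0, 0.0, 1.0])
score = association_score([1.0, 2.5])
print(score, p_value(score, null))
```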
Association Score distribution among different pairs
Comparing SLAP with link prediction methods in social science
ROC curve
AUC=0.92
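For readers unfamiliar with the metric: the area under the ROC curve equals the probability that a randomly chosen true pair scores higher than a randomly chosen false pair. A minimal sketch of that rank-based computation follows; the scores are toy values, not data from the talk.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) score pairs ranked
    correctly, counting ties as half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Toy scores: 8 of the 9 positive/negative pairs are ordered correctly.
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
print(auc)
```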
Drug polypharmacology profiles
A polypharmacology profile comprises the association scores of one drug against more than one thousand targets
Moexipril
Rescinnamine
Benazepril
Quinapril
Fosinopril
Trandolapril
Enalapril
Chlorthalidone
Metolazone
Trichlormethiazide
Betaxolol
Alprenolol
Acebutolol
Nadolol
Bisoprolol
Drug similarity network
• Nodes represent drugs
• Two nodes are linked if they are similar in terms of biological function.
• Nodes are colored by their therapeutic indications
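A sketch of how such a network could be assembled from polypharmacology profiles. The similarity measure (Pearson correlation between profiles) and the 0.9 cutoff are assumptions chosen for illustration; the slides do not say which measure or threshold was used. Drug names and profile values are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no ranking information
    return cov / sqrt(var_x * var_y)

def similarity_network(profiles, threshold=0.9):
    """Link two drugs when their profiles correlate at or above threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if pearson(profiles[a], profiles[b]) >= threshold
    ]

# Toy profiles over three targets (a real profile has over a thousand).
profiles = {
    "drugA": [1.0, 2.0, 3.0],
    "drugB": [2.0, 4.0, 6.1],
    "drugC": [3.0, 1.0, 2.0],
}
network_edges = similarity_network(profiles)
print(network_edges)
```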
Insomnia related drugs
Dissimilar Drugs have same indication
Drug repurposing
Anti-Parkinson
allergic rhinitis
http://www.ebi.ac.uk/chebi/searchId.do?chebiId=3398
?
Summary
Semantic link association can be assessed by the topology and semantics of the network
Domain knowledge plays an important role!
Team
Prof. Ying Ding Prof. David Wild
http://chem2bio2rdf.org/slap
Thanks!
Backup slides
SLAP Pipeline
Path filtering
Chen, B., Dong, X., Jiao, D., Wang, H., Zhu, Q., Ding, Y., Wild, D.J. Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data. BMC Bioinformatics, 2010, 11:255
Chem2Bio2RDF Datasets
Chem2Bio2RDF data
Other data vendors
compound, protein/gene, chemogenomics, literature, others
http://chem2bio2rdf.org
Ranking target-associated chemicals
• Randomly select three targets
• Select all target-associated chemicals as positive links
• Randomly select an equal number of chemicals that are not associated with the target
• Compare with naïve Bayes using molecular weight, ALogP, the numbers of hydrogen bond acceptors and donors, the number of rotatable bonds, and FCFP_6 as descriptors
• Leave-one-out validation
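The sampling part of this setup can be sketched as follows: every chemical associated with a target becomes a positive example, and the same number of non-associated chemicals is drawn at random as negatives. The function name, the fixed seed and the toy identifiers are hypothetical; the model comparison and the leave-one-out loop are not shown.

```python
import random

def balanced_pairs(target, associated, all_chemicals, seed=0):
    """Return (chemical, target, label) triples with as many randomly drawn
    negatives (label 0) as there are known positives (label 1)."""
    rng = random.Random(seed)
    candidates = sorted(set(all_chemicals) - set(associated))
    negatives = rng.sample(candidates, len(associated))
    positives = [(chem, target, 1) for chem in sorted(associated)]
    return positives + [(chem, target, 0) for chem in negatives]

# Toy data: two chemicals associated with target "T1" out of five in total.
pairs = balanced_pairs("T1", {"c1", "c2"}, ["c1", "c2", "c3", "c4", "c5"])
print(pairs)
```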
Two objects are related if they are related to the same objects
Coauthorship
Same Target
Two objects are related if their related objects are related
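The first rule is the idea behind common-neighbor link prediction: the more neighbors two nodes share, the more likely they are to be linked. A minimal sketch on a toy graph follows; the graph, the node names and the function name are hypothetical.

```python
def common_neighbors(graph, a, b):
    """Count the nodes adjacent to both a and b.
    graph: dict mapping each node to the set of its neighbors."""
    return len(graph[a] & graph[b])

# Toy bipartite example: two drugs that bind partly overlapping targets.
graph = {
    "drug1": {"targetA", "targetB"},
    "drug2": {"targetB", "targetC"},
    "drug3": {"targetD"},
}
shared = common_neighbors(graph, "drug1", "drug2")
print(shared)
```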
Similar Drugs have distinct indications
Levodopa: dopaminergic agent; anti-Parkinson drug
Methyldopa: antiadrenergic; antihypertensive drug
SLAP similarity: p value > 0.05
Tanimoto coefficient = 0.89
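For reference, the Tanimoto coefficient between two structural fingerprints is the number of features they share divided by the number of features present in either. A minimal sketch over sets of fingerprint bit positions follows; the bit sets are toy values and do not correspond to levodopa or methyldopa.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)

# Toy fingerprints sharing three of five distinct bits.
similarity = tanimoto({1, 2, 3, 4}, {2, 3, 4, 5})
print(similarity)
```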
Direct: drug and target interact with each other physically
Indirect: indirect interaction (e.g., change in gene expression)
Random: random drug-target pairs
Association Score distribution among different pairs