Potential Drug Target Discovery on PPI Networks

Post on 05-Feb-2017

Raheleh Salari, SFU

Potential Drug Target Discovery on PPI Networks

• Pathogens are becoming more drug resistant; infectious diseases are on the rise.

• Emerging diseases (e.g. avian flu) may result in a global pandemic!

• Rational drug design: the search for magic bullets is failing.

• Combinatorial therapies needed – multiple drug targets.

Computational identification of drug targets

• Protein-protein interaction (PPI) networks: edges = interactions, nodes = proteins.

• Goal: Identify protein targets on PPI networks whose “removal” disrupts several “essential” pathways/complexes and their possible “backup” paths on the PPI network.

• Targets should have no human orthologs.

Associated PPI subnetwork

Example: H. pylori chemotaxis pathway

PPI networks + pathways

• Strategy: aim to disrupt all the possible communication paths between “endpoint” pairs of essential pathogenic pathways (multicut).

• Weighted node sparsest cut:

– Input: node weights (large for human orthologs; small for essential proteins, surface proteins, easy targets) and essentiality of source/sink pairs (quantifies how important a pathway is to survival)

– Output: a node cut C that minimizes W(C) / ecc(C)

  • W(C) = total weight of nodes on C

  • ecc(C) = total essentiality of the pathways disrupted
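The objective W(C) / ecc(C) can be made concrete on a toy network. The sketch below is an illustration only, not the DSC or LP algorithm from the talk: it scores a candidate node cut and finds the best small cut by brute force. The graph, the weights, the essentiality values, and the function names (`reachable`, `cut_ratio`, `best_cut`) are all assumptions made up for the example, as is the choice to treat a pathway as disrupted when its sink is no longer reachable from its source.

```python
from itertools import combinations

def reachable(adj, start, removed):
    """Nodes reachable from `start` once the nodes in `removed` are deleted."""
    if start in removed:
        return set()
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def cut_ratio(adj, weight, pairs, cut):
    """Return W(C) / ecc(C), or None if the cut disrupts no pathway."""
    cut = set(cut)
    w = sum(weight[v] for v in cut)
    # A pair counts as disrupted when its sink cannot be reached from its source.
    ecc = sum(e for s, t, e in pairs if t not in reachable(adj, s, cut))
    return w / ecc if ecc > 0 else None

def best_cut(adj, weight, pairs, max_size=3):
    """Brute force over small cuts of non-endpoint nodes (toy sizes only)."""
    endpoints = {x for s, t, _ in pairs for x in (s, t)}
    candidates = sorted(v for v in adj if v not in endpoints)
    best = (None, None)
    for k in range(1, max_size + 1):
        for cut in combinations(candidates, k):
            r = cut_ratio(adj, weight, pairs, cut)
            if r is not None and (best[0] is None or r < best[0]):
                best = (r, set(cut))
    return best

# Hypothetical network: pathway 1 runs s1 -> t1 via a, with b as a backup path;
# pathway 2 runs s2 -> t2 via b, with h (a human ortholog, heavy weight) as backup.
adj = {
    "s1": ["a", "b"], "t1": ["a", "b"],
    "s2": ["b", "h"], "t2": ["b", "h"],
    "a": ["s1", "t1"], "b": ["s1", "t1", "s2", "t2"], "h": ["s2", "t2"],
}
weight = {"a": 1, "b": 1, "h": 10}
pairs = [("s1", "t1", 2), ("s2", "t2", 1)]  # (source, sink, essentiality)

ratio, cut = best_cut(adj, weight, pairs)
print(ratio, sorted(cut))
```

In this toy case no single node disconnects either pathway because of the backup paths, so the cheapest useful cut removes both a and b and leaves the heavy human ortholog h untouched.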

Approximation algorithms

• DSC: # endpoint pairs = O(log n)

– O(log n) approximation by trivial generalization of multi-cut algorithms (check every subset of source/sink pairs)

– O(n^3 log^2 n) [Goldberg & Tarjan 88]

• LP: # source/sink pairs unbounded

– O(n^(1/2)) approximation

– polynomial rounding algorithm [Hajiaghayi & Raecke 07]

• Identical results on H. pylori PPI network, slight differences on E. coli PPI network

Input: E. coli signaling pathways

Pathway | Source | Sink | Target(s) | Method(s)
OmpR Family | PhoR | PhoA | - | -
OmpR Family | TorS | TorA | - | -
NarL Family | NarG | NarI | - | -
NarL Family | NarQ | FrdA | dnaK* | DSC, LP
DNA polymerase | dnaE | holA | - | -
DNA polymerase | dnaE | holD | - | -
ABC Transporter | CysP | CysA | - | -
Bacterial Chemotaxis | Tar | MotA | cheW* | DSC, LP
Bacterial Chemotaxis | DPPA | MotA | mtlD | DSC

Input: E. coli essential complexes

Complex | Source | Sink | Target(s) | Method(s)
RNA Polymerase | infB | rpoN | - | -
RNA Polymerase | hepA | greB | rpoA*+, rpoB*+, rplC*, rpoC*, rpsB*, rpsE* | DSC, LP
ACP | Ffh | aidB | - | -
IscS | hscA | fdhD | lpdA*, lysU, aceF*, aceE, iscS*, rpsE* | DSC, LP
DNA polymerase | sbcB | priA | - | -
Ribosome associated | hlpA | uvrC | - | -
Ribosome associated | cafA | kdsA | - | -

Input: H. pylori signaling pathways

Pathway | Source | Sink | Target(s) | Method(s)
Ribosomal Proteins | rplD | rplP | - | -
Ribosomal Proteins | rplI | rpsF | HP1223 | DSC, LP
Protein Export | SecD | YidC | - | -
Type III Secretion Sys. | FliF | FlhA | - | -
Type IV Secretion Sys. | cag12 | trbI | HP0933 | DSC, LP
Two component Sys. | TrpB | TrpE | HP0452 | DSC, LP
Two component Sys. | AtoS | AtoB | HP0149 | DSC, LP
Flagellar Assembly | FliG | FliN | HP0823 | DSC, LP
Flagellar Assembly | FliD | FlhA | FabE | DSC, LP
Bacterial Chemotaxis | CheW | MotB | - | -
ABC Transporters | OppA | OppF | msrAB | DSC, LP
DNA Polymerase | dnaE | dnaN | HP0241 | DSC, LP

PPI networks only

• Strategy: aim to disrupt as many “potential” pathways as possible (balanced cut).

• Minimum weighted node separator problem: C is a β-balanced separator if C partitions V into V’ and V’’ s.t. min{|V’|, |V’’|} > β·|V|

– Input: node weights (small node weights indicate essentiality, targetability etc.; human orthologs have large weight)

– Output: find C with minimum total weight
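The balance condition can be checked mechanically for any candidate separator. The sketch below is a minimal illustration under stated assumptions: the balance parameter is the one written β elsewhere in these slides, the connected components left after deleting C may be grouped freely into the two sides V’ and V’’, and |V| counts all nodes including those in C. The function names and the path graph are hypothetical.

```python
def components(adj, removed):
    """Connected components of the graph after deleting the nodes in `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def is_balanced_separator(adj, cut, beta):
    """True if the leftover components can be grouped into two sides,
    each holding more than beta * |V| nodes."""
    sizes = [len(c) for c in components(adj, set(cut))]
    total = sum(sizes)
    # Subset-sum over component sizes: every size one side could have.
    sums = {0}
    for s in sizes:
        sums |= {x + s for x in sums}
    return any(min(x, total - x) > beta * len(adj) for x in sums)

def separator_weight(weight, cut):
    """Total node weight of the separator (the quantity to minimize)."""
    return sum(weight[v] for v in cut)

# Hypothetical example: a path 1-2-3-4-5-6-7 with unit weights.
adj = {i: [j for j in (i - 1, i + 1) if 1 <= j <= 7] for i in range(1, 8)}
weight = {i: 1 for i in adj}

middle_ok = is_balanced_separator(adj, {4}, beta=0.15)  # sides of size 3 and 3
edge_ok = is_balanced_separator(adj, {2}, beta=0.15)    # sides of size 1 and 5
print(middle_ok, edge_ok, separator_weight(weight, {4}))
```

Removing the middle node splits the path evenly and passes the check; removing a node near the end leaves one side with a single node, which is below 0.15 · 7 = 1.05, so it fails.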

Approximation algorithms, heuristics

• O(log n) approximation [Leighton & Rao 99] performs poorly in practice.

• O(log^(1/2) n) approximation [Arora & Kale 07] is only slightly better.

• Greedy heuristics targeting nodes with maximum degree (GDeg) or betweenness (GBet) perform relatively poorly.

• Heuristics motivated by several combinatorial observations devised (HMWS).
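A greedy degree heuristic in the spirit of GDeg might look like the sketch below. The slides do not spell out GDeg, so the details here are assumptions: nodes are removed one at a time by highest remaining degree (ties broken by node name), node weights are ignored, and the loop stops as soon as the removed set leaves components that can be grouped into two sides, each larger than beta times the total node count. All names and the toy graph are hypothetical.

```python
def components(adj, removed):
    """Connected components of the graph after deleting the nodes in `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def is_balanced(adj, removed, beta):
    """True if leftover components can form two sides, each > beta * |V| nodes."""
    sizes = [len(c) for c in components(adj, removed)]
    total, sums = sum(sizes), {0}
    for s in sizes:
        sums |= {x + s for x in sums}
    return any(min(x, total - x) > beta * len(adj) for x in sums)

def greedy_degree_cut(adj, beta):
    """Remove max-degree nodes until the removed set is a balanced separator.
    Returns the removed set, or None if no balanced separator is reached."""
    removed = set()
    while len(removed) < len(adj):
        if is_balanced(adj, removed, beta):
            return removed

        def live_degree(v):
            # Degree counted only among nodes that are still present.
            return sum(1 for u in adj[v] if u not in removed)

        alive = sorted(x for x in adj if x not in removed)
        removed.add(max(alive, key=live_degree))
    return None

# Hypothetical network: two triangles (a, b, c) and (d, e, f) joined through hub h.
adj = {
    "h": ["a", "b", "d", "e"],
    "a": ["b", "c", "h"], "b": ["a", "c", "h"], "c": ["a", "b"],
    "d": ["e", "f", "h"], "e": ["d", "f", "h"], "f": ["d", "e"],
}

cut = greedy_degree_cut(adj, beta=0.15)
print(cut)
```

Here the hub has the highest degree and happens to be a good separator. On real PPI networks high-degree nodes need not lie between balanced halves, which is consistent with the slide's remark that such greedy heuristics perform relatively poorly.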

Comparison of HMWS, GDeg and GBet methods

E. coli pathways disrupted (cut size 28, β=0.15)

1. Ribosome
2. Pyruvate metabolism
3. Butanoate metabolism
4. Citrate cycle (TCA cycle)
5. Glycolysis/Gluconeogenesis
6. Alanine and aspartate metabolism
7. Glycine, serine and threonine metabolism
8. Valine, leucine and isoleucine degradation
9. Pyrimidine metabolism
10. Purine metabolism
11. RNA polymerase
12. Lysine biosynthesis
13. Aminoacyl-tRNA biosynthesis
14. Two Component (NarL family) *
15. Bacterial Chemotaxis *
16. ABC transporters (Iron complex) *

E. coli known drug targets (re)discovered (cut size 28, β=0.15)

Gene Name | Drug
rpoA | Rifabutin
rpoB | Rifampin, Rifaximin
rpsJ | Nitrofurantoin
rpsD | Clomocycline, Demeclocycline, Doxycycline, Lymecycline, Minocycline, Oxytetracycline, Tetracycline, Tigecycline

H. pylori disrupted pathways (cut size 17, β=0.15)

1. Purine metabolism
2. Pyrimidine metabolism
3. RNA polymerase
4. Caprolactam degradation
5. Flagellar assembly
6. Urease complex
7. Ribosomal proteins *
8. Oxidative phosphorylation (F-type ATPase) *
9. Epithelial cell signaling in H. pylori infection *
10. DNA polymerases *
11. Bacterial chemotaxis *
12. Oxidative phosphorylation (f-type ATPase) *
13. Protein export (Sec dependent pathway) *
14. Two-component system – NtrC family *
15. ABC transporters (Iron complex) *
16. Flagellar assembly *
17. Type IV secretion system *

Acknowledgements

• Cenk Sahinalp (SFU, CompBio)

• Fereydoun Hormozdiari (SFU, CompBio)

• Vineet Bafna (UCSD)

• Phuong Dao (SFU, CompBio)

• SFU CTEF: Bioinformatics for combating infectious diseases program

• NSERC, CRC program, MSFHR

HMWS

1. RWB: compute Random Walk Betweenness for all nodes – in O(n^3) time on a sparse graph

2. Split: returns an initial cut s.t. every connected component < (1n nodes

3. Merge: partitions the components into two sets, each with > βn nodes

4. Cut: do it all over again