NetBioSIG2014-Talk by Salvatore Loguercio
Author: alexander-pico

Network-augmented Genomic Analysis (NAGA) Applied to Cystic Fibrosis studies
Salvatore Loguercio, Ph.D. [email protected]
@sal99k http://sulab.org
July 11, 2014 Network Biology SIG – ISMB 2014

Cystic fibrosis overview
• Inherited recessive chronic disease causing chest infections, lung damage, and bowel obstruction.
• 30,000 children and adults in the US (70,000 worldwide); 1,000 new cases diagnosed each year.
• Predicted median age of survival for a person with CF: late 30s.
• Primary therapy: airway clearance techniques (ACT)
Source: Cystic Fibrosis Foundation

CFTR and mucus flow
Source: http://www.flickr.com/photos/ajc1/3737955649
• Mutations cause the body to produce unusually thick, sticky mucus
• Clogs the lungs and leads to life-threatening lung infections
• Obstructs the pancreas and stops natural enzymes from helping the body break down and absorb food

CFTR mutations affect protein folding and export
[Figure: CFTR trafficking schematic. Labels: ER, Golgi, endosomes, lysosome, apical surface, chloride conductance, degradation; SDS-PAGE bands B and C for WT and ΔF508. ΔF508 CFTR cannot exit the ER. Credit: Bill Balch]

A systematic approach to CF correction
Cell line: CFBE
Functional: siRNA screen
ΔF508 CFTR against PN library*
368 siRNAs that significantly rescue CFTR function
*Collection of 2,500 siRNAs targeting proteins involved in protein homeostasis (‘proteostasis’)
Biochemical: MudPIT proteomics
775 differentially interacting proteins (WT / ΔF508-CFTR)

Connect Functional with Biochemical data

Connect siRNA hits to a target through the Human Interactome
[Figure: network schematic; labels: Target, 1, 2, 3]
I) Compute all shortest paths from siRNA hits to the target through a weighted protein interaction network (Dijkstra algorithm)
II) Prioritize connecting proteins specific to the set of high-scoring siRNA hits considered.
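Step I can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the talk: it assumes an undirected network stored as a dict of dicts with non-negative edge costs, the names `dijkstra`, `path_to`, `connectors` and the toy graph are hypothetical, and it recovers one shortest path per hit rather than all tied shortest paths.

```python
import heapq

def dijkstra(graph, source):
    """Return (dist, prev) for lowest-cost paths from source.

    graph: dict mapping node -> dict of neighbor -> non-negative edge cost.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def path_to(prev, source, node):
    """Rebuild the path source -> node from the predecessor map (None if unreachable)."""
    path = [node]
    while path[-1] != source:
        if path[-1] not in prev:
            return None
        path.append(prev[path[-1]])
    return path[::-1]

def connectors(graph, hits, target):
    """Map each hit to the intermediate proteins on one shortest path to the target."""
    # Assumption: the network is undirected, so a single search from the target
    # gives the shortest path for every hit.
    _, prev = dijkstra(graph, target)
    result = {}
    for hit in hits:
        p = path_to(prev, target, hit)
        if p is not None:
            result[hit] = p[1:-1][::-1]  # drop both endpoints, order from hit to target
    return result

# Toy network (hypothetical protein names and costs).
toy = {
    "HIT1": {"A": 1.0},
    "HIT2": {"A": 1.0, "B": 0.5},
    "A": {"HIT1": 1.0, "HIT2": 1.0, "TARGET": 1.0},
    "B": {"HIT2": 0.5, "TARGET": 2.0},
    "TARGET": {"A": 1.0, "B": 2.0},
}
print(connectors(toy, ["HIT1", "HIT2"], "TARGET"))
```

In the toy network both hits reach the target through protein A, so A is the kind of connecting protein that step II would then score.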

I. Build integrated PPI network
II. Run Shortest Path analysis
III. Control for unrelated protein hubs

Publicly available interaction data: from 10 source databases and 11 studies
14,796 proteins, 169,625 interactions
Quality score in [0, 1] for each interaction, based on experimental evidence*
*Source: Human Integrated Protein-Protein Interaction reference (HIPPIE)
d = 9; average path length: 3.6
I. Build a weighted protein interaction network – include MS data
+ Experimental interactome (nodes + edges)
Updated scores, based on databases and experimental interactome:
$$S(u,v) = 2 - S_{exp} - S_{db}$$
$$S_{exp} = \begin{cases} 1 & \text{if } e(u,v) \in \text{exp} \\ 0 & \text{otherwise} \end{cases}$$
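The scoring rule can be read as an edge cost for the shortest path search: interactions seen in the experimental interactome and well supported in the databases get a low cost. The sketch below is one possible reading, under two assumptions that the slide does not state: S_db is the database quality score in [0, 1], and it is taken as 0 for an edge found only in the experimental interactome. The function names are hypothetical.

```python
def edge_key(u, v):
    """Order-independent key for an undirected edge."""
    return tuple(sorted((u, v)))

def edge_cost(s_db, in_experiment):
    """S(u,v) = 2 - S_exp - S_db, with S_exp = 1 if the edge is in the experimental set, else 0."""
    s_exp = 1.0 if in_experiment else 0.0
    return 2.0 - s_exp - s_db

def build_weighted_network(db_scores, exp_edges):
    """Merge database and experimental edges into a dict-of-dicts cost graph.

    db_scores: dict mapping edge_key(u, v) -> database score in [0, 1].
    exp_edges: set of edge_key(u, v) observed in the experimental interactome.
    """
    graph = {}
    for edge in set(db_scores) | set(exp_edges):
        u, v = edge
        # Assumption: an edge absent from the databases has S_db = 0.
        cost = edge_cost(db_scores.get(edge, 0.0), edge in exp_edges)
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph

# Hypothetical example: one edge in both sources, one database-only, one experiment-only.
db = {edge_key("CFTR", "P1"): 0.8, edge_key("P1", "P2"): 0.5}
exp = {edge_key("CFTR", "P1"), edge_key("CFTR", "P3")}
network = build_weighted_network(db, exp)
```

With this rule every cost lies in [0, 2], so the costs are non-negative and Dijkstra's algorithm applies directly.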

Randomization – select proteins specific for the set of siRNA hits
[Figure: randomization workflow. siRNA library → randomly select a subset of the same size as the target set → shortest path analysis to the target → repeat n times → randomized “hubness” for each connecting node]
For each protein connecting siRNA hits to the target, compute:
Nsp: number of distinct siRNA hits that use the protein on their shortest path to the target
Nrnd: randomized Nsp
$$p\text{-value} = \frac{\mathrm{sum}(N_{rnd} \geq N_{sp})}{\mathrm{length}(N_{rnd})}$$
Nsp, Nrnd and the associated p-value are used to prioritize connecting proteins specific to the set of siRNA hits considered
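The randomization and the empirical p-value can be sketched as follows. This is an illustration under assumptions, not the talk's implementation: `count_usage` is a hypothetical caller-supplied function that runs the shortest path analysis for a set of siRNAs and returns, per connecting protein, how many of them route through it; the random subset is assumed to have the same size as the set of siRNA hits; and `n_iter` and `seed` are illustrative parameters.

```python
import random

def empirical_p_value(n_sp, n_rnd):
    """sum(Nrnd >= Nsp) / length(Nrnd): share of random draws at least as large as observed."""
    return sum(1 for x in n_rnd if x >= n_sp) / len(n_rnd)

def randomized_counts(library, n_hits, count_usage, n_iter=1000, seed=0):
    """Collect the randomized counts (Nrnd) for every connecting protein.

    library: list of all siRNA targets in the screened library.
    n_hits: size of each random subset (assumed equal to the number of real hits).
    count_usage: hypothetical function, subset -> {protein: number of subset
                 members whose shortest path to the target uses the protein}.
    """
    rng = random.Random(seed)
    draws = []
    proteins = set()
    for _ in range(n_iter):
        counts = count_usage(rng.sample(library, n_hits))
        draws.append(counts)
        proteins.update(counts)
    # A protein that is not used in a given draw counts as 0 for that draw.
    return {p: [c.get(p, 0) for c in draws] for p in proteins}

# A protein used by 3 real hits, compared with four randomized counts.
print(empirical_p_value(3, [0, 1, 3, 5]))  # 0.5
```

A protein that is a generic hub will show high counts in most random draws and therefore a large p-value, while a protein specific to the hit set will rarely reach its observed count by chance.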

CFTR – PN connectors – first degree – real vs. randomized
[Figure: real vs. randomized counts for first-degree connectors; label: Nsp ≥ 3]
Select: Nsp ≥ 3, Nsp/Nrnd ≥ 2 (12 proteins)
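The selection rule can be written as a simple filter. The sketch below assumes that Nrnd in the ratio is the mean of the randomized counts, which the slide does not specify, and it treats a protein never used in the random draws as passing the ratio test; `select_connectors` and its parameters are hypothetical names.

```python
def select_connectors(n_sp, n_rnd, min_nsp=3, min_ratio=2.0):
    """Return proteins with Nsp >= min_nsp and Nsp / mean(Nrnd) >= min_ratio.

    n_sp: dict protein -> observed count over the real siRNA hits.
    n_rnd: dict protein -> list of counts over the random draws.
    """
    selected = []
    for protein, observed in n_sp.items():
        rnd = n_rnd.get(protein, [])
        mean_rnd = sum(rnd) / len(rnd) if rnd else 0.0
        # Assumption: a zero randomized mean means the ratio is unbounded, so it passes.
        passes_ratio = mean_rnd == 0 or observed / mean_rnd >= min_ratio
        if observed >= min_nsp and passes_ratio:
            selected.append(protein)
    return sorted(selected)

# Hypothetical counts: only A meets both thresholds.
observed = {"A": 4, "B": 2, "C": 6}
randomized = {"A": [1, 2, 1, 2], "B": [0, 0], "C": [4, 4]}
print(select_connectors(observed, randomized))  # ['A']
```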

Assessing candidate regulators
42 candidate regulators
• 31 previously screened: 22 (71%) previously identified as hits
• 11 novel genes: 8 (73%) validate in de novo experiments

Validation of predicted protein targets
siRNA screen CFTR rescue of function
8/11 (73%) novel candidate regulators validate

Validation of predicted targets - Specificity
siRNA screen CFTR rescue of function
New condition: Vx-809 drug
X: predicted : validated
Columns: Gene Symbol; Solo vs. MudPIT; Vx809 vs. MudPIT
SRRM1 x
CDC5L x
NDKB x
TPR x
AIFM1 x
2ABB x
KPCD2 x
PLSCR1 x
MAP3K14 x
TFG x x
XRCC5 x x
CTNB1 x
XPO1 x
MCM7 x
WDR61 x
PP2AB x
H2AFX x
MYC x

Validation of predicted targets - Coverage
Restrain flow through a subset of direct interactors
siRNA screen CFTR rescue of function
X: predicted : validated
Columns: Gene Symbol; Solo vs. MudPIT (partial); Solo vs. MudPIT (full); Vx809 vs. MudPIT (full)
SRRM1 x x
EIF3L x
STAU1 x
CAN2 x
SNRPA x
AUP1 x
Good specificity, sub-optimal coverage

Summary
• NAGA is a network-based method to integrate functional genomics data (e.g. siRNA screens) with interactomics datasets (e.g. AP-MS, MudPIT)
• Useful for prioritizing novel functional targets and for identifying relevant network modules
• It leverages publicly available information on protein-protein interactions and thus is readily applicable to many scenarios where a connection between functional and biochemical data is sought
• Good specificity, coverage to be improved

Contact [email protected]
@sal99k http://sulab.org
Andrew Su
Su Lab
William Balch
Darren Hutt
Daniela Roth
Chao Wang
Anita Pottekat
Sumit Chanda
Stephen Soon
Dieter Wolf
Trey Ideker
Anne Carvunis
Jean Wang
Daniel Quan
Travel funding to ISMB 2014 was generously provided by NSF and the NetBio SIG committee