DataAnalytics for Personalized Medicine by Aryya Gangopadhyay, PhD
Data Analytics for Personalized Medicine
Aryya Gangopadhyay, UMBC
Presented at the 3rd International Conference on Personalized Medicine, June 26-29, 2014
Scope
• Big data promise (Pentland et al. 2013)
  – US healthcare industry can save $200 billion per year
• Need complete picture
  – Reality mining (MIT Tech. Review 2008)
  – Socio-demographics
  – EMRs
  – Biological data
• Interactions in the network
  – Topology-based analysis
  – Centrality-based analysis
  – Perturbations (diseases as network perturbations: del Sol et al. 2010)
• Network partitioning
• Visualization
• “Within 10 years every healthcare consumer will be surrounded by a virtual cloud of billions of data points” [Hood et al. 2013]
Big data in healthcare
Interconnections
– Biological processes are interconnected systems
– Analyze interactions
– Resilient against random perturbations
– Vulnerable to targeted attacks
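The contrast between random perturbations and targeted attacks can be illustrated on a toy graph. The sketch below is not taken from the slides: the hub-and-spoke network, the node names, and the choice of "size of the largest connected component" as the damage measure are all assumptions made for illustration.

```python
from collections import deque

def largest_component_size(adj, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best

# Hypothetical toy network: one hub "H" linked to nine peripheral nodes.
adj = {"H": {f"n{i}" for i in range(9)}}
for i in range(9):
    adj[f"n{i}"] = {"H"}

# Targeted attack: delete the highest-degree node.
hub = max(adj, key=lambda n: len(adj[n]))
targeted = largest_component_size(adj, {hub})

# Random failure: average over every possible single-node deletion.
random_avg = sum(largest_component_size(adj, {n}) for n in adj) / len(adj)

print(targeted, random_avg)
```

In this toy case, deleting the hub leaves only isolated nodes, while an average single-node failure leaves most of the network connected.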
CIDeR: Large, multi-dimensional, multimodal, dynamic
Extensions to our previous work
– Updated the network
  • Nodes: 5168 to 9767
  • Edges: 14410 to 27744
– Previous analysis
  • Network characteristics: CC, diameter, path lengths, etc.
  • Node-based analysis
– Developed a new method for identifying effectors and receptors
  • Perturbation analysis
– Extensions
  • How do we partition the network?
  • What criteria to use and why?
  • What are the effects of such partitioning?
Network extracted from CIDeR: 2014
• Nodes: 9767
• Edges: 27744
• Diameter: 15
• # connected components (CC): 89
• Avg. path length (PL): 4.7
• Avg. degree: 2.8
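As a rough illustration of how summary statistics of this kind can be computed, here is a minimal sketch on a made-up edge list. It is not the author's pipeline. It assumes edges are treated as undirected when measuring paths, and that diameter and average path length are taken over reachable pairs only. The reported average degree of 2.8 is close to 27744 / 9767 ≈ 2.84, which suggests an edges-per-node convention; that is an inference, so the sketch reports the ratio under the hypothetical key `edges_per_node`.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def summarize(edges):
    """Basic statistics of a graph given as a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    components, seen = 0, set()
    diameter, total, pairs = 0, 0, 0
    for node in adj:
        dist = bfs_distances(adj, node)
        if node not in seen:          # first visit to this component
            components += 1
            seen.update(dist)
        for other, d in dist.items():
            if other != node:
                diameter = max(diameter, d)
                total += d
                pairs += 1
    return {
        "nodes": len(adj),
        "edges": len(edges),
        "components": components,
        "diameter": diameter,
        "avg_path_length": total / pairs,
        "edges_per_node": len(edges) / len(adj),
    }

# Hypothetical toy data: a 4-node path plus a separate 2-node component.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
stats = summarize(edges)
print(stats)
```

Running one breadth-first search per node is fine for a toy graph; a network with thousands of nodes would normally be handled by a graph library.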
Node centrality measures: correlations
• x = Authority, y = Betweenness Centrality; correlation: 0.8
• x = Clustering Coefficient, y = Betweenness Centrality; correlation: -0.02
• x = Hub, y = Authority; correlation: 0.88
• x = PageRank, y = Authority; correlation: 0.92
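The slides do not say which correlation coefficient was used. Assuming the common Pearson coefficient, a pairwise correlation between two per-node score vectors could be sketched as follows. The score lists are invented for illustration and are not values from the CIDeR network.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-node scores (one entry per node, same node order).
authority = [0.1, 0.2, 0.4, 0.8]
pagerank = [0.05, 0.10, 0.20, 0.40]    # proportional to authority
clustering = [0.8, 0.4, 0.2, 0.1]      # decreases as authority grows

r_same = pearson(authority, pagerank)
r_opposite = pearson(authority, clustering)
print(r_same, r_opposite)
```

A value near 1 means the two measures rank nodes almost identically, a value near 0 means they are unrelated, and a negative value means high scores on one measure go with low scores on the other.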
Correlations of Node Centrality measures
[Figure: pairwise correlation matrix with rows and columns Clustering Coefficient, Hub, Authority, PageRank, Eigenvector Centrality, Betweenness Centrality, Eccentricity]
Overall network characteristics
• PageRank, hub and authority scores are strongly correlated
• Clustering coefficient is negatively correlated with other node centrality measures
• Implications:
  1. Nodes that are strong effectors are also strong receptors
  2. Less central nodes are not connected to each other but mainly with an influential node
  3. Influential nodes are mostly connected to each other
  4. Fully connected sub-graphs are small and rare
Partitioning the graph
• How can we capture the above characteristics?
• Modularity:
• The objective is to maximize Q
• Intuition:
  – Put influential nodes in separate clusters
  – Create dense sub-communities (common neighbors)
• Algorithms (optimal solution is NP-hard: Brandes 2007):
  – Spectral clustering based (Newman 2006)
  – Greedy algorithm (Blondel et al. 2008)
$$Q = \frac{1}{2m} \sum_{l=1}^{k} \sum_{i \in C_l,\, j \in C_l} \left( A_{ij} - \frac{d_i d_j}{2m} \right)$$
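To make the modularity formula concrete, here is a minimal sketch that evaluates Q for a given partition of an undirected graph without self-loops. It reads the symbols in the usual way: A_ij as the adjacency matrix, d_i as the degree of node i, m as the number of edges, and C_l as the l-th cluster. It only scores a partition; it does not search for the maximizing one, and the edge list is a hypothetical example.

```python
def modularity(edges, communities):
    """Modularity Q of a partition of an undirected graph without self-loops.

    Uses the equivalent per-cluster form
        Q = sum_l [ L_l / m - (D_l / (2m))^2 ]
    where L_l is the number of edges inside cluster l and D_l is the sum of
    the degrees of its nodes.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    label = {node: idx for idx, group in enumerate(communities) for node in group}
    internal = [0] * len(communities)
    degree_sum = [0] * len(communities)
    for u, v in edges:
        if label[u] == label[v]:
            internal[label[u]] += 1
    for node, d in degree.items():
        degree_sum[label[node]] += d
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )

# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

q_split = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
q_single = modularity(edges, [{1, 2, 3, 4, 5, 6}])
print(q_split, q_single)
```

Splitting the toy graph into its two triangles gives Q = 5/14, while placing every node in one cluster gives Q = 0, so the split is the better partition under this criterion.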
Clusters formed by maximizing modularity
Dendrogram of top 8 Disease Clusters
Cluster 100
Nodes: 1177 Edges: 2122
Cluster 82
Nodes: 1200 Edges: 2554
K-core
• Objective: Restrict analysis to regions of increased centrality and connectedness
• K-core: largest sub-graph where all nodes have a minimum degree of k (Batagelj 2002).
• K=5 (mode=2 for the entire network)
• Protein Interaction Networks (Wuchty et al. 2005, Hamelin et al. 2008)
Taken from Hamelin et al. 2008
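The k-core definition above can be sketched as repeated pruning: remove any node with fewer than k remaining neighbors until none is left. This is a minimal illustration on a made-up graph, assuming an undirected network stored as a symmetric adjacency dictionary. It is not the tool or implementation used for the slides.

```python
def k_core(adj, k):
    """Return the k-core of an undirected graph as an adjacency dict.

    `adj` maps each node to the set of its neighbors and must be symmetric.
    """
    core = {node: set(nbrs) for node, nbrs in adj.items()}
    queue = [node for node, nbrs in core.items() if len(nbrs) < k]
    while queue:
        node = queue.pop()
        if node not in core:          # already pruned earlier
            continue
        for nbr in core.pop(node):
            core[nbr].discard(node)
            if len(core[nbr]) < k:
                queue.append(nbr)
    return core

# Hypothetical toy graph: a 4-clique {a, b, c, d} with a tail a - e - f.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
         ("b", "d"), ("c", "d"), ("a", "e"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

core3 = k_core(adj, 3)
print(sorted(core3))   # ['a', 'b', 'c', 'd']
```

The tail nodes e and f are pruned because they can never reach degree 3, leaving only the densely connected clique.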
5-core graph: color code: Type
5-core graph: color code: Modularity class
Disease Clusters (top 5) dendrogram
5-core graph: Cluster 5 (26%)
5-core graph: Cluster 6 (22%)
5-core graph: Cluster 0 (16%)
5-core graph: Cluster 3 (13%)
5-core graph: Cluster 4 (12.5%)
Comparison of clusters
• Contributing areas
  – Biology, bioinformatics, sociology, SNA, physics, applied mathematics, computer and information sciences
• Summary
  – Holistic analysis of health data
  – Analysis based on node centrality
  – Network partitioning
  – Studying the effect of perturbation
• Where do we go from here
  – Create a taxonomic structure of elements and interactions
  – Search tool
  – Biological and clinical implications
Conclusion