ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING


Page 1: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

Page 2: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

BACKGROUND

• Completion of sequencing projects
• Need for functional discovery
• Emerging area of study: large-scale genomic analysis
• Similarity of living systems

Page 3: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

GENETIC NETWORKS

• Modelling genetic networks
• Interaction of genes and proteins
• Relationship between topology and function

Page 4: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

MOTIVATION

• Common biological processes
• Comparison of networks
• Discovering missing interactions
• Discovering missing genes

Page 5: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

GRAPH MATCHING

[Figure: two graphs, G1 and G2, with nodes labelled mpn124, mpn132, mpn133, mpn134, mpn141, mpn145 and mge234, mge235, mge236, mge310, mge312, mge313, mge314, mge336, mge337, matched by a search-based algorithm with pruning techniques]

Page 6: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

ROADMAP

• Scale-Free Networks
• Modelling Genetic Networks
• Graph Matching
• Algorithm
• Results

Page 7: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

SCALE-FREE NETWORKS

Page 8: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

COMPLEX NETWORKS

• Small-world model
– WWW
– Human acquaintances network
– Citation networks
– Biological networks

Page 9: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

SMALL-WORLD

• Features:
– Characteristic path length
– Clustering coefficient
– Sparseness
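
The first two features can be computed directly from an adjacency map. The following is a minimal sketch, not the tool used in this work: the function names and the toy graph are assumptions made for illustration, and the graph is taken to be undirected and unweighted.

```python
from collections import deque


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbor pairs that are linked."""
    coeffs = []
    for u in adj:
        nbrs = list(adj[u])
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # convention: no neighbor pairs, coefficient 0
            continue
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in adj[nbrs[i]]
        )
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(characteristic_path_length(toy), clustering_coefficient(toy))
```

A small-world graph combines a short characteristic path length with a high clustering coefficient; sparseness can be read off as the number of links relative to the number of possible node pairs.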

Page 10: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

SMALL-WORLD

• Somewhere in between regular & random graphs

Page 11: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

SMALL-WORLD

• Highly clustered
• Short diameter

Page 12: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

SCALE-FREE NETWORKS

• Complex networks: biological, social, www, power grid, citation etc.

• Power-law connectivity: $P(k) \sim k^{-\gamma}$

• Hubs - authorities

Page 13: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

SCALE-FREE NETWORKS

• Application for testing scale-free behavior
– Yeast
– Helicobacter pylori
– Mycoplasma pneumoniae
– Mycoplasma genitalium
• Linear log-log graph
• Slope = $-\gamma$, the negative of the power-law exponent

Page 14: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

SCALE-FREE NETWORKS

• Slope is calculated by the least-squares method
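
The slope estimate can be sketched as an ordinary least-squares line fitted to the points (log k, log P(k)). This is an illustrative sketch only: `degree_distribution` and `loglog_slope` are hypothetical helper names, and the synthetic distribution in the usage lines is made up so that the expected slope is known.

```python
import math
from collections import Counter


def degree_distribution(adj):
    """Return {k: P(k)} for an undirected graph given as node -> set of neighbors."""
    n = len(adj)
    counts = Counter(len(nbrs) for nbrs in adj.values() if nbrs)
    return {k: c / n for k, c in counts.items()}


def loglog_slope(pk):
    """Least-squares slope of log10 P(k) against log10 k."""
    xs = [math.log10(k) for k in pk]
    ys = [math.log10(p) for p in pk.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic distribution that follows k^-2 exactly, so the slope should be -2.
synthetic = {k: k ** -2.0 for k in (1, 2, 4, 8)}
slope = loglog_slope(synthetic)
print(round(slope, 6))
```

For a real interaction map, the dictionary returned by `degree_distribution` would be passed to `loglog_slope`; a roughly linear log-log plot with a negative slope is the signature of scale-free behavior.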

Page 15: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

TOPOLOGY & FUNCTIONALITY

• Small diameter
– Ease of dissemination of information
– Ease of restoring after disturbance
• Cliquishness
– Alternate paths are found
• Heterogeneity
– Random removal does not affect the network
– Hubs are vulnerable to attack

Page 16: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

BIOLOGICAL ASPECTS

• Multifunctionality
– Grouped into functional units
• Stability
• Reason: most of the interactions are between hubs and authorities

Page 17: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

MODELLING GENETIC NETWORKS

Page 18: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

TYPES OF GENETIC NETWORKS

• Categorized by data sources
– Metabolic pathways
– Gene expression arrays
– Protein interactions
– Gene interactions

Page 19: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

INTERACTION MAPS

• High-level perspective
– Nodes: genes or proteins
– Edges: presence of an interaction
• Data sources
– Two-hybrid analysis
– Fusion analysis
– Chromosomal proximity
– Phylogenetic analysis

Page 20: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

GRAPH MATCHING

Page 21: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

PROBLEM DEFINITION

Attributed Relational Graph (ARG)

G = { V, E, X}.

V = {v1, v2, …, vn} Nodes

E = {e1, e2, …, em} Edges

X = {x1, x2,…,xn} Attributes
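
One possible in-memory representation of an ARG is sketched below. The class name `AttributedGraph`, its methods, and the example node names and attribute vectors are all assumptions for illustration; the slides do not specify a data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set


@dataclass
class AttributedGraph:
    """G = {V, E, X}: nodes, undirected edges, and one attribute vector per node."""

    nodes: List[str] = field(default_factory=list)                   # V
    edges: Set[FrozenSet[str]] = field(default_factory=set)          # E
    attributes: Dict[str, List[float]] = field(default_factory=dict)  # X

    def add_node(self, v: str, x: List[float]) -> None:
        self.nodes.append(v)
        self.attributes[v] = x

    def add_edge(self, u: str, v: str) -> None:
        # A frozenset makes the edge undirected: {u, v} == {v, u}.
        self.edges.add(frozenset((u, v)))

    def neighbors(self, v: str) -> Set[str]:
        return {w for e in self.edges if v in e for w in e if w != v}


# Hypothetical three-node example with made-up attribute vectors.
g = AttributedGraph()
g.add_node("v1", [0.2, 0.8])
g.add_node("v2", [0.5, 0.5])
g.add_node("v3", [0.9, 0.1])
g.add_edge("v1", "v2")
g.add_edge("v2", "v3")
print(sorted(g.neighbors("v2")))
```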

Page 22: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

INEXACT SUBGRAPH MATCHING

Allow for:

• Mismatching attribute values

• Missing nodes

• Missing links

Also called error-correcting subgraph isomorphism

NP-Complete

Page 23: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

SEARCH TECHNIQUES

• Cost function
• Pruning (structure constraints)
• Backtracking
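
How the three ingredients fit together can be sketched as a small branch-and-bound search. This is a generic illustration, not the algorithm of the tool presented here: the function `inexact_match`, the scalar node attributes, and the penalty values for missing nodes and links are all assumptions.

```python
def inexact_match(nodes1, edges1, attrs1, nodes2, edges2, attrs2,
                  missing_node=1.0, missing_link=0.5):
    """Map every node of graph 1 to a node of graph 2 or to None (missing).

    Edges are sets of frozenset pairs; attributes are scalars per node.
    """
    best = {"cost": float("inf"), "mapping": None}

    def step_cost(u, target, mapping):
        # Cost function: attribute mismatch plus penalties for missing data.
        if target is None:
            return missing_node
        cost = abs(attrs1[u] - attrs2[target])
        for w, t in mapping.items():
            if t is None:
                continue
            if frozenset((u, w)) in edges1 and frozenset((target, t)) not in edges2:
                cost += missing_link
        return cost

    def search(i, mapping, used, cost):
        if cost >= best["cost"]:
            return  # pruning: this branch cannot beat the best match so far
        if i == len(nodes1):
            best["cost"], best["mapping"] = cost, dict(mapping)
            return
        u = nodes1[i]
        for target in [t for t in nodes2 if t not in used] + [None]:
            delta = step_cost(u, target, mapping)
            mapping[u] = target
            if target is not None:
                used.add(target)
            search(i + 1, mapping, used, cost + delta)
            # Backtracking: undo the assignment and try the next candidate.
            if target is not None:
                used.discard(target)
            del mapping[u]

    search(0, {}, set(), 0.0)
    return best["mapping"], best["cost"]


# Hypothetical example: a path a-b-c matched against a graph lacking one link.
e1 = {frozenset(("a", "b")), frozenset(("b", "c"))}
e2 = {frozenset(("x", "y"))}
mapping, cost = inexact_match(
    ["a", "b", "c"], e1, {"a": 0.1, "b": 0.5, "c": 0.9},
    ["x", "y", "z"], e2, {"x": 0.1, "y": 0.5, "z": 0.9},
)
print(mapping, cost)
```

The search is exponential in the worst case, which is consistent with the NP-completeness of the problem; the quality of the pruning bound decides how much of the tree is actually explored.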

Page 24: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

ATTRIBUTED GRAPH MATCHING TOOL

Page 25: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

ATTRIBUTE MATCHING

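• Amino acid sequence content composition
– Array of 20: percentage of each amino acid
– Amino acids grouped into classes: array of 6
– Amino acid triples grouped into classes: array of 216 (6 x 6 x 6)

Example sequence: MKVLNKNEL

[Equation: composition value X(a) for each a ∈ A, defined in terms of O(a) and n]

$$S = \sum_{i=1}^{216} \left[ X_1(i) - X_2(i) \right]^2$$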
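
The 216-element triple composition and the squared-difference score can be sketched as follows. Two things here are assumptions rather than content of the slides: the particular grouping of the 20 amino acids into six classes (`GROUPS` is hypothetical), and the normalization of counts by the number of triples in the sequence.

```python
# Hypothetical grouping of the 20 amino acids into 6 classes.
GROUPS = ["AVLIM", "FWY", "KRH", "DE", "STNQ", "CGP"]
CLASS_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def triple_composition(seq):
    """Return a 216-element vector: frequency of each class triple in `seq`."""
    counts = [0] * 216
    for i in range(len(seq) - 2):
        a, b, c = (CLASS_OF[aa] for aa in seq[i:i + 3])
        counts[36 * a + 6 * b + c] += 1  # index into the 6 x 6 x 6 array
    total = sum(counts)
    # Assumed normalization: divide by the number of triples in the sequence.
    return [n / total for n in counts] if total else counts


def score(x1, x2):
    """S = sum over i of [X1(i) - X2(i)]^2; 0 means identical composition."""
    return sum((p - q) ** 2 for p, q in zip(x1, x2))


x = triple_composition("MKVLNKNEL")
y = triple_composition("MKVLNKNDL")  # made-up variant with one substitution
print(score(x, x), score(x, y))
```

Because D and E fall in the same class under this assumed grouping, the two example sequences yield the same vector and a score of 0, which illustrates why class-level composition tolerates conservative substitutions.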

Page 26: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

ATTRIBUTE MATCHING

[Figure: difference in amino acid composition values of gene pairs for M. genitalium and M. pneumoniae; axes: score, observations]

Page 27: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

STRUCTURAL CONSTRAINTS

• Effect of scale-free behaviour
– Connectivity information: highly heterogeneous, so start with the most connected node and work around it
– Pruning strategy: compatibility is determined by the power law

$$\frac{y_2 - y_1}{x_2 - x_1} = \frac{\log P(k_2) - \log P(k_1)}{\log k_2 - \log k_1}$$

Page 28: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

STRUCTURAL CONSTRAINTS

• Neighborhood connectivity
– Choose the neighbor at the next stage
• Backtracking
– Component by component
– Go back to the neighbor with the most connectivity within the component
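
One way to read these constraints is as a rule for the order in which nodes are considered: begin each component at its most connected node, then keep choosing the most connected unvisited neighbor of the nodes already visited. The sketch below illustrates that reading only; `visiting_order`, the alphabetical tie-breaking, and the toy graph are assumptions.

```python
def visiting_order(adj):
    """Order nodes hub-first, component by component.

    `adj` maps each node to the set of its neighbors (undirected graph).
    Ties in degree are broken alphabetically so the order is deterministic.
    """
    order, seen = [], set()
    for start in sorted(adj, key=lambda v: (-len(adj[v]), v)):
        if start in seen:
            continue  # already covered by an earlier component
        seen.add(start)
        order.append(start)
        frontier = set(adj[start]) - seen
        while frontier:
            # Next stage: the most connected neighbor of the visited region.
            nxt = min(frontier, key=lambda v: (-len(adj[v]), v))
            frontier.discard(nxt)
            seen.add(nxt)
            order.append(nxt)
            frontier |= set(adj[nxt]) - seen
    return order


# Hypothetical graph: hub h with neighbors a, b, c; a chain a-d; a separate pair x-y.
toy_net = {
    "h": {"a", "b", "c"}, "a": {"h", "d"}, "b": {"h"}, "c": {"h"},
    "d": {"a"}, "x": {"y"}, "y": {"x"},
}
order = visiting_order(toy_net)
print(order)
```

In a backtracking search, the same ranking could be used in reverse to decide which earlier neighbor to return to when a branch fails.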

Page 29: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

TEST CASE

• Mycoplasma genitalium:
– Smallest genome (470 ORFs)
• Mycoplasma pneumoniae:
– Very similar, superset (688 ORFs)

Page 30: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

TEST CASE...

• Mycoplasma genitalium:
– 232 nodes
– 211 links
• Mycoplasma pneumoniae:
– 267 nodes
– 257 links
• Inputs:
– MGE links
– MPN links
– MGE synonyms
– MPN synonyms
– MGE amino acid sequence
– MPN amino acid sequence

Page 31: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

RESULTS

MGE  MPN

Page 32: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

DISCOVERY OF MISSING DATA

• Missing link

• The link between MPN632 and MPN637 is missing in our data but exists in the literature

Page 33: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

DISCOVERY OF MISSING DATA

• Missing node with known COG

MPN236---MPN237---MPN238---MPN678
MG098---MG099---MG100---MG459

MG459 is ortholog of MPN678

Page 34: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

DISCOVERY OF MISSING DATA

• Missing node without known ortholog

Page 35: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

CONCLUSION

• Large-scale genomics
• Interaction data captures system structure and dynamics
• Graph matching exploits the scale-free characteristics
• Novel interactions and genes can be identified

Page 36: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING

ACKNOWLEDGEMENT

• YASEMİN TÜRKELİ