Information Visualization using graphs algorithms

Symeonidis Alkiviadis (ics.forth.gr)

Contents

Preliminaries

Gene clustering

Graph extraction from biological data

Graph visualization

Open issues

Discussion

Preliminaries

Visualize clusters of genes produced by clustering over gene expressions

Gene expression:

the set of values measured for each gene over a set of patients

Preliminaries

Graph G(V, E): a set of vertices V, with a set of edges E joining pairs of vertices

Each vertex represents a gene

Each edge represents strong correlation

Clustering => groups of vertices

Gene clustering

Correlation

Compute Pearson's correlation coefficient for every pair of genes

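$$
r = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sqrt{\left(\sum x^2 - \frac{\left(\sum x\right)^2}{N}\right)\left(\sum y^2 - \frac{\left(\sum y\right)^2}{N}\right)}}
$$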
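As a rough illustration of the correlation step, the sketch below computes Pearson's r from running sums and applies it to every pair of genes. It assumes each gene is given as a list of expression values with one value per patient, so N is the number of patients. The names `pearson`, `correlation_matrix`, and `expr` are hypothetical and not taken from the slides.

```python
import math

def pearson(x, y):
    """Pearson's r computed from sums (the computational form)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    # Clamp at zero to guard against tiny negative values from rounding.
    vx = max(sxx - sx * sx / n, 0.0)
    vy = max(syy - sy * sy / n, 0.0)
    den = math.sqrt(vx * vy)
    if den == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return (sxy - sx * sy / n) / den

def correlation_matrix(expr):
    """expr: {gene_name: [value per patient]} -> {(g, h): r} for g < h."""
    genes = sorted(expr)
    return {(g, h): pearson(expr[g], expr[h])
            for i, g in enumerate(genes) for h in genes[i + 1:]}
```

Computing all pairs this way takes one pass over the patients per pair of genes.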

Gene clustering

Greedy clustering

for every unclassified gene x

create a cluster which includes it

add all genes y with correlation > threshold

Cost: O(|genes|^2)
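The greedy loop above can be sketched as follows. This is a minimal version under two assumptions not stated on the slide: only genes that are still unclassified are added to a new cluster (so every gene ends up in exactly one cluster), and `corr` is a hypothetical callable returning the correlation of two genes.

```python
def greedy_clusters(genes, corr, threshold):
    """Greedy clustering: each unclassified gene seeds a cluster.

    genes: iterable of gene identifiers
    corr: hypothetical callable corr(x, y) -> correlation coefficient
    threshold: minimum correlation with the seed gene
    """
    unclassified = list(genes)
    clusters = []
    while unclassified:
        seed = unclassified.pop(0)
        cluster = [seed]
        remaining = []
        for y in unclassified:
            if corr(seed, y) > threshold:
                cluster.append(y)      # y joins the seed's cluster
            else:
                remaining.append(y)    # y stays unclassified for now
        unclassified = remaining
        clusters.append(cluster)
    return clusters
```

In the worst case every gene seeds its own cluster, which gives the quadratic number of correlation lookups quoted on the slide.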

Graph extraction from biological data

Genes → vertices ✓

Clusters → groups ✓

Edges ?

Graph extraction from biological data

In-cluster relation

Mean value of correlation coefficients for all genes in a cluster

All pairs of genes with correlation higher than threshold * mean are considered highly correlated
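A minimal sketch of the in-cluster rule, assuming the mean is taken over all pairs of genes inside the cluster and that `threshold` acts as a multiplier on that mean. `corr` is a hypothetical callable returning the correlation of two genes.

```python
from itertools import combinations

def in_cluster_edges(cluster, corr, threshold):
    """Return the pairs inside one cluster that count as highly correlated."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return []  # a single-gene cluster has no internal edges
    mean = sum(corr(x, y) for x, y in pairs) / len(pairs)
    cutoff = threshold * mean
    return [(x, y) for x, y in pairs if corr(x, y) > cutoff]
```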

Edge meaning: (Very) strong correlation

Graph extraction from biological data

Inter-cluster relation

Mean value of correlation coefficients for each cluster

All pairs of genes with correlation higher than threshold * (mean1 + mean2)/2 are considered highly correlated
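The inter-cluster rule can be sketched in the same spirit. The assumptions here are that each cluster's mean is taken over its internal pairs, that a single-gene cluster has mean 0.0, and that only pairs with one gene in each cluster are tested. `corr` and both function names are hypothetical.

```python
from itertools import combinations, product

def mean_corr(cluster, corr):
    """Mean correlation over all pairs inside one cluster."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0  # assumption: no internal pairs means a mean of zero
    return sum(corr(x, y) for x, y in pairs) / len(pairs)

def inter_cluster_edges(c1, c2, corr, threshold):
    """Pairs across two clusters exceeding threshold * (mean1 + mean2) / 2."""
    cutoff = threshold * (mean_corr(c1, corr) + mean_corr(c2, corr)) / 2
    return [(x, y) for x, y in product(c1, c2) if corr(x, y) > cutoff]
```

An edge returned here links genes that were placed in different clusters but still correlate strongly, which is why the slide reads it as a possible misclassification.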

Edge meaning: Possibly wrong classification

Graph extraction from biological data

Genes → vertices ✓

Clusters → groups ✓

Edges ✓ all highly correlated pairs of genes

Graph visualization

Gene → Vertex → circle

High correlation → Edge → line

Cluster → Group → circle with its genes (vertices) on its periphery

Graph visualization

Place groups

Determine ordering of vertices in group

Try to reduce crossings

Graph visualization: placing groups

Force-directed method over groups
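The slides do not say which force-directed variant is used. As one possible illustration, the sketch below follows the common spring model in which all groups repel each other and linked groups attract, treating every group as a single point. The function name, the force formulas, and the parameters `k` and `step` are assumptions.

```python
import math

def force_directed(positions, links, iterations=200, k=1.0, step=0.05):
    """Place groups as points: all pairs repel, linked pairs also attract.

    positions: {group: (x, y)} initial coordinates
    links: set of (group, group) pairs joined by at least one edge
    k: ideal distance between linked groups (assumed parameter)
    step: maximum movement per iteration (assumed parameter)
    """
    pos = {g: [p[0], p[1]] for g, p in positions.items()}
    groups = list(pos)
    for _ in range(iterations):
        disp = {g: [0.0, 0.0] for g in groups}
        for i, a in enumerate(groups):
            for b in groups[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                repulsion = k * k / d
                linked = (a, b) in links or (b, a) in links
                attraction = d * d / k if linked else 0.0
                f = (repulsion - attraction) / d
                disp[a][0] += dx * f
                disp[a][1] += dy * f
                disp[b][0] -= dx * f
                disp[b][1] -= dy * f
        for g in groups:
            m = math.hypot(disp[g][0], disp[g][1])
            if m > 0:
                scale = min(m, step) / m  # cap the move to keep it stable
                pos[g][0] += disp[g][0] * scale
                pos[g][1] += disp[g][1] * scale
    return {g: (p[0], p[1]) for g, p in pos.items()}
```

With these formulas two linked groups settle at a distance of about `k`, while unlinked groups keep drifting apart.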

Graph visualization: determine ordering of vertices in group (tree)

Tree

order vertices by depth-first search discovery time
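For the tree case, a minimal sketch of ordering by discovery time. It assumes the group is given as an adjacency-list dictionary; the name `dfs_discovery_order` and the choice of root are hypothetical.

```python
def dfs_discovery_order(adj, root):
    """Return vertices in the order a depth-first search discovers them.

    adj: {vertex: [neighbors]} adjacency lists of the group
    root: vertex where the search starts (assumed to be chosen by the caller)
    """
    order, seen, stack = [], set(), [root]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        order.append(u)
        # Push neighbors in reverse so the first neighbor is visited first.
        for v in reversed(adj[u]):
            if v not in seen:
                stack.append(v)
    return order
```

Placing vertices around the circle in this order keeps each subtree contiguous on the periphery.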

Graph visualization: determine ordering of vertices in group (bicon)

Biconnected graph:

Remains connected after removing any one vertex or edge

Graph visualization: determine ordering of vertices in group (bicon)

For every node u, identify triangles (u, v, w)

or create them

Store (v, w)

Remove u

[Figure: vertices u, v, w]

Graph visualization: determine ordering of vertices in group (bicon)

Restore graph

Remove all stored edges

Perform DFS, compute longest path and place it

Graph visualization: determine ordering of vertices in group (bicon)

Place any remaining vertices

Next to 2 neighbors

Next to 1 neighbor

Next to 0 neighbors

Graph visualization: determine ordering of vertices in group (n-bic)

Non-biconnected graph … under development

There is a vertex whose removal disconnects the graph

Decompose into biconnected components

get articulation points

vertices responsible for non-biconnectivity

Graph visualization: determine ordering of vertices in group (n-bic)

Decompose into biconnected components: biconnected subgraphs

get articulation points: vertices responsible for non-biconnectivity
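Articulation points can be found with a single depth-first search using the standard low-link technique. The sketch below assumes a simple undirected graph given as adjacency lists; it is recursive, so very deep graphs would need an iterative version. The function name is hypothetical.

```python
def articulation_points(adj):
    """Return the set of vertices whose removal disconnects the graph.

    adj: {vertex: [neighbors]} for a simple undirected graph
    """
    disc, low, points = {}, {}, set()
    counter = [0]

    def visit(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v not in disc:
                children += 1
                visit(v, u)
                low[u] = min(low[u], low[v])
                # v's subtree cannot reach above u: u separates it.
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
            elif v != parent:
                low[u] = min(low[u], disc[v])  # back edge
        # The root is an articulation point only with 2+ DFS children.
        if parent is None and children > 1:
            points.add(u)

    for u in adj:
        if u not in disc:
            visit(u, None)
    return points
```

A group with an empty result is biconnected, as long as it is connected and has at least three vertices.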

Graph visualization: determine ordering of vertices in group (n-bic)

Articulation points

+ biconnected components

------------------------------------------

Block-cutpoint tree

- DFS on the block-cutpoint tree => relative ordering of components

- For each biconnected component, act as before

Graph visualization: determine ordering of vertices in group

Cost

Tree:

DFS: O(|E| + |V|) = O(|E|)

Biconnected graph:

Dominated by DFS, O(|E|)

Non-biconnected graph:

Dominated by extracting the block-cutpoint tree, O(|E|)

Graph visualization … until now

Determine groups’ positions ✓

Determine vertices ordering ✓

Graph visualization

Place groups ✓

Determine ordering in group ✓

Try to reduce crossings

Graph visualization: reduce crossings

Spin groups trying to minimize energy
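The slides do not define the energy being minimized. As a hedged illustration, the sketch below spins one group around its center and picks the rotation that minimizes the sum of squared lengths of its inter-group edges, which is only a stand-in for whatever energy the original uses. All names and the sampling of `steps` candidate angles are assumptions.

```python
import math

def circle_positions(center, radius, n, offset):
    """Coordinates of n vertices evenly spaced on a circle, rotated by offset."""
    cx, cy = center
    return [(cx + radius * math.cos(offset + 2 * math.pi * i / n),
             cy + radius * math.sin(offset + 2 * math.pi * i / n))
            for i in range(n)]

def best_rotation(center, radius, n, anchors, steps=360):
    """Try `steps` rotations and return the offset with the lowest energy.

    anchors: list of (vertex_index, (x, y)) where (x, y) is the fixed far
             endpoint of an inter-group edge leaving that vertex
    """
    def energy(offset):
        pts = circle_positions(center, radius, n, offset)
        # Assumed energy: sum of squared inter-group edge lengths.
        return sum(math.dist(pts[i], p) ** 2 for i, p in anchors)

    candidates = (2 * math.pi * s / steps for s in range(steps))
    return min(candidates, key=energy)
```

Shorter inter-group edges tend to cross fewer other edges, but this is a heuristic and does not count crossings directly.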

Graph visualization: edge coloring

Each edge is assigned a weight

weight(x_node, y_node) = r(x_gene, y_gene)

The color of each edge reflects its weight

brighter color → stronger correlation

In-group edges have a different color than inter-group edges
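A small sketch of the coloring rule. The slides fix neither the hues nor the brightness scale, so the choice of red for in-group edges, blue for inter-group edges, and a linear mapping of the weight to an 8-bit channel are all assumptions, as is the function name.

```python
def edge_color(weight, in_group):
    """Map an edge weight (correlation r) to an RGB triple.

    Brighter means stronger correlation. Hues are assumed: red for
    in-group edges, blue for inter-group edges.
    """
    w = max(0.0, min(1.0, weight))  # clamp to [0, 1]
    level = int(round(255 * w))
    return (level, 0, 0) if in_group else (0, 0, level)
```

For example, `edge_color(0.9, True)` gives a brighter red than `edge_color(0.7, True)`.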

Graph visualization: overall

Initially…

Graph visualization: overall

Finally…

Open issues

Clustering

Edge translation

Visualize large data sets

Zoom

Layered drawing

Scrollbars