
Algorithms for nonuniform networks

Satu Elisa Schaeffer†

Laboratory for Theoretical Computer Science
Helsinki University of Technology, Finland
E-mail: [email protected]

SN09

We take observations on structural properties of natural networks as a starting point for developing efficient algorithms for natural instances of different graph problems [8]. The results include observations on structural effects together with algorithms that aim to reveal structural properties or exploit their presence in solving an interesting graph problem.

Sampling

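\[ \mu^{cc}_i \cdot p_{i,i'} = \mu^{cc}_{i'} \cdot p_{i',i} \]

\[ \frac{\alpha \cdot \varepsilon}{n} = \frac{\varepsilon'_i \, (1 - \alpha) \deg(v_i)}{2m} \]

[Figure: transition probabilities p_{x,y} among the vertices v, v′, w, and w′ for the regular random walk (Reg. RW) and the balanced random walk (Bal. RW).]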
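The detailed-balance condition above can be checked numerically. The following is a minimal sketch that assumes the regular random walk moves to a uniformly chosen neighbour and that the candidate distribution is deg(v_i)/2m; the function names and the small example graph are hypothetical, and the balanced walk of the poster is not reproduced here.

```python
def regular_walk(adj):
    """Transition probabilities of a regular random walk: p[v][w] = 1 / deg(v)."""
    return {v: {w: 1.0 / len(nbrs) for w in nbrs} for v, nbrs in adj.items()}


def degree_distribution(adj):
    """Candidate stationary distribution mu[v] = deg(v) / 2m."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    return {v: len(nbrs) / two_m for v, nbrs in adj.items()}


def max_balance_violation(adj, p, mu):
    """Largest |mu_v * p_{v,w} - mu_w * p_{w,v}| over all edges {v, w}."""
    return max(
        abs(mu[v] * p[v][w] - mu[w] * p[w][v]) for v in adj for w in adj[v]
    )


# Hypothetical undirected example graph: a triangle 0-1-2 with a pendant vertex 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
p = regular_walk(adj)
mu = degree_distribution(adj)
violation = max_balance_violation(adj, p, mu)
```

For an undirected graph every product mu_v * p_{v,w} equals 1/2m under these assumptions, so the violation is zero up to rounding.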

Spectra of four DGM graphs [3]

[Figure: four panels showing the eigenvalue spectra (eigenvalue axis from -0.5 to 1) for DGM generations 4, 5, 6, and 7; legend: Balanced, Regular.]

Convergence to the stationary distribution

[Figure: four panels plotting the estimated total variation distance (TVD, from 0 to 1) against the step count for the balanced, min. bal., combined, and regular walks. Panels: DGM gen. 5 (n = 366, m = 729); DGM gen. 7 (n = 3,282, m = 6,561); collaboration graph (n = 503, m = 828); collaboration graph (n = 5,909, m = 13,510).]

Clustering

Given a data set $D$, divide it into clusters $C_1, \ldots, C_k$ with $\bigcup_{i=1}^{k} C_i = D$ such that the elements assigned to a particular cluster $C_i$ are somehow similar.

Fiedler clustering [6]

Initialize a gradient-descent iteration with $f(v_s) = 0$ and $f(v) = 1$ for all $v \in V$, $v \neq v_s$, choosing a descent-speed parameter $\delta > 0$, and taking the iteration steps defined by $\tilde{f}(w)_{t+1} = \min\{1, \tilde{f}(w)_t + \delta \cdot \gamma\}$, where

\[ \gamma = \sum_{\{v,w\} \in E} \tilde{f}(v)_t - (1 - c) \cdot \deg(w) \cdot \tilde{f}(w)_t . \]
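The iteration above can be sketched in a few lines. This is an illustrative reading only, not the implementation from [6]: it assumes synchronous updates, assumes the seed vertex v_s is held at 0 throughout, and uses a hypothetical function name and example graph.

```python
def fiedler_iteration(adj, seed, delta, c, steps):
    """Run `steps` rounds of f(w) <- min(1, f(w) + delta * gamma).

    gamma sums f over the neighbours of w and subtracts (1 - c) * deg(w) * f(w).
    Assumptions: all vertices are updated synchronously, and the seed stays at 0.
    """
    f = {v: 1.0 for v in adj}
    f[seed] = 0.0
    for _ in range(steps):
        new_f = {}
        for w, nbrs in adj.items():
            if w == seed:
                new_f[w] = 0.0  # assumption: the seed value is never updated
                continue
            gamma = sum(f[v] for v in nbrs) - (1.0 - c) * len(nbrs) * f[w]
            new_f[w] = min(1.0, f[w] + delta * gamma)
        f = new_f
    return f


# Hypothetical example: a path 0 - 1 - 2 - 3 with the seed at vertex 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
one_step = fiedler_iteration(path, seed=0, delta=0.1, c=0.1, steps=1)
```

After one step only the neighbour of the seed has dropped below 1; further steps spread the decrease outward, and vertices with low values are the candidates for the seed's cluster.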

[Figure: panels titled "Exact Fiedler vectors", "Approx. Fied. vectors", and "Mean abs. times"; axes: Fiedler value (0 to 1) against vertex label (0 to 140), and Fiedler value (FV) against vertices in ascending order of FV.]

Density clustering [7, 9]

\[ F(C) = \delta(C) \cdot \rho(C) = \frac{2 \deg_{\mathrm{int}}(C)^2}{|C| \, (|C| - 1) \, (\deg_{\mathrm{int}}(C) + \deg_{\mathrm{ext}}(C))} \]
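As a rough illustration of the fitness function, the sketch below evaluates F(C) on an adjacency-list graph. It assumes that deg_int(C) counts the edges with both endpoints in C and deg_ext(C) the edges with exactly one endpoint in C, which is the reading under which the two factors multiply to the closed form above; the function name and example graph are hypothetical.

```python
def cluster_fitness(adj, cluster):
    """F(C) = local density delta(C) times relative density rho(C)."""
    cluster = set(cluster)
    size = len(cluster)
    if size < 2:
        raise ValueError("F(C) is undefined for clusters with fewer than 2 vertices")
    internal_endpoints = 0
    deg_ext = 0
    for v in cluster:
        for w in adj[v]:
            if w in cluster:
                internal_endpoints += 1  # each internal edge is seen twice
            else:
                deg_ext += 1
    deg_int = internal_endpoints // 2
    if deg_int == 0:
        return 0.0  # no internal edges: both factors vanish
    local_density = 2 * deg_int / (size * (size - 1))
    relative_density = deg_int / (deg_int + deg_ext)
    return local_density * relative_density


# Hypothetical example: a triangle 0-1-2 with one edge leaving to vertex 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
fitness = cluster_fitness(adj, {0, 1, 2})  # 2 * 3^2 / (3 * 2 * (3 + 1)) = 0.75
```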

[Figure: average and variation of the cluster order, internal degree, and external degree, each plotted against simulation time.]

Search with one-step lookahead

[Figure panels: None, k = 5, k = 10, k = 20, k = 30, Full.]

Average path-length matrices for the C. elegans neural network: no lookahead, a k-place lookahead buffer filled by uniform-probability selection, and full lookahead.

Spanning trees

Load $\delta_\ell(T)$     854     588     1,350
Load $\delta_\ell(F)$     362      70     1,350
Avg. hop count            2.05    2.66    1.71

A simplified communication-cost model [5]: the energy consumed by one-bit communication between v and w is

\[ E(v, w) = \alpha_t + \alpha_{\mathrm{amp}} \cdot \mathrm{dist}_{\mathrm{Eucl}}(v, w) + \alpha_r . \]
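A minimal sketch of this cost model follows, assuming vertices are given as coordinate tuples. The function name and the parameter values in the example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def link_energy(pos_v, pos_w, alpha_t, alpha_amp, alpha_r):
    """Energy for one-bit communication between two positioned vertices.

    alpha_t and alpha_r are the constant terms of the model, and alpha_amp
    scales the Euclidean distance between the two positions.
    """
    return alpha_t + alpha_amp * math.dist(pos_v, pos_w) + alpha_r


# Hypothetical values: distance 5 between (0, 0) and (3, 4).
energy = link_energy((0.0, 0.0), (3.0, 4.0), alpha_t=1.0, alpha_amp=0.5, alpha_r=1.0)
```

With these illustrative values the cost is 1.0 + 0.5 * 5 + 1.0 = 4.5.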

Thick edges have weight 1 + ε (ε > 0) and the thin edges have weight 1. The MST (total weight 10) has avg path length 4; the other tree has unweighted avg path length 2.36, weighted avg path length 2.36 + 1.472ε, and total weight 10 + 5ε.

[Figure: avg. hop count (0 to 80) and avg. path length (0.4 to 1.8) plotted against density (0.4, 0.2, 0.1) in panels for T = 1, T = 5, T = 10, and T = 50; legend: MST, 10000 down to 1000 in steps of 1000, MHT, G.]

The average hop-length and the average path length for the MST set and the different-parameter LHT sets, as well as the approximate MHT set and the original graph (avg over 30-graph sets).

Graph similarity

(i) the maximum common subgraph problem [1, 4]: given two graphs G and P, construct a graph G′ of maximum order and size such that both G and P contain a subgraph isomorphic to G′

(ii) the minimum common supergraph problem [2, 4]: given two graphs G and P, construct a graph G′ of minimum order and size s.t. G′ contains a subgraph isomorphic to G and a subgraph isomorphic to P

Bit-string representation

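\[ \mathcal{B}(A) = 011100100001010 \]

\[ A_G = \begin{pmatrix}
0 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 & 0
\end{pmatrix} \]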
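The bit-string shown for the example matrix matches reading the upper triangle of the adjacency matrix row by row, diagonal included. The sketch below is inferred from that example rather than taken from a stated definition, and the function name is hypothetical.

```python
def bit_string(matrix):
    """Concatenate the upper triangle (diagonal included) row by row."""
    n = len(matrix)
    return "".join(str(matrix[i][j]) for i in range(n) for j in range(i, n))


# The example adjacency matrix A_G from the poster.
A_G = [
    [0, 1, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
]
encoded = bit_string(A_G)  # "011100100001010"
```

A graph of order n yields a string of length n(n + 1)/2, which is 15 for the five-vertex example.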

Bit-string similarity measure

\[ \delta_w(\mathcal{B}_1, \mathcal{B}_2) = 1 - \frac{2 \sum_{i=1}^{N} (I(i) \cdot i)}{N(N + 1)} \]
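The normalising factor N(N + 1) can be read off from the sum of the position weights. The short derivation below assumes that I(i) is a 0/1 indicator for position i, which the surviving text does not state explicitly.

```latex
\[
\sum_{i=1}^{N} i = \frac{N(N+1)}{2}
\quad\Longrightarrow\quad
0 \le \frac{2 \sum_{i=1}^{N} I(i) \cdot i}{N(N+1)} \le 1
\quad\text{whenever } I(i) \in \{0, 1\} .
\]
% Hence \delta_w = 0 when I(i) = 1 for every position i,
% and \delta_w = 1 when I(i) = 0 for every position i.
```

Under that assumption the measure always lies in [0, 1], and later positions of the bit-string carry more weight than earlier ones.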

Graph similarity measure

\[ \rho(G_S, P) = 1 - \min_{\varphi : S \to S} \left\{ \delta_w\bigl( \mathcal{B}(\varphi(G_S)), \mathcal{B}(P) \bigr) \right\} \]

[Figure: runtime (ms, logarithmic scale) against graph order n for the exact algorithm (n from 5 to 11) and the greedy algorithm (n from 50 to 200); legend: Gnp, BA, WS.]

Computation of canonical bit-strings (avg over 30 instances)

References

[1] H. Bunke, P. Foggia, C. Guidobaldi, C. Sansone, and M. Vento. A comparison of algorithms for maximum common subgraph on randomly connected graphs. In T. Caelli, A. Amin, R. P. W. Duin, M. S. Kamel, and D. de Ridder, editors, Structural, Syntactic, and Statistical Pattern Recognition in Proceedings of Joint IAPR International Workshops SSPR 2002 and SPR 2002, volume 2396 of Lecture Notes in Computer Science, pages 123–132, New York, NY, USA, 2002. Springer-Verlag GmbH.

[2] H. Bunke, X. Jiang, and A. Kandel. On the minimum common supergraph of two graphs. Computing, 65(1):13–25, July 2000.

[3] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes. Pseudofractal scale-free web. Physical Review E, 65(6):066122, June 2002.

[4] M.-L. Fernández and G. Valiente. A graph distance measure combining maximum common subgraph and minimum common supergraph. Pattern Recognition Letters, 22(6–7):753–758, 2001.

[5] G. Gupta and M. Younis. Performance evaluation of load-balanced clustering in wireless sensor networks. In Proceedings of Tenth International Conference on Telecommunications. IEEE, Feb. 2003.

[6] P. Orponen and S. E. Schaeffer. Local clustering of large graphs by approximate Fiedler vectors. In S. Nikoletseas, editor, Proceedings of the Fourth International Workshop on Efficient and Experimental Algorithms (WEA’05), volume 3505 of Lecture Notes in Computer Science, pages 524–533, Berlin/Heidelberg, Germany, 2005. Springer-Verlag GmbH.

[7] S. E. Schaeffer. Stochastic local clustering for massive graphs. In T. B. Ho, D. Cheung, and H. Liu, editors, Proceedings of the Ninth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-05), volume 3518 of Lecture Notes in Computer Science, pages 354–360, Berlin/Heidelberg, Germany, 2005. Springer-Verlag GmbH.

[8] S. E. Schaeffer. Algorithms for nonuniform networks. Research Report A102, Helsinki University of Technology, Laboratory for Theoretical Computer Science, Espoo, Finland, May 2006.

[9] S. E. Schaeffer, S. Marinoni, M. Särelä, and P. Nikander. Dynamic local clustering for hierarchical ad hoc networks. In Proceedings of the 2006 International Workshop on Wireless Ad Hoc and Sensor Networks (IWWAN 2006), To appear.

†Address for correspondence: Helsinki University of Technology, P.O. Box 5400, FI-02015 TKK, FINLAND

This research was supported by the Academy of Finland under grants 81120 and 206235, the Helsinki Graduate School in Computer Science and Engineering (HeCSE), the Nokia Foundation, and the Rotary Foundation.