Tree Edit Distance


Page 1:

Tree Edit Distance

Page 2:

Tree edit distance (TED): the minimum number of edits needed to transform one tree into another.

Page 3:

The edit operations

Delete a node:

Relabel a node:

[Figure: tree diagrams with nodes labeled v and w]

Page 4:

The edit operations

Insert a node:

[Figure: tree diagram with a node labeled v]

Page 5:

Existing Algorithms

Page 6:

Recursive Algorithm [SZ89]

[Figure: forests F and G with rightmost roots v and w]

Recurses on the rightmost roots:

d(F,G) = min { Delete v, Delete w, Match v and w }
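The three-way minimum can be written as a short memoized recursion. This is a minimal sketch, not the [SZ89] dynamic program itself: it assumes unit costs for every operation, encodes a forest as a tuple of `(label, children)` trees, and the names `tree`, `size`, `forest_distance` and `tree_edit_distance` are made up for the example.

```python
from functools import lru_cache

def tree(label, *children):
    return (label, tuple(children))

def size(forest):
    """Number of nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(F, G):
    if not F or not G:
        # Only deletions (or only insertions) remain.
        return size(F) + size(G)
    v_label, v_children = F[-1]   # rightmost root of F
    w_label, w_children = G[-1]   # rightmost root of G
    # Delete v: its children become roots in its place.
    delete_v = forest_distance(F[:-1] + v_children, G) + 1
    # Delete w (equivalently, insert w into F).
    delete_w = forest_distance(F, G[:-1] + w_children) + 1
    # Match v with w: solve below the two roots and to their left.
    match = (forest_distance(v_children, w_children)
             + forest_distance(F[:-1], G[:-1])
             + (0 if v_label == w_label else 1))
    return min(delete_v, delete_w, match)

def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))
```

For example, `tree_edit_distance(tree("a", tree("b"), tree("c")), tree("a", tree("b")))` evaluates the recursion on two small trees that differ by one leaf.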


Page 12:

Time Complexity [SZ89]

Relevant subproblem: a subproblem that shows up while computing d(F,G).

#relevant subproblems = time complexity = $O(n^2 m^2) = O(n^4)$

More precisely: $O(nm \cdot \min\{\mathrm{Depth}(F), \mathrm{Leaves}(F)\} \cdot \min\{\mathrm{Depth}(G), \mathrm{Leaves}(G)\})$

[Figure: relevant subforests of F and G, with roots v and w]

Page 13:

Klein98

Same as the previous algorithm, but recurses on a light child in F.

#relevant subproblems = (#relevant subforests of F) $\cdot\, m^2 = O(n \log n \cdot m^2) = O(n^3 \log n)$

[Figure: trees F and G]

By heavy path decomposition [HT84]
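A heavy path decomposition can be sketched as follows. The code is an illustration only: it assumes the `(label, children)` tuple encoding, recomputes subtree sizes on every call for brevity (a real implementation would precompute them), and `node`, `size` and `heavy_paths` are hypothetical names. At each node the path continues into the child with the largest subtree (the heavy child); every other child is light and starts a new path.

```python
def node(label, *children):
    return (label, tuple(children))

def size(tree):
    return 1 + sum(size(c) for c in tree[1])

def heavy_paths(tree):
    """Split the tree into heavy paths; each path is a list of node labels."""
    paths = []
    pending = [tree]                 # roots of paths not traced yet
    while pending:
        current = pending.pop()
        path = []
        while True:
            label, children = current
            path.append(label)
            if not children:
                break
            # Index of the heavy child: the one with the largest subtree.
            h = max(range(len(children)), key=lambda i: size(children[i]))
            # Light children start their own paths later.
            pending.extend(children[:h] + children[h + 1:])
            current = children[h]
        paths.append(path)
    return paths

t = node("a", node("b", node("d"), node("e", node("f"))), node("c"))
paths = heavy_paths(t)
```

Each light child has at most half as many nodes as its parent's subtree, so a root-to-leaf walk crosses at most about log n light edges. That is where the log n factor in the bound above comes from.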

Page 14:

Decomposition strategy [DT03]

For every two subforests (F,G), a strategy says right or left.

Zhang & Shasha’s strategy = right always.

Klein’s strategy = right iff the rightmost tree in F is smaller than the leftmost tree in F.

Lower bound of strategy algorithms = $\Omega(nm \cdot \log n \cdot \log m)$

Any strategy algorithm computes the edit distance between any two subtrees of F and G (without their roots).
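The idea of a strategy can be illustrated by making the recursion take the left/right choice as a parameter. This is a sketch under assumptions: unit costs, forests encoded as tuples of `(label, children)` trees, and hypothetical names (`make_distance`, `always_right`, `klein`). It also records which subproblems show up, so different strategies can be compared on the same input.

```python
from functools import lru_cache

def node(label, *children):
    return (label, tuple(children))

def size(forest):
    return sum(1 + size(children) for _, children in forest)

def always_right(F, G):
    return "right"

def klein(F, G):
    # Right iff the rightmost tree of F is smaller than the leftmost tree of F.
    return "right" if size(F[-1:]) < size(F[:1]) else "left"

def make_distance(strategy):
    seen = set()   # subproblems (F, G) that show up during the computation

    @lru_cache(maxsize=None)
    def d(F, G):
        seen.add((F, G))
        if not F or not G:
            return size(F) + size(G)
        if strategy(F, G) == "right":
            (v, vc), rest_f = F[-1], F[:-1]
            (w, wc), rest_g = G[-1], G[:-1]
            del_v, del_w = rest_f + vc, rest_g + wc
        else:
            (v, vc), rest_f = F[0], F[1:]
            (w, wc), rest_g = G[0], G[1:]
            del_v, del_w = vc + rest_f, wc + rest_g
        return min(d(del_v, G) + 1,                             # delete v
                   d(F, del_w) + 1,                             # delete w
                   d(vc, wc) + d(rest_f, rest_g) + (v != w))    # match v, w
    return d, seen

t1 = node("a", node("b"), node("c", node("d")))
t2 = node("a", node("c", node("d")))
d_right, seen_right = make_distance(always_right)
d_klein, seen_klein = make_distance(klein)
print(d_right((t1,), (t2,)), len(seen_right))
print(d_klein((t1,), (t2,)), len(seen_klein))
```

Every strategy returns the same distance; only the set of subproblems visited, and therefore the running time, changes.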

Page 15:

Our Results

An $O(m^2 n (\log\frac{n}{m} + 1)) = O(n^3)$ time, $O(nm)$ space algorithm. (Today: $O((nm)^{3/2}) = O(n^3)$ time and space) [DMRW ICALP07]

A strategy algorithm symmetrically dependent on the two input trees.

A matching lower bound for all strategy algorithms. (Today: a lower bound of $\Omega(nm^2)$)

Local edit distance and affine gap penalties at the cost of one execution. (Today: local RNA edit distance) [BHLW CPM06]

Page 16:

Our Algorithm

Our algorithm to compute d(F,G):

1. If |F| < |G|, compute d(G,F) instead.

2. Recursively compute d(Ki,G) for every Ki.

3. Run Klein’s strategy where the “master” is F (no need to recurse).

[Figure: tree F with subtrees K1, ..., K5, and tree G]

Page 17:

Analysis

Let R(F, G) be the number of relevant subproblems that show up when our algorithm computes d(F,G). R(F, G) = ?

The subtrees Ki satisfy:

(*) $\sum_i |K_i| \le |F|$

(**) $\forall i,\ |K_i| \le \frac{|F|}{2}$

Therefore:

$$R(F,G) \le |F||G|^2 + \sum_i R(K_i, G)$$
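The two size conditions (*) $\sum_i |K_i| \le |F|$ and (**) $|K_i| \le |F|/2$ can be checked on an example. The sketch below rests on an assumption the transcript does not state explicitly: that the Ki are the subtrees hanging off the heavy path of F, as the figure and the reference to Klein’s strategy suggest. The encoding and the names `node`, `size` and `hanging_subtrees` are hypothetical.

```python
def node(label, *children):
    return (label, tuple(children))

def size(tree):
    return 1 + sum(size(c) for c in tree[1])

def hanging_subtrees(tree):
    """Subtrees rooted at the light children along the heavy path from the root."""
    hanging = []
    while tree[1]:
        children = tree[1]
        # Follow the child with the largest subtree; the others hang off the path.
        h = max(range(len(children)), key=lambda i: size(children[i]))
        hanging.extend(children[:h] + children[h + 1:])
        tree = children[h]
    return hanging

F = node("a", node("b", node("d"), node("e", node("f"))), node("c"))
K = hanging_subtrees(F)

# (*): the Ki are disjoint parts of F, so their sizes sum to at most |F|.
assert sum(size(k) for k in K) <= size(F)
# (**): a light child is no larger than its heavy sibling, hence at most |F|/2.
assert all(2 * size(k) <= size(F) for k in K)
```

Under this reading, (**) is what makes the recursion shrink quickly: each recursive call d(Ki,G) works on a tree at most half the size of F.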

Page 18:

An $O((nm)^{3/2}) = O(n^3)$ Upper Bound

We show that $R(F,G) \le 4(|F||G|)^{3/2}$. Proof by induction:

$$
\begin{aligned}
R(F,G) &\le |F||G|^2 + \sum_i R(K_i,G) \\
&\le |F||G|^2 + \sum_i 4(|K_i||G|)^{3/2} && \text{(by inductive assumption)} \\
&= |F||G|^2 + 4|G|^{3/2} \sum_i |K_i|^{3/2} \\
&\le |F||G|^2 + 4|G|^{3/2} \sum_i |K_i| \cdot \max_i\{\sqrt{|K_i|}\} \\
&\le |F||G|^2 + 4|G|^{3/2} |F| \sqrt{\tfrac{|F|}{2}} && \text{(by (*) and (**))} \\
&\le 4(|F||G|)^{3/2} && \text{(we know } |G| \le |F| \text{)}
\end{aligned}
$$

where

(*) $\sum_i |K_i| \le |F|$

(**) $\forall i,\ |K_i| \le \frac{|F|}{2}$

Page 24:

An $O(nm^2(1 + \log\frac{n}{m}))$ Bound

Proof idea: At most log(n/m) nested recursive calls where F is the “master” before all trees have size ≤ m. For all trees of size ≤ m, use the previous $O(m^3)$ bound. There are at most n/m such trees, so the total is $\frac{n}{m} \cdot O(m^3) = O(nm^2)$.

[Figure: tree F with subtrees K1, ..., K5, and tree G]


Page 26:

A Matching Lower Bound for all decomposition strategy algorithms

An $\Omega(nm^2)$ lower bound:

[Figure: trees F and G]

Page 27:

A Matching Lower Bound for all decomposition strategy algorithms

An $\Omega(nm^2)$ lower bound:

Consider this computational path: if the strategy says left, delete from F; otherwise delete from G.

For every two internal nodes v in F and w in G we get $\min\{|F_v|, |G_w|\}$ new subproblems ($F_v$ is the tree rooted at v).

Summing over all such v, w:

$$\sum_{v,w} \min\{|F_v|, |G_w|\} = \sum_{i=1}^{n/2} \sum_{j=1}^{m/2} \min\{2i, 2j\} = \Omega(nm^2)$$
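The growth of the double sum can be checked numerically. This is a small sanity check, not part of the original argument: it assumes n and m are even with m ≤ n, and `lower_bound_sum` is a name made up for the example. A direct calculation gives that the sum is at least $nm^2/12$ in that case, which the assertion below tests for a few sizes.

```python
def lower_bound_sum(n, m):
    """Sum of min(2i, 2j) over 1 <= i <= n/2 and 1 <= j <= m/2 (n, m even)."""
    return sum(min(2 * i, 2 * j)
               for i in range(1, n // 2 + 1)
               for j in range(1, m // 2 + 1))

for n, m in [(8, 4), (64, 16), (256, 256)]:
    total = lower_bound_sum(n, m)
    # For m <= n the sum is at least n * m^2 / 12, so it grows like n * m^2.
    assert 12 * total >= n * m * m
    print(n, m, total, total / (n * m * m))
```

The printed ratio stays bounded away from zero as n and m grow, which is what the $\Omega(nm^2)$ claim says.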

Page 28:

A Matching Lower Bound for all decomposition strategy algorithms

An $\Omega(nm^2(1 + \log\frac{n}{m}))$ lower bound

A careful counting argument on:

[Figure: trees F and G]

Page 29:

Thank you!