Tree edit distance 1
Tree Edit Distance
Minimum edits to transform one tree into another
TED
The edit operations
Delete a node: remove v; the children of v become children of its parent w.
Relabel a node: change the label of v.
Insert a node: the inverse of delete; a new node v takes over a consecutive run of siblings as its children.
Existing Algorithms
Recursive Algorithm [SZ89]
Recurse on the rightmost roots, v in F and w in G:
$d(F,G) = \min\{\text{delete } v,\ \text{delete } w,\ \text{match } v \text{ and } w\}$
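
The three cases of the recursion can be written out as a short memoized function. This is only a sketch under assumptions that are not on the slides: unit cost for every operation, trees encoded as nested tuples `(label, children)`, and the hypothetical names `size`, `forest_dist` and `tree_dist`.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), children is a tuple of trees,
# and a forest is a tuple of trees. Every edit operation costs 1.

def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_dist(F, G):
    if not F:
        return size(G)              # insert every node of G
    if not G:
        return size(F)              # delete every node of F
    (v_label, v_kids), (w_label, w_kids) = F[-1], G[-1]
    # Delete v: its children replace it as the rightmost trees of F.
    delete_v = forest_dist(F[:-1] + v_kids, G) + 1
    # Delete w (equivalently, insert w into F).
    delete_w = forest_dist(F, G[:-1] + w_kids) + 1
    # Match v and w: solve below the two roots and to their left separately.
    match = (forest_dist(v_kids, w_kids)
             + forest_dist(F[:-1], G[:-1])
             + (0 if v_label == w_label else 1))
    return min(delete_v, delete_w, match)

def tree_dist(T1, T2):
    return forest_dist((T1,), (T2,))
```

For example, `tree_dist(("a", (("b", ()), ("c", ()))), ("a", (("b", ()),)))` deletes the leaf `c` and returns 1.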
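Time Complexity [SZ89]
A subproblem is relevant if it shows up while computing d(F,G).
#relevant subproblems = time complexity = $O(n^2 m^2) = O(n^4)$ in the worst case; more precisely, $O(nm \cdot \min\{\mathrm{Depth}(F), \mathrm{Leaves}(F)\} \cdot \min\{\mathrm{Depth}(G), \mathrm{Leaves}(G)\})$.
[Figure: the relevant subforests of F and G, with rightmost roots v and w.]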
Klein98
Same as the previous algorithm, but recurses on a light child in F.
#relevant subproblems = (#relevant subforests of F) $\cdot\, m^2 = O(n \log n \cdot m^2) = O(n^3 \log n)$, by heavy path decomposition [HT84].
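
The log n factor comes from the fact that any root-to-leaf path crosses at most log n light edges. A minimal sketch of the decomposition follows, assuming trees are nested tuples `(label, children)`; the function names are hypothetical and subtree sizes are recomputed naively for brevity.

```python
def subtree_size(tree):
    _, children = tree
    return 1 + sum(subtree_size(c) for c in children)

def heavy_path(tree):
    """Labels on the heavy path from the root down to a leaf."""
    path = []
    while True:
        label, children = tree
        path.append(label)
        if not children:
            return path
        # The heavy child is the child with the largest subtree (ties: first).
        tree = max(children, key=subtree_size)

def max_light_depth(tree):
    """Largest number of light edges on any root-to-leaf path."""
    _, children = tree
    if not children:
        return 0
    heavy = max(range(len(children)), key=lambda i: subtree_size(children[i]))
    return max(max_light_depth(c) + (0 if i == heavy else 1)
               for i, c in enumerate(children))
```

Every light edge leads to a subtree of at most half the size of its parent's subtree, so `2 ** max_light_depth(t)` never exceeds `subtree_size(t)`.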
Decomposition strategy [DT03]
For every two subforests (F,G), a strategy says right or left.
Zhang & Shasha’s strategy = always right.
Klein’s strategy = right iff the rightmost tree in F is smaller than the leftmost tree in F.
Lower bound for strategy algorithms = $\Omega(nm \cdot \log n \cdot \log m)$.
Any strategy algorithm computes the edit distance between every two subtrees of F and G (without their roots).
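
To make the notion of a strategy concrete, the sketch below passes the left/right choice in as a callback and counts the distinct subproblems it touches. It assumes unit costs and trees as nested tuples `(label, children)`; `strategy_dist`, `always_right` and `klein` are hypothetical names, and `klein` simply encodes the rule as stated on this slide.

```python
def size(forest):
    return sum(1 + size(kids) for _, kids in forest)

def strategy_dist(F, G, strategy):
    """Return (distance, number of distinct nonempty subproblems)."""
    memo = {}

    def d(F, G):
        if not F:
            return size(G)
        if not G:
            return size(F)
        if (F, G) in memo:
            return memo[F, G]
        if strategy(F, G) == "right":
            (a, fk), (b, gk) = F[-1], G[-1]
            rest_f, rest_g = F[:-1], G[:-1]
            del_f, del_g = rest_f + fk, rest_g + gk
        else:  # "left": same recursion on the leftmost roots
            (a, fk), (b, gk) = F[0], G[0]
            rest_f, rest_g = F[1:], G[1:]
            del_f, del_g = fk + rest_f, gk + rest_g
        best = min(d(del_f, G) + 1,                          # delete from F
                   d(F, del_g) + 1,                          # delete from G
                   d(fk, gk) + d(rest_f, rest_g) + int(a != b))  # match roots
        memo[F, G] = best
        return best

    return d(F, G), len(memo)

def always_right(F, G):
    # Zhang & Shasha's strategy.
    return "right"

def klein(F, G):
    # Right iff the rightmost tree of F is smaller than its leftmost tree.
    return "right" if size(F[-1:]) < size(F[:1]) else "left"
```

Both strategies return the same distance; only the number of subproblems they visit differs.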
Our Results
An $O(m^2 n (1 + \log\frac{n}{m})) = O(n^3)$ time, $O(nm)$ space algorithm. (Today: $O((nm)^{3/2}) = O(n^3)$ time and space.) [DMRW ICALP07]
A strategy algorithm symmetrically dependent on the two input trees.
A matching lower bound for all strategy algorithms. (Today: a lower bound of $\Omega(nm^2)$.)
Local edit distance and affine gap penalties at the cost of one execution. (Today: local RNA edit distance.) [BHLW CPM06]
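Our Algorithm
Our algorithm to compute d(F,G):
1. If |F| < |G|, compute d(G,F).
2. Recursively run d(K_i,G) for every K_i, where the K_i are the subtrees hanging off the heavy path of F.
3. Run Klein’s strategy where the “master” is F (no need to recurse).
[Figure: F with the subtrees K1 to K5 hanging off its heavy path, next to G.]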
Analysis
Let R(F,G) be the number of relevant subproblems encountered when computing d(F,G). Step 2 of the algorithm contributes $\sum_i R(K_i,G)$ and step 3 contributes at most $|F||G|^2$, so
$$R(F,G) \le |F||G|^2 + \sum_i R(K_i,G),$$
where
(*) $\sum_i |K_i| \le |F|$
(**) $|K_i| \le \frac{|F|}{2}$ for all $i$.
An $O((nm)^{3/2}) = O(n^3)$ Upper Bound
We show that $R(F,G) \le 4(|F||G|)^{3/2}$. Proof by induction:
$$
\begin{aligned}
R(F,G) &\le |F||G|^2 + \sum_i R(K_i,G) \\
&\le |F||G|^2 + \sum_i 4(|K_i||G|)^{3/2} && \text{by the inductive assumption} \\
&= |F||G|^2 + 4|G|^{3/2} \sum_i |K_i|^{3/2} \\
&\le |F||G|^2 + 4|G|^{3/2} \sum_i |K_i| \cdot \max_i\{\sqrt{|K_i|}\} \\
&\le |F||G|^2 + 4|G|^{3/2}\, |F| \sqrt{\tfrac{|F|}{2}} && \text{by (*) and (**)} \\
&\le 4(|F||G|)^{3/2} && \text{since } |G| \le |F|
\end{aligned}
$$
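
As a sanity check, the recurrence can be evaluated on small trees and compared with the claimed bound. The sketch below assumes trees are nested tuples `(label, children)` and uses hypothetical names. Note that it evaluates the right-hand side of the recurrence, with the swap from step 1 of the algorithm, and not the number of subproblems an actual implementation would visit.

```python
def subtree_size(tree):
    _, children = tree
    return 1 + sum(subtree_size(c) for c in children)

def hanging_subtrees(tree):
    """Subtrees K_i hanging off the heavy path that starts at the root."""
    hanging = []
    _, children = tree
    while children:
        heavy = max(range(len(children)),
                    key=lambda i: subtree_size(children[i]))
        hanging += [c for i, c in enumerate(children) if i != heavy]
        _, children = children[heavy]   # continue down the heavy path
    return hanging

def recurrence_bound(F, G):
    """Evaluate |F||G|^2 + sum_i bound(K_i, G), swapping so that |F| >= |G|."""
    f, g = subtree_size(F), subtree_size(G)
    if f < g:
        return recurrence_bound(G, F)
    return f * g * g + sum(recurrence_bound(K, G) for K in hanging_subtrees(F))
```

For a five-node tree F = a(b, c(d, e)) and a two-node tree G = x(y), the recurrence evaluates to 24, well below $4(5 \cdot 2)^{3/2} \approx 126$.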
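An $O(nm^2(1 + \log\frac{n}{m}))$ Bound
Proof idea: There are at most log(n/m) nested recursive calls in which F is the “master” before all trees have size ≤ m. Each level of nesting costs $O(nm^2)$ in step 3, which gives the $\log\frac{n}{m}$ factor. For all trees of size ≤ m, use the previous $O(m^3)$ bound. There are at most n/m such trees, so their total is $\frac{n}{m} \cdot O(m^3) = O(nm^2)$.
[Figure: F with the subtrees K1 to K5, next to G.]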
A Matching Lower Bound for all decomposition strategy algorithms
An $\Omega(nm^2)$ lower bound:
[Figure: the two input trees F and G.]
Consider this computational path: if the strategy says left, delete from F; otherwise delete from G.
For every two internal nodes v in F and w in G we get $\min\{|F_v|,|G_w|\}$ new subproblems ($F_v$ is the tree rooted at v).
Summing over all such v, w:
$$\sum_{v,w} \min\{|F_v|,|G_w|\} = \sum_{i=1}^{n/2} \sum_{j=1}^{m/2} \min\{2i,2j\} = \Omega(nm^2)$$
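
The growth of the double sum is easy to check numerically. The sketch below uses a hypothetical helper `comb_sum` and assumes even n ≥ m; the constant 1/12 in the comment is a hand-derived estimate for that case and does not appear on the slides.

```python
def comb_sum(n, m):
    """Sum of min(2i, 2j) over 1 <= i <= n/2 and 1 <= j <= m/2."""
    return sum(min(2 * i, 2 * j)
               for i in range(1, n // 2 + 1)
               for j in range(1, m // 2 + 1))

def ratio(n, m):
    # For even n >= m this stays at or above 1/12 (own estimate),
    # which is what an Omega(n m^2) bound requires.
    return comb_sum(n, m) / (n * m * m)
```

Doubling n with m fixed roughly doubles `comb_sum(n, m)`, while doubling m with n much larger than m roughly quadruples it, matching the $nm^2$ growth.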
An $\Omega(nm^2(1 + \log\frac{n}{m}))$ lower bound
A careful counting argument on:
[Figure: the two input trees F and G.]
Thank you!