Splay trees (Sleator, Tarjan 1983)

72
1 Splay trees (Sleator, Tarjan 1983)

description

Splay trees (Sleator, Tarjan 1983). Main idea. Try to arrange so frequently used items are near the root. We shall assume that there is an item in every node including internal nodes. We can change this assumption so that items are at the leaves. First attempt. - PowerPoint PPT Presentation

Transcript of Splay trees (Sleator, Tarjan 1983)

Page 1: Splay trees (Sleator, Tarjan 1983)

1

Splay trees (Sleator, Tarjan 1983)

Page 2: Splay trees (Sleator, Tarjan 1983)

22

Main idea

• Try to arrange so frequently used items are near the root

• We shall assume that there is an item in every node including internal nodes. We can change this assumption so that items are at the leaves.

Page 3: Splay trees (Sleator, Tarjan 1983)

23

First attempt

Move the accessed item to the root by doing rotations

y

x

B

C

x

Ay

B C

<===>

A

Page 4: Splay trees (Sleator, Tarjan 1983)

24

Move to root (example)

e

d

b

a

c

A

D

E

F

CB

e

d

a

b

c

A

B

E

F

DC

e

d

b

a E

F

DC

c

BA

Page 5: Splay trees (Sleator, Tarjan 1983)

25

Move to root (analysis)

There are arbitrary long access sequences such that the time per access is Ω(n) !

Homework ?

Page 6: Splay trees (Sleator, Tarjan 1983)

26

Splaying

Does rotations bottom up on the access path, but rotations are done in pairs in a way that depends on the structure of the path.

A splay step:

z

y

x

A B

C

D

x

y

z

DC

B

A

==>(1) zig - zig

Page 7: Splay trees (Sleator, Tarjan 1983)

27

Splaying (cont)

z

y

x

B C

A

D

x

z

DC

==>(2) zig - zag

y

BA

y

x

BA

C

x

y

CB

==>(3) zig

A

Page 8: Splay trees (Sleator, Tarjan 1983)

28

Splaying (example)

i

h

H

g

f

e

d

c

b

a

I

J

G

A

B

C

D

E F

i

h

H

g

f

e

d

a

b

c

I

J

G

A

B

F

E

DC

==>

i

h

H

g

f

a

d e

b

c

I

J

G

A

B F

E

DC

==>

Page 9: Splay trees (Sleator, Tarjan 1983)

29

Splaying (example cont)

i

h

H

g

f

a

d e

b

c

I

J

G

A

B F

E

DC

==>i

h

H

a

f g

d e

b

c

I

J

G

A

BF

E

DC

==>

f

d

b

c

A

B

E

DC

a

h

i

I JH

g

e

GF

Page 10: Splay trees (Sleator, Tarjan 1983)

30

Splaying (analysis)

Assume each item i has a positive weight w(i) which is arbitrary but fixed.

Define the size s(x) of a node x in the tree as the sum of the weights of the items in its subtree.

The rank of x: r(x) = log2(s(x))

Measure the splay time by the number of rotations

Page 11: Splay trees (Sleator, Tarjan 1983)

31

Access lemma

The amortized time to splay a node x in a tree with root t is at most 3(r(t) - r(x)) + 1 = O(log(s(t)/s(x)))

This has many consequences:

Potential used: The sum of the ranks of the nodes.

Page 12: Splay trees (Sleator, Tarjan 1983)

32

Balance theorem

Balance Theorem: Accessing m items in an n node splay tree takes O((m+n) log n)

Proof:

Page 13: Splay trees (Sleator, Tarjan 1983)

34

Static optimality theorem

Static optimality theorem: If every item is accessed at least once

then the total access time is O(m + q(i) log (m/q(i)) ) i=1

n

For any item i let q(i) be the total number of time i is accessed

Optimal average access time up to a constant factor.

Page 14: Splay trees (Sleator, Tarjan 1983)

37

Proof of the access lemma

proof. Consider a splay step.

Let s and s’, r and r’ denote the size and the rank function just before and just after the step, respectively. We show that the amortized time of a zig step is at most 3(r’(x) - r(x)) + 1, and that the amortized time of a zig-zig or a zig-zag step is at most 3(r’(x)-r(x))

The lemma then follows by summing up the cost of all splay steps

The amortized time to splay a node x in a tree with root t is at most 3(r(t) - r(x)) + 1 = O(log(s(t)/s(x)))

Page 15: Splay trees (Sleator, Tarjan 1983)

38

Proof of the access lemma (cont)

y

x

BA

C

x

y

CB

==>(3) zig

A

amortized time(zig) = 1 + =

1 + r’(x) + r’(y) - r(x) - r(y)

1 + r’(x) - r(x)

1 + 3(r’(x) - r(x))

Page 16: Splay trees (Sleator, Tarjan 1983)

39

Proof of the access lemma (cont)

amortized time(zig-zig) = 2 + =

2 + r’(x) + r’(y) + r’(z) - r(x) - r(y) - r(z) =

2 + r’(y) + r’(z) - r(x) - r(y)

2 + r’(x) + r’(z) - 2r(x)

2r’(x) - r(x) - r’(z) + r’(x) + r’(z) - 2r(x) = 3(r’(x) - r(x))

z

y

x

A B

C

D

x

y

z

DC

B

A

==>(1) zig - zig

2 -(log(p) + log(q)) = log(1/p) + log(1/q) = log(s’(x)/s(x)) + log(s’(x)/s(z))= r’(x)-r(x) + r’(x)-r(z)

Page 17: Splay trees (Sleator, Tarjan 1983)

40

Proof of the access lemma (cont)

z

y

x

B C

A

D

x

z

DC

==>(2) zig - zag

y

BA

Similar. (do at home)

Page 18: Splay trees (Sleator, Tarjan 1983)

41

Intuition

3

2

1

6

5

4

9

8

7

log( ) log( 1) ....log(1) ( log )n n O n n

Page 19: Splay trees (Sleator, Tarjan 1983)

42

Intuition (Cont)

log(1) log(3) log(7) log(15)... log( ) ( )2 4 8

n n nn n O n

Page 20: Splay trees (Sleator, Tarjan 1983)

43

Intuition

3

2

1

6

5

4

9

8

7

6

5

4

3

2

1

9

8

7

= 0

Page 21: Splay trees (Sleator, Tarjan 1983)

44

6

54

3

2

1

9

8

7

= log(5) – log(3) + log(1) – log(5) = -log(3)

6

5

4

3

2

1

9

8

7

Page 22: Splay trees (Sleator, Tarjan 1983)

45

6

54

3

2

1

9

8

76

54

3

2

1

9

8

7

= log(7) – log(5) + log(1) – log(7) = -log(5)

Page 23: Splay trees (Sleator, Tarjan 1983)

46

6

54

3

2

1

98

7

= log(9) – log(7) + log(1) – log(9) = -log(7)

6

54

3

2

1

9

8

7

Page 24: Splay trees (Sleator, Tarjan 1983)

49

Static finger theorem

Static finger theorem: Let f be an arbitrary fixed item, the total

access time is O(nlog(n) + m + log(|ij-f| + 1))j=1

m

Splay trees support access within the vicinity of any fixed finger as good as finger search trees.

Suppose all items are numbered from 1 to n in symmetric order. Let the sequence of accessed items be i1,i2,....,im

Page 25: Splay trees (Sleator, Tarjan 1983)

50

Working set theorem

Working set theorem: The total access time is

Let t(j), j=1,…,m, denote the # of different items accessed since the last access to item j or since the beginning of the sequence.

1

log( ) log( ( ) 1)m

j

O n n m t j

Proof:

Page 26: Splay trees (Sleator, Tarjan 1983)

52

Application: Data Compression via Splay Trees

Suppose we want to compress text over some alphabet

Prepare a binary tree containing the items of at its leaves.

To encode a symbol x:

•Traverse the path from the root to x spitting 0 when you go left and 1 when you go right.

•Splay at the parent of x and use the new tree to encode the next symbol

Page 27: Splay trees (Sleator, Tarjan 1983)

53

Compression via splay trees (example)

e f g ha b c d

aabg...

000

e f g h

a

b

c d

Page 28: Splay trees (Sleator, Tarjan 1983)

54

Compression via splay trees (example)

e f g ha b c d

aabg...

000

e f g h

a

b

c d

0

Page 29: Splay trees (Sleator, Tarjan 1983)

55

Compression via splay trees (example)

aabg...

0000

e f g h

a

b

c de f g h

c d

a b

10

Page 30: Splay trees (Sleator, Tarjan 1983)

56

Compression via splay trees (example)

aabg...

0000

e f g h

a

b

c de f g h

c d

a b

101110

Page 31: Splay trees (Sleator, Tarjan 1983)

57

Decoding

Symmetric.

The decoder and the encoder must agree on the initial tree.

Page 32: Splay trees (Sleator, Tarjan 1983)

58

Compression via splay trees (analysis)

How compact is this compression ?

Suppose m is the # of characters in the original string

The length of the string we produce is m + (cost of splays)

by the static optimality theorem

m + O(m + q(i) log (m/q(i)) ) =O(m + q(i) log (m/q(i)) )

Recall that the entropy of the sequence q(i) log (m/q(i)) is a lower bound.

Page 33: Splay trees (Sleator, Tarjan 1983)

59

Compression via splay trees (analysis)

In particular the Huffman code of the sequence is at least

q(i) log (m/q(i))

But to construct it you need to know the frequencies in advance

Page 34: Splay trees (Sleator, Tarjan 1983)

60

Compression via splay trees (variations)

D. Jones (88) showed that this technique could be competitive with dynamic Huffman coding (Vitter 87)

Used a variant of splaying called semi-splaying.

Page 35: Splay trees (Sleator, Tarjan 1983)

61

Semi - splaying

z

y

x

A B

C

D

y

z

DC

==>Semi-splay zig - zig

x

A B

z

y

x

A B

C

D

x

y

z

DC

B

A==>

Regular zig - zig

*

*

*

*

Continue splay at y rather than at x.

Page 36: Splay trees (Sleator, Tarjan 1983)

67

Update operations on splay trees

Catenate(T1,T2):

Splay T1 at its largest item, say i.

Attach T2 as the right child of the root.

T1 T2

i

T1T2

i

T1 T2

≤ 3log(W/w(i)) + O(1)

Amortize time: 3(log(s(T1)/s(i)) + 1 + s(T1)

s(T1) + s(T2)log( )

Page 37: Splay trees (Sleator, Tarjan 1983)

68

Update operations on splay trees (cont)

split(i,T):

Assume i T

T

i

Amortized time = 3log(W/w(i)) + O(1)

Splay at i. Return the two trees formed by cutting off the right son of i

i

T1 T2

Page 38: Splay trees (Sleator, Tarjan 1983)

69

Update operations on splay trees (cont)

split(i,T):

What if i T ?

T

i-

Amortized time = 3log(W/min{w(i-),w(i+)}) + O(1)

Splay at the successor or predecessor of i (i- or i+). Return the two trees formed by cutting off the right son of i or the left son of i

i-

T1 T2

Page 39: Splay trees (Sleator, Tarjan 1983)

70

Update operations on splay trees (cont)

insert(i,T):

T1 T2

i

Perform split(i,T) ==> T1,T2

Return the tree

Amortize time:

min{w(i-),w(i+)}W-w(i)

3log( ) + log(W/w(i)) + O(1)

Page 40: Splay trees (Sleator, Tarjan 1983)

71

Update operations on splay trees (cont)

delete(i,T):

T1 T2

i

Splay at i and then return the catenation of the left and right subtrees

Amortize time:

w(i-)W-w(i)

3log( ) + O(1)

T1 T2+

3log(W/w(i)) +

Page 41: Splay trees (Sleator, Tarjan 1983)

73

Open problems

Dynamic optimality conjecture:

Consider any sequence of successful accesses on an n-node search tree. Let A be any algorithm that carries out each access by traversing the path from the root to the node containing the accessed item, at the cost of one plus the depth of the node containing the item, and that between accesses perform rotations anywhere in the tree, at a cost of one per rotation. Then the total time to perform all these accesses by splaying is no more than O(n) plus a constant times the cost of algorithm A.

Page 42: Splay trees (Sleator, Tarjan 1983)

74

Open problems

Dynamic finger conjecture (now theorem)

The total time to perform m successful accesses on an arbitrary n-node splay tree is

O(m + n + (log |ij+1 - ij| + 1))

where the jth access is to item ij

j=1

m

Very complicated proof showed up in SICOMP recently (Cole et al)

Page 43: Splay trees (Sleator, Tarjan 1983)

76

Tango trees(Demaine, Harmon, Iacono, Patrascu 2004)

Page 44: Splay trees (Sleator, Tarjan 1983)

77

Lower bound

A reference tree

Page 45: Splay trees (Sleator, Tarjan 1983)

78

y

Lower bound

A reference tree

Page 46: Splay trees (Sleator, Tarjan 1983)

79

y

Left region of y

Page 47: Splay trees (Sleator, Tarjan 1983)

80

y

Right region of y

Page 48: Splay trees (Sleator, Tarjan 1983)

81

y

IB(σ,y) = #of alternations between accesses to the left region of y and accesses to the right region of y

Page 49: Splay trees (Sleator, Tarjan 1983)

82

IB(σ) = yIB(σ,y)

y

Page 50: Splay trees (Sleator, Tarjan 1983)

83

The lower bound (Wilber 89)

OPT(σ) ≥ ½ IB(σ) - n

Page 51: Splay trees (Sleator, Tarjan 1983)

84

Proof

Page 52: Splay trees (Sleator, Tarjan 1983)

85

Unique transition point

x

z

y

xz

Page 53: Splay trees (Sleator, Tarjan 1983)

86

Transition point exists

y

xz

z1z

z2

l

x1x

x2

r

Page 54: Splay trees (Sleator, Tarjan 1983)

87

y

xz

z1z

z2

lx1

x

x2

rz

Page 55: Splay trees (Sleator, Tarjan 1983)

88

y

xz

z1

z

z2

l

x1xx2

rz

Page 56: Splay trees (Sleator, Tarjan 1983)

89

y

xz

z1

z

z2

l

x1xx2

rz

Transition point does not change if not touched

Page 57: Splay trees (Sleator, Tarjan 1983)

90

y1

xz

z1

z

z2

l1

x1xx2

r1

z is a trasition point for only one node

y2

y1 and y2 have different z’s if unrelated

Page 58: Splay trees (Sleator, Tarjan 1983)

91

y1

x

z

z1

z

z2

l1

x1xx2

r1

z is a trasition point for only one node

y2

If z is not in y2’s subtree we are ok

Page 59: Splay trees (Sleator, Tarjan 1983)

92

y1

x

z

z1

z

z2

l1

x1xx2

r1

z is a trasition point for only one node

y2

Otherwise z is the LCA of everyone in y2’s subtree

So it’s the first among l2 and r2

Page 60: Splay trees (Sleator, Tarjan 1983)

93

OPT(σ) ≥ ½ IB(σ) - n (proof)

• Sum, for every y, how many times the algorithm touched the transition point of y (the deeper of l and r)

• Let σy1, σy2, σy3, σy4, ……., σyp be the interleaving accesses through y

• We must touch l when we access σyj for odd j, and r for even j

• so you touch the transition point unless it switched, but to switch it you also have to access it

Page 61: Splay trees (Sleator, Tarjan 1983)

94

y

xz

Tango trees

A reference tree

Each node has a preferred child

Page 62: Splay trees (Sleator, Tarjan 1983)

95

y

xz

Tango trees (cont)

a

b

c

d

fe

y

bfc d e xz a

A hierarchy of balanced binary trees each corresponds to a blue path

Page 63: Splay trees (Sleator, Tarjan 1983)

96

y

xz

a

b

c

d

fe

y

3 6

4

bfc d e xz a

Each node stores its depth in the blue path and the maximum depth in its subtree

6

Page 64: Splay trees (Sleator, Tarjan 1983)

97

y

xz

a

b

c

d

fe

7

1

2

3 6 5

4

bfc d e xz a

Nodes on the lower part are continuous in key space, can use maximum depth values to find the “interval” containing them

6

Cut

5

Page 65: Splay trees (Sleator, Tarjan 1983)

98

y

xz

a

b

c

d

fe

7

1

2

3 6 5

4

bfc d e xz a

6 (5,1)

Split Split

Page 66: Splay trees (Sleator, Tarjan 1983)

99

y

xz

a

b

c

d

fe

7

1

2

3

6 5

4

b

f

c

de xz

a6

(5,1)

Page 67: Splay trees (Sleator, Tarjan 1983)

100

y

xz

a

b

c

d

fe

4

1

2

3

3 2

1

b

f

c

de xz

a3

(5,1)

need some differential encoding of the depths

Page 68: Splay trees (Sleator, Tarjan 1983)

101

y

xz

a

b

c

d

fe

4

1

2

3

3 2

1

b

f

c

de xz

a3

(5,1)

Catenate

Page 69: Splay trees (Sleator, Tarjan 1983)

102

y

xz

a

b

c

d

fe

4

1

2

3

3 2

1 b

f

c

de

a3

Catenate

xz

(5,1)

Page 70: Splay trees (Sleator, Tarjan 1983)

103

y

xz

a

b

c

d

fe

4

1

2

3

3 2

1 b

f

c

de

a3

Similarly can do join

xz

(5,1)

Page 71: Splay trees (Sleator, Tarjan 1983)

104

y

xz

The algorithm

a

b

c

d

fe

y

bfc d e xz a

Search, and then, bottom-up cut and join so that your tango tree corresponds to the blue edges in the reference tree

Page 72: Splay trees (Sleator, Tarjan 1983)

105

Analysis

• If k edges change from black to blue:

(( 1) log log( ))O k n• Sum over m accesses

( ( ) ) log log( )O IB m n • If m=Ω(n)

( ) log log( )O OPT n