
Self-Adjusting Data Structures

Self-adjusting Structures

Consider the following AVL Tree

          44
        /    \
      17      78
        \    /  \
        32  50   88
           /  \
          48   62

Suppose we want to search for the following sequence of elements: 48, 48, 48, 48, 50, 50, 50, 50, 50.

In this case, is this a good structure?


• So far we have seen:
• BST: binary search trees
– Worst-case running time per operation = O(N)
– Worst-case average running time per operation = O(N)
» Think about inserting a sorted item list
• AVL tree:
– Worst-case running time per operation = O(log N)
– Worst-case average running time per operation = O(log N)
– Does not adapt to skewed distributions


• The structure is updated after each operation
• Consider a binary search tree. If a sequence of insertions produces a leaf at a depth of order n, a sequence of m searches for this element costs Θ(mn) time in total
• Remedy: use a self-adjusting structure

Self-adjusting lists

• Move to Front
– Whenever an element is accessed, it is moved to the front of the list
• Transposition
– Whenever an element x is accessed, we swap x with its predecessor
• Frequency counter
– The list is always kept sorted in decreasing order of access frequency
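The three heuristics can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of a plain Python list, and the convention that an access costs the 1-based position of the element are assumptions, not part of the slides.

```python
def access_mtf(lst, x):
    """Move to Front: pay the position of x, then move x to the front."""
    i = lst.index(x)
    lst.insert(0, lst.pop(i))
    return i + 1                      # assumed cost model: 1-based position


def access_transpose(lst, x):
    """Transposition: pay the position of x, then swap x with its predecessor."""
    i = lst.index(x)
    if i > 0:
        lst[i - 1], lst[i] = lst[i], lst[i - 1]
    return i + 1


def access_freq(lst, counts, x):
    """Frequency counter: pay the position of x, bump its count, and move it
    forward past every element with a strictly smaller count."""
    i = lst.index(x)
    cost = i + 1
    counts[x] = counts.get(x, 0) + 1
    while i > 0 and counts.get(lst[i - 1], 0) < counts[x]:
        lst[i - 1], lst[i] = lst[i], lst[i - 1]
        i -= 1
    return cost


if __name__ == "__main__":
    items = list("abcde")
    print(access_mtf(items, "d"), items)
```

Each function mutates the list in place and returns the access cost, so the total cost of a request sequence is the sum of the returned values.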



• Admissible algorithms
– A method is said to be admissible if, at the i-th access, it moves the accessed element $e_i \ge 0$ positions toward the front of the list
– This class of algorithms includes all the methods presented above.

Analysis of Move to Front

Theorem. Let H be an admissible method and let s be a sequence of m accesses. Then

$\mathrm{Cost}_{MF}(s) \le 2\,\mathrm{Cost}_H(s) - m$,

where $\mathrm{Cost}_{MF}(s)$ and $\mathrm{Cost}_H(s)$ are, respectively, the costs incurred by MF and by H to process the request sequence s.

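Analysis of Move to Front

Proof. We use the potential function method.

$\Phi_i$: the number of inversions of the list maintained by MF relative to the list of algorithm H after the i-th access, that is, the number of pairs of elements that appear in opposite orders in the two lists.

List of H: a b c f e d
List of MF: b a f d e c
Inversions: (b,a), (f,c), (d,e), (d,c), (e,c)
$\Phi_i = 5$

The amortized cost of the i-th access is $c'_i = c_i + \Phi_i - \Phi_{i-1}$, where $c_i$ is its actual cost.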
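The inversion count used as the potential can be checked mechanically. The following is a minimal sketch, assuming both lists hold the same distinct elements; the function name `inversions` is hypothetical.

```python
from itertools import combinations


def inversions(h_list, mf_list):
    """Return the pairs (a, b) where a precedes b in MF's list
    but a follows b in H's list."""
    pos_h = {x: i for i, x in enumerate(h_list)}
    # combinations() yields pairs in the order they appear in mf_list.
    return [(a, b) for a, b in combinations(mf_list, 2) if pos_h[a] > pos_h[b]]


if __name__ == "__main__":
    pairs = inversions(list("abcfed"), list("bafdec"))
    print(len(pairs), pairs)
```

The quadratic enumeration is fine for small illustrative lists; it is not meant as an efficient inversion counter.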

Analysis of Move to Front

Proof (continued). Suppose element x is accessed.
• x is the k-th element of H's list
• x is the j-th element of MF's list
• p: the number of elements that precede x in MF's list and follow x in H's list

• When MF moves x to the first position, $j-1-p$ inversions are created and p are destroyed
• When H moves x forward $e_i$ positions, $e_i$ inversions are destroyed and none is created

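Analysis of Move to Front

Proof (continued). From the previous analysis, with actual cost $c_i = j$,

$c'_i = j + (j-1-p) - (p + e_i) = 2(j-1-p) - e_i + 1$

We have $j-1-p \le k-1$, since the number of inversions created is bounded by the position of x in H's list. Hence

$c'_i \le 2(k-1) - e_i + 1 \le 2k - 1$

Summing $c'_i$ over the m accesses, and writing $k_i$ for the position in H's list of the element requested at the i-th access (so that $\mathrm{Cost}_H(s) \ge \sum_{i} k_i$):

$\sum_{i=1}^{m} c'_i = \mathrm{Cost}_{MF}(s) + \Phi_m - \Phi_0 \le \sum_{i=1}^{m} (2k_i - 1) \le 2\,\mathrm{Cost}_H(s) - m$

It follows that, assuming both methods start from the same list ($\Phi_0 = 0$) and using $\Phi_m \ge 0$,

$\mathrm{Cost}_{MF}(s) \le 2\,\mathrm{Cost}_H(s) - m$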

Bad Sequences for Transposition and Frequency Counter

• Transposition
– We insert n elements $a_1,\dots,a_n$ and then access the element $a_n$ n times.
– Transposition pays $\Theta(n^2)$ while MF pays $O(n)$
• Frequency Counter
– List with elements $(a_n,\dots,a_1)$. We access $a_n$ n times, $a_{n-1}$ (n-1) times and so on. FC does not reorganize the list.
– FC pays $\Theta(n^3)$ while MF pays $O(n^2)$
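The gap on the Transposition sequence can be observed with a small simulation. This sketch assumes that an access costs the 1-based position of the element and that the list initially holds $a_1,\dots,a_n$ in insertion order; the function name and the `rule` parameter are hypothetical.

```python
def total_cost(n, rule):
    """Access the last inserted element n times under the given rule
    ("mtf" or "transpose") and return the total access cost."""
    lst = list(range(1, n + 1))       # a_1 ... a_n, with a_n at the back
    cost = 0
    for _ in range(n):
        i = lst.index(n)
        cost += i + 1                 # assumed cost model: 1-based position
        if rule == "mtf":
            lst.insert(0, lst.pop(i))
        elif rule == "transpose" and i > 0:
            lst[i - 1], lst[i] = lst[i], lst[i - 1]
    return cost


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, total_cost(n, "transpose"), total_cost(n, "mtf"))
```

Under Transposition the element advances one position per access, so the costs are n, n-1, ..., 1; under MF only the first access is expensive.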

Further Reading

Randomization avoids malicious adversaries


Splay Trees (Tarjan and Sleator 1985)
• Binary search tree.
• Every accessed node is brought to the root
• Adapts to the access probability distribution

Splay trees: Basic Idea

• Try to make the worst-case situation occur less frequently.
• In a binary search tree, the worst-case situation can occur with every operation (for example, after inserting a sorted item list).
• In a splay tree, when a worst-case situation occurs for an operation:
– The tree is restructured (during or after the operation), so that subsequent operations do not cause the worst-case situation to occur again.

Splay trees: Basic idea

• The basic idea of splay tree is:

• After a node is accessed, it is pushed to the root by a series of AVL tree-like operations (rotations).

• For most applications, when a node is accessed, it is likely that it will be accessed again in the near future (principle of locality).

A first attempt

• A simple idea
– When a node k is accessed, push it towards the root by the following algorithm:
• On the path from k to the root:
– Do a single rotation between node k's parent and node k itself.

[Figures: a tree with nodes k1 to k5 and subtrees A to F, with k1 at the bottom of the access path. The panels show the tree when node k1 is accessed, after the rotation between k2 and k1, and after each further single rotation that moves k1 one level up.]

k1 is now root

But k3 is nearly as deep as k1 was. An access to k3 will push some other node nearly as deep as k3 is.

So, this method does not work ...
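The weakness of this first attempt can be seen on a small experiment. The sketch below is an illustration under stated assumptions, not the slides' code: the `Node` class, the helper names, and the choice of a left chain accessed in the order 1, 2, ..., n are all hypothetical. It counts the single rotations needed to bring each accessed node to the root.

```python
class Node:
    """Binary search tree node with a parent pointer."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate(x):
    """Single rotation that lifts x above its parent, keeping BST order."""
    y, g = x.parent, x.parent.parent
    if y.left is x:                   # x is a left child: rotate right
        y.left, x.right = x.right, y
        if y.left is not None:
            y.left.parent = y
    else:                             # x is a right child: rotate left
        y.right, x.left = x.left, y
        if y.right is not None:
            y.right.parent = y
    y.parent, x.parent = x, g
    if g is not None:                 # re-attach x where y used to hang
        if g.left is y:
            g.left = x
        else:
            g.right = x


def rotate_to_root(x):
    """The 'first attempt': rotate x with its parent until x is the root."""
    count = 0
    while x.parent is not None:
        rotate(x)
        count += 1
    return count


def sequential_access_cost(n):
    """Build the left chain n, n-1, ..., 1 and access 1, 2, ..., n in order."""
    nodes = {k: Node(k) for k in range(1, n + 1)}
    for k in range(2, n + 1):
        nodes[k].left, nodes[k - 1].parent = nodes[k - 1], nodes[k]
    return sum(rotate_to_root(nodes[k]) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, sequential_access_cost(n))
```

On this input the number of rotations grows quadratically in n, so the average cost per access stays linear even though every accessed node reaches the root.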


Splaying - algorithm

• Assume we access a node.
• We will splay along the path from the accessed node to the root.
• At every splay step:
– We will selectively rotate the tree.
– The rotation chosen depends on the structure of the tree around the node at which the rotation is performed

Implementing Splay(x, S)

• Do the following operations until x is root.
– ZIG: If x has a parent but no grandparent, then rotate(x).
– ZIG-ZIG: If x has a parent y and a grandparent, and if x and y are both left children or both right children, then rotate(y) followed by rotate(x).
– ZIG-ZAG: If x has a parent y and a grandparent, and if one of x, y is a left child and the other is a right child, then rotate(x) twice.
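The three cases can be sketched as a bottom-up splay with parent pointers. This is a minimal sketch rather than a reference implementation: the `Node` class and helper names are assumptions, and the rotation order follows the usual formulation in which ZIG-ZIG rotates the parent first and ZIG-ZAG rotates x twice.

```python
class Node:
    """Binary search tree node with a parent pointer."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate(x):
    """Single rotation that lifts x above its parent, keeping BST order."""
    y, g = x.parent, x.parent.parent
    if y.left is x:                   # x is a left child: rotate right
        y.left, x.right = x.right, y
        if y.left is not None:
            y.left.parent = y
    else:                             # x is a right child: rotate left
        y.right, x.left = x.left, y
        if y.right is not None:
            y.right.parent = y
    y.parent, x.parent = x, g
    if g is not None:                 # re-attach x where y used to hang
        if g.left is y:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Bring x to the root with ZIG, ZIG-ZIG and ZIG-ZAG steps; return x."""
    while x.parent is not None:
        y, g = x.parent, x.parent.parent
        if g is None:
            rotate(x)                 # ZIG: parent is the root
        elif (g.left is y) == (y.left is x):
            rotate(y)                 # ZIG-ZIG: same side, parent first
            rotate(x)
        else:
            rotate(x)                 # ZIG-ZAG: opposite sides, x twice
            rotate(x)
    return x


def inorder(t):
    """Keys in sorted order; used to check that the BST property survives."""
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

Splaying changes only the shape of the tree: the in-order key sequence is the same before and after, which is a convenient sanity check when experimenting with the sketch.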


[Figure: ZIG(x). Before: y is the root with left child x and right subtree C; x has subtrees A and B. After: x is the root with left subtree A and right child y; y has subtrees B and C.]

[Figure: ZAG(x). The single rotation in the opposite direction, also applied at the root.]

[Figure: ZIG-ZIG. Before: z has left child y and right subtree D; y has left child x and right subtree C; x has subtrees A and B. After: x is on top with left subtree A and right child y; y has left subtree B and right child z; z has subtrees C and D.]

[Figure: ZIG-ZAG. Before: z has left subtree A and right child y; y has left child x and right subtree D; x has subtrees B and C. After: x is on top with left child z and right child y; z has subtrees A and B; y has subtrees C and D.]

Splay Example

Apply Splay(1, S) to tree S, a chain whose nodes from the root down are 10, 9, 8, 7, 6, 5, 4, 3, 2, 1.

[Figures: four successive ZIG-ZIG steps followed by one ZIG step bring node 1 to the root.]

Apply Splay(2, S) to the resulting tree S:

[Figure: the tree before and after Splay(2); node 2 ends at the root.]

Splay Tree Analysis

• Define the potential function
• Associate a positive weight w(v) to each node v
• $W(v) = \sum_{y} w(y)$, where y ranges over the nodes of the subtree rooted at v
• $\mathrm{rank}(v) = \log W(v)$
• The tree potential is $\Phi = \sum_{v} \mathrm{rank}(v)$
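To make these definitions concrete, the potential of a small tree can be computed directly. The sketch below assumes base-2 logarithms, unit weights by default, and a tree encoded as nested tuples `(left, key, right)` with `None` for an empty subtree; all of these are illustrative choices.

```python
import math


def potential(tree, weight=lambda key: 1.0):
    """Return (W, Phi) for a tree given as None or (left, key, right):
    W is the total weight of the subtree and Phi the sum of log2 W(v)
    over all of its nodes."""
    if tree is None:
        return 0.0, 0.0
    left, key, right = tree
    w_left, phi_left = potential(left, weight)
    w_right, phi_right = potential(right, weight)
    w_total = w_left + weight(key) + w_right
    return w_total, phi_left + phi_right + math.log2(w_total)


if __name__ == "__main__":
    chain = (((None, 1, None), 2, None), 3, None)        # left chain 3 -> 2 -> 1
    balanced = ((None, 1, None), 2, (None, 3, None))     # root 2 with two leaves
    print(potential(chain)[1], potential(balanced)[1])
```

With unit weights the chain has a larger potential than the balanced tree on the same keys, which matches the intuition that unbalanced shapes store "energy" that later splay steps can spend.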

Upper bound for the amortized time of a complete splay operation

• To estimate the time of a splay operation we use the number of rotations

Lemma: The amortized time for a complete splay operation of a node x in a tree of root r is at most

$1 + 3[\mathrm{rank}(r) - \mathrm{rank}(x)]$

where rank(x) is the rank of x before the splay and rank(r) is the rank of r after the splay.

Proof: The amortized cost a is given by $a = t + \Phi_{\text{after}} - \Phi_{\text{before}}$

t : number of rotations executed in the splaying

$a = o_1 + o_2 + o_3 + \dots + o_k$

$o_i$ : amortized cost of the i-th operation during the splay (zig, zig-zig or zig-zag)

$\Phi_i$ : potential function after the i-th operation

$\mathrm{rank}_i$ : rank after the i-th operation

$o_i = t_i + \Phi_i - \Phi_{i-1}$

Splay Tree Analysis

• Operations

– Case 1: zig( zag)

– Case 2: zig-zig (zag-zag)

– Case 3: zig-zag (zag-zig)

Splay Tree Analysis

– Case 1: Only one rotation (zig)

[Figure: ZIG(x), w.l.o.g. with x the left child of the root r. Before: x has subtrees A and B, and r has right subtree C. After: x is the root with left subtree A and right child r, which has subtrees B and C.]

After the operation only rank(x) and rank(r) change.

– Since the potential is the sum of all ranks:

$\Phi_i - \Phi_{i-1} = \mathrm{rank}_i(r) + \mathrm{rank}_i(x) - \mathrm{rank}_{i-1}(r) - \mathrm{rank}_{i-1}(x)$

$t_i = 1$ (time of one rotation)

Amort. Complexity:

$o_i = 1 + \mathrm{rank}_i(r) + \mathrm{rank}_i(x) - \mathrm{rank}_{i-1}(r) - \mathrm{rank}_{i-1}(x)$

From the figure, $\mathrm{rank}_{i-1}(r) \ge \mathrm{rank}_i(r)$ and $\mathrm{rank}_i(x) \ge \mathrm{rank}_{i-1}(x)$, so

$o_i \le 1 + \mathrm{rank}_i(x) - \mathrm{rank}_{i-1}(x) \le 1 + 3[\mathrm{rank}_i(x) - \mathrm{rank}_{i-1}(x)]$


Splay Tree Analysis

– Case 2: Zig-Zig

[Figure: ZIG-ZIG. Before: z has left child y and right subtree D; y has left child x and right subtree C; x has subtrees A and B. After: x is on top with left subtree A and right child y; y has left subtree B and right child z; z has subtrees C and D.]

Two rotations are performed, and only the ranks of x, y and z change:

$o_i = 2 + \mathrm{rank}_i(x) + \mathrm{rank}_i(y) + \mathrm{rank}_i(z) - \mathrm{rank}_{i-1}(x) - \mathrm{rank}_{i-1}(y) - \mathrm{rank}_{i-1}(z)$

Since $\mathrm{rank}_{i-1}(z) = \mathrm{rank}_i(x)$,

$o_i = 2 + \mathrm{rank}_i(y) + \mathrm{rank}_i(z) - \mathrm{rank}_{i-1}(x) - \mathrm{rank}_{i-1}(y)$

Since $\mathrm{rank}_i(x) \ge \mathrm{rank}_i(y)$ and $\mathrm{rank}_{i-1}(y) \ge \mathrm{rank}_{i-1}(x)$,

$o_i \le 2 + \mathrm{rank}_i(x) + \mathrm{rank}_i(z) - 2\,\mathrm{rank}_{i-1}(x)$

We must show that $2 + \mathrm{rank}_i(x) + \mathrm{rank}_i(z) - 2\,\mathrm{rank}_{i-1}(x) \le 3[\mathrm{rank}_i(x) - \mathrm{rank}_{i-1}(x)]$, that is,

$\mathrm{rank}_{i-1}(x) + \mathrm{rank}_i(z) - 2\,\mathrm{rank}_i(x) \le -2$

So we study the behavior of the function $\mathrm{rank}_{i-1}(x) + \mathrm{rank}_i(z) - 2\,\mathrm{rank}_i(x)$. We have

$\mathrm{rank}_{i-1}(x) + \mathrm{rank}_i(z) - 2\,\mathrm{rank}_i(x) = \log\frac{W_{i-1}(x)}{W_i(x)} + \log\frac{W_i(z)}{W_i(x)}$

Examining the tree, we see that

$\frac{W_{i-1}(x)}{W_i(x)} + \frac{W_i(z)}{W_i(x)} \le 1$

Therefore the sum of the two logarithms is at most the maximum of $\log a + \log b$ subject to $a + b \le 1$, $a, b > 0$. By the concavity of the log function, this maximum is attained at $a = b = 1/2$, where (with base-2 logarithms) it equals $-2$. Hence

$o_i \le 3[\mathrm{rank}_i(x) - \mathrm{rank}_{i-1}(x)]$

Splay Tree Analysis

– Case 3: Zig-Zag (analysis similar to case 2)

$o_i \le 3[\mathrm{rank}_i(x) - \mathrm{rank}_{i-1}(x)]$

Putting the three cases together and telescoping (the zig case, with its extra +1, occurs at most once, as the last step):

$a = o_1 + o_2 + \dots + o_k \le 3[\mathrm{rank}(r) - \mathrm{rank}(x)] + 1$

Splay Tree Analysis

For proving different types of results we must set the weights accordingly.

Theorem. The cost of m accesses is $O(m \log n + n \log n)$, where n is the number of items in the tree.

Proof:
• Define every weight as 1/n.
• Then rank(r) = 0 and every rank is at least $-\log n$, so the amortized cost of an access is at most $3 \log n + 1$.
• Each of the n ranks lies between $-\log n$ and 0, so the potential variation is at most $n \log n$.
• Thus, by summing over all accesses, we conclude that the cost is at most $3m \log n + n \log n + m$.

Static Optimality Theorem

Theorem: Let q(i) be the number of accesses to item i. If every item is accessed at least once, then the total cost is at most

$O\left(m + \sum_{i=1}^{n} q(i) \log\frac{m}{q(i)}\right)$

Proof. Assign a weight of q(i)/m to item i. Then,

• rank(r) = 0 and $\mathrm{rank}(i) \ge \log(q(i)/m)$

Thus, $3[\mathrm{rank}(r) - \mathrm{rank}(i)] + 1 \le 3\log(m/q(i)) + 1$. In addition,

$|\Delta\Phi| \le \sum_{i=1}^{n} \log(m/q(i))$

Thus, the overall cost is

$O\left(m + \sum_{i=1}^{n} q(i) \log\frac{m}{q(i)}\right)$

Theorem: The cost of an optimal static binary search tree is

$\Omega\left(m + \sum_{i=1}^{n} q(i) \log\frac{m}{q(i)}\right)$

Static Finger Theorem

Theorem: Let 1,...,n be the items in the splay tree. Let the sequence of accesses be $i_1,\dots,i_m$. If f is a fixed item, the total access time is

$O\left(n \log n + m + \sum_{j=1}^{m} \log(|i_j - f| + 1)\right)$

Proof. Assign a weight $1/(|i - f| + 1)^2$ to item i. Then,

• rank(r) = O(1).
• $\mathrm{rank}(r) - \mathrm{rank}(i_j) = O(\log(|i_j - f| + 1))$

Since the weight of every item is at least $1/n^2$, then

$|\Delta\Phi| = O(n \log n)$

Working Set Theorem

Theorem: Let 1,...,n be the items in the splay tree. Let the accesses be numbered 1,...,m. Let i(j) be the item accessed at the j-th access and let t(j) be the number of distinct items accessed since the previous access to i(j). Then the total access time is

$O\left(n \log n + m + \sum_{j=1}^{m} \log(t(j) + 1)\right)$

Dynamic Optimality Conjecture

Conjecture Consider any sequence of successful accesses on an n-node search tree. Let A be any algorithm that carries out each access by traversing the path from the root to the node containing the accessed item, at a cost of one plus the depth of the node containing the item, and that between accesses performs an arbitrary number of rotations anywhere in the tree, at a cost of one per rotation. Then the total time to perform all the accesses by splaying is no more than O(n) plus a constant times the time required by the algorithm.

Dynamic Optimality Conjecture: best attempt

Tango Trees: O(log log n) competitive ratio

E. Demaine, D. Harmon, J. Iacono, and M. Patrascu. Dynamic optimality - almost. In Foundations of Computer Science (FOCS), 2004.

Insertion and Deletion

Most of the theorems hold !

Paris Kanellakis Theory and Practice Award 1999

Splay Tree Data Structure

Daniel D.K. Sleator and Robert E. Tarjan

Citation: For their invention of the widely-used "Splay Tree" data structure.

Amortized running time

• Ordinary complexity: determination of worst-case complexity. Examines each operation individually
• Amortized complexity: analyses the average complexity of each operation over a worst-case sequence of operations.

Amortized Analysis: Physics Approach

• It can be seen as an analogy to the concept of potential energy
• A potential function $\Phi$ maps any configuration E of the structure to a real number $\Phi(E)$, called the potential of E.
• It can be used to bound the costs of the operations to be done in the future

Amortized cost of an operation

$a = t + \Phi(E') - \Phi(E)$

• t: real time of the operation
• E': structure configuration after the operation
• E: structure configuration before the operation

Amortized cost of a sequence of m operations

With $a_i = t_i + \Phi_i - \Phi_{i-1}$ for the i-th operation,

$\sum_{i=1}^{m} t_i = \sum_{i=1}^{m} (a_i - \Phi_i + \Phi_{i-1}) = \Phi_0 - \Phi_m + \sum_{i=1}^{m} a_i$ (by telescoping)

The total real time does not depend on the intermediate potentials.

If the final potential is greater than or equal to the initial one, then the amortized complexity can be used as an upper bound to estimate the total real time.

Amortized running time

• Definition: For a series of M consecutive operations:
– If the total running time is O(M*f(N)), we say that the amortized running time (per operation) is O(f(N)).
• Using this definition:
– A splay tree has O(log N) amortized cost (running time) per operation.


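[Figure: the ZIG-ZIG step drawn as two successive rotations. Starting from the chain z, y, x with subtrees A, B, C, D, the first rotation makes y the parent of both x and z; the second makes x the top node, with y below it and z below y.]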
