Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP...

205
Balanced Trees Part Two

Transcript of Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP...

Page 1: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Balanced TreesPart Two

Page 2: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Outline for Today

● Red/Black Trees● Using our isometry!

● Tree Rotations● A key primitive in restructuring trees.

● Augmented Binary Search Trees● Leveraging red/black trees.

Page 3: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Recap from Last Time

Page 4: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

2-3-4 Trees

1 2 4 6 7 8 10 12 14 15 17 18 19 21 22 24 26

3 9 11 16 20 25

5 13 23

● A 2-3-4 tree is a multiway search tree where

● every node has 1, 2, or 3 keys,● any non-leaf node with k keys has exactly k+1 children, and● all leaves are at the same depth.

● To insert a key, place it in a leaf. If out of space, split the leaf and kick the median key one level higher, repeating this process.

Page 5: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Red/Black Trees● A red/black tree is a BST with

the following properties:● Every node is either red or black.● The root is black.● No red node has a red child.● Every root-null path in the tree

passes through the same number of black nodes.

After we hoist red nodes into their parents:

Each “meta node” has 1, 2, or 3 keys in it. (No red node has a red child.)

Each “meta node” is either a leaf or has one more key than node. (Root-null path property.)

Each “meta leaf” is at the same depth. (Root-null path property.)

7

3

5 11

1

2 4

6

8

9

10

12

Page 6: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Red/Black Trees● A red/black tree is a BST with

the following properties:● Every node is either red or black.● The root is black.● No red node has a red child.● Every root-null path in the tree

passes through the same number of black nodes.

● After we hoist red nodes into their parents:● Each “meta node” has 1, 2, or 3

keys in it. (No red node has a red child.)

● Each “meta node” is either a leaf or has one more child than key. (Root-null path property.)

● Each “meta leaf” is at the same depth. (Root-null path property.)

7

3 5 11

1 2 4 6 8 9 10 12

This is a2-3-4 tree!

This is a2-3-4 tree!

Page 7: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

New Stuff!

Page 8: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Data Structure Isometries

● Red/black trees are an isometry of 2-3-4 trees; they represent the structure of2-3-4 trees in a different way.

● That gives us some really easy theorems basically for free.

● Theorem: The maximum height of a red/black tree with n nodes is O(log n).

● Proof idea: Pulling red nodes into their parents forms a 2-3-4 tree with n keys, which has height O(log n). Undoing this at most doubles the height of the tree. ■-ish

Page 9: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Data Structure Isometries

● Red/black trees are an isometry of 2-3-4 trees; they represent the structure of2-3-4 trees in a different way.

● That gives us some really easy theorems basically for free.

● Theorem: The maximum height of a red/black tree with n nodes is O(log n).

● Proof idea: Pulling red nodes into their parents forms a 2-3-4 tree with n keys, which has height O(log n). Undoing this at most doubles the height of the tree. ■-ish

Explain why, using the isometry.

Formulate an answer now, but don’t post anything in chat

just yet.

Explain why, using the isometry.

Formulate an answer now, but don’t post anything in chat

just yet.

Page 10: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Data Structure Isometries

● Red/black trees are an isometry of 2-3-4 trees; they represent the structure of2-3-4 trees in a different way.

● That gives us some really easy theorems basically for free.

● Theorem: The maximum height of a red/black tree with n nodes is O(log n).

● Proof idea: Pulling red nodes into their parents forms a 2-3-4 tree with n keys, which has height O(log n). Undoing this at most doubles the height of the tree. ■-ish

Explain why, using the isometry.

Now, private chat me your best explanation. Not sure? Just

answer “??”

Explain why, using the isometry.

Now, private chat me your best explanation. Not sure? Just

answer “??”

Page 11: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Data Structure Isometries

● Red/black trees are an isometry of 2-3-4 trees; they represent the structure of2-3-4 trees in a different way.

● That gives us some really easy theorems basically for free.

● Theorem: The maximum height of a red/black tree with n nodes is O(log n).

● Proof idea: Pulling red nodes into their parents forms a 2-3-4 tree with n keys, which has height O(log n). Undoing this at most doubles the height of the tree. ■-ish

Page 12: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Exploring the Isometry

● Nodes in a 2-3-4 tree are classified into types based on the number of children they can have.● 2-nodes have one key (two children).● 3-nodes have two keys (three children).● 4-nodes have three keys (four children).

● How might these nodes be represented?

Page 13: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Exploring the Isometry

k₁

k₁ k₂

k₁ k₂ k₃

k₁

k₁

k₂ k₁

k₂

k₁ k₃

k₂

Page 14: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3 31

7 23

17

13

7 19

3 13 17 23 31

add5

add5

Page 15: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3 31

7 23

17

13

7 19

3 13 17 23 31

add5

add5

7 19

3 13 17 23 31

Page 16: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3 31

7 23

17

13

7 19

3 13 17 23 31

7 19

3 13 17 23 315

19

3 31

7 23

17

13

5

add5

add5

Page 17: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

7 19

3 13 17 23 31

7 19

3 13 17 23 315

19

31

7 23

17

13

add5

add5

3

5

3

Page 18: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7 23

17

133

5

Goal

3

Page 19: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7 23

17

133

5

Goal

3

5

Page 20: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7 23

17

133

5

Goal

3

5

Page 21: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7 23

17

133

5

Goal

3

5

Page 22: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3 31

7 23

17

13

5

Page 23: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3 31

7 23

17

13

5

add21

add21

7 19

3 13 17 23 315

Page 24: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3 31

7 23

17

13

5

add21

add21

7 19

3 13 17 23 315

7 19

3 13 17 23 315

Page 25: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3 31

7 23

17

13

5

add21

add21

7 19

3 13 17 23 315

7 19

3 13 17 23 315 21

19

3 31

7 23

17

13

5

21

Page 26: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3

7

17

13

5

add21

add21

7 19

3 13 17 23 315

7 19

3 13 17 23 315 21

19

3

7

17

13

5

21

31

23

31

23

Page 27: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3

7

17

13

5

19

3

7

17

13

5

21

Goal

31

23

31

23

Page 28: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3

7

17

13

5

19

3

7

17

13

5

Goal

31

23

21

31

23

21

Page 29: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3

7

17

13

5

19

3

7

17

13

5

23

21

Goal

31

23

21

31

Page 30: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

3

7

17

13

5

19

3

7

17

13

5

23

21

Goal

31

23

21

31

Page 31: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Red/Black Tree Insertion

● Rule #1: When inserting a node, if its parent is black, make the node red and stop.

● Justification: This simulates inserting a key into an existing 2-node or 3-node.

Page 32: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

add4

add4

7 19

3 13 17 23 315 21

21

7 19

3 13 17 23 315 213

5

Page 33: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

add4

add4

7 19

3 13 17 23 315

7 19

3 13 17 23 315 214

19

31

7

17

13

23

214

53

2121

3

5

Page 34: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

add4

add4

7 19

3 13 17 23 315

7 19

3 13 17 23 315 21

19

31

7

17

13

23

214

4

53

2121

3

5

Page 35: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7

17

13

23

214

53

213

5

Goal

Page 36: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7

17

13

23

214

53

213

5

4

Goal

We need to move nodes around in a binary search tree. How do we do this?

We need to move nodes around in a binary search tree. How do we do this?

Page 37: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

A B

>B<A >A<B

A

B

<A

>A<B

>B

A

B

<A >A<B

>B

Tree Rotations

Page 38: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

A B

>B<A >A<B

Tree Rotations

A

B

<A

>A<B

>B

A

B

<A >A<B

>B

Page 39: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

AB

<A >A<B

>B

AB

<A

>A<B

>B

AB

<A >A<B

>BRotateRight

RotateLeft

AB

<A

>A<B

>B

Page 40: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

3

5

4

3 5

4

3

5

4

3 5

4

applyrotation

changecolors

apply rotation

This applies any time we're inserting a new node into

the middle of a “3-node” in this pattern.

By making observations like these, we can determine

how to update a red/black tree after an insertion.

This applies any time we're inserting a new node into

the middle of a “3-node” in this pattern.

By making observations like these, we can determine

how to update a red/black tree after an insertion.

Page 41: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7

17

13

23

214

53

213

5

4

Goal

Page 42: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7

17

13

23

214

53

213

4

5

Goal

Page 43: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7

17

13

23

214

53

214

5

Goal

3

Page 44: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7

17

13

23

214

53

214

5

Goal

3

Page 45: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13

19

31

7

17

13

23

214

53

214

5

Goal

3

Page 46: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13 214

53

Page 47: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13 214

53

15

Page 48: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

17

13 214

53

15

Page 49: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

15

13 214

53

17

Page 50: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

15 214

53 1713

Page 51: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

15 214

53 1713

Page 52: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

15 214

53 1713

7 19

3 17 23 315 214 13 15

add16

add16

7 19

3 17 23 315 214 13 15

Page 53: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

7 19

3 17 23 315 21

19

31

7 23

15 214

53 1713

4 13 15

7 19

3 17 23 315 214 13 15 16

add16

add16

Page 54: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

7 19

3 17 23 315 21

19

31

7 23

15 214

53 1713

4 13 15

7 19

3 17 23 315 214 13

15

16

add16

add16

Page 55: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

7 19

3 17 23 315 21

19

31

7 23

15 214

53 1713

4 13 15

7 19

3 17 23 315 214 13

15

16

add16

add16

Page 56: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

7 19

3 17 23 315 21

19

31

7 23

15 214

53 1713

4 13 15

7 19

3 17 23 315 214 13

15

16

add16

add16

19

31

7 23

15 214

53 1713

16

Page 57: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

7 19

3 17 23 315 21

19

31

7 23

214

53

4 13 15

7 19

3 17 23 315 214 13

15

16

add16

19

31

7 23

214

53

15

1713

16

add16

15

1713

Page 58: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

214

53

19

31

7 23

214

53

15

1713

16

15

1713

Page 59: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

214

53

19

31

7 23

214

53

15

1713

16

15

1713

16

Page 60: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

214

53

19

31

7 23

214

53

15

1713

16

15

1713

16

Page 61: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

19

31

7 23

214

53

15

1713

16

Page 62: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

31

23

214

53 1713

16

19

7

15

Page 63: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

31

23

214

53 1713

16

19

7

15

Page 64: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

13

17

16

31

23

21

4

53

19

7

15

Page 65: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

13

17

16

31

23

21

4

53

19

7

15

Page 66: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

17

16 31

23

21

134

53

197

15

Page 67: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

17

16 31

23

21

134

53

197

15

Page 68: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

17

16 31

23

21

134

53

197

15

Page 69: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Building Up Rules

● The complex rules on red/black trees make perfect sense if you connect it back to 2-3-4 trees.

● There are lots of cases to consider because there are many different ways you can insert into a red/black tree.

● Main point: Simulating the insertion of a key into a node takes time O(1) in all cases. Therefore, since 2-3-4 trees support O(log n) insertions, red/black trees support O(log n) insertions.

● The same is true of deletions.

Page 70: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

My Advice

● Do know how to do B-tree insertions and searches.

● You can derive these easily if you remember to split nodes.

● Do remember the rules for red/black trees and B-trees.● These are useful for proving bounds and deriving results.

● Do remember the isometry between red/black trees and 2-3-4 trees.

● Gives immediate intuition for all the red/black tree operations.

● Don't memorize the red/black rotations and color flips.

● This is rarely useful. If you're coding up a red/black tree, just flip open CLRS and translate the pseudocode. ☺

Page 71: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Problems

Page 72: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Classical Algorithms

● The “classical” algorithms model goes something like this:

Given some input X, compute some interesting function f(X).

● The input X is provided up front, and only a single answer is produced.

time

Input Xprovided

Input Xprovided

Output f(X) computed

Output f(X) computed

Page 73: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Problems

● Dynamic versions of problems are framed like this:

Given an input X that can change in fixed ways, maintain X while being able to compute f(X)

efficiently at any point in time.● These problems are typically harder to solve

efficiently than the “classical” static versions.

time

Input Xprovided

Input Xprovided

f(X) computed

f(X) computed

X updated

X updated

X updated

X updated

f(X) computed

f(X) computed

X updated

X updated

Page 74: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

● The selection problem is the following:

Given a list of distinct values and a number k, return the kth-smallest value.

● In the static case, where the data set is fixed in advance and k is known, we can solve this in time O(n) using quickselect or the median-of-medians algorithm.

● Goal: Solve this problem efficiently when the data set is changing – that is, the underlying set of elements can have insertions and deletions intermixed with queries.

31 41 59 26 53 58 79

Page 75: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

15

Page 76: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

9

Page 77: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

9

Page 78: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

9

Page 79: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

9

Page 80: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

9

Page 81: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

99

Page 82: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

994

Page 83: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

994

Page 84: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

5

6

7

8

9

11

1094

Page 85: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

5

6

7

8

9

11

1094

Page 86: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

5

6

7

8

9

11

1094

Problem: After inserting a new value, we may have to

update Θ(n) values.

Problem: After inserting a new value, we may have to

update Θ(n) values.

This is inherent in this solution route. These numbers track global properties of the tree.

This is inherent in this solution route. These numbers track global properties of the tree.

Page 87: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

15

Page 88: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

9

Page 89: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

6

7

8

10

9

Page 90: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

0

1

2

4

3

Page 91: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

0

1

2

4

3

If new nodes are added to the the left subtree, the numbers on the right don’t need to update.

If new nodes are added to the the left subtree, the numbers on the right don’t need to update.

Page 92: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

0

1

2

4

3

Page 93: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

0

1

2

4

3

Page 94: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

0

1

2

1

0

Page 95: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

2

3

4

5

0

1

2

1

0

Page 96: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

Page 97: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

Mechanically: Number each key so that it only stores

its order statistic in the subtree rooted at itself.

Mechanically: Number each key so that it only stores

its order statistic in the subtree rooted at itself.

Operationally: Annotate each key with the number of

keys in its left subtree.

Operationally: Annotate each key with the number of

keys in its left subtree.

Page 98: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

Page 99: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

1?

Page 100: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

1?

Page 101: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

1?

Page 102: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

Page 103: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

Page 104: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

9?

Page 105: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

9 – 5 – 1?

Page 106: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

3?

Page 107: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

3 – 2 – 1?

Page 108: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

0?

Page 109: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

0?

Page 110: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

Page 111: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

0

Page 112: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

09

Page 113: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

090

Page 114: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

090

Page 115: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

090

We only update values on nodes that gained a new key in their left subtree. And there are only O(log n) of these!

We only update values on nodes that gained a new key in their left subtree. And there are only O(log n) of these!

Page 116: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

0

5

0

1

2

1

090

We only update values on nodes that gained a new key in their left subtree. And there are only O(log n) of these!

We only update values on nodes that gained a new key in their left subtree. And there are only O(log n) of these!

Page 117: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

1

6

0

1

2

1

090

We only update values on nodes that gained a new key in their left subtree. And there are only O(log n) of these!

We only update values on nodes that gained a new key in their left subtree. And there are only O(log n) of these!

Page 118: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

1

6

0

1

2

1

090

Page 119: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

1

6

0

1

2

1

090

16

Page 120: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

1

6

0

1

2

1

090

160

Page 121: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

1

6

0

2

3

1

090

160

Page 122: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

1

6

0

2

3

1

090

160

Page 123: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

1

6

0

2

3

1

090

160How do we update the numbers after the rotation?

How do we update the numbers after the rotation?

Page 124: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

AB

<A

>A<B

>B

AB

<A >A<B

>B

na

nb

RotateRight

RotateLeft

?

?

?

? nb

na AB

<A

>A<B

>B

AB

<A >A<B

>B

Page 125: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

AB

<A

>A<B

>B

AB

<A >A<B

>B

na

nb

RotateRight

RotateLeft

na

?

?

? nb

na AB

<A

>A<B

>B

AB

<A >A<B

>B

Page 126: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

AB

<A

>A<B

>B

AB

<A >A<B

>B

na

nb

RotateRight

RotateLeft

na

nb – na

– 1

?

? nb

na AB

<A

>A<B

>B

AB

<A >A<B

>B

Page 127: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

AB

<A

>A<B

>B

AB

<A >A<B

>B

na

nb

RotateRight

RotateLeft

na

nb – na

– 1

?

nanb

na AB

<A

>A<B

>B

AB

<A >A<B

>B

Page 128: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

AB

<A

>A<B

>B

AB

<A >A<B

>B

na

nb

RotateRight

RotateLeft

na

nb – na

– 1

nb + na

+ 1

nanb

na AB

<A

>A<B

>B

AB

<A >A<B

>B

Page 129: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

150

1

0

3

1

6

0

2

3

1

090

160How do we update the numbers after the rotation?

How do we update the numbers after the rotation?

Page 130: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

15

0

1

0

3

1

6

1

2

3

1

090 16

0

Page 131: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic Selection

19

3 23

7

311713

5

4

14

15

0

1

0

3

1

6

1

2

3

1

090 16

0

Page 132: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

16

Dynamic Selection

19

3 23

7

31

17

13

5

4

14

150

1

0

3

1

6

1

0

3

1

090 0

Page 133: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

16

Dynamic Selection

19

3 23

7

31

17

13

5

4

14

150

1

0

3

1

6

1

0

3

1

090 0

Page 134: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Order Statistic Trees

● This modified red/black tree is called an order statistics tree.● Start with a red/black tree.● Tag each node with the number of nodes in its left subtree.● Use the preceding update rules to preserve values during

rotations.● Propagate other changes up to the root of the tree.

● Only O(log n) values must be updated on an insertion or deletion and each can be updated in time O(1).

● Supports all BST operations plus select (find kth order statistic) and rank (given a key, report its order statistic) in time O(log n).

Page 135: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Generalizing our Idea

Page 136: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

Page 137: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

Page 138: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

Page 139: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

We can recompute values after a rotation.

Page 140: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

We can recompute values after a rotation.

Page 141: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

We can recompute values after a rotation.

Page 142: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

We can recompute values after a rotation.

Page 143: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

We can recompute values after a rotation.

Page 144: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Edits to values are localized along the access path.

We can recompute values after a rotation.

Page 145: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Imagine we cache some value in each node that can be computed

just from (1) the node itself and (2) its children’s values.

Page 146: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Imagine we cache some value in each node that can be computed

just from (1) the node itself and (2) its children’s values.

Page 147: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Imagine we cache some value in each node that can be computed

just from (1) the node itself and (2) its children’s values.

Recompute values on this access path, bottom-up.

Recompute values on this access path, bottom-up.

Page 148: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Imagine we cache some value in each node that can be computed

just from (1) the node itself and (2) its children’s values.

Page 149: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Imagine we cache some value in each node that can be computed

just from (1) the node itself and (2) its children’s values.

Recompute the values in these nodes.

Recompute the values in these nodes.

Page 150: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Imagine we cache some value in each node that can be computed

just from (1) the node itself and (2) its children’s values.

Page 151: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Imagine we cache some value in each node that can be computed

just from (1) the node itself and (2) its children’s values.

Recompute the values in these nodes.

Recompute the values in these nodes.

Page 152: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Imagine we cache some value in each node that can be computed

just from (1) the node itself and (2) its children’s values.

Page 153: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Theorem: Suppose we want to cache some computed value in each node of a red/black tree. Provided that the value can be

recomputed purely from the node’s value and from it’s children’s values, and provided that each value can be computed in time

O(1), then these values can be cached in each node with insertions, lookups, and deletions still taking time O(log n).

Page 154: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Example: Hierarchical Clustering

Page 155: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 10020

Page 156: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

42 44 60 66 71 86 92 100

20

20

Page 157: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

42 44 60 66 71 86 92 100

20

20

Page 158: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

42 44

60 66 71 86 92 100

20

20 43

Page 159: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

42 44

60 66 71 86 92 100

20

20 43

Page 160: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

42 44

60 66 71 86 92 100

20

20 43

Page 161: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

43

Page 162: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60

66 71

86 92 100

20

20 68.5

42 44

43

Page 163: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60

66 71

86 92 100

20

20 68.5

42 44

43

Page 164: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60

66 71

86 92 100

20

20 68.5

42 44

43

Page 165: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60

66 71

86 92 100

20

20 68.5

42 44

43

Page 166: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60

66 71 86 92

100

20

20 68.5

42 44

43 89

Page 167: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60

66 71 86 92

100

20

20 68.5

42 44

43 89

Page 168: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60

66 71 86 92

100

20

20 68.5

42 44

43 89

Page 169: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60

66 71 86 92

100

20

20 68.5

42 44

43 89

Page 170: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92

100

20

20 65.67

42 44

43 89

Page 171: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92

100

20

20

42 44

43 8965.67

Page 172: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92

100

20

20

42 44

43 8965.67

Page 173: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92

100

20

20

42 44

43 8965.67

Page 174: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

43 92.6765.67

Page 175: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

43 92.6765.67

Page 176: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

43 92.6765.67

Page 177: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

43 92.6765.67

Page 178: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

56.6 92.67

Page 179: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

56.6 92.67

Page 180: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

56.6 92.67

Page 181: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

56.6 92.67

Page 182: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

70.13

Page 183: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

70.13

Page 184: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

70.13

Page 185: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20

42 44

70.13

Page 186: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20 42 44

64.56

Page 187: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 100

60 66 71 86 92 100

20

20 42 44

64.56

Page 188: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 10020

Page 189: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

1D Hierarchical Clustering

42 44 60 66 71 86 92 10020

This tree is called a dendrogram.

This tree is called a dendrogram.

Page 190: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Analyzing the Runtime

● How efficient is this algorithm?● Number of rounds: Θ(n).● Work to find closest pair: O(n).● Total runtime: Θ(n2).

● Can we do better?

Page 191: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Analyzing the Runtime

How efficient is this algorithm?

Number of rounds: Θ(n).● Work to find closest pair: O(n).

Total runtime: Θ(n2).

Can we do better?

Page 192: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic 1D Closest Points

● The dynamic 1D closest points problem is the following:

Maintain a set of real numbers undergoing insertion and deletion while efficiently supporting queries of the form

“what is the closest pair of points?” ● Can we build a better data structure for

this?

Page 193: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic 1D Closest Points

k

max min

Page 194: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

A Tree Augmentation

● Augment each node to store the following:● The maximum value in the tree.● The minimum value in the tree.● The closest pair of points in the tree.

● Claim: Each of these properties can be computed in time O(1) from the left and right subtrees.

● These properties can be augmented into a red/black tree so that insertions and deletions take time O(log n) and “what is the closest pair of points?” can be answered in time O(1).

Page 195: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic 1D Closest Points137

Min: ?

Max: ?

Closest: ?, ?42

Min: -17

Max: 67

Closest: 15, 21

271 Min: 142

Max: 415

Closest: 300, 310

Fill in the blanks.

Formulate an answer now, but don’t post anything in chat

just yet. 😃

Fill in the blanks.

Formulate an answer now, but don’t post anything in chat

just yet. 😃

Page 196: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic 1D Closest Points137

Min: ?

Max: ?

Closest: ?, ?42

Min: -17

Max: 67

Closest: 15, 21

271 Min: 142

Max: 415

Closest: 300, 310

Fill in the blanks.

Now, private chat me your best guess. Not sure? Just answer

“??” 😃

Fill in the blanks.

Now, private chat me your best guess. Not sure? Just answer

“??” 😃

Page 197: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic 1D Closest Points137

Min: ?

Max: ?

Closest: ?, ?42

Min: -17

Max: 67

Closest: 15, 21

271 Min: 142

Max: 415

Closest: 300, 310

Page 198: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic 1D Closest Points137

Min: -17

Max: ?

Closest: ?, ?42

Min: -17

Max: 67

Closest: 15, 21

271 Min: 142

Max: 415

Closest: 300, 310

Page 199: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic 1D Closest Points137

Min: -17

Max: 415

Closest: ?, ?42

Min: -17

Max: 67

Closest: 15, 21

271 Min: 142

Max: 415

Closest: 300, 310

Page 200: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Dynamic 1D Closest Points137

Min: -17

Max: 415

Closest: 137, 14242

Min: -17

Max: 67

Closest: 15, 21

271 Min: 142

Max: 415

Closest: 300, 310

Page 201: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Some Other Questions

● How would you augment this tree so that you can efficiently (in time O(1)) compute the appropriate weighted averages?

● Trickier: Is this the fastest possible algorithm for this problem?● What if you’re guaranteed that the keys are

all integers in some nice range?

Page 202: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

A Helpful Intuition

Page 203: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Divide-and-Conquer

● Initially, it can be tricky to come up with the right tree augmentations.

● Useful intuition: Imagine you're writing a divide-and-conquer algorithm over the elements and have O(1) time per “conquer” step.

< k > kk

Page 204: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Divide-and-Conquer

● Initially, it can be tricky to come up with the right tree augmentations.

● Useful intuition: Imagine you're writing a divide-and-conquer algorithm over the elements and have O(1) time per “conquer” step.

< k > k

k

Page 205: Suffix Arrays - Stanford Universityweb.stanford.edu/class/cs166/lectures/03/Slides03.pdfThe LCP array stores information about the branching words of a string. Together, they’ll

Next Time

● Amortized Analysis● Lying about runtime costs in an honest

manner.● Potential Functions

● Quantifying disorder and messiness.● Some Applications

● Clever algorithms that are faster than they look.