Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  ·...

23
Public Service Announcement “The Experimental Social Science Laboratory (Xlab) invites you to participate in social science studies! Experiments con- ducted at Xlab (located in Hearst Gym, Suite 2) are computer- ized, decision-making studies such as tasks, surveys, and games. We also occasionally offer remote online and mobile studies that can be completed anywhere. Participants earn $15/hour on av- erage every time they participate. For more information, visit xlab.berkeley.edu. To sign up, visit berkeley.sona-systems.com.” Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 1

Transcript of Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  ·...

Page 1: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Public Service Announcement

“The Experimental Social Science Laboratory (Xlab) invitesyou to participate in social science studies! Experiments con-ducted at Xlab (located in Hearst Gym, Suite 2) are computer-ized, decision-making studies such as tasks, surveys, and games.We also occasionally offer remote online and mobile studies thatcan be completed anywhere. Participants earn $15/hour on av-erage every time they participate. For more information, visitxlab.berkeley.edu. To sign up, visit berkeley.sona-systems.com.”

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 1

Page 2: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

CS61B Lecture #31Today:

• More balanced search structures (DS(IJ), Chapter 9

Coming Up:

• Pseudo-random Numbers (DS(IJ), Chapter 11)

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 2

Page 3: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Really Efficient Use of Keys: the Trie

• Have been silent about cost of comparisons.

• For strings, worst case is length of string.

• Therefore should throw extra factor of key length, L, into costs:

– Θ(M) comparisons really means Θ(ML) operations.

– So to look for key X , keep looking at same chars of X M times.

• Can we do better? Can we get search cost to be O(L)?

Idea: Make a multi-way decision tree, with one decision per characterof key.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 3

Page 4: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

The Trie: Example

• Set of keys

{a, abase, abash, abate, abbas, axolotl, axe, fabric, facet}

• Ticked lines show paths followed for “abash” and “fabric”

• Each internal node corresponds to a possible prefix.

• Characters in path to node = that prefix.

a

a✷

a✷b

aba

abas

abase

abase✷

h

abash✷

t

abate✷

b

abbas✷

x

axe

axe✷

o

axolotl✷

f

f

a

fab

fabric✷

c

facet✷

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 4

Page 5: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Adding Item to a Trie

• Result of adding bat and faceplate.

• New edges ticked.

a

a✷

a✷b

aba

abas

abase

abase✷

h

abash✷

t

abate✷

b

abbas✷

x

axe

axe✷

o

axolotl✷

b

bat✷

f

f

a

fab

fabric✷

c

fac

e

facep

faceplate✷

t

facet✷

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 5

Page 6: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

A Side-Trip: Scrunching

• For speed, obvious implementation for internal nodes is array in-dexed by character.

• Gives O(L) performance, L length of search key.

• [Looks as if independent of N , number of keys. Is there a depen-dence?]

• Problem: arrays are sparsely populated by non-null values—waste ofspace.

Idea: Put the arrays on top of each other!

• Use null (0, empty) entries of one array to hold non-null elements ofanother.

• Use extra markers to tell which entries belong to which array.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 6

Page 7: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Scrunching Example

Small example: (unrelated to Tries on preceding slides)

• Three leaf arrays, each indexed 0..9

A1:

bass trout pike

A2:

ghee milk oil

A3:

salt cumin mace

0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9

0 1 2 3 4 5 6 7 8 9

• Now overlay them, but keep track of original index of each item:

A123:

bass

saltghee

trout

cumin

pikemilk oil

mace

0 -1 1 -1 2 5 5 7 6 7 9A1: 0* 1 2 3 4 5* 6 7* 8 9

A2: 0 1 2* 3 4 5 6* 7* 8 9A3: 0 1* 2 3 4 5* 6 7 8 9*

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 7

Page 8: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 8

Page 9: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

⇓⇒

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 9

Page 10: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

⇓⇒

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 10

Page 11: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

⇓⇒

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 11

Page 12: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 12

Page 13: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 13

Page 14: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 14

Page 15: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 15

Page 16: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 16

Page 17: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 17

Page 18: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Probabilistic Balancing: Skip Lists

• A skip list can be thought of as a kind of n-ary search tree in whichwe choose to put the keys at “random” heights.

• More often thought of as an ordered list in which one can skip largesegments.

• Typical example:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• To search, start at top layer on left, search until next step wouldovershoot, then go down one layer and repeat.

• In list above, we search for 125 and 127. Gray nodes are looked at;darker gray nodes are overshoots.

• Heights of the nodes were chosen randomly so that there are about1/2 as many nodes that are > k high as there are that are k high.

• Makes searches fast with high probability.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 18

Page 19: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Example: Adding and deleting

• Starting from initial list:

−∞0

1

2

3

10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞

• In any order, we add 126 and 127 (choosing random heights forthem), and remove 20 and 40:

−∞0

1

2

3

10 25 30 50 55 60 90 95 100 115 120 125 126 127 130 140 150 ∞

• Shaded nodes here have been modified.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 19

Page 20: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Summary

• Balance in search trees allows us to realize Θ(lgN) performance.

• B-trees, red-black trees:

– Give Θ(lgN) performance for searches, insertions, deletions.

– B-trees good for external storage. Large nodes minimize # ofI/O operations

• Tries:

– Give Θ(B) performance for searches, insertions, and deletions,where B is length of key being processed.

– But hard to manage space efficiently.

• Interesting idea: scrunched arrays share space.

• Skip lists:

– Give probable Θ(lgN) performace for searches, insertions, dele-tions

– Easy to implement.

– Presented for interesting ideas: probabilistic balance, random-ized data structures.

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 20

Page 21: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Summary of Collection Abstractions

Multisetcontains, iterator

Listget(n)

Set

Ordered Setfirst

UnorderedSet

Priority Queue Sorted Setsubset

Mapcontains, iterator

get

UnorderedMap

OrderedMap

Blue: Java has corresponding interfaceGreen: Java has no corresponding interface

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 21

Page 22: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Data Structures that Implement Abstractions

Multiset

• List: arrays, linked lists, circular buffers

• Set

– OrderedSet

∗ Priority Queue: heaps∗ Sorted Set: binary search trees, red-black trees, B-trees,sorted arrays or linked lists

– Unordered Set: hash table

Map

• Unordered Map: hash table

• Ordered Map: red-black trees, B-trees, sorted arrays or linked lists

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 22

Page 23: Public Service Announcement - University of California, …cs61b/fa17/materials/... ·  · 2017-11-13Public Service Announcement “The Experimental Social Science Laboratory ...

Corresponding Classes in Java

Multiset (Collection)

• List: ArrayList, LinkedList, Stack, ArrayBlockingQueue,ArrayDeque

• Set

– OrderedSet

∗ Priority Queue: PriorityQueue∗ Sorted Set (SortedSet): TreeSet

– Unordered Set: HashSet

Map

• Unordered Map: HashMap

• Ordered Map (SortedMap): TreeMap

Last modified: Thu Nov 2 19:38:19 2017 CS61B: Lecture #31 23