Lecture 8 HASHING!!!!!

Page 1: Lecture 8 - Stanford Universityweb.stanford.edu/class/archive/cs/cs161/cs161.1176/Slides/Lecture… · n buckets (say n=9) 1 2 3 9 13 22 43 For demonstration purposes only! This is

Lecture 8 HASHING!!!!!

Page 2

Announcements

• HW3 due Friday!

• HW4 posted Friday!

Page 3

Today: hashing

[Figure: a hash table with n = 9 buckets (1, 2, 3, …, 9) holding the keys 13, 22, 43, and 9; empty buckets and chain ends are marked NIL.]

Page 4

Outline

• Hash tables are another sort of data structure that allows fast INSERT/DELETE/SEARCH.

• like self-balancing binary trees

• The difference is we can get better performance in expectation by using randomness.

• Like QuickSort vs. MergeSort

• Hash families are the magic behind hash tables.

• Universal hash families are even more magic.

Page 5

Goal: Just like on Monday

• We are interested in putting nodes with keys into a data structure that supports fast INSERT/DELETE/SEARCH.

• INSERT

• DELETE

• SEARCH

[Figure: nodes with small integer keys are inserted into and deleted from the data structure; a SEARCH for 2 answers "HERE IT IS" and returns the node with key “2”.]

Page 6

Today:

• Hash tables:

• O(1) expected time INSERT/DELETE/SEARCH

• Worse worst-case performance, but often great in practice.

On Monday:

• Self-balancing trees:

• O(log(n)) deterministic INSERT/DELETE/SEARCH

#prettysweet

#evensweeterinpractice

e.g., Python’s dict, Java’s HashSet/HashMap, C++’s unordered_map

Hash tables are used for databases, caching, object representation, …

Page 7

One way to get O(1) time

• Say all keys are in the set {1,2,3,4,5,6,7,8,9}.

• INSERT:

• DELETE:

• SEARCH:

[Figure: an array with one slot for each possible key 1, 2, …, 9. Keys such as 9, 6, 3, and 5 are inserted into their own slots, a key is deleted by emptying its slot, and a search looks directly at the key's slot: "3 is here."]

This is called “direct addressing”.
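A minimal sketch of direct addressing, assuming keys are integers in {1, …, universe_size}. The class name `DirectAddressTable` and its method names are illustrative, not taken from the slides.

```python
class DirectAddressTable:
    """Direct addressing for keys drawn from {1, ..., universe_size}."""

    def __init__(self, universe_size):
        # One slot per possible key; slot k-1 records whether key k is present.
        self.slots = [False] * universe_size

    def insert(self, key):
        self.slots[key - 1] = True

    def delete(self, key):
        self.slots[key - 1] = False

    def search(self, key):
        return self.slots[key - 1]


table = DirectAddressTable(9)
for key in (9, 6, 3, 5):
    table.insert(key)
table.delete(6)
print(table.search(3), table.search(2))  # True False
```

Every operation touches exactly one slot, which is why it runs in O(1) time. The catch is the space: one slot for every possible key.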

Page 8

That should look familiar

• Kind of like BUCKETSORT from Lecture 6.

• Same problem: if the keys may come from a universe U = {1, 2, …, 10000000000}…

Page 9

The solution then was…

• Put things in buckets based on one digit.

INSERT: 21 345 13 101 50 234 1

[Figure: ten buckets labeled 0 through 9, indexed by the last digit of the key. Bucket 0 holds 50; bucket 1 holds 21, 101, and 1; bucket 3 holds 13; bucket 4 holds 234; bucket 5 holds 345.]

Now SEARCH 21

It’s in this bucket somewhere… go through until we find it.

Page 10

Problem…

INSERT: 22 342 12 102 52 232 2

[Figure: the same ten buckets. Every key ends in 2, so all seven keys pile up in bucket 2 and the other buckets stay empty.]

Now SEARCH 22… this hasn’t made our lives easier…
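To see the problem in numbers, here is a small sketch that buckets both example inputs by last digit and reports the fullest bucket. The helper name `bucket_sizes` is made up for this illustration.

```python
from collections import Counter


def bucket_sizes(keys):
    # Bucket each key by its least significant (last decimal) digit.
    return Counter(key % 10 for key in keys)


spread = bucket_sizes([21, 345, 13, 101, 50, 234, 1])
crowded = bucket_sizes([22, 342, 12, 102, 52, 232, 2])
print(max(spread.values()))   # 3
print(max(crowded.values()))  # 7
```

With the second input, a SEARCH has to scan all seven keys, which is no better than an unsorted list.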

Page 11

Hash tables

• That was an example of a hash table.

• not a very good one, though.

• We will be more clever (and less deterministic) about our bucketing.

• This will result in fast (expected time) INSERT/DELETE/SEARCH.

Page 12

But first! Terminology.

• We have a universe U, of size M.

• M is really big.

• But only a few (say at most n for today’s lecture) elements of U are ever going to show up.

• M is waaaayyyyyyy bigger than n.

• But we don’t know which ones will show up in advance.

[Figure: Universe U. All of the keys in the universe live in this blob. A few elements are special and will actually show up.]

Example: U is the set of all strings of at most 140 ASCII characters (roughly 128^140 of them). The only ones which I care about are those which appear as trending hashtags on Twitter. #hashhashtags

There are way fewer than 128^140 of these.

Examples aside, I’m going to draw elements like I always do, as blue boxes with integers in them…

Page 13

The previous example with this terminology

• We have a universe U, of size M.

• at most n of which will show up.

• M is waaaayyyyyy bigger than n.

• We will put items of U into n buckets.

• There is a hash function h: U → {1,…,n} which says what element goes in what bucket.

[Figure: Universe U (all of the keys in the universe live in this blob) mapped into n buckets 1, 2, 3, …, with h(x) = least significant digit of x.]

For this lecture, I’m assuming that the number of things is the same as the number of buckets; both are n.

This doesn’t have to be the case, although we do want:

#buckets = O(#things which show up)

Page 14

This is a hash table (with chaining)

• Array of n buckets.

• Each bucket stores a linked list.

• We can insert into a linked list in time O(1).

• To find something in the linked list takes time O(length(list)).

• h: U → {1,…,n} can be any function:

• but for concreteness let’s stick with h(x) = least significant digit of x.

For demonstration purposes only! This is a terrible hash function! Don’t use this!

INSERT: 13, 22, 43, 9

[Figure: n buckets (say n = 9), labeled 1, 2, 3, …, 9. Bucket 2 holds 22, bucket 3 holds a chain containing 13 and 43, and bucket 9 holds 9.]

SEARCH 43: Scan through all the elements in bucket h(43) = 3.
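A rough sketch of the chaining table on this slide. Two liberties are taken and should be read as assumptions: Python lists stand in for the linked lists, and there are ten buckets (0 through 9) so that a key ending in 0 also has somewhere to go. The names `ChainedHashTable` and `least_significant_digit` are illustrative.

```python
def least_significant_digit(x):
    # The slides' demonstration-only hash function. Don't use this!
    return x % 10


class ChainedHashTable:
    def __init__(self, num_buckets, h):
        self.h = h
        self.buckets = [[] for _ in range(num_buckets)]

    def insert(self, key):
        # Appending to the chain takes O(1) amortized time.
        self.buckets[self.h(key)].append(key)

    def search(self, key):
        # Scan only the one bucket that the key hashes to.
        return key in self.buckets[self.h(key)]

    def delete(self, key):
        bucket = self.buckets[self.h(key)]
        if key in bucket:
            bucket.remove(key)


table = ChainedHashTable(10, least_significant_digit)
for key in (13, 22, 43, 9):
    table.insert(key)
print(table.buckets[3])  # [13, 43]
print(table.search(43))  # True
```

SEARCH and DELETE cost time proportional to the length of one chain, so everything hinges on keeping the chains short.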

Page 15

Aside: Hash tables with open addressing

• The previous slide is about hash tables with chaining.

• There’s also something called “open addressing”.

• You’ll see it on your homework :)

[Figure: two tables with n = 9 buckets, both storing 13 and 43. With chaining, 13 and 43 sit together in one bucket: this is a “chain”. With open addressing, 13 and 43 each occupy a bucket of their own.]

\end{Aside}

Page 16

This is a hash table (with chaining)

• Array of n buckets.

• Each bucket stores a linked list.

• We can insert into a linked list in time O(1).

• To find something in the linked list takes time O(length(list)).

• h: U → {1,…,n} can be any function:

• but for concreteness let’s stick with h(x) = least significant digit of x.

For demonstration purposes only! This is a terrible hash function! Don’t use this!

INSERT: 13, 22, 43, 9

[Figure: n buckets (say n = 9), labeled 1, 2, 3, …, 9. Bucket 2 holds 22, bucket 3 holds a chain containing 13 and 43, and bucket 9 holds 9.]

SEARCH 43: Scan through all the elements in bucket h(43) = 3.

This is a good idea as long as there are not too many elements in that bucket!

Page 17

The main question

• How do we pick that function so that this is a good idea?

1. We want there to be not many buckets (say, n).

• This means we don’t use too much space.

2. We want the items to be pretty spread-out in the buckets.

• This means it will be fast to SEARCH/INSERT/DELETE.

[Figure: two tables with n = 9 buckets, one holding 13, 22, 43, 9 and the other holding 13, 43, 21, 93, contrasting spread-out buckets vs. crowded buckets.]

Page 18

Worst-case analysis

• Design a function h: U → {1,…,n} so that:

• No matter what input (fewer than n items of U) Darth Vader chooses, the buckets will be balanced.

• Here, balanced means O(1) entries per bucket.

• If we had this, then we’d achieve our dream of O(1) INSERT/DELETE/SEARCH.

Take a minute to talk to the person next to you. Can you come up with such a function?

Page 20

We really can’t beat Darth Vader here.

[Figure: Universe U mapped by h(x) into n buckets. One region of U is highlighted: "These are all the things that hash to the first bucket."]

• The universe U has M items.

• They get hashed into n buckets.

• At least one bucket receives at least M/n items.

• M is WAAYYYYY bigger than n, so M/n is bigger than n.

• Darth Vader chooses n of the items that landed in this very full bucket.
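The pigeonhole argument is constructive, and a short sketch makes that concrete: given any fixed h, scan the universe, find the fullest bucket, and hand back n keys from it. The function name `adversarial_keys` and the particular h and universe below are hypothetical choices for illustration; the sketch assumes M is at least n², so that the fullest bucket really contains n keys.

```python
from collections import defaultdict


def adversarial_keys(h, universe, n):
    """Return n keys from `universe` that all land in the same bucket of h."""
    buckets = defaultdict(list)
    for x in universe:
        buckets[h(x)].append(x)
    # By pigeonhole, the fullest bucket holds at least M/n keys.
    fullest = max(buckets.values(), key=len)
    return fullest[:n]


n = 9
universe = range(1, 1001)  # M = 1000, so M/n > n


def h(x):
    return x % n + 1  # any fixed, deterministic choice of h


bad = adversarial_keys(h, universe, n)
print(bad, {h(x) for x in bad})
```

Whatever deterministic h is plugged in, the returned keys all share one bucket, so a SEARCH among them costs time proportional to n.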

Page 21

Solution: Randomness

Page 22

The game

1. An adversary chooses any n items u_1, u_2, …, u_n ∈ U, and any sequence of INSERT/DELETE/SEARCH operations on those items.

For example, the items 13, 22, 43, 92, 7 and the sequence: INSERT 13, INSERT 22, INSERT 43, INSERT 92, INSERT 7, SEARCH 43, DELETE 92, SEARCH 7, INSERT 92

2. You, the algorithm, choose a random hash function h: U → {1,…,n}.

3. HASH IT OUT

[Figure: the items 13, 22, 92, 43, 7 hashed into buckets 1, 2, 3, …, n.]

Plucky the pedantic penguin: What does random mean here? Uniformly random?

Page 23

Why should this help?

• Say that h is uniformly random.

• That means that h(1) is a uniformly random number between 1 and n.

• h(2) is also a uniformly random number between 1 and n, independent of h(1).

• h(3) is also a uniformly random number between 1 and n, independent of h(1), h(2).

• …

• h(M) is also a uniformly random number between 1 and n, independent of h(1), h(2), …, h(M-1).

[Figure: h maps Universe U into n buckets.]

Page 24

What do we want?

[Figure: buckets 1, 2, 3, …, n holding items such as 14, 22, 92, 43, 8. The item u_i shares its bucket with several others (7, 32, 5, 15).]

It’s bad if lots of items land in u_i’s bucket.

So we want not that.

Page 25

More precisely

[Figure: buckets 1, 2, 3, …, n holding 14, 22, 92, 43, 8, and u_i.]

• Suppose that for all u_i that the bad guy chose:

• E[number of items in u_i’s bucket] ≤ 2.

• Then for each operation involving u_i:

• E[time of operation] = O(1).

• By linearity of expectation,

$$
\begin{aligned}
E[\text{total time of all operations}]
&= E\Big[\sum_{t \in \text{operations}} \text{time of operation } t\Big] \\
&= \sum_{t \in \text{operations}} E[\text{time of operation } t] \\
&= \sum_{t \in \text{operations}} O(1) \\
&= O(\text{number of operations})
\end{aligned}
$$

aka, O(1) per operation!

Page 26

So we want:

• For all i = 1,…,n,

E[number of items in u_i’s bucket] ≤ 2.

Page 27

Aside: why not just:

• For all i = 1,…,n:

E[number of items in bucket i] ≤ 2?

Suppose:

[Figure: buckets 1, 2, 3, …, n with all of the items (14, 22, 92, 43, 8) in a single bucket. This happens with probability 1/n.]

[Figure: the same items, all in a different single bucket. And this happens with probability 1/n, etc.]

Then E[number of items in bucket i] = 1 for all i, since bucket i holds all n items with probability 1/n and is empty otherwise.

But P{the buckets get big} = 1.
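The counterexample can be checked exactly for a small case. This sketch assumes n = 5 items and n = 5 buckets, and enumerates the n equally likely outcomes, in each of which all items land in one bucket. The variable names are illustrative.

```python
from fractions import Fraction

n = 5
# Outcome k (probability 1/n): all n items land in bucket k.
outcomes = [[n if bucket == k else 0 for bucket in range(n)] for k in range(n)]

# Expected load of each bucket, averaging over the n outcomes.
expected_load = [sum(Fraction(o[b], n) for o in outcomes) for b in range(n)]
# Probability that some bucket holds every item.
prob_some_bucket_is_full = Fraction(sum(max(o) == n for o in outcomes), n)

print(all(load == 1 for load in expected_load))  # True
print(prob_some_bucket_is_full)                  # 1
```

So bounding the expected load of each fixed bucket says nothing useful. The quantity to control is the load of the bucket that u_i itself lands in.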

Page 28

So we want:

• For all i = 1,…,n,

E[number of items in u_i’s bucket] ≤ 2.

Page 29

Expected number of items in u_i’s bucket?

[Figure: h maps Universe U into n buckets; u_i and u_j land in the same bucket. COLLISION!]

$$
\begin{aligned}
E[\text{number of items in } u_i\text{'s bucket}]
&= \sum_{j=1}^{n} P\{h(u_i) = h(u_j)\} \\
&= 1 + \sum_{j \neq i} P\{h(u_i) = h(u_j)\} \\
&= 1 + \sum_{j \neq i} 1/n \\
&= 1 + \frac{n-1}{n} \le 2.
\end{aligned}
$$

That’s what we wanted.

You will verify this on HW.
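As a sanity check on the bound (not a substitute for the homework), here is a small Monte Carlo sketch. It assumes n = 20 items and simulates a uniformly random h only on the items that show up, which is all that matters for the bucket sizes. The function name and parameters are made up for this illustration.

```python
import random


def average_bucket_size_of_u1(n, trials, rng):
    """Estimate E[number of items in u_1's bucket] for a uniformly random h."""
    total = 0
    for _ in range(trials):
        # Each of the n items gets an independent uniform bucket in {1, ..., n}.
        buckets = [rng.randint(1, n) for _ in range(n)]
        total += buckets.count(buckets[0])  # u_1 counts itself
    return total / trials


rng = random.Random(0)
n = 20
estimate = average_bucket_size_of_u1(n, 20000, rng)
print(estimate, "vs", 1 + (n - 1) / n)
```

The estimate should land close to 1 + (n-1)/n = 1.95, comfortably below 2.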

Page 30

That’s great!

• For all i = 1,…,n,

• E[number of items in u_i’s bucket] ≤ 2

This implies (as we saw before):

For any sequence of L INSERT/DELETE/SEARCH operations on any n elements of U, the expected runtime (over the random choice of h) is O(L).

aka, anything Darth Vader might pick in Step 1 of the game.

aka, O(1) per operation.

Page 31

The elephant in the room

Page 32

The elephant in the room

h(1)=2 h(2)=7 h(3)=9 h(4)=1 h(5)=0 h(6)=7 h(7)=2 h(8)=3 h(9)=7 h(10)=3

h(11)=4 h(12)=5 h(13)=7 h(14)=3 h(15)=2 h(16)=9 h(17)=3 h(18)=2 h(19)=1 h(20)=5

…

h(4511)=3 h(4512)=7 h(4513)=2 h(4514)=6 h(4515)=3 h(4516)=1 h(4517)=0 h(4518)=0 h(4519)=3 h(4520)=1

…

h(264511)=3 h(264512)=1 h(264513)=0 h(264514)=0 h(264515)=7 h(264516)=8 h(264517)=9 h(264518)=2 h(264519)=6 h(264520)=3

…

Page 33

Randomization is fine… but we need to be able to store our choice of h!

• Say that this elephant-shaped blob represents the set of all hash functions.

• How big is this set?

• n^|U| = n^M = REALLY BIG.

• In order to write down an arbitrary element of a set of size A, we need log(A) bits.

• So we’d need about M log(n) bits to remember one of these hash functions. That’s enough to do direct addressing!!!!

Page 34

Another thought…

• Just remember h on the relevant values

[Figure: "Algorithm now" sees the items 13, 22, 43, 92, 7 and picks values such as h(13) = 6, h(22) = 3, h(92) = 3. "Algorithm later" has to look them up again, e.g. h(13) = 6.]

But that’s what we wanted to begin with… (looking up the value stored for a key is exactly the SEARCH problem we are trying to solve).

Page 35

Solution

• Pick from a smaller set of functions.

[Figure: a small region H inside the blob of all hash functions.]

A cleverly chosen subset of functions. We call such a subset a hash family.

We need only log|H| bits to store an element of H.

Page 36

How to pick the hash family?

• Let’s go back to that computation from earlier….

Page 37

Expected number of items in u_i’s bucket?

[Figure: h maps Universe U into n buckets; u_i and u_j land in the same bucket. COLLISION!]

$$
\begin{aligned}
E[\text{number of items in } u_i\text{'s bucket}]
&= \sum_{j=1}^{n} P\{h(u_i) = h(u_j)\} \\
&= 1 + \sum_{j \neq i} P\{h(u_i) = h(u_j)\} \\
&= 1 + \sum_{j \neq i} 1/n \\
&= 1 + \frac{n-1}{n} \le 2.
\end{aligned}
$$

So the number of items in u_i’s bucket is O(1).

You will verify this on HW.

Page 38

How to pick the hash family?

• Let’s go back to that computation from earlier….

$$
\begin{aligned}
E[\text{number of things in bucket } h(u_i)]
&= \sum_{j=1}^{n} P\{h(u_i) = h(u_j)\} \\
&= 1 + \sum_{j \neq i} P\{h(u_i) = h(u_j)\} \\
&\le 1 + \sum_{j \neq i} 1/n \\
&= 1 + \frac{n-1}{n} \le 2.
\end{aligned}
$$

• All we needed was that this probability, $P\{h(u_i) = h(u_j)\}$ for $j \neq i$, is ≤ 1/n.

Page 39

Strategy

• Pick a small hash family H, so that when I choose h randomly from H,

for all u_i, u_j ∈ U with u_i ≠ u_j,

$$P_{h \in H}\{h(u_i) = h(u_j)\} \le \frac{1}{n}$$

• Then we still get O(1)-sized buckets in expectation.

• But now the space we need is log(|H|) bits.

• Hopefully pretty small!

Page 40

So the whole scheme will be

[Figure: h maps Universe U (containing u_i) into n buckets.]

Choose h randomly from a universal hash family H.

We can store h in small space since H is so small.

Probably these buckets will be pretty balanced.

Page 41

What is this universal hash family?

• Here’s one:

• Pick a prime p ≥ M.

• Define

$$f_{a,b}(x) = ax + b \bmod p$$

$$h_{a,b}(x) = f_{a,b}(x) \bmod n$$

• Claim:

$$H = \{\, h_{a,b}(x) : a \in \{1, \ldots, p-1\},\ b \in \{0, \ldots, p-1\} \,\}$$

is a universal hash family.
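Drawing a function from this family only requires picking two numbers. A minimal sketch, with some assumptions to flag: the helper name `draw_hash` is invented, p = 101 is an arbitrary prime (so keys are assumed to lie in {0, …, 100}), and buckets come out numbered 0 through n-1 because the formula ends in "mod n", whereas the slides draw them as 1 through n.

```python
import random


def draw_hash(p, n, rng=random):
    """Draw h_{a,b} from H: a uniform in {1,...,p-1}, b uniform in {0,...,p-1}."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)

    def h(x):
        return ((a * x + b) % p) % n  # buckets are numbered 0..n-1 here

    return h, a, b


h, a, b = draw_hash(p=101, n=9)
print(a, b, [h(x) for x in (13, 22, 43, 92, 7)])
```

Remembering h means remembering just the pair (a, b).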

Page 42

Say what?

• Example: M = p = 5, n = 3

• To draw h from H:

• Pick a random a in {1,…,4}, b in {0,…,4}

• As per the definition:

• $f_{2,1}(x) = 2x + 1 \bmod 5$

• $h_{2,1}(x) = f_{2,1}(x) \bmod 3$

[Figure, for a = 2, b = 1: the five elements of U, drawn on a circle labeled 0, 1, 2, 3, 4, are mapped by $f_{2,1}$ onto a second circle labeled 0, 1, 2, 3, 4 (arrows $f_{2,1}(0)$, …, $f_{2,1}(4)$), and then by "mod 3" into the buckets 1, 2, 3.]

This step ($f_{2,1}$) just scrambles stuff up. No collisions here!

This step (mod 3) is the one where two different elements might collide.
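The example can be tabulated directly. This sketch assumes the universe is {0, 1, 2, 3, 4} and numbers the buckets 0, 1, 2 (the raw output of "mod 3") rather than 1, 2, 3 as drawn on the slide.

```python
p, n = 5, 3
a, b = 2, 1

# First step: f_{2,1}(x) = 2x + 1 mod 5 permutes {0, ..., 4}.
f = {x: (a * x + b) % p for x in range(p)}
# Second step: reducing mod 3 is where collisions can appear.
h = {x: f[x] % n for x in range(p)}

print(f)  # {0: 1, 1: 3, 2: 0, 3: 2, 4: 4}
print(h)  # {0: 1, 1: 0, 2: 0, 3: 2, 4: 1}
```

The first map hits every value exactly once, so nothing collides there. After the mod 3 step, 1 and 2 share a bucket, and so do 0 and 4.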

Page 43

Ignoring why this is a good idea… how big is H?

• We have p-1 choices for a, and p choices for b.

• So |H| = p(p-1) = O(M^2), as long as we pick p = O(M) (there is always a prime between M and 2M).

• This is much better than n^M!!!!

• space needed to store h: O(log(M)).

[Figure: the blob of all hash functions needs O(M log(n)) bits per function; the small subset H needs O(log(M)) bits.]
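To get a feel for the gap, here is the arithmetic for one hypothetical setting, M = 2^32 and n = 1000. These numbers are chosen only for illustration, and the second line assumes a prime p < 2M was picked.

```python
import math

M, n = 2**32, 1000  # hypothetical sizes

# An arbitrary function U -> {1..n}: log2(n^M) = M * log2(n) bits.
bits_arbitrary_h = M * math.log2(n)
# A member of H: log2(p(p-1)) < 2 * log2(2M) bits when p < 2M.
bits_family_h = 2 * math.log2(2 * M)

print(f"{bits_arbitrary_h:.3g} bits vs at most {bits_family_h:.0f} bits")
```

That is tens of billions of bits against a few dozen.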

Page 44

Why does this work?

• This is actually a little complicated.

• I’ll go over the argument now, because it’s a good example of how to reason about hash functions.

• Fancy counting!

• BUT! don’t worry if you don’t follow all the calculations right now.

• You can always take a look back at the slides or lecture notes later.

• The important part is the structure of the argument.

Page 45

Why does this work?

• Want to show:

• for all u_i, u_j ∈ U with u_i ≠ u_j, $P_{h \in H}\{h(u_i) = h(u_j)\} \le \frac{1}{n}$

• aka, the probability of any two elements colliding is small.

• Let’s just fix two elements and see an example.

• Let’s consider u_i = 0, u_j = 1.

Convince yourself that it will be the same for any pair!

[Figure: the elements of U on a circle labeled 0, 1, 2, 3, 4, mapped by $f_{a,b}(x) = ax + b \bmod p$ onto a second circle labeled 0, 1, 2, 3, 4, then by mod 3 into buckets 1, 2, 3.]

Page 46

The probability that 0 and 1 collide is small

• Want to show:

• $P_{h \in H}\{h(0) = h(1)\} \le \frac{1}{n}$

• For any $y_0 \neq y_1 \in \{0,1,2,3,4\}$, how many a, b are there so that $f_{a,b}(0) = y_0$ and $f_{a,b}(1) = y_1$?

• Claim: it’s exactly one.

• Proof: solve the system of eqs. for a and b.

$$a \cdot 1 + b = y_1 \bmod p$$

$$a \cdot 0 + b = y_0 \bmod p$$

The second equation gives $b = y_0$, and then the first gives $a = y_1 - y_0 \bmod p$, which is nonzero because $y_0 \neq y_1$.

e.g., y_0 = 3, y_1 = 1.

[Figure: the elements of U on a circle labeled 0, 1, 2, 3, 4, mapped by $f_{a,b}(x) = ax + b \bmod p$ onto a second circle labeled 0, 1, 2, 3, 4, then by mod 3 into buckets 1, 2, 3.]

Page 47

The probability that 0 and 1 collide is small

• Want to show:

• $P_{h \in H}\{h(0) = h(1)\} \le \frac{1}{n}$

• For any $y_0 \neq y_1 \in \{0,1,2,3,4\}$, exactly one pair a, b has $f_{a,b}(0) = y_0$ and $f_{a,b}(1) = y_1$.

• If 0 and 1 collide it’s b/c there’s some $y_0 \neq y_1$ so that:

• $f_{a,b}(0) = y_0$ and $f_{a,b}(1) = y_1$.

• $y_0 = y_1 \bmod n$.

e.g., y_0 = 3, y_1 = 1.

[Figure: the elements of U on a circle labeled 0, 1, 2, 3, 4, mapped by $f_{a,b}(x) = ax + b \bmod p$ onto a second circle labeled 0, 1, 2, 3, 4, then by mod 3 into buckets 1, 2, 3.]
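The "exactly one pair" claim comes from solving two linear equations mod p. A small sketch of that solution for the pair of inputs 0 and 1, using the slide's example values y_0 = 3, y_1 = 1 with p = 5. The function name `solve_for_a_b` is made up.

```python
def solve_for_a_b(y0, y1, p):
    """Return the unique (a, b) with f_{a,b}(0) = y0 and f_{a,b}(1) = y1 (mod p)."""
    b = y0 % p         # from a*0 + b = y0
    a = (y1 - y0) % p  # from a*1 + b = y1; nonzero when y0 != y1
    return a, b


a, b = solve_for_a_b(3, 1, 5)
print(a, b)                              # 3 3
print((a * 0 + b) % 5, (a * 1 + b) % 5)  # 3 1
```

For a general pair of inputs the same idea works, but it needs a division mod p, which is where primality of p is used.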

Page 48

The probability that 0 and 1 collide is small

• Want to show:

• $P_{h \in H}\{h(0) = h(1)\} \le \frac{1}{n}$

• The number of a, b so that 0, 1 collide under $h_{a,b}$ is at most the number of $y_0 \neq y_1$ so that $y_0 = y_1 \bmod n$.

• How many is that?

• We have p choices for $y_0$, then at most 1/n of the remaining p-1 are valid choices for $y_1$…

• So at most $p \cdot \frac{p-1}{n}$.

e.g., y_0 = 3, y_1 = 1.

[Figure: the elements of U on a circle labeled 0, 1, 2, 3, 4, mapped by $f_{a,b}(x) = ax + b \bmod p$ onto a second circle labeled 0, 1, 2, 3, 4, then by mod 3 into buckets 1, 2, 3.]

Page 49

The probability that 0 and 1 collide is small

• Want to show:

• $P_{h \in H}\{h(0) = h(1)\} \le \frac{1}{n}$

• The # of (a,b) so that 0, 1 collide under $h_{a,b}$ is $\le p \cdot \frac{p-1}{n}$.

• The probability (over a,b) that 0, 1 collide under $h_{a,b}$ is:

$$
P_{h \in H}\{h(0) = h(1)\} \le \frac{p \cdot \frac{p-1}{n}}{|H|} = \frac{p \cdot \frac{p-1}{n}}{p(p-1)} = \frac{1}{n}.
$$
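For small p the claim can also be checked by brute force: enumerate every (a, b), count collisions for every pair of distinct inputs, and compare with 1/n. This is only a sketch for tiny parameters, and `max_collision_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import combinations


def max_collision_probability(p, n):
    """Largest P[h(x) = h(y)] over pairs x != y in {0,...,p-1}, h uniform in H."""
    family = [(a, b) for a in range(1, p) for b in range(p)]
    worst = Fraction(0)
    for x, y in combinations(range(p), 2):
        collisions = sum(
            ((a * x + b) % p) % n == ((a * y + b) % p) % n for a, b in family
        )
        worst = max(worst, Fraction(collisions, len(family)))
    return worst


print(max_collision_probability(5, 3))  # 1/5
```

For p = 5 and n = 3 every pair collides with probability exactly 1/5, which is below the required 1/3.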

Page 50

The same argument goes for any pair

for all u_i, u_j ∈ U with u_i ≠ u_j,

$$P_{h \in H}\{h(u_i) = h(u_j)\} \le \frac{1}{n}$$

That’s the definition of a universal hash family.

So this family H indeed does the trick.

Page 51

So the whole scheme will be

[Figure: h maps Universe U of size M (containing u_i) into n buckets.]

Choose h randomly from H.

We can store h in space O(log(M)).

The expected time to do any L operations on these n elements is O(L).
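Putting the pieces together, here is one possible sketch of the whole scheme: a chaining table whose hash function is drawn at random from the family H. It is an illustration under stated assumptions rather than a reference implementation: Python lists stand in for linked lists, buckets are numbered 0 through n-1, p = 101 is an arbitrary prime (so keys are assumed to lie in {0, …, 100}), and INSERT checks for duplicates, which costs a scan of one bucket. The class name `UniversalHashTable` is invented.

```python
import random


class UniversalHashTable:
    """Chaining hash table whose h_{a,b} is drawn at random from H."""

    def __init__(self, n, p, rng=random):
        self.n, self.p = n, p
        self.a = rng.randrange(1, p)  # storing (a, b) takes O(log p) bits
        self.b = rng.randrange(0, p)
        self.buckets = [[] for _ in range(n)]

    def _bucket(self, key):
        return self.buckets[((self.a * key + self.b) % self.p) % self.n]

    def insert(self, key):
        bucket = self._bucket(key)
        if key not in bucket:
            bucket.append(key)

    def delete(self, key):
        bucket = self._bucket(key)
        if key in bucket:
            bucket.remove(key)

    def search(self, key):
        return key in self._bucket(key)


table = UniversalHashTable(n=5, p=101)  # assumes every key is in {0, ..., 100}
for key in (13, 22, 43, 92, 7):
    table.insert(key)
table.delete(92)
table.insert(92)
print(table.search(43), table.search(7), table.search(50))  # True True False
```

Each operation scans a single bucket, and by the universal property that bucket has O(1) items in expectation.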

Page 52

Recap

Page 53

Want O(1) INSERT/DELETE/SEARCH

• We are interested in putting nodes with keys into a data structure that supports fast INSERT/DELETE/SEARCH.

• INSERT

• DELETE

• SEARCH

[Figure: nodes with small integer keys are inserted into and deleted from the data structure; a SEARCH answers "HERE IT IS".]

Page 54

We studied this game

1. An adversary chooses any n items u_1, u_2, …, u_n ∈ U, and any sequence of L INSERT/DELETE/SEARCH operations on those items.

For example, the items 13, 22, 43, 92, 7 and the sequence: INSERT 13, INSERT 22, INSERT 43, INSERT 92, INSERT 7, SEARCH 43, DELETE 92, SEARCH 7, INSERT 92

2. You, the algorithm, choose a random hash function h: U → {1,…,n}.

3. HASH IT OUT

[Figure: the items 13, 22, 92, 43, 7 hashed into buckets 1, 2, 3, …, n.]

Page 55

Uniformly random h was good

• If we choose h uniformly at random, for all u_i, u_j ∈ U with u_i ≠ u_j,

$$P_{h \in H}\{h(u_i) = h(u_j)\} \le \frac{1}{n}$$

• That was enough to ensure that, in expectation, a bucket isn’t too full.

A bit more formally:

For any sequence of L INSERT/DELETE/SEARCH operations on any n elements of U, the expected runtime (over the random choice of h) is O(L).

aka, O(1) per operation.

Page 56

Uniformly random h was bad

• If we actually want to implement this, we have to store the hash function h!

• That takes a lot of space!

• We may as well have just initialized a bucket for every single item in U.

• Instead, we chose a function randomly from a smaller set.

Page 57

We needed a smaller set that still has this property

• If we choose h uniformly at random, for all u_i, u_j ∈ U with u_i ≠ u_j,

$$P_{h \in H}\{h(u_i) = h(u_j)\} \le \frac{1}{n}$$

This was all we needed to make sure that the buckets were balanced in expectation!

• We call any set with that property a universal hash family.

• We were able to come up with a really small one!

Page 58

Conclusion:

• We can build a hash table that supports INSERT/DELETE/SEARCH in O(1) expected time,

• if we know that only n items are ever going to show up, where n is waaaayyyyyy less than the size M of the universe.

• The space to implement this hash table is O(n log(M)) bits.

• M is waaayyyyyy bigger than n, but log(M) probably isn’t.

Page 59

Next Week

• Graph algorithms!