Lecture 14 - web.stanford.edu
Lecture 14: Greedy algorithms!
Announcements
• HW6 due Friday!
• Tons of practice on dynamic programming
Roadmap
[Figure: course roadmap with the stages Asymptotic Analysis, MIDTERM, Graphs!, Dynamic Programming, Greedy Algs, and The Future!; a "Last week" label marks the previous topic.]
More detailed schedule on the website!
This week
• Greedy algorithms!
• Builds on our ideas from dynamic programming
Greedy algorithms
• Make choices one at a time.
• Never look back.
• Hope for the best.
Today
• One non-example of a greedy algorithm:
  • Knapsack again
• Three examples of greedy algorithms:
  • Activity Selection
  • Job Scheduling
  • Huffman Coding
Non-example
• Unbounded Knapsack. (From the pre-lecture exercise.)
• Unbounded Knapsack:
  • Suppose I have infinite copies of all of the items.
  • What's the most valuable way to fill the knapsack?
• "Greedy" algorithm for unbounded knapsack:
  • Tacos have the best Value/Weight ratio!
  • Keep grabbing tacos!

Capacity: 10
Item:     (pictures in the original slide; the taco is the item with weight 3)
Weight:   6    2    4    3    11
Value:    20   8    14   13   35

• Greedy (three tacos): total weight 9, total value 39.
• But a better choice exists: total weight 10, total value 42.
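To see the gap concretely, here is a small Python sketch (not from the slides). It assumes the item table reads as weights [6, 2, 4, 3, 11] and values [20, 8, 14, 13, 35], with the taco being the weight-3 item; the function names are made up for illustration. The DP routine is the standard unbounded-knapsack table, included only as a reference point for the greedy answer.

```python
def greedy_unbounded_knapsack(weights, values, capacity):
    """Keep grabbing the item with the best value/weight ratio (the "tacos")."""
    best = max(range(len(weights)), key=lambda i: values[i] / weights[i])
    copies = capacity // weights[best]
    return copies * values[best]


def dp_unbounded_knapsack(weights, values, capacity):
    """K[c] = best total value with capacity c and unlimited copies of each item."""
    K = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for w, v in zip(weights, values):
            if w <= c:
                K[c] = max(K[c], K[c - w] + v)
    return K[capacity]


# Assumed reading of the slide's item table.
weights = [6, 2, 4, 3, 11]
values = [20, 8, 14, 13, 35]

greedy_value = greedy_unbounded_knapsack(weights, values, 10)  # 39
optimal_value = dp_unbounded_knapsack(weights, values, 10)     # 42
```

Greedy stops at three tacos with one unit of capacity wasted, which is exactly why it loses here.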
Example where greedy works: Activity selection
[Figure: a day's schedule of overlapping activities along a time axis: Frisbee Practice, Orchestra, CS161 study group, Sleep, CS110 Class, Theory Lunch, Theory Seminar, Combinatorics Seminar, Underwater basket weaving class, Math 51 Class, CS161 Class, CS166 Class, CS161 Section, CS161 Office Hours, Swimming lessons, Programming team meeting, Social activity.]
You can only do one activity at a time, and you want to maximize the number of activities that you do.
What to choose?
Activity selection
• Input:
  • Activities a_1, a_2, …, a_n
  • Start times s_1, s_2, …, s_n
  • Finish times f_1, f_2, …, f_n
• Output:
  • How many activities can you do today?
Greedy Algorithm
[Figure: activities a1 through a7 drawn as intervals on a time axis; successive frames show the greedy choices being made one at a time.]
• Pick the activity you can add with the smallest finish time.
• Repeat.
At least it's fast
• Running time:
  • O(n) if the activities are already sorted by finish time.
  • Otherwise, O(n log(n)) if you have to sort them first.
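A minimal Python sketch of the greedy rule, assuming activities are given as (start, finish) pairs and that an activity may start exactly when the previous one finishes. The function name and the example intervals are hypothetical; they are not the ones in the figure.

```python
def greedy_activity_selection(activities):
    """Return a list of pairwise-compatible (start, finish) activities,
    always picking the doable activity with the smallest finish time."""
    chosen = []
    current_end = float("-inf")
    # Sorting costs O(n log n); the scan below is a single O(n) pass.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= current_end:      # doable after everything chosen so far
            chosen.append((start, finish))
            current_end = finish
    return chosen


# Hypothetical example intervals.
example = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
schedule = greedy_activity_selection(example)
# schedule == [(1, 4), (5, 7), (8, 11)]
```

If the input is already sorted by finish time, the `sorted` call can be dropped and only the linear scan remains.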
What makes it greedy?
• At each step in the algorithm, make a choice.
  • Hey, I can increase my activity set by one,
  • and leave lots of room for future choices,
  • let's do that and hope for the best!!!
• Hope that at the end of the day, this results in a globally optimal solution.
Three questions
1. Does this greedy algorithm for activity selection work?
2. In general, when are greedy algorithms a good idea?
3. The "greedy" approach is often the first you'd think of…
   • Why are we getting to it now, in Week 8?
Answers
1. Does this greedy algorithm for activity selection work?
   • Yes. (Seems to: IPython notebook…) (But now let's see why…)
2. In general, when are greedy algorithms a good idea?
   • When they exhibit especially nice optimal substructure.
3. The "greedy" approach is often the first you'd think of…
   • Why are we getting to it now, in Week 8?
   • Related to dynamic programming! (Which we did in Week 7.)
   • Proving that greedy algorithms work is often not so easy.
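Why does it work?
• Whenever we make a choice, we don't rule out an optimal solution.
[Figure: activities a1 through a7 on a time axis. One annotation reads "Our next choice would be this one"; another reads "There's some optimal solution that contains our next choice."]
To see this, consider optimal substructure.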
Optimal Substructure
• Subproblem i:
  • A[i] = number of activities you can do after Activity i finishes.
[Figure: activities on a time axis, with a_i and the next choice a_k marked.]
Want to show: when we make a choice a_k, the optimal solution to the smaller sub-problem k will help us solve sub-problem i.
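Claim
• Let a_k have the smallest finish time among activities doable after a_i finishes.
• Then A[i] = A[k] + 1.
[Figure: the time axis with two brackets. "A[i]: how many activities can I do here?" spans everything after a_i; "A[k]: how many activities can I do here?" spans everything after a_k.]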
Proof
• Let a_k have the smallest finish time among activities doable after a_i finishes.
• Then A[i] = A[k] + 1.
• Clearly A[i] ≥ A[k] + 1,
  • since we have a solution with A[k] + 1 activities: a_k, followed by an optimal solution to sub-problem k.
• Suppose toward a contradiction that A[i] > A[k] + 1.
  • Then there's some better solution to sub-problem i that doesn't use a_k.
  • Say a_j ends first after a_i in that better solution.
  • Remove a_j from the better solution and add a_k. The result is still a valid schedule: a_k finishes no later than a_j, so it can't overlap anything that came after a_j.
  • Now you have a solution of the same size… but it includes a_k, so it must have size ≤ A[k] + 1.
  • So A[i] ≤ A[k] + 1, a contradiction.
• That proves the claim.
[Figure note: the activities that overlap a_i don't count for sub-problem i, so they are greyed out.]
We never rule out an optimal solution
• We've shown:
  • If we choose a_k to have the smallest finish time among activities doable after a_i finishes, then A[i] = A[k] + 1.
• That is:
  • Assume that we have an optimal solution up to a_i.
  • By adding a_k we are still on track to hit that optimal value.
So the algorithm is correct
• We never rule out an optimal solution.
• At the end of the algorithm, we've got a solution.
• It's not not optimal.
• So it must be optimal.
(Lucky the Lackadaisical Lemur)
So the algorithm is correct
• Inductive Hypothesis:
  • After adding the t'th thing, there is an optimal solution that extends the current solution.
• Base case:
  • After adding zero activities, there is an optimal solution extending that.
• Inductive step:
  • TODO
• Conclusion:
  • After adding the last activity, there is an optimal solution that extends the current solution.
  • The current solution is the only solution that extends the current solution.
  • So the current solution is optimal.
(Plucky the Pedantic Penguin)
Inductive step
• Suppose that after adding the t'th thing (Activity i), there is an optimal solution:
  • X activities done and A[i] activities left.
• Then we add the (t+1)'st thing (Activity k).
  • A[k] = A[i] - 1 (by the claim).
• Now:
  • X + 1 activities done and A[k] = A[i] - 1 activities left.
  • Same total number as before!
  • Still optimal.
Common strategy for greedy algorithms
• Make a series of choices.
• Show that, at each step, our choice won't rule out an optimal solution at the end of the day.
• After we've made all our choices, we haven't ruled out an optimal solution, so we must have found one.
Common strategy (formally) for greedy algorithms
• Inductive Hypothesis:
  • After greedy choice t, you haven't ruled out success.
• Base case:
  • Success is possible before you make any choices.
• Inductive step:
  • TODO
• Conclusion:
  • If you reach the end of the algorithm and haven't ruled out success, then you must have succeeded.
DP view of activity selection
• This algorithm is most naturally viewed as a greedy algorithm.
  • Make greedy choices
  • Never rule out success
• But, we could view it as a DP algorithm.
  • Take advantage of optimal sub-structure and fill in a table.
• We'll do that now.
  • Just for pedagogy!
  • (This isn't the best way to think about activity selection.)
Recipe for applying Dynamic Programming
• Step 1: Identify optimal substructure.
• Step 2: Find a recursive formulation for the value of the optimal solution.
• Step 3: Use dynamic programming to find the value of the optimal solution.
• Step 4: If needed, keep track of some additional info so that the algorithm from Step 3 can find the actual solution.
• Step 5: If needed, code this up like a reasonable person.
Optimal substructure
• Subproblem i:
  • A[i] = number of activities you can do after Activity i finishes.
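We did that already
• Let a_k have the smallest finish time among activities doable after a_i finishes.
• Then A[i] = A[k] + 1.
[Figure: the time axis with two brackets. "A[i]: how many activities can I do here?" spans everything after a_i; "A[k]: how many activities can I do here?" spans everything after a_k.]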
Top-down DP
• Initialize a global array A to [None, …, None]
• Make a "dummy" activity that ends at time -1.
• def findNumActivities(i):
  • If A[i] != None:
    • Return A[i]
  • Let Activity k be the activity I can fit in my schedule after Activity i with the smallest finish time.
  • If there is no such activity k, set A[i] = 0
  • Else, A[i] = findNumActivities(k) + 1
  • Return A[i]
• Return findNumActivities(0)
This is a terrible way to write this! The only thing that matters here is that the highlighted lines are our recursive relationship.
See IPython notebook for implementation.
Top-down DP
• Initialize a global array A to [None, …, None]
• Initialize a global array Next to [None, …, None]
• Make a "dummy" activity that ends at time -1.
• def findNumActivities(i):
  • If A[i] != None:
    • Return A[i]
  • Let Activity k be the activity I can fit in my schedule after Activity i with the smallest finish time.
  • If there is no such activity k, set A[i] = 0
  • Else, A[i] = findNumActivities(k) + 1 and Next[i] = k
  • Return A[i]
• findNumActivities(0)
• Step through the "Next" array to get the schedule.
This is a terrible way to write this! The only thing that matters here is that the highlighted lines are our recursive relationship.
See IPython notebook for implementation.
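For reference, one way the pseudocode above might look as runnable Python. This is a sketch, not the course notebook's implementation: the name find_schedule, the (start, finish) representation, and the assumptions that every activity has start < finish and nonnegative times are mine. The linear scan for k keeps it close to the slide rather than making it fast.

```python
def find_schedule(activities):
    """activities: list of (start, finish) pairs with 0 <= start < finish."""
    # Index 0 is the "dummy" activity that ends at time -1.
    acts = [(-1, -1)] + list(activities)
    n = len(acts)
    A = [None] * n      # A[i] = number of activities doable after activity i
    Next = [None] * n   # Next[i] = activity chosen right after activity i

    def find_num_activities(i):
        if A[i] is not None:
            return A[i]
        # Real activities that fit in the schedule after activity i finishes.
        doable = [k for k in range(1, n) if k != i and acts[k][0] >= acts[i][1]]
        if not doable:
            A[i] = 0
        else:
            k = min(doable, key=lambda j: acts[j][1])  # smallest finish time
            A[i] = find_num_activities(k) + 1          # the recursive relationship
            Next[i] = k
        return A[i]

    count = find_num_activities(0)
    # Step through the Next array to recover the schedule.
    schedule, i = [], Next[0]
    while i is not None:
        schedule.append(acts[i])
        i = Next[i]
    return count, schedule


# Hypothetical example intervals.
count, schedule = find_schedule(
    [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
)
# count == 3, schedule == [(1, 4), (5, 7), (8, 11)]
```

Each sub-problem calls exactly one other sub-problem, so the memo table A is never actually reused here; that is the sense in which this DP collapses to the greedy algorithm.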
Let's step through it. (See IPython notebook for code with some print statements.)
This looks pretty familiar!!
[Figure: activities a1 through a7 on a time axis, stepped through frame by frame.]
• Start with the activity with the smallest finish time.
• Now find the next activity still doable with the smallest finish time, and recurse after that. (This step repeats until nothing else fits.)
• Ta-da!
It's exactly the same* as the greedy solution!
*if you implement the top-down DP solution appropriately.
Sub-problem graph view
• Divide-and-conquer:
  [Figure: a tree. The big problem splits into sub-problems, each of which splits into its own sub-sub-problems.]
• Dynamic Programming:
  [Figure: the big problem depends on several sub-problems, which share sub-sub-problems.]
• Greedy algorithms:
  [Figure: a single chain: big problem, then sub-problem, then sub-sub-problem.]
  • Not only is there optimal sub-structure:
    • optimal solutions to a problem are made up from optimal solutions of sub-problems
  • but each problem depends on only one sub-problem.
Answers
1. Does this greedy algorithm for activity selection work?
   • Yes.
2. In general, when are greedy algorithms a good idea?
   • When they exhibit especially nice optimal substructure.
3. The "greedy" approach is often the first you'd think of…
   • Why are we getting to it now, in Week 8?
   • Related to dynamic programming! (Which we did in Week 7.)
   • Proving that greedy algorithms work is often not so easy.
Let's see a few more examples
Another example: Scheduling
[Figure: an Overcommitted Stanford Student surrounded by tasks: CS161 HW! Call your parents! Math HW! Econ HW! Practice musical instrument! Read CLRS! Have a social life! Sleep! Administrative stuff for your student club! Do laundry! Meditate!]
Scheduling
• n tasks
• Task i takes t_i hours
• Everything is already late!
• For every hour that passes until task i is done, pay c_i
• Example:
  • CS161 HW: 10 hours. Cost: 2 units per hour until it's done.
  • Sleep: 8 hours. Cost: 3 units per hour until it's done.
  • CS161 HW, then Sleep: costs 10 ⋅ 2 + (10 + 8) ⋅ 3 = 74 units
  • Sleep, then CS161 HW: costs 8 ⋅ 3 + (10 + 8) ⋅ 2 = 60 units
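The two totals above can be reproduced with a short Python helper. This is only a sketch: the function name schedule_cost and the (hours, cost_per_hour) tuple layout are choices made for illustration.

```python
def schedule_cost(jobs):
    """jobs: list of (hours, cost_per_hour), in the order they are done.
    Each job pays its hourly cost for every hour until it is finished."""
    elapsed = 0
    total = 0
    for hours, cost_per_hour in jobs:
        elapsed += hours                  # time at which this job finishes
        total += elapsed * cost_per_hour
    return total


cs161_hw = (10, 2)
sleep = (8, 3)
hw_first = schedule_cost([cs161_hw, sleep])      # 10*2 + 18*3 = 74
sleep_first = schedule_cost([sleep, cs161_hw])   # 8*3 + 18*2 = 60
```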
Optimal substructure
• This problem breaks up nicely into sub-problems:
  • Suppose this is the optimal schedule: Job A, Job B, Job C, Job D.
  • Then this must be the optimal schedule on just jobs B, C, D: Job B, Job C, Job D.
• Seems amenable to a greedy algorithm:
  • Take the best job first. Then solve the problem on the remaining jobs.
  • Take the best of those first. Then solve the problem on what remains.
  • Keep going until one job is left. (That one's easy :) )
What does "best" mean?
• Recipe for greedy algorithm analysis:
  • We make a series of choices.
  • We show that, at each step, our choice won't rule out an optimal solution at the end of the day.
  • After we've made all our choices, we haven't ruled out an optimal solution, so we must have found one.
[Figure: Job A, followed by the remaining problem on Job B, Job C, Job D.]
"Best" means: won't rule out an optimal solution.
The optimal solution to the remaining problem extends to an optimal solution to the whole thing.
Head-to-head
• Of these two jobs, which should we do first?
  • Job A: x hours. Cost: z units per hour until it's done.
  • Job B: y hours. Cost: w units per hour until it's done.
• Cost(A then B) = x ⋅ z + (x + y) ⋅ w
• Cost(B then A) = y ⋅ w + (x + y) ⋅ z
A then B is better than B then A when:
  $xz + (x + y)w \le yw + (x + y)z$
  $xz + xw + yw \le yw + xz + yz$
  $wx \le yz$
  $\frac{w}{y} \le \frac{z}{x}$
What matters is the ratio: (cost of delay) / (time it takes).
Do the job with the biggest ratio first.
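Lemma
• Given jobs so that Job i takes time t_i with cost c_i,
• there is an optimal schedule so that the first job is the one that maximizes the ratio c_i/t_i.
• Proof:
  • Say Job B maximizes this ratio, and it's not first. Let Job A be the job immediately before it, so c_B/t_B ≥ c_A/t_A.
    [Figure: schedule with Job A directly before Job B, alongside Job C and Job D.]
  • Switch A and B! Nothing else will change (every other job finishes at the same time as before), and we showed on the previous slide that the cost won't increase.
  • Repeat until B is first.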
Choose greedily: Biggest cost/time ratio first
• Job i takes time t_i with cost c_i
• There is an optimal schedule so that the first job is the one that maximizes the ratio c_i/t_i
• So if we choose jobs greedily according to c_i/t_i, we never rule out success!
Greedy Scheduling Solution
• scheduleJobs(JOBS):
  • Sort JOBS by the ratio:
    • $r_i = \frac{c_i}{t_i} = \frac{\text{cost of delaying job } i}{\text{time job } i \text{ takes to complete}}$
  • Say that sorted_JOBS[i] is the job with the i'th biggest r_i
  • Return sorted_JOBS
The running time is O(n log(n)).
Now you can go about your schedule peacefully, in the optimal way.
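A possible Python rendering of scheduleJobs, with a brute-force comparison over all orderings as a sanity check. The job list is made up, jobs are assumed to be (time, cost) pairs with positive times, and the helper total_cost is an illustrative name, not something from the slides.

```python
from itertools import permutations


def schedule_jobs(jobs):
    """jobs: list of (time, cost) pairs. Biggest cost/time ratio goes first."""
    return sorted(jobs, key=lambda job: job[1] / job[0], reverse=True)


def total_cost(order):
    """Each job pays its cost for every hour until it is done."""
    elapsed = total = 0
    for time, cost in order:
        elapsed += time
        total += elapsed * cost
    return total


# Hypothetical (time, cost) pairs.
jobs = [(10, 2), (8, 3), (3, 1), (5, 4)]
greedy_cost = total_cost(schedule_jobs(jobs))
brute_force_cost = min(total_cost(p) for p in permutations(jobs))
# The two costs agree on this input, as the lemma predicts.
```

The brute-force check is only feasible for a handful of jobs (n! orderings); the greedy sort is the O(n log(n)) part.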
Formally, use induction! (Slide skipped in class.)
• Inductive hypothesis:
  • There is an optimal ordering so that the first t jobs are sorted_JOBS[:t].
• Base case:
  • When t = 0, this reads: "There is an optimal ordering so that the first 0 jobs are []."
  • That's true.
• Inductive Step:
  • Boils down to: there is an optimal ordering on sorted_JOBS[t:] so that sorted_JOBS[t] is first.
  • This follows from the Lemma.
• Conclusion:
  • When t = n, this reads: "There is an optimal ordering so that the first n jobs are sorted_JOBS."
  • aka, what we returned is an optimal ordering.
What have we learned?
• A greedy algorithm works for scheduling.
• This followed the same outline as the previous example:
  • Identify optimal substructure:
    [Figure: Job A, followed by the sub-problem on Job B, Job C, Job D.]
  • Find a way to make "safe" choices that won't rule out an optimal solution.
    • Largest ratios first.
One more example: Huffman coding
• everyday english sentence
  • 01100101 01110110 01100101 01110010 01111001 01100100 01100001 01111001 00100000 01100101 01101110 01100111 01101100 01101001 01110011 01101000 00100000 01110011 01100101 01101110 01110100 01100101 01101110 01100011 01100101
• qwertyui_opasdfg+hjklzxcv
  • 01110001 01110111 01100101 01110010 01110100 01111001 01110101 01101001 01011111 01101111 01110000 01100001 01110011 01100100 01100110 01100111 00101011 01101000 01101010 01101011 01101100 01111010 01111000 01100011 01110110
ASCII is pretty wasteful. If e (01100101) shows up so often, we should have a more parsimonious way of representing it!
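The bit strings on this slide are just 8-bit ASCII codes written back to back. A short Python sketch (the function name is made up) that reproduces them:

```python
def to_ascii_bits(text):
    """Encode each character as its 8-bit ASCII code and concatenate."""
    return "".join(format(ord(ch), "08b") for ch in text)


bits = to_ascii_bits("everyday english sentence")
# 25 characters at 8 bits each gives 200 bits; every 'e' costs 01100101.
```

Every character costs the same 8 bits regardless of how often it appears, which is the waste the slide points at.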
Suppose we have some distribution on characters
Letter:      A    B    C    D    E    F
Percentage:  45   13   12   16   9    5
For simplicity, let's go with this made-up example.
How to encode them as efficiently as possible?
Try 0 (like ASCII)
Letter:      A    B    C    D    E    F
Percentage:  45   13   12   16   9    5
Code:        000  001  010  011  100  101
• Every letter is assigned a binary string of three bits.
Wasteful!
• 110 and 111 are never used.
• We should have a shorter way of representing A.
Try 1
Letter:      A    B    C    D    E    F
Percentage:  45   13   12   16   9    5
Code:        0    00   01   1    10   11
• Every letter is assigned a binary string of one or two bits.
• The more frequent letters get the shorter strings.
• Problem:
  • Does 000 mean AAA or BA or AB?
Try 2: prefix-free coding
Letter:      A    B    C    D    E    F
Percentage:  45   13   12   16   9    5
Code:        01   101  110  00   111  100
• Every letter is assigned a binary string.
• More frequent letters get shorter strings.
• No encoded string is a prefix of any other.
Decoding 10010101 from left to right: 100 is F, then 101 is B, then 01 is A, giving FBA.
Question: What is the most efficient way to do prefix-free coding? (This isn't it.)
Confusingly, "prefix-free codes" are also sometimes called "prefix codes" (including in CLRS).
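Because no codeword is a prefix of another, decoding can be done greedily from left to right. Here is a small Python sketch, assuming the Try 2 code table reads A 01, B 101, C 110, D 00, E 111, F 100 (that reading matches the FBA example); the names CODE and decode are illustrative.

```python
# Assumed reading of the Try 2 code table.
CODE = {"A": "01", "B": "101", "C": "110", "D": "00", "E": "111", "F": "100"}


def decode(bits, code=CODE):
    """Scan left to right; emit a letter as soon as the buffer is a codeword.
    This is unambiguous only because the code is prefix-free."""
    by_codeword = {word: letter for letter, word in code.items()}
    letters, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in by_codeword:
            letters.append(by_codeword[buffer])
            buffer = ""
    if buffer:
        raise ValueError("leftover bits do not form a codeword")
    return "".join(letters)


decoded = decode("10010101")   # "FBA"
```

With the Try 1 table this same loop would misfire, since 0 is a prefix of 00 and 01.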
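A prefix-free code is a tree
[Figure: a binary tree with edges labeled 0 and 1. Leaves at depth 2: D:16 (code 00) and A:45 (code 01). Leaves at depth 3: F:5 (100), B:13 (101), C:12 (110), E:9 (111).]
As long as all the letters show up as leaves, this code is prefix-free.
"B:13" means that "B" makes up 13% of the characters that ever appear.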
Some trees are better than others
[Figure: the same tree, with leaves D:16 (00), A:45 (01), F:5 (100), B:13 (101), C:12 (110), E:9 (111).]
• Imagine choosing a letter at random from the language.
  • Not uniform, but according to our histogram!
• The cost of a tree is the expected length of the encoding of that letter.
$\text{Cost} = \sum_{\text{leaves } x} P(x) \cdot \text{depth}(x)$
  • P(x) is the probability of letter x.
  • The depth in the tree is the length of the encoding.
Expected cost of encoding a letter with this tree:
$2(0.45 + 0.16) + 3(0.05 + 0.13 + 0.12 + 0.09) = 2.39$
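The 2.39 figure can be checked in a few lines of Python. This sketch assumes the codewords read off the tree on this slide (D 00, A 01, F 100, B 101, C 110, E 111); the names P, CODE and expected_cost are illustrative.

```python
# Letter probabilities from the made-up histogram.
P = {"A": 0.45, "B": 0.13, "C": 0.12, "D": 0.16, "E": 0.09, "F": 0.05}
# Assumed codewords for the tree on this slide.
CODE = {"A": "01", "B": "101", "C": "110", "D": "00", "E": "111", "F": "100"}


def expected_cost(P, code):
    """Sum over letters of P(x) * depth(x); depth = length of the codeword."""
    return sum(P[x] * len(code[x]) for x in P)


cost = expected_cost(P, CODE)   # about 2.39 bits per letter
```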
Question
• Given a distribution P on letters, find the lowest-cost tree, where
  $\text{cost}(\text{tree}) = \sum_{\text{leaves } x} P(x) \cdot \text{depth}(x)$
  • P(x) is the probability of letter x.
  • The depth in the tree is the length of the encoding.
Optimal sub-structure
• Suppose this is an optimal tree:
  [Figure: a binary tree with one sub-tree singled out.]
  • Then this is an optimal tree on fewer letters.
  • Otherwise, we could change this sub-tree and end up with a better overall tree.
In order to design a greedy algorithm
• Think about what letters belong in this sub-problem...
  • What's a safe choice to make for these lower sub-trees?
  • Infrequent elements! We want them as low down as possible.
Solution: greedily build subtrees, starting with the infrequent letters
Start with one node per letter: D:16, A:45, B:13, F:5, C:12, E:9.
1. Merge F:5 and E:9 into a subtree of weight 14.
2. Merge C:12 and B:13 into a subtree of weight 25.
3. Merge 14 and D:16 into a subtree of weight 30.
4. Merge 25 and 30 into a subtree of weight 55.
5. Merge A:45 and 55 into the root, of weight 100.
Resulting codewords: 0; 100, 101, 110; 1110, 1111. (A is at depth 1; B, C, D are at depth 3; E, F are at depth 4.)
Expected cost of encoding a letter:
$1 \cdot 0.45 + 3 \cdot 0.41 + 4 \cdot 0.14 = 2.24$
What exactly was the algorithm?
• Create a node like [D:16] for each letter/frequency.
  • The key is the frequency (16 in this case).
• Let CURRENT be the list of all these nodes.
• while len(CURRENT) > 1:
  • X and Y ← the nodes in CURRENT with the smallest keys.
  • Create a new node Z with Z.key = X.key + Y.key
  • Set Z.left = X, Z.right = Y
  • Add Z to CURRENT and remove X and Y
• return CURRENT[0]
[Figure: X = F:5 and Y = E:9 are merged under a new node Z with key 14 (edges labeled 0 and 1); D:16, A:45, B:13, C:12 remain in CURRENT.]
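One way to turn this pseudocode into runnable Python. Two things here are my own choices rather than the slide's: CURRENT is kept in a binary heap (Python's heapq) so the two smallest keys are cheap to extract, and trees are plain nested tuples (left, right) instead of node objects. The function names are illustrative, and a non-empty alphabet is assumed.

```python
import heapq
from itertools import count


def huffman_tree(freqs):
    """freqs: dict letter -> frequency (non-empty). Returns the tree as nested
    (left, right) tuples with letters at the leaves."""
    tiebreak = count()  # unique counter so equal keys never compare the trees
    current = [(key, next(tiebreak), letter) for letter, key in freqs.items()]
    heapq.heapify(current)
    while len(current) > 1:
        x_key, _, x = heapq.heappop(current)   # smallest key
        y_key, _, y = heapq.heappop(current)   # second-smallest key
        heapq.heappush(current, (x_key + y_key, next(tiebreak), (x, y)))
    return current[0][2]


def code_lengths(tree, depth=0):
    """Depth of each leaf, which is the length of that letter's encoding."""
    if not isinstance(tree, tuple):
        return {tree: depth}
    left, right = tree
    return {**code_lengths(left, depth + 1), **code_lengths(right, depth + 1)}


freqs = {"A": 45, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5}
lengths = code_lengths(huffman_tree(freqs))
cost = sum(freqs[x] * lengths[x] for x in freqs) / 100   # about 2.24
```

With a heap, each of the n - 1 merges costs O(log n); scanning a plain list for the two smallest keys, as the pseudocode literally says, also works but takes O(n) per merge.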
Does it work?
• Yes.
• Same strategy:
  • Show that at each step, the choices we are making won't rule out an optimal solution.
  • Lemma:
    • Suppose that x and y are the two least-frequent letters. Then there is an optimal tree where x and y are siblings.
[Figure: F:5 and E:9 merged under a node 14, alongside D:16, A:45, B:13, C:12.]
Lemma proof idea
If x and y are the two least-frequent letters, there is an optimal tree where x and y are siblings.
• Say that an optimal tree looks like this:
  [Figure: x sits somewhere in the tree; a is one of the lowest-level sibling nodes. At least one of the lowest-level siblings is neither x nor y.]
• What happens to the cost if we swap x for a?
  • The cost can't increase; a was at least as frequent as x, and we just made a's encoding shorter, at the price of making x's encoding longer by the same amount.
• Repeat this logic until we get an optimal tree with x and y as siblings.
  • The cost never increased, so this tree is still optimal.
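The swap step can be made quantitative. Writing d(·) for depth in the tree before the swap (notation introduced here, not on the slide), swapping x and a only changes the two terms involving x and a. Since a is at the lowest level, d(a) ≥ d(x), and since x is one of the two least-frequent letters and a is neither x nor y, P(a) ≥ P(x):

```latex
\begin{align*}
\text{cost}_{\text{after}} - \text{cost}_{\text{before}}
  &= \bigl(P(a)\,d(x) + P(x)\,d(a)\bigr) - \bigl(P(a)\,d(a) + P(x)\,d(x)\bigr) \\
  &= \bigl(P(a) - P(x)\bigr)\bigl(d(x) - d(a)\bigr) \\
  &\le 0 .
\end{align*}
```

The first factor is nonnegative and the second is nonpositive, so the swap never increases the cost.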
Proof strategy just like before
• Show that at each step, the choices we are making won't rule out an optimal solution.
• Lemma:
  • Suppose that x and y are the two least-frequent letters. Then there is an optimal tree where x and y are siblings.
That's enough to show that we don't rule out optimality after the first step.
What about once we start grouping stuff?
[Figure: later in the algorithm, the current subtrees include a node 25 and a node 30 (which contains the node 14), alongside single letters.]
Lemma 2: this distinction doesn't really matter
[Figure: two trees. The first is the full tree on A, B, C, D, E, F, in which C:12 and B:13 sit under a node 25, and F:5, E:9, D:16 sit under a node 30. The second replaces those two subtrees with single leaves G:25 and H:30, next to A:45.]
The first thing is an optimal tree on {A, B, C, D, E, F} if and only if the second thing is an optimal tree on {A, G, H}.
• For a proof:
  • See CLRS, Lemma 16.3
    • Rigorous, although presented in a slightly different way
  • See Lecture Notes 14
    • A bit sketchier, but presented in the same way as here
  • Prove it yourself!
    • This is the best!
(Siggi the Studious Stork)
Getting all the details isn't that important, but you should convince yourself that this is true.
Together
• Lemma 1:
  • Suppose that x and y are the two least-frequent letters. Then there is an optimal tree where x and y are siblings.
• Lemma 2:
  • We may as well imagine that CURRENT contains only leaves.
• These imply:
  • At each step, our choice doesn't rule out an optimal tree.
The whole argument
• Inductive hypothesis:
  • After the t'th step, there is an optimal tree containing the current subtrees as "leaves."
• Base case:
  • After the 0'th step, there is an optimal tree containing all the characters.
• Inductive step:
  • TODO
• Conclusion:
  • After the last step, there is an optimal tree containing this whole tree as a subtree.
  • aka, after the last step the tree we've constructed is optimal.
[Figure: after the t'th step, we've got a bunch of current sub-trees. The inductive hypothesis asserts that our subtrees can be assembled into an optimal tree.]
Inductive step
• Suppose that the inductive hypothesis holds for t-1.
  • After t-1 steps, there is an optimal tree containing all the current sub-trees as "leaves."
• Want to show:
  • After t steps, there is an optimal tree containing all the current sub-trees as "leaves."
• We've got a bunch of current sub-trees, say w, x, y, z; say that x and y are the two smallest.
• By Lemma 2, we may as well treat each current sub-tree as a single leaf.
  • In particular, optimal trees on this new alphabet correspond to optimal trees on the original alphabet.
• Our algorithm would do this at level t: merge x and y under a new node a, with a = x + y.
• Lemma 1 implies that there's an optimal sub-tree that looks like this; aka, what our algorithm did is okay.
• Lemma 2 again says that there's an optimal tree that looks like this, with the merged sub-tree a expanded back into its original sub-trees.
• This is what we wanted to show for the inductive step.
What have we learned?
• ASCII isn't an optimal way to encode English, since the distribution on letters isn't uniform.
• Huffman Coding is an optimal way (among prefix-free codes that encode one letter at a time)!
• To come up with an optimal scheme for any language efficiently, we can use a greedy algorithm.
• To come up with a greedy algorithm:
  • Identify optimal substructure.
  • Find a way to make "safe" choices that won't rule out an optimal solution.
    • Create subtrees out of the smallest two current subtrees.
Recap I
• Greedy algorithms!
• Three examples:
  • Activity Selection
  • Scheduling Jobs
  • Huffman Coding
Recap II
• Greedy algorithms!
• Often easy to write down
  • But may be hard to come up with and hard to justify
• The natural greedy algorithm may not always be correct.
• A problem is a good candidate for a greedy algorithm if:
  • it has optimal substructure
  • that optimal substructure is REALLY NICE
    • solutions depend on just one other sub-problem.
Next time
• Greedy algorithms for Minimum Spanning Tree!
Before next time
• Pre-lecture exercise: candidate greedy algorithms for MST