Midterm Review 15-211 Fundamental Data Structures and Algorithms Margaret Reid-Miller 2 March 2004.
Midterm Review
15-211 Fundamental Data Structures and Algorithms
Margaret Reid-Miller, 2 March 2004
Midterm Mechanics
Midterm mechanics
- Worth a total of 125 points
- Closed books, closed computers
- One sheet of notes allowed
- If you have a question, raise your hand and stay in your seat
Basic Data Structures
Basic Data Structures
- Linked List
- Stack
- Queue
- Tree
  - Height of tree, depth of node, level
  - Perfect, complete, full
  - Min & max number of nodes
Java - Implementations
- Interfaces vs abstract classes vs classes
- Induction: base case vs inductive case
- Recursion
- Recurrence relations
- Object-oriented classes
- Invariants: loop, recursion, representation
- Persistent vs destructive
- Sentinels
Recurrences
Example: list reversal
The reversal of a list L is:
- L, if L is empty
- M.append(n), otherwise
where n is the first element of L, and M is the reversal of the tail of L.
some constant-time steps
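The recursive definition above can be sketched in Java as follows. This is a minimal illustration, not the course's code: the class name ReverseSketch is hypothetical, and the base case returns a fresh empty list (rather than L itself) so that the caller can append to it.

```java
import java.util.ArrayList;
import java.util.List;

class ReverseSketch {
    // Returns a new list holding the elements of l in reverse order.
    static <T> List<T> reverse(List<T> l) {
        if (l.isEmpty()) {
            return new ArrayList<>();                     // base case: reversal of the empty list
        }
        T n = l.get(0);                                   // n: first element of L
        List<T> m = reverse(l.subList(1, l.size()));      // M: reversal of the tail of L
        m.add(n);                                         // M.append(n)
        return m;
    }

    public static void main(String[] args) {
        System.out.println(reverse(List.of(1, 2, 3, 4))); // prints [4, 3, 2, 1]
    }
}
```

Each call does a fixed amount of work besides the recursive call (assuming subList and add are effectively constant-time), which gives a recurrence of the form T(n) = T(n-1) + constant.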
Recurrence Relations
- E.g., T(n) = T(n-1) + n/2
- Solve by repeated substitution
- Solve resulting series
- Prove by guessing and substitution
- Divide & Conquer Theorem: T(N) = aT(N/c) + bN
Solving recurrence equations
Repeated substitution:
t(n) = n + t(n-1)
     = n + (n-1) + t(n-2)
     = n + (n-1) + (n-2) + t(n-3)
     and so on…
     = n + (n-1) + (n-2) + (n-3) + … + 1
Incrementing series
This is an arithmetic series that comes up over and over again, because it characterizes many nested loops:
  for (i=1; i<n; i++) {
    for (j=1; j<i; j++) {
      f();
    }
  }
Visualizing it
[Figure: staircase of columns of height 1, 2, 3, … drawn inside an n-by-n square]
Area: n^2/2
Area of the leftovers: n/2
So: n^2/2 + n/2 = (n^2+n)/2 = n(n+1)/2
Doubling summation
Like the incrementing summation, sums of powers of 2 are also encountered frequently in computer science.
What is the closed-form solution for this sum? Prove your answer by induction.
Visualizing it
Imagine filling a glass by halves…
[Figure: a glass filled in layers labeled 2^(n-1), 2^(n-2), 2^(n-3), 2^(n-4), 2^(n-5)]
Divide-and-Conquer Theorem
Theorem: Let a, b, c ≥ 0. The recurrence relation
  T(1) = b
  T(N) = aT(N/c) + bN, for any N which is a power of c
has upper-bound solutions
  T(N) = O(N) if a < c
  T(N) = O(N log N) if a = c
  T(N) = O(N^(log_c a)) if a > c
a=2, b=1, c=2 for recursive sorting
Asymptotics
Performance and Scaling
n          100n μsec   7n^2 μsec   2^n μsec
1          100 μs      7 μs        2 μs
5          .5 ms       175 μs      32 μs
10         1 ms        .7 ms       1 ms
45         4.5 ms      14 ms       1 year
100        100 ms      7 sec       10^16 year
1,000      1 sec       12 min      --
10,000     10 sec      20 hr       --
1,000,000  1.6 min     .22 year    --
“Big-Oh” notation
[Figure: running time plotted against N; curves T(N) and cf(N), with the point n0 marked on the N axis]
T(N) = O(f(N))
“T(N) is order f(N)”
Upper And Lower Bounds
- f(n) = O( g(n) ) Big-Oh: f(n) ≤ c g(n) for some constant c and n > n0
- f(n) = Ω( g(n) ) Big-Omega: f(n) ≥ c g(n) for some constant c and n > n0
- f(n) = Θ( g(n) ) Theta: f(n) = O( g(n) ) and f(n) = Ω( g(n) )
Upper And Lower Bounds
- f(n) = O( g(n) ) Big-Oh: Can only be used for upper bounds.
- f(n) = Ω( g(n) ) Big-Omega: Can only be used for lower bounds.
- f(n) = Θ( g(n) ) Theta: Pins down the running time exactly (up to a multiplicative constant).
Big-O characteristic
Low-order terms “don’t matter”:
Suppose T(n) = 20n^3 + 10n log n + 5. Then T(n) = O(n^3).
Question: What constants c and n0 can be used to show that the above is true?
Answer: c=35, n0=1
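One way to check the answer c=35, n0=1 is to bound each term by a multiple of n^3. The short derivation below is a sketch of that argument; it relies only on log n ≤ n^2 and 1 ≤ n^3 for n ≥ 1.

```latex
\begin{align*}
T(n) &= 20n^3 + 10\,n\log n + 5 \\
     &\le 20n^3 + 10n^3 + 5n^3
        && \text{since } \log n \le n^2 \text{ and } 1 \le n^3 \text{ for } n \ge 1 \\
     &= 35\,n^3 .
\end{align*}
```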
Big-O characteristic
The bigger task always dominates eventually.
If T1(N) = O(f(N)) and T2(N) = O(g(N)), then T1(N) + T2(N) = max( O(f(N)), O(g(N)) ).
Also: T1(N) · T2(N) = O( f(N) · g(N) ).
Some common functions
[Figure: plot of 10N, 100 log N, 5 N^2, N^3, and 2^N for N = 1 to 10; vertical axis from 0 to 1200]
Dictionaries
Dictionary
Operations:
- Insert
- Delete
- Find
Implementations:
- Binary Search Tree
- AVL Tree
- Trie
- Hash
Binary Search Trees (BST)
“Perfect” BST:
- Height?
- Number of nodes?
- Number of nodes at level i?
- Operations time?
Worst BST?
- Operations time?
AVL trees
- Definition
- Min number of nodes of height H: F_(H+3) - 1, where F_n is the nth Fibonacci number
- Insert - single & double rotations. How many?
- Delete - lazy. How bad?
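The Fibonacci formula for the minimum node count can be checked numerically. The sketch below assumes the usual convention that a single node has height 0 and an empty tree has height -1, and that F_1 = F_2 = 1; the class and method names are made up for illustration.

```java
class AvlMinNodes {
    // Fewest nodes in an AVL tree of height h: a root plus the sparsest
    // subtrees of heights h-1 and h-2 (heights may differ by at most 1).
    static long minNodes(int h) {
        if (h < 0) return 0;     // empty tree
        if (h == 0) return 1;    // single node
        return 1 + minNodes(h - 1) + minNodes(h - 2);
    }

    // n-th Fibonacci number with F(1) = F(2) = 1.
    static long fib(int n) {
        long a = 0, b = 1;       // F(0), F(1)
        for (int i = 0; i < n; i++) {
            long t = a + b;
            a = b;
            b = t;
        }
        return a;
    }

    public static void main(String[] args) {
        for (int h = 0; h <= 10; h++) {
            System.out.println("H=" + h + ": " + minNodes(h) + " vs F(H+3)-1 = " + (fib(h + 3) - 1));
        }
    }
}
```

Under these conventions the two columns agree, for example 7 nodes at height 3, which is F_6 - 1 = 8 - 1.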
Single rotation
For the case of insertion into left subtree of left child:
[Figure: tree before and after a single rotation, with subtrees X, Y, Z]
Deepest node of X has depth 2 greater than deepest node of Z.
After the rotation: depth of X reduced by 1, depth of Z increased by 1.
Double rotation
For the case of insertion into the right subtree of the left child.
[Figure: tree before and after a double rotation, with subtrees X, Y1, Y2, Z]
Tries
- Good for unequal length keys or sequences
- Find O(m), m sequence length
- But: few to many children
[Figure: example trie holding the words I, like, love, lovely, you, with numeric values at the nodes]
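A minimal trie sketch in Java, using the words from the figure. It is only one possible layout: the per-node HashMap is an assumption chosen because a node may have anywhere from a few to many children, and the class name TrieSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class TrieSketch {
    private static class Node {
        Map<Character, Node> children = new HashMap<>(); // few to many children per node
        boolean isKey;                                   // true if a stored key ends here
    }

    private final Node root = new Node();

    void insert(String key) {
        Node cur = root;
        for (char ch : key.toCharArray()) {
            cur = cur.children.computeIfAbsent(ch, c -> new Node());
        }
        cur.isKey = true;
    }

    // Follows one edge per character, so the cost is O(m) for a key of length m.
    boolean find(String key) {
        Node cur = root;
        for (char ch : key.toCharArray()) {
            cur = cur.children.get(ch);
            if (cur == null) return false;
        }
        return cur.isKey;
    }

    public static void main(String[] args) {
        TrieSketch t = new TrieSketch();
        for (String w : new String[] {"I", "like", "love", "lovely", "you"}) t.insert(w);
        System.out.println(t.find("love") + " " + t.find("lov")); // prints true false
    }
}
```

Note that "love" and "lovely" share one path; the isKey flag is what distinguishes a stored key from a mere prefix such as "lov".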
Hash Tables
- Hash function h: h(key) = index
- Desirable properties:
  - Approximate random distribution
  - Easy to calculate
  - E.g., Division: h(k) = k mod m
- Perfect hashing
  - When all input keys are known in advance
Collisions
- Separate chaining
  - Linked list: ordered vs unordered
- Open addressing
  - Linear probing - clustering very bad with high load factor
  - *Quadratic probing - secondary clustering, table size must be prime
  - Double hashing - table size must be prime, too complex
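As an illustration of open addressing, here is a small linear-probing sketch for integer keys using the division hash h(k) = k mod m. It is a simplified example under stated assumptions: no deletion, no rehashing, and a hypothetical class name LinearProbingSketch.

```java
class LinearProbingSketch {
    private final Integer[] table;   // null marks an empty slot
    private int size;

    LinearProbingSketch(int m) { table = new Integer[m]; }

    // Division method: h(k) = k mod m (floorMod keeps the index non-negative).
    private int h(int k) { return Math.floorMod(k, table.length); }

    // Returns false if the key is already present or the table is full.
    boolean insert(int k) {
        if (find(k) || size == table.length) return false;
        int i = h(k);
        while (table[i] != null) {
            i = (i + 1) % table.length;   // linear probing: try the next slot
        }
        table[i] = k;
        size++;
        return true;
    }

    boolean find(int k) {
        int i = h(k);
        for (int probes = 0; probes < table.length && table[i] != null; probes++) {
            if (table[i] == k) return true;
            i = (i + 1) % table.length;
        }
        return false;
    }

    public static void main(String[] args) {
        LinearProbingSketch t = new LinearProbingSketch(7);
        t.insert(10); t.insert(17); t.insert(24);   // all hash to slot 3 and form one cluster
        System.out.println(t.find(17) + " " + t.find(3)); // prints true false
    }
}
```

The three colliding keys end up in consecutive slots, which is the clustering effect that makes linear probing slow at high load factors.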
Hash Tables
- Delete?
- Rehash when load factor high - double the table size (amortized cost constant)
- Find & insert are near constant time!
- But: no min, max, next,… operation
- Trade space for time: load factors <75%
Priority Queues
Priority Queues
Operations:
- Insert
- FindMin
- DeleteMin
Implementations:
- Linked list
- Search tree
- Heap
Possible priority queue implementations
Linked list
  deleteMin   O(1)       or   O(N)
  insert      O(N)       or   O(1)
Search trees
  All operations O(log N)
Heaps
              avg (assume random)   worst
  deleteMin   O(log N)              O(log N)
  insert      2.6                   O(log N)
  special case: buildheap (i.e., N inserts)
              O(N)                  O(N)
Heaps
Properties:
1. Complete binary tree in an array
2. Heap order property
- Insert: percolate up
- DeleteMin: percolate down
- BuildHeap: starting at bottom, percolate down
- Heapsort: BuildHeap + DeleteMin
Representing complete binary trees
Arrays (1-based): parent at position i, children at 2i (and 2i+1).
[Figure: complete binary tree with nodes numbered 1 to 10, stored in array positions 1 2 3 4 5 6 7 8 9 10]
Insert - Percolation up
- Insert leaf to establish complete tree property.
- Bubble inserted leaf up the tree until the heap order property is satisfied.
[Figure: heap with keys 13, 21, 16, 24, 31, 19, 68, 65, 26, 32; the new key 14 percolates up from a new leaf marked "Not really there..."]
DeleteMin - Percolation down
- Move last leaf to root to restore complete tree property.
- Bubble the transplanted leaf value down the tree until the heap order property is satisfied.
[Figure: two-step example (1, 2) of DeleteMin on a heap with minimum 14 and last leaf 31; other keys 16, 19, 21, 24, 26, 32, 65, 68]
BuildHeap - Percolation down
- Start at bottom subtrees.
- Bubble subtree root down until the heap order property is satisfied.
[Figure: example tree with keys 24, 23, 65, 26, 21, 31, 19, 16, 68, 14, 32]
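The heap operations above (insert, deleteMin, buildHeap) can be sketched on a 1-based array as follows. This is an illustrative min-heap of ints, not the course's implementation; the class name MinHeapSketch is hypothetical, and the order of the example keys in the input array is an assumption.

```java
import java.util.Arrays;

class MinHeapSketch {
    private int[] a;   // a[1..n] holds the heap; a[0] is unused (1-based layout)
    private int n;

    MinHeapSketch(int capacity) { a = new int[capacity + 1]; }

    // BuildHeap: copy the items, then percolate down each internal node, bottom up.
    MinHeapSketch(int[] items) {
        n = items.length;
        a = new int[n + 1];
        System.arraycopy(items, 0, a, 1, n);
        for (int i = n / 2; i >= 1; i--) percolateDown(i);
    }

    void insert(int x) {
        if (n + 1 == a.length) a = Arrays.copyOf(a, 2 * a.length);
        int hole = ++n;                            // new leaf keeps the tree complete
        while (hole > 1 && x < a[hole / 2]) {      // percolate up: parent of i is i/2
            a[hole] = a[hole / 2];
            hole /= 2;
        }
        a[hole] = x;
    }

    int deleteMin() {
        if (n == 0) throw new IllegalStateException("empty heap");
        int min = a[1];
        a[1] = a[n--];                             // move the last leaf to the root
        if (n > 0) percolateDown(1);
        return min;
    }

    private void percolateDown(int hole) {
        int x = a[hole];
        while (2 * hole <= n) {
            int child = 2 * hole;                  // children of i are 2i and 2i+1
            if (child < n && a[child + 1] < a[child]) child++;   // pick the smaller child
            if (a[child] >= x) break;              // heap order restored
            a[hole] = a[child];
            hole = child;
        }
        a[hole] = x;
    }

    boolean isEmpty() { return n == 0; }

    public static void main(String[] args) {
        // Keys taken from the BuildHeap example; their order here is an assumption.
        MinHeapSketch h = new MinHeapSketch(new int[] {24, 23, 65, 26, 21, 31, 19, 16, 68, 14, 32});
        h.insert(13);
        StringBuilder sb = new StringBuilder();
        while (!h.isEmpty()) sb.append(h.deleteMin()).append(' ');  // heapsort order
        System.out.println(sb.toString().trim());
    }
}
```

Repeated deleteMin after buildHeap yields the keys in ascending order, which is the Heapsort idea (BuildHeap + DeleteMin).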
Data Compression
Pi 10000
31415926535897932384626433832795028841971693993751… [slide shows the first 10,000 digits of Pi]
pitiny.c
This C program is just 143 characters long!
And it “decompresses” into the first 10,000 digits of Pi.
long a[35014],b,c=35014,d,e,f=1e4,g,h; main(){for(;b=c-=14;h=printf("%04ld",e+d/f)) for(e=d%=f;g=--b*2;d/=g) d=d*b+f*(h?a[b]:f/5), a[b]=d%--g;}
Data Compression
Huffman
- Optimal prefix-free codes
- Full binary tree - short codes for ?
- Priority queue on “tree” frequency
LZW
- Dictionary of codes for previously seen patterns
- When a pattern is found, increase its length by one
- Trie
Huffman
Full: every node
- Is a leaf, or
- Has exactly 2 children.
Build tree bottom up:
- Use priority queue of trees; weight - sum of frequencies.
- New tree of two lowest weight trees.
[Figure: code tree with leaves a, b, c, d and edges labeled 0 and 1]
a=1, b=001, c=000, d=01
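The bottom-up construction can be sketched with java.util.PriorityQueue. The slide does not give symbol frequencies, so the values used below (a=8, d=4, b=2, c=1) are hypothetical, chosen so that the resulting codes match the example a=1, b=001, c=000, d=01. Which child gets 0 and which gets 1 is also a convention assumed here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class HuffmanSketch {
    private static class Node {
        final int weight;
        final char symbol;
        final Node left, right;
        Node(char symbol, int weight) {            // leaf
            this.symbol = symbol; this.weight = weight; this.left = null; this.right = null;
        }
        Node(Node l, Node r) {                     // internal node: weight = sum of frequencies
            this.symbol = 0; this.weight = l.weight + r.weight; this.left = l; this.right = r;
        }
        boolean isLeaf() { return left == null; }
    }

    static Map<Character, String> codes(Map<Character, Integer> freq) {
        PriorityQueue<Node> pq = new PriorityQueue<>((x, y) -> Integer.compare(x.weight, y.weight));
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            pq.add(new Node(e.getKey(), e.getValue()));
        }
        while (pq.size() > 1) {                    // merge the two lowest-weight trees
            Node first = pq.poll();
            Node second = pq.poll();
            pq.add(new Node(first, second));
        }
        Map<Character, String> out = new HashMap<>();
        if (!pq.isEmpty()) assign(pq.poll(), "", out);
        return out;
    }

    private static void assign(Node t, String prefix, Map<Character, String> out) {
        if (t.isLeaf()) {
            out.put(t.symbol, prefix.isEmpty() ? "0" : prefix);  // lone symbol still needs one bit
            return;
        }
        assign(t.left, prefix + "0", out);
        assign(t.right, prefix + "1", out);
    }

    public static void main(String[] args) {
        // Hypothetical frequencies; not taken from the slides.
        System.out.println(codes(Map.of('a', 8, 'd', 4, 'b', 2, 'c', 1)));
    }
}
```

With ties in the weights, different but equally optimal trees are possible, so the exact bit patterns depend on how the priority queue breaks ties.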
Byte LZW: Compress example
Input: baddad
Dictionary (initial): 0 = a, 1 = b, 2 = c, 3 = d
Dictionary (added): 4 = ba, 5 = ad, 6 = dd, 7 = da
Output: 1 0 3 3 5
Byte LZW: Uncompress example
Input: 1 0 3 3 5
Dictionary (initial): 0 = a, 1 = b, 2 = c, 3 = d
Dictionary (added): 4 = ba, 5 = ad, 6 = dd, 7 = da
Output: baddad
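The two LZW examples can be reproduced with a short sketch. It uses a HashMap of strings rather than the trie drawn on the slides, purely to keep the code short, and it assumes every input character belongs to the initial alphabet. The class name LzwSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LzwSketch {
    // alphabet.charAt(i) starts out with code i.
    static List<Integer> compress(String input, String alphabet) {
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < alphabet.length(); i++) dict.put(String.valueOf(alphabet.charAt(i)), i);
        List<Integer> out = new ArrayList<>();
        String w = "";
        for (char ch : input.toCharArray()) {
            String wc = w + ch;
            if (dict.containsKey(wc)) {
                w = wc;                          // keep extending the current match
            } else {
                out.add(dict.get(w));            // emit the longest known pattern
                dict.put(wc, dict.size());       // new pattern, one character longer
                w = String.valueOf(ch);
            }
        }
        if (!w.isEmpty()) out.add(dict.get(w));
        return out;
    }

    static String decompress(List<Integer> codes, String alphabet) {
        List<String> dict = new ArrayList<>();
        for (char ch : alphabet.toCharArray()) dict.add(String.valueOf(ch));
        StringBuilder out = new StringBuilder();
        String prev = null;
        for (int code : codes) {
            // A code equal to dict.size() is the one not yet in the dictionary.
            String entry = code < dict.size() ? dict.get(code) : prev + prev.charAt(0);
            out.append(entry);
            if (prev != null) dict.add(prev + entry.charAt(0));
            prev = entry;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Integer> codes = compress("baddad", "abcd");
        System.out.println(codes + " -> " + decompress(codes, "abcd"));
    }
}
```

Running it on the input baddad with the alphabet abcd gives the codes 1 0 3 3 5, and decompressing those codes gives baddad back, matching both slides.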
Sorting
Simple sorting algorithms
Several simple, quadratic algorithms (worst case and average).
- Bubble Sort
- Insertion Sort
- Shell Sort (sub-quadratic)
Only Insertion Sort of practical interest: running time linear in number of inversions of input sequence.
Constants small. Stable?
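The link between insertion sort and inversions can be made concrete by counting element shifts. In the sketch below (class name hypothetical), each shift removes exactly one inversion, so the returned count equals the number of inversions in the input.

```java
class InsertionSortSketch {
    // Sorts a in place and returns the number of element shifts performed.
    static long sort(int[] a) {
        long shifts = 0;
        for (int i = 1; i < a.length; i++) {
            int x = a[i];
            int j = i;
            while (j > 0 && a[j - 1] > x) {   // strict > never moves x past an equal key
                a[j] = a[j - 1];
                j--;
                shifts++;
            }
            a[j] = x;
        }
        return shifts;
    }

    public static void main(String[] args) {
        System.out.println(sort(new int[] {3, 1, 2}));     // prints 2: inversions (3,1) and (3,2)
        System.out.println(sort(new int[] {1, 2, 3, 4}));  // prints 0: already sorted
    }
}
```

On nearly sorted input the inner loop barely runs, which is why the total work is proportional to n plus the number of inversions.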
Sorting Review
Asymptotically optimal O(n log n) algorithms (worst case and average).
- Merge Sort
- Heap Sort
Merge Sort purely sequential and stable.
But requires extra memory: 2n + O(log n).
Quick Sort
Overall fastest. In place.
BUT:
Worst case quadratic. Average case O(n log n).
Not stable.
Implementation details messy.
Average-case analysis
Consider the quicksort tree:
[Figure: quicksort tree, shown level by level]
105 47 13 17 30 222 5 19
5 17 13 47 30 222 105 19
5 17 30 222 105 13 47
105 222
Radix Sort
Used by old computer-card-sorting machines.
Linear time:
• b passes on b-bit elements
• b/m passes m bits per pass
Each pass must be stable
BUT:
Uses 2n+2^m space.
May only beat Quick Sort for very large arrays.
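A least-significant-digit radix sort with m bits per pass might look like the sketch below. It assumes non-negative int keys and 1 ≤ m ≤ 16; the class name is hypothetical. Each pass is a stable counting sort on one m-bit digit, using one extra array of n elements plus a counter array of about 2^m entries.

```java
class RadixSortSketch {
    // LSD radix sort of non-negative ints, m bits per pass (1 <= m <= 16 assumed).
    static void sort(int[] a, int m) {
        int buckets = 1 << m;
        int mask = buckets - 1;
        int[] out = new int[a.length];                    // second array of n elements
        for (int shift = 0; shift < 31; shift += m) {     // non-negative ints have 31 value bits
            int[] count = new int[buckets + 1];           // one counter per digit value
            for (int x : a) count[((x >>> shift) & mask) + 1]++;
            for (int d = 0; d < buckets; d++) count[d + 1] += count[d];  // start index per digit
            for (int x : a) out[count[(x >>> shift) & mask]++] = x;      // left-to-right scan keeps the pass stable
            System.arraycopy(out, 0, a, 0, a.length);
        }
    }

    public static void main(String[] args) {
        int[] a = {170, 45, 75, 90, 802, 24, 2, 66};
        sort(a, 4);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Larger m means fewer passes but a bigger counter array, which is the space-for-time trade-off noted above.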
Questions?