CS208: Algorithms and Complexity
Lecture 3: Search and Sort
Thomas Selig
University of Strathclyde
2 February 2017
T. Selig (Univ. Strathclyde) CS208: A & C 2 February 2017 0 / 26
Assignments
They’re meant to help you.
Some questions can have several correct answers.
Working together is fine, even encouraged, but...
You are allowed to start working on them more than 48 hours before they are due.
I look for answers that are clear and reasonably concise. Beyond that, I'm not bothered by format.
In the lectures so far we...
We introduced complexity, asymptotic complexity.
Formalised asymptotic complexity using big-O notation.
Determined the asymptotic complexity of polynomials.
We considered non-recursive algorithms.
Determined complexity of non-recursive algorithms.
Rules for if, while, for and sequential algorithms.
We considered recursive algorithms.
Determined complexity of recursive algorithms.
Obtained a recurrence relation and quoted its solution.
Overview of Lecture 3
Today: Complexity of algorithms for searching in and sorting lists.
Initial questions:
Input Size: Length of the list.
Primitive Operations: Comparisons of numbers.
Algorithms: As a by-product, this lecture is also revision.

In this lecture:
Reminders about recursive algorithms
Two observations
Two algorithms for searching
Selection Sort analysis
Insertion Sort analysis
Merge Sort analysis
Recursive algorithms
Recursive algorithms are algorithms which call themselves.
Recursive algorithms have recursively defined complexity functions.
The worst case asymptotic complexity is governed by a recurrence relation.
You should be able to:
Derive the recurrence relation for a recursive algorithm.
Know the complexity class of functions defined by recurrence relations.
Know a little bit about logarithms.
What does O(1) mean?
O(1) doesn’t ‘mean’ anything by itself.
To say that a function f is O(1) does mean something...
To say that an algorithm A is O(1) usually means that the asymptotic worst-case time complexity of the algorithm for input size n, the function TA(n), is O(1).
I’ll return to this point again.
Small observation
Suppose T (n) = k + T (n/2). Then
T (256) = k + T (128)
= k + k + T (64)
= k + k + k + T (32)
= k + k + k + k + T (16)
= k + k + k + k + k + T (8)
= k + k + k + k + k + k + T (4)
= k + k + k + k + k + k + k + T (2)
= k + k + k + k + k + k + k + k + T (1).
So T (256) = 8k + T (1) = k log2(256) + constant.
By the same reasoning, T (n) = k log2(n) + constant
This argument is used to show that T (n) is O(log2 n).
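The unrolling above can be checked mechanically. A small sketch (Python, not part of the slides; `halving_steps` is a hypothetical helper name) that counts how many k-terms appear when n is repeatedly halved:

```python
import math

def halving_steps(n):
    """Count how many times n can be halved before reaching 1,
    i.e. the number of k-terms unrolled from T(n) = k + T(n/2)."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(halving_steps(256))   # 8
print(math.log2(256))       # 8.0
```

The count agrees with log2(n), which is exactly why T(n) = k log2(n) + constant.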
A searching algorithm
Problem: Searching a list of integers L for m.
Algorithm search1(L,m)
Require: List L and integer m
if L = [] then
    answer ← false
else if head(L) = m then
    answer ← true
else
    answer ← search1(tail(L), m)
end if
return answer
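A direct Python transcription of the pseudocode above, as a sanity check (not part of the slides):

```python
def search1(L, m):
    """Recursive linear search: compare the head to m, else search the tail."""
    if not L:                  # L = []
        return False
    if L[0] == m:              # head(L) = m
        return True
    return search1(L[1:], m)   # search1(tail(L), m)

print(search1([3, 1, 4, 1, 5], 4))   # True
print(search1([3, 1, 4, 1, 5], 9))   # False
```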
Analysis of search1
Complexity: Primitive operation is comparing list head to value. Defined recursively by
Tsearch1(n) =
    0                        if L is empty
    1                        if the head of L is m
    1 + Tsearch1(n − 1)      otherwise.
Recurrence relation: Worst-case asymptotic complexity given by
Tsearch1(n) = 1 + Tsearch1(n − 1)
so Tsearch1(n) is O(n) (WB). We say ‘search1 has linear worst-case time complexity’.
Can we do better? In general, no. However...
Another searching example
Suppose you want to search a list which is ordered.
Algorithm search2(L,m)
Require: Ordered list L and integer m
if L = [] then
    answer ← false
else
    a ← middle value(L)
    if a = m then
        answer ← true
    else if a > m then
        answer ← search2(first half(L), m)
    else
        answer ← search2(second half(L), m)
    end if
end if
return answer
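Again as a sanity check (not part of the slides), a Python transcription of search2, using slicing for the two halves:

```python
def search2(L, m):
    """Recursive binary search on a sorted list L."""
    if not L:                            # L = []
        return False
    mid = len(L) // 2
    a = L[mid]                           # a <- middle value(L)
    if a == m:
        return True
    elif a > m:
        return search2(L[:mid], m)       # search the first half
    else:
        return search2(L[mid + 1:], m)   # search the second half

print(search2([1, 3, 5, 7, 9], 7))   # True
print(search2([1, 3, 5, 7, 9], 4))   # False
```

Note that slicing copies the list, so this sketch matches the comparison count but not the constant-factor cost of an index-based implementation.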
Analysis of search2
Recurrence Relation: The primitive operation is comparing the middle list value to m. Worst-case asymptotic complexity is given by
Tsearch2(n) = 1 + Tsearch2(n/2)
so Tsearch2(n) is O(log2(n)) (by the small observation earlier). In this case we say 'search2 has logarithmic (log) worst-case time complexity'.

Terminology: Algorithms such as this one, which break a problem down into two (or more) sub-problems of the same type, are called divide and conquer algorithms.
Why are such algorithms usually efficient?
Selection Sort
Description: To sort a list
Head of the result is the smallest member of the list.
To obtain the tail of the result, remove the element we have just found and then sort the remaining elements recursively.
Algorithm selsort(L)
Require: Input is a list L
1: if L = [] then
2:     return []
3: else
4:     x ← smallest(L)
5:     L′ ← selsort(remove(x, L))
6:     J ← cons([x], L′)
7:     return J
8: end if
Auxiliary Functions: Selection sort uses three other functions.
smallest returns the smallest element of a list.
remove deletes an element from a list.
cons is the concatenation operator.
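A Python sketch of selsort (not part of the slides), with `min` and `list.remove` standing in for the auxiliary functions smallest and remove:

```python
def selsort(L):
    """Recursive selection sort: smallest element first, then sort the rest."""
    if not L:                   # L = []
        return []
    x = min(L)                  # x <- smallest(L)
    rest = list(L)
    rest.remove(x)              # L' <- remove(x, L): delete one occurrence of x
    return [x] + selsort(rest)  # cons([x], selsort(L'))

print(selsort([3, 1, 2]))      # [1, 2, 3]
```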
Analysis of Selection Sort
Key Idea: Selection sort uses other functions.
So the complexity of selection sort depends on the complexity of the other functions.
Step 1: Derive the complexity using the rules for if, while, for and sequential algorithms, where Ti(n) denotes the cost of line i of selsort, S(n) the complexity of smallest, and R(n) that of remove:

T(n) =
    0                                     if L = []
    T4(n) + T5(n) + T6(n)
      = S(n) + (R(n) + T(n − 1)) + 0
      = S(n) + R(n) + T(n − 1)            if L ≠ []
Step 2: It is easy to see that S(n) = a1n + b1 and R(n) = a2n + b2 for some numbers a1, a2, b1, b2. (What exactly they are we do not care.) The important point is that they are both linear functions of n, and therefore their sum will also be a linear function of n. So
T (n) = a3n + b3 + T (n − 1)
for some numbers a3 and b3. (Again we do not care what they are!)
Another way to say this is that since S(n) and R(n) are both O(n), their sum S(n) + R(n) is O(n).
The previous equation can be written in the following (incorrect!) but highly suggestive form:
T (n) = O(n) +O(n) + T (n − 1) = O(n) + T (n − 1)
What has just happened?
We have just performed arithmetic using big-O notation.
Can we make sense of the equation T(n) = O(n) + T(n − 1)?
What about T(n) − T(n − 1) = O(n)?
What about T(n) − T(n − 1) is O(n)?
Resolving T (n) = O(n) + T (n − 1)
The equation T (n) = O(n) + T (n − 1) ‘means’
T (n)− T (n − 1) = O(n),
which in turn 'means'
T(n) − T(n − 1) is O(n).
So from the definition of big-O, this tells us there are numbers N and c such that
T (n)− T (n − 1) ≤ cn
for all n > N.
If T(n) = O(n) + T(n − 1) then T(n) is O(n^2). (WB)
More generally, if T(n) = O(n^a) + T(n − 1) for some a ≥ 0, then T(n) is O(n^(a+1)). (For an example with a = 0, recall search1.)
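The (WB) claim can be filled in by telescoping the inequality above; a sketch of the argument:

```latex
% From T(n) - T(n-1) <= c n for all n > N, sum from N+1 to n:
T(n) = T(N) + \sum_{i=N+1}^{n} \bigl(T(i) - T(i-1)\bigr)
     \le T(N) + c \sum_{i=N+1}^{n} i
     \le T(N) + c\,\frac{n(n+1)}{2},
% and the right-hand side is a constant plus at most c n^2,
% so T(n) is O(n^2).
```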
The asymptotic complexity of Selection Sort
For Selection Sort, we have that T (n) = O(n) + T (n − 1).
We can therefore conclude that T(n) is O(n^2), i.e. Selection Sort has worst-case time complexity that is quadratic.
Insertion Sort
Description: Insertion sort is defined as follows
insert inserts an item in a sorted list
inssort recursively sorts the tail and insert puts the head in the correct place.
Algorithm inssort(L)
if L = [] then
    return L
else
    a ← head(L)
    return insert(a, inssort(tail(L)))
end if

Algorithm insert(a, L)
if L = [] then
    return [a]
else
    b ← head(L)
    if a ≤ b then
        return cons([a], L)
    else
        return cons([b], insert(a, tail(L)))
    end if
end if
Insertion Sort
We can re-write the algorithm as follows.
Description: Insertion sort is defined as follows
insert inserts an item in a sorted list
inssort recursively sorts the tail and insert puts the head in the correct place.
Algorithm inssort(L)
inssort([]) ← []
inssort(cons(a, L′)) ← insert(a, inssort(L′))

insert(a, []) ← cons(a, [])
insert(a, cons(b, L)) ← cons(a, cons(b, L)) when a ≤ b
insert(a, cons(b, L)) ← cons(b, insert(a, L)) when a > b
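The rewrite rules translate almost line for line into Python (a sketch, not part of the slides):

```python
def insert(a, L):
    """Insert a into the sorted list L, keeping it sorted."""
    if not L:                        # insert(a, []) <- [a]
        return [a]
    b = L[0]
    if a <= b:                       # a belongs at the front
        return [a] + L
    return [b] + insert(a, L[1:])    # keep b, insert a into the tail

def inssort(L):
    """Recursive insertion sort: sort the tail, then insert the head."""
    if not L:
        return []
    return insert(L[0], inssort(L[1:]))

print(inssort([5, 2, 4, 1]))   # [1, 2, 4, 5]
```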
Complexity of Insertion Sort
The primitive operations to be counted are comparisons of pairs of integers.
Insert: First, the complexity of insert:
Tinsert(n) =
    0                        if n = 0
    1                        if a ≤ head(L)
    1 + Tinsert(n − 1)       otherwise
and so insert is a linear, i.e. O(n), algorithm.
Insertion Sort: The complexity of insertion sort is given by
Tinssort(n) =
    0                                    if n = 0
    Tinsert(n − 1) + Tinssort(n − 1)     otherwise
So Tinssort(n) = O(n) + Tinssort(n − 1).
Hence: Insertion sort is therefore a quadratic, i.e. O(n^2), algorithm.
Merge Sort
Description: Merge sort is defined as follows
Divide the list L into two (as far as possible) equal parts first(L) and second(L).
Recursively Merge sort each part, giving two sorted lists.
Merge the two resulting lists, e.g.
mge([1, 10, 20, 22], [5, 15, 21]) = [1, 5, 10, 15, 20, 21, 22].
Algorithm msort(L)
Require: Input is a list L
msort([]) ← []
msort([e]) ← [e]
msort(L) ← mge(msort(first(L)), msort(second(L))) when |L| > 1
Algorithm mge(L, L′)
Require: Input is a pair of sorted lists (L, L′)
mge([], L) ← L
mge(L, []) ← L
mge(cons(a1, L1), cons(a2, L2)) ← cons(a1, mge(L1, cons(a2, L2))) when a1 ≤ a2
mge(cons(a1, L1), cons(a2, L2)) ← cons(a2, mge(cons(a1, L1), L2)) when a1 > a2
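A Python sketch of msort and mge following the rules above (not part of the slides):

```python
def mge(L1, L2):
    """Merge two sorted lists into one sorted list."""
    if not L1:
        return L2
    if not L2:
        return L1
    if L1[0] <= L2[0]:
        return [L1[0]] + mge(L1[1:], L2)
    return [L2[0]] + mge(L1, L2[1:])

def msort(L):
    """Recursive merge sort: split, sort each half, merge."""
    if len(L) <= 1:                 # msort([]) and msort([e]) cases
        return list(L)
    mid = len(L) // 2
    return mge(msort(L[:mid]), msort(L[mid:]))

print(mge([1, 10, 20, 22], [5, 15, 21]))   # [1, 5, 10, 15, 20, 21, 22]
print(msort([4, 2, 7, 1, 3]))              # [1, 2, 3, 4, 7]
```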
Complexity of Merge Sort
Merge: To merge two lists whose total length is n, in the worst case we need n − 1 steps.
Merge Sort:
Tmsort(n) =
    0                          if n = 0
    0                          if n = 1
    O(n) + 2Tmsort(n/2)        otherwise.
Fact: A recurrence relation of the form
T (n) = O(n) + 2T (n/2)
has a solution which is O(n log2(n)). (WB)
Conclusion: Merge sort is asymptotically more efficient than both insertion sort and selection sort.
Quick sort
Description: Given list L
Choose a pivot x in L.
Move the elements of L so that:
    all elements to the left of x are less than it, and
    all elements to the right of x are greater than it.
Apply this procedure recursively to the lists to the left and right of x.

The Quick sort routine on a list L is executed by calling q2sort(L, 1, length(L))
(the additional parameters are needed for the recursive definition).
Algorithm qsort(L)
Require: Input is a list L
1: q2sort(L, 1, length(L))
2: return L

Algorithm q2sort(L, p, r)
Require: Input is a list L, and two integers p and r
1: if p < r then
2:     q ← partition(L, p, r)
3:     q2sort(L, p, q − 1)
4:     q2sort(L, q + 1, r)
5: end if

Algorithm partition(L, p, r)
Require: Input is a list L, and two integers p and r
1: x ← L[r]
2: i ← p − 1
3: for j = p to r − 1 do
4:     if L[j] ≤ x then
5:         i ← i + 1
6:         swap L[i] and L[j]
7:     end if
8: end for
9: swap L[i + 1] and L[r]
10: return i + 1
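A Python sketch of the three routines (not part of the slides). The pseudocode uses 1-based indices; Python lists are 0-based, so the initial call becomes q2sort(L, 0, len(L) − 1):

```python
def partition(L, p, r):
    """Partition L[p..r] (inclusive) around the pivot L[r]."""
    x = L[r]                          # pivot
    i = p - 1
    for j in range(p, r):
        if L[j] <= x:
            i += 1
            L[i], L[j] = L[j], L[i]   # swap L[i] and L[j]
    L[i + 1], L[r] = L[r], L[i + 1]   # put the pivot in its final place
    return i + 1

def q2sort(L, p, r):
    """Sort L[p..r] in place."""
    if p < r:
        q = partition(L, p, r)
        q2sort(L, p, q - 1)
        q2sort(L, q + 1, r)

def qsort(L):
    """In-place quicksort of the whole list."""
    q2sort(L, 0, len(L) - 1)
    return L

print(qsort([3, 1, 4, 1, 5, 9, 2]))   # [1, 1, 2, 3, 4, 5, 9]
```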
Complexity of Quicksort
Let Tqsort(n) be the time taken to quicksort an input of size n.
Then
Tqsort(n) = Tpartition(n) + Tqsort(m) + Tqsort(n − 1 − m)
for some value m. Now Tpartition(n) is essentially n, and in any case O(n). So
Tqsort(n) = n + Tqsort(m) + Tqsort(n − 1 − m)
for some m ∈ {0, 1, . . . , n − 1}.

Analysing the behaviour of qsort is difficult due to the form of the above equation: it depends on a quantity m which can vary greatly depending upon the particular input.
Complexity of Quicksort
However, for worst case time complexity we may write
Tqsort(n) ≤ n + max over m ∈ {0, . . . , n − 1} of {Tqsort(m) + Tqsort(n − 1 − m)}.
Solving such recursive inequalities is a delicate art.
The solution T is generally well behaved, and one expects the max to be achieved at m = 0, m = n − 1, or m = (n − 1)/2. It is difficult to explain exactly why, but the 'level 1' reason combines an argument about how quickly T increases with a symmetry argument.
To cut a long derivation short, the answer is Tqsort(n) ∈ O(n^2).
Complexity of Quicksort
We have shown that the worst-case complexity of Quicksort is O(n^2), i.e. quadratic.
In principle, this is less efficient than Mergesort.
However, recall:
Tqsort(n) = n + Tqsort(m) + Tqsort(n − 1−m)
for some m ∈ {0, 1, . . . , n − 1}.
If, in general, m = 0 or m = n − 1, we get T(n) = O(n) + T(n − 1), so T(n) is O(n^2).
But if, in general, m = n/2, we get T(n) = O(n) + 2T(n/2) and T(n) is O(n log2(n)).
Same as Mergesort! This is what happens on average.
Comparing the sorting algorithms
Selection Sort has complexity that is quadratic for best and worst cases.
Insertion Sort has worst case time complexity that is quadratic, but its best case is linear (if the list is already sorted).
Merge Sort has worst case time complexity O(n log2(n)) (also average case).
Quick sort has worst-case complexity that is quadratic, but average case that isO(n log2(n)).
Quick sort vs Merge sort? Beyond the scope of these lectures (depends on how listsare implemented...).
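The best-case/worst-case gap for Insertion Sort can be observed directly by counting comparisons. A sketch (not part of the slides; `insert_count` and `inssort_count` are hypothetical helper names wrapping the recursive definitions from earlier):

```python
def insert_count(a, L):
    """insert(a, L) that also returns the number of comparisons made."""
    if not L:
        return [a], 0
    if a <= L[0]:                        # 1 comparison, done
        return [a] + L, 1
    rest, c = insert_count(a, L[1:])     # 1 comparison, then recurse
    return [L[0]] + rest, c + 1

def inssort_count(L):
    """Insertion sort returning (sorted list, total comparisons)."""
    if not L:
        return [], 0
    tail, c1 = inssort_count(L[1:])
    out, c2 = insert_count(L[0], tail)
    return out, c1 + c2

n = 10
_, best = inssort_count(list(range(n)))          # already sorted
_, worst = inssort_count(list(range(n, 0, -1)))  # reverse order
print(best, worst)   # 9 45, i.e. n - 1 vs n(n-1)/2
```

The counts match the analysis: a sorted input costs n − 1 comparisons (linear), a reversed input costs n(n − 1)/2 (quadratic).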