C++ Programming: Program Design Including Data Structures, Third Edition

Post on 03-Jan-2016


Chapter 19: Searching and Sorting Algorithms

Objectives

In this chapter you will:

• Learn the various search algorithms

• Explore how to implement the sequential and binary search algorithms

• Discover how the sequential and binary search algorithms perform

• Become aware of the lower bound on comparison-based search algorithms

• Learn the various sorting algorithms

• Explore how to implement the bubble, selection, insertion, quick, and merge sorting algorithms

• Discover how the sorting algorithms discussed in this chapter perform

• Searching is the most important operation that can be performed on a list. Using a search algorithm, you can do the following:

− Determine whether a particular item is in the list.

− If the data is specially organized (for example, sorted), find the location in the list where a new item can be inserted.

− Find the location of an item to be deleted.

• The searching and sorting algorithms that we describe are generic.

• Because searching and sorting require comparisons of data, the algorithms should work on any type of data that provides appropriate functions to compare data items.

• Data can be organized with the help of an array or a linked list.

• You can create an array of data items or you can use the class unorderedLinkedList to organize data.

• The algorithms that we describe should work on either organization.

• We write function templates to implement a particular algorithm.

• All algorithms described in this chapter, with the exception of the merge sort algorithm, are for array-based lists.

• We show how to use the searching and sorting algorithms on objects of the class unorderedArrayListType.

• We place all the array-based searching and sorting functions in the header file searchSortAlgorithms.h.

• If you need to use a particular searching and/or sorting function designed in this chapter, your program can include this header file and use that function.

• Associated with each item in a data set is a special member that uniquely identifies the item in the data set.

• This unique member of the item is called the key of the item.

• The keys of the items in the data set are used in such operations as searching, sorting, inserting, and deleting.

• When analyzing searching and sorting algorithms, the key comparisons refer to comparing the key of the search item with the key of an item in the list.

• The number of key comparisons refers to the number of times the key of the search item (in algorithms such as searching and sorting) is compared with the keys of the items in the list.

• Sequential search does not require the list elements to be in any particular order.

• The statements before and after the loop are executed only once, and hence require very little computer time.

• The statements in the for loop are the ones that are repeated several times. For each iteration of the loop, the search item is compared with an element in the list, and a few other statements are executed, including some other comparisons.

• The loop terminates as soon as the search item is found in the list.

• Execution of the other statements in the loop is directly related to the outcome of the key comparison.

• Different programmers might implement the same algorithm differently, although the number of key comparisons would typically be the same.

• The speed of a computer can also affect the time an algorithm takes to perform, but it does not affect the number of key comparisons required.

• Therefore, when analyzing a search algorithm, we count the number of key comparisons because this number gives us the most useful information.
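The sequential search discussed above can be sketched as follows. This is a minimal illustration, not necessarily the function in searchSortAlgorithms.h: the name seqSearch, the extra comparisons parameter, and the convention of returning -1 for an unsuccessful search are assumptions made here so that the number of key comparisons can be observed directly.

```cpp
// Hypothetical sketch: sequential search that also reports how many key
// comparisons were made. Returns the index of item, or -1 if it is absent.
template <class elemType>
int seqSearch(const elemType list[], int length, const elemType& item,
              int& comparisons)
{
    comparisons = 0;

    for (int loc = 0; loc < length; loc++)
    {
        comparisons++;          // one key comparison per iteration
        if (list[loc] == item)
            return loc;         // the loop ends as soon as the item is found
    }

    return -1;                  // unsuccessful search: length comparisons
}
```

Searching for the first element reports one comparison, while searching for the last element or for an item that is not in the list reports as many comparisons as there are elements.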

• Suppose that L is a list of length n.

• If the search item is not in the list, we then compare the search item with every element in the list, making n comparisons. This is an unsuccessful case.

• Suppose that the search item is in the list.

• The number of key comparisons depends on where in the list the search item is located.

• If the search item is the first element of L, we make only one key comparison. This is the best case.

• On the other hand, if the search item is the last element in the list, the algorithm makes n comparisons. This is the worst case.

• To determine the average number of comparisons in the successful case of the sequential search algorithm:

1. Consider all possible cases.

2. Find the number of comparisons for each case.

3. Add the number of comparisons and divide by the number of cases.

• If the search item, called the target, is the first element in the list, one comparison is required. If the target is the second element in the list, two comparisons are required. Similarly, if the target is the kth element in the list, k comparisons are required. We assume that the target is equally likely to be any element in the list.

• The following expression gives the average number of comparisons:

$$\frac{1 + 2 + \cdots + n}{n} = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}$$

• This expression shows that on average, a successful sequential search searches half the list.

• If the list size is 1,000,000, on average, the sequential search makes 500,000 comparisons.

• The sequential search is not efficient for large lists.

• Binary search can be applied to sorted lists

• Uses the “divide and conquer” technique

− Compare search item to middle element

− If search item is less than middle element, restrict the search to the lower half of the list

− Otherwise search the upper half of the list

[Figures: binary search traces for the search items 89, 34, and 22]

Binary Search (continued)

• Every iteration cuts the size of the search list in half

• If list L has 1000 items

− At most 11 iterations needed to determine if an item x is in the list

• Every iteration makes 2 key (item) comparisons

− Binary search makes at most 22 key comparisons to determine if x is in L

• Sequential search makes 500 key comparisons (average) if x is in L for the same size list

• Suppose that L is a sorted list of size n.

• Suppose that n is a power of 2, that is, $n = 2^m$, for some nonnegative integer m.

• After each iteration of the loop, about half the elements are left to search.

• For example, after the first iteration the search sublist is of the size about $n / 2 = 2^{m-1}$.

• It is easy to see that the maximum number of iterations of the loop is about m + 1. Also, $m = \log_2 n$.

• Each iteration makes 2 key comparisons.

• The maximum number of comparisons to determine whether an element x is in L is $2(m + 1) = 2(\log_2 n + 1) = 2\log_2 n + 2$.
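The halving argument above can be checked with a small sketch of binary search on an array. The name binarySearch, the comparisons out-parameter, and the -1 return value are assumptions made for illustration; the deck's own function may differ in detail.

```cpp
// Hypothetical sketch: binary search on a sorted array. Returns the index
// of item, or -1 if it is absent, and counts the key comparisons made.
template <class elemType>
int binarySearch(const elemType list[], int length, const elemType& item,
                 int& comparisons)
{
    int first = 0;
    int last = length - 1;
    comparisons = 0;

    while (first <= last)
    {
        int mid = first + (last - first) / 2;

        comparisons++;              // first key comparison of the iteration
        if (list[mid] == item)
            return mid;

        comparisons++;              // second key comparison of the iteration
        if (item < list[mid])
            last = mid - 1;         // restrict the search to the lower half
        else
            first = mid + 1;        // restrict the search to the upper half
    }

    return -1;                      // item is not in the list
}
```

Every iteration that does not find the item makes two key comparisons and discards about half of the remaining elements, which is where the bound of roughly $2\log_2 n + 2$ comes from.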

• Just as a problem is analyzed before writing the algorithm and the computer program, after an algorithm is designed it should also be analyzed.

• There are various ways to design a particular algorithm.

• Certain algorithms take very little computer time to execute, while others take a considerable amount of time.

• Lines 1 to 6 each have 1 operation, << or >>.

• Line 7 has 1 operation, >=.

• Either Line 8 or Line 9 executes; each has 1 operation.

• There are 3 operations, <<, in Line 11.

• The total number of operations executed in this code is 6 + 1 + 1 + 3 = 11.

• This algorithm has 5 operations (Lines 1 through 5) before the while loop. Similarly, there are 9 or 8 operations after the while loop, depending on whether Line 11 or Line 13 executes.

• Line 5 has 1 operation, and there are 4 operations within the while loop (Lines 6 through 8).

• Lines 5 through 8 have 5 operations. If the while loop executes 10 times, these 5 operations execute 10 times, plus one extra operation is executed at Line 5 to terminate the loop. Therefore, the number of operations executed from Lines 5 through 8 is 51.

• If the while loop executes 10 times, the total number of operations executed is:

5 × 10 + 1 + 5 + 9 or 5 × 10 + 1 + 5 + 8

that is,

5 × 10 + 15 or 5 × 10 + 14

• If the while loop executes n times, the number of operations executed is:

5n + 15 or 5n + 14

• In these expressions, for very large values of n, the term 5n becomes the dominating term and the terms 15 and 14 become negligible.

• Table 19-4 shows how certain functions grow as the parameter n grows.

• Suppose that the problem size is doubled.

• If the number of basic operations is given by $f(n) = n^2$, the number of basic operations is quadrupled.

• If the number of basic operations is given by $f(n) = 2^n$, the number of basic operations is squared.

• However, if the number of basic operations is given by $f(n) = \log_2 n$, the change in the number of basic operations is insignificant.
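The three doubling claims can be verified directly by substituting 2n for n. This is a short worked derivation using only the functions named in the bullets:

```latex
\begin{align*}
f(n) = n^2 &: \quad f(2n) = (2n)^2 = 4n^2 = 4\,f(n) \\
f(n) = 2^n &: \quad f(2n) = 2^{2n} = \left(2^n\right)^2 = f(n)^2 \\
f(n) = \log_2 n &: \quad f(2n) = \log_2 2 + \log_2 n = f(n) + 1
\end{align*}
```

In the logarithmic case, doubling the problem size adds only one basic operation.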

Sorting a List: Bubble Sort

• Suppose list[0]...list[n - 1] is a list of n elements, indexed 0 to n – 1

• Bubble sort algorithm:

− In a series of n - 1 iterations, compare successive elements, list[index] and list[index + 1]

− If list[index] is greater than list[index + 1], then swap them
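The two steps above can be sketched as a function template. This is a minimal version under the assumption that the element type supports the > operator; the parameter names mirror the call bubbleSort(list, length) that appears later in the deck, and the use of std::swap is a choice made here.

```cpp
#include <utility>   // std::swap

// Sketch of bubble sort on an array of the given length.
template <class elemType>
void bubbleSort(elemType list[], int length)
{
    // n - 1 passes over the list
    for (int iteration = 1; iteration < length; iteration++)
    {
        // After each pass the largest remaining element has reached its
        // final position, so the inner loop can stop one element earlier.
        for (int index = 0; index < length - iteration; index++)
        {
            if (list[index] > list[index + 1])
                std::swap(list[index], list[index + 1]);
        }
    }
}
```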

• Suppose a list L of length n is to be sorted using bubble sort.

• Consider the function bubbleSort.

• This function contains nested for loops.

• The outer loop executes n – 1 times.

• For each iteration of the outer loop, the inner loop executes a certain number of times. Let us consider the first iteration of the outer loop.

• During the first iteration of the outer loop, the number of iterations of the inner loop is n – 1. So there are n – 1 comparisons.

• During the second iteration of the outer loop, the number of iterations of the inner loop is n – 2, and so on. Thus, the total number of comparisons is

$$(n - 1) + (n - 2) + \cdots + 2 + 1 = \frac{n(n - 1)}{2} = O(n^2)$$

• In the worst case, every comparison leads to a swap, which takes three assignments, so the number of assignments is

$$3 \cdot \frac{n(n - 1)}{2} = O(n^2)$$

template <class elemType>
void unorderedArrayListType<elemType>::sort()
{
    bubbleSort(list, length);
}

• Selection sort: rearrange list by selecting an element and moving it to its proper position

• Find the smallest (or largest) element and move it to the beginning (end) of the list

Selection Sort (continued)

• On successive passes, locate the smallest item in the list starting from the next element
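A minimal sketch of selection sort built from the two helper functions the analysis refers to, minLocation and swap. The exact signatures are assumptions made here for illustration.

```cpp
// Returns the index of the smallest element in list[first..last].
// For a sublist of length k this makes k - 1 key comparisons.
template <class elemType>
int minLocation(const elemType list[], int first, int last)
{
    int minIndex = first;

    for (int loc = first + 1; loc <= last; loc++)
        if (list[loc] < list[minIndex])
            minIndex = loc;

    return minIndex;
}

// Exchanges list[first] and list[second] using three item assignments.
template <class elemType>
void swap(elemType list[], int first, int second)
{
    elemType temp = list[first];
    list[first] = list[second];
    list[second] = temp;
}

// Sketch of selection sort: on each pass, move the smallest element of
// the unsorted part to the front of that part.
template <class elemType>
void selectionSort(elemType list[], int length)
{
    for (int loc = 0; loc < length - 1; loc++)
    {
        int minIndex = minLocation(list, loc, length - 1);
        swap(list, loc, minIndex);
    }
}
```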

• Suppose that a list L of length n is to be sorted using the selection sort algorithm.

• The function swap does three item assignments and is executed n − 1 times.

• The number of item assignments is 3(n − 1) = O(n).

• The key comparisons are made by the function minLocation.

• For a list of length k, the function minLocation makes k − 1 key comparisons. Also, the function minLocation is executed n − 1 times (by the function selectionSort).

• The first time, the function minLocation finds the index of the smallest key item in the entire list and therefore makes n − 1 comparisons.

• The second time, the function minLocation finds the index of the smallest element in the sublist of length n − 1 and makes n − 2 comparisons, and so on.

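• The number of key comparisons is as follows:

$$(n - 1) + (n - 2) + \cdots + 2 + 1 = \frac{n(n - 1)}{2} = \frac{1}{2}n^2 - \frac{1}{2}n = O(n^2)$$

• If n = 1000, the number of key comparisons the selection sort algorithm makes is

$$\frac{1}{2}(1000^2) - \frac{1}{2}(1000) = 499{,}500 \approx 500{,}000$$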

The insertion sort algorithm sorts the list by moving each element to its proper place.
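A minimal array-based sketch of insertion sort. It is written with a for loop, an if statement, a do…while loop, and a variable named firstOutOfOrder so that it lines up with the terms used in the analysis; treat it as an illustration under those assumptions rather than the deck's exact code.

```cpp
// Sketch of insertion sort on an array of the given length.
template <class elemType>
void insertionSort(elemType list[], int length)
{
    for (int firstOutOfOrder = 1; firstOutOfOrder < length; firstOutOfOrder++)
    {
        // Key comparison: is this element smaller than its predecessor?
        if (list[firstOutOfOrder] < list[firstOutOfOrder - 1])
        {
            elemType temp = list[firstOutOfOrder];
            int location = firstOutOfOrder;

            // Shift larger elements one position to the right until the
            // proper place for temp is found.
            do
            {
                list[location] = list[location - 1];
                location--;
            }
            while (location > 0 && list[location - 1] > temp);

            list[location] = temp;
        }
    }
}
```

When the list is already sorted, the if condition is false on every iteration and no element is moved.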

• Let L be a list of length n.

• The for loop executes n – 1 times.

• In the best case, when the list is already sorted, for each iteration of the for loop, the if statement evaluates to false, so there are n – 1 key comparisons.

• In the best case, the number of key comparisons is n – 1 = O(n).

• Let us consider the worst case. In this case, for each iteration of the for loop, the if statement evaluates to true. Moreover, in the worst case, for each iteration of the for loop, the do…while loop executes firstOutOfOrder – 1 times. It follows that in the worst case, the number of key comparisons is:

$$1 + 2 + \cdots + (n - 1) = \frac{n(n - 1)}{2} = O(n^2)$$

• It can be shown that the average number of key comparisons and the average number of item assignments in an insertion sort algorithm are:

$$\frac{1}{4}n^2 + O(n) = O(n^2)$$

• We can trace the execution of a comparison-based algorithm by using a graph called a comparison tree.

• Let L be a list of n distinct elements, where n > 0.

• For any j and k, where 1 ≤ j ≤ n, 1 ≤ k ≤ n, and j ≠ k, either L[j] < L[k] or L[j] > L[k].

• Because each comparison of the keys has two outcomes, the comparison tree is a binary tree.

• While drawing this figure, we draw each comparison as a circle, called a node.

• The node is labeled as j:k, representing the comparison of L[j] with L[k]. If L[j] < L[k], follow the left branch; otherwise, follow the right branch.

• Figure 19-36 shows the comparison tree for a list of length 3.

• In Figure 19-36, the rectangle, called a leaf, represents the final ordering of the nodes.

• We call the top node in the figure the root node.

• The straight line that connects the two nodes is called a branch.

• A sequence of branches from a node, x, to another node, y, is called a path from x to y.

• Associated with each path from the root to a leaf is a unique permutation of the elements of L.

• This uniqueness follows because the sort algorithm only moves the data and makes comparisons.

• For a list of n elements, n > 0, there are n! different permutations. Any one of these n! permutations might be the correct ordering of L. Thus, the comparison tree must have at least n! leaves.
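The leaf count leads to the lower bound on comparison-based sorting by a standard argument, sketched here. The symbol h is introduced for this sketch and denotes the number of comparisons on the longest path from the root to a leaf; a binary tree with that height has at most $2^h$ leaves.

```latex
\begin{align*}
2^h &\ge n! \\
h &\ge \log_2 (n!) \\
  &\ge \log_2 \left( \left( \tfrac{n}{2} \right)^{n/2} \right)
   = \tfrac{n}{2} \log_2 \tfrac{n}{2} \\
  &= \Omega(n \log_2 n)
\end{align*}
```

The middle step uses the fact that at least n/2 of the factors of n! are each at least n/2. So in the worst case, any comparison-based sort must make on the order of $n \log_2 n$ key comparisons.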

• The quick sort algorithm uses the divide-and-conquer technique to sort a list.

• The list is partitioned into two sublists, which are then sorted and combined into one list in such a way so that the combined list is sorted.

• The general algorithm is

• To partition the list into two sublists, first we choose an element of the list called pivot.

• The pivot is used to divide the list into two sublists: lowerSublist and upperSublist.

• The elements in lowerSublist are smaller than pivot, and the elements in upperSublist are greater than or equal to pivot.

• There are several ways to determine pivot.

• However, pivot is chosen so that, it is hoped, lowerSublist and upperSublist are of nearly equal size.

• Let us choose the middle element of the list as pivot.

• The partition procedure that we describe partitions this list using pivot as the middle element, in our case 50, as shown in Figure 19-38.

The partition algorithm is as follows (we assume that pivot is chosen as the middle element of the list):

1. Determine pivot, and swap pivot with the first element of the list.

Suppose that the index smallIndex points to the last element less than pivot. The index smallIndex is initialized to the first element of the list.

2. For the remaining elements in the list (starting at the second element): If the current element is less than pivot

a. Increment smallIndex.

b. Swap the current element with the array element pointed to by smallIndex.

3. Swap the first element, that is, pivot, with the array element pointed to by smallIndex.
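The three partition steps can be sketched in code, together with a recursive driver. The function names partition and recQuickSort, the index-range parameters, and the use of std::swap are assumptions made here for illustration.

```cpp
#include <utility>   // std::swap

// Partitions list[first..last] around the middle element and returns the
// final index of pivot. Elements before that index are less than pivot;
// elements after it are greater than or equal to pivot.
template <class elemType>
int partition(elemType list[], int first, int last)
{
    // Step 1: move the middle element (pivot) to the front.
    std::swap(list[first], list[first + (last - first) / 2]);
    const elemType pivot = list[first];
    int smallIndex = first;

    // Step 2: grow the region of elements that are less than pivot.
    for (int index = first + 1; index <= last; index++)
    {
        if (list[index] < pivot)
        {
            smallIndex++;
            std::swap(list[smallIndex], list[index]);
        }
    }

    // Step 3: place pivot between the two sublists.
    std::swap(list[first], list[smallIndex]);
    return smallIndex;
}

// Sorts list[first..last] by partitioning and then sorting each sublist.
template <class elemType>
void recQuickSort(elemType list[], int first, int last)
{
    if (first < last)
    {
        int pivotLocation = partition(list, first, last);
        recQuickSort(list, first, pivotLocation - 1);
        recQuickSort(list, pivotLocation + 1, last);
    }
}
```

Because pivot ends up in its final position after partitioning, no separate combine step is needed once both sublists are sorted.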

• The average-case behavior of a quick sort is $O(n \log_2 n)$. However, the worst-case behavior of a quick sort is $O(n^2)$.

• This section describes the sorting algorithm whose behavior is always $O(n \log_2 n)$.

• Like the quick sort algorithm, the merge sort algorithm uses the divide-and-conquer technique to sort a list.

• A merge sort algorithm also partitions the list into two sublists, sorts the sublists, and then combines the sorted sublists into one sorted list.

• The merge sort and the quick sort algorithms differ in how they partition the list.

• A quick sort first selects an element in the list, called pivot, and then partitions the list so that the elements in one sublist are less than pivot and the elements in the other sublist are greater than or equal to pivot.

• By contrast, a merge sort divides the list into two sublists of nearly equal size.

• Every time we advance middle by one node, we advance current by one node.

• After advancing current by one node, if current is not NULL, we again advance current by one node.

• Eventually, current becomes NULL and middle points to the last node of the first sublist.
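The two-pointer walk described above can be sketched as follows. The node structure nodeType and the function name divideList are assumptions made for this illustration, and nullptr is used where the slides say NULL.

```cpp
// Hypothetical minimal node type for a singly linked list.
template <class Type>
struct nodeType
{
    Type info;
    nodeType<Type>* link;
};

// Splits the list that starts at first1 into two sublists of nearly equal
// size. Afterward first1 heads the first sublist and first2 heads the
// second sublist (first2 is null if the list has fewer than two nodes).
template <class Type>
void divideList(nodeType<Type>* first1, nodeType<Type>*& first2)
{
    first2 = nullptr;

    if (first1 == nullptr || first1->link == nullptr)
        return;                         // nothing to divide

    nodeType<Type>* middle = first1;
    nodeType<Type>* current = first1->link;

    if (current != nullptr)
        current = current->link;

    while (current != nullptr)
    {
        middle = middle->link;          // advance middle by one node
        current = current->link;        // advance current by one node
        if (current != nullptr)
            current = current->link;    // and by a second node if possible
    }

    first2 = middle->link;              // second sublist starts after middle
    middle->link = nullptr;             // end the first sublist at middle
}
```

Since current moves roughly twice as fast as middle, middle is at the last node of the first half when current runs off the end of the list.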

• Suppose that L is a list of n elements, where n > 0.

• Suppose that n is a power of 2, that is, $n = 2^m$ for some nonnegative integer m, so that we can divide the list into two sublists, each of size:

$$\frac{n}{2} = \frac{2^m}{2} = 2^{m-1}$$

• Consider the general case when $n = 2^m$.

• The number of recursion levels is m.

• To merge a sorted list of size s with a sorted list of size t, the maximum number of comparisons is s + t − 1.

• Consider the function mergeList, which merges two sorted lists into a sorted list.

• This is where the actual work (comparisons and assignments) is done.

• The initial call to the function recMergeSort, at level 0, produces two sublists, each of the size n / 2.

• To merge these two lists, after they are sorted, the maximum number of comparisons is

$$\frac{n}{2} + \frac{n}{2} - 1 = n - 1 = O(n)$$

• At level 1, we merge two sets of sorted lists, where each sublist is of the size n / 4.

• To merge two sorted sublists, each of the size n / 4, we need at most

$$\frac{n}{4} + \frac{n}{4} - 1 = \frac{n}{2} - 1$$

comparisons.

• At level 1 of the recursion, the number of comparisons is 2(n / 2 – 1) = n – 2 = O(n).

• At level k of the recursion, there are a total of $2^k$ calls to the function mergeList. Each of these calls merges two sublists, each of the size $n / 2^{k+1}$, which requires a maximum of $n / 2^k - 1$ comparisons.

• At level k of the recursion, the maximum number of comparisons is

$$2^k \left( \frac{n}{2^k} - 1 \right) = n - 2^k = O(n)$$

• The maximum number of comparisons at each level of the recursion is O(n).

• Because the number of levels of the recursion is m, the maximum number of comparisons made by the merge sort algorithm is O(nm).

• Now $n = 2^m$ implies that $m = \log_2 n$. Hence, the maximum number of comparisons made by the merge sort algorithm is $O(n \log_2 n)$.
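The comparison counts in this analysis can be observed with a small sketch. Note the substitution: the chapter's merge sort works on linked lists, whereas this illustration uses std::vector to stay short. The names mergeSorted and mergeSort and the comparisons counter are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Merges two sorted vectors into one sorted vector. The counter is
// incremented once per key comparison, at most s + t - 1 times in total.
template <class elemType>
std::vector<elemType> mergeSorted(const std::vector<elemType>& a,
                                  const std::vector<elemType>& b,
                                  long& comparisons)
{
    std::vector<elemType> result;
    result.reserve(a.size() + b.size());
    std::size_t i = 0;
    std::size_t j = 0;

    while (i < a.size() && j < b.size())
    {
        comparisons++;
        if (b[j] < a[i])
            result.push_back(b[j++]);
        else
            result.push_back(a[i++]);
    }

    // One input is exhausted; the rest is copied without comparisons.
    while (i < a.size())
        result.push_back(a[i++]);
    while (j < b.size())
        result.push_back(b[j++]);

    return result;
}

// Recursive merge sort: divide into two halves, sort each, then merge.
template <class elemType>
std::vector<elemType> mergeSort(const std::vector<elemType>& list,
                                long& comparisons)
{
    if (list.size() <= 1)
        return list;

    auto middle = list.begin() + list.size() / 2;
    std::vector<elemType> left(list.begin(), middle);
    std::vector<elemType> right(middle, list.end());

    std::vector<elemType> sortedLeft = mergeSort(left, comparisons);
    std::vector<elemType> sortedRight = mergeSort(right, comparisons);
    return mergeSorted(sortedLeft, sortedRight, comparisons);
}
```

Merging two sorted lists of four elements whose values interleave takes 4 + 4 − 1 = 7 comparisons, the maximum for those sizes.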

• If W(n) denotes the number of key comparisons in the worst case to sort L, then $W(n) = O(n \log_2 n)$.

• Let A(n) denote the number of key comparisons in the average case.

• On average, it can be shown that the number of comparisons for merge sort is given by the following equation: If n is a power of 2, $A(n) = n \log_2 n - 1.25n = O(n \log_2 n)$.

• The presidential election for the student council of your local university is about to be held.

• The chair of the election committee wants to computerize the voting and has asked you to write a program to analyze the data and report the winner.

• The university has four major divisions, and each division has several departments.

• For the election, the four divisions are labeled as region 1, region 2, region 3, and region 4.

• Each department in each division handles its own voting and reports the votes received by each candidate to the election committee.

• The voting is reported in the following form:

firstName lastName regionNumber numberOfVotes
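A minimal sketch of reading one record in this form. The structure voteRecord and the function parseVoteRecord are hypothetical names introduced for illustration and are not part of the class candidateType designed in the chapter.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type holding one line of voting data.
struct voteRecord
{
    std::string firstName;
    std::string lastName;
    int regionNumber;
    int numberOfVotes;
};

// Parses a line of the form
//     firstName lastName regionNumber numberOfVotes
// Returns false if a field is missing or not a number, if the region is
// outside 1..4, or if the vote count is negative.
bool parseVoteRecord(const std::string& line, voteRecord& record)
{
    std::istringstream in(line);

    if (!(in >> record.firstName >> record.lastName
             >> record.regionNumber >> record.numberOfVotes))
        return false;

    return record.regionNumber >= 1 && record.regionNumber <= 4
           && record.numberOfVotes >= 0;
}
```

A program could call this function once per line of the input file and add numberOfVotes to the matching candidate's total for that region.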

• The input file containing the voting data looks like the following:

• The main component of this program is a candidate. Therefore, first we will design the class candidateType to implement a candidate object.

Function printResults