CPS120: Introduction to Computer Science Searching and Sorting.


Basics of Sorting

When you rearrange data and put it into a certain kind of order, you are sorting the data. You can sort data alphabetically, numerically, and in other ways. Often you need to sort data before you use searching algorithms to find a particular piece of data.

Key Fields

The key field is the field upon which the data is sorted.

A key value is a specific value that is stored within the key field.

The input size is the number of elements in a list that will eventually be sorted.
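To make these three terms concrete, here is a small sketch using Python's built-in `sorted`. The records, the field names `name` and `age`, and the ages themselves are hypothetical and chosen only for illustration.

```python
# Hypothetical records: each one has a "name" field and an "age" field.
records = [
    {"name": "Mary", "age": 31},
    {"name": "Terry", "age": 25},
    {"name": "Gerri", "age": 42},
]

# Here "name" is the key field; "Mary", "Terry" and "Gerri" are key values.
by_name = sorted(records, key=lambda record: record["name"])

# The same records sorted with "age" as the key field instead.
by_age = sorted(records, key=lambda record: record["age"])

# The input size is the number of elements to be sorted.
input_size = len(records)

print([record["name"] for record in by_name])
print([record["name"] for record in by_age])
print(input_size)
```

Changing the key field changes the resulting order even though the records themselves are unchanged.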

Sorting Algorithms

There are a number of different sorting algorithms that are widely used by programmers. Each algorithm has its own advantages and disadvantages. The sorting algorithms include:

Selection sort
Insertion sort
Bubble sort
Shell sort
Quick sort
Merge sort

Approaches to Sorting

There are two basic approaches to sorting data:

The incremental approach
The divide and conquer approach

Using the incremental approach, one sorts the whole list at once using loops. The divide and conquer approach splits the list into parts and sorts each part separately, then joins the sorted parts together into one large sorted list.

The Insertion Sort

The insertion sort is incremental in nature.

This is similar to the way a person usually organizes a hand of playing cards.

The insertion sort is relatively quick for small lists that are close to being sorted.

Insertion Sorting

Start     Step 1    Step 2
Mary      Mary      Gerri
Terry     Gerri     Kari
Gerri     Kari      Harry
Kari      Harry     Barry
Harry     Barry     Mary
Barry     Terry     Terry

Coding an Incremental Sort

Two lists are used throughout this algorithm. One list is sorted and the other is unsorted. The sorted list begins with one key value. Successively, one key value from the unsorted list is placed into its proper position in the smaller, sorted list. As more key values are placed into the sorted list, that list becomes larger.
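The two-list idea above can be sketched in Python as follows. This is a minimal illustration, not the course's own code: the function name `insertion_sort` is an assumption, and the "two lists" are represented as the left and right portions of a single list.

```python
def insertion_sort(keys):
    """Return a new list with the key values in ascending order."""
    result = list(keys)  # work on a copy so the caller's list is untouched
    # result[:i] plays the role of the sorted list,
    # result[i:] plays the role of the unsorted list.
    for i in range(1, len(result)):
        current = result[i]  # next key value taken from the unsorted part
        j = i - 1
        # Shift larger sorted values one place right to open a slot.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current  # place the value in its proper position
    return result


names = ["Mary", "Terry", "Gerri", "Kari", "Harry", "Barry"]
print(insertion_sort(names))
```

When the list is already close to sorted, the inner `while` loop rarely runs, which is why this sort is quick in that case.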

Understanding the Selection Sort

The selection sort is an incremental one. Every key value is examined, starting at the beginning of the list. A temporary variable is used to "remember" the position of the largest key value. By the time you have examined every key value, you swap the key value that was the largest with the last key value in the list. Next, you repeat the process from the beginning of the list; however, you will not need to compare anything to the new last key value in the list since you know it is the largest.

Coding the Selection Sort

This algorithm uses nested loops and is easy to code. It is quite inefficient since it continues processing even if the list is already sorted. Whether the original data is close to being sorted or not, this algorithm takes quite a while since a lot of loop iterations and comparisons must be made.
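The nested loops might look like the following Python sketch. The names `selection_sort`, `last` and `largest` are assumptions made for this example; `largest` plays the role of the temporary variable that remembers the position of the largest key value.

```python
def selection_sort(keys):
    """Return a new list with the key values in ascending order."""
    result = list(keys)
    # `last` is the final position of the part that is still unsorted.
    for last in range(len(result) - 1, 0, -1):
        largest = 0  # remembers the position of the largest key value so far
        for i in range(1, last + 1):
            if result[i] > result[largest]:
                largest = i
        # Swap the largest key value with the last unsorted key value.
        result[largest], result[last] = result[last], result[largest]
    return result


names = ["Mary", "Terry", "Gerri", "Kari", "Harry", "Barry"]
print(selection_sort(names))
```

Note that both loops always run to completion, so the number of comparisons is the same whether or not the input is already sorted.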

Understanding the Bubble Sort

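The bubble sort is an incremental sort. Because it can stop as soon as a sweep makes no swaps, it can finish sooner than the selection sort on lists that are nearly sorted, though in general it is not faster than the insertion sort.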

A bubble sort works similarly to the release of CO2 in carbonated soda: on each sweep, the largest remaining key value "bubbles" to the end of the list.

The use of a Boolean variable causes this sort to sweep the list only one extra time after it has been fully sorted. This makes the bubble sort more efficient than a number of other incremental sorts.

A Bubble Sort

Start     Step 1    Step 2
Mary      Mary      Mary
Terry     Terry     Gerri
Gerri     Gerri     Terry
Kari      Kari      Kari
Harry     Harry     Harry
Barry     Barry     Barry

Coding a Bubble Sort

Beginning at one end of a list, adjacent key values are compared. Assuming that you are sorting the list into ascending order, these two key values would be swapped if the first was larger than the second. Next you compare the larger of the two to the next adjacent key value in the list, swapping if necessary. By the time that you compare the last two key values in the list, you know that the largest key value from the whole list will be in the last position.

Continuing the Bubble Sort

Using a loop, this process is continued (comparing adjacent key values) from the beginning of the list. You will not need to compare anything to the last key value in the list since you know it is the largest. A Boolean variable is used in a bubble sort to "remember" if any swaps are made on a particular sweep of the list.
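A minimal Python sketch of the sweeps described above follows. The names `bubble_sort`, `swapped` and `last` are assumptions for this example; `swapped` is the Boolean variable that remembers whether the current sweep made any swaps.

```python
def bubble_sort(keys):
    """Return a new list with the key values in ascending order."""
    result = list(keys)
    last = len(result) - 1  # position of the last key value still unsorted
    swapped = True          # Boolean flag: did the previous sweep swap anything?
    while swapped and last > 0:
        swapped = False
        for i in range(last):
            # Compare adjacent key values and swap if they are out of order.
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        # The largest key value of this sweep is now in position `last`.
        last -= 1
    return result


names = ["Mary", "Terry", "Gerri", "Kari", "Harry", "Barry"]
print(bubble_sort(names))
```

On a list that is already sorted, the first sweep makes no swaps, so the `while` loop ends after a single pass.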

Understanding the Shell (Comb) Sort

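The Shell sort is an incremental sort which was named after its inventor, D. L. Shell. It is sometimes loosely called the comb sort, although strictly speaking the comb sort is a separate, closely related algorithm that applies the same gap idea to the bubble sort.

The Shell sort is much faster than other incremental sorts since very large and very small key values are quickly moved to the appropriate end of the list.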

Coding a Comb Sort

This sorting algorithm is similar to the bubble sort, but instead of comparing and swapping adjacent elements, elements that are a certain distance away from each other are compared and swapped. The certain distance can be called the gap size and might initially be set to half the input size.

After each sweep, the gap is decreased until eventually adjacent elements are compared. A Boolean variable is used as it is in the bubble sort to add efficiency.
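The gap-based sweeps can be sketched in Python as below. The slides say only that the gap is decreased after each sweep; halving it each time is an assumption made here for simplicity (other shrink rules are common), as are the names `comb_sort`, `gap` and `swapped`.

```python
def comb_sort(keys):
    """Return a new list with the key values in ascending order."""
    result = list(keys)
    gap = max(1, len(result) // 2)  # initial gap: half the input size
    while True:
        swapped = False  # Boolean flag, used as in the bubble sort
        for i in range(len(result) - gap):
            # Compare elements that are `gap` positions apart.
            if result[i] > result[i + gap]:
                result[i], result[i + gap] = result[i + gap], result[i]
                swapped = True
        # Done once a sweep over adjacent elements makes no swaps.
        if gap == 1 and not swapped:
            break
        gap = max(1, gap // 2)  # assumption: halve the gap after each sweep
    return result


names = ["Mary", "Terry", "Gerri", "Kari", "Harry", "Barry"]
print(comb_sort(names))
```

Once the gap reaches 1 the sweeps are ordinary bubble-sort sweeps, which is what guarantees the final list is fully sorted.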

Understanding the Quick Sort

The quick sort is a divide and conquer algorithm and is usually more efficient than the incremental sorts. It can be difficult to code, though, since it uses recursion or stacks. The original list is partitioned into two lists.

One of the lists contains elements that are greater than the first original element. The second list contains elements that are less than or equal to the first original element.

Quick Sort

Original    Partitioned
Kari        Gerri
Mary        Harry
Terry       Barry
Gerri       Kari
Harry       Mary
Barry       Terry

Processing the Quick Sort

Each of these two partitions is partitioned using the same algorithm.

Eventually, partitions will have only one element each. When this happens, that single-item partition is considered to be sorted, and only partitions with multiple items are partitioned.

When every partition consists of only one element, the partitions are placed into order to make a fully sorted list.
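The partitioning steps can be sketched recursively in Python. This is a simplified illustration rather than a typical production version: it builds new lists for each partition instead of rearranging the original list in place, and the names `quick_sort`, `pivot`, `smaller_or_equal` and `greater` are assumptions for this example.

```python
def quick_sort(keys):
    """Return a new list with the key values in ascending order."""
    # A partition with zero or one element is already sorted.
    if len(keys) <= 1:
        return list(keys)
    pivot = keys[0]  # the first original element
    rest = keys[1:]
    # Partition the remaining elements around the first element.
    smaller_or_equal = [key for key in rest if key <= pivot]
    greater = [key for key in rest if key > pivot]
    # Sort each partition the same way, then place them in order.
    return quick_sort(smaller_or_equal) + [pivot] + quick_sort(greater)


names = ["Kari", "Mary", "Terry", "Gerri", "Harry", "Barry"]
print(quick_sort(names))
```

For the list of names used here, the first partitioning step puts Gerri, Harry and Barry before Kari and puts Mary and Terry after it.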

Understanding the Merge Sort

The merge sort is a divide and conquer algorithm.

The whole list is divided into lists that consist of one element apiece. Then, every two adjacent lists are merged into one larger sorted list. The process continues until one sorted list remains.
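The description above corresponds to a "bottom-up" merge sort, sketched here in Python. The helper `merge` and the names `merge_sort` and `runs` are assumptions introduced for this example.

```python
def merge(left, right):
    """Merge two sorted lists into one larger sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two
    merged.extend(right[j:])  # still has elements left
    return merged


def merge_sort(keys):
    """Return a new list with the key values in ascending order."""
    runs = [[key] for key in keys]  # lists of one element apiece
    while len(runs) > 1:
        merged_runs = []
        # Merge every two adjacent lists; an odd one out is carried over.
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                merged_runs.append(merge(runs[i], runs[i + 1]))
            else:
                merged_runs.append(runs[i])
        runs = merged_runs
    return runs[0] if runs else []


names = ["Mary", "Terry", "Gerri", "Kari", "Harry", "Barry"]
print(merge_sort(names))
```

Each pass roughly halves the number of lists, so six names need three passes: six lists, then three, then two, then one.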

Sequential Searching

Although there are more efficient ways to search for data, sequential searching is a good choice when the amount of data to be searched is small. You simply check each element of an array, position by position, until you find the one that you are looking for. In any search, the item upon which the search is based is called the key and the field being searched is called the key field.
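A sequential search takes only a few lines of Python. The function name `sequential_search` and the choice to return -1 when the key is not found are assumptions for this sketch.

```python
def sequential_search(items, key):
    """Return the position of `key` in `items`, or -1 if it is absent."""
    # Check each element, position by position.
    for position, item in enumerate(items):
        if item == key:
            return position
    return -1  # assumption: -1 signals "not found"


names = ["Mary", "Terry", "Gerri", "Kari", "Harry", "Barry"]
print(sequential_search(names, "Kari"))
print(sequential_search(names, "Larry"))
```

The data does not need to be sorted for this search to work, which is part of why it suits small lists.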

Binary Searching

If the data is ordered (alphabetically, for example) in an array, then a binary search is much more efficient than a sequential search. In the case of a binary search, you first examine the "middle" element. If that element is "lower" (alphabetically, for example) than the element that you are looking for, then you discard the lower half of the data and continue to search the upper half. If it is "higher", you discard the upper half instead. The process repeats itself when you then look at the middle element of the remaining half of the data, and it ends when the element is found or no data remains.
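The halving process can be sketched in Python as follows. The names `binary_search`, `low`, `high` and `middle` are assumptions for this example, as is returning -1 when the key is not found; the input list must already be sorted in ascending order.

```python
def binary_search(sorted_items, key):
    """Return the position of `key` in `sorted_items`, or -1 if absent."""
    low = 0
    high = len(sorted_items) - 1
    while low <= high:
        middle = (low + high) // 2  # examine the "middle" element
        if sorted_items[middle] == key:
            return middle
        if sorted_items[middle] < key:
            low = middle + 1   # discard the lower half
        else:
            high = middle - 1  # discard the upper half
    return -1  # assumption: -1 signals "not found"


names = ["Barry", "Gerri", "Harry", "Kari", "Mary", "Terry"]
print(binary_search(names, "Mary"))
print(binary_search(names, "Larry"))
```

Because each comparison discards half of the remaining data, a list of six names needs at most three comparisons.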
