ADSA: IntroAlgs/1 1 241-423 Advanced Data Structures and Algorithms Objective –introduce algorithm...

Post on 04-Jan-2016


ADSA: IntroAlgs/1 1

241-423 Advanced Data Structures and Algorithms

• Objective: introduce algorithm design using basic searching and sorting, and remind students about T() and Big-Oh running time.

Semester 2, 2013-2014

1. Intro. to Algorithms


Contents

1. Selection Sort
2. An Array Sublist
3. Sequential Search
4. Binary Search
5. System/Memory Efficiency
6. Running Time Analysis
7. Big-Oh Analysis
8. The Timing Class


1. Selection Sort

• Go through an array position by position, starting at index 0.

• At the current position, select the smallest element from the rest of the array.

• Swap it with the value in the current position.


An Example


• Number of passes = size of array - 1
  – e.g. made 4 passes over the 5-element arr[]
  – stopped sorting when finished comparing arr[3] and arr[4]


selectionSort()

public static void selectionSort(int[] arr)
{
  // index of smallest elem in sublist
  int smallIndex;
  int idx;
  int n = arr.length;

  // idx has range 0 to n-2
  for (idx = 0; idx < n-1; idx++) {
    // scan sublist starting at idx
    smallIndex = idx;

    /* j goes through sublist from arr[idx+1] to arr[n-1] */
    for (int j = idx+1; j < n; j++)
      /* if smaller element found, assign smallIndex to that posn */
      if (arr[j] < arr[smallIndex])
        smallIndex = j;

    // swap next smallest elem into arr[idx]
    int temp = arr[idx];
    arr[idx] = arr[smallIndex];
    arr[smallIndex] = temp;
  }
} // end of selectionSort()


Usage Example

// an integer array
int[] arr = {66, 20, 33, 55, 53, 57, 69, 11, 67, 70};

// use selectionSort() to order array
selectionSort(arr);

System.out.print("Sorted: ");
for (int i = 0; i < arr.length; i++)
  System.out.print(arr[i] + " ");

Output:
Sorted: 11 20 33 53 55 57 66 67 69 70
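To make the passes visible, the following is a minimal sketch of the same algorithm that prints the array after every pass and returns the number of passes. The class name SelectionSortTrace, the method name selectionSortTraced() and the 5-element sample data are assumptions made for illustration; they are not from the slides.

```java
import java.util.Arrays;

class SelectionSortTrace {
    // Same steps as selectionSort(), but prints the array after each pass
    // and returns how many passes were made (hypothetical helper).
    static int selectionSortTraced(int[] arr) {
        int n = arr.length;
        int passes = 0;
        for (int idx = 0; idx < n - 1; idx++) {
            // find the smallest element in the sublist [idx, n)
            int smallIndex = idx;
            for (int j = idx + 1; j < n; j++)
                if (arr[j] < arr[smallIndex])
                    smallIndex = j;
            // swap it into the current position
            int temp = arr[idx];
            arr[idx] = arr[smallIndex];
            arr[smallIndex] = temp;
            passes++;
            System.out.println("pass " + passes + ": " + Arrays.toString(arr));
        }
        return passes;
    }

    public static void main(String[] args) {
        int[] arr = {50, 20, 40, 75, 35};   // assumed sample data
        int passes = selectionSortTraced(arr);
        System.out.println("passes = " + passes);   // size of array - 1
    }
}
```

For a 5-element array the loop body runs for idx = 0, 1, 2, 3, which is the "size of array - 1" pass count.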


2. An Array Sublist

• An array sublist is a sequence of elements whose indices begin at index first and go up to, but not including, last.

• Uses the notation: [first, last).
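A small sketch of how a [first, last) sublist is typically traversed in Java. The class SublistDemo and the method sublistSum() are hypothetical names used only for this illustration.

```java
class SublistDemo {
    // Sum the elements of the sublist [first, last):
    // index first is included, index last is excluded.
    static int sublistSum(int[] arr, int first, int last) {
        int sum = 0;
        for (int i = first; i < last; i++)
            sum += arr[i];
        return sum;
    }

    public static void main(String[] args) {
        int[] arr = {10, 20, 30, 40, 50};
        System.out.println(sublistSum(arr, 1, 4));            // arr[1] + arr[2] + arr[3]
        System.out.println(sublistSum(arr, 0, arr.length));   // the whole array
        System.out.println(sublistSum(arr, 2, 2));            // empty sublist
    }
}
```

With this notation the sublist holds last - first elements, the whole array is [0, arr.length), and the sublist is empty when first equals last.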


3. Sequential Search

• Begin with a target value and index range [first, last).

• Go through sublist item by item, looking for target.

• Return the index position of the match or -1 if target is not in sublist.


Examples


seqSearch()

public static int seqSearch(int[] arr, int first, int last, int target)
{
  /* scan first <= i < last; return index for
     position if a match occurs */
  for (int i = first; i < last; i++)
    if (arr[i] == target)
      return i;

  return -1;   // target not found
}
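A short usage sketch for seqSearch(), showing how the index range affects the result. The wrapper class SeqSearchDemo and the sample array are assumptions for illustration; the method body repeats the one above so that the example is self-contained.

```java
class SeqSearchDemo {
    // Copy of seqSearch() from above.
    static int seqSearch(int[] arr, int first, int last, int target) {
        for (int i = first; i < last; i++)
            if (arr[i] == target)
                return i;
        return -1;   // target not found
    }

    public static void main(String[] args) {
        int[] arr = {6, 4, 2, 9, 5, 3, 10, 7};   // assumed sample data
        System.out.println(seqSearch(arr, 0, 8, 3));   // 3 is stored at index 5
        System.out.println(seqSearch(arr, 0, 8, 9));   // 9 is stored at index 3
        System.out.println(seqSearch(arr, 4, 8, 9));   // index 3 lies outside [4,8), so -1
        System.out.println(seqSearch(arr, 0, 8, 8));   // 8 is not in the array, so -1
    }
}
```

The third call returns -1 even though 9 is in the array, because only the sublist [4, 8) is scanned.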


4. Binary Search

• Binary search requires an ordered list, so large sections of the list can be skipped during the search.

• Calculate midpoint of the current sublist [first,last).

• If target matches midpoint value, then the search is finished.


• If target is less than midpoint value, look in the lower sublist; otherwise, look in the upper sublist.

• Continue until target is found or sublist size is 0.


Binary Search Case 1

• target == midpoint value.

• The search is complete
  – mid is the index of the midpoint value


Binary Search Case 2

• target < midValue
  – so search in lower sublist

• Index range becomes [first, mid).

• Set index last to be end of lower sublist (last = mid).


Binary Search Case 3

• target > midValue
  – so search upper sublist

• Index range becomes [mid+1,last), because the upper sublist starts to the right of mid

• Set index first to be front of the upper sublist (first = mid+1).


Binary Search Finish

• The binary search stops when a match is found, or when the sublist is 'empty'.
  – an empty sublist for [first,last) is when first >= last.


A Successful Search: Step 1

target = 23

Step 2

target = 23

The sublist is roughly halved.

Step 3

target = 23

The sublist is roughly halved.

Failure Example Step 1

target = 4

Step 2

target = 4

The sublist is roughly halved.

Step 3

target = 4

The sublist is roughly halved.

Step 4

Index range [2,2). first ≥ last, so search fails. The return value is -1.


binSearch()

public static int binSearch(int[] arr, int first, int last, int target)
{
  int mid;        // index of midpoint
  int midValue;   // value from arr[mid]

  // test for nonempty sublist
  while (first < last) {
    mid = (first+last)/2;
    midValue = arr[mid];

    if (target == midValue)
      return mid;   // have a match
    // determine which sublist to search
    else if (target < midValue)
      last = mid;   // search lower sublist; set last
    else
      first = mid+1;   // search upper sublist; set first
  }

  return -1;   // target not found
} // end of binSearch()
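The following sketch traces the index range at each step of a binary search. Two things in it are assumptions rather than part of the slides: the sample array (chosen so that searching for 23 succeeds in three steps and searching for 4 fails with the range [2,2), consistent with the examples described earlier), and the midpoint formula first + (last - first)/2. For non-negative indices that formula gives the same value as (first+last)/2, but it cannot overflow an int when the indices are very large.

```java
class BinSearchTrace {
    // binSearch() with a printed trace of each index range (hypothetical helper).
    static int binSearchTraced(int[] arr, int first, int last, int target) {
        while (first < last) {
            // same value as (first+last)/2, without the risk of int overflow
            int mid = first + (last - first) / 2;
            System.out.println("[" + first + "," + last + ") mid=" + mid
                               + " midValue=" + arr[mid]);
            if (target == arr[mid])
                return mid;          // have a match
            else if (target < arr[mid])
                last = mid;          // search lower sublist
            else
                first = mid + 1;     // search upper sublist
        }
        System.out.println("[" + first + "," + last + ") is empty");
        return -1;                   // target not found
    }

    public static void main(String[] args) {
        int[] arr = {-7, 3, 5, 8, 12, 16, 23, 33, 55};   // assumed sorted sample data
        System.out.println("result: " + binSearchTraced(arr, 0, arr.length, 23));
        System.out.println("result: " + binSearchTraced(arr, 0, arr.length, 4));
    }
}
```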


5. System/Memory Efficiency

• System efficiency is how fast an algorithm runs on a particular machine.

• Memory efficiency is the amount of memory an algorithm uses
  – if an algorithm uses too much memory, it can be too slow, or may not execute at all, on a particular system.


6. Running Time Analysis

• Machine-independent algorithm efficiency is measured in terms of the number of operations used in the code.

• The complexity of the algorithm usually depends on some size measure
  – usually the size of the input data


min()

public static int min(int[] arr)
// return the smallest elem. in arr[]
{
  int n = arr.length;
  if (n == 0) {
    System.out.println("Array has 0 size");
    return 0;
  }
  else {
    int min = arr[0];
    for (int i = 1; i < n; i++)
      if (arr[i] < min)
        min = arr[i];
    return min;
  }
}

The input data is the array, arr[].


Running Time

• The number of comparison operations, T(n), required to find the smallest element in an n-element array.

T(n) = n-1

T() was explained in 241-303 "Discrete Maths", part 4.

Could count all operations, but T() would still be linear in n.
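One way to check T(n) = n-1 is to instrument min() with a counter. This is a sketch only: the class MinCount and the static field comparisons are hypothetical additions, and the method assumes a non-empty array.

```java
class MinCount {
    static int comparisons;   // hypothetical counter of arr[i] < min tests

    // Smallest element of a non-empty array, counting element comparisons.
    static int min(int[] arr) {
        comparisons = 0;
        int min = arr[0];
        for (int i = 1; i < arr.length; i++) {
            comparisons++;            // one comparison per loop iteration
            if (arr[i] < min)
                min = arr[i];
        }
        return min;
    }

    public static void main(String[] args) {
        int[] arr = {7, 3, 9, 1, 4};   // n = 5
        System.out.println("min = " + min(arr));
        System.out.println("comparisons = " + comparisons);   // n - 1
    }
}
```

The loop runs for i = 1 to n-1, so the count does not depend on the order of the data.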


Running Time: Selection Sort

• Count the number of comparison operations used to sort an array of size n
  – there are n-1 passes altogether
  – in the first pass there are n-1 comparisons
  – in the 2nd pass, n-2 comparisons, ...

T(n) = (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = n^2/2 - n/2
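The formula can be checked empirically by counting comparisons inside selection sort. The sketch below uses hypothetical names (SelectionSortCount, selectionSortComparisons()) and arbitrary test sizes; it compares the measured count with n(n-1)/2.

```java
class SelectionSortCount {
    // Selection sort that returns the number of element comparisons it made.
    static long selectionSortComparisons(int[] arr) {
        long count = 0;
        int n = arr.length;
        for (int idx = 0; idx < n - 1; idx++) {
            int smallIndex = idx;
            for (int j = idx + 1; j < n; j++) {
                count++;                       // one arr[j] < arr[smallIndex] test
                if (arr[j] < arr[smallIndex])
                    smallIndex = j;
            }
            int temp = arr[idx];
            arr[idx] = arr[smallIndex];
            arr[smallIndex] = temp;
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[]{5, 10, 100}) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++)
                arr[i] = n - i;                // reverse order; any order gives the same count
            long measured = selectionSortComparisons(arr);
            long formula = (long) n * (n - 1) / 2;
            System.out.println("n=" + n + " measured=" + measured + " formula=" + formula);
        }
    }
}
```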


Running Time: seqSearch()

• Best case: Find target at index 0.
  T(n) = 1

• Worst case: Find target at index n-1, or do not find it at all.
  T(n) = n

• Average case: Average of the number of comparisons to find a target at any position (assuming the target is in the list and is equally likely to be at each position).
  T(n) = (1+2+3+...+n)/n = n(n+1)/2 * (1/n) = (n+1)/2
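A sketch that measures the average case directly: it searches once for the element at every position of an n-element array of distinct values and averages the comparison counts. The class SeqSearchAverage and both method names are assumptions for illustration.

```java
class SeqSearchAverage {
    // Number of element comparisons a sequential search of [first, last) makes.
    static int comparisons(int[] arr, int first, int last, int target) {
        int count = 0;
        for (int i = first; i < last; i++) {
            count++;
            if (arr[i] == target)
                return count;
        }
        return count;   // target absent: every element was compared
    }

    // Average comparisons when each of the n positions is searched for once.
    static double averageComparisons(int n) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++)
            arr[i] = i;                       // distinct values 0..n-1
        long total = 0;
        for (int target = 0; target < n; target++)
            total += comparisons(arr, 0, n, target);
        return (double) total / n;
    }

    public static void main(String[] args) {
        for (int n : new int[]{9, 10, 1000})
            System.out.println("n=" + n + " average=" + averageComparisons(n)
                               + " (n+1)/2=" + (n + 1) / 2.0);
    }
}
```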


Running Time: binSearch()

• Best case: Target found at first midpoint.
  T(n) = 1

• Worst case: Length of sublists halves at each iteration.
  T(n) = (int) log_2 n + 1

• Average case: A fancy analysis shows:
  T(n) = (int) log_2 n
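The worst-case count can be observed by counting loop iterations for a target that is smaller than every element, which forces the search into the lower sublist each time. This is a sketch with hypothetical names (BinSearchWorstCase, iterations(), floorLog2()); floorLog2() plays the role of "(int) log_2 n" and assumes n >= 1.

```java
class BinSearchWorstCase {
    // Number of loop iterations (midpoint comparisons) made by the binary search.
    static int iterations(int[] arr, int first, int last, int target) {
        int count = 0;
        while (first < last) {
            count++;
            int mid = (first + last) / 2;
            if (target == arr[mid])
                return count;
            else if (target < arr[mid])
                last = mid;
            else
                first = mid + 1;
        }
        return count;
    }

    // floor(log_2 n) for n >= 1
    static int floorLog2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[]{1, 2, 3, 4, 7, 8, 1000, 1024}) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++)
                arr[i] = i;                              // sorted values 0..n-1
            int measured = iterations(arr, 0, n, -1);    // -1 is below every element
            System.out.println("n=" + n + " iterations=" + measured
                               + " floor(log2 n)+1=" + (floorLog2(n) + 1));
        }
    }
}
```

Each failed comparison shrinks a sublist of size n to at most n/2 (rounded down), which is where the logarithm comes from.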


7. Big-Oh Notation

• Big-Oh, O(), is a simpler version of T(n) that only uses the 'biggest' term of the T(n) equation, without constants.
  – e.g. if T(n) = 8n^3 + 5n^2 - 11n + 1, then T(n) is O(n^3).
  – Selection sort is O(n^2).
  – The average case for seqSearch() is O(n).
  – The worst case for binSearch() is O(log_2 n).

O() was explained in 241-303 "Discrete Maths", part 4.
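As a brief worked step for the cubic example, assuming the usual definition of Big-Oh (T(n) is O(f(n)) if there are constants c and n_0 with T(n) <= c f(n) for all n >= n_0), the bound can be justified as follows. The constants c = 14 and n_0 = 1 are one convenient choice made here, not values from the slides.

```latex
\[
T(n) = 8n^3 + 5n^2 - 11n + 1
     \;\le\; 8n^3 + 5n^3 + 0 + n^3
     \;=\; 14\,n^3
     \qquad \text{for all } n \ge 1,
\]
% since 5n^2 <= 5n^3, -11n <= 0 and 1 <= n^3 whenever n >= 1.
\[
\text{so } T(n) \text{ is } O(n^3) \text{ with } c = 14,\; n_0 = 1 .
\]
```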


Common Big-Oh's

• Constant time: T(n) is O(1) when its running time is independent of the n value.
  – e.g. find the smallest value in an ordered n-element array

arr: 1 3 4 6 8 10 15 20 35 55

minimum = arr[0]

• Linear: T(n) is O(n) when running time is proportional to n, the size of the data. If n doubles, T() doubles.
  – e.g. find the smallest value in an unordered n-element array, as in min()

• Quadratic: T(n) is O(n^2). If n doubles, T() increases by a factor of 4.
  – e.g. selection sort

• Cubic: T(n) is O(n^3). Doubling n increases T() by a factor of 8.
  – e.g. multiplication of two n*n matrices
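To illustrate the cubic case, here is a sketch of the straightforward triple-loop multiplication of two n*n matrices, with a counter for scalar multiplications. The class MatMulCount and its counter are hypothetical; faster matrix multiplication algorithms exist and are not shown.

```java
class MatMulCount {
    static long multiplications;   // hypothetical counter of scalar multiplications

    // Straightforward product of two n*n matrices using three nested loops.
    static int[][] multiply(int[][] a, int[][] b) {
        int n = a.length;
        int[][] c = new int[n][n];
        multiplications = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < n; k++) {
                    c[i][j] += a[i][k] * b[k][j];
                    multiplications++;
                }
        return c;
    }

    public static void main(String[] args) {
        for (int n : new int[]{2, 4, 8}) {
            int[][] m = new int[n][n];          // all zeros; only the count matters here
            multiply(m, m);
            System.out.println("n=" + n + " multiplications=" + multiplications);
        }
    }
}
```

Each of the three loops runs n times, so the count is n^3, and doubling n multiplies it by 8.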


• Logarithmic: T() is O(log_2 n) or O(n log_2 n).
  – occurs when the algorithm repeatedly subdivides the data into sublists whose sizes are 1/2, 1/4, 1/8, ... of the original size n
  – e.g. binary search is O(log_2 n)
  – e.g. quicksort is O(n log_2 n) on average

• Exponential: T(n) is O(a^n).
  – These algorithms deal with problems that require searching through a large number of potential solutions before finding an answer.
  – e.g. the traveling salesman problem
    • mentioned in the Discrete Maths subject

T() Graphs


T() Equations

n       log_2 n   n log_2 n   n^2          n^3           2^n
2       1         2           4            8             4
4       2         8           16           64            16
8       3         24          64           512           256
16      4         64          256          4096          65536
32      5         160         1024         32768         4294967296
128     7         896         16384        2097152       3.4 x 10^38
1024    10        10240       1048576      1073741824    1.8 x 10^308
65536   16        1048576     4294967296   2.8 x 10^14   Forget it!
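The rows of the table can be recomputed with a few lines of Java. This is a sketch under stated assumptions: the class GrowthTable and its methods are hypothetical, n is assumed to be a power of two, and the last column reports only the number of decimal digits of 2^n (computed exactly with BigInteger) because the value itself quickly becomes too long to print usefully.

```java
import java.math.BigInteger;

class GrowthTable {
    // log_2 n, assuming n is a power of two
    static int log2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    // One tab-separated row: n, log_2 n, n log_2 n, n^2, n^3, digits of 2^n
    static String row(int n) {
        long size = n;
        long lg = log2(n);
        int digits = BigInteger.ONE.shiftLeft(n).toString().length();   // 2^n, exactly
        return size + "\t" + lg + "\t" + size * lg + "\t" + size * size
               + "\t" + size * size * size + "\t" + digits + " digits";
    }

    public static void main(String[] args) {
        System.out.println("n\tlog2 n\tn log2 n\tn^2\tn^3\t2^n");
        for (int n : new int[]{2, 4, 8, 16, 32, 128, 1024, 65536})
            System.out.println(row(n));
    }
}
```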


8. The Timing Class

class Timing (package ds.time)

Constructor
  Timing()        Create an instance of the class with all timing variables set to 0.

Methods
  void start()    Start the timing.
  double stop()   Stop the timing and return elapsed time in seconds.

A class in the Ford & Topp ds.time package.


Timing Class Example

Timing sortTimer = new Timing();

sortTimer.start();    // start timing
selectionSort(arr);   // sort
double timeInSec = sortTimer.stop();   // get sorting time in secs
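If the Ford & Topp package is not available, a timer with the same start()/stop() interface can be approximated with the standard System.nanoTime() call. The class SimpleTiming below is a hypothetical stand-in written for this note; it is not the source of ds.time.Timing, whose implementation is not shown in the slides.

```java
class SimpleTiming {
    private long startNanos = 0;   // time stamp recorded by start()

    // Start the timing.
    void start() {
        startNanos = System.nanoTime();
    }

    // Stop the timing and return the elapsed time in seconds.
    double stop() {
        return (System.nanoTime() - startNanos) / 1e9;
    }

    public static void main(String[] args) {
        int[] arr = new int[5000];
        for (int i = 0; i < arr.length; i++)
            arr[i] = arr.length - i;          // reverse-ordered sample data

        SimpleTiming sortTimer = new SimpleTiming();
        sortTimer.start();
        java.util.Arrays.sort(arr);           // any code to be timed goes here
        double timeInSec = sortTimer.stop();
        System.out.println("Elapsed: " + timeInSec + " seconds");
    }
}
```

Single measurements like this are noisy on the JVM, so the printed value is only a rough indication.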


Search Time Comparison

• Compare sequential and binary search on the following problem:
  – target list: 50,000 random integers (indices 0 to 49,999)
  – listSeq: 100,000 random integers (indices 0 to 99,999); listBin holds the same values after sorting
  – search for each target inside listSeq (sequential search) and inside listBin (binary search)


SearchTimes

import java.util.Random;
import java.text.DecimalFormat;
import ds.util.Arrays;    // Ford & Topp packages
import ds.time.Timing;

public class SearchTimes
{
  public static void main(String[] args)
  {
    int ARRAY_SIZE = 100000;
    int TARGET_SIZE = 50000;

    // arrays for searches
    int[] listSeq = new int[ARRAY_SIZE],
          listBin = new int[ARRAY_SIZE],
          targetList = new int[TARGET_SIZE];

    // use Timing object t to compute times
    Timing t = new Timing();

    // random number object
    Random rnd = new Random();

    // format real numbers with 3 dps
    DecimalFormat fmt = new DecimalFormat("#.000");

    // initialize arrays with random numbers
    for (int i = 0; i < ARRAY_SIZE; i++)
      listSeq[i] = listBin[i] = rnd.nextInt(1000000);

    // initialize targetList with random numbers
    for (int i = 0; i < TARGET_SIZE; i++)
      targetList[i] = rnd.nextInt(1000000);

    // time seq. search for targets in listSeq
    t.start();
    for (int i = 0; i < TARGET_SIZE; i++)
      Arrays.seqSearch(listSeq, 0, ARRAY_SIZE, targetList[i]);
    double seqTime = t.stop();
    System.out.println("Sequential Search takes " +
                       fmt.format(seqTime) + " seconds.");

    // sort listBin
    Arrays.selectionSort(listBin);

    // time binary search for targets in listBin
    t.start();
    for (int i = 0; i < TARGET_SIZE; i++)
      Arrays.binSearch(listBin, 0, ARRAY_SIZE, targetList[i]);
    double binTime = t.stop();
    System.out.println("Binary Search takes " +
                       fmt.format(binTime) + " seconds.");

    System.out.println("Ratio of sequential to binary search time is " +
                       fmt.format(seqTime/binTime));
  }  // end of main()

} // end of SearchTimes class


Compilation and Execution

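• Compiling and running the program requires the Ford and Topp libraries.

• The binary search time does not include the time taken to sort listBin; for a fair comparison, the sorting time must be added in.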
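A sketch of a comparison that does add in the sorting time, written with the standard library only so that it runs without the Ford & Topp packages. Several things here are substitutions and assumptions, not the slides' method: java.util.Arrays.sort() replaces selectionSort(), java.util.Arrays.binarySearch() replaces binSearch(), System.nanoTime() replaces the Timing class, the array sizes are scaled down so the program finishes quickly, and the class and helper names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

class FairSearchTimes {
    static int seqSearch(int[] arr, int first, int last, int target) {
        for (int i = first; i < last; i++)
            if (arr[i] == target)
                return i;
        return -1;
    }

    // Number of targets found by sequential search.
    static int countSeq(int[] list, int[] targets) {
        int found = 0;
        for (int target : targets)
            if (seqSearch(list, 0, list.length, target) != -1)
                found++;
        return found;
    }

    // Number of targets found by binary search; sorted must be in ascending order.
    static int countBin(int[] sorted, int[] targets) {
        int found = 0;
        for (int target : targets)
            if (Arrays.binarySearch(sorted, target) >= 0)
                found++;
        return found;
    }

    public static void main(String[] args) {
        int arraySize = 20000, targetSize = 10000;   // scaled down from the slides
        Random rnd = new Random(42);                 // fixed seed for repeatable data
        int[] listSeq = rnd.ints(arraySize, 0, 1000000).toArray();
        int[] listBin = listSeq.clone();
        int[] targets = rnd.ints(targetSize, 0, 1000000).toArray();

        long t0 = System.nanoTime();
        int foundSeq = countSeq(listSeq, targets);
        long t1 = System.nanoTime();
        Arrays.sort(listBin);                        // sorting is now inside the timed region
        int foundBin = countBin(listBin, targets);
        long t2 = System.nanoTime();

        System.out.println("Sequential search: " + (t1 - t0) / 1e9 + " seconds");
        System.out.println("Sort + binary search: " + (t2 - t1) / 1e9 + " seconds");
        System.out.println("Both found " + foundSeq + " / " + foundBin + " targets");
    }
}
```

Whether sorting first pays off depends on the sort used and on how many searches follow it, which is why the sort belongs inside the measurement.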