CSCE 3110 Data Structures & Algorithm Analysis

21
CSCE 3110 Data Structures & Algorithm Analysis Rada Mihalcea http://www.cs.unt.edu/~rada/CSCE3110 Algorithm Analysis I Reading: Weiss, chap.2

description

CSCE 3110 Data Structures & Algorithm Analysis. Rada Mihalcea http://www.cs.unt.edu/~rada/CSCE3110 Algorithm Analysis I Reading: Weiss, chap.2. Problem Solving: Main Steps. Problem definition Algorithm design / Algorithm specification Algorithm analysis Implementation Testing - PowerPoint PPT Presentation

Transcript of CSCE 3110 Data Structures & Algorithm Analysis

Page 1: CSCE 3110 Data Structures  & Algorithm Analysis

CSCE 3110Data Structures & Algorithm Analysis

Rada Mihalceahttp://www.cs.unt.edu/~rada/CSCE3110

Algorithm Analysis IReading: Weiss, chap.2

Page 2: CSCE 3110 Data Structures  & Algorithm Analysis

Problem Solving: Main Steps

1. Problem definition2. Algorithm design / Algorithm

specification3. Algorithm analysis4. Implementation5. Testing6. [Maintenance]

Page 3: CSCE 3110 Data Structures  & Algorithm Analysis

1. Problem Definition

What is the task to be accomplished?Calculate the average of the grades for a given studentUnderstand the talks given out by politicians and translate them in Chinese

What are the time / space / speed / performance requirements ?

Page 4: CSCE 3110 Data Structures  & Algorithm Analysis

2. Algorithm Design / Specifications

Algorithm: Finite set of instructions that, if followed, accomplishes a particular task.Describe: in natural language / pseudo-code / diagrams / etc. Criteria to follow:

Input: Zero or more quantities (externally produced)Output: One or more quantities Definiteness: Clarity, precision of each instructionFiniteness: The algorithm has to stop after a finite (may be very large) number of stepsEffectiveness: Each instruction has to be basic enough and feasible

• Understand speech• Translate to Chinese

Page 5: CSCE 3110 Data Structures  & Algorithm Analysis

4,5,6: Implementation, Testing, Maintainance

ImplementationDecide on the programming language to use

• C, C++, Lisp, Java, Perl, Prolog, assembly, etc. , etc.Write clean, well documented code

Test, test, test

Integrate feedback from users, fix bugs, ensure compatibility across different versions Maintenance

Page 6: CSCE 3110 Data Structures  & Algorithm Analysis

3. Algorithm Analysis

Space complexityHow much space is required

Time complexityHow much time does it take to run the algorithm

Often, we deal with estimates!

Page 7: CSCE 3110 Data Structures  & Algorithm Analysis

Space Complexity

Space complexity = The amount of memory required by an algorithm to run to completion

[Core dumps = the most often encountered cause is “memory leaks” – the amount of memory required larger than the memory available on a given system]

Some algorithms may be more efficient if data completely loaded into memory

Need to look also at system limitationsE.g. Classify 2GB of text in various categories [politics, tourism, sport, natural disasters, etc.] – can I afford to load the entire collection?

Page 8: CSCE 3110 Data Structures  & Algorithm Analysis

Space Complexity (cont’d)

1. Fixed part: The size required to store certain data/variables, that is independent of the size of the problem:- e.g. name of the data collection- same size for classifying 2GB or 1MB of texts

2. Variable part: Space needed by variables, whose size is dependent on the size of the problem:- e.g. actual text - load 2GB of text VS. load 1MB of text

Page 9: CSCE 3110 Data Structures  & Algorithm Analysis

Space Complexity (cont’d)

S(P) = c + S(instance characteristics)c = constant

Example:void float sum (float* a, int n) {

float s = 0; for(int i = 0; i<n; i++) { s+ = a[i]; } return s;}Space? one word for n, one for a [passed by

reference!], one for i constant space!

Page 10: CSCE 3110 Data Structures  & Algorithm Analysis

Time Complexity

Often more important than space complexityspace available (for computer programs!) tends to be larger and largertime is still a problem for all of us

3-4GHz processors on the market still … researchers estimate that the computation of various transformations for 1 single DNA chain for one single protein on 1 TerraHZ computer would take about 1 year to run to completion

Algorithms running time is an important issue

Page 11: CSCE 3110 Data Structures  & Algorithm Analysis

Running Time

Problem: prefix averagesGiven an array XCompute the array A such that A[i] is the average of elements X[0] … X[i], for i=0..n-1

Sol 1At each step i, compute the element X[i] by traversing the array A and determining the sum of its elements, respectively the average

Sol 2 At each step i update a sum of the elements in the array ACompute the element X[i] as sum/I

Big question: Which solution to choose?

Page 12: CSCE 3110 Data Structures  & Algorithm Analysis

Running time

Input

1 ms

2 ms

3 ms

4 ms

5 ms

A B C D E F G

worst-case

best-case}average-case?

Suppose the program includes an if-then statement that may execute or not: variable running timeTypically algorithms are measured by their worst case

Page 13: CSCE 3110 Data Structures  & Algorithm Analysis

Experimental Approach

Write a program that implements the algorithmRun the program with data sets of varying size.Determine the actual running time using a system call to measure time (e.g. system (date) );

Problems?

Page 14: CSCE 3110 Data Structures  & Algorithm Analysis

Experimental Approach

It is necessary to implement and test the algorithm in order to determine its running time. Experiments can be done only on a limited set of inputs, and may not be indicative of the running time for other inputs. The same hardware and software should be used in order to compare two algorithms. – condition very hard to achieve!

Page 15: CSCE 3110 Data Structures  & Algorithm Analysis

Use a Theoretical Approach

Based on high-level description of the algorithms, rather than language dependent implementationsMakes possible an evaluation of the algorithms that is independent of the hardware and software environments

Generality

Page 16: CSCE 3110 Data Structures  & Algorithm Analysis

Algorithm DescriptionHow to describe algorithms independent of a programming language Pseudo-Code = a description of an algorithm that is

more structured than usual prose but less formal than a programming language

(Or diagrams)Example: find the maximum element of an array.Algorithm arrayMax(A, n):

Input: An array A storing n integers.Output: The maximum element in A.currentMax A[0]for i 1 to n -1 do

if currentMax < A[i] then currentMax A[i]return currentMax

Page 17: CSCE 3110 Data Structures  & Algorithm Analysis

Pseudo CodeExpressions: use standard mathematical symbols

use for assignment ( ? in C/C++)use = for the equality relationship (? in C/C++)

Method Declarations: -Algorithm name(param1, param2) Programming Constructs:

decision structures: if ... then ... [else ..]while-loops while ... do repeat-loops: repeat ... until ... for-loop: for ... do array indexing: A[i]

Methodscalls: object method(args)returns: return value

Use commentsInstructions have to be basic enough and feasible!

Page 18: CSCE 3110 Data Structures  & Algorithm Analysis

Low Level Algorithm Analysis

Based on primitive operations (low-level computations independent from the programming language)E.g.:

Make an addition = 1 operationCalling a method or returning from a method = 1 operationIndex in an array = 1 operationComparison = 1 operation etc.

Method: Inspect the pseudo-code and count the number of primitive operations executed by the algorithm

Page 19: CSCE 3110 Data Structures  & Algorithm Analysis

Example

Algorithm arrayMax(A, n):Input: An array A storing n

integers.Output: The maximum element in

A.currentMax A[0]for i 1 to n -1 doif currentMax < A[i] then

currentMax A[i]return currentMax

How many operations ?

Page 20: CSCE 3110 Data Structures  & Algorithm Analysis

Asymptotic Notation

Need to abstract furtherGive an “idea” of how the algorithm performsn steps vs. n+5 stepsn steps vs. n2 steps

Page 21: CSCE 3110 Data Structures  & Algorithm Analysis

Problem

Fibonacci numbersF[0] = 0F[1] = 1F[i] = F[i-1] + F[i-2] for i 2

Pseudo-codeNumber of operations