CS235102 Data Structures


Page 1: CS235102 Data Structures

CS235102 Data Structures
Chapter 1 Basic Concepts

Page 2: CS235102 Data Structures

Chapter 1 Basic Concepts

Overview:
  System Life Cycle
  Algorithm Specification
  Data Abstraction
  Performance Analysis
  Performance Measurement

Page 3: CS235102 Data Structures

System life cycle

Good programmers regard large-scale computer programs as systems that contain many complex interacting parts.

As systems, these programs undergo a development process called the system life cycle.

Page 4: CS235102 Data Structures

System life cycle

We consider this cycle as consisting of five phases.
  Requirements: inputs and outputs
  Analysis: bottom-up vs. top-down
  Design: data objects and operations
  Refinement and Coding: representations of data objects and algorithms for operations
  Verification
    Program proving
    Testing
    Debugging

Page 5: CS235102 Data Structures

Algorithm Specification

1.2.1 Introduction

An algorithm is a finite set of instructions that accomplishes a particular task.

Criteria
  input: zero or more quantities that are externally supplied
  output: at least one quantity is produced
  definiteness: each instruction is clear and unambiguous
  finiteness: the algorithm terminates after a finite number of steps
  effectiveness: each instruction is basic enough to be carried out

A program does not have to satisfy the finiteness criterion.

Page 6: CS235102 Data Structures

Algorithm Specification

Representation
  A natural language, like English or Chinese.
  A graphic, like flowcharts.
  A computer language, like C.

Algorithms + Data structures = Programs [Niklaus Wirth]

Sequential search vs. Binary search
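To make the contrast concrete, here is a minimal sketch of sequential search. The name seqsearch and its parameter order are illustrative choices, not taken from the slides. It may examine all n elements, whereas binary search on a sorted list halves the remaining range at every step.

```c
/* Sequential search: scan list[0..n-1] from the front.
   Returns the index of searchnum, or -1 if it is not present.
   Works on unsorted lists, but may inspect every element. */
int seqsearch(const int list[], int n, int searchnum)
{
    for (int i = 0; i < n; i++)
        if (list[i] == searchnum)
            return i;
    return -1;
}
```

Unlike binary search, this version does not require the list to be sorted.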

Page 7: CS235102 Data Structures

Algorithm Specification

Example 1.1 [Selection sort]: From those integers that are currently unsorted, find the smallest and place it next in the sorted list.

i    [0]  [1]  [2]  [3]  [4]
-    30   10   50   40   20
0    10   30   50   40   20
1    10   20   50   40   30
2    10   20   30   40   50
3    10   20   30   40   50
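The rule in Example 1.1 can be sketched in C as follows. This is an illustrative version, not necessarily the textbook's Program 1.3; the function name selection_sort is an assumption.

```c
/* Selection sort: for each position i, find the smallest element in
   list[i..n-1] and swap it into position i. */
void selection_sort(int list[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min = i;                     /* index of smallest value seen so far */
        for (int j = i + 1; j < n; j++)
            if (list[j] < list[min])
                min = j;
        int temp = list[i];              /* place the smallest next in the sorted part */
        list[i] = list[min];
        list[min] = temp;
    }
}
```

Calling selection_sort on the list 30 10 50 40 20 performs one swap per value of i, which corresponds to one row of the trace in Example 1.1.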

Page 8: CS235102 Data Structures

Program 1.3 contains a complete program which you may run on your computer.

Page 9: CS235102 Data Structures

Algorithm Specification

Example 1.2 [Binary search]:

[0]  [1]  [2]  [3]  [4]  [5]  [6]
 8   14   26   30   43   50   52

left  right  middle  list[middle] : searchnum
0     6      3       30 < 43
4     6      5       50 > 43
4     4      4       43 == 43

0     6      3       30 > 18
0     2      1       14 < 18
2     2      2       26 > 18
2     1      -

Searching a sorted list

while (there are more integers to check) {
    middle = (left + right) / 2;
    if (searchnum < list[middle])
        right = middle - 1;
    else if (searchnum == list[middle])
        return middle;
    else
        left = middle + 1;
}

Page 10: CS235102 Data Structures

/* COMPARE is not defined on the slide; this definition is an assumption:
   it yields -1, 0, or 1 when x is less than, equal to, or greater than y. */
#define COMPARE(x, y) (((x) < (y)) ? -1 : ((x) == (y)) ? 0 : 1)

int binsearch(int list[], int searchnum, int left, int right)
{
    /* search list[0] <= list[1] <= … <= list[n-1] for searchnum.
       Return its position if found. Otherwise return -1 */
    int middle;
    while (left <= right) {
        middle = (left + right) / 2;
        switch (COMPARE(list[middle], searchnum)) {
        case -1: left = middle + 1;
                 break;
        case 0:  return middle;
        case 1:  right = middle - 1;
        }
    }
    return -1;
}

*Program 1.6: Searching an ordered list

Page 11: CS235102 Data Structures

Algorithm Specification

1.2.2 Recursive algorithms

Beginning programmers view a function as something that is invoked (called) by another function.
  It executes its code and then returns control to the calling function.

Page 12: CS235102 Data Structures

Algorithm Specification

This perspective ignores the fact that functions can call themselves (direct recursion).

They may call other functions that invoke the calling function again (indirect recursion).
  Recursive mechanisms are extremely powerful.
  They frequently allow us to express an otherwise complex process in very clear terms.

We should express an algorithm recursively when the problem itself is defined recursively.

Page 13: CS235102 Data Structures

Algorithm Specification

Example 1.3 [Binary search]:
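The transcript does not preserve the code for this example. A recursive formulation of binary search might look like the sketch below; the name rbinsearch is an assumption, and the midpoint is computed as left + (right - left) / 2 to avoid integer overflow on very large indices.

```c
/* Recursive binary search on list[left..right], sorted in nondecreasing order.
   Returns the position of searchnum, or -1 if it is absent. */
int rbinsearch(const int list[], int searchnum, int left, int right)
{
    if (left > right)
        return -1;                                   /* empty range: not found */
    int middle = left + (right - left) / 2;
    if (list[middle] < searchnum)
        return rbinsearch(list, searchnum, middle + 1, right);
    if (list[middle] > searchnum)
        return rbinsearch(list, searchnum, left, middle - 1);
    return middle;
}
```

Each call either returns at once or recurses on a range about half as large, which mirrors the while loop of the iterative version.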

Page 14: CS235102 Data Structures

Example 1.4 [Permutations]:

First, we permute the string char *string = "abc"; by calling perm(string, 0, 2); where 0 is the start index and 2 is the end index.

main

Perm ( string , 0 , 2 )

I=0 J=0 N=2

SWAP ( list[0],list[0], temp)SWAP ‘a’ ‘a’

Call : perm ( list,1, 2)

Perm ( string , 1 , 2 )

Call Stack:

I=1 J=1 N=2

SWAP ( list[1],list[1], temp)SWAP ‘b’ ‘b’

Call : perm ( list,2, 2)

Perm ( string , 2 , 2 )

Print The String“abc”

I=1 J=1 N=2

SWAP ( list[1],list[1], temp)SWAP ‘b’ ‘b’

I=1 J=2 N=2

SWAP ( list[1],list[2], temp)SWAP ‘b’ ‘c’

Call : perm ( list,2, 2)

Print The String“acb”

I=1 J=2 N=2

SWAP ( list[1],list[2], temp)SWAP ‘b’ ‘c’

I=0 J=0 N=2

SWAP ( list[0],list[0], temp)SWAP ‘a’ ‘a’

I=0 J=1 N=2

SWAP ( list[0],list[1], temp)SWAP ‘a’ ‘b’

Call : perm ( list,1, 2)

I=1 J=1 N=2

SWAP ( list[1],list[1], temp)SWAP ‘a’ ‘a’

Call : perm ( list,2, 2)

Print The String“bac”

I=1 J=1 N=2

SWAP ( list[1],list[1], temp)SWAP ‘a’ ‘a’

I=1 J=2 N=2

SWAP ( list[1],list[2], temp)SWAP ‘a’ ‘c’

Call : perm ( list,2, 2)

Print The String“bca”

I=1 J=2 N=2

SWAP ( list[1],list[2], temp)SWAP ‘a’ ‘c’

I=1 J=3 N=2

Page 15: CS235102 Data Structures

Example 1.4 [Permutations]:

lv0 perm: i=0, n=2 abc
lv0 SWAP: i=0, j=0 abc
lv1 perm: i=1, n=2 abc
lv1 SWAP: i=1, j=1 abc
lv2 perm: i=2, n=2 abc
print: abc
lv1 SWAP: i=1, j=1 abc
lv1 SWAP: i=1, j=2 abc
lv2 perm: i=2, n=2 acb
print: acb
lv1 SWAP: i=1, j=2 acb
lv0 SWAP: i=0, j=0 abc
lv0 SWAP: i=0, j=1 abc
lv1 perm: i=1, n=2 bac
lv1 SWAP: i=1, j=1 bac
lv2 perm: i=2, n=2 bac
print: bac
lv1 SWAP: i=1, j=1 bac
lv1 SWAP: i=1, j=2 bac
lv2 perm: i=2, n=2 bca
print: bca
lv1 SWAP: i=1, j=2 bca
lv0 SWAP: i=0, j=1 bac
lv0 SWAP: i=0, j=2 abc
lv1 perm: i=1, n=2 cba
lv1 SWAP: i=1, j=1 cba
lv2 perm: i=2, n=2 cba
print: cba
lv1 SWAP: i=1, j=1 cba
lv1 SWAP: i=1, j=2 cba
lv2 perm: i=2, n=2 cab
print: cab
lv1 SWAP: i=1, j=2 cab
lv0 SWAP: i=0, j=2 cba
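A function that produces this kind of trace can be sketched as below. The transcript shows only the calls perm(list, i, n) and SWAP(x, y, temp), so the body is a reconstruction under assumptions. The extra out parameter is an assumption too: it collects the permutations in a buffer instead of printing them, so the result can be inspected.

```c
#include <string.h>

/* Exchange x and y using t as scratch space. */
#define SWAP(x, y, t) ((t) = (x), (x) = (y), (y) = (t))

/* Append every permutation of list[i..n] to out, each followed by a space.
   The caller must make out large enough and start it as an empty string. */
void perm(char *list, int i, int n, char *out)
{
    char temp;
    if (i == n) {
        strncat(out, list, (size_t)n + 1);   /* one finished permutation */
        strcat(out, " ");
    } else {
        for (int j = i; j <= n; j++) {
            SWAP(list[i], list[j], temp);    /* choose list[j] for position i */
            perm(list, i + 1, n, out);       /* permute the remaining positions */
            SWAP(list[i], list[j], temp);    /* undo the choice */
        }
    }
}
```

For the string "abc" with i = 0 and n = 2, the buffer ends up holding abc acb bac bca cba cab, the same order as the print lines in the trace. The second SWAP restores the string before the next choice is tried.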

Page 16: CS235102 Data Structures

Data Abstraction

Data Type

A data type is a collection of objects and a set of operations that act on those objects.
  For example, the data type int consists of the objects {0, +1, -1, +2, -2, …, INT_MAX, INT_MIN} and the operations +, -, *, /, and %.

The data types of C
  The basic data types: char, int, float and double
  The group data types: array and struct
  The pointer data type
  The user-defined types

Page 17: CS235102 Data Structures

Data Abstraction

Abstract Data Type

An abstract data type (ADT) is a data type that is organized in such a way that the specification of the objects and the operations on the objects is separated from the representation of the objects and the implementation of the operations.

We know what it does, but not necessarily how it will do it.

Page 18: CS235102 Data Structures

Data Abstraction

Specification vs. Implementation

An ADT is implementation independent.
  Operation specification
    function name
    the types of arguments
    the type of the results

The functions of a data type can be classified into several categories:
  creators / constructors
  transformers
  observers / reporters

Page 19: CS235102 Data Structures

Data Abstraction

Example 1.5 [Abstract data type Natural_Number]

::= is defined as
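The body of the Natural_Number specification did not survive in this transcript. As a rough illustration of how such an ADT separates specification from representation, the sketch below assumes a set of operations (zero, is-zero, equal, successor, add, subtract) that saturate at INT_MAX and at 0. The operation names and the saturating behavior are assumptions, not quotations from the slide.

```c
#include <limits.h>
#include <stdbool.h>

/* Representation: hidden behind the operations below (assumed to be an int). */
typedef int Nat_No;

/* Creator */
Nat_No nat_zero(void) { return 0; }

/* Observers */
bool nat_is_zero(Nat_No x) { return x == 0; }
bool nat_equal(Nat_No x, Nat_No y) { return x == y; }

/* Transformers: results stay within 0..INT_MAX */
Nat_No nat_successor(Nat_No x) { return x == INT_MAX ? x : x + 1; }
Nat_No nat_add(Nat_No x, Nat_No y) { return x > INT_MAX - y ? INT_MAX : x + y; }
Nat_No nat_subtract(Nat_No x, Nat_No y) { return x < y ? 0 : x - y; }
```

Client code that uses only these functions keeps working if Nat_No is later changed to another representation, which is the point of separating specification from implementation.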

Page 20: CS235102 Data Structures

Performance Analysis

Criteria
  Is it correct?
  Is it readable?
  …

Performance Analysis (machine independent)
  space complexity: storage requirement
  time complexity: computing time

Performance Measurement (machine dependent)

Page 21: CS235102 Data Structures

Performance Analysis

1.4.1 Space Complexity: S(P) = C + S_P(I)

Fixed Space Requirements (C)
  Independent of the characteristics of the inputs and outputs
    instruction space
    space for simple variables, fixed-size structured variables, constants

Variable Space Requirements (S_P(I))
  Depend on the instance characteristic I
    number, size, values of inputs and outputs associated with I
    recursive stack space, formal parameters, local variables, return address

Page 22: CS235102 Data Structures

Performance Analysis

Examples:

Example 1.6: In program 1.9, S_abc(I) = 0.

Example 1.7: In program 1.10, S_sum(I) = S_sum(n) = 2.

Recall: in C, an array argument is passed as the address of its first element, and arguments are passed by value.

Page 23: CS235102 Data Structures

Performance Analysis

Example 1.8: Program 1.11 is a recursive function for addition. Figure 1.1 shows the number of bytes required for one recursive call.

S_sum(I) = S_sum(n) = 6n
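The linear growth comes from the depth of the recursion: summing n numbers recursively nests n recursive calls, and each call needs its own parameters and return address. The sketch below makes that depth observable. The level and depth parameters are instrumentation added for illustration and are not part of Program 1.11.

```c
/* Recursive sum of list[0..n-1].
   level is the nesting level of this call (0 for the initial call);
   *depth records the deepest level reached. */
float rsum_depth(const float list[], int n, int level, int *depth)
{
    if (level > *depth)
        *depth = level;
    if (n)
        return rsum_depth(list, n - 1, level + 1, depth) + list[n - 1];
    return 0;
}
```

With n elements the recorded depth is n, so if one recursive call costs 6 bytes, as the 6n figure implies, the variable space is 6n bytes.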

Page 24: CS235102 Data Structures

Performance Analysis

1.4.2 Time Complexity: T(P) = C + T_P(I)

The time, T(P), taken by a program, P, is the sum of its compile time C and its run (or execution) time, T_P(I).

Fixed time requirements
  Compile time (C), independent of instance characteristics

Variable time requirements
  Run (execution) time T_P

T_P(n) = c_a ADD(n) + c_s SUB(n) + c_l LDA(n) + c_st STA(n)

Page 25: CS235102 Data Structures

Performance Analysis

A program step is a syntactically or semantically meaningful program segment whose execution time is independent of the instance characteristics.

Example (both statements are regarded as one step each, independent of the machine):
  abc = a + b + b * c + (a + b - c) / (a + b) + 4.0
  abc = a + b + c

Methods to compute the step count
  Introduce a variable count into programs
  Tabular method
    Determine the total number of steps contributed by each statement: steps per execution × frequency
    Add up the contributions of all statements

Page 26: CS235102 Data Structures

Performance Analysis

Iterative summing of a list of numbers
*Program 1.12: Program 1.10 with count statements (p.23)

int count = 0;   /* global step counter */

float sum(float list[], int n)
{
    float tempsum = 0;
    int i;
    count++;                 /* for assignment */
    for (i = 0; i < n; i++) {
        count++;             /* for the for loop */
        tempsum += list[i];
        count++;             /* for assignment */
    }
    count++;                 /* last execution of for */
    count++;                 /* for return; counted before returning so it is reached */
    return tempsum;
}

2n + 3 steps

Page 27: CS235102 Data Structures

Performance Analysis

Tabular Method
*Figure 1.2: Step count table for Program 1.10 (p.26)
Iterative function to sum a list of numbers (s/e = steps/execution)

Statement                          s/e   Frequency   Total steps
float sum(float list[ ], int n)    0     0           0
{                                  0     0           0
  float tempsum = 0;               1     1           1
  int i;                           0     0           0
  for (i = 0; i < n; i++)          1     n+1         n+1
    tempsum += list[i];            1     n           n
  return tempsum;                  1     1           1
}                                  0     0           0
Total                                                2n+3

Page 28: CS235102 Data Structures

Performance Analysis

Recursive summing of a list of numbers
*Program 1.14: Program 1.11 with count statements added (p.24)

int count = 0;   /* global step counter */

float rsum(float list[], int n)
{
    count++;         /* for if conditional */
    if (n) {
        count++;     /* for return and rsum invocation */
        return rsum(list, n - 1) + list[n - 1];
    }
    count++;         /* for the final return */
    return 0;        /* empty list sums to 0; returning list[0] would add it twice */
}

2n+2 steps

Page 29: CS235102 Data Structures

Performance Analysis

*Figure 1.3: Step count table for recursive summing function (p.27)

Statement                              s/e   Frequency   Total steps
float rsum(float list[ ], int n)       0     0           0
{                                      0     0           0
  if (n)                               1     n+1         n+1
    return rsum(list, n-1)+list[n-1];  1     n           n
  return 0;                            1     1           1
}                                      0     0           0
Total                                                    2n+2

Page 30: CS235102 Data Structures

Performance Analysis

1.4.3 Asymptotic notation (O, Ω, Θ)

Complexity of c_1 n^2 + c_2 n and c_3 n
  For sufficiently large values of n, c_3 n is faster than c_1 n^2 + c_2 n.
  For small values of n, either could be faster.
    c_1 = 1, c_2 = 2, c_3 = 100 --> c_1 n^2 + c_2 n <= c_3 n for n <= 98
    c_1 = 1, c_2 = 2, c_3 = 1000 --> c_1 n^2 + c_2 n <= c_3 n for n <= 998
  Break-even point: no matter what the values of c_1, c_2, and c_3 are, there is an n beyond which c_3 n is always faster than c_1 n^2 + c_2 n.
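The two break-even values can be reproduced with a short search. The function below is an illustrative sketch; the name break_even is an assumption, and it presumes c1 > 0 so that the quadratic eventually overtakes the linear term.

```c
/* Largest n >= 1 for which c1*n*n + c2*n <= c3*n, or 0 if no such n exists.
   Assumes c1 > 0, so the loop terminates. */
long break_even(long c1, long c2, long c3)
{
    long n = 0;
    /* advance while the quadratic program is still no slower at n + 1 */
    while (c1 * (n + 1) * (n + 1) + c2 * (n + 1) <= c3 * (n + 1))
        n++;
    return n;
}
```

With c1 = 1 and c2 = 2, the result is 98 for c3 = 100 and 998 for c3 = 1000; beyond those values the linear program is always faster.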

Page 31: CS235102 Data Structures

Performance Analysis

Definition: [Big "oh"] f(n) = O(g(n)) iff there exist positive constants c and n_0 such that f(n) <= c g(n) for all n, n >= n_0.

Examples
  f(n) = 3n + 2
    3n + 2 <= 4n for all n >= 2, so 3n + 2 = O(n)
  f(n) = 10n^2 + 4n + 2
    10n^2 + 4n + 2 <= 11n^2 for all n >= 5, so 10n^2 + 4n + 2 = O(n^2)

Page 32: CS235102 Data Structures

Performance Analysis

Definition: [Omega] f(n) = Ω(g(n)) (read as "f of n is omega of g of n") iff there exist positive constants c and n_0 such that f(n) >= c g(n) for all n, n >= n_0.

Examples
  f(n) = 3n + 2
    3n + 2 >= 3n for all n >= 1, so 3n + 2 = Ω(n)
  f(n) = 10n^2 + 4n + 2
    10n^2 + 4n + 2 >= n^2 for all n >= 1, so 10n^2 + 4n + 2 = Ω(n^2)

Page 33: CS235102 Data Structures

Performance Analysis

Definition: [Theta] f(n) = Θ(g(n)) (read as "f of n is theta of g of n") iff there exist positive constants c_1, c_2, and n_0 such that c_1 g(n) <= f(n) <= c_2 g(n) for all n, n >= n_0.

Examples
  f(n) = 3n + 2
    3n <= 3n + 2 <= 4n for all n >= 2, so 3n + 2 = Θ(n)
  f(n) = 10n^2 + 4n + 2
    n^2 <= 10n^2 + 4n + 2 <= 11n^2 for all n >= 5, so 10n^2 + 4n + 2 = Θ(n^2)
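The constants in these examples can be spot-checked numerically. The sketch below tests the two-sided Theta inequality over a finite range; this is a sanity check and not a proof. The helper names (theta_bounds_hold, f_lin, g_lin, f_quad, g_quad) are assumptions introduced for illustration.

```c
#include <stdbool.h>

/* The example functions: f(n) = 3n + 2 against g(n) = n,
   and f(n) = 10n^2 + 4n + 2 against g(n) = n^2. */
long f_lin(long n)  { return 3 * n + 2; }
long g_lin(long n)  { return n; }
long f_quad(long n) { return 10 * n * n + 4 * n + 2; }
long g_quad(long n) { return n * n; }

/* True if c1*g(n) <= f(n) <= c2*g(n) for every n in [n0, limit]. */
bool theta_bounds_hold(long (*f)(long), long (*g)(long),
                       long c1, long c2, long n0, long limit)
{
    for (long n = n0; n <= limit; n++)
        if (c1 * g(n) > f(n) || f(n) > c2 * g(n))
            return false;
    return true;
}
```

The quadratic bounds hold from n_0 = 5 but fail at n = 4, where 10n^2 + 4n + 2 = 178 exceeds 11n^2 = 176, which is why the examples state n >= 5.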

Page 34: CS235102 Data Structures

Performance Analysis

Theorem 1.2: If f(n) = a_m n^m + … + a_1 n + a_0, then f(n) = O(n^m).

Theorem 1.3: If f(n) = a_m n^m + … + a_1 n + a_0 and a_m > 0, then f(n) = Ω(n^m).

Theorem 1.4: If f(n) = a_m n^m + … + a_1 n + a_0 and a_m > 0, then f(n) = Θ(n^m).

Page 35: CS235102 Data Structures

Performance Analysis

*Figure 1.3: Step count table for recursive summing function (p.27)

Statement                              s/e   Frequency   Total steps
float rsum(float list[ ], int n)       0     0           0
{                                      0     0           0
  if (n)                               1     n+1         n+1
    return rsum(list, n-1)+list[n-1];  1     n           n
  return 0;                            1     1           1
}                                      0     0           0
Total                                                    2n+2 = O(n)

Page 36: CS235102 Data Structures

Performance Analysis

1.4.4 Practical complexity

To get a feel for how the various functions grow with n, you are advised to study Figures 1.7 and 1.8 very closely.

Page 37: CS235102 Data Structures

Performance Analysis

Page 38: CS235102 Data Structures

Performance Analysis

Figure 1.9 gives the time needed by a computer that executes 1 billion instructions per second to run a program of complexity f(n) instructions.
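The figure itself is not preserved in this transcript, but its entries follow from one division. The helper below is an illustrative sketch; the name seconds_needed is an assumption.

```c
/* Seconds needed to execute the given number of instructions
   on a machine that performs 1e9 instructions per second. */
double seconds_needed(double instructions)
{
    return instructions / 1e9;
}
```

For n = 1000, a program of complexity n^2 executes 10^6 instructions and needs about 0.001 seconds. A program of complexity 2^n with n = 40 executes about 1.1 * 10^12 instructions and needs roughly 1100 seconds.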

Page 39: CS235102 Data Structures

Performance Measurement

Although performance analysis gives us a powerful tool for assessing an algorithm's space and time complexity, at some point we also must consider how the algorithm executes on our machine.
  This consideration moves us from the realm of analysis to that of measurement.

Page 40: CS235102 Data Structures

Performance Measurement

Example 1.22 [Worst case performance of the selection function]:
  The tests were conducted on an IBM compatible PC with an 80386 CPU, an 80387 numeric coprocessor, and a turbo accelerator. We use Borland's Turbo C compiler.
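A measurement of this kind can be sketched with the standard clock() function from time.h. The code below is not the original test program. The reverse-ordered input, the MAX_SIZE cap, and the function names are assumptions made for illustration.

```c
#include <time.h>

/* Selection sort, repeated here so the block is self-contained. */
static void sort(int list[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)
            if (list[j] < list[min])
                min = j;
        int temp = list[i];
        list[i] = list[min];
        list[min] = temp;
    }
}

/* Sort n integers given in descending order and return the CPU time
   used by the sort, in seconds. n is capped at MAX_SIZE. */
double time_sort(int n)
{
    enum { MAX_SIZE = 1000 };
    int list[MAX_SIZE];
    if (n > MAX_SIZE)
        n = MAX_SIZE;
    for (int i = 0; i < n; i++)
        list[i] = n - i;                 /* assumed test data: reverse order */
    clock_t start = clock();
    sort(list, n);
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

On a modern machine a single sort of this size may finish within one clock tick and report 0 seconds. Repeating the sort many times between the two clock() calls and dividing by the repetition count gives a more usable figure.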