COMP211slides_2


E.G.M. Petrakis Algorithm Analysis 1

Algorithm Analysis

Algorithms that are equally correct can vary in their utilization of computational resources

time and memory
a slow program is likely not to be used
a program that demands too much memory may not be executable on the machines that are available

E.G.M. Petrakis Algorithm Analysis 2

Memory - Time

It is important to be able to predict how an algorithm behaves under a range of possible conditions

usually different inputs
The emphasis is on time rather than on space efficiency

memory is cheap!

E.G.M. Petrakis Algorithm Analysis 3

Running Time

The time to solve a problem increases with the size of the input

measure the rate at which the time increases e.g., linearly, quadratically, exponentially

This measure focuses on intrinsic characteristics of the algorithm rather than incidental factors

e.g., computer speed, coding tricks, optimizations, quality of compiler

E.G.M. Petrakis Algorithm Analysis 4

Additional Considerations

Sometimes, simple but less efficient algorithms are preferred over more efficient ones

the more efficient algorithms are not always easy to implement
the program is to be changed soon
time may not be critical
the program will run only a few times (e.g., once a year)
the size of the input is always very small

E.G.M. Petrakis Algorithm Analysis 5

Bubble Sort

Count the number of steps executed by the algorithm

step: a basic operation; ignore operations that are executed a constant number of times

for (i = 1; i <= n - 1; i++)
    for (j = n; j >= i + 1; j--)
        if (A[j - 1] > A[j])
            swap(A[j], A[j - 1]);   // step

T(n) = c·n²
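A runnable C sketch of this count (0-based indexing; the step counter, the example array, and the main function are added here for illustration and are not part of the original slide):

#include <stdio.h>

/* Bubble sort that counts comparisons as "steps" (illustrative sketch). */
void bubble_sort(int A[], int n, long *steps) {
    for (int i = 0; i < n - 1; i++)
        for (int j = n - 1; j > i; j--) {
            (*steps)++;                      /* one basic operation (comparison) */
            if (A[j - 1] > A[j]) {           /* out of order: swap */
                int tmp = A[j - 1];
                A[j - 1] = A[j];
                A[j] = tmp;
            }
        }
}

int main(void) {
    int A[] = {5, 2, 9, 1, 7};
    long steps = 0;
    bubble_sort(A, 5, &steps);
    printf("steps = %ld\n", steps);          /* exactly n(n-1)/2 = 10 comparisons, i.e., O(n²) */
    return 0;
}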

E.G.M. Petrakis Algorithm Analysis 6

Asymptotic Algorithm Analysis

Measures the efficiency of a program as the size of the input becomes large

focuses on the “growth rate”: the rate at which the running time of an algorithm grows as the size of the input becomes “big”
Complexity => growth rate
focuses on the upper and lower bounds of the growth rate
ignores all constant factors

E.G.M. Petrakis Algorithm Analysis 7

Growth Rate

Depending on the order of n, an algorithm can be of:

Linear complexity: T(n) = c·n
Quadratic complexity: T(n) = c·n²
Polynomial complexity: T(n) = c·nᵏ

The difference between an algorithm whose running time is T(n) = 10n and another whose running time is T(n) = 2n² is tremendous

for n > 5 the linear algorithm is much faster despite the fact that it has a larger constant
it is slower only for small n but, for such n, both algorithms are very fast
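For example: at n = 5 both algorithms take 10·5 = 50 and 2·5² = 50 steps; at n = 6 the linear one needs 60 steps versus 72 for the quadratic; at n = 100 the gap is already 1,000 versus 20,000 steps.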

E.G.M. Petrakis Algorithm Analysis 8

E.G.M. Petrakis Algorithm Analysis 9

Upper – Lower Bounds

Several terms are used to describe the running time of an algorithm:

The upper bound to its growth rate, which is expressed by the Big-Oh notation
f(n) is an upper bound of T(n): T(n) is in O(f(n))

The lower bound to its growth rate, which is expressed by the Omega notation
g(n) is a lower bound of T(n): T(n) is in Ω(g(n))

If the upper and lower bounds are the same, T(n) is in Θ(g(n))

E.G.M. Petrakis Algorithm Analysis 10

Big O (upper bound)

Definition: T(n) ∈ O(g(n)) if there exist constants c ≥ 1 and n₀ ≥ 0 such that T(n) ≤ c·g(n) for all n ≥ n₀

Examples:
T(n) = n³ + 10n² ⇒ T(n) ∈ O(n³)
T(n) = 6n² – 2n + 7 ⇒ T(n) ∈ O(n²)
T(n) = n³ ⇒ T(n) ∈ O(n³)
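As a worked check of the definition (the choice of constants here is ours, for illustration): for T(n) = 6n² – 2n + 7, take c = 13 and n₀ = 1; then for every n ≥ 1, T(n) ≤ 6n² + 7 ≤ 6n² + 7n² = 13n², so T(n) ∈ O(n²).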

E.G.M. Petrakis Algorithm Analysis 11

Properties of O

O is transitive: if f ∈ O(g) and g ∈ O(h) then f ∈ O(h)

If f ∈ O(g) then f + g ∈ O(g)

If f ∈ O(f′) and g ∈ O(g′) then f·g ∈ O(f′·g′)

If f ∈ O(f′) and g ∈ O(g′) then f + g ∈ O(f′ + g′)

E.G.M. Petrakis Algorithm Analysis 12

More on Upper Bounds

T(n) ∈ O(g(n)): g(n) is an upper bound for T(n)
There exist many upper bounds
e.g., T(n) = n² ∈ O(n²), O(n³), … O(n¹⁰⁰), …
Find the smallest upper bound!!

The O notation “hides” the constants
e.g., c·n², c²·n², c!·n², …, cᵏ·n² ∈ O(n²)

E.G.M. Petrakis Algorithm Analysis 13

O in Practice

Among different algorithms solving the same problem, prefer the algorithm with the smaller “rate of growth” O
it will be faster for large n

Ο(1) < Ο(log n) < Ο(n) < Ο(n log n) < Ο(n²) < Ο(n³) < … < Ο(2ⁿ) < O(nⁿ)
Ο(log n): logarithmic
Ο(n), Ο(n log n), Ο(n²), Ο(n³), …, Ο(nᵏ): polynomial
O(2ⁿ), O(n!), O(nⁿ): exponential!!

E.G.M. Petrakis Algorithm Analysis 14

Table: actual running times of algorithms with complexities 1000n, 1000·n·log n, 100n², a cubic function, n^(log n), 2^(n/3), 2ⁿ, and nⁿ for input sizes n from 20 up to 1000, assuming 10⁻⁶ sec per basic operation. The linear and n·log n algorithms finish within seconds even for n = 1000, the quadratic and cubic ones take minutes to hours, while n^(log n) and the exponential ones quickly require days, years, and even centuries.

E.G.M. Petrakis Algorithm Analysis 15

Omega Ω (lower bound)

Definition: f(n) ∈ Ω(g(n)) if there exist constants c ≥ 1 and n₀ ≥ 0 such that f(n) ≥ c·g(n) for all n ≥ n₀

g(n) is a lower bound of the rate of growth of f(n): f(n) increases at least as fast as g(n) as n increases

e.g.: T(n) = 6n² – 2n + 7 ⇒ T(n) ∈ Ω(n²)
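As a worked check (constants chosen here for illustration): for T(n) = 6n² – 2n + 7, take c = 1 and n₀ = 1; then for every n ≥ 1, T(n) ≥ 6n² – 2n ≥ 6n² – 2n² = 4n² ≥ n², so T(n) ∈ Ω(n²).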

E.G.M. Petrakis Algorithm Analysis 16

More on Lower Bounds

Ω provides a lower bound of the growth rate of T(n)

There exist many lower bounds
e.g., T(n) = n⁴ ∈ Ω(n), ∈ Ω(n²), ∈ Ω(n³), ∈ Ω(n⁴)
Find the largest one => T(n) ∈ Ω(n⁴)

E.G.M. Petrakis Algorithm Analysis 17

Theta Θ

If the lower and upper bounds are the same:
f ∈ O(g) and f ∈ Ω(g) ⇔ f ∈ Θ(g)

e.g.: T(n) = n³ + 135n ∈ Θ(n³)

E.G.M. Petrakis Algorithm Analysis 18

Theorem: T(n) = Σᵢ₌₀ᵐ αᵢ·nⁱ ∈ Θ(nᵐ)

(a) T(n) = αₘnᵐ + αₘ₋₁nᵐ⁻¹ + … + α₁n + α₀ ∈ O(nᵐ)
Proof: T(n) ≤ |T(n)| ≤ |αₘ|·nᵐ + |αₘ₋₁|·nᵐ⁻¹ + … + |α₁|·n + |α₀| ≤

≤ nᵐ·(|αₘ| + (1/n)|αₘ₋₁| + … + (1/nᵐ⁻¹)|α₁| + (1/nᵐ)|α₀|) ≤

≤ nᵐ·(|αₘ| + |αₘ₋₁| + … + |α₁| + |α₀|) ⇒

T(n) ∈ O(nᵐ)

E.G.M. Petrakis Algorithm Analysis 19

Theorem (cont.)

(b) T(n) = αₘnᵐ + αₘ₋₁nᵐ⁻¹ + … + α₁n + α₀ ≥ c·nᵐ + c·nᵐ⁻¹ + … + c·n + c ≥ c·nᵐ

where c = min{αₘ, αₘ₋₁, …, α₁, α₀} (assuming all αᵢ > 0) =>

T(n) ∈ Ω(nᵐ)

(a), (b) => T(n) ∈ Θ(nᵐ)

E.G.M. Petrakis Algorithm Analysis 20

Theorem: T(n) = Σᵢ₌₁ⁿ iᵏ ∈ Θ(n^(k+1)), for all k ≥ 0

(a) T(n) = Σᵢ₌₁ⁿ iᵏ ≤ Σᵢ₌₁ⁿ nᵏ = n·nᵏ = n^(k+1) ⇒ T(n) ∈ O(n^(k+1))

(b) 2T(n) = Σᵢ₌₁ⁿ iᵏ + Σᵢ₌₁ⁿ (n – i + 1)ᵏ = Σᵢ₌₁ⁿ (iᵏ + (n – i + 1)ᵏ) ≥ Σᵢ₌₁ⁿ (n/2)ᵏ = n·(n/2)ᵏ
(for every i, either i ≥ n/2 or (n – i + 1) ≥ n/2, so iᵏ + (n – i + 1)ᵏ ≥ (n/2)ᵏ)

⇒ T(n) ≥ (1/2^(k+1))·n^(k+1) ⇒ T(n) ∈ Ω(n^(k+1))
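For example, for k = 1 the theorem gives the familiar sum Σᵢ₌₁ⁿ i = n(n+1)/2, which indeed satisfies n²/2 ≤ n(n+1)/2 ≤ n² for n ≥ 1, i.e., it is in Θ(n²) = Θ(n^(k+1)).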

E.G.M. Petrakis Algorithm Analysis 21

Recursive Relations (1)

T(n) = 1                if n = 1
T(n) = T(n – 1) + n     if n > 1

T(n) = T(n – 1) + n = T(n – 2) + (n – 1) + n = … = T(1) + 2 + … + (n – 1) + n =
     = Σᵢ₌₁ⁿ i = n(n + 1)/2 ⇒ T(n) ∈ Θ(n²)
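A procedure whose cost follows exactly this recurrence (a hedged illustration, not from the slides) is a recursive selection sort: it scans all n elements to find the largest, moves it to the end, and recurses on the remaining n – 1 elements, so its step count satisfies T(n) = T(n – 1) + n:

/* Recursive selection sort: T(n) = T(n-1) + n  =>  Θ(n²) steps. */
void select_sort(int A[], int n) {
    if (n <= 1) return;                     /* T(1) = 1 */
    int max_pos = 0;
    for (int i = 1; i < n; i++)             /* n - 1 comparisons: the "+ n" term */
        if (A[i] > A[max_pos]) max_pos = i;
    int tmp = A[max_pos];                   /* move the largest element to the end */
    A[max_pos] = A[n - 1];
    A[n - 1] = tmp;
    select_sort(A, n - 1);                  /* the T(n - 1) term */
}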

E.G.M. Petrakis Algorithm Analysis 22

Fibonacci Relation

F(0) = 0
F(1) = 1
F(n) = F(n – 1) + F(n – 2)   if n ≥ 2

(a) F(n) = F(n – 1) + F(n – 2) ≤ 2F(n – 1) ≤ 4F(n – 2) ≤ … ≤ 2^(n–1)·F(1) = 2^(n–1) ⇒ F(n) ∈ O(2ⁿ)

(b) F(n) = F(n – 1) + F(n – 2) ≥ 2F(n – 2) ≥ 4F(n – 4) ≥ … ⇒
    F(n) ≥ 2^((n–1)/2) if n is odd
    F(n) ≥ 2^((n–2)/2) if n is even
⇒ F(n) ∈ Ω(2^(n/2)) = Ω((√2)ⁿ)

E.G.M. Petrakis Algorithm Analysis 23

Fibonacci Relation (cont.)

From (a), (b): F(n) ∈ Ω(2^(n/2)) ∩ O(2ⁿ)

It can be proven that F(n) ∈ Θ(φⁿ), where φ = (1 + √5)/2 = 1.6180… (the golden ratio)
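As an illustration (a hedged sketch, not part of the original slides), the straightforward recursive implementation in C makes a number of additions that grows like F(n) itself, i.e., between 2^(n/2) and 2ⁿ, so it runs in exponential time:

/* Naive recursive Fibonacci: the number of additions grows like F(n),
   which lies between 2^(n/2) and 2^n, so the running time is exponential in n. */
long fib(int n) {
    if (n == 0) return 0;               /* F(0) = 0 */
    if (n == 1) return 1;               /* F(1) = 1 */
    return fib(n - 1) + fib(n - 2);     /* F(n) = F(n-1) + F(n-2) */
}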

E.G.M. Petrakis Algorithm Analysis 24

Binary Search

int BSearch(int T[], int a, int b, int key) {
    if (a > b) return -1;                       // empty range: key not found
    int middle = (a + b) / 2;
    if (T[middle] == key) return middle;
    else if (key < T[middle])
        return BSearch(T, a, middle - 1, key);  // search lower half T[a..middle-1]
    else
        return BSearch(T, middle + 1, b, key);  // search upper half T[middle+1..b]
}
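A small usage sketch (the array contents and the surrounding main function are illustrative additions, assuming the signature above):

#include <stdio.h>

int BSearch(int T[], int a, int b, int key);   /* defined as on the slide above */

int main(void) {
    int T[] = {3, 7, 12, 25, 31, 48, 57};      /* sorted input, indices 0..6 */
    printf("%d\n", BSearch(T, 0, 6, 31));      /* prints 4 */
    printf("%d\n", BSearch(T, 0, 6, 40));      /* prints -1: not found */
    return 0;
}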

E.G.M. Petrakis Algorithm Analysis 25

Analysis of Binary Search

Let n = b – a + 1 be the size of the array, and assume n = 2ᵏ – 1 for some k (i.e., n = 1, 3, 7, 15, …)
The size of each sub-array is 2^(k–1) – 1

c: execution time of the first command
d: execution time outside the recursion
T: execution time

T(n) = T(2ᵏ – 1) = c                         if k = 0
T(n) = T(2ᵏ – 1) = d + T(2^(k–1) – 1)        if k > 0

E.G.M. Petrakis Algorithm Analysis 26

Binary Search (cont.)

T(n) = T(2ᵏ – 1) = d + T(2^(k–1) – 1) = 2d + T(2^(k–2) – 1) = … = kd + T(0) = kd + c = d·log(n + 1) + c

=> T(n) ∈ O(log n)

E.G.M. Petrakis Algorithm Analysis 27

Binary Search for Random n

For arbitrary n the recurrence becomes

T(n) = c                                                  if n = 0
T(n) = d + max{ T(⌊(n – 1)/2⌋), T(⌈(n – 1)/2⌉) }          if n > 0

T(n) ≤ T(n + 1) ≤ … ≤ T(u(n)), where u(n) = 2ᵏ – 1 for some k with n ≤ u(n) < 2n
Then T(n) ≤ T(u(n)) = d·log(u(n) + 1) + c ≤ d·log(2n + 1) + c => T(n) ∈ O(log n)

E.G.M. Petrakis Algorithm Analysis 28

Merge Sort

list mergesort(list L, int n)
    if (n == 1) return (L);
    L1 = lower half of L; L2 = upper half of L;
    return merge(mergesort(L1, n/2), mergesort(L2, n/2));

n: size of L (assume n is a power of 2)
merge: merges the sorted L1, L2 into a sorted list
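A compact array-based sketch in C (the slide works with lists; this version, its helper merge, and the scratch buffer B are illustrative assumptions, not the original code):

#include <string.h>

/* Merge the two sorted halves A[lo..mid] and A[mid+1..hi] using buffer B. */
static void merge(int A[], int B[], int lo, int mid, int hi) {
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)
        B[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i <= mid) B[k++] = A[i++];
    while (j <= hi)  B[k++] = A[j++];
    memcpy(A + lo, B + lo, (hi - lo + 1) * sizeof(int));   /* copy merged run back */
}

/* Sort A[lo..hi]: two half-size subproblems plus a linear merge. */
void mergesort(int A[], int B[], int lo, int hi) {
    if (lo >= hi) return;                  /* size 1: already sorted */
    int mid = (lo + hi) / 2;
    mergesort(A, B, lo, mid);
    mergesort(A, B, mid + 1, hi);
    merge(A, B, lo, mid, hi);              /* costs c2 * n at this level */
}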

E.G.M. Petrakis Algorithm Analysis 29

E.G.M. Petrakis Algorithm Analysis 30

Analysis of Merge Sort

[25] [57] [48] [37] [12] [92] [86] [33]

[25 57] [37 48] [12 92] [33 86]

[25 37 48 57] [12 33 86 92]

[12 25 33 37 48 57 86 92]

(the figure shows the log₂n levels of merging)

n steps per merge pass, log₂n merge passes => O(n log₂n)

E.G.M. Petrakis Algorithm Analysis 31

Analysis of Merge sort

T(n) = c₁                  if n = 1
T(n) = 2T(n/2) + c₂·n      if n > 1

Two ways to solve it: recursion analysis and induction

E.G.M. Petrakis Algorithm Analysis 32

Induction

Claim: T(n) ∈ O(n log n), i.e., T(n) ≤ a·n·log n + b for suitable constants a, b > 0
Proof: (1) for n = 1, this holds if b ≥ c₁
(2) assume T(k) ≤ a·k·log k + b for k = 1, 2, …, n – 1
(3) for k = n, prove that T(n) ≤ a·n·log n + b

For k = n/2: T(n/2) ≤ a·(n/2)·log(n/2) + b
therefore, T(n) = 2T(n/2) + c₂·n ≤ 2(a·(n/2)·log(n/2) + b) + c₂·n =

a·n·(log n – 1) + 2b + c₂·n

E.G.M. Petrakis Algorithm Analysis 33

Induction (cont.)

T(n) ≤ a·n·log n – a·n + 2b + c₂·n

Choosing b = c₁ and a = c₁ + c₂ =>
T(n) ≤ (c₁ + c₂)·n·log n + c₁ =>

T(n) ≤ C·n·log n + B

E.G.M. Petrakis Algorithm Analysis 34

Recursion analysis of mergesort

T(n) = 2T(n/2) + c₂·n ≤
     ≤ 4T(n/4) + 2c₂·n ≤
     ≤ 8T(n/8) + 3c₂·n ≤
     …
T(n) ≤ 2ⁱ·T(n/2ⁱ) + i·c₂·n, and for n = 2ⁱ (i.e., i = log n):

T(n) ≤ n·T(1) + c₂·n·log n =>
T(n) ≤ n·c₁ + c₂·n·log n =>
T(n) ≤ (c₁ + c₂)·n·log n =>

T(n) ≤ C·n·log n

E.G.M. Petrakis Algorithm Analysis 35

Best, Worst, Average Cases

For some algorithms, different inputs require different amounts of time

e.g., sequential search in an array of size n

worst case: element not in the array => T(n) = n
best case: element is first in the array => T(n) = 1
average case: element equally likely to be in any of the n positions => T(n) ≈ n/2

average number of comparisons: Σᵢ₌₁ⁿ pᵢ·xᵢ = (n + 1)/2, where xᵢ = i is the number of comparisons when the element is at position i and pᵢ = 1/n
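A sequential-search sketch matching the three cases (an illustration, not from the slides):

/* Returns the index of key in A[0..n-1], or -1 if absent.
   Best case: key at A[0] (1 comparison). Worst case: key absent (n comparisons).
   Average case, key equally likely in any position: about (n+1)/2 comparisons. */
int seq_search(const int A[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (A[i] == key) return i;
    return -1;
}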

E.G.M. Petrakis Algorithm Analysis 36

Comparison

The best case is not representative
it happens only rarely
it is useful when it has high probability

The worst case is very useful: the algorithm performs at least that well
very important for real-time applications

The average case represents the typical behavior of the algorithm
not always possible to determine
knowledge of the data distribution is necessary