615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many...

41
Recursive Functions Biostatistics 615 Lecture 5

Transcript of 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many...

Page 1: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Recursive Functions

Biostatistics 615Lecture 5

Page 2: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Notes on Problem Set 1

Results were very positive!• (But homework was time-consuming!)…

Familiar with Union Find algorithms

Language of choice• 50% tried C• 50% tried R

Page 3: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Question 1

How many random pairs of connections are required to connect 1,000 objects?

• Answer: ~3,740

Useful notes:• Count all connections• Use simple termination condition

Page 4: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Question 2

What are path lengths in the saturated tree?

• ~1.8 nodes on average• ~5 nodes for maximum path

Random data is better than worst case• log2 N nodes

Page 5: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Question 3

Is it better to use tree height or weight in ordering union operations?

• Tree weight is better. If we point the root of a tree with X nodes to the root of a tree with Y nodes, the length of X paths increases by 1.

• Smallest increase corresponds to X < Y.

• However, using height ensures that longest path is shorter.

Page 6: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Other notes…

Indent code• Easier to read• Easier to debug and spot mistakes

Useful functions for editing code in R• Debug() for stepping through lines of code• Edit() opens a text-editor for a function

Page 7: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Last Lecture…

Introduction to Programming in C• Data and function types• Control structures

Standard C Function Library

Arithmetic Precision

Page 8: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Today…

Introduce recursive functions

The Stack

Problematic recursive functions

Page 9: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Recursion

A function that is part of its own definition

e.g.

A program that calls itself

⎪⎩

⎪⎨

=

>−⋅

=

0 N if1

0 N if)1()(

NFactorialNNFactorial

Page 10: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Key Applications of Recursion

Dynamic Programming• Related to Markov processes in Statistics

Divide-and-Conquer Algorithms

Tree Processing

Page 11: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Recursive Function in R

Factorial <- function(N){if (N == 0)return(1)

elsereturn(N * Factorial(N - 1))

}

Page 12: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Recursive Function in C

int factorial (int N){if (N == 0)return 1;

elsereturn N * factorial(N - 1);

}

Page 13: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Key Features of Recursions

Initial set of known values

Recursive definition for other values• Computation of large N depends on smaller N

Can generally be expressed with a loop

Page 14: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

An Exception, where N increases:A Strange Recursive Function in Cint puzzle (int N)

{if (N == 1)

return 1;

if (N % 2 == 0)return puzzle(N / 2);

return puzzle(3 * N + 1);}

Page 15: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Evaluating puzzle(3)

puzzle(3)puzzle(10)

puzzle(5)puzzle(16)

puzzle(8)puzzle(4)

puzzle(2)puzzle(1)

Page 16: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

More Typical:Euclid’s Algorithm

Algorithm for finding greatest common divisor of two integers a and b• If a divides b

• GCD(a,b) is a

• Otherwise, find the largest integer t such that• at + r = b• GCD(a,b) = GCD(r,a)

Page 17: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Euclid’s Algorithm in R

GCD <- function(a, b){if (a == 0)return(b)

return(GCD(b %% a, a))}

Page 18: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Euclid’s Algorithm in C

int gcd (int a, int b){if (a == 0)return b;

return gcd(b % a, a);}

Page 19: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Evaluating GCD(4458, 2099)

gcd(2099, 4458)gcd(350, 2099)

gcd(349, 350)gcd(1, 349)

gcd(0, 1)

Page 20: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Divide-And-Conquer Algorithms

Common class of recursive functionsFunction• Processes input

• Divides input in half• Calls itself recursively for at least one half

• Order of processing and recursion may vary

Page 21: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Binary Searchint search(int a[], int value, int start, int stop)

{while (stop >= start)

{// Find midpointint mid = (start + stop) / 2;

// Compare midpoint to valueif (value == a[mid]) return mid;

// Reduce input in half!!!if (value < a[mid])

{ stop = mid – 1; }else

{ start = mid + 1; }}

// Search failedreturn -1;}

Page 22: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Recursive Binary Searchint search(int a[], int value, int start, int stop)

{// Search failedif (start > stop)

return -1;

// Find midpointint mid = (start + stop) / 2;

// Compare midpoint to valueif (value == a[mid]) return mid;

// Reduce input in half!!!if (value < a[mid])

return search(a, start, mid – 1 };else

return search(a, mid + 1, stop);}

Page 23: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Recursive Maximumint Maximum(int a[], int start, int stop)

{int left, right;

// Maximum of one elementif (start == stop)

return a[start];

left = Maximum(a, start, (start + stop) / 2);right = Maximum(a, (start + stop) / 2 + 1, stop);

// Reduce input in half!!!if (left > right)

return left;else

return right;}

Page 24: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

The Stack

Specialized area of memory• Grows with each function call• Released when function returns

Tracks• Function arguments• Previous program state• Local variables

Size of stack limits depth of recursion

Page 25: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

A Typical Stack

An array with 9 elementsa[] = {2, 3, 5, 7, 11, 17, 19, 23, 29};

Function callsearch(a, 15, 0, 8);

Arguments and local variables for each recursive call stored in stack …

Page 26: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

A Typical Stack IImid=5stop=5start=5

value=15a[]

SEARCHS mid=6 mid=6T stop=8 stop=8A start=5 start=5C value=15 value=15K a[] a[]

SEARCH SEARCHmid=4 mid=4 mid=4stop=8 stop=8 stop=8start=0 start=0 start=0

value=15 value=15 value=15a[] a[] a[]

SEARCH SEARCH SEARCH… … … …

Page 27: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Well behaved recursions

So far …• Factorial• Greatest Common Divisor• Binary Search• Maximum

Situations where recursions are effective• Does this always hold?

Page 28: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

A trickier example

Consider the Fibonacci numbers…

⎪⎪⎪

⎪⎪⎪

−+−

=

=

=

)2()1(

1 N if1

0 N if0

)(

NFibonacciNFibonacci

NFibonacci

Page 29: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Fibonacci Numbersint Fibonacci(int i)

{// Simple cases firstif (i == 0)

return 0;

if (i == 1)return i;

return Fibonacci(i – 1) + Fibonacci(i – 2);}

Page 30: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Terribly Slow!

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32

Time

0

10

20

30

40

50

60

Fibonacci Number

Time (seconds)Calculating Fibonacci Numbers Recursively

Page 31: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

What is going on? …

Page 32: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Faster Alternatives

Certain quantities are recalculated• Far too many times!

Need to avoid recalculation…• Ideally, calculate each unique quantity once.

Page 33: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Dynamic Programming

A technique for avoiding recomputation

Can make exponential running times …

… become linear!

Page 34: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Bottom-Up Dynamic Programming

Evaluate function starting with smallest possible argument value• Stepping through possible values, gradually increase

argument value

Store all computed values in an array

As larger arguments evaluated, precomputed values for smaller arguments can be retrieved

Page 35: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Fibonacci Numbersint Fibonacci(int i)

{int a[LARGE_NUMBER], j;

a[0] = 0;a[1] = 1;

for (j = 2; j <= i; j++)a[j] = a[j – 1] + a[j – 2];

return a[i];}

Page 36: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Top-Down Dynamic Programming

Save each computed value as final action of recursive function

Check if pre-computed value exists as the first action

Page 37: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Fibonacci Numbersint Fibonacci(int i)

{// Simple cases firstif (saveF[i] > 0)

return saveF[i];

if (i <= 1)return 1;

// RecursionsaveF[i] = Fibonacci(i – 1) + Fibonacci(i – 2);return saveF[i];}

Page 38: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Much less recursion now…

Page 39: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Limitations of Dynamic Programming

Requires integer arguments• Need to index results in an array

Small number of possible argument values• Need enough memory to store array

Page 40: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Summary

Recursive functions

The stack

Dynamic programming• We’ll see more of this on Thursday!

Page 41: 615.05 -- Recursive Functionscsg.sph.umich.edu/abecasis/class/615.05.pdf · Question 1 zHow many random pairs of connections are required to connect 1,000 objects? • Answer: ~3,740

Reading

Sedgewick, Chapters 5.1 – 5.3