Week 12: Running Time and Performance 1. Most of the problems you have written in this class run in...

35
CS 177 Week 12: Running Time and Performance 1
  • date post

    15-Jan-2016
  • Category

    Documents

  • view

    216
  • download

    0

Transcript of Week 12: Running Time and Performance 1. Most of the problems you have written in this class run in...

Page 1: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

1

CS 177Week 12: Running Time and Performance

Page 2: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

2

How long does a program take to run?

Most of the problems you have written in this class run in a few seconds or less

Some kinds of programs can take much longer: Chess algorithms (Deep Blue) Routing and scheduling algorithms

(Traveling Sales Person, Purdue classroom scheduling)

Medical imaging (massive amounts of data)

Graphics processing (Pixar: Toy Story, Incredibles)

Page 3: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

3

How can you measure the amount of time a program takes?

You can time the program What if it takes 1000 years to run?

You can look at the program What are you looking for?

Does the input matter? Sorting 100,000 numbers must take

longer than sorting 10 numbers, right?

Page 4: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

4

Algorithms

A sequence of steps used to solve a problem

In CS, we take an algorithm and convert it into code -- Java in this class

The study of algorithms is intended to: Make sure that the code you write gets the

right answer Make sure that the code runs as quickly as

possible Make sure that the code uses a reasonable

amount of memory

Page 5: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

5

Algorithmic example: search You want to find the definition of the word

“rheostat” in the dictionary The Oxford English Dictionary has 301,100 main

entries What if you search for the word “rheostat” by

starting at the beginning and checking each word to see if it is the one we are interested in

Given a random word, you’ll have to check about 150,550 entries before you find it

Alternatively, you could jump to the correct letter of the alphabet, saving tons of time

Searching for “zither” takes forever otherwise

Page 6: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

6

Algorithmic example: Summation

Write a program to sum up the numbers between 1 and n

Easy, right? Let us recall that a faster way might

be to use the summation equation:

n

i

nni

1 2)1(

Page 7: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

7

Memory usage

Memory usage is a problem too If you run out of memory, your

program can crash Memory usage can have serious

performance consequences too

Page 8: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

8

Memory

Remember, there are multiple levels of memory on a computer

Each next level is on the order of 100 times larger and 100 times slower

Cache• Actually on the CPU• Fast and expensive

RAM• Primary memory for a desktop computer• Pretty fast and relatively expensive

Hard Drive• Secondary memory for a desktop computer• Slow and cheap

100X 100X

100X 100X

Size Speed

Page 9: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

9

Memory speed

If you can do a lot of number crunching without leaving cache, that will be very fast

If you have to fetch data from RAM, that will slow things down

If you have to read and write data to the hard drive (unavoidable with large pieces of data like digital movies), you will slow things down a lot

Page 10: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

10

Measuring memory

Memory can be easier to estimate than running time

Depending on your input, you will allocate a certain number of objects, arrays, and primitive data types

It is possible to count the storage for each item allocated

A reference to an object or an array costs an additional 4 bytes on top of the size of the object

Page 11: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

11

What’s a graph?

A graph is a set of nodes and edges It’s a really simple way to represent almost any

kind of relationship

The numbers on the edges could be distances, costs, time, whatever

5

3 4

16

A

B

D

C

E

2

8

6

Page 12: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

12

Traveling Salesperson Problem

There’s a classic CS problem called the Traveling Salesperson Problem (TSP)

Given a graph, find the shortest path that visits every node and comes back to the starting point

Like a lazy traveling salesperson who wants to visit all possible clients and then return home as soon as possible

Page 13: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

13

How would you solve the TSP?

Strategies: Always visit the nearest neighbor next Randomized approaches Try every possible combination

How can you tell what’s the best solution?

Page 14: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

14

Closest doesn’t always work

We are tempted to always take the closest neighbor, but there are pitfalls

Page 15: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

15

Trying every combination is brutal

In a completely connected graph, we can try any sequence of nodes

If there are n nodes, there are (n – 1)! tours

For 30 cities, 29! = 8841761993739701954543616000000

If we can check 1,000,000,000 tours in one second, it will only take about 20,000 times the age of the universe to check them all

We will (eventually) get the best answer!

Page 16: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

16

No one knows the best solution This and similar problems are useful

problems to solve for people who need to plan paths UPS Military commanders Pizza delivery people Internet routing algorithms

No one knows a solution that always works in a reasonable amount of time

We have been working on it for decades

Page 17: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

17

Fabulous cash prizes

The problem is so hard that you can win money if you solve it

Clay Mathematics Institute has offered a $1,000,000 prize

You can do one of two things to collect it: Find an efficient solution to the general

problem Prove that it is impossible to find an

efficient solution in some cases

Page 18: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

18

What does efficient mean?

How do we measure algorithm running time and memory usage in general?

We want to take into account the size of the problem (TSP with 4 cities is certainly easier than TSP with 4,000 cities)

Some algorithms might work well in certain cases and badly in others

How do we come up with a measuring scheme?

Page 19: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

19

Classification

Ideally, we want to classify both problems and algorithms

We want to figure out which problems take a really long time to solve

We want to figure out how we can compare one algorithm to another

We want to prove that some algorithm is the best possible algorithm to solve a particular problem

Page 20: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

20

Big-Oh Notation

Computer scientists use Big-Oh notation to give an indication of how much time (or memory) a program will take

Big-Oh notation is: Asymptotic

Focuses on the behavior as the size of input gets large

Worst-CaseFocuses on the absolute longest the program can take

Page 21: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

21

Running time example

Add up the operations done by the following code:

Initialization: 1 operation Loop: 1 initialization + n checks + n increments

+ n additions to sum = 3n + 1 Output: 1 operation Total: 3n + 3

int sum = 0;

for( int i = 0; i < n; i++ )sum += i;

System.out.println(“Sum: “ + sum);

Page 22: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

22

Running time function

We could express the time taken by the code on the previous slide as a function of n: f(n) = 3n + 3

This approach has a number of problems: We assumed that each line takes 1 time unit to

accomplish, but the output line takes much longer than an integer operation

This program is 4 lines long, a longer program is going to have a very messy running time function

We can get nit picky about details: Does sum += i; take one operation or two if we count the addition and the store separately?

Page 23: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

23

Inadequacy of running time function

In short, this way of getting a running time function is almost useless because: It cannot be used to give us an idea of how long

the program really runs in seconds It is complex and unwieldy

The most important thing about the analysis of the code that we did is learning that the growth of the function should be linear (3n+3)

A general description of how the running time grows as the input grows would be useful

Page 24: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

24

Big Oh notation

Enter Big Oh notation Big Oh simplifies a complicated running

time function into a simple statement about its worst case growth rate

All constant coefficients are ignored All low order terms are ignored 3n + 3 is O(n) Big Oh is a statement that a particular

running time is no worse than some function, with appropriate constants

Page 25: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

25

Big Oh simplification examples 147n3 + 2n2 + 5n + 12083 is

O(n3)

15n2 + 6n + 7log n + 145 isO(n2)

log n + 87829 isO(log n)

Note: In CS, we use log2 unless stated otherwise

Page 26: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

26

Mulitiplication by hand

How long does it take to do multiplication by hand?

123

x 456

738

615

492__

56088

Let’s assume that the length of the numbers is n digits (n multiplications + n carries) x n digits + (n + 1

digits) x n additions = 3n2 + n Running time: O(n2)

Page 27: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

27

Finding the largest element in an array

How do we find the largest element in an array?

Running time: O(n) if n is the length of the array What if the array is sorted in ascending order?

Running time: O(1)

int largest = array[0];

for( int i = 1; i < array.length; i++ )if( array[i] > largest )

largest = array[i];

System.out.println(“Largest: “ + largest);

System.out.println(“Largest: “ + array[array.length-1]);

Page 28: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

28

Multiplying matrices

Given two n x n matrices A and B, the code to multiply them is:

Running time: O(n3) Is there a faster way to multiply matrices? Yes, but it’s complicated and has other

problems

double[][] c = new double[n][n];

for( int i = 0; i < n; i++ )for( int j = 0; j < n; j++ )

for( int k = 0; k < n; k++ )c[i][j] += a[i][k]*b[k][j];

Page 29: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

29

Bubble sort

Here is some code that sorts an array in ascending order

What is its running time?

Running time: O(n2)

for( int i = 0; i < array.length; i++ )for( int j = 0; j < array.length - 1; j++ )

if( array[j] > array[j + 1] ){

int temp = array[j];array[j] = array[j + 1];

array[j + 1] = temp; }

Page 30: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

30

Rules of thumb

There is nothing better than constant time (O(1)) Logarithmic time (O(log n)means that the problem

can become much larger and only take a little longer

Linear time (O(n))means that time grows with the problem

Quadratic time (O(n2)) means that expanding the problem size significantly could make it impractical

Cubic time (O(n3)) is about the reasonable maximum if we expect the problem to grow

Exponential and factorial time mean that we cannot solve anything but the most trivial problem instances

Page 31: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

31

Empirical techniques

Looking at the code and determining the order of complexity is one technique

Sometimes this only captures worst-case behavior

The code can be complicated and difficult to understand

Another possibility is running experiments to see how the running time changes as the problem size increases

Page 32: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

32

Doubling hypothesis

The doubling hypothesis states that it is often possible to determine the order of the running time of a program by progressively doubling the input size and measuring the changes in time

Page 33: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

33

Finding the largest element in an array

Let’s use the doubling hypothesis to test the running time of the program that finds the largest element in an unsorted array

Looks like a decent indication of O(n)

Size Time Ratio

1024 0.035829 ms -

2048 0.071727 ms 2.00

4096 0.188013 ms 2.62

8192 0.278877 ms 1.48

Page 34: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

34

Multiplying matrices

Let’s check matrix multiplication

A factor of 8 is expected for O(n3) Not too bad

Size Time Ratio

128 39.786341 ms -

256 229.902823 ms 5.78

512 2945.396691 ms

12.81

1024 25957.367417 ms

8.81

Page 35: Week 12: Running Time and Performance 1.  Most of the problems you have written in this class run in a few seconds or less  Some kinds of programs can.

35

Bubble sort

Let’s try bubble sort

Success! O(n2) is supposed to give a ratio of 4

We should have done more trials to do this scientifically

Size Time Ratio

1024 8.648306 ms -

2048 28.331604 ms 3.28

4096 108.931582 ms 3.84

8192 404.451683 ms 3.71