Chapter 14: Generating Data with Do Loops OBJECTIVES Understand iterative DO loops. Construct a DO...

37
Chapter 14: Generating Data with Do Loops OBJECTIVES • Understand iterative DO loops. • Construct a DO loop to perform repetitive calculations • Use DO loops to eliminate redundant code. • Use DO loop processing to conditionally execute code. • Generate multiple observations in one iteration of the DATA step • Construct nested DO loops

Transcript of Chapter 14: Generating Data with Do Loops OBJECTIVES Understand iterative DO loops. Construct a DO...

Chapter 14: Generating Data with Do Loops

OBJECTIVES

• Understand iterative DO loops.• Construct a DO loop to perform repetitive calculations• Use DO loops to eliminate redundant code.• Use DO loop processing to conditionally execute code.• Generate multiple observations in one iteration of the DATA

step• Construct nested DO loops

2

DO Loop Processing• Statements within a DO loop execute for a

specific number of iterations or until a specific condition stops the loop.

DATA statement SAS statements DO statement iterated SAS statements END statement SAS statementsRUN statement

3

Repetitive Coding

• Compare the interest for yearly versus quarterly compounding on a $50,000 investment made for one year at 7.5 percent interest.

• How much money will a person accrue in each situation?

4

Repetitive Coding

data compound; Amount=50000; Rate=.075;Yearly=Amount*Rate;Quarterly+((Quarterly+Amount)*Rate/4);

Quarterly+((Quarterly+Amount)*Rate/4);Quarterly+((Quarterly+Amount)*Rate/4);Quarterly+((Quarterly+Amount)*Rate/4);run;

5

Repetitive Coding

proc print data=compound noobs;run;

Amount Rate Yearly Quarterly

50000 0.075 3750 3856.79

What if you wanted to determine the quarterly compounded interest after a period of 20 years (80 quarters)?

PROC PRINT Output

6

DO Loop Processing

data compound(drop=Qtr); Amount=50000; Rate=.075; Yearly=Amount*Rate; do Qtr=1 to 4; Quarterly+((Quarterly+Amount)*Rate/4); end;run;

7

• The iterative DO statement executes statements between DO and END statements repetitively, based on the value of an index variable.

• specification-1…specification-n can represent a range of values or a list of specific values.

The Iterative DO Statement

DO index-variable=specification-1 <,…specification-n>; <additional SAS statements>END;

DO index-variable=specification-1 <,…specification-n>; <additional SAS statements>END;

General Syntax of a DO Loop

DO index-variable = start TO stop BY increment;SAS statements;END;

• Index-variable stores the value of the current iteration of the DO loop.

• Start, stop, increment values are set upon entry of the DO loop.

• Can not be changed during the processing the DO loop• Can be numbers, variables, or SAS expressions.

9

• The BY clause is optional. If ignored, the increment is 1.• Increment may be negative, which will process the DO

loop backwards. For this situation, Start should be larger than Stop.

• When completing the DO loop, the value of the index-variable is STOP_value+Increment, not the STOP_value. Ex: Do month = 1 to 12; pay = pay*1.05;

END;Once the DO loop is completed, the value of month = 13, NOT

12.

The Iterative DO Statement

10

The Iterative DO Statement

What are the values of each of the four index variables?

do i=1 to 12;

do j=2 to 10 by 2;

do k=14 to 2 by –2;

do m=3.6 to 3.8 by .05;

1 2 3 4 5 6 7 8 9 10 11 12 13

2 4 6 8 10 12

14 12 10 8 6 4 2 0

3.60 3.65 3.70 3.75 3.80 3.85

Out of range

11

The Iterative DO Statement

• item-1 through item-n can be either all numeric or all character constants, or they can be variables.

• The DO loop is executed once for each value in the list.

DO index-variable=item-1 <,…item-n>;DO index-variable=item-1 <,…item-n>;

12

do Month='JAN','FEB','MAR';

do Fib=1,2,3,5,8,13,21;

do i=Var1,Var2,Var3;

do j=BeginDate to Today() by 7;

do k=Test1-Test50;

The Iterative DO Statement

How many times will each DO loop execute?

3 times.

7 times.

3 times.

The number of iterations depends

on the values of BeginDate and Today().

1 time. A single value of k is determined

by subtracting Test50 from Test1. Use ‘TO’ not ‘-’.

14

Performing Repetitive Calculations

On January 1 of each year, $5,000 is invested in an account. Determine the value of the account after three years based on a constant annual interest rate of 7.5 percent.

data invest; do Year=2001 to 2003; Capital+5000; Capital+(Capital*.075); end;run;

15

Performing Repetitive Calculations

Year Capital

2004 17364.61

proc print data=invest noobs;run;

PROC PRINT Output

NOTE: The Year is 2004, NOT 2003. The Capital is not for 2003.

16

Performing Repetitive Calculations

data invest; do Year=2001 to 2003; Capital+5000; Capital+(Capital*.075); output; end;run;

proc print data=invest noobs;run;

Generate a separate observation for each year.

Explicit output

17

Performing Repetitive Calculations

Year Capital

2001 5375.00 2002 11153.13 2003 17364.61

Why is the value of Year not equal to 2004 in the last observation?

NOTE: The use of OUTPUT, the explicit output, controls the output from 2001 to 2003.

PROC PRINT Output

18

Reducing Redundant Code• The example that forecast the growth of each

division of an airline.• Partial Listing of the data set: mylib.growth

NumDivision Emps Increase

APTOPS 205 0.075 FINACE 198 0.040 FLTOPS 187 0.080

19

Reducing Redundant CodeA Forecasting of Airline Growth

data forecast; set mylib.growth(rename=(NumEmps=NewTotal)); Year=1; NewTotal=NewTotal*(1+Increase); output; Year=2; NewTotal=NewTotal*(1+Increase); output; Year=3; NewTotal=NewTotal*(1+Increase); output;run;

What if you want to forecast growth over the next 30 years?

20

Reducing Redundant CodeUse a DO loop to eliminate the redundant

code in the previous example.

data forecast; set mylib.growth(rename=(NumEmps=NewTotal)); do Year=1 to 3; NewTotal=NewTotal*(1+Increase); output; end;run;

21

Reducing Redundant Codeproc print data=forecast noobs;run;

Partial PROC PRINT Output

NewDivision Total Increase Year

APTOPS 220.38 0.075 1APTOPS 236.90 0.075 2APTOPS 254.67 0.075 3FINACE 205.92 0.040 1

What if you want to forecast the number of years it would take for the size of the Airport Operations Division to exceed 300 people?

22

Conditional Iterative Processing

• You can use DO WHILE and DO UNTIL statements to stop the loop when a condition is met rather than when the index variable exceeds a specific value.

• To avoid infinite loops, be sure that the condition specified will be met.

23

The DO WHILE statement executes statements in a DO loop while a condition is true.

• expression is evaluated at the top of the loop.• The statements in the loop never execute if the

expression is initially false.

The DO WHILE Statement

DO WHILE (expression); <additional SAS statements>END;

DO WHILE (expression); <additional SAS statements>END;

24

The DO UNTIL StatementThe DO UNTIL statement executes statements in a DO

loop until a condition is true.

• expression is evaluated at the bottom of the loop.• The statements in the loop are executed at least once.

DO UNTIL (expression); <additional SAS statements>END;

DO UNTIL (expression); <additional SAS statements>END;

25

Conditional Iterative Processing

Task:Determine the number of years it would take

for an account to exceed $1,000,000 if $5,000 is invested annually at 7.5 percent.

26

Conditional Iterative Processing

data invest;

do until(Capital>1000000); Year+1; Capital+5000; Capital+(Capital*.075); end;run;

proc print data=invest noobs;run;

27

Conditional Iterative Processing

PROC PRINT Output

How could you generate the same result with a DO WHILE statement?

Capital Year

1047355.91 38

28

The Iterative DO Statement with a Conditional Clause

You can combine DO WHILE and DO UNTIL statements with the iterative DO statement.

This is one method of avoiding an infinite loop in DO WHILE or DO UNTIL statements.

DO index-variable=start TO stop <BY increment> WHILE | UNTIL (expression); <additional SAS statements>END;

DO index-variable=start TO stop <BY increment> WHILE | UNTIL (expression); <additional SAS statements>END;

29

The Iterative DO Statement with a Conditional Clause

Task: • Determine the return of the account again.• Stop the loop if 25 years is reached or more

than $250,000 is accumulated.

30

The Iterative DO Statement with a Conditional Clause

data invest; do Year=1 to 25 until(Capital>250000); Capital+5000; Capital+(Capital*.075); end;run;

proc print data=invest noobs;run;

31

The Iterative DO Statement with a Conditional Clause

PROC PRINT Output

Year Capital

21 255594.86

32

Nested DO Loops

Nested DO loops are loops within loops.When you nest DO loops,– use different index variables for each loop– be certain that each DO statement has a

corresponding END statement.

33

Nested DO Loops

Task:• Create one observation per year for five

years and show the earnings if you invest $5,000 per year with 7.5 percent annual interest compounded quarterly.

34

data invest(drop=Quarter); do Year=1 to 5; Capital+5000; do Quarter=1 to 4; Capital+(Capital*(.075/4)); end; output; end;run;

proc print data=invest noobs;run;

Nested DO Loops

5x 4x

35

PROC PRINT Output

How could you generate one observation for each quarterly amount?

Nested DO Loops

Year Capital

1 5385.68 2 11186.79 3 17435.37 4 24165.94 5 31415.68

36

Nested DO LoopsTask:• Compare the final results of investing $5,000

a year for five years in three different banks that compound quarterly. Assume each bank has a fixed interest rate.

mylib.Banks

Name Rate

Calhoun Bank and Trust 0.0718State Savings Bank 0.0721National Savings and Trust 0.0728

37

Nested DO Loops

data invest(drop=Quarter Year); set mylib.banks; Capital=0; do Year=1 to 5; Capital+5000; do Quarter=1 to 4; Capital+(Capital*(Rate/4)); end; end;run;

4x5x

3x

38

Nested DO Loops

PROC PRINT Output

Name Rate Capital

Calhoun Bank and Trust 0.0718 31106.73State Savings Bank 0.0721 31135.55National Savings and Trust 0.0728 31202.91

proc print data=invest noobs;run;