Hunting for Columbus’ Eggs in the SAS® Programming World:

A Guidance to Creative Thinking for SAS® Programmers

Alice M. Cheng, South San Francisco, CA

ABSTRACT

A Columbus’ Egg, according to Wikipedia, refers to a brilliant idea or discovery that seems simple or easy after the fact. In this paper, the author presents several Columbus’ Eggs that may be helpful to fellow SAS® programmers. SAS examples are given, followed by learning tips to remind readers of what they have actually learned. Examples include catchy phrases for remembering basic SQL structure, an innovative use of a SAS function, methods to white out or cover up unwanted information, horizontal vs. vertical approaches to a problem, and the divide-and-conquer technique. The author’s intention is not only to provide readers with useful techniques but, more importantly, to stimulate them to think outside the box and hunt for their own Columbus’ Eggs!

KEYWORDS

SAS, Innovation, Creativity, Imagination

INTRODUCTION

As a SAS programmer, one constantly faces problems of varying difficulty. Some problems have a straightforward but complex solution. However, if one thinks creatively, there may be a simpler approach. In this paper, the author explores some of these techniques, which can be helpful to fellow SAS programmers. After each example, learning tips are provided to remind readers of what they have actually learned beyond the obvious technical aspect. Hopefully, through this learning process, one will start thinking creatively and look at a problem from different perspectives!

THE COLUMBUS’ EGG STORY

On October 12, 1492, the famous explorer and navigator Christopher Columbus discovered land previously unknown to the Europeans. During his lifetime, Columbus made 4 voyages across the Atlantic Ocean and introduced America as a continent to the Europeans. Today, we honor Columbus’ accomplishment by naming October 12 as Columbus Day.

Yet not everyone in Columbus’ time thought highly of him. There is an unconfirmed story that Columbus was once challenged by a Spanish nobleman who claimed that the discovery of the Indies was not a great achievement: the Indies were there waiting to be discovered, if not by Columbus, then surely by one of the great, knowledgeable men that Spain was not short of. Columbus did not reply directly to this challenge; instead, he challenged the noblemen present to make an egg stand on its end without assistance from any device. The noblemen tried without success. Columbus then made the egg stand by lightly tapping its blunt end on the table, slightly crushing the shell. The answer may seem simple, yet none of the noblemen had thought of it! Today, a Columbus’ Egg refers to a brilliant idea or discovery that seems simple once the solution has been uncovered!

HUNTING FOR COLUMBUS’ EGGS

Below, the author introduces some examples which, in her opinion, are Columbus’ Eggs. They may appear simple and obvious once the solutions have been disclosed, just like a Columbus’ Egg! Without further ado, let the egg hunt begin in the SAS programming world!

CATCHY PHRASES TO REMEMBER PROC SQL BASIC STRUCTURE

Structured Query Language (SQL) is a powerful language for data manipulation. It enables users to manipulate data quickly and with ease. To take advantage of it, one needs to remember the basic syntax of the SQL procedure:


Basic Structure of SQL

    proc sql;
        Create table table-name as
        Select object-item1, …, object-itemN
        From source-table
        Where where-condition
        Group by group-by-item
        Having sql-expression
        Order by order-by-item;
    quit;

Create, Select, From, Where, Group by, Having and Order by are the keywords in the SQL procedure. These keywords must appear in exactly this order for the SQL procedure to be processed successfully. To help remember the order, the author thought of 3 phrases (1 sentence and 2 questions, to be exact) built from the first letter of each keyword:

Creative Solution From the Wise Gets Hearty Outcome/Ovation.

Can San Francisco Workers Go Home On Time?

Can So Few Wives Google Husbands Online?

Above are just some examples from the author. Readers are encouraged to use their imagination to create their own!

Learning Tip: A catchy phrase can be helpful for memorization when the situation becomes complicated and hard to remember! A simple tune has the same effect.
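To make the clause order concrete, here is a minimal sketch that uses each keyword once. It assumes the SASHELP.CLASS sample data set is available; the output table name WORK.CLASS_SUMMARY and the column aliases are hypothetical names chosen purely for illustration.

```sas
proc sql;
   create table work.class_summary as      /* Create                          */
   select sex,                             /* Select                          */
          count(*)    as n_students,
          avg(height) as mean_height
   from sashelp.class                      /* From                            */
   where age >= 12                         /* Where: filters rows first       */
   group by sex                            /* Group by                        */
   having count(*) >= 2                    /* Having: filters groups          */
   order by mean_height desc;              /* Order by                        */
quit;
```

Placing a clause out of order, for example WHERE after GROUP BY, should be rejected as a syntax error, which is exactly the mistake the phrases are meant to prevent.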

THINK BEYOND THE OBVIOUS

How many words are there in a character string? Carpenter (2004) provides the %WORDCOUNT macro to answer this question. The macro is defined as follows:

    %macro wordcount(list);
        %* Count the number of words in &LIST;
        %local count;
        %let count=0;
        %do %while (%qscan(&list, &count+1, %str( )) ne %str());
            %let count=%eval(&count+1);
        %end;
        &count
    %mend wordcount;

Consider the string “Peter Paul and Mary”.

%let list=Peter Paul and Mary;

%put The total number of words in &list is: %wordcount(&list);

From SAS LOG, one sees:

The total number of words in Peter Paul and Mary is: 4

The same problem can be solved with the DIM function, which returns the number of elements in an array. It can also be used to find the number of words in a character string, as the following example shows.


    %macro wcount(list);
        %* Count the number of words in &LIST;
        %global count;
        data _null_;
            array nlist{*} &list;
            * CALL SYMPUTX stores the numeric value without leading blanks;
            call symputx('count', dim(nlist));
        run;
    %mend wcount;

%let list=Peter Paul and Mary;

%wcount(&list);

%put The total number of words in &list is: &count;

From SAS LOG, one sees:

The total number of words in Peter Paul and Mary is: 4

This is an example of using a SAS function beyond its obvious usage. Here, an array is created and the words in the string are treated as its elements. The goal is to find the number of words in the character string, which is the same as the dimension of the array! Of course, the %WORDCOUNT macro is more robust than %WCOUNT. The latter works only when every word in the string is a valid SAS variable name; it fails if any word contains special characters that are not allowed in a variable name!

Consider the string “Simon & Garfunkel”.

%let list=Simon & Garfunkel;

%put The total number of words in &list is: %wordcount(&list);

From SAS LOG, one sees:

The total number of words in Simon & Garfunkel is: 3

%WCOUNT fails on this string because an ampersand (&) is not a valid element name in an ARRAY statement!

It is worth pointing out that SAS has introduced many new functions in recent years. In later versions of SAS, the COUNTW function serves this very purpose. Because arguments passed through %SYSFUNC are not enclosed in quotation marks, the blank delimiter is supplied with %STR:

    %let list=Simon & Garfunkel;
    %put The total number of words in &list is: %sysfunc(countw(&list, %str( )));

This gives the same result as %WORDCOUNT, using a blank as the delimiter.

Learning Tip: Think beyond the obvious. Use your resources creatively. Keep track of new technology! Oftentimes there are multiple approaches to the same problem.
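For comparison, the same count can be obtained inside a DATA step, where the delimiter is passed as an ordinary quoted string. This is a minimal sketch, assuming a SAS release that includes COUNTW; the variable names LIST and N_WORDS are illustrative only.

```sas
data _null_;
   length list $40;
   list = 'Simon & Garfunkel';
   /* The second argument lists the delimiters; here, only a blank */
   n_words = countw(list, ' ');
   put 'Number of words: ' n_words;
run;
```

Inside single quotes the ampersand is not treated as a macro trigger, so the string is counted as three blank-delimited words, just as with %WORDCOUNT.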

THINK FORWARD. THINK BACKWARD.

In clinical studies, when the end date of an Adverse Event is uncertain, researchers usually adopt a conservative approach. If only the month and the year are available, the last day of the month is used as the end date. So when is the last day of the month? For January, March, May, July, August, October and December, it is the 31st; for April, June, September and November, it is the 30th; for February, it is the 29th in a leap year and the 28th in a common year.

What exactly is a leap year? A leap year, according to Wikipedia, has to satisfy 2 conditions: (1) the year is divisible by 4, and (2) if it is divisible by 100, it must be divisible by 400 as well. The years 1700, 1800, 1900 and 2100 are not leap years since they fail condition (2). So in clinical studies, unless you go through very old records, the first condition should suffice because 2100 seems a bit too remote in the future! For the sake of accuracy and completeness, both conditions will be taken into consideration.
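The two conditions can be written as a single Boolean expression. The sketch below is only an illustration of the rule as stated above; the variable names YEAR and IS_LEAP and the test years are chosen for the example.

```sas
data _null_;
   do year = 1900, 2000, 2008, 2010, 2100;
      /* Condition (1): divisible by 4.                             */
      /* Condition (2): if divisible by 100, also divisible by 400. */
      is_leap = (mod(year, 4) = 0) and
                (mod(year, 100) ne 0 or mod(year, 400) = 0);
      put year= is_leap=;
   end;
run;
```

Of these five years, only 2000 and 2008 satisfy both conditions; 1900 and 2100 pass condition (1) but fail condition (2).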

A straightforward but more complicated approach is to compute the last day of the month by following the aforementioned algorithm. A more creative and simpler solution is to find the first day of the next month and then subtract 1 from the result. Both yield the same answer, as the following code shows. However, the second approach takes only one line of code!


Code to Derive AE End Date When Only Month and Year Are Available

    *------------------------------------------------------------------------------*;
    * The derived AE End Date is the last day of the month in which an AE ends.    *;
    *------------------------------------------------------------------------------*;
    proc format;
        value DAYF
            1, 3, 5, 7, 8, 10, 12 = '31'
            4, 6, 9, 11           = '30'
        ;
    run;

    data AEDATE;
        input AEENDMM AEENDYY @@;
        label AEENDMM='Month for End of AE'
              AEENDYY='Year for End of AE';
        datalines;
    3 2009 9 2010 12 2010 2 2008
    2 2010 1 2009 2 2100 2 2000
    ;
    run;

    data AEENDDT;
        set AEDATE;

        *- The more creative approach needs only the single statement below! -*;
        *- MOD(AEENDMM, 12)+1 gives the next month, with December rolling    -*;
        *- over to January of the following year.                            -*;
        AEENDDT2=mdy(mod(AEENDMM, 12)+1, 1, AEENDYY+(AEENDMM=12))-1;

        *--- Code for the straightforward approach. ---*;
        if AEENDMM ne 2 then AEENDDT=mdy(AEENDMM, input(put(AEENDMM, DAYF.), 2.), AEENDYY);
        else if mod(AEENDYY, 4) ne 0 then AEENDDT=mdy(AEENDMM, 28, AEENDYY);
        else if mod(AEENDYY, 100)=0 then do;
            if mod(AEENDYY, 400)=0 then AEENDDT=mdy(AEENDMM, 29, AEENDYY);
            else AEENDDT=mdy(AEENDMM, 28, AEENDYY);
        end;
        else AEENDDT=mdy(AEENDMM, 29, AEENDYY);

        *--- Use SAS function to find the last date of the month. ---*;
        *--- Note: the author has arbitrarily picked 15 for the date. ---*;
        *--- Any valid date will work. ---*;
        AEENDDT3=intnx('month', mdy(AEENDMM, 15, AEENDYY), 0, 'e');

        format AEENDDT AEENDDT2 AEENDDT3 date9.;
    run;

    title 'Last Date in a Month';
    proc print data=AEENDDT;
        var AEENDMM AEENDYY AEENDDT AEENDDT2 AEENDDT3;
    run;

(SAS Output)

    Last Date in a Month

    Obs  AEENDMM  AEENDYY    AEENDDT   AEENDDT2   AEENDDT3
      1        3     2009  31MAR2009  31MAR2009  31MAR2009
      2        9     2010  30SEP2010  30SEP2010  30SEP2010
      3       12     2010  31DEC2010  31DEC2010  31DEC2010
      4        2     2008  29FEB2008  29FEB2008  29FEB2008
      5        2     2010  28FEB2010  28FEB2010  28FEB2010
      6        1     2009  31JAN2009  31JAN2009  31JAN2009
      7        2     2100  28FEB2100  28FEB2100  28FEB2100
      8        2     2000  29FEB2000  29FEB2000  29FEB2000

Aside: SAS programmers are blessed with a fourth-generation language rich in functions. As Mr. Art Carpenter kindly alerted the author, the INTNX function can serve the same purpose, as seen in the derivation of AEENDDT3.

Learning Tip: Think backward. Think forward. Think 360 degrees. A more creative approach can sometimes save time and effort!
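One way to gain confidence in a one-line derivation of this kind is to compare it with INTNX for every month of several years, including century years. The sketch below is an illustrative check rather than part of the original program. It writes the next month as MOD(MM, 12) + 1 so that November maps to month 12 and December wraps to January of the following year. The names YY, MM, LAST_DAY, CHECK and N_MISMATCH are assumptions made for this example.

```sas
data _null_;
   n_mismatch = 0;
   do yy = 2000, 2008, 2010, 2100;
      do mm = 1 to 12;
         /* First day of the next month, minus 1 */
         last_day = mdy(mod(mm, 12) + 1, 1, yy + (mm = 12)) - 1;
         /* Reference value: INTNX with 'e' (end-of-interval) alignment */
         check = intnx('month', mdy(mm, 1, yy), 0, 'e');
         if last_day ne check then do;
            n_mismatch = n_mismatch + 1;
            put 'Mismatch: ' mm= yy= last_day= date9. check= date9.;
         end;
      end;
   end;
   put n_mismatch=;
run;
```

If the two derivations agree, no mismatch lines are written and the final count is zero.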


THINK VERTICALLY. THINK HORIZONTALLY.

Depending on the nature of the problem, sometimes a vertical data arrangement has advantages over a horizontal one and vice-versa. Suppose you are given 12 variables, X1, X2, …, X12. One can easily use MAX, MIN and ORDINAL functions to find the highest, lowest and the kth smallest value among these 12 variables, respectively.

However, if the objective is to find any identical values among these 12 variables, there are 66 pairs of variables to be considered if one uses the horizontal approach, let alone groups of 3, 4, 5, …, 12. A vertical representation of the data may be the easier approach to apply.

Code to Find Minimum, Maximum and the K-th Smallest Values

    data A;
        input X1-X12;
        XMAX=max(of X1-X12);         *- Find the maximum value among X1-X12. -*;
        XMAX2=max(of X:);            *- Find the maximum value among variables -*;
                                     *- whose names begin with X. -*;
        XMIN=min(of X1-X12);         *- Find the minimum value among X1-X12. -*;
        X_3th=ordinal(3, of X1-X12); *- Find the third smallest value. -*;
        datalines;
    29 38 74 89 16 29 29 7 123 78 56 77
    ;
    run;

    *--- Convert data to a vertical data representation. ---*;
    proc transpose data=A out=B (rename=(_name_=XVAR col1=Y));
        var x1-x12;
    run;

    proc sort data=B;
        by Y;
    run;

    *--- Keep only values of Y that occur more than once. ---*;
    data DUP_B;
        set B;
        by Y;
        if not (first.Y and last.Y);
    run;

    proc print data=A;
        title 'Maximum, Minimum and the Third Smallest Value of X';
        var xmax xmax2 xmin x_3th;
    run;

    proc print data=DUP_B;
        title 'Observations with Duplication in Y Value';
    run;

(SAS Output)

    Maximum, Minimum and the Third Smallest Value of X

    Obs  XMAX  XMAX2  XMIN  X_3TH
      1   123    123     7     29

    Observations with Duplication in Y Value

    Obs  XVAR   Y
      1  X1    29
      2  X6    29
      3  X7    29

Note that as technology advances, the choice between vertical and horizontal approaches may change. Before the ORDINAL function was introduced, it was easier to find the kth smallest value using a vertical representation of the data!

Aside: A lot of times in clinical studies, repeated values are represented by variables just like X1, X2, …, Xn. The last variable Xn changes as more data are collected. The use of colon (:) is very helpful here. X: means any variables whose names begin with X. In this example, the author uses XMAX2=max (of X: ); to find the maximum value among the values for variables with names beginning with an X. Intrigued about the usage of colon in SAS? Please refer to Luo (2001) for a comprehensive coverage on the use of colon in SAS.

For yet another example for vertical vs. horizontal approaches, please consult Cheng (2006) for a more creative vertical approach to compute the number of distinct days among overlapping intervals.

Learning Tip: Think backward. Think forward. Think from different perspectives. The best approach may change as technology changes.
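Once the data are vertical, the duplicate check can also be phrased with the GROUP BY and HAVING clauses of PROC SQL. The following is a minimal sketch of that alternative, not the author's code. The data set names WIDE and TALL and the alias N_VARS are hypothetical, and the twelve values are the first twelve from the example above.

```sas
data wide;
   input X1-X12;
   datalines;
29 38 74 89 16 29 29 7 123 78 56 77
;
run;

/* One row per variable: XVAR holds the variable name, Y its value */
proc transpose data=wide out=tall (rename=(_name_=XVAR col1=Y));
   var X1-X12;
run;

/* Values shared by more than one variable */
proc sql;
   select Y, count(*) as n_vars
   from tall
   group by Y
   having count(*) > 1;
quit;
```

With these values, 29 is the only value that appears in more than one variable, matching the DATA step result.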


DIVIDE AND CONQUER

Divide and conquer is a classic technique most programmers are familiar with. A program can be divided into several sections, each having a purpose of its own; combined, these sections achieve the objective of the program. Over the years, the author has also found the divide-and-conquer technique extremely helpful for debugging a program or a macro. One may submit only a portion of the program and check the intermediate results. For instance, to submit the top portion of a program in interactive SAS, one can simply highlight that section and submit it, or use

    SUBTOP line-number

on the command line of the Editor window to submit the program up to the line number specified. To do so in batch, one can insert an

    ENDSAS;

statement so that the program executes only up to that statement.

To exclude a section from execution, one can comment it out by using

    /* code to be commented out */

or put it in a macro that is never invoked, such as

    %macro hide;
        code to be commented out;
    %mend hide;

However, when using /* */ to comment out a section, beware of nested comments, such as

    /* /* */ */

This is incorrect in SAS: the comment ends at the first */, so the second */ is left behind as stray code. Creating a macro to hide the code may be preferable if there are many /* */ comments within it. To debug a macro, the statement

    options symbolgen mprint mlogic;

instructs SAS to show in the log how macro variables resolve, the code the macro generates, and each step of macro execution. One may still use the divide-and-conquer technique to debug a macro, and one can still create a macro to hide code within a macro. Macro %hide can be defined repeatedly to comment out different sets of code. Just make sure not to give your macro the name of an existing macro and inadvertently overwrite it!

An even more convenient technique to debug a macro is to use

%return;

which instructs SAS to terminate the macro execution normally. One can move %return to different locations of the macro so as to see the result of the execution up to a certain point where %return has been encountered.

Learning Tip: Classic techniques such as DIVIDE and CONQUER can be very useful. One can be creative on how to apply classic techniques. %return, for instance, may not be a tool one thinks of to be used for debugging.
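The %return technique might be sketched as follows. This is an illustrative example only: the macro name %DEMO, its STOP_AFTER parameter and the data set WORK.STEP1 are hypothetical, and SASHELP.CLASS is assumed to be available.

```sas
%macro demo(stop_after=);
   /* Step 1: build an intermediate data set */
   data work.step1;
      set sashelp.class;
      ratio = weight / height;
   run;

   /* Stop here when asked, so that WORK.STEP1 can be inspected */
   %if "&stop_after" = "1" %then %return;

   /* Step 2: summarize the intermediate data set */
   proc means data=work.step1 mean;
      class sex;
      var ratio;
   run;
%mend demo;

%demo(stop_after=1)   /* runs Step 1 only */
%demo()               /* runs both steps  */
```

Making the stopping point a parameter is one design choice; moving a bare %return; statement from place to place, as described above, works just as well for a quick check.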

COVER UP

In SAS graphics, whatever is drawn later covers up what was drawn earlier at the same location of the figure. To illustrate, consider the following example of a regression line with its 95% confidence interval of mean and 95% prediction interval. Suppose the goal is to draw the 95% confidence interval of mean in BLUE, the 95% prediction interval in GREEN and the regression line in RED.

In the following example, the 95% confidence interval and the regression line are first drawn in blue in Figure 1A. In Figure 1B, the blue regression line is covered up by a green regression line, which is drawn together with the 95% prediction interval. Finally, in Figure 1C, a single red regression line is drawn to cover up the green regression line. As in the DATA step, in SAS graphics the latest prevails.


Figure 1A shows the regression line and its 95% confidence interval of mean. All lines, including the regression line, are displayed in BLUE. This is a result of the SYMBOL3 statement with INTERPOL=RLCLM95. SAS fails to clearly distinguish the regression line from the 95% confidence interval of mean. In fact, based on the legend, the regression line is part of the 95% confidence interval!

Figure 1B shows the regression line, its 95% confidence interval of mean and its 95% prediction interval. This is the result of the SYMBOL3 and SYMBOL4 statements with INTERPOL=RLCLM95 and INTERPOL=RLCLI95, respectively. Note that when the prediction interval is generated, another regression line is produced, this time in GREEN. The GREEN regression line has covered the original BLUE regression line. SAS still fails to identify the regression line! Based on the legend of Figure 1B, the regression line is now part of the 95% prediction interval!

Figure 1C shows the regression line in RED. SAS has now correctly identified it in the legend. The regression line is based on the SYMBOL5 statement with the INTERPOL=RL and COLOR=RED options. Note that the latest RED regression line has covered the GREEN regression line of Figure 1B, which in turn covered the BLUE regression line!


Code for Generating Regression Lines, 95% Confidence Intervals of Mean and 95% Prediction Intervals in Figures 1A, 1B and 1C

    goptions reset=all goutmode=replace gunit=percent ftext=swiss hsize=9 htext=2.5
             xmax=10 in ymax=7 in lfactor=2 border;
    options orientation=landscape nodate nonumber;
    ods listing;
    ods tagsets.rtf style=analysis file="c:\WUSS_2011_Coder\Output\g_reg_ci_pred.rtf";
    ods graphics on;

    /* Symbol definitions for the two groups in the scatter plot */
    symbol1 interpol=none color=blue value=dot;
    symbol2 interpol=none color=green value=dot;

    /* Symbol definitions for the regression lines, confidence intervals and */
    /* prediction interval.                                                  */
    /* SYMBOL3: a broken BLUE regression line and 95% confidence interval    */
    /*          of mean.                                                     */
    symbol3 interpol=RLCLM95 color=blue value=none line=2;
    /* SYMBOL4: a broken GREEN regression line and 95% prediction interval.  */
    symbol4 interpol=RLCLI95 color=green value=none line=2;
    /* SYMBOL5: a solid RED regression line.                                 */
    symbol5 interpol=RL color=red value=none line=1;

    axis1 order=(40 to 80 by 10);
    legend1 label=('Sex');
    /* LEGEND2 identifies the lines. It is used in all 3 figures. */
    legend2 label=none value=('95% Confidence Interval' '95% Prediction Interval'
                              'Regression Line');

    *--- Create Figure 1A. ---*;
    title j=c h=4 pct 'Figure 1A. Regression of Height based on Age';
    proc gplot data=SASHELP.CLASS;
        plot HEIGHT*AGE=SEX / legend=legend1 vaxis=axis1;
        plot2 HEIGHT*AGE / overlay vaxis=axis1 noaxis legend=legend2;
    run;

    *--- Create Figure 1B. ---*;
    title j=c h=4 pct 'Figure 1B. Regression of Height based on Age';
    proc gplot data=SASHELP.CLASS;
        plot HEIGHT*AGE=SEX / legend=legend1 vaxis=axis1;
        plot2 HEIGHT*AGE HEIGHT*AGE / overlay vaxis=axis1 noaxis legend=legend2;
    run;

    *--- Create Figure 1C. ---*;
    title j=c h=4 pct 'Figure 1C. Regression of Height based on Age';
    proc gplot data=SASHELP.CLASS;
        plot HEIGHT*AGE=SEX / legend=legend1 vaxis=axis1;
        plot2 HEIGHT*AGE HEIGHT*AGE HEIGHT*AGE / overlay vaxis=axis1 noaxis
              legend=legend2;
    run;

    ods listing; ods graphics off; ods tagsets.rtf close;

Learning Tip: Learn from your daily activities. Perfume was first used to cover bodily smells. Foundation and concealers are used to cover facial blemishes. So why not apply the same cover-up technique to SAS programming?

WHITE OUT

When the text color is the same as the background color, the text disappears! Writing white text on a white background makes the text invisible. This ‘white out’ technique is often used in SAS graphics to get rid of undesirable SAS output! Consider the following example.


Figure 2A: This is an example of a figure with incorrectly spaced intervals in the horizontal axis produced by the GPLOT procedure. For example, the interval from 24 to 37 has the same spacing as the smaller interval from 18 to 24!

Figure 2B: After applying the ‘White Out’ technique, the tick marks and the values have vanished!

Figure 2C: A correctly spaced horizontal axis has been generated by placing the tick marks and their corresponding values on the horizontal axis. This is accomplished by means of the annotation feature in the GPLOT procedure.


Code for Figure 2A (A Figure with Incorrectly Spaced Horizontal Axis), Figure 2B (A Figure with No Tick Marks and Values in the Horizontal Axis) and Figure 2C (A Figure with Correctly Spaced Horizontal Axis)

    data DRUG;
        *- The & modifier lets TRT contain a single blank. Each TRT value -*;
        *- must be followed by at least two blanks.                       -*;
        input TRT & $6. WEEK : 2. SCORE : 3. @@;
        label TRT='Treatment: '
              WEEK='Week'
              SCORE='Mean Score';
        datalines;
    Drug A  0 5   Drug A  12 16  Drug A  18 23  Drug A  24 36  Drug A  37 44
    Drug A  50 60  Drug A  57 70  Drug A  68 80  Drug A  80 86
    Drug B  0 4   Drug B  12 8   Drug B  18 12  Drug B  24 20  Drug B  37 33
    Drug B  50 48  Drug B  57 65  Drug B  68 75  Drug B  80 82
    ;
    run;

    *--- Create Annotation Dataset. ---*;
    proc sort data=DRUG out=UNIQUE_WK nodupkey;
        by WEEK;
    run;

    data ANNO_2C;
        length FUNCTION $5.;
        retain XSYS '2' YSYS '3' WHEN 'a' SIZE 1.5 STYLE 'swiss';
        set UNIQUE_WK;
        FUNCTION='move';  COLOR='black'; X=week; Y=17;   output;
        FUNCTION='draw';  COLOR='black'; X=week; Y=15;   output;
        FUNCTION='label'; COLOR='black'; X=week; Y=13.5; TEXT=strip(put(WEEK, 3.)); output;
    run;

    goptions border;
    ods graphics on; ods tagsets.rtf file="c:\wuss_2011_coder\output\white_out2abc.rtf";

    symbol1 interpol=j line=1 width=3 ci=Blue co=Blue cv=Blue value=dot height=3 pct;
    symbol2 interpol=j line=1 width=3 ci=Red cv=Red value=dot height=3 pct;

    axis2 label=(justify=c font=swissb color=black h=3 pct 'Week') offset=(0, 3)
          major=(c=black height=2 pct) minor=none width=3 value=(h=3 pct color=black) width=1
          order=(0, 12, 18, 24, 37, 50, 57, 68, 80);
    axis1 label=(justify=c font=swissb angle=90 h=3 pct 'Mean Score')
          major=(h=1.2 pct) minor=(h=0.7 pct n=1) order=(0 to 100 by 10) width=1
          offset=(0, 3) value=(h=3 pct color=Black);
    legend1 position=(bottom center outside) label=(h=3 pct font=swiss 'Treatment: ')
            value=(tick=1 h=3 pct font=swiss 'Drug A' tick=2 h=3 pct font=swiss 'Drug B');

    *--- Create Figure 2A. ---*;
    title h=4 pct font=swissb 'Figure 2A. Mean Score Over Time by Treatment';
    proc gplot data=DRUG;
        plot SCORE*WEEK=TRT / frame haxis=axis2 vaxis=axis1 legend=legend1;
    run;

    *--- Create Figure 2B. ---*;
    *--- White-out technique: white out the major tick marks and their ---*;
    *--- corresponding values.                                         ---*;
    axis2 minor=none width=3 major=(c=white) value=(color=white) label=(justify=c
          font=swissb color=black h=3 pct 'Week') width=1;
    title h=4 pct font=swissb 'Figure 2B. Mean Score Over Time by Treatment';
    proc gplot data=DRUG;
        plot SCORE*WEEK=TRT / frame haxis=axis2 vaxis=axis1 legend=legend1;
    run;

    *--- Create Figure 2C. ---*;
    title h=4 pct font=swissb 'Figure 2C. Mean Score Over Time by Treatment';
    proc gplot data=drug anno=ANNO_2C;
        plot SCORE*WEEK=TRT / frame haxis=axis2 vaxis=axis1 legend=legend1;
    run;

    ods graphics off; ods tagsets.rtf close;


Specify the values of the major tick marks for the unevenly spaced intervals in AXIS2. Apply it to Figure 2A.

Apply the white-out technique to write white major tick marks and white values on a white background, as specified in the AXIS2 statement.

Assign AXIS2 to the horizontal axis for Figures 2B and 2C. This is a redefined AXIS2 with the white-out technique applied. It is different from that for Figure 2A.

Apply annotation to Figure 2C by specifying the annotation dataset ANNO_2C in the ANNO= option of the GPLOT procedure.

Learning Tip: Learn from your observations. Painters put white paint on a wall before applying the final colors. The same idea can be applied to your SAS programming!

MORE TIPS TO ENHANCE YOUR CREATIVITY

“Imagination is the highest kite you can fly.” That is a remark from the renowned actress and Academy Honorary Award recipient Lauren Bacall. As a SAS programmer, how can you make your kite fly higher? Here are some more tips to further enhance your creativity. Some apply solely to SAS programmers; others to life in general.

Reserve at least 15 minutes of quiet time every day. Relax. Listen to yourself and your thoughts. Think of challenging problems you have encountered. Are there ways they can be resolved?

Create a sanctuary where you can indulge in deep thought to fuel your creative process.

Ask yourself ‘What if?’ What if I take a vertical approach to the problem? Will that make things easier? Think from different perspectives.

Communicate with fellow programmers. Share your thoughts and techniques. Learn from each other. Network.

Fill up your reservoir of knowledge. Read papers and code from fellow SAS programmers. For easy daily reading, please consider ‘Tip of the Day’ on sasCommunity.org, Phil Mason (2006) and Ron Cody (2006). Attend SAS conferences, training and web seminars. Your knowledge is the raw material that fuels your creative thoughts in programming.

Keep a journal. Ideas and inspiration may come on the spur of the moment. Jot them down. Capture your thoughts and ideas. Also capture SAS techniques and code that you may find useful for future reference.

Have a brainstorming session. Let your imagination run wild. Don’t worry about the scores! Don’t let failure hinder you!

Sometimes creativity takes time. You may undergo a gestation period before your creative ideas come to fruition. So be patient and don’t be discouraged. Remember, ‘Rome wasn’t built in a day’!

CONCLUSION

In this paper, the author has provided 7 gems for solving problems that a typical SAS programmer may encounter. These techniques, in her opinion, are Columbus’ Eggs because they seem simple and obvious once they have been explained. Learning tips are provided at the end of each technique. While these techniques are helpful in themselves, she hopes that these examples will stimulate readers to think creatively and innovatively so that they can hunt for their own Columbus’ Eggs!

REFERENCES

Carpenter, Arthur L. (1999). Annotate: Simply the Basics, Second Edition. SAS Institute, Cary, NC.

Carpenter, Arthur L. (2004). Carpenter’s Complete Guide to the SAS® Macro Language, Second Edition. SAS Institute, Cary, NC.

Cheng, Alice (2006). “Duration Calculation from a Clinical Programmer’s Perspective”, SUGI 31 Proceedings. http://www2.sas.com/proceedings/sugi31/048-31.pdf


Linguaspectrum (2010). Columbus’ Egg. Learn English. Interesting. http://www.youtube.com/watch?v=MKCi8cW9_3o&feature=mfu_in_order&list=UL

Luo, Haiping (2001). “That Mysterious Colon (:)”. SUGI 26 Proceedings. http://www2.sas.com/proceedings/sugi26/p073-26.pdf

McMeekin, Gail (2000). The 12 Secrets of Highly Creative Women: A Portable Mentor. MJF Books.

SAS Institute Inc. (2005). “Sample 24938: Display unevenly spaced horizontal axis tick marks.” SAS Institute, Cary, NC. http://support.sas.com/kb/24/938.html

SAS Institute Inc. (2010). SAS/GRAPH® 9.2, Second Edition. SAS Institute, Cary, NC. http://support.sas.com/documentation/cdl/en/graphref/63022/PDF/default/graphref.pdf

SAS Institute Inc. (2011). SAS 9.2 Language Reference Dictionary, Fourth Edition. SAS Institute, Cary, NC. http://support.sas.com/documentation/cdl/en/lrdict/64316/PDF/default/lrdict.pdf

ACKNOWLEDGMENTS

The author would like to thank Ms. Kathryn McLawhorn and Ms. Susan Morrison of SAS Technical Support for their respective assistance with the examples in the ‘FIND THE NUMBER OF WORDS IN A CHARACTER STRING’ and ‘COVER UP’ sections. She is grateful to Mr. Peng Zhang, independent SAS consultant, for alerting her to the ORDINAL function used in the ‘THINK VERTICALLY. THINK HORIZONTALLY.’ section. Kudos to Mr. Art Carpenter, Ms. Mary McCracken and Mr. Ethan Miller for their valuable feedback, and to Mr. Carpenter, whose books have served as excellent references. Any errors and oversights in this paper, however, are the sole responsibility of the author.

RECOMMENDED READING

Bergreen, Laurence (2011). Columbus: The Four Voyages. Viking Adult.

Cody, Ron. (2006). SAS Functions by Example. SAS Institute, Cary, NC.

Mason, Phil. (2006). In the know … SAS Tips and Techniques From Around the Globe, Second Edition. SAS Institute, Cary, NC.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Alice Monica Cheng E-mail: [email protected]

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are trademarks of their respective companies.