8/3/2019 Data Step Programming
Technical Team
SAS Basics Training
Confidential & Proprietary Copyright 2010 The Nielsen Company AC BA HUB - Technical Team
Reading a SAS Data Set
To create a SAS data set using another SAS data set as input, you must use:
- a DATA statement to start a DATA step and name the SAS data set being
  created
- a SET statement to identify the SAS data set being read.
Concatenating SAS Data Sets
General form of a DATA step:
DATA output-SAS-data-set;
SET input-data-set1 input-data-set2;
additional SAS statements;
RUN;
The SET statement in a DATA step reads observations from one or more
data sets.
Any number of data sets can be listed in the SET statement.
The observations from the first data set in the SET statement appear first
in the new data set, the observations from the second data set follow
those from the first, and so on.
Concatenating SAS Data Sets
Example-1
Data store3;
set sundari.Store2;
Run;
Note: This simply copies the observations from the Store2 data set.
Example-2
Data store3;
set sundari.Store2 sundari.store1;
Run;
Drop and Keep Statements
The DROP statement specifies the names of the variables to omit from the
output data set(s).
Example:
Data store4;
set sundari.Store2 sundari.store1;
drop state Top1_Brand Top2_brand Dollaramt;
Run;
DROP variable(s);
Drop and Keep Statements
The KEEP statement specifies the names of the variables to write to the
output data set(s).
Example:
Data store5;
set sundari.Store2 sundari.store1;
Keep Store_Id City Store_size Number_Items_Sold;
Run;
Keep variable(s);
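The two statements can be compared side by side. The following is a minimal sketch, assuming a small hypothetical data set work.stores whose variable names echo the examples above; the data values are invented for illustration.

```sas
/* Hypothetical input data; values are invented for illustration. */
data work.stores;
    input Store_Id City $ State $ Dollaramt;
    datalines;
1 Austin TX 25
2 Fresno CA 40
;
run;

/* DROP: every variable except State and Dollaramt is written out. */
data work.dropped;
    set work.stores;
    drop State Dollaramt;
run;

/* KEEP: only Store_Id and City are written out. */
data work.kept;
    set work.stores;
    keep Store_Id City;
run;
```

Both output data sets end up with the same two variables, Store_Id and City. Which statement to use is mostly a matter of which list is shorter to write.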
Drop and Keep Options
Alternatives to the DROP and KEEP statements are the DROP= and
KEEP= data-set options placed in the DATA statement:
Example:
Data store6 (Drop= state Top1_Brand Top2_brand Dollaramt);
set sundari.Store2 sundari.store1;
Run;
DATA output-data-set (DROP = variable-list);
Drop and Keep Options
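These options can also be placed in the SET statement to control which
variables are read from the input data set. A data set option applies only
to the data set it immediately follows (sundari.store1 in this example):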
Example:
Data store7;
set sundari.Store2 sundari.store1(Drop= state Top1_Brand
Top2_brand Dollaramt) ;
Run;
Set input-data-set (DROP = variable-list);
Drop and Keep Options
Alternatives to the DROP and KEEP statements are the DROP= and
KEEP= data-set options placed in the DATA statement:
Example:
Data store6 (Keep=Store_Id City Store_size Number_Items_Sold);
set sundari.Store2 sundari.store1;
Run;
DATA output-data-set (Keep = variable-list);
Drop and Keep Options
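These options can also be placed in the SET statement to control which
variables are read from the input data set. A data set option applies only
to the data set it immediately follows (sundari.store1 in this example):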
Example:
Data store7;
set sundari.Store2 sundari.store1(Keep=Store_Id City Store_size
Number_Items_Sold);
Run;
Set input-data-set (Keep = variable-list);
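The difference between placing KEEP= on the SET statement and on the DATA statement can be sketched as follows. The data sets work.a and work.b and the variable Big are hypothetical names introduced only for this illustration.

```sas
/* Hypothetical input data sets with identical variables. */
data work.a;
    input Store_Id City $ Dollaramt;
    datalines;
1 Austin 25
;
run;

data work.b;
    input Store_Id City $ Dollaramt;
    datalines;
2 Fresno 40
;
run;

/* KEEP= on the SET statement: specified for each input separately,
   so Dollaramt is never read into the DATA step. */
data work.read_side;
    set work.a(keep=Store_Id City) work.b(keep=Store_Id City);
run;

/* KEEP= on the DATA statement: all variables are read and remain
   available to programming statements; only the kept ones are written. */
data work.write_side(keep=Store_Id City Big);
    set work.a work.b;
    Big = (Dollaramt > 30);   /* Dollaramt is usable here but not written */
run;
```

Filtering on the SET statement avoids reading variables that are never needed; filtering on the DATA statement is the choice when a variable is needed for a calculation but not in the output.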
The Rename Data-Set Option
The RENAME= data set option changes the name of a variable.
General form of the RENAME= data set option:
data-set (RENAME = (old-name1 = new-name1
old-name2 = new-name2 ...
old-nameN = new-nameN))
The Rename Data-Set Option
If the RENAME = option is associated with an input data-set in the SET
statement, the action applies to the data set that is being read.
Example:
Data store8;
set sundari.Store3 sundari.store4(rename=( City = CityName
Dollaramt = Amount)) ;
Run;
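A common reason to rename on input is to line up variable names before concatenating. The sketch below assumes two hypothetical data sets, work.s1 and work.s2, that hold the same information under different variable names.

```sas
/* Hypothetical data: the same information under different variable names. */
data work.s1;
    input Store_Id CityName $ Amount;
    datalines;
1 Austin 25
;
run;

data work.s2;
    input Store_Id City $ Dollaramt;
    datalines;
2 Fresno 40
;
run;

/* RENAME= on the second input aligns its variable names with the first,
   so the concatenated data set has one CityName and one Amount variable. */
data work.combined;
    set work.s1
        work.s2(rename=(City=CityName Dollaramt=Amount));
run;
```

Without the RENAME= option, the output would carry both City and CityName (and both Dollaramt and Amount), each with missing values for the observations that came from the other data set.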
Interleaving SAS Data Sets
Use the SET statement with a BY statement in a DATA step to interleave
SAS data sets.
General form of a DATA step interleave:
DATA output-SAS-data-set;
SET input-data-set1 input-data-set2;
BY variables;
<additional SAS statements>;
RUN;
Note: Each input data set must first be sorted by the BY variables.
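The sort-then-interleave sequence can be sketched as below. The data sets work.d1 and work.d2 and their values are hypothetical.

```sas
/* Hypothetical unsorted inputs. */
data work.d1;
    input Store_Id City $;
    datalines;
3 Tacoma
1 Austin
;
run;

data work.d2;
    input Store_Id City $;
    datalines;
4 Dallas
2 Fresno
;
run;

/* Each input must be sorted by the BY variable before interleaving. */
proc sort data=work.d1;
    by Store_Id;
run;

proc sort data=work.d2;
    by Store_Id;
run;

/* Interleave: observations are written in Store_Id order (1, 2, 3, 4)
   instead of all of d1 followed by all of d2. */
data work.interleaved;
    set work.d1 work.d2;
    by Store_Id;
run;
```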
Appending and Concatenating
Appending and concatenating both involve combining SAS data sets,
one after the other, into a single SAS data set.
Appending adds the observations in the second data set directly to
the end of the original data set.
Concatenating copies all the observations from the first data set and
then copies all observations from one or more successive data sets
into a new data set.
Appending
General form of the APPEND procedure:
PROC APPEND BASE = data-set
DATA = SAS-data-set;
RUN;
BASE= names the data set to which the observations are added.
DATA= names the data set containing the observations that are added to
the base data set.
Note:
Only two data sets can be used at a time in one step.
The observations in the base data set are not read.
Appending
Example:
proc append base = store5 data = store4(drop= city store_size
Number_items_sold);
where Dollaramt le 30;
run;
Appending and Concatenating

DataSet 1
Name       Gender  Age
Romina     F       23
Flavia     F       24
Anabella   F       23
Cristian   M       25
Francisco  M       24

DataSet 2
Name       Gender  Age
Eugenia    F       24
Manuel     M       24

Concatenating DataSet1 and DataSet2
Name       Gender  Age
Romina     F       23
Flavia     F       24
Anabella   F       23
Cristian   M       25
Francisco  M       24
Eugenia    F       24
Manuel     M       24
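The tables above can be reproduced with a short program. This is a sketch only: the data set names work.dataset1, work.dataset2, work.concatenated, and work.base_copy are assumptions, and the LENGTH statement is added so that the nine-character name Francisco is not truncated to the default eight characters.

```sas
/* DataSet 1 from the table above. */
data work.dataset1;
    length Name $ 12;
    input Name $ Gender $ Age;
    datalines;
Romina F 23
Flavia F 24
Anabella F 23
Cristian M 25
Francisco M 24
;
run;

/* DataSet 2 from the table above. */
data work.dataset2;
    length Name $ 12;
    input Name $ Gender $ Age;
    datalines;
Eugenia F 24
Manuel M 24
;
run;

/* Concatenating: builds a new data set; both inputs are read in full. */
data work.concatenated;
    set work.dataset1 work.dataset2;
run;

/* Appending: adds dataset2's observations to the end of the base data set.
   A copy is used as the base so that dataset1 itself is left unchanged. */
data work.base_copy;
    set work.dataset1;
run;

proc append base=work.base_copy data=work.dataset2;
run;
```

Both work.concatenated and work.base_copy contain the same seven observations. The difference is in how they are built: the DATA step reads every observation from both inputs, while PROC APPEND reads only the data set named in DATA=.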
Merging
Use the MERGE statement in a DATA step to join corresponding
observations from two or more data sets.
A BY statement after the MERGE statement performs a match-merge.
General form of a DATA step match-merge:
DATA data-set;
MERGE data-set(s);
BY variable(s);
<other SAS statements>;
RUN;
Merging - Example
proc sort data = storem1;
by store_id;
run;
proc sort data = storem2;
by store_id;
run;
proc sort data = storem3;
by store_id;
run;
data storemerge;
merge storem1 storem2 storem3;
by store_Id;
run;
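To see what a match-merge does with unmatched BY values, the following self-contained sketch uses two hypothetical data sets, work.sizes and work.sales, with invented values.

```sas
/* Hypothetical inputs: store 2 has no sales row, store 3 has no size row. */
data work.sizes;
    input Store_Id Store_size;
    datalines;
2 1200
1 800
;
run;

data work.sales;
    input Store_Id Dollaramt;
    datalines;
1 25
3 40
;
run;

/* A match-merge requires each input to be sorted by the BY variable. */
proc sort data=work.sizes;
    by Store_Id;
run;

proc sort data=work.sales;
    by Store_Id;
run;

/* One observation per Store_Id value found in either input.
   Store 1 has both Store_size and Dollaramt; store 2 has a missing
   Dollaramt; store 3 has a missing Store_size. */
data work.merged;
    merge work.sizes work.sales;
    by Store_Id;
run;
```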
The OBS= Data set option
The OBS= data set option specifies an ending point for processing an input
data set.
This option specifies the number of the last observation to process, not
how many observations should be processed.
DATA selectstore;
SET store(OBS= 25);
Run;
The OBS=25 option in the SET statement stops reading after observation
25 in the data set.
SAS-data-set (OBS=n)
The FIRSTOBS= Data set option
The FIRSTOBS= data set option specifies a starting point for processing an input
data set.
FIRSTOBS= and OBS= are often used together to define a range of observations
to be processed.
SAS-data-set (FIRSTOBS=n);
The FIRSTOBS= Data set option
DATA selectstore15;
SET store (FIRSTOBS= 11 OBS= 25);
Run;
The FIRSTOBS= and OBS= data set options in the SET statement read
15 observations from the store data set. Processing begins with observation
11 and ends after observation 25.
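The count of 15 follows from OBS minus FIRSTOBS plus one: 25 - 11 + 1 = 15. A minimal sketch that can be run to check this, assuming a hypothetical 30-observation data set built with a DO loop (the variable Obs_No is an invented name):

```sas
/* Hypothetical data set with 30 observations numbered 1 to 30. */
data work.store;
    do Obs_No = 1 to 30;
        output;
    end;
run;

/* Reads observations 11 through 25: 25 - 11 + 1 = 15 observations. */
data work.selectstore15;
    set work.store(firstobs=11 obs=25);
run;

/* Listing the result shows Obs_No values 11 through 25. */
proc print data=work.selectstore15;
run;
```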
THANK YOU!