28743095 SAS Interview Questions

  • 8/6/2019 28743095 SAS Interview Questions

    SAS Interview Questions: Base SAS

    Very Basic:

    What SAS statements would you code to read an external raw data file into a DATA step?

    The INFILE statement, which identifies the file, together with the INPUT statement, which reads the values.

    How do you read in the variables that you need?

    Use the INPUT statement with pointer controls, such as the column pointer @5, the line pointer /, or a column range such as 12-17.
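
    As a minimal sketch of these pointer controls, the step below reads two raw lines per observation. The data set name, variable names, and data lines are all made up for illustration, and in-stream DATALINES stand in for an external file.

```sas
/* Hypothetical example: column input, the @ column pointer, and the / line pointer. */
data work.staff;
    infile datalines;          /* an external file would use INFILE 'path' instead */
    input name $ 1-8           /* column input: columns 1 through 8                */
          @10 dept $3.         /* @10 moves the pointer to column 10               */
          / salary 1-6;        /* the slash moves to the next raw record           */
    datalines;
Smith    ACC
52000
Jones    MKT
61000
;
run;

proc print data=work.staff;
run;
```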

    Are you familiar with special input delimiters? How are they used?

    DLM= and DSD are the INFILE statement options that I've used. DLM= names the delimiter character. DSD is the usual choice for comma-separated values (CSV) files: it makes the comma the default delimiter and treats two delimiters in a row as a MISSING value.

    DSD also ignores delimiters enclosed in quotation marks and removes the quotation marks from the value.
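
    The DSD behaviour described above can be tried with a short sketch like the following. The variable names and records are hypothetical; the first record has a comma inside quotation marks and the second has two commas in a row.

```sas
/* Hypothetical CSV-style input read with the DSD option. */
data work.contacts;
    infile datalines dsd;      /* DSD: comma delimiter, quoted commas kept in the value */
    length name $20 city $20;
    input name $ city $ amount;
    datalines;
"Smith, John",Boston,100
Lee,,250
;
run;

proc print data=work.contacts;
run;
```

    In the second record the two consecutive commas should leave city missing rather than shifting 250 into it.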

    If reading a variable length file with fixed input, how would you prevent SAS

    from reading the next record if the last variable didn't have a value?

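    By using the MISSOVER option in the INFILE statement, which sets the remaining variables to missing instead of reading the next record. If some data lines are shorter than the INPUT statement expects, use the TRUNCOVER option instead, which also keeps a value that is cut short by the end of the record.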
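
    One way to see the difference between the two options is to read the same short records twice. This is only a sketch: the fileref raw, the data set names, and the records are assumptions, and a temporary file is used because in-stream data lines are padded and would hide the effect.

```sas
filename raw temp;             /* temporary file, so no real path is assumed */

/* Write three variable-length records. */
data _null_;
    file raw;
    put 'Ann 12345';
    put 'Bob 12';
    put 'Cy';
run;

/* MISSOVER: a field that runs past the end of the record is set to missing. */
data work.miss;
    infile raw missover;
    input name $ 1-3 id 5-9;
run;

/* TRUNCOVER: a field that runs past the end of the record keeps what is there. */
data work.trunc;
    infile raw truncover;
    input name $ 1-3 id 5-9;
run;

proc print data=work.miss;  run;
proc print data=work.trunc; run;
```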

    What is the difference between an informat and a format? Name three informats

    or formats.

    Informats tell SAS how to read data. Formats tell SAS how to write (display) data.

    Many names can be used both as informats and as formats.

    Informats: COMMAw.d, DOLLARw.d, MMDDYYw., DATEw., TIMEw., PERCENTw.d

    Formats: WORDDATE18., WEEKDATEw.
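
    A small sketch of the distinction, with made-up variable names and values: the INPUT function applies informats to turn text into numeric values, and the FORMAT statement controls how those values are displayed.

```sas
/* Hypothetical values: informats read text, formats control display. */
data work.dates;
    raw_date = '09/15/2008';
    raw_amt  = '1,234.50';
    sale_date = input(raw_date, mmddyy10.);   /* informat: text to SAS date value  */
    amount    = input(raw_amt, comma10.);     /* informat: strips the comma        */
    format sale_date worddate18.              /* format: date written out in words */
           amount dollar10.2;                 /* format: dollar sign and 2 decimals */
run;

proc print data=work.dates;
run;
```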

    Name and describe three SAS functions that you have used, if any?

    LENGTH: returns the length of an argument, not counting trailing blanks (missing values have a length of 1).

    Ex: a='my cat'; x=LENGTH(a); Result: x=6

    SUBSTR: SUBSTR(arg,position,n) extracts a substring from an argument, starting at position, for n characters or until the end if n is omitted.

    Ex: a='(916)734-6241'; x=SUBSTR(a,2,3); Result: x='916'

    TRIM: removes trailing blanks from a character expression.

    Ex: a='my '; b='cat'; x=TRIM(a)||b; Result: x='mycat'

    SUM: returns the sum of the nonmissing values. Ex: x=SUM(3,5,1); Result: x=9

    INT: returns the integer portion of the argument.
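
    The functions above can be checked together in one DATA _NULL_ step that writes the results to the log. The variable names are arbitrary.

```sas
/* Write the result of each function to the SAS log. */
data _null_;
    a      = 'my cat';
    len    = length(a);                 /* 6: trailing blanks are not counted */
    phone  = '(916)734-6241';
    area   = substr(phone, 2, 3);       /* 916                                */
    first  = 'my   ';
    second = 'cat';
    joined = trim(first) || second;     /* mycat                              */
    total  = sum(3, 5, 1, .);           /* 9: the missing value is ignored    */
    whole  = int(7.9);                  /* 7                                  */
    put len= area= joined= total= whole=;
run;
```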

    How would you code the criteria to restrict the output to be produced?

    (source: http://studysas.blogspot.com/2008/09/sas-interview-questionsbase-sas.html)

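    Use the NOPRINT option to suppress a procedure's printed output. To restrict which observations are processed, use a WHERE statement or a subsetting IF statement.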

    What is the purpose of the trailing @ and the @@? How would you use them?

    A single trailing @ holds the current record for the next INPUT statement in the same iteration of the DATA step. A double trailing @@ holds the record across iterations of the DATA step, until the end of the line is reached.

    Double trailing @@: When you have multiple observations per line of raw data, use the double trailing at sign (@@) at the end of the INPUT statement. The line-hold specifier is like a stop sign telling SAS, "Stop, hold that line of raw data."

    Trailing @: By using @ without specifying a column, it is as if you are telling SAS, "Stay tuned for more information. Don't touch that dial." SAS will hold the line of data until it reaches either the end of the DATA step or an INPUT statement that does not end with a trailing @.
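
    Both line-hold specifiers can be sketched as follows. The data set names, variables, and records are invented for the illustration.

```sas
/* @@ : several observations sit on one raw line. */
data work.scores;
    input name $ score @@;     /* hold the line across DATA step iterations */
    datalines;
Ann 90 Bob 85 Cy 77
;
run;

/* @ : read one field, test it, then decide how to read the rest of the record. */
data work.sales;
    input type $ 1 @;          /* hold the record for the next INPUT statement */
    if type = 'S' then input amount 3-8;
    else delete;               /* records that are not type S are dropped      */
    datalines;
S 100
R 200
S 300
;
run;
```

    The first step should produce three observations from the single line, and the second should keep only the two type S records.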

    Under what circumstances would you code a SELECT construct instead of IF statements?

    When you have a long series of mutually exclusive conditions and the comparison is numeric, using a SELECT group is slightly more efficient than using IF-THEN or IF-THEN-ELSE statements because CPU time is reduced.

    SELECT GROUP:

    SELECT: begins a SELECT group.

    WHEN: identifies SAS statements that are executed when a particular condition is true.

    OTHERWISE (optional): specifies a statement to be executed if no WHEN condition is met.

    END: ends a SELECT group.
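
    The four parts of a SELECT group fit together as in this sketch, which maps a numeric code to a label. The variable names and codes are hypothetical.

```sas
/* Hypothetical numeric codes mapped to labels with a SELECT group. */
data work.depts;
    input dept_code;
    length dept $8;
    select (dept_code);
        when (1)    dept = 'Sales';
        when (2, 3) dept = 'Support';   /* a WHEN can list several values */
        otherwise   dept = 'Other';     /* runs when no WHEN matches      */
    end;
    datalines;
1
3
7
;
run;
```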

    What statement do you code to tell SAS that it is to write to an external file? What statement do you code to write the record to the file?

    The FILE statement names the external file, and the PUT statement writes the record to it.

    If reading an external file to produce an external file, what is the shortcut to write that record without coding every single variable on the record?

    PUT _INFILE_; which writes the current input record exactly as it was read.
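
    A minimal sketch of copying records with FILE and PUT, assuming a temporary fileref named outfile and in-stream data in place of a real input file:

```sas
filename outfile temp;         /* temporary output file; a real path is not assumed */

data _null_;                   /* _NULL_: no SAS data set is created          */
    infile datalines;
    input;                     /* load the record into the input buffer only  */
    file outfile;              /* direct PUT output to the external file      */
    put _infile_;              /* write the whole record without naming variables */
    datalines;
first record
second record
;
run;
```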

    If you don't want any SAS output from a DATA step, how would you code the DATA statement to prevent SAS from producing a data set?

    DATA _NULL_;

    What is the one statement to set the criteria of data that can be coded in any

    step?

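    OPTIONS statement: this is a global statement, so it affects all steps that follow it. To set selection criteria on the data itself, the WHERE statement can be coded in both DATA and PROC steps.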

    Have you ever linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself.

    How would you include common or reusable code to be processed along with your statements?

    By using SAS macros or the %INCLUDE statement.
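
    As a rough illustration of reusable code, the macro below wraps a PROC PRINT so it can be called with different inputs. The macro name and its parameters are invented, and the call assumes the SASHELP.CLASS sample data set is available in the installation.

```sas
/* Hypothetical macro: print the first n observations of a data set. */
%macro print_top(ds=, n=5);
    proc print data=&ds (obs=&n);
    run;
%mend print_top;

/* Reuse the same code with different arguments. */
%print_top(ds=sashelp.class, n=3)
```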

    When looking for data contained in a character string of 150 bytes, which

    function is the best to locate that data: scan, index, or indexc?

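    INDEX, which returns the position of the first occurrence of a substring. INDEXC searches for any one of a set of characters, and SCAN extracts the nth word rather than locating text.

    If you have a data set that contains 100 variables, but you need only five of those, what is the code to force SAS to use only those variables?

    Use the KEEP= data set option or the KEEP statement.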
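
    The three string functions and the KEEP= option can be compared in a short sketch. The sample string is made up, and the second step assumes the SASHELP.CLASS sample data set is available.

```sas
/* Compare the three functions on one hypothetical string. */
data _null_;
    text = 'The order shipped from Boston on Monday';
    pos_sub  = index(text, 'Boston');   /* 24: start of the substring         */
    pos_char = indexc(text, 'xyz');     /* position of the first x, y, or z   */
    word5    = scan(text, 5);           /* Boston: the fifth word             */
    put pos_sub= pos_char= word5=;
run;

/* Read only the variables that are needed. */
data work.small;
    set sashelp.class (keep=name age);
run;
```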

    Code a PROC SORT on a data set containing State, District and County as the

    primary variables, along with several numeric variables.

    Proc sort data=one;

    BY State District County ;

    Run ;

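    How would you delete duplicate observations?

    With the NODUPLICATES option (NODUPRECS, alias NODUP, in current documentation) in PROC SORT.

    How would you delete observations with duplicate keys?

    With the NODUPKEY option in PROC SORT.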
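
    To compare the two options, the sketch below sorts the same small data set twice. The data values and the output data set names are hypothetical.

```sas
/* Hypothetical data: rows 1 and 2 are identical, row 3 shares only the key. */
data work.one;
    input state $ district county $ amount;
    datalines;
CA 1 Kern 10
CA 1 Kern 10
CA 1 Kern 25
NY 2 Erie 40
;
run;

/* Drop observations that duplicate the previous one in every variable. */
proc sort data=work.one out=work.nodup noduprecs;
    by state district county;
run;

/* Keep only the first observation for each BY-variable combination. */
proc sort data=work.one out=work.nokey nodupkey;
    by state district county;
run;
```

    With this input, the first sort should leave three observations and the second should leave two.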

    How would you code a merge that will keep only the observations that have

    matches from both sets.

    Use the IN= data set option on each data set in the MERGE statement, then a subsetting IF statement that keeps only the observations where both IN= variables are true.

    How would you code a merge that will write the matches of both to one data set and the non-matches from the left-most data set to another?

    Step 1: Name the output data sets in the DATA statement.

    Step 2: Use the IN= data set option to assign a flag variable to each of the two input data sets.

    Step 3: Check the flags with IF statements and output the matches to the first data set and the non-matches to the other data sets.

    Ex (matches only): data xxx;

    merge yyy (in = inyyy) zzz (in = inzzz);

    by aaa;

    if inyyy = 1 and inzzz = 1;

    run;
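
    The three steps can be sketched end to end as below. The input data sets reuse the names yyy, zzz, and aaa from the example, while the output data set names and all values are made up. Both inputs are already in order by aaa, which the BY statement requires.

```sas
data work.yyy;
    input aaa left_val $;
    datalines;
1 a
2 b
3 c
;
run;

data work.zzz;
    input aaa right_val $;
    datalines;
2 x
3 y
4 z
;
run;

/* Matches go to one data set, left-only observations to another. */
data work.matches work.left_only;
    merge work.yyy (in = inyyy) work.zzz (in = inzzz);
    by aaa;
    if inyyy and inzzz then output work.matches;
    else if inyyy then output work.left_only;
run;
```

    Here keys 2 and 3 should land in work.matches and key 1 in work.left_only, while key 4, which exists only in the right-hand data set, is written nowhere.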

    What is the Program Data Vector (PDV)? What are its functions?

    Function: to store the current observation. The PDV (Program Data Vector) is a logical area in memory where SAS builds a data set, one observation at a time. When SAS processes a DATA step it goes through two phases: the compilation phase and the execution phase. During the compilation phase, the input buffer is created to hold a record from an external file. After the input buffer is created, the PDV is created. The PDV contains two automatic variables, _N_ and _ERROR_.

    The Logical Program Data Vector (PDV) is a set of buffers that includes all variables

    referenced either explicitly or implicitly in the DATA step. It is created at compile

    time, then used at execution time as the location where the working values of variables

    are stored as they are processed by the DATA step program (source:

    http://www2.sas.com/proceedings/sugi24/Posters/p235-24.pdf).
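
    One informal way to look at the PDV is to write all of its variables to the log with PUT _ALL_ at two points in each iteration. The variable names and data values below are arbitrary.

```sas
/* Dump the PDV to the log before and after the record is read. */
data work.demo;
    put 'Before INPUT: ' _all_;   /* x, y, total start each iteration as missing */
    input x y;
    total = x + y;
    put 'After INPUT:  ' _all_;   /* working values for the current observation  */
    datalines;
1 2
3 4
;
run;
```

    The log lines also show _N_ and _ERROR_, which live in the PDV but are not written to the output data set.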

    Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.

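    SAS compiles the code: each step is compiled first and then executed.

    At compile time, when a SAS data set is read, what items are created?

    The automatic variables, the PDV, and the descriptor information are created. An input buffer is created as well when the step reads raw data.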

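    Name statements that are recognized at compile time only.

    Declarative statements such as LENGTH, KEEP, DROP, RENAME, LABEL, FORMAT, RETAIN, and ARRAY.

    Name statements that are execution only.

    Executable statements such as PUT, OUTPUT, and DELETE.

    Identify statements whose placement in the DATA step is critical.

    DATA, INPUT, RUN.

    Name statements that function at both compile and execution time.

    INPUT: at compile time it defines the variables, and at execution time it reads their values.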

    In the flow of DATA step processing, what is the first action in a typical DATA

    Step?

    The DATA step begins with a DATA statement. Each time the DATA statement

    executes, a new iteration of the DATA step begins, and the _N_ automatic variable is

    incremented by 1.

    What is _n_?

    It is an automatic counter variable in the DATA step.

    Note: Both _N_ and _ERROR_ are always available in the DATA step. _N_ indicates the number of times SAS has looped through the DATA step. This is not necessarily equal to the observation number, since a simple subsetting IF statement can change the relationship between the observation number and the number of iterations of the DATA step. The _ERROR_ variable has a value of 1 if there is an error in the data for that observation and 0 if there is not.

    Ex: _N_ is an implicit variable created by SAS during DATA step processing. It gives the number of iterations of the DATA step so far. It is available only in the DATA step and not in PROCs. For example, to select every third record in a data set, use _N_ as follows:

    data new;

    set old;

    if mod(_n_, 3) = 1; /* subsetting IF: keeps observations 1, 4, 7, ... */

    run;

    Note: If a WHERE clause is used to subset the input, _N_ counts only the observations that pass the WHERE clause, so it will not yield the required result.
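
    The effect of WHERE on _N_ can be seen in a small sketch. The data set names and the id variable are hypothetical.

```sas
/* Build nine observations with id = 1 to 9. */
data work.old;
    do id = 1 to 9;
        output;
    end;
run;

/* Subsetting IF: _N_ still counts every observation read. */
data work.every_third;
    set work.old;
    if mod(_n_, 3) = 1;        /* keeps ids 1, 4, 7 */
run;

/* WHERE: observations are filtered before they reach the DATA step. */
data work.with_where;
    set work.old;
    where id > 3;
    iteration = _n_;           /* 1 to 6 for ids 4 to 9 */
run;
```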
