SAS 9.2 Procedures_Guide Book

Base SAS ® 9.2 Procedures Guide


  • Base SAS 9.2 Procedures Guide


  • The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2009. Base SAS 9.2 Procedures Guide. Cary, NC: SAS Institute Inc.

    Base SAS 9.2 Procedures Guide

    Copyright 2009 by SAS Institute Inc., Cary, NC, USA

    ISBN 978-1-59994-714-3

    All rights reserved. Produced in the United States of America.

    For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

    For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.

    U.S. Government Restricted Rights Notice. Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19 Commercial Computer Software-Restricted Rights (June 1987).

    SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.

    1st electronic book, February 2009

    1st printing, March 2009

    SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/pubs or call 1-800-727-3228.

    SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

    Other brand and product names are registered trademarks or trademarks of their respective companies.

  • Contents

    What's New
        Overview
        New Base SAS Procedures
        Enhanced Base SAS Procedures
        Documentation Enhancements

    PART 1 Concepts

    Chapter 1 Choosing the Right Procedure
        Functional Categories of Base SAS Procedures
        Report-Writing Procedures
        Statistical Procedures
        Utility Procedures
        Brief Descriptions of Base SAS Procedures

    Chapter 2 Fundamental Concepts for Using Base SAS Procedures
        Language Concepts
        Procedure Concepts
        Output Delivery System

    Chapter 3 Statements with the Same Function in Multiple Procedures
        Overview
        Statements

    PART 2 Procedures

    Chapter 4 The APPEND Procedure
        Overview: APPEND Procedure
        Syntax: APPEND Procedure

    Chapter 5 The CALENDAR Procedure
        Overview: CALENDAR Procedure
        Syntax: CALENDAR Procedure
        Concepts: CALENDAR Procedure
        Results: CALENDAR Procedure
        Examples: CALENDAR Procedure

    Chapter 6 The CALLRFC Procedure
        Information about the CALLRFC Procedure

    Chapter 7 The CATALOG Procedure
        Overview: CATALOG Procedure
        Syntax: CATALOG Procedure
        Concepts: CATALOG Procedure
        Results: CATALOG Procedure
        Examples: CATALOG Procedure

    Chapter 8 The CHART Procedure
        Overview: CHART Procedure
        Syntax: CHART Procedure
        Concepts: CHART Procedure
        Results: CHART Procedure
        Examples: CHART Procedure
        References

    Chapter 9 The CIMPORT Procedure
        Overview: CIMPORT Procedure
        Syntax: CIMPORT Procedure
        CIMPORT Problems: Importing Transport Files
        Examples: CIMPORT Procedure

    Chapter 10 The COMPARE Procedure
        Overview: COMPARE Procedure
        Syntax: COMPARE Procedure
        Concepts: COMPARE Procedure
        Results: COMPARE Procedure
        Examples: COMPARE Procedure

    Chapter 11 The CONTENTS Procedure
        Overview: CONTENTS Procedure
        Syntax: CONTENTS Procedure

    Chapter 12 The COPY Procedure
        Overview: COPY Procedure
        Syntax: COPY Procedure
        Concepts: COPY Procedure
        Example: COPY Procedure

    Chapter 13 The CORR Procedure
        Information about the CORR Procedure

    Chapter 14 The CPORT Procedure
        Overview: CPORT Procedure
        Syntax: CPORT Procedure
        READ= Data Set Option in the PROC CPORT Statement
        Results: CPORT Procedure
        Examples: CPORT Procedure

    Chapter 15 The CV2VIEW Procedure
        Information about the CV2VIEW Procedure

    Chapter 16 The DATASETS Procedure
        Overview: DATASETS Procedure
        Syntax: DATASETS Procedure
        Concepts: DATASETS Procedure
        Results: DATASETS Procedure
        Examples: DATASETS Procedure

    Chapter 17 The DBCSTAB Procedure
        Information about the DBCSTAB Procedure

    Chapter 18 The DISPLAY Procedure
        Overview: DISPLAY Procedure
        Syntax: DISPLAY Procedure
        Example: DISPLAY Procedure

    Chapter 19 The DOCUMENT Procedure
        Information about the DOCUMENT Procedure

    Chapter 20 The EXPLODE Procedure
        Information about the EXPLODE Procedure

    Chapter 21 The EXPORT Procedure
        Overview: EXPORT Procedure
        Syntax: EXPORT Procedure
        Examples: EXPORT Procedure

    Chapter 22 The FCMP Procedure
        Overview: FCMP Procedure
        Syntax: FCMP Procedure
        Concepts: FCMP Procedure
        PROC FCMP and DATA Step Differences
        Working with Arrays
        Reading Arrays and Writing Arrays to a Data Set
        Using Macros with PROC FCMP Routines
        Variable Scope in PROC FCMP Routines
        Recursion
        Directory Transversal
        Identifying the Location of Compiled Functions and Subroutines: The CMPLIB= System Option
        Special Functions and CALL Routines: Overview
        Special Functions and CALL Routines: Matrix CALL Routines
        Special Functions and CALL Routines: C Helper Functions and CALL Routines
        Special Functions and CALL Routines: Other Functions
        Functions for Calling SAS Code from Within Functions
        The FCmp Function Editor
        Examples: FCMP Procedure

    Chapter 23 The FONTREG Procedure
        Overview: FONTREG Procedure
        Syntax: FONTREG Procedure
        Concepts: FONTREG Procedure
        Examples: FONTREG Procedure

    Chapter 24 The FORMAT Procedure
        Overview: FORMAT Procedure
        Syntax: FORMAT Procedure
        Informat and Format Options
        Specifying Values or Ranges
        Concepts: FORMAT Procedure
        Results: FORMAT Procedure
        Examples: FORMAT Procedure

    Chapter 25 The FORMS Procedure
        Information about the FORMS Procedure

    Chapter 26 The FREQ Procedure
        Information about the FREQ Procedure

    Chapter 27 The FSLIST Procedure
        Overview: FSLIST Procedure
        Syntax: FSLIST Procedure
        Using the FSLIST Window

    Chapter 28 The HTTP Procedure
        Overview: HTTP Procedure
        Syntax: HTTP Procedure
        Using Hypertext Transfer Protocol Secure (HTTPS)
        Examples: HTTP Procedure

    Chapter 29 The IMPORT Procedure
        Overview: IMPORT Procedure
        Syntax: IMPORT Procedure
        Examples: IMPORT Procedure

    Chapter 30 The INFOMAPS Procedure
        Information about the INFOMAPS Procedure

    Chapter 31 The JAVAINFO Procedure
        Overview: JAVAINFO Procedure
        Syntax: JAVAINFO Procedure

    Chapter 32 The MEANS Procedure
        Overview: MEANS Procedure
        Syntax: MEANS Procedure
        Concepts: MEANS Procedure
        Statistical Computations: MEANS Procedure
        Results: MEANS Procedure
        Examples: MEANS Procedure
        References

    Chapter 33 The METADATA Procedure
        Information about the METADATA Procedure

    Chapter 34 The METALIB Procedure
        Information about the METALIB Procedure

    Chapter 35 The METAOPERATE Procedure
        Information about the METAOPERATE Procedure

    Chapter 36 The MIGRATE Procedure
        Overview: MIGRATE Procedure
        Syntax: MIGRATE Procedure
        Concepts: MIGRATE Procedure
        Migrating a Library with Validation Tools
        Using the SLIBREF= Option
        Examples

    Chapter 37 The OPTIONS Procedure
        Overview: OPTIONS Procedure
        Syntax: OPTIONS Procedure
        Results: OPTIONS Procedure
        Examples: OPTIONS Procedure

    Chapter 38 The OPTLOAD Procedure
        Overview: OPTLOAD Procedure
        Syntax: OPTLOAD Procedure

    Chapter 39 The OPTSAVE Procedure
        Overview: OPTSAVE Procedure
        Syntax: OPTSAVE Procedure

    Chapter 40 The PLOT Procedure
        Overview: PLOT Procedure
        Syntax: PLOT Procedure
        Concepts: PLOT Procedure
        Results: PLOT Procedure
        Examples: PLOT Procedure

    Chapter 41 The PMENU Procedure
        Overview: PMENU Procedure
        Syntax: PMENU Procedure
        Concepts: PMENU Procedure
        Examples: PMENU Procedure

    Chapter 42 The PRINT Procedure
        Overview: PRINT Procedure
        Syntax: PRINT Procedure
        Results: PRINT Procedure
        Examples: PRINT Procedure

    Chapter 43 The PRINTTO Procedure
        Overview: PRINTTO Procedure
        Syntax: PRINTTO Procedure
        Concepts: PRINTTO Procedure
        Examples: PRINTTO Procedure

    Chapter 44 The PROTO Procedure
        Overview: PROTO Procedure
        Syntax: PROTO Procedure
        Concepts: PROTO Procedure
        C Helper Functions and CALL Routines
        Results: PROTO Procedure
        Examples: PROTO Procedure

    Chapter 45 The PRTDEF Procedure
        Overview: PRTDEF Procedure
        Syntax: PRTDEF Procedure
        Input Data Set: PRTDEF Procedure
        Examples: PRTDEF Procedure

    Chapter 46 The PRTEXP Procedure
        Overview: PRTEXP Procedure
        Syntax: PRTEXP Procedure
        Concepts: PRTEXP Procedure
        Examples: PRTEXP Procedure

    Chapter 47 The PWENCODE Procedure
        Overview: PWENCODE Procedure
        Syntax: PWENCODE Procedure
        Concepts: PWENCODE Procedure
        Examples: PWENCODE Procedure

    Chapter 48 The RANK Procedure
        Overview: RANK Procedure
        Syntax: RANK Procedure
        Concepts: RANK Procedure
        Results: RANK Procedure
        Examples: RANK Procedure
        References

    Chapter 49 The REGISTRY Procedure
        Overview: REGISTRY Procedure
        Syntax: REGISTRY Procedure
        Creating Registry Files with the REGISTRY Procedure
        Examples: REGISTRY Procedure

    Chapter 50 The REPORT Procedure
        Overview: REPORT Procedure
        Concepts: REPORT Procedure
        Syntax: REPORT Procedure
        REPORT Procedure Windows
        How PROC REPORT Builds a Report
        Examples: REPORT Procedure

    Chapter 51 The SCAPROC Procedure
        Overview: SCAPROC Procedure
        Syntax: SCAPROC Procedure
        Results: SCAPROC Procedure
        Examples: SCAPROC Procedure

    Chapter 52 The SOAP Procedure
        Overview: SOAP Procedure
        Syntax: SOAP Procedure
        Concepts: SOAP Procedure
        WS-Security: Client Configuration
        Using PROC SOAP with Secure Socket Layer (SSL)
        Calling SAS Web Services
        Examples: SOAP Procedure

    Chapter 53 The SORT Procedure
        Overview: SORT Procedure
        Syntax: SORT Procedure
        Concepts: SORT Procedure
        Integrity Constraints: SORT Procedure
        Results: SORT Procedure
        Examples: SORT Procedure

    Chapter 54 The SQL Procedure
        Overview: SQL Procedure
        Syntax: SQL Procedure
        SQL Procedure Component Dictionary
        PROC SQL and the ANSI Standard
        Examples: SQL Procedure

    Chapter 55 The STANDARD Procedure
        Overview: STANDARD Procedure
        Syntax: STANDARD Procedure
        Results: STANDARD Procedure
        Statistical Computations: STANDARD Procedure
        Examples: STANDARD Procedure

    Chapter 56 The SUMMARY Procedure
        Overview: SUMMARY Procedure
        Syntax: SUMMARY Procedure

    Chapter 57 The TABULATE Procedure
        Overview: TABULATE Procedure
        Terminology: TABULATE Procedure
        Syntax: TABULATE Procedure
        Concepts: TABULATE Procedure
        Results: TABULATE Procedure
        Examples: TABULATE Procedure
        References

    Chapter 58 The TEMPLATE Procedure
        Information about the TEMPLATE Procedure

    Chapter 59 The TIMEPLOT Procedure
        Overview: TIMEPLOT Procedure
        Syntax: TIMEPLOT Procedure
        Results: TIMEPLOT Procedure
        Examples: TIMEPLOT Procedure

    Chapter 60 The TRANSPOSE Procedure
        Overview: TRANSPOSE Procedure
        Syntax: TRANSPOSE Procedure
        Results: TRANSPOSE Procedure
        Examples: TRANSPOSE Procedure

    Chapter 61 The TRANTAB Procedure
        Information about the TRANTAB Procedure

    Chapter 62 The UNIVARIATE Procedure
        Information about the UNIVARIATE Procedure

    PART 3 Appendixes

    Appendix 1 SAS Elementary Statistics Procedures
        Overview
        Keywords and Formulas
        Statistical Background
        References

    Appendix 2 Operating Environment-Specific Procedures
        Descriptions of Operating Environment-Specific Procedures

    Appendix 3 Raw Data and DATA Steps
        Overview
        CENSUS
        CHARITY
        CONTROL Library
        CUSTOMER_RESPONSE
        DJIA
        EDUCATION
        EMPDATA
        ENERGY
        EXP Library
        EXPREV
        GROC
        MATCH_11
        PROCLIB.DELAY
        PROCLIB.EMP95
        PROCLIB.EMP96
        PROCLIB.INTERNAT
        PROCLIB.LAKES
        PROCLIB.MARCH
        PROCLIB.PAYLIST2
        PROCLIB.PAYROLL
        PROCLIB.PAYROLL2
        PROCLIB.SCHEDULE
        PROCLIB.STAFF
        PROCLIB.SUPERV
        RADIO
        SALES

    Appendix 4 ICU License
        ICU License - ICU 1.8.1 and later

    Appendix 5 Recommended Reading
        Recommended Reading

    Index

  • What's New

    Overview

    The following Base SAS procedures are new: CALLRFC, FCMP, HTTP, JAVAINFO, PROTO, SCAPROC, and SOAP.

    The following Base SAS procedures have been enhanced: APPEND, CIMPORT, CONTENTS, COPY, CPORT, CORR, DATASETS, FREQ, MEANS, MIGRATE, OPTIONS, PRINT, PWENCODE, RANK, REPORT, SORT, SQL, TABULATE, and UNIVARIATE.

    New Base SAS Procedures

    The CALLRFC Procedure

    The CALLRFC procedure enables you to invoke Remote Function Call (RFC) or RFC-compatible functions on an SAP System from a SAS program. You must license and configure SAS/ACCESS Interface to R/3 to use the CALLRFC procedure.

    The FCMP Procedure

    The FCMP procedure is new for 9.2. The SAS Function Compiler Procedure (FCMP) enables you to create, test, and store SAS functions and subroutines before you use them in other SAS procedures. PROC FCMP accepts slight variations of DATA step statements, and most features of the SAS programming language can be used in functions and subroutines that are processed by PROC FCMP.
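    As a minimal sketch of that workflow, the following step compiles one function and then calls it from a DATA step. The package location WORK.FUNCS.DEMO and the function name C_TO_F are hypothetical names chosen for illustration; the CMPLIB= system option tells SAS where to look for compiled functions.

```sas
/* Compile a small function into a package (names are illustrative). */
proc fcmp outlib=work.funcs.demo;
   function c_to_f(celsius);
      /* Convert degrees Celsius to degrees Fahrenheit. */
      return (celsius * 9 / 5 + 32);
   endsub;
run;

/* Make the compiled package visible to later steps. */
options cmplib=work.funcs;

data _null_;
   f = c_to_f(100);
   put f=;   /* writes the converted value to the log */
run;
```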

    The JAVAINFO Procedure

    The JAVAINFO procedure conveys diagnostic information to the user about the Java environment that SAS is using. The diagnostic information can be used to confirm that the SAS Java environment has been configured correctly, and can be helpful when reporting problems to SAS technical support. Also, PROC JAVAINFO is often used to verify that the SAS Java environment is working correctly because PROC JAVAINFO uses Java to report its diagnostics.
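    In its simplest form the procedure takes no options. The sketch below assumes a SAS session with a configured Java environment and writes the diagnostics to the log.

```sas
/* Report the Java environment that this SAS session is using. */
proc javainfo;
run;
```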

    The PROTO Procedure

    The PROTO procedure enables you to register, in batch mode, external functions that are written in the C or C++ programming languages. You can use these functions in SAS as well as in C-language structures and types. After the C-language functions are registered in PROC PROTO, they can be called from any SAS function or subroutine that is declared in the FCMP procedure. They can also be called from any SAS function, subroutine, or method block that is declared in the COMPILE procedure.

    The SCAPROC Procedure

    The SCAPROC procedure enables you to specify a filename or fileref that will contain the output of the SAS Code Analyzer, and to write the output to the file. The SAS Code Analyzer captures information about the job step, input and output information such as file dependencies, and information about macro symbol usage from a running SAS job. The SCAPROC procedure also can generate a grid-enabled job that can simultaneously run independent pieces of a SAS job.
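    A sketch of a typical recording session follows: start the analyzer, run the job, and then write the captured information. The file name sca_record.txt and the data set WORK.A are hypothetical and stand in for your own job.

```sas
/* Start recording; the output file name is illustrative only. */
proc scaproc;
   record 'sca_record.txt';
run;

/* The job to analyze (a trivial stand-in step). */
data work.a;
   do i = 1 to 10;
      output;
   end;
run;

/* Write the captured information to the record file. */
proc scaproc;
   write;
run;
```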


    The SOAP Procedure

    The SOAP procedure reads XML input from a file that has a fileref and writes XML output to another file that has a fileref. The envelope and headings are part of the content of the fileref.

    The HTTP Procedure

    The HTTP procedure invokes a Web service that issues requests.

    Enhanced Base SAS Procedures

    The APPEND Procedure

    The NOWARN option has been added to the APPEND procedure. The NOWARN option suppresses the warning message when it is used with the FORCE option to concatenate two data sets with different variables.
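    For illustration, the sketch below appends a data set whose variables differ from those of the base data set. The data set names WORK.BASE and WORK.EXTRA are hypothetical.

```sas
/* Two small data sets with different variables (illustrative). */
data work.base;
   x = 1; y = 2;
run;

data work.extra;
   x = 3; z = 4;
run;

/* FORCE allows the append despite the mismatch;        */
/* NOWARN suppresses the warning that FORCE would raise. */
proc append base=work.base data=work.extra force nowarn;
run;
```

    Because the structure of the BASE= data set is kept, Z is not added to WORK.BASE and Y is missing in the appended observation.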

    The CIMPORT Procedure

    The following enhancements have been made to the CIMPORT procedure:

    ISFILEUTF8= is a new option that specifies whether the encoding of the transport file is UTF-8. This feature is useful when you import a transport file whose UTF-8 encoding identity is known to you but is not stored in the transport file. SAS releases before SAS 9.2 do not store any encodings in the transport file.

    New warning and error messages are available to alert you to transport problems and recovery actions.

    The CONTENTS Procedure

    The WHERE option of the CONTENTS procedure has been restricted. You cannot use the WHERE option to affect the output because PROC CONTENTS does not process any observations.

    The COPY Procedure

    The PROC COPY option of the COPY procedure ignores concatenations with catalogs. Use PROC CATALOG COPY to copy concatenated catalogs.

    The CPORT Procedure

    The documentation about the READ= data set option (used in the DATA statement of PROC CPORT) was enhanced to explain when a read-only password might be required. You can create a transport file for a read-only data set only when you also specify the data set's password using the READ= option in PROC CPORT. Clear-text and encoded passwords are supported.


    The CORR Procedure

    The new ID statement for the CORR procedure specifies one or more additional tip variables to identify observations in scatter plots and scatter plot matrices.

    The DATASETS Procedure

    The following options are new or enhanced in the DATASETS procedure:

    The new REBUILD option specifies whether to correct or delete disabled indexes and integrity constraints. When a data set is damaged in some way and the DLDMGACTION=NOINDEX data set or system option is used, the data set is repaired, the indexes and integrity constraints are disabled, and the index file is deleted. The data set is then limited to INPUT mode only until the REBUILD option is executed. This option enables you to continue with production without waiting for the indexes to be repaired, which can take a long time on large data sets.

    Here is a list of enhancements for the COPY statement:

        The COPY statement with the NOCLONE option specified supports the OUTREP= and ENCODING= LIBNAME options for SQL views, DATA step views, and some SAS/ACCESS views (Oracle and Sybase).

        You can use the COPY statement, along with the XPORT engine or a REMOTE engine, to transport SAS data sets between hosts.

    Here is a list of enhancements for the CONTENTS statement:

        When using the OUT2 option, indexes and integrity constraints are labeled if disabled.

    The FREQ Procedure

    The FREQ procedure can now produce frequency plots, cumulative frequency plots, deviation plots, odds ratio plots, and kappa plots by using ODS Graphics. The crosstabulation table now has an ODS template that you can customize using the TEMPLATE procedure. Equivalence and noninferiority tests are now available for the binomial proportion and the proportion difference. New confidence limits for the binomial proportion include Agresti-Coull, Jeffreys, and Wilson (score) confidence limits. The RISKDIFF option in the EXACT statement provides unconditional exact confidence limits for the proportion (risk) difference. The EQOR option in the EXACT statement provides Zelen's exact test for equal odds ratios.
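    As a hedged sketch of two of these features, the step below requests a frequency plot and Wilson confidence limits for a binomial proportion. It assumes that ODS Graphics is available in your installation and uses the SASHELP.CLASS sample data set; the exact spelling of the PLOTS= and BINOMIAL suboptions should be checked against the FREQ procedure documentation.

```sas
ods graphics on;

proc freq data=sashelp.class;
   /* Frequency plot plus Wilson (score) limits for the proportion. */
   tables sex / plots=freqplot binomial(wilson);
run;

ods graphics off;
```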

    The MEANS Procedure

    The following enhancements have been made to the MEANS procedure:

    The PRT statistic is now an alias for the PROBT statistic.

    The MODE statistic can now be used with PROC MEANS.
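    Both keywords can be requested in the PROC MEANS statement like any other statistic. This sketch uses the SASHELP.CLASS sample data set.

```sas
/* Request the mode and the PRT (PROBT) p-value along with N and MEAN. */
proc means data=sashelp.class n mean mode prt;
   var age height;
run;
```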

    The MIGRATE Procedure

    The MIGRATE procedure now supports more cross-environment migrations. You can migrate a SAS 8.2 data library from almost every SAS 8.2 operating environment to any SAS 9.2 operating environment. Most SAS 6 operating environments are also supported, but not for cross-environment migration.

    The OPTIONS Procedure

    The following enhancements have been made to the OPTIONS procedure:

    Restricted options are now supported in all operating environments.

    The value of environment variables can be displayed by using the EXPAND option.

    System options that have a character value can be displayed as a hexadecimal value by using the HEXVALUE option.

    You can display a list of SAS system option groups by using the LISTGROUPS option.

    To display the options in multiple groups, you can list more than one group in the GROUP= option.

    The following system option groups are new and can be specified on the GROUP= option: CODEGEN, LOGCONTROL, LISTCONTROL, SMF, SQL, and SVG.
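    The following sketch shows the LISTGROUPS option and a GROUP= list with two of the new groups. Output goes to the SAS log.

```sas
/* List the available system option groups. */
proc options listgroups;
run;

/* Display the options in more than one group at once. */
proc options group=(sql logcontrol);
run;
```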

    The PRINT Procedure

    The following new options have been added to the PRINT procedure:

    SUMLABEL
        enables you to display the label of the BY variable on the summary line.

    BLANKLINE
        enables you to insert a blank line after every n observations.
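    A short sketch that combines both options follows. It uses the SASHELP.CLASS sample data set; the output data set name WORK.CLASS_BY_SEX and the label text are hypothetical.

```sas
/* BY-group processing requires sorted input. */
proc sort data=sashelp.class out=work.class_by_sex;
   by sex;
run;

proc print data=work.class_by_sex sumlabel blankline=5;
   by sex;
   sum height weight;      /* summary line for each BY group       */
   label sex = 'Gender';   /* label that SUMLABEL shows on that line */
run;
```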

    The PWENCODE Procedure

    The PWENCODE procedure now supports the sas003 encoding method, which uses a 256-bit key to generate encoded passwords. The sas003 encoding method supports the AES (Advanced Encryption Standard), which is a new security algorithm for SAS/SECURE.
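    As a sketch, the encoding method is selected with the METHOD= option. The password value is a placeholder, and the sas003 method assumes that SAS/SECURE is licensed.

```sas
/* Encode a password with the sas003 method; the encoded */
/* string is written to the SAS log.                     */
proc pwencode in='my_password' method=sas003;
run;
```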

    The RANK Procedure

    The TIES= option of the RANK procedure has a new value, DENSE, which computes scores and ranks by treating tied values as a single-order statistic.
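    To illustrate dense ranking, the sketch below ranks four scores that contain one tie. The data set and variable names are hypothetical.

```sas
data work.scores;
   input name $ score;
   datalines;
Ann 90
Bob 85
Cy 90
Dee 70
;

/* With TIES=DENSE, tied values share a rank and the next  */
/* distinct value receives the next consecutive integer,   */
/* so no rank numbers are skipped after a tie.             */
proc rank data=work.scores out=work.ranked ties=dense descending;
   var score;
   ranks score_rank;
run;

proc print data=work.ranked;
run;
```

    Here the two scores of 90 share rank 1, 85 receives rank 2, and 70 receives rank 3.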

    The REPORT Procedure

    The following enhancements have been made to the REPORT procedure:

    The PROBT statistic is now an alias for the PRT statistic.

    The MODE statistic can now be used with PROC REPORT.

    The STYLE/MERGE attribute name option has been added so that styles can be concatenated. Currently, there is no way to concatenate styles using a CALL DEFINE statement. Each time the CALL DEFINE statement is executed, it replaces any previous style information.

    The BY statement is now available when requesting an output data set with the OUT= option in the PROC REPORT statement.


    The new Table of Contents (TOC) now supports the CONTENTS= option in the BREAK, RBREAK, and DEFINE statements.

    The BYPAGENO=n option has been added to reset the page number between BY groups.

    The SPANROWS option has been added for the PROC REPORT statement. This option permits the GROUP and ORDER variables to be contained in a box rather than blank cells appearing underneath the GROUP or ORDER variable values.

    The SPANROWS option also permits GROUP and ORDER variable values to repeat when the values break across pages in PDF, PS, and RTF destinations.

    PROC REPORT now supports the ODS DOCUMENT and ODS OUTPUT destinations.

    PROC REPORT now supports style attributes BORDERBOTTOMSTYLE, BORDERBOTTOMWIDTH, BORDERBOTTOMCOLOR, BORDERTOPSTYLE, BORDERTOPWIDTH, and BORDERTOPCOLOR.
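    A minimal sketch of the SPANROWS option follows, using the SASHELP.CLASS sample data set. The effect is visible in ODS destinations such as HTML, PDF, and RTF rather than in the traditional listing.

```sas
/* SPANROWS boxes each ORDER value across the rows it covers */
/* instead of leaving blank cells beneath it.                */
proc report data=sashelp.class nowd spanrows;
   column sex age name height;
   define sex    / order;
   define age    / order;
   define name   / display;
   define height / display;
run;
```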

    The SORT Procedure

    The following options and statements are new or enhanced in the SORT procedure:

    The new PRESORTED option causes PROC SORT to check within the input data set to determine whether the observations are in order before sorting. Use the PRESORTED option when you know or strongly suspect that a data set is already in order according to the key variables specified in the BY statement. By specifying this option, you avoid the cost of sorting the data set.

    The SORTSEQ= option is enhanced. New suboptions have been added as follows:

        The LINGUISTIC suboption specifies linguistic collation, which sorts characters according to rules of language. The rules and default collating sequence options are based on the language specified in the current locale setting. You can modify the default collating rules of linguistic collation. The following are the collating rules that can be used to modify the LINGUISTIC collation suboption: ALTERNATE_HANDLING=, CASE_FIRST=, COLLATION=, LOCALE=, NUMERIC_COLLATION=, and STRENGTH=.

        You can now specify all possible encoding values. The result is the same as a binary collation of the character data represented in the specified encoding. The encoding values available are found in the SAS National Language Support (NLS): Reference Guide.

    The KEY statement has been added to PROC SORT. You can specify multiple KEY statements and multiple variables per KEY statement. You can specify the DESCENDING option to change the default collating direction from ascending to descending.
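    The three additions can be sketched as follows, again with the SASHELP.CLASS sample data set. The output data set names are hypothetical, and STRENGTH=PRIMARY is just one possible collating rule.

```sas
/* PRESORTED: check the order first and skip the sort if the */
/* data set is already ordered by the BY variables.          */
proc sort data=sashelp.class out=work.by_name presorted;
   by name;
run;

/* Linguistic collation with a modified collating rule. */
proc sort data=sashelp.class out=work.by_name_ling
          sortseq=linguistic(strength=primary);
   by name;
run;

/* KEY statements: ascending by SEX, then descending by AGE. */
proc sort data=sashelp.class out=work.by_key;
   key sex;
   key age / descending;
run;
```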

    The SQL Procedure

    The following enhancements have been made to the SQL procedure:

    A number of features have been added that enable you to optimize queries:

        Depending on which engine type the query uses, you can replace the PUT function with a logically equivalent expression.

        You can replace references to the DATE, TIME, DATETIME, and TODAY functions in a query with their equivalent constant values before the query executes.

        You can specify the minimum number of rows that must be in a table or the maximum number of SAS format values that can exist in a PUT function in order for PROC SQL to consider optimizing the PUT function.

        You can bypass the remerging process when a summary function is used in a SELECT clause or a HAVING clause.

        If indexing is present, PROC SQL now uses the index files when processing SELECT DISTINCT statements.

    Semicolons can now be used in explicit queries for passthrough.

    You can use custom functions that are created with PROC FCMP in PROC SQL.

    The DICTIONARY.EXTFILES table will now include the access method and device type information.

    Three new DICTIONARY tables have been added. The FUNCTIONS table contains information about currently accessible functions. The INFOMAPS table returns information on all known information maps. The DESTINATIONS table contains information about all known ODS destinations.

    The DESCRIBE TABLE CONSTRAINTS statement will not display the names of password-protected foreign key data set variables that reference the primary key constraint.

    The TRANSCODE=NO argument is not supported by some SAS Workspace Server clients. In SAS 9.2, if the argument is not supported, column values with TRANSCODE=NO are replaced (masked) with asterisks (*). Before SAS 9.2, column values with TRANSCODE=NO were transcoded.

    The SAS/ACCESS CONNECT statement has a new AUTHDOMAIN option that supports lookup of security credentials (user ID and password) without your having to explicitly specify the credentials.
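    Two of the enhancements above, PROC FCMP functions in PROC SQL and the new DICTIONARY tables, can be sketched together. The function BMI and the package WORK.FUNCS.DEMO are hypothetical; DESCRIBE TABLE is used so that no column names need to be assumed.

```sas
/* Define a custom function (illustrative name and formula). */
proc fcmp outlib=work.funcs.demo;
   function bmi(weight_lb, height_in);
      return (703 * weight_lb / height_in**2);
   endsub;
run;

options cmplib=work.funcs;

proc sql;
   /* Call the PROC FCMP function inside a query. */
   select name, bmi(weight, height) as bmi_value format=5.1
      from sashelp.class;

   /* Write the definitions of two new DICTIONARY tables to the log. */
   describe table dictionary.functions;
   describe table dictionary.destinations;
quit;
```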

    The following new options have been added to the PROC SQL statement:

    CONSTDATETIME|NOCONSTDATETIME
        specifies whether the SQL procedure replaces references to the DATE, TIME, DATETIME, and TODAY functions in a query with their equivalent constant values before the query executes.

        Note: The CONSTDATETIME option provides the same functionality as the new SQLCONSTDATETIME system option.

    EXITCODE
        specifies whether PROC SQL clears an error code for any SQL statement.

    IPASSTHRU|NOIPASSTHRU
        specifies whether implicit pass-through is enabled or disabled.

    REDUCEPUT
        specifies the engine type that a query uses for which optimization is performed by replacing a PUT function in a query with a logically equivalent expression.

        Note: The REDUCEPUT option provides the same functionality as the new SQLREDUCEPUT system option.

    REMERGE|NOREMERGE
        specifies whether the SQL procedure can process queries that use remerging of data.

        Note: The REMERGE option provides the same functionality as the new SQLREMERGE system option.
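    As a sketch, several of these options can be combined in one PROC SQL statement. The query is illustrative only and uses the SASHELP.CLASS sample data set; REDUCEPUT=ALL is one of the possible engine-type values.

```sas
/* CONSTDATETIME: evaluate DATE/TIME/DATETIME/TODAY once per query. */
/* REDUCEPUT=ALL: consider PUT-function optimization for all engines. */
/* NOREMERGE: do not process queries that would require remerging.    */
proc sql constdatetime reduceput=all noremerge;
   select name, age
      from sashelp.class
      where put(sex, $1.) = 'F';
quit;
```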

    The following new global system options affect SQL processing and performance:

    DBIDIRECTEXEC (SAS/ACCESS)
        controls SQL optimization for SAS/ACCESS engines.

    SQLCONSTDATETIME
        specifies whether the SQL procedure replaces references to the DATE, TIME, DATETIME, and TODAY functions in a query with their equivalent constant values before the query executes.

    SQLMAPPUTTO (SAS/ACCESS)
        for SAS 9.2 Phase 2 and later, specifies whether the PUT function in the SQL procedure is processed by SAS or by the SAS_PUT( ) function inside the Teradata database.

    SQLREDUCEPUT
        for the SQL procedure, specifies the engine type that a query uses for which optimization is performed by replacing a PUT function in a query with a logically equivalent expression.

    SQLREDUCEPUTOBS
        for the SQL procedure when the SQLREDUCEPUT= system option is set to NONE, specifies the minimum number of observations that must be in a table in order for PROC SQL to consider optimizing the PUT function in a query.

    SQLREDUCEPUTVALUES
        for the SQL procedure when the SQLREDUCEPUT= system option is set to NONE, specifies the maximum number of SAS format values that can exist in a PUT function expression in order for PROC SQL to consider optimizing the PUT function in a query.

    SQLREMERGE
        specifies whether the SQL procedure can process queries that use remerging of data.

    SQLUNDOPOLICY
        specifies whether the SQL procedure keeps or discards updated data if errors occur while the data is being updated.
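    A brief sketch of setting some of these system options for a whole session follows. The values shown are examples rather than recommendations; check each option's documentation for its valid values and default.

```sas
/* Set SQL-related system options for the session (example values). */
options sqlconstdatetime
        sqlreduceput=dbms
        sqlremerge
        sqlundopolicy=required;

/* Display the current settings of the options in the SQL group. */
proc options group=sql;
run;
```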

    The TABULATE Procedure

    The following enhancements have been made to the TABULATE procedure:

    The PROBT statistic is now an alias for the PRT statistic.

    The MODE statistic can now be used with PROC TABULATE.

    You can specify variable name list shortcuts within the TABLE statement.

    PROC TABULATE now supports style attributes BORDERBOTTOMSTYLE, BORDERBOTTOMWIDTH, BORDERBOTTOMCOLOR, BORDERTOPSTYLE, BORDERTOPWIDTH, and BORDERTOPCOLOR.

    The UNIVARIATE Procedure

    The UNIVARIATE procedure now produces graphs that conform to ODS styles, so that creating consistent output is easier. Also, you now have two methods for producing graphs. With traditional graphics, you can control every detail of a graph through familiar procedure syntax and the GOPTIONS and SYMBOL statements. With ODS Graphics (experimental for the UNIVARIATE procedure in SAS 9.2), you can obtain the highest quality output with minimal syntax. You also now have full compatibility with graphics that are produced by the SAS/STAT and SAS/ETS procedures.

    The new UNIVARIATE procedure CDFPLOT statement plots the observed cumulative distribution function (cdf) of a variable and enables you to superimpose a fitted theoretical distribution on the graph. The new PPPLOT statement creates a probability-probability plot (also referred to as a P-P plot or percent plot). This statement compares the empirical cumulative distribution function (ecdf) of a variable with a specified theoretical cumulative distribution function. The beta, exponential, gamma, lognormal, normal, and Weibull distributions are available in both statements.
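    The two new statements can be sketched with a normal reference distribution whose parameters are estimated from the data. The example uses the SASHELP.CLASS sample data set and assumes that a graphics product such as SAS/GRAPH is available.

```sas
proc univariate data=sashelp.class noprint;
   var height;
   /* Observed cdf with a fitted normal curve superimposed. */
   cdfplot height / normal(mu=est sigma=est);
   /* P-P plot comparing the ecdf with the same normal distribution. */
   ppplot  height / normal(mu=est sigma=est);
run;
```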

    Documentation Enhancements

    The following Base SAS procedures have had part or all of their documentation relocated to other SAS documents.

    The CV2VIEW Procedure

    Documentation for the CV2VIEW procedure is now in the SAS/ACCESS for Relational Databases: Reference.

    The DBCSTAB Procedure

    Documentation for the DBCSTAB procedure is now in the SAS National Language Support (NLS): Reference Guide.

    The EXPORT Procedure

    The Base SAS Procedures Guide contains only overview and common syntax information for the EXPORT procedure. Information that is specific to PC Files is now in the SAS/ACCESS Interface to PC Files: Reference.

    The IMPORT Procedure

    The Base SAS Procedures Guide contains only overview and common syntax information for the IMPORT procedure. Information that is specific to PC Files is now in the SAS/ACCESS Interface to PC Files: Reference.

    The TRANTAB Procedure

    Documentation for the TRANTAB procedure is now in the SAS National Language Support (NLS): Reference Guide.


  • Part 1

    Concepts

    Chapter 1: Choosing the Right Procedure

    Chapter 2: Fundamental Concepts for Using Base SAS Procedures

    Chapter 3: Statements with the Same Function in Multiple Procedures


  • Chapter 1

    Choosing the Right Procedure

    Functional Categories of Base SAS Procedures
      Report Writing
      Statistics
      Utilities
    Report-Writing Procedures
    Statistical Procedures
      Available Statistical Procedures
      Efficiency Issues
        Quantiles
        Computing Statistics for Groups of Observations
      Additional Information about the Statistical Procedures
    Utility Procedures
    Brief Descriptions of Base SAS Procedures

    Functional Categories of Base SAS Procedures

    Report Writing

    These procedures display useful information, such as data listings (detail reports), summary reports, calendars, letters, labels, multipanel reports, and graphical reports.

    CALENDAR PLOT SUMMARY*

    CHART* PRINT TABULATE*

    FREQ* REPORT* TIMEPLOT

    MEANS* SQL*

    * These procedures produce reports and compute statistics.

    Statistics

    These procedures compute elementary statistical measures that include descriptive statistics based on moments, quantiles, confidence intervals, frequency counts, crosstabulations, correlations, and distribution tests. They also rank and standardize data.

    CHART RANK SUMMARY

    CORR REPORT TABULATE

    FREQ SQL UNIVARIATE

    MEANS STANDARD

    Utilities

    These procedures perform basic utility operations. They create, edit, sort, and transpose data sets, create and restore transport data sets, create user-defined formats, and provide basic file maintenance such as copying, appending, and comparing data sets.

    APPEND FONTREG PRINTTO

    BMDP* FORMAT PROTO

    CATALOG FSLIST PRTDEF

    CIMPORT IMPORT PRTEXP

    COMPARE INFOMAPS++ PWENCODE

    CONTENTS JAVAINFO REGISTRY

    CONVERT* METADATA@@ RELEASE*

    COPY METALIB@@ SORT

    CPORT METAOPERATE@@ SOURCE*

    CV2VIEW@ MIGRATE SQL

    DATASETS OPTIONS TAPECOPY*

    DBCSTAB# OPTLOAD TAPELABEL*

    DISPLAY OPTSAVE TEMPLATE+

    DOCUMENT+ PDS* TRANSPOSE

    EXPORT PDSCOPY* TRANTAB#

    FCMP PMENU

    * See the SAS documentation for your operating environment for a description of this procedure.
    + See the SAS Output Delivery System: User's Guide for a description of these procedures.
    @ See the SAS/ACCESS for Relational Databases: Reference for a description of this procedure.
    # See the SAS National Language Support (NLS): Reference Guide for a description of this procedure.
    ** See the SAS/ACCESS Interface to PC Files: Reference for a description of this procedure.
    ++ See the Base SAS Guide to Information Maps for a description of this procedure.
    @@ See the SAS Language Interfaces to Metadata for a description of this procedure.


    Report-Writing Procedures

    The following table lists report-writing procedures according to the type of report.

    Table 1.1 Report-Writing Procedures by Task

    | Report Type | Procedure | Description |
    |---|---|---|
    | Detail reports | PRINT | produces data listings quickly; can supply titles, footnotes, and column sums. |
    | | REPORT | offers more control and customization than PROC PRINT; can produce both column and row sums; has DATA step computation abilities. |
    | | SQL | combines Structured Query Language and SAS features such as formats; can manipulate data and create a SAS data set in the same step that creates the report; can produce column and row statistics; does not offer as much control over output as PROC PRINT and PROC REPORT. |
    | Summary reports | MEANS or SUMMARY | computes descriptive statistics for numeric variables; can produce a printed report and create an output data set. |
    | | PRINT | produces only one summary report: can sum the BY variables. |
    | | REPORT | combines features of the PRINT, MEANS, and TABULATE procedures with features of the DATA step in a single report-writing tool that can produce a variety of reports; can also create an output data set. |
    | | SQL | computes descriptive statistics for one or more SAS data sets or DBMS tables; can produce a printed report or create a SAS data set. |
    | | TABULATE | produces descriptive statistics in a tabular format; can produce stub-and-banner reports (multidimensional tables with descriptive statistics); can also create an output data set. |
    | Miscellaneous highly formatted reports | | |
    | Calendars | CALENDAR | produces schedule and summary calendars; can schedule tasks around nonwork periods and holidays, weekly work schedules, and daily work shifts. |
    | Multipanel reports (telephone book listings) | REPORT | produces multipanel reports. |
    | Low-resolution graphical reports* | CHART | produces bar charts, histograms, block charts, pie charts, and star charts that display frequencies and other statistics. |
    | | PLOT | produces scatter diagrams that plot one variable against another. |
    | | TIMEPLOT | produces plots of one or more variables over time intervals. |

    * These reports quickly produce a simple graphical picture of the data. To produce high-resolution graphical reports, use SAS/GRAPH software.
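    To make the PRINT entry in Table 1.1 concrete, here is a minimal sketch of a detail report with a title, a footnote, and column sums. It assumes the SASHELP.CLASS sample data set that ships with SAS; the title and footnote text are illustrative only.

```sas
title 'Student Listing';
footnote 'Source: SASHELP.CLASS sample data';

proc print data=sashelp.class noobs;
  var name age height weight;
  sum height weight;   /* column sums are printed below the listing */
run;

/* Clear the title and footnote so that they do not carry over to later steps. */
title;
footnote;
```

    When row sums or computed columns are needed, Table 1.1 suggests PROC REPORT instead.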

    Statistical Procedures

    Available Statistical Procedures

    The following table lists statistical procedures according to task. Table A1.1 on page 1517 lists the most common statistics and the procedures that compute them.

    Table 1.2 Elementary Statistical Procedures by Task

    | Report type | Procedure | Description |
    |---|---|---|
    | Descriptive statistics | CORR | computes simple descriptive statistics. |
    | | MEANS or SUMMARY | computes descriptive statistics; can produce printed output and output data sets. By default, PROC MEANS produces printed output, and PROC SUMMARY creates an output data set. |
    | | REPORT | computes most of the same statistics as PROC TABULATE; allows customization of format. |
    | | SQL | computes descriptive statistics for data in one or more DBMS tables; can produce a printed report or create a SAS data set. |
    | | TABULATE | produces tabular reports for descriptive statistics; can create an output data set. |
    | | UNIVARIATE | computes the broadest set of descriptive statistics; can create an output data set. |
    | Frequency and cross-tabulation tables | FREQ | produces one-way to n-way tables; reports frequency counts; computes chi-square tests; computes tests and measures of association and agreement for two-way to n-way cross-tabulation tables; can compute exact tests and asymptotic tests; can create output data sets. |
    | | TABULATE | produces one-way and two-way cross-tabulation tables; can create an output data set. |
    | | UNIVARIATE | produces one-way frequency tables. |
    | Correlation analysis | CORR | computes Pearson's, Spearman's, and Kendall's correlations and partial correlations; also computes Hoeffding's measure of dependence (D) and Cronbach's coefficient alpha. |
    | Distribution analysis | UNIVARIATE | computes tests for location and tests for normality. |
    | | FREQ | computes a test for the binomial proportion for one-way tables; computes a goodness-of-fit test for one-way tables; computes a chi-square test of equal distribution for two-way tables. |
    | Robust estimation | UNIVARIATE | computes robust estimates of scale, trimmed means, and Winsorized means. |
    | Data transformation | | |
    | Computing ranks | RANK | computes ranks for one or more numeric variables across the observations of a SAS data set and creates an output data set; can produce normal scores or other rank scores. |
    | Standardizing data | STANDARD | creates an output data set that contains variables that are standardized to a given mean and standard deviation. |
    | Low-resolution graphics* | CHART | produces a graphical report that can show one of the following statistics for the chart variable: frequency counts, percentages, cumulative frequencies, cumulative percentages, totals, or averages. |
    | | UNIVARIATE | produces descriptive plots such as stem-and-leaf plots, box plots, and normal probability plots. |

    * To produce high-resolution graphical reports, use SAS/GRAPH software.
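    As an illustration of the FREQ and CORR entries in Table 1.2, the following sketch requests a two-way cross-tabulation with a chi-square test and a set of Pearson and Spearman correlations. It assumes the SASHELP.CLASS sample data set; with so few observations, SAS may warn that the chi-square test is not reliable, so treat the output as a syntax demonstration only.

```sas
/* Two-way cross-tabulation with a chi-square test of association */
proc freq data=sashelp.class;
  tables sex*age / chisq;
run;

/* Pearson and Spearman correlations among three numeric variables */
proc corr data=sashelp.class pearson spearman;
  var age height weight;
run;
```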

    Efficiency Issues

    Quantiles

    For a large sample size n, the calculation of quantiles, including the median, requires computing time proportional to n log(n). Therefore, a procedure, such as UNIVARIATE, that automatically calculates quantiles might require more time than other data summarization procedures. Furthermore, because data is held in memory, the procedure also requires more storage space to perform the computations. By default, the report procedures PROC MEANS, PROC SUMMARY, and PROC TABULATE require less memory because they do not automatically compute quantiles. These procedures also provide an option to use a new fixed-memory, quantiles estimation method that is usually less memory-intense. See "Quantiles" on page 636 for more information.
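    The fixed-memory estimation method mentioned above is selected in PROC MEANS with the QMETHOD=P2 option. The sketch below assumes the SASHELP.CLASS sample data set, which is far too small to show any memory benefit; it only shows where the option goes.

```sas
/* QMETHOD=P2 requests the fixed-memory (approximate) quantile method.   */
/* The default, QMETHOD=OS, uses order statistics and holds more data in */
/* memory.                                                               */
proc means data=sashelp.class n mean median p90 qmethod=p2;
  var height weight;
run;
```

    Because P2 is an estimation method, its quantiles can differ slightly from the order-statistics results, so it is best reserved for data sets where memory is a real constraint.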

    Computing Statistics for Groups of Observations

    To compute statistics for several groups of observations, you can use any of the previous procedures with a BY statement to specify BY-group variables. However, BY-group processing requires that you previously sort or index the data set, which for very large data sets might require substantial computer resources. A more efficient way to compute statistics within groups without sorting is to use a CLASS statement with one of the following procedures: MEANS, SUMMARY, or TABULATE.
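    The two approaches can be compared side by side. This is a minimal sketch that assumes the SASHELP.CLASS sample data set; WORK.CLASS_SORTED is a hypothetical name for the sorted copy.

```sas
/* BY-group processing: the data must first be sorted (or indexed). */
proc sort data=sashelp.class out=work.class_sorted;
  by sex;
run;

proc means data=work.class_sorted mean std;
  by sex;
  var height weight;
run;

/* CLASS statement: the same group statistics, with no prior sort. */
proc means data=sashelp.class mean std;
  class sex;
  var height weight;
run;
```

    The BY version prints a separate table for each group, while the CLASS version prints one table with a row per group.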


    Additional Information about the Statistical Procedures

    Appendix 1, "SAS Elementary Statistics Procedures," on page 1515, lists standard keywords, statistical notation, and formulas for the statistics that Base SAS procedures compute frequently. The sections on the individual statistical procedures discuss the statistical concepts that are useful for interpreting a procedure's output.

    Utility Procedures

    The following table groups utility procedures according to task.

    Table 1.3 Utility Procedures by Task

    | Tasks | Procedure | Description |
    |---|---|---|
    | Supply information | COMPARE | compares the contents of two SAS data sets. |
    | | CONTENTS | describes the contents of a SAS library or specific library members. |
    | | JAVAINFO | conveys diagnostic information about the Java environment that SAS is using. |
    | | OPTIONS | lists the current values of all SAS system options. |
    | | SQL | supplies information through dictionary tables on an individual SAS data set as well as all SAS files active in the current SAS session. Dictionary tables can also provide information about macros, titles, indexes, external files, or SAS system options. |
    | Manage SAS system options | OPTIONS | lists the current values of all SAS system options. |
    | | OPTLOAD | reads SAS system option settings that are stored in the SAS registry or a SAS data set. |
    | | OPTSAVE | saves SAS system option settings to the SAS registry or a SAS data set. |
    | Affect printing and Output Delivery System output | DOCUMENT+ | manipulates procedure output that is stored in ODS documents. |
    | | FONTREG | adds system fonts to the SAS registry. |
    | | FORMAT | creates user-defined formats to display and print data. |
    | | PRINTTO | routes procedure output to a file, a SAS catalog entry, or a printer; can also redirect the SAS log to a file. |
    | | PRTDEF | creates printer definitions. |
    | | PRTEXP | exports printer definition attributes to a SAS data set. |
    | | TEMPLATE+ | customizes ODS output. |
    | Create, browse, and edit data | FCMP | enables creation, testing, and storage of SAS functions and subroutines before they are used in other SAS procedures. |
    | | FSLIST | browses external files such as files that contain SAS source lines or SAS procedure output. |
    | | INFOMAPS++ | creates or updates a SAS Information Map. |
    | | SQL | creates SAS data sets using Structured Query Language and SAS features. |
    | Transform data | DBCSTAB# | produces conversion tables for the double-byte character sets that SAS supports. |
    | | FORMAT | creates user-defined informats to read data and user-defined formats to display data. |
    | | SORT | sorts SAS data sets by one or more variables. |
    | | SQL | sorts SAS data sets by one or more variables. |
    | | TRANSPOSE | transforms SAS data sets so that observations become variables and variables become observations. |
    | | TRANTAB# | creates, edits, and displays customized translation tables. |
    | Manage SAS files | APPEND | appends one SAS data set to the end of another. |
    | | BMDP* | invokes a BMDP program to analyze data in a SAS data set. |
    | | CATALOG | manages SAS catalog entries. |
    | | CIMPORT | restores a transport sequential file that PROC CPORT creates (usually in another operating environment) to its original form as a SAS catalog, a SAS data set, or a SAS library. |
    | | CONVERT* | converts BMDP system files, OSIRIS system files, and SPSS portable files to SAS data sets. |
    | | COPY | copies a SAS library or specific members of the library. |
    | | CPORT | converts a SAS catalog, a SAS data set, or a SAS library to a transport sequential file that PROC CIMPORT can restore (usually in another operating environment) to its original form. |
    | | CV2VIEW@ | converts SAS/ACCESS view descriptors to PROC SQL views. |
    | | DATASETS | manages SAS files. |
    | | EXPORT | reads data from a SAS data set and writes them to an external data source. |
    | | IMPORT | reads data from an external data source and writes them to a SAS data set. |
    | | MIGRATE | migrates members in a SAS library forward to the most current release of SAS. |
    | | PDS* | lists, deletes, and renames the members of a partitioned data set. |
    | | PDSCOPY* | copies partitioned data sets from disk to tape, disk to disk, tape to tape, or tape to disk. |
    | | PROTO | enables registration, in batch mode, of external functions that are written in the C or C++ programming languages. |
    | | REGISTRY | imports registry information to the USER portion of the SAS registry. |
    | | RELEASE* | releases unused space at the end of a disk data set under the z/OS environment. |
    | | SOURCE* | provides an easy way to back up and process source library data sets. |
    | | SQL | concatenates SAS data sets. |
    | | TAPECOPY* | copies an entire tape volume or files from one or more tape volumes to one output tape volume. |
    | | TAPELABEL* | lists the label information of an IBM standard-labeled tape volume in the z/OS environment. |
    | Control windows | PMENU | creates customized menus for SAS applications. |
    | Miscellaneous | DISPLAY | executes SAS/AF applications. |
    | | PWENCODE | encodes passwords for use in SAS programs. |
    | Manage metadata in a SAS Metadata Repository | METADATA@@ | sends a method call, in the form of an XML string, to a SAS Metadata Server. |
    | | METALIB@@ | updates metadata to match the tables in a library. |
    | | METAOPERATE@@ | performs administrative tasks on a metadata server. |

    * See the SAS documentation for your operating environment for a description of these procedures.
    + See the SAS Output Delivery System: User's Guide for a description of this procedure.
    @ See the SAS/ACCESS for Relational Databases: Reference for a description of this procedure.
    # See the SAS National Language Support (NLS): Reference Guide for a description of this procedure.
    ** See the SAS/ACCESS Interface to PC Files: Reference for a description of this procedure.
    ++ See the Base SAS Guide to Information Maps for a description of this procedure.
    @@ See the SAS Language Interfaces to Metadata for a description of this procedure.
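    Table 1.3 notes that PROC SQL can supply information through dictionary tables. A minimal sketch of such a query follows; it assumes the standard SASHELP library and uses OUTOBS=10 only to keep the listing short (SAS writes a warning to the log when the limit truncates the result).

```sas
/* List up to 10 data sets in SASHELP with their observation and variable counts. */
proc sql outobs=10;
  select memname, nobs, nvar
    from dictionary.tables
    where libname = 'SASHELP' and memtype = 'DATA';
quit;
```

    Library names are stored in uppercase in DICTIONARY.TABLES, so the comparison value is written in uppercase as well.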

    Brief Descriptions of Base SAS Procedures

    APPEND procedure
    adds observations from one SAS data set to the end of another SAS data set.

    BMDP procedure
    invokes a BMDP program to analyze data in a SAS data set. See the SAS documentation for your operating environment for more information.

    CALENDAR procedure
    displays data from a SAS data set in a monthly calendar format. PROC CALENDAR can display holidays in the month, schedule tasks, and process data for multiple calendars with work schedules that vary.

    CATALOG procedure
    manages entries in SAS catalogs. PROC CATALOG is an interactive, non-windowing procedure that enables you to display the contents of a catalog; copy an entire catalog or specific entries in a catalog; and rename, exchange, or delete entries in a catalog.

    CHART procedure
    produces vertical and horizontal bar charts, block charts, pie charts, and star charts. These charts provide a quick visual representation of the values of a single variable or several variables. PROC CHART can also display a statistic associated with the values.

    CIMPORT procedure
    restores a transport file created by the CPORT procedure to its original form (a SAS library, catalog, or data set) in the format appropriate to the operating environment. Coupled with the CPORT procedure, PROC CIMPORT enables you to move SAS libraries, catalogs, and data sets from one operating environment to another.

    COMPARE procedure
    compares the contents of two SAS data sets. You can also use PROC COMPARE to compare the values of different variables within a single data set. PROC COMPARE produces a variety of reports on the comparisons that it performs.

    CONTENTS procedure
    prints descriptions of the contents of one or more files in a SAS library.

    CONVERT procedure
    converts BMDP system files, OSIRIS system files, and SPSS portable files to SAS data sets. See the SAS documentation for your operating environment for more information.

    COPY procedure
    copies an entire SAS library or specific members of the library. You can limit processing to specific types of library members.

    CORR procedure
    computes Pearson product-moment and weighted product-moment correlation coefficients between variables and descriptive statistics for these variables. In addition, PROC CORR can compute three nonparametric measures of association (Spearman's rank-order correlation, Kendall's tau-b, and Hoeffding's measure of dependence, D), partial correlations (Pearson's partial correlation, Spearman's partial rank-order correlation, and Kendall's partial tau-b), and Cronbach's coefficient alpha.

    CPORT procedure
    writes SAS libraries, data sets, and catalogs in a special format called a transport file. Coupled with the CIMPORT procedure, PROC CPORT enables you to move SAS libraries, data sets, and catalogs from one operating environment to another.

    CV2VIEW procedure
    converts SAS/ACCESS view descriptors to PROC SQL views. Starting in SAS System 9, conversion of SAS/ACCESS view descriptors to PROC SQL views is recommended because PROC SQL views are platform-independent and enable you to use the LIBNAME statement. See the SAS/ACCESS for Relational Databases: Reference for details.

    DATASETS procedure
    lists, copies, renames, and deletes SAS files and SAS generation groups; manages indexes; and appends SAS data sets in a SAS library. The procedure provides all the capabilities of the APPEND, CONTENTS, and COPY procedures. You can also modify variables within data sets; manage data set attributes, such as labels and passwords; or create and delete integrity constraints.

    DBCSTAB procedure
    produces conversion tables for the double-byte character sets that SAS supports. See the SAS National Language Support (NLS): Reference Guide for details.

    DISPLAY procedure
    executes SAS/AF applications. See the SAS Guide to Applications Development for information on building SAS/AF applications.

    DOCUMENT procedure
    manipulates procedure output that is stored in ODS documents. PROC DOCUMENT enables a user to browse and edit output objects and hierarchies, and to replay them to any supported ODS output format. See SAS Output Delivery System: User's Guide for details.

    EXPORT procedure
    reads data from a SAS data set and writes it to an external data source.

    FCMP procedure
    enables you to create, test, and store SAS functions and subroutines before you use them in other SAS procedures. PROC FCMP accepts slight variations of DATA step statements. Most features of the SAS programming language can be used in functions and subroutines that are processed by PROC FCMP.

    FONTREG procedure
    adds system fonts to the SAS registry.

    FORMAT procedure
    creates user-defined informats and formats for character or numeric variables. PROC FORMAT also prints the contents of a format library, creates a control data set to write other informats or formats, and reads a control data set to create informats or formats.

    FREQ procedure
    produces one-way to n-way frequency tables and reports frequency counts. PROC FREQ can compute chi-square tests for one-way to n-way tables; tests and measures of association and of agreement for two-way to n-way cross-tabulation tables; risks and risk differences for 2×2 tables; trend tests; and Cochran-Mantel-Haenszel statistics. You can also create output data sets.

    FSLIST procedure
    displays the contents of an external file or copies text from an external file to the SAS Text Editor.

    IMPORT procedure
    reads data from an external data source and writes them to a SAS data set.

    INFOMAPS procedure
    creates or updates a SAS Information Map. See the Base SAS Guide to Information Maps for details.

    JAVAINFO procedure
    conveys diagnostic information to the user about the Java environment that SAS is using. The diagnostic information can be used to confirm that the SAS Java environment has been configured correctly and can be helpful when reporting problems to SAS technical support.

    MEANS procedure
    computes descriptive statistics for numeric variables across all observations and within groups of observations. You can also create an output data set that contains specific statistics and identifies minimum and maximum values for groups of observations.

    METADATA procedure
    sends a method call, in the form of an XML string, to a SAS Metadata Server.

    METALIB procedure
    updates metadata in a SAS Metadata Repository to match the tables in a library.

    METAOPERATE procedure
    performs administrative tasks on a metadata server.

    MIGRATE procedure
    migrates members in a SAS library forward to the most current release of SAS. The migration must occur within the same engine family; for example, V6, V7, or V8 can migrate to V9, but V6TAPE must migrate to V9TAPE.

    OPTIONS procedure
    lists the current values of all SAS system options.

    OPTLOAD procedure
    reads SAS system option settings from the SAS registry or a SAS data set, and puts them into effect.

    OPTSAVE procedure
    saves SAS system option settings to the SAS registry or a SAS data set.

    PDS procedure
    lists, deletes, and renames the members of a partitioned data set. See the SAS Companion for z/OS for more information.

    PDSCOPY procedure
    copies partitioned data sets from disk to tape, disk to disk, tape to tape, or tape to disk. See the SAS Companion for z/OS for more information.

    PLOT procedure
    produces scatter plots that graph one variable against another. The coordinates of each point on the plot correspond to the two variables' values in one or more observations of the input data set.

    PMENU procedure
    defines menus that you can use in DATA step windows, macro windows, and SAS/AF windows, or in any SAS application that enables you to specify customized menus.

    PRINT procedure
    prints the observations in a SAS data set, using all or some of the variables. PROC PRINT can also print totals and subtotals for numeric variables.

    PRINTTO procedure
    defines destinations for SAS procedure output and the SAS log.

    PROTO procedure
    enables you to register, in batch mode, external functions that are written in the C or C++ programming languages. You can use these functions in SAS as well as in C-language structures and types. After these functions are registered in PROC PROTO, they can be called from any SAS function or subroutine that is declared in the FCMP procedure, as well as from any SAS function, subroutine, or method block that is declared in the COMPILE procedure.

    PRTDEF procedure
    creates printer definitions for individual SAS users or all SAS users.

    PRTEXP procedure
    exports printer definition attributes to a SAS data set so that they can be easily replicated and modified.

    PWENCODE procedure
    encodes passwords for use in SAS programs.

    RANK procedure
    computes ranks for one or more numeric variables across the observations of a SAS data set. The ranks are written to a new SAS data set. Alternatively, PROC RANK produces normal scores or other rank scores.

    REGISTRY procedure
    imports registry information into the USER portion of the SAS registry.

    RELEASE procedure
    releases unused space at the end of a disk data set in the z/OS environment. See the SAS documentation for this operating environment for more information.

    REPORT procedure
    combines features of the PRINT, MEANS, and TABULATE procedures with features of the DATA step in a single report-writing tool that can produce both detail and summary reports.

    SORT procedure
    sorts observations in a SAS data set by one or more variables. PROC SORT stores the resulting sorted observations in a new SAS data set or replaces the original data set.

    SOURCE procedure
    provides an easy way to back up and process source library data sets. See the SAS documentation for your operating environment for more information.

    SQL procedure
    implements a subset of the Structured Query Language (SQL) for use in SAS. SQL is a standardized, widely used language that retrieves and updates data in SAS data sets, SQL views, and DBMS tables, as well as views based on those tables. PROC SQL can also create tables and views, summaries, statistics, and reports and perform utility functions such as sorting and concatenating.

    STANDARD procedure
    standardizes some or all of the variables in a SAS data set to a given mean and standard deviation and produces a new SAS data set that contains the standardized values.

    SUMMARY procedure
    computes descriptive statistics for the variables in a SAS data set across all observations and within groups of observations and outputs the results to a new SAS data set.

    TABULATE procedure
    displays descriptive statistics in tabular form. The value in each table cell is calculated from the variables and statistics that define the pages, rows, and columns of the table. The statistic associated with each cell is calculated on values from all observations in that category. You can write the results to a SAS data set.

    TAPECOPY procedure
    copies an entire tape volume or files from one or more tape volumes to one output tape volume. See the SAS Companion for z/OS for more information.

    TAPELABEL procedure
    lists the label information of an IBM standard-labeled tape volume under the z/OS environment. See the SAS Companion for z/OS for more information.

    TEMPLATE procedure
    customizes ODS output for an entire SAS job or a single ODS output object. See SAS Output Delivery System: User's Guide for details.

    TIMEPLOT procedure
    produces plots of one or more variables over time intervals.

    TRANSPOSE procedure
    transposes a data set, changing observations into variables and vice versa.

    TRANTAB procedure
    creates, edits, and displays customized translation tables. See SAS National Language Support (NLS): Reference Guide for more information.

    UNIVARIATE procedure
    computes descriptive statistics (including quantiles), confidence intervals, and robust estimates for numeric variables. It also provides detail on the distribution of numeric variables, including tests for normality, plots to illustrate the distribution, frequency tables, and tests of location.

  • Chapter 2

    Fundamental Concepts for Using Base SAS Procedures

    Language Concepts
      Temporary and Permanent SAS Data Sets
        Naming SAS Data Sets
        USER Library
      SAS System Options
      Data Set Options
      Global Statements
    Procedure Concepts
      Input Data Sets
      RUN-Group Processing
      Creating Titles That Contain BY-Group Information
        BY-Group Processing
        Suppressing the Default BY Line
        Inserting BY-Group Information into a Title
        Example: Inserting a Value from Each BY Variable into the Title
        Example: Inserting the Name of a BY Variable into a Title
        Example: Inserting the Complete BY Line into a Title
        Error Processing of BY-Group Specifications
      Shortcuts for Specifying Lists of Variable Names
      Formatted Values
        Using Formatted Values
        Example: Printing the Formatted Values for a Data Set
        Example: Grouping or Classifying Formatted Data
        Example: Temporarily Associating a Format with a Variable
        Example: Temporarily Dissociating a Format from a Variable
        Formats and BY-Group Processing
        Formats and Error Checking
      Processing All the Data Sets in a Library
      Operating Environment-Specific Procedures
      Statistic Descriptions
      Computational Requirements for Statistics
    Output Delivery System

    Language Concepts


    Temporary and Permanent SAS Data Sets

    Naming SAS Data Sets

    SAS data sets can have a one-level name or a two-level name. Typically, names of temporary SAS data sets have only one level and are stored in the WORK library. The WORK library is defined automatically at the beginning of the SAS session and is deleted automatically at the end of the SAS session. Procedures assume that SAS data sets that are specified with a one-level name are to be read from or written to the WORK library. To indicate otherwise, you specify a USER library (see "USER Library" on page 18). For example, the following PROC PRINT steps are equivalent. The second PROC PRINT step assumes that the DEBATE data set is in the WORK library.

    proc print data=work.debate;
    run;

    proc print data=debate;
    run;

    The SAS system options WORK=, WORKINIT, and WORKTERM affect how you work with temporary and permanent libraries. See the SAS Language Reference: Dictionary for complete documentation.

    Typically, two-level names represent permanent SAS data sets. A two-level name takes the form libref.SAS-data-set. The libref is a name that is temporarily associated with a SAS library. A SAS library is an external storage location that stores SAS data sets in your operating environment. A LIBNAME statement associates the libref with the SAS library. In the following PROC PRINT step, PROCLIB is the libref and EMP is the SAS data set within the library:

    libname proclib 'SAS-library';
    proc print data=proclib.emp;
    run;

    USER Library

    You can use one-level names for permanent SAS data sets by specifying a USER library. You can assign a USER library with a LIBNAME statement or with the SAS system option USER=. After you specify a USER library, the procedure assumes that data sets with one-level names are in the USER library instead of the WORK library. For example, the following PROC PRINT step assumes that DEBATE is in the USER library:

    options user='SAS-library';
    proc print data=debate;
    run;

    Note: If you have a USER library defined, then you can still use the WORK library by specifying WORK.SAS-data-set.
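    The interaction between the USER and WORK libraries can be sketched as follows. The libref MYLIB, the placeholder path, and the data set SCRATCH are hypothetical, and the SASHELP.CLASS sample data set is assumed only as a convenient source of observations.

```sas
/* 'your-SAS-library' is a placeholder for a directory path on your system. */
libname mylib 'your-SAS-library';
options user=mylib;

/* One-level name: DEBATE is written to the USER library (MYLIB), not WORK. */
data debate;
  set sashelp.class;
run;

/* WORK remains available through an explicit two-level name. */
data work.scratch;
  set debate;
run;

proc print data=debate;   /* reads MYLIB.DEBATE */
run;
```

    Assigning the libref USER directly with a LIBNAME statement is the other way to get the same behavior.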

    SAS System Options

    Some SAS system option settings affect procedure output. The SAS system options listed below are the options that you are most likely to use with SAS procedures:

    BYLINE|NOBYLINE
    DATE|NODATE
    DETAILS|NODETAILS
    FMTERR|NOFMTERR
    FORMCHAR=
    FORMDLIM=
    LABEL|NOLABEL
    LINESIZE=
    NUMBER|NONUMBER
    PAGENO=
    PAGESIZE=
    REPLACE|NOREPLACE
    SOURCE|NOSOURCE

    For a complete description of SAS system options, see the SAS Language Reference: Dictionary.
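    As a small illustration of how a few of these options shape procedure output, the sketch below suppresses the date and page number and sets the line and page sizes before printing. The specific values and the SASHELP.CLASS sample data set are assumptions chosen for the example.

```sas
/* Suppress the date and page number; set the output line and page sizes. */
options nodate nonumber linesize=80 pagesize=60;

proc print data=sashelp.class;
run;

/* Turn the date and page number back on for later steps. */
options date number;
```

    System options stay in effect until they are changed again or the session ends, which is why the example resets DATE and NUMBER afterward.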

    Data Set Options

    Most of the procedures that read data sets or create output data sets accept data set options. SAS data set options appear in parentheses after the data set specification. Here is an example:

    proc print data=stocks(obs=25 pw=green);

    The individual procedure chapters contain reminders that you can use data setoptions where it is appropriate.
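As a further illustration, the sketch below applies data set options to both an input and an output data set. It assumes that the libref PROCLIB is already assigned and that PROCLIB.PAYROLL has the variables shown later in this chapter (IdNumber, Jobcode, Salary, Birth, Hired). The output data set name SORTED is hypothetical.

```sas
/* Input data set options: read only three variables and only */
/* the observations that satisfy the WHERE= condition.        */
proc print data=proclib.payroll(keep=IdNumber Jobcode Salary
                                where=(Salary > 30000));
run;

/* Output data set option: drop two variables from the data   */
/* set that PROC SORT creates. SORTED is a hypothetical name. */
proc sort data=proclib.payroll
          out=work.sorted(drop=Birth Hired);
   by Jobcode;
run;
```

Because the options are attached to the data set specification, they apply only to that data set in that step.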

    SAS data set options are as follows:

    ALTER=          OBS=
    BUFNO=          OBSBUF=
    BUFSIZE=        OUTREP=
    CNTLLEV=        POINTOBS=
    COMPRESS=       PW=
    DLDMGACTION=    PWREQ=
    DROP=           READ=
    ENCODING=       RENAME=
    ENCRYPT=        REPEMPTY=
    FILECLOSE=      REPLACE=
    FIRSTOBS=       REUSE=
    GENMAX=         SORTEDBY=
    GENNUM=         SPILL=
    IDXNAME=        TOBSNO=
    IDXWHERE=       TYPE=
    IN=             WHERE=
    INDEX=          WHEREUP=
    KEEP=           WRITE=
    LABEL=

    For a complete description of SAS data set options, see the SAS Language Reference: Dictionary.

    Global Statements

    You can use these global statements anywhere in SAS programs except after a DATALINES, CARDS, or PARMCARDS statement:

    comment ODS

    DM OPTIONS

    ENDSAS PAGE

    FILENAME RUN

    FOOTNOTE %RUN

    %INCLUDE SASFILE

    LIBNAME SKIP

    %LIST TITLE

    LOCK X

    For information about all but the ODS statement, refer to the SAS Language Reference: Dictionary. For information about the ODS statement, refer to the Output Delivery System on page 33 and to SAS Output Delivery System: User's Guide.

    Procedure Concepts

    Input Data Sets

    Many Base SAS procedures require an input SAS data set. You specify the input SAS data set by using the DATA= option in the procedure statement, as in this example:

    proc print data=emp;

    If you omit the DATA= option, the procedure uses the value of the SAS system option _LAST_=. The default of _LAST_= is the most recently created SAS data set in the current SAS job or session. _LAST_= is described in detail in the SAS Language Reference: Dictionary.
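A minimal sketch of this default follows. The data values are made up for illustration.

```sas
/* Create a data set; it becomes the most recently created one. */
data emp;
   input Name $ Salary;
   datalines;
Hayes 250
Fuller 200
;
run;

/* No DATA= option: the procedure reads the data set named by  */
/* _LAST_=, which here is WORK.EMP (assuming that no USER      */
/* library is defined).                                        */
proc print;
run;
```

Specifying DATA= explicitly is usually safer in longer programs, because an intervening step can change which data set was created last.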


    RUN-Group Processing

    RUN-group processing enables you to submit a PROC step with a RUN statement without ending the procedure. You can continue to use the procedure without issuing another PROC statement. To end the procedure, use a RUN CANCEL or a QUIT statement. Several Base SAS procedures support RUN-group processing:

    CATALOG

    DATASETS

    PLOT

    PMENU

    TRANTAB

    See the section on the individual procedure for more information.

    Note: PROC SQL executes each query automatically. Neither the RUN nor RUN CANCEL statement has any effect.
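The sketch below shows RUN-group processing with PROC DATASETS, one of the procedures listed above. The data set names A, B, and FIRST are hypothetical.

```sas
/* Two small temporary data sets to work with. */
data work.a; x=1; run;
data work.b; y=2; run;

proc datasets library=work nolist;
   change a=first;   /* rename WORK.A to WORK.FIRST          */
run;                 /* ends the RUN group, not the procedure */

   delete b;         /* second RUN group, same PROC step     */
run;
quit;                /* ends the procedure                   */
```

Each RUN statement submits the statements that precede it, and the procedure stays active until the QUIT statement.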

    Creating Titles That Contain BY-Group Information

    BY-Group Processing

    BY-group processing uses a BY statement to process observations that are ordered, grouped, or indexed according to the values of one or more variables. By default, when you use BY-group processing in a procedure step, a BY line identifies each group. This section explains how to create titles that serve as customized BY lines.

    Suppressing the Default BY Line

    When you insert BY-group processing information into a title, you usually want to suppress the default BY line. To suppress it, use the SAS system option NOBYLINE.

    Note: You must use the NOBYLINE option if you insert BY-group information into titles for the following Base SAS procedures:

    MEANS

    PRINT

    STANDARD

    SUMMARY

    If you use the BY statement with the NOBYLINE option, then these procedures always start a new page for each BY group. This behavior prevents multiple BY groups from appearing on a single page and ensures that the information in the titles matches the report on the pages.

    Inserting BY-Group Information into a Title

    The general form for inserting BY-group information into a title is as follows:

    #BY-specification<.suffix>

    BY-specification
        is one of the following specifications:

        BYVALn | BYVAL(BY-variable)
            places the value of the specified BY variable in the title. You specify the BY variable with one of the following options:

            n
                is the nth BY variable in the BY statement.

            BY-variable
                is the name of the BY variable whose value you want to insert in the title.

        BYVARn | BYVAR(BY-variable)
            places the label or the name (if no label exists) of the specified BY variable in the title. You designate the BY variable with one of the following options:

            n
                is the nth BY variable in the BY statement.

            BY-variable
                is the name of the BY variable whose name you want to insert in the title.

        BYLINE
            inserts the complete default BY line into the title.

    suffix
        supplies text to place immediately after the BY-group information that you insert in the title. No space appears between the BY-group information and the suffix.

    Example: Inserting a Value from Each BY Variable into the Title

    This example demonstrates these actions:

    1 creates a data set, GROC, that contains data for stores from four regions. Each store has four departments. See GROC on page 1591 for the DATA step that creates the data set.
    2 sorts the data by Region and Department.
    3 uses the SAS system option NOBYLINE to suppress the BY line that normally appears in output that is produced with BY-group processing.
    4 uses PROC CHART to chart sales by Region and Department. In the first TITLE statement, #BYVAL2 inserts the value of the second BY variable, Department, into the title. In the second TITLE statement, #BYVAL(Region) inserts the value of Region into the title. The first period after Region indicates that a suffix follows. The second period is the suffix.
    5 uses the SAS system option BYLINE to return to the creation of the default BY line with BY-group processing.

    /* Create the GROC data set. */
    data groc;
       input Region $9. Manager $ Department $ Sales;
       datalines;
    Southeast Hayes Paper 250
    Southeast Hayes Produce 100
    Southeast Hayes Canned 120
    Southeast Hayes Meat 80
    ...more lines of data...
    Northeast Fuller Paper 200
    Northeast Fuller Produce 300
    Northeast Fuller Canned 420
    Northeast Fuller Meat 125
    ;

    /* Sort the data by the BY variables. */
    proc sort data=groc;
       by region department;
    run;

    /* Suppress the default BY line. */
    options nobyline nodate pageno=1
            linesize=64 pagesize=20;

    /* Chart sales; the titles carry the BY-group values. */
    proc chart data=groc;
       by region department;
       vbar manager / type=sum sumvar=sales;
       title1 'This chart shows #byval2 sales';
       title2 'in the #byval(region)..';
    run;

    /* Restore the default BY line. */
    options byline;

    This partial output shows two BY groups with customized BY lines:

    This chart shows Canned sales                    1
    in the Northwest.

    [Vertical bar chart: Sales Sum (axis from 100 to 400) by Manager (Aikmann, Duncan, Jeffreys)]

    This chart shows Meat sales                      2
    in the Northwest.

    [Vertical bar chart: Sales Sum (axis from 15 to 75) by Manager (Aikmann, Duncan, Jeffreys)]


    Example: Inserting the Name of a BY Variable into a Title

    This example inserts the name of a BY variable and the value of a BY variable into the title. The program does these actions:

    1 uses the SAS system option NOBYLINE to suppress the BY line that normally appears in output that is produced with BY-group processing.
    2 uses PROC CHART to chart sales by Region. In the first TITLE statement, #BYVAR(Region) inserts the name of the variable Region into the title. (If Region had a label, #BYVAR would use the label instead of the name.) The suffix al is appended to the label. In the second TITLE statement, #BYVAL1 inserts the value of the first BY variable, Region, into the title.
    3 uses the SAS system option BYLINE to return to the creation of the default BY line with BY-group processing.

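    /* Suppress the default BY line. */
    options nobyline nodate pageno=1
            linesize=64 pagesize=20;

    /* Chart mean sales by Region; #BYVAR supplies the variable name. */
    proc chart data=groc;
       by region;
       vbar manager / type=mean sumvar=sales;
       title1 '#byvar(region).al Analysis';
       title2 'for the #byval1';
    run;

    /* Restore the default BY line. */
    options byline;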

    This partial output shows one BY group with a customized BY line:

    Regional Analysis                                1
    for the Northwest

    [Vertical bar chart: Sales Mean (axis from 100 to 300) by Manager (Aikmann, Duncan, Jeffreys)]

    Example: Inserting the Complete BY Line into a Title

    This example inserts the complete BY line into the title. The program does these actions:

    1 uses the SAS system option NOBYLINE to suppress the BY line that normally appears in output that is produced with BY-group processing.
    2 uses PROC CHART to chart sales by Region and Department. In the TITLE statement, #BYLINE inserts the complete BY line into the title.
    3 uses the SAS system option BYLINE to return to the creation of the default BY line with BY-group processing.

    /* Suppress the default BY line. */
    options nobyline nodate pageno=1
            linesize=64 pagesize=20;

    /* Chart sales; #BYLINE places the complete BY line in the title. */
    proc chart data=groc;
       by region department;
       vbar manager / type=sum sumvar=sales;
       title 'Information for #byline';
    run;

    /* Restore the default BY line. */
    options byline;

    This partial output shows two BY groups with customized BY lines:

    Information for Region=Northwest Department=Canned        1

    [Vertical bar chart: Sales Sum (axis from 100 to 400) by Manager (Aikmann, Duncan, Jeffreys)]

    Information for Region=Northwest Department=Meat          2

    [Vertical bar chart: Sales Sum (axis from 15 to 75) by Manager (Aikmann, Duncan, Jeffreys)]

    Error Processing of BY-Group Specifications

    SAS does not issue error or warning messages for incorrect #BYVAL, #BYVAR, or #BYLINE specifications. Instead, the text of the item becomes part of the title.

    Shortcuts for Specifying Lists of Variable Names

    Several statements in procedures allow multiple variable names. You can use these shortcut notations instead of specifying each variable name:

    Notation         Meaning

    x1-xn            specifies variables X1 through Xn. The numbers must be consecutive.
    x:               specifies all variables that begin with the letter X.
    x--a             specifies all variables between X and A, inclusive. This notation uses the position of the variables in the data set.
    x-numeric-a      specifies all numeric variables between X and A, inclusive. This notation uses the position of the variables in the data set.
    x-character-a    specifies all character variables between X and A, inclusive. This notation uses the position of the variables in the data set.
    _numeric_        specifies all numeric variables.
    _character_      specifies all character variables.
    _all_            specifies all variables.

    Note: You cannot use shortcuts to list variable names in the INDEX CREATE statement in PROC DATASETS.

    See the SAS Language Reference: Concepts for complete documentation.
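The shortcut notations can be tried on a small data set. The sketch below uses a hypothetical data set, SCORES, whose variables are stored in the order Name, Team, q1, q2, q3, Total.

```sas
/* Hypothetical data set; variable order matters for the */
/* positional notations.                                 */
data scores;
   input Name $ Team $ q1 q2 q3 Total;
   datalines;
Ann Red 8 9 7 24
Bob Blue 6 7 9 22
;
run;

proc print data=scores;
   var q1-q3;                  /* numbered range: q1, q2, q3         */
run;

proc print data=scores;
   var q:;                     /* every variable that begins with q  */
run;

proc print data=scores;
   var Team--q2;               /* positional range: Team, q1, q2     */
run;

proc print data=scores;
   var Name-character-Total;   /* character variables in that range: */
run;                           /* Name and Team                      */

proc means data=scores;
   var _numeric_;              /* all numeric variables              */
run;
```

The positional forms depend on the order in which the variables were created, so the same list can select different variables in a differently ordered data set.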

    Formatted Values

    Using Formatted Values

    Typically, when you print or group variable values, Base SAS procedures use the formatted values. This section contains examples of how Base SAS procedures use formatted values.

    Example: Printing the Formatted Values for a Data Set

    The following example prints the formatted values of the data set PROCLIB.PAYROLL. (See PROCLIB.PAYROLL on page 1598 for details about the DATA step that creates this data set.) In PROCLIB.PAYROLL, the variable Jobcode indicates the job and level of the employee. For example, TA1 indicates that the employee is at the beginning level for a ticket agent.

    libname proclib 'SAS-library';

    options nodate pageno=1
            linesize=64 pagesize=40;

    proc print data=proclib.payroll(obs=10) noobs;
       title 'PROCLIB.PAYROLL';
       title2 'First 10 Observations Only';
    run;


    The following example is a partial printing of PROCLIB.PAYROLL:

    PROCLIB.PAYROLL                                  1
    First 10 Observations Only

    IdNumber    Gender    Jobcode    Salary    Birth      Hired

    1919        M         TA2         34376    12SEP60    04JUN87
    1653        F         ME2         35108    15OCT64    09AUG90
    1400        M         ME1         29769    05NOV67    16OCT90
    1350        F         FA3         32886    31AUG65    29JUL90
    1401        M         TA3         38822    13DEC50    17NOV85
    1499        M         ME3         43025    26APR54    07JUN80
    1101        M         SCP         18723    06JUN62    01OCT90
    1333        M         PT2         88606    30MAR61    10FEB81
    1402        M         TA2         32615    17JAN63    02DEC90
    1479        F         TA3         38785    22DEC68    05OCT89

    The following PROC FORMAT step creates the format $JOBFMT., which assigns descriptive names for each job:

    proc format;
       value $jobfmt
          'FA1'='Flight Attendant Trainee'
          'FA2'='Junior Flight Attendant'
          'FA3'='Senior Flight Attendant'
          'ME1'='Mechanic Trainee'
          'ME2'='Junior Mechanic'
          'ME3'='Senior Mechanic'
          'PT1'='Pilot Trainee'
          'PT2'='Junior Pilot'
          'PT3'='Senior Pilot'
          'TA1'='Ticket Agent Trainee'
          'TA2'='Junior Ticket Agent'
          'TA3'='Senior Ticket Agent'
          'NA1'='Junior Navigator'
          'NA2'='Senior Navigator'
          'BCK'='Baggage Checker'
          'SCP'='Skycap';
    run;

    The FORMAT statement in this PROC MEANS step temporarily associates the $JOBFMT. format with the variable Jobcode:

    options nodate pageno=1
            linesize=64 pagesize=60;

    proc means data=proclib.payroll mean max;
       class jobcode;
       var salary;
       format jobcode $jobfmt.;
       title 'Summary Statistics for';
       title2 'Each Job Code';
    run;


    PROC MEANS produces this output, which uses the $JOBFMT. format:

    Summary Statistics for                           1
    Each Job Code

    The MEANS Procedure

    Analysis Variable : Salary

                                 N
    Jobcode                    Obs        Mean     Maximum
    ------------------------------------------------------
    Baggage Checker              9    25794.22    26896.00
    Flight Attendant Trainee    11    23039.36    23979.00
    Junior Flight Attendant     16    27986.88    28978.00
    Senior Flight Attendant      7    32933.86    33419.00
    Mechanic Trainee             8    28500.25    29769.00
    Junior Mechanic             14    35576.86    36925.00
    Senior Mechanic              7    42410.71    43900.00
    Junior Navigator             5    42032.20    43433.00
    Senior Navigator             3    52383.00    53798.00
    Pilot Trainee                8    67908.00    71349.00
    Junior Pilot                10    87925.20    91908.00
    Senior Pilot                 2    10504.50    11379.00
    Skycap                       7    18308.86    18833.00
    Ticket Agent Trainee         9    27721.33    28880.00
    Junior Ticket Agent         20    33574.95    34803.00
    Senior Ticket Agent         12    39679.58    40899.00
    ------------------------------------------------------

    Note: Because formats are character strings, formats for numeric variables are ignored when the values of the numeric variables are needed for mathematical calculations.

    Example: Grouping or Classifying Formatted Data

    If you use a formatted variable to group or classify data, then the procedure uses the formatted values. The following example creates and assigns a format, $CODEFMT., that groups the levels of each job code into one category. PROC MEANS calculates statistics based on the groupings of the $CODEFMT. format.

    proc format;
       value $codefmt
          'FA1','FA2','FA3'='Flight Attendant'
          'ME1','ME2','ME3'='Mechanic'
          'PT1','PT2','PT3'='Pilot'
          'TA1','TA2','TA3'='Ticket Agent'
          'NA1','NA2'='Navigator'
          'BCK'='Baggage Checker'
          'SCP'='Skycap';
    run;

    options nodate pageno=1
            linesize=64 pagesize=40;

    proc means data=proclib.payroll mean max;
       class jobcode;
       var salary;
       format jobcode $codefmt.;
       title 'Summary Statistics for Job Codes';
       title2 '(Using a Format that Groups the Job Codes)';
    run;

    PROC MEANS produces this output:

    Summary Statistics for Job Codes                 1
    (Using a Format that Groups the Job Codes)

    The MEANS Procedure

    Analysis Variable : Salary

                                 N
    Jobcode                    Obs        Mean     Maximum
    ------------------------------------------------------
    Baggage Checker              9    25794.22    26896.00
    Flight Attendant            34    27404.71    33419.00
    Mechanic                    29    35274.24    43900.00
    Navigator                    8    45913.75    53798.00
    Pilot                       20    72176.25    91908.00
    Skycap                       7    18308.86    18833.00
    Ticket Agent                41    34076.73    40899.00
    ------------------------------------------------------

    Example: Temporarily Associating a Format with a Variable

    If you want to associate a format with a variable temporarily, then you can use the FORMAT statement. For example, the following PROC PRINT step associates the DOLLAR8. format with the variable Salary for the duration of this PROC PRINT step only:

    options nodate pageno=1
            linesize=64 pagesize=40;

    proc print data=proclib.payroll(obs=10) noobs;
       format salary dollar8.;
       title 'Temporarily Associating a Format';
       title2 'with the Variable Salary';
    run;


    PROC PRINT produces this output:

    Temporarily Associating a Format                 1
    with the Variable Salary

    IdNumber    Gender    Jobcode     Salary    Birth      Hired

    1919        M         TA2        $34,376    12SEP60    04JUN87
    1653        F         ME2        $35,108    15OCT64    09AUG90
    1400        M         ME1        $29,769    05NOV67    16OCT90
    1350        F         FA3        $32,886    31AUG65    29JUL90
    1401        M         TA3        $38,822    13DEC50    17NOV85
    1499        M         ME3        $43,025    26APR54    07JUN80
    1101        M         SCP        $18,723    06JUN62    01OCT90
    1333        M         PT2        $88,606    30MAR61    10FEB81
    1402        M         TA2        $32,615    17JAN63    02DEC90
    1479        F         TA3        $38,785    22DEC68    05OCT89

    Example: Temporarily Dissociating a Format from a Variable

    If a variable has a permanent format that you do not want a procedure to use, then temporarily dissociate the format from the variable by using a FORMAT statement.

    In this example, the FORMAT statement in the DATA step permanently associates the $YRFMT. format with the variable Year. Thus, when you use the variable in a PROC step, the procedure uses the formatted values. The PROC MEANS step, however, contains a FORMAT statement that dissociates the $YRFMT. format from Year for this PROC MEANS step only. PROC MEANS uses the stored value for Year in the output.

    proc format;
       value $yrfmt '1'='Freshman'
                    '2'='Sophomore'
                    '3'='Junior'
                    '4'='Senior';
    run;

    data debate;
       input Name $ Gender $ Year $ GPA @@;
       format year $yrfmt.;
       datalines;
    Capiccio m 1 3.598 Tucker m 1 3.901
    Bagwell f 2 3.722 Berry m 2 3.198
    Metcalf m 2 3.342 Gold f 3 3.609
    Gray f 3 3.177 Syme f 3 3.883
    Baglione f 4 4.000 Carr m 4 3.750
    Hall m 4 3.574 Lewis m 4 3.421
    ;

    options nodate pageno=1
            linesize=64 pagesize=40;

    /* A FORMAT statement with no format name removes the format for this step. */
    proc means data=debate mean maxdec=2;
       class year;
       format year;
       title 'Average GPA';
    run;


    PROC MEANS produces this output, which does not use the $YRFMT. format:

    Average GPA                                      1

    The MEANS Procedure

    Analysis Variable : GPA

              N
    Year    Obs            Mean
    -------------------------------
    1         2            3.75
    2         3            3.42
    3         3            3.56
    4         4            3.69
    -------------------------------

    Formats and BY-Group Processing

    When a procedure processes a data set, it checks to determine whether a format is assigned to the BY variable. If it is, then the procedure adds observations to the current BY group until the formatted value changes. If nonconsecutive internal values of the BY variables have the same formatted value, then the values are grouped into different BY groups. Thus, two BY groups are created with the same formatted value. Further, if different and consecutive internal values of the BY variables have the same formatted value, then they are included in the same BY group.
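The following sketch is one way to observe both cases in the preceding paragraph. The format LVLFMT. and the data set READINGS are hypothetical. The internal values 1 and 2 are consecutive and share the formatted value Low, while the value 4 also formats as Low but is separated from them by the value 3.

```sas
/* Hypothetical format: 1, 2, and 4 map to Low; 3 maps to High. */
proc format;
   value lvlfmt 1,2='Low'
                3='High'
                4='Low';
run;

/* The data are already in order of the internal values of Level. */
data readings;
   input Level Value;
   format Level lvlfmt.;
   datalines;
1 10
2 20
3 30
4 40
;
run;

/* By the rule above, expect three BY groups in this order:   */
/* Low (internal values 1 and 2), High (3), and Low again (4). */
proc means data=readings sum;
   by Level;
   var Value;
run;
```

If a single Low group is wanted, one option is to order the data so that all internal values that share a formatted value are adjacent.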

    Formats and Error Checking

    If SAS cannot find a format, then it stops processing and prints an error message in the SAS log. You can suppress this behavior with the SAS system option NOFMTERR. If you use NOFMTERR, and SAS cannot find the format, then SAS uses a default format and continues processing. Typically, for the default, SAS uses the BESTw. format for numeric variables and the $w. format for character variables.

    Note: To ensure that SAS can find user-written formats, use the SAS system option FMTSEARCH=. How to store formats is described in Storing Informats and Formats on page 532.
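A minimal sketch of these options follows. It assumes that 'SAS-library' is replaced with a real location and that PROCLIB.PAYROLL exists as described earlier. The format $GENDER. is hypothetical.

```sas
libname proclib 'SAS-library';

/* Store the format in the catalog PROCLIB.FORMATS instead of */
/* the temporary catalog WORK.FORMATS.                        */
proc format library=proclib;
   value $gender 'F'='Female'
                 'M'='Male';      /* hypothetical format */
run;

/* Add PROCLIB to the list of libraries that SAS searches for */
/* format catalogs.                                           */
options fmtsearch=(proclib);

/* Optional: if a format still cannot be found, continue with */
/* a default format instead of stopping with an error.        */
options nofmterr;

proc print data=proclib.payroll(obs=5);
   format Gender $gender.;
run;
```

NOFMTERR is convenient when you receive a data set without its format catalog, but it also hides a missing format that you may have meant to supply.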

    Processing All the Data Sets in a Library

    You can use the SAS Macro Facility to run the same procedure on every data set in a library. The macro facility is part of the Base SAS software.

    Example 9 on page 873 shows how to print all the data sets in a library. You can use the same macro definition to perform any procedure on all the data sets in a library. Simply replace the PROC PRINT piece of the program with the appropriate procedure code.
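The sketch below is one possible approach and is not the macro from Example 9. It assumes that the libref passed to the macro is already assigned. The macro name PRINTALL and the macro variables DS1, DS2, and so on are hypothetical names chosen for illustration.

```sas
%macro printall(lib);
   %local i n;

   /* Put the name of each data set in the library into the */
   /* macro variables DS1, DS2, and so on.                  */
   proc sql noprint;
      select memname into :ds1-:ds9999
         from dictionary.tables
         where libname=upcase("&lib") and memtype='DATA';
   quit;

   /* SQLOBS holds the number of rows that the query selected. */
   %let n=&sqlobs;

   /* Run the procedure once for each data set. Replace the */
   /* PROC PRINT step to run a different procedure.         */
   %do i=1 %to &n;
      proc print data=&lib..&&ds&i;
         title "Data set &lib..&&ds&i";
      run;
   %end;
%mend printall;

%printall(proclib)
```

The loop body is the only part that depends on the procedure, which is why swapping in another PROC step is enough to reuse the macro.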

    Operating Environment-Specific Procedures

    Several Base SAS procedures are specific to one operating environment or one release. Appendix 2, Operating Environment-Specific Procedures, on page 1551 contains a table with additional information. These procedures are described in more detail in the SAS documentation for operating environments.


    Statistic Descriptions

    The following table identifies common descriptive statistics that are available in several Base SAS procedures. See Keywords and Formulas on page 1516 for more detailed information about available statistics and theoretical information.

    Table 2.1 Common Descriptive Statistics That Base SAS Procedures Calculate

    Statistic             Description                                                     Procedures

    confidence intervals                                                                  FREQ, MEANS/SUMMARY, TABULATE, UNIVARIATE
    CSS                   corrected sum of squares                                        CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    CV                    coefficient of variation                                        MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    goodness-of-fit tests                                                                 FREQ, UNIVARIATE
    KURTOSIS              kurtosis                                                        MEANS/SUMMARY, TABULATE, UNIVARIATE
    MAX                   largest (maximum) value                                         CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    MEAN                  mean                                                            CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    MEDIAN                median (50th percentile)                                        CORR (for nonparametric correlation measures), MEANS/SUMMARY, TABULATE, UNIVARIATE
    MIN                   smallest (minimum) value                                        CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    MODE                  most frequent value (if not unique, the smallest mode is used)  UNIVARIATE
    N                     number of observations on which calculations are based          CORR, FREQ, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    NMISS                 number of missing values                                        FREQ, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    NOBS                  number of observations                                          MEANS/SUMMARY, UNIVARIATE
    PCTN                  the percentage of a cell or row frequency to a total frequency  REPORT, TABULATE
    PCTSUM                the percentage of a cell or row sum to a total sum              REPORT, TABULATE
    Pearson correlation                                                                   CORR
    percentiles                                                                           FREQ, MEANS/SUMMARY, REPORT, TABULATE, UNIVARIATE
    RANGE                 range                                                           CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    robust statistics     trimmed means, Winsorized means                                 UNIVARIATE
    SKEWNESS              skewness                                                        MEANS/SUMMARY, TABULATE, UNIVARIATE
    Spearman correlation                                                                  CORR
    STD                   standard deviation                                              CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    STDERR                the standard error of the mean                                  MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    SUM                   sum                                                             CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    SUMWGT                sum of weights                                                  CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    tests of location                                                                     UNIVARIATE
    USS                   uncorrected sum of squares                                      CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE
    VAR                   variance                                                        CORR, MEANS/SUMMARY, REPORT, SQL, TABULATE, UNIVARIATE

    Computational Requirements for Statistics

    The following computational requirements are for the statistics that are listed in Table 2.1 on page 32. They do not describe recommended sample sizes.

    N and NMISS do not require any nonmissing observations.
    SUM, MEAN, MAX, MIN, RANGE, USS, and CSS require at least one nonmissing observation.
    VAR, STD, STDERR, and CV require at least two observations.
    CV requires that MEAN is not equal to zero.

    Statistics are reported as missing if they cannot be computed.
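These requirements can be checked with a data set that has a single observation. The data set ONE in this sketch is hypothetical.

```sas
/* Hypothetical data set with exactly one nonmissing observation. */
data one;
   x=5;
run;

/* N, NMISS, and MEAN can be computed from one observation.    */
/* STD and CV need at least two observations, so PROC MEANS is */
/* expected to report them as missing.                         */
proc means data=one n nmiss mean std cv;
   var x;
run;
```

A missing value in the output for STD or CV therefore indicates too few observations (or, for CV, a mean of zero), not an error in the step.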

    Output Delivery System

    The Output Delivery System (ODS) gives you greater flexibility in generating, storing, and reproducing SAS procedure and DATA step output, with a wide range of formatting options. ODS provides formatting functionality that is not available from individual procedures or from