Download - The Complete Guide to SAS Indexes by Michael A. Raithel Reviewed by Horst Wolter

Transcript

Slide 1

The Complete Guide to SAS IndexesbyMichael A. Raithel

Reviewed by Horst Wolter

Slide 2

About the Author

• Wrote two previous SAS books and over twenty SAS technical papers.

• Section Chair at SUGI, SESUG and NESUG and co-chaired NESUG in 1995.

• First book entitled Tuning SAS Applications in the MVS Environment resides in the Smithsonian Institution of American History’s Permanent Research Collection of Information Technology.

Slide 3

Goals of Indexing

• The main goal of using a SAS index is to read only a small portion of a large SAS dataset instead of reading the entire SAS dataset.

• Reduce I/Os.

• Reduce wall clock time.

• Reduce CPU time.

Slide 4

When to Index

Consider

• the size of the subsets (the overhead of using an index can become greater than the overhead of a sequential read of the entire dataset)

• frequency of use (is building an index cost effective)

• variability of the data (indexed SAS datasets have an overhead to maintain if data changes often and may not be cost effective)

Slide 5

1% to 15% An index will definitely improve processing. There should be dramatic resource savings in the lower end of this range.

16% to 30% An index will improve processing. However, the resource savings will not be as dramatic as in the lower range.

31% to 60% An index may improve processing, or it might worsen processing. Be very careful in this subset range.

61% to 100% Do not use an index. A sequential read of the entire dataset is very likely to be more efficient.

Indexing ActionSubset Size

Indexing Guidelines(Table 2.1 from book)

Slide 6

Index Variable Selection

What variables to consider for an index?

• Often used to subset data

• Values that represent a small subset

• A SAS dataset sorted by the index variable is more efficient

Slide 7

How to create an Index

• Data Step Method

DATA data-set-name(INDEX=(index-name=(var1 var2 etc.) </UNIQUE> </NOMISS>));

• SQL Method

proc sql;CREATE <UNIQUE> INDEX index-name ON table-name ( column

<, ... column>); quit;

Slide 8

Viewing Index Information

proc contents data=data-set-name <centiles>;run;

Indexes 1

Alphabetic List of Indexes and Attributes # of Unique # Index Values 1 var1 1535958

Slide 9

Using Indexes

• With a WHERE clause

• With a BY statement

SAS determines whether or not to use an index based on a number of rules as described in the book.

Slide 10

Other Topics covered in the book

• Composite indexes

• Creating indexes while using PROC SORT, PROC DATASETS

• Creating Multiple Indexes

• Rules for SAS Using an Index

• Removing Indexes

• Index options e.g. Unique Option, NoMiss Option

• Recovering Missing Index Files

• Repairing Damaged Index Files

• And more

Slide 11

From SAS Press

Slide 12

Questions?