WELCOME TO: WHAT, WHEN, WHY OF SAS /ACCESS
Copyright © 2015, SAS Institute Inc. All rights reserved.
WELCOME TO:
WHAT, WHEN, WHY OF SAS®/ACCESS
Presented by
Jeff Simpson
SAS Customer Loyalty
Goal and takeaways
By the end of this meeting, you will understand the key characteristics, capabilities, and efficiencies of SAS/ACCESS interfaces.
• What are SAS/ACCESS interfaces?
• What capabilities do they provide?
• When should they be used? Why?
• Performance tips and hints (Hadoop, ODBC, Oracle, Netezza, SQL Server, Teradata)
• Recommended Resources
Remember this
Conduct as much in-database processing as possible so your analytics can run faster.
[Diagram: With In-Database (SAS, data)]
Why use SAS/ACCESS interfaces?
So your analytics can consume and disseminate diverse data sources
and targets
Distinguishing DBMS and SAS datasets

Factors to Consider      | DBMS                   | SAS dataset
operating system         | any                    | any
purpose                  | transactional          | analytics
concurrent / multi-user  | yes                    | yes
language                 | SQL                    | SAS & SQL
method                   | (non)sequential reads  | (non)sequential reads
tenure                   | long established       | long established (1976)
scalable                 | yes                    | yes
What data can Base SAS read from / write to? (without SAS/ACCESS)
• delimited
• flat file
• XML
• mainframe VSAM, EBCDIC
• ASCII
• JMP
What are SAS/ACCESS interfaces?
Conduct read/write operations to/from SAS
• Aster Data
• DB2
• Cloudera Impala
• Greenplum
• IBM PureData
• MySQL
• ODBC and OLE DB
• Oracle & Oracle Exadata
• PC file formats
• PostgreSQL (including Amazon Redshift)
• SAP HANA
• Hadoop
• SQL Server
• Sybase
• Teradata
• others
What operations can SAS/ACCESS interfaces perform?
• APPEND
• INSERT
• LOAD / FAST LOAD
• READ
• UPDATE
• WRITE
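As a rough sketch of how these operations can look in code, assuming a hypothetical Oracle library: the connection values, the SALES tables, the WORK datasets, and the column names below are placeholders for illustration and do not come from the presentation.

```sas
/* Hypothetical connection values; replace with your own. */
libname ora oracle user=&userid password=&pwd path=mydb schema=sales_owner;

/* READ: pull a subset of a DBMS table into a SAS dataset. */
data work.recent;
   set ora.sales(where=(order_year = 2015));
run;

/* WRITE: create a new DBMS table from a SAS dataset. */
data ora.recent_copy;
   set work.recent;
run;

/* APPEND / INSERT: add rows to an existing DBMS table. */
proc append base=ora.sales data=work.new_rows;
run;

/* UPDATE: change rows in place in the DBMS table. */
proc sql;
   update ora.sales
      set discount = 0
      where discount is null;
quit;

/* LOAD / FAST LOAD: BULKLOAD=YES asks the interface to use the DBMS bulk loader. */
proc append base=ora.sales_history(bulkload=yes) data=work.history;
run;
```

Once the LIBNAME statement is assigned, the DBMS tables are addressed with the same two-level names as SAS datasets, which is why ordinary DATA steps and procedures can be used here.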
When should we use SAS/ACCESS interfaces?
Why not just write SQL?
A rank example
WITH "subquery0" ( "COSTPRICE_PER_UNIT", "DISCOUNT", "ORDER_ID", "ORDER_ITEM_NUM",
"PRODUCT_ID", "QUANTITY", "TOTAL_RETAIL_PRICE" ) AS ( SELECT "COSTPRICE_PER_UNIT", "DISCOUNT",
"ORDER_ID", "ORDER_ITEM_NUM", "PRODUCT_ID", "QUANTITY", "TOTAL_RETAIL_PRICE" FROM
"DB2_ORDER_ITEM" ) SELECT "table0"."ORDER_ID", "table0"."ORDER_ITEM_NUM",
"table0"."PRODUCT_ID", "table0"."QUANTITY", "table0"."TOTAL_RETAIL_PRICE",
"table0"."COSTPRICE_PER_UNIT", "table0"."DISCOUNT", "table2"."rankalias1" AS "QUANTITYRANK",
"table1"."rankalias0" AS "PRODUCTRANK" FROM "subquery0" AS "table0" LEFT JOIN ( SELECT DISTINCT
"PRODUCT_ID", "tempcol0" AS "rankalias0" FROM ( SELECT "PRODUCT_ID", MIN( "tempcol1" ) OVER (
PARTITION BY "PRODUCT_ID" ) AS "tempcol0" FROM ( SELECT "PRODUCT_ID", CAST( ROW_NUMBER() OVER (
ORDER BY "PRODUCT_ID" DESC ) AS DOUBLE PRECISION ) AS "tempcol1" FROM "subquery0" WHERE ( (
"PRODUCT_ID" IS NOT NULL ) ) ) AS "subquery2" ) AS "subquery1" ) AS "table1" ON ( (
"table0"."PRODUCT_ID" = "table1"."PRODUCT_ID" ) ) LEFT JOIN ( SELECT DISTINCT "QUANTITY",
"tempcol2" AS "rankalias1" FROM ( SELECT "QUANTITY", MIN( "tempcol3" ) OVER ( PARTITION BY
"QUANTITY" ) AS "tempcol2" FROM ( SELECT "QUANTITY", CAST( ROW_NUMBER() OVER ( ORDER BY
"QUANTITY" DESC ) AS DOUBLE PRECISION ) AS "tempcol3" FROM "subquery0" WHERE ( ( "QUANTITY" IS
NOT NULL ) ) ) AS "subquery4" ) AS "subquery3" ) AS "table2" ON ( ( "table0"."QUANTITY" =
"table2"."QUANTITY" ) )
Why not just write SQL?
Because writing SAS is shorter, faster, and easier to maintain
PROC RANK example
proc rank data=indb2.db2_order_item out=work.order descending ties=low;
   var quantity product_id;
   ranks QuantityRank ProductRank;
run;
PROC RANK can also run in-database
How are SAS/ACCESS interfaces licensed?
It depends
• Part of a package, like SAS Office Analytics
• A la carte / individually (Base, SAS/STAT, SAS/ACCESS)
• both
Distinguishing ODBC & Database-Specific SAS/ACCESS Interfaces (1 of 2)
SAS/ACCESS Interface to ODBC
• ODBC drivers connect SAS and other technologies to/from any ODBC-enabled data source/target
• Microsoft and others provide Windows ODBC drivers free or at a minimal cost
• ODBC drivers in non-Windows environments can be costly
• ODBC drivers come with your database or are purchased separately
Distinguishing ODBC & Database-Specific SAS/ACCESS Interfaces (2 of 2)
[Diagram, SAS/ACCESS Interface to ODBC: SAS, ODBC program interface, ODBC driver, data]
[Diagram, SAS/ACCESS Interface to [Oracle, Teradata, DB2, etc.]: SAS, DBMS client installed and configured, data]
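The difference between the two routes shows up in the LIBNAME statement. The following is a minimal sketch, assuming a hypothetical ODBC data source name (my_dsn), a hypothetical Oracle path (mydb), and a hypothetical CUSTOMERS table; none of these names come from the slides.

```sas
/* ODBC engine: SAS reaches the database through an ODBC driver,
   identified here by a configured data source name (DSN). */
libname viaodbc odbc dsn=my_dsn user=&userid password=&pwd;

/* Database-specific engine: SAS uses the DBMS client software
   that is installed and configured on the SAS machine. */
libname viaora oracle path=mydb user=&userid password=&pwd;

/* Either way, tables are then referenced like members of any SAS library. */
proc contents data=viaora.customers;
run;
```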
Distinguishing ODBC and OLE DB

Factors to Consider            | ODBC                                   | OLE DB
operating system               | Windows & Unix                         | Windows
multidimensional data support  | no                                     | yes
concurrent / multi-user        | no                                     | yes
method                         | SQL                                    | multiple
terminology                    | driver                                 | provider
tenure                         | long established                       | newer
costs                          | more low/no cost resources on Windows  | fewer low/no cost resources

More details: http://ftp.sas.com/techsup/download/v8papers/odbcdb.pdf
Distinguishing ODBC and SAS-Supplied Drivers

Database                                                    | SAS-supplied Driver? | ODBC-Based?
Aster, Impala, Informix, Netezza, ODBC, Sybase IQ, Vertica  | no                   | yes
DB2, Hadoop, MySQL, Oracle, Sybase, Teradata                | no                   | no
Greenplum, PostgreSQL, SAP HANA, SQL Server                 | yes                  | yes
PC Files                                                    | not applicable       | not applicable

DBMS Requirements and Configuration Notes:
http://support.sas.com/documentation/installcenter/en/ikfdtnunxcg/66380/PDF/default/config.pdf
System Requirements Notes:
http://support.sas.com/documentation/installcenter/en/ikfdtnlaxsr/66396/PDF/default/sreq.pdf
Support for Hadoop
Foundation SAS offers support for Hadoop through
• Base SAS
• SAS/ACCESS Interface to Hadoop (Hive)
• SAS/ACCESS Interface to HAWQ
• SAS/ACCESS Interface to Impala
What about Amazon Redshift?
SAS supports Amazon Redshift via SAS/ACCESS Interface to ODBC.
An ODBC driver is available from Amazon. Information about it can be found here:
http://docs.aws.amazon.com/redshift/latest/mgmt/install-odbc-driver-linux.html
http://docs.aws.amazon.com/redshift/latest/mgmt/odbc-driver-configure-linux-mac.html
Once this ODBC driver is installed and configured, the SAS/ACCESS Interface to ODBC engine can connect to it:
http://support.sas.com/documentation/cdl/en/acreldb/67589/HTML/default/viewer.htm#p1g72kbb0m01y1n1gm1lh532n5ru.htm
This SAS Global Forum paper gives more detail on how these technologies operate together:
http://support.sas.com/resources/papers/proceedings15/SAS1789-2015.pdf
Distinguishing Traditional Processing & In-Database
Conduct as much in-database processing as possible
[Diagram: Without In-Database vs. With In-Database (SAS, program interface, data)]
Avoid heterogeneous joins
[Diagram: SAS and a SAS dataset, connected through the SAS/ACCESS Interface (program interface) to data in Hadoop, ODBC, Oracle, Teradata]
• Join takes place on SAS server
• ALL data moves to SAS first
• SAS extracts, queries, summarizes…
• Your results may cause more data movement…
Minimize Data Returned to SAS for Processing
Avoid heterogeneous or federated joins

Homogeneous (both tables are SAS datasets)

LIBNAME MTG '/sas/data/mortgage/';
LIBNAME HPI '/sas/data/housing_data/';
PROC SQL;
   CREATE TABLE MTG.MYDATA AS
   SELECT M.LTV, H.CURR_PROP_AMT
   FROM MTG.MORTGAGE_DATA AS M
   JOIN HPI.HOUSING_INDEX AS H
   ON M.ACCT_NUM = H.ACCT_NUM;
QUIT;

Heterogeneous (a SAS dataset joined to a DBMS table)

LIBNAME MTG '/sas/data/mortgage/';
LIBNAME DRI_DBO Teradata Datasrc=DRI_CITY SCHEMA=dbo
        USER=&userid PASSWORD=&pwd;
PROC SQL;
   CREATE TABLE MTG.MYDATA AS
   SELECT M.LTV, D.REO_DATE
   FROM MTG.MORTGAGE_DATA AS M
   JOIN DRI_DBO.FLAT_REO AS D
   ON M.ACCT_NUM = D.ACCT_NUM;
QUIT;
Avoid heterogeneous joins
1. To merge SAS (or other) data with DBMS
• use pass-through SQL queries to process only the data you need on DBMS
• save the results to a SAS dataset
• merge all other SAS datasets with the newly created dataset
+ creates a homogeneous SAS data environment
+ you may not have to know DB-specific SQL
- can be inefficient; sacrifices some in-database processing
2. To filter large amounts of DBMS data based on a smaller SAS (or other) dataset
• load the smaller SAS (or other) dataset into DBMS
• use pass-through SQL queries to process in-database (filter before join)
+ creates a homogeneous DBMS data environment
+ can gain in-database processing efficiencies
- you may have to know DB-specific SQL
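The two approaches above can be sketched roughly as follows. This is an illustration under assumptions, not the presenter's code: the Teradata server name (tdserv), the date cutoff, the WORK table names, and the small MTG.ACCT_LIST dataset are hypothetical, and the sketch assumes FLAT_REO and the uploaded ACCT_LIST table are both visible in the connection's default database.

```sas
libname mtg '/sas/data/mortgage/';

/* Approach 1: subset in the DBMS with pass-through, then join in SAS. */
proc sql;
   connect to teradata (user=&userid password=&pwd server=tdserv);
   create table work.reo_subset as
   select * from connection to teradata
      ( select acct_num, reo_date
        from flat_reo
        where reo_date >= date '2014-01-01' );
   disconnect from teradata;

   /* Both inputs are now SAS datasets, so this join is homogeneous. */
   create table mtg.mydata as
   select m.ltv, d.reo_date
   from mtg.mortgage_data as m
   join work.reo_subset as d
   on m.acct_num = d.acct_num;
quit;

/* Approach 2: load the small SAS dataset into the DBMS, then join there. */
libname td teradata user=&userid password=&pwd server=tdserv;

data td.acct_list;                      /* small lookup table uploaded to Teradata */
   set mtg.acct_list(keep=acct_num);
run;

proc sql;
   connect to teradata (user=&userid password=&pwd server=tdserv);
   create table work.filtered as
   select * from connection to teradata
      ( select f.acct_num, f.reo_date
        from flat_reo f
        inner join acct_list a
        on f.acct_num = a.acct_num );
   disconnect from teradata;
quit;
```

In both cases only the rows that survive the filter or join travel back to SAS, which is the point of avoiding the heterogeneous join.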
SAS® in-database
How to conduct in-database processing via SAS® Enterprise Guide® (EG)? (1 of 3)
When both of these conditions occur:
1. EG detects a DBMS table/view via a SAS® library assigned through one of these native SAS®/ACCESS® interfaces:
• SAS®/ACCESS® Interface to DB2
• SAS®/ACCESS® Interface to Oracle
• SAS®/ACCESS® Interface to Netezza
• SAS®/ACCESS® Interface to Teradata
AND
2. You reference DBMS source / input data in the query builder or task filter
SAS® in-database
How to conduct in-database processing via SAS® Enterprise Guide® (EG)? (2 of 3)
EG generates explicit SQL or pass-through SQL when working with a DBMS table / view.
Invoke via Query Builder > Options > Options for This Query
SAS® in-database
How to conduct in-database processing via SAS® Enterprise Guide® (EG)? (3 of 3)
• Task filters apply selection criteria to wizard-driven tasks.
• Task filters boost efficiency by avoiding a separate query or filter step.
• Using task filters with DBMS causes EG to generate a WHERE clause for in-DB processing.
Distinguishing Implicit and Explicit / Pass-Through SQL
It depends
Implicit SQL = DBMS options on LIBNAME statement
• SAS creates a connection to the DBMS
• SAS translates your code into implicit SQL
+ you don’t have to know DB-specific SQL
- can be inefficient; all or portions may not translate
Explicit SQL = DBMS options on CONNECT statement + DB-specific SQL
• SAS creates a connection to the DBMS
• You submit DBMS-specific explicit SQL to the DBMS
- you have to know DB-specific SQL
+ the SQL runs in the database exactly as written / no translation
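To make the distinction concrete, here is a minimal sketch of the same summary query written both ways. The Teradata server name (tdserv), the CUSTOMERS table, and its COUNTRY column are assumed for illustration.

```sas
/* Implicit SQL: DBMS options go on the LIBNAME statement.
   SAS/ACCESS translates the PROC SQL query for the DBMS where it can. */
libname tdlib teradata user=&userid password=&pwd server=tdserv;

proc sql;
   select country, count(*) as n_customers
   from tdlib.customers
   group by country;
quit;

/* Explicit (pass-through) SQL: DBMS options go on the CONNECT statement.
   The inner query is sent to the DBMS as written, in the DBMS's own dialect. */
proc sql;
   connect to teradata (user=&userid password=&pwd server=tdserv);
   select * from connection to teradata
      ( select country, count(*) as n_customers
        from customers
        group by country );
   disconnect from teradata;
quit;
```

The implicit form keeps the code portable across engines; the explicit form trades that portability for full control over what the database executes.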
Distinguishing Implicit and Explicit / Pass-Through SQL
Pass-Through SQL enables the DBMS to optimize queries, especially when:
▪ querying, filtering, joining
▪ summarizing (such as AVG and COUNT, GROUP BY clauses)
▪ deriving variables that are created by expressions
Pass-through accepts the extensions to SQL that are provided by your DBMS
Use implicit SQL
• The LIBNAME statement must point to DBMS
• Turn on SASTRACE
http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a000433982.htm
http://support.sas.com/resources/papers/proceedings11/306-2011.pdf
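A minimal sketch of turning on tracing so the SAS log shows which SQL is actually sent to the DBMS. The Teradata connection values and the CUSTOMERS table are assumed placeholders.

```sas
/* Write the SQL that SAS/ACCESS sends to the DBMS into the SAS log. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

libname tdlib teradata user=&userid password=&pwd server=tdserv;

proc sql;
   select count(*) from tdlib.customers where country = 'USA';
quit;

/* Turn tracing off again when finished. */
options sastrace=off;
```

Reading the traced SQL in the log is a practical way to check whether a WHERE clause, join, or summary was passed to the database or handled in SAS.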
Distinguishing Implicit and Explicit / Pass-Through SQL
1) SAS language that has no database equivalent is processed in SAS
- does not minimize data that is returned to SAS
- avoid using SAS capabilities (functions) if they cannot be passed to the database
2) SAS functions that have database equivalents can process in-database
- function mapping and implicit pass-through
3) Database functions process in database
- explicit pass-through
#1 - SAS language that has no database equivalent is processed in SAS
- does not minimize data that is returned to SAS, does not run in-database
#2 - SAS functions that have database equivalents can process in-database
- function mapping and implicit pass-through
[Diagram: SAS SQL code is translated by SAS/ACCESS Interface to Teradata into Teradata SQL code, which runs in Teradata]
#2 - SAS functions that have database equivalents can process in-DB
- function mapping and implicit pass-through
SAS functions (indicated with *) are implicitly passed to Teradata
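A small sketch of function mapping in implicit pass-through, assuming the same hypothetical MYTERLIB library and CUSTOMERS table used elsewhere in this deck. The query is written with the SAS UPCASE function, and the interface can translate it to the Teradata UPPER function when it generates SQL.

```sas
libname myterlib teradata user=myusr1 password=&pwd;

proc sql;
   /* UPCASE is a SAS function; the interface can map it to Teradata UPPER,
      so the WHERE clause can be evaluated in the database. */
   select customer
   from myterlib.customers
   where upcase(country) = 'USA';
quit;
```

Whether a given function is mapped depends on the interface and release, so it is worth confirming with SASTRACE output in the log.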
#3 - Database functions process in database
- explicit pass-through
[Diagram: Teradata SQL code is passed verbatim by SAS/ACCESS Interface to Teradata and runs in Teradata as Teradata SQL code]
proc sql;
   connect to teradata (user=myusr1 password=&pwd);
   select * from connection to teradata
      (select customer from customers where upper(country)='USA');
   disconnect from teradata;
quit;
In explicit pass-through, you write the Teradata UPPER function yourself instead of the SAS UPCASE function, and the query is sent to Teradata as written.
Mapping SAS Functions to DBMS Functions
http://support.sas.com/documentation/cdl/en/acreldb/66787/HTML/default/viewer.htm#p0f64yzzxbsg8un1uwgstc6fivjd.htm
Use In-database Procedures
Procedures:
• PROC FREQ
• PROC MEANS
• PROC RANK
• PROC SQL
• PROC SORT
• PROC REPORT
• PROC SUMMARY
• PROC TABULATE
Requirements:
• Base SAS
• SAS/ACCESS to DBMS
• SQLGENERATION option and LIBNAME statement
Supported DBMS:
• Aster
• DB2
• Greenplum
• Hadoop
• Netezza
• Oracle
• Teradata
Use In-database Procedures
The LIBNAME statement must point to DBMS.
The SQLGENERATION system option or the SQLGENERATION LIBNAME option must be set to DBMS.
▪ By default, the SQLGENERATION system option is set to NONE and the Base SAS in-database procedures DO NOT RUN IN THE DATABASE.
▪ Conventional SAS processing is also used when specific procedure statements and options do not support in-database processing.
http://support.sas.com/documentation/cdl/en/lesysoptsref/66899/HTML/default/viewer.htm#n1ag2fud7ue3aln1xiqqtev7ergg.htm
http://support.sas.com/documentation/cdl/en/hostwin/63047/HTML/default/viewer.htm#p0drw76qo0gig2n1kcoliekh605k.htm
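A minimal sketch of enabling in-database procedure processing, assuming a hypothetical Teradata library (TDLIB, server tdserv) and a CUSTOMERS table with a COUNTRY column. It checks the current setting first, since that is the simplest way to see what your installation uses.

```sas
/* Show the current SQLGENERATION setting for this SAS session. */
proc options option=sqlgeneration;
run;

/* Ask SAS to generate SQL for in-database procedures where possible. */
options sqlgeneration=dbms;

libname tdlib teradata user=&userid password=&pwd server=tdserv;

/* With a DBMS library as input, PROC FREQ can be pushed into the database. */
proc freq data=tdlib.customers;
   tables country;
run;
```

If a statement or option in the procedure is not supported in-database, SAS falls back to conventional processing, so the results are the same but more data moves to SAS.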
Remember this
• Conduct as much in-database processing as possible so your analytics can run faster.
• Use implicit or explicit / pass-through SQL plus 7 in-DB Base procedures.
• Minimize data returned to SAS.
• Avoid heterogeneous joins.
[Diagram: With In-Database (SAS, data)]
Recommended Resources
Video
http://www.youtube.com/watch?v=OSTa1EUpKT8
Training
https://support.sas.com/edu/prodcourses.html?code=ACCESS&ctry=US
Iterative Programming In-Database Using SAS® Enterprise Guide® Query Builder
http://support.sas.com/resources/papers/proceedings14/1567-2014.pdf
Overview, documentation, training, samples and tips, conversations
http://support.sas.com/software/products/access/index.html#s1=1