Scaling Data
Copyright © 2003, SAS Institute Inc. All rights reserved. SAS is a registered trademark or trademark of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.
Scaling SAS® Data Access to Oracle® RDBMS
Howard Plemmons, SAS Institute Inc.
Andrew Holdsworth, Oracle Corporation
Scaling
What is Scaling?
Scaling
“To remove the scales of a fish”
“To climb up by means of a scaling ladder”
“To reach the highest point”
Data
Scaling Data
Why Scale to Data
Scaling Data
SAS tools, SAS/ACCESS®
SAS Procedure and Processes
Oracle tools
Oracle Procedures and Processes
Intelligence Value Chain
Intelligence Value Chain Silver into Gold
SAS System 9
SAS V8 vs. SAS System 9
FEATURE               SAS V8   SAS System 9
Libname Engine        x        x
Procedure Interface   x        x
Fast Load             x        x
Threaded Interface    -        x
SAS V8 I/O Model
Threaded Interface SAS 9
SAS Procedures
proc sort
proc summary
proc dmine
proc reg; proc dmreg
proc means
proc loess; proc dmdb
proc glm
proc robustreg
SAS/ACCESS® Engines
ORACLE
DB2
Informix
ODBC
Sybase
Teradata
Libname and SAS Procedure Controls
dbslice ("where","where",…)
dbsliceparm (ALL,…)
default: (THREADED_APPS,2)
options sastrace=',,t,';
procedure controls – CPU count
Options In Action - DBSLICEPARM
-dbsliceparm none
option dbsliceparm=
libname x oracle user=scott pass=tiger
dbsliceparm=(threaded_apps,2);
proc print data=x.oratab (dbsliceparm=(all,4)); run;
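The option above asks for a read to be split across threads automatically. The slides do not show how the slices are generated, so the following is only a minimal Python sketch of one plausible scheme, assuming the slices come from applying a MOD function to a non-null integer column. The function names `mod_slices` and `slice_of` are hypothetical and are not part of SAS/ACCESS.

```python
def mod_slices(column, threads):
    """Build one non-overlapping WHERE predicate per thread using MOD.

    Illustrative only: the SQL that SAS/ACCESS actually generates is not
    shown in the slides, and `column` is assumed to be a non-null integer.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [f"ABS(MOD({column}, {threads})) = {i}" for i in range(threads)]


def slice_of(value, threads):
    """Return the index of the slice that a given key value falls into."""
    return abs(value) % threads


predicates = mod_slices("x", 4)

# Every key lands in exactly one slice, so no row is read twice.
counts = [0] * 4
for key in range(1000):
    counts[slice_of(key, 4)] += 1
```

With evenly distributed keys each thread receives a similar share of the rows; a skewed key column would leave some threads with more work than others.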
Options In Action - DBSLICE
libname x oracle user=scott pass=tiger;
proc print data=x.oratab (dbslice=("where x<100", "where x >= 100")); run;
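With DBSLICE the user supplies the slices, so it is the user's job to make sure they cover the table exactly once. A minimal Python sketch of that idea follows; it simulates the two WHERE clauses with in-memory rows and threads, and every name in it (`rows`, `slices`, `fetch`) is invented for illustration rather than taken from SAS.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the Oracle table: rows with a single integer column x.
rows = [{"x": v} for v in range(0, 200, 7)]

# Python equivalents of the two WHERE clauses on the slide.
slices = [
    lambda r: r["x"] < 100,
    lambda r: r["x"] >= 100,
]


def fetch(predicate):
    """Simulate one thread reading the rows that match its slice."""
    return [r for r in rows if predicate(r)]


with ThreadPoolExecutor(max_workers=len(slices)) as pool:
    parts = list(pool.map(fetch, slices))

combined = [r for part in parts for r in part]

# A correct set of slices matches every row exactly once.
matches_per_row = [sum(1 for p in slices if p(r)) for r in rows]
slices_are_valid = all(m == 1 for m in matches_per_row)
```

In SQL, a row where x is NULL satisfies neither predicate, so a real slice list has to account for nullable columns or rows will be missed.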
Options In Action – CPUCOUNT, THREADS
CPUCOUNT=
THREADS | NOTHREADS
Process
Libname controls
Procedure controls
Execution
Scalability – SAS 9 Threaded speedup in PROC REG
[Chart: achieved speedup compared with linear scalability]
Run on 12-way Unix box
Scalability – SAS 9 Threaded speedup in PROC SORT
Run on 8-way Unix box
Tests run in memory cache
What Does This Mean - access
393000 Rows
No Threads - baseline
Two Threads (DBSLICE) – 31%
Six Threads (DBSLICEPARM) – 54%
Run on 10-way Unix box
Tests run in memory cache
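The slide does not say what the percentages measure. Assuming they are reductions in elapsed time relative to the unthreaded baseline, the short Python sketch below converts them into speedup factors and per-thread efficiency. The function names are hypothetical, and the interpretation itself is an assumption.

```python
def speedup_from_reduction(reduction_pct):
    """Convert a percentage reduction in elapsed time into a speedup factor."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    return 1.0 / (1.0 - reduction_pct / 100.0)


def parallel_efficiency(speedup, threads):
    """Fraction of ideal linear speedup achieved with `threads` threads."""
    return speedup / threads


# Assumption: 31% and 54% are reductions in elapsed time versus no threads.
two_threads = speedup_from_reduction(31)   # roughly 1.45x
six_threads = speedup_from_reduction(54)   # roughly 2.17x

efficiency_two = parallel_efficiency(two_threads, 2)
efficiency_six = parallel_efficiency(six_threads, 6)
```

Under that reading, six threads are faster overall but each thread contributes less than it does in the two-thread run, which is the usual pattern when the database fetch rather than the CPU becomes the limit.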
Scaling Data
Data Volumes
Data ACCESS
Data Organization
Scaling using Oracle - Andrew
Scaling with
The Star Query
Use of Parallelism
Use of the Direct Path
Use of Specialist Indexes
Use of Analytical Functions
Use of Materialized Views
Use of The Oracle9i Optimizer
The Star Query
Fact
Product
Time
Geography
Customer
Star Queries
The star query is a very common data warehouse (DW) technique. It is highly optimized in Oracle and can be tuned depending on the type of queries. In summary, the more that is known about the query composition, the higher the level of optimization that is possible.
Star Query Optimization
The optimization is a three-step process:
1. Apply query predicates to the dimension tables to generate lists of foreign keys into the fact table.
2. Query the fact table using a series of single-column bitmap indexes on the foreign keys.
3. Having resolved the query within the fact table, complete the query by joining back to the dimension tables where needed and rolling the query up.
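The three steps can be sketched in a few lines of Python, using plain integers as bitmaps. This is a toy model under stated assumptions, not Oracle's implementation: the tables, the data, and the helper names `build_bitmap_index` and `bitmap_for` are all invented for illustration.

```python
# Dimension tables: key -> attribute (toy data, invented for illustration).
product = {1: "widget", 2: "gadget", 3: "widget"}
geography = {10: "NC", 20: "CA"}

# Fact table rows: (product_key, geography_key, sales).
fact = [
    (1, 10, 5.0),
    (2, 10, 7.0),
    (3, 20, 2.0),
    (1, 20, 4.0),
    (3, 10, 1.0),
]


def build_bitmap_index(column):
    """Map each key to an int whose bit i is set when fact row i has that key."""
    index = {}
    for row_id, row in enumerate(fact):
        index[row[column]] = index.get(row[column], 0) | (1 << row_id)
    return index


product_idx = build_bitmap_index(0)
geography_idx = build_bitmap_index(1)


def bitmap_for(index, keys):
    """OR together the bitmaps of all qualifying keys for one dimension."""
    bits = 0
    for key in keys:
        bits |= index.get(key, 0)
    return bits


# Step 1: apply the predicates to the dimension tables to get foreign keys.
product_keys = [k for k, name in product.items() if name == "widget"]
geography_keys = [k for k, state in geography.items() if state == "NC"]

# Step 2: AND the single-column bitmaps to find the matching fact rows.
hits = bitmap_for(product_idx, product_keys) & bitmap_for(geography_idx, geography_keys)
row_ids = [i for i in range(len(fact)) if (hits >> i) & 1]

# Step 3: roll up the surviving fact rows.
total_sales = sum(fact[i][2] for i in row_ids)
```

The fact table is only touched for rows whose bit survives the AND, which is why one single-column bitmap index per foreign key is enough to serve many different predicate combinations.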
Star Queries
To enable star queries, the DBA should do the following:
1. Build single-column bitmap indexes on each foreign key in the fact table
2. Build indexes on the dimension tables for query predicates
3. Build indexes on the dimension tables to assist in the join-back and roll-up process
4. Generate statistics for the schema
5. Set the parameter STAR_TRANSFORMATION_ENABLED=TRUE
Use of Parallelism
Use multiple CPUs to execute a single query as well as multiple concurrent queries
Execute Table scans, Index probes and scans in parallel
Execute Joins and Sorts in parallel
Execute DML in parallel
Parallelism can be configured manually or automatically
Use of Partitioning
Partitioning was originally designed to allow management of large database objects; however, partitioning the data can also improve performance through the following:
• Partition pruning
• Join optimizations
Partitioning can be done by the following methods:
• Range, e.g. date or key ranges
• List, e.g. discrete values such as State
• Hash, to achieve equal-size partitions
Two types of partitioning can be applied
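Partition pruning means that a query reads only the partitions that can contain matching rows. A minimal Python sketch of range partitioning with pruning follows; the partition bounds, the data, and the function names are hypothetical, and the model ignores everything a real database does beyond locating the right partition.

```python
from bisect import bisect_right

# Range partitions keyed by year; each upper bound is exclusive (toy data).
upper_bounds = [2001, 2002, 2003]
partitions = [
    [(2000, "a"), (2000, "b")],   # year < 2001
    [(2001, "c")],                # 2001 <= year < 2002
    [(2002, "d"), (2002, "e")],   # 2002 <= year < 2003
]


def partition_for(year):
    """Return the index of the range partition that holds `year`."""
    idx = bisect_right(upper_bounds, year)
    if idx >= len(partitions):
        raise ValueError(f"no partition for year {year}")
    return idx


def query_year(year):
    """Scan only the partition that can contain `year` (partition pruning)."""
    scanned = partitions[partition_for(year)]
    matches = [row for row in scanned if row[0] == year]
    return matches, len(scanned)


rows_2001, rows_scanned = query_year(2001)
total_rows = sum(len(p) for p in partitions)
```

Here the query for 2001 examines one row instead of all five; the saving grows with the number of partitions that the predicate rules out.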
Use of The Direct Path
Bypass the conventional transaction layer to insert and copy data within the database
SQL*Loader is currently used by SAS
Other options include:
• Insert with the /*+ append */ hint
• Create Table as Select with NOLOGGING
These constructs can be used to transform vast amounts of data rapidly in parallel
Specialist Indexes
B-Tree Indexes
Bitmap indexes, including bitmap join indexes
Function-based indexes
Analytical Functions
Oracle has embraced the ANSI OLAP extensions to SQL
These permit faster response times on queries that would require multiple passes of the data with conventional SQL
This allows grouped results and functionality such as moving averages
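A moving average shows why a single pass matters: each output row needs a window of neighboring rows, which conventional SQL would typically obtain through a self-join or repeated scans. The Python sketch below computes a trailing moving average in one pass. It illustrates the idea only; the function name and data are invented, and it is comparable in spirit to a windowed AVG over the two preceding rows and the current row.

```python
from collections import deque


def moving_average(values, window):
    """Yield the trailing moving average of `values` in a single pass."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque()
    running_total = 0.0
    for value in values:
        recent.append(value)
        running_total += value
        # Drop the oldest value once the window is full.
        if len(recent) > window:
            running_total -= recent.popleft()
        yield running_total / len(recent)


monthly_sales = [10, 20, 30, 40, 50]
averages = list(moving_average(monthly_sales, 3))
```

The first two results average fewer than three values because the window is not yet full, which matches how a trailing window behaves at the start of an ordered set.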
Materialized Views
Materialized views allow automatic use of summary tables without the user having to rewrite the query
Well designed materialized views are small in size and can increase performance by orders of magnitude.
Materialized views are in fact Oracle tables and can use all other features to improve performance
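The benefit of a summary table can be sketched in Python: the caller asks the same question either way, and the lookup layer decides whether a precomputed summary can answer it. This is a toy model of the idea, not Oracle's query rewrite mechanism; the data and the names `build_summary` and `total_sales` are invented for illustration.

```python
from collections import defaultdict

# Detail table: (state, product, sales). Toy data invented for illustration.
detail = [
    ("NC", "widget", 5.0),
    ("NC", "gadget", 7.0),
    ("CA", "widget", 2.0),
    ("CA", "widget", 4.0),
]


def build_summary(rows):
    """Precompute total sales by state, the way a summary table would."""
    totals = defaultdict(float)
    for state, _product, sales in rows:
        totals[state] += sales
    return dict(totals)


sales_by_state = build_summary(detail)  # plays the role of the materialized view


def total_sales(state, summaries):
    """Answer from a summary when one exists, otherwise scan the detail rows."""
    if "by_state" in summaries:
        return summaries["by_state"].get(state, 0.0), "summary"
    scanned = sum(s for st, _p, s in detail if st == state)
    return scanned, "detail"


with_view = total_sales("CA", {"by_state": sales_by_state})
without_view = total_sales("CA", {})
```

Both calls return the same total; only the source differs, which is the point of letting the system choose the summary on the user's behalf.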
Oracle9i Optimizer
Optimizer behavior changes when upgrading between Oracle releases
The Optimizer is tested with over 400,000 SQL statements
• Where plans change between releases, the actual query is run to test for degradation
• Slower plans are corrected
It is still important to have good representative Statistics
DBMS_STATS package allows parallel generation and migration of schema statistics
Oracle9i Optimizer
Some common Optimizer problems seen with Oracle9i
• Bad or incomplete statistics
• Init.ora parameters influencing optimizer
• SQL written for the rule-based optimizer (RBO)
Summary
Oracle and SAS provide techniques for scaling to larger databases by optimizing both query performance and fetch performance.
These techniques are simple to adopt and allow huge productivity improvements
We have identified some core technologies here; however, this is only a partial picture of the combined SAS and Oracle capabilities.
About the Speakers
Howard Plemmons
Senior Software Manager
SAS Institute Inc.
SAS Circle
Cary, NC
Phone: 919-531-7779

Andrew Holdsworth
Director
Oracle Corp.
500 Oracle Pkwy
Redwood Shores, CA 94065
Phone: 650-506-2938

E-mail:
Other SUGI Papers/Presentations
• PC File Data Objects Directly from UNIX – 8:00am Tuesday
• SAS/ACCESS and use of Metadata – Rm 619 @ 2:30
• Lessons in Scalability – SAS Presents – 3:20 Tuesday
• Data Warehousing section - performance
Scaling SAS Data ACCESS to ORACLE RDBMS