Informix Warehouse Accelerator -- design, deploy, use
0
Informix Warehouse Accelerator: Query Flow and Matching
Keshava Murthy,
Architect, IBM Informix Development
1
[Diagram: BI Applications, Informix Database Server, Informix Warehouse Accelerator, IBM Smart Analytics Studio]
Step 1. Install, configure, start Informix
Step 2. Install, configure, start Accelerator
Step 3. Connect Studio to Informix and add accelerator
Step 4. Design, validate, deploy data mart
Step 5. Load data to accelerator
Ready for queries
2
Connecting to Informix
• For data mart design, from ISAO Studio
– Use 11.5 Informix driver
– Protocol tcp/ip (onsoctcp or ontlitcp)
– Use the port with TCP/IP and SQLI protocol
• From Informix applications, scripts, tools
– Supports protocols: TCP/IP, shared memory
– Supports all drivers
– CSDK, ODBC, JDBC, JCC, .NET, etc.
3
Connection to Informix
• For data mart design, from ISAO Studio
• ISAO Studio runs on Windows and Linux
• Connect from these two platforms to any supported Informix server
– Linux64/Intel
– HP-UX/Itanium
– Power/AIX
– Sparc/Solaris
4
Connection to Informix
• For Applications, connect as usual.
• No Application changes/redeployment necessary
• Set the session environment (USE_DWA) in the sysdbopen() procedure
• The sysdbopen() procedure is executed automatically whenever an application connects to a database.
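As an illustration of the sysdbopen() approach above, the following Python sketch only assembles the SQL text of such a procedure; it does not connect to a server. The helper name build_sysdbopen_ddl is hypothetical, and the use of the public owner (so the procedure applies to every connecting user) is an assumption. The generated statement should be checked against your Informix version before a DBA runs it.

```python
def build_sysdbopen_ddl(use_dwa: str = "1", owner: str = "public") -> str:
    """Return SPL text for a sysdbopen() procedure that sets USE_DWA.

    Hypothetical helper: it only builds a string for a DBA to review.
    """
    if not use_dwa.isdigit():
        raise ValueError("USE_DWA value must be numeric, e.g. '0' or '1'")
    return (
        f"CREATE PROCEDURE {owner}.sysdbopen()\n"
        f"    SET ENVIRONMENT USE_DWA '{use_dwa}';\n"
        "END PROCEDURE;"
    )


ddl = build_sysdbopen_ddl("1")
print(ddl)
```

Because the setting lives in the database rather than in client code, applications pick it up at connect time without being changed or redeployed.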
5
Connection to Informix
• USE_DWA
SET ENVIRONMENT USE_DWA '1';
– Controls query matching behavior for the session.
– '0' (zero): do not use IWA for query processing
– '1': consider IWA for query processing
– '3': same as '1', with diagnostics
– '998': use IWA only
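The effect of these settings on a single query can be modeled roughly as follows. This is a sketch of the decision only, not server code. Two points are assumptions that the slide does not state: that a non-matching query falls back to local execution under '1' and '3', and that it fails under '998'. The names Route and route_query are hypothetical.

```python
from enum import Enum


class Route(Enum):
    LOCAL = "local Informix execution"
    ACCELERATOR = "IWA execution"


def route_query(use_dwa: str, matches_data_mart: bool) -> Route:
    """Model where a query runs for a given USE_DWA session setting."""
    if use_dwa == "0":
        # IWA is never considered.
        return Route.LOCAL
    if use_dwa in ("1", "3"):
        # '3' behaves like '1' here; the extra diagnostics are not modeled.
        # Assumption: non-matching queries fall back to local execution.
        return Route.ACCELERATOR if matches_data_mart else Route.LOCAL
    if use_dwa == "998":
        if matches_data_mart:
            return Route.ACCELERATOR
        # Assumption: "IWA only" means a non-matching query is rejected.
        raise RuntimeError("query cannot be accelerated and USE_DWA is '998'")
    raise ValueError(f"unknown USE_DWA value: {use_dwa!r}")


print(route_query("1", True).value)
print(route_query("1", False).value)
```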
6
[Diagram repeated: Steps 1 to 5 as above, with a link labeled "DRDA over TCP/IP"]
Adding Accelerator
7
Adding Accelerator
• Add a new accelerator from Data Studio or the command line interface (CLI)
• Four parameters are needed to add an accelerator
– Name of the accelerator (you choose)
– IP address of the IWA instance
– Port on which IWA is listening
– PIN obtained by executing 'ondwa getpin'
• The port number is in the dwainst.conf file.
8
Adding Accelerator
• Informix always talks to the IWA Coordinator
– For all data mart operations
– For queries
– To obtain the result set
• Informix treats the IWA Coordinator as a remote node
9
[Diagram repeated: Steps 1 to 5 as above]
Design, Validate, Deploy Data mart
10
Design, Validate, Deploy Data marts
[Diagram: ISAO Studio or CLI tool; Informix (AQT); IWA Coordinator; four Workers, each with compressed data in memory and a memory image on disk]
Step 1. Design and validate the data mart.
Step 2. Deploy the data mart.
Step 3. Send the data mart definition.
Step 4. Return the SQL definitions.
Step 5. Save the definition (AQT).
Step 6. Return acknowledgement.
11
Store Sales ER diagram from the TPC-DS 300 GB database
[Diagram; row counts shown on the tables: 287,997,024; 20; 73,049; 1,920,800; 1000; 204,000; 1,000,000; 402; 86,400; 7200; 2,000,000]
12
13
Designing data mart
• Start with a good logical and physical design
• Typically has Star or Snowflake schema
• The data mart itself can contain
– One or more fact tables
– Available dimensions
– Relationships between the fact and dimension tables
• Relationships
– 1:n relationship -- needs a unique constraint on the PK
– n:m relationship
14
Designing data mart
• The design tool identifies and uses existing PK-FK relationships between the tables
• In warehouse environments, it is typical not to have constraints defined in the schema
• In that case, create the relationships between the tables manually.
• Always start from the parent and end with the child
– In the customer, web_sales relationship, customer is the parent and web_sales is the child.
– customer.customer_id will be the primary key, web_sales.customer_id will be the foreign key.
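When a relationship is created manually there is no constraint guaranteeing that the parent key is unique, which a 1:n relationship needs. A minimal sketch of a pre-check follows, assuming the key values have already been fetched into Python lists. The function name and the sample values are illustrative and not part of any Informix tool.

```python
from collections import Counter


def find_duplicate_keys(parent_keys):
    """Return parent key values that occur more than once, in sorted order.

    An empty result means the column could serve as the PK side of a
    1:n relationship.
    """
    counts = Counter(parent_keys)
    return sorted(key for key, n in counts.items() if n > 1)


# Illustrative values for customer.customer_id (the parent side).
customer_ids = [101, 102, 103, 104]
print(find_duplicate_keys(customer_ids))        # []

# A duplicated parent key would make the relationship n:m instead.
print(find_duplicate_keys([101, 102, 102]))     # [102]
```

On a real table the same check is usually done in SQL with GROUP BY and HAVING COUNT(*) > 1 on the candidate key column.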
15
Designing data mart
• When you don’t have PK-FK relationship
– Identify the keys from logical design
– Identify the keys from equi-join keys in queries
– Identify the parent and child
• Type of Relationships between two tables
– Single relationship with single key
– Single relationship with multiple keys
– Multiple relationships with single or multiple keys
16
Designing data mart
• Single data mart with multiple fact tables
– Dimensions are shared by all the fact tables
• Multiple data marts, each with its own fact table but the same dimension tables
– Separate copy of the dimension tables in each data mart
– Higher memory requirement
17
Designing data mart – Smart mart tool
• Simply enable workload analysis
• Run the workload
• Informix will give you data mart definitions required to run the workload
• Design is done for you based on workload
• Simply deploy and load the mart using this definition
• Useful when generating data marts for standard reports
• Use it as a guiding tool for identifying the tables needed within warehouses.
18
Deploying the data mart
• Creates and sends the data mart definition to IWA
• Verify the fact tables and dimension tables.
• Generate the report and verify when necessary
• You can load the data when deploying the data mart
• Typically you deploy once and load periodically
• Loading can be automated via the command line interface (CLI)
19
Deploying the data mart
• IWA returns one or more SQL statements representing the data mart.
• Informix creates Accelerated Query Tables (AQT) for those.
• AQTs are essentially views used exclusively for query matching
• Data mart deployment, enable, disable, drop events are recorded in the system catalog
20
[Diagram repeated: Steps 1 to 5 as above]
Design, Validate, Deploy Data mart
21
Loading the data mart
• Load the data mart using Studio
• Load using loadMart command from CLI
• Takes a snapshot of the tables
• Options
– No locking of the tables
– Locking of all the tables
22
Query Flow
[Diagram: Applications and BI Tools; Informix; IWA Coordinator; four Workers, each with compressed data in memory and a memory image on disk]
Step 1. Submit SQL (database protocol: SQLI or DRDA; network: TCP/IP, SHM)
Step 2. Query matching and redirection technology in Informix (alternative path: local execution)
Step 3. Offload SQL (DRDA over TCP/IP)
Step 4. Results (DRDA over TCP/IP)
Step 5. Return results/describe/error (database protocol: SQLI or DRDA; network: TCP/IP, SHM)
23
Query Flow within IWA
[Diagram: Coordinator and four Workers, each with compressed data in memory]
Step 1. SQL arrives from Informix.
Step 2. Send the query to all the workers.
Step 3. On each worker: scan, filter, join, group.
Step 4. Merge intermediate results, ORDER BY, FIRST n.
Step 5. Send the results back to the Informix server.
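The coordinator and worker flow on this slide is a scatter-gather pattern, which can be sketched in a few lines. This is an illustration under assumptions, not IWA's implementation: each worker is assumed to hold its own partition of the fact rows, workers run one after another here rather than in parallel, joins are left out, and the function names are hypothetical.

```python
from collections import defaultdict


def worker_partial(rows, predicate):
    """Step 3 on one worker: scan, filter, group and partially aggregate."""
    partial = defaultdict(float)
    for key, amount in rows:
        if predicate(key, amount):
            partial[key] += amount
    return dict(partial)


def coordinator_merge(partials, first_n):
    """Step 4: merge partial sums, ORDER BY total descending, keep FIRST n."""
    merged = defaultdict(float)
    for partial in partials:
        for key, subtotal in partial.items():
            merged[key] += subtotal
    ordered = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:first_n]


# Three partitions of (region, amount) rows, one per worker.
partitions = [
    [("east", 10.0), ("west", 5.0)],
    [("east", 2.5), ("north", 7.0)],
    [("west", 1.0), ("north", 1.0)],
]

# Step 2: the same query (here, a filter plus SUM ... GROUP BY) goes to every worker.
partials = [worker_partial(p, lambda key, amount: amount >= 1.0) for p in partitions]
top = coordinator_merge(partials, first_n=2)
print(top)  # [('east', 12.5), ('north', 8.0)]
```

Ordering and FIRST n can only be applied after the merge, because no single worker sees the complete total for a group.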
24
Life of a query
[Diagram: SQL Statement, SQL Parser, Semantic Analyzer, Optimizer, Query Plan, Executor, Query Results]
[Also shown: System Catalog Information; Table Stats & Column Distribution; Explain File; Query Stats]
25
[Diagram: SQL Statement, SQL Parser, Semantic Analyzer, Optimizer, Query Plan; then either Informix Execution or, for a query qualified for acceleration, Generate SQL and IWA Execution; each path ends in Query Results]
[Also shown: System Catalog Information; Table Stats & Column Distribution; Explain File; Query Stats]
Optimizer is enhanced to do the query matching
26
27
create view "dwa"."aqt2dbca0d9-509d-434b-9cc9-4a12c6de6b3d" ("COL16","COL17","COL18","COL19","COL20","COL21","COL22","COL23","COL24","COL25","COL26","COL27","COL28","COL29","COL30","COL31"4","COL35","COL36","COL37","COL38","COL39","COL40","COL41","COL42","COL43","COL44","COL45","COL46","COL47","COL07","COL08","COL0
COL12","COL13","COL14","COL15","COL48","COL49","COL50","COL51","COL52","COL53","COL54","COL55","COL56","COL57","COL58","COL59","","COL02","COL03","COL04","COL05","COL06","COL62","COL63","COL64…) as
select x0.perkey, x0.storekey, x0.custkey, x0.prodkey, x0.promokey, x0.quantity_sold, x0.extended_price, x0.extended_cost, x0.shelf_location,
  x0.shelf_number, x0.start_shelf_date, x0.shelf_height, x0.shelf_width, x0.shelf_depth, x0.shelf_cost, x0.shelf_cost_pct_of_sale,
  x0.bin_number, x0.product_per_bin, x0.start_bin_date, x0.bin_height, x0.bin_width, x0.bin_depth, x0.bin_cost, x0.bin_cost_pct_of_sale
  ……
from ((((("informix".daily_sales x0
  left join "informix".period x1 on (x0.perkey = x1.perkey))
  left join "informix".product x2 on (x0.prodkey = x2.prodkey))
  left join "informix"."store" x3 on (x0.storekey = x3.storekey))
  left join "informix".customer x4 on (x0.custkey = x4.custkey))
  left join "informix".promotion x5 on (x0.promokey = x5.promokey));
28
Content of the view
• The data mart schema should be a star or snowflake schema
• A single-table data mart is fine (e.g. weblog, call detail record)
• The view created represents the whole data mart
• All the selected columns from all tables
• The join predicates between the fact and dimension tables, and from dimension to dimension
29
Query Matching
• The fact table should be used in the query
• Dimensions should be joined using the join keys defined in the data mart
• Only supported functions, expressions and aggregates
• INNER JOIN, or LEFT OUTER JOIN with the fact table on the dominant side
• The query cannot reference tables outside the data mart
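Three of these rules (fact table present, no tables outside the mart, joins on the mart's join keys) can be expressed as a simple check. The sketch below is a simplified model, not the optimizer's actual matching logic: it ignores join types and the list of supported functions and aggregates, and the DataMart class and match_query function are hypothetical names. The table and key names are taken from the daily_sales view shown earlier.

```python
from dataclasses import dataclass


def join_key(left, right):
    """Order-insensitive representation of an equi-join between two columns."""
    return frozenset((left, right))


@dataclass(frozen=True)
class DataMart:
    fact_table: str
    tables: frozenset      # fact table plus all dimension tables
    join_keys: frozenset   # equi-joins defined in the data mart


def match_query(mart, query_tables, query_joins):
    """Return the reasons a query cannot be matched (empty list if it can)."""
    reasons = []
    if mart.fact_table not in query_tables:
        reasons.append("fact table is not used")
    outside = sorted(set(query_tables) - mart.tables)
    if outside:
        reasons.append("tables outside the data mart: " + ", ".join(outside))
    if any(j not in mart.join_keys for j in query_joins):
        reasons.append("join not defined in the data mart")
    return reasons


mart = DataMart(
    fact_table="daily_sales",
    tables=frozenset({"daily_sales", "period", "product"}),
    join_keys=frozenset({
        join_key("daily_sales.perkey", "period.perkey"),
        join_key("daily_sales.prodkey", "product.prodkey"),
    }),
)

ok = match_query(mart, {"daily_sales", "period"},
                 [join_key("period.perkey", "daily_sales.perkey")])
bad = match_query(mart, {"period", "store"}, [])
print(ok)   # []
print(bad)  # ['fact table is not used', 'tables outside the data mart: store']
```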
30
Create the table
create table kfact(id int, name varchar(32), amount decimal(9,2));
Create the data mart
Datamart definition in the database… saved as a special view
create view "dwa"."aqtf5246230-8cce-42fd-8c3e-f516bbeacca3" ("COL1","COL2","COL3")
as select x0.id, x0."name", x0.amount
from "keshav".kfact x0;
31
QUERY: (ISAO-Executed)(OPTIMIZATION TIMESTAMP: 05-16-2011 09:10:37)
------
select count(*) from kfact
Estimated Cost: 1
Estimated # of Rows Returned: 1
Maximum Threads: 0
1) tpcds_100gb@DWAFINAL:dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3: REMOTE PATH
Remote SQL Request:
{QUERY {FROM dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3} {SELECT {count(*) } } }
32
select id, name, sum(amount) from kfact group by id, name
Estimated Cost: 4
Estimated # of Rows Returned: 1
Maximum Threads: 0
1) tpcds_100gb@DWAFINAL:dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3: REMOTE PATH
Remote SQL Request:
{QUERY {FROM dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3} {SELECT {SYSCAST COL1 AS INTEGER NULLABLE} {SYSCAST COL2 AS VARCHAR 32 819} {SUM COL3 } } {GROUP COL1 COL2 } }
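Explain output like the two examples above can be checked mechanically to see whether a query was offloaded. The following sketch relies only on the marker strings visible on these slides ("REMOTE PATH", "Remote SQL Request", and the dwa.aqt... view name). The exact explain layout may differ between versions, so treat the pattern as an assumption. The function name is hypothetical.

```python
import re

# AQT names on the slides look like "aqt" followed by a UUID-style suffix.
AQT_PATTERN = re.compile(r"dwa\.(aqt[0-9a-f-]+)")


def accelerated_aqt(explain_text):
    """Return the AQT name if the plan shows a remote path, else None."""
    if "REMOTE PATH" not in explain_text or "Remote SQL Request" not in explain_text:
        return None
    match = AQT_PATTERN.search(explain_text)
    return match.group(1) if match else None


sample = (
    "select count(*) from kfact\n"
    "1) tpcds_100gb@DWAFINAL:dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3: REMOTE PATH\n"
    "Remote SQL Request:{QUERY {FROM dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3} "
    "{SELECT {count(*) } } }\n"
)
print(accelerated_aqt(sample))  # aqtf5246230-8cce-42fd-8c3e-f516bbeacca3
print(accelerated_aqt("1) keshav.kfact: SEQUENTIAL SCAN"))  # None
```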
33
Thank You
34
Acceleration with Informix Warehouse Accelerator
[Diagram: Smart Analytics Data Studio; Informix; Informix Warehouse Accelerator with Coordinator process and Worker processes]
1. Identify the data mart to offload.
2. Data mart definition
3. Return the SQL representation
4. Create the metadata
5. Issue Off-load Datamart command
6. Off-load the data
7. Distribute the data among workers
8. Compress the data
9. Return ACK
35
Acceleration with Informix Warehouse Accelerator
[Diagram: Applications and BI Tools; Informix; Coordinator process; Worker processes]
Step 1. Submit SQL (database protocol: SQLI or DRDA; network: TCP/IP, SHM)
Step 2. IDS query matching and redirection technology (alternative path: local execution)
Step 3. Offload SQL (DRDA over TCP/IP)
Step 4. Results (DRDA over TCP/IP)
Step 5. Return results/describe/error (database protocol: SQLI or DRDA; network: TCP/IP, SHM)