Informix Warehouse accelerator -- design, deploy, use

Informix Warehouse Accelerator: Query Flow and Matching

Keshava Murthy, Architect, IBM Informix Development

[Overview diagram: BI Applications, Informix Database Server, Informix Warehouse Accelerator, and IBM Smart Analytics Studio, with the steps below marked on the connections.]

Step 1. Install, configure, start Informix
Step 2. Install, configure, start Accelerator
Step 3. Connect Studio to Informix and add accelerator
Step 4. Design, validate, deploy data mart
Step 5. Load data to accelerator
Ready for queries

Connecting to Informix

• For data mart design, from ISAO Studio

– Use the 11.5 Informix driver

– Protocol: TCP/IP (onsoctcp or ontlitcp)

– Use the port configured for TCP/IP with the SQLI protocol

• From Informix applications, scripts, and tools

– Supported protocols: TCP/IP, shared memory

– All drivers are supported: CSDK, ODBC, JDBC, JCC, .NET, etc.

Connection to Informix

• For data mart design, from ISAO Studio

• ISAO Studio runs on Windows and Linux

• Connect from these two platforms to any supported Informix server

– Linux64/Intel

– HP-UX/Itanium

– Power/AIX

– Sparc/Solaris

Connection to Informix

• Applications connect as usual.

• No application changes or redeployment are necessary.

• Set the session environment (USE_DWA) in the sysdbopen() procedure.

• The sysdbopen() procedure is executed automatically whenever an application connects to a database.

Connection to Informix

• USE_DWA

SET ENVIRONMENT USE_DWA '1';

– Controls the session behavior of query matching.

– '0' (zero) turns off the use of IWA for query processing.

– '1' turns on consideration of IWA.

– '3' is the same as '1', with diagnostics.

– '998' uses IWA only.
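As a minimal sketch of the settings above, the following Python helper builds the SET ENVIRONMENT statement for a given value and rejects values that the slide does not list. The helper name `set_use_dwa_statement` and the dictionary `USE_DWA_MODES` are illustrative assumptions, not part of Informix, and the set of values is limited to the four shown here.

```python
# Values and meanings as listed on the slide; other values are not covered here.
USE_DWA_MODES = {
    "0": "do not use IWA for query processing",
    "1": "consider IWA for query processing",
    "3": "same as 1, with diagnostics",
    "998": "use IWA only",
}


def set_use_dwa_statement(mode):
    """Return the SET ENVIRONMENT statement text for one of the listed values."""
    mode = str(mode)
    if mode not in USE_DWA_MODES:
        raise ValueError(f"USE_DWA value not listed on the slide: {mode!r}")
    return f"SET ENVIRONMENT USE_DWA '{mode}';"
```

The returned text is the kind of statement that could be placed in a sysdbopen() procedure or issued at the start of a session; the helper only builds the string and does not connect to a server.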

[Five-step overview diagram repeated. Additional labels: "DRDA over TCP/IP" and "Adding Accelerator".]

Adding Accelerator

• Add a new accelerator from the Studio or the command line interface (CLI)

• Four parameters are needed to add an accelerator

– Name of the accelerator (you choose)

– IP address of the IWA instance

– Port on which IWA is listening

– PIN obtained by executing 'ondwa getpin'

• The port number is in the dwainst.conf file.
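The four parameters can be pictured as one small record that is checked before it is handed to the Studio or the CLI. This is a sketch under stated assumptions: the class `AcceleratorConnection`, its field names, and the checks are hypothetical, and the PIN is only tested for being non-empty because its format is not described here.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorConnection:
    """Hypothetical holder for the four parameters named on the slide."""
    name: str        # chosen by the user
    ip_address: str  # address of the IWA instance
    port: int        # port IWA listens on (from dwainst.conf)
    pin: str         # value printed by 'ondwa getpin'

    def problems(self):
        """Return a list of human-readable problems; empty means plausible."""
        issues = []
        if not self.name.strip():
            issues.append("name is empty")
        try:
            ipaddress.ip_address(self.ip_address)
        except ValueError:
            issues.append("ip_address is not a valid IP address")
        if not 1 <= self.port <= 65535:
            issues.append("port is outside 1-65535")
        if not self.pin.strip():
            issues.append("pin is empty")
        return issues
```

For example, `AcceleratorConnection("demo_iwa", "192.0.2.10", 21022, "1234").problems()` returns an empty list; all four values in that call are made up for illustration.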

Adding Accelerator

• Informix always talks to the IWA coordinator

– For all data mart operations

– For queries

– To obtain the result set

• Informix treats the IWA coordinator as a remote node

[Five-step overview diagram repeated. Additional label: "Design, Validate, Deploy Data mart".]

Design, Validate, Deploy Data marts

[Diagram: ISAO Studio or CLI tool, Informix (holding the AQT), and the IWA coordinator with four workers. Each worker holds compressed data in memory and a memory image on disk.]

Step 1. Design, validate the data mart (ISAO Studio or CLI tool)
Step 2. Deploy data mart
Step 3. Send the data mart definition
Step 4. Return the SQL definitions
Step 5. Save the definition (AQT)
Step 6. Return acknowledgement

Store Sales ER diagram from the TPC-DS 300GB database

[ER diagram. Table row counts shown: 287,997,024; 20; 73,049; 1,920,800; 1000; 204,000; 1,000,000; 402; 86,400; 7200; 2,000,000.]

Designing data mart

• Start with a good logical and physical design

• Typically has a star or snowflake schema

• The data mart itself can contain

– One or more fact tables

– Available dimensions

– Relationships between the fact and dimension tables

• Relationships

– 1:n relationship (needs a unique constraint on the PK)

– n:m relationship

Designing data mart

• The design tool identifies and uses existing PK-FK relationships between the tables

• In a warehouse environment, it is typical not to have constraints defined within the schema

• In that case, manually create the relationships between the tables.

• Always start from the parent and end with the child

– In the customer, web_sales relationship, customer is the parent and web_sales is the child.

– customer.customer_id will be the primary key, web_sales.customer_id will be the foreign key.

Designing data mart

• When no PK-FK relationship is defined

– Identify the keys from the logical design

– Identify the keys from the equi-join keys in queries

– Identify the parent and child

• Types of relationships between two tables

– Single relationship with a single key

– Single relationship with multiple keys

– Multiple relationships with single or multiple keys
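When no constraints exist, one way to identify the parent and child for a candidate join key is to test on which side the key values are unique. The sketch below does this over in-memory sample rows; it is an illustration only, not how the Studio works, and the function names and the row representation (a list of dictionaries) are assumptions.

```python
def is_unique_key(rows, key_columns):
    """True if no two rows share the same values for key_columns."""
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key in seen:
            return False
        seen.add(key)
    return True


def classify_relationship(first_rows, second_rows, key_columns):
    """Classify the relationship between two tables joined on key_columns."""
    first_unique = is_unique_key(first_rows, key_columns)
    second_unique = is_unique_key(second_rows, key_columns)
    if first_unique and second_unique:
        return "1:1"
    if first_unique:
        return "1:n, first table is the parent"
    if second_unique:
        return "1:n, second table is the parent"
    return "n:m"
```

With customer rows that each have a distinct customer_id and web_sales rows that repeat customer_id values, the call `classify_relationship(customer, web_sales, ["customer_id"])` reports customer as the parent, which matches the customer and web_sales example on the earlier slide. Passing several column names covers the single relationship with multiple keys case.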

Designing data mart

• Single data mart with multiple fact tables

– The dimensions are shared by all fact tables

• Multiple data marts, each with its own fact table but the same dimension tables

– Each data mart holds a separate copy of the dimension tables

– Higher memory requirement

Designing data mart – Smart mart tool

• Enable workload analysis

• Run the workload

• Informix gives you the data mart definitions required to run the workload

• The design is done for you, based on the workload

• Deploy and load the mart using this definition

• Useful when generating a data mart for standard reports

• Use it as a guide for identifying the tables needed within warehouses.

Deploying the data mart

• Deployment creates the data mart definition and sends it to IWA

• Verify the fact tables and dimension tables.

• Generate the report and verify it when necessary

• You can load the data when deploying the data mart

• Typically you deploy once and load periodically

• Loading can be automated via the command line interface (CLI)

Deploying the data mart

• IWA returns one or more SQL statements representing the data mart.

• Informix creates Accelerated Query Tables (AQT) for those statements.

• AQTs are essentially views used exclusively for query matching

• Data mart deploy, enable, disable, and drop events are recorded in the system catalog

[Five-step overview diagram repeated. Additional label: "Design, Validate, Deploy Data mart".]

Loading the data mart

• Load the data mart using the Studio

• Or load it using the loadMart command from the CLI

• Loading takes a snapshot of the tables

• Options

– No locking of the tables

– Locking of all the tables

Query Flow

[Diagram: applications and BI tools, Informix, and the IWA coordinator with four workers. Each worker holds compressed data in memory and a memory image on disk.]

Step 1. Submit SQL (database protocol: SQLI or DRDA; network: TCP/IP, SHM)
Step 2. Query matching and redirection technology in Informix (alternative path: local execution)
Step 3. Offload SQL (DRDA over TCP/IP)
Step 4. Results (DRDA over TCP/IP)
Step 5. Return results/describe/error (database protocol: SQLI or DRDA; network: TCP/IP, SHM)

Query Flow within IWA

[Diagram: coordinator and four workers, each worker holding compressed data in memory.]

Step 1. SQL arrives from Informix
Step 2. Send the queries to all the workers
Step 3. Each worker: scan, filter, join, group
Step 4. Merge intermediate results, ORDER BY, FIRST n
Step 5. Send the results back to the Informix server
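The flow within IWA on this slide follows a scatter-gather pattern, which the following Python sketch imitates on in-memory lists. It is a simplified model under stated assumptions: the function names are hypothetical, workers run sequentially rather than in parallel, data is uncompressed, the join step is omitted, and the only aggregate is a sum ordered from largest to smallest.

```python
from collections import Counter


def worker_scan(partition, predicate, group_key, measure):
    """Step 3 on one worker: scan, filter, and group its own partition."""
    partial = Counter()
    for row in partition:
        if predicate(row):
            partial[row[group_key]] += row[measure]
    return partial


def coordinator_run(partitions, predicate, group_key, measure, first_n):
    """Steps 2 and 4: fan the query out, then merge, order, and limit."""
    # Step 2: every worker receives the same query for its partition.
    partials = [
        worker_scan(partition, predicate, group_key, measure)
        for partition in partitions
    ]
    # Step 4: merge the intermediate per-worker sums.
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    # ORDER BY the sum descending (group key as tie-breaker), then FIRST n.
    ordered = sorted(merged.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:first_n]
```

The point of the design is that each worker reduces its partition to a small partial result, so the merge step handles one entry per group per worker instead of every qualifying row.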

Life of a query

[Diagram: SQL Statement -> SQL Parser -> Semantic Analyzer -> Optimizer -> Query Plan -> Executor -> Query Results. Inputs shown: System Catalog Information; Table Stats & Column Distribution. Outputs shown: Explain File; Query stats.]

[Diagram: the same pipeline (SQL Statement -> SQL Parser -> Semantic Analyzer -> Optimizer -> Query Plan) with two execution paths: Informix Execution -> Query Results, and, for a query qualified for acceleration, Generate SQL -> IWA Execution -> Query Results. Inputs shown: System Catalog Information; Table Stats & Column Distribution. Outputs shown: Explain File; Query stats.]

Optimizer is enhanced to do the query matching

create view "dwa"."aqt2dbca0d9-509d-434b-9cc9-4a12c6de6b3d"
  ("COL16","COL17","COL18","COL19","COL20","COL21","COL22","COL23",
   "COL24","COL25","COL26","COL27","COL28","COL29","COL30","COL31", …,
   "COL35","COL36","COL37","COL38","COL39","COL40","COL41","COL42",
   "COL43","COL44","COL45","COL46","COL47","COL07","COL08", …,
   "COL12","COL13","COL14","COL15","COL48","COL49","COL50","COL51",
   "COL52","COL53","COL54","COL55","COL56","COL57","COL58","COL59", …,
   "COL02","COL03","COL04","COL05","COL06","COL62","COL63","COL64", …)
as select
  x0.perkey, x0.storekey, x0.custkey, x0.prodkey, x0.promokey,
  x0.quantity_sold, x0.extended_price, x0.extended_cost, x0.shelf_location,
  x0.shelf_number, x0.start_shelf_date, x0.shelf_height, x0.shelf_width,
  x0.shelf_depth, x0.shelf_cost, x0.shelf_cost_pct_of_sale,
  x0.bin_number, x0.product_per_bin, x0.start_bin_date, x0.bin_height,
  x0.bin_width, x0.bin_depth, x0.bin_cost, x0.bin_cost_pct_of_sale
  ……
from
  ((((("informix".daily_sales x0
        left join "informix".period x1 on (x0.perkey = x1.perkey))
       left join "informix".product x2 on (x0.prodkey = x2.prodkey))
      left join "informix"."store" x3 on (x0.storekey = x3.storekey))
     left join "informix".customer x4 on (x0.custkey = x4.custkey))
    left join "informix".promotion x5 on (x0.promokey = x5.promokey));

Content of the view

• The data mart schema should be a star or snowflake schema

• A single-table data mart is fine (e.g. weblog, call detail record)

• The view created represents the whole data mart

– All the selected columns from all tables

– The join predicates between the fact and dimension tables, and between dimensions

Query Matching

• The fact table should be used in the query

• Dimensions should be joined using the join keys in the data mart

• Only supported functions, expressions, and aggregates are used

• INNER JOIN, or LEFT OUTER JOIN with the fact table on the dominant side

• The query cannot reference tables outside the data mart
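Three of the matching conditions on this slide are simple set checks, which the sketch below expresses in Python. It is a toy model and not the optimizer's algorithm: the `DataMart` and `Query` classes and the function `match_failures` are hypothetical, a join is represented as an unordered pair of "table.column" strings, and the checks for supported functions and for join type are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataMart:
    fact_table: str
    tables: frozenset     # all tables in the data mart
    join_keys: frozenset  # each item: frozenset of two "table.column" names


@dataclass(frozen=True)
class Query:
    tables: frozenset     # tables referenced by the query
    joins: frozenset      # same representation as DataMart.join_keys


def match_failures(query, mart):
    """Return the reasons a query cannot match; an empty list means it can."""
    reasons = []
    if mart.fact_table not in query.tables:
        reasons.append("fact table is not used in the query")
    outside = query.tables - mart.tables
    if outside:
        reasons.append("tables outside the data mart: " + ", ".join(sorted(outside)))
    if query.joins - mart.join_keys:
        reasons.append("join keys differ from those in the data mart")
    return reasons
```

For a mart built on daily_sales with period and product as dimensions, a query that joins daily_sales to period on perkey yields no failures, while a query that touches only period is rejected because the fact table is missing.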

Create the table:

create table kfact(id int, name varchar(32), amount decimal(9,2));

Create the data mart. The data mart definition is saved in the database as a special view:

create view "dwa"."aqtf5246230-8cce-42fd-8c3e-f516bbeacca3" ("COL1","COL2","COL3")
as select x0.id, x0."name", x0.amount
from "keshav".kfact x0;

QUERY: (ISAO-Executed)(OPTIMIZATION TIMESTAMP: 05-16-2011 09:10:37)
------
select count(*) from kfact

Estimated Cost: 1
Estimated # of Rows Returned: 1
Maximum Threads: 0

  1) tpcds_100gb@DWAFINAL:dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3: REMOTE PATH

    Remote SQL Request:
    {QUERY {FROM dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3} {SELECT {count(*) } } }

select id, name, sum(amount) from kfact group by id, name

Estimated Cost: 4
Estimated # of Rows Returned: 1
Maximum Threads: 0

  1) tpcds_100gb@DWAFINAL:dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3: REMOTE PATH

    Remote SQL Request:
    {QUERY {FROM dwa.aqtf5246230-8cce-42fd-8c3e-f516bbeacca3} {SELECT {SYSCAST COL1 AS INTEGER NULLABLE} {SYSCAST COL2 AS VARCHAR 32 819} {SUM COL3 } } {GROUP COL1 COL2 } }

Thank You

Acceleration with Informix Warehouse Accelerator

[Diagram: Smart Analytics Data Studio, Informix, and Informix Warehouse Accelerator (coordinator process and worker processes).]

1. Identify the data mart to offload.
2. Data mart definition
3. Return the SQL representation
4. Create the metadata
5. Issue the off-load data mart command
6. Off-load the data
7. Distribute the data among workers
8. Compress the data
9. Return ACK

Acceleration with Informix Warehouse Accelerator

[Diagram: applications and BI tools, Informix, and Informix Warehouse Accelerator (coordinator process and worker processes).]

Step 1. Submit SQL (database protocol: SQLI or DRDA; network: TCP/IP, SHM)
Step 2. IDS query matching and redirection technology (alternative path: local execution)
Step 3. Offload SQL (DRDA over TCP/IP)
Step 4. Results (DRDA over TCP/IP)
Step 5. Return results/describe/error (database protocol: SQLI or DRDA; network: TCP/IP, SHM)