Post on 14-Dec-2015

Confidential © 2011 Actian Corporation

Ingres/Vectorwise Implementation Details

XXV Ingres Benutzerkonferenz 2012

Doug Inkster

Abstract

This session investigates the interface between Ingres and Vectorwise in more detail. It describes changes made to Ingres to accommodate Vectorwise and changes made to Vectorwise to accommodate Ingres, as well as specific features of Vectorwise and how they are exploited from Ingres.

Contents

• Ingres/VW overview
• In Ingres, not VW
• In VW, not Ingres
• Ingres/VW coordination
• Clever VW features and their exploitation
• Coming VW features

Ingres/Vectorwise Overview

• Queries arrive at the Ingres server and are processed, as needed, by the VW server as well

• Tables defined in Ingres catalogs as additional table type, but also in VW catalog

• Select/update/delete compiled in OPF, passed by QEF to VW server as VW query

• Insert/copy compiled in OPF, row images passed by QEF to VW server

Ingres/Vectorwise Overview

• VW algebra
• Generated by optimizer’s cross compiler
• Nested operators: Project, Select, TopN, Window, Sort, Aggr, OrdAggr, Mscan, MergeJoin1, HashJoin01, HashJoinN, CartProd

• Trace point op207 displays compiled query – use x100pp to format it

Sample VW query

“select r_name, n_name from region, nation where r_regionkey = n_regionkey order by r_name” generates:

Sort ( Project ( HashJoin01 ( MScan ( _nation = '_nation', [ '_n_regionkey', '_n_name'] ) [ 'est_card' = '25' ] , [ _nation._n_regionkey ], MScan (_region = '_region', [ '_r_regionkey', '_r_name'] ) [ 'est_card' = '5' ] , [ _region._r_regionkey ], 0 ) [ 'est_card' = '25' ] , [_region._r_name, _nation._n_name] ),[_region._r_name])

OP207/x100pp

• “set trace point op207” displays VW syntax, x100pp makes it readable

Sort (
  Project (
    HashJoin01 (
      MScan ( _nation = '_nation', [ '_n_regionkey', '_n_name' ] ) [ 'est_card' = '25' ],
      [ _nation._n_regionkey ],
      MScan ( _region = '_region', [ '_r_regionkey', '_r_name' ] ) [ 'est_card' = '5' ],
      [ _region._r_regionkey ], 0
    ) [ 'est_card' = '25' ],
    [ _region._r_name, _nation._n_name ]
  ),
  [ _region._r_name ]
)

In Ingres, not VW

• Statistical aggregate functions (standard deviation, variance, regression, correlation)
• var_pop(x) is defined in the standard as (s2 - s1*s1/n)/n, where s2 is sum(x*x), s1 is sum(x) and n is count(x)
• OPF rewriter replaces the var_pop() function with the expanded formula; same for the other statistical aggs
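The rewrite can be checked numerically. The following is a minimal Python sketch, not the OPF rewriter itself: the helper name var_pop_from_sums is hypothetical, and it assumes the sum-based form (s2 - s1*s1/n)/n with s2 = sum(x*x), s1 = sum(x) and n = count(x). It compares the result with the population variance from Python's standard library.

```python
import statistics


def var_pop_from_sums(xs):
    """Population variance built only from count and sum aggregates.

    Hypothetical helper for illustration; not taken from Ingres or Vectorwise.
    """
    n = len(xs)                    # count(x)
    s1 = sum(xs)                   # sum(x)
    s2 = sum(x * x for x in xs)    # sum(x*x)
    return (s2 - s1 * s1 / n) / n


values = [2, 4, 4, 4, 5, 5, 7, 9]
# The two numbers printed should agree up to floating-point rounding.
print(var_pop_from_sums(values), statistics.pvariance(values))
```

Because the expanded form needs nothing beyond sum() and count(), an engine that offers only those basic aggregates can still evaluate it.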

In Ingres, not VW

• Vectorwise has no equivalent to Ingres SEjoin for handling difficult subqueries
• New flattening algorithms used, e.g.
“select * from p where pno >= all (select pno from sp where qty = 100)”
is flattened to
“select * from p, (select max(pno) as mpno, count(*) as cnt, count(pno) as cpno from sp where qty = 100) x where (p.pno >= mpno and cnt = cpno) or cnt = 0”
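The three aggregates in the flattened form each cover one case of the ALL quantifier: max(pno) handles the ordinary comparison, cnt = cpno rejects subqueries that contain a NULL pno, and cnt = 0 accepts every row when the subquery is empty. The sketch below is an illustration only. It runs the flattened query in SQLite (which has no ALL quantifier) and compares it with a hand-written Python model of the ALL semantics. The table layouts, the helper names and the use of SQLite are assumptions; p.pno is assumed to be non-NULL in the sample data.

```python
import sqlite3

FLATTENED = """
    select p.pno from p,
      (select max(pno) as mpno, count(*) as cnt, count(pno) as cpno
         from sp where qty = 100) x
    where (p.pno >= mpno and cnt = cpno) or cnt = 0
"""


def flattened(p_rows, sp_rows):
    """Run the flattened query in an in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    con.execute("create table p (pno integer)")
    con.execute("create table sp (pno integer, qty integer)")
    con.executemany("insert into p values (?)", [(v,) for v in p_rows])
    con.executemany("insert into sp values (?, ?)", sp_rows)
    result = sorted(row[0] for row in con.execute(FLATTENED))
    con.close()
    return result


def ge_all(value, candidates):
    """True only when `value >= ALL (candidates)` is TRUE under SQL rules."""
    if not candidates:
        return True                      # ALL over an empty set is TRUE
    if value is None or None in candidates:
        return False                     # FALSE or UNKNOWN, never TRUE
    return all(value >= c for c in candidates)


def reference(p_rows, sp_rows):
    """Python model of the original, unflattened query."""
    candidates = [pno for pno, qty in sp_rows if qty == 100]
    return sorted(v for v in p_rows if ge_all(v, candidates))


p = [1, 5, 7]
for sp in ([(3, 100), (5, 100), (9, 50)],   # ordinary case
           [(3, 100), (None, 100)],         # NULL in the subquery result
           [(9, 50)]):                      # empty subquery result
    print(flattened(p, sp) == reference(p, sp))
```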

In VW, not Ingres

• “derived” notation in aggregate grouping, e.g.
“select custname, count(ordno) from customer, orders where custno = o_custno group by custno, custname”
generates
“… Aggr(…, [custno, custname DERIVED], [x = count(ordno)]), …”

• Ingres now tracks functional dependencies based on key constraints (primary, foreign key), equijoins, aggregation groupings, etc.

• Cross compiler determines applicability of DERIVED

Not in VW, not in Ingres

• GROUP BY enhancements: rollup, cube, grouping sets (new in 3.0)
• Defined in SQL standard using UNION
• SELECT … GROUP BY CUBE (a, b) transforms to:
SELECT … GROUP BY a, b UNION ALL
SELECT … GROUP BY a UNION ALL
SELECT … GROUP BY b UNION ALL
SELECT … GROUP BY ()
• The transformation is handled entirely in the optimizer rewriter phase
• Works in Vectorwise with no changes to VW
• Works for native Ingres with no changes to query execution facility
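The expansion of CUBE and ROLLUP into UNION ALL branches can be sketched in a few lines of Python. This is not the optimizer's rewriter: the function names are hypothetical, and the sketch leaves the select list untouched, whereas a real rewrite would also have to supply NULLs for columns missing from a branch's grouping set.

```python
from itertools import combinations


def cube_sets(cols):
    """Grouping sets of CUBE: every subset of cols, largest first."""
    return [c for r in range(len(cols), -1, -1) for c in combinations(cols, r)]


def rollup_sets(cols):
    """Grouping sets of ROLLUP: every prefix of cols, longest first."""
    return [tuple(cols[:r]) for r in range(len(cols), -1, -1)]


def union_all(select_prefix, grouping_sets):
    """Render one SELECT per grouping set, joined by UNION ALL."""
    branches = [
        f"{select_prefix} GROUP BY {', '.join(s) if s else '()'}"
        for s in grouping_sets
    ]
    return "\nUNION ALL\n".join(branches)


# Four branches for CUBE (a, b), three for ROLLUP (a, b).
print(union_all("SELECT ...", cube_sets(["a", "b"])))
print(union_all("SELECT ...", rollup_sets(["a", "b"])))
```

CUBE over n columns yields 2^n branches and ROLLUP yields n + 1, so the size of the generated query grows quickly with the number of CUBE columns.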

Ingres/VW Coordination

• Some features involve both Ingres and Vectorwise changes, e.g.
“select …, rank() over (partition by sno order by qty) as qrank …”
generates:
… Project( Sort( Mscan('sp', ['sno', …, 'qty'] ), [sp.sno, sp.qty] ), [TRSDM_0 = diff(sp.sno), TRSDM_1 = rediff(TRSDM_0, sp.qty), …, qrank = sqlrank(TRSDM_0, TRSDM_1)] ) …
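One plausible reading of this plan is that, after the sort, diff() flags rows where the partition column changes, rediff() extends those flags to rows where the ordering column changes, and sqlrank() turns the two flag columns into ranks. The Python sketch below models that reading. The behaviour given to diff, rediff and sqlrank here is an assumption inferred from their names and the plan shape, not documented Vectorwise semantics.

```python
def diff(col):
    """Flag rows whose value differs from the previous row (first row is flagged)."""
    return [i == 0 or col[i] != col[i - 1] for i in range(len(col))]


def rediff(prev_flags, col):
    """Flag rows already flagged, plus rows where col differs from the previous row."""
    return [prev_flags[i] or col[i] != col[i - 1] for i in range(len(col))]


def sqlrank(partition_flags, order_flags):
    """SQL rank(): ties share a rank, and the rank restarts with each partition."""
    ranks, row_in_partition, current = [], 0, 0
    for new_partition, new_value in zip(partition_flags, order_flags):
        if new_partition:
            row_in_partition = 0          # restart numbering in a new partition
        row_in_partition += 1
        if new_value:
            current = row_in_partition    # rank jumps to the row position
        ranks.append(current)
    return ranks


# Rows must already be sorted on (sno, qty), as the Sort operator guarantees.
sno = ["s1", "s1", "s1", "s2", "s2"]
qty = [100, 100, 200, 50, 300]
trsdm_0 = diff(sno)
trsdm_1 = rediff(trsdm_0, qty)
print(sqlrank(trsdm_0, trsdm_1))
```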

Ingres/VW Coordination

• Vectorwise REUSE capability
• OPF identifies fragments of query appearing in multiple places (in UNIONs, in subqueries, …)
• Common fragment builds separate query plan component
• Vectorwise query caches initial instantiation of fragment
• Subsequent references to fragment processed against cached rows

Ingres/VW Coordination

“select s_acctbal, s_name, p_partkey, p_mfgr, s_address, s_phone, s_comment from part, supplier, partsupp where p_partkey = ps_partkey and s_suppkey = ps_suppkey and ... and ps_supplycost = (select min(ps_supplycost) from partsupp, supplier where p_partkey = ps_partkey and s_suppkey = ps_suppkey)”

Project (
  HashJoin01 (
    As (
      IIREUSESQ6 =
        Project (
          HashJoin01 (
            MergeJoin1 (
              MScan (
                _partsupp000 = '_partsupp', [ '_ps_suppkey', '_ps_partkey', '_ps_supplycost', '__jpartsupp'] ...
    ), __VT_6_1_3_1
    ), [ __VT_6_1_3_1._p_partkey, __VT_6_1_3_1._ps_supplycost ],
    As (
      Aggr (
        As (
          IIREUSESQ6, __VT_6_0_3_2 ...

Clever Vectorwise Features

• Compression – data compressed using variety of techniques (type and value distribution dependent)

• Kept compressed in buffers
• Only expanded when being processed by query operators

Clever Vectorwise Features

• Max/min values of every column, even non-indexed ones, are stored with each disk block
• Restrictions are applied at the block level, e.g.
… where l_shipdate between date '2009-01-01' and date '2010-06-30' …
– will only read blocks with at least 1 row in the restricted range
– clustering lineitem rows on o_orderdate effectively clusters on l_shipdate, too
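Block-level restriction with per-block min/max values can be sketched as follows. This is a toy model, not Vectorwise's storage layer: the Block class, the block size and the helper names are all hypothetical. The overlap test is conservative, so a block is read whenever its [min, max] interval intersects the requested range, and the row-level filter is still applied to the blocks that are read.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Block:
    min_val: date   # smallest column value stored in the block
    max_val: date   # largest column value stored in the block
    rows: list      # the column values themselves


def make_blocks(values, block_size):
    """Split a column into fixed-size blocks and record min/max per block."""
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        blocks.append(Block(min(chunk), max(chunk), chunk))
    return blocks


def scan_between(blocks, lo, hi):
    """Return (qualifying values, number of blocks actually read)."""
    hits, blocks_read = [], 0
    for b in blocks:
        if b.max_val < lo or b.min_val > hi:
            continue                      # no row can qualify: skip the block
        blocks_read += 1
        hits.extend(v for v in b.rows if lo <= v <= hi)
    return hits, blocks_read


# Values arrive roughly in date order, as they would for clustered data.
shipdates = [date(2008, 1, 1), date(2008, 6, 1), date(2009, 3, 1), date(2009, 9, 1),
             date(2010, 5, 1), date(2010, 12, 1), date(2011, 1, 1), date(2011, 6, 1)]
blocks = make_blocks(shipdates, block_size=2)
hits, blocks_read = scan_between(blocks, date(2009, 1, 1), date(2010, 6, 30))
print(len(hits), "rows from", blocks_read, "of", len(blocks), "blocks")
```

The benefit depends on how well the data is ordered: if values were scattered randomly across blocks, most min/max intervals would overlap the range and few blocks could be skipped, which is why clustering on a correlated column matters.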

Coming Vectorwise Features

• Just in time compilation
– select portions of query for compiling into executable code
– Project(), other operators computing expressions
– Single call to compute entire expression, not one per operation
• Cooperative scans
– Scan scheduler tracks different queries requesting scans on same tables/columns
– Single scan shared by multiple executing queries

Coming Vectorwise Features

• Additional compression techniques
• Clustered Vectorwise
• New indexing techniques

• Intern program shared with CWI

Summary

• Exciting present
• Promising future