Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse...

49

description

Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises

Transcript of Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse...

Page 1: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.
Page 2: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

SQL

Kennie Nybo Pontoppidan

Stories from the trenches - Partitioning as a design pattern

Head of R&DRehfeld

Page 3: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Agenda

Stories from the trenches - Partitioning as a design pattern

• About the FLIS project• Data warehouse heroes and architecture• Partitioning in many disguises

Page 4: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

The FLIS Project

Page 5: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

You are here

Page 6: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

The FLIS project - Team • Netcompa

ny75%

• Rehfeld25%

• TDC hosting

Page 7: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

The FLIS project - Mission• Jointly defined KPI’s

• View your own KPI’s• Benchmark with ”twins”

• Access to your own data • Common datamodel• Raw data

Page 8: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Access to your own data?• ~ 72 mill. kr.

to get data for 4 years

• (10 mill. euro)

Page 9: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

The FLIS project – DW Challenges• Data from 30 different it-systems

• 7 subject areas• Citizens• Employees• Salaries• Absence

• ERP• Budgets• Postings

• Schools• …

• Conformity• Within area• Between areas

Page 10: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.
Page 11: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Data warehouse heros

Bill

Ralph

Dan

Page 12: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Bill Inmon• Inventor of the word

Data warehouse

• Oracle reference architecture

• Top down approach

• Hub and Spoke

Page 13: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Ralph Kimball• ”Invented” dimension

modelling

• Microsoft ”reference architecture”

• Bottom up

Page 14: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dan Lindstedt• Hybrid model

• Hubs, satellites and Links

Page 15: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

FLIS Data Warehouse Architecture

Page 16: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.
Page 17: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.
Page 18: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Technologies

Page 19: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Squirrel server ?

Page 20: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

SSIS package XML

Page 21: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Informatica PowerCenter XML

Page 22: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Look up the wordpar·ti·tion  (pär-tshn)n.

1.a. The act or process of dividing something into parts.b. The state of being so divided.2.a. Something that divides or separates, as a wall dividing one room or cubicle from another.b. A wall, septum, or other separating membrane in an organism.3. A part or section into which something has been divided.…

Source: http://www.thefreedictionary.com/partitioning

Page 23: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

MCM videos

Sqlskills.com

Page 24: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Partitioning as a design patternDividing something into parts

• Files• Database• Table

State of being so divided• ETL developers vs. Customer• Developers vs (evil?) DBA’s• Backups (what is a full backup anyway)

Page 25: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dividing something into parts - files• Xml, csv, xls, fixed

format• 1 file• Data from • 1 or more tables• Data from 1 or more municipality

• 30 different file naming schemes

Page 26: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dividing something into parts - files• Csv• 1 file• 1 table• Data from just 1 municipality

• 1 file naming scheme

Page 27: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Files (comple

x)

File-dsa mappin

gs(comple

x)

Dsa tables

Dividing something into parts - files

Files (complex)

Preprocessor

(complex)

File-dsa mappings (simple)

Dsa tables

Page 28: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dividing something into parts - Database1 data warehouse layer = 1 database

• Scaling• IO-pattern for a data flow

Page 29: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dividing something into parts - Database1 database

• 4 LUNS, RAID 10• 4 datafiles

Page 30: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dividing something into parts - DatabaseFilegroups

• PRIMARY • 160MB

• BIG ONE• A lot of data

Page 31: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dividing something into parts - TablesTable partitioning (2005 EE feature)

• Pruning• Divide the data warehouse into 100 parts 1/100 the size

• Switching• Separation of readers and writers• Fast

Page 32: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dividing something into parts - Tables

• DSA• Partition by (municipality_id,

month_year)

• EDW and data marts• Partition by municipality_id

Page 33: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.
Page 34: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Ask Kristian!

CREATE TABLE [dbo].[partition_test](              [Kommune_id] [varchar](11) NULL,              [Kommunenummer] [nvarchar](4000) NULL,              [Distrikt_kode] [nvarchar](4000) NULL,              [Distrikt_type] [nvarchar](4000) NULL,              [Distrikt_tekst] [nvarchar](4000) NULL,              [Cpr_Dist_Tekst_TS] [nvarchar](4000) NULL) ON [partition_test_pt_sc]([Kommune_id])GO

Page 35: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dear Mark! Please, please, please, pl

Page 36: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Dividing something into parts – CPU’s

• Affinity masking

• Not really possible in Oracle – even on Windows

Page 37: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

State of being so divided

Page 38: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

State of being so divided• Informatica PowerCenter (ETL vs.

Customer)

• Meta data • File definitions• Table definitions in all layers• Simple transformations

• Autogenerate mapping code from meta data• Go on – please change requirements

Page 39: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Mapping metadata model

MappingID (int)

DestinationColumnFK (int)SourceColumnFK (int)

Mapping

ColumnID (int)

ColumnName (varchar)

Column

DWLayerID (int)DWLayerName (varchar)

DWLayer

TransformationID (int)FlowFK (int)

Transformation

ColumnTypeFK (int) ColumnTypeID (int)ColumnTypeName (varchar)

ColumnType (source data)

TableFK (int)

TableTypeID (int)TableTypeName (varchar)

TableType

TransformationSchemaID (int)TransformationTypeID (int)

TransformationSchema

StandardColumnID (int)

StandardColumnName (varchar)

StandardColumn

DWLayerFK (int)

Logic (varchar)

TransformationTypeID (int)TransformationTypeName (varchar)

TransformationTypeTransformationSchemaName (varchar)

FlowID (int)Flow

FlowName (varchar)FlowTypeFK (int)

FlowTypeID (int)FlowType

FlowTypeName (varchar)

TransformationFK (int)

TransformationSchemaFK (int)Logic (varchar)

DataTypeFK (int)TableID (int)

TableName (varchar)

Table

DWLayerFK (int)TableTypeFK (int)

Order???

DataTypeID (int)DataTypeName (varchar)

DataType

Page 40: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

State of being so divided• Developers vs. • (evil) DBA’s

• Meta data on table definitions• Script ddl from meta data• Hide partitioning in ddl• Partition Scheme• Change File group design

Page 41: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

State of being so divided• Backups: DBA’s vs. Backup administrator

• Partitioning helps us divide data into • Hot• (C)old (and therefore read only)• Only backup hot partitions

Page 42: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

State of being so divided• Restoring: DBA vs. SQL server • PRIMARY filegroup• 150 MB

• Default filegroup• Big

• Restore database • only primary filegroup => online

Page 43: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

State of being so divided• DW vs OLTP and OLAP

• DW server• ETL• processing OLAP databases/cubes• Reporting server• Sharepoint databases (OLTP)• restoring OLAP databases/cubes

Page 44: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.
Page 45: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

We also use these cool features…• Page compression• Backup compression

Page 46: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

Page compression results

tableFactor (size) Num rows num reads

cpu time (ms)

elapsed time (ms)

dg1 1 89132 372 31 739dg1 2 178264 743 31 1517dg1 10 891320 3781 281 7551

dg1_page_compr 1 89132 143 15 748dg1_page_compr 2 178264 285 62 1512dg1_page_compr 10 891320 1427 421 7596

code vartype varvalDA000 DGALT 00K00DA000 DGCAT 06M38ADA000 DGPROP 26X01DA001 DGALT 00K00DA001 DGCAT 06M38ADA001 DGPROP 26X01DA009 DGALT 00K00DA009 DGCAT 06M38ADA009 DGPROP 26X01

Page 47: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

or

Page 48: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

EvaluationCreate a Text message on your phone and send it to 1919 with the content:

DB301 5 5 5 I liked it a lotSession Code

KenniePerformance (1 to 5)

Match of technical

Level(1 to 5)

Relevance(1 to 5) Comments

(optional)

Evaluation Scale: 1 = Very bad 2 = Bad 3 = Relevant 4 = Good 5 = Very Good!

Questions:• Speaker Performance• Relevance according to

your work • Match of technical level

according to published level• Comments

Page 49: Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises.

© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation.  Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.  MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Thank you