Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Agile BI & Data Virtualization Tom Breur [email protected] BI & IM Symposium Bussum, 26 November 2012


Transcript of Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Page 1: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Agile BI & Data Virtualization

Tom Breur [email protected]

BI & IM Symposium
Bussum, 26 November 2012

Page 2: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

It’s a stretch…
- Volumes of data are growing (fast)!
- Variety of sources keeps expanding: social media, RFID, log files, GPS, etc.
- Business users need their data (much) sooner: monthly → weekly → daily → intra-day
- BI in support of operational processes calls for (near) real-time data

Page 3: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Why go “Agile”? (1)
- BI projects fail too often, or don’t live up to expectations
- Increasingly, BI development takes place alongside (instead of after) application engineering

Page 4: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Why go “Agile”? (2)

Winston Royce (1970):

“In my experience, the simpler model … [as pictured below] has never worked on large software development efforts”

[Figure: sequential phases Analysis → Design → Development → Test → Release]

[Royce subsequently went on to describe an enhanced model, which included building a prototype first and then using the prototype plus feedback between phases to build a final deployment]

www.xlntconsulting.com

Page 5: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

History of development methods

From “code and fix” to more structured, methodical approaches to software development

Timeline (1960 to 2010), with eras labelled: Unstructured, Prescriptive methods, Structured methods

- Royce (1970) “Managing the Development of Large Software Systems”
- Brooks (1975) “The Mythical Man-Month”
- Jackson (1975) “Principles of Program Design”
- Boehm (1986) “A Spiral model of Software Development and Enhancement”
- Martin (1991) “Rapid Application Development”
- 1994: DSDM Consortium launched
- 1996: Scrum ‘invented’
- 1997: term eXtreme Programming (XP) ‘invented’
- Beck (2000) “Extreme Programming Explained”
- 2001: term ‘Agile’ adopted
- Poppendieck (2003) “Lean Software Development”
- Cockburn (2004) “Crystal Clear”
- Anderson (2010) “Kanban”

Page 6: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Quick & Dirty ≠ Agile (1)

www.agilemanifesto.org (principle #1):

“Our highest priority is to satisfy the customer through early and continuous delivery of valuable software” [emphasis added]

Creating “technical debt” stands squarely in the way of continuous delivery and of maintaining a so-called “sustainable pace”: it creates (new) legacy!

Page 7: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Quick & Dirty ≠ Agile (2)

Top-down project management (e.g.: Scrum)
&
Bottom-up software engineering (e.g.: Extreme Programming - XP)

Expedited delivery
&
Architectural integrity

Page 8: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Quick & Dirty ≠ Agile (3)

Page 9: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

BI requirements
- Information products trigger change requests: new data → insights → new requirements
- Gerald M. (Jerry) Weinberg: “Without stable requirements, development can’t stabilize, either”

Page 10: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

BI: means and ends uncertainty

Means uncertainty: How do we get there?
- Lack of “design patterns”
- Data integration fraught with data quality issues
- Lack of Master Data Management
- Lack of Meta Data
- No agreement on how to conform dimensions

Ends uncertainty: Where are we going to?
- Requirements are difficult to pin down
- Diverse end-user groups
- Ambiguous business case(s)
- Scope is unclear
- Data warehouses are never “done”

Page 11: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Waterfall vs. Agile

source: Dean Leffingwell (2011)

             Waterfall/Traditional    Agile
Driven by    Plan                     Value
Fixed        Requirements             Resources, Date
Estimated    Resources, Date          Requirements

Agile fixes the date and resources and varies the scope

Page 12: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Weinberg on Quality

“If quality isn’t an objective (if the software doesn’t have to work), you can satisfy any other constraint (e.g.: budget, time, etc.)”
Gerald M. (Jerry) Weinberg

Page 13: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Concurrent development (1)
- Waterfall: you can avoid mistakes/rework by getting good requirements upfront
- The most costly mistakes arise from forgetting important elements early on
- Detailed planning (BDUF, Big Design Up Front) requires early (ill-informed) decisions and uses more time, leading to fewer tangible products to resolve ambiguity: a vicious cycle

Page 14: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Concurrent development (2)
- Agile: decide at the “last responsible moment”: decisions that haven’t been made don’t ever need to be reverted
- No “free lunch”: deferring decisions requires
  - anticipating likely change
  - coordination/collaboration within team
  - close contact with customers

Page 15: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Inmon vs. Kimball (1)

Inmon: 3-tiered
Kimball: 2-tiered

Page 16: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Inmon vs. Kimball (2)

Problems with Inmon
- Uncovering the ‘correct’ 3NF model requires scarce business expertise
- Unclear where 3NF model boundaries begin and end
- Model redesigns trigger a cascading nightmare of parent-child key updates

Problems with Kimball
- Smallest unit of delivery is a Star, and incremental growth adds prohibitive overhead
- Dimensional structure is very rigid: not conducive to expansion or change
- Conforming dimensions is hard, especially without access to data

Page 17: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

3NF Dimensional (1)


Page 18: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

3NF Dimensional (2)

see: Kimball design tip #149
http://www.kimballgroup.com/2012/10/02/design-tip-149-facing-the-re-keying-crisis/

this problem gets (much!) worse with multiple parent-child levels
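
A rough sketch of why the number of levels matters. It assumes that every child row stores its parent's surrogate key, so that a new key on a parent forces a new version (and hence a new key) of each child, and so on downwards. The hierarchy and the function name `rows_to_rekey` are hypothetical and only serve as illustration.

```python
def rows_to_rekey(children, changed):
    """Count the rows that need a new key when `changed` gets a new key.

    Assumption: each child row holds its parent's surrogate key, so the
    change cascades to all descendants.
    """
    count = 1  # the changed row itself
    for child in children.get(changed, []):
        count += rows_to_rekey(children, child)
    return count


# Hypothetical three-level hierarchy: region -> country -> city
children = {
    "EMEA": ["NL", "DE"],
    "NL": ["Bussum", "Utrecht"],
    "DE": ["Berlin", "Bonn"],
}

print(rows_to_rekey(children, "NL"))    # change at the middle level
print(rows_to_rekey(children, "EMEA"))  # change at the top level
```

In this toy hierarchy a change at the middle level touches 3 rows, while a change at the top touches 7; each extra level multiplies the effect by the fan-out.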

Page 19: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Hyper normalized model

business keys, context attributes (history), and relations all have their own tables

appending “Supplier data” to the model (or any other new source) is guaranteed to be contained as a “local” problem (= extension) in the data model, because business keys, context attributes (history), and relations all have their own tables
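
A minimal sketch of the “local extension” idea, using an in-memory SQLite database. All table and column names are hypothetical; the slide does not prescribe a naming scheme or a database. The point is only that adding supplier data creates new tables and leaves the existing definitions untouched.

```python
import sqlite3

# Hypothetical starting model: one table for the business key,
# one for the context attributes (with history via valid_from).
BASE_SCHEMA = """
CREATE TABLE product_key (
    product_id INTEGER PRIMARY KEY,
    product_code TEXT UNIQUE NOT NULL
);
CREATE TABLE product_context (
    product_id INTEGER NOT NULL REFERENCES product_key(product_id),
    valid_from TEXT NOT NULL,
    name TEXT,
    list_price REAL,
    PRIMARY KEY (product_id, valid_from)
);
"""

# Hypothetical extension: supplier key, supplier context, and a
# separate relation table linking products to suppliers.
SUPPLIER_EXTENSION = """
CREATE TABLE supplier_key (
    supplier_id INTEGER PRIMARY KEY,
    supplier_code TEXT UNIQUE NOT NULL
);
CREATE TABLE supplier_context (
    supplier_id INTEGER NOT NULL REFERENCES supplier_key(supplier_id),
    valid_from TEXT NOT NULL,
    name TEXT,
    country TEXT,
    PRIMARY KEY (supplier_id, valid_from)
);
CREATE TABLE product_supplier_relation (
    product_id INTEGER NOT NULL REFERENCES product_key(product_id),
    supplier_id INTEGER NOT NULL REFERENCES supplier_key(supplier_id),
    valid_from TEXT NOT NULL,
    PRIMARY KEY (product_id, supplier_id, valid_from)
);
"""


def table_definitions(conn):
    """Return {table name: CREATE statement} as recorded by SQLite."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(BASE_SCHEMA)
before = table_definitions(conn)

conn.executescript(SUPPLIER_EXTENSION)
after = table_definitions(conn)

# Every pre-existing table still has exactly the same definition.
unchanged = all(after[name] == sql for name, sql in before.items())
added = sorted(set(after) - set(before))
print("existing tables untouched:", unchanged)
print("tables added:", added)
```

Because the relation lives in its own table, no existing table needs a new foreign-key column when the supplier source arrives.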

Page 20: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

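3-tiered DWH architecture

[Figure: architecture diagram with these components, from sources to front end:
- Sources: Legacy, OLTP, ERP, LOG files, External
- ETL, Staging Area
- Data Warehouse, ODS
- Datamart 1, Datamart 2, … Datamart n
- Business Intelligence Applications
- Metadata
Modeling styles labelled on the figure: 3NF, hyper normalized, dimensional]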

Page 21: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Horses for courses

3NF
- quickly & accurately capture transaction data
- easy to get data in

Hyper normalized
- integrate historical data
- capture all data, all the time

Dimensional
- present & analyze data
- easy to get data out

Page 22: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

[Figure: architecture diagram (sources Legacy, OLTP, ERP, LOG files, External; ETL; Staging Area; Data Warehouse; ODS; Datamart 1, Datamart 2, … Datamart n; Business Intelligence Applications; Metadata), divided into two parts:]
- Back room: Data Warehouse Architecture
- Front room: Business Intelligence Architecture

Page 23: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Divide & Conquer
- “Break down” semantic gap from back- to front room
- Offer a range of data services:
  - Source data “as is”
  - Source data that have undergone cleansing
  - Dimensional models
  - Full-fledged BI applications
- Allow business to set priorities!

Page 24: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Why data virtualization?
- Operational BI calls for real-time data
- Integrate heterogeneous sources, at least “in the eye of the beholder”
- Data virtualization layer hides complexity about underlying applications & enables sharing of meta data
- Data virtualization enables federation, so you can delay (definitive) modeling, yet make data available early
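
To make the federation point concrete, here is a toy sketch, not a real data virtualization product. It assumes two hypothetical sources (an order system in SQLite and a CRM export as CSV text) and a function `revenue_per_customer` that queries both at request time and joins the results in memory, without copying anything into a warehouse first.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): an operational order system.
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE orders (customer_code TEXT, amount REAL)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("C1", 100.0), ("C2", 40.0), ("C1", 25.0)],
)

# Source 2 (hypothetical): a CRM export delivered as CSV text.
crm_csv = "customer_code,name\nC1,Acme\nC2,Globex\n"


def revenue_per_customer():
    """Virtual view: read both sources at request time and join in memory."""
    names = {
        row["customer_code"]: row["name"]
        for row in csv.DictReader(io.StringIO(crm_csv))
    }
    totals = orders_db.execute(
        "SELECT customer_code, SUM(amount) FROM orders "
        "GROUP BY customer_code ORDER BY customer_code"
    ).fetchall()
    # Fall back to the raw code when the CRM has no name for a customer.
    return [(names.get(code, code), total) for code, total in totals]


first = revenue_per_customer()
# A new order arrives in the operational system ...
orders_db.execute("INSERT INTO orders VALUES ('C2', 60.0)")
# ... and is visible on the next request, with no reload step in between.
second = revenue_per_customer()

print(first)
print(second)
```

The consumer only sees one combined result, and the join logic can be revised later without reloading data, which is the sense in which definitive modeling can be delayed.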

Page 25: Tom Breur, XLNT - Agile BI And Data Virtualization - BI Symposium 2012

Conclusion
- Big Data are here to stay (and let’s hope the hype passes soon)
- Data provide a source of sustainable competitive advantage
- Speed and volume prohibit (wholesale) copying: virtualization is the way forward
- Agile BI enables business alignment, and gives us a “sporting chance” to keep up
