Pentaho Dev Day 30th October 2013 - Meetupfiles.meetup.com/1804355/Pentaho_Zaponet_DevDay... · 1...

© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555 1 Pentaho Dev Day 30 th October 2013 Sébastien Cognet Sales Engineer EMEA @opentoile


Pentaho Dev Day

30th October 2013

Sébastien Cognet

Sales Engineer EMEA

@opentoile


Agenda

PENTAHO Introduction

Analytic suite and more…

USER CONSOLE Architecture

Report, Analyze & Dashboard

Datasource management

Mobility

Demo #1

DEV TOOLS Metadata

Metadata Editor

Schema Workbench

Report Designer

CTools

INTEGRATION Graphic Design

Agile development

Demo #2

OEM Embedded analytics

Many architectures

BIG DATA Visual MapReduce

Big Data Layer

Blended Data

Demo #3

DATAMINING Weka

PDI


PENTAHO CORP

PRESENTATION


Pentaho Mission Big Data Analytics without Boundaries

Modern, unified data integration and business

analytics platform

• Native integration into big data ecosystem

• Embeddable, cloud-ready analytics

Critical mass achieved

• Over 1,000 commercial customers

• Over 10,000 production deployments

Fast and Broad Innovation

• Open source development model

• Extensible by customer, partner & community


Analytic suite and more …

Enterprise Analytic Platform Ease of Use / Administration / Audit

Data Integration

Traditional

Big Data Layer (framework)

Blended Data (as a service)

OEM

One value

CTOOLS framework

Multi-Tenant

Embedded

VISUALIZATION

Report

Analyze

Dashboard

Data Wizard

Services

Training

Workshop

Checkpoint

Consulting

Subscription

Productivity

Guarantee

ASSISTANCE

Concierge

Network

JIRA

Help-Desk

Infocenter

Community / Forums


Pentaho Business Analytics Platform Our approach addresses the challenges

Operational

Data

Big

Data

Data

Stream

Public/

Private Clouds

Multi-Tenant Ready Open APIs 100% Java

Access Integrate Cleanse Enrich

Score

Forecast

Connect Visualize

Report Dashboard

Analyze/Explore

DBA ETL/BI Developer

Business Users Executives

Analysts Data Scientists

Embed

Use Case Segments

Big Data

Business Analytics


Data Ingestion

Manipulation

Integration

Enterprise &

Ad Hoc Reporting

Data Discovery

Visualization

Predictive Analytics

Complete Big Data Analytics &

Visual Data Management

Relational Hadoop NoSQL Analytic

Databases

Pentaho Big Data Analytics


Pentaho Product Components

Pentaho Data Integration

• Visual development for big data

• Broad connectivity

• Data quality & enrichment

• Integrated scheduling

• Security integration

ANALYZER

• Visual data exploration

• Ad hoc analysis

• Interactive charts & visualizations

DASHBOARD DESIGNER

• Self-service dashboard builder

• Content linking & drill through

• Highly customized mash-ups

INTERACTIVE REPORT

• Both ad hoc & distributed reporting

• Drag & drop interactive reporting

• Pixel-perfect enterprise reports

Pentaho Data Mining / Predictive Analytics

• Model construction & evaluation

• Learning schemes

• Integration with 3rd-party models using PMML

Pentaho for Big Data MapReduce & Instaview

• Visual Interface for Developing MR

• Self-service big data discovery

• Big data access to Data Analysts


Pentaho Business Analytics

Modern Platform Built for the Future of

Analytics

100% Java, cross-platform Modular, lightweight, pluggable High-performance, scalable

Service-oriented architecture for easy integration Standards-based, highly extensible, easy to embed

Reporting, dashboards, analysis, data mining, predictive Power tools for business users, analysts, & data scientists

Structured, unstructured & NoSQL data Native support for emerging Big Data sources

End-to-end platform – unified data integration and business analytics Agile approach for fast prototyping & iterations Low cost subscription

Modern Architecture

Embedded Analytics

Broadest Spectrum of Insight

Diverse Data

Integrated, Low Cost Platform


Pentaho Core Competencies

Information Consumers

Powerful Reporting and Visualization Business Users

Power Users, Developers &

DBAs

Data Integration and Big Data

Advanced Analytical

Professionals Data Mining (Predictive Analytics)

Knowledge Workers/

Business Users

Self-Service Analysis & Queries


Pentaho Core Competencies

Any Report Executive dashboards

Management / Operational Reports

Financial Reports

Any Format HTML – for the web

PDF – for printing

Excel / CSV – for finance or sharing

Anytime / Anywhere On-demand, scheduled

Event driven – manage by exception

Access via web portal, email or mobile

Information Consumers

Powerful Reporting and Visualization Business Users


Pentaho Core Competencies

Easy and Intuitive Content Creation Drag and drop

Metadata intelligence

Context-sensitive, right-click interface

Web-based Interactivity Drill down, drill thru, slice & dice, pivoting, lasso filtering

User-defined calculations

Rich Visualizations Scatter plots

Geo-mapping

Heat grids

Knowledge Workers/

Business Users

Self-Service Analysis & Queries


Pentaho Core Competencies

Deep Integration with multiple sources and analytics output

Inputs: RDBMS, files, web services, NoSQL, Analytical DBs, Hadoop

Output: ETL is tightly coupled with analytics

Scalability Scale up – multi-threaded

Scale out - clustering

Workflow Scheduling

Monitoring

Alerting

Power Users, Developers &

DBAs

Data Integration and Big Data


Pentaho Core Competencies

Full data mining lifecycle support Preparation of input data

Statistical evaluation of learning schemes

Visualization of input data and result of learning

Visualization, classification and clustering capabilities

Explorer - data exploration/visualization, model construction and export, preliminary evaluation

118 classification/regression algorithms

11 clustering algorithms

Integrated with PDI ETL Execute Weka and R predictive models inside of a PDI transformation

Append probabilities dynamically to each row in the data flow

Retrain Weka models using the KnowledgeFlow plugin for PDI
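
The idea of appending probabilities to each row in the data flow can be sketched as below. This is an illustration only: the logistic `score` function with hand-picked weights stands in for a trained Weka or R model, and the field names (`calls_per_day`, `months_active`, `probability`) are hypothetical.

```python
import math

# Hypothetical coefficients standing in for a trained model.
WEIGHTS = {"calls_per_day": 0.8, "months_active": -0.1}
BIAS = -1.0


def score(row):
    """Return a probability in (0, 1) from a simple logistic model."""
    z = BIAS + sum(WEIGHTS[name] * row[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def append_probability(rows, field="probability"):
    """Yield each incoming row with the model's probability added as a new field."""
    for row in rows:
        scored = dict(row)  # copy so the upstream row is left untouched
        scored[field] = score(row)
        yield scored
```

Because `append_probability` is a generator, rows are scored one at a time as they stream through, which mirrors how a scoring step would sit in the middle of a transformation.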

Advanced Analytical

Professionals Data Mining (Predictive Analytics)


Click Stream Analytics From buying patterns to revenue

360° View of Customer

• Monetize buying patterns hidden in billions of

data points

• Quickly analyze multi-channel click stream data

Pentaho Benefits

• Reduced ETL time to analyze blended data from Hadoop, HBase & data warehouse

• Use of big data analytics to grow revenue from

targeted campaigns


Device Data Analytics Big Data for NetApp

Business Challenge

• Affordably scale machine data from storage

devices for customer support app

• Predict device failure

• Enhance product performance

Pentaho Benefits

• Easy to use ETL & analysis for Hadoop, HBase & Oracle data sources

• 15x cost improvement

• Stronger performance against customer SLAs


Data Warehouse Optimization Cost effective, fast processing

Business Challenge

• Gain competitive advantage through intraday

balance reporting for commercial customers

• Use Hadoop and relational data stores to process huge volumes

15x faster to develop

10x faster to execute

No coding

Integrate with existing

Easy to find resources

Pentaho Benefits

• Graphical orchestration for Hadoop, HBase & DB2 data integration workloads

• 15x faster to develop, 10x faster to execute


Telecom Use Cases Big Data Analytics and the Customer Experience

Expand Net Promoter Score beyond traditional

survey methods

• Add social media data

• Focus on highly valued customers

• Refine predictive models

Better track and manage overall IT

infrastructure for telecom services

• Capacity planning and forecasting

Use emerging sensor/device data to enable

services for a customer’s connected lifestyle

• Connected car, digital life, and mobile wallet

On-Line Ad Performance

• YP.com


USER CONSOLE

PRESENTATION


Platform

CENTRAL ADMINISTRATION, AUDITING & MONITORING

DELIVER When & Where Users Need It

STREAMLINE Information Delivery

VISUALIZE & Report Information In Any Style

ACCESS All Data Sources

ISV & Packaged Applications

SaaS / Cloud Applications

INTEGRATION

Web

Mobile

Print

E-Mail

STANDALONE

‣ Advanced &

Predictive Analytics

DATA MINING

‣ Proactive

‣ Operational

‣ Enterprise

REPORTING

‣ Ad hoc Exploration

‣ Multi-Dimensional

ANALYSIS

‣ Interactive Metrics

‣ Rich Visualizations

DASHBOARDS

ERP / CRM /

Enterprise Apps (e.g. SAP, Oracle)

Hadoop, NoSQL Data & Analytical

Unstructured &

semi-structured (XML, Excel, Files, etc.)

Traditional Relational Data

Cloud (e.g. Salesforce,

Amazon, Dell)

‣ Direct Access

‣ Data Integration

‣ Hadoop Clustering

‣ Graphical ETL Designer

‣ Enterprise Scalability

INTEGRATE, CLEANSE, & ENRICH DATA

‣ In Memory Caching

‣ High Performance

‣ Relational OLAP Cubes

METADATA LAYER


Generic architecture

Unstructured data

EDW

Structured data

Technology BIG DATA

and/or

Staging Area

Pentaho Data Integration

Collect

Pentaho Data Integration

Cleansing

Transformation

Change Data Capture

Data Warehouse Management

PDI PDI Metadata

Dashboard

Report

Analyzer


Cache

Complete management

Ad-hoc Data

Data Mart(s) / Warehouses

SMS and email alerts with attachments

PDI Metadata

Dashboard

Report

Analyzer

Technology BIG DATA

and/or

Staging Area

PDI

Collect, Transform, Load and Alert

Structured data

Unstructured data

Pentaho Data Integration

Pentaho Data Integration


J2EE Container (Tomcat/JBoss)

Data Integration

Architecture: Components

Server

Workstation

Thin client

Business Analytics

• Analytics • Reporting • Dashboards

Data Discovery and Advanced Analytics

Reporting, Ad Hoc Query, Dashboards, Mobile

Publish

HTTP/HTTPS

• ETL • Data profiling, cleansing, quality • Job Design/Orchestration

• Scheduling • Administration • Content Storage

Report Designer PDI Designer Metadata Design


Architecture: Technology

J2EE Container (Tomcat/JBoss)

Data Integration

Server

Workstation

Thin client

Business Analytics

• Analytics • Reporting • Dashboards

Data Discovery and Advanced Analytics

Reporting, Ad Hoc Query, Dashboards, Mobile

• ETL • Data profiling, cleansing, quality • Job Design/Orchestration

• Scheduling • Administration • Content Storage

Report Designer PDI Designer Metadata Design

• JavaScript • Dojo • GWT

• Swing • SWT

• Java • J2EE Web

Application

Technologies


Extensible Architecture

BUSINESS ANALYTICS PLUGIN TYPES:

❯ Platform plugins - security integration, new components

❯ User Console – new editors, analytic displays

❯ Analyzer visualizations – integrate 3rd party visualizations

❯ Dashboard Framework – filter control types, visualizations

❯ Dashboard Designer – additional widget types

DATA INTEGRATION

❯ Transformation Steps – connectors, transformation elements

❯ Job Entries – process/orchestration elements

❯ Perspectives – integrated design or analytic environments


Architectural Components


User Console

DEMO


2013 Performance Goals

1. Increase subscription revenue by analyzing call data to upsell PAYG customers to subscriptions

2. Improve store profitability by holding store managers accountable by bursting store income statements

3. Reduce stock-outs with a real-time inventory report delivered on an iPad.

4. Maximize profits by profiling users with high average call duration

5. Maximize revenue by analyzing e-commerce clickstream data in MongoDB to profile purchasing users

6. Improve supply chain by giving phone manufacturers and resellers web-based reports


Clear Wireless – Wireless Carrier

3 Calling Plans

• Nationwide • PAYG • Prepaid 50

2 Business Units

• B2B • B2C

7 Retail Stores

• San Francisco • Boston • NYC • Paris • Tokyo • Sydney • London

7 Product Lines

• Smartphones • Home Phones • Wifi Devices • Modems • Notebooks • Tablets • Accessories

3 Websites

• Ecommerce Site • Reseller Portal • Manufacturer Portal

10 Resellers

• Red River Mobile

9 Phone Manufacturers

• Apple

EXTERNAL INTERNAL

IFrame Integration

Custom Widget Embedding


Pentaho for iPad & Android Instant Visualization & Analysis for Mobile Users

INSTANT AND INTERACTIVE VISUALIZATION

❯ Attractive dashboards, analysis, operational &

enterprise reports

❯ Touch filtering, drill-thru to details

POWER TO CREATE NEW ANALYSIS ON

THE GO

❯ Unique to Pentaho

❯ Highly interactive vs. a read-only access to static

content

EASY TO DEPLOY, EASY TO EMBED

❯ IT-free, create once, access anywhere

❯ Web-based, easily embeddable into mobile apps


DEV TOOLS


Translation

YOU CAN LOCALIZE THE PENTAHO USER CONSOLE AND DEVELOPMENT TOOLS INTO ANY LANGUAGE USING ISO FORMAT

CONCEPT:

❯ Application:

❯ Specific message bundles within the Pentaho Web application

❯ Message bundles are dynamically adjusted according to browser locale

❯ Metadata:

❯ Reporting & Analysis metadata development tools contain specific localization features

❯ Data:

❯ As stored in your database. A multi-tenant ID can be used to switch between different content.
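
The locale-driven selection of message bundles can be sketched as follows. This is a minimal illustration that assumes Java-style bundle naming with fallback from the most specific locale to the default (`messages_fr_CA`, then `messages_fr`, then `messages`); the `BUNDLES` contents and both helper functions are hypothetical and are not taken from the Pentaho code base.

```python
# Hypothetical in-memory stand-in for .properties message bundles.
BUNDLES = {
    "messages": {"menu.file": "File", "menu.open": "Open"},
    "messages_fr": {"menu.file": "Fichier", "menu.open": "Ouvrir"},
    "messages_fr_CA": {"menu.open": "Ouvrir un fichier"},
}


def candidate_bundles(base, locale):
    """Return bundle names from most to least specific for a locale such as 'fr-CA'."""
    parts = locale.replace("-", "_").split("_") if locale else []
    names = []
    for end in range(len(parts), 0, -1):
        names.append(base + "_" + "_".join(parts[:end]))
    names.append(base)  # the default bundle is always the last resort
    return names


def resolve_message(key, locale, base="messages"):
    """Look a key up in the most specific bundle that defines it."""
    for name in candidate_bundles(base, locale):
        bundle = BUNDLES.get(name, {})
        if key in bundle:
            return bundle[key]
    return key  # fall back to the key itself when no bundle defines it
```

With this fallback order, a browser reporting `fr-CA` gets the Canadian French override where one exists and the generic French text otherwise, while an unsupported locale falls through to the default bundle.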


Metadata Translation

METADATA

EDITOR

❯ All data can

be translated

SCHEMA

WORKBENCH

❯ All levels can

be translated


Metadata

Data Mart(s) / Warehouse

Metadata

Dashboard

Report

Analyzer

PDI Datasource

Operations Mart


Schema Workbench

Pentaho Server

Mondrian Schema

Metadata Schema

MDX

SQL

Metadata Editor

Analyzer

Interactive Reporting

Report Designer

Architecture Metadata

2 TYPES OF METADATA

❯ Reporting metadata

❯ OLAP metadata


END USERS CAN CREATE THEIR OWN METADATA

❯ From a file

❯ With an SQL statement

❯ From a database

Architecture Metadata

Data Source Wizard

Pentaho Server

Mondrian Schema

Metadata Schema

MDX

SQL

Analyzer

Interactive Reporting

Report Designer


Schema Workbench & Aggregation Designer

CALCULATION, VIRTUAL CUBES AND AGGREGATION TOOL


Metadata Editor

CREATE METADATA FOR YOUR END-USERS

Modeling

Properties


PENTAHO DATA INTEGRATION


PDI Components

Extract

Transform

Load


PDI Components

SPOON ❯ Graphical environment for modeling

❯ Transformations are metadata models describing the flow of data

❯ Jobs are workflow-like models for coordinating resources, execution and dependencies of ETL activities

PAN ❯ Command line tool for executing

transformations modeled in Spoon

KITCHEN ❯ Command line tool for executing

jobs modeled in Spoon

… AND OF COURSE KETTLE ❯ The Engine itself

KDE ETTL Environment

Spoon Interface – Designing a Transformation

Job Example
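
As an illustration of how Pan and Kitchen are typically driven from a script, the sketch below builds their command lines and runs them. The install path in `PDI_HOME`, the helper names, and any file paths passed in are assumptions made for this example; `-file`, `-level` and `-param:NAME=value` are standard Pan and Kitchen options.

```python
import subprocess

# Assumed install location; adjust to wherever PDI is unpacked.
PDI_HOME = "/opt/pentaho/data-integration"


def pan_command(ktr_path, level="Basic", params=None):
    """Build the Pan command line for running one transformation (.ktr file)."""
    cmd = [f"{PDI_HOME}/pan.sh", f"-file={ktr_path}", f"-level={level}"]
    cmd += [f"-param:{name}={value}" for name, value in (params or {}).items()]
    return cmd


def kitchen_command(kjb_path, level="Basic", params=None):
    """Build the Kitchen command line for running one job (.kjb file)."""
    cmd = [f"{PDI_HOME}/kitchen.sh", f"-file={kjb_path}", f"-level={level}"]
    cmd += [f"-param:{name}={value}" for name, value in (params or {}).items()]
    return cmd


def run(cmd):
    """Execute the command and return its exit code; Pan and Kitchen return 0 on success."""
    return subprocess.run(cmd).returncode
```

Keeping command construction separate from execution makes it easy to log or review the exact command line before a scheduler runs it.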


PDI Components

ENTERPRISE EDITION DATA INTEGRATION SERVER

❯ Execution and remote monitoring

❯ Integrated scheduling

❯ Enterprise Security options

❯ Enhanced content management including revision history and locking

❯ Remote distributed cluster based processing


Any Format of Data


Overall Management

Not just processing … A key element once in a production environment


Access Controls

PREVENT UNAUTHORIZED USERS FROM VIEWING DATA

TRANSFORMATION RULES AND POSSIBLY CONNECTION

CREDENTIALS (E.G. DATABASE LOGINS / PASSWORDS)

❯ Integrate with existing security (e.g. LDAP / Active Directory)

PROVIDE ACCESS TO TRANSFORMATIONS AND JOBS ON A

“NEED TO KNOW” BASIS

PROTECT DATABASE LOGINS AND PASSWORDS


Version Control

AUTOMATICALLY SAVE MULTIPLE REVISIONS OF A

TRANSFORMATION OR JOB

ELIMINATE THE RISK OF “FAT FINGERS” … ACCIDENTAL

DELETION OR CHANGES

EXPERIMENT WITH DIFFERENT ETL DESIGNS WHILE PRESERVING

THE ORIGINAL

RESTORE TRANSFORMATIONS AND JOBS FROM AN EARLIER

VERSION


Production Control

IDENTIFY THE CORRECT VERSIONS OF TRANSFORMATIONS AND

JOBS TO RUN

ALLOW “EXECUTE ONLY” BY IT OPERATIONS PERSONNEL

LOCK TRANSFORMATIONS WITH COMMENTS

SCHEDULE JOBS TO RUN ON A CENTRAL SERVER AT

PREDETERMINED TIMES


Pentaho Data Integration

STEP BASED PROCESSING ENGINE WITH INSTANT VISUALISATION

OF RESULTS
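
Step-based processing can be pictured as a chain of small steps through which rows stream one at a time. The sketch below is a conceptual illustration using Python generators, not Kettle's actual engine or API; the step names and the `preview` helper are hypothetical.

```python
from itertools import islice


def input_step(records):
    """Emit rows one at a time, like an input step reading a source."""
    for record in records:
        yield dict(record)


def filter_step(rows, predicate):
    """Pass through only the rows that satisfy the predicate."""
    for row in rows:
        if predicate(row):
            yield row


def calculator_step(rows, field, func):
    """Add a computed field to every row."""
    for row in rows:
        row = dict(row)  # copy so earlier steps' rows are not mutated
        row[field] = func(row)
        yield row


def preview(rows, limit=5):
    """Collect the first few rows a step produces, similar in spirit to previewing a step."""
    return list(islice(rows, limit))
```

Because each step pulls rows lazily from the previous one, a preview of the first few rows can be produced without processing the whole input.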


Pentaho Data Integration

• AGILE BI METHODOLOGY

• Load

• Modeling

• Visualize


Traditional DB

DATA INTEGRATION ANALYSIS

etc etc etc


Broadest Support for Big Data Platforms

Hadoop NoSQL Analytic Databases


BIG DATA


Big Data?

http://www.youtube.com/watch?v=QV3t-3QIf1E


CENTRAL ADMINISTRATION, AUDITING & MONITORING

DELIVER When & Where Users Need It

STREAMLINE Information Delivery

VISUALIZE & Report Information In Any Style

ACCESS All Data Sources

ISV & Packaged Applications

SaaS / Cloud Applications

INTEGRATION

Web

Mobile

Print

E-Mail

STANDALONE

‣ Advanced &

Predictive Analytics

DATA MINING

‣ Proactive

‣ Operational

‣ Enterprise

REPORTING

‣ Ad hoc Exploration

‣ Multi-Dimensional

ANALYSIS

‣ Interactive Metrics

‣ Rich Visualizations

DASHBOARDS

ERP / CRM /

Enterprise Apps (e.g. SAP, Oracle)

Hadoop, NoSQL Data & Analytical

Unstructured &

semi-structured (XML, Excel, Files, etc.)

Traditional Relational Data

Cloud (e.g. Salesforce,

Amazon, Dell)

‣ Direct Access

‣ Data Integration

‣ Hadoop Clustering

‣ Graphical ETL Designer

‣ Enterprise Scalability

INTEGRATE, CLEANSE, & ENRICH DATA

‣ In Memory Caching

‣ High Performance

‣ Relational OLAP Cubes

METADATA LAYER

Big Data with Pentaho

BIG DATA Discovery

‣ Instaview

Page 56

Classic usage

Web Application

User Behavior JavaScript, Java, PHP,

Embedded Specialist Tool

CRM Style Data

Page 57

ORCHESTRATE

ERP DW

Processing

CRM

Pig, Oozie, Flume, Hive, HBase, Sqoop

Raw Data

Parsed Data

Analytic Datasets

Transform & visualize

Master Data

Analysis & Reporting

ANALYZE

Unstructured Data

Structured Data

INGEST

Ingestion

VISUAL MAP REDUCE

Data Integration Analytics

Page 58

Pentaho Big Data Strategy

• VISUAL MAP REDUCE
  • Graphical development
  • Technical architecture close to Hadoop

• BIG DATA LAYER
  • Framework covering all Big Data distributions
  • Technical partnerships (Cloudera, Hortonworks, MongoDB, …)

• BLENDED DATA
  • JDBC driver to use our ETL as a data source
  • Data as a service

Page 59

Pentaho Visual Development

Eliminates need for complex coding

Would you rather do this?

Integrate, Manipulate, Ingest

… or this?

Schedule

Model

Page 60

Pentaho Visual MapReduce

Drag & Drop then run in the cluster

Parallel execution as MapReduce in the Hadoop cluster

Up to 15x faster than hand-written code

Page 61

VISUAL MAP REDUCE

ETL DEVELOPERS ONLY

The main part of your transformation doesn’t change… only the first and last steps are new
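The point that the core of a transformation stays the same while only its first and last steps change can be sketched in plain Python. This is an illustration of the MapReduce pattern under stated assumptions, not Pentaho code: every function name is hypothetical, and `run_map_reduce` is a single-process stand-in for the cluster.

```python
from collections import defaultdict

def core_transformation(line):
    """The unchanged business logic: split a line into lowercase words."""
    return [word.lower() for word in line.split()]

def mapper(key, value):
    """New first step: receive a (key, value) pair and emit (word, 1) pairs."""
    for word in core_transformation(value):
        yield word, 1

def reducer(key, values):
    """New last step: receive a key with all its values and emit one total."""
    yield key, sum(values)

def run_map_reduce(records):
    """Single-process stand-in for the cluster: map, shuffle by key, reduce."""
    shuffled = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            shuffled[out_key].append(out_value)
    result = {}
    for key in sorted(shuffled):
        for out_key, out_value in reducer(key, shuffled[key]):
            result[out_key] = out_value
    return result

# Hypothetical input: (offset, line) pairs as a text input format would supply.
word_counts = run_map_reduce([(0, "Big Data big cluster"), (1, "data layer")])
```

Only `mapper` and `reducer` know about key/value pairs; `core_transformation` could be reused unchanged outside the cluster.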

Page 62

Big Data Orchestration

• Scheduling, event management

• Use your existing scripts (e.g. Pig scripts)

• All databases and file systems: Hadoop, NoSQL, RDBMS, …

Page 63

Adaptive Big Data Layer

Leadership in Big Data Integration and Analytics

• Insulates from changing versions, vendors, data stores

• Gives customers broad flexibility of choice, rapid time to value, reduced risk

• Provides native integration into the big data ecosystem

• Broadest, deepest Big Data support

Transparent Access to & Integration of Big Data

Page 64

BIG DATA LAYER

ALL MAIN HADOOP DISTRIBUTIONS

NOSQL CONNECTORS

ACCESS TO AMAZON REDSHIFT & SPLUNK.

Page 65

How does Pentaho help you?

• WE REDUCE COMPLEXITY, WHICH SIMPLIFIES MIGRATION TO NEW VERSIONS OF HADOOP

• BECAUSE WE DON’T GENERATE CODE, WE REDUCE THE RISK OF OBSOLESCENCE AS HADOOP EVOLVES

Page 66

Pentaho Big Data Analytics

Complete Big Data Analytics & Visual Data Management

• Data Ingestion, Manipulation, Integration
• Enterprise & Ad Hoc Reporting
• Data Discovery
• Visualization
• Predictive Analytics

Relational, Hadoop, NoSQL, Analytic Databases

Page 67

Orchestration Engine

Applications

Databases

Files

Job (.KJB)

Plug-In Job

Entries

Monitoring Logs

Flume

SQL

Files FTP

Sqoop

Folder

Oozie

Email

Sub-Job

Analytic DB

NoSQL

Hadoop Cluster

PDI Engine

Page 68

Transformation Engine

Data Node

Task Tracker

Transformation Engine

Data Node

Task Tracker

Transformation Engine

Data Node

Task Tracker

Transformation Engine

JobTracker

Orchestration

Distributed Cache

Transformation (.KTR)

Transformation Engine

Page 69

BLENDED DATA

Data sources SQL

Datawarehouse

Location

Network

Web

Social Media

WebServices NoSql

Hadoop

Page 70

How do we integrate data today?

• Takes time

• Requires IT

• Target database is updated as transformations are run

Page 71

Location

Network

Web

Social Media

We put it in: Hadoop NoSQL Analytic DB’s

Where do we store big data?

Page 72

Location

Network

Web

Social Media

How can we bring it together?

ETL

Page 73

Location

Network

Web

Social Media

PDI

What if a user could bring together both types of data on demand?

Page 74

Accurate, Blended Big Data Analytics

❯ Just in time blending of data from multiple sources for a complete picture

❯ Connect, combine and transform data from multiple sources

❯ Query data directly from any transformation

❯ Access architected blends with the full spectrum of Pentaho Analytics

❯ Manage governance and security of data for on-going accuracy

Just in time blending

EDW (Existing ETL Tool or PDI): Customer, Billing, Provisioning

NoSQL (PDI): Network, Location

PDI

Analytics

Page 75

Evolving Big Data Architectures

Just-in-Time Integration

PDI

PDI

Analytic DB

Location

Web

Social Media

Network

Existing Process or PDI

Hadoop Cluster

NoSQL

Existing ETL Tool or PDI

EDW

Data Marts

Analytics

Existing ETL Tool or PDI

Customer

Provisioning

Billing

Other BI Tools

Page 76

Improve operational effectiveness

• Machines/sensors: predict failures, network attacks

• Financial risk management: reduce fraud, increase security

Reduce data warehouse cost

• Integrate new data sources without increased database cost

• Provide online access to ‘dark data’

Drive incremental revenue

• Predict customer behavior across all channels

• Understand and monetize customer behavior

• Begin to monetize data as a service

Customer Value from Big Data

MONETIZING BIG DATA-DRIVEN USE CASES DRIVING NEED TO BLEND DATA

Page 77

Why Blending at the Source Matters

Customer Experience Analytics for Loyalty and Revenue

Customer Financial Data (EDW, loaded by Existing ETL Tool or PDI from Customer, Billing, Provisioning):
• Billing
• Payment
• Usage

Customer Experience Data (NoSQL, loaded by PDI from Network, Location):
• Outages
• Drops
• Service Quality

PDI: Blend revenue-related and quality-of-service data together to find customers at risk

Analytics

Analyze quality of service:
• Network outages
• Dropped calls
• Poor quality
• Calls to support center

For profiles of customers:
• Up for renewal
• Profitable
• Multiple agreements/services
• In competitive area

Determine best action to take:
• Billing Credit
• Customer Coupon
• No Action
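The blend described on this slide amounts to a join between financial and quality-of-service records followed by a filter. A minimal Python sketch, assuming both sources can be keyed on a shared customer identifier; the field names (`customer_id`, `monthly_bill`, `dropped_calls`), the thresholds and the sample rows are hypothetical and only stand in for the EDW and NoSQL sources.

```python
def blend(financial_rows, experience_rows):
    """Join the two sources on customer_id, in memory, at query time."""
    experience_by_customer = {row["customer_id"]: row for row in experience_rows}
    blended = []
    for fin in financial_rows:
        exp = experience_by_customer.get(fin["customer_id"])
        if exp is not None:
            merged = dict(fin)
            merged.update(exp)
            blended.append(merged)
    return blended

def customers_at_risk(blended_rows, min_monthly_bill, max_dropped_calls):
    """Flag profitable customers whose service quality was poor."""
    return [row["customer_id"] for row in blended_rows
            if row["monthly_bill"] >= min_monthly_bill
            and row["dropped_calls"] > max_dropped_calls]

# Hypothetical sample data standing in for the EDW and NoSQL sources.
financial = [
    {"customer_id": 1, "monthly_bill": 120.0},
    {"customer_id": 2, "monthly_bill": 20.0},
    {"customer_id": 3, "monthly_bill": 95.0},
]
experience = [
    {"customer_id": 1, "dropped_calls": 7},
    {"customer_id": 2, "dropped_calls": 9},
    {"customer_id": 3, "dropped_calls": 1},
]
at_risk = customers_at_risk(blend(financial, experience), 90.0, 5)
```

Neither source alone identifies customer 1: the billing data shows the customer is profitable, and only the experience data shows the dropped calls.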

Page 78

Data Warehouse Optimization

Data Sources Big Data Architecture

Data Warehouse (Master & Transactional Data)

ERP

CRM

CDR

Analytic Data Mart(s)

Analytic Data Mart(s)

Analytic Data Mart(s)

Logs Logs

Other Data

Raw Data

Parsed Data

Analytic Datasets

Master Data

Tape

Archive

Page 79

Our Big Data solutions

Big Data Mgmt.

NoSQL Databases

• Data Integration • Job Orchestration • Workflow

Pentaho Business Analytics

• Scheduling • High Performance • Visual IDE

Data Integration

Analytic Databases Hadoop Java MapReduce, Pig Pentaho MapReduce

Big Analytics

3RD Party Tools

• R • 3rd Party BI Tools • Applications

Business Analytics Embedded Analytics

Page 80

MongoDB & Pentaho

Page 81

OEM & ISV

Page 82

OEM pattern

Pentaho BI Server

Your Application

Pentaho

Your functions

Your application

Pentaho components

Page 83

OEM

Bundled

Value: Fastest Way to Get Analytics that Have Your Look & Feel

What it Takes?
• Pentaho is a separate app, branded with Partner’s logo, look & feel
• Optional: Partner app may include links to Pentaho reports, analysis and dashboards (popping new window)
• Optional: Single sign-on creates a seamless experience

Skill Level
• Limited HTML skills

Mashup

Value: An Integrated Experience for Your End User

What it Takes?
• Pentaho & Partner app have the same UI
• Pentaho User Console, or individual reports, analysis or dashboards are included in partner app
• Single sign-on creates a seamless experience

Skill Level
• HTML skills

Extended

Value: Customizing Pentaho for Your Experience

What it Takes?
• Pentaho’s core functionality is extended through plug-ins. Examples:
  - Connecting to custom data sources
  - Adding new visualizations
  - Customizing security
  - Replacing Pentaho rules engine
• Integrate with Partner’s App Server

Skill Level
• HTML skills
• Java skills

Embedded

Value: Ultimate Integration and Customization

What it Takes?
• Directly embedding Pentaho into your app
• Calling Pentaho Java APIs from your App

Skill Level
• HTML skills
• Java skills
• Knowledge of Pentaho architecture

Page 84

Use Case – Share Nothing

“VIRTUAL SERVER PER ORGANIZATION”

COMPLETE ISOLATION OF ALL CONTENT INCLUDING:
• ARTIFACTS (REPORTS, DASHBOARDS, ETC.)
• DATA SOURCES
• SCHEDULER
• PLUGINS?

Pentaho BI Server

Organizations: Company A and Company B, each with its own Artifacts, Data Sources, Schedules and Configuration

Page 85

Use Case – Shared Data, tagged with Tenant ID

• ONE OR MORE COMMON DATABASES

• DATA ‘STRIPED’ WITH TENANT ID

Pentaho BI Server

Organizations: Company A and Company B, each with its own Artifacts, Data Sources, Schedules and Configuration

Shared Database
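Striping data with a tenant ID means every row in the shared database carries the identifier of the organization that owns it, and every query filters on it. A minimal sketch using Python's built-in `sqlite3` module; this is a generic illustration, not how the Pentaho BI Server implements it, and the table, column and tenant names are hypothetical.

```python
import sqlite3

def open_shared_database():
    """Create an in-memory shared database whose rows carry a tenant_id column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (tenant_id TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("company_a", "tablet", 300.0),
         ("company_a", "modem", 50.0),
         ("company_b", "tablet", 320.0)],
    )
    return conn

def sales_for_tenant(conn, tenant_id):
    """Return only the rows striped with the given tenant ID."""
    cursor = conn.execute(
        "SELECT product, amount FROM sales WHERE tenant_id = ? ORDER BY product",
        (tenant_id,),
    )
    return cursor.fetchall()

conn = open_shared_database()
rows_a = sales_for_tenant(conn, "company_a")
rows_b = sales_for_tenant(conn, "company_b")
```

The tenant ID is passed as a bound parameter rather than concatenated into the SQL text, so a tenant cannot widen the filter through its own input.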

Page 86

Use Case – Shared Data Source (data isolation)

• COMMON DATA SOURCE DEFINITION

• ORGANIZATION DATA IN ISOLATED DATABASES WITH THE SAME DATA MODEL

• CONNECTIONS ‘ROUTED’ BY TENANT ID

Pentaho BI Server

Organizations: Company A and Company B, each with its own Artifacts, Data Sources, Schedules and Configuration

Data Sources (shared)
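Routing connections by tenant ID can be pictured as a lookup from the tenant to its own database, with one query definition shared by all tenants. A minimal sketch with Python's built-in `sqlite3` module, assuming one isolated in-memory database per tenant; it is not Pentaho's implementation, and all names and sample rows are hypothetical.

```python
import sqlite3

# Same data model for every tenant database.
SCHEMA = "CREATE TABLE sales (product TEXT, amount REAL)"

def build_tenant_databases(seed_rows):
    """Create one isolated in-memory database per tenant, all with the same schema."""
    databases = {}
    for tenant_id, rows in seed_rows.items():
        conn = sqlite3.connect(":memory:")
        conn.execute(SCHEMA)
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        databases[tenant_id] = conn
    return databases

def connection_for(databases, tenant_id):
    """Route a request to the tenant's own database; unknown tenants are rejected."""
    try:
        return databases[tenant_id]
    except KeyError:
        raise LookupError("no database registered for tenant %r" % tenant_id)

def total_sales(databases, tenant_id):
    """Run the common query definition against the database the tenant maps to."""
    conn = connection_for(databases, tenant_id)
    return conn.execute("SELECT COALESCE(SUM(amount), 0) FROM sales").fetchone()[0]

databases = build_tenant_databases({
    "company_a": [("tablet", 300.0), ("modem", 50.0)],
    "company_b": [("tablet", 320.0)],
})
```

Unlike tenant-ID striping, no query here needs a tenant filter: isolation comes from which connection is handed out.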

Page 87

Use Case – Shared Content

• COMMON PENTAHO ARTIFACTS – REPORTS, DASHBOARDS, ANALYSIS

• EACH TENANT CAN VIEW SHARED AND TENANT SPECIFIC CONTENT

Pentaho BI Server

Organizations: Company A and Company B, each with its own Artifacts, Data Sources, Schedules and Configuration

Artifacts (shared)

Page 88

DATAMINING

Page 89

Class Photo

…And that, in simple terms, is how data mining works.

Page 90

Pentaho Core Competencies

Data Mining (Predictive Analytics)

Advanced Analytical Professionals

Full data mining lifecycle support
• Preparation of input data
• Statistical evaluation of data mining models
• Visualization of inputs and results of model learning

Visualization, classification and clustering capabilities
• 118 classification/regression algorithms
• 11 clustering algorithms

Integrated with PDI ETL
• Execute Weka and R predictive models inside of a PDI transformation
• Append probabilities dynamically to each row in the data flow
• Retrain Weka models using the KnowledgeFlow plugin for PDI
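Appending a probability to each row in a data flow can be illustrated with a small streaming scorer. This Python sketch is not the Weka or PDI implementation: it assumes a logistic model whose weights were learned elsewhere, and the field names, weights and bias are hypothetical.

```python
import math

def purchase_probability(row, weights, bias):
    """Score one row with a logistic model; weights would come from a trained model."""
    z = bias + sum(weights[name] * row[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))

def append_probabilities(rows, weights, bias):
    """Yield each incoming row with a 'probability' field appended, one row at a time."""
    for row in rows:
        scored = dict(row)
        scored["probability"] = purchase_probability(row, weights, bias)
        yield scored

# Hypothetical weights; in practice they would be learned, for example by Weka.
weights = {"view_product": 0.5, "add_to_cart": 1.5}
scored_rows = list(append_probabilities(
    [{"id_user": "u1", "view_product": 2, "add_to_cart": 1},
     {"id_user": "u2", "view_product": 0, "add_to_cart": 0}],
    weights, bias=-2.5))
```

Because `append_probabilities` is a generator, rows are scored as they stream through rather than after the whole dataset has been loaded.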

Page 91

Clear Wireless – Wireless Carrier

3 Calling Plans
• Nationwide
• PAYG
• Prepaid 50

2 Business Units
• B2B
• B2C

7 Retail Stores
• San Francisco • Boston • NYC • Paris • Tokyo • Sydney • London

7 Product Lines
• Smartphones • Home Phones • Wifi Devices • Modems • Notebooks • Tablets • Accessories

3 Websites
• Ecommerce Site • Reseller Portal • Manufacturer Portal

Databases
• Call Detail Records
• Retail Sales
• Website Clickstream

Page 92

2013 Performance Goals

Increase subscription revenue

Improve store profitability

Eliminate inventory stock outs

Leverage big data to maximize profits

Profile and target profitable customers

Improve supply chain visibility for partners

Page 93

2013 Performance Goals

Goals, Objectives, Enabler

• Increase subscription revenue: Analyze call data to upsell PAYG customers to subscriptions

• Improve store profitability: Hold store managers accountable by pushing store income statements to email

• Eliminate inventory stock outs: Empower store employees with iPads and real-time inventory reports

• Profile and target profitable customers: Profile mobile plan customers with high average call duration

• Leverage big data to maximize profits:
  - Analyze e-commerce clickstream data in MongoDB to profile purchasing users
  - Use predictive technologies to improve marketing effectiveness

• Improve supply chain visibility for partners: Give phone manufacturers and resellers web access to secure sales reports

Page 94

Data Mining Lifecycle

Data Mining Life Cycle: a visual guide to CRISP-DM methodology

SOURCE CRISP-DM 1.0 http://www.crisp-dm.org/download.htm

DESIGN Nicole Leaper http://www.nicoleleaper.com

Each phase lists its Generic Tasks and their outputs; the Specialized Tasks (Process Instances) for every generic task are "Log and Report Process".

Business Understanding: identify project objectives
• Determine Business Objectives: Background; Business Objectives; Business Success Criteria
• Assess Situation: Inventory of Resources, Requirements, Assumptions, and Constraints; Risks and Contingencies; Terminology; Costs and Benefits
• Determine Data Mining Goals: Data Mining Goals; Data Mining Success Criteria
• Produce Project Plan: Project Plan; Initial Assessment of Tools and Techniques

Data Understanding: collect and review data
• Collect Initial Data: Initial Data Collection Report
• Describe Data: Data Description Report
• Explore Data: Data Exploration Report
• Verify Data Quality: Data Quality Report

Data Preparation: select and cleanse data
• Data Set: Data Set Description
• Select Data: Rationale for Inclusion/Exclusion
• Clean Data: Data Cleaning Report
• Construct Data: Derived Attributes; Generated Records
• Integrate Data: Merged Data
• Format Data: Reformatted Data

Modeling: manipulate data and draw conclusions
• Select Modeling Technique: Modeling Technique; Modeling Assumptions
• Generate Test Design: Test Design
• Build Model: Parameter Settings; Models; Model Description
• Assess Model: Model Assessment; Revised Parameter

Evaluation: evaluate model and conclusions
• Evaluate Results: Align Assessment of Data Mining Results with Business Success Criteria; Approved Models
• Review Process: Review of Process
• Determine Next Steps: List of Possible Actions; Decision

Deployment: apply conclusions to business
• Plan Deployment: Deployment Plan
• Plan Monitoring and Maintenance: Monitoring and Maintenance Plan
• Produce Final Report: Final Report; Final Presentation
• Review Project: Experience Documentation

Page 95

Business Understanding

WEBSITE CLICKSTREAM DATA

INCREASE REVENUE THROUGH TARGETED MARKETING

LIMITED DIRECT MARKETING BUDGET

PREDICT A WEB USER’S PROPENSITY TO PURCHASE BASED ON THEIR ONLINE CLICKSTREAM BEHAVIOR

SEND EXPENSIVE PROMOTIONAL OFFERS TO WEB USERS MOST LIKELY TO MAKE A PURCHASE

ASSUMPTIONS: $5/MAILING FOR $500 PURCHASE

❯ Expected Benefit of true positive prediction: $495 ($500 – mailing cost)

❯ Expected Benefit of false negative: $0 (no gain & no loss)

❯ Expected Cost of a false positive: $5 (cost of mailing)
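The cost assumptions above imply a break-even purchase probability, which a short calculation makes explicit. A minimal Python sketch using only the slide's figures ($5 per mailing, $500 per purchase); the function names are hypothetical, and it assumes a model that yields a purchase probability per user.

```python
# Figures from the slide: a $5 mailing that may trigger a $500 purchase.
MAILING_COST = 5.0
PURCHASE_VALUE = 500.0

def expected_profit_of_mailing(p_purchase):
    """Expected profit of mailing one user with purchase probability p_purchase."""
    gain_if_buy = PURCHASE_VALUE - MAILING_COST   # true positive: $495
    loss_if_not = -MAILING_COST                   # false positive: -$5
    return p_purchase * gain_if_buy + (1.0 - p_purchase) * loss_if_not

def break_even_probability():
    """Probability above which mailing has a positive expected profit."""
    return MAILING_COST / PURCHASE_VALUE

def campaign_profit(true_positives, false_positives):
    """Total profit of a campaign; users who are not mailed contribute $0."""
    return (true_positives * (PURCHASE_VALUE - MAILING_COST)
            - false_positives * MAILING_COST)
```

The expected profit simplifies to 500p - 5, so under these assumptions a mailing pays off whenever the predicted purchase probability p exceeds 1%. With a limited budget, the propensity score is then used to rank users and mail the most likely buyers first.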

Page 96

Data Understanding

Use ecommerce website clickstream log data stored in a MongoDB database

Key Source Fields: [id_user], [date], [event_name]

Events

Page 97

Data Understanding/Data Preparation

WEBSITE CLICKSTREAM DATA

USE WEBSITE CLICKSTREAM LOG DATA STORED IN MONGODB

NEED TO AGGREGATE, TRANSFORM, AND ENRICH THAT DATA

NEED TO PIVOT THE DATA INTO TABULAR FORMAT FOR PREDICTIVE MODELS

CREATE A FILE IN ARFF FORMAT FOR PENTAHO DATA MINING

USE PENTAHO DATA MINING TO DERIVE “PROPENSITY SCORE” FOR EACH USER
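The pivot step turns one record per event into one row per user with a count per event type, which can then be written out in ARFF, the text format read by Weka. A minimal Python sketch, assuming the events have already been read from MongoDB into (id_user, date, event_name) tuples; the function names, the relation name and the sample event names are hypothetical.

```python
from collections import Counter, defaultdict

def pivot_events(events):
    """Pivot (id_user, date, event_name) records into one row of event counts per user."""
    counts = defaultdict(Counter)
    for id_user, _date, event_name in events:
        counts[id_user][event_name] += 1
    event_names = sorted({name for c in counts.values() for name in c})
    rows = [[user] + [counts[user][name] for name in event_names]
            for user in sorted(counts)]
    return event_names, rows

def to_arff(relation, event_names, rows):
    """Render the pivoted rows as ARFF text."""
    lines = ["@relation " + relation, "", "@attribute id_user string"]
    lines += ["@attribute %s numeric" % name for name in event_names]
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"

# Hypothetical sample records; the event names are invented for illustration.
sample = [
    ("u1", "2013-10-01", "view_product"),
    ("u1", "2013-10-01", "add_to_cart"),
    ("u2", "2013-10-02", "view_product"),
    ("u1", "2013-10-03", "view_product"),
]
names, table = pivot_events(sample)
arff_text = to_arff("clickstream", names, table)
```

Users with no occurrence of an event get a count of 0, so every row has the same columns, which is what a tabular model input requires.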


Page 98

Data Understanding/Data Preparation

Data Preparation

select and cleanse data

Data Mining Life Cycle

Key Source Fields: [id_user], [date], [event_name]

PARSE

CLEAN AND FORMAT

GROUP AND AGGREGATE

ENRICH WITH OTHER DATA
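
A minimal Python sketch of the parse, clean, and aggregate steps, assuming (hypothetically) that each raw record is a semicolon-separated line holding the three key source fields id_user, date, and event_name, with dates written like 2013-10-30 09:05:00. The function names, the delimiter, and the date format are illustrative assumptions and are not taken from the PDI transformation shown in the talk; the enrichment step is left out.

```python
from collections import Counter
from datetime import datetime


def parse_line(line):
    """Parse one raw record into (id_user, datetime, event_name), or None if malformed."""
    parts = [p.strip() for p in line.split(";")]
    if len(parts) != 3 or not all(parts):
        return None
    id_user, date_text, event_name = parts
    try:
        when = datetime.strptime(date_text, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None
    # Clean and format: normalise the event name to lower_snake_case.
    return id_user, when, event_name.lower().replace(" ", "_")


def aggregate_by_hour(lines):
    """Group by (id_user, hour, event_name) and count occurrences."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip records that could not be parsed
        id_user, when, event_name = record
        hour = when.replace(minute=0, second=0, microsecond=0)
        counts[(id_user, hour, event_name)] += 1
    return counts
```

Malformed lines are dropped silently here; a real pipeline would more likely route them to an error stream for review.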

Data Understanding/Data Preparation

USE THE PDI ROW DENORMALIZER STEP TO PREPARE DATA FOR THE PREDICTIVE MODEL

• Pivots events from records into columns (tabular format)
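
The pivot can be illustrated outside PDI with a short Python sketch. This is not the Row Denormalizer step itself, only the same idea in plain code: records of (id_user, event_name) become one row per user with one column per event type. The function name `denormalize` and the sample event names are assumptions made for illustration.

```python
from collections import defaultdict


def denormalize(events, event_names):
    """Pivot (id_user, event_name) records into one row per user.

    Each output row has one column per name in event_names, holding the
    number of times that event occurred for the user.
    """
    counts = defaultdict(lambda: dict.fromkeys(event_names, 0))
    for id_user, event_name in events:
        row = counts[id_user]
        if event_name in row:  # ignore event types we did not ask for
            row[event_name] += 1
    return [{"id_user": user, **cols} for user, cols in sorted(counts.items())]


events = [
    ("u1", "visited_site"),
    ("u1", "watched_video"),
    ("u2", "visited_site"),
    ("u1", "visited_site"),
]
rows = denormalize(events, ["visited_site", "watched_video"])
```

Each entry in `rows` is now a flat record that a learning algorithm can treat as a single example.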

Data Output:

• Parsed

• Cleaned and formatted

• Grouped and aggregated

• Enriched

Data Understanding/Data Preparation

Why do we have to pivot the records into tabular format?

The learning algorithms require each row of data to be an independent example of what is to be learned. Pivoting lets us provide a single record that summarizes all of a user's possible events over the hour. The presence or absence of each event type is then used as a predictor for "add to cart".
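
As a small illustration of the presence/absence idea, the Python sketch below turns one pivoted row of hourly event counts into a training example: a set of 0/1 predictors plus the 0/1 label. The function name `to_example` and the column names are hypothetical; only the label concept ("add to cart") comes from the slides.

```python
def to_example(row, label_name="added_item_to_cart"):
    """Split one pivoted row of event counts into (features, label).

    Every count is reduced to presence (1) or absence (0); the column
    named by label_name becomes the value to predict.
    """
    features = {
        name: int(count > 0)
        for name, count in row.items()
        if name != label_name
    }
    label = int(row.get(label_name, 0) > 0)
    return features, label


features, label = to_example(
    {"visited_site": 3, "watched_video": 0, "added_item_to_cart": 1}
)
```

Keeping the label out of the feature set matters: a model that can see "added_item_to_cart" among its inputs would learn nothing useful.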

Data Preparation – Final Step

• Create a file in Weka’s ARFF format
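
For readers who have not seen ARFF, the sketch below writes a minimal file by hand in Python. It assumes every attribute, including the label, is a 0/1 nominal flag; the function name `to_arff`, the relation name, and the attribute names are illustrative assumptions. In the talk's workflow the file would be produced by the tooling rather than by code like this.

```python
def to_arff(relation, attribute_names, rows):
    """Render rows of 0/1 values as ARFF text.

    The header declares the relation and one nominal {0,1} attribute per
    column; the @data section holds one comma-separated line per row.
    """
    lines = [f"@relation {relation}", ""]
    for name in attribute_names:
        lines.append(f"@attribute {name} {{0,1}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


arff_text = to_arff(
    "user_events",
    ["visited_site", "watched_video", "added_item_to_cart"],
    [(1, 0, 0), (1, 1, 1)],
)
```

The last attribute is used as the class here only by convention; the attribute to predict is chosen when the file is loaded.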

Modeling

BUSINESS GOAL: INCREASE REVENUE THROUGH TARGETED MARKETING

❯ We would like to focus on the people most likely to purchase and persuade them to spend more (or to commit to spending if they are borderline) with seductive marketing, special offers, etc.

DATA MINING

❯ Build a model that will predict “Added_item_to_cart” with higher accuracy than random selection or any hand-crafted business rules

❯ Based on data characteristics – few attributes; small number of instances – try some “likely suspects” first

❯ Naïve Bayes (linear)

❯ Logistic regression (linear)

❯ Decision tree (non-linear)

❯ K nearest neighbors (non-linear)
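
To make the "beat the baseline" goal concrete, here is a self-contained Python sketch that compares one of the likely suspects, k nearest neighbors, against a majority-class baseline using leave-one-out evaluation. The data set is a made-up toy (three hypothetical event flags, label 1 meaning the user added an item to the cart), and the implementation is written from scratch for illustration; it is not the Weka classifier used in the talk.

```python
from collections import Counter

# Hypothetical toy data: ((visited_site, watched_video, signup_newsletter), label)
ROWS = [
    ((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 1), ((1, 1, 1), 1),
    ((0, 0, 1), 0), ((0, 1, 0), 1), ((1, 0, 0), 0), ((0, 1, 1), 1),
    ((0, 0, 0), 0), ((0, 0, 1), 0),
]


def hamming(a, b):
    """Number of positions where two flag tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, x, k=3):
    """Vote among the k training rows closest to x."""
    nearest = sorted(train, key=lambda row: hamming(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def majority_predict(train, x):
    """Baseline: always predict the most frequent training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def leave_one_out_accuracy(rows, predict):
    """Hold out each row in turn, train on the rest, and score the prediction."""
    hits = 0
    for i, (x, y) in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += predict(train, x) == y
    return hits / len(rows)


knn_accuracy = leave_one_out_accuracy(ROWS, knn_predict)
baseline_accuracy = leave_one_out_accuracy(ROWS, majority_predict)
```

A candidate model is only interesting if its accuracy is clearly above the baseline; on real data the gap is usually far smaller than on a toy set like this one.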

Modeling

• Take a quick look at the summary information in the Weka Explorer

Knowledge Flow

Evaluation

Logistic Regression: Results

MODEL IS A LINEAR FUNCTION THAT PREDICTS THE LIKELIHOOD (PROBABILITY) OF A PERSON “ADDING ITEM TO CART”

RELATIVE MAGNITUDES OF THE COEFFICIENTS GIVE AN INDICATION OF IMPORTANCE

Function for label “0”, i.e. won't add to cart
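
A short Python sketch of how such a function is read. A logistic regression model computes a linear score from the inputs and passes it through the logistic function to obtain a probability. Because the function shown is for label "0", a positive coefficient pushes the prediction toward "won't add to cart". The coefficient values, the intercept, and the attribute names below are invented for illustration and are not the values from the talk.

```python
import math

# Hypothetical coefficients of the function for label "0" (won't add to cart).
COEFFICIENTS = {"subscribed_to_email": 0.8, "referring_url_google": -1.2}
INTERCEPT = 0.3


def probability_label_0(features, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """P(label = 0): logistic function applied to the linear score."""
    score = intercept + sum(
        coefficients.get(name, 0.0) * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


def probability_add_to_cart(features):
    """P(label = 1) is the complement of P(label = 0) in the two-class case."""
    return 1.0 - probability_label_0(features)
```

Comparing coefficient magnitudes is only meaningful when the inputs are on the same scale, which holds for 0/1 flags like these.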

Logistic Regression: Results

Negative Impact on Purchase | Positive Impact on Purchase
Subscribed to Email         | Gender - Male
Commented on Blog           | Language - French
Visited Site                | Language - Spanish
Signup Newsletter           | Primary Use - Personal
Watched Video               | Referring URL - Ebay
Tweeted Item                | Referring URL - Google
Tweeted Blog Posts          | Referring URL - Live.com
Signup Free Offer           |

Cost/Benefit Analysis

LOGITBOOST LOGISTIC REGRESSION

LEFT CURVE: CUMULATIVE GAINS

RIGHT CURVE: BENEFIT CURVE

❯ y axis: benefit

❯ x axis: sample size

EXPECTED BENEFIT $ 21,475

❯ Also shows expected benefit if we just chose a random subset of this size from the total population.

❯ $ -3,036

Cost/Benefit Analysis

ALGORITHM: LOGITBOOST LOGISTIC REGRESSION

EXPECTED BENEFIT IF WE USE MACHINE-RECOMMENDED MAILER RECIPIENTS

❯ $ 21,475

EXPECTED BENEFIT IF WE JUST CHOSE A RANDOM SUBSET OF THIS SIZE FROM THE TOTAL POPULATION

❯ $ (-3,036)
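
The comparison can be reproduced in miniature with the Python sketch below. It assumes a fixed cost per mailer and a fixed revenue per responder, both hypothetical, and uses made-up scores and outcomes; the dollar figures on the slide come from the speaker's own data and cost settings, which are not given in the deck.

```python
def benefit_of_top_n(scored, n, cost_per_mail, revenue_per_response):
    """Benefit of mailing the n customers with the highest model score.

    scored is a list of (score, outcome) pairs, outcome being 1 for a
    customer who responds and 0 otherwise; n must not exceed len(scored).
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)[:n]
    responders = sum(outcome for _, outcome in ranked)
    return responders * revenue_per_response - n * cost_per_mail


def benefit_of_random_n(scored, n, cost_per_mail, revenue_per_response):
    """Expected benefit of mailing n customers picked at random."""
    base_rate = sum(outcome for _, outcome in scored) / len(scored)
    return n * (base_rate * revenue_per_response - cost_per_mail)


# Hypothetical scores and outcomes for eight customers.
SCORED = [(0.9, 1), (0.8, 1), (0.7, 0), (0.4, 0),
          (0.3, 1), (0.2, 0), (0.1, 0), (0.05, 0)]

model_benefit = benefit_of_top_n(SCORED, 3, cost_per_mail=5, revenue_per_response=10)
random_benefit = benefit_of_random_n(SCORED, 3, cost_per_mail=5, revenue_per_response=10)
```

With these toy numbers the model-selected mailing is profitable while the random mailing of the same size loses money, which is the same qualitative pattern the slide reports.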

Marketing

HOW MANY MARKETING PEOPLE DOES IT TAKE TO SCREW IN A LIGHTBULB?

None... they’ve automated it.

YOUR PROJECT

Project method

Prepare, Explore, Design, Develop, Deploy

• Installation and configuration

• Agile BI training

• Studies & infrastructure

• Architecture

• Project kick-off

• Development review

• Create content

• Explore the data

• Identify requirements

• Iterative review

• Publish

• Develop the content

• Extract and load the data

• Refine the data model

• Test, accept, and deploy

• Gather business requirements

• Extend

• Go-live

• Go/No-Go meeting

• Project definition

• BI suite and advanced training

• Project schedule

• Project plan

• Data model

• Requirements specification

• Go-live procedure

• Specification

• Test plans

• User training and documentation

• Iterative review

How are you going to manage your project?

Your efforts

Pentaho assistance

Ask yourself:

• What resources do we have?

• What skills do we have?

• What is our project timeline?

• How complex is our project?

• Do we have the infrastructure?

Thanks

JOIN THE CONVERSATION. YOU CAN FIND US ON:

blog.pentaho.com

@Pentaho

Facebook.com/Pentaho

Pentaho Business Analytics