Filling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview Presentation
Filling the Data Lake
Chuck Yarbrough, Director, Pentaho Solutions
Mark Burnette, Enterprise Sales Engineer
Preview presentation for Strata + HadoopWorld San Jose 2016 session Thursday, March 31 at 11:50 am, room 230B
© 2015, Pentaho. All rights reserved. pentaho.com. Worldwide +1 (866) 660-7555
Hadoop is Hard…
Things that can help ease the pain:
Empower team members to integrate and process Hadoop data
Establish a modern data onboarding process that is flexible and scalable
Deliver governed analytic insights for large production user bases
Proper Care and Feeding of the Data Lake
How do we effectively scale data pipelines to accommodate exploding data sources, volumes, and complexity?
More Data, More Problems
Have you ever had the pleasure of…
Migrating hundreds of tables between databases?
Enabling business users to onboard a variety of data themselves?
Ingesting hundreds of changing data sources into Hadoop?
More Data, More Problems
Modern data onboarding is more than just “connecting” or “loading” – it includes:
Managing a changing array of data sources
Establishing repeatable processes at scale
Maintaining control and governance
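One way to picture these requirements together is a metadata-driven onboarding loop: a registry describes each source, and one generic process handles all of them, instead of a hand-built job per source. The sketch below is a minimal, tool-agnostic illustration in Python; the registry entries, source names, and sample payloads are all hypothetical.

```python
import csv
import io

# Hypothetical source registry: in practice this might live in a database
# or config store; each entry describes one source to onboard.
SOURCES = [
    {"name": "orders", "format": "csv", "delimiter": ","},
    {"name": "clicks", "format": "csv", "delimiter": "\t"},
]

# Simulated raw payloads keyed by source name (stand-ins for real files).
RAW = {
    "orders": "id,amount\n1,9.99\n2,19.50\n",
    "clicks": "id\turl\n1\t/home\n",
}

def ingest(source, raw_text):
    """Parse one source according to its registry metadata."""
    reader = csv.DictReader(io.StringIO(raw_text),
                            delimiter=source["delimiter"])
    return list(reader)

# One generic loop replaces a hand-written job per source; adding a new
# source means adding a registry entry, not writing new code.
results = {s["name"]: ingest(s, RAW[s["name"]]) for s in SOURCES}
print(results["orders"][0])
```

The same pattern extends naturally to governance: because every source passes through the one loop, logging, validation, and lineage capture can be added in a single place.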
[Diagram: big data onboarding. Disparate data sources (CSV, RDBMS, Avro) flow through dynamic integration processes, ingest procedures, and dynamic transformations into Hadoop.]
Continuous Big Data Onboarding Blueprint
Streamline data ingest from a wide variety of data sources
Reduce dependence on hard-coded data movement procedures
Simplify regular data movement at scale into Hadoop
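A concrete piece of such a blueprint is handling sources whose schemas change over time. The sketch below, a hypothetical illustration not tied to any specific tool, checks an incoming file's header against the expected column set before loading, so drifted files can be routed to review rather than silently breaking downstream loads; the column names are made up for the example.

```python
import csv
import io

# Hypothetical expected schema for one source, captured when it was
# first onboarded.
EXPECTED_COLUMNS = {"id", "amount", "ts"}

def detect_drift(raw_text, expected, delimiter=","):
    """Compare an incoming file's header to the expected column set.

    Returns (added, missing) column sets, so the pipeline can flag
    drifted files instead of loading them blindly.
    """
    header = next(csv.reader(io.StringIO(raw_text), delimiter=delimiter))
    actual = set(header)
    return actual - expected, expected - actual

# A file arrives with a new 'region' column and without 'ts'.
added, missing = detect_drift("id,amount,region\n1,9.99,west\n",
                              EXPECTED_COLUMNS)
print(added, missing)
```

A check like this is what turns "hundreds of changing data sources" from a maintenance nightmare into a routing decision made once, in one place.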