TiE Big Data panel

Post on 27-Jan-2015


TiE SV Big Data Panel Oct 13, 2011

What did Google do?

©2011 Cloudera, Inc. All Rights Reserved. Confidential.

Reproduction or redistribution without written permission is prohibited.

Diagram components: Dremel, Evenflow, MySQL Gateway, Sawzall, Bigtable, Chubby, MapReduce / GFS

Store files

Process data

Ingest data

Store records & tables

High-level domain-specific language

Chain together complex workloads

Schedule them

Columnar format + metadata

End user queries

Coordinate within system

The pattern repeated

Diagram components: HiPal, Hive, Databee, Scribe, HBase, Zookeeper

The pattern repeated

Diagram components: Hive, Oozie, Data Highway, Pig & Hive, HBase, Zookeeper

The pattern repeated

Diagram components: Azkaban, Sqoop, Kafka, Pig, Voldemort, Zookeeper

The pattern repeated

Diagram components: Hue, Hive, Oozie, Sqoop, Flume, Hive / Pig, HBase, Zookeeper

Cloudera’s Distribution Including Apache Hadoop

Project summary

Topic                        Project(s)
File storage                 HDFS
Record storage               HBase, Hypertable, Accumulo
Metadata storage             Hive, HCatalog
Batch data processing        MapReduce
Streaming data processing    S4, Storm
Graph processing             Giraph, X-Rime
Query language               Hive
Dataflow language            Pig
Database integration         Sqoop
Event data collection        Flume, Scribe
Test & assembly              Bigtop
Distributed lock             Zookeeper
Web access                   Hue
Workflow                     Oozie, Azkaban
File format                  Avro, RCFile, Protocol Buffers, SequenceFile

With BIG DATA anything is POSSIBLE

Celebrate Next Saturday