Apache Hadoop Architecture (2016-17)

Structured Data, Unstructured Data, Semi-Structured Data
Data Sources: Documents, Social Media, Video, Enterprise Data, CRM, Machine, Sensor, EDI, XML/JSON, Transaction, ERP, CRM
Integration Tools: Apache Flume, Apache Kafka, Apache Sqoop, Apache NiFi, Apache ManifoldCF
File System: Hadoop Distributed File System, Quantcast File System, Ceph File System, XtreemFS
Cluster Resource Management: YARN
Execution Engine: Direct, Java, .NET, Script, Slides, Batch, Script, Cascading, SQL, Pig, Hive, Apache Drill, Cloudera Impala, Scala, Java, Other ISV, Stream, NoSQL, Storm, Other ISV, In-memory, Data Flow Engines, Machine Learning & Search, Other ISV
Management & Coordination: ZooKeeper
Visualization
