Post on 08-May-2015
Loading data in Hadoop 2 with SQOOP and Flume
Christophe Marchal | Software Architect
Problem to solve
Hortonworks stack
Batch Loading vs Stream Loading
SQOOP
HCatalog
SQOOP 1: Import
SQOOP 1: Export
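The export slide has no surviving content, so the following is only a sketch of what a Sqoop 1 export invocation typically looks like, assembled in Python so the arguments are easy to read. The flags (`--connect`, `--username`, `-P`, `--table`, `--export-dir`, `--num-mappers`) are standard Sqoop 1 options; the JDBC URL, user, table name and HDFS directory are hypothetical placeholders, not values from the talk.

```python
import shlex


def sqoop_export(jdbc_url, user, table, export_dir, mappers=4):
    """Build the argument list for a Sqoop 1 export (HDFS -> RDBMS table)."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,          # JDBC URL of the target database
        "--username", user,
        "-P",                           # prompt for the password at run time
        "--table", table,               # existing table that receives the rows
        "--export-dir", export_dir,     # HDFS directory holding the files to export
        "--num-mappers", str(mappers),  # number of parallel map tasks
    ]


# Hypothetical values, for illustration only.
export_cmd = sqoop_export(
    "jdbc:mysql://db.example.com/shop", "etl", "daily_totals", "/data/daily_totals"
)
print(shlex.join(export_cmd))
```

The printed line can be pasted into a shell on a machine where Sqoop 1 is installed; the target table is assumed to exist already.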
SQOOP 2
Flume
Flume agent: Web Server -> Source -> Channel -> Sink -> HDFS
[Figure: three further agents, each with a Source, a Channel and a Sink, and two web servers]
Multi agent flow
Consolidation flow
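The agent, multi agent and consolidation diagrams can be mimicked with a toy model. This is not Flume code: the `Agent` class, the `forward` helper and the sample events are hypothetical, and the "commit" step only approximates the transactions a real channel provides. It shows two web-server agents whose sinks feed the source of a single collector agent, which in turn writes to a stand-in for HDFS.

```python
from collections import deque


class Agent:
    """Toy stand-in for a Flume agent: source -> channel -> sink."""

    def __init__(self, name, sink):
        self.name = name
        self.channel = deque()  # buffers events between source and sink
        self.sink = sink        # callable that delivers a batch downstream

    def source(self, event):
        """Accept an event from upstream and stage it in the channel."""
        self.channel.append(event)

    def drain(self):
        """Deliver all buffered events; they stay buffered if the sink raises."""
        batch = list(self.channel)
        self.sink(batch)        # an exception here leaves the channel untouched
        self.channel.clear()    # "commit": drop events only after delivery
        return len(batch)


hdfs = []  # stand-in for the HDFS destination

# Consolidation flow: several first-tier agents feed one collector agent.
collector = Agent("collector", sink=hdfs.extend)


def forward(batch):
    """Sink of a first-tier agent: hand each event to the collector's source."""
    for event in batch:
        collector.source(event)


web1 = Agent("web1", sink=forward)
web2 = Agent("web2", sink=forward)

web1.source("GET /index.html")
web2.source("GET /about.html")
web1.drain()
web2.drain()
collector.drain()
print(hdfs)
```

Chaining one agent's sink to the next agent's source is the multi agent flow; pointing several sinks at the same collector is the consolidation flow.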
Flume vs SQOOP
Flume
● distributed
● reliable (transaction)
● available (backup routes)
● collecting data
● aggregating data
SQOOP
● Data imports
● Parallelizes data transfer
● Copies data quickly
Flume example
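The content of the example slides did not survive, so here is a minimal sketch of the kind of agent definition such an example usually contains, generated from Python. The property keys follow the Flume properties-file format (an `exec` source, a `memory` channel and an `hdfs` sink); the agent name `a1`, the component names `r1`, `c1`, `k1`, the log file and the HDFS path are assumptions made for illustration.

```python
def flume_config(agent, log_file, hdfs_path):
    """Render a single-agent Flume properties file: exec source -> memory channel -> HDFS sink."""
    props = {
        # Name the components of this agent.
        f"{agent}.sources": "r1",
        f"{agent}.channels": "c1",
        f"{agent}.sinks": "k1",
        # Source: follow a log file with tail.
        f"{agent}.sources.r1.type": "exec",
        f"{agent}.sources.r1.command": f"tail -F {log_file}",
        f"{agent}.sources.r1.channels": "c1",
        # Channel: in-memory buffer between source and sink.
        f"{agent}.channels.c1.type": "memory",
        f"{agent}.channels.c1.capacity": "1000",
        # Sink: write the events to HDFS as plain data.
        f"{agent}.sinks.k1.type": "hdfs",
        f"{agent}.sinks.k1.hdfs.path": hdfs_path,
        f"{agent}.sinks.k1.hdfs.fileType": "DataStream",
        f"{agent}.sinks.k1.channel": "c1",
    }
    return "\n".join(f"{key} = {value}" for key, value in props.items())


# Hypothetical paths, for illustration only.
config = flume_config(
    "a1", "/var/log/httpd/access_log", "hdfs://namenode:8020/flume/weblogs"
)
print(config)
```

Saved as `example.conf`, a file like this would typically be started with `flume-ng agent --conf conf --conf-file example.conf --name a1`. A memory channel loses buffered events if the agent dies; a file channel is the usual choice when that matters.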
SQOOP: import HDFS
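As the command shown on these slides is not recoverable, the following sketch builds a plausible Sqoop 1 import into HDFS. The flags used (`--connect`, `--username`, `-P`, `--table`, `--target-dir`, `--num-mappers`, `--split-by`) are standard Sqoop 1 import options; the connection string, user, table, split column and target directory are hypothetical.

```python
import shlex


def sqoop_import_hdfs(jdbc_url, user, table, target_dir, mappers=4, split_by=None):
    """Build the argument list for a Sqoop 1 import of one table into HDFS."""
    cmd = [
        "sqoop", "import",
        "--connect", jdbc_url,          # JDBC URL of the source database
        "--username", user,
        "-P",                           # prompt for the password at run time
        "--table", table,               # table to copy
        "--target-dir", target_dir,     # HDFS directory that receives the files
        "--num-mappers", str(mappers),  # degree of parallelism
    ]
    if split_by:
        # Column used to cut the table into one range per mapper.
        cmd += ["--split-by", split_by]
    return cmd


# Hypothetical values, for illustration only.
hdfs_cmd = sqoop_import_hdfs(
    "jdbc:mysql://db.example.com/shop", "etl", "orders", "/data/orders",
    mappers=4, split_by="id",
)
print(shlex.join(hdfs_cmd))
```

The number of mappers is what makes the transfer parallel: each mapper reads one range of the split column and writes its own file under the target directory.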
SQOOP: import Hive
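The Hive variant can be sketched the same way. `--hive-import`, `--hive-table` and `--create-hive-table` are standard Sqoop 1 options; the database URL, user and table names below are hypothetical, and the original slides may well have used different options (for example the HCatalog integration mentioned earlier in the deck).

```python
import shlex


def sqoop_import_hive(jdbc_url, user, table, hive_table, create=True, mappers=4):
    """Build the argument list for a Sqoop 1 import that loads a table into Hive."""
    cmd = [
        "sqoop", "import",
        "--connect", jdbc_url,          # JDBC URL of the source database
        "--username", user,
        "-P",                           # prompt for the password at run time
        "--table", table,               # source table
        "--hive-import",                # load the imported files into Hive
        "--hive-table", hive_table,     # destination Hive table
        "--num-mappers", str(mappers),  # degree of parallelism
    ]
    if create:
        # Make the job fail if the Hive table already exists.
        cmd.append("--create-hive-table")
    return cmd


# Hypothetical values, for illustration only.
hive_cmd = sqoop_import_hive(
    "jdbc:mysql://db.example.com/shop", "etl", "orders", "shop.orders"
)
print(shlex.join(hive_cmd))
```

Compared with the plain HDFS import, the only change is the group of Hive flags; Sqoop still copies the data with parallel mappers first and then loads it into the Hive table.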
Thanks
Christophe Marchal | Software Architect @toff63