Apache Flume NG

Post on 04-Dec-2014


Talk given by Kai Voigt, Cloudera Inc, at the Hadoop User Group UK meetup on 10 Oct 2012 in London


APACHE FLUME NG
Kai Voigt, Cloudera Inc
London, Hadoop User Group, 10 Oct 2012

Thursday, 11 October 2012

“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”

[Diagram: httpd writes to /var/log/htaccess; Flume moves the log data into HDFS]


[Diagram: mysource -> mychannel -> mysink]

myagent.sources = mysource
myagent.sinks = mysink
myagent.channels = mychannel


myagent.sources.mysource.type = exec
myagent.sources.mysource.command = tail -F /var/log/htaccess
myagent.sources.mysource.channels = mychannel


myagent.sinks.mysink.type = hdfs
myagent.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent.sinks.mysink.hdfs.fileType = DataStream
myagent.sinks.mysink.channel = mychannel


myagent.channels.mychannel.type = memory
myagent.channels.mychannel.capacity = 1000
myagent.channels.mychannel.transactionCapacity = 100


$ flume-ng agent --conf-file simple.conf --name myagent
$ hadoop fs -ls htaccess
-rw-r--r-- 1 cloudera cloudera 1001 2012-09-30 05:58 htaccess/FlumeData.1348999108529
-rw-r--r-- 1 cloudera cloudera 993 2012-09-30 05:58 htaccess/FlumeData.1348999108530
-rw-r--r-- 1 cloudera cloudera 997 2012-09-30 05:59 htaccess/FlumeData.1348999108531
-rw-r--r-- 1 cloudera cloudera 1009 2012-09-30 05:59 htaccess/FlumeData.1348999108532
...
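The listing shows many files of roughly 1 KB each, which suggests the HDFS sink is rolling files on small thresholds. A minimal sketch of how the roll behaviour could be tuned, assuming fewer, larger files are wanted. The property names are the HDFS sink's roll settings; the values are illustrative assumptions, not from the talk.

```properties
# Assumed value: start a new file every 5 minutes (seconds; 0 disables time-based rolling)
myagent.sinks.mysink.hdfs.rollInterval = 300
# Disable size-based rolling (bytes; 0 disables)
myagent.sinks.mysink.hdfs.rollSize = 0
# Disable event-count-based rolling (0 disables)
myagent.sinks.mysink.hdfs.rollCount = 0
```

Comments sit on their own lines because the properties format does not support trailing comments after a value.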


“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”

MULTI HOP


myagent1.sinks = mysink
myagent1.sinks.mysink.type = avro
myagent1.sinks.mysink.hostname = 10.10.10.20
myagent1.sinks.mysink.port = 4141

myagent2.sources = mysource
myagent2.sources.mysource.type = avro
myagent2.sources.mysource.bind = 10.10.10.10
myagent2.sources.mysource.port = 4141


CONSOLIDATION
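The slide's diagram is not in the transcript. A minimal sketch of what consolidation could look like, assuming several web servers each run the same forwarding agent and one collector writes to HDFS. The agent and component names (webagent, collector, tail, forward, incoming, store, mem) and the collector address are hypothetical.

```properties
# On each web server: tail the log and forward events over Avro
webagent.sources = tail
webagent.channels = mem
webagent.sinks = forward

webagent.sources.tail.type = exec
webagent.sources.tail.command = tail -F /var/log/htaccess
webagent.sources.tail.channels = mem

webagent.channels.mem.type = memory

# Assumed collector address and port
webagent.sinks.forward.type = avro
webagent.sinks.forward.hostname = 10.10.10.20
webagent.sinks.forward.port = 4141
webagent.sinks.forward.channel = mem

# On the collector: one Avro source receives from all web servers
collector.sources = incoming
collector.channels = mem
collector.sinks = store

collector.sources.incoming.type = avro
collector.sources.incoming.bind = 0.0.0.0
collector.sources.incoming.port = 4141
collector.sources.incoming.channels = mem

collector.channels.mem.type = memory

collector.sinks.store.type = hdfs
collector.sinks.store.hdfs.path = /user/cloudera/htaccess
collector.sinks.store.hdfs.fileType = DataStream
collector.sinks.store.channel = mem
```

Each agent would be started with its own name, for example --name webagent on the web servers and --name collector on the collector host.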


MULTIPLEXING
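The slide's diagram is not in the transcript. A minimal sketch of multiplexing, assuming one source routes events to different channels based on an event header. The header name (severity), its value (error), and the channel names are hypothetical; something upstream, such as the sending client or an interceptor, would have to set that header.

```properties
myagent.sources = mysource
myagent.channels = errorchannel defaultchannel

# The source feeds both channels; the selector decides per event
myagent.sources.mysource.channels = errorchannel defaultchannel
myagent.sources.mysource.selector.type = multiplexing
myagent.sources.mysource.selector.header = severity
# Events whose severity header equals "error" go to errorchannel
myagent.sources.mysource.selector.mapping.error = errorchannel
# Everything else goes to defaultchannel
myagent.sources.mysource.selector.default = defaultchannel

myagent.channels.errorchannel.type = memory
myagent.channels.defaultchannel.type = memory
```

Each channel would then be drained by its own sink, wired with the same sinks.<name>.channel setting used in the simple example.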


Sources              Sinks    Channels
Avro                 Avro     Memory
Exec                 Logger   JDBC
NetCat               IRC      File
Sequence Generator   File
Syslog               HBase
Scribe
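Components from the table are swapped in by changing the type setting. A minimal sketch, assuming the memory channel from the simple example is replaced with the File channel so that buffered events survive an agent restart. The directory paths are hypothetical.

```properties
myagent.channels.mychannel.type = file
# Assumed local directories; the agent's user must be able to write to them
myagent.channels.mychannel.checkpointDir = /var/lib/flume/checkpoint
myagent.channels.mychannel.dataDirs = /var/lib/flume/data
```

The source and sink settings stay unchanged because they refer to the channel only by name.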


DEMO


Thank you!
kai@cloudera.com
http://www.cloudera.com/
