Post on 04-Dec-2014
APACHE FLUME NG
Kai Voigt, Cloudera Inc
London, Hadoop User Group, 10 Oct 2012
Thursday, 11 October 2012
“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”
httpd -> /var/log/htaccess -> Flume -> HDFS
mysource -> mychannel -> mysink
myagent.sources = mysource
myagent.sinks = mysink
myagent.channels = mychannel
myagent.sources.mysource.type = exec
myagent.sources.mysource.command = tail -F /var/log/htaccess
myagent.sources.mysource.channels = mychannel
myagent.sinks.mysink.type = hdfs
myagent.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent.sinks.mysink.hdfs.fileType = DataStream
myagent.sinks.mysink.channel = mychannel
myagent.channels.mychannel.type = memory
myagent.channels.mychannel.capacity = 1000
myagent.channels.mychannel.transactionCapacity = 100
$ flume-ng agent --conf-file simple.conf --name myagent
$ hadoop fs -ls htaccess
-rw-r--r-- 1 cloudera cloudera 1001 2012-09-30 05:58 htaccess/FlumeData.1348999108529
-rw-r--r-- 1 cloudera cloudera 993 2012-09-30 05:58 htaccess/FlumeData.1348999108530
-rw-r--r-- 1 cloudera cloudera 997 2012-09-30 05:59 htaccess/FlumeData.1348999108531
-rw-r--r-- 1 cloudera cloudera 1009 2012-09-30 05:59 htaccess/FlumeData.1348999108532
...
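The files in the listing are all about 1 KB. That is consistent with the HDFS sink's documented default roll settings, which start a new file after 30 seconds, 1024 bytes, or 10 events, whichever comes first. A minimal sketch of how the roll behaviour could be tuned for fewer, larger files follows. The concrete values are illustrative assumptions, not taken from the talk.

```
# Illustrative values only (assumptions, not from the slides).
# Roll to a new file every 5 minutes ...
myagent.sinks.mysink.hdfs.rollInterval = 300
# ... or once the current file reaches 128 MB
myagent.sinks.mysink.hdfs.rollSize = 134217728
# 0 disables rolling based on the number of events
myagent.sinks.mysink.hdfs.rollCount = 0
```

Setting any of the three properties to 0 switches that criterion off, so the remaining ones decide when a file is closed.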
MULTI HOP
myagent1.sinks = mysink
myagent1.sinks.mysink.type = avro
myagent1.sinks.mysink.hostname = 10.10.10.10
myagent1.sinks.mysink.port = 4141
myagent2.sources = mysource
myagent2.sources.mysource.type = avro
myagent2.sources.mysource.bind = 10.10.10.10
myagent2.sources.mysource.port = 4141
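The multi-hop fragments show only the two ends of the Avro link. A fuller sketch of both agents, with the channels and the remaining source and sink wired in, might look like the following. It reuses the exec source and HDFS sink from the single-agent example and assumes that myagent2 runs on 10.10.10.10, the address its source binds to. Note that the Avro sink names its target with `hostname`, while the Avro source uses `bind` for its listen address.

```
# Agent 1, on the web server: tail the log and forward events over Avro
myagent1.sources = mysource
myagent1.channels = mychannel
myagent1.sinks = mysink
myagent1.sources.mysource.type = exec
myagent1.sources.mysource.command = tail -F /var/log/htaccess
myagent1.sources.mysource.channels = mychannel
myagent1.channels.mychannel.type = memory
myagent1.sinks.mysink.type = avro
# Assumption: address and port of myagent2's Avro source
myagent1.sinks.mysink.hostname = 10.10.10.10
myagent1.sinks.mysink.port = 4141
myagent1.sinks.mysink.channel = mychannel

# Agent 2, on 10.10.10.10 (assumed): receive Avro events and write to HDFS
myagent2.sources = mysource
myagent2.channels = mychannel
myagent2.sinks = mysink
myagent2.sources.mysource.type = avro
myagent2.sources.mysource.bind = 10.10.10.10
myagent2.sources.mysource.port = 4141
myagent2.sources.mysource.channels = mychannel
myagent2.channels.mychannel.type = memory
myagent2.sinks.mysink.type = hdfs
myagent2.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent2.sinks.mysink.hdfs.fileType = DataStream
myagent2.sinks.mysink.channel = mychannel
```

Each agent is started separately with its own `--name`, so both definitions can live in one file or in two.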
CONSOLIDATION
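Consolidation means several first-tier agents sending to one collector. A rough sketch, assuming two web servers and a collector reachable at 10.10.10.10, is shown below. The agent names `webagent1`, `webagent2` and `collector` are hypothetical, and the source, channel and `sinks =` declarations that follow the single-agent pattern are left out for brevity.

```
# On web server 1 (hypothetical agent name)
webagent1.sinks.mysink.type = avro
webagent1.sinks.mysink.hostname = 10.10.10.10
webagent1.sinks.mysink.port = 4141
webagent1.sinks.mysink.channel = mychannel

# On web server 2 (hypothetical agent name): same target as web server 1
webagent2.sinks.mysink.type = avro
webagent2.sinks.mysink.hostname = 10.10.10.10
webagent2.sinks.mysink.port = 4141
webagent2.sinks.mysink.channel = mychannel

# On the collector host: one Avro source accepts events from all upstream agents
collector.sources.mysource.type = avro
# 0.0.0.0 listens on all interfaces
collector.sources.mysource.bind = 0.0.0.0
collector.sources.mysource.port = 4141
collector.sources.mysource.channels = mychannel
```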
MULTIPLEXING
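Multiplexing routes events from one source into different channels based on an event header. A minimal sketch using Flume's multiplexing channel selector follows. The header name `logtype`, its values, and the channel names are assumptions made up for illustration, and the events are assumed to already carry that header.

```
# Hypothetical channel names, one per route plus a fallback
myagent.channels = accesschannel errorchannel defaultchannel
myagent.sources.mysource.channels = accesschannel errorchannel defaultchannel

# Pick the channel from the value of the (assumed) "logtype" header
myagent.sources.mysource.selector.type = multiplexing
myagent.sources.mysource.selector.header = logtype
myagent.sources.mysource.selector.mapping.access = accesschannel
myagent.sources.mysource.selector.mapping.error = errorchannel
# Events with any other value, or without the header, go here
myagent.sources.mysource.selector.default = defaultchannel
```

Without a selector setting, a source copies every event to all of its channels (the replicating selector), so the multiplexing type has to be set explicitly.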
Sources             Sinks    Channels
Avro                Avro     Memory
Exec                Logger   JDBC
NetCat              IRC      File
Sequence Generator  File
Syslog              HBase
Scribe
DEMO
Thank you!
kai@cloudera.com
http://www.cloudera.com/