upload test 1
Sadayuki Furuhashi
Fluentd
@frsyuki
The Event Collector Service
Treasure Data, Inc.
Structured logging
Pluggable architecture
Reliable forwarding
• Sadayuki Furuhashi
> twitter: @frsyuki
• Treasure Data, Inc.
> Software Engineer; founder
• Author of MessagePack
• Author of Fluentd
What’s Fluentd?
It's like syslogd, but uses JSON for log messages
What’s Fluentd?
[Diagram: Application → Fluentd → Storage]
2012-02-04 01:33:51 myapp.buylog {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}
time: 2012-02-04 01:33:51
tag: myapp.buylog
record: {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}
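The time / tag / record split above can be sketched in a few lines of Ruby. This is only an illustration: the `Event` struct and its `to_line` method are hypothetical names invented here, not part of Fluentd's API.

```ruby
require "json"

# Hypothetical container for one event: a time, a tag, and a record.
# (Names are illustrative only; this is not Fluentd's internal class.)
Event = Struct.new(:time, :tag, :record) do
  # Render the event as "<time> <tag> <record as JSON>".
  def to_line
    "#{time.strftime('%Y-%m-%d %H:%M:%S')} #{tag} #{JSON.generate(record)}"
  end
end

event = Event.new(
  Time.utc(2012, 2, 4, 1, 33, 51),
  "myapp.buylog",
  { "user" => "me", "path" => "/buyItem", "price" => 150, "referer" => "/landing" }
)
puts event.to_line
```

Keeping the record as a hash rather than a preformatted string is what lets later stages filter or route on individual fields.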
What’s Fluentd?
[Diagram: inputs → Fluentd → outputs, each connected through a plug-in]
Inputs: Application, File (tail), Scribe, syslogd
Fluentd: filter / buffer / routing
Outputs: Fluentd, Storage, SaaS
What’s Fluentd?
• Client libraries
> Ruby
> Perl
> PHP
> Python
> Java
> ...
Fluent.open("myapp")
Fluent.event("login", {"user"=>38})
#=> 2012-02-04 04:56:01 myapp.login {"user":38}
[Diagram: Application → Fluentd]
Fluentd & Event logs
Before:
[Diagram: on each of three app servers, Application → File File File ...; then File → Log server]
• Burst of traffic
• High latency: must wait for a day
• Hard to analyze: complex text parsers
Fluentd & Event logs
After:
[Diagram: on each of three app servers, Application → Fluentd; then → Fluentd, Fluentd]
Realtime!
Fluentd & Event logs
[Diagram: Fluentd, Fluentd, Fluentd → Fluentd, Fluentd → Hadoop / Hive, MongoDB, Amazon S3 / EMR]
Realtime!
Ready to Analyze!
# receive events via HTTP
<source>
  type http
  port 8888
</source>
# read logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache.access
</source>
# save access logs to MongoDB
<match apache.access>
  type mongo
  host 127.0.0.1
</match>
# save alerts to a file
<match alert.**>
  type file
  path /var/log/fluent/alerts
</match>
# forward other logs to servers
# (load-balancing + fail-over)
<match **>
  type forward
  <server>
    host 192.168.0.11
    weight 20
  </server>
  <server>
    host 192.168.0.12
    weight 60
  </server>
</match>
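How the three `<match>` blocks above divide up events can be sketched as tag matching. The slides do not spell out the rules, so this assumes the documented Fluentd behavior as I understand it: `*` matches exactly one dot-separated tag part, `**` matches zero or more parts, and the first matching block in file order wins. The helpers `tag_match?`, `match_parts?`, and `route` are hypothetical names, not Fluentd functions.

```ruby
# Does a match pattern such as "alert.**" accept a tag such as "alert.disk.full"?
def tag_match?(pattern, tag)
  match_parts?(pattern.split("."), tag.split("."))
end

# Recursive comparison over dot-separated parts.
def match_parts?(pat, tag)
  return tag.empty? if pat.empty?
  head, *rest = pat
  case head
  when "**"
    # "**" may consume zero or more tag parts.
    (0..tag.length).any? { |n| match_parts?(rest, tag.drop(n)) }
  when "*"
    # "*" consumes exactly one tag part.
    !tag.empty? && match_parts?(rest, tag.drop(1))
  else
    tag.first == head && match_parts?(rest, tag.drop(1))
  end
end

# The match blocks from the configuration, in file order: [pattern, output type].
RULES = [
  ["apache.access", "mongo"],
  ["alert.**", "file"],
  ["**", "forward"],
].freeze

# Return the output type of the first rule whose pattern accepts the tag.
def route(tag)
  rule = RULES.find { |pattern, _type| tag_match?(pattern, tag) }
  rule && rule[1]
end

%w[apache.access alert.disk.full myapp.login].each do |tag|
  puts "#{tag} -> #{route(tag)}"
end
```

Because the catch-all `**` block comes last, it only receives events that no earlier block claimed.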
Fluentd vs Scribe
• Deals with structured logs
• Easy to install
> “gem install fluentd”
> apt-get and yum: http://packages.treasure-data.com/
• Easy to customize
• add/modify plugins without re-compiling
> “gem search -rd fluent-plugin”
Fluentd vs Flume
• Easy to set up
> “sudo fluentd --setup && fluentd”
• Very small footprint
> small engine (3,000 lines) + plugins
• JVM-free
• Easy to configure
Architecture of Fluentd
Input (pluggable): HTTP+JSON, File tail, Syslog, ...
Buffer (pluggable): Memory, File
Output (pluggable): File, Amazon S3, Fluent, ...
Architecture :: Input
✓ Receive logs
✓ Or pull logs from data sources
✓ Non-blocking
Input plugins (pluggable): HTTP+JSON, File tail, Syslog, ...
Architecture :: Buffer
✓ Improve performance
✓ Improve reliability
✓ Provide thread-safety
Buffer plugins (pluggable): Memory, File
Architecture :: Output
✓ Write or send event logs
Output plugins (pluggable): File, Amazon S3, Fluent, ...
Plugins :: out_forward
[Diagram: Fluentd (out_forward) → Fluentd, Fluentd (in_forward)]
forward event logs
✓ load balancing
Heartbeat
✓ accrual failure detector
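One simple way to combine per-server weights with fail-over is weighted random selection over the servers currently considered alive. The sketch below is an assumption for illustration, not out_forward's actual algorithm. It does not implement heartbeats or the accrual failure detector; a plain `available` flag stands in for them. `Server` and `pick_server` are hypothetical names.

```ruby
# Hypothetical view of one forwarding destination.
Server = Struct.new(:host, :weight, :available)

# Pick a server with probability proportional to its weight,
# considering only servers currently marked available (fail-over).
# Returns nil when no server is available.
def pick_server(servers, rng = Random.new)
  live = servers.select(&:available)
  return nil if live.empty?
  total = live.sum(&:weight)
  point = rng.rand(total) # integer in 0...total
  live.each do |server|
    return server if point < server.weight
    point -= server.weight
  end
end

servers = [
  Server.new("192.168.0.11", 20, true),
  Server.new("192.168.0.12", 60, true),
]
counts = Hash.new(0)
1000.times { counts[pick_server(servers).host] += 1 }
puts counts.inspect
```

With weights 20 and 60, roughly three of every four events should go to the second host while both are available; marking one host unavailable sends everything to the other.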
Plugins :: out_copy
[Diagram: Fluentd (out_copy) → out_mongo → MongoDB; out_forward → Fluentd; out_file → File]
duplicate event logs
Plugins :: buf_file
[Diagram: Fluentd (buf_file) → file, file, file, ...]
reliable buffering
✓ Persistent buffer
✓ Automatic retry
✓ 2^N retry interval
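The "automatic retry" and "2^N retry interval" points can be sketched as exponential backoff. This is a minimal illustration, not buf_file's implementation. The base wait, the cap, the retry limit, and the names `retry_wait` and `flush_with_retry` are all assumptions made for the example.

```ruby
# Wait before the N-th retry (N = 0, 1, 2, ...): base * 2^N seconds, capped at max.
# base and max are illustrative defaults, not Fluentd settings.
def retry_wait(n, base: 1.0, max: 60.0)
  [base * (2**n), max].min
end

# Call the block with the chunk; on failure wait 2^N * base seconds and try again.
# Gives up (returns false) once `limit` retries have failed.
# `sleeper` is injectable so the waits can be observed without sleeping.
def flush_with_retry(chunk, limit: 5, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    yield chunk
    true
  rescue StandardError
    return false if attempts >= limit
    sleeper.call(retry_wait(attempts))
    attempts += 1
    retry
  end
end

puts (0..6).map { |n| retry_wait(n) }.inspect
```

Doubling the wait keeps a struggling destination from being hammered with retries, while the chunk stays in the buffer until a flush succeeds or the limit is reached.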
Plugins :: out_exec
[Diagram: Fluentd (out_exec) → TSV → stdin → external program]
execute external programs
✓ Python
✓ Perl
✓ C++
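The receiving side of "TSV → stdin" is small. The slide lists Python, Perl, and C++; Ruby is used here only to stay consistent with the deck's other examples. The column layout in `KEYS` is a made-up assumption, since the real columns depend on how the plugin is configured, and `each_record` is a hypothetical helper. A `StringIO` stands in for `$stdin` so the sketch runs on its own.

```ruby
require "stringio"

# Yield one Hash per TSV line read from io, using `keys` as the column names.
def each_record(io, keys)
  return enum_for(:each_record, io, keys) unless block_given?
  io.each_line do |line|
    values = line.chomp.split("\t", -1) # -1 keeps trailing empty fields
    yield keys.zip(values).to_h
  end
end

# Hypothetical column order for the TSV lines.
KEYS = %w[user path price].freeze

# In a real external program this would be $stdin.
input = StringIO.new("me\t/buyItem\t150\n")
each_record(input, KEYS) do |record|
  puts "#{record['user']} requested #{record['path']} (price #{record['price']})"
end
```

All values arrive as strings; converting `price` to a number is left to the program.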
Plugins :: out_exec_filter
[Diagram: Fluentd (out_exec_filter) → stdin → external program → stdout → Fluentd; out_exec: TSV → stdin → external program]
execute external programs
✓ Python
✓ Perl
✓ C++
Plugins :: in_exec
[Diagram: external program → stdout → in_exec → Fluentd, alongside the out_exec_filter and out_exec flows from the previous slides]
execute external programs
✓ Python
✓ Perl
✓ C++
Plugins :: in_tail
[Diagram: Application → File /var/log/access.log → in_tail → Fluentd]
Read event logs from a file
✓ Apache log parser
✓ Syslog parser
✓ Custom parser
Plugins :: in_tail
Apache log parser
87.12.1.87 - - [04/Feb/2012:00:20:11 +0900] "GET / HTTP/1.1" 200 98
87.12.1.87 - - [04/Feb/2012:00:20:11 +0900] "GET / HTTP/1.1" 200 98
...
{"host": "87.12.1.87", "method": "GET", "code": 200, "size": 98, "path": "/"}
...
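The step from an access-log line to a structured record can be sketched with one regular expression. The pattern is a simplified one written for the sample line above, not the expression Fluentd's Apache parser uses, and `APACHE_LINE` and `parse_apache_line` are hypothetical names. The bracketed timestamp is captured but left out of the record, on the assumption that it becomes the event's time.

```ruby
# Simplified pattern for lines like the sample above (assumption: not Fluentd's own regexp).
APACHE_LINE = /\A(?<host>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+) [^"]*" (?<code>\d{3}) (?<size>\d+|-)\z/

# Turn one access-log line into a record Hash, or nil if the line does not match.
def parse_apache_line(line)
  m = APACHE_LINE.match(line.chomp)
  return nil unless m
  {
    "host" => m[:host],
    "method" => m[:method],
    "code" => m[:code].to_i,
    "size" => m[:size] == "-" ? nil : m[:size].to_i, # "-" means no body size was logged
    "path" => m[:path],
  }
end

line = '87.12.1.87 - - [04/Feb/2012:00:20:11 +0900] "GET / HTTP/1.1" 200 98'
puts parse_apache_line(line).inspect
```

Once `code` and `size` are numbers rather than substrings, downstream queries no longer need text parsers of their own.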
Plugins
• Bundled plugins
> file: writes event logs to files hourly or daily
> forward: forwards event logs (+fail-over and load balancing)
> exec: passes event logs to/from external commands
> tail: reads event logs from a file (like `tail -f`)
Plugins
• 3rd party plugins
> scribe: integrates Fluentd with Scribe
> s3: uploads log files to Amazon S3 hourly or daily
> mongo: writes logs to MongoDB
> hoop: puts log files on Hadoop HDFS via Hoop
...
Plugin developer API
• Unit test framework (like “MRUnit”)
> Fluent::Test::InputTestDriver
> Fluent::Test::OutputTestDriver
> Fluent::Test::BufferedOutputTestDriver
• Fluent::TailInput (base class of “tail” plugin)
> text parser is customizable
  def parse_line(line)
Fluentd
• Documents
> http://fluentd.org
• Source code
> http://github.com/fluent
• Twitter
> #fluentd
• Mailing list
> http://groups.google.com/group/fluentd
“BIG DATA ANALYTICS PLATFORM” as a Service
Fluentd & Treasure Data
[Diagram: Fluentd, Fluentd, Fluentd → Fluentd, Fluentd → Hadoop / Hive, MongoDB, Amazon S3 / EMR]
Realtime!
Ready to Analyze!
Fluentd & Treasure Data
[Diagram: Fluentd, Fluentd, Fluentd → Fluentd, Fluentd → Treasure Data Cloud Platform]
Realtime!
Ready to Analyze!
Fluentd & Treasure Data
Treasure Data Cloud Platform
SQL, Visualization
SELECT users.age, COUNT(1)
FROM logs
LEFT JOIN users ON logs.user_id = users.id
WHERE path = "/buyItem"
GROUP BY users.age