An Introduction to Fluent & MongoDB Plugins

An Introduction to Fluent & MongoDB Plugins @doryokujin MongoDB Meet-Up #7 in Japan


Transcript of An Introduction to Fluent & MongoDB Plugins

Page 1: An Introduction to Fluent & MongoDB Plugins

An Introduction to Fluent & MongoDB Plugins

@doryokujin

MongoDB Meet-Up #7 in Japan

Page 2: An Introduction to Fluent & MongoDB Plugins

About Me

・Takahiro Inoue (age 26)
・twitter: doryokujin
・Majored in Math (Statistics & Graph Algorithms)
・Data Scientist
・Leader of MongoDB JP
・Interests: Data Processing, GraphDB

Page 3: An Introduction to Fluent & MongoDB Plugins

Agenda

1. What is Fluent?
2. Introduction to MongoDB Plugins & Use Cases
3. Demo

Sadayuki Furuhashi

Fluent

@frsyuki

The Event Collector Service

Treasure Data, Inc.

Structured logging

Pluggable architecture

Reliable forwarding

Page 4: An Introduction to Fluent & MongoDB Plugins

What is Fluent?


Page 5: An Introduction to Fluent & MongoDB Plugins

Log Flow Example

[Diagram: Web Server (nginx), App Server (Tomcat), MySQL, Cassandra. Labels: Access Log, Action Log, Save Data (Payment, Registration)]

There are many formats (MySQL, Cassandra, Text...)

Page 6: An Introduction to Fluent & MongoDB Plugins

Log Flow Example

[Diagram: Web Server (nginx), App Server (Tomcat), MySQL, Cassandra. Labels: Access Log, Action Log, Save Data (Payment, Registration), log]

Text Logs Grow Very Fast and Big!!

Page 7: An Introduction to Fluent & MongoDB Plugins

How to Generate Logs? -Traditional Approach-

1. Upload Logs to Amazon S3 per Day
2. Download Logs from S3 to Local

[Diagram: four App Servers, each with a log; three Analyze Servers]

Page 8: An Introduction to Fluent & MongoDB Plugins

What Is the Problem? -Traditional Approach-

Interval: Download Daily, not Hourly
Size: Lots of Stress on the Network

[Diagram: four App Servers, each with a log; three Analyze Servers]

Page 9: An Introduction to Fluent & MongoDB Plugins

How to Generate Logs? -Streaming Approach-

Stream Data per Hour, Minute, even Second!!!
Greatly Reduces Network Stress!!!

[Diagram: four App Servers, each with a log; two Relay Servers; three Analyze Servers]

Page 10: An Introduction to Fluent & MongoDB Plugins

Event Collector Services

Not Only Streaming: Realtime Stats & ML
For Large Data Streaming
Part of the Hadoop Ecosystem

Page 11: An Introduction to Fluent & MongoDB Plugins

Fluent: Handling Structured Data

Not Only Streaming: Realtime Stats & ML
For Large Data Streaming
Part of the Hadoop Ecosystem


Page 12: An Introduction to Fluent & MongoDB Plugins

What is Fluent?


http://www.scribd.com/doc/70897187/Fluent-event-collector-update

Page 13: An Introduction to Fluent & MongoDB Plugins

Introduction to MongoDB Plugins & Use Cases


Page 15: An Introduction to Fluent & MongoDB Plugins

Out Mongo: For Local Back Up

[Diagram: four App Servers, each with a log; two Relay Servers; three Analyze Servers]

Page 16: An Introduction to Fluent & MongoDB Plugins

Out Mongo: For Local Back Up

[Diagram: four App Servers, each with a log; two Relay Servers; three Analyze Servers]

・Network Partition ※ fluentd can buffer data and retry sending it afterwards
・Fluentd Down ※ we want to access event data another way

Page 17: An Introduction to Fluent & MongoDB Plugins

Out Mongo: For Local Back Up

MongoDB: Capped Collection is suitable:
・Fast Write
・Fixed Size Storage

Enable Data Access Quickly When
・Network Partition
・Fluent Down

Decreases the Possibility of Data Loss

[Diagram: two App Servers, each with a log; Relay Server; three Analyze Servers; three "BackUp to Capped Collection" labels]
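The fixed-size behaviour that makes a capped collection a good local backup can be illustrated without a running MongoDB. The following is only a toy simulation in plain Python: `CappedBuffer` is a hypothetical name, and it caps by document count for simplicity, whereas a real capped collection is bounded by size in bytes. It shows the key property: once the buffer is full, the oldest documents are overwritten and insertion order is preserved.

```python
from collections import deque


class CappedBuffer:
    """Toy stand-in for a capped collection (hypothetical, count-based cap)."""

    def __init__(self, max_docs):
        # deque(maxlen=...) silently drops the oldest item when full.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def find(self):
        # Documents come back in insertion order, oldest first.
        return list(self._docs)


backup = CappedBuffer(max_docs=3)
for seq in range(5):
    backup.insert({"seq": seq})

print(backup.find())  # [{'seq': 2}, {'seq': 3}, {'seq': 4}]
```

Because storage never grows past the cap, the backup can stay on each server indefinitely without any cleanup job.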

Page 18: An Introduction to Fluent & MongoDB Plugins

Out Mongo: For Local Back Up

Enable Data Access Quickly

[Diagram: two App Servers, each with a log; Relay Server; three Analyze Servers; three "BackUp to Capped Collection" labels]

Page 19: An Introduction to Fluent & MongoDB Plugins

Out Mongo: For Local Back Up -Configuration-

・To Back Up: We only add the configuration.

    <match ...>
      type mongo_backup
      capped_size 100m
      <store>
        type tcp
        host 192.168.0.13
        ...
      </store>
    </match>

[Diagram labels: log, BackUp to Capped Collection, tcp, Parent, Child]

Page 20: An Introduction to Fluent & MongoDB Plugins

Out Mongo: For Result Output

・Result Output: JSON Structured Data is suitable for Mongo!!!

[Diagram: four App Servers, each with a log; two Relay Servers; three Analyze Servers; two "Output to Mongo Collection" labels]

Page 21: An Introduction to Fluent & MongoDB Plugins

Out Mongo: For Result Output -Configuration-

    <match mongo.**>
      type mongo
      database fluent
      collection test
      # Following attributes are optional
      host fluenter
      port 10000
      # Other buffer configurations here
    </match>

[Diagram labels: log, Parent, Child, Output to Mongo Collection]
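Which events reach this output depends on the `mongo.**` tag pattern. As a rough sketch, assuming the wildcard rules documented for Fluentd (`*` matches exactly one tag part, `**` matches zero or more parts), the matching can be modelled as below. `tag_matches` is a hypothetical helper, not part of Fluent, and it ignores other pattern forms such as `{a,b}`.

```python
def tag_matches(pattern, tag):
    """Return True if a dot-separated tag matches a pattern with * and **."""
    return _match(pattern.split("."), tag.split("."))


def _match(pat, parts):
    if not pat:
        # Pattern exhausted: match only if the tag is exhausted too.
        return not parts
    head, rest = pat[0], pat[1:]
    if head == "**":
        # ** may swallow zero or more tag parts.
        return any(_match(rest, parts[i:]) for i in range(len(parts) + 1))
    if not parts:
        return False
    if head == "*" or head == parts[0]:
        # * consumes exactly one part; a literal must be equal.
        return _match(rest, parts[1:])
    return False


print(tag_matches("mongo.**", "mongo.access.nginx"))  # True
print(tag_matches("mongo.**", "other.mongo"))         # False
```

Under these assumed rules, tagging an event `mongo.<anything>` is enough to route it into the `fluent.test` collection configured above.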

Page 22: An Introduction to Fluent & MongoDB Plugins

Out Mongo: For Result Output -In & Output-

Input

    Mon Nov 14 23:36:22 [conn13] run command admin.$cmd { replSetGetStatus: 1 }
    Mon Nov 14 23:36:22 [conn13] command admin.$cmd command: { replSetGetStatus: 1 } ntoreturn:1 reslen:571 0ms
    Mon Nov 14 23:36:22 [conn13] run command admin.$cmd { ismaster: 1 }
    Mon Nov 14 23:36:22 [conn13] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 reslen:234 0ms
    Mon Nov 14 23:36:22 [conn13] run command admin.$cmd { replSetGetStatus: 1 }

Output

    {
      _id : ...,
      time : Mon Nov 14 23:36:22,
      key1 : "[conn13]",
      key2 : "command",
      key3 : "admin.$cmd",
      key4 : {
        "ismaster" : 1
      },
      value : "0ms"
    }

[Diagram labels: log, Parent, Child, Output to Mongo Collection]
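The step from a text log line to a structured document can be sketched with a regular expression. This is an illustration only: the pattern and the `parse` helper are assumptions, not the parser used in the talk, and the field names simply follow the output document on this slide. The command body (shown as `key4` on the slide) is left unparsed in `rest`, and lines without a trailing duration are skipped.

```python
import re

# Four space-separated tokens for the time, then three keys, then the
# remainder, ending in a duration such as "0ms".
LINE = re.compile(
    r"^(?P<time>\S+ \S+ \S+ \S+) (?P<key1>\S+) (?P<key2>\S+) (?P<key3>\S+)"
    r" (?P<rest>.*?) (?P<value>\d+ms)$"
)


def parse(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = LINE.match(line)
    return m.groupdict() if m else None


doc = parse(
    "Mon Nov 14 23:36:22 [conn13] command admin.$cmd "
    "command: { ismaster: 1 } ntoreturn:1 reslen:234 0ms"
)
print(doc["key1"], doc["key2"], doc["key3"], doc["value"])
# [conn13] command admin.$cmd 0ms
```

The resulting dictionary is already JSON-shaped, which is why such records can be written to a Mongo collection without a separate schema.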

Page 24: An Introduction to Fluent & MongoDB Plugins

Overview

[Diagram: four App Server logs, each with "aggregate"; three Relay Servers with "aggregate" and "key1 key2 key3 shuffle"; Analyze Server with "aggregate"]

Aggregate per day, hour, minute, second

Page 25: An Introduction to Fluent & MongoDB Plugins

Aggregation Mongo: configuration

    <source>
      type tail
      format /^(?<time>[^ ]* [^ ]* [^ ]* [^ ]*) (?<key1>[^ ]*) (?<key2>[^ ]*) (?<key3>[^ ]*) (?<value1>[^ ]*)$/
      time_format %a %b %e %H:%M:%S
      path /var/log/something.log
      tag aggr_hostneme
    </source>

    # aggregate per minute: count(*) group by key1
    <metrics>
      name one_key
      partition_by m
      each_key key1
    </metrics>

    # aggregate per minute: sum(value1), count(*) group by key2, key3
    <metrics>
      name two_keys
      partition_by m
      each_key key2,key3
      value_key value1
      type float
    </metrics>
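What the two `<metrics>` sections compute can be sketched in plain Python. This is a minimal illustration of the SQL-like annotations on the slide (`count(*) group by key1` and `sum(value1), count(*) group by key2, key3`, both per minute), not the plugin's implementation. The sample events and the `minute` field are invented for the example.

```python
from collections import defaultdict

# Invented sample events; "minute" stands for the per-minute partition.
events = [
    {"minute": "2011-11-14 23:36", "key1": "[conn13]",
     "key2": "command", "key3": "admin.$cmd", "value1": "1.5"},
    {"minute": "2011-11-14 23:36", "key1": "[conn13]",
     "key2": "command", "key3": "admin.$cmd", "value1": "2.5"},
    {"minute": "2011-11-14 23:37", "key1": "[conn14]",
     "key2": "query", "key3": "fluent.test", "value1": "4.0"},
]

# one_key: count(*) group by key1, per minute
one_key = defaultdict(int)
# two_keys: sum(value1), count(*) group by key2, key3, per minute
two_keys = defaultdict(lambda: {"count": 0, "sum": 0.0})

for e in events:
    one_key[(e["minute"], e["key1"])] += 1
    slot = two_keys[(e["minute"], e["key2"], e["key3"])]
    slot["count"] += 1
    slot["sum"] += float(e["value1"])  # "type float" in the config

print(one_key[("2011-11-14 23:36", "[conn13]")])  # 2
print(two_keys[("2011-11-14 23:36", "command", "admin.$cmd")])
# {'count': 2, 'sum': 4.0}
```

Each (partition, key) pair becomes one small counter, so only the aggregates have to travel over the network rather than the raw log lines.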

    <server>
      name host1
      host host1
      port 24224
    </server>
    <server>
      name host2
      host host2
      port 24224
    </server>
    ...

Key-values are shuffled across the servers (like Hadoop)
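The shuffle step can be pictured as hash partitioning: every occurrence of the same key is sent to the same server, so each server can finish the aggregation for its own keys. The slides do not show how the plugin actually assigns keys, so the MD5-modulo scheme and the `server_for` helper below are assumptions made purely for illustration.

```python
import hashlib

# The two servers from the configuration above.
SERVERS = ["host1:24224", "host2:24224"]


def server_for(key_fields):
    """Pick a server deterministically from the grouping key (assumed scheme)."""
    joined = "\t".join(key_fields).encode("utf-8")
    digest = hashlib.md5(joined).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]


key = ("command", "admin.$cmd")
# The same key always maps to the same server, whichever node computes it.
print(server_for(key) == server_for(key))  # True
```

A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) is used so that every App Server agrees on the destination.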

Page 26: An Introduction to Fluent & MongoDB Plugins

Aggregation Mongo: in & output

Input

    Mon Nov 14 23:36:22 [conn13] run command admin.$cmd { replSetGetStatus: 1 }
    Mon Nov 14 23:36:22 [conn13] command admin.$cmd command: { replSetGetStatus: 1 } ntoreturn:1 reslen:571 0ms
    Mon Nov 14 23:36:22 [conn13] run command admin.$cmd { ismaster: 1 }
    Mon Nov 14 23:36:22 [conn13] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 reslen:234 0ms
    Mon Nov 14 23:36:22 [conn13] run command admin.$cmd { replSetGetStatus: 1 }

Output

    {
      _id : "399e94941cacf13eeb3f808e8ac00981",
      name : one_key,
      partition : "2011-11-14 19:17",
      key : {
        key1 : "PeriodicTask::Runner"
      },
      count : 30,
      value : {
        response : 1024
      }
    }

[Diagram: delta1 to delta6; per day, hour, minute, second]

Page 27: An Introduction to Fluent & MongoDB Plugins

Scale Up

Mongo Sharding: We can Scale!!

[Diagram: three Relay Servers with "aggregate" and "key1, key2, key3 shuffle"; three Analyze Servers with "aggregate"; three mongos; shard key1, shard key2, shard key3; ...]

Page 28: An Introduction to Fluent & MongoDB Plugins

Demo
