Hadoop For OpenStack Log Analysis

Transcript of Hadoop For OpenStack Log Analysis

Page 1: Hadoop For OpenStack Log Analysis

Hadoop For OpenStack Log Analysis Havana Summit, April 16th, 2013,

Mike Pittaro

Principal Architect, Big Data Solutions

@pmikeyp Freenode: mikeyp

[email protected]

Page 2: Hadoop For OpenStack Log Analysis

2 Revolutionary Cloud Team

The Problem : Operating OpenStack at Scale

Page 3: Hadoop For OpenStack Log Analysis


The Search for the Holy Grail of OpenStack Operations

• Imagine if we could follow a request …
  – Through the entire system …
  – Across compute, storage, network …
  – Independent of physical nodes …
  – With timestamps …
  – Correlated with events outside OpenStack …

Page 4: Hadoop For OpenStack Log Analysis


It’s Easy!

Collect Logs

Analyze? Sleep!

Page 5: Hadoop For OpenStack Log Analysis


OpenStack Log Analysis is a Big Data Problem

Big Data is when the data itself is part of the problem.

Volume

• A large amount of data, growing at large rates

Velocity

• The speed at which the data must be processed

Variety

• The range of data types and data structure

Page 6: Hadoop For OpenStack Log Analysis


Initial Focus and Scope of Our Efforts

• The Operators
  – Assist in running OpenStack
  – Not for tenants

• The Data
  – Load all detail into Hadoop
  – Extract and index significant fields
  – Enable future analysis

• The Patterns
  – What works
  – What is repeatable across installs
  – How we can collaborate

Page 7: Hadoop For OpenStack Log Analysis


Hadoop 101 – Simplified Block Diagram

• HDFS – Distributed Storage
• MapReduce – Distributed Processing
• HBase – Database
• Pig – Batch Language
• Hive – SQL-ish Query
• Sqoop and Flume – Data Ingestion
• Mahout – Machine Learning
• Oozie – Workflow
• HUE – Web UI
• Access via APIs and JDBC

Page 8: Hadoop For OpenStack Log Analysis


The Big Pieces

• Log Collection
  – Continuous streaming of log data into Hadoop from OpenStack

• Intelligent Log Parsing, Indexing, and Search
  – Should ‘know’ about OpenStack

• Well-Defined Storage Organization
  – Defined schema for the data
  – Predefined queries for high-level status (dashboard)

• Straightforward Implementation Pattern
  – Add as little complexity as possible

• Ability to Perform Deeper Analysis
  – Hadoop enables this

Page 9: Hadoop For OpenStack Log Analysis


OpenStack Log Analysis Block Diagram

• OpenStack Nodes – Python logging, Syslog, Nagios + Ganglia
• FlumeNG Agents – ship events downstream as Avro
• HDFS – stores the Avro log data
• MapReduce Jobs – Pig, Hive, Sqoop/Flume
• Solr Cloud – Lucene indexes built from the data
• Search and Query – operator-facing interfaces

OpenStack Nodes

Page 10: Hadoop For OpenStack Log Analysis


Current Development Status

• Batch only, no Flume collection yet

• Converting logs to Avro format

• First cut of schema in place

• Loading into Hadoop

• Processing into Solr indexes

• Starting to look at the data
  – Solr searches
  – Pig scripts

Page 11: Hadoop For OpenStack Log Analysis


Schema Thoughts

Example log line:

2013-03-26 11:57:41 WARNING nova.db.sqlalchemy.session [req-ace2ccc0-919e-4fd1-9f3a-671c0c87d28f None None] Got mysql server has gone away: (2006, 'MySQL server has gone away')

{"namespace": "logfile.openstack",
 "type": "record",
 "name": "logentry",
 "fields": [
     {"name": "hostname", "type": "string"},
     {"name": "date", "type": "string"},
     {"name": "time", "type": "string"},
     {"name": "level", "type": "string"},
     {"name": "module", "type": "string"},
     {"name": "request_id1", "type": ["string", "null"]},
     {"name": "request_id2", "type": ["string", "null"]},
     {"name": "request_id3", "type": ["string", "null"]},
     {"name": "data", "type": "string"}
 ]
}
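The mapping from the sample log line to this schema can be sketched with a plain regex parser. This is a minimal illustration, not the project's actual converter; the hostname is not present in the log line itself, so it is assumed to be supplied by the collection agent.

```python
# Sketch: parse the sample nova log line into a record matching the
# logentry schema above. Hostname is assumed to come from the collector.
import re

LINE_RE = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<level>\S+) (?P<module>\S+) "
    r"\[(?P<request_id1>\S+) (?P<request_id2>\S+) (?P<request_id3>\S+)\] "
    r"(?P<data>.*)$"
)

def parse_line(line, hostname):
    """Parse one OpenStack log line into a dict matching the schema."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    record = {"hostname": hostname, **m.groupdict()}
    # The schema allows null request IDs; map the literal 'None' to null.
    for key in ("request_id1", "request_id2", "request_id3"):
        if record[key] == "None":
            record[key] = None
    return record

line = ("2013-03-26 11:57:41 WARNING nova.db.sqlalchemy.session "
        "[req-ace2ccc0-919e-4fd1-9f3a-671c0c87d28f None None] "
        "Got mysql server has gone away: (2006, 'MySQL server has gone away')")
record = parse_line(line, hostname="compute01")
print(record["level"], record["module"], record["request_id1"])
```

Splitting date, time, level, module, and request IDs out of the text is what makes the indexed fields on the next slide possible.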

Page 12: Hadoop For OpenStack Log Analysis


Demo: Where we are today

• Solr Indexing and Search – Example of indexed fields and searching.

• Pig for batch analysis – Reconstruct a sequence of messages related to an API request
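What the Pig batch job does conceptually can be sketched in plain Python: group parsed records by request ID, then replay one request's messages in time order. Field names follow the logentry schema; the sample records are invented for illustration.

```python
# Sketch of reconstructing a request's message sequence: group parsed
# log records by request ID, sort each group by timestamp. The sample
# records are invented; field names follow the logentry schema.
from collections import defaultdict

records = [
    {"time": "11:57:43", "module": "nova.compute", "request_id1": "req-a", "data": "spawning"},
    {"time": "11:57:41", "module": "nova.api", "request_id1": "req-a", "data": "POST /servers"},
    {"time": "11:57:42", "module": "nova.network", "request_id1": "req-b", "data": "allocating"},
    {"time": "11:57:42", "module": "nova.scheduler", "request_id1": "req-a", "data": "host chosen"},
]

def by_request(recs):
    """Group records by request_id1, each group sorted by timestamp."""
    groups = defaultdict(list)
    for r in recs:
        groups[r["request_id1"]].append(r)
    for rs in groups.values():
        rs.sort(key=lambda r: r["time"])
    return dict(groups)

for r in by_request(records)["req-a"]:
    print(r["time"], r["module"], r["data"])
```

In Pig this is essentially a GROUP BY on the request ID followed by an ORDER BY on the timestamp, run over the full data set in parallel.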

Page 13: Hadoop For OpenStack Log Analysis


Data Collection Thoughts

• Sources
  – OpenStack subsystems
  – Syslog files
  – Nagios and Ganglia
  – General infrastructure data
  – Network switches / routers

• Input Formats
  – Mostly semi-structured text
  – Subsystem, timestamp, hostname, severity, and error level are important

• Output Formats
  – Avro
  – Thrift
  – Protocol Buffers

Page 14: Hadoop For OpenStack Log Analysis


Log Collection Thoughts

• Well-understood patterns
  – Evolving best practices

• Commonly Used Tools
  – Kafka
  – Scribe
  – Flume and FlumeNG

• Key Requirements
  – Distributed
  – Reliable
  – Aggregators – consolidate streams
  – Store and forward – when links are down
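The store-and-forward requirement can be illustrated with a toy agent: events are buffered while the downstream link is unavailable and flushed, in order, once it returns. Real collectors such as Flume or Kafka persist this buffer durably; an in-memory list stands in for that here.

```python
# Toy sketch of store-and-forward: buffer events while the link is
# down, flush the backlog in arrival order when it comes back up.
# A list stands in for the durable buffer a real collector would use.
class StoreAndForwardAgent:
    def __init__(self, send):
        self.send = send      # callable that delivers one event downstream
        self.buffer = []      # backlog held while the link is down
        self.link_up = True

    def emit(self, event):
        if self.link_up:
            self.send(event)
        else:
            self.buffer.append(event)

    def set_link(self, up):
        self.link_up = up
        if up:
            # Deliver the backlog in order before any new events.
            while self.buffer:
                self.send(self.buffer.pop(0))

delivered = []
agent = StoreAndForwardAgent(delivered.append)
agent.emit("e1")
agent.set_link(False)
agent.emit("e2")
agent.emit("e3")
agent.set_link(True)
print(delivered)   # ['e1', 'e2', 'e3']
```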

Page 15: Hadoop For OpenStack Log Analysis


Storage Organization Thoughts

• File Organization within Hadoop
  – Naming convention
  – Directory organization

• Data Lifecycle
  – Input and staging
  – ‘Hot’ and ‘cold’ data
  – Tiered indexes
  – Compression
  – Archival and deletion
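One common Hadoop convention is to partition the directory layout by source and date, which also makes the hot/cold lifecycle split easy to automate. The `/logs/openstack` prefix and the seven-day hot window below are illustrative assumptions, not decisions from the deck.

```python
# Hypothetical HDFS layout for the log data, partitioned by service and
# date. The path prefix and hot/cold threshold are illustrative only.
from datetime import date

def log_path(service, day, root="/logs/openstack"):
    """Build a dated HDFS directory for one service's Avro logs."""
    return f"{root}/{service}/{day:%Y/%m/%d}"

def is_hot(day, today, hot_days=7):
    """Data newer than hot_days stays uncompressed and fully indexed."""
    return (today - day).days < hot_days

today = date(2013, 4, 16)
print(log_path("nova", date(2013, 3, 26)))   # /logs/openstack/nova/2013/03/26
print(is_hot(date(2013, 3, 26), today))      # False: a candidate for compression
```

Date-partitioned directories let lifecycle jobs compress, re-index, or delete whole partitions without scanning file contents.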

Page 16: Hadoop For OpenStack Log Analysis


What should we do next?

• Document the basic patterns so far?

• Are there any related efforts?

• Begin deeper discussions
  – Lots of decisions to make
  – Need community input and suggestions

• Collaborate on schema design

• What upstream OpenStack changes are needed?

• Get sample logs into the hands of Hadoopians
  – Cleansed reference log sets would be useful

Page 17: Hadoop For OpenStack Log Analysis


References

• The unified logging infrastructure for data analytics at Twitter

• Building LinkedIn’s Real-time Activity Data Pipeline

• Advances and challenges in log analysis

• BP: Ceilometer HBase Storage Backend

• BP: Cross Service Request ID

• Log Everything All the Time

• Holy Grail, Gangnam Style