OpenStack Log Mining

Presentation from the OpenStack Summit 2014 in Atlanta.

Transcript of OpenStack Log Mining

  • Accelerating adoption of Open Infrastructure May 2014 Log Management and Mining
  • Copyright 2014 Solinea, Inc. Logging has a Long History photo credit: The Forest History Society via photopin cc
  • In Multiple Domains
  • Like Many Things, It Has Evolved (photo credit: Richard Hurd via photopin cc)
  • Here Too
  • Complexity Reigns in Cloud
  • BEEF (pipeline diagram): Nova, Cinder, etc. → rsyslog → logstash (tcp:5514) → elasticsearch (tcp:9200)
      OpenStack service settings:
          verbose = True
          use_syslog = True
          syslog_log_facility=LOG_LOCAL{n}
      rsyslog forwarding rule:
          local{n}.* @@logstash:5514
  • Standards are Elusive
      • We have a couple of standards that might apply:
          • RFC5424 (The Syslog Protocol)
          • NCSA/Apache CLF (Web servers)
      • Project adoption varies, but the trajectory is right
      • Some duplication of fields with rsyslog when shipping remotely
      • Don't get me started on timestamps!
  • Anatomy of an OpenStack Message
      Most projects use a similar format:
          Date:  2014-05-02 14:10:57.278
          PID:   3609
          Level: INFO
          Prog:  oslo.messaging._drivers.impl_qpid
          ID:    [-]
          Msg:   Connected to AMQP
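The field layout above can be sketched as a small Python parser. This is only an illustration: the regular expression and the function name `parse_openstack_line` are assumptions made here, and the sample line is assembled from the slide's field values rather than copied from a real log.

```python
import re

# Hypothetical pattern for illustration; group names mirror the slide's labels.
OPENSTACK_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"  # Date
    r"(?: (?P<pid>\d+))?"                                      # PID (optional)
    r" (?P<level>[A-Z]+)"                                      # Level
    r" (?P<prog>[\w.$-]+)"                                     # Prog (dotted module path)
    r" (?P<request_ids>\[[^\]]*\])"                            # ID list, e.g. [-]
    r" (?P<msg>.*)$"                                           # Msg
)


def parse_openstack_line(line):
    """Return a dict of fields, or None if the line does not fit the layout."""
    match = OPENSTACK_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["pid"] is not None:
        fields["pid"] = int(fields["pid"])
    return fields


# Sample line assembled from the slide's field values (an assumption, not a real log line).
sample = (
    "2014-05-02 14:10:57.278 3609 INFO "
    "oslo.messaging._drivers.impl_qpid [-] Connected to AMQP"
)
parsed = parse_openstack_line(sample)
```

Lines that do not match (stack-trace continuations, for example) come back as None, so a caller can route them separately.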
  • use_syslog = True
      "Existing syslog format is DEPRECATED during I, and then will be changed in J to honor RFC5424"
      (I and J are the Icehouse and Juno releases.)
      Example:
          May 15 12:28:57 compute-01 2014-05-15 12:28:57.767 20739 WARNING nova.openstack.common.loopingcall [-] task run outlasted interval by 110.003069 sec
      Note1: standard rsyslog config on CentOS 6.5 with remote shipping to central server
  • use_syslog_rfc_format = True
      • Adds APP-NAME before the message
      • Nice idea, but:
          • Appears incompatible with use_syslog = True: nova-compute fails to launch when both are set
          • With use_syslog = False, messages in /var/log/nova/compute.log look the same
      • Could be environmental; needs more exploration
  • Shipping via rsyslog
      rsyslog.conf global settings change:
          $ActionFileDefaultTemplate RSYSLOG_FileFormat
          $ActionForwardDefaultTemplate RSYSLOG_ForwardFormat
      Effect:
          2014-05-15T13:37:11.138121+00:00 controller-01 2014-05-15 13:37:11.137 3412 INFO nova.openstack.common.service [-] Caught SIGTERM, stopping children
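The forwarded line now carries two timestamps: the one rsyslog adds (with a UTC offset) and the one OpenStack wrote (without a zone). A rough Python sketch of splitting them apart follows. The function name `split_forwarded_line` is made up for this example, and treating the OpenStack timestamp as UTC is an assumption, not something the log line states.

```python
from datetime import datetime, timezone


def split_forwarded_line(line):
    """Split '<rsyslog ts> <host> <openstack date> <openstack time> <rest>'."""
    syslog_ts, host, os_date, os_time, rest = line.split(" ", 4)
    # The rsyslog timestamp is ISO 8601 and carries its own UTC offset.
    received = datetime.fromisoformat(syslog_ts)
    # The OpenStack timestamp has no zone information; UTC is assumed here.
    emitted = datetime.strptime(
        f"{os_date} {os_time}", "%Y-%m-%d %H:%M:%S.%f"
    ).replace(tzinfo=timezone.utc)
    return {"host": host, "received": received, "emitted": emitted, "rest": rest}


# The "Effect" line from the slide.
line = (
    "2014-05-15T13:37:11.138121+00:00 controller-01 "
    "2014-05-15 13:37:11.137 3412 INFO nova.openstack.common.service [-] "
    "Caught SIGTERM, stopping children"
)
event = split_forwarded_line(line)
lag = event["received"] - event["emitted"]  # time between emit and rsyslog stamp
```

If the UTC assumption is wrong for a given host, the computed lag will be off by the host's offset, which is one way to spot timezone mismatches.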
  • Shipping via rsyslog (conf.d)
      rsyslog.d/10-goldstone.conf file:
          $WorkDirectory /var/lib/rsyslog       # where to place spool files
          $ActionQueueFileName fwdGoldstone     # unique name prefix for spool files
          $ActionQueueMaxDiskSpace 1g           # 1gb space limit (use as much as possible)
          $ActionQueueSaveOnShutdown on         # save messages to disk on shutdown
          $ActionQueueType LinkedList           # run asynchronously
          $ActionResumeRetryCount -1            # infinite retries if host is down
          local0.* @@10.10.11.122:5514          # nova
          local1.* @@10.10.11.122:5514          # glance
          local2.* @@10.10.11.122:5514          # neutron
          local3.* @@10.10.11.122:5514          # ceilometer
          local4.* @@10.10.11.122:5514          # swift
          local5.* @@10.10.11.122:5514          # cinder
          local6.* @@10.10.11.122:5514          # keystone
  • Receiving via Logstash (Input)
          input {
            tcp {
              port => 5514     # matches port that rsyslog ships to
              type => syslog   # insert a type field to identify this as an incoming message from syslog
            }
          }
  • Receiving via Logstash (Output)
          output {
            elasticsearch {
              host => localhost
              port => 9200
              protocol => http
            }
          }
  • Receiving via Logstash (Patterns)
          OPENSTACK_PROG (?:[ a-zA-Z0-9_\-]+\.)+[ A-Za-z0-9_\-$]+
          OPENSTACK_PROG_SINGLE [A-Za-z0-9_\-$]+
          OPENSTACK_SOURCE %{OPENSTACK_PROG}|%{OPENSTACK_PROG_SINGLE}
          OPENSTACK_REQ_LIST (\[(?:(req-%{UUID}|%{UUID}|%{BASE16NUM}|None|-|%{SPACE}))+\])?
          OPENSTACK_PID ( %{POSINT:pid:int})?
          OPENSTACK_LOGLEVEL ([D|d]ebug|DEBUG|[N|n]otice|NOTICE|[I|i]nfo|INFO|[W|w]arn?(?:ing)?|WARN?(?:ING)?|[E|e]rr?(?:or)?|ERR?(?:OR)?|[C|c]rit?(?:ical)?|CRIT?(?:ICAL)?|[F|f]atal|FATAL|[S|s]evere|SEVERE|[A|a]udit|AUDIT)
          OPENSTACK_NORMAL %{TIMESTAMP_ISO8601:timestamp}%{OPENSTACK_PID} %{OPENSTACK_LOGLEVEL:loglevel} %{OPENSTACK_SOURCE:program} %{OPENSTACK_REQ_LIST:request_id_list} %{GREEDYDATA:msg}
          RAW_TRACE (?:^[^0-9].*$|^$)
          OPENSTACK_TRACE %{TIMESTAMP_ISO8601:timestamp} %{POSINT:pid:int} ([T|t]race|TRACE) %{OPENSTACK_SOURCE:program} %{GREEDYDATA:msg}|%{RAW_TRACE:msg}
          OPENSTACK_MESSAGE %{OPENSTACK_NORMAL}|%{OPENSTACK_TRACE}
          OPENSTACK_SYSLOGLINE %{SYSLOG5424PRINUM}%{CISCOTIMESTAMP:syslog_ts} %{HOSTNAME:syslog5424_host} %{OPENSTACK_MESSAGE:os_message}
  • Receiving via Logstash (Filter Fun)
          filter {
            if ([type] == "syslog") {
              grok {
                patterns_dir => "/opt/logstash/patterns"
                match => { "message" => "%{OPENSTACK_SYSLOGLINE}" }
                add_field => { "received_at" => "%{@timestamp}" }
                add_field => { "_message" => "%{syslog5424_host} %{message}" }
              }
              if ("_grokparsefailure" not in [tags]) {
                # see following slides
              }
            }
          }
  • Receiving via Logstash (Filter Fun)
          syslog_pri {
            severity_labels => ["EMERGENCY", "ALERT", "CRITICAL", "ERROR", "WARNING", "NOTICE", "INFO", "DEBUG"]
            syslog_pri_field_name => "syslog5424_pri"
          }
          date {
            match => [ "timestamp", "yyyy-MM-dd HH:mm:ss.SSS" ]
            remove_field => "timestamp"
            timezone => "Etc/UTC"
          }
      NOTE1: syslog_pri parses up that ugly number at the front of the incoming message (i.e. )
      NOTE2: This date processing is based on the timestamp in the OpenStack generated message, not the rsyslog message. With an enhanced rsyslog template, or a better OpenStack message format, we can avoid inferring the timezone.
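For readers unfamiliar with what syslog_pri is unpacking: per RFC 5424, the number at the front of a syslog message is facility * 8 + severity. The Python sketch below shows that arithmetic with the severity labels from the slide. The function name `decode_pri` and the example value 134 are illustrative choices made here, and only the local0 to local7 facility names are filled in.

```python
# Severity labels in the order given on the slide (index = severity code).
SEVERITY_LABELS = [
    "EMERGENCY", "ALERT", "CRITICAL", "ERROR",
    "WARNING", "NOTICE", "INFO", "DEBUG",
]


def decode_pri(pri):
    """Split a syslog PRI value into facility and severity (PRI = facility * 8 + severity)."""
    if not 0 <= pri <= 191:  # 23 facilities * 8 + 7 is the largest valid value
        raise ValueError(f"PRI out of range: {pri}")
    facility_code, severity_code = divmod(pri, 8)
    # Only local0..local7 get names in this sketch; others keep their numeric code.
    if facility_code >= 16:
        facility = f"local{facility_code - 16}"
    else:
        facility = str(facility_code)
    return {
        "syslog_facility_code": facility_code,
        "syslog_facility": facility,
        "syslog_severity_code": severity_code,
        "syslog_severity": SEVERITY_LABELS[severity_code],
    }


# 134 is a hypothetical PRI chosen for the example: 16 * 8 + 6.
decoded = decode_pri(134)
```

The returned keys mirror the field names that show up later in the filter (syslog_facility, syslog_severity and their _code variants).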
  • Receiving via Logstash (Filter Fun)
          translate {
            field => "syslog_facility"
            dictionary => [ "local0", "nova",
                            "local1", "glance",
                            "local2", "neutron",
                            "local3", "ceilometer",
                            "local4", "swift",
                            "local5", "cinder",
                            "local6", "keystone" ]
            fallback => "unknown"
            destination => "component"
          }
      NOTE1: syslog_facility was generated by syslog_pri earlier. This adds a new component field so we can figure out who generated these messages.
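The translate step is a plain lookup with a fallback. A minimal Python equivalent, using the same facility assignments as the slide, might look like this (the names `FACILITY_TO_COMPONENT` and `component_for` are invented for the sketch):

```python
# Facility-to-service assignments taken from the slide's translate dictionary.
FACILITY_TO_COMPONENT = {
    "local0": "nova",
    "local1": "glance",
    "local2": "neutron",
    "local3": "ceilometer",
    "local4": "swift",
    "local5": "cinder",
    "local6": "keystone",
}


def component_for(facility):
    """Map a syslog facility name to an OpenStack component, or 'unknown'."""
    return FACILITY_TO_COMPONENT.get(facility, "unknown")


component = component_for("local2")
```

The mapping only works if every service's syslog_log_facility setting agrees with it, so the dictionary and the service configs have to be kept in step.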
  • Receiving via Logstash (Filter Fun)
          mutate {
            rename => [ "msg", "message" ]
            rename => [ "syslog5424_host", "host" ]
            remove_field => "syslog_ts"
            remove_field => "syslog5424_pri"
            remove_field => "os_message"
            add_tag => ["processed", "openstack_syslog", "filter_34"]
          }
      Note1: We made it to the end of the filter successfully, so let's clean up a little and add some tags to indicate how we navigated the filter space.
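To make the effect of the cleanup concrete, here is a rough Python sketch that applies the same renames, removals, and tags to an event held as a dict. The function name `clean_event` and the sample event are hypothetical; real Logstash events carry more fields than shown.

```python
def clean_event(event):
    """Return a copy of the event with the slide's mutate steps applied."""
    cleaned = dict(event)
    # rename: msg -> message, syslog5424_host -> host
    for old, new in (("msg", "message"), ("syslog5424_host", "host")):
        if old in cleaned:
            cleaned[new] = cleaned.pop(old)
    # remove_field: intermediate fields that are no longer needed
    for field in ("syslog_ts", "syslog5424_pri", "os_message"):
        cleaned.pop(field, None)
    # add_tag: record which filter path the event took
    tags = list(cleaned.get("tags", []))
    tags.extend(["processed", "openstack_syslog", "filter_34"])
    cleaned["tags"] = tags
    return cleaned


# Hypothetical event for illustration only.
raw_event = {
    "message": "<134>May 15 12:28:57 compute-01 ...",
    "msg": "Connected to AMQP",
    "syslog5424_host": "compute-01",
    "syslog_ts": "May 15 12:28:57",
    "syslog5424_pri": "134",
    "os_message": "...",
    "loglevel": "INFO",
}
cleaned_event = clean_event(raw_event)
```

After the cleanup, `message` holds only the OpenStack message text and `host` holds the originating node, which is what the later queries rely on.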
  • Result in ES: (photo credit: Robbert van der Steeg via photopin cc)
  • Interpreting Specific Messages (Patterns)
          NOVA_API_CALL %{IP:ip} "(?:GET|PUT|POST|DELETE) %{URIPATH:uri} %{NOTSPACE:protocol}" status: %{NUMBER:response_status:int} len: %{NUMBER:response_length:int} time: %{NUMBER:response_time:float}
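The same extraction can be sketched in Python with a named-group regular expression. This is an approximation of the grok pattern, not a translation of it: the IP and URI matching are simplified, the `method` group is an addition the slide's pattern does not capture, and the sample message is made up for the example.

```python
import re

# Simplified stand-in for the NOVA_API_CALL grok pattern (IPv4 only).
NOVA_API_CALL = re.compile(
    r'(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) '
    r'"(?P<method>GET|PUT|POST|DELETE) (?P<uri>\S+) (?P<protocol>\S+)" '
    r'status: (?P<response_status>\d+) '
    r'len: (?P<response_length>\d+) '
    r'time: (?P<response_time>\d+(?:\.\d+)?)'
)


def parse_api_call(message):
    """Extract API call stats from a message, or return None if it is not an API call."""
    match = NOVA_API_CALL.search(message)
    if match is None:
        return None
    stats = match.groupdict()
    # Same type conversions the grok pattern requests (:int and :float).
    stats["response_status"] = int(stats["response_status"])
    stats["response_length"] = int(stats["response_length"])
    stats["response_time"] = float(stats["response_time"])
    return stats


# Hypothetical message text, shaped to fit the pattern on the slide.
sample_message = (
    '10.10.11.122 "GET /v2/servers/detail HTTP/1.1" '
    'status: 200 len: 1893 time: 0.1234'
)
api_stats = parse_api_call(sample_message)
```

Converting status, length, and time to numbers at parse time is what makes them usable in numeric aggregations later.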
  • Interpreting Specific Messages
          if ("_grokparsefailure" not in [tags]) {
            # clean up extra fields and tag us
            mutate {
              replace => [ "type", "openstack_api_stats" ]
              remove_field => "pid"
              remove_field => "hostname"
              remove_field => "message"
              remove_field => "_message"
              remove_field => "loglevel"
              remove_field => "syslog_severity_code"
              remove_field => "syslog_facility_code"
              remove_field => "syslog_facility"
              remove_field => "syslog_severity"
              add_tag => ["metric", "filter_37"]
            }
          }
      Note1: Processed after successful OpenStack message filtering. We know the lineage, so we don't need to keep a bunch of redundant information.
  • Result in ES: (photo credit: Www.CourtneyCarmody.com/ via photopin cc)
  • Querying ES for Logs
          {
            "query": {
              "bool": {
                "must": [
                  {"range": {"@timestamp": {"gte": "2014-05-08T16:31:07+00:00",
                                            "lte": "2014-05-15T16:31:07+00:00"}}},
                  {"terms": {"type": ["openstack_log"]}}
                ]
              }
            },
            "aggs": {
              "events_by_time": {
                "date_histogram": {"field": "@timestamp",
                                   "interval": "5448.648648648648s",
                                   "min_doc_count": 0},
                "aggs": {
                  "events_by_loglevel": {"terms": {"field": "loglevel"}}
                }
              }
            }
          }
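The odd-looking interval in this query is consistent with dividing the 7-day range into 111 equal buckets (604800 s / 111). That bucket count is inferred from the numbers, not stated on the slide. Under that assumption, a query body like this one could be generated in Python as follows; the function name `build_log_query` and its parameters are illustrative.

```python
import json
from datetime import datetime, timezone


def build_log_query(start, end, buckets=111, doc_type="openstack_log"):
    """Build a query body with a date histogram split into `buckets` equal intervals."""
    # Assumption: the interval is simply the range divided by the bucket count.
    interval_s = (end - start).total_seconds() / buckets
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {"@timestamp": {"gte": start.isoformat(),
                                              "lte": end.isoformat()}}},
                    {"terms": {"type": [doc_type]}},
                ]
            }
        },
        "aggs": {
            "events_by_time": {
                "date_histogram": {"field": "@timestamp",
                                   "interval": f"{interval_s}s",
                                   "min_doc_count": 0},
                "aggs": {
                    "events_by_loglevel": {"terms": {"field": "loglevel"}}
                },
            }
        },
    }


# Same time range as the slide.
start = datetime(2014, 5, 8, 16, 31, 7, tzinfo=timezone.utc)
end = datetime(2014, 5, 15, 16, 31, 7, tzinfo=timezone.utc)
query = build_log_query(start, end)
body = json.dumps(query)  # JSON text to send as the search request body
```

Keeping the bucket count fixed means a chart gets the same number of points whatever time range the user picks.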
  • Querying Nova API Stats
          {
            "query": {
              "filtered": {
                "filter": {"match_all": {}},
                "query": {"bool": {"must": [
                  {"range": {"@timestamp": {"gte": "2014-04-15T16:45:53+00:00", "lte": "2014-05-15T16:45:53+00:0