How ElasticSearch lives in my DevOps life
Transcript of How ElasticSearch lives in my DevOps life
ElasticSearch for DevOps
What’s ElasticSearch?
• “flexible and powerful open source, distributed real-time search and analytics engine for the cloud”
• http://www.elasticsearch.org/
What’s ElasticSearch?
• JSON-oriented;
• RESTful API;
• Schema-free.
MySQL               ElasticSearch
-----------------   -------------
database            index
table               type
column              field
defined data type   auto-detected
What’s ElasticSearch?
• Master nodes & data nodes;
• Auto-organizes replicas and shards;
• Asynchronous transport between nodes.
What’s ElasticSearch?
• Near real-time: the index refreshes every 1 second by default.
What’s ElasticSearch?
• Built on Apache Lucene;
• Also has facets, just like Solr.
What’s ElasticSearch?
• Give nodes a cluster name; they auto-discover each other via unicast/multicast ping or the EC2 API.
• No ZooKeeper needed.
Howto Curl
• Index
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic Search"
}'
{"ok":true,"_index":"twitter","_type":"tweet","_id":"1","_version":1}
Howto Curl
• Get
$ curl -XGET 'http://localhost:9200/twitter/tweet/1'
{
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "1",
    "_source" : {
        "user" : "kimchy",
        "post_date" : "2009-11-15T14:12:12",
        "message" : "trying out Elastic Search"
    }
}
Howto Curl
• Query
$ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=1' -d '{
    "query" : { "term" : { "user" : "kimchy" } },
    "fields" : ["message"]
}'
Howto Curl
• Query
• term => { matches an exact term (not analyzed) }
• match => { analyzes the input, then matches }
• prefix => { matches a field prefix (not analyzed) }
• range => { from, to }
• regexp => { .* }
• query_string => { this AND that OR thus }
• must/must_not => { query }
• should => [ {query}, {} ]
• bool => { must, must_not, should, … }
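These clause types nest inside a bool query. A minimal sketch of composing one (Python is used here just to build the JSON request body; the index and field names reuse the twitter example from earlier slides, and the values are illustrative):

```python
import json

# Compose a bool query from the clause types listed above.
# Field names ("user", "message", "post_date") follow the twitter example.
body = {
    "query": {
        "bool": {
            "must":     [{"term": {"user": "kimchy"}}],
            "must_not": [{"term": {"message": "spam"}}],
            "should": [
                {"prefix": {"user": "kim"}},
                {"range": {"post_date": {"from": "2009-01-01", "to": "2010-01-01"}}},
            ],
        }
    }
}
# This is the JSON you would POST to /twitter/tweet/_search
print(json.dumps(body, indent=2))
```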
Howto Curl
• Filter
$ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=1' -d '{
    "query" : { "match_all" : {} },
    "filter" : { "term" : { "user" : "kimchy" } }
}'
Much faster, because filters are cacheable and do not calculate _score.
Howto Curl
• Filter
• and => [ {filter}, {filter} ]
• not => {filter}
• or => [ {filter}, {filter} ]
• script => { "script" : "doc['field'].value > 10" }
• The rest mirror the query DSL
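A sketch of how these filter types compose with a query (again, Python only constructs the request body; the field values are illustrative):

```python
import json

# A filtered search: a match_all query plus an "and" of two filters,
# mirroring the filter types listed above (values are illustrative).
body = {
    "query": {"match_all": {}},
    "filter": {
        "and": [
            {"term": {"user": "kimchy"}},
            {"not": {"term": {"message": "spam"}}},
        ]
    },
}
print(json.dumps(body))
```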
Howto Curl
• Facets
$ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=0' -d '{
    "query" : { "match_all" : {} },
    "filter" : { "prefix" : { "user" : "k" } },
    "facets" : { "usergroup" : { "terms" : { "field" : "user" } } }
}'
Howto Curl
• Facets
• terms => [ {"term":"kimchy","count":20}, {} ]
• range <= [ {"from":10,"to":20}, ]
• histogram <= { "field":"user", "interval":10 }
• statistical <= { "field":"reqtime" } => [ {"min":…, "max":…, "avg":…, "count":…} ]
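A terms facet request and the response shape described above, as data structures (the second term and its counts are made up for illustration):

```python
# A terms facet request, and the response fragment it produces.
request = {
    "query": {"match_all": {}},
    "facets": {"usergroup": {"terms": {"field": "user"}}},
}

# Response shape (counts invented for illustration): each bucket is
# a {"term": ..., "count": ...} pair, sorted by count.
response_fragment = {
    "facets": {
        "usergroup": {
            "terms": [
                {"term": "kimchy", "count": 20},
                {"term": "someone_else", "count": 3},
            ]
        }
    }
}
top = response_fragment["facets"]["usergroup"]["terms"][0]
```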
Howto Perl – ElasticSearch.pm
use ElasticSearch;
my $es = ElasticSearch->new(
    servers      => 'search.foo.com:9200',  # default '127.0.0.1:9200'
    transport    => 'http'        # default
                  | 'httplite'    # 30% faster, future default
                  | 'httptiny'    # another 1% faster
                  | 'curl' | 'aehttp' | 'aecurl'
                  | 'thrift',     # generated code too slow
    max_requests => 10_000,       # default 10000
    trace_calls  => 'log_file',
    no_refresh   => 0 | 1,
);
Howto Perl – ElasticSearch.pm
use ElasticSearch;
my $es = ElasticSearch->new(
    servers      => 'search.foo.com:9200',
    transport    => 'httptiny',
    max_requests => 10_000,
    trace_calls  => 'log_file',
    no_refresh   => 0 | 1,
);
• Gets the node list from $servers via the /_cluster API;
• Randomly switches the requests to another node after $max_requests.
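A toy model of that client behaviour (the real module picks the next node at random; this sketch rotates deterministically for clarity, and the node addresses are made up):

```python
# Toy model of ElasticSearch.pm's node handling: fetch the node list once
# (via the /_cluster API), then move to another node after max_requests
# requests. The real client randomizes; this rotates in order for clarity.
nodes = ["10.0.0.1:9200", "10.0.0.2:9200", "10.0.0.3:9200"]
MAX_REQUESTS = 3  # tiny value for illustration; the module defaults to 10_000

def node_for_request(i):
    """Return the node used for the i-th request (0-based)."""
    return nodes[(i // MAX_REQUESTS) % len(nodes)]

used = [node_for_request(i) for i in range(7)]
# requests 0-2 hit the first node, 3-5 the second, 6 the third
```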
Howto Perl – ElasticSearch.pm
$es->index(
    index => 'twitter',
    type  => 'tweet',
    id    => 1,
    data  => {
        user      => 'kimchy',
        post_date => '2009-11-15T14:12:12',
        message   => 'trying out Elastic Search',
    },
);
Howto Perl – ElasticSearch.pm
$es->search(
    facets => {
        wow_facet => {
            query        => { text => { content => 'wow' } },
            facet_filter => { term => { status => 'active' } },
        },
    },
);
Howto Perl – ElasticSearch.pm
$es->search(
    facets => {
        wow_facet => {
            queryb        => { content => 'wow' },
            facet_filterb => { status => 'active' },
        },
    },
);
• ElasticSearch::SearchBuilder
• More Perlish, SQL::Abstract-like
• But I don’t like ==!
Howto Perl – Elastic::Model
• Ties a Moose object to ElasticSearch
package MyApp;
use Elastic::Model;

has_namespace 'myapp' => {
    user => 'MyApp::User',
};

no Elastic::Model;
1;
Howto Perl – Elastic::Model
package MyApp::User;
use Elastic::Doc;
use DateTime;

has 'name' => (
    is  => 'rw',
    isa => 'Str',
);
has 'email' => (
    is  => 'rw',
    isa => 'Str',
);
has 'created' => (
    is      => 'ro',
    isa     => 'DateTime',
    default => sub { DateTime->now },
);

no Elastic::Doc;
1;
Howto Perl – Elastic::Model
package MyApp::User;
use Moose;
use DateTime;

has 'name' => (
    is  => 'rw',
    isa => 'Str',
);
has 'email' => (
    is  => 'rw',
    isa => 'Str',
);
has 'created' => (
    is      => 'ro',
    isa     => 'DateTime',
    default => sub { DateTime->now },
);

no Moose;
1;
Howto Perl – Elastic::Model
• Connect to db
my $es    = ElasticSearch->new( servers => 'localhost:9200' );
my $model = MyApp->new( es => $es );
• Create database and table
$model->namespace('myapp')->index->create();
• CRUD
my $domain = $model->domain('myapp');
$domain->new_doc() | get();
• Search
my $search  = $domain->view->type('user')->query(…)->filterb(…);
my $results = $search->search;
say "Total results found: " . $results->total;
while ( my $doc = $results->next_doc ) {
    say $doc->name;
}
ES for Dev -- Github
• 20 TB of data;
• 1.3 billion files;
• 130 billion lines of code;
• 26 Elasticsearch storage nodes (each with 2 TB of SSD), managed by Puppet;
• 1 replica + 20 shards.
• https://github.com/blog/1381-a-whole-new-code-search• https://github.com/blog/1397-recent-code-search-outages
ES for Dev – Git::Search
• Thank you, Mateu Hunter!
• https://github.com/mateu/Git-Search
cpanm --installdeps .
cp git-search.conf git-search-local.conf
edit git-search-local.conf
perl -Ilib bin/insert_docs.pl
plackup -Ilib
curl http://localhost:5000/text_you_want
ES for Perler -- Metacpan
• search.cpan.org => metacpan.org
• Uses ElasticSearch as the API backend;
• Uses Catalyst to build the website frontend.
• Learn API:https://github.com/CPAN-API/cpan-api/wiki/API-docs
• Have a try:http://explorer.metacpan.org/
ES for Perler – index-weekly
• A 55-line Perl script that indexes devopsweekly into elasticsearch.
• https://github.com/alcy/index-weekly
• We can do the same thing for perlweekly, right?
ES for logging - Logstash
• “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.”
• http://logstash.net/
ES for logging - Logstash
• A log is a stream, not a file!
• An event is not necessarily a single line!
ES for logging - Logstash
• Inputs: file/*mq/stdin/tcp/udp/websocket… (34 input plugins now)
ES for logging - Logstash
• Filters: date/geoip/grok/multiline/mutate… (29 filter plugins now)
ES for logging - Logstash
• Transfer: stdout/*mq/tcp/udp/file/websocket…
• Alert: ganglia/nagios/opentsdb/graphite/irc/xmpp/email…
• Store: elasticsearch/mongodb/riak
• (47 output plugins now)
ES for logging - Logstash
input {
    redis {
        host      => "127.0.0.1"
        type      => "redis-input"
        data_type => "list"
        key       => "logstash"
    }
}
filter {
    grok {
        type    => "redis-input"
        pattern => "%{COMBINEDAPACHELOG}"
    }
}
output {
    elasticsearch {
        host => "127.0.0.1"
    }
}
ES for logging - Logstash
• Grok (regexp capture): %{IP:client:string} %{NUMBER:bytes:int}
• More default patterns in the source: https://github.com/logstash/logstash/tree/master/patterns
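Grok patterns boil down to named regex captures plus type coercion. A rough Python analogue of %{IP:client} %{NUMBER:bytes:int} (the regexes here are simplified stand-ins for the real grok patterns, and the input line is made up):

```python
import re

# Simplified stand-ins for grok's IP and NUMBER patterns.
line = "10.2.21.130 304"
pattern = re.compile(r"(?P<client>\d{1,3}(?:\.\d{1,3}){3}) (?P<bytes>\d+)")

m = pattern.match(line)
# grok's ":int" suffix corresponds to coercing the capture afterwards
fields = {"client": m.group("client"), "bytes": int(m.group("bytes"))}
# fields == {"client": "10.2.21.130", "bytes": 304}
```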
ES for logging - Logstash
For example:
10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1" 304 - "http://som.d.xiaonei.com/mediawiki/index.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10"
ES for logging - Logstash
{
  "@source" : "file://chenryn-Lenovo/home/chenryn/test.txt",
  "@tags" : [],
  "@fields" : {
    "clientip" : ["10.2.21.130"],
    "ident" : ["-"],
    "auth" : ["-"],
    "timestamp" : ["08/Apr/2013:11:13:40 +0800"],
    "verb" : ["GET"],
    "request" : ["/mediawiki/load.php"],
    "httpversion" : ["1.1"],
    "response" : ["304"],
    "referrer" : ["\"http://som.d.xiaonei.com/mediawiki/index.php\""],
    "agent" : ["\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10\""]
  },
  "@timestamp" : "2013-04-08T03:34:37.959Z",
  "@source_host" : "chenryn-Lenovo",
  "@source_path" : "/home/chenryn/test.txt",
  "@message" : "10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] \"GET /mediawiki/load.php HTTP/1.1\" 304 - \"http://som.d.xiaonei.com/mediawiki/index.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10\"",
  "@type" : "apache"
}
ES for logging - Logstash
"properties" : {
  "@fields" : {
    "dynamic" : "true",
    "properties" : {
      "client" : { "type" : "string", "index" : "not_analyzed" },
      "size" : { "type" : "long", "index" : "not_analyzed" },
      "status" : { "type" : "string", "index" : "not_analyzed" },
      "upstreamtime" : { "type" : "double" }
    }
  }
},
ES for logging - Kibana
ES for logging – Message::Passing
• A port of Logstash to Perl 5
• 17 CPAN modules
ES for logging – Message::Passing
use Message::Passing::DSL;

run_message_server message_chain {
    output elasticsearch => (
        class                 => 'ElasticSearch',
        elasticsearch_servers => ['127.0.0.1:9200'],
    );
    filter regexp => (
        class     => 'Regexp',
        format    => ':nginxaccesslog',
        capture   => [qw( ts status remotehost url oh responsetime upstreamtime bytes )],
        output_to => 'elasticsearch',
    );
    filter tologstash => (
        class     => 'ToLogstash',
        output_to => 'regexp',
    );
    input file => (
        class     => 'FileTail',
        output_to => 'tologstash',
    );
};
Message::Passing vs Logstash
100_000 lines of nginx access log:
logstash::output::elasticsearch_http (default)                                        4m30.013s
logstash::output::elasticsearch_http (flush_size => 1000)                             3m41.657s
message::passing::filter::regexp (v0.01: calls $self->_regex->regexp() every line)    1m22.519s
message::passing::filter::regexp (v0.04: stores $self->_regex->regexp() in $self->_re) 0m44.606s
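The v0.01 to v0.04 speedup came from building the regex once instead of once per line. A Python sketch of the same idea (note Python's re module caches compiled patterns internally, so the effect is smaller than in Perl; the log line and pattern here are illustrative):

```python
import re

LINE = '10.2.21.130 - - "GET /mediawiki/load.php HTTP/1.1" 304'
PATTERN = r'(?P<ip>\S+) \S+ \S+ "(?P<verb>\S+) (?P<url>\S+) [^"]+" (?P<status>\d+)'

def parse_slow(line):
    # v0.01-style: rebuild the regex object on every line
    return re.compile(PATTERN).match(line).groupdict()

_COMPILED = re.compile(PATTERN)  # v0.04-style: compile once, reuse

def parse_fast(line):
    return _COMPILED.match(line).groupdict()

# Both extract identical fields; only the per-line compile cost differs.
```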
D::P::Elasticsearch & D::P::Ajax
Build a website using Perl Dancer:

use Dancer ':syntax';
use Dancer::Plugin::Auth::Extensible;
use Dancer::Plugin::Ajax;
use Dancer::Plugin::ElasticSearch;

get '/' => require_role SOM => sub {
    my $indices = elsearch->cluster_state->{routing_table}->{indices};
    template 'psa/map', {
        providers   => [ sort keys %$default_provider ],
        datasources => [ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
        inputfrom   => strftime( "%F\T%T", localtime( time() - 864000 ) ),
        inputto     => strftime( "%F\T%T", localtime() ),
    };
};

ajax '/api/area' => sub {
    my $param = from_json( request->body );
    my $index = $index_prefix . $param->{'datasource'};
    my $limit = $param->{'limit'} || 50;
    my $from  = $param->{'from'}  || 'now-10d';
    my $to    = $param->{'to'}    || 'now';
    my $res   = pct_terms( $index, $limit, $from, $to );
    return to_json($res);
};
use Dancer::Plugin::ElasticSearch;

sub area_terms {
    my ( $index, $level, $limit, $from, $to ) = @_;
    my $data = elsearch->search(
        index  => $index,
        type   => $type,
        facets => {
            area => {
                facet_filter => {
                    and => [
                        { range => { date => { from => $from, to => $to } } },
                        { numeric_range => { timeCost => { gte => $level } } },
                    ],
                },
                terms => {
                    field => "fromArea",
                    size  => $limit,
                },
            },
        },
    );
    return $data->{facets}->{area}->{terms};
}
ES for monitor – oculus (Etsy Kale)
• Kale detects anomalous metrics and finds other metrics that look similar.
• http://codeascraft.com/2013/06/11/introducing-kale/
• https://github.com/etsy/skyline
• https://github.com/etsy/oculus
ES for monitor – oculus(Etsy Kale)
• Imports monitoring data from redis/ganglia into elasticsearch.
• Uses native scripts to calculate distance:
script.native:
    oculus_euclidian.type: com.etsy.oculus.tsscorers.EuclidianScriptFactory
    oculus_dtw.type: com.etsy.oculus.tsscorers.DTWScriptFactory
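The Euclidean scorer reduces to point-wise distance between two metric series. A sketch of just that arithmetic (oculus implements it as an Elasticsearch native script in Java; the series values here are made up):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length time series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

anomalous = [1.0, 2.0, 4.0, 8.0]
identical = [1.0, 2.0, 4.0, 8.0]
flat      = [1.0, 1.0, 1.0, 1.0]

# an identical series scores 0; smaller distance = more similar shape
d_same = euclidean(anomalous, identical)  # 0.0
d_flat = euclidean(anomalous, flat)       # > 0
```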
ES for monitor – oculus(Etsy Kale)
• https://speakerdeck.com/astanway/bring-the-noise-continuously-deploying-under-a-hailstorm-of-metrics
VBox example
• apt-get install -y git cpanminus virtualbox
• cpanm Rex
• git clone https://github.com/chenryn/esdevops
• cd esdevops
• rex init --name esdevops