Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Transcript of Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Page 1: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails
Page 2: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Elasticsearch?

Page 3: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

clustered and sharded document storage with powerful language analysing features and a query language, all wrapped by a REST API

Page 4: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Getting Started

• install elasticsearch

• needs some JDK

• start it

Page 5: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Getting Started

• https://github.com/elastic/elasticsearch-rails

• gems for Rails:

  • elasticsearch-model & elasticsearch-rails

• without Rails / AR:

  • elasticsearch-persistence

Page 7: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # Fields sent to Elasticsearch for each event
  def as_indexed_json(options = {})
    {
      title: title,
      description: description,
      starts_at: starts_at.iso8601
    }
  end
end
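To see what `as_indexed_json` hands over for indexing, here is a minimal plain-Ruby sketch. `EventDoc` is a hypothetical stand-in for the ActiveRecord model (no Rails or gems required) and the sample values are made up; only the shape of the hash comes from the slide.

```ruby
require 'json'
require 'time'

# Hypothetical stand-in for the ActiveRecord model on the slide.
EventDoc = Struct.new(:title, :description, :starts_at) do
  # Same shape as the slide's as_indexed_json: only these fields are indexed.
  def as_indexed_json(_options = {})
    {
      title: title,
      description: description,
      starts_at: starts_at.iso8601
    }
  end
end

# Sample data, invented for illustration.
event = EventDoc.new(
  'Finding the right stuff',
  'An intro to Elasticsearch',
  Time.new(2015, 10, 8, 19, 0, 0, '+09:00')
)

# The JSON document body that would be sent to the index.
puts JSON.generate(event.as_indexed_json)
```

Anything left out of the hash never reaches the index, so it can be neither searched nor filtered on.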

Page 12: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.import

PUT /events/event/31710
{
  "title": "Finding the right stuff, ...",
  "description": "Searching in data sets with ...",
  "starts_at": "2015-10-08T19:00:00+09:00"
}

index: events

type: event

ID: 31710
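The three labels map onto the three segments of the request path. A minimal sketch, assuming the `/<index>/<type>/<id>` layout shown in the PUT request above; `document_path` is a hypothetical helper, not part of any gem.

```ruby
# Builds the REST path for a single document.
# Layout taken from the slide: /<index>/<type>/<id>
def document_path(index, type, id)
  "/#{index}/#{type}/#{id}"
end

puts document_path('events', 'event', 31710)
```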

Page 15: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

response = Event.search 'tokyo rubyist'
response.took                         # => 28
response.results.total                # => 2075
response.results.first._score         # => 0.921177
response.results.first._source.title  # => "Drop in Ruby"

GET /events/event/_search?q=tokyo%20rubyist
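The request path above can be reproduced with the standard library alone. This is a rough sketch; `uri_search_path` is a hypothetical helper, and in practice the gem builds the request for you.

```ruby
require 'erb'

# Builds a URI search path like the one on the slide.
# ERB::Util.url_encode percent-encodes the space as %20.
def uri_search_path(index, type, query)
  "/#{index}/#{type}/_search?q=#{ERB::Util.url_encode(query)}"
end

puts uri_search_path('events', 'event', 'tokyo rubyist')
```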

Page 17: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

response = Event.search 'tokyo rubyist'
response.records.to_a # => [#<Event id: 12409, ...>, ...]

response.page(2).results
response.page(2).records

supports kaminari / will_paginate
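Elasticsearch itself paginates with a `from` offset and a `size` limit, so a page number has to be translated into that pair. A minimal sketch of the arithmetic; `pagination_params` is a hypothetical helper and the page size of 25 is only an example value.

```ruby
# Translates a 1-based page number into Elasticsearch's from/size pair.
# per_page is passed in explicitly here; in a real app the pagination
# gem supplies it.
def pagination_params(page, per_page)
  raise ArgumentError, 'page must be >= 1' if page < 1

  { from: (page - 1) * per_page, size: per_page }
end

p pagination_params(2, 25)
```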

Page 18: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

response = Event.search 'tokyo rubyist'
response.records.each_with_hit do |rec, hit|
  puts "* #{rec.title}: #{hit._score}"
end
# * Drop in Ruby: 0.9205564
# * Javascript meets Ruby in Kamakura: 0.8947
# * Meetup at EC Navi: 0.8766844
# * Pair Programming Session #3: 0.8603562
# * Kickoff Party: 0.8265461
# * Tales of a Ruby Committer: 0.74487066
# * One Year Anniversary Party: 0.7298212

Page 21: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.search 'tokyo rubyist'

only upcoming events?

sorted by start date?

Page 25: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.search query: {
  filtered: {
    # basically same as before
    query: {
      simple_query_string: {
        query: "tokyo rubyist",
        default_operator: "and"
      }
    },
    # filtered by conditions
    filter: {
      and: [
        { range: { starts_at: { gte: Time.now } } },
        { term: { featured: true } }
      ]
    }
  }
},
# sorted by start time
sort: { starts_at: { order: "asc" } }
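Since the search body is a plain Ruby hash, it can be built by an ordinary method and inspected as JSON before it is sent. A sketch under assumptions: `upcoming_featured_events_query` is a hypothetical name, and the timestamp is passed in (rather than calling `Time.now` inline) so the output is reproducible.

```ruby
require 'json'
require 'time'

# Builds the search body from the slide for a given text and point in time.
# The "filtered" query and "and" filter follow the DSL used on the slides.
def upcoming_featured_events_query(text, now)
  {
    query: {
      filtered: {
        query: {
          simple_query_string: { query: text, default_operator: 'and' }
        },
        filter: {
          and: [
            { range: { starts_at: { gte: now.iso8601 } } },
            { term: { featured: true } }
          ]
        }
      }
    },
    sort: { starts_at: { order: 'asc' } }
  }
end

body = upcoming_featured_events_query('tokyo rubyist', Time.utc(2015, 10, 8, 10, 0, 0))
puts JSON.pretty_generate(body)
```

The resulting hash could then be passed to `Event.search` in place of the literal shown on the slide.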

Page 26: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Query DSL

• query: { <query_type>: <arguments> }

• valid arguments depend on query type

• "Filtered Query" takes a query and a filter

• "Simple Query String Query" does not allow nested queries

Page 27: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.search query: {
  filtered: {
    query: {
      simple_query_string: {
        query: "tokyo rubyist",
        default_operator: "and"
      }
    },
    filter: {
      and: [
        { range: { starts_at: { gte: Time.now } } },
        { term: { featured: true } }
      ]
    }
  }
}, sort: { starts_at: { order: "asc" } }

Page 28: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Query DSL

• filter: { <filter_type>: <arguments> }

• valid arguments depend on filter type

• "And filter" takes an array of filters

• "Range filter" takes a property and lt(e), gt(e)

• "Term filter" takes a property and a value
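Because every filter is a hash of the form `{ <filter_type>: <arguments> }`, the three filter types listed above compose naturally. A small sketch with hypothetical helper methods (none of these are part of the gems):

```ruby
# Hypothetical helpers, one per filter type named on the slide.

# "Range filter": a property plus lt(e) / gt(e) bounds.
def range_filter(field, bounds)
  { range: { field => bounds } }
end

# "Term filter": a property and a value.
def term_filter(field, value)
  { term: { field => value } }
end

# "And filter": an array of filters.
def and_filter(*filters)
  { and: filters }
end

filter = and_filter(
  range_filter(:starts_at, { gte: '2015-10-08T19:00:00+09:00' }),
  term_filter(:featured, true)
)
p filter
```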

Page 29: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Match Query, Multi Match Query
Bool Query, Boosting Query
Common Terms Query, Constant Score Query
Dis Max Query, Filtered Query
Fuzzy Like This Query, Fuzzy Like This Field Query
Function Score Query, Fuzzy Query
GeoShape Query, Has Child Query, Has Parent Query
Ids Query, Indices Query
Match All Query, More Like This Query
Nested Query, Prefix Query
Query String Query, Simple Query String Query
Range Query, Regexp Query
Span First Query, Span Multi Term Query
Span Near Query, Span Not Query, Span Or Query
Span Term Query, Term Query, Terms Query
Top Children Query, Wildcard Query
Minimum Should Match, Multi Term Query Rewrite
Template Query

Page 30: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

And Filter, Bool Filter
Exists Filter, Geo Bounding Box Filter
Geo Distance Filter, Geo Distance Range Filter
Geo Polygon Filter, GeoShape Filter
Geohash Cell Filter, Has Child Filter, Has Parent Filter
Ids Filter, Indices Filter
Limit Filter, Match All Filter, Missing Filter, Nested Filter
Not Filter, Or Filter
Prefix Filter, Query Filter
Range Filter, Regexp Filter, Script Filter, Term Filter
Terms Filter, Type Filter

Page 31: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.search query: {
  filtered: {
    query: {
      simple_query_string: {
        query: "tokyo rubyist",
        default_operator: "and"
      }
    },
    filter: {
      and: [
        { range: { starts_at: { gte: Time.now } } },
        { term: { featured: true } }
      ]
    }
  }
}, sort: { starts_at: { order: "asc" } }

Page 32: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  def as_indexed_json(options = {})
    {
      title: title,
      description: description,
      starts_at: starts_at.iso8601,
      featured: group.featured?
    }
  end

  # Explicit mapping: only the fields listed here are indexed
  settings do
    mapping dynamic: 'false' do
      indexes :title, type: 'string'
      indexes :description, type: 'string'
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end

Page 33: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.import force: true

deletes existing index, creates new index with settings, imports documents

Page 34: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.search query: {
  filtered: {
    query: {
      simple_query_string: {
        query: "tokyo rubyist",
        default_operator: "and"
      }
    },
    filter: {
      and: [
        { range: { starts_at: { gte: Time.now } } },
        { term: { featured: true } }
      ]
    }
  }
}, sort: { starts_at: { order: "asc" } }

Page 35: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.search query: {
  bool: {
    should: [
      {
        simple_query_string: {
          query: "tokyo rubyist",
          default_operator: "and"
        }
      },
      {
        function_score: {
          filter: {
            and: [
              { range: { starts_at: { lte: 'now' } } },
              { term: { featured: true } }
            ]
          },
          gauss: {
            starts_at: { origin: 'now', scale: '10d', decay: 0.5 }
          },
          boost_mode: "sum"
        }
      }
    ],
    minimum_should_match: 2
  }
}
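The `gauss` entry scores each event by how far its `starts_at` lies from `origin`. The sketch below reproduces the Gaussian decay formula as documented for Elasticsearch's decay functions, using the slide's values (`scale` of 10 days, `decay` of 0.5). The method name and the use of plain day counts instead of dates are simplifications for illustration.

```ruby
# Gaussian decay as documented for function_score decay functions:
#   sigma^2 = -scale^2 / (2 * ln(decay))
#   score   = exp(-max(0, distance - offset)^2 / (2 * sigma^2))
# distance, scale and offset share one unit (days here).
def gauss_decay(distance, scale:, decay: 0.5, offset: 0.0)
  sigma_squared = -(scale**2) / (2.0 * Math.log(decay))
  effective = [0.0, distance.abs - offset].max
  Math.exp(-(effective**2) / (2.0 * sigma_squared))
end

[0, 5, 10, 20].each do |days|
  puts format('%2d days from origin: %.3f', days, gauss_decay(days, scale: 10))
end
```

An event exactly at `origin` scores 1.0, and an event exactly `scale` away scores `decay`, so with these settings an event 10 days out contributes half as much as one starting now.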

Page 36: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.search '東京rubyist'

Page 37: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Dealing with different languages

built-in analysers for arabic, armenian, basque, brazilian, bulgarian, catalan, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, thai, turkish.

Page 38: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Japanese?

• install kuromoji plugin

• https://github.com/elastic/elasticsearch-analysis-kuromoji

• plugin install elasticsearch/elasticsearch-analysis-kuromoji/2.7.0

Page 39: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  def as_indexed_json(options = {})
    {
      title: { en: title_en, ja: title_ja },
      description: { en: description_en, ja: description_ja },
      starts_at: starts_at.iso8601,
      featured: group.featured?
    }
  end

  # One sub-field per language, each with its own analyzer
  settings do
    mapping dynamic: 'false' do
      indexes :title do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :description do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end
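With one sub-field per language, a query has to name the fields it should search. One way to do that is the `fields` option of `simple_query_string`. The sketch below only builds the query hash; `multilingual_query` is a hypothetical helper, and the field names are taken from the mapping above.

```ruby
# Builds a simple_query_string query over the per-language sub-fields.
# Field names follow the mapping on the slide; the helper is hypothetical.
def multilingual_query(text, languages: %w[en ja])
  fields = %w[title description].product(languages).map do |field, lang|
    "#{field}.#{lang}"
  end

  {
    query: {
      simple_query_string: {
        query: text,
        fields: fields,
        default_operator: 'and'
      }
    }
  }
end

p multilingual_query('東京rubyist')
```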

Page 40: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Event.search 'tokyo rubyist'

with data from other models?

Page 41: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  def as_indexed_json(options = {})
    {
      title: { en: title_en, ja: title_ja },
      description: { en: description_en, ja: description_ja },
      # Data from the associated group is copied into the event document
      group_name: { en: group.name_en, ja: group.name_ja },
      starts_at: starts_at.iso8601,
      featured: group.featured?
    }
  end

  settings do
    mapping dynamic: 'false' do
      indexes :title do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :description do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :group_name do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end

Page 42: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Automated Tests

Page 43: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  index_name "drkpr_#{Rails.env}_events"
end

Index names with environment

Page 44: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Test Helpers

• https://gist.github.com/mreinsch/094dc9cf63362314cef4

• Helpers: wait_for_elasticsearch, wait_for_elasticsearch_removal, clear_elasticsearch!

• specs: Tag tests which require elasticsearch
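Helpers like these are needed because a freshly indexed document only becomes searchable after the index has refreshed, so a test has to wait before asserting on search results. The sketch below shows a generic polling helper. It is a hypothetical illustration, not the code from the linked gist, and the block stands in for a real "is the document searchable yet?" check.

```ruby
# Polls a block until it returns a truthy value or the timeout elapses.
# Hypothetical helper; not the implementation from the linked gist.
def wait_until(timeout: 2.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Usage with a fake check that succeeds on the third poll.
attempts = 0
found = wait_until { (attempts += 1) >= 3 }
puts "found=#{found} after #{attempts} attempts"
```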

Page 45: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Production Ready?

• use elastic.co/found or AWS ES

• use two instances for redundancy

• elasticsearch could go away

• usually only impacts search

• keep impact at a minimum
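One way to keep the impact small is to treat a failed search as "no results" instead of letting the error take the whole page down. A minimal sketch, assuming a hypothetical `SearchUnavailable` error that stands in for whatever the client raises when Elasticsearch is unreachable:

```ruby
# Hypothetical error class standing in for whatever the client raises
# when Elasticsearch cannot be reached.
class SearchUnavailable < StandardError; end

# Runs the given search block and falls back to a default on failure,
# so an outage degrades search only.
def search_with_fallback(fallback: [])
  yield
rescue SearchUnavailable => e
  warn "search unavailable: #{e.message}"
  fallback
end

results = search_with_fallback { raise SearchUnavailable, 'connection refused' }
p results
```

The page can then render an empty result list (or a notice) while the rest of the application keeps working.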

Page 46: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # Index updates run in a background job, outside the request
  after_save do
    IndexerJob.perform_later('update', self.class.name, self.id)
  end

  after_destroy do
    IndexerJob.perform_later('delete', self.class.name, self.id)
  end

  # ...
end

Page 47: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

class IndexerJob < ActiveJob::Base
  queue_as :default

  def perform(action, record_type, record_id)
    record_class = record_type.constantize
    record_data = {
      index: record_class.index_name,
      type: record_class.document_type,
      id: record_id
    }
    client = record_class.__elasticsearch__.client

    case action.to_s
    when 'update'
      record = record_class.find(record_id)
      client.index record_data.merge(body: record.as_indexed_json)
    when 'delete'
      client.delete record_data.merge(ignore: 404)
    end
  end
end

https://gist.github.com/mreinsch/acb2f6c58891e5cd4f13

Page 48: Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

Questions?

Resources

Elastic Docs: https://www.elastic.co/guide/index.html

Ruby Gem Docs: https://github.com/elastic/elasticsearch-rails

or ask me later: [email protected] @mreinsch