Elastic search Walkthrough


Page 1: Elastic search Walkthrough

Elastic Search

Page 2: Elastic search Walkthrough

Concepts

• Elasticsearch is an open source (Apache 2), distributed, RESTful search engine built on top of Apache Lucene.

• Schema free and document oriented.

• Supports the JSON model.

• Elasticsearch lets you completely control how a JSON document gets mapped into the search engine on a per-type and per-index level.

• Multi-tenancy: support for more than one index, and for more than one type per index.

• Distributed nature: indices are broken down into shards, each shard with 0 or more replicas.

• In RDBMS terms, an index corresponds to a database, a type to a table, a document to a table row, and a field to a table column.

Page 3: Elastic search Walkthrough

Create Index

• The create index API instantiates an index.

• curl example that creates the sales index (the index name should be in lowercase):

    $ curl -XPOST 'http://localhost:9200/sales/'

• Each index can have specific settings associated with it. The following example creates the sales index with 3 shards, each with 2 replicas:

    $ curl -XPOST 'http://localhost:9200/sales/' -d '{
        "settings" : { "number_of_shards" : 3, "number_of_replicas" : 2 }
    }'

• Reference link: http://www.elasticsearch.org/guide/reference/api/admin-indices-create-index.html
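The same create-index request can be prepared from a script. The following is a minimal Python sketch rather than anything from the original slides: the helper name `create_index_request` and its return shape are hypothetical, the host is assumed to be localhost:9200 as in the curl examples, and nothing is sent over the network.

```python
import json

BASE_URL = "http://localhost:9200"  # assumed host, same as the curl examples


def create_index_request(index, shards=None, replicas=None):
    """Return (method, url, body) for a create-index call (hypothetical helper)."""
    if index != index.lower():
        raise ValueError("index name should be in lowercase")
    settings = {}
    if shards is not None:
        settings["number_of_shards"] = shards
    if replicas is not None:
        settings["number_of_replicas"] = replicas
    # Send no body at all when there are no settings to apply.
    body = json.dumps({"settings": settings}) if settings else ""
    return "POST", f"{BASE_URL}/{index}/", body


method, url, body = create_index_request("sales", shards=3, replicas=2)
print(method, url, body)
```

Validating the lowercase rule before the request is built keeps a misnamed index from ever reaching the server.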

Page 4: Elastic search Walkthrough

Mapping

• Mapping is the process of defining how a document should be mapped to the search engine.

• If no mapping is defined, Elasticsearch guesses the kind of data and maps it.

• In ES, an index may store documents of different "mapping types".

• The put mapping API registers a specific mapping definition for a specific type. Example: mapping for the order1 type

    curl -XPOST 'http://localhost:9200/sales/order1/_mapping' -d '{
        "order1": {
            "properties": {
                "entity_id": {"type": "integer"},
                "increment_id": {"type": "string", "index": "not_analyzed"},
                "status": {"type": "string"}
            }
        }
    }'
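The mapping body has a fixed envelope: the type name wraps a "properties" object that lists each field. As an illustrative Python sketch (the helper name `mapping_body` is hypothetical and the code only builds JSON, it does not call Elasticsearch):

```python
import json


def mapping_body(type_name, properties):
    """Wrap field definitions in the {type: {"properties": ...}} envelope."""
    return {type_name: {"properties": properties}}


order1_mapping = mapping_body("order1", {
    "entity_id": {"type": "integer"},
    # not_analyzed keeps the value as one exact token
    "increment_id": {"type": "string", "index": "not_analyzed"},
    "status": {"type": "string"},
})
print(json.dumps(order1_mapping, indent=2))
```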

Page 5: Elastic search Walkthrough

Mapping

• Get the mappings available in an index. The following curl example returns all types in the sales index and their associated mappings:

    curl -XGET 'localhost:9200/sales/_mapping?pretty=1'

• Get the mapping of a single type:

    curl -XGET 'localhost:9200/sales/order1/_mapping?pretty=1'

• Reference link: http://www.elasticsearch.org/guide/reference/mapping/index.html

Page 6: Elastic search Walkthrough

Add document

• The following example inserts a JSON document into the "sales" index, under a type called "order1", with an id of 1:

    curl -XPOST 'http://localhost:9200/sales/order1/1' -d '{
        "entity_id": 1,
        "increment_id": "1000001",
        "status": "shipped"
    }'

• Reference link: http://www.elasticsearch.org/guide/reference/api/index_.html

Page 7: Elastic search Walkthrough

GET API (Get data)

• The get API retrieves a typed JSON document from the index based on its id. The following example gets a JSON document from an index called sales, under a type called order1, with id 1:

    curl -XGET 'localhost:9200/sales/order1/1?pretty=1'

• The get operation can return only a chosen set of fields when the fields parameter is passed. Additional query parameters are appended with &. For example:

    curl -XGET 'localhost:9200/sales/order/1?fields=entity_id&pretty=1'

• Reference link: http://www.elasticsearch.org/guide/reference/api/get.html

• For the Multi Get API: http://www.elasticsearch.org/guide/reference/api/multi-get.html
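When a GET URL carries more than one query parameter, the first is introduced with ? and the rest are joined with &. The following Python sketch builds such a URL with the standard library; the helper name `get_document_url` is hypothetical and the host is assumed to be localhost:9200.

```python
from urllib.parse import urlencode


def get_document_url(index, doc_type, doc_id, fields=None, pretty=False):
    """Build the GET URL for one document (hypothetical helper)."""
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    if pretty:
        params["pretty"] = 1
    url = f"http://localhost:9200/{index}/{doc_type}/{doc_id}"
    # urlencode joins the key=value pairs with "&" and escapes the values
    return f"{url}?{urlencode(params)}" if params else url


print(get_document_url("sales", "order", 1, fields=["entity_id"], pretty=True))
# http://localhost:9200/sales/order/1?fields=entity_id&pretty=1
```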

Page 8: Elastic search Walkthrough

Search API (Search data)

• The search API executes a search query and returns the search hits that match the query. The following query returns the documents that have entity_id 1:

    curl -XGET 'http://localhost:9200/sales/order1/_search' -d '{
        "query" : { "term" : { "entity_id" : 1 } }
    }'

• Additional parameters for the search API include from, size, search_type, sort and fields:

    curl -XGET 'http://localhost:9200/sales/order1/_search' -d '{
        "query" : { "term" : { "status" : "confirmed" } },
        "from" : 0,
        "size" : 1,
        "sort" : [ { "entity_id" : "desc" } ],
        "fields" : ["entity_id", "increment_id"]
    }'

• Reference link: http://www.elasticsearch.org/guide/reference/api/search/request-body.html
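Hand-written request bodies are easy to get wrong, for example by leaving a brace unclosed in the sort list. A small Python sketch that assembles the body from plain data structures avoids that class of error. The helper `search_body` and its parameter names are hypothetical, and only the parameters named on this slide are covered.

```python
import json


def search_body(query, start=0, size=10, sort=None, fields=None):
    """Assemble a search request body (hypothetical helper)."""
    body = {"query": query, "from": start, "size": size}
    if sort:
        # each sort entry becomes {"field": "asc" | "desc"}
        body["sort"] = [{field: order} for field, order in sort]
    if fields:
        body["fields"] = list(fields)
    return body


body = search_body(
    {"term": {"status": "confirmed"}},
    start=0,
    size=1,
    sort=[("entity_id", "desc")],
    fields=["entity_id", "increment_id"],
)
print(json.dumps(body))
```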

Page 9: Elastic search Walkthrough

Multi-Search API (Search data)

• The search API can be applied to multiple types within an index, and across multiple indices, using the multi-index syntax. For example, we can search all documents across all types within the sales index:

    curl -XGET 'http://localhost:9200/sales/_search?q=status:confirmed'

• We can also search within specific types:

    curl -XGET 'http://localhost:9200/sales/order,order1/_search?q=status:confirmed'

• We can also search all orders with a certain field across several indices:

    curl -XGET 'http://localhost:9200/sales,newsales/order1/_search?q=entity_id:1'

• We can search all orders across all available indices using the _all placeholder:

    curl -XGET 'http://localhost:9200/_all/order1/_search?q=entity_id:1'

• We can even search across all indices and all types:

    curl -XGET 'http://localhost:9200/_search?q=entity_id:1'
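The URL patterns above follow one rule: comma-separated indices, then comma-separated types, then _search, with _all standing in for the index segment when only types are given. A Python sketch of that rule follows. The helper `search_url` is hypothetical, the host is assumed, and the q value is appended without URL-escaping to keep the sketch short.

```python
def search_url(indices=None, types=None, q=None):
    """Build a search URL over any mix of indices and types (hypothetical helper)."""
    parts = []
    if indices:
        parts.append(",".join(indices))
    elif types:
        parts.append("_all")  # types need an index segment in front of them
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    url = "http://localhost:9200/" + "/".join(parts)
    return f"{url}?q={q}" if q else url


print(search_url(["sales"], ["order", "order1"], q="status:confirmed"))
print(search_url(types=["order1"], q="entity_id:1"))
print(search_url(q="entity_id:1"))
```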

Page 10: Elastic search Walkthrough

Update API

• The update API updates a document based on a provided script. The following example sets the status field of the document with id 1 to a new value:

    curl -XPOST 'localhost:9200/sales/order1/1/_update' -d '{
        "script" : "ctx._source.status = newStatus",
        "params" : { "newStatus" : "confirmed" }
    }'

• We can also add a new field to the document:

    curl -XPOST 'localhost:9200/sales/order1/1/_update' -d '{
        "script" : "ctx._source.newField = \"new field introduced\""
    }'

• We can also remove a field from the document:

    curl -XPOST 'localhost:9200/sales/order1/1/_update' -d '{
        "script" : "ctx._source.remove(\"newField\")"
    }'

• Reference link: http://www.elasticsearch.org/guide/reference/api/update.html
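Scripts that contain string literals need their inner quotes escaped inside the JSON body, which is where the \" sequences in the curl examples come from. Serializing the body with a JSON library handles the escaping automatically, as this Python sketch shows. The helper name `update_body` is hypothetical and no request is sent.

```python
import json


def update_body(script, params=None):
    """Serialize an update request body (hypothetical helper)."""
    body = {"script": script}
    if params:
        body["params"] = params
    return json.dumps(body)


# Passing the value as a parameter keeps it out of the script text.
print(update_body("ctx._source.status = newStatus", {"newStatus": "confirmed"}))

# The inner double quotes are escaped by json.dumps.
print(update_body('ctx._source.remove("newField")'))
# {"script": "ctx._source.remove(\"newField\")"}
```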

Page 11: Elastic search Walkthrough

Delete API

• The delete API deletes a typed JSON document from a specific index based on its id. The following example deletes the JSON document from an index called sales, under a type called order1, with id 1:

    curl -XDELETE 'http://localhost:9200/sales/order1/1'

• Delete an entire type:

    curl -XDELETE 'http://localhost:9200/sales/order1'

• The delete by query API deletes documents from one or more indices and one or more types based on a query:

    curl -XDELETE 'http://localhost:9200/sales/order1/_query?q=entity_id:1'
    curl -XDELETE 'http://localhost:9200/sales/_query?q=entity_id:1'
    curl -XDELETE 'http://localhost:9200/sales/order1/_query' -d '{
        "term" : { "status" : "confirmed" }
    }'

Page 12: Elastic search Walkthrough

Count API

• The count API executes a query and returns the number of matches for that query. It can be executed across one or more indices and across one or more types.

    curl -XGET 'http://localhost:9200/sales/order/_count' -d '{"term":{"status":"confirmed"}}'

    curl -XGET 'http://localhost:9200/_count' -d '{"term":{"status":"confirmed"}}'

    curl -XGET 'http://localhost:9200/sales/order,order1/_count' -d '{"term":{"status":"confirmed"}}'

• Reference link: http://www.elasticsearch.org/guide/reference/api/count.html

Page 13: Elastic search Walkthrough

Facet Search

• Facets provide aggregated data based on a search query.

• A terms facet returns counts for the various values of a specific field. Elasticsearch supports more facet implementations, such as range, statistical or date histogram facets.

• The field used for facet calculations must be of type numeric or date/time, or be analyzed as a single token.

• You can give the facet a custom name and return multiple facets in one request.

• Now, let's query the index for products that have category id 3 and retrieve a terms facet for the brands field. We will simply name the facet Brands (example of a terms facet):

    curl -XGET 'localhost:9200/category/products/_search?pretty=1' -d '{
        "query" : { "term" : { "category_id" : 3 } },
        "facets" : {
            "Brands" : { "terms" : { "fields" : ["brands"], "size" : 10, "order" : "term" } }
        }
    }'

• Reference links:
  http://www.elasticsearch.org/guide/reference/api/search/facets/
  http://www.elasticsearch.org/guide/reference/api/search/facets/terms-facet.html
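To make the terms facet concrete, the following Python sketch computes the same kind of result locally: filter documents by the query, count the values of one field, order by term and keep the first `size` entries. It is only an illustration of the idea, not how Elasticsearch computes facets internally. The sample products and the helper `terms_facet` are made up.

```python
from collections import Counter


def terms_facet(docs, field, size=10):
    """Count values of `field`; return up to `size` entries ordered by term."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        values = value if isinstance(value, list) else [value]
        counts.update(v for v in values if v is not None)
    ordered = sorted(counts.items())  # mimics "order": "term"
    return [{"term": term, "count": count} for term, count in ordered[:size]]


# Made-up sample data for illustration only.
products = [
    {"category_id": 3, "brands": "acme"},
    {"category_id": 3, "brands": "zenith"},
    {"category_id": 3, "brands": "acme"},
    {"category_id": 5, "brands": "acme"},
]
hits = [p for p in products if p["category_id"] == 3]  # plays the role of the term query
print(terms_facet(hits, "brands"))
# [{'term': 'acme', 'count': 2}, {'term': 'zenith', 'count': 1}]
```

The facet counts cover only the documents matched by the query, which is why the category 5 product does not contribute to the acme count.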

Page 14: Elastic search Walkthrough

Facet search

• The range facet lets you specify a set of ranges and get both the number of docs (count) that fall within each range and aggregated data, based either on the same field or on another field.

    curl -XGET 'localhost:9200/sales/order/_search?pretty=1' -d '{
        "query" : { "term" : { "status" : "confirmed" } },
        "facets" : {
            "range1" : {
                "range" : {
                    "grand_total" : [
                        { "to" : 50 },
                        { "from" : 20, "to" : 70 },
                        { "from" : 70, "to" : 120 },
                        { "from" : 150 }
                    ]
                }
            }
        },
        "sort" : [ { "entity_id" : "asc" } ]
    }'

• Reference link: http://www.elasticsearch.org/guide/reference/api/search/facets/range-facet.html
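The ranges in the example overlap (20 to 70 overlaps "to 50") and leave a gap (120 to 150), so a document can be counted in more than one range or in none. The Python sketch below reproduces that bucketing locally. It assumes that "from" is inclusive and "to" is exclusive, which should be checked against the range facet reference for the version in use. The sample values and the helper `range_facet` are made up.

```python
def range_facet(values, ranges):
    """Return each range with the count and total of the values inside it."""
    result = []
    for r in ranges:
        lo, hi = r.get("from"), r.get("to")
        # Assumption: "from" is inclusive and "to" is exclusive.
        inside = [v for v in values
                  if (lo is None or v >= lo) and (hi is None or v < hi)]
        result.append(dict(r, count=len(inside), total=sum(inside)))
    return result


ranges = [{"to": 50}, {"from": 20, "to": 70}, {"from": 70, "to": 120}, {"from": 150}]
grand_totals = [10, 30, 70, 130, 200]  # made-up sample values
for row in range_facet(grand_totals, ranges):
    print(row)
```

With these sample values, 30 is counted in both of the first two ranges and 130 falls into the gap, so it is not counted anywhere.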

Page 15: Elastic search Walkthrough

Elastica

• Elastica is an open source PHP client for the Elasticsearch search engine/database.

• Reference link: http://www.elastica.io/en

• To use Elastica, download it and include it in a project using PHP autoload:

    function __autoload_elastica($class) {
        $path = str_replace('_', '/', $class);
        if (file_exists('/xampp/htdocs/project/Elastica/lib/' . $path . '.php')) {
            require_once('/xampp/htdocs/project/Elastica/lib/' . $path . '.php');
        }
    }
    spl_autoload_register('__autoload_elastica');

• Connecting to Elasticsearch on a single node:

    $elasticaClient = new Elastica_Client(array('host' => '192.168.0.27', 'port' => '9200'));

• It is quite easy to start an Elasticsearch cluster: simply start multiple instances of Elasticsearch on one server or on multiple servers. One of the goals of a distributed search index is availability. If one server goes down, search results should still be served.

    $elasticaClient = new Elastica_Client(array('servers' => array(
        array('host' => '192.168.0.27', 'port' => '9200'),
        array('host' => '192.168.0.27', 'port' => '9201')
    )));

Page 16: Elastic search Walkthrough

Elastica

• Create index:

    $elasticaClient = new Elastica_Client(array('host' => '192.168.0.27', 'port' => '9200'));
    $elasticaIndex = $elasticaClient->getIndex('sales');
    $elasticaIndex->create(array('number_of_shards' => 4, 'number_of_replicas' => 1), true);

• Define mapping:

    $mapping = new Elastica_Type_Mapping();
    $elasticaIndex = $elasticaClient->getIndex('sales');
    $elasticaType = $elasticaIndex->getType('order');
    $mapping->setType($elasticaType);

    $mapping->setProperties(array(
        'entity_id' => array('type' => 'integer'),
        'increment_id' => array('type' => 'string', 'index' => 'not_analyzed'),
        'status' => array('type' => 'string', 'index' => 'not_analyzed')
    ));
    $mapping->send();

Page 17: Elastic search Walkthrough

Elastica Add documents

    $elasticaClient = new Elastica_Client(array('host' => '192.168.0.27', 'port' => '9200'));

    $elasticaIndex = $elasticaClient->getIndex('sales');
    $elasticaType = $elasticaIndex->getType('order');

    // The id of the document
    $id = 1;

    // Create a document
    $record = array('entity_id' => 1, 'increment_id' => '100001', 'status' => 'confirmed');
    $recordDocument = new Elastica_Document($id, $record);

    // Add record to type
    $elasticaType->addDocument($recordDocument);

    // Refresh index
    $elasticaType->getIndex()->refresh();

Page 18: Elastic search Walkthrough

Elastica Get Document

    $elasticaClient = new Elastica_Client(array('host' => '192.168.0.27', 'port' => '9200'));

    $index = $elasticaClient->getIndex('sales'); // get index
    $type = $index->getType('order'); // get type
    $id = 1; // id of the document to fetch
    $Doc = $type->getDocument($id)->getData(); // get data

Page 19: Elastic search Walkthrough

Elastica Update Document

    $elasticaClient = new Elastica_Client(array('host' => '192.168.0.27', 'port' => '9200'));
    $index = $elasticaClient->getIndex('sales'); // get index
    $type = $index->getType('order'); // get type
    $id = 1; // id of the document to update
    $newVal = 'confirmed'; // new value for the status field
    $update = new Elastica_Script("ctx._source.status = newval", array('newval' => $newVal));
    $res = $type->updateDocument($id, $update);
    if (!empty($res)) {
        $val = $res->getData();
        if ($val['ok']) {
            echo "updated";
        } else {
            echo "value not updated";
        }
    } else {
        echo "value not updated";
    }

Page 20: Elastic search Walkthrough

Elastica Search Documents

• The search API executes a search query and returns the search hits that match the query.

• The search API consists of the following major methods:
  - Query String
  - Term
  - Terms
  - Range
  - Bool Query
  - Filter (it also contains Filter_Term, Filter_Range etc.)
  - Facets (it contains Facet_Range, Facet_Terms, Facet_Filter, Facet_Query, Facet_Statistical etc.)
  - Query (where we can set the output fields, limit and sorting)

Page 21: Elastic search Walkthrough

Search Documents - Query String

    $elasticaClient = new Elastica_Client(array('host' => '192.168.0.27', 'port' => '9200'));
    $elasticaIndex = $elasticaClient->getIndex('sales');
    $elasticaType = $elasticaIndex->getType('order');

    $elasticaQueryString = new Elastica_Query_QueryString();
    $elasticaQueryString->setQuery('shipped*');
    $elasticaQueryString->setFields(array('status')); // one or more fields can be set in a query string

    $elasticaQuery = new Elastica_Query();
    $elasticaQuery->setQuery($elasticaQueryString);
    $elasticaQuery->setFields(array('increment_id', 'entity_id', 'billing_name', 'grand_total'));
    $elasticaQuery->setFrom(0);
    $elasticaQuery->setLimit(20);
    $sort = array("entity_id" => "desc");
    $elasticaQuery->setSort($sort);

    $elasticaResultSet = $elasticaType->search($elasticaQuery);
    $totalResults = $elasticaResultSet->getTotalHits();
    $elasticaResults = $elasticaResultSet->getResults();
    foreach ($elasticaResults as $elasticaResult) {
        print_r($elasticaResult->getData());
    }

Page 22: Elastic search Walkthrough

Search Documents - Query Term

    $elasticaQueryTerm = new Elastica_Query_Term();
    $elasticaQueryTerm->setTerm('entity_id', 1);

    $elasticaQuery = new Elastica_Query();
    $elasticaQuery->setQuery($elasticaQueryTerm);
    $elasticaQuery->setFields(array('increment_id', 'entity_id', 'billing_name', 'grand_total'));
    $elasticaQuery->setFrom(0);
    $elasticaQuery->setLimit(20);
    $sort = array("entity_id" => "asc");
    $elasticaQuery->setSort($sort);

    $elasticaResultSet = $elasticaType->search($elasticaQuery);

Page 23: Elastic search Walkthrough

Search Documents - Query Terms

    $elasticaQueryTerms = new Elastica_Query_Terms();

    // For a terms query, you can specify one or more values per field
    $elasticaQueryTerms->setTerms('entity_id', array(1, 2, 3, 4, 5));
    $elasticaQueryTerms->addTerm(6);

    $elasticaQuery = new Elastica_Query();
    $elasticaQuery->setQuery($elasticaQueryTerms);
    $elasticaQuery->setFields(array('increment_id', 'entity_id', 'billing_name', 'grand_total'));
    $elasticaQuery->setFrom(0);
    $elasticaQuery->setLimit(20);
    $sort = array("entity_id" => "asc");
    $elasticaQuery->setSort($sort);

    $elasticaResultSet = $elasticaType->search($elasticaQuery);

Page 24: Elastic search Walkthrough

Search Documents - Query Range

    $elasticaQueryRange = new Elastica_Query_Range();

    // For a range query, you can specify from only, from and to, or to only
    $elasticaQueryRange->addField('entity_id', array('from' => 10, 'to' => 14));

    $elasticaQuery = new Elastica_Query();
    $elasticaQuery->setQuery($elasticaQueryRange);
    $elasticaQuery->setFields(array('increment_id', 'entity_id', 'billing_name', 'grand_total'));
    $elasticaQuery->setFrom(0);
    $elasticaQuery->setLimit(20);
    $sort = array("entity_id" => "asc");
    $elasticaQuery->setSort($sort);

    $elasticaResultSet = $elasticaType->search($elasticaQuery);

Page 25: Elastic search Walkthrough

Search Documents - Bool Query

• The bool query maps to the Lucene BooleanQuery.

• A bool query contains clauses with an occurrence type: must, should or must_not.

    $boolQuery = new Elastica_Query_Bool();

    $elasticaQueryString = new Elastica_Query_QueryString();
    $elasticaQueryString->setQuery('shoh*');
    $elasticaQueryString->setFields(array('billing_name', 'shipping_name'));
    $boolQuery->addMust($elasticaQueryString);

    $elasticaQueryTerm = new Elastica_Query_Term();
    $elasticaQueryTerm->setTerm('entity_id', 1);
    $boolQuery->addMust($elasticaQueryTerm);

    $elasticaQuery = new Elastica_Query();
    $elasticaQuery->setQuery($boolQuery);
    $elasticaResultSet = $elasticaType->search($elasticaQuery);
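For comparison with the curl examples earlier in the deck, a bool query with two must clauses corresponds roughly to the JSON body built below. This is a Python sketch of the query DSL shape only. The helper `bool_query` is hypothetical, and the exact JSON that Elastica serializes may differ in detail.

```python
import json


def bool_query(must=(), should=(), must_not=()):
    """Build a bool query body from lists of clauses (hypothetical helper)."""
    clauses = {}
    for name, items in (("must", must), ("should", should), ("must_not", must_not)):
        if items:
            clauses[name] = list(items)
    return {"query": {"bool": clauses}}


body = bool_query(must=[
    {"query_string": {"query": "shoh*", "fields": ["billing_name", "shipping_name"]}},
    {"term": {"entity_id": 1}},
])
print(json.dumps(body, indent=2))
```

Both clauses sit under must, so a document has to match the query string and also have entity_id 1 to be returned.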

Page 26: Elastic search Walkthrough

Search Documents - Query Filters

• When doing things like facet navigation, sometimes only the hits need to be filtered by the chosen facet, while all the facets should continue to be calculated based on the original query. The filter element within the search request can be used to accomplish this.

    $elasticaQueryString = new Elastica_Query_QueryString();
    $elasticaQueryString->setQuery('*');
    $elasticaQueryString->setFields(array('increment_id'));

    $filteredQuery = new Elastica_Query_Filtered(
        $elasticaQueryString,
        new Elastica_Filter_Range('created_at', array(
            'from' => '2011-01-04 07:36:00',
            'to' => '2013-01-04 19:36:25'
        ))
    );

    $elasticaQuery = new Elastica_Query();
    $elasticaQuery->setQuery($filteredQuery);
    $elasticaResultSet = $elasticaType->search($elasticaQuery);
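A query wrapped together with a filter corresponds roughly to the JSON shape built below. This Python sketch assumes the `filtered` query form of the older Elasticsearch releases these slides target. The helper `filtered_query` is hypothetical, and the body Elastica actually sends may differ in detail.

```python
import json


def filtered_query(query, filter_clause):
    """Wrap a query and a filter in a filtered query body (hypothetical helper)."""
    return {"query": {"filtered": {"query": query, "filter": filter_clause}}}


body = filtered_query(
    {"query_string": {"query": "*", "fields": ["increment_id"]}},
    {"range": {"created_at": {"from": "2011-01-04 07:36:00",
                              "to": "2013-01-04 19:36:25"}}},
)
print(json.dumps(body, indent=2))
```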

Page 27: Elastic search Walkthrough

Elastica - Facet Terms

    $elasticaQuery = new Elastica_Query();
    $elasticaQuery->setQuery($boolQuery); // set main query
    $facet = new Elastica_Facet_Terms('status Facet');
    $facet->setField('status');
    $facet->setOrder('term'); // other options are reverse_term, count, reverse_count
    $facet->setSize(5);

    $elasticaQuery->addFacet($facet); // add facet to query
    $elasticaResultSet = $elasticaType->search($elasticaQuery);

    $facets = $elasticaResultSet->getFacets(); // get facets data
    foreach ($facets as $k => $v) {
        if (isset($v['terms']) && is_array($v['terms'])) {
            $data['facets'][$k] = $v['terms'];
        }
    }