CouchConf Israel 2013_Full Text Search
![Page 1: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/1.jpg)
1
Couchbase Server 2.0: Full Text Search Integration
Matt Ingenthron, Director, Developer Solutions
![Page 2: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/2.jpg)
2
Couchbase Server 2.0
[Diagram: a three-node cluster (SERVER 1, SERVER 2, SERVER 3); each server holds Active Docs and Replica Docs and serves Query / Response traffic.]
Distributed Indexing and Querying using Incremental Map Reduce
![Page 3: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/3.jpg)
3
Search Across Full JSON Body
{
  "name": "Abbey Belgian Style Ale",
  "description": "Winner of four World Beer Cup medals and eight medals at the Great American Beer Fest, Abbey Belgian Ale is the Mark Spitz of New Belgium’s lineup – but it didn’t start out that way."
}
Search term: abbey
![Page 5: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/5.jpg)
5
Integrate with ElasticSearch for Full Text Search
• Based on proven Apache Lucene technology
• Apache 2 Licensed with commercial support available
• Distributed
• Schema Free JSON Documents
• RESTful API
![Page 6: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/6.jpg)
6
ElasticSearch Terminology
• Document
  – Schema-less JSON…
  – Contains a set of fields
• Type
  – Contains a set of mappings describing how fields are indexed
• Index
  – Logical namespace for scoping indexing/searching
  – May contain documents of different types
  – Uniqueness by ID/Type
![Page 7: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/7.jpg)
7
How does it work?
ElasticSearch
Unidirectional Cross Data Center Replication
![Page 8: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/8.jpg)
8
GETTING STARTED
![Page 9: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/9.jpg)
9
Install the Couchbase Plug-In
• Pre-requisite
  – Existing Couchbase and ElasticSearch Clusters
• Install the ElasticSearch Couchbase Transport Plug-in
  – bin/plugin -install couchbaselabs/elasticsearch-transport-couchbase/1.0.0-beta
• Configure the Plug-in
  – Set a password
  – Install the Couchbase Index Template
• Restart ElasticSearch
• Create an ElasticSearch index for your documents
![Page 10: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/10.jpg)
10
Configure XDCR (part 1)
![Page 11: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/11.jpg)
11
Configure XDCR (part 2)
![Page 12: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/12.jpg)
12
Documents are now being indexed!
Document Count Increasing
![Page 13: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/13.jpg)
13
WHAT NOW?
![Page 14: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/14.jpg)
14
Document from Beer Sample Dataset
{
  "name": "Pabst Blue Ribbon",
  "abv": 4.74,
  "ibu": 0,
  "srm": 0,
  "upc": 0,
  "type": "beer",
  "brewery_id": "110f1d5dc2",
  "updated": "2010-07-22 20:00:20",
  "description": "PBR is not just any beer…",
  "style": "American-Style Light Lager",
  "category": "North American Lager"
}
![Page 15: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/15.jpg)
15
Simple ES Query with HTTP
• Search for any beer matching the term “lager”
  – GET http://127.0.0.1:9200/beer-sample/_search?q=lager
{
  "took": 7,
  "timed_out": false,
  "_shards": { ... },
  "hits": {
    "total": 1271,
    "max_score": 1.1145955,
    "hits": [...]
  }
}
Callouts on the response fields:
• "took": Total Search Execution Time (in milliseconds)
• "hits" → "total": Total Number of Documents Matching Query
• "hits" → "max_score": Maximum Score of All Matching Documents
• "hits" → "hits": Array of Matching Documents
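A minimal Python sketch of building the query URL from the slide and reading the annotated response fields. The helper names `build_search_url` and `summarize_search` are hypothetical, and the sample response is the one shown on the slide with its elided parts (`_shards`, the inner `hits` array) left empty.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, index, term):
    """Build a URI-search URL of the form <base>/<index>/_search?q=<term>."""
    return f"{base_url}/{index}/_search?{urlencode({'q': term})}"


def summarize_search(response_text):
    """Extract the four annotated fields from a search response body."""
    body = json.loads(response_text)
    hits = body["hits"]
    return {
        "took_ms": body["took"],           # total search execution time
        "total": hits["total"],            # number of matching documents
        "max_score": hits["max_score"],    # best score among the matches
        "ids": [hit["_id"] for hit in hits["hits"]],  # IDs of returned hits
    }


# Response from the slide; elided sections are left empty (assumption).
SAMPLE = (
    '{"took": 7, "timed_out": false, "_shards": {}, '
    '"hits": {"total": 1271, "max_score": 1.1145955, "hits": []}}'
)

print(build_search_url("http://127.0.0.1:9200", "beer-sample", "lager"))
print(summarize_search(SAMPLE))
```

Against a live cluster, the URL could be fetched with `urllib.request.urlopen` and the response text passed to `summarize_search`.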
![Page 20: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/20.jpg)
20
Single Search Result
"hits": [
  {
    "_index": "beer-sample",
    "_type": "couchbaseDocument",
    "_id": "110fc4b16b",
    "_score": 1.1145955,
    "_source": {
      "meta": {
        "id": "110fc4b16b",
        "rev": "1-001ba0044ce30dd50000000000000000",
        "flags": 0,
        "expiration": 0
      }
    }
  },
  …
]
ID of Matching Document
Where’s the document body?
![Page 22: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/22.jpg)
22
Recommended Usage Pattern
ElasticSearch
1. ElasticSearch Query
2. ElasticSearch Result
3. Couchbase Multi-GET
4. Couchbase Result
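The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: `run_es_query` and `multi_get` are hypothetical callables standing in for an HTTP search call and a Couchbase SDK multi-get, and the stub data below is invented for the demo.

```python
def matching_ids(es_result):
    """Step 2: collect document IDs from an ElasticSearch result, in score order."""
    return [hit["_id"] for hit in es_result["hits"]["hits"]]


def search_then_fetch(run_es_query, multi_get, query):
    """Query ElasticSearch for IDs, then fetch the full documents from Couchbase.

    run_es_query(query) -> parsed search response (hypothetical callable)
    multi_get(ids)      -> dict mapping id -> document (hypothetical callable)
    """
    es_result = run_es_query(query)          # steps 1 and 2
    ids = matching_ids(es_result)
    if not ids:
        return []
    docs_by_id = multi_get(ids)              # steps 3 and 4
    # Keep the relevance order from ElasticSearch; skip IDs that no longer exist.
    return [docs_by_id[doc_id] for doc_id in ids if doc_id in docs_by_id]


# Stubs standing in for the two clusters (invented data for illustration).
def fake_es(query):
    return {"hits": {"hits": [{"_id": "110fc4b16b"}, {"_id": "missing"}]}}


def fake_multi_get(ids):
    store = {"110fc4b16b": {"name": "Pabst Blue Ribbon", "type": "beer"}}
    return {i: store[i] for i in ids if i in store}


print(search_then_fetch(fake_es, fake_multi_get, "lager"))
```

Passing the two backends in as callables keeps the pattern independent of any particular SDK version.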
![Page 23: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/23.jpg)
23
Architecture Overview
[Diagram: App Server (Couchbase SDK; ES queries over HTTP), Couchbase Server Cluster (MR Views on each node), Index Server Cluster. Labeled flows: Data, Refs, ES Query, MR Query, and XDCR via the Couchbase ES Transport.]
![Page 24: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/24.jpg)
24
MORE ADVANCED CAPABILITIES
![Page 25: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/25.jpg)
25
Another Query with HTTP
• POST http://127.0.0.1:9200/default/_search
{
  "name": "Wild Blue Blueberry Lager",
  "abv": 8,
  "type": "beer",
  "brewery_id": "110f01abce",
  "updated": "2010-07-22 20:00:20",
  "description": "…ripe blueberry aroma…",
  "style": "Belgian-Style Fruit Lambic",
  "category": "Belgian and French Ale"
}
{
  "query": {
    "query_string": {
      "query": "style: lambic AND description: blueberry"
    }
  }
}
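A small Python sketch of assembling that POST request with the standard library. The helper name `query_string_request` is hypothetical; the URL, index name, and query text are the ones on the slide. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def query_string_request(base_url, index, field_terms):
    """Build a POST /<index>/_search request with a query_string query.

    field_terms is a list of (field, term) pairs joined with AND.
    """
    query = " AND ".join(f"{field}: {term}" for field, term in field_terms)
    body = {"query": {"query_string": {"query": query}}}
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = query_string_request(
    "http://127.0.0.1:9200",
    "default",
    [("style", "lambic"), ("description", "blueberry")],
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To run it against a cluster, pass `req` to `urllib.request.urlopen` and parse the JSON response.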
![Page 26: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/26.jpg)
26
Faceted Search
Categories
Items with Counts
Range Facets
![Page 27: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/27.jpg)
27
Faceted Search Query – Beer Style
{
  "query": {
    "query_string": {
      "query": "bud"
    }
  },
  "facets": {
    "styles": {
      "terms": {
        "field": "style",
        "size": 3
      }
    }
  }
}
![Page 28: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/28.jpg)
28
Faceted Search Results - Incorrect
"terms": [
  { "term": "style", "count": 8 },
  { "term": "lager", "count": 6 },
  { "term": "american", "count": 4 }
]
Style was “American-Style Lager”
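Why the facet comes back as fragments can be illustrated with a rough Python simulation. This is not ElasticSearch code: `analyzed_terms` is a hypothetical stand-in that approximates a default analyzer by lowercasing and splitting on non-alphanumeric characters, and the list of styles is invented for the demo, so the counts do not reproduce the slide.

```python
import re
from collections import Counter


def analyzed_terms(value):
    """Rough stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]


def terms_facet(values, analyzed):
    """Count facet terms, one count per document per distinct term."""
    counts = Counter()
    for value in values:
        terms = set(analyzed_terms(value)) if analyzed else {value}
        counts.update(terms)
    return counts


# Hypothetical "style" values from three matching documents.
styles = ["American-Style Lager", "American-Style Light Lager", "Belgian-Style White"]

print(terms_facet(styles, analyzed=True))   # fragments such as "style", "lager"
print(terms_facet(styles, analyzed=False))  # whole style names
```

With analysis on, every document contributes the fragment "style", which is why that term tops the facet; with the field left unanalyzed, each full style name is counted as one term.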
![Page 29: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/29.jpg)
29
Update the Mapping
• PUT /beer-sample/couchbaseDocument/_mapping
{
  "couchbaseDocument": {
    "properties": {
      "doc": {
        "properties": {
          "style": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
NOTE: When you change the mapping you MUST re-index.
![Page 30: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/30.jpg)
30
Faceted Search Results - Correct
"terms": [
  { "term": "American-Style Light Lager", "count": 5 },
  { "term": "American-Style Lager", "count": 2 },
  { "term": "Belgian-Style White", "count": 1 }
]
![Page 31: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/31.jpg)
31
Faceted Search Query – % Alcohol Range
{
  "query": {
    "query_string": {
      "query": "bud"
    }
  },
  "facets": {
    "abv": {
      "range": {
        "abv": [
          { "to": 3 },
          { "from": 3, "to": 5 },
          { "from": 5 }
        ]
      }
    }
  }
}
![Page 32: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/32.jpg)
32
Faceted Search Results - % Alcohol Range
"ranges": [
  { "to": 3, "count": 1 },
  { "from": 3, "to": 5, "count": 5 },
  { "from": 5, "count": 3 }
]
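The bucketing behind a range facet can be sketched in plain Python. This is a simulation, not ElasticSearch code; it assumes "from" is inclusive and "to" is exclusive, and the `abv` values are invented for the demo rather than taken from the beer-sample data.

```python
def range_facet(values, ranges):
    """Count values per range; 'from' is treated as inclusive, 'to' as exclusive."""
    results = []
    for r in ranges:
        low = r.get("from", float("-inf"))   # open-ended lower bound
        high = r.get("to", float("inf"))     # open-ended upper bound
        count = sum(1 for v in values if low <= v < high)
        results.append({**r, "count": count})
    return results


# Same three ranges as the facet query above.
ranges = [{"to": 3}, {"from": 3, "to": 5}, {"from": 5}]

# Hypothetical abv values for five matching beers.
abvs = [2.5, 4.2, 4.74, 5.0, 8.0]

for bucket in range_facet(abvs, ranges):
    print(bucket)
```

Under that boundary assumption a beer with an abv of exactly 5 lands in the last bucket, not the middle one.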
![Page 33: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/33.jpg)
33
Search Result Scoring
• Each matching document is assigned a score based on how well it matches the query
hits: [{
  "_index": "default",
  "_type": "couchbaseDocument",
  "_id": "35addbc374",
  "_score": 1.1306798,
  …
![Page 34: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/34.jpg)
34
Custom Scoring – Document Properties
• Each document has a numerical field “abv”
• Let’s use this field to boost the beer’s natural score
{
  "query": {
    "custom_score": {
      "query": {
        "query_string": {
          "query": "bud"
        }
      },
      "script": "_score * doc['abv'].value"
    }
  }
}
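The effect of the script `_score * doc['abv'].value` can be mimicked in Python to see how it reorders results. This is a simulation only: the function name `custom_score`, the hit IDs, scores, and abv values are all hypothetical.

```python
def custom_score(hits, abv_by_id):
    """Multiply each hit's natural score by its abv, then sort best-first."""
    rescored = [
        {**hit, "_score": hit["_score"] * abv_by_id[hit["_id"]]}
        for hit in hits
    ]
    return sorted(rescored, key=lambda hit: hit["_score"], reverse=True)


# Hypothetical hits and abv values, for illustration only.
hits = [
    {"_id": "beer-a", "_score": 1.5},
    {"_id": "beer-b", "_score": 1.0},
]
abv_by_id = {"beer-a": 4.0, "beer-b": 8.0}

for hit in custom_score(hits, abv_by_id):
    print(hit["_id"], hit["_score"])
```

Here the weaker text match overtakes the stronger one because its abv is twice as high, which is the trade-off to keep in mind when multiplying by a document property.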
![Page 35: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/35.jpg)
35
Custom Scoring – User Preferences
• Let users rank beer styles from 1 to 10
• User with no preferences set searches for “bud”

| Name | Style | Score |
|---|---|---|
| Bud Extra | | 1.5409653 |
| Bud Light Lime | American-Style Light Lager | 1.513119 |
| Bud Light Golden Wheat | Belgian-Style White | 1.3208274 |
| Bud Ice | American-Style Lager | 1.2839241 |
| Bud Ice Light | American-Style Lager | 1.2839241 |
| Bud Light | American-Style Light Lager | 1.245288 |
| Bud Dry | American-Style Light Lager | 1.1968427 |
| Budweiser Select | American-Style Light Lager | 0.8559494 |
| Miller Lite | American-Style Light Lager | 0.7201389 |
![Page 36: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/36.jpg)
36
Custom Scoring – User Preferences
• User ranks “Belgian-Style White” with value 10
{
  "query": {
    "custom_filters_score": {
      "query": {
        "text": { "_all": "bud" }
      },
      "filters": [
        {
          "filter": { "term": { "style": "Belgian-Style White" } },
          "boost": "10"
        }
      ],
      "score_mode": "first"
    }
  }
}
![Page 37: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/37.jpg)
37
Custom Scoring – User Preferences
| Name | Style | Score |
|---|---|---|
| Bud Light Golden Wheat | Belgian-Style White | 13.208274 |
| Bud Extra | | 1.5409653 |
| Bud Light Lime | American-Style Light Lager | 1.513119 |
| Bud Light Golden Wheat (score before boost) | Belgian-Style White | 1.3208274 |
| Bud Ice | American-Style Lager | 1.2839241 |
| Bud Ice Light | American-Style Lager | 1.2839241 |
| Bud Light | American-Style Light Lager | 1.245288 |
| Bud Dry | American-Style Light Lager | 1.1968427 |
| Budweiser Select | American-Style Light Lager | 0.8559494 |
| Miller Lite | American-Style Light Lager | 0.7201389 |
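The jump from 1.3208274 to 13.208274 is the original score multiplied by the boost of 10. A Python simulation of that "first matching filter" behavior, using three rows from the unboosted results, might look like this. The function name and the tuple-based filter format are hypothetical, and the semantics of `score_mode: first` are modeled as an assumption, not taken from ElasticSearch source.

```python
import math


def custom_filters_score(hits, filters):
    """Multiply each score by the boost of the first filter that matches.

    filters is a list of (field, value, boost) tuples; hits that match
    no filter keep their original score.
    """
    rescored = []
    for hit in hits:
        score = hit["score"]
        for field, value, boost in filters:
            if hit.get(field) == value:
                score *= boost
                break  # "first" mode: stop at the first matching filter
        rescored.append({**hit, "score": score})
    return sorted(rescored, key=lambda h: h["score"], reverse=True)


# Three rows from the unboosted result table.
hits = [
    {"name": "Bud Extra", "style": None, "score": 1.5409653},
    {"name": "Bud Light Lime", "style": "American-Style Light Lager", "score": 1.513119},
    {"name": "Bud Light Golden Wheat", "style": "Belgian-Style White", "score": 1.3208274},
]
user_prefs = [("style", "Belgian-Style White", 10)]

ranked = custom_filters_score(hits, user_prefs)
for hit in ranked:
    print(hit["name"], hit["score"])
assert math.isclose(ranked[0]["score"], 13.208274)
```

A user with several ranked styles would simply contribute one filter tuple per style.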
![Page 38: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/38.jpg)
38
Learning Portal – Proof of Concept
![Page 39: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/39.jpg)
39
NEXT STEPS
![Page 40: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/40.jpg)
40
Explore ElasticSearch Capabilities
• Customize Document Mappings
  – Default behavior isn’t always what you want
  – Index one field multiple ways
• Advanced Cluster Topologies
  – Dedicate nodes for routing/querying
• Rich Query DSL
ElasticSearch Guide: http://www.elasticsearch.org/guide/
![Page 41: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/41.jpg)
41
Couchbase ElasticSearch Future
• Release 1.0.0
• Possible features for future
  – More fine-grained cluster configuration
  – More index-level configuration
  – Pre-index script execution
  – Indexing non-JSON data
• Give us your feedback!
![Page 42: CouchConf Israel 2013_Full Text Search](https://reader035.fdocuments.in/reader035/viewer/2022062220/557d3f4ad8b42ac2788b5230/html5/thumbnails/42.jpg)
42
Resources
• Marty Schoch’s blog: http://blog.couchbase.com/couchbase-and-full-text-search-couchbase-transport-elastic-search
• https://github.com/couchbaselabs/elasticsearch-transport-couchbase