Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017
FULL TEXT SEARCH For Couchbase Documents
2017-10-26
Marty Schoch | Principal Engineer
[Diagram: Couchbase Data Platform. Services: Mobile & IoT, KV, Query. Layers: Elastic Scale Architecture, Memory-first Architecture, Unified Programming, Core Database Engine, Infrastructure (Cloud & Containers). Side labels: Cross Stack Security; SQL & Big Data Integrations.]
Couchbase Full-Text Search
[Diagram: Couchbase Data Platform with Search added. Services: Mobile & IoT, KV, Query, Search. Layers: Elastic Scale Architecture, Memory-first Architecture, Unified Programming, Core Database Engine, Infrastructure (Cloud & Containers). Side labels: Cross Stack Security; SQL & Big Data Integrations.]
Let's Go Back In Time
What a Long Strange Trip It's Been
2013: Bleve
Bleve
• Full Text Search in Go
• Text Analysis
• Indexing
• Searching
• Scoring
• Faceting
Text Analysis
Inverted Index
Indexed Terms | Document ID Postings List
cozi          | hotel_1289, hotel_3376, hotel_5022, hotel_9994
luxuri        | hotel_0092, hotel_1289, hotel_8989
small         | hotel_3376
spacious      | hotel_0092, hotel_1289, hotel_3376, hotel_5022
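A minimal sketch of how text analysis could feed an inverted index like the one in the table. The document texts, the stop-word list, and the `STEMS` lookup (a stand-in for a real stemmer) are assumptions made for illustration; only the terms and document IDs come from the slide, and this is not Bleve's implementation.

```python
import re
from collections import defaultdict

# Illustrative stop words; a real analyzer ships a much longer list.
STOP_WORDS = {"a", "an", "and", "the"}

# Lookup table standing in for a real stemmer (hypothetical, covers this example only).
STEMS = {"cozy": "cozi", "luxury": "luxuri", "luxurious": "luxuri"}


def analyze(text):
    """Lowercase, tokenize, drop stop words, then stem each token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [STEMS.get(t, t) for t in tokens if t not in STOP_WORDS]


def build_index(docs):
    """Map each analyzed term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Invented texts chosen so the result matches the slide's table.
docs = {
    "hotel_0092": "Luxurious and spacious",
    "hotel_1289": "Cozy, spacious luxury",
    "hotel_3376": "Small, cozy and spacious",
    "hotel_5022": "Cozy and spacious",
    "hotel_8989": "Luxury",
    "hotel_9994": "Cozy",
}

index = build_index(docs)
for term in sorted(index):
    print(term, index[term])
```

Because the same `analyze` step is applied to query text, a search for "Luxurious" would look up the term `luxuri` and find documents that only mention "luxury".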
Searching
• Match
• Match Phrase
• Prefix, Regex, Fuzzy
• Conjunction, Disjunction, Boolean
• Numeric and Date Ranges
• Query String
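Several of these query types reduce to set operations on postings lists. The sketch below evaluates term, prefix, conjunction, disjunction, and a simple must/must-not boolean against the postings from the Inverted Index slide. The function names are hypothetical and scoring is ignored; it only shows which documents match.

```python
# Postings copied from the Inverted Index slide.
INDEX = {
    "cozi": {"hotel_1289", "hotel_3376", "hotel_5022", "hotel_9994"},
    "luxuri": {"hotel_0092", "hotel_1289", "hotel_8989"},
    "small": {"hotel_3376"},
    "spacious": {"hotel_0092", "hotel_1289", "hotel_3376", "hotel_5022"},
}


def term(t):
    """Documents containing exactly this indexed term."""
    return set(INDEX.get(t, set()))


def prefix(p):
    """Union of postings for every indexed term starting with p."""
    result = set()
    for t, ids in INDEX.items():
        if t.startswith(p):
            result |= ids
    return result


def conjunction(*sets):
    """Documents matching every sub-query (AND)."""
    return set.intersection(*sets) if sets else set()


def disjunction(*sets):
    """Documents matching at least one sub-query (OR)."""
    return set().union(*sets)


def boolean(must, must_not):
    """Documents matching `must` but none of `must_not`."""
    return must - must_not


print(sorted(conjunction(term("cozi"), term("spacious"))))
print(sorted(disjunction(term("small"), term("luxuri"))))
print(sorted(prefix("s")))
print(sorted(boolean(term("cozi"), term("luxuri"))))
```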
Relevance Scoring
What a Long Strange Trip It's Been
2013: Bleve
2014: CBFT
cbft design / index partitioning
bucket partitions (1024 vbuckets): 0, 1, 2, 3, 4, …, 1021, 1022, 1023
index partitions (groups of vbuckets): A (0-399), B (400-799), C (800-1023)
assign to cbft nodes (replicas, too)
cbft nodes: X, Y, Z
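The mapping on this slide can be sketched as two lookups: vbucket to index partition, then index partition to cbft node. The vbucket ranges come from the slide. The primary and replica placement below is purely hypothetical, since the node assignment drawn on the slide did not survive in the transcript.

```python
NUM_VBUCKETS = 1024

# Index partitions as groups of vbuckets (ranges from the slide).
PARTITIONS = {
    "A": range(0, 400),
    "B": range(400, 800),
    "C": range(800, 1024),
}

# Hypothetical placement: one primary and one replica per index partition.
PRIMARY = {"A": "X", "B": "Y", "C": "Z"}
REPLICA = {"A": "Y", "B": "Z", "C": "X"}


def partition_for(vbucket):
    """Return the index partition that covers a given vbucket."""
    for name, vbuckets in PARTITIONS.items():
        if vbucket in vbuckets:
            return name
    raise ValueError(f"vbucket {vbucket} is outside 0..{NUM_VBUCKETS - 1}")


def nodes_for(vbucket):
    """Return (primary node, replica node) for the partition covering a vbucket."""
    name = partition_for(vbucket)
    return PRIMARY[name], REPLICA[name]


print(partition_for(399), partition_for(400), partition_for(1023))
print(nodes_for(850))
```

Keeping the replica on a different node from the primary is the design point: losing one node still leaves a copy of every index partition.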
cbft design / queries
[Diagram: your application connects over REST to one cbft node in a group of cbft nodes]
a query sent to any cbft node…
…is scatter / gathered to the other cbft nodes
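A minimal sketch of the scatter/gather idea: the coordinating node fans the query out, each node returns its own top hits, and the coordinator merges them into one ranked list. The per-node hits, scores, and function names are invented for illustration; real nodes would search their local index partitions over the network.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical (doc_id, score) hits held by each node's local index partitions.
NODE_HITS = {
    "X": [("hotel_1289", 0.92), ("hotel_0092", 0.40)],
    "Y": [("hotel_3376", 0.75)],
    "Z": [("hotel_5022", 0.81), ("hotel_9994", 0.10)],
}


def search_node(node, query, size):
    """Stand-in for a remote call: return this node's best `size` hits."""
    return heapq.nlargest(size, NODE_HITS[node], key=lambda hit: hit[1])


def scatter_gather(query, size):
    """Scatter the query to every node in parallel, then gather and merge."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda n: search_node(n, query, size), NODE_HITS))
    merged = [hit for partial in partials for hit in partial]
    return heapq.nlargest(size, merged, key=lambda hit: hit[1])


top = scatter_gather("spacious", size=3)
print(top)
```

Each node only needs to return `size` hits, because no hit outside a node's local top `size` can appear in the global top `size`.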
What a Long Strange Trip It's Been
2013: Bleve
2014: CBFT
2015: FTS Dev Preview
2016: CB 4.5 Dev Preview
2017: CB 5.0
Using FTS
Add FTS Node
Create FTS Index
Customize Field Behavior
Search within the Couchbase UI
Monitoring
Live Demos
Call Center
{
  "_id": "user_1",
  "address_1": "3217 Camylle Green Keys",
  "address_2": "Suite 172",
  "city": "North Emmalee",
  "company": null,
  "country": "LK",
  "created_on": 1482264366000,
  "dob": null,
  "doc_type": "user",
  "first_name": "Rigoberto",
  "gender": "F",
Mapping
Query
{
  "size": 10,
  "explain": true,
  "highlight": {},
  "fields": [
    "*"
  ],
  "query": {
    "boost": 1,
    "match": "valley streem",
    "fuzziness": 1,
    "field": "address"
  }
}
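The `"fuzziness": 1` setting lets the misspelled "streem" match indexed terms that are one edit away. The sketch below assumes fuzziness means Levenshtein edit distance and uses an invented list of indexed terms; it illustrates the matching rule only, not how the index finds candidate terms efficiently.

```python
def levenshtein(a, b):
    """Edit distance: minimum inserts, deletes, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_terms(query_term, indexed_terms, fuzziness):
    """Indexed terms within `fuzziness` edits of the query term."""
    return [t for t in indexed_terms if levenshtein(query_term, t) <= fuzziness]


# Hypothetical terms from an address field.
indexed_terms = ["green", "steam", "stream", "street", "valley"]

print(fuzzy_terms("streem", indexed_terms, fuzziness=1))
print(fuzzy_terms("valley", indexed_terms, fuzziness=1))
```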
E-commerce
{
  "title": "Ballet Dress-Up Fairy Tutu",
  "categories": [
    [
      "Clothing, Shoes & Jewelry",
      "Girls",
      "Clothing",
      "Active",
      "Active Skirts"
    ]
  ],
  "description": "This adorable basic ballerina tutu is perfect for dance recitals. Fairy Princes Dress up,
Mapping
Keyword Analyzer for Term Facets
Request Facets
"facets": {
  "Category": {
    "field": "categories",
    "size": 5
  },
  "Brand": {
    "field": "brand",
    "size": 5
  },
  "Price": {
    "field": "price",
    "size": 3,
    "numeric_ranges": [
      {
        "name": "Under $100",
        "max": 100
      },
      {
        "name": "$100 - $500",
        "max": 500,
        "min": 100
      },
      {
        "name": "$500 & Above",
        "min": 500
      }
    ]
  }
}
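The request asks for two term facets and one numeric-range facet. A rough sketch of how such counts could be computed over matching documents follows. The product data is invented, fields are treated as single-valued, and two readings are assumptions: that `min` is inclusive and `max` exclusive, and that `total` counts values seen while `other` is whatever the top `size` terms leave over. The second reading is at least consistent with the Brand facet in the response, where 12582 - (1450 + 959 + 898 + 664 + 449) = 8162.

```python
from collections import Counter

# Ranges copied from the "Price" facet request.
PRICE_RANGES = [
    {"name": "Under $100", "max": 100},
    {"name": "$100 - $500", "min": 100, "max": 500},
    {"name": "$500 & Above", "min": 500},
]


def term_facet(docs, field, size):
    """Count field values, keep the top `size`, and summarize the rest."""
    counts = Counter()
    missing = 0
    for doc in docs:
        if doc.get(field) is None:
            missing += 1              # document has no value for this field
        else:
            counts[doc[field]] += 1
    total = sum(counts.values())
    top = counts.most_common(size)
    other = total - sum(count for _, count in top)
    return {"field": field, "total": total, "missing": missing, "other": other,
            "terms": [{"term": t, "count": c} for t, c in top]}


def numeric_range_facet(docs, field, ranges):
    """Count documents per named range (assumed: min inclusive, max exclusive)."""
    counts = {r["name"]: 0 for r in ranges}
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue
        for r in ranges:
            if r.get("min", float("-inf")) <= value < r.get("max", float("inf")):
                counts[r["name"]] += 1
    return counts


# Hypothetical matching documents.
products = [
    {"brand": "Casio", "price": 19.99},
    {"brand": "Casio", "price": 100},
    {"brand": "Casio", "price": 249.5},
    {"brand": "Timex", "price": 500},
    {"brand": "Timex", "price": 1200},
    {"brand": "Seiko", "price": 80},
    {"price": 45},
]

brand = term_facet(products, "brand", size=2)
price = numeric_range_facet(products, "price", PRICE_RANGES)
print(brand)
print(price)
```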
Response Facets
{
  "Brand": {
    "field": "brand",
    "total": 12582,
    "missing": 8851,
    "other": 8162,
    "terms": [
      {
        "term": "Invicta",
        "count": 1450
      },
      {
        "term": "Casio",
        "count": 959
      },
      {
        "term": "Seiko",
        "count": 898
      },
      {
        "term": "Stuhrling Original",
        "count": 664
      },
      {
        "term": "Timex",
        "count": 449
      }
    ]
  },
  "Category": {
    "field": "categories",
    "total": 156064,
    "missing": 0,
Query
"query": {
  "conjuncts": [
    {
      "query": "watch"
    }
  ]
},
Query
"query": {
  "conjuncts": [
    {
      "query": "watch"
    },
    {
      "field": "categories",
      "term": "Men"
    }
  ]
},
Query
"query": {
  "conjuncts": [
    {
      "query": "watch"
    },
    {
      "field": "categories",
      "term": "Men"
    },
    {
      "field": "brand",
      "term": "Casio"
    }
  ]
},
Query
"query": {
  "conjuncts": [
    {
      "query": "watch"
    },
    {
      "field": "categories",
      "term": "Men"
    },
    {
      "field": "brand",
      "term": "Casio"
    },
    {
      "field": "price",
      "min": 500
    }
  ]
},
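The four queries above show a facet drill-down: each facet the user selects appends one more clause to the `conjuncts` list. A small sketch of a builder that produces the same JSON follows; `drill_down_query` and its parameters are hypothetical names, not part of any Couchbase SDK.

```python
import json


def drill_down_query(text, term_filters=(), min_price=None):
    """Build a conjunction query: free text plus one clause per selected facet."""
    conjuncts = [{"query": text}]
    for field, term in term_filters:
        conjuncts.append({"field": field, "term": term})
    if min_price is not None:
        conjuncts.append({"field": "price", "min": min_price})
    return {"query": {"conjuncts": conjuncts}}


# Step 1: plain search. Step 4: after selecting Men, Casio, and "$500 & Above".
first = drill_down_query("watch")
last = drill_down_query(
    "watch",
    term_filters=[("categories", "Men"), ("brand", "Casio")],
    min_price=500,
)
print(json.dumps(last, indent=2))
```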
Geo Local Results (dev preview)
{
  "name": "Tied House Cafe & Brewery - San Jose",
  "geo": {
    "accuracy": "ROOFTOP",
    "lat": 37.3362,
    "lon": -121.894
  },
  "address": [
    "65 North San Pedro"
  ],
  "city": "San Jose",
  "code": "95110",
Mapping
Query
{
  "query": {
    "conjuncts": [
      {
        "query": "brewery"
      },
      {
        "location": {
          "lon": -121.888889,
          "lat": 37.328611
        },
        "distance": "20mi",
        "field": "geo"
      }
    ]
  }
}
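The second clause keeps only documents whose `geo` point lies within 20 miles of the query location. The sketch below checks that predicate with the haversine great-circle formula, using the coordinates from the slides. It illustrates the filter's meaning only; how the search index evaluates it internally is not covered in the slides. The helper names and the mile-only distance parser are assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def parse_miles(distance):
    """Parse strings like "20mi"; this sketch handles miles only."""
    if not distance.endswith("mi"):
        raise ValueError("only the 'mi' unit is handled in this sketch")
    return float(distance[:-2])


def within(doc_geo, location, distance):
    """True when the document's point is within `distance` of `location`."""
    miles = haversine_miles(location["lat"], location["lon"], doc_geo["lat"], doc_geo["lon"])
    return miles <= parse_miles(distance)


query_location = {"lon": -121.888889, "lat": 37.328611}
tied_house = {"lat": 37.3362, "lon": -121.894}

d = haversine_miles(query_location["lat"], query_location["lon"],
                    tied_house["lat"], tied_house["lon"])
print(round(d, 2), within(tied_house, query_location, "20mi"))
```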
Real-World Enterprise Example
Enabling collaborative storytelling by “transforming your team into your film crew”.
Machine learning analyzes media files for audio/visual attributes, tags, and captioning; the results are stored as documents in Couchbase for rapid search and retrieval.
Very rapid development process… “hardest part was deciding when to stop”
Seenit Studio
seenit.io
What a Long Strange Trip It's Been
2013: Bleve
2014: CBFT
2015: Developer Preview
2016: CB 4.5
2017: CB 5.0
Future: ???
Future
• Performance
• N1QL Integration