Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

FULL TEXT SEARCH For Couchbase Documents 2017-10-26 Marty Schoch | Principal Engineer

Transcript of Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Page 1: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

FULL TEXT SEARCH For Couchbase Documents

2017-10-26

Marty Schoch | Principal Engineer

Page 2: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

[Diagram: Couchbase Data Platform]

Services: Mobile & IoT, KV, Query

Other labels: Elastic Scale Architecture, Memory-first Architecture, Unified Programming, Core Database Engine, Infrastructure - Cloud & Containers

Side labels: Cross Stack Security, SQL & Big Data Integrations

Page 3: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Couchbase Full-Text Search

Page 4: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

[Diagram: Couchbase Data Platform]

Services: Mobile & IoT, KV, Query

Other labels: Elastic Scale Architecture, Memory-first Architecture, Unified Programming, Core Database Engine, Infrastructure - Cloud & Containers

Side labels: Cross Stack Security, SQL & Big Data Integrations

Page 5: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

[Diagram: Couchbase Data Platform]

Services: Mobile & IoT, KV, Query, Search

Other labels: Elastic Scale Architecture, Memory-first Architecture, Unified Programming, Core Database Engine, Infrastructure - Cloud & Containers

Side labels: Cross Stack Security, SQL & Big Data Integrations

Page 6: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Let's Go Back In Time

Page 7: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

What a Long Strange Trip It's Been

2013: Bleve

Page 8: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Bleve

• Full Text Search in Go

• Text Analysis

• Indexing

• Searching

• Scoring

• Faceting

Page 9: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Text Analysis

Page 10: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Inverted Index

Indexed Term | Document ID Postings List
cozi         | hotel_1289, hotel_3376, hotel_5022, hotel_9994
luxuri       | hotel_0092, hotel_1289, hotel_8989
small        | hotel_3376
spacious     | hotel_0092, hotel_1289, hotel_3376, hotel_5022
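As a rough illustration of how an inverted index answers queries, the sketch below loads the posting lists from this slide into a dictionary and intersects or unions them. The helper names (`postings`, `conjunction`, `disjunction`) are hypothetical and are not Bleve's API; the terms are assumed to be already analyzed (for example, stemmed to `cozi` and `luxuri`).

```python
# Posting lists copied from the slide. Helper names are hypothetical.
INVERTED_INDEX = {
    "cozi": ["hotel_1289", "hotel_3376", "hotel_5022", "hotel_9994"],
    "luxuri": ["hotel_0092", "hotel_1289", "hotel_8989"],
    "small": ["hotel_3376"],
    "spacious": ["hotel_0092", "hotel_1289", "hotel_3376", "hotel_5022"],
}


def postings(term):
    """Return the posting list for an already-analyzed term (empty if unknown)."""
    return INVERTED_INDEX.get(term, [])


def conjunction(*terms):
    """Documents that contain every term: intersect the posting lists."""
    if not terms:
        return []
    result = set(postings(terms[0]))
    for term in terms[1:]:
        result &= set(postings(term))
    return sorted(result)


def disjunction(*terms):
    """Documents that contain at least one term: union the posting lists."""
    result = set()
    for term in terms:
        result |= set(postings(term))
    return sorted(result)


print(conjunction("cozi", "spacious"))
print(disjunction("small", "luxuri"))
```

A real engine keeps posting lists sorted and merges them without building sets, but the set form keeps the idea visible.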

Page 11: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Searching

• Match

• Match Phrase

• Prefix, Regex, Fuzzy

• Conjunction, Disjunction, Boolean

• Numeric and Date Ranges

• Query String

Page 12: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Relevance Scoring

Page 13: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

What a Long Strange Trip It's Been

2013: Bleve
2014: CBFT

Page 14: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

Page 15: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

index partitions: A, B, C

Page 16: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

index partitions: A, B, C

Page 17: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

index partitions (groups of vbuckets): A = 0-399, B = 400-799, C = 800-1023

Page 18: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

index partitions (groups of vbuckets): A = 0-399, B = 400-799, C = 800-1023

cbft nodes: X

Page 19: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

index partitions (groups of vbuckets): A = 0-399, B = 400-799, C = 800-1023

assign to cbft nodes:

cbft nodes: X

Page 20: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

index partitions (groups of vbuckets): A = 0-399, B = 400-799, C = 800-1023

assign to cbft nodes:

cbft nodes: X, Y, Z

Page 21: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / index partitioning

bucket partitions: 0, 1, 2, 3, 4, …, 1021, 1022, 1023 (1024 vbuckets)

index partitions (groups of vbuckets): A = 0-399, B = 400-799, C = 800-1023

assign to cbft nodes (replicas, too):

cbft nodes: X, Y, Z
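The partitioning steps on these slides can be sketched in a few lines. The vbucket ranges and the node names X, Y, Z come from the slides. The placement rule (primary on one node, replica on the next node in the list) is only an assumption for illustration, since the slides do not say how cbft actually places partitions; `index_partition` and `assign` are hypothetical names.

```python
from bisect import bisect_right

NUM_VBUCKETS = 1024
# Ranges from the slide: A = 0-399, B = 400-799, C = 800-1023.
PARTITION_STARTS = [0, 400, 800]
PARTITION_NAMES = ["A", "B", "C"]
NODES = ["X", "Y", "Z"]


def index_partition(vbucket):
    """Map a vbucket id to the index partition that covers it."""
    if not 0 <= vbucket < NUM_VBUCKETS:
        raise ValueError(f"vbucket out of range: {vbucket}")
    return PARTITION_NAMES[bisect_right(PARTITION_STARTS, vbucket) - 1]


def assign(partitions, nodes, replicas=1):
    """Assumed round-robin placement: primary on node i, replicas on the next nodes."""
    if replicas >= len(nodes):
        raise ValueError("need more nodes than replicas")
    plan = {}
    for i, name in enumerate(partitions):
        primary = nodes[i % len(nodes)]
        copies = [nodes[(i + k) % len(nodes)] for k in range(1, replicas + 1)]
        plan[name] = {"primary": primary, "replicas": copies}
    return plan


print(index_partition(512))
print(assign(PARTITION_NAMES, NODES))
```

Keeping replicas on a different node than the primary is the point of the replica step: losing one node still leaves a copy of every index partition.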

Page 22: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

cbft design / queries

[Diagram: your application sends a REST request to one cbft node, which is connected to two other cbft nodes]

a query sent to any cbft node…

…is scatter / gathered to the other cbft nodes
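A minimal sketch of the scatter / gather idea follows. The per-node hits and scores are made up for illustration, and `search_node` is a hypothetical stand-in for the REST call to each cbft node; none of this is cbft's actual code. The coordinating node fans the query out in parallel, then merges the partial results and keeps the best `size` hits.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Made-up (score, doc_id) hits held by each node's index partitions.
NODE_HITS = {
    "X": [(0.9, "doc_12"), (0.4, "doc_7")],
    "Y": [(0.8, "doc_512")],
    "Z": [(0.95, "doc_900"), (0.1, "doc_1001")],
}


def search_node(node, query):
    """Hypothetical stand-in for a REST call to one cbft node."""
    return NODE_HITS[node]


def scatter_gather(query, nodes, size=3):
    """Scatter the query to every node, then gather and merge the top hits."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        per_node = list(pool.map(lambda n: search_node(n, query), nodes))
    all_hits = [hit for hits in per_node for hit in hits]
    # Highest scores first, truncated to the requested result size.
    return heapq.nlargest(size, all_hits)


print(scatter_gather("brewery", ["X", "Y", "Z"]))
```

Because any node can play the coordinator role, the application does not need to know which node holds which index partition.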

Page 23: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

What a Long Strange Trip It's Been

2013: Bleve
2014: CBFT
2015: FTS Dev Preview

Page 24: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

What a Long Strange Trip It's Been

2013: Bleve
2014: CBFT
2015: FTS Dev Preview
2016: CB 4.5 Dev Preview

Page 25: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

What a Long Strange Trip It's Been

2013: Bleve
2014: CBFT
2015: FTS Dev Preview
2016: CB 4.5 Dev Preview
2017: CB 5.0

Page 26: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Using FTS

Page 27: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Add FTS Node

Page 28: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Create FTS Index

Page 29: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Customize Field Behavior

Page 30: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Search within the Couchbase UI

Page 31: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Monitoring

Page 32: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Live Demos

Page 33: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Call Center

{
  "_id": "user_1",
  "address_1": "3217 Camylle Green Keys",
  "address_2": "Suite 172",
  "city": "North Emmalee",
  "company": null,
  "country": "LK",
  "created_on": 1482264366000,
  "dob": null,
  "doc_type": "user",
  "first_name": "Rigoberto",
  "gender": "F",

Page 34: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Mapping

Page 35: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Query

{
  "size": 10,
  "explain": true,
  "highlight": {},
  "fields": [
    "*"
  ],
  "query": {
    "boost": 1,
    "match": "valley streem",
    "fuzziness": 1,
    "field": "address"
  }
}
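The misspelled `streem` in this query can still match because of `"fuzziness": 1`. The sketch below assumes fuzziness is a maximum Levenshtein edit distance per term, and shows which terms fall within distance 1 of `streem`. The candidate term list is hypothetical (the slide does not show the indexed address terms), and the functions are an illustration rather than Bleve's implementation.

```python
def levenshtein(a, b):
    """Edit distance: minimum single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_terms(query_term, indexed_terms, fuzziness):
    """Indexed terms within the allowed edit distance of the query term."""
    return [t for t in indexed_terms if levenshtein(query_term, t) <= fuzziness]


# Hypothetical indexed terms, for illustration only.
candidates = ["stream", "street", "strand", "streem"]
print(fuzzy_terms("streem", candidates, fuzziness=1))
```

Raising the fuzziness widens the set of candidate terms, which improves recall on typos at the cost of more work per query and more loosely related matches.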

Page 36: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

E-commerce

{
  "title": "Ballet Dress-Up Fairy Tutu",
  "categories": [
    [
      "Clothing, Shoes & Jewelry",
      "Girls",
      "Clothing",
      "Active",
      "Active Skirts"
    ]
  ],
  "description": "This adorable basic ballerina tutu is perfect for dance recitals. Fairy Princes Dress up,

Page 37: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Mapping

Page 38: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Keyword Analyzer for Term Facets

Page 39: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Request Facets

"facets": {
  "Category": {
    "field": "categories",
    "size": 5
  },
  "Brand": {
    "field": "brand",
    "size": 5
  },
  "Price": {
    "field": "price",
    "size": 3,
    "numeric_ranges": [
      {
        "name": "Under $100",
        "max": 100
      },
      {
        "name": "$100 - $500",
        "max": 500,
        "min": 100
      },
      {
        "name": "$500 & Above",
        "min": 500
      }
    ]
  }
}
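To make the `numeric_ranges` part of the request concrete, here is a small sketch that sorts prices into the three ranges from the slide. It assumes `min` is inclusive and `max` is exclusive, so a price of exactly 100 lands in "$100 - $500"; treat that boundary rule as an assumption to verify against the FTS documentation. The sample prices and the function names are hypothetical.

```python
# Ranges copied from the "Price" facet in the request above.
PRICE_RANGES = [
    {"name": "Under $100", "max": 100},
    {"name": "$100 - $500", "min": 100, "max": 500},
    {"name": "$500 & Above", "min": 500},
]


def bucket_for(price, ranges=PRICE_RANGES):
    """Name of the first range containing price (assumed: min inclusive, max exclusive)."""
    for r in ranges:
        low = r.get("min", float("-inf"))
        high = r.get("max", float("inf"))
        if low <= price < high:
            return r["name"]
    return None


def count_facets(prices, ranges=PRICE_RANGES):
    """Count how many prices fall into each named range."""
    counts = {r["name"]: 0 for r in ranges}
    for price in prices:
        name = bucket_for(price, ranges)
        if name is not None:
            counts[name] += 1
    return counts


# Hypothetical prices, for illustration only.
print(count_facets([20, 100, 250, 500, 1200]))
```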

Page 40: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Response Facets

{
  "Brand": {
    "field": "brand",
    "total": 12582,
    "missing": 8851,
    "other": 8162,
    "terms": [
      {
        "term": "Invicta",
        "count": 1450
      },
      {
        "term": "Casio",
        "count": 959
      },
      {
        "term": "Seiko",
        "count": 898
      },
      {
        "term": "Stuhrling Original",
        "count": 664
      },
      {
        "term": "Timex",
        "count": 449
      }
    ]
  },
  "Category": {
    "field": "categories",
    "total": 156064,
    "missing": 0,

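The Brand numbers in this response are consistent with reading `other` as everything counted in `total` that did not make the top five terms. That reading is an interpretation of the slide, not a statement of the API contract; the check below just does the arithmetic on the values shown, with `other_count` as a hypothetical helper.

```python
# Values copied from the "Brand" facet in the response above.
BRAND_FACET = {
    "field": "brand",
    "total": 12582,
    "missing": 8851,
    "other": 8162,
    "terms": [
        {"term": "Invicta", "count": 1450},
        {"term": "Casio", "count": 959},
        {"term": "Seiko", "count": 898},
        {"term": "Stuhrling Original", "count": 664},
        {"term": "Timex", "count": 449},
    ],
}


def other_count(facet):
    """Total minus the counts of the returned top terms."""
    return facet["total"] - sum(t["count"] for t in facet["terms"])


# 12582 - (1450 + 959 + 898 + 664 + 449) = 12582 - 4420 = 8162
print(other_count(BRAND_FACET), BRAND_FACET["other"])
```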
Page 41: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Query

"query": {
  "conjuncts": [
    {
      "query": "watch"
    }
  ]
},

Page 42: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Query

"query": {
  "conjuncts": [
    {
      "query": "watch"
    },
    {
      "field": "categories",
      "term": "Men"
    }
  ]
},

Page 43: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Query

"query": {
  "conjuncts": [
    {
      "query": "watch"
    },
    {
      "field": "categories",
      "term": "Men"
    },
    {
      "field": "brand",
      "term": "Casio"
    }
  ]
},

Page 44: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Query

"query": {
  "conjuncts": [
    {
      "query": "watch"
    },
    {
      "field": "categories",
      "term": "Men"
    },
    {
      "field": "brand",
      "term": "Casio"
    },
    {
      "field": "price",
      "min": 500
    }
  ]
},
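The last four slides build one conjunction query step by step as the user clicks facets. A sketch of how an application might assemble that JSON is below; `drill_down_query` and its parameters are hypothetical names, and only the query shapes shown on the slides are used.

```python
import json


def drill_down_query(text, term_filters=(), min_price=None):
    """Build the "query" fragment: free-text search plus one conjunct per selected facet."""
    conjuncts = [{"query": text}]
    for field, term in term_filters:
        conjuncts.append({"field": field, "term": term})
    if min_price is not None:
        conjuncts.append({"field": "price", "min": min_price})
    return {"query": {"conjuncts": conjuncts}}


# Step 1: text search only. Step 4: category, brand, and price facets selected.
step1 = drill_down_query("watch")
step4 = drill_down_query(
    "watch",
    term_filters=[("categories", "Men"), ("brand", "Casio")],
    min_price=500,
)
print(json.dumps(step4, indent=2))
```

Each facet click only appends a conjunct, so every added filter narrows the previous result set.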

Page 45: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Geo Local Results (dev preview)

{
  "name": "Tied House Cafe & Brewery - San Jose",
  "geo": {
    "accuracy": "ROOFTOP",
    "lat": 37.3362,
    "lon": -121.894
  },
  "address": [
    "65 North San Pedro"
  ],
  "city": "San Jose",
  "code": "95110",

Page 46: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Mapping

Page 47: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Query

{
  "query": {
    "conjuncts": [
      {
        "query": "brewery"
      },
      {
        "location": {
          "lon": -121.888889,
          "lat": 37.328611
        },
        "distance": "20mi",
        "field": "geo"
      }
    ]
  }
}
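To see why the Tied House document from the earlier slide satisfies the `"distance": "20mi"` conjunct, the sketch below computes the great-circle distance between the query location and the document's `geo` field using the haversine formula. This is an illustration under stated assumptions: the slides do not say which distance formula FTS uses, and the Earth radius constant and function names are choices made here.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within(doc_geo, location, miles):
    """True if the document's point lies within `miles` of the query location."""
    d = haversine_miles(doc_geo["lat"], doc_geo["lon"], location["lat"], location["lon"])
    return d <= miles


doc_geo = {"lat": 37.3362, "lon": -121.894}        # from the result document
location = {"lat": 37.328611, "lon": -121.888889}  # from the query
print(within(doc_geo, location, 20))
```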

Page 48: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Real-World Enterprise Example

Enabling collaborative storytelling by “transforming your team into your film crew”.

Machine learning analyzes media files for audio/visual attributes and tags, captioning, stored as documents in Couchbase for rapid search and retrieval. Very rapid development process… “hardest part was deciding when to stop”

Seenit Studio

seenit.io

Page 49: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

What a Long Strange Trip It's Been

2013: Bleve
2014: CBFT
2015: Developer Preview
2016: CB 4.5
2017: CB 5.0
Future: ???

Page 50: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Page 51: Enabling Full Text Search for Couchbase documents – Connect Silicon Valley 2017

Future

• Performance

• N1QL Integration