Couchbase N1QL: Language & Architecture Overview.

N1QL : OVERVIEW, LANGUAGE, ARCHITECTURE Keshav Murthy Senior Director, Couchbase R&D N1QL WORKSHOP


Page 1: Couchbase N1QL: Language & Architecture Overview.

N1QL : OVERVIEW, LANGUAGE, ARCHITECTURE

Keshav Murthy

Senior Director, Couchbase R&D

N1QL WORKSHOP

Page 2: Couchbase N1QL: Language & Architecture Overview.

AGENDA

01 Agenda for the day
02 Motivation for N1QL
03 Language Overview
04 Architecture
05 Overview of N1QL Features in 5.0

Page 3: Couchbase N1QL: Language & Architecture Overview.

1 AGENDA FOR THE DAY

Page 4: Couchbase N1QL: Language & Architecture Overview.

Time | Topic | Speaker
8:30 AM – 9:00 AM | Registration and Breakfast |
9:00 AM – 9:45 AM | N1QL: Overview, Language, Architecture | Keshav Murthy
9:45 AM – 10:15 AM | Query Developer Tooling in 5.0 | Eben Haber
10:15 AM – 11:00 AM | Deep dive into Data modeling | Dean Proctor
11:00 AM – 11:15 AM | Break |
11:15 AM – 12:00 PM | Index Advisor: Rules for creating indexes | Keshav & Sitaram
12:00 PM – 1:00 PM | Lunch |
1:00 PM – 2:00 PM | N1QL Query Optimizer and Improvements in 5.0 | Sitaram Vemulapalli
2:00 PM – 2:45 PM | Query Monitoring, profiling and troubleshooting in 5.0 | Marco Greco
2:45 PM – 3:00 PM | Break | All
3:00 PM – 3:45 PM | Indexing Manageability in 5.0 | Deepkaran Salooja
3:45 PM – 4:30 PM | Security: RBAC: Role Based Access Control in 5.0 | Johan Larson
4:30 PM – 5:15 PM | Mindmap for Oracle (& other RDBMS) Developers | Raju Suravarjjala
5:15 PM – END | Q & A |

Page 5: Couchbase N1QL: Language & Architecture Overview.

2 MOTIVATIONS FOR N1QL

Page 6: Couchbase N1QL: Language & Architecture Overview.


Page 7: Couchbase N1QL: Language & Architecture Overview.


SQL SQL>

Page 8: Couchbase N1QL: Language & Architecture Overview.

SQL

• Hide the complexity of relational calculus

• English Language based access to Data

• Language to manipulate data easily

• Targeted for developers & professionals

• SELECT, INSERT, UPDATE, DELETE, …

• Arithmetic and logical operators

• Data Types

• Schema management

• All of original SQL capabilities

• ++++++

• Transactions – ACID

• Stored procedures, Triggers, Indexes

• Views, Materialized Views

• Optimizers: Rule & Cost based optimizers

• Spatial, Search, integrations

• Storage optimizations

• SMP, MPP innovations

• Performance: TPC

• Distributed Query Processing

• Two-phase commits

• T-SQL, PL/SQL,…

• JDBC, ODBC drivers.

• Tools

“Life as we know it depends on SQL & OLTP”


Page 9: Couchbase N1QL: Language & Architecture Overview.


SQL: Progress

• 1970 : Codd developed relational model.

“A Relational Model of Data for Large Shared Data Banks”

• 1974 : SEQUEL paper by Don Chamberlin and Raymond Boyce

“SEQUEL: A Structured English Query Language”

• 1974-1979 : System R with SEQUEL (later SQL) at IBM Research Lab

• 1979 : Oracle markets first relational database with SQL

• 1986 : ANSI SQL Standard released

• 1989, 1992, 1999, 2003, 2008, 2011, 2016: Major ANSI standard updates

Page 10: Couchbase N1QL: Language & Architecture Overview.


ResultSet

Relations/Tuples

Page 11: Couchbase N1QL: Language & Architecture Overview.

{
  "Name": "Jane Smith",
  "DOB": "1990-01-30",
  "Billing": [
    { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" },
    { "type": "master", "cardnum": "6274-2842-2847-3909", "expiry": "2019-03" }
  ],
  "Connections": [
    { "CustId": "XYZ987", "Name": "Joe Smith" },
    { "CustId": "PQR823", "Name": "Dylan Smith" },
    { "CustId": "PQR823", "Name": "Dylan Smith" }
  ],
  "Purchases": [
    { "id": 12, "item": "mac", "amt": 2823.52 },
    { "id": 19, "item": "ipad2", "amt": 623.52 }
  ]
}

LoyaltyInfo

Results

Orders

CUSTOMER

• NoSQL systems provide specialized APIs

• Key-Value get and set

• Each task requires custom built program

• Should test & maintain it

Page 12: Couchbase N1QL: Language & Architecture Overview.

Find High-Value Customers with Orders > $10000

1. Query customer objects from database
2. For each customer object:
   • Find all the order objects for the customer
   • Calculate the total amount for each order
   • Sum up the grand total amount for all orders
   • If grand total amount > $10000, extract customer data
   • Add customer to the high-value customer list
3. Sort the high-value customer list

• Complex code and logic
• Inefficient processing on client side
• Query will be slow

LOOPING OVER MILLIONS OF CUSTOMERS IN APPLICATION!!!
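The client-side loop described on this slide can be sketched in a few lines of Python. This is only an illustration of the pattern the slide warns against, not code from the talk: the field names (`id`, `name`, `custid`, `lineitems`, `amount`), the in-memory lists, and the descending sort are all assumptions.

```python
def high_value_customers(customers, orders, threshold=10000.0):
    """Naive application-side processing: every customer and order is pulled
    into the client and scanned in nested loops (hypothetical field names)."""
    result = []
    for cust in customers:                       # for each customer object
        grand_total = 0.0
        for order in orders:                     # find the customer's orders
            if order["custid"] != cust["id"]:
                continue
            # total amount for this order
            order_total = sum(item["amount"] for item in order["lineitems"])
            grand_total += order_total           # grand total over all orders
        if grand_total > threshold:              # keep only high-value customers
            result.append({"id": cust["id"], "name": cust["name"],
                           "total": grand_total})
    # sort the high-value list (largest total first is an assumption)
    result.sort(key=lambda c: c["total"], reverse=True)
    return result
```

Every document crosses the network before any filtering happens, and the nested scan grows with customers × orders, which is the cost the slide is pointing at.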

Page 13: Couchbase N1QL: Language & Architecture Overview.

{
  "Name": "Jane Smith",
  "DOB": "1990-01-30",
  "Billing": [
    { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" },
    { "type": "master", "cardnum": "6274-2842-2847-3909", "expiry": "2019-03" }
  ],
  "Connections": [
    { "CustId": "XYZ987", "Name": "Joe Smith" },
    { "CustId": "PQR823", "Name": "Dylan Smith" },
    { "CustId": "PQR823", "Name": "Dylan Smith" }
  ],
  "Purchases": [
    { "id": 12, "item": "mac", "amt": 2823.52 },
    { "id": 19, "item": "ipad2", "amt": 623.52 }
  ]
}

LoyaltyInfo

ResultDocuments

Orders

CUSTOMER

Page 14: Couchbase N1QL: Language & Architecture Overview.

N1QL = SQL + JSON

Give developers and enterprises an expressive, powerful, and complete language for querying, transforming, and manipulating JSON data.

Page 15: Couchbase N1QL: Language & Architecture Overview.


Why SQL for NoSQL?

Page 16: Couchbase N1QL: Language & Architecture Overview.

3 LANGUAGE OVERVIEW

Page 17: Couchbase N1QL: Language & Architecture Overview.

N1QL : Data Types from JSON

Data Type | Example
Numbers | { "id": 5, "balance": 2942.59 }
Strings | { "name": "Joe", "city": "Morrisville" }
Boolean | { "premium": true, "balance pending": false }
Null | { "last_address": null }
Array | { "hobbies": ["tennis", "skiing", "lego"] }
Object | { "address": { "street": "1, Main street", "city": "Morrisville", "state": "CA", "zip": "94824" } }
MISSING |

Arrays of objects of arrays

[
  {
    "type": "visa",
    "cardnum": "5827-2842-2847-3909",
    "expiry": "2019-03"
  },
  {
    "type": "master",
    "cardnum": "6274-2542-5847-3949",
    "expiry": "2018-12"
  }
]

Page 18: Couchbase N1QL: Language & Architecture Overview.


N1QL : Data Type Handling

Non-JSON data types

• MISSING

• Binary

Data type handling

• Date functions for string and numeric encodings

• Total ordering across all data types

• Well defined semantics for ORDER BY and comparison operators

• Defined expression semantics for all input data types

• No type mismatch errors
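"Total ordering across all data types" means any two values can be compared, even when their types differ. A minimal Python sketch of the idea follows. The cross-type order used here (MISSING, NULL, FALSE, TRUE, number, string, array, object) is stated from memory of the N1QL documentation and should be treated as an assumption; binary values are left out, and the object comparison is a simplification rather than N1QL's exact rule. `MISSING`, `type_rank`, and `sort_key` are hypothetical names.

```python
MISSING = object()  # sentinel for "field not present", distinct from JSON null


def type_rank(value):
    """Rank of a value's type in the assumed cross-type order."""
    if value is MISSING:
        return 0
    if value is None:
        return 1
    if value is False:
        return 2
    if value is True:
        return 3
    if isinstance(value, (int, float)):
        return 4
    if isinstance(value, str):
        return 5
    if isinstance(value, list):
        return 6
    if isinstance(value, dict):
        return 7
    raise TypeError(f"not a JSON value: {value!r}")


def sort_key(value):
    """Key that makes values of different types mutually comparable."""
    rank = type_rank(value)
    if rank <= 3:                       # MISSING, null, false, true
        return (rank, 0)
    if rank in (4, 5):                  # numbers and strings compare natively
        return (rank, value)
    if rank == 6:                       # arrays: element by element
        return (rank, tuple(sort_key(v) for v in value))
    # objects: simplified to comparing sorted (name, value) pairs
    return (rank, tuple(sorted((k, sort_key(v)) for k, v in value.items())))
```

With such a key, `sorted(values, key=sort_key)` never raises a type mismatch error, which mirrors the "no type mismatch errors" bullet.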

Page 19: Couchbase N1QL: Language & Architecture Overview.


Considerations: SQL for NoSQL

• Flexible Schema

• Cannot rely on predefined schema

• columns, data types, data comparison

• Nested Objects

• support scalars, objects and array

• SQL operators on nested objects

• Work with distributed data store

• Query Performance

• Optimization

• Exploit data store performance

• Design right kinds of indices

• Optimizer

{
  "Name": "Jane Smith",
  "DOB": "1990-01-30",
  "Billing": [
    { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" },
    { "type": "master", "cardnum": "6274-2842-2847-3909", "expiry": "2019-03" }
  ],
  "address": {
    "Street": "10, Downing Street",
    "City": "San Francisco",
    "State": "California",
    "zip": 94401
  }
}

Page 20: Couchbase N1QL: Language & Architecture Overview.


N1QL: Approach

• Flexible Schema

• Rely on JSON data interpretation

• Define 4-value predicate logic

• True, False, NULL, MISSING

• Nested Objects

• Key name becomes column reference

• Use dot-notation and array[] reference

• SQL operators on nested objects

• select, join, project operators

• nest and unnest for arrays & objects

• Query Performance Optimization

SELECT c.name, c.address.zip, c.phone[0]
FROM customer c
WHERE c.address.zip = 94587
  AND ANY s IN c.status SATISFIES s = 'Premium' END
  AND purchases IS NOT MISSING;
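To make the dot-notation, array-reference, and MISSING ideas concrete, here is a rough Python sketch of how the sample query's predicate and projection could be evaluated against one document. It is an illustration under stated assumptions, not the Query Service's implementation: `MISSING`, `get_path`, `element_at`, `any_satisfies`, `matches`, and `project` are hypothetical helpers, the output name `phone0` is made up, and anything other than a true predicate is simply treated as "row filtered out".

```python
MISSING = object()  # field absent; JSON null is represented by None


def get_path(doc, path):
    """Dot-notation lookup: returns MISSING when any step is absent."""
    current = doc
    for part in path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return MISSING
    return current


def element_at(value, index):
    """array[i] reference: MISSING when not an array or out of range."""
    if isinstance(value, list) and 0 <= index < len(value):
        return value[index]
    return MISSING


def any_satisfies(collection, predicate):
    """ANY x IN collection SATISFIES predicate(x) END (non-arrays never match)."""
    return isinstance(collection, list) and any(predicate(x) for x in collection)


def matches(c):
    """WHERE clause of the sample query; only a true result keeps the row."""
    return (get_path(c, "address.zip") == 94587
            and any_satisfies(get_path(c, "status"), lambda s: s == "Premium")
            and get_path(c, "purchases") is not MISSING)


def project(c):
    """SELECT list; fields that evaluate to MISSING are left out of the row."""
    candidates = {
        "name": get_path(c, "name"),
        "zip": get_path(c, "address.zip"),
        "phone0": element_at(get_path(c, "phone"), 0),  # hypothetical output name
    }
    return {k: v for k, v in candidates.items() if v is not MISSING}
```

Note that a document with `"purchases": null` still passes `IS NOT MISSING`, because the field is present; only an absent field is MISSING.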

Page 21: Couchbase N1QL: Language & Architecture Overview.


N1QL: SELECT Statement

SELECT *

FROM customers c

WHERE c.address.state = 'NY'

AND c.status = 'premium'

ORDER BY c.address.zip

• SELECT * : Project everything
• FROM customers c : From the bucket customers
• WHERE … : Predicate
• ORDER BY … : Sort order

Page 22: Couchbase N1QL: Language & Architecture Overview.

N1QL: SELECT Statement

SELECT customers.id,
       customers.NAME.lastname,
       customers.NAME.firstname,
       Sum(orderline.amount)
FROM orders UNNEST orders.lineitems AS orderline
     INNER JOIN customers ON KEYS orders.custid
WHERE customers.state = 'NY'
GROUP BY customers.id,
         customers.NAME.lastname,
         customers.NAME.firstname
HAVING sum(orderline.amount) > 10000
ORDER BY sum(orderline.amount) DESC

• Dotted sub-document reference
• Names are CASE-SENSITIVE
• UNNEST to flatten the arrays
• JOINS with Document KEY of customers

Page 23: Couchbase N1QL: Language & Architecture Overview.


N1QL: SELECT Statement Highlights

• Querying across relationships

• JOINs

• Subqueries

• Aggregation

• MIN, MAX

• SUM, COUNT, AVG, ARRAY_AGG [ DISTINCT ]

• Combining result sets using set operators

• UNION, UNION ALL, INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL

Page 24: Couchbase N1QL: Language & Architecture Overview.


N1QL : Query Operators [ 1 of 2 ]

• USE KEYS …

• Direct primary key lookup bypassing index scans

• Ideal for hash-distributed datastore

• Available in SELECT, UPDATE, DELETE

• JOIN … ON KEYS …

• Nested loop JOIN using key relationships

• Ideal for hash-distributed datastore

• Current implementation supports INNER and LEFT OUTER joins

• ANSI JOINS

• We’re working on it. Be part of BETA for the next release.
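The reason USE KEYS and JOIN … ON KEYS suit a hash-distributed store is that both reduce to direct key lookups. The Python sketch below models the keyspace as a plain dict from document key to document; that model, and the names `use_keys`, `join_on_keys`, and `MISSING`, are assumptions for illustration only.

```python
MISSING = object()  # stands in for "no matching right-hand document"


def use_keys(keyspace, keys):
    """USE KEYS: fetch documents directly by key, with no index scan."""
    return [keyspace[k] for k in keys if k in keyspace]


def join_on_keys(left_docs, right_keyspace, key_of, left_outer=False):
    """JOIN ... ON KEYS as a nested-loop lookup join.

    For each left document, key_of(left) gives the key of the right-hand
    document. INNER drops unmatched rows; LEFT OUTER keeps them with MISSING.
    """
    rows = []
    for left in left_docs:
        key = key_of(left)
        if key in right_keyspace:
            rows.append((left, right_keyspace[key]))
        elif left_outer:
            rows.append((left, MISSING))
    return rows
```

For example, `join_on_keys(orders, customers, lambda o: o["custid"])` pairs each order with the customer document whose key equals `orders.custid`, one lookup per order.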

Page 25: Couchbase N1QL: Language & Architecture Overview.


N1QL : Query Operators [ 2 of 2 ]

• NEST

• Special JOIN that embeds external child documents under their parent

• Ideal for JSON encapsulation

• UNNEST

• Flattening JOIN that surfaces nested objects as top-level documents

• Ideal for decomposing JSON hierarchies

• JOIN, NEST, and UNNEST can be chained in any combination
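NEST and UNNEST are easiest to see as inverse reshaping steps. The following Python sketch shows the shape of each result under simplifying assumptions: documents are dicts, the child keyspace is a dict keyed by document key, only the inner (default) variants are modeled, and `unnest` and `nest` are hypothetical helper names.

```python
def unnest(docs, array_field):
    """UNNEST: one output row per element of doc[array_field].

    Documents whose array is absent or empty produce no rows (inner UNNEST).
    """
    rows = []
    for doc in docs:
        items = doc.get(array_field)
        if isinstance(items, list):
            for item in items:
                rows.append((doc, item))
    return rows


def nest(parents, child_keyspace, keys_field):
    """NEST ... ON KEYS: collect the child documents into an array per parent.

    parent[keys_field] is assumed to hold a list of child document keys.
    Parents with no matching child are dropped (inner NEST).
    """
    rows = []
    for parent in parents:
        keys = parent.get(keys_field, [])
        children = [child_keyspace[k] for k in keys if k in child_keyspace]
        if children:
            rows.append((parent, children))
    return rows
```

Because each function takes and returns plain rows, the outputs can be fed into one another, which is the chaining the last bullet refers to.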

Page 26: Couchbase N1QL: Language & Architecture Overview.

N1QL : Expressions for JSON

Ranging over collections
• WHERE ANY c IN children SATISFIES c.age > 10 END
• WHERE EVERY r IN ratings SATISFIES r > 3 END

Mapping with filtering
• ARRAY c.name FOR c IN children WHEN c.age > 10 END

Deep traversal, SET, and UNSET
• WHERE ANY node WITHIN request SATISFIES node.type = "xyz" END
• UPDATE doc UNSET c.field1 FOR c WITHIN doc END

Dynamic Construction
• SELECT { "a": expr1, "b": expr2 } AS obj1, name FROM … // Dynamic object
• SELECT [ a, b ] FROM … // Dynamic array

Nested traversal
• SELECT x.y.z, a[0] FROM a.b.c …

IS [ NOT ] MISSING
• WHERE name IS MISSING
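For readers more at home in a general-purpose language, the collection expressions on this slide have close Python analogues. The sketch below is a loose correspondence, not a specification of N1QL semantics (NULL and MISSING handling is ignored), and all function names are hypothetical.

```python
def any_satisfies(items, predicate):
    """ANY x IN items SATISFIES predicate(x) END"""
    return any(predicate(x) for x in items)


def every_satisfies(items, predicate):
    """EVERY x IN items SATISFIES predicate(x) END"""
    return all(predicate(x) for x in items)


def array_for_when(items, mapper, condition):
    """ARRAY mapper(x) FOR x IN items WHEN condition(x) END"""
    return [mapper(x) for x in items if condition(x)]


def within(value):
    """WITHIN: yield every value nested at any depth inside value."""
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        return
    for child in children:
        yield child
        yield from within(child)
```

A deep-traversal predicate such as `ANY node WITHIN request SATISFIES node.type = "xyz" END` then reads as `any(isinstance(n, dict) and n.get("type") == "xyz" for n in within(request))`.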

Page 27: Couchbase N1QL: Language & Architecture Overview.


N1QL: Data Modification Statements

• UPDATE … SET … WHERE …

• DELETE FROM … WHERE …

• INSERT INTO … ( KEY, VALUE ) VALUES …

• INSERT INTO … ( KEY …, VALUE … ) SELECT …

• MERGE INTO … USING … ON …

WHEN [ NOT ] MATCHED THEN …

Note: Couchbase provides per-document atomicity.

Page 28: Couchbase N1QL: Language & Architecture Overview.

4 ARCHITECTURE

Page 29: Couchbase N1QL: Language & Architecture Overview.

Couchbase Server Cluster Service Deployment

[Diagram: Couchbase Server nodes 1 to 6, each with a Cluster Manager, Managed Cache, and Storage holding shards. Services shown: Data Service (Servers 1 to 3), Query Service (Servers 4 and 5), Index Service (Server 6). SDK clients connect to the cluster.]

Page 30: Couchbase N1QL: Language & Architecture Overview.

N1QL: Query Execution Flow

Components: Clients, Query Service, Index Service, Data Service

1. Submit the query over REST API
2. Parse, Analyze, create Plan
3. Scan Request; index filters
4. Get qualified doc keys
5. Fetch Request, doc keys
6. Fetch the documents
7. Evaluate: Documents to results
8. Query result

SELECT c_id,
       c_first,
       c_last,
       c_max
FROM CUSTOMER
WHERE c_id = 49165;

{
  "c_first": "Joe",
  "c_id": 49165,
  "c_last": "Montana",
  "c_max": 50000
}
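The numbered steps of the execution flow can be mimicked with two in-memory dictionaries standing in for the Index Service and the Data Service. This is a toy model for orientation only: the document keys (`cust::49165` and so on), the second sample document, the index layout, and the function names are assumptions, and parsing and planning (step 2) are skipped entirely.

```python
# Data Service stand-in: document key -> document (keys are hypothetical)
DATA_SERVICE = {
    "cust::49165": {"c_id": 49165, "c_first": "Joe", "c_last": "Montana",
                    "c_max": 50000, "c_city": "SF"},
    "cust::7": {"c_id": 7, "c_first": "Ann", "c_last": "Lee", "c_max": 100},
}

# Index Service stand-in: secondary index on c_id -> qualifying document keys
INDEX_SERVICE = {49165: ["cust::49165"], 7: ["cust::7"]}


def index_scan(c_id):
    """Steps 3-4: send the scan request, get back qualified document keys."""
    return list(INDEX_SERVICE.get(c_id, []))


def fetch(doc_keys):
    """Steps 5-6: fetch the documents for those keys from the data store."""
    return [DATA_SERVICE[k] for k in doc_keys if k in DATA_SERVICE]


def run_query(c_id):
    """Steps 1 and 7-8 for: SELECT c_id, c_first, c_last, c_max
    FROM CUSTOMER WHERE c_id = <c_id>"""
    fields = ("c_id", "c_first", "c_last", "c_max")
    docs = fetch(index_scan(c_id))
    # Step 7: re-apply the predicate and project the requested fields
    return [{f: d[f] for f in fields if f in d}
            for d in docs if d.get("c_id") == c_id]
```

Calling `run_query(49165)` yields a single row with the four projected fields, matching the shape of the result document shown on the slide.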

Page 31: Couchbase N1QL: Language & Architecture Overview.

N1QL: Inside the Query Service

Client → Query Service

Query Service operators: Parse → Plan → Scan → Fetch → Join → Filter → Pre-Aggregate → Aggregate → Sort → Offset → Limit → Project

Scan uses the Index Service; Fetch uses the Data Service.

Page 32: Couchbase N1QL: Language & Architecture Overview.

5 N1QL PATH OF PROGRESS

Page 33: Couchbase N1QL: Language & Architecture Overview.

N1QL features in Couchbase releases

• Couchbase 4.0 (Oct 2015): N1QL GA. Query language for JSON, Integrated Query Service, Global Secondary Index, REST API, Simba ODBC, JDBC Drivers
• Couchbase 4.1 (Dec 2015): INSERT, UPDATE, DELETE, MERGE; Covering Index Optimization
• Couchbase 4.1.1 (March 2016): Index JOINs
• Couchbase 4.5 (June 2016): Array Indexes, Workbench, CBQ Shell++, INFER, Memory Optimized Index, IndexScanCount, YCSB Performance Optimizations++, Language++
• Couchbase 4.5.1 (Sep 2016): Pretty=false; Fetch; SUFFIXES; Index Selection; UPDATE improvement

Page 34: Couchbase N1QL: Language & Architecture Overview.

N1QL features in Couchbase releases

• Couchbase 4.6.1 (Q1 2017): TOKENS (Simple Search/Faster LIKE), Optimizer improvements
• Couchbase 4.6.2 (Q2 2017): Optimizer improvements, intersect scan performance
• Couchbase 5.0 (4Q 2017): Subqueries over nested data; Pagination; RBAC; Curl; Super Charged Indexing; Monitoring & Profiling; New workbench, UI, monitoring, profiler, visual EXPLAIN; Performance++; Bitwise functions

Page 35: Couchbase N1QL: Language & Architecture Overview.


Couchbase N1QL and GSI features

• Index : New Storage engine, PLASMA

• Index Replicas

• Index with individual keys ASC, DESC

• Index Key size

• Security: RBAC: Statement level security

• Query-Index API improvement

• Complex filters pushdown

• Pagination

• Exploit index with ASC, DESC keys

• Projection optimization

• Query: Subqueries over nested collections

• Query Performance: Intersect Scans

• Flexible indexing (Adaptive Index)

• CURL function with full set of security

• Additional Date & time functions

• Bitwise functions

• Query Monitoring, Profiling with UI

• Query workbench and UI: Fully upgraded

• Query UI: Visual Explain

• Application Continuity (Rolling Upgrade)

• Performance Proof Points

• Core Daily workload

• YCSB

• YCSB-JSON for SoE

• Golang compiler upgraded to 1.8.3

Page 36: Couchbase N1QL: Language & Architecture Overview.


N1QL Roadmap for 5.1

• Performance Improvements: Phase 1

• View Replacement : Phase 1

• Aggregate Performance : Phase 1

• ANSI Joins: Phase 1

• Statement Level Auditing

• ALTER INDEX, Configuration

• Support for XATTRS

Page 37: Couchbase N1QL: Language & Architecture Overview.


Couchbase Forum: N1QL

Page 38: Couchbase N1QL: Language & Architecture Overview.

END OF THE BEGINNING