Couchbase N1QL: Language & Architecture Overview.
N1QL : OVERVIEW, LANGUAGE, ARCHITECTURE
Keshav Murthy
Senior Director, Couchbase R&D
N1QL WORKSHOP
AGENDA
01 Agenda for the day
02 Motivation for N1QL
03 Language Overview
04 Architecture
05 Overview of N1QL Features in 5.0
1 AGENDA FOR THE DAY
Time | Topic | Speaker
8.30 AM – 9.00 AM | Registration and Breakfast |
9.00 AM – 9.45 AM | N1QL: Overview, Language, Architecture | Keshav Murthy
9.45 AM – 10.15 AM | Query Developer Tooling in 5.0 | Eben Haber
10.15 AM – 11.00 AM | Deep dive into Data modeling | Dean Proctor
11.00 AM – 11.15 AM | Break |
11.15 AM – 12.00 PM | Index Advisor: Rules for creating indexes | Keshav & Sitaram
12.00 PM – 1.00 PM | Lunch |
1.00 PM – 2.00 PM | N1QL Query Optimizer and Improvements in 5.0 | Sitaram Vemulapalli
2.00 PM – 2.45 PM | Query Monitoring, profiling and troubleshooting in 5.0 | Marco Greco
2.45 PM – 3.00 PM | Break | All
3.00 PM – 3.45 PM | Indexing Manageability in 5.0 | Deepkaran Salooja
3.45 PM – 4.30 PM | Security: RBAC: Role Based Access Control in 5.0 | Johan Larson
4.30 PM – 5.15 PM | Mindmap for Oracle (& other RDBMS) Developers | Raju Suravarjjala
5.15 PM – END | Q & A |
2 MOTIVATIONS FOR N1QL
SQL
• Hide the complexity of relational calculus
• English Language based access to Data
• Language to manipulate data easily
• Targeted for developers & professionals
• SELECT, INSERT, UPDATE, DELETE, …
• Arithmetic and logical operators
• Data Types
• Schema management
• All of original SQL capabilities
• ++++++
• Transactions – ACID
• Stored procedures, Triggers, Indexes
• Views, Materialized Views
• Optimizers: Rule & Cost based optimizers
• Spatial, Search, integrations
• Storage optimizations
• SMP, MPP innovations
• Performance: TPC
• Distributed Query Processing
• Two-phase commits
• T-SQL, PL/SQL,…
• JDBC, ODBC drivers.
• Tools
“Life as we know it depends on SQL & OLTP”
SQL: Progress
• 1970 : Codd developed the relational model.
"A Relational Model of Data for Large Shared Data Banks"
• 1974 : SEQUEL paper by Don Chamberlin and Raymond Boyce
"SEQUEL: A Structured English Query Language"
• 1974-1979 : System R with SEQUEL (later SQL) at IBM Research Lab
• 1979 : Oracle markets first relational database with SQL
• 1986 : ANSI SQL Standard released
• 1989, 1992, 1999, 2003, 2008, 2011, 2016: Major ANSI standard updates
ResultSet
Relations/Tuples
{
  "Name": "Jane Smith",
  "DOB": "1990-01-30",
  "Billing": [
    { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" },
    { "type": "master", "cardnum": "6274-2842-2847-3909", "expiry": "2019-03" }
  ],
  "Connections": [
    { "CustId": "XYZ987", "Name": "Joe Smith" },
    { "CustId": "PQR823", "Name": "Dylan Smith" },
    { "CustId": "PQR823", "Name": "Dylan Smith" }
  ],
  "Purchases": [
    { "id": 12, "item": "mac", "amt": 2823.52 },
    { "id": 19, "item": "ipad2", "amt": 623.52 }
  ]
}
LoyaltyInfo
Results
Orders
CUSTOMER
• NoSQL systems provide specialized APIs
• Key-Value get and set
• Each task requires custom built program
• Should test & maintain it
Find High-Value Customers with Orders > $10000
• Query customer objects from database
• For each customer object
• Find all the order objects for the customer
• Calculate the total amount for each order
• Sum up the grand total amount for all orders
• If grand total amount > $10000, extract customer data
• Add customer to the high-value customer list
• Sort the high-value customer list
LOOPING OVER MILLIONS OF CUSTOMERS IN APPLICATION!!!
• Complex code and logic
• Inefficient processing on client side
• Query will be slow
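The steps above can be sketched as client-side code. This is a minimal illustration only: the `customers` and `orders` dictionaries are hypothetical in-memory stand-ins for documents fetched through a key-value API, and the field names (`custid`, `lineitems`, `amount`) are assumptions borrowed from the N1QL example later in the deck.

```python
from collections import defaultdict

# Hypothetical stand-ins for documents read through a key-value API.
customers = {
    "c1": {"name": "Jane Smith"},
    "c2": {"name": "Joe Smith"},
}
orders = {
    "o1": {"custid": "c1", "lineitems": [{"amount": 8000.0}, {"amount": 2823.52}]},
    "o2": {"custid": "c1", "lineitems": [{"amount": 623.52}]},
    "o3": {"custid": "c2", "lineitems": [{"amount": 150.0}]},
}


def high_value_customers(customers, orders, threshold=10000):
    """Client-side version of the flow: every document is pulled into the app."""
    # Find all the order objects for each customer.
    orders_by_customer = defaultdict(list)
    for order in orders.values():
        orders_by_customer[order["custid"]].append(order)

    result = []
    for cust_id, customer in customers.items():
        # Total each order, then sum up the grand total for the customer.
        grand_total = 0.0
        for order in orders_by_customer[cust_id]:
            grand_total += sum(item["amount"] for item in order["lineitems"])
        # Keep only customers above the threshold.
        if grand_total > threshold:
            result.append({"id": cust_id, "name": customer["name"], "total": grand_total})

    # Sort the high-value customer list, largest total first.
    result.sort(key=lambda c: c["total"], reverse=True)
    return result


top = high_value_customers(customers, orders)
```

Even in this toy form, the application has to read every customer and every order before it can discard most of them, which is the cost the slide is pointing at.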
CUSTOMER (the same JSON document as shown earlier)
LoyaltyInfo, ResultDocuments, Orders
N1QL = SQL + JSON
Give developers and enterprises an
expressive, powerful, and complete language
for querying, transforming, and manipulating
JSON data.
Why SQL for NoSQL?
3 LANGUAGE OVERVIEW
N1QL : Data Types from JSON
Data Type | Example
Numbers | { "id": 5, "balance": 2942.59 }
Strings | { "name": "Joe", "city": "Morrisville" }
Boolean | { "premium": true, "balance pending": false }
Null | { "last_address": null }
Array | { "hobbies": ["tennis", "skiing", "lego"] }
Object | { "address": { "street": "1, Main street", "city": "Morrisville", "state": "CA", "zip": "94824" } }
MISSING |
Arrays of objects of arrays
[
{
"type": "visa",
"cardnum": "5827-2842-2847-3909",
"expiry": "2019-03"
},
{
"type": "master",
"cardnum": "6274-2542-5847-3949",
"expiry": "2018-12"
}
]
N1QL : Data Type Handling
Non-JSON data types
• MISSING
• Binary
Data type handling
• Date functions for string and numeric encodings
• Total ordering across all data types
• Well defined semantics for ORDER BY and comparison operators
• Defined expression semantics for all input data types
• No type mismatch errors
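A total ordering across types can be illustrated with a sort key that ranks the type first and the value second. The sketch below is an assumption-laden illustration, not the Query Service's implementation: the type order used (MISSING, NULL, false, true, number, string, array, object) is assumed from the N1QL collation order, the object comparison is simplified, and binary values are left out.

```python
MISSING = object()  # hypothetical sentinel for "field not present"


def type_rank(value):
    """Rank of each type in the assumed collation order."""
    if value is MISSING:
        return 0
    if value is None:              # JSON null
        return 1
    if isinstance(value, bool):    # must be tested before int: bool is an int subclass
        return 2
    if isinstance(value, (int, float)):
        return 3
    if isinstance(value, str):
        return 4
    if isinstance(value, list):
        return 5
    if isinstance(value, dict):
        return 6
    raise TypeError(f"unsupported value: {value!r}")


def sort_key(value):
    """Key that makes any two values comparable without a type error."""
    rank = type_rank(value)
    if rank in (0, 1):
        return (rank,)
    if rank == 5:   # arrays: compare element by element
        return (rank, [sort_key(v) for v in value])
    if rank == 6:   # objects: simplified to size, then sorted name/value pairs
        return (rank, len(value), sorted((k, sort_key(v)) for k, v in value.items()))
    return (rank, value)  # false < true, numbers and strings by value


values = ["abc", 3, None, True, MISSING, [1, 2], False, {"a": 1}, 2.5]
ordered = sorted(values, key=sort_key)
```

Because the rank is compared before the value, a string is never compared directly with a number, which is how a mixed-type ORDER BY can avoid type mismatch errors.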
Considerations: SQL for NoSQL
• Flexible Schema
• Cannot rely on predefined schema
• columns, data types, data comparison
• Nested Objects
• support scalars, objects and arrays
• SQL operators on nested objects
• Work with distributed data store
• Query Performance
• Optimization
• Exploit data store performance
• Design right kinds of indices
• Optimizer
{
  "Name": "Jane Smith",
  "DOB": "1990-01-30",
  "Billing": [
    { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" },
    { "type": "master", "cardnum": "6274-2842-2847-3909", "expiry": "2019-03" }
  ],
  "address": { "Street": "10, Downing Street", "City": "San Francisco", "State": "California", "zip": 94401 }
}
N1QL: Approach
• Flexible Schema
• Rely on JSON data interpretation
• Define 4-value predicate logic
• True, False, NULL, MISSING
• Nested Objects
• Key name becomes column reference
• Use dot-notation and array[] reference
• SQL operators on nested objects
• select, join, project operators
• nest and unnest for arrays & objects
• Query Performance Optimization
SELECT c.name, c.address.zip, c.phone[0]
FROM customer c
WHERE c.address.zip = 94587
  AND ANY s IN c.status SATISFIES s = 'Premium' END
  AND purchases IS NOT MISSING;
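To make the dot-notation and MISSING handling concrete, here is a rough Python sketch of how the WHERE clause above could be evaluated against a single document. The `MISSING` sentinel and the helpers `field` and `matches` are hypothetical names introduced for illustration. In N1QL a comparison against a missing field yields MISSING rather than false; the sketch collapses that to `False`, which has the same filtering effect in a WHERE clause.

```python
MISSING = object()  # hypothetical sentinel: field absent, distinct from JSON null (None)


def field(doc, *path):
    """Follow names and array indexes; return MISSING if any step is absent."""
    current = doc
    for step in path:
        if isinstance(step, int):
            if not isinstance(current, list) or not (0 <= step < len(current)):
                return MISSING
        elif not isinstance(current, dict) or step not in current:
            return MISSING
        current = current[step]
    return current


def matches(c):
    """Rough equivalent of the WHERE clause in the query above."""
    zip_ok = field(c, "address", "zip") == 94587
    status = field(c, "status")
    any_premium = isinstance(status, list) and any(s == "Premium" for s in status)
    has_purchases = field(c, "purchases") is not MISSING
    return zip_ok and any_premium and has_purchases


# Hypothetical sample documents with different shapes.
d1 = {"name": "Jane", "address": {"zip": 94587}, "phone": ["555-0100"],
      "status": ["Premium"], "purchases": []}
d2 = {"name": "Joe", "address": {"zip": 94587}, "status": ["Premium"]}   # no purchases
d3 = {"name": "Ann", "status": ["Premium"], "purchases": [{"id": 12}]}   # no address

selected = [d["name"] for d in (d1, d2, d3) if matches(d)]
```

Documents that lack `address` or `purchases` are filtered out quietly instead of raising an error, which is what lets one query run over documents with differing shapes.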
N1QL: SELECT Statement
SELECT *                        /* Project everything */
FROM customers c                /* From the bucket customers */
WHERE c.address.state = 'NY'    /* Predicate */
AND c.status = 'premium'
ORDER BY c.address.zip          /* Sort order */
N1QL: SELECT Statement
SELECT customers.id,
       customers.NAME.lastname,
       customers.NAME.firstname,
       SUM(orderline.amount)
FROM orders UNNEST orders.lineitems AS orderline
     INNER JOIN customers ON KEYS orders.custid
WHERE customers.state = 'NY'
GROUP BY customers.id,
         customers.NAME.lastname,
         customers.NAME.firstname
HAVING SUM(orderline.amount) > 10000
ORDER BY SUM(orderline.amount) DESC
• Dotted sub-document reference
• Names are CASE-SENSITIVE
• UNNEST to flatten the arrays
• JOINS with Document KEY of customers
23
N1QL: SELECT Statement Highlights
• Querying across relationships
• JOINs
• Subqueries
• Aggregation
• MIN, MAX
• SUM, COUNT, AVG, ARRAY_AGG [ DISTINCT ]
• Combining result sets using set operators
• UNION, UNION ALL, INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL
N1QL : Query Operators [ 1 of 2 ]
• USE KEYS …
• Direct primary key lookup bypassing index scans
• Ideal for hash-distributed datastore
• Available in SELECT, UPDATE, DELETE
• JOIN … ON KEYS …
• Nested loop JOIN using key relationships
• Ideal for hash-distributed datastore
• Current implementation supports INNER and LEFT OUTER joins
• ANSI JOINS
• We’re working on it. Be part of BETA for the next release.
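The point about hash-distributed data can be sketched in a few lines: when the document key is known, a lookup needs one hash and one probe, with no index scan. Everything below is a simplified assumption for illustration (the `ShardedStore` class, the shard count, and the use of `zlib.crc32` are not Couchbase's actual partitioning scheme).

```python
import zlib

NUM_SHARDS = 8  # illustrative only


def shard_for(key):
    """Map a document key to a shard by hashing it."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


class ShardedStore:
    """Toy hash-distributed key-value store."""

    def __init__(self):
        self.shards = [dict() for _ in range(NUM_SHARDS)]

    def upsert(self, key, doc):
        self.shards[shard_for(key)][key] = doc

    def use_keys(self, keys):
        """USE KEYS analogue: direct lookup by key, no index scan."""
        found = (self.shards[shard_for(k)].get(k) for k in keys)
        return [doc for doc in found if doc is not None]

    def join_on_keys(self, left_docs, key_field):
        """JOIN ... ON KEYS analogue: nested-loop inner join by document key."""
        joined = []
        for left in left_docs:
            for right in self.use_keys([left[key_field]]):
                joined.append((left, right))
        return joined


customers = ShardedStore()
customers.upsert("c1", {"name": "Jane Smith"})
orders = [{"id": "o1", "custid": "c1"}, {"id": "o2", "custid": "c404"}]
pairs = customers.join_on_keys(orders, "custid")
```

The order that points at a non-existent customer drops out of the result, matching inner-join behaviour; a LEFT OUTER variant would keep it with no right-hand side.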
N1QL : Query Operators [ 2 of 2 ]
• NEST
• Special JOIN that embeds external child documents under their parent
• Ideal for JSON encapsulation
• UNNEST
• Flattening JOIN that surfaces nested objects as top-level documents
• Ideal for decomposing JSON hierarchies
• JOIN, NEST, and UNNEST can be chained in any combination
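As a rough picture of the two operators, the sketch below treats UNNEST as "one output row per array element" and NEST as "collect matching child documents into an array under the parent". The function names, the `parent` row label, and the sample fields are hypothetical, and the inner-NEST behaviour of dropping parents with no matching children is an assumption of this sketch.

```python
def unnest(docs, array_field, alias):
    """Flatten: emit one row per element of doc[array_field]."""
    rows = []
    for doc in docs:
        for element in doc.get(array_field, []):
            rows.append({"parent": doc, alias: element})
    return rows


def nest(parents, children_by_key, keys_field, alias):
    """Embed: gather child documents whose keys are listed in the parent."""
    rows = []
    for parent in parents:
        children = [children_by_key[k] for k in parent.get(keys_field, [])
                    if k in children_by_key]
        if children:  # assumed inner NEST: skip parents without children
            rows.append({"parent": parent, alias: children})
    return rows


# Hypothetical sample data.
orders = [
    {"id": "o1", "lineitems": [{"sku": "a", "amount": 10}, {"sku": "b", "amount": 5}]},
    {"id": "o2", "lineitems": []},
]
order_docs = {"o1": orders[0], "o2": orders[1]}
customers = [
    {"id": "c1", "order_ids": ["o1", "o9"]},
    {"id": "c2", "order_ids": []},
]

flat = unnest(orders, "lineitems", "orderline")
nested = nest(customers, order_docs, "order_ids", "orders")
```

Here `flat` has one row per line item, ready for grouping and aggregation, while `nested` carries each customer together with an array of its orders.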
N1QL : Expressions for JSON
Ranging over collections
• WHERE ANY c IN children SATISFIES c.age > 10 END
• WHERE EVERY r IN ratings SATISFIES r > 3 END
Mapping with filtering
• ARRAY c.name FOR c IN children WHEN c.age > 10 END
Deep traversal, SET, and UNSET
• WHERE ANY node WITHIN request SATISFIES node.type = "xyz" END
• UPDATE doc UNSET c.field1 FOR c WITHIN doc END
Dynamic Construction
• SELECT { "a": expr1, "b": expr2 } AS obj1, name FROM … // Dynamic object
• SELECT [ a, b ] FROM … // Dynamic array
Nested traversal
• SELECT x.y.z, a[0] FROM a.b.c …
IS [ NOT ] MISSING
• WHERE name IS MISSING
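For readers who think in application code, the collection expressions above map loosely onto Python built-ins and comprehensions. The sample document and the `within` helper are hypothetical, and the sketch ignores N1QL's NULL and MISSING handling.

```python
def within(value):
    """Yield every descendant of a JSON value (children, grandchildren, ...)."""
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        children = []
    for child in children:
        yield child
        yield from within(child)


# Hypothetical sample document.
doc = {
    "children": [{"name": "Ann", "age": 12}, {"name": "Bob", "age": 8}],
    "ratings": [4, 5, 2],
    "request": {"type": "abc", "items": [{"node": {"type": "xyz"}}]},
}

# ANY c IN children SATISFIES c.age > 10 END
any_older = any(c["age"] > 10 for c in doc["children"])

# EVERY r IN ratings SATISFIES r > 3 END
every_high = all(r > 3 for r in doc["ratings"])

# ARRAY c.name FOR c IN children WHEN c.age > 10 END
older_names = [c["name"] for c in doc["children"] if c["age"] > 10]

# ANY node WITHIN request SATISFIES node.type = "xyz" END
deep_match = any(isinstance(n, dict) and n.get("type") == "xyz"
                 for n in within(doc["request"]))
```

IN ranges over the immediate elements of an array, while WITHIN walks the whole nested structure, which is why the last expression finds a match two levels down.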
N1QL: Data Modification Statements
• UPDATE … SET … WHERE …
• DELETE FROM … WHERE …
• INSERT INTO … ( KEY, VALUE ) VALUES …
• INSERT INTO … ( KEY …, VALUE … ) SELECT …
• MERGE INTO … USING … ON …
WHEN [ NOT ] MATCHED THEN …
Note: Couchbase provides per-document atomicity.
4 ARCHITECTURE
Couchbase Server Cluster Service Deployment
[Figure: Couchbase Server nodes 1 to 6, each with a Cluster Manager, Managed Cache, Storage, and shards, running the Data Service, Query Service, or Index Service; SDK clients connect to the cluster.]
N1QL: Query Execution Flow
Clients, Query Service, Index Service, Data Service
1. Submit the query over REST API
2. Parse, Analyze, create Plan
3. Scan Request; index filters
4. Get qualified doc keys
5. Fetch Request, doc keys
6. Fetch the documents
7. Evaluate: Documents to results
8. Query result
SELECT c_id,
c_first,
c_last,
c_max
FROM CUSTOMER
WHERE c_id = 49165;
{
"c_first": "Joe",
"c_id": 49165,
"c_last": "Montana",
"c_max" : 50000
}
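The flow above can be mimicked with two dictionaries standing in for the Index Service and the Data Service. This is only a sketch of the sequence of steps: the document keys (`cust::…`), the second customer, the extra `c_since` field, and the function names are hypothetical.

```python
# Data Service stand-in: document key -> document.
documents = {
    "cust::49165": {"c_id": 49165, "c_first": "Joe", "c_last": "Montana",
                    "c_max": 50000, "c_since": "2015-10-01"},
    "cust::49166": {"c_id": 49166, "c_first": "Jane", "c_last": "Smith",
                    "c_max": 20000, "c_since": "2016-06-01"},
}

# Index Service stand-in: indexed value of c_id -> qualifying document keys.
index_on_c_id = {}
for key, doc in documents.items():
    index_on_c_id.setdefault(doc["c_id"], []).append(key)


def index_scan(c_id):
    """Steps 3-4: send the index filter, get back qualified doc keys."""
    return index_on_c_id.get(c_id, [])


def fetch(keys):
    """Steps 5-6: fetch the full documents for those keys."""
    return [documents[k] for k in keys]


def run_query(c_id):
    """Steps 2-8 for: SELECT c_id, c_first, c_last, c_max FROM CUSTOMER WHERE c_id = ?"""
    keys = index_scan(c_id)
    docs = fetch(keys)
    projected = ("c_first", "c_id", "c_last", "c_max")
    # Step 7: re-apply the predicate and project the requested fields.
    return [{name: d[name] for name in projected} for d in docs if d["c_id"] == c_id]


result = run_query(49165)
```

Only the keys that pass the index filter are fetched, so the Data Service returns one document here rather than the whole keyspace.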
N1QL: Inside the Query Service
Client
Query Service: Parse, Plan, Scan, Fetch, Join, Filter, Pre-Aggregate, Aggregate, Sort, Offset, Limit, Project
Index Service (serves Scan)
Data Service (serves Fetch)
5 N1QL PATH OF PROGRESS
N1QL features in Couchbase releases
Couchbase 4.0 (Oct 2015): N1QL GA; Query language for JSON, Integrated Query Service, Global Secondary Index, REST API, Simba ODBC, JDBC Drivers
Couchbase 4.1 (Dec 2015): INSERT, UPDATE, DELETE, MERGE; Covering Index Optimization
Couchbase 4.1.1 (March 2016): Index JOINs
Couchbase 4.5 (June 2016): Array Indexes, Workbench, CBQ Shell++, INFER, Memory Optimized Index, IndexScanCount, YCSB Performance Optimizations++, Language++
Couchbase 4.5.1 (Sep 2016): Pretty=false; Fetch; SUFFIXES; Index Selection; UPDATE improvement
Couchbase 4.6.1 (Q1 2017): TOKENS (Simple Search/Faster LIKE), Optimizer improvements
Couchbase 4.6.2 (Q2 2017): Optimizer improvements, intersect scan performance
Couchbase 5.0 (4Q 2017): Subqueries over nested data; Pagination; RBAC; Curl; Super Charged Indexing; Monitoring & Profiling; New workbench, UI, monitoring, profiler, visual EXPLAIN; Performance++; Bitwise functions
Couchbase N1QL and GSI features
• Index : New Storage engine, PLASMA
• Index Replicas
• Index with individual keys ASC, DESC
• Index Key size
• Security: RBAC: Statement level security
• Query-Index API improvement
• Complex filters pushdown
• Pagination
• Exploit index with ASC, DESC keys
• Projection optimization
• Query: Subqueries over nested collections
• Query Performance: Intersect Scans
• Flexible indexing (Adaptive Index)
• CURL function with full set of security
• Additional Date & time functions
• Bitwise functions
• Query Monitoring, Profiling with UI
• Query work bench and UI: Fully upgraded
• Query UI: Visual Explain
• Application Continuity (Rolling Upgrade)
• Performance Proof Points
• Core Daily workload
• YCSB
• YCSB-JSON for SoE
• Golang compiler upgraded to 1.8.3
N1QL Roadmap for 5.1
• Performance Improvements: Phase 1
• View Replacement : Phase 1
• Aggregate Performance : Phase 1
• ANSI Joins: Phase 1
• Statement Level Auditing
• ALTER INDEX, Configuration
• Support for XATTRS
Couchbase Forum: N1QL
*END OF THE BEGINNING