When to no sql and when to know sql javaone
-
Upload
simon-elliston-ball -
Category
Technology
-
view
241 -
download
1
description
Transcript of When to no sql and when to know sql javaone
![Page 1: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/1.jpg)
Simon Elliston Ball Head of Big Data
@sireb
When to NoSQL and When to Know SQL
#noSQLknowSQL
http://nosqlknowsql.io
![Page 2: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/2.jpg)
Not only SQL
SQL
what is NoSQL?
Many many things
NoSQL
No, SQL
![Page 3: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/3.jpg)
files
before SQL
multi-value
ur… hash maps?
![Page 4: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/4.jpg)
everything is relational
after SQL
ORMs fill in the other data structures
scale up rules
data first design
![Page 5: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/5.jpg)
datastores that suit applications
and now NoSQL
polyglot persistence: the right tools
scale out rules
APIs not EDWs
![Page 6: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/6.jpg)
data growth
why should you care?
machine learning
social
rapid development
fewer migration headaches… maybe
![Page 7: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/7.jpg)
big bucks.
O’Reilly 2013 Data Science Salary Survey
![Page 8: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/8.jpg)
So many NoSQLs…
![Page 9: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/9.jpg)
![Page 10: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/10.jpg)
document databases
![Page 11: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/11.jpg)
document databases
known access pattern
JSON docs
complex, variable models
rapid development
![Page 12: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/12.jpg)
document databases
JUST DON’Tjoins?
learn a very new query language
denormalize
http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
document form
![Page 13: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/13.jpg)
document vs SQLwhat can SQL do?
query all the angles
sure, you can use blobs…
… but you can’t get into them
![Page 14: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/14.jpg)
documents in SQLSQL xml fields
mapping xquery paths is painful
native JSON
but still structured
![Page 15: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/15.jpg)
class of database database
query everything: search
full-text indexing
![Page 16: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/16.jpg)
range query
you know google, right…
span query
keyword query
![Page 17: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/17.jpg)
scores
you know the score"query": { "function_score": { "query": { "match": { "title": "NoSQL"} }, "functions": [ "boost": 1, "gauss": { "timestamp": { "scale": "4w" } }, "script_score" : { "script" : "_score * doc['important_document'].value ? 2 : 1" } ], "score_mode": "sum" }}
![Page 18: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/18.jpg)
scores
SQL knows the score toodeclare @origin float = 0;declare @delay_weeks float = 4;
SELECT TOP 10 * FROM ( SELECT title, score * CASE WHEN p.important = 1 THEN 2.0 WHEN p.important = 0 THEN 1.0 END * exp(-power(timestamp-@origin,2)/(2*@delay*7*24*3600)) + 1 AS score FROM posts p WHERE title LIKE '%NoSQL%') as foundORDER BY score
![Page 19: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/19.jpg)
more like this: instant tf-idf
you know google, right…
{ "more_like_this" : { "fields" : ["name.first", "name.last"], "like_text" : "text like this one", "min_term_freq" : 1, "max_query_terms" : 12 }}
![Page 20: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/20.jpg)
Facets
![Page 21: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/21.jpg)
Facets
![Page 22: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/22.jpg)
FacetsSQL:
x lots
SELECT a.name, count(p.id) FROM people p JOIN industry a on a.id = p.industry_id JOIN people_keywords pk on pk.person_id = p.id JOIN keywords k on k.id = pk.keyword_id WHERE CONTAINS(p.description, 'NoSQL') OR k.name = 'NoSQL' ...GROUP BY a.name
SELECT a.name, count(p.id) FROM people p JOIN area a on a.id = p.area_id JOIN people_keywords pk on pk.person_id = p.id JOIN keywords k on k.id = pk.keyword_id WHERE CONTAINS(p.description, 'NoSQL') OR k.name = 'NoSQL' ...GROUP BY a.name
![Page 23: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/23.jpg)
FacetsElastic search:{ "query": { “query_string": { “default_field”: “content”, “query”: “keywords” } }, "facets": { “myTerms": { "terms": { "field" : "lang", "all_terms" : true } } }}
![Page 24: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/24.jpg)
untyped free-text documents
logs
timestamped
semi-structured
discovery
aggregation and statistics
![Page 25: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/25.jpg)
close to your programming model
key: value
distributed map | list | set
keys can be objects
![Page 26: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/26.jpg)
hash types
SQL extensions
hstore
![Page 27: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/27.jpg)
inheritance
SQL and polymorphism
ORMs hide the horror
![Page 28: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/28.jpg)
columnar databases
turning round the rows
physical layout matters
![Page 29: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/29.jpg)
turning round the rowskey value type
1 A Home
2 B Work
3 C Work
4 D Work
00001 1 A Home 00002 2 B Work 00003 3 C Work …
Row storage
A B C D Home Work Work Work …
Column storage
![Page 30: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/30.jpg)
teaching an old SQL new tricksMySQL InfoBright
SQL Server Columnar Indexes
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_col ON Orders (OrderDate, DueDate, ShipDate)
Great for your data warehouse, but no use for OLTP
![Page 31: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/31.jpg)
Parquet
ORC files
column for hadoop and other animals
http://parquet.io
![Page 32: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/32.jpg)
wide column databases
column families
![Page 33: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/33.jpg)
millions of columns
eventually consistent
CQL
http://cassandra.apache.org/http://www.datastax.com/
set | list | map types
![Page 34: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/34.jpg)
SQL: so many views, so much confusion
cell level security
accumulo https://accumulo.apache.org/
![Page 35: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/35.jpg)
Tim
e se
ries
![Page 36: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/36.jpg)
timeretrieving time series and graphs
window functionsSELECT business_date, ticker,
close, close /
LAG(close,1) OVER (PARTITION BY ticker ORDER BY business_date ASC)
- 1 AS ret FROM sp500
![Page 37: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/37.jpg)
timeOpen TSDB
Influx DB
key:sub-key:metric
key:*
key:*:metric
![Page 38: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/38.jpg)
Queues
![Page 39: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/39.jpg)
CREATE procedure [dbo].[Dequeue]AS
set nocount on
declare @BatchSize intset @BatchSize = 10
declare @Batch table (QueueID int, QueueDateTime datetime, Title nvarchar(255))
begin tran
insert into @Batchselect Top (@BatchSize) QueueID, QueueDateTime, Title from QueueMetaWITH (UPDLOCK, HOLDLOCK)where Status = 0order by QueueDateTime ASC
declare @ItemsToUpdate intset @ItemsToUpdate = @@ROWCOUNT
update QueueMetaSET Status = 1WHERE QueueID IN (select QueueID from @Batch)AND Status = 0
if @@ROWCOUNT = @ItemsToUpdatebegin
commit transelect b.*, q.TextData from @Batch binner join QueueData q on q.QueueID = b.QueueIDprint 'SUCCESS'
endelsebegin
rollback tranprint 'FAILED'
end
queues in SQL
![Page 40: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/40.jpg)
index fragmentation is a problem
queues in SQL
but built in logs of a sort
![Page 41: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/41.jpg)
specialised apis
message queues
capabilities like fan-out
routing
acknowledgement
![Page 42: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/42.jpg)
relationships countGraph databases
![Page 43: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/43.jpg)
relationships counttrees and hierarchies
overloaded relationships
fancy algorithms
![Page 44: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/44.jpg)
hierarchies with SQLadjacency lists
nested sets (MPTT)
materialised path
CONSTRAIN fk_parent_id_id FOREIGN KEY parent_id REFERENCES some_table.id
path = 1.2.23.55.786.33425
Node Left Right Depth
A 1 22 1
B 2 9 2
C 10 21 2
D 3 4 3
E 5 6 3
F 7 8 3
G 11 18 3
H 19 20 3
I 12 13 4
J 14 15 4
K 16 17 4
![Page 45: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/45.jpg)
OrientDB
Consolidation
ArangoDB
SQLs with NoSQL
Postgres
MySQL
https://www.arangodb.org/
http://www.orientechnologies.com/
![Page 46: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/46.jpg)
Big Data
![Page 47: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/47.jpg)
SQL on Hadoop
![Page 48: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/48.jpg)
Shark
More than SQL
Drill
Cascading
Map Reduce
![Page 49: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/49.jpg)
System issues, Speed issues,Soft issues
![Page 50: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/50.jpg)
Atomic
the ACID, BASE litmus
Consistent
Isolated
Durable
Basically Available
Soft-state
Eventually consistent
what matters to you?
![Page 51: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/51.jpg)
CAP it all
Consistency
AvailabilityPartition
![Page 52: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/52.jpg)
SQL writes cost a lot
write fast, ask questions later
mainly write workload: NoSQL
low latency write workload: NoSQL
![Page 53: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/53.jpg)
most NoSQL scales well
is it web scale?
but clusters still need management
are you facebook? one machine is easier than n
can ops handle it? app developers make bad admins
![Page 54: When to no sql and when to know sql javaone](https://reader038.fdocuments.in/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/54.jpg)
analysts: they want SQL
who is going to use it?
developers: they want applications
data scientists: they want access