NoSQL with Graphs
![Page 1: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/1.jpg)
NoSQL with Graphs: mining graphs for fun & profit
Claudio Martella, NoSQLDay 2011
Saturday, March 26, 2011
![Page 2: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/2.jpg)
Outline
Graphs
Why
Tools
Apps
NoSQL
RDBMS
O(1)
Semantic Web
Tinkerpop
Recommendation
Query
Table
Documents
GraphDBs
![Page 3: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/3.jpg)
Who am I?
• PhD in Distributed Graphs @ UniBZ
• Analyst @ TIS Innovation Park
• Topics: Data / Text Mining with Graphs
• Technology: Hadoop, NoSQL, GraphDBs
• Writing Graffiti
![Page 4: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/4.jpg)
Surrounded by graphs
• the Web Graph
• Semantic Web
• Social Networks
• Natural Sciences
• GIS
![Page 5: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/5.jpg)
Property Graph
• A Graph is composed of Vertices and Edges
• Vertices are connected by Edges
• An Edge has a Label and Direction
• Edges and Vertices have Properties
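A minimal sketch of the property graph model described above, in plain Python. The class names, the `connect` helper and the `role` property are illustrative assumptions, not the API of any graph database.

```python
from dataclasses import dataclass, field

# eq=False keeps identity comparison, so the cyclic vertex/edge
# references are never walked by an auto-generated __eq__.
@dataclass(eq=False)
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list)  # edges leaving this vertex
    in_edges: list = field(default_factory=list)   # edges arriving at this vertex

@dataclass(eq=False)
class Edge:
    label: str          # every edge carries a label
    tail: Vertex        # direction: tail -> head
    head: Vertex
    properties: dict = field(default_factory=dict)

def connect(tail, label, head, **properties):
    """Create a directed, labelled edge and register it on both endpoints."""
    edge = Edge(label, tail, head, properties)
    tail.out_edges.append(edge)
    head.in_edges.append(edge)
    return edge

me = Vertex(1, {"name": "claudio"})
tis = Vertex(2, {"name": "TIS"})
works_at = connect(me, "works at", tis, role="analyst")  # hypothetical edge property
```

Both vertices and the edge carry their own property maps, which is what distinguishes a property graph from a bare adjacency structure.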
![Page 6: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/6.jpg)
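Who am I?
[Figure: property graph centred on the vertex "Me" (name: claudio, surname: martella, email: [email protected]). Edges from Me: works at → TIS, studies at → UniBZ, likes → NoSQL, works with → Hadoop, author → Graffiti. Three further "belongs to" edges connect the technology vertices, which also include GraphDB.]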
![Page 7: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/7.jpg)
A graph in RDBMS
| Follower | Followee |
|---|---|
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 5 |
| ... | ... |

| ID | Name |
|---|---|
| 1 | Claudio |
| 2 | Cirpo |
| 3 | Okram |
| 4 | Spinoza |
| ... | ... |
![Page 8: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/8.jpg)
BTree Index 101
• Lookup costs O(log N)
• N is the global size of the data structure
• Updating the index is not free either
[Figure: B-tree index over the sorted names Cirpo, Claudio, Okram, Spinoza]
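A rough illustration of the logarithmic lookup cost. A sorted list with binary search stands in for the B-tree here (an assumption made for brevity; a real B-tree is a multi-way tree of disk pages), but the growth with N is of the same order.

```python
import bisect
import math

# Sorted keys and the row IDs they point to, as in the Name index above.
names = ["Cirpo", "Claudio", "Okram", "Spinoza"]
row_ids = [2, 1, 3, 4]

def index_lookup(key):
    """Binary search over the sorted keys: O(log N) comparisons."""
    pos = bisect.bisect_left(names, key)
    if pos < len(names) and names[pos] == key:
        return row_ids[pos]
    return None

# Upper bound on halving steps for an index of one million keys.
steps_for_a_million = math.ceil(math.log2(1_000_000))
```

The point is that the cost depends on the size of the whole index, not on how many neighbours the vertex of interest has.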
![Page 9: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/9.jpg)
A lookup (RDBMS)
• Look for Claudio’s ID [ O(log N) ]
• Look for K Followees [ O(log N) ]
• Get their names [ O(K log N) ]

| Follower | Followee |
|---|---|
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 5 |
| ... | ... |

| ID | Name |
|---|---|
| 1 | Claudio |
| 2 | Cirpo |
| 3 | Okram |
| 4 | Spinoza |
| ... | ... |
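The three steps above can be sketched as a single join. This uses an in-memory SQLite database purely for convenience; the table names, column names and index names are assumptions, and only the rows shown on the slide are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE follows (follower INTEGER, followee INTEGER);
    -- Each index lookup below costs O(log N) in the size of the table.
    CREATE INDEX idx_person_name ON person(name);
    CREATE INDEX idx_follows_follower ON follows(follower);
    INSERT INTO person VALUES (1, 'Claudio'), (2, 'Cirpo'), (3, 'Okram'), (4, 'Spinoza');
    INSERT INTO follows VALUES (1, 2), (1, 3), (1, 4), (2, 5);
""")

rows = conn.execute(
    """
    SELECT p2.name
    FROM person p1
    JOIN follows f ON f.follower = p1.id      -- step 2: find the K followees
    JOIN person p2 ON p2.id = f.followee      -- step 3: one lookup per followee
    WHERE p1.name = ?                         -- step 1: find Claudio's ID
    ORDER BY p2.name
    """,
    ("Claudio",),
).fetchall()
followees = [name for (name,) in rows]
```

Every hop goes back through a global index, which is why the cost of step 3 grows with both K and N.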
![Page 10: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/10.jpg)
A graph in NoSQL
| ID | F1 | F2 | F3 | ... |
|---|---|---|---|---|
| Cirpo | ... | ... | ... | ... |
| Claudio | Cirpo | Okram | Spinoza | ... |
| Okram | ... | ... | ... | ... |
| Spinoza | ... | ... | ... | ... |
| ... | ... | ... | ... | ... |
![Page 11: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/11.jpg)
A lookup (NoSQL)
• Look for Claudio’s ID [ O(log N) ]
• Look for Followees [ O(K) ]

| ID | F1 | F2 | F3 | ... |
|---|---|---|---|---|
| Cirpo | ... | ... | ... | ... |
| Claudio | ... | ... | ... | ... |
| Okram | ... | ... | ... | ... |
| Spinoza | ... | ... | ... | ... |
| ... | ... | ... | ... | ... |
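A small sketch of the wide-row layout, with a Python dict standing in for the key-value or column store (an assumption: a dict is a hash table, whereas the O(log N) on the slide suggests a sorted key index). Only Claudio's row is filled in, since the other rows are elided on the slide.

```python
# One row per person; the columns F1, F2, F3, ... hold the followees directly.
rows = {
    "Claudio": ["Cirpo", "Okram", "Spinoza"],
}

def followees(name):
    """One keyed lookup finds the row; reading its K columns is O(K)."""
    return rows.get(name, [])

claudio_follows = followees("Claudio")
```

The names come back with the row itself, so no second table has to be consulted. Each further hop (friends of friends) still needs a new keyed lookup per followee.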
![Page 12: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/12.jpg)
A graph in GraphDB
[Figure: vertex 1 (name: Claudio) with "follows" edges to vertex 2 (name: Cirpo), vertex 3 (name: Okram) and vertex 4 (name: Spinoza)]
![Page 13: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/13.jpg)
A lookup (Graph)
• Look for Claudio’s ID [ O(log N) ]
• Look for Followees [ O(K) ]
[Figure: vertex 1 (name: Claudio) with "follows" edges to vertex 2 (name: Cirpo), vertex 3 (name: Okram) and vertex 4 (name: Spinoza)]
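A sketch of the same lookup when vertices hold direct references to their neighbours. The `Node` class and `name_index` are hypothetical names; the index is consulted once to find the starting vertex, and everything after that follows pointers.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.follows = []  # direct references to neighbouring nodes

claudio, cirpo, okram, spinoza = (
    Node(n) for n in ("Claudio", "Cirpo", "Okram", "Spinoza")
)
claudio.follows.extend([cirpo, okram, spinoza])

# Global index: used only to locate the entry point of the traversal.
name_index = {node.name: node for node in (claudio, cirpo, okram, spinoza)}

start = name_index["Claudio"]                    # the one index lookup
names = [node.name for node in start.follows]    # O(K): no index involved
```

Going one hop further means iterating `node.follows` again, so the cost of each step depends on the local degree rather than on the size of the whole graph.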
![Page 14: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/14.jpg)
What about Friends (of Friends)*?
![Page 15: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/15.jpg)
A benchmark
• 1 Million Vertices
• 4 Million Edges
• Scale-Free Topology
• Postgres vs Neo4J
• Both Hash and BTree

| Depth | RDBMS | Graph |
|---|---|---|
| 1 | 100 ms | 30 ms |
| 2 | 1000 ms | 500 ms |
| 3 | 10000 ms | 3000 ms |
| 4 | 100000 ms | 50000 ms |
| 5 | N/A | 100000 ms |
Ref: http://markorodriguez.com/2011/02/18/mysql-vs-neo4j-on-a-large-scale-graph-traversal/
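For orientation, a depth-limited traversal of the kind such a benchmark exercises could look like the breadth-first sketch below. This is not the benchmark's code; the function name and the tiny adjacency map (taken from the follower table earlier in the deck) are assumptions.

```python
from collections import deque

def neighbours_within(adjacency, start, max_depth):
    """Return {vertex: depth} for every vertex reachable in <= max_depth hops."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if depth[v] == max_depth:
            continue  # do not expand beyond the requested depth
        for w in adjacency.get(v, ()):
            if w not in depth:
                depth[w] = depth[v] + 1
                queue.append(w)
    return depth

adjacency = {1: [2, 3, 4], 2: [5]}
one_hop = neighbours_within(adjacency, 1, 1)
two_hops = neighbours_within(adjacency, 1, 2)
```

On a scale-free graph the frontier grows quickly with depth, which is consistent with the roughly tenfold jump per level in the table.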
![Page 16: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/16.jpg)
A benchmark
• 50 friends on average
• Check whether there’s a path connecting two people

| DB | # of people | Time |
|---|---|---|
| RDBMS | 1K | 2000 ms |
| Graph | 1K | 2 ms |
| Graph | 1M | 2 ms |
| RDBMS | 1M | N/A |

Ref: http://www.slideshare.net/thobe/nosqleu-graph-databases-and-neo4j
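The path query in this benchmark can be approximated with a breadth-first search that records parents, as sketched below. The `friends` map is made-up toy data and `find_path` is a hypothetical helper, not the code used in the referenced slides.

```python
from collections import deque

def find_path(adjacency, source, target):
    """Return a shortest path from source to target as a list, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            path = []
            while v is not None:      # walk the parent pointers back to source
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adjacency.get(v, ()):
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return None

# Toy friendship graph (illustrative data only).
friends = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
connected = find_path(friends, "a", "c")
unconnected = find_path(friends, "a", "d")
```

The search only touches vertices near the two people, which fits the table's observation that the graph time stays flat as the total population grows.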
![Page 17: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/17.jpg)
A Graph Database allows O(1) access to adjacent Vertices
Ref: The Graph Traversal Pattern, Marko A. Rodriguez and Peter Neubauer
![Page 18: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/18.jpg)
Example: Queries
[Figure, repeated with different layouts across slides 18 to 21: a movie graph. Brad Pitt has "actor" edges to Ocean 11, Ocean 12, Ocean 13 and Se7en and a "producer" edge to The Departed; Steven Soderbergh has three "director" edges; the films are linked by "genre" edges to Action, Crime, Thriller and Drama.]
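As a hedged illustration of a query over such a movie graph, the sketch below asks "who directed films that Brad Pitt acted in?". The edge direction (person to film) and the targets of the three "director" edges are assumptions, since the slide residue does not show them; the helper names are hypothetical.

```python
# (source, label, target) edges; genre edges are omitted here.
edges = [
    ("Brad Pitt", "actor", "Ocean 11"),
    ("Brad Pitt", "actor", "Ocean 12"),
    ("Brad Pitt", "actor", "Ocean 13"),
    ("Brad Pitt", "actor", "Se7en"),
    ("Brad Pitt", "producer", "The Departed"),
    # Assumed targets for the three director edges:
    ("Steven Soderbergh", "director", "Ocean 11"),
    ("Steven Soderbergh", "director", "Ocean 12"),
    ("Steven Soderbergh", "director", "Ocean 13"),
]

def targets(vertex, label):
    """Follow outgoing edges with the given label."""
    return [t for s, l, t in edges if s == vertex and l == label]

def sources(vertex, label):
    """Follow incoming edges with the given label, backwards."""
    return [s for s, l, t in edges if t == vertex and l == label]

films = targets("Brad Pitt", "actor")
directors = sorted({d for film in films for d in sources(film, "director")})
```

The linear scan over `edges` keeps the example short; a graph database would follow adjacency references instead.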
![Page 22: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/22.jpg)
Example: Recommendations
[Figure, repeated with different layouts across slides 22 to 30: a recommendation graph. The people Claudio, Cirpo and Caprazzi are linked by "likes" edges to the items Graph Runner, The Lord of the Graphs, PHP I love You and Javatar; the items are linked by "tagged" edges to the tags Adventure, Sci-Fi, Trilogy, Geeky and Boring.]
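One way a recommendation could be read off a likes/tagged graph is to walk user → likes → item → tagged → tag → item and count how often each unseen item is reached. The sketch below does that; the specific edges are assumptions (the slide residue does not say which item carries which tag), and `recommend` is a hypothetical helper.

```python
from collections import Counter

# Assumed edges, using the vertex names from the slides.
likes = {
    "Claudio": {"Graph Runner", "The Lord of the Graphs"},
}
tags = {
    "Graph Runner": {"Sci-Fi", "Adventure"},
    "The Lord of the Graphs": {"Adventure", "Trilogy"},
    "Javatar": {"Sci-Fi", "Geeky"},
    "PHP I love You": {"Geeky", "Boring"},
}

def recommend(user):
    """Score items the user has not liked by the number of shared-tag paths."""
    liked = likes[user]
    scores = Counter()
    for item in liked:
        for tag in tags[item]:
            for other, other_tags in tags.items():
                if other not in liked and tag in other_tags:
                    scores[other] += 1
    return scores.most_common()

suggestions = recommend("Claudio")
```

With these assumed edges, Javatar is reached once (through Sci-Fi) and PHP I love You not at all.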
![Page 31: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/31.jpg)
Graph Mining
Ref: Programming the Semantic Web - O’Reilly
How are they connected?
![Page 32: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/32.jpg)
Graph Mining
Ref: Programming the Semantic Web - O’Reilly
![Page 33: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/33.jpg)
Graph Mining
![Page 34: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/34.jpg)
Other Applications
• Community Analysis
• Fraud Detection
• Planning
• Text Processing
• Reasoning
![Page 35: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/35.jpg)
as you can’t get rid of logicians
![Page 36: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/36.jpg)
there’s an SQL also for Graphs
![Page 37: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/37.jpg)
Triplestores
[Figure: vertex Tom Cruise with edges actor → Top Gun, married → Katie Holmes, advocate → Scientology, lives → Hollywood, born → July 3, 1962]
![Page 38: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/38.jpg)
Triplestores
| Subject | Predicate | Object |
|---|---|---|
| Tom Cruise | actor | Top Gun |
| Tom Cruise | married | Katie Holmes |
| Tom Cruise | advocate | Scientology |
| Tom Cruise | lives | Hollywood |
| Tom Cruise | born | July 3, 1962 |
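A toy in-memory version of this table shows the basic triplestore operation: matching a pattern in which any position may be left open. The `match` function is a hypothetical helper, with `None` playing the role of a query variable.

```python
# (subject, predicate, object) triples from the table above.
triples = [
    ("Tom Cruise", "actor", "Top Gun"),
    ("Tom Cruise", "married", "Katie Holmes"),
    ("Tom Cruise", "advocate", "Scientology"),
    ("Tom Cruise", "lives", "Hollywood"),
    ("Tom Cruise", "born", "July 3, 1962"),
]

def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None matches anything."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

spouse = match(s="Tom Cruise", p="married")
everything_about_tom = match(s="Tom Cruise")
```

A real triplestore answers such patterns from indexes over the subject, predicate and object positions rather than by scanning.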
![Page 39: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/39.jpg)
SPARQL
PREFIX ged: <http://www.daml.org/2001/01/gedcom/gedcom#>
SELECT ?name ?marriedOn
FROM <http://www.daml.org/2001/01/gedcom/royal92.daml>
WHERE {
  ?royal ged:title "Princess" .
  ?royal ged:name ?name .
  ?royal ged:spouseIn ?family .
  ?family ged:marriage ?marriage .
  ?marriage ged:date ?marriedOn .
}
ORDER BY ASC(?name)
![Page 40: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/40.jpg)
what if Internet was your GraphDB?
![Page 41: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/41.jpg)
![Page 42: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/42.jpg)
what about a NoSPARQL?
![Page 43: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/43.jpg)
Tinkerpop
![Page 44: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/44.jpg)
Blueprints
• Blueprints is like the JDBC of the graph database community.
• Provides a Java-based interface API for the property graph data model: Graph, Vertex, Edge, Index.
• Provides implementations of the interfaces for TinkerGraph, Neo4j, OrientDB, Sails (e.g. AllegroSail, Neo4jSail), and soon (hopefully) others such as InfiniteGraph, InfoGrid, Sones, and HyperGraphDB
![Page 45: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/45.jpg)
Pipes
• A dataflow framework with support for Blueprints-based graph processing.
• Provides a collection of “pipes” (implement Iterable and Iterator)
✴ Filters: ComparisonFilterPipe, RandomFilterPipe, etc.
✴ Traversal: VertexEdgePipe, EdgeVertexPipe, PropertyPipe, etc.
✴ Splitting/Merging: CopySplitPipe, RobinMergePipe, etc.
✴ Logic: OrPipe, AndPipe, etc.
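The dataflow idea can be imitated with Python generators: each stage lazily consumes an iterator and yields results to the next. This is only an analogy; `out_pipe` and `filter_pipe` are hypothetical names, not the Java classes listed above, and the graph is the follower table from earlier in the deck.

```python
# vertex -> list of (edge label, head vertex)
graph = {
    1: [("follows", 2), ("follows", 3), ("follows", 4)],
    2: [("follows", 5)],
}

def out_pipe(source, label):
    """Traversal stage: for each incoming vertex, emit the heads of matching edges."""
    for vertex in source:
        for edge_label, head in graph.get(vertex, ()):
            if edge_label == label:
                yield head

def filter_pipe(source, predicate):
    """Filter stage: pass through only the items that satisfy the predicate."""
    for item in source:
        if predicate(item):
            yield item

# Followees of followees of vertex 1, excluding vertex 1 itself.
pipeline = filter_pipe(
    out_pipe(out_pipe(iter([1]), "follows"), "follows"),
    lambda v: v != 1,
)
result = list(pipeline)
```

Nothing is computed until `list(pipeline)` pulls items through, which mirrors the iterator-based design the bullets describe.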
![Page 46: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/46.jpg)
Gremlin
• A Turing-complete, graph-based programming language that compiles Gremlin syntax down to Pipes (implements JSR 223).
• Builds on top of Groovy
• Supports various language constructs: :=, foreach, while, repeat, if/else, function and path definitions, etc.
An example of “Amazon’s” recommender:
m = [:]
g.v(1).outE('purchased').inV.inE('purchased').outV.groupCount(m)
m.sort{ a,b -> a.value <=> b.value }
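For readers who do not know Gremlin, the traversal in that example can be spelled out step by step in plain Python. The purchase data is invented for illustration; the logic follows the traversal as written, including the fact that vertex 1 itself is counted.

```python
from collections import Counter

# Hypothetical purchase data: user -> items bought.
purchased = {1: {"a", "b"}, 2: {"a", "b", "c"}, 3: {"b"}}

# Inverse adjacency: item -> users who bought it (the inE('purchased').outV step).
buyers = {}
for user, items in purchased.items():
    for item in items:
        buyers.setdefault(item, set()).add(user)

# g.v(1).outE('purchased').inV  ->  items bought by user 1
# .inE('purchased').outV        ->  users who bought those items
# .groupCount(m)                ->  count how often each user is reached
m = Counter()
for item in purchased[1]:
    for user in buyers[item]:
        m[user] += 1

# m.sort{ a,b -> a.value <=> b.value }  ->  ascending by count
ranking = sorted(m.items(), key=lambda entry: entry[1])
```

The counts rank users by how many purchases they share with user 1; a further hop from those users to their other purchases would turn this into item recommendations.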
![Page 47: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/47.jpg)
Rexster
• Allows Blueprints graphs to be exposed through a RESTful API (HTTP)
• Supports stored traversals written in raw Pipes or Gremlin.
• Supports ad hoc traversals represented in Gremlin.
• Provides “helper classes” for performing search-, score-, and rank-based traversal algorithms; in concert, these support recommendation.
![Page 48: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/48.jpg)
Sample Stack
• HTTP Request arrives
• Converts REST to Gremlin
• Gremlin “compiles” to Pipes
• Pipes makes Blueprints calls
• Store provides the data
![Page 49: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/49.jpg)
Neo4J
• Engine: Graph
• License: AGPLv3
• Language: Java
• Transactions: ACID
• Distributed: HA, Master-Slave Cache Sharding, Domain-Specific
• Features: Embeddable, REST, many plugins
![Page 50: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/50.jpg)
OrientDB
• Engine: Document-Graph
• License: Apache 2.0
• Language: Java
• Transactions: ACID
• Distributed: HA through Replication
• Features: Embeddable, REST, SQL-like
![Page 51: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/51.jpg)
HyperGraphDB
• Engine: HyperGraph
• License: LGPL
• Language: Java
• Transactions: ACID
• Distributed: P2P distribution and replication
• Features: Hyperedges, Java OODB, storage on BerkeleyDB
![Page 52: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/52.jpg)
InfiniteGraph
• Engine: Graph
• License: Commercial
• Language: Java
• Transactions: ACID
• Distributed: Graph Partitioning, Federation on Objectivity
• Features: Distributed lock management, scales to Exabytes
![Page 53: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/53.jpg)
Where do I go now?
Tinkerpop: http://www.tinkerpop.com
Neo4J: http://neo4j.org
OrientDB: http://www.orientechnologies.com/orient-db.htm
InfoGrid: http://infogrid.org
InfiniteGraph: http://www.infinitegraph.com
Sones: http://developers.sones.de
AllegroGraph: http://www.franz.com/agraph/allegrograph
HyperGraphDB: http://www.kobrix.com/hgdb.jsp
![Page 54: NoSQL with Graphs](https://reader034.fdocuments.in/reader034/viewer/2022052522/554f4fd5b4c905524c8b4d89/html5/thumbnails/54.jpg)
http://blog.acaro.org
http://github.com/claudiomartella/
@claudiomartella
http://joind.in/2946