Post on 12-Jul-2015
Graph All The Things: Introduction to Graph Databases
Neo4j GraphDays 2014, Chicago
Philip Rathle, VP of Products, Neo4j
@prathle #neo4j
INDUSTRY TRENDS: GRAPHS TRANSFORMED CONSUMER WEB
Use of Relationship Information in The Consumer Web
Ref: http://www.gartner.com/id=2081316
GARTNER’S 5 GRAPHS OF CONSUMER WEB: SUSTAINABLE COMPETITIVE DIFFERENTIATION COMES FROM MASTERING 5 GRAPHS
Consumer Web Giants Depend on Five Graphs (Gartner’s “5 Graphs”):
Social Graph
Interest Graph
Payment Graph
Intent Graph
Mobile Graph
Key-Value (Riak, Redis, membase)
0x235C  Philip
0xCD21  Neo4j Chicago
0x2014  [PPR,RB,NL]
0x3821  [CHI, SFO, BOS]
0x3890  B75DD108A

Column Family (Cassandra, HBase)
Key     Name           UID  Members    Groups         Photo
0x235C  Philip         PPR  -          CHI, SFO, BOS  B75DD108A893A
0xCD21  Neo4j Chicago  CHI  PPR,RB,NL  -              218758D88E901

Document DB (MongoDB, CouchDB)
0x235C {name:Philip, UID: PPR, Groups: [CHI,SFO,BOS]}
0xCD21 {name:Neo4j Chicago, UID: CHI, Members:[PPR,RB,NL], where:{city:Chicago, State: IL}}

Graph DB (Neo4j)
Node: name:Neo4j Chicago, UID: CHI, Photo: 218758D88E901
Node: name:Philip, UID: PPR, Photo: B75DD108A893A
Relationship: MEMBER since: 2011
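To make the comparison concrete, here is a minimal Python sketch of the same two records in key-value, document, and graph shape. It uses plain in-memory dictionaries as hypothetical stand-ins, not any real database client. The function names, the direction of the MEMBER relationship (person to group), and the group's UID being CHI (as in the column-family and graph views) are assumptions made for illustration.

```python
# Hypothetical in-memory stand-ins for the models on the slide; no real database is used.

# Key-value: opaque keys, values the store does not interpret.
key_value = {
    "0x235C": "Philip",
    "0xCD21": "Neo4j Chicago",
    "0x2014": ["PPR", "RB", "NL"],
    "0x3821": ["CHI", "SFO", "BOS"],
}

# Document: one nested record per key; links are just ID lists inside documents.
documents = {
    "0x235C": {"name": "Philip", "UID": "PPR", "Groups": ["CHI", "SFO", "BOS"]},
    "0xCD21": {"name": "Neo4j Chicago", "UID": "CHI", "Members": ["PPR", "RB", "NL"]},
}

# Graph: nodes plus first-class relationships that carry their own properties.
nodes = {
    "PPR": {"name": "Philip"},
    "CHI": {"name": "Neo4j Chicago"},
}
relationships = [
    # Direction (person -> group) is an assumption; the slide only shows the link.
    {"start": "PPR", "type": "MEMBER", "end": "CHI", "since": 2011},
]


def groups_via_documents(uid):
    """Resolve group names by scanning documents for matching UIDs (an application-side join)."""
    person = next(d for d in documents.values() if d["UID"] == uid)
    wanted = person.get("Groups", [])
    return [d["name"] for d in documents.values() if d["UID"] in wanted]


def groups_via_graph(uid):
    """Follow MEMBER relationships directly from the person node."""
    return [nodes[r["end"]]["name"] for r in relationships
            if r["start"] == uid and r["type"] == "MEMBER"]
```

In the document shape the link between Philip and the group exists only as an ID inside a list, so the application has to look it up. In the graph shape the MEMBER relationship is a record of its own and can hold properties such as since: 2011.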
UNLOCKING THE POTENTIAL OF RELATIONSHIPS IN DATA
A GRAPH DATABASE IS PURPOSE-BUILT FOR:
When your business depends on Relationships in Data
THE PROPERTY GRAPH MODEL
Ann -[Loves]-> Dan
(Ann)-[:LOVES]->(Dan)
(:Person {name:"Ann"})-[:LOVES]->(:Person {name:"Dan"})
Node (label, property) - Relationship (type) - Node (label, property)
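As a rough illustration of the model's three ingredients (labels and properties on nodes, a type on relationships), here is a small Python sketch. The class names Node and Relationship and the helper describe are hypothetical and are not part of any Neo4j API. They only mirror the vocabulary on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)   # e.g. {"name": "Ann"}


@dataclass
class Relationship:
    start: Node
    type: str                                        # e.g. "LOVES"
    end: Node
    properties: dict = field(default_factory=dict)


ann = Node({"Person"}, {"name": "Ann"})
dan = Node({"Person"}, {"name": "Dan"})
loves = Relationship(ann, "LOVES", dan)


def describe(rel):
    """Render a relationship in the ASCII-art pattern style used on the slides."""
    def node(n):
        label = ":".join(sorted(n.labels))
        props = ", ".join(f'{k}:"{v}"' for k, v in n.properties.items())
        return f"(:{label} {{{props}}})"
    return f"{node(rel.start)}-[:{rel.type}]->{node(rel.end)}"


print(describe(loves))
```

The relationship has a direction (start to end) and exactly one type, while a node may carry any number of labels and properties.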
CYPHER
Query: Whom does Ann love?
MATCH (:Person {name:"Ann"})-[:LOVES]->(whom)
RETURN whom
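One way to read the MATCH ... RETURN query is as a pattern that is laid over the graph. The Python sketch below imitates that reading on a toy graph. The data layout and the function match_outgoing are hypothetical, and the linear scan is only for clarity; it does not reflect how a graph database actually executes the query.

```python
# Hypothetical toy graph: node id -> (label, properties), plus (start, type, end) triples.
nodes = {
    1: ("Person", {"name": "Ann"}),
    2: ("Person", {"name": "Dan"}),
    3: ("Person", {"name": "Eve"}),
}
relationships = [(1, "LOVES", 2), (3, "KNOWS", 1)]


def match_outgoing(label, props, rel_type):
    """Return the properties of every end node matching (:label {props})-[:rel_type]->(whom)."""
    results = []
    for start, rtype, end in relationships:
        start_label, start_props = nodes[start]
        if (rtype == rel_type and start_label == label
                and all(start_props.get(k) == v for k, v in props.items())):
            results.append(nodes[end][1])
    return results


# Roughly: MATCH (:Person {name:"Ann"})-[:LOVES]->(whom) RETURN whom
whom = match_outgoing("Person", {"name": "Ann"}, "LOVES")
```

Here whom holds Dan's properties: the only LOVES relationship starts at the Person node named Ann and ends at Dan.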
UNDER THE HOOD
MATCH (:Person {name:"Ann"})-[:LOVES]->(whom) RETURN whom
Cypher
Native graph processing
Native storage
BUSINESS & PROJECT IMPACT
#1: EASIER TO UNDERSTAND COMPLEX MODELS
#2: EASIER TO EXPRESS COMPLEX QUERIES
“Find all sushi restaurants in NYC that my friends like”
“Find all direct reports and how many they manage, up to 3 levels down”
Example HR Query:
MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total
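To show what the variable-length patterns in the HR query compute, here is a small Python sketch over a made-up org chart. The names in manages and the helpers reachable and report_counts are hypothetical. The sketch assumes the hierarchy is a strict tree, so each person is reached by exactly one path, and it is only an approximation of the query's semantics.

```python
from collections import Counter

# Hypothetical org chart: manager -> direct reports (the MANAGES relationships).
manages = {
    "John Doe": ["Ana", "Ben"],
    "Ana": ["Cy", "Di"],
    "Cy": ["Ed"],
}


def reachable(start, min_hops, max_hops):
    """Yield one name per MANAGES path of min_hops..max_hops hops from start."""
    frontier = [start]
    for depth in range(max_hops + 1):
        if depth >= min_hops:
            yield from frontier
        # Step one level further down the hierarchy.
        frontier = [r for m in frontier for r in manages.get(m, [])]


def report_counts(boss):
    """Mimic (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report)."""
    totals = Counter()
    for sub in reachable(boss, 0, 3):          # boss itself plus up to 3 levels down
        for _report in reachable(sub, 1, 3):   # everyone 1 to 3 levels below sub
            totals[sub] += 1
    return dict(totals)


counts = report_counts("John Doe")
```

With this sample data, counts is {"John Doe": 5, "Ana": 3, "Cy": 1}. People who manage nobody do not appear, because the second pattern finds no match for them.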
SAME QUERY IN SQL ( ! ! )
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT manager.pid AS directReportees, 0 AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   UNION
   SELECT manager.pid AS directReportees, count(manager.directly_manages) AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT manager.pid AS directReportees, count(reportee.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT manager.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT manager.directly_manages AS directReportees, 0 AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   UNION
   SELECT reportee.pid AS directReportees, count(reportee.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT L1Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT reportee.directly_manages AS directReportees, 0 AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT L2Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT L2Reportees.directly_manages AS directReportees, 0 AS count
 FROM person_reportee manager
 JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
 JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
 WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
)
#3: PERFORMANCE
PERFORMANCE AT SCALE
RDBMS/Other vs. Native Graph Database
[Chart: Response Time (y-axis) vs. Connectedness of Data Set (x-axis)]
RDBMS / Other NoSQL: # Hops: 0-2; Degree: < 3; Size: Thousands
Neo4j: # Hops: Tens to Hundreds; Degree: Thousands+; Size: Billions+
1000x faster
DATABASE  # PEOPLE   QUERY TIME (MS)
MySQL     1,000      2,000
Neo4j     1,000      2
Neo4j     1,000,000  2
Business Impact: Move Faster
“The whole design, development, QA, and release process for CrunchBase Events was a total of 2 weeks.”
“The ability to iterate that quickly is a mammoth step up for us. In CrunchBase 1.0 (MySQL), it probably would have taken 2 months.”
- Kurt Freytag, CTO CrunchBase
[Chart: Total Dollar Amount vs. Transaction Count, with regions marked "Investigate"]
Business Impact: Invent Faster
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code.”
- Volker Pacher, Senior Developer eBay
Business Impact: Run Faster
Neo Technology, Inc Confidential
Real-Time / OLTP
Offline / Batch
Connected Queries Enable Real-Time Analytics
GRAPHS ARE TRANSFORMING THE WORLD
Core Industries:
Web / ISV
Financial Services
Telecommunications
Health Care & Life Sciences
Web Social, HR & Recruiting
Media & Publishing
Energy, Services, Automotive, Gov’t, Logistics, Education, Gaming, Other
Use Cases:
Network & Data Center Management
Master Data Management
Social
Geo
Finance
Recommendations
Identity & Access Mgmt
Search & Discovery
BI, CRM, Impact Analysis, Fraud Detection, Resource Optimization, etc.
Neo4j Adoption Snapshot
GRAPH DATABASES - THE FASTEST GROWING DBMS CATEGORY
Source: http://db-engines.com/en/ranking/graph+dbms
[Chart: % of Enterprises Using Graph Databases, by year (2011, 2014, 2017); data labels: 0%, 2.5%, 25%]
“Forrester estimates that over 25% of enterprises will be using graph databases by 2017”
“25% of survey respondents said they plan to use Graph databases in the future.”
Sources:
• Forrester TechRadar™: Enterprise DBMS, Feb 13 2014 (http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801)
• Dataversity Mar 31 2014: “Deconstructing NoSQL: Analysis of a 2013 Survey on the Use, Production and Assessment of NoSQL Technologies in the Enterprise” (http://www.dataversity.net)
• Neo Technology customer base in 2011 and 2014
• Estimation of other graph vendors’ customer base in 2011 and 2014 based on best available intelligence
GRAPH DATABASES - POWERING THE ENTERPRISE
“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.”
Ref: Gartner, ‘IT Market Clock for Database Management Systems, 2014,’ September 22, 2014, https://www.gartner.com/doc/2852717/it-market-clock-database-management
GRAPH DATABASES - CAN TRANSFORM YOUR BUSINESS
Summary
When your business depends on Relationships in Data
Your Mission:
Connect.