Webinar: RDBMS to Graphs
Transcript of Webinar: RDBMS to Graphs
RDBMS TO GRAPH
Webinar, live from San Mateo, March 9, 2017
TRADITIONAL DATA STRUCTURE
[Figure: six applications (HR-tools, Supply, Payments, Logistics, CRM, Support), each shown with its own RDBMS]
SHIFT TOWARDS SYSTEMS OF ENGAGEMENT
Users Engaging With Devices
Users Engaging With Users
Devices Engaging With Devices
SYSTEMS OF ENGAGEMENT
SHIFT TOWARDS SYSTEMS OF ENGAGEMENT
You are here!
Data volume
SYSTEMS OF RECORD: Relational Database Model
Structured
Pre-computed
Based on rigid rules
SYSTEMS OF ENGAGEMENT: NoSQL Database Model
Highly Flexible
Real-Time Queries
Highly Contextual
SYSTEMS OF RECORD
SYSTEMS OF ENGAGEMENT
This is data modeled as a graph!
Intuitiveness, Speed, Agility
Intuitiveness
Speed
“We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10-100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.”
- Volker Pacher, Senior Developer
“Minutes to milliseconds” performance
Queries up to 1000x faster than RDBMS or other NoSQL
Intuitiveness
Speed, because Native Graph Database
Agility
WHAT DOES “NATIVE” MEAN?
Employee ID | Name | PictureRef | Building | Office | Department | Title | Degree1 | Uni1 | Major1
4951870 | John Doe | s3://acme-pics/4951870.png | 1200 | 124A | Eng | Software Engineer II | MS | Harvard | Computer Science
9765207 | Jane Smith | s3://acme-pics/9765207.png | 1300 | 187D | BizOps | Sr Operations Associate | BS | Stanford | Physics
4150915 | Shyam Bhatt | s3://acme-pics/4150915.png | 45 | 432C | Sales | Enterprise Sales Assoc | MBA | Penn | Accounting
7566243 | Kathryn Bates | s3://acme-pics/7566243.png | 44 | 334B | Eng | Staff Software Engineer | PhD | UCB | Computer Science
WHY DOES “NATIVE” MATTER?
Think of a Relational DB Query
EmpID Name PictureRef
4951870 John Doe s3://acme-pics/4951870.png
9765207 Jane Smith s3://acme-pics/9765207.png
4150915 Shyam Bhatt s3://acme-pics/4150915.png
7566243 Kathryn Bates s3://acme-pics/7566243.png
EmpID Manager ID StartDate EndDate
4951870 9765207 20170101 null
9765207 7566243 20150130 null
4150915 8795882 20141215 20150312
7566243 8509238 20120605 20140124
EmpID Building Office
4951870 1200 124A
9765207 1300 187D
4150915 45 432C
16+ Index Lookups
Expensive!
(Partial) Graph View
1 Index Lookup (find :Employee nodes)
Then Index-Free Adjacency
:Employee{id:4951870}
:Employee{id:9765207}
:Office{id: 1200124a}
:Building{id: 1200}
[:IS_MANAGED_BY]
[:HAS_OFFICE]
[:LOCATED_IN]
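The difference between repeated index lookups and index-free adjacency can be sketched in a few lines of Python. This is only an illustration: the `Node` class, the dictionary used as an index, and the direction of each relationship are assumptions, not a description of how Neo4j stores data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory node; a sketch, not Neo4j's storage format."""
    label: str
    props: dict
    # Outgoing relationships: type -> list of direct references to nodes.
    out: dict = field(default_factory=dict)

    def connect(self, rel_type, other):
        self.out.setdefault(rel_type, []).append(other)


# Build the partial graph from the slide (relationship directions assumed).
john = Node("Employee", {"id": 4951870})
jane = Node("Employee", {"id": 9765207})
office = Node("Office", {"id": "1200124a"})
building = Node("Building", {"id": 1200})
john.connect("IS_MANAGED_BY", jane)
john.connect("HAS_OFFICE", office)
office.connect("LOCATED_IN", building)

# One index lookup to find the starting :Employee node...
employee_index = {n.props["id"]: n for n in (john, jane)}
start = employee_index[4951870]

# ...then every hop follows a stored reference, with no further lookups.
manager = start.out["IS_MANAGED_BY"][0]
start_building = start.out["HAS_OFFICE"][0].out["LOCATED_IN"][0]

print(manager.props["id"], start_building.props["id"])
```

In the relational version, each of those hops would instead be another keyed lookup into a separate table.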
A Naturally Adaptive Model + A Query Language Designed for Connectedness = Agility
Cypher
Typical Complex SQL Join vs. The Same Query using Cypher
MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total
Project Impact
Less time writing queries
Less time debugging queries
Code that’s easier to read
ABOUT ME
• Developed web apps for 5 years, including e-commerce, business workflow, and more
• Worked at Google for 8 years on Google Apps, Cloud Platform
• Technologies: Python, Java, BigQuery, Oracle, MySQL, OAuth
@ryguyrg
NEO4j USE CASES
Real Time Recommendations
Master Data Management
Fraud Detection
Identity & Access Management
Graph Based Search
Network & IT-Operations
GRAPH THINKING: Real Time Recommendations
[Figure: graph with VIEWED and BOUGHT relationships]
“As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.”
- Marcos Wada, Software Developer, Walmart
GRAPH THINKING: Master Data Management
[Figure: graph with MANAGES, LEADS, REGION, and COLLABORATES relationships]
Neo4j is the heart of Cisco HMP: used for governance and single source of truth and a one-stop shop for all of Cisco’s hierarchies.
GRAPH THINKING: Master Data Management
[Figure: graph of Solution, SupportCase, KnowledgeBaseArticle, and Message nodes]
Neo4j is the heart of Cisco’s Helpdesk Solution too.
GRAPH THINKING: Fraud Detection
[Figure: graph with OPENED_ACCOUNT, HAS, IS_ISSUED, and LIVES relationships]
“Graph databases offer new methods of uncovering fraud rings and other sophisticated scams with a high-level of accuracy, and are capable of stopping advanced fraud scenarios in real-time.”
- Gorka Sadowski, Cyber Security Expert
GRAPH THINKING: Graph Based Search
[Figure: graph with PUBLISH, INCLUDE, CREATE, CAPTURE, IN, SOURCE, and USES relationships]
Uses Neo4j to manage the digital assets inside of its next generation in-flight entertainment system.
GRAPH THINKING: Network & IT-Operations
[Figure: graph with BROWSES, CONNECTS, BRIDGES, ROUTES, POWERS, HOSTS, and QUERIES relationships]
Uses Neo4j for network topology analysis for big telco service providers
GRAPH THINKING: Identity And Access Management
[Figure: graph with TRUSTS, ID, AUTHENTICATES, OWNS, and CAN_READ relationships]
UBS was the recipient of the 2014 Graphie Award for “Best Identity And Access Management App”
Neo4j Adoption by Selected Verticals
SOFTWARE, FINANCIAL SERVICES, RETAIL, MEDIA & BROADCASTING, SOCIAL NETWORKS, TELECOM, HEALTHCARE
AGENDA
• Use Cases
• SQL Pains
• Building a Neo4j Application
• Moving from RDBMS -> Graph Models
• Walk through an Example
• Creating Data in Graphs
• Querying Data
SQL
Day in the Life of a RDBMS Developer
SELECT p.name, c.country, c.leader, p.hair, u.name, u.pres, u.state
FROM people p
  LEFT JOIN country c ON c.ID = p.country
  LEFT JOIN uni u ON p.uni = u.id
WHERE u.state = 'CT'
JOIN
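To make the LEFT JOIN query above concrete, here is a minimal sketch that runs it against an in-memory SQLite database using Python's standard `sqlite3` module. The table layouts and every row are assumptions made up for illustration; the slides show only the query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and rows; only the query below comes from the slides.
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, country TEXT, leader TEXT);
CREATE TABLE uni (id INTEGER PRIMARY KEY, name TEXT, pres TEXT, state TEXT);
CREATE TABLE people (name TEXT, hair TEXT, country INTEGER, uni INTEGER);
INSERT INTO country VALUES (1, 'Exampleland', 'Leader A');
INSERT INTO uni VALUES (1, 'Uni One', 'President X', 'CT'),
                       (2, 'Uni Two', 'President Y', 'CA');
INSERT INTO people VALUES ('Ann', 'brown', 1, 1),
                          ('Dan', 'black', 1, 2);
""")

# Every relationship is resolved at query time by matching key columns.
rows = conn.execute("""
SELECT p.name, c.country, c.leader, p.hair, u.name, u.pres, u.state
FROM people p
  LEFT JOIN country c ON c.id = p.country
  LEFT JOIN uni u ON p.uni = u.id
WHERE u.state = 'CT'
""").fetchall()

print(rows)
```

Because the WHERE clause filters on `u.state`, a person whose university is not in CT (or who has no matching `uni` row) is dropped even though the join is a LEFT JOIN.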
SQL Pains
• Complex to model and store relationships
• Performance degrades with increases in data
• Queries get long and complex
• Maintenance is painful
Graph Gains
• Easy to model and store relationships
• Performance of relationship traversal remains constant with growth in data size
• Queries are shortened and more readable
• Adding additional properties and relationships can be done on the fly, with no migrations
What does this Graph look like?
CYPHER
Property Graph Model
CREATE (:Person {name: "Dan"})-[:LOVES]->(:Person {name: "Ann"})
[Figure: two NODEs, Dan and Ann, each with a LABEL and a PROPERTY, joined by a LOVES relationship]
MATCH (p:Person)-[:WENT_TO]->(u:Uni),
      (p)-[:LIVES_IN]->(c:Country),
      (u)-[:LED_BY]->(l:Leader),
      (u)-[:LOCATED_IN]->(s:State)
WHERE s.abbr = 'CT'
RETURN p.name, c.country, c.leader, p.hair, u.name, l.name, s.abbr
How do you use Neo4j?
CREATE MODEL + LOAD DATA
QUERY DATA
Official Language Drivers
Community Language Drivers
Java Stored Procedures and Functions
GET STARTED TODAY!!!
Architectural Options
[Figure labels: End User, Application, Graph Database Cluster (Neo4j, Neo4j, Neo4j), Data Storage and Business Rules Execution, Data Mining and Aggregation, Ad Hoc Analysis, Data Scientist, Bulk Analytic Infrastructure (Hadoop, EDW…), Databases (Relational, NoSQL, Hadoop)]
RDBMS to Graph Options
MIGRATE ALL DATA
MIGRATE SUBSET
DUPLICATE SUBSET
[Figure labels: Application, All Queries, Graph Queries, Non-Graph Queries, Relational Database, Graph Database, All Data, Non-Graph Data]
FROM RDBMS TO GRAPHS
Northwind
Northwind - the canonical RDBMS Example
( )-[:TO]->(Graph)
( )-[:IS_BETTER_AS]->(Graph)
Starting with the ER Diagram
Locate the Foreign Keys
Drop the Foreign Keys
Find the JOIN Tables
(Simple) JOIN Tables Become Relationships
Attributed JOIN Tables -> Relationships with Properties
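The modeling steps above can be sketched in plain Python. The rows below are hypothetical and only shaped like Northwind tables, and the `IN_TERRITORY` relationship name is an assumption made for the example (`REPORTS_TO` and `INCLUDES` appear elsewhere in these slides). Only Employee nodes are materialized, to keep the sketch short.

```python
# Hypothetical rows shaped like Northwind tables; the values are illustrative.
employees = [
    {"EmployeeID": 2, "FirstName": "Andrew", "ReportsTo": None},
    {"EmployeeID": 5, "FirstName": "Steven", "ReportsTo": 2},
]
employee_territories = [  # simple JOIN table: two foreign keys, nothing else
    {"EmployeeID": 5, "TerritoryID": "02903"},
]
order_details = [  # attributed JOIN table: foreign keys plus its own columns
    {"OrderID": 10248, "ProductID": 11, "UnitPrice": 14.0, "Quantity": 12},
]

nodes = {}          # (label, key) -> properties
relationships = []  # (start node, type, end node, properties)

# Each entity row becomes a node; the foreign key column is dropped.
for row in employees:
    nodes[("Employee", row["EmployeeID"])] = {"firstName": row["FirstName"]}

# Each foreign key becomes a relationship.
for row in employees:
    if row["ReportsTo"] is not None:
        relationships.append(
            (("Employee", row["EmployeeID"]), "REPORTS_TO",
             ("Employee", row["ReportsTo"]), {}))

# A simple JOIN table row becomes a plain relationship.
for row in employee_territories:
    relationships.append(
        (("Employee", row["EmployeeID"]), "IN_TERRITORY",
         ("Territory", row["TerritoryID"]), {}))

# An attributed JOIN table row becomes a relationship with properties.
for row in order_details:
    relationships.append(
        (("Order", row["OrderID"]), "INCLUDES",
         ("Product", row["ProductID"]),
         {"unitPrice": row["UnitPrice"], "quantity": row["Quantity"]}))

for rel in relationships:
    print(rel)
```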
Querying a Subset Today
As a Graph
QUERYING THE GRAPH
using openCypher
Property Graph Model
CREATE (:Employee {firstName: "Steven"})-[:REPORTS_TO]->(:Employee {firstName: "Andrew"})
[Figure: two NODEs, Steven and Andrew, each with a LABEL and a PROPERTY, joined by a REPORTS_TO relationship]
Who do people report to?
MATCH (e:Employee)<-[:REPORTS_TO]-(sub:Employee)
RETURN *

MATCH (e:Employee)<-[:REPORTS_TO]-(sub:Employee)
RETURN e.employeeID AS managerID, e.firstName AS managerName,
       sub.employeeID AS employeeID, sub.firstName AS employeeName;
Who does Robert report to?
MATCH p=(e:Employee)<-[:REPORTS_TO]-(sub:Employee)
WHERE sub.firstName = 'Robert'
RETURN p
What is Robert’s reporting chain?
MATCH p=(e:Employee)<-[:REPORTS_TO*]-(sub:Employee)
WHERE sub.firstName = 'Robert'
RETURN p
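What the variable-length `REPORTS_TO*` match does can be sketched as a walk up the management chain. This is a simplified illustration, assuming each employee has at most one manager; the edges below are hypothetical and not taken from the Northwind data, and `reporting_paths` is a made-up helper name.

```python
def reporting_paths(reports_to, start):
    """Return one path per chain length, starting at `start` and moving up."""
    paths = []
    path = [start]
    seen = {start}
    current = start
    # Follow REPORTS_TO edges until there is no manager or a cycle appears.
    while current in reports_to and reports_to[current] not in seen:
        current = reports_to[current]
        seen.add(current)
        path.append(current)
        paths.append(list(path))
    return paths


# Hypothetical REPORTS_TO edges (employee -> manager).
reports_to = {"Robert": "Steven", "Steven": "Andrew"}
chain = reporting_paths(reports_to, "Robert")
print(chain)
```

As with the Cypher query, the result holds one path for each length: the direct manager first, then each longer chain up to the top.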
Who’s the Big Boss?
MATCH (e:Employee)
WHERE NOT (e)-[:REPORTS_TO]->()
RETURN e.firstName AS bigBoss
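The `WHERE NOT (e)-[:REPORTS_TO]->()` filter keeps only employees with no outgoing REPORTS_TO relationship. A small Python sketch of the same idea, using hypothetical names and edges rather than the real data set:

```python
# Hypothetical employees and REPORTS_TO edges (employee -> manager).
employees = ["Andrew", "Steven", "Robert", "Nancy"]
reports_to = {"Steven": "Andrew", "Robert": "Steven", "Nancy": "Andrew"}

# An employee with no outgoing REPORTS_TO edge is a "big boss".
big_bosses = [name for name in employees if name not in reports_to]
print(big_bosses)
```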
Product Cross-Selling
MATCH (choc:Product {productName: 'Chocolade'})<-[:INCLUDES]-(:Order)<-[:SOLD]-(employee),
      (employee)-[:SOLD]->(o2)-[:INCLUDES]->(other:Product)
RETURN employee.firstName, other.productName, COUNT(DISTINCT o2) AS count
ORDER BY count DESC
LIMIT 5;
(ASIDE ON GRAPH COMPUTE)
Shortest Path Between Airports
MATCH p = shortestPath((a:Airport {code: "SFO"})-[*0..2]->(b:Airport {code: "MSO"}))
RETURN p
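The idea behind a hop-limited shortest path can be sketched with a breadth-first search. This is a generic illustration and not a description of how Neo4j implements `shortestPath`; the route map is hypothetical and does not reflect real flight schedules.

```python
from collections import deque


def shortest_path(graph, start, goal, max_hops):
    """Breadth-first search over directed edges, limited to max_hops hops."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 == max_hops:
            continue  # this path is already at the hop limit
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # no path within the hop limit


# Hypothetical directed routes between airport codes.
routes = {"SFO": ["SEA", "DEN"], "SEA": ["MSO"], "DEN": ["MSO"], "MSO": []}
print(shortest_path(routes, "SFO", "MSO", max_hops=2))
```

Breadth-first search visits airports in order of hop count, so the first path that reaches the goal uses the fewest hops.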
(END ASIDE ON GRAPH COMPUTE)
POWERING AN APP
Simple App
Simple Python Code
LOADING OUR DATA
CSV
CSV files for Northwind
3 Steps to Creating the Graph
IMPORT NODES -> CREATE INDEXES -> IMPORT RELATIONSHIPS
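Why the three steps come in this order can be sketched with Python's standard `csv` module. The inline CSV text below is a small illustrative sample with Northwind-style headers, not the real files, and the dictionaries stand in for database indexes.

```python
import csv
import io

# Illustrative sample data with Northwind-style column names.
customers_csv = (
    "CustomerID,CompanyName\n"
    "ALFKI,Alfreds Futterkiste\n"
    "ANATR,Ana Trujillo Emparedados\n"
)
orders_csv = "OrderID,CustomerID\n10643,ALFKI\n10308,ANATR\n10692,ALFKI\n"

# Step 1: import nodes, one per CSV row.
customers = [
    {"customerID": r["CustomerID"], "companyName": r["CompanyName"]}
    for r in csv.DictReader(io.StringIO(customers_csv))
]
orders = [{"orderID": r["OrderID"]} for r in csv.DictReader(io.StringIO(orders_csv))]

# Step 2: create indexes so nodes can be found quickly by their ID.
customer_by_id = {c["customerID"]: c for c in customers}
order_by_id = {o["orderID"]: o for o in orders}

# Step 3: import relationships by looking up both end nodes in the indexes.
purchased = []
for r in csv.DictReader(io.StringIO(orders_csv)):
    customer = customer_by_id.get(r["CustomerID"])
    order = order_by_id.get(r["OrderID"])
    if customer and order:
        purchased.append((customer["customerID"], "PURCHASED", order["orderID"]))

print(purchased)
```

Without the indexes from step 2, each relationship row in step 3 would have to scan every node to find its two end points.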
Importing Nodes
// Create customers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/customers.csv" AS row
CREATE (:Customer {companyName: row.CompanyName, customerID: row.CustomerID, fax: row.Fax, phone: row.Phone});

// Create products
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
CREATE (:Product {productName: row.ProductName, productID: row.ProductID, unitPrice: toFloat(row.UnitPrice)});

// Create suppliers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/suppliers.csv" AS row
CREATE (:Supplier {companyName: row.CompanyName, supplierID: row.SupplierID});

// Create employees
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/employees.csv" AS row
CREATE (:Employee {employeeID: row.EmployeeID, firstName: row.FirstName, lastName: row.LastName, title: row.Title});
Creating Relationships
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (customer:Customer {customerID: row.CustomerID})
MERGE (customer)-[:PURCHASED]->(order);

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
MATCH (product:Product {productID: row.ProductID})
MATCH (supplier:Supplier {supplierID: row.SupplierID})
MERGE (supplier)-[:SUPPLIES]->(product);

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (product:Product {productID: row.ProductID})
MERGE (order)-[pu:INCLUDES]->(product)
ON CREATE SET pu.unitPrice = toFloat(row.UnitPrice), pu.quantity = toFloat(row.Quantity);

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (employee:Employee {employeeID: row.EmployeeID})
MERGE (employee)-[:SOLD]->(order);
High Performance LOADing
neo4j-import
4.58 million things and their relationships…
Loads in 100 seconds!
JDBC
apoc.load.jdbc
THERE’S A PROCEDURE FOR THAT
https://github.com/neo4j-contrib/neo4j-apoc-procedures
WRAPPING UP
“We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10 to 100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.”
- Volker Pacher, Senior Developer
THANK YOU!
Ryan Boyd @ryguyrg