Neo4j and graph databases introduction

Posted on 11-Apr-2017



Presented by Stefano Conoci

The problem

Need to scale to big data

Data is increasingly connected

Each user action creates data relationships

SQL JOINs scale poorly on highly connected data

Big and complex queries

Solution: NoSQL and graph databases

NoSQL for scalability

Neo4j: leading graph database

“Neo4j is a highly scalable native graph database that leverages data relationships as first-class entities, helping enterprises build intelligent applications to meet today’s evolving data challenges.”

Intuitive approach to data

Neo4j key characteristics and features

● Flexible schema
● Native graph storage
● Native graph processing
● ACID
● REST API
● Data visualization
● Cypher query language

Relational vs graph: relationship model

Pricing and scaling

Community edition: GPLv3

Neo4j Enterprise edition:

● Different licenses
● In-memory page cache
● Clustering
● Cache sharding
● Monitoring

Cypher query language

SQL is bad for graph queries

Cypher as a declarative graph query language

● Ask for data to match specific patterns
● Expressive and efficient
● Understandable by non-technical people

What syntax?!

You won’t see this coming.

Cypher basic query structure

Annotated parts of the query: clauses, node, identifier, property

Relationship syntax and labels

Annotated parts of the query: label, relationship
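To make the vocabulary above concrete, here is a minimal Python sketch of a property graph and of matching a single pattern against it. It is only an illustration of the concepts (nodes with labels and properties, typed relationships), not how Neo4j stores or matches data. The names `Node`, `Relationship`, `match`, and the sample data are hypothetical. The comments show the Cypher pattern each call roughly corresponds to.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                       # e.g. {"Person"}
    properties: dict = field(default_factory=dict)    # e.g. {"name": "Alice"}


@dataclass
class Relationship:
    start: Node
    type: str                                         # e.g. "KNOWS"
    end: Node


def match(relationships, start_label, rel_type, end_label, **start_props):
    """Rough analogue of:
    MATCH (a:start_label {start_props})-[:rel_type]->(b:end_label) RETURN a, b
    """
    results = []
    for rel in relationships:
        if rel.type != rel_type:
            continue
        if start_label not in rel.start.labels or end_label not in rel.end.labels:
            continue
        # Every requested property must be present with the same value.
        if all(rel.start.properties.get(k) == v for k, v in start_props.items()):
            results.append((rel.start, rel.end))
    return results


# Hypothetical sample data.
alice = Node({"Person"}, {"name": "Alice"})
bob = Node({"Person"}, {"name": "Bob"})
neo4j = Node({"Database"}, {"name": "Neo4j"})
rels = [
    Relationship(alice, "KNOWS", bob),
    Relationship(alice, "USES", neo4j),
    Relationship(bob, "USES", neo4j),
]

# MATCH (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person) RETURN b.name
friends = [b.properties["name"] for _, b in match(rels, "Person", "KNOWS", "Person", name="Alice")]
print(friends)
```

In the Cypher form, the round brackets hold a node (identifier, label, properties) and the square brackets inside the arrow hold the relationship type.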

Cypher’s magic!

Query comparison

SQL

SELECT product.product_name AS Recommendation, count(1) AS Frequency
FROM product, customer_product_mapping,
  (SELECT cpm3.product_id, cpm3.customer_id
   FROM customer_product_mapping cpm, customer_product_mapping cpm2, customer_product_mapping cpm3
   WHERE cpm.customer_id = 'customer-one'
     AND cpm.product_id = cpm2.product_id
     AND cpm2.customer_id != 'customer-one'
     AND cpm3.customer_id = cpm2.customer_id
     AND cpm3.product_id NOT IN (SELECT DISTINCT product_id
                                 FROM customer_product_mapping cpm
                                 WHERE cpm.customer_id = 'customer-one')
  ) recommended_products
WHERE customer_product_mapping.product_id = product.product_id
  AND customer_product_mapping.product_id = recommended_products.product_id
  AND customer_product_mapping.customer_id = recommended_products.customer_id
GROUP BY product.product_name
ORDER BY Frequency DESC

Cypher

MATCH (u:Customer {customer_id: 'customer-one'})-[:BOUGHT]->(p:Product)<-[:BOUGHT]-(peer:Customer)-[:BOUGHT]->(reco:Product)
WHERE NOT (u)-[:BOUGHT]->(reco)
RETURN reco AS Recommendation, count(*) AS Frequency
ORDER BY Frequency DESC LIMIT 5;
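The traversal that the Cypher query describes can be sketched in plain Python to show what the pattern is asking for: start at one customer, walk to the products they bought, then to other customers who bought the same products, then to products those peers bought that the first customer has not. This is a minimal sketch over an in-memory dictionary, not how Neo4j executes the query. The function name `recommend` and the sample purchases are hypothetical.

```python
from collections import Counter

# Hypothetical sample data: customer_id -> set of products bought.
bought = {
    "customer-one": {"laptop", "mouse"},
    "customer-two": {"laptop", "keyboard", "monitor"},
    "customer-three": {"mouse", "keyboard"},
}


def recommend(bought, customer_id, limit=5):
    """Count (shared product, peer, recommended product) paths per recommended product."""
    mine = bought[customer_id]
    frequency = Counter()
    for peer, products in bought.items():
        if peer == customer_id:
            continue
        shared = mine & products          # (u)-[:BOUGHT]->(p)<-[:BOUGHT]-(peer)
        if not shared:
            continue                      # this peer has nothing in common with u
        for reco in products - mine:      # WHERE NOT (u)-[:BOUGHT]->(reco)
            frequency[reco] += len(shared)  # one matched path per shared product
    return frequency.most_common(limit)   # ORDER BY Frequency DESC LIMIT 5


print(recommend(bought, "customer-one"))
```

With this sample data, "keyboard" is reached through two peers and "monitor" through one, so "keyboard" ranks first.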

Neo4j vs SQL: social graph

Sample social graph

● With 1000 persons
● Average of 50 friends per person
● PathExists(a,b) limited to depth 4
● Cache warmed up
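The PathExists(a,b) check with a depth limit of 4 can be sketched as a breadth-first search that stops expanding after four hops. This is a minimal Python illustration over an adjacency dictionary, assuming friendship is symmetric. It is not the benchmark code from the slides, and the name `path_exists` and the small chain graph are hypothetical.

```python
from collections import deque


def path_exists(friends, a, b, max_depth=4):
    """Return True if b is reachable from a in at most max_depth hops."""
    if a == b:
        return True
    visited = {a}
    frontier = deque([(a, 0)])            # (person, hops from a)
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue                      # do not expand beyond the depth limit
        for friend in friends.get(person, ()):
            if friend == b:
                return True
            if friend not in visited:
                visited.add(friend)
                frontier.append((friend, depth + 1))
    return False


# Hypothetical sample graph: a chain 0 - 1 - 2 - 3 - 4 - 5 of mutual friends.
chain = {i: [] for i in range(6)}
for i in range(5):
    chain[i].append(i + 1)
    chain[i + 1].append(i)

print(path_exists(chain, 0, 4))  # 4 hops away
print(path_exists(chain, 0, 5))  # 5 hops away
```

Each hop only follows the friend lists of people already reached, which is the kind of local traversal a graph database is built around. The relational equivalent needs one self-join of the friendship table per hop.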

Neo4j vs NoSQL

Let’s see some action!

Case for graph databases as default choice

Graph databases as successor of RDBMS:

● ACID
● High applicability
● Good performance in the majority of scenarios
● Faster development

But for this to happen we need a standard language...

Questions?!

Thank you

Download Neo4j: http://neo4j.com/download/

Get started: http://neo4j.com/developer/get-started/ and have FUN!