Introduction to Neo4j and .Net

Intro to Neo4j and .Net: Harnessing the Power of the Graph
Michael Hunger, DevWeek Nürnberg 2015


Agenda

• Neo4j Introduction
• Relational Pains – Graph Pleasure
• Data Modeling
• Query with Cypher
• Neo4j and .Net
• Drivers & Azure
• Demo
• Q&A

Neo4j Intro

Because Data Relationships Matter

What is it with Relationships?

• World is full of connected people, events, things
• There is “Value in Relationships”!
• What about Data Relationships?
• How do you store your object model?
• How do you explain JOIN tables to your boss?

Neo4j – allows you to connect the dots

• Was built to efficiently
  • store,
  • query and
  • manage highly connected data
• Transactional, ACID
• Real-time OLTP
• Open source
• Highly scalable on a few machines

Value from Data Relationships: Common Use Cases

Internal Applications
• Master Data Management
• Network and IT Operations
• Fraud Detection

Customer-Facing Applications
• Real-Time Recommendations
• Graph-Based Search
• Identity and Access Management

Neo4j Browser – Built-in Learning

RDBMS to Graph – Familiar Examples

Neo4j Browser – First Class Graph Visualization

• Graph Visualization
• Tabular Results
• Visual Query Plan
• X-Ray Mode
• Export to CSV, JSON, PNG, SVG
• Graph Style Sheet
• Auto-Retrieve Connections
• Much more … to come.

Working with Neo4j

Model, Import, Query

The Whiteboard Model is the Physical Model

Eliminates Graph-to-Relational Mapping

In your data model
• Bridge the gap between business and IT models

In your application
• Greatly reduce need for application code

[Figure: example property graph. Person nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”) and (name: “Ann”, born: Dec 5, 1975), a CAR node (brand: “Volvo”, model: “V70”), and a DRIVES relationship (since: Jan 10, 2011)]

Property Graph Model Components

Nodes
• The objects in the graph
• Can have name-value properties
• Can be labeled

Relationships
• Relate nodes by type and direction
• Can have name-value properties

[Figure: two PERSON nodes with LOVES, LIVES WITH and OWNS relationships]

Cypher: Powerful and Expressive Query Language

MATCH (:Person {name:"Dan"})-[:LOVES]->(:Person {name:"Ann"})

[Figure: the pattern annotated. Two NODEs, each with a LABEL (Person) and a PROPERTY (name: Dan, name: Ann), connected by a LOVES relationship]

Getting Data into Neo4j

Cypher-Based “LOAD CSV” Capability
• Transactional (ACID) writes
• Initial and incremental loads of up to 10 million nodes and relationships

Command-Line Bulk Loader neo4j-import
• For initial database population
• For loads up to 10B+ records
• Up to 1M records per second

4.58 million things and their relationships…
Loads in 100 seconds!

From RDBMS to Neo4j

Relational Pains = Graph Pleasure

Relational DBs Can’t Handle Relationships Well

• Cannot model or store data and relationships without complexity

• Performance degrades with number and levels of relationships, and database size

• Query complexity grows with need for JOINs

• Adding new types of data and relationships requires schema redesign, increasing time to market

… making traditional databases inappropriate when data relationships are valuable in real-time

• Slow development
• Poor performance
• Low scalability
• Hard to maintain

Unlocking Value from Your Data Relationships

• Model your data naturally as a graph of data and relationships

• Drive graph model from domain and use-cases

• Use relationship information in real-time to transform your business

• Add new relationships on the fly to adapt to your changing requirements

High Query Performance with a Native Graph DB

• Relationships are first-class citizens
• No need for joins, just follow pre-materialized relationships of nodes
• Query & Data-locality – navigate out from your starting points
• Only load what’s needed
• Aggregate and project results as you go
• Optimized disk and memory model for graphs

MATCH (boss)-[:MANAGES*0..3]->(mgr)
WHERE boss.name = "John Doe" AND (mgr)-[:MANAGES]->()
RETURN mgr.name AS Manager, size((mgr)-[:MANAGES*1..3]->()) AS Total

Express Complex Queries Easily with Cypher

Find all reports and how many people they manage, each up to 3 levels down

Cypher Query

SQL Query
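To make the semantics of the variable-length pattern concrete, here is a minimal Python sketch of the same question over an in-memory map. The sample hierarchy and the function names are hypothetical, and the sketch assumes the hierarchy is a tree, so counting reachable people gives the same number as counting matching paths. It only illustrates what the query asks for and says nothing about how Neo4j executes it.

```python
from collections import deque

# Hypothetical sample hierarchy: manager -> direct reports (assumed to be a tree).
MANAGES = {
    "John Doe": ["Alice", "Bob"],
    "Alice": ["Carol", "Dave"],
    "Carol": ["Erin"],
    "Erin": ["Frank"],
    "Frank": ["Grace"],
}


def within_levels(start, min_depth, max_depth):
    """People reachable from start in min_depth..max_depth MANAGES hops."""
    found = []
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth >= min_depth:
            found.append(person)
        if depth < max_depth:
            for report in MANAGES.get(person, []):
                queue.append((report, depth + 1))
    return found


def managers_with_totals(boss):
    """Mirror of the query: managers up to 3 levels below boss, with report counts."""
    rows = []
    for mgr in within_levels(boss, 0, 3):       # (boss)-[:MANAGES*0..3]->(mgr)
        if MANAGES.get(mgr):                    # AND (mgr)-[:MANAGES]->()
            total = len(within_levels(mgr, 1, 3))  # size((mgr)-[:MANAGES*1..3]->())
            rows.append((mgr, total))
    return rows


for manager, total in managers_with_totals("John Doe"):
    print(manager, total)
```

The depth bounds play the role of `*0..3` and `*1..3`: a lower bound of 0 includes the boss, while a lower bound of 1 excludes the manager from their own total.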

High Query Performance: Some Numbers

• Traverse 2-4M+ relationships per second per core

• Cost based query optimizer – complex queries return in milliseconds

• Import 100K-1M records per second transactionally

• Bulk import tens of billions of records in a few hours

Querying Your Data

Basic Pattern: Tom Hanks' Movies?

MATCH (:Person {name:"Tom Hanks"})-[:ACTED_IN]->(:Movie {title:"Forrest Gump"})

[Figure: the pattern annotated. Two NODEs, each with a LABEL and a PROPERTY (Tom Hanks, Forrest Gump), connected by an ACTED_IN relationship]

Basic Query: Tom Hanks' Movies?

MATCH (actor:Person)-[:ACTED_IN]->(m:Movie)
WHERE actor.name = "Tom Hanks"
RETURN *

Basic Query: Tom Hanks' Movies?

Query Comparison: Colleagues of Tom Hanks?

SELECT *
FROM Person AS actor
JOIN ActorMovie AS am1 ON (actor.id = am1.actor_id)
JOIN ActorMovie AS am2 ON (am1.movie_id = am2.movie_id)
JOIN Person AS coll ON (coll.id = am2.actor_id)
WHERE actor.name = 'Tom Hanks'

MATCH (actor:Person)-[:ACTED_IN]->()<-[:ACTED_IN]-(coll:Person)
WHERE actor.name = "Tom Hanks"
RETURN *

Basic Query Comparison: Colleagues of Tom Hanks?
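The two-hop shape of the Cypher pattern (actor to movie, then movie back to the other actors) can be sketched in Python over plain pairs. The sample data and function names are hypothetical. Cypher returns one row per matching path, so a colleague may appear several times, whereas this sketch deduplicates and sorts for readability.

```python
from collections import defaultdict

# Hypothetical sample data: (actor, movie) pairs standing in for ACTED_IN relationships.
ACTED_IN = [
    ("Tom Hanks", "Forrest Gump"),
    ("Robin Wright", "Forrest Gump"),
    ("Gary Sinise", "Forrest Gump"),
    ("Tom Hanks", "Apollo 13"),
    ("Gary Sinise", "Apollo 13"),
    ("Kevin Bacon", "Apollo 13"),
    ("Kevin Bacon", "Footloose"),
]


def build_indexes(pairs):
    """Index the pairs in both directions so each hop is a dictionary lookup."""
    movies_of = defaultdict(set)
    cast_of = defaultdict(set)
    for actor, movie in pairs:
        movies_of[actor].add(movie)
        cast_of[movie].add(actor)
    return movies_of, cast_of


def colleagues(name, pairs):
    """Everyone who shares at least one movie with name."""
    movies_of, cast_of = build_indexes(pairs)
    result = set()
    for movie in movies_of[name]:   # first hop: actor -> movie
        result |= cast_of[movie]    # second hop: movie -> its cast
    result.discard(name)            # a relationship is matched at most once per pattern
    return sorted(result)


print(colleagues("Tom Hanks", ACTED_IN))
```

The two dictionaries do the work of the two ActorMovie joins in the SQL version: each hop is a lookup from a known starting point instead of a join over the whole table.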

Most prolific actors and their filmography?

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN p.name, count(*), collect(m.title) AS movies
ORDER BY count(*) DESC, p.name ASC
LIMIT 10;

Most prolific actors and their filmography?
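As a rough illustration of what the aggregation computes, the Python sketch below groups hypothetical (actor, title) pairs by actor, mirroring the implicit grouping by `p.name`, then counts, collects, orders and limits. The data and names are placeholders and this is not how the database evaluates the query.

```python
from collections import defaultdict

# Hypothetical (actor, title) pairs, one per ACTED_IN relationship.
SAMPLE = [
    ("Keanu Reeves", "The Matrix"),
    ("Carrie-Anne Moss", "The Matrix"),
    ("Keanu Reeves", "The Matrix Reloaded"),
    ("Tom Hanks", "Forrest Gump"),
]


def most_prolific(pairs, limit=10):
    """Rows of (name, count, titles), ordered like the Cypher query."""
    titles = defaultdict(list)
    for actor, title in pairs:
        titles[actor].append(title)               # collect(m.title)
    rows = [(actor, len(ts), ts) for actor, ts in titles.items()]  # count(*)
    rows.sort(key=lambda row: (-row[1], row[0]))  # ORDER BY count(*) DESC, p.name ASC
    return rows[:limit]                           # LIMIT 10


for name, count, movies in most_prolific(SAMPLE):
    print(name, count, movies)
```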

Neo4j Query Planner

Cost based Query Planner since Neo4j 2.2
• Uses database stats to select best plan
• Currently for Read Operations
• Query Plan Visualizer, finds
  • Non optimal queries
  • Cartesian Product
  • Missing Indexes, Global Scans
  • Typos
  • Massive Fan-Out

Query Planner

Slight change, add a Label to query -> more stats available -> new plan with fewer database-hits

Neo4j Remoting Protocols

• Cypher HTTP Endpoint is
  • Fast
  • Transactional (multi-request)
  • Streaming
  • Batching
  • Parameters
  • Statistics, Query Plan, Result Representations

:POST /db/data/transaction/commit {"statements":[{"statement": "MATCH (p:Person) WHERE p.name = {name} RETURN p", "parameters":{"name":"Clint Eastwood"}}]}

• Up next: binary protocol
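The request shown above can be assembled with nothing but the Python standard library. This is a minimal sketch under stated assumptions: the base URL (a local server on port 7474 with authentication disabled) and all function names are hypothetical and need adjusting for a real setup. The `run` helper requires a running server and is therefore only defined, not called.

```python
import json
import urllib.request

# Assumption: local Neo4j 2.x server, default HTTP port, authentication disabled.
BASE_URL = "http://localhost:7474"
COMMIT_PATH = "/db/data/transaction/commit"


def build_payload(statement, parameters):
    """Wrap one parameterized Cypher statement in the endpoint's JSON envelope."""
    return {"statements": [{"statement": statement, "parameters": parameters}]}


def build_request(statement, parameters, base_url=BASE_URL):
    """Create the POST request without sending it."""
    body = json.dumps(build_payload(statement, parameters)).encode("utf-8")
    return urllib.request.Request(
        base_url + COMMIT_PATH,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def run(statement, parameters, base_url=BASE_URL):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_request(statement, parameters, base_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_request(
    "MATCH (p:Person) WHERE p.name = {name} RETURN p",
    {"name": "Clint Eastwood"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the name as a parameter rather than concatenating it into the statement keeps the query text constant across calls.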

Neo4j for .Net Developers

Install, Drivers, Deployment, Hosting

Neo4j for .Net Developers

Don’t be afraid or disgusted, because “Java”

It’s just a database implemented in some language

You’ll rarely see it.

Neo4j for .Net Developers - Installation

• Neo4j Windows Installer was first

• Chocolatey Packages for Neo4j

• Upcoming in Neo4j 2.3 - full PowerShell support

• Just install Neo4j as a service

• More to come

Neo4j for .Net Developers - Drivers

• Neo4jClient – one of the first Neo4j Drivers
  • by Readify Australia
  • Uses Neo4j’s HTTP APIs
  • Opinionated
  • Query DSL
• NetGain – new and thin layer over APIs
• New Drivers for binary protocol

Neo4j for .Net Developers – Development & Deployment

• Develop
  • on Windows with Visual Studio
  • everywhere with Mono / Xamarin
• Develop locally with local Neo4j instance
• Deploy to Azure, use provisioned instances

Neo4j on Azure – Hosting / Provisioning

• Hosted Neo4j Databases by GrapheneDB
• Just install on Linux instance
• VMDepot Images
• Upcoming: Docker

Develop a simple Movie Database

Demo
neo4j.com/developer/dotnet

Single Page WebApp on the Movie Dataset

Single Page WebApp on the Movie Dataset

• Bootstrap
• JavaScript (jQuery)
• 3 JSON HTTP endpoints
  • Single: /movie/title/The%20Matrix
  • Search: /search?query=Matrix
  • Graph: /graph?limit=100

• Send XHR, Render results

Data Model

public class Person
{
    public string name { get; set; }
    public int born { get; set; }
}

public class Movie
{
    public string title { get; set; }
    public int released { get; set; }
    public string tagline { get; set; }
}

[Figure: data model. (Person {name, born})-[:ACTED_IN|DIRECTED|…]->(Movie {title, released, tagline}), with “Forrest Gump” as the example movie]

Setup

• Add Neo4jClient as dependency
• Store GraphDB-URL in Web.config
• Connect in WebApiConfig

// ConfigurationManager comes from System.Configuration
var url = ConfigurationManager.AppSettings["GraphDBUrl"];
var client = new GraphClient(new Uri(url));
client.Connect();

Routes & Controllers

• Provide Routes for
  • index.html and
  • 3 endpoints
• 4 Controllers:
  • query with parameter,
  • return results as JSON

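using System.Linq;
using System.Web.Http;

[RoutePrefix("search")]
public class SearchController : ApiController
{
    [HttpGet]
    [Route("")]
    public IHttpActionResult SearchMoviesByTitle(string q)
    {
        // q is bound from the request's query string and wrapped in a
        // case-insensitive regex that matches it anywhere in the title
        var data = WebApiConfig.GraphClient.Cypher
            .Match("(m:Movie)")
            .Where("m.title =~ {title}")
            .WithParam("title", "(?i).*" + q + ".*")
            .Return<Movie>("m")
            .Results.ToList();

        // Wrap each movie in an object with a "movie" property for the JSON response
        return Ok(data.Select(c => new { movie = c }));
    }
}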
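The search builds its regular expression by concatenating the user's input, and Cypher's `=~` has to match the whole property value, which is why the pattern is wrapped in `.*`. The Python sketch below mimics that matching on a hypothetical list of titles. The `re.escape` call is an addition of this sketch and not part of the controller above, and Python's regex dialect differs in details from the Java one Neo4j uses.

```python
import re

# Hypothetical titles standing in for Movie nodes.
TITLES = ["The Matrix", "The Matrix Reloaded", "Forrest Gump", "What If?"]


def title_pattern(query):
    """Case-insensitive 'contains' pattern, as in the controller.

    re.escape is this sketch's addition: it makes regex metacharacters
    in the user input match literally.
    """
    return "(?i).*" + re.escape(query) + ".*"


def search(titles, query):
    """Titles matched in full by the pattern, like Cypher's =~ operator."""
    pattern = re.compile(title_pattern(query))
    return [title for title in titles if pattern.fullmatch(title)]


print(search(TITLES, "matrix"))
print(search(TITLES, "if?"))
```

Without the escaping step, an input such as `if?` would be read as "i followed by an optional f" rather than as the literal text.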

Production Architecture & Integration

Neo4j Clustering Architecture Optimized for Speed & Availability at Scale


Performance Benefits
• No network hops within queries
• Real-time operations with fast and consistent response times
• Cache sharding spreads cache across cluster for very large graphs

Clustering Features
• Master-slave replication with master re-election and failover
• Each instance has its own local cache
• Horizontal scaling & disaster recovery

[Figure: a load balancer in front of three Neo4j instances]

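Three Ways to Migrate Data to Neo4j

• Migrate all data: the application keeps all data in the graph database
• Migrate graph data: non-graph data stays in the relational database, graph data moves to the graph database
• Duplicate graph data: all data stays in the relational database, graph data is also kept in the graph database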

Neo4j Fits into Your Enterprise Environment

[Figure: architecture diagram. Labels: Application, End User, Graph Database Cluster (three Neo4j instances), Data Storage and Business Rules Execution, Data Mining and Aggregation, Ad Hoc Analysis, Data Scientist, Bulk Analytic Infrastructure (Graph Compute Engine, EDW, …), Databases (Relational, NoSQL, Hadoop)]

Get up to speed with Neo4j Quickly and Easily

There Are Lots of Ways to Easily Learn Neo4j

Resources

Online
• Developer Site: neo4j.com/developer
  • DotNet Page
  • Guide: Cypher
  • Guide: CSV Import
• Courses
  • Pluralsight
  • Wintellect Now
• Reference Manual
• StackOverflow

Offline
• In Browser Guides
• Training Classes (Intro, Modeling)
• Office Hours
• Professional Services Workshop
• Free e-Books:
  • Graph Databases 2nd Ed (O'Reilly)
  • Learning Neo4j

Summary: Introduction Neo4j & .Net

Neo4j Allows You…
• Keep your rich data model
• Handle relationships efficiently
• Write queries easily
• Develop applications quickly

For .Net Developers
• Neo4j Installer
• Drivers for Neo4j from .Net
• Host Database on Azure
• Deploy Apps to Azure

Users Love Neo4j

Thank You!
Ask Questions, or Tweet

@neo4j | http://neo4j.com
@mesirii | Michael Hunger