Neo4j Manual 2.1 SNAPSHOT

The Neo4j Manual v2.1-SNAPSHOT

The Neo4j Team neo4j.org www.neotechnology.com


Publication date 2014-03-03 04:16:34

Copyright 2014 Neo Technology

Starting points

What is a graph database?
Cypher Query Language
Languages / Remote Client Libraries
REST API
Installation
Upgrading
Security

License: Creative Commons 3.0

This book is presented in open source and licensed through Creative Commons 3.0. You are free to copy, distribute, transmit, and/or adapt the work. This license is based upon the following conditions:

Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).

Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.

Any of the above conditions can be waived if you get permission from the copyright holder.

In no way are any of the following rights affected by the license:

Your fair dealing or fair use rights
The author's moral rights
Rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights

Note: For any reuse or distribution, you must make clear to the others the license terms of this work. The best way to do this is with a direct link to this page: http://creativecommons.org/licenses/by-sa/3.0/

Table of Contents

Preface
I. Introduction
    1. Neo4j Highlights
    2. Graph Database Concepts
    3. The Neo4j Graph Database
II. Tutorials
    4. Getting started with Cypher
    5. Data Modeling Examples
    6. Languages
III. Cypher Query Language
    7. Introduction
    8. Syntax
    9. General Clauses
    10. Reading Clauses
    11. Writing Clauses
    12. Importing Data from CSV
    13. Functions
    14. Schema
    15. From SQL to Cypher
IV. Reference
    16. Capabilities
    17. Transaction Management
    18. Data Import
    19. Graph Algorithms
    20. REST API
    21. Deprecations
V. Operations
    22. Installation & Deployment
    23. Configuration & Performance
    24. High Availability
    25. Backup
    26. Security
    27. Monitoring
VI. Tools
    28. Web Interface
    29. Neo4j Shell
VII. Community
    30. Community Support
    31. Contributing to Neo4j
VIII. Advanced Usage
    32. Extending the Neo4j Server
    33. Using Neo4j embedded in Java applications
    34. The Traversal Framework
    35. Legacy Indexing
    36. Batch Insertion
A. Manpages
    neo4j
    neo4j-installer
    neo4j-shell
    neo4j-backup
    neo4j-arbiter

Preface

This is the reference manual for Neo4j version 2.1-SNAPSHOT, authored by the Neo4j Team.

The main parts of the manual are:

Part I, Introduction: introducing graph database concepts and Neo4j.
Part II, Tutorials: learn how to use Neo4j.
Part III, Cypher Query Language: details on the Cypher query language.
Part IV, Reference: detailed information on Neo4j.
Part V, Operations: how to install and maintain Neo4j.
Part VI, Tools: guides on tools.
Part VII, Community: getting help from, and contributing to, the community.
Part VIII, Advanced Usage: using Neo4j in more advanced ways.
Appendix A, Manpages: command line documentation.

The material is practical, technical, and focused on answering specific questions. It addresses how things work, what to do and what to avoid to successfully run Neo4j in a production environment.

The goal is to be thumb-through and rule-of-thumb friendly.

Each section should stand on its own, so you can hop right to whatever interests you. When possible, the sections distill rules of thumb which you can keep in mind whenever you wander out of the house without this manual in your back pocket.

The included code examples are executed when Neo4j is built and tested. Also, the REST API request and response examples are captured from real interaction with a Neo4j server. Thus, the examples are always in sync with how Neo4j actually works.

There are other documentation resources besides the manual as well:

Neo4j Cypher Refcard, see http://docs.neo4j.org/ for available versions.
Neo4j GraphGist, an online tool for creating interactive web pages with executable Cypher statements: http://gist.neo4j.org/.
The main Neo4j site at http://www.neo4j.org/ is a good starting point to learn about Neo4j.

Who should read this?

The topics should be relevant to architects, administrators, developers and operations personnel.

Part I. Introduction

This part gives a bird's eye view of what a graph database is, and then outlines some specifics of Neo4j.

Chapter 1. Neo4j Highlights

As a robust, scalable and high-performance database, Neo4j is suitable for full enterprise deployment, or a subset of the full server can be used in lightweight projects.

It features:

true ACID transactions,
high availability,
scales to billions of nodes and relationships,
high speed querying through traversals,
declarative graph query language.

Proper ACID behavior is the foundation of data reliability. Neo4j enforces that all operations that modify data occur within a transaction, guaranteeing consistent data. This robustness extends from single instance embedded graphs to multi-server high availability installations. For details, see Chapter 17, Transaction Management.

Reliable graph storage can easily be added to any application. A graph can scale in size and complexity as the application evolves, with little impact on performance. Whether starting new development, or augmenting existing functionality, Neo4j is only limited by physical hardware.

A single server instance can handle a graph of billions of nodes and relationships. When data throughput is insufficient, the graph database can be distributed among multiple servers in a high availability configuration. See Chapter 24, High Availability to learn more.

The graph database storage shines when storing richly-connected data. Querying is performed through traversals, which can perform millions of traversal steps per second. A traversal step resembles a join in an RDBMS.

Chapter 2. Graph Database Concepts

This chapter contains an introduction to the graph data model and also compares it to other data models used when persisting data.

2.1. What is a Graph Database?

A graph database stores data in a graph, the most generic of data structures, capable of elegantly representing any kind of data in a highly accessible way. Let's follow along some graphs, using them to express graph concepts. We'll read a graph by following arrows around the diagram to form sentences.

2.1.1. A Graph contains Nodes and Relationships

"A Graph records data in Nodes which have Properties"

The simplest possible graph is a single Node, a record that has named values referred to as Properties. A Node could start with a single Property and grow to a few million Properties, though that can get a little awkward. At some point it makes sense to distribute the data into multiple nodes, organized with explicit Relationships.

[Figure: a Graph records data in Nodes and in Relationships; Nodes and Relationships have Properties; Relationships organize Nodes; Labels group Nodes.]

2.1.2. Relationships organize the Graph

"Nodes are organized by Relationships which also have Properties"

Relationships organize Nodes into arbitrary structures, allowing a Graph to resemble a List, a Tree, a Map, or a compound Entity, any of which can be combined into yet more complex, richly interconnected structures.

2.1.3. Labels group the Nodes

"Nodes are grouped by Labels into Sets"

Labels are a means of grouping the nodes in the graph. They can be used to restrict queries to subsets of the graph, as well as enabling optional model constraints and indexing rules.
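The concepts so far (Nodes with Properties, Relationships that connect them, Labels that group them) can be sketched in a few lines of plain Python. This is only an illustrative model, not Neo4j's API or storage format; the class names, the helper `nodes_with_label` and the sample data are assumptions made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A record with named values (properties) and zero or more labels."""
    properties: dict = field(default_factory=dict)
    labels: set = field(default_factory=set)


@dataclass
class Relationship:
    """A directed, typed connection between two nodes, with its own properties."""
    start: Node
    end: Node
    type: str
    properties: dict = field(default_factory=dict)


def nodes_with_label(nodes, label):
    """Restrict a collection of nodes to the subset carrying the given label."""
    return [n for n in nodes if label in n.labels]


# Hypothetical sample data, chosen only for illustration.
alice = Node({"name": "Alice"}, {"Person"})
bob = Node({"name": "Bob"}, {"Person"})
acme = Node({"name": "Acme"}, {"Company"})
knows = Relationship(alice, bob, "KNOWS", {"since": 2010})
works_at = Relationship(alice, acme, "WORKS_AT")

people = nodes_with_label([alice, bob, acme], "Person")
print([n.properties["name"] for n in people])
```

Filtering by label first is what lets a query work on a subset of the graph instead of every node.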


2.1.4. Query a Graph with a Traversal

"A Traversal navigates a Graph; it identifies Paths which order Nodes"

A Traversal is how you query a Graph, navigating from starting Nodes to related Nodes according to an algorithm, finding answers to questions like "what music do my friends like that I don't yet own," or "if this power supply goes down, what web services are affected?"

[Figure: a Traversal navigates a Graph, identifies Paths and expresses an Algorithm; the Graph records data in Nodes and in Relationships; Paths order Nodes; Relationships organize Nodes.]
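As a rough illustration of the "what music do my friends like that I don't yet own" question, the following plain-Python sketch walks two relationship steps from a starting node. It does not use Neo4j; the relationship types `FRIEND`, `LIKES` and `OWNS`, the function names and the data are hypothetical.

```python
# Hypothetical data: each relationship is (start, type, end).
relationships = [
    ("me", "FRIEND", "ann"),
    ("me", "FRIEND", "ben"),
    ("me", "OWNS", "album_a"),
    ("ann", "LIKES", "album_a"),
    ("ann", "LIKES", "album_b"),
    ("ben", "LIKES", "album_c"),
]


def outgoing(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [end for start, t, end in relationships
            if start == node and t == rel_type]


def music_friends_like_that_i_lack(person):
    """Traverse person -FRIEND-> friend -LIKES-> album, skipping owned albums."""
    owned = set(outgoing(person, "OWNS"))
    found = []
    for friend in outgoing(person, "FRIEND"):
        for album in outgoing(friend, "LIKES"):
            if album not in owned and album not in found:
                found.append(album)
    return found


print(music_friends_like_that_i_lack("me"))
```

The traversal only touches the starting node's neighborhood; nodes unrelated to "me" are never visited.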

2.1.5. Indexes look-up Nodes or Relationships

"An Index maps from Properties to either Nodes or Relationships"

Often, you want to find a specific Node or Relationship according to a Property it has. Rather than traversing the entire graph, use an Index to perform a look-up, for questions like "find the Account for username master-of-graphs."


[Figure: Indexes map from Properties to Nodes or Relationships; Nodes and Relationships have Properties; Relationships organize Nodes.]
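The idea of an index as a map from a property value to the nodes that hold it can be sketched as follows. This is a toy dictionary-based model in plain Python, not how Neo4j implements indexes; the node ids, the `kind` property and the function name `build_index` are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical nodes, keyed by an id chosen for this sketch.
nodes = {
    1: {"username": "master-of-graphs", "kind": "Account"},
    2: {"username": "novice", "kind": "Account"},
    3: {"username": "master-of-graphs-fan", "kind": "Account"},
}


def build_index(nodes, key):
    """Map each value of the given property key to the ids of nodes having it."""
    index = defaultdict(list)
    for node_id, properties in nodes.items():
        if key in properties:
            index[properties[key]].append(node_id)
    return index


username_index = build_index(nodes, "username")

# A look-up touches only the index entry, not every node in the graph.
print(username_index["master-of-graphs"])
```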

2.1.6. Neo4j is a Graph Database

"A Graph Database manages a Graph and also manages related Indexes"

Neo4j is a commercially supported open-source graph database. It was designed and built from the ground-up to be a reliable database, optimized for graph structures instead of tables. Working with Neo4j, your application gets all the expressiveness of a graph, with all the dependability you expect out of a database.


[Figure: a Graph Database manages a Graph and Indexes. The Graph records data in Nodes and in Relationships, which have Properties; Relationships organize Nodes. Indexes map from Properties to Nodes or Relationships. A Traversal navigates the Graph, identifies Paths and expresses an Algorithm; Paths order Nodes.]


2.2. Comparing Database Models

A Graph Database stores data structured in the Nodes and Relationships of a graph. How does this compare to other persistence models? Because a graph is a generic structure, let's compare how a few models would look in a graph.

2.2.1. A Graph Database transforms a RDBMS

Topple the stacks of records in a relational database while keeping all the relationships, and you'll see a graph. Where an RDBMS is optimized for aggregated data, Neo4j is optimized for highly connected data.

Figure 2.1. RDBMS (records A1 to A3, B1 to B7 and C1 to C3)

Figure 2.2. Graph Database as RDBMS (the same records A1 to A3, B1 to B7 and C1 to C3, connected as a graph)

2.2.2. A Graph Database elaborates a Key-Value Store

A Key-Value model is great for lookups of simple values or lists. When the values are themselves interconnected, you've got a graph. Neo4j lets you elaborate the simple data structures into more complex, interconnected data.


Figure 2.3. Key-Value Store (keys K1 to K3 pointing to values V1 to V3 and to other keys)

K* represents a key, V* a value. Note that some keys point to other keys as well as plain values.

Figure 2.4. Graph Database as Key-Value Store (keys K1 to K3 and values V1 to V3 connected as a graph)

2.2.3. A Graph Database relates Column-Family

Column Family (BigTable-style) databases are an evolution of key-value, using "families" to allow grouping of rows. Stored in a graph, the families could become hierarchical, and the relationships among data become explicit.

2.2.4. A Graph Database navigates a Document Store

The container hierarchy of a document database accommodates nice, schema-free data that can easily be represented as a tree, which is of course a graph. Refer to other documents (or document elements) within that tree and you have a more expressive representation of the same data. When in Neo4j, those relationships are easily navigable.


Figure 2.5. Document Store (documents D1 and D2 with subdocuments S1 to S3, values V1 to V4 and the references D2/S2 and D1/S1)

D=Document, S=Subdocument, V=Value, D2/S2 = reference to subdocument in (other) document.

Figure 2.6. Graph Database as Document Store (documents D1 and D2, subdocuments S1 to S3 and values V1 to V4 connected as a graph)

Chapter 3. The Neo4j Graph Database

This chapter goes into more detail on the data model and behavior of Neo4j.

3.1. Nodes

The fundamental units that form a graph are nodes and relationships. In Neo4j, both nodes and relationships can contain properties.

Nodes are often used to represent entities, but depending on the domain relationships may be used for that purpose as well.

Apart from properties and relationships, nodes can also be labeled with zero or more labels.

[Figure: a Node can have Relationships, Properties and Labels; Relationships can have Properties.]

Let's start out with a really simple graph, containing only a single node with one property:

[Figure: a single node with the property name: Peter]

3.2. Relationships

Relationships between nodes are a key part of a graph database. They allow for finding related data. Just like nodes, relationships can have properties.

[Figure: a Relationship has a Start node, an End node and a Relationship type, and can have Properties; the Relationship type is uniquely identified by a Name.]

A relationship connects two nodes, and is guaranteed to have valid start and end nodes.

[Figure: a Start node connected to an End node by a relationship]

As relationships are always directed, they can be viewed as outgoing or incoming relative to a node, which is useful when traversing the graph:

[Figure: a Node with an incoming relationship and an outgoing relationship]

Relationships are equally well traversed in either direction. This means that there is no need to add duplicate relationships in the opposite direction (with regard to traversal or performance).

While relationships always have a direction, you can ignore the direction where it is not useful in your application.

Note that a node can have relationships to itself as well:

[Figure: a Node with a loop relationship to itself]

To further enhance graph traversal, all relationships have a relationship type. Note that the word type might be misleading here; you could rather think of it as a label. The following example shows a simple social network with two relationship types.


[Figure: a social network of Maja, Oscar, William and Alice, connected by follows and blocks relationships]

Using relationship direction and type

What | How
--- | ---
get who a person follows | outgoing follows relationships, depth one
get the followers of a person | incoming follows relationships, depth one
get who a person blocks | outgoing blocks relationships, depth one
get who a person is blocked by | incoming blocks relationships, depth one
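The four look-ups in the table differ only in direction (outgoing or incoming) and relationship type. A minimal plain-Python sketch, assuming a hypothetical set of edges between the four people (the exact edges of the original diagram are not reproduced here), could look like this:

```python
# Hypothetical edges (start, type, end); assumptions made for this sketch only.
relationships = [
    ("Maja", "follows", "Oscar"),
    ("Oscar", "follows", "Maja"),
    ("Oscar", "blocks", "William"),
    ("William", "follows", "Alice"),
]


def outgoing(person, rel_type):
    """Depth-one traversal along outgoing relationships of one type."""
    return sorted(end for start, t, end in relationships
                  if start == person and t == rel_type)


def incoming(person, rel_type):
    """Depth-one traversal along incoming relationships of one type."""
    return sorted(start for start, t, end in relationships
                  if end == person and t == rel_type)


print(outgoing("Oscar", "follows"))   # who Oscar follows
print(incoming("Maja", "follows"))    # the followers of Maja
print(outgoing("Oscar", "blocks"))    # who Oscar blocks
print(incoming("William", "blocks"))  # who William is blocked by
```

Each relationship is stored once; reading it in the other direction needs no duplicate edge.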


3.3. Properties

Both nodes and relationships can have properties.

Properties are key-value pairs where the key is a string. Property values can be either a primitive or an array of one primitive type. For example String, int and int[] values are valid for properties.

Note: NULL is not a valid property value. NULLs can instead be modeled by the absence of a key.

[Figure: a Property has a Key and a Value. The Key is a String; the Value can be a Primitive (boolean, byte, short, int, long, float, double, char) or a String, or an array of these.]

Property value types

Type | Description | Value range
--- | --- | ---
boolean | | true/false
byte | 8-bit integer | -128 to 127, inclusive
short | 16-bit integer | -32768 to 32767, inclusive
int | 32-bit integer | -2147483648 to 2147483647, inclusive
long | 64-bit integer | -9223372036854775808 to 9223372036854775807, inclusive
float | 32-bit IEEE 754 floating-point number |
double | 64-bit IEEE 754 floating-point number |
char | 16-bit unsigned integers representing Unicode characters | \u0000 to \uffff (0 to 65535)
String | sequence of Unicode characters |

For further details on float/double values, see the Java Language Specification.
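The integer ranges in the table above follow from two's-complement arithmetic: an n-bit signed integer spans -2^(n-1) to 2^(n-1) - 1. The short Python sketch below recomputes those bounds and picks the narrowest type for a value. The helper names are hypothetical and the code is only a worked illustration, not part of any Neo4j API.

```python
# Bit widths of the signed integer property types from the table above.
INTEGER_BITS = {"byte": 8, "short": 16, "int": 32, "long": 64}


def value_range(type_name):
    """Return the inclusive (min, max) of a signed two's-complement integer type."""
    bits = INTEGER_BITS[type_name]
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def smallest_fitting_type(value):
    """Return the narrowest integer type whose range contains value, or None."""
    for type_name in ("byte", "short", "int", "long"):
        low, high = value_range(type_name)
        if low <= value <= high:
            return type_name
    return None


for name in INTEGER_BITS:
    print(name, value_range(name))
```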


3.4. Labels

A label is a named graph construct that is used to group nodes into sets; all nodes labeled with the same label belong to the same set. Many database queries can work with these sets instead of the whole graph, making queries easier to write and more efficient. A node may be labeled with any number of labels, including none, making labels an optional addition to the graph.

[Figure: a Label has a Name and groups Nodes.]

Labels are used when defining constraints and adding indexes for properties.

An example would be a label named User that you label all your nodes representing users with. With that in place, you can ask Neo4j to perform operations only on your user nodes, such as finding all users with a given name.

However, you can use labels for much more. For instance, since labels can be added and removed during runtime, they can be used to mark temporary states for your nodes. You might create an Offline label for phones that are offline, a Happy label for happy pets, and so on.

3.4.1. Label names

Any non-empty Unicode string can be used as a label name. In Cypher, you may need to use the backtick (`) syntax to avoid clashes with Cypher identifier rules. By convention, labels are written with CamelCase notation, with the first letter in upper case. For instance, User or CarOwner.

Labels have an id space of an int, meaning the maximum number of labels the database can contain is roughly 2 billion.


3.5. Paths

A path is one or more nodes with connecting relationships, typically retrieved as a query or traversal result.

[Figure: a Path has a Start Node and an End Node, and can contain one or more Relationships, each accompanied by a Node.]

The shortest possible path has length zero and looks like this:

[Figure: a single Node]

A path of length one:

[Figure: Node 1 connected to Node 2 by Relationship 1]

Another path of length one:

[Figure: Node 1 with Relationship 1 leading back to Node 1]
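The examples above suggest that the length of a path is its number of relationships. A small Python sketch, assuming a path is represented as an alternating list of nodes and relationships (a representation chosen for this illustration, not Neo4j's Path API):

```python
def path_length(path):
    """Length of a path given as [node, rel, node, rel, ..., node].

    The length is the number of relationships, so a lone node has length 0.
    """
    if len(path) % 2 == 0:
        raise ValueError("a path alternates nodes and relationships and ends on a node")
    return len(path) // 2


print(path_length(["Node"]))
print(path_length(["Node 1", "Relationship 1", "Node 2"]))
print(path_length(["Node 1", "Relationship 1", "Node 1"]))
```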


3.6. Traversal

Traversing a graph means visiting its nodes, following relationships according to some rules. In most cases only a subgraph is visited, as you already know where in the graph the interesting nodes and relationships are found.

Cypher provides a declarative way to query the graph powered by traversals and other techniques. See Part III, Cypher Query Language for more information.

Neo4j comes with a callback based traversal API which lets you specify the traversal rules. At a basic level there's a choice between traversing breadth- or depth-first.

For an in-depth introduction to the traversal framework, see Chapter 34, The Traversal Framework. For Java code examples see Section 33.7, Traversal.
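To make the breadth-first versus depth-first choice concrete, here is a small plain-Python sketch over a hypothetical adjacency list. It illustrates only the two visiting orders and does not model Neo4j's callback based traversal API.

```python
from collections import deque

# Hypothetical graph: node -> nodes reached by outgoing relationships.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["E"],
    "D": [],
    "E": [],
}


def breadth_first(start):
    """Visit nodes level by level, nearest first."""
    visited, queue = [start], deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.append(neighbor)
                queue.append(neighbor)
    return visited


def depth_first(start, visited=None):
    """Follow each branch to its end before backtracking."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            depth_first(neighbor, visited)
    return visited


print(breadth_first("A"))
print(depth_first("A"))
```

Breadth-first reaches both of A's neighbors before going deeper, while depth-first finishes the branch through B before visiting C.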


3.7. Schema

Neo4j is a schema-optional graph database. You can use Neo4j without any schema. Optionally you can introduce it in order to gain performance or modeling benefits. This allows a way of working where the schema does not get in your way until you are at a stage where you want to reap the benefits of having one.

3.7.1. Indexes

Note: This feature was introduced in Neo4j 2.0, and is not the same as the legacy indexes (see Chapter 35, Legacy Indexing).

Performance is gained by creating indexes, which improve the speed of looking up nodes in the database. Once you've specified which properties to index, Neo4j will make sure your indexes are kept up to date as your graph evolves. Any operation that looks up nodes by the newly indexed properties will see a significant performance boost.

Indexes in Neo4j are eventually available. That means that when you first create an index, the operation returns immediately. The index is populating in the background and so is not immediately available for querying. When the index has been fully populated it will eventually come online. That means that it is now ready to be used in queries.

If something should go wrong with the index, it can end up in a failed state. When it is failed, it will not be used to speed up queries. To rebuild it, you can drop and recreate the index. Look at logs for clues about the failure.

You can track the status of your index by asking for the index state through the API you are using. Note, however, that this is not yet possible through Cypher.
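The index lifecycle described above (populating in the background, then online, or failed if something goes wrong) can be pictured as a small state machine. The Python sketch below is a toy model under that reading; the class `SketchIndex`, its methods and the way a failure is triggered are assumptions for illustration and do not reflect Neo4j's implementation.

```python
from enum import Enum


class IndexState(Enum):
    POPULATING = "populating"
    ONLINE = "online"
    FAILED = "failed"


class SketchIndex:
    """Toy index over one property key; not Neo4j's implementation."""

    def __init__(self, key):
        self.key = key
        # Creating the index returns immediately, before it is populated.
        self.state = IndexState.POPULATING
        self.entries = {}

    def populate(self, nodes):
        """Stand-in for background population; ends in ONLINE or FAILED."""
        try:
            for node_id, properties in nodes.items():
                if self.key in properties:
                    self.entries.setdefault(properties[self.key], []).append(node_id)
            self.state = IndexState.ONLINE
        except Exception:
            self.state = IndexState.FAILED

    def usable(self):
        """Only an online index is used to speed up queries."""
        return self.state is IndexState.ONLINE


index = SketchIndex("name")
print(index.state)            # still populating right after creation
index.populate({1: {"name": "Ann"}, 2: {"name": "Ben"}})
print(index.state, index.usable())
```

A caller that needs the index would check its state first and fall back to scanning while the index is still populating or has failed.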

How to use indexes in the different APIs:

Cypher: Section 14.1, Indexes
REST API: Section 20.13, Indexing
Listing Indexes via Shell: Section 29.6.11, Listing Indexes and Constraints
Java Core API: Section 33.3, User database with indexes

3.7.2. Constraints

Note: This feature was introduced in Neo4j 2.0.

Neo4j can help you keep your data clean. It does so using constraints, which allow you to specify the rules for what your data should look like. Any changes that break these rules will be denied.

In this version, unique constraints are the only available constraint type.

How to use constraints in the different APIs:

Cypher: Section 14.2, Constraints
REST API: Section 20.14, Constraints
Listing Constraints via Shell: Section 29.6.11, Listing Indexes and Constraints
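To illustrate what "changes that break these rules will be denied" means for a unique constraint, here is a toy Python sketch. It assumes the constraint applies to one property of nodes carrying one label; the class names and the exception are hypothetical and are not Neo4j's API.

```python
class UniqueConstraintViolation(Exception):
    """Raised when a change would break the uniqueness rule."""


class ConstrainedStore:
    """Toy store enforcing uniqueness of one property among nodes with one label."""

    def __init__(self, label, key):
        self.label, self.key = label, key
        self.nodes = []
        self.seen = set()

    def create_node(self, labels, properties):
        """Add a node, or deny the change if it duplicates a constrained value."""
        if self.label in labels and self.key in properties:
            value = properties[self.key]
            if value in self.seen:
                raise UniqueConstraintViolation(
                    f"{self.label}.{self.key}={value!r} already exists")
            self.seen.add(value)
        self.nodes.append((set(labels), dict(properties)))


store = ConstrainedStore("User", "name")
store.create_node({"User"}, {"name": "Me"})
try:
    store.create_node({"User"}, {"name": "Me"})
except UniqueConstraintViolation as error:
    print("denied:", error)
```

Nodes without the constrained label are unaffected, so the same property value may still appear elsewhere in the graph.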

Part II. Tutorials

The tutorial part describes how to use Neo4j. It takes you from Hello World to advanced usage of graphs.

Chapter 4. Getting started with Cypher

This chapter will guide you through your first steps with Cypher.

In the online edition of this manual, all queries in this section can be executed interactively without installing Neo4j on your computer.

Otherwise, first get the Neo4j server running to try things out locally. Instructions are found in Section 22.2, Server Installation. With the server running, you can choose to issue Cypher queries from either the web interface or the Neo4j shell. See Chapter 28, Web Interface or Chapter 29, Neo4j Shell.

4.1. Create nodes and relationships

Create a node for the actor Tom Hanks:

    CREATE (n:Actor { name:"Tom Hanks" });

Let's find the node we created:

    MATCH (actor:Actor { name: "Tom Hanks" })
    RETURN actor;

Now let's create a movie and connect it to the Tom Hanks node with an ACTED_IN relationship:

    MATCH (actor:Actor)
    WHERE actor.name = "Tom Hanks"
    CREATE (movie:Movie { title:'Sleepless in Seattle' })
    CREATE (actor)-[:ACTED_IN]->(movie);

Using a WHERE clause in the query above to get the Tom Hanks node does the same thing as the pattern in the MATCH clause of the previous query.

This is how our graph looks now:

[Figure: an Actor node (name = 'Tom Hanks') connected by an ACTED_IN relationship to a Movie node (title = 'Sleepless in Seattle')]

We can do more of the work in a single clause. CREATE UNIQUE will make sure we don't create duplicate patterns. Using this: [r:ACTED_IN] lets us return the relationship.

    MATCH (actor:Actor { name: "Tom Hanks" })
    CREATE UNIQUE (actor)-[r:ACTED_IN]->(movie:Movie { title:"Forrest Gump" })
    RETURN r;

Set a property on a node:

    MATCH (actor:Actor { name: "Tom Hanks" })
    SET actor.DoB = 1944
    RETURN actor.name, actor.DoB;

The labels Actor and Movie help us organize the graph. Let's list all Movie nodes:

    MATCH (movie:Movie)
    RETURN movie AS `All Movies`;

All Movies
Node[1]{title:"Sleepless in Seattle"}
Node[2]{title:"Forrest Gump"}

2 rows


4.2. Movie Database

Our example graph consists of movies with title and year and actors with a name. Actors have ACTS_IN relationships to movies, which represents the role they played. This relationship also has a role attribute.

We'll go with three movies and three actors:

    CREATE (matrix1:Movie { title : 'The Matrix', year : '1999-03-31' })
    CREATE (matrix2:Movie { title : 'The Matrix Reloaded', year : '2003-05-07' })
    CREATE (matrix3:Movie { title : 'The Matrix Revolutions', year : '2003-10-27' })
    CREATE (keanu:Actor { name:'Keanu Reeves' })
    CREATE (laurence:Actor { name:'Laurence Fishburne' })
    CREATE (carrieanne:Actor { name:'Carrie-Anne Moss' })
    CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix1)
    CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix2)
    CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix3)
    CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix1)
    CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix2)
    CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix3)
    CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix1)
    CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix2)
    CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix3)

This gives us the following graph to play with:

[Figure: three Movie nodes ('The Matrix', 'The Matrix Reloaded', 'The Matrix Revolutions') and three Actor nodes ('Keanu Reeves', 'Laurence Fishburne', 'Carrie-Anne Moss'); each actor is connected to each movie by an ACTS_IN relationship with role 'Neo', 'Morpheus' or 'Trinity' respectively.]
Let's check how many nodes we have now:

    MATCH (n)
    RETURN "Hello Graph with " + count(*) + " Nodes!" AS welcome;

Return a single node, by name:

    MATCH (movie:Movie { title: 'The Matrix' })
    RETURN movie;

Return the title and date of the matrix node:

    MATCH (movie:Movie { title: 'The Matrix' })
    RETURN movie.title, movie.year;

Which results in:

movie.title | movie.year
--- | ---
"The Matrix" | "1999-03-31"

1 row

Show all actors:

    MATCH (actor:Actor)
    RETURN actor;


Return just the name, and order them by name:

    MATCH (actor:Actor)
    RETURN actor.name
    ORDER BY actor.name;

Count the actors:

    MATCH (actor:Actor)
    RETURN count(*);

Get only the actors whose names end with "s":

    MATCH (actor:Actor)
    WHERE actor.name =~ ".*s$"
    RETURN actor.name;
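The regular expression `.*s$` can be tried outside the database. The Python sketch below applies it to the three actor names from this section, assuming the pattern has to match the whole name (which is why `re.fullmatch` is used here); it only illustrates the pattern and says nothing about how Neo4j evaluates it.

```python
import re

# The three actor names created earlier in this section.
names = ["Keanu Reeves", "Laurence Fishburne", "Carrie-Anne Moss"]

# ".*" matches any run of characters, "s" the final letter, "$" the end.
pattern = re.compile(r".*s$")

ends_with_s = [name for name in names if pattern.fullmatch(name)]
print(ends_with_s)
```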

Here are some exploratory queries for unknown datasets. Don't do this on live production databases!

Count nodes:

    MATCH (n)
    RETURN count(*);

Count relationship types:

    MATCH (n)-[r]->()
    RETURN type(r), count(*);

type(r) | count(*)
--- | ---
"ACTS_IN" | 9

1 row

List all nodes and their relationships:

    MATCH (n)-[r]->(m)
    RETURN n AS `from`, r AS `->`, m AS to;

from | -> | to
--- | --- | ---
Node[3]{name:"Keanu Reeves"} | :ACTS_IN[0]{role:"Neo"} | Node[0]{title:"The Matrix", year:"1999-03-31"}
Node[3]{name:"Keanu Reeves"} | :ACTS_IN[1]{role:"Neo"} | Node[1]{title:"The Matrix Reloaded", year:"2003-05-07"}
Node[3]{name:"Keanu Reeves"} | :ACTS_IN[2]{role:"Neo"} | Node[2]{title:"The Matrix Revolutions", year:"2003-10-27"}
Node[4]{name:"Laurence Fishburne"} | :ACTS_IN[3]{role:"Morpheus"} | Node[0]{title:"The Matrix", year:"1999-03-31"}
Node[4]{name:"Laurence Fishburne"} | :ACTS_IN[4]{role:"Morpheus"} | Node[1]{title:"The Matrix Reloaded", year:"2003-05-07"}
Node[4]{name:"Laurence Fishburne"} | :ACTS_IN[5]{role:"Morpheus"} | Node[2]{title:"The Matrix Revolutions", year:"2003-10-27"}
Node[5]{name:"Carrie-Anne Moss"} | :ACTS_IN[6]{role:"Trinity"} | Node[0]{title:"The Matrix", year:"1999-03-31"}
Node[5]{name:"Carrie-Anne Moss"} | :ACTS_IN[7]{role:"Trinity"} | Node[1]{title:"The Matrix Reloaded", year:"2003-05-07"}
Node[5]{name:"Carrie-Anne Moss"} | :ACTS_IN[8]{role:"Trinity"} | Node[2]{title:"The Matrix Revolutions", year:"2003-10-27"}

9 rows


4.3. Social Movie Database

Our example graph consists of movies with title and year and actors with a name. Actors have ACTS_IN relationships to movies, which represents the role they played. This relationship also has a role attribute.

So far, we queried the movie data; now let's update the graph too.

    CREATE (matrix1:Movie { title : 'The Matrix', year : '1999-03-31' })
    CREATE (matrix2:Movie { title : 'The Matrix Reloaded', year : '2003-05-07' })
    CREATE (matrix3:Movie { title : 'The Matrix Revolutions', year : '2003-10-27' })
    CREATE (keanu:Actor { name:'Keanu Reeves' })
    CREATE (laurence:Actor { name:'Laurence Fishburne' })
    CREATE (carrieanne:Actor { name:'Carrie-Anne Moss' })
    CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix1)
    CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix2)
    CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix3)
    CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix1)
    CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix2)
    CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix3)
    CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix1)
    CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix2)
    CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix3)

We will add ourselves, friends and movie ratings.

Here's how to add a node for yourself and return it. Let's say your name is Me:

CREATE (me:User { name: "Me" })
RETURN me;

me
Node[6]{name:"Me"}

1 row
Nodes created: 1
Properties set: 1
Labels added: 1

Let's check if the node is there:

MATCH (me:User { name: "Me" })
RETURN me.name;

Add a movie rating:

MATCH (me:User { name: "Me" }),(movie:Movie { title: "The Matrix" })
CREATE (me)-[:RATED { stars : 5, comment : "I love that movie!" }]->(movie);

Which movies did I rate?

MATCH (me:User { name: "Me" }),(me)-[rating:RATED]->(movie)
RETURN movie.title, rating.stars, rating.comment;

movie.title rating.stars rating.comment
"The Matrix" 5 "I love that movie!"

1 row

We need a friend!

CREATE (friend:User { name: "A Friend" })
RETURN friend;

Add our friendship idempotently, so we can re-run the query without adding it several times. We return the relationship to check that it has not been created several times.

MATCH (me:User { name: "Me" }),(friend:User { name: "A Friend" })
CREATE UNIQUE (me)-[friendship:FRIEND]->(friend)
RETURN friendship;

You can rerun the query and see that it doesn't change anything the second time!

Let's update our friendship with a since property:

MATCH (me:User { name: "Me" })-[friendship:FRIEND]->(friend:User { name: "A Friend" })
SET friendship.since='forever'
RETURN friendship;

Let's pretend to be our friend and see which movies our friends have rated.

MATCH (me:User { name: "A Friend" })-[:FRIEND]-(friend)-[rating:RATED]->(movie)
RETURN movie.title, avg(rating.stars) AS stars, collect(rating.comment) AS comments, count(*);

movie.title stars comments count(*)
"The Matrix" 5.0 ["I love that movie!"] 1

1 row

That's too little data, so let's add some more friends and friendships.

MATCH (me:User { name: "Me" })
FOREACH (i IN range(1,10)| CREATE (friend:User { name: "Friend " + i }),(me)-[:FRIEND]->(friend));

Show all our friends:

MATCH (me:User { name: "Me" })-[r:FRIEND]->(friend)
RETURN type(r) AS friendship, friend.name;

friendship friend.name
"FRIEND" "A Friend"

"FRIEND" "Friend 1"

"FRIEND" "Friend 2"

"FRIEND" "Friend 3"

"FRIEND" "Friend 4"

"FRIEND" "Friend 5"

"FRIEND" "Friend 6"

"FRIEND" "Friend 7"

"FRIEND" "Friend 8"

"FRIEND" "Friend 9"

"FRIEND" "Friend 10"

11 rows



4.4. Finding Paths

Our example graph consists of movies with a title and a year, and actors with a name. Actors have ACTS_IN relationships to movies, which represent the roles they played. This relationship also has a role attribute.

We have queried and updated the data so far; now let's find interesting constellations, a.k.a. paths.

CREATE (matrix1:Movie { title : 'The Matrix', year : '1999-03-31' })
CREATE (matrix2:Movie { title : 'The Matrix Reloaded', year : '2003-05-07' })
CREATE (matrix3:Movie { title : 'The Matrix Revolutions', year : '2003-10-27' })
CREATE (keanu:Actor { name:'Keanu Reeves' })
CREATE (laurence:Actor { name:'Laurence Fishburne' })
CREATE (carrieanne:Actor { name:'Carrie-Anne Moss' })
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix1)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix2)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix3)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix1)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix2)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix3)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix1)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix2)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix3)

All other movies that actors in The Matrix acted in, ordered by occurrence:

MATCH (:Movie { title: "The Matrix" })<-[:ACTS_IN]-(actor)-[:ACTS_IN]->(movie)
RETURN movie.title, count(*)
ORDER BY count(*) DESC ;

movie.title count(*)
"The Matrix Revolutions" 3
"The Matrix Reloaded" 3

2 rows

Let's see who acted in each of these movies:

MATCH (:Movie { title: "The Matrix" })<-[:ACTS_IN]-(actor)-[:ACTS_IN]->(movie)
RETURN movie.title, collect(actor.name), count(*) AS count
ORDER BY count DESC ;

movie.title collect(actor.name) count
"The Matrix Revolutions" ["Keanu Reeves", "Laurence Fishburne", "Carrie-Anne Moss"] 3
"The Matrix Reloaded" ["Keanu Reeves", "Laurence Fishburne", "Carrie-Anne Moss"] 3

2 rows

What about co-acting, that is, actors that acted together:

MATCH (:Movie { title: "The Matrix" })(movie)

actor.name collect(distinct colleague.name)
"Laurence Fishburne" ["Keanu Reeves", "Carrie-Anne Moss"]
"Keanu Reeves" ["Laurence Fishburne", "Carrie-Anne Moss"]

3 rows

Who of those other actors acted most often with anyone from the Matrix cast?

MATCH (:Movie { title: "The Matrix" })(movie)

p length(p)
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[1]{role:"Neo"}, Node[1]{title:"The Matrix Reloaded", year:"2003-05-07"}, :ACTS_IN[4]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[3]{role:"Morpheus"}, Node[0]{title:"The Matrix", year:"1999-03-31"}, :ACTS_IN[6]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] 4

[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[1]{role:"Neo"}, Node[1]{title:"The Matrix Reloaded", year:"2003-05-07"}, :ACTS_IN[4]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[5]{role:"Morpheus"}, Node[2]{title:"The Matrix Revolutions", year:"2003-10-27"}, :ACTS_IN[8]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] 4

[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[1]{role:"Neo"}, Node[1]{title:"The Matrix Reloaded", year:"2003-05-07"}, :ACTS_IN[7]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] 2

[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[2]{role:"Neo"}, Node[2]{title:"The Matrix Revolutions", year:"2003-10-27"}, :ACTS_IN[5]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[3]{role:"Morpheus"}, Node[0]{title:"The Matrix", year:"1999-03-31"}, :ACTS_IN[6]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] 4

[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[2]{role:"Neo"}, Node[2]{title:"The Matrix Revolutions", year:"2003-10-27"}, :ACTS_IN[5]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[4]{role:"Morpheus"}, Node[1]{title:"The Matrix Reloaded", year:"2003-05-07"}, :ACTS_IN[7]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] 4

[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[2]{role:"Neo"}, Node[2]{title:"The Matrix Revolutions", year:"2003-10-27"}, :ACTS_IN[8]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] 2

9 rows

But that's a lot of data; we just want to look at the names and titles of the nodes of the path.

MATCH p =(:Actor { name: "Keanu Reeves" })-[:ACTS_IN*0..5]-(:Actor { name: "Carrie-Anne Moss" })
RETURN extract(n IN nodes(p)| coalesce(n.title,n.name)) AS `names AND titles`, length(p)
ORDER BY length(p)
LIMIT 10;

names and titles length(p)
["Keanu Reeves", "The Matrix", "Carrie-Anne Moss"] 2
["Keanu Reeves", "The Matrix Reloaded", "Carrie-Anne Moss"] 2
["Keanu Reeves", "The Matrix Revolutions", "Carrie-Anne Moss"] 2
["Keanu Reeves", "The Matrix", "Laurence Fishburne", "The Matrix Reloaded", "Carrie-Anne Moss"] 4
["Keanu Reeves", "The Matrix", "Laurence Fishburne", "The Matrix Revolutions", "Carrie-Anne Moss"] 4
["Keanu Reeves", "The Matrix Reloaded", "Laurence Fishburne", "The Matrix", "Carrie-Anne Moss"] 4
["Keanu Reeves", "The Matrix Reloaded", "Laurence Fishburne", "The Matrix Revolutions", "Carrie-Anne Moss"] 4
["Keanu Reeves", "The Matrix Revolutions", "Laurence Fishburne", "The Matrix", "Carrie-Anne Moss"] 4
["Keanu Reeves", "The Matrix Revolutions", "Laurence Fishburne", "The Matrix Reloaded", "Carrie-Anne Moss"] 4

9 rows

4.5. Labels, Constraints and Indexes

Labels are a convenient way to group nodes together. They are used to restrict queries, define constraints and create indexes.

The following gives an example of how to use labels. Let's start out by adding a constraint; in this case we decided that all Movie node titles should be unique.

CREATE CONSTRAINT ON (movie:Movie) ASSERT movie.title IS UNIQUE

Note that adding the unique constraint will add an index on that property, so we won't do that separately. If we drop the constraint, we will have to add an index instead, as needed.

In this case we want an index to speed up finding actors by name in the database:

CREATE INDEX ON :Actor(name)

Indexes can be added at any time. Constraints can be added after a label is already in use, but that requires that the existing data complies with the constraints. Note that it will take some time for an index to come online when there's existing data.

Now, let's add some data.

CREATE (actor:Actor { name:"Tom Hanks" }),(movie:Movie { title:'Sleepless in Seattle' }),
  (actor)-[:ACTED_IN]->(movie);

Normally you don't specify indexes when querying for data. They will be used automatically. This means we can simply look up the Tom Hanks node, and the index will kick in behind the scenes to boost performance.

MATCH (actor:Actor { name: "Tom Hanks" })
RETURN actor;

Now let's say we want to add another label to a node. Here's how to do that:

MATCH (actor:Actor { name: "Tom Hanks" })
SET actor :American;

To remove a label from nodes, this is what to do:

MATCH (actor:Actor { name: "Tom Hanks" })
REMOVE actor:American;

For more information on labels and related topics, see:

Section 3.4, Labels
Chapter 14, Schema
Section 14.2, Constraints
Section 14.1, Indexes
Section 9.7, Using
Section 11.4, Set
Section 11.6, Remove


Chapter 5. Data Modeling Examples

The following chapters contain simplified examples of how different domains can be modeled using Neo4j. The aim is not to give full examples, but to suggest possible ways to think using nodes, relationships, graph patterns and data locality in traversals.

The examples use Cypher queries extensively; read Part III, Cypher Query Language for more information.


5.1. Linked Lists

A powerful feature of using a graph database is that you can create your own in-graph data structures, for example a linked list.

This data structure uses a single node as the list reference. The reference has an outgoing relationship to the head of the list, and an incoming relationship from the last element of the list. If the list is empty, the reference will point to itself.

To make it clear what happens, we will show how the graph looks after each query.

To initialize an empty linked list, we simply create a node and make it link to itself. Unlike the actual list elements, it doesn't have a value property.

CREATE (root { name: 'ROOT' })-[:LINK]->(root)
RETURN root

Figure: the ROOT node (name = 'ROOT') with a single LINK relationship to itself.

Adding values is done by finding the relationship where the new value should be placed, and replacing it with a new node and two relationships to it. We also have to handle the fact that the before and after nodes could be the same as the root node. The case where before, after and the root node are all the same makes it necessary to use CREATE UNIQUE, so that we do not create two new value nodes by mistake.

MATCH (root)-[:LINK*0..]->(before),(after)-[:LINK*0..]->(root),(before)-[old:LINK]->(after)
WHERE root.name = 'ROOT' AND (before.value < 25 OR before = root) AND (25 < after.value OR after = root)
CREATE UNIQUE (before)-[:LINK]->({ value:25 })-[:LINK]->(after)
DELETE old

Figure: the ROOT node (name = 'ROOT') and one value node (value = 25), connected by two LINK relationships.

Let's add one more value:

MATCH (root)-[:LINK*0..]->(before),(after)-[:LINK*0..]->(root),(before)-[old:LINK]->(after)
WHERE root.name = 'ROOT' AND (before.value < 10 OR before = root) AND (10 < after.value OR after = root)
CREATE UNIQUE (before)-[:LINK]->({ value:10 })-[:LINK]->(after)
DELETE old

Figure: the ROOT node (name = 'ROOT') and two value nodes (value = 10 and value = 25), connected by three LINK relationships.

Deleting a value, conversely, is done by finding the node with the value and the two relationships going in and out from it, and replacing the relationships with a new one.

MATCH (root)-[:LINK*0..]->(before),(before)-[delBefore:LINK]->(del)-[delAfter:LINK]->(after),
  (after)-[:LINK*0..]->(root)
WHERE root.name = 'ROOT' AND del.value = 10
CREATE UNIQUE (before)-[:LINK]->(after)
DELETE del, delBefore, delAfter

Figure: the ROOT node (name = 'ROOT') and one value node (value = 25), connected by two LINK relationships.

Deleting the last value node is what requires us to use CREATE UNIQUE when replacing the relationships. Otherwise, we would end up with two relationships from the root node to itself, as both before and after nodes are equal to the root node, meaning the pattern would match twice.

MATCH (root)-[:LINK*0..]->(before),(before)-[delBefore:LINK]->(del)-[delAfter:LINK]->(after),
  (after)-[:LINK*0..]->(root)
WHERE root.name = 'ROOT' AND del.value = 25
CREATE UNIQUE (before)-[:LINK]->(after)
DELETE del, delBefore, delAfter

Figure: the ROOT node (name = 'ROOT') with a single LINK relationship to itself.
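The same list discipline can be tried outside the database. The following is a minimal plain-Python sketch, not Neo4j code: it models the list as a circular chain in which a sentinel object stands in for the ROOT node, so an empty list is a sentinel that links to itself. The names Node, insert, delete and values are hypothetical helpers introduced only for illustration.

```python
class Node:
    """A list element; value None marks the ROOT reference node."""

    def __init__(self, value=None):
        self.value = value
        self.next = self  # an empty list is a root that links to itself


def insert(root, value):
    """Insert value in sorted position, like replacing one LINK by two."""
    before = root
    # Walk until the next node is the root or holds a value >= the new one.
    while before.next is not root and before.next.value < value:
        before = before.next
    node = Node(value)
    node.next = before.next  # new node -> after
    before.next = node       # before -> new node


def delete(root, value):
    """Remove the first node holding value; return True if one was found."""
    before = root
    while before.next is not root:
        if before.next.value == value:
            # Bridge over the deleted node: before -> after.
            before.next = before.next.next
            return True
        before = before.next
    return False


def values(root):
    """Collect the values in list order, excluding the root itself."""
    out, node = [], root.next
    while node is not root:
        out.append(node.value)
        node = node.next
    return out
```

Inserting 25 and then 10 into an empty list and deleting both again leaves the root pointing at itself, which corresponds to the sequence of graphs shown in this section.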



5.2. TV Shows

This example shows how TV Shows with Seasons, Episodes, Characters, Actors, Users and Reviews can be modeled in a graph database.

5.2.1. Data Model

Let's start out with an entity-relationship model of the domain at hand:

Figure: entity-relationship model with the entities TV Show, Season, Episode, Review, Character, User and Actor, connected by the relationships has, featured, wrote and played.

To implement this in Neo4j we'll use the following relationship types:

Relationship Type     Description
HAS_SEASON            Connects a show with its seasons.
HAS_EPISODE           Connects a season with its episodes.
FEATURED_CHARACTER    Connects an episode with its characters.
PLAYED_CHARACTER      Connects actors with characters. Note that an actor can play multiple characters in an episode, and that the same character can be played by multiple actors as well.
HAS_REVIEW            Connects an episode with its reviews.
WROTE_REVIEW          Connects users with reviews they contributed.

5.2.2. Sample Data

Let's create some data and see how the domain plays out in practice:

CREATE (himym:TVShow { name: "How I Met Your Mother" })
CREATE (himym_s1:Season { name: "HIMYM Season 1" })
CREATE (himym_s1_e1:Episode { name: "Pilot" })
CREATE (ted:Character { name: "Ted Mosby" })
CREATE (joshRadnor:Actor { name: "Josh Radnor" })
CREATE UNIQUE (joshRadnor)-[:PLAYED_CHARACTER]->(ted)
CREATE UNIQUE (himym)-[:HAS_SEASON]->(himym_s1)
CREATE UNIQUE (himym_s1)-[:HAS_EPISODE]->(himym_s1_e1)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(ted)
CREATE (himym_s1_e1_review1 { title: "Meet Me At The Bar In 15 Minutes & Suit Up", content: "It was awesome" })
CREATE (wakenPayne:User { name: "WakenPayne" })
CREATE (wakenPayne)-[:WROTE_REVIEW]->(himym_s1_e1_review1)(himym_s1)-[:HAS_EPISODE]->(himym_s1_e1)
CREATE (marshall:Character { name: "Marshall Eriksen" })
CREATE (robin:Character { name: "Robin Scherbatsky" })
CREATE (barney:Character { name: "Barney Stinson" })
CREATE (lily:Character { name: "Lily Aldrin" })
CREATE (jasonSegel:Actor { name: "Jason Segel" })
CREATE (cobieSmulders:Actor { name: "Cobie Smulders" })
CREATE (neilPatrickHarris:Actor { name: "Neil Patrick Harris" })
CREATE (alysonHannigan:Actor { name: "Alyson Hannigan" })
CREATE UNIQUE (jasonSegel)-[:PLAYED_CHARACTER]->(marshall)
CREATE UNIQUE (cobieSmulders)-[:PLAYED_CHARACTER]->(robin)
CREATE UNIQUE (neilPatrickHarris)-[:PLAYED_CHARACTER]->(barney)
CREATE UNIQUE (alysonHannigan)-[:PLAYED_CHARACTER]->(lily)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(marshall)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(robin)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(barney)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(lily)
CREATE (himym_s1_e1_review2 { title: "What a great pilot for a show :)", content: "The humour is great." })
CREATE (atlasredux:User { name: "atlasredux" })
CREATE (atlasredux)-[:WROTE_REVIEW]->(himym_s1_e1_review2)(season)-[:HAS_EPISODE]->(episode)
WHERE tvShow.name = "How I Met Your Mother"
RETURN season.name, episode.name

season.name episode.name
"HIMYM Season 1" "Pilot"

1 row

We could also grab the reviews, if there are any, by slightly tweaking the query:

MATCH (tvShow:TVShow)-[:HAS_SEASON]->(season)-[:HAS_EPISODE]->(episode)
WHERE tvShow.name = "How I Met Your Mother"
WITH season, episode
OPTIONAL MATCH (episode)-[:HAS_REVIEW]->(review)
RETURN season.name, episode.name, review

season.name episode.name review
"HIMYM Season 1" "Pilot" Node[5]{title:"Meet Me At The Bar In 15 Minutes & Suit Up", content:"It was awesome"}
"HIMYM Season 1" "Pilot" Node[15]{title:"What a great pilot for a show :)", content:"The humour is great."}

2 rows

Now let's list the characters featured in a show. Note that in this query we only put identifiers on the nodes we actually use later on. The other nodes of the path pattern are designated by ().

MATCH (tvShow:TVShow)-[:HAS_SEASON]->()-[:HAS_EPISODE]->()-[:FEATURED_CHARACTER]->(character)
WHERE tvShow.name = "How I Met Your Mother"
RETURN DISTINCT character.name

character.name
"Ted Mosby"
"Marshall Eriksen"
"Robin Scherbatsky"
"Barney Stinson"
"Lily Aldrin"

5 rows

Now let's look at how to get all cast members of a show.

MATCH (tvShow:TVShow)-[:HAS_SEASON]->()-[:HAS_EPISODE]->(episode)-[:FEATURED_CHARACTER]->()(er_s9)
CREATE UNIQUE (er_s9)-[:HAS_EPISODE]->(er_s9_e17)
WITH er_s9_e17
MATCH (actor:Actor),(episode:Episode)
WHERE actor.name = "Josh Radnor" AND episode.name = "Peter's Progress"
WITH actor, episode
CREATE (keith:Character { name: "Keith" })
CREATE UNIQUE (actor)-[:PLAYED_CHARACTER]->(keith)
CREATE UNIQUE (episode)-[:FEATURED_CHARACTER]->(keith)

And now we'll create a query to find the episodes that he has appeared in:

MATCH (actor:Actor)-[:PLAYED_CHARACTER]->(character)(character)
character.name AS Character

Show Season Episode Character
"How I Met Your Mother" "HIMYM Season 1" "Pilot" "Ted Mosby"
"ER" "ER S7" "Peter's Progress" "Keith"

2 rows

5.3. ACL structures in graphs

This example gives a generic overview of an approach to handling Access Control Lists (ACLs) in graphs, and a simplified example with concrete queries.

5.3.1. Generic approach

In many scenarios, an application needs to handle security on some form of managed objects. This example describes one pattern to handle this through the use of a graph structure and traversers that build a full permissions structure for any managed object, with exclude and include overriding possibilities. This results in a dynamic construction of ACLs based on the position and context of the managed object.

The result is a complex security scheme that can easily be implemented in a graph structure, supporting permissions overriding, principal and content composition, without duplicating data anywhere.

Technique

As seen in the example graph layout, there are some key concepts in this domain model:

The managed content (folders and files), connected by HAS_CHILD_CONTENT relationships.
The Principal subtree pointing out principals that can act as ACL members, pointed out by the PRINCIPAL relationships.
The aggregation of principals into groups, connected by the IS_MEMBER_OF relationship. One principal (user or group) can be part of many groups at the same time.
The SECURITY relationships, connecting the content composite structure to the principal composite structure, containing an addition/removal modifier property ("+RW").



Constructing the ACL

The calculation of the effective permissions (e.g. Read, Write, Execute) for a principal for any given ACL-managed node (content) follows a number of rules that will be encoded into the permissions traversal:

Top-down traversal

This approach will let you define a generic permission pattern on the root content, and then refine that for specific sub-content nodes and specific principals.

1. Start at the content node in question and traverse upwards to the content root node to determine the path to it.
2. Start with an optimistic effective permissions list of "all permitted" (111 in a bit-encoded ReadWriteExecute case), or 000 if you prefer pessimistic security handling (everything is forbidden unless explicitly allowed).
3. Beginning from the topmost content node, look for any SECURITY relationships on it.
4. If found, check whether the principal in question is part of the end-principal of the SECURITY relationship.
5. If yes, add the "+" permission modifiers to the existing permission pattern, and revoke the "-" permission modifiers from the pattern.
6. If two principal nodes link to the same content node, first apply the more generic principal's modifiers.
7. Repeat the security modifier search all the way down to the target content node, thus overriding more generic permissions with those set on nodes closer to the target node.

The same algorithm is applicable for the bottom-up approach, basically just traversing from the target content node upwards and applying the security modifiers dynamically as the traverser goes up.

Example

Now, to get the resulting access rights for e.g. "user 1" on "My File.pdf" in a top-down approach on the model in the graph above, we would proceed as follows:

1. Traveling upward, we start with "Root folder", and set the permissions to 11 initially (only considering Read, Write).
2. There are two SECURITY relationships to that folder. User 1 is contained in both of them, but "root" is more generic, so apply it first, then "All principals" (+W +R), giving 11.
3. "Home" has no SECURITY instructions, continue.
4. "user1 Home" has SECURITY. First apply "Regular Users" (-R -W), giving 00, then "user 1" (+R +W), giving 11.
5. The target node "My File.pdf" has no SECURITY modifiers on it, so the effective permissions for "User 1" on "My File.pdf" are ReadWrite 11.
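The walk-through above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the traverser the text refers to: permissions are a two-bit Read/Write mask, the path is listed from the root down, and the SECURITY entries on each node are already ordered from the most generic principal to the most specific. The SECURITY relationship from "root" is left out because the text does not state its modifiers. The names apply_modifiers, effective_permissions, PATH and USER_1 are hypothetical.

```python
READ, WRITE = 0b10, 0b01
FLAGS = {"R": READ, "W": WRITE}


def apply_modifiers(perms, modifiers):
    """Apply modifiers such as "+R -W" to a permission bitmask."""
    for token in modifiers.split():
        sign, flag = token[0], FLAGS[token[1]]
        if sign == "+":
            perms |= flag   # grant the permission
        else:
            perms &= ~flag  # revoke the permission
    return perms


def effective_permissions(path, principals, start=READ | WRITE):
    """Walk the content path top-down and apply matching modifiers.

    path: list of (content name, [(principal, modifiers), ...]) from root
          to target, with the security entries ordered generic first.
    principals: the user plus every group the user belongs to.
    start: optimistic default (all permitted); pass 0 for pessimistic.
    """
    perms = start
    for _content, security in path:
        for principal, modifiers in security:
            if principal in principals:
                perms = apply_modifiers(perms, modifiers)
    return perms


# Assumed encoding of the example; only modifiers stated in the text are used.
PATH = [
    ("Root folder", [("All principals", "+W +R")]),
    ("Home", []),
    ("user1 Home", [("Regular Users", "-R -W"), ("user 1", "+R +W")]),
    ("My File.pdf", []),
]
USER_1 = {"user 1", "Regular Users", "All principals"}
```

With this encoding, effective_permissions(PATH, USER_1) evaluates to READ | WRITE, matching step 5. A principal that is only a member of "Regular Users" and "All principals" ends up with no permissions on the file, because the revocation on "user1 Home" is never overridden.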

5.3.2. Read-permission example

In this example, we are going to examine a tree structure of directories and files. Also, there are users that own files and roles that can be assigned to users. Roles can have permissions on directory or file structures (here we model only canRead, as opposed to full rwx Unix permissions) and be nested. A more thorough example of modeling ACL structures can be found at How to Build Role-Based Access Control in SQL.

Figure: example graph with the nodes FileRoot, Home, HomeU1, HomeU2, Desktop, etc, init.d, File1, File2, User, User1, User2, Admin1, Admin2, Role, SUDOers and Root, connected by contains, leaf, owns, member, subRole, canRead and has relationships.

Find all files in the directory structure

In order to find all files contained in this structure, we need a variable length query that follows all contains relationships and retrieves the nodes at the other end of the leaf relationships.

MATCH ({ name: 'FileRoot' })-[:contains*0..]->(parentDir)-[:leaf]->(file)
RETURN file

resulting in:

file
Node[10]{name:"File1"}
Node[9]{name:"File2"}

2 rows

What files are owned by whom?

If we introduce the concept of ownership on files, we can then ask for the owners of the files we find, connected via owns relationships to file nodes.

MATCH ({ name: 'FileRoot' })-[:contains*0..]->()-[:leaf]->(file)

Who has access to a File?

If we now want to check which users have read access to all Files, we define our ACL as:

The root directory has no access granted.
Any user having a role that has been granted canRead access to one of the parent folders of a File has read access.

In order to find users that can read any part of the parent folder hierarchy above the files, Cypher provides optional variable length paths.

MATCH (file)

5.4. Hyperedges

Imagine a user being part of different groups. A group can have different roles, and a user can be part of different groups. A user can also have different roles in different groups apart from the membership. The association of a User, a Group and a Role can be referred to as a HyperEdge. However, it can easily be modeled in a property graph as a node that captures this n-ary relationship, as depicted below in the U1G2R1 node.

Figure 5.1. Graph (nodes User1, Group, Group1, Group2, Role, Role1, Role2 and the hyperedge nodes U1G2R1 and U1G1R2, connected by hasRoleInGroup, hasGroup, hasRole, isA, canHave and in relationships)

5.4.1. Find Groups

To find out which roles a user has in a particular group (here Group2), the following query can traverse this HyperEdge node and provide answers.

Query.

MATCH ({ name: 'User1' })-[:hasRoleInGroup]->(hyperEdge)-[:hasGroup]->({ name: 'Group2' }),
  (hyperEdge)-[:hasRole]->(role)
RETURN role.name

The role of User1 is returned:

Result

role.name
"Role1"

1 row

5.4.2. Find all groups and roles for a user

Here, find all groups and the roles a user has, sorted by the name of the role.

Query.

MATCH ({ name: 'User1' })-[:hasRoleInGroup]->(hyperEdge)-[:hasGroup]->(group),
  (hyperEdge)-[:hasRole]->(role)
RETURN role.name, group.name
ORDER BY role.name ASC

The groups and roles of User1 are returned:

Result

role.name group.name
"Role1" "Group2"
"Role2" "Group1"

2 rows

5.4.3. Find common groups based on shared roles

Assume a more complicated graph:

1. Two user nodes User1, User2.
2. User1 is in Group1, Group2, Group3.
3. User1 has Role1, Role2 in Group1; Role2, Role3 in Group2; Role3, Role4 in Group3 (hyper edges).
4. User2 is in Group1, Group2, Group3.
5. User2 has Role2, Role5 in Group1; Role3, Role4 in Group2; Role5, Role6 in Group3 (hyper edges).

The graph for this looks like the following (nodes like U1G2R23 representing the HyperEdges):

Figure 5.2. Graph (nodes User1, User2, Group1 to Group3, Role1 to Role6 and the hyperedge nodes U1G1R12, U1G2R23, U1G3R34, U2G1R25, U2G2R34 and U2G3R56, connected by hasRoleInGroup, hasGroup and hasRole relationships)

To return Group1 and Group2, as User1 and User2 share at least one common role in these two groups, the query looks like this:

Query.

MATCH (u1)-[:hasRoleInGroup]->(hyperEdge1)-[:hasGroup]->(group),(hyperEdge1)-[:hasRole]->(role),
  (u2)-[:hasRoleInGroup]->(hyperEdge2)-[:hasGroup]->(group),(hyperEdge2)-[:hasRole]->(role)
WHERE u1.name = 'User1' AND u2.name = 'User2'
RETURN group.name, count(role)
ORDER BY group.name ASC

The groups where User1 and User2 share at least one common role:

Result

group.name count(role)
"Group1" 1
"Group2" 1

2 rows
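To see why only Group1 and Group2 come back, the hyperedge data listed at the start of this section can be written out as plain Python records. This is an illustrative sketch rather than Neo4j code: each tuple plays the part of one hyperedge node, and the names HYPEREDGES and shared_role_counts are hypothetical.

```python
# One record per hyperedge node: (user, group, roles held in that group).
HYPEREDGES = [
    ("User1", "Group1", {"Role1", "Role2"}),
    ("User1", "Group2", {"Role2", "Role3"}),
    ("User1", "Group3", {"Role3", "Role4"}),
    ("User2", "Group1", {"Role2", "Role5"}),
    ("User2", "Group2", {"Role3", "Role4"}),
    ("User2", "Group3", {"Role5", "Role6"}),
]


def shared_role_counts(user1, user2):
    """Return {group: number of roles both users hold in that group}."""
    roles = {}
    for user, group, held in HYPEREDGES:
        roles.setdefault((user, group), set()).update(held)

    counts = {}
    # Visit groups in name order, mirroring ORDER BY group.name ASC.
    for group in sorted({group for _user, group, _held in HYPEREDGES}):
        common = roles.get((user1, group), set()) & roles.get((user2, group), set())
        if common:  # groups without a shared role are dropped, as in the MATCH
            counts[group] = len(common)
    return counts
```

For User1 and User2 the only overlaps are Role2 in Group1 and Role3 in Group2; Group3 has no role in common and therefore does not appear.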



5.5. Basic friend finding based on social neighborhood

Imagine an example graph like the following one:

Figure 5.3. Graph (nodes Bill, Derrick, Ian, Sara, Jill and Joe, connected by knows relationships)

To find out the friends of Joe's friends that are not already his friends, the query looks like this:

Query.

MATCH (joe { name: 'Joe' })-[:knows*2..2]-(friend_of_friend)
WHERE NOT (joe)-[:knows]-(friend_of_friend)
RETURN friend_of_friend.name, COUNT(*)
ORDER BY COUNT(*) DESC , friend_of_friend.name

This returns a list of friends-of-friends ordered by the number of connections to them, and secondly by their name.

Result

friend_of_friend.name COUNT(*)
"Ian" 2
"Derrick" 1
"Jill" 1

3 rows
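The counting logic of this query can be followed step by step in a small plain-Python sketch. The edge list below is an assumption: the figure only tells us which people exist and that they are linked by knows relationships, so the pairs were chosen to be consistent with the result table. The names KNOWS, neighbours and friends_of_friends are hypothetical.

```python
from collections import Counter

# Assumed knows relationships (treated as undirected, as in the query).
KNOWS = [
    ("Joe", "Bill"), ("Joe", "Sara"), ("Sara", "Bill"), ("Sara", "Ian"),
    ("Sara", "Jill"), ("Bill", "Derrick"), ("Bill", "Ian"),
]


def neighbours(person):
    """Everyone connected to person by a knows relationship, either way."""
    found = set()
    for a, b in KNOWS:
        if a == person:
            found.add(b)
        elif b == person:
            found.add(a)
    return found


def friends_of_friends(person):
    """Count two-step connections to people who are not yet direct friends."""
    direct = neighbours(person)
    counts = Counter()
    for friend in direct:
        for candidate in neighbours(friend):
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Most connections first, then by name, like the ORDER BY clause.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Under these assumed edges, Ian is reachable through both Bill and Sara, while Derrick and Jill are each reachable through one friend only, which is the ordering shown in the result table.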



5.6. Co-favorited places

Figure 5.4. Graph (place nodes SaunaX, CoffeeShop1, CoffeeShop2, CoffeeShop3, CoffeShop2 and MelsPlace; tag nodes Cool and Cosy; person nodes Jill and Joe; connected by tagged and favorite relationships)

5.6.1. Co-favorited places: users who like x also like y

Find places that are also liked by the people who favorited this place:

Determine who has favorited place x.
What else have they favorited that is not place x?

Query.

MATCH (place)<-[:favorite]-(person)-[:favorite]->(stuff)
WHERE place.name = 'CoffeeShop1'
RETURN stuff.name, count(*)
ORDER BY count(*) DESC , stuff.name

The list of places that are favorited by people that favorited the start place.

Result

stuff.name count(*)
"MelsPlace" 2
"CoffeShop2" 1
"SaunaX" 1

3 rows

5.6.2. Co-Tagged places: places related through tags

Find places that are tagged with the same tags:

Determine the tags for place x.
What else is tagged the same as x that is not x?

Query.

MATCH (place)-[:tagged]->(tag)

Result

otherPlace.name collect(tag.name)
"MelsPlace" ["Cool", "Cosy"]
"CoffeeShop2" ["Cool"]
"CoffeeShop3" ["Cosy"]

3 rows

5.7. Find people based on similar favorites

Figure 5.5. Graph (nodes Sara, Derrick, Jill, Joe, Cats and Bikes, connected by favorite and friend relationships)

To find possible new friends based on them liking similar things as the asking person, use a query like this:

Query.

MATCH (me { name: 'Joe' })-[:favorite]->(stuff)

5.8. Find people based on mutual friends and groups

Figure 5.6. Graph (Node[0] Bill, Node[1] Group1, Node[2] Bob, Node[3] Jill and Node[4] Joe, connected by knows and member_of_group relationships)

In this scenario, the problem is to determine mutual friends and groups, if any, between persons. If no mutual groups or friends are found, a 0 should be returned.

Query.

MATCH (me { name: 'Joe' }),(other)
WHERE other.name IN ['Jill', 'Bob']
OPTIONAL MATCH pGroups=(me)-[:member_of_group]->(mg)(mf)

5.9. Find friends based on similar tagging

Figure 5.7. Graph (tag nodes Animals and Hobby; item nodes Surfing, Bikes, Horses and Cats; person nodes Sara, Derrick and Joe; connected by tagged and favorite relationships)

To find people similar to me based on the taggings of their favorited items, one approach could be:

Determine the tags associated with what I favorite.
What else is tagged with those tags?
Who favorites items tagged with the same tags?
Sort the result by how many of the same things these people like.

Query.

MATCH (me)-[:favorite]->(myFavorites)-[:tagged]->(tag)

5.10. Multirelational (social) graphs

Figure 5.8. Graph (thing nodes cats, nature, bikes and cars; person nodes Ben, Sara, Joe and Maria; connected by LIKES, FOLLOWS and LOVES relationships)

This example shows a multi-relational network between persons and things they like. A multi-relational graph is a graph with more than one kind of relationship between nodes.

Query.

MATCH (me { name: 'Joe' })-[r1:FOLLOWS|:LOVES]->(other)-[r2]->(me)
WHERE type(r1)=type(r2)
RETURN other.name, type(r1)

The query returns the people that Joe FOLLOWS or LOVES and who have the same type of relationship back to him.

Result

other.name type(r1)
"Sara" "FOLLOWS"
"Maria" "FOLLOWS"
"Maria" "LOVES"

3 rows

5.11. Implementing newsfeeds in a graph

Figure: users Bob, Alice and Joe, connected by FRIEND relationships that carry a status property (CONFIRMED or PENDING). Each user has a STATUS relationship to a status update node, and status updates are chained by NEXT relationships. The status update nodes are bob_s1 (text 'bobs status1', date 1), bob_s2 (text 'bobs status2', date 4), alice_s1 (text 'Alices status1', date 2), alice_s2 (text 'Alices status2', date 5), joe_s1 (text 'Joe status1', date 3) and joe_s2 (text 'Joe status2', date 6).

Implementing a newsfeed or timeline feature is a frequent requirement for social applications. The following examples are inspired by Newsfeed feature powered by Neo4j Graph Database. The query asked here is:

Starting at me, retrieve the time-ordered status feed of the status updates of me and all friends that are connected via a CONFIRMED FRIEND relationship to me.

Query.

MATCH (me { name: 'Joe' })-[rels:FRIEND*0..1]-(myfriend)
WHERE ALL (r IN rels WHERE r.status = 'CONFIRMED')
WITH myfriend
MATCH (myfriend)-[:STATUS]-(latestupdate)-[:NEXT*0..1]-(statusupdates)
RETURN myfriend.name AS name, statusupdates.date AS date, statusupdates.text AS text
ORDER BY statusupdates.date DESC LIMIT 3

To understand the strategy, let's divide the query into five steps:

1. First, get the list of all my friends (along with me) through the FRIEND relationship (MATCH (me {name:'Joe'})-[rels:FRIEND*0..1]-(myfriend)). Also, the WHERE predicate can be added to check whether the friend request is pending or confirmed.
2. Get the latest status update of my friends through the STATUS relationship (MATCH myfriend-[:STATUS]-latestupdate).
3. Get subsequent status updates (along with the latest one) of my friends through NEXT relationships (MATCH (myfriend)-[:STATUS]-(latestupdate)-[:NEXT*0..1]-(statusupdates)), which will give you the latest and one additional status update; adjust 0..1 to whatever suits your case.
4. Sort the status updates by posted date (ORDER BY statusupdates.date DESC).
5. LIMIT the number of updates you need in every query (LIMIT 3).

Result

name     date    text
"Joe"    6       "Joe status2"
"Bob"    4       "bobs status2"
"Joe"    3       "Joe status1"

3 rows
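The five steps can be mimicked with plain data structures. This is a rough Python sketch, not how Neo4j evaluates the query: the status lists follow the graph above, while the assignment of CONFIRMED and PENDING to particular pairs of users is an assumption made for illustration, and `newsfeed` is a hypothetical helper name.

```python
# Each user's statuses in linked-list order: the first entry is the node
# reached via STATUS, later entries are reached by following NEXT.
statuses = {
    "Bob": [(1, "bobs status1"), (4, "bobs status2")],
    "Alice": [(2, "Alices status1"), (5, "Alices status2")],
    "Joe": [(3, "Joe status1"), (6, "Joe status2")],
}

# FRIEND relationships, treated as undirected. Which pair carries which
# status is an assumption for this sketch.
friendships = [
    ("Bob", "Alice", "CONFIRMED"),
    ("Alice", "Joe", "PENDING"),
    ("Joe", "Bob", "CONFIRMED"),
]


def newsfeed(me, next_hops=1, limit=3):
    # Step 1: me plus all friends connected by a CONFIRMED relationship.
    people = {me}
    for a, b, status in friendships:
        if status == "CONFIRMED" and me in (a, b):
            people.add(b if a == me else a)
    # Steps 2 and 3: the STATUS node plus up to `next_hops` NEXT nodes.
    rows = [
        (name, date, text)
        for name in people
        for date, text in statuses[name][: next_hops + 1]
    ]
    # Steps 4 and 5: newest first, then cut to the requested size.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


feed = newsfeed("Joe")
```

Raising `next_hops` corresponds to widening the `0..1` range on the NEXT relationship.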

The next example shows how to add a new status update to the existing data for a user.

Query.

MATCH (me)
WHERE me.name='Bob'
OPTIONAL MATCH (me)-[r:STATUS]-(secondlatestupdate)
DELETE r
CREATE (me)-[:STATUS]->(latest_update { text:'Status',date:123 })
WITH latest_update, collect(secondlatestupdate) AS seconds
FOREACH (x IN seconds | CREATE (latest_update)-[:NEXT]->(x))
RETURN latest_update.text AS new_status

Broken into steps, this query resembles inserting a new item into the middle of a doubly linked list:

1. Get the latest update (if it exists) of the user through the STATUS relationship (OPTIONAL MATCH (me)-[r:STATUS]-(secondlatestupdate)).

2. Delete the STATUS relationship between the user and secondlatestupdate (if it exists), as this now becomes the second latest update. Only the latest update is attached through a STATUS relationship; all earlier updates are connected to their subsequent updates through a NEXT relationship (DELETE r).

3. Now, create the new status update node (with text and date as properties) and connect it to the user through a STATUS relationship (CREATE (me)-[:STATUS]->(latest_update { text:'Status',date:123 })).

4. Pipe the new status update, together with a collection holding the previous update (or an empty collection), over to the next query part (WITH latest_update, collect(secondlatestupdate) AS seconds).

5. Now, create a NEXT relationship between the latest status update and the second latest status update, if it exists (FOREACH (x IN seconds | CREATE (latest_update)-[:NEXT]->(x))).



Result

new_status
"Status"

1 row
Nodes created: 1
Relationships created: 2
Properties set: 2
Relationships deleted: 1
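The pointer manipulation behind these steps can be shown in a few lines of Python. This is a simplified sketch using dictionaries in place of nodes; the field names `status` and `next` and the helper `add_status` are illustrative assumptions, not Neo4j API.

```python
def add_status(user, text, date):
    """Prepend a status update: the new node becomes the STATUS target
    and points to the previous one (if any) via NEXT."""
    previous = user.get("status")          # steps 1 and 2: detach old head
    new_node = {"text": text, "date": date, "next": previous}  # steps 3 to 5
    user["status"] = new_node
    return new_node


# Bob starts with a single existing status update.
bob = {
    "name": "Bob",
    "status": {"text": "bobs status1", "date": 1, "next": None},
}
latest = add_status(bob, "Status", 123)
```

Because `user.get("status")` returns `None` for a user without updates, the same function covers the case that OPTIONAL MATCH and the empty collection handle in the Cypher query.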

Figure: Graph (Node[0] name = 'Bob'; Node[1] name = 'bob_s1', text = 'bobs status1', date = 1; Node[2] name = 'bob_s2', text = 'bobs status2', date = 4; relationships STATUS and NEXT)



5.12. Boosting recommendation results

Figure 5.9. Graph (nodes Clark Kent, Lois Lane, Jimmy Olsen, Daily Planet, Perry White, Anderson Cooper and CNN; KNOWS relationships with weight = 4; a WORKS_AT relationship with weight = 2 and activity = 45)

This query finds recommended friends for the origin: people who work at the same place as the origin, or who know a person that the origin knows. In addition, the origin should not already know the candidate. The recommendation is weighted by the weight of the relationship r2 and, if there is an activity property on that relationship, boosted by twice its value.

Query.

MATCH (origin)-[r1:KNOWS|WORKS_AT]-(c)-[r2:KNOWS|WORKS_AT]-(candidate)
WHERE origin.name = "Clark Kent" AND type(r1)=type(r2) AND NOT (origin)-[:KNOWS]-(candidate)
RETURN origin.name AS origin, candidate.name AS candidate,
  SUM(ROUND(r2.weight +(COALESCE(r2.activity, 0)* 2))) AS boost
ORDER BY boost DESC LIMIT 10

This returns the recommended friends for the origin nodes and their recommendation score.

Result

origin          candidate            boost
"Clark Kent"    "Perry White"        22.0
"Clark Kent"    "Anderson Cooper"    4.0

2 rows
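The scoring expression in the RETURN clause can be isolated as a small function. This Python sketch only mirrors the arithmetic; the relationship dictionaries passed to it are hypothetical values, not data taken from the graph above.

```python
def boost(r2_relationships):
    """Sum round(weight + 2 * activity) over all r2 relationships that
    lead to one candidate; a missing activity counts as 0."""
    total = 0.0
    for rel in r2_relationships:
        activity = rel.get("activity")
        if activity is None:          # plays the role of COALESCE(..., 0)
            activity = 0
        total += round(rel["weight"] + 2 * activity)
    return total


# Hypothetical candidates: one reached over two plain KNOWS paths,
# one reached over a single path that carries an activity value.
score_plain = boost([{"weight": 4}, {"weight": 4}])
score_active = boost([{"weight": 2, "activity": 6}])
```

A candidate reached over several paths accumulates one term per path, which is what SUM does in the query.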



5.13. Calculating the clustering coefficient of a network

Figure 5.10. Graph (a node with name = 'startnode' and surrounding nodes connected by KNOWS relationships)

In this example, adapted from Niko Gamulin's blog post on Neo4j for Social Network Analysis, the graph in question shows the 2-hop relationships of a sample person as nodes with KNOWS relationships.

The clustering coefficient of a selected node is defined as the probability that two randomly selected neighbors are connected to each other. With the number of neighbors as n and the number of mutual connections between the neighbors as r, the calculation is:

clustering coefficient = r / (n!/(2!(n-2)!)) = 2r / (n(n-1))

The number of possible connections among the neighbors is n!/(2!(n-2)!) = 4!/(2!(4-2)!) = 24/4 = 6, where n = 4 is the number of neighbors, and the actual number r of connections is 1. Therefore the clustering coefficient of node 1 is 1/6.

n and r are quite simple to retrieve via the following query:

Query.

MATCH (a { name: "startnode" })--(b)
WITH a, count(DISTINCT b) AS n
MATCH (a)--()-[r]-()--(a)
RETURN n, count(DISTINCT r) AS r

This returns n and r for the above calculations.

Result

n    r
4    1

1 row
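To complete the calculation from n and r, a short Python sketch can help. The adjacency dictionary below is a hypothetical graph with four neighbors and one connection between them, chosen to match the numbers in the text; it is not the exact graph from the figure.

```python
from fractions import Fraction
from itertools import combinations


def clustering_coefficient(adjacency, node):
    """r / (n choose 2): connected neighbor pairs over possible pairs."""
    neighbors = adjacency[node]
    n = len(neighbors)
    if n < 2:
        return Fraction(0)
    r = sum(
        1
        for a, b in combinations(sorted(neighbors), 2)
        if b in adjacency[a]
    )
    return Fraction(2 * r, n * (n - 1))


# Hypothetical undirected graph: startnode has four neighbors and only
# a and b know each other.
adjacency = {
    "startnode": {"a", "b", "c", "d"},
    "a": {"startnode", "b"},
    "b": {"startnode", "a"},
    "c": {"startnode"},
    "d": {"startnode"},
}
coefficient = clustering_coefficient(adjacency, "startnode")
```

With n = 4 and r = 1 this evaluates to 2/12, which `Fraction` reduces to 1/6.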



5.14. Pretty graphs

This section shows how to create some of the named pretty graphs on Wikipedia.

5.14.1. Star graph

The graph is created by first creating a center node and then, once per element in the range, creating a leaf node and connecting it to the center.

Query.

CREATE (center)
FOREACH (x IN range(1,6)| CREATE (leaf),(center)-[:X]->(leaf))
RETURN id(center) AS id;

The query returns the id of the center node.

Result

id
0

1 row
Nodes created: 7
Relationships created: 6

Figure 5.11. Graph (star graph: six X relationships)
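As a cross-check of the counts reported for the star graph, the construction can be written as a plain edge list. This is a minimal Python sketch; the integer node ids and the function name `star_graph` are arbitrary choices for illustration.

```python
def star_graph(leaves=6):
    """Return (center, edges) with one edge from the center to each leaf."""
    center = 0
    edges = [(center, leaf) for leaf in range(1, leaves + 1)]
    return center, edges


center, edges = star_graph()
nodes = {center} | {leaf for _, leaf in edges}
```

For six leaves this gives seven nodes and six edges, in line with the counters in the query result.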

5.14.2. Wheel graph

This graph is created in a number of steps:

• Create a center node.
• Once per element in the range, create a leaf and connect it to the center.
• Connect neighboring leafs.
• Find the minimum and maximum leaf and connect these.
• Return the id of the center node.

Query.

CREATE (center)
FOREACH (x IN range(1,6)| CREATE (leaf { count:x }),(center)-[:X]->(leaf))
WITH center
MATCH (large_leaf)<--(center)-->(small_leaf)
WHERE large_leaf.count = small_leaf.count + 1
CREATE (small_leaf)-[:X]->(large_leaf)
WITH center, min(small_leaf.count) AS min, max(large_leaf.count) AS max
MATCH (first_leaf)<--(center)-->(last_leaf)
WHERE first_leaf.count = min AND last_leaf.count = max
CREATE (last_leaf)-[:X]->(first_leaf)
RETURN id(center) AS id

The query returns the id of the center node.

Result

id
0

1 row
Nodes created: 7
Relationships created: 12
Properties set: 6

Figure 5.12. Graph (wheel graph: leaf nodes with count = 1 to count = 6, twelve X relationships)
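The wheel construction can be sketched the same way, which makes the relationship count easy to verify. This Python fragment is illustrative only; node ids and the name `wheel_graph` are assumptions.

```python
def wheel_graph(leaves=6):
    """Return (center, edges): spokes from the center plus a closed rim."""
    center = 0
    spokes = [(center, leaf) for leaf in range(1, leaves + 1)]
    # Connect neighboring leaves: 1-2, 2-3, ...
    rim = [(leaf, leaf + 1) for leaf in range(1, leaves)]
    # Close the rim by connecting the last leaf to the first one.
    rim.append((leaves, 1))
    return center, spokes + rim


center, edges = wheel_graph()
```

Six spokes plus six rim edges give the twelve relationships reported by the query.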

5.14.3. Complete graph

To create this graph, we first create 6 nodes and label them with the Leaf label. We then match all the unique pairs of nodes and create a relationship between them.

Query.

FOREACH (x IN range(1,6)| CREATE (leaf:Leaf { count : x }))
WITH *
MATCH (leaf1:Leaf),(leaf2:Leaf)
WHERE id(leaf1)< id(leaf2)
CREATE (leaf1)-[:X]->(leaf2);



Nothing is returned by this query.

Result

(empty result)

Nodes created: 6
Relationships created: 15
Properties set: 6
Labels added: 6

Figure 5.13. Graph (complete graph: six Leaf nodes with count = 1 to count = 6, connected pairwise by X relationships)
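The condition id(leaf1) < id(leaf2) selects each unordered pair exactly once. In Python the same idea is what `itertools.combinations` provides; the sketch below uses integer ids as a stand-in for the node ids and is meant only as an illustration.

```python
from itertools import combinations


def complete_graph(n=6):
    """Return (nodes, edges) with one edge per unordered pair of nodes."""
    nodes = list(range(1, n + 1))
    # combinations yields (a, b) with a before b, so each pair appears once.
    edges = list(combinations(nodes, 2))
    return nodes, edges


nodes, edges = complete_graph()
```

For six nodes this yields 6 * 5 / 2 = 15 edges, the number of relationships the query reports.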

5.14.4. Friendship graph

This query first creates a center node and then, once per element in the range, creates a cycle graph and connects it to the center.

Query.

CREATE (center)
FOREACH (x IN range(1,3)| CREATE (leaf1),(leaf2),(center)-[:X]->(leaf1),(center)-[:X]->(leaf2),
  (leaf1)-[:X]->(leaf2))
RETURN ID(center) AS id

The id of the center node is returned by the query.

Result

id
0

1 row
Nodes created: 7
Relationships created: 9



Figure 5.14. Graph (friendship graph: nine X relationships)
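The friendship graph amounts to a number of triangles sharing one node. A minimal Python sketch of the construction, with arbitrary integer ids and a hypothetical function name, could look like this:

```python
def friendship_graph(triangles=3):
    """Return (center, edges): `triangles` triangles sharing the center."""
    center = 0
    edges = []
    for i in range(triangles):
        leaf1, leaf2 = 2 * i + 1, 2 * i + 2
        edges += [(center, leaf1), (center, leaf2), (leaf1, leaf2)]
    return center, edges


center, edges = friendship_graph()
nodes = {node for edge in edges for node in edge}
```

Three triangles give seven nodes and nine edges, matching the counters in the query result.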



5.15. A multilevel indexing structure (path tree)

In this example, a multi-level tree structure is used to index event nodes (here Event1, Event2 and Event3), in this case with a YEAR-MONTH-DAY granularity, making this a timeline indexing structure. However, this approach should work for a wide range of multi-level ranges.

The structure follows a couple of rules:

• Events can be indexed multiple times by connecting the leaves of the indexing structure to the events via a VALUE relationship.
• Querying is done in a path-range fashion. That is, the start path and the end path from the index root to the start and end leaves in the tree are calculated.
• Using Cypher, queries following different strategies can be expressed as path sections and put together in one single query.

The graph below depicts a structure with three events attached to the index structure at different leaves.

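Figure 5.15. Graph (index tree: Root; Year 2010 and Year 2011, reached by relationships 2010 and 2011; Month 12 and Month 01, reached by relationships 12 and 01; Day 31, Day 01, Day 02 and Day 03, reached by relationships 31, 01, 02 and 03; day leaves chained by NEXT relationships; Event1, Event2 and Event3 attached to leaves by VALUE relationships)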

5.15.1. Return zero range

Here, only the events indexed under one leaf (2010-12-31) are returned. The query only needs one path segment, rootPath (color Green), through the index.



Figure 5.16. Graph (the same index tree as in Figure 5.15)

Query.

MATCH rootPath=(root)-[:`2010`]->()-[:`12`]->()-[:`31`]->(leaf),
  (leaf)-[:VALUE]->(event)
WHERE root.name = 'Root'
RETURN event.name
ORDER BY event.name ASC

This returns all events on the date 2010-12-31, in this case Event1 and Event2.

Result

event.name
"Event1"
"Event2"

2 rows

5.15.2. Return the full range

In this case, the range goes from the first to the last leaf of the index tree. Here, startPath (color Greenyellow) and endPath (color Green) span the range, valuePath (color Blue) then connects the leaves, and the values are read from the middle node via the values path (color Red).



Figure 5.17. Graph (the same index tree as in Figure 5.15)

Query.

MATCH startPath=(root)-[:`2010`]->()-[:`12`]->()-[:`31`]->(startLeaf),
  endPath=(root)-[:`2011`]->()-[:`01`]->()-[:`03`]->(endLeaf),
  valuePath=(startLeaf)-[:NEXT*0..]->(middle)-[:NEXT*0..]->(endLeaf),
  vals=(middle)-[:VALUE]->(event)
WHERE root.name = 'Root'
RETURN event.name
ORDER BY event.name ASC

This returns all events between 2010-12-31 and 2011-01-03, in this case all events.

Result

event.name
"Event1"
"Event2"
"Event2"
"Event3"

4 rows



5.15.3. Return partly shared path ranges

Here, the query range results in partly shared paths when querying the index, making the introduction of a common path segment, commonPath (color Black), necessary before spanning startPath (color Greenyellow) and endPath (color Darkgreen). After that, valuePath (color Blue) connects the leaves, and the indexed values are returned from the values path (color Red).

Figure 5.18. Graph (the same index tree as in Figure 5.15)

Query.

MATCH commonPath=(root)-[:`2011`]->()-[:`01`]->(commonRootEnd),
  startPath=(commonRootEnd)-[:`01`]->(startLeaf),
  endPath=(commonRootEnd)-[:`03`]->(endLeaf),
  valuePath=(startLeaf)-[:NEXT*0..]->(middle)-[:NEXT*0..]->(endLeaf),
  vals=(middle)-[:VALUE]->(event)
WHERE root.name = 'Root'
RETURN event.name
ORDER BY event.name ASC

This returns all events between 2011-01-01 and 2011-01-03, in this case Event2 and Event3.

Result

event.name
"Event2"
"Event3"

2 rows
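All three range queries follow the same pattern: locate the start and end leaves, walk the NEXT chain between them, and collect whatever hangs off each leaf via VALUE. The Python sketch below imitates that pattern with a list and a dictionary. The placement of Event1 and Event2 on 2010-12-31 follows the first query result; the exact days for the other two VALUE relationships are assumptions chosen to be consistent with the results shown.

```python
# Day leaves in NEXT order.
leaves = [(2010, 12, 31), (2011, 1, 1), (2011, 1, 2), (2011, 1, 3)]

# VALUE relationships from leaves to events (partly assumed, see above).
values = {
    (2010, 12, 31): ["Event1", "Event2"],
    (2011, 1, 1): ["Event2"],
    (2011, 1, 3): ["Event3"],
}


def events_in_range(start, end):
    """Collect events on all leaves from `start` to `end`, inclusive."""
    i, j = leaves.index(start), leaves.index(end)
    found = []
    for leaf in leaves[i : j + 1]:      # every possible `middle` leaf
        found.extend(values.get(leaf, []))
    return sorted(found)                # ORDER BY event.name ASC


zero_range = events_in_range((2010, 12, 31), (2010, 12, 31))
full_range = events_in_range((2010, 12, 31), (2011, 1, 3))
partial_range = events_in_range((2011, 1, 1), (2011, 1, 3))
```

Event2 shows up twice in the full range because it is attached to two different leaves, just as in the Cypher result.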



5.16. Complex similarity computations

5.16.1. Calculate similarities by complex calculations

Here, a similarity between two players in a game is calculated by the number of times they have eaten the same food.

Query.MATCH (me { name: 'me' })-[r1:ATE]->(food)(food)



5.17. The Graphity activity stream model

5.17.1. Find Activity Streams in a network without scaling penalty

This is an approach for scaling the retrieval of activity streams in a friend graph, put forward by Rene Pickhardt as Graphity. In short, a linked list is created for every person's friends, in the order in which the last activities of these friends have occurred. When new activities occur for a friend, all the ordered friend lists that this friend is part of are reordered, transferring computing load to the time of new event updates instead of activity stream reads.

Tip
This approach makes heavy use of relationship types. This needs to be taken into consideration when designing a production system with this approach. See Section 16.5, "Capacity" for the maximum number of relationship types.

To find the activity stream for a person, just follow the linked list of the friend list, and retrieve the needed amount of activities from the respective activity list of the friends.

Query.

MATCH p=(me { name: 'Jane' })-[:jane_knows*]->(friend),(friend)-[:has]->(status)
RETURN me.name, friend.name, status.name, length(p)
ORDER BY length(p)

This returns the activity stream for Jane.

Result

me.name    friend.name    status.name    length(p)
"Jane"     "Bill"         "Bill_s1"      1
"Jane"     "Joe"          "Joe_s1"       2
"Jane"     "Bob"          "Bob_s1"       3

3 rows
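The trade-off described above, cheap reads paid for by reordering on write, can be illustrated with ordinary lists. This Python sketch is not the Graphity implementation; the friend order and status names are taken from the result table, while the function names and the status name used in the usage note are hypothetical.

```python
# Per-person friend list, ordered by most recent activity first
# (the order in which the jane_knows chain is followed).
friend_list = {"Jane": ["Bill", "Joe", "Bob"]}

# The status node each friend currently points to via `has`.
latest_status = {"Bill": "Bill_s1", "Joe": "Joe_s1", "Bob": "Bob_s1"}


def activity_stream(person):
    """Read path: walk the prepared list, no sorting needed."""
    return [
        (person, friend, latest_status[friend], hops)
        for hops, friend in enumerate(friend_list[person], start=1)
        if friend in latest_status
    ]


def record_activity(friend, status):
    """Write path: move the friend to the front of every list it is in."""
    latest_status[friend] = status
    for friends in friend_list.values():
        if friend in friends:
            friends.remove(friend)
            friends.insert(0, friend)


stream = activity_stream("Jane")
```

Calling `record_activity("Bob", "Bob_s2")` would move Bob to the front of Jane's list, so the next read returns his new status first without any sorting at read time.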



Figure 5.20. Graph (person nodes Jane, Bill, Joe, Bob and Ted; status nodes Bill_s1, Bill_s2, Joe_s1, Joe_s2, Bob_s1, Ted_s1 and Ted_s2; relationship types jane_knows, bob_knows, has and next)



5.18. User roles in graphs

This is an example showing a hierarchy of roles. What's interesting is that a tree is not sufficient for storing this kind of structure, as elaborated below.

This is an implementation of an example found in the article "A Model to Represent Directed Acyclic Graphs (DAG) on SQL Databases" by Kemal Erdogan. The article discusses how to store directed acyclic graphs (DAGs) in SQL-based databases.

DAGs are almost trees, but with a twist: it may be possible to reach the same node through different paths. Trees are restricted from this possibility, which makes them much easier to handle. In our case it is Ali and Engin, as they are both admins and users and thus reachable through these group nodes. Reality often looks this way and can't be captured by tree structures.

In the article, an SQL stored procedure solution is provided. The main idea, which also has some support from scientists, is to pre-calculate all possible (transitive) paths. Pros and cons of this approach:

• decent performance on read
• low performance on insert
• wastes lots of space
• relies on stored procedures

In Neo4j, storing the roles is trivial. In this case we use PART_OF (green edges) relationships to model the group hierarchy and MEMBER_OF (blue edges) to model membership in groups. We also connect the top-level groups to the reference node by ROOT relationships. This gives us a useful partitioning of the graph. Neo4j has no predefined relationship types; you are free to create any relationship types and give them the semantics you want.

Let's now have a look at how to retrieve information from the graph. The queries are done using Cypher; the Java code uses the Neo4j Traversal API (see Section 34.2, "Traversal Framework Java API", which is part of Part VIII, "Advanced Usage").
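To make the contrast with pre-calculated paths concrete, the following Python sketch resolves transitive membership by traversal at read time. The group hierarchy and memberships below are hypothetical sample data for illustration (only the names come from this section), and the helper names are made up.

```python
# Hypothetical data. PART_OF: group -> parent groups.
part_of = {
    "ABCTechnicians": ["Technicians"],
    "Technicians": ["Users"],
}
# MEMBER_OF: user -> groups the user is a direct member of.
member_of = {
    "Ali": ["Admins", "Users"],
    "Engin": ["Admins", "Users"],
    "Jale": ["ABCTechnicians"],
}


def groups_of(user):
    """All groups a user belongs to, directly or through PART_OF."""
    seen = set()
    stack = list(member_of.get(user, []))
    while stack:
        group = stack.pop()
        if group not in seen:       # a DAG may reach a group twice
            seen.add(group)
            stack.extend(part_of.get(group, []))
    return seen


def members_of(group):
    """All users who are direct or indirect members of `group`."""
    return sorted(user for user in member_of if group in groups_of(user))


jale_groups = groups_of("Jale")
all_users = members_of("Users")
```

The `seen` set is what copes with the DAG property: a group reachable through several paths is visited once, and nothing has to be stored in advance.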



5.18.1. Get the admins

In Cypher, we could get the admins like this:

MATCH ({ name: 'Admins' })(group)
RETURN group.name

group.name
"ABCTechnicians"
"Technicians"
"Users"

3 rows

Using the Neo4j Java Traversal API, this query looks like:

Node jale = getNodeByName( "Jale" );



traversalDescription = db.t
