Data Modeling with Neo4j - GOTO Conference Data Modeling with Neo4j 1 Stefan Armbruster, Neo...
A graph database...
NO: not for charts & diagrams, or vector artwork
YES: for storing data that is structured as a graph
remember linked lists, trees?
graphs are the general-purpose data structure
“A relational database may tell you the average age of everyone in this place,
but a graph database will tell you who is most likely to buy you a beer.”
We're talking about a Property Graph
Properties (each a key+value)
+ Indexes (for easy look-ups)
“There is a significant downside - the whole approach works really well when data access is
aligned with the aggregates, but what if you want to look at the data in a different way? Order entry
naturally stores orders as aggregates, but analyzing product sales cuts across the aggregate
structure. The advantage of not using an aggregate structure in the database is that it
allows you to slice and dice your data different ways for different audiences.
This is why aggregate-oriented stores talk so much about map-reduce.”
Martin Fowler
Aggregate Oriented Model
The connected data model is based on fine-grained elements that are richly connected; the emphasis is on extracting many dimensions and
attributes as elements. Connections are cheap and can be used not only
for the domain-level relationships but also for additional structures that allow efficient access for
different use-cases. The fine-grained model requires an external scope for mutating operations that ensures Atomicity, Consistency, Isolation and
Durability (ACID), also known as transactions.
Michael Hunger
Connected Data Model
Why Data Modeling
๏ What is modeling?
๏ Aren't we schema free?
๏ How does it work in a graph?
๏ Where should modeling happen: in the DB or in the application?
Whiteboard --> Data
(diagram: nodes Andreas, Peter, Emil and Allison, connected by "knows" relationships)
// Cypher query - friend of a friend
START n=node(0)
MATCH (n)--()--(foaf)
RETURN foaf
You traverse the graph
// lookup starting point in an index
START n=node:People(name = 'Andreas')
// then traverse to find results
START me=node:People(name = 'Andreas')
MATCH (me)-[:FRIEND]-(friend)-[:FRIEND]-(friend2)
RETURN friend2
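The two steps above (look up a starting node, then follow relationships) can be sketched in plain Python. This is only an in-memory illustration, not the Neo4j API: the `friends` dictionary stands in for both the index and the graph, and the FRIEND edges between the four people are assumed for the example.

```python
from collections import defaultdict

# Hypothetical in-memory graph: person name -> set of direct friends.
# The dictionary key lookup plays the role of the People index.
friends = defaultdict(set)


def add_friend(a, b):
    """Store an undirected FRIEND relationship between a and b."""
    friends[a].add(b)
    friends[b].add(a)


# Assumed edges, chosen only for illustration.
add_friend("Andreas", "Peter")
add_friend("Andreas", "Emil")
add_friend("Peter", "Allison")
add_friend("Emil", "Allison")


def friends_of_friends(start):
    """Follow FRIEND twice from the start node.

    The start node itself is skipped, because reaching it again would
    mean walking back over the relationship that was just traversed.
    """
    result = set()
    for friend in friends[start]:
        for friend2 in friends[friend]:
            if friend2 != start:
                result.add(friend2)
    return result


foaf = friends_of_friends("Andreas")
```

Unlike the Cypher query, which returns one row per matching path, this sketch collects the results in a set, so a person reachable via two different friends appears only once.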
SELECT skills.*, user_skill.*
FROM users
JOIN user_skill ON users.id = user_skill.user_id
JOIN skills ON user_skill.skill_id = skills.id
WHERE users.id = 1
START user = node(1)
MATCH user -[user_skill]-> skill
RETURN skill, user_skill
Need to model the relationship
Language: language_code, language_name, word_count
Country: country_code, country_name, flag_uri, language_code
What if the cardinality changes?
Language: language_code, language_name, word_count, country_code
Country: country_code, country_name, flag_uri
Or we go many-to-many?
Language: language_code, language_name, word_count
Country: country_code, country_name, flag_uri
LanguageCountry: language_code, country_code
Or we want to qualify the relationship?
Language: language_code, language_name, word_count
Country: country_code, country_name, flag_uri
LanguageCountry: language_code, country_code, primary
What’s different?
Language: language_code, language_name, word_count
Country: country_code, country_name, flag_uri
LanguageCountry: language_code, country_code, primary
Relationship: IS_SPOKEN_IN
What's different?
๏ Implementation of maintaining relationships is left up to the database
๏ Artificial keys disappear or are unnecessary
๏ Relationships get an explicit name
• can be navigated in both directions
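A minimal sketch of what "named relationships that can be navigated in both directions" means, using a toy in-memory structure. The `Graph` class and its method names are hypothetical and are not the Neo4j API; the direction Language -[IS_SPOKEN_IN]-> Country is an assumption.

```python
from collections import defaultdict


class Graph:
    """Toy property-graph sketch: each relationship is stored once,
    with a name, and is reachable from either end."""

    def __init__(self):
        self.outgoing = defaultdict(list)  # node -> [(rel_type, end_node)]
        self.incoming = defaultdict(list)  # node -> [(rel_type, start_node)]

    def relate(self, start, rel_type, end):
        """Create one named relationship; no join table or key is needed."""
        self.outgoing[start].append((rel_type, end))
        self.incoming[end].append((rel_type, start))

    def out(self, node, rel_type):
        """Follow relationships of rel_type in their stored direction."""
        return [other for t, other in self.outgoing[node] if t == rel_type]

    def into(self, node, rel_type):
        """Follow relationships of rel_type against their stored direction."""
        return [other for t, other in self.incoming[node] if t == rel_type]


g = Graph()
g.relate("German", "IS_SPOKEN_IN", "Germany")
g.relate("German", "IS_SPOKEN_IN", "Austria")
g.relate("French", "IS_SPOKEN_IN", "France")

countries = g.out("German", "IS_SPOKEN_IN")    # language -> countries
languages = g.into("Austria", "IS_SPOKEN_IN")  # country -> languages
```

The same stored relationship answers both questions, which is what the LanguageCountry join table had to be queried twice for in the relational version.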
Bidirectional relationships
Language: name, word_count
Country: name, flag_uri
Relationships: IS_SPOKEN_IN, PRIMARY_LANGUAGE
Weighted relationships
Language: name, word_count
Country: name, flag_uri
Relationship: POPULATION_SPEAKS (population_fraction)
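A weighted relationship is simply a relationship that carries its own properties. The sketch below is a hypothetical in-memory model, not the Neo4j API; the country and language names and the `population_fraction` values are made-up placeholders, and the direction Country -[POPULATION_SPEAKS]-> Language is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    """A named relationship with its own key/value properties."""
    start: str
    rel_type: str
    end: str
    properties: dict = field(default_factory=dict)


# Placeholder data: the fractions are invented for the example.
rels = [
    Relationship("CountryA", "POPULATION_SPEAKS", "LanguageX",
                 {"population_fraction": 0.6}),
    Relationship("CountryA", "POPULATION_SPEAKS", "LanguageY",
                 {"population_fraction": 0.3}),
    Relationship("CountryB", "POPULATION_SPEAKS", "LanguageY",
                 {"population_fraction": 0.9}),
]


def most_spoken(country):
    """Return the language with the highest population_fraction."""
    candidates = [
        r for r in rels
        if r.start == country and r.rel_type == "POPULATION_SPEAKS"
    ]
    best = max(candidates, key=lambda r: r.properties["population_fraction"])
    return best.end


top_language = most_spoken("CountryA")
```

The weight lives on the relationship itself, so neither the Country nor the Language node needs an extra property or a join record to express it.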
Keep on adding relationships
Language: name, word_count
Country: name, flag_uri
Relationships: POPULATION_SPEAKS (population_fraction), SIMILAR_TO, ADJACENT_TO
Anti-Pattern: Node represents multiple concepts
Country: name, flag_uri, language_name, number_of_words, yes_in_language, no_in_language, currency_code, currency_name
Relationship: USES_CURRENCY
Split up in separate concepts
Country: name, flag_uri, currency_code, currency_name
Country: name, number_of_words, yes, no
Currency: currency_code, currency_name
Relationship: SPEAKS
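The split can be sketched as a small transformation from one overloaded node into three nodes plus two relationships. This is an illustration in plain Python dictionaries, not Neo4j code; the `Language` label for the node that holds the language properties is an assumption, and `number_of_words` and `flag_uri` are left out to avoid inventing values.

```python
# One node that mixes three concepts (the anti-pattern).
fat_country = {
    "name": "Germany",
    "language_name": "German",
    "yes_in_language": "ja",
    "no_in_language": "nein",
    "currency_code": "EUR",
    "currency_name": "Euro",
}


def split_concepts(node):
    """Split an overloaded country node into separate concept nodes.

    Returns (nodes, relationships); relationships are
    (start, type, end) triples keyed by node name or currency code.
    """
    country = {"label": "Country", "name": node["name"]}
    language = {
        "label": "Language",  # assumed label for the language concept
        "name": node["language_name"],
        "yes": node["yes_in_language"],
        "no": node["no_in_language"],
    }
    currency = {
        "label": "Currency",
        "currency_code": node["currency_code"],
        "currency_name": node["currency_name"],
    }
    relationships = [
        (country["name"], "SPEAKS", language["name"]),
        (country["name"], "USES_CURRENCY", currency["currency_code"]),
    ]
    return [country, language, currency], relationships


nodes, relationships = split_concepts(fat_country)
```

Once split, a second country using the same currency or language can point at the existing node instead of repeating its properties.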
Challenge: Property or Relationship?
๏ Can every property be replaced by a relationship?
๏ Should all entities with the same property values be connected?
Object Mapping
๏ Similar to how you would map objects to a relational database, using an ORM such as Hibernate
๏ Generally simpler and easier to reason about
๏ Examples
• Java: Spring Data Graph
• Ruby: Active Model
๏ Why Map?
• Do you use mapping because you are scared of SQL?
• Following DDD, could you write your repositories directly against the graph API?
Relationships for querying
๏ like in other databases
• same structure for different use-cases (OLTP and OLAP) doesn't work
• graph allows: add more structures
๏ Relationships should be the primary means to access nodes in the database
๏ Traversing relationships is cheap: that's the whole design goal of a graph database
๏ Use lookups only to find starting nodes for a query
Data Modeling examples in Manual
Anti-pattern: unconnected graph
(diagram: many isolated nodes, each with name: “Jones”, and no relationships between them)
Evolution: Relationship to Node
SENT_EMAIL
EMAIL_FROM
EMAIL_TO
_CC
TAGGED
. . .
see Hyperedges
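A sketch of the evolution step: a single SENT_EMAIL relationship can connect only two nodes, so it is promoted to an email node that can carry several relationships. The function name, the `email-1` identifier, the people, and the relationship type `EMAIL_CC` (the slide shows only `_CC`) are assumptions made for this example.

```python
def promote_to_node(sent_email, email_id, cc=(), tags=()):
    """Turn one (sender, SENT_EMAIL, recipient) relationship into an
    email node with its own relationships.

    Returns (start, type, end) triples that all start at the new node.
    """
    sender, rel_type, recipient = sent_email
    if rel_type != "SENT_EMAIL":
        raise ValueError("expected a SENT_EMAIL relationship")
    triples = [
        (email_id, "EMAIL_FROM", sender),
        (email_id, "EMAIL_TO", recipient),
    ]
    # Extra participants and tags were impossible to attach to the
    # original two-ended relationship; on a node they are just more edges.
    triples += [(email_id, "EMAIL_CC", person) for person in cc]
    triples += [(email_id, "TAGGED", tag) for tag in tags]
    return triples


email_graph = promote_to_node(
    ("alice", "SENT_EMAIL", "bob"),
    "email-1",
    cc=["carol"],
    tags=["urgent"],
)
```

The original relationship is not lost: sender and recipient are still two hops apart, now via the email node.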
Combine multiple Domains in a Graph
๏ you start with a single domain
๏ add more connected domains as your system evolves
๏ more domains allow you to ask different queries
๏ one domain "indexes" the other
๏ Example: Facebook Graph Search
• social graph
• location graph
• activity graph
• favorite graph
• ...
Notes on the Graph Data Model
๏ Schema free, but constraints
๏ Model your graph with a whiteboard and a wise man
๏ Nodes as main entities but useless without connections
๏ Relationships are first-class citizens in the model and database
๏ Normalize more than in a relational database
๏ use meaningful relationship-types, not generic ones like IS_
๏ use in-graph structures to allow different access paths
๏ evolve your graph to your needs, incremental growth
How to get started?
๏ Documentation
• neo4j.org
‣ http://www.neo4j.org/learn/nosql
• docs.neo4j.org - tutorials+reference
‣ Data Modeling Examples
• http://console.neo4j.org
• Neo4j in Action
• Good Relationships
๏ Worldwide one-day Neo4j Trainings
๏ Get Neo4j
• http://neo4j.org/download
• http://addons.heroku.com/neo4j/
๏ Participate
• http://groups.google.com/group/neo4j
• http://neo4j.meetup.com
• a session like this one ;)
Really, once you start thinking in graphs it's hard to stop
Recommendations
MDM
Systems Management
Geospatial
Social computing
Business intelligence
Biotechnology
Making Sense of all that data
your brain
access control
linguistics
catalogs
genealogy
routing
compensation
market vectors
What will you build?