Grails goes Graph
SpringSource 2GX 2009
Grails Goes Graph
Stefan Armbruster, presales engineer, Twitter: @darthvader42
2012 SpringOne 2GX. All rights reserved. Do not distribute without permission.
about @self
This talk: Grails goes Graph
Intro into Graph (Databases)
Intro into Neo4j
Grails Neo4j plugin
Live demo
case study
trend 1: data growth
source: Digital Universe Study 2011 by IDC
Eric Schmidt: every two days we create as much information as we did up to 2003
trend 2: data connectedness
Information connectivity: Text Documents, Hypertext, Feeds, Blogs, Wikis, UGC, Tagging, Folksonomies, RDF, Ontologies, GGG
trend 3: semi-structured information
Individualisation of content
1970s salary lists: all elements exactly one job
2000s salary lists: we need many job columns!
All-encompassing "entire world views"
Store more data about each entity
Trend accelerated by the decentralization of content generation
Age of participation (web 2.0)
trend 4: architecture
1980s: mainframe
1990s: DB as integration platform
2000s: decoupling of services
2010: SOA
trend 4: scale for performance
Salary list, most Web apps, Social Network, Location-based services
data changes over time: 4 trends
amount of data grows (big data)
data gets more connected
less structure: semi-structured
architecture: massive horizontal scalability
NoSQL what does that mean?
NO to SQL ?
not only SQL!
simplistic cartography of NoSQL
side note: aggregate-oriented databases
"There is a significant downside - the whole approach works really well when data access is aligned with the aggregates, but what if you want to look at the data in a different way? ...Order entry naturally stores orders as aggregates, but analyzing product sales cuts across the aggregate structure. This is why aggregate-oriented stores talk so much about map-reduce"
Martin Fowler on http://martinfowler.com/bliki/AggregateOrientedDatabase.html
graphs are everywhere
graphs everywhere
Relationships in Politics, Economics, History, Science, Transportation
Biology, Chemistry, Physics, Sociology: Body, Ecosphere, Reaction, Interactions
Internet: Hardware, Software, Interaction
Social Networks: Family, Friends, Work, Communities, Neighbours, Cities, Society
relationships
the world is rich, messy and related data
relationships are at least as important as the things they connect
Graphs = Whole > parts
complex interactions
always changing, change of structures as well
Graph: Relationships are part of the data
RDBMS: Relationships part of the fixed schema
questions & answers
Complex Questions
Answers lie between the lines (things)
Locality of the information
Global searches / operations very expensive
constant query time, regardless of data volume
categories
Categories == Classes, Trees ?
What if more than one category fits?
Tags
Categories via relationships like IS_A
any number, easy change
virtual Relationships - Traversals
Category dynamically derived from queries
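The idea of categories as IS_A relationships can be sketched in a few lines. This is plain Python, not the Neo4j API; the names `is_a`, `add_is_a` and `categories_of` are made up for illustration. A thing may have any number of categories, and the full category set is derived by a traversal instead of being stored.

```python
from collections import defaultdict

# Hypothetical in-memory store: for each node, the targets of its IS_A edges.
is_a = defaultdict(set)

def add_is_a(child, parent):
    """Record a directed IS_A relationship child -> parent."""
    is_a[child].add(parent)

def categories_of(node):
    """Derive all categories of a node by following IS_A edges transitively."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        # .get() avoids creating empty entries for leaf categories
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# More than one category fits: just add another relationship.
add_is_a("tomato", "fruit")
add_is_a("tomato", "vegetable")
add_is_a("fruit", "food")
add_is_a("vegetable", "food")
```

Adding or removing a category is a single edge change; no schema or class hierarchy has to be touched.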
everyone is talking about graphs
Facebook Open Graph
Neo4j
example of a property graph
querying the graph: your choice
Simple way: navigate relationship paths by core API
More powerful: simple traversers with callbacks for where to end the traversal and what should be in the result set
Even more powerful: Traversal API, a fluent interface for specifying traversals
Shell: mimics unix filesystem commands (ls, cd, ...)
Gremlin: graph traversal language
Cypher: "the SQL for Neo4j", declarative, designed for humans (devs + domain experts)
deprecated
to be deprecated
Cypher examples
START user=node(5,4,1,2,3)
MATCH user-[:friend]->follower
WHERE follower.name =~ /S.*/
RETURN user, follower.name

START john=node:node_auto_index(name = 'John')
MATCH john-[:friend]->()-[:friend]->fof
RETURN john, fof
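What the friend-of-friend query does can be spelled out by hand. The following is a rough plain-Python sketch, not Neo4j code: the tiny property graph, its node ids and names, and the helpers `outgoing` and `friends_of_friends` are all invented for illustration, and Cypher's own matching rules (e.g. relationship uniqueness) are not modelled.

```python
# Hypothetical property graph: nodes carry properties, relationships carry a type.
nodes = {
    1: {"name": "John"},
    2: {"name": "Sara"},
    3: {"name": "Joe"},
    4: {"name": "Maria"},
    5: {"name": "Steve"},
}
relationships = [
    (1, "friend", 2),
    (1, "friend", 3),
    (2, "friend", 4),
    (3, "friend", 5),
]

def outgoing(node_id, rel_type):
    """End nodes of all outgoing relationships of the given type."""
    return [end for start, kind, end in relationships
            if start == node_id and kind == rel_type]

def friends_of_friends(name):
    """Look up start nodes by name, then follow two outgoing friend hops."""
    starts = [nid for nid, props in nodes.items() if props["name"] == name]
    result = []
    for start in starts:
        for friend in outgoing(start, "friend"):
            for fof in outgoing(friend, "friend"):
                result.append(nodes[fof]["name"])
    return result
```

The declarative query states the pattern once; the imperative version has to encode lookup, both hops and result collection explicitly.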
query performance
a sample social graph with ~1,000 persons
average 50 friends per person
pathExists(a,b) limited to depth 4
caches warmed up to eliminate disk I/O
database        # persons    query time
relational DB   1,000        2,000 ms
Neo4j           1,000        2 ms
Neo4j           1,000,000    2 ms
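A depth-limited path check like pathExists(a,b) can be sketched as a breadth-first search. This is plain Python over an adjacency dict, not the benchmark code or a Neo4j call; `path_exists` and the `friends` mapping are assumed names. The work done depends on the neighbourhood reachable within the depth limit, not on the total number of persons stored.

```python
from collections import deque

def path_exists(friends, a, b, max_depth=4):
    """Return True if b is reachable from a in at most max_depth hops.

    friends maps each person to an iterable of directly connected persons.
    """
    if a == b:
        return True
    visited = {a}
    frontier = deque([(a, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in friends.get(person, ()):
            if friend == b:
                return True
            if friend not in visited:
                visited.add(friend)
                frontier.append((friend, depth + 1))
    return False
```

With about 50 friends per person the frontier grows quickly with depth, which is why the depth limit matters in both the relational and the graph case.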
deployment options
Embedded in JVM: just drop a couple of jars into your application
Use EmbeddedGraphDatabase
Very fast: no marshalling/unmarshalling, no network overhead
Neo4j as Server: exposes a rich REST interface
granular API means many requests, consider network overhead
use batching or Cypher if possible
Add custom modules to the server (plugins/unmanaged extensions)
Both embedded and server can be run as HA! One master, multiple slaves
Zookeeper for managing the cluster, about to change in upcoming versions
Neo4j HA architecture
Licensing Neo4j
3 editions available:
Community: GPL
Advanced: Community + enhanced Monitoring + enhanced Webadmin; AGPL or Commercial
Enterprise: Advanced + HA + online backup + GCR-Cache; AGPL or Commercial
Neo4j - Overview
(figure: overview graph of Neo4j; relationship types RUNS_AS, HIGH_AVAIL., SCALES_TO, RUNS_ON, PROVIDES, LICENSED_LIKE, INTEGRATES, TRAVERSALS; node labels include Sharding and Master/Slave)
graphconnect.com, Nov 6-7
GORM
Grails Object Relational Mapping (GORM) aka grails-data-mapping
Lib: https://github.com/SpringSource/grails-data-mapping
manages meta-model of domain classes
Common data persistence abstraction layer
Methods for domain classes (CRUD + finders + X)
Extensible
Access to low level API of the implementation
TCK for implementations, 200+ test cases
Existing implementations:
Simple (in-memory, hashmap based, for unit testing)
Hibernate, JPA
MongoDB, SimpleDB, Dynamo, Redis, (Riak), Neo4j
some key abstractions in g-d-m
MappingContext: holds meta information about mapping domain classes to the underlying datastore, does type conversion, holds list of EntityPersisters
Datastore: creates sessions, manages the connection to low-level storage
Session: similar to a Hibernate Session
EntityPersister: does the dirty work, interacts with the low-level datastore
Query: knows how to query the datastore by criteria (criterion, projections, ...)
GORM has a price tag ;-)
Grails Neo4j Integration
Resources:
Lib: https://github.com/SpringSource/grails-data-mapping
Plugin: http://www.grails.org/plugin/neo4j
Plugin docs: http://springsource.github.com/grails-data-mapping/neo4j/manual/index.html
goal: use Neo4j as persistence layer for a standard Grails domain model
Mapping Grails domain model to the nodespace
(figure: domain class, association, domain class instance and domain instance property mapped onto reference node, subreference, instance and properties)
2 challenges involved
Locking of domain nodes in HA mode
Category nodes become super nodes, causing a potential bottleneck on traversals
Solutions:
add intermediate category nodes
use indexing instead
(figure: reference node, domain node, instance nodes)
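One way to read "add intermediate category nodes" is to fan instances out over a fixed number of intermediate nodes, so that no single node collects every relationship. The sketch below is plain Python and only an assumption about the general technique; it is not how the plugin implements it. `Node`, `build_category`, `attach_instance` and the hash-based bucket choice are invented for illustration.

```python
import zlib

class Node:
    """Minimal stand-in for a graph node with outgoing relationships."""
    def __init__(self, label):
        self.label = label
        self.outgoing = []

def build_category(name, fan_out):
    """Create a category node with fan_out intermediate nodes below it."""
    root = Node(name)
    root.outgoing = [Node(f"{name}#{i}") for i in range(fan_out)]
    return root

def attach_instance(category, instance_id):
    """Attach an instance below one intermediate node, chosen by a stable hash."""
    key = str(instance_id).encode("utf-8")
    bucket = zlib.crc32(key) % len(category.outgoing)
    intermediate = category.outgoing[bucket]
    intermediate.outgoing.append(Node(f"instance:{instance_id}"))
    return intermediate

def instances_of(category):
    """Traverse category -> intermediate nodes -> instances."""
    return [inst.label for mid in category.outgoing for inst in mid.outgoing]
```

Writers that create instances now touch different intermediate nodes most of the time, and the category node itself keeps a small, fixed number of relationships.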
currently working in the neo4j plugin (1/2)
passing >98% of GORM TCK (hurray!)
accessing embedded, REST and HA datasources, and ImpermanentGraphDatabase for testing
property type conversion
support of schemaless properties
access to native API: instance.getNode(), bean: graphDatabaseService
GORM enhancements: .traverseStatic, .cypherStatic, .traverse, .cypher
currently working in the neo4j plugin (2/2)
prevention of locking exceptions by using intermediate category nodes
Declarative indexing: apply the static mapping closure just the standard way
convenience methods on Neo4j's nodes and relationships: node. =
JSON marshalling for Neo4j's Node and Relationships
embed Neo4j's webadmin into grails application
praying to the demo god...
looking into the crystal ball
get rid of subreferences in favour of indexing
migrate plugin to use Cypher only instead of core-API
option for mapping domain classes as a relationship: think of roads between cities having a distance property
fix open issues: http://bit.ly/KEmVX2
maybe use Spring Data Neo4j internally
and more
case study
back in 2010 a website to collect and aggregate opinions of soccer fans went live
votes can be based on almost everything: players, teams, matches, events in matches
hard to model with classic RDBMS
Neo4j to the rescue, used in embedded mode
as always: hard and very tight schedule, built up technical debt due to lack of automated tests
Neo4j HA scales very well for reads
case study: lessons learned
massive amount of very small write transactions in HA mode caused trouble, e.g. locking exceptions upon user registration
aggregate multiple write transactions using a JMS queue
serious issues with full GCs: since app AND Neo4j reside in the same JVM, full GCs happen
if the stop-the-world pause is too long: master switch
have a loadbalancer with 2 setups (planned):
write-driven requests go to the master node
read-driven requests go to slave nodes
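The lesson about aggregating many small write transactions can be sketched as a queue that is drained in batches. This is a plain-Python stand-in, not the JMS setup from the case study: `BatchingWriter`, `submit`, `flush` and the `apply_batch` callback are hypothetical names, and `apply_batch` is assumed to perform all writes of one batch inside a single transaction.

```python
import queue

class BatchingWriter:
    """Collects small writes and applies them in batches."""

    def __init__(self, apply_batch, batch_size=100):
        self._pending = queue.Queue()
        self._apply_batch = apply_batch  # assumed: one transaction per call
        self._batch_size = batch_size

    def submit(self, write):
        """Enqueue a single write instead of opening a transaction for it."""
        self._pending.put(write)

    def flush(self):
        """Drain up to batch_size writes, apply them together, return the count."""
        batch = []
        while len(batch) < self._batch_size:
            try:
                batch.append(self._pending.get_nowait())
            except queue.Empty:
                break
        if batch:
            self._apply_batch(batch)
        return len(batch)
```

Calling `flush` from a single consumer serialises the writes, which trades a little latency for fewer transactions and fewer lock conflicts.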
References
general overview of nosql: http://www.nosql-databases.org/
Neo4j itself: http://www.neo4j.org
http://api.neo4j.org
http://doc.neo4j.org
neo4j grails plugin:
source: https://github.com/SpringSource/grails-data-mapping
docs: http://springsource.github.com/grails-data-mapping/neo4j/
issues: http://jira.grails.org/browse/GPNEO4J
demo app: https://github.com/sarmbruster/neo4jsample
Java REST driver: https://github.com/jexp/neo4j-java-rest-binding
my blog: http://blog.armbruster-it.de
twitter: @darthvader42