20150624 Belgian GraphDB meetup at Ordina
Uploaded by rik-van-bruggen
Transcript of 20150624 Belgian GraphDB meetup at Ordina
Welcome to the GraphDB meetup
24th of June 2015 @rvanbruggen
Big THANK YOU!
Agenda
• Intro to Graphs
• Graph Prototyping & Visualisation
• KNMI Case study
• … whatever else we want to cover!!!
Above all!
Don’t be a lonely document! CONNECT!
Intro to Graph Databases in a NOSQL world
24th of June 2015 @rvanbruggen
Agenda
• About Graphs
• About Graph Databases
  – About Neo4j
• Graph Querying
  – Short demonstration
• Case Studies
• Q&A
Introduction: about Graphs
Meet Leonhard Euler (again?)
• Swiss mathematician
• Inventor of Graph Theory (1736)
Königsberg (Prussia) - 1736
[Figure: the seven bridges of Königsberg, shown as a map and as a graph with land masses A, B, C, D and bridges 1 to 7]
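Euler's argument about the bridges can be checked in a few lines. The Python sketch below assumes the usual textbook layout of the seven bridges; which land mass carries which letter on the slide is an assumption, and the function names are illustrative only.

```python
from collections import Counter

# Seven bridges between four land masses. Which letter belongs to which
# land mass is an assumption; the slide figure is not in the transcript.
BRIDGES = [
    ("A", "B"), ("A", "B"),  # two bridges between A and B
    ("A", "C"), ("A", "C"),  # two bridges between A and C
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]

def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    count = Counter()
    for u, v in edges:
        count[u] += 1
        count[v] += 1
    return dict(count)

def has_euler_walk(edges):
    """For a connected multigraph, a walk that uses every edge exactly
    once exists if and only if zero or two nodes have odd degree.
    Connectivity is assumed here, not checked."""
    odd = [node for node, d in degrees(edges).items() if d % 2 == 1]
    return len(odd) in (0, 2)

print(degrees(BRIDGES))
print(has_euler_walk(BRIDGES))
```

All four land masses end up with an odd number of bridges, so no walk crosses every bridge exactly once.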
About Graph Databases
Complementing Relational Databases
VOLUME COMPLEXITY
NOT ONLY SQL
The Relational Crossroads
Courtesy of @apcj http://www.apcjones.com/
So what is a graph database?
• OLTP database
  – “end-user” transactions
• Model, store, manage data as a graph
What is a graph?
Vertex
Edge
What is a graph?
Node
Relationship
Contrast with Relational
Graphs are often referred to as “Whiteboard Friendly”. The data model reflects the way a domain expert would naturally draw their data on a whiteboard.
“The schema is the data”. Schema flexibility allows the system to change in response to a changing environment.
Neo4j is a Graph Database
• JVM based
• ACID transactions
• Rich Java APIs
• Query language
• Using the Labeled Property Graph model
Cypher: THE graph query language
• Learning from RDBMS’ evolution
  – Introduction of SQL!
• Key characteristics
  – Declarative: tell it what you want, not how to get it
  – Expressive: optimize for reading
  – Pattern matching: easy on your brain!
  – Idempotent: state change expressed idempotently
Labeled Property Graph Model
[Figure: example graph of Author, Book and Reader nodes]
Labeled Property Graph Summary
• Nodes
  – Containers for properties
  – Grouped together in subgraphs by “Labels”
• Properties
  – Key-value pairs
  – Primitive and array values
• Relationships
  – Name
  – Direction
  – May also contain properties
• Relationships (ctd.)
  – Must have a start node and an end node (no dangling relationships)
  – Start node and end node can be the same (e.g. “self” relationships)
  – Nodes can be connected by more than one relationship
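The rules in this summary can be sketched as two small Python classes. This is an illustration of the model only, not Neo4j's API; the class names, relationship names (WROTE, READ, RECOMMENDED, KNOWS) and property values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A container for properties, grouped by zero or more labels."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)  # key-value pairs

@dataclass
class Relationship:
    """A named, directed link that may carry its own properties."""
    name: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)

    def __post_init__(self):
        # No dangling relationships: both ends are mandatory.
        if self.start is None or self.end is None:
            raise ValueError("a relationship needs a start node and an end node")

# Hypothetical example data using the labels from the model slide.
author = Node({"Author"}, {"name": "Example Author"})
book = Node({"Book"}, {"title": "Example Book", "tags": ["graphs", "nosql"]})
reader = Node({"Reader"}, {"name": "Example Reader"})

rels = [
    Relationship("WROTE", author, book),
    Relationship("READ", reader, book, {"rating": 5}),  # relationship property
    Relationship("RECOMMENDED", reader, book),          # second link, same pair
    Relationship("KNOWS", reader, reader),              # "self" relationship
]
```

Direction is expressed by the order of `start` and `end`, and the two reader-to-book relationships show that one pair of nodes can be connected more than once.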
What are graphs good for?
Complexity
Data Complexity
complexity = f(size, semi-structure, connectedness)
The Real Complexity
Semi-Structure
Email: [email protected]
Email: [email protected]
Twitter: @rvanbruggen
Skype: rvanbruggen
USER&
CONTACT&
CONTACT_TYPE&
FIRST_NAME | LAST_NAME   | USER_ID | EMAIL_1           | EMAIL_2           | TWITTER      | FACEBOOK | SKYPE
Rik        | Van Bruggen | 315     | [email protected] | [email protected] | @rvanbruggen | NULL     | rvanbruggen
complexity = f(size, semi-structure, connectedness)
The Real Complexity
Examples of Connectedness
When Should I Use Graph Databases??
• Densely-connected, semi-structured domains
  – Lots of join tables? Connectedness
  – Lots of sparse tables? Semi-structure
• Data Model Volatility
  – Easy to evolve
• “Graphy” query patterns
  – Deep join complexity and performance
  – Pathfinding operations
  – Millions of “joins” per second
  – Consistent query times as dataset grows
Graph Querying
Querying a Graph
• “Graph local” vs “Graph global”
  – Contextualized “ego-centric” queries
• “Parachute” into graph
  – Start node(s)
  – Found through Index lookups
• Crawl the surrounding graph
  – 2 million+ joins per second
  – No more Index lookups: Index-free adjacency
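The "parachute, then crawl" idea can be sketched in Python. The sketch assumes a toy in-memory graph: one dictionary stands in for the index used to find the start node, and per-node neighbour lists stand in for index-free adjacency. It is not how Neo4j stores data, and all names and links are made up.

```python
from collections import deque

# Hypothetical toy data: each node lists its direct neighbours, so
# following a relationship never needs a global index lookup.
NEIGHBOURS = {
    "rik": ["ann", "bob"],
    "ann": ["rik", "cat"],
    "bob": ["rik"],
    "cat": ["ann", "dan"],
    "dan": ["cat"],
}

# The only index: property value -> start node.
NAME_INDEX = {"Rik": "rik", "Ann": "ann"}

def crawl(start, max_hops):
    """Breadth-first crawl around the start node.

    Returns a dict mapping each reached node to its hop distance."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the requested radius
        for neighbour in NEIGHBOURS[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops

start = NAME_INDEX["Rik"]   # "parachute": one index lookup
print(crawl(start, 2))      # "crawl": local traversal only
```

The cost of the crawl depends on the size of the neighbourhood visited, not on the total number of nodes, which is the point behind "graph local" queries.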
Queries: Pattern Matching
Pattern
Short demo
Case Studies
What’s next?
www.neo4j.com/graphacademy
www.meetup.com/graphdb-london
www.neo4j.com
[email protected] or +32 478 686800 or @rvanbruggen
Q&A, Conclusion, Next Steps
Monitoring using ‘Big Data’ tooling
Prisma Team: Hotze de Jong, Femke Haga, Roy van den Berg, Jeroen Bos, Rijk Oosterhoff, Ronny de Wit, Henk Jan Klein Ikkink, Tom Elsten, Annick van der Hoest, Ian van de Neut, Frank Duin, Michal Koutek, Wim Som de Cerff, Hans Verhoef, [email protected]
KNMI
Royal Netherlands Meteorological Institute
Ministry of Infrastructure and the Environment
The national knowledge institute for:
• Weather
• Climate
• Seismology
Both research and operational 24/7
KNMI PRISMA – Neo4J Meetup - june 2015 – [email protected] 2
What is the problem?
What needs to be monitored?
SLA for data product delivery with our customers
→ Timely delivery
→ Timely warning
Example production chain
Pre-process
ECMWF Model
HIRLAM
Observations
Variable extractor
Collect
Interpolate/ transform
Distribute
Customer
So what is new?
• Integrated monitoring of the whole production chain, i.e. not of the individual subsystems
• Provides a dynamic ‘blueprint’ of the KNMI production environment
PRISMA goals
PRocessflow Infrastructure Surveillance and Monitoring Application
At each moment in time, answer:
• What is the status of our production environment?
• At each failure, provide:
  – What is the impact?
  – What is the root cause?
Approach
• URD, for vision and stakeholder engagement
• ITIL as terminology baseline
• Procurement of Event Management System (EMS)
• SCRUM for software development
PRISMA Architecture
Reference Model
Real World: Applications & Infrastructure
Configuration loader
Event Management
• Events → failures
• Events → timing info
GUI
• Dashboard
• Root cause
• Impact
• Alerting
Information systems
• Config. → model
• Logging → events
Why use ‘Big data’ tooling?
Big Data definition vs. our problem:
• Volume
  – 5 Gbyte/day stored
  – 100K monitored objects, 100K*x relations between the monitored objects
• Variety
  – Heterogeneous log data from many systems
• Velocity
  – 5 Gbyte/day and growing
  – Need real-time analysis for root cause and impact
• Variability
  – Inconsistency of log messages and configuration information
• Veracity
  – Quality of log messages varies greatly
• Complexity
  – Complex queries needed for root cause and impact analysis
Event Management System (EMS)
Event management system must deal with:
• Event analysis and correlation
• Different log formats
We choose: Splunk
• Pros
  – Can deal with all log data formats
  – Can read log information from a database (e.g. Zabbix)
  – Event analysis is excellent: no extra software development needed
  – “Lots” of standard Apps available (~550 Apps)
  – Support: SMT
• Cons
  – Learning curve
Why a Graph database?
• Production chains are in the ‘reference model’, which is a graph
• Root cause and impact are graph-type queries
• Display graph of KNMI production chains
• Performance needs
We choose: Neo4J
• Pros
  – Performance
  – Ease of use
  – Support: GoDataDriven
• Cons
  – Own query language called ‘Cypher’
  – License model + costs
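Why root cause and impact are graph-type queries can be shown with a small Python sketch. It is not PRISMA's implementation. The node names come from the example production chain slide, but the way they are wired together here is an assumption, as are the function names.

```python
# Assumed wiring of the example production chain (upstream -> downstream).
# The slide lists these steps; the exact links between them are a guess.
FEEDS = {
    "ECMWF Model": ["Pre-process"],
    "HIRLAM": ["Pre-process"],
    "Observations": ["Pre-process"],
    "Pre-process": ["Variable extractor"],
    "Variable extractor": ["Collect"],
    "Collect": ["Interpolate/transform"],
    "Interpolate/transform": ["Distribute"],
    "Distribute": ["Customer"],
    "Customer": [],
}

def reachable(graph, start):
    """All nodes that can be reached from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def reverse(graph):
    """Same graph with every edge flipped (downstream -> upstream)."""
    flipped = {node: [] for node in graph}
    for src, targets in graph.items():
        for dst in targets:
            flipped.setdefault(dst, []).append(src)
    return flipped

def impact(failed_node):
    """Impact: everything downstream of the failed node."""
    return reachable(FEEDS, failed_node)

def root_causes(failing):
    """Root causes: failing nodes with no failing node upstream of them."""
    upstream = reverse(FEEDS)
    return {n for n in failing if not (reachable(upstream, n) & failing)}

print(sorted(impact("Collect")))
print(root_causes({"HIRLAM", "Pre-process", "Collect"}))
```

Both questions reduce to a traversal from a known node, downstream for impact and upstream for root cause, which is the access pattern a graph database is built for.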
PRISMA Architecture
Reference Model
Real World: Applications & Infrastructure
GUI
Event Management
Configuration loader
• Event Management: Splunk
• Database: Neo4J
• Persistence: Spring Data
• Visualization: Graphviz
• MVC framework: Wicket
• Java/Tomcat
PRISMA: development status
• First main applications in model (+20k MO with relations)
• Failures via Splunk events mapping between domains
• Next: root cause, impact analysis
• July 2015 Milestone: first operational use
• Q3 2015: Incident management
• Q4 2015: Consolidation
Operator view
Modeller view
Summary
SPLUNK and Neo4J are key elements for PRISMA
• SPLUNK enables event analysis capabilities
• Neo4J enables the root cause/impact analysis + graph display
• Other possibilities:
  – Dynamic blueprint of infrastructure
  – Interactive analysis using Splunk
Summer 2015: in operational use
Innovative product: invited presentations at:
• SPLUNK user day, March 25, Utrecht
• European Geosciences Union, April 13, Vienna
• GraphConnect, May 6, London
Questions?
Ian van der Neut, Frank Duin, Michal Koutek, Wim Som de Cerff, Hans Verhoef, Henk Jan Klein Ikkink, Tom Elsten, Rijk Oosterhoff, Hotze de Jong, Femke Haga, Annick van der Hoest, Roy van den Berg, Jeroen Bos, Ronny de Wit
Agenda
• Solution Architecture
• Accessing Neo4j
• Graph visualization
• D3.js
• Required client side components
• Typical Application Architecture
• A simple example
• A valid application example
Typical solution architectures
[Diagram: Application and Application Client connected over REST to Neo4j Server; Application and Application Client with Neo4j Embedded]
How to access Neo4J
• Cypher
• REST API (traversals, graph algos)
• Java API (server extensions)
• Dragons! (don’t go there)
Graph Visualization
• More than one library: D3.js and sigma.js
• Now focus on D3.js (also used in Neo4j browser)
D3.js
• Built-in graph (nodes and links) layout: “force”
• Generates SVG structures
• Input for a force layout is a simple nodes and links structure:
  {"nodes": [{}, {}, {}, {}],
   "links": [{"source": 0, "target": 1, …},
             {"source": 0, "target": 2, …},
             {"source": 0, "target": 3, …}]}
• Easy to add event listeners via “force.on(…”
• Lots of examples on the internet; good starting points:
  – http://bl.ocks.org/sathomas
  – http://www.coppelia.io/2014/07/an-a-to-z-of-extra-features-for-the-d3-force-layout/
Required Client Side Components
• Components
  – D3.js force layout
  – Neo4j REST API: transactional endpoint
• Process: Query Neo4j → Convert response for D3.js force → build force layout
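The "convert response" step of this process can be sketched as follows. The sketch is in Python for brevity, although on the slides this step runs in the browser. It assumes the response shape that the transactional endpoint returns when the request sets `resultDataContents` to `["graph"]`; check that shape against your Neo4j version. `SAMPLE` is hand-written illustration data, and `to_d3` is a hypothetical helper name.

```python
def to_d3(response):
    """Flatten every row's graph into one {"nodes": [...], "links": [...]} dict."""
    rows = [row.get("graph", {})
            for result in response.get("results", [])
            for row in result.get("data", [])]

    nodes, index_of = [], {}
    for graph in rows:
        for n in graph.get("nodes", []):
            if n["id"] not in index_of:  # the same node can appear in many rows
                index_of[n["id"]] = len(nodes)
                nodes.append({"id": n["id"],
                              "labels": n.get("labels", []),
                              "properties": n.get("properties", {})})

    links, seen = [], set()
    for graph in rows:
        for r in graph.get("relationships", []):
            if r["id"] in seen:
                continue
            seen.add(r["id"])
            # The force layout wants array positions, not database ids.
            if r["startNode"] in index_of and r["endNode"] in index_of:
                links.append({"source": index_of[r["startNode"]],
                              "target": index_of[r["endNode"]],
                              "type": r["type"]})
    return {"nodes": nodes, "links": links}

# Hand-written sample in the assumed response shape (not real server output).
BOOK = {"id": "11", "labels": ["Book"], "properties": {"title": "Example Book"}}
SAMPLE = {
    "results": [{
        "columns": ["path"],
        "data": [
            {"graph": {
                "nodes": [{"id": "10", "labels": ["Author"], "properties": {}}, BOOK],
                "relationships": [{"id": "7", "type": "WROTE", "startNode": "10",
                                   "endNode": "11", "properties": {}}]}},
            {"graph": {
                "nodes": [BOOK, {"id": "12", "labels": ["Reader"], "properties": {}}],
                "relationships": [{"id": "8", "type": "READ", "startNode": "12",
                                   "endNode": "11", "properties": {}}]}},
        ],
    }],
    "errors": [],
}

d3_input = to_d3(SAMPLE)
print(d3_input["links"])
```

The output uses array indices for `source` and `target`, matching the nodes and links structure shown on the D3.js slide.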
Typical application architecture
[Diagram: Web Client and Application Server, connected over REST to Neo4j Server]
A Simple example
• This is not secure: a web page accessing the Neo4j DB API directly
• One page example
• Also uses Font Awesome, for displaying icons in the nodes
A valid application example
• ZK RIA framework (there are a lot of them, but this is the one I know)
• Server components ↔ client side widgets
• Neo4j access on the server side, the D3.js graph on the client
• Propagating client side events to the server side
Questions… Discussion