20150624 Belgian GraphDB meetup at Ordina

Post on 06-Aug-2015


Transcript of 20150624 Belgian GraphDB meetup at Ordina

Welcome to the GraphDB meetup

24th of June 2015 @rvanbruggen

Big THANK YOU!

Agenda

•  Intro to Graphs

•  Graph Prototyping & Visualisation

•  KNMI Case study

•  … whatever else we want to cover!!!

Above all!

Don’t be a lonely document! CONNECT!

Intro to Graph Databases in a NOSQL world

24th of June 2015 @rvanbruggen

Agenda

•  About Graphs

•  About Graph Databases
   –  About Neo4j

•  Graph Querying
   –  Short demonstration

•  Case Studies

•  Q&A

Introduction: about Graphs

Meet Leonhard Euler (again?)

•  Swiss mathematician

•  Inventor of Graph Theory (1736)

Königsberg (Prussia) - 1736

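[Figure: the bridges of Königsberg redrawn as a graph, with the land masses A, B, C and D as nodes and the bridges numbered 1 to 7 as edges]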

About Graph Databases

Complementing

Relational Databases

VOLUME    COMPLEXITY

NOT ONLY SQL

The Relational Crossroads

Courtesy of @apcj http://www.apcjones.com/

So what is a graph database?

•  OLTP database
   •  "end-user" transactions

•  Model, store, manage data as a graph

What is a graph?

Vertex

Edge

What is a graph?

Node

Relationship

Contrast with Relational

Graphs are often referred to as “Whiteboard Friendly”. The data model reflects the way a domain expert would naturally draw their data on a whiteboard.

“The schema is the data”. Schema flexibility allows the system to change in response to a changing environment.

Neo4j is a Graph Database

•  JVM based

•  ACID transactions

•  Rich Java APIs

•  Query language

•  Using the Labeled Property Graph model

Cypher: THE graph query language

•  Learning from RDBMS’ evolution
   •  Introduction of SQL!

•  Key characteristics
   •  Declarative: tell it what you want, not how to get it
   •  Expressive: optimized for reading
   •  Pattern matching: easy on your brain!
   •  Idempotent: state change expressed idempotently

Labeled Property Graph Model

[Figure: example labeled property graph with nodes labeled Author, Book and Reader]

Labeled Property Graph Summary

•  Nodes
   •  Containers for properties
   •  Grouped together in subgraphs by “Labels”

•  Properties
   •  Key-value pairs
   •  Primitive and array values

•  Relationships
   •  Name
   •  Direction
   •  May also contain properties

•  Relationships (ctd.)
   •  Must have a start node and an end node (no dangling relationships)
   •  Start node and end node can be the same (e.g. "self" relationships)
   •  Nodes can be connected by more than one relationship
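The rules in this summary can be written down as a small data structure. The following is only a minimal sketch in Python, not how Neo4j stores data internally; the class names, the sample nodes and the relationship names (`WROTE`, `READ`, `REVIEWED`, `KNOWS`) are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A container for properties, grouped by zero or more labels."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed link that may carry its own properties."""
    start: Node   # mandatory: no dangling relationships
    end: Node     # may be the same object as start ("self" relationship)
    name: str
    properties: dict = field(default_factory=dict)

    def __post_init__(self):
        if not isinstance(self.start, Node) or not isinstance(self.end, Node):
            raise ValueError("a relationship needs a start node and an end node")


# Hypothetical sample data; names and property values are made up.
author = Node({"Author"}, {"name": "Alice"})
book = Node({"Book"}, {"title": "Graphs", "tags": ["nosql", "graphs"]})  # array value
reader = Node({"Reader"}, {"name": "Bob"})

relationships = [
    Relationship(author, book, "WROTE", {"year": 2015}),
    Relationship(reader, book, "READ"),
    Relationship(reader, book, "REVIEWED", {"stars": 4}),  # second link, same node pair
    Relationship(author, author, "KNOWS"),                  # self relationship
]

between_reader_and_book = [r.name for r in relationships
                           if r.start is reader and r.end is book]
print(between_reader_and_book)  # ['READ', 'REVIEWED']
```

Direction is carried by the `start`/`end` pair, and the constructor check is one simple way to rule out dangling relationships.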

What are graphs good for?

Complexity

Data Complexity

complexity = f(size, semi-structure, connectedness)

The Real Complexity

Semi-Structure

Email: rik@neotechnology.com
Email: rik@vanbruggen.be
Twitter: @rvanbruggen
Skype: rvanbruggen

USER

CONTACT

CONTACT_TYPE

FIRST_NAME | LAST_NAME   | USER_ID | EMAIL_1               | EMAIL_2           | TWITTER      | FACEBOOK | SKYPE
Rik        | Van Bruggen | 315     | rik@neotechnology.com | rik@vanbruggen.be | @rvanbruggen | NULL     | rvanbruggen
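To make the semi-structure point concrete, here is a rough Python sketch that turns the wide, sparse row above into a user node plus one contact node per value that actually exists. The row values are from the slide; the column-to-type mapping, the `User`/`Contact` labels and the `HAS_CONTACT` relationship name are assumptions for this example only.

```python
# The wide relational row from the slide: one column per possible contact type.
user_row = {
    "USER_ID": 315, "FIRST_NAME": "Rik", "LAST_NAME": "Van Bruggen",
    "EMAIL_1": "rik@neotechnology.com", "EMAIL_2": "rik@vanbruggen.be",
    "TWITTER": "@rvanbruggen", "FACEBOOK": None, "SKYPE": "rvanbruggen",
}

# Assumed mapping from column name to contact type (not from the slides).
CONTACT_COLUMNS = {"EMAIL_1": "Email", "EMAIL_2": "Email", "TWITTER": "Twitter",
                   "FACEBOOK": "Facebook", "SKYPE": "Skype"}


def row_to_graph(row):
    """Turn one sparse row into a user node plus one contact node per real value."""
    user = {"label": "User", "id": row["USER_ID"],
            "name": f'{row["FIRST_NAME"]} {row["LAST_NAME"]}'}
    contacts = [{"label": "Contact", "type": CONTACT_COLUMNS[col], "value": row[col]}
                for col in CONTACT_COLUMNS if row[col] is not None]
    # HAS_CONTACT is a hypothetical relationship name.
    edges = [(user["id"], "HAS_CONTACT", contact["value"]) for contact in contacts]
    return user, contacts, edges


user, contacts, edges = row_to_graph(user_row)
print(len(contacts))  # 4
```

The NULL Facebook column simply produces no node, and a third e-mail address would be one more node rather than a new EMAIL_3 column.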

complexity = f(size, semi-structure, connectedness)

The Real Complexity

Examples of Connectedness

When Should I Use Graph Databases?

•  Densely-connected, semi-structured domains
   •  Lots of join tables? Connectedness
   •  Lots of sparse tables? Semi-structure

•  Data Model Volatility
   •  Easy to evolve

•  “Graphy” query patterns
   •  Deep join complexity and performance
   •  Pathfinding operations
   •  Millions of "joins" per second
   •  Consistent query times as dataset grows

Graph Querying

Querying a Graph

•  “Graph local” vs “Graph global”
   •  Contextualized "ego-centric" queries

•  "Parachute" into graph
   •  Start node(s)
   •  Found through Index lookups

•  Crawl the surrounding graph
   •  2 million+ joins per second
   •  No more Index lookups: Index-free adjacency
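The "parachute, then crawl" idea can be sketched in a few lines of Python. This is an illustration under assumptions, not Neo4j's implementation: the data set is made up, and the `adjacency` dictionary merely stands in for the direct node-to-node references that index-free adjacency provides.

```python
from collections import deque

# Hypothetical data set: node id -> properties, plus per-node adjacency lists.
nodes = {
    1: {"name": "Alice"}, 2: {"name": "Bob"}, 3: {"name": "Carol"},
    4: {"name": "Dave"}, 5: {"name": "Erin"},
}
adjacency = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}

# The index is used once, to find the start node ("parachute" into the graph).
name_index = {props["name"]: node_id for node_id, props in nodes.items()}


def neighbourhood(start_name, max_depth):
    """Breadth-first crawl around one start node, up to max_depth hops."""
    start = name_index[start_name]             # the only index lookup
    depth = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue
        for neighbour in adjacency[current]:   # hop to neighbours, no index involved
            if neighbour not in depth:
                depth[neighbour] = depth[current] + 1
                queue.append(neighbour)
    return {nodes[n]["name"]: d for n, d in depth.items() if n != start}


print(neighbourhood("Alice", 2))  # {'Bob': 1, 'Carol': 1, 'Dave': 2}
```

The work done depends on the size of the neighbourhood that is visited, not on the total number of nodes, which is the intuition behind consistent query times as the data set grows.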

Queries: Pattern Matching

Pattern

Short demo

Case Studies

What’s next?

www.neo4j.com/graphacademy
www.meetup.com/graphdb-london

www.neo4j.com
rik@neotechnology.com or +32 478 686800 or @rvanbruggen

Q&A, Conclusion, Next Steps

Monitoring using ‘Big Data’ tooling

Prisma Team: Hotze de Jong, Femke Haga, Roy van den Berg, Jeroen Bos, Rijk Oosterhoff, Ronny de Wit, Henk Jan Klein Ikkink, Tom Elsten, Annick van der Hoest, Ian van de Neut, Frank Duin, Michal Koutek, Wim Som de Cerff, Hans Verhoef (hans.verhoef@knmi.nl)

KNMI

Royal Netherlands Meteorological Institute

Ministry of Infrastructure and the Environment

The national knowledge institute for:

• Weather

• Climate

• Seismology

Both research and operational 24/7

KNMI PRISMA – Neo4J Meetup - june 2015 – hans.verhoef@knmi.nl 2

What is the problem?


What needs to be monitored?

SLA for data product delivery with our customers

→ Timely delivery

→ Timely warning

Example production chain

[Figure: production chain with the elements ECMWF Model, HIRLAM, Observations, Pre-process, Variable extractor, Collect, Interpolate/transform, Distribute and Customer]

So what is new?

• Integrated monitoring of the production chain, i.e. not at the level of the individual subsystems

• Provides a dynamic ‘blueprint’ of the KNMI production environment

PRISMA goals

PRocessflow Infrastructure Surveillance and Monitoring Application

At each moment in time, answer:

• What is the status of our production environment?

• For each failure:
   • What is the impact?
   • What is the root cause?

Approach

• URD, for vision and stakeholder engagement

• ITIL as terminology baseline

• Procurement of Event Management System (EMS)

• SCRUM for software development

PRISMA Architecture

[Figure: architecture diagram with the components Reference Model, Real World (Applications & Infrastructure), GUI, Event Management and Configuration loader]

Event Management
• Events → failures
• Events → timing info

GUI
• Dashboard
• Root cause
• Impact
• Alerting

Information systems
• Config. → model
• Logging → events

Why use ‘Big data’ tooling?

Big Data definition vs. our problem:

• Volume:
   • 5 Gbyte/day stored
   • 100K Monitored Objects, 100K*x relations between the monitored objects

• Variety:
   • Heterogeneous log data from many systems

• Velocity:
   • 5 Gbyte/day and growing
   • Need real-time analysis for root cause and impact

• Variability:
   • Inconsistency of log messages and configuration information

• Veracity:
   • Quality of log messages varies greatly

• Complexity:
   • Complex queries needed for root cause and impact analysis

Event Management System (EMS)

Event management system must deal with:

• Event analysis and correlation
• Different log formats

We chose: Splunk

• Pros:
   • Can deal with all log data formats
   • Can read log information from a database (e.g. Zabbix)
   • Event analysis is excellent: no extra software development needed
   • “Lots” of standard Apps available (~550 Apps)
   • Support: SMT

• Cons:
   • Learning curve

Why a Graph database?

• Production chains are captured in the ‘reference model’, which is a graph
• Root cause and impact are graph-type queries
• Display graph of KNMI production chains
• Performance needs

We chose: Neo4J

• Pros:
   • Performance
   • Ease of use
   • Support: GoDataDriven

• Cons:
   • Own query language, called Cypher
   • License model + costs
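Why root cause and impact are "graph-type" queries can be shown with a small Python sketch. It is not the PRISMA implementation (which, per the slides, runs on Neo4J): the node names are taken from the example production chain slide, but the wiring between them is a guess made for this sketch, and all function names are hypothetical.

```python
# Edges mean "feeds into". Node names come from the example production chain
# slide; the wiring between them is a guess made for this sketch.
FEEDS = {
    "ECMWF Model": ["Pre-process"],
    "Observations": ["HIRLAM"],
    "Pre-process": ["HIRLAM"],
    "HIRLAM": ["Variable extractor"],
    "Variable extractor": ["Collect"],
    "Collect": ["Interpolate/transform"],
    "Interpolate/transform": ["Distribute"],
    "Distribute": ["Customer"],
    "Customer": [],
}


def reachable(graph, start):
    """All nodes that can be reached from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reverse(graph):
    """Flip every edge so the chain can be walked upstream."""
    flipped = {node: [] for node in graph}
    for src, targets in graph.items():
        for dst in targets:
            flipped[dst].append(src)
    return flipped


def impact(failed_node):
    """Impact: everything downstream of a failing node."""
    return reachable(FEEDS, failed_node)


def root_causes(failed_nodes):
    """Root causes: failing nodes with no other failing node upstream."""
    upstream = reverse(FEEDS)
    failed = set(failed_nodes)
    return {n for n in failed if not (reachable(upstream, n) & failed)}


print(sorted(impact("HIRLAM")))
print(root_causes({"HIRLAM", "Collect", "Distribute"}))  # {'HIRLAM'}
```

Both questions are reachability questions, one following the edges downstream and one upstream, which is the kind of traversal a graph database is built to answer.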

PRISMA Architecture

[Figure: architecture diagram (Reference Model, Real World: Applications & Infrastructure, GUI, Event Management, Configuration loader), annotated with the technology used]

• Event Management: Splunk
• Database: Neo4J
• Persistence: Spring Data
• Visualization: Graphviz
• MVC framework: Wicket
• Java/Tomcat

PRISMA: development status

• First main applications in model (+20k monitored objects with relations)
• Failures via Splunk events mapping between domains
• Next: root cause, impact analysis
• July 2015 milestone: first operational use
• Q3 2015: Incident management
• Q4 2015: Consolidation

Operator view

Modeller view

Summary

SPLUNK and Neo4J are key elements for PRISMA

• SPLUNK enables event analysis capabilities
• Neo4J enables the root cause/impact analysis + graph display
• Other possibilities:
   • Dynamic blueprint of infrastructure
   • Interactive analysis using Splunk

In operational use from summer 2015

Innovative product: invited presentations at:

• SPLUNK user day, March 25, Utrecht
• European Geosciences Union, April 13, Vienna
• GraphConnect, May 6, London

Questions?

Ian van der Neut, Frank Duin, Michal Koutek, Wim Som de Cerff, Hans Verhoef, Henk Jan Klein Ikkink, Tom Elsten, Rijk Oosterhoff, Hotze de Jong, Femke Haga, Annick van der Hoest, Roy van den Berg, Jeroen Bos, Ronny de Wit

Graph Visualization
how to embed it in your own application

Kees Vegter

kees.vegter@neotechnology.com

Agenda

•  Solution Architecture

•  Accessing Neo4j

•  Graph visualization

•  D3.js

•  Required client side components

•  Typical Application Architecture

•  A simple example

•  A valid application example

Typical solution architectures

[Figure: two deployment options, each with an Application Client and an Application: one where the Application reaches a Neo4j Server over REST, and one where the Application uses Neo4j Embedded]

How to access Neo4J

•  Cypher

•  REST API (traversals, graph algos)

•  Java API (server extensions)

•  Dragons (don’t go there)

Graph Visualization

•  More than one library: D3.js and sigma.js

•  Now focus on D3.js (also used in Neo4j browser)

D3.js

•  Built-in graph (nodes and links) layout: “force”

•  Generates SVG structures

•  Input for a force layout is a simple nodes and links structure:

   {“nodes”: [{}, {}, {}, {}],
    “links”: [{“source”: 0, “target”: 1, …},
              {“source”: 0, “target”: 2, …},
              {“source”: 0, “target”: 3, …}]}

•  Easy to add event listeners via “force.on(…”

•  Lots of examples on the internet; good starting points:
   •  http://bl.ocks.org/sathomas
   •  http://www.coppelia.io/2014/07/an-a-to-z-of-extra-features-for-the-d3-force-layout/

Required Client Side Components

•  Components
   •  D3.js force layout
   •  Neo4j REST API: transactional endpoint

•  Process: Query Neo4j → Convert response for D3.js force → build force layout
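The "query, convert, build" process can be sketched as follows. The talk does this in client-side JavaScript; the sketch below uses Python only to show the data transformation, and it makes several assumptions: that the transactional endpoint accepts a `statements` array with `resultDataContents: ["graph"]` and answers with rows containing `graph.nodes` and `graph.relationships`, and that the hand-written `sample_response` has the right shape. The labels, names and the Cypher query are made up, and no HTTP call is made.

```python
import json


def build_request(cypher):
    """JSON body for the transactional endpoint, asking for graph-shaped rows."""
    return json.dumps({"statements": [
        {"statement": cypher, "resultDataContents": ["graph"]},
    ]})


def to_d3(response):
    """Convert a decoded endpoint response into {"nodes": [...], "links": [...]}."""
    nodes, links = [], []
    index_of = {}      # Neo4j node id -> position in the nodes array
    seen_rels = set()
    rows = [row["graph"] for result in response["results"] for row in result["data"]]
    for graph in rows:     # first pass: collect each node once
        for node in graph["nodes"]:
            if node["id"] not in index_of:
                index_of[node["id"]] = len(nodes)
                nodes.append({"id": node["id"], "labels": node["labels"],
                              "properties": node["properties"]})
    for graph in rows:     # second pass: links refer to nodes by array index
        for rel in graph["relationships"]:
            known = rel["startNode"] in index_of and rel["endNode"] in index_of
            if rel["id"] in seen_rels or not known:
                continue
            seen_rels.add(rel["id"])
            links.append({"source": index_of[rel["startNode"]],
                          "target": index_of[rel["endNode"]],
                          "type": rel["type"]})
    return {"nodes": nodes, "links": links}


# Hand-written stand-in for a server response; labels and names are made up.
sample_response = {"errors": [], "results": [{"columns": ["a", "r", "b"], "data": [
    {"graph": {
        "nodes": [
            {"id": "1", "labels": ["Author"], "properties": {"name": "Alice"}},
            {"id": "2", "labels": ["Book"], "properties": {"title": "Graphs"}},
        ],
        "relationships": [
            {"id": "7", "type": "WROTE", "startNode": "1", "endNode": "2",
             "properties": {}},
        ],
    }},
]}]}

request_body = build_request("MATCH (a:Author)-[r:WROTE]->(b:Book) RETURN a, r, b")
d3_input = to_d3(sample_response)
print(json.dumps(d3_input["links"]))  # [{"source": 0, "target": 1, "type": "WROTE"}]
```

The key step is the id-to-index mapping: the force layout input shown in the D3.js slide refers to nodes by their position in the `nodes` array, while the database refers to them by id.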

Typical application architecture

[Figure: Web Client, Application Server and Neo4j Server, connected via REST]

A Simple example

•  This is not secure: a web page accessing the Neo4j DB API directly

•  One-page example

•  Also uses Font Awesome, for displaying icons in the nodes

A valid application example

•  ZK RIA framework (there are a lot of them, but this is the one I know)

•  Server components ↔ client side widgets

•  On the server side the Neo4j access, on the client the D3.js graph

•  Propagating client side events to the server side

Questions… Discussion