New opportunities for connected data

Post on 09-May-2015

description

Today's complex data is not only big, but also semi-structured and densely connected. In this session we'll look at how size, structure and connectedness have converged to change the way we work with data. We'll then go on to look at some of the new opportunities for creating end-user value that have emerged in a world of connected data, illustrated with practical examples implemented using Neo4j, the world's leading graph database.

Speaker: Ian Robinson, Director of Customer Success, Neo Technology

Ian is an engineer at Neo Technology, currently working on research and development for future versions of the Neo4j graph database. Prior to joining the engineering team, Ian served as Neo's Director of Customer Success, managing training, professional services and support, and working with customers to design and develop graph database solutions. He is a co-author of 'Graph Databases' (O'Reilly) and 'REST in Practice' (O'Reilly), and a contributor to 'REST: From Research to Practice' (Springer) and 'Service Design Patterns' (Addison-Wesley). He presents at conferences worldwide on REST and the graph capabilities of Neo4j, and blogs at http://iansrobinson.com.

Transcript of New opportunities for connected data

#neo4j  

New Opportunities for Connected Data

@ianSrobinson
ian@neotechnology.com

Outline

• Data complexity
• Graph databases – features and benefits
• Querying graph data
• Use cases

Data Complexity

complexity = f(size, semi-structure, connectedness)

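Semi-Structure

Email: ian@neotechnology.com
Email: iansrobinson@gmail.com
Twitter: @iansrobinson
Skype: iansrobinson

    +-----------------------------------------------------------------------------------------------------------------------------+
    | FIRST_NAME | LAST_NAME | USER_ID | EMAIL_1               | EMAIL_2                | TWITTER       | FACEBOOK | SKYPE        |
    +-----------------------------------------------------------------------------------------------------------------------------+
    | Ian        | Robinson  | 315     | ian@neotechnology.com | iansrobinson@gmail.com | @iansrobinson | NULL     | iansrobinson |
    +-----------------------------------------------------------------------------------------------------------------------------+

[Diagram: USER, CONTACT and CONTACT_TYPE tables, linked with 0..n cardinality]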
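As a rough illustration of the semi-structure point, the Python sketch below turns the sparse USER row into one record per contact that actually exists, so the NULL column simply disappears. The `CONTACT_COLUMNS` mapping and the `contacts_from_row` helper are hypothetical names introduced for this example; they are not part of the talk.

```python
# Relational-style row from the slide: every optional contact needs its own
# column, so a missing value has to be stored as NULL (None here).
row = {
    "FIRST_NAME": "Ian",
    "LAST_NAME": "Robinson",
    "USER_ID": 315,
    "EMAIL_1": "ian@neotechnology.com",
    "EMAIL_2": "iansrobinson@gmail.com",
    "TWITTER": "@iansrobinson",
    "FACEBOOK": None,
    "SKYPE": "iansrobinson",
}

# Hypothetical mapping from contact columns to a contact type.
CONTACT_COLUMNS = {
    "EMAIL_1": "Email",
    "EMAIL_2": "Email",
    "TWITTER": "Twitter",
    "FACEBOOK": "Facebook",
    "SKYPE": "Skype",
}


def contacts_from_row(user_row):
    """Return (contact_type, value) pairs, skipping columns that are NULL."""
    return [
        (CONTACT_COLUMNS[column], value)
        for column, value in user_row.items()
        if column in CONTACT_COLUMNS and value is not None
    ]


contacts = contacts_from_row(row)
for contact_type, value in contacts:
    print(f"{contact_type}: {value}")
```

A second email address or a new contact type becomes one more pair in the list rather than one more mostly empty column.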

Social Network

Network Impact Analysis

Route Finding

Recommendations

Logistics

Access Control

Fraud Analysis

Securities and Debt

Image: orgnet.com

Graphs Are Everywhere

Graph Databases

• Store data
• Manage data
• Query data

Neo4j is a Graph Database

[Architecture diagram. Labels: Application with Java APIs; Application with REST Client; Write LB and Read LB; three REST API instances]

Property Graph Data Model

Nodes

Labels

Relationships

Graph Database Benefits

“Minutes to milliseconds” performance
• Millions of ‘joins’ per second
• Consistent query times as dataset grows

Fit for the domain
• Lots of join tables? Connectedness
• Lots of sparse tables? Semi-structure

Business responsiveness
• Easy to evolve

Querying Graph Data

• Describing graphs
• Creating nodes, relationships and properties
• Querying graphs

Describing Graphs

Cypher

    (neo4j)<-[:HAS_SKILL]-(ben)-[:HAS_SKILL]->(rest),
    (ben)-[:WORKS_FOR]->(acme)

Cypher

    (ben)-[:HAS_SKILL]->(neo4j),
    (ben)-[:HAS_SKILL]->(rest),
    (ben)-[:WORKS_FOR]->(acme)

Create Some Data

    CREATE (ben:person { name:'Ben' }),
           (acme:company { name:'Acme' }),
           (rest:skill { name:'REST' }),
           (neo4j:skill { name:'Neo4j' }),
           (ben)-[:WORKS_FOR]->(acme),
           (ben)-[:HAS_SKILL]->(rest),
           (ben)-[:HAS_SKILL]->(neo4j)
    RETURN ben

Create Nodes

    CREATE (ben:person { name:'Ben' }),
           (acme:company { name:'Acme' }),
           (rest:skill { name:'REST' }),
           (neo4j:skill { name:'Neo4j' })

Node

    (ben:person { name:'Ben' })

• Identifier: ben
• Label: person
• Properties: { name:'Ben' }

Create Relationships

    (ben)-[:WORKS_FOR]->(acme),
    (ben)-[:HAS_SKILL]->(rest),
    (ben)-[:HAS_SKILL]->(neo4j)

Return Node

    RETURN ben
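To make the property graph model concrete, here is a minimal Python sketch of the same small graph (Ben, Acme, and the REST and Neo4j skills). The `Node` and `Relationship` classes and the `outgoing` helper are hypothetical, in-memory stand-ins written for this example; they are not Neo4j's API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with one label and a map of properties."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection between two nodes."""
    start: Node
    rel_type: str
    end: Node


# Nodes: identifier = Node(label, properties)
ben = Node("person", {"name": "Ben"})
acme = Node("company", {"name": "Acme"})
rest = Node("skill", {"name": "REST"})
neo4j = Node("skill", {"name": "Neo4j"})

# Relationships: (start)-[:TYPE]->(end)
relationships = [
    Relationship(ben, "WORKS_FOR", acme),
    Relationship(ben, "HAS_SKILL", rest),
    Relationship(ben, "HAS_SKILL", neo4j),
]


def outgoing(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [
        r.end for r in relationships
        if r.start is node and r.rel_type == rel_type
    ]


skills = [n.properties["name"] for n in outgoing(ben, "HAS_SKILL")]
employers = [n.properties["name"] for n in outgoing(ben, "WORKS_FOR")]
print(skills, employers)
```

The identifiers (`ben`, `acme`) play the same role as in the Cypher statement: they only name nodes so that later lines can connect them.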

Eventually…

Querying a Graph

Graph local:
• One or more start nodes
• Explore surrounding graph
• Millions of joins per second

Pattern Matching

[Figure sequence: a pattern matched against a graph. Slide titles, in order: Pick Start Node; First Match; Second Match; Third Match; Not A Match; Not A Match]

Who Shares My Skill Set?

Colleagues Who Share My Skills

Cypher Pattern

    (company)<-[:WORKS_FOR]-(me)-[:HAS_SKILL]->(skill),
    (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)

Find Colleagues

    MATCH (company)<-[:WORKS_FOR]-(me:person)-[:HAS_SKILL]->(skill),
          (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
    WHERE me.name = 'Ian'
    RETURN colleague.name AS name,
           count(skill) AS score,
           collect(skill.name) AS skills
    ORDER BY score DESC

Results

    +--------------------------------------+
    | name      | score | skills           |
    +--------------------------------------+
    | "Ben"     | 2     | ["Neo4j","REST"] |
    | "Charlie" | 1     | ["Neo4j"]        |
    +--------------------------------------+
    2 rows

Search the Entire Network

    MATCH (me:person)-[:HAS_SKILL]->(skill),
          (company)<-[:WORKS_FOR]-(person)-[:HAS_SKILL]->(skill)
    WHERE me.name = 'Ian'
    RETURN person.name AS name,
           company.name AS company,
           count(skill) AS score,
           collect(skill.name) AS skills
    ORDER BY score DESC

Results

    +--------------------------------------------------------------+
    | name      | company        | score | skills                  |
    +--------------------------------------------------------------+
    | "Arnold"  | "Startup, Ltd" | 3     | ["Java","Neo4j","REST"] |
    | "Ben"     | "Acme, Inc"    | 2     | ["Neo4j","REST"]        |
    | "Gordon"  | "Startup, Ltd" | 1     | ["Neo4j"]               |
    | "Charlie" | "Acme, Inc"    | 1     | ["Neo4j"]               |
    +--------------------------------------------------------------+
    4 rows
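The difference between the colleague query and the network-wide query is only whether the other person must work for the same company. The Python sketch below mimics that logic over plain dictionaries. The sample data is hypothetical (a subset chosen to be consistent with the result tables), and `shared_skills` is an illustrative helper, not how Neo4j evaluates Cypher.

```python
# Hypothetical sample data: person -> company, person -> set of skills.
WORKS_FOR = {
    "Ian": "Acme, Inc",
    "Ben": "Acme, Inc",
    "Charlie": "Acme, Inc",
    "Arnold": "Startup, Ltd",
}
HAS_SKILL = {
    "Ian": {"Java", "Neo4j", "REST"},
    "Ben": {"Neo4j", "REST"},
    "Charlie": {"Neo4j"},
    "Arnold": {"Java", "Neo4j", "REST"},
}


def shared_skills(me, same_company_only):
    """Rank other people by the number of skills they share with `me`."""
    rows = []
    for person, company in WORKS_FOR.items():
        if person == me:
            continue  # skip the person we are searching for
        if same_company_only and company != WORKS_FOR[me]:
            continue  # the colleague query also requires a shared company
        common = sorted(HAS_SKILL[me] & HAS_SKILL.get(person, set()))
        if common:
            # (name, company, score, skills), like the RETURN clause
            rows.append((person, company, len(common), common))
    # Equivalent of ORDER BY score DESC
    return sorted(rows, key=lambda row: row[2], reverse=True)


colleagues = shared_skills("Ian", same_company_only=True)
network = shared_skills("Ian", same_company_only=False)
for row in network:
    print(row)
```

Dropping the company constraint widens the search from one employer to everyone in the data set, which is what removing the shared `(company)` node from the first path does in the Cypher version.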

Find People With Matching Skills

    MATCH p=(me:person)-[:WORKED_ON*2..4]-(person)-[:HAS_SKILL]->(skill)
    WHERE me.name = 'Ian'
      AND person <> me
      AND skill.name IN ['Java','Clojure','SQL']
    WITH person, skill, min(length(p)) as pathLength
    RETURN person.name AS name,
           count(skill) AS score,
           collect(skill.name) AS skills,
           ((pathLength - 1)/2) AS distance
    ORDER BY score DESC

Results

    +---------------------------------------------------+
    | name      | score | skills             | distance |
    +---------------------------------------------------+
    | "Arnold"  | 2     | ["Clojure","Java"] | 2        |
    | "Charlie" | 1     | ["SQL"]            | 1        |
    +---------------------------------------------------+
    2 rows
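The `distance` column follows from the path length: each step between two people costs two WORKED_ON hops (person to project, project to person), and the path ends with one HAS_SKILL hop, so a path of length 3 gives distance 1 and a path of length 5 gives distance 2. The Python sketch below reproduces that arithmetic with a breadth-first search. The project data and the function names are hypothetical, chosen only so the distances agree with the result table.

```python
from collections import deque

# Hypothetical data: who worked on which project (not from the slides).
WORKED_ON = {
    "Ian": {"Project A"},
    "Charlie": {"Project A", "Project B"},
    "Arnold": {"Project B"},
}


def build_adjacency(worked_on):
    """Treat WORKED_ON as undirected edges between people and projects."""
    adjacency = {}
    for person, projects in worked_on.items():
        for project in projects:
            adjacency.setdefault(person, set()).add(project)
            adjacency.setdefault(project, set()).add(person)
    return adjacency


def colleague_distances(worked_on, start, min_hops=2, max_hops=4):
    """Map each person reachable in 2..4 WORKED_ON hops to a distance."""
    adjacency = build_adjacency(worked_on)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not explore beyond the upper bound
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)

    result = {}
    for node, hop_count in hops.items():
        is_person = node in worked_on
        if is_person and node != start and min_hops <= hop_count <= max_hops:
            path_length = hop_count + 1  # plus the final HAS_SKILL hop
            result[node] = (path_length - 1) // 2
    return result


distances = colleague_distances(WORKED_ON, "Ian")
print(distances)
```

Breadth-first search visits nodes in order of hop count, so the first time a person is reached corresponds to the `min(length(p))` taken in the query.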

Case Studies

Network Impact Analysis

• Which parts of network does a customer depend on?
• Who will be affected if we replace a network element?

Asset Management & Access Control

• Which assets can an admin control?
• Who can change my subscription?

Logistics

• What’s the quickest delivery route for this parcel?

Social Network & Recommendations

• Which assets can I access?
• Who shares my interests?

Download the free book from O’Reilly

http://graphdatabases.com

Graph Databases, by Ian Robinson, Jim Webber & Emil Eifrem (compliments of Neo Technology)

Thank you

@ianSrobinson
ian@neotechnology.com
github.com/iansrobinson