Gluecon InfiniteGraph/DB
The following is an excerpt of a presentation delivered at Gluecon 2010 in Broomfield,
Colorado.
The presentation is not about InfiniteGraph/DB itself; it is an overview of
managing distributed graph data in a graph database.
Copyright © InfiniteGraph
Scaling the [Social] Graph in the [Cloud]
Darren Wood, Lead Architect, InfiniteGraph
Graph Databases (Quickly)
• Optimized around data relationships
• Small focused API (typically not SQL)
• Typical Use Cases:
– Social Graph Analysis
– Catching Bad Guys (see Booth 16)
– Fraud / Financial (more bad guys)
– Data Intensive Science
– Web / Advertising Analytics
Graph Databases (Almost Done)
Vertex alice = myGraph.addVertex(new Person("Alice"));
Vertex bob = myGraph.addVertex(new Person("Bob"));
Vertex carlos = myGraph.addVertex(new Person("Carlos"));
Vertex charlie = myGraph.addVertex(new Person("Charlie"));

alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);
carlos.addEdge(new Payment(100000.00), charlie);
bob.addEdge(new Call(timestamp), charlie);
[Diagram: Alice -Meets-> Bob -Calls-> Carlos -Pays-> Charlie; Bob -Calls-> Charlie]
What’s So Difficult Then?
• Graphs grow quickly
– Billions of phone calls / day in US
– Emails, social media events, IP Traffic
– Financial transactions
• Some analytics require navigation of large sections of the graph
• Each step (often) depends on the last
• Must distribute data and go parallel
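The point that each step depends on the last can be sketched with a hop-by-hop navigation: the vertices visited at hop n+1 are unknown until hop n has finished. This is a minimal in-memory sketch; the class name, method name and adjacency-map representation are assumptions made for illustration and are not the InfiniteGraph API.

```java
import java.util.*;

// Hypothetical in-memory graph navigation; illustrative only.
class StepwiseNavigation {
    // Returns the newly discovered vertices at each hop from `start`, up to maxHops.
    static List<Set<String>> hops(Map<String, List<String>> adjacency, String start, int maxHops) {
        List<Set<String>> levels = new ArrayList<>();
        Set<String> visited = new HashSet<>(Set.of(start));
        Set<String> frontier = new LinkedHashSet<>(Set.of(start));
        for (int hop = 0; hop < maxHops; hop++) {
            Set<String> next = new LinkedHashSet<>();
            // The next frontier cannot be computed until the current one is known.
            for (String v : frontier) {
                for (String n : adjacency.getOrDefault(v, List.of())) {
                    if (visited.add(n)) {
                        next.add(n);
                    }
                }
            }
            if (next.isEmpty()) {
                break; // nothing new to explore
            }
            levels.add(next);
            frontier = next;
        }
        return levels;
    }

    public static void main(String[] args) {
        // Edges taken from the Alice/Bob/Carlos/Charlie example.
        Map<String, List<String>> g = Map.of(
            "Alice", List.of("Bob"),
            "Bob", List.of("Carlos", "Charlie"),
            "Carlos", List.of("Charlie"));
        System.out.println(hops(g, "Alice", 3)); // prints [[Bob], [Carlos, Charlie]]
    }
}
```

When the adjacency lists live on different machines, every hop in this loop may become a network round trip, which is why the data has to be distributed with care.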
First Some Good News…
• Graph algorithms naturally branch
• Can be automated or guided
[Diagram: the Alice, Bob, Carlos, Charlie graph (Meets, Calls, Pays, Calls edges) extended with vertices Dave, Eve and Chuck and further Lives With and Meets edges]
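Because a navigation branches at every vertex, the branches of one frontier can be expanded independently of each other. The sketch below does this with a Java parallel stream over an in-memory adjacency map; the class and method names are hypothetical, and a real distributed engine would dispatch branches to processors rather than to local threads.

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical sketch of branch-parallel navigation; not the InfiniteGraph API.
class ParallelBranching {
    // Returns every vertex reachable from `start`, including `start` itself.
    static Set<String> reachable(Map<String, List<String>> adjacency, String start) {
        Set<String> visited = ConcurrentHashMap.newKeySet();
        visited.add(start);
        Set<String> frontier = Set.of(start);
        while (!frontier.isEmpty()) {
            // Each frontier vertex is an independent branch, so branches expand in parallel.
            frontier = frontier.parallelStream()
                .flatMap(v -> adjacency.getOrDefault(v, List.of()).stream())
                .filter(visited::add) // true only for the first thread to claim a vertex
                .collect(Collectors.toSet());
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = Map.of(
            "Alice", List.of("Bob"),
            "Bob", List.of("Carlos", "Charlie"),
            "Carlos", List.of("Charlie"));
        System.out.println(reachable(g, "Alice"));
    }
}
```

The hops themselves remain sequential; only the work within a hop is spread out.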
Big Distributed Data (Traditional - Huge Generalization)
[Diagram: Application(s); Distributed API; Partition 1, Partition 2, Partition 3, Partition ...n; one Processor per partition]
Big Distributed Data (Graph)
[Diagram: Application(s); Distributed API; Partition 1, Partition 2, Partition 3, Partition ...n; one Processor per partition]
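One reading of the contrast between the traditional and graph pictures, offered here as an assumption, is that graph edges frequently connect vertices stored in different partitions, so a navigation cannot stay inside one partition. The sketch below counts such crossing edges for a given placement; the hash-based placement helper and all names are hypothetical.

```java
import java.util.*;

// Hypothetical sketch: measuring how many edges span two partitions.
class CrossPartitionEdges {
    // Naive placement by hash of the vertex id (an assumed baseline, not from the slides).
    static int partitionOf(String vertexId, int partitions) {
        return Math.floorMod(vertexId.hashCode(), partitions);
    }

    // Counts edges whose two endpoints are placed in different partitions.
    static long countCrossing(List<String[]> edges, Map<String, Integer> placement) {
        return edges.stream()
            .filter(e -> !placement.get(e[0]).equals(placement.get(e[1])))
            .count();
    }

    public static void main(String[] args) {
        List<String[]> edges = List.of(
            new String[] {"Alice", "Bob"},
            new String[] {"Bob", "Carlos"},
            new String[] {"Carlos", "Charlie"},
            new String[] {"Bob", "Charlie"});
        Map<String, Integer> hashed = new HashMap<>();
        for (String[] e : edges) {
            hashed.put(e[0], partitionOf(e[0], 2));
            hashed.put(e[1], partitionOf(e[1], 2));
        }
        System.out.println("crossing edges with hash placement: " + countCrossing(edges, hashed));
    }
}
```

Each crossing edge is a point where a traversal has to leave one processor and continue on another.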
So What Are The Answers? Best Effort Partitioning
[Diagram: Distributed API; Partition 1 and Partition 2, each with a Processor]
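The slide does not spell out a placement rule. One plausible reading of best effort partitioning, assumed here, is to store a new vertex in the partition that already holds most of its neighbors, and to fall back to another partition when that one is full. All names and the capacity parameter are hypothetical.

```java
import java.util.*;

// Hypothetical best-effort placement rule; an assumption, not the speaker's algorithm.
class BestEffortPlacement {
    // Picks the non-full partition holding the most neighbors; ties go to the lighter partition.
    // Returns -1 when every partition is at capacity.
    static int choosePartition(List<String> neighbors, Map<String, Integer> placement,
                               int[] load, int capacity) {
        int[] votes = new int[load.length];
        for (String n : neighbors) {
            Integer p = placement.get(n);
            if (p != null) {
                votes[p]++; // neighbor n already lives in partition p
            }
        }
        int best = -1;
        for (int p = 0; p < load.length; p++) {
            if (load[p] >= capacity) {
                continue; // partition is full, so locality cannot be honoured here
            }
            if (best == -1 || votes[p] > votes[best]
                    || (votes[p] == votes[best] && load[p] < load[best])) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Integer> placement = Map.of("Alice", 0, "Bob", 0, "Carlos", 1);
        int[] load = {2, 1};
        // A new vertex linked to Alice and Bob goes next to them while there is room.
        System.out.println(choosePartition(List.of("Alice", "Bob"), placement, load, 3)); // prints 0
        // With partition 0 full, the placement falls back to partition 1.
        System.out.println(choosePartition(List.of("Alice", "Bob"), placement, load, 2)); // prints 1
    }
}
```

The rule is "best effort" in the sense that locality is preferred but never guaranteed.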
So What Are The Answers? The Look Ahead Example
[Diagram: Distributed API; Partition 1 and Partition 2, each with a Processor]
[Diagram labels: Application; vertices A, B, C, D, E, X, Y]
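A rough sketch of look ahead, under the assumption that it means answering a request for one vertex with that vertex plus its neighborhood out to a configurable number of hops, so that later steps are served from a local cache. The chain A to E, the class name and the method names are hypothetical; the actual edges in the slide's diagram are not recoverable.

```java
import java.util.*;

// Hypothetical look-ahead fetch; illustrative only.
class LookAheadFetch {
    // Returns adjacency lists for `start` and every vertex within `degree` hops of it.
    static Map<String, List<String>> fetch(Map<String, List<String>> store, String start, int degree) {
        Map<String, List<String>> batch = new LinkedHashMap<>();
        Deque<String> frontier = new ArrayDeque<>(List.of(start));
        for (int hop = 0; hop <= degree && !frontier.isEmpty(); hop++) {
            Deque<String> next = new ArrayDeque<>();
            for (String v : frontier) {
                if (batch.containsKey(v)) {
                    continue; // already part of this response
                }
                List<String> out = store.getOrDefault(v, List.of());
                batch.put(v, out);
                next.addAll(out);
            }
            frontier = next;
        }
        return batch;
    }

    // Walks everything reachable from `start` and counts the fetch requests issued.
    static int requestsToTraverse(Map<String, List<String>> store, String start, int degree) {
        Map<String, List<String>> cache = new HashMap<>();
        Deque<String> todo = new ArrayDeque<>(List.of(start));
        Set<String> seen = new HashSet<>();
        int requests = 0;
        while (!todo.isEmpty()) {
            String v = todo.pop();
            if (!seen.add(v)) {
                continue;
            }
            if (!cache.containsKey(v)) {
                cache.putAll(fetch(store, v, degree)); // one simulated round trip
                requests++;
            }
            todo.addAll(cache.get(v));
        }
        return requests;
    }

    public static void main(String[] args) {
        // Assumed chain A -> B -> C -> D -> E.
        Map<String, List<String>> store = Map.of(
            "A", List.of("B"), "B", List.of("C"), "C", List.of("D"), "D", List.of("E"));
        for (int degree : new int[] {0, 1, 4}) {
            System.out.println("degree " + degree + ": "
                + requestsToTraverse(store, "A", degree) + " requests");
        }
    }
}
```

On this chain the walk needs 5 requests with no look ahead, 3 with a degree of 1 and 1 with a degree of 4. The trade-off is that a larger degree ships more data per request, some of which may never be used.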
Which of These Work?
• A carefully orchestrated combination of the various options
• Look ahead can be tuned (degree of look ahead)
• Healing the graph can be expensive (write cost)
• Healing can also be tuned/configured (external edge thresholds)
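The slides do not define an external edge threshold. One possible interpretation, assumed here, is that a vertex becomes a candidate for relocation (healing) only when the fraction of its edges that lead to other partitions exceeds a configured value, which limits how often the write cost is paid. All names are hypothetical.

```java
import java.util.*;

// Hypothetical healing policy based on an external edge threshold; an assumption.
class HealingPolicy {
    // Fraction of v's outgoing edges whose target lives in a different partition.
    static double externalFraction(String v, Map<String, List<String>> adjacency,
                                   Map<String, Integer> placement) {
        List<String> out = adjacency.getOrDefault(v, List.of());
        if (out.isEmpty()) {
            return 0.0;
        }
        int home = placement.get(v);
        long external = out.stream().filter(n -> placement.get(n) != home).count();
        return (double) external / out.size();
    }

    // A vertex is moved only when its external fraction exceeds the threshold.
    static boolean shouldRelocate(String v, Map<String, List<String>> adjacency,
                                  Map<String, Integer> placement, double threshold) {
        return externalFraction(v, adjacency, placement) > threshold;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = Map.of("Bob", List.of("Carlos", "Charlie"));
        Map<String, Integer> placement = Map.of("Bob", 0, "Carlos", 0, "Charlie", 1);
        System.out.println(externalFraction("Bob", g, placement));    // prints 0.5
        System.out.println(shouldRelocate("Bob", g, placement, 0.4)); // prints true
        System.out.println(shouldRelocate("Bob", g, placement, 0.5)); // prints false
    }
}
```

A higher threshold means fewer relocations and lower write cost, at the price of more cross-partition steps during navigation.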