System Support for Managing Graphs in the Cloud Sameh Elnikety & Yuxiong He Microsoft Research.
System Support for Managing Graphs in the Cloud
Sameh Elnikety & Yuxiong He, Microsoft Research
Example 1: Social Network
• Scale
– LinkedIn
» 70+ million users
– Facebook
» 500+ million users
» 65+ billion photos
• Rich graph
– Types, attributes
• Queries
– Find Alice’s friends
– Find Alice’s photos with friends
[Figure: social graph of Hillary, Bob, Alice, Chris, David, France, Ed, and George, shown first with people only and then with photo nodes Photo1 to Photo8 attached]
Objective: Management & Querying
• Manage and query large graphs online
– Today
» Expensive hardware (limited)
» Cluster of machines (ad-hoc)
– Tomorrow
» Growth area
• Analogy: provide system support
» Analogy: DBMS manages application data
» Offers easy, efficient, reliable solution to developers
Key Design Decisions
• Interactive queries
– Graph topology stored in random access memory
• Graph too large for one machine
– Partitioning: slice graph into pieces
– Replication: for fault tolerance
• Long time to build graph
– Support updates
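
The slides do not say how the graph is sliced or where replicas are placed. As a minimal sketch, assuming hash partitioning on node ID and copies placed on consecutive machines (both are assumptions, as are the names `partition_of` and `replicas_of`), placement might look like this:

```python
import zlib
from typing import List

def partition_of(node_id: str, num_partitions: int) -> int:
    """Map a node ID to a partition with a stable hash (assumed scheme)."""
    return zlib.crc32(node_id.encode("utf-8")) % num_partitions

def replicas_of(partition: int, num_machines: int, replication_factor: int) -> List[int]:
    """Place copies of a partition on consecutive machines (assumed policy)."""
    return [(partition + i) % num_machines for i in range(replication_factor)]

# Hypothetical placement of a few people from the social-network example.
placement = {name: partition_of(name, 4) for name in ["Alice", "Bob", "Chris", "David"]}
```

Hash partitioning is only the simplest possible choice; the talk lists partitioning of scale-free graphs as an open research problem, which suggests the authors consider this kind of placement insufficient on its own.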
A Few Research Problems
1. Querying the graph
– Data model & query language
– Execution engine
2. Supporting updates
– Multi-version concurrency control
3. Partitioning
– Scale-free graphs
Data Model
• Node
– ID, type, attributes
• Edge
– Connects two nodes
– Direction, type, attributes
[Figure: the social graph from Example 1, with and without photo nodes Photo1 to Photo8; App view: a Manages edge between Bob and Alice; System view: Manages and Managed-by edges between Bob and Alice]
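
The data model can be sketched with two small record types. This is an illustrative layout, not the authors' implementation: the class names, field names, and the choice to index edges in both directions are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    id: int
    type: str
    attributes: Dict[str, str] = field(default_factory=dict)

@dataclass
class Edge:
    source: int   # the edge direction runs from source to target
    target: int
    type: str
    attributes: Dict[str, str] = field(default_factory=dict)

class Graph:
    """In-memory graph that indexes every edge by both endpoints (assumed design)."""

    def __init__(self):
        self.nodes: Dict[int, Node] = {}
        self.out_edges: Dict[int, List[Edge]] = {}
        self.in_edges: Dict[int, List[Edge]] = {}

    def add_node(self, node: Node) -> None:
        self.nodes[node.id] = node
        self.out_edges.setdefault(node.id, [])
        self.in_edges.setdefault(node.id, [])

    def add_edge(self, edge: Edge) -> None:
        self.out_edges[edge.source].append(edge)
        self.in_edges[edge.target].append(edge)

g = Graph()
g.add_node(Node(1, "person", {"name": "Alice"}))
g.add_node(Node(2, "photo", {"name": "Photo1"}))
g.add_edge(Edge(2, 1, "tags"))  # Photo1 tags Alice
```

Keeping both `out_edges` and `in_edges` lets a query follow a directed edge from either end, for example starting at Alice and walking `tags` edges backward to reach her photos.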
Query Language
• Trade-off
– Expressiveness vs. efficiency
• Two query types
– Reachability
– Pattern matching
Query Language - Reachability
• Regular language reachability
– Query is a regular expression
» Sequence of node and edge predicates
» Regular predicates
– Example
» Alice’s photos
» Photo, tags, Alice
» Node: type=photo, edge: type=tags, node: type=person, name=Alice
» Result: matching paths
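
To make "sequence of node and edge predicates" concrete, here is a minimal sketch that checks one candidate path against the example query. Representing predicates and path elements as dictionaries is an assumption made for illustration, as are the helper names `satisfies` and `path_matches`; the sketch covers plain sequences only, not the regular operators.

```python
def satisfies(element, predicate):
    """True when every attribute required by the predicate has that value."""
    return all(element.get(key) == value for key, value in predicate.items())

def path_matches(path, query):
    """Check an alternating node/edge path against a same-length predicate list."""
    return len(path) == len(query) and all(
        satisfies(element, predicate) for element, predicate in zip(path, query)
    )

# The slide's example: photo, tags, Alice.
query = [
    {"type": "photo"},
    {"type": "tags"},
    {"type": "person", "name": "Alice"},
]

# Hypothetical path elements.
photo1 = {"type": "photo", "name": "Photo1"}
tags = {"type": "tags"}
alice = {"type": "person", "name": "Alice"}
bob = {"type": "person", "name": "Bob"}

matches_alice = path_matches([photo1, tags, alice], query)
matches_bob = path_matches([photo1, tags, bob], query)
```

Here the path ending at Alice satisfies all three predicates, while the path ending at Bob fails the `name=Alice` predicate.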
Query Language - Reachability
• Projection
» Alice’s photos
» Photo, tags, Alice
» Node: type=photo, edge: type=tags, node: type=person, name=Alice
» SELECT photo FROM photo, tags, Alice
• OR
» (Photo | video), tags, Alice
• Kleene star
» Alice org chart
» Alice, (manages, person)*
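
The Kleene star query `Alice, (manages, person)*` matches Alice followed by zero or more `manages` hops. A rough sketch of how such a query could be expanded is shown below; the org chart data and the function name `kleene_manages` are hypothetical, and a real engine would evaluate the regular expression in general rather than hard-code one edge type.

```python
from collections import deque

# Hypothetical org chart: manager -> direct reports.
manages = {
    "Alice": ["Bob", "Chris"],
    "Bob": ["David"],
    "Chris": [],
    "David": [],
}

def kleene_manages(start, manages):
    """Paths matching start, (manages, person)*: zero or more manages hops."""
    paths = [[start]]              # zero repetitions: the start node alone
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        for report in manages.get(path[-1], []):
            if report in path:     # skip cycles so the search terminates
                continue
            extended = path + [report]
            paths.append(extended)
            queue.append(extended)
    return paths

org_chart = kleene_manages("Alice", manages)
```

With this data, `org_chart` contains Alice alone, Alice with each direct report, and the two-hop path from Alice through Bob to David.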
Example 2: CodeBook - Graph
Example 2: CodeBook - Queries
1. Person, FileOwner>, TFSFile, FileOwner<, Person
2. Person, DiscussionOwner>, Discussion, DiscussionOwner<, Person
3. Person, WorkItemOwner>, TFSWorkItem, WorkItemOwner<, Person
4. Person, Manages<, Person, Manages>, Person
5. Person, WorkItemOwner>, TFSWorkItem, Mentions>, TFSFile, Mentions>, TFSWorkItem, WorkItemOwner<, Person
6. Person, WorkItemOwner>, TFSWorkItem, Mentions>, TFSFile, FileOwner<, Person
7. Person, FileOwner>, TFSFile, Mentions>, TFSWorkItem, Mentions>, TFSFile, FileOwner<, Person
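
The CodeBook queries alternate node types with edge types that carry a trailing `>` or `<`. Assuming those suffixes mark the direction in which an edge is followed (the slides do not define them), a small parser, here called `parse_path_query` for illustration, could turn a query string into explicit steps:

```python
def parse_path_query(text):
    """Split a comma-separated path query into (kind, label, direction) steps."""
    steps = []
    for token in (part.strip() for part in text.split(",")):
        if token.endswith(">"):
            # Assumed meaning: follow the edge along its direction.
            steps.append(("edge", token[:-1], "forward"))
        elif token.endswith("<"):
            # Assumed meaning: follow the edge against its direction.
            steps.append(("edge", token[:-1], "backward"))
        else:
            steps.append(("node", token, None))
    return steps

query_1 = parse_path_query("Person, FileOwner>, TFSFile, FileOwner<, Person")
```

Under this reading, query 1 starts at a person, follows a `FileOwner` edge to a file, and follows another `FileOwner` edge in the opposite direction to a second person.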
Query Language – Pattern Matching
• From reachability
» Paths
» Sequence of predicates
• To graph patterns
» Subgraph isomorphism
[Figure: graph patterns of increasing size over Photo, Alice, Bob, and City nodes, connected by Tags, Friend, and Lives-in edges]
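
Moving from paths to graph patterns means finding an assignment of pattern nodes to distinct graph nodes such that every pattern edge exists in the graph. The sketch below does this by brute-force enumeration, which is exponential and only meant to show the definition; the talk does not describe its matching algorithm, and the data and the name `find_matches` are hypothetical.

```python
from itertools import permutations

def find_matches(pattern_nodes, pattern_edges, graph_nodes, graph_edges):
    """Yield injective node mappings that preserve node types and pattern edges."""
    names = list(pattern_nodes)
    for candidate in permutations(graph_nodes, len(names)):
        mapping = dict(zip(names, candidate))
        types_ok = all(pattern_nodes[p] == graph_nodes[g] for p, g in mapping.items())
        edges_ok = all(
            (mapping[src], label, mapping[dst]) in graph_edges
            for src, label, dst in pattern_edges
        )
        if types_ok and edges_ok:
            yield mapping

# Hypothetical data graph: node -> type, and directed (source, label, target) edges.
graph_nodes = {"Alice": "person", "Bob": "person", "Photo1": "photo", "Photo2": "photo"}
graph_edges = {
    ("Photo1", "tags", "Alice"),
    ("Photo1", "tags", "Bob"),
    ("Photo2", "tags", "Alice"),
    ("Alice", "friend", "Bob"),
}

# Pattern: a photo that tags two people who are connected by a friend edge.
pattern_nodes = {"p": "photo", "a": "person", "b": "person"}
pattern_edges = [("p", "tags", "a"), ("p", "tags", "b"), ("a", "friend", "b")]

matches = list(find_matches(pattern_nodes, pattern_edges, graph_nodes, graph_edges))
```

Only Photo1 qualifies here, because Photo2 does not tag Bob; the friend edge is directed, so the mapping with Alice and Bob swapped is rejected.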
Talk Outline
• Query language
– Regular language reachability
– Graph pattern matching
• Execution engine
– Based on distributed breadth-first search
– Optimizations
• Supporting updates
Centralized Query Execution
• Query: Photo, Tags>, Alice
• Traversal: Alice, <Tags, Photo
• Breadth-first search
• Answer paths
– Alice, Tags, Photo1
– Alice, Tags, Photo8
[Figure: query automaton with states S0, S1, S2, S3 and transitions Alice, Tags, Photo, drawn over the social graph with photo nodes Photo1 to Photo8]
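
The breadth-first evaluation on this slide can be sketched as one frontier expansion per query step. Only the two answer paths come from the slide; the remaining toy data, the edge-list representation, and the function name `bfs_query` are assumptions.

```python
# Toy data: only the two answer paths come from the slide; the rest is made up.
node_type = {"Alice": "person", "Bob": "person",
             "Photo1": "photo", "Photo2": "photo", "Photo8": "photo"}

# Directed edges (source, type, target): a photo tags a person.
edges = [("Photo1", "Tags", "Alice"),
         ("Photo8", "Tags", "Alice"),
         ("Photo2", "Tags", "Bob")]

def bfs_query(start, steps):
    """Expand partial paths one level per (edge type, node type) query step."""
    frontier = [[start]]
    for edge_type, wanted_type in steps:
        next_frontier = []
        for path in frontier:
            for source, etype, target in edges:
                # Follow the edge against its direction: it must point at the path's end.
                if (target == path[-1] and etype == edge_type
                        and node_type[source] == wanted_type):
                    next_frontier.append(path + [etype, source])
        frontier = next_frontier
    return frontier

answers = bfs_query("Alice", [("Tags", "photo")])
```

Each pass of the outer loop plays the role of one state transition in the query automaton: the frontier holds every partial path that has reached the current state, and paths that cannot take the next transition are dropped.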
Cheap Concurrency Control!
• Replication and partitioning
– Surprising synergy
• Multi-Version Concurrency Control
– Generalized Snapshot Isolation
• Replicas agree on
– Which update transactions commit
– Their commit order
• Example
– T1: set x = 1
– T2: set x = 2
– Replica1: sees T1 then T2, so x = 2
– Replica2: sees T2 then T1, so x = 1
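
The example shows why replicas must agree on commit order: applying the same two writes in different orders leaves different final values. The short simulation below illustrates only that point; it does not implement Generalized Snapshot Isolation, and the function name `apply_in_order` is hypothetical.

```python
def apply_in_order(transactions, order):
    """Apply single-key writes in the given order and return the final state."""
    state = {}
    for name in order:
        key, value = transactions[name]
        state[key] = value
    return state

transactions = {"T1": ("x", 1), "T2": ("x", 2)}

# Without agreement, each replica applies the updates in the order it sees them.
replica1 = apply_in_order(transactions, ["T1", "T2"])
replica2 = apply_in_order(transactions, ["T2", "T1"])

# With an agreed commit order, both replicas reach the same state.
agreed_order = ["T1", "T2"]
replica1_agreed = apply_in_order(transactions, agreed_order)
replica2_agreed = apply_in_order(transactions, agreed_order)
```

The first pair of replicas ends with different values of `x`, while the second pair ends in the same state because both apply the agreed order.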
Summary
• Query language
– Regular language reachability
– Graph pattern matching
• Execution engine
– Based on distributed breadth-first search
– Optimizations
• Supporting updates
– Replication and partitioning
– Multi-version concurrency models
A Few Research Problems
1. Querying the graph
– Data model & query language
– Execution engine
2. Supporting updates
– Multi-version concurrency control
3. Partitioning
– Scale-free graphs