LLAMA: Efficient graph analytics using Large...
Transcript of LLAMA: Efficient graph analytics using Large...
![Page 1: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/1.jpg)
LLAMA: Efficient graph analytics using Large Multiversioned Arrays
Presenters:Bryan Wilder, Ethan Kripke, Zach Dietz
![Page 2: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/2.jpg)
Background
![Page 3: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/3.jpg)
Mickens, James. “The Saddest Moment.” ;Login: Logout, (May 2013).
Big-Data Graph Analytics Applications
![Page 4: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/4.jpg)
Big-Data Graph Analytics Applications
FriendBook
• • •
GoggleDon’t Not Not Be Not Evil™
123
![Page 5: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/5.jpg)
Design Considerations
Compressed Sparse Row
Adjacency Lists
Compressed Sparse Bitmaps❓
❓
❓
![Page 6: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/6.jpg)
Design Considerations
Compressed Sparse Row
Mutability
Multi-versioning
In-Memory vs. Out of Memory
Overhead
![Page 7: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/7.jpg)
Compressed Sparse Row (CSR)
0 4 0 0 0 0 0
0 0 2 8 0 0 1
0 0 0 0 0 0 0
0 0 0 1 0 3 0
[ 4 ]
[ 2, 8, 1 ]
[ 1, 3 ]
[ ]
![Page 8: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/8.jpg)
0 1 2 3
[ 0, 1, 4, 4, 6 ]
Compressed Sparse Row (CSR)
0 4 0 0 0 0 0
0 0 2 8 0 0 1
0 0 0 0 0 0 0
0 0 0 1 0 3 0 Value
Column
Cum. NNZ
(implicit: row)
[ 1, 2, 3, 6, 3, 5 ]
[ 4, 2, 8, 1, 1, 3 ]
![Page 9: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/9.jpg)
Compressed Sparse Row (CSR)
0
1 2 [ 1, 2, 2, 0, 2 ]Col
[ 0, 2, 3, 5 ]NNZ
(row) 0 1 2
0 1 1
0 0 1
1 0 1
![Page 10: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/10.jpg)
Compressed Sparse Row (CSR)
0
1 2 [ 1, 2, 2, 0, 2 ]
[ 0, 2, 3, 5 ]0 1 1
0 0 1
1 0 1
VertexTable
Edge Table
• • •
• • •
![Page 11: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/11.jpg)
Implementation
![Page 12: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/12.jpg)
Snapshot 0 0 1 2 3
2 3 0 2
Vertex Table
Edge TableSnapshot 0Vertex 0
Offset: 0Length: 2
Snapshot 0Vertex 1
Offset: 2Length: 1
Snapshot 0Vertex 2
Offset: 3Length: 0
Snapshot 0Vertex 3
Offset: 3Length: 1
2 3 0 2
Vertex Table
Edge Table
Blue Red Green BlueEdge Property
Tall Short Skinny ShortVertex Property
0 0 0 0Deletions
Indirection Array(pointers to pages in VT)
0-1 2-3
Page 0 Page 1
< L3 Cache
multiple of file system & virtual mem page size
![Page 13: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/13.jpg)
Snapshot 0 0 1 2 3
2 3 0 2
Vertex Table
Edge Table
Snapshot 1 0 1
3 Continuation:NONE
Indirection Table 0-1 2-3
Page 0 Page 1 Page 0’
3 Continuation:Snapshot 0, Offset 2
0-1 2-3
Reading? Merging?
Rule for merging?
![Page 14: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/14.jpg)
Using LLAMAExposes three operations to user:
Iterate over verticesSelect a specific vertexIterate over neighbors of a vertex
Contrast: does not enforce sequential access pattern
(unlike GraphChi or X-stream)
![Page 15: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/15.jpg)
ExperimentalEvaluation
![Page 16: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/16.jpg)
TasksPageRank computation
Breadth-first search
Triangle counting
![Page 17: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/17.jpg)
TasksPageRank computation
Breadth-first search
Triangle counting
DatasetsLarge synthetic graphs (R-MAT)
Large, publicly available social graphs (Twitter, Livejournal)
![Page 18: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/18.jpg)
TasksPageRank computation
Breadth-first search
Triangle counting
DatasetsLarge synthetic graphs (R-MAT)
Large, publicly available social graphs (Twitter, Livejournal)
In memory vs out of memory
![Page 19: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/19.jpg)
PageRank, in-memory
In memory Out of memory
![Page 20: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/20.jpg)
PageRank, out-of-memory
![Page 21: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/21.jpg)
Runtime breakdown
Enormous variation!
LLAMA minimizes overhead
![Page 22: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/22.jpg)
How is this possible? What features of LLAMA’s design essentially eliminate overhead like buffer management?
![Page 23: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/23.jpg)
Overhead of snapshots
LLAMA
GraphChi (for reference)
PageRank Time to merge snapshots
Merges are IO-bound
![Page 24: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/24.jpg)
Scalability in number of cores (PageRank)
![Page 25: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/25.jpg)
Experimental evaluationCompare to in-memory and out-of-memory systems
(GreenMarl and GraphLab, GraphChi and X-stream)
Evaluate multi-version support
How much overhead from snapshots?
Evaluate scalability on different datasets
![Page 26: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/26.jpg)
Experimental takeawaysLLAMA performs well both in and out of memory
Significantly reduces existing overhead
Introduces manageable additional overhead
![Page 27: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/27.jpg)
Thank you!
![Page 28: LLAMA: Efficient graph analytics using Large ...daslab.seas.harvard.edu/classes/cs265/files/presentations/2020/CS2… · LLAMA: Efficient graph analytics using Large Multiversioned](https://reader034.fdocuments.in/reader034/viewer/2022042420/5f36c0964a561d49a839b98d/html5/thumbnails/28.jpg)
How can we store variable length properties?
Can we create a hybrid between a row- and column-store with the storage of properties?