Design of a NVRAM Specialized Dynamic Graph Data Structure
Keita Iwabuchi1,2, Roger A Pearce2, Brian Van Essen2, Maya B Gokhale2, Satoshi Matsuoka1
1. Tokyo Institute of Technology (Tokyo Tech) 2. Lawrence Livermore National Laboratory (LLNL)
A NVRAM Specialized Degree-Aware Dynamic Graph Data Structure
Streaming edge insertion
1. Streaming edges are ingested in Sorted, Random, or BFS order
2. Partition the edges using 1D or 2D partitioning (the number of partitions is 64)
3. Buffer a subset of edges (1 million) in DRAM
4. Insert the edge buffer into the graph data structure sequentially
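The ingestion steps above can be sketched in a few lines of Python. This is a minimal sketch, not the poster's implementation: the modulo-based owner functions, the square grid used for 2D partitioning, and the names `partition_1d`, `partition_2d`, `buffered`, and `ingest` are all assumptions made for illustration. Only the partition count (64) and the buffer size (1 million edges) come from the poster.

```python
import math
from itertools import islice

NUM_PARTITIONS = 64        # partition count stated on the poster
BUFFER_SIZE = 1_000_000    # edges buffered in DRAM, as stated on the poster


def partition_1d(u, v, num_partitions=NUM_PARTITIONS):
    """1D partitioning (assumed scheme): the source vertex alone picks the owner."""
    return u % num_partitions


def partition_2d(u, v, num_partitions=NUM_PARTITIONS):
    """2D partitioning (assumed scheme): partitions form a side x side grid,
    the source vertex picks the row and the target vertex picks the column.
    If num_partitions is not a perfect square, some partitions stay unused."""
    side = math.isqrt(num_partitions)
    return (u % side) * side + (v % side)


def buffered(edges, buffer_size=BUFFER_SIZE):
    """Yield successive lists of at most buffer_size edges from an edge stream."""
    it = iter(edges)
    while True:
        chunk = list(islice(it, buffer_size))
        if not chunk:
            return
        yield chunk


def ingest(edges, partitions, partition=partition_1d, buffer_size=BUFFER_SIZE):
    """Buffer the stream, then insert each buffer sequentially into the
    per-partition stores (plain lists stand in for the graph data structure)."""
    for chunk in buffered(edges, buffer_size):
        for u, v in chunk:
            partitions[partition(u, v, len(partitions))].append((u, v))
```

Passing `partition=partition_2d` switches the owner function without changing the buffering loop, which mirrors how the poster treats partitioning and buffering as separate steps.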
Motivation
Store and Process Large Dynamic Graphs
• Social networks, genome analysis, WWW, etc.
• Streaming graph updates (insert or delete edges or vertices)
• Efficiently store sparse scale-free graphs
Leverage Emerging NVRAM in HPC Systems
• NVRAM has lower cost and power consumption than DRAM
• Persistently store a distributed graph database across compute nodes with attached NVRAM
• Extends a node’s memory capacity
Goal
High performance:
• Insertion and deletion of vertices and edges
• Search for a specific edge based on edge metadata
Key Design Objectives
• Increase page-level locality of data stored in NVRAM
• Optimize for low-degree vertices
• Efficiently search and retrieve vertices, edges, and metadata
• Quickly locate a specific edge matching topological and metadata constraints
Our Approach
• Degree-aware data structures, where low-degree vertices are compactly represented
• Use Robin Hood Hashing [Celis ’86] because of its locality properties
[Figure: a controller/partitioner distributes streaming edges (Sorted, Random, BFS order) to compute nodes that hold the dynamic graph data structure (this work). Example graph labels: vertices v1 to v3 with properties p1 to p3.]
Experiments
Dataset
WebGraph 2012 [Lehmberg’14]
• Largest open-source web graph to our knowledge, with 120 billion edges
• Graph is 1D or 2D partitioned, modeling 64 partitions
• Vertex: webpage, Edge: hyperlink, Weight: N/A
Configuration
• Catalyst cluster at LLNL with 800 GB of NVRAM per node (single node)
• Memory-mapped I/O using DI-MMAP as the interface to NVRAM, limiting the DRAM-resident portion of the graph DB (page buffer) to 4 GB
• Boost.Interprocess to allocate data structures in the memory-mapped region
Baseline model (Boost)
[Figure: an example graph and its baseline representation, a vertex table (unordered map) and edge tables (vector).]
Note that each data structure is distributed across the nodes.
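For orientation, the baseline layout can be approximated as follows. This is a rough sketch in Python rather than the Boost-based code used in the experiments: a `dict` stands in for the unordered map and a `list` for each vector, and the class name `BaselineGraph`, its methods, and the duplicate check on insertion are assumptions made for illustration.

```python
class BaselineGraph:
    """Sketch of a vertex table (hash map) whose entries own an edge table (array)."""

    def __init__(self):
        # vertex ID -> {"prop": vertex property, "edges": [(target, weight), ...]}
        self.vertex_table = {}

    def insert_edge(self, u, v, weight=None):
        """Append (v, weight) to u's edge table; return False if the edge exists.
        The duplicate check is an assumption and scans the whole edge table."""
        entry = self.vertex_table.setdefault(u, {"prop": None, "edges": []})
        if any(target == v for target, _ in entry["edges"]):
            return False
        entry["edges"].append((v, weight))
        return True

    def has_edge(self, u, v):
        """Linear search through u's edge table."""
        entry = self.vertex_table.get(u)
        return entry is not None and any(t == v for t, _ in entry["edges"])

    def degree(self, u):
        """Number of edges stored for u (0 if u is unknown)."""
        entry = self.vertex_table.get(u)
        return 0 if entry is None else len(entry["edges"])
```

In this layout every vertex, including one with a single edge, pays for a separate edge table, which is the overhead that a compact representation for low-degree vertices is meant to reduce.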
Robin Hood Hashing (edge insertion example)
Degree Aware Edge Insertion Algorithm
Results
• Degree-aware data structures scale near-linearly with the number of edges inserted (up to 2 billion edges)
• Robin Hood Hashing improves page-level locality and overall performance when the graph database grows beyond the 4 GB page cache
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-POST-676096
Jeremiah Willcock, Indiana University
IEEE ACM The International Conference for High Performance Computing, Networking, Storage and Analysis 2015, Austin
TOKYO INSTITUTE OF TECHNOLOGY
LAWRENCE LIVERMORE NATIONAL LABORATORY
Overview
[Figure: an example graph with labeled vertices, vertex properties, and edge weights (w1 to w8), together with its representation in two tables: a low-degree table and a mid-high-degree table.]
Legend: v: vertex ID, p: vertex property, w: edge weight
[Figure: edge insertion results in two panels, "Insert Sorted edge list" and "Insert BFS-ordered edge list", comparing DegreeAware and Baseline. Annotations mark the points where DegreeAware and Baseline exceed the 4 GB page cache; DegreeAware is annotated with 34 GB and 33 GB.]
[Figure: flowchart of the degree-aware edge insertion algorithm]
Start: Insert Edge (u, v)
Steps and decisions shown in the flowchart:
• d1 ⟵ L_TBL.degree(u)
• d1 > 0 (True / False)
• d1 < low_degree_threshold (True / False)
• L_TBL.insert(u, v)
• MH_TBL.insert(L_TBL.pop(u)), then MH_TBL.insert(u, v)
• d2 ⟵ MH_TBL.degree(u)
• d2 = 0 (True: L_TBL.insert(u, v); False: MH_TBL.insert(u, v))
• max_probedistance < long_probedistance (True / False)
• Allocate a chain table
End: Finish
Legend: u: source vertex ID; v: target vertex ID; L_TBL: low-degree table; MH_TBL: middle-high-degree table
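One plausible reading of the flowchart is sketched below. The branch order is reconstructed from the flowchart labels and should be treated as an assumption, as should the class name `DegreeAwareGraph`, the use of Python dictionaries of lists for both tables, and the threshold value. The probe-distance check and the chain-table allocation are left out because a Python `dict` does not expose probe distances.

```python
class DegreeAwareGraph:
    """Sketch of degree-aware insertion with a low-degree table (L_TBL)
    and a middle-high-degree table (MH_TBL)."""

    def __init__(self, low_degree_threshold=2):
        # The threshold value is hypothetical; the poster does not state one.
        self.low_degree_threshold = low_degree_threshold
        self.l_tbl = {}    # source vertex -> short list of targets
        self.mh_tbl = {}   # source vertex -> list of targets

    def insert_edge(self, u, v):
        """Insert edge (u, v); duplicate edges are not filtered in this sketch."""
        d1 = len(self.l_tbl.get(u, ()))
        if d1 > 0:
            if d1 < self.low_degree_threshold:
                # Still a low-degree vertex: keep it in the compact table.
                self.l_tbl[u].append(v)
            else:
                # Threshold reached: move u's edges to MH_TBL, then insert.
                self.mh_tbl[u] = self.l_tbl.pop(u)
                self.mh_tbl[u].append(v)
        else:
            d2 = len(self.mh_tbl.get(u, ()))
            if d2 == 0:
                # First edge of u: start in the low-degree table.
                self.l_tbl[u] = [v]
            else:
                # u was migrated earlier: insert directly into MH_TBL.
                self.mh_tbl[u].append(v)
```

With `low_degree_threshold=2`, the first two edges of a vertex stay in the low-degree table and the third insertion migrates all of them to the middle-high-degree table, so each vertex lives in exactly one table at any time.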
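[Figure: Robin Hood Hashing, edge insertion example. An 8-slot table (slots 0 to 7) spans page-0 and page-1, and each entry is labeled with its key, hash value, and probe distance. Before the insertion of {1, 5}, the table holds {0, 1}, {1, 0}, {1, 2}, {5, 6}, {6, 5} with (hash value, probe distance) pairs (0, 0), (1, 0), (1, 1), (5, 0), (6, 0). After the insertion, {1, 5} is stored with (1, 2).]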
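The insertion shown in the figure can be reproduced with a small Robin Hood table. This is a generic sketch of Robin Hood hashing with linear probing, not the poster's NVRAM implementation: the class `RobinHoodTable`, the fixed capacity without resizing, and the choice of the source vertex as the hash value (inferred from the figure labels) are assumptions.

```python
class RobinHoodTable:
    """Open-addressing hash table with linear probing and Robin Hood swaps."""

    def __init__(self, capacity=8, hash_fn=hash):
        self.capacity = capacity
        self.hash_fn = hash_fn
        self.slots = [None] * capacity   # each slot: (key, probe_distance) or None
        self.size = 0

    def insert(self, key):
        if self.size >= self.capacity:
            raise RuntimeError("table is full (this sketch does not resize)")
        pos = self.hash_fn(key) % self.capacity
        dist = 0
        while True:
            slot = self.slots[pos]
            if slot is None:
                self.slots[pos] = (key, dist)
                self.size += 1
                return
            resident_key, resident_dist = slot
            if resident_key == key:
                return  # key already stored
            if resident_dist < dist:
                # The resident is closer to its home slot than the incoming
                # key, so the incoming key takes the slot and the resident
                # continues probing.
                self.slots[pos] = (key, dist)
                key, dist = resident_key, resident_dist
            pos = (pos + 1) % self.capacity
            dist += 1

    def contains(self, key):
        pos = self.hash_fn(key) % self.capacity
        dist = 0
        while dist < self.capacity:
            slot = self.slots[pos]
            # A resident with a smaller probe distance means the key
            # cannot be stored any further along the probe sequence.
            if slot is None or slot[1] < dist:
                return False
            if slot[0] == key:
                return True
            pos = (pos + 1) % self.capacity
            dist += 1
        return False


# Usage: replay the figure, hashing each edge by its source vertex.
table = RobinHoodTable(capacity=8, hash_fn=lambda edge: edge[0])
for edge in [(0, 1), (1, 0), (1, 2), (5, 6), (6, 5), (1, 5)]:
    table.insert(edge)
```

After these insertions the three edges with hash value 1 occupy consecutive slots 1 to 3 with probe distances 0, 1, and 2, matching the values listed for the figure. Keeping entries with the same hash value in adjacent slots is the locality property the poster relies on when slots map to NVRAM pages.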