![Page 1: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/1.jpg)
Domain-Specialized Cache Management
for Graph Analytics
Priyank Faldu, Boris Grot
This research is partially supported by a grant from Oracle Labs.
Jeff Diamond
HPCA'20
![Page 3: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/3.jpg)
Cache management in the age of big data
Variety of application domains
Working set size much larger than typical SPEC benchmarks
- Vastly different cache access patterns across domains
Yet, cache management mechanisms are “domain-agnostic”
- Assumption: one size fits all
Data Analytics Graph Analytics Machine Learning
Graph Analytics
A case for domain-specialized cache management
![Page 6: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/6.jpg)
Domain-agnostic techniques for graph analytics
Winner of the latest cache replacement championship
Slowdown
1-15% geomean slowdown
![Page 10: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/10.jpg)
Outline
➢ Performance of domain-agnostic cache management
➢ Graph analytics
➢ GRASP: domain-specialized cache management
- Software-guided reuse-prediction
- Hardware-enforced cache management
➢ Performance evaluation
![Page 12: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/12.jpg)
Applications of graph analytics
Extract meaningful information out of complex many-to-many
relationships among objects
Community Analysis
- Identify customers with similar interests
Connectivity Analysis
- Find weakness in a network
Path Analysis
- Route optimization for distribution and supply chain
Centrality Analysis
- Most influential people and information in social media
And many others …
![Page 16: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/16.jpg)
Real-world graphs & power-law degree distribution
Small fraction of vertices have high connectivity – hot vertices
Large fraction of vertices have low connectivity – cold vertices
Prevalent in many domains – e.g., Twitter user-follower graph
- Average User: ∼700 followers
- Donald Trump: ∼72M followers
How does connectivity influence cache locality?
![Page 27: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/27.jpg)
A canonical example of graph analytics
Computes property for a vertex based on its neighbors' properties
Example Graph: vertices V0 to V5, two of them marked Hot
Vertex Properties: P0 to P5
Cache Accesses in Time: V0 E P2 E P5 V1 E P4 V2 E P0 E P3 E P1 V3 E P4 V4 E P2 V5 E P4
Key observation: vertex reuse is proportional to its degree
Hot vertices → Small footprint + High reuse
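The key observation can be checked by replaying the access trace from the example. A minimal sketch in Python, assuming the neighbor lists are the ones implied by the "Cache Accesses in Time" row (V0 reads P2 and P5, V1 reads P4, and so on); the names `NEIGHBORS`, `property_trace`, and `reuse_counts` are illustrative and not from the talk:

```python
from collections import Counter

# Neighbor lists read off the "Cache Accesses in Time" row of the example
# (assumption: each "E Pn" after "Vk" means Vk reads the property of Vn).
NEIGHBORS = {0: [2, 5], 1: [4], 2: [0, 3, 1], 3: [4], 4: [2], 5: [4]}


def property_trace(neighbors):
    """Visit vertices in ID order and record which property each edge reads."""
    trace = []
    for v in sorted(neighbors):
        for u in neighbors[v]:
            trace.append(u)  # reading P[u] while processing vertex v
    return trace


def reuse_counts(neighbors):
    """Number of times each vertex's property is read in one pass."""
    return Counter(property_trace(neighbors))


counts = reuse_counts(NEIGHBORS)
for vertex, n in sorted(counts.items()):
    print(f"P{vertex}: {n} access(es)")
```

In this sketch P4 is read three times and P2 twice, while every other property is read once, so two vertices account for 5 of the 9 property accesses.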
![Page 32: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/32.jpg)
Challenging to identify hot vertices in hardware
Domain-agnostic techniques rely on purely hardware mechanisms
Figure: the Example Graph, Vertex Properties, and Cache Accesses in Time from the canonical example (V0 E P2 E P5 V1 E P4 V2 E P0 E P3 E P1 V3 E P4 V4 E P2 V5 E P4)
Reason ❶ Irregular Accesses
Reason ❷ Long Reuse Distances
Idea: Leverage domain-knowledge for reuse prediction
![Page 33: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/33.jpg)
Proposal: GRASP – a software-hardware co-design
Software aids hardware in identifying hot vertices
Hardware preferentially caches hot vertices
![Page 37: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/37.jpg)
GRASP: Software-guided reuse-prediction
Task: Let software aid hardware in identifying hot vertices
Challenge: Non-trivial due to sparse distribution of hot vertices in memory
Idea: Leverage prior graph reordering optimization
Figure: Vertex Properties P0 to P5, two of them marked Hot
![Page 42: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/42.jpg)
Optimization: skew-aware graph reordering
Vertices are ordered in memory based on their assigned IDs
Changing vertex order to improve cache locality [IISWC’19]
Degree-based Sort
- Hot vertices are placed in a contiguous region
- Graph is unchanged
Figure: the example graph (V0 to V5) shown in its Original Vertex Order and its New Vertex Order, with the two Hot vertices marked in each
Easy to communicate the region boundary to hardware
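A minimal sketch of the degree-based sort step, in Python. It assumes a plain stable sort by descending degree and an "above average degree" test for hot vertices; both are illustrative choices, and the reordering technique in [IISWC’19] may differ. The degree values and the names `degree_sort`, `hot_count`, and `relabel` are hypothetical:

```python
def degree_sort(degrees):
    """Return (order, new_id): order[k] is the old vertex ID stored at position k."""
    # sorted() is stable, so vertices with equal degree keep their original order.
    order = sorted(range(len(degrees)), key=lambda v: -degrees[v])
    new_id = {old: new for new, old in enumerate(order)}
    return order, new_id


def hot_count(degrees):
    """Number of hot vertices (assumption: hot means degree above the average)."""
    avg = sum(degrees) / len(degrees)
    return sum(1 for d in degrees if d > avg)


def relabel(neighbors, new_id):
    """Rewrite neighbor lists with the new IDs; the graph itself is unchanged."""
    return {new_id[v]: [new_id[u] for u in nbrs] for v, nbrs in neighbors.items()}


# Hypothetical degrees for V0..V5 with two high-degree (hot) vertices.
degrees = [1, 1, 2, 1, 3, 1]
order, new_id = degree_sort(degrees)
n_hot = hot_count(degrees)
print("new order of old IDs:", order)
print("hot region covers new IDs 0 to", n_hot - 1)
```

After the sort, the hot vertices occupy new IDs 0 to `n_hot - 1`, so a single boundary is enough to describe the hot region.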
![Page 48: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/48.jpg)
GRASP: Region-based lightweight interface
❶ Preprocessing: Software applies skew-aware reordering
❷ Initialization: Software populates configuration registers
❸ Initialization: Hardware logically partitions the Property Array
Figure: the Property Array (Hot Vertices, then Cold Vertices) bounded by Region Start and Region End, which are architecturally exposed configuration registers; an LLC size marker separates the High Reuse Region from the Low Reuse Region
Software involvement is limited to initialization
![Page 50: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/50.jpg)
GRASP: Reuse prediction at runtime
Cache Access: Does it belong to High Reuse Region?
- Yes: High Reuse Hint
- No: Low Reuse Hint
Prediction is entirely done in hardware
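The runtime check amounts to an address-range comparison. A minimal software model in Python, assuming the High Reuse Region is the first LLC-size bytes of the Property Array starting at Region Start (an inference from the figure labels, not stated outright in the slides); `RegionRegisters` and `reuse_hint` are hypothetical names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionRegisters:
    """Model of the configuration registers that software fills at initialization."""
    region_start: int  # address of the first byte of the Property Array
    region_end: int    # address one past the last byte of the Property Array
    llc_size: int      # LLC capacity in bytes


def reuse_hint(addr, regs):
    """Classify one cache access as 'high' or 'low' reuse.

    Assumption: the High Reuse Region spans the first llc_size bytes of the
    Property Array, clipped to the end of the array.
    """
    high_end = min(regs.region_start + regs.llc_size, regs.region_end)
    if regs.region_start <= addr < high_end:
        return "high"
    return "low"


regs = RegionRegisters(region_start=0x1000, region_end=0x2000, llc_size=0x400)
print(reuse_hint(0x1000, regs), reuse_hint(0x1400, regs))
```

Software only writes the registers once; every later classification is a pair of comparisons, which is cheap to do on each access.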
![Page 54: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/54.jpg)
GRASP: Hardware-enforced cache management
Task: Preferentially cache hot vertices
Challenge: LLC capacity is limited
- Not all hot vertices can fit
Figure: the Property Array (Hot Vertices, Cold Vertices) split into a High Reuse Region of LLC size and a Low Reuse Region; annotation: hot vertices but predicted to have low reuse due to limited LLC capacity
Requirement: Keep cache management flexible
![Page 55: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/55.jpg)
GRASP: Preferential but flexible cache management
LRU Cache management Technique
Figure: a 4-Way Set Associative Cache (Way 1 to Way 4), with priority positions running from Highest Priority (MRU) to Lowest Priority (LRU), annotated with the three LRU policies: Insertion, Hit Promotion, Eviction
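For reference, the three policies of the baseline LRU technique can be modeled for a single 4-way set. This is a sketch of textbook LRU only (insert at MRU, promote to MRU on a hit, evict from LRU); it does not model the GRASP-specific policies, and the class name `LRUSet` is hypothetical:

```python
class LRUSet:
    """One set of a set-associative cache managed with plain LRU."""

    def __init__(self, ways=4):
        self.ways = ways
        self.stack = []  # index 0 is MRU (highest priority), last is LRU

    def access(self, tag):
        """Return (hit, evicted_tag); evicted_tag is None if nothing was evicted."""
        if tag in self.stack:
            # Hit promotion: move the block to the MRU position.
            self.stack.remove(tag)
            self.stack.insert(0, tag)
            return True, None
        evicted = None
        if len(self.stack) == self.ways:
            # Eviction: the victim is the block at the LRU position.
            evicted = self.stack.pop()
        # Insertion: a new block enters at the MRU position.
        self.stack.insert(0, tag)
        return False, evicted


cache_set = LRUSet(ways=4)
for t in ["A", "B", "C", "D"]:
    cache_set.access(t)
hit_a = cache_set.access("A")  # hit, A is promoted to MRU
miss_e = cache_set.access("E")  # miss, the LRU block is evicted
print(hit_a, miss_e, cache_set.stack)
```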
![Page 59: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/59.jpg)
GRASP: Preferential but flexible cache management
17
LRU cache management technique
GRASP policies for High Reuse Region
GRASP policies for Low Reuse Prediction
Figure: each cache access is labeled High Reuse or Low Reuse; the insertion, hit-promotion, and eviction positions are marked on a 4-way set-associative cache (Way 1 to Way 4)
HPCA'20
![Page 60: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/60.jpg)
GRASP: Preferential but flexible cache management
17
LRU cache management technique
GRASP policies for High Reuse Region
GRASP policies for Low Reuse Prediction
Figure: each cache access is labeled High Reuse or Low Reuse; the insertion, hit-promotion, and eviction positions are marked on 4-way set-associative caches (Way 1 to Way 4)
HPCA'20
![Page 61: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/61.jpg)
GRASP: Preferential but flexible cache management
17
LRU cache management technique
GRASP policies for High Reuse Region
GRASP policies for Low Reuse Prediction
Figure: each cache access is labeled High Reuse or Low Reuse; the insertion, hit-promotion, and eviction positions are marked on 4-way set-associative caches (Way 1 to Way 4)
HPCA'20
![Page 62: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/62.jpg)
GRASP: Preferential but flexible cache management
17
LRU cache management technique
GRASP policies for High Reuse Region
GRASP policies for Low Reuse Prediction
Figure: each cache access is labeled High Reuse or Low Reuse; the insertion, hit-promotion, and eviction positions are marked on 4-way set-associative caches (Way 1 to Way 4)
HPCA'20
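One way to picture "preferential but flexible" management is a recency stack whose insertion and hit-promotion positions depend on a per-access reuse hint, while eviction stays unchanged. The sketch below rests on assumptions: the exact positions used on the slide's figure did not survive in this transcript, so the choices here (high-reuse blocks inserted and promoted to MRU, low-reuse blocks inserted at LRU and promoted one step per hit, the victim always taken from the LRU position) are illustrative. The names `HintedSet`, `HIGH_REUSE`, and `LOW_REUSE` are hypothetical.

```python
HIGH_REUSE, LOW_REUSE = "high", "low"


class HintedSet:
    """One cache set whose policies depend on a reuse hint (illustrative).

    `ways` is ordered from the MRU position (index 0) to the LRU position.
    """

    def __init__(self, num_ways=4):
        self.num_ways = num_ways
        self.ways = []

    def access(self, tag, hint):
        """Access `tag` with a reuse `hint`; return True on a hit."""
        if tag in self.ways:
            pos = self.ways.index(tag)
            self.ways.pop(pos)
            # ASSUMPTION: high-reuse hits jump to MRU, low-reuse hits
            # move up by a single position.
            new_pos = 0 if hint == HIGH_REUSE else max(pos - 1, 0)
            self.ways.insert(new_pos, tag)
            return True
        if len(self.ways) == self.num_ways:
            # ASSUMPTION: the victim is always the block in the LRU
            # position, whatever its hint, so no block is pinned.
            self.ways.pop()
        if hint == HIGH_REUSE:
            self.ways.insert(0, tag)   # ASSUMPTION: insert at MRU
        else:
            self.ways.append(tag)      # ASSUMPTION: insert at LRU
        return False


trace = [("H1", HIGH_REUSE), ("H2", HIGH_REUSE), ("L1", LOW_REUSE),
         ("L2", LOW_REUSE), ("L3", LOW_REUSE), ("H1", HIGH_REUSE),
         ("L3", LOW_REUSE)]
hinted_set = HintedSet()
hinted_hits = [hinted_set.access(tag, hint) for tag, hint in trace]
```

In this trace the stream of low-reuse blocks only displaces other low-reuse blocks, so "H1" still hits. Because nothing is pinned, a high-reuse block that stops being accessed can still drift to the LRU position and be evicted.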
![Page 63: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/63.jpg)
GRASP is simple!
Software
- Off-the-shelf skew-aware reordering optimization
- Compatible with multiple skew-aware reordering techniques
18
HPCA'20
![Page 64: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/64.jpg)
GRASP is simple!
Software
- Off-the-shelf skew-aware reordering optimization
- Compatible with multiple skew-aware reordering techniques
Lightweight Interface
- Software configures a pair of registers at initialization
- No software dependency after initialization
18
HPCA'20
![Page 65: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/65.jpg)
GRASP is simple!
Software
- Off-the-shelf skew-aware reordering optimization
- Compatible with multiple skew-aware reordering techniques
Lightweight Interface
- Software configures a pair of registers at initialization
- No software dependency after initialization
Hardware
- Lightweight address comparison logic to infer the reuse hint
- Trivial policy changes
- Minimal modifications to cache structure – no additional metadata
18
HPCA'20
![Page 66: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/66.jpg)
GRASP is simple!
Software
- Off-the-shelf skew-aware reordering optimization
- Compatible with multiple skew-aware reordering techniques
Lightweight Interface
- Software configures a pair of registers at initialization
- No software dependency after initialization
Hardware
- Lightweight address comparison logic to infer the reuse hint
- Trivial policy changes
- Minimal modifications to cache structure – no additional metadata
18
Accelerating graph analytics at minimal cost
HPCA'20
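The interface described above, a pair of registers written once by software and an address comparison done on each access, might look like the following sketch. The slide does not say what the two registers hold; the example assumes they store the start and end address of a high-reuse region. The names `ReuseRegisters` and `reuse_hint` and the example addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReuseRegisters:
    """Hypothetical register pair, written once at initialization."""
    start: int  # ASSUMPTION: first byte address of the high-reuse region
    end: int    # ASSUMPTION: one past the last byte of that region


def reuse_hint(address, regs):
    """Infer a reuse hint with two comparisons and no per-block metadata."""
    return "high" if regs.start <= address < regs.end else "low"


# Software sets the bounds once; after that, only the comparisons run.
regs = ReuseRegisters(start=0x1000, end=0x1000 + 2 * 1024 * 1024)
example_hints = [reuse_hint(a, regs) for a in (0x0FFF, 0x1000, regs.end - 1, regs.end)]
```

Since the hint is recomputed from the address on every access, nothing has to be stored alongside the cached blocks, which matches the slide's "no additional metadata" point.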
![Page 67: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/67.jpg)
Outline
➢ Performance of domain-agnostic cache management
➢ Graph analytics
➢ GRASP: domain-specialized cache management
- Software-guided reuse-prediction
- Hardware-enforced cache management
➢ Performance evaluation
19
HPCA'20
![Page 68: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/68.jpg)
Evaluation methodology
Evaluated 25 benchmarks (5 applications x 5 graph datasets)
- Graph applications from the Ligra framework [PPoPP’13]
- Graph datasets are 0.3GB – 8GB in Compressed Sparse Row (CSR) format
Datasets are reordered using DBG [IISWC’19]
- Degree-Based Grouping is a state-of-the-art skew-aware reordering technique
Evaluated on the Sniper simulator [TACO’14]
- 8 out-of-order cores
- 16MB shared LLC (2MB per core)
20
HPCA'20
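For readers unfamiliar with the data layout mentioned above, the sketch below builds a CSR representation from an edge list and then relabels vertices so that high-degree vertices receive contiguous low IDs. It is a simplified illustration in the spirit of skew-aware reordering, not the DBG algorithm from the cited work. The function names, the toy graph, and the degree thresholds are all assumptions chosen for the example.

```python
def build_csr(num_vertices, edges):
    """Return (offsets, neighbors) for a directed edge list.

    The out-neighbors of vertex v are neighbors[offsets[v]:offsets[v + 1]].
    """
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]
    neighbors = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot for each source vertex
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors


def degree_grouped_ids(degrees, thresholds):
    """Map old vertex ID -> new ID by coarse degree groups.

    `thresholds` is descending and must end with 0 so every vertex lands in
    a group. Vertices keep their original relative order inside a group.
    The threshold values are hypothetical.
    """
    groups = [[] for _ in thresholds]
    for v, degree in enumerate(degrees):
        for g, threshold in enumerate(thresholds):
            if degree >= threshold:
                groups[g].append(v)
                break
    new_id = [0] * len(degrees)
    next_id = 0
    for group in groups:
        for v in group:
            new_id[v] = next_id
            next_id += 1
    return new_id


edges = [(0, 1), (1, 0), (1, 2), (1, 3), (2, 1), (3, 1), (3, 0)]
offsets, neighbors = build_csr(4, edges)
degrees = [offsets[v + 1] - offsets[v] for v in range(4)]
new_id = degree_grouped_ids(degrees, thresholds=[3, 2, 0])
relabeled = [(new_id[s], new_id[d]) for s, d in edges]
new_offsets, new_neighbors = build_csr(4, relabeled)
```

After relabeling, the highest-degree vertex becomes vertex 0, so per-vertex data for frequently accessed vertices ends up in one contiguous address range.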
![Page 69: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/69.jpg)
Domain-agnostic techniques vs GRASP
21
HPCA'20
![Page 70: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/70.jpg)
Domain-agnostic techniques vs GRASP
21
HPCA'20
![Page 71: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/71.jpg)
Domain-agnostic techniques vs GRASP
21
✓ No Slowdown
✓ Up to 10.2% speed-up
HPCA'20
![Page 72: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/72.jpg)
More results in paper
Evaluation of pinning-based techniques
Evaluation of GRASP on low-/no-skew graph datasets
Evaluation of GRASP on top of other reordering schemes
… and more
22
HPCA'20
![Page 73: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/73.jpg)
Key takeaway: one size does NOT fit all
23
Look beyond domain-agnostic cache management
HPCA'20
![Page 74: Domain-Specialized Cache Management for Graph Analyticsfaldupriyank.com/papers/GRASP_HPCA20_Slides.pdf · Graph Analytics A case for domain-specialized cache management HPCA'20. ...](https://reader036.fdocuments.in/reader036/viewer/2022071216/604896a4d0946b52af1c9e5d/html5/thumbnails/74.jpg)
Thank You
I am on the job market
24
Priyank Faldu
Source code: https://github.com/faldupriyank
Personal website: www.faldupriyank.com
HPCA'20