Verification of Hierarchical Cache Coherence Protocols for Future Processors
![Page 1: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/1.jpg)
Verification of Hierarchical Cache Coherence Protocols for Future Processors
Student: Xiaofang Chen
Advisor: Ganesh Gopalakrishnan
![Page 2: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/2.jpg)
2
Outline
Background
Proposed solutions
– High level hierarchical coherence protocol verification
– Refinement check: specifications vs. RTL implementations
Conclusion
![Page 3: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/3.jpg)
3
Hierarchical Cache Coherence Protocols
Chip-level protocols
Inter-cluster protocols
Intra-cluster protocols
[Figure: clusters, each with a directory (dir) and memory (mem)]
![Page 4: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/4.jpg)
4
Modeling and Verification of Coherence Protocols
High-level modeling approaches
– Model checking
Low-level modeling: RTL or VHDL
– Simulation
![Page 5: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/5.jpg)
5
Problems with Hierarchical Coherence Protocols
For high-level modeling
– Handle the complexity of hierarchical protocols
For RTL implementations
– Verify that an RTL implementation correctly implements the specification
![Page 6: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/6.jpg)
6
Example: Verification Complexity (I)
[Diagram: a home cluster and two remote clusters (Remote Cluster 1, Remote Cluster 2), each with two L1 caches, an L2 Cache+Local Dir and a RAC, together with the Global Dir and Main Mem]
![Page 7: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/7.jpg)
7
Example: Verification Complexity (II)
Tool: Murphi Verification
– IA-64 machine
– 18GB memory
– 40-bit hash compaction
– Non-conclusive after >30 hours of state enumeration
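To make the cost on this slide concrete, here is a minimal Python sketch of explicit-state enumeration with hash compaction, the technique named above. It is not Murphi's implementation: the toy transition system (`successors`), the `signature` helper and the choice of BLAKE2 as the hash are all assumptions made for illustration. Only a 40-bit signature of each visited state is stored, which saves memory at the price of a small chance that a colliding state is skipped.

```python
import hashlib
from collections import deque

SIG_BITS = 40  # signature width, matching the 40-bit setting on the slide


def signature(state, bits=SIG_BITS):
    """Compress a state into a short signature (hypothetical helper)."""
    digest = hashlib.blake2b(repr(state).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") >> (64 - bits)


def successors(state):
    """Toy system (assumption): each cache is 'I' or 'S' and may toggle."""
    for i in range(len(state)):
        flipped = "S" if state[i] == "I" else "I"
        yield state[:i] + (flipped,) + state[i + 1:]


def enumerate_states(initial):
    """Breadth-first search that stores only signatures of visited states."""
    seen = {signature(initial)}
    frontier = deque([initial])
    count = 1
    while frontier:
        state = frontier.popleft()
        for nxt in successors(state):
            sig = signature(nxt)
            if sig not in seen:  # a signature collision would wrongly skip nxt
                seen.add(sig)
                frontier.append(nxt)
                count += 1
    return count


print(enumerate_states(("I", "I", "I")))
```

The visited set grows with the number of reachable states, which is why a hierarchical protocol with hundreds of millions of states can exhaust both memory and time even with compaction.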
![Page 8: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/8.jpg)
8
Differences in Modeling: Specs vs. Impls
One step in high-level
Multiple steps in low-level
[Figure: a single high-level step (1) corresponds to low-level steps 1.1 to 1.5 among home, client, buf and local cache]
![Page 9: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/9.jpg)
9
Differences in Execution: Specs vs. Impls
Interleaving in high-level (HL)
Concurrency in low-level (LL)
[Figure: high-level steps 1, 2, 3 and their low-level steps 1.1 to 1.3, 2.1 to 2.2, 3.1 to 3.3]
![Page 10: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/10.jpg)
10
Proposed Mechanisms
For high-level modeling, develop
– A few M-CMP coherence protocols
– A compositional approach
For specifications vs. implementations, develop
– A formal theory
– A compositional approach
– A practical tool
![Page 11: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/11.jpg)
11
Thesis Timeline
[Timeline axis: 2005, 2006, 2007, 2008, Present]
Starting practices
– Predicate abstraction for Murphi
– Bounded transaction based testing
– Chen et al. UUCS-06-002, UUCS-06-003
Hierarchical protocols verification
– Abstraction + assume guarantee
– Inclusive M-CMP protocols
– Chen et al. FMCAD 2006
Transaction based refinement check
– Complete case study for a benchmark
– Chen et al. TECHCON 2007, best session paper in verification
Hierarchical protocols verification
– Improved approach: one level at a time
– Automated abstraction
– Non-inclusive M-CMP protocols
– Chen et al. HLDVT 2007
Transaction based refinement check
– Refinement theory
– Modular refinement check
– Chen et al. FMCAD 2007
Extensions: refinement check
– Make muv a practical tool
![Page 12: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/12.jpg)
12
Outline
Background
Proposed solutions
– High level hierarchical coherence protocol verification
– Refinement check: specifications vs. RTL implementations
Conclusion
![Page 13: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/13.jpg)
13
An M-CMP Benchmark Protocol
[Diagram: a home cluster and two remote clusters (Remote Cluster 1, Remote Cluster 2), each with two L1 caches, an L2 Cache+Local Dir and a RAC, together with the Global Dir and Main Mem. The protocol among clusters is labeled "Inter-cluster"; the protocol inside a cluster is labeled "Intra-cluster".]
![Page 14: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/14.jpg)
14
Protocol Features
Both levels use MESI protocols
– Intra-cluster: FLASH
– Inter-cluster: DASH
Silent drop on non-Modified cache lines
Network channels are non-FIFO
Inclusive caches
![Page 15: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/15.jpg)
15
Another Benchmark: Non-inclusive Caches
[Diagram: a home cluster and two remote clusters (Remote Cluster 1, Remote Cluster 2), each with two L1 caches, an L2 Cache+Local Dir and a RAC, together with the Global Dir and Main Mem]
![Page 16: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/16.jpg)
16
Our Compositional Approach
Original protocol
![Page 17: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/17.jpg)
17
Our Compositional Approach
![Page 18: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/18.jpg)
18
One Way to Decompose Protocols
Create three abstract protocols
Each with 1 detailed cluster + 2 abstracted clusters
![Page 19: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/19.jpg)
19
Abstract Protocol #1
[Diagram: Home Cluster, Remote Cluster 1 and Remote Cluster 2 with the Global Dir and Main Mem. One cluster is kept in detail (RAC, L2 Cache+Local Dir, two L1 caches); the other two are abstracted to RAC and L2 Cache+Local Dir’.]
![Page 20: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/20.jpg)
20
Abstract Protocol #2
[Diagram: Home Cluster, Remote Cluster 1 and Remote Cluster 2 with the Global Dir and Main Mem. One cluster is kept in detail (RAC, L2 Cache+Local Dir, two L1 caches); the other two are abstracted to RAC and L2 Cache+Local Dir’.]
![Page 21: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/21.jpg)
21
Problems with This Approach
Every abstract protocol contains 2 protocols
Duplicated behaviors in abstract protocols
State space still large

| Model | # of states | Time (hour) | Mem (GB) |
|---|---|---|---|
| M1 | 284,088,425 | 12 | 18 |
| M2 | 636,613,051 | 18 | 18 |
![Page 22: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/22.jpg)
22
Second Way to Decompose Protocols
[Diagram: ABS #1 and ABS #2 are the detailed intra-cluster protocols (L2 Cache+Local Dir with two L1 caches) of the Home Cluster and Remote Cluster 1. ABS #3 is the inter-cluster protocol: Home Cluster, Remote Cluster 1 and Remote Cluster 2, each abstracted to RAC and L2 Cache+Local Dir’, with the Global Dir and Main Mem.]
![Page 23: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/23.jpg)
23
Model Checking Results
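| Approach | Model | # of states | Model check time (sec) | Mem used (GB) | Model check passed |
|---|---|---|---|---|---|
| Classical approach | Full model | > 438,120,000 | > 125,410 | 18 | Nonconclusive |
| First approach | Abs. model 1 | 284,088,425 | 44,978 | 18 | Yes |
| First approach | Abs. model 2 | 636,613,051 | 66,249 | 18 | Yes |
| Second approach | Abs. model 1 | 1,500,621 | 270 | 1.8 | Yes |
| Second approach | Abs. model 2 | 574,198 | 50 | 1.8 | Yes |
| Second approach | Abs. model 3 | 198,162 | 21 | 1.8 | Yes |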
![Page 24: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/24.jpg)
24
Details of Our Approach
Abstraction
– States
– Transitions, properties
Constraining
– Assume guarantee reasoning
![Page 25: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/25.jpg)
25
Abstraction on States
Intra-cluster
Inter-cluster
![Page 26: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/26.jpg)
26
State Representation
[Diagram: the state of an original cluster consists of L1s, Network, L2, Local Dir and RAC; the state of an abstract cluster keeps only L2, Local Dir’ and RAC]
![Page 27: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/27.jpg)
27
Abstracting Transitions and Properties
Rule: guard ==> action
guard
– Become more permissive
action
– Allow more behaviors
![Page 28: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/28.jpg)
28
An Example of Abstraction
[Diagram: a detailed cluster (RAC, L2 Cache+Local Dir, two L1 caches) and its abstraction (RAC, L2 Cache+Local Dir’)]
Original rule (WB):
Clusters[c].WbMsg.Cmd = WB ==>
Clusters[c].L2.Data := Clusters[c].WbMsg.Data;
Clusters[c].L2.HeadPtr := L2; …
Abstracted rule:
True ==>
Clusters[c].L2.Data := nondet; …
Abstract inter-cluster protocol
Abstract intra-cluster protocol
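The effect of this abstraction can be illustrated with a small Python sketch. It is only a toy model of the writeback rule above, not the authors' tool: the data domain, the state layout `(l2_data, wb_cmd, wb_data)` and the function names are assumptions, and the elided parts of the rule ("…", including `HeadPtr`) are left out. The check confirms that every step of the concrete rule is matched by some step of the abstract rule once the message fields are projected away, which is what makes the abstraction conservative.

```python
from itertools import product

DATA = (0, 1)            # toy data domain (assumption)
CMDS = ("NONE", "WB")    # toy message commands (assumption)


def concrete_step(state):
    """Concrete rule: on a WB message, copy its data into L2."""
    _l2_data, wb_cmd, wb_data = state
    if wb_cmd == "WB":                       # guard: WbMsg.Cmd = WB
        return (wb_data, wb_cmd, wb_data)    # action: L2.Data := WbMsg.Data
    return None                              # rule not enabled


def project(state):
    """Abstraction on states: drop the intra-cluster message fields."""
    l2_data, _wb_cmd, _wb_data = state
    return l2_data


def abstract_steps(_abs_state):
    """Abstract rule: guard is True and L2.Data takes any value (nondet)."""
    return set(DATA)


def abstraction_is_sound():
    """Every concrete step must be matched by some abstract step."""
    for state in product(DATA, CMDS, DATA):
        nxt = concrete_step(state)
        if nxt is not None and project(nxt) not in abstract_steps(project(state)):
            return False
    return True


print(abstraction_is_sound())
```

Because the abstract rule allows more behaviors than the concrete one, a property proved on the abstract model carries over, while a violation found there may be spurious.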
![Page 29: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/29.jpg)
29
Abstraction, Now Constraining
![Page 30: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/30.jpg)
30
An Example of Constraining
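[Diagram: a detailed cluster (RAC, L2 Cache+Local Dir, two L1 caches) and its abstraction (RAC, L2 Cache+Local Dir’)]
Original rule (WB), guard:
Clusters[c].WbMsg.Cmd = WB
Constraint:
Clusters[c].L2.State = Excl
Constrained abstract rule:
True & Clusters[c].L2.State = Excl ==>
Clusters[c].L2.Data := nondet; …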
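A minimal Python sketch of the constraining step follows, under stated assumptions: the state layout `(l2_state, l2_data)`, the state names other than `Excl`, and the list of reachable states of the detailed cluster are all hypothetical. It shows the two halves of assume guarantee reasoning on this example: the abstract rule assumes `L2.State = Excl` in its guard, and the detailed model must guarantee that a pending writeback implies `Excl`.

```python
DATA = (0, 1)   # toy data domain (assumption)


def abstract_wb(state, constrained):
    """Abstract writeback rule on a state (l2_state, l2_data)."""
    l2_state, _old_data = state
    guard = True                                   # abstracted guard
    if constrained:
        guard = guard and l2_state == "Excl"       # added constraint
    if not guard:
        return []
    return sorted((l2_state, d) for d in DATA)     # L2.Data := nondet


def guarantee_holds(detailed_states):
    """Obligation on the detailed model: a pending WB implies Excl."""
    return all(l2_state == "Excl"
               for l2_state, wb_pending in detailed_states
               if wb_pending)


# Hypothetical reachable (l2_state, wb_pending) pairs of a detailed cluster.
detailed_reachable = [("Invld", False), ("Shrd", False),
                      ("Excl", False), ("Excl", True)]

print(abstract_wb(("Shrd", 0), constrained=False))  # rule fires spuriously
print(abstract_wb(("Shrd", 0), constrained=True))   # rule is blocked
print(guarantee_holds(detailed_reachable))
```

The constraint removes spurious behaviors introduced by the abstraction, and it is only legitimate because the guarantee is checked separately on the model that keeps the cluster in detail.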
![Page 31: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/31.jpg)
31
Non-inclusive Protocols: History Variables
[Diagram: two detailed intra-cluster protocols (L2 Cache+Local Dir with two L1 caches) for the Home Cluster and Remote Cluster 1, and the inter-cluster protocol with Home Cluster, Remote Cluster 1 and Remote Cluster 2 each abstracted to RAC and L2 Cache+Local Dir’, plus the Global Dir and Main Mem]
![Page 32: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/32.jpg)
32
Experimental Results
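| Approach | Model | # of states | Model check time (sec) | Mem used (GB) | Model check passed |
|---|---|---|---|---|---|
| Classical approach | Full model | > 473,260,000 | > 161,398 | 18 | Nonconclusive |
| Second approach | Abs. model 1 | 4,070,484 | 770 | 1.8 | Yes |
| Second approach | Abs. model 2 | 2,424,719 | 250 | 1.8 | Yes |
| Second approach | Abs. model 3 | 2,424,719 | 248 | 1.8 | Yes |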
![Page 33: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/33.jpg)
33
Outline
Background
Proposed solutions
– High level hierarchical coherence protocol verification
– Refinement check: specifications vs. RTL implementations
Conclusion
![Page 34: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/34.jpg)
34
Our Approach
Use a hardware language
– Hardware Murphi
Develop a formal theory of refinement check
Develop a compositional approach
– Abstraction
– Assume guarantee
Develop a practical tool
![Page 35: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/35.jpg)
35
Hardware Murphi
Murphi extension by S. German and G. Janssen
A concurrent shared variable language
– On each cycle
• Multiple transitions execute concurrently
• Exclusive write to a variable
• Shared reads to variables
• Write immediately visible within the same transition
• Write visible to other transitions on the next cycle
Support transactions, signals, etc.
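The cycle semantics listed above can be sketched in Python as follows. This is an illustrative model only, not Hardware Murphi itself: `run_cycle`, the `(guard, action)` encoding of a transition and the convention that an action returns the set of variables it wrote are all assumptions. Each enabled transition works on a private copy of the pre-cycle state, so its own writes are visible to it immediately, while other transitions see them only after the cycle commits.

```python
def run_cycle(state, transitions):
    """Execute one clock cycle; return the state visible on the next cycle."""
    pending = {}                        # variable -> value to commit
    for guard, action in transitions:
        if not guard(state):            # guards read the pre-cycle state
            continue
        local = dict(state)             # shared reads, private writes
        written = action(local)         # action returns the names it wrote
        for var in written:
            if var in pending:          # exclusive-write rule violated
                raise ValueError(f"write-write conflict on {var!r}")
            pending[var] = local[var]
    nxt = dict(state)
    nxt.update(pending)                 # writes become visible next cycle
    return nxt


def copy_y_to_x(s):
    s["x"] = s["y"]
    return {"x"}


def copy_x_to_y(s):
    s["y"] = s["x"]
    return {"y"}


def always(_s):
    return True


# Both transitions read the old values, so x and y are swapped.
print(run_cycle({"x": 1, "y": 2}, [(always, copy_y_to_x), (always, copy_x_to_y)]))
```

Under an interleaving semantics, as in plain Murphi, the same two rules executed one after the other would leave both variables equal, which is one reason a specification and its implementation cannot be compared step by step.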
![Page 36: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/36.jpg)
36
Transaction
Group multiple steps in impl
Transaction
Rule-1 …
…
Rule-6 …
End;
[Figure: low-level steps 1 to 6 grouped into one transaction]
![Page 37: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/37.jpg)
37
Workflow of Our Refinement Check
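[Workflow diagram: the Murphi spec model and the Hardware Murphi impl model are combined into a product model in Hardware Murphi; Muv produces the product model in VHDL, which goes to property check]
Check that the low-level model correctly implements the high-level model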
![Page 38: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/38.jpg)
38
Full List of Assertions for Refinement Check
1. Serializability for specifications
2. No write-write conflicts
3. Initial states containment
4. Write set variables containment
5. Enableness for specifications
6. Joint variables match at the end of transactions
![Page 39: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/39.jpg)
39
An Example
Impl transaction:
Transaction
Rule-1
guard1 ==> action1;
Rule-2
guard2 ==> action2;
Rule-3
guard3 ==> action3;
End;
Spec rule:
Rule
spec_guard ==> spec_action;
![Page 40: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/40.jpg)
40
An Example (Cont’d)
Transaction
Rule-1
guard1 ==> action1; assert spec_guard; spec_action;
Rule-2
guard2 ==> action2;
Rule-3
guard3 ==> action3;
End;
assert impl_var1 = spec_var1;
assert impl_var2 = spec_var2; …
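The product construction in this example can be mimicked with a short Python sketch. It is a simplification under stated assumptions, not the Muv tool: `check_transaction`, the dictionary states and the three-step buffer example are hypothetical, and rules run sequentially rather than concurrently. The spec rule fires alongside the first implementation rule, the spec guard is asserted at that point, and the joint variables are compared at the end of the transaction.

```python
def check_transaction(impl, spec, impl_rules, spec_rule, joint_vars):
    """Run one impl transaction next to one spec rule; return failed checks."""
    failures = []
    spec_guard, spec_action = spec_rule
    for step, (guard, action) in enumerate(impl_rules):
        if not guard(impl):
            return [f"impl rule {step + 1} is not enabled"]
        action(impl)
        if step == 0:
            # The spec rule fires together with the first impl rule.
            if not spec_guard(spec):
                failures.append("spec guard does not hold")
            spec_action(spec)
    # At the end of the transaction the joint variables must agree.
    for var in joint_vars:
        if impl[var] != spec[var]:
            failures.append(f"joint variable {var!r} differs")
    return failures


# Hypothetical impl: a request is served through a buffer in three steps.
impl_rules = [
    (lambda s: s["req"], lambda s: s.update(buf=s["mem"], req=False)),
    (lambda s: s["buf"] is not None, lambda s: s.update(cache=s["buf"])),
    (lambda s: True, lambda s: s.update(buf=None)),
]
# Hypothetical spec: the same request is served in one atomic step.
spec_rule = (lambda s: s["req"], lambda s: s.update(cache=s["mem"], req=False))

impl = {"req": True, "buf": None, "mem": 7, "cache": 0}
spec = {"req": True, "mem": 7, "cache": 0}
print(check_transaction(impl, spec, impl_rules, spec_rule, ["cache", "mem"]))
```

An empty list means the transaction matched the spec rule on this run; a model checker would establish the same assertions over all reachable states instead of a single execution.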
![Page 41: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/41.jpg)
41
Driving Benchmark
[Diagram: two nodes connected through a Router; each node has Local, Home and Remote units with buffers (Buf), plus Dir, Cache and Mem]
S. German and G. Janssen, IBM Research Tech Report 2006
![Page 42: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/42.jpg)
42
Bugs Found with Refinement Check
Benchmark satisfies cache coherence already
Bugs still found
– Bug 1: router unit loses messages
– Bug 2: home unit replies twice for one request
– Bug 3: cache unit gets updated twice from one reply
Refinement check is an automatic way of constructing checks
![Page 43: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/43.jpg)
43
Model Checking Approaches
Monolithic
– Straightforward property check
Compositional
– Divide and conquer
[Diagram: the product model in VHDL is checked either monolithically or compositionally]
![Page 44: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/44.jpg)
44
Compositional Refinement Check
Reduce the verification complexity
Basic techniques
– Abstraction
• Removing details to make verification easier
– Assume guarantee
• A simple form of induction which introduces assumptions and justifies them
![Page 45: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/45.jpg)
45
In More Detail
Abstraction
– Change variables to free input variables
– E.g. change a latch to a free input signal
Assume guarantee
– Assume (spec.Var = impl.Var) holds for reads of a transaction
![Page 46: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/46.jpg)
46
Experimental Results
Configurations
– 2 nodes, 2 addresses, SixthSense
[Chart: verification time vs. datapath width (1-bit to 10-bit) for the monolithic approach and the compositional approach; time labels: 30 min, 1-day]
![Page 47: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/47.jpg)
47
Outline
Background
Proposed solutions
– High level hierarchical coherence protocol verification
– Refinement check: specifications vs. RTL implementations
Conclusion
![Page 48: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/48.jpg)
48
Thesis Timeline
![Page 49: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/49.jpg)
49
Thank you.
![Page 50: Verification of Hierarchical Cache Coherence Protocols for Future Processors](https://reader030.fdocuments.in/reader030/viewer/2022020309/56813a60550346895da257de/html5/thumbnails/50.jpg)
50
Related Work
Parameterized verification
– Chou et al.
Bluespec
– Arvind et al.
Aggregation of distributed actions
– Park and Dill
Compositional verification
– Many previous works including McMillan, Jones, etc.