Whole Program Paths James R. Larus. Outline 1. Find acyclic path fragments 2. Convert into...
Whole Program Paths
James R. Larus
Outline
1. Find acyclic path fragments
2. Convert into whole-program path
3. Determine hot subpaths
Acyclic Paths
As per the Ball & Larus paper we implemented
Calculating Acyclic Paths
Instrument chords
Sum along paths is unique
– Postprocess for functions
Loop iteration is new path
– New: also function calls
Dump path ID to file
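The "sum along paths is unique" property can be illustrated with a small sketch of Ball-Larus style edge numbering. It is a simplified version under stated assumptions: the control-flow graph is already acyclic with a single exit, an increment is placed on every edge rather than only on spanning-tree chords, and the names `number_paths`, `all_paths`, `path_id` and the example graph are hypothetical.

```python
def number_paths(succ, entry, exit_):
    """Assign a value to every edge of a DAG so that the sum of edge values
    along each entry->exit path is a distinct integer in [0, num_paths)."""
    num_paths = {}   # node -> number of paths from node to exit
    edge_val = {}    # (src, dst) -> increment added when the edge is taken

    def visit(v):
        if v in num_paths:
            return num_paths[v]
        if v == exit_:
            num_paths[v] = 1
            return 1
        total = 0
        for w in succ[v]:
            # Paths through earlier successors occupy IDs [0, total).
            edge_val[(v, w)] = total
            total += visit(w)
        num_paths[v] = total
        return total

    visit(entry)
    return num_paths[entry], edge_val


def all_paths(succ, v, exit_):
    """Enumerate every path from v to the exit (used here only for checking)."""
    if v == exit_:
        yield [v]
        return
    for w in succ[v]:
        for rest in all_paths(succ, w, exit_):
            yield [v] + rest


def path_id(path, edge_val):
    """Path ID = sum of the edge values along the path."""
    return sum(edge_val[edge] for edge in zip(path, path[1:]))


# Hypothetical six-node acyclic CFG (assumption, not taken from the slides).
CFG = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D"],
       "D": ["E", "F"], "E": ["F"], "F": []}

count, values = number_paths(CFG, "A", "F")
ids = sorted(path_id(p, values) for p in all_paths(CFG, "A", "F"))
```

In an instrumented program the running sum would be kept in a register and written out when the path ends (at the exit, a loop back edge, or a call), which is the "dump path ID to file" step.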
Acyclic Paths Output
Outline
1. Find acyclic path fragments
2. Convert into whole-program path
   1. Compress output string
   2. Coalesce common substrings
   3. Store efficiently
3. Determine hot subpaths
Compress and Coalesce
Grammatical Benefits
Explain output string as context-free grammar:
– Efficient compression (~20x)
– Automatic subsequence grouping
Grammar creation
– Append symbols to start rule
– Digrams appear at most once
– Rules must be used at least twice
Example: 121213121214
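To make the grammar idea concrete, here is a rough sketch that turns the example string into a small context-free grammar. It is not SEQUITUR itself: SEQUITUR builds the grammar online, one symbol at a time, and also inlines rules that end up used only once. The sketch swaps in a simpler offline approach (repeatedly replacing the most frequent repeated digram, in the style of Re-Pair). The function names and the `R0`, `R1`, ... rule names are assumptions for illustration.

```python
from collections import Counter


def build_grammar(symbols):
    """Offline digram substitution (NOT the online SEQUITUR algorithm):
    replace the most frequent repeated digram with a new rule until
    no digram occurs twice in the start rule."""
    seq = list(symbols)
    rules = {}       # rule name -> (left symbol, right symbol)
    next_id = 0
    while True:
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break
        digram, freq = counts.most_common(1)[0]
        if freq < 2:
            break
        name = f"R{next_id}"   # assumes input symbols never look like "R<n>"
        next_id += 1
        rules[name] = digram
        # Rewrite the sequence left to right, replacing non-overlapping matches.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == digram:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Recover the original string by expanding every rule reference."""
    out = []
    for sym in seq:
        if sym in rules:
            out.extend(expand(rules[sym], rules))
        else:
            out.append(sym)
    return out


start, rules = build_grammar("121213121214")
restored = "".join(expand(start, rules))
```

The start rule is shorter than the input because the repeated `1 2 1 2 1` block is named once and referenced twice, which is the "automatic subsequence grouping" benefit listed above.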
SEQUITUR
Execution Representation
Not a control-flow graph!
Execution sequence = post-order traversal of DAG
Whole Paths
Efficient representation
– Create grammar online
Execution context information
– e.g., A runs after B
Frequency information
Simple path aggregation
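As an illustration of the frequency point, the number of times each acyclic path executes can be read off the grammar without expanding it back into the full trace. The sketch below assumes the grammar is stored as a plain dictionary; `GRAMMAR` is a hand-written grammar that derives the earlier example string 121213121214, and `terminal_counts` is a hypothetical helper name.

```python
from collections import Counter

# Hand-written grammar for 121213121214 (assumption for illustration):
#   S -> A 3 A 4,  A -> B B 1,  B -> 1 2
GRAMMAR = {
    "S": ["A", "3", "A", "4"],
    "A": ["B", "B", "1"],
    "B": ["1", "2"],
}


def terminal_counts(grammar, start="S"):
    """Count how often each terminal (acyclic path ID) occurs in the string
    derived from `start`, visiting each rule once instead of expanding it."""
    memo = {}

    def visit(sym):
        if sym not in grammar:          # terminal symbol
            return Counter({sym: 1})
        if sym not in memo:             # each rule is summarised only once
            total = Counter()
            for child in grammar[sym]:
                total.update(visit(child))
            memo[sym] = total
        return memo[sym]

    return visit(start)


counts = terminal_counts(GRAMMAR)
```

Because shared rules are summarised once and reused, the work is proportional to the size of the grammar rather than the length of the trace.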
Outline
1. Find acyclic path fragments
2. Convert into whole-program path
3. Determine hot subpaths
   1. Find short frequent subsequences
   2. ???
   3. Profit!
Outline
1. Find acyclic path fragments
2. Convert into whole-program path
3. Determine hot subpaths
   1. Find short frequent subsequences
   2. Heavily optimize that 1%
   3. Applies to 75% of cache misses
Hot Subpaths
Looking for minimal hot subpaths
– L or fewer consecutive acyclic path fragments with cost of C or greater
– Cost = execution frequency × costs of acyclic path fragments
– Path fragment cost = number of instructions
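The definition above can be sketched as a brute-force search over an uncompressed trace. This is only meant to pin down the definition: the approach in the talk works on the grammar, not on the expanded trace. The sketch assumes that "minimal" means no shorter contiguous piece of the subpath is itself hot, and the function name, the instruction counts and the thresholds are all made up for illustration.

```python
from collections import Counter


def hot_subpaths(trace, instr_cost, max_len, min_cost):
    """Return {subpath: cost} for minimal hot subpaths of `trace`.

    A subpath is a run of at most `max_len` consecutive path fragments.
    cost = (number of occurrences) * (sum of instruction counts of its fragments).
    """
    freq = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(trace) - n + 1):
            freq[tuple(trace[i:i + n])] += 1

    cost = {p: f * sum(instr_cost[x] for x in p) for p, f in freq.items()}
    hot = {p for p, c in cost.items() if c >= min_cost}

    def proper_parts(p):
        # All shorter contiguous pieces of p.
        for n in range(1, len(p)):
            for i in range(len(p) - n + 1):
                yield p[i:i + n]

    # Keep only hot subpaths that contain no shorter hot subpath.
    return {p: cost[p] for p in hot
            if not any(q in hot for q in proper_parts(p))}


trace = list("121213121214")                        # sequence of path IDs
instr_cost = {"1": 5, "2": 5, "3": 20, "4": 20}     # made-up instruction counts
result = hot_subpaths(trace, instr_cost, max_len=3, min_cost=40)
```

With these made-up numbers, the subpath 1 2 1 is hot (cost 60) but not minimal, because the shorter subpath 1 2 already reaches the threshold.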
Finding Hot Subpaths
Recursively look for hot minimal subpaths
1. Split between children
2. Processed at lower recursive level
Results
Typically:
– 30 MB/sec program trace (@ 200 MHz)
– 1 MB/sec program path
– 30 grammar rules per path fragment
– 100,000 rules in grammar
Number of hot paths grows slowly with maximum length
Space sublinear in input size, time supralinear
Summary
Contributions
– Stream out acyclic path fragments in order
– Compress and structure with grammar
– Find hot subpaths from whole program path
Limitations
– 15x runtime slowdown
– Space-based limits on runtime
– High number of hot paths found
Questions
What other potentially useful information does this data structure give?
– Order-dependent code errors
What potential for optimization does this open up?
– Other applications?
– Experimental hot-path results?