Code optimization in GCC - Mines ParisTech · Code optimization in GCC Sebastian· Pop Universite·...
-
Upload
nguyendiep -
Category
Documents
-
view
224 -
download
0
Transcript of Code optimization in GCC - Mines ParisTech · Code optimization in GCC Sebastian· Pop Universite·...
Code optimization in GCCSebastian Pop
Universite Louis Pasteur
Strasbourg
FRANCE
Code optimization in GCC – p.1
IntroductionGCC : GNU Compiler Collection
C, C++, Java, Ada, Frotran, Mercury, . . .
Generates code for 43 different architectures:i386, ia64, m68k, sparc, . . .
Main compiler in GNU world
Apple’s system compiler.
Industrial compiler.
Code optimization in GCC – p.2
IntroductionGCC : GNU Compiler Collection
C, C++, Java, Ada, Frotran, Mercury, . . .
Generates code for 43 different architectures:i386, ia64, m68k, sparc, . . .
Main compiler in GNU world
Apple’s system compiler.
Industrial compiler.
Code optimization in GCC – p.2
IntroductionGCC : GNU Compiler Collection
C, C++, Java, Ada, Frotran, Mercury, . . .
Generates code for 43 different architectures:i386, ia64, m68k, sparc, . . .
Main compiler in GNU world
Apple’s system compiler.
Industrial compiler.
Code optimization in GCC – p.2
IntroductionGCC : GNU Compiler Collection
C, C++, Java, Ada, Frotran, Mercury, . . .
Generates code for 43 different architectures:i386, ia64, m68k, sparc, . . .
Main compiler in GNU world
Apple’s system compiler.
Industrial compiler.
Code optimization in GCC – p.2
IntroductionGCC : GNU Compiler Collection
C, C++, Java, Ada, Frotran, Mercury, . . .
Generates code for 43 different architectures:i386, ia64, m68k, sparc, . . .
Main compiler in GNU world
Apple’s system compiler.
Industrial compiler.
Code optimization in GCC – p.2
Front-ends /back-end
GCC
Code optimization in GCC – p.3
Front-ends /back-end
gcc g++ gcj g77 Front−ends
GCC
Code optimization in GCC – p.3
Front-ends /back-end
gcc g++ gcj g77 Front−ends
i386 ia64 m68k sparc Machine Description
GCC
Code optimization in GCC – p.3
Front-ends /back-end
gcc g++ gcj g77 Front−ends
i386 ia64 m68k sparc Machine Description
RTL Back−end
GCC
Code optimization in GCC – p.3
Front-ends /back-end
gcc g++ gcj g77 Front−ends
i386 ia64 m68k sparc Machine Description
RTL Back−end
GCC
GAS Assembler
Code optimization in GCC – p.3
Exemple:cross-compilation
Suppose that I want to generate Sparc code:-target=sparc
I build GCC on my laptop: -build=i586
and I run the compiler on my laptop:-host=i586
Code optimization in GCC – p.4
Exemple:cross-compilation
Suppose that I want to generate Sparc code:-target=sparc
I build GCC on my laptop: -build=i586
and I run the compiler on my laptop:-host=i586
Code optimization in GCC – p.4
Exemple:cross-compilation
Suppose that I want to generate Sparc code:-target=sparc
I build GCC on my laptop: -build=i586
and I run the compiler on my laptop:-host=i586
Code optimization in GCC – p.4
Exemple:cross-compilation
Suppose that I want to generate Sparc code:-target=sparc
I build GCC on my laptop: -build=i586
and I run the compiler on my laptop:-host=i586
../gcc/configure -target=sparc -build=i586
-host=i586
Code optimization in GCC – p.4
Exemple:cross-compilation
gcc g++ gcj g77 Front−ends
i386 ia64 m68k sparc Machine Description
RTL Back−end
GCC
GAS Assembler
Code optimization in GCC – p.5
Exemple:cross-compilation
gcc g++ gcj g77 Front−ends
RTL Back−end
GCC
sparc Machine Description
1. Select SPARC machine description
SPARC specific
Code optimization in GCC – p.5
Exemple:cross-compilation
gcc g++ gcj g77 Front−ends
RTL Back−end
GCC
sparc Machine Description
1. Select SPARC machine
SPARC specific
SPARC assembler code
description
2. Compile
Code optimization in GCC – p.5
RTL OptimizationsAn optimization pass optimizes all front-ends.
Machine dependent optimizations.
Types and memory structures after loweringto RTL contain less information.Memory accesses are under their canonical form:<start adress + offset>
Idea: we’d like to havearchitecture independent optimizations.on high level representations.
Code optimization in GCC – p.6
RTL OptimizationsAn optimization pass optimizes all front-ends.
Machine dependent optimizations.
Types and memory structures after loweringto RTL contain less information.Memory accesses are under their canonical form:<start adress + offset>
Idea: we’d like to havearchitecture independent optimizations.on high level representations.
Code optimization in GCC – p.6
RTL OptimizationsAn optimization pass optimizes all front-ends.
Machine dependent optimizations.
Types and memory structures after loweringto RTL contain less information.Memory accesses are under their canonical form:<start adress + offset>
Idea: we’d like to havearchitecture independent optimizations.on high level representations.
Code optimization in GCC – p.6
RTL OptimizationsAn optimization pass optimizes all front-ends.
Machine dependent optimizations.
Types and memory structures after loweringto RTL contain less information.Memory accesses are under their canonical form:<start adress + offset>
Idea: we’d like to have
architecture independent optimizations.on high level representations.
Code optimization in GCC – p.6
RTL OptimizationsAn optimization pass optimizes all front-ends.
Machine dependent optimizations.
Types and memory structures after loweringto RTL contain less information.Memory accesses are under their canonical form:<start adress + offset>
Idea: we’d like to havearchitecture independent optimizations.
on high level representations.
Code optimization in GCC – p.6
RTL OptimizationsAn optimization pass optimizes all front-ends.
Machine dependent optimizations.
Types and memory structures after loweringto RTL contain less information.Memory accesses are under their canonical form:<start adress + offset>
Idea: we’d like to havearchitecture independent optimizations.on high level representations.
Code optimization in GCC – p.6
IntermediateRepresentations
GCC
RTL
gcc g++ gcj g77
Machine descriptionTranslation follows machines specificities
Code optimization in GCC – p.7
IntermediateRepresentations
GCC
Mid−RTL
RTL
gcc g++ gcj g77
Machine description
Translation
Progressive transition from AST to RTL Architecture independent IR
Code optimization in GCC – p.7
IntermediateRepresentations
GCC
Mid−RTL
RTL
gcc g++ gcj g77
Simple
Machine description
Simplify
Translation
Progressive transition from AST to RTL Architecture independent IR
Imperative Normal Form
Language independent representation
Code optimization in GCC – p.7
Abstract SyntaxTrees
Simple linked list for statement nodes.
Manipulation of nodes through a macrointerface: TREE_CHAIN, TREE_OPERAND,
TREE_CODE, ...
Data structures hidden.
AST nodes are typed:allows tree-checking during development.
Code optimization in GCC – p.8
Abstract SyntaxTrees
Simple linked list for statement nodes.
Manipulation of nodes through a macrointerface: TREE_CHAIN, TREE_OPERAND,
TREE_CODE, ...
Data structures hidden.
AST nodes are typed:allows tree-checking during development.
Code optimization in GCC – p.8
Abstract SyntaxTrees
Simple linked list for statement nodes.
Manipulation of nodes through a macrointerface: TREE_CHAIN, TREE_OPERAND,
TREE_CODE, ...
Data structures hidden.
AST nodes are typed:allows tree-checking during development.
Code optimization in GCC – p.8
Abstract SyntaxTrees
Simple linked list for statement nodes.
Manipulation of nodes through a macrointerface: TREE_CHAIN, TREE_OPERAND,
TREE_CODE, ...
Data structures hidden.
AST nodes are typed:allows tree-checking during development.
Code optimization in GCC – p.8
AST: example
a = (−−b) * 7;x = y+z;
Code optimization in GCC – p.9
AST: example
EXPR_STMTa = (−−b) * 7;x = y+z; a = (−−b) * 7;
Code optimization in GCC – p.9
AST: example
EXPR_STMT EXPR_STMTTREE_CHAIN (S)a = (−−b) * 7;
x = y+z; a = (−−b) * 7; x = y+z;
Code optimization in GCC – p.9
AST: example
EXPR_STMT
MODIFY_EXPR
EXPR_STMTa = (−−b) * 7;x = y+z;
EXPR_STMT_EXPR (S)
Code optimization in GCC – p.9
AST: example
EXPR_STMT
MULT_EXPR
MODIFY_EXPR
EXPR_STMT
VAR_DECL
TREE_OPERAND (M, 1)
a = (−−b) * 7;x = y+z;
TREE_OPERAND (M, 0)
Code optimization in GCC – p.9
AST: example
EXPR_STMT
PREDECREMENT_EXPR
MULT_EXPR
IDENTIFIER_NODEa
INTEGER_CST7
MODIFY_EXPR
EXPR_STMT
VAR_DECL
a = (−−b) * 7;x = y+z;
DECL_NAME (V)
Code optimization in GCC – p.9
AST: example
EXPR_STMT
PREDECREMENT_EXPR
MULT_EXPR
IDENTIFIER_NODEa
INTEGER_CST7
MODIFY_EXPR
EXPR_STMT
IDENTIFIER_NODEb
INTEGER_CST1
VAR_DECL
VAR_DECL
a = (−−b) * 7;x = y+z;
Code optimization in GCC – p.9
Simple: overviewSIMPLE’s grammar defines an imperativenormal form:
Reduced number of expressions.Reduced number of control structures.
SIMPLE AST has a regular structure.
Systematic AST analysis is possible.
Common intermediate representation for allfront ends.
Code optimization in GCC – p.10
Simple: overviewSIMPLE’s grammar defines an imperativenormal form:
Reduced number of expressions.
Reduced number of control structures.
SIMPLE AST has a regular structure.
Systematic AST analysis is possible.
Common intermediate representation for allfront ends.
Code optimization in GCC – p.10
Simple: overviewSIMPLE’s grammar defines an imperativenormal form:
Reduced number of expressions.Reduced number of control structures.
SIMPLE AST has a regular structure.
Systematic AST analysis is possible.
Common intermediate representation for allfront ends.
Code optimization in GCC – p.10
Simple: overviewSIMPLE’s grammar defines an imperativenormal form:
Reduced number of expressions.Reduced number of control structures.
SIMPLE AST has a regular structure.
Systematic AST analysis is possible.
Common intermediate representation for allfront ends.
Code optimization in GCC – p.10
Simple: overviewSIMPLE’s grammar defines an imperativenormal form:
Reduced number of expressions.Reduced number of control structures.
SIMPLE AST has a regular structure.
Systematic AST analysis is possible.
Common intermediate representation for allfront ends.
Code optimization in GCC – p.10
Simple: overviewSIMPLE’s grammar defines an imperativenormal form:
Reduced number of expressions.Reduced number of control structures.
SIMPLE AST has a regular structure.
Systematic AST analysis is possible.
Common intermediate representation for allfront ends.
Code optimization in GCC – p.10
Simple: exemple
a = −−b*7;
Code optimization in GCC – p.11
Simple: exemple
a = −−b*7;a=b*7;b=b−1;
Code optimization in GCC – p.11
Simple: exemple
a = −−b*7;a=b*7;b=b−1;
if (i++ && −−k){
}j=f(i+3*k);
Code optimization in GCC – p.11
Simple: exemple
a = −−b*7;a=b*7;b=b−1;
if (i++ && −−k){
}j=f(i+3*k);
if (i){k=k−1;if(k)
else
i=i+1;
{i=i+1;T1=3*k;T2=i+T1;j=f(T2);
}
i=i+1;}else
Code optimization in GCC – p.11
Simple: exemple
{
}A[i]=A[i+3*k];
while(i++ && −−k)
Code optimization in GCC – p.12
Simple: exempleif(i){k=k−1;if (k)while(1){i=i+1;T1=3*k;T2=i+T1;A[i]=A[T2];if(i){k=k−1;if(k)i=i+1;elsebreak;
}elsebreak;
}}
i=i+1;
{
}A[i]=A[i+3*k];
while(i++ && −−k)
Code optimization in GCC – p.12
An optimizingcompiler
Front−end
Source code
Analyses Optimizations
Code optimization in GCC – p.13
An optimizingcompiler
Front−end
InliningAnalyses
Source code
Call graphRecursivity suppression
Optimizations
Code optimization in GCC – p.13
Call Graph(node, edge) => (declaration, call)
Graph representation:pointers: P-Space.in a file under parenthesized form:EXP-Space.
Use metrics for controlling inlining.
GCC’s analysis is limited to a singletranslation unit.
Code optimization in GCC – p.14
Call Graph(node, edge) => (declaration, call)
Graph representation:
pointers: P-Space.in a file under parenthesized form:EXP-Space.
Use metrics for controlling inlining.
GCC’s analysis is limited to a singletranslation unit.
Code optimization in GCC – p.14
Call Graph(node, edge) => (declaration, call)
Graph representation:pointers: P-Space.
in a file under parenthesized form:EXP-Space.
Use metrics for controlling inlining.
GCC’s analysis is limited to a singletranslation unit.
Code optimization in GCC – p.14
Call Graph(node, edge) => (declaration, call)
Graph representation:pointers: P-Space.in a file under parenthesized form:EXP-Space.
Use metrics for controlling inlining.
GCC’s analysis is limited to a singletranslation unit.
Code optimization in GCC – p.14
Call Graph(node, edge) => (declaration, call)
Graph representation:pointers: P-Space.in a file under parenthesized form:EXP-Space.
Use metrics for controlling inlining.
GCC’s analysis is limited to a singletranslation unit.
Code optimization in GCC – p.14
Call Graph(node, edge) => (declaration, call)
Graph representation:pointers: P-Space.in a file under parenthesized form:EXP-Space.
Use metrics for controlling inlining.
GCC’s analysis is limited to a singletranslation unit.
Code optimization in GCC – p.14
Call Graph : solutionPerform call graph optimizations outsideGCC.
Problems :Extract information, decide, then applyoptimizations: 3 passes.Knowledge base’s size.What informations to be stored in KB?
Code optimization in GCC – p.15
Call Graph : solutionPerform call graph optimizations outsideGCC.
Problems :
Extract information, decide, then applyoptimizations: 3 passes.Knowledge base’s size.What informations to be stored in KB?
Code optimization in GCC – p.15
Call Graph : solutionPerform call graph optimizations outsideGCC.
Problems :Extract information, decide, then applyoptimizations: 3 passes.
Knowledge base’s size.What informations to be stored in KB?
Code optimization in GCC – p.15
Call Graph : solutionPerform call graph optimizations outsideGCC.
Problems :Extract information, decide, then applyoptimizations: 3 passes.Knowledge base’s size.
What informations to be stored in KB?
Code optimization in GCC – p.15
Call Graph : solutionPerform call graph optimizations outsideGCC.
Problems :Extract information, decide, then applyoptimizations: 3 passes.Knowledge base’s size.What informations to be stored in KB?
Code optimization in GCC – p.15
An optimizingcompiler
Front−end
InliningAnalyses
Source code
Call graphRecursivity suppression
Optimizations
Code optimization in GCC – p.16
An optimizingcompiler
Front−end
InliningRecursivity suppression
Call graphControl flow graph
CFG normalization
Analyses Optimizations
Source code
Code optimization in GCC – p.16
CFG NormalizationSuppress irregularities from control flow:goto, break, continue.
CFG normalization is based on Simple.
Why normalizing CFG?It is difficult to optimize programscontaining gotos.Break and continue translation to RTLgenerates gotos.Simplification generates irregular code.
Code optimization in GCC – p.17
CFG NormalizationSuppress irregularities from control flow:goto, break, continue.
CFG normalization is based on Simple.
Why normalizing CFG?It is difficult to optimize programscontaining gotos.Break and continue translation to RTLgenerates gotos.Simplification generates irregular code.
Code optimization in GCC – p.17
CFG NormalizationSuppress irregularities from control flow:goto, break, continue.
CFG normalization is based on Simple.
Why normalizing CFG?
It is difficult to optimize programscontaining gotos.Break and continue translation to RTLgenerates gotos.Simplification generates irregular code.
Code optimization in GCC – p.17
CFG NormalizationSuppress irregularities from control flow:goto, break, continue.
CFG normalization is based on Simple.
Why normalizing CFG?It is difficult to optimize programscontaining gotos.
Break and continue translation to RTLgenerates gotos.Simplification generates irregular code.
Code optimization in GCC – p.17
CFG NormalizationSuppress irregularities from control flow:goto, break, continue.
CFG normalization is based on Simple.
Why normalizing CFG?It is difficult to optimize programscontaining gotos.Break and continue translation to RTLgenerates gotos.
Simplification generates irregular code.
Code optimization in GCC – p.17
CFG NormalizationSuppress irregularities from control flow:goto, break, continue.
CFG normalization is based on Simple.
Why normalizing CFG?It is difficult to optimize programscontaining gotos.Break and continue translation to RTLgenerates gotos.Simplification generates irregular code.
Code optimization in GCC – p.17
Flow Out
Condition
Loop’s body
Loop:
Code optimization in GCC – p.18
Flow Out
if (c)break;
Condition Normal exit
Irregular exit
Loop:
Loop’s body
Code optimization in GCC – p.18
Flow Out
else {
}...
if (c) {b_c = true;}
Loop’s body
Loop:
b_c & Condition Normal exit
Code optimization in GCC – p.18
Break Elimination
while (a)
stmt1;{
if (b)break;
stmt2;}
Code optimization in GCC – p.19
Break Elimination
int c_b = 0;
stmt1;{
if (b){c_b = 1;}else
}
{
}stmt2;
while (c_b == 0 && a)
Code optimization in GCC – p.19
Goto Elimination
goto
label
Code optimization in GCC – p.20
Goto Elimination
goto
label
Code optimization in GCC – p.20
Goto Elimination
goto
label
Code optimization in GCC – p.20
Goto Elimination
goto
label
Code optimization in GCC – p.20
Goto Elimination
goto
label
Code optimization in GCC – p.20
An optimizingcompiler
Front−end
InliningRecursivity suppression
Call graphControl flow graph
CFG normalization
Analyses Optimizations
Source code
Code optimization in GCC – p.21
An optimizingcompiler
Front−end
Inlining
Loop unrolling / blocking / fusion ...
Recursivity suppressionCFG normalization
Call graphControl flow graph
OptimizationsAnalyses
Source code
Spatial / temporal locality
Code optimization in GCC – p.21
An optimizingcompiler
Front−end
SSA
Inlining
Loop unrolling / blocking / fusion ...CFG normalization
Spatial / temporal localityInduction variablesArray access functions
Dependence analysis Pointers and alias analysis
OptimizationsAnalyses
Source code
Call graphControl flow graph Recursivity suppression
Code optimization in GCC – p.21
Loop OptimizationsLoops are normalized after detection ofinduction variables.
Geometric representation of array accessescan be then constructed.
Dependence analysis is necessary forvalidating loop transformations.
These points are still under development.
Code optimization in GCC – p.22
Loop OptimizationsLoops are normalized after detection ofinduction variables.
Geometric representation of array accessescan be then constructed.
Dependence analysis is necessary forvalidating loop transformations.
These points are still under development.
Code optimization in GCC – p.22
Loop OptimizationsLoops are normalized after detection ofinduction variables.
Geometric representation of array accessescan be then constructed.
Dependence analysis is necessary forvalidating loop transformations.
These points are still under development.
Code optimization in GCC – p.22
Loop OptimizationsLoops are normalized after detection ofinduction variables.
Geometric representation of array accessescan be then constructed.
Dependence analysis is necessary forvalidating loop transformations.
These points are still under development.
Code optimization in GCC – p.22
An optimizingcompiler
Front−end
SSA
Inlining
Loop unrolling / blocking / fusion ...CFG normalization
Spatial / temporal localityInduction variablesArray access functions
Dependence analysis Pointers and alias analysis
OptimizationsAnalyses
Source code
Call graphControl flow graph Recursivity suppression
Code optimization in GCC – p.23
An optimizingcompiler
Front−end
SSA
Inlining
Loop unrolling / blocking / fusion ...
Recursivity suppressionCFG normalization
Spatial / temporal locality
Call graphControl flow graph
Induction variablesArray access functionsPointers and alias analysisDependence analysis
Source code
OptimizationsAnalyses
Unparser
Optimized code
Code optimization in GCC – p.23
RemerciementsMerci à tous ceux qui ont contribué à la réussitede ce projet :
Code optimization in GCC – p.24
RemerciementsMerci à tous ceux qui ont contribué à la réussitede ce projet :
l’équipe ICPS pour l’excellente ambiance,
Philippe Clauss, Vincent Loechner et BenoîtMeister pour leur travail de recherche,
Catherine Mongenet pour le cours de compil,
Frédéric Wagner et Diego Novillo pourm’avoir accompagné dans ce projet,
the FSF for GCC, Debian, Linux and Prosper
et ma famille.Code optimization in GCC – p.24