The LANCE V2.0
C compiler system
Rainer Leupers
phone: +49 (231) 755 6151
mobile: +49 (177) 2131146
fax: +49 (231) 755 6116
leupers@icd.de
University of Dortmund, Informatik 12
44221 Dortmund, Germany
http://ls12-www.cs.uni-dortmund
© 2000, R. Leupers
Overview
- Functionality of LANCE
- Software structure
- C frontend
- Intermediate representation (IR)
- IR optimizations
- Control and data flow analysis
- Backend interface
The LANCE V2.0 compiler system
Tasks covered by LANCE:
- Source code analysis
- Generation of IR
- Machine-independent optimizations
- Data flow graph generation

Tasks not covered by LANCE:
- Assembly code generation (backend)
- Machine-specific optimizations
- Code assembly and linking

Purpose of LANCE:
- Facilitate C compiler development for new target processors
- Give insight into compiler structure
Key features
Full ANSI C coverage (C89)
Modular tool and library structure
Simple three address code IR (C subset)
Plug & play IR optimizations
Backend interface compatible with OLIVE
Proven in numerous compiler projects
LANCE software structure
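[Figure: LANCE software structure]
LANCE library: lance2.h (header file), liblance2.a (C++ library)
LANCE tools: C frontend, IR optimization 1 ... IR optimization n, machine-specific backend
The tools work on the common IR; the LANCE library is used by the tools.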
ANSI C frontend
Functionality:
- Lexical, syntactic, and semantic analysis of C source
- Generation of three address code IR for a C file
- Emission of error messages if required (gcc style)
- Machine-specific constants (type bitwidth, alignment) stored in a configuration file

Implementation:
- Based on a context-free C grammar, according to the K&R spec
- C source automatically generated with an attribute grammar compiling system (OX, an extension of lex & yacc)
- In total approx. 26,000 lines of C source code
- Validated with a comprehensive test suite
Setup and IR generation
Environment variables:
    setenv LANCE2_CPP "gcc -E"
    setenv LANCE2_CONFIG "config.sparc"
Call the C frontend with the "compile" command:
    > compile test.c
[Figure: compile reads file test.c and config.sparc and produces file test.ir.c]
General IR format
- One IR file (*.ir.c) is generated for each C source file (*.c)
- External IR format: C subset (compilable!)
- Internal IR format: accessible via the LANCE library
- IR contains a symbol table + three address code (3AC) for each C function defined in the source code
- 3AC is a sequence of IR statements
- 3AC = at most two operands, one result per statement
- IR statements (mostly) consist of IR expressions
- Blocks of 3AC are augmented with source information (C code, source line no.) for debugging purposes
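To make the "at most two operands, one result per statement" rule concrete, here is a minimal sketch that writes one expression both at source level and in three address style. The function names and the auxiliary variables t1, t2, t3 are illustrative assumptions, not actual LANCE output.

```cpp
#include <cassert>

// Source-level form: one expression with three operators.
int source_form(int a, int b, int c) {
    return (a + b) * (c - a);
}

// Three address style: every statement has at most two operands
// and exactly one result. t1, t2, t3 are hypothetical auxiliary variables.
int three_address_form(int a, int b, int c) {
    int t1;
    int t2;
    int t3;
    t1 = a + b;
    t2 = c - a;
    t3 = t1 * t2;
    return t3;
}
```

Both functions compute the same value for any input, for example `assert(source_form(2, 3, 7) == three_address_form(2, 3, 7))` holds.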
Classes of IR statements
Assignment: a = b + c; *p = !a; x = f(y,z); cond = *x;
Jump: goto lab;
Conditional jump: if (cond) goto lab;
Label: lab:
Return void: return;
Return value: return x;
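The statement classes above are enough to express structured control flow. As a hedged sketch (the function, labels, and variable names are made up for illustration), a counting loop can be written using only assignments, a label, a conditional jump, a jump, and a return:

```cpp
#include <cassert>

// Sums 1 + 2 + ... + n using only the listed IR statement classes.
int sum_to(int n) {
    int i;
    int s;
    int cond;
    s = 0;                    // assignment
    i = 1;                    // assignment
loop:                         // label
    cond = i > n;             // assignment with a binary expression
    if (cond) goto done;      // conditional jump
    s = s + i;
    i = i + 1;
    goto loop;                // jump
done:
    return s;                 // return value
}
```

For example, `sum_to(4)` yields 10 and `sum_to(0)` yields 0.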
Classes of IR expressions
Symbol: "a", "b", "main", "count", ...
Binary expression: a * b, x / 2, 3 ^ v, f & 4, q % r, ...
Unary expression: !a, *p, ~x, -z, ...
Function call: f1(), f2(a,b), f3(*x, 1, y), ...
Type cast: (char)z, (int)a, (float*)b, ...
String constant: "compiler", "design", "is", "fun", ...
Integer constant: 1000, 3456, -234, -112, ...
Float constant: "3.1415926536", "2.718281828459", ...
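Why is the LANCE IR a C subset?
Validation of frontend (or any IR optimization):
[Figure: the C source is compiled with CC into exe 1; the frontend translates the same C source into IR-C source, which is compiled with CC into exe 2; both executables run on the same test input, and output 1 is compared with output 2]
C-to-C optimization:
[Figure: IR optimization tools produce optimized C source, which is then compiled with CC]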
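The validation idea amounts to a differential test: run the original and the IR version on the same inputs and compare the outputs. The sketch below shows this inside one program for brevity; the two functions stand in for "exe 1" and "exe 2", and all names are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Stand-in for "exe 1": the function as written in the C source.
int abs_source(int x) { return x < 0 ? -x : x; }

// Stand-in for "exe 2": a hand-written IR-style version of the same function.
int abs_ir(int x) {
    int cond;
    int r;
    cond = x < 0;
    if (cond) goto neg;
    r = x;
    goto done;
neg:
    r = -x;
done:
    return r;
}

// Runs both versions on every test input and reports whether all outputs agree.
bool outputs_agree(const std::vector<int>& inputs) {
    for (int x : inputs) {
        if (abs_source(x) != abs_ir(x)) return false;
    }
    return true;
}
```

A call such as `outputs_agree({-5, 0, 7})` returns true when the two versions behave identically on those inputs. Agreement on a test set does not prove equivalence; it only fails to find a difference.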
IR data structure overview
[Figure: IR data structure]
Global symbol table: int x1,x2,x3; double y1,y2,y3; ...
Function list: fun 1 "name1" ... fun n "name n"
Each function: local symbol table (int a,b,c; ...) and IR statement list (stm 1, stm 2, ..., stm m)
Stm info, examples:
    Class: assignment, ID: 4123, Left hand side: *p, Right hand side: a + b
    Class: cond. jump, ID: 4124, Target: "L1", Condition: c
Exp info (IR expression), example:
    Class: binary, ID: 10034, Left arg: a, Right arg: b, Oper: +, Type: int
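The nesting of symbol tables, functions, statements, and expressions can be pictured with a few plain structs. These are hypothetical stand-ins chosen for illustration; they are not the classes declared in lance2.h, and statements are stored as text only to keep the sketch short.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the structures in the overview; not the LANCE classes.
struct Expression {
    std::string cls;  // e.g. "binary"
    int id;
    std::string left, right, oper, type;
};

struct Statement {
    std::string cls;  // e.g. "assignment", "cond. jump"
    int id;
    std::string text; // printable form, kept as a string for brevity
};

struct Function {
    std::string name;
    std::vector<std::string> local_symbols;
    std::vector<Statement> statements;
};

struct IR {
    std::vector<std::string> global_symbols;
    std::vector<Function> functions;
};

// The expression entry from the overview.
Expression example_expression() {
    return {"binary", 10034, "a", "b", "+", "int"};
}

// One function holding the two statement entries from the overview.
IR make_example() {
    IR ir;
    ir.global_symbols = {"x1", "x2", "x3", "y1", "y2", "y3"};
    Function f;
    f.name = "name1";
    f.local_symbols = {"a", "b", "c"};
    f.statements.push_back({"assignment", 4123, "*p = a + b;"});
    f.statements.push_back({"cond. jump", 4124, "if (c) goto L1;"});
    ir.functions.push_back(f);
    return ir;
}

// Counts the statements of all functions in the IR.
int count_statements(const IR& ir) {
    int n = 0;
    for (const Function& f : ir.functions) n += static_cast<int>(f.statements.size());
    return n;
}
```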
The IR type class
C++ class IRType stores type info for all symbols and expressions
- Primary type: void, char, short, int, array, pointer, struct, function, ...
- Secondary type: subtype of arrays and pointers
- Storage class: extern, static, register, ...
- Qualifiers: const, volatile

Example: const int* A[100];
    Type->Class() = IRTYPE_ARRAY // primary type
    Type->IsConst() = true
    Type->Subtype()->Class() = IRTYPE_POINTER
    Type->Subtype()->Subtype()->Class() = IRTYPE_INT
    Type->ArrayDim() = 100
    Type->SizeOf() = 400 // in bytes, for 32-bit pointers
    Type->MemoryWords() = 200 // for a 16-bit word memory
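The two size figures follow from simple arithmetic: 100 pointers of 4 bytes each give 400 bytes, and 400 bytes are 3200 bits, which fill 200 words of 16 bits. A minimal sketch of that calculation, assuming a hypothetical Target struct in place of the values LANCE reads from its configuration file:

```cpp
#include <cassert>

// Hypothetical target description; in LANCE these values come from the configuration file.
struct Target {
    int pointer_bits;  // width of a pointer
    int word_bits;     // width of one memory word
};

// Size in bytes of an array of `dim` pointers.
int size_of_pointer_array(int dim, const Target& t) {
    return dim * (t.pointer_bits / 8);
}

// Number of memory words occupied by `bytes` bytes, rounded up.
int memory_words(int bytes, const Target& t) {
    return (bytes * 8 + t.word_bits - 1) / t.word_bits;
}
```

With `Target t{32, 16}`, `size_of_pointer_array(100, t)` gives 400 and `memory_words(400, t)` gives 200, matching the example.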
The symbol table class
Symbol table stores all relevant information for symbols/identifiers

Two hierarchy levels:
- Global symbol table: IR->GlobalSymbolTable()
- One local symbol table per function: fun->LocalSymbolTable()

All local symbols get a unique numerical suffix, e.g.
    int f(int x) { int a,b; }
becomes
    int f(int x_1) { int a_2, b_3; }

Important access methods:
- ST->LookupSymbol(char* name)
- IRSymbol* ST->CreateSymbol(IRType* tp)
- Iterators: ST->FirstObject(), ST->NextObject()

Information stored in a table entry (class IRSymbol):
- Symbol type: IRType* sym->Type()
- Symbol name: char* sym->Name()
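The suffix scheme can be sketched as a running counter applied in declaration order. This is an assumption made for illustration: the helper below is not part of the LANCE library, and the actual numbering rules (for instance, where the counter starts in a later function) are not stated here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Appends a running numerical suffix to each local name, in declaration order.
// Assumption: a single counter starting at `first`; the real scheme may differ.
std::vector<std::string> add_suffixes(const std::vector<std::string>& names, int first = 1) {
    std::vector<std::string> out;
    int next = first;
    for (const std::string& n : names) {
        out.push_back(n + "_" + std::to_string(next));
        ++next;
    }
    return out;
}
```

Calling `add_suffixes({"x", "a", "b"})` produces `x_1`, `a_2`, `b_3`, as in the example above. Unique suffixes let symbols from different scopes coexist without name clashes.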
IR generation example
[Figure: source file and generated IR file side by side, with annotations: forward declaration, automatic conversion, auxiliary vars, debug info, suffix 3 for parameter i]
IR optimization tools
Purpose: perform machine-independent optimizations on IR
Identical IR format for all tools, "plug & play" concept
Currently available tools:
- Constant folding: cfold tool
- Constant propagation: constprop tool
- Copy propagation: copyprop tool
- Common subexpression elimination: cse tool
- Dead code elimination: dce tool
- Jump optimization: jmpopt tool
- Loop invariant code motion: licm tool
- Induction variable elimination: ive tool

Automatic iteration of IR optimizations via the "iropt" shell script
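Iterating optimizations pays off because one pass can expose work for another. The sketch below shows one common way to do this, repeating all passes until a full round changes nothing. Whether iropt uses this stopping rule is not stated here, and the toy IR and passes are assumptions, not LANCE tools.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

using ToyIR = std::vector<std::string>;    // one string per statement
using Pass = std::function<bool(ToyIR&)>;  // returns true if it changed the IR

// Toy pass: removes one statement that repeats the statement directly before it.
bool drop_adjacent_duplicate(ToyIR& ir) {
    for (std::size_t i = 1; i < ir.size(); ++i) {
        if (ir[i] == ir[i - 1]) { ir.erase(ir.begin() + i); return true; }
    }
    return false;
}

// Toy pass: removes one empty statement.
bool drop_empty_statement(ToyIR& ir) {
    for (std::size_t i = 0; i < ir.size(); ++i) {
        if (ir[i] == ";") { ir.erase(ir.begin() + i); return true; }
    }
    return false;
}

// Applies all passes round after round until one full round changes nothing.
// Returns the number of rounds that were executed.
int run_until_stable(ToyIR& ir, const std::vector<Pass>& passes) {
    int rounds = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Pass& p : passes) changed = p(ir) || changed;
        ++rounds;
    }
    return rounds;
}
```

Starting from the statements `x = 1;`, `;`, `x = 1;`, the duplicate only becomes adjacent after the empty statement is removed, so the duplicate pass succeeds only in the second round.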
IR optimization example
[Figure: C source code is translated by compile into unoptimized IR]
Constant folding
[Figure: IR before and after running cfold]
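As a minimal sketch of what constant folding does on three address code (the functions and variables are illustrative assumptions, not LANCE output), an operation on two constants is evaluated at compile time:

```cpp
#include <cassert>

// Before: both operands of the multiplication are constants.
int before_cfold(int x) {
    int t1;
    int y;
    t1 = 4 * 8;
    y = x + t1;
    return y;
}

// After: the constant expression has been replaced by its value.
int after_cfold(int x) {
    int t1;
    int y;
    t1 = 32;
    y = x + t1;
    return y;
}
```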
Constant propagation
[Figure: IR before and after running constprop]
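A hedged illustration of constant propagation, with made-up names: when a variable is known to hold a constant, its uses are replaced by that constant.

```cpp
#include <cassert>

// Before: t1 is known to be 32 at its use.
int before_constprop(int x) {
    int t1;
    int y;
    t1 = 32;
    y = x + t1;
    return y;
}

// After: the use of t1 has been replaced by the constant.
// The assignment to t1 is now unused and is left for dead code elimination.
int after_constprop(int x) {
    int t1;
    int y;
    t1 = 32;
    y = x + 32;
    return y;
}
```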
Copy propagation
[Figure: IR before and after running copyprop]
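Copy propagation can be sketched as follows (illustrative names, not LANCE output): after a copy `a = x`, later uses of `a` are replaced by `x` as long as neither variable is reassigned in between.

```cpp
#include <cassert>

// Before: a is a plain copy of x.
int before_copyprop(int x) {
    int a;
    int b;
    int c;
    a = x;
    b = a + 1;
    c = a * b;
    return c;
}

// After: uses of a have been replaced by x; the copy itself becomes unused.
int after_copyprop(int x) {
    int a;
    int b;
    int c;
    a = x;
    b = x + 1;
    c = x * b;
    return c;
}
```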
Common subexpression elimination
[Figure: IR before and after running cse]
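A minimal sketch of common subexpression elimination, assuming hypothetical variables: the second computation of `a * b` is replaced by a reuse of the first result, which is valid because `a` and `b` are not modified in between.

```cpp
#include <cassert>

// Before: a * b is computed twice.
int before_cse(int a, int b, int c) {
    int t1, t2, u, v, r;
    t1 = a * b;
    u = t1 + c;
    t2 = a * b;
    v = t2 - c;
    r = u * v;
    return r;
}

// After: the second multiplication reuses the value already held in t1.
int after_cse(int a, int b, int c) {
    int t1, t2, u, v, r;
    t1 = a * b;
    u = t1 + c;
    t2 = t1;
    v = t2 - c;
    r = u * v;
    return r;
}
```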
Dead code elimination
[Figure: IR before and after running dce]
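Dead code elimination removes statements whose results are never used. A hedged example with illustrative names:

```cpp
#include <cassert>

// Before: t1 is assigned but never read afterwards.
int before_dce(int x) {
    int t1;
    int y;
    t1 = 32;
    y = x + 32;
    return y;
}

// After: the dead assignment and the unused variable are gone.
int after_dce(int x) {
    int y;
    y = x + 32;
    return y;
}
```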
Jump optimization
[Figure: IR before and after running jmpopt]
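Two typical jump optimizations are retargeting a jump whose destination is itself a jump, and deleting a jump to the statement that directly follows it. Which of these jmpopt performs is not stated here; the sketch below only illustrates the general idea with made-up labels.

```cpp
#include <cassert>

// Before: L1 only forwards to L3, and "goto L2" ends up jumping over that forwarder.
int before_jmpopt(int x) {
    int cond;
    int r;
    cond = x > 0;
    if (cond) goto L1;
    r = 0;
    goto L2;
L1:
    goto L3;
L2:
    return r;
L3:
    r = 1;
    goto L2;
}

// After: the conditional jump targets L3 directly, the forwarding label is removed,
// and the jump to the directly following label L2 is dropped.
int after_jmpopt(int x) {
    int cond;
    int r;
    cond = x > 0;
    if (cond) goto L3;
    r = 0;
L2:
    return r;
L3:
    r = 1;
    goto L2;
}
```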
Loop invariant code motion
[Figure: IR before and after running licm]
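Loop invariant code motion moves a computation whose operands do not change inside the loop to a point before the loop. A minimal sketch with illustrative names:

```cpp
#include <cassert>

// Before: a * b is recomputed in every iteration although a and b never change.
int before_licm(int a, int b, int n) {
    int i, s, t, cond;
    s = 0;
    i = 0;
loop:
    cond = i >= n;
    if (cond) goto done;
    t = a * b;           // loop invariant
    s = s + t;
    i = i + 1;
    goto loop;
done:
    return s;
}

// After: the invariant multiplication is executed once, before the loop.
int after_licm(int a, int b, int n) {
    int i, s, t, cond;
    s = 0;
    i = 0;
    t = a * b;
loop:
    cond = i >= n;
    if (cond) goto done;
    s = s + t;
    i = i + 1;
    goto loop;
done:
    return s;
}
```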
Induction variable elimination
[Figure: IR before and after running ive]
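In the textbook form of induction variable elimination, a variable derived from the loop counter (here `j = 4 * i`) is updated incrementally, and the counter is removed once nothing else needs it. The exact transformation applied by the ive tool is not stated here; this is a sketch with made-up names.

```cpp
#include <cassert>

// Before: j is recomputed from the loop counter i with a multiplication.
int before_ive(int n) {
    int i, j, s, cond;
    s = 0;
    i = 0;
loop:
    cond = i >= n;
    if (cond) goto done;
    j = 4 * i;
    s = s + j;
    i = i + 1;
    goto loop;
done:
    return s;
}

// After: j advances by 4 per iteration and the counter i is eliminated;
// the loop bound is rewritten in terms of j.
int after_ive(int n) {
    int j, s, cond, limit;
    s = 0;
    j = 0;
    limit = 4 * n;
loop:
    cond = j >= limit;
    if (cond) goto done;
    s = s + j;
    j = j + 4;
    goto loop;
done:
    return s;
}
```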
Control flow analysis
- Purpose: identify basic block structure of a C function
- Basic block (BB): IR statement sequence with unique entry and exit points
- Control flow graph (CFG): one node per BB, edge (BB1, BB2) iff BB2 may be an immediate successor of BB1 during execution
- Assembly code generation is usually done BB after BB

Example:
    while (x) { BB1; if (x) then BB2; else BB3; BB4; }
[Figure: CFG of the example with nodes BB1, BB2, BB3, BB4]
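Basic blocks can be found with the textbook "leader" rule: a block starts at the first statement, at every label, and after every jump, conditional jump, or return. The sketch below applies that rule to a toy statement list. It is an assumption for illustration and not the algorithm or API of the ControlFlowGraph class.

```cpp
#include <cassert>
#include <vector>

// Toy statement kinds, mirroring the IR statement classes.
enum class Kind { Assignment, Label, Jump, CondJump, Return };

// Returns the index of the first statement of every basic block.
// Simplification: every label is treated as a possible jump target.
std::vector<int> block_leaders(const std::vector<Kind>& stms) {
    std::vector<int> leaders;
    for (int i = 0; i < static_cast<int>(stms.size()); ++i) {
        bool after_branch = i > 0 && (stms[i - 1] == Kind::Jump ||
                                      stms[i - 1] == Kind::CondJump ||
                                      stms[i - 1] == Kind::Return);
        if (i == 0 || stms[i] == Kind::Label || after_branch) leaders.push_back(i);
    }
    return leaders;
}
```

For a loop of the shape "two assignments, label, assignment, conditional jump, two assignments, jump, label, return", the leaders are the statements at indices 0, 2, 5, and 8, which gives four basic blocks.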
CFG generation by LANCE
- Class ControlFlowGraph is contained in the LANCE library
- Constructor ControlFlowGraph(Function* fun) generates the CFG for any function fun
- LANCE tool showcfg exports CFGs in the VCG text format
- VCG can be used to visualize generated CFGs

[Figure: IR file is converted by showcfg into a VCG file, which xvcg displays as a CFG]
CFG visualization example
[Figure: CFG visualization produced by showcfg + VCG tool]
Data flow analysis
- Goal: convert IR into a data flow graph (DFG) representation for assembly code generation by tree pattern matching
- Performed by def/use analysis between IR statements/expressions
- LANCE lib class DataFlowAnalysis provides the required methods
- Constructor DataFlowAnalysis(Function* fun) constructs data flow information for any function fun

Example:
    x = 5; goto lab; ... x = 6; lab: y = x + 1; ... z = 1 - y; u = y / 5;
The use of x in y = x + 1 has two definitions: x = 5 and x = 6
The definition of y has two uses: z = 1 - y and u = y / 5
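The def/use relation in the example can be approximated by recording, for each statement, the variable it defines and the variables it reads. The sketch below only collects these sets syntactically. A real def/use analysis, such as the one behind DataFlowAnalysis, must also follow control flow to decide which definitions reach which uses. All names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy statement: the variable it defines (empty if none) and the variables it uses.
struct Stm {
    std::string def;
    std::vector<std::string> uses;
};

// Indices of all statements that define `var`.
std::vector<int> definitions_of(const std::vector<Stm>& stms, const std::string& var) {
    std::vector<int> result;
    for (int i = 0; i < static_cast<int>(stms.size()); ++i)
        if (stms[i].def == var) result.push_back(i);
    return result;
}

// Indices of all statements that use `var`.
std::vector<int> uses_of(const std::vector<Stm>& stms, const std::string& var) {
    std::vector<int> result;
    for (int i = 0; i < static_cast<int>(stms.size()); ++i)
        for (const std::string& u : stms[i].uses)
            if (u == var) { result.push_back(i); break; }
    return result;
}

// The assignments of the example, without the jump and the label.
std::vector<Stm> example() {
    return {
        {"x", {}},      // 0: x = 5;
        {"x", {}},      // 1: x = 6;
        {"y", {"x"}},   // 2: y = x + 1;
        {"z", {"y"}},   // 3: z = 1 - y;
        {"u", {"y"}},   // 4: u = y / 5;
    };
}
```

Here `definitions_of(example(), "x")` returns the indices 0 and 1, and `uses_of(example(), "y")` returns 3 and 4.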
DFG visualization example
[Figure: DFG visualization produced by showdfg + VCG tool]
Backend interface
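[Figure: a DFG in which the product a * b (the CSE) feeds two additions, one with 2 and one with c, producing x and y; after splitting, a * b is stored in an auxiliary variable t that both additions read]
- LANCE lib classes LANCEDataFlowTree and DFTManager provide the link between LANCE IR and tree pattern matching
- OLIVE/IBURG accept only trees instead of general DFGs
- Hence: split DFGs at the common subexpressions (CSEs)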
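Splitting at CSEs can be sketched as a fan-out count: any inner node that is an operand of more than one other node cannot stay inside a tree, so it becomes the root of its own tree and its value is passed on through an auxiliary variable. The structures below are assumptions for illustration and are unrelated to the actual LANCEDataFlowTree and DFTManager classes.

```cpp
#include <cassert>
#include <vector>

// Toy DFG node: indices of its operand nodes (empty for leaves such as a, b, 2, c).
struct Node {
    std::vector<int> operands;
};

Node leaf() { return Node{}; }

Node binary(int left, int right) {
    Node n;
    n.operands = {left, right};
    return n;
}

// Returns the indices of inner nodes used by more than one other node.
// These are the common subexpressions at which the DFG would be split.
std::vector<int> split_points(const std::vector<Node>& dfg) {
    std::vector<int> uses(dfg.size(), 0);
    for (const Node& n : dfg)
        for (int operand : n.operands) ++uses[operand];
    std::vector<int> result;
    for (int i = 0; i < static_cast<int>(dfg.size()); ++i)
        if (uses[i] > 1 && !dfg[i].operands.empty()) result.push_back(i);
    return result;
}

// A DFG for x = a * b + 2 and y = a * b + c with a shared product node.
std::vector<Node> example_dfg() {
    return {
        leaf(),        // 0: a
        leaf(),        // 1: b
        binary(0, 1),  // 2: a * b
        leaf(),        // 3: 2
        leaf(),        // 4: c
        binary(2, 3),  // 5: x = (a * b) + 2
        binary(2, 4),  // 6: y = (a * b) + c
    };
}
```

For this graph, `split_points(example_dfg())` returns only index 2, the product node. After the split there are three trees: `t = a * b`, `x = t + 2`, and `y = t + c`.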
Data structure overview
- Constructor DFTManager(Function* fun) generates the data flow tree (DFT) representation for an entire function fun
- DFTManager contains an internal list of basic blocks
- Each BB in turn is a list of DFTs

[Figure: list of basic blocks BB 1, BB 2, ..., BB n; each BB holds a list DFT 1, DFT 2, ..., DFT m]
DFT covering with OLIVE
- DFTs are directly in the format required by code generators produced by OLIVE
- All DFTs consist of a fixed set of terminal symbols (e.g. cs_STORE) (specified in file INCL/termlist.c)

Example (only a single DFT):
[Figure: C file, IR file, and DFT representation of the example]
Example (cont.)
[Figure: simplified OLIVE spec, the DFT in OLIVE format, and the resulting assembly code for a hypothetical machine]
Summary
LANCE provides you with ...
- C frontend
- IR optimizations
- C++ library for IR access (+ important basic classes)
- interface to OLIVE data flow trees

Full C compiler additionally requires ...
- OLIVE based backend for the concrete target machine
- target-specific optimizations (e.g. scheduling, address gen.)