1/29/2013
1
TDTS 01 Lecture 6
High-Level Synthesis I
Zebo PengEmbedded Systems Laboratory
IDA, Linköping University
2Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
H.L. Describe and Synthesize
Description of a design in terms of behavioral specification.
Refinement of the design towards an implementation by adding structural details.
Evaluation of the design in terms of a cost function and the design is optimized for the cost function.
For I=0 To 2 LoopWait until clk’eventand clk=´1´;If (rgb[I] < 248) Then
P=rgb[I] mod 8;... clk
PR
enable
O
1/29/2013
2
3Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Lecture 6
Advanced scheduling issues
A typical HLS process
Definition and motivation
Scheduling techniques
4Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Definition
HLS generates automatically a register-transfer level design from a behavioral specification.
For I=0 To 2 LoopWait until clk’eventand clk=´1´;If (rgb[I] < 248) Then
P=rgb[I] mod 8;...
clk
p R
enable
OHLS
HLS is also called architectural synthesis, behavioral synthesis, or algorithmic synthesis.
Layout Synthesis
Logic Synthesis
RTL Synthesis
High-Level Synthesis
1/29/2013
3
5Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Input to HLSThe behavioral specification.
Design constraints (timing, performance, cost, power consumption, pin-count, testability, etc.).
An optimization (cost) function.
A module library representing the available components.
For I=0 To 2 LoopWait until clk’eventand clk=´1´;If (rgb[I] < 248) Then
P=rgb[I] mod 8;...
clk
p R
enable
OHLS
6Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Controller
Output of HLSRTL implementation structure (netlist).
Controller (captured usually as a symbolic FSM).
For I=0 To 2 LoopWait until clk’eventand clk=´1´;If (rgb[I] < 248) Then
P=rgb[I] mod 8;...
clk
p R
enable
OHLS
C1 C2 C3
Other attributes, such as geometrical
information, to guide the back-end design tasks.
1/29/2013
4
7Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Goal of HLS To generate a RTL design that
implements the specified behavior,
satisfies the design constraints, and
leads to the optimization of the given cost function.Ex. Cost = DP-area + Cont-area + Exe-time
For I=0 To 2 LoopWait until clk’eventand clk=´1´;If (rgb[I] < 248) Then
P=rgb[I] mod 8;...
HLS
Controller
clk
p R
enable
O
C1 C2 C3
8Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Why HLS?Efforts spent in system design are now dominating
the design cycle.
Physical design → Logic design → System design.
Improve design productivity.
Increase re-targetability.
New technology generations.
ASICs vs. FPGAs.
New platforms.
Suport design re-use.
Get the design correct the first time.
1/29/2013
5
9Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Lecture 6
Advanced scheduling issues
A typical HLS process
Definition and motivation
Scheduling techniques
10Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
HLS Phase 1 — Specification Which language to use?
Procedural languages
Functional languages
Graphics notations
VHDL vs. Verilog vs. SystemC
Explicit parallelism or not?
Requirement specification
Timing
Power consumption
Ares cost
…
PROCEDURE Test;VAR
A,B,C,D,E,F,G:integer;
BEGINRead(A,B,C,D,E);F := E*(A+B);G := (A+B)*(C+D);...
END;
1/29/2013
6
11Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Phase 2 — Data-Flow Analysis
Parallelism extraction.
Eliminating high-level
language constructs
(procedure call, complex
operator, etc.).
Program transformation (e.g,
loop invariant movement).
Loop unrolling.
Common sub-expression
detection.
Dead code elimination.
+
A B C DE
GF
+
* *
PROCEDURE Test;VAR
A,B,C,D,E,F,G:integer;BEGIN
Read(A,B,C,D,E);F := E*(A+B);G := (A+B)*(C+D);...
END;
12Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Phase 3 — Operation Scheduling
To decide when to perform
each operation.
Performance/cost trade-
off.
Clocking strategy.
Performance estimation.
+
A B C DE
GF
+
* *
1/29/2013
7
13Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Phase 4 — Data-Path Allocation
Operator selection and
binding.
Register/memory allocation.
Interconnection generation.
Hardware minimization.
+
A B C DE
GF
+
* *
To decide the basic data-path structure.
partial data-path
Reg R2
+
A<0:7> B<0:7>
*
Reg R1
Reg R3
E<0:7>
Reg R4
M
14Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Phase 5 — Control Allocation Selection of control
style (PLA, micro-code, random logic, etc.).
Controller generation.
Controller description:
S1: Load R1 and R2 next S2;
S2: Add, M=1, Load R3 next S3;
S3: Load R4 next S4;
S4: Mult, M=0, Load R3 next S5;
...
Reg R2
+
A<0:7> B<0:7>
*
Reg R1
Reg R3
E<0:7>
Reg R4
M
1/29/2013
8
15Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Phase 6 — Module Binding and Cont. Imp.
Selection of physical modules.
Controller ROM:
0000 : 11000000 0001
0001 : 00100000 0010
0010 : 00011000 0011
0011 : 01000000 0100
...Module
Lib
Reg R1 Reg R2
Reg R3
A<0:7> B<0:7>
M1
Adder 8
L2L1
L3
Decision on module parameters and constraints.
Controller implementation.
16Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
The Basic Issues
Scheduling — Assignment of each operation to a time slot corresponding to a clock cycle or time interval.
Resource Allocation — Selection of the types of hardware components and the number for each type to be included.
Module Binding — Assignment of operation to the allocated hardware components.
Controller Synthesis — Design of control style and clocking scheme.
1/29/2013
9
17Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
The Basic Issues (Cont’d)
Compilation — To convert the input specification
into an internal representation.
Parallelism Extraction — To extract the inherent
parallelism of the original solution.
It is usually done with data flow analysis
techniques.
Operation Decomposition — Implementation of
complex operations in the behavioral
specification.
18Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Lecture 6
Advanced scheduling issues
A typical HLS process
Definition and motivation
Scheduling techniques
1/29/2013
10
19Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
The Scheduling Problems
Resource-constrained (RC) scheduling:Given a set O of operations with a partial ordering,
a set K of functional unit types, a type function, : O K, to map the operations into the
functional unit types, and resource constraints mk for each functional unit type.
Find a (optimal) schedule for the set of operations that obeys the partial ordering and utilizes only the available functional units.
Time-constrained (TC) scheduling
20Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
A RC-Scheduling Example
a := i1 + i2;
o1 := (a - i3) * 3;
o2 := i4 + i5 + i6;
d := i7 * i8;
g := d + i9 + i10;
o3 := i11 * 7 * g;
1 adder, 1 multiplier
: +,- Adder
* Multiplier
+ *
+
*
*+
-
*
+
+
o1
o2
o3
g
da
1/29/2013
11
21Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
ASAP (As Soon As Possible) Sch.
Sort operations topologically according to their dependence.
Schedule operations in the sorted order by placing them in the earliest possible control step.
1 2 3 4 5
6 7 8
9 10
Sorted DFG
+ 1*
3
+ 2
+ 4
*5
- 6
+ 7
+ 8
*9
*10
Co
ntr
ol S
tep
s
1
2
4
6
7
5
3
+ *
+
*
*+
-
*
+
+
22Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
ALAP (As Late As Possible) Sch. Sort operations topologically according to their dependence.
Schedule operations in the reversed order by placing them in the latest possible control step.
1 2 3 4 5
6 7 8
9 10
Sorted DFG
+ 1
*3
+ 2
+ 4
*5- 6
+ 7
+ 8*
9
*10
Co
ntr
ol S
tep
s
N-5
N-3
N-1
N
N-2
N-4
6 steps needed
+ *
+
*
*+
-
*
+
+
6 steps needed = Optimum
1/29/2013
12
23Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
List Scheduling
For each control step, the operations that are available to be scheduled are kept in a list;
The list is ordered by some priority function:
The length of path from the operation to the end of the block (critical path);
Mobility: the number of control steps from the earliest to the latest feasible control step.
Each operation on the list is scheduled one by one if the resources it needs are free; otherwise it is deferred to the next control step.
24Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Sorted list: +1(2), *3(2), +4(2), +2(1), *5(1)
Schedulable operations
Co
ntr
ol S
tep
s
1
3
5
6
4
2
Sorted list: +1(2), *3(2), +4(2), +2(1), *5(1)
Schedulable operations + 1*
3
List Scheduling Example
+ *
+
*
*+
-
*
+
+
1 2 3 4 5
6 7 8
9 10
+ 2
+ 1*
3
+ 4*
5
- 6
+ 7
+ 8*
9
*10
1/29/2013
13
25Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Time-Constrained Scheduling
Force-Directed Method
TC scheduling — to find a schedule for the operations that obeys the partial ordering and utilizes the minimal amount of hardware.
The basic idea is to balance the concurrency of operations.
ASAP and ALAP schedules, assuming no resource constraints, are calculated to derive the time frames for all operations.
For each type of operations, a distribution graph is built to denote the possible control steps for each operation.
If an operation could be done in k steps, then 1/k is added to each of these k steps.
26Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Force-Directed Scheduling Example
ASAP
+
+
*
a1
a2
*
+a3
ALAP
+
+
*
a1
a2*
+a3
Time Constraint: 3 cycles
a1
a2a3
Distribution Graph for Addition
The
idea
l dis
tribu
tion
1/29/2013
14
27Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Force-Directed Scheduling ExampleThe algorithm tries to balance the distribution graph by calculating the force of each operation-to-control step assignment and select the smallest force.
)(
)(
))(( )()(
1)(
iALAP
iASAP
ji
o
osijso sDG
oTsDGForce
ASAP
+
+
*
a1
a2
*
+a3
ALAP
+
+
*
a1
a2*
+a3
Time Constraint: 3 cycles
a1
a2a3
Distribution Graph for Addition
The
idea
l dis
tribu
tion
28Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
5.015.1)(2
15.1
)3(
)3()2)3((
aALAP
aASAPsca sDGForce
Force-Directed Scheduling ExampleThe algorithm tries to balance the distribution graph by calculating the force of each operation-to-control step assignment and select the smallest force.
ASAP
+
+
*
a1
a2
*
+a3
ALAP
+
+
*
a1
a2*
+a3
Time Constraint: 3 cycles
a1
a2a3
Distribution Graph for Addition
The
idea
l dis
tribu
tion
1/29/2013
15
29Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
5.015.0)(2
15.0
)3(
)3()3)3((
aALAP
aASAPsca sDGForce
Force-Directed Scheduling ExampleThe algorithm tries to balance the distribution graph by calculating the force of each operation-to-control step assignment and select the smallest force.
ASAP
+
+
*
a1
a2
*
+a3
ALAP
+
+
*
a1
a2*
+a3
Time Constraint: 3 cycles
a1
a2a3
Distribution Graph for Addition
The
idea
l dis
tribu
tion
30Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Classification of Sch. Approaches, I
Constructive scheduling —
One operation is assigned to one control step at a time and this process is iterated from control step (operation) to control step (operation).
ASAP
ALAP
List scheduling
1/29/2013
16
31Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Classification of Sch. Approaches, II
Global scheduling —
All control step and all operations are considered simultaneously when operations are assigned to control steps.
Force-directed scheduling
Neural net scheduling
Integer Linear Programming algorithms
32Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Classification of Sch. Approaches, III
Transformational scheduling —
Starting from an initial schedule, a final schedule is obtained by successively transformations.
Transformation-Based Approach
PerformanceCostPowerTestability...
1/29/2013
17
33Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Lecture 6
Advanced scheduling issues
A typical HLS process
Definition and motivation
Scheduling techniques
34Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Advanced Scheduling Issues
Chaining and multi-cycling:
*100ns
+
200ns
+
Two chainedadditions*
100ns
+
+
A multi-cyclemultiplication*
100ns
+
+
50ns
Pipelined operations.
Two adders needed
Complex in analysis, verification and testing.
1/29/2013
18
35Zebo Peng, IDA, LiTH TDTS01 Lecture Notes – Lecture 6
Summary on Scheduling
Scheduling has long been recognized as a very important step in high-level synthesis, since it determines the performance of the final design.
A wide variety of algorithms exist in the literature for efficiently performing the task of HLS scheduling.
These algorithms can be classified in different ways:
Objectives: Basic scheduling algorithms, Time constrained, or Resource constrained.
Approaches: Constructive, Global, or Transformational.
Top Related