1

Timing-Driven Routing for FPGAs Based on Lagrangian Relaxation

Seokjin Lee*, D. F. Wong+

*Dept. of Electrical and Computer Engineering

+Dept. of Computer Sciences

The University of Texas at Austin

2

Outline

- Overview
- Introduction
  - FPGA architecture, routing resources
  - FPGA routing problem
- Problem Formulation
  - Routing graphs and timing graphs
- Algorithm Description
  - Lagrangian relaxation
  - LR_ROUTE, NET_ROUTE
- Experimental Results
- Conclusion

3

Overview

- A new timing-driven routing algorithm for FPGAs
  - Find a routing with minimum critical path delay for a given placed circuit.
  - Handling of the timing constraints in a mathematical programming framework.
- Routing results are compared with those of the VPR router.

4

FPGA Architecture

- Logic modules
  - Implement logic functions
  - LUTs, flip-flops
- Routing resources
  - Wire segments
  - Programmable switches
- I/O modules

<Figure: a typical FPGA architecture, showing logic modules (L), programmable switches (S), wire segments, and I/O modules>

5

FPGA Routing Resources

- Prefabricated wire segments
- Routing constraints:
  - Sharing of a wire segment by different nets is not possible
- Limited routability
- High RC delays and large area of switches

<Figure: logic modules L1-L4 with pins a-h>

6

FPGA Routing

<Figure: routing example with logic modules L1-L4, pins a-h, and wire segments 1-16>

7

Routing Graph $G_r(V_r, E_r)$

- $V_r$: I/O pins of logic modules, wire segments
- $E_r$: feasible connections between the nodes
- Routing problem: find vertex-disjoint trees $T = \{T_1, \dots, T_n\}$

<Figure: routing graph whose nodes are pins a-h and wire segments (2, 3, 7, 8, 9, 10, 13, 16 are labeled), shown beside the FPGA layout with logic modules L1-L4, pins a-h, and wire segments 1-16>
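The vertex-disjointness requirement on the routing trees can be checked mechanically. The following is a minimal sketch, assuming each routing tree is stored simply as the set of routing-graph nodes it occupies; the function names and the node labels (`w2`, `w3`, ...) are hypothetical and not taken from the slides.

```python
from typing import Dict, Set


def shared_nodes(trees: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each routing-graph node used by more than one net to the nets using it."""
    users: Dict[str, Set[str]] = {}
    for net, nodes in trees.items():
        for node in nodes:
            users.setdefault(node, set()).add(net)
    return {node: nets for node, nets in users.items() if len(nets) > 1}


def is_vertex_disjoint(trees: Dict[str, Set[str]]) -> bool:
    """True when no routing-graph node is occupied by two different nets."""
    return not shared_nodes(trees)


# Illustrative data only: two nets, given as the node sets of their trees.
legal = {"net1": {"a", "w2", "w3", "c"}, "net2": {"b", "w7", "d"}}
illegal = {"net1": {"a", "w2", "w3", "c"}, "net2": {"b", "w3", "d"}}
```

Here `legal` passes the check, while `illegal` reports wire segment `w3` as shared by both nets.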

8

Timing Constraints

- Source-to-sink delays of nets
  - Delay of wire-switch chains
  - Calculated from architecture-specific RC values based on the Elmore delay model
- Timing constraints
  - Specified by arrival times at primary inputs (outputs of storage elements) or required times at primary outputs (inputs of storage elements)
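As an illustration of the Elmore delay model for a wire-switch chain, the sketch below assumes the simplest lumped form: stage i has a series resistance R_i followed by a node capacitance C_i, and the delay to the end of the chain is the sum over stages of R_i times all capacitance at or beyond that stage. The function name and the numeric values are hypothetical; the slides do not give the actual RC model or values.

```python
from typing import Sequence


def elmore_chain_delay(resistances: Sequence[float], capacitances: Sequence[float]) -> float:
    """Elmore delay to the far end of a lumped RC chain.

    resistances[i] is the series resistance of stage i and capacitances[i]
    the capacitance at the node after it; delay = sum_i R_i * sum_{j >= i} C_j.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    downstream = 0.0  # capacitance at or beyond the current stage
    delay = 0.0
    # Walk from the sink back to the source so that the downstream
    # capacitance accumulates in a single pass.
    for r, c in zip(reversed(resistances), reversed(capacitances)):
        downstream += c
        delay += r * downstream
    return delay


# Two stages: R = [1, 2], C = [3, 4] gives 1*(3+4) + 2*4 = 15.
example_delay = elmore_chain_delay([1.0, 2.0], [3.0, 4.0])
```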

9

Timing Graph $G_t(V_t, E_t)$

- Constructed from input netlist
- Captures timing constraints
- $V_t$: inputs, outputs, logic module pins
- $E_t$: source-sink pairs of nets, input-output pairs of logic modules
- Fictitious nodes
  - s: connects primary inputs
  - t: connects primary outputs

<Figure: timing graph running from s through the primary inputs, logic modules, and primary outputs to t>
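Given a delay on every edge of the timing graph, arrival times follow from a longest-path pass in topological order. The sketch below assumes the timing graph is acyclic, that delays are nonnegative, and that nodes without predecessors start at time 0; the edge dictionary and node names are made up for illustration.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, Tuple

Edge = Tuple[Hashable, Hashable]


def arrival_times(delay: Dict[Edge, float]) -> Dict[Hashable, float]:
    """Longest-path arrival times on a DAG: a[v] = max over (u, v) of a[u] + D_uv."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for (u, v), d in delay.items():
        successors[u].append((v, d))
        indegree[v] += 1
        nodes.update((u, v))
    arrival = {n: 0.0 for n in nodes}
    ready = deque(n for n in nodes if indegree[n] == 0)
    while ready:  # Kahn-style topological traversal
        u = ready.popleft()
        for v, d in successors[u]:
            arrival[v] = max(arrival[v], arrival[u] + d)
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return arrival


# Illustrative graph: s feeds two inputs, which drive gate g, then output o, then t.
example = {
    ("s", "i1"): 0.0, ("s", "i2"): 0.0,
    ("i1", "g"): 2.0, ("i2", "g"): 3.0,
    ("g", "o"): 1.5, ("o", "t"): 0.0,
}
example_arrival = arrival_times(example)
```

The arrival time at `t` in this example is the critical path delay, which is the quantity the routing formulation minimizes.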

10

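Timing-Driven FPGA Routing

- Minimization of critical path delay under timing and routing constraints
- Find vertex-disjoint routing trees $T = \{T_1, \dots, T_n\}$ for all the nets such that

$$
\begin{aligned}
\text{Minimize} \quad & a_t \\
\text{subject to} \quad & a_u + D_{uv} \le a_v, \quad \forall (u,v) \in E_t
\end{aligned}
$$

where $a_u$ is the arrival time at node $u$, $a_v$ is the arrival time at node $v$, and

- if $u = s$: $a_s = 0$ and $D_{sv}$ = arrival time of input $v$
- if $v = t$: $D_{ut} = 0$ and $a_u$ = arrival time of output $u$
- else: $D_{uv}$ = delay along path$(u, v)$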

11

Lagrangian Relaxation

- General technique for solving optimization problems with difficult constraints
- Lagrangian subproblems
  - New objective function: the constraints are multiplied by constants (Lagrangian multipliers) and added to the original objective function
- Iteratively update Lagrangian multipliers and solve Lagrangian subproblems

12

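Lagrangian Relaxation

Original problem:

$$
\begin{aligned}
\min \quad & f(x) \\
\text{s.t.} \quad & g_1(x) \le b_1 \\
& g_2(x) \le b_2 \\
& \;\;\vdots \\
& g_k(x) \le b_k
\end{aligned}
$$

Lagrangian subproblem:

$$
\min \; f(x) + \lambda_1 (g_1(x) - b_1) + \lambda_2 (g_2(x) - b_2) + \dots + \lambda_k (g_k(x) - b_k)
$$

Update $\lambda_1, \lambda_2, \dots, \lambda_k$.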

13

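LR for Our Problem

Original problem:

$$
\min \; a_t \quad \text{s.t.} \quad a_u + D_{uv} \le a_v, \quad \forall (u,v) \in E_t
$$

Lagrangian subproblem:

$$
\min \; L(\lambda, a, T) = a_t + \sum_{(u,v) \in E_t} \lambda_{uv} \, (a_u + D_{uv} - a_v)
$$

Update $\lambda$.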

14

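Optimality Conditions

- Optimality conditions on $\lambda$: $\partial L / \partial a_u = 0$ for all $u \in V_t$
- By rearranging terms (the $a_s$ term is omitted because $a_s = 0$),

$$
L(\lambda, a, T) = \Bigl(1 - \sum_{(u,t) \in E_t} \lambda_{ut}\Bigr) a_t
+ \sum_{w \in V_t - \{s,t\}} \Bigl(\sum_{(w,u) \in E_t} \lambda_{wu} - \sum_{(u,w) \in E_t} \lambda_{uw}\Bigr) a_w
+ \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}
$$

so the conditions are

$$
\sum_{(u,t) \in E_t} \lambda_{ut} = 1
$$

$$
\sum_{(u,w) \in E_t} \lambda_{uw} = \sum_{(w,v) \in E_t} \lambda_{wv}, \quad \forall w \in V_t - \{s,t\}
$$

<Figure: example timing graph with nodes a, b, c, d, e, t and edge multipliers $\lambda_{ac}$, $\lambda_{bc}$, $\lambda_{cd}$, $\lambda_{dt}$, $\lambda_{et}$, for which $\lambda_{ac} + \lambda_{bc} = \lambda_{cd}$ and $\lambda_{dt} + \lambda_{et} = 1$>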
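The optimality conditions say that the multipliers behave like a unit flow into t: the multipliers entering t sum to 1, and at every other node except s the incoming and outgoing multipliers balance. A minimal checker is sketched below, assuming multipliers are stored per timing-graph edge and are nonnegative. The edges from `s` in the example are an assumption added so that nodes a, b, and e have incoming flow; the numeric values are illustrative only.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, Tuple

Edge = Tuple[Hashable, Hashable]


def satisfies_optimality_conditions(
    lam: Dict[Edge, float], source: Hashable = "s", sink: Hashable = "t", tol: float = 1e-9
) -> bool:
    """Check sum of lambda into the sink == 1 and in == out at all inner nodes."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for (u, v), value in lam.items():
        if value < 0:
            return False  # multipliers of inequality constraints stay nonnegative
        outflow[u] += value
        inflow[v] += value
    if not math.isclose(inflow[sink], 1.0, abs_tol=tol):
        return False
    inner = (set(inflow) | set(outflow)) - {source, sink}
    return all(math.isclose(inflow[w], outflow[w], abs_tol=tol) for w in inner)


# Multipliers shaped like the figure's example, plus assumed edges from s.
good = {
    ("s", "a"): 0.3, ("s", "b"): 0.2, ("s", "e"): 0.5,
    ("a", "c"): 0.3, ("b", "c"): 0.2, ("c", "d"): 0.5,
    ("d", "t"): 0.5, ("e", "t"): 0.5,
}
bad = dict(good)
bad[("e", "t")] = 0.6  # now the multipliers into t sum to 1.1
```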

15

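Simplified Lagrangian Subproblem

$$
L(\lambda, a, T) = \underbrace{\Bigl(1 - \sum_{(u,t) \in E_t} \lambda_{ut}\Bigr)}_{0} a_t
+ \sum_{w \in V_t - \{s,t\}} \underbrace{\Bigl(\sum_{(w,u) \in E_t} \lambda_{wu} - \sum_{(u,w) \in E_t} \lambda_{uw}\Bigr)}_{0} a_w
+ \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}
$$

- Optimality conditions on $\lambda$:

$$
L(\lambda, T) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}
$$

- Lagrangian subproblem becomes

$$
LS(\lambda): \quad \min \; L(\lambda, T) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}
$$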

16

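Updating Lagrangian Multipliers

- Subgradient method

$$
\lambda_{uv}^{r+1} = \max\{0, \; \lambda_{uv}^{r} + \theta_r \, (a_u + D_{uv} - a_v)\}
$$

- $\theta_r$: step size
- Convergence conditions:

$$
\lim_{r \to \infty} \theta_r = 0, \qquad \lim_{r \to \infty} \sum_{i=1}^{r} \theta_i = \infty
$$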
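The subgradient update can be sketched in a few lines. This assumes the usual projected form: each multiplier moves by the step size times the violation of its constraint, a_u + D_uv - a_v, and is clipped at zero. The 1/r step size is one common choice that tends to zero while its partial sums diverge; the schedule actually used in the talk is not stated, and the function names are hypothetical.

```python
from typing import Dict, Hashable, Tuple

Edge = Tuple[Hashable, Hashable]


def step_size(r: int) -> float:
    """Assumed schedule: 1/r for iteration r >= 1."""
    return 1.0 / r


def update_multipliers(
    lam: Dict[Edge, float],
    arrival: Dict[Hashable, float],
    delay: Dict[Edge, float],
    step: float,
) -> Dict[Edge, float]:
    """One projected subgradient step on every timing-graph edge."""
    return {
        (u, v): max(0.0, value + step * (arrival[u] + delay[(u, v)] - arrival[v]))
        for (u, v), value in lam.items()
    }


# A violated constraint (2 + 2 > 3) raises its multiplier ...
tight = update_multipliers({("u", "v"): 0.5}, {"u": 2.0, "v": 3.0}, {("u", "v"): 2.0}, 0.5)
# ... while a constraint with slack (2 + 0 < 5) is pushed down and clipped at 0.
slack = update_multipliers({("u", "v"): 0.5}, {"u": 2.0, "v": 5.0}, {("u", "v"): 0.0}, 0.5)
```

This sketch applies the raw update only; it does not re-impose the optimality conditions on the multipliers afterwards.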

17

LR_ROUTE

1. Initialize $\lambda$
2. Call NET_ROUTE to solve $LS(\lambda)$
3. Compute $a_u$ for each $u \in V_t$
4. Update $\lambda_{uv}$ for each $(u,v) \in E_t$
5. Repeat Steps 2-4 until no shared resource exists.
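The outer loop above can be sketched as a small driver. Everything beyond the five steps is an assumption: the subproblem solver and the arrival-time computation are passed in as callables (`net_route` and `arrival_times` are hypothetical names), multipliers start at a uniform value, the step size is 1/r, and an iteration cap is added so the loop always terminates.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def lr_route(
    edges: Iterable[Edge],
    net_route: Callable[[Dict[Edge, float]], Tuple[Dict[Edge, float], bool]],
    arrival_times: Callable[[Dict[Edge, float]], Dict[Hashable, float]],
    max_iters: int = 50,
):
    """Driver loop: returns (multipliers, edge delays, arrival times, iterations used).

    net_route(lam) is assumed to return the edge delays D_uv of its routing
    and a flag telling whether any routing resource is still shared.
    """
    lam = {edge: 1.0 for edge in edges}  # step 1 (uniform start is an assumption)
    delay: Dict[Edge, float] = {}
    arrival: Dict[Hashable, float] = {}
    for r in range(1, max_iters + 1):
        delay, has_sharing = net_route(lam)      # step 2
        arrival = arrival_times(delay)           # step 3
        step = 1.0 / r                           # assumed step-size schedule
        lam = {                                  # step 4
            (u, v): max(0.0, lam[(u, v)] + step * (arrival[u] + delay[(u, v)] - arrival[v]))
            for (u, v) in lam
        }
        if not has_sharing:                      # step 5
            return lam, delay, arrival, r
    return lam, delay, arrival, max_iters
```

Passing the solver in as a callable keeps the sketch independent of any particular routing-graph representation.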

18

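Solving Lagrangian Subproblem

- NET_ROUTE
  - Find routing trees $T$ for a set of given multipliers $\lambda$ such that

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} \\
\text{subject to} \quad & \sum_{k} x_{ik} \le 1, \quad \forall i \in V_r
\end{aligned}
$$

where

$$
x_{ik} =
\begin{cases}
1, & \text{if net } k \text{ uses node } i \ (i \in T_k) \\
0, & \text{otherwise}
\end{cases}
$$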

19

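Solving Lagrangian Subproblem

$$
LS(\lambda): \quad \min \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} \quad \text{s.t.} \quad \sum_{k} x_{ik} \le 1
$$

Relaxing the routing constraints with multipliers $\mu_i$:

$$
\begin{aligned}
\min \; L &= \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} + \sum_{i \in V_r} \mu_i \Bigl(\sum_{k} x_{ik} - 1\Bigr) \\
&= \sum_{k} \Bigl\{ \underbrace{\sum_{(u,v) \in \text{net } k} \lambda_{uv} D_{uv}}_{\text{weighted sink delay for net } k}
+ \underbrace{\sum_{i \in V_r} \mu_i x_{ik}}_{\text{routing congestion cost for net } k} \Bigr\}
- \underbrace{\sum_{i \in V_r} \mu_i}_{\text{constant}}
\end{aligned}
$$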

20

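Routing Nets

- For net $k$, minimize

$$
\sum_{(u,v) \in \text{net } k} \lambda_{uv} D_{uv} + \sum_{i \in V_r} \mu_i x_{ik}
= \sum_{(u,v) \in \text{net } k} \; \sum_{i \in \text{path}(u,v)} \lambda_{uv} d_i + \sum_{i \in V_r} \mu_i x_{ik}
$$

- Cost for each node:

$$
c_i = \underbrace{\lambda_{uv} d_i}_{\text{delay cost}} + \underbrace{\mu_i}_{\text{congestion cost}}
$$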

21

NET_ROUTE

1. For each net $k$
2.   Rip up routing for net $k$
3.   For each sink $v$ of net $k$
4.     Maze route from source to sink with cost $c_i = \lambda_{uv} d_i + \mu_i$
5.     Update $\mu_i$ for all nodes $i$ in path$(u, v)$
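The maze-routing step can be illustrated with a shortest-path search over node costs. The sketch below uses Dijkstra's algorithm as a stand-in for the maze router and assumes each node's cost is the sink's timing multiplier times the node delay plus a congestion penalty (called `mu` here); the graph, delays, and penalties are invented for the example, and the rule for updating the congestion penalty is not shown because the slides do not give it.

```python
import heapq
from typing import Dict, Hashable, Iterable, List, Optional, Tuple


def node_costs(lam_uv: float, d: Dict[Hashable, float], mu: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Per-node cost: timing multiplier times node delay, plus congestion penalty."""
    return {i: lam_uv * d[i] + mu[i] for i in d}


def maze_route(
    adj: Dict[Hashable, Iterable[Hashable]],
    cost: Dict[Hashable, float],
    source: Hashable,
    sink: Hashable,
) -> Tuple[Optional[List[Hashable]], float]:
    """Cheapest path, charging cost[v] for every node v entered after the source."""
    best = {source: 0.0}
    prev: Dict[Hashable, Hashable] = {}
    heap = [(0.0, source)]
    while heap:
        so_far, u = heapq.heappop(heap)
        if u == sink:
            break
        if so_far > best.get(u, float("inf")):
            continue  # stale heap entry
        for v in adj.get(u, ()):
            candidate = so_far + cost[v]
            if candidate < best.get(v, float("inf")):
                best[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))
    if sink not in best:
        return None, float("inf")
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[sink]


# Two candidate wire segments between a source pin and a sink pin.
adj = {"src": ["w1", "w2"], "w1": ["snk"], "w2": ["snk"]}
d = {"w1": 1.0, "w2": 2.0, "snk": 0.0}
free = {"w1": 0.0, "w2": 0.0, "snk": 0.0}
congested = {"w1": 5.0, "w2": 0.0, "snk": 0.0}
fast_path, fast_cost = maze_route(adj, node_costs(1.0, d, free), "src", "snk")
detour_path, detour_cost = maze_route(adj, node_costs(1.0, d, congested), "src", "snk")
```

With no congestion the faster segment `w1` is chosen; once `w1` carries a penalty of 5, the router detours through `w2`, which shows how the two cost terms trade delay against sharing.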

22

Experimental Results

- FPGA model used
  - Symmetrical-array-based FPGA
  - Each logic block contains four 4-input LUTs and flip-flops
  - Switch connections: Fs = 3, Fc = W
    - Fs: number of connections per wire entering the switch box
    - Fc: number of tracks to which each logic block pin can connect
    - W: number of tracks in a channel

23

Experimental Results

- Tested on large circuits from the MCNC benchmark
- Routing with fixed channel width
  - Minimum channel width obtained by running VPR in timing-driven mode
- Better results for 13 circuits (out of 17)
  - Critical path delay improved by up to 33% with comparable runtime

24

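Experimental Results

- Critical path delay and runtime comparison

| Circuit | LUTs/FFs | Number of tracks | Delay (ns), VPR | Delay (ns), LR_ROUTE | Runtime (s), VPR | Runtime (s), LR_ROUTE |
|---|---|---|---|---|---|---|
| Alu4 | 1522 | 33 | 46.6 | 46.2 | 58 | 57 |
| Apex2 | 1878 | 43 | 61.5 | 49.3 | 61 | 46 |
| Apex4 | 1262 | 41 | 45.4 | 48.9 | 29 | 41 |
| Bigkey | 1707 | 24 | 41.7 | 27.8 | 53 | 62 |
| Clma | 8383 | 51 | 125.0 | 96.4 | 531 | 464 |
| Des | 1591 | 24 | 43.5 | 48.1 | 44 | 42 |
| Diffeq | 1497 | 29 | 48.8 | 48.6 | 32 | 31 |
| Dsip | 1370 | 25 | 29.6 | 27.6 | 53 | 78 |
| Elliptic | 3604 | 40 | 77.1 | 71.3 | 151 | 256 |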

25

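Experimental Results

| Circuit | LUTs/FFs | Number of tracks | Delay (ns), VPR | Delay (ns), LR_ROUTE | Runtime (s), VPR | Runtime (s), LR_ROUTE |
|---|---|---|---|---|---|---|
| Ex1010 | 4598 | 44 | 83.5 | 75.2 | 248 | 351 |
| Ex5p | 1064 | 43 | 44.8 | 43.7 | 22 | 34 |
| frisc | 3556 | 43 | 81.5 | 84.3 | 121 | 171 |
| misex3 | 1397 | 37 | 42.5 | 49.4 | 50 | 49 |
| pdc | 4575 | 61 | 96.5 | 95.0 | 304 | 465 |
| s298 | 1931 | 28 | 98.7 | 91.5 | 71 | 85 |
| seq | 1750 | 35 | 55.9 | 47.0 | 55 | 67 |
| spla | 3690 | 56 | 94.7 | 74.0 | 203 | 234 |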
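The summary figures quoted earlier (13 of 17 circuits improved, up to 33% lower delay) can be recomputed from the delay columns of the two tables. The short script below copies those columns verbatim; the variable and function names are illustrative.

```python
# (circuit, VPR delay in ns, LR_ROUTE delay in ns), copied from the two result tables.
delays = [
    ("Alu4", 46.6, 46.2), ("Apex2", 61.5, 49.3), ("Apex4", 45.4, 48.9),
    ("Bigkey", 41.7, 27.8), ("Clma", 125.0, 96.4), ("Des", 43.5, 48.1),
    ("Diffeq", 48.8, 48.6), ("Dsip", 29.6, 27.6), ("Elliptic", 77.1, 71.3),
    ("Ex1010", 83.5, 75.2), ("Ex5p", 44.8, 43.7), ("frisc", 81.5, 84.3),
    ("misex3", 42.5, 49.4), ("pdc", 96.5, 95.0), ("s298", 98.7, 91.5),
    ("seq", 55.9, 47.0), ("spla", 94.7, 74.0),
]


def improvement(vpr: float, lr: float) -> float:
    """Relative reduction in critical path delay of LR_ROUTE versus VPR."""
    return (vpr - lr) / vpr


# Circuits where LR_ROUTE gives a strictly lower delay than VPR.
wins = [name for name, vpr, lr in delays if lr < vpr]
# Circuit with the largest relative delay reduction.
best_name, best_gain = max(
    ((name, improvement(vpr, lr)) for name, vpr, lr in delays),
    key=lambda pair: pair[1],
)
print(len(wins), "of", len(delays), "circuits improved; best:", best_name, round(100 * best_gain), "%")
# prints: 13 of 17 circuits improved; best: Bigkey 33 %
```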

26

Conclusion

- A new timing-driven routing algorithm for FPGAs
  - Find a routing with minimum critical path delay for a given placed circuit.
  - Handling of the timing constraints by Lagrangian relaxation.
- Routing results are better than those of the VPR router.