TKT-1212 Digitaalijärjestelmien Suunnittelu (Digital Systems Design)

Transcript of TKT-1212 Digitaalijärjestelmien Suunnittelu

Page 1: Title

TKT-1212 Digitaalijärjestelmien Suunnittelu
FSM implementations, Practical VHDL synthesis
Erno Salminen, 2011

[Figure: Moore machine block diagram with labels Input, Current state, Next State, Output]

Page 2: Acknowledgements

- Prof. Pong P. Chu provided the "official" slides for the book, which is gratefully acknowledged
  - See also: http://academic.csuohio.edu/chu_p/
- Most slides were made by Ari Kulmala and other previous lecturers (Teemu Pitkänen, Konsta Punkka, Mikko Alho…)

Page 3: Finite State Machines

- All the previous teachings are still valid; only the description style changes

[Figure: state diagram of a media-player FSM. States: Stop, Play, Play x2 (y=z3), Rewplay x2 (y=z4), Next_track (y=z5), Prev_track (y=z6); Stop and Play carry outputs y=z1 and y=z2. Transition labels: in=pl, in=st, in=x2, in=-x2, in=next, in=prev, in=others, always]

Page 4: Several Implementation styles

- Two basic flavors: Mealy and Moore
  - In both cases, one must select whether to include the output registers or not
- Moreover, you decide the VHDL presentation of the FSM
  - Description style: how many processes
  - Encoding of states, if not automated in synthesis

[Figures: Mealy and Moore block diagrams with labels Input, Current state, Next State, Output]

Page 5: Several Implementation styles (2)

1. 1 sequential process
2. 2 processes
   a) Seq: curr. state registers and output, Comb: next state logic
   b) Seq: curr state, Comb: next state, output
   c) Seq: next and curr state, Comb: output
3. 3 processes (Seq: curr state, Comb: output, Comb: next state logic separated)

[Figures: Mealy and Moore block diagrams with labels Input, Current state, Next State, Output]

Page 6: Implementations

General form:
- We define an own type for the state machine states
- ALWAYS use an enumeration type for state machine states
  - Synthesis software, e.g. Quartus II, does not recognize it otherwise

    architecture rtl of traffic_light is
      -- Enumeration type for states
      type states_type is (red, yellow, green);
      -- init state explicitly defined in reset, not here
      -- Signal ctrl_r is the current state register
      signal ctrl_r : states_type;
      ...
    begin -- rtl

Page 7: Implementations: 2seg-Moore

Moore, two-segment coding style

    sync_ps : process (clk, rst_n)
    begin -- process sync_ps
      if rst_n = '0' then
        <INIT STATE OF THE FSM>
      elsif clk'event and clk = '1' then
        <Synchronous part of the FSM; assign next state to curr state>
      end if;
    end process sync_ps;

    comb_ns_output : process (ctrl_r, input)
    begin -- process comb_ns_output
      <Combinational part; define next state, define output>
    end process comb_ns_output;

    end rtl;
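The two-segment Moore template can be filled in as follows. This is only a minimal sketch, not the course's traffic_light.vhd: the entity name, the next_in request input, and the bit order of light_out are assumptions made for illustration, and the yellow-light timer is left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; port names are illustrative assumptions.
entity moore_2seg_sketch is
  port (
    clk       : in  std_logic;
    rst_n     : in  std_logic;
    next_in   : in  std_logic;                     -- assumed "advance" request
    light_out : out std_logic_vector(2 downto 0)   -- assumed order: red, yellow, green
  );
end moore_2seg_sketch;

architecture rtl of moore_2seg_sketch is
  type states_type is (red, yellow, green);
  signal ctrl_r    : states_type;  -- current state register
  signal next_ctrl : states_type;  -- combinational next state
begin

  -- Segment 1: state register only
  sync_ps : process (clk, rst_n)
  begin
    if rst_n = '0' then
      ctrl_r <= red;
    elsif clk'event and clk = '1' then
      ctrl_r <= next_ctrl;
    end if;
  end process sync_ps;

  -- Segment 2: next-state and output logic.
  -- Moore: light_out depends on ctrl_r only, never directly on next_in.
  comb_ns_output : process (ctrl_r, next_in)
  begin
    next_ctrl <= ctrl_r;  -- default: stay in the current state
    case ctrl_r is
      when red =>
        light_out <= "100";
        if next_in = '1' then
          next_ctrl <= green;
        end if;
      when green =>
        light_out <= "001";
        if next_in = '1' then
          next_ctrl <= yellow;
        end if;
      when yellow =>
        light_out <= "010";
        if next_in = '1' then
          next_ctrl <= red;
        end if;
    end case;
  end process comb_ns_output;

end rtl;
```

Every path through the combinational process assigns both next_ctrl and light_out, so no latches are inferred.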

Page 8: Implementations: 2seg-Mealy

Mealy, two-segment coding style

    sync_ps : process (clk, rst_n)
    begin -- process sync_ps
      if rst_n = '0' then
        <INIT STATE OF THE FSM>
      elsif clk'event and clk = '1' then
        <Synchronous part of the FSM; assign next state to curr state>
      end if;
    end process sync_ps;

    comb_ns_output : process (ctrl_r, input)
    begin -- process comb_ns_output
      <Combinational part; define next state, define output
       (Looks same as Moore, but here also the input is considered for output)>
    end process comb_ns_output;

    end rtl;

Page 9: Coding style: 1seg-Moore/Reg-Mealy

1-segment style

    sync_all : process (clk, rst_n)
    begin -- process sync_all
      if rst_n = '0' then
        <INIT STATE OF THE FSM>
      elsif clk'event and clk = '1' then
        <define next state and assign next state to curr state>
        <define output>   -- Output becomes register!
      end if;
    end process sync_all;
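A one-segment registered Mealy machine with a timer might look like the sketch below. The generic yellow_length_g appears in the synthesis log later in the slides; the entity name, the next_in input, and the light_out encoding are assumptions, and this is not the actual course code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; only yellow_length_g is taken from the slides.
entity reg_mealy_1seg_sketch is
  generic (
    yellow_length_g : integer := 10   -- clock cycles to keep yellow on
  );
  port (
    clk       : in  std_logic;
    rst_n     : in  std_logic;
    next_in   : in  std_logic;                     -- assumed "advance" request
    light_out : out std_logic_vector(2 downto 0)   -- assumed order: red, yellow, green
  );
end reg_mealy_1seg_sketch;

architecture rtl of reg_mealy_1seg_sketch is
  type states_type is (red, yellow, green);
  signal ctrl_r  : states_type;
  signal count_r : integer range 0 to yellow_length_g-1;
begin

  -- Everything is assigned inside the clocked branch, so the state,
  -- the counter and the outputs all become registers.
  sync_all : process (clk, rst_n)
  begin
    if rst_n = '0' then
      ctrl_r    <= red;
      count_r   <= 0;
      light_out <= "100";
    elsif clk'event and clk = '1' then
      case ctrl_r is
        when red =>
          if next_in = '1' then
            ctrl_r    <= green;
            light_out <= "001";   -- output follows the input: registered Mealy
          end if;
        when green =>
          if next_in = '1' then
            ctrl_r    <= yellow;
            count_r   <= 0;
            light_out <= "010";
          end if;
        when yellow =>
          if count_r = yellow_length_g-1 then
            ctrl_r    <= red;
            light_out <= "100";
          else
            count_r <= count_r + 1;
          end if;
      end case;
    end if;
  end process sync_all;

end rtl;
```

The new output value is written on the same clock edge as the state transition, which is why the registered Mealy style has one cycle less output latency than a Moore machine with separately registered outputs.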

Page 10: Coding style: 3seg-Moore/Mealy

3-segment style

    sync_ps : process (clk, rst_n)
    begin -- process sync_ps
      if rst_n = '0' then
        <INIT STATE OF THE FSM>
      elsif clk'event and clk = '1' then
        <curr state assignment>
      end if;
    end process sync_ps;

    comb_ns : process (ctrl_r, input)
    begin -- process comb_ns
      <Combinational part; define next state>
    end process comb_ns;

    comb_output : process (ctrl_r, input)
    begin -- process comb_output
      <Combinational part; define output>
    end process comb_output;

Page 11: Examples: Traffic light FSM implemented with various styles

- Examples shown as VHDL code
- They also show the usage of a counter in a state machine
  - Acts as a timer for showing the yellow light
- Output latency is larger for Moore and registered Mealy
  - However, all the implementations keep the yellow light on for the same amount of time

Page 12: Comparison of implementation styles: Coding style

1-segment:
- Automatically inferred output registers
- Simple view to the design, everything in one place
- Safe, registered Mealy machine is easy to implement with this style
- Recommended (as opposed to the course book!)

2-segment, 3-segment:
- Only way to implement unregistered outputs to FSMs
- Modular
- Long ago synthesis tools did not recognize 1-segment FSMs correctly
  - Not the case anymore
- Recommended style in many books, partially because of those limitations of the old tools
- Useful for quite simple control machines that do not include e.g. delay counters
- Complex state machines are cumbersome to read
  - The code does not proceed smoothly, one has to jump around the code
  - The same conditions may be repeated in many processes


Page 14: Quartus II design flow after you've simulated and verified the design

[Figure: design flow; the annotations of the steps are:]
1. Generic gate-level representation
2. Places and routes the logic into a device
3. Converts the post-fit netlist into an FPGA programming file
4. Analyzes and validates the timing performance of all logic in a design
5. Run on FPGA

Page 15: Examples: state diagram, Tool A

[Figure: state diagram extracted by Tool A]

Note the encoding

Page 16: RTL Synthesis result: Tool A

[Figure: RTL schematic of the registered Mealy machine, traffic_light.vhd. Annotations:]
- Register for output bit 2
- State register
- Registers for output bits 1..0
- Combinatorial output logic
- Next state logic, incl. counter for showing yellow light long enough
- Comb path from input to output logic

Page 17: Technology schematic, Tool A

[Figure: technology schematic of the registered Mealy machine, traffic_light.vhd. Annotations:]
- Single flip-flops
- Look-up tables, max 6 inputs

Page 18: Synthesis result: Tool B

Same VHDL (registered Mealy machine, traffic_light.vhd), different result

    # Info: [45144]: Extracted FSM in module work.traffic_light(rtl){generic map (n_colors_g => 3 yellow_length_g => 10)}, with state variable = ctrl_r[1:0], async set/reset state(s) = 00 , number of states = 3.
    # Info: [45144]: Preserving the original encoding in 3 state FSM
    # Info: [45144]: FSM: State encoding table.
    # Info: [40000]: FSM: Index Literal Encoding
    # Info: [40000]: FSM: 0     00      00
    # Info: [40000]: FSM: 1     01      01
    # Info: [40000]: FSM: 2     10      10

Note the different state encoding

Page 19: Technology schematic, Tool B

[Figure: technology schematic of the registered Mealy machine, traffic_light.vhd. Annotations:]
- LUTs
- Multi-bit registers

Page 20: Physical placement on-chip

- The traffic_light.vhd placed and routed
- Stratix 2S180, 143 000 ALUTs (~LUTs)
- Quite a lot of unused resources...

Page 21: Implementation area and freq

- Note that no strict generalization can be made about the "betterness"
- Tool A
  - Total ALUTs: 15
  - ALUTs with register: 10
- Tool B
  - LUTs: 16
  - Registers: 9
- The one register difference is due to the different state encoding
  - The state encoding can be explicitly defined or left to the tool to choose (as in this case)

Page 22: Synthesis of different VHDLs

    Style                              AREA [LUT]   AREA [reg]   Lines of Code
    mealy (single)                     16           9            104
    mealy (output separated)           13           6            126
    mealy_2proc. (out+ns separated)    11           6            125
    mealy_3proc                        11           6            150
    Moore                              11           6            108

- Functionally equivalent
- Timing aspects vary
  - Different max frequency
- Only the "Mealy single" has output registers (3 bit)
- Coding style has an effect even with small designs
- Readability of the code is just as crucial!

Page 23: Synthesis summary

- Different tools produce slightly different results even in small designs
  - Notable effect also achievable by tuning the tool settings
- Synthesis tools are heuristic due to the very large design space
  - Even a single tool may produce slightly different results on different runs!
  - Optimization heuristics utilize randomness
- However, no tool can convert a bad design into a good one!
- Make sure that you are aware of which signals of the shown codes have been implemented as registers!

Page 24: Comparison of implementation styles: Moore and Mealy

- Generally, we want the outputs to be registered
- A Mealy machine is dangerous due to possible long combinational paths (or loops)
- For registered outputs, use a registered Mealy machine
  - Outputs are registered, but it has shorter latency than a Moore machine with registered outputs
- Otherwise, opt for a Moore machine

Page 25: Logic synthesis

Page 26: Foreword: VHDL and synthesis

- The main goal of writing VHDL is to generate a synthesizable description
- This lecture presents some practical examples of how to write code that is good for synthesis
- The quality of the design is much affected by the coding style
  - One must be able to choose structures that synthesize the best

Page 27: Simplest example

- In this course, we concentrate on RTL synthesis: how HDL is converted into a netlist of basic gates and flip-flops
  - Technology mapping, routing and placement are beyond the scope of this course
- Example: arith_result <= a + b + c - 1;
  - The resulting combinatorial logic is straightforward
  - Inclusion of DFF depends on the context (inside a sync process or not)
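The effect of the context can be seen by writing the same expression twice. The following is a minimal sketch: the entity name, the 8-bit unsigned operands, and the second output arith_result_r are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; widths and the registered copy are assumptions.
entity arith_sketch is
  port (
    clk            : in  std_logic;
    rst_n          : in  std_logic;
    a, b, c        : in  unsigned(7 downto 0);
    arith_result   : out unsigned(7 downto 0);  -- combinational version
    arith_result_r : out unsigned(7 downto 0)   -- registered version
  );
end arith_sketch;

architecture rtl of arith_sketch is
begin

  -- Outside a synchronous process: adders/subtractor only, no DFF
  arith_result <= a + b + c - 1;

  -- Inside a synchronous process: the same logic followed by a register
  sync : process (clk, rst_n)
  begin
    if rst_n = '0' then
      arith_result_r <= (others => '0');
    elsif clk'event and clk = '1' then
      arith_result_r <= a + b + c - 1;
    end if;
  end process sync;

end rtl;
```

With numeric_std the result keeps the operand width, so the sum wraps around on overflow; a wider result would need explicit resize calls.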

Page 28: Example: if-else

[Figure: Conceptual structure of nested if-clauses in HDL]

[Figure: Conceptual hardware realization]

Page 29: Example: selected assignment

[Fig 1. Basic form of synthesized logic]

- Similar to if-else: note the similarity to an if-clause inside a process

[Fig 2. Full logic]

Page 30: Logic from case-clause

- This example has 2 outputs but again the logic is similar to the if-clause

[Figure: Conceptual structure of case-clause in HDL]

[Figure: Conceptual hardware realization]

Page 31: More complex selected assignment

[Fig 1. Conceptual hardware realization]

[Fig 2. Full logic]

Page 32: Example: loop

[Figure: VHDL loop example and its unrolled equivalent]

- Bounds must be static, like here (3-1 downto 0)
- The loop is "unrolled" in logic
  - Everything happens in parallel!
- Hence, the loop is equivalent to the bit-by-bit assignments shown in the figure
- Sidenote: y <= a xor b is even better with std_logic_vectors, but then we would not have an example of a loop
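The slide's own code was an image and is not in the transcript, so the following is only a reconstruction under assumptions: the entity name and the 3-bit ports a, b and y are guessed from the loop bounds and the sidenote.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity reconstructing the idea of the slide.
entity xor_loop_sketch is
  port (
    a, b : in  std_logic_vector(2 downto 0);
    y    : out std_logic_vector(2 downto 0)
  );
end xor_loop_sketch;

architecture rtl of xor_loop_sketch is
begin

  xor_bits : process (a, b)
  begin
    -- Static bounds: synthesis unrolls this into three parallel XOR gates
    for i in 3-1 downto 0 loop
      y(i) <= a(i) xor b(i);
    end loop;
    -- Unrolled equivalent:
    --   y(2) <= a(2) xor b(2);
    --   y(1) <= a(1) xor b(1);
    --   y(0) <= a(0) xor b(0);
  end process xor_bits;

end rtl;
```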

Page 33: Loops: example2 in SW

With software:

    for (i = 0; i < max_c; i++) {
      b[i] = a[i] + i;
    }

Iterative calculation for b[i] (simplified):
1. Calculate for-clause
2. Fetch a
3. Add
4. Store b
5. Increment i
6. Go back to 1

Takes a lot of clock cycles (several even with loop-unrolling)

Page 34: Loops: example2 in HW

Hardware:

    add_i : for i in 0 to max_c-1 generate
      b(i) <= a(i) + i;
    end generate add_i;

- Generates <max_c> parallel computation units
  - High area overhead
- Result generated in 1 clock cycle, very fast!
- However, in HW we can adjust the area-performance ratio
  - Pipeline: e.g. half of the result on the first cycle, rest on the second
  - Fully sequential (the SW case), still faster than SW
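A self-contained version of the generate example needs an array type for a and b. The package name, the word_array type, the 8-bit element width, and the value of max_c below are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package: array ports need a type declared outside the entity.
package add_index_pkg is
  constant max_c : integer := 4;   -- assumed value
  type word_array is array (0 to max_c-1) of unsigned(7 downto 0);
end package add_index_pkg;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.add_index_pkg.all;

entity add_index_sketch is
  port (
    a : in  word_array;
    b : out word_array
  );
end add_index_sketch;

architecture rtl of add_index_sketch is
begin

  -- One adder per element; all max_c additions happen in parallel
  add_i : for i in 0 to max_c-1 generate
    b(i) <= a(i) + i;   -- i is a static constant inside each instance
  end generate add_i;

end rtl;
```

Because i is constant in every generated instance, the synthesis tool can simplify each adder; the i = 0 instance reduces to plain wires.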

Page 35: Basic optimizations

- Constant operands simplify Boolean equations
  - For example, consider a 4-bit comparator
    a) x = y
    b) x = 0
- Smallest possible data width is of course desired
- Iterative algorithms trade area for delay
- Even the most basic operations have different costs
- One can share complex units via multiplexing

Page 36: Sharing

- Arithmetic operators
  - Large implementation
  - Limited optimization by synthesis software
  - Data width has a major impact
- "Optimization" can be achieved by "sharing" in RT level coding
  - Operator sharing
  - Functionality sharing

Page 37: Sharing 2

- Possible when the "value expressions" in the priority network and multiplexing network are mutually exclusive:
  - Only one result is routed to the output
- The generic format of conditional signal assignment guarantees this:

    sig_name <= value_expr_1 when boolean_expr_1 else
                value_expr_2 when boolean_expr_2 else
                value_expr_3 when boolean_expr_3 else
                . . .
                value_expr_n;

Page 38: Sharing example 1

NOTE: Coded outside a process

Original code:

    r <= a+b when boolean_exp else
         a+c;

Revised code (enables sharing):

    src0 <= b when boolean_exp else
            c;
    r <= a + src0;

Page 39: Sharing example 1: area and delay

[Figure: original circuit] Area: 2 adders, 1 mux, Bool. Delay: [formula in figure]

[Figure: revised circuit] Area: 1 adder, 1 mux, Bool. Delay: [formula in figure]

However, no free lunch in general: sharing reduces A but increases T in this case

Page 40: Sharing example 2

NOTE: Coded inside a process. Equivalent with the previous example.

Original code:

    process(a,b,c,d,...)
    begin
      if boolean_exp then
        r <= a+b;
      else
        r <= a+c;
      end if;
    end process;

Revised code:

    process(a,b,c,d,...)
    begin
      if boolean_exp then
        src0 <= a;
        src1 <= b;
      else
        src0 <= a;
        src1 <= c;
      end if;
    end process;
    r <= src0 + src1;

Page 41: Guidelines for synthesizable HDL

Page 42: Logic design basics still apply

- Modularize the design into components
  - Easier to design single components
  - Easier to upgrade
- Use a version control system (e.g. SVN or git)
- Asynchronous reset is used only to initialize
  - Not part of the functionality
  - Hence, you don't force reset from your code
  - Use a separate clear signal or similar if needed. That is checked in the edge-sensitive if-branch (lec 12)
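A separate clear signal checked inside the edge-sensitive branch could be sketched as follows. The entity name, the ports clear_in and enable_in, and the 4-bit counter are assumptions chosen only to make the example complete.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical counter with asynchronous reset and synchronous clear.
entity clear_sketch is
  port (
    clk       : in  std_logic;
    rst_n     : in  std_logic;   -- asynchronous reset, initialization only
    clear_in  : in  std_logic;   -- functional clear, sampled on the clock edge
    enable_in : in  std_logic;
    count_out : out unsigned(3 downto 0)
  );
end clear_sketch;

architecture rtl of clear_sketch is
  signal count_r : unsigned(3 downto 0);
begin

  sync : process (clk, rst_n)
  begin
    if rst_n = '0' then
      count_r <= (others => '0');
    elsif clk'event and clk = '1' then
      -- The clear is part of the functionality, so it lives here
      if clear_in = '1' then
        count_r <= (others => '0');
      elsif enable_in = '1' then
        count_r <= count_r + 1;
      end if;
    end if;
  end process sync;

  count_out <= count_r;

end rtl;
```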

Page 43: General guidelines and hints

- Use only synthesizable code!
- Use std_logic data types and the numeric_std package
- Use only descending range in the arrays (e.g. downto)

    signal write_r   : std_logic_vector(data_width_g-1 downto 0);  -- descending range, use this
    signal write_out : std_logic_vector(0 to data_width_g-1);      -- ascending range, avoid

- Use parentheses to show the order of evaluation
  - a and (x or b)
- Check the VHDL coding rules used in the course
  - Not just tidiness, affects also performance/area of the design

Page 44: Points to note

- Assignment delay, such as "a <= b after x ns", is problematic
  - The assignment will be synthesized but not the delay
  - This example will produce a simple wire
  - If you fix a problem in your code with such a delay, the fix won't work after synthesis
  - Only place to use non-synthesizable code is testbenches
- Variables are synthesizable, but… it is harder to figure out the resulting logic than with signals (lec12)
- High-impedance state 'Z' is synthesizable, but… simulation results and real HW do not always match (lec12)
- Synthesis tools are great, but… they behave differently. Some structures are not accepted by all tools

Page 45: Notes on combinational circuit design

- Always write a complete sensitivity list
- In every comb. process invocation, every signal must be assigned a value
  - Otherwise latches are generated to hold the previous values
  - We practically never want to have latches from RTL comb. processes
  - Usual suspect: incomplete if-else or such
- Avoid combinational loops!
  - The same signal on both sides of an assignment
  - E.g. a <= a+1; -- aargh!
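One common way to satisfy the "every signal assigned on every invocation" rule is a default assignment at the top of the process. The sketch below uses assumed names (sel_in, a_in, b_in, y_out) and is only an illustration of that idiom.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity illustrating latch-free combinational code.
entity comb_default_sketch is
  port (
    sel_in : in  std_logic_vector(1 downto 0);
    a_in   : in  std_logic;
    b_in   : in  std_logic;
    y_out  : out std_logic
  );
end comb_default_sketch;

architecture rtl of comb_default_sketch is
begin

  -- Complete sensitivity list: every signal that is read
  comb : process (sel_in, a_in, b_in)
  begin
    -- Default assignment: y_out gets a value on every invocation,
    -- so the incomplete if below cannot infer a latch.
    y_out <= '0';
    if sel_in = "01" then
      y_out <= a_in;
    elsif sel_in = "10" then
      y_out <= b_in;
    end if;
  end process comb;

end rtl;
```

Without the default line, y_out would have to hold its previous value for sel_in = "00" and "11", which is exactly the latch the slide warns about.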

Page 46: Note on sequential synchronous circuit design

- In a synchronous process, there are only two branches

    if rst_n = '0' then                  -- asynchronous reset (active low)
      <INIT>
    elsif clk'event and clk = '1' then   -- rising clock edge
      <Synchronous part>
    end if;

- Only clk and rst_n in the sensitivity list!
- No else-branch, no other elsif branches, and no code outside the branches
  - Some tools support more branches and some don't => behavior undefined

Page 47: The true Devil - never do!

    ENTITY bad_counter IS
      PORT (
        reset, clk, inc : IN     STD_LOGIC;
        cnt             : BUFFER INTEGER RANGE 0 TO 4);
    END bad_counter;

    ARCHITECTURE example OF bad_counter IS
    BEGIN -- example
      PROCESS (clk, reset, inc, cnt)
      BEGIN -- PROCESS
        IF reset = '0' THEN                  -- asynchronous reset (active low)
          cnt <= 0;
        ELSIF inc = '1' THEN
          cnt <= cnt+1;
        ELSIF clk'EVENT AND clk = '1' THEN   -- rising clock edge
          cnt <= cnt-1;
        END IF;
      END PROCESS;
    END example;

What is wrong?

Generates a pseudo-random machine.
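Two of the problems are that inc is tested in a level-sensitive branch outside the clock edge (so cnt <= cnt+1 forms a combinational loop) and that cnt is in the sensitivity list. A repaired version might look like the sketch below. The intended behavior is not stated on the slide, so counting up when inc = '1', counting down otherwise, and saturating at the range limits are assumptions, as are the names good_counter and cnt_r.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical repaired counter; the behavior is one possible interpretation.
entity good_counter is
  port (
    rst_n : in  std_logic;
    clk   : in  std_logic;
    inc   : in  std_logic;
    cnt   : out integer range 0 to 4
  );
end good_counter;

architecture rtl of good_counter is
  signal cnt_r : integer range 0 to 4;   -- internal register instead of a BUFFER port
begin

  -- Only clk and rst_n in the sensitivity list, only two branches
  sync : process (clk, rst_n)
  begin
    if rst_n = '0' then
      cnt_r <= 0;
    elsif clk'event and clk = '1' then
      if inc = '1' then
        if cnt_r < 4 then        -- saturate instead of leaving the range
          cnt_r <= cnt_r + 1;
        end if;
      else
        if cnt_r > 0 then
          cnt_r <= cnt_r - 1;
        end if;
      end if;
    end if;
  end process sync;

  cnt <= cnt_r;

end rtl;
```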

Page 48: Not supported by synthesis

- Signals in packages (global signals)
- Signal and variable initialization
  - Typically ignored (there are exceptions, e.g. Xilinx FPGA synthesis)
- Unconstrained while and for loops
- More than one 'event in a process
- Multiple wait statements
- Physical types, for example time
- Access types
- File types
- Guard expression
- (Sensitivity lists, delays and asserts are ignored)

Page 49: Types

- Using own types may significantly clarify the code
- Declaration:

    TYPE location IS
      RECORD
        x     : INTEGER range 0 to location_max_c-1;
        y     : INTEGER range 0 to location_max_c-1;
        valid : std_logic;
      END RECORD;

    TYPE locations_type IS ARRAY (0 to 3) of location;
    SIGNAL loc_r : locations_type;

- Usage:

    For i in 0 to 3 loop
      loc_r(i).x     <= i;
      loc_r(i).y     <= 3-i;
      loc_r(i).valid <= '1';
    End loop;

- Resulting contents:

    Element     x, y, valid
    loc_r(0)    0, 3, '1'
    loc_r(1)    1, 2, '1'
    loc_r(2)    2, 1, '1'
    loc_r(3)    3, 0, '1'

Page 50: Types #2

- Initialization of an array constant:

    constant a_bound_c : integer := 2;

    type vector_2d is array (0 to a_bound_c-1) of std_logic_vector(1 downto 0);
    type vector_3d is array (0 to a_bound_c-1) of vector_2d;

    constant initial_values_c : vector_3d := (("00", "01"),
                                              ("10", "11"));

- You may split the initialization to multiple lines to increase readability
- Contents of initial_values_c (c0 = vertical, c1 = horizontal):

    c0\c1    0       1
    0        "00"    "01"
    1        "10"    "11"

Page 51: Types #3

- Special case, a one-element array must be initialized with named association:

    constant ip_types_c : integer := 1;

    type ip_vect is array (0 to ip_types_c-1) of integer;

    constant ip_amount_c  : ip_vect := (0 => 1);  -- right way
    constant ip_amount2_c : ip_vect := (1);       -- does not work!
    constant ip_amount2_c : ip_vect := 1;         -- does not work!

    ** Error: rtm_pkg.vhd(20): Integer literal 1 is not of type ip_vect.

- There is only a single value but it is an array nonetheless

Page 52: Coding style effect

- Depends on the used synthesis software
- However, if style x is clearly better than y in synthesis tool A, most probably it won't be much worse in B (although it may yield the same result)
- Here, we use the Quartus II Altera FPGA synthesis tool as an example
  - Used on the course
  - Source: Quartus II Handbook (for ver. 6.1) + Ref 1.

Page 53: Multiplexers

- Multiplexers form a large portion of the logic utilization
  - E.g. 30% of Nios II/f soft-core processor area is muxes
- An if-structure generates a priority multiplexer

    IF    cond1 THEN z <= a;
    ELSIF cond2 THEN z <= b;
    ELSIF cond3 THEN z <= c;
    ELSE             z <= d;
    END IF;

- It is preferred to use case

    CASE sel IS
      WHEN cond1  => z <= a;
      WHEN cond2  => z <= b;
      WHEN cond3  => z <= c;
      WHEN OTHERS => z <= d;
    END CASE;

  - Creates a balanced multiplexer tree

Page 54: Multiplexers #2

- Do not let the simplicity of VHDL trick you
- Multiplexing four 32-bit words requires 130 input bits (2 control bits + 128 data bits) and 32 output bits
  - A lot of routing
  - 32 x 4-to-1 muxes
  - A 4-to-1 mux requires three 2-to-1 muxes
  - One 2-to-1 mux is implementable in one basic logic element
  - => 3*32 = 96 2-to-1 muxes required, 96 LEs consumed

Page 55: TKT-1212 Digitaalijärjestelmien Suunnittelu€¦ · Several Implementation styles Two basic flavors: Mealy and Moore In bhboth cases, one must select whhhether to incldludeth he

Shifters

Variable amount shifting is area-hungry

Assume a 32-bit vector that can be shifted by an arbitrary amount to the left or right
  Needs a 32-to-1 mux for every result bit!
  32-to-1 mux = 31 2-to-1 muxes = 31 LEs
  32*31 = 992 2-to-1 muxes (=LEs)

Non-constant shifters are generally not supported (automatically) by synthesis tools

An FPGA-specific trick is to use the embedded multipliers to do the dynamic shifting
  Multiplying by 2^n shifts the result to the left by n
  Faster and more area-efficient than doing this with LEs
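
The multiplier trick can be sketched as follows. The entity name mul_shifter, the 16-bit width and the signal names are assumptions for illustration, and whether the "*" operator actually lands in an embedded multiplier depends on the tool and the device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: variable left shift done by multiplication.
entity mul_shifter is
  port (
    data_in  : in  std_logic_vector(15 downto 0);
    amount   : in  std_logic_vector(3 downto 0);
    data_out : out std_logic_vector(15 downto 0));
end entity mul_shifter;

architecture rtl of mul_shifter is
  signal one_hot : unsigned(15 downto 0);  -- holds 2**amount
  signal product : unsigned(31 downto 0);  -- 16 x 16 -> 32 bits
begin

  -- Decoder: sets exactly one bit, so one_hot equals 2**amount.
  decode : process (amount)
  begin
    one_hot <= (others => '0');
    one_hot(to_integer(unsigned(amount))) <= '1';
  end process decode;

  -- Multiplying by 2**amount shifts data_in left by amount positions.
  product <= unsigned(data_in) * one_hot;

  -- Keep the low word; bits shifted past position 15 are discarded.
  data_out <= std_logic_vector(product(15 downto 0));

end architecture rtl;
```

Taking the high word of the product instead would give a right shift by 16 minus amount, which is one way the same multiplier could serve both directions.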


Comparators

<, >, ==

Avoid implementing == in general logic cells.
  Comparators are implementable using arithmetic operations and fast carry chains.
  Calculate a-b and check whether the result is negative, zero, or positive

Synthesis tools should be aware of this automatically...

Recall that x = a[6:0] < b[6:0] is the same as x = signed(a[6:0]-b[6:0])[7]
  The last carry [overflow] of the subtraction

Note: in ASICs it may not be feasible to use arithmetic for comparison
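
The 7-bit identity above can be written out in VHDL. This is a sketch with assumed names (lt_by_sub, diff); the second output uses the built-in "<" operator of numeric_std so that the two results can be compared in simulation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: unsigned "less than" via subtraction.
entity lt_by_sub is
  port (
    a, b   : in  std_logic_vector(6 downto 0);
    lt_sub : out std_logic;   -- result taken from the subtraction
    lt_ref : out std_logic);  -- reference result from the "<" operator
end entity lt_by_sub;

architecture rtl of lt_by_sub is
  signal diff : unsigned(7 downto 0);
begin
  -- Extend both operands by one bit so that the borrow is kept.
  diff <= resize(unsigned(a), 8) - resize(unsigned(b), 8);

  -- Bit 7 of the 8-bit difference is set exactly when a < b.
  lt_sub <= diff(7);

  lt_ref <= '1' when unsigned(a) < unsigned(b) else '0';
end architecture rtl;
```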


An example 0.55 um standard-cell CMOS implementation

Source: RTL Hardware Design

Subscripts: a = area-optimized, d = delay-optimized

Asymptotic cost:
  Nand: area is O(n) and time O(1)
  ">": area is O(n) and time O(n)


Background: Big-O notation for algorithmic complexity

A way to approximate how the cost increases with the number of inputs n

Function f(n) belongs to class O(g(n)) if n_0 and c can be found to satisfy:

f(n) < c*g(n) for any n, n > n_0

g(n) is a simple function: 1, log2(n), n, n*log2(n), n^2, n^3, 2^n

Following are O(n^2):
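
As a worked check of the definition, take a function chosen here purely for illustration (it is an assumption of this sketch, not an example from the course material): f(n) = 3n^2 + 5n + 7 with g(n) = n^2 and c = 4.

```latex
\begin{align*}
f(n) &= 3n^2 + 5n + 7, \qquad g(n) = n^2, \qquad c = 4 \\
c\,g(n) - f(n) &= 4n^2 - (3n^2 + 5n + 7) = n^2 - 5n - 7 \\
n^2 - 5n - 7 &> 0 \quad \text{for all integers } n > 6
  \quad (\text{at } n = 7:\ 49 - 35 - 7 = 7 > 0) \\
\Rightarrow\ f(n) &< c\,g(n) \ \text{for } n > n_0 = 6,
  \ \text{so } f(n) \in O(n^2)
\end{align*}
```

The lower-order terms 5n + 7 only affect the choice of c and n_0, not the class.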


Interpretation of Big-O

Filter out the "interference": constants and less important terms

Algorithms with O(2^n) are intractable, but already O(n^3) is very bad
  Not realistic for a larger n
  Frequently tractable algorithms for a sub-optimal solution exist

One may develop a heuristic algorithm
  They do not guarantee an optimal solution, but usually provide a rather good one with acceptable cost
  Often utilize pseudo-random choices
  For example, simulated annealing and genetic algorithms


E.g., [figure not recoverable; one region is labelled "intractable"]


Specific to FPGA

A lot of registers: use them
  Aggressive pipelining
  Objective is to hide the routing delays as much as possible
  Simple logic stages between registers

Adders
  Generally, it's not beneficial to share adders
  FPGAs often contain (e.g. Altera) special structures for adders
  Sharing of an adder may cost as much as the adder itself

Hard macros
  Use whenever appropriate
  Higher performance than by building one with the FPGA native resources
  "They are there anyway"
  Embedded multipliers and small SRAMs are common
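
The advice on registers and unshared adders can be illustrated with a small pipelined adder tree. This is a sketch under assumed names (pipelined_sum, ab_r, cd_r) and widths: each adder gets its own hardware, and a register follows every addition so that only one adder plus routing sits between two registers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: a + b + c + d in two pipeline stages.
entity pipelined_sum is
  port (
    clk, rst_n : in  std_logic;
    a, b, c, d : in  unsigned(15 downto 0);
    sum        : out unsigned(17 downto 0));
end entity pipelined_sum;

architecture rtl of pipelined_sum is
  signal ab_r, cd_r : unsigned(16 downto 0);  -- stage-1 registers
begin

  pipeline : process (clk, rst_n)
  begin
    if rst_n = '0' then
      ab_r <= (others => '0');
      cd_r <= (others => '0');
      sum  <= (others => '0');
    elsif rising_edge(clk) then
      -- Stage 1: two independent adders, no sharing.
      ab_r <= resize(a, 17) + resize(b, 17);
      cd_r <= resize(c, 17) + resize(d, 17);
      -- Stage 2: adds the registered partial sums of the previous cycle.
      sum  <= resize(ab_r, 18) + resize(cd_r, 18);
    end if;
  end process pipeline;

end architecture rtl;
```

The result appears two clock cycles after the inputs, which is the usual price of pipelining: higher clock frequency in exchange for latency.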


FPGA #2

Get to know the properties of the device

E.g. FPGA on-chip memories are typically multiples of 9 bits wide
  The ninth bit can be used for
    Control
    Parity bit
  Otherwise, it is wasted!

Memories are typically dual-ported, take advantage of this
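
One way to use the ninth bit is to store a parity bit next to each data byte. The sketch below assumes a 512 x 9 memory with one write port and one read port; the entity name ram_9bit, the port names and the parity helper are hypothetical. Whether a given tool maps this description onto an on-chip memory block depends on the tool and the device, so the inference guidelines of the synthesis tool should be checked.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: 8 data bits + 1 parity bit per word.
entity ram_9bit is
  port (
    clk          : in  std_logic;
    we           : in  std_logic;
    wr_addr      : in  unsigned(8 downto 0);
    rd_addr      : in  unsigned(8 downto 0);
    wr_data      : in  std_logic_vector(7 downto 0);
    rd_data      : out std_logic_vector(7 downto 0);
    rd_parity_ok : out std_logic);
end entity ram_9bit;

architecture rtl of ram_9bit is
  type ram_t is array (0 to 511) of std_logic_vector(8 downto 0);
  signal ram     : ram_t;
  signal rd_word : std_logic_vector(8 downto 0);

  -- XOR of all bits of v.
  function parity (v : std_logic_vector) return std_logic is
    variable p : std_logic := '0';
  begin
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end function parity;
begin

  -- Separate write and read addresses use both ports of the memory.
  mem : process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        -- Bit 8 holds the parity of the data byte (even parity overall).
        ram(to_integer(wr_addr)) <= parity(wr_data) & wr_data;
      end if;
      rd_word <= ram(to_integer(rd_addr));
    end if;
  end process mem;

  rd_data      <= rd_word(7 downto 0);
  -- With even parity, the XOR over all nine stored bits must be '0'.
  rd_parity_ok <= '1' when parity(rd_word) = '0' else '0';

end architecture rtl;
```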


Conclusions

Finite state machines can be coded in a variety of ways
  Prefer simplicity, according to in-house coding rules

Coding style has a profound effect on the quality of the hardware
  Area, max clock frequency
  Loops
  Complex assignment logic creates a sea of multiplexers
  E.g. variable amount left-right shifter

Synthesis tools create different but functionally equivalent netlists even for small designs

Know your FPGA!
  You might save area and time if using some hard-coded macros
  However, these are tricks that you should only use on the final optimization phase


References

1. James Ball, Designing Soft-Core Processors for FPGAs. In book "Processor Design. System-on-chip computing for ASICs and FPGAs", ed. Jari Nurmi.
