TKT-1212 Digitaalijärjestelmien toteutus · TKT-1212 Digitaalijärjestelmien toteutus Contents...
Lecture 8 – Simulation engines
Erno Salminen 2010
TKT-1212 Digitaalijärjestelmien toteutus
Contents
Modeling dimensions
1. Temporal
2. Data abstraction
3. Functional
4. Structural
Basic simulator types
Event-driven simulation engines
Cycle-based simulation engines
Waveform viewers and summary
Introduction
Verification => functional correctness
Testing => manufacturing correctness
Design Under Verification (DUV)
Much of the design time is spent on developing the verification environment and debugging HDL
Simulation-based verification is the most common approach
The heart is the simulation engine
Models the behavior of the design
Supports high-level verification languages (e.g. Vera), code coverage tools, etc.
Acknowledgements
This presentation is based on the book ”Comprehensive Functional Verification: The Complete Industry Cycle” by Bruce Wile, John Goss, and Wolfgang Roesner
Some examples obtained from Mentor Modelsim manual
Modeling dimensions
“Model is a simplification of reality and every system is best approached through a small set of nearly independent models”
– Booch & Rumbaugh
Modeling dimensions #1: Temporal
1. Temporal dimension: behavior over time, i.e. when the state changes
I/Os represent the state of the DUV
a) Continuous time (analog)
Fairly close to the physical properties of an electrical circuit
b) Discrete time
"Digital": electrical properties abstracted, delta delay, events occur seemingly simultaneously
Clock cycle
No wiring or gate delays
c) Event-based (instruction-level, transaction-level)
Waits for certain events; the time between events varies
Modeling dimensions #2: Data
2. Data abstraction: signal values
a) Continuous range (analog)
E.g. voltage measurement, arbitrarily accurate real numbers
b) Discrete values
Bits, strings, integers, states …
E.g. std_logic: '1', '0', 'U', 'X', 'Z', 'H', 'L' …
Abstract values, e.g. user-defined enumeration states
main, read_io, write_io …
Structs combine several abstract values
Modeling dimensions #3: Function
3. Functional dimension
May just be continuous mathematical functions (e.g. Spice simulator)
Select the level of abstraction
Transistors -> switches -> Boolean logic -> algorithms -> abstract mathematical formula
E.g.:
a) Boolean (half) adder:
z0 = x0 xor y0
c1 = x0 and y0
b) Algorithm:
+ (automatic implementation or user defined)
c) Abstract formula
The whole functionality is specified with abstract, implementation-independent notation
x = (2+y)^z mod a
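To make the abstraction levels concrete, the following minimal Python sketch (not from the lecture; all function names are made up for illustration) writes the same one-bit addition at the Boolean level and at the algorithmic level, and evaluates the abstract formula directly.

```python
def half_adder_boolean(x0: int, y0: int) -> tuple[int, int]:
    """Boolean level: sum and carry from explicit gate functions."""
    z0 = x0 ^ y0   # xor
    c1 = x0 & y0   # and
    return z0, c1


def half_adder_algorithmic(x0: int, y0: int) -> tuple[int, int]:
    """Algorithmic level: use '+' and leave the implementation open."""
    total = x0 + y0
    return total % 2, total // 2


def abstract_formula(y: int, z: int, a: int) -> int:
    """Abstract formula level: x = (2 + y)^z mod a (z >= 0, a > 0 assumed)."""
    return pow(2 + y, z, a)
```

Both adder functions return the same (sum, carry) pair for every input; only the amount of implementation detail differs.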
Modeling dimensions #4: Structure
4. Structural dimension
a) Flat (single black box)
No structure, "just implementation", e.g. FFT as an abstract mathematical formula
b) Hierarchical
Implementation is structural
Many subblocks
Many components that have subblocks
E.g. FFT has subblocks for add and multiply
[Figure: FFT as a single black box between Input and Output, and FFT built from adder (+) and multiplier (*) subblocks between Input and Output]
VHDL support for modeling dimensions
Temporal: Continuous | Gate delay | Clock cycle | Instruction cycle | Events
Data: Continuous | Multivalue bit | Bit | Abstract value | Struct
Functional: Continuous | Switch level | Boolean logic | Algorithmic | Abstract mathematical
Structural: Single black box | Functional blocks | Detailed component hierarchy
(The figure also marks Verilog's support)
Modeling compromise: Speed vs. Accuracy
[Figure: simulation runtime and memory requirements vs. model details and accuracy, for RTL style, gate-level style, and gate-level with detailed delays]
More speed -> more abstract models -> more test cases within the same time, but less accuracy.
Designers should start with high-level models
Basic verification, "something like this could work"
Gradually refine the models if needed
Simulation engines
Simulation engine types
Evaluate the HDL model over time and present its state
Standardization: the HDL language reference manual (LRM) defines the behavior of the simulation engine.
1. Evaluate signals and blocks only at model times for which events are scheduled
Event-driven simulation engines
Majority of simulators are in this category, e.g. ModelSim
Evaluate only the "active parts"
2. Evaluate the model at every point of time along the finest granularity known to the simulation engine
Cycle-based simulation engines
Simpler simulator and hence faster
Commonly evaluate the whole model regardless of activity
Basic HDL simulator block diagram
[Figure: block diagram with the simulation engine, the HDL model (HDL model of DUV plus HDL testbench with stimulus and check), a testbench program (stimulus, check), trace files, coverage files, and interactive GUIs for user control, waveform viewing, testbench debug, and coverage analysis]
Event-Driven Simulation Engines
Event-driven simulation
Most popular approach
Used also in other areas since it is a very general approach
Channels or signals transfer data between blocks
Blocks process data at their inputs, which may initiate a new transfer
Engine activates only those blocks whose inputs change
Algorithm to evaluate time:
"Evaluate signals and blocks only at model times for which events are scheduled"
The model objects need to notify the simulation engine about future changes (scheduling)
=> engine may skip unused time intervals
Simulator keeps track of the "current time" and of the times for which events are scheduled
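The scheduling idea can be sketched in a few lines of Python. This is an illustrative toy, not how any particular simulator is implemented; `run_events`, the `react` callback and the inverter model are names invented for this example. The engine keeps a time-ordered queue, jumps directly from one scheduled time to the next, and lets the model announce future changes by returning new events.

```python
import heapq


def run_events(initial_events, react, until):
    """Process (time, signal, value) events in time order.

    `react(time, signal, value)` is a hypothetical model callback returning a
    list of new (time, signal, value) events, i.e. the model notifies the
    engine about future changes.
    """
    queue = list(initial_events)
    heapq.heapify(queue)
    values = {}          # current value of every signal
    visited_times = []   # model times at which something was evaluated
    while queue and queue[0][0] <= until:
        now, signal, value = heapq.heappop(queue)
        if not visited_times or visited_times[-1] != now:
            visited_times.append(now)
        if values.get(signal) == value:
            continue                      # same value again: no event
        values[signal] = value            # event: the value changes
        for new_event in react(now, signal, value):
            heapq.heappush(queue, new_event)
    return values, visited_times


def inverter_after_5(now, signal, value):
    """Example model: y is the inverse of a, 5 time units later."""
    if signal == "a":
        return [(now + 5, "y", 1 - value)]
    return []
```

With the events `(0, "a", 0)` and `(100, "a", 1)`, the engine only visits times 0, 5, 100 and 105; the interval in between is skipped entirely.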
Event-driven simulation (2)
Scheduling is done in internal time intervals if no delay is specified
Zero-delay scheduling (delta delay), most usual in RTL
Each evaluation in a scheduling step creates a resulting update for the next occurring step
Parallel updates are handled sequentially by the engine, effectively in random order
Signal changes propagate through the model as the scheduling progresses
Feedback loops may cause endless oscillation
User or the engine must take action to interrupt uncontrolled oscillation
Usually bad HDL design, e.g. combinatorial loop
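A rough Python sketch of zero-delay scheduling follows. It assumes a model without feedback loops (there is no protection against oscillation here), and the names `settle` and `CHAIN` are invented for the example. Every block whose input changed is evaluated with the values of the current step, and the results become visible only in the next step, so the order in which the engine visits the blocks inside one step does not matter.

```python
def settle(values, blocks, changed):
    """Run delta steps until nothing changes; return the signals changed per step.

    `blocks` maps an output signal to (input_signals, function).
    `changed` is the set of signals that changed in the current step.
    """
    history = []
    while changed:
        # Evaluate phase: only blocks with a changed input, using current values.
        updates = {}
        for out, (inputs, func) in blocks.items():
            if changed & set(inputs):
                updates[out] = func(*(values[i] for i in inputs))
        # Update phase: results become visible in the next delta step.
        changed = {s for s, v in updates.items() if values[s] != v}
        values.update(updates)
        if changed:
            history.append(sorted(changed))
    return history


# Two inverters in series: a -> b -> c
CHAIN = {
    "b": (("a",), lambda a: 1 - a),
    "c": (("b",), lambda b: 1 - b),
}
```

A change on `a` reaches `b` after one delta and `c` after two, although no model time has passed.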
Simulator example
Simulating a top level having two blocks, which contain 3 sub-blocks: b1, b2, c1.
[Figure: model network with inputs i1, i2; internal signals a1-a7 and s1-s9; outputs o1, o2]
Signal directions imply a partial order for the evaluation.
Sim. example: evaluation over time
A change occurs in i1
Simulator starts to evaluate the model over time in steps (delta delay)
[Figure: the model network (blocks B1, B2, C1; inputs i1, i2; signals a1-a7, s1-s9; outputs o1, o2) above a time axis of alternating signal updates and block evaluations, one per scheduling step (= delta) of the simulator engine; the entries include i1, a1, s1, B1, s3, s4, a3, s7, C1, B2, s6, a6, …, o2]
Blocks might be evaluated twice (due to both inputs)
Another example
[Figure: network with signals s7-s13 and update steps numbered 1-5]
Another example: Update sequence
[Figure: update sequence]
Example of how network view is constructed from VHDL (slide 22)
[Figure: VHDL code and the resulting network, marking user-defined signal names and the simulator's internal signals]
Signal assignment
1. Concurrent: in an architecture body
Signal activity, not top-down flow, controls the execution order
2. Sequential: inside a process, top-down ordering
The value on the right side will be scheduled for the left side
Value is placed on the driver of the left-hand side signal
Multiple concurrent assignments produce multiple drivers
That is legal if the signal type defines a resolution function, which resolves a single value from multiple drivers
Sequential body (i.e. process) may have multiple assignments but they produce only a single driver
Note that this includes both sequential (synchronous) and combinatorial (asynchronous) processes
The last assignment in HDL is "kept"
Last assigned value (in time) is kept unless otherwise stated -> may require state-holding logic in synthesis (DFF or latch)
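What a resolution function does can be illustrated with a simplified Python sketch. It covers only five of the nine std_logic values ('U', 'X', '0', '1', 'Z'); the weak values and don't-care are left out, and the function names are invented for this example.

```python
from functools import reduce


def resolve_pair(a: str, b: str) -> str:
    """Resolve two driver values (simplified subset of std_logic)."""
    if "U" in (a, b):
        return "U"            # uninitialized dominates everything
    if "X" in (a, b):
        return "X"            # unknown dominates the rest
    if a == "Z":
        return b              # high impedance yields to any other value
    if b == "Z":
        return a
    return a if a == b else "X"   # '0' against '1' is a conflict


def resolved(drivers) -> str:
    """Resolve a single signal value from any number of drivers."""
    return reduce(resolve_pair, drivers, "Z")
```

For example, a three-state bus with one active driver resolves to that driver's value, while two drivers fighting with '0' and '1' resolve to 'X'.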
Events and transactions
Signal assignment may have an associated delay
An event occurs when a signal changes its value
When a value is scheduled to be assigned to a target signal after a given time, a transaction has been placed on the driver
A transaction may assign the same value again, but then no event occurs
A transaction is represented by a value-time tuple (new_value, time)
Events and transactions (2)
target_signal <= v after d;
[Figure: time axis from t through t+t0 and t+t1 to t+d]
At time t, new value v is computed from the right-hand side
Assignment also specifies delay d
Transaction tr_i = (v, d) is placed on the driver
At time t+t0, the time component of the transaction tr_i has decreased to d-t0
At time t+t1, the time component decreases further
At time t+d, the time component becomes 0 and the transaction expires
Target signal gets the value v
Transaction example
[Figure: waveforms of signals a, b, and c; one transaction expires without causing an event, another one causes an event]
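The difference between a transaction and an event can be sketched in Python as follows. The `Driver` class is a made-up illustration that ignores the inertial and transport preemption rules of real VHDL drivers; it only counts down the time component of each transaction and reports whether an expiry actually changed the signal value.

```python
class Driver:
    """Driver of one signal holding pending transactions (value, remaining_time)."""

    def __init__(self, initial):
        self.value = initial
        self.pending = []

    def assign(self, value, delay):
        """Model `target <= value after delay;` by placing a transaction."""
        self.pending.append((value, delay))

    def advance(self, elapsed):
        """Advance time; return True if an expiring transaction causes an event."""
        event = False
        still_pending = []
        for value, remaining in self.pending:
            remaining -= elapsed
            if remaining > 0:
                still_pending.append((value, remaining))
            else:
                # Transaction expires: the signal gets the value.
                if value != self.value:
                    event = True
                self.value = value
        self.pending = still_pending
        return event
```

Assigning the value the signal already has still places a transaction, but when it expires `advance` returns False: an expiration without an event.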
Events and sensitivity list
Simulation engine considers events of only those signals that are included in the sensitivity list
Example 1: combinatorial process
comb_foo: process (a_in, b_in) ...
No need to simulate this when, e.g., clk changes
However, it could be simulated, but that would not change any value (since the needed inputs are stable) => waste of simulation time
[Figure: process comb_foo with inputs a_in and b_in and the assigned signals as outputs; clk, rst_n and others are not needed in the sensitivity list]
(Do not read these outputs or put any of them into the sensitivity list! That would create a combinatorial loop, i.e. a random oscillator and an infinite loop in the simulator.)
Events and sensitivity list (2)
Example 2: synchronous process
sync_bar: process (rst, clk)
begin
  if (rst = '0') then
    s0 <= '0';
  elsif (clk'event and clk = '1') then
    ... <statements>
After reset, process will be simulated on every clock edge
But statements inside the elsif branch are executed only at the rising clock edge
One could include all signals that are read in the statements, but their events do not occur at the same time as clk'event => waste of time
[Figure: sync_bar (in simulator) with clk and rst_n as inputs and the assigned signals as outputs; a_in, b_in and others are not needed in the sensitivity list]
(These will change just after the clk edge and these can be read inside sync_bar)
Events and sensitivity list (3)
Synthesis tool does not care about the sensitivity list!
All necessary signals (those that are read) will go into the logic cloud!
All signals assigned inside an if-branch with x'event and x='1' create a register whose clk input is connected to signal x
Detecting an edge on an arbitrary control signal must be coded explicitly
Compare values from two consecutive clk cycles: if (a_old_r /= a_in)
Nested 'events won't produce any meaningful logic
[Figure: sync_bar (synthesized HW): a_in, b_in and others feed comb logic; clk and rst_n drive the registers that hold the assigned signals]
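The explicit edge detection can be modelled cycle by cycle in Python. This is only a behavioral sketch of the idea behind `if (a_old_r /= a_in)`; the function name and the reset value are assumptions made for the example.

```python
def detect_changes(samples, reset_value=0):
    """Return one flag per clock cycle: did a_in differ from the previous cycle?"""
    a_old_r = reset_value          # register holding last cycle's a_in
    flags = []
    for a_in in samples:           # one loop iteration = one rising clk edge
        flags.append(a_old_r != a_in)
        a_old_r = a_in             # register update at the clock edge
    return flags
```

For the sample sequence 0, 1, 1, 0, 0 the flags are raised exactly in the cycles where the input differs from the previously stored value.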
Delays in VHDL
1. Real delays: inertial, reject-inertial, transport
See last lecture
Model the gate and wire delays
2. Delta delays
Simulator's concept to deal with seemingly concurrent events
Multiple signals may need updating, statements that are sensitive to these signals must be executed, and any new events that result from these statements must then be queued and executed as well
The steps taken to evaluate the design without advancing simulation time are referred to as "delta times" or just "deltas".
An infinitesimal interval
Waveform shows the same global time no matter how many delta delays elapse
This mechanism may cause unexpected results
Delta delay example
◆ RS latch
◆ In waveform viewer, all transitions occur at the same time
ENTITY rsl IS
  PORT (s, r : IN BIT; q, qn : OUT BIT);
END rsl;
ARCHITECTURE gate OF rsl IS
  SIGNAL q_temp, qn_temp : BIT;
BEGIN
  q <= q_temp;
  qn <= qn_temp;
  q_temp <= s NAND qn_temp;  -- Executed once in t
  qn_temp <= r NAND q_temp;  -- Executed twice in t
END gate;
[Figure: waveforms of q_temp and qn_temp]
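The delta steps of the latch can be traced with a small Python sketch. It is a simplification: both assignments are re-evaluated in every step rather than only when an input changed, which gives the same values for this circuit. The function names and the `max_deltas` bound are made up for the example.

```python
def nand(a: int, b: int) -> int:
    return 1 - (a & b)


def rs_latch_deltas(s, r, q_temp=0, qn_temp=0, max_deltas=10):
    """Return the (q_temp, qn_temp) pairs seen after each delta step."""
    trace = [(q_temp, qn_temp)]
    for _ in range(max_deltas):
        # Both concurrent assignments read the values of the previous delta.
        new_q = nand(s, qn_temp)
        new_qn = nand(r, q_temp)
        if (new_q, new_qn) == (q_temp, qn_temp):
            break                      # no events left, time may advance
        q_temp, qn_temp = new_q, new_qn
        trace.append((q_temp, qn_temp))
    return trace
```

With s = 0 and r = 1, and both internal signals starting at 0 (the default initial value of BIT), `qn_temp` first goes to 1 and then back to 0 one delta later. A waveform viewer would show all of this at a single time instant.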
Delta delay
The execution order of components with zero delay is unclear, e.g. two processes
Simulator assumes some order
This is a bad problem if a signal value is momentarily out of the range of its type
Event-driven simulation
Essential properties:
1. Evaluate model behavior only at those times when model events are scheduled
2. Evaluate behavior only for the blocks or signals for which events are scheduled
VHDL and Verilog specifications include the assumption of an underlying event-driven simulator
Cyclic process
1. Update signals
2. Execute processes (concurrent statements are actually also processes)
3. Advance global time
Event-driven simulation
The three basic core data structures of the event-driven simulation engine
1. A list of all executable blocks present in the model network
2. Data structure that shows the connections between blocks via signals
3. A value table that holds all current signal values
Activity and time progress are controlled by the time wheel
At zero time, all executable blocks are scheduled
In VHDL, all processes and concurrent assignments
Each time wheel entry has a to-do list
Assignments scheduled to happen at that point
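As a rough illustration, the three data structures and the initial time wheel could look like this in Python. The two-gate model and all names are invented for the example; real engines use far more compact representations.

```python
from collections import defaultdict

# 1. Executable blocks: name -> (input signals, output signal, function)
BLOCKS = {
    "and_gate": (("a", "b"), "n1", lambda a, b: a & b),
    "inverter": (("n1",), "y", lambda n1: 1 - n1),
}

# 3. Value table: current value of every signal
VALUES = {"a": 0, "b": 0, "n1": 0, "y": 0}


def build_fanout(blocks):
    """2. Connectivity: signal -> names of the blocks that read it."""
    fanout = defaultdict(list)
    for name, (inputs, _out, _func) in blocks.items():
        for signal in inputs:
            fanout[signal].append(name)
    return dict(fanout)


def initial_time_wheel(blocks):
    """Time wheel: time -> to-do list. At time zero every block is scheduled."""
    return {0: list(blocks)}
```

When a signal changes, the engine looks it up in the fanout table to find the blocks to put on a to-do list, instead of scanning the whole model.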
ModelSim general flow
Source: Modelsim manual
Example of delta delay problem
clk2 <= clk;
seq0: process (rst, clk)
begin
  if (rst = '0') then
    s0 <= '0';
  elsif (clk'event and clk = '1') then
    s0 <= inp;
  end if;
end process;
seq1: process (rst, clk2)
begin
  if (rst = '0') then
    s1 <= '0';
  elsif (clk2'event and clk2 = '1') then
    s1 <= s0;
  end if;
end process;
[Figure: Desired HW, this is what you would expect: inp -> seq0 -> s0 -> seq1 -> s1, with clk, clk2 and rst_n connected to the registers]
Delta delay problem in simulator
(Same code as on the previous slide.)
[Figure: one simulation round. Before the edge: inp = 1, clk = 0 (s0 = 0). Event queue [t0]: inp = 1, clk = 1; these change first. Then the signal updates, signal update queue [t0]: clk2 <= clk gives clk2 = 1 and seq0 (clk) gives s0 = 1, which create a new event. Last, seq1 (clk2) gives s1 = 1: wrong value!]
Some of you may notice similarities to problems caused by clock skew
Behavior of example code
In this example you have two synchronous processes:
1. one triggered with clk
2. one triggered with clk2
To your surprise, the signals change in the clk2 process on the same edge as they are set in the clk process!
As a result, the value of inp appears at s0 and s1 in the same simulation cycle
During simulation
1. An event on clk occurs (from the testbench).
2. From this event ModelSim performs the "clk2 <= clk" assignment and the process which is sensitive to clk
3. Before advancing the simulation time, ModelSim finds that the process sensitive to clk2 can also be run.
In order to get the expected results, you must do one of the following:
a) Make certain to use the same clock on both processes or use just one process
b) Insert a delay at every output
c) Insert a delta delay
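The effect can be reproduced with a small delta-cycle model in Python. This is a hand-written sketch of the three concurrent statements of the example, not ModelSim's algorithm; `clock_edge` and the `shared_clock` switch (which corresponds to fix a) are names invented here.

```python
def clock_edge(inp, s0, s1, shared_clock=False):
    """Simulate one rising edge of clk in delta steps; return (s0, s1)."""
    values = {"clk": 0, "clk2": 0, "inp": inp, "s0": s0, "s1": s1}
    seq1_clock = "clk" if shared_clock else "clk2"
    updates = {"clk": 1}                       # testbench raises clk
    while updates:
        # Update phase: apply scheduled values, note which signals had an event.
        events = {s for s, v in updates.items() if values[s] != v}
        values.update(updates)
        # Execute phase: run the processes sensitive to those events.
        updates = {}
        if "clk" in events:
            updates["clk2"] = values["clk"]    # clk2 <= clk;
            if values["clk"] == 1:
                updates["s0"] = values["inp"]  # seq0: s0 <= inp;
        if seq1_clock in events and values[seq1_clock] == 1:
            updates["s1"] = values["s0"]       # seq1: s1 <= s0;
    return values["s0"], values["s1"]
```

With `inp = 1` and both registers at 0, `clock_edge(1, 0, 0)` returns `(1, 1)`: seq1 runs one delta later than seq0 and already sees the new s0. With `shared_clock=True` both processes run in the same delta and the result is the expected `(1, 0)`.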
Event-driven simulation performance
Widely used => optimization well understood
Performance-critical portions:
Management of to-do lists
The time wheel
The data that represents model topology
When an event is evaluated:
traverse the model topology to find signals and blocks to update
find the corresponding slots in the time wheel
put the corresponding event on the to-do list
Model granularity compromise
Activity rate affects a fine-grained model more than a coarse-grained model
Granularity vs performance
Simulation Performance
Simulation throughput
Per time spent:
Amount of verification, i.e. number of tests
Number of cycles
Number of distinct states visited and checked
Improve throughput:
Increase simulation engine performance
Or increase the simulated model performance
Run simulations in parallel
Eliminate redundant simulations
Hard to measure
The target of the simulation engine: all-around, gate-level, RT-level?
Profiling: which parts of the model are most time consuming
E.g. Vsim -> Tools -> Profile -> Performance, then simulate, and view the performance report
Improving performance
Efficiency: time_spent_on_HDL / time_spent_on_scheduling
Fewer events, more efficient
Speed can be optimized by more abstract HDL:
No gate-level structures
Integers instead of bit-vectors
Standard libraries
Binary values over multivalued
No delay statements
Processes instead of concurrent assignments
Process is a pre-scheduled atomic action for the simulation engine
Example VHDLs
[Figure: example VHDL descriptions ordered from fastest to slowest]
Event-driven simulation: the future
Significant research on parallel algorithms for simulation engines over the years
No breakthrough
Seems to be inherently hard to parallelize
No commercially available parallel event-driven simulation engine
Two alternatives to parallelize
1. Trivial parallelization: simulation farm
Pool of workstations, each running an independent simulation
2. Running a single model partitioned and parallelized across several workstations
Cycle-Based Simulation Engines (CBSE)
Cycle-based simulation engines
Algorithm to evaluate time:
Evaluate the model at every point of time along the finest granularity known to the simulation engine
E.g. once per clock cycle
Based on much simpler algorithms than event-driven simulation
Superior performance
10x to 20x speed, models 3-10 times smaller
Optimized totally for synchronous hardware design style
Downsides
Severe constraints on HDL design style
No delays
Limited sequential structures
Testbench features of the HDLs largely not supported
Testbenches with other languages, APIs
Cycle-based simulation engines (2)
Due to constraints, not commercially accepted (came to market in the mid-90s)
However, some features have since been integrated into the event-driven simulators
Applicable portions of the code are automatically handled in cycle-based fashion, others in event-driven fashion
Hybrid simulators
Synchronous design
Timing verification and functional verification can be separated
Modeling properties
Zero-delay simulation
No combinational feedback loops
Model must be a directed acyclic graph (DAG)
No dynamic block scheduling, evaluation times are known
Example model network
Cycle-based simulation engines
CBSEs typically use an oblivious simulation algorithm
Calculates all the combinational functions at every cycle
Redundant work: evaluates also parts that do not change
=> simplicity (time wheels, to-do lists removed)
No multivalue bits
Synchronous => do not care about glitches and hazards
Simulation model actually becomes a piece of executable code (a program)
Each output has a mathematical function dependent on the inputs => typical arithmetic optimizations can be used at compile time (synthesis-like optimizations)
E.g. constant propagation, redundant logic removal
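A minimal Python sketch of the oblivious algorithm, under the assumptions listed above (zero delay, no combinational feedback): the evaluation order is fixed once at "compile time" by sorting the DAG, and afterwards every function is recomputed every cycle without any event bookkeeping. `compile_model` and the example netlist are invented for this illustration; `graphlib.TopologicalSorter` is part of the Python standard library (3.9 and later) and raises `CycleError` if the model is not a DAG.

```python
from graphlib import TopologicalSorter
from operator import xor


def compile_model(gates):
    """Fix the evaluation order once. `gates` maps output -> (func, inputs)."""
    deps = {out: set(inputs) for out, (_func, inputs) in gates.items()}
    order = [s for s in TopologicalSorter(deps).static_order() if s in gates]

    def evaluate_cycle(values):
        # Oblivious: every gate is evaluated every cycle, in the fixed order.
        for out in order:
            func, inputs = gates[out]
            values[out] = func(*(values[i] for i in inputs))
        return values

    return evaluate_cycle


# Sum bit of a full adder, deliberately listed "out of order".
SUM_GATES = {
    "s": (xor, ("t", "cin")),
    "t": (xor, ("x", "y")),
}
```

Calling `compile_model(SUM_GATES)` returns a function that computes `t` before `s` regardless of the order in which the gates were listed.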
CB model of 2b-adder
Compare with the one shown on slide #22 (fig 5.25). Note the absence of delays
CBSE extensions
In order to be more usable, CBSEs have been extended at the expense of simulation efficiency
Multivalued bits
E.g. buses (three-state), bus driving errors
In VHDL, std_logic has 9 different states => a lot more computation than with Boolean logic
In Verilog, there are 4 states for a bit
Performance degrades 3x-4x
Should be a selectable feature
Multiple clock domains
Overclock the simulation of the slower domain
However, simulators can never be solely used to guarantee clock domain crossings!
Hybrid simulators
Events inside CBSE vs. CBSE inside an event-driven simulation engine
Multiclocking example
Slower clock domain clocked with the rate of the faster. (overhead)
Waveform Viewers
Waveform Viewers
Every simulator can produce a trace file
At minimum, it contains the symbol names and signal values
Unfortunately, EDA vendors all have their own file formats
The usual difference is the compression, because in large simulations the amount of data is very high
Waveform viewers share a very common-looking GUI
GTKWave is a free tool that supports many different formats
Search capabilities are required for usability
Certain transitions on a signal
Specific values
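The two kinds of search can be sketched on a toy trace in Python. The trace representation used here, a time-sorted list of (time, value) pairs per signal, is an assumption made for the example and does not correspond to any vendor's file format.

```python
def find_transitions(trace, old, new):
    """Return the times at which the signal changes from `old` to `new`.

    `trace` is a list of (time, value) pairs sorted by time.
    """
    return [t for (_, prev), (t, cur) in zip(trace, trace[1:])
            if prev == old and cur == new]


def find_value(trace, value):
    """Return the times at which the signal takes the given value."""
    return [t for t, v in trace if v == value]
```

For a trace such as `[(0, 0), (25, 1), (40, 0), (60, 1)]`, searching for rising transitions gives the times 25 and 60.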
Summary
1. Event-driven simulation engine:
Most common
Supports arbitrary delays
Supports a large set of HDL features
2. Cycle-based simulation engine
Mostly used to boost simulation within an event-driven simulation engine
Does not support delays (fixed time steps only)
Severely restricts usable HDL features
Significantly faster than an event-driven simulation engine
Designer can affect simulation speed by design choices
More abstract code -> more speed -> less accuracy
Balance and compromise!
Extra
Other sources
VHDL: Analysis and Modeling of Digital Systems
Author: Zainalabedin Navabi
Publisher: McGraw-Hill Professional, 1998
ISBN 0070464790, 9780070464797
632 pages
http://books.google.fi/books?id=Z_EjcfIQqGgC
http://www.ece.msstate.edu/~reese/EE8993/lectures/delay/delay.pdf
http://www.imit.kth.se/courses/2B1512/F1.pdf
http://www.ida.liu.se/~petel/SysSyn/lect2.frm.pdf
http://www.cs.lth.se/EDA380/Lectures/Lecture3.pdf
Time wheel and data structures
Typical ED simulator flow
[Figure: flowchart with yes/no decision branches]
Simulator checks all projected signal traces
Global time is advanced to the next transaction
1 ns, 10 ns, 15 ns, 20 ns, 35 ns...
S3 has value 10 during 15-20 ns
After that the value is a function of 1 and 10 (not defined here)
[Figure: projected signal values. Source: http://www.ida.liu.se/~petel/SysSyn/lect2.frm.pdf]
Simulator example
Debugging delta delay problems
The best way to debug delta delay problems is to observe your signals in the List window.
There you can see how values change at each delta time.
View -> List
Select signal in Object window -> RightMouseButton + Add to List
Reminder: Simulator example
Simulating a top level having two blocks, which contain 3 sub-blocks: b1, b2, c1.
[Figure: model network with inputs i1, i2; internal signals a1-a7 and s1-s9; outputs o1, o2]
Signal directions imply a partial order for the evaluation.
Waveform of Simulator example
Initially
i1 = 0
i2 = 1
Then
i1 goes to 1 at 25 ns
All signals seem to be updated at once
List view of Simulator example
Shows the simulation order, i.e. delta delays between assignments
Start of simulation, all values 'U'
Input values assigned
First internal signals (a1, a2) get updated
Simulation time advances, i1 rises
Block b1 sets s3 and s4
Signals propagate...
Block c1 sets a7 and s9
a3 and a4 updated, block b2 sets s6
What actually happens:
In this example at 0 ns, blocks b1 and c1 were each executed once (0ns+3, 0ns+6). Block b2 was executed twice: 0ns+4 and 0ns+9. The latter does not change the value of s6.
Annotated simulator example
[Figure: the model network (inputs i1, i2; signals a1-a7, s1-s9; outputs o1, o2) annotated with delta steps +1 to +9]
Signal directions imply a partial order for the evaluation.
The annotated delta steps correspond to simulation time 0 ns
Detecting Infinite Zero-Delay Loops
If a large number of deltas occur without advancing time, it is usually a symptom of an infinite zero-delay loop in the design.
In order to detect the presence of these loops, ModelSim defines a limit, the "iteration limit", on the number of successive deltas that can occur.
When ModelSim reaches the iteration limit, it issues a warning message.
The iteration limit default value is 5000.
If you receive an iteration limit warning, first increase the iteration limit and try to continue simulation.
You can set the iteration limit from the Simulate > Runtime Options menu or by modifying the IterationLimit variable in the modelsim.ini. See Control Variables Located in INI Files for more information on modifying the modelsim.ini file.
If the problem persists, look for zero-delay loops. Run the simulation and look at the source code when the error occurs.
Use the step button to step through the code and see which signals or variables are continuously oscillating.
Two common causes are a loop that has no exit, or a series of gates with zero delay where the outputs are connected back to the inputs.
Source: ModelSim SE User's Manual, v6.2a, June 2006
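The idea of an iteration limit can be illustrated with a short Python sketch. It is not ModelSim's implementation: `oscillating_signals` and the two toy models are invented for this example, and only the default of 5000 is taken from the manual text above.

```python
def oscillating_signals(step, state, iteration_limit=5000):
    """Apply the zero-delay `step` function repeatedly without advancing time.

    Returns None if the state settles, otherwise the names of the signals
    that were still changing when the iteration limit was reached.
    """
    changing = []
    for _ in range(iteration_limit):
        new_state = step(state)
        if new_state == state:
            return None
        changing = sorted(k for k in state if state[k] != new_state[k])
        state = new_state
    return changing


def inverter_loop(state):
    """Zero-delay gate whose output is connected back to its own input."""
    return {"x": 1 - state["x"], "y": state["y"]}


def buffer_chain(state):
    """Well-behaved model: y simply follows x."""
    return {"x": state["x"], "y": state["x"]}
```

The buffer chain settles after one delta, whereas the inverter loop never does, and the function reports `x` as the oscillating signal once the limit is hit.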