SPICE Diego A Transistor Level Full System Simulator
Computer Science & Engineering Department, University of California, San Diego
SPICEDiego
A Transistor Level Full System Simulator
Chung-Kuan Cheng
May 27, 2004
Outline
Motivation
Status of Commercial Simulators
Solver Engine: Multigrid Review
Activity Driven Analysis
Nonlinear Transistor Devices
Experimental Results
Conclusion
Motivation
Moore's Law: the number of transistors and the clock frequency double every 2 years
• 1984: 100K transistors, 10 MHz
• 2003: 100M transistors, 5 GHz
More challenges for circuit simulators
• Electrical coupling (C & L): interconnect delay, crosstalk, voltage drop, ground bounce
• Short-channel devices
SPICE cannot perform large-chip analysis: its capacity is limited to 50,000 transistors (O(n^2) complexity)
Status of Commercial Simulators
Partition-based simulation: most commercial fast SPICE tools (HSIM, PowerMill / TimeMill / RailMill, NanoSim, RedHawk, UltraSim)
Advantages
• Smaller matrix size
• Easy to apply different time steps to different subcircuits
Disadvantages
• Hard to capture coupling effects between subcircuits
• Device models ignore the Miller effect
• Potential convergence problems
• Accuracy not guaranteed
Motivation
[Diagram of solver options. Labels: Direct Method; Basic Iterative Method (Slow Convergence); Conjugate Gradient; Multigrid Method; High Complexity]
Multigrid Review
Error components
• High-frequency error (more oscillatory between neighboring nodes)
• Low-frequency error (smooth between neighboring nodes)
Basic iterative methods efficiently reduce only the high-frequency error
Basic idea of multigrid
• Convert hard-to-damp low-frequency error into easy-to-damp high-frequency error
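The difference between the two error components can be checked numerically. The following is a minimal Python sketch, assuming a 1-D model matrix tridiag(-1, 2, -1) with 31 unknowns and sine-shaped initial errors; the matrix, the sizes, and the function names are illustrative assumptions, not taken from the slides. It applies Gauss-Seidel sweeps to A e = 0 and reports what fraction of a smooth and of an oscillatory error survives.

```python
import math

def gauss_seidel_sweep(e):
    """One in-place Gauss-Seidel sweep for tridiag(-1, 2, -1) e = 0 (zero boundaries)."""
    n = len(e)
    for i in range(n):
        left = e[i - 1] if i > 0 else 0.0
        right = e[i + 1] if i < n - 1 else 0.0
        e[i] = 0.5 * (left + right)

def remaining_error(k, n=31, sweeps=5):
    """Fraction of the max-norm error left after `sweeps` sweeps on sine mode k."""
    e = [math.sin(k * math.pi * (i + 1) / (n + 1)) for i in range(n)]
    start = max(abs(v) for v in e)
    for _ in range(sweeps):
        gauss_seidel_sweep(e)
    return max(abs(v) for v in e) / start

low = remaining_error(k=1)    # smooth (low-frequency) error
high = remaining_error(k=24)  # oscillatory (high-frequency) error
print(f"low-frequency error remaining:  {low:.3f}")
print(f"high-frequency error remaining: {high:.3f}")
```

Under these assumptions the oscillatory mode is largely removed after five sweeps, while the smooth mode is barely reduced.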
Multigrid: A Hierarchy of Problems
[Figure: three grid levels. Finest: A0 • X0 = b0 (nodes 1-6); coarser: A1 • X1 = b1 (nodes 1-4); coarsest: A2 • X2 = b2 (nodes 1-2). Smoothing is applied on the two finer levels and Gauss elimination on the coarsest; restriction moves to coarser levels, interpolation back to finer ones.]
Hierarchically, all error components are smoothed efficiently
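A hierarchy of this kind can be sketched as a recursive V-cycle. The Python sketch below is an illustration under stated assumptions, not the SPICEDiego implementation: it uses the 1-D model matrix tridiag(-1, 2, -1), geometric coarsening (every other node), linear interpolation, and a restriction scaled so that every level can reuse the same stencil. All function names are hypothetical.

```python
def apply_a(u):
    """Multiply by A = tridiag(-1, 2, -1) with zero boundary values."""
    n = len(u)
    return [2.0 * u[i]
            - (u[i - 1] if i > 0 else 0.0)
            - (u[i + 1] if i < n - 1 else 0.0) for i in range(n)]

def smooth(u, b, sweeps):
    """In-place Gauss-Seidel sweeps for A u = b."""
    n = len(u)
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            u[i] = 0.5 * (b[i] + left + right)

def v_cycle(u, b):
    """One V-cycle for A u = b; len(u) must be 2**k - 1."""
    n = len(u)
    if n == 1:
        u[0] = 0.5 * b[0]              # direct solve on the coarsest level
        return
    smooth(u, b, 2)                    # pre-smoothing
    au = apply_a(u)
    r = [b[i] - au[i] for i in range(n)]
    nc = (n - 1) // 2
    # Restriction: with linear interpolation P, P^T A P = 0.5 * tridiag(-1, 2, -1),
    # so using 2 * P^T r as the right-hand side lets the coarse level reuse apply_a.
    rc = [r[2 * j] + 2.0 * r[2 * j + 1] + r[2 * j + 2] for j in range(nc)]
    ec = [0.0] * nc
    v_cycle(ec, rc)                    # coarse-grid error equation
    for j in range(nc):                # interpolation (coarse node j sits at fine 2j+1)
        u[2 * j + 1] += ec[j]
        u[2 * j] += 0.5 * ec[j]
        u[2 * j + 2] += 0.5 * ec[j]
    smooth(u, b, 2)                    # post-smoothing

b = [1.0] * 31
u = [0.0] * 31
for _ in range(10):
    v_cycle(u, b)
au = apply_a(u)
final_residual = max(abs(b[i] - au[i]) for i in range(31))
print(f"max residual after 10 V-cycles: {final_residual:.2e}")
```

The recursion mirrors the slide: smooth, restrict the residual, solve the coarser problem, interpolate the correction back, and smooth again.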
Geometric vs Algebraic
Geometric multigrid method
• Requires a regular grid structure
Algebraic multigrid
• Coarsening relies on the matrix; no regular grid structure is required
• Coloring scheme
Error smoothing operator: Gauss-Seidel
Interpolation
• Smooth error: the residue is small but the error decreases very slowly
$$A e \approx 0 \quad\Rightarrow\quad a_{ii} e_i \approx -\sum_{j \neq i} a_{ij} e_j$$
• In practice, only coarse nodes are used on the RHS of the above formula to approximate the error correction of a fine node.
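The interpolation rule can be illustrated with one matrix row. This is a minimal Python sketch, assuming the smooth-error relation a_ii e_i ≈ -Σ a_ij e_j and simply dropping the fine neighbors from the sum, as the slide describes; the data layout, the numbers, and the function name are hypothetical. Some AMG variants additionally rescale the weights to compensate for the dropped neighbors, which is not shown here.

```python
def interpolate_fine_error(row, diag, coarse_error):
    """Approximate e_i for a fine node from its coarse neighbors only.

    row          -- dict: neighbor index j -> a_ij (off-diagonal entries of row i)
    diag         -- a_ii
    coarse_error -- dict: coarse node index -> error value on the coarse grid
    """
    total = 0.0
    for j, a_ij in row.items():
        if j in coarse_error:          # fine neighbors are skipped
            total += a_ij * coarse_error[j]
    return -total / diag

# Fine node 0 with neighbors 1 and 2 (coarse) and 3 (fine).
row_0 = {1: -1.0, 2: -2.0, 3: -1.0}
coarse_error = {1: 1.0, 2: 0.5}
e_0 = interpolate_fine_error(row_0, diag=4.0, coarse_error=coarse_error)
print(e_0)  # -(-1*1.0 + -2*0.5) / 4 = 0.5
```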
Convergence of Multigrid Method
The matrix needs to be symmetric positive definite (S.P.D.)
• Key to the convergence of iterative methods: SOR, PCG, multigrid
RC network
• The system matrix is S.P.D.
System equation:
$$C \dot{X}(t) = -G X(t) + U(t)$$
Apply the trapezoidal rule:
$$\left(G + \frac{2}{h} C\right) X(t+h) = -\left(G - \frac{2}{h} C\right) X(t) + U(t) + U(t+h)$$
The LHS matrix is S.P.D.; this also holds for the B.E. and F.E. formulae
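The trapezoidal step for an RC network can be tried on a tiny example. The Python sketch below assumes a hypothetical two-node RC ladder (unit conductances and capacitances, a constant source at node 1) and the step (G + 2C/h) X(t+h) = -(G - 2C/h) X(t) + U(t) + U(t+h); the circuit values and helper names are illustrative assumptions. A Cholesky factorization doubles as the S.P.D. check, since it only succeeds for a symmetric positive definite matrix.

```python
import math

def cholesky(a):
    """Return lower-triangular L with a = L L^T; raise ValueError if a is not S.P.D."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def solve_with_factor(low, rhs):
    """Solve L L^T x = rhs by forward and backward substitution."""
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

# Hypothetical RC ladder: node 1 to ground and node 1 to node 2 through 1 S each.
g = [[2.0, -1.0], [-1.0, 1.0]]
c = [[1.0, 0.0], [0.0, 1.0]]
u = [1.0, 0.0]          # constant source, so U(t) + U(t+h) = 2 u
h = 0.1
n = 2

lhs = [[g[i][j] + 2.0 * c[i][j] / h for j in range(n)] for i in range(n)]
low = cholesky(lhs)     # succeeds only because the LHS matrix is S.P.D.

x = [0.0, 0.0]
for _ in range(400):
    rhs = [sum((2.0 * c[i][j] / h - g[i][j]) * x[j] for j in range(n)) + 2.0 * u[i]
           for i in range(n)]
    x = solve_with_factor(low, rhs)
print(x)                # approaches the DC solution of G x = u, which is [1, 1]
```

Because the LHS matrix does not change while h is fixed, the factorization is computed once and reused for every time point in this sketch.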
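Convergence of Multigrid Method: RLKC network
System equation:
$$\begin{bmatrix} C & 0 \\ 0 & L \end{bmatrix} \begin{bmatrix} \dot{V} \\ \dot{I} \end{bmatrix} = -\begin{bmatrix} G & A_l^T \\ -A_l & 0 \end{bmatrix} \begin{bmatrix} V \\ I \end{bmatrix} + \begin{bmatrix} U \\ 0 \end{bmatrix}$$
Apply the trapezoidal rule:
$$\begin{bmatrix} \frac{2}{h} C + G & A_l^T \\ -A_l & \frac{2}{h} L \end{bmatrix} \begin{bmatrix} V(t+h) \\ I(t+h) \end{bmatrix} = \begin{bmatrix} \frac{2}{h} C - G & -A_l^T \\ A_l & \frac{2}{h} L \end{bmatrix} \begin{bmatrix} V(t) \\ I(t) \end{bmatrix} + \begin{bmatrix} U(t) + U(t+h) \\ 0 \end{bmatrix}$$
The LHS matrix is not S.P.D., but it can be converted to an S.P.D. matrix:
$$\left(\frac{2}{h} C + G + \frac{h}{2} A_l^T L^{-1} A_l\right) V(t+h) = \left(\frac{2}{h} C - G - \frac{h}{2} A_l^T L^{-1} A_l\right) V(t) - 2 A_l^T I(t) + [U(t) + U(t+h)]$$
$$I(t+h) = I(t) + \frac{h}{2} L^{-1} A_l [V(t+h) + V(t)]$$
The LHS matrix of the first equation is now S.P.D. The same holds for B.E. and F.E.
L^{-1} is called the K / susceptance / reluctance matrix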
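The conversion can be sanity-checked on a small case. The Python sketch below assumes a hypothetical circuit with two nodes and one inductor between them, with A_l written as one row per inductor and one column per node, so that KCL uses A_l^T I and the inductor equation uses A_l V. It solves one trapezoidal step both with the nonsymmetric block matrix and with the converted matrix 2C/h + G + (h/2) A_l^T L^{-1} A_l, and compares the results. All values and names are illustrative.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting (dense, for tiny systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

# Hypothetical data: diagonal C and G, one inductor from node 1 to node 2.
c = [1.0, 1.0]
g = [1.0, 2.0]
a_l = [1.0, -1.0]
ind = 0.5
h = 0.1
v = [1.0, 0.0]          # V(t)
cur = 0.2               # I(t)
u_sum = [2.0, 0.0]      # U(t) + U(t+h)

# Block system in the unknowns [V1(t+h), V2(t+h), I(t+h)]; it is not symmetric.
block = [[2 * c[0] / h + g[0], 0.0, a_l[0]],
         [0.0, 2 * c[1] / h + g[1], a_l[1]],
         [-a_l[0], -a_l[1], 2 * ind / h]]
rhs_block = [(2 * c[i] / h - g[i]) * v[i] - a_l[i] * cur + u_sum[i] for i in range(2)]
rhs_block.append(a_l[0] * v[0] + a_l[1] * v[1] + 2 * ind / h * cur)
v1_b, v2_b, i_b = solve(block, rhs_block)

# Converted system: eliminate I(t+h); k = (h/2) L^{-1} is a scalar here.
k = h / (2 * ind)
a_v = a_l[0] * v[0] + a_l[1] * v[1]
reduced = [[(2 * c[i] / h + g[i] if i == j else 0.0) + k * a_l[i] * a_l[j]
            for j in range(2)] for i in range(2)]
rhs_red = [(2 * c[i] / h - g[i]) * v[i] - k * a_l[i] * a_v - 2 * a_l[i] * cur + u_sum[i]
           for i in range(2)]
v_red = solve(reduced, rhs_red)
i_red = cur + k * sum(a_l[j] * (v_red[j] + v[j]) for j in range(2))

print(v1_b, v2_b, i_b)
print(v_red[0], v_red[1], i_red)
```

In this example the converted 2x2 matrix is symmetric with positive diagonal and positive determinant, so it is S.P.D., while the 3x3 block matrix has off-diagonal entries of opposite sign.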
Why Algebraic Method
No requirement of a regular grid
• Works for general circuits
Circuits with mutual inductance
• The adjacency graph of the converted system matrix is different from the circuit topology
Converted system matrix:
$$\frac{2}{h} C + G + \frac{h}{2} A_l^T L^{-1} A_l$$
Activity Driven Analysis
Circuit latency and multi-rate behavior
• Spatial latency: only portions of the circuit are active at any given time; 80%-90% of all gates are non-switching
• Temporal latency: a given portion of the circuit is not always active
• Multi-rate behavior: time constants vary across the circuit
How to utilize it?
• Circuit partitioning: a common technique used in timing simulators
Adaptive Smoothing
How?
• Only active regions get their error smoothed
• Varying "time step size": inactive subcircuits may get their error smoothed at the finest level only once every several time points
Why?
• Error smoothing at the finer levels takes most of the iteration time
• Smoothing at the coarser levels is sufficient for inactive portions of the circuit
Adaptive smoothing at the finest grid level
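The idea can be sketched as a Gauss-Seidel sweep restricted to a node list. The slides do not say how activity is detected, so this Python sketch assumes a residual threshold as the activity test and a fixed period after which every node is smoothed; the threshold, the period, the 1-D model matrix tridiag(-1, 2, -1), and all names are hypothetical.

```python
def local_residual(u, b, i):
    """Residual of row i for A = tridiag(-1, 2, -1) with zero boundary values."""
    n = len(u)
    left = u[i - 1] if i > 0 else 0.0
    right = u[i + 1] if i < n - 1 else 0.0
    return b[i] - (2.0 * u[i] - left - right)

def nodes_to_smooth(u, b, time_index, tol=1e-6, period=4):
    """Active nodes only, except every `period` time points, when all nodes are smoothed."""
    if time_index % period == 0:
        return list(range(len(u)))
    return [i for i in range(len(u)) if abs(local_residual(u, b, i)) > tol]

def adaptive_sweep(u, b, nodes):
    """One Gauss-Seidel sweep over the given finest-level nodes only."""
    n = len(u)
    for i in nodes:
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        u[i] = 0.5 * (b[i] + left + right)
    return len(nodes)

u = [0.0] * 8
b = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # excitation only at nodes 2 and 3
active = nodes_to_smooth(u, b, time_index=1)
work = adaptive_sweep(u, b, active)
print(active, work, u)
```

Only the two excited nodes are touched at this time point, so the finest-level work drops from 8 updates to 2; the inactive nodes are visited again when the time index is a multiple of the period.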
Incorporating Transistor Devices (1)
• Direct simulation of transistor devices makes the linear solver diverge
• Conventional method: abstract each device as a current waveform, ignoring its interaction with VDD/VSS
• How to include transistor devices? Inside the innermost Newton-Raphson linearization iteration, decouple the linear and nonlinear parts at their interface and replace the nonlinear part with its Norton equivalent circuit.
Incorporating Transistor Devices (2)
Advantages
• Possible to use a fast linear matrix solver (it requires a symmetric positive definite matrix, a property that does not hold for nonlinear transistors)
• Lower memory requirement: the matrix for nonlinear components can be generated on the fly, so large cases with millions of transistors can be run
• The linear and nonlinear parts are decoupled only at the innermost Newton-Raphson iteration of transient analysis; accuracy is guaranteed via linear-nonlinear iteration (typically 4 to 10 iterations)
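The Norton-equivalent step can be illustrated on a one-node circuit. In the Python sketch below a diode stands in for the transistor model, which is an assumption made to keep the example short. At each Newton-Raphson iteration the device is replaced by a conductance g_eq in parallel with a current source i_eq, and the remaining linear circuit (a source and a resistor) is solved. Device parameters, circuit values, and function names are hypothetical.

```python
import math

def diode_current(v, i_sat=1e-14, v_t=0.02585):
    """Nonlinear device current i(v); a diode stands in for a transistor here."""
    return i_sat * (math.exp(v / v_t) - 1.0)

def diode_conductance(v, i_sat=1e-14, v_t=0.02585):
    """Derivative di/dv at the current operating point."""
    return i_sat / v_t * math.exp(v / v_t)

def solve_operating_point(v_src=5.0, r=1000.0, v_guess=0.7, tol=1e-12, max_iter=50):
    """Linear-nonlinear iteration for: source -> resistor -> node -> device -> ground."""
    v = v_guess
    for iteration in range(1, max_iter + 1):
        # Norton equivalent of the device linearized around v.
        g_eq = diode_conductance(v)
        i_eq = diode_current(v) - g_eq * v
        # Linear solve: (1/R + g_eq) * v_new = v_src/R - i_eq
        v_new = (v_src / r - i_eq) / (1.0 / r + g_eq)
        if abs(v_new - v) < tol:
            return v_new, iteration
        v = v_new
    raise RuntimeError("Newton-Raphson did not converge")

v_node, iterations = solve_operating_point()
kcl_residual = (5.0 - v_node) / 1000.0 - diode_current(v_node)
print(f"v = {v_node:.6f} V after {iterations} iterations, KCL residual {kcl_residual:.1e}")
```

In a full simulator the scalar division would be replaced by the linear matrix solve, with the Norton conductances and current sources regenerated at every iteration rather than stored.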
Experimental Results (1)
[Figure: power supply, board, and chip]
Test case #1: board / packaging / chip power network
• Fully coupled packaging inductance
• 60k elements, 5000 nodes
• SPICE failed
• Our tool: less than 2 minutes
Experimental Results (2)
Power/clock network case
• 30k nodes, 1000 transistor devices
• SPICE run time: 41323 s
• Our run time: 1859 s
• 22x speedup
Experimental Results (3)
1K-cell design
• 10,286 nodes, 751 gates
• SPICE run time: 2121 s
• Our run time: 26.1 s
• 8x speedup
10K-cell design
• 123,590 nodes, 7,481 gates
• SPICE run time: 44293 s
• Our run time: 3572 s
• 12.4x speedup
Why Is SPICEDiego Better?
SPICEDiego: a fast, accurate transistor-level circuit simulator
• Powerful matrix solver engine
• Transistor devices
• Capable of capturing coupling effects
• Device model includes the Miller effect
• Lower memory requirement (no LU factorization; does not save the matrix for transistors)
Applications
• Interconnect delay
• Crosstalk
• Voltage drop, ground bounce
• Simultaneous switching noise
Conclusion
• Moore's Law demands an extraordinarily fast circuit simulator with guaranteed accuracy.
• Current tools cannot cover the Miller effect or mutual inductance, and there is no bound on their error.
• SPICEDiego offers a solution for circuit designers.