Power Network Analysis Chung-Kuan Cheng CSE Dept. University of California, San Diego 1/22/2010.
Power Network Analysis
Chung-Kuan Cheng
CSE Dept.
University of California, San Diego
1/22/2010
Page 2
Agenda
Background: power distribution networks (PDN’s)
Worst-case PDN noise prediction
– Motivation
– Problem formulation
– Proposed Algorithm
– Case study
Simulation: adaptive parallel flow using discrete Fourier transform (DFT)
– Motivation
– Adaptive parallel flow description
– Experimental results
Conclusions and future work
Page 3
Research on Power Distribution Networks
Analysis
–Stimulus, Noise Margin, Simulation
Synthesis
–VRM, Decap, ESR, Topology
Integration
–Sensors, Prediction, Stability, Robustness
Page 4
Publication List
• Power Distribution Network Simulation and Analysis
[1] W. Zhang and C.K. Cheng, "Incremental Power Impedance Optimization Using Vector Fitting Modeling," IEEE Int. Symp. on Circuits and Systems, pp. 2439-2442, 2007.
[2] W. Zhang, W. Yu, L. Zhang, R. Shi, H. Peng, Z. Zhu, L. Chua-Eoan, R. Murgai, T. Shibuya, N. Ito, and C.K. Cheng, "Efficient Power Network Analysis Considering Multi-Domain Clock Gating," IEEE Trans. on CAD, pp. 1348-1358, Sept. 2009.
[3] W.P. Zhang, L. Zhang, R. Shi, H. Peng, Z. Zhu, L. Chua-Eoan, R. Murgai, T. Shibuya, N. Ito, and C.K. Cheng, "Fast Power Network Analysis with Multiple Clock Domains," IEEE Int. Conf. on Computer Design, pp. 456-463, 2007.
[4] W.P. Zhang, Y. Zhu, W. Yu, R. Shi, H. Peng, L. Chua-Eoan, R. Murgai, T. Shibuya, N. Ito, and C.K. Cheng, "Finding the Worst Case of Voltage Violation in Multi-Domain Clock Gated Power Network with an Optimization Method," IEEE DATE, pp. 540-547, 2008.
[5] X. Hu, W. Zhao, P. Du, A. Shayan, and C.K. Cheng, "An Adaptive Parallel Flow for Power Distribution Network Simulation Using Discrete Fourier Transform," accepted by IEEE/ACM Asia and South Pacific Design Automation Conference (ASP-DAC), 2010.
Page 5
Publication List
• Power Distribution Network Analysis and Synthesis
[6] W. Zhang, Y. Zhu, W. Yu, A. Shayan, R. Wang, Z. Zhu, and C.K. Cheng, "Noise Minimization During Power-Up Stage for a Multi-Domain Power Network," IEEE Asia and South Pacific Design Automation Conf., pp. 391-396, 2009.
[7] W. Zhang, L. Zhang, A. Shayan, W. Yu, X. Hu, Z. Zhu, E. Engin, and C.K. Cheng, "On-Chip Power Network Optimization with Decoupling Capacitors and Controlled-ESRs," to appear at Asia and South Pacific Design Automation Conference, 2010.
[8] X. Hu, W. Zhao, Y. Zhang, A. Shayan, C. Pan, A.E. Engin, and C.K. Cheng, "On the Bound of Time-Domain Power Supply Noise Based on Frequency-Domain Target Impedance," in System Level Interconnect Prediction Workshop (SLIP), July 2009.
[9] A. Shayan, X. Hu, H. Peng, W. Zhang, and C.K. Cheng, "Parallel Flow to Analyze the Impact of the Voltage Regulator Model in Nanoscale Power Distribution Network," in 10th International Symposium on Quality Electronic Design (ISQED), Mar. 2009.
Page 6
Publication List (Cont’)
• 3D Power Distribution Networks
[10] A. Shayan and X. Hu, "Power Distribution Design for 3D Integration," Jacobs School of Engineering Research Expo, 2009 [Best Poster Award].
[11] A. Shayan, X. Hu, M. Popovich, A.E. Engin, and C.K. Cheng, "Reliable 3D Stacked Power Distribution Considering Substrate Coupling," in International Conference on Computer Design (ICCD), 2009.
[12] A. Shayan, X. Hu, and C.K. Cheng, "Reliability Aware Through Silicon Via Planning for Nanoscale 3D Stacked ICs," in Design, Automation & Test in Europe Conference (DATE), 2009.
[13] A. Shayan, X. Hu, H. Peng, W. Zhang, C.K. Cheng, M. Popovich, and X. Chen, "3D Power Distribution Network Co-design for Nanoscale Stacked Silicon IC," in 17th Conference on Electrical Performance of Electronic Packaging (EPEP), Oct. 2008. [5]
[14] W. Zhang, W. Yu, X. Hu, A. Shayan, E. Engin, and C.K. Cheng, "Predicting the Worst-Case Voltage Violation in a 3D Power Network," Proceedings of IEEE/ACM International Workshop on System Level Interconnect Prediction (SLIP), 2009.
==================================================================
T.Y. Wang and C.C.P. Chen, "Thermal-ADI: A Linear-Time Chip-Level Dynamic Thermal-Simulation Algorithm Based on Alternating Direction Implicit Method," IEEE Trans. on VLSI, pp. 691-700, 2003.
Page 7
What is a power distribution network (PDN)
Power supply noise
– Resistive IR drop
– Inductive Ldi/dt noise
[Popovich et al. 2008]
Page 8
PDN Roadmap
[Figure: Vdd of high-performance microprocessors (V) vs. year, 2007~2022]
[Figure: Currents of high-performance microprocessors vs. year, 2007~2022: average current (A) and transient current (10^12 A/s)]
[ITRS 2007]
Page 9
PDN Roadmap
[Figure: Target impedance (mΩ) vs. year, 2007~2022]
Target impedance: $Z_{target} = \frac{V_{dd} \times 5\%}{I_{load}}$
[ITRS 2007]
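As a small worked example of the target impedance formula, the sketch below evaluates $Z_{target} = V_{dd} \times 5\% / I_{load}$. The function name and the numbers (1.0 V, 200 A) are illustrative assumptions chosen to fall inside the plotted ranges; they are not values read from the ITRS tables.

```python
def target_impedance(vdd_volts, load_current_amps, ripple_fraction=0.05):
    """Return Z_target = Vdd * ripple / I_load in ohms (5% ripple by default)."""
    if load_current_amps <= 0:
        raise ValueError("load current must be positive")
    return vdd_volts * ripple_fraction / load_current_amps


# Illustrative operating point only (assumed, not taken from the roadmap data).
z_target = target_impedance(vdd_volts=1.0, load_current_amps=200.0)
print(f"Z_target = {z_target * 1e3:.3f} mOhm")
```

Because Vdd shrinks while the load current grows over the roadmap years, the same formula yields a steadily smaller target impedance.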
Page 10
Agenda
Background: power distribution networks (PDN’s)
Worst-case PDN noise prediction
– Motivation
– Problem formulation
– Proposed Algorithm
– Case study
Simulation: adaptive parallel flow using discrete Fourier transform (DFT)
– Motivation
– Adaptive parallel flow description
– Experimental results
Conclusions and future work
Page 11
Worst Case Analysis
Target Impedance vs. Worst Cases
Noise vs. Rise Time of Stimulus
Rogue Wave of Multiple Staged Network
Page 12
PDN Design Methodology: Target Impedance
PDN design
– Objective: low power supply noise
– Popular methodology: “target impedance”
[Smith ’99]
Implication: if the target impedance is small, then the noise will also be small
Page 13
Worst-Case PDN Noise Prediction: Motivation
Problems with “target impedance” design methodology
– How to set the target impedance?
• Small target impedance may not lead to small noise
– A PDN with smaller Zmax may have larger noise
Time-domain design methodology: worst-case PDN noise
– If the worst-case noise is smaller than the requirement, then the PDN design is safe.
• Straightforward and guaranteed
– How to generate the worst-case PDN noise
$V_{max} = \max_t v(t)$, $I_{max} = \max_t i(t)$, $Z_{max} = \max_\omega |Z(j\omega)|$ (the quantity compared against $Z_{target}$)
$V_{max}/(Z_{max} I_{max})$ may be larger than one
$v(t) = IFT(Z(j\omega) \cdot FT(i(t)))$, FT: Fourier transform
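The relation v(t) = IFT(Z(jω)·FT(i(t))) can be tried numerically with a discrete Fourier transform. The sketch below is only an illustration: the impedance model (a series R-L supply path in parallel with a decoupling capacitor), its component values, and the square-wave stimulus are all assumptions, not the networks from the case studies.

```python
import numpy as np


def pdn_impedance(freq_hz, r=1e-3, l=1e-10, c=1e-6):
    """Hypothetical PDN model: series R-L supply path in parallel with a decap C."""
    s = 2j * np.pi * np.asarray(freq_hz, dtype=float)
    return (r + s * l) / (1 + s * r * c + s**2 * l * c)


def noise_from_current(i_t, dt):
    """v(t) = IFT(Z(jw) * FT(i(t))) on a uniform grid; the DFT treats i(t) as periodic."""
    n = len(i_t)
    freqs = np.fft.rfftfreq(n, d=dt)
    return np.fft.irfft(pdn_impedance(freqs) * np.fft.rfft(i_t), n=n)


# Square-wave load current between 0 and 1 A near the model's resonance (assumed stimulus).
dt, n = 1e-10, 8192
t = np.arange(n) * dt
i_t = (np.sin(2 * np.pi * 15.9e6 * t) > 0).astype(float)
v_t = noise_from_current(i_t, dt)

# Compare the time-domain peak with the frequency-domain bound Zmax * Imax.
z_max = np.abs(pdn_impedance(np.logspace(3, 10, 2000))).max()
ratio = v_t.max() / (z_max * i_t.max())
print(f"Vmax / (Zmax * Imax) = {ratio:.3f}")
```

For a single-resonance model like this one the ratio typically stays below one; the slides argue that with several resonances it can exceed one.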
Page 14
Worst-Case PDN Noise Prediction: Related Work
At final design stages [Evmorfopoulos ’06]
– Circuit design is fully or almost complete
– Realistic current waveforms can be obtained by simulation
– Problem: countless input patterns lead to countless current waveforms
• Sample the excitation space
• Statistically project the sample’s own worst-case excitations to their expected position in the excitation space
At early design stages [Najm ’03 ’05 ’07 ’08 ’09]
– Real current information is not available
– “Current constraint” concept
– Vectorless approach: no simulation needed
– Problem: assume ideal current with zero transition time
Page 15
Ideal Worst-Case PDN Noise
Problem formulation I
PDN noise: $v(t) = \int_0^t h(\tau)\, i(t-\tau)\, d\tau$, where $h(\tau)$ is the PDN impulse response
Problem: $\max v(t)$ s.t. $0 \le i(t) \le b$
Worst-case current [Xiang ’09]: $i(t-\tau) = b$ for $h(\tau) > 0$; $i(t-\tau) = 0$ for $h(\tau) \le 0$
Zero current transition time. Unrealistic!
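The ideal (zero transition time) worst case can be sketched on a sampled impulse response: the time-reversed current is b wherever h is positive and 0 elsewhere, so the worst-case noise is b times the integral of the positive part of h. The function name and the sample values below are hypothetical.

```python
def ideal_worst_case(h, dt, b):
    """Worst-case noise for 0 <= i(t) <= b with instantaneous current switching.

    h  : samples of the impulse response at tau = 0, dt, 2*dt, ...
    dt : sample spacing
    b  : upper bound on the load current
    Returns (v_max, reversed_current), where reversed_current[k] is i(t - k*dt).
    """
    reversed_current = [b if hk > 0 else 0.0 for hk in h]
    # Rectangle-rule approximation of the convolution integral.
    v_max = sum(hk * ik for hk, ik in zip(h, reversed_current)) * dt
    return v_max, reversed_current


v_max, current = ideal_worst_case([1.0, -1.0, 2.0, -0.5], dt=0.5, b=2.0)
print(v_max, current)
```

The resulting current jumps between 0 and b at every zero crossing of h, which is exactly the unrealistic behaviour that motivates the second formulation.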
Page 16
Worst-Case Noise with Non-zero Current Transition Times
Problem formulation II
$\max v(T) = \int_0^T h(\tau)\, i(T-\tau)\, d\tau$
s.t. $0 \le i(t) \le b$, $|di/dt| \le c$
Transition time: $t_r = b/c$
T: chosen to be such that h(t) has died down to some negligible value.
* f(t) replaces i(T-τ)
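Written out with the substitution f(τ) = i(T−τ), the second formulation takes the form below. Time reversal only flips the sign of the derivative, so the slope bound carries over unchanged, and a full swing from 0 to b at the maximum slope c takes t_r = b/c.

```latex
f(\tau) := i(T-\tau), \qquad 0 \le \tau \le T

\max_{f}\; v(T) = \int_0^T h(\tau)\, f(\tau)\, d\tau

\text{s.t.}\quad 0 \le f(\tau) \le b, \qquad
\left|\frac{df}{d\tau}\right| = \left|\frac{di}{dt}\right|_{t=T-\tau} \le c, \qquad
t_r = \frac{b}{c}
```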
Page 17
Proposed Algorithm Based on Dynamic Programming
GetTransPos(j,k1,k2): find the smallest i such that Fj(k1,i)≤ Fj(k2,i)
Q.GetMin(): return the minimum element in the priority queue Q
Q.DeleteMin(): delete the minimum element in the priority queue Q
Q.Add(e): insert the element e in the priority queue Q
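The three priority-queue operations listed above map directly onto a binary heap. A minimal sketch using Python's standard `heapq` module follows; the class name `MinQueue` and its method names are illustrative and not taken from the slides.

```python
import heapq


class MinQueue:
    """Min-priority queue with the operations Q.Add, Q.GetMin and Q.DeleteMin."""

    def __init__(self):
        self._heap = []

    def add(self, e):
        """Q.Add(e): insert element e in O(log n)."""
        heapq.heappush(self._heap, e)

    def get_min(self):
        """Q.GetMin(): return the smallest element without removing it."""
        return self._heap[0]

    def delete_min(self):
        """Q.DeleteMin(): remove and return the smallest element in O(log n)."""
        return heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)


q = MinQueue()
for position in (5, 2, 7):
    q.add(position)
print(q.get_min())
```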
Page 18
Proposed Algorithm: Initial Setup
Divide the time range [0, T] into m intervals [t0=0, t1], [t1, t2], …, [tm-1, tm=T], where the interior boundaries are the zero-crossing points of h(t): h(ti) = 0 for i = 1, 2, …, m-1.
u0 = 0, u1, u2, …, un = b are a set of n+1 values within [0, b]. The value of f(t) at each ti is chosen from those values. A larger n gives more accurate results.
[Figure: h(t)]
Page 19
Proposed Algorithm: f(t) within a time interval [tj, tj+1]
Ij(k,i): worst-case f(t) starting with uk at time tj and ending with ui at time tj+1
Theorem 1: The worst-case f(t) can be constructed by determining the values at the zero-crossing points of h(t).
[Figure: h(t)]
Page 20
Proposed Algorithm: Dynamic Programming Formulation
Define Vj(k,i): the corresponding output within time interval [tj, tj+1]
$V_j(k,i) = \int_{t_j}^{t_{j+1}} h(\tau)\, I_j(k,i)(\tau)\, d\tau$
Define the intermediate objective function OPT(j,i): the maximum output generated by the f(t) ending at time tj with the value ui
$OPT(j,i) = \max_{f_i} \int_0^{t_j} h(\tau)\, f_i(\tau)\, d\tau$, where $f_i(\tau)$ ranges over all possible $f(\tau)$ that satisfy $f(t_j) = u_i$
Recursive formula for the dynamic programming algorithm:
$OPT(0,i) = 0$ for all $i \in [0, n]$
$OPT(j+1,i) = \max_{0 \le k \le n} \left( OPT(j,k) + V_j(k,i) \right)$
Time complexity: $O(n^2 m)$
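The recursion can be illustrated with a simplified, time-discretized variant. This sketch is not the algorithm from the slides: it assumes uniform time steps instead of zero-crossing intervals, and it approximates f(t) inside each step by a straight line between two quantized levels (trapezoid rule) rather than by the exact worst-case segment Ij(k,i). All names are hypothetical.

```python
def worst_case_noise(h, dt, b, c, n):
    """Dynamic program for max sum_j h[j] * (f_j + f_{j+1}) / 2 * dt.

    h  : one impulse-response sample per time step
    dt : step length
    b  : current bound, 0 <= f <= b
    c  : slope bound, |df/dt| <= c
    n  : number of quantization intervals; levels are u_k = k * b / n
    """
    levels = [k * b / n for k in range(n + 1)]
    # Largest level change allowed in one step by the slope constraint.
    max_jump = int(c * dt * n / b + 1e-9)
    opt = [0.0] * (n + 1)  # OPT(0, i) = 0 for every starting level
    for hj in h:
        nxt = []
        for i in range(n + 1):
            lo, hi = max(0, i - max_jump), min(n, i + max_jump)
            # OPT(j+1, i) = max_k OPT(j, k) + V_j(k, i), restricted to reachable k.
            best = max(opt[k] + hj * 0.5 * (levels[k] + levels[i]) * dt
                       for k in range(lo, hi + 1))
            nxt.append(best)
        opt = nxt
    return max(opt)


h_samples = [1.0, -1.0, 2.0, -0.5]
loose = worst_case_noise(h_samples, dt=0.5, b=2.0, c=1e9, n=4)
tight = worst_case_noise(h_samples, dt=0.5, b=2.0, c=1.0, n=4)
print(loose, tight)
```

Each time step scans at most n+1 predecessors for each of the n+1 levels, which mirrors the O(n²m) bound; tightening the slope bound c can only lower the result, since it shrinks the set of feasible waveforms.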
Page 21
Acceleration of the Dynamic Programming Algorithm
Without loss of generality, consider the time interval [tj, tj+1] where h(t) is negative.
Define Wj(k,i): the absolute value of Vj(k,i): $W_j(k,i) = |V_j(k,i)|$
Lemma 1: Wj(k2,i2)- Wj(k1,i2)≤ Wj(k2,i1)- Wj(k1,i1) for any 0 ≤ k1 < k2 ≤ n and 0 ≤ i1 < i2 ≤ n
Page 22
Acceleration of the Dynamic Programming Algorithm
Define Fj(k,i): the candidate corresponding to k for OPT(j,i)
$F_j(k,i) = OPT(j,k) - W_j(k,i)$
Theorem 2: Suppose k1 < k2, i1 ∈ [0, n] and Fj(k1,i1) ≤ Fj(k2,i1), then for any i2 > i1, we have Fj(k1,i2) ≤ Fj(k2,i2).
Accelerated algorithm:
– Based on Theorem 2
– Using binary search and priority queue
– Time complexity: $O(nm \log n)$
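Theorem 2 makes the condition Fj(k1,i) ≤ Fj(k2,i) monotone in i: once it holds it keeps holding, so the transition position can be found by binary search. A minimal sketch of GetTransPos under that assumption follows; the callable `F` and the convention of returning n+1 when no such i exists are illustrative choices, not details from the slides.

```python
def get_trans_pos(F, k1, k2, n):
    """Smallest i in [0, n] with F(k1, i) <= F(k2, i), or n + 1 if there is none.

    F is a callable F(k, i) for a fixed interval j. The search is only valid
    when the condition is monotone in i, which Theorem 2 guarantees for k1 < k2.
    """
    lo, hi = 0, n + 1
    while lo < hi:
        mid = (lo + hi) // 2
        if F(k1, mid) <= F(k2, mid):
            hi = mid        # condition holds at mid: answer is mid or earlier
        else:
            lo = mid + 1    # condition fails at mid: answer is later
    return lo


# Toy candidate function (assumed): F(0, i) = 0 and F(1, i) = i - 3.
toy_opt = [0.0, -3.0]
print(get_trans_pos(lambda k, i: toy_opt[k] + k * i, 0, 1, 5))
```

Each call costs O(log n) evaluations of F, which is where the logarithmic factor in the accelerated complexity comes from.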
Page 23
Case Study 1: Impedance
[Figure: impedance vs. frequency, with three points marked]
– 2.09mΩ @ 19.8KHz
– 1.69mΩ @ 465KHz
– 3.23mΩ @ 166MHz
Page 24
Case Study 1: Impulse Response
[Figure: impulse response (V) vs. time (sec) in three windows: 0s~100ns, 100ns~10µs, and 10µs~100µs]
High frequency oscillation at the beginning with large amplitude, but dies down very quickly
Mid-frequency oscillation with relatively small amplitude.
Low frequency oscillation with the smallest amplitude, but lasts the longest
Amplitude = 1861
Amplitude = 29
Amplitude = 0.01
Page 25
Case Study 1: Worst-Case Current
Current constraints:
– $0 \le i(t) \le 50\,A$
– Minimum transition time: $t_r$
[Figure: worst-case current waveform, with a zoomed-in view]
The worst-case current also oscillates at the three resonant frequencies, which matches the impulse response.
Saw-tooth-like current waveform at large transition times
Page 26
Case Study 1: Worst-Case Noise Response
Page 27
Case Study 1: Worst-Case Noise vs. Transition Time
The worst-case noise decreases as the transition time increases.
Previous approaches which assume zero current transition times result in pessimistic worst-case noise.
Page 28
Case Study 2: Impedance
Current constraints: $0 \le i(t) \le 1$, $t_r = 1.25\,ns$
[Figure: impedance (Ω) vs. frequency (Hz) for two PDN configurations]
– Rpd = 10mΩ, Rd = 30mΩ: Zmax = 127.0mΩ; marked frequencies 224.3KHz, 11.2MHz, 98.1MHz
– Rpd = 30mΩ, Rd = 10mΩ: Zmax = 114.4mΩ; marked frequencies 224.3KHz, 10.9MHz, 101.6MHz
Page 29
Case Study 2: Worst-Case Noise
$V_{max}/(Z_{max} I_{max}) > 1$ for both cases, meaning that the worst-case noise is larger than $Z_{max} I_{max}$.
A PDN can have larger worst-case noise even though its peak impedance is smaller.
[Figure: worst-case PDN noise (V) vs. time (sec) for the two configurations]
– Rpd = 10mΩ, Rd = 30mΩ: Zmax = 127.0mΩ, Vmax = 139.3mV, Vmax/Zmax = 1.097
– Rpd = 30mΩ, Rd = 10mΩ: Zmax = 114.4mΩ, Vmax = 146.8mV, Vmax/Zmax = 1.292
Page 30
Case 3: “Rogue Wave” Phenomenon
Worst-case noise response: The maximum noise is formed when a long and slow oscillation is followed by a short and fast oscillation.
Rogue wave: In oceanography, a large wave is formed when a long and slow wave hits a sudden quick wave.
[Figure: worst-case noise response, voltage (V) vs. time (sec)]
Low-frequency oscillation corresponds to the resonance of the 2nd stage
High-frequency oscillation corresponds to the resonance of the 1st stage
Page 31
Case 3: “Rogue Wave” Phenomenon (Cont’)
[Figure: impedance (Ω) vs. frequency (Hz): two stage, 1st stage alone, and 2nd stage alone]
Equivalent input impedance of the 2nd stage at high frequency
Page 32
Case 3: “Rogue Wave” Phenomenon (Cont’)
[Figure: current (A) vs. time (sec): input current, current through L2 (IL2), and current through L (IL)]
Page 33
Case 3: “Rogue Wave” Phenomenon (Cont’)
[Figure: voltage (V) vs. time (sec): V2nd and V2nd_only]
Page 34
Case 3: “Rogue Wave” Phenomenon (Cont’)
[Figure: current (A) vs. time (sec): current through L (IL), current through L2 (IL2), and current through L1 (IL1)]
Page 35
Case 3: “Rogue Wave” Phenomenon (Cont’)
[Figure: current (A) vs. time (sec), zoomed in to 0.96µs~1.04µs: current through L (IL), current through L2 (IL2), and current through L1 (IL1)]
Page 36
Case 3: “Rogue Wave” Phenomenon (Cont’)
[Figure: voltage (V) vs. time (sec): V2nd, V1st-V2nd, and V1st_only]
Page 37
Case 3: “Rogue Wave” Phenomenon (Cont’)
[Figure: voltage (V) vs. time (sec), zoomed in to 0.96µs~1.04µs: V2nd, V1st-V2nd, and V1st_only]
Page 38
Case 3: “Rogue Wave” Phenomenon (Cont’)
[Figure: voltage (V) vs. time (sec), zoomed in to 0.96µs~1.04µs. One panel shows V2nd_only, V1st_only, and V2nd_only + V1st_only; the other shows V2nd, V1st-V2nd, and V1st]
max(V1st) = 37.34mV
max(V2nd_only) + max(V1st_only) = 42.09mV ≈ max(V1st)
Page 39
Agenda
Background: power distribution networks (PDN’s)
Analysis: worst-case PDN noise prediction
– Motivation
– Problem formulation
– Proposed Algorithm
– Case study
Simulation: adaptive parallel flow using discrete Fourier transform (DFT)
– Motivation
– Adaptive parallel flow description
– Experimental results
Conclusions and future work
Page 40
PDN Simulation: Why Frequency Domain?
Huge PDN netlists
– Time-domain simulation: serial - slow
– Frequency-domain simulation: parallel – fast
Frequency dependent parasitics
Simulation results
– Time-domain: voltage drops, simultaneous switching noise (SSN) – input dependent
– Frequency-domain: impedance, anti-resonance peaks – input independent
Page 41
Transform Operations
Laplace Transform [Wanping ’07]
– Input: Series of ramp functions
– Output: Rational expression via vector fitting
– Choice of frequency samples
Discrete Fourier Transform (DFT)
– Periodic signal assumption
– Discrete frequency samples
Page 42
Basic DFT Simulation Flow
Page 43
Adaptive DFT Flow
Period[i]: the input period at each iteration
Interval[i]: the simulation time step at each iteration
FreqUpBd[i]: the upper bound of the input frequency range at each iteration
vi(t): tentative time-domain output within the frequency range [0, FreqUpBd] at each iteration
Iteration #1: obtain the main part of the output
Iteration #2~k: capture the oscillations in the tail of the output (high, middle, and low resonant frequencies)
For each iteration #i, i=k, k-1, …, 2, subtract the captured tail from the outputs at iteration #j, j<i to eliminate the wrap-around effect
Page 44
Problem with Basic DFT Flow
“Wrap-around effect” requires long padding zeros at the end of the input
– Periodic nature of the DFT
Small uniform time steps are needed to cover the input frequency range
[Figure: input current (A) and output voltage (V) vs. time (sec). With a short period T, the DFT repetition wraps the output tail around and the output is distorted. With the input zero-padded to 4T, the output is correct but requires a large number of simulation points]
Page 45
Adaptive DFT Simulation
Basic ideas of the adaptive DFT flow: cancel out the wrap-around effect by subtracting the tail from the main part of the output
– Main part of the output: obtained with small time step and small period; distorted by the wrap-around effect
– Tail of the output: low frequency oscillation; can be captured with large time steps
[Figure: voltage (V) vs. time (sec). The distorted output with period T, minus the tail segments of the correct long-period output between T and 4T, gives the correct main part]
Total number of simulation points is reduced significantly!
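The subtraction idea can be checked numerically. The sketch below is a simplified illustration, not the adaptive flow itself: it assumes a hypothetical one-pole impedance R/(1 + jωτ), a rectangular current pulse, and it keeps the same fine time step for the long-period run, so it demonstrates the wrap-around cancellation but not the saving in simulation points that comes from using a larger step for the tail.

```python
import numpy as np

R_OHM, TAU_S = 1e-3, 10e-9  # hypothetical one-pole PDN model


def dft_response(i_t, dt):
    """Output voltage via DFT: multiply the input spectrum by Z at each DFT frequency."""
    freqs = np.fft.rfftfreq(len(i_t), d=dt)
    z = R_OHM / (1 + 2j * np.pi * freqs * TAU_S)
    return np.fft.irfft(z * np.fft.rfft(i_t), n=len(i_t))


def remove_wraparound(v_short, v_long):
    """Subtract the tail segments of the long-period output from the short-period output.

    len(v_long) must be a multiple of len(v_short); segment m of v_long covers
    [m*T, (m+1)*T), and segments 1, 2, ... are the tail that wrapped around.
    """
    segments = v_long.reshape(-1, len(v_short))
    return v_short - segments[1:].sum(axis=0)


dt, n1 = 20e-12, 1024                 # T1 = 20.48 ns at a 20 ps step
pulse = np.zeros(n1)
pulse[:250] = 1.0                     # 5 ns, 1 A current pulse (assumed input)
v1 = dft_response(pulse, dt)                                        # period T1
v2 = dft_response(np.concatenate([pulse, np.zeros(3 * n1)]), dt)    # period 4*T1
corrected = remove_wraparound(v1, v2)

print("max error before:", np.max(np.abs(v1 - v2[:n1])))
print("max error after: ", np.max(np.abs(corrected - v2[:n1])))
```

With identical time steps the correction is exact up to rounding, because summing the T1-long segments of the long-period output reproduces the short-period output.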
Page 46
Experimental Results: Test Case & Input
Test case: 3D PDN
– One resonant peak in the impedance profile
Input current
– Time step: ∆t = 20ps
– Duration: T0 = 16.88ns
[Figure: impedance (Ω) vs. frequency (Hz); original input current (A) vs. time (sec)]
Page 47
Experimental Results: Adaptive Flow Process
Iteration #1: v1(t)
– ∆t1 = 20ps
– T1 = 20.48ns
Iteration #2: v2(t)
– ∆t2 = 32∆t1 = 640ps
– T2 = 4T1 = 81.92ns
Final output:
– Main part: $v_1(t) - \sum_{m=1}^{3} v_2(mT_1 : (m+1)T_1)$
– Tail: $v_2(T_1 : 4T_1)$
[Figure: voltage (V) vs. time (sec): v1(t) with ∆t1=20ps, T1=20.48ns; v2(t) with ∆t2=640ps, T2=81.92ns, with segment boundaries at T1, 2T1, 3T1, 4T1; and the final output]
Page 48
Experimental Results: DFT Flow vs. SPICE
[Figure: voltage (V) vs. time (sec): DFT flow vs. SPICE transient simulation]
Page 49
Error Analysis: Error Caused by Wrap-around Effect
[Figure: output comparison, voltage (V) vs. time (sec): DFT flow with T=20.48ns, DFT flow with T=163.84ns, and SPICE transient simulation]
Theorem 1: Let be the initial value of the output voltage. Suppose for some , then the mean square error, i.e., is bounded by .
[Figure: error relative to SPICE, voltage (V) vs. time (sec)]
– Difference between DFT and SPICE, T=20.48ns: relative error 2.09%
– Difference between DFT and SPICE, T=163.84ns: relative error 0.12%
Page 50
Error Analysis: Error Caused by Different Interpolation Methods
SPICE: PWL interpolation
DFT: sinusoidal interpolation
[Figure: output comparison and error relative to SPICE, voltage (V) vs. time (sec), with a zoomed-in view: DFT flow with ∆t=20ps, DFT flow with ∆t=2.5ps, and SPICE transient simulation]
Page 51
Time Complexity Analysis: Adaptive vs. Non-adaptive
Adaptive flow time complexity: $O\left(\sum_{i=1}^{k} T_i/\Delta t_i\right)$
– Ti: simulation period at iteration #i, $T_1 \le T_2 \le \dots \le T_k$
– ∆ti: simulation time step at iteration #i, $\Delta t_1 \le \Delta t_2 \le \dots \le \Delta t_k$
Non-adaptive flow time complexity: $O(T_k/\Delta t_1)$
[Figure: relative error to HSPICE (%) vs. time (sec): non-adaptive method vs. adaptive method]
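Plugging the two-iteration settings from the adaptive flow experiment (T1 = 20.48ns with ∆t1 = 20ps, T2 = 81.92ns with ∆t2 = 640ps) into the two complexity expressions gives a feel for the saving. The helper names are hypothetical, and the point count is used here only as a proxy for cost.

```python
def adaptive_points(periods_ps, steps_ps):
    """Sum of T_i / dt_i over all iterations (all values in picoseconds)."""
    return sum(t // dt for t, dt in zip(periods_ps, steps_ps))


def non_adaptive_points(periods_ps, steps_ps):
    """Longest period simulated with the finest time step: T_k / dt_1."""
    return max(periods_ps) // min(steps_ps)


periods_ps = [20_480, 81_920]   # T1 = 20.48 ns, T2 = 81.92 ns
steps_ps = [20, 640]            # dt1 = 20 ps, dt2 = 640 ps

adaptive = adaptive_points(periods_ps, steps_ps)
flat = non_adaptive_points(periods_ps, steps_ps)
print(adaptive, flat, round(flat / adaptive, 2))
```

Under these settings the adaptive flow needs 1024 + 128 points, while a single run covering 81.92ns at 20ps would need 4096.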
Page 52
Parallel Processing
Test case: 3D PDN
– Case 1: 393930 nodes
– Case 2: 1465206 nodes
Simulation time (case 2)
– DFT flow:
• ~3.5 hr (w/ 1 processor)
• 76 sec (w/ 256 processors)
– HSPICE: ~38 hr
[Figure: simulation time (sec) vs. number of processors, Case #1 and Case #2]
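The reported case 2 run times translate into speedup figures with simple arithmetic. Since the hour values are given only approximately ("~"), the results of this sketch are rough estimates; the function name is illustrative.

```python
def speedup(baseline_seconds, new_seconds):
    """Ratio of baseline run time to new run time."""
    return baseline_seconds / new_seconds


HOUR = 3600.0
dft_1_proc = 3.5 * HOUR    # ~3.5 hr on 1 processor
dft_256_proc = 76.0        # 76 sec on 256 processors
hspice = 38.0 * HOUR       # ~38 hr

parallel_gain = speedup(dft_1_proc, dft_256_proc)
print(f"DFT flow, 256 vs. 1 processor: {parallel_gain:.1f}x")
print(f"Parallel efficiency: {parallel_gain / 256:.0%}")
print(f"DFT flow (1 processor) vs. HSPICE: {speedup(hspice, dft_1_proc):.1f}x")
print(f"DFT flow (256 processors) vs. HSPICE: {speedup(hspice, dft_256_proc):.0f}x")
```

The single-processor ratio of roughly 11x is consistent with the "10x speed up with single processor" remark in the conclusions.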
Page 53
Agenda
Background: power distribution networks (PDN’s)
Analysis: worst-case PDN noise prediction
– Motivation
– Problem formulation
– Proposed Algorithm
– Case study
Simulation: adaptive parallel flow using discrete Fourier transform (DFT)
– Motivation
– Adaptive parallel flow description
– Experimental results
Conclusions and future work
Page 54
Remarks
Worst-case PDN noise prediction with non-zero current transition time
– The worst-case PDN noise decreases with transition time
– Small peak impedance may not lead to small worst-case noise
– “Rogue wave” phenomenon
Adaptive parallel flow for PDN simulation using DFT
– 0.093% relative error compared to SPICE
– 10x speed up with single processor.
– Parallel processing reduces the simulation time even more significantly
Page 55
Summary
1. Throughput/power (instruction/energy)
2. Throughput²/power (f x instruction/energy)
Power Distribution Network
– VRMs, Switches, Decaps, ESRs, Topology,
Analysis
– Stimulus, Noise Tolerance, Simulation
Control (smart grid)
– High efficiency, Real time analysis, Stability, Reliability, Rapid recovery, and Self healing
Page 56