Post on 12-Jan-2016
description
An Adaptive Parallel Flow for Power Distribution Network Simulation Using Discrete Fourier Transform
Xiang Hu1, Wenbo Zhao2, Peng Du2,
Amirali Shayan2, Chung-Kuan Cheng2
1ECE Dept., 2CSE Dept.
University of California, San Diego
01/19/2010
Page 2
Outline
Motivation
Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
Conclusions
Page 3
Outline
Motivation
Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
Conclusions
Page 4
PDN Simulation: Why Frequency Domain?
Huge PDN netlists
– Time-domain simulation: serial - slow
– Frequency-domain simulation: parallel – fast
Frequency dependent parasitics
Simulation results
– Time-domain: voltage drops, simultaneous switching noise (SSN) – input dependent
– Frequency-domain: impedance, anti-resonance peaks – input independent
Page 5
Transform Operations: Why Discrete Fourier Transform
Laplace Transform [Wanping ’07]
– Input: Series of ramp functions
– Output: Rational expressing via vector fitting
• Vector fitting may introduce large errors
– Choice of frequency samples is case dependent
Discrete Fourier Transform (DFT)
– Input: periodic signal
– Inverse DFT is straightforward: vector fitting is not needed
– Frequency sample points with uniform steps
Page 6
Outline
Motivation
Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
Conclusions
Page 7
Basic DFT Simulation Flow
Page 80 0.5 1 1.5 2 2.5
x 10-8
-1
0
1
2
3
4
5
6x 10
-3
Time (sec)V
olta
ge (
V)
0 0.5 1 1.5 2 2.5
x 10-8
-1
0
1
2
3
4
5
6x 10
-3
Time (sec)
Vol
tage
(V
)
Problem with Basic DFT Flow
“Wrap-around effect” requires long padding zeros at the end of the input
– Periodicity nature of DFT
Small uniform time steps are needed to cover the input frequency range
0 0.2 0.4 0.6 0.8 1
x 10-7
-1
0
1
2
3
4
5
6
7
8x 10
-3
Time (sec)
Cur
rent
(A
)
0 0.2 0.4 0.6 0.8 1
x 10-7
-1
0
1
2
3
4
5
6
7
8x 10
-3
Time (sec)
Cur
rent
(A
)
0 0.5 1 1.5 2 2.5
x 10-8
-1
0
1
2
3
4
5
6
7
8x 10
-3
Time (sec)
Cur
rent
(A
)
T 2T 3T 4T
T
DFTrepetition output
outputLarge number of simulation points! Correct
Distorted!
T
Wrap-around
Page 9
Adaptive DFT Simulation
Basic ideas of the adaptive DFT flow: cancel out the wrap-around effect by subtracting the tail from the main part of the output
– Main part of the output: obtained with small time step and small period; distorted by the wrap-around effect
– Tail of the output: low frequency oscillation; can be captured with large time steps
0 0.2 0.4 0.6 0.8 1
x 10-7
-1
0
1
2
3
4
5
6x 10
-3
Time (sec)
Vol
tage
(V
)
0 0.2 0.4 0.6 0.8 1
x 10-7
-1
0
1
2
3
4
5
6x 10
-3
Time (sec)
Vol
tage
(V
)
T 2T 3T 4T
---
0 0.5 1 1.5 2 2.5
x 10-8
-1
0
1
2
3
4
5
6x 10
-3
Time (sec)
Vol
tage
(V
)
T
Correct Distorted Correct!
Total number of simulation points is reduced significantly!
Page 10
Adaptive DFT Flow
Period[i]: the input period at each iteration
Interval[i]: the simulation time step at each iteration
FreqUpBd[i]: the upper bound of the input frequency range at each iteration
vi(t): tentative time-domain output within the frequency range [0, FreqUpBd] at each iteration
Iteration #1: obtain the main part of the output
Iteration #2~k: capture the oscillations in the tail of the output (high, middle, and low resonant frequencies)
For each iteration #i, i=k, k-1, …, 2, subtract the captured tail from the outputs at iteration #j, j<i to eliminate the wrap-around effect
Page 11
Outline
Motivation
Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
Conclusions
Page 12
Experimental Results: Test Case & Input
Test case: 3D PDN
– One resonant peak in the impedance profile
Input current
– Time step: ∆t = 20ps
– Duration: T0 = 16.88ns
104
106
108
0.5
1
1.5
2
2.5
3
3.5
4
4.5
5
5.5
Frequency (Hz)
Impe
danc
e (
)
0 0.5 1 1.5 2
x 10-8
-1
0
1
2
3
4
5
6
7
8x 10
-3
Time (sec)
Cur
rent
(A
)
Impedance Original Input
Page 130 1 2 3 4 5 6 7 8 9
x 10-8
-5
0
5
10x 10
-3
Time (sec)
Vol
tage
(V
)
Final outputv
1(t)
Final output-v1(t)
0 1 2 3 4 5 6 7 8 9
x 10-8
-5
0
5
10x 10
-3
Time (sec)
Vol
tage
(V
)
0 0.5 1 1.5 2 2.5
x 10-8
-4
-2
0
2
4
6
8
10x 10
-3
Time (sec)
Vol
tage
(V
)Experimental Results: Adaptive Flow Process
Iteration #1: v1(t)
– ∆t1=20ps
– T1=20.48ns
Iteration #2: v2(t)
– ∆t2 = 64∆t1
= 640ps
– T2 = 8T1
= 163.84ns
Final output:
– Main part:
– Tail:
T1
2T1 4T13T1
v1(t), ∆t1=20ps, T1=20.48ns
3
1 2 1 11
( ) ( : ( 1) )m
v t v mT m T
Final output 3
1 2 1 11
( ) ( : ( 1) )m
v t v mT m T
2 1 1( :8 )v T T
v2(t), ∆t2=640ps, T2=163.84ns
· · ·8T1
Page 14
0 0.5 1 1.5 2
x 10-7
-4
-2
0
2
4
6
8
10x 10
-3
Time (sec)
Vol
tage
(V
)
DFT flowSPICE transient simulation
Experimental Results: DFT Flow vs. SPICE
Relative error: 0.25%
Page 15
Error Analysis: Error Caused by Wrap-around Effect
0 0.5 1 1.5 2 2.5
x 10-8
-1
0
1
2
3
4
5
6x 10
-3
Time (sec)
Vol
tage
(V
)
DFT flow, T=20.48nsDFT flow, T=163.84nsSPICE transient simulation
Theorem 1: Let be the initial value of the output voltage. Suppose for some , then the meansquare error, i.e., is bounded by .
0 0.5 1 1.5 2 2.5
x 10-8
-1.5
-1
-0.5
0
0.5
1
1.5x 10
-4
Time (sec)
Vol
tage
(V
)
Difference between DFT and SPICE, T=20.48nsDifference between DFT and SPICE, T=163.84ns
Relative error: 2.09%
Relative error: 0.12%
Output comparison Error relative to SPICE
Page 16
Error Analysis: Error Caused by Different Interpolation Methods
SPICE: PWL interpolation
DFT: sinusoidal interpolation
0 0.2 0.4 0.6 0.8 1
x 10-7
-1
0
1
2
3
4
5
6x 10
-3
Time (sec)
Vol
tage
(V
)
DFT flow, t=20ps
DFT flow, t=2.5psSPICE transient simulation
0 0.2 0.4 0.6 0.8 1
x 10-7
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2x 10
-5
Time (sec)
Vol
tage
(V
)
Difference between DFT and SPICE, t=20ps
Difference between DFT and SPICE, t=2.5ps
Output comparison Error relative to SPICE
2.294 2.2945 2.295 2.2955 2.296
x 10-8
1.28
1.29
1.3
1.31
1.32
x 10-4
Time (sec)
Vol
tage
(V
)
DFT flow, t=20ps
DFT flow, t=2.5psSPICE transient simulation
Relative error: 0.12%
Relative error: 0.09%
Page 17
Time Complexity Analysis: Adaptive vs. Non-adaptive
Adaptive flow time complexity:
– Ti: simulation period at iteration #i,
– ∆ti: simulation time step at iteration #i,
Non-adaptive flow time complexity:
1
( / )k
i ii
O T t
1( / )kO T t
1 2 kT T T
1 2 kt t t
0 100 200 300 400 500 600 7000
0.5
1
1.5
2
2.5
Time (sec)
Re
lativ
e e
rro
r to
Hsp
ice
(%
)
Non-adaptive methodAdaptive method
Page 18
Parallel Processing
Test case: 3D PDN
– ~ 0.17 million nodes
The adaptive DFT flow has more advantage when the number of available processors is limited.
Simulations between each iteration of the adaptive flow need to be processed serially
Simulation time with different number of processors (sec)
1 prc 4 prcs 8 prcs 16 prcs 64 prcs 128 prcs 256 prcs
Hspice 21374 NA NA NA NA NA NA
Basic DFT 15976 6635 3225 1647 425 218 143
Adaptive DFT 4947 2096 904 490 180 120 115
0 50 100 150 200 250 3000
2000
4000
6000
8000
10000
12000
14000
16000
Number of processors
Sim
ula
tion
tim
e (
sec)
Basic DFTAdpative DFT
Page 19
Parallel Processing: DFT Flow vs. SPICE
0 0.5 1 1.5 2 2.5
x 10-8
-1
0
1
2
3
4
5
6
7
8x 10
-3
Time (sec)
Vo
ltag
e (
V)
Adaptive DFTHspice
0 0.5 1 1.5 2 2.5
x 10-8
-4
-3
-2
-1
0
1
2
3x 10
-5
Time (sec)
Vo
ltag
e (
V)
Simulation result DFT error compared to Hspice
Page 20
Outline
Motivation
Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
Conclusions
Page 21
Conclusions
Implemented an adaptive flow for large PDN simulation using DFT
Total number of simulation points is reduced significantly compared to the basic DFT flow
Achieved a relative error of the order of 0.1% compared to SPICE
10x speed up with a single processor compared to SPICE.
Parallel processing is incorporated to reduce the simulation time even more significantly
Page 22