Xiang Hu 1 , Wenbo Zhao 2 , Peng Du 2 , Amirali Shayan 2 , Chung-Kuan Cheng 2

An Adaptive Parallel Flow for Power Distribution Network Simulation Using Discrete Fourier Transform

Xiang Hu¹, Wenbo Zhao², Peng Du², Amirali Shayan², Chung-Kuan Cheng²

¹ECE Dept., ²CSE Dept.

University of California, San Diego

01/19/2010

Outline

Motivation

Adaptive simulation flow using discrete Fourier transform (DFT)

– Basic DFT flow

– Problems with the basic DFT flow

– Adaptive DFT flow

Experimental results

– Error analysis

– Time efficiency of the adaptive flow

– Time efficiency of parallel processing

Conclusions


PDN Simulation: Why Frequency Domain?

Huge PDN netlists

– Time-domain simulation: serial, slow

– Frequency-domain simulation: parallel, fast

Frequency dependent parasitics

Simulation results

– Time-domain: voltage drops, simultaneous switching noise (SSN) – input dependent

– Frequency-domain: impedance, anti-resonance peaks – input independent

Transform Operations: Why Discrete Fourier Transform

Laplace Transform [Wanping ’07]

– Input: Series of ramp functions

– Output: rational expression via vector fitting

• Vector fitting may introduce large errors

– Choice of frequency samples is case dependent

Discrete Fourier Transform (DFT)

– Input: periodic signal

– Inverse DFT is straightforward: vector fitting is not needed

– Frequency sample points with uniform steps


Basic DFT Simulation Flow

[Figure: basic DFT simulation flow, with input and output waveforms (V) vs. time (sec)]
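The basic flow can be sketched as three steps: take the DFT of the periodic input, solve the network independently at each uniformly spaced frequency sample, and apply the inverse DFT. The example below is only an illustration under stated assumptions: the two-node RC network, its element values (`R1`, `R2`, `C1`, `C2`), and the pulse input are hypothetical and do not come from the slides.

```python
import numpy as np

# Hypothetical two-node RC network (illustrative values, not from the slides).
R1, R2 = 0.1, 0.2          # ohms
C1, C2 = 1e-9, 0.5e-9      # farads
G = np.array([[1 / R1 + 1 / R2, -1 / R2],
              [-1 / R2,          1 / R2]])
C = np.diag([C1, C2])

dt = 20e-12                # time step (s)
n = 1024                   # samples in one period
t = np.arange(n) * dt

# Periodic input current injected at node 2: a short rectangular pulse.
i_in = np.where(t < 2e-9, 1.0, 0.0)

# Step 1: DFT of the input (real signal, so non-negative frequencies suffice).
I_f = np.fft.rfft(i_in)
freqs = np.fft.rfftfreq(n, d=dt)

# Step 2: one independent linear solve per frequency sample.
V_f = np.empty((len(freqs), 2), dtype=complex)
for k, f in enumerate(freqs):
    Y = G + 1j * 2 * np.pi * f * C        # nodal admittance matrix at f
    V_f[k] = np.linalg.solve(Y, np.array([0.0, I_f[k]]))

# Step 3: inverse DFT gives one period of the time-domain node voltages.
v = np.fft.irfft(V_f, n=n, axis=0)
```

Each pass through the loop in step 2 depends only on its own frequency, which is what makes the frequency-domain solves easy to distribute across processors.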

Problem with Basic DFT Flow

The “wrap-around effect” requires padding the end of the input with a long run of zeros

– Caused by the periodic nature of the DFT

Small uniform time steps are needed to cover the input frequency range

[Figure: output vs. time (sec). With a short period T, the DFT repeats the output every T (T, 2T, 3T, 4T) and the wrap-around distorts it. With long zero padding the output is correct, but a large number of simulation points is needed.]
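The wrap-around effect can be reproduced numerically with a small sketch. It assumes a hypothetical discrete-time system with a single damped resonance, h[n] = R^n cos(THETA n), whose frequency response is known in closed form; `R`, `THETA`, `freq_response`, `dft_flow`, and the pulse input are illustrative choices, not taken from the slides.

```python
import numpy as np

R, THETA = 0.998, 2 * np.pi / 64   # hypothetical decay rate and resonance

def freq_response(omega):
    """Frequency response of h[n] = R**n * cos(THETA * n), n >= 0."""
    z = np.exp(-1j * omega)
    return 0.5 * (1 / (1 - R * np.exp(1j * THETA) * z)
                  + 1 / (1 - R * np.exp(-1j * THETA) * z))

def dft_flow(x, n_points):
    """Zero-pad x to n_points, multiply by the response, inverse DFT."""
    omega = 2 * np.pi * np.arange(n_points) / n_points
    return np.fft.ifft(np.fft.fft(x, n_points) * freq_response(omega)).real

n_short, n_long = 1024, 8192
x = np.zeros(n_short)
x[:50] = 1.0                        # short input pulse

# Reference: direct (non-periodic) convolution with the impulse response.
k = np.arange(n_short)
reference = np.convolve(x, R**k * np.cos(THETA * k))[:n_short]

v_short = dft_flow(x, n_short)            # period too short: tail wraps around
v_long = dft_flow(x, n_long)[:n_short]    # long zero padding: tail dies out

scale = np.abs(reference).max()
err_short = np.abs(v_short - reference).max() / scale
err_long = np.abs(v_long - reference).max() / scale
print(f"short period: {err_short:.2e}, long period: {err_long:.2e}")
```

With these assumed parameters the response has not decayed within the short period, so the periodic copies overlap and distort the output, while the eight-times-longer period removes the distortion at the cost of eight times as many points.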

Adaptive DFT Simulation

Basic idea of the adaptive DFT flow: cancel the wrap-around effect by subtracting the tail from the main part of the output

– Main part of the output: obtained with small time step and small period; distorted by the wrap-around effect

– Tail of the output: low frequency oscillation; can be captured with large time steps

[Figure: three output waveforms vs. time (sec), annotated "Correct", "Distorted", and "Correct!", with the tail spanning T to 4T]

Total number of simulation points is reduced significantly!

Adaptive DFT Flow

Period[i]: the input period at each iteration

Interval[i]: the simulation time step at each iteration

FreqUpBd[i]: the upper bound of the input frequency range at each iteration

v_i(t): tentative time-domain output within the frequency range [0, FreqUpBd[i]] at each iteration

Iteration #1: obtain the main part of the output

Iterations #2 to #k: capture the oscillations in the tail of the output (high, middle, and low resonant frequencies)

For each iteration #i, i = k, k-1, …, 2, subtract the captured tail from the outputs of iterations #j, j < i, to eliminate the wrap-around effect
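The subtraction step can be illustrated with a two-iteration sketch. It makes several simplifying assumptions that are not in the slides: the network is replaced by a hypothetical one-resonance system with a closed-form response (`R`, `THETA`), the helper `dft_flow` is invented for the example, and both iterations use the same time step, whereas the flow above uses a larger Interval[i] for the later iterations. Only the cancellation of the wrapped tail is shown.

```python
import numpy as np

R, THETA = 0.998, 2 * np.pi / 64   # hypothetical one-resonance system

def dft_flow(x, n_points):
    """One DFT-flow pass with period n_points (in samples)."""
    omega = 2 * np.pi * np.arange(n_points) / n_points
    z = np.exp(-1j * omega)
    H = 0.5 * (1 / (1 - R * np.exp(1j * THETA) * z)
               + 1 / (1 - R * np.exp(-1j * THETA) * z))
    return np.fft.ifft(np.fft.fft(x, n_points) * H).real

n1 = 1024           # Period[1] in samples
n2 = 8 * n1         # Period[2] = 8 * Period[1]
x = np.zeros(n1)
x[:50] = 1.0        # short input pulse

v1 = dft_flow(x, n1)    # iteration #1: main part, distorted by wrap-around
v2 = dft_flow(x, n2)    # iteration #2: long period captures the tail

# Subtract every tail segment of v2 that wrapped into the first period of v1.
main = v1.copy()
for m in range(1, n2 // n1):
    main -= v2[m * n1:(m + 1) * n1]

final = np.concatenate([main, v2[n1:]])   # main part, then the tail
```

In this sketch `v1` equals the true output plus all later segments folded back into the first period, so subtracting the segments of `v2` beyond the first period recovers the main part up to the small residue that remains after Period[2].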


Experimental Results: Test Case & Input

Test case: 3D PDN

– One resonant peak in the impedance profile

Input current

– Time step: ∆t = 20ps

– Duration: T0 = 16.88ns

[Figure: PDN impedance vs. frequency (Hz), and the original input vs. time (sec)]

Experimental Results: Adaptive Flow Process

[Figure: panels labeled "Final output" and "Final output - v1(t)", vs. time (sec)]

Iteration #1: v1(t)

– ∆t1=20ps

– T1=20.48ns

Iteration #2: v2(t)

– ∆t2 = 32∆t1 = 640ps

– T2 = 8T1 = 163.84ns

Final output:

– Main part: $v_1(t) - \sum_{m=1}^{7} v_2(mT_1 : (m+1)T_1)$

– Tail: $v_2(T_1 : 8T_1)$

[Figure: v1(t) (∆t1 = 20ps, T1 = 20.48ns) and v2(t) (∆t2 = 640ps, T2 = 163.84ns) over 0 to 8T1, combined into the final output]

Experimental Results: DFT Flow vs. SPICE

[Figure: output of the DFT flow vs. SPICE transient simulation, vs. time (sec)]

Relative error: 0.25%

Error Analysis: Error Caused by Wrap-around Effect

[Figure: output comparison of the DFT flow with T = 20.48ns, the DFT flow with T = 163.84ns, and SPICE transient simulation, vs. time (sec)]

Theorem 1: Let be the initial value of the output voltage. Suppose for some , then the mean-square error, i.e., is bounded by .

[Figure: error relative to SPICE, shown as the difference between DFT and SPICE for T = 20.48ns and T = 163.84ns, vs. time (sec)]

Error Analysis: Error Caused by Different Interpolation Methods

SPICE: PWL interpolation

DFT: sinusoidal interpolation

[Figure: output comparison of the DFT flow with ∆t = 20ps, the DFT flow with ∆t = 2.5ps, and SPICE transient simulation, including a zoomed view near 22.95ns; error relative to SPICE for ∆t = 20ps and ∆t = 2.5ps, vs. time (sec)]
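The difference between the two interpolation methods can be seen on a toy signal. The sketch below assumes a hypothetical band-limited periodic signal (`signal`) sampled 32 times per period and compares piecewise-linear interpolation with sinusoidal interpolation, implemented here by zero-padding the DFT spectrum; none of the values come from the slides.

```python
import numpy as np

n, up = 32, 8                       # coarse samples per period, upsampling factor
t_coarse = np.arange(n) / n         # one period, normalized to [0, 1)
t_fine = np.arange(n * up) / (n * up)

def signal(t):
    """Hypothetical band-limited periodic test signal."""
    return np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 5 * t)

samples = signal(t_coarse)

# PWL interpolation: straight lines between samples (periodic extension).
pwl = np.interp(t_fine, t_coarse, samples, period=1.0)

# Sinusoidal interpolation: zero-pad the spectrum, then inverse DFT.
spectrum = np.fft.rfft(samples)
padded = np.zeros(n * up // 2 + 1, dtype=complex)
padded[:spectrum.size] = spectrum
sinusoidal = np.fft.irfft(padded, n=n * up) * up

exact = signal(t_fine)
pwl_err = np.abs(pwl - exact).max()
sin_err = np.abs(sinusoidal - exact).max()
print(f"PWL error: {pwl_err:.2e}, sinusoidal error: {sin_err:.2e}")
```

The two reconstructions agree at the sample points and differ only between them, which is consistent with the gap between the DFT flow and SPICE shrinking as the time step is reduced.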

Time Complexity Analysis: Adaptive vs. Non-adaptive

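Adaptive flow time complexity: $O\left(\sum_{i=1}^{k} T_i / \Delta t_i\right)$

– $T_i$: simulation period at iteration #i, $T_1 < T_2 < \cdots < T_k$

– $\Delta t_i$: simulation time step at iteration #i, $\Delta t_1 < \Delta t_2 < \cdots < \Delta t_k$

Non-adaptive flow time complexity: $O(T_k / \Delta t_1)$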

[Figure: non-adaptive method vs. adaptive method, plotted against time (sec)]
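As a worked example of the two expressions, the point counts can be evaluated for the two-iteration experiment reported earlier (T1 = 20.48ns, ∆t1 = 20ps, T2 = 163.84ns, ∆t2 = 640ps). The snippet only counts simulation points; using that count as a proxy for run time is an assumption, since the cost per point depends on the solver.

```python
# Periods and time steps of the two-iteration example, in picoseconds.
periods_ps = [20_480, 163_840]   # T1 = 20.48 ns, T2 = 163.84 ns
steps_ps = [20, 640]             # dt1 = 20 ps,   dt2 = 640 ps

# Adaptive flow: each iteration uses its own period and time step.
adaptive_points = sum(T // dt for T, dt in zip(periods_ps, steps_ps))

# Non-adaptive flow: the longest period sampled at the smallest time step.
basic_points = periods_ps[-1] // steps_ps[0]

print(adaptive_points, basic_points, basic_points / adaptive_points)
# 1280 8192 6.4
```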

Parallel Processing

Test case: 3D PDN

– ~ 0.17 million nodes

The adaptive DFT flow offers a larger advantage when the number of available processors is limited.

Successive iterations of the adaptive flow must be processed serially

Simulation time (sec) with different numbers of processors

              1 prc   4 prcs  8 prcs  16 prcs  64 prcs  128 prcs  256 prcs
Hspice        21374   NA      NA      NA       NA       NA        NA
Basic DFT     15976   6635    3225    1647     425      218       143
Adaptive DFT  4947    2096    904     490      180      120       115

[Figure: simulation time vs. number of processors, basic DFT vs. adaptive DFT]
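The trend in the table can be made explicit with a few ratios. The snippet below uses only the reported run times; the variable names are illustrative.

```python
procs = [1, 4, 8, 16, 64, 128, 256]
basic = [15976, 6635, 3225, 1647, 425, 218, 143]      # Basic DFT (sec)
adaptive = [4947, 2096, 904, 490, 180, 120, 115]      # Adaptive DFT (sec)

# How many times faster the adaptive flow is at each processor count.
advantage = [b / a for b, a in zip(basic, adaptive)]

# Parallel speedup of each flow relative to its own single-processor time.
basic_speedup = [basic[0] / b for b in basic]
adaptive_speedup = [adaptive[0] / a for a in adaptive]

for p, adv, sb, sa in zip(procs, advantage, basic_speedup, adaptive_speedup):
    print(f"{p:>3} procs: adaptive {adv:.2f}x faster than basic; "
          f"speedup basic {sb:.1f}x, adaptive {sa:.1f}x")
```

The adaptive flow is roughly 3.2x faster than the basic flow on one processor and about 1.2x faster on 256, in line with the observation that its advantage is largest when few processors are available.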

Parallel Processing: DFT Flow vs. SPICE

[Figure: simulation result of the adaptive DFT flow vs. Hspice, and the DFT error compared to Hspice, vs. time (sec)]


Conclusions

Implemented an adaptive flow for large PDN simulation using DFT

Total number of simulation points is reduced significantly compared to the basic DFT flow

Achieved a relative error of the order of 0.1% compared to SPICE

10x speedup with a single processor compared to SPICE

Parallel processing is incorporated to reduce the simulation time further
