Transcript of Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs (accelergy.mit.edu › slides.pdf)

1

Accelergy: An Architecture-Level Energy Estimation Methodology for

Accelerator Designs

Yannan Nellie Wu1 , Joel S. Emer1,2 , Vivienne Sze1

1 MIT 2 NVIDIA


2

Accelergy Overview

• An architecture-level energy estimator
• Flexibly characterizes various basic building blocks of different technologies
• Succinctly models diverse and complicated designs
• Improves estimation accuracy via fine-grained classification of operations
• Achieves 95% accuracy in evaluating a deep neural network (DNN) accelerator – Eyeriss [ISSCC 2016]


3

Energy Consumption Concerns

Data- and computation-intensive applications are power hungry
• Object Detection: DNN Accelerator
• Database Processing: Database Accelerator
We must quickly evaluate the energy efficiency of arbitrary potential designs in the large design space


4

Energy Estimation and Design Exploration

Arch. Description

global buffer(GLB)

buffer

MAC

PE*

processing element

component

abstract hierarchy


5

Energy Estimation and Design Exploration
• Physical-Level Energy Estimator (Synopsys PrimePower, Cadence Joules)
Arch. Description → RTL Model: develop the register transfer level (RTL) details
RTL Model → Physical Layout: synthesize the design, place standard cells, and route the wires
Physical Layout → Energy
[Figure: physical layout with standard cells (NOR3, OR4, OR2) and wires (wire0, wire1)]
Requires physical layout of the design


6

Energy Estimation and Design Exploration

• Physical-Level Energy Estimator (Synopsys PrimePower, Cadence Joules)
Arch. Description
RTL Model
Physical Layout
Fabricated Chip
Energy
Requires physical layout of the design
Slow design space exploration


7

Accelergy Overview

• An architecture-level energy estimator

• Flexibly characterizes various basic building blocks in the design

• Succinctly models diverse and complicated designs

• Improves estimation accuracy via fine-grained classification of operations

• Achieves 95% accuracy in evaluating a deep neural network (DNN) accelerator – Eyeriss [ISSCC 2016]


8

Energy Estimation and Design Exploration

• Architecture-Level Energy Estimators

Arch Description

Energy

Physical Layout

Fabricated Chip

RTL Model

Only requires architecture-level design

Fast design space exploration

global buffer(GLB)

buffer

MAC

PE*

processing element


9

Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators: Aladdin[ISCA2014], fixed-cost[Asilomar2017]

Description with primitive components (basic building blocks)

Energy Estimator

GLB

buffer

MAC

PE

Architecture Description


10

Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators: Aladdin[ISCA2014], fixed-cost[Asilomar2017]

GLB

buffer

MAC

PE

Architecture Description Energy Estimator

Comp. Action Energy

GLB access() 100pJ

buffer access() 10pJ

MAC compute() 5pJ

Energy Reference Table (ERT)


11

Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators: Aladdin[ISCA2014], fixed-cost[Asilomar2017]

GLB

buffer

MAC

PE

Architecture Description Energy Estimator

Comp. Action Energy

GLB access() 100pJ

buffer access() 10pJ

MAC compute() 5pJ

Energy Reference Table (ERT)

Action Counts

Comp. Action Counts

GLB access() 10

buffer access() 800

MAC compute() 400


12

Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators: Aladdin[ISCA2014], fixed-cost[Asilomar2017]

GLB

buffer

MAC

PE

Architecture Description Energy Estimator

Energy Reference Table (ERT)
Comp.    Action     Energy
GLB      access()   100pJ
buffer   access()   10pJ
MAC      compute()  5pJ

Action Counts
Comp.    Action     Counts
GLB      access()   10
buffer   access()   800
MAC      compute()  400

Energy Calculator

Energy Estimations
Name     Energy
GLB      1000pJ
buffer   8000pJ
MAC      2000pJ
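
The energy calculation on this slide can be sketched in a few lines of Python. This is only an illustration of the multiply-and-accumulate step, not Accelergy's actual code: the dictionary layout and the function name `estimate_energy` are assumptions, while the numbers are the ones shown in the ERT and action-count tables.

```python
# Energy reference table (ERT): energy per action in pJ, values from the slide.
ert = {
    ("GLB", "access"): 100,
    ("buffer", "access"): 10,
    ("MAC", "compute"): 5,
}

# Action counts for one workload, values from the slide.
action_counts = {
    ("GLB", "access"): 10,
    ("buffer", "access"): 800,
    ("MAC", "compute"): 400,
}


def estimate_energy(ert, action_counts):
    """Return per-component energy (pJ): energy-per-action times count."""
    energy = {}
    for (component, action), count in action_counts.items():
        energy[component] = energy.get(component, 0) + ert[(component, action)] * count
    return energy


energy = estimate_energy(ert, action_counts)
print(energy)  # {'GLB': 1000, 'buffer': 8000, 'MAC': 2000}
```

The result matches the energy estimation table: 1000pJ, 8000pJ, and 2000pJ.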


13

Comp. Action Counts

GLB access() 10

buffer access() 800

MAC compute() 400

Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators: Aladdin[ISCA2014], fixed-cost[Asilomar2017]

GLB

buffer

MAC

PE

Architecture Description Energy Estimator

Comp. Action Energy

GLB access() 100pJ

buffer access() 10pJ

MAC compute() 5pJ

Energy Calculator

GLB’

GLB’

Not generalizable to other designs

Energy Reference Table (ERT)

Action Counts


14

Accelergy Overview

• An architecture-level energy estimator
• Flexibly characterizes various primitive components of different technologies
• Succinctly models diverse and complicated designs
• Improves estimation accuracy via fine-grained classification of operations
• Achieves 95% accuracy in evaluating a deep neural network (DNN) accelerator – Eyeriss [ISSCC 2016]


15

Accelergy: Flexibly Model Various Primitive Components

GLB

buffer

MAC

PE

Architecture Description Accelergy

ERT Generator

Primitive Component Library
SRAM: SRAM type has associated action “access”
CACTI Estimation Plug-in
40nm Estimation Plug-in
…


16

Accelergy: Flexibly Model Various Primitive Components

GLB

buffer

MAC

PE

Architecture Description Accelergy

ERT Generator

Primitive Component Library
SRAM: SRAM type has associated action “access”
CACTI Estimation Plug-in
40nm Estimation Plug-in
…
ERT (in progress)
Comp.    Action     Energy
GLB      access()   100pJ


17

Accelergy: Flexibly Model Various Primitive Components
Architecture Description
GLB
buffer
MAC
PE
Accelergy
ERT Generator
Primitive Component Library
CACTI Estimation Plug-in
40nm Estimation Plug-in
…
ERT
Comp.    Action     Energy
GLB      access()   100pJ
buffer   access()   10pJ
MAC      compute()  5pJ
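
As a rough illustration of the ERT generation flow on these slides, the sketch below asks a list of plug-ins for the energy of every action that the primitive component library defines. Everything here is hypothetical: the plug-in functions, their signature, the `depth` attribute, and the toy cost rule are stand-ins chosen to reproduce the 100pJ, 10pJ, and 5pJ entries, and they are not the interface of Accelergy or of any real estimation tool.

```python
# Hypothetical stand-ins for estimation plug-ins. Each returns an energy in pJ
# for a (primitive type, action) request, or None if it cannot handle it.
def sram_plugin(primitive_type, action, attributes):
    if primitive_type == "SRAM" and action == "access":
        # Toy rule (assumption): larger SRAMs cost more per access.
        return 100 if attributes["depth"] >= 1024 else 10
    return None


def mac_plugin(primitive_type, action, attributes):
    if primitive_type == "mac" and action == "compute":
        return 5
    return None


# Primitive component library: the actions each primitive type supports.
PRIMITIVE_LIBRARY = {"SRAM": ["access"], "mac": ["compute"]}


def generate_ert(architecture, plugins):
    """Build {(component name, action): energy in pJ} for every component."""
    ert = {}
    for name, (primitive_type, attributes) in architecture.items():
        for action in PRIMITIVE_LIBRARY[primitive_type]:
            for plugin in plugins:
                energy = plugin(primitive_type, action, attributes)
                if energy is not None:
                    ert[(name, action)] = energy
                    break
            else:
                # No plug-in returned an estimate for this action.
                raise LookupError(f"no plug-in supports {primitive_type}.{action}")
    return ert


# Architecture description: component name -> (primitive type, attributes).
architecture = {
    "GLB": ("SRAM", {"depth": 4096}),
    "buffer": ("SRAM", {"depth": 64}),
    "MAC": ("mac", {}),
}
ert = generate_ert(architecture, [sram_plugin, mac_plugin])
```

Keeping the plug-ins behind one small callable interface is what lets a different technology be swapped in without touching the architecture description.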


18

Accelergy: Flexibly Model Various Primitive Components

GLB

buffer

MAC

PE

Architecture Description Accelergy

ERT Generator

Primitive Component Library
CACTI Estimation Plug-in
40nm Estimation Plug-in
…

ERT

Energy Calculator


19

Accelergy: Flexibly Model Various Primitive Components

GLB

buffer

MAC

PE

Architecture Description Accelergy

ERT Generator

Primitive Component Library
CACTI Estimation Plug-in
40nm Estimation Plug-in
…
ERT
Energy Calculator

Action Counts
Comp.    Action     Counts
GLB      access()   10
buffer   access()   800
MAC      compute()  400

Energy Estimates
Name     Energy
GLB      1000pJ
Buffer   8000pJ
MAC      2000pJ


20

Accelergy: Flexibly Model Various Primitive Components
Use energy estimation plug-ins to characterize primitive components
CACTI Estimation Plug-in
40nm Estimation Plug-in
Traditional open-source plug-ins*
Emerging technology plug-ins
NVSIM [TCAD 2012]
Proprietary plug-ins
*available at http://accelergy.mit.edu


21

Accelergy: Flexibly Model Various Primitive Components
Use energy estimation plug-ins to characterize primitive components
CACTI Estimation Plug-in
40nm Estimation Plug-in
Traditional open-source plug-ins*
Emerging technology plug-ins
NVSIM [TCAD 2012]
Proprietary plug-ins
*available at http://accelergy.mit.edu
Detailed plug-in interface in open-source repo


22

Modeling Complicated Designs

• Practical architecture designs involve many more details
– Example: storage units with local address generators (AGs)
o AG_SRAM is an abstract hierarchy
o Buffer is of SRAM type
o AGs are of counter type
[Figure: AG_SRAM containing buffer (SRAM, data in / data out), AG[0] (counter, read address), and AG[1] (counter, write address)]


23

Modeling Complicated Designs
• Practical architecture designs involve many more details
– Example: storage units with local address generators (AG)
Let’s construct a more practical design!
[Figure: design with GLB, and PE containing buffer and MAC]


24

Modeling Complicated Designs
• Practical architecture designs involve many more details
– Example: storage units with local address generators (AG)
Let’s construct a more practical design!
[Figure: design with GLB, and PE containing buffer and MAC]


25

Modeling Complicated Designs

Accelergy

ERT Generator

Primitive Component Library
CACTI Estimation Plug-in
40nm Estimation Plug-in
…
Energy Calculator

Architecture Description
• Architecture description is tedious
• Hard to make modifications

Action Counts
Name          Action   Counts
PE[0].AG[0]   count    50
PE[0].AG[1]   count    50
• Action counts are even more tedious
• Small modification requires new action counts


26

Existing Work - Modeling Complicated Designs

• Existing work that aims to succinctly model complicated architectures

– Wattch[ISCA2000], McPAT[MICRO2009]

CPU-Centric Architecture Model: ALU, L1 $, L2 $, ROB, …
Use a fixed set of compound components to represent the architecture
Compound components: components that can be decomposed into lower level components


27

Existing Work - Modeling Complicated Designs

• Existing work that aims to succinctly model complicated architectures

– Wattch[ISCA2000], McPAT[MICRO2009]

CPU-Centric Architecture Model: ALU, L1 $, L2 $, ROB, …

The fixed set of compound components is not sufficient to describe arbitrary accelerator designs


28

Accelergy Overview

• An architecture-level energy estimator
• Flexibly characterizes various primitive components of different technologies
• Succinctly models diverse and complicated designs
• Improves estimation accuracy via fine-grained classification of operations
• Achieves 95% accuracy in evaluating a deep neural network (DNN) accelerator – Eyeriss [ISSCC 2016]


29

Accelergy: Succinctly Model Arbitrary Architecture
Architecture described with abstract hierarchies
[Figure: PE, GLB, AG_SRAM]
AG_SRAM is an abstract hierarchy
AG_SRAM
  buffer: SRAM
  AG[0]: counter (read address)
  AG[1]: counter (write address)
AG_SRAM is a user-defined compound component
Architecture described with compound components and primitive components
Design
  GLB: AG_SRAM
  buffer: AG_SRAM
  MAC: mac


30

Accelergy: Succinctly Model Arbitrary Action Counts

Name Action Counts

GLB.AG[0] count() 50

GLB.AG[1] count() 20

GLB.buffer read() 50

GLB.buffer write() 20

Tedious action counts in terms of primitive component actions

Action Counts


31

Accelergy: Succinctly Model Arbitrary Action Counts

User-defined compound actions
AG_SRAM.read() = AGs[0].count() + buffer.read()
Design
  GLB: AG_SRAM
  buffer: AG_SRAM


32

Accelergy: Succinctly Model Arbitrary Action Counts

User-defined compound actions
AG_SRAM.read() = AGs[0].count() + buffer.read()
Design
  GLB: AG_SRAM
  buffer: AG_SRAM

Action Counts
Name   Action    Counts
GLB    read()    50
GLB    write()   20
Succinct action counts with compound component actions
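
To make the relationship between compound and primitive action counts concrete, here is a minimal sketch that expands the two compound counts for GLB into primitive counts. The `read` decomposition follows the slide; the `write` decomposition (AG[1] count plus buffer write) is an assumption inferred from the primitive-level table two slides earlier, and the data layout and the name `expand` are illustrative only.

```python
from collections import Counter

# Compound action definitions for the AG_SRAM compound component.
# read() follows the slide; write() is an assumed mirror image.
AG_SRAM_ACTIONS = {
    "read": [("AG[0]", "count"), ("buffer", "read")],
    "write": [("AG[1]", "count"), ("buffer", "write")],
}


def expand(instance, compound_counts, definitions):
    """Expand compound action counts into primitive action counts."""
    primitive = Counter()
    for action, count in compound_counts.items():
        for sub_component, sub_action in definitions[action]:
            primitive[(f"{instance}.{sub_component}", sub_action)] += count
    return dict(primitive)


primitive_counts = expand("GLB", {"read": 50, "write": 20}, AG_SRAM_ACTIONS)
for (name, action), count in primitive_counts.items():
    print(name, f"{action}()", count)
```

Two compound entries expand to the four primitive entries (GLB.AG[0], GLB.AG[1], and the read and write of GLB.buffer), which is the bookkeeping the user no longer has to write by hand.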


33

Accelergy: Succinctly Model Complex Designs

Accelergy

ERT Generator

Primitive Component Library
CACTI Estimation Plug-in
40nm Estimation Plug-in
…
Energy Calculator
Architecture Description
Compound Component Description
  MAC_FIFO
    MAC: mac
    FIFO: FIFO

ERT
Comp.          Actions        Energy
GLB            read(), …      120pJ, …
PE[0].buffer   read(), …      12pJ, …
PE[0].MAC      compute(), …   5pJ, …


34

Accelergy: Succinctly Model Complex Designs

Accelergy

ERT Generator

Primitive Component Library
CACTI Estimation Plug-in
40nm Estimation Plug-in
…
Energy Calculator
Architecture Description
Compound Component Description
  MAC_FIFO
    MAC: mac
    FIFO: FIFO

ERT
Comp.          Actions        Energy
GLB            read(), …      120pJ, …
PE[0].buffer   read(), …      12pJ, …
PE[0].MAC      compute(), …   5pJ, …

Action Counts
Name           Action    Counts
GLB            read()    50
PE[0].buffer   read()    60

Energy Estimations
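
One way to picture how a compound ERT entry could arise is to sum the energies of the primitive actions behind each compound action, then multiply by the compound action counts. The sketch below does that. The primitive energies (20pJ, 100pJ, 2pJ, 10pJ) are hypothetical values picked only so that the sums equal the 120pJ and 12pJ `read()` entries on the slide; the slides do not state how those entries break down.

```python
def compound_energy(sub_actions, primitive_energy):
    """Energy of one compound action: sum over its primitive sub-actions."""
    return sum(primitive_energy[sub] * repeat for sub, repeat in sub_actions)


# AG_SRAM.read() = one address-generator count + one SRAM read.
read_sub_actions = [(("AG", "count"), 1), (("SRAM", "read"), 1)]

# Hypothetical primitive energies in pJ for the two AG_SRAM instances.
glb_primitives = {("AG", "count"): 20, ("SRAM", "read"): 100}
pe_buffer_primitives = {("AG", "count"): 2, ("SRAM", "read"): 10}

# Compound-level ERT entries.
ert = {
    ("GLB", "read"): compound_energy(read_sub_actions, glb_primitives),
    ("PE[0].buffer", "read"): compound_energy(read_sub_actions, pe_buffer_primitives),
}

# Compound action counts from the slide.
action_counts = {("GLB", "read"): 50, ("PE[0].buffer", "read"): 60}

# Energy estimate per entry in pJ.
energy = {key: ert[key] * count for key, count in action_counts.items()}
```

With these assumed inputs, 50 GLB reads cost 6000pJ and 60 PE[0].buffer reads cost 720pJ.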


35

Additional Challenge: Inaccurate Modeling of Energy/Action

• Existing architecture-level energy estimators only model coarse action types

Coarse-grained Actions
Component   Action      Energy
GLB         access()    100pJ
Buffer      access()    10pJ
ALU         compute()   5pJ

Fine-grained Actions
[Bar chart: Energy-Per-Action of a Register File (normalized to idle). Bars: Random Read, Repeated Read, Random Write, Repeated Write, Constant Data Write. Values shown: 1.8, 1.0, 4.7, 2.1, 2.4. Spread: ~5x]

Coarse-grained estimations reduce estimation accuracies


36

Accelergy Overview

• An architecture-level energy estimator
• Flexibly characterizes various primitive components of different technologies
• Succinctly models diverse and complicated designs
• Improves estimation accuracy via fine-grained actions
• Achieves 95% accuracy in evaluating a deep neural network (DNN) accelerator – Eyeriss [ISSCC 2016]


37

Accelergy: Fine-grained Action Classification

• Accurate estimation with a primitive component library

Defines the fine-grained actions for each primitive component
[Bar chart: Fine-grained memory action types. Bars: Random Read, Repeated Read, Random Write, Repeated Write, Constant Data Write. Values shown: 1.8, 1.0, 4.7, 2.1, 2.4. Spread: ~5x]
[Bar chart: Fine-grained multiplier action types. Bars: Random Mult, Reused Mult, Gated Mult. Values shown: 23.0, 16.8, 1.3. Spread: ~20x]


38

Accelergy: Fine-grained Action Classification

• Accurate estimation with a primitive component library*

Defines the fine-grained actions for each primitive component
[Bar chart: Fine-grained memory action types. Bars: Random Read, Repeated Read, Random Write, Repeated Write, Constant Data Write. Values shown: 1.8, 1.0, 4.7, 2.1, 2.4. Spread: ~5x]
[Bar chart: Fine-grained multiplier action types. Bars: Random Mult, Reused Mult, Gated Mult. Values shown: 23.0, 16.8, 1.3. Spread: ~20x]
*Detailed methodology for generating fine-grained action types in paper
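
The effect of fine-grained action types on an estimate can be illustrated with a small calculation. The sketch below assumes that the three multiplier values on the chart pair with the labels in the order they appear (Random 23.0, Reused 16.8, Gated 1.3); that pairing is a guess from the transcript. The action counts are hypothetical, chosen to mimic a sparse workload.

```python
# Normalized energy per multiplier action (assumed value-to-label pairing).
FINE_GRAINED = {"random_mult": 23.0, "reused_mult": 16.8, "gated_mult": 1.3}

# Hypothetical action counts for a sparse workload.
counts = {"random_mult": 100, "reused_mult": 50, "gated_mult": 250}

# Fine-grained estimate: each action type is charged its own energy.
fine = sum(FINE_GRAINED[action] * n for action, n in counts.items())

# Coarse-grained estimate: every multiply is charged as a random multiply.
coarse = FINE_GRAINED["random_mult"] * sum(counts.values())

# How far the coarse-grained model overshoots the fine-grained one.
overestimate = coarse / fine
print(f"fine={fine:.0f} coarse={coarse:.0f} ratio={overestimate:.2f}")
```

Under these assumptions the coarse-grained model charges more than twice the fine-grained total, because most of the multiplies are gated.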


39

Accelergy Overview

• An architecture-level energy estimator
• Flexibly characterizes various primitive components of different technologies
• Succinctly models diverse and complicated designs
• Improves estimation accuracy via fine-grained actions
• Achieves 95% accuracy in evaluating a deep neural network (DNN) accelerator – Eyeriss [ISSCC 2016]


40

Energy Evaluations on Eyeriss

• Experimental Setup:
– Workload: AlexNet weights & ImageNet input feature maps
– Ground Truth: Energy obtained from post-layout simulations

Eyeriss Architecture
[Figure: GLBs (Weights GLB, Shared GLB), NoCs (WeightsNoC, IfmapNoC, PsumWrNoC, PsumRdNoC), and a 12x14 PE array; each PE contains weights_spad, ifmap_spad, psum_spad, and MAC]
Ifmap = input feature map
Psum = partial sum
PE = processing element
*_spad = *_scratchpad


41

Energy Evaluations on Eyeriss

• Experimental Setup:
– Workload: AlexNet weights & ImageNet input feature maps
– Ground Truth: Energy obtained from post-layout simulations

Eyeriss Architecture
[Figure: GLBs (Weights GLB, Shared GLB), NoCs (WeightsNoC, IfmapNoC, PsumWrNoC, PsumRdNoC), and a 12x14 PE array; each PE contains weights_spad, ifmap_spad, psum_spad, and MAC]
Ifmap = input feature map
Psum = partial sum
PE = processing element
*_spad = *_scratchpad

Zero-gating optimization
If an ifmap data value is 0:
• Gate on reading the weights data => gated-read
• Gate on computing the MAC => gated-MAC
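
The zero-gating rule above can be turned into fine-grained action counts with a simple classifier. This is a minimal sketch under assumed names: the action labels `gated_read`, `read`, `gated_mac`, and `mac`, and the function `pe_action_counts`, are illustrative and are not taken from the Eyeriss or Accelergy sources.

```python
from collections import Counter


def pe_action_counts(ifmap_values):
    """Classify each weight read and MAC by whether the ifmap value is zero."""
    counts = Counter()
    for value in ifmap_values:
        if value == 0:
            # Zero ifmap: both the weight read and the MAC are gated.
            counts[("weights_spad", "gated_read")] += 1
            counts[("MAC", "gated_mac")] += 1
        else:
            counts[("weights_spad", "read")] += 1
            counts[("MAC", "mac")] += 1
    return dict(counts)


# Hypothetical ifmap stream in which three of five values are zero.
counts = pe_action_counts([0, 3, 0, 0, 7])
```

An estimator that only knows a single `read` and a single `mac` action would count all five as full-energy actions, which is why sparsity changes the per-PE energy.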


42

Total Energy and Coarse Energy Breakdown

• The total energy estimate is 95% accurate relative to the post-layout energy.

• The estimated relative breakdown of the important units in the design is within 8% of the post-layout energy.
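
The slides do not define the accuracy metric. One common reading, assumed here, is one minus the relative error against the post-layout ground truth. The numbers in the sketch are hypothetical and only show the arithmetic.

```python
def accuracy(estimate, ground_truth):
    """Accuracy as 1 minus relative error (an assumed definition)."""
    return 1.0 - abs(estimate - ground_truth) / ground_truth


# Hypothetical values: an estimate 5% below the post-layout energy.
acc = accuracy(estimate=95.0, ground_truth=100.0)
print(f"{acc:.0%}")
```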


44

PE Array Energy Breakdown

• Comparisons with existing work: Aladdin and fixed-cost
[Figure: Energy Breakdown of PEs across the Array. y-axis: Energy Consumption (µJ), 0.10 to 0.26. x-axis: PEs that process data of different sparsity. Series: ground truth, Accelergy, Aladdin, fixed-cost]


45

PE Array Energy Breakdown

• Comparisons with existing work: Aladdin and fixed-cost
[Figure: Energy Breakdown of PEs across the Array. y-axis: Energy Consumption (µJ), 0.10 to 0.26. x-axis: PEs that process data of different sparsity. Series: ground truth, Accelergy, Aladdin, fixed-cost]
Not aware of the fine-grained actions related to zero-gating optimization


46

PE Array Energy Breakdown

• Comparisons with existing work: Aladdin and fixed-cost
[Figure: Energy Breakdown of PEs across the Array. y-axis: Energy Consumption (µJ), 0.10 to 0.26. x-axis: PEs that process data of different sparsity. Series: ground truth, Accelergy, Aladdin, fixed-cost]
Inaccurate energy characterization of components


47

PE Energy Breakdown

• Comparisons with existing work: Aladdin and fixed-cost
[Figure: Energy Breakdown of components inside a PE. y-axis: Energy Consumption (nJ), 0 to 100. x-axis: ifmap_spad, psum_spad, weights_spad, MAC. Series: ground truth, Accelergy, Aladdin, fixed-cost]
Zero-gating action type not reflected


48

PE Energy Breakdown

• Comparisons with existing work: Aladdin and fixed-cost
[Figure: Energy Breakdown of components inside a PE. y-axis: Energy Consumption (nJ), 0 to 100. x-axis: ifmap_spad, psum_spad, weights_spad, MAC. Series: ground truth, Accelergy, Aladdin, fixed-cost]
All local scratchpads share the same energy reference table


49

Conclusion

• Accelergy is an architecture-level energy estimator that

–Accelerates accelerator design space exploration

–Provides flexibility to

• Describe a diverse range of accelerator designs

• Support estimation of different technologies, e.g., CMOS, RRAM, optical

–Achieves high accuracy energy estimations

• 95% accurate for the Eyeriss accelerator

• Open-source code available at: http://accelergy.mit.edu

Acknowledgement: DARPA, Facebook, MIT Presidential Fellowship