Self-Adaptive FPGA-Based Image Processing Filters Using Approximate Arithmetics
Jutta Pirkl, Andreas Becher, Jorge Echavarria, Jürgen Teich, and Stefan Wildermann
Hardware/Software Co-Design, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
SCOPES’17, St. Goar, Germany, June 12, 2017



Approximate Computing – A New Design Paradigm

Portable battery-powered devices Rapid workload increase

Underlying Idea

Trading accuracy of computations against disproportionate improvements with respect to power consumption and/or performance and/or circuit area.

Sources: https://www.lsr.com/embedded-wireless-modules/tiwiconnect, http://www.closdurocnoir.com/category/battery, http://blog.apterainc.com/business-intelligence/the-data-tsunami-is-coming.-is-your-company-ready

Jutta Pirkl | Hardware/Software Co-Design (FAU) | Self-Adaptive FPGA-Based Image Processing SCOPES’17 1


Motivation

Problem:
• Error tolerance depends on both input data and application context
• Quality-configurability is the key principle of prospective AC platforms [1]

[Figure labels: Dynamic Partial Reconfiguration; Approximate Computing; Adaptation of the approximation level]

Approach: Dynamic autonomous swapping of filters with different degrees of approximation utilizing reconfigurable hardware

[1] S. Venkataramani et al. “Approximate computing and the quest for computing efficiency”. In: 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC). June 2015, pp. 1–6.


Outline

Concepts of Self-Adaptive Image Processing
  Approximate 2D-Convolution Filters
  Quality Evaluation
  Reconfiguration Management

Experimental Results
  Quality-Configurable Control Mechanism
  Partial Reconfiguration Overhead

Summary


Concepts of Self-Adaptive Image Processing


Approximate 2D-Convolution Filters

• Basic filter building block: 2D-convolution filter wrapper with a kernel size of 3×3

$$ Y[m,n] = \sum_{i} \sum_{j} H[i,j] \cdot X[m-i,\, n-j] $$

with output Y, filter kernel H and input X

• Parallel Multiply-Accumulate (MAC) operation in a pipelined adder tree structure

• Replacement of all adders by the same approximate version
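The convolution sum and the adder tree can be illustrated in software. The following is a minimal sketch, not the hardware implementation from the talk: the function names `adder_tree` and `conv3x3` are hypothetical, the kernel indices i, j are assumed to run over {-1, 0, 1}, and border pixels are simply left at 0. The `add` parameter marks the place where every adder of the tree could be swapped for one approximate version.

```python
from typing import Callable, List

Image = List[List[int]]


def exact_add(a: int, b: int) -> int:
    """Reference adder; an approximate adder can be passed in its place."""
    return a + b


def adder_tree(values: List[int], add: Callable[[int, int], int]) -> int:
    """Sum the values pairwise, level by level, like a hardware adder tree."""
    level = list(values)
    while len(level) > 1:
        nxt = [add(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd element is forwarded to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


def conv3x3(x: Image, h: Image,
            add: Callable[[int, int], int] = exact_add) -> Image:
    """Y[m,n] = sum_i sum_j H[i,j] * X[m-i, n-j] with i, j in {-1, 0, 1}.

    H[i,j] is stored at h[i + 1][j + 1]. Border pixels stay 0 (assumption).
    """
    rows, cols = len(x), len(x[0])
    y = [[0] * cols for _ in range(rows)]
    for m in range(1, rows - 1):
        for n in range(1, cols - 1):
            products = [h[i + 1][j + 1] * x[m - i][n - j]
                        for i in (-1, 0, 1) for j in (-1, 0, 1)]
            y[m][n] = adder_tree(products, add)
    return y
```

All nine products are formed independently and then reduced in four tree levels, which mirrors the parallel MAC structure named on the slide; the final normalization of the result is not part of this sketch.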



Approximate Adder Structures on FPGAs [2]

[Figure: LUT-based approximate adder built from LUT6_2 primitives and one LUT5; inputs a_0, b_0 … a_{n−1}, b_{n−1}; outputs s_0 … s_{n−1}; carry signals c_0 … c_m … c_out and the all1 signal; carry chain split at position m into Most Significant Part (MSP) and Least Significant Part (LSP)]

Case 1: Carry suppression

        MSP LSP
        011 111   (31)
      + 001 011   (11)
      = 100 010   (34; exact sum: 42)
      → 100 111   (39)

⇒ error reduction mechanism

Case 2: Carry prediction

        MSP LSP
        011 111   (31)
      + 001 111   (15)
      = 101 110   (46)

⇒ no approximation error

[2] A. Becher et al. “A LUT-based approximate adder”. In: Proceedings of the 24th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines. FCCM ’16. Washington DC, USA, May 2016.
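The two cases can be reproduced with a small behavioral model. This is a sketch inferred from the worked examples on this slide, not the LUT-level design of the cited paper. It assumes that the carry into the MSP is predicted from the topmost LSP bits (a_{m−1} AND b_{m−1}), and that the LSP sum is forced to all ones whenever a real carry out of the LSP was not predicted. The function name `approx_add` is hypothetical.

```python
def approx_add(a: int, b: int, m: int) -> int:
    """Behavioral sketch of an adder whose carry chain is split at bit m.

    Bits [m-1:0] form the LSP, the remaining bits form the MSP.
    Operands are assumed to be non-negative integers.
    """
    if m == 0:  # no LSP at all: plain exact addition
        return a + b

    lsp_mask = (1 << m) - 1
    lsp_sum = (a & lsp_mask) + (b & lsp_mask)
    real_carry = lsp_sum >> m  # carry that the split chain cannot forward

    # Assumed carry prediction: both topmost LSP bits are set.
    predicted = (a >> (m - 1)) & (b >> (m - 1)) & 1

    msp = (a >> m) + (b >> m) + predicted
    lsp = lsp_sum & lsp_mask

    # Assumed error reduction (all1): a lost carry saturates the LSP.
    if real_carry and not predicted:
        lsp = lsp_mask

    return (msp << m) | lsp


print(approx_add(31, 11, 3))  # Case 1 on the slide: 39 instead of 42
print(approx_add(31, 15, 3))  # Case 2 on the slide: 46, no error
```

Under these assumptions the result never exceeds the exact sum, which matches the description of the adder as underestimating.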


Case Study – Approximate Gaussian Lowpass Filter

[Figure: filter output images for m = 1, 2, 4, 6, 8, 10 and for m = 9]

Artifacts:
• Brightness decrease → underestimating adder
• “Cartoon effect”


Impact of the Carry Chain Splitting Point on the Output Quality

Dependency of the average Peak Signal-to-Noise Ratio (PSNR) on m among the Kodak Lossless True Color Image Suite [3]

[Plot: Average PSNR [dB] (axis 20 to 60) over the splitting position of the carry chain m (2 to 11)]

[3] R. Franzen. True Color Kodak Images. http://r0k.us/graphics/kodak/.
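For reference, PSNR compares an approximate output against the exact output. The sketch below uses the standard definition for 8-bit gray levels (peak value 255); the function name `psnr` and the flat pixel-list representation are assumptions for illustration and say nothing about how the averages in the plot were computed.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], test: Sequence[int], peak: int = 255) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR needs the exact result as a reference, it is suitable for offline evaluation only.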


Quality Evaluation

• Problem: Requirement of a no-reference metric to assess the quality at runtime

• Approach: Feature extraction from the histograms of in- and output images

[Figure: gray-level histograms (frequency over gray level 0 to 250) of the input image and of the output images for m = 1, 3, 5, 7, 9]

• More and more pixels are mapped onto exactly the same brightness values → “cartoon” effect


Quality Evaluation

[Figure: output histograms (frequency, scale ·10^4, over gray level 0 to 250) for m = 1, m = 7 and m = 9]

Gauss kernel:

$$ H = \frac{1}{16} \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix} $$

• Distinctive peaks created by the all1-signal → erroneous sums are mapped onto $2^{n-m}$ values

• Example: m = 9, output bit width after normalization n = 8

  before normalization: x_20 ··· x_9 | 111111111
  after normalization by 16: x_11 x_10 x_9 | 11111

  → smallest collection bin at 00011111b (31d)
  → further peaks at a distance of $2^5 = 32$
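The peak positions of the example can be enumerated directly. The sketch below assumes that normalization by 16 is a right shift by 4 bits, so that k = m − 4 = 5 trailing ones remain, and that the normalized result fits into n = 8 bits. The function name `peak_bins` and its parameters are hypothetical.

```python
from typing import List


def peak_bins(m: int, shift: int, n: int) -> List[int]:
    """Gray levels whose k = m - shift low bits are all ones (n-bit output)."""
    k = m - shift            # trailing ones that survive the normalization
    ones = (1 << k) - 1      # e.g. 0b11111 = 31 for k = 5
    # The n - k upper bits are free, the k lower bits are forced to one.
    return [(upper << k) | ones for upper in range(1 << (n - k))]


print(peak_bins(m=9, shift=4, n=8))
# [31, 63, 95, 127, 159, 191, 223, 255]
```

The list starts at 31 and continues in steps of 32, in agreement with the example; the three free bits x_11 x_10 x_9 give eight candidate bins.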


Quality Evaluation

• The number of counters used to sample the histograms constitutes a trade-off between overhead and fidelity

• Counting of the pixels with gray levels 31, 63, 127 and 191 in both in- and output image

Definition QM: Ratio of the maximum peak height of the four bins and the corresponding amount in the input image

• Large metric value indicates bad quality

[Plot: Progression of QM (axis 0 to 30) with increasing splitting position of the carry chain m (1 to 11)]
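One possible reading of the QM definition is sketched below. It assumes that the "corresponding amount" is the input count of the same gray level that shows the highest output peak, and that an empty input bin is treated as a count of 1 to avoid a division by zero. Both points are assumptions, as are the names `quality_metric` and `SAMPLED_BINS`.

```python
from collections import Counter
from typing import Iterable, Sequence

SAMPLED_BINS = (31, 63, 127, 191)  # gray levels counted on the slide


def quality_metric(inp: Iterable[int], out: Iterable[int],
                   bins: Sequence[int] = SAMPLED_BINS) -> float:
    """Ratio of the highest sampled output peak to the matching input count."""
    wanted = set(bins)
    in_counts = Counter(p for p in inp if p in wanted)
    out_counts = Counter(p for p in out if p in wanted)
    # Bin with the maximum peak height in the output image
    peak_bin = max(bins, key=lambda g: out_counts[g])
    # Assumption: clamp the denominator so an empty input bin counts as 1
    return out_counts[peak_bin] / max(1, in_counts[peak_bin])
```

Only four counters per image are needed here, which reflects the trade-off between overhead and fidelity named above.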


Reconfiguration Management

• Objective: Minimization of the critical path while maintaining a given quality boundary → successive approximation of m to the fastest configuration m_opt

• Based on a bang-bang controller with integrated hysteresis

Decision logic for setting the degree of approximation

• Case 1: Quality is still acceptable → in- or decrement the splitting position m in the direction of m_opt

• Case 2: Quality boundary is exceeded → in- or decrement m in the opposite direction of m_opt

• Case 3: Quality metric is within dead zone → keep configuration
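The three cases can be sketched as a single controller step. This is a simplified illustration under stated assumptions, not the Reconfiguration Manager from the talk: the controller is assumed to operate in the range m ≤ m_opt, so that "towards m_opt" means incrementing m, and the dead zone is assumed to lie directly below the quality threshold. The names `next_split`, `tau` and `dead_zone` are hypothetical.

```python
def next_split(m: int, qm: float, tau: float, dead_zone: float,
               m_opt: int, m_min: int = 1) -> int:
    """One step of a bang-bang controller with hysteresis (sketch).

    qm        : current quality metric (large value = bad quality)
    tau       : quality boundary that must not be exceeded
    dead_zone : width of the band below tau in which m is kept (assumption)
    """
    if qm > tau:
        # Case 2: boundary exceeded, step away from m_opt (more accurate)
        return max(m_min, m - 1)
    if qm >= tau - dead_zone:
        # Case 3: within the dead zone, keep the current configuration
        return m
    # Case 1: quality still acceptable, step towards m_opt
    return min(m_opt, m + 1)


# Example: start at m = 5 with tau = 4.0 and a dead zone of 1.0
m = 5
for qm in (1.0, 2.0, 3.5, 4.5):
    m = next_split(m, qm, tau=4.0, dead_zone=1.0, m_opt=8)
    print(m)  # 6, 7, 7, 6
```

The dead zone keeps the controller from toggling between two neighboring configurations when the metric hovers around the boundary, which would otherwise trigger a reconfiguration at every adaptation step.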


Experimental Results


Results – Input-Based Adaptivity

System behavior at runtime for the approximate Gaussian filter

[Plot: QM (axis 0 to 7, threshold τ_QM marked) over frames 100 to 1,000 for dynamic, static m = 5 and static m = 6]

[Plot: splitting position m (axis 2 to 10) over frames 100 to 1,000]


Results – Requirement-Based Adaptivity

System evaluation results [4], [5]

[Plot “DoA”: m_avg (axis 1 to 9) for the quality requirements low, medium and high]

[Plot “Output Quality”: PSNR_avg [dB] (axis 0 to 60) for the quality requirements low, medium and high]

Legend: aspen, redkayak, snowmnt, touchdownpass, pedestrian area, demo video

[4] Test videos: Derf’s collection + self-shot demo video, resolution of 640×480, grayscale 8 bits/pixel

[5] Evaluation parameters: Adaptation rate of 2 sec at a frame rate of 30 fps


Analysis of the Partial Reconfiguration Overhead

Reconfiguration time for the partial bitstreams [6]

XC7Z020                      Partial Bitstream
Bitstream Size [KB] [7]      637.44
Reconfiguration Time [ms]    1.62
Download Rate [MB/s]         374.48

• Approximately linear correlation between configuration time and bitstream size

• Remaining time slot for the filtering process at 30 fps:
  • 33.33 ms − 1.62 ms = 31.71 ms
  • Partial reconfiguration requires 4.86 % of the time frame

[6] This table contains only the largest bitstream among the approximate variants for the Gaussian filter, which determines the slowest transfer

[7] Full binary bitstream size for the xc7z020 device: 4,045,564 Bytes
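The time budget above follows from the frame period at 30 fps and the measured reconfiguration time. The short sketch below only restates that arithmetic; the function name `reconfig_budget` is hypothetical.

```python
from typing import Tuple


def reconfig_budget(fps: float, reconfig_ms: float) -> Tuple[float, float, float]:
    """Return frame period, remaining filter time (both ms) and overhead in %."""
    frame_ms = 1000.0 / fps                       # time slot of one frame
    remaining_ms = frame_ms - reconfig_ms         # left for the filtering process
    share_percent = 100.0 * reconfig_ms / frame_ms
    return frame_ms, remaining_ms, share_percent


frame, remaining, share = reconfig_budget(fps=30, reconfig_ms=1.62)
print(f"{frame:.2f} ms, {remaining:.2f} ms, {share:.2f} %")
# 33.33 ms, 31.71 ms, 4.86 %
```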


Summary


• Challenge: Input-dependent approximation error behavior requires self-adaptive methods

• Proposition of a no-reference metric for online output quality monitoring based on histogram information

• Our concept offers
  • better exploitation of a given error tolerance than static approximation
  • a user control knob to select the desired output quality at runtime

Thank you for listening!

Any questions?


Backup Slides


System Level Overview

[Block diagram: SoC comprising Processing System, Programmable Logic and Main Memory. Labels: User application + Reconfiguration Manager; Linux Kernel; Driver Modules; /dev/image_filter; /dev/xdevcfg; HW/SW Interface; Filter Controller; Quality Evaluation; Reconfigurable Partition; Filter Wrapper; PR; m = 1, m = 2, m = 3, ···]

Software

• Reconfiguration Manager: Quality-control loop

• Linux device drivers as hardware interfaces

Hardware

• Approximate Filter Operators: Partial bitstreams for various degrees of approximation

• Quality Evaluation: Online quality monitoring


Correlation Between the Proposed Quality Metric and PSNR

[Plot: QM (axis 0 to 20) over PSNR [dB] (30 to 60)]

Inverse relation: Increasing tendency of the metric with decreasing PSNR
