module8-0

8/16/2019 module8-0


Digital Sign

Module 8:


Module Overview:

◮ Module 8.1: Introduction to Images and Image Processing

◮ Module 8.2: Affine Transforms

◮ Module 8.3: 2D Fourier Analysis

◮ Module 8.4: Image Filters

◮ Module 8.5: Image Compression

◮ Module 8.6: The JPEG Compression Standard


Digital Sign

Module 8.1:


Overview:

◮ Images as multidimensional digital signals

◮ 2D signal representations

◮ Basic signals and operators


Please meet ...


Digital images

◮ two-dimensional signal x [n1, n2], n1, n2 ∈ Z

◮ indices locate a point on a grid → pixel

◮ grid is usually regularly spaced

◮ values x [n1, n2] refer to the pixel’s appearance


Digital images: grayscale vs color

◮ grayscale images: scalar pixel values

◮ color images: multidimensional pixel values in a color space (RGB, HSV

◮ we can consider the single components separately:

[Figure: a color image shown as the sum of its separate color components]

Image processing

From one to two dimensions...

◮ something still works

◮ something breaks down

◮ something is new


What works:

◮ linearity, convolution

◮ Fourier transform

◮ interpolation, sampling


What breaks down:

◮ Fourier analysis less relevant

◮ filter design hard, IIRs rare

◮ linear operators only mildly useful


What’s new:

◮ new manipulations: affine transforms

◮ images are finite-support signals

◮ images are (most often) available in their entirety → causality loses meaning

◮ images are very specialized signals, designed for a very specific processing system: the human brain! Lots of semantics that are extremely hard to deal with


2D signal processing: the basics

A two-dimensional discrete-space signal:

x [n1, n2], n1, n2 ∈ Z

2D signals: Cartesian representation

[Figure: plot of a signal h[n1, n2] over the n1 and n2 axes]

2D signals: support representation

◮ just show the coordinates of the nonzero samples

◮ amplitude may be written alongside explicitly

◮ example:

$$\delta[n_1, n_2] = \begin{cases} 1 & \text{if } n_1 = n_2 = 0 \\ 0 & \text{otherwise} \end{cases}$$

[Figure: support plot of δ[n1, n2], a single dot of amplitude 1 at the origin; both axes n1 and n2 run from −5 to 5]

2D signals: image representation

◮ the medium has a certain dynamic range (paper, screen)

◮ image values are quantized (usually to 8 bits, or 256 levels)

◮ the eye does the interpolation in space, provided the pixel density is high enough

[Figure: the signal displayed as a grayscale image over the n1 and n2 axes]

Why 2D?

◮ images could be unrolled (printers, fax)

◮ but what about spatial correlation?


2D vs raster scan

[Figure: a 2D signal on a grid from −20 to 20 in both directions, shown together with its raster-scan (unrolled) 1D version, whose values lie between 0 and 1]

Basic 2D signals: the impulse

$$\delta[n_1, n_2] = \begin{cases} 1 & \text{if } n_1 = n_2 = 0 \\ 0 & \text{otherwise} \end{cases}$$

[Figure: support plot of the impulse; both axes run from −5 to 5]

Basic 2D signals: the rect

$$\mathrm{rect}\!\left(\frac{n_1}{2N_1}, \frac{n_2}{2N_2}\right) = 1 \quad \text{if } |n_1| \ldots$$

Separable signals

$$x[n_1, n_2] = x_1[n_1]\, x_2[n_2]$$

$$\delta[n_1, n_2] = \delta[n_1]\,\delta[n_2]$$

$$\mathrm{rect}\!\left(\frac{n_1}{2N_1}, \frac{n_2}{2N_2}\right) = \mathrm{rect}\!\left(\frac{n_1}{2N_1}\right) \mathrm{rect}\!\left(\frac{n_2}{2N_2}\right)$$

Nonseparable signal

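$$x[n_1, n_2] = 1 \quad \text{if } |n_1| + |n_2| \ldots$$

$$x[n_1, n_2] = \mathrm{rect}\!\left(\frac{n_1}{2N_1}, \frac{n_2}{2N_2}\right) - \mathrm{rect}\!\left(\frac{n_1}{2M_1}, \frac{n_2}{2M_2}\right)$$

[Figure: support plot of the signal; both axes run from −5 to 5]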

2D convolution

$$x[n_1, n_2] * h[n_1, n_2] = \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} x[k_1, k_2]\, h[n_1 - k_1, n_2 - k_2]$$

2D convolution for separable signals

If h[n1, n2] = h1[n1] h2[n2]:

$$x[n_1, n_2] * h[n_1, n_2] = \sum_{k_1=-\infty}^{\infty} h_1[n_1 - k_1] \sum_{k_2=-\infty}^{\infty} x[k_1, k_2]\, h_2[n_2 - k_2]$$

$$= h_1[n_1] * (h_2[n_2] * x[n_1, n_2])$$

If h[n1, n2] is an M1 × M2 finite-support signal:

◮ non-separable convolution: M1 M2 operations per output sample

◮ separable convolution: M1 + M2 operations per output sample!
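The operation counts for the two approaches can be checked on a small example. The following is a minimal sketch in plain Python, not the course's own code: the function names `conv2d` and `conv2d_separable` are hypothetical, images are stored as lists of rows, and the full (untruncated) output of the convolution is returned.

```python
def conv2d(x, h):
    """Direct 2D convolution of two finite-support signals (full output).

    Every input sample is combined with all M1 * M2 kernel taps.
    """
    rows, cols = len(x), len(x[0])
    m1, m2 = len(h), len(h[0])
    y = [[0] * (cols + m2 - 1) for _ in range(rows + m1 - 1)]
    for i in range(rows):
        for j in range(cols):
            for a in range(m1):
                for b in range(m2):
                    y[i + a][j + b] += x[i][j] * h[a][b]
    return y


def conv2d_separable(x, h1, h2):
    """Convolution with the separable kernel h[a][b] = h1[a] * h2[b].

    Two 1D passes: about M1 + M2 multiplications per output sample.
    """
    # First pass: convolve every row with h2.
    tmp = []
    for row in x:
        out = [0] * (len(row) + len(h2) - 1)
        for j, v in enumerate(row):
            for b, c in enumerate(h2):
                out[j + b] += v * c
        tmp.append(out)
    # Second pass: convolve every column of the intermediate result with h1.
    y = [[0] * len(tmp[0]) for _ in range(len(tmp) + len(h1) - 1)]
    for i, row in enumerate(tmp):
        for a, c in enumerate(h1):
            for j, v in enumerate(row):
                y[i + a][j] += v * c
    return y


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h1, h2 = [1, 2, 1], [-1, 0, 1]
h = [[a * b for b in h2] for a in h1]   # the equivalent 3 x 3 kernel
same = conv2d(x, h) == conv2d_separable(x, h1, h2)
```

With integer data the two results agree exactly; the separable version simply reorders the sums as in the factorization of the convolution for h[n1, n2] = h1[n1] h2[n2].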

END OF MODULE 8.1

Digital Sign

Module 8.2: Ima

Overview:


◮ Affine transforms

◮ Bilinear interpolation

Affine transforms

A mapping R² → R² that reshapes the coordinate system:

$$\begin{bmatrix} t'_1 \\ t'_2 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} - \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}$$

$$\begin{bmatrix} t'_1 \\ t'_2 \end{bmatrix} = \mathbf{A} \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} - \mathbf{d}$$

Translation

$$\mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \mathbf{I}, \qquad \mathbf{d} = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}$$

Scaling

$$\mathbf{A} = \begin{bmatrix} a_1 & 0 \\ 0 & a_2 \end{bmatrix}, \qquad \mathbf{d} = \mathbf{0}$$

If a1 = a2 the aspect ratio is preserved.

Rotation

$$\mathbf{A} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \qquad \mathbf{d} = \mathbf{0}$$

Flips

Horizontal:

$$\mathbf{A} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \mathbf{d} = \mathbf{0}$$

Vertical:

$$\mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \qquad \mathbf{d} = \mathbf{0}$$

Shear

Horizontal:

$$\mathbf{A} = \begin{bmatrix} 1 & s \\ 0 & 1 \end{bmatrix}, \qquad \mathbf{d} = \mathbf{0}$$

Vertical:

$$\mathbf{A} = \begin{bmatrix} 1 & 0 \\ s & 1 \end{bmatrix}, \qquad \mathbf{d} = \mathbf{0}$$

Affine transforms in discrete-space

$$\begin{bmatrix} t'_1 \\ t'_2 \end{bmatrix} = \mathbf{A} \begin{bmatrix} n_1 \\ n_2 \end{bmatrix} - \mathbf{d} \in \mathbb{R}^2 \neq \mathbb{Z}^2$$

Solution for images

◮ apply the inverse transform:

$$\begin{bmatrix} t_1 \\ t_2 \end{bmatrix} = \mathbf{A}^{-1} \begin{bmatrix} m_1 + d_1 \\ m_2 + d_2 \end{bmatrix}$$

◮ interpolate from the original grid points to the “mid-point”:

$$(t_1, t_2) = (\eta_1 + \tau_1, \eta_2 + \tau_2), \qquad \eta_{1,2} \in \mathbb{Z}, \quad 0 \le \tau_{1,2} < 1$$

Bilinear Interpolation

[Figure: the point (t1, t2) inside the grid cell with corners x[η1, η2] and x[η1 + 1, η2 + 1], at offsets τ1 and τ2 from the corner (η1, η2)]

If we use a first-order interpolator:

$$y[m_1, m_2] = (1-\tau_1)(1-\tau_2)\,x[\eta_1, \eta_2] + \tau_1(1-\tau_2)\,x[\eta_1+1, \eta_2] + (1-\tau_1)\tau_2\,x[\eta_1, \eta_2+1] + \tau_1\tau_2\,x[\eta_1+1, \eta_2+1]$$
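The two steps (inverse transform, then first-order interpolation) can be sketched as follows. This is a hedged illustration in plain Python, with several assumptions that are not in the slides: the names `bilinear` and `affine_warp` are hypothetical, images are lists of rows indexed as `x[n2][n1]`, pixels outside the image are taken to be zero, and the output has the same size as the input.

```python
import math


def bilinear(x, t1, t2):
    """First-order (bilinear) interpolation of image x at the real point (t1, t2)."""
    eta1, eta2 = math.floor(t1), math.floor(t2)   # integer grid corner
    tau1, tau2 = t1 - eta1, t2 - eta2             # fractional offsets in [0, 1)

    def px(n1, n2):
        # Assumption: samples outside the image support are zero.
        if 0 <= n2 < len(x) and 0 <= n1 < len(x[0]):
            return x[n2][n1]
        return 0.0

    return ((1 - tau1) * (1 - tau2) * px(eta1, eta2)
            + tau1 * (1 - tau2) * px(eta1 + 1, eta2)
            + (1 - tau1) * tau2 * px(eta1, eta2 + 1)
            + tau1 * tau2 * px(eta1 + 1, eta2 + 1))


def affine_warp(x, A, d):
    """Output pixel [m1, m2] is x interpolated at A^{-1} (m + d)."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    inv = ((a22 / det, -a12 / det), (-a21 / det, a11 / det))
    height, width = len(x), len(x[0])
    y = [[0.0] * width for _ in range(height)]
    for m2 in range(height):
        for m1 in range(width):
            u, v = m1 + d[0], m2 + d[1]
            t1 = inv[0][0] * u + inv[0][1] * v
            t2 = inv[1][0] * u + inv[1][1] * v
            y[m2][m1] = bilinear(x, t1, t2)
    return y


image = [[0, 10], [20, 30]]
center = bilinear(image, 0.5, 0.5)                            # average of the 4 pixels
identity = affine_warp(image, ((1, 0), (0, 1)), (0, 0))       # leaves the image unchanged
shifted = affine_warp(image, ((1, 0), (0, 1)), (1, 0))        # translation by one pixel
```

Working backwards from each output pixel avoids holes in the result: every output location gets exactly one interpolated value, whatever the transform.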

Shearing

END OF MODULE 8.2

Digital Sign

Module 8.3: F

Overview:


◮ DFT

◮ Magnitude and phase


2D DFT

$$X[k_1, k_2] = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} x[n_1, n_2]\, e^{-j\frac{2\pi}{N_1} n_1 k_1}\, e^{-j\frac{2\pi}{N_2} n_2 k_2}$$

$$x[n_1, n_2] = \frac{1}{N_1 N_2} \sum_{k_1=0}^{N_1-1} \sum_{k_2=0}^{N_2-1} X[k_1, k_2]\, e^{j\frac{2\pi}{N_1} n_1 k_1}\, e^{j\frac{2\pi}{N_2} n_2 k_2}$$

2D-DFT Basis Vectors

There are N1 N2 orthogonal basis vectors for an N1 × N2 image:

$$w_{k_1,k_2}[n_1, n_2] = e^{j\frac{2\pi}{N_1} n_1 k_1}\, e^{j\frac{2\pi}{N_2} n_2 k_2}$$

for n1, k1 = 0, 1, ..., N1 − 1 and n2, k2 = 0, 1, ..., N2 − 1

2D-DFT basis vectors (real part)

[Figures: real part of 2D-DFT basis vectors of a 256 × 256 image (axes from 0 to 255) for several index pairs, starting with k1 = 1, k2 = 0]

2D DFT

2D-DFT basis functions are separable, and so is the 2D-DFT:

$$X[k_1, k_2] = \sum_{n_1=0}^{N_1-1} \left[ \sum_{n_2=0}^{N_2-1} x[n_1, n_2]\, e^{-j\frac{2\pi}{N_2} n_2 k_2} \right] e^{-j\frac{2\pi}{N_1} n_1 k_1}$$

◮ 1D-DFT along n2 (the columns)

◮ 1D-DFT along n1 (the rows)

2D DFT in matrix form

◮ a finite-support 2D signal can be written as a matrix x

◮ an N1 × N2 image is an N2 × N1 matrix (n1 spans the columns, n2 spans the rows)

◮ recall also the N × N DFT matrix (Module 4.2):

$$\mathbf{W}_N = \begin{bmatrix} 1 & 1 & 1 & 1 & \cdots & 1 \\ 1 & W_N^{1} & W_N^{2} & W_N^{3} & \cdots & W_N^{N-1} \\ 1 & W_N^{2} & W_N^{4} & W_N^{6} & \cdots & W_N^{2(N-1)} \\ \vdots & & & & & \vdots \\ 1 & W_N^{N-1} & W_N^{2(N-1)} & W_N^{3(N-1)} & \cdots & W_N^{(N-1)^2} \end{bmatrix}$$


$$\mathbf{V} = \mathbf{W}_{N_2}\,\mathbf{x}, \qquad \mathbf{V} \in \mathbb{C}^{N_2 \times N_1}$$

$$\mathbf{X} = \mathbf{V}\,\mathbf{W}_{N_1}, \qquad \mathbf{X} \in \mathbb{C}^{N_2 \times N_1}$$

$$\mathbf{X} = \mathbf{W}_{N_2}\,\mathbf{x}\,\mathbf{W}_{N_1}$$
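As a sanity check on the matrix form, the product of the two DFT matrices with the image can be compared against the double sum of the 2D-DFT definition. The sketch below uses only the Python standard library; the helper names (`dft_matrix`, `matmul`, `dft2_matrix`, `dft2_direct`) are hypothetical, and the image is stored as an N2 × N1 list of rows, so the result is indexed as `X[k2][k1]`.

```python
import cmath


def dft_matrix(n):
    """N x N DFT matrix with entries W_N^(row*col), W_N = exp(-j 2 pi / N)."""
    return [[cmath.exp(-2j * cmath.pi * r * c / n) for c in range(n)]
            for r in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


def dft2_matrix(x):
    """2D-DFT as W_{N2} x W_{N1} for an N2 x N1 matrix x."""
    n2, n1 = len(x), len(x[0])
    v = matmul(dft_matrix(n2), x)        # 1D DFT of every column
    return matmul(v, dft_matrix(n1))     # 1D DFT of every row


def dft2_direct(x):
    """Direct evaluation of the double sum, for comparison only."""
    n2, n1 = len(x), len(x[0])
    return [[sum(x[b][a]
                 * cmath.exp(-2j * cmath.pi * a * k1 / n1)
                 * cmath.exp(-2j * cmath.pi * b * k2 / n2)
                 for b in range(n2) for a in range(n1))
             for k1 in range(n1)] for k2 in range(n2)]


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # N2 = 2 rows, N1 = 3 columns
X_matrix = dft2_matrix(x)
X_direct = dft2_direct(x)
max_error = max(abs(p - q)
                for rp, rq in zip(X_matrix, X_direct)
                for p, q in zip(rp, rq))
```

The two results agree up to floating-point rounding, and `X_matrix[0][0]` equals the sum of all pixel values.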

What does a 2D-DFT look like?

◮ try to show the magnitude as an image

◮ problem: the range is too big for the grayscale range of paper or screen

◮ try to normalize: |X′[k1, k2]| = |X[k1, k2]| / max |X[k1, k2]|

◮ but it doesn’t work...

DFT coefficients sorted by magnitude

[Figure: DFT coefficient magnitudes in sorted order, on a logarithmic vertical axis from 10^1 to 10^7; the horizontal axis runs from 0 to 512^2]

Dealing with HDR images

If the image is high dynamic range we need to compress the levels:

◮ remove flagrant outliers (e.g. $X[0, 0] = \sum_{n_1} \sum_{n_2} x[n_1, n_2]$)

◮ use a nonlinear mapping: e.g. $y = x^{1/3}$ after normalization (x ≤ 1)

What does a 2D-DFT look like?

[Figure: two grayscale image panels, axes from 0 to 510]

DFT magnitude doesn’t carry much information


DFT phase, on the other hand...

Image frequency analysis

◮ most of the information is contained in the image’s edges

◮ edges are points of abrupt change in the signal’s values

◮ edges are a “space-domain” feature → not captured by the DFT’s magnitude

◮ phase alignment is important for reproducing edges

END OF MODULE 8.3

Digital Sign

Mod

Overview:

◮ Filters for image processing

◮ Classification

◮ Examples

Analogies with 1D filters

◮ linearity

◮ space invariance

◮ impulse response

◮ frequency response

◮ stability

◮ 2D CCDE

The problem with LSI operators

◮ interesting images contain lots of semantics: different information in diff

◮ space-invariant filters process everything in the same way

◮ but we should process things differently

• edges

• gradients

• textures

• ...

Filter types


◮ IIR, FIR

◮ causal or noncausal

◮ highpass, lowpass, ...

• lowpass → image smoothing

• highpass → enhancement, edge detection


The problems with 2D IIRs

◮ nonlinear phase (edges!)

◮ border effects

◮ stability: the fundamental theorem of algebra doesn’t hold in multiple dimensions

◮ computability

A noncomputable CCDE

$$y[n_1, n_2] = a_0\,y[n_1+1, n_2] + a_1\,y[n_1, n_2-1] + a_2\,y[n_1-1, n_2] + a_3\,y[n_1, n_2+1] + \ldots$$

[Figure: the output samples y[0, 0], y[1, 0], y[0, 1], y[−1, 0] and y[0, −1] on a grid from −2 to 2 in both directions]

Practical FIR filters

◮ generally zero-centered (causality not an issue) ⇒ odd number of taps in both directions

◮ per-sample complexity:

• M1 M2 for nonseparable impulse responses

• M1 + M2 for separable impulse responses

◮ obviously always stable

Moving Average

$$y[n_1, n_2] = \frac{1}{(2N+1)^2} \sum_{k_1=-N}^{N} \sum_{k_2=-N}^{N} x[n_1 - k_1, n_2 - k_2]$$

$$h[n_1, n_2] = \frac{1}{(2N+1)^2}\, \mathrm{rect}\!\left(\frac{n_1}{2N}, \frac{n_2}{2N}\right)$$

$$h[n_1, n_2] = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

[Figure: two panels: 11 × 11 MA and 51 × 51 MA]
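A direct implementation of the moving average makes the role of the (2N + 1)² normalization and of the image border visible. This is a minimal sketch, not the course's code: the name `moving_average` is hypothetical, and samples outside the image are assumed to be zero, which darkens the border (one of the "border effects" any finite-support image has to deal with).

```python
def moving_average(x, n):
    """(2N+1) x (2N+1) moving average of image x (list of rows).

    Assumption: samples outside the image support count as zero.
    """
    height, width = len(x), len(x[0])
    norm = (2 * n + 1) ** 2
    y = [[0.0] * width for _ in range(height)]
    for n2 in range(height):
        for n1 in range(width):
            acc = 0.0
            for k2 in range(-n, n + 1):
                for k1 in range(-n, n + 1):
                    i, j = n2 - k2, n1 - k1
                    if 0 <= i < height and 0 <= j < width:
                        acc += x[i][j]
            y[n2][n1] = acc / norm
    return y


flat = [[9] * 5 for _ in range(5)]   # constant 5 x 5 image
smoothed = moving_average(flat, 1)   # 3 x 3 moving average
```

In the interior the constant image is unchanged; at a corner only 4 of the 9 taps fall inside the image, so the output drops to 4/9 of the input value.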

Gaussian Blur

$$h[n_1, n_2] = \frac{1}{2\pi\sigma^2}\, e^{-\frac{n_1^2 + n_2^2}{2\sigma^2}}, \qquad |n_1, n_2| \ldots$$

[Figure: plot of the Gaussian impulse response over the n1 and n2 axes]

[Figure: two panels: σ = 1.8, 11 × 11 blur and σ = 8.7, 51 × 51 blur]

Sobel filter

approximation of the first derivative in the horizontal direction:

$$s_o[n_1, n_2] = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$$

separability and structure:

$$s_o[n_1, n_2] = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} -1 & 0 & 1 \end{bmatrix}$$

approximation of the first derivative in the vertical direction:

$$s_v[n_1, n_2] = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$$

[Figure: two panels: horizontal Sobel filter and vertical Sobel filter]

Sobel operator

approximation for the square magnitude of the gradient:

$$|\nabla x[n_1, n_2]|^2 = |s_o[n_1, n_2] * x[n_1, n_2]|^2 + |s_v[n_1, n_2] * x[n_1, n_2]|^2$$

(“operator” because it’s nonlinear)

Gradient approximation for edge detection

[Figure: two panels: Sobel operator and thresholded Sobel operator]
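The Sobel operator and its thresholded version can be sketched in a few lines. The code below is an illustration under stated assumptions rather than a reference implementation: the names `filter3x3`, `sobel_operator` and `threshold` are hypothetical, images are lists of rows, pixels outside the image are treated as zero, and the threshold value is arbitrary.

```python
S_H = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]     # horizontal derivative
S_V = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]     # vertical derivative


def filter3x3(x, h):
    """Convolve image x with a zero-centered 3 x 3 kernel h (zero outside x)."""
    height, width = len(x), len(x[0])
    y = [[0] * width for _ in range(height)]
    for n2 in range(height):
        for n1 in range(width):
            acc = 0
            for k2 in (-1, 0, 1):
                for k1 in (-1, 0, 1):
                    i, j = n2 - k2, n1 - k1
                    if 0 <= i < height and 0 <= j < width:
                        acc += h[k2 + 1][k1 + 1] * x[i][j]
            y[n2][n1] = acc
    return y


def sobel_operator(x):
    """Squared gradient magnitude: sum of the two squared filter outputs."""
    gh, gv = filter3x3(x, S_H), filter3x3(x, S_V)
    return [[a * a + b * b for a, b in zip(row_h, row_v)]
            for row_h, row_v in zip(gh, gv)]


def threshold(g, level):
    """Binary edge map: 1 where the squared gradient exceeds the level."""
    return [[1 if v > level else 0 for v in row] for row in g]


step = [[0, 0, 1, 1] for _ in range(4)]        # vertical edge between columns 1 and 2
grad2 = sobel_operator(step)
edges = threshold(grad2, 8)                    # hypothetical threshold value
```

Each of the two filters is linear, but squaring and summing their outputs is not, which is why the result is called an operator rather than a filter.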

Laplacian operator

Laplacian of a function in continuous-space:

$$\Delta f(t_1, t_2) = \frac{\partial^2 f}{\partial t_1^2} + \frac{\partial^2 f}{\partial t_2^2}$$

approximating the Laplacian; start with a Taylor expansion

$$f(t + \tau) = \sum_{n=0}^{\infty} \frac{f^{(n)}(t)}{n!}\, \tau^n$$

and compute the expansion in (t + τ) and (t − τ), truncated to second order:

$$f(t + \tau) \approx f(t) + f'(t)\tau + \frac{1}{2} f''(t)\tau^2$$

$$f(t - \tau) \approx f(t) - f'(t)\tau + \frac{1}{2} f''(t)\tau^2$$

by rearranging terms:

$$f''(t) \approx \frac{1}{\tau^2} \left( f(t - \tau) - 2f(t) + f(t + \tau) \right)$$

which, on the discrete grid, is the FIR h[n] = [1  −2  1]

Laplacian

summing the horizontal and vertical components:

$$h[n_1, n_2] = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$

If we use the diagonals too:

$$h[n_1, n_2] = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

Laplacian for Edge Detection

[Figure: two panels: Laplacian operator and thresholded Laplacian operator]

END OF MODULE 8.4

Digital Sign

Module 8.5: Im

Overview:

◮ Redundancy in natural images

◮ Image coding ingredients

A thought experiment

◮ consider all possible 256 × 256, 8bpp images

◮ each image is 524,288 bits

◮ total number of possible images: $2^{524{,}288} \approx 10^{157{,}826}$

◮ number of atoms in the universe: $10^{82}$

How many bits per image?

Another thought experiment

◮ take all images in the world and list them in an “encyclopedia of images”

◮ to indicate an image, simply give its number

◮ on the Internet: M = 50 billion

◮ raw encoding: 524,288 bits per image

◮ enumeration-based encoding: log2 M ≈ 36 bits per image

◮ (of course, side information is HUGE)
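The numbers in the two thought experiments are easy to reproduce. The short Python sketch below only redoes the arithmetic; the variable names are illustrative.

```python
import math

# Raw size of a 256 x 256 image at 8 bits per pixel.
bits_per_image = 256 * 256 * 8

# 2**bits_per_image written as a power of ten: 2^b = 10^(b * log10(2)).
exponent10 = bits_per_image * math.log10(2)

# Enumeration-based encoding: bits needed to index M = 50 billion images.
M = 50_000_000_000
index_bits = math.ceil(math.log2(M))
```

Here `bits_per_image` is 524,288, `exponent10` is a little above 157,826, and `index_bits` is 36, matching the figures quoted in the slides.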

Compression

Another approach:

◮ exploit “physical” redundancy

◮ allocate bits for things that matter (e.g. edges)

◮ use psychovisual experiments to find out what matters

◮ some information is discarded: lossy compression

Key ingredients

◮ compressing at block level

◮ using a suitable transform (i.e., a change of basis)

◮ smart quantization

◮ entropy coding

Compressing at pixel level

◮ reduce the number of bits per pixel

◮ equivalent to coarser quantization

◮ in the limit, 1bpp

Compressing at block level

◮ divide the image in blocks

◮ code the average value with 8 bits

◮ 3 × 3 blocks at 8 bits per block gives less than 1bpp

◮ exploit the local spatial correlation

◮ compress remote regions independently

Transform coding

A simple example:

◮ take a DT signal, assume R bits per sample

◮ storing the signal requires NR bits

◮ now you take the DFT and it looks like this

◮ in theory, we can just code the two nonzero DFT coefficients!

[Figure: plot of the DFT of the signal; horizontal axis from 0 to 20]

Transform coding

Ideally, we would like a transform that:

◮ captures the important features of an image block in a few coefficients

◮ is efficient to compute

◮ answer: the Discrete Cosine Transform

2D-DCT

$$C[k_1, k_2] = \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1} x[n_1, n_2] \cos\!\left[\frac{\pi}{N}\left(n_1 + \frac{1}{2}\right) k_1\right] \cos\!\left[\frac{\pi}{N}\left(n_2 + \frac{1}{2}\right) k_2\right]$$

DCT basis vectors for an 8 × 8 image

[Figure: the 64 DCT basis vectors of an 8 × 8 block]
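A direct evaluation of the 2D-DCT sum shows why it suits smooth blocks: a constant block has a single nonzero coefficient. The sketch below is a plain-Python transcription of the unnormalized double sum; the name `dct2` is hypothetical, the block is a list of lists indexed as `x[n1][n2]`, and no fast algorithm is attempted.

```python
import math


def dct2(x):
    """Unnormalized 2D-DCT of an N x N block, evaluated as a double sum.

    Returns C indexed as C[k1][k2].
    """
    n = len(x)
    return [[sum(x[n1][n2]
                 * math.cos(math.pi / n * (n1 + 0.5) * k1)
                 * math.cos(math.pi / n * (n2 + 0.5) * k2)
                 for n1 in range(n) for n2 in range(n))
             for k2 in range(n)] for k1 in range(n)]


flat_block = [[10] * 8 for _ in range(8)]      # constant 8 x 8 block
C = dct2(flat_block)
largest_ac = max(abs(C[k1][k2])
                 for k1 in range(8) for k2 in range(8)
                 if (k1, k2) != (0, 0))
```

For the constant block, `C[0][0]` collects the whole sum (640) and every other coefficient is zero up to rounding error, so a single number describes the block.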

Smart quantization

◮ deadzone

◮ variable step (fine to coarse)

Quantization

Standard quantization:

x̂ = floor(x) + 0.5

[Figure: staircase input-output characteristic of the standard quantizer, with a 2-bit code for each output level]

Deadzone quantization:

x̂ = round(x)

[Figure: staircase input-output characteristic of the deadzone quantizer, with a 2-bit code for each output level]
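The difference between the two quantizers shows up for small inputs. The sketch below implements both rules with a unit step; the function names are hypothetical, and the deadzone version rounds half away from zero explicitly because Python's built-in `round()` sends exact ties to the nearest even integer.

```python
import math


def standard_quantizer(x):
    """x_hat = floor(x) + 0.5: output levels at ..., -0.5, 0.5, 1.5, ... (never zero)."""
    return math.floor(x) + 0.5


def deadzone_quantizer(x):
    """x_hat = round(x): every input with |x| < 0.5 is mapped to exactly zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


small_inputs = [-0.4, -0.1, 0.0, 0.2, 0.45]
standard_out = [standard_quantizer(v) for v in small_inputs]
deadzone_out = [deadzone_quantizer(v) for v in small_inputs]
```

With the standard rule the small inputs become ±0.5, so they still cost bits; with the deadzone rule they all collapse to zero, which is what later produces long runs of zeros.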

Entropy coding

◮ minimize the effort to encode a certain amount of information

◮ associate short symbols to frequent values and vice-versa

◮ if it sounds familiar it’s because it is...

END OF MODULE 8.5

Digital Sign

Module 8.6: The JPEG Comp

Key ingredients

◮ split image into 8 × 8 non-overlapping blocks

◮ compute the DCT of each block

◮ quantize DCT coefficients according to psychovisually-tuned tables

◮ run-length encoding and Huffman coding

DCT coefficients of image blocks (detail)

[Figure: DCT coefficients of image blocks, shown in detail]

Smart quantization

◮ most coefficients are negligible → captured by the deadzone

◮ some coefficients are more important than others

◮ find out the critical coefficients by experimentation

◮ allocate more bits (or, equivalently, finer quantization levels) to the most important coefficients

Psychovisually-tuned quantization table

$$\hat{c}[k_1, k_2] = \mathrm{round}(c[k_1, k_2] / Q[k_1, k_2])$$

$$Q = \begin{bmatrix}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{bmatrix}$$
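Applying the table is an element-wise division followed by rounding. The sketch below is illustrative only: the function names `quantize` and `dequantize` and the sample coefficient block are hypothetical, the block is indexed as `c[k1][k2]`, and Python's `round()` (which sends exact ties to the nearest even integer) stands in for the rounding rule.

```python
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(c):
    """c_hat[k1][k2] = round(c[k1][k2] / Q[k1][k2])."""
    return [[round(c[i][j] / Q[i][j]) for j in range(8)] for i in range(8)]


def dequantize(c_hat):
    """Decoder side: multiply each quantized value by its step size."""
    return [[c_hat[i][j] * Q[i][j] for j in range(8)] for i in range(8)]


# Hypothetical DCT coefficients: two large low-frequency values, one small
# high-frequency value that falls into the deadzone.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = 1600.0, -660.0, 40.0
c_hat = quantize(coeffs)
restored = dequantize(c_hat)
```

The step sizes grow toward the bottom-right corner of the table, so a high-frequency coefficient of 40 is rounded to zero while low-frequency coefficients keep fine resolution.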

Advantages of nonuniform bit allocation

[Figure: the image coded with uniform and with tuned bit allocation, full view and detail]

Efficient coding

◮ most coefficients are small, decreasing with index

◮ use zigzag scan to maximize ordering

◮ quantization will create long series of zeros

Zigzag scan

[Figure: zigzag scan order over the 8 × 8 block]

Example

$$\begin{bmatrix}
100 & -60 & 0 & 6 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
13 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

Zigzag scan:

100, −60, 0, 0, 0, 0, 6, 0, 0, 0, 13, 0, 0, 0, 0, 0, 0, 0, 0, −1, followed by 44 zeros

Runlength encoding

Each nonzero value is encoded as the triple

[(r, s), c]

◮ r is the runlength, i.e. the number of zeros before the current value

◮ s is the size, i.e. the number of bits needed to encode the value

◮ c is the actual value

◮ (0, 0) indicates that from now on it’s only zeros (end of block)

Example

$$\begin{bmatrix}
100 & -60 & 0 & 6 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
13 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

[(0, 7), 100], [(0, 6), −60], [(4, 3), 6], [(3, 4), 13], [(8, 1), −1], [(0, 0)]
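The zigzag scan and the run-length triples of the example can be reproduced with a short script. This is a simplified sketch, not the JPEG standard's procedure: the names `zigzag_indices` and `run_length_encode` are hypothetical, all 64 coefficients are treated alike, runs of any length are allowed, and the end-of-block marker (0, 0) is always appended.

```python
def zigzag_indices(n=8):
    """Visit an n x n block along anti-diagonals, alternating direction."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_length_encode(seq):
    """Encode each nonzero value c as ((r, s), c); finish with (0, 0)."""
    out, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            size = abs(v).bit_length()      # bits needed for the magnitude
            out.append(((run, size), v))
            run = 0
    out.append((0, 0))                      # end of block: only zeros remain
    return out


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[0][3] = 100, -60, 6
block[4][0], block[4][1] = 13, -1

scan = [block[i][j] for i, j in zigzag_indices()]
triples = run_length_encode(scan)
```

For this block, `scan` starts with 100, −60, four zeros, 6, three zeros, 13, eight zeros and −1, and `triples` reproduces the sequence of (runlength, size) pairs listed in the example.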

The runlength-size pairs

◮ by design, (r, s) ∈ A with |A| = 256

◮ in theory, 8 bits per pair

◮ some pairs are much more common than others!

◮ a lot of space can be saved by being smart

Variable-length encoding

great idea: shorter binary sequences for common symbols

however: if symbols have different lengths, we must know how to parse the bitstream

◮ in English, spaces separate words → extra symbol (wasteful)

◮ in Morse code, pauses separate letters and words (wasteful)

◮ can we do away with separators?

Prefix-free codes

◮ no valid sequence can be the beginning of another valid sequence

◮ can parse a bitstream sequentially with no look-ahead

◮ extremely easy to understand graphically...

Prefix-free code

[Figure: binary code tree. From the root, branch 0 leads to A; branch 1 leads to an inner node whose branch 1 leads to B and whose branch 0 leads to a second inner node, with branch 0 → C and branch 1 → D]

Bitstream: 001100110101100

Decoding step by step: A, AA, AAB, ...
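Sequential parsing with no look-ahead takes only a few lines. The sketch below assumes the codebook read off the tree (A = 0, B = 11, C = 100, D = 101), which is an interpretation of the figure and not stated explicitly in the slides; the name `decode` is hypothetical.

```python
# Codebook assumed from the tree: A = 0, B = 11, C = 100, D = 101.
CODE = {"0": "A", "11": "B", "100": "C", "101": "D"}


def decode(bits, code=CODE):
    """Parse a bitstream one bit at a time; emit a symbol as soon as a codeword matches.

    This works without separators because no codeword is a prefix of another.
    """
    symbols, word = [], ""
    for b in bits:
        word += b
        if word in code:
            symbols.append(code[word])
            word = ""
    if word:
        raise ValueError("bitstream ends in the middle of a codeword")
    return "".join(symbols)


message = decode("001100110101100")
```

Under the assumed codebook the 15-bit stream decodes to AABAABADC, whose first three symbols match the A, AA, AAB steps shown on the slides.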
