Recent progress in computational photography using deep learning Greg Slabaugh 1 Oct 2020 Note: This presentation contains a slide with flashing imagery.


Page 1: Recent progress in computational photography using deep ...


Recent progress in computational photography using deep learning

Greg Slabaugh

1 Oct 2020

Note: This presentation contains a slide with flashing imagery.

Page 2: Recent progress in computational photography using deep ...

AI, AI, and more AI

Page 3: Recent progress in computational photography using deep ...


Page 4: Recent progress in computational photography using deep ...

What is AI?

Merriam-Webster

“A branch of computer science dealing with the simulation of intelligent behavior in computers.”

English Oxford Living Dictionary

“The theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.”

Page 5: Recent progress in computational photography using deep ...

AI… coming to (or already in) a device near you

Page 6: Recent progress in computational photography using deep ...

My journey in AI

Siemens Corporate Research

Medicsight

City, University of London

Huawei

Page 7: Recent progress in computational photography using deep ...

Strong vs weak AI

Strong AI

• Consciousness

• Ability to make judgements, plan, communicate, self-awareness

• Also known as Artificial General Intelligence (AGI)

Weak AI

• Focuses on a specific task

• No self-awareness

Page 8: Recent progress in computational photography using deep ...

The AI taxonomy (according to Greg)

AI

Weak

Strong

Machine Learning

Other

Deep

Traditional

CNN

Other (DBN)

Supervised

Unsupervised

Reinforcement


Page 10: Recent progress in computational photography using deep ...

What is machine learning?

• Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed.

Models that are learned from data

Page 11: Recent progress in computational photography using deep ...

Learning

Supervised

• Labelled data

• Learn a mapping between inputs and outputs

• Example: face detection

Unsupervised

• No labels

• Computer groups similar data to discover hidden patterns

• Example: “People who bought X also bought Y”

Reinforcement

• Dynamic environment

• Computer gets feedback and learns to “win”

• Example: ML playing Atari 2600 games

Page 12: Recent progress in computational photography using deep ...

Neural networks

Page 13: Recent progress in computational photography using deep ...

Learning

Page 14: Recent progress in computational photography using deep ...

Going deep

Page 15: Recent progress in computational photography using deep ...

AlexNet (2012)

• AlexNet, a type of Convolutional Neural Network (CNN), won the ImageNet challenge by a large margin (15.3% top-5 error, compared to 26.2% for the runner-up). This precipitated a swell of interest in Deep Learning techniques. AlexNet learns how to represent images using abstracted features extracted from learned filters.

Page 16: Recent progress in computational photography using deep ...

Deep learning is a class of machine learning algorithms that use multiple layers of nonlinear processing for feature extraction and transformation. Each successive layer uses the output from the previous layer as input.

In deep learning, features are learned, rather than engineered. This is also known as representation learning, as the network learns representations of the data customised to the task.

Traditional machine learning

Deep learning

Representation learning

Page 17: Recent progress in computational photography using deep ...

Key components

1. Convolution. This filters an image. The weights for the filter are learned.

2. ReLU. This applies a non-linear transformation to the data. This way, the CNN can find a non-linear mapping between the inputs and outputs.

3. Pooling. This combines adjacent pixels in a filtered output (for example by taking their maximum), which reduces its resolution. This results in abstraction. The CNN learns more “high level” features (e.g. face, instead of edges).
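The three components above can be sketched in a few lines of NumPy. This is only an illustration: the function names are made up for this example, the edge filter is set by hand rather than learned, and real frameworks use far more efficient implementations.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Non-linearity: negative responses are set to zero."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    x = x[:h, :w]
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Toy image: dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-set vertical edge filter; in a CNN these weights would be learned.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

filtered = conv2d(image, kernel)            # 4 x 4 response map
features = max_pool2d(relu(filtered))       # 2 x 2 pooled feature map
```

Pooling halves the resolution of the filtered output, which is why deeper layers end up looking at larger regions of the original image.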

Page 18: Recent progress in computational photography using deep ...

Common operations

4. Dense (fully connected) layers. These layers connect all inputs to all outputs through weights. In doing so, they lose spatial information but look at the data holistically.

5. Skip connections. Using a skip connection, data (feature maps) bypass parts of a network. This helps in back-propagating gradients.

6. Batch normalisation. Batch normalisation applies normalisation at hidden layers. It takes the output of the previous layer, subtracts the batch mean and divides by the batch standard deviation. Denormalisation (a scale and shift) is then applied using learned weights.

7. Down/Upsampling. This decreases or increases the resolution of a feature map or image.
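As a rough sketch of the batch normalisation step described above (training-time behaviour only, without the running statistics that frameworks keep for inference), assuming activations stored as a batch-by-features array; `gamma` and `beta` stand for the learned scale and shift and are fixed by hand here:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalise each feature over the batch, then apply the learned scale and shift."""
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta               # "denormalisation" with learned weights

rng = np.random.default_rng(0)
activations = rng.normal(loc=5.0, scale=3.0, size=(64, 4))   # batch of 64, 4 features
normalised = batch_norm(activations, gamma=np.ones(4), beta=np.zeros(4))
```

With `gamma` set to one and `beta` to zero the output is simply the standardised activations; during training the network is free to learn other values.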

Page 19: Recent progress in computational photography using deep ...

In computer vision, one typically sees convolutional neural networks (CNNs) applied to images. Convolution is well suited to take advantage of spatially correlated data common to images. One may see recurrent architectures for temporal data (e.g. videos).

A deep network can be characterised by:

• The architecture, which describes the layers of processing that transform inputs to outputs. A CNN that outputs a label is a classifier, and one that outputs a continuous variable is a regressor.

• The loss, which is a mathematical representation of the error produced by the network. During training, weights are adjusted by back-propagating gradients through the network to minimise the loss.

• The training, including the optimisation strategy and data used.

UNet

Characterising a CNN
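To make the roles of the loss and the training procedure on this slide concrete, here is a deliberately tiny sketch: a one-weight linear regressor instead of a CNN, a mean squared error loss, and plain gradient descent with hand-derived gradients. The data, learning rate and step count are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 3.0 * x + 0.5                        # toy targets the model should recover

w, b = 0.0, 0.0                          # the "architecture": pred = w * x + b
learning_rate = 0.3
losses = []

for step in range(200):
    pred = w * x + b
    err = pred - y
    loss = np.mean(err ** 2)             # the loss: mean squared error
    losses.append(loss)
    grad_w = 2.0 * np.mean(err * x)      # gradient of the loss w.r.t. w
    grad_b = 2.0 * np.mean(err)          # gradient of the loss w.r.t. b
    w -= learning_rate * grad_w          # the training: step against the gradient
    b -= learning_rate * grad_b
```

In a deep network the gradients are obtained by back-propagation rather than by hand, but the update rule has the same shape.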

Page 20: Recent progress in computational photography using deep ...

Deep learning frameworks

Page 21: Recent progress in computational photography using deep ...

Making it easy…

# Import libraries and modules (Keras 2+ API)
import numpy as np
np.random.seed(123)  # for reproducibility

from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.datasets import mnist

# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Preprocess input data: add a channel axis (channels-last) and scale to [0, 1]
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32') / 255
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32') / 255

# Preprocess class labels: one-hot encode the 10 digit classes
Y_train = to_categorical(y_train, 10)
Y_test = to_categorical(y_test, 10)

# Define model architecture
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit model on training data
model.fit(X_train, Y_train, batch_size=32, epochs=10, verbose=1)

# Evaluate model on test data; returns [loss, accuracy]
score = model.evaluate(X_test, Y_test, verbose=0)

Page 22: Recent progress in computational photography using deep ...

Why is deep learning so… trendy?

Recently there has been a surge of (research, commercial) interest in Deep Learning

1. Large datasets (e.g. ImageNet)

2. New algorithms, toolkits (e.g. TensorFlow, PyTorch) and available code (GitHub)

3. Graphics Processing Units (GPUs)

NVIDIA GeForce RTX 2080 Ti with 4352 CUDA cores

Page 23: Recent progress in computational photography using deep ...

CNNs in computer vision

Semantic segmentation

Object detection

Image classification

Super-resolution

Pose estimation

Image restoration

Page 24: Recent progress in computational photography using deep ...

Computational photography

• Computational photography uses digital computation, rather than optical processes, in the capture and processing of images.

• This can be done to

◦ Improve image quality

◦ Reduce cost

◦ Reduce size of camera elements

• More broadly, one can also consider image processing effects

Page 25: Recent progress in computational photography using deep ...

Why is this interesting?

https://www.dxomark.com/smartphones-vs-cameras-closing-the-gap-on-image-quality/

Google Pixel 3

Sony a7R III

Page 26: Recent progress in computational photography using deep ...

We’re taking a lot of photos…

https://focus.mylio.com/tech-today/how-many-photos-will-be-taken-in-2020

Page 27: Recent progress in computational photography using deep ...

Huawei P40 Pro+ features

Page 28: Recent progress in computational photography using deep ...

Can one use Deep Learning in the ISP?

A traditional ISP contains a large number of stages of image processing algorithms to transform the raw data acquired by the image sensor into a high quality JPG image. An example simplified traditional ISP is shown below. An ISP is normally implemented in a specialized ASIC.

Traditional ISP pipeline

Hardware: Optics / Sensors

RAW

• Black level correction

• Raw noise reduction

• Auto white balance

• Demosaicing

• Camera color matrix

• Dynamic range compression

• Gamma correction

• Tone mapping

• RGB denoising

• Sharpening

• Devignette

JPG
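Two of the stages listed on this slide are simple enough to sketch directly. The snippet below is only an illustration of black level correction and gamma correction on made-up 10-bit sensor values; the black level, white level and gamma of 2.2 are assumptions, and a real ISP runs many more stages in dedicated hardware.

```python
import numpy as np

def black_level_correction(raw, black_level, white_level):
    """Subtract the sensor's black offset and rescale to the range [0, 1]."""
    return np.clip((raw - black_level) / (white_level - black_level), 0.0, 1.0)

def gamma_correction(linear, gamma=2.2):
    """Encode linear intensities with a power law so dark tones get more code values."""
    return np.power(linear, 1.0 / gamma)

# Hypothetical 10-bit raw values (not taken from any real sensor).
raw = np.array([[64.0, 512.0], [1023.0, 300.0]])

linear = black_level_correction(raw, black_level=64.0, white_level=1023.0)
encoded = gamma_correction(linear)
```

Each stage is a small, fixed function of the pixel values, which is part of why replacing or augmenting stages with learned models is an appealing question.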

Page 29: Recent progress in computational photography using deep ...

Automatic White Balance (AWB)

• AWB, or colour constancy, applies a colour correction to an image, to make the image appear as if it were taken under an achromatic light source.

• This is achieved by estimating the illumination in the scene, and then compensating for it.
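The estimate-then-compensate idea can be illustrated with the classical gray-world heuristic, which is a stand-in chosen for this sketch and not one of the learned methods discussed in this talk. It assumes the average colour of a scene is grey, so the mean RGB of the image is taken as the illuminant colour.

```python
import numpy as np

def estimate_illuminant_gray_world(image):
    """Estimate the illuminant colour as the mean RGB of the image (gray-world assumption)."""
    illuminant = image.reshape(-1, 3).mean(axis=0)
    return illuminant / np.linalg.norm(illuminant)

def apply_white_balance(image, illuminant):
    """Compensate with per-channel gains, normalised so the green gain is 1."""
    gains = illuminant[1] / illuminant
    return image * gains

# Synthetic test: a grey scene lit by a warm (reddish) light.
rng = np.random.default_rng(0)
grey_scene = rng.uniform(0.2, 0.8, size=(4, 4, 1)) * np.ones(3)
cast = np.array([1.0, 0.8, 0.6])            # assumed illuminant colour
observed = grey_scene * cast

estimate = estimate_illuminant_gray_world(observed)
corrected = apply_white_balance(observed, estimate)
```

On this synthetic scene the correction restores equal red, green and blue values; on real images the gray-world assumption often fails, which motivates learned estimators.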

Page 30: Recent progress in computational photography using deep ...

Deep learning for AWB

• Regression problem: given an uncorrected image, estimate (and apply) the colour correction

• Bianco et al., “Color Constancy Using CNNs,” CVPRW 2015

Page 31: Recent progress in computational photography using deep ...

Ill-posed problem

• The problem is ill-posed. Given a single image where the scene and the illumination are unknown, multiple solutions are possible.

• Who remembers “The dress” from 2015?

Page 32: Recent progress in computational photography using deep ...

Multi-hypothesis approach


1. Create a set of N candidate illuminants

2. Correct the image with each candidate, forming N hypothesized corrected images

3. Classify each corrected image on how well it is white balanced – producing a weight for each image

4. Produce a weighted average solution

5. Apply correction
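The five steps above can be sketched as follows. This is a schematic under stated assumptions, not the published implementation: `score_white_balance` is a hypothetical hand-written stand-in for the learned classifier, the candidate illuminants are made up, and the softmax with a `temperature` parameter is one plausible way to turn scores into weights.

```python
import numpy as np

def correct(image, illuminant):
    """Step 2: correct the image for one candidate illuminant (green gain fixed to 1)."""
    return image * (illuminant[1] / illuminant)

def score_white_balance(image):
    """Hypothetical stand-in for the learned classifier (step 3):
    higher when the mean colour of the image is closer to grey."""
    mean_rgb = image.reshape(-1, 3).mean(axis=0)
    chroma = mean_rgb / mean_rgb.sum()
    return -np.sum((chroma - 1.0 / 3.0) ** 2)

def multi_hypothesis_awb(image, candidates, temperature=1e-3):
    scores = np.array([score_white_balance(correct(image, c)) for c in candidates])
    logits = scores / temperature
    weights = np.exp(logits - logits.max())   # softmax over the N hypotheses
    weights /= weights.sum()
    estimate = weights @ candidates           # step 4: weighted average illuminant
    return correct(image, estimate), estimate, weights   # step 5: apply correction

# Step 1: a made-up set of N = 3 candidate illuminants.
candidates = np.array([[1.0, 1.0, 1.0],
                       [1.0, 0.8, 0.6],
                       [0.6, 0.8, 1.0]])

rng = np.random.default_rng(0)
grey_scene = rng.uniform(0.2, 0.8, size=(4, 4, 1)) * np.ones(3)
observed = grey_scene * candidates[1]         # scene lit by the second candidate

balanced, estimate, weights = multi_hypothesis_awb(observed, candidates)
```

Because the scoring question ("is this image well white balanced?") does not depend on the camera, only the candidate set needs to change when moving to a new sensor.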


Page 38: Recent progress in computational photography using deep ...

Results

Cube+ dataset: 1707 images captured with a Canon 550D camera

Page 39: Recent progress in computational photography using deep ...

Advantages / disadvantages of this approach

Advantages

• Classifier solves a camera-agnostic question (how well white balanced is the image?)

• Scene illuminants can be combined across cameras

• Can apply the method in a training-free way to new cameras

• State-of-the-art performance

Disadvantages

• Requires inference N times. However, the images can be very small.

• Assumes a single illuminant. Future work: handle multi-illuminant case.

Page 40: Recent progress in computational photography using deep ...

Moire patterns


• Moire patterns occur when two patterns interfere with each other

• Aliasing results from high frequencies masquerading as low frequencies

• Moire patterns are sensitive to movement!

https://en.wikipedia.org/wiki/File:Moir%C3%A9.gif

https://steemit.com/art/@ztwin/moire-gifs-

Page 41: Recent progress in computational photography using deep ...

Moire in digital photography

• In digital photography, Moire patterns degrade image quality.

• Why does this happen? A camera sensor samples incoming light on a set of pixels. Frequencies above the Nyquist limit cannot be captured properly by the sensor, resulting in aliasing.

◦ In scenario 1, the subpixel layout of the LCD elements produces uncapturable frequencies

◦ In scenario 2, the scene itself contains very high frequencies

Scenario 1: Photography of digital screens

Scenario 2: Photography of high frequency patterns
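A one-dimensional example shows what "frequencies above the Nyquist limit" means in practice. The numbers are arbitrary: a sensor taking 10 samples per unit length has a Nyquist limit of 5 cycles per unit length, so a pattern at 7 cycles is indistinguishable, after sampling, from one at 3 cycles.

```python
import numpy as np

sampling_rate = 10.0                  # samples per unit length
nyquist_limit = sampling_rate / 2.0   # 5 cycles per unit length

n = np.arange(50)
positions = n / sampling_rate

high = np.cos(2 * np.pi * 7.0 * positions)   # 7 cycles: above the Nyquist limit
low = np.cos(2 * np.pi * 3.0 * positions)    # 3 cycles: its alias, |7 - 10| = 3

# Dominant frequency of the sampled high-frequency signal, from its spectrum.
spectrum = np.abs(np.fft.rfft(high))
dominant = np.argmax(spectrum) * sampling_rate / len(n)
```

The sampled values of the two signals are identical, so the fine pattern shows up as a coarser one that was never in the scene. In two dimensions this is the Moire pattern.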

Page 42: Recent progress in computational photography using deep ...

Demoire

• The demoire problem seeks to remove the Moire corruption.

• This is challenging as Moire patterns have a widely varying appearance including different frequency components.

Wavelet decomposition: differences

Page 43: Recent progress in computational photography using deep ...

WDNet

• Wavelet DemoireNet (WDNet) is a CNN that transforms an image to the wavelet domain where it is processed using two branches:

◦ Dense branch is based on DenseNet and models fine details

◦ Dilation branch uses dilated convolutions to look at the data more coarsely
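To give a feel for what "transforming an image to the wavelet domain" involves, here is a single-level 2-D Haar transform written in NumPy. The Haar wavelet and the single level are assumptions made for this sketch; the exact wavelet configuration used by WDNet is described in the paper. The transform splits an image into a half-resolution approximation and three bands of differences, and it is exactly invertible.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D Haar transform of an image with even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]     # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]     # bottom-left, bottom-right
    approx = (a + b + c + d) / 2.0          # local average (low frequencies)
    diff_cols = (a - b + c - d) / 2.0       # differences between columns
    diff_rows = (a + b - c - d) / 2.0       # differences between rows
    diff_diag = (a - b - c + d) / 2.0       # diagonal differences
    return approx, diff_cols, diff_rows, diff_diag

def haar_idwt2(approx, diff_cols, diff_rows, diff_diag):
    """Inverse of haar_dwt2: rebuild the full-resolution image."""
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (approx + diff_cols + diff_rows + diff_diag) / 2.0
    x[0::2, 1::2] = (approx - diff_cols + diff_rows - diff_diag) / 2.0
    x[1::2, 0::2] = (approx + diff_cols - diff_rows - diff_diag) / 2.0
    x[1::2, 1::2] = (approx - diff_cols - diff_rows + diff_diag) / 2.0
    return x

rng = np.random.default_rng(0)
image = rng.uniform(size=(8, 8))
bands = haar_dwt2(image)
reconstructed = haar_idwt2(*bands)
flat_bands = haar_dwt2(np.full((8, 8), 0.5))   # a constant image has no detail
```

Because the transform is invertible, a network can process the sub-bands and then map the result back to an image without losing information.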

Page 44: Recent progress in computational photography using deep ...

DenseNet? Dilated convolution?

• DenseNet is composed of dense blocks.

• Layers are densely connected: feature maps are concatenated rather than summed as in residual connections.

• Each layer receives as input the outputs of all previous layers in its block.

• Dilated convolution skips points by some rate.

• This increases receptive field

• The output looks at the data more globally.
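The effect of the dilation rate on the receptive field can be checked with a small 1-D sketch. The helper names are made up for this example, and the receptive field formula assumes stride-1 layers stacked one after another.

```python
import numpy as np

def dilated_conv1d(x, kernel, rate=1):
    """1-D 'valid' convolution (cross-correlation) that skips rate - 1 points between taps."""
    k = len(kernel)
    span = (k - 1) * rate + 1                  # input samples covered by one output
    out = np.zeros(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = np.dot(x[i:i + span:rate], kernel)
    return out

def receptive_field(kernel_size, rates):
    """Receptive field of stacked stride-1 layers with the given dilation rates."""
    field = 1
    for rate in rates:
        field += (kernel_size - 1) * rate
    return field

signal = np.arange(10.0)
kernel = np.ones(3)

dense_out = dilated_conv1d(signal, kernel, rate=1)     # taps at i, i+1, i+2
dilated_out = dilated_conv1d(signal, kernel, rate=2)   # taps at i, i+2, i+4

plain_field = receptive_field(3, [1, 1, 1])            # three ordinary layers
dilated_field = receptive_field(3, [1, 2, 4])          # same depth, growing rates
```

With the same number of weights, three layers with rates 1, 2 and 4 see 15 input samples instead of 7, which is the sense in which the output looks at the data more globally.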

Page 45: Recent progress in computational photography using deep ...

Results


Page 46: Recent progress in computational photography using deep ...

Results


Page 47: Recent progress in computational photography using deep ...

Ablation study: Importance of wavelet processing


Page 48: Recent progress in computational photography using deep ...

Image enhancement using curve layers

• Photoshop / Lightroom allow users to adjust global image properties through the use of curves

Can we build a neural network to do this automatically?

Example: adjusting brightness

Page 49: Recent progress in computational photography using deep ...

CURL

• We recently introduced neural CURve Layers (CURL), which learn and apply curve adjustments to an image. CURL has the following features:

◦ Curves are piecewise linear

◦ Curves can flexibly map different image attributes (brightness, saturation, colour)

◦ Different colour spaces (RGB, HSV, LAB) supported

◦ Fully differentiable and trained end-to-end

◦ Predicted curves are intuitive and can be user adjusted

◦ State-of-the-art performance
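A piecewise linear curve of the kind listed above is easy to illustrate. In this sketch the curve is defined by a handful of evenly spaced knot values and maps an input intensity directly to an output intensity; this simple value-to-value mapping and the hand-picked knots are assumptions for illustration and may differ from the parameterisation used in the CURL paper, where the knots are predicted by a network.

```python
import numpy as np

def apply_curve(channel, knots):
    """Map values in [0, 1] through a piecewise linear curve with evenly spaced knots."""
    knot_positions = np.linspace(0.0, 1.0, len(knots))
    return np.interp(channel, knot_positions, knots)

brightness = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

identity = apply_curve(brightness, knots=[0.0, 0.5, 1.0])   # leaves the image unchanged
brighter = apply_curve(brightness, knots=[0.0, 0.7, 1.0])   # lifts the mid-tones
```

Since the output is a linear function of the knot values, gradients can flow back to whatever predicts the knots, and a user can still inspect or drag the resulting curve.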

Page 50: Recent progress in computational photography using deep ...

CURL methodology


• Architecture

• Loss

Page 51: Recent progress in computational photography using deep ...

Results


Page 52: Recent progress in computational photography using deep ...

Deep learning limitations

• Typically requires large datasets

• Methods described in this talk also require labelled data

• Algorithms are complex

• Slow to train (but fast at test time)

• Difficult to interpret results (Explainable AI)

• Black-box

• Biologically inspired, but don’t capture the biological mechanisms of the brain

• Limited theoretical understanding

• Hyper-parameters

Page 53: Recent progress in computational photography using deep ...

Deep learning and AI

Hype, or hope?

Page 54: Recent progress in computational photography using deep ...

References

1. U-Net: Convolutional Networks for Biomedical Image Segmentation, Olaf Ronneberger, Philipp Fischer, Thomas Brox, MICCAI 2015.

2. Densely Connected Convolutional Networks, Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger, CVPR 2017.

3. Multi-Scale Context Aggregation by Dilated Convolutions, Fisher Yu, Vladlen Koltun, ICLR 2016.

4. A Multi-Hypothesis Approach to Color Constancy, Daniel Hernandez-Juarez, Sarah Parisot, Benjamin Busam, Ales Leonardis, Gregory Slabaugh, Steven McDonagh, CVPR 2020.

5. Wavelet-Based Dual-Branch Neural Network for Image Demoireing, Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian, ECCV 2020.

6. CURL: Neural Curve Layers for Global Image Enhancement, Sean Moran, Steven McDonagh, Greg Slabaugh, submitted to ICPR 2020.

Contact: [email protected]