

Recent progress in computational photography using deep learning

Greg Slabaugh

1 Oct 2020

Note: This presentation contains a slide with flashing imagery.

AI, AI, and more AI


What is AI?

Merriam-Webster

“A branch of computer science dealing with the simulation of intelligent behavior in computers.”

English Oxford Living Dictionary

“The theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.”

AI… coming to (or already in) a device near you

My journey in AI

Siemens Corporate Research

Medicsight

City, University of London

Huawei

Strong vs weak AI

Strong AI

• Consciousness

• Ability to make judgements, plan, communicate; self-awareness

• Also known as Artificial General Intelligence (AGI)

Weak AI

• Focuses on a specific task

• No self-awareness

The AI taxonomy (according to Greg)

AI
  Strong
  Weak
    Machine Learning
      Deep
        CNN
        Other (e.g. DBN)
      Traditional
    Other

Learning paradigms for machine learning: Supervised, Unsupervised, Reinforcement


What is machine learning?

• Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed.

Models that are learned from data


Supervised learning
• Labelled data
• Learn a mapping between inputs and outputs
• Example: face detection

Unsupervised learning
• No labels
• Computer groups similar data to discover hidden patterns
• Example: “People who bought X also bought Y”

Reinforcement learning
• Dynamic environment
• Computer gets feedback and learns to “win”
• Example: ML playing Atari 2600 games

Neural networks


Going deep

AlexNet (2012)

• AlexNet, a type of Convolutional Neural Network (CNN), won the 2012 ImageNet challenge by a large margin (15.4% error, compared to 26.2% for the runner-up). This precipitated a swell of interest in deep learning techniques. AlexNet learns how to represent images using abstracted features extracted from learned filters.

Deep learning is a class of machine learning algorithms that use multiple layers of nonlinear processing for feature extraction and transformation. Each successive layer uses the output from the previous layer as input.

In deep learning, features are learned, rather than engineered. This is also known as representation learning, as the network learns representations of the data customised to the task.

Figure: traditional machine learning vs. deep learning (representation learning) pipelines.

Key components

1. Convolution. This filters an image. The weights for the filter are learned.

2. ReLU. This applies a non-linear transformation to the data. This way, the CNN can find a non-linear mapping between the inputs and outputs.

3. Pooling. This combines adjacent pixels in a filtered output. This results in abstraction: the CNN learns more “high level” features (e.g. a face, instead of edges).
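To make these three operations concrete, here is a minimal Keras sketch (my own illustration, not from the talk; the layer sizes are arbitrary):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(16, (3, 3), input_shape=(28, 28, 1)),  # 1. learned filter weights
    layers.Activation('relu'),                           # 2. non-linear transformation
    layers.MaxPooling2D(pool_size=(2, 2)),               # 3. combine adjacent pixels
])
model.summary()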

Common operations

4. Dense (fully connected) layers. These layers connect all inputs to all outputs through weights. In doing so, they lose spatial information but look at the data holistically.

5. Skip connections. Using a skip connection, data (feature maps) are passed over parts of a network. This helps in back-propagating gradients.

6. Batch normalisation. Batch normalisation applies normalisation at hidden layers. It takes the output of the previous layer, subtracts the batch mean, and divides by the batch standard deviation. Denormalisation is then applied using learned weights.

7. Down/upsampling. Downsampling reduces, and upsampling increases, the resolution of a feature map or image.
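A minimal Keras sketch of operations 5 and 6 together (my own illustration; the filter counts are arbitrary):

from tensorflow import keras
from tensorflow.keras import layers

def residual_block(x, filters=64):
    # Conv -> batch norm -> ReLU, with a skip connection around the block.
    shortcut = x                                    # feature maps passed over the block
    y = layers.Conv2D(filters, (3, 3), padding='same')(x)
    y = layers.BatchNormalization()(y)              # subtract batch mean, divide by batch std,
                                                    # then apply learned scale and shift
    y = layers.Activation('relu')(y)
    return layers.Add()([shortcut, y])              # skip connection aids gradient flow

inputs = keras.Input(shape=(32, 32, 64))
model = keras.Model(inputs, residual_block(inputs))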

In computer vision, one typically sees convolutional neural networks (CNNs) applied to images. Convolution is well suited to take advantage of the spatially correlated data common to images. One may also see recurrent architectures for temporal data (e.g. videos).

A deep network can be characterised by:

• The architecture, which describes the layers of processing that transform inputs to outputs. A CNN that outputs a label is a classifier; one that outputs a continuous variable is a regressor.

• The loss, which is a mathematical representation of the error produced by the network. During training, weights are adjusted by back-propagating gradients through the network to minimise the loss.

• The training, including the optimisation strategy and the data used.

UNet

Characterising a CNN

Deep learning frameworks

Making it easy…

# Import libraries and modules
import numpy as np
np.random.seed(123)  # for reproducibility

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import mnist

# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Preprocess input data (channels-last: 28x28 images, 1 channel, scaled to [0, 1])
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32') / 255
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32') / 255

# Preprocess class labels (one-hot encoding)
Y_train = to_categorical(y_train, 10)
Y_test = to_categorical(y_test, 10)

# Define model architecture
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit model on training data
model.fit(X_train, Y_train, batch_size=32, epochs=10, verbose=1)

# Evaluate model on test data
score = model.evaluate(X_test, Y_test, verbose=0)

Why is deep learning so… trendy?

Recently there has been a surge of research and commercial interest in deep learning, driven by:

1. Large datasets (e.g. ImageNet)

2. New algorithms, toolkits (e.g. TensorFlow, PyTorch) and available code (GitHub)

3. Graphics Processing Units (GPUs)

NVIDIA GeForce RTX 2080 Ti with 4352 CUDA cores

CNNs in computer vision

Semantic segmentation

Object detection

Image classification

Super-resolution

Pose estimation

Image restoration

Computational photography


• Computational photography uses digital computation, rather than optical processes, in the capture and processing of images.

• This can be done to:
  ◦ Improve image quality
  ◦ Reduce cost
  ◦ Reduce the size of camera elements

• More broadly, one can also consider image processing effects.

Why is this interesting?

https://www.dxomark.com/smartphones-vs-cameras-closing-the-gap-on-image-quality/

Google Pixel 3 vs. Sony a7R III

We’re taking a lot of photos…

https://focus.mylio.com/tech-today/how-many-photos-will-be-taken-in-2020

Huawei P40 Pro+, features


A traditional ISP contains a large number of stages of image processing algorithms to transform the raw data acquired by the image sensor into a high-quality JPG image. An example simplified traditional ISP is shown below. An ISP is normally implemented in a specialised ASIC.

Can one use Deep Learning in the ISP?

Traditional ISP pipeline (optics / sensors in hardware produce the RAW input):

RAW → Black level correction → Raw noise reduction → Auto white balance → Demosaicing → Camera colour matrix → Dynamic range compression → Gamma correction → Tone mapping → RGB denoising → Sharpening → Devignette → JPG
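To illustrate how such a pipeline composes, here is a toy Python sketch (my own illustration; the stage functions are simplified placeholders, not the ASIC algorithms):

import numpy as np

def black_level_correction(raw, black_level=64.0):
    # Subtract the sensor's black level offset.
    return np.clip(raw - black_level, 0.0, None)

def gamma_correction(img, gamma=2.2):
    # Encode linear intensities for display.
    return np.power(img / img.max(), 1.0 / gamma)

def run_isp(raw, stages):
    # Each successive stage uses the output of the previous stage as input.
    for stage in stages:
        raw = stage(raw)
    return raw

raw = np.random.randint(0, 1024, size=(8, 8)).astype(np.float32)  # toy 10-bit raw data
out = run_isp(raw, [black_level_correction, gamma_correction])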

Automatic White Balance (AWB)


• AWB, or colour constancy, applies a colour correction to an image, to make the image appear as if it were taken under an achromatic light source.

• This is achieved by estimating the illumination in the scene and then compensating for it, as in the sketch below.
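A classic (non-deep) baseline illustrates the estimate-then-compensate idea: under the grey-world assumption, the mean colour of the scene is taken as the illuminant estimate. A minimal NumPy sketch (my own illustration, not from the talk):

import numpy as np

def grey_world_awb(img):
    # Estimate the illuminant as the per-channel mean (grey-world assumption),
    # then compensate by scaling each channel towards neutral grey.
    img = img.astype(np.float32)
    illuminant = img.reshape(-1, 3).mean(axis=0)   # estimated RGB illuminant
    gains = illuminant.mean() / illuminant         # per-channel correction gains
    return np.clip(img * gains, 0, 255).astype(np.uint8)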

Deep learning for AWB

• Regression problem: given an uncorrected image, estimate (and apply) the colour correction.

• Bianco et al., “Color Constancy Using CNNs,” CVPRW 2015.

Ill-posed problem

• The problem is ill-posed: given a single image where the scene and the illumination are unknown, multiple solutions are possible.

• Who remembers “The dress” from 2015?

Multi-hypothesis approach


1. Create a set of N candidate illuminants

2. Correct the image with each candidate, forming N hypothesized corrected images

3. Classify each corrected image on how well it is white balanced – producing a weight for each image

4. Produce a weighted average solution

5. Apply correction
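A schematic Python sketch of these five steps (my own illustration with hypothetical interfaces; the actual method is described in reference [4]):

import numpy as np

def multi_hypothesis_awb(img, candidate_illuminants, classifier):
    # `classifier` is assumed to score how well white balanced an image looks.
    img = img.astype(np.float32)
    weights = []
    for L in candidate_illuminants:             # 1. N candidate illuminants
        hypothesis = img / L                    # 2. correct with each candidate
        weights.append(classifier(hypothesis))  # 3. weight per corrected image
    weights = np.array(weights) / np.sum(weights)
    candidates = np.array(candidate_illuminants)
    L_hat = np.sum(weights[:, None] * candidates, axis=0)  # 4. weighted average
    return img / L_hat                          # 5. apply the final correction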


Results

Cube dataset: 1707 images captured with a Canon 550D camera.

Advantages / disadvantages of this approach


Advantages

• The classifier solves a camera-agnostic question (how well white balanced is the image?)

• Scene illuminants can be combined across cameras

• Can apply the method in a training-free way to new cameras

• State-of-the-art performance

Disadvantages

• Requires inference N times. However, the images can be very small.

• Assumes a single illuminant. Future work: handle multi-illuminant case.

Moire patterns


• Moire patterns occur when two patterns interfere with each other

• Aliasing results from high frequencies masquerading as low frequencies

• Moire patterns are sensitive to movement!

https://en.wikipedia.org/wiki/File:Moir%C3%A9.gif

https://steemit.com/art/@ztwin/moire-gifs-

Moire in digital photography


• In digital photography, Moire patterns degrade image quality.

• Why does this happen? A camera sensor samples incoming light on a set of pixels. Frequencies above the Nyquist limit cannot be captured properly by the sensor, resulting in aliasing (see the numerical sketch below).
  ◦ In scenario 1, the subpixel layout of the LCD elements produces uncapturable frequencies
  ◦ In scenario 2, the scene itself contains very high frequencies

Scenario 1: Photography of digital screens
Scenario 2: Photography of high frequency patterns
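A one-dimensional numerical sketch of aliasing (my own illustration, not from the talk): a signal above the Nyquist limit is indistinguishable, once sampled, from a much lower frequency.

import numpy as np

fs = 10.0                     # sampling rate of the "sensor" (samples per unit time)
f_signal = 9.0                # signal frequency, above the Nyquist limit fs/2 = 5
t = np.arange(0, 2, 1 / fs)   # sample positions
samples = np.sin(2 * np.pi * f_signal * t)
alias = np.sin(2 * np.pi * (f_signal - fs) * t)   # alias at 9 - 10 = -1 (a 1 Hz tone)
print(np.allclose(samples, alias))                # True: the samples cannot tell them apart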

Demoire


• The demoire problem seeks to remove the Moire corruption.

• This is challenging, as Moire patterns have a widely varying appearance, including different frequency components.

Figure: wavelet decomposition differences.

WDNet

• Wavelet DemoireNet (WDNet) is a CNN that transforms an image to the wavelet domain, where it is processed using two branches:
  ◦ The dense branch is based on DenseNet and models fine details
  ◦ The dilation branch uses dilated convolutions to look at the data more coarsely

DenseNet? Dilated convolution?

• DenseNet is composed of dense blocks.
• Layers are densely connected through residuals.
• Each layer receives as input all previous outputs.

• Dilated convolution skips points at some rate.
• This increases the receptive field.
• The output looks at the data more globally.
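A rough Keras sketch of both ideas (my own illustration; this is not the WDNet architecture itself, and the sizes are arbitrary):

from tensorflow import keras
from tensorflow.keras import layers

# Dilated convolution: dilation_rate=2 skips every other input point,
# enlarging the receptive field without adding parameters.
dilated = layers.Conv2D(32, (3, 3), dilation_rate=2, padding='same')

def dense_block(x, growth=16, num_layers=3):
    # DenseNet-style block: each layer receives all previous outputs as input.
    features = [x]
    for _ in range(num_layers):
        inp = layers.Concatenate()(features) if len(features) > 1 else features[0]
        features.append(layers.Conv2D(growth, (3, 3), padding='same',
                                      activation='relu')(inp))
    return layers.Concatenate()(features)

inputs = keras.Input(shape=(64, 64, 16))
model = keras.Model(inputs, dense_block(dilated(inputs)))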

Results


Ablation study: Importance of wavelet processing


Image enhancement using curve layers


• Photoshop / Lightroom allow users to adjust global image properties through the use of curves.

Can we build a neural network to do this automatically?

Example: adjusting brightness

CURL


• We recently introduced neural CURve Layers (CURL), which learn and apply curve adjustments to an image (see the sketch after this list). CURL has the following features:
  ◦ Curves are piecewise linear
  ◦ Curves can flexibly map different image attributes (brightness, saturation, colour)
  ◦ Different colour spaces (RGB, HSV, LAB) are supported
  ◦ Fully differentiable and trained end-to-end
  ◦ Predicted curves are intuitive and can be user adjusted
  ◦ State-of-the-art performance
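A minimal sketch of applying a piecewise-linear curve to one image attribute (my own illustration; the knot values here are hand-picked, whereas CURL predicts them with a network):

import numpy as np

def apply_curve(channel, knots):
    # `knots` are the curve's values at evenly spaced control points in [0, 1];
    # np.interp linearly interpolates between them (piecewise linear).
    xs = np.linspace(0.0, 1.0, len(knots))
    return np.interp(channel, xs, knots)

img = np.random.rand(4, 4)                                # toy channel in [0, 1]
brighter = apply_curve(img, knots=[0.0, 0.4, 0.75, 1.0])  # lifts the mid-tones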

CURL methodology


• Architecture

• Loss

Results


Deep learning limitations

• Typically requires large datasets

• Methods described in this talk also require labelled data

• Algorithms are complex

• Slow to train (but fast at test time)

• Difficult to interpret results (Explainable AI)

• Black-box

• Biologically inspired, but don’t capture the biological mechanisms of the brain

• Limited theoretical understanding

• Many hyper-parameters to tune

Deep learning and AI

Hype, or hope?

References


1. U-Net: Convolutional Networks for Biomedical Image Segmentation, Olaf Ronneberger, Philipp Fischer, Thomas Brox, MICCAI 2015.

2. Densely Connected Convolutional Networks, Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger, CVPR 2017.

3. Multi-Scale Context Aggregation by Dilated Convolutions, Fisher Yu, Vladlen Koltun, ICLR 2016.

4. A Multi-Hypothesis Approach to Color Constancy, Daniel Hernandez-Juarez, Sarah Parisot, Benjamin Busam, Ales Leonardis, Gregory Slabaugh, Steven McDonagh, CVPR 2020.

5. Wavelet-Based Dual-Branch Neural Network for Image Demoireing, Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian, ECCV 2020.

6. CURL: Neural Curve Layers for Global Image Enhancement, Sean Moran, Steven McDonagh, Greg Slabaugh, submitted to ICPR 2020.

Contact: g.slabaugh@qmul.ac.uk