Download - CNN FOR LICENSE PLATE MOTION DEBLURRING (ihradis/CNN-Deblur/ICIP_poster.pdf)

CNN FOR LICENSE PLATE MOTION DEBLURRING
Pavel Svoboda, Michal Hradiš, Lukáš Maršík, Pavel Zemčík
Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic

This research is supported by the ARTEMIS joint undertaking under grant agreement no. 621439 (ALMARVI). This work was supported by The Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project “IT4Innovations National Supercomputing Center – LM2015070”.

[Figure: Enhancement CNN. Image (c=3) → Conv.+ReLU layers (c=128; layer labels 19 x 19 and 3 x 3 x 128) → Image.]

1. Image restoration

It is relatively easy to generate artificially degraded images.

3. Architecture

The same as for text deblurring (Hradiš et al.; 2015):

• 15 convolution+ReLU layers (the last is linear)
• Inputs and outputs are RGB images
• Outputs MAP estimates of the original images
• No padding; crops 25 px borders
• 2M weights (9 MB), 2 Mflops per pixel
• Processes a 1 Mpx image in ~4 s on a GTX 780
• One LP in ~0.2 s
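The 25 px border and the 66x66 to 16x16 training crop sizes are consistent with unpadded convolutions, where each k x k filter removes k - 1 pixels per dimension. A minimal Python sketch of that arithmetic, assuming the filter sizes shown in the poster's CNN-L15 layer diagram (one 19 x 19, six 1 x 1, two 3 x 3, four 5 x 5, two 7 x 7); the function names are illustrative and not part of the authors' code:

```python
# Filter sizes read from the CNN-L15 layer diagram (assumption: 15 layers in total).
# The order of the layers does not affect the total shrinkage.
FILTER_SIZES = [19] + [1] * 6 + [3] * 2 + [5] * 4 + [7] * 2


def total_shrink(filter_sizes):
    """Pixels lost per dimension by a stack of unpadded ('valid') convolutions."""
    return sum(k - 1 for k in filter_sizes)


def border_px(filter_sizes):
    """Border cropped on each side of the image."""
    return total_shrink(filter_sizes) // 2


def output_size(input_size, filter_sizes):
    """Side length of the output for a square input of side input_size."""
    return input_size - total_shrink(filter_sizes)
```

Under this assumption, `border_px(FILTER_SIZES)` evaluates to 25 and `output_size(66, FILTER_SIZES)` to 16, which matches the figures quoted on the poster.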

Degradation: linear blur and noise.

Inverse problem: MAP estimate? Not convex, and the image priors are too simple.

How to handle saturation, non-uniform blur, compression, non-linearities, and non-Gaussian noise?

Our approach (Who needs k?): a CNN trained on artificial data for a specific viewpoint and image content.
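The degradation model and the inverse problem named above are commonly written as follows. This is a sketch in standard notation, not taken from the poster: the symbols x (sharp image), y (observed image), n (noise), and F_θ (the network) are assumptions, and k is taken to be the blur kernel that the heading refers to.

```latex
% Degradation: linear (convolutional) blur plus additive noise
y = x * k + n

% Blind deconvolution as a MAP estimate over both the image and the kernel
(\hat{x}, \hat{k}) = \arg\max_{x,\,k} \; p(y \mid x, k)\, p(x)\, p(k)

% Direct approach: a network F_\theta maps y to an estimate of x without
% estimating k, trained on synthetic pairs (x_i, y_i) with an L2 loss
\hat{x} = F_\theta(y), \qquad
\theta^{*} = \arg\min_{\theta} \sum_i \lVert F_\theta(y_i) - x_i \rVert_2^2
```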

4. Training

• Stochastic gradient descent with momentum
• L2 loss as objective function
• Initialization with uniform distribution
• 400K iterations
• Minibatch of 54 crops, 66x66 (output 16x16)
• 3 days on GTX 980
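The update rule behind the first three bullets can be illustrated on a toy problem. The sketch below fits a three-weight linear model, not a CNN, using minibatch SGD with momentum, an L2 loss, and uniform initialization. The learning rate, momentum value, initialization range, iteration count, and all function names are assumptions chosen for illustration; the poster does not state the authors' hyperparameters beyond the bullets above.

```python
import random


def make_toy_samples(n=500, seed=1):
    """Synthetic (input, target) pairs from a known linear model (toy data)."""
    rng = random.Random(seed)
    w_true = [0.5, -1.0, 2.0]
    samples = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        samples.append((x, sum(a * b for a, b in zip(w_true, x))))
    return samples, w_true


def train_sgd_momentum(samples, lr=0.05, momentum=0.9,
                       batch_size=54, iterations=400, seed=0):
    """Minimize the mean squared (L2) error with minibatch SGD and momentum."""
    rng = random.Random(seed)
    dim = len(samples[0][0])
    w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]  # uniform init (range assumed)
    v = [0.0] * dim                                   # momentum buffer
    for _ in range(iterations):
        batch = [rng.choice(samples) for _ in range(batch_size)]
        grad = [0.0] * dim
        for x, y in batch:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(dim):
                grad[j] += 2.0 * err * x[j] / batch_size  # gradient of mean squared error
        v = [momentum * vj - lr * gj for vj, gj in zip(v, grad)]
        w = [wj + vj for wj, vj in zip(w, v)]
    return w
```

Calling `train_sgd_momentum(make_toy_samples()[0])` should return weights close to the toy model's true weights; the same update applies per parameter when the model is a convolutional network.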

2. Training data

We generate data for a specific camera: blur lengths and orientations, license plate orientations and scales.

It is easy to include complex degradations, including non-linear sharpening filters and compression.
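A minimal sketch of how one artificially degraded training input could be produced from a sharp image, covering only linear motion blur and additive Gaussian noise (not the sharpening or compression mentioned above). The kernel rasterization, the edge-replicating border handling, the noise level, and all names are assumptions for illustration, not the authors' data generator.

```python
import math
import random


def motion_kernel(length, angle_deg):
    """Rasterize a centered line segment into a normalized square blur kernel."""
    size = int(math.ceil(length)) | 1              # odd side length that fits the segment
    c = size // 2
    kernel = [[0.0] * size for _ in range(size)]
    theta = math.radians(angle_deg)
    n = 4 * size                                   # dense sampling along the segment
    for i in range(n):
        t = (i / (n - 1) - 0.5) * (length - 1)     # position along the segment, in pixels
        col = int(round(c + t * math.cos(theta)))
        row = int(round(c - t * math.sin(theta)))  # image rows grow downwards
        kernel[row][col] += 1.0
    total = sum(map(sum, kernel))
    return [[v / total for v in r] for r in kernel]


def degrade(sharp, length, angle_deg, noise_sigma, rng):
    """Blur a grayscale image (rows of floats in [0, 1]) and add Gaussian noise."""
    kernel = motion_kernel(length, angle_deg)
    r = len(kernel) // 2
    h, w = len(sharp), len(sharp[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                for kx, kv in enumerate(krow):
                    if kv:
                        # Clamp coordinates so borders replicate the edge pixel.
                        sy = min(max(y + r - ky, 0), h - 1)
                        sx = min(max(x + r - kx, 0), w - 1)
                        acc += kv * sharp[sy][sx]
            acc += rng.gauss(0.0, noise_sigma)
            row.append(min(max(acc, 0.0), 1.0))    # keep values in the valid range
        out.append(row)
    return out
```

For a camera-specific training set, the blur angle could be drawn from that camera's direction range, for example `rng.uniform(37, 57)`, with the blur length drawn from the range of lengths expected for the exposure time.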

[Figure: example license plate images; row labels: Original, Blind L0, CNN-L15.]

[Figure: CNN-L15 layer diagram. Filter sizes shown: 19 x 19 (one layer), 1 x 1 (six), 3 x 3 (two), 5 x 5 (four), 7 x 7 (two). Channel counts shown, in extracted order: 128, 320, 320, 320, 128, 128, 512, 256, 64, 128, 128, 128, 128, 128, 1.]

[Figure: Blur length. PSNR [dB] vs. test data length [pixel], 3 to 19; legend: 5, 9, 13, 17.]

Two static cameras. Blur directions 37°-57° and 59°-79°.

Results

Motion blur range (length-direction)

Evaluation on real images

• 721 images, longer exposure (6-12 ms)
• Manually annotated LP characters
• Evaluating OCR accuracy
• Existing LP-specific OCR engine
• Baseline: blind L0-regularized deconvolution (Pan et al.; 2014)
• Blur orientation range 50°

LP crops 264x128 px
140K sharp training/validation images

[Figure: Blur direction. PSNR [dB] vs. test data direction [degree], 0 to 90; legend: 10, 20, 40, 60, 90, 130, 180.]

[Figure: example image pairs, Input and CNN.]

[Figure: Real images OCR. Accuracy (0.6 to 1) vs. trained models range [pixel], 9 to 23; curves: CNN-L15, Blind L0, Original, Blind L0 per license plate.]

• Error improved from 37% to 9%
• L0 deconvolution improved the character error only to 23%

[Figure: OCR accuracy (0.6 to 1) vs. JPEG compression quality, 5 to 95.]

[Figure: example image pairs, Input and CNN, at JPEG quality Q 5, Q 25, and Q 95.]

Networks were trained and tested for different blur lengths

Networks were trained and tested for different blur orientation ranges
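The blur length and blur direction plots report reconstruction quality as PSNR in dB. For reference, a minimal sketch of the usual PSNR definition, assuming grayscale images stored as rows of floats in [0, 1]. The poster does not state how its PSNR values were computed (for example, over which channels or image region), so the function below is an illustration only.

```python
import math


def psnr(reference, restored, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    sq_errors = [
        (a - b) ** 2
        for ref_row, res_row in zip(reference, restored)
        for a, b in zip(ref_row, res_row)
    ]
    mse = sum(sq_errors) / len(sq_errors)   # mean squared error over all pixels
    if mse == 0:
        return math.inf                     # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A uniform error of 0.1 on a [0, 1] scale corresponds to a mean squared error of 0.01 and therefore to roughly 20 dB.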

Robustness to JPEG compression

We tested a network trained for motion blur on images with additional JPEG compression to assess robustness to additional degradations. The network was able to maintain OCR accuracy down to quality 25.

Download from www.fit.vutbr.cz/~ihradis/CNN-Deblur

References

• Michal Hradiš, Jan Kotera, Pavel Zemčík, and Filip Šroubek, “Convolutional neural networks for direct text deblurring,” in BMVC, 2015.
• Jinshan Pan, Zhe Hu, Zhixun Su, and Ming-Hsuan Yang, “Deblurring Text Images via L0-Regularized Intensity and Gradient Prior,” in CVPR, 2014.

We explore direct blind deconvolution with convolutional neural networks on images from a real-life traffic surveillance system, where the blur kernels are partially constrained. The neural networks trained on artificially generated data provide superior reconstruction quality compared to traditional blind deconvolution methods. Custom CNNs can be easily trained for other specific applications or camera configurations (image content and blur sizes and types).

[Figure: training scheme. Sharp → Blur+noise → Blurred → Enhancement CNN (Image, c=3 → Conv.+ReLU layers, c=128 → Image); Train.]