
Deep Specialized Network for Illuminant Estimation

Wu Shi¹, Chen Change Loy¹,², Xiaoou Tang¹,²

¹The Chinese University of Hong Kong   ²Shenzhen Institute of Advanced Technology, CAS

{sw015, ccloy, xtang}@ie.cuhk.edu.hk

Introduction

Challenges

• The observed color I is given by

  I_c = E_c × R_c,   c ∈ {r, g, b}   (1)

  where E is the illumination and R is the reflectance. Estimating E from I is under-determined; once E is estimated, it can be divided out to restore the image (see the sketch after this list).

• Searching the large hypothesis space for an accurate illuminant estimate is hard due to the ambiguities of unknown reflections and local patch appearances.
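Equation (1) acts independently on each channel, so once an illuminant estimate is available the image can be restored by dividing it out per channel (a von Kries-style diagonal correction). A minimal sketch, assuming linear RGB values and a unit-normalized estimate; the function and variable names are ours, not from the poster:

```python
import numpy as np

def correct_image(image, illuminant, eps=1e-6):
    """Invert Eq. (1): with I_c = E_c * R_c, dividing each channel of the
    H x W x 3 image by the estimated illuminant recovers the reflectance
    up to a global scale."""
    e = np.asarray(illuminant, dtype=np.float64)
    e = e / (np.linalg.norm(e) + eps)   # overall brightness is unrecoverable
    corrected = image / (e + eps)       # broadcasts over the color axis
    return corrected / corrected.max()  # rescale into [0, 1] for display
```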

Contributions

• The proposed network combines two networks that work hand in hand for robust illuminant estimation.

• A novel notion of ‘branch-level ensemble’ is introduced.

• Through a winner-take-all learning scheme, the two branches of HypNet are encouraged to specialize in estimating illuminants for specific regions.

• SelNet yields much better final predictions than simply averaging the hypotheses.

Methods

The proposed Deep Specialized Network consists of two closely coupled sub-networks:

(1) HypNet

• Generates two competing hypotheses for an illuminant estimate of an image patch.

• The ‘winner-take-all’ learning strategy:

  L(Θ) = min_{k ∈ {A, B}} ‖Ẽ_k − E*‖₂²   (2)

  where Ẽ_k is the estimate of branch k and E* is the ground-truth illumination. Per patch, only the winning branch is penalized and updated (a loss sketch follows).

(2) SelNet

• Generates a score vector to pick the final illuminant estimate from one of the branches in HypNet.

• During training, the ground-truth label for SelNet is generated from the ground-truth illumination and the two hypotheses obtained from HypNet (see the sketch below).
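The poster does not spell out the labeling rule; a plausible reading is that the branch whose hypothesis lies closer to the ground truth becomes SelNet's target class. A sketch under that assumption, using the angular error reported in the result tables; all names are ours:

```python
import numpy as np

def selnet_label(e_a, e_b, e_star):
    """Assumed rule: label 0 if the A-branch hypothesis has the smaller
    angular error w.r.t. the ground-truth illuminant, else label 1."""
    def angular_error(e, gt):
        cos = np.dot(e, gt) / (np.linalg.norm(e) * np.linalg.norm(gt))
        return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return 0 if angular_error(e_a, e_star) <= angular_error(e_b, e_star) else 1
```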

References

[1] Peter Vincent Gehler, Carsten Rother, Andrew Blake, Tom Minka, and Toby Sharp. Bayesian color constancy revisited. In CVPR, pages 1–8, 2008.

[2] L. Shi and B. Funt. Re-processed version of the Gehler color constancy dataset of 568 images. Accessed from http://www.cs.sfu.ca/~colour/data/.

[3] Dongliang Cheng, Dilip K. Prasad, and Michael S. Brown. Illuminant estimation for color constancy: why spatial-domain methods work and the role of the color distribution. JOSA A, 31(5):1049–1058, 2014.

Methods (Network)

[Architecture diagram. The input RGB image I is converted to UV (the diagram shows per-channel mean subtraction on one of the two input streams), yielding 2×44×44 patches fed to both sub-networks. HypNet: conv1, 8×8 kernel, stride 4 → 128×10×10 feature maps; conv2, 4×4 kernel, stride 2 → 256×4×4 feature maps; fc1 with 256 units; an fc2 output layer split into an A-branch and a B-branch, each regressing an illuminant hypothesis. SelNet: the same conv1/conv2/fc1/fc2 layout, ending in a softmax whose scores select the output between the two branches. The chosen illuminant then drives restoration of the input image.]
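From the axes of Figure 1, the UV conversion appears to be u = log(r/g), v = log(b/g). A preprocessing sketch under that assumption (the ε guard is ours; the mean subtraction follows the diagram):

```python
import numpy as np

def to_uv(rgb_patch, eps=1e-6):
    """Assumed UV conversion: u = log(r/g), v = log(b/g), then subtract
    per-channel means, giving the 2 x 44 x 44 input in the diagram."""
    r, g, b = rgb_patch[..., 0], rgb_patch[..., 1], rgb_patch[..., 2]
    u = np.log((r + eps) / (g + eps))
    v = np.log((b + eps) / (g + eps))
    uv = np.stack([u, v], axis=0)                    # 2 x H x W
    return uv - uv.mean(axis=(1, 2), keepdims=True)  # per-channel mean subtraction
```

The layer shapes in the diagram pin down the shared trunk exactly (an 8×8 stride-4 conv to 128×10×10, a 4×4 stride-2 conv to 256×4×4, then 256 fc units). A PyTorch sketch, where the ReLUs and the 2-D output size are our assumptions:

```python
import torch.nn as nn

class Trunk(nn.Module):
    """Shared layout read off the diagram: 2x44x44 -> conv1 (8x8, s4) ->
    128x10x10 -> conv2 (4x4, s2) -> 256x4x4 -> fc1 (256 units)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 128, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(128, 256, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.fc1 = nn.Sequential(nn.Flatten(), nn.Linear(256 * 4 * 4, 256), nn.ReLU())

class HypNet(Trunk):
    """fc2 output layer split into A- and B-branches, each regressing a
    (here 2-D UV) illuminant hypothesis."""
    def __init__(self, out_dim=2):
        super().__init__()
        self.branch_a = nn.Linear(256, out_dim)
        self.branch_b = nn.Linear(256, out_dim)

    def forward(self, x):
        h = self.fc1(self.features(x))
        return self.branch_a(h), self.branch_b(h)

class SelNet(Trunk):
    """Same trunk; fc2 yields two scores, softmaxed at selection time."""
    def __init__(self):
        super().__init__()
        self.fc2 = nn.Linear(256, 2)

    def forward(self, x):
        return self.fc2(self.fc1(self.features(x)))  # logits for {A, B}
```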

Experimental Results

HypNet Results

[Figure panels: an input image and its ground-truth illuminant, with the A-branch and B-branch estimates at iterations 10K, 100K, and 1000K, plotted in the log(r/g) versus log(b/g) chromaticity plane.]

Figure 1: Learning of two branches of HypNet.

SelNet Results

[Figure panels, for two input images: (a) A-branch, (b) B-branch, (c) One Branch, (d) DS-Net (Avg), (e) HypNet+SelNet, (f) HypNet+Oracle, shown alongside the ground truth.]

Figure 2: Different selection schemes.
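A hedged sketch of the selection schemes compared in Figure 2, applied to per-patch hypotheses stacked over a batch; the array shapes and names are ours, and the oracle is usable only when the ground truth is known:

```python
import numpy as np

def combine(e_a, e_b, scores=None, e_star=None, scheme="selnet"):
    """e_a, e_b: (N, C) per-patch hypotheses; scores: (N, 2) SelNet
    outputs; e_star: (N, C) ground truth, used only by the oracle."""
    if scheme == "average":                      # (d) DS-Net (Avg)
        return (e_a + e_b) / 2.0
    picks = np.stack([e_a, e_b], axis=1)         # (N, 2, C)
    if scheme == "selnet":                       # (e) HypNet+SelNet
        idx = scores.argmax(axis=1)
    elif scheme == "oracle":                     # (f) HypNet+Oracle
        idx = (np.linalg.norm(e_b - e_star, axis=1)
               < np.linalg.norm(e_a - e_star, axis=1)).astype(int)
    else:
        raise ValueError(scheme)
    return picks[np.arange(len(idx)), idx]       # winner per patch
```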

Experimental Results

Global-Illuminant Setting

[Qualitative examples: two input images with their ground truth, corrected by the proposed method (angular errors 0.52° and 1.40°), Regression Tree (3.65° and 3.38°), and CCC (2.24° and 5.50°).]
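DS-Net predicts per patch, so the global-illuminant setting needs some pooling of the per-patch picks into a single estimate. The poster does not state the pooling rule; a per-channel median is one robust possibility, shown purely as an assumption:

```python
import numpy as np

def global_illuminant(patch_picks):
    """Assumed pooling: per-channel median over the (N, C) per-patch
    selected hypotheses; robust to outlier patches. Not stated on the
    poster."""
    return np.median(np.asarray(patch_picks), axis=0)
```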

Methods                          Mean   Median  Trimean
White-Patch                      7.55   5.68    6.35
Edge-based Gamut                 6.52   5.04    5.43
Gray-World                       6.36   6.28    6.28
1st-order Gray-Edge              5.33   4.52    4.73
2nd-order Gray-Edge              5.13   4.44    4.62
Shades-of-Gray                   4.93   4.01    4.23
Bayesian                         4.82   3.46    3.88
General Gray-World               4.66   3.48    3.81
Intersection-based Gamut         4.20   2.39    2.93
Pixel-based Gamut                4.20   2.33    2.91
Natural Image Statistics         4.19   3.13    3.45
Bright Pixels                    3.98   2.61    –
Spatio-spectral (GenPrior)       3.59   2.96    3.10
Cheng et al.                     3.52   2.14    2.47
Corrected-Moment (19 Color)      3.50   2.60    –
Exemplar-based                   3.10   2.30    –
Corrected-Moment (19 Edge)       2.80   2.00    –
CNN                              2.36   1.98    –
Regression Tree                  2.42   1.65    1.75
CCC (disc+ext)                   1.95   1.22    1.38
HypNet One Branch                2.18   1.35    1.54
HypNet (A-branch)                5.06   4.38    4.52
HypNet (B-branch)                4.55   2.35    3.10
DS-Net (Average)                 3.74   2.99    3.18
DS-Net (HypNet+SelNet)           1.90   1.12    1.33
DS-Net (HypNet+Oracle)           1.15   0.76    0.86

Table 1: Angular errors (degrees) on the Color Checker [1, 2] dataset.

Methods                          Mean   Median  Trimean
White-Patch                      10.62  10.58   10.49
Edge-based Gamut                 8.43   7.05    7.37
Pixel-based Gamut                7.70   6.71    6.90
Intersection-based Gamut         7.20   5.96    6.28
Gray-World                       4.14   3.20    3.39
Bayesian                         3.67   2.73    2.91
Natural Image Statistics         3.71   2.60    2.84
Shades-of-Gray                   3.40   2.57    2.73
Spatio-spectral (ML)             3.11   2.49    2.60
General Gray-World               3.21   2.38    2.53
2nd-order Gray-Edge              3.20   2.26    2.44
Bright Pixels                    3.17   2.41    2.55
1st-order Gray-Edge              3.20   2.22    2.43
Spatio-spectral (GenPrior)       2.96   2.33    2.47
Cheng et al.                     2.92   2.04    2.24
CCC (disc+ext)                   2.38   1.48    1.69
Regression Tree                  2.36   1.59    1.74
HypNet One Branch                2.56   1.87    2.01
HypNet (A-branch)                3.49   2.94    3.03
HypNet (B-branch)                5.17   2.91    3.50
DS-Net (Average)                 3.41   2.36    2.72
DS-Net (HypNet+SelNet)           2.24   1.46    1.68
DS-Net (HypNet+Oracle)           1.32   0.93    1.01

Table 2: Angular errors (degrees) on the NUS 8-camera [3] dataset.
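For reference, the three statistics reported in Tables 1 and 2 are computed over per-image angular errors; the trimean is Tukey's (Q1 + 2·Q2 + Q3) / 4:

```python
import numpy as np

def summarize(errors):
    """Mean, median, and trimean of per-image angular errors (degrees)."""
    e = np.asarray(errors, dtype=np.float64)
    q1, q2, q3 = np.percentile(e, [25, 50, 75])
    return {"mean": e.mean(), "median": q2, "trimean": (q1 + 2 * q2 + q3) / 4}
```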