SharinGAN: Combining Synthetic and Real Data for Unsupervised Geometry Estimation
Supplementary Material
Koutilya PNVR
koutilya@terpmail.umd.edu
Hao Zhou∗
hzhou@cs.umd.edu
David Jacobs
djacobs@umiacs.umd.edu
University of Maryland, College Park, MD, USA.
1. More Implementation details

The discriminator architecture we used for this work is: {CBR(n, 3, 1), CBR(2n, 3, 2)}_{n={32,64,128,256}}, {CBR(512, 3, 1), CBR(512, 3, 2)} repeated for K sets, {FcBR(1024), FcBR(512), Fc(1)}, where CBR(out_channels, kernel_size, stride) = Conv + BatchNorm2d + ReLU, FcBR(out_nodes) = Fully connected + BatchNorm1d + ReLU, and Fc is a plain fully connected layer. For face normal estimation, we do not use batchnorm layers in the discriminator. We use the value K = 2 for monocular depth estimation (MDE) and K = 1 for face normal estimation (FNE).
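To make the compact notation above concrete, here is a small Python sketch (ours, not the authors' code) that expands it into an ordered list of layer descriptions; the helper name and the string format are purely illustrative:

```python
def build_discriminator_spec(K=2, use_batchnorm=True):
    """Expand the compact discriminator notation into an ordered layer list.

    CBR(c, k, s) = Conv(out_channels=c, kernel=k, stride=s) + BatchNorm2d + ReLU
    FcBR(n)      = Fully connected(n) + BatchNorm1d + ReLU
    Fc(n)        = plain fully connected layer
    Batchnorm is dropped for face normal estimation (use_batchnorm=False).
    """
    bn = "+BN" if use_batchnorm else ""
    layers = []
    # Four stages of {CBR(n, 3, 1), CBR(2n, 3, 2)} for n in {32, 64, 128, 256}
    for n in (32, 64, 128, 256):
        layers.append(f"Conv({n},k3,s1){bn}+ReLU")
        layers.append(f"Conv({2 * n},k3,s2){bn}+ReLU")
    # K sets of {CBR(512, 3, 1), CBR(512, 3, 2)}; K=2 for MDE, K=1 for FNE
    for _ in range(K):
        layers.append(f"Conv(512,k3,s1){bn}+ReLU")
        layers.append(f"Conv(512,k3,s2){bn}+ReLU")
    # Fully connected head ending in a single scalar output
    layers.append(f"Fc(1024){bn}+ReLU")
    layers.append(f"Fc(512){bn}+ReLU")
    layers.append("Fc(1)")
    return layers
```

Under this reading, the MDE discriminator (K = 2) has 15 layer blocks and the FNE discriminator (K = 1, no batchnorm) has 13.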
Face Normal Estimation  We update the generator 3 times for each update of the discriminator, which in turn is updated 5 times internally, as per [1, 3]. The generator learns from a new batch at each step, while the discriminator trains 5 times on a single batch.
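The alternating schedule above can be sketched as pure-Python bookkeeping (counts only, no real networks; the function and constant names are ours, not the authors'):

```python
def training_schedule(num_outer_steps):
    """Sketch of the alternating update schedule: per outer step, 3 generator
    updates (each on a fresh batch), then one discriminator update consisting
    of 5 internal critic steps on a single shared batch (WGAN-GP style [1, 3])."""
    GEN_UPDATES_PER_DISC = 3
    CRITIC_INNER_STEPS = 5
    log = []
    for step in range(num_outer_steps):
        for _ in range(GEN_UPDATES_PER_DISC):
            log.append("G")            # generator step, new batch each time
        batch = f"batch_{step}"        # one batch reused for all critic steps
        for _ in range(CRITIC_INNER_STEPS):
            log.append(("D", batch))   # critic step on that same batch
    return log
```

So over, say, 2 outer steps the generator takes 6 steps and the discriminator takes 10 internal steps drawn from only 2 distinct batches.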
2. Experiments

Monocular Depth Estimation  We provide more qualitative results on the test set of the Make3D dataset [5]. Figure 2 further demonstrates the generalization ability of our method compared to [8].
Face Normal Estimation  Figure 3 depicts the qualitative results on the CelebA [4] and synthetic [6] datasets. The translated images corresponding to synthetic and real images look similar, in contrast to the MDE task (Figure 4 of the paper). We suppose that for the task of MDE, regions such as edges are domain specific and yet hold primary task-related information such as depth cues, which is why SharinGAN modifies such regions. However, for the task of FNE, we additionally predict albedo, lighting, shading, and a reconstructed image along with estimating normals. This means that the primary network needs a lot of shared information across domains for good generalization to real data. Thus the SharinGAN module seems to bring everything into a shared space, making the translated images {x_r^{sh}, x_s^{sh}} look visually similar.
∗Hao Zhou is currently at Amazon AWS.
Figure 1: Additional qualitative comparisons of our method with SfSNet on examples from the test set of the Photoface dataset [7]. Panels: (a) input image, (b) ground truth, (c) SfSNet [6], (d) SharinGAN. Our method generalizes much better to data unseen during training.
Figure 1 depicts additional qualitative results of the predicted face normals for the test set of the Photoface dataset [7].
Algorithm      top-1%   top-2%   top-3%
SfSNet [6]     80.25    92.99    96.55
SharinGAN      81.83    93.88    96.69

Table 1: Light classification accuracy on the MultiPIE dataset [2]. Training with the proposed SharinGAN also improves lighting estimation along with face normals.
Lighting Estimation  The primary network estimates not only face normals but also lighting, which we evaluate as well. Following an evaluation protocol similar to that of [6], Table 1 summarizes the light classification accuracy on the MultiPIE dataset [2]. Since we do not have the exact cropped dataset that [6] used, we applied our own cropping and resizing to the original MultiPIE data: a 300x300 center crop followed by a resize to 128x128. For a fair comparison, we used the same dataset to re-evaluate the lighting performance of [6] and report the results in Table 1. Our method outperforms [6] not only on face normal estimation, but also on lighting estimation.

Figure 2: Additional qualitative results on the test set of the Make3D dataset [5]. Panels: (a) input image, (b) ground truth, (c) GASDA [8], (d) SharinGAN. Our method captures better depth estimates than [8] on all the examples.
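As a reproducibility aid, the MultiPIE preprocessing described above can be sketched as follows (a minimal sketch assuming the standard 640x480 MultiPIE resolution; the helper name is ours):

```python
def center_crop_box(width, height, crop=300):
    """Return the (left, top, right, bottom) box of a centered square crop,
    matching the preprocessing described above: a 300x300 center crop,
    which is then resized to 128x128."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return (left, top, left + crop, top + crop)
```

With Pillow this would be applied as, e.g., `Image.open(path).crop(center_crop_box(640, 480)).resize((128, 128))`; for a 640x480 image the crop box is (170, 90, 470, 390).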
References

[1] Martin Arjovsky, Soumith Chintala, and Leon Bottou. Wasserstein generative adversarial networks. In ICML, 2017.
[2] Ralph Gross, Iain Matthews, Jeffrey Cohn, Takeo Kanade, and Simon Baker. Multi-PIE. Image and Vision Computing, 28(5):807-813, 2010. Best of Automatic Face and Gesture Recognition 2008.
[3] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C. Courville. Improved training of Wasserstein GANs. In NeurIPS, 2017.
[4] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In ICCV, 2015.
[5] Ashutosh Saxena, Min Sun, and Andrew Y. Ng. Make3D: Learning 3D scene structure from a single still image. IEEE Trans. PAMI, 31(5):824-840, 2009.
[6] Soumyadip Sengupta, Angjoo Kanazawa, Carlos D. Castillo, and David W. Jacobs. SfSNet: Learning shape, reflectance and illuminance of faces in the wild. In CVPR, 2018.
[7] Stefanos Zafeiriou, Mark F. Hansen, Gary A. Atkinson, Vasileios Argyriou, Maria Petrou, Melvyn L. Smith, and Lyndon N. Smith. The Photoface database. In CVPR Workshops, 2011.
[8] Shanshan Zhao, Huan Fu, Mingming Gong, and Dacheng Tao. Geometry-aware symmetric domain adaptation for monocular depth estimation. In CVPR, 2019.
Figure 3: Qualitative results of our method on the face normal estimation task. (a) Results on the CelebA test set [4]: input image x_r, translated image x_r^{sh} = G(x_r), and the predicted normal, albedo, shading, and reconstruction. (b) Results on the synthetic data used in [6]: input image x_s, translated image x_s^{sh} = G(x_s), and the same predictions. The translated images x_r^{sh} and x_s^{sh} look reasonably similar for our task, which additionally predicts albedo, lighting, shading, and a reconstructed image along with the face normal.