DO DEEP GENERATIVE MODELS KNOW WHAT THEY DON’T KNOW?
Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan (ICLR 2019 poster)


1. INTRODUCTION

● Discriminative models are susceptible to overconfidence on out-of-distribution (OOD) inputs. Generative models are widely believed to be more robust to such inputs as they also model p(x) [Bishop, 1994].

● We challenge this assumption, showing that deep generative models can assign higher density estimates to an OOD dataset than to the training data!

● This phenomenon has implications not just for anomaly detection but also for detecting covariate shift, open-set classification, active learning, semi-supervised learning, etc.

2. MOTIVATING OBSERVATION: CIFAR-10 VS SVHN

We trained Glow [Kingma & Dhariwal, 2018] on CIFAR-10 and evaluated the model on SVHN. We find that Glow assigns a higher likelihood to SVHN than to CIFAR-10 (both the train and test splits).
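As an illustration only (not the authors' code), here is a minimal sketch of this comparison in PyTorch; the `glow` object and its `log_prob(x)` method are hypothetical stand-ins for a trained flow model:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Load both test sets with the same preprocessing the model was trained with
# (here simply scaling pixels to [0, 1]; adjust to match the actual training setup).
to_tensor = transforms.ToTensor()
cifar_test = datasets.CIFAR10("data", train=False, download=True, transform=to_tensor)
svhn_test = datasets.SVHN("data", split="test", download=True, transform=to_tensor)

@torch.no_grad()
def mean_log_likelihood(model, dataset, batch_size=256):
    """Average per-example log p(x), in nats, under `model`."""
    loader = DataLoader(dataset, batch_size=batch_size)
    total, count = 0.0, 0
    for x, _ in loader:
        total += model.log_prob(x).sum().item()  # hypothetical log_prob(x) -> (batch,)
        count += x.shape[0]
    return total / count

# Usage (assumes `glow` is a flow trained on CIFAR-10 that exposes log_prob);
# the surprising finding is that the OOD set (SVHN) scores higher than CIFAR-10:
# print("CIFAR-10 test:", mean_log_likelihood(glow, cifar_test))
# print("SVHN test:    ", mean_log_likelihood(glow, svhn_test))
```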

[Figure: Histogram of Glow log-likelihoods for CIFAR-10 (train and test) and SVHN; higher log p(x) is better.]

3. TESTING OTHER DEEP GENERATIVE MODEL CLASSES

This phenomenon is also observed in two other classes of deep generative models: auto-regressive models (PixelCNN) and latent-variable models (variational auto-encoders).
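For reference, bits per dimension (BPD), the metric reported in the PixelCNN plots below, is the negative log-likelihood rescaled per pixel-channel (a standard conversion, not transcribed from the poster):

```latex
\mathrm{BPD}(\mathbf{x}) \;=\; \frac{-\log_{2} p(\mathbf{x})}{D} \;=\; \frac{-\log p(\mathbf{x})}{D \ln 2},
\qquad D = H \times W \times C \;\;(\text{e.g. } 32 \times 32 \times 3 = 3072 \text{ for CIFAR-10}),
```

so a lower BPD corresponds to a higher likelihood.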

[Figures: PixelCNN log-likelihoods (in bits per dimension; lower BPD is better) and VAE log-likelihoods, for CIFAR-10 train vs SVHN test and for FashionMNIST train vs MNIST test.]

4. TESTING GLOW ON OTHER DATA SETS

The phenomenon is asymmetric w.r.t. the datasets: training on SVHN and evaluating on CIFAR-10 results in the expected ordering (SVHN is assigned the higher likelihood).

[Figure: Glow log-likelihoods when trained on SVHN and tested on CIFAR-10.]

We find further evidence of the phenomenon in five other data set pairs:

● FashionMNIST vs MNIST
● CelebA vs SVHN
● ImageNet vs SVHN / CIFAR

5. DIGGING DEEPER INTO GLOW

● We also observe that constant inputs have the highest log-likelihood of any (tested) input.

● Furthermore, we find that SVHN is assigned higher likelihood than CIFAR-10 over the entire course of training.

● Ensembling generative models does not help.

To make theoretical analysis more tractable, we restrict Glow to constant-volume (CV) transformations, i.e., the Jacobian determinant of the flow does not depend on the input. We see similar CIFAR-vs-SVHN results for this restricted model.
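For context (the standard change-of-variables identity, not transcribed from the poster), the flow likelihood decomposes as below; the CV restriction means the log-determinant term is a single constant that does not depend on x:

```latex
\log p(\mathbf{x}) \;=\; \log p_{z}\!\big(f(\mathbf{x})\big)
\;+\; \log\left|\det \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}\right|,
\qquad
\log\left|\det \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}\right| = \text{const.} \;\; \text{for CV-Glow}.
```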

[Figure: CV-Glow log-likelihood histograms, CIFAR-10 vs SVHN.]

For CV-Glow, we can approximate the difference in likelihoods between the training and OOD data as follows:
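The equation itself did not survive transcription; schematically, and in our own notation (not the poster's), it has the form below, where c indexes channels, \sigma^{2}_{\cdot,c} is the per-channel second moment of the data, and a_c(\theta) is a non-negative coefficient built from CV-Glow's parameters (in the paper, from its 1x1-convolution weights and the latent-prior variance):

```latex
\mathbb{E}_{\text{OOD}}\!\left[\log p(\mathbf{x};\theta)\right] \;-\; \mathbb{E}_{\text{train}}\!\left[\log p(\mathbf{x};\theta)\right]
\;\approx\; \sum_{c} a_{c}(\theta)\,\left(\sigma^{2}_{\text{train},c} - \sigma^{2}_{\text{OOD},c}\right),
\qquad a_{c}(\theta) \ge 0 .
```

In words: whenever the OOD data has smaller per-channel second moments than the training data (as SVHN does relative to CIFAR-10), its expected likelihood is at least as high, regardless of the learned parameter values.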

This expression helps explain several observations:

1. Asymmetry: the difference between second moments does not commute, so swapping the training and OOD datasets flips the sign.

2. Constant / grayscale inputs: equivalent to the non-training second moment being zero. Graying images increases likelihoods (see the sketch after this list).

3. Early stopping / ensembling would not help: expression holds true for all values of CV-Glow’s parameters.
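To make the graying manipulation in point 2 concrete, a minimal sketch (not the authors' code; `glow.log_prob` is the same hypothetical interface as in the earlier sketch):

```python
import torch

def gray(x: torch.Tensor) -> torch.Tensor:
    """Replace each RGB image in a batch (B, 3, H, W) by its channel-wise mean,
    replicated back to three channels so the model's input shape is unchanged."""
    return x.mean(dim=1, keepdim=True).expand_as(x).contiguous()

# Usage (assumes the hypothetical `glow.log_prob` from the earlier sketch); the
# poster's observation is that the grayed images receive higher log-likelihood:
# logp_original = glow.log_prob(x_batch)
# logp_grayed   = glow.log_prob(gray(x_batch))
```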

6. SUMMARY

Density estimates from (current) deep generative models are not always able to detect out-of-distribution inputs.

Paper: https://arxiv.org/abs/1810.09136

[Figures referenced in Section 5: BPD vs training iteration; graying images increases likelihoods.]