Squeezed Convolutional Variational AutoEncoder

Presenter: Keren Ye

Kim, Dohyung, et al. "Squeezed Convolutional Variational AutoEncoder for Unsupervised Anomaly Detection in Edge Device Industrial Internet of Things." arXiv preprint arXiv:1712.06343 (2017).

Page 1: AutoEncoder Squeezed Convolutional Variationalpeople.cs.pitt.edu/~mosse/courses/cs3720/Squeezed... · 2018-04-17 · Squeezed Convolutional Variational AutoEncoder Presenter: Keren

Background - Anomaly detection

No labels for defects and malfunctions

A diagnosis of the behavior pattern of processes

● Assumed that an abnormal behavior pattern means an anomaly

Background - Edge computing

Cloud-based approach

● Central server to process data

Edge-based approach

● Receiving more attention
● Real-time inference on edge devices
● Lower communication costs than the cloud-based approach
● Reduces the burden on communication and computing infrastructure
● Neural networks (NN) are compute- and memory-intensive, yet they perform well

Contribution - SCVAE

Unsupervised

● labeled data is not required

Time series sensor data

● a practical case of Industrial Internet of Things (IIoT)

Anomaly detection

● Proposed to use a specific NN model to handle the problem

Contribution - SCVAE

Edge computing

● Propose ways to reduce model size and inference time on edge devices

Evaluation

● Match-General metric for unlabeled data

Related work - DL based anomaly detection

Variational AutoEncoder (VAE)

● Model the data distribution, then try to reconstruct the data
● Outliers that cannot be reconstructed are anomalous
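For reference, the usual VAE training objective (the evidence lower bound from the general VAE literature, not spelled out on the slide) can be written as follows. The symbols are notation assumed here: q_φ(z|x) is the encoder, p_θ(x|z) the decoder, and p(z) the prior over the latent factor z.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
```

The first term rewards good reconstruction; the second keeps the encoder's distribution close to the prior.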

Generative Adversarial Networks (GAN)

● G model: generate data to fool D model
● D model: determine if the data is generated by G or from the dataset
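The two-player game between G and D is usually stated as the minimax objective from Goodfellow et al. (cited on this slide). The slide itself does not show the formula; p_data and p_z are the data and noise distributions in that paper's notation.

```latex
\min_G \max_D \; V(D, G) =
\mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\log D(x)\right]
+ \mathbb{E}_{z \sim p_z(z)}\!\left[\log\!\left(1 - D(G(z))\right)\right]
```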

An, Jinwon, and Sungzoon Cho. "Variational autoencoder based anomaly detection using reconstruction probability." SNU Data Mining Center, Tech. Rep. (2015).
Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.

Related work - DL based anomaly detection

Variational AutoEncoder (VAE)

● The label is the same as the input
● Outliers cannot be reconstructed

Image credit (left): Introduction to Principal Component Analysis (PCA), https://docs.opencv.org/3.1.0/d1/dee/tutorial_introduction_to_pca.html
Image credit (right): Keras Tutorial: Content Based Image Retrieval Using a Convolutional Denoising Autoencoder, https://blog.sicara.com/keras-tutorial-content-based-image-retrieval-convolutional-denoising-autoencoder-dc91450cc511

Related work - Time series data

Fault Detection and Classification Convolutional Neural Networks

● The CNN's receptive field was matched to the multivariate sensor signal
● The CNN filter was moved along the time axis to extract meaningful features from the sensor data
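A minimal sketch of the idea in the two bullets above: a kernel whose width spans all sensors slides along the time axis only. The function name, the toy signal, and the kernel weights are made up for illustration; in the cited work the kernel is learned.

```python
def conv_along_time(signal, kernel):
    """Slide a kernel that covers every sensor along the time axis.

    signal: list of T rows, each holding S sensor readings.
    kernel: list of K rows, each holding S weights (spans all sensors).
    Returns T - K + 1 feature values (stride 1, no padding).
    """
    T, K = len(signal), len(kernel)
    out = []
    for t in range(T - K + 1):
        acc = 0.0
        for k in range(K):
            for s in range(len(kernel[k])):
                acc += signal[t + k][s] * kernel[k][s]
        out.append(acc)
    return out


# Toy data: 4 time steps, 2 sensors (hypothetical values).
signal = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]]
# Hypothetical kernel covering 2 time steps and both sensors.
kernel = [[1.0, 1.0], [-1.0, 0.0]]
features = conv_along_time(signal, kernel)
print(features)
```

Because the kernel already covers the full sensor axis, there is nothing to slide over in that direction, so the output is a 1D feature sequence over time.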

CNN demo: LINK

Note: the kernel is learnable

Lee, Ki Bum, Sejune Cheon, and Chang Ouk Kim. "A convolutional neural network for fault classification and diagnosis in semiconductor manufacturing processes." IEEE Transactions on Semiconductor Manufacturing 30.2 (2017): 135-142.

Related work - Time series data

Benefits of using CNN

● A 1D conv filter along the time axis can fill in missing values using historical information

● A 1D conv filter along the sensor axis can fill in missing values using data from other sensors

● A 2D conv filter uses both kinds of information

Autoregression is a special case of CNN

● 1D conv filter, kernel size equals the input size

Mukherjee, Debnath, and Suman Datta. "Incremental time series algorithms for IoT analytics: an example from autoregression." Proceedings of the 17th International Conference on Distributed Computing and Networking. ACM, 2016.
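The claim that autoregression is a special case of a CNN can be checked numerically. The sketch below assumes an AR(3) model without an intercept and with made-up coefficients; when the kernel is as long as the input window, the convolution yields a single value equal to the AR prediction.

```python
def conv1d_valid(x, kernel):
    """1D convolution (cross-correlation form), stride 1, no padding."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def ar_predict(history, coeffs):
    """AR(p) one-step forecast; coeffs[0] weights the most recent value."""
    return sum(c * v for c, v in zip(coeffs, reversed(history)))


history = [1.0, 2.0, 4.0]          # oldest value first
coeffs = [0.5, 0.25, 0.125]        # hypothetical phi_1, phi_2, phi_3
kernel = list(reversed(coeffs))    # align the kernel with oldest-first input

conv_out = conv1d_valid(history, kernel)
print(conv_out, ar_predict(history, coeffs))
```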

Related work - Edge computing

Reduce both the size and inference time

● Pruning and weight quantization (Deep compression)

● Modified CNN structure (SqueezeNet, MobileNet)
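To make the first bullet concrete, here is a simplified sketch of magnitude pruning followed by uniform quantization. This is not the Deep Compression pipeline itself (which uses trained quantization and Huffman coding); the threshold, level count, and weights are assumptions chosen for illustration.

```python
def prune_by_magnitude(weights, threshold):
    """Zero out weights whose absolute value is below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize_uniform(weights, num_levels):
    """Snap each weight to the nearest of num_levels (>= 2) evenly spaced values."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)
    step = (hi - lo) / (num_levels - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


weights = [-1.0, -0.05, 0.02, 0.4, 1.0]    # hypothetical layer weights
pruned = prune_by_magnitude(weights, 0.1)   # small weights become zero
quantized = quantize_uniform(pruned, 3)     # only 3 distinct values remain
print(pruned, quantized)
```

Pruned weights can be stored sparsely, and quantized weights need only a few bits each, which is where the size reduction comes from.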

SqueezeNet (MobileNet is similar)

● Fire module: decompose a conv layer into a squeeze layer and an expand layer

Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding." arXiv preprint arXiv:1510.00149 (2015).
Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size." arXiv preprint arXiv:1602.07360 (2016).
Howard, Andrew G., et al. "Mobilenets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).

Method - CNN-Variational Autoencoder

Input data - expressed in the form of 2D image data

Figure: input matrix with axes "Time windows" and "# of features"
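One way to build such 2D inputs is to cut the multivariate series into fixed-length windows, giving one (window length x number of features) matrix per window. The function name, window length, and stride below are assumptions; the slide does not give the values used in the paper.

```python
def make_windows(series, window, stride=1):
    """Cut a series of T rows (F features each) into window x F matrices."""
    return [series[start:start + window]
            for start in range(0, len(series) - window + 1, stride)]


# Toy series: 5 time steps, 2 features (hypothetical values).
series = [[t, 10 * t] for t in range(5)]
windows = make_windows(series, window=3)
print(len(windows), windows[0])
```

Each window can then be fed to the network like a single-channel image.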

Method - CNN-Variational Autoencoder

Network architecture

Explanation (Inference)

● Use input x to generate latent factor z
● Use z to reconstruct the distribution of x
● If x is unlikely under that distribution (below a pre-defined threshold), x is an outlier
● During training, the label is x itself
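The thresholding step can be sketched as follows, assuming the decoder outputs a per-feature Gaussian mean and standard deviation. The mean, std, and threshold values are stand-ins, and the sketch scores a single decoder output rather than averaging over several samples of z.

```python
import math


def gaussian_log_likelihood(x, mean, std):
    """Sum of per-feature log N(x_i; mean_i, std_i^2)."""
    total = 0.0
    for xi, mi, si in zip(x, mean, std):
        total += (-0.5 * math.log(2 * math.pi * si * si)
                  - (xi - mi) ** 2 / (2 * si * si))
    return total


def is_anomaly(x, mean, std, threshold):
    """Flag x when its log-likelihood falls below the threshold."""
    return gaussian_log_likelihood(x, mean, std) < threshold


# Stand-ins for what a trained decoder would output (hypothetical).
mean, std = [0.0, 1.0], [1.0, 1.0]
threshold = -5.0                     # assumed; tuned on validation data in practice

print(is_anomaly([0.1, 0.9], mean, std, threshold))
print(is_anomaly([4.0, -3.0], mean, std, threshold))
```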

Method - Squeezed Architecture

Replace conv layers with the Fire modules used in SqueezeNet

● Squeeze layer: 1x1 conv filter
● Expand layer: 1x1 conv filter, 3x3 conv filter

For example, a 100x100x3 2D image

● Squeeze layer: a 1x1 conv filter may map the image to grayscale, output=100x100x1 (but you could use multiple such filters, e.g. 64)

● Expand layer: the 1x1 conv filter does a one-to-one (per-pixel) mapping, while the 3x3 conv filter may have a smoothing effect (a smoothing operation on the grayscale image)
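A rough weight count shows why the Fire module shrinks the model. The channel sizes below are illustrative assumptions (they are not taken from the slides), and biases are ignored.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weight count of a kernel x kernel conv layer (bias ignored)."""
    return kernel * kernel * in_ch * out_ch


def fire_params(in_ch, squeeze, expand1x1, expand3x3):
    """Weights in a Fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expand."""
    return (conv_params(in_ch, squeeze, 1)
            + conv_params(squeeze, expand1x1, 1)
            + conv_params(squeeze, expand3x3, 3))


# Both variants map 96 input channels to 128 output channels.
plain = conv_params(96, 128, 3)
fire = fire_params(96, squeeze=16, expand1x1=64, expand3x3=64)
print(plain, fire, round(plain / fire, 1))
```

The saving comes from the 3x3 filters seeing only the 16 squeezed channels instead of all 96 input channels.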

Experiment

Performance compared to baselines (IF, LOF, OCSVM, EE) on labeled data

Performance for unlabeled real world Computer Numerical Control (CNC) data

Model size and inference time

Experiment - labeled data

Dataset

● Ozone, Occupancy

Label

● Ozone: normal / ozone day
● Occupancy: someone in the office room or not

X. Y. Kun Zhang, Wei Fan, “Ozone level detection data set,” 2008. [Online]. Available: https://archive.ics.uci.edu/ml/datasets/Ozone+Level+Detection
V. F. Luis M. Candanedo, “UCI accurate occupancy detection of an office room from light, temperature, humidity and co2 measurements using statistical learning models,” 2016. [Online]. Available: https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection

Experiment - labeled data

Dataset

● Ozone, Occupancy

Evaluation

● Metric: Area under the precision-recall curve (PRAUC)

● Anomaly labels are included in these datasets, yet not used during training

X. Y. Kun Zhang, Wei Fan, “Ozone level detection data set,” 2008. [Online]. Available: https://archive.ics.uci.edu/ml/datasets/Ozone+Level+Detection
V. F. Luis M. Candanedo, “UCI accurate occupancy detection of an office room from light, temperature, humidity and co2 measurements using statistical learning models,” 2016. [Online]. Available: https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection
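The PRAUC metric named on this slide can be approximated with average precision, a common step-wise estimate of the area under the precision-recall curve. This is a sketch under that assumption; the paper may use a different interpolation, and the labels and scores below are made up.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    labels: 1 = anomaly, 0 = normal.
    scores: higher means more anomalous. Ties are not treated specially.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            ap += hits / rank    # precision at each new recall level
    return ap / total_pos


labels = [1, 0, 1, 0]            # hypothetical ground truth
scores = [0.9, 0.8, 0.7, 0.1]    # hypothetical anomaly scores
ap = average_precision(labels, scores)
print(ap)
```

The labels are used only for scoring here, matching the slide's point that they are not used during training.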

Experiment - unlabeled CNC data

Metric - Match-general

● Common labels are created and treated as the actual labels
● A sample is labeled as an anomaly in the common labels if 3 or more of the 6 models identify it as an anomaly
● The higher the better
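The voting rule can be sketched as below. The "3 or more of 6" rule is from the slide; the `agreement` function is a hypothetical way to score one model against the common labels, since the slide does not spell out how the final Match-general value is computed. The predictions are made up.

```python
def common_labels(predictions, min_votes=3):
    """Majority-style vote: anomaly (1) if at least min_votes models say so.

    predictions: one list per model, with 1 = anomaly and 0 = normal.
    """
    n = len(predictions[0])
    return [1 if sum(model[i] for model in predictions) >= min_votes else 0
            for i in range(n)]


def agreement(model_pred, common):
    """Hypothetical score: fraction of samples matching the common labels."""
    return sum(p == c for p, c in zip(model_pred, common)) / len(common)


# Six models, four samples (hypothetical outputs).
predictions = [
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
common = common_labels(predictions)
print(common, agreement(predictions[0], common))
```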

CNC Dataset

Experiment - unlabeled CNC data

Experiment - model size and inference time

Baseline: CNN version vs. SqueezeNet version

● Performance (PRAUC) is similar
● Inference time and memory usage are reduced

Conclusion

Unsupervised

● labeled data is not required

Time series sensor data

● a practical case of Industrial Internet of Things (IIoT)

Anomaly detection

● Proposed to use a specific NN model to handle the problem

Edge computing

● Propose ways to reduce model size and inference time on edge devices

Thanks