
Visualizing, Measuring and Understanding Neural Networks

Zhiyuan Tang

tangzy@cslt.riit.tsinghua.edu.cn

August 1, 2016

Analyses of Deep Neural Networks

- visualizing
- quantitative vs. qualitative
- task-specific knowledge
- interpretation of hidden features
- pattern / event
- high-dimensional data
- discriminative vs. generative
- …

Computer Vision (1)

- interpreting high-level features
- stacked DAE (feedforward), MNIST
- activation maximization (sketched in code below)
- optimization: non-convex, gradient descent
- most random initializations yield the same pattern
- sensitivity analysis: specific unit, general unit
- hierarchical representation

[Figure: feature visualizations at layer 1, layer 2, and layer 3]

Erhan, Dumitru, et al. "Visualizing higher-layer features of a deep network." University of Montreal 1341 (2009).
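As a concrete illustration of the activation-maximization idea above, here is a minimal PyTorch sketch (not Erhan et al.'s exact setup, which used a stacked DAE trained on MNIST): gradient ascent on a random input to maximize one hidden unit's activation, renormalizing the input after each step. The two-layer network and the unit index are stand-ins.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Untrained stand-in encoder over 28x28 inputs; any differentiable net works.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.Sigmoid(),
                      nn.Linear(256, 64), nn.Sigmoid())
model.eval()

unit = 10                                           # hidden unit to visualize
x = torch.randn(1, 1, 28, 28, requires_grad=True)   # random initialization
opt = torch.optim.SGD([x], lr=0.1)

for _ in range(200):
    opt.zero_grad()
    act = model(x)[0, unit]
    (-act).backward()                    # ascend on the unit's activation
    opt.step()
    with torch.no_grad():
        x /= x.norm()                    # norm constraint on the input

image = x.detach().squeeze()             # the unit's "preferred input"
```

Re-running with different seeds compares initializations; the slide's point is that most of them converge to the same pattern.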

Computer Vision (2)

- contribution of different layers (CNN)
- mapping activities back to input pixel space
- Deconvolutional Network (prior)
- greater invariance at higher layers

[Figure: layer 2 feature visualizations]

Computer Vision (2)

- "dead" features
- translation, scale, and rotation: increasing invariance and class discrimination in higher layers
- correspondence analysis
- sensitivity analysis (occlusion sensitivity; sketched below)
- compositionality

Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." European Conference on Computer Vision. Springer International Publishing, 2014.
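The occlusion-sensitivity experiment above is easy to sketch: slide a grey patch over the image and record how the true-class probability changes. A hedged PyTorch sketch, with patch size, stride, and fill value as assumed parameters rather than the paper's exact settings:

```python
import torch
import torch.nn.functional as F

def occlusion_map(model, image, label, patch=8, stride=4, fill=0.0):
    """image: (C, H, W) tensor; returns a grid of true-class probabilities."""
    model.eval()
    C, H, W = image.shape
    rows = (H - patch) // stride + 1
    cols = (W - patch) // stride + 1
    heat = torch.zeros(rows, cols)
    with torch.no_grad():
        for i in range(rows):
            for j in range(cols):
                occluded = image.clone()
                y, x = i * stride, j * stride
                occluded[:, y:y + patch, x:x + patch] = fill   # mask one patch
                logits = model(occluded.unsqueeze(0))
                heat[i, j] = F.softmax(logits, dim=1)[0, label]
    return heat   # low values mark regions the classifier depends on
```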

Computer Vision (3)

- computing the gradient (CNN)
- generating an image that maximises the class score (cf. Google DeepDream)
- initialized with a zero image, with the training set mean added
- saliency maps: gradient-based (sketched below)
- weakly supervised object segmentation
- the link between gradient-based visualization and DeConvNet

Simonyan, Karen, Andrea Vedaldi, and Andrew Zisserman. "Deep inside convolutional networks: Visualising image classification models and saliency maps." arXiv preprint arXiv:1312.6034 (2013).
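The gradient-based saliency map reduces to a single backward pass: differentiate the (unnormalized) class score with respect to the input pixels and take the per-pixel maximum absolute gradient over channels. A minimal PyTorch sketch, assuming a standard image classifier:

```python
import torch

def saliency_map(model, image, label):
    """image: (C, H, W); returns an (H, W) map of input-gradient magnitudes."""
    model.eval()
    x = image.unsqueeze(0).clone().requires_grad_(True)
    score = model(x)[0, label]       # class score, not the softmax output
    score.backward()
    return x.grad[0].abs().max(dim=0).values   # max |grad| over channels
```

The same gradients, thresholded, seed the weakly supervised segmentation the slide mentions.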

NLP (1)

- text prediction, character-level LM (LSTM)
- time-scales: the effect of a typo remains present in each layer (perturbation sketch below)
- short time-scales in bottom layers, longer memory in top ones
- long-term interactions: parentheses
- temporal hierarchy

Hermans, Michiel, and Benjamin Schrauwen. "Training and analysing deep recurrent neural networks." Advances in Neural Information Processing Systems. 2013.
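The typo experiment can be probed with a simple perturbation test: run the same text with and without a one-character change and measure how long the hidden states stay different. A sketch with an untrained stand-in 3-layer RNN (the paper's trained model shows layer-dependent decay; this traces the top layer only):

```python
import torch
import torch.nn as nn

embed = nn.Embedding(128, 32)
rnn = nn.RNN(32, 64, num_layers=3)       # untrained stand-in deep RNN

def hidden_trace(text):
    ids = torch.tensor([ord(c) % 128 for c in text]).unsqueeze(1)  # (T, 1)
    with torch.no_grad():
        out, _ = rnn(embed(ids))         # (T, 1, hidden), top layer only
    return out[:, 0]

clean = "the quick brown fox jumps"
typo  = "the quick brawn fox jumps"      # one flipped character
diff = (hidden_trace(clean) - hidden_trace(typo)).norm(dim=1)
print([round(v, 3) for v in diff.tolist()])   # perturbation size over time
```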

NLP (2)

- character-level LM, Linux Kernel dataset (RNN, LSTM, GRU)
- interpretable cells track parentheses, the start of URLs, quotes, depth of expressions (and other syntactic patterns); cell-trace sketch below
- many cells are not interpretable
- saturation vs. n-gram

Karpathy, Andrej, Justin Johnson, and Li Fei-Fei. "Visualizing and understanding recurrent networks." arXiv preprint arXiv:1506.02078 (2015).
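The interpretable-cell plots boil down to recording one cell's activation per character. A hedged sketch with an untrained stand-in character LSTM (cell index 42 is arbitrary; Karpathy et al. searched for cells like the quote- and parenthesis-tracking ones):

```python
import torch
import torch.nn as nn

text = "int main(void) { return 0; }"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}

embed = nn.Embedding(len(vocab), 16)
lstm = nn.LSTM(16, 128)                  # untrained stand-in model

h = c = torch.zeros(1, 1, 128)
trace = []
with torch.no_grad():
    for ch in text:
        x = embed(torch.tensor(stoi[ch])).view(1, 1, -1)
        _, (h, c) = lstm(x, (h, c))
        trace.append(torch.tanh(c[0, 0, 42]).item())   # one cell's state

for ch, v in zip(text, trace):           # crude text "heatmap"
    print(f"{ch!r}: {v:+.2f}")
```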

NLP (3)

- phrase classification (LSTM)
- negation, intensification

NLP (3)

- phrase classification (LSTM)
- important words found via gradient-based saliency (sketched after the citation below)
- concessive clauses

NLP (3)

- phrase classification (LSTM)
- negative asymmetry

Li, Jiwei, et al. "Visualizing and understanding neural models in NLP." arXiv preprint arXiv:1506.01066 (2015).
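Li et al.'s first-derivative word-importance scores can be sketched the same way as image saliency: the norm of the loss gradient with respect to each word embedding. Model, vocabulary, and word ids below are untrained stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

words = "not such a great movie".split()
embed = nn.Embedding(100, 32)            # toy vocabulary
lstm = nn.LSTM(32, 64, batch_first=True)
clf = nn.Linear(64, 2)                   # binary sentiment head

ids = torch.randint(0, 100, (1, len(words)))     # stand-in word ids
vecs = embed(ids).detach().requires_grad_(True)  # track embedding gradients
_, (h, _) = lstm(vecs)
loss = F.cross_entropy(clf(h[-1]), torch.tensor([0]))
loss.backward()

scores = vecs.grad[0].norm(dim=1)        # one importance score per word
for w, s in zip(words, scores.tolist()):
    print(f"{w:>6}: {s:.4f}")
```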

CV & NLP

- text prediction & image-vector generation (two GRUs)
- selective attention to lexical categories
- grammatical and semantic information
- hidden units of VISUAL are active for meaningful constructions

Kádár, Ákos, Grzegorz Chrupała, and Afra Alishahi. "Representation of linguistic form and function in recurrent neural networks." arXiv preprint arXiv:1602.08952 (2016).

Attention/Encoder-decoder

Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).

Hermann, Karl Moritz, et al. "Teaching machines to read and comprehend." Advances in Neural Information Processing Systems. 2015.

Attention/Encoder-decoder

Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention." arXiv preprint arXiv:1502.03044 2.3 (2015): 5.
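The alignment plots in these papers are just the attention-weight matrix rendered as a heatmap, target positions against source positions. A sketch with random Dirichlet weights standing in for a trained decoder's attention:

```python
import numpy as np
import matplotlib.pyplot as plt

src = ["the", "cat", "sat", "</s>"]
tgt = ["le", "chat", "s'est", "assis", "</s>"]
attn = np.random.dirichlet(np.ones(len(src)), size=len(tgt))  # rows sum to 1

fig, ax = plt.subplots()
ax.imshow(attn, cmap="gray_r")           # darker = more attention
ax.set_xticks(range(len(src)))
ax.set_xticklabels(src)
ax.set_yticks(range(len(tgt)))
ax.set_yticklabels(tgt)
ax.set_xlabel("source")
ax.set_ylabel("target")
fig.savefig("alignment.png")
```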

CTC

Graves, Alex, and Navdeep Jaitly. "Towards End-To-End Speech Recognition with Recurrent Neural Networks." ICML. Vol. 14. 2014.

Variations

- depth
- LSTM vs. GRU
- architecture
- LSTM components

Pascanu, Razvan, et al. "How to construct deep recurrent neural networks." arXiv preprint arXiv:1312.6026 (2013).

Chung, Junyoung, et al. "Empirical evaluation of gated recurrent neural networks on sequence modeling." arXiv preprint arXiv:1412.3555 (2014).

Jozefowicz, Rafal, Wojciech Zaremba, and Ilya Sutskever. "An empirical exploration of recurrent network architectures." ICML. 2015.

Greff, Klaus, et al. "LSTM: A search space odyssey." arXiv preprint arXiv:1503.04069 (2015).

Platforms

Abadi, Martín, et al. "TensorFlow: Large-scale machine learning on heterogeneous distributed systems." arXiv preprint arXiv:1603.04467 (2016).

Strobelt, Hendrik, et al. "Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks." arXiv preprint arXiv:1606.07461 (2016).

Measurement

- general RNN: unidirectional, bidirectional, homogeneous
- recurrent depth: complexity in the long term
- feedforward depth: complexity in the short term
- recurrent skip coefficient: the speed of information flow; "skipping" allows unimpeded information flow

Paths in the unfolded computation graph:

- red: the longest path with an infinite time gap; its growth rate (a limit over time) is the recurrent depth d_r
- yellow: the longest input-output path with the least time gap; combined with d_r it gives the feedforward depth
- blue: the shortest path with an infinite time gap; the reciprocal of its growth rate (a limit over time) is the recurrent skip coefficient
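In formulas (a paraphrase of the slide's description, with notation adapted from Zhang et al.; $\mathfrak{D}_i(n)$ is the longest and $\mathfrak{d}_i(n)$ the shortest path crossing a time gap of $n$ in the unfolded graph):

```latex
d_r = \lim_{n \to \infty} \frac{\mathfrak{D}_i(n)}{n}
      \quad \text{(recurrent depth: growth rate of the red path)}

s   = \lim_{n \to \infty} \frac{n}{\mathfrak{d}_i(n)}
      \quad \text{(recurrent skip coefficient: reciprocal growth rate of the blue path)}
```

The feedforward depth is then obtained from the yellow path together with $d_r$.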

Measurement

(recurrent depth, feedforward depth) for the example architectures:

- sh: (1, 3)
- st: (1, 3)
- bu: (1, 3)
- td: (2, 3)

- feedforward depth: might not help on long-term dependency tasks
- recurrent depth: might yield performance improvements
- recurrent skip coefficient: largely improves performance on long-term dependency tasks

Zhang, Saizheng, et al. "Architectural Complexity Measures of Recurrent Neural Networks." arXiv preprint arXiv:1602.08210 (2016).

Rediscovery

Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "A neural algorithm of artistic style." arXiv preprint arXiv:1508.06576 (2015).

Gatys, Leon, Alexander S. Ecker, and Matthias Bethge. "Texture synthesis using convolutional neural networks." Advances in Neural Information Processing Systems. 2015.

Ulyanov, Dmitry, et al. "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images." arXiv preprint arXiv:1603.03417 (2016).

How to Do This with ASR

- acoustic knowledge: (vowel, consonant), (Initial, Final), …
- linguistic knowledge: syntactic info, …
- model attributes: long-term range, short-term range, …
- using a discriminative model for generation
- waveform / feature transform
- discovering new knowledge

DISCUSSION

Thanks a lot.