
Visual Attention With Neural Networks

Main Paper: Recurrent Models of Visual Attention

Presentation by Matthew Shepherd

Mnih, V., Heess, N., Graves, A., & Kavukcuoglu, K. (2014). Recurrent Models of Visual Attention. Advances in Neural Information Processing Systems, 2204–2212.

Full image processing is computationally expensive

Regions can be selected intelligently, but processing time still scales with the size of the image

Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 580–587. http://doi.org/10.1109/CVPR.2014.81

Humans focus on specific regions in their FOV

https://www.youtube.com/watch?v=vJG698U2Mvo

“Vision as a sequential decision task”

The model sequentially chooses small windows of the input data to observe

Integrates information from all past windows to make its next decision.

The sensor only provides limited information about the scene, Xt, focused at a location, lt-1

Referred to as a “Glimpse”

A retina-like representation: highest resolution around the chosen location, progressively coarser resolution further away
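A minimal sketch of such a retina-like sensor, assuming a 2-D NumPy image and integer pixel coordinates; the patch sizes and number of scales are illustrative, not the paper's exact settings.

```python
import numpy as np

def glimpse_sensor(image, loc, base_size=8, n_scales=3):
    """Extract n_scales square patches centred at loc (row, col),
    each twice as wide as the previous, and reduce them all to
    base_size x base_size, so detail falls off away from loc."""
    patches = []
    r, c = loc
    for s in range(n_scales):
        size = base_size * (2 ** s)
        half = size // 2
        # Pad so patches near the image border stay in bounds.
        padded = np.pad(image, half, mode="constant")
        patch = padded[r:r + size, c:c + size]
        # Crude down-sampling by striding; a real implementation
        # would use proper interpolation.
        patches.append(patch[::2 ** s, ::2 ** s])
    return np.stack(patches)  # shape: (n_scales, base_size, base_size)
```

With three 8×8 scales the glimpse covers an 8, 16, and 32 pixel neighbourhood of the location while always returning the same small amount of data.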

The “glimpse network” incorporates sensor and location information into a single vector
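One way this could look in PyTorch-style code; the class name, layer sizes, and activations are assumptions for illustration, not the paper's implementation. The "what" (retina patches) and "where" (location) pathways are embedded separately and then fused into a single glimpse vector gt.

```python
import torch
import torch.nn as nn

class GlimpseNetwork(nn.Module):
    """f_g(theta_g): fuse what was seen (retina patches) with
    where it was seen (location) into one glimpse vector g_t."""
    def __init__(self, patch_dim, hidden=128, out=256):
        super().__init__()
        self.what = nn.Linear(patch_dim, hidden)   # embed flattened patches
        self.where = nn.Linear(2, hidden)          # embed (x, y) location
        self.fuse = nn.Linear(hidden * 2, out)

    def forward(self, patches, loc):
        h_what = torch.relu(self.what(patches.flatten(1)))
        h_where = torch.relu(self.where(loc))
        return torch.relu(self.fuse(torch.cat([h_what, h_where], dim=1)))
```

For three 8×8 scales from the sensor above, patch_dim would be 3 * 8 * 8 = 192.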

The core network fh(θh) incorporates the sensor information as well as past information

The output of the core network is then used to deploy the sensor and make a classification

The network is recurrent
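A sketch of one recurrent step under the same assumptions (hidden size, class count, and the use of an RNN cell are illustrative): the core state ht absorbs the new glimpse, a location head fl proposes where to look next, and an action head fa scores the classes.

```python
import torch
import torch.nn as nn

class CoreNetwork(nn.Module):
    """f_h(theta_h): recurrent state update, plus the two output heads."""
    def __init__(self, glimpse_dim=256, hidden=256, n_classes=10):
        super().__init__()
        self.rnn = nn.RNNCell(glimpse_dim, hidden)
        self.locator = nn.Linear(hidden, 2)              # f_l: mean of next location
        self.classifier = nn.Linear(hidden, n_classes)   # f_a: class scores

    def step(self, g_t, h_prev):
        h_t = self.rnn(g_t, h_prev)                # fold new glimpse into state
        loc_mean = torch.tanh(self.locator(h_t))   # keep location in [-1, 1]
        logits = self.classifier(h_t)
        return h_t, loc_mean, logits
```

In the paper's setup the next location is sampled from a Gaussian centred at this mean, and the classification is only read off at the final step.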

Partially Observable Markov Decision Process (POMDP)

- Glimpse can be seen as a partial view of the state
- Network must learn a policy

π((lt, at)|s1:t; θ)

- The policy is determined by the NN (the objective it maximizes is sketched after this list)

- State history is encapsulated by the hidden state of the network
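In this notation, the quantity the policy is trained to maximize is the expected sum of rewards; in the classification setting the reward is 1 for a correct final decision and 0 otherwise.

```latex
J(\theta) \;=\; \mathbb{E}_{p(s_{1:T};\,\theta)}\!\left[\sum_{t=1}^{T} r_t\right] \;=\; \mathbb{E}\!\left[R\right]
```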

So how do we train it?

The REINFORCE rule
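The REINFORCE gradient estimate, in the paper's notation, replaces the intractable exact gradient with M Monte Carlo episodes sampled from the current policy; ut stands for the combined action (lt, at).

```latex
\nabla_{\theta} J \;\approx\; \frac{1}{M} \sum_{i=1}^{M} \sum_{t=1}^{T}
  \nabla_{\theta} \log \pi\!\left(u^{\,i}_{t} \mid s^{\,i}_{1:t};\, \theta\right) R^{\,i}
```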

Additional training details

REINFORCE is useful for training fl, but fa can be trained more directly by minimizing cross-entropy loss.

A baseline value, bt, is subtracted from the reward in the gradient approximation to reduce variance
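A minimal sketch of how these pieces might combine into one loss, assuming the modules sketched earlier, a single learned baseline value per episode, and a 0/1 classification reward; the function and argument names are illustrative.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(logits, labels, log_pi_locs, baselines):
    """logits: final class scores; log_pi_locs: sum over time of the
    log-probabilities of the sampled locations; baselines: learned b_t."""
    reward = (logits.argmax(dim=1) == labels).float()      # R in {0, 1}
    advantage = (reward - baselines).detach()               # (R - b_t), treated as a constant
    reinforce_loss = -(log_pi_locs * advantage).mean()      # REINFORCE term for f_l
    class_loss = F.cross_entropy(logits, labels)            # supervised term for f_a
    baseline_loss = F.mse_loss(baselines, reward)           # fit b_t to the expected reward
    return class_loss + reinforce_loss + baseline_loss
```

Only the location policy is trained with REINFORCE; the classifier and the baseline are trained with ordinary supervised losses.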

Experiments

RAM performs well on translated MNIST

RAM performs very well on cluttered translated images

Meaningful policies are learned

RAM also performs well in a dynamic environment (a simple game)

Other models of attention

Graves, A. (2014). Generating Sequences with Recurrent Neural Networks, 1–43.

DRAW: A Recurrent Neural Network For Image Generation

• Combines an attention mechanism with a sequential variational auto-encoder

• Reading and writing are now both sequential tasks

Gregor, K., Danihelka, I., Graves, A., Rezende, D. J., & Wierstra, D. (2015). DRAW: A Recurrent Neural Network For Image Generation, 1–10.
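For reference, the per-step recurrences of DRAW, paraphrasing the paper's notation: read and write are the attention operations, ct is the canvas that accumulates the image over T steps, and zt is the latent sampled at each step of the sequential variational auto-encoder.

```latex
\begin{aligned}
\hat{x}_t &= x - \sigma(c_{t-1}) \\
r_t &= \mathrm{read}\!\left(x, \hat{x}_t, h^{\mathrm{dec}}_{t-1}\right) \\
h^{\mathrm{enc}}_t &= \mathrm{RNN}^{\mathrm{enc}}\!\left(h^{\mathrm{enc}}_{t-1}, \left[r_t, h^{\mathrm{dec}}_{t-1}\right]\right) \\
z_t &\sim Q\!\left(Z_t \mid h^{\mathrm{enc}}_t\right) \\
h^{\mathrm{dec}}_t &= \mathrm{RNN}^{\mathrm{dec}}\!\left(h^{\mathrm{dec}}_{t-1}, z_t\right) \\
c_t &= c_{t-1} + \mathrm{write}\!\left(h^{\mathrm{dec}}_t\right)
\end{aligned}
```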

Differentiable RAM

Differentiable RAM performance

DRAW-ing with attention

https://www.youtube.com/watch?v=Zt-7MI9eKEo