Image description through fusion based recurrent multi modal learning
IMAGE DESCRIPTION THROUGH FUSION BASED RECURRENT MULTI MODAL LEARNING
Ram Manohar Oruganti1, Shagan Sah2, Suhas Pillai3 and Raymond Ptucha1
ABSTRACT
Index Terms
1. INTRODUCTION
Fig. 1.
2. BACKGROUND
2.1 Convolutional Neural Networks
2.2 Long Short Term Memory Networks
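Notation: input sequence <x_1, x_2, ..., x_{t-1}, x_t, ..., x_T>; input gate i_t, forget gate f_t, output gate o_t; input modulation term g_t; memory cell c_t; hidden state h_t; weights W and biases b.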
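The symbols that survive in this subsection (the inputs x_t, the gates i_t, f_t and o_t, the modulation term g_t, the cell state c_t, the hidden state h_t, and the parameters W and b) match the widely used LSTM formulation. As a sketch only, assuming the paper follows that standard form, the update at time step t can be written as below. The subscripted parameter names such as W_{xi}, W_{hi} and b_i are illustrative and are not taken from the source.

```latex
\begin{align}
i_t &= \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + b_o\right) \\
g_t &= \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{align}
```

Here \sigma denotes the logistic sigmoid and \odot denotes element-wise multiplication. The forget gate f_t scales the previous cell state c_{t-1}, while the input gate i_t scales the new candidate g_t.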
3. PROPOSED LEARNING MODEL
3.1 FRMM model
Fig. 2.
3.2 FRMM variations
3.3 Image description through FRMMs
image stage, language stage, fusion stage
4. EXPERIMENTAL RESULTS
4.1 Datasets
4.2 Training details
Caffe
4.3 Results
Model B-1 B-2 B-3 B-4
AFRMM 70.2 52.8 38.3 27.6
Table I.
CNN layer B-1 B-2 B-3 B-4
AFRMM+fc8 70.2 52.8 38.3 27.6
Table II.
Model B-1 B-2 B-3 B-4 METEOR
40.4
Our model 70.2 52.8 27.6 22.5
Table III.
Model B-1 B-2 B-3 B-4 METEOR
Vinyals [13] 66.3 42.3 27.7 18.3
Table IV.
5. CONCLUSION
6. REFERENCES
, et al. arXiv preprint arXiv:1409.0575,
26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012, December 3, 2012 - December 6, 2012
Proceedings of the IEEE,
27th Annual Conference on Neural Information Processing Systems, NIPS 2013
Neural Computation,
ICASSP 2013
Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition
, et al.
Computer Vision and Pattern Recognition
arXiv preprint arXiv:1505.00487,
, et al. Proceedings of the IEEE International Conference on Computer Vision
, et al. arXiv preprint arXiv:1502.03044,
arXiv preprint arXiv:1411.4555,
21st Annual Conference on Neural Information Processing Systems, NIPS 2007
Advances in neural information processing systems
arXiv preprint arXiv:1410.4615,
Computer Vision and Pattern Recognition
arXiv preprint arXiv:1412.4729,
arXiv preprint arXiv:1412.6632,
Transactions of the Association for Computational Linguistics,
, et al. Computer Vision - ECCV 2014
ICLR
Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
In Proceedings of the Ninth Workshop on Statistical Machine Translation
, et al.
arXiv preprint arXiv:1411.4389,
, et al. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
arXiv preprint arXiv:1410.1090,