![Page 1: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/1.jpg)
CAP 6412 Advanced Computer Vision
http://www.cs.ucf.edu/~bgong/CAP6412.html
Boqing Gong
Feb 02, 2016
![Page 2: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/2.jpg)
Today
• Administrivia
• R-CNN Review & Project I
• Image Captioning, by Harish
• Neural networks & Backpropagation (Part V)
![Page 3: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/3.jpg)
Past due (02/02 Tuesday, 12pm)
• Assignment 3: Review the following paper
{Major} Karpathy, Andrej, and Li Fei-Fei. "Deep visual-semantic alignments for generating image descriptions." arXiv preprint arXiv:1412.2306 (2014).
Template for paper review: http://www.cs.ucf.edu/~bgong/CAP6412/Review.docx
![Page 4: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/4.jpg)
Upcoming due (02/04 Thursday, 12pm)
• Assignment 4: Review the following paper
{Major} Xu, Kelvin, Jimmy Ba, Ryan Kiros, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, and Yoshua Bengio. “Show, attend and tell: Neural image caption generation with visual attention.” arXiv preprint arXiv:1502.03044 (2015).
Template for paper review: http://www.cs.ucf.edu/~bgong/CAP6412/Review.docx
![Page 5: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/5.jpg)
Next week
Week 2: CNN visualization & object recognition
Week 3: CNN & object localization
Week 4: CNN & transfer learning
Week 5: CNN & segmentation, super-resolution
Week 6: CNN & videos (optical flow, pose)
Week 7: Image captioning & attention model
Week 8: Visual question answering
Week 9: Attention model, aligning books with movies
Weeks 10-16: Video (tracking, action, surveillance); Human-centered CV; 3D CV; Low-level CV, etc.
![Page 6: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/6.jpg)
Next week: CNN & segmentation and super-resolution
Tuesday (02/09)
Jose Sanchez
[Super-resolution] Dong, Chao, Chen Change Loy, Kaiming He, and Xiaoou Tang. “Learning a deep convolutional network for image super-resolution.” In Computer Vision–ECCV 2014, pp. 184-199. Springer International Publishing, 2014. (Extended version on arXiv) & Secondary papers
Thursday (02/11)
Goran Igic
[Edge detection] Xie, Saining, and Zhuowen Tu. “Holistically-Nested Edge Detection.” In Proceedings of the IEEE International Conference on Computer Vision, 2015. & Secondary papers
![Page 7: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/7.jpg)
Today
• Administrivia
• R-CNN Review & Project I
• Image Captioning, by Harish
• Neural networks & Backpropagation (Part V)
![Page 8: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/8.jpg)
Slide credit: Ross Girshick
![Page 9: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/9.jpg)
Project I: R-CNN at test time
• INPUT: an image
• 1. Extract detection proposals (cf. Samer’s presentation on 01/26)
• 2. Warp proposals to 227-by-227
• 3. Extract CNN features for each proposal (region) by Caffe
• For class c = 1, 2, …, 20
– 4. Output a detection score for each proposal by SVM(proposal, class c)
– 5. Non-maximum suppression using the scores of class c
– 6. Regression for the surviving proposals
• OUTPUT: bounding boxes, each with a class label & a detection score
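Step 5 of the pipeline above (non-maximum suppression) can be sketched in a few lines. This is a minimal greedy version in plain Python, not the code from the R-CNN repository; the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the overlap threshold of 0.3 are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS for one class: visit boxes from highest to lowest score and
    keep a box only if it does not overlap an already kept box too much.
    The threshold value is an assumption, not a number from the slides."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical proposals for a single class c.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In the per-class loop, `nms` would be called once for each class c with that class's SVM scores, and only the kept indices would go on to bounding-box regression.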
![Page 10: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/10.jpg)
Project I: R-CNN at training time (bonus)
• INPUT: an image
• 1. Extract detection proposals (10 pts)
• 2. Warp proposals to 227-by-227
• 3. Extract CNN features for each proposal (region) by Caffe (30 pts)
• For class c = 1, 2, …, 20
– 4. Output a detection score for each proposal by SVM(proposal, class c) (10 pts)
– 5. Non-maximum suppression using the scores of class c
– 6. Regression for the surviving proposals (10 pts)
• OUTPUT: bounding boxes, each with a class label & a detection score
![Page 11: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/11.jpg)
Project I: Grading criteria
• Total: 100 points + 60 bonus points + x points to promote innovation
• Quantitative results (65 pts)
– Detection average precision on VOC 2012 validation (40 pts)
– Detection average precision on VOC 2012 validation before regression (10 pts)
– Detection average precision on VOC 2012 validation with 1000 proposals (15 pts)
• Qualitative results (35 pts)
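Since most of the quantitative grade rests on detection average precision, here is a rough sketch of how AP for one class can be computed from ranked detections. It follows the area-under-the-precision-recall-curve style used by recent VOC challenges as far as I understand it, and it assumes the detections have already been matched to ground truth (each flagged true or false positive, typically by an IoU of at least 0.5). The function name and argument layout are illustrative; the official VOC development kit remains the reference for graded numbers.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """AP for one class from per-detection scores and TP/FP flags."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Make precision monotonically non-increasing in recall (the "envelope").
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Hypothetical example: four detections, two ground-truth objects.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, False], 2)
```

Running the same function on the scores before and after bounding-box regression, or with a capped number of proposals, gives the three quantities listed in the grading criteria.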
![Page 12: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/12.jpg)
Project I: Resources
• Technical report at http://arxiv.org/abs/1311.2524
• Ross’s GitHub repository: https://github.com/rbgirshick/rcnn
![Page 13: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/13.jpg)
Project I: Objective
• Get familiar with the state-of-the-art object detection pipeline
• Learn about PASCAL VOC
• Know how to benchmark different algorithms
– Benchmark datasets
– Task specification
– Evaluation procedure and metrics
• Benefit future research / R&D
![Page 14: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/14.jpg)
Today
• Administrivia
• R-CNN Review & Project I
• Image Captioning, by Harish
• Neural networks & Backpropagation (Part V)
![Page 15: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/15.jpg)
Upload slides after class
• See “Paper Presentation” on UCF webcourse
• Sharing your slides
– Cite the original sources of images, figures, etc. in your slides
– Convert them to a PDF file
– Upload the PDF file to “Paper Presentation” after your presentation
![Page 16: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/16.jpg)
Deep Visual-Semantic Alignments for Generating Image Descriptions
Andrej Karpathy & Li Fei-Fei
Stanford University
Presented by Harish
![Page 17: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/17.jpg)
Motivation
• Humans can do it!
• “Build a bridge between natural language & images” – Karpathy
![Page 18: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/18.jpg)
Problem Statement
• Generate Dense Image Descriptions
• Build a better correspondence between images and their sentence descriptions
Figures from http://bit.ly/rankingdemo
![Page 19: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/19.jpg)
Main Contributions
Slide credit : Karpathy
![Page 20: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/20.jpg)
Approach Outline
• Alignment Inference Model
– R-CNN
– BRNN (Bidirectional Recurrent Neural Network)
– MRF
• Multimodal RNN
![Page 21: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/21.jpg)
R-CNN Stage
• Use the whole image + the top 19 detected locations (20 regions in total) from R-CNN
• CNN pre-trained on ImageNet & fine-tuned
– $I_b$: pixels inside the bounding box
– $\mathrm{CNN}_{\theta_c}(I_b)$: FC7 output
– $b_m$: bias (to be learned)
– $W_m$: weight matrix (to be learned)
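The equation these symbols belong to was a figure on the slide and did not survive extraction. As I recall it from the Karpathy and Fei-Fei paper, each region is embedded by a single affine map of its FC7 activations; the symbol $v$ for the region embedding and $h$ for the embedding dimension are taken from the paper, not from the slide text:

$$
v = W_m \left[ \mathrm{CNN}_{\theta_c}(I_b) \right] + b_m, \qquad v \in \mathbb{R}^{h}, \quad W_m \in \mathbb{R}^{h \times 4096}
$$

With the whole image plus 19 detections, this gives 20 vectors $v_1, \dots, v_{20}$ per image.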
![Page 22: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/22.jpg)
BRNN
Figure from M. Schuster and K. K. Paliwal. Bidirectional recurrent neural
![Page 23: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/23.jpg)
BRNN Training
Figure from M. Schuster and K. K. Paliwal. Bidirectional recurrent neural
![Page 24: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/24.jpg)
• BRNN input – sequence of N words
• BRNN output – N h-dimensional vectors
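To make the input/output shapes concrete, here is a small NumPy sketch of a bidirectional RNN that maps N word indices to N h-dimensional vectors. The update equations follow my reading of the Karpathy and Fei-Fei paper (ReLU activations, forward and backward states summed before the output layer); the parameter names, sizes, and random weights are placeholders rather than anything given on the slides.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def init_params(vocab_size, embed_dim, hidden_dim, seed=0):
    """Random placeholder weights; a real model would learn these."""
    rng = np.random.default_rng(seed)

    def mat(rows, cols):
        return 0.1 * rng.standard_normal((rows, cols))

    return {
        "W_w": mat(embed_dim, vocab_size),  # word embedding matrix
        "W_e": mat(hidden_dim, embed_dim), "b_e": np.zeros(hidden_dim),
        "W_f": mat(hidden_dim, hidden_dim), "b_f": np.zeros(hidden_dim),
        "W_b": mat(hidden_dim, hidden_dim), "b_b": np.zeros(hidden_dim),
        "W_d": mat(hidden_dim, hidden_dim), "b_d": np.zeros(hidden_dim),
    }


def brnn_forward(word_ids, p):
    """Return an (N, h) array: one h-dimensional vector per input word."""
    x = p["W_w"][:, word_ids].T              # (N, embed_dim) embedding lookup
    e = relu(x @ p["W_e"].T + p["b_e"])      # (N, h) per-word hidden input
    n_words, hidden_dim = e.shape
    h_fwd = np.zeros((n_words, hidden_dim))
    h_bwd = np.zeros((n_words, hidden_dim))

    state = np.zeros(hidden_dim)
    for t in range(n_words):                 # left-to-right pass
        state = relu(e[t] + p["W_f"] @ state + p["b_f"])
        h_fwd[t] = state

    state = np.zeros(hidden_dim)
    for t in reversed(range(n_words)):       # right-to-left pass
        state = relu(e[t] + p["W_b"] @ state + p["b_b"])
        h_bwd[t] = state

    # Each output sees context from both directions.
    return relu((h_fwd + h_bwd) @ p["W_d"].T + p["b_d"])


params = init_params(vocab_size=10, embed_dim=6, hidden_dim=4)
word_vectors = brnn_forward([3, 1, 4, 1, 5], params)  # N = 5 words
```

Here `word_vectors` has shape (5, 4): N = 5 words in, N vectors of dimension h = 4 out.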
![Page 25: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/25.jpg)
Inferring Word Alignments
Slide credit : Karpathy
![Page 26: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/26.jpg)
MRF (Markov Random Field)
• Purpose – Smoothing
• Encourage nearby words to point to the same region
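The smoothing idea can be sketched as a chain-structured dynamic program. This is an illustrative Viterbi-style version, not the authors' code: `sim[j][t]` is assumed to hold the similarity between word j and region t, and `beta` is an assumed bonus added whenever two neighbouring words pick the same region. The function maximizes total similarity plus bonuses.

```python
def align_words(sim, beta):
    """Pick one region per word, rewarding agreement between neighbours.

    sim[j][t]: similarity of word j and region t (hypothetical input).
    beta: bonus when words j and j+1 are assigned the same region.
    """
    n_regions = len(sim[0])
    best = list(sim[0])   # best[t]: best score of a path ending in region t
    back = []             # back-pointers, one list per word after the first
    for j in range(1, len(sim)):
        new_best, pointers = [], []
        for t in range(n_regions):
            # Best previous region, counting the bonus if it equals t.
            prev = max(range(n_regions),
                       key=lambda p: best[p] + (beta if p == t else 0.0))
            bonus = beta if prev == t else 0.0
            new_best.append(best[prev] + bonus + sim[j][t])
            pointers.append(prev)
        best = new_best
        back.append(pointers)

    # Trace the best path backwards from the best final region.
    path = [max(range(n_regions), key=lambda t: best[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Three words, two regions; the middle word weakly prefers region 1.
sim = [[1.0, 0.0], [0.4, 0.5], [1.0, 0.0]]
unsmoothed = align_words(sim, beta=0.0)
smoothed = align_words(sim, beta=1.0)
```

With `beta = 0` every word independently takes its best region, so the middle word jumps to region 1; with a larger `beta` the whole phrase is assigned to region 0, which is the contiguous-phrase behaviour the slide describes.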
![Page 27: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/27.jpg)
Simple RNN
w(t) – one-hot representation of the current word
$f_1(\cdot)$ – sigmoid function
$g_1(\cdot)$ – softmax function
Figure from Mao et al.: Explain Images with Multimodal Recurrent Neural Networks
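One step of this simple RNN can be written out as follows. The recurrence used here, x(t) = [w(t), r(t-1)], r(t) = f1(U x(t)), y(t) = g1(V r(t)), is how I remember it from the Mao et al. paper; the matrix names `U` and `V`, the sizes, and the random weights are assumptions for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    z = z - np.max(z)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def rnn_step(word_id, r_prev, U, V):
    """One time step: returns the new hidden state and next-word distribution."""
    vocab_size = V.shape[0]
    w = np.zeros(vocab_size)
    w[word_id] = 1.0                       # w(t): one-hot current word
    x = np.concatenate([w, r_prev])        # x(t) = [w(t), r(t-1)]
    r = sigmoid(U @ x)                     # r(t) = f1(U x(t))
    y = softmax(V @ r)                     # y(t) = g1(V r(t))
    return r, y


# Placeholder sizes and weights.
vocab_size, hidden_dim = 5, 3
rng = np.random.default_rng(0)
U = rng.standard_normal((hidden_dim, vocab_size + hidden_dim))
V = rng.standard_normal((vocab_size, hidden_dim))

r = np.zeros(hidden_dim)
for word_id in [2, 0, 4]:                  # a hypothetical three-word prefix
    r, y = rnn_step(word_id, r, U, V)
```

After the loop, `y` is a probability distribution over the vocabulary for the word that follows the prefix.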
![Page 28: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/28.jpg)
Multimodal RNN
![Page 29: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/29.jpg)
Experiments
• Datasets
– Flickr8K
– Flickr30K
– MSCOCO
• Preprocessing
– Convert to lowercase
– Eliminate OoV (Out of Vocabulary)
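The two preprocessing steps can be sketched like this. The vocabulary rule (keep words seen at least `min_count` times in the training captions) and the default of 5 are assumptions for illustration; the slide only says that out-of-vocabulary words are eliminated.

```python
from collections import Counter


def build_vocab(captions, min_count=5):
    """Vocabulary = lowercase words occurring at least min_count times."""
    counts = Counter(word for c in captions for word in c.lower().split())
    return {word for word, n in counts.items() if n >= min_count}


def preprocess(caption, vocab):
    """Lowercase a caption and drop out-of-vocabulary words."""
    return [word for word in caption.lower().split() if word in vocab]


# Tiny hypothetical corpus; min_count is lowered so the example is visible.
captions = ["A dog runs", "a DOG sleeps"]
vocab = build_vocab(captions, min_count=2)
tokens = preprocess("A dog runs", vocab)
```

Here `runs` and `sleeps` each occur once, so only `a` and `dog` survive the filter.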
![Page 30: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/30.jpg)
Generated Descriptions – Full Frame
Figures from http://bit.ly/neuraltalkdemo
![Page 31: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/31.jpg)
Figures from http://bit.ly/neuraltalkdemo
![Page 32: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/32.jpg)
Generated Descriptions – Region
![Page 33: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/33.jpg)
![Page 34: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/34.jpg)
![Page 35: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/35.jpg)
Related Work
Junhua Mao (1,2), Wei Xu (1), Yi Yang (1), Jiang Wang (1), Alan L. Yuille (2)
(1) Baidu Research
(2) University of California, Los Angeles
Explain Images With Multimodal Recurrent Neural Networks
![Page 36: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/36.jpg)
• Goal: Generate novel sentence descriptions to explain the contents of images
Figure from Mao et al.: Explain Images with Multimodal Recurrent Neural Networks
![Page 37: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/37.jpg)
• Tasks
– Sentence generation
– Sentence retrieval
– Image retrieval
![Page 38: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/38.jpg)
![Page 39: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/39.jpg)
Oriol Vinyals, Alexander Toshev, Samy Bengio & Dumitru Erhan
Show and Tell: A Neural Image Caption Generator
![Page 40: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/40.jpg)
• Goal: Generate novel sentence descriptions to explain the contents of images
Figures from Vinyals et al.: Show and Tell
![Page 41: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/41.jpg)
![Page 42: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/42.jpg)
Xinlei Chen (1), C. Lawrence Zitnick (2)
(1) Carnegie Mellon University
(2) Microsoft Research
Mind’s Eye: A Recurrent Visual Representation for Image Caption Generation
![Page 43: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/43.jpg)
• Goal: Generate novel captions, and reconstruct image features given an image description
![Page 44: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/44.jpg)
![Page 45: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/45.jpg)
Comparative Results
![Page 46: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/46.jpg)
![Page 47: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/47.jpg)
Conclusion
• Region based dense descriptions
• Multimodal RNN
• Novel model to infer alignments
![Page 48: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/48.jpg)
Future Directions
• Use LSTM in the m-RNN model
• Try different CNNs – VGGNet, GoogLeNet
• Change the RNN hidden-layer activation from sigmoid to ReLU
• Add the Mind’s Eye paper’s approach – will it work?
![Page 49: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/49.jpg)
Some Useful Videos
• Recurrent Neural Networks and LSTM: https://www.youtube.com/watch?v=56TYLaQN4N8
• Automated Image Captioning with ConvNets and Recurrent Nets: https://www.youtube.com/watch?v=xKt21ucdBY0
![Page 50: CAP 6412 Advanced Computer Vision - UCF Computer Sciencebgong/CAP6412/lec7.pdf · Week 3 CNN & object localization Week 4 CNN& transfer learning Week 5 CNN& segmentation, super-resolution](https://reader034.fdocuments.in/reader034/viewer/2022042709/5f51eb6b56c4e811342a94f3/html5/thumbnails/50.jpg)