Computer Vision Introduction One Lecture ( Short )
![Page 1: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/1.jpg)
Computer vision, in one lecture
Bill Freeman, Electrical Engineering and Computer Science Dept.
Massachusetts Institute of Technology, April 21, 2010
![Page 2: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/2.jpg)
The Taiyuan University of Technology Computer Center staff, and me (1987)
![Page 3: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/3.jpg)
Me and my wife, riding from the Foreigners’ Cafeteria
![Page 4: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/4.jpg)
Inside the computer center, with the image processing equipment
![Page 5: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/5.jpg)
While in China, I read this book (to be re-issued by MIT Press this year), and got very excited about computer vision. Studied for PhD at MIT.
![Page 6: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/6.jpg)
Goal of computer vision
Marr: “To tell what is where by looking”.
Want to:
– Estimate the shapes and properties of things.
– Recognize objects
– Find and recognize people
– Find road lanes and other cars
– Help a robot walk, navigate, or fly.
– Inspect for manufacturing
![Page 7: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/7.jpg)
Some particular goals of computer vision
• Wave a camera around, get a 3-d model out.
• Capture body pose of actor dancing.
• Detect and recognize faces.
• Recognize objects.
• Track people or objects
![Page 8: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/8.jpg)
Let’s go back in time, to the mid-1980’s
![Page 9: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/9.jpg)
What everyone looked like back then
![Page 10: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/10.jpg)
Features
• Points
but also,
• Lines
• Conics
• Other fitted curves
![Page 11: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/11.jpg)
Features
“blocks world”
A toy world in which to study image interpretation. All we have to do is to convert real world images to their blocks world equivalents and we’re all set.
Yvan Leclerc and Martin Fischler, An optimization-based approach to the interpretation of single line drawings as 3-d wire frames.
Objects
![Page 12: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/12.jpg)
Huttenlocher and Ullman, Object recognition using alignment, ICCV, 1986
Computer vision research results, 1986
![Page 13: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/13.jpg)
From Rothwell et al, Efficient model library access by projectively invariant indexing functions, CVPR 1992.
6 years later: Recognizing planar objects using invariants.
Input image. Edge points fitted with lines or conics.
Objects that have been recognized and verified.
Computer vision research results, 1992
![Page 14: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/14.jpg)
Back to the present…
![Page 15: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/15.jpg)
Companies and applications
• Cognex
• Poseidon
• Mobileye
• Eyetoy
• Identix
• Google
• Microsoft
• Face recognition in cameras
![Page 16: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/16.jpg)
![Page 17: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/17.jpg)
![Page 18: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/18.jpg)
![Page 19: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/19.jpg)
![Page 20: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/20.jpg)
MobilEye
![Page 21: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/21.jpg)
![Page 22: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/22.jpg)
![Page 23: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/23.jpg)
![Page 24: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/24.jpg)
![Page 25: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/25.jpg)
Microsoft
![Page 26: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/26.jpg)
Microsoft
![Page 27: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/27.jpg)
Some particular goals of computer vision (status report)
• Wave a camera around, get a 3-d model out (almost)
• Capture body pose of actor dancing. Using multiple cameras (pretty well), using a single camera (not yet)
• Detect and recognize faces. (frontal, yes)
• Recognize objects. (working on it, lots of progress)
• Track people or objects (over short times)
![Page 28: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/28.jpg)
What has allowed us to make progress?
• SIFT features
• Discriminative classifiers
• Bayesian methods
• Large databases
![Page 29: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/29.jpg)
What has allowed us to make progress?
• SIFT features
• Discriminative classifiers
• Bayesian methods
• Large databases
![Page 30: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/30.jpg)
Building a Panorama
M. Brown and D. G. Lowe. Recognising Panoramas. ICCV 2003
![Page 31: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/31.jpg)
How do we build a panorama?
• We need to match (align) images
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/InvariantFeatures.ppt
![Page 32: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/32.jpg)
Matching with Features
• Detect feature points in both images
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/InvariantFeatures.ppt
![Page 33: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/33.jpg)
Matching with Features
• Detect feature points in both images
• Find corresponding pairs
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/InvariantFeatures.ppt
![Page 34: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/34.jpg)
Matching with Features
• Detect feature points in both images
• Find corresponding pairs
• Use these pairs to align images - we know this
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/InvariantFeatures.ppt
![Page 35: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/35.jpg)
Matching with Features
• Problem 1:
  – Detect the same point independently in both images
Counter-example: no chance to match!
We need a repeatable detector
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/InvariantFeatures.ppt
![Page 36: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/36.jpg)
Matching with Features
• Problem 2:
  – For each point correctly recognize the corresponding one
We need a reliable and distinctive descriptor
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/InvariantFeatures.ppt
![Page 37: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/37.jpg)
Overview of feature detection for (instance) object recognition
[Figure labels: descriptor; detector location]
Note: here viewpoint is different, not panorama (they show off)
• Detector: detect same scene points independently in both images
• Descriptor: encode local neighboring window
  – Note how scale & rotation of window are the same in both images (but computed independently)
• Correspondence: find most similar descriptor in other image
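The correspondence step above can be sketched in a few lines. This is only an illustration, assuming descriptors are plain lists of floats and that "most similar" means smallest Euclidean distance; the function names and the toy 3-dimensional descriptors are hypothetical, not taken from the lecture.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(desc_a, desc_b):
    """For each descriptor in image A, return (index in A, index of the
    most similar descriptor in image B)."""
    matches = []
    for i, d in enumerate(desc_a):
        j = min(range(len(desc_b)), key=lambda k: euclidean(d, desc_b[k]))
        matches.append((i, j))
    return matches

# Toy 3-dimensional descriptors (hypothetical values for illustration).
image_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
image_b = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
matches = match_descriptors(image_a, image_b)
```

A brute-force search like this is quadratic in the number of features; it is meant to show the idea, not to be fast.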
![Page 38: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/38.jpg)
CVPR 2003 Tutorial
Recognition and Matching Based on Local Invariant Features
David Lowe, Computer Science Department
University of British Columbia
http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
![Page 39: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/39.jpg)
Invariant Local Features
• Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters
SIFT Features
![Page 40: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/40.jpg)
Freeman et al, 1998 http://people.csail.mit.edu/billf/papers/cga1.pdf
![Page 41: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/41.jpg)
Advantages of invariant local features
• Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
• Distinctiveness: individual features can be matched to a large database of objects
• Quantity: many features can be generated for even small objects
• Efficiency: close to real-time performance
• Extensibility: can easily be extended to wide range of differing feature types, with each adding robustness
![Page 42: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/42.jpg)
SIFT vector formation
• Computed on rotated and scaled version of window according to computed orientation & scale
  – resample a 16x16 version of the window
• Based on gradients weighted by a Gaussian with σ equal to half the window width (for smooth falloff)
![Page 43: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/43.jpg)
SIFT vector formation
• 4x4 array of gradient orientation histograms
  – not really histogram, weighted by magnitude
• 8 orientations x 4x4 array = 128 dimensions
• Motivation: some sensitivity to spatial layout, but not too much.
[Figure: showing only 2x2 here but is 4x4]
![Page 44: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/44.jpg)
SIFT vector formation
• Thresholded image gradients are sampled over 16x16 array of locations in scale space
• Create array of orientation histograms
• 8 orientations x 4x4 histogram array = 128 dimensions
[Figure: showing only 2x2 here but is 4x4]
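To make the "8 orientations x 4x4 array = 128 dimensions" count concrete, here is a deliberately simplified sketch. It assumes the 16x16 window has already been rotated and scaled, and that gradient magnitudes and orientations are given as nested lists. It uses hard binning, so it omits the Gaussian weighting and trilinear interpolation that the real descriptor uses; the function name is hypothetical.

```python
import math

def sift_like_descriptor(mag, ori):
    """mag, ori: 16x16 nested lists of gradient magnitude and orientation
    (radians). Returns a 128-element list: a 4x4 grid of cells, each an
    8-bin orientation histogram weighted by gradient magnitude."""
    hist = [0.0] * (4 * 4 * 8)
    for r in range(16):
        for c in range(16):
            cell = (r // 4) * 4 + (c // 4)          # which of the 4x4 cells
            angle = ori[r][c] % (2 * math.pi)       # wrap into [0, 2*pi)
            b = int(angle / (2 * math.pi) * 8) % 8  # which of the 8 bins
            hist[cell * 8 + b] += mag[r][c]         # magnitude-weighted vote
    return hist

# Hypothetical patch: unit-magnitude gradients all pointing along +x.
mag = [[1.0] * 16 for _ in range(16)]
ori = [[0.0] * 16 for _ in range(16)]
desc = sift_like_descriptor(mag, ori)
```

Each cell covers 4x4 = 16 samples, so in this toy patch every cell puts all 16 votes into its first orientation bin.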
![Page 45: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/45.jpg)
Ensure smoothness
• Gaussian weight
• Trilinear interpolation
  – a given gradient contributes to 8 bins: 4 in space times 2 in orientation
![Page 46: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/46.jpg)
Reduce effect of illumination
• 128-dim vector normalized to 1
• Threshold gradient magnitudes to avoid excessive influence of high gradients
  – after normalization, clamp gradients > 0.2
  – renormalize
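The normalize, clamp, renormalize sequence above is short enough to write out. A minimal sketch, assuming the descriptor is a list of non-negative floats; the function names and the 4-element example vector are illustrative only.

```python
import math

def normalize(v):
    """Scale v to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def illumination_normalize(v, clamp=0.2):
    """Normalize, clamp every entry at `clamp`, then renormalize."""
    v = normalize(v)
    v = [min(x, clamp) for x in v]
    return normalize(v)

# One dominant entry (hypothetical values): clamping reduces its influence.
vec = illumination_normalize([10.0, 1.0, 1.0, 1.0])
```

Without the clamp, the first entry would be about 0.985 of the unit vector; with it, the entry's share drops and the smaller entries count for more.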
![Page 47: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/47.jpg)
Feature stability to noise
• Match features after random change in image scale & orientation, with differing levels of image noise
• Find nearest neighbor in database of 30,000 features
![Page 48: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/48.jpg)
Feature stability to affine change
• Match features after random change in image scale & orientation, with 2% image noise, and affine distortion
• Find nearest neighbor in database of 30,000 features
![Page 49: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/49.jpg)
Distinctiveness of features
• Vary size of database of features, with 30 degree affine change, 2% image noise
• Measure % correct for single nearest neighbor match
![Page 50: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/50.jpg)
![Page 51: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/51.jpg)
![Page 52: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/52.jpg)
These feature point detectors and descriptors are the most important recent advance in computer vision and graphics.
• Feature points are used also for:
  – Image alignment (homography, fundamental matrix)
  – 3D reconstruction
  – Motion tracking
  – Object recognition
  – Indexing and database retrieval
  – Robot navigation
  – … other
![Page 53: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/53.jpg)
More uses for SIFT features
SIFT features have also been applied to (categorical) object recognition
First, let’s present some of the issues in object recognition.
![Page 54: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/54.jpg)
intra-class variation
Slide from: Li Fei-Fei, Rob Fergus and Antonio Torralba, short course on object recognition, http://people.csail.mit.edu/torralba/shortCourseRLOC/
![Page 55: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/55.jpg)
Object recognition issues
– Generative / discriminative / hybrid
Slide from: Li Fei-Fei, Rob Fergus and Antonio Torralba, short course on object recognition, http://people.csail.mit.edu/torralba/shortCourseRLOC/
![Page 56: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/56.jpg)
Object recognition issues
– Generative / discriminative / hybrid
– Appearance only or location and appearance
Slide from: Li Fei-Fei, Rob Fergus and Antonio Torralba, short course on object recognition, http://people.csail.mit.edu/torralba/shortCourseRLOC/
![Page 57: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/57.jpg)
Object recognition issues
– Generative / discriminative / hybrid
– Appearance only or location and appearance
– Invariances
  • Viewpoint
  • Illumination
  • Occlusion
  • Scale
  • Deformation
  • Clutter
  • etc.
Slide from: Li Fei-Fei, Rob Fergus and Antonio Torralba, short course on object recognition, http://people.csail.mit.edu/torralba/shortCourseRLOC/
![Page 58: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/58.jpg)
Object recognition issues
– Generative / discriminative / hybrid
– Appearance only or location and appearance
– Invariances
– Parts or global w/ sub-window
– Use set of features or each pixel in image
Slide from: Li Fei-Fei, Rob Fergus and Antonio Torralba, short course on object recognition, http://people.csail.mit.edu/torralba/shortCourseRLOC/
![Page 59: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/59.jpg)
Current approaches in object recognition
• Bag of words
• Boosting
• Label transfer
![Page 60: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/60.jpg)
Visual words
• Vector quantize SIFT descriptors to a vocabulary of 2 or 3 thousand “visual words”.
• Heuristic design of descriptors makes these words somewhat invariant to:
  – Lighting
  – 2-d Orientation
  – 3-d Viewpoint
![Page 61: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/61.jpg)
Object recognition using visual words
[Figure labels: Find words; Form histograms; Compare with object class database]
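The "find words, form histograms" steps can be sketched as follows. This assumes a vocabulary of cluster centers has already been learned (for example by k-means, which is not shown), and that each descriptor is assigned to its nearest center. The tiny 2-dimensional vocabulary and the function names are hypothetical.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda k: sqdist(descriptor, vocabulary[k]))

def word_histogram(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist

# Hypothetical 3-word vocabulary and 4 descriptors from one image.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [0.2, 0.1], [0.1, 0.8]]
hist = word_histogram(descriptors, vocabulary)
```

The resulting histogram is the image's bag-of-words signature; comparing it against histograms from an object class database is the final step and is left out here.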
![Page 62: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/62.jpg)
Many combinatorial matching problems to be solved for object recognition.
Instance recognition: with features allowed to appear or not in both the test and training examples.
Deformable object recognition: some feature clusters maintain spatial coherence, others can vary.
Category recognition: each class defined by many different training set exemplars. Find the class that best explains the observed feature set.
Semi-supervised object recognition: observed training set features include many background object features.
http://www-cvr.ai.uiuc.edu/ponce_grp/publication/paper/cvpr06b.pdf
http://www.cs.utexas.edu/~grauman/research/projects/pmk/pmk_projectpage.htm
![Page 63: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/63.jpg)
Caltech 101
![Page 64: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/64.jpg)
Caltech 101 results over time
![Page 65: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/65.jpg)
Problem: Category level recognition using visual words representation.
Applications: Object recognition.
References: Lazebnik, Schmid, and Ponce, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, Computer Vision and Pattern Recognition (CVPR 2006), http://www-cvr.ai.uiuc.edu/ponce_grp/publication/paper/cvpr06b.pdf
K. Grauman and T. Darrell. Unsupervised Learning of Categories from Sets of Partially Matching Image Features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New York City, NY, June 2006, http://www.cs.utexas.edu/~grauman/papers/grauman_darrell_cvpr2006.pdf
![Page 66: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/66.jpg)
What has allowed us to make progress?
• SIFT features
• Discriminative classifiers: SVM’s and boosting
• Bayesian methods
• Large databases
![Page 67: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/67.jpg)
Rapid Object Detection Using a Boosted Cascade of Simple Features
Paul Viola, Michael J. Jones
Mitsubishi Electric Research Laboratories (MERL), Cambridge, MA
Most of this work was done at Compaq CRL before the authors moved to MERL
Manuscript available on web:
http://citeseer.ist.psu.edu/cache/papers/cs/23183/http:zSzzSzwww.ai.mit.eduzSzpeoplezSzviolazSzresearchzSzpublicationszSzICCV01-Viola-Jones.pdf/viola01robust.pdf
![Page 68: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/68.jpg)
Viola-Jones approach
• Large feature set (… is huge about 16,000,000 features)
• Efficient feature selection using AdaBoost
• Cascaded Classifier for rapid detection
  – Hierarchy of Attentional Filters
The combination of these ideas yields the fastest known face detector for gray scale images.
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 69: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/69.jpg)
Image Features
“Rectangle filters”
Similar to Haar wavelets
Differences between sums of pixels in adjacent rectangles
$$h_t(x) = \begin{cases} +1 & \text{if } f_t(x) > \theta_t \\ -1 & \text{otherwise} \end{cases}$$
Unique Features
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 70: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/70.jpg)
Huge “Library” of Filters
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 71: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/71.jpg)
Integral Image
• Define the Integral Image
• Any rectangular sum can be computed in constant time:
• Rectangle features can be computed as differences between rectangles
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
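Since the slide's formulas did not survive extraction, here is a small sketch of the idea. It assumes a zero-padded convention in which `ii[r][c]` holds the sum of all pixels above and to the left of `(r, c)`; the paper's own definition is inclusive of the pixel, so the indexing here is an assumption made for convenience. The function names and the 3x3 image are illustrative.

```python
def integral_image(img):
    """ii[r][c] = sum of img over rows < r and columns < c
    (padded with a leading row and column of zeros)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four table lookups."""
    b, r = top + height, left + width
    return ii[b][r] - ii[top][r] - ii[b][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Left rectangle minus the adjacent right rectangle of equal size."""
    return (rect_sum(ii, top, left, height, width)
            - rect_sum(ii, top, left + width, height, width))

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
```

Once the table is built in one pass over the image, `rect_sum` costs the same four lookups regardless of rectangle size, which is what makes evaluating many rectangle features cheap.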
![Page 72: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/72.jpg)
Constructing classifiers by combining filter outputs
• Perceptron yields a sufficiently powerful classifier
• Use AdaBoost to efficiently choose best features
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 73: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/73.jpg)
AdaBoost
Initial uniform weight on training examples
weak classifier 1
Incorrect classifications re-weighted more heavily
weak classifier 2
weak classifier 3
Final classifier is weighted combination of weak classifiers
(Freund & Schapire ’95)
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 74: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/74.jpg)
AdaBoost Tutorial
• Given a Weak learning algorithm
  – Learner takes a training set and returns the best classifier from a weak concept space
    • required to have error < 50%
• Starting with a Training Set (initial weights 1/n)
  – Weak learning algorithm returns a classifier
  – Reweight the examples
    • Weight on correct examples is decreased
    • Weight on errors is increased
• Final classifier is a weighted majority of Weak Classifiers
  – Weak classifiers with low error get larger weight
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 75: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/75.jpg)
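Review of AdaBoost (Freund & Schapire 95)
• Given examples $(x_1, y_1), \ldots, (x_N, y_N)$ where $y_i = 0, 1$ for negative and positive examples respectively.
• Initialize weights $w_{1,i} = 1/N$
• For $t = 1, \ldots, T$
  • Normalize the weights, $w_{t,i} = w_{t,i} / \sum_{j=1}^{N} w_{t,j}$
  • Find a weak learner, i.e. a hypothesis, $h_t(x)$ with weighted error less than .5
  • Calculate the error of $h_t$: $e_t = \sum_i w_{t,i} |h_t(x_i) - y_i|$
  • Update the weights: $w_{t+1,i} = w_{t,i} B_t^{(1 - d_i)}$ where $B_t = e_t / (1 - e_t)$ and $d_i = 0$ if example $x_i$ is classified correctly, $d_i = 1$ otherwise.
• The final strong classifier is
$$h(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \alpha_t h_t(x) > 0.5 \sum_{t=1}^{T} \alpha_t \\ 0 & \text{otherwise} \end{cases}$$
where $\alpha_t = \log(1 / B_t)$
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001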
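The loop above translates almost line for line into code. The sketch below assumes labels in {0, 1} and a fixed pool of candidate weak classifiers (one-dimensional threshold stumps here, standing in for rectangle features); the helper names, the stump pool, and the six-point data set are hypothetical. A tiny floor on the error is added to avoid dividing by zero, which is not part of the slide.

```python
import math

def stump(threshold, above=True):
    """Weak classifier: 1 on one side of the threshold, 0 on the other."""
    if above:
        return lambda x: 1 if x > threshold else 0
    return lambda x: 1 if x < threshold else 0

def train_adaboost(xs, ys, stumps, rounds):
    """Returns a list of (alpha_t, h_t) pairs."""
    n = len(xs)
    w = [1.0 / n] * n                          # initial uniform weights
    strong = []
    for _ in range(rounds):
        total = sum(w)
        w = [wi / total for wi in w]           # normalize the weights

        def weighted_error(h):
            return sum(wi * abs(h(x) - y) for wi, x, y in zip(w, xs, ys))

        h = min(stumps, key=weighted_error)    # best weak classifier
        e = weighted_error(h)
        if e >= 0.5:                           # must beat chance
            break
        e = max(e, 1e-10)                      # guard against e == 0
        beta = e / (1.0 - e)
        # Correct examples (d_i = 0) are multiplied by beta < 1.
        w = [wi * (beta if h(x) == y else 1.0)
             for wi, x, y in zip(w, xs, ys)]
        strong.append((math.log(1.0 / beta), h))
    return strong

def predict(strong, x):
    """Weighted majority vote of the selected weak classifiers."""
    score = sum(a * h(x) for a, h in strong)
    return 1 if score > 0.5 * sum(a for a, _ in strong) else 0

# Toy data that no single stump separates (hypothetical values).
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]
thresholds = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
stumps = [stump(t, above) for above in (True, False) for t in thresholds]
strong = train_adaboost(xs, ys, stumps, rounds=3)
predictions = [predict(strong, x) for x in xs]
```

On this toy set every individual stump makes at least one mistake, yet the weighted vote of three stumps classifies all six points correctly, which is the point of boosting.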
![Page 76: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/76.jpg)
Example Classifier for Face Detection
ROC curve for 200 feature classifier
One stage: a classifier with 200 rectangle features was learned using AdaBoost
95% correct detection on test set with 1 in 14084 false positives.
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 77: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/77.jpg)
Develop fast, accurate classifier using a cascade
• Given a nested set of classifier hypothesis classes
• Computational Risk Minimization
[Figure: ROC curve with axes % False Pos (0 to 50) and % Detection (50 to 99); annotation "vs false neg determined by"]
[Figure: cascade. IMAGE SUB-WINDOW → Classifier 1 → (T) Classifier 2 → (T) Classifier 3 → (T) FACE; an F at any classifier → NON-FACE]
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 78: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/78.jpg)
Experiment: Simple Cascaded Classifier
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 79: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/79.jpg)
Cascaded Classifier
[Figure: cascade. IMAGE SUB-WINDOW → 1 Feature → (50%) 5 Features → (20%) 20 Features → (2%) FACE; an F at any stage → NON-FACE]
• A 1 feature classifier achieves 100% detection rate and about 50% false positive rate.
• A 5 feature classifier achieves 100% detection rate and 40% false positive rate (20% cumulative)
  – using data from previous stage.
• A 20 feature classifier achieves 100% detection rate with 10% false positive rate (2% cumulative)
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
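The cumulative figures above are just products of the per-stage rates (0.50 × 0.40 = 0.20, then 0.20 × 0.10 = 0.02), and the cascade itself is an early-exit loop. A minimal sketch, assuming each stage is a function that returns True to pass a window on; the stage functions and integer "windows" in the example are hypothetical stand-ins for real classifiers and image patches.

```python
def cascade_classify(window, stages):
    """stages: list of functions window -> bool, cheapest first.
    A window is rejected as soon as any stage says False."""
    for stage in stages:
        if not stage(window):
            return False      # NON-FACE: later stages never run
    return True               # passed every stage: FACE

def cumulative_rate(per_stage_rates):
    """Product of per-stage rates, assuming each rate is measured on
    the windows that survived the previous stages."""
    rate = 1.0
    for r in per_stage_rates:
        rate *= r
    return rate

# Per-stage false positive rates from the three stages listed above.
false_positive = cumulative_rate([0.50, 0.40, 0.10])

# Toy stages on integers (hypothetical): positive, then even.
toy_stages = [lambda w: w > 0, lambda w: w % 2 == 0]
accepted = cascade_classify(4, toy_stages)
rejected = cascade_classify(-2, toy_stages)
```

Because most sub-windows fail the first, cheapest stage, the average cost per window stays close to the cost of that stage alone.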
![Page 80: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/80.jpg)
A Real-time Face Detection System
Training faces: 4916 face images (24 x 24 pixels) plus vertical flips for a total of 9832 faces
Training non-faces: 350 million sub-windows from 9500 non-face images
Final detector: 38 layer cascaded classifier. The number of features per layer was 1, 10, 25, 25, 50, 50, 50, 75, 100, …, 200, …
Final classifier contains 6061 features.
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 81: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/81.jpg)
Accuracy of Face Detector
Performance on MIT+CMU test set containing 130 images with 507 faces and about 75 million sub-windows.
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 82: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/82.jpg)
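Comparison to Other Systems
Detection rate (%) by number of false detections:

| Detector | 10 | 31 | 50 | 65 | 78 | 95 | 110 | 167 |
|---|---|---|---|---|---|---|---|---|
| Viola-Jones | 76.1 | 88.4 | 91.4 | 92.0 | 92.1 | 92.9 | 93.1 | 93.9 |
| Viola-Jones (voting) | 81.1 | 89.7 | 92.1 | 93.1 | 93.1 | 93.2 | 93.7 | 93.7 |

Rowley-Baluja-Kanade: 83.2, 86.0, 89.2, 90.1 (reported for only some of these false-detection counts)
Schneiderman-Kanade: 94.4 (reported for a single false-detection count)
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001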
![Page 83: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/83.jpg)
Speed of Face Detector
Speed is proportional to the average number of features computed per sub-window.
On the MIT+CMU test set, an average of 9 features out of a total of 6061 are computed per sub-window.
On a 700 MHz Pentium III, a 384x288 pixel image takes about 0.067 seconds to process (15 fps).
Roughly 15 times faster than Rowley-Baluja-Kanade and 600 times faster than Schneiderman-Kanade.
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 84: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/84.jpg)
Output of Face Detector on Test Images
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 85: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/85.jpg)
More Examples
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 86: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/86.jpg)
Conclusions
• We [they] have developed the fastest known face detector for gray scale images
• Three contributions with broad applicability
  – Cascaded classifier yields rapid classification
  – AdaBoost as an extremely efficient feature selector
  – Rectangle Features + Integral Image can be used for rapid image analysis
Viola and Jones, Robust object detection using a boosted cascade of simple features, CVPR 2001
![Page 87: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/87.jpg)
What has allowed us to make progress?
• SIFT features
• Discriminative classifiers
• Bayesian methods
• Large databases
![Page 88: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/88.jpg)
Tracking a human in 3D
![Page 89: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/89.jpg)
The appearance of people can vary dramatically.
![Page 90: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/90.jpg)
People can appear in arbitrary poses.
Structure is unobservable—inference from visible parts.
![Page 91: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/91.jpg)
Geometrically under-constrained.
![Page 92: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/92.jpg)
But this requires that we use markers, which we don’t want, and also requires multiple cameras.
http://www.vicon.com/animation/
![Page 93: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/93.jpg)
State of the Art.
• Brightness constancy cue
– Insensitive to appearance
• Full-body required multiple cameras
• Single hypothesis
![Page 94: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/94.jpg)
State of the Art.
I(x, t) = I(x+u, 0) + η
• Single camera, multiple hypotheses
• 2D templates (no drift but view dependent)
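A small sketch of the brightness constancy relation I(x, t) = I(x+u, 0) + η for a single global integer displacement u. The function name is hypothetical, and the wrap-around at image borders (from `np.roll`) is a simplification made only to keep the example short.

```python
import numpy as np

def brightness_constancy_residual(frame0, frame_t, u):
    """Residual eta = I(x, t) - I(x + u, 0) for an integer shift u = (dy, dx).

    np.roll with a negative shift gives warped[x] = frame0[x + u],
    with wrap-around at the borders (an assumption of this sketch).
    """
    dy, dx = u
    warped = np.roll(frame0, shift=(-dy, -dx), axis=(0, 1))
    return frame_t - warped
```

A tracker built on this cue would look for the u (or the pose parameters that induce u) that makes the residual small.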
![Page 95: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/95.jpg)
State of the Art.
• Mul>plehypotheses
• Mul>plecameras
• Simplifiedclothing,ligh>ngandbackground
![Page 96: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/96.jpg)
* No special clothing * Monocular, grayscale, sequences (archival data) * Unknown, cluttered, environment
Task: Infer 3D human motion from 2D image
![Page 97: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/97.jpg)
p(model | cues) = p(cues | model) p(model) / p(cues)
1. Need a constraining likelihood model that is also invariant to variations in human appearance.
2. Need a prior model of how people move.
3. Posterior probability: Need an effective way to explore the model space (very high dimensional) and represent ambiguities.
![Page 98: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/98.jpg)
System components for human body tracking
• Representation for probabilistic analysis.
• Models for human motion (prior term).
• Models for human appearance (likelihood term).
![Page 99: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/99.jpg)
• Representation for probabilistic analysis.
• Models for human motion (prior term).
• Models for human appearance (likelihood term).
![Page 100: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/100.jpg)
* Limbs are truncated cones * Parameter vector of joint angles and angular velocities = φ
![Page 101: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/101.jpg)
• Posterior distribution over model parameters often multi-modal (due to ambiguities)
• Represent whole distribution:
– sampled representation
– each sample is a pose
– predict over time using a particle filtering approach
• Isard and Blake, 1998, “Condensation Algorithm”
![Page 102: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/102.jpg)
Posterior: Given the data so far, what do I think is the set of possible states the body could be in?
Temporal dynamics: What could each of those states become at the next time step? (Uses prior model for human motion.)
Likelihood: How much is each of those possible states supported by the visual data at the next time step?
Posterior: Update estimate of possible states, given the visual data.
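The four questions above map onto one step of a sampled (particle) filter. The sketch below is a toy with a scalar state; the random-walk dynamics and Gaussian likelihood are placeholder assumptions standing in for the learned motion prior and image likelihood, and the function name is hypothetical.

```python
import numpy as np

def particle_filter_step(particles, weights, observation, rng,
                         motion_std=0.5, obs_std=1.0):
    """One resample / predict / reweight step on a scalar state."""
    n = len(particles)
    # Posterior so far: draw samples in proportion to their weights.
    resampled = particles[rng.choice(n, size=n, p=weights)]
    # Temporal dynamics: push each sample through the (assumed) motion prior.
    predicted = resampled + rng.normal(0.0, motion_std, size=n)
    # Likelihood: score each predicted state against the new observation.
    log_lik = -0.5 * ((observation - predicted) / obs_std) ** 2
    new_weights = np.exp(log_lik - log_lik.max())
    # Posterior: normalized weights represent the updated distribution.
    new_weights /= new_weights.sum()
    return predicted, new_weights
```

Because the whole weighted sample set is carried forward, several competing hypotheses can survive from frame to frame.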
![Page 103: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/103.jpg)
• Representation for probabilistic analysis.
• Models for human motion (prior term).
• Models for human appearance (likelihood term).
![Page 104: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/104.jpg)
• Only handles people walking.
• Very powerful constraint on human motion.
![Page 105: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/105.jpg)
• Action-specific model: Walking
– Training data: 3D motion capture data
– From training set, learn mean cycle and common modes of deviation (PCA)
Panels: mean cycle, small noise, large noise
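A sketch of learning a mean cycle and its main modes of deviation with PCA, then sampling new cycles with small or large noise. The data layout (one flattened, time-normalized cycle per row) and both function names are assumptions made for illustration.

```python
import numpy as np

def learn_cycle_model(cycles, n_modes=2):
    """cycles: (n_cycles, n_features), one flattened walking cycle per row."""
    mean_cycle = cycles.mean(axis=0)
    centered = cycles - mean_cycle
    # Rows of vt are the principal directions (modes of deviation).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:n_modes]
    stddevs = s[:n_modes] / np.sqrt(max(len(cycles) - 1, 1))
    return mean_cycle, modes, stddevs

def sample_cycle(mean_cycle, modes, stddevs, rng, noise_scale=1.0):
    """Mean cycle plus a random combination of modes; noise_scale sets the spread."""
    coeffs = rng.normal(0.0, noise_scale * stddevs)
    return mean_cycle + coeffs @ modes
```

Setting `noise_scale` near zero reproduces the mean cycle, while larger values give the more varied walks.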
![Page 106: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/106.jpg)
Initialize to figure, then let go…
![Page 107: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/107.jpg)
• Representation for probabilistic analysis.
• Models for human motion (prior term).
• Models for human appearance (likelihood term).
![Page 108: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/108.jpg)
Changing background
Low contrast limb boundaries
Occlusion
Varying shadows
Deforming clothing
What do non-people look like?
What do people look like?
![Page 109: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/109.jpg)
(5000 samples in each example)
![Page 110: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/110.jpg)
Edge cues
![Page 111: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/111.jpg)
Ridge cues
![Page 112: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/112.jpg)
Flow cues
![Page 113: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/113.jpg)
Edge cues
Ridge cues
Flow cues
![Page 114: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/114.jpg)
Walking model
2500 samples ~10 min/frame
![Page 115: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/115.jpg)
What has allowed us to make progress?
• SIFT features
• Discriminative classifiers
• Bayesian methods
• Large datasets
• Miscellaneous advances: exploiting context
![Page 116: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/116.jpg)
Images by Antonio Torralba
Use of context for object detection
car pedestrian
Identical local image features!
![Page 117: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/117.jpg)
Context speeds object detection: this is what the world looks like to a face detector that doesn’t take advantage of context. Can you find the face?
Antonio Torralba
![Page 118: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/118.jpg)
![Page 119: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/119.jpg)
![Page 120: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/120.jpg)
The best object detection algorithms combine top-down (context) with bottom-up (local features) cues.
The top-down information can help suppress false detections caused by ambiguous local information.
![Page 121: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/121.jpg)
Feature vector for an image: the “gist” of the scene
– Compute 12 x 30 = 360 dim. feature vector
– Or use steerable filter bank, 6 orientations, 4 scales, averaged over 4x4 regions = 384 dim. feature vector
– Reduce to ~80 dimensions using PCA
Oliva & Torralba, IJCV 2001
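The 384-dimensional count follows from 6 orientations x 4 scales x 16 regions. The sketch below shows only the spatial pooling step; it assumes the filter-bank energy maps have already been computed elsewhere, and the function names are hypothetical. The PCA reduction to about 80 dimensions is not shown.

```python
import numpy as np

def grid_average(response, grid=4):
    """Average a 2D filter-response energy map over grid x grid regions."""
    h, w = response.shape
    rows = np.array_split(np.arange(h), grid)
    cols = np.array_split(np.arange(w), grid)
    return np.array([[response[np.ix_(r, c)].mean() for c in cols] for r in rows])

def gist_like_descriptor(responses, grid=4):
    """responses: iterable of 2D energy maps, one per (orientation, scale) filter."""
    return np.concatenate([grid_average(r, grid).ravel() for r in responses])
```

With 24 maps (6 orientations at 4 scales) and a 4x4 grid, the descriptor has 24 x 16 = 384 entries.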
![Page 122: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/122.jpg)
Low-dimensional representation for image context
Images
Random noise filtered to have the same 80-dimensional representation as the images above.
![Page 123: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/123.jpg)
“gist” useful for object priming
![Page 124: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/124.jpg)
Examples of learned features for bottom-up detection: apply the filter shown at top rows and average the squared output over regions shown in bottom rows.
![Page 125: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/125.jpg)
The advantage of context in object detection
For each type of object, we plot the single most probable detection if it is above a threshold (set to give an 80% detection rate).
If we know we are in a street, we can prune false positives such as chair and coffee machine (which are hard to detect, and hence must have low thresholds to get an 80% hit rate).
Object detections without context: note false alarms
Object detections after suppression of false detections using context
![Page 126: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/126.jpg)
What has allowed us to make progress?
• SIFT features
• Discriminative classifiers
• Bayesian methods
• Large, labeled datasets.
![Page 127: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/127.jpg)
A correspondence-based approach to scene parsing
Given an image:
– Find another annotated image with similar scene
– Find correspondence between these two images
– Warp the annotation according to the correspondence
Legend: tree, sky, road, field, car, unlabeled, building, window
Panels: input, support, user annotation, warped annotation
Dense scene alignment using SIFT Flow for object recognition, C. Liu, J. Yuen, A. Torralba, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
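The last step of the approach, warping the annotation according to the correspondence, can be sketched as a per-pixel lookup. This assumes a dense integer flow field is already available (computing SIFT flow itself is not shown), and the function name and (dy, dx) layout are illustrative choices.

```python
import numpy as np

def warp_annotation(labels, flow):
    """Transfer labels from the annotated match to the query image.

    labels: (H, W) integer label map of the annotated image.
    flow:   (H, W, 2) integer displacements (dy, dx) from each query pixel
            to its corresponding pixel in the annotated image.
    """
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Clip so that correspondences falling outside the image use the border label.
    src_y = np.clip(ys + flow[..., 0], 0, h - 1)
    src_x = np.clip(xs + flow[..., 1], 0, w - 1)
    return labels[src_y, src_x]
```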
![Page 128: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/128.jpg)
System overview
Query: RGB, SIFT
Nearest neighbors: RGB, SIFT, annotation, SIFT flow
Flow visualization code
Legend: tree, sky, road, field, car, unlabeled
![Page 129: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/129.jpg)
System overview
Warped nearest neighbors: SIFT flow, RGB, SIFT, annotation
Query: RGB, SIFT, parsing, ground truth
Flow visualization code
Legend: tree, sky, road, field, car, unlabeled
![Page 130: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/130.jpg)
Scene parsing results (1)
Panels: query, best match, annotation of best match, warped best match to query, parsing result, ground truth
![Page 131: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/131.jpg)
Scene parsing results (2)
Panels: query, best match, annotation of best match, warped best match to query, parsing result, ground truth
![Page 132: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/132.jpg)
Pixel-wise performance
Our system, optimized parameters
Per-pixel rate: 74.75%
Pixel-wise frequency count of each class
![Page 133: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/133.jpg)
Comparison
(a) Our system, optimized parameters: 74.75%
(b) Our system, no Markov random field: 66.24%
(c) Shotton et al., no Markov random field: 51.67%
(d) Our system, matching color instead of SIFT: 49.68%
J. Shotton et al., TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation, ECCV, 2006
![Page 134: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/134.jpg)
Comparison for each class
• We convert our system to a binary detector for each class and compare it with [Dalal & Triggs, CVPR 2005]
• In ROC, our system (red) outperforms theirs (blue) for most of the classes
![Page 135: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/135.jpg)
What has allowed us to make progress?
• SIFT features
• Discriminative classifiers
• Bayesian methods
• Non-parametric methods
![Page 136: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/136.jpg)
![Page 137: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/137.jpg)
Algorithm
– Pick size of block and size of overlap
– Synthesize blocks in raster order
– Search input texture for block that satisfies overlap constraints (above and left)
• Easy to optimize using NN search [Liang et al., ’01]
– Paste new block into resulting texture
• use dynamic programming to compute minimal error boundary cut
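The minimal error boundary cut in the last step can be sketched as a dynamic program over the overlap strip. This version handles a vertical overlap only and assumes the per-pixel squared difference between the two blocks has already been computed; the function name is hypothetical.

```python
import numpy as np

def min_error_vertical_cut(overlap_error):
    """overlap_error: (H, W) error surface over the overlap strip.

    Returns, for each row, the column of the minimum-cost top-to-bottom seam,
    where the seam may move at most one column between consecutive rows.
    """
    h, w = overlap_error.shape
    cost = overlap_error.astype(float).copy()
    # Forward pass: cumulative minimal cost of reaching each pixel.
    for i in range(1, h):
        for j in range(w):
            lo, hi = max(j - 1, 0), min(j + 2, w)
            cost[i, j] += cost[i - 1, lo:hi].min()
    # Backward pass: trace the cheapest seam from the bottom row upward.
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(h - 2, -1, -1):
        j = seam[i + 1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam[i] = lo + int(np.argmin(cost[i, lo:hi]))
    return seam
```

Pixels to the left of the seam would come from the existing texture and pixels to the right from the new block.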
![Page 138: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/138.jpg)
![Page 139: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/139.jpg)
![Page 140: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/140.jpg)
![Page 141: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/141.jpg)
Problem: How to construct and manage a non-parametric signal prior? How to select the exemplars to use, and how to quickly find nearest-neighbor matches?
Applications: Low-level vision: noise removal, super-resolution, filling-in, texture synthesis.
References:
W. T. Freeman, E. C. Pasztor, O. T. Carmichael, Learning Low-Level Vision, International Journal of Computer Vision, 40(1), pp. 25-47, 2000. http://www.merl.com/reports/docs/TR2000-05.pdf
Alexei A. Efros and Thomas K. Leung, Texture Synthesis by Non-parametric Sampling, IEEE International Conference on Computer Vision (ICCV'99), Corfu, Greece, September 1999. http://graphics.cs.cmu.edu/people/efros/research/NPS/efros-iccv99.pdf
![Page 142: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/142.jpg)
2009 BIRS Workshop on Computer Vision and the Internet
![Page 143: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/143.jpg)
Rob Fergus
Rick Szeliski
Lana Lazebnik
![Page 144: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/144.jpg)
Nearest neighbor search in high dimensions
Nearest neighbors in high dimensions for category recognition. For instance recognition, nearest-neighbor matching of individual features works fine. But for category recognition, many times the local features are not, by themselves, a close match, due to within-class variations.
Nearest neighbor search, but taking into account our particular data. Or, tell us what questions we should be asking about our data in order to do nearest neighbor search well.
On the large database side: how to store memories, concepts, objects in very large databases? Large database issues. Multidimensional: kd-tree (but only up to 20 dims). Finding similar things in very high dimensions.
Parallelism: where can we exploit it? kd-tree high-dimensional search. Does LSH work as advertised? In practice, not as well.
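For readers unfamiliar with LSH, here is a toy sketch of one common variant, random-hyperplane hashing, with exact distance ranking inside the query's bucket. The class name and parameters are hypothetical, and nothing here is claimed to be the scheme discussed at the workshop.

```python
import numpy as np

class HyperplaneLSH:
    """Toy locality-sensitive hash: one bit per random hyperplane."""

    def __init__(self, dim, n_bits, rng):
        self.planes = rng.normal(size=(n_bits, dim))
        self.buckets = {}
        self.data = None

    def _key(self, x):
        # The sign pattern of the projections is the bucket key.
        return tuple((self.planes @ x > 0).astype(int))

    def index(self, data):
        self.data = data
        for i, x in enumerate(data):
            self.buckets.setdefault(self._key(x), []).append(i)

    def query(self, q):
        """Index of the nearest point within q's bucket, or None if it is empty."""
        candidates = self.buckets.get(self._key(q), [])
        if not candidates:
            return None
        dists = np.linalg.norm(self.data[candidates] - q, axis=1)
        return candidates[int(np.argmin(dists))]
```

The weakness noted above is visible in the design: a true neighbor that falls on the other side of even one hyperplane lands in a different bucket and is missed unless multiple tables or neighboring buckets are probed.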
![Page 145: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/145.jpg)
Problem: Nearest neighbor search in high dimensions.
Applications: Non-parametric texture synthesis and super-resolution. Image filling-in. Object recognition. Scene recognition.
References: (Many in CS literature, LSH, etc.)
Connelly Barnes, Eli Shechtman, Adam Finkelstein, Dan B Goldman, PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing, ACM Transactions on Graphics (Proc. SIGGRAPH), August 2009. http://www.cs.princeton.edu/gfx/pubs/Barnes_2009_PAR/patchmatch.pdf
![Page 146: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/146.jpg)
Shai Avidan
![Page 147: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/147.jpg)
Blind vision
![Page 148: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/148.jpg)
Problem: Develop secure multi-party techniques for vision algorithms.
Applications: Secure, distributed image analysis.
References:
S. Avidan and M. Butman, Blind Vision, European Conference on Computer Vision (ECCV), Graz, Austria, 2006. http://www.merl.com/reports/docs/TR2006-006.pdf
Paper abstract: Alice would like to detect faces in a collection of sensitive surveillance images she owns. Bob has a face detection algorithm that he is willing to let Alice use, for a fee, as long as she learns nothing about his detector. Alice is willing to use Bob's detector provided that he will learn nothing about her images, not even the result of the face detection operation. Blind vision is about applying secure multi-party techniques to vision algorithms so that Bob will learn nothing about the images he operates on, not even the result of his own operation, and Alice will learn nothing about the detector. The proliferation of surveillance cameras raises privacy concerns that can be addressed by secure multi-party techniques and their adaptation to vision algorithms.
![Page 149: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/149.jpg)
Deva Ramanan
![Page 150: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/150.jpg)
Evaluate easily over a power set of all segmentations.
Deva Ramanan: wants a fast and efficient way to search over all possible segmentations of an image, scoring each one against some model.
http://www.di.ens.fr/~russell/papers/Russell06.pdf
![Page 151: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/151.jpg)
Problem: Evaluate some segmentation-dependent function over (some approximation to) all possible segmentations. Note: different than bottom-up segmentation, which I would not recommend as a research project.
Applications: Image understanding.
References: Deva’s home page: http://www.ics.uci.edu/~dramanan/
Bryan Russell, Alexei A. Efros, Josef Sivic, Bill Freeman, Andrew Zisserman, Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, CVPR 2006. http://people.csail.mit.edu/brussell/research/proj/mult_seg_discovery/index.html
![Page 152: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/152.jpg)
Alyosha Efros
![Page 153: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/153.jpg)
Efros comments
Alyosha: non-boolean retrieval from large datasets, i.e., it's not logical operations we want to retrieve, but real-valued numbers.
Alyosha: the needle-in-the-haystack problem. Find signal clusters/characteristics when there's lots of noise. Find the patterns, ignore the noise. See the picture of the four of us with hats and determine that hats are what's in common.
Alyosha: we need to find something new to generalize from graphical models. Those were good for toy problems where there were lots of conditional independencies. But now we don't have that. Want some other model. Something that provides the abstraction, maybe, that only a few of these conditional independencies are active at any one time (like sparse coding). Sort of similar to higher-order cliques.
![Page 154: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/154.jpg)
David Lowe
![Page 155: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/155.jpg)
David Lowe
Need better features. An artist can draw the end of an elephant's trunk, and you know immediately what it is. But our features don't capture that similarity at all.
Learning of features from images. What is a natural encoding of images? As a warning for what approach not to take: don't bother learning translation invariance, or rotation invariance. So a little bit of supervision is OK.
![Page 156: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/156.jpg)
Computer vision academic culture
No more “if only” papers
End-to-end empirical orientation
There is a certain overhead in coming up to speed on the filters and representations.
Need dataset validation
The competitive conferences have 20-25% acceptance rates. Other conferences have little impact.
The competitive conferences: CVPR, ICCV, ECCV, NIPS.
Thus: best to collaborate.
![Page 157: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/157.jpg)
People at MIT to work with
Edward Adelson (Brain and Cognitive Sciences): material perception in humans and machines; multi-resolution image representations.
Fredo Durand (EECS): computational photography, computer graphics.
Bill Freeman (EECS): computational photography, computer vision.
John Fisher (CSAIL): machine learning, computer vision.
Polina Golland (EECS): medical applications.
Eric Grimson (EECS): surveillance, medical applications.
Berthold Horn (EECS): computed imaging.
Tommy Poggio (Brain and Cognitive Sciences): machine learning, computer vision, inspired by and modeling human vision.
Ramesh Raskar (Media Lab): computational photography.
Antonio Torralba (EECS): object recognition, scene interpretation.
![Page 158: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/158.jpg)
A computer graphics application of nearest-neighbor finding in high dimensions
![Page 159: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/159.jpg)
The image database
• We have collected ~6 million images from Flickr based on keyword and group searches
– typical image size is 500x375 pixels
– 720 GB of disk space (jpeg compressed)
![Page 160: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/160.jpg)
Image representation
Color layout
GIST [Oliva and Torralba’01]
Original image
![Page 161: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/161.jpg)
Obtaining semantically coherent themes
We further break up the collection into themes of semantically coherent scenes:
Train SVM-based classifiers from 1-2k training images [Oliva and Torralba, 2001]
![Page 162: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/162.jpg)
Basic camera motions
Starting from a single image, find a sequence of images to simulate a camera motion:
Forward motion, camera rotation, camera pan
![Page 163: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/163.jpg)
Scene matching with camera view transformations: Translation
1. Move camera
2. View from the virtual camera
3. Find a match to fill the missing pixels
4. Locally align images
5. Find a seam
6. Blend in the gradient domain
![Page 164: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/164.jpg)
Scene matching with camera view transformations: Camera rotation
1. Rotate camera
2. View from the virtual camera
3. Find a match to fill in the missing pixels
4. Stitched rotation
5. Display on a cylinder
![Page 165: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/165.jpg)
More “infinite” images – camera translation
![Page 166: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/166.jpg)
![Page 167: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/167.jpg)
![Page 168: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/168.jpg)
![Page 169: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/169.jpg)
Virtual space as an image graph
• Nodes represent images
• Edges represent particular motions: forward, rotate (left/right), pan (left/right)
• Edge cost is given by the cost of the image match under the particular transformation
Figure: image graph
Kaneva, Sivic, Torralba, Avidan, and Freeman, Infinite Images, to appear in Proceedings of IEEE.
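One way to picture the image graph is as an adjacency list whose edges carry a motion label and a match cost. The sketch below runs Dijkstra's algorithm to find the cheapest sequence of motions between two images; using shortest paths here is an illustrative assumption, and the node names, motion labels, and costs in any usage are hypothetical.

```python
import heapq

def cheapest_path(graph, start, goal):
    """graph: {image: [(neighbor_image, motion, match_cost), ...]}.

    Returns (total_cost, [(motion, image), ...]) for the cheapest route
    from start to goal, or None if goal is unreachable.
    """
    queue = [(0.0, start, [])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node in settled:
            continue
        settled.add(node)
        if node == goal:
            return cost, path
        for neighbor, motion, match_cost in graph.get(node, []):
            if neighbor not in settled:
                heapq.heappush(
                    queue, (cost + match_cost, neighbor, path + [(motion, neighbor)]))
    return None
```

Low-cost paths correspond to sequences of images whose transitions match well under the chosen camera motions.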
![Page 170: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/170.jpg)
Virtual image space laid out in 3D
![Page 171: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/171.jpg)
Outline
• About me
• Computer vision applications
• Computer vision techniques and problems:
– Low-level vision: underdetermined problems
– High-level vision: combinatorial problems
– Miscellaneous problems
![Page 172: Computer Vision Introduction One Lecture ( Short )](https://reader034.fdocuments.in/reader034/viewer/2022042521/548c5cf4b4795902248b48ab/html5/thumbnails/172.jpg)
Problem: Inference in Markov Random Fields. Want to handle higher-order clique potentials, high-dimensional state variables, and real-valued state variables.
Applications: Low-level vision: noise removal, super-resolution, filling-in, texture synthesis.
References: Pushmeet Kohli, Lubor Ladicky, Philip Torr, Robust Higher Order Potentials for Enforcing Label Consistency, International Journal of Computer Vision, 2009. http://research.microsoft.com/en-us/um/people/pkohli/papers/klt_IJCV09.pdf