MKL for Category Recognition


Transcript of MKL for Category Recognition

Page 1: MKL for Category Recognition

MKL for Category Recognition

Kumar Srijan, Syed Ahsan Ishtiaque

Page 2: MKL for Category Recognition

Dataset

• 19 categories considered
• Currently:
  – Minimum of 58 images in each category
  – Average of 101 images per category

• The images have been taken from Google Images.

• They have been supplemented by images from Flickr.

http://images.google.com and http://flickr.com

Pages 3-21: MKL for Category Recognition

[Image-only slides showing example images for the dataset categories, taken from http://images.google.com.]

Page 22: MKL for Category Recognition

Code Walkthrough – Relevant Files

• preprocCal101.sh – rescales images and renames them according to the code
• cal_preprocDatabases.m – builds the image and ROI databases
• cal_preprocVocabularies.m – prepares the visual word vocabularies
• cal_preprocFeatures.m – computes features for all the images, projects them onto visual words and builds map files for each
• cal_preprocHistograms.m – prepares histograms for the visual words
• cal_preprocKernels.m – computes training and testing kernel matrices
• cal_classAll.m – final classification

(A sketch of running these drivers in order follows.)
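As a rough orientation, the drivers above can be run in the order of the pipeline. The sketch below assumes each .m file can be invoked with no arguments once the drivers directory is on the MATLAB path; the actual entry points in the code may take configuration arguments or be called differently.

    % Sketch: run the pipeline end to end (invocation details are assumptions).
    system('./preprocCal101.sh') ;   % rescale and rename the raw images (shell script)
    cal_preprocDatabases ;           % build image and ROI databases
    cal_preprocVocabularies ;        % build the visual word vocabularies
    cal_preprocFeatures ;            % compute features, quantize them, write map files
    cal_preprocHistograms ;          % build visual word histograms
    cal_preprocKernels ;             % compute training and testing kernel matrices
    cal_classAll ;                   % train the MKL SVMs and classify all categories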

Page 23: MKL for Category Recognition

Code Walkthrough

The flow chart on this slide summarizes the pipeline:

• Construct visual words – cal_preprocVocabularies
  – Calculating local descriptors
  – Vector quantization – bk_calcVocabulary
• Calculate image descriptors – bk_calcFeatures
• Preparing the database (separate training and testing images, define the region of interest and add jitters to it) – cal_preprocDatabases
• Compute the features for all the images, project them onto visual words, and produce map files for each – cal_preprocFeatures
  – Compute and quantize descriptors for training and test images – bk_calcFeatures
• Prepare the visual word histograms – cal_preprocHistograms
• Compute training and testing kernel matrices – cal_preprocKernels
• Run on all categories – cal_classAll
  – Train SVM with MKL (one-vs-rest classifiers) – bk_trainAppModel
  – Evaluate SVM on test data – bk_testAppModel

Page 24: MKL for Category Recognition

Documentation for modifications and adjustment of parameters for code execution

Page 25: MKL for Category Recognition

Changing the number of Training and testing images

• Default value is 15.
• Change drivers/cal_filenames.txt accordingly – this file contains the names of the images for each category to be processed as training or testing images.

Page 26: MKL for Category Recognition

Changing the number of Training images

• To change the number of final training images, which includes jittered images:
  – In the drivers/cal_conf.m file, change conf.numPos to the desired value.

• To change the number of initial training images (without jitters), which are the input to the code:
  – In drivers/cal_preprocDatabases.m, change

        if ni <= 15                       % Hard Coded

    to

        if ni <= conf.numPos              % Changed; set conf.numPos to your desired value
          imdb.images(ii).set = imdb.sets.TRAIN ;
        else
          imdb.images(ii).set = imdb.sets.TEST ;
        end

Page 27: MKL for Category Recognition

Changing the number of test images

• In drivers/cal_setupTrainTest.m:

      for cl = fieldnames(roidb.classes)'
        selCla = findRois(testRoidb, 'class', char(cl)) ;
        % Change this (hard coded):
        %   keep(selCla(1 : min(15, length(selCla)))) = true ;
        % to this (you can change conf.numPos to the desired value):
        keep(selCla(1 : min(conf.numPos, length(selCla)))) = true ;
      end

Page 28: MKL for Category Recognition

Adding a new Feature

• In drivers/cal_conf.m:
  – Add the feature name to conf.featNames.
  – Then specify the properties and parameters for that feature, like conf.feat.<your_feature_name>.<parameter>.

• Now add your extractFn, quantizeFn and clusterFn in the features directory (check the input and output format for each). A minimal configuration sketch follows.
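The sketch below illustrates such a configuration entry, using the parameter names listed on the following "Parameters" slides. The feature name myfeat and the function handles extractMyFeat, clusterMyFeat and quantizeMyFeat are hypothetical placeholders, and the numeric values are only examples.

    % drivers/cal_conf.m (sketch; names and values are placeholders)
    conf.featNames{end+1} = 'myfeat' ;                     % register the new feature
    conf.feat.myfeat.format            = 'sparse' ;        % 'dense' or 'sparse'
    conf.feat.myfeat.extractFn         = @extractMyFeat ;  % extracts the raw feature
    conf.feat.myfeat.clusterFn         = @clusterMyFeat ;  % k-means clustering
    conf.feat.myfeat.quantizeFn        = @quantizeMyFeat ; % projects onto the clusters
    conf.feat.myfeat.vocabSize         = 300 ;             % number of visual words
    conf.feat.myfeat.numImagesPerClass = 15 ;              % images per class for the vocabulary
    conf.feat.myfeat.numFeatsPerImage  = 100 ;             % features sampled per image
    conf.feat.myfeat.compress          = false ;           % generally false
    conf.feat.myfeat.pyrLevels         = 3 ;               % pyramid levels for the histograms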

Page 29: MKL for Category Recognition

Parameters

• Parameters should include:
  – format – dense or sparse
    • In the dense format, one stores features on a grid, specifying the x and y pixel coordinates of each column/row of the grid. One then stores an "image" whose pixels correspond to the grid elements and specify the corresponding visual words.
    • In the sparse format, one stores a list of visual words and their x, y locations in the image.
  – extractFn – pointer to the function called to extract the feature

Page 30: MKL for Category Recognition

Parameters

– clusterFn – pointer to the clustering (k-means) function
– quantizeFn – pointer to the function used to project onto the k-means clusters
– vocabSize – k-means vocabulary size (number of visual words)
– numImagesPerClass – number of images per class used to sample features to train the vocabulary with k-means
– numFeatsPerImage – number of features per image sampled to train the vocabulary with k-means
– compress – "false" generally
– pyrLevels – pyramid levels used when building histograms based on this feature

(A rough sketch of the dense and sparse storage formats follows.)
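The sketch below illustrates the two storage formats described on the previous slide. The field names (x, y, words), the grid step and the example values are illustrative assumptions; the actual structures in the code may differ.

    % Dense format (sketch): visual words on a regular grid
    imWidth  = 320 ; imHeight = 240 ;                          % example image size
    denseFeat.x     = 1:8:imWidth ;                            % x coordinate of each grid column
    denseFeat.y     = 1:8:imHeight ;                           % y coordinate of each grid row
    denseFeat.words = randi(300, numel(denseFeat.y), numel(denseFeat.x)) ; % word per grid cell

    % Sparse format (sketch): a list of visual words and their locations
    sparseFeat.words = [12; 7; 301] ;                          % visual word indices
    sparseFeat.x     = [34.5; 120.0; 210.2] ;                  % x location of each word
    sparseFeat.y     = [50.1; 80.3; 145.7] ;                   % y location of each word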

Page 31: MKL for Category Recognition

Changing Jitters

• Jitters are basic modifications (zooming, flipping and rotating) of an image. In the code they are used to create more training data out of the basic training data, which helps to increase the accuracy.

• Current jitters supported are:
  – rp5, rm5, fliplr, fliplr_rp5, fliplr_rm5, zm1, zm2 – these are all modifications of zoom, rotation and flip only.

• To change the jitters to be used, in the drivers/cal_conf.m file change conf.jitterNames accordingly. (A rough sketch of such jitters is given below.)
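The sketch below shows how the kinds of jitters named above (flip, small rotation, zoom) could be produced in MATLAB. It is only an illustration: the exact rotation angles and zoom factors behind names such as rp5 or zm1, and the actual jitter code, are assumptions.

    % Sketch: generating flip / rotate / zoom jitters from one training image
    im = imread('example.jpg') ;                            % any training image
    if size(im, 3) == 3, im = rgb2gray(im) ; end            % work on a grayscale copy
    jitFlip = fliplr(im) ;                                  % horizontal flip
    jitRot  = imrotate(im, 5, 'bilinear', 'crop') ;         % small rotation (angle is a guess)
    jitZoom = imresize(im, 1.2) ;                           % zoom in (factor is a guess)
    jitZoom = jitZoom(1:size(im,1), 1:size(im,2)) ;         % crop back to the original size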

Page 32: MKL for Category Recognition

Changing Features

• Current features supported are:
  – gb – sparse geometric-blur words
  – gist
  – bow – sparse SIFT words, bag of words
  – phog180, phog360 – dense edge-based shape
  – phowColor, phowGray – dense SIFT words

• To change the features to be used, in the drivers/cal_conf.m file change conf.featNames accordingly (see the sketch below).

• When using the bow feature, also run cal_preprocDiscrimScores after the cal_preprocFeatures step.
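A minimal sketch of selecting features in drivers/cal_conf.m; the particular subset chosen here is only an example:

    % drivers/cal_conf.m (sketch): choose which features to compute and combine
    conf.featNames = {'phowGray', 'phog180', 'gb'} ;   % any subset of the supported features
    % If 'bow' is included, remember to run cal_preprocDiscrimScores after cal_preprocFeatures.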

Page 33: MKL for Category Recognition

Changing the weight learning method

• Current learning methods supported are:
  – Manik
  – equalMean – the weights are set to the inverse of the average of the kernel matrices. It is a simple heuristic whose only purpose is to "balance" the kernels when you combine them additively.

• To change the weight learning method, in the drivers/cal_conf.m file change conf.learnWeightMethod accordingly. (A sketch of the equalMean heuristic follows.)
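A minimal sketch of the equalMean heuristic as described above, on toy kernel matrices; the variable names are illustrative and do not come from the actual code:

    % equalMean: weight each kernel by the inverse of its mean entry, then add them.
    N  = 150 ;                                   % e.g. 10 classes x 15 training images
    X1 = rand(N, 10) ;  K1 = X1 * X1' ;          % two toy kernel matrices
    X2 = rand(N, 50) ;  K2 = X2 * X2' ;
    Ks = {K1, K2} ;
    w  = cellfun(@(K) 1 / mean(K(:)), Ks) ;      % equalMean weights
    Kcomb = zeros(N) ;
    for k = 1:numel(Ks)
      Kcomb = Kcomb + w(k) * Ks{k} ;             % "balanced" additive combination
    end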

Page 34: MKL for Category Recognition

Obtaining Results

• Calculate the SVM score of each test image for all the classes.

• The image is assigned the class which has the highest score.

• Use this information to create the confusion matrix.

• Use the confusion matrix to calculate the final accuracy. (A sketch of these steps follows.)
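A minimal sketch of these steps on toy data; the variables scores and trueLabels stand in for the per-class SVM scores and ground-truth labels produced by the actual code:

    % Toy inputs (illustrative): scores(c, i) = SVM score of test image i for class c
    numClasses = 10 ;  numTest = 150 ;
    scores     = randn(numClasses, numTest) ;
    trueLabels = randi(numClasses, 1, numTest) ;

    [~, predLabels] = max(scores, [], 1) ;                        % pick the highest-scoring class
    confMat  = accumarray([trueLabels(:) predLabels(:)], 1, ...
                          [numClasses numClasses]) ;              % rows: true, columns: predicted
    accuracy = sum(diag(confMat)) / sum(confMat(:)) ;             % final accuracy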

Page 35: MKL for Category Recognition

Code Execution - I

• In the current execution, we have taken 10 classes:
  – Badge, Bulb, Camera, Cell, Frog, Horse, Keyboard, Kingfisher, Locket, Moon

• 15 train + 15 test images were used for the execution of the code

Page 36: MKL for Category Recognition

Kernel Matrices

[Visualizations of the kernel matrices echi2_phowGray_L0, echi2_phowGray_L1, echi2_phowGray_L2 and el2_gb.]
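The kernel names suggest exponentiated chi-squared (echi2) kernels over the phowGray histograms at pyramid levels 0-2, plus an L2-based (el2) kernel over the geometric-blur feature; this reading and the formula below are assumptions, not taken from the code. A sketch of a chi-squared kernel between histogram sets:

    % Sketch: exponentiated chi-squared kernel between two sets of L1-normalized histograms
    H1 = rand(5, 300) ;  H1 = bsxfun(@rdivide, H1, sum(H1, 2)) ;   % toy histograms (one per row)
    H2 = rand(7, 300) ;  H2 = bsxfun(@rdivide, H2, sum(H2, 2)) ;
    D  = zeros(size(H1, 1), size(H2, 1)) ;
    for i = 1:size(H1, 1)
      for j = 1:size(H2, 1)
        num = (H1(i,:) - H2(j,:)).^2 ;
        den = H1(i,:) + H2(j,:) + eps ;
        D(i,j) = 0.5 * sum(num ./ den) ;          % chi-squared distance
      end
    end
    gammaVal = 1 / mean(D(:)) ;                   % common bandwidth heuristic
    K = exp(-gammaVal * D) ;                      % kernel matrix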

Page 37: MKL for Category Recognition

Aggregate SVM Scores

[Heat map of aggregate SVM scores, test images vs. categories; color scale from lowest to highest score.]

Page 38: MKL for Category Recognition

Confusion matrix (rows: true class, columns: predicted class)

                BADGE  BULB  CAMERA  CELL  FROG  HORSE  KEYBOARD  KINGFISHER  LOCKET  MOON
    BADGE           8     0       1     0     2      0         1           0       2     1
    BULB            0    10       0     0     0      1         0           1       1     2
    CAMERA          1     1       8     2     1      0         1           0       1     0
    CELL            0     2       0     7     1      1         0           2       2     0
    FROG            0     1       0     0     7      3         1           2       1     0
    HORSE           0     0       0     0     6      9         0           0       0     0
    KEYBOARD        0     1       0     0     1      0        13           0       0     0
    KINGFISHER      0     0       0     0     5      2         0           7       1     0
    LOCKET          1     2       0     0     1      0         0           0       9     2
    MOON            1     0       0     0     0      0         0           1       0    13

Page 39: MKL for Category Recognition

Confusion Matrix

[The confusion matrix visualized as a category-vs-category heat map; color scale from lowest to highest.]

Page 40: MKL for Category Recognition

Analysis

• Overall accuracy is 61%.

• Moon and keyboard have very high classification rates – they have relatively low intraclass variance.

• Cell phone, frog and kingfisher have very low classification rates.

• There is appreciable confusion in horse vs. frog and kingfisher vs. frog. These are found in natural surroundings, possibly creating the confusion.

• Artificial objects don’t get confused with natural ones very frequently.

Page 41: MKL for Category Recognition

Code Execution - II

• In this execution, we have taken 19 classes:
  – badge, bulb, camera, cell, frog, horse, keyboard, kingfisher, locket, moon, owl, photo, piggy, pliers, remote, shirt, shoe, spoon, sunflower

• 15 train + 15 test images were used for the execution of the code

Page 42: MKL for Category Recognition

Kernel Matrices

[Visualizations of the kernel matrices echi2_phowGray_L0, echi2_phowGray_L1, echi2_phowGray_L2 and el2_gb.]

Page 43: MKL for Category Recognition

Aggregate SVM Scores

[Heat map of aggregate SVM scores, test images vs. categories; color scale from lowest to highest score.]

Page 44: MKL for Category Recognition

Confusion Matrix

[The confusion matrix visualized as a category-vs-category heat map; color scale from lowest to highest.]

Page 45: MKL for Category Recognition

Confusion Matrix

[A second view of the confusion matrix as a category-vs-category heat map; color scale from lowest to highest.]

Page 46: MKL for Category Recognition

Analysis

• Overall accuracy is 50.5% (lower than for the 10-category classification).

• Moon, keyboard and shirt have very high classification rates.

• Cell phone, frog and kingfisher have very low classification rates.

• There is appreciable confusion in photo-frame vs. cell phone and kingfisher vs. frog.

Page 47: MKL for Category Recognition

Analysis

• The classification of bulb was good in the 10-category case, but was very bad in the 19-category case.

• Similar-looking objects (low interclass difference) like camera, cell phone, remote control and photo frame are more likely to get confused amongst themselves than with other groups.

Page 48: MKL for Category Recognition

Code Execution - III

• In this execution, we have taken 19 classes:
  – badge, bulb, camera, cell, frog, horse, keyboard, kingfisher, locket, moon, owl, photo, piggy, pliers, remote, shirt, shoe, spoon, sunflower

• 20 train + 15 test images were used for the execution of the code

Page 49: MKL for Category Recognition

Kernel Matrices

[Visualizations of the kernel matrices echi2_phowGray_L0, echi2_phowGray_L1, echi2_phowGray_L2 and el2_gb.]

Page 50: MKL for Category Recognition

Aggregate SVM Scores

[Heat map of aggregate SVM scores; color scale from lowest to highest score.]

Page 51: MKL for Category Recognition

Confusion Matrix

[The confusion matrix visualized as a heat map; color scale from lowest to highest.]

Page 52: MKL for Category Recognition

Analysis

• Overall accuracy is 53.3% (slightly higher than when 15 training images were taken per category).

• Moon, keyboard and shirt have very high classification rate.

• Cell phone, frog and kingfisher have very low classification rates.

• There is appreciable confusion in photo-frame vs. cell phone and kingfisher vs. frog.

Page 53: MKL for Category Recognition

Analysis

• The number of correct classifications for photo frame increased from 2 to 10 (out of 15).

• The number of correct classifications for piggy bank decreased from 10 to 5 (out of 15).

• For objects with low intra-class variation (moon), the classification error has increased.

• For objects with low intra-class variation (photo frame), the classification error has decreased significantly.

• Increasing the number of training images did not significantly increase accuracy in the case of classes with low inter-class variability.

Page 54: MKL for Category Recognition

Code Execution - IV

• 19 Classes Considered – Badge, Bulb, Camera, Cell Phone, Frog, Horse, Keyboard, Kingfisher, Locket, Moon, Owl, Photo Frame, Piggy Bank, Pliers, Remote Control, Shirt, Shoe, Spoon and Sunflower

• Features used are phog180 and phog360.
• No jitters are used.
• Kernel type is echi2.
• 25 train + 15 test images were used for the execution of the code.

Page 55: MKL for Category Recognition

Kernel Matrices

Page 56: MKL for Category Recognition

Aggregate SVM Scores

[Heat map of aggregate SVM scores; color scale from lowest to highest score.]

Page 57: MKL for Category Recognition

Confusion Matrix

[The confusion matrix visualized as a heat map; color scale from lowest to highest.]

Page 58: MKL for Category Recognition

Analysis

• Overall accuracy was 45.6 percent.

• In spite of having more training images, the accuracy decreased.

• This shows that jittering helps in better training and, in turn, better accuracy.

Page 59: MKL for Category Recognition

Code Execution - V

• 10 classes considered – Badge, Bulb, Camera, Cell Phone, Frog, Horse, Keyboard, Kingfisher, Locket and Moon.
• Kernel type is echi2.
• 25 train and 25 test images are used for the execution of the code.
• Running the code on the new test feature "lowesift":
  – This feature is similar to the BOW feature.
  – Instead of using Laplace-Harris for calculating interest points, it uses the SIFT detector itself.
  – In our case, we used both the implementations of David Lowe and Andrea Vedaldi to test the feature.
  – Since the feature is similar to the BOW feature, the clusterFn and quantizeFn of BOW were used. (A sketch of such an extractor is given below.)
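A minimal sketch of a lowesift-style extractor using Andrea Vedaldi's VLFeat vl_sift; the actual extractFn used in the code, and the exact output structure expected by the BOW quantizeFn, may differ, so the field names here are illustrative:

    % Sketch: SIFT detector + descriptor as the "lowesift" feature (sparse format)
    % Requires VLFeat on the MATLAB path (run vl_setup first).
    im = imread('example.jpg') ;
    if size(im, 3) == 3, im = rgb2gray(im) ; end
    im = im2single(im) ;
    [frames, descrs] = vl_sift(im) ;        % interest points from the SIFT detector itself
    feat.x     = frames(1, :) ;             % x locations of the keypoints
    feat.y     = frames(2, :) ;             % y locations of the keypoints
    feat.descr = single(descrs) ;           % 128-D descriptors, quantized later with the BOW quantizeFn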

Page 60: MKL for Category Recognition

Kernel Matrices

Page 61: MKL for Category Recognition

Aggregate SVM Scores

[Heat map of aggregate SVM scores; color scale from lowest to highest score.]

Page 62: MKL for Category Recognition

Confusion Matrix

[The confusion matrix visualized as a heat map; color scale from lowest to highest.]

Page 63: MKL for Category Recognition

Page 64: MKL for Category Recognition

Analysis

• Overall accuracy is 47.2 percent.

• Keyboard is the class that is most correctly classified, because of its low intraclass variance.

• Cell phone, badge and bulb have the least accuracy.

• There is appreciable confusion between badge and locket, as these two classes are very similar.

• Kingfisher got confused with horse and locket; this did not happen in earlier executions.