November 2016
Animated Image Cloth Segmentation
by
MANDEEP SINGH
Third Year Undergraduate
Mechanical Engineering Department
(IITK)
&
BIBEK KUMAR
Third Year Undergraduate
Electrical Engineering Department
(IITK)
Supervised by
Dr. Tanaya Guha, Assistant Professor
Electrical Engineering Department (IITK)
DEPARTMENT OF ELECTRICAL ENGINEERING
INDIAN INSTITUTE OF TECHNOLOGY KANPUR
ABSTRACT
This project aims to improve the process of parsing clothes from animated images, so that it can be
deployed in the search engines of online shopping sites that sell Disney character clothing. Much of
the time and energy consumed by data retrieval during search can be saved by using algorithms such as
k-means clustering, together with Gabor filters for texture-based feature extraction.
Further, as future work, a novel probabilistic model for image parsing based on Data Driven Markov
Chain Monte Carlo (DDMCMC) is proposed to improve the results.
Keywords: k-means clustering, Gabor filters, Data Driven Markov Chain Monte Carlo
(DDMCMC), split and merge method
INTRODUCTION
Clothing is a major part of our lifestyle and a key indicator of one’s social status. With the
growth of online shopping, there is increasing demand for search results tailored to the
individual consumer’s choice, so that shopping sites can attract customers and make their shopping
experience easier and more delightful.
Vision algorithms that recognize clothing have a wide variety of potential impacts, ranging from
better social understanding to improved person identification, surveillance and content-based
image retrieval. The e-commerce opportunities alone are huge, with clothing markets worth millions
of dollars. Despite the potential research and commercial gains of clothing recognition, relatively
few researchers have explored the problem. Although clothing styles like those of a Disney prince
or princess are quite popular among kids and youngsters (especially the bridal gowns), little
research has been done on parsing animated images, which differ from real-world images, or on
developing an intelligent algorithm to recognize and distinguish the various clothing items in an
animated image of a favorite Disney character.
OVERVIEW OF THE APPROACH
This project uses a well-known algorithm, k-means clustering, as the first stage of clothing
extraction: it segments the constituent parts of the clothes from the background and from other
body parts such as the face, hair, hands and feet. The sampled dataset consists of animated images
of famous Disney characters. As a future improvement, a data-driven Markov chain is to be
established for parsing the image into its constituent clothing items by unifying various
segmentation algorithms such as edge detection, k-means, split and merge, etc. The probabilistic
model can be established by dividing the image into different regions according to their type,
i.e. uniform, texture and shading.
1. k-means clustering
It is a least-squares method that partitions a dataset of n observations into k clusters; each
observation belongs to the cluster whose mean is nearest to it. Here the dataset is the set of
pixels of the animated RGB image, and each pixel is a triple of red, green and blue intensity
values, each ranging from 0 to 255. So we can consider each pixel with its intensity (r, g, b) as
a point in the 3-D RGB coordinate system.
1.1 Algorithm-
Randomly select k points as the initial means of the k clusters. Calculate the Euclidean distance
of the ith pixel (ri, gi, bi) from each mean (rm, gm, bm); the pixel is assigned to the cluster
whose mean is at the minimum distance. After the clusters are formed, the mean of each cluster is
updated to its centroid. The assignment and update steps are repeated until convergence, giving
k distinct clusters, as shown in Fig 1 with 3 clusters.
Figure 1
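The steps of Section 1.1 can be sketched in a few lines of Python. This is only a minimal illustration and not the code used in the project: the function name kmeans_rgb, the max_iter cap, the fixed seed and the choice to draw the initial means from the distinct pixel values are all assumptions made for the example.

```python
import random

def kmeans_rgb(pixels, k, max_iter=100, seed=0):
    """Cluster (r, g, b) tuples into k groups; returns (means, labels)."""
    rng = random.Random(seed)
    # Randomly select k distinct pixel values as the initial means (assumed init).
    means = [tuple(map(float, p)) for p in rng.sample(sorted(set(pixels)), k)]
    labels = None
    for _ in range(max_iter):
        # Assign each pixel to the mean at minimum (squared) Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, means[c])))
            for p in pixels
        ]
        # Convergence: the assignment did not change since the last pass.
        if new_labels == labels:
            break
        labels = new_labels
        # Update each mean to the centroid of the pixels assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # an empty cluster keeps its previous mean
                means[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return means, labels

# Hypothetical usage on a handful of reddish and bluish pixels.
sample_pixels = [(250, 10, 10), (240, 20, 15), (10, 10, 245), (20, 5, 250)]
sample_means, sample_labels = kmeans_rgb(sample_pixels, k=2)
```

Because the initial means are random, different seeds can end in different local optima, which is one source of the unnecessary splitting discussed in Section 1.2.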
The following sampled animated images of Disney princesses were used to segment the clothes in
this project; they are shown along with the results of applying k-means clustering after
separating the background.
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
1.2 Limitation of the initial approach and its remedy-
After performing k-means clustering, we can infer that it segments the clothing from the face,
skin and hair with about 90-95% accuracy, as in the case of Fig 5. However, it is expected to fail
(60-75% accuracy) in the other cases, where the pixels of two different clusters are cluttered
together or a single cluster is split unnecessarily.
These glitches can be removed by segmenting the images with Gabor filters for the textured
regions, and by deploying a more rigorous algorithm that merges regions of the same texture or
shade into a single region instead of letting them split further. Finally, it is proposed to use
the Markov property (i.e. the conditional probability of future states depends only on the present
state and not on the sequence of past events), which comes in handy for finding the optimal
solution (regions) of the dynamic system.
2. Texture based segmentation
As discussed in the earlier limitations section, we use multi-channel Gabor filters to
extract regions of the same texture.
2.1 Gabor filters
These are band-pass filters with tunable center frequency, orientation and bandwidth.
Frequency and orientation representations of Gabor filters are similar to those of the human
visual system, and they have been found to be particularly appropriate for texture representation
and discrimination. The Fourier transform of the Gabor filter is a Gaussian shifted in frequency.
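2-D Gabor Filter-
$$f(x, y, \omega, \theta, \sigma_x, \sigma_y) = \exp\left[-\frac{1}{2}\left(\left(\frac{x}{\sigma_x}\right)^2 + \left(\frac{y}{\sigma_y}\right)^2\right) + j\omega\,(x\cos\theta + y\sin\theta)\right]$$
where $\sigma_x$ and $\sigma_y$ are the spatial spreads, $\omega$ is the frequency and $\theta$ is the orientation.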
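To make the formula concrete, the complex filter can be sampled on an integer pixel grid. The sketch below is illustrative only: the function name gabor_kernel and the half_size parameter (which fixes the window size) are assumptions, and no normalisation constant is applied because the expression above has none.

```python
import cmath
import math

def gabor_kernel(half_size, omega, theta, sigma_x, sigma_y):
    """Sample f(x, y) for integer x, y in [-half_size, half_size]; rows index y."""
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Real part of the exponent: the Gaussian envelope.
            envelope = -((x / sigma_x) ** 2 + (y / sigma_y) ** 2) / 2
            # Imaginary part of the exponent: a plane wave along direction theta.
            phase = omega * (x * math.cos(theta) + y * math.sin(theta))
            row.append(cmath.exp(complex(envelope, phase)))
        kernel.append(row)
    return kernel

# Hypothetical bank of four orientations at one frequency (a "multi-channel" set).
bank = [gabor_kernel(4, math.pi / 2, n * math.pi / 4, 2.0, 2.0) for n in range(4)]
```

Convolving the image with each kernel in such a bank and taking the magnitude of the response gives one texture feature map per channel.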
2.2 Gaussian Smoothing
Spatial smoothing can be applied to the features extracted by the Gabor filters, and is
known to improve segmentation because it suppresses large variations in
the feature map within regions that belong to the same texture.
Each filter output is smoothed using a Gaussian smoothing function that matches the spatial
Gaussian envelope of the corresponding filter:
$$g(x, y) = \exp\left\{-\frac{x^2 + y^2}{2\sigma^2}\right\}$$
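A small sketch of this smoothing step follows, assuming the feature map is a 2-D list of real values (for example the magnitude of a filter response). The names gaussian_kernel and smooth, the normalisation of the kernel to unit sum and the edge handling by replicating border values are choices made for this example, not details given in the report.

```python
import math

def gaussian_kernel(half_size, sigma):
    """Sample g(x, y) = exp(-(x^2 + y^2) / (2 sigma^2)), normalised to sum to 1."""
    raw = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            for x in range(-half_size, half_size + 1)]
           for y in range(-half_size, half_size + 1)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]

def smooth(feature_map, kernel):
    """Replace each value by a Gaussian-weighted average of its neighbourhood."""
    h, w = len(feature_map), len(feature_map[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    # Clamp indices so that border pixels are replicated.
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += kernel[di + r][dj + r] * feature_map[ii][jj]
            out[i][j] = acc
    return out
```

Normalising the kernel keeps a constant feature map unchanged, so only the variations inside a texture are damped.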
2.3 Steps for texture based segmentation
Input Image → Gabor Filtering → Gaussian Smoothing → Clustering → Segmented Image
3. Data Driven Markov Chain Monte Carlo (DDMCMC)
Markov Chain Monte Carlo is a clever way to iteratively search a high-dimensional space
by constructing a Markov chain that converges to an invariant (stationary) distribution, here the
posterior probability p(W|I), which is proportional to the product of the prior p(W) and the likelihood p(I|W).
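The idea can be illustrated with a plain Metropolis-Hastings sampler on a toy problem. This is not the data-driven variant: the state W is reduced to an index into a short list, the proposal is a symmetric step to a neighbouring index on a ring, and the function name and the prior and likelihood values are invented for the example. Only the unnormalised product p(W) p(I|W) is needed, which is why proportionality to the posterior is enough.

```python
import random

def metropolis_hastings(prior, likelihood, n_steps, seed=0):
    """Visit state indices w with long-run frequency proportional to prior[w] * likelihood[w]."""
    rng = random.Random(seed)
    n = len(prior)
    # Unnormalised posterior p(W|I), assumed strictly positive for every state.
    post = [p * l for p, l in zip(prior, likelihood)]
    w = rng.randrange(n)
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal: move to one of the two neighbours on a ring.
        w_new = (w + rng.choice((-1, 1))) % n
        # Accept with probability min(1, p(W'|I) / p(W|I)); otherwise stay at w.
        if rng.random() * post[w] < post[w_new]:
            w = w_new
        samples.append(w)
    return samples

# Toy example: uniform prior over 4 states, likelihood increasing with the index.
chain = metropolis_hastings([0.25] * 4, [1.0, 2.0, 3.0, 4.0], n_steps=50000)
freq = [chain.count(s) / len(chain) for s in range(4)]
```

After enough steps the visit frequencies approach the normalised posterior, which for these toy values is (0.1, 0.2, 0.3, 0.4).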
3.1 Image Models
As there are mainly two types of regions in the sampled images, namely uniform and textured,
we define two image models: an independently and identically distributed (i.i.d.) Gaussian model
for uniform regions and a mixture of two Gaussians for textured color regions. Each model has its
own likelihood function, so a region can switch between models according to the maximum likelihood.
The likelihood of the image is the product of the regions’ likelihoods, which is given by
$$p(I \mid W) = \prod_{i=1}^{K} p(I_{R_i}; \Theta_i, \ell_i)$$
where $I$ is the image, $R_i$ is the ith region, $\Theta_i$ is the model parameter vector of that
region and $\ell_i$ is the model label index.
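In practice the product over regions is evaluated as a sum of log-likelihoods. The sketch below does this for the uniform-region model only, assuming scalar pixel intensities and a parameter vector consisting of a mean and a standard deviation; the function names and the sample values are illustrative, and the two-Gaussian mixture model for textured regions is not covered.

```python
import math

def gaussian_region_loglik(values, mu, sigma):
    """Log-likelihood of one region's pixels under an i.i.d. Gaussian N(mu, sigma^2)."""
    n = len(values)
    sq = sum((v - mu) ** 2 for v in values)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sq / (2 * sigma ** 2)

def image_loglik(regions):
    """log p(I|W) for regions given as (values, mu, sigma); the product becomes a sum."""
    return sum(gaussian_region_loglik(v, mu, s) for v, mu, s in regions)

# Two hypothetical partitions W of the same eight pixels.
dark = [10.0, 11.0, 9.0, 10.0]
bright = [50.0, 51.0, 49.0, 50.0]
two_regions = image_loglik([(dark, 10.0, 1.0), (bright, 50.0, 1.0)])
one_region = image_loglik([(dark + bright, 30.0, 20.0)])
```

Here two_regions comes out larger than one_region, so the likelihood term favours keeping the dark and bright pixels in separate regions; the prior p(W) is what keeps the chain from splitting indefinitely.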
3.2 Dynamics- Split and Merge method
Let W and W’ be states of the Markov chain with K and K+1 disjoint regions, respectively.
If the kth region is split into the ith and jth regions, the probability of the proposed split is defined by
where the conditional probability of how likely the chain is to propose a move from W to W’ is given by
and the probability of the proposed merge is
Figure 7: The anatomy of the solution space. The arrows represent Markov chain jumps, and the
reversible jumps between the two subspaces 8 and 9 realise the split and merge of a region.
3.3 Further refinement through calibration
The project proposes the DDMCMC algorithm to obtain the desired results; however, the system
needs to be calibrated for the sampled animated images. A parametric factor is
multiplied with the likelihood function so that the posterior probabilities of both types of
image model give an output that agrees with human visual recognition.
REFERENCES
[1] R. C. Gonzalez, R. E. Woods, “Digital Image Processing”, 3rd Edition, Ch-10
[2] Kota Yamaguchi, M. Hadi Kiapour, Luis E. Ortiz, Tamara L. Berg, “Parsing Clothing in
Fashion Photographs”, IEEE, 2012
[3] Zhuowen Tu, Song-Chun Zhu, “Image Segmentation by Data-Driven Markov Chain Monte
Carlo”, IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2002
[4] MathWorks Documentation, “Image Processing”, The MathWorks, Inc., 2016