UNIVERSITY OF NAIROBI
FINAL YEAR PROJECT
DEPARTMENT OF ELECTRICAL AND INFORMATION ENGINEERING
CROWD SIZE ESTIMATOR USING IMAGE PROCESSING TECHNIQUES
PROJECT NO: 95
By
LWANGA CHRISTINE WANJIRA
REG. NO: F17/29764/2009
SUPERVISOR: DR. H. A. OUMA
EXAMINER: OMBURA
A PROJECT REPORT SUBMITTED TO THE DEPARTMENT OF ELECTRICAL AND
INFORMATION ENGINEERING IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS OF BSc. ELECTRICAL AND ELECTRONIC ENG. OF THE
UNIVERSITY OF NAIROBI
April 28th, 2014
DECLARATION OF ORIGINALITY.
FACULTY/ SCHOOL/ INSTITUTE: Engineering
DEPARTMENT: Electrical and Information Engineering
COURSE NAME: Bachelor of Science in Electrical & Electronic Engineering
NAME OF STUDENT: LWANGA CHRISTINE WANJIRA
REGISTRATION NUMBER: F17/29764/2009
COLLEGE: Architecture and Engineering
TITLE OF WORK: CROWD SIZE ESTIMATION OF A STILL PHOTOGRAPH USING IMAGE PROCESSING TECHNIQUES.
1) I understand what plagiarism is and I am aware of the university policy in this regard.
2) I declare that this final year project report is my original work and has not been submitted
elsewhere for examination, award of a degree or publication. Where other people’s work or my
own work has been used, this has properly been acknowledged and referenced in accordance
with the University of Nairobi’s requirements.
3) I have not sought or used the services of any professional agencies to produce this work.
4) I have not allowed, and shall not allow anyone to copy my work with the intention of passing it
off as his/her own work.
5) I understand that any false claim in respect of this work shall result in disciplinary action, in
accordance with the University's anti-plagiarism policy.
Signature: ………………………………………………………………………………………
Date: ……………………………………………………………………………………………
DEDICATION
This project is dedicated to my parents, who have selflessly given their all to ensure that I get a
stellar education. Their continuous encouragement, even in the face of despair, has given me the
strength and inspiration to focus and work towards achieving my goals.
ACKNOWLEDGEMENT
I acknowledge the enormous input of my supervisor, Dr. H. Ouma: the useful comments and
suggestions which have led to the improvement of this project, and the guidance and moral
support that he gave me during its development.
ABSTRACT.
Image processing algorithms are the basis for computer image analysis and machine vision. Employing a
theoretical foundation (image algebra) and powerful development tools (Visual C++, Visual Fortran,
Visual Basic, and Visual Java), high-level and efficient computer vision techniques have been developed.
This report analyzes different image processing algorithms by classifying them into logical groups. In
addition, specific methods are presented illustrating the application of such techniques to real-world
images. In most cases more than one method is used. This provides a basis for comparing the different
methods, as the advantageous features as well as the negative characteristics of each technique are delineated.
Contents
DECLARATION OF ORIGINALITY. ..................................................................................................................... 1
DEDICATION .................................................................................................................................................... 2
ACKNOWLEDGEMENT .................................................................................................................................... 3
ABSTRACT. ...................................................................................................................................................... 4
LIST OF FIGURES ............................................................................................................................................ 7
CHAPTER ONE: INTRODUCTION. .................................................................................................................... 8
Background ................................................................................................................................................. 8
Problem Statement .................................................................................................................................... 8
Main objectives .......................................................................................................................................... 9
Specific Objectives. ..................................................................................................................................... 9
Project Scope .............................................................................................................................................. 9
Conclusion. ................................................................................................................................................. 9
CHAPTER TWO: LITERATURE REVIEW. .......................................................................................................... 10
INTRODUCTION. ....................................................................................................................................... 10
The Spatial Domain ................................................................................................................................... 12
Contrast Manipulation .......................................................................................................................... 13
Histogram Equalization ......................................................................................................................... 13
Laplacian ............................................................................................................................................... 14
A Genetic Algorithm Technique ............................................................................................................ 15
Comparison of Competing Techniques .................................................................................................... 17
Detection of stationary crowds. ............................................................................................................... 19
Optimal density estimate. ........................................................................................................................ 20
Geometric distortion. ............................................................................................................................... 20
Grey – level ........................................................................................................................................... 21
Edge Detection. .................................................................................................................................... 22
Digital morphology. .............................................................................................................................. 24
Texture. ................................................................................................................................................. 25
Thinning and skeletonization algorithms. ............................................................................................ 26
Hough Transform .................................................................................................................................. 27
Summary ................................................................................................................................................... 27
CHAPTER THREE: DESIGN ............................................................................................................................. 28
Introduction. ............................................................................................................................................. 28
Design Stages ............................................................................................................................................ 28
Stages of development ......................................................................................................................... 28
Conclusion. ............................................................................................................................................... 35
CHAPTER FOUR: IMPLEMENTATION............................................................................................................. 36
Introduction. ............................................................................................................................................. 36
Conclusion ................................................................................................................................................ 45
CHAPTER FIVE. .............................................................................................................................................. 46
Discussion. ................................................................................................................................................ 46
Challenges Encountered. .......................................................................................................................... 46
Conclusion. ............................................................................................................................................... 46
Recommendations. ................................................................................................................................... 47
REFERENCES. ................................................................................................................................................. 48
APPENDIX. ..................................................................................................................................................... 50
Matlab Code. ............................................................................................................................................ 50
List of Figures.
Figure 1 ......................................................................................................................................... 16
Figure 2 ......................................................................................................................................................... 17
Figure 3 ......................................................................................................................................................... 18
Figure 4 ......................................................................................................................................................... 28
Figure 5 ......................................................................................................................................................... 33
Figure 6 ........................................................................................................................................................ 33
Figure 7 ......................................................................................................................................................... 34
Figure 8 ......................................................................................................................................................... 35
Figure 9 ......................................................................................................................................................... 36
Figure 10 ....................................................................................................................................................... 37
Figure 11 ....................................................................................................................................................... 37
Figure 12 ....................................................................................................................................................... 38
Figure 13 ....................................................................................................................................................... 38
Figure 14 ....................................................................................................................................................... 39
Figure 29 ....................................................................................................................................................... 45
CHAPTER ONE: INTRODUCTION.
Background
A crowd is something beyond a simple sum of individuals. It has collective characteristics
which can be described in general terms such as 'angry crowd' or 'peaceful crowd'. A crowd
can assume behaviors different from, and more complex than, those expected of its individual members.
Understanding crowd behavior helps in designing pedestrian facilities, in planning major layout
modifications to existing areas, and in the daily management of sites subject to crowd
traffic. Conventional manual measurement techniques are not suitable for comprehensive
collection of data on patterns of site occupation and movement. Real-time monitoring is
tedious and tiring, yet safety-critical.
When congestion (crowd density) exceeds a certain level, which depends on the collective
objective of the crowd and on the environment, danger may arise for a variety of reasons.
Physical pressure may result directly in injury to individuals or in the collapse of
parts of the physical environment.
Crowd density analysis can be used to measure the comfort level in public spaces or to
detect potentially dangerous situations. Models have been developed to estimate the
number of people in crowded scenes using computer vision techniques such as pixel-based
analysis, texture-based analysis, and object-level analysis.
An important and challenging problem related to the crowd phenomenon is crowd
simulation: the reproduction of realistic crowds using computer graphics techniques.
Crowd animations find application, for instance, in the evaluation of crowd management
techniques, such as simulating the flow of people leaving a football stadium after a match.
Study of methods used by human observers may help in the choice of image processing
algorithms likely to be useful in automatic assessment of crowd behavior.
Problem Statement
For high density crowds, a major difficulty is validation of the models it’s difficult to estimate the
actual density of real crowds. Human observers can be used but its time consuming and difficult
hence the advantage of being able to develop methods of automatically collecting this data by
the use of image processing techniques provided such methods can be adequately validated for
accuracy by comparison with manual observations or other alternatives they can be used as a
basis for deriving mathematical methods.
Main objectives
The main objective of this project is to use image processing techniques to estimate the size
of a crowd from a still photograph.
Specific Objectives.
1. Understand what a crowd is and how to combine various image processing
techniques and algorithms to estimate the size of a crowd in a photograph to an
accuracy of 10%.
2. Explore and become familiar with the different image processing techniques used to
analyze images.
3. Design software to analyze a photograph and extract the features that most likely
represent a person, then determine what the system needs to do to obtain the total
number of people.
4. Compare and contrast the results against already analyzed results to calculate the accuracy.
Project Scope
I. The crowd models should not involve actual counting of individuals or tracking of the
movement of individuals, but should be based on a collective description of crowds.
II. The images are analyzed within the confines of image processing techniques, with a clear
indication of the algorithms used and validation of the choices made.
Conclusion.
Having established what a crowd is, the next chapter explores image processing techniques
as well as how they have previously been applied in image analysis.
CHAPTER TWO: LITERATURE REVIEW.
INTRODUCTION.
This chapter describes and discusses research drawn from various sources such as
textbooks, articles and the internet. It consists of information which is vital to the
development of this project, focusing on the different image processing algorithms
which can be used in image enhancement; this assists in the realization of the project
objectives. For crowd density estimation, the human eye easily distinguishes a very dense
crowd from the background, that is, from surrounding buildings, road surfaces, and so on. This
idea can be applied quantitatively to computer-based density estimation: the image
pixels corresponding to the crowd can be separated from those of the background.
This can be done effectively using a reference image of the scene, obtained with no crowd
present, which is subtracted from the image under analysis.
Care must be taken that lighting conditions are similar and that there are no movable
objects in the scene, for instance vehicles or temporary billboards, which the computer
would not distinguish from people. We could use the following (a sketch of the subtraction
step appears after the list):
Edge detection
Optimal density estimate
Geometric distortion.
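As an illustration, here is a minimal MATLAB sketch of the reference-image subtraction described above. The file names, and the use of an Otsu threshold on the difference image, are assumptions for illustration rather than the method adopted later in this report.

% Minimal sketch of crowd/background separation by reference-image
% subtraction. 'scene.jpg' (crowd present) and 'reference.jpg' (empty
% scene) are hypothetical file names; both images are assumed to come
% from the same fixed camera under similar lighting.
scene = rgb2gray(imread('scene.jpg'));
ref   = rgb2gray(imread('reference.jpg'));
d     = imabsdiff(scene, ref);        % absolute difference image
mask  = im2bw(d, graythresh(d));      % Otsu threshold: 1 = crowd pixels
crowdPixels = sum(mask(:));           % simple proxy for crowd density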
Crowds can be characterized by considering four different aspects:
Image space domain
Sociological domain
Level of services
Computer graphics domain
Crowds are identified when the density of people is sufficiently large to prevent individual and
group identification. In the sociological domain, psychologists and sociologists have studied the
behavior of groups and people for many years. They have been mainly interested in the effects
that occur when people with the same goals become a single entity, named a crowd or
mass. In this case people can lose individuality and adopt the behavior of the crowd entity.
In the computer graphics domain, many researchers have proposed different models of crowd
simulation in recent years. Some of them achieve realistic crowd behaviors, but a model in which all
possible crowd behaviors can be simulated has not yet been achieved. In fact, a big challenge in crowd
simulation is knowledge about real crowds, meaning that many behaviors of real crowds have
not yet been observed or explained. Nevertheless, there are several aspects that describe crowd
behaviors, as indicated below:
Least effort hypothesis
People choose the least-effort route to reach their goals. This is closely related to
the trajectories of people in crowded scenes, since trajectories should be changed
minimally to avoid collisions with others while still obeying the least-effort hypothesis.
Lane formation
It takes less effort for people to follow immediately behind someone who is already
moving in their direction than to push their own way through a crowd.
Lane formation emerges as a consequence of the least-effort hypothesis: lanes arise
because people change trajectory whenever they encounter an entity moving in the opposite
direction. This action forms chains of entities walking in line, as we would expect.
Bottleneck effect
This describes a very familiar effect in which people change velocity as a function of both the
density of people and restrictions in the environment.
Image processing is an important component of modern technologies because humans depend
on visual information more than other creatures do. Images are easier for us to perceive than
any other form of information; it is commonly claimed that 99% of the information we take in
about the world is perceived with our eyes. Image processing has traditionally been an area of
the engineering community. The basic tools are Fourier analysis, with its long history, and
wavelet analysis, which has become popular in the engineering community since the 1980s.
In the past few decades several advanced mathematical approaches have been introduced into
this field, namely variational calculus, partial differential equations (PDEs) and stochastic
(statistical) approaches, and they have become important tools for theoretical image processing.
Image processing consists of converting an image into digital form and then performing
operations on it, such as extracting its content or the information in it. It is also used for object
recognition. A digital image is an array of square picture elements, or pixels, arranged in columns
and rows. There are color images, grayscale images and binary images. Color images can be
converted to grayscale in order to facilitate the extraction of information from the image.
A grayscale image is typically an 8-bit image, in which each pixel has an assigned intensity between 0
(black) and 255 (white).
A binary image is an image in which pixels can take only two values: black (0) or white (1). The most common image formats are GIF, JPEG, TIFF, PNG, PS, and PSD.
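As a simple illustration, the following MATLAB sketch converts a colour image to grayscale and then to binary; the demo image and the 0.5 threshold are assumed values for illustration only.

% Colour -> grayscale -> binary conversion.
rgbImage  = imread('peppers.png');   % a demo image shipped with MATLAB
grayImage = rgb2gray(rgbImage);      % intensities 0 (black) .. 255 (white)
binImage  = im2bw(grayImage, 0.5);   % each pixel becomes 0 (black) or 1 (white)
imshow(binImage);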
Image processing methods
There are two types of methods used for image processing:
- Analog image processing, or visual techniques of image processing: used for printouts and
photographs.
- Digital image processing: processing digital images using a computer. This technique
includes three phases: pre-processing, enhancement and display, and
information extraction. Let us briefly define each of these phases:
* Image pre-processing, or image restoration, consists of correcting the image for different
errors, noise and geometric distortions.
* Image enhancement improves the visual aspect of the image, after the correction of errors, to
facilitate the perception or interpretability of information in the image.
* Information extraction utilizes the computer’s decision-making capability to identify and
extract specific pieces of information or pixels.
The Spatial Domain
The spatial domain simply describes the conventional view of an image, namely a 2D
array of pixels. Enhancement in this domain generally involves manipulating pixels by
modifying the original pixel values of an image according to some predefined rule (this is
also known as a local or point process). Alternatively, pixel values may be combined with or compared
to other pixels in their immediate neighborhood in a range of different ways. In this section
the following techniques will be described:
1. Contrast manipulation
2. Histogram Equalization
3. Laplacian
4. Genetic Algorithm technique
Contrast Manipulation
What occurs in most systems is that once an image is captured and held in memory, the
individual pixels are then mapped (using a transformation equation) to a table (a look up table)
within the systems hardware. This look up table of pixel values is then sent to the systems
display device, where the image can be viewed. Thus the pixels of the original image that is
stored in memory does not get modified, and contrast manipulation occurs by reassigning the
pixel brightness levels for each pixel within the table.
One class of imaging transformations involves a non-linear relationship which has the property of
expanding one portion of the grayscale range while compressing another. Many of the common
transfer functions are variations on this idea (a MATLAB sketch of these mappings follows the list). They include:
a) A logarithmic or square-root function compresses the displayed brightness at the bright
end of the scale while expanding the dark end. This is useful for converting images
taken with a camera having a linear response to the more common logarithmic response.
b) An inverse-log or squared function does the opposite of the above.
c) An inverse function produces a negative image.
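A minimal MATLAB sketch of these point transforms, applied to an image normalised to [0,1] (the demo image and the scaling constants are assumptions for illustration):

% Non-linear grayscale transfer functions.
I = im2double(rgb2gray(imread('peppers.png')));
logMap  = log(1 + I) / log(2);  % (a) compresses bright end, expands dark end
sqrtMap = sqrt(I);              % (a) similar compressive behaviour
sqMap   = I.^2;                 % (b) the opposite: expands the bright end
negMap  = 1 - I;                % (c) negative image
imshow([logMap, sqMap, negMap]);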
Histogram Equalization
The histogram of an image represents the relative frequency of occurrence of grey levels within
an image. Histogram modeling techniques are used to modify the grayscale range and contrast
values of an image such that its intensity histogram fits a desired shape.
Histogram equalization is used to modify an input image's intensity histogram in order to obtain
an output image with a uniformly distributed histogram. The resultant effect is that the
output image is perceived to have optimal overall contrast (thus the image is enhanced).
The process of histogram equalization uses a transfer function which reassigns the
brightness values of display pixels based on the input image histogram. The process does not
affect the brightness ordering of individual pixels (they remain brighter or darker than other pixels);
it only shifts the brightness values so that an approximately equal number of pixels has each possible
brightness value. The process can be represented by the following simplified mathematical
equation (found in [1]):
K_j = ( Σ_{i=0}^{j} N_i ) / T …………. (1)

Equation (1) says that for each brightness level j in the original image, the new assigned
value K_j is equal to the sum of the numbers of pixels in the image with brightness equal to or less
than j (the N_i), divided by the total number of pixels T.
In more complicated cases, the image histogram may not be a good representation of the local
statistics in two separate parts of the image. In such a case histogram equalization may not
enhance the image well enough to represent the two areas, and another algorithm,
known as adaptive histogram equalization, is more appropriate. In this algorithm the image
is divided into several rectangular domains, and histogram equalization is applied
to each of them. Once this is completed, the brightness levels are modified to match
across the domain boundaries.
Adaptive histogram equalization gives better contrast over the different parts of an image. The
corresponding grey-scale histogram lacks the mid-levels present after global histogram
equalization, as a result of the high contrast level that is set.
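A minimal MATLAB sketch of both variants, assuming the Image Processing Toolbox is available:

% Global vs. adaptive histogram equalization.
I  = rgb2gray(imread('peppers.png'));
Ig = histeq(I);        % global: flattens the whole intensity histogram
Ia = adapthisteq(I);   % adaptive: equalizes rectangular tiles, then
                       % blends brightness across tile boundaries
imshow([I, Ig, Ia]);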
Laplacian
The Laplacian is an edge enhancing algorithm. It performs local, or neighbourhood equalization
of the brightness levels of image pixels. The result is that the output/displayed image shows an
increase in local contrast at the boundaries.
Shown below is a simple 3x3 Laplacian operator which is applied to a neighbourhood of pixels
-1 -1 -1
-1 +8 -1
-1 -1 -1
This operator is understood to mean that the central pixel's brightness value is multiplied by 8,
while the brightness values of all the surrounding pixels are subtracted from it. A
consequence of this is that in regions where the brightness values are uniform the operator
produces an output of zero: each pixel, when it becomes the central pixel, is suppressed in the
same way as its neighbours. Thus only points, edges or lines produce a response, since for them
the brightness levels are non-uniform within the neighbouring pixels (a large change in
brightness level usually indicates the presence of an edge). Hence the overall effect of this
operation is that the edges of an image are enhanced.
The output image produced by the Laplacian algorithm alone is not easily
interpretable, but subtracting the Laplacian image from the original image produces an
image which appears sharpened compared with the original.
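A minimal MATLAB sketch of the operator and of the sharpening step (the demo image is an assumption; note that fspecial('laplacian') uses a kernel with a negative centre, hence the subtraction):

% Laplacian edge enhancement and image sharpening.
I = im2double(rgb2gray(imread('peppers.png')));
k = [-1 -1 -1; -1 8 -1; -1 -1 -1];             % the 3x3 operator shown above
L = imfilter(I, k, 'replicate');               % responds only at points/edges/lines
Is = I - imfilter(I, fspecial('laplacian'), 'replicate');  % sharpened image
imshow([mat2gray(L), Is]);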
A Genetic Algorithm Technique
The Genetic Algorithm (GA) technique is essentially a searching strategy that is modelled around
the evolution of a population of individuals. GAs have generally been used to solve difficult
optimization problems in various fields. In terms of image processing, GAs have been used for
image compression, reconstruction, segmentation, and enhancement. Image enhancement of
grey-scale images using GAs and real-coded chromosomes can be explained as follows.
Real-coded GAs differ from the classical approach (which codes chromosomes as a binary string
of fixed length) by coding the chromosomes as a set of genes. Each real-valued gene then
represents a parameter of the optimization problem. This scheme has proved effective, since it
avoids the negative artefacts that generally plague the binary-string version due to the GA's
mutation process.
The GA model is used to partially automate the subjective evaluation of an image by a human
interpreter. What this means is that the algorithm will try to automatically identify targets within
an image and then enhance those targets. The initial image produced from the process does not
usually fit the demand for visual interpretation, thus some manual intervention (for adjusting
the contrast levels) is required during the process (hence the reason to call the procedure
partially automatic).
The chromosome coding scheme works with the features in an image that the GA has to evolve:
essentially the brightness level (intensity) of each pixel in a grey-scale image. The
maximum number of shades of grey (denoted Ng) is a constant determined
beforehand, but the chromosome only codes the intensity values of a small subset
of Ng (denoted ng), which is equivalent to the length of the chromosome. The figure below
demonstrates this:
Figure 1
The fitness of each chromosome (which represents an image) is defined as a subjective fitness
score between 0 and 10. It is here that human intervention occurs, since it is a human interpreter
who assigns the fitness of each individual chromosome. The interpreter may set the fitness
based on the brightness and enhancement of certain areas of an image; as a result,
the final output image is enhanced in the manner the user desires. This operation would
be too labour-intensive if every chromosome were considered, so a method is used
whereby the human interpreter only looks at a subset of the chromosomes from the total
population of size N.
The selection mechanism uses an elitist approach so that the best chromosome perpetuates
through subsequent generations. The remaining chromosomes go through a procedure known
as tournament selection: random chromosomes in the mating population are paired together,
and the fitter chromosome of each pair is picked. This procedure is repeated for a certain
number of steps, with the remaining chromosomes mating.
The crossover operation aims at a better mix of the genetic material from the parental
chromosomes. Here a crossover operator known as Gaussian uniform crossover is used;
it gives a better mixing of the genes and results in better contrast in the evolved images.
Comparison of Competing Techniques
Table 1 below summarises the properties of some of the image enhancement techniques discussed.

Technique                         Property
Histogram Equalization            Adjusts the global contrast of an image by redistributing the pixel
                                  intensity levels across the image; the entire image is enhanced.
Adaptive Histogram Equalization   Used for local enhancement of a region within an image; an
                                  extension of histogram equalization.
Laplacian                         Used mostly for edge enhancement.
Genetic Algorithms                Used for local enhancement of an image, with minimal human interaction.
Figure 2
Below is a flow chart illustrating the image processing pipeline:

Figure 3
(Flow chart: image acquisition (electron-solid interaction; detection of elastically and
inelastically scattered electrons), analog-to-digital conversion, grey-level image, image
enhancement, image segmentation, image measurement, binary image, data.)

There is of course some need to manage crowds in open areas, for instance refugees
congregating in rural areas, but the lack of a built environment of constraining buildings makes
the problem rather different, and more related to the distribution of food and medical
assistance.
Detection of stationary crowds.
The up-and-down oscillatory head movements of individuals walking in a freely flowing crowd
stop when the crowd becomes too dense for free movement. One approach is to compute the
two-dimensional Discrete Fourier Transform (DFT) of each image in a time sequence, followed
by a measurement of temporal changes in the resulting magnitude and/or phase spectra.
However, this approach has two main disadvantages:
a) the DFT of a single image is related to local changes of intensity and not to temporal
inter-frame properties;
b) it involves a high computational and memory cost.
A more effective method is to isolate motion properties in the image sequence through a data-
reducing coding mechanism such as the Discrete Cosine Transform (DCT), whose form for a
one-dimensional image f(x,t) of N elements is given by

g(t) = Σ_{x=0}^{N-1} f(x,t) · cos(2πkx),

where k is a constant derived from the maximum signal frequency, the Nyquist criterion and the
maximum expected motion to be observed. This transform associates sinusoids with the time-
varying parts of an image; these oscillations can be detected by applying the DFT to g(t). For
one-dimensional images of N pixels, the DCT calculates a single value for each image in a time
sequence (a data reduction of N to 1), significantly reducing the computational cost of the DFT.
As we are only concerned here with detecting movement in the vertical direction, the DCT of a
2-D image sequence is given by

g(t) = Σ_{y=0}^{N-1} Σ_{x=0}^{N-1} I(x,y,t) · cos(2πky),

where I(x,y,t) represents each image of size N×N in the sequence.
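A loose MATLAB sketch of this vertical-motion signature under the stated assumptions; the frame stack, the choice of k, and the normalisation of the cosine argument by the image height are all assumptions for illustration:

% DCT-style motion signature g(t) for a stack of frames.
% `frames` is assumed to be an N-by-N-by-T grayscale image stack and `k`
% a spatial frequency chosen from the Nyquist criterion and the expected
% head motion; the cosine argument is normalised by N here.
[N, ~, T] = size(frames);
y = (0:N-1)';                       % vertical pixel coordinate
w = cos(2*pi*k*y/N);                % vertical cosine weighting
g = zeros(1, T);
for t = 1:T
    I = double(frames(:, :, t));
    g(t) = sum(w .* sum(I, 2));     % sum over x, then weighted sum over y
end
G = abs(fft(g));                    % DFT of g(t): peaks indicate oscillation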
There are different processes involved in the estimation:
Optimal density estimate.
Each of the measurement techniques can be approximated by

z = mx + b,    y = (z - b) = mx,

where z is the number of pixels after segmentation (the number of non-background pixels, or of
thinned edge pixels, as discussed later), x is the number of people, and m and b
are coefficients obtained from the experimental data by linear regression.
It is thus possible to combine these two measurements into an optimal estimate of crowd
density through a linear filter. A simple dynamic model is used which assumes a constant
pedestrian density from image to image; the variation is modeled as zero-mean process noise.
x_{k+1} = x_k + v

[ y_e ]   [ z_e - b_e ]   [ m_e ]        [ w_e ]
[     ] = [           ] = [     ] x_k +  [     ]
[ y_b ]   [ z_b - b_b ]   [ m_b ]        [ w_b ]
Here x represents the number of people, k the sample image number, and v the process noise
obtained from the variation of x from image to image; z is the number of crowd pixels, m and b
are the linear fit coefficients, and w is the measurement noise of each technique. Subscripts e
and b denote thinned edges and background removal respectively.
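A simplified MATLAB sketch of the calibration and of a static inverse-variance combination of the two corrected measurements; the calibration vectors are assumptions, and this static blend stands in for the full linear filter above:

% Calibrate z = m*x + b for each technique, then combine estimates.
% `x` (true people counts), `ze` and `zb` (thinned-edge and background-
% removal pixel counts) are assumed calibration vectors of equal length.
pe = polyfit(x, ze, 1);  me = pe(1);  be = pe(2);   % linear regression
pb = polyfit(x, zb, 1);  mb = pb(1);  bb = pb(2);
xe = (ze - be) / me;                 % per-technique people estimates
xb = (zb - bb) / mb;
ve = var(xe - x);  vb = var(xb - x); % residual variances (noise w)
xHat = (xe/ve + xb/vb) / (1/ve + 1/vb);  % inverse-variance weighted blend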
Geometric distortion.
The other methods suffer from near-far effects, where people near the camera occupy
more image area than people of the same size far from the camera. This is exacerbated if standard
closed-circuit security camera installations are used for data capture, because such cameras are
commonly mounted at low angles, either to obtain a longer field of view for human monitoring or
because they are installed in locations with low ceilings, for instance tunnels or platforms in
subway areas.
Compensation for the near-far effect can be partially achieved by applying a geometric distortion
to the image. Such distortion is commonly used for special video effects in the television
entertainment and advertising business, and special hardware is available for rapid processing.
The technique also has applications in video data compression, and so the theory and practice
are well developed.
Image processing algorithms can be divided into five major groups:
i. Grey-level segmentation or Threshold method.
ii. Edge- Detection Techniques.
iii. Digital morphology.
iv. Texture.
v. Thinning and skeletonization Algorithms.
Grey-level
Thresholding is a conversion from a grey-level image to a bi-level image (a monochrome
image composed only of black and white pixels). The bi-level image should retain the most
essential information of the original, that is, the number, position, and shape of objects. Most
of the time pixels with similar grey levels belong to the same object, so classifying the image
by grey-scale pixel values may reduce and simplify some image processing operations such as
pattern recognition and classification.
The most basic thresholding is the selection of a single threshold value (though this is difficult
because of noise and illumination effects). All grey levels below this value become black (0) and
those above become white (1). A simple remedy is to use the mean grey level, so that half of the
pixels become white and the other half black.
The p-tile method is another easy way to find the threshold. It uses the histogram of grey levels
in the image together with the desired percentage of black pixels:

Number of black pixels = % × total number of pixels.

One then simply counts the pixels in the histogram bins starting at bin 0 until the count is greater
than or equal to the desired number of black pixels. The threshold is the grey level associated
with the last bin counted. The advantage is that the percentage of black pixels desired can be chosen.
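A minimal MATLAB sketch of the p-tile threshold; the demo image and the 30% figure are assumed illustrative values:

% P-tile thresholding.
I = rgb2gray(imread('peppers.png'));
p = 0.30;                                   % desired fraction of black pixels
counts = imhist(I);                         % 256-bin grey-level histogram
target = p * numel(I);                      % desired number of black pixels
T = find(cumsum(counts) >= target, 1) - 1;  % grey level of last bin counted
BW = I > T;                                 % at or below T -> black, above -> white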
The edge pixel method produces a threshold based on the digital Laplacian, which is a non-directional
edge detection operator. The histogram of the original image is computed considering only those
pixels having large Laplacians; the threshold is then computed from this new histogram.
The iterative method refines an initial guess at a threshold by consecutive passes through the
image. The image is repeatedly thresholded into object and background, using the mean levels in
each to improve the value of T: the mean grey level of all pixels below the threshold is found as
Tb, and likewise the mean of all pixels above the threshold as To. The new estimate is then

T = (Tb + To) / 2.

The process is repeated with the new threshold, and it stops when two consecutive passes
through the image produce no change in the threshold.
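A minimal MATLAB sketch of this iterative refinement (the demo image is an assumption):

% Iterative threshold selection.
I = double(rgb2gray(imread('peppers.png')));
T = mean(I(:));                     % initial guess
Tprev = -1;
while T ~= Tprev
    Tprev = T;
    Tb = mean(I(I <  T));           % mean grey level below the threshold
    To = mean(I(I >= T));           % mean grey level above the threshold
    T  = round((Tb + To) / 2);      % new estimate (rounded so the loop terminates)
end
BW = I >= T;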
The fuzzy sets method. An element x belongs to a set S with a particular membership grade u(x).
Thresholding an image then means classifying each pixel as belonging either to the set of
background pixels or to the set of object pixels.
In a nutshell, the above methods assume that the pixels of the object and the pixels of the
background do not overlap; in real life this is not true. Nevertheless, all that is needed is
for the two classes of pixels not to overlap within each of a set of regions that collectively form
the image. The first step is to determine how many independent regions form the image and
their sizes. There is then a threshold value per region, and one must make sure each region has
both background and object pixels.
A good example of such a thresholding algorithm is the one proposed by Chow and Kaneko in 1972.
This method finds a bimodal histogram for each region; the histogram is intended to have two
classes of pixels (object and background).
Edge Detection.
An edge is the boundary between an object and its background. If the edges of an image can be
identified with precision, all the objects can be identified and their area, perimeter, shape,
et cetera can be calculated.
There are terms in edge detection that are vital to our understanding. One is edge enhancement,
which increases the contrast between the edges and the background so that the edges become
more prominent. Edge tracing is the process of following the edges, usually collecting the edge
pixels into a list.
Basically, edge detection here means measuring the total perimeter of all the regions occupied by
people. For low-density crowds, this can be expected to give a measure of density, although errors
are inevitable as numbers increase, because of occlusion and overlapping of individuals. The process
can be refined further by thinning the edge images to minimize the effects of varying edge
thickness.
There are different types of edge detection;
a) Gradient edge detection.
b) Kirsch edge detection.
c) Sobel edge detection.
d) Canny edge detection.
e) ISEF edge detection.
A model of an edge can be ideally represented by the step edge, which is simply a change in grey
level occurring at exactly one location. The step edge is an ideal model, because in a real image a
grey-level change never occurs perfectly at a single location, owing to noise and illumination
disturbances. Moreover, due to digitization it is unlikely that the image will be sampled in such a
way that all edges happen to correspond exactly with a pixel boundary; the change in grey level may
extend across several pixels. The actual position of the edge is considered to be the center of
the ramp connecting the low grey level to the high grey level. This is called the ramp edge.
The Sobel edge detector is a template-based edge detector that uses templates in the form of
convolution masks:

Sx = [ -1  0  1        Sy = [ -1 -2 -1
       -2  0  2                0  0  0
       -1  0  1 ]              1  2  1 ]
These templates are an approximation to the gradient at the pixel in the center of the template.
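A minimal MATLAB sketch of applying the Sobel templates and combining the two responses into a gradient magnitude (the demo image is an assumption):

% Sobel template convolution.
I  = im2double(rgb2gray(imread('peppers.png')));
Sx = [-1 0 1; -2 0 2; -1 0 1];
Sy = [-1 -2 -1; 0 0 0; 1 2 1];
Gx = conv2(I, Sx, 'same');          % horizontal gradient estimate
Gy = conv2(I, Sy, 'same');          % vertical gradient estimate
G  = sqrt(Gx.^2 + Gy.^2);           % gradient magnitude (edge strength)
imshow(mat2gray(G));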
Another edge detection technique is the Kirsch edge detector. The masks given by its templates
try to model the kind of grey-level change seen near edges of various orientations; there
is one mask for each of the eight compass directions. For instance, K0 (shown below) implies a
vertical edge (horizontal gradient) at the pixel corresponding to the center of the mask. To find
the edges, the image I is convolved with the eight masks at each pixel position. The response is
the maximum of the responses of the eight masks, and the direction is quantized into eight
possibilities (i·π/4).
The Kirsch mask K0 referred to above is:

K0 = [ -3 -3  5
       -3  0  5
       -3 -3  5 ]

Two advanced and optimized edge detectors are the Canny edge detector and the Infinite
Symmetric Exponential Filter (ISEF). Both are classified as mathematical edge detectors.
According to John Canny, an optimal edge detector needs to satisfy these three conditions:
• The edge detector should respond only to edges, and should find all of them; no edges should
be missed.
• The distance between the edge pixels as found by the edge detector and the actual edge
should be as small as possible.
• The edge detector should not identify multiple edge pixels where only a single edge exists.
Canny edge detection algorithm (a toolbox call is sketched after the steps):
1. Read the image I.
2. Convolve a 1D Gaussian mask G with I.
3. Create a 1D mask for the first derivative of the Gaussian in the x and y directions (Gx and Gy).
4. Convolve I with G along the rows to obtain Ix, and down the columns to obtain Iy.
5. Convolve Ix with Gx to obtain Ix', and Iy with Gy to obtain Iy'.
6. Find the magnitude of the result at each pixel (x, y).
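In MATLAB the whole pipeline above is available as a single toolbox call; the hysteresis thresholds and Gaussian sigma here are illustrative assumptions:

% Canny edge detection via the Image Processing Toolbox.
I  = rgb2gray(imread('peppers.png'));
BW = edge(I, 'canny', [0.05 0.2], 1.5);  % [low high] thresholds, sigma
imshow(BW);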
Digital morphology.
The concept of digital morphology is based on the fact that images consist of a set of picture
elements called pixels that collect into groups having a two-dimensional structure called a shape.
A group of mathematical operations can be applied to the set of pixels to enhance or highlight
specific aspects of the shape so that objects can be counted or recognized.
This part of image processing deals with image filtering and with geometric analysis using
structuring elements. Erosion, the elimination of a set of pixels having a given pattern
(the structuring element), and dilation, the addition of a given pattern to a small area, are the basic
morphological operations. Binary morphological operations are defined on bi-level images.
In general, an operator is defined as a set of black pixels with a specific location for each of its
pixels, given by the pixel's row and column indices. Mathematically, a pixel is thought of as a point in
two-dimensional space.
Combinations of the simplest morphological operations, dilation and erosion, result in two very
helpful image processing operations, called opening and closing.
Opening: the application of an erosion immediately followed by a dilation using the same structuring
element. This binary operation tends to open small gaps between touching objects in an image.
After an opening, objects are better isolated and may be counted or classified. A practical
application of opening is removing noise, for instance after thresholding an image.
Closing: the application of a dilation immediately followed by an erosion using the same structuring
element. The closing operation closes, or fills, the gaps between objects.
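A minimal MATLAB sketch contrasting erosion, opening and closing on a thresholded image (the demo image and the square structuring element are assumed choices):

% Basic binary morphology.
I  = rgb2gray(imread('peppers.png'));
BW = im2bw(I, graythresh(I));
se = strel('square', 3);
E  = imerode(BW, se);   % shrinks foreground regions
O  = imopen(BW, se);    % erosion then dilation: opens gaps, removes noise
C  = imclose(BW, se);   % dilation then erosion: fills gaps between objects
imshow([E, O, C]);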
Texture.
The repetition of a pattern or patterns over a region is called texture. The pattern may be
repeated exactly, or with small variations. Texture has an inherently random aspect: the size,
shape, color, and orientation of the elements of the pattern (textons) can vary.
The main goal of identifying different textures in machine vision is to replace each by a unique
grey level or color. In addition, there is another problem associated with texture: scaling. Equal
textures at different scales may look different to an image processing algorithm. For that
reason it is unlikely that a single simple operation will allow the segmentation of textured regions,
but some combination of binary operations may produce an acceptable output for a wide range of
textures.
The simplest way to perform texture segmentation in grey-level images is to let the grey level
associated with each pixel in a textured region be the average (mean) level over some
relatively small area. This area is called a window, and it can vary in size to capture different scales.
The use of windows is very convenient here, since texture is a property of regions rather than of
individual pixels.
The method can be stated as follows (a sketch is given after the steps):
1. For each pixel in the image, replace it by the average of the levels seen in a region W × W
pixels in size centered at that pixel.
2. Threshold the image into two regions using the new average levels. The exact location of the
boundary between regions depends on the threshold method that is applied.
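A minimal MATLAB sketch of this window-averaging segmentation; the window size and the Otsu threshold are assumed choices (an improved variant would use stdfilt instead of the mean, as described next):

% Texture segmentation by windowed mean grey level.
I = im2double(rgb2gray(imread('peppers.png')));
W = 15;                              % window size (assumed)
M = conv2(I, ones(W)/W^2, 'same');   % mean grey level over a WxW window
BW = im2bw(M, graythresh(M));        % threshold the averaged image
imshow(BW);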
An improved method uses the standard deviation of the grey levels in a small region instead of
the mean. The standard deviation carries information about how many pixels in the region
belong to textons and how many belong to the background. The precision with which the
boundary between regions is known is a function of the window size.
A texture is a combination of a large number of textons. Isolating textons and treating them as
individual objects is feasible. Once this is done, it is possible to locate the edges
that result from the grey-level transitions along texton boundaries. By analyzing edge
properties such as common directions, the distances over which the edge pixels repeat, or a measure
of local density, the texture can be characterized.
The number of edge pixels in a window can be found by applying an edge detector to that
window. The density is then calculated by dividing the number of edge pixels found by the area of
the window. From here, useful information can be extracted: for instance the edge direction, the
mean x and y components of the gradient at the edges, and the relative number of pixels whose
principal direction is x or y.
The combination of edge enhancement and co-occurrence is a clever solution (Dyer 1980,
Davis 1981) that can be used in grey-level texture segmentation. Computing the co-occurrence
matrix of an edge-enhanced image gives better results than the traditional method.
Thinning and skeletonization algorithms.
Skeletonization was introduced to describe the global properties of objects and to reduce the
original image to a more compact representation. The skeleton expresses the structural
connectivity of the main components of an object, and in the discrete case it has a width of one
pixel. These techniques have a wide range of applications; for example, skeletonization has been
applied successfully to character recognition problems.
A basic method for skeletonization is thinning, an iterative technique which extracts the
skeleton of an object. In every iteration, the edge pixels having at least one adjacent
background point are deleted; a pixel can be eroded only if its removal does not affect
the topology of the object. The skeleton represents the shape of the object in a relatively small
number of pixels.
Thinning works for objects consisting of lines (straight or curved); it does not work for objects
whose shapes enclose a large area. Thinning is most of the time an intermediate process,
preparing the object for further analysis, and the subsequent processing determines the
properties of the skeleton. For the same object, a skeleton may work fine in one situation but
may not work in all situations.
The first definition of the skeleton was given by Blum in 1967, who defined the medial axis
function (MAF). The medial axis is defined as the set of points which are the centers of the largest
circles that can be contained inside the shape. To represent a shape as a skeleton and still be
able to reconstruct the original shape, a radius function can be associated with the skeleton
points. The MAF of a shape is the locus of the centers of all maximal discs contained in the
shape, where a maximal disc is any circle, together with its interior, that is contained in the
shape. The MAF is a reversible transform: it can be inverted to give back the original image.
The MAF in its original form is expensive in time and space and very difficult to implement
directly; for that reason the continuous transform is converted to a discrete one.
A good approximation of the MAF on a sampled grid is obtained by first computing the distance
from each object pixel to the nearest boundary pixel, and then calculating the Laplacian of the
distance image: pixels with large values belong to the medial axis. The way the distance
between the object pixels and the boundary is measured influences the final result
(skeleton).
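A minimal MATLAB sketch of thinning-based skeletonization and of a distance-based medial-axis approximation (the demo image and the Laplacian cut-off are assumed values):

% Skeleton by thinning, and a discrete medial-axis approximation.
I  = rgb2gray(imread('peppers.png'));
BW = im2bw(I, graythresh(I));
sk = bwmorph(BW, 'thin', Inf);    % iterative thinning to 1-pixel width
D  = bwdist(~BW);                 % distance of object pixels to the boundary
MA = (del2(D) < -0.25) & BW;      % large-magnitude Laplacian -> medial axis
imshow([sk, MA]);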
Hough Transform
The Hough transform is a mapping algorithm that transforms data from a Cartesian coordinate
space into a polar parameter space. It is most useful for finding geometric lines and shapes in
binary images. In the simplest realization, line detection, we want to map all collections of points
in a binary input image to a single accumulator value that describes a single line in the original
space. The Hough transform is a basic image processing method for finding global relationships
between pixels: it can tell us whether pixels lie on a curve of a specified shape. Line detection in
a binary image is the simplest application of the Hough transform (a sketch is given below).
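A minimal MATLAB sketch of line detection with the toolbox's Hough functions; the demo image, the edge detector and the number of peaks are assumed choices:

% Line detection with the Hough transform.
I  = rgb2gray(imread('peppers.png'));
BW = edge(I, 'canny');
[H, theta, rho] = hough(BW);                % accumulate votes in (rho, theta)
peaks = houghpeaks(H, 5);                   % five strongest accumulator cells
lines = houghlines(BW, theta, rho, peaks);  % map peaks back to line segments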
Summary
The key purpose of this chapter was to find a possible design approach based on the
literature reviewed. First, we discussed the different image processing techniques and how they
have previously been applied in image enhancement, including different algorithms for specific
techniques and their results. Then the basic strategies of digital image recognition were
discussed. Finally, we discussed the Hough transform, which will be used in the design of our
crowd estimation system.
CHAPTER THREE: DESIGN
Introduction.
In this section the planned development of the project is discussed. The working
conditions of our crowd estimation program are set up, and the development stages are
specified. The algorithms used in the project are then discussed in detail.
Design Stages
Stages of development
The development of the crowd size estimation program was divided into four
main stages according to our computation strategy block diagram below. Each main
stage was then broken into sub-steps according to the algorithms used. Every sub-step was
planned as a distinct algorithm which could be written and tested separately, then
incorporated into the main project.
Figure 4
(Block diagram: Image Acquisition; Image Enhancement (Segmentation via a Sobel filter;
Morphological Operations: defining a structuring element, erosion, opening; Edge Detection
via Canny); Feature Extraction via the Hough circle transform; Counter; Output via a GUI display.)
1) Image Acquisition.
At this stage we read the image from a digital camera or phone camera. The angle of
capture could be aerial, oblique, or a direct frontal view.
2) Image Enhancement.
We carry out image processing techniques at this stage. They include:
o Segmentation
o Morphological Operations
o Edge Detection.
Segmentation.
Segmentation divides an image into its various components for object recognition. The read
image is first converted to grayscale, and the object pixels within the image are separated from
the background pixels. Grey-level thresholding techniques separate an object from the
background based upon the grey-level histogram of the image; here we use the grey-level
discontinuities within the image, which are then used to separate objects from the background.
We use the gradient magnitude as the segmentation function. A popular operator is
the Sobel operator, which creates an image that emphasizes edges and transitions. Thus we use
the Sobel edge masks, imfilter, and some simple arithmetic to compute the gradient magnitude,
which is high at the borders of the objects and mostly low inside them (a sketch follows).
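A minimal MATLAB sketch of this gradient-magnitude step; the file name is a hypothetical placeholder for the crowd photograph:

% Gradient-magnitude segmentation function using Sobel masks and imfilter.
I  = imread('crowd.jpg');                    % hypothetical input photograph
Ig = double(rgb2gray(I));
hy = fspecial('sobel');                      % Sobel mask (horizontal edges)
hx = hy';                                    % transposed mask (vertical edges)
Iy = imfilter(Ig, hy, 'replicate');
Ix = imfilter(Ig, hx, 'replicate');
gradmag = sqrt(Ix.^2 + Iy.^2);               % high at object borders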
Morphological Operations.
We then perform the following morphological operations on the segmented image.
Define a circular structuring element ‘disk’
Erosion
Opening
Defining a structuring element.
A structuring element is a second set of pixels, with a particular shape, that acts on the pixels of the
image to produce an expected result. We choose a structuring element of the same size and
shape as the objects we want to process in the input image; in our case we use a circular
structuring element to let us identify the circular shape of the heads.
Morphological image processing is a collection of non-linear operations related to the shape, or
morphology, of features in an image. According to Wikipedia, morphological operations rely only
on the relative ordering of pixel values, not on their numerical values, and are therefore
especially suited to the processing of binary images. Morphological operations can also be
applied to greyscale images whose light transfer functions are unknown, and whose
absolute pixel values are therefore of no or minor interest.
Morphological techniques probe an image with a small shape or template called a structuring
element. The structuring element is positioned at all possible locations in the image and
compared with the corresponding neighbourhood of pixels. Some operations test whether the
element "fits" within the neighbourhood, while others test whether it "hits" or intersects the
neighbourhood. It is this that determines the precise effect of each morphological operator on
the image.
The structuring element is sometimes called the kernel, but we reserve that term for the similar
objects used in convolutions. The structuring element consists of a pattern specified as the
coordinates of a number of discrete points relative to some origin.
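For illustration, the circular structuring element used in this project can be defined as below;
the radius of 20 pixels follows the appendix code:

se = strel('disk', 20);    % flat, disk-shaped structuring element of radius 20
nhood = getnhood(se);      % the pattern of discrete points that makes up the element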
Erosion.
When a binary image is eroded, the resultant image keeps a foreground pixel only at those
origin positions where the structuring element, centred there, fits entirely within the object.
Erosion combines two sets using vector subtraction of set elements and is the dual operator of
dilation, as explained in the literature review. It is used here to shrink the heads for easier
identification.
The basic effect of the operator on a binary image is to erode away the boundaries of regions of
foreground pixels (i.e. white pixels, typically). Areas of foreground pixels thus shrink in size, and
holes within those areas become larger. This emphasizes the circular form of the heads for easy
identification and analysis.
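A minimal sketch of the erosion step, assuming the grayscale image imgray and the structuring
element se defined above:

Ie = imerode(imgray, se);   % bright regions shrink by roughly the radius of se
figure, imshow(Ie);         % the heads remain as compact bright blobs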
Opening.
Opening is the application of an erosion immediately followed by a dilation of the eroded image
using the same structuring element. It smooths the outlines of objects after digitization, breaks
thin connections and eliminates thin protrusions. Morphological opening can be used to remove
small objects from an image while preserving the shape and size of the larger objects in the
image. I therefore chose this operation to improve the quality of the eroded image.
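A sketch of this stage, assuming imgray, se and the eroded image Ie from the steps above. Note
that the appendix code uses opening-by-reconstruction rather than a plain opening, which
restores the shapes of the objects that survive the erosion:

Io   = imopen(imgray, se);          % plain opening: erosion followed by dilation
Iobr = imreconstruct(Ie, imgray);   % opening-by-reconstruction, as in the appendix code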
Edge Detection.
An edge is the boundary between an object and its background. If the edges in an image can be
identified with precision, all the objects can be located and their area, perimeter, shape and so
on can be calculated. We decided to use Canny edge detection because it detects strong edges
and also finds weak edges that are connected to strong edges. As edge detection is a
fundamental step in computer vision, it is necessary to pick out the true edges to get the best
results from the matching process; this is why the edge detector must be chosen carefully.
Canny first smooths the image with a Gaussian filter, which makes the detection robust against
errors introduced by noise. It then improves the signal-to-noise ratio by non-maxima
suppression, which produces one-pixel-wide ridges as output. Finally, hysteresis thresholding
gives better detection of edges in noisy conditions. Its main disadvantages are the gradient
computation required to generate the angle of suppression and the time consumed by this
complex computation.
Nevertheless it worked well with our image in defining the circular edges of the heads.
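A sketch of the Canny step, using the hysteresis thresholds from the appendix code:

edgeim = edge(Iobr, 'canny', [0.15 0.2]);   % weak threshold 0.15, strong threshold 0.2
figure, imshow(edgeim);                     % one-pixel-wide edges outlining the heads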
3) Feature Extraction.
Features are inherent properties of data, independent of coordinate frames. In this context we
want to extract heads, to be counted in estimating the crowd. I therefore incorporated the
Hough transform to identify and mark the heads in the form of circles.
The Hough transform is a mapping algorithm that maps data from the Cartesian coordinate
space of the image into a parameter space.
It is most useful for finding geometric lines and shapes in binary images. The Hough transform is
a technique which can be used to isolate features of a particular shape within an image. Because
it requires that the desired features be specified in some parametric form, the classical Hough
transform is most commonly used for the detection of regular curves such as lines, circles,
ellipses, etc. A generalized Hough transform can be employed in applications where a simple
analytic description of a feature(s) is not possible. Due to the computational complexity of the
generalized Hough algorithm, we restrict the main focus of this discussion to the classical Hough
transform. Despite its domain restrictions, the classical Hough transform retains many
applications, as most manufactured parts (and many anatomical parts investigated in medical
imagery) contain feature boundaries which can be described by regular curves. The main
advantage of the Hough transform technique is that it is tolerant of gaps in feature boundary
descriptions and is relatively unaffected by image noise.
Hence it is best suited for our application of identifying heads and counting them.
A circle can be described completely with three pieces of information: the centre (a, b) and the
radius R. (The centre consists of two coordinates, hence a total of three.)
x = a + R cos θ
y = b + R sin θ
As θ varies from 0° to 360°, a complete circle of radius R is generated. With the circle Hough
transform, we therefore expect to find triplets (a, b, R) that are highly probable circles in the
image. That is, we want to find three parameters, so the parameter space is three-dimensional.
To begin, assume we are looking for circles of one particular radius, that is, R is known. The
equation of each circle is:
x = a + R cos θ
y = b + R sin θ
Every point in the xy space is then equivalent to a circle in the ab space (R is not a parameter,
since we already know it). This is because, on rearranging the equations for a particular point
(x1, y1), we get:
a = x1 – R cos θ
b = y1 – R sin θ
where θ sweeps from 0 to 360 degrees.
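In MATLAB this search of the (a, b, R) parameter space is carried out by imfindcircles. The
sketch below uses the radius range, sensitivity and edge threshold from the appendix code:

[centers, radii] = imfindcircles(edgeim, [5 10], ...
    'Sensitivity', 0.92, 'EdgeThreshold', 0.03);   % look for radii of 5 to 10 pixels
viscircles(centers, radii, 'EdgeColor', 'b');      % mark each detected head in blue
numCircles = size(centers, 1);                     % one (a, b) centre row per circle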
4) Counter Output.
From our block diagram I have defined the channel for displaying the total number of detected
circles to be a graphic user interface. Below is a simple flow diagram of the design of the graphic
user interface.
Figure 5 Flow diagram of the graphic user interface (Start, then Load Image; a yes/no check either raises an Error Message and terminates, or passes the image on to the optional processes).
Figure 6 Graphic user interface 1
Figure 7 Graphic User Interface in operation
“Graphic user interface 1” shows the blank user interface before loading the image, while
“Graphic User Interface in operation” shows the user interface after loading the image, with the
total estimated crowd size displayed in a separate window.
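A sketch of how that separate window is produced, following the appendix code (insertText is
from the Computer Vision System Toolbox):

h = size(centers, 1);                  % number of detected circles
Total = ones(240, 320);                % blank white canvas for the counter window
HDText = insertText(Total, [205 55], h, 'AnchorPoint', 'LeftCenter');
figure, imshow(HDText);                % show the count in its own window
text(25, 55, 'TOTAL : ', 'FontSize', 12, 'FontWeight', 'bold', 'Color', 'm');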
Figure 8 Design stage of graphic user interface.
The figure above shows the design of the graphic user interface, with separate axes for displaying the
results of the various image processing techniques applied to the image. In addition, I incorporated
pushbuttons and pop-up menus to provide a wide variety of user options.
Conclusion. The graphic user interface was successfully designed and was functional. The code for this
program is given in the appendix of this report.
CHAPTER FOUR: IMPLEMENTATION.
Introduction.
Having designed the program as well as the graphic user interface, I implemented them on 15
crowd pictures and tabulated how each image behaved under the various image processing
techniques, together with the accuracy results. First, taking the photograph below as an
illustration of the design implementation, the results are indicated below.
Figure 9 Original RGB Image
The image above is the one put through the various image processing techniques developed in
the design stage. It is a picture of a crowd at a concert, taken from an angled aerial point of
view.
As the picture shows, the “heads” in the foreground are bigger than those in the background of
the image. I estimated a radius range in the Hough transform to accommodate the varying
sizes. Even so, some “heads” were not counted, and the program counted some hats as heads.
Nevertheless I was able to achieve a good level of accuracy, as indicated in the pages that
follow.
Below is the systematic sequence of output pictures as the program was running. The process
applied is indicated below each image for easier understanding.
Figure 10 Grayscale Image
Figure 11 Opening-Reconstructed Image
Figure 12 Edge Detected Image
Figure 13 Hough Circle Transformed Image
Figure 14 Total of circles.
As indicated above, the total number of heads counted by the program is 776. I counted the
number of people in the picture manually and came up with an estimate of 1,100 people.
Hence the percentage accuracy was
(776/1100) × 100 ≈ 70.5%
However, not all the images could achieve this level of accuracy, owing to their characteristics.
For instance, some were taken from a side view, so arms and legs were counted too and the
program total exceeded the actual number of people in the picture. In addition, I eliminated
some image processing techniques and added others for certain images, to establish which
combination of processes works for each individual image.
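As a small sketch of how the accuracies were computed, using the manual and program totals
tabulated later in this chapter:

manual  = [1200 57 84 168 20 170 96 1500 80 100 40 110 65 50 1100];  % hand counts
program = [782 43 62 83 17 85 42 632 15 52 12 48 28 23 776];         % program counts
accuracy = round(program./manual*1000)/10   % percentage accuracy to one decimal place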
Below are the other 14 crowd images I worked with; together with the original image above, they make up the 15 test pictures:
Figure 15 Crowd 1
Figure 16 Crowd 2
Figure 17 Crowd 3
Figure 18 Crowd 4
Figure 19 Crowd 5
Figure 20 Crowd 6
Figure 21 Crowd 7
Figure 22 Crowd 8
Figure 23 Crowd 9
Figure 24 Crowd 10
Figure 25 Crowd 11
Figure 26 Crowd 12
Figure 27 Crowd 13
Figure 28 Crowd 14
Below is a comprehensive table of the behaviour of the above images under different combinations of
image processing techniques. The numbered columns represent the images above, the final column is
the original image, and an x marks each process undertaken.

Techniques      1    2    3    4    5    6    7    8    9    10   11   12   13   14   Original
Morphology      x    x    x    x    x    x    x    x    x    x    x    x
Segmentation    x    x    x    x    x    x    x    x    x    x    x    x    x    x    x
Erosion         x    x    x    x    x    x    x    x    x    x    x    x    x
Dilation        x    x
Opening         x    x    x    x    x    x    x    x    x
Closing         x    x    x    x    x    x    x    x    x
Edge Detection  x    x    x    x    x    x    x    x    x    x    x    x    x    x    x
Manual total    1200 57   84   168  20   170  96   1500 80   100  40   110  65   50   1100
Program total   782  43   62   83   17   85   42   632  15   52   12   48   28   23   776
Accuracy (%)    65.2 75.4 73.8 49.4 85.0 50.0 43.8 42.1 18.8 52.0 30.0 43.6 43.0 46.0 70.5
Figure 29
Conclusion
From the table I deduced that the combination of image processing techniques yielding the
best accuracy was morphology, segmentation, erosion, closing and edge detection. However,
this did not work for all images; for instance, image 9 had an accuracy of only 18.8%.
CHAPTER FIVE: DISCUSSION AND CONCLUSION.
Discussion.
Various Image Processing Techniques discussed in the literature review were explored in the
implementation stage and results tabulated. As the program was running, effects of image
processing techniques on the image were shown in the graphic user interface window as
indicated in the implementation chapter.
The graphic user interface also provided the user with options and preferences for selecting
different image processing techniques.
Challenges Encountered.
During the design stage of the program and user interface, it was something of a challenge to
integrate the functions of the program into a comprehensive, well-defined user interface.
In addition, images differ in characteristics such as the angle of capture, lighting and quality. It
was challenging to come up with code that achieves a fairly good accuracy for all types of
images. Nevertheless, methods were used to reduce these constraints, such as the definition of
an appropriate radius range for the Hough circle transform, as indicated in the design chapter;
this helped the detector accommodate a reasonable number of circles.
Time was a constraint as well.
Conclusion.
The main objective of this project was to use image processing techniques to estimate the size
of a crowd from a still photograph. This was achieved, as indicated in the implementation
chapter. An understanding was thus gained of what a crowd is and of how to incorporate
various image processing techniques, by means of algorithms, to estimate the crowd size in a
photograph. The different image processing techniques used to analyse the images were
explored, tested, and the results tabulated for analysis.
Software was designed to analyse a photograph and extract the features most likely to
represent a person, which in our case were heads. The software was designed to take an
image, determine what processing is needed to obtain the total number of people, and display
the sequential effects on the image as well as the total through a graphic user interface.
Though tedious, the total number of people in each picture was counted manually so that it
could be compared and contrasted with the software output; these counts were tabulated for
the accuracy calculation.
Recommendations.
Having designed software for crowd size estimation from a still photograph, there are various
improvements that could have been made given more time:
o Improvement of accuracy, towards uniformly good output for different types of images.
o A more comprehensive and interactive user interface with additional features for video.
In addition, the software applies only to still photographs, so improvements towards a wider
scope of applications can be made, for instance crowd size estimation on a running video.
REFERENCES.
1) Alberto Martin and Sabri Tosunoglu, Florida International University, Department of Mechanical Engineering, “Image Processing Techniques for Machine Vision”.
2) Harley R. Myler and Arthur R. Weeks, “The Pocket Handbook of Image Processing Algorithms in C”.
3) J. R. Parker, “Algorithms for Image Processing and Computer Vision”, Second Edition.
4) Julio Cezar Silveira Jacques Junior, Soraia Raupp Musse and Claudio Rosito Jung, “Crowd Analysis Using Computer Vision Techniques”.
5) Rafael C. Gonzalez and Richard E. Woods, “Digital Image Processing”, 2nd Edition.
6) Richard Szeliski, “Computer Vision: Algorithms and Applications”.
7) D. Lu and Q. Weng, “A Survey of Image Classification Methods and Techniques for Improving Classification Performance”.
8) Anthony C. Davies, Jia Hong Yin and Sergio A. Velastin, “Crowd Monitoring Using Image Processing”.
9) Danny B. Yang, Hector H. Gonzalez and Leonidas J. Guibas, “Counting People in Crowds with a Real-Time Network of Simple Image Sensors”.
10) Yaowu Hu, Ping Zhou and Hao Zhou, “A New Fast and Robust Method Based on Head Detection for People-Flow Counting System”.
11) Victor Lempitsky and Andrew Zisserman, “Learning to Count Objects in an Image”.
12) K. M. M. Rao, “Overview of Image Processing”.
13) John C. Russ, “The Image Processing Handbook”, Third Edition.
14) Paul Bourke, “Various Simple Image Processing Techniques”.
15) Richard Wicentowski and Tia Newhall, “Using Image Processing Projects to Teach CS1 Topics”.
16) www.stackoverflow.com and www.mathworks.com
Project time plan: November 2013 to April 26th, 2014.
APPENDIX.
Matlab Code.

% --- Executes just before the mfile is made visible.
handles.im=im;
im=im2double(im);                           % convert to double
imgray=(im(:,:,1)+im(:,:,2)+im(:,:,3))/3;   % grayscale by channel averaging
handles.imgray=imgray;
hy=fspecial('sobel');                       % Sobel mask for horizontal edges
hx=hy';                                     % transpose detects vertical edges
Iy=imfilter(double(imgray),hy,'replicate');
Ix=imfilter(double(imgray),hx,'replicate');
imgradmag=sqrt(Ix.^2+Iy.^2);                % gradient magnitude
handles.imgradmag=imgradmag;
se=strel('disk',20);                        % circular structuring element
Ie=imerode(imgray,se);                      % erosion
handles.Ie=Ie;
Iobr=imreconstruct(Ie,imgray);              % opening-by-reconstruction
handles.Iobr=Iobr;
edgeim=edge(Iobr,'canny',[0.15 0.2]);       % Canny edge detection
handles.edgeim=edgeim;
[centers,radii]=imfindcircles(edgeim,[5 10],'Sensitivity',0.92,'EdgeThreshold',0.03);
handles.radii=radii;
h=size(centers,1);                          % number of detected circles (one row each)
handles.h=h;
Total=ones(240,320);                        % blank canvas for the counter display
HDText=insertText(Total,[205 55],h,'AnchorPoint','LeftCenter');
handles.HDText=HDText;
guidata(hObject,handles)

% --- Executes on button press in Exit.
close all;

% --- Executes on button press in Reset.
axes(handles.axes1); hold off; cla reset; set(handles.axes1,'xtick',[],'ytick',[]);
axes(handles.axes2); hold off; cla reset; set(handles.axes2,'xtick',[],'ytick',[]);
axes(handles.axes3); hold off; cla reset; set(handles.axes3,'xtick',[],'ytick',[]);
axes(handles.axes4); hold off; cla reset; set(handles.axes4,'xtick',[],'ytick',[]);
axes(handles.axes5); hold off; cla reset; set(handles.axes5,'xtick',[],'ytick',[]);

% --- Executes on selection change in Options.
contents=get(hObject,'Value');
switch contents
    case 2
        global im
        handles.im=im;
        handles.imgray=(handles.im(:,:,1)+handles.im(:,:,2)+handles.im(:,:,3))/3;
        axes(handles.axes2); imshow(handles.imgray);
        handles.hy=fspecial('sobel');
        handles.hx=handles.hy';
        handles.Iy=imfilter(double(handles.imgray),handles.hy,'replicate');
        handles.Ix=imfilter(double(handles.imgray),handles.hx,'replicate');
        handles.imgradmag=sqrt(handles.Ix.^2+handles.Iy.^2);
        axes(handles.axes5); imshow(handles.imgradmag);
        handles.se=strel('disk',20);
        handles.Ie=imerode(handles.imgray,handles.se);
        handles.Iobr=imreconstruct(handles.Ie,handles.imgray);
        axes(handles.axes3); imshow(handles.Iobr);
        handles.edgeim=edge(handles.Iobr,'canny',[0.15 0.2]);
        axes(handles.axes5); imshow(handles.edgeim);
        handles.d=imdistline;   % interactive distance tool for gauging head sizes
        [centers,handles.radii]=imfindcircles(handles.edgeim,[5 10],'Sensitivity',0.92,'EdgeThreshold',0.03);
        viscircles(centers,handles.radii,'EdgeColor','b');
        axes(handles.axes4); imshow(handles.edgeim);
        handles.h=size(centers,1);
        handles.Total=ones(240,320);
        figure, imshow(handles.Total);
        hold on
        handles.HDText=insertText(handles.Total,[205 55],handles.h,'AnchorPoint','LeftCenter');
        imshow(handles.HDText);
        text(25,55,'TOTAL : ','FontSize',12,'FontWeight','bold','Color','m');
        hold off
        axes(handles.axes1); imshow(handles.im);
    case 3
        handles.imgray=(handles.im(:,:,1)+handles.im(:,:,2)+handles.im(:,:,3))/3;
        axes(handles.axes2); imshow(handles.imgray);
    case 4
        axes(handles.axes3); imshow(handles.Iobr);
    case 5
        axes(handles.axes4); imshow(handles.edgeim);
    otherwise
end
guidata(hObject,handles)   % save the updated handles structure

% --- Executes on button press in Customize.
v=get(handles.popupmenu6,'Value');
if v==2
    axes(handles.axes1); imshow(handles.imgradmag);
elseif v==3
    axes(handles.axes2); imshow(handles.Ie);
elseif v==4
    axes(handles.axes3); imshow(handles.Iobr);
elseif v==5
    axes(handles.axes4); imshow(handles.edgeim);
elseif v==6
    [centers,handles.radii]=imfindcircles(handles.edgeim,[5 10],'Sensitivity',0.92,'EdgeThreshold',0.03);
    viscircles(centers,handles.radii,'EdgeColor','b');
    axes(handles.axes4); imshow(handles.edgeim);
end

% --- Executes on button press in Load.
[path,user_cance]=imgetfile();
if user_cance
    msgbox(sprintf('Error'),'Error','Error');
    return
end
handles.im=imread(path);
axes(handles.axes1); imshow(handles.im);
guidata(hObject,handles)   % save the loaded image in handles