Face Annotation


CHAPTER 1

INTRODUCTION

In computer science, image processing is any form of signal

processing for which the input is an image, such as a photograph or video

frame; the output of image processing may be either an image or a set of

characteristics or parameters related to the image. Most image-processing

techniques involve treating the image as a two-dimensional signal and

applying standard signal-processing techniques to it.

Today, face recognition is a rapidly growing area in image

processing. It has many uses in the fields of biometric authentication,

security, and many other areas.

1.1 FACE DETECTION

Face detection is a computer technology that determines the

locations and sizes of human faces in arbitrary (digital) images. It detects

facial features and ignores anything else, such as buildings, trees and

bodies.

Face detection can be regarded as a specific case of object-class

detection. In object-class detection, the task is to find the locations and

sizes of all objects in an image that belong to a given class. Examples

include upper torsos, pedestrians, and cars.

Face detection can be regarded as a more general case of face

localization. In face localization, the task is to find the locations and sizes

of a known number of faces (usually one). In face detection, one does not

have this additional information.


Early face-detection algorithms focused on the detection of frontal

human faces, whereas newer algorithms attempt to solve the more general

and difficult problem of multi-view face detection. That is, the detection

of faces that are either rotated along the axis from the face to the observer

(in-plane rotation), or rotated along the vertical or left-right axis (out-of-

plane rotation), or both. The newer algorithms take into account

variations in the image or video by factors such as face appearance,

lighting, and pose.

Single image detection methods are classified into four categories;

some methods clearly overlap category boundaries and are discussed in

this section.

Knowledge-based methods.

Feature invariant approaches.

Template matching methods.

Appearance-based methods.

KNOWLEDGE-BASED METHODS: These rule-based methods

encode human knowledge of what constitutes a typical face. Usually, the

rules capture the relationships between facial features. These methods are

designed mainly for face localization.

FEATURE INVARIANT APPROACHES: These algorithms aim to

find structural features that exist even when the pose, viewpoint, or

lighting conditions vary, and then use these to locate faces. These

methods are designed mainly for face localization.

TEMPLATE MATCHING METHODS: Several standard patterns of a

face are stored to describe the face as a whole or the facial features

separately. The correlations between an input image and the stored

patterns are computed for detection. These methods have been used for

both face localization and detection.


APPEARANCE-BASED METHODS: In contrast to template

matching, the models (or templates) are learned from a set of training

images which should capture the representative variability of facial

appearance. These learned models are then used for detection. These

methods are designed mainly for face detection.

1.2 FACE RECOGNITION

Face recognition systems are progressively becoming popular as

means of extracting biometric information. Face recognition has a critical

role in biometric systems and is attractive for numerous applications

including visual surveillance and security. Because of the general public

acceptance of face images on various documents, face recognition has a

great potential to become the next generation biometric technology of

choice. Face images are also the only biometric information available in

some legacy databases and international terrorist watch-lists and can be

acquired even without subjects' cooperation.

1.3 ONLINE SOCIAL NETWORKS (OSNs):

An online social network is an online service, platform, or site that focuses on building and reflecting social networks or social relations among people who, for example, share interests and/or activities. A

social network service consists of a representation of each user (often a

profile), his/her social links, and a variety of additional services. Most

social network services are web-based and provide means for users to

interact over the Internet, such as e-mail and instant messaging.

Social networking sites allow users to share ideas, activities,

events, and interests within their individual networks.

The main types of online social networks are those that contain

category places (such as former school year or classmates), means to


connect with friends (usually with self-description pages), and a

recommendation system linked to trust. Popular methods now combine

many of these, with Facebook, Google+ and Twitter widely used

worldwide; Facebook, Twitter, LinkedIn and Google+ are very popular in

India.

Web-based social networking services make it possible to connect

people who share interests and activities across political, economic, and

geographic borders. Through e-mail and instant messaging, online

communities are created where a gift economy and reciprocal altruism are

encouraged through cooperation. Information is particularly suited to a gift

economy, as information is a nonrival good and can be gifted at

practically no cost.

Facebook and other social networking tools are increasingly the

object of scholarly research. Scholars in many fields have begun to

investigate the impact of social-networking sites, investigating how such

sites may play into issues of identity, privacy, social capital, youth

culture, and education.

1.4 FACE ANNOTATION:

Face annotation technology is important for a photo management

system. The act of labeling identities (i.e., names of individuals or

subjects) on personal photos is called face annotation or name tagging.

This feature is of considerable practical interest for Online Social

Networks.

Automatic face annotation (or tagging) facilitates improved

retrieval and organization of personal photos in online social networks.


The users of Online Social Networks spend enormous amounts of

time in browsing the billions of photos posted by their friends and

relatives. Normally they would go through and tag photos of friends and

themselves if they were so inclined, but now Online Social Networks are

trying to make things easier with face recognition.

Using face-detection algorithms similar to those digital cameras now use, Online Social Networks can spot a face in an image once it is uploaded, and a box shows up with a space for users to enter their own or their friends' names, tagging them for recognition across the system.

The next step of the Online Social Network is automatic face annotation.

Typically, online social networks like Facebook and Myspace are

used for sharing or managing a photo or video collection. Users tag

photos of individuals and friends using face annotation. The thing is,

most online social networks feature manual face annotation, a time-

consuming task that can be extremely labor-intensive. The number of photos posted online increases astronomically each day, and that is a lot of photos and faces to tag.


CHAPTER 2

LITERATURE REVIEW

2.1 SOCIAL NETWORK CONTEXT:

Most personal photos that are shared online are embedded in some

form of social network, and these social networks are a potent source of

contextual information that can be leveraged for automatic image

understanding. The utility of social network context is investigated for the

task of automatic face recognition in personal photographs. Here face

recognition scores are combined with social context in a conditional

random field (CRF) model, and this model is applied to label faces in photos

from the popular online social network Facebook, which is now the top

photo-sharing site on the Web with billions of photos in total.

An increasing number of personal photographs are uploaded to

online social networks, and these photos do not exist in isolation. Social

networks are an important source of image annotations, and they also

provide contextual information about the social interactions among

individuals that can facilitate automatic image understanding. Here, the focus is on the specific problem of automatic face recognition in personal photographs. Photos and context drawn from the online social network

Facebook are used, which is currently the most popular photo-sharing site

on the Web. Many Facebook photos contain people, and these photos

comprise an extremely challenging dataset for automatic face recognition.

To incorporate context from the social network, a conditional

random field (CRF) model is defined for each photograph that consists of

a weighted combination of potential functions. One such potential comes

from a baseline face recognition system, and the rest represent various

aspects of social network context. The weights on these potentials are

learned by maximizing the conditional log-likelihood over many training

photos, and the model is then applied to label faces in novel photographs.
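In rough sketch form (generic CRF notation, not necessarily the authors' exact formulation), such a model assigns a joint labeling $\mathbf{y}$ of the faces in a photo $x$ a probability

$$P(\mathbf{y} \mid x) = \frac{1}{Z(x)} \exp\Big(\sum_{i} \lambda_i \, \phi_i(\mathbf{y}, x)\Big)$$

where $\phi_1$ is the baseline face-recognition potential, the remaining $\phi_i$ encode aspects of social network context, the $\lambda_i$ are the learned weights, and $Z(x)$ normalizes over all labelings.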


Drawing on a database of over one million downloaded photos, it is

shown that incorporating social network context leads to a significant

improvement in face recognition rates.

ADVANTAGES:

This method combines image data with social network context in a

conditional random field model to improve recognition performance.

DISADVANTAGES:

This method requires larger numbers of labeled faces to be available before it can be applied to photos that contain arbitrary numbers of faces, including single-face photos, for which only more limited context is available.

2.2 IMAGE CONTENT AND SOCIAL CONTEXT:

MediAssist is a system which facilitates browsing, searching and

semi-automatic annotation of personal photos, using analysis of both

image content and the context in which the photo is captured. This

semiautomatic annotation includes annotation of the identity of people in

photos. Technologies for efficiently managing and organizing digital

photos assume more and more importance as users wish to efficiently

browse and search through larger and larger photo collections. It is

accepted that content-based image retrieval techniques alone have failed

to bridge the so-called semantic gap between the visual content of an

image and the semantic interpretation of that image by a person. Personal

photos differ from other images in that they have an associated context,

often having been captured by the user of the photo management system.

Users will have personal recollection about the time, place and other

context information relating to the environment of photo capture, and

digital personal photos make a certain amount of contextual metadata

available in their EXIF header, which stores the time of photo capture and

camera settings such as lens aperture and exposure time. GPS location

information is also supported by EXIF and, although not currently


captured by most commercial cameras, there are ways of “location-

stamping” photos using data from a separate GPS device, and camera

phones are inherently location-aware.

Systems for managing personal photo collections could thus make

use of this contextual metadata in their analysis and organization of

personal photos. One of the more important user needs for the

management of personal photo collections is the annotation of the

identities of people. They proposed the MediAssist system for personal

photo management, a context-aware photo management system that

includes person annotation technologies as one of its major features.

The system uses context- and content-based analysis to provide

powerful tools for the management of personal photo collections, and

facilitates semi-automatic person-annotation in personal photo

collections, powered by automatic analysis. Traditional face recognition

approaches do not generally cope well in an unconstrained photo capture

environment, where variations in lighting, pose and orientation represent

major challenges. It is possible to overcome some of these problems by

exploiting the contextual information that comes with personal photo

capture.

ADVANTAGES:

By combining context and content analysis, performance over

content or context alone is improved.

DISADVANTAGES:

The system cannot perform automatic face annotation without prompting the user for confirmation.

2.3 DISTRIBUTED FACE RECOGNITION:

Distributed face recognition is one of the problems in a calibrated

camera sensor network. It is assumed that each camera is given a small

and possibly different training set of face images taken under varying


viewpoint, expression, and illumination conditions. Each camera can

estimate the pose and identity of a new face using classical techniques

such as Eigenfaces or Tensorfaces combined with a simple classifier.

However, the pose estimates obtained by a single camera could be very

poor, due to limited computational resources, impoverished training sets,

etc., which could lead to poor recognition results.

The key contribution is to propose a distributed face recognition

algorithm in which neighboring cameras share their individual estimates

of the pose in order to achieve a “consensus” on the face pose. For this

purpose, a convergent distributed consensus algorithm on SE(3) is used

that estimates the global Karcher mean of the face pose in a distributed

fashion.
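As a loose illustration of the consensus idea only (simplified to Euclidean averaging of pose vectors, rather than the Karcher mean on SE(3) used in the actual algorithm, and with hypothetical node and neighbor types), each camera repeatedly nudges its estimate toward those of its neighbors:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: each camera node repeatedly averages its pose
// estimate with those of its neighbors until the network agrees.
class CameraNode
{
    public double[] Pose;                          // simplified pose vector (not SE(3))
    public List<CameraNode> Neighbors = new List<CameraNode>();
}

static class Consensus
{
    // stepSize should stay well below 1 / (largest neighbor count) for convergence.
    public static void Run(List<CameraNode> nodes, int iterations, double stepSize)
    {
        for (int t = 0; t < iterations; t++)
        {
            // Compute all updates from the current values, then apply them synchronously.
            var updates = nodes.Select(node =>
                node.Pose.Select((p, i) =>
                    p + stepSize * node.Neighbors.Sum(nb => nb.Pose[i] - p)).ToArray())
                .ToList();

            for (int n = 0; n < nodes.Count; n++)
                nodes[n].Pose = updates[n];
        }
    }
}
```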

In face recognition, most existing algorithms such as Eigenfaces,

Fisherfaces, ICA and Tensorfaces, operate in a centralized fashion. In this

paradigm, training is performed by a central unit, which learns a face

model from a training set of face images. Images of a new face acquired

from multiple cameras are also sent to the central unit, which performs

recognition by comparing these images to the face model. However, this

approach has several drawbacks when implemented in a smart camera

network.

First, it is not fault tolerant, because a failure of the central

processing unit implies a failure of the entire application. Second, it

requires the transmission of huge amounts of raw data. Moreover, it is not

scalable, because as the number of nodes in the network increases, so

does the amount of processing by the central unit, possibly up to a point

where it exceeds the available resources. Existing approaches perform

compression to reduce the amount of transmitted data, but the

compressed images are still processed by a central unit. In a fully

distributed paradigm, each node could perform an initial processing of the


images and transmit over the network only the distilled information

relevant to the task at hand. Then, the nodes could collaborate in order to

merge their local observations and come up with a global result, which is

consistent across the entire network.

This framework has several advantages over the centralized

paradigm. For instance, when a single node fails the others can still

collaborate among each other. Moreover, since each node performs its

share of the processing, the aggregate resources available in the network

grow with the number of nodes, making the solution scalable. The pose of

the object is chosen as the local information to be extracted by the nodes

and transmitted to neighboring nodes. Consensus algorithms provide a

natural distributed estimation framework for aggregating local

information over a network. While this approach is specifically designed

for recognizing faces, the distributed framework could be easily applied to

other object recognition problems, as long as each camera node can

provide an estimate of the object pose.

ADVANTAGES:

The pose of an object can be represented with only six parameters,

which is obviously much less than the number of pixels in an image.

Also, the pose of a face can be estimated from a single view using existing face recognition algorithms, e.g., view-based Eigenfaces. Finally, the pose of the face is a global quantity, which facilitates the definition of

a global objective that all the nodes need to achieve, e.g., finding the

average pose.

DISADVANTAGES:

Here, two to five cameras are connected in a network with a ring topology, so a fault in any one camera reduces the accuracy of the estimated face pose.


2.4 PERSONALIZING IMAGE SEARCH RESULTS:

The social media site Flickr allows users to upload their photos,

annotate them with tags, submit them to groups, and also to form social

networks by adding other users as contacts. Flickr offers multiple ways of

browsing or searching it. One option is tag search, which returns all

images tagged with a specific keyword. If the keyword is ambiguous,

e.g., “beetle” could mean an insect or a car, tag search results will include

many images that are not relevant to the sense the user had in mind when

executing the query. Users express their photography interests through

the metadata they add in the form of contacts and image annotations. The

photo sharing site Flickr is one of the earliest and more popular examples

of the new generation of Web sites, labeled social media, whose content

is primarily user-driven. Other examples of social media include: blogs

(personal online journals that allow users to share thoughts and receive

feedback on them), Wikipedia (a collectively written and edited online

encyclopedia), and Del.icio.us and Digg (Web sites that allow users to

share, discuss, and rank Web pages, and news stories respectively). Social

media sites share four characteristics:

(1) Users create or contribute content in a variety of media types;

(2) Users annotate content with tags;

(3) Users evaluate content, either actively by voting or passively by

using content; and

(4) Users create social networks by designating other users with

similar interests as contacts or friends.

In the process of using these sites, users are adding rich metadata in

the form of social networks, annotations and ratings. Availability of large

quantities of this metadata will lead to the development of new

algorithms to solve a variety of information processing problems, from

new recommendation algorithms to improved information discovery algorithms.


ADVANTAGES:

User-added metadata on Flickr can be used to improve image

search results. The approaches described here can also be applied to other

social media sites, such as Del.icio.us.

DISADVANTAGES:

Although Flickr has become extremely popular, user-supplied tags are relatively scarce compared to the number of annotated images.

The same tag can additionally be used in different senses, making the

problem even more challenging.

2.5 PERSONAL AND SOCIAL NETWORK CONTEXT

Social network context is useful as real-world activities of

members of the social network are often correlated within a specific

context. The correlation can serve as a powerful resource to effectively

increase the ground truth available for annotation. There are three main

contributions:

(a) Development of an event context framework and definition of

quantitative measures for contextual correlations based on concept

similarity in each facet of event context;

(b) Recommendation algorithms based on spreading activations

that exploit personal context as well as social network context;

(c) Experiments on real-world, everyday images that verified both

the existence of inter-user semantic disagreement and the

improvement in annotation when incorporating both the user and

social network context.

They develop a novel collaborative annotation system that exploits

the correlation in user context and the social network context. This work

enables members of a social network to effectively annotate images. The

social network context is important when different users’ annotations and

their corresponding semantics are highly correlated. The annotation


problem in social networks has several unique characteristics different

from the traditional annotation problem. The participants in the network

are often family, friends, or co-workers, and know each other well. They

participate in common activities – e.g. traveling, attending a seminar,

going to a film, parties etc. There is a significant overlap in their real

world activities. Social networks involve multiple users – this implies that

each user may have a distinctly different annotation scheme, and different

mechanisms for assigning labels. There may be significant semantic

disagreement amongst the users, over the same set of images.

The traditional image annotation problem is a very challenging one

– only a small fraction of the images are annotated by the user, severely

restricting the ground truth available. In their approach, they define event context – the set of facets/attributes (image, who, when, where, what) that support the understanding of everyday events. Then they develop

measures of similarity for each event facet, as well as compute event-

event and user-user correlation. The user context is then obtained by

aggregating event contexts and is represented using a graph.

ADVANTAGES:

This method provides quantitative and qualitative results which

indicate that both personal and social context facilitate effective image annotation. Here, a novel collaborative annotation system is developed.

That system exploits the correlation in user context and the social

network context. This work enables members of a social network to

effectively annotate images.

DISADVANTAGES:

The notion of “context” has been used in many different ways

across applications, but the set of contextual attributes is always application

dependent.


CHAPTER 3

SYSTEM ANALYSIS

3.1 EXISTING SYSTEM:

There are many problems that exist due to the many factors

that can affect the photos. When processing images, one must take into account variations in light, image quality, the person's pose, and facial expressions, among others. In order to identify

individuals correctly there must be some way to account for all these

variations and be able to come up with a valid answer.

In order to recognize faces, many methods are available. They have

not only advantages but also disadvantages. Social network context is one

method which is used to improve face annotation. But in this method,

with larger numbers of labeled faces available, this can apply to photos

that contain arbitrary numbers of faces, including single-face photos for

which only more limited context is available. MediAssist is a system

which facilitates semi-automatic annotation of personal photos, using

analysis of both image content and the context in which the photo is

captured. It does not perform automatic face annotation without prompting the user for confirmation. In a distributed face recognition algorithm, neighboring cameras share their individual estimates of the pose in order to achieve a

“consensus” on the face pose. Here cameras are connected in a network

with a ring topology. So, if any fault occurs in any one camera, then it

will reduce the accuracy of pose of face. The social media site Flickr

allows users to upload their photos, annotate them with tags, submit them

to groups, and also to form social networks by adding other users as

contacts. Although Flickr has become extremely popular, user-supplied tags are relatively scarce compared to the number of annotated


images. The same tag can additionally be used in different senses, making

the problem even more challenging.

3.2 PROPOSED SYSTEM:

A novel collaborative face recognition (FR) framework is

proposed to improve the accuracy of face annotation by effectively

making use of multiple FR engines available in an OSN.

The proposed collaborative FR framework is constructed using

M+1 different FR engines: one FR engine belongs to the current user,

while M FR engines belong to M different contacts of the current user.

This collaborative FR framework consists of two major parts:

selection of suitable FR engines and merging of multiple FR results.

To select K suitable FR engines out of M+1 FR engines, a social

graph model (SGM) is constructed that represents the social relationships

between the different contacts considered. SGM is created by utilizing the

personal photo collections shared in the collaborative FR framework.

Based on the constructed SGM, a relevance score is computed for each

FR engine. K FR engines are then selected using the relevance scores

computed for the FR engines. Next, the query face images detected in the

photos of the current user are simultaneously forwarded to the selected K

FR engines.

In order to merge the FR results returned by the different FR

engines, Bayesian Decision Rule is used. A key property of the solution

is that we are able to simultaneously account for both the relevance scores

computed for the selected FR engines and the FR result scores.

This collaborative FR framework has a low computational cost

and comes with a design that is suited for deployment in a decentralized

OSN.


CHAPTER 4

SYSTEM SPECIFICATION

4.1 HARDWARE REQUIREMENTS

PROCESSOR : PENTIUM IV 2.6GHZ, Intel Core 2 Duo

RAM : 512 MB DDR RAM

MONITOR : 15” COLOR

HARD DISK : 320 GB

MEMORY : 2GB

4.2 SOFTWARE REQUIREMENTS

FRONT END : ASP.NET

BACK END : MS Access 2007

OPERATING SYSTEM : Windows 7 Home Basic (64-bit)


CHAPTER 5

TECHNOLOGY OVERVIEW

5.1 C# AND .NET

C# is a multi-paradigm programming language encompassing

strong typing, imperative, declarative, functional, generic, object-

oriented (class-based), and component-oriented programming disciplines.

It was developed by Microsoft within its .NET initiative and later

approved as a standard by Ecma (ECMA-334) and ISO (ISO/IEC

23270:2006). C# is one of the programming languages designed for

the Common Language Infrastructure.

5.1.1 INTRODUCTION

C# is intended to be a simple, modern, general-purpose, object-oriented

programming language. Its development team is led by Anders Hejlsberg.

The most recent version is C# 4.0, which was released on April 12, 2010.

1. C# language is intended to be a simple, modern, general-purpose,

object-oriented programming language.

2. The language, and implementations thereof, should provide support

for software engineering principles such as strong type checking,

array bounds checking, detection of attempts to use uninitialized

variables, and automatic garbage collection. Software robustness,

durability, and programmer productivity are important.

3. The language is intended for use in developing software

components suitable for deployment in distributed environments.

4. Source code portability is very important, as is programmer

portability, especially for those programmers already familiar with

C and C++.

5. Support for internationalization is very important.


6. C# is intended to be suitable for writing applications for both

hosted and embedded systems, ranging from the very large that use

sophisticated operating systems, down to the very small having

dedicated functions.

7. Although C# applications are intended to be economical with

regard to memory and processing power requirements, the

language was not intended to compete directly on performance and

size with C or assembly language.

5.1.2 FEATURES

By design, C# is the programming language that most directly reflects

the underlying Common Language Infrastructure (CLI). Most of its

intrinsic types correspond to value-types implemented by the CLI

framework. However, the language specification does not state the code

generation requirements of the compiler: that is, it does not state that a C#

compiler must target a Common Language Runtime, or generate

Common Intermediate Language (CIL), or generate any other specific

format. Theoretically, a C# compiler could generate machine code like

traditional compilers of C++ or FORTRAN.

Some notable features of C# that distinguish it from C and C++

(and Java, where noted) are:

1. It has no global variables or functions. All methods and

members must be declared within classes. Static members of

public classes can substitute for global variables and

functions.

2. Local variables cannot shadow variables of the enclosing

block, unlike C and C++. Variable shadowing is often

considered confusing by C++ texts.


3. C# supports a strict Boolean data type, bool. Statements that

take conditions, such as while and if, require an expression

of a type that implements the true operator, such as the

boolean type. While C++ also has a boolean type, it can be

freely converted to and from integers, and expressions such

as if(a) require only that a is convertible to bool,

allowing a to be an int, or a pointer. C# disallows this

"integer meaning true or false" approach, on the grounds that

forcing programmers to use expressions that return

exactly bool can prevent certain types of common

programming mistakes in C or C++ such as if (a = b) (use of

assignment = instead of equality ==).

4. In C#, memory address pointers can only be used within

blocks specifically marked as unsafe, and programs with

unsafe code need appropriate permissions to run. Most

object access is done through safe object references, which

always either point to a "live" object or have the well-

defined null value; it is impossible to obtain a reference to a

"dead" object (one that has been garbage collected), or to a

random block of memory. An unsafe pointer can point to an

instance of a value-type, array, string, or a block of memory

allocated on a stack. Code that is not marked as unsafe can

still store and manipulate pointers through

the System.IntPtr type, but it cannot dereference them.

5. Managed memory cannot be explicitly freed; instead, it is

automatically garbage collected. Garbage collection

addresses the problem of memory leaks by freeing the


programmer of responsibility for releasing memory that is no

longer needed.

6. In addition to the try...catch construct to handle exceptions,

C# has a try...finally construct to guarantee execution of the

code in the finally block.

7. Multiple inheritance is not supported, although a class can

implement any number of interfaces. This was a design

decision by the language's lead architect to avoid

complication and simplify architectural requirements

throughout CLI.

8. C#, like C++, but unlike Java, supports operator overloading.

9. C# is more type safe than C++. The only implicit

conversions by default are those that are considered safe,

such as widening of integers. This is enforced at compile-

time, during JIT, and, in some cases, at runtime. No implicit

conversions occur between booleans and integers, nor

between enumeration members and integers (except for

literal 0, which can be implicitly converted to any

enumerated type). Any user-defined conversion must be

explicitly marked as explicit or implicit, unlike C++ copy

constructors and conversion operators, which are both

implicit by default. Starting with version 4.0, C# supports a

"dynamic" data type that enforces type checking at runtime

only.

10.Enumeration members are placed in their own scope.

11. C# provides properties as syntactic sugar for a common pattern in which a pair of methods, accessor (getter) and mutator (setter), encapsulate operations on a single attribute of a class; a minimal example follows this list.

12.Full type reflection and discovery is available.

13.Checked exceptions are not present in C# (in contrast to

Java). This has been a conscious decision based on the issues

of scalability and versionability.
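A minimal illustration of point 11 above, using a hypothetical class rather than project code: the property Name wraps a private field, replacing explicit GetName/SetName methods.

```csharp
class Contact
{
    private string name;

    // Property: accessor (get) and mutator (set) behind field-like syntax.
    public string Name
    {
        get { return name; }
        set { name = value; }
    }
}

// Usage: reads and writes look like field access, but go through get/set.
// var c = new Contact();
// c.Name = "Alice";
// Console.WriteLine(c.Name);
```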

5.1.3 ADVANTAGES OF C#

ADVANTAGES OVER C AND C++

1. It is compiled to an intermediate language (CIL) independently

of the language it was developed in or the target architecture and

operating system

2. Automatic garbage collection

3. Pointers no longer needed (but optional)

4. Reflection capabilities

5. Don't need to worry about header files ".h"

6. Definition of classes and functions can be done in any order

7. Declaration of functions and classes not needed

8. No circular dependencies

9. Classes can be defined within classes

10.There are no global functions or variables, everything belongs to

a class

11.All the variables are initialized to their default values before

being used (this is automatic by default but can be done

manually using static constructors)

12.You can't use non-boolean variables (integers, floats...) as

conditions. This is much cleaner and less error prone

13.Apps can be executed within a restricted sandbox


ADVANTAGES OVER C++ AND JAVA:

1. Formalized concept of get-set methods, so the code becomes

more legible

2. Cleaner event management (using delegates)

ADVANTAGES OVER JAVA:

1. Usually it is much more efficient than java and runs faster

2. CIL (Common (.NET) Intermediate Language) is a standard

language, while Java bytecode is not

3. It has more primitive types (value types), including unsigned

numeric types

4. Indexers let you access objects as if they were arrays

5. Conditional compilation

6. Simplified multithreading

7. Operator overloading. It can make development a bit trickier, but it is optional and sometimes very useful

8. (limited) use of pointers if you really need them, as when calling

unmanaged (native) libraries which do not run on top of the

virtual machine (CLR).

5.2 MICROSOFT ACCESS

5.2.1 INTRODUCTION

Microsoft Access is a "Relational Database Management System."

The description that follows applies to Microsoft Access 2000, Microsoft

Access 2002, Microsoft Access 2007 & 2010, and

even Microsoft Access 97. In fact what follows applies to just about

every Windows database out there regardless of who makes it.

Access can store data in specific formats for sorting, querying, and

reporting. Sorting is pretty straightforward; data is simply presented to


you in particular orders. An example might be presenting your customer

data (customer number, name, address, city, state, zip, and total

purchases) in last name order.

Querying means that as a user of this database, you can ask Access for a

collection of information relating to location such as state or country,

price as it might relate to how much a customer spent, and date as it

might relate to when items were purchased. Querying can include sorting

as well. For example if you wanted to see the top spending customers in

the state of Florida querying would be a way to do that. A Query on data

typically returns a sub-set of the collection of data, but can return all of it

in a different order as well. Reporting is simply query results in printable

or viewable form.

5.2.2 STORING DATA

In order for Access to perform these functions data has to be stored

in the smallest possible units. These units are called fields. A field might

contain a first name, a last name, a middle name, a street address, and so

on. Notice that I do not propose that the entire name be placed in one

field. If that were done, the only sorting one could perform would end up being presented by the first name, which is hardly useful. But if a separate field is

used for the last name, another for first, and so on, much more useful

sorting can be accomplished.

Fields are also defined as a type of data (number, text, date, date-

time, dollar, etc.). By storing data in its own specific field type, Access

(or any RDBMS for that matter) can sort that data in very tightly

controlled ways. For example one can sort numbers and alphabetic

content accurately as long as Access knows what type of sort to apply to


that data; hence the field type. An entire collection of fields relating to a particular entry is called a record. The entire collection of records is

called a table.

Tables resemble spreadsheets in that they are a grid of data. A row

represents one complete record and a column a particular data field. Thus

a data table containing a collection of customer demographics might

contain the Customer Number, Name, Address, City, State, Zip,

Telephone Number, Cell Phone Number, and email Address.

Possibly the easiest way to visualize this is to imagine a data table

as a spreadsheet. Each column would be a field, each row a record, and a

collection of rows would represent the entire data table. Naturally each

row, in the case of a customer file for example, would be one customer.

By storing data in this manner it is much easier to sort and report on that

data in nearly any order you wish. All of the above could describe a

Database Management System DBMS, but Microsoft Access is also a

relational database management system.

5.2.3 RELATIONAL DATABASES

A relational database management system (RDBMS) allows for the

creation of multiple tables that can be linked together via one or more

shared field values between tables. For example the Customer Table

mentioned above could be linked to a table containing more sensitive

purchase or credit card information. As long as both tables contain a

Customer Number field with the same data size and type these tables can

be linked or related. Of course each Customer Number would be unique

to the customer.


In this example the secondary or child table could contain the

following information: Customer Number (identical in format to

Customer Number in the parent table), Product ID, Unit Price, Quantity,

etc. Yet another child table could contain the Customer Number, credit

card data, including the number, valid-until dates, and PIN number.

As long as there is a common Customer Number (in this example)

the tables can be linked or kept separate depending on the level of

security required of this information. This way two of the tables could be

displayed on a form, through a query, or on a report and look as though

all the information is stored in one place.

Access, like any RDBMS, will allow these tables to be interrelated

via forms, reports, or queries. Access, as with many other RDBMS, can

use Structured Query Language (SQL) to query the table(s). Though

Microsoft Access is not as powerful (and is properly a pseudo-RDBMS) as

other products such as Microsoft SQL Server, Oracle, MySQL, Sybase,

or IBM DB2, it operates in much the same way and many of the SQL

statements that would work properly in Access could be also used in the

above mentioned RDBMS without modification.

Finally, databases are used almost ubiquitously.
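Since the project's front end is ASP.NET/C# with an Access back end, a query such as the Florida example above would typically be issued through ADO.NET's OleDb provider. The database path, table, and column names below are assumptions for illustration, and the connection string presumes the ACE OLE DB provider is installed:

```csharp
using System;
using System.Data.OleDb;

class CustomerQuery
{
    static void Main()
    {
        // Hypothetical database path and schema; adjust to the actual .accdb file.
        string connStr = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=customers.accdb";
        string sql = "SELECT CustomerNumber, Name, TotalPurchases " +
                     "FROM Customers WHERE State = ? " +
                     "ORDER BY TotalPurchases DESC";

        using (var conn = new OleDbConnection(connStr))
        using (var cmd = new OleDbCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@state", "FL");   // OleDb binds parameters by position
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine("{0}\t{1}\t{2}", reader[0], reader[1], reader[2]);
            }
        }
    }
}
```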

5.2.4 TERMS

Relational Database Management System: Data that are stored and

manipulated are held in a relational manner. e.g. The tables within can be

related to each other via fields.

Field: A single data item within a data record. The field is usually

represented as a column.


Key Field: A field that contains like data that is used between tables to

link or relate them. Key Field usually refers to the linking field in the

parent table. In Access, this must always be the first field in the parent

table.

Foreign Key: Like the key field above, a field that contains data that can

be used to link or relate two tables together. The Foreign Key usually

refers to the field in the child table; it does not have to be the first field

in the child table.

Record: The row of information representing one set of data within a

table.

Parent Table: The primary table used to coordinate and connect to child

tables. It might also be called the master table.

Child Table: Another table which can be related to the parent table. Think

of this as a "slave" table; a table that is designed to be used with the

Master table. Note: that Child Tables can also be master tables to lesser

child tables.

Table: A collection of like data arranged in a spreadsheet-like fashion (rows and columns).

5.2.5 CREATING AND DESIGNING MICROSOFT ACCESS

DESIGNING

Before creating a database, you should plan and design it. For

example, you should define the type of database you want to create. You

should create, in writing, a list of the objects it will contain: employees,

customers, products, transactions, etc.


For each object, you should create a list of the pieces of

information the object will need for its functionality: name(s), contact

information, profession, etc. You should also justify why the object needs

that piece of information. You should also define how the value of that

piece of information will be given to the object. As you will see in later

lessons, some values are typed, some values must be selected from a

preset list, and some values must come from another object. In later

lessons, we will see how you can start creating the objects and their

content.

CREATING A DATABASE

If you have just started Microsoft Access, to create a database, under

File, click New. You can then use one of the links in the main (middle)

section of the interface:

To create a blank database, in the middle section, under Available

Templates, click Blank Database

To create a database using one of the samples, under Available

Templates, click a category from one of the buttons, such as

Sample Templates. Then click the desired button.

If you start Microsoft Access just after installing it, the File section

would display a normal menu. If you start creating or opening databases,

a list of MRUs displays under Close Database. To open a database, if you

see its name under File, you can click it.

Since a Microsoft Access database is primarily a file, if you see its

icon in a file utility such as Windows Explorer, you can double-click it.

This would launch Microsoft Access and open the database. If you


received a database as an email attachment, you can also open the

attachment and consequently open the database file.

CLOSING A DATABASE

You can close a database without closing Microsoft Access. To do this, in

the File section, click Close Database.

DELETING A DATABASE

If you have a database you don't need any more, you can delete it. To

delete a database in a file utility such as Windows Explorer:

Click the icon of the database to select it and press Delete

Right-click the icon and click Delete

A warning message would be presented to you to confirm what you want

to do. After you have deleted a database, it doesn't disappear from the

MRU lists of Microsoft Access. This means that, after a database has

been deleted, you may still see it in the File section. If you try opening

such a database, you would receive an error.

If a database has been deleted and you want to remove it from the MRU

lists, you can open the Registry (Start -> Run: regedit, Enter); be careful with the Registry, and when in doubt, don't touch it. Open the key that holds the Access MRU list, locate the deleted database, and delete its entry. The next time you start

Microsoft Access, the name of the deleted database would not display in

the File section.

THE SIZE OF A DATABASE

A Database is primarily a computer file, just like those created with

other applications. As such, it occupies space in the computer memory. In


some circumstances, you should know how much space a database is

using. This can be important when you need to back it up or when it is

time to distribute it. Also, when adding and deleting objects from your

database, its file can grow or shrink without your direct intervention.

Like any other computer file, to know the size of a database, you

can right-click it in Windows Explorer or My Computer and click

Properties. If you are already using the database, to check its size, you

can click File, position the mouse on Manage and click Database

Properties. In the Properties dialog box, click General and check the Size

label.


CHAPTER 6

SYSTEM DESIGN

6.1 SYSTEM ARCHITECTURE

Fig 6.1: System Architecture (Input Image (Personal Photo) → Face Detection → Query Faces → FR engines FR1, FR2, ..., FRk → Merging of FR Results → Output Image (Name-Tagged Personal Photo))

6.2 DATA FLOW DIAGRAMS

USE CASE DIAGRAM

Fig 6.2.1: Use Case Diagram


CLASS DIAGRAM

Fig 6.2.2: Class Diagram

SEQUENCE DIAGRAM

Fig 6.2.3: Sequence Diagram


COLLABORATION DIAGRAM

Fig 6.2.4: Collaboration Diagram

COMPONENT DIAGRAM

Fig 6.2.5: Component Diagram


ACTIVITY DIAGRAM

Fig 6.2.6: Activity Diagram


6.3 MODULE DESCRIPTION

MODULES:

This project is divided into 3 modules:

1. Face Detection

2. Selection of Face Recognition Engines

3. Merging of Face Recognition Results

6.3.1 FACE DETECTION:

Face detection can be regarded as a specific case of object-class

detection. In object-class detection, the task is to find the locations and

sizes of all objects in an image that belong to a given class

Face detection can be regarded as a more general case of face

localization. In face localization, the task is to find the locations and sizes

of a known number of faces (usually one).

Early face-detection algorithms focused on the detection of frontal

human faces, whereas newer algorithms attempt to solve the more general

and difficult problem of multi-view face detection. That is, the detection

of faces that are either rotated along the axis from the face to the observer

(in-plane rotation), or rotated along the vertical or left-right axis (out-of-

plane rotation), or both. The newer algorithms take into account

variations in the image or video by factors such as face appearance,

lighting, and pose.

Fig 6.3.1: Face Detection


FEATURE TYPES AND EVALUATION:

The features employed by the detection framework universally

involve the sums of image pixels within rectangular areas. As such, they

bear some resemblance to Haar basis functions, which have been used

previously in the realm of image-based object detection. However, since

the features used by Viola and Jones all rely on more than one rectangular

area, they are generally more complex. The figure illustrates the four

different types of features used in the framework. The value of any given

feature is always simply the sum of the pixels within clear rectangles

subtracted from the sum of the pixels within shaded rectangles. As is to

be expected, rectangular features of this sort are rather primitive when

compared to alternatives such as steerable filters. Although they are

sensitive to vertical and horizontal features, their feedback is considerably

coarser. However, with the use of an image representation called

the integral image, rectangular features can be evaluated in constant time,

which gives them a considerable speed advantage over their more

sophisticated relatives. Because each rectangular area in a feature is

always adjacent to at least one other rectangle, it follows that any two-

rectangle feature can be computed in six array references, any three-

rectangle feature in eight, and any four-rectangle feature in just nine.

Fig 6.3.2: Four Different types of Features
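The constant-time evaluation described above can be sketched as follows; this is a generic illustration of the integral-image idea, not code taken from the detection framework:

```csharp
// Integral image: ii[y, x] holds the sum of all pixels above and to the
// left of (x, y), so any rectangle sum needs only four array references.
static long[,] BuildIntegralImage(byte[,] img)
{
    int h = img.GetLength(0), w = img.GetLength(1);
    var ii = new long[h + 1, w + 1];                 // extra row/column of zeros
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            ii[y + 1, x + 1] = img[y, x] + ii[y, x + 1] + ii[y + 1, x] - ii[y, x];
    return ii;
}

// Sum of pixels in the rectangle with top-left corner (x, y), width w and height h.
static long RectSum(long[,] ii, int x, int y, int w, int h)
{
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x];
}

// A two-rectangle (left/right) feature is then one rectangle sum subtracted
// from the other, e.g.:
// long feature = RectSum(ii, x, y, w / 2, h) - RectSum(ii, x + w / 2, y, w / 2, h);
```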


LEARNING ALGORITHM:

The speed with which features may be evaluated does not adequately

compensate for their number, however. For example, in a standard 24x24

pixel sub-window, there are a total of 45,396 possible features, and it

would be prohibitively expensive to evaluate them all. Thus, the object

detection framework employs a variant of the learning

algorithm AdaBoost to both select the best features and to train classifiers

that use them.
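In rough outline (a simplified sketch with hypothetical types, not the exact Viola-Jones training procedure), one boosting round picks the feature whose classifier has the lowest weighted error and then re-weights the training windows:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: a weak classifier is one rectangular feature plus a
// threshold and polarity; Evaluate returns +1 (face) or -1 (non-face).
interface IWeakClassifier { int Evaluate(long[,] integralImage); }

static class Boosting
{
    // One boosting round: pick the feature with the lowest weighted error,
    // then re-weight the samples so the next round focuses on the mistakes.
    public static IWeakClassifier SelectBestFeature(
        IList<IWeakClassifier> candidates,
        IList<long[,]> samples, IList<int> labels, double[] weights)
    {
        IWeakClassifier best = null;
        double bestError = double.MaxValue;

        foreach (var c in candidates)
        {
            double error = 0;
            for (int i = 0; i < samples.Count; i++)
                if (c.Evaluate(samples[i]) != labels[i])
                    error += weights[i];            // weighted misclassification error
            if (error < bestError) { bestError = error; best = c; }
        }

        // Down-weight the samples the chosen classifier got right.
        double beta = bestError / (1 - bestError);
        for (int i = 0; i < samples.Count; i++)
            if (best.Evaluate(samples[i]) == labels[i])
                weights[i] *= beta;

        // Normalize the weights so they sum to one again.
        double total = weights.Sum();
        for (int i = 0; i < weights.Length; i++) weights[i] /= total;

        return best;
    }
}
```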

CASCADE ARCHITECTURE:

Fig 6.3.3: Cascade Architecture

The evaluation of the strong classifiers generated by the learning process

can be done quickly, but it isn’t fast enough to run in real-time. For this

reason, the strong classifiers are arranged in a cascade in order of

complexity, where each successive classifier is trained only on those

selected samples which pass through the preceding classifiers. If at any

stage in the cascade a classifier rejects the sub-window under inspection,

no further processing is performed and the search continues with the next sub-window (see the figure above). The cascade therefore has the form of a

degenerate tree. In the case of faces, the first classifier in the cascade –

called the attentional operator – uses only two features to achieve a false

negative rate of approximately 0% and a false positive rate of 40%. The


effect of this single classifier is to reduce by roughly half the number of

times the entire cascade is evaluated.
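A sketch of the cascade idea, again with hypothetical types: each stage may reject a sub-window immediately, and only windows that pass every stage are reported as faces.

```csharp
using System.Collections.Generic;

// Hypothetical stage type: a boosted "strong" classifier returning pass/fail.
interface IStageClassifier { bool Passes(long[,] integralImage, int x, int y, int size); }

static class Cascade
{
    // Returns true only if the sub-window survives every stage, arranged in
    // order of increasing complexity; most windows fail the first stages.
    public static bool IsFace(IList<IStageClassifier> stages,
                              long[,] ii, int x, int y, int size)
    {
        foreach (var stage in stages)
            if (!stage.Passes(ii, x, y, size))
                return false;   // rejected: no further processing for this window
        return true;
    }
}
```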

6.3.2 SELECTION OF FACE RECOGNITION ENGINES

Selection of suitable FR engines is based on the presence of social

context in personal photo collections. Social context refers to the strong

tendency that users often take photos together with friends, family

members, or co-workers.

The use of social context for selecting suitable FR engines is

motivated by two reasons. First, social context is strongly consistent in a

typical collection of personal photos. Therefore, query face images

extracted from the photos of the current user are likely to belong to close

contacts of the current user. Second, it is likely that each FR engine has

been trained with a high number of training face images and

corresponding name tags that belong to close contacts of the owner of the

FR engine. Consequently, by taking advantage of social context, the

chance increases that FR engines are selected that are able to correctly

recognize query face images.

CONSTRUCTION OF A SOCIAL GRAPH MODEL:

A Social Graph Model (SGM) is constructed for selecting suitable

FR engines. Social relationship between the current user and the OSN

members in his/her contact list are quantified by making use of the

identity occurrence and the co-occurrence probabilities of individuals in

personal photo collections.

Let $l_{user}$ be the identity label (i.e., the name) of the current user. Then, let $S_{l_{user}} = \{l_m\}_{m=1}^{M}$ be a set consisting of M different identity labels. These identity labels correspond to OSN members that are contacts of the current user. Note that $l_m$ denotes the identity label of the m-th contact and, without any loss of generality, $l_m \neq l_n$ if $m \neq n$.


A social graph is represented by a weighted graph as follows:

$$G = (N, E, W)$$

where $N = \{n_m\}_{m=1}^{M} \cup \{n_{user}\}$ is a set of nodes that includes the current user and his/her contacts, $E = \{e_m\}_{m=1}^{M}$ is a set of edges connecting the node of the current user to the node of the m-th contact of the current user, and the element $w_m$ in $W$ represents the strength of the social relationship associated with $e_m$.

To compute $w_m$, we estimate the identity occurrence and co-occurrence probabilities from personal photo collections. The occurrence probability for each contact is estimated as follows:

$$\mathrm{Prob}_{occur}(l_m) = \frac{\sum_{P \in \mathcal{P}_{user}} \delta_1(l_m, P)}{|\mathcal{P}_{user}|}$$

where $\mathcal{P}_{user}$ denotes the entire collection of photos owned by the current user, $|\cdot|$ denotes the cardinality of a set, and $\delta_1(l_m, P)$ is an indicator function that returns one when the identity of the m-th contact is manually tagged in photo $P$ and zero otherwise.

In addition, the co-occurrence probability between the current user and the m-th contact is estimated as follows:

$$\mathrm{Prob}_{co\text{-}occur}(l_{user}, l_m) = \frac{\sum_{P \in \mathcal{P}_{OSN}} \delta_2(l_{user}, l_m, P)}{|\mathcal{P}_{OSN}|} \quad \text{for } l_m \in S_{l_{user}}$$

where $\mathcal{P}_{OSN}$ denotes all photo collections in the OSN that the current user has access to (this includes photo collections owned by the current user, as well as photo collections owned by his/her contacts), and $\delta_2(l_{user}, l_m, P)$ is a pairwise indicator function that returns one if the current user and the m-th contact of the current user have both been tagged in photo $P$ and zero otherwise.

Using the above two probabilities, $w_m$ is computed as follows:

$$w_m = \exp\big(\mathrm{Prob}_{occur}(l_m) + \mathrm{Prob}_{co\text{-}occur}(l_{user}, l_m)\big)$$

The use of an exponential function leads to a high weighting value when $\mathrm{Prob}_{occur}(l_m)$ and $\mathrm{Prob}_{co\text{-}occur}(l_{user}, l_m)$ both have high values.
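A minimal sketch of how the two probabilities and the weight $w_m$ might be computed from tag lists; the Photo type and tag representation here are assumptions, not the project's actual data model:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical photo record: the set of identity labels manually tagged in it.
class Photo { public HashSet<string> Tags = new HashSet<string>(); }

static class SocialGraph
{
    public static double OccurrenceProb(string contact, IList<Photo> userPhotos)
    {
        // Fraction of the current user's photos in which the contact is tagged.
        return userPhotos.Count(p => p.Tags.Contains(contact)) / (double)userPhotos.Count;
    }

    public static double CoOccurrenceProb(string user, string contact, IList<Photo> osnPhotos)
    {
        // Fraction of all accessible OSN photos in which both are tagged together.
        return osnPhotos.Count(p => p.Tags.Contains(user) && p.Tags.Contains(contact))
               / (double)osnPhotos.Count;
    }

    public static double Weight(string user, string contact,
                                IList<Photo> userPhotos, IList<Photo> osnPhotos)
    {
        // w_m = exp(Prob_occur + Prob_co-occur), as in the formulas above.
        return Math.Exp(OccurrenceProb(contact, userPhotos)
                      + CoOccurrenceProb(user, contact, osnPhotos));
    }
}
```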

SELECTION OF FR ENGINES:

In order to select suitable FR engines, we need to rank the FR

engines Ωm according to their ability to recognize a particular query face

image. To this end, we make use of the strength of the social relationship

between the current user and

the mth contact of the current user, represented by wm. Specifically, wm is

used to represent the relevance score of the mth FR engine

When the FR engines $\Omega_m$, $m = 1, \ldots, M$, have been ranked according to $w_m$, two solutions can be used to select suitable FR engines. The first solution consists of selecting the top K FR engines according to their relevance score $w_m$. The second solution consists of selecting all FR

engines with a relevance score that is higher than a certain threshold

value. In practice, the first solution is not reliable as the value of K may

significantly vary from photo collection to photo collection. Indeed, for

each photo collection, we have to determine an appropriate value for K by

relying on a heuristic process. Therefore, we adopt the second solution to

select suitable FR engines. Specifically, in our collaborative FR

framework, an FR engine is selected if its associated relevance score is

higher than the average relevance score $\sum_{m=1}^{M} w_m / M$.
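Continuing the sketch above, selecting the FR engines whose relevance score exceeds the average could look like this (the score array is a hypothetical structure):

```csharp
using System.Collections.Generic;
using System.Linq;

static class EngineSelection
{
    // relevance[m] = w_m for the m-th contact's FR engine; an engine is kept
    // if its score is higher than the average over all M engines.
    public static List<int> SelectEngines(IReadOnlyList<double> relevance)
    {
        double average = relevance.Average();
        return Enumerable.Range(0, relevance.Count)
                         .Where(m => relevance[m] > average)
                         .ToList();
    }
}
```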

6.3.3 MERGING OF FACE RECOGNITION RESULTS

The purpose of merging multiple FR results retrieved from

different FR engines is to improve the accuracy of face annotation. Such

an improvement can be accomplished by virtue of a complementary

effect caused by fusing multiple classification decisions regarding the

identity of a query face image.

In an OSN, FR engines that belong to different members of the

OSN may use different FR techniques. For example,

some feature extractors may have been created using global face

representations, whereas other feature extractors may have been

created using local face representations. Such an observation holds

especially true for OSNs that have been implemented

in a decentralized way, a topic that is currently of high interest. Therefore,

we only consider fusion of multiple classifier results at measurement level

and at decision level.

FUSION USING A BAYESIAN DECISION RULE

To combine multiple FR results at measurement level, we propose

to make use of fusion based on a Bayesian decision rule (BDRF). This

kind of fusion is suitable for converting different types of distances or

confidences into a common a posteriori probability. Hence, multiple FR


results originating from a set of heterogeneous FR engines can be easily

combined through a Bayesian decision rule. Moreover, the use of a

Bayesian decision rule allows for optimal fusion at measurement level.

To perform collaborative FR, the query face image $Q$ and the enrolled face images $T^{(n)}$, $1 \le n \le G$, are independently and simultaneously submitted to the $K$ selected FR engines. Then, let $q_k$ and $t_k^{(n)}$ be the feature vectors extracted from $Q$ and $T^{(n)}$ by the k-th engine. Here, the distance $d_k^{(n)}$ between them can be computed by using the NN classifier assigned to $\Omega_k$.

Distance scores calculated by different NN classifiers may not be comparable due to the use of personalized FR engines. The incomparable distance scores therefore need to be mapped onto a common representation that takes the form of a posteriori probabilities. To obtain an a posteriori probability related to dk(n), we first convert the distance scores into corresponding confidence values using a sigmoid activation function. This can be expressed as

ck(n) = 1 / (1 + exp(dk(n)))

It should be emphasized that dk(n) (1 ≤ n ≤ G) for a particular k must be normalized to have zero mean and unit standard deviation prior to the computation of the confidence value ck(n). A sum normalization method is subsequently employed to compute an a posteriori probability:

Fk(n) = prob(l(Q) = l(T(n)) | Ωk, Q) = ck(n) / Σn=1..G ck(n)

where 0 ≤ Fk(n) ≤ 1. Here, Fk(n) represents the probability that the identity label of T(n) is assigned to Q, assuming that Q is forwarded to Ωk for FR purposes. Without weighting, FR result scores produced by less reliable FR engines would contribute to the final FR result score with the same importance as the FR result scores contributed by highly reliable FR engines. Therefore, we assign a weight to each Fk(n) that takes into account the relevance score of the associated FR engine. The rationale behind this weighting scheme is that FR engines with high relevance scores are expected to be well trained for a given query face image. The weighted FR result score can be defined as follows:

F̂k(n) = Fk(n) + α · Fk(n) · Rk

where Rk = (wk − wmin) / (wmax − wmin), wmax and wmin are the maximum and minimum of all relevance scores, and the parameter α reflects the importance of Rk relative to Fk(n). Note that the importance of Rk becomes higher as α increases. Thus, when α = 0, only the FR result scores are used during the fusion process. By properly adjusting the value of α, we can increase the importance of FR result scores produced by FR engines with high relevance scores and decrease the importance of FR result scores produced by FR engines with low relevance scores. To merge the weighted FR result scores F̂k(n), the sum rule is used:

CF(n) = Σk=1..K F̂k(n)

The sum rule allows for optimal fusion at measurement level, compared to other rules such as the product and median rules. Finally, to perform face annotation on Q, the identity label of Q is determined by choosing the identity label of the T(n) that achieves the highest value of CF(n):

l(Q) = l(T(n*)), where n* = arg max n=1..G CF(n)
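To make the fusion steps above concrete, the following sketch (hypothetical helper code under the assumptions stated in the comments, not the project's implementation) follows the equations in order: z-score normalization of the distance scores, sigmoid conversion to confidences, sum normalization to posteriors, relevance weighting with Rk and α, sum-rule fusion, and the final arg max.

using System;
using System.Linq;

static class CollaborativeFusion
{
    // distances[k][n]: distance d_k(n) between the query face and the n-th
    // candidate face, as reported by the k-th selected FR engine (K engines, G candidates).
    // w[k]: relevance score of the k-th selected engine; alpha controls how strongly
    // the relevance scores influence the fused result.
    public static int Annotate(double[][] distances, double[] w, double alpha)
    {
        int K = distances.Length;
        int G = distances[0].Length;
        double wMin = w.Min(), wMax = w.Max();
        double[] cf = new double[G];

        for (int k = 0; k < K; k++)
        {
            // 1) z-score normalization of d_k(n) for this engine
            double mean = distances[k].Average();
            double std = Math.Sqrt(distances[k].Sum(d => (d - mean) * (d - mean)) / G);
            if (std == 0) std = 1;

            // 2) sigmoid: c_k(n) = 1 / (1 + exp(d_k(n)))
            double[] c = distances[k]
                .Select(d => 1.0 / (1.0 + Math.Exp((d - mean) / std)))
                .ToArray();

            // 3) sum normalization: F_k(n) = c_k(n) / sum over n of c_k(n)
            double sum = c.Sum();

            // 4) relevance weighting: F^_k(n) = F_k(n) * (1 + alpha * R_k)
            double rk = (wMax > wMin) ? (w[k] - wMin) / (wMax - wMin) : 0;

            // 5) sum rule: CF(n) accumulates the weighted scores over all engines
            for (int n = 0; n < G; n++)
                cf[n] += (c[n] / sum) * (1 + alpha * rk);
        }

        // 6) the query is annotated with the candidate achieving the highest CF(n)
        int best = 0;
        for (int n = 1; n < G; n++)
            if (cf[n] > cf[best]) best = n;
        return best;
    }
}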

CHAPTER 7


SYSTEM IMPLEMENTATION

7.1 CODING

Coding is the process of designing, writing, testing, debugging, and

maintaining the source code of computer programs. This source code is

written in one or more programming languages. The purpose of

programming is to create a set of instructions that computers use to

perform specific operations or to exhibit desired behaviors. The process

of writing source code often requires expertise in many different subjects,

including knowledge of the application domain,

specialized algorithms and formal logic.

7.2 CODING STANDARD

A comprehensive coding standard encompasses all aspects of code

construction. While developers should prudently implement a standard, it

should be adhered to whenever practical. Completed source code should

reflect a harmonized style, as if a single developer wrote the code in one

session. At the inception of a software project, establish a coding standard

to ensure that all developers on the project are working in concert. When

the software project incorporates existing source code, or when

performing maintenance on an existing software system, the coding

standard should state how to deal with the existing code base.

The readability of source code has a direct impact on how well a

developer comprehends a software system. Code maintainability refers to

how easily that software system can be changed to add new features,

modify existing features, fix bugs, or improve performance. Although

readability and maintainability are the result of many factors, one

particular facet of software development upon which all developers have

an influence is coding technique. The easiest method to ensure a team of


developers will yield quality code is to establish a coding standard, which

is then enforced at routine code reviews.

Using solid coding techniques and good programming practices to

create high-quality code plays an important role in software quality and

performance. In addition, if you consistently apply a well-defined coding

standard, apply proper coding techniques, and subsequently hold routine

code reviews, a software project is more likely to yield a software system

that is easy to comprehend and maintain.
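As a simple illustration (hypothetical code, not part of the project), a standard-conforming C# method uses consistent naming and a brief comment stating its purpose:

// Illustrative only: a method written to a typical C# coding standard,
// with PascalCase naming and a short comment describing its intent.
public static bool IsValidEmailId(string emailId)
{
    // An e-mail identifier must be non-empty and contain exactly one '@'.
    if (string.IsNullOrEmpty(emailId))
        return false;
    int atCount = 0;
    foreach (char c in emailId)
        if (c == '@') atCount++;
    return atCount == 1;
}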

7.3 SAMPLE CODING

FACE DETECTION

using System;

using System.Collections.Generic;

using System.Diagnostics;

using System.Drawing;

using System.Drawing.Imaging;

using System.Runtime.InteropServices;

using System.Windows.Forms;

using Emgu.CV;

using Emgu.CV.Structure;

using Emgu.CV.UI;

using Emgu.CV.GPU;

using System.Data;

using System.Data.OleDb;

using System.Runtime.CompilerServices;


namespace FaceDetection

public partial class Form1 : Form

public static string ID;

public static string FName;

public static string LName;

OleDbConnection con;

public Form1()

InitializeComponent();

private void Form1_Load(object sender, EventArgs e)

Rectangle rect = Screen.PrimaryScreen.WorkingArea;

//Divide the screen in half, and find the center of the form to center it

this.Top = 0;

this.Left = 0;

this.Width = rect.Width;

this.Height = rect.Height;

this.SendToBack();

this.TopMost = false;

this.Activate();

con=new

OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data

Source=face_annotation.mdb;");


public Bitmap CropBitmap(Bitmap bitmap, Rectangle rect)

//Rectangle rect = new Rectangle();

Bitmap cropped = bitmap.Clone(rect, bitmap.PixelFormat);

return cropped;

private void openFileDialog1_FileOk(object sender,

System.ComponentModel.CancelEventArgs e)

textBox1.Text = openFileDialog1.FileName;

private void button2_Click(object sender, EventArgs e)

openFileDialog1.ShowDialog();

openFileDialog1.OpenFile();

pictureBox2.ImageLocation = textBox1.Text;

pictureBox2.Visible = true;

private void button3_Click(object sender, EventArgs e)

OleDbCommand cmd = new OleDbCommand();

OleDbCommand cmd1 = new OleDbCommand();

cmd.CommandType = CommandType.Text;


cmd.CommandText = "Insert into photos(EmailID,Photo)

Values('" + ID + "','" + pictureBox2.ImageLocation + "')";

con.Open();

cmd.Connection = con;

//cmd1.Connection = con;

cmd.ExecuteNonQuery();

con.Close();

Image<Bgr, Byte> image = new Image<Bgr,

byte>(textBox1.Text); //Read the files as an 8-bit Bgr image

Stopwatch watch;

String faceFileName = "haarcascade_frontalface_default.xml";

String eyeFileName = "haarcascade_eye.xml";

String contact, contacta="null", contact2;

String pname;

Double pcooccur, pcooccura;

int cooccur = 0, cooccura = 0;

int posn, posna;

Double[] weight = new Double[100];

Double[] sweight = new Double[100];

Double[] avalue = new Double[100];

Double[] wfrscore = new Double[100];

weight[0] = 0;

Double tweight = 0, tavalue = 0, cvalue = 0, frscore = 0;

String aval;

Double wmin = 0, wmax = 0, rk, aweight = 0;

if (GpuInvoke.HasCuda)


using (GpuCascadeClassifier face = new

GpuCascadeClassifier(faceFileName))

using (GpuCascadeClassifier eye = new

GpuCascadeClassifier(eyeFileName))

watch = Stopwatch.StartNew();

using(GpuImage<Bgr, Byte> gpuImage=new

GpuImage<Bgr, byte>(image))

using (GpuImage<Gray, Byte> gpuGray =

gpuImage.Convert<Gray, Byte>())

Rectangle[] faceRegion = face.DetectMultiScale(gpuGray,

1.1, 10, Size.Empty);

foreach (Rectangle f in faceRegion)

//draw the face detected in the 0th (gray) channel with blue color
image.Draw(f, new Bgr(Color.Blue), 2);

using (GpuImage<Gray, Byte> faceImg =

gpuGray.GetSubRect(f))

//For some reason a clone is required.

//Might be a bug of GpuCascadeClassifier in opencv

using (GpuImage<Gray, Byte> clone =

faceImg.Clone())

Rectangle[] eyeRegion =

eye.DetectMultiScale(clone, 1.1, 10, Size.Empty);


foreach (Rectangle s in eyeRegion)

Rectangle eyeRect = s;

eyeRect.Offset(f.X, f.Y);

image.Draw(eyeRect, new Bgr(Color.Red), 2);

watch.Stop();

else

int faces = 1;

int x = 300, y = 300;

//Read the HaarCascade objects

using (HaarCascade face = new HaarCascade(faceFileName))

using (HaarCascade eye = new HaarCascade(eyeFileName))

watch = Stopwatch.StartNew();

using (Image<Gray, Byte> gray = image.Convert<Gray,

Byte>()) //Convert it to Grayscale

//normalizes brightness and increases contrast of the image

gray._EqualizeHist();


MCvAvgComp[] facesDetected = face.Detect(

gray,

1.1,

10,

Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,

new Size(20, 20));

foreach (MCvAvgComp f in facesDetected)

//draw the face detected in the 0th (gray) channel with blue color

image.Draw(f.rect, new Bgr(Color.Transparent), 2);

//Set the region of interest on the faces

gray.ROI = f.rect;

MCvAvgComp[] eyesDetected = eye.Detect(

gray,

1.1,

10,

Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,

new Size(20, 20));

gray.ROI = Rectangle.Empty;

foreach (MCvAvgComp s in eyesDetected)

Rectangle eyeRect = s.rect;

eyeRect.Offset(f.rect.X, f.rect.Y);

//image.Draw(eyeRect, new Bgr(Color.Transparent), 2);


Bitmap bmp = new Bitmap(image.Bitmap);

Bitmap bmpgs = ConvertToGrayscale(bmp);

PictureBox pc = new PictureBox();

pc.Image = CropBitmap(bmp, f.rect);

pc.Size = new Size(100, 100);

pc.SizeMode = PictureBoxSizeMode.StretchImage;

this.Controls.Add(pc);

Image<Bgr, Byte> img = new Image<Bgr,

byte>(bmpgs);

TextBox tb = new TextBox();

TextBox tb1 = new TextBox();

tb.Text = (img.GetAverage().Red + img.GetAverage().Green +
img.GetAverage().Blue).ToString();

this.Controls.Add(tb);

tb.Location = new Point(x, y + 150);

tb.Hide();

pc.Location = new Point(x, y);

x = x + 200;

faces++;

con.Open();

string query1 = "SELECT * FROM photos";

OleDbDataAdapter adapt1 = new

OleDbDataAdapter(query1, con);

DataSet ds1 = new DataSet();

adapt1.Fill(ds1, "photos");

adapt1.Dispose();

DataView view1 = new DataView();

view1 = ds1.Tables[0].DefaultView;


posn = view1.Count;
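// For each accepted contact, estimate occurrence and co-occurrence counts
// from the tagged photos and derive the contact's weight (relevance score)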

string query = "SELECT * FROM friends where

EmailID='" + ID + "' and Condition='Accepted'";

OleDbDataAdapter adapt = new

OleDbDataAdapter(query, con);

DataSet ds = new DataSet();

adapt.Fill(ds, "friends");

adapt.Dispose();

DataView view = new DataView();

view = ds.Tables[0].DefaultView;

int f1 = view.Count;

int fa = f1;

foreach (DataRow dr in ds.Tables[0].Rows)

contact = view[f1 - 1].Row[1].ToString();

string query2 = "SELECT * FROM photos where

EmailID='" + ID + "'";

OleDbDataAdapter adapt2 = new

OleDbDataAdapter(query2, con);

DataSet ds2 = new DataSet();

adapt2.Fill(ds2, "photos");

adapt2.Dispose();

DataView view2 = new DataView();

view2 = ds2.Tables[0].DefaultView;

int puser = view2.Count;

string query3 = "SELECT * FROM queryface where

EmailID='" + ID + "' and Person='" + contact + "'";

OleDbDataAdapter adapt3 = new

OleDbDataAdapter(query3, con);


DataSet ds3 = new DataSet();

adapt3.Fill(ds3, "queryface");

adapt3.Dispose();

DataView view3 = new DataView();

view3 = ds3.Tables[0].DefaultView;

int occur = view3.Count;

Double poccur = (double)occur / puser; // cast to avoid integer division

string query31 = "SELECT * FROM queryface

where Person='" + contact + "'";

OleDbDataAdapter adapt31 = new

OleDbDataAdapter(query31, con);

DataSet ds31 = new DataSet();

adapt31.Fill(ds31, "queryface");

adapt31.Dispose();

DataView view31 = new DataView();

view31 = ds31.Tables[0].DefaultView;

int occur1 = view31.Count;

foreach (DataRow dr11 in ds31.Tables[0].Rows)

pname = view31[occur1 - 1].Row[1].ToString();

string query4 = "SELECT * FROM queryface

where Person='" + FName + " " + LName + "'

and Photo='" + pname + "'";

OleDbDataAdapter adapt4 = new

OleDbDataAdapter(query4, con);

DataSet ds4 = new DataSet();


adapt4.Fill(ds4, "queryface");

adapt4.Dispose();

DataView view4 = new DataView();

view4 = ds4.Tables[0].DefaultView;

cooccur = view4.Count;

occur1--;

//cooccur--;

pcooccur = (double)cooccur / posn; // cast to avoid integer division

weight[f1] = Math.Exp(poccur + pcooccur); // wm = exp(proboccur + probco-occur)

tweight = tweight + weight[f1];

f1--;

aweight = tweight / fa;
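// aweight is the average relevance score over all contacts,
// used as the threshold for selecting FR engines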

Label sfr = new Label();

sfr.Text = "Selected FR Engine";

sfr.Location = new Point(x, y + 150);

string query1a = "SELECT * FROM photos";

OleDbDataAdapter adapt1a = new

OleDbDataAdapter(query1a, con);

DataSet ds1a = new DataSet();

adapt1a.Fill(ds1a, "photos");

adapt1a.Dispose();

DataView view1a = new DataView();

view1a = ds1a.Tables[0].DefaultView;

posna = view1a.Count;

string querya = "SELECT * FROM friends where

EmailID='" + ID + "' and Condition='Accepted'";


OleDbDataAdapter adapta = new

OleDbDataAdapter(querya, con);

DataSet dsa = new DataSet();

adapta.Fill(dsa, "friends");

adapta.Dispose();

DataView viewa = new DataView();

viewa = dsa.Tables[0].DefaultView;

int f1a = viewa.Count;

int faa = f1a;

foreach (DataRow dr in dsa.Tables[0].Rows)

contacta = viewa[f1a - 1].Row[1].ToString();

string query2a = "SELECT * FROM photos where

EmailID='" + ID + "'";

OleDbDataAdapter adapt2a = new

OleDbDataAdapter(query2a, con);

DataSet ds2a = new DataSet();

adapt2a.Fill(ds2a, "photos");

adapt2a.Dispose();

DataView view2a = new DataView();

view2a = ds2a.Tables[0].DefaultView;

int pusera = view2a.Count;

string query3a = "SELECT * FROM queryface where

EmailID='" + ID + "' and Person='" + contacta + "'";

OleDbDataAdapter adapt3a = new

OleDbDataAdapter(query3a, con);

DataSet ds3a = new DataSet();

adapt3a.Fill(ds3a, "queryface");

adapt3a.Dispose();


DataView view3a = new DataView();

view3a = ds3a.Tables[0].DefaultView;

int occura = view3a.Count;

Double poccura = (double)occura / pusera; // cast to avoid integer division

string query31a = "SELECT * FROM queryface

where Person='" + contacta + "'";

OleDbDataAdapter adapt31a = new

OleDbDataAdapter(query31a, con);

DataSet ds31a = new DataSet();

adapt31a.Fill(ds31a, "queryface");

adapt31a.Dispose();

DataView view31a = new DataView();

view31a = ds31a.Tables[0].DefaultView;

int occur1a = view31a.Count;

foreach (DataRow dr11 in ds31a.Tables[0].Rows)

pname=view31a[occur1a- 1].Row[1].ToString();

string query4a = "SELECT * FROM queryface

where Person='" + FName + " " + LName + "'

and Photo='" + pname + "'";

OleDbDataAdapter adapt4a=new

OleDbDataAdapter(query4a, con);

DataSet ds4a = new DataSet();

adapt4a.Fill(ds4a, "queryface");

adapt4a.Dispose();

DataView view4a = new DataView();


view4a = ds4a.Tables[0].DefaultView;

cooccura = view4a.Count;

occur1a--;

//cooccura--;

pcooccura = (double)cooccura / posna; // cast to avoid integer division

weight[f1a] = Math.Exp(poccura + pcooccura);

f1a--;

if (aweight <= weight[f1a+1])

for (int i = 0; i < f1a; i++)

if (weight[i] > tweight)

for (int j = 0; j < f1a; j++)

sweight[j] = weight[i];

wmax = Math.Max(sweight[j], sweight[j

- 1]);

wmin = Math.Min(sweight[j], sweight[j -

1]);


watch.Stop();

fr.ID = ID;

fr.FName = FName;

fr.LName = LName;

fr.photo = image.Bitmap;

private void button4_Click(object sender, EventArgs e)

fr fr = new fr();

this.Hide();

fr.Show();

public Bitmap ConvertToGrayscale(Bitmap source)
{
    Bitmap bm = new Bitmap(source.Width, source.Height);
    for (int y = 0; y < bm.Height; y++)
    {
        for (int x = 0; x < bm.Width; x++)
        {
            Color c = source.GetPixel(x, y);
            // Weighted luma so that perceived brightness is preserved
            int luma = (int)(c.R * 0.3 + c.G * 0.59 + c.B * 0.11);
            bm.SetPixel(x, y, Color.FromArgb(luma, luma, luma));
        }
    }
    return bm;
}

CHAPTER 8

TESTING

8.1 INTRODUCTION

Testing is a process of executing a program with the

intent of finding an error. A good test has a high probability of finding an

as yet undiscovered error. A successful test is one that uncovers an as yet

undiscovered error. The objective is to design tests that systematically

uncover different classes of errors and do so with a minimum amount of time and effort. Testing cannot show the absence of defects; it can only show that defects are present.

8.1.1 TEST PLAN

The test-case designer has to consider not only white-box and black-box test cases, but also the timing of the data and the parallelism of the tasks that handle the data.

In many situations, test data provided when a real system is in one

state will result in proper processing, while the same data provided when

the system is in a different state may lead to error.

The intimate relationship that exists between real-time software

and its hardware environment can cause testing problems. Software tests

must consider the impact of hardware faults on software processing. A step-by-step testing strategy for real-time systems is therefore proposed.

The first step in the testing of real-time software is to test each task

independently; that is, the white-box and black-box tests are designed and

executed for each task. Each task is executed independently during these

tests. The task testing uncovers errors in logic and functions, but will not

uncover timing or behavioral errors.

VALIDATION

Validation is the process of evaluating software at the end of the development process to ensure compliance with the software requirements. It is the actual testing of the application and answers the question:

Am I building the right product?

Determining if the system complies with the requirements and

performs functions for which it is intended and meets the


organization’s goals and user needs. It is traditional and is

performed at the end of the project.

Performed after a work product is produced against established

criteria ensuring that the product integrates correctly into the

environment.

Determination of correctness of the final software product by a

development project with respect to the user needs and

requirements.

VERIFICATION

Verification is the process of determining whether or not the products of a given phase of the software development cycle meet the implementation steps and can be traced back to the objectives established during the previous phase.

Verification helps in detecting defects early and preventing their leakage downstream. Thus, the higher cost of later detection and rework is eliminated.

The goal of software testing is to assess the requirements of a

project; then the tester will determine if these requirements are met. There

are many times when low memory usage and speed are more important

than making the program pretty or capable of handling errors. While

programming skills are useful when testing software, they are not strictly required; however, they can be useful in determining the cause of errors found in a project.

8.2 TYPES OF TESTING

There are different types of testing in the development process.

They are:


Unit Testing

Integration Testing

Validation Testing

System Testing

Functional Testing

Performance Testing

Beta Testing

Acceptance Testing

8.2.1 UNIT TESTING

Developers write unit tests to check their own code. Unit

testing differs from integration testing, which confirms that components

work well together, and acceptance testing, which confirms that an

application does what the customer expects it to do. Unit tests are so named

because they test a single unit of code. Unit testing focuses verification

effort on the smallest unit of software design. Each of the modules in this

project was verified individually for errors.
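As an illustration (hypothetical test code, assuming the grayscale conversion used during face detection has been exposed as a static helper such as ImageUtils.ConvertToGrayscale and that a test framework such as NUnit is available), a unit test for one of the smallest units in this project might look as follows:

using System.Drawing;
using NUnit.Framework;

[TestFixture]
public class GrayscaleConversionTests
{
    [Test]
    public void ConvertToGrayscale_ProducesEqualColourChannels()
    {
        // Arrange: a small bitmap with a known colour
        using (var source = new Bitmap(2, 2))
        {
            source.SetPixel(0, 0, Color.FromArgb(200, 50, 30));

            // Act: convert the bitmap using the helper under test
            Bitmap gray = ImageUtils.ConvertToGrayscale(source);

            // Assert: a grayscale pixel has identical R, G and B components
            Color p = gray.GetPixel(0, 0);
            Assert.AreEqual(p.R, p.G);
            Assert.AreEqual(p.G, p.B);
        }
    }
}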

8.2.2 INTEGRATION TESTING

Integration testing is a systematic technique for constructing the program structure while, at the same time, conducting tests to uncover errors associated with the interfaces between modules. This testing was done with sample data. Integration testing is needed to evaluate overall system performance.

8.2.3 VALIDATION TESTING

Validation testing is where the requirements established as part

of the software requirements analysis are validated against the software

that has been constructed. It provides final assurance that the software

meets all functional, behavioral and performance requirements. A deviation

from the specification is uncovered and corrected. Each input field was tested against the specified validation rules to ensure integrity.


8.2.4 SYSTEM TESTING

System testing is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. System testing falls within the scope of black-box testing and, as such, should require no knowledge of the inner design of the code or logic.

8.2.5 PERFORMANCE TESTING

Performance testing covers a wide range of engineering or

functional evaluations where a material, product, system, or person is not

specified by a detailed material or component specification; rather, the emphasis is on the final measurable performance characteristics. Testing can be a qualitative or quantitative procedure.

8.2.6 BETA TESTING

In software development, a beta test is the second phase of

software testing in which a sampling of the intended audience tries the

product out. (Beta is the second letter of the Greek alphabet.) Originally,

the term alpha test meant the first phase of testing in a software

development process. The first phase includes unit testing, component

testing, and system testing. Beta testing can be considered "pre-release

testing." Beta test versions of software are now distributed to a wide

audience on the Web partly to give the program a "real-world" test and

partly to provide a preview of the next release.

8.2.7 ACCEPTANCE TESTING

Testing generally involves running a suite of tests on the

completed system. Each individual test, known as a test case, exercises a


particular operating condition of the user's environment or feature of the

system, and will result in a pass or fail, or Boolean, outcome. There is

generally no degree of success or failure. The test environment is usually

designed to be identical, or as close as possible, to the anticipated user's

environment, including extremes of such. These test cases must each be

accompanied by test case input data or a formal description of the

operational activities (or both) to be performed—intended to thoroughly

exercise the specific case—and a formal description of the expected

results.

CHAPTER 9

SCREEN SHOTS

USER ACCOUNT:


FACE DETECTION


SELECTION OF FR ENGINES


CHAPTER 10

CONCLUSION


Our project demonstrates that the collaborative use of multiple

FR engines allows improving the accuracy of face annotation for personal

photo collections shared on OSNs. The improvement in face annotation

accuracy can mainly be attributed to the following two factors.

1) In an OSN, the number of subjects that need to be annotated

tends to be relatively small, compared to the number of subjects

encountered in traditional FR applications. Thus, an FR engine

that belongs to an OSN member is typically highly specialized

for the task of recognizing a small group of individuals.

2) In an OSN, query face images have a higher chance of

belonging to people closely related to the photographer.

Hence, simultaneously using the FR engines assigned to these

individuals can lead to improved face annotation accuracy, as these FR

engines are expected to locally produce high-quality FR results for the

given query face images.

CHAPTER 11

FUTURE ENHANCEMENT


Our approach for constructing an SGM relies on the

availability of manually tagged photos to reliably estimate the

identity occurrence and co-occurrence probabilities. In practice, however,

two limitations can be identified:

1) individuals are only part of the constructed SGM when they

were manually tagged in the personal photo collections;

2) our approach may not be optimal when manually tagged photos

do not clearly represent the social relations between different

members of the OSN.
