Using Editing Operations to Improve Searching by Color in Multimedia Database Systems

Leonard Brown,1 Le Gruenwald2

1 Computer Science Department, The University of Texas at Tyler, Tyler, TX 75799
2 The University of Oklahoma, School of Computer Science, Norman, OK 73019

Received 9 November 2007; revised 31 March 2008; accepted 27 May 2008
Correspondence to: Leonard Brown; e-mail: [email protected]

ABSTRACT: Since multimedia database management systems determine similarity by comparing sets of image features, relevant images in the database can be missed if their features do not match those extracted from the query image. Many failed matches can be avoided if modified versions of the missed relevant images are also stored in the underlying database. To minimize the storage cost associated with adding extra images to the database, the modified versions can be stored as sequences of editing operations instead of as large, binary objects. This article presents a technique for processing color-based queries in this environment that accesses the sequences of editing operations directly. It also presents a methodology that can be used to speed up the query processing just as ordered indices speed up the processing of traditional queries. In addition, this article provides a performance evaluation illustrating the technique's strengths and weaknesses when compared with the traditional approach to processing color-based queries. The results indicate that with low similarity thresholds, the proposed technique processes similarity searches more accurately than the traditional approach while using less database storage space since the modified versions are kept as editing operation sequences. © 2008 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 18, 182–194, 2008; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ima.20155

Key words: multimedia databases; image retrieval; similarity search

I. INTRODUCTION

Because of the availability of faster and more powerful processors and the growth of the popularity of the Web, more and more computer applications are being developed that maintain collections of images and other types of multimedia data. Because multimedia data objects are different than traditional alphanumeric data, a MultiMedia DataBase Management System (MMDBMS) has different storage and retrieval requirements from those of a traditional DBMS. For example, images are typically much larger than traditional alphanumeric data elements, so an MMDBMS should employ efficient storage techniques. In addition, users interpret the content of images when they view them, so an MMDBMS should facilitate searching in those systems utilizing that content, a requirement commonly referred to as Content-Based Image Retrieval (CBIR) (Aslandogan et al., 1999; Smeulders et al., 2000; Dunckley, 2003; Deb et al., 2004; Datta et al., 2005; Vasconcelos, 2007; Datta et al., 2008).

Previous research (Brown et al., 2004; Dukkipati and Brown, 2005) has indicated that it is possible to improve the retrieval accuracy of an MMDBMS supporting CBIR by storing some of the images in the database in a nontraditional storage format, namely as sequences of editing operations. The purpose of this article is to present new tools and techniques for processing CBIR queries in systems that operate in this environment. In addition, this article presents data structures that can be used to speed up the time needed to search the underlying database while processing the queries. These new approaches are needed because the traditional techniques for performing CBIR assume that all images are stored in conventional binary formats. Our work in this article focuses on the visual property of color since its extraction process is typically the most straightforward.

The remainder of this article has the following organization. Section II provides a brief summary of the key points of using color to retrieve images in a conventional MMDBMS. Section III describes how having images stored as editing operations can improve the effectiveness of a CBIR system. Sections IV and V present an approach for identifying the colors that are present in an image stored as a set of editing operations and an approach for processing similarity searches when an MMDBMS stores images in this fashion, respectively. Section VI presents the results of a performance evaluation comparing the proposed approach to the conventional methods for processing color-based queries in terms of retrieval accuracy. Section VII describes and evaluates a technique for speeding up the execution time of the approaches in the earlier sections. Finally, Section VIII summarizes this article and provides directions for future work.

II. CONVENTIONAL APPROACHES TO SEARCHING IMAGES BY COLOR

To facilitate CBIR, systems typically extract features and generate a signature for each image in the database to represent its content so


that those features can be searched in response to a user's query. Subsequently, users can pose queries to the MMDBMS requesting images that have specific feature values. In addition, the extracted features can be used as the basis for measuring the similarity between two images, so users can pose queries, called similarity searches, which request all images similar to some query image that they specify. This query image may be directly supplied to the system by the user from an external source, or it may be specified through a relevance feedback mechanism that allows a user to have more interaction with the system by selecting and evaluating one or more images retrieved as the result of a previously submitted user query. These selected images can then be resubmitted to the database in order to refine the similarity search results. Examples of techniques and issues regarding performing relevance feedback are given in (Tao et al., 2006, 2007, 2008; Chatzis et al., 2007; Datta et al., 2008).

The extracted features are typically based upon visual properties of the images, and these properties typically reflect the inherent nature of the application domain supported by the MMDBMS. To illustrate, consider an application that performs autonomous navigation while driving and therefore needs to recognize images of road signs. When considering these images, it should be noted that many countries around the world have adopted specific color and shape-based conventions for classifying different types of road signs. This is because signs with recognizable symbols and colors are easier for people to use than signs with words, and the symbols and colors aid drivers and passengers that are not familiar with the local language. An MMDBMS supporting road sign recognition, then, should provide searching using color and shape-based features since they provide relevant information regarding the purpose of a sign.

When extracting color features, one common method used by existing systems is to generate a histogram for each image stored in the database. Each bin of a given histogram contains the percentage of pixels in its respective image that are of a particular color. These colors are usually obtained by uniformly quantizing the space of a color model such as RGB, HSV, or Luv into a system-dependent number of divisions. Numerous CBIR systems utilize similar histogram methods to either directly represent or compute alternative comparable representations for color-based features, including BIC (Stehling et al., 2002), DISIMA (Oria et al., 2001), MARS (Ortega et al., 1998), and RECI (Djeraba et al., 1997). A summary of an approach comparing the mean average precision of three color-based retrieval techniques is given in (Vasconcelos, 2007).
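As a concrete illustration of this extraction step, the following sketch builds a normalized histogram by uniformly quantizing RGB space. The number of divisions per channel and the bin layout are illustrative choices, not values taken from the paper.

```python
# Sketch: normalized color histogram via uniform quantization of RGB
# space into `divs` divisions per channel (divs**3 bins total).
def rgb_bin(r, g, b, divs=2):
    """Map an RGB triple (0-255 per channel) to a histogram bin index."""
    step = 256 // divs
    return (r // step) * divs * divs + (g // step) * divs + (b // step)

def color_histogram(pixels, divs=2):
    """pixels: iterable of (r, g, b). Returns a list of bin fractions."""
    counts = [0] * (divs ** 3)
    total = 0
    for r, g, b in pixels:
        counts[rgb_bin(r, g, b, divs)] += 1
        total += 1
    return [c / total for c in counts]

# A 4-pixel image that is half blue and half red:
hist = color_histogram([(0, 0, 255), (0, 0, 255), (255, 0, 0), (255, 0, 0)])
```

A query such as "at least 25% blue" then reduces to a threshold test on the bin(s) that the quantization maps blue to.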

Since each image is represented using a signature computed based on a color histogram, the system can allow users to query the database requesting the images that have a specified percentage of pixels containing a certain color, such as "Retrieve all images that are at least 25% blue." In addition, the system can process users' similarity searches by extracting a color histogram from the specified query image and then comparing it to the ones representing the images stored in the database. Note that this extraction phase is not necessary when the user selects a query image as a result of a relevance feedback process. This is because the query image in this case is an image retrieved from the database and therefore has already had its feature signature extracted and saved when it was originally inserted.

Common functions used to evaluate the similarity between two n-dimensional histograms <x1, . . ., xn> and <y1, . . ., yn> include the Histogram Intersection (Swain et al., 1991), evaluated as Σ min(xi, yi), and the Lp-Distances, evaluated as (Σ |xi − yi|^p)^(1/p). Additional functions for comparing histograms can be found in (Djeraba et al., 1997). Since the histograms are essentially points in a multidimensional space, multidimensional indexes, such as the R-tree (Guttman, 1984) and its numerous variants (Brown et al., 1998; Gaede et al., 1998), can be used to reduce the time required to process these queries.
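The two comparison functions above can be sketched directly; inputs are equal-length histograms of bin fractions, and the example values are illustrative.

```python
# Histogram Intersection: sum of bin-wise minima; for normalized
# histograms it is 1.0 when the two histograms are identical.
def histogram_intersection(x, y):
    return sum(min(xi, yi) for xi, yi in zip(x, y))

# Lp distance: (sum |xi - yi|^p)^(1/p); p=1 is the city-block distance,
# p=2 the Euclidean distance.
def lp_distance(x, y, p=2):
    return sum(abs(xi - yi) ** p for xi, yi in zip(x, y)) ** (1.0 / p)

a = [0.5, 0.5, 0.0]
b = [0.25, 0.5, 0.25]
sim = histogram_intersection(a, b)   # 0.75
d1 = lp_distance(a, b, p=1)          # 0.5
```

Note the two measures point in opposite directions: intersection grows with similarity, while the Lp distance shrinks.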

A. Problems with Conventional Approaches to CBIR. The above discussion indicates that instead of directly using the database images themselves, CBIR is typically performed in an MMDBMS utilizing features extracted from the database images. This indirect form of content representation often results in a "semantic gap" (Smeulders et al., 2000) between the features extracted from the image and the actual visual content humans perceive in it. Consequently, the results from similarity searches and image recognition queries submitted to a CBIR system are often inaccurate because the system bases its decisions on the extracted features. Thus, when features from two images do not match, the CBIR system will not consider the pair of images to be similar, even though humans may consider the images to be alike. Many instances of this problem persist as open issues in the CBIR research community. For example, it is difficult to match images of the same object under varying lighting conditions or under varying settings such as outdoor environments (Zhao et al., 2003).

To illustrate the above problem, consider Figure 1, which contains a query image of a stop sign on the left and a database of road sign images on the right. If the features extracted from the query image do not match the features extracted from the stop sign in the database, the CBIR system would be unable to accurately match or recognize the query image, which is a false negative. As presented in (Gupta et al., 1997), minimizing the occurrences of false negatives is often considered more important than reducing the number of false positives, since a user can filter out any unwanted returned images but has no way of knowing the existence of a matching database image that was not retrieved.

Figure 1. Example environment of an augmented MMDBMS supporting CBIR.

One technique for addressing the above matching problem is to expand the given query image q into several query images as in (Tahaghoghi et al., 2001; Jin et al., 2003), where each new query image is created by editing q. Each of the images is submitted to the database separately, and the results from all of them are combined together to form one resulting collection. This technique is somewhat analogous to text retrieval systems that augment terms in a user's query utilizing a manually produced thesaurus before searching a collection of documents. Another technique, called relevance feedback, is also related to the notion of multiple querying. In this technique, a user can evaluate the results from a CBIR query by marking one or more of the images as relevant or not relevant, and then resubmit that information to refine the original query. Both the multiple query and relevance feedback techniques can improve CBIR accuracy by recognizing that a single query image may not precisely express the information the user wants from the database. These approaches, then, improve CBIR by reducing the gap identified by the information retrieval community as the difference between the user's query, the specific statement processed by the retrieval system, and the user's information need, the conceptual question he or she truly wants answered.

III. DATABASE AUGMENTATION

In database augmentation, the problems of feature matching are addressed by adding new images to the database created by editing the original images already present. To illustrate how the addition of edited images can help retrieval, consider Figure 2. In the figure, the same comparison scheme used in Figure 1 may be able to match the query image to one of the darkened images along the bottom row. So, as long as the connections between the original photos and the darkened photos in Figure 2 are maintained, the CBIR system would now have the ability to recognize the query image without having to change the basic feature extraction or comparison scheme employed by the system.

The advantage of the database augmentation approach over multiple query image approaches becomes evident when considering the time that it would take to process queries using each approach. In multiple query approaches, the features must be extracted from each of the query images in order to compare them to the features in the underlying MMDBMS, and feature extraction is a very expensive process. Let t1 represent the time needed to extract the feature signature used for comparison from an image, and let t2 represent the time needed to compare two image signatures. In addition, let n represent the number of images in the database, and let k represent the number of additional query images submitted in the multiple query image approach as well as the number of modified images added for each original database image in the database augmentation approach. Multiple query approaches would require (k + 1) × t1 time to extract the signatures from the query images. Alternatively, the database augmentation approach would only require t1 units of time to extract the signatures since there is only one query image. Assuming that there is no indexing technique on the database, the multiple query image method would require (k + 1) × n × t2 image similarity comparisons since each query image would have to be compared with each database image. The database augmentation method would also require (k + 1) × n × t2 image similarity comparisons since that is the number of image objects that would be contained in the database. Thus, the multiple query image method would require (k + 1) × t1 + (k + 1) × n × t2 time to process a query, which is larger than the t1 + (k + 1) × n × t2 time needed to process a query in the database augmentation method.
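The cost model above can be written out as a small sketch; t1, t2, n, and k are the symbols defined in the text, while the numeric values below are purely illustrative.

```python
# Query-time cost model from the text. t1 = signature extraction time,
# t2 = signature comparison time, n = database images, k = extra query
# images (multiple-query) or extra stored versions per image (augmentation).
def multiple_query_cost(t1, t2, n, k):
    # (k + 1) extractions, and each of the (k + 1) query images is
    # compared against all n database signatures.
    return (k + 1) * t1 + (k + 1) * n * t2

def augmentation_cost(t1, t2, n, k):
    # One extraction, but the database now holds (k + 1) * n signatures.
    return t1 + (k + 1) * n * t2

# The comparison terms cancel, so the gap is exactly k * t1 extractions:
t1, t2, n, k = 100.0, 1.0, 1000, 4
diff = multiple_query_cost(t1, t2, n, k) - augmentation_cost(t1, t2, n, k)
```

Since feature extraction (t1) dominates comparison (t2) in practice, saving k extractions per query is the decisive advantage.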

Relevance feedback approaches do not suffer the same performance penalty as multiple query image approaches. This is because the retrieved images evaluated by the users have already been inserted into the system, so their features have already been extracted and stored in the database. However, in relevance feedback approaches, users evaluate the images that were retrieved and not the images that were not retrieved from the system. So, while this approach is effective in refining the set of retrieved images to ensure that most of the retrieved images are relevant, it still does not address the problem of matching an unusual database image whose features completely differ from those of the query image. This is addressed in the database augmentation approach, however, because the additional images serve as the mechanism for linking a query image to the unusual database image.

One disadvantage of augmenting an MMDBMS to improve retrieval accuracy is that it increases the number of images stored in the underlying database. This disadvantage is magnified because one of the characteristics that distinguish multimedia data from traditional alphanumeric data is that multimedia data objects are much larger. Thus, adding more images to the database results in a nontrivial increase in the storage required by the MMDBMS. To minimize the effects of this disadvantage, an MMDBMS can adopt the technique of storing the edited images as sequences of operations (Speegle et al., 1998, 2000; Brown et al., 2004) instead of storing them in a conventional binary format such as JPEG (Wallace, 1991). The purpose of utilizing this format is that an image stored as a set of editing operations will consume much less space than the same image stored in a conventional binary format. Specifically, if an image e is created by editing an original base image object, say b, the edited image is stored as a reference to b along with the sequence of operations used to change b into e. Instantiating an image stored in this format can be accomplished by accessing the referenced base image and sequentially executing the associated editing operations.

The current methods for extracting features from images require that all of the images be stored in a binary format. So, in an augmented database, any images stored as editing operations must first be instantiated for the system to use the current methods of feature extraction. Since instantiation is an expensive process in terms of execution time, it should be avoided. In the next section, we present an approach that accomplishes this by identifying the colors in an image directly from the editing operations themselves. This presentation extends an earlier version (Brown et al., 2004) by providing a more extensive performance evaluation of our approach, including evaluations of the retrieval accuracy against each individual operation that can be used by the system.

Figure 2. Example environment of an augmented MMDBMS supporting CBIR.

IV. IDENTIFYING COLORS IN IMAGES WITHOUT INSTANTIATION

The primary motivation of our approach for processing retrieval queries in an augmented MMDBMS is to avoid instantiating the edited images. We infer the values of the features of the edited images directly from the sequence of operations contained in their descriptions. To describe our approach, then, it is necessary to explicitly define the storage format as well as the operations that may appear in the description of an edited image.

A. Storage Format of Edited Images. Two components are necessary for storing an edited image as a sequence of operations. Specifically, the system must store both the original image that was transformed to create the edited image and the operation or set of operations that comprise the transformation. Thus, the description of an edited image contains both a reference to the original image and the set of transformation operations.
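A minimal sketch of this two-part description follows; the field and type names are illustrative, not identifiers from the paper.

```python
# An edited image is stored as (reference to base image, operation list)
# rather than as instantiated pixels.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class EditOp:
    name: str       # "Define", "Combine", "Modify", "Mutate", or "Merge"
    params: Tuple   # operation-specific parameters

@dataclass
class EditedImage:
    base_image_id: str                       # reference to the original image
    ops: List[EditOp] = field(default_factory=list)

# A recolored copy of a flag image, described without storing any pixels:
# select a region, then change blue (0,0,255) pixels in it to green.
recolored = EditedImage(
    base_image_id="flag_001",
    ops=[EditOp("Define", (32, 96, 224, 288)),
         EditOp("Modify", (0, 0, 0, 0, 0, 255, 255, 255, 0))],
)
```

Instantiation would walk `ops` in order against the pixels of the referenced base image; the description itself stays a few bytes per operation.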

Our proposed approach takes actions based on specific transformation operations; thus, it assumes that only members of a specific set of operations may be used to create the edited images. This set was presented in (Speegle et al., 1998, 2000) and is composed of five operations called Define (x1, y1, x2, y2), Combine (C11, . . ., C33), Modify (Rmin, Rmax, Rnew, Gmin, Gmax, Gnew, Bmin, Bmax, Bnew), Mutate (M11, . . ., M33), and Merge (target_image, coordinates). This set is used because it has the capability to add, modify, and delete a single pixel at a time. Theoretically, then, any image can be transformed into any other given image by repeatedly applying the operations in the set on individual pixels (Brown et al., 1997).

The Define operation selects the group of pixels that will be edited by the subsequent operations in the list, and the parameters to the operation specify the coordinates of the desired group of pixels, called the Defined Region (DR). The Combine operation is used to blur images by changing the colors of the pixels in the DR to the weighted average of the colors of the pixels' neighbors, and the parameters to the operation are the weights (C11, . . ., C33) applied to each of the nine neighbors. The Modify operation is used to explicitly change the colors of the pixels in the DR that are of a certain color, RGBold, into a new color, RGBnew. The parameters of the Modify operation specify both RGBold and RGBnew. The Mutate operation is used to rearrange pixels within an image, and the parameters specify the matrix (M11, . . ., M33) used to change the locations of the pixels. This operation can be used to perform rotations, scales, and translations of items within an image. Finally, the Merge operation is used to copy the current DR into a target image. The parameters specify the target image and the coordinates where the DR is to be copied.
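The Modify semantics described above can be sketched over a flat list of DR pixels; the real operation acts on a region of a stored image, so this is only an illustration of the parameter meanings.

```python
# Modify(Rmin, Rmax, Rnew, Gmin, Gmax, Gnew, Bmin, Bmax, Bnew): every
# pixel in the defined region whose channels fall within the ranges
# [Rmin,Rmax], [Gmin,Gmax], [Bmin,Bmax] (RGBold) is replaced by
# (Rnew, Gnew, Bnew); all other pixels are left unchanged.
def modify(dr_pixels, rmin, rmax, rnew, gmin, gmax, gnew, bmin, bmax, bnew):
    out = []
    for r, g, b in dr_pixels:
        if rmin <= r <= rmax and gmin <= g <= gmax and bmin <= b <= bmax:
            out.append((rnew, gnew, bnew))   # pixel matched RGBold
        else:
            out.append((r, g, b))            # pixel unchanged
    return out

# The Figure 6 example: Modify(0, 0, 0, 0, 0, 255, 255, 255, 0) turns
# blue (0,0,255) into green (0,255,0); white and red are untouched.
flag = [(0, 0, 255), (255, 255, 255), (255, 0, 0)]
edited = modify(flag, 0, 0, 0, 0, 0, 255, 255, 255, 0)
```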

Examples of the above operations are given in Figures 3–7. Figure 3 displays an example of the DR created by the Define operation when applied to an image of the state flag of Oklahoma (HTTP, 2003a). Figure 4 displays the results of applying Combine (1, 2, 1, 2, 4, 2, 1, 2, 1) to this DR. Figure 5 displays the results of applying Mutate (2.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0) on a DR enclosing the star in the upper left corner of the original image. Figure 6 displays the results of applying Modify (0, 0, 0, 0, 0, 255, 255, 255, 0) to an image of the French national flag (HTTP, 2003a). The result is that the blue pixels along the left side of the flag are changed to green, while the red and white pixels are unchanged. Finally, Figure 7 shows the results of applying the Merge operation when the DR encloses the entire original image.

Figure 3. Rectangle corresponding to define (32, 96, 224, 288).
Figure 4. Blurred effects after applying combine (1, 2, 1, 2, 4, 2, 1, 2, 1).
Figure 5. Scale change after applying mutate (2, 0, 0, 0, 1, 0, 0, 0, 1).
Figure 6. Color change after applying modify (0, 0, 0, 0, 0, 255, 255, 255, 0).

B. Rules for Determining Effects of Editing Operations on Values of Histogram Bins. We infer the color features in an edited image using a set of rules that identify bounds on the percentage of pixels in an edited image that could map to a given color bin if it were instantiated. The purpose of each rule is to determine how its corresponding editing operation can change a given histogram bin, say HB. So, each rule is expressed as an adjustment to the minimum and maximum bounds on the percentage of pixels that may be in bin HB if the edited image is instantiated. The percentages are adjusted by repeatedly updating the total number of pixels that are in the image as well as the minimum and maximum numbers of pixels that are in bin HB for each operation listed in the description of the image.

Both the Combine (C11, . . ., C33) and Modify (Rmin, Rmax, Rnew, Gmin, Gmax, Gnew, Bmin, Bmax, Bnew) operations only change the colors of the pixels in the current DR. Because of this, one rule for both operations is that the total number of pixels in the image will not change after either operation is applied. In addition, the number of pixels that may change color is bounded by the number of pixels in the DR, denoted |DR|.

Now, consider the parameters of only the Modify operation. If the operation changes pixels so that they become a color that maps to bin HB, it can only increase the total number of pixels in the image that would map to bin HB. Thus, the maximum bound should increase by |DR| while the minimum bound remains constant. Alternatively, if the operation takes pixels whose colors map to bin HB and changes them to some new color that does not map to HB, it can only decrease the total number of pixels in the image that are in the bin. Thus, the minimum bound should be reduced by |DR| while the maximum bound remains constant. If no colors specified in the parameters map to bin HB, both the maximum and minimum bounds should remain unchanged.
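The Modify rule can be captured in a few lines. This is a sketch; the function name and the clamp at zero (a bound on a pixel count cannot go negative) are our additions.

```python
# Adjust the (min, max) bounds on the number of pixels in bin HB after a
# Modify operation, given |DR| and whether the old and new colors map to HB.
# Modify never changes the total pixel count, so only the bounds move.
def modify_bounds(min_hb, max_hb, dr_size, old_maps_to_hb, new_maps_to_hb):
    if new_maps_to_hb and not old_maps_to_hb:
        max_hb += dr_size                    # up to |DR| pixels may join HB
    elif old_maps_to_hb and not new_maps_to_hb:
        min_hb = max(0, min_hb - dr_size)    # up to |DR| pixels may leave HB
    # if neither color maps to HB, both bounds are unchanged
    return min_hb, max_hb

# Bounds of (200, 600) pixels in HB, |DR| = 150, and Modify recolors
# some non-HB color into a color that maps to HB:
lo, hi = modify_bounds(200, 600, 150, old_maps_to_hb=False, new_maps_to_hb=True)
```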

The value |DR| again serves as a bound for the number of pixels that may change as a result of applying the Combine operation to an image. However, we noted that pixels within homogeneously colored regions will not change because the operation determines a new color for a pixel based on the average color of its neighbors. We assume that a majority of the pixels in an edited image will be in a homogeneously colored region, so the rule for the Combine operation is that the adjustment to the number of pixels that are in bin HB will be so small that it can be ignored.

The rules for the Mutate operation are based on specific instances of its parameters. If the current DR contains the whole image, then the distribution of colors in the image should remain the same. Alternatively, if the parameters imply a rigid body transformation, then the DR will simply be moved without any scaling. Thus, the total number of pixels that may change color is bounded by |DR| as in the previous operations.

The rules for the Merge operation adjust the percentage of pixels in bin HB based on the combination of the pixels in the DR and the colors in the target image. They were developed from the following observations. First, adding the minimum (maximum) numbers of pixels in bin HB from the DR and the target image gives the minimum (maximum) number of pixels in bin HB for the resulting image. Second, the size of the resulting image will be equal to the size of the target image, unless the DR is copied onto a position that causes that image to grow, such as pasting the DR beginning at the lower right corner of the target image. Finally, if a cropping operation is specified, meaning that the target image is NULL, the size of the resulting image will be equal to the size of the DR.

The minimum bound for the DR is equal to the number of pixels in the DR minus the total number of pixels in the image that are not in bin HB. The maximum bound for the DR is equal to the smaller of two values: the number of pixels in the DR and the number of pixels in bin HB in the entire image. The minimum bound for the target is equal to |DR| subtracted from the number of pixels in bin HB in the target image before applying the operation. The maximum bound for the target image is equal to the smaller of two values: the number of pixels in bin HB in the target image before applying Merge and the number of pixels in the target image not covered by the DR.
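To make the Merge bounds concrete, here is a small worked check in Python with hypothetical values (an edited image of |E| = 100 pixels with exactly 40 in bin HB, and a cropping Merge that keeps a 50-pixel DR). The clamp of the minimum at zero is our addition; a negative lower bound is vacuously satisfied.

```python
# Hypothetical numbers: edited image E has 100 pixels, exactly 40 of which
# are in bin HB (|HB|min = |HB|max = 40); a cropping Merge keeps a 50-pixel DR.
E, hb_min, hb_max, DR = 100, 40, 40, 50

# Minimum for the DR: |DR| - (|E| - |HB|min), clamped at 0 (our addition;
# a negative count carries no information).
dr_min = max(0, DR - (E - hb_min))

# Maximum for the DR: MIN(|DR|, |HB|max) -- the DR cannot hold more HB
# pixels than either its own size or the whole image's HB count.
dr_max = min(DR, hb_max)

print(dr_min, dr_max)   # 0 40, out of a 50-pixel result image
```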

Table I provides a summary of the above formulae for computing the adjustments to the minimum and maximum bounds of the number of pixels in bin HB after the application of each operation. In the table, |E|, |T|, |THB|, |HB|min, and |HB|max represent the number of pixels in the edited image, the number of pixels in the target image of the Merge operation, the number of pixels in the target image that are in bin HB, the minimum number of pixels in bin HB, and the maximum number of pixels in bin HB, respectively.

Figure 7. Effects of combining images using merge (image2, 100, 120).

Table I. Summary of rules for adjusting bounds on numbers of pixels in bin HB.

Combine (C11, . . ., C33):
  All cases -- Minimum in HB: no change; Maximum in HB: no change; Total pixels: no change.

Modify (Rmin, Rmax, Rnew, Gmin, Gmax, Gnew, Bmin, Bmax, Bnew):
  If (Rnew, Gnew, Bnew) maps to HB -- Minimum: no change; Maximum: increase by |DR|; Total: no change.
  Else if ([Rmin...Rmax], [Gmin...Gmax], [Bmin...Bmax]) maps to HB -- Minimum: decrease by |DR|; Maximum: no change; Total: no change.
  Else -- Minimum: no change; Maximum: no change; Total: no change.

Mutate (M11, M12, M13, M21, M22, M23, M31, M32, M33):
  DR contains image -- Minimum, Maximum, and Total: multiply by |M11 × M22|.
  Rigid body -- Minimum: decrease by |DR|; Maximum: increase by |DR|; Total: no change.

Merge (Target, xp, yp):
  Target is NULL -- Minimum: |DR| − (|E| − |HB|min); Maximum: MIN(|HB|max, |DR|); Total: |DR|.
  Target is not NULL -- Minimum: |DR| − (|E| − |HB|min) + |THB| − |DR|; Maximum: MIN(|HB|max, |DR|) + MIN(|THB|, |T| − |DR|); Total: [MAX((xp + x2 − x1), height of Target) − MIN(xp, 0) + 1] × [MAX((yp + y2 − y1), width of Target) − MIN(yp, 0) + 1].

186 Vol. 18, 182–194 (2008)

Page 6: Using editing operations to improve searching by color in multimedia database systems

Consider using the rules to determine if an edited image, e, satisfies the given query. A system accesses the value of the histogram bin for the referenced base image given in the storage format of e, and then uses the above rules to determine how the associated editing operations modify that value. After applying the rules, let the

minimum number of pixels that are in bin HB be represented by

BOUNDmin, let the maximum number of pixels that are in bin HB

be represented by BOUNDmax, and let the size of the image be rep-

resented by imageSize. The range [BOUNDmin/imageSize,

BOUNDmax/imageSize] represents the bounds on the percentage of

pixels in image e that map to bin HB. If this range does not overlap

the desired query range, image e cannot satisfy the given query.

Thus, the above rules can be used to eliminate images that do not

satisfy a given query without producing false negatives by comput-

ing the range [BOUNDmin/imageSize, BOUNDmax/imageSize].
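As a concrete illustration, the Modify rule and the overlap test above can be sketched in Python. This is a minimal sketch, not the article's implementation: the function names are ours, and the clamping of the minimum bound at zero is an assumption we add (a negative pixel count is vacuous).

```python
def apply_modify_rule(hb_min, hb_max, dr_size, new_maps_to_hb, range_maps_to_hb):
    """Adjust the [min, max] pixel-count bounds for bin HB per Table I's Modify rules."""
    if new_maps_to_hb:
        # The new color falls in HB: at most |DR| more pixels can map to HB.
        return hb_min, hb_max + dr_size
    if range_maps_to_hb:
        # Pixels in HB may be recolored out of it: at most |DR| fewer.
        # (Clamping at zero is our addition.)
        return max(0, hb_min - dr_size), hb_max
    return hb_min, hb_max  # Modify cannot touch bin HB at all

def can_satisfy(hb_min, hb_max, image_size, pct_lo, pct_hi):
    """The image may satisfy the query only if [BOUNDmin/size, BOUNDmax/size]
    overlaps the desired range [pct_lo, pct_hi]."""
    return hb_min / image_size <= pct_hi and hb_max / image_size >= pct_lo

# Base image: 40 of 100 pixels in HB; a Modify recolors a 30-pixel DR whose
# source range maps to HB (cf. image I8 in Table IV).
lo, hi = apply_modify_rule(40, 40, 30, new_maps_to_hb=False, range_maps_to_hb=True)
print(lo, hi)                                 # 10 40
print(can_satisfy(lo, hi, 100, 0.25, 0.75))   # True: [0.1, 0.4] overlaps the query range
```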

V. PROCESSING SIMILARITY SEARCHES WITHOUT INSTANTIATION

In this section, we present our approach for processing similarity

searches of the type ‘‘Retrieve all images that are similar to query image q’’ while avoiding instantiation. To process a similarity

search, the MMDBMS must have an internal procedure with a con-

dition that indicates if two images are similar. This condition is of-

ten whether the distance between the two images is less than some

given threshold, t. Thus, we assume that there are two input param-

eters to the MMDBMS, a query image q and a threshold value t. Since the database contains both binary images and edited

images stored as sequences of operations, the query processor must

be able to compare images stored in either format to an input query

image q. Consequently, our approach operates in two phases where

the first phase identifies the binary images in the database that are

similar to q, and the second phase identifies the edited images that

are similar to q without instantiating them.

Our approach is displayed in Figure 8. The first phase covers the

first three steps in the figure, and it uses conventional histogram

techniques to process the binary images. The first step identifies the

input of the given query, which, as described above, contains the

query image q and a given threshold value, threshold. The second step is

to process the query image and extract its histogram stored in the

variable hq. The third step compares hq to the previously extracted

histograms corresponding to the binary images in the database using

the histogram intersection described in Section II.
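For reference, the comparison in this third step can be sketched with the standard normalized form of histogram intersection (the sum of bin-wise minima, with distance taken as one minus the intersection); Section II's exact formulation may differ slightly, so treat this as an assumption.

```python
def histogram_intersection(h1, h2):
    # Standard normalized histogram intersection: sum of bin-wise minima.
    return sum(min(a, b) for a, b in zip(h1, h2))

hq = [0, 0, 0, 0, 0, 0, 0.5, 0.5]   # query histogram from Section V's example
h1 = [0, 0, 0, 0, 0, 0, 0.6, 0.4]   # H1 for image I1 (Table II)
dist = 1.0 - histogram_intersection(hq, h1)
print(round(dist, 2))   # 0.1 -> I1 counts as similar under a 0.25 threshold
```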

Figure 8. Our approach for processing similarity search queries.

Table III. Sequence of operations for each edited image in the example database.

I5 (base image I2): Define (0, 0, 9, 4); Merge (NULL, 0, 0)
I6 (base image I3): Define (0, 0, 4, 3); Modify (0, 100, 255, 0, 100, 255, 0, 100, 255); Combine (1, 2, 1, 2, 4, 2, 1, 2, 1)
I7 (base image I4): Define (0, 0, 9, 0); Mutate (1, 0, 5, 0, 1, 5, 0, 0, 1)
I8 (base image I4): Define (0, 0, 9, 2); Modify (0, 255, 255, 0, 255, 255, 0, 255, 255)

Table II. Histograms for the binary images in the example database.

Histogram ID  Image ID  bin0  bin1  bin2  bin3  bin4  bin5  bin6  bin7
H1            I1        0     0     0     0     0     0     0.6   0.4
H2            I2        0.5   0.5   0     0     0     0     0     0
H3            I3        0.8   0     0     0     0     0     0     0.2
H4            I4        0     0     0     0     0     0     0.4   0.6


The second phase determines the similarity between q and the

edited images in the database, and it covers Step 4 through Step 9

given in Figure 8. To keep the results consistent with the distance

values produced by the first phase, the second phase comparisons

are also based upon the Histogram Intersection. Consequently, the

purpose of these steps is to estimate the minimum possible values

of the histogram bins of the edited images.

Steps 4–7 initialize variables that will be repeatedly updated

during execution. Step 8 contains the main loop that executes for

each bin listed in the variable queryBins. During each iteration of

this loop, the current bin being processed is represented by the vari-

able currentBin and the value of the histogram for the query image

at this bin is represented by the variable match. As the loop pro-

ceeds, it will repeatedly estimate the percentage of a specific color

in an edited image and determine if that value is close enough to

the known amount of the color that is present within the query

image, q, in order for the edited image to be considered similar to q. We use the variables [pctMin, pctMax] to represent the bounds of

the needed percentage of color that is necessary for being consid-

ered similar to q. Variable pctMin is computed as (match − threshold), and pctMax is computed as (match + threshold).

Given the above range [pctMin, pctMax], our approach executes

a loop for each edited image remaining in set S. Within this inner

loop, we use the rules described earlier to estimate the percentage

of pixels that would correspond to currentBin in each edited image,

e, if it were instantiated. Each estimated value is represented as a

boundary range [boundMin, boundMax], and if that range intersects

the target range [pctMin, pctMax], then the estimated similarity

between q and e is increased. This value is represented in the array

totalSum at index e, and the increase is either match or a value com-

puted as a function of [boundMin, boundMax] such as the average

of the boundary endpoints. If [boundMin, boundMax] does not

intersect the target range, then it means that e cannot possibly be

similar to the query image. Thus, it is removed from set S. After

this test, an additional check is performed to ensure that it is possi-

ble for image e to be considered similar to q using the remaining

colors represented by the bins in queryBins. This test is performed

by checking whether (totalSum[e] + pctRemaining) ≥ (1 − threshold). If this test is false, then image e gets removed from set S.

When the inner loop of Step 8 terminates, our approach will

have generated an estimate on the similarity of each edited image to

q. The final step, then, is to find those estimates that are within the

given threshold and return the edited images that correspond to

those estimates. Since the estimates in the totalSum array are based

upon the Histogram Intersection, we compute (1.0 − totalSum[e]) for each edited image e and compare it to the threshold value in

order to stay consistent with the distance values for the binary

images computed in the first phase.
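The second phase can be sketched as follows in Python. This is a hedged sketch of Steps 4-9 of Figure 8, not the article's code: `estimate_bounds` stands in for the rule evaluation of Section IV, and using the smaller of `match` and the average of the bounds as the similarity increment is one of the choices the text allows.

```python
def phase_two(query_hist, query_bins, threshold, edited, estimate_bounds):
    S = set(edited)                        # surviving edited-image ids
    total_sum = {e: 0.0 for e in edited}   # running similarity estimate per image
    remaining = sum(query_hist[b] for b in query_bins)
    for b in query_bins:
        match = query_hist[b]
        remaining -= match                 # query color mass left in later bins
        pct_lo, pct_hi = match - threshold, match + threshold
        for e in list(S):
            lo, hi = estimate_bounds(e, b)        # [boundMin, boundMax] for bin b
            if hi < pct_lo or lo > pct_hi:
                S.discard(e)                      # ranges disjoint: cannot be similar
            else:
                total_sum[e] += min(match, (lo + hi) / 2)
                if total_sum[e] + remaining < 1 - threshold:
                    S.discard(e)                  # too little color mass left to qualify
    # An image is similar when 1 - totalSum[e] is within the threshold.
    return {e for e in S if 1.0 - total_sum[e] <= threshold}

# Replaying Section V's example with the bounds from Tables IV and V:
bounds = {('I5', 6): (0.0, 0.0), ('I6', 6): (0.0, 0.0),
          ('I7', 6): (0.3, 0.5), ('I8', 6): (0.1, 0.4),
          ('I7', 7): (0.5, 0.7), ('I8', 7): (0.6, 0.9)}
result = phase_two({6: 0.5, 7: 0.5}, [6, 7], 0.25, ['I5', 'I6', 'I7', 'I8'],
                   lambda e, b: bounds.get((e, b), (0.0, 0.0)))
print(sorted(result))   # ['I7', 'I8'] -- I5 and I6 are pruned at bin 6
```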

Tables II–V illustrate an example application of our algorithm.

Tables II and III list the database’s binary and edited images,

respectively. Given a similarity search with a threshold of 0.25 and

a query histogram hq = <0, 0, 0, 0, 0, 0, 0.5, 0.5>, Table IV lists

the boundary computations of each edited image after executing

Step 8 for bin 6, which will eliminate edited images I5 and I6.

Table V lists the boundary computations of the remaining edited

images for bin 7.

VI. PERFORMANCE EVALUATION

To evaluate the performance of our approach, we have implemented

it on a UNIX platform using the Perl language. The system is capa-

ble of retrieving images by color using either our proposed

approach or the conventional one. The Web-enabled version of our

system is able to execute both similarity searches and simple range

queries over a collection of binary and edited images. Screenshots

of the retrieval interface of our system are displayed in Figure 9

where the left and right images display the range and similarity

search querying interfaces, respectively.

Table IV. Results of boundary computations for each edited image (Bin6).

Edited Image Operation Step |DR| |HB|min |HB|max ImageSize BoundMin BoundMax

I5 (Initialization) n/a 0 0 100 0.0 0.0

Define (0,0,9,4) 50 0 0 100 0.0 0.0

Merge(null,0,0) 50 0 0 50 0.0 0.0

I6 (Initialization) n/a 0 0 100 0.0 0.0

Define (0,0,4,3) 20 0 0 100 0.0 0.0

Modify(0,100,255, 0,100,255,0,100,255) 20 0 0 100 0.0 0.0

Combine (1,2,1,2,4,2,1,2,1) 20 0 0 100 0.0 0.0

I7 (Initialization) n/a 40 40 100 0.4 0.4

Define (0,0,9,0) 10 40 40 100 0.4 0.4

Mutate (1,0,5,0,1,5,0,0,1) 10 30 50 100 0.3 0.5

I8 (Initialization) n/a 40 40 100 0.4 0.4

Define (0,0,9,2) 30 40 40 100 0.4 0.4

Modify (0,255,255, 0,255,255, 0,255,255) 30 10 40 100 0.1 0.4

Table V. Results of boundary computations for each edited image (Bin7).

Edited Image Operation Step |DR| |HB|min |HB|max ImageSize BoundMin BoundMax

I7 (Initialization) n/a 60 60 100 0.6 0.6

Define (0,0,9,0) 10 60 60 100 0.6 0.6

Mutate (1,0,5,0,1,5,0,0,1) 10 50 70 100 0.5 0.7

I8 (Initialization) n/a 60 60 100 0.6 0.6

Define (0,0,9,2) 30 60 60 100 0.6 0.6

Modify (0,255,255, 0,255,255, 0,255,255) 30 60 90 100 0.6 0.9


Our performance evaluation only focuses on comparing the re-

trieval accuracy of the similarity search queries. As a result, our

evaluation only used the query processing portion of the prototype

to compare our approach to the conventional one. The static and

dynamic parameters of the performance evaluations are listed in

Tables VI and VII, respectively.

The data set used in our performance evaluation contained a col-

lection of 4760 total images. This data set was created using a col-

lection of binary images of international road signs obtained from

the Web (Geocities, 2005). For each of these binary images, we created four new edited images. Each edit consisted of two

operations from Section IV with each operation applied using ran-

dom parameters. The first operation was the Define operation used

to select some area within the image to edit, and the second operation

was applied to the selected area. Each of the four remaining operations, Combine, Merge, Mutate, and Modify, was used as the second operation, yielding the four edited images created for each original

image. For evaluating the performance of the conventional approach,

each edited image was instantiated and stored in the gif format.

The Web site (Geocities, 2005) classified the original collection

of signs into nine categories based in part on the November 1968

Convention on Road Signs and Signals. These categories listed on

the Web site serve in our evaluation as the basis for determining the

accuracy of both our proposed approach and the conventional

color-based retrieval approach. Specifically, all images in a given

category were considered to be similar, and two images from differ-

ent categories were considered to be not similar. Thus, when a

query of the type ‘‘Retrieve all images that are similar to query image q’’ was submitted to the system, the desired results should

have contained all of the images in q’s category. The results were

obtained using each of the original images in the database as the

query image q.

The metrics used to gauge the accuracy of the proposed and conventional approaches during the performance evaluation were precision and recall. Precision is computed as the number of relevant

images retrieved divided by the total number of images retrieved,

and recall is computed as the number of relevant images retrieved

divided by the total number of relevant images in the database. Typ-

ically, the precision of a system improves as the recall declines.

Again, the categories provided in (Geocities, 2005) served as the

basis for defining the images that were relevant.
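The two metrics just defined can be written as a one-line computation; the sets and names below are illustrative, not from the evaluation data.

```python
def precision_recall(retrieved, relevant):
    # Precision: relevant retrieved / total retrieved.
    # Recall: relevant retrieved / total relevant in the database.
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)

retrieved = {'a', 'b', 'c', 'd'}        # 4 images returned by a query
relevant  = {'b', 'c', 'e', 'f', 'g'}   # 5 images in the query's category
p, r = precision_recall(retrieved, relevant)
print(p, r)   # 0.5 0.4
```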

Each measurement was obtained by using each of the binary

images in the database as the query image. For each threshold

value, the average precision and recall of each query were com-

puted for every category of images. This yielded 9 precision values

and 9 recall values for each threshold value. These sets of values

Table VII. Dynamic parameters used in evaluation (Data set I).

Description Values

Threshold 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50,

0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95

Table VI. Static parameters used in evaluation (Data set I).

Description Default Value

Total Number of Images in the Database 4,760

Number of Edited Images in the Database 3,808

Number of Operations per Edited Image 2

Number of Image Categories 9

Color Model Luv

Histogram Dimensions 32

Figure 9. Retrieval interface for executing color-based range queries.


were then averaged to produce a single average precision and aver-

age recall value for each threshold. This procedure was used so that

each category of images would have equal weight in measuring the

retrieval accuracy of each approach.

Figures 10a and 10b display the results of our tests measuring

precision and recall, respectively. In each graph, the lighter line

‘‘Hist’’ corresponds to the results for the traditional-Histogram

based approach, and the darker line ‘‘Rule’’ corresponds to the

results for our proposed rule-based approach. Both tests varied the

threshold for determining similarity using values 0.05, 0.10, . . . , 0.95. In addition, Figure 10c displays a precision-recall graph

obtained using each threshold’s precision and recall measurement

pair as a data point. This graph gives an indication of the precision

that can still be obtained as the system’s recall improves from using

increasing threshold values. The precision-recall graph illustrates

that the rule-based approach is able to consistently generate results

with higher precision when the recall is smaller than 0.5. This result

is a reflection of Figures 10a and 10b which show the rule-based

approach outperformed the conventional histogram approach in

both precision and recall for smaller thresholds (thresholds below

0.5 in this test). In the tests, our rule-based approach produced an

average gain in precision of 7.8% and an average gain in recall of

4.8% when considering all thresholds. When considering only

thresholds below 0.5, the average gain was 11.9% for precision and

14.8% for recall. When the threshold exceeds 0.5, Figure 10b shows

that the conventional approach began to have higher recall rates

than our approach. This caused the lines of the precision-recall graph of

Figure 10c to coincide.

It should also be noted that our approach provides these per-

formance gains in addition to allowing a system to save space by

storing edited images as sequences of operations. As shown in

Table VIII, the total amount of space needed by our testing database

using the conventional retrieval approach was 13.64 MB.

This storage total was composed of the original images (5.77 MB),

the augmented images (7.16 MB), and the color features (0.71 MB).

In contrast, the total amount of space needed by the database when

using our proposed approach was 6.15 MB. This total was com-

posed of the original images (5.77 MB), the augmented images

stored as editing operations (0.24 MB), and the color features

(0.14 MB). Note that we used less space (less than half in this

experiment) storing the color features because we did not have to

permanently store the color histograms of the edited images.

The above space savings will become more pronounced as the

number of edited images in the database increases. To illustrate,

consider a second data set used in our performance evaluation that

contained a collection of 25,000 total images. The original images

were a collection of U.S. state flags obtained from the Web by sub-

mitting text-based queries to Google. The following phrases were

submitted to Google: ‘‘State Flag of x’’ and ‘‘x State Flag,’’ where x was one of the 50 states in the U.S., and the top 100 results were

saved as part of the collection. This resulted in 5000 total binary

images divided into 50 categories where the results for a state repre-

sented one category. The edited images were then formed in the

Figure 10. (a) Precision versus threshold (Data Set I), (b) recall versus threshold (Data Set I), (c) precision versus recall (Data Set I).

Table VIII. Comparison of permanent storage space (Data set I).

Approach             Original Images  Augmented Images  Color Features  Total Space
HIST (Conventional)  5.77 MB          7.16 MB           0.71 MB         13.64 MB
RULE (Proposed)      5.77 MB          0.24 MB           0.14 MB         6.15 MB

Table IX. Static parameters used in evaluation (Data set II).

Description Default Value

Total Number of Images in the Database 25,000

Number of Edited Images in the Database 20,000

Number of Operations per Edited Image 2

Number of Image Categories 50

Color Model Luv

Histogram Dimensions 32


same manner as described earlier with four edited images created

for each binary one. Queries were submitted for each of the first

10 query images in each category. As before, all of the images in

the same category as the query image were considered to be rele-

vant. The static and dynamic parameters of the performance evalua-

tions are listed in Tables IX and X, respectively.

Table XI shows the space savings gained by storing the edited

images in the second data set as sequences of operations. Figures

11a and 11b display the results of our tests on this second data set

measuring precision and recall, respectively. As before, the lighter

line ‘‘Hist’’ corresponds to the results for the traditional-Histogram

based approach, and the darker line ‘‘Rule’’ corresponds to the

results for the rule-based approach. The tests varied the threshold

for determining similarity using values 0.05, 0.15, . . . , 0.95. The results indicate that our proposed rule-based approach is able to

increase the recall of the system while obtaining the space savings

as described earlier. The average gain in recall was 8.0%. This gain,

however, was offset by an average loss in precision of 8.9%. This

decrease in the precision of the results causes the conventional and

proposed rule-based lines to coincide in the precision-recall graph,

so that graph is not pictured for this data set.

To further analyze the performance in terms of retrieval accu-

racy of our proposed approach, we tested its effectiveness against

the editing operations of Section IV.A individually. In these tests,

we compared our proposed approach and the conventional histo-

gram approach against subsets of our original databases. Each sub-

set consisted of the original images and the edited images created

using one specific operation, Combine, Modify, Mutate, or Merge.

Thus, these tests allowed us to evaluate the effectiveness of the

individual rules for each operation.

Figures 12a through 12d display the results of the above tests for

the international road sign data set giving the precision-recall

graphs for the database of edited images created with the Combine,

Modify, Mutate, and Merge operations, respectively. These figures

illustrate that each rule contributes to the improved retrieval accu-

racy illustrated in Figure 10c with the exception of the Combine

operation. This is not surprising since the rule acts as if the opera-

tion does not change an image. These results also indicate that the

rules for the Merge operation are the least effective when compared

with the conventional approach. This pattern held when the individ-

ual operation results for the state flag data set were examined as

well. This implies that the accuracy of the rule-based approach may

be improved by refining the rules for the Merge operation.

VII. REDUCING EXECUTION TIME

Systems that use conventional approaches such as histograms to

retrieve images by color are able to process submitted retrieval

queries without having to access each image in the underlying data-

base. This is frequently accomplished using an index or other types

of access method whose nodes represent regions of the multidimen-

sional data space of the feature signatures. The speedup in query

processing is obtained through the avoidance of having to access all

of the nodes of the index by quickly identifying sections of the mul-

tidimensional space that cannot contain feature signatures that sat-

isfy the given query.

Using a similar idea of reducing query processing time by elimi-

nating data accesses, this section summarizes a method presented in

(Brown et al., 2006) for speeding up the approach described in the

previous section. Specifically, this approach avoids accessing some

of the descriptions of the edited images during query processing. It

accomplishes this by identifying the rules that will only widen the

range specified by the minimum and maximum bounds, called bound-widening rules. The bound-widening rules presented earlier are the ones for the Modify, Combine, and Mutate operations, and the rule for the Merge operation when the target parameter is null.

To take advantage of bound-widening rules, the system needs to

store those edited images that only have the above operations in a

data structure. These edited images are clustered together based

upon the referenced base images that are listed in their respective

descriptions, meaning that two edited images are clustered together

if and only if they have the same referenced image. Each element

of the data structure is composed of a tuple <B_id, E_list> where B_id is the identifier of the referenced base image and E_list is the list of identifiers of edited images that were created from modifying B_id. The remaining edited images are stored in an alternative list.

Figure 11. (a) Precision versus threshold (Data Set II), (b) recall versus threshold (Data Set II).

Table X. Dynamic parameters used in evaluation (Data set II).

Description Values

Threshold 0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95

Table XI. Comparison of permanent storage space (Data set II).

Approach             Original Images  Augmented Images  Color Features  Total Space
HIST (Conventional)  10.05 MB         132.93 MB         5.36 MB         148.34 MB
RULE (Proposed)      10.05 MB         1.38 MB           1.41 MB         12.84 MB


The proposed data structure can be constructed as images are

inserted into the database. Each time an image stored in a traditional

binary format is inserted, the identifier for its corresponding histo-

gram should be added to the data structure. The list of identifiers

should be kept sorted to make it easier to search for a specific bi-

nary image. Once a binary image b is added to the MMDBMS, the

system should insert the descriptions of the edited versions of b into

the system as well. Each time an edited image is inserted into the

database, the system needs to determine whether it should be added

to the data structure or the alternative list by identifying if it con-

tains any operations whose rules are not bound-widening. An algo-

rithm for performing this insertion is displayed in Figure 13.
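The insertion logic can be sketched as follows, assuming each edited image's description exposes its base image and the names of its operations, and that each operation can be classified as bound-widening or not. The names, the operation labels, and the inclusion of Define (on the assumption that its rule leaves the bounds unchanged) are illustrative, not the article's code.

```python
# Operations whose rules only widen the [min, max] bounds (per Section VII);
# 'MergeNullTarget' labels a Merge whose target parameter is null.
BOUND_WIDENING = {'Define', 'Modify', 'Combine', 'Mutate', 'MergeNullTarget'}

clusters = {}     # B_id -> sorted E_list of edited-image identifiers
alternative = []  # edited images with at least one non-bound-widening operation

def insert_edited(image_id, base_id, operations):
    if all(op in BOUND_WIDENING for op in operations):
        clusters.setdefault(base_id, []).append(image_id)
        clusters[base_id].sort()   # keep E_list ordered for easier lookup
    else:
        alternative.append(image_id)

insert_edited('I7', 'I4', ['Define', 'Mutate'])
insert_edited('I5', 'I2', ['Define', 'MergeNullTarget'])
insert_edited('I9', 'I4', ['Define', 'MergeWithTarget'])  # hypothetical image
print(clusters)      # {'I4': ['I7'], 'I2': ['I5']}
print(alternative)   # ['I9']
```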

The above data structure can be used to process queries that

search for specific color feature values in an augmented MMDBMS

without having to ever instantiate the edited images. First, the algo-

rithm, displayed in Figure 14, computes the query parameters HB,

PCTmin, and PCTmax. Next, the algorithm sequentially accesses

each cluster in the data structure and checks if the histogram of the

corresponding binary image satisfies the given query. If so, then its

identifier along with all the identifiers of the edited images within

the cluster are added to the query’s resultant set. If the binary

image’s histogram does not satisfy the query, then the rules for

each operation of the edited images within the cluster will have to

be applied as usual. The final step in the algorithm is to apply the

rules for each operation of the edited images listed in the alternative

list.
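The query algorithm's use of the clusters can be sketched as below: if the base image's histogram already satisfies the range query, every edited image in its cluster is accepted wholesale (their rules only widen the bounds, so none can be eliminated); otherwise the per-image rules must be applied. The `apply_rules` stand-in and all names here are illustrative assumptions.

```python
def range_query(hb, pct_min, pct_max, base_hists, clusters, alternative, apply_rules):
    result = []
    for b_id, e_list in clusters.items():
        if pct_min <= base_hists[b_id][hb] <= pct_max:
            # Base image satisfies the query: accept the whole cluster
            # without evaluating any rules.
            result.append(b_id)
            result.extend(e_list)
        else:
            # Fall back to per-image rule evaluation for this cluster.
            result.extend(e for e in e_list if apply_rules(e, hb, pct_min, pct_max))
    # Images with non-bound-widening operations always need the rules.
    result.extend(e for e in alternative if apply_rules(e, hb, pct_min, pct_max))
    return result

hists = {'I1': [0, 0, 0, 0, 0, 0, 0.6, 0.4], 'I2': [0.5, 0.5, 0, 0, 0, 0, 0, 0]}
clus = {'I1': ['I7x'], 'I2': ['I5x']}   # hypothetical cluster contents
out = range_query(6, 0.4, 0.8, hists, clus, [], lambda *a: False)
print(sorted(out))   # ['I1', 'I7x'] -- I2's cluster required rule evaluation
```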

Our prototype described earlier was used to evaluate the per-

formance of the data structure. The data sets used in the test were

Figure 12. (a) Precision versus recall for combine operation database (Data Set I), (b) precision versus recall for modify operation database (Data Set I), (c) precision versus recall for mutate operation database (Data Set I), (d) precision versus recall for merge operation database (Data Set I).

Figure 13. Insertion algorithm for the proposed data structure.


obtained from various sites on the Internet. The first data set con-

tains a collection of images of flags around the world (HTTP,

2003a), and the second contains a collection of images of college

football helmets (HTTP, 2003b). These data sets were selected

because color-based features are extremely important in recogniz-

ing both flags and logos. The tests compared the average execution

time of the algorithms for processing range queries in augmented

databases with and without using the above data structure. The

results indicate that the average execution time is smaller with the

data structure than without it. Specifically, the system processes the

queries an average of 33.07% faster for the helmet data set and an

average of 22.08% faster for the flag data set. Both tests demon-

strated, however, that the reduction in time decreased as more

images were stored as editing operations. The reason is that the pro-

posed data structure improves execution time when images contain

only operations with bound-widening rules. Each edited image con-

taining a nonbound-widening operation requires the same process-

ing cost as the original algorithm. If many of the edited images fall

into this category, the added cost of the data structure actually hurts

the performance of the query processor.

VIII. SUMMARY AND FUTURE WORK

MultiMedia DataBase Management Systems (MMDBMSs) focus

on the storage and retrieval of images and other types of multimedia

data. A common type of query used to search images is one that

retrieves all images that are similar to a given query image, q. To allow for greater flexibility when matching database images to a

query image, a database may be augmented with additional images

created by editing the original set of images in the database. To

save space when storing the additional images, they can be stored

as sequences of editing operations instead of in a binary format.

This article presented an approach for searching images by color

in an augmented database. Our algorithm searches the images with-

out having to instantiate the edited images in the database, permitting their retrieval while maintaining the original space savings. In

addition, the database does not have to extract and, therefore, store

the visual properties or features from the edited images to search

them, which is another increase in savings. Our tests on the primary

data set of road signs indicated that our approach can be used to

obtain an improvement in retrieval accuracy while saving space.

The tests on the second, larger data set did not show a significant

increase in retrieval accuracy, although the space savings were

more pronounced.

This article focused on searching a collection of images utilizing

the visual property of color. As a next step in our work, it will be

necessary to identify rules for retrieving images using other proper-

ties besides color, such as texture and shape. Ultimately, rules must

be developed for identifying the effects of editing operations on

more complex features within a set of images from extremely nar-

row domains, such as identifying the effects of common disguises

on the features of a face.

Although this work focused on how to search augmented

images, the next major issue is to define how to augment the

images. This involves identifying the editing operations that should

be used to create the additional images, as well as identifying when

to apply them. Once such a procedure is defined, it should become

part of a process that is periodically performed automatically by the

MMDBMS, allowing it to optimize itself without relying on the assistance of a database administrator.

REFERENCES

Y.A. Aslandogan and C.T. Yu, Techniques and systems for image and video retrieval, IEEE Trans Knowl Data Eng 11 (1999), 56–63.

L. Brown, L. Gruenwald, and G. Speegle, Testing a set of image processing operations for completeness, Proc 2nd Int Conf Multimedia Inf Syst, Chicago, Illinois, April 1997, pp. 127–134.

L. Brown and L. Gruenwald, Tree-based indexes for image data, J Vis Commun Image Represent 9 (1998), 300–313.

L. Brown and L. Gruenwald, Performing color-based similarity searches in multimedia database management systems augmented with derived images, Proc 21st Br Natl Conf Databases, Lecture Notes Comput Sci, Vol. 3112, Springer, Edinburgh, Scotland, July 2004, pp. 178–189.

[Figure 14: Query processing algorithm utilizing the proposed data structure.]


L. Brown and L. Gruenwald, Speeding up color-based retrieval in multimedia database management systems that store images as sequences of editing operations, Proc First IEEE Int Workshop Multimedia Databases Data Manag (MDDM), Atlanta, Georgia, April 2006, CD-ROM.

S. Chatzis, A. Doulamis, and T. Varvarigou, A content-based image retrieval scheme allowing for robust automatic personalization, Proc 2007 Int Conf Image Video Retrieval (CIVR), Amsterdam, The Netherlands, July 2007, pp. 1–8.

R. Datta, J. Li, and J.Z. Wang, Content-based image retrieval: Approaches and trends of the new age, Proc 7th ACM SIGMM Int Workshop Multimedia Inf Retrieval, Singapore, November 2005, pp. 253–262.

R. Datta, D. Joshi, J. Li, and J.Z. Wang, Image retrieval: Ideas, influences, and trends of the new age, ACM Comput Surv 40 (2008).

S. Deb and Y. Zhang, An overview of content-based image retrieval techniques, Proc 18th Int Conf Adv Inf Netw Appl (AINA), 2004, pp. 59–64.

C. Djeraba, P. Fargeaud, and H. Briand, Retrieval and extraction by content of images in an object oriented database, Proc 2nd Conf Multimedia Inf Syst, April 1997, pp. 50–57.

P. Dukkipati and L. Brown, Improving the recognition of geometrical shapes in road signs by augmenting the database, Proc 3rd Int Conf Comput Sci Appl, San Diego, California, June 2005, pp. 8–13.

L. Dunckley, Multimedia databases: An object-relational approach, Addison-Wesley, London, 2003.

V. Gaede and O. Günther, Multidimensional access methods, ACM Comput Surv 30 (1998), 170–231.

[Geocities, 2005] URL http://www.geocities.com/jusjih/roadsigns.html#d, last accessed September 7, 2005.

A. Gupta and R. Jain, Visual information retrieval, Commun ACM 40 (1997), 71–79.

A. Guttman, R-trees: A dynamic index structure for spatial searching, Proc 1984 ACM SIGMOD Int Conf Manage Data, 1984, pp. 47–57.

[HTTP, 2003a] Images from URL http://www.flags.net, last accessed January 7, 2003.

[HTTP, 2003b] Images obtained from URL http://inside99.net/Helmet_Project/index.htm, last accessed January 7, 2003.

X. Jin and J.C. French, Improving image retrieval effectiveness via multiple queries, Proc 1st ACM Int Workshop Multimedia Databases, New Orleans, Louisiana, November 2003, pp. 86–93.

V. Oria, M.T. Ozsu, S. Lin, and P.J. Iglinski, Similarity queries in the DISIMA image DBMS, Proc 9th ACM Int Conf Multimedia, Ottawa, Canada, October 2001, pp. 475–478.

M. Ortega, Y. Rui, K. Chakrabarti, K. Porkaew, S. Mehrotra, and T.S. Huang, Supporting ranked Boolean similarity queries in MARS, IEEE Trans Knowl Data Eng 10 (1998), 905–925.

A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, Content-based image retrieval at the end of the early years, IEEE Trans Pattern Anal Machine Intelligence 22 (2000), 1349–1380.

G. Speegle, X. Wang, and L. Gruenwald, A meta-structure for supporting multimedia editing in object-oriented databases, Proc 16th Br Natl Conf Databases, July 1998, Lecture Notes Comput Sci, Vol. 1405, Springer, pp. 89–102.

G. Speegle, A.M. Gao, S. Hu, and L. Gruenwald, Extending databases to support image editing, Proc IEEE Int Conf Multimedia Expo, August 2000.

R.O. Stehling, M.A. Nascimento, and A.X. Falcão, A compact and efficient image retrieval approach based on border/interior pixel classification, Proc 11th Int Conf Inf Knowl Manag, November 2002.

M.J. Swain and D.H. Ballard, Color indexing, Int J Comput Vis 7 (1991), 11–32.

S.M.M. Tahaghoghi, J.A. Thom, and H.E. Williams, Are two pictures better than one?, Proc 12th Australasian Conf Database Technol, Queensland, Australia, January 2001, pp. 138–144.

D. Tao, X. Tang, and X. Li, Which components are important for interactive image searching?, IEEE Trans Circ Syst Video Technol 18 (2008), 3–11.

D. Tao, X. Li, and S.J. Maybank, Negative samples analysis in relevance feedback, IEEE Trans Knowl Data Eng 19 (2007), 568–580.

D. Tao, X. Tang, X. Li, and Y. Rui, Direct kernel biased discriminant analysis: A new content-based image retrieval relevance feedback algorithm, IEEE Trans Multimedia 8 (2006), 716–727.

N. Vasconcelos, From pixels to semantic spaces: Advances in content-based image retrieval, IEEE Comput 40 (2007), 20–26.

G.K. Wallace, The JPEG still picture compression standard, Commun ACM 34 (1991), 30–44.

W. Zhao, R. Chellappa, P.J. Phillips, and A. Rosenfeld, Face recognition: A literature survey, ACM Comput Surv 35 (2003), 399–458.
