Semantic Indexing Of Images Using A Web Ontology Language
Gowri Allampalli-Nagaraj
A thesis
submitted in partial fulfillment of the
requirements for the degree of
Master of Science
University of Washington
2007
Program Authorized to Offer Degree:
Institute of Technology - Tacoma
University of Washington
Graduate School
This is to certify that I have examined this copy of a master's thesis by
Gowri Allampalli-Nagaraj
and have found that it is complete and satisfactory in all respects,
and that any and all revisions required by the final
examining committee have been made.
Committee Members:
_____________________________________________________
Isabelle Bichindaritz
_____________________________________________________
George Mobus
Date:__________________________________
In presenting this thesis in partial fulfillment of the requirements for a master's degree at
the University of Washington, I agree that the Library shall make its copies freely available
for inspection. I further agree that extensive copying of this thesis is allowable only for
scholarly purposes, consistent with "fair use" as prescribed in the U.S. Copyright Law. Any
other reproduction for any purposes or by any means shall not be allowed without my
written permission.
Signature ________________________
Date ____________________________
University Of Washington
Abstract
Semantic Indexing Of Images Using A Web Ontology Language
Gowri Allampalli-Nagaraj
Chair of the Supervisory Committee:
Professor Isabelle Bichindaritz
Computing and Software Systems
This paper presents a system implemented to evaluate the retrieval efficiency of images
when they are semantically indexed using a combination of a Web Ontology Language and
the low level features of the image. Finding a similarity measure algorithm to retrieve
images based on the semantic metadata can be very challenging due to diverse image
content and inadequate domain specific ontologies describing the content. Existing
methods for indexing images are primarily based on text. While this method is widely used
due to its simplicity, it is not very efficient as it requires a domain expert and the textual
interpretations of image content vary from person to person. In our approach, we leverage
sophisticated image processing techniques to extract image content information and
associate it with existing domain ontologies developed by experts, thereby bridging the
gap between low level features and high level semantics. The work described in this paper
shows that a high retrieval accuracy rate is obtained when all the image descriptors are
combined with an ontology while building the semantic metadata for indexing images.
TABLE OF CONTENTS

List Of Figures
List Of Tables
Chapter 1: Introduction
Chapter 2: Motivation
Chapter 3: Problem Statement
Chapter 4: Background
  4.1 Ontology
  4.2 Image Databases
  4.3 Image Semantic Representation Languages
  4.4 Image Interpretation Software
  4.5 MPEG-7
  4.6 Distance Measure
Chapter 5: Datasets
  5.1 Visible Human Image Data Set
  5.2 University Of Washington Digital Anatomist Reference Ontology
Chapter 6: Preprocessing Tools
  6.1 MySQL
  6.2 Adobe Photoshop
  6.3 M-OntoMat-Annotizer
Chapter 7: Preprocessing Methods
  7.1 Selection Of Images From Visible Human
  7.2 Extraction Of UWDA Ontological Terms From The UMLS Database
  7.3 Creation Of UWDA Reference Ontology In DAML (DARPA Agent Markup Language)
  7.4 Loading Domain Ontology In M-OntoMat-Annotizer
  7.5 Conversion Of Image Format To JPEG
  7.6 Extracting Image Content And Linking To Domain Ontology
Chapter 8: Methods
  8.1 Training And Test Set
  8.2 Extracting Image Content From XML Files
  8.3 Calculating Distance Measure
  8.4 Calculating Combined Distance Measure
  8.5 Creating Distance Matrix
  8.6 Calculating Retrieval Accuracy Rate
  8.7 Improving Retrieval Accuracy Rates
Chapter 9: Results, Discussion And Analysis
  9.1 Initial Results
  9.2 Increased Training To Test Ratio
  9.3 Combined Descriptors
  9.4 Ensemble Classification
  9.5 Ten Fold Cross Validation
  9.6 Excluding Descriptors
  9.7 Empirical Weight Optimization
Chapter 10: Related Work
  10.1 Knowledge-Assisted Video Analysis And Object Detection
  10.2 Retrieval Of Multimedia Objects By Combining Semantic Information From Visual And Textual Descriptors
Chapter 11: Educational Statement
Chapter 12: Conclusion
Bibliography
Appendix A: Presentation Slides
Appendix B: Installation & User Manual
Appendix C: System Output
Appendix D: Image Descriptor Files
Appendix E: DAML Ontology File
Appendix F: Image Annotation Files
LIST OF FIGURES

Figure 1: Image of Abdomen from Visible Human Data Set.
Figure 2: Image of Thigh from Visible Human Data Set.
Figure 3: Screenshot of SQL query used to extract UWDA terms from UMLS.
Figure 4: Screenshot of the VDE tool in M-OntoMat-Annotizer showing the image feature extraction and annotation process.
LIST OF TABLES

Table 1: Accuracy rate for training set.
Table 2: Accuracy rate for test set.
Table 3: Accuracy rate for 75% images in training set and 25% images in test set.
Table 4: Accuracy rate for 50% images in training set and 50% images in test set.
Table 5: Combined accuracy rate for training set = 50% and test set = 50%.
Table 6: Combined accuracy rate for training set = 75% and test set = 25%.
Table 7: Accuracy rate for Ensemble Classification for 50% test and 50% training.
Table 8: Accuracy rate for Ensemble Classification for 75% training and 25% test.
Table 9: Accuracy rate for Ten Fold Cross Validation for 75% training and 25% test.
Table 10: Accuracy rate for Ten Fold Cross Validation for 50% training and 50% test.
Table 11: Accuracy rate excluding Contour Shape and Texture Browsing.
Table 12: Accuracy rate excluding Contour Shape descriptor.
Table 13: Accuracy rates for Empirical Weight Optimization.
ACKNOWLEDGEMENTS
Special thanks to Professor Isabelle Bichindaritz for all her assistance, guidance and
feedback during the course of this thesis. Her involvement was essential in the completion
of this thesis. I am also very thankful to Professor George Mobus for all his help and
valuable feedback. Thanks to the members of the committee for all their valuable input.
DEDICATION
To my husband, family and friends.
Chapter 1
INTRODUCTION
With the advances in medical technology over the years, we have accumulated a large
number of digital images, such as Magnetic Resonance Images (MRI), X-rays, anatomical
and pathological images, etc. Medical research has led to the development of valuable
knowledge bases consisting of formal domain ontologies, electronic patient records,
statistical medical data and results of various medical studies. Analysis of these images is
of utmost importance to study the different aspects of a problem. To analyze the
information stored in these images, the concerned doctors / scientists should be able to
access the image information easily and effectively [15]. Until recently, medical databases
mostly used textual information to store and retrieve images, making little use of the
rich image content present in the digital images. Handling large collections of images is a
growing challenge and there has been a lot of research in the area of image retrieval
systems to efficiently store and retrieve image collections.
The main goal for this thesis work is to aid the ongoing research in the area of
semantic indexing of images by evaluating the retrieval effectiveness of image collections
when image content information is combined with a formalized ontology to automatically
index images by content. Research in this area has raised questions as to whether or not it is
possible to develop a semantic indexing system with an efficient rate of image retrieval
[34]. The challenge involved is to develop a similarity matching algorithm for analyzing
the image content extracted and producing a match.
In the system presented here, we use medical anatomical images from the Visible
Human [24] data set and the Digital Anatomist [22] formal medical ontology developed for
the human anatomical terms. In our approach, we extract various image features like color,
shape, texture, etc., in the MPEG-7 [35] standard image feature description format and associate
them to the related anatomical terms thus building the semantic metadata. An important
feature of this system is the similarity matching algorithm developed to calculate the
matching between images thereby determining the retrieval accuracy rate for the system.
Various experiments based on different approaches for improving the accuracy rates were
performed to evaluate the retrieval efficiency of the system.
Chapter 2 describes the motivation behind this research. A detailed description of
the problem being solved and the background information required to understand this
research area are illustrated in Chapters 3 and 4. Chapters 5 and 6 illustrate the dataset and
preprocessing tools and resources used to process the data for further analysis. Chapter 7
describes the methods used in preprocessing the data. The architecture of the system and
the methodology used to solve the research issue is described in Chapter 8. The
experimental results, analysis and discussion are described in Chapter 9. Chapter 10
describes other related work in this area. Chapter 11 contains the Educational Statement.
Finally, Chapter 12 contains the conclusions derived from this implementation.
Chapter 2
MOTIVATION
With the number of digital images increasing rapidly, there is a great need to
manage digital image repositories. There is a need to store and retrieve images just like text
documents. Advances in the field of medical technologies have encouraged hospitals and
medical research centers to use various machines like X-ray, Magnetic Resonance
Imaging (MRI), CT scanners, etc. The use of such machines has resulted in the production of
valuable data in the form of digital images on different diseases, physical structures,
various organisms, etc. Analysis of these images is of utmost importance to study the
different aspects of a problem. To analyze the information stored in these images, the
concerned doctors / scientists should be able to access the image information easily and
effectively.
By indexing images based on semantic descriptors of low level features, doctors
can submit a query like 'find images with round calcifications' [3]. In such a query,
'calcification' is the textual description representing the semantics of the region of interest
and the shape 'round' is the textual annotation representing the low-level shape feature.
Executing such a query would avoid the retrieval of images with just a round shape or with
just the associated text 'calcification'. Another example query can be of this form: 'find all
the images having a blue sky'. Such a query would yield images whose semantic descriptor
is 'sky' and whose corresponding feature representation is the color 'blue'. This kind of
semantic annotation for images greatly improves the image classification and query
mechanisms. There is a growing need for research in the area of attaching semantics to low
level features to improve image retrieval and storage methods [25]. In our implementation,
images are indexed based on their semantic content, in order to address the growing need
for representing images with meaningful annotations and improve their retrieval efficiency.
Chapter 3
PROBLEM STATEMENT
The number of digital images is growing rapidly, driving the need for the
development of efficient tools to browse, retrieve and navigate through these large image
collections. As the information contained in images is complex, involving different colors,
shapes, textures and subjects, indexing methods designed for storing and retrieving textual
content will not work effectively. There is a need to explicitly capture a sufficient amount
of content information as well as application specific semantics by means of a variety of
metadata like multimedia indexes, attribute based annotations and intentional descriptions
to allow appropriate selection, browsing and retrieval of images from large collections [1].
Potentially, images have many types of attributes that could be used for storage and
retrieval. Presence of a particular combination of color, texture or shape features, presence
of a specific type of object, depiction of a particular event, presence of individuals /
locations, presence of specific emotions or metadata such as who created the image, where
and when, etc., are some image attributes that could be used for indexing images. Images
can be indexed based on a single attribute or a combination of attributes to improve the
efficiency of the image retrieval system.
Traditionally images are indexed based on textual annotations. Every image is
examined individually and a textual annotation describing the various characteristics of the
image is stored along with the image for the purposes of indexing. Given the large number
of images being produced, manual annotations tend to be very time consuming and prone
to error. Querying images with textual annotations is also not very effective, as images
have much more content in them, making it harder to describe an image with plain text
[15, 34].
Another approach to indexing images is to extract the content of images like color,
shape and texture and to store the feature representation of such content along with the
images for indexing purposes. With this approach of indexing, the images could only be
queried on their color, shape and texture but not on the actual subject matter. This approach
is not useful in querying images containing a particular subject matter and is said to have
many limitations when applied to image databases with a broad content [15].
The most recent approach to indexing images is to use the low level features of the
image as semantic descriptors of the image thus bridging the gap between the above two
approaches of indexing images. Digital images are composed of pixels arranged in an
infinite variety of patterns and, in general, it is difficult to predict the particular pattern that
would match the information need. Deciding on the aspects of the image that are
appropriate for indexing is very challenging. Interpretation of the semantic content is in
itself a challenging task as every interpretation can be different. Such an indexing would
greatly improve the querying capability of images as they can be queried for both low level
features as well as high level semantics.
The feature representation and the semantic descriptors of the image thus obtained
are mapped onto domain ontologies in order to classify the images for retrieval purposes.
Determining the association between semantic descriptors and ontologies is a difficult task.
Having a system which indexes images based on the semantic metadata would be very
beneficial to retrieve large collections of images more effectively and efficiently. With this
approach, one can leverage and combine the research efforts in the areas of domain
ontologies and image processing to build an effective image indexing system.
Chapter 4
BACKGROUND
4.1 Ontology
An ontology is a formal, explicit specification of a shared conceptualization. A
'conceptualization' refers to the abstract model of some phenomenon in the world,
identifying the relevant concepts of that phenomenon. 'Explicit' means that the types of
concepts are explicitly defined, and 'formal' refers to the fact that the ontology can be
expressed mathematically. As a result, it is machine readable and understandable. In image
retrieval applications, an ontology allows the description of semantics, establishes a common
and shared understanding of a domain and facilitates the implementation of a user oriented
vocabulary of terms and their relationship with objects in images [12].
4.2 Image Databases
Image data such as satellite images, medical images and digital pictures are
generated in large numbers every day. The World Wide Web itself is a huge repository of
images. As a result of the huge volume of image data, the use of multimedia databases is
very essential. Multimedia databases store and retrieve images, texts, videos, sounds and
data stored on any media. The analysis of such images is very useful for archival and
retrieval purposes in fields like medicine, environmental studies, military purposes, etc.
Multimedia databases support querying images based on their content. Images can be
queried based on the shape of the objects present in the image, colors of the object,
textures, volume, spatial relationships, motion, etc.
4.3 Image Semantic Representation Languages
Searching for images by content implies a first step of extracting features from the
images, to be able to search these features. Image mining deals with the extraction of this
semantic content from a large collection of images. Associating the semantic content with
the images is called annotation. Semantic content of images can be stored with images
using standard languages. In image annotation different objects of the image are attached
with textual and spatial information and stored in a database using a standard
representation. Images can be queried effectively by indexing the images along with their
semantic content. Metadata is the most important part of a data archive, and it provides
descriptive data about every stored object. Metadata includes indexing information that can
be described using a standardized framework to represent an image along with its semantic
content.
Resource Description Framework (RDF)[20] is used to represent information and to
exchange knowledge on the Web. Web Ontology Language (OWL) [20] is used to publish
and share sets of terms called ontologies, supporting advanced Web search, software agents
and knowledge management. The DARPA Agent Markup Language (DAML)[20] is an
extension of XML, which provides a rich set of constructs to create ontologies and to
markup information so that it is machine readable and understandable. DAML, RDF and
OWL are some of the languages that have been developed to represent the semantic content
of the images. MPEG-7[35] offers a comprehensive set of audiovisual description tools to
create metadata descriptions which will form the basis for applications enabling the needed
effective and efficient access to multimedia content.
4.4 Image Interpretation Software
Image analysis software provides the tools for segmentation, feature extraction and
statistical analysis of content in images. Segmentation deals with the identification of
objects of interest within an image. Feature extraction is the process of extracting
information from the images by measuring the number, size, shape or color of objects.
4.5 MPEG-7
MPEG-7[35] is an ISO/IEC standard developed by MPEG (Moving Picture Experts
Group). MPEG-7, formally named "Multimedia Content Description Interface", is a
standard for describing the multimedia content data that supports some degree of
interpretation of the information meaning, which can be passed onto, or accessed by, a
device or a computer code. MPEG-7 is not aimed at any one application in particular;
rather, the elements that MPEG-7 standardizes support as broad a range of applications as
possible.
MPEG-7 Visual Description Tools included in the standard consist of basic
structures and descriptors that cover the following basic visual features: Color, Texture,
Shape and Motion, Localization, and Face recognition. Each category consists of
elementary and sophisticated descriptors. In this implementation, we are only using the
Color, Texture and Shape descriptors. The following section provides a brief description of
the image descriptors used.
Dominant Color. This color descriptor is most suitable for representing local (object or
image region) features where a small number of colors are enough to characterize the
color information in the region of interest. Whole images are also applicable, for
example, flag images or color trademark images. Color quantization is used to extract a
small number of representing colors in each region/image. The percentage of each
quantized color in the region is calculated correspondingly. A spatial coherency on the
entire descriptor is also defined, and is used in similarity retrieval.
Scalable Color. The Scalable Color Descriptor is a Color Histogram in HSV Color
Space, which is encoded by a Haar transform. Its binary representation is scalable in
terms of bin numbers and bit representation accuracy over a broad range of data rates.
The Scalable Color Descriptor is useful for image-to-image matching and retrieval based
on color feature. Retrieval accuracy increases with the number of bits used in the
representation.
Color Layout. This descriptor effectively represents the spatial distribution of color of
visual signals in a very compact form. This compactness allows visual signal matching
functionality with high retrieval efficiency at very small computational costs. It provides
image-to-image matching as well as ultra high-speed sequence-to-sequence matching,
which requires a large number of repetitions of similarity calculations.
Color Structure. The Color Structure descriptor is a color feature descriptor that
captures both color content (similar to a color histogram) and information about the
structure of this content. Its main functionality is image-to-image matching and its
intended use is for still-image retrieval, where an image may consist of either a single
rectangular frame or arbitrarily shaped, possibly disconnected, regions. The extraction
method embeds color structure information into the descriptor by taking into account all
colors in a structuring element of 8x8 pixels that slides over the image, instead of
considering each pixel separately.
Texture Browsing. The Texture Browsing Descriptor is useful for representing
homogeneous texture for browsing type applications, and requires only 12 bits
(maximum). It provides a perceptual characterization of texture, similar to a human
characterization, in terms of regularity, coarseness and directionality. The computation of
this descriptor proceeds similarly as the Homogeneous Texture Descriptor. First, the
image is filtered with a bank of orientation and scale tuned filters (modeled using Gabor
functions); from the filtered outputs, two dominant texture orientations are identified.
Three bits are used to represent each of the dominant orientations. This is followed by
analyzing the filtered image projections along the dominant orientations to determine the
regularity (quantified to 2 bits) and coarseness (2 bits x 2). The second dominant
orientation and second scale feature are optional.
Edge Histogram. The edge histogram descriptor represents the spatial distribution of five
types of edges, namely four directional edges and one non-directional edge. Since edges
play an important role for image perception, it can retrieve images with similar semantic
meaning. Thus, it primarily targets image-to-image matching (by example or by sketch),
especially for natural images with non-uniform edge distribution. In this context, the image
retrieval performance can be significantly improved if the edge histogram descriptor is
combined with other Descriptors such as the color histogram descriptor.
Region Shape. The shape of an object may consist of either a single region or a set of
regions as well as some holes in the object. Since the Region Shape descriptor makes use of
all pixels constituting the shape within a frame, it can describe any shapes, i.e. not only a
simple shape with a single connected region but also a complex shape that consists of holes
in the object or several disjoint regions. The Region Shape descriptor not only can describe
such diverse shapes efficiently in a single descriptor, but is also robust to minor
deformation along the boundary of the object.
Contour Shape. The Contour Shape descriptor captures characteristic shape features of an
object or region based on its contour. It uses so-called Curvature Scale-Space
representation, which captures perceptually meaningful features of the shape.
4.6 Distance Measure
A distance is a numerical description of how far apart objects are at any given
moment in time. In physics or everyday discussion, distance may refer to a physical length,
a period of time, etc. In mathematics, the Euclidean distance or Euclidean metric is the
"ordinary" distance between two points that one would measure with a ruler, which can
be proven by repeated application of the Pythagorean Theorem.
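To make this concrete, the following minimal Java sketch computes the Euclidean distance between two feature vectors; the vectors shown are made up for illustration.

```java
public class EuclideanDistance {

    // d(p, q) = sqrt( sum_i (p[i] - q[i])^2 ), for equal-length vectors
    static double distance(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double diff = p[i] - q[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};   // hypothetical feature vector A
        double[] b = {4.0, 6.0, 3.0};   // hypothetical feature vector B
        System.out.println(distance(a, b));  // sqrt(9 + 16 + 0) = 5.0
    }
}
```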
Chapter 5
DATASETS
This chapter illustrates the image data set and the reference ontology used for this
implementation.
5.1 Visible Human Image Data Set
Images from the Visible Human [24] Data Set were used. The Visible Human
dataset contains anatomically detailed, three-dimensional representations of the normal
male and female human bodies. This digital image dataset contains complete human male
and female cadavers in MRI, CT and anatomical modes. The images were obtained via
academic licensing through the National Library of Medicine.
Figure 1: Image of Abdomen from Visible
Human Data Set.
Figure 2: Image of Thigh from Visible Human
Data Set.
5.2 University Of Washington Digital Anatomist Reference Ontology
The University of Washington Digital Anatomist (UWDA) [22] reference ontology
from the medical domain was chosen. UWDA is an abridged version of the Foundational
Model of Anatomy [27] ontology and is incorporated into the UMLS (Unified Medical
Language System) Metathesaurus. UWDA is a domain ontology that represents knowledge of
the human body. It contains classes and relationships that provide a symbolic model of the
structure of the human body. The ontology is computer based and was designed for
bioinformatics. It was developed by the Structural Informatics Group at the University of
Washington. UMLS was obtained through academic licensing in order to access the
UWDA Ontology.
Chapter 6
PREPROCESSING TOOLS
This chapter illustrates the tools used to process the image data set and create the
reference ontology.
6.1 MySQL
MySQL is an open source SQL Database Management System. MySQL was used
in this implementation to house the UMLS database containing the University of
Washington Digital Anatomist reference ontology. The ontological terms contained in the
UWDA ontology were retrieved using SQL queries from the MySQL instance of UMLS.
6.2 Adobe Photoshop
Adobe Photoshop is a graphics editor developed by Adobe Systems for image
manipulation. Images obtained from the visible human data set are in the raw format.
Adobe Photoshop was used to convert these images to JPEG format in order to access any
information contained in the images.
6.3 M-OntoMat-Annotizer
M-OntoMat-Annotizer (M stands for Multimedia) [26] is a user-friendly tool
developed within the aceMedia project. It is an extension of the CREAM (CREAting
Metadata for the Semantic Web) framework and its reference implementation, OntoMat-
Annotizer. The M-OntoMat-Annotizer Visual Descriptor Extraction (VDE) tool, developed
as a plug-in to OntoMat-Annotizer, presents a graphical interface for loading and
processing visual content (images and videos), extracting visual features and
associating them with domain ontology concepts. M-OntoMat-Annotizer is a Java-based
application and is distributed under the GNU Lesser General Public License [R1].
Chapter 7
PREPROCESSING METHODS
This chapter describes the various steps involved in preparing the image
data set and the reference ontology for this implementation, using the tools and data sets
described in the previous chapters.
7.1 Selection Of Images From Visible Human
A subset of 90 images from the Visible Human Data Set was chosen. This subset
consisted of both the male and female images spanning from head to toe of the human
body. 15 categories based on different regions of the human body such as Head, Abdomen,
Thigh, Abductor Magnus, Kidney, Eyes, Brain, Gluteal Muscles, Hamstring, Biceps,
Pectoralis Major, Colon, Pelvis, Thorax and Lungs were chosen. The categories were
chosen such that the images range in their content, i.e., they have different colors, shapes and
textures. 90 images were selected by picking 6 images from each of the 15 categories to act
as test and training images for our experiments.
7.2 Extraction Of UWDA Ontological Terms From The UMLS Database
A subset of 15 UWDA ontological terms corresponding to the 15 categories of
images described in the above section was extracted from the UMLS database for our
experiment. MySQL was used to install the UMLS database and SQL queries were
designed to extract the UWDA ontological terms from the UMLS database. The UMLS
database has various tables containing information such as concepts,
definitions, terms, etc. The following SQL query (Figure 3) was used to extract the UWDA
ontological terms and their definitions from the UMLS tables.
Figure 3: Screenshot of SQL query used to
Extract UWDA terms from UMLS.
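Since the query itself appears only as a screenshot, the sketch below shows how such an extraction might look through JDBC. The table and column names (MRCONSO for concept strings, MRDEF for definitions, with CUI, STR and SAB columns) follow the standard UMLS schema, but the connection details and the exact query used in this work are assumptions.

```java
import java.sql.*;

public class ExtractUwdaTerms {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection to a local MySQL instance of UMLS
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/umls", "user", "password");
        // SAB = 'UWDA' restricts rows to the Digital Anatomist source vocabulary
        String sql = "SELECT c.CUI, c.STR, d.DEF "
                   + "FROM MRCONSO c LEFT JOIN MRDEF d ON c.CUI = d.CUI "
                   + "WHERE c.SAB = 'UWDA' AND c.STR = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, "Abdomen");  // one of the 15 anatomical terms
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("CUI") + " | "
                            + rs.getString("STR") + " | " + rs.getString("DEF"));
                }
            }
        }
        conn.close();
    }
}
```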
7.3 Creation Of UWDA Reference Ontology In DAML (DARPA Agent Markup Language)
An empty ontology file was created in the DAML format. The 15 extracted
ontology terms and definitions were then added to the file in DAML format using the
DAML references and guidelines. This file containing the 15 UWDA ontological terms
was used in M-Ontomat Annotizer as the reference ontology file in DAML format.
7.4 Loading Domain Ontology In M-Ontomat Annotizer
The reference ontology DAML file is loaded into M-OntoMat-Annotizer using the
Ontology Explorer. The Ontology Explorer displays all the ontological terms contained in
the domain ontology file created above. Ontology Explorer provides a way to create
prototype instances for ontology terms to be linked to image feature content.
7.5 Conversion Of Image Format To JPEG
The subset of images chosen for the implementation from the Visible Human Data
Set is in the raw format. These images need to be converted to the bitmap or JPEG format
to access the image content information. The raw images were opened with Adobe
Photoshop after specifying the width, size and resolution as per guidelines set by National
Library of Medicine for this data set. These images were then saved as JPEG files through
Adobe Photoshop. The JPEG image files were then used for image segmentation and
feature extraction as described in the next section.
7.6 Extracting Image Content And Linking To Domain Ontology
The Visual Descriptor Extraction (VDE) tool in M-OntoMat-Annotizer was used
for loading the JPEG image files and selecting and extracting image content information.
An ontology term from one of the 15 terms was selected from the Ontology Explorer. A
new prototype instance of this ontology term was created in order to link the image content
features for the new image. An image was chosen from the same category as the ontology
term and uploaded to the VDE tool. An electronic pen or mouse was used to select the
region on the image corresponding to the ontological term for this image. For example, if
the chosen ontological term is Head, then an image from the category Head is chosen and
uploaded to the VDE tool. The Head region is then selected on the image for image content
extraction. VDE provides the functionality to extract the following content from the images
– Texture Browsing, Region Shape, Dominant Color, Scalable Color, Contour Shape, Edge
Histogram, Color Structure and Color Layout. Once the region of interest was selected on
the image, all the above image features were extracted using the VDE tool one by one. The
features are extracted into XML files and the association with the prototype instance is also
stored in the XML file for each image feature by the VDE tool. This procedure was
followed for all the 90 images in the data set. Each image will have 8 XML files
containing the image content, 1 RDF file containing the domain ontology and references to
the XML files and 1 DAML file containing the domain ontology terms. These files form
the core data set and were used to build the semantic retrieval system described in the next
section.
Figure 4: Screenshot of the VDE tool in M-OntoMat-Annotizer showing the image feature
extraction and annotation process.
Chapter 8
METHODS
This chapter describes the methodology used in the development of the system to
semantically index images and calculate the retrieval efficiency. The first step in the
implementation involved selecting the test and the training images. Once the test set and
the training set was obtained, every test image was compared to a training image by
extracting all the feature descriptors for each image and calculating the distance measure
for each feature type. Distance matrices were built containing the distance measures for test
versus training images for every feature. The test images were then classified using
similarity matching algorithms and the Ensemble classification approach. The accuracy
rate was determined for every approach. The following sections describe the methods and
approach used to develop the system.
8.1 Training And Test Set
The chosen subset of 90 images is divided into two sets: the training set and the test set.
Three approaches were followed for populating the test set and the training set. In the first
approach, 15 representative images, one from each category, were used as the training set
and the remaining images were placed in the test set. Many studies show
that with a larger training set, the accuracy rate results can be improved. Hence, in the
second approach, a training set that contained 50% of the images and a test set that
contained the remaining 50% of the images were used. Also, an algorithm was developed
to randomly populate both the test and the training images. In the third approach, the test
and the training images were randomly populated. However, the training set contained 75%
of images and the remaining 25 % of the images were in the test set.
For every image in the test set, the distance measure between the test image and
every other training image for a particular feature descriptor was calculated and stored in a
distance matrix for that feature descriptor. Also, for every training image, the distance
between the training image and every other training image for a particular feature
descriptor was calculated and stored in a distance matrix for that particular feature
descriptor.
8.2 Extracting Image Content From XML Files
Image content information for a particular image is extracted from the descriptor
XML files. Every visual descriptor file has a different format and hence different XPath
expression methods were developed for parsing each type of file. Image content from the
XML files is extracted at run time while calculating the similarity measure for each image.
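As an illustration of this parsing step, the following Java sketch reads coefficient values from one descriptor XML file using the standard XPath API. The file name and the element path //Descriptor/Coeff are hypothetical stand-ins, since each descriptor file produced by the VDE tool has its own layout and needs its own expression.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.*;
import org.w3c.dom.*;

public class DescriptorReader {
    public static void main(String[] args) throws Exception {
        // Parse one of the eight descriptor XML files produced per image
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse("image01_ScalableColor.xml");
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Hypothetical expression; each descriptor type needs its own
        NodeList coeffs = (NodeList) xpath.evaluate(
                "//Descriptor/Coeff", doc, XPathConstants.NODESET);
        double[] values = new double[coeffs.getLength()];
        for (int i = 0; i < values.length; i++) {
            values[i] = Double.parseDouble(coeffs.item(i).getTextContent());
        }
        System.out.println(values.length + " coefficients read");
    }
}
```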
8.3 Calculating Distance Measure
Distance measure calculations require the image content information for the two
images whose distance needs to be calculated. The image content information is extracted
for the two images as described in the above section. Every feature descriptor has a different
formula for calculating the distance as attributes of the descriptor are unique to a particular
descriptor. The distance measure is thus calculated using one of the following formulae
depending on which feature descriptor the distance measure is being calculated for.
Dominant Color. The distance between two dominant color descriptors, $F_1$ and $F_2$, is
calculated by the following distance function [28]:

$$D^2(F_1, F_2) = \sum_{i=1}^{N_1} p_{1i}^2 + \sum_{j=1}^{N_2} p_{2j}^2 - \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} 2\, a_{i,j}\, p_{1i}\, p_{2j} \qquad (1)$$

where $F$ is the dominant color and $p$ is the corresponding percentage value. $N$ is the total
number of dominant colors, and $a_{k,l}$ is the similarity coefficient between two colors. The
formula for $a_{k,l}$ is shown below:

$$a_{k,l} = \begin{cases} 1 - d_{k,l}/d_{max} & \text{if } d_{k,l} \le T_d \\ 0 & \text{if } d_{k,l} > T_d \end{cases} \qquad (2)$$

$d_{k,l}$, $T_d$ and $d_{max}$ are defined as follows:

$$T_d = \text{the maximum distance for two colors to be considered similar} \qquad (3)$$

$$d_{max} = \alpha T_d \qquad (4)$$

where $\alpha$ is the dominant color coefficient between 1 and 1.5 [28], and

$$d_{k,l} = \lVert c_k - c_l \rVert \qquad (5)$$

where $c_k$ and $c_l$ are colors.
Color Layout. The distance between two color layout descriptor values $[Y, Cb, Cr]$ and
$[Y', Cb', Cr']$ can be calculated as follows [28]:

$$D = \sqrt{\sum_i w_{yi} (Y_i - Y_i')^2} + \sqrt{\sum_i w_{bi} (Cb_i - Cb_i')^2} + \sqrt{\sum_i w_{ri} (Cr_i - Cr_i')^2} \qquad (6)$$

where $w_{yi}$, $w_{bi}$ and $w_{ri}$ denote weighting values for each coefficient. $Y$, $Cb$ and $Cr$ are the
color layout coefficients, also known as YCoeff, CbCoeff and CrCoeff.
Color Structure. The color structure distance measure between two descriptors is shown
in the following formula [28]:

$$D(A, B) = \sum_i \lvert h_A(i) - h_B(i) \rvert \qquad (7)$$

where $h_A$ and $h_B$ are the color structure descriptor vectors of images A and B and $i$ runs
over the total number of color structure coefficients.
Texture Browsing. The texture browsing descriptor captures the regularity ($v_1$), direction
($v_2$ and $v_4$) and scale ($v_3$ and $v_5$) of the texture pattern, collected in the vector
$TBC = [v_1, v_2, v_3, v_4, v_5]$. The distance between two sets of corresponding coefficients
of the TBC vectors is shown in the following formula [28]:

$$D(TBC_A, TBC_B) = \sum_{i=1}^{5} \lvert v_i^A - v_i^B \rvert \qquad (8)$$
Edge Histogram. The edge histogram distance $E$ is measured as the distance between two
sets of inverse quantized edge histograms A and B, as shown below [28]:

$$E(A, B) = \sum_i \lvert h_A(i) - h_B(i) \rvert \qquad (9)$$

where $h_A$ and $h_B$ are the Edge Histogram descriptors and $i$ runs over the total number of
Edge Histogram bins.
Contour Shape. The contour shape distance measure $M$ is computed as a weighted sum of
the distance measure between the global curve parameters and the distance measure
between the Curvature Scale Space (CSS) peaks associated with the object and the
semantic entity [28]:

$$M = w_E \,\lvert E_A - E_B \rvert + w_C \,\lvert C_A - C_B \rvert + w_{css}\, M_{css} \qquad (10)$$

where $E$ and $C$ are the absolute values of Eccentricity and Circularity. $M_{css}$ is the distance
measure value between the CSS matching peaks, with an additional penalty for each
unmatched peak equivalent to the missing peak height [28]:

$$M_{css} = \sum_i \sqrt{(x_{peak,i}^A - x_{peak,i}^B)^2 + (y_{peak,i}^A - y_{peak,i}^B)^2} \qquad (11)$$

where $x_{peak}$ and $y_{peak}$ are coordinate values on the x and y axes and $i$ runs over the
matched Contour Shape peaks.
Region Shape. The distance function between two region shape descriptors is obtained from
the following formula [28]:

$$D(A, B) = \sum_i \lvert p_i - q_i \rvert \qquad (12)$$

where $p$ and $q$ are region shape attributes and $i$ runs over the total number of attributes.
Scalable Color. The distance function between two scalable color descriptors is obtained
from the following formula [28]:

$$D(A, B) = \sum_i \lvert p_i - q_i \rvert \qquad (13)$$

where $p$ and $q$ are scalable color attributes and $i$ runs over the total number of attributes.
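Several of the measures above (Equations 7, 9, 12 and 13) reduce to an L1 distance, i.e., a sum of absolute differences between descriptor vectors. A minimal Java sketch with made-up histogram values:

```java
public class L1Distance {

    // Sum of absolute differences, as used for the Color Structure,
    // Edge Histogram, Region Shape and Scalable Color descriptors
    static double l1(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            sum += Math.abs(p[i] - q[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] a = {3, 0, 5, 2};  // hypothetical quantized bins, image A
        double[] b = {1, 1, 4, 2};  // hypothetical quantized bins, image B
        System.out.println(l1(a, b));  // |2| + |-1| + |1| + |0| = 4.0
    }
}
```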
8.4 Calculating Combined Distance Measure
Combined distance measure is calculated by summing the weighted distances
obtained for all the image descriptors as described in the above section. Different weights
were used while combining all the distances. The process of weight determination is
explained in the Results and Analysis section.
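A sketch of this weighted combination, assuming the eight per-descriptor distances for an image pair have already been computed; the equal weights shown correspond to the starting point of the weight optimization described in Section 8.7.

```java
import java.util.Arrays;

public class CombinedDistance {

    // Weighted sum of the per-descriptor distances for one image pair
    static double combine(double[] distances, double[] weights) {
        double total = 0.0;
        for (int i = 0; i < distances.length; i++) {
            total += weights[i] * distances[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical distances for the 8 descriptors
        double[] d = {0.2, 0.9, 0.4, 0.1, 0.5, 0.3, 0.6, 0.7};
        double[] w = new double[8];
        Arrays.fill(w, 1.0 / 8.0);  // equal initial weights
        System.out.println(combine(d, w));  // 3.7 / 8 = 0.4625
    }
}
```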
8.5 Creating Distance Matrix
A distance matrix is created for every feature descriptor. The elements of the matrix
are the distance measures calculated using the methods stated in the above section. The
dimensions of the matrix are Test X Training or Training X Training. In total, 17 distance
matrices are generated for the image retrieval calculations: 8 matrices, one for each feature
descriptor, of dimension Test X Training; 8 matrices, one for each feature descriptor, of
dimension Training X Training; and one final matrix whose elements contain the combined
distances of all the image descriptors. These distance matrices are used in the image
retrieval algorithms to calculate the retrieval accuracy rate as described in the following
sections.
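The construction of one such matrix can be sketched as follows, with the per-descriptor distance function left abstract; the feature vectors are assumed to come from the extraction step described earlier.

```java
import java.util.function.BiFunction;

public class DistanceMatrixBuilder {

    // Builds one Test X Training matrix for a single feature descriptor
    static double[][] build(double[][] testFeatures, double[][] trainFeatures,
                            BiFunction<double[], double[], Double> dist) {
        double[][] m = new double[testFeatures.length][trainFeatures.length];
        for (int i = 0; i < testFeatures.length; i++) {
            for (int j = 0; j < trainFeatures.length; j++) {
                m[i][j] = dist.apply(testFeatures[i], trainFeatures[j]);
            }
        }
        return m;
    }
}
```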
8.6 Calculating Retrieval Accuracy Rate
Two algorithms based on different classification approaches were developed to
calculate the retrieval accuracy rate. The first algorithm uses a simple classification
technique based on smallest distance matching. The second algorithm follows the
Ensemble Classification technique.
Smallest Distance Classification. The algorithm for smallest distance classification is
based on calculations over the distance matrices. To explain the algorithm, let us consider
any distance matrix, say Test X Training for Scalable Color. The first row of the matrix
containing the distance measure for the test image and all the training images is scanned
and the smallest distance measure is calculated using fundamental sorting techniques.
Once the smallest distance measure is obtained, the first row is scanned again to find all
the training images that have the same smallest distance measure. A count of all the
matches and the matching training image IDs are stored for calculating the retrieval
accuracy. The ontology term for the test image is retrieved using XPath expression parsing
of the ontology RDF files. The ontology terms are retrieved for all the matching training
images using XPath expressions as well. If any one of the training ontology terms matches
the test ontology term then the algorithm classifies the image to the right category for
identification. Each positive match is reflected in the accuracy count. The algorithm is
repeated for all the rows in the distance matrix. The overall accuracy is obtained once the
algorithm finishes with all the rows. The overall accuracy is a percentage obtained as a
ratio of the number of test images correctly classified over the total number of test images.
The following retrieval efficiencies were calculated for all the test and training images
using the smallest distance matching algorithm: the independent retrieval efficiency for
every feature descriptor, and the retrieval efficiency when combining all the feature
descriptors.
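A Java sketch of this smallest distance classification over one Test X Training matrix follows. It mirrors the steps above: find the row minimum, collect every tied training image, and count a positive match if any of them shares the test image's ontology term (the term lookup via XPath is assumed to have been done already).

```java
public class SmallestDistanceClassifier {

    // Returns the retrieval accuracy rate (%) for one distance matrix
    static double accuracy(double[][] matrix, String[] testTerms, String[] trainTerms) {
        int correct = 0;
        for (int row = 0; row < matrix.length; row++) {
            // Find the smallest distance in this row
            double min = Double.MAX_VALUE;
            for (double d : matrix[row]) min = Math.min(min, d);
            // Positive match: any training image at the minimum distance
            // shares the test image's ontology term
            boolean match = false;
            for (int col = 0; col < matrix[row].length; col++) {
                if (matrix[row][col] == min
                        && trainTerms[col].equals(testTerms[row])) {
                    match = true;
                }
            }
            if (match) correct++;
        }
        return 100.0 * correct / matrix.length;
    }
}
```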
Ensemble Classification. The Ensemble technique is a popular and efficient classification
technique. It derives from the concept of voting. Every image descriptor votes for a
particular category. The test image will be classified to the category that has the maximum
number of votes. An algorithm was developed to reflect this method. The algorithm uses
the distance matrices produced for all the image descriptors. The algorithm considers the
distance matrices belonging to a particular image descriptor. The first row of the matrix
containing the distance measure for the test image and all the training images is scanned
and the smallest distance measure is calculated using fundamental sorting techniques.
Once the smallest distance measure is obtained, the first row is scanned again to find all
the training images that have the same smallest distance measure. A count of all the
matches and the matching training image IDs are stored for calculating the retrieval
accuracy. The ontology term for the test image is retrieved using XPath expression parsing
of the ontology RDF files. The ontology terms are retrieved for all the matching training
images using XPath expressions as well. The training ontology terms retrieved is stored in
an array. These steps are repeated for the first row of every distance matrix belonging to all
the image descriptors. At the end of these steps, the array contains the matched training
image ontology terms. Each set of ontology terms added to this list by the feature
descriptors are analogous to votes added. The frequency of all the ontology terms is
counted and the term with the highest frequency/vote is the obtained. This term is then
compared to the ontology term for the test image and classified as positive if they match
and the count of positive matches is tracked for retrieval accuracy rate calculations. The
above procedure is repeated for all the rows in the distance matrices, i.e., for all the test
images. The overall retrieval accuracy rate is calculated as described earlier.
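The voting step can be sketched as follows, assuming the matrix rows for one test image (one row per descriptor) and the training ontology terms are already available.

```java
import java.util.*;

public class EnsembleClassifier {

    // Each descriptor's nearest training images vote for their ontology
    // terms; the term with the most votes is the predicted class
    static String predict(List<double[]> rowsPerDescriptor, String[] trainTerms) {
        Map<String, Integer> votes = new HashMap<>();
        for (double[] row : rowsPerDescriptor) {
            double min = Double.MAX_VALUE;
            for (double d : row) min = Math.min(min, d);
            for (int col = 0; col < row.length; col++) {
                if (row[col] == min) {
                    votes.merge(trainTerms[col], 1, Integer::sum);
                }
            }
        }
        return Collections.max(votes.entrySet(),
                Map.Entry.comparingByValue()).getKey();
    }
}
```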
8.7 Improving Retrieval Accuracy Rates
Ten Fold Cross Validation and Empirical Weight Optimization techniques were
used to improve the retrieval accuracy rates produced by the system.
Ten Fold Cross Validation. In the Ten Fold Cross Validation approach, all the
calculations performed in the system are repeated 10 times and the results are averaged at
the end of the last iteration. This approach is aimed at averaging out the errors caused by
random operations such as populating the test set and the training set. The whole program
runs in a loop of 10 iterations. In each iteration, the training and the test sets are
populated and the distance matrices and accuracy rates are calculated. At the end of each
iteration the results are summed, and after the last iteration the results are averaged.
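A sketch of this loop, with the per-iteration pipeline (random split, distance matrices, accuracy) reduced to a placeholder:

```java
public class TenFoldValidation {

    public static void main(String[] args) {
        double sum = 0.0;
        for (int iteration = 0; iteration < 10; iteration++) {
            // Each iteration: repopulate the training/test split at random,
            // rebuild the distance matrices and compute the accuracy rate
            sum += runOneIteration();
        }
        System.out.println("Average accuracy: " + (sum / 10.0) + "%");
    }

    // Placeholder standing in for the full pipeline of Chapter 8
    static double runOneIteration() {
        return 0.0;
    }
}
```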
Empirical Weight Optimization. The empirical weight optimization technique was used to
determine the weights while calculating the weighted combined distance measure. The
combined distance measure is calculated as a weighted sum of all the descriptors. To start
with, all the descriptors are assigned equal weights. One of the descriptors is chosen and its
corresponding weight is varied from +1 to -1 in increments of +/- 0.1 each time. For every
weight measure, the difference between the maximum weight and the weight chosen for the
descriptor is calculated and the difference is distributed as among all the other descriptors
equally. Combined accuracy rate is calculated for every variation. This technique is then
applied to all the other descriptors.
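A sketch of this search for a single descriptor, assuming the maximum weight is 1.0 and with the combined accuracy computation reduced to a placeholder:

```java
public class WeightSearch {

    // Varies one descriptor's weight from +1 to -1 in steps of 0.1,
    // redistributing the difference from the maximum weight equally
    // among the other descriptors, and keeps the best weights found
    static double[] optimize(int target, int n) {
        double bestAccuracy = -1.0;
        double[] bestWeights = null;
        for (int step = 10; step >= -10; step--) {
            double w = step / 10.0;
            double[] weights = new double[n];
            double share = (1.0 - w) / (n - 1);  // spread the difference
            for (int i = 0; i < n; i++) {
                weights[i] = (i == target) ? w : share;
            }
            double acc = combinedAccuracy(weights);
            if (acc > bestAccuracy) {
                bestAccuracy = acc;
                bestWeights = weights;
            }
        }
        return bestWeights;
    }

    // Placeholder: combined distance matrices + classification for these weights
    static double combinedAccuracy(double[] weights) {
        return 0.0;
    }
}
```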
Chapter 9
RESULTS, DISCUSSION AND ANALYSIS
This chapter illustrates the results obtained from the implementation
approach described above. An analysis of the results and the various methods used to
improve them are described in detail in the following sections.
9.1 Initial Results
The initial experiments for the implementation used 15 images in the training set and 75
images in the test set. The tables below show the results for test vs. training and training vs.
training.
Table 1: Accuracy rate for training set.
Training Set = 15 Images, Training Set = 15 Images
Image Descriptor Accuracy Rate
Color Layout 100%
Color Structure 100%
Contour Shape 100%
Dominant Color 100%
Edge Histogram 100%
Region Shape 100%
Scalable Color 100%
Texture Browsing 100%
From the training vs. training results table we can see that the retrieval accuracy rate for
all training images is 100%. The retrieval rate for training images is calculated to verify
that the algorithm developed is able to correctly classify images in the training set.
Table 2: Accuracy rate for test set.
Training Set = 15 Images, Test Set = 75 Images
Image Descriptor Accuracy Rate
Color Layout 42.67%
Color Structure 50.67%
Contour Shape 14.67%
Dominant Color 37.33%
Edge Histogram 41.33%
Region Shape 68%
Scalable Color 53.33%
Texture Browsing 29.33%
From the test vs. training results table we see that the highest accuracy rate is obtained
by indexing images only on the Region Shape descriptor. Scalable Color and Color
Structure provide the second best retrieval rates. These accuracy rates are definitely better
than the accuracy rate of a random classifier, 6.67%, obtained as the probability of a test
image being assigned at random to one of the 15 categories.
9.2 Increased Training To Test Ratio
Data mining best practices indicate that the training to test ratio should be high for
improved retrieval accuracy rates. In our experiments we selected two ratios for the training
and test sets. The first ratio was 2/3 training and 1/3 test. The second ratio was 1/2 training
and 1/2 test. The training and the test sets were populated randomly, following another data
mining best practice guideline. The following tables show the results obtained with the
two ratios of training and test sets.
Table 3: Accuracy rate for 75% images in
training set and 25% images in test set.
Training Set = 75%, Test Set = 25%
Image Descriptor Accuracy Rate
Color Layout 48%
Color Structure 87%
Contour Shape 26.07%
Dominant Color 47.83%
Edge Histogram 65.22%
Region Shape 65.22%
Scalable Color 78.26%
Texture Browsing 52.17%
With the training to test ratio being 2/3 and 1/3, the best retrieval accuracy rate is
obtained for the Color Structure descriptor. Scalable Color also gives good results.
Table 4: Accuracy rate for 50% images in
training set and 50% images in test set.
Training Set = 50%, Test Set = 50%
Image Descriptor Accuracy Rate
Color Layout 44.44%
Color Structure 57.77%
Contour Shape 17.77%
Dominant Color 37.77%
Edge Histogram 44.44%
Region Shape 68.88%
Scalable Color 64.44%
Texture Browsing 62.22%
With the training to test ratio being 1/2 and 1/2, the best retrieval accuracy rate is
obtained for the Region Shape descriptor, followed by Scalable Color.
From the results, we can see that the retrieval accuracy rates have significantly
improved with a higher number of images in the training set. By increasing the number of
images in the training set, the maximum value for the retrieval accuracy rate for a
descriptor has increased from 68% to 87%.
9.3 Combined Descriptors
To further improve the accuracy rate, we combined the distance measures for all the
descriptors and calculated the accuracy rate on the combined value. The above-mentioned
ratios for the test and training sets were used, and the test and training sets were again
randomly populated.
Table 5: Combined accuracy rate for training set = 50% and test set = 50%.
Training Set = 50%, Test Set = 50%
Image Descriptors Accuracy Rate
Combined Descriptors (Equal Weights) 73.33%
With the test to training ratio being 1/2 and 1/2, the combined accuracy rate is shown
above.
Table 6: Combined accuracy rate for training set = 75% and test set = 25%.
Training Set = 75%, Test Set = 25%
Image Descriptors Accuracy Rate
Combined Descriptors (Equal Weights) 86.95%
With the training to test ratio being 2/3 and 1/3, the combined accuracy rate is shown
above.
The retrieval accuracy rate obtained by combining all the descriptors is almost
equivalent to the highest retrieval accuracy rate obtained for a single descriptor in the
previous experiment (Color Structure). Because the combined retrieval accuracy rates were
not significantly higher than those obtained by single descriptors, we experimented with
further methods to improve the combined accuracy rates, as described in the following
sections.
9.4 Ensemble Classification
The next approach used to improve the retrieval accuracy rate was Ensemble
Classification.
Table 7: Accuracy rate for Ensemble
Classification for 50% test and 50% training.
Training Set = 50%, Test Set = 50%
Image Descriptors Accuracy Rate
Ensemble 37.77%
With the training to test ratio being 1/2 and 1/2, the Ensemble accuracy rate is shown
above.
Table 8: Accuracy rate for Ensemble Classification for 75% training and 25% test.
Training Set = 75%, Test Set = 25%
Image Descriptors Accuracy Rate
Ensemble 43.47%
With the training to test ratio being 2/3 and 1/3, the Ensemble accuracy rate is shown
above.
Good results were not obtained using the Ensemble classification approach because the
votes were distorted for certain descriptors. Owing to the nature of the image descriptors,
we found that more than one training image often had the same smallest distance measure
for a particular test image: the images from the Visible Human Data Set are very similar
in their dominant colors and textures. When many training images share the same smallest
distance measure, the test image is voted into different training classes, which skews
the voting calculations for the Ensemble classification method.
For example, for test image 1, training images 3, 8, and 9 had the same smallest
distance measures. However, training images 3 and 9 voted for the test image to be in the
"Head" class, whereas training image 8 voted for "Eyes". While predicting the class of the
test images using the Ensemble classification technique, we considered all the votes for a
particular test image across all the descriptor distance matrices, calculated the vote
with the maximum occurrence, and assigned the test image to the class with the maximum
votes. In the above example, the test image is assigned to the "Head" class; in reality,
it belongs to the "Eyes" class. Hence, the retrieval accuracy rate is reduced by the
incorrect classification.
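For illustration, the voting step can be sketched as below; as the example shows, ties on
the smallest distance inject multiple, possibly conflicting, votes. The names and the tie
handling are assumptions, not the thesis code.

using System.Collections.Generic;
using System.Linq;

class EnsembleVote
{
    // Each descriptor contributes the class votes of its nearest training
    // images (several, when distances tie); the test image is assigned to
    // the class with the most votes overall.
    static string MajorityClass(IEnumerable<string> votes) =>
        votes.GroupBy(v => v)
             .OrderByDescending(g => g.Count())
             .First().Key;
}

With the votes from the example above ("Head", "Head", "Eyes"), this sketch returns
"Head", reproducing the misclassification described.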
9.5 Ten Fold Cross Validation
We used the Ten Fold Cross Validation method to further improve the accuracy rates
for single descriptors and combined descriptors. Ten Fold Cross Validation averages out
any variation that might occur due to the random selection of training and test
images.
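For illustration, the procedure can be sketched as follows. Assigning folds by index
modulo ten is an assumption (in practice the images would be shuffled first), and the
accuracy function stands for any of the classifiers above.

using System;
using System.Collections.Generic;
using System.Linq;

class TenFoldCrossValidation
{
    // Partition the images into 10 folds; each fold serves once as the
    // test set while the other 9 form the training set, and the 10
    // accuracy rates are averaged.
    static double Run(List<string> images,
                      Func<List<string>, List<string>, double> accuracy)
    {
        var folds = images.Select((image, i) => (image, fold: i % 10))
                          .GroupBy(x => x.fold)
                          .Select(g => g.Select(x => x.image).ToList())
                          .ToList();
        return folds.Select((test, k) =>
                   accuracy(folds.Where((_, j) => j != k)
                                 .SelectMany(f => f).ToList(), test))
                    .Average();
    }
}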
From the table below, for the training to test ratio of 2/3 and 1/3, the best results
are obtained when all the descriptors are combined. The Ensemble accuracy rate also
improves, but it remains below the combined accuracy rate. Scalable Color, Edge Histogram
and Color Structure provide good results as well.
Table 9: Accuracy rate for Ten Fold Cross Validation for 75% training and 25% test.
Training Set = 75%, Test Set = 25%
Ten Fold Cross Validation
Image Descriptor Accuracy Rate
Color Layout 55.65%
Color Structure 71.30%
Contour Shape 30.86%
Dominant Color 52.60%
Edge Histogram 71.73%
Region Shape 66.95%
Scalable Color 75.65%
Texture Browsing 65.65%
Combined Descriptors (Equal Weights) 84.34%
Ensemble 64.78%
From the table below, for the training to test ratio of 1/2 and 1/2, the best results are
again obtained when all the descriptors are combined. The Ensemble accuracy rate also
improves, but it remains below the combined accuracy rate. The Scalable Color and Region
Shape descriptors provide good results as well.
Table 10: Accuracy rate for Ten Fold Cross Validation for 50% training and 50% test.
Training Set = 50%, Test Set = 50%
Ten Fold Cross Validation
Image Descriptor Accuracy Rate
Color Layout 52%
Color Structure 64.22%
Contour Shape 26.22%
Dominant Color 46.44%
Edge Histogram 63.55%
Region Shape 68.44%
Scalable Color 70.22%
Texture Browsing 55.11%
Combined Descriptors (Equal Weights) 81.33%
Ensemble 62.22%
From all the above experiments, we observed that Scalable Color, Color Structure,
Region Shape and Edge Histogram provided consistently good results. Contour Shape
consistently has the lowest accuracy rates, followed by Texture Browsing and Dominant
Color. Color Layout lies in between, with an average accuracy rate of around 50% across
all experiments. The next section describes experiments done by excluding descriptors
with low individual retrieval accuracy rates when calculating the overall combined
accuracy rate.
9.6 Excluding Descriptors
The Texture Browsing and Contour Shape descriptors were excluded from the
combined accuracy rate calculations. The results obtained from this exclusion are shown
below: there is an increase in the combined accuracy rate (87.39%) compared to the
previous experiment results (~84%).
Although Contour Shape has consistently given low accuracy rates, Texture
Browsing did give average results in some of the experiments described above. Hence,
removing both the Texture Browsing and the Contour Shape descriptors from the
combined descriptor calculations did not significantly improve the accuracy rates.
Table 11: Accuracy rate excluding Contour Shape and Texture Browsing.
Combined Descriptors (Equal Weights, No Contour Shape and Texture Browsing)
Training/Test Split Accuracy Rate
Training Set = 50%, Test Set = 50% 84.88%
Training Set = 75%, Test Set = 25% 87.39%
The combined accuracy rate improved significantly when the Contour Shape
descriptor alone was excluded from the combined accuracy rate calculations. A high
accuracy rate of 90.43% was obtained with the training and test ratio of 2/3 and 1/3.
Table 12: Accuracy rate excluding the Contour Shape descriptor.
Combined Descriptors (Equal Weights, No Contour Shape)
Training/Test Split Accuracy Rate
Training Set = 50%, Test Set = 50% 84.44%
Training Set = 75%, Test Set = 25% 90.43%
Accuracy rates obtained for Contour Shape were consistently lower across all
experiments, and hence excluding it from the combined descriptor calculations
significantly improved the retrieval accuracy rates.
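In terms of the combined-distance sketch shown earlier, such an exclusion simply filters
the named descriptors out of the equal-weight sum. A minimal sketch follows; the
descriptor names are taken from the tables above, and dropping "Texture Browsing" from
the set reproduces the Contour-Shape-only variant.

using System.Collections.Generic;
using System.Linq;

class CombinedDistanceExcluding
{
    // Descriptors left out of the combined calculation; remove
    // "Texture Browsing" from this set for the Contour-Shape-only case.
    static readonly HashSet<string> Excluded =
        new HashSet<string> { "Contour Shape", "Texture Browsing" };

    // Same equal-weight combination as before, with excluded descriptors
    // contributing nothing to the combined distance.
    static double Distance(IDictionary<string, double> normalizedDistances) =>
        normalizedDistances.Where(d => !Excluded.Contains(d.Key))
                           .Sum(d => d.Value);
}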
9.7 Empirical Weight Optimization
By using the Empirical Weight Optimization Technique, we were able to further
improve the retrieval accuracy rates by combining weighted descriptors and not excluding
any descriptors from the semantic metadata. The highest retrieval accuracy rate obtained
from this approach is 93.48% with weights for the descriptors as shown in the table. These
results also show that by maximizing the weight for Region Shape, the accuracy rates
significantly improve when combining all the descriptors.
Table 13: Accuracy rates for Empirical Weight
Optimization.
Training Set = 75%, Test Set = 25%
Image Descriptor Weights Accuracy Rate
Region Shape = 1.9
Other descriptors = 0.0148
93.48%
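For illustration, this empirical search can be sketched as a one-dimensional sweep over
the boosted descriptor's weight, with the remainder of a fixed total shared equally among
the other descriptors. The total of 2.0, the step size, and all names are assumptions:
with a boost of 1.9, the remaining seven descriptors each receive roughly 0.014, close to
the 0.0148 in the table.

using System;
using System.Collections.Generic;
using System.Linq;

class EmpiricalWeightSearch
{
    // Try a range of weights for one descriptor, sharing the remainder of
    // a fixed total weight equally among the others, and keep the weight
    // vector that gives the best accuracy.
    static (double BoostedWeight, double Accuracy) Search(
        IList<string> descriptors, string boosted,
        Func<IDictionary<string, double>, double> accuracy)
    {
        var best = (BoostedWeight: 0.0, Accuracy: 0.0);
        const double total = 2.0;
        for (double w = 0.0; w <= total; w += 0.05)
        {
            double rest = (total - w) / (descriptors.Count - 1);
            var weights = descriptors.ToDictionary(
                d => d, d => d == boosted ? w : rest);
            double acc = accuracy(weights);
            if (acc > best.Accuracy) best = (w, acc);
        }
        return best;
    }
}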
Chapter 10
RELATED WORK
10.1 Knowledge – Assisted Video Analysis And Object Detection
Gabriel Tsechpenakis, Giorgos Akrivas, Giorgos Andreou, Giorgos Stamou and
Stefanos Kollias presented a method for object recognition in video sequences [28]. The
goal of the system is to extract semantics automatically by detecting and tracking moving
objects in video sequences and then associating the moving objects with semantic entities
using the low-level features of each entity. The proposed algorithm consists of two main
steps: the detection and localization of "regions-of-interest" in a sequence, and the
estimation of the main mobile object contours. The visual descriptors used to model the
visual content associated with semantic entities are categorized according to the MPEG-7
framework. The extracted visual descriptors were mapped to conceptual terms to build the
semantic indexing metadata, and similarity matching algorithms were used to match the
extracted moving regions. A simulation of this system was able to identify moving regions
based on the extracted semantics.
A similar approach was used in our implementation, which focuses on the content of
images rather than videos. The main difference is that in our implementation the
semantics are extracted manually by selecting the region of interest, and a formalized
domain ontology is used to map the extracted content to meaningful terms. Also, the above
system used similar videos to build the training and test sets, whereas our
implementation used images diverse in their content.
10.2 Retrieval of Multimedia Objects By Combining Semantic Information From
Visual And Textual Descriptors
Mats Sjöberg, Jorma Laaksonen, Matti Pöllä and Timo Honkela proposed a
method of content-based multimedia retrieval of objects with visual, aural and textual
properties [33]. In their method, training examples of objects belonging to a specific
semantic class are associated with their low-level visual descriptors (such as MPEG-7)
and textual features such as frequencies of significant keywords extracted from audio
tracks. A fuzzy mapping from a semantic class in the training set to a class of similar
objects in the test set was created by using Self-Organizing Maps (SOMs) trained from the
visual and textual descriptors. Query by example (QBE) is the main operating principle of
the system: the user provides the system with a set of example objects of the kind he or
she is looking for, taken from the existing database. The various experiments performed
on the proposed system showed a promising increase in retrieval performance, and the
results also showed that retrieval performance increased with the use of textual
features.
The approach described above is less similar to ours. We classified images using a
similarity matching algorithm based on smallest distances and on Ensemble classification,
which differs from the SOM approach used in the implementation described above. Also, in
our approach all the training images in a particular class share the same textual
descriptor, whereas this implementation uses a range of words and their frequencies.
Chapter 11
EDUCATIONAL STATEMENT
This research work benefited from the knowledge obtained from many classes
taken as part of the Graduate curriculum at the Institute of Technology, UW Tacoma.
Strong foundations from the TCSS 543 - Advanced Algorithms class helped with the
mathematical aspects of this research; knowledge from this class was also useful in
selecting and implementing the right data structures for this implementation. Image
processing foundations from the TCSS 451 - Digital Media class were very useful in
extracting image features, which was a significant part of this implementation. Database
design basics learned in the TCSS 545 class were extremely helpful during the data
preprocessing phase. The basics of scientific research obtained from the TCSS 598 -
Master's Seminar were extremely helpful while researching this area, and the exposure to
formal technical writing in that class was also very helpful while writing this paper.
Concepts of Bioinformatics such as data mining and domain ontologies helped me greatly in
understanding concepts related to the medical domain, and the TCSS 588 - Bioinformatics
class was very useful in determining areas for future research that would benefit the
medical domain. Apart from these classes, programming knowledge gained from many other
classes was very useful in the design and implementation stages.
Exposure to image processing tools, similarity matching algorithms and related
techniques proved very valuable, as they can be applied to solve indexing problems in
various domains. Many indexing algorithms were studied during the course of this
research, and this knowledge will be very useful for building information retrieval
applications in the future. This research was also very beneficial for learning the
languages of the Semantic Web, such as RDF and DAML. Working on this thesis has given me
the opportunity to research and learn about various areas of computer science such as
imaging, multimedia databases and knowledge representation languages. I thoroughly
enjoyed the learning experience and the exposure to various technologies during the
course of this research.
Chapter 12
CONCLUSION
The implementation described in this paper has shown that a high retrieval accuracy
rate is obtained by semantically indexing images using a web ontology language and the
visual descriptors of the image. The biggest challenge in this implementation was to
develop a similarity matching algorithm to retrieve matching images by combining all the
visual descriptors and the ontology terms. A retrieval accuracy rate of 93.48% was
obtained using the algorithm developed. The approach proposed in this paper will benefit
the medical community to a large extent, as large collections of medical images can be
indexed and retrieved semantically. Future improvements to this implementation include
automating the image segmentation and feature extraction phase and using learning
techniques to improve the similarity matching algorithm developed.
BIBLIOGRAPHY
[1] Boll, S., Klas, W., Sheth, A. (1998). Overview on Using Metadata to Manage
Multimedia Data. In Multimedia Data Management—Using Metadata to Integrate and
Apply Digital Media (1-24).
[2] Chavez-Aragon, A., Starostenko, O. (2004). Ontological Shape-Description, A New
Method for Visual Information Retrieval. Proceedings of the 14th IEEE International
Conference on Electronics, Communications and Computers. Retrieved Nov 27, 2004,
from http://ieee.org
[3] Comaniciu, D., Foran, D., Meer, P. (1998). Shape-Based Image Indexing and
Retrieval for Diagnostic Pathology. Proceedings of the 14th IEEE International
Conference on Pattern Recognition, 1 (902-904). Retrieved Nov 27, 2004, from
http://ieee.org
[4] Fayyad, U.M. (1996). Automating the Analysis and Cataloging of Sky Surveys. In
Advances in Knowledge Discovery and Data Mining (471-493)
[5] Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., et al. (1995).
Query by Image and Video Content. IEEE Computer, 28(9), (23-31). Retrieved Nov 1,
2004, from http://ieee.org
[6] GIS Images. Retrieved Nov 10, 2004, from http://earth.jsc.nasa.gov/sseop/efs/query.pl
[7] Golbeck, J., Alford, A., Hendler, J. Organization and Structure of Information using
Semantic Web Technologies. Maryland Information and Network Dynamics
Laboratory, University of Maryland. Retrieved Nov 1, 2004, from
http://www.mindswap.org/papers/Handbook.pdf
[8] Hand, D., Mannila, H., Smyth, P. (2001). Retrieval by Content. In Principles of Data
Mining (449-484). England: The MIT Press.
[9] Hu, B., Dasmahapatra, S., Lewis, P., Shadbolt, N. (2003). Ontology Based Medical
Image Annotation with Description Logics. Proceedings of the 15th IEEE International
Conference on Tools with Artificial Intelligence. Retrieved Nov 1, 2004, from
http://ieee.org
[10] ImageJ. Retrieved Nov 11, 2004, from http://rsb.info.nih.gov/ij/docs/intro.html
[11] Jorgensen, C. Image Indexing - An Analysis of Selected Classification Systems in
Relation to Image Attributes Named by Naïve Users. Retrieved Nov 8, 2004, from
http://digitalarchive.oclc.org/da/ViewObject.jsp?fileid=0000002655:000000059275&reqid=8078
[12] Knublauch, H., Olivier, D., Musen, M. Weaving the Biomedical Semantic Web with the
Protégé OWL Plug-in. Stanford Medical Informatics, Stanford University: Stanford.
Retrieved Nov 18, 2004, from http://protege.stanford.edu
[13] Maybury, M.T. (Ed.). (1997). Intelligent Multimedia Information Retrieval. Menlo
Park, CA: AAAI Press.
[14] Mejino, J., Rosse, C. Conceptualization of Anatomical Spatial Entities in the Digital
Anatomist Foundation Model. Structured Informatics Group, Department of Biological
Structure, University of Washington School of Medicine. Retrieved Nov 4, 2004, from
http://sig.biostr.washington.edu/s/da/
[15] Mojsilovic, A., Gomes, J. (2002). Semantic Based Categorization, Browsing and
Retrieval in Medical Image Databases. IEEE International Conference on Image
Processing, III (145-148). Retrieved Nov 1, 2004, from http://ieee.org
[16] Web Ontology Language. Retrieved Nov 21, 2004, from
http://www.w3.org/TR/owl-features/
[17] Pentland, A., Picard, R.W., Sclaroff, S. (1994). Photobook: Tools for content-based
manipulation of image databases. International Journal of Computer Vision, 18 (233-
254).
[18] Protégé. Retrieved Nov 3, 2004, from http://protege.stanford.edu/
[19] Rui, Y., Huang, T.S., Ortega, M., Mehrotra, S. (1997). Relevance feedback: a power
tool in interactive content-based image retrieval. IEEE Transactions on Circuits and
Systems for Video Technology, 8(5), (644-655). Retrieved Nov 1, 2004, from
http://ieee.org
[20] Semantic Web. Retrieved Oct 17, 2004, from http://www.w3.org/2001/sw
[21] Smith, J.R., Chang, S. (1997). Querying by color regions using the VisualSeek
content-based visual query system. In Maybury, M.T. (Ed.), Intelligent Multimedia
Information Retrieval (23-41). Menlo Park, CA: AAAI Press.
[22] The Digital Anatomist. Retrieved Oct 17, 2004, from
http://www9.biostr.washington.edu/cgi-bin/DA/imageform
[23] UMLS. Retrieved Oct 17, 2004, from http://www.nlm.nih.gov/research/umls/
[24] Visible Human. Retrieved Oct 17, 2004, from
http://www.nlm.nih.gov/research/visible/visible_human.html
[25] Visser, P., Bench-Capon, T. (1996). On the Reusability of Ontologies in Knowledge -
System Design. Conference Proceedings of the Seventh International Workshop on
Database and Expert Systems Applications, (256-261)
[26] M-Ontomat Annotizer. Retrieved Jan 30, 2006, from
http://www.acemedia.org/aceMedia/results/software/m-ontomat-annotizer.html
[27] Foundation Model of Anatomy. Retrieved Nov 11, 2005, from
http://sig.biostr.washington.edu/s/fm/AboutFM.html
[28] Tsechpenakis, G., Akrivas, G., Andreou, G., Stamou, G., Kollias, S. Knowledge-
Assisted Video Analysis and Object Detection. Image Video and Multimedia
Laboratory, Department of Electrical and Computer Engineering, National Technical
University of Athens. Retrieved Oct 30, 2006, from
http://www.cbim.rutgers.edu/papers/eunite_2002.pdf
[29] Christopoulas, C., Berg, D., Skodras, A. The Colour In the Upcoming MPEG-7
Standard. Retrieved Jan 5, 2007, from
http://www.eurasip.org/content/Eusipco/2000/sessions/ThuAm/SS2/cr1634.pdf
[30] Eidenberger, E. Evaluation and Analysis of Similarity Measures for Content-Based
Visual Information Retrieval. Interactive Media Systems Group, Institute of Software
Technology and Interactive Systems, Vienna University of Technology. Retrieved
Dec 15, 2006, from
http://www.ims.tuwien.ac.at/media/documents/publications/acmms2004b.pdf
[31] Geradts, Z., Hardy, H., Poortman, A., Bijhold, J. Evaluation of contents based image
retrieval methods for a database of logos on drug tablets. Netherlands Forensic
Institute. Retrieved Nov 21, 2006, from
http://citeseer.ist.psu.edu/cache/papers/cs/30794/http:zSzzSzgeradts.comzSzhtmlzSzDo
cumentszSzArticleszSzSPIE2001zSzdrugs.pdf/geradts01evaluation.pdf
[32] Papadopoulos, S., Mezaris, V., Kompatsiaris, I., Strintzis, M.G. A Region Based
Approach to Conceptual Image Based Classification. Information Processing
Laboratory, Electrical and Computer Engineering Dept., Aristotle University of
Thessaloniki. Retrieved Jan 5, 2006, from
http://www.iti.gr/~bmezaris/publications/vie05.pdf
[33] Sjöberg, M., Laaksonen, J., Pöllä, M., Honkela, T. Retrieval of Multimedia
Objects by Combining Semantic Information from Visual and Textual Descriptors.
Laboratory of Computer and Information Science, Helsinki University of
Technology. Retrieved Feb 15, 2007, from
http://www.cis.hut.fi/s/cbir/papers/icann2006mats.pdf
[34] Eakins, J., Graham, M. Content Based Image Retrieval. University of Northumbria at
Newcastle. Retrieved Dec 15, 2006, from
http://www.jisc.ac.uk/uploaded_documents/jtap-039.doc
[35] MPEG-7. Retrieved Nov 11, 2005, from
http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm
APPENDIX A
PRESENTATION SLIDES
This appendix contains the PowerPoint slides prepared for the thesis presentation.
APPENDIX B
INSTALLATION & USER MANUAL
Installation:
1. Download images from the Visible Human Project, available from the National Library
of Medicine FTP site - vhnet.nlm.nih.gov (130.14.35.50).
2. Install the Adobe Photoshop graphics software, available on compact disc through
academic purchase.
3. Install MySQL Database Management System from the MySQL download site -
http://dev.mysql.com/downloads/.
4. Download UMLS database from the National Library of Medicine site -
http://www.nlm.nih.gov/research/umls/meta6.html.
5. Install UMLS as a MySQL database on the MySQL server.
6. Download M-Ontomat Annotizer from the Acemedia site -
http://www.acemedia.org/aceMedia/results/software/m-ontomat-annotizer.html.
7. Install the Microsoft Visual Studio Integrated Development Environment (IDE), or any
other IDE such as Eclipse, available through academic purchase.
8. Install the Microsoft .NET Framework, available through academic purchase.
User Manual:
Preprocessing Steps:
1. Convert images from .raw format to JPEG files using Adobe Photoshop and specify
the values shown below for conversion:
            Anatomy  CT    MRI
Header      0        3416  7900
Width       2048     512   256
Height      1216     512   256
Channels    3        2     2
Interlaced           X     X
2. Store the JPEG image files in individual folders (one for each image).
3. Extract University of Washington Digital Anatomist (UWDA) ontology terms from
UMLS using the SQL query shown below:
SELECT * FROM MRCONSO WHERE SAB = 'UWDA';
MRCONSO is the table containing the ontological concepts, and SAB is the column
identifying the source of the terms in the UMLS database.
4. Create an empty text file in DAML format using the standard XML schema for
DAML. Store the extracted terms in the file created to form the DAML ontology
file (a combined sketch of steps 3 and 4 appears after this list).
5. Run M-Ontomat Annotizer, open the Ontology Explorer and load the ontology
DAML file created in the earlier step.
6. Open the Visual Description Extraction (VDE) Tool in M-Ontomat Annotizer and
load an image for image segmentation and feature extraction.
7. Select an ontology term in the ontology explorer and create a prototype instance for
the ontology term.
8. Corresponding to the selected ontology term, select the region of interest on the
image and extract all the image descriptors for this region using the VDE tool.
9. Store the image descriptor files and annotation files generated by M-Ontomat after
the prototype instance creation and extraction of image descriptors.
10. Repeat steps 6-9 for all the images.
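For illustration, steps 3 and 4 could be scripted as below. This is a hypothetical sketch
rather than part of the submitted implementation: it assumes the MySQL Connector/NET
package (MySql.Data), placeholder connection details, and that the UWDA concept strings
live in the STR column of MRCONSO.

using System.IO;
using MySql.Data.MySqlClient; // assumed: MySQL Connector/NET package

class UwdaTermExtraction
{
    static void Main()
    {
        // Connection details are placeholders for the local UMLS install.
        using (var conn = new MySqlConnection(
                   "server=localhost;database=umls;uid=user;pwd=password"))
        {
            conn.Open();
            var cmd = new MySqlCommand(
                "SELECT STR FROM MRCONSO WHERE SAB = 'UWDA'", conn);
            using (var reader = cmd.ExecuteReader())
            using (var output = new StreamWriter("UWDA.daml"))
            {
                // One skeletal DAML class per extracted UWDA term; the
                // surrounding <rdf:RDF> envelope is omitted for brevity.
                while (reader.Read())
                    output.WriteLine(
                        "<daml:Class rdf:ID=\"" + reader.GetString(0) + "\"/>");
            }
        }
    }
}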
Execution Steps for new images:
1. Pre-process the images as described in the earlier section.
2. Open Visual Studio and load the Semantic Indexing.csproj file stored on the
compact disc submitted with this thesis.
3. Specify the size of the image dataset and the locations of the image and M-Ontomat
output files in the MainProgram.cs file in the project. Also specify the location for
the output results files.
4. Build the project and execute it to start the semantic indexing system.
5. Results are all stored in text files.
Execution Steps for existing images:
1. Copy the contents of the folder "Project" from the compact disc submitted.
2. Navigate to the "Semantic Indexing" folder under the top folder "Project" and execute
the file Semantic Indexing.exe.
3. Results will be written to the Results folder under the top folder "Project".
APPENDIX C
SYSTEM OUTPUT
This appendix contains a screenshot of the sample output produced by the implementation.
APPENDIX D
IMAGE DESCRIPTOR FILES
This appendix contains sample image descriptor files for all image descriptor types; a
short sketch for reading these files programmatically follows the examples.
1. Color Layout Descriptor File:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns = "http://www.mpeg7.org/2001/MPEG-7_Schema" xmlns:xsi =
"http://www.w3.org/2000/10/XMLSchema-instance">
<DescriptionUnit xsi:type = "DescriptorCollectionType">
<Descriptor xsi:type = "ColorLayoutType"><YDCCoeff>26</YDCCoeff>
<CbDCCoeff>16</CbDCCoeff>
<CrDCCoeff>43</CrDCCoeff>
<YACCoeff5>16 14 17 16 16 </YACCoeff5>
<CbACCoeff2>16 18 </CbACCoeff2>
<CrACCoeff2>16 15 </CrACCoeff2>
</Descriptor>
</DescriptionUnit>
</Mpeg7>
2. Color Structure Descriptor File:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns = "urn:mpeg:mpeg7:schema:2001" xmlns:xsi =
"http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation =
"urn:mpeg:mpeg7:schema:2001 .\Mpeg7-2001.xsd">
<Description xsi:type = "ContentEntityType">
<MultimediaContent xsi:type = "ImageType">
<Image><VisualDescriptor xsi:type = "ColorStructureType" colorQuant = "1">
<Values>3 0 16 0 255 0 32 0 110 117 93 6 1 18 9 0 26 32 0 0 3 3 0 0 3
6 2 0 0 0 0 0 </Values>
</VisualDescriptor>
</Image>
</MultimediaContent>
</Description>
</Mpeg7>
3. Contour Shape Descriptor File:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns = "urn:mpeg:mpeg7:schema:2001" xmlns:xsi =
"http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation =
"urn:mpeg:mpeg7:schema:2001 schema/Mpeg7-2001.xsd">
<DescriptionUnit xsi:type = "DescriptorCollectionType">
<Descriptor xsi:type = "ContourShapeType">
<GlobalCurvature>1 1 </GlobalCurvature>
<PrototypeCurvature>0 1 </PrototypeCurvature>
<HighestPeakY>12</HighestPeakY>
<Peak peakX = "23" peakY = "3"/>
<Peak peakX = "15" peakY = "5"/>
<Peak peakX = "55" peakY = "5"/>
<Peak peakX = "38" peakY = "6"/>
<Peak peakX = "58" peakY = "7"/>
<Peak peakX = "28" peakY = "7"/>
<Peak peakX = "32" peakY = "5"/>
<Peak peakX = "19" peakY = "7"/>
<Peak peakX = "26" peakY = "7"/>
</Descriptor>
</DescriptionUnit>
</Mpeg7>
4. Dominant Color Descriptor File:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns = "urn:mpeg:mpeg7:schema:2001" xmlns:xsi =
"http://www.w3.org/2001/XMLSchema-instance">
<DescriptionUnit xsi:type = "DescriptorCollectionType">
<Descriptor xsi:type =
"DominantColorType"><SpatialCoherency>14</SpatialCoherency>
<Value><Percentage>1</Percentage>
<Index>4 9 15 </Index>
<ColorVariance>0 0 1 </ColorVariance>
</Value>
<Value><Percentage>3</Percentage>
<Index>18 14 11 </Index>
<ColorVariance>1 1 0 </ColorVariance>
</Value>
<Value><Percentage>9</Percentage>
<Index>8 4 3 </Index>
<ColorVariance>0 0 0 </ColorVariance>
</Value>
<Value><Percentage>12</Percentage>
<Index>25 22 15 </Index>
<ColorVariance>0 0 0 </ColorVariance>
</Value>
<Value><Percentage>4</Percentage>
<Index>14 8 6 </Index>
<ColorVariance>0 1 0 </ColorVariance>
</Value>
</Descriptor>
</DescriptionUnit>
</Mpeg7>
5. Edge Histogram Descriptor File:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns = "http://www.mpeg7.org/2001/MPEG-7_Schema" xmlns:xsi =
"http://www.w3.org/2000/10/XMLSchema-instance">
<DescriptionUnit xsi:type = "DescriptorCollectionType">
<Descriptor xsi:type = "EdgeHistogramType">
<BinCounts>1 2 7 1 4 1 4 4 5 4 2 4 3 5 5 1 2 1 6 5 6 0 6 5 5 1 2 2 5 6
2 3 3 3 6 6 0 3 5 6 5 0 2 7 6 5 2 4 6 5 3 4 6 5 5 6 0 4 2 6 0 0 0 3 3
0 4 2 7 5 1 4 6 0 5 0 1 5 0 3 </BinCounts>
</Descriptor>
</DescriptionUnit>
</Mpeg7>
6. Region Shape Descriptor File:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns = "http://www.mpeg7.org/2001/MPEG-7_Schema" xmlns:xsi =
"http://www.w3.org/2000/10/XMLSchema-instance">
<DescriptionUnit xsi:type = "DescriptorCollectionType">
<Descriptor xsi:type = "RegionShapeType"><MagnitudeOfART>15 15 4 5 5 15 15
5 8 2 5 6 0 7 1 0 0 13 13 8 2 2 2 11 10 8 1 2 2 4 2 2 2 0 2
</MagnitudeOfART>
</Descriptor>
</DescriptionUnit>
</Mpeg7>
7. Scalable Color Descriptor File:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns = "http://www.mpeg7.org/2001/MPEG-7_Schema" xmlns:xsi =
"http://www.w3.org/2000/10/XMLSchema-instance">
<DescriptionUnit xsi:type = "DescriptorCollectionType">
<Descriptor xsi:type = "ScalableColorType" NumberOfCoefficients = "2"
NumberOfBitplanesDiscarded = "3">
<Coefficients>-11 -3 -7 4 3 1 2 3 1 1 0 2 -3 1 3 2 0 0 0 0 -1 0 0 0 -1 0 0
0 -1 0 0 0 1 1 1 0 1 1 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 -1 0 0 0 0 0
0 0 </Coefficients>
</Descriptor>
</DescriptionUnit>
</Mpeg7>
8. Texture Browsing Descriptor File:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns = "http://www.mpeg7.org/2001/MPEG-7_Schema" xmlns:xsi =
"http://www.w3.org/2000/10/XMLSchema-instance">
<DescriptionUnit xsi:type = "DescriptorCollectionType">
<Descriptor xsi:type = "TextureBrowsingType"><Regularity>irregular</Regularity>
<Direction>90 degree</Direction>
<Scale>fine</Scale>
<Direction>0 degree</Direction>
<Scale>fine</Scale>
</Descriptor>
</DescriptionUnit>
</Mpeg7>
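For illustration, the numeric values in any of the files above can be read with LINQ to
XML as sketched below. The file name is a placeholder, and namespace handling is
deliberately simplified by matching on local element names only.

using System;
using System.Linq;
using System.Xml.Linq;

class DescriptorFileReader
{
    static void Main()
    {
        // Load one of the MPEG-7 descriptor files shown above.
        XDocument doc = XDocument.Load("ColorLayout.xml");

        // Print every leaf element (e.g. YDCCoeff, BinCounts) with its
        // value, ignoring XML namespaces by using the local name only.
        foreach (XElement e in doc.Descendants().Where(el => !el.HasElements))
            Console.WriteLine(e.Name.LocalName + ": " + e.Value);
    }
}

Run against the Color Layout file above, this would print lines such as "YDCCoeff: 26".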
APPENDIX E
DAML ONTOLOGY FILE
This appendix contains the University of Washington Digital Anatomist ontology file
created in DAML for this implementation.
<rdf:RDF
xmlns:rdf ="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs ="http://www.w3.org/2000/01/rdf-schema#"
xmlns:daml ="http://www.daml.org/2001/03/daml+oil#"
xmlns:xsd ="http://www.w3.org/2000/10/XMLSchema#"
xmlns:srdef ="C:\Gowri\Project\UWDA#"
>
<!-- This ontology is based on the Semantic Network part of the Unified Medical Language
System (UMLS) Knowledge Source Server, which is accessible at
http://www.nlm.nih.gov/research/umls/
All the data are retrieved from this knowledge source server for educational and
scientific use.
-->
<daml:Ontology rdf:about="">
<daml:versionInfo>$Id: UWDA.daml Tue Feb 06 21:40:59 2007$</daml:versionInfo>
<rdfs:comment>
Semantic Types of UMLS Semantic Network
</rdfs:comment>
<daml:imports rdf:resource="http://www.daml.org/2001/03/daml+oil"/>
</daml:Ontology>
<daml:Class rdf:ID="Kidney">
<rdfs:comment>
Body organ that filters blood for the secretion of URINE and that regulates ion
concentrations.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Abdomen">
<rdfs:comment>
Subdivision of trunk, which is demarcated from the thorax internally by the inferior surface
of the sternocostal part of the diaphragm and externally by the costal margin, from the
pelvis by the plane of the superior pelvic aperture and from the lower limbs by the inguinal
folds; together with the thorax, pelvis, and perineum, it constitutes the trunk. Examples:
There is only one abdomen.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Head">
<rdfs:comment>
Body part, which consists of a maximal set of diverse subclasses of organ and organ
part spatially associated with the skull, it is partially surrounded by skin of head.
Examples: There is only one head.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="AdductorMagnus">
<rdfs:comment>
Largest muscle in the thigh. It keeps the knees together.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Brain">
<rdfs:comment>
Subdivision of neuraxis that consists of neural tissue (which is organized into gray
matter and white matter) and the cerebral ventricular system (cavity of organ part); it is
embryologically derived from the rostral part of the neural tube; together with the spinal
cord, the brain constitutes the organ neuraxis. Examples: There is only one brain.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Pelvis">
<rdfs:comment>
Subdivision of trunk, which is demarcated from the abdomen by the plane of the
superior pelvic aperture, and from the perineum by the inferior surface of the pelvic
diaphragm; together with the thorax, abdomen, and perineum, it constitutes the trunk.
Examples: There is only one pelvis.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Thigh">
<rdfs:comment>
Different groups of muscles carry out opposing actions with regards to moving the
hip and knee joints.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Biceps">
<rdfs:comment>
The biceps brachialis, flexes the elbow (bends the arm).
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Lungs">
<rdfs:comment>
Lobular organ the parenchyma of which consists of air-filled alveoli which
communicate with the tracheobronchial tree. Examples: There are only two instances, right
lung and left lung.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="PectoralisMajor">
<rdfs:comment>
The pectoralis major muscle lies over the anterior wall of the chest.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Thorax">
<rdfs:comment>
Subdivision of the trunk, which is demarcated from the neck by the plane of the
superior thoracic aperture and from the abdomen internally by the inferior surface of the
diaphragm and externally by the costal margin; together with the abdomen, pelvis and
perineum, it constitutes the trunk. Examples: There is only one thorax.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Eyes">
<rdfs:comment>
Organ with organ cavity which is connected to the optic nerve. Examples: There are
only two eyeballs, the right and the left eyeballs.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Hamstring">
<rdfs:comment>
Muscle in the thigh.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Colon">
<rdfs:comment>
Part of the large intestine that extends from the cecum to the rectum.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="GlutealMuscles">
<rdfs:comment>
Muscle in the pelvis region.
</rdfs:comment>
</daml:Class>
<daml:Class rdf:ID="Unknown">
<rdfs:comment>
Unknown
</rdfs:comment>
</daml:Class>
</rdf:RDF>
APPENDIX F
IMAGE ANNOTATION FILES
This appendix contains a sample annotation file containing the ontology term associated
with the image and links to image descriptor files associated with the image.
<rdf:RDF xml:base="http://www.acemedia.org/fact-statements/PROTOTYPES#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:vdo="http://www.acemedia.org/ontologies/VDO#"
xmlns:vdoext="http://www.acemedia.org/ontologies/VDO-EXT#">
<vdoext:Prototype rdf:about="http://www.acemedia.org/ontologies/VDO-
EXT#Abdomen1">
<rdf:type rdf:resource="file://newOnto.org/C_/Gowri/Project/UWDA.daml#Abdomen"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440210270338
6011"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440236224338
6011"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440263615338
6011"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440292427338
6011"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440353475338
6011"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440380115338
6011"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440409694338
6011"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440438288338
6011"/>
</vdoext:Prototype>
<vdoext:Prototype rdf:about="http://www.acemedia.org/ontologies/VDO-
EXT#Abdomen2">
<rdf:type rdf:resource="file://newOnto.org/C_/Gowri/Project/UWDA.daml#Abdomen"/>
<vdoext:hasDescriptor
rdf:resource="http://www.acemedia.org/ontologies/VDO#VDE_INST_1171440139301338
6011"/>
</vdoext:Prototype>
</rdf:RDF>
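For illustration, the prototype-to-descriptor links in such a file can be read as sketched
below. The file name is a placeholder; the namespaces follow the sample above.

using System;
using System.Xml.Linq;

class AnnotationFileReader
{
    static readonly XNamespace Rdf =
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static readonly XNamespace VdoExt =
        "http://www.acemedia.org/ontologies/VDO-EXT#";

    static void Main()
    {
        var doc = XDocument.Load("Abdomen.rdf");

        // List each prototype instance and the descriptor resources
        // linked to it through vdoext:hasDescriptor.
        foreach (var prototype in doc.Descendants(VdoExt + "Prototype"))
        {
            Console.WriteLine(prototype.Attribute(Rdf + "about")?.Value);
            foreach (var d in prototype.Elements(VdoExt + "hasDescriptor"))
                Console.WriteLine("  " + d.Attribute(Rdf + "resource")?.Value);
        }
    }
}

Run against the sample above, this lists the Abdomen1 and Abdomen2 prototypes together
with their VDE_INST descriptor references.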