
Semantic Web – Interoperability, Usability, Applicability 0 (2011) 1–0
IOS Press

A Classification of Semantic Annotation Systems

Editor(s): Krzysztof Janowicz, University of California, Santa Barbara, USA
Solicited review(s): Jérôme Euzenat, INRIA Grenoble Rhône-Alpes, France and Sebastian Tramp, Universität Leipzig, Germany

Pierre Andrews a, Ilya Zaihrayeu a, and Juan Pane a

a Dipartimento di Ingegneria e Scienza dell’Informazione, The University of Trento, Italy
E-mail: {andrews,ilya,pane}@disi.unitn.it

Abstract. The Subject-Predicate-Object triple annotation model is now well adopted in the research community; however, it does not always speak to end-users. In fact, explaining all the complexity of semantic annotation systems to laymen can sometimes be difficult. We believe that this communication can be simplified by providing a meaningful abstraction of the state of the art in semantic annotation models. In this article, we therefore describe the issue of semantic annotation and review a number of research and end-user tools in the field. In doing so, we provide a clear classification scheme for the features of annotation systems. We then show how this scheme can be used to clarify the requirements of end-user use cases and thus simplify the communication between semantic annotation experts and the actual users of this technology.

Keywords: Semantic, Semantic Annotation, Vocabulary, Classification Scheme, Tag, Attributes, Ontology

1. Introduction

Social annotation systems such as Delicious1, Flickr2 and others have laid the foundations of the Web 2.0 principles and gained tremendous popularity among Web users. One of the factors of success for these systems is the simplicity of the underlying model, which consists of a resource (e.g., a web page), a tag (normally, a text string), and a user who annotated the resource with the tag, as can be seen in Figure 1. Despite its simplicity, the annotation model enables a set of useful services for the end user, e.g., searching resources using tags added by a community of users, computing the most popular tags and building the so-called tag clouds3, finding users with common interests based on the resources they annotated and on the tags they used and providing a recommendation service on this basis, among others.

1. http://www.delicious.com/
2. http://www.flickr.com/
3. http://en.wikipedia.org/wiki/Tag_cloud

Fig. 1. A Generic Annotation Model

Due to the natural language nature of the underlying model, these systems have been criticised for not being able to take into account explicit information about the meaning, or semantics, of each tag. For example, different users can use the same tag with different meanings (i.e., homonyms), different tags with the same meaning (i.e., synonyms), different tags with the same meaning but at different levels of abstraction, morphological variations of the same tag, and so on. In addition to the problem of ambiguity in the interpretation of the meaning of the tag itself, there is the further problem of deciding what the tag refers to. For example, a tag “John” attached to a photo does not specify whether John is a person on the photo or the photographer who took it.

0000-0000/11/$00.00 © 2011 – IOS Press and the authors. All rights reserved

With the advent of semantic technologies, some of the aforementioned problems were (partially) addressed in some systems. For example, Flickr introduced the so-called machine tags, which use a special syntax to define extra information about the tag. For example, a tag “upcoming:event=428084” assigned to a photo encodes that it is related to the Upcoming.org event identified by “428084”. Another example is Faviki4 – a social bookmarking system similar to Delicious; however, when users enter a tag, they are asked to disambiguate its meaning by linking it to a known concept extracted from Wikipedia.
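The informal machine-tag convention can be illustrated with a small parser. This is a sketch only: the "namespace:predicate=value" pattern follows Flickr's informal convention, but the regex and function below are illustrative assumptions, not Flickr's actual implementation.

```python
import re

# Machine tags informally follow the "namespace:predicate=value" pattern,
# e.g. "upcoming:event=428084". Illustrative sketch, not a real API.
MACHINE_TAG = re.compile(r"^(?P<ns>\w+):(?P<predicate>\w+)=(?P<value>.+)$")

def parse_machine_tag(tag):
    """Return (namespace, predicate, value), or None for a plain tag."""
    m = MACHINE_TAG.match(tag)
    if m is None:
        return None
    return m.group("ns"), m.group("predicate"), m.group("value")

print(parse_machine_tag("upcoming:event=428084"))  # ('upcoming', 'event', '428084')
print(parse_machine_tag("holiday"))                # None
```

A plain tag such as "holiday" does not match the pattern and is treated as an ordinary free-text tag.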

There are many types of annotation models available in the scientific state of the art and in already existing end-user applications. In the semantic web community, these models have been abstracted by the Subject-Predicate-Object triple, which can be used for most of the annotation types discussed here. However, this generalisation is low level in the sense that its atomic building element, a triple, can be used to represent a single property of an object, whereas end-user applications can use structurally richer atomic building blocks and include objects such as people, events, and locations.

In Sections 2, 3 and 4, we present an extended survey of the available tools and models and provide what we believe to be a simple abstraction of the important dimensions that an annotation model can have. The features were selected with the intention of demonstrating how much they can contribute to the definition of a semantic annotation model. The features were grouped into the following three dimensions:

1. the structural complexity of annotations (e.g., tags, attributes, and relations), see Section 2;

2. the vocabulary type, i.e., the level of formality of annotations, defined on the basis of the form of the underlying knowledge organisation system to the elements of which the annotations can be linked (e.g., thesauri, taxonomies), see Section 3; and

3. the user collaboration in sharing and reusing semantic annotations and in the collaborative construction and evolution of the underlying knowledge organisation system, see Section 4.

4. http://faviki.com

Within each dimension we provide a specification of the most typical approaches by giving their description, a comparative analysis of their advantages and disadvantages, and examples of popular systems (both in the research field and among publicly available systems). The goal of the comparative analysis is to show the trade-off between the level of user involvement and the services provided by each approach. The term user encompasses both the notion of user as annotator and user as consumer of the annotations; wherever it requires clarification, we will use the terms annotator or consumer to differentiate one from the other.

In Section 6 we show how this annotation model classification can be used to elicit requirements from end-users. To do this, we provide summaries and conclusions drawn from the analysis of a concrete use case pertaining to the telecommunications sector.

2. Structural Complexity of Annotations

The first dimension that we discuss is the structural complexity of annotations. This dimension relates to the amount of information that is encoded in the annotation itself, how it is structured in the underlying storage model and how this structure can be used. This structure has a great influence on what data can be displayed to the user and how, but also on what type of back-end services can be provided. We distinguish between tags, attributes, relations and ontologies. Tags are at one end of the spectrum and represent the easiest form of annotation from the annotator’s point of view, whereas ontologies are at the other end of the spectrum and represent the hardest form of annotation from the annotator’s point of view. In fact, as has been shown in [?], designing ontologies is a difficult and error-prone task even for experienced users.

2.1. Tags

A tag annotation element is a non-hierarchical keyword or free-form term assigned to a resource (see Figure 2). A tag only implicitly describes a particular property of a resource, as the computer and other consumers of the annotation do not know the meaning that the annotator intended (except if the natural language used is unambiguous). Normally, a tag is a single word or a sequence of characters without spaces (which typically serve as tag separators in the user input). Examples of tags include: the name of the person on a picture, the name of the place where a picture was taken, or the topic of a news article. Tagging annotation systems are often discussed as part of the folksonomy annotation model [?] that links tags, resources and users (annotators) following the model in Figure 1.

Fig. 2. The Tag Annotation Model
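The tag model above — a set of (annotator, resource, tag) triples, as in Figure 1 — can be sketched as a minimal in-memory store with tag-based retrieval. All names here are illustrative assumptions, not taken from any real system.

```python
from collections import defaultdict

class Folksonomy:
    """Minimal sketch of the tag annotation model: (user, resource, tag)
    triples plus an inverted index for tag-based retrieval."""

    def __init__(self):
        self.triples = set()
        self.by_tag = defaultdict(set)  # tag -> set of resources

    def annotate(self, user, resource, tag):
        self.triples.add((user, resource, tag))
        self.by_tag[tag].add(resource)

    def search(self, *tags):
        """Resources annotated with all of the given tags."""
        sets = [self.by_tag[t] for t in tags]
        return set.intersection(*sets) if sets else set()

f = Folksonomy()
f.annotate("alice", "photo1.jpg", "john")
f.annotate("bob", "photo1.jpg", "paris")
f.annotate("alice", "photo2.jpg", "paris")
print(f.search("paris"))          # both photos (set, order may vary)
print(f.search("paris", "john"))  # {'photo1.jpg'}
```

Adding a query term narrows the answer set by intersection, mirroring the refinement behaviour described for Delicious below.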

PROS With the increasing number of applications applying Web 2.0 principles, the notion of tagging as a simple organisation system is now familiar to Internet users and, therefore, it entails nearly no learning curve for a typical user to start using tags. They allow the user to easily annotate a Web resource with a free-text term and find other resources which were annotated with the same tag by browsing or searching. In fact, after four years of existence Flickr reported having “about 20 million unique tags” (January 2008) and 5 billion images (September 2010) [?].

CONS Tags represent a minimal annotation model from the structural complexity point of view and, therefore, can enable only a limited number of services, mainly focused on basic retrieval and browsing (e.g., retrieve resources that were assigned tag x). Because they only implicitly describe resource properties, tags are subject to ambiguity in the interpretation of these properties. For example, the natural language tag “John” attached to a picture does not specify whether John is a person on the picture or the photographer who took the picture.

APPLICATIONS Delicious5 is a social bookmarking service that allows its users to save and share URLs of Web resources such as blogs, articles, music, video, etc. It was founded in 2003 and now counts five million users and 150 million bookmarked URLs. The key idea of the service is that its users can access their bookmarks from any computer, for example at home, at work or while traveling. Delicious has been among

5. http://www.delicious.com/

the first popular systems that used tagging for organization and retrieval purposes. One of the important success factors of Delicious was its simplicity of use (see Figure 3). In a nutshell, it works as follows: the user finds an interesting website and decides to bookmark it in Delicious; when adding the website, the user annotates it with a set of terms (called tags). Later, in order to retrieve a saved bookmark, the annotation consumer can query the system with one or more of the previously assigned tags. If the answer set contains too many elements, it can be refined by adding more terms to the query. Delicious allows the user not only to browse personal annotations, but also to find bookmarks saved and annotated by other annotators. Tag clouds (see Figure 4) and tag-based search are starting points for navigation in the space of all (published) bookmarks of all users. While browsing bookmarks, the system visualizes the tags which were assigned to resources by other annotators.

Flickr6 is a free image hosting service that stores over three billion images. It was launched in 2004 and popularized the concept of tagging together with Delicious. Flickr allows its users to upload their photos, and to organize and share them using tags. Even though users can establish relationships, form communities, and comment on and annotate each other’s photos, the site is used more as a personal photo repository.

Another example of a popular social site which uses tags is Last.fm7. It is a UK-based Internet radio service, founded in 2002, which has thirty million active users in more than 200 countries. Last.fm allows its users to create custom playlists and radio stations from audio tracks available in the service’s library. It also offers numerous social features such as the recommendation of similar tracks based on a user’s favorites. Users can annotate resources such as bands, artists, albums and tracks and retrieve them using tag-based search. Examples of other systems that use tags for annotating their resources include: Youtube8, CiteULike9 and LiveJournal10.

Note that the Subject-Predicate-Object (SPO) model used widely in semantic web technologies (through RDF, for instance) is already of higher structural complexity than tags, as it supposes the existence of a predicate linking the tag (Object) and the resource (Subject). In fact, in the simple tagging annotation model, there is no need for this predicate, as it is always meant to be the tagging relationship.

6. http://www.flickr.org
7. http://www.last.fm
8. http://www.youtube.org
9. http://www.citeulike.org/
10. http://www.livejournal.com/

Fig. 3. Annotating a Resource Using the Tag Annotation Model

Fig. 4. Flickr Cloud of Tags
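This lowering of a tag into the SPO model can be made concrete with a one-line mapping; the predicate name "hasTag" is an illustrative assumption, not a standard vocabulary term.

```python
# A plain tag becomes an SPO triple by fixing the predicate: a tagging
# system only ever asserts one relation between resource and tag.
# "hasTag" is an illustrative name, not a standard vocabulary.
def tag_to_spo(resource, tag):
    """Express a tag annotation as a Subject-Predicate-Object triple."""
    return (resource, "hasTag", tag)

print(tag_to_spo("photo1.jpg", "paris"))  # ('photo1.jpg', 'hasTag', 'paris')
```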

2.2. Attributes

An attribute annotation element is a pair 〈AN, AV〉, where AN is the name of the attribute and AV is the value of the attribute (see Figure 5). The attribute name defines the property of the annotated resource (e.g., “location”, “event”, “starting date”) and the attribute value specifies the corresponding value (e.g., “Trento”, “birthday”, “April 1, 2009”). Apart from this, the model allows us to define the data types for attributes and, therefore, enables type checking at query time.
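The 〈AN, AV〉 model with datatype checking can be sketched as follows. The schema, attribute names and query are illustrative assumptions made for this example only.

```python
from datetime import date

# Sketch of the attribute annotation model: <name, value> pairs attached
# to resources, with declared datatypes enabling type checking.
SCHEMA = {"location": str, "event": str, "taken": date}  # illustrative

annotations = {}  # resource -> {attribute name: attribute value}

def annotate(resource, name, value):
    expected = SCHEMA[name]
    if not isinstance(value, expected):
        raise TypeError(f"attribute {name!r} expects {expected.__name__}")
    annotations.setdefault(resource, {})[name] = value

def search(name, predicate):
    """Resources whose attribute `name` satisfies `predicate`."""
    return {r for r, attrs in annotations.items()
            if name in attrs and predicate(attrs[name])}

annotate("eiffel.jpg", "taken", date(1895, 6, 1))
annotate("trento.jpg", "taken", date(2009, 4, 1))
# The kind of query discussed below: photos taken between 1890 and 1900.
print(search("taken", lambda d: date(1890, 1, 1) <= d <= date(1900, 12, 31)))
# {'eiffel.jpg'}
```

Passing a string where a date is declared raises a TypeError, which is the query-time type checking the model enables.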

Fig. 5. The Attribute Annotation Model

PROS Attributes are pervasively used on the Web and in desktop applications and, therefore, represent a well-known notion for end-users. Differently from tags, attributes explicitly define the described resource properties and, therefore, enable a richer resource annotation and query language. For example, one could search for images of the Eiffel Tower taken between 1890 and 1900 by a specific photographer.

CONS While enabling more services than tags do, attributes are still a limited means of annotation because they refer to single resources and, therefore, cannot be used to effectively enable services which are based on the interrelationships that exist between resources (e.g., search and navigation between related resources). Furthermore, attribute annotations require a more metadata-knowledgeable annotator than tag annotations do.

APPLICATIONS One of the earliest systems that used attributes for resource annotation was the Semantic File System described by [?]. The system allows the user to assign an arbitrary number of name-value pairs to the user’s files and then retrieve them by creating so-called virtual directories. The user creates virtual directories at runtime by specifying a list of attributes. According to the user input, the virtual directory contains only those files whose attributes match the attributes from the list. The implementation of the ideas introduced in Semantic File Systems can be found in search engines integrated directly into operating systems. This is often referred to as Extended File Attributes11, as files can be assigned a number of extra metadata attributes that are not relevant for the file system behaviour but can be used in search systems such as Mac OS X Spotlight.

The Web has many examples of popular systems using attributes. Almost all social networks (e.g. Facebook12 and MySpace13) consider their users as a resource and use the attribute annotation model to represent user profiles. The variety of attributes is large and includes common attributes such as “Personal Info” and “Contacts” as well as specific attributes such as “Interests” and “Traveling” (see Figure 6). Online markets such as Ebay14 use the attribute annotation model to annotate resources, which are the items to be sold. The seller can assign an item attributes such as “item location: Trento”, “item price: $100”, “shipping to: Italy”, etc.

In the Subject-Predicate-Object model, the predicate would be used to link the resource (subject) and the attribute value (object) through a property name (predicate). In that sense, the RDF+OWL Property predicates can be considered a good example of the attribute complexity dimension.

To annotate cultural heritage resources in the Bibliopolis collection15, [?] uses attributes representing metadata of scanned books, photos and related resources. They can then annotate visual resources with provenance, authors, date of publication, etc.

Both the Delicious and Flickr tag systems were “hacked” by users through so-called machine tags, which allow attributes to be informally encoded in the simple tag system they provide. For instance, before the introduction of a specific geotagging interface by Flickr, users could geolocalise their photos by assigning tags of the form geo:lat=XXXX and geo:long=YYYY. Both websites have added ad-hoc support for searching for these machine tags.

11. xattr in some unix-based systems
12. http://www.facebook.com
13. http://www.myspace.com/
14. http://www.ebay.com
15. http://www.bibliopolis.nl

2.3. Relations

A relation annotation element is a pair 〈Rel, Res〉, where Rel is the name of the relation and Res is another resource. The relation name defines how the annotated resource is related to Res. At the conceptual level, the relation annotation model is an extension of the attribute annotation model to the domain of resources, which allows the user to interlink these resources (see Figure 7). For instance, in a scientific paper, a citation referencing another paper is an example of a relation annotation, which defines a relation between these documents.
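The 〈Rel, Res〉 model can be sketched as a small store of typed links that supports navigation from one resource to another. The relation name "cites" and the resource names are illustrative, echoing the citation example above.

```python
from collections import defaultdict

# Sketch of the relation annotation model: a <relation-name, resource>
# pair attached to another resource, i.e. a typed, navigable link.
relations = defaultdict(set)  # (resource, relation name) -> related resources

def relate(source, relation, target):
    relations[(source, relation)].add(target)

def follow(source, relation):
    """Navigate a typed link from `source`."""
    return relations[(source, relation)]

relate("paperA.pdf", "cites", "paperB.pdf")
relate("paperA.pdf", "cites", "paperC.pdf")
print(follow("paperA.pdf", "cites"))  # both cited papers (set, order may vary)
```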

PROS Relation annotations provide a way to interlink various resources through typed links. They allow the user to navigate from one resource to another and enable search and navigation based on these relation links.

CONS The annotator is expected to bear a higher mental load w.r.t. the previous models as, instead of describing one resource, the annotator has to understand what the two resources are about and what kind of relationship holds between them.

APPLICATIONS Users of social network sites annotate their profiles by establishing relationships with each other, which allows them to find friends and to meet new people by navigating the network of relations between users. Apart from this, some social networks, for instance Facebook, allow their users to annotate photos with links to the profiles of people appearing in the picture (see Figure 8). In 2007, Facebook had around 1.7 billion uploaded photos with around 2.2 billion relation annotations. In a similar manner, bibliographic folksonomies such as Connotea [?] or BibSonomy16 allow for relating bookmarked scientific references to their authors.

The relation annotation model can also be used to define relations within a resource. For example, the Araucaria project [?] annotates the rhetorical structure of a document using RST relation annotations [?]. Figure 9 illustrates an example of intra-document annotation of RST relations; segments of text are given an identifier and links between the segments are annotated with a type of argumentation relationship (justify, condition, etc.).

Upcoming.org17 is a social website for listing events that can be linked to Flickr photos by annotating them with so-called triple or machine tags. Triple tags use a special syntax to define extra information about the tag, making it machine-readable. For example, a tag like “upcoming:event=428084” assigned to a photo encodes that it is related to the Upcoming.org event identified by “428084”, therefore making these resources interlinked. Last.fm is another example of a system that uses triple tags to link its tracks to Flickr photos.

16. http://www.bibsonomy.org
17. http://upcoming.yahoo.com/

Fig. 6. Resource Annotation Using the Attribute Annotation Model

Fig. 7. The Relation Annotation Model

Freebase18 is a large knowledge base containing around five million facts about the world. It is described as “an open shared database of the world’s knowledge” and “a massive, collaboratively-edited database of cross-linked data”19. It allows its users to annotate resources (e.g. images, text, Web pages) using the Freebase annotation schema. The schema defines a resource annotation as a collection of attributes, its kind (e.g. person, event, location), and typed relations with other resources. It provides a convenient way to perform search and navigation in the space of resources, allowing the user to find them using attributes, relations, and/or schema kinds.

18. http://www.freebase.com
19. http://www.crunchbase.com/company/metawebtechnologies

2.4. Ontologies

This model is based on the notion of semantic annotation [?], a term coined at the beginning. It describes both the process and the resulting annotation, or metadata, consisting of aligning a resource, or a part of it, with a description of some of its properties and characteristics with respect to a formal conceptual model or ontology. Figure 10 shows a schematic representation of the model. As defined by Gruber et al., “an ontology is an explicit specification of a (shared) conceptualization” [?]. In practice, ontologies are usually modeled with (a subset of) the following elements: concepts (e.g., CAR, PERSON), instances of these concepts (e.g., bmw-2333 is-instance-of CAR, Mary is-instance-of PERSON), properties of concepts and instances (e.g., PERSON has-father), restrictions on these properties (e.g., MAX(PERSON has-father) = 1)20, relations between concepts (e.g., PERSON is-a BEING), relations between instances (e.g., Mary has-father John), etc. [?]. The ontology annotation model allows the annotator to describe and interlink existing

20. The meaning of this rather informal notation is that any instance of the concept PERSON may have at most one father.


Fig. 8. Resource annotation using the relation annotation model

[Figure 9 depicts an RST tree over four text segments: a JUSTIFY relation connecting a CONDITION relation (over segments 1A and 1B) with a VOLITIONAL-RESULT relation (over segments 1C and 1D).]

[And if the truck drivers just don’t want to stick to the speed limits,]1A [noise and resentments are guaranteed.]1B [It is therefore legitimate to ask for proper roads and speed checks.]1C [And the city officials have signaled to support local citizens.]1D

Fig. 9. Rhetorical Structure Theory Relationships

resources by qualifying resources as concepts or as instances and by defining relations, properties, and restrictions that hold between them. Thus, it is the richest model, from the structural point of view, amongst all the models presented in this section.

An example of the application of the ontology annotation model is shown in Figure 11. In this example, possibly different users annotated resources that represent the concepts of a Person, Country, and Political Unit as well as the entities Barack Obama and USA. For the annotation, the annotator(s) used the relations is-instance-of,

Fig. 10. The Ontology Annotation Model

Fig. 11. An Example of the Application of the Ontology Annotation Model

is-president-of, and is-a defined in some ontology specification. Other examples of the application of this model can be found in [?].
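The running example from Figure 11 can be sketched as a small set of triples together with one inference rule (instances of a concept are also instances of its superconcepts). The identifiers mirror the example above; the single-step is-a lookup is a simplification made for brevity.

```python
# Sketch of the ontology annotation model: triples over concepts and
# instances, following the Barack Obama / USA example. Identifiers are
# illustrative; subsumption is resolved one level deep for brevity.
triples = {
    ("BarackObama", "is-instance-of", "Person"),
    ("USA", "is-instance-of", "Country"),
    ("Country", "is-a", "PoliticalUnit"),
    ("BarackObama", "is-president-of", "USA"),
}

def instances_of(concept):
    """Instances of `concept`, including those of its direct subconcepts."""
    subs = {concept} | {s for s, p, o in triples if p == "is-a" and o == concept}
    return {s for s, p, o in triples if p == "is-instance-of" and o in subs}

print(instances_of("PoliticalUnit"))  # {'USA'}, inferred via Country is-a PoliticalUnit
```

This is the kind of reasoning-backed retrieval that distinguishes the ontology model from plain attributes or relations: USA is returned for "Political Unit" even though it was only annotated as a Country.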

PROS Ontology-based annotations or “semantic annotations” describe a resource with respect to a formal conceptual model, allowing meaning-bearing links between structured and unstructured data (such as an ontology and a text). This empowers a whole new range of retrieval techniques, which can be based on the knowledge schema expressed in the ontology, benefit from reasoning and from the co-occurrence of annotations or entities in the same resource or context, as well as combine this with retrieval types specific to unstructured data, such as full-text search (FTS) in information retrieval (IR). The actual metadata is encoded in the annotation and usually expresses automatically or manually generated information about the resource. The ontology and the corresponding instance base capture background knowledge about a domain. The combination of the evidence-based information about the resource and the background knowledge allows indexing techniques based on resource URIs as modeled in the ontology, ensuring retrieval and navigation through each of the resource’s characteristics (for example, lexical representations such as NYC and New York City will be indexed as a single resource, despite their superficial differences, and a query such as “NYC” will lead to results containing the string “New York City”).
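The URI-based indexing just described can be sketched as two maps: lexical forms to a resource URI, and URIs to annotated documents. The URI, lexicon entries and document identifiers are illustrative assumptions for the NYC example.

```python
# Sketch of URI-based indexing: different lexical forms map to one
# resource URI, so a query for either form retrieves documents mentioning
# the other. The URI and document ids are illustrative.
lexicon = {
    "nyc": "http://example.org/resource/New_York_City",
    "new york city": "http://example.org/resource/New_York_City",
}

index = {  # resource URI -> documents annotated with that resource
    "http://example.org/resource/New_York_City": {"doc1", "doc7"},
}

def lookup(query):
    uri = lexicon.get(query.lower())
    return index.get(uri, set())

print(lookup("NYC") == lookup("New York City"))  # True: one resource, two forms
```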

CONS Each of the annotation models described, in order of increasing complexity, presents new challenges to human annotators while disclosing richer potential for automatic processing. Semantic annotations, being the most sophisticated of this row, are no exception. The main challenges semantic annotation presents fall along two major lines, namely (i) usability and (ii) maintenance of the conceptual models.

The usability aspect is fundamental to human involvement in the generation of semantic metadata and is also going to be the main hurdle that needs to be overcome to weave the approach into all forms of user interaction with software and data. Proposing a large number of entities and concepts coming from an ontology to the annotator is indeed an issue. There should thus be efficient search and recommendation services to help the annotator in providing the right entity or relation for the semantic annotation.

This raises another issue in the use of complex ontological structures. In fact, one can experience the challenge of presenting multiple subsumption structures over the same model, as well as multiple description facets of each resource. Empowering users to find their way to the right concept, entity or relationship that they want to cite is a serious challenge for usability experts and visual interface designers. In fact, as is shown in [?], “Most people find it difficult to understand the logical meaning and potential inferences of statements in description logics, including OWL-DL”. In addition, in real-world applications (in the Linked Open Data (LOD) cloud [?], for example) multiple instance bases and fact bases can be involved and thus their corresponding ontologies can be inter-aligned [?], resulting in millions or even billions of individual entity descriptions.

While the issue of scalable user interfaces that can deal with large numbers of possible values for an annotation also arises with the previously described models (such as tags or attributes), the complexity of displaying the right entities and relations from multiple large aligned ontologies further complicates the usability problem and creates new challenges in terms of scalability. Indeed, while intuitive search and auto-suggest/complete methods can already help with simpler models, the complexity of the structure of the ontology raises new issues in displaying large numbers of entities and their relations, as well as in summarizing entire knowledge bases to any useful level of granularity.

Another complicated task an ontology provider needs to face is the maintenance and update of the knowledge, often coming from external sources, its syntactic and semantic alignment, and, often, its challenging scale (e.g. bio-medical knowledge bases with billions of individual facts). In [?], the authors study different automatic approaches to construct or extend ontologies; they conclude that most of the state-of-the-art approaches require some level of expert involvement to curate the knowledge. This is also pointed out in [?].

APPLICATIONS OntoWiki [?] is a free, open-source semantic wiki application, meant to serve as an ontology editor and a knowledge acquisition system (see Figure 12). It allows its users to annotate Web resources, representing them as concepts and instances of concepts. Different attributes can be assigned to a resource and interlinked with other resources, therefore describing its characteristics. OntoWiki's retrieval functionality allows users to search and to generate different views and aggregations of resources based on concepts, attributes and relations.

Semantic MediaWiki [?] is an extension of a wiki software21 that allows its users to annotate wiki articles, defining an article as a class like "person" or "document", or as an instance of a previously declared class. The users can also interlink articles by annotating them with

21 The same one as Wikipedia uses.


P. Andrews and I. Zaihrayeu and J. Pane / A Classification of Semantic Annotation Systems 9

Fig. 12. Resource Annotation Using the Ontology Annotation Model in OntoWiki

typed links like "author" or "was born in". This brings semantic enhancement to MediaWiki that allows browsing and searching in more specific ways, such as "Give me a table of all movies from the 1960s with Italian directors". OntoBlog [?] takes a similar approach to Semantic MediaWiki, but is based on a blog platform and allows the authors of blog posts to define attributes and relations to an ontology's entities or other blog posts.
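The typed-link model above can be thought of as a set of subject/predicate/object triples over which structured queries are answered. The following is a minimal sketch of the "movies from the 1960s with Italian directors" query over invented data; the link-type names and articles are illustrative, not actual Semantic MediaWiki syntax.

```python
# Typed wiki links as (article, link type, target) triples (invented data).
links = [
    ("La Dolce Vita", "director", "Federico Fellini"),
    ("La Dolce Vita", "release year", 1960),
    ("Psycho", "director", "Alfred Hitchcock"),
    ("Psycho", "release year", 1960),
    ("Federico Fellini", "nationality", "Italian"),
    ("Alfred Hitchcock", "nationality", "British"),
]

def get(subject, link_type):
    """All targets linked from `subject` via `link_type`."""
    return {t for s, l, t in links if s == subject and l == link_type}

# "All movies from the 1960s with Italian directors":
movies_1960s = {s for s, l, t in links
                if l == "release year" and 1960 <= t < 1970}
result = {m for m in movies_1960s
          if any("Italian" in get(d, "nationality")
                 for d in get(m, "director"))}

assert result == {"La Dolce Vita"}
```

The point of the sketch is that, once links are typed, such queries become joins over the link types rather than full-text searches over article bodies.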

The authors of [?] describe a similar system that allows the annotation of wikis with semantic content; however, unlike the general entity/instance approach used by the systems described earlier, the annotations represent rules and knowledge that can be used for problem solving in expert systems.

While the previously discussed systems allow the content creators to dynamically create new classes and instances in a centralized website, the Annotea [?] annotation scheme focuses on allowing content consumers to annotate web pages distributed over the web with a basic set of classes22. For instance, users visiting a website can leave comments, questions or explanations on the content provided by the content creator.

KIM is a semantic annotation, indexing and retrieval platform developed by Ontotext [?]. It can mainly be used to facilitate automatic semantic annotation on top of different content types, with built-in extended capabilities for semi- or non-structured text processing based on the GATE framework [?]. It is deployed on top of a native semantic database engine, currently OWLIM and/or Sesame [?,?]. It allows both of the above described kinds of semantic annotations, at the

22 Note that in the Annotea scheme, new classes can also be described, but in a less open manner.

same time providing support for the simpler annotation models explained earlier and in the next section. Despite the fact that it has matured as a platform since the early 2000s and is in active use, it has never, so far, been focused on the usability or visualization aspects of the manual annotation process.

PhotoStuff [?] and PicSter [?] are also annotation systems based on ontologies, allowing the description of instances and classes within images (as their names suggest). While KIM allows text to be interlinked with ontologies to describe instances, PicSter allows for the creation of instances of entities depicted in figures (e.g., a type of flower in a photo). It also allows the annotation of pictures with metadata attributes relevant to the picture, such as where it was taken.

Microformats23 [?] allow for the embedding of instance descriptions directly in HTML, thus annotating the content with properties and relations to other entities on the web. hRest [?], for example, can describe RESTful web services directly from their documentation pages. The Schema.org24 initiative, started in 2011, is a demonstration of how such an approach of embedding semantic annotations within HTML is reaching a mature stage. In fact, as part of this initiative, a number of large search engine companies25 have agreed to work together towards a standardization of semantic annotation within HTML, in collaboration with one of the main web standardization bodies, the W3C, and its "Web Schema" task force26.

23 http://microformats.org/
24 http://www.schema.org
25 including Google and Bing
26 http://www.w3.org/2001/sw/interest/webschema.html
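To make the idea of embedding annotations in HTML concrete, the sketch below extracts property values from a small snippet written in the style of schema.org microdata (itemscope/itemtype/itemprop attributes), using only the Python standard library. The snippet and the flat, non-nesting parser are illustrative simplifications, not a conformant microdata processor.

```python
from html.parser import HTMLParser

# Illustrative snippet in the style of schema.org microdata.
SNIPPET = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Tim Berners-Lee</span>
  works at <span itemprop="affiliation">W3C</span>.
</div>
"""

class MicrodataParser(HTMLParser):
    """Collects itemprop name -> text content (flat, ignores nested scopes)."""
    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None  # itemprop currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            self._current = attrs["itemprop"]

    def handle_data(self, data):
        if self._current:
            self.props[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None

parser = MicrodataParser()
parser.feed(SNIPPET)
assert parser.props == {"name": "Tim Berners-Lee", "affiliation": "W3C"}
```

Search engines consume such embedded properties in much the same way: the annotation travels inside the page itself, rather than in a separate annotation store.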


While there are many existing vocabularies for annotating content on the web (such as the Dublin Core27, Common Tag28, SIOC29, etc.) with a formalised ontological structure, and many ways of implementing them (microformats, RDF, RDFS, N-Triples, etc.), FOAF [?] is a popular example of an annotation scheme to describe entities found on the web, in particular people and organisations, describing their relationships and their relations to web resources (maker, for instance). Anyone can embed a FOAF description on their website to provide contact metadata (dateofbirth, name, etc.) and relations to other entities around the web by using dereferenceable URIs to other FOAF descriptions or resource descriptions.

3. The Vocabulary Type

When annotation elements (e.g., tags, attribute names and values, relation names) are provided by the user in the form of free-form natural language text, these annotations unavoidably become subject to the semantic heterogeneity problem because of the ambiguous nature of natural language [?]. In particular, we identify the following four main issues:

Base form variation. This problem arises when the same word is entered (possibly by different users) using its different forms (e.g., plural vs. singular forms, conjugations) or erroneously (e.g., misspellings) during annotation or search [?]. This is usually dealt with by a lemmatization procedure that converts the annotation to its base form.

Polysemy. Annotation elements may have ambiguous interpretations. For instance, the tag "Java" may be used to describe a resource about the Java island or a resource about the Java programming language; thus, users looking for resources related to the programming language may also get some irrelevant resources related to the island (therefore reducing the precision);

Synonymy. Syntactically different annotation elements may have the same meaning. For example, the attribute names "is-image-of" and "is-picture-of" may be used interchangeably by users but will be treated by the system as two different attribute names because of their different spelling; thus, retrieving resources using only one of the attribute names may yield incomplete results as the computer is not aware of the synonymy link;

27 http://dublincore.org/
28 http://commontag.org
29 http://rdfs.org/sioc/spec

Specificity gap. This problem comes from a difference in the specificity of terms used in annotation and searching. For example, a user searching with the tag "cheese" will not find resources tagged with "cheddar30" if no link connecting these two terms exists in the system.
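The first issue, base form variation, is the only one that can be tackled by purely syntactic means. A minimal sketch of such normalization is shown below; the suffix rules and exception list are invented for illustration, whereas a real system would rely on a proper lemmatizer.

```python
# Minimal sketch of base-form normalization for tags: a hand-rolled
# exception list plus two suffix rules (illustrative, not a real lemmatizer).
EXCEPTIONS = {"mice": "mouse", "geese": "goose"}

def to_base_form(tag: str) -> str:
    tag = tag.lower().strip()
    if tag in EXCEPTIONS:
        return EXCEPTIONS[tag]
    if tag.endswith("ies"):
        return tag[:-3] + "y"    # "puppies" -> "puppy"
    if tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]          # "dogs" -> "dog", but keep "glass"
    return tag

# Different surface forms collapse to one annotation element:
assert to_base_form("Dogs") == to_base_form("dog") == "dog"
assert to_base_form("puppies") == "puppy"
assert to_base_form("mice") == "mouse"
assert to_base_form("glass") == "glass"
```

The remaining three issues (polysemy, synonymy, specificity gap) concern meaning rather than form, which is why they require the knowledge organisation systems discussed below.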

The use of free-form natural language requires minimal involvement of the user at annotation time but leads to a higher involvement of the user at search and navigation time. The four problems described above produce a higher level of noise in search results, and the annotation consumers need to manually filter out the results they are interested in. At navigation time, the lack of an explicit structure that defines the relationships between terms used for the annotation largely reduces the navigation capability, and only simple clustering can be performed to help the user navigate through the annotations (such as the one illustrated in the Delicious tag cloud in Figure 4).

The problems described above can be addressed to a certain extent by using a Knowledge Organisation System (KOS) that can take the form of an authority file, a glossary, a classification scheme, a taxonomy, a thesaurus, a lexical database (such as WordNet [?]), or an ontology, among others (see [?] for an in-depth discussion on the different types of KOSs). The key idea is that terms in user annotations and queries can be explicitly linked to the elements of the underlying KOS and, therefore, their meaning can be disambiguated (see Figure 13). Depending on the type and structure of the KOS, the above mentioned problems can be addressed to a different extent, as discussed in the rest of this section.

The controlled vocabulary is a noteworthy example of a KOS and can be defined as "a closed list of named subjects, which can be used for classification" [?]. Controlled vocabularies are normally built and controlled in a top-down fashion by a (small) group of experts in a particular domain. When using such a vocabulary, the terms used by the annotation creators can be linked to elements of the vocabulary (e.g., words, concepts) and thus be disambiguated (recall Figure 13). The same approach can also be applied to disambiguate search terms used by the annotation consumers, thus solving some of the issues described earlier.

30 which is a kind of cheese

Notably, the term controlled vocabulary comes from the Library and Information Science community; in the Semantic Web community, the term ontology is often used to describe similar kinds of knowledge organisation systems.

Ontologies are often classified based on the level of their expressivity (or formality), which conditions the extent to which a certain form of ontology can be used in automated reasoning. The authors of [?] proposed such a classification, shown in Figure 14, where they note that there is a point on the scale (marked as a black bar) where automated reasoning becomes useful: this is where the ontology can be reasoned about using the subclass relations [?].

Note that the way ontologies can be used to annotate resources (as described in Section 2) is different from the way ontologies can be used to support the annotation process (as described in this section). In the former case, users build ontologies by providing pieces of the knowledge they contain as annotation elements. For example, by linking a page about Barack Obama to a page about people with the ontological relation is-instance-of (recall the example depicted in Figure 11), the user annotates the page about Barack Obama with an ontology annotation element. Such elements (possibly provided by different users) are then assembled into a bigger ontology, which can be seen as a complex ontological annotation structure used to describe the annotated resources. In the case when an ontology is used as a KOS to support the annotation process, the users provide (simpler forms of) annotation elements and (semi-automatically) map them to the background ontology (see Figure 13). For instance, the annotator might tag a web page with the term "dog" and link it to the concept DOG in the background ontology. It is worth mentioning that both approaches can potentially enable automated reasoning.
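The distinction between the two roles of an ontology can be sketched with two tiny data structures; all identifiers below are invented for illustration.

```python
# (1) Ontology *as* the annotation: users assert ontological statements
# about resources, which accumulate into a shared knowledge base of triples.
annotation_triples = [
    ("page:BarackObama", "is-instance-of", "page:People"),
]

# (2) Ontology *supporting* the annotation: a pre-existing background KOS
# to which simpler annotation elements (here, free-text tags) are linked.
background_concepts = {"DOG": {"gloss": "a domesticated canine"}}
tag_links = {("page:AboutFido", "dog"): "DOG"}   # (resource, tag) -> concept

# In (1) the triple *is* the annotation; in (2) the tag is the annotation
# and the concept link disambiguates it.
assert ("page:BarackObama", "is-instance-of", "page:People") in annotation_triples
assert tag_links[("page:AboutFido", "dog")] == "DOG"
```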

In the rest of this section we discuss three types of annotation models which differ in whether they are based on a KOS and, if they are, in the kind of KOS. In the selection of the models we followed the principle of staying within the spectrum of so-called lightweight ontologies [?]. The first model is not based on a KOS whereas the second and the third ones are. These last two models fall on the left and on the right side of the vertical bar shown in Figure 14, respectively. We show how the four issues described at the beginning of this section can be addressed in the last two approaches. Table 1 summarizes the semantic heterogeneity related issues and to which KOS types they apply.

Table 1
Semantic heterogeneity issues and their relation to types of KOSs

  ISSUE                 | no KOS | Authority File | Thesaurus
  Base form variation   |   √    |                |
  Polysemy              |   √    |                |
  Synonymy              |   √    |                |
  Specificity gap       |   √    |       √        |

3.1. No KOS

In this model, annotation elements are not linked to KOS elements and, therefore, are subject to at least the four problems described in the introduction to this section. In other words, free text is used for annotation and search. Note that search in this model is implemented as the matching of strings representing query terms with those used in annotation elements.
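A minimal sketch of such string-matching retrieval makes the four issues visible directly; the resources and tags below are invented for illustration.

```python
# No-KOS retrieval: search is plain string matching over free-text tags
# (data invented for illustration).
annotations = {
    "r1": {"java"},        # about the programming language
    "r2": {"java"},        # about the Indonesian island
    "r3": {"cheddar"},
    "r4": {"automobile"},
}

def search(term):
    """Return resources whose tag set contains exactly `term`."""
    return {r for r, tags in annotations.items() if term in tags}

assert search("java") == {"r1", "r2"}  # polysemy: both senses returned
assert search("cheese") == set()       # specificity gap: "cheddar" missed
assert search("car") == set()          # synonymy: "automobile" missed
```

Everything the later models add (authority files, thesauri) amounts to replacing this exact string match with a lookup through a shared conceptual structure.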

PROS The advantage of this model is that the annotator and the annotation consumer do not need to know about the existence of a KOS and do not need to be involved in helping resolve ambiguities in linking annotation elements to the elements in the KOS. Thus, the human involvement at annotation and search time is minimal: the user enters a list of free-text words in a text box.

CONS The main disadvantages of this model are the four problems described earlier: base form variation, polysemy, synonymy, and the specificity gap. Moreover, the minimal involvement at annotation time translates into a higher involvement required at search and navigation time, as described in the introduction to this section. Folksonomies and other collaborative tagging systems based on free text annotations suffer from the lack of formal semantics, which leads to the semantic heterogeneity problem and makes indexing, search and navigation more difficult [?].

APPLICATIONS Delicious and Flickr are both examples of systems that use free text for annotation and search. The user can enter any free-text tag they want to annotate their bookmarks and photos (respectively). Connotea [?] and Bibsonomy31 (amongst

31 http://www.bibsonomy.org


Fig. 13. KOS-based Annotation and Search

Fig. 14. A Classification of Various Forms of Ontologies According to their Level of Expressivity (Formality) (From [?])

others) allow for a similar type of free-text tagging, but for scientific references.

3.2. Authority file

This model is based on a simple form of a controlled vocabulary called the authority file. In this vocabulary, synonymous terms are grouped to represent a concept and one of the terms is selected as the concept name and used for visualisation and navigation purposes (this term is called the preferred term) [?]. Each concept may have one or more associated terms and each term may belong to one or more concepts. In this model, the annotation elements are mapped to the controlled vocabulary terms and, consequently, to concepts which uniquely codify the meaning of the annotation elements. For example, the tag "automobile" can be mapped to a concept with the preferred term "car" in the controlled vocabulary. Note that when a term belongs to more than one concept, the user may need to

be involved to disambiguate the concept selection at annotation or query time.
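An authority file can be sketched as a map from concept identifiers to a preferred term plus a term set; the concept identifiers below are invented for illustration. A term listed under several concepts is exactly the case where the user must be involved in disambiguation.

```python
# Sketch of an authority file: synonymous terms grouped into concepts,
# one preferred term per concept (identifiers invented for illustration).
authority_file = {
    "C_CAR":         {"preferred": "car",             "terms": {"car", "automobile"}},
    "C_JAVA_LANG":   {"preferred": "Java (language)", "terms": {"java"}},
    "C_JAVA_ISLAND": {"preferred": "Java (island)",   "terms": {"java"}},
}

def concepts_for(term):
    """All candidate concepts for a term; more than one means the user
    (or a disambiguation algorithm) must choose."""
    return sorted(c for c, e in authority_file.items() if term in e["terms"])

assert concepts_for("automobile") == ["C_CAR"]   # synonymy resolved: same
assert concepts_for("car") == ["C_CAR"]          # concept for both terms
assert len(concepts_for("java")) == 2            # ambiguous: needs choice
```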

PROS The model allows us to resolve the polysemy and synonymy problems by linking annotation elements and user queries to a concept in the vocabulary. For example, if the user searches for the term "java" that is mapped to a concept with the meaning of a programming language in the controlled vocabulary, the search will not return resources annotated with the term "java" that refers to the island in Indonesia. It is worth mentioning that the user can partially resolve the polysemy problem by adding more terms to the query. For example, querying with "java programming language" is unlikely to return results related to the island. However, it does not address the problem of synonymy (thus affecting recall) and would require the user (as consumer of the annotations) to provide


queries with more terms, whereas user queries are typically very short32.

Because synonymous terms are linked to a single concept in the authority file, the user (as annotation consumer) will be able to discover and navigate resources annotated with a given concept without having to know the particular terms the annotator and/or other annotators used to denote this concept while annotating these resources.

CONS At both annotation and search time, the user may need to get more involved. At annotation time, if there is an ambiguity in the mapping from the term to possible multiple concepts in the vocabulary, the annotator may need to choose which concept the term refers to. Similarly, at search time, if there is an ambiguity in the terms the users use for searching, they may need to specify explicitly which concept they are referring to. Alternatively, the user may decide to leave the concept disambiguation task to the system, which may apply a disambiguation algorithm; however, this may lead to a large number of erroneously selected concepts, as the disambiguation task is known to be hard in the general case [?].

The second issue is the time and human resources needed to create and maintain the data in the authority file. In particular, one of the most challenging maintenance problems is to keep the contents of the authority file aligned with the possibly emerging vocabulary of the users to cover their annotation and search needs.

APPLICATIONS Faviki33 is a social bookmarking system similar to Delicious (see Figure 15). However, when users enter a tag, they are asked to disambiguate its meaning to a known concept. These concepts are automatically extracted from the contents of DBPedia [?]. This disambiguation allows a better convergence between the terms used for the annotation by different users and provides better results at search time. In [?], the authors provide a review of such semantic bookmarking systems.

Delicious introduced a few features to its website to help users gain more control over their vocabulary. By adding recommendation and auto-completion features to the tagging interface, they provide a dynamically extensible vocabulary where users can add new free-text tags or reuse the ones already existing in

32 An analysis of the top 1000 queries submitted to the Yahoo search engine showed that the average length of a query is 1.8 terms (see http://webscope.sandbox.yahoo.com/).

33 http://faviki.com

the community vocabulary. Apart from this, Delicious added the feature of creating tag "bundles", where users can group together their tags in any way they want, thus helping them disambiguate and control their vocabulary more strictly. Note that although authority files are created by domain experts whereas Delicious bundles are created by ordinary users, the two data structures are similar and, therefore, can be subject to the same advantages and disadvantages as discussed in this section. We extend this discussion in Section 4, where we provide details on the different ways to create and maintain the contents of the underlying KOS (e.g., by users or by experts).

3.3. Thesaurus

A thesaurus is "a vocabulary of a controlled indexing language, formally organized so that the a priori relationships between concepts (for example as "broader" and "narrower") are made explicit" [?]. Figure 16 provides an example of a thesaurus. In this thesaurus, the concepts "Cat" and "Dog" are linked to their parent concept "Mammals" with the genus-species relation, which is then linked to a concept "Vertebrate" with the same relation.

Note that a thesaurus can be constructed by extending the authority file with parent-child relationships between the concepts in the authority file. This thesaurus structure is the basis for the KOS of the model discussed in this section.

PROS This model inherits the pros of the model based on the authority file (see Section 3.2). Apart from that, this model allows us to solve the specificity gap problem because query terms can be mapped to the annotation elements through the parent-child relations of the thesaurus and, therefore, resources which are more specific in meaning than the user query can also be retrieved. For example, the user can annotate two different resources with "cat" and "dog" and then retrieve both resources by searching for "mammal".
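Bridging the specificity gap amounts to walking the parent-child relation at query time. The sketch below mirrors (a simplified slice of) the taxonomy in Figure 16; the concept names and annotated resources are invented for illustration.

```python
# Sketch of thesaurus-based retrieval: each concept points to its broader
# parent, and a query matches resources annotated with the query concept
# or any of its descendants (taxonomy simplified from Figure 16).
parent = {"cat": "mammal", "dog": "mammal",
          "mammal": "vertebrate", "vertebrate": "animal"}
annotations = {"r1": "cat", "r2": "dog", "r3": "vertebrate"}

def matches(query, concept):
    """True if `concept` equals `query` or lies below it in the hierarchy."""
    while concept is not None:
        if concept == query:
            return True
        concept = parent.get(concept)   # climb to the broader concept
    return False

def search(query):
    return {r for r, c in annotations.items() if matches(query, c)}

assert search("mammal") == {"r1", "r2"}            # specificity gap bridged
assert search("vertebrate") == {"r1", "r2", "r3"}
```

Note that this walk subsumes the authority-file behaviour: an exact match on the query concept still works, while strictly narrower annotations are now retrieved as well.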

CONS This model inherits the cons of the model based on the authority file (see Section 3.2). Another disadvantage of thesauri is the problem of vocabulary granularity. Namely, if someone (e.g., a domain expert) produces a thesaurus, there is no guarantee that it will be detailed enough for the user for annotation and search. For instance, in the thesaurus example presented earlier (see Figure 16), the expert might not have introduced the "vertebrate" vs. "invertebrate" distinction. In this case, if the user wants to


Fig. 15. Faviki: Tagging with Concepts from Wikipedia (screenshot provided by faviki.com)

Nature
  Animal
    Vertebrate
      Mammals: Cat, Dog
      Birds: Cormorant, Robin
    Invertebrate
      Echinodermata: Sea Star
      Mollusca: Squid

Fig. 16. Example of a Thesaurus

perform a generic search for all "vertebrate", only a search for the higher term "animal" will be possible. In contrast, if the thesaurus provided by the domain expert is too detailed, the disambiguation task at annotation and search time may become more complex for the user, who might not know the differences between terms which are very close in meaning.

APPLICATIONS Shoebox34 is a photo management software that allows users to annotate photos with annotations which are nodes in a thesaurus. The software comes with predefined categories (such as countries/regions/cities, or the animal kingdom), but users can also freely extend the thesaurus by creating their own categories and sub-categories. The user can then search for all photos tagged with an annotation corresponding to a category or one of its sub-categories, as illustrated in Figure 17.

34 http://www.kavasoft.com/Shoebox

The approach presented in [?] uses the SKOS vocabulary schema [?] to represent the thesauri used for annotating cultural heritage resources, such as scanned book-printings in the Bibliopolis35 collection.

35 http://www.bibliopolis.nl


Fig. 17. Searching of Photos by Categories in Shoebox: Here Things > Nature > Animal > Vertebrate > Mammals

4. User Collaboration

In this section we describe a dimension related to how users contribute to the creation of the different types of annotations (see Section 2) and of the vocabulary, or KOS, used in the process (see Section 3). We can distinguish between two approaches that can be applied to the creation of the annotation or to the creation of the KOS:

1. the single-user model, where a single user performs the task of either annotating resources or creating the vocabulary (or both),

2. the community model, where a set of users collaborate in the task of either annotating resources or creating the vocabulary (or both).

In this section we will focus on the interactions between users, and how these interactions affect, positively or negatively, the other two elements of our classification of annotations; these elements are the structural complexity of the annotation, i.e., how users interact with each other using some degree of structured annotation (see Section 2) to annotate resources, and the type of vocabulary used to annotate the resources (see Section 3). Note that a system can have combinations of the single-user and community approaches with the annotation and the vocabulary or KOS building. For example, a system can use a community-built vocabulary or KOS, and have only single-user annotations.

Table 2
Types of Collaboration for Annotating Resources and Building Vocabularies

                                   TYPE OF ANNOTATION
  FEATURE                  Single User                  Collaborative
                           Private use    Public use
  Interaction between
  users when annotating    none           none          encouraged
  Type of vocabulary       personal,      personal,     shared
                           private        shared

Considering the annotation perspective, in the single-user model we can identify two sub-dimensions by considering who consumes the annotations (see Table 2). A single user can annotate a resource for personal use only, or can annotate a resource considering that these annotations will later be consumed by many other users. As can be seen in Table 2, in the personal use of annotations the vocabulary used for the annotations can be local, and the user is free to use any form of vocabulary as long as s/he understands the meaning of the annotation (e.g., the user can add an annotation "Fufy" and s/he might understand that this is the name of his/her pet). In contrast, in the public use of


the annotations, the annotator should use some form of shared vocabulary, as the intention is for the annotation to be consumed by others; this also holds for collaborative annotations. In the remainder of this section we will explain how this consideration affects the annotation process.

Considering the vocabulary perspective, i.e., how the vocabulary is built to be later used in the annotation process, we will focus on KOS-type vocabularies (see Section 3.2 and Section 3.3), since in the free-form annotation model (No KOS, see Section 3.1), anyone is free to use any term (even terms not corresponding to a vocabulary).

4.1. Single-user (Private use)

In this model, a user annotates resources for personal purposes, usually to organize these resources for future search or navigation, i.e., the same user is the annotator and the consumer. Single users could also build their personal KOS (see Section 3), but as we will see in the CONS part, the necessary work typically does not justify this task.

PROS The advantage of the single-user annotation model for private use is that each user is in full control of the annotations, and since there is no sharing, the issues related to the semantic heterogeneity problem (see Section 3) are reduced to the scope of the single user. One can also argue that since the annotator and the consumer of the annotation are the same person, the annotations will reflect the personal taste/opinion and knowledge of the user, which in turn can be translated into more accurate results at search time.

CONS In this model, each single user has to annotate all the resources that s/he wants to have annotated for future use. These resources can be private resources (e.g., local files), resources shared by other users (e.g., shared photos), or publicly available resources such as Web pages. This annotation process has the following disadvantages:

– annotating all of the resources takes time and requires motivation (building a personal KOS to annotate the resources requires even more time and motivation). If the user has no strong incentives, the quality and coverage of the annotations and the KOS cannot be guaranteed. Indeed, [?] found that users rarely annotated their resources if they were for private use, but that if they annotated photos (on Flickr in this study), it was to provide better search for other people.

– users have to remember which terms they normally use to denote a particular concept when annotating a resource and use them consistently across the whole set of resources. If this is not done properly, the user will not be able to find the desired resources at search time (see Section 3). While this is also true for the collaborative model (see Section 4.3), given the higher number of users annotating the same resource, we hypothesize that there is more diversity in the use of terms to denote the same concept; therefore, in the collaborative model, even if the user searches for a resource with a different term than the one s/he applied in the annotation, this different term might be present too.

APPLICATIONS Bookmarking is a well known example of annotation of Web resources for personal use. Currently, most Web browsers include this feature by default.

Other examples of personal annotation systems are Picasa desktop36 and Shoebox, where users tag their photos with free-form keywords. Picasa also adds the possibility of geo-tagging photos, while Shoebox provides a KOS in the form of a thesaurus for the annotation.

Another system for the annotation of Web resources is Zotero37, where users have the possibility to manage Web resources related to scientific research by classifying them and/or by adding free-form annotations.

4.2. Single-user (Public use)

In this model, a single user (normally a classification expert, or a small set of experts representing an institution) annotates resources with the goal of organizing the knowledge for a broader set of users that will consume these annotations. Library catalogs are perhaps the most well known example of this kind of model.

Considering the construction of the KOS, in this model a single user (also normally a domain expert or a small set of experts representing an institution) builds the KOS to be used by a broader set of users in the annotation task (normally, but not only, the classification experts mentioned above) and later for search (normally by the public in general). In this

36 http://picasa.google.com/
37 http://www.zotero.org/


case, library classification schemes or thesauri are wellknown examples.

Note that the classification expert and the domain expert have different tasks, which can therefore be performed by different people. Classification experts are experts in the annotation task: they classify by annotating resources with the predefined classes of a predefined vocabulary, while domain experts define the classes and the vocabulary to be used in the classification and annotation task. There are cases, however, when the classification and domain expert roles are performed by the same person. Whenever we mean to refer to only one type of expert, we will make the distinction clear.

PROS Many people benefit from the work performed by a few experts in the field (both classification and domain); the work (either the annotation or the resulting KOS) is considered to be of good quality, resulting in a good organization of the resources. In addition, experts in the annotation and classification task are also able to use more complex and well structured vocabularies (see Section 3), as they are better trained in this task. Well studied and consistent KOSs in the form of controlled vocabularies developed by domain experts are very useful for solving the issues related to the semantic heterogeneity problem (see Section 3).

CONS It is normally costly to have dedicated experts annotate and classify resources or build KOS in the form of controlled vocabularies. Moreover, in highly dynamic domains, the time lag between the publication of a resource and its annotation can be considerable [?]. This delay can be due either to the scalability of the approach (work overload of the classification experts), making the classification of vast amounts of resources in short times impracticable, or to the fact that the field of knowledge is new or very dynamic and thorough studies have to be performed in order to reach consensus on the right vocabulary to be used for the annotations.

Another issue with this approach is that the classification experts who perform the annotation are neither the authors of the resource nor the consumers of the annotations; therefore, the annotations might not always reflect the consumers' perspective or the authors' intention [?]. Some possible solutions to these issues, still at the research stage, have been proposed; for example, letting casual annotators define their classes and keywords, which will later be approved by domain and/or classification experts [?].

If the single annotator is not an expert in the anno-tation and classification process (or vocabulary build-

ing) it could miss relevant annotations (or terms) for aresource. This is also true in the collective annotationmodel (see Section 4.3) but given the higher numberof users annotating the same resource (or building thevocabulary) we hypothesize that there is more proba-bility that more meaningful annotations will be usedgiven that the resource is analysed from possible dif-ferent perspectives.

APPLICATIONS Library catalogs, such as the Library of Congress38, and many other libraries fall into this model of annotation. Library classification schemes used in library catalogs are examples of expert-designed KOS in the form of controlled vocabularies with the purpose of classifying books.

4.3. Collective annotation

In the collective annotation model, a set of users share their annotations about publicly available resources. These annotations are later also used by a set of users (possibly larger than the set of annotators) for navigation and search. In this model, the workload of annotating resources is distributed (in contrast to the previous two models). The users do not necessarily work together in the process of annotation, nor do they have to reach an agreement on the resource annotations (but obviously this could be the case).

In this model, we can devise two subcategories: one where users collaboratively modify the same annotation (wiki style) and another where users can only modify their own annotations (folksonomy style). While in both subcategories users (annotators) might have different motivations to create or modify the annotations, the pros and cons presented here generically apply to both subcategories, unless explicitly stated otherwise.
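The difference between the two subcategories can be seen as a difference in what an annotation is keyed on. The following minimal sketch (our own illustration, not taken from any of the surveyed systems) contrasts the two storage models:

```python
# Folksonomy style: each (user, resource) pair keeps its own tag set.
# Wiki style: all users edit one shared tag set per resource.
folksonomy = {}   # (user, resource) -> set of tags
wiki = {}         # resource -> set of tags, collaboratively edited

def tag_folksonomy(user, resource, tag):
    folksonomy.setdefault((user, resource), set()).add(tag)

def tag_wiki(user, resource, tag):
    # 'user' could be logged for an edit history; the annotation
    # itself is shared by (and can be overridden by) everyone
    wiki.setdefault(resource, set()).add(tag)

tag_folksonomy("alice", "page1", "python")
tag_folksonomy("bob", "page1", "snake")
tag_wiki("alice", "page1", "python")
tag_wiki("bob", "page1", "snake")

print(folksonomy[("alice", "page1")])  # {'python'}: bob's tag stays separate
print(wiki["page1"] == {"python", "snake"})  # True: one shared annotation
```

In the folksonomy model no agreement is needed, since annotations never collide; in the wiki model every edit touches the shared set, which is what makes editing wars possible.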

Collaborative tagging (or social tagging) is a well-known model for annotating resources39 with free-text tags [?]. When a critical mass of user-generated tags is available, these annotations can be used to generate folksonomies [?]. The main characteristic of folksonomies (folk + taxonomies) is the bottom-up approach (i.e., from individuals to groups) for reaching consensus on emerging concepts, without any a priori agreement or KOS.

Considering the construction of KOS, the basic idea is to let users collaboratively build the KOS using a bottom-up approach. This approach is reflected in several proposals in the literature. These proposals differ in whether there should be a pre-defined KOS which will be collaboratively extended [?], or the KOS should emerge bottom-up from scratch [?,?]. Depending on the model, users could work together in order to reach agreements in the construction process [?], or they could work independently and an automatic process could address the evolution of knowledge [?]. [?] proposed several generic characteristics and issues that any system for emergent semantics should consider and address. The notions presented in these proposals allow us to address disadvantages present in centrally built KOS.

38 http://www.loc.gov/index.html
39 Mainly on the web.

PROS The main advantage of the collaborative annotation model and construction of the KOS is the distribution of the workload among all annotators of the system. This, in turn, lowers the barrier to user participation in the annotation process, since one user can adopt the annotations that were assigned to a resource by others (thus simplifying the annotation task) or, even more importantly, can search and discover new resources that were annotated with the same tags as the user's own tags [?]. Another advantage (in the folksonomy style) is that annotators can express their own views when annotating resources, which, in contrast with annotations made by experts, could simplify search and discovery of resources for other users (consumers) with the same interests, considering that these resources were annotated from many potentially different points of view and interests.

In addition, the mass of data provided through the annotations in a collaborative model, as well as the users' relations with these annotations, can be used for automatic extraction of new knowledge (new terms and relations in the KOS); for example, a recommendation system can be built to propose relevant annotations for a new resource based on existing annotations of similar resources [?], or to extract co-relations from unstructured annotation schemes [?,?,?,?]. While such techniques can be applied with any type of collaboration model, the state of the art in recommender systems uses knowledge from multiple users, as it is difficult to provide rich recommendations when only a single user is involved. In addition, these statistical techniques require a large amount of data to be accurate and, while it is theoretically possible to apply them to single-user annotation models, the critical mass of annotations is in practice often only reached in multi-user systems.
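A minimal sketch of such a co-occurrence-based tag recommender follows, with toy data; it illustrates the general idea only, not the algorithm of any cited system:

```python
from collections import Counter

# Tag sets previously assigned by the community to other resources.
annotated = [
    {"python", "programming", "tutorial"},
    {"python", "programming"},
    {"python", "snake"},
]

def recommend(seed_tags, k=2):
    """Propose the k tags that most often co-occurred with the
    seed tags on other resources in the collection."""
    scores = Counter()
    for tags in annotated:
        if tags & seed_tags:                 # resource shares a seed tag
            scores.update(tags - seed_tags)  # vote for its other tags
    return [t for t, _ in scores.most_common(k)]

print(recommend({"python"}))  # 'programming' ranks first (2 co-occurrences)
```

With only one user's annotations, the counts would rarely be large enough to separate signal from noise, which is why such recommenders work best in multi-user systems.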

This model also provides the system with behavioral information about the users' interests through their interaction with other users' annotations (consuming them) and their own annotations (producing them). By using this information, the system could infer a user model containing, for example, their education, background, interests or culture. Based on this user model, the system can suggest and discover related resources, or can discover hidden communities and sub-communities of users based not only on social relations (such as friends or coworkers) but also on interests and shared knowledge or background [?]; Ebay and Amazon40 are systems that use these techniques to suggest related products to their consumers.

[?] illustrates how static vocabularies or KOS might not be well suited for dynamic online use, as new terms arise quickly. Building the KOS in a bottom-up fashion can address the time lag problem for the inclusion of new concepts in the KOS. Theoretically, having more semantics in the annotations will enable better services, and this also goes with growing the underlying vocabulary to match the whole vocabulary used by the user [?].

CONS Social annotation approaches can suffer from subjective annotations that express personal opinions, likes or dislikes of the annotators, and not the actual intention of the resource author. Examples of these subjective annotations are "best paper", "funny", "boring", "to read" [?]. This type of annotation cannot be avoided, since the primary goal of any annotation model is to allow users to find the resources they are interested in. These subjective annotations could also be important for ranking the resources and the annotating users. The system could try to limit such subjective annotations by automatically separating the personal annotations from the general annotations, or by avoiding the mix between annotations that express opinions and the ones that are not associated with the annotator's personal taste. In the folksonomy model, each user will express their own preferences as stated before, but in the wiki model, where users have to override other users' annotations, this could even become an editing war where each user (or group of users) tries to impose their own opinions41 [?].

An important issue which remains open and needs to be treated in community-built KOS is the model to be used to allow knowledge to emerge from the contribution of each single user. The process of agreement (either automatic or user-driven) in knowledge evolution is still an interesting research issue to be addressed [?,?].

40 http://www.amazon.com/
41 http://en.wikipedia.org/wiki/Wikipedia:Edit_warring

APPLICATIONS Diigo42 and Delicious are examples of applications implementing this model of annotation to build a shared collection of bookmarks. While they both allow for private bookmarks43, their main features are built around the sharing of annotations with the site community to construct a public folksonomy.

Flickr also allows users to comment on resources, tag them, or add them to their own personal favorites list. These tags are later used for search and navigation. [?] proposed an extension for searching Flickr content where WordNet [?] is used to expand queries; using this approach, users are able, for example, to find results about "automobiles" when searching for "cars".
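The query expansion idea can be sketched as follows; a tiny hand-made synonym map stands in for WordNet here, and the plural normalisation is deliberately naive:

```python
# Illustrative stand-in for a WordNet synset lookup.
synsets = {
    "car": {"car", "auto", "automobile", "machine", "motorcar"},
}

def expand_query(term):
    base = term.rstrip("s")          # naive singularisation for the sketch
    return synsets.get(base, {term})

def search(term, tagged_resources):
    """Return resources whose tags match the term or any synonym."""
    terms = expand_query(term)
    return sorted(r for r, tags in tagged_resources.items() if tags & terms)

photos = {"photo1": {"automobile"}, "photo2": {"sunset"}}
print(search("cars", photos))  # ['photo1']
```

Without the expansion step, the query "cars" would miss photo1 entirely, since it was tagged with the synonym "automobile".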

The work presented by [?] incorporates the ideas of collaboration into library catalogs, where users can become more than simple annotation consumers by annotating resources in digital libraries, thus becoming annotation providers. Facetag [?] is another initiative following this direction. The rationale in this system is to try to integrate top-down (i.e., from experts to individuals) and bottom-up classification in a semantic collaborative tagging system, incorporating the ideas of faceted classification, which helps users organize their resources for later search, navigation and discovery. Facetag incorporates the ideas of collaborative annotation and collaborative KOS in a single system.

Considering community-built KOS, [?] proposed an ontology maturing process by which new knowledge is formalized in a bottom-up fashion, going through several maturity stages: from informal emergent ideas using input from social annotation systems such as folksonomies, to formal lightweight ontologies via a learning process involving users. They also introduced Soboleo44, a system for collaborative ontology engineering and annotation.

Another work related to community-built KOS was presented by [?], where a model for formalizing the elements in an ontology evolution scenario was proposed. The model consists of three elements: Actor, Concept and Instance. This model has a straightforward parallel to the model adopted in Section 1 to explain the annotation process, as seen in Figure 1, where Actor is equivalent to users, Concepts to annotations and Instances to resources.

42 http://www.diigo.com/
43 Thus falling in the single-user (private use) dimension.
44 http://www.soboleo.com/

Tagpedia [?] tries to build a general-purpose KOS to overcome the problems of WordNet and Wikipedia. The main issues with WordNet are the lack of knowledge about entities (especially people) and the lack of support for incorporating new knowledge. Wikipedia contains a lot of information about entities but suffers from the lack of a more formal/ontological structure [?]. The idea of Tagpedia is to initially mine Wikipedia to construct an initial set of syntags (in contrast to synsets in WordNet, which group words) and to allow users to extend this initial set of syntags dynamically in a bottom-up manner.

In PicSter [?], image annotations are created locally by the users and thus there is no central ontology shared by all peers using the system. PicSter still provides the opportunity to share resources and their annotations by providing ontology alignment [?] when the users perform search queries.

Annotea [?] takes an interesting approach to collaborative annotation of resources in that it is decentralised. In particular, the resources (web pages) that are annotated are not always owned by the annotation creators, and the annotations can be shared and distributed on different annotation servers. Annotation creators can leave comments, questions, etc. on websites that can be consulted by other users of the Annotea system and extended (with answers, explanations, etc.) by these other users.

5. Discussion

As we have seen in the previous section, there are many approaches to annotation systems, from theoretical models to actual end-user implementations of such systems. While there have already been reviews of the state of the art in this field (e.g., see [?]) and a number of research workshops ([?,?] for instance), we are not aware of any approach that has tried to characterize diverse semantic annotation approaches in a common framework. In the previous sections, we have thus tried to identify the main dimensions that would help us classify a large variety of systems together to show their common features. Table 3 shows a summary of some systems that we have discussed and how they fit in the different dimensions. In this section we discuss some properties and drawbacks of the proposed classification as a whole (Section 5.1) and provide some recommendations for building semantic annotation systems based on these dimensions (Section 5.2).

Table 3
Examples of systems and their features according to our classification (* see comments in Section 5). The columns are: STRUCTURAL COMPLEXITY (Tag, Attribute, Relation*, Ontology), VOCABULARY TYPE (No KOS, Authority file*, Taxonomy) and COLLABORATION TYPE (Single user, private use; Single user, public use; Collaborative). The systems compared are Flickr, Delicious, Faviki, Last.fm, Youtube, Facebook, Ebay, CiteULike, Bibsonomy, Zotero, Mendeley, OntoWiki, Semantic MediaWiki, Annotea, Picasa Desktop, Picasa Web and Picster.

5.1. Classification Properties and Drawbacks

Independence of the Dimensions In the previous sections, we have introduced three main dimensions to classify annotation systems:

1. Structural Complexity of Annotations,
2. Vocabulary Type,
3. User Collaboration.

While we have discussed these dimensions as being mainly independent (as, in our opinion, they are orthogonal), we do not intend them to be considered separately, as one cannot design an annotation system without making choices along each dimension. As we have discussed for the Vocabulary Type (Section 3) and the Collaboration Type (Section 4), some aspects of these dimensions might be preferable in combination. For instance, in large collaborative systems it is often easier to use a simpler structural complexity, as Flickr or Delicious have done, as this simplifies the sharing of annotations and the user interfaces. However, as we have discussed, by using a simpler structural complexity, one loses much of the semantics of the annotations, and thus it might still be preferable to go for more complex structures and use more advanced annotation alignment algorithms, as is illustrated by [?]. Another example of this preferential tie between dimensions is that of the ontology structural complexity. When using such an annotation system, the annotators will create instances and classes while annotating. It thus seems natural to allow future annotators to reuse this created ontology as a vocabulary for their own annotations. While this is not a real dependence between the structural complexity and the vocabulary type (i.e., the available vocabulary could still be No KOS), it probably makes more sense to provide such features to the user. In these cases, the dimensions remain orthogonal and independent but display some pairing preferences, and this is why in the previous discussion some systems were found in many dimensions.

Structural Complexity In addition, there are some fuzzy separations between the possible sub-dimensions of each main dimension. For instance, Flickr and Delicious are both marked as having a tag structural complexity in Table 3. However, users have started using the tag system as a workaround to store more complex data by using machine tags, thus being able to store attributes and relations in the tags. We do not directly consider these systems as supporting attributes or relations because, at the implementation level, when dealing with machine tags stored as free-text tags, it is very hard to leverage the full semantic services provided by the attribute or relation dimensions of the structural complexity. Indeed, both Flickr and Delicious have had to provide specific support for machine tags, thus moving away from purely free-text tag systems. For instance, Flickr eventually introduced a full-fledged geotagging feature, separate from the tag annotations, after many users started using machine tags to store the latitude and longitude of photos.
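Machine tags follow the namespace:predicate=value convention (e.g., geo:lat=57.64 on Flickr). A minimal sketch of the parsing a system must perform to recover an attribute from such a free-text tag:

```python
def parse_machine_tag(tag):
    """Split a machine tag of the form 'namespace:predicate=value'
    into its three parts; return None for a plain free-text tag."""
    if ":" not in tag:
        return None
    namespace, rest = tag.split(":", 1)
    if "=" not in rest:
        return None
    predicate, value = rest.split("=", 1)
    return namespace, predicate, value

print(parse_machine_tag("geo:lat=57.64"))  # ('geo', 'lat', '57.64')
print(parse_machine_tag("sunset"))         # None: ordinary free-text tag
```

This recovers an attribute-like triple from the string, but the system still has no schema for it: nothing says that geo:lat should be numeric or paired with geo:lon, which is why machine tags fall short of a true attribute dimension.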

Another important point to consider is the difference between attributes and relations in the structural complexity. When a resource is represented as a URI in an annotation system, it is easy to encode a relation as an attribute whose value is a URI. Thus, at the data storage level, the attribute dimension and the relation dimension are not really different. However, knowing that a particular attribute represents a relation can enable more powerful features. For instance, suppose that for a person we have one attribute "picture" that points to the URI of a file and one attribute "father" that points to the URI of another person. If they are both only attributes, there is no difference for the computer as to what to do with the two fields. However, if the "father" attribute has the semantics of a relation, then the computer might know that there is more to this URI than just a reference: for instance, mutual relations could be computed, possible transitivity could be exploited, etc. Thus relations are still interesting to consider, and we have kept them as a standalone sub-dimension as they provide a larger feature set.
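The distinction can be sketched as follows; the resource URIs and property names (anna, boris, picture, father) are the hypothetical ones used in the example above:

```python
# Both "picture" and "father" are stored identically, as attribute/value
# pairs whose values happen to be URIs; only "father" is additionally
# declared to carry relation semantics.
attributes = {
    "http://example.org/anna": {
        "picture": "http://example.org/anna.jpg",
        "father": "http://example.org/boris",
    },
}
relation_properties = {"father"}   # the declared relations

def related_resources(subject):
    """Follow only the attributes declared as relations."""
    return [(prop, obj)
            for prop, obj in attributes.get(subject, {}).items()
            if prop in relation_properties]

def inverse_of(prop, obj):
    """Inverse traversal ('who has boris as father?'), which is only
    meaningful once the attribute is known to be a relation."""
    return [subj for subj, props in attributes.items()
            if props.get(prop) == obj]

print(related_resources("http://example.org/anna"))
print(inverse_of("father", "http://example.org/boris"))
```

Without the relation_properties declaration, the "picture" and "father" values are indistinguishable strings; with it, graph traversal, inverse lookup and transitivity become possible.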

Vocabulary Type The same issue arises when we consider the vocabulary type. While many systems use the No KOS approach (for instance, Flickr free-text tagging), some still provide an auto-completion or recommendation feature that helps users choose tags from a list of popular tags, for example. Considering this, systems with such a feature, such as Delicious, stand between the No KOS dimension and the use of a simple authority file made of the most popular tags. However, as for the structural complexity, while this approach provides support for the user when creating annotations, it might not provide all the features of an authority file; for instance, there is no control over the base-form variations of the tags.
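A minimal sketch of such a popularity-based auto-completion, with a hypothetical tag log, also makes the limitation visible: base-form variants ("blog", "blogs", "blogging") all survive as distinct entries, which a curated authority file would collapse:

```python
from collections import Counter

# Hypothetical usage log: with no authority file, variants accumulate.
tag_log = ["blog", "blogs", "blog", "python", "blogging", "python", "blogs"]
popularity = Counter(tag_log)

def autocomplete(prefix, k=3):
    """Suggest the k most popular tags starting with the prefix; a
    de-facto authority file built from usage, with no base-form control."""
    candidates = [(n, t) for t, n in popularity.items() if t.startswith(prefix)]
    return [t for n, t in sorted(candidates, reverse=True)[:k]]

print(autocomplete("blo"))  # all three variants of "blog" are suggested
```

The suggestions guide users toward popular spellings, but nothing prevents "blog" and "blogs" from coexisting as separate vocabulary entries.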


Thus, while the top level of our classification is meant to represent orthogonal dimensions, the second level is meant to be taken as main divisions between feature sets, which can sometimes overlap to produce hybrid systems, as with machine tags or the auto-completion of tags. However, as we pointed out, this overlap also has its drawbacks, as it does not enable all the features of the more complex dimension. We have pointed out along the article where these overlaps between approaches exist, while trying to keep a logical split between the different levels of each dimension.

5.2. Recommendations for building semantic annotation systems

In this section we provide our recommendations for the configuration of a balanced semantic annotation system that, on the one hand, encapsulates enough semantics to allow the development of new semantics-based services for the end-user and, on the other hand, does not require explicit expertise from the user to provide semantic annotations, which should facilitate a wider adoption of such systems. Our recommendations are based on the analysis and findings reported in this article, as well as on our experience with the INSEMTIVES project annotation systems (see Section 6), and are provided for each of the three dimensions, as reported below:

Structural Complexity of Annotations. While users should be able to provide annotations of the tag, attribute, and relation kinds (see Section 2), they should be driven towards providing more relations and attributes to enable more semantics-based services. In this case, the user will provide qualified resource descriptions and will interlink resources with qualified links. This results in a structure that can be naturally mapped into RDF and, therefore, can enable the adoption and reuse of the services developed by the Semantic Web community so far. In addition, these lightweight but structure-rich annotations can form the basis for the automatic extraction of data sets to be uploaded to the LOD cloud. The use of a full-fledged ontology for annotation is not recommended when the annotating user is not an expert in Semantic Web technologies, due to high learning barriers [?].
Noteworthy, the tendency towards more structured annotations can be noticed in some annotation systems. For example, while Flickr and Delicious only started with free-text tags, they soon introduced more complex attribute annotations: both Flickr and Delicious provided support for attributes and relations based on machine tags; Flickr also introduced a dedicated geo-localisation interface to store the coordinates (attributes) where a photo was taken; Delicious introduced a way to send a resource to other users by linking to them with the for: tag. In the same way, Wikipedia (and MediaWiki) introduced info-boxes that allow wiki pages to be annotated with attributes, and this feature was then extended by projects like Semantic MediaWiki [?].
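The natural mapping onto RDF mentioned above can be sketched as follows; the vocabulary URIs are hypothetical, and the output is simplified N-Triples (literal escaping is omitted):

```python
def to_ntriples(resource, tags=(), attributes=(), relations=()):
    """Map the three annotation kinds onto RDF triples: tags become
    plain literals, attributes qualified literals, relations links
    between resource URIs."""
    base = "http://example.org/vocab/"   # hypothetical vocabulary namespace
    triples = []
    for tag in tags:
        triples.append(f'<{resource}> <{base}tag> "{tag}" .')
    for prop, value in attributes:
        triples.append(f'<{resource}> <{base}{prop}> "{value}" .')
    for prop, target in relations:
        triples.append(f'<{resource}> <{base}{prop}> <{target}> .')
    return triples

for t in to_ntriples("http://example.org/photo1",
                     tags=["sunset"],
                     attributes=[("takenOn", "2011-06-30")],
                     relations=[("depicts", "http://example.org/trento")]):
    print(t)
```

Note how the three structural kinds differ only in how the object of the triple is qualified, which is exactly why tag-only annotations lose semantics: everything collapses into the first, untyped form.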

Vocabulary type. We recommend that, during the annotation process, the user be supported by a KOS of the thesaurus type. This will enable one of the most used reasoning tasks, subsumption on the hierarchy of concepts (see Section 3.3), and will allow the problems described at the beginning of Section 3 to be addressed. While the annotating users should be driven towards providing annotations that are linked to KOS elements, they should not be restricted from providing free-text annotations. This recommendation is meant to support the annotators who are already familiar with existing Web 2.0 annotation systems and to overcome the potential problem of partial coverage of the KOS, which can be addressed, at least in part, by adopting the recommendation for the user collaboration dimension.
In public annotation systems, we can see that this trend towards more structured vocabularies is growing, with the introduction of tag "bundles" in Delicious, with the use of categories in Wikipedia, or with new social bookmarking sites based on KOS such as Faviki.
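The subsumption reasoning that a thesaurus-type KOS enables can be sketched as follows; the concept hierarchy and annotations are illustrative:

```python
# Broader-to-narrower links of a toy thesaurus.
narrower = {
    "animal": ["dog", "cat"],
    "dog": ["beagle"],
}
annotations = {
    "photo1": {"beagle"},
    "photo2": {"cat"},
    "photo3": {"car"},
}

def expand(concept):
    """A concept plus all of its transitively narrower concepts."""
    result = {concept}
    for child in narrower.get(concept, []):
        result |= expand(child)
    return result

def search(concept):
    """Subsumption-aware search: a query on a broader concept also
    matches resources annotated with its narrower concepts."""
    terms = expand(concept)
    return sorted(r for r, tags in annotations.items() if tags & terms)

print(search("animal"))  # ['photo1', 'photo2']: beagle and cat subsumed
```

With a flat, No KOS vocabulary, the query "animal" would return nothing here, since no resource carries that literal tag.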

User Collaboration. During the annotation process, whenever possible, it is recommended that the workload of annotating resources be distributed among the users of the system. This leverages the "wisdom of crowds" effect [?], a concept that has been successfully implemented in many Web 2.0 applications such as Delicious. Considering the KOS, it is recommended that the contents of the KOS be evolved dynamically by the users, in order to address the problem of partial knowledge and the time lag between the creation of concepts inside communities and their inclusion in the KOS by experts. From the user perspective, this means that they are more in control of the vocabulary; from the system design perspective, this means that new methodologies for evolving knowledge are required. In order to avoid the "cold start" problem, it is recommended to provide initial data in the KOS (possibly provided by experts). Because the users will have to deal with the semantics of annotations, proper user interface and interaction tools need to be developed that expose these semantics to the user in a simple and natural way. For example, when the user enters "java" as a tag annotation, the system can show a summary of the recognised meaning of the tag, for example "java (island)", which would help the user understand whether the meaning is the intended one and correct it otherwise [?].
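The sense-confirmation interaction described above can be sketched as follows; the sense inventory is purely illustrative, where a real system would query a lexical resource or the community-built KOS:

```python
# Illustrative sense inventory, ranked by assumed likelihood.
senses = {
    "java": ["java (island)", "java (programming language)", "java (coffee)"],
}

def suggest_sense(tag):
    """Return the top-ranked sense to show back to the user, so they
    can confirm the system's interpretation or pick another sense."""
    candidates = senses.get(tag.lower())
    return candidates[0] if candidates else None

print(suggest_sense("java"))    # 'java (island)': shown for confirmation
print(suggest_sense("sunset"))  # None: free-text tag kept as-is
```

The key interface point is that the user is shown a human-readable gloss, not a concept URI, and the correction loop is what keeps the tag-to-concept mapping accurate.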

6. Using the Annotation Model Dimensions

The three classification dimensions that we discussed in the previous sections are useful not only for classifying existing annotation systems but also for specifying requirements when developing new annotation systems. While different technologies can be used to implement a particular annotation model (e.g., RDF, microformats, relational databases), the system's high-level features are not dependent on such technology and there is no need to provide such low-level details to the users.

Noteworthy, the proposed classification was used within the INSEMTIVES project45 in order to identify the requirements for annotation models in three different industrial use cases. In this section, we describe the methodology that was applied for this analysis (Section 6.1) and provide details of such an analysis for one use case partner, a big player in the telecommunication sector (Section 6.2).

6.1. Applied Methodology

In order to identify the needs of the use case partners for the different characteristics of the annotation model, a semi-structured interview was performed with a representative of each use case partner. Before the interviews, each use case partner was given the report on the three dimensions of annotation systems (as presented in Sections 2, 3 and 4). This allowed us to increase the awareness of the use case partners of the existing systems and their features, and helped the partners make more informed decisions about their needs. About forty questions were prepared for the guided interviews46. The questions were made as non-technical as possible, but with clear connections to the elements of the annotation model they referred to. Below we give some examples of the questions with an explanation of their relation to the classification dimensions:

45 FP7-231181, see http://www.insemtives.eu

Structural Complexity

Should the user be able to explicitly refer to properties of the resources the user query refers to?

RATIONALE: the goal of this question is to clarify if the attribute structural complexity should be used.

When searching for a resource, should the user be able to find resources that have a certain relationship with another resource?

RATIONALE: the goal of this question is to clarify if the relation structural complexity should be used.

Vocabulary Type

Should the user be able to navigate from a given resource to other semantically related resources?

RATIONALE: the goal of this question is to clarify if the annotations should be assigned explicit semantics.

If they navigate, are they interested in seeing a detailed classification/thesaurus (such as by country, then by region, then by city, etc.) of the resources, or is a rough clustering enough (such as a tag cloud)?

RATIONALE: the goal of this question is to clarify if the thesaurus KOS should be used.

Collaboration Type

Can the set of available annotation terms be extended dynamically (or automatically), and by whom?

RATIONALE: the goal of this question is to clarify if there is a need for updating the vocabulary, and whether collaboration is needed or allowed.

Will the same people that make the annotations search based on the annotations? Are annotators and end users different?

46 Interested readers are referred to [?] for the complete list of questions and for the transcripts of the interviews.


RATIONALE: the goal of this question is to clarify if the annotations need to be shared, or if the annotations should be private.

The answers to all the questions were then rigorously analysed with the goal of identifying the requirements for the annotation model of each use case partner, as reported in [?]. Once the requirements of each use case partner were mapped to the three classification dimensions, the implementation technology was decided, based on the specific particularities of the current infrastructure of the use case partner (e.g., using an RDF store or a relational database).

6.2. Case Study – Telefónica Corporate Portal

Telefónica Investigación y Desarrollo47 (Telefónica I+D for brevity) is the innovation company of the Telefónica Group48. Over the last few years and within the global market, Telefónica I+D has grown to become a network of centres of technological excellence that stretches far beyond the Spanish borders, extending its R+D activities to offices located in Spain (Barcelona, Granada, Huesca, Madrid and Valladolid), Brazil (São Paulo) and Mexico (Mexico D.F.). Currently, Telefónica I+D employs more than 1 200 employees shared out amongst its 7 centres.

The Telefónica I+D intranet corporate portal is a live, 24x7, highly active Web portal used by every single employee at the company. Since its first days in 1998, the intranet corporate portal has become one of the main communication channels available at the company. The portal offers access to all the sources of information and services, including infrastructure management, business management, knowledge management and human resources management information, services and tools. However, as the amount of data has grown, it has become much more difficult to access the right asset at the precise moment when it is needed.

In order to address the above-mentioned problem, it was decided to enrich the corporate portal with automatic, semi-automatic and user-guided semantic annotation capabilities in order to enable enhanced features such as semantic search, faceted navigation and intelligent recommendations. In the following, we describe the requirements analysis along the three classification dimensions that led to the development of a semantic annotation component for the Telefónica portal.

47 http://www.tid.es/en
48 The Telefónica Group is a Spanish multinational group of companies operating in the Information Technology and telecommunication domain. The Group stands in the 5th position in the telecommunication sector worldwide in terms of market capitalization, the 1st as a European integrated operator and the 3rd in the Eurostoxx 50 ranking, composed of the major companies in Europe (June 30th, 2010).

6.2.1. Structural Complexity

The use case requires a mixed complexity in the annotation model. Users are only interested in an easy and quick annotation procedure, and the annotators are already used to simple tagging interfaces.

However, when using the annotations for searching and navigating, users should be able to filter searches by particular attributes (e.g., the author, a date range, etc.). There is also interest in an annotation model that supports relational annotations to link resources together, as well as parts of resources, in order to navigate between them.

The annotation model should thus be able to support tags which are not associated with any particular property of the resource, but also attributes to describe specific metadata, and relations between (and within) resources for navigation purposes.
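The mixed-complexity model described above could be captured in a single structure that covers plain tags, attribute-value pairs, and relations. The following is a minimal sketch only: all class and field names are illustrative assumptions, not part of the Telefónica implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of a mixed-complexity annotation record that can
# express a plain tag, an attribute-value pair, or a relation to another
# resource (or part of one), as required by the use case.
@dataclass
class Annotation:
    resource: str                      # identifier of the annotated resource
    user: str                          # who created the annotation
    tag: Optional[str] = None          # free tag, tied to no particular property
    attribute: Optional[str] = None    # attribute name, e.g. "author"
    value: Optional[str] = None        # attribute value, e.g. "2010-06-30"
    target: Optional[str] = None       # related resource, for relational annotations

# Plain tag annotation
a1 = Annotation(resource="doc:42", user="alice", tag="budget")
# Attribute annotation, usable for faceted filtering
a2 = Annotation(resource="doc:42", user="alice", attribute="author", value="bob")
# Relational annotation linking two resources for navigation
a3 = Annotation(resource="doc:42", user="alice", target="doc:99")
```

In a real system the three forms would more likely be distinct subtypes; a single record with optional fields simply keeps the sketch compact.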

6.2.2. Vocabulary Type

During the interview, users made clear that the annotation should not be limited to a specific vocabulary and that users will enter free-text annotations. However, after further discussion and by looking at the intended usage, it appeared that the free-text annotations will be combined with a KOS of some sort, provided by domain experts. The KOS might take the form of a thesaurus to ease navigation and search, and the users will be ready to align their free-text annotations to the concepts in the KOS.

A set of expert users, with a good knowledge of the domain as well as of the annotation system, will provide the initial contents for the KOS that can be used to bootstrap the recommendation system. This KOS will then be extended by the users when providing free-text annotations.

There is also a strong requirement that the issues of synonymy and polysemy should be resolved to improve search accuracy. This will require either an authority file or a thesaurus KOS. Hence, this use case will require a KOS where users help disambiguate the annotations, mapping free-text tags to the corresponding concept in the KOS when performing the annotation task or searching.
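The disambiguation step described above can be sketched as a lookup from surface forms to concept identifiers, where synonymy means two tags share a concept and polysemy means one tag has several candidate concepts among which the user chooses. The thesaurus entries and concept identifiers below are invented examples, not data from the use case.

```python
from typing import List, Optional

# Toy thesaurus: surface form -> candidate concept identifiers.
# Polysemy: "jaguar" has two senses; synonymy: "car" and "automobile"
# resolve to the same concept.
THESAURUS = {
    "jaguar": ["concept:animal/jaguar", "concept:brand/jaguar-cars"],
    "car": ["concept:vehicle/car"],
    "automobile": ["concept:vehicle/car"],
}

def candidates(tag: str) -> List[str]:
    """Return the candidate concepts for a free-text tag (empty if unknown)."""
    return THESAURUS.get(tag.lower(), [])

def disambiguate(tag: str, user_choice: int = 0) -> Optional[str]:
    """Map a tag to one concept; when polysemous, the user picks the sense."""
    cands = candidates(tag)
    return cands[user_choice] if cands else None

# Synonymy: both tags map to the same concept, so searches for either match.
assert disambiguate("car") == disambiguate("automobile")
```

A real system would rank the candidate senses (e.g., by usage frequency) instead of relying purely on an explicit user choice.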

6.2.3. Collaboration Type

The use case will have many shared resources that every user can access, annotate, and search for. With the exception of some resources where there is a need for access control, all the resources and annotations will be visible to everyone.

Moderators will be controlling the users' annotations to make sure they are correct, and traceability of the annotations is needed for auditing purposes. The collaborative annotation model should thus embed provenance information as well as versioning information.
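The provenance and versioning requirement amounts to keeping, for every revision of an annotation, who wrote it, when, and whether a moderator approved it. The following sketch shows one possible shape for such an audit trail; all names and values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical version record for a single annotation, capturing the
# provenance (author, timestamp) and moderation state of each revision.
@dataclass(frozen=True)
class AnnotationVersion:
    annotation_id: str
    version: int                 # monotonically increasing per annotation
    author: str                  # provenance: who wrote this revision
    timestamp: str               # when it was written (ISO 8601)
    approved_by: Optional[str]   # moderator who validated it, if any

# Every revision is kept, so earlier states remain traceable for auditing.
history = [
    AnnotationVersion("ann:1", 1, "alice", "2010-06-01T10:00:00Z", None),
    AnnotationVersion("ann:1", 2, "alice", "2010-06-02T09:30:00Z", "mod-bob"),
]
latest = max(history, key=lambda v: v.version)
```

Making the records immutable (`frozen=True`) reflects the auditing requirement: a revision is never edited in place, only superseded by a new version.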

In addition, as pointed out previously, the KOS will be built both in a top-down fashion, with a small set of domain experts creating the initial vocabulary, and in a bottom-up fashion, with annotators and moderators extending the existing KOS and annotating resources on their own.

7. Conclusions

This article investigates existing models for representing annotations, and analyses their different characteristics, forms, and functions. Based on this analysis, a classification for annotation models was developed that distinguishes three main dimensions:

– the structural complexity of annotations,
– the type of the vocabularies used,
– the collaboration type supported.

Furthermore, a comprehensive overview of existing annotation approaches, both in research and industry, is given to provide examples of annotation models and their different characteristics. Based on the analysis of the studied approaches and on the proposed classification dimensions, the article discusses some inter- and intra-dimension characteristics of the proposed classification, and offers recommendations for each dimension that can be useful to the designers of semantic annotation systems.

The article further reports on a methodology for using this annotation model classification scheme to elicit requirements for annotation models in end-user applications. This methodology is then illustrated with one concrete use case from which the requirements for its annotation model have been extracted.

Acknowledgements

This work has been partly supported by the INSEMTIVES project (FP7-231181, see http://www.insemtives.eu).

The authors would like to thank Denys Babenko, Tobias Bürger, Borislav Popov, and Biswanath Dutta for their valuable contributions and feedback at the early stages of this work.