
Semantic Web 0 (0) 1
IOS Press

Tversky's feature-based similarity and beyond

Silvia Likavec a, Ilaria Lombardi a,* and Federica Cena a

a Dipartimento di Informatica, Università di Torino, Corso Svizzera 185, 10149 Torino, Italy
E-mail: [email protected], [email protected], [email protected]

Abstract. Similarity is one of the most straightforward ways to relate two objects and guide the human perception of the world. It has an important role in many areas, such as Information Retrieval, Natural Language Processing (NLP), Semantic Web and Recommender Systems. To help applications in these areas achieve satisfying results in finding similar concepts, it is important to simulate human perception of similarity and assess which similarity measure is the most adequate.

In this work we wanted to gain some insights into Tversky's feature-based semantic similarity measure on instances in a specific ontology. We experimented with various variations of this measure, trying to improve its performance. We propose Normalised common-squared Jaccard's similarity as an improvement of Tversky's similarity measure. We also explored the performance of some hierarchy-based approaches and showed that feature-based approaches outperform them on the two specific ontologies we tested. Above all, the combination of feature-based with hierarchy-based approaches shows the best performance on our datasets. We performed two separate evaluations. The first evaluation includes 137 subjects and 25 pairs of concepts in the recipes domain and the second one includes 147 subjects and 30 pairs of concepts in the domain of drinks. To our knowledge these are some of the most extensive evaluations performed in the field.

Keywords: Similarity, properties, feature-based similarity, hierarchy, ontology, instances

1. Introduction

Similarity is one of the main guiding principles which humans use to categorise the objects surrounding them. Even though in psychology the focus is on how people organise and classify objects, in computer science similarity plays a fundamental role in information processing and finds its application in many areas, from Artificial Intelligence to Cognitive Science, from Natural Language Processing to Recommender Systems. Semantic similarity can be employed in many areas, such as text mining, dialogue systems, Web page retrieval, image retrieval from the Web, machine translation, ontology mapping, word-sense disambiguation and item recommendation, to name just a few. Due to the widespread usage and relevance of semantic similarity, its accurate calculation, closely mirroring human judgement, brings improvements in these and many other areas.

*Corresponding author. E-mail: [email protected]

Usually, semantic similarity measures have been tested on WordNet [11]. WordNet is a taxonomic hierarchy of natural language terms developed at Princeton University. The concepts are organised in synsets, which group words with similar meanings. More general terms are found at the top of the underlying hierarchy, whereas more specific terms are found at the bottom. Two important datasets used to test similarity measures on WordNet are the ones proposed by [28] and [20]. These datasets are manually composed and contain rated lists of domain-independent pairs of words.

However, with the diffusion of ontologies as knowledge representation structures in many areas, there is a need to find similar and related objects in the specific domain ontologies used in various applications. In these cases, the features of domain objects play an important role in their description, along with the underlying hierarchy which organises the concepts into more general and more specific ones. Experiments with feature-based and hierarchy-based semantic similarity measures on specific domain ontologies are rare and without conclusive results [23,34]. Hence, we decided to carry out some experiments with Tversky's feature-based similarity measure on two specific domain ontologies, trying to draw some conclusions on best use practices and case-specific characteristics. For our first experiment, we chose the domain of recipes, using the slightly modified Wikitaaable dataset1 used in [7]. For our second experiment, we chose the domain of drinks, using an ontology developed previously by our research team for different purposes. In fact, it is extremely difficult to find a publicly available ontology with defined properties, which is an important requirement for testing our approach. In addition, the domains which can be tested with non-expert users are very limited, since the users should be able to evaluate the similarity of all (or at least most of) the pairs proposed in the test.

Many different similarity measures exist. It is not clear which measure is the most suitable in which situation, and comparative studies are rare [21,29]. Above all, the evaluation of the measures with users is often limited to very few participants. In contrast, our experiments involve 137 and 147 subjects respectively, a significant number of participants compared to other studies.

Thus, we aimed to gain more insight into the behaviour of Tversky's feature-based similarity measure for ontology instances, calculated from the property-value pairs of the compared objects, and to compare its performance to hierarchy-based measures. We experimented with various variations of Tversky's feature-based similarity measure and report our findings here. The same variations could be applied to other feature-based similarity measures as well. We also tried to combine Tversky's similarity measure with hierarchy-based approaches.

Our aim was to avoid any dependency on the weighting parameters which mark the contribution of each feature (also known as relevance factors). These parameters can be tuned for each single domain, but we wanted to test the contribution of each feature with an equal share.

The main contributions of this work are the following:

1 http://wikitaaable.loria.fr/rdf/

1. A concrete proposal on how to improve Tversky's feature-based semantic similarity;

2. An extensive evaluation on two domain-specific ontologies;

3. Considerations on how feature-based and hierarchy-based semantic similarity measures interact and complement each other;

4. Conclusions on feature-based semantic similarity performance.

The paper is organised as follows. In Section 2 we provide the background on the basic concepts we use in our work: conceptual hierarchies, ontologies, properties in OWL and properties of instances. In Section 3 we provide some details about the semantic similarity measures we deal with: feature-based and hierarchy-based semantic similarity measures. For the sake of completeness we also give some background on information content similarity measures, although we will not deal with them in this paper. In Section 4 we report on the semantic similarity measures we worked with and propose some possible improvements of the basic Tversky's similarity measure. We also give details of the variations of each of these similarity measures, which may or may not include hierarchy-based similarity. The results of our extensive evaluation are reported in Section 5, followed by Section 6, which summarises some additional works that exploit feature-based semantic similarity measures. We conclude in Section 7, drawing some conclusions and pointing out some directions for future work.

2. Background on semantic knowledge representation

This section explains the terms and notions used in this work. It provides a background on conceptual hierarchies, as well as on ontologies and on properties in ontologies.

2.1. Conceptual hierarchies

A conceptual hierarchy provides a first step in modelling a specific domain: it provides a taxonomy of concepts organised using the partial-order IS-A relation, which groups the concepts into classes [3,32]. It is usually a tree or a lattice and it specialises more general classes into more specific ones. The IS-A relation is asymmetric and transitive and defines the hierarchical structure of the ontology, enabling the inheritance of characteristics from parent classes to descendant classes. In simple words, a conceptual hierarchy is a basic knowledge structure in which the properties of concepts are not taken into account. Conceptual hierarchies provide structure for a given domain, organise concepts into categories and provide the basis for formulating relations among concepts in an abstract way. This enables easier development, maintenance and reuse. With respect to similarity calculation, it enables the employment of hierarchy-based approaches.

2.2. Ontologies

Ontologies enable explicit specification of domain elements and their properties, hierarchical organisation of domain elements, exact description of any existing relationships between them and the employment of rigorous reasoning mechanisms. In theory, an ontology can be seen as a "formal, explicit specification of a shared conceptualisation" [12]. Ontologies are expressed with standard formats and languages, which allows for extensibility and re-usability. One standard formalism for ontology representation is OWL.2

2 http://www.w3.org/TR/owl-ref

Two layers can be identified in an ontology: the ontology definition layer, which contains the classes describing the concepts in the domain of discourse, and the instance layer, which contains all the concrete, distinct individuals in the domain. At the basis of the ontology is the conceptual hierarchy described above, which provides its structure. Properties are defined at the class level.

2.3. Properties in OWL

When using ontologies expressed in OWL, the features of domain elements are described using properties. Properties provide details about classes in the ontology and their values describe the properties of instances. In OWL, two kinds of properties exist:

(i) object properties, linking individuals among themselves;

(ii) datatype properties, linking individuals and datatype values.3

3 In OWL, there is also the notion of annotation property (owl:AnnotationProperty) and ontology property (owl:OntologyProperty), used in OWL DL.

Object properties and datatype properties are defined as instances of the built-in OWL classes owl:ObjectProperty and owl:DatatypeProperty, respectively. Both are subclasses of the RDF class rdf:Property. In this work, we only considered object properties, since the treatment of datatype properties (such as literal values) is more complex and requires further semantic analysis.

The characteristics of a property are defined with a property axiom, which in its basic form asserts the existence of the property. The two most commonly used constructs in a property axiom are the ones defining the property's domain and range. rdfs:domain links a property to a class description and asserts that the resources having this property must belong to the class extension of the indicated class description. Multiple rdfs:domain axioms are allowed and should be interpreted as stating that the domain of the property is the intersection of all domains. rdfs:range links a property to either a class description or a data range and asserts that the values of this property must belong to the class extension of the class description or to data values in the specified data range. Multiple rdfs:range axioms are allowed and should be interpreted as stating that the range of the property is the intersection of all ranges.

For example:

<owl:ObjectProperty rdf:ID="has_ingredient">
  <rdfs:domain rdf:resource="#Recipe"/>
  <rdfs:range rdf:resource="#Food"/>
</owl:ObjectProperty>

defines a property has_ingredient which ties the elements of the Recipe class to the elements of the Food class.

It is possible to build hierarchies of properties with the rdfs:subPropertyOf axiom. Subproperty axioms can be applied to both datatype properties and object properties.

Equivalent properties are defined with owl:equivalentProperty.

Apart from explicitly defining properties for classes, properties can also be used to define classes via property restrictions.

2.4. Instances

Instances in the ontology (also called individuals) give life to classes, describing concrete individuals. They are defined with individual axioms, which provide their class memberships (the property rdf:type), individual identities and values for their properties. All the properties of instances are inherited from the classes the instances belong to. A specific value is associated to each property, and some properties can have more than one value.


Let us see an example of an instance of a recipe class, Asparagus_Parmigiana, in RDF/XML syntax:

<Recipe rdf:ID="Asparagus_Parmigiana">
  <rdf:type rdf:resource="#Vegetable_Dish"/>
  <has_ingredient rdf:resource="#Asparagus"/>
  <has_ingredient rdf:resource="#Parmesan_Cheese"/>
  <has_ingredient rdf:resource="#Butter"/>
  <suitable_for_diet rdf:resource="#Vegetarian_diet"/>
  <not_suitable_for_diet rdf:resource="#Low_cholesterol_diet"/>
  <can_be_eaten_as rdf:resource="#Side_Dish"/>
</Recipe>

defines a recipe Asparagus_Parmigiana which is of type Vegetable_Dish, has Asparagus, Parmesan_Cheese and Butter as ingredients, is suitable for Vegetarian_diet, not suitable for Low_cholesterol_diet, and can be eaten as Side_Dish.

Properties of instances enable the employment of feature-based similarity measures on ontology instances.
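To ground this, the following minimal Python sketch (using the rdflib library; the file name and instance URI are hypothetical) collects the object-property values of an instance as a set of (property, value) pairs, the representation that the feature-based measures in the next section operate on:

from rdflib import Graph, URIRef, Literal

g = Graph()
g.parse("recipes.owl", format="xml")  # hypothetical ontology file

# Hypothetical URI of the instance shown above.
recipe = URIRef("http://example.org/recipes#Asparagus_Parmigiana")

# Feature set of the instance: all (property, value) pairs, restricted
# to object properties (literal-valued pairs are skipped, as in the paper).
features = {(p, o) for p, o in g.predicate_objects(recipe)
            if not isinstance(o, Literal)}

for prop, value in features:
    print(prop, value)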

3. Semantic similarity measures

In this section we give some details of the three main categories of semantic similarity measures, namely feature-based, hierarchy-based and information content-based measures. As a result of trying to improve and combine these approaches, many hybrid similarity measures were born. In our experiments we only dealt with Tversky's feature-based measure, its improvements and its combination with hierarchy-based measures. We include information content-based measures for the sake of completeness only.

3.1. Feature-based similarity measures

Calculation of similarity based on properties goes back to Tversky's work on features of similarity [31], where the similarity between two objects O1 and O2 is a function of their common and distinctive features:

\[
\mathrm{sim}_T(O_1, O_2) = \frac{\alpha\,(\psi(O_1) \cap \psi(O_2))}{\beta\,(\psi(O_1) \setminus \psi(O_2)) + \gamma\,(\psi(O_2) \setminus \psi(O_1)) + \alpha\,(\psi(O_1) \cap \psi(O_2))} \tag{1}
\]

In the formula above, ψ(O) describes all the relevant features of the object O, and α, β, γ ∈ R are constants which allow for different treatment of the various components. For α = 1, the common features of the two objects have maximal importance. For β = γ, a non-directional similarity measure is obtained. Depending on the values of α, β and γ, we obtain the following variations of Tversky's similarity:

– Jaccard's or Tanimoto similarity for α = β = γ = 1;
– Dice's or Sørensen's similarity for α = 1 and β = γ = 0.5.

In our setting, we will be using the following notation:

– common features of O1 and O2: cf(O1, O2) = ψ(O1) ∩ ψ(O2),
– distinctive features of O1: df(O1) = ψ(O1) \ ψ(O2), and
– distinctive features of O2: df(O2) = ψ(O2) \ ψ(O1).

In order to calculate the above similarities for domain objects O1 and O2, we need to calculate, for each property p they have in common, how much it contributes to the common features of O1 and O2, the distinctive features of O1 and the distinctive features of O2, respectively. We denote these values by cf_p, df^1_p and df^2_p.

Hence, we have to compare the property-value pairs between instances for each property they have in common. We include in the common features the cases when the two objects have the same value for the given property p. We include in the distinctive features of each object the cases when the two objects have different values for the given property p.

We consider equal the properties defined with owl:equivalentProperty. Repeating the above process for each property O1 and O2 have in common, we obtain all the common and distinctive features of O1 and O2:

\[
\mathrm{cf}(O_1, O_2) = \sum_{i=1}^{n} \mathrm{cf}_{p_i}, \qquad
\mathrm{df}(O_1) = \sum_{i=1}^{n} \mathrm{df}^1_{p_i}, \qquad
\mathrm{df}(O_2) = \sum_{i=1}^{n} \mathrm{df}^2_{p_i},
\]

where n is the number of common properties defined for O1 and O2. Then the above similarity measures become:

Jaccard's similarity:

\[
\mathrm{sim}_J(O_1, O_2) = \frac{\mathrm{cf}(O_1, O_2)}{\mathrm{df}(O_1) + \mathrm{df}(O_2) + \mathrm{cf}(O_1, O_2)} \tag{2}
\]


Dice's similarity:

\[
\mathrm{sim}_D(O_1, O_2) = \frac{2\,\mathrm{cf}(O_1, O_2)}{\mathrm{df}(O_1) + \mathrm{df}(O_2) + 2\,\mathrm{cf}(O_1, O_2)} \tag{3}
\]

3.1.1. Mathematical properties
Here we provide the reader with some basic mathematical properties of Jaccard's similarity measure which we will deal with in the rest of the paper.

1. Boundaries: for all O1, O2, 0 ≤ sim_J(O1, O2) ≤ 1.

2. Maximal similarity: if O1 ≡ O2, then sim_J(O1, O2) = 1.

3. Commutativity: for all O1, O2, sim_J(O1, O2) = sim_J(O2, O1).

4. Monotonicity: if cf(O1, O2) ≤ cf(O1, O3) and df(O1) = df(O2), then sim_J(O1, O2) ≤ sim_J(O1, O3); if cf(O1, O2) ≤ cf(O1, O3) and df(O2) = df(O3), then sim_J(O1, O3) ≤ sim_J(O2, O3).

3.2. Hierarchy-based similarity measures

Hierarchy-based or distance-based similarity measures use the underlying conceptual hierarchy directly and calculate the distance between concepts by counting the number of edges or the number of nodes which have to be traversed in order to reach one concept from another. The hierarchy-based measure was first introduced in [24] as a simple shortest path connecting the compared concepts, and it was the basis for the development of many measures of semantic similarity. A discussion and comparison with information content-based approaches can be found in [5]. In this section we give more details about the three hierarchy-based measures we used in our experiments: Leacock and Chodorow's measure [15], Wu and Palmer's measure [33] and Li's measure [16].

3.2.1. Leacock and Chodorow's similarity measure
The first measure we will look at is Leacock and Chodorow's similarity measure [15], which was originally used for word sense disambiguation in a local context classifier. In order to calculate the distances between words, the authors use the normalised path length in WordNet [11] between all the senses of the compared words and measure the path length in nodes:

\[
\mathrm{sim}_{LC}(a, b) = -\log\left(\frac{N_p}{2 \times \mathit{max}}\right). \tag{4}
\]

N_p is the number of nodes in the path p from a to b, whereas max is the maximum depth of the hierarchy. The distance between two words belonging to the same synset is 1.

If we want to express this measure as a function of distances between nodes, we obtain the following formula:

\[
\mathrm{sim}_{LC}(a, b) = -\log\left(\frac{\mathit{dist}(a, b)}{2 \times \mathit{max}}\right). \tag{5}
\]

dist(a, b) is the distance between a and b, calculated as the shortest path length between these two nodes.

The disadvantage of this similarity measure is that many pairs of non-similar words are estimated as similar, due to the equal edge lengths in the hierarchy.
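A one-line Python sketch of Equation (5), assuming the shortest-path distance between the two nodes and the maximum depth of the hierarchy are computed elsewhere:

import math

def sim_lc(dist_ab, max_depth):
    # Leacock and Chodorow, Equation (5); dist_ab is assumed >= 1.
    return -math.log(dist_ab / (2 * max_depth))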

3.2.2. Wu and Palmer's similarity measure
Wu and Palmer's similarity measure [33] is based on the depths in the hierarchy of the two words being compared and on the depth of their common subsumer, which characterises their commonalities. If we denote by c the first subsuming node for the two compared nodes a and b, their similarity is calculated as follows:

\[
\mathrm{sim}_{WP}(a, b) = \frac{2 N_c}{N_a + N_b}. \tag{6}
\]

N_n is the number of nodes along the path from the node n to the root.

This measure can also be expressed as a function of distances between nodes as follows:

\[
\mathrm{sim}_{WP}(a, b) = \frac{2\,\mathit{dist}(c)}{\mathit{dist}(a) + \mathit{dist}(b)}. \tag{7}
\]

dist(n) is the depth of n, i.e., its distance from the root, again calculated as the shortest path length between the two nodes.
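Equation (7) is equally direct once node depths are known (a sketch; computing depths over the hierarchy is assumed elsewhere):

def sim_wp(depth_a, depth_b, depth_c):
    # Wu and Palmer, Equation (7): c is the first common subsumer of
    # a and b; each depth is the shortest path length from the root.
    return 2 * depth_c / (depth_a + depth_b)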

3.2.3. Li's similarity measure
Li et al. [16] proposed an approach for calculating the similarity between sentences which uses semantic information and word order information in the sentence. The similarity of individual words is calculated using the shortest path length dist between the words and the depth h of their common subsumer, as follows:

\[
\mathrm{sim}_{L}(a, b) = e^{-\alpha\,\mathit{dist}(a,b)} \cdot \frac{e^{\beta h} - e^{-\beta h}}{e^{\beta h} + e^{-\beta h}}, \tag{8}
\]

where α ∈ [0, 1] and β ∈ (0, 1] are parameters which control the contribution of the shortest path length and the depth, respectively. Their optimal values depend on the knowledge base used; for WordNet they are α = 0.2 and β = 0.45. If the words a and b belong to the same synset, then dist(a, b) = 0; if they do not belong to the same synset but the synsets for a and b contain one or more common words, then dist(a, b) = 1; in the remaining cases the actual path length between a and b is calculated.
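Equation (8) can be written compactly by noting that its second factor is the hyperbolic tangent of βh (a sketch; the defaults are the WordNet-tuned parameters quoted above):

import math

def sim_li(dist_ab, h, alpha=0.2, beta=0.45):
    # Li et al., Equation (8): exponential decay in the path length,
    # scaled by tanh(beta * h), where h is the depth of the common subsumer.
    return math.exp(-alpha * dist_ab) * math.tanh(beta * h)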

Since they only depend on the underlying hierarchy of the domain ontology, the hierarchy-based similarity measures are fairly simple and have a low computational cost. The known problem with hierarchy-based similarity measures is that all the edges in the hierarchy are considered to be of the same length, so many similarity values are not correct. The accuracy of these measures has been surpassed by more complex approaches which exploit additional semantic information.

3.3. Information content-based similarity measures

The measures seen above work on single knowledge structures and do not need external sources for similarity calculation. In this section we present the most important approach which uses an external corpus to compute similarity. The foundational work on information content-based similarity is due to Resnik [25,26]. His approach is based on the fact that more abstract classes provide less information content, whereas more concrete and detailed classes lower down in the hierarchy are more informative. The closest class that subsumes two compared classes, called the most informative subsumer, is the class which provides the shared information for both and measures their similarity. In order to calculate the information content of a concept in an IS-A taxonomy, Resnik turns to an external text corpus and calculates the probability of occurrence of the class in this corpus as its relative frequency (each word in the text corpus is counted as an occurrence of each class that contains it). The information content of a class in a taxonomy is given by the negative logarithm of the probability of occurrence of the class in a text corpus, as follows:

\[
\mathrm{sim}_{R}(a, b) = \max_{c \in S(a,b)} \left[-\log p(c)\right], \tag{9}
\]

where p(c) is the probability of encountering an instance of concept c, and S(a, b) is the set of all concepts that subsume both a and b. According to Resnik, this approach performs better than hierarchy-based approaches, using human similarity judgements as a benchmark.
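A sketch of Equation (9), assuming the subsumer sets and the corpus-derived concept probabilities are precomputed (both helper names are ours):

import math

def sim_resnik(a, b, subsumers, prob):
    # Resnik, Equation (9): the information content -log p(c) of the
    # most informative subsumer. subsumers(a, b) yields the concepts
    # subsuming both a and b; prob(c) is the corpus probability of c.
    return max(-math.log(prob(c)) for c in subsumers(a, b))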

The problem with Resnik's similarity measure is the excessive information content value of many polysemous words (i.e. word senses are not taken into account) and multi-worded synsets. Also, the information content values are not calculated for individual words but for synsets, hence the synsets containing commonly occurring ambiguous words would have excessive information content values. To deal with the problem of the excessive information content value of many polysemous words, Resnik proposes weighted word similarity, which takes into account all senses of the words being compared.

3.4. Discussion

The main problem with information content-based similarity measures is their dependence on external corpora for the calculation of term frequencies. This requires disambiguation and annotation of terms in the corpus, very often done manually, hence affecting the applicability of this approach to large corpora. Also, with a change of the corpus or the ontology, the term frequencies have to be recalculated. One step towards solving this problem was the introduction of the intrinsic computation of information content [30], as we will see in Section 6. The methods based on intrinsic information content outperform the corpus-relying approaches.

In this work, we particularly focus on feature-based and hierarchy-based similarity measures, since they do not require external knowledge sources for their application; rather, they rely solely on the domain ontology. Feature-based similarity measures evaluate both common and distinctive features of compared objects, hence exploiting more semantic information than hierarchy-based approaches. But this information has to be available, which is not always the case; otherwise the applicability and accuracy of these measures is hindered. Feature-based similarity measures can also be applied in cross-ontology settings.

4. Variations of Tversky's feature-based similarity measure

In this section we present the variations of Tversky's, or more precisely, Jaccard's similarity measure from Section 3.1 which we evaluated in our experiments. We experimented with 6 basic modifications of Jaccard's similarity (see Equation (2)). For each of these 7 measures (the original plus the 6 modifications), we present 3 further variations, which aim to take the underlying hierarchical structure into account in different ways. We actually performed the same calculations with Dice's measure (see Equation (3)) as well, but the results were almost always worse, so we will not consider Dice's measure any further.

4.1. Basic modifications of Tversky's similarity measure

The first modification of Tversky's, more precisely of Jaccard's measure, called Squared Jaccard's similarity, was to consider the squares of all the values, as in the following formula:

\[
\mathrm{sim}_{SQ}(O_1, O_2) = \frac{\mathrm{cf}^2(O_1, O_2)}{\mathrm{df}^2(O_1) + \mathrm{df}^2(O_2) + \mathrm{cf}^2(O_1, O_2)} \tag{10}
\]

The second modification of Jaccard's measure, called Common-squared Jaccard's similarity, was to consider the squares of only the values the two objects have in common, giving more importance to common features. The following formula illustrates this measure:

\[
\mathrm{sim}_{CSQ}(O_1, O_2) = \frac{\mathrm{cf}^2(O_1, O_2)}{\mathrm{df}(O_1) + \mathrm{df}(O_2) + \mathrm{cf}^2(O_1, O_2)} \tag{11}
\]

Since these two measures did not provide us with satisfying results in the evaluation, we tried the following two modifications, Normalised Jaccard's similarity and Normalised common-squared Jaccard's similarity, given by the following formulas (abbreviating cf(O1, O2) to cf and df(Oi) to df_i):

\[
\mathrm{sim}_{N}(O_1, O_2) = \frac{\dfrac{\mathrm{cf}}{(\mathrm{cf} + \mathrm{df}_1)(\mathrm{cf} + \mathrm{df}_2)}}{\dfrac{\mathrm{df}_1}{\mathrm{cf} + \mathrm{df}_1} + \dfrac{\mathrm{df}_2}{\mathrm{cf} + \mathrm{df}_2} + \dfrac{\mathrm{cf}}{(\mathrm{cf} + \mathrm{df}_1)(\mathrm{cf} + \mathrm{df}_2)}} \tag{12}
\]

\[
\mathrm{sim}_{NCSQ}(O_1, O_2) = \frac{\dfrac{\mathrm{cf}^2}{(\mathrm{cf} + \mathrm{df}_1)(\mathrm{cf} + \mathrm{df}_2)}}{\dfrac{\mathrm{df}_1}{\mathrm{cf} + \mathrm{df}_1} + \dfrac{\mathrm{df}_2}{\mathrm{cf} + \mathrm{df}_2} + \dfrac{\mathrm{cf}^2}{(\mathrm{cf} + \mathrm{df}_1)(\mathrm{cf} + \mathrm{df}_2)}} \tag{13}
\]

The Normalised common-squared Jaccard's similarity measure for ontological objects was first introduced in [6] and was further developed in [17] as a way to calculate similarity between shapes, and in [18] for improving recommendation accuracy and diversity.
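Since these modifications differ from plain Jaccard only in how cf and df enter the formula, they share one scaffold. A sketch over the counts produced by the cf_df helper from the sketch in Section 3.1 (the zero guards are ours):

def sim_sq(cf, df1, df2):
    # Squared Jaccard, Equation (10).
    total = df1**2 + df2**2 + cf**2
    return cf**2 / total if total else 0.0

def sim_csq(cf, df1, df2):
    # Common-squared Jaccard, Equation (11).
    total = df1 + df2 + cf**2
    return cf**2 / total if total else 0.0

def sim_n(cf, df1, df2):
    # Normalised Jaccard, Equation (12).
    if cf == 0:
        return 0.0
    common = cf / ((cf + df1) * (cf + df2))
    return common / (df1 / (cf + df1) + df2 / (cf + df2) + common)

def sim_ncsq(cf, df1, df2):
    # Normalised common-squared Jaccard, Equation (13).
    if cf == 0:
        return 0.0
    common = cf**2 / ((cf + df1) * (cf + df2))
    return common / (df1 / (cf + df1) + df2 / (cf + df2) + common)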

We also tried to convert Li's similarity formula [16] into a feature-based formula. Since the similarity measure increases with the increasing number of common features, the common features can be taken as an argument of the sigmoid function. Furthermore, the similarity values should decrease with the increasing number of distinctive features, hence the distinctive features should be an argument of the negative sigmoid function translated by 1. So we obtained the following function:

\[
\mathrm{sim}_{S0}(O_1, O_2) = \frac{e^{\mathrm{cf}(O_1,O_2)} - 1}{e^{\mathrm{cf}(O_1,O_2)} + 1} \left(1 - \frac{e^{\mathrm{df}(O_1)+\mathrm{df}(O_2)} - 1}{e^{\mathrm{df}(O_1)+\mathrm{df}(O_2)} + 1}\right) = \frac{2\,(e^{\mathrm{cf}(O_1,O_2)} - 1)}{(e^{\mathrm{cf}(O_1,O_2)} + 1)(e^{\mathrm{df}(O_1)+\mathrm{df}(O_2)} + 1)} \tag{14}
\]

This is similar to just taking the distinctive features as an argument of the inverse exponential function, since these two functions have similar graphs and behaviour for arguments greater than 0. But the exponential in the denominator increases very fast, so the similarity values were getting extremely small very quickly. Hence we decided to leave just the distinctive features in the denominator. Adding 1 prevents division by zero when there are no distinctive features between the compared objects. Also, we need to divide by 2 so that the final result is in the interval [0, 1). So the final Sigmoid similarity has the following formula:

\[
\mathrm{sim}_{S}(O_1, O_2) = \frac{e^{\mathrm{cf}(O_1,O_2)} - 1}{(e^{\mathrm{cf}(O_1,O_2)} + 1)(\mathrm{df}(O_1) + \mathrm{df}(O_2) + 1)} \tag{15}
\]

Finally, the last measure we considered uses the sigmoid function of Jaccard's similarity, as follows:

\[
\mathrm{sim}_{JS}(O_1, O_2) = \frac{e^{\mathrm{sim}_J} - 1}{e^{\mathrm{sim}_J} + 1} \tag{16}
\]

We call this measure Sigmoid Jaccard's similarity.
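For completeness, the two sigmoid-based measures in the same style (a sketch):

import math

def sim_sigmoid(cf, df1, df2):
    # Sigmoid similarity, Equation (15).
    return (math.exp(cf) - 1) / ((math.exp(cf) + 1) * (df1 + df2 + 1))

def sim_sigmoid_jaccard(sim_j):
    # Sigmoid Jaccard, Equation (16), applied to a Jaccard value in [0, 1].
    return (math.exp(sim_j) - 1) / (math.exp(sim_j) + 1)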

4.1.1. Mathematical properties
It is straightforward that the previously introduced mathematical properties (boundaries, maximal similarity, commutativity and monotonicity) are preserved for the sim_SQ, sim_CSQ, sim_N and sim_NCSQ measures, since they are simple modifications of the original sim_J measure.

Let us first consider the sim_JS measure, since it is simpler than the sim_S measure. The values of sim_JS belong to [0, (e − 1)/(e + 1)), since the values of the sigmoid function belong to [0, 1) for positive arguments. Hence, its maximum value is (e − 1)/(e + 1), attained when sim_J = 1. The sigmoid function is monotone, so monotonicity is preserved. Commutativity is straightforward.

As far as the sim_S measure is concerned, its values belong to [0, 1) and the maximal value is obtained when the compared objects do not have any distinctive features. Commutativity and monotonicity are again straightforward.

Next, we will see how we tried to include the hierarchical information into these measures.

4.2. Feature-based similarity combined with hierarchical information

The basic question we wanted to answer with these experiments was how much the actual hierarchical information contributes to the similarity of two objects, and if this information can be simulated by properties.

For example, in the recipe domain, all the recipes could be instances of the class Recipe, or there could exist an underlying hierarchy of recipe types, with each recipe an instance of its corresponding recipe type. In our case, instead of having all the recipes as instances of DishType, we made the recipes instances of various dish types, such as BreadDish, CakeDish, PastaDish etc. This choice has the following consequences:

– instead of having the same value of the property rdf:type for all the dishes, we can distinguish them according to the various values of the property rdf:type;

– in case we want to take the hierarchical information into account, we can calculate the distance between various dishes, rather than assume that they all have the same similarity because they all have the same parent. Since we deal with the instances in the ontology, we decided to consider them "descendants" of their classes, otherwise the instances of one class would all be equal.

Having these considerations in mind, we decided to distinguish the following three variations of each of the measures described above:

V1: the measures without counting the property rdf:type, to see how much this information effectively contributes to the overall similarity among objects (measures sim_Jnt, sim_SQnt, sim_CSQnt, sim_Nnt, sim_NCSQnt, sim_Snt and sim_JSnt);

V2: the measures without counting the property rdf:type but including the hierarchical information in the feature-based similarity formula. In this case, since a greater distance between two objects means that they are less similar, we decided that the distance between objects counts as their distinctive feature. In the following formulas, dist(O1, O2) is the number of edges along the path connecting O1 and O2, and max is the maximum depth of the class hierarchy (a code sketch of this variation follows after this list).

\[
\mathrm{sim}_{Jnth}(O_1, O_2) = \frac{\mathrm{cf}(O_1, O_2)}{\mathrm{df}(O_1) + \mathrm{df}(O_2) + \mathrm{cf}(O_1, O_2) + \dfrac{\mathit{dist}(O_1, O_2)}{2\,\mathit{max}}} \tag{17}
\]

\[
\mathrm{sim}_{SQnth}(O_1, O_2) = \frac{\mathrm{cf}^2(O_1, O_2)}{\mathrm{df}^2(O_1) + \mathrm{df}^2(O_2) + \mathrm{cf}^2(O_1, O_2) + \dfrac{\mathit{dist}^2(O_1, O_2)}{(2\,\mathit{max})^2}} \tag{18}
\]

The equations for sim_CSQnth, sim_Nnth and sim_NCSQnth are analogous to the equations above. The ones worth writing out explicitly are sim_Snth and sim_JSnth:

\[
\mathrm{sim}_{Snth}(O_1, O_2) = \frac{e^{\mathrm{cf}(O_1,O_2)} - 1}{(e^{\mathrm{cf}(O_1,O_2)} + 1)\left(\mathrm{df}(O_1) + \mathrm{df}(O_2) + 1 + \dfrac{\mathit{dist}(O_1, O_2)}{2\,\mathit{max}}\right)} \tag{19}
\]

\[
\mathrm{sim}_{JSnth}(O_1, O_2) = \frac{e^{\mathrm{sim}_{Jnth}} - 1}{e^{\mathrm{sim}_{Jnth}} + 1} \tag{20}
\]

V3: the measures counting the property rdf:type and including the hierarchical information in the feature-based similarity formula. This approach counts the hierarchical information twice, but in two subtle ways: including the property rdf:type takes into account all the objects which are of the same recipe type, whereas including the hierarchical information accounts for the distance between the compared objects. These measures (sim_Jh, sim_SQh, sim_CSQh, sim_Nh, sim_NCSQh, sim_Sh and sim_JSh) showed the best performance results, as we will see in Section 5.
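As referenced in V2 above, the hierarchical term enters the formula as one more distinctive feature. A minimal sketch of Equation (17), assuming dist and the maximum depth are computed from the class hierarchy:

def sim_jnth(cf, df1, df2, dist, max_depth):
    # Jaccard combined with hierarchy, Equation (17): the normalised
    # distance between the instances' classes is added to the
    # distinctive features in the denominator.
    total = df1 + df2 + cf + dist / (2 * max_depth)
    return cf / total if total else 0.0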

5. Evaluation

The most commonly used datasets to test similarity measures are the ones proposed by [28] and [20]. Rubenstein and Goodenough's experiment dates back to 1965. They asked 51 native English speakers to assess the similarity of 65 English noun pairs on a scale from 0 (semantically unrelated) to 4 (highly related). Miller and Charles' experiment in 1991 considered a subset of 30 noun pairs, whose similarity was reassessed by 38 undergraduate students. The correlation w.r.t. the Rubenstein and Goodenough results was 0.97. [26] repeated the same experiment in 1995 with 10 subjects. The correlation w.r.t. the Miller and Charles results was 0.96. Finally, [22] repeated the experiments in 2009 with 101 participants, both English and non-English native speakers. His average correlation w.r.t. Rubenstein and Goodenough was 0.97, and 0.95 w.r.t. Miller and Charles. We can see that similarity judgements by various groups of people over a long period of time remain stable.

All these experiments dealt with common English nouns, and the correlation with these results was mostly used to test similarity measures on WordNet [11]. But our focus is different. We wanted to experiment with similarity measures on specific domain ontologies, which represent more complex entities. We needed a data representation where the features of domain objects are explicitly provided, which is not the case with WordNet.

Our experiment was conducted with the goal of evaluating the feature-based similarity of instances in its various forms and comparing it with hierarchy-based approaches. In our first experiment we evaluated our approach in the domain of recipes, using the slightly modified Wikitaaable dataset4 used in [7]. In our second experiment we evaluated our approach in the domain of drinks, using an ontology developed previously by our research team. We assumed that both of these datasets were suitable for a wide range of people, and that participants would not need expert knowledge to assess the similarity of the proposed domain items.

5.1. Hypotheses

We wanted to verify that:

H1: Tversky's feature-based similarity measure applied to ontology instances has a good correlation with human judgement;

H2: Tversky's feature-based similarity measure shows better performance than hierarchy-based approaches;

H3: it is possible to improve Tversky's feature-based similarity;

H4: hierarchical information, when encoded with features, could play an important role in assessing similarity among instances;

H5: combining the hierarchy-based and feature-based approaches beyond a linear combination provides good correlation with human judgement.

5.2. Subjects

A total of 137 people took part in the first test and 147 took part in the second test. They were selected among the authors' colleagues and among second- and third-year undergraduate students at the Faculty of Philosophy of the University of Torino, Italy. The participants were recruited according to an availability sampling strategy.5 All the participants were native Italian speakers.

5.3. Materials

We designed two experiments to test our hypotheses.

In the first experiment, we designed a web questionnaire with 25 pairs of recipes, chosen from the Wikitaaable dataset, covering a range of recipes expected to be judged from very similar to not similar at all. The original recipes from the dataset were translated into Italian. The original dataset contains 1666 recipes. The properties defined for the recipes are the following: rdf:type, can_be_eaten_as, has_ingredient, suitable_for_diet, not_suitable_for_diet and has_origin. This dataset provided us with a good setting to test our approach. We only made the following slight modification: in the original ontology all the recipes were instances of Recipe, whereas we needed the hierarchical structure of the ontology to test hierarchy-based similarity measures, as well as to incorporate the hierarchical information into our measures; hence we made the recipes instances of various Dish_Type subclasses. In the original ontology rdf:type is a property, whereas in our case we once used it as a property and once as the underlying hierarchical information. Figure 1 shows the basic Wikitaaable taxonomy of recipes, including only the top categories and the ones whose instances we used to test our approach. The properties are omitted for clarity.

4 http://wikitaaable.loria.fr/rdf/
5 Although a more suitable way to obtain a representative sample is random sampling, this strategy is time consuming and not financially justifiable. Hence, much research is based on samples obtained through non-random selection, such as availability sampling, i.e. a sampling of convenience based on subjects available to the researcher, often used when the population source is not completely defined.

The main dish types are: RollDish, CrepeDish, BakedGoodDish, BeverageDish, SoupAndStuffDish, PancakeDish, MousseDish, SweetDish, SaladDish, SaltedDish, SauceDish, WaffleDish, PreserveDish and SpecialDietDish. The recipes we used in our evaluation did not belong to all recipe groups, since the complexity of the test would have been too high and we would have obtained random answers from the users.

In the second experiment, we again used a web questionnaire, this time containing 30 pairs of drinks chosen from the ontology describing drinks. The original ontology has 148 classes, with the main drink classes being: Water, AlcoholicBeverage, DrinkInACup, PlantOriginDrink and SoftDrink. The properties defined for the drinks are the following: has_alcoholic_content, has_caloric_content, has_ingredient, is_sparkling, is_suitable_for and rdf:type. Figure 2 shows the basic taxonomy of drinks, again only including the main categories and omitting the properties.

In each experiment the participants were asked to assess the similarity of these 25 pairs (respectively 30 pairs) by anonymously assigning each pair a similarity value on a scale from 1 to 10 (1 meaning not similar at all, 10 meaning very similar). The ordering of pairs was random. The users' similarity values were then mapped to the [0, 1] interval to make them match the similarity values produced by the various algorithms. The datasets used are available from the authors upon request for academic purposes.

In each experiment, the participant group P was divided into two groups: P1 was used as a reference group and P2 was used as a control group to measure the correlation of judgements among human subjects, as in [25].

In this paper we considered the following feature-based similarity measures:

1. Tversky's similarity with α = β = γ = 1, also known as Jaccard's or Tanimoto similarity, sim_J;
2. Squared Jaccard's similarity sim_SQ;
3. Common-squared Jaccard's similarity sim_CSQ;
4. Normalised Jaccard's similarity sim_N;
5. Normalised common-squared Jaccard's similarity sim_NCSQ;
6. Sigmoid similarity sim_S;
7. Sigmoid Jaccard's similarity sim_JS.

For the comparison, we considered the following edge-based similarity measures:

1. Leacock and Chodorow's similarity sim_LC;
2. Wu and Palmer's similarity sim_WP;
3. Li's similarity sim_L.

For each of the feature-based measures, we considered the following variants:

V1. the measure without the rdf:type property (measures sim_Jnt, sim_SQnt, sim_CSQnt, sim_Nnt, sim_NCSQnt, sim_Snt, sim_JSnt);

V2. the measure without the rdf:type property, combined with the underlying hierarchy (measures sim_Jnth, sim_SQnth, sim_CSQnth, sim_Nnth, sim_NCSQnth, sim_Snth, sim_JSnth);

V3. the measure with the rdf:type property, combined with the underlying hierarchy (measures sim_Jh, sim_SQh, sim_CSQh, sim_Nh, sim_NCSQh, sim_Sh, sim_JSh).

For all the feature-based measures we also tested Dice's similarity, but in almost all cases it showed worse performance than Jaccard's similarity, so we will not report those results here.

[Fig. 1. Recipe taxonomy]

[Fig. 2. Drinks taxonomy]

5.4. Measures

We used the Pearson product-moment correlation coefficient r to measure the accuracy of similarity judgement.

The Pearson product-moment correlation coefficient measures the correlation (i.e. the linear association) between two sets of values. For two sets of values X = {x1, . . . , xn} and Y = {y1, . . . , yn}, the Pearson coefficient is calculated as follows:

\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\]

where \bar{x} and \bar{y} are the sample means of the two sets of values X and Y. The value of the Pearson coefficient ranges from +1 (indicating a strong positive correlation) to −1 (indicating a strong negative correlation); a value of 0 means there is no correlation.

5.5. Results and discussion

For the first experiment (Recipes), Table 1 reports the Pearson correlation between the reference group P1 and the control group P2, as well as the Pearson correlation between the participant group P and the hierarchy-based approaches. Tables 2, 3, 4 and 5 report the Pearson correlation between each of the above measures and the participant group P, for the base case and for the 3 variations introduced in Section 4. In Tables 2-5 and 7-10, all the results which are better than the basic Jaccard's similarity sim_J are marked with an asterisk (*).

Table 1. Recipes - P1 - P2 and edge-based

MEASURE                          r
P1 - P2                          0.9714
Leacock and Chodorow             0.6033
Wu and Palmer                    0.6304
Li et al.                        0.6270

Table 2. Recipes - base case

MEASURE                          r
Jaccard                          0.7461
Sq. Jaccard                      0.7692 *
Comm-sq. Jaccard                 0.6059
Norm. Jaccard                    0.7640 *
Norm. comm-sq. Jacc.             0.7654 *
Sigmoid                          0.7583 *
Sigmoid Jaccard                  0.7442

Table 3. Recipes - base case without rdf:type

MEASURE                          r
Jaccard no t.                    0.6826
Sq. Jaccard no t.                0.7002
Comm-sq. Jaccard no t.           0.5408
Norm. Jaccard no t.              0.7129
Norm. comm-sq. Jacc. no t.       0.7083
Sigmoid no t.                    0.7067
Sigmoid Jaccard no t.            0.6804

Table 4. Recipes - base case with hierarchy

MEASURE                          r
Jaccard + h.                     0.7477 *
Sq. Jaccard + h.                 0.7693 *
Comm-sq. Jaccard + h.            0.6095
Norm. Jaccard + h.               0.7628 *
Norm. comm-sq. Jacc. + h.        0.7674 *
Sigmoid + h.                     0.7606 *
Sigmoid Jaccard + h.             0.7459

Table 5. Recipes - base case without rdf:type, with hierarchy

MEASURE                          r
Jaccard no t. + h.               0.6875
Sq. Jaccard no t. + h.           0.7005
Comm-sq. Jaccard no t. + h.      0.5464
Norm. Jaccard no t. + h.         0.7317
Norm. comm-sq. Jacc. no t. + h.  0.7299
Sigmoid no t. + h.               0.7140
Sigmoid Jaccard no t. + h.       0.6854

For the second experiment (Drinks), Table 6 reports the Pearson correlation between the reference group P1 and the control group P2, as well as the Pearson correlation between the participant group P and the hierarchy-based approaches. Tables 7, 8, 9 and 10 report the Pearson correlation between each of the above measures and the participant group P, for the base case and for the 3 variations introduced in Section 4.

Table 6. Drinks - P1 - P2 and edge-based

MEASURE                          r
P1 - P2                          0.9859
Leacock and Chodorow             0.7543
Wu and Palmer                    0.8817
Li et al.                        0.8567

Table 7. Drinks - base case

MEASURE                          r
Jaccard                          0.8724
Sq. Jaccard                      0.8755 *
Comm-sq. Jaccard                 0.7748
Norm. Jaccard                    0.9173 *
Norm. comm-sq. Jacc.             0.8935 *
Sigmoid                          0.8319
Sigmoid Jaccard                  0.8689

Table 8. Drinks - base case without rdf:type

MEASURE                          r
Jaccard no t.                    0.9025 *
Sq. Jaccard no t.                0.8422
Comm-sq. Jaccard no t.           0.7805
Norm. Jaccard no t.              0.8771 *
Norm. comm-sq. Jacc. no t.       0.9203 *
Sigmoid no t.                    0.8740 *
Sigmoid Jaccard no t.            0.8959 *

Table 9. Drinks - base case with hierarchy

MEASURE                          r
Jaccard + h.                     0.8736 *
Sq. Jaccard + h.                 0.8761 *
Comm-sq. Jaccard + h.            0.7790
Norm. Jaccard + h.               0.9122 *
Norm. comm-sq. Jacc. + h.        0.9015 *
Sigmoid + h.                     0.8187
Sigmoid Jaccard + h.             0.8706

Table 10. Drinks - base case without rdf:type, with hierarchy

MEASURE                          r
Jaccard no t. + h.               0.9029 *
Sq. Jaccard no t. + h.           0.8442
Comm-sq. Jaccard no t. + h.      0.7864
Norm. Jaccard no t. + h.         0.8809 *
Norm. comm-sq. Jacc. no t. + h.  0.9252 *
Sigmoid no t. + h.               0.8564
Sigmoid Jaccard no t. + h.       0.8975 *

5.5.1. Participant group and the comparison with hierarchy-based approaches
The correlation between the two groups of participants (the P1 - P2 entries in Tables 1 and 6) is 0.9714 in the first experiment and 0.9859 in the second, similar to the values reported in [25]. From this good correlation we can conclude that the participants' responses were coherent among themselves in both experiments and that we can trust the human ratings.

Furthermore, w.r.t. the correlation of the participant group with hierarchy-based approaches, we can see that in the first experiment the correlation with hierarchy-based measures is relatively low (only the Common-squared Jaccard measure performs worse than Wu and Palmer's and Li et al.'s measures), whereas in the second experiment the Normalised and Normalised common-squared measures perform consistently better than all the hierarchy-based measures. This can be explained by the fact that the ontology of drinks is designed to better mirror human categorisation than the ontology of recipes (which was indeed flat at the beginning, and we performed only minimal changes to obtain the main categories of recipes). But it also shows how dependent the measures are on the ontology design.

5.5.2. Base case with and without the property rdf:type
Tables 2 and 7 show the results for the base case for all 7 similarity measures in both experiments, where the property rdf:type was included in the calculations and no further hierarchical information was taken into account. Tables 3 and 8 show the same results but without the property rdf:type. The exclusion of the rdf:type property served to discover how much the actual information about the types of instances contributes to the overall similarity of the instances in the ontology.

Looking at Tables 2 and 7, we can see that all the proposed measures, except for sim_CSQ and sim_JS in the first experiment and except for sim_CSQ, sim_S and sim_JS in the second, improve on the basic Jaccard's similarity sim_J. In the first experiment the best results were obtained for sim_SQ, followed by sim_NCSQ and sim_N, all being better than the original Jaccard's similarity sim_J. In the second experiment the best results were obtained for sim_N, followed by sim_NCSQ and sim_SQ, all again better than the original. Hence, we confirmed our hypothesis H1, that feature-based similarity applied to ontology instances has a good correlation with human judgement, and H3, that it is possible to improve the original Tversky's formulation of the similarity measure.

Notice also that the worst performing similarity measure in all cases is sim_CSQ, which leads us to the conclusion that giving more importance to common features does not lead to good results, and that distinctive features are equally important in assessing similarity among ontology instances.

We can also see that in the first experiment the best results are obtained by considering the rdf:type property (Table 2) and in the second one by excluding the rdf:type property (Table 8). In both cases, there is a good correlation between the participant group P and the corresponding results.

Furthermore, the results in Table 2 (respectively Table 8) are in most cases better than the results in Table 1 (respectively Table 6), showing that we can obtain better similarity results by considering the properties of instances in an ontology rather than the hierarchy underlying the ontology. This confirms our hypothesis H2, that feature-based similarity shows better performance than hierarchy-based approaches. Often, other methods proposed in the literature do not surpass Li's measure, even when they surpass other hierarchy-based methods. On both of our datasets, the best performing hierarchy-based similarity measure is Wu and Palmer's measure. But almost all feature-based measures in the first experiment, and most of the measures in the second experiment, surpass Wu and Palmer's measure. This means that properties play a more important role than the underlying hierarchy when describing ontological instances and their mutual similarity.

But we cannot confirm our hypothesis H4, that hierarchical information encoded with the rdf:type property plays an important role in assessing similarity among instances. Sometimes the categorisation of domain items is important and sometimes it is not. It probably depends on the domain and on the level of detail of the underlying hierarchy.

5.5.3. Including hierarchical information
Finally, we wanted to combine hierarchical information with features, to see if the similarity values could be improved. A simple linear combination of feature-based similarity and hierarchy-based similarity would not yield better results, since the hierarchy-based similarity would just decrease the correlation. Hence, we tried to incorporate the hierarchy-based similarity in a different way, by including the normalised distance between concepts (calculated as dist/(2 max)) in the feature-based formulae as a part of the distinctive features. The results obtained by the combined feature-based and hierarchy-based measures (Table 4 and Table 9) are somewhat inconclusive. In the domain of recipes, apart from sim_N, all the other measures perform better in the presence of the hierarchical information, although with a small improvement. In the domain of drinks, we have the following situation: sim_SQh behaves better than its corresponding counterparts, sim_N is the best in its category and sim_NCSQnth is the best in its category. Also, the results in Table 5 and Table 10 are better than the results in Table 3 and Table 8, which shows that the information about the underlying hierarchy plays an important role in assessing the similarity, even when the categorisation of domain objects is not present. Hence, w.r.t. hypothesis H5, we can conclude that combining the hierarchy-based and feature-based approaches beyond a linear combination provides good correlation with human judgement, but sometimes better results are obtained without incorporating the underlying hierarchy.

5.5.4. Concluding considerations

We can see that in both domains the consistently best performing measure is the Normalised common-squared Jaccard's similarity simNCSQ. In the domain of Recipes we obtained an improvement of 2.5% for the base case, whereas in the domain of Drinks we obtained an improvement of 2.4% for the base case6.
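For readers who want to experiment, the following sketch contrasts the basic Jaccard similarity with one plausible reading of a "common-squared" variant. The exact normalisation used by simNCSQ is the one defined earlier in the paper; the second function below is an assumption that only illustrates the general shape (squaring the common-feature count while keeping the result in [0, 1]).

```python
from typing import Hashable, Set

def sim_jaccard(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Basic Jaccard similarity over feature sets (the base case)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def sim_common_squared(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Illustrative 'common-squared' Jaccard shape: the common-feature
    count is squared to reward shared features, and the denominator is
    built so the value stays in [0, 1].  This is an approximation of
    simNCSQ, not the paper's exact formula."""
    common = len(a & b)
    distinctive = len(a - b) + len(b - a)
    denominator = common ** 2 + distinctive
    return common ** 2 / denominator if denominator else 0.0
```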

In both cases an additional improvement is obtained by considering the underlying hierarchy. In the domain of Recipes it works best with the rdf:type property included (simulating categorisation), with an improvement w.r.t. the basic Jaccard's similarity simJ of 2.9%. In the domain of Drinks it works best without the rdf:type property, with an improvement w.r.t. the basic Jaccard's similarity simJ of 6.1%.

Also, in both cases there is a notable improvement w.r.t. the best hierarchy-based measure (Wu and Palmer): 21.7% in the case of Recipes and 5% in the case of Drinks. This again shows how dependent the hierarchy-based similarity measures are on the design of the underlying conceptual hierarchy.

Considering all the proposed measures, in the case of Recipes, 4 out of 6 proposed measures perform consistently better than the basic Jaccard's similarity measure (66.7%), whereas in the case of Drinks 3 out of 6 measures perform better in the presence of the rdf:type property (50%) and only the simNCSQ measure performs better without the rdf:type property (16.7%).

We can see that even with a relatively small number of properties defined for the concepts in the ontology, the feature-based similarity outperforms hierarchy-based approaches. Of course, the number of defined properties plays an important role in semantic similarity calculation.

The dataset used plays an important role in the similarity calculations. To the best of our knowledge, the datasets used in this work include a significantly higher number of participants than many works in the field. Usually, similarity measures are tested on WordNet [11], but we are providing the community with yet another rich dataset to experiment with.

6 In the domain of Drinks for the base case, simN brings an even greater improvement of 5.2%, but it does not perform consistently better, as simNCSQ does.


5.6. Comparison with the performance on WordNet

Since most of the similarity measures in the literature have been tested on WordNet, we include here a comparison of our results with the corresponding results provided in [29]. We include only the results for Miller and Charles' dataset [20], since the ones for [28] are not always available.

We can see that the hierarchy-based measures (Leacock and Chodorow, Wu and Palmer, and Li et al.) perform better on WordNet than on the Wikitaaable dataset, but their performance is similar on the Drinks dataset. This is due to the hierarchical structure of the Wikitaaable and Drinks datasets. The hierarchy in Wikitaaable is rather shallow, hence the information obtained from the underlying conceptual hierarchy is not so rich. On the other hand, the Drinks ontology has a deeper underlying hierarchy and provides more precise information. We can see that Tversky's similarity measure performs better on the Wikitaaable and Drinks datasets, since there are more properties defined for the concepts. We also include the results for Tversky + hierarchy, Sigmoid Tversky + hierarchy, Norm. com. sq. Tversky, Norm. com. sq. Tversky + hierarchy (with type) and Norm. com. sq. Tversky + hierarchy (no type), which perform even better than the simple Tversky's measure on the Wikitaaable and Drinks datasets (Table 11).
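The values in Table 11 are correlation coefficients between each measure's scores and averaged human judgements. Assuming Pearson's coefficient (the usual choice in this literature), a column of this kind can be reproduced with a few lines of Python; the numbers below are illustrative only, not the paper's data.

```python
from scipy.stats import pearsonr

# Averaged human ratings and one measure's scores for the same
# ordered list of concept pairs (illustrative values only).
human_ratings = [3.8, 1.2, 2.9, 0.4, 3.1, 2.2]
measure_scores = [0.91, 0.25, 0.74, 0.08, 0.83, 0.55]

r, p = pearsonr(human_ratings, measure_scores)
print(f"Pearson r = {r:.4f}")
```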

6. Related work

In this section we give a brief summary of other works which deal with feature-based similarity. These approaches calculate feature-based similarity in different ways, starting from Tversky's similarity measure but taking into account different aspects w.r.t. us (ancestor classes, descendant classes etc., whereas we compare property-value pairs). We include these works here to give a more complete picture of feature-based similarity measures. We did not test these measures, since the scope of our work was to evaluate the performance of Tversky's similarity measure and its variations, calculating them from property-value pairs for the compared objects. The same variations of Tversky's measure could be applied to other feature-based similarity measures as well. Also, some of these measures are not applicable in our context: for example, we cannot calculate the number of descendant classes, since we deal with instances in the ontology.

An interesting approach to feature-based similarity calculation is given in [22] and [23]. Both works translate the feature-based model into an information content (IC) model, with a slightly different formulation of Tversky's formula, where Tversky's function describing the saliency of the features is substituted by the information content of the concepts. In [22] the Intrinsic Information Content iIC, introduced in [30], is used, taking into account the number of subconcepts of a concept and the total number of concepts in a domain. In [23] the Extended Information Content EIC is used instead, where iIC is combined with the average iIC of all the concepts connected to a certain concept via different relations. Both approaches use the underlying ontology structure directly, where all the defined semantic relations are used, rather than relying on an external corpus. Their new similarity measure, called FaITH, is based on this novel framework. Also, this new IC calculation can be used to rewrite the existing similarity measures in order to calculate relatedness, in addition to similarity.
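As a rough illustration of this family of measures, the following sketch implements the Intrinsic Information Content of [30] and the FaITH ratio of [23] as we read them; the toy taxonomy sizes in the example are made up.

```python
import math

def intrinsic_ic(num_descendants: int, total_concepts: int) -> float:
    """Intrinsic Information Content of [30]: a concept with many
    descendants carries little information; leaves carry the most."""
    return 1.0 - math.log(num_descendants + 1) / math.log(total_concepts)

def faith(ic_a: float, ic_b: float, ic_msca: float) -> float:
    """FaITH similarity [23]: Tversky's ratio model restated in IC
    terms, with the IC of the most specific common ancestor (msca)
    playing the role of the common features."""
    denominator = ic_a + ic_b - ic_msca
    return ic_msca / denominator if denominator else 0.0

# Toy taxonomy of 100 concepts: Spaghetti and Fusilli are leaves,
# their msca Pasta has 10 descendants.
ic_spaghetti = intrinsic_ic(0, 100)
ic_fusilli = intrinsic_ic(0, 100)
ic_pasta = intrinsic_ic(10, 100)
print(faith(ic_spaghetti, ic_fusilli, ic_pasta))
```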

An important work on feature-based similarity regarding ontological concepts is described by [34]. They start from Tversky's assumption that similarity is determined by the common and distinctive features of the compared objects, and consider the relations between concepts as their features. They linearly combine two similarities. The first similarity is obtained from direct connections between two objects, as well as common features shared by both concepts (in this case the similarity between relations is calculated using Wu and Palmer's measure [33]). The second similarity is based on the distinctive features of each object. Their approach can be used to calculate similarity at the class level, as well as the similarity of instances. Furthermore, it is possible to take into account only specific relations, which leads to context-aware similarity. The problem is that the proposed method is assessed on only 4 pairs of concepts.

[29] also introduce a new feature-based approach for calculating ontology-based semantic similarity based on taxonomical features. They evaluate the semantic distance between concepts by considering as features the set of concepts that subsume each of them. Practically, the degree of disjunction between their feature sets (non-common subsumers) models the distance between concepts, whereas the degree of overlap (common subsumers) models their similarity. The problem with this approach is that if the input ontology is not deep enough, is not built with enough taxonomical detail, or does not consider multiple inheritance, the knowledge needed for similarity calculation might be scarce.
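A hedged sketch of this idea, with hypothetical subsumer sets: the features of a concept are its subsumers (the concept itself included), and the distance grows with the log-scaled proportion of non-shared subsumers, as we read [29].

```python
import math
from typing import Set

def taxonomical_distance(subs_a: Set[str], subs_b: Set[str]) -> float:
    """Taxonomical-feature distance in the spirit of [29]: the ratio
    of non-shared subsumers over all subsumers, log-scaled so that
    the result lies in [0, 1]."""
    non_common = len(subs_a - subs_b) + len(subs_b - subs_a)
    total = non_common + len(subs_a & subs_b)
    return math.log2(1 + non_common / total) if total else 0.0

# Subsumer sets for two hypothetical concepts:
spaghetti = {"Spaghetti", "Pasta", "Food", "Thing"}
risotto = {"Risotto", "RiceDish", "Food", "Thing"}
print(taxonomical_distance(spaghetti, risotto))  # closer to 1 = more distant
```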


MEASURE                                          WordNet   Wikitaaable   Drinks
Leacock and Chodorow                             0.74      0.6033        0.7607
Wu and Palmer                                    0.74      0.6304        0.8789
Li et al.                                        0.82      0.6270        0.8549
Tversky                                          0.73      0.7461        0.8724
Tversky + hierarchy                              N/A       0.7477        0.8736
Norm. com. sq. Tversky                           N/A       0.7654        0.8935
Norm. com. sq. Tversky + hierarchy (with type)   N/A       0.7674        0.9015
Norm. com. sq. Tversky + hierarchy (no type)     N/A       0.7299        0.9252

Table 11: Comparison with WordNet

The authors of [29] also provide a detailed survey of most of the ontology-based approaches and compare their performance on WordNet 2.0. From this analysis they draw important conclusions about the advantages and limitations of these approaches and give directions on their possible usage. A slightly different version of this measure was used by [2] on the SNOMED CT ontology to evaluate the similarity of medical terms.

[27] and [21] propose similar measures for calculating semantic similarity based on the matching of synonym sets, semantic neighbourhoods (semantic relations among classes) and features, which are classified into parts, functions and attributes. This enables separate treatment of these particular class descriptions and the introduction of specific weights which reflect their importance in different contexts. In [27] these similarities are calculated using Tversky's formula, where the parameters in the denominator are calculated taking into account the depth of the hierarchies of the different ontologies. Synonym sets and semantic neighbourhoods are useful when detecting equivalent or most similar classes across ontologies. Features are useful when detecting similar but not equivalent classes. [21] eliminate the need for the parameters in the denominator of Tversky's formula, hence they do not rely on the depth of the corresponding ontologies. This leads to matching based only on common words for synset similarity calculation. Also, set similarities are calculated per relationship type. Finally, their similarity does not have weights for the different similarity components. The novelty of their work is the application of this and various other similarity measures to the MeSH ontology (Medical Subject Headings). Both methods can be used for cross-ontology similarity calculation.

A semantic similarity measure for OWL objects introduced by [13] is based on Lin's information-theoretic similarity measure [19]. They compare the semantic descriptions of two services and define their semantic similarity as the ratio between the shared and total information of the two objects. The semantic description of an object is defined using its description sets, which contain all the statements (triples) describing the given object; their information content is based on their "inferencibility", i.e. the number of new RDF statements that can be generated by applying a certain set of inference rules to the predicate. They use their measure to determine the similarity of semantic services annotated with OWL ontologies.

Similarity can find many applications in Recommender Systems. [9] developed a content-based movie recommender system based on Linked Open Data (LOD) in which they adapt a vector space model (VSM) approach to compute similarities between RDF resources. Their assumption is that two movies are similar if they have features in common. The whole RDF graph is represented as a 3-dimensional matrix where each slice refers to one property in the ontology, and for each property the similarity between two movies is calculated using cosine similarity.
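A minimal sketch of this per-property cosine computation, with hypothetical movie features; how the per-property scores are then aggregated (e.g. by a weighted average) is our assumption, not a detail taken from [9].

```python
import math
from collections import Counter

def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical per-property vectors for two movies; each property
# corresponds to one slice of the 3-dimensional matrix.
movie_a = {"starring": Counter({"actor_x": 1, "actor_y": 1}),
           "genre": Counter({"thriller": 1})}
movie_b = {"starring": Counter({"actor_x": 1}),
           "genre": Counter({"thriller": 1, "drama": 1})}

per_property = {p: cosine(movie_a[p], movie_b[p]) for p in movie_a}
print(per_property)
```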

Similarity among concepts can also be automatically learned. Similarity learning is an area of supervised machine learning, closely related to regression and classification, where the goal is to learn from examples a similarity function that measures how similar or related two objects are. Similarity learning is used in information retrieval to rank items, in face identification and in recommender systems. Moreover, many machine learning approaches rely on some similarity metric. This includes unsupervised learning such as clustering, which groups together close or similar objects, or supervised approaches like the K-nearest neighbour algorithm. Metric learning has been proposed as a preprocessing step for many of these approaches.

Automatic learning of similarity among concepts in an ontology is used especially for ontology mapping (also known as ontology alignment or ontology matching) [10], the process of determining correspondences between ontology concepts. This is necessary for integrating heterogeneous databases, developed independently with their own data vocabularies or different domain ontologies.

There are several works which have exploited ML techniques for ontology alignment. [14] organised the ontology mapping problem into a standard machine learning framework, exploiting multiple concept similarity measures (i.e., synset-based, Wu and Palmer, description-based, Lin). In [8] multi-strategy learning was used to obtain similar instances of hierarchies to extract similar concepts using the Naïve Bayes (NB) technique. In [1], following a parameter optimisation process on SVM, DT and neural network (NN) classifiers, an initial alignment was carried out; then the user's feedback was used to improve the overall performance. All these works considered concepts belonging to different ontologies, while we concentrated on concepts in the same ontology.

7. Conclusions and future work

In this work we present some experiments with Tversky's feature-based similarity measure based on properties defined in an ontology, and their comparison with hierarchy-based approaches. We also propose an improvement of Jaccard's similarity measure, namely the Normalised common-squared Jaccard's similarity simNCSQ.

We further proposed three variations of all the feature-based measures, to see how much the underlying hierarchical information contributes to accurate similarity measurement. We came to the conclusion that the underlying hierarchical information does play an important role in similarity calculation, but that it is better expressed with features than with the underlying hierarchy alone.

We tested these four variations of the feature-based similarity measures on a slightly modified Wikitaaable dataset in the domain of recipes and on the Drinks ontology previously designed by our researchers. Our first evaluation included 137 subjects and 25 pairs of concepts and our second evaluation included 147 subjects and 30 pairs of concepts. This is a significant number of participants compared with other evaluations in the literature.

Our main finding is that the measure with the best performance is the one combining feature-based and hierarchy-based measures, followed by the simple feature-based similarity.

As future work, it would be interesting to see how the similarity measures would perform in the presence of more properties or on a different dataset. MeSH and SNOMED CT are possible candidate datasets, although in these cases expert opinion would be needed. Also, it would be interesting to add hierarchical structure among property values. For example, in the present approach we consider Fusilli and Spaghetti as two different domain items (hence, two different property values). But Fusilli and Spaghetti are both descendants of Pasta, so they could be considered as "almost" the same values for properties.

Moreover, in this work we only consider object type properties. Taking data type properties, such as literals, into account might be an interesting area for future investigation. In this case, it would be necessary to determine when two literal values can be considered equal.

In addition, similar experiments can be applied to linked open data [4] or any other data structure where the objects are described by means of their properties.

References

[1] B. Bagheri Hariri, H. Abolhassani, and H. Sayyadi. A neural-networks-based approach for ontology alignment. In SCIS & ISIS 2006, pages 1248–1252, 2006.

[2] M. Batet, D. Sánchez, and A. Valls. An ontology-based measure to compute semantic similarity in biomedicine. Journal of Biomedical Informatics, 44(1):118–125, 2011.

[3] G. Beydoun, A. A. Lopez-Lorca, F. García-Sánchez, and R. Martínez-Béjar. How do we measure and improve the quality of a hierarchical ontology? Journal of Systems and Software, 84(12):2363–2373, 2011.

[4] C. Bizer, T. Heath, and T. Berners-Lee. Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems, 5(3):1–22, 2009.

[5] A. Budanitsky and G. Hirst. Evaluating WordNet-based measures of lexical semantic relatedness. Computational Linguistics, 32(1):13–47, 2006.

[6] F. Cena, S. Likavec, and F. Osborne. Property-based interest propagation in ontology-based user model. In 20th Conference on User Modeling, Adaptation, and Personalization, UMAP 2012, volume 7379 of LNCS, pages 38–50. Springer, 2012.

[7] A. Cordier, V. Dufour-Lussier, J. Lieber, E. Nauer, F. Badra, J. Cojan, E. Gaillard, L. Infante-Blanco, P. Molli, A. Napoli, and H. Skaf-Molli. Taaable: A case-based system for personalized cooking. In S. Montani and L. C. Jain, editors, Successful Case-based Reasoning Applications-2, volume 494 of Studies in Computational Intelligence, pages 121–162. Springer Berlin Heidelberg, 2014.


[8] J. David. Association rule ontology matching approach. International Journal on Semantic Web and Information Systems (IJSWIS), 3(2):27–49, 2007.

[9] T. Di Noia, R. Mirizzi, V. C. Ostuni, D. Romito, and M. Zanker. Linked open data to support content-based recommender systems. In 8th International Conference on Semantic Systems, I-SEMANTICS '12, pages 1–8. ACM, 2012.

[10] J. Euzenat and P. Shvaiko. Ontology Matching. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2007.

[11] C. Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, 1998.

[12] T. R. Gruber. A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2):199–220, June 1993.

[13] J. Hau, W. Lee, and J. Darlington. A semantic similarity measure for semantic web services. In Web Service Semantics Workshop at WWW 2005, 2005.

[14] R. Ichise. Machine learning approach for ontology mapping using multiple concept similarity measures. In 7th IEEE/ACIS International Conference on Computer and Information Science, ICIS '08, pages 340–346. IEEE, 2008.

[15] C. Leacock and M. Chodorow. Combining local context and WordNet similarity for word sense identification. In C. Fellbaum, editor, WordNet: An Electronic Lexical Database, pages 305–332. MIT Press, 1998.

[16] Y. Li, D. McLean, Z. Bandar, J. O'Shea, and K. A. Crockett. Sentence similarity based on semantic nets and corpus statistics. IEEE Transactions on Knowledge and Data Engineering, 18(8):1138–1150, 2006.

[17] S. Likavec. Shapes as property restrictions and property-based similarity. In O. Kutz, M. Bhatt, S. Borgo, and P. Santos, editors, 2nd Interdisciplinary Workshop The Shape of Things, volume 1007 of CEUR Workshop Proceedings, pages 95–105. CEUR-WS.org, 2013.

[18] S. Likavec, F. Osborne, and F. Cena. Property-based semantic similarity and relatedness for improving recommendation accuracy and diversity. International Journal on Semantic Web and Information Systems, 11(4):1–40, 2015.

[19] D. Lin. An information-theoretic definition of similarity. In 15th International Conference on Machine Learning, ICML '98, pages 296–304. Morgan Kaufmann Publishers Inc., 1998.

[20] G. A. Miller and W. G. Charles. Contextual correlates of semantic similarity. Language & Cognitive Processes, 6(1):1–28, 1991.

[21] E. G. M. Petrakis, G. Varelas, A. Hliaoutakis, and P. Raftopoulou. X-similarity: Computing semantic similarity between concepts from different ontologies. Journal of Digital Information Management, 4(4):233–237, 2006.

[22] G. Pirró. A semantic similarity metric combining features and intrinsic information content. Data and Knowledge Engineering, 68:1289–1308, 2009.

[23] G. Pirrò and J. Euzenat. A feature and information theoretic framework for semantic similarity and relatedness. In 9th International Semantic Web Conference, ISWC '10, volume 6496 of LNCS, pages 615–630. Springer, 2010.

[24] R. Rada, H. Mili, E. Bicknell, and M. Blettner. Development and application of a metric on semantic nets. IEEE Transactions on Systems, Man, and Cybernetics, 19(1):17–30, 1989.

[25] P. Resnik. Using information content to evaluate semantic similarity in a taxonomy. In 14th International Joint Conference on Artificial Intelligence, pages 448–453, 1995.

[26] P. Resnik. Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of Artificial Intelligence Research, 11:95–130, 1999.

[27] M. A. Rodriguez and M. J. Egenhofer. Determining semantic similarity among entity classes from different ontologies. IEEE Transactions on Knowledge and Data Engineering, 2000.

[28] H. Rubenstein and J. B. Goodenough. Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633, 1965.

[29] D. Sánchez, M. Batet, D. Isern, and A. Valls. Ontology-based semantic similarity: A new feature-based approach. Expert Systems with Applications, 39(9):7718–7728, 2012.

[30] N. Seco, T. Veale, and J. Hayes. An intrinsic information content metric for semantic similarity in WordNet. In R. L. de Mántaras and L. Saitta, editors, 16th European Conference on Artificial Intelligence, ECAI '04, including Prestigious Applications of Intelligent Systems, PAIS '04, pages 1089–1090. IOS Press, 2004.

[31] A. Tversky. Features of similarity. Psychological Review, 84(4):327–352, 1977.

[32] C. A. Welty and N. Guarino. Supporting ontological analysis of taxonomic relationships. Data & Knowledge Engineering, 39(1):51–74, 2001.

[33] Z. Wu and M. Palmer. Verb semantics and lexical selection. In 32nd Annual Meeting of the Association for Computational Linguistics, pages 133–138. Association for Computational Linguistics, 1994.

[34] P. D. H. Zadeh and M. Reformat. Assessment of semantic similarity of concepts defined in ontology. Information Sciences, 250:21–39, 2013.