SOLVING DIFFERENT LANGUAGES PROBLEM (PORTUGUESE,
ENGLISH and BAHASA INDONESIA) IN DIGITAL LIBRARY WITH
ONTOLOGY
Herlina JAYADIANTI a,b, Carlos Sousa PINTO b, Lukito Edi NUGROHO c, Paulus Insap SANTOSA d, Wahyu WIDAYAT e
a Universitas Gadjah Mada, Electrical Engineering and Information Technology, Yogyakarta, Indonesia, herlinajayadianti@gmail.com
b Minho University, Information System Department, Campus Azurém, Guimaraes, Portugal, csp@dsi.uminho.pt
c,d Universitas Gadjah Mada, Electrical Engineering and Information Technology, Yogyakarta, Indonesia, {lukito, Insap}@mti.ugm.ac.id
e Universitas Gadjah Mada, Faculty of Economic and Development, Yogyakarta, Indonesia, wahyu@mep.ugm.ac.id
ABSTRACT
In this paper we present how a digital library can
support different languages, possibly across different
repositories in different countries. Our work requires
that the collections described by one metadata set be
associated with the corresponding collections in another
metadata set, and that a relation be built between the
metadata of each repository. We use three languages
from three different repositories: Indonesian, English,
and Portuguese. The aim of our work is to connect
references in these languages (English, Portuguese and
Indonesian) within one large metadata collection.
Ambiguity, equivalence and semantic problems appear in
this situation, and we try to solve them through
this work.
Keywords: Ontology, Library, References, Different
languages, Indonesia, English, Portuguese.
1 INTRODUCTION
A digital library is a repository of digital
documents in different file formats, such as .pdf, .doc, .ppt
or even plain .txt, which may be journals, newspapers,
books, magazines, instruction manuals, presentations
and other publications. Ontologies are now very
important for efficient searching in digital
libraries [1], [2], [3], [4]. An ontology-based digital library
should offer semantic access, querying and searching,
using a reference ontology to reformulate the user query
and extract only the appropriate content from the library.
In this section we present a future face of the digital
library: it supports different languages, possibly in
different countries. In summary, our work requires
the collections available in one library (references in
English) to be associated with the corresponding
collections in other libraries (references in Portuguese
and Indonesian).
Figure 1. Metadata Architecture
6-08 Solving Different Languages Problem (portuguese, English And Bahasa Indonesia) In Digital Library With Ontology
As an illustration (Figure 1), there are
three metadata repositories: in English, in Portuguese
and in Indonesian. Our aim is to build a
relation between the metadata in each repository.
Indonesian readers often search for literature or
references in English and Indonesian.
Similarly, in a country such as East Timor, people use
more than three languages to communicate:
Portuguese, English, Indonesian and local languages.
Alongside Malay, Portuguese is one of the languages from
which Indonesian has absorbed words. It is therefore
very important to connect
references in different languages (English (En),
Portuguese (pt), Indonesian (Ina)) in one large metadata collection.
The term “reference” denotes a relation between objects in
which one object connects or links to another object.
The first object in this relation is said to refer to the
second object. The second object, the one to which the
first object refers, is called the referent of the first
object. As an example:
Book (1): Artificial Intelligence: A Modern Approach by
Author (1) Russell and Norvig (first object) refers to:
Book (2): Intelligent Machinery by Author (2) Alan
Turing (second object).
Figure 2. Case Study Library
Based on Figure 2: Books (En) ≈ Livro (pt) ≈
Buku (Ina) and Title (En) ≈ Titulo (pt) ≈ Judul (Ina).
The book “Artificial Intelligence” ≈ the book
“Inteligência artificial” ≈ “Kecerdasan Buatan”. We
describe this in more detail in Section 4. To obtain
common English terms, we use terms from WordNet.
WordNet (http://wordnet.princeton.edu/) is a large lexical
database, or electronic dictionary, for English. It supports
measures of similarity and relatedness among terms [5], [6].
Measures of similarity use information found in an is-a
hierarchy of concepts and quantify how much concept
A is similar to concept B.
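As a rough illustration of how a similarity measure can use an is-a hierarchy, the sketch below scores two concepts by the length of the is-a path between them. It is not WordNet's implementation: the toy hierarchy, the concept names and the function names are assumptions made for this example, and each concept is assumed to have a single parent.

```python
# Toy is-a hierarchy (child -> parent). The concepts are illustrative only
# and are not taken from WordNet.
IS_A = {
    "book": "publication",
    "journal": "publication",
    "publication": "document",
    "manual": "document",
    "document": "entity",
}


def ancestors(concept):
    """Return the chain [concept, parent, grandparent, ...] up to the root."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def path_similarity(a, b):
    """Return 1 / (1 + number of is-a edges on the path between a and b)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    steps_in_b = {concept: i for i, concept in enumerate(chain_b)}
    # The first ancestor of `a` that is also an ancestor of `b` is their
    # lowest common ancestor, because every concept has a single parent.
    for steps_a, concept in enumerate(chain_a):
        if concept in steps_in_b:
            return 1.0 / (1 + steps_a + steps_in_b[concept])
    return 0.0  # the two concepts share no ancestor


print(path_similarity("book", "journal"))  # siblings under "publication"
print(path_similarity("book", "manual"))   # related only through "document"
```

In this toy hierarchy, "book" scores higher against "journal" than against "manual", because the is-a path between the first pair is shorter.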
2 SEMANTIC HETEROGENEITY
Semantic heterogeneity occurs when the same
reality, modeled by two or more people, does not have
the same model or representation [7], [8], [9]. In this
research we consider different conceptualizations (sets
of terms) of a library that cause a semantic
heterogeneity problem. Section 1 (Introduction)
describes the concept of a library and the different
perceptions of it. Since the representations or models of
a library are developed independently, they often have
different structures, terminologies, or even
interpretations, which is an obstacle to the semantic
interoperation of those models. Semantic heterogeneity
problems arise in naming, scaling and
confounding [10], [11]. Semantic heterogeneity in
naming includes problems with synonyms (the same
concept or property expressed with different terms,
e.g. education and school background) and
homonyms (the same term with different semantics, e.g.
worm as an animal, as a muscle under the tongue and as an
infection in computers). Semantic heterogeneity
in confounding occurs when one concept can
refer to different realities, which affects the attribute
values. For example, latestmeasuredtemperature
does not always refer to one and the same instant.
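The naming problems described above can be made concrete with a small sketch. It takes (term, concept) pairs and reports synonyms (several terms for one concept) and homonyms (one term for several concepts). The concept identifiers and the function name are hypothetical and were chosen only to mirror the examples in the text.

```python
from collections import defaultdict

# Hypothetical (term, concept) pairs mirroring the examples in the text.
# The "concept:..." identifiers are invented for this sketch.
TERM_TO_CONCEPT = [
    ("education", "concept:schooling"),
    ("school background", "concept:schooling"),
    ("worm", "concept:animal"),
    ("worm", "concept:tongue_muscle"),
    ("worm", "concept:computer_infection"),
    ("book", "concept:book"),
]


def find_naming_conflicts(pairs):
    """Return (synonyms, homonyms) found in a list of (term, concept) pairs."""
    terms_by_concept = defaultdict(set)
    concepts_by_term = defaultdict(set)
    for term, concept in pairs:
        terms_by_concept[concept].add(term)
        concepts_by_term[term].add(concept)
    # Synonyms: one concept reached through more than one term.
    synonyms = {c: sorted(t) for c, t in terms_by_concept.items() if len(t) > 1}
    # Homonyms: one term that points to more than one concept.
    homonyms = {t: sorted(c) for t, c in concepts_by_term.items() if len(c) > 1}
    return synonyms, homonyms


synonyms, homonyms = find_naming_conflicts(TERM_TO_CONCEPT)
print(synonyms)
print(homonyms)
```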
3 ONTOLOGY INTEGRATIONS
An ontology consists of classes, data properties,
object properties and instances. Instances are objects
that cannot be divided without losing their structural
and functional characteristics. Data properties and
object properties relate and operate among the
various objects populating the ontology. Ontology
integration is one way to solve the problem of semantic
heterogeneity and can be done using several
approaches, for example merging, matching or
mapping. In our case, we decided to use a mapping
process, because with mapping we can find the
similarities and correspondences between the terms of the
ontologies. Mapping works with logical axioms,
typically expressing logical equivalence or inclusion
among ontology terms. The integration of ontologies
creates a new ontology by reusing other available
ontologies through assembling, extending or
specializing operations. In integration processes the
source ontologies and the resultant ontology can have
different amounts of information [9].
The goal of ontology integration is to derive a more
general domain ontology (a common ontology) from
several other ontologies in the same domain, combined into a
consistent unit. The domain of both the integrated and
the resulting ontologies is the same. Figure 3 shows an
example with several source ontologies (Oen, Oina,
Opt) and the integrated common ontology (CO - Oen).
The Proceedings of The 7th ICTS, Bali, May 15th-16th, 2013 (ISSN: 9772338185001)
Figure 3. Integration of ontologies
The ontology integration process involves several steps:
Finding similarities and differences between ontologies
in an automatic or semi-automatic way;
Defining mappings between ontologies;
Developing an ontology integration architecture;
Composing mappings across different ontologies;
Representing uncertainty and imprecision in mappings.
In ontology integration in particular, some tasks
must be performed to eliminate differences and
conflicts between the ontologies. These tasks lie at two
levels: the language level and the ontology level [12]. In
Figure 3, Ontology En (Oen), built from a
library whose books are mostly English-language
literature, is integrated with the Portuguese-language
repository in Ontology Pt (Opt) and the
Indonesian-language repository in Ontology Ina
(Oina). Importing is one way to integrate
ontologies. When an ontology imports another
ontology, all the definitions of classes, properties
and individuals of the imported ontology become
available to the importing ontology. Here Ontology
English (Oen) is the common ontology, because
English terms are more widely understood by English, Portuguese
and Indonesian speakers than Portuguese or Indonesian terms.
We use the Web Ontology Language (OWL), a language for
creating ontologies for the Web, to implement this
importing process. The code below shows how the
owl:imports mechanism is declared; OWL
resolves the location of each imported ontology from its URI.
<owl:Ontology rdf:about="http://www.semanticweb.org/Oen.owl">
<owl:imports rdf:resource="http://www.semanticweb.org/Oina.owl"/>
<owl:imports rdf:resource="http://www.semanticweb.org/Opt.owl"/>
</owl:Ontology>
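Because owl:imports is transitive, the importing ontology sees everything in its imports closure. The sketch below computes that closure over a plain dictionary that stands in for the owl:imports declarations above. It does not fetch or parse any OWL file, and the function name is an assumption made for this example.

```python
# Stand-in for the owl:imports declarations: ontology URI -> imported URIs.
IMPORTS = {
    "http://www.semanticweb.org/Oen.owl": [
        "http://www.semanticweb.org/Oina.owl",
        "http://www.semanticweb.org/Opt.owl",
    ],
    "http://www.semanticweb.org/Oina.owl": [],
    "http://www.semanticweb.org/Opt.owl": [],
}


def imports_closure(root, imports):
    """Return every ontology reachable from `root` via imports, root first."""
    seen, order, stack = set(), [], [root]
    while stack:
        uri = stack.pop()
        if uri in seen:
            continue  # already visited; this also guards against import cycles
        seen.add(uri)
        order.append(uri)
        # reversed() keeps the declared import order in the result
        stack.extend(reversed(imports.get(uri, [])))
    return order


for uri in imports_closure("http://www.semanticweb.org/Oen.owl", IMPORTS):
    print(uri)
```

Starting from Oen, the closure contains Oen, Oina and Opt, which is why the classes, properties and individuals of all three are available in the common ontology.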
4 ACHIEVING A COMMON ONTOLOGY
The importing process, as explained before, allows
us to obtain a new ontology, a Common Ontology (CO),
consisting of common terms. A common term is a
word recognized and used by different groups of
people. In this project we use English for the common
terms, because English is more widely used than
Portuguese and Indonesian.
Label : [language : Indonesia]
Class : Buku
Buku SubClassOf Koleksi
Label : [language : pt]
Class : Livro
Livro SubClassOf Coleções
Label : [language : en]
Class : Books
Books SubClassOf Collection
Buku (indonesia)≈ Books (en) ≈ Livro (pt)
Figure 4. Integrating Classes between Ontologies
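The correspondence in Figure 4 can be sketched as a lookup from language-tagged labels to the common English term. The table below copies the labels of Figure 4. The language tags "ina", "pt" and "en" and the function names are assumptions made for this example, not part of the ontologies themselves.

```python
# (language tag, label) -> common term in the CO, using the labels of Figure 4.
LABELS = {
    ("ina", "Buku"): "Books",
    ("pt", "Livro"): "Books",
    ("en", "Books"): "Books",
    ("ina", "Koleksi"): "Collection",
    ("pt", "Coleções"): "Collection",
    ("en", "Collection"): "Collection",
}

# Subclass axiom expressed once, on the common terms.
SUBCLASS_OF = {"Books": "Collection"}


def common_term(lang, label):
    """Return the common (English) term for a label, or None if unknown."""
    return LABELS.get((lang, label))


def same_concept(a, b):
    """True if two (lang, label) pairs map to the same common term."""
    term_a, term_b = common_term(*a), common_term(*b)
    return term_a is not None and term_a == term_b


print(same_concept(("ina", "Buku"), ("pt", "Livro")))
print(SUBCLASS_OF[common_term("ina", "Buku")] == common_term("pt", "Coleções"))
```

Stating the subclass axiom once on the common terms is a design choice of this sketch. In the ontologies of Figure 4 each language declares its own SubClassOf axiom.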
Figure 4 shows the relationship scheme
between terms in the considered ontologies and the common
terms in the CO. For example, the class Editores (Opt),
which represents publishers in Portuguese, is
equivalent to the class Penerbit (Oina). If such differences in
perception occur among a group of
people, they can easily communicate
with the help of a translator or dictionary
and agree on a common understanding across their
languages. But what happens when the
communication is between machines? Suppose
that we have three libraries and three
ontologies, for example an English ontology
(Oen), a Portuguese ontology (Opt) and an Indonesian
ontology (Oina), each describing its own perception
of a library (see Figure 4 and Figure 5). All three
ontologies (Oen, Opt and Oina) have a class that refers to
“collection”, which represents each collection record
(see Figure 5).
Figure 5. Ontograph Visualization Between Class Books(Oen),
Buku(Oina) And Livro(Opt)
The three ontologies look rather different. The
information that they capture is roughly the same, but
they use different languages: English,
Portuguese and Indonesian. We can
say that the class Buku in ontology Oina is
equivalent to the class Livro in ontology Opt; class
Buku (Oina) and class Livro (Opt) have the same
semantic value. Not only the classes are integrated: we
also integrate the data properties, object properties and
instances. In the same example, the classes
Artikel (Oina) and Artigos (Opt) are equivalent:
they both represent articles but use
different terms.
5 ONTOLOGY MAPPING
After the ontology integration process is
complete, the next step is to execute a
mapping process. The main purpose of mapping
is to find similarities between the source
ontologies through logical axioms, expressing logical equivalences
and inclusions among ontology terms. Mappings can be
defined between classes, properties and individuals,
and can be produced by automatic,
semi-automatic or interactive reasoning. The results of
mapping are used for various purposes, such as data
transformation, query answering and data integration.
According to Noy [13] there are four dimensions of
ontology mapping:
Mapping discovery: given two ontologies, find
similarities between them and determine which
concepts and properties represent similar notions;
Interactive specification of mappings: use tools that
allow ontologies and mappings to be defined and
compared interactively, with automatic or
semi-automatic help;
Declarative formal representation of mappings;
Reasoning with mappings.
For instance, we can say that the class Books in
Figures 4 to 6 is equivalent to the class Livro. The
object property Has_Written is equivalent to the
object property Menulis. Note that
Has_Written exists in Oen and
Menulis exists in Oina, but through the
ontology mapping and importing process we can now
integrate them in one ontology.
Figure 6. Object Properties Equivalence - Owl: EquivalentProperty
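Equivalence between classes or properties is symmetric and transitive, so a set of pairwise mappings groups terms into equivalence sets. The sketch below keeps such sets with a small union-find structure. The prefixed names follow the ontologies in the text, while the class and method names are assumptions made for this example. A real reasoner would derive these groups from owl:equivalentClass and owl:equivalentProperty axioms.

```python
class EquivalenceIndex:
    """Groups ontology terms that have been declared equivalent."""

    def __init__(self):
        self._parent = {}

    def _find(self, term):
        # Register unseen terms as their own representative.
        self._parent.setdefault(term, term)
        while self._parent[term] != term:
            # Path halving keeps later lookups short.
            self._parent[term] = self._parent[self._parent[term]]
            term = self._parent[term]
        return term

    def declare_equivalent(self, a, b):
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def equivalent(self, a, b):
        return self._find(a) == self._find(b)

    def group(self, term):
        """Return all known terms equivalent to `term`, sorted by name."""
        root = self._find(term)
        return sorted(t for t in list(self._parent) if self._find(t) == root)


index = EquivalenceIndex()
index.declare_equivalent("oen:Books", "opt:Livro")          # class mapping
index.declare_equivalent("oina:Buku", "opt:Livro")          # class mapping
index.declare_equivalent("oen:Has_Written", "oina:Menulis")  # property mapping
index.declare_equivalent("opt:Editores", "oina:Penerbit")    # class mapping

print(index.group("oina:Buku"))
print(index.equivalent("oen:Books", "oina:Buku"))
```

Although only Books ≈ Livro and Buku ≈ Livro are declared, the index also reports Books ≈ Buku, which is the inference the mapping process relies on.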
In this section we show a test of the ontology mapping, expressed as a SPARQL query.
PREFIX : <http://www.semanticweb.org/Oen.owl#>
PREFIX oina: <http://www.semanticweb.org/Oina.owl#>
PREFIX opt: <http://www.semanticweb.org/Opt.owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?Books ?Authors
WHERE { ?Books :Written_by ?Authors .
?Authors :AuthorName ?Value .
FILTER (?Value = 'Stuart Russell') }
Figure 7. OntoGraph visualization between Class Books(Oen),
Buku(Oina) and Livro(Opt)
Based on Figure 7, the reference
“Inteligência artificial” from Opt is
equivalent to “Artificial intelligence” from Oen and
“Kecerdasan Buatan” from Oina. So with only one
term, “Books”, in SELECT ?Books ?Authors, the system
can understand what the user wants. The system
returns not only the book Artificial
intelligence from Oen, but also the results from Opt and Oina.
Ontology mapping can be applied in various
domains, not only libraries. We can use an ontology
mapping process to help find semantic
correspondences between similar or different elements
of different ontologies in any domain. In this paper we
also focus on a SPARQL query process that can be
used to achieve interoperability in semantic information
retrieval and/or knowledge discovery processes over
interconnected RDF data sources. Formal mappings
between different overlapping ontologies are exploited
to rewrite the initial user SPARQL queries, so that
they can be evaluated over different RDF data sources in different places.
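The query rewriting idea can be illustrated with a minimal sketch that substitutes mapped terms in the text of a query. Several things here are assumptions: the explicit oen: prefix on the query terms, the Indonesian and Portuguese property names (Ditulis_oleh, NamaPenulis, Escrito_por, NomeAutor), which are hypothetical and do not appear in the ontologies described above, and the function names. The sketch uses a regular expression rather than a SPARQL parser, so it is not a complete rewriting engine.

```python
import re

# Hypothetical mappings from Oen terms to equivalent terms in Oina and Opt.
# The target property names are invented for this sketch.
MAPPINGS = {
    "oina": {
        "oen:Written_by": "oina:Ditulis_oleh",
        "oen:AuthorName": "oina:NamaPenulis",
    },
    "opt": {
        "oen:Written_by": "opt:Escrito_por",
        "oen:AuthorName": "opt:NomeAutor",
    },
}

QUERY = """SELECT ?Books ?Authors
WHERE { ?Books oen:Written_by ?Authors .
        ?Authors oen:AuthorName ?Value .
        FILTER (?Value = 'Stuart Russell') }"""

# Matches prefixed names such as oen:Written_by; variables like ?Books
# contain no colon and are therefore left alone.
PREFIXED_NAME = re.compile(r"\b[A-Za-z]\w*:[A-Za-z_]\w*")


def rewrite(query, mapping):
    """Replace each mapped prefixed name; leave unmapped names untouched."""
    return PREFIXED_NAME.sub(lambda m: mapping.get(m.group(0), m.group(0)), query)


def rewrite_for_all(query, mappings):
    """Return the original query plus one rewritten query per target ontology."""
    rewritten = {"oen": query}
    for target, mapping in mappings.items():
        rewritten[target] = rewrite(query, mapping)
    return rewritten


for target, text in rewrite_for_all(QUERY, MAPPINGS).items():
    print("--", target)
    print(text)
```

Each rewritten query could then be evaluated against the corresponding RDF source and the results merged, so that one user query returns books from Oen, Oina and Opt.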
6 CONCLUSIONS
Reference services in
different languages should be a primary concern
of many central libraries in the world. The
reference service affects the library
service itself. If a library can accommodate the
different semantics, terms and
languages of different user queries, it becomes much
easier for users who speak different languages to find references
related to their own knowledge and language.
7 ACKNOWLEDGEMENT
We would like to acknowledge the support of the
Erasmus Mundus EuroAsia program for this
research, and also to acknowledge
Universidade do Minho, Portugal, and Universitas
Gadjah Mada, Yogyakarta, Indonesia, for the
collaboration.
8 REFERENCES
[1] A. Bénel, E. Egyed-Zsigmond, Y. Prié, S.
Calabretto, A. Mille, A. Iacovella, and J. M.
Pinon, “Truth in the digital library: From
ontological to hermeneutical systems,” Research
and Advanced Technology for Digital Libraries,
pp. 914–914, 2001.
[2] S. Buckingham Shum, E. Motta, and J. Domingue,
“ScholOnto: an ontology-based digital library
server for research documents and discourse,”
International Journal on Digital Libraries, vol. 3, no. 3, pp. 237–248, 2000.
[3] M. Doerr, J. Hunter, and C. Lagoze, “Towards a
core ontology for information integration,”
Journal of Digital information, vol. 4, no. 1, 2003.
[4] L. Rajput and S. Shyam, “Ontology based digital
library,” 2010. [Online]. Available:
http://dl.acm.org/citation.cfm?id=1742233.
[Accessed: 27-Jan-2013].
[5] C. Fellbaum, WordNet. Springer, 2010.
[6] T. Pedersen and V. Kolhatkar,
“WordNet::SenseRelate::AllWords: a broad coverage
word sense tagger that maximizes semantic relatedness,”
in Proceedings of Human Language Technologies:
The 2009 Annual Conference of the North
American Chapter of the Association for
Computational Linguistics, Companion Volume:
Demonstration Session, 2009, pp. 17–20.
[7] K. Janowicz, “The role of space and time for
knowledge organization on the semantic web,”
Semantic Web, vol. 1, no. 1, pp. 25–32, 2010.
[8] V. Morocho, F. Saltor, and L. Perez-Vidal,
“Ontologies: Solving semantic heterogeneity in a
federated spatial database system,” in
Proceedings of 5th International Conference on
Enterprise Information System, 2003.
[9] Y. Xue, “Ontological View-driven Semantic
Integration in Open Environments,” The
University of Western Ontario, 2010.
[10] I. Boukhari, L. Bellatreche, and S. Jean, “An
ontological pivot model to interoperate
heterogeneous user requirements,” in Leveraging
Applications of Formal Methods, Verification and
Validation. Applications and Case Studies,
Springer, 2012, pp. 344–358.
[11] L. Bellatreche, G. Pierra, and E. Sardet,
“Evolution Management of Data Integration
Systems by the Means of Ontological Continuity
Principle,” in Recent Trends in Information Reuse
and Integration, Springer, 2012, pp. 77–96.
[12] N. F. Noy, M. Crubézy, R. W. Fergerson, H.
Knublauch, S. W. Tu, J. Vendetti, and M. A.
Musen, “Protégé-2000: An Open-Source
Ontology-Development and Knowledge-Acquisition
Environment: AMIA 2003 Open
Source Expo,” in AMIA Annual Symposium
Proceedings, 2003, vol. 2003, p. 953.
[13] N. F. Noy, “Ontology mapping,” Handbook on
ontologies, pp. 573–590, 2009.