
110 Int. J. Knowledge and Web Intelligence, Vol. 1, Nos. 1/2, 2009

Copyright © 2009 Inderscience Enterprises Ltd.

Semantic lifecycles: modelling, application, authoring, mining, and evaluation of meaningful data

Felix Mödritscher Institute for Information Systems and New Media, Vienna University of Economics and Business, Augasse 2-6, 1090 Vienna, Austria E-mail: [email protected]

Abstract: The Semantic Web aims at evolving the ‘web of data’ to an ‘intelligent information space’ which is responsive to both human beings and computer systems. Consequently, semantic technologies have emerged in many application fields, primarily to provide intelligent IT systems. Semantics, therefore, is considered to be the essence of systemic intelligence. In this paper, we introduce a lifecycle model for semantics, consisting of five phases:

(a) modelling, (b) application, (c) authoring, (d) mining, and (e) evaluation of semantic information.

With respect to this model, we analyse semantic lifecycles from former research work and summarise experiences as well as future research activities.

Keywords: semantic technologies; adaptation systems; information retrieval; data mining; semantic modelling; technology-enhanced learning; medical documentation.

Reference to this paper should be made as follows: Mödritscher, F. (2009) ‘Semantic lifecycles: modelling, application, authoring, mining, and evaluation of meaningful data’, Int. J. Knowledge and Web Intelligence, Vol. 1, Nos. 1/2, pp.110–124.

Biographical notes: Felix Mödritscher received an MSc in Computer Technics (2002) and a PhD in Computer Science (2007) from Graz University of Technology. Since November 2003, he has been participating in the research projects AdeLE (nationally funded), APOSDLE (IST FP6/IP), iCAMP (IST FP6/STREP), and ROLE (IST FP7/IP). In the scope of these projects, he has been dealing with personalisation and adaptive behaviour in e-learning systems, infrastructures and services for technology-enhanced learning, as well as personal learning environments and learner networks. Currently, he is a postdoctoral fellow at the Institute for Information Systems and New Media of the Vienna University of Economics and Business.


1 Introduction

1.1 The need for semantics

Inspired by early visions of hypertext systems (Bush, 1945) and first conceptual and technological approaches like Xanadu (Nelson, 1965), Tim Berners-Lee developed the ‘tools’ necessary for a working Web, the HyperText Transfer Protocol (HTTP) and the HyperText Markup Language (HTML), in the early 1990s (cf. Berners-Lee and Fischetti, 1999). The rest is history, and Sir Tim Berners-Lee is considered to be the ‘inventor’ of the World Wide Web. However, this first version of the Web was criticised by researchers and practitioners for its poor, limiting usage possibilities as well as the lack of meaningful meta-information to foster understandability of hypertexts by both humans and machines. Again, Berners-Lee was one of the pioneers working on conceptual ways out of this shortcoming, which consequently led to the vision of the Semantic Web (Berners-Lee et al., 2001).

Beyond that, the development of the Web shows that technology has undergone the paradigm shift from “what [the Web] can do” to “what [the Web] can do for humans” (Shneiderman, 2002). In contrast to human–computer interaction research, the Web cannot be tailored to specific end-users, i.e., by applying user-centred design methods (cf. ISO DIS 13407, 1999). Nevertheless, semantic structures and descriptions may help to increase the utility and usability of hypertext corpora. Similarly, semantics is the missing link for ‘intelligent information systems and spaces’ that are responsive to end-users as well as processable by other systems. Semantic technologies have been developed and applied in many application areas (Tochtermann et al., 2007; Auer et al., 2008) to realise context-sensitive, multi-purpose IT systems, to provide ‘intelligent’ functionalities and content, and to support end-users in their everyday tasks.

1.2 Purpose and structure of this paper

Semantic technologies and models are not at all new, although researchers in the fields of compiler construction and natural language processing rather refer to the (formal) semantics of languages and programming languages (Abbott, 1999). In applied science, approaches typically focus on one (or maybe two or three) issues in connection with semantics; for instance, Xu et al. (2004) deal with semantic mining and analysis of gene expression data while Caliusco et al. (2006) address semantic modelling issues. Yet, for developing semantic technologies, it would be beneficial to have a holistic approach towards designing, creating and applying semantic information, similar to knowledge management models like the building blocks by Probst et al. (1999, p.51ff).

Lacking a supportive methodology for designing and realising semantic applications, this paper aims at sketching such a holistic, process-based approach based on results from literature and experiences of former research activities. Therefore, the upcoming section introduces a lifecycle model for semantic information consisting of five phases. Then, this methodological model for developing semantic applications is explained and evaluated on the basis of five case studies, each one including a semantic lifecycle with its own characteristics.


As a first case study, Section 3 describes an infrastructure for semantic applications developed in the research project APOSDLE. Second and third, we outline two lifecycles from the AdeLE project (Sections 4 and 5), one comprising a domain model for retrieval-based instruction and one dealing with learner states for personalised e-learning with eye-tracking. Fourth, Section 6 addresses a good practice lifecycle for a learner-centred technology-enhanced learning approach. Finally, Section 7 steps into a different application area and demonstrates the utilisation of our lifecycle model in the field of medical documentation, precisely to support medical experts in navigating Magnetic Resonance Imaging (MRI) diagnoses.

These case studies will show that each of these developments is related to semantic technologies and takes into consideration a few but not all issues of our process-based approach towards semantic lifecycles. Next to concrete results and experiences, we also indicate ideas that are subject to further research or exploitation. Finally, the paper is concluded, and an outlook on possible directions for future research is given.

2 MAAME, a generic lifecycle model for semantics

In accordance with its definition as the “study of meaning in communication” (Wikipedia, 2008), we consider semantics to be meaningful data (facts) about the real world, which makes information ‘understandable’ by both humans and machines and is necessary for realising intelligent systemic behaviour. Like any kind of information, semantics is subject to a lifecycle, which describes the different phases of its existence, ranging from being created through being utilised to being measured and improved.

This paper introduces a process-based approach to semantic technologies, which aims at designing and realising semantic-driven ‘intelligence’ of IT systems. Precisely, we propose a lifecycle model for semantics consisting of five important phases (cf. Figure 1):

• First and foremost, semantic technologies require some underlying model to make content understandable for humans and processable by machines. Therefore, this phase includes all aspects of designing and describing such models, for instance by defining ontologies, rules, or feature spaces.

• Second and of primary importance, semantic information must be applied to enable users to achieve their goals by providing intelligent functionality. Dependent on the application area, semantics might be necessary, e.g., for recommendation services, visualisation or adaptation techniques, information retrieval improvements, etc.

• Third and more user-related, the authoring phase deals with all aspects of entering semantics manually, independently from its materialisation (in-content, separated from the original data), type (metadata, structure, concept maps, and so forth), or authoring mode (editing, tagging, annotating, and so on).

• Fourth and consequently, the mining phase summarises automated (and semi-automated) methods to gather meaningful data from the real world, e.g., extract it from digital corpora. Hereby, techniques from different fields, like data mining, natural language processing, Latent Semantic Analysis (LSA), web harvesting, etc., are of interest.


• Fifth and finally, the evaluation stage copes with aspects of evaluating such a lifecycle, refining the semantic model and improving the semantic information itself, whereby the success of an approach reaches from more pragmatical issues, like cost-benefit analysis, over the technical correctness and methodological soundness up to the users’ acceptability, usefulness, and usability. Thus, formalisms and field studies have to be applied to evidence the benefits of semantic applications and, furthermore, to improve their underlying models continuously. With their comprehensive approach to ontology evaluation, Gangemi et al. (2006) evidence that this phase is very tedious and costly. On the other hand, it is important and, comparable with the building block model by Probst et al. (1999), to be seen as a starting point for an iterative development process going through the other four phases again.
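The five phases listed above can be sketched as a small data structure. This is only an illustration: the names `Phase`, `next_phase`, `acronym`, and `missing_phases` are hypothetical, and treating the phases as a strict sequence that wraps around from evaluation to modelling is a simplifying assumption based on the iterative reading of the evaluation phase.

```python
from enum import Enum, auto


class Phase(Enum):
    """The five MAAME phases, in the order in which the text lists them."""
    MODELLING = auto()
    APPLICATION = auto()
    AUTHORING = auto()
    MINING = auto()
    EVALUATION = auto()


def next_phase(phase):
    """Return the following phase; evaluation leads back to modelling.

    Assumption: the lifecycle is walked strictly in order.
    """
    members = list(Phase)
    return members[(members.index(phase) + 1) % len(members)]


def acronym():
    """Build the acronym from the initials of the phase names."""
    return "".join(phase.name[0] for phase in Phase)


def missing_phases(implemented):
    """List the phases that a concrete lifecycle does not cover."""
    return [phase for phase in Phase if phase not in implemented]
```

Such a helper makes it easy to state which phases a given case study leaves open, e.g., by passing the set of implemented phases to `missing_phases`.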

Figure 1 Lifecycle model for semantic information consisting of five key phases

Perfectly fitting the context, the initials of these five phases coin the term ‘MAAME’ which is, according to All-BabyNames.com (2008), an African and Middle East expression for ‘mother’ and, thus, a synonym for the genesis of semantics.

3 The knowledge artefact lifecycle of the APOSDLE platform

Our first example of a semantic lifecycle is on a more technological level and taken from the APOSDLE project (http://www.aposdle.tugraz.at). This EU project started in 2006 and consists of 12 participating organisations. APOSDLE aims at developing a software platform and tools to support workplace learning. To this end, knowledge workers should be supported in three contexts of their everyday activities:

1 ‘Work’ encompasses the application of knowledge to generate value for or create a product or service. Here, APOSDLE helps to identify the users’ needs and provide them with context-sensitive knowledge tailored to their specific competencies and work situation.

2 ‘Learn’ describes the phases in which knowledge workers are confronted with new tasks or require to improve their competencies in a certain domain. Thus, APOSDLE provides learners with guidance through the company’s externalised knowledge, including both conventional learning material and content originally not intended for learning.

3 ‘Collaborate’ includes attempts to leverage the expertise within a company. In this context, APOSDLE captures artefacts and the information communicated during collaboration and social interaction and helps knowledge workers to get in contact with experts according to their current working context.


The first APOSDLE prototype (cf. Godehardt et al., 2007) includes a semantic layer, namely ‘Homogeneous Access’, for integrating artefacts stored in the internal repository or in one of the external repositories. Owing to the fact that APOSDLE aims at considering all parts of a company’s intellectual capital, i.e., implicit and explicit knowledge, and making it available to the knowledge workers, Mödritscher et al. (2007a) propose to evolve this simple data integration layer into a so-called ‘Multimedia, Multi-source, Metadata-based Knowledge Artefact Repository’. In this context, semantics refers to the information describing the digital artefacts within the APOSDLE system, i.e., metadata fields for different purposes but also relations between the artefacts and in connection with working tasks and competencies.

An architectural design for such a system is depicted on the left-hand side of Figure 2. In summary, this component has its own (internal) file storage and can include external data repositories (‘multi-source’) through configuring the access to them, whereby we implemented interfaces for the file-system, WebDAV, and the protocol of the ConcertChat server. On the other hand, the Knowledge Artefact Repository also enables services and external (e.g., client-sided) tools to enrich resources with metadata-based semantics and use these enhanced artefacts for their purposes, e.g., to provide documents relevant for a certain working context or to generate learning events addressing specific competences.

Figure 2 Architecture of the Knowledge Artefact Repository (left) and the Knowledge Artefact Lifecycle (right), taken from Mödritscher et al. (2007a)

As a consequence, text and multimedia artefacts used by APOSDLE are subject to a certain lifecycle (cf. right-hand side of Figure 2). Moreover, in combination with value-adding metadata, these pure pieces of data evolve into what we call ‘Knowledge Artefacts’, implying that they carry additional meta-information to be applied by ‘intelligent’ services and tools, e.g., for recommendations or to achieve context-sensitivity, personalised access, and so forth. The knowledge artefact lifecycle starts with documents available in one repository of a company (‘URI given’) or to be inserted into the semantic layer of APOSDLE (‘Document given’). In a first step, such a digital object is analysed by the KnowMiner framework, which adds basic metadata (file-type, size, owner, access rights, etc.) and a first set of semantics (relationships to concepts, competencies, and other artefacts). Then, these semantics can be used, modified, and deleted, even together with the artefact if it is stored within the internal storage base.
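The lifecycle described above may be illustrated by the following minimal sketch. It does not reproduce the APOSDLE or KnowMiner interfaces; the classes `KnowledgeArtefact` and `Repository`, the function `analyse`, and all field names are hypothetical stand-ins. The only rule taken from the text is that an artefact can be deleted together with its semantics only if it resides in the internal storage.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeArtefact:
    uri: str
    internal: bool  # True if the file lives in the repository's own storage
    metadata: dict = field(default_factory=dict)   # e.g. file-type, size, owner
    relations: set = field(default_factory=set)    # links to concepts, competencies, artefacts


def analyse(artefact, basic_metadata, relations):
    """Stand-in for the analysis step: attach basic metadata and first relations."""
    artefact.metadata.update(basic_metadata)
    artefact.relations.update(relations)
    return artefact


class Repository:
    """Hypothetical registry of artefacts, keyed by URI."""

    def __init__(self):
        self._artefacts = {}

    def __contains__(self, uri):
        return uri in self._artefacts

    def register(self, artefact):
        self._artefacts[artefact.uri] = artefact

    def get(self, uri):
        return self._artefacts[uri]

    def delete_semantics(self, uri, with_artefact=False):
        """Drop the semantics; drop the artefact too only if stored internally."""
        artefact = self._artefacts[uri]
        if with_artefact and not artefact.internal:
            raise ValueError("only internally stored artefacts can be deleted")
        artefact.metadata.clear()
        artefact.relations.clear()
        if with_artefact:
            del self._artefacts[uri]
```

Keeping the semantics in plain `metadata` and `relations` containers mirrors the idea that services and tools can use, modify, and delete them independently of the underlying file.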

Concerning our MAAME model, the following phases can be identified for the knowledge artefact repository of the APOSDLE platform: (M) Modelling has been driven by domain experts, which led to three models:

a a model of the business processes

b an instructional model for workplace learning

c a competency model to connect documents and experts with each other and include sensitivity towards the working context.

The core model connecting all these aspects is the so-called Competency-Based Knowledge Space Theory (Albert and Lukas, 1999). (A) This semantic model is applied in many ways, e.g., to recommend documents, learning events, and experts in a certain working context. (A, M) One of the main objectives of the APOSDLE project is to automate as much as possible, e.g., by mining concepts from text-based resources or assigning competencies to knowledge workers automatically (cf. scruffy technologies by Lindstaedt et al., 2008). In practice, authoring is still necessary, e.g., to correct one’s own user profile or to improve the models that have been automatically extracted by the KnowMiner. (E) Finally, evaluation of the automated approaches rests on a broad base, comprising all these models. For instance, Mödritscher et al. (2007a) summarise the evaluation on a more technical level, covering soundness, correctness, and the performance of a preliminary version of the Knowledge Artefact Repository.

In fact, APOSDLE comprises a very powerful approach towards a semantic technology supporting knowledge workers through automated analysis of content and recommendations within a network of actors, artefacts, and learning activities. The concepts and technological solutions realised for the single phases of the MAAME model are transferable to many other application areas. Future research might address more sophisticated mining techniques, novel social networking functions and their effects on the knowledge artefacts as well as, most importantly, the evaluation of the Knowledge Artefact Lifecycle in practical settings. Particularly, a feedback loop might be necessary to refine and iteratively improve the semantics generated so far.

4 The background knowledge lifecycle for retrieval-based instruction

A second semantic lifecycle has been developed within the scope of the AdeLE project carried out from 2003 to 2007. AdeLE is the abbreviation of ‘Adaptive e-Learning with Eye-Tracking’ and stands for a national research project partially funded by the Austrian ministries BMVIT and BMBWK, through the FHplus impulse programme (http://adele.fh-joanneum.at). Specifically, AdeLE aims at the development of a technology-based solution exploiting novel methods for retrieval-based instructions and fine-grained user profiling based on real-time eye-tracking and content-tracking information (Gütl et al., 2005). Amongst others, Mödritscher (2008, p.131ff) describes the idea of utilising a Dynamic Background Library to adapt the learning process, precisely to insert new instructions into an online course (‘retrieval-based instruction’).


Hereby, semantics comprising a set of concepts that describes the knowledge domain beyond a course is created for different expertise levels by the teacher and allows retrieving instructional materials from pre-defined digital libraries.

Generally, a Dynamic Background Library improves the goal-oriented knowledge transfer process by enabling the development of a dynamically indexed background library of subject-relevant resources residing outside the static repository. The basic functionality scheme of a Dynamic Background Library, as shown on the left-hand side of Figure 3, depicts the different interaction layers and their dependencies throughout the knowledge transfer process: At the top of this figure, the section ‘Adaptive background knowledge acquisition’ comprises the process of retrieving resources for a pre-defined concept from given repositories (information retrieval systems). The layer in the middle, the section ‘Semantic Knowledge Factory’, includes the semantic model in the form of the background knowledge, which is abstracted as a set of ‘items’ (concepts). In our example, these items are defined for several expertise levels (novice, regular, and expert) and, further, assigned to one or more instructions. The layer at the bottom, the section ‘Adaptive Content Delivery’, deals with the application of the semantic model to adapt learning. In the very first version of the Dynamic Background Library (cf. García-Barrios et al., 2002), adaptivity was realised through four different viewing modes for accessing the background knowledge: (view A) embedded hyperlinks, (view B) end of page, (view C) end of chapter, and (view D) end of content. As a result, the links are selected according to a learner’s expertise and visualised in terms of her preferred viewing mode.
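The selection logic of the middle and bottom layers can be sketched as follows. The sketch assumes a flat list of items; the class `Item`, its field names, and the functions `select_items` and `render_links` are hypothetical, and the `query` field merely stands for whatever request is sent to a retrieval system. Only the three expertise levels and the four viewing modes are taken from the description above.

```python
from dataclasses import dataclass

# The four viewing modes named in the text.
VIEW_MODES = {
    "A": "embedded hyperlinks",
    "B": "end of page",
    "C": "end of chapter",
    "D": "end of content",
}


@dataclass(frozen=True)
class Item:
    concept: str
    level: str               # "novice", "regular", or "expert"
    instructions: frozenset  # ids of the instructions the item is assigned to
    query: str               # hypothetical query for a retrieval system


def select_items(items, instruction_id, expertise):
    """Pick the items defined for this instruction and this expertise level."""
    return [
        item for item in items
        if item.level == expertise and instruction_id in item.instructions
    ]


def render_links(items, view_mode):
    """Label each selected concept with the placement of the chosen viewing mode."""
    placement = VIEW_MODES[view_mode]
    return [f"{item.concept} [{placement}]" for item in items]
```

A course page would call `select_items` with the current instruction and the learner's expertise and then pass the result, together with her preferred viewing mode, to `render_links`.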

Figure 3 Basic functionality scheme of a Dynamic Background Library (left), the semantic model (top right) and the visualisation of the background knowledge (bottom right), taken from García-Barrios et al. (2002) and Mödritscher (2008, pp.113, 138) (see online version for colours)

For the AdeLE prototype, the Dynamic Background Library was re-implemented as a service within the Openwings framework and renamed to ‘Concept-Based Context Modeller’ (Safran, 2008). Referring to the MAAME model, the semantic model (M) is shown on the top of the right-hand side of Figure 3 and describes the relations between instructions, concepts, and queries to different retrieval systems. (A) This model of the background knowledge is utilised to provide additional navigation facilities to access the resources (bottom of the right-hand side of Figure 3). Clicking a link in this area opens a new window and displays the resources, which are dynamically retrieved for the selected concept. According to Mödritscher et al. (2005), it is possible to adapt to various learning contexts, amongst others towards having problems with the course language (provide translations), understanding a concept or passage (provide additional explanations), learning styles (provide an image instead of a text), or the topicality of content (provide latest information). Furthermore, didactical strategies like thematic-driven learning (Dreher et al., 2004) can be realised. (A) Authoring of the background knowledge model has to be done by the facilitator, while (M) mining of such models has not been implemented. (E) The idea of providing background knowledge dynamically was evaluated through several user studies (cf. García-Barrios et al., 2004; Mödritscher, 2008), and learners claimed the precision of the retrieved information. So far, the background knowledge model must be evaluated and refined by teachers.

Besides the weak, more qualitative evaluation results, this approach primarily lacks good mining techniques to create the semantic model (semi-)automatically. However, methods like LSA, e.g., with adequate software like the LSA package for R (Wild, 2007), or frameworks like the KnowMiner (Granitzer, 2006) might be applied for this purpose. Furthermore, a feedback loop improving the knowledge model might be useful to reduce the teachers’ authoring efforts and increase the quality of the semantic information.

5 The learner state lifecycle for adaptive e-learning with eye-tracking

Next to this didactic-aware example of a semantic lifecycle, we also describe an approach towards an adaptation process based on a pedagogical model. Again, this lifecycle is one of the outcomes of the AdeLE project, whereby here semantics refers to learning goals defined by the teacher and real behaviour traits of learners. As mentioned in the previous section, we utilised an eye-tracking device (see left-hand side of Figure 4) to adapt the learning process. To this end, we realised a tagging tool for teachers (in the middle of Figure 4), so that they can define in-content learning goals for text passages, namely ‘to scan’, ‘to read’, and ‘to learn’. On the other hand, the eye-tracking device is capable of determining the cognitive processes ‘scanning’, ‘reading’, and ‘learning’ (right-hand side of Figure 4). After compressing the gaze tracking data, these learner states are stored in the user modelling system (Fröschl, 2005) and compared with the semantics (the learning goals) encoded in the form of microformats (cf. Mödritscher and García-Barrios, 2008). Mödritscher (2008, p.91ff) describes how these learner states are used to adapt learning, which is achieved through a traffic light visualisation within the AdeLE prototype.

According to the MAAME model, (M) learner states are modelled as follows: The learning goals encoded into the (web-based) content describe the targeted states while the eye-tracker derives the current learner states. The instructional entity is considered to be one instruction (page). The difference between the targeted and the current states is submitted to the user profile (the Modelling System of AdeLE’s server-sided architecture) and (A) applied to give the learner feedback, which is a less intrusive and scrutable way to adapt learning. (A) Authoring of the semantics, in fact, has to be done by the teacher, whereby we provide a supportive tool, the Semantic TAGging Editor (STAGE, cf. García-Barrios, 2006). On the other hand, (M) mining comprises the process of observing the learner and deriving her current information processing states. Finally, (E) evaluation was implemented by form-based questionnaires in our case studies, so that teachers could refine the learning goals embedded in the course content.
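The comparison of targeted and current learner states can be sketched in a few lines. Note that the mapping to traffic light colours is an assumption made for illustration and is not taken from the AdeLE prototype: the sketch supposes that the three states form an ordinal scale (scanning, then reading, then learning) and that the colour depends on how far the observed state falls short of the goal. The names `state_gap` and `traffic_light` are hypothetical.

```python
# Learning goals as tagged by the teacher, mapped to the states the
# eye-tracker can derive (both vocabularies are taken from the text).
GOAL_TO_STATE = {"to scan": "scanning", "to read": "reading", "to learn": "learning"}

# Assumption: the states can be ordered by depth of processing.
DEPTH = {"scanning": 1, "reading": 2, "learning": 3}


def state_gap(goal, observed):
    """Difference between targeted and observed state (positive = shortfall)."""
    return DEPTH[GOAL_TO_STATE[goal]] - DEPTH[observed]


def traffic_light(goal, observed):
    """Hypothetical colour rule for the feedback given to the learner."""
    gap = state_gap(goal, observed)
    if gap <= 0:
        return "green"   # goal reached or exceeded
    if gap == 1:
        return "yellow"  # one level below the goal
    return "red"         # two levels below the goal
```

For a passage tagged ‘to learn’ that was only read, `traffic_light("to learn", "reading")` yields the intermediate colour under this rule.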

Figure 4 AdeLE’s eye-tracking system ‘Tobii 1750’ (left), the tagging tool ‘STAGE’ (middle) and the gaze tracking paths for the behaviour ‘reading’ and ‘learning’ (right), taken from Mödritscher (2008, p.93), García-Barrios (2007, p.127) and Pripfl et al. (2006) (see online version for colours)

Overall, this example describes a nearly full lifecycle of semantic information, set into practice successfully (cf. Mödritscher et al., 2006) but lacking pedagogical validity and, more importantly, usability (cf. Mödritscher, 2008, p.121ff). Nevertheless, the outstanding result of this approach is that a learner is observed and meaningful information is derived automatically. Eye and gaze tracking are most probably not the best techniques to be applied here, as other biometric sensors or brain-computer interfaces might lead to more interesting and relevant insights. Furthermore, the application area is restricted to technology-enhanced learning and even pure e-learning; similar to the plethora of application areas of semantic technologies (cf. Tochtermann et al., 2007; Auer et al., 2008), such lifecycles might be valuable for other fields and contexts as well.

6 The good practice lifecycle for a learner-centred TEL approach

A fourth lifecycle to report about has been developed within the scope of the iCAMP project. iCAMP is another research and development project funded by the European Commission under the Information Society Technology programme of FP6 (http://www.icamp.eu). The project aims at creating an infrastructure for collaboration and networking across systems, countries, and disciplines in higher education. Pedagogically, it is based on constructivist learning theories that put emphasis on self-organised learning, social networking, and the changing roles of educators. In the scope of the iCAMP project, we consider semantics to be environment design capabilities and learner interactions being subject to practice sharing through automated analysis techniques and community-enabling features.


With respect to former publications (Mödritscher et al., 2008a, 2008b; Wild et al., 2008; Mödritscher and Wild, 2008; Wild, 2009), the idea of Mash-UP Personal Learning Environments (MUPPLE) comprises a learner-centred approach to technology-enhanced learning and is about empowering learners to design their own learning environments so that they can connect to a network of actors, artefacts, and activities (cf. top left of Figure 5). To this end, we have built a semantic model of learning activities (cf. top right of Figure 5) as well as a learning environment and interaction model. For the latter model, we are using a scripting language, namely the Learner Interaction Scripting Language (LISL); an exemplary script is shown on the bottom left of Figure 5. In practice, learners work with a mashup of learning tools (cf. bottom right of Figure 5).

Figure 5 Concept of a Mash-Up Personal Learning Environment (top left), model of the learning activities behind MUPPLE (top right), exemplary LISL code (bottom left) and learning tool mashup (bottom right), taken from Wild (2009) (see online version for colours)

The MUPPLE approach deals with describing how learners design their learning environment and how they use the tool mashup to collaborate with other actors on shared artefacts. Thus, the first phase of our MAAME lifecycle model, (M) modelling, comprises the formalisation of learning activities and learner interactions as depicted in Figure 5. (A) Application of these models means that LISL scripts can be used, e.g., for personalisation purposes (recommending tool combinations to inexperienced users) or for good practice sharing within a community (by passing ‘activity patterns’ to peers). (A) Authoring of the activities can be achieved by writing the LISL scripts manually or by using the web-based widgets of the MUPPLE prototype. In both cases, new activities can be created from scratch or derived from an existing pattern. So far, good practices are sharable by exporting (parts of) LISL scripts, even with depersonalised passages, and handing them over to other learners. (M) Mining of semantics is done by analysing all activities available and deriving action and tool recommendations for inexperienced learners. (E) Automated evaluation mechanisms, however, are not fully considered so far, while refinement of activities and patterns can be driven by the learning community and the learners themselves.
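The mining step for tool recommendations can be illustrated with a minimal sketch. It assumes that each recorded activity is available as a list of (action, tool) pairs and that a simple frequency ranking is sufficient; the data layout, the function name `recommend_tools` and the example tools are hypothetical and not taken from the MUPPLE prototype.

```python
from collections import Counter, defaultdict


def recommend_tools(activities, action, top_n=3):
    """Rank tools by the number of recorded activities that used them for `action`.

    `activities` is assumed to be a list of activities, each a list of
    (action, tool) pairs. This layout is an illustrative assumption.
    """
    usage = defaultdict(Counter)
    for steps in activities:
        # Count each (action, tool) pair once per activity so that a single
        # long activity does not dominate the ranking.
        for act, tool in set(steps):
            usage[act][tool] += 1
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(usage[action].items(), key=lambda item: (-item[1], item[0]))
    return [tool for tool, _ in ranked[:top_n]]


# Hypothetical activity log of three learners.
activities = [
    [("browse", "feed_reader"), ("summarise", "wiki")],
    [("browse", "wiki"), ("summarise", "wiki")],
    [("browse", "feed_reader"), ("summarise", "blog")],
]

print(recommend_tools(activities, "browse"))     # ['feed_reader', 'wiki']
print(recommend_tools(activities, "summarise"))  # ['wiki', 'blog']
```

A ranking of this kind could serve as a starting point for inexperienced learners, who would remain free to replace the suggested tools in their own environment.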

For the scope of higher education and lifelong learning, we argue for the necessity of personalisation and good practice sharing within learning environments, which primarily aim at initialising meaningful scenarios for learning in networked communities. Although the MUPPLE work is still in progress and has not yet been evaluated by means of real user studies, its key concepts – the support of transcompetences (social and team competences as well as hands-on skills), learning environment design, and end-user development in combination with the design for emergence – seem to be more promising than didactic-driven, top-down, ex ante modelling of learning processes (cf. Wild et al., 2008). In this context, semantic lifecycles are relevant for issues like personalisation, sharing good practices or other community-enabling features.

7 A semantic lifecycle for navigating MRI diagnoses

Finally, we summarise a MAAME lifecycle within the field of medical documentation. Here, we use a semantic model of co-occurrences to provide navigation facilities for medical text corpora. As highlighted by Mödritscher et al. (2007b), we applied a text-mining method to support medical staff in retrieving and navigating medical documents, specifically MRI diagnoses. Basically, the diagnoses are analysed with respect to significant co-occurrences of anatomic regions and pathologic expressions. On the basis of the topological proximity between these two components, we realised a web application which allows medical experts to search for both anatomic and pathologic terms (see left-hand side of Figure 6) and to navigate through a network of these feature pairs (cf. right-hand side of Figure 6).

Figure 6 Filtering MRI diagnoses according to the terms ‘TUMOR’ and ‘KLEINHIRN’ (left) and TouchGraph visualisation for the term ‘GLIOBLASTOMA’ (right), both taken from Mödritscher et al. (2007b) (see online version for colours)

The MAAME lifecycle of this approach can be summarised as follows: (M) The semantic model is provided by domain experts in terms of anatomic and pathologic terms, whereby we used the 6,800 anatomic structures given by Dauber (2005) and enabled the users to manage both anatomic and pathologic expressions. (A) As already mentioned, this semantic model is applied for retrieval of and navigation through the medical material, i.e., the MRI diagnoses. (A) Although anatomic terms are pre-defined, the authoring of both components of the model is possible. (M) Text mining is the most important part of this approach. We started with 6,000 diagnoses on MRIs covering a period of 17 years. In a first step, we pre-process these diagnoses, which were written by medical experts, by splitting them into sentences and calculating first statistics about the term pairs. Then, the algorithm calculates the feature space with a formula for significant co-occurrences given by Heyer et al. (2006). (E) The model and the resulting feature space have been evaluated by medical experts, which has revealed several problematic aspects such as the dependency of medical texts on time periods and geographical areas, the usage of synonyms, grammatical errors, and so forth.
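The pre-processing and scoring steps of the mining phase can be sketched as follows. The sketch splits diagnoses into sentences, counts how often anatomic and pathologic terms occur together in a sentence, and scores each pair. The exact formula of Heyer et al. (2006) is not reproduced in this paper, so the Poisson-based score below is an assumption standing in for it. The naive sentence splitter, the function names and the toy diagnoses are likewise hypothetical.

```python
import math
import re
from collections import Counter
from itertools import product


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cooccurrence_counts(diagnoses, anatomic_terms, pathologic_terms):
    """Return (number of sentences, term frequencies, pair frequencies)."""
    n = 0
    freq = Counter()
    pairs = Counter()
    for text in diagnoses:
        for sentence in split_sentences(text):
            n += 1
            tokens = set(re.findall(r"\w+", sentence.upper()))
            anat = tokens & anatomic_terms
            path = tokens & pathologic_terms
            freq.update(anat | path)           # sentence frequency per term
            pairs.update(product(anat, path))  # (anatomic, pathologic) pairs
    return n, freq, pairs


def poisson_significance(n, a, b, k):
    """Assumed Poisson-style score for k joint occurrences of two terms that
    occur in a and b of n sentences (requires n > 1 and a, b, k >= 1)."""
    lam = a * b / n
    return (lam - k * math.log(lam) + math.lgamma(k + 1)) / math.log(n)


# Toy corpus, invented for illustration only.
diagnoses = [
    "Tumor im Kleinhirn. Kein Hinweis auf Blutung.",
    "Glioblastoma im Frontallappen. Tumor im Kleinhirn.",
    "Kleinhirn ohne Befund.",
]
anatomic = {"KLEINHIRN", "FRONTALLAPPEN"}
pathologic = {"TUMOR", "GLIOBLASTOMA", "BLUTUNG"}

n, freq, pairs = cooccurrence_counts(diagnoses, anatomic, pathologic)
for (anat, path), k in sorted(pairs.items()):
    score = poisson_significance(n, freq[anat], freq[path], k)
    print(f"{anat} / {path}: k={k}, score={score:.3f}")
```

Pairs whose score exceeds a chosen threshold would form the feature space used for retrieval and navigation; the threshold itself is a design decision not specified here.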

Besides enlarging the rather small corpus we have used here, future work in this area should address the evaluation of the overall approach as well as a refinement of the technique. All in all, automated methods and semantic lifecycles seem to be promising, as they support medical experts in their everyday tasks and, consequently, may help to save lives.

8 Conclusions and outlook

In this paper, we introduced a lifecycle model for semantics and described five exemplary lifecycles from former research activities, each one useful within a certain application area. Primarily, approaches implementing such a lifecycle aim at supporting users (e.g., with facilities to navigate medical documents), automating costly steps (e.g., retrieving artefacts or finding experts for the current working context), or increasing the quality of a process (e.g., through adaptation strategies and networked collaboration to improve learning). In this respect, the MAAME model helps to analyse the key phases of such a lifecycle, namely to examine aspects of modelling, applying, authoring, mining, and evaluating semantics. Besides this holistic view on meaningful information necessary for systemic intelligence, the model also allows categorising problematic aspects of semantic technologies and supports the evaluation of their usability, utility or necessity. Consequently, this model can lead to a building block approach for semantic technologies.

Future research addressing semantic lifecycles might aim at examining these five phases in detail and at gaining more experience with each of them. Owing to current trends in social technology and Web 2.0 concepts like ‘harnessing the collective intelligence’, mining approaches are of particular interest these days. As shown by our experiences in the scope of medical documentation, text-mining techniques can be applied to reduce the cognitive load of users and support them in their decisions. Moreover, creating semantics also includes hardware aspects, such as applying new sensor devices like eye-trackers or brain-computer interfaces. Overall, technological and methodical innovations are of central interest for semantic technologies, which is a strong argument for further research activities in this field. To conclude this paper, it has to be emphasised that the application of semantics for the sake of users is of primary importance for all considerations towards semantic lifecycles. Thus, the evaluation of such approaches has to be intensified to a much higher extent than highlighted in our five case studies.


Acknowledgements

The author would like to thank all colleagues from the Institute for Information Systems and Computer Media at the Graz University of Technology and the Institute of Information Systems and New Media at the Vienna University of Economics and Business, particularly Victor Manuel García-Barrios for giving valuable feedback on this paper. Additionally, the author extends his gratitude to all collaborators in his current and past projects, particularly in the research projects AdeLE, APOSDLE, iCAMP, and ROLE.

References

Abbott, B. (1999) ‘The formal approach to meaning: Formal semantics and its recent developments’, Journal of Foreign Languages, Vol. 119, No. 1, January, pp.2–20.

Albert, D. and Lukas, J. (Eds.) (1999) Knowledge Spaces: Theories, Empirical Research, Applications, Lawrence Erlbaum Associates, Mahwah.

All-Babynames.com (2008) Meaning and Origin of Name MAAME, Obtained through the internet: http://www.all-babynames.com/meaning-of-name-Maame.html [accessed 19/11/2008].

Auer, S., Schaffert, S. and Pellegrini, T. (Eds.) (2008) I-SEMANTICS’08, Proceedings of the International Conference on Semantic Systems, Graz, Obtained through the Internet: http://i-know.tugraz.at/content/download/522/2000/file/Proceedings%20I-SEMANTICS.pdf [accessed 17/06/2009].

Berners-Lee, T. and Fischetti, M. (1999) Weaving the Web: Origins and Future of the World Wide Web, Orion Business, Britain.

Berners-Lee, T., Hendler, J. and Lassila, O. (2001) ‘The semantic web’, Scientific American, Vol. 284, No. 5, May, pp.34–43.

Bush, V. (1945) ‘As we may think’, Atlantic Monthly, Vol. 176, No. 1, July, pp.101–108.

Caliusco, M.L., Galli, M.R. and Chiotti, O. (2006) ‘Technologies for data semantic modelling’, International Journal of Metadata, Semantics and Ontologies, Vol. 1, No. 4, pp.322–331.

Dauber, W. (2005) Feneis’ Bild-Lexikon der Anatomie, Thieme, Stuttgart.

Dreher, H., Scerbakov, N. and Helic, D. (2004) ‘Thematic driven learning’, Proceedings of the World Conference on E-Learning in Corporate, Government, Healthcare, & Higher Education (E-Learn 2004), Washington, pp.2594–2600.

Fröschl, C. (2005) User Modeling and User Profiling in Adaptive E-learning Systems, Master Thesis, University of Technology, Graz.

Gangemi, A., Catenacci, C., Ciaramita, M. and Lehmann, J. (2006) ‘Modelling ontology evaluation and validation’, in Sure, Y. and Domingue, J. (Eds.): The Semantic Web: Research and Applications, Berlin, Springer, pp.140–154.

García-Barrios, V.M. (2006) ‘Real-time learner modeling: using gaze-tracking in distributed adaptive e-learning environments’, Proceedings of the International Convention MIPRO 2006 (CE), Opatja, pp.185–190.

García-Barrios, V.M. (2007) Personalisation in Adaptive E-Learning Systems: A Service-Oriented Solution Approach for Multi-Purpose User Modelling Systems, Doctoral Thesis, University of Technology, Graz.

García-Barrios, V.M., Gütl, C. and Mödritscher, F. (2004) ‘EHELP – enhanced e-learning repositories: the use of a dynamic background library for a better knowledge transfer process’, Proceedings of the International Conference on Interactive Computer Aided Learning (ICL 2004), Villach, pp.1–8.


García-Barrios, V.M., Gütl, C. and Pivec, M. (2002) ‘Semantic knowledge factory: a new way of cognition improvement for the knowledge management process’, Proceedings of the International Conference on Society for Information Technology and Teacher Education (SITE 2002), Nashville, pp.168–172.

Godehardt, E., Lokaiczyk, R., Görtz, M., de Jong, T., de Hoog, R., Sikken, J., Bonestroo, W., Kooken, J., Hornung, C., Scheir, P., Ulbrich, A., Beham, G., Pammer, V., Ley, T., Ghidini, C., Calabrese, G., Hoffmann, R. and Hofmair, P. (2007) First Prototype APOSDLE: Deliverables D1.2, D2.2, D3.2, D4.2, D5.2, Obtained through the internet: http://www.aposdle.tugraz.at/content/download/377/1880/file/APOSDLE-First-Prototype.pdf [accessed 24/11/2008].

Granitzer, M. (2006) KnowMiner: Konzeption und Entwicklung eines generischen Wissenserschließungsframework, Doctoral Thesis, University of Technology, Graz [in German].

Gütl, C., Pivec, M., Trummer, C., García-Barrios, V.M., Mödritscher, F., Pripfl, J. and Umgeher, M. (2005) ‘AdeLE (adaptive e-learning with eye-tracking): Theoretical background, system architecture and application scenarios’, European Journal of Open, Distance and E-Learning (EURODL), Vol. II, Obtained through the internet: http://www.eurodl.org/materials/contrib/2005/Christian_Gutl.htm [accessed 19/11/2008].

Heyer, G., Quasthoff, U. and Wittig, T. (2006) Text Mining: Wissensrohstoff Text, Herdecke, W3L [in German].

ISO DIS 13407 (1999) User Centred Design Process for Interactive Systems, International Standards Organisation (ISO), Geneva.

Lindstaedt, S.N., Ley, T., Scheir, P. and Ulbrich, A. (2008) ‘Applying ‘scruffy’ methods to enable work-integrated learning’, The European Journal for the Information Professional (UPGRADE), Vol. 9, No. 3, pp.44–50.

Mödritscher, F. (2008) Adaptive E-Learning Environments: Theory, Practice, and Experience, Verlag Dr. Müller, Saarbrücken.

Mödritscher, F. and García-Barrios, V.M. (2008) ‘Standardization’s all very well, but what about the exabytes of existing content and learner contributions?’, SCORM 2.0 Workshop, Obtained through the internet: http://www.letsi.org/letsi/pages/viewpage.action?pageId=4753958 [accessed 19/11/2008].

Mödritscher, F. and Wild, F. (2008) ‘Personalized e-learning through environment design and collaborative activities’, Proceedings of the Symposium of the WG HCI&UE for Education and Work (USAB 2008), Graz, pp.377–390.

Mödritscher, F., García-Barrios, V.M. and Maurer, H. (2005) ‘The use of a dynamic background library within the scope of adaptive e-learning’, Proceedings of the World Conference on E-Learning in Corporate, Government, Healthcare, & Higher Education (E-Learn 2005), Vancouver, pp.3045–3052.

Mödritscher, F., García-Barrios, V.M., Gütl, C. and Helic, D. (2006) ‘The first AdeLE prototype at a glance’, Proceedings of the World Conference on Educational Multimedia, Hypermedia and Telecommunications (ED-MEDIA 2006), Orlando, pp.791–798.

Mödritscher, F., Hoffmann, R. and Klieber, W. (2007a) ‘Integration and semantic enrichment of explicit knowledge through a multimedia, multi-source, metadata-based knowledge artefact repository’, Proceedings of the International Conference on Knowledge Management (I-Know 2007), Graz, pp.365–372.

Mödritscher, F., Neumann, G., García-Barrios, V.M. and Wild, F. (2008a) ‘A web application mashup approach for e-learning’, Proceedings of the OpenACS and LRN Conference, Guatemala City, pp.105–110.

Mödritscher, F., Tatzl, R., Geierhofer, R. and Holzinger, A. (2007b) ‘Utilizing text mining techniques to analyze medical diagnoses’, Proceedings of the International Conference on Semantic Systems (I-Semantics 2007), Graz, pp.364–371.


Mödritscher, F., Wild, F. and Sigurdarson, S.E. (2008b) ‘Language design for a personal learning environment design language’, Proceedings of the Workshop on Mash-UP Personal Learning Environments (MUPPLE) at the European Conference on Technology Enhanced Learning (EC-TEL 2008), Maastricht, pp.5–13.

Nelson, T.H. (1965) ‘A file structure for the complex, the changing, and the indeterminate’, Proceedings of the ACM National Conference, Cleveland, pp.84–100.

Pripfl, J., Trummer, C. and Pivec, M. (2006) ‘User behaviour detection by means of eye-tracking’, Proceedings of the International Conference on Information Technology Interfaces (ITI), Dubrovnik, pp.17, 18.

Probst, G., Raub, S. and Romhardt, K. (1999) Wissen managen: Wie Unternehmen ihre wertvollste Ressource optimal nutzen, Auflage, Gabler, Vol. 3, Wiesbaden [in German].

Safran, C. (2008) Concept-based Information Retrieval for User-Oriented Knowledge Transfer, Verlag Dr. Müller, Saarbrücken.

Shneiderman, B. (2002) Leonardo's Laptop: Human Needs and the New Computing Technologies, MIT Press, Cambridge.

Tochtermann, K., Haas, W., Kappe, F., Scharl, A., Pellegrini, T. and Schaffert, S. (Eds.) (2007) I-MEDIA’07 and I-SEMANTICS’07, Proceedings of the International Conferences on New Media Technology and Semantic Systems, Graz, Obtained through the Internet: http://triple-i.tugraz.at/content/download/300/1235/file/Proceedings_I-Media_I-Semantics.pdf [accessed 17/06/2009].

Wikipedia (2008) ‘Semantics’, Wikimedia Foundation Inc, Obtained through the internet: http://en.wikipedia.org/wiki/Semantics [accessed 19/11/2008].

Wild, F. (2007) ‘LSA: latent semantic analysis’, CRAN package at: The Comprehensive R Archive Network, Obtained through the internet: http://cran.r-project.org/web/packages/lsa/index.html [accessed 24/11/2008].

Wild, F. (Ed.) (2009) Mash-Up Personal Learning Environments, iCAMP deliverable D3.4, Obtained through the internet: http://www.icamp.eu/wp-content/uploads/2009/01/d34_icamp_final.pdf [accessed 30/3/2009].

Wild, F., Mödritscher, F. and Sigurdarson, S.E. (2008) ‘Designing for change: mash-up personal learning environments’, eLearning Papers, Vol. 2008, No. 9, ISSN: 1887-1542.

Xu, X., Cong, G., Ooi, B.C., Tan, K.L. and Tung, A.K.H. (2004) ‘Semantic mining and analysis of gene expression data’, Proceedings of the International Conference on Very Large Data Bases (VLDB 2004), Toronto, Vol. 30, pp.1261–1264.