
    HealthFinland: a National Semantic Publishing Network and Portal for Health Information

    Osma Suominen a,∗, Eero Hyvönen a, Kim Viljanen a, Eija Hukka b

    a Semantic Computing Research Group (SeCo), Helsinki University of Technology (TKK), Department of Media Technology; University of Helsinki, Department of Computer Science, http://www.seco.tkk.fi

    b National Institute for Health and Welfare (THL), http://www.thl.fi

    Abstract

    Providing citizens with reliable, up-to-date and individually relevant health information on the web is done by governmental, non-governmental, business and other organizations. Currently the information is published with little co-ordination and co-operation between the publishers. For publishers, this means duplicated work and costs due to publishing the same information on many websites. Maintaining links between websites also requires work. From the citizens' point of view, finding content is difficult due to, e.g., differences between laymen's vocabularies and medical terminology, and difficulties in aggregating information from several sites. To solve these problems, we present a national-scale semantic publishing system, HealthFinland, which consists of 1) a centralized content infrastructure of health ontologies and services with tools, 2) a distributed semantic content creation channel based on several health organizations, and 3) an intelligent semantic portal aggregating and presenting the contents from intuitive and health-promoting end-user perspectives for human users as well as for other websites and portals.

    Key words: health information, semantic portal, metadata, faceted search, content aggregation, ontology service

    1. Introduction

    Health information is one of the most frequently searched materials on the web. Among American Internet users, about 8 million adults searched for information on at least one health topic on a typical day in August 2006, most commonly information about specific diseases and problems, treatments, diet and nutrition or exercise [26]. However, a citizen searching for health information on the web faces many challenges [17].

    ∗ Corresponding author. Tel: +358 9 451 3346. Email addresses: [email protected] (Osma Suominen), [email protected] (Eero Hyvönen), [email protected] (Kim Viljanen), [email protected] (Eija Hukka).

    General web search engines are the most common starting points used to find health information [26]. Web search engines require that users know or can guess suitable keywords. Search engines are generally not aware of the context of use and cannot match content to user needs based on, e.g., target audience or other demographic information. Apart from information that search engines can infer from link structure and other statistical measures, there is no notion of trusted information sources or quality measures relevant to assessing the trustworthiness of health information and advice.

    In practice, general web search result listings for common health topics often contain a heterogeneous mix of health-related webpages. As an example, the top twenty search results for “diabetes” retrieved by Google on April 21, 2009 include news sites, non-governmental organizations such as patient groups, a Wikipedia article, commercial sites that promote health-related products and services as well as government sites that seek to inform citizens. More specific queries such as “diabetes pregnant” also turn up user-generated content, e.g., discussion forum postings, blogs and video clips.

    Sorting out relevant, reliable and useful content from this mix, and assessing the quality of health websites, requires background knowledge about the organizations that publish health information on the web. When users do find interesting resources, finding related relevant content is not very easy, as no single website can provide everything a user may be interested in, and the linking between sites is often insufficient. Many common information needs and situations, such as when a baby is born into a family, require aggregation of content from several information providers.

    The quality and trustworthiness of information varies. In many cases, it is difficult to know whether information on a web page is based on scientific research, layman opinions and rumors, or whether it is motivated by commercial interests. Even basic quality indicators such as information sources and publication dates are often absent on health websites, and users do not consistently check for them [26]. Even when good quality information is found, it may not be targeted at the right audience: a treatment recommendation for gestational diabetes issued by a government agency for doctors may be useless for an anxious expectant mother.

    From the health information publishers' point of view, a central problem today is that there are many actors and websites but little coordination and co-operation between them, resulting in duplicate work even between non-competitive organizations such as different government institutions [17]. Reusing pre-existing web content is mainly done by manually linking websites, but such links quickly expire when websites are restructured or shut down. Creating good quality health content is problematic and requires a process for quality control, e.g., regular reviews and updates.

    The annotation of web content with metadata is not very common on health websites, but it is sometimes mandated by, e.g., government standards. In these cases, the annotation processes are often tedious and require expertise in the use of large vocabularies such as the Medical Subject Headings (MeSH) 1.

    HealthFinland 2 is a semantic health information publishing system that addresses these problems from both the publishers' and the citizens' viewpoints. The concept and ideas behind HealthFinland have been presented in [17,18]. The design and implementation of the portal information architecture and user interface were presented in [33], and the design and implementation of the portal are described in more detail in [32]. The FinnONTO ontology service infrastructure and services used in the system are presented in [42,19,36]. This article complements our earlier work and publications by providing a more detailed description of the HealthFinland portal components, the content creation infrastructure, the user interface, and the evaluations performed. Based on the lessons learned in developing the prototype, the system has been developed further and is being published in a production environment as a national health portal.

    In the following, we first present an overview of the ideas underlying HealthFinland and the major components and architecture of the system. We then present the ontologies and other domain vocabularies used in the system as well as the metadata model and the underlying content production system. Finally, we present the user interface of the semantic portal. In each section, lessons learned in designing and implementing the prototype are discussed, and related work is pointed out. In conclusion, contributions of the research are summarized and directions for further research are outlined.

    2. The HealthFinland Approach

    The main reasons for developing HealthFinland using Semantic Web technologies are [17]: facilitating cost-effective, distributed content creation in an interoperable way, aggregating contents automatically based on semantics, and providing the end-users with intelligent services.

    First, HealthFinland minimizes redundant work and costs in creating health content on a national level through collaboration. A goal of the HealthFinland collaborative production network is to ensure that information about a health topic is produced only once and by the organization that has the best professional experience of the topic. By using semantic technologies, the content can then be re-used in different web portals by other organizations, not only on the producing organization's own website. This possibility is facilitated by annotating the content locally with semantic metadata based on shared ontologies, and by making the global repository available through a semantic portal and as mash-up web services.

    1 http://www.nlm.nih.gov/mesh/
    2 http://www.seco.tkk.fi/applications/tervesuomi/

    The second key idea behind HealthFinland is to minimize the content maintenance costs of portals by letting the computer take care of semantic link maintenance and aggregation of content from the different publishers. This possibility is also based on shared semantic metadata and ontologies. New content relevant to a topic may be published at any moment by any of the content providers, and the system is able to put the new piece of information in the right context in the portal and automatically link it with related information.

    The third major idea of HealthFinland is to provide the end-user with intelligent services for finding the right information based on his or her own information needs or conceptual view of health, and for browsing the contents based on their semantic relations. The views and vocabularies used in the end-user interface are independent of the content providers' organizational perspective, and are based on a “layman's” vocabulary that is different from the medical expert vocabularies used by the content providers in indexing the content [33].

    To implement these ideas, HealthFinland has three main components (cf. Figure 1): 1) The Domain Vocabularies describe the different aspects of health information such as topics, genres and audiences, and they are published in an ontology repository as part of the FinnONTO system of mutually interlinked vocabularies/ontologies [19]. 2) The Content Production System contains specifications and tools for annotating, harvesting, and verifying content. 3) The Semantic Portal provides the contents to human users through semantic search and browsing services and to machines, i.e., other web applications, as semantic widgets (called “floatlets”) [23].

    Figure 2 gives a more detailed overview of the HealthFinland system architecture. Metadata and documents are collected from the content publishers either by harvesting content and metadata from their content management systems or by annotating content manually using the SAHA metadata editor [37] connected to ONKI ontology services [36]. The content is validated and possible problems are reported to the content providers. The successfully validated metadata is finally published through the portal for humans and machines to use.

    3. Domain Vocabularies

    A semantic portal needs a set of vocabularies which are referenced by the metadata. The ontologies should describe all the relevant concepts in the application domain of the portal. In HealthFinland, ontologies and other controlled vocabularies are used for describing document subject matter, content genres and target audiences.

    3.1. Subject Vocabularies

    In the case of health information and health promotion, the subject matter ontologies must describe a large number of possible topics such as diseases, treatments and anatomy as well as living habits such as diet, exercise and substance use.

    We started by analyzing pre-existing, established ontologies and other structured vocabularies for describing health-related content. Reusing existing ontologies offers several advantages: it eases semantic interoperability with other applications, saves time and money by avoiding unnecessary ontology engineering work, and helps to ensure broad coverage of the subject area, as established ontologies typically have been used and developed for a long time. We settled on three core subject domain ontologies that are used for describing the subject matter of web contents:

    (i) The Finnish General Upper Ontology (YSO) 3, which includes approximately 20 000 concepts. The YSO ontology was created by transforming the General Finnish Thesaurus YSA 4 into RDF/OWL format using the Protégé editor 5 and by manually crafting the concepts into rdfs:subClassOf hierarchies [19].

    (ii) The international Medical Subject Headings (MeSH), which includes approximately 23 000 concepts. We transformed the vocabulary into the SKOS Core format 6 without changing the semantics of the vocabulary or its structure, using conversion tools and methods developed at the Free University of Amsterdam [38].

    (iii) The European Multilingual Thesaurus on Health Promotion 7 (HPMULTI), which included a Finnish translation. HPMULTI contains approximately 1200 concepts related specifically to health promotion. We transformed HPMULTI into SKOS/RDF in a similar way as MeSH using a custom conversion tool.

    All three vocabularies are required to cover the subject matter of the portal properly. YSO is broad but does not include enough concepts that describe detailed medical content. MeSH, on the other hand, contains many useful medical concepts and is widely used in the health sector, but is focused on clinical healthcare. HPMULTI complements the two vocabularies by focusing on terminology specific to health promotion.

    3 http://www.seco.tkk.fi/ontologies/yso/
    4 http://www.vesa.lib.helsinki.fi
    5 http://protege.stanford.edu
    6 http://www.w3.org/2004/02/skos/
    7 http://www.hpmulti.net/

    [Figure 1 (diagram not reproduced). The HealthFinland System consists of: Semantic Portal (humans: faceted user interface; machines: AJAX widgets); Content Production System (metadata schema; annotation, harvesting and verification tools); Domain Vocabularies (OWL and SKOS vocabularies; Ontology Library ONKI, FinnONTO infrastructure).]

    Fig. 1. Three main components of HealthFinland.

    [Figure 2 (diagram not reproduced). Diagram labels: content management systems of the publishers (Diabetes Assoc, UKK Institute, Heart Assoc, etc.); SAHA metadata editor; ONKI Ontology Service (www.yso.fi); ONKI Selector widget; content aggregation & validation; Metadata Feedback Reports; content editor; HealthFinland Portal (Lucene, Jena RDF store); mash-up floatlets for other portals; citizen end-user.]

    Fig. 2. HealthFinland system architecture.
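A structure-preserving conversion of a thesaurus into SKOS, of the kind described above for MeSH and HPMULTI, can be sketched as follows. This is only an illustration and not the conversion tool used in the project: the record layout, the `example.org` namespace and the function names are assumptions made here, whereas `skos:Concept`, `skos:prefLabel` and `skos:broader` are terms of the SKOS vocabulary.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/thesaurus/"  # hypothetical namespace, not a real vocabulary URI


def thesaurus_to_skos(records):
    """Turn thesaurus records into (subject, predicate, object) triples.

    Each record is a dict with 'id', 'labels' ({language: text}) and an
    optional 'broader' list of parent ids (an assumed input layout).
    """
    triples = []
    for rec in records:
        concept = BASE + rec["id"]
        triples.append((concept, RDF_TYPE, SKOS + "Concept"))
        for lang, text in sorted(rec["labels"].items()):
            # Literals are kept as (text, language) pairs.
            triples.append((concept, SKOS + "prefLabel", (text, lang)))
        for parent in rec.get("broader", []):
            # The thesaurus hierarchy is carried over unchanged as skos:broader.
            triples.append((concept, SKOS + "broader", BASE + parent))
    return triples


def to_ntriples(triples):
    """Serialize the triples in N-Triples syntax."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, tuple):
            text, lang = o
            escaped = text.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"@{lang}'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


records = [
    {"id": "c1", "labels": {"en": "diseases", "fi": "taudit"}, "broader": []},
    {"id": "c2", "labels": {"en": "diabetes", "fi": "diabetes"}, "broader": ["c1"]},
]
triples = thesaurus_to_skos(records)
print(to_ntriples(triples))
```

Because broader/narrower links are copied as they are and no class hierarchy is asserted, the semantics of the source thesaurus are left untouched, which is the property the conversion above aims for.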

    To make the three subject vocabularies semantically compatible with each other, we created the Health Promotion Ontology using YSO as the structural basis and extended it manually, using the Protégé ontology editor, with concepts from the two other vocabularies. Currently, the Health Promotion Ontology contains all concepts from YSO and HPMULTI together with some 2500 concepts from MeSH.

    3.2. Content Genre Vocabulary

    Content genres describe different types of documents based on purpose, form and content [29]. Genres can be specific to the application domain, and in the case of health information, some important genres include articles, tests (for assessing your current health or habits), official recommendations and news. Although genres are currently not very widely used in web search, they provide an interesting means of restricting content searches in addition to subject- or keyword-based searching.

    We could not find a suitable pre-existing genre vocabulary for health information, so we used a lightweight user-centered process very similar to Rosso's card sorting survey method [29] to construct a classification of genres. Five test subjects (Rosso's experiment had only three participants in the initial phase) were asked to sort a selection of forty printed content pages into groups based on their form and structure, and finally to label the groups with descriptive names. An initial genre vocabulary was constructed based on an analysis of the results. The vocabulary was later extended and restructured as new genres were identified that were not well represented in the original experiment. Some categories were merged or renamed based on experiences from user testing. The resulting genre classification currently includes 21 content genres and is expressed as a SKOS Core vocabulary.
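The text does not say how the card sorting results were analyzed. One common approach, shown here purely as an assumed illustration, is to count how many participants placed two pages in the same group and to treat pages that were grouped together often enough as candidates for a shared genre. The page identifiers, the threshold and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts):
    """Count, for each pair of pages, how many participants grouped them together.

    sorts: one entry per participant, each a list of groups (sets of page ids).
    """
    counts = Counter()
    for groups in sorts:
        for group in groups:
            for a, b in combinations(sorted(group), 2):
                counts[(a, b)] += 1
    return counts


def candidate_genres(sorts, threshold):
    """Merge pages that at least `threshold` participants put in the same group."""
    counts = co_occurrence(sorts)
    pages = sorted({p for groups in sorts for g in groups for p in g})
    parent = {p: p for p in pages}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), n in counts.items():
        if n >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for p in pages:
        clusters.setdefault(find(p), []).append(p)
    return sorted(clusters.values())


# Three hypothetical participants sorting four pages.
sorts = [
    [{"p1", "p2"}, {"p3", "p4"}],
    [{"p1", "p2", "p3"}, {"p4"}],
    [{"p1", "p2"}, {"p3"}, {"p4"}],
]
print(candidate_genres(sorts, threshold=2))
```

The resulting clusters are only a starting point: the labels given by participants and later user testing would still decide which groups become named genres.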

    3.3. Target Audience Vocabulary

    Describing the target audience of documents allows content to be selected and promoted to certain users. Audience-based matching requires a vocabulary of audiences and a user interface that makes use of audience information.

    To describe the different target audiences of the HealthFinland portal, we initially created a Target Audience Ontology. We constructed the ontology by taking into account five possible ways of segmenting audiences: by age, gender, occupation, and patient group as well as some other special groups and roles, e.g., parents and exercise teams. We used the classification of occupations published by Statistics Finland 8 and a smaller target audience classification used in a publication database at the UKK Institute 9 as starting points for the ontology. We organized the different occupations based on a division into health professionals (e.g. doctors, nurses and therapists) and other, non-health occupations. The ontology was built using the Protégé ontology editor and transformed into SKOS Core format. The final ontology contains more than 800 target audiences.

    8 http://www.stat.fi
    9 http://www.ukkinstituutti.fi

    However, we soon discovered that the size and structure of the ontology were unwieldy for annotations and that the hierarchies were often unsuitable for processing search queries in a health portal. The occupation classification we used was intended for a different purpose (collecting national level statistics about, e.g., salaries) and mainly distinguished different occupations by the level of education they require, which had little bearing on the needs of portal audiences. Additionally, as the needs of professionals are very different from those of ordinary citizens, we decided to drop health professionals as a target audience altogether in order to better serve ordinary citizens in situations and tasks involving their personal health. Many other websites and portals that fulfil work-related needs of health professionals already exist, and new ones, such as THL.fi 10 published by the National Institute for Health and Welfare, were being developed in parallel to HealthFinland.

    Finally, we settled on a relatively small audience classification that takes into account the needs and goals of users as well as the tone of discussion in the available content. The classification consists of two levels: special categories for specific situations (e.g. families with a baby, occupational health, and information for travellers) as well as three general categories for disease- and problem-oriented information, health improvement, and environmental issues, respectively. The intent is that content that fits one of the special categories is classified therein, while the rest is classified into one of the general categories. The current classification includes nine categories, but more special categories are expected to be defined when new content is added to the HealthFinland portal. The classification is expressed using the SKOS Core vocabulary.
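The rule that special categories take precedence over general ones can be written down as a small selection function. The sketch below is an assumption-laden illustration: the category labels paraphrase the examples given above, the complete list of nine categories is not reproduced, and the order of precedence within each level is invented here.

```python
# Category labels paraphrase the examples in the text; the full list of nine
# categories and any ordering among them are assumptions.
SPECIAL = ["families with a baby", "occupational health", "travellers"]
GENERAL = ["diseases and problems", "health improvement", "environmental issues"]


def choose_audience(candidates):
    """Return the single audience category for a document.

    candidates: the categories an annotator considers applicable.
    A matching special category wins; otherwise a general one is used.
    """
    for category in SPECIAL + GENERAL:
        if category in candidates:
            return category
    raise ValueError("no known audience category among the candidates")


print(choose_audience({"health improvement", "families with a baby"}))
```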

    3.4. National Ontology Library Service ONKI

    Methods for publishing controlled vocabularies and using them in applications are required when the vocabularies are used in professional content creation work, as in the case of the HealthFinland content creation network. At the start of the HealthFinland project, none of the Content Management Systems (CMS) used by the content publishers supported referencing concepts described in controlled vocabularies by URIs in the metadata, as would be required by true Semantic Web applications. Vocabularies also evolve constantly, so methods for publishing them are needed that ensure all parties are always using the latest version of each vocabulary.

    10 http://www.thl.fi

    The HealthFinland content creation network uses the Finnish Ontology Library Service ONKI 11 for publishing all the domain vocabularies discussed above. ONKI is a pilot system for deploying ontologies and using them on a national scale [41]. ONKI focuses especially on publishing ontologies and using them in content indexing and information retrieval through both end-user and application interfaces. ONKI ensures that all content creators use the latest versions of each ontology. From the point of view of the content creators, ONKI is used for browsing the ontologies and for supporting the annotation of content with ontological concepts. ONKI is also used by the HealthFinland content aggregator application for finding concept URIs for term labels.

    4. Content Production System

    A semantic portal requires rich metadata with well-defined object types, properties (fields) and value ranges in order to provide rich search facilities to end users. To ensure interoperability and ease integration of systems, metadata schemes should be based on common standards and frameworks such as Dublin Core [4]. However, general purpose metadata frameworks may be adapted for particular domains and applications by specifying additional constraints and extensions.

    A trade-off exists between the level of semantic richness and ease of metadata creation. If annotating objects requires large amounts of manual work and the results and benefits of metadata production are not clearly visible, less metadata will likely be produced, or shortcuts will be taken that compromise metadata quality [8]. Metadata schemas should focus on effective annotations with an appropriate breadth and expressivity that does not hinder annotation while still giving benefits for the users who rely on the metadata.

    We designed the HealthFinland metadata schema using the Dublin Core Metadata Element Set and the Finnish JHS143 recommendation on document metadata [20] as starting points, choosing only fields that were relevant in the domain of health information. The schema was extended with a domain-specific Genre field that describes health information content types such as article, guide, news item and campaign. The original schema has been published [17], and a detailed metadata specification for system implementors is available [34].

    11 http://www.yso.fi

    The metadata in HealthFinland is represented using RDF, conforming to the recommendations for expressing Dublin Core in RDF [6]. A subset of the metadata can also be embedded in (X)HTML pages using META and LINK elements based on the Dublin Core recommendation [5].
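Embedding Dublin Core fields in an (X)HTML page can be illustrated with a small generator. The sketch follows the general Dublin Core convention of a `schema.DC` LINK element plus `DC.`-prefixed META names; the field values and the function name are assumptions, and the exact set of fields required by the HealthFinland specification [34] is not reproduced here.

```python
from html import escape

DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_head_elements(metadata):
    """Render Dublin Core fields as LINK and META elements for the page head.

    metadata maps a Dublin Core element name to a string or a list of strings.
    """
    lines = [f'<link rel="schema.DC" href="{DC_NS}" />']
    for name, values in metadata.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # escape() protects quotes and ampersands inside attribute values.
            lines.append(
                f'<meta name="DC.{escape(name)}" content="{escape(value)}" />'
            )
    return "\n".join(lines)


# Hypothetical example values.
print(dc_head_elements({
    "title": "Diabetes & pregnancy",
    "language": "en",
    "subject": ["diabetes", "pregnancy"],
}))
```

Repeating a META element, as done for `subject` in the example, is how multi-valued fields are usually expressed in this encoding.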

    4.1. Metadata Production Tools

    Creating good quality metadata requires suitable tools that assist content creators and annotators in the metadata production process. Many content producers use a content management system (CMS) to produce and publish web documents. Adding support for metadata production using the HealthFinland metadata schema into such systems makes it easy for content authors to produce metadata and keep it synchronized with updates in the content. In such a setting many metadata fields, such as publication and modification dates, can also be automatically assigned values.

    To make it easier to assign subjects and other concept values from the vocabularies used in the HealthFinland metadata schema, we have developed the ONKI Selector Widget (Figure 3) [40]. It is an AJAX-enabled component that can easily be integrated into any HTML form and allows the selection of concepts from any vocabulary available on the ONKI server. The concepts picked using the component are stored in an HTML form field like any other metadata. This way, the content management system does not need to be aware of the vocabularies, and the newest vocabulary versions are always directly available from the ONKI server.

    To publish the metadata for use by the HealthFinland system, the content management system must be configured to expose its metadata either embedded into individual HTML pages or as RDF data retrievable using HTTP.

    Not all content publishers use a CMS, and even if they do, adding metadata support to web publishing systems may not always be technically or economically feasible. To create metadata for such content sources, we have developed the browser-based metadata editor SAHA [37] that can be configured with the HealthFinland metadata schema. It was used in HealthFinland for manual annotation of health information. SAHA uses the above-mentioned ONKI Selector Widget for supporting concept lookup from vocabularies available on the ONKI server.

    Fig. 3. Finnish Heart Association's metadata production tool using ONKI Selector Widget. The widget has been integrated into the Navigo portal software used by the association.

    4.2. Content Publishers

    Eight Finnish health organizations contributed to the health contents available in the portal prototype: the Finnish Diabetes Association, the Finnish Medical Society Duodecim, National Public Health Institute, Social and Welfare Services of Oulu, Savonia University of Applied Sciences, Finnish Centre for Health Promotion, Finnish Institute of Occupational Health, and UKK Institute for Health Promotion Research. The system contains almost 3500 health content objects.

    Depending on the organization and content, the content was created in different ways: two organizations used embedded mark-up conforming to the HealthFinland metadata schema on their HTML pages; two organizations provided their content in XML formats that were then transformed into RDF by special transformers; one organization provided a relational database that was similarly transformed into RDF; and three organizations provided the metadata directly in the specified RDF format using the SAHA editor attached to ONKI services. The XML and database sources used different subject vocabularies that had to be mapped to the Health Promotion Ontology before the transformation could be completed.

    4.3. Metadata Aggregation and Quality Control

    Using heterogeneous data sources presents some challenges with regard to data validity and quality control. We have developed an aggregation and validation tool that collects metadata from different sources and processes it into a well-defined format for use in the HealthFinland portal. The metadata is collected either by crawling the web sites of content publishers or by retrieving metadata published in RDF format. Metadata expressed as part of HTML documents is syntactically transformed into RDF, and any concepts expressed using concept labels are converted into URI references using Web Service facilities for concept lookups provided by the ONKI server. Finally, the aggregator validates the metadata against the HealthFinland metadata schema, and produces output in an RDF format that is guaranteed to conform strictly to the HealthFinland metadata schema. Any problems encountered during the aggregation and validation process are reported back to the publisher by creating a problem report, shown in Figure 4, which is implemented as a set of browsable static HTML pages.

    Fig. 4. Metadata Feedback Report
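The steps of the aggregator (extract embedded metadata, resolve concept labels to URIs, validate, report problems) can be sketched in a few lines. This is a simplified illustration under several assumptions: the label-to-URI dictionary stands in for the concept lookup web service of the ONKI server, whose actual interface is not shown here, and the list of required fields and all URIs are hypothetical.

```python
from html.parser import HTMLParser

# Stand-in for the ONKI concept lookup service; the URI is hypothetical.
LABEL_TO_URI = {"diabetes": "http://example.org/onto/diabetes"}
REQUIRED_FIELDS = ("title", "subject")  # assumed, not the actual schema


class DCMetaParser(HTMLParser):
    """Collect the values of <meta name="DC.*" content="..."> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.startswith("DC.") and content:
            self.fields.setdefault(name[3:], []).append(content)


def aggregate(html_text):
    """Return (record, problems) for one HTML page."""
    parser = DCMetaParser()
    parser.feed(html_text)
    record, problems = dict(parser.fields), []

    # Convert subject labels into concept URIs; unknown labels are reported.
    if "subject" in record:
        uris = []
        for label in record["subject"]:
            uri = LABEL_TO_URI.get(label.strip().lower())
            if uri is None:
                problems.append(f"unknown subject label: {label}")
            else:
                uris.append(uri)
        record["subject"] = uris

    # Validate against the (assumed) list of required fields.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    return record, problems


page = (
    '<html><head><meta name="DC.title" content="Living with diabetes" />'
    '<meta name="DC.subject" content="Diabetes">'
    '<meta name="DC.subject" content="Sugar disease"></head></html>'
)
record, problems = aggregate(page)
print(record)
print(problems)
```

In a real deployment the list of problems would feed the feedback report sent to the publisher, and only records without blocking problems would be written out as RDF.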

    4.4. Experiences and Feedback

    To gain some insight into how the initial content creation infrastructure works from the content publishers' point of view, we collected user feedback from annotators after several months of use. The feedback was collected using an e-mail survey and a meeting between the content editors, where the experiences of different content producers were discussed and summarized into a list of notable problems and possible improvements. Overall, the processes and tools were considered functional, but many improvements were also suggested.

    The annotators suggested that the metadata schema and related annotation tools could be improved by removing or hiding some optional fields (e.g., Coverage) that were seldom used. They also requested tools that could help them avoid repetitive work, e.g., by using previous annotations as templates for annotating new content.

    The use of three different complementary subject vocabularies was considered somewhat impractical, as it forced annotators to look up concepts in each vocabulary. The annotators had also encountered situations where a concept they needed was not present in any of the available vocabularies, and requested some means to add new necessary concepts into the vocabularies during the annotation work.

    The feedback reports were considered a useful tool for metadata quality control, but the details shown in the initial reports were often overwhelming. A way to suppress warnings about acknowledged but unimportant problems, such as systematic syntactic errors, was requested.

    The content publishers also requested a way to suppress content from appearing in the HealthFinland portal even though valid metadata had been created. This was necessary for situations where content had been produced that was useful for the intended audience of the originating website but not relevant or useful to the general public.

    4.5. Refined Content Creation Process and Metadata Schema

    Based on the feedback, we made several improvements to the content creation process. The usability of SAHA and the problem reports has been enhanced in many ways, such as fixing user interface glitches and removing unimportant problems from the reports. The metadata schema has been updated, removing many optional fields that were not widely used in practice. The new schema allows the direct use of the Health Promotion Ontology as a subject vocabulary. The content publishers can thus avoid using all the other subject vocabularies except when they consider them necessary due to legacy metadata or requirements of direct semantic interoperability towards third party systems. However, as the Health Promotion Ontology contains mappings to the original source vocabularies, expressed using the SKOS mapping properties, indirect semantic interoperability with systems using the original vocabularies is possible regardless of which subject vocabularies the content publishers decide to use.
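    The indirect interoperability provided by the mappings can be sketched as follows. This is an illustrative example only: the concept identifiers, the document annotations and the choice of which mapping properties to accept are assumptions, and only the property names skos:exactMatch and skos:closeMatch come from the SKOS vocabulary.

```python
# Hypothetical mapping triples: (HPO concept, SKOS mapping property, source concept).
MAPPINGS = [
    ("hpo:smoking", "skos:exactMatch", "src:tobacco-smoking"),
    ("hpo:exercise", "skos:closeMatch", "src:physical-exercise"),
]

# Hypothetical annotations: document id -> Health Promotion Ontology subjects.
ANNOTATIONS = {
    "doc1": {"hpo:smoking"},
    "doc2": {"hpo:exercise"},
    "doc3": {"hpo:smoking", "hpo:exercise"},
}

# Mapping properties treated as strong enough for retrieval (an assumption).
ACCEPTED = {"skos:exactMatch", "skos:closeMatch"}


def documents_for_source_concept(source_concept):
    """Find documents whose HPO subjects are mapped to a source-vocabulary concept."""
    hpo_concepts = {s for s, p, o in MAPPINGS if o == source_concept and p in ACCEPTED}
    return sorted(doc for doc, subjects in ANNOTATIONS.items() if subjects & hpo_concepts)


matches = documents_for_source_concept("src:tobacco-smoking")
print(matches)
```

    A system that only knows the source vocabulary can thus retrieve documents annotated solely with Health Promotion Ontology concepts.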

    The new metadata schema, shown in Table 1, has been improved by removing fields that were not considered useful. Some fields, such as Spatial and Temporal coverage, Part of relations, References and Translations, had not been used in practice by annotators. The Media type field was intended to be used for distinguishing between web content and other publications such as printed books and CDs, but it was dropped as the scope of the portal search functions was restricted to web content. The Source field was sometimes used in the metadata, but its usage was not consistent. The inconsistencies caused some technical and presentation issues in the portal user interface, so the field was dropped from the metadata schema and publishers were instead advised to present sources in the content itself.

    The definitions of some metadata fields were also made stricter. In the original schema, documents could be categorized into multiple Genres, Audiences, Presentation types and Languages. After analyzing the situations where multiple categorizations for these fields had been used, we found that in most cases the content being described was not suitable for inclusion in the portal. For example, when the metadata for a document claimed that it was intended for multiple audiences, in reality it was typically not written for anyone in particular and thus would not be very helpful for any practical task faced by the users of the portal. Thus, we decided to allow a single document to be placed in only one category for each of these fields. Problematic documents that cannot obviously be classified into a single category will either be classified into the most relevant category, reworked, or dropped from the portal entirely. As a side benefit, this restriction also simplified the implementation of the portal user interface and the forms used for annotation.

    Some metadata fields were also added to the schema at the request of annotators. A literal field for Keywords was added with the intent that annotators can use it to describe subjects that cannot be found in any of the available subject vocabularies. Frequently occurring keywords can then be considered candidates for inclusion as concepts in the Health Promotion Ontology. Additionally, a field for preventing the inclusion of content into the HealthFinland portal (NoIndex) was added, to give content publishers a degree of control over the publishing of their content through the HealthFinland portal.
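    The cardinality constraints of the refined schema (Table 1) lend themselves to mechanical checking. The following sketch shows one way such a check could look; it is not the validator used by the aggregator. The QNames and cardinalities follow Table 1, while the record layout (a dictionary of value lists), the function name and the sample values are assumptions.

```python
# (min, max) occurrences per field, following Table 1; None means unbounded.
CARDINALITY = {
    "dc:identifier": (1, 1), "ts:url": (0, 1), "dc:title": (1, 1),
    "dc:description": (1, 1), "dc:language": (1, 1), "dct:issued": (1, 1),
    "dct:dateAccepted": (0, 1), "dct:modified": (0, 1), "dc:publisher": (1, None),
    "dc:creator": (0, None), "dc:rights": (0, None), "ts:noindex": (0, 1),
    "ts:genre": (1, 1), "dc:type": (1, 1), "dc:format": (1, 1),
    "dc:subject": (1, None), "ts:keyword": (0, None), "dct:audience": (1, 1),
}


def check_cardinality(record):
    """Return a list of problems for a record given as {QName: [values]}."""
    problems = []
    for qname, (lo, hi) in CARDINALITY.items():
        n = len(record.get(qname, []))
        if n < lo:
            problems.append(f"{qname}: expected at least {lo} value(s), found {n}")
        elif hi is not None and n > hi:
            problems.append(f"{qname}: expected at most {hi} value(s), found {n}")
    for qname in record:
        if qname not in CARDINALITY:
            problems.append(f"{qname}: field not in schema")
    return problems


# Invented sample record that wrongly claims two audiences.
record = {
    "dc:identifier": ["http://example.org/doc/1"],
    "dc:title": ["Quitting smoking"],
    "dc:description": ["Advice for people who want to stop smoking."],
    "dc:language": ["en"],
    "dct:issued": ["2009-05-01"],
    "dc:publisher": ["http://example.org/org/a"],
    "ts:genre": ["genre:guide"],
    "dc:type": ["dcmitype:Text"],
    "dc:format": ["text/html"],
    "dc:subject": ["hpo:smoking"],
    "dct:audience": ["aud:citizens", "aud:professionals"],
}
problems = check_cardinality(record)
print(problems)
```

    The sample record violates the single-audience restriction described above, so exactly one problem is reported.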

    The specification for representing metadata on web pages has also been updated to reflect the updated recommendations of the Dublin Core workgroup [5]. The representation of metadata on HTML pages now includes a reference to the Dublin Core metadata profile that, in principle, allows the metadata to be parsed and transformed into a valid RDF/XML representation by any software agent that supports the W3C GRDDL recommendation [43]. However, our content aggregator still uses a custom HTML metadata parser as it is more robust than typical GRDDL parsers in the face of syntactical problems in the HTML code and can also process legacy representations of HealthFinland metadata based on our older specifications. Our parser can also make domain-specific assumptions and corrections that a general-purpose GRDDL agent cannot be expected to make.
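    To give an idea of what a tolerant metadata parser involves, the sketch below collects Dublin Core style meta elements from an HTML page that is not well-formed. It uses Python's standard html.parser module and is not the project's parser; the sample markup is simplified and invented, and the case normalisation of the field prefix is an assumed example of a domain-specific correction.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> elements, tolerating sloppy HTML."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content:
            # Normalise the prefix case so "DC.subject" and "dc.subject" are merged.
            prefix, _, local = name.partition(".")
            key = f"{prefix.lower()}.{local}" if local else name.lower()
            self.fields.setdefault(key, []).append(content.strip())


# Invented sample page with an unclosed <p> element and mixed prefix case.
html = """<html><head>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
<meta name="DC.title" content="Quitting smoking">
<meta name="DC.language" content="en">
<meta name="DC.subject" content="smoking">
<meta name="dc.subject" content=" addiction ">
</head><body><p>Unclosed paragraph</body></html>"""

parser = MetaTagParser()
parser.feed(html)
print(parser.fields)
```

    An event-based parser of this kind never requires the document to be well-formed XML, which is one reason it can cope with markup that an XSLT-based transformation would reject.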

    5. Semantic Portal

    In order to fulfil its purpose, a citizens’ health portal must be usable from the perspective of ordinary citizens. The usability of an information portal is affected by, e.g., the layout and functionalities of the user interface. In a semantic portal used for information retrieval, additional factors that affect usability include the search facilities available and the user-visible parts of ontologies and other vocabularies used in the portal.

    The faceted search paradigm [16] was chosen as the basis for the search facilities in the HealthFinland portal, as it had been successfully applied in other semantic search systems [9,15]. In faceted search, users navigate and make search queries in the portal by making category selections along the available facets. However, finding a useful and intuitive set of facets and categorizations in the domain of health information was a challenge.

    In many faceted search systems, the facets are directly constructed using the values found in metadata fields and the vocabularies that define the sets of possible field values [9,27,30]. In some systems, such as MuseumFinland, logical rules are used to define additional facets that do not directly correspond to the structure of the ontology [15].
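    The basic mechanism of faceted search over metadata field values can be shown with a small sketch. The documents, facet names and category values below are invented, and the code illustrates the general technique rather than the portal's Lucene-based implementation.

```python
from collections import Counter

# Invented documents; each facet corresponds directly to a metadata field.
DOCS = [
    {"id": "d1", "topic": "exercise", "genre": "guide", "publisher": "A"},
    {"id": "d2", "topic": "exercise", "genre": "news", "publisher": "B"},
    {"id": "d3", "topic": "nutrition", "genre": "guide", "publisher": "A"},
]


def facet_search(docs, selections):
    """Return the documents that match every selected facet category."""
    return [d for d in docs if all(d.get(f) == v for f, v in selections.items())]


def facet_counts(docs, facet):
    """Count how many of the given documents fall into each category of a facet."""
    return Counter(d[facet] for d in docs if facet in d)


hits = facet_search(DOCS, {"topic": "exercise"})
counts = facet_counts(hits, "genre")
print([d["id"] for d in hits], dict(counts))
```

    Showing the per-category counts for the current result set lets users see which further selections would narrow the results without leading to an empty result.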

    In HealthFinland the main subject matter vocabularies have been created for and by professionals such as doctors, nurses and librarians for use at work or for scientific purposes. Their structure and choice of terms are not always appropriate for the general public and typical tasks and goals related to personal health, so a faceted navigation system directly based on the vocabularies would likely not be very useful for users of a portal devoted to personal health issues. Instead, we constructed new, citizen-centric facets and mapped these to the underlying ontologies. The end-user facets were created using a user-centered card sorting method [33]. In the search facets, the Subject metadata field is represented by the four independent facets Topic, Life event, Group of people and Body part. The additional facets Document type (genre), Publisher and Audience were created by using the values of the corresponding metadata fields directly.

    Table 1. HealthFinland Metadata Schema. Required fields are those with a minimum cardinality of 1.

    Name               QName             Card.  Value type     Value range
    Identifier         dc:identifier     1      URI
    Locator            ts:url            0..1   URL
    Title              dc:title          1      Literal        Non-empty string
    Description        dc:description    1      Literal        Non-empty string
    Language           dc:language       1      Typed literal  RFC 4646
    Publication time   dct:issued        1      Typed literal  W3CDTF (ISO 8601)
    Acceptance time    dct:dateAccepted  0..1   Typed literal  W3CDTF (ISO 8601)
    Modification time  dct:modified      0..1   Typed literal  W3CDTF (ISO 8601)
    Publisher          dc:publisher      1..*   Instance       foaf:Organization
    Creator            dc:creator        0..*   Instance       foaf:Agent
    Rights             dc:rights         0..*   Literal        Non-empty string
    NoIndex            ts:noindex        0..1   Typed literal  Boolean value
    Genre              ts:genre          1      Concept        Genre Classification
    Presentation type  dc:type           1      Concept        DCMI Type vocabulary
    Format             dc:format         1      Typed literal  IANA MIME types
    Subject            dc:subject        1..*   Concept        Subject Vocabularies
    Keyword            ts:keyword        0..*   Literal        Non-empty string
    Audience           dct:audience      1      Concept        Audience Classification

    5.1. User Interface Design and Implementation

    The user interface of the HealthFinland portal was designed using a design process inspired by the discount usability engineering philosophy [24,21] that stresses the application of lightweight usability evaluations performed between incremental design updates. The design was based on experiences with earlier portals and search systems, lightweight design methods such as paper prototypes, and evaluation using heuristics and opportunistic user testing. Our evaluation was focused on a qualitative approach as we tried to find problems early on and thus guide the design process.

    The first user interface sketches were drawn on paper and we made extensive use of design pattern collections such as [35] and earlier faceted search systems. The presentation of search results was designed to show rich metadata that helps users quickly find relevant items and filter out non-relevant results. The design of the search results listing was based on research by Crystal and Greenberg that evaluated the usefulness of metadata for health information seekers [3]. The display of facets and search results is shown in Figure 5.

    In addition to faceted search, the portal provides recommendation links based on ontological knowledge (e.g. “smoking is a risk factor for lung cancer”) and an alphabetical index of concepts.

    The user interface is multilingual and supports Finnish, Swedish and English. Cross-language queries are supported, so that, e.g., an English user interface and English-language categories may be used to perform searches in the Finnish-language content items, demonstrating the potential of multilingual ontologies. However, not all facet categories and ontological concepts have been translated into all three languages.

    Fig. 5. HealthFinland portal with facets and search results. [Labeled elements: Genre facet; Publisher facet; Audience facet; Body part facet; Group of people facet; Life event facet; Subtopic browsing; Search results with rich metadata; Recommendations grouped by genre.]

    In addition to serving end users directly, the portal also provides a floatlet (Figure 6), i.e., an AJAX-based semantic Web widget that can, with a small amount of JavaScript code, be incorporated into other portals to display related content items from the HealthFinland system. This way, the portal contents are exported for machine processing and mash-up applications. The implementation is similar to the related museum item display described in an earlier publication [42].

    The portal was implemented as a Java Servlet application running on Apache Tomcat. It was built using the Tapestry framework and uses Jena for RDF functionality. Search and recommendation functionality was implemented using the Lucene search engine, which we enhanced to handle category and concept queries.

    5.2. Evaluation

    The portal application was evaluated with a series of user tests. The first user interface mock-ups were presented to potential users of the portal and the test subjects were asked to describe the user interface elements they saw. The mock-ups were refined until subjects were able to understand the purpose and function of the user interface.

    The first working implementation of the portal was evaluated with a two-phase usability test. The test subjects represented the target audience of the portal, i.e., ordinary citizens with varying backgrounds such as students of different disciplines, administrative staff at the Helsinki University of Technology and a freelance journalist. A pilot test with two users was used to refine the test setup and the information retrieval tasks, expressed as realistic scenarios involving the test subjects.

    In the second phase, six users completed four retrieval tasks using the portal. The tasks were expressed as realistic situations where the users had to solve problems or seek advice related to their own personal health. For a task to be considered successfully completed, the user was required to come up with a solution backed by information retrieved using the portal and to express being satisfied with the solution. Each task was successfully completed by at least three users, with an average task success rate of 70%. Typical task completion times were 3–10 minutes depending on the difficulty of the task.

    Users started each task from the portal front page that shows the top level of the topic hierarchies, constructed using a user-centered card sorting process [33]. The vast majority of the first actions of test subjects took them closer to their goal, indicating that the front page design was successful and thus that the topic structure was not confusing or misleading at this level. However, deeper in the topic hierarchy the subdivision of topics was often confusing or overwhelming, as it reflected the structure of the underlying ontology. Most users did not make use of secondary query facets or recommendation links between documents, commenting that they didn’t seem to help them achieve their task goals. The alphabetical index was seldom used, but was useful for the users who tried it.

    Fig. 6. AJAX-based semantic Web widget (floatlet) showing dynamic context-based links to the HealthFinland content. The widget has been integrated into another portal by adding a small amount of JavaScript code into the HTML page.

    A post-test questionnaire of subjective usability was used to calculate a System Usability Scale (SUS) [1] score. The scale represents the subjective usability of a system on a scale of 0 (worst) to 100 (best) points. The calculated SUS score of the portal was 70.8 points, representing the finding that most users considered the usability of the portal to be generally good.
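    For readers unfamiliar with SUS, the standard scoring procedure defined by Brooke [1] can be written out as a short sketch. The questionnaire has ten items answered on a scale of 1 to 5; odd-numbered items contribute the response minus one, even-numbered items contribute five minus the response, and the sum is multiplied by 2.5. The responses below are invented and are not data from our user tests.

```python
from statistics import mean


def sus_score(responses):
    """Compute the SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered ones negatively.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5


# Invented questionnaires from two hypothetical respondents.
respondents = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
]
scores = [sus_score(r) for r in respondents]
print(scores, mean(scores))
```

    The score reported for a system is the mean of the individual respondents' scores.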

    5.3. Refined User Interface

    Based on the user tests and feedback from content publishers, we made several changes to the user interface. The most significant change was the separation of search and browsing facilities, which had been combined into one interface in the prototype portal. In the new portal, both the browsing and search user interfaces are based on the categorization of documents by Section (audience), Topic, Situation, Genre and Publisher. Compared with the prototype portal, we have dropped the facets for Group of people, Body part and Life event, which were not considered very useful in our usability tests, and added the Situation facet, which partly serves the same purpose as the Life event facet but is specific to particular topics.

    The section (audience), situations, genre and publisher for a document are directly described in the metadata. The topic of a document is indirectly inferred based on a mapping between the citizen-centered portal topics and subject concepts described in the Health Promotion Ontology, as in the prototype portal.
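    The indirect inference of topics can be sketched as a lookup from subject concepts to portal topics. All identifiers below are invented, and the decision to also follow broader-concept links when matching is an assumption made for this illustration; the text above only states that a mapping between topics and subject concepts is used.

```python
# Hypothetical mapping from citizen-centered portal topics to ontology concepts.
TOPIC_TO_CONCEPTS = {
    "Exercise": {"hpo:physical_activity", "hpo:sports"},
    "Smoking": {"hpo:tobacco_smoking"},
}

# Hypothetical concept hierarchy: concept -> broader concept.
BROADER = {
    "hpo:jogging": "hpo:physical_activity",
    "hpo:physical_activity": "hpo:health_behaviour",
}


def concept_chain(concept):
    """Return the concept followed by its broader concepts, guarding against cycles."""
    chain = []
    while concept is not None and concept not in chain:
        chain.append(concept)
        concept = BROADER.get(concept)
    return chain


def infer_topics(subjects):
    """Infer portal topics from the subject concepts of a document."""
    topics = set()
    for subject in subjects:
        chain = set(concept_chain(subject))
        for topic, concepts in TOPIC_TO_CONCEPTS.items():
            if chain & concepts:
                topics.add(topic)
    return sorted(topics)


topics = infer_topics(["hpo:jogging", "hpo:tobacco_smoking"])
print(topics)
```

    Because the topics are derived rather than annotated, the topic structure of the portal can be revised without touching the metadata of individual documents.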

    The portal has been split into top-level Sections that correspond to the target audiences defined in the Audience Classification. Within each section, the content is organized into browsable Topics, such as the Exercise topic shown in Figure 7, and each topic may contain content areas devoted to specific Situations. The browsing interface resembles many traditional health portals such as BBC Health 12 and Yahoo! Health 13, but the interface is dynamically generated based on the data model that defines the sections, topics and situations which are referenced in the metadata.

    The search facility is shown in Figure 8. Searches are usually initiated by entering keywords, but can be further constrained by the five available search facets.

    6. Related Work

    The idea of the HealthFinland portal architecture and ontology-based faceted search was inspired by earlier semantic portals such as SWED [28], the MultimediaN E-culture demonstrator [31] and MuseumFinland [15]. An early approach to creating and using an ontology infrastructure is presented in [22]. The infrastructure underlying HealthFinland was developed within the FinnONTO project [19] where HealthFinland has been a major application case, in addition to the analogous CultureSampo system [14] for cultural heritage content. Both systems are based on distributed semantic content creation, centralized ontology library services, and a portal supporting semantic search and recommendation [11].

    Vocabulary and ontology development in the health domain has a long tradition and has produced very large and comprehensive systems such as MeSH, UMLS 14 and SNOMED CT 15. In our work, existing vocabularies are used with additional mutual ontology mappings [7] and mappings to a general vocabulary (YSO).

    12 http://www.bbc.co.uk/health/
    13 http://health.yahoo.com

    The pitfalls of metadata about web resources have been well documented [8]. We have tried to avoid some of the problems by creating tools that make vocabularies easily available for annotators and that report problems back to the source. Transforming metadata from literal sources and databases has been discussed e.g. in [25]. In our case manual annotation using the online metadata editor SAHA together with ontology services is also used. While we haven’t performed rigorous evaluations of different annotation interfaces, our own experiences are consistent with a recent end-user study of manual metadata creation using ontology services [12].

    Faceted search, also called view-based search or dynamic taxonomies in IR research, has been studied by various authors for a long time [27,9,30]. Several semantic portals are based on the paradigm, both domain-specific systems such as the SWED environment portal [28] and generic search systems, e.g., /facet [10]. Separating end-user facets from annotation ontologies is discussed in [13]. The automatic construction of a browsing interface based on faceted metadata has been discussed in [2].

    Several health portals share important aspects with our portal. The HONsearch system 16, published by the Health on the Net Foundation, allows the searching of health web sites accredited with the HONcode quality seal using keywords as well as target audiences to restrict the search. The commercial Healia search engine 17 similarly allows searches for health websites to be restricted by gender, age and heritage.

    7. Conclusions

    The HealthFinland system demonstrates how shared meanings, expressed using ontologies, can be used to bind together syntactically and semantically heterogeneous content sources from different publishers. The content creation, validation and aggregation infrastructure, including ontology services and tools, enables the collaborative publication of health content and reduces duplicate work. From an end-user perspective, the underlying semantic technology enables a citizen-centered faceted search user interface as well as more traditional services such as keyword search and topical index. All the tools and services discussed above have been implemented and tested in a real world setting, and the prototype portal has been published on the Web 18. The HealthFinland system was awarded the third prize at the Open Track of the Semantic Web Challenge 2008 competition 19.

    14 http://www.nlm.nih.gov/research/umls/
    15 http://www.ihtsdo.org/snomed-ct/
    16 http://www.hon.ch/HONsearch/Patients/
    17 http://www.healia.com

    Fig. 7. Browsing interface that is dynamically constructed from faceted metadata. [Labeled elements: Audience-based main sections; Sub-topic navigation; Situation-specific advice; Topic content by genre; Related topics; Related content in other sections.]

    The National Institute for Health and Welfare is building a production version of the HealthFinland portal. The first public version of the portal 20 was published in May 2009. The portal utilizes the same content creation infrastructure discussed above, which has been integrated into the Alfresco content management system. The user interface is implemented using Java portlet technology and built using Liferay portal software.

    18 http://demo.seco.tkk.fi/tervesuomi/
    19 http://challenge.semanticweb.org
    20 http://www.tervesuomi.fi

    8. Future Work

    Our ongoing research is focused on building personalization services into the HealthFinland portal, such as push e-mail and recommendation links based on user profile information. The portal will also be expanded with a registry of health services that will be semantically interlinked with the information resources currently in the portal.

    Creating and maintaining the current portal, the content production infrastructure, vocabularies and metadata, requires a relatively large amount of manual work. While vocabulary maintenance is a necessary burden for most semantic applications, we are developing tools and processes that facilitate collaborative ontology development and will apply these to the maintenance of the vocabularies used in HealthFinland.

    Fig. 8. Search interface, allowing queries to be restricted by five facets as well as keywords. [Labeled elements: Keyword search; Section facet; Topic facet; Situation facet; Genre facet; Publisher facet; Search results with rich metadata; “Did you mean...” suggestions; “See also...” suggestions; Topic matches.]

    Manual annotation work could further be reduced using semiautomatic annotation tools [39] to suggest, e.g., document languages and subjects based on text analysis. We are currently investigating the use of automatic categorization and topic extraction for dynamic content – such as news feeds and discussion forum postings – that could then be integrated as content sources into the portal with little or no manual effort.

    The situations referenced by documents in the portal are currently only defined within the content repository of the portal. In the future they may be published as a controlled vocabulary on the ONKI server so that they can be referenced by content producers and third parties.

    While some HealthFinland content publishers already expose their metadata, mainly through HTML meta tags, the aggregated and validated RDF data that the portal relies on is not currently republished back to the Web. In the future the portal will be configured to expose at least some of its metadata. We are also investigating ways to expose the portal search results in machine-processable formats such as RSS feeds, in addition to the AJAX floatlets that we already provide.

    Acknowledgements

    This work is a part of the national semantic web ontology project FinnONTO 21 2003–2010, funded by the National Funding Agency for Technology Innovation (Tekes) and a consortium of 38 companies and public organizations, and the Ministry of Social Affairs and Health. The HealthFinland project is co-ordinated by the National Institute for Health and Welfare in Finland. We thank Johanna Eerola, Marko Ruohoranta, Matti Sarjakoski, Katri Seppälä and Jouni Tuominen for their input to the work reported in this paper.

    21 http://www.seco.tkk.fi/projects/finnonto/

    References

    [1] J. Brooke, SUS: a ‘quick and dirty’ usability scale, in: Usability Evaluation in Industry, Taylor & Francis, London, 1996, pp. 189–194.

    [2] A. Crystal, Facets are fundamental: Rethinking information architecture frameworks, Technical Communication 54 (1) (2007) 15–27.

    [3] A. Crystal, J. Greenberg, Relevance criteria identified by health information users during web searches, Journal of the American Society for Information Science and Technology 57 (10) (2006) 1368–1382.

    [4] Dublin Core Workgroup, Dublin core metadata element set, v. 1.1, http://dublincore.org/documents/dces/ (2008).

    [5] Dublin Core Workgroup, Expressing Dublin Core metadata using HTML/XHTML meta and link elements, http://dublincore.org/documents/dc-html/ (2008).

    [6] Dublin Core Workgroup, Expressing Dublin Core metadata using the Resource Description Framework (RDF), http://dublincore.org/documents/dc-rdf/ (2008).

    [7] J. Euzenat, P. Shvaiko, Ontology matching, Springer–Verlag, Berlin, 2007.

    [8] D. Hawking, J. Zobel, Does topic metadata help with web search?, Journal of the American Society for Information Science and Technology 58 (5) (2007) 613–628.

    [9] M. Hearst, A. Elliott, J. English, R. Sinha, K. Swearingen, K.-P. Lee, Finding the flow in web site search, Communications of the ACM 45 (9) (2002) 42–49.

    [10] M. Hildebrand, J. van Ossenbruggen, L. Hardman, /facet: A browser for heterogeneous Semantic Web repositories, in: International Semantic Web Conference, 2006.

    [11] M. Hildebrand, J. van Ossenbruggen, L. Hardman, An analysis of search-based user interaction on the Semantic Web, Tech. Rep. INS-E0706, CWI, Amsterdam (2007).

    [12] M. Hildebrand, J. van Ossenbruggen, L. Hardman, G. Jacobs, Supporting subject matter annotation using heterogeneous thesauri, a user study in web data reuse, Tech. Rep. INS-E0902, CWI, Amsterdam (2009).

    [13] M. Holi, E. Hyvönen, Fuzzy view-based semantic search, in: Proceedings of ASWC 2006, Beijing, China, Springer–Verlag, Berlin, 2006.

    [14] E. Hyvönen, E. Mäkelä, T. Kauppinen, O. Alm, J. Kurki, T. Ruotsalo, K. Seppälä, J. Takala, K. Puputti, H. Kuittinen, K. Viljanen, J. Tuominen, T. Palonen, M. Frosterus, R. Sinkkilä, P. Paakkarinen, J. Laitio, K. Nyberg, CultureSampo – Finnish culture on the Semantic Web 2.0. Thematic perspectives for the end-user, in: Proceedings of Museums and the Web 2009 (MW2009), Archives & Museum Informatics, Toronto, Canada, 2009.

    [15] E. Hyvönen, E. Mäkelä, M. Salminen, A. Valo, K. Viljanen, S. Saarela, M. Junnila, S. Kettula, MuseumFinland – Finnish museums on the Semantic Web, Journal of Web Semantics 3 (2) (2005) 224–241.

    [16] E. Hyvönen, S. Saarela, K. Viljanen, Application of ontology techniques to view-based semantic search and browsing, in: Proceedings of ESWS 2004, Heraklion, Greece, Springer–Verlag, Berlin, 2004.

    [17] E. Hyvönen, K. Viljanen, O. Suominen, HealthFinland – Finnish health information on the Semantic Web, in: Proceedings of ISWC 2007 + ASWC 2007, Busan, Korea, Springer–Verlag, Berlin, 2007.

    [18] E. Hyvönen, K. Viljanen, O. Suominen, E. Hukka, HealthFinland – publishing health promotion information on the Semantic Web, in: Proceedings of DrMED 2008: International Workshop on Describing Medical Web Resources. The 21st International Congress on the European Federation for Medical Informatics (MIE 2008), Göteborg, Sweden, 2008.

    [19] E. Hyvönen, K. Viljanen, J. Tuominen, K. Seppälä, Building a national Semantic Web ontology and ontology service infrastructure – the FinnONTO approach, in: Proceedings of ESWC 2008, Tenerife, Spain, Springer–Verlag, Berlin, 2008.

    [20] JHS workgroup, JHS 143: Asiakirjojen kuvailun ja hallinnan metatiedot, http://www.jhs-suositukset.fi/suomi/jhs143 (2004).

    [21] S. Krug, Don’t Make Me Think: A Common Sense Approach to Web Usability, 2nd ed., New Riders Press, 2005.

    [22] A. Maedche, B. Motik, L. Stojanovic, R. Studer, R. Volz, An infrastructure for searching, reusing and evolving distributed ontologies, in: Proceedings of WWW 2003, Budapest, Hungary, ACM Press, New York, 2003.

    [23] E. Mäkelä, K. Viljanen, O. Alm, J. Tuominen, O. Valkeapää, T. Kauppinen, J. Kurki, R. Sinkkilä, T. Känsälä, R. Lindroos, O. Suominen, T. Ruotsalo, E. Hyvönen, in: Proceedings of First Industrial Results of Semantic Technologies, co-located with ISWC 2007 + ASWC 2007, Busan, Korea, 2007, CEUR Workshop Proc., Vol 293.

    [24] J. Nielsen, Guerrilla HCI: using discount usability engineering to penetrate the intimidation barrier, in: Cost-justifying usability, Academic Press, Inc., Orlando, FL, USA, 1994, pp. 245–272.

    [25] B. Omelayenko, Porting cultural repositories to the semantic web, in: Proceedings of the 1st Workshop on Semantic Interoperability in the European Digital Library (SIEDL-2008), 2008.

    [26] Pew Internet & American Life Project, Online Health Search (2006).

    [27] A. S. Pollitt, The key role of classification and indexing in view-based searching, Tech. rep., University of Huddersfield, UK, http://www.ifla.org/IV/ifla63/63polst.pdf (1998).

    [28] D. Reynolds, P. Shabajee, S. Cayzer, Semantic Information Portals, in: Proceedings of WWW 2004, Alternate track papers & posters, ACM Press, New York, New York, 2004.

    [29] M. A. Rosso, User-based identification of web genres, Journal of the American Society for Information Science and Technology 59 (7) (2008) 1053–1072.

    [30] m. schraefel, M. Karam, S. Zhao, mSpace: interaction design for user-determined, adaptable domain exploration in hypermedia, in: Proceedings of AH 2003: Workshop on Adaptive Hypermedia and Adaptive Web Based Systems, 2003.

    [31] A. Schreiber, A. Amin, M. van Assem, V. de Boer, L. Hardman, M. Hildebrand, L. Hollink, Z. Huang, J. Kersen, M. Niet, B. Omelayenko, J. Ossenbruggen, R. Siebes, J. Taekema, J. Wielemaker, B. Wielinga, MultimediaN E-Culture demonstrator, in: Proceedings of ISWC 2006, Athens, GA, Springer–Verlag, Berlin, 2006.

    [32] O. Suominen, Käyttäjäkeskeinen moninäkymähaku semanttisessa portaalissa (user-centric faceted search in a semantic portal), Master’s thesis, University of Helsinki, Dept. of Computer Science (February 2008).

    [33] O. Suominen, K. Viljanen, E. Hyvönen, User-centric faceted search for semantic portals, in: Proceedings of ESWC 2007, Innsbruck, Austria, Springer–Verlag, Berlin, 2007.

    [34] O. Suominen, K. Viljanen, E. Hyvönen, TerveSuomi-portaalin metatietomäärittely (HealthFinland portal metadata specification), v. 2.0, Tech. rep., Semantic Computing Research Group, SW 2.0 project (2009).

    [35] J. Tidwell, Designing Interfaces, O’Reilly & Assoc., Sebastopol, California, 2005.

    [36] J. Tuominen, M. Frosterus, K. Viljanen, E. Hyvönen, ONKI SKOS server for publishing and utilizing SKOS vocabularies and ontologies as services, in: Proceedings of ESWC 2009, Heraklion, Greece, Springer–Verlag, Berlin, 2009.

    [37] O. Valkeapää, O. Alm, E. Hyvönen, Efficient content creation on the semantic web using metadata schemas with domain ontology services, in: Proceedings of ESWC 2007, Innsbruck, Austria, Springer–Verlag, Berlin, 2007.

    [38] M. van Assem, V. Malaise, A. Miles, G. Schreiber, A method to convert thesauri to SKOS, in: Proceedings of ESWC 2006, Budva, Montenegro, Lecture Notes in Computer Science, 2006.

    [39] A. Vehviläinen, E. Hyvönen, O. Alm, A semi-automatic semantic annotation and authoring tool for a library help desk service, in: Emerging Technologies for Semantic Work Environments: Techniques, Methods, and Applications, IGI Group, Hershey, USA, 2008.

    [40] K. Viljanen, J. Tuominen, E. Hyvönen, Publishing and using ontologies as mash-up services, in: Proceedings of the 4th Workshop on Scripting for the Semantic Web (SFSW2008) at ESWC 2008, 2008.

    [41] K. Viljanen, J. Tuominen, E. Hyvönen, Ontology libraries for production use: The Finnish ontology library service ONKI, in: Proceedings of ESWC 2009, Heraklion, Greece, Springer–Verlag, Berlin, 2009.

    [42] K. Viljanen, J. Tuominen, T. Känsälä, E. Hyvönen, Distributed semantic content creation and publication for cultural heritage legacy systems, in: Proceedings of the 2008 IEEE International Conference on Distributed Human-Machine Systems, Athens, Greece, IEEE Press, 2008.

    [43] W3C, Gleaning resource descriptions from dialects of languages (GRDDL), W3C Recommendation, 11 Sept 2007, http://www.w3.org/TR/2007/REC-grddl-20070911/ (2007).