Enterprise Terminology Management - Publishers'...

Enterprise Terminology Management as a Basis for Powerful Semantic Services in Content Publishing Publishers‘ Forum 2013 Berlin, 22 of April 2013 Martin Kaltenböck Semantic Web Company www.semantic-web.at Christian Dirschl Wolters Kluwer Deutschland GmbH www.wolterskluwer.de @semwebcompany

Transcript of Enterprise Terminology Management - Publishers'...

Page 1: Enterprise Terminology Management - Publishers' Forumpublishersforum.de/wp-content/...EnterpriseTerminologyManagemen… · Enterprise Terminology Management as a Basis for Powerful

Enterprise Terminology Management as a Basis for Powerful Semantic Services in Content Publishing

Publishers' Forum 2013
Berlin, 22 April 2013

Martin Kaltenböck
Semantic Web Company
www.semantic-web.at

Christian Dirschl
Wolters Kluwer Deutschland GmbH
www.wolterskluwer.de

@semwebcompany


Agenda of the Workshop

Challenges and Introduction

Solution: Linked Controlled Vocabularies

Semantic Services on Top of Terminology Management

Terminology WKD Use Case (C. Dirschl, WKD)

Conclusion & Outlook: a new Business Model?

Q&A and Open Discussion… bring your own Use Cases!

© Semantic Web Company – http://www.semantic-web.at/


Semantic Web Company (SWC)

SWC FACTS
SEMANTIC INFORMATION MANAGEMENT

• Semantic Web Company founded 2001 in Vienna, Austria
• 20 experts in strategy, coding, consulting, research
• Product: PoolParty Suite (launched 2009)
• Serving Global 500 companies
• EU- & US-based consulting services

Partner Network


SWC Customers (excerpt)

World Bank

Roche Diagnostics

Credit Suisse

Wolters Kluwer

Biogen Idec

Wood MacKenzie

UNIQA Insurance AG

Pearson

REEEP

British Museum

Education Services Australia

Daimler

A1 Telekom


Challenges and Introduction


What are the challenges?

• We use different terminologies…
• We use different languages…
• We use different classification systems…
• We use different metadata management systems…
• We use different glossaries and definitions…
• We use content from several data silos…

[Figure labels: Innovationmanagement, Innovation management, HR, Marketing]


I am using a special Terminology ;)

Terminology = Controlled Vocabulary = SKOS Thesaurus
SKOS = Simple Knowledge Organization System
L(O)D = Linked (Open) Data
Linked Controlled Vocabularies = using L(O)D principles
Concept based tagging = semantic tagging = semantic annotation
URI = Uniform Resource Identifier
…


Taxonomy – Thesaurus – Ontology

What is a thesaurus, and how does it differ from a taxonomy or an ontology?

A thesaurus is expressive enough to improve most enterprise applications significantly, but it is not too complex to create and maintain in a sustainable way.


SKOS stands for ‚Simple Knowledge Organization System‘

• W3C Standard since 2009
• Based on Semantic Web standards
• Open for linking with additional linked data

http://www.w3.org/2004/02/skos/


What is a Concept? The Semiotic Triangle

[Figure: semiotic triangle with the corners concept, label and object. Labels shown: "A-Class", "A-Klasse", "W 176". The concept is a mental model of "A-Class"; another object corresponds to another mental model of "A-Class".]
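The separation of concept and label can be made concrete with a small data structure. The following is a minimal Python sketch and not part of the original slides: the `Concept` class, the `find_concept` helper and the example.com URI are assumptions for illustration, and filing "W 176" as an English alternative label is a guess. Only the three labels themselves come from the slide.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept identified by one URI, carrying labels in several languages."""
    uri: str
    pref_labels: dict[str, str] = field(default_factory=dict)       # language -> preferred label
    alt_labels: dict[str, list[str]] = field(default_factory=dict)  # language -> alternative labels

    def all_labels(self) -> set[str]:
        labels = set(self.pref_labels.values())
        for alternatives in self.alt_labels.values():
            labels.update(alternatives)
        return labels


# Hypothetical URI; the labels are the ones shown on the slide.
a_class = Concept(
    uri="http://example.com/vocab/a-class",
    pref_labels={"en": "A-Class", "de": "A-Klasse"},
    alt_labels={"en": ["W 176"]},  # assumption: treated as an alternative label
)


def find_concept(label: str, concepts: list[Concept]) -> Concept | None:
    """Return the first concept that carries the given label (case-insensitive)."""
    wanted = label.casefold()
    for concept in concepts:
        if any(wanted == known.casefold() for known in concept.all_labels()):
            return concept
    return None
```

Whichever label a user types, the lookup resolves to the same concept URI, which is the point of tagging with concepts instead of strings.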


Concept-tagging vs. Term-tagging

[Figure: content from a CMS is tagged in two ways against the enterprise vocabulary, by concept tagging and by term tagging. "Term-tags" become "concepts" as part of the enterprise vocabulary.]

Concept-tagging is done on top of concepts which are already part of the enterprise vocabulary, thus contextualised and linked to other concepts.

Term-tagging means that tags are extracted from text (automatically via text mining) which are not part of the controlled vocabulary yet.

Term-tags can be inserted into the enterprise vocabulary. This extends and refines the vocabulary more and more.
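The difference between the two tagging modes can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the text-mining method used in PoolParty: the vocabulary entries, the URIs, the stopword list and the frequency threshold are all invented, and label matching is a naive substring test.

```python
import re
from collections import Counter

# Hypothetical enterprise vocabulary: label (lower case) -> concept URI.
VOCABULARY = {
    "innovation management": "http://example.com/vocab/innovation-management",
    "lean production": "http://example.com/vocab/lean-production",
    "lean manufacturing": "http://example.com/vocab/lean-production",  # synonym, same concept
}

STOPWORDS = {"the", "and", "of", "in", "our", "to", "a", "is", "for", "with"}


def concept_tags(text: str) -> set[str]:
    """Concept tagging: URIs of vocabulary concepts whose labels occur in the text."""
    lowered = text.lower()
    return {uri for label, uri in VOCABULARY.items() if label in lowered}


def term_tags(text: str, min_count: int = 2) -> set[str]:
    """Term tagging: frequent words that no vocabulary label covers yet."""
    words = re.findall(r"[a-z]+", text.lower())
    known = {word for label in VOCABULARY for word in label.split()}
    counts = Counter(w for w in words if w not in STOPWORDS and w not in known)
    return {word for word, count in counts.items() if count >= min_count}


def promote(term: str, uri: str) -> None:
    """Insert a reviewed term tag into the vocabulary so that it becomes a concept."""
    VOCABULARY[term.lower()] = uri


text = ("Lean manufacturing reduces waste. Kanban boards support lean "
        "manufacturing, and kanban is popular in production.")
```

Calling `term_tags(text)` proposes "kanban" as a candidate; after `promote("Kanban", ...)` the same word is found by `concept_tags` instead, which mirrors how the vocabulary is extended and refined over time.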


Solution: Linked Controlled Vocabularies


Linked Controlled Vocabularies

Using Linked (Open) Data Principles

Linked Data Principles (Tim Berners-Lee):
• Use URIs to denote things.
• Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
• Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF, SPARQL.
• Include links to other related things (using their URIs) when publishing data on the Web.

WHY?
• To enable connected vocabularies over several departments (also different languages)
• To enrich a Terminology in the areas of concepts, synonyms, definitions, relations…
• To enable contextualization / data integration linking different Terminologies


Linked Controlled Vocabularies

1. Each concept is in one or many concept schemes
2. Each concept has one URI
3. Each concept has one or more labels
4. (Poly-)hierarchical and non-hierarchical relations
5. Matching between concepts from various sources
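The five properties listed on this slide map naturally onto a small data model. The Python sketch below is an assumption-laden illustration, not PoolParty's internal model: the `Concept` class, the `validate` and `ancestors` helpers and the example.com namespace are hypothetical. It checks rules 1 to 3 and walks a polyhierarchy in which one concept has two broader concepts.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str                                          # 2. exactly one URI
    schemes: set[str]                                 # 1. one or many concept schemes
    labels: list[str]                                 # 3. one or more labels
    broader: set[str] = field(default_factory=set)    # 4. hierarchical, several parents allowed
    related: set[str] = field(default_factory=set)    # 4. non-hierarchical
    matches: set[str] = field(default_factory=set)    # 5. URIs of concepts in other sources


def validate(concepts: list[Concept]) -> list[str]:
    """Return human-readable violations of rules 1 to 3."""
    problems: list[str] = []
    seen: set[str] = set()
    for concept in concepts:
        if concept.uri in seen:
            problems.append(f"duplicate URI: {concept.uri}")
        seen.add(concept.uri)
        if not concept.schemes:
            problems.append(f"no concept scheme: {concept.uri}")
        if not concept.labels:
            problems.append(f"no label: {concept.uri}")
    return problems


def ancestors(uri: str, by_uri: dict[str, Concept]) -> set[str]:
    """Collect all broader concepts transitively; works for polyhierarchies."""
    found: set[str] = set()
    todo = list(by_uri[uri].broader)
    while todo:
        parent = todo.pop()
        if parent not in found:
            found.add(parent)
            if parent in by_uri:
                todo.extend(by_uri[parent].broader)
    return found


BASE = "http://example.com/regions/"  # hypothetical namespace
concepts = [
    Concept(BASE + "Europe", {"regions"}, ["Europe"]),
    Concept(BASE + "Benelux", {"regions"}, ["Benelux"], broader={BASE + "Europe"}),
    Concept(BASE + "EU", {"regions"}, ["European Union", "EU"], broader={BASE + "Europe"}),
    Concept(BASE + "Belgium", {"regions"}, ["Belgium", "Belgien"],
            broader={BASE + "Benelux", BASE + "EU"}),
]
by_uri = {concept.uri: concept for concept in concepts}
```

Here `ancestors(BASE + "Belgium", by_uri)` returns Benelux, EU and Europe, even though Europe is reachable over two different paths.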


Linked Controlled Vocabularies

• Simple Knowledge Organization System (SKOS) is a W3C standard to develop enterprise vocabularies
• SKOS provides several properties for vocabulary linking (mapping):
  – skos:exactMatch
  – skos:closeMatch
  – skos:broadMatch
  – skos:narrowMatch
  – skos:relatedMatch

http://www.w3.org/TR/2009/REC-skos-reference-20090818/
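To show what such a mapping looks like as data, the Python sketch below formats one mapping statement as an N-Triples line and derives the statement that holds in the opposite direction. The helper names and the example URIs are hypothetical; the namespace and the inverse and symmetric relationships follow the SKOS Reference linked above.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Inverse of each mapping property as defined in the SKOS Reference:
# broadMatch and narrowMatch are inverses, the other three are symmetric.
INVERSE = {
    "exactMatch": "exactMatch",
    "closeMatch": "closeMatch",
    "relatedMatch": "relatedMatch",
    "broadMatch": "narrowMatch",
    "narrowMatch": "broadMatch",
}


def mapping_triple(subject: str, prop: str, obj: str) -> str:
    """Format one mapping statement as an N-Triples line."""
    if prop not in INVERSE:
        raise ValueError(f"not a SKOS mapping property: {prop}")
    return f"<{subject}> <{SKOS}{prop}> <{obj}> ."


def inverse_triple(subject: str, prop: str, obj: str) -> str:
    """Format the statement that holds in the opposite direction."""
    return mapping_triple(obj, INVERSE[prop], subject)


# Hypothetical concepts in two different vocabularies.
ours = "http://example.com/vocab/belgium"
theirs = "http://example.org/geo/benelux"
forward = mapping_triple(ours, "broadMatch", theirs)
backward = inverse_triple(ours, "broadMatch", theirs)
```

Stating the mapping once and deriving the inverse keeps both vocabularies navigable from either side without maintaining two hand-written links.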


Semantic Services on Top of Terminology Management


Semantic Services on Top of Terminology Management

Topic Pages & Dossier Pages

SEO / SEM

Semantic Search

Recommender Systems

Content Aggregation

Data Integration (Services)

Matchmaking Services

Smart Glossary Services


Smart Glossary Services
Example: Schools Online Thesaurus

Live demo: http://scot.curriculum.edu.au/


Dossier Pages: From 'Gopher' to 'Super-Mashups'

Live demo: http://www.reegle.info/countries


Topic Pages: Mashups providing a quick overview

[Figure: a topic page combining a short description, related concepts, geo-search, and content (Twitter, videos etc.) from several different sources (API, http://, CMS)]


Content Aggregation
Example: GBPN News Aggregator

Live demo: http://www.gbpn.org/newsroom/news-aggregator


SKOS & Linked data alignment

Live demo: http://bit.ly/semantic_search


The Business Perspective: Costs of Data Integration


Source: Price Waterhouse Coopers – Technology Forecast, Spring 2009


Semantic Search

[Figure: a search for "Innovation management methods" across the departments HR, Marketing/Sales, Research and Production]

Live demo: http://pilot4.poolparty.biz/alcedo/


Querying structured data AND unstructured data in one step

Example query: "Show me industry news that mentions countries or regions to which our export volume has increased by at least 10% over the last 5 years, and that deals with one of our products and/or one of our competitors."

[Figure: (federated) SPARQL queries combine industry news with export statistics]
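The logic of the example query can be traced in plain Python. This is only a stand-in for the (federated) SPARQL queries named on the slide, with in-memory dictionaries instead of SPARQL endpoints; all figures, country names, product names and competitor names are invented for illustration.

```python
# Structured data: export volume per country and year (invented figures).
EXPORTS = {
    "Belgium": {2008: 100.0, 2013: 115.0},
    "Austria": {2008: 200.0, 2013: 205.0},
    "Brazil": {2008: 50.0, 2013: 80.0},
}

# Unstructured data after semantic tagging: each news item carries concept tags.
NEWS = [
    {"title": "Competitor X opens plant in Belgium",
     "countries": {"Belgium"}, "topics": {"Competitor X"}},
    {"title": "Austrian demand for Product A flat",
     "countries": {"Austria"}, "topics": {"Product A"}},
    {"title": "Brazil hosts football tournament",
     "countries": {"Brazil"}, "topics": {"Sports"}},
    {"title": "Product A launches in Brazil",
     "countries": {"Brazil"}, "topics": {"Product A"}},
]

OUR_PRODUCTS = {"Product A"}
OUR_COMPETITORS = {"Competitor X"}


def growing_markets(exports, start, end, min_growth=0.10):
    """Countries whose export volume grew by at least min_growth between two years."""
    markets = set()
    for country, by_year in exports.items():
        base, latest = by_year[start], by_year[end]
        if base > 0 and (latest - base) / base >= min_growth:
            markets.add(country)
    return markets


def relevant_news(news, markets, topics):
    """News that mentions a growing market AND one of the given topics."""
    return [item["title"] for item in news
            if item["countries"] & markets and item["topics"] & topics]


markets = growing_markets(EXPORTS, 2008, 2013)
hits = relevant_news(NEWS, markets, OUR_PRODUCTS | OUR_COMPETITORS)
```

The join works only because news items and statistics refer to countries through the same identifiers, which is what a shared controlled vocabulary provides.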


Terminology WKD Use Case (C. Dirschl, WKD)


Content Supply Chain

Content Acquisition → Content Enrichment → Composing/Bundling → Publishing/Interfacing → Sales → Customer Service → Customer

Content Acquisition
• Manually collecting data from different sources
• Most information is not publicly available
• 1:1 contractual relationships with authors

Content Enrichment, Composing/Bundling
• Using internal taxonomies and thesauri
• Mainly manual enrichment
• Linking of WK content only

Publishing, Interfacing
• Publishing mainly in the context of a distinct product
• Publishing of texts, not information

Sales, Customer Service
• Online libraries as isolated applications
• Hardly any integration with Web content
• Only first steps in integration of client software and content


Jurion Platform

• jDesk: Real integration in local processes
• jCloud: Secure access and mobility
• jStore: Access to many sources and immediate usage
• jBook: Individualisation of content
• jLink: Networking and Personalisation
• jCreate: Create and sell knowledge
• jSearch: Semantic search on legal information


Overview Search and Content Enrichment architecture

[Figure: architecture in three stages. Items marked * carry domain specific requirements.]

Enrichment
• Sources: CMS, customer content, metadata DB/services, www… crawler, 3rd party content, UGC (via import paths)
• Classification*
• Metadata Recognition
• Content Enrichment

Preprocessing/Indexing
• Concept Recognition*
• Doc. Segmentation
• Normalization
• Index

Search
• User Query
• Query Analysis: Concept Recogn.*, Named Entity Recogn., Semantic expansion*, Link to Taxonomy*
• Search
• Search Result (Raw)
• Result Analysis: Relevance Ranking
• Refinement: Data organization (e.g. faceting), Further analysis (e.g. ontology, linked data)
• Search Result (Final)
• Search Feedback (e.g. ontology)
• User Information
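The query analysis step of this architecture (concept recognition followed by semantic expansion and relevance ranking) can be sketched as follows. This is a toy Python illustration, not the Jurion implementation: the thesaurus entries, the documents and the scoring by number of matched terms are assumptions, and matching is a plain substring test.

```python
from __future__ import annotations

# Hypothetical thesaurus: preferred label -> synonyms and narrower concepts.
THESAURUS = {
    "innovation management": {
        "synonyms": ["innovationmanagement"],
        "narrower": ["open innovation"],
    },
    "open innovation": {"synonyms": [], "narrower": []},
}

DOCUMENTS = {
    "HR handbook": "Our Innovationmanagement training for new staff",
    "Research memo": "Open innovation and innovation management methods compared",
    "Canteen menu": "Soup of the day",
}


def recognise_concepts(query: str) -> list[str]:
    """Concept recognition: thesaurus labels that occur in the query."""
    lowered = query.lower()
    return [label for label in THESAURUS if label in lowered]


def expand(concepts: list[str]) -> set[str]:
    """Semantic expansion: add synonyms and narrower concepts to each recognised concept."""
    terms: set[str] = set()
    for label in concepts:
        entry = THESAURUS[label]
        terms.add(label)
        terms.update(entry["synonyms"])
        terms.update(entry["narrower"])
    return terms


def search(query: str, documents: dict[str, str]) -> list[str]:
    """Return matching document titles, ranked by number of matched terms."""
    terms = expand(recognise_concepts(query))
    scores: dict[str, int] = {}
    for title, text in documents.items():
        lowered = text.lower()
        score = sum(1 for term in terms if term in lowered)
        if score:
            scores[title] = score
    return sorted(scores, key=lambda title: (-scores[title], title))


results = search("Innovation management methods", DOCUMENTS)
```

The HR handbook is found although it spells the term as one word, because the expansion adds the synonym. A query without any recognised concept returns nothing in this sketch; a real system would presumably fall back to keyword search.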


Jurion – Autosuggest from dedicated knowledge domain database

Domain knowledge in PoolParty is the basis for autocomplete; detailed legal concepts are offered rather than keywords.
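A concept-based autosuggest of this kind can be approximated with a sorted label index. The Python sketch below is an assumption, not the Jurion or PoolParty implementation: the `Autosuggest` class, the English legal labels and the example.com URIs are invented. Each suggestion returns the concept URI together with the label, so the search can continue with a concept and not a string.

```python
from __future__ import annotations

import bisect


class Autosuggest:
    """Prefix lookup over concept labels using binary search on a sorted list."""

    def __init__(self, labels: dict[str, str]) -> None:
        # Sort by case-folded label so that labels sharing a prefix are adjacent.
        self._entries = sorted(
            (label.casefold(), label, uri) for label, uri in labels.items()
        )
        self._keys = [key for key, _, _ in self._entries]

    def suggest(self, prefix: str, limit: int = 5) -> list[tuple[str, str]]:
        """Return up to `limit` (label, concept URI) pairs starting with the prefix."""
        wanted = prefix.casefold()
        start = bisect.bisect_left(self._keys, wanted)
        hits: list[tuple[str, str]] = []
        for key, label, uri in self._entries[start:]:
            if len(hits) >= limit or not key.startswith(wanted):
                break
            hits.append((label, uri))
        return hits


# Hypothetical legal concepts: label -> concept URI.
legal = Autosuggest({
    "Dismissal": "http://example.com/legal/dismissal",
    "Dismissal protection": "http://example.com/legal/dismissal-protection",
    "Notice period": "http://example.com/legal/notice-period",
    "Purchase contract": "http://example.com/legal/purchase-contract",
})
```

Typing "dis" yields both dismissal concepts in alphabetical order, each with its URI.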


PoolParty for Metadata Storage and Development

Tool for storing the domain knowledge vocabulary; independent of content and metadata database; sound basis for applied knowledge management


Pebbles for Additional Metadata Assignment

Vocabulary maintained in PoolParty is assigned to content via an editorial workflow; additional free metadata can also be applied.


Pebbles as a means to include external knowledge

Leveraging the external knowledge available in the Semantic Web; automatic inclusion of e.g. synonyms, definitions and references.


Linked Data Publishing

vocabulary.wolterskluwer.de


Cooperation between SWC and WKD

Metadata Management

Text Mining

Data Integration

Semantic Search

Thesaurus Management

Knowledge Extraction

Knowledge Model Creation

Knowledge Model Maintenance

Knowledge Model Development

Open Data Usage

Linked Data Usage

Wolters Kluwer

Semantic Web Company


Conclusion & Outlook: a new Business Model?


Enterprise Terminologies: An Explicit Metadata Layer

• Metadata are stored and processed separately from data
• Metadata management is part of the enterprise information management strategy

[Figure labels: HR, Marketing/Sales, Research, Production]


Linked enterprise vocabularies are the backbone for a semantic infrastructure

[Figure: information integration on the semantic level, feeding applications with integrated views. Concepts shown: http://company.com/research/1452, http://company.com/production/729 (labels "Lean manufacturing" and "Lean production"), http://company.com/regions/Belgium, http://company.com/regions/Benelux. Link types shown: broader, related, match.]
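How matched concepts produce an integrated view can be sketched with a small grouping routine. The Python below is an illustration under assumptions: it treats the two "lean" concepts from the slide as matching each other, which the extracted figure suggests but does not state explicitly, and the document titles and helper names are invented.

```python
def merge_matches(matches):
    """Group URIs connected by match links (simple union-find); returns URI -> representative."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path compression
            uri = parent[uri]
        return uri

    for left, right in matches:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left
    return {uri: find(uri) for uri in list(parent)}


def integrated_view(tagged_docs, canonical):
    """Collect documents from all departments under one representative concept."""
    view = {}
    for doc, uri in tagged_docs.items():
        key = canonical.get(uri, uri)
        view.setdefault(key, []).append(doc)
    return {key: sorted(docs) for key, docs in view.items()}


RESEARCH = "http://company.com/research/1452"    # URI from the slide
PRODUCTION = "http://company.com/production/729"  # URI from the slide

# Assumption: the two department concepts are linked by a match relation.
MATCHES = [(RESEARCH, PRODUCTION)]

# Hypothetical documents, each tagged with a department-specific concept.
TAGGED_DOCS = {
    "R&D report": RESEARCH,
    "Plant manual": PRODUCTION,
}

canonical = merge_matches(MATCHES)
view = integrated_view(TAGGED_DOCS, canonical)
```

Both documents end up under one entry although each department tagged with its own URI, which is the integrated view an application would show.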


A Business Model for Publishers?

Experienced publishers can provide support in each of these steps:

1. Publishers have expertise in their specific domain and can support others with this knowledge about adequate concepts and their usage.

2. Publishers can advise partners or customers on the processes involved in creating standardized data or transforming existing data into the desired format.

3. Publishers can take over the creation of taxonomies or thesauri by using existing resources or engaging their internal domain experts' network.

4. Publishers can support enrichment by planning and executing the linking with external (cloud) or internal (publisher's) resources, and by managing the quality of that linking.

5. Curation can be executed manually or automatically by specialized tools. Publishers may have more experience in improving data quality and appropriate tools at hand.

6. The value of controlled vocabularies lies in the internal structural processes. They can improve functionalities of applications or enable additional services and even completely new applications. Publishers can help to exploit the potential of these data and to monetize the advantages of already existing applications by introducing proper showcases.

7. Maintenance is also an important topic that has to be taken into account, as language, data and information change over time. This service can be offered by publishers.

Publishers could therefore support the implementation of external linked data infrastructures through process consulting and content expertise.

Source: A systemic perspective on linked open vocabularies (Blumauer, Dirschl, Eck, Pellegrini)


http://www.semantic-web.at/
http://poolparty.biz

Martin Kaltenböck
Managing Partner & CFO
m.kaltenboeck@semantic-web.at

Christian Dirschl
Content [email protected]

"We are happy about any comments and questions – and please bring in your own use cases now!"