Thesaurusmanagement Quickstart Introduction. What are controlled vocabularies? organized arrangement...


Page 1: Thesaurusmanagement Quickstart Introduction. What are controlled vocabularies? organized arrangement of words and phrases used to index content and/or.

Thesaurusmanagement Quickstart

Introduction

Page 2:

What are controlled vocabularies?

• organized arrangement of words and phrases

• used to index content and/or to retrieve content through browsing or searching

• include preferred and variant terms

• have defined scope or describe a specific domain

http://vocabularyserver.com/glossaries/getty/index.php
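The role of preferred and variant terms in indexing and retrieval can be sketched in a few lines of Python. This is a minimal illustration only; the `ControlledVocabulary` class and the sample terms are hypothetical and not taken from any real vocabulary.

```python
class ControlledVocabulary:
    """A tiny controlled vocabulary: preferred terms plus their variants."""

    def __init__(self):
        # Maps every known label (lower-cased) to its preferred term.
        self._lookup = {}

    def add_term(self, preferred, variants=()):
        # Register the preferred term and all variants under one entry.
        for label in (preferred, *variants):
            self._lookup[label.lower()] = preferred

    def resolve(self, label):
        # Return the preferred term for a known label, or None if the
        # label lies outside the scope of the vocabulary.
        return self._lookup.get(label.strip().lower())


# Hypothetical sample terms.
vocab = ControlledVocabulary()
vocab.add_term("photograph", variants=("photo", "photographic print"))
vocab.add_term("castle", variants=("castles",))

# Indexing and searching both go through resolve(), so a record indexed
# with "Photo" is found by a search for "photographic print".
indexed_as = vocab.resolve("Photo")
searched_as = vocab.resolve("photographic print")
```

Because both sides are normalised to the same preferred term, variant spellings no longer split the search results.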

Page 3:

Thesaurus = a controlled vocabulary arranged in a known order and structured so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators.

Important:

• ISO 25964 is a standard for building thesauri

• SKOS is a W3C recommendation designed for the representation of controlled vocabularies and is built upon RDF and RDFS. It allows publication of such vocabularies as linked data.

1 http://www.niso.org/schemas/iso25964/ (September 19th, 2014)
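To make "standardized relationship indicators" concrete, the following sketch stores relationships using the tags commonly found in thesauri (BT, NT, RT, USE, UF) and keeps reciprocal entries in step. The `Thesaurus` class and the sample terms are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Reciprocal relationship indicators commonly used in thesauri:
# BT (broader term) / NT (narrower term), RT (related term, symmetric),
# USE (preferred term) / UF (used for).
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    def __init__(self):
        # term -> indicator -> set of target terms
        self.relations = defaultdict(lambda: defaultdict(set))

    def relate(self, term, indicator, target):
        # Store the relationship together with its reciprocal so that
        # both entries stay consistent.
        self.relations[term][indicator].add(target)
        self.relations[target][RECIPROCAL[indicator]].add(term)

    def display(self, term):
        # Render one entry with one indicator per line.
        lines = [term]
        for indicator in ("USE", "UF", "BT", "NT", "RT"):
            for target in sorted(self.relations[term][indicator]):
                lines.append(f"  {indicator} {target}")
        return "\n".join(lines)


# Hypothetical sample terms.
thesaurus = Thesaurus()
thesaurus.relate("castle", "BT", "fortification")
thesaurus.relate("castle", "NT", "motte and bailey castle")
thesaurus.relate("castle", "RT", "moat")
thesaurus.relate("castle", "UF", "castles")
print(thesaurus.display("castle"))
```

Adding "castle BT fortification" automatically yields "fortification NT castle", which is the kind of consistency a thesaurus management tool is expected to maintain.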

Page 4:

ISO 25964

• Part 1: Thesauri for information retrieval
  - published in 2011
  - covers developing a thesaurus (mono- and multilingual)
  - replaced the previous standards ISO 2788 and ISO 5964
  - includes a data model and XML schema

• Part 2: Interoperability with other vocabularies
  - published in 2013
  - recommendations for the establishment and maintenance of mappings between multiple thesauri, or between thesauri and other types of vocabularies

Data Model

http://www.niso.org/schemas/iso25964/

Page 5:

SKOS: Simple Knowledge Organization System

http://www.w3.org/2004/02/skos/intro

SKOS provides a standard way to represent knowledge organization systems using the Resource Description Framework (RDF). Encoding this information in RDF allows it to be passed between computer applications in an interoperable way.

Using RDF also allows knowledge organization systems to be used in distributed, decentralised metadata applications. Decentralised metadata is becoming a typical scenario, where service providers want to add value to metadata harvested from multiple sources.
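As a rough illustration of what a SKOS representation looks like, the sketch below writes a single concept as Turtle (one of the RDF serialisations) using only the Python standard library. The properties `skos:prefLabel`, `skos:altLabel` and `skos:broader` are part of SKOS; the `concept_to_turtle` helper, the `example.org` URIs and the labels are hypothetical.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def concept_to_turtle(uri, pref_labels, alt_labels=(), broader=None):
    """Serialise one concept as Turtle using SKOS properties.

    pref_labels and alt_labels are (text, language) pairs. Labels are
    assumed not to contain quotes or backslashes (no escaping is done).
    """
    statements = ["a skos:Concept"]
    for text, lang in pref_labels:
        statements.append(f'skos:prefLabel "{text}"@{lang}')
    for text, lang in alt_labels:
        statements.append(f'skos:altLabel "{text}"@{lang}')
    if broader:
        statements.append(f"skos:broader <{broader}>")
    body = " ;\n    ".join(statements)
    return f"@prefix skos: <{SKOS_NS}> .\n\n<{uri}>\n    {body} .\n"


# Hypothetical concept with labels in two languages.
turtle = concept_to_turtle(
    "http://example.org/vocab/castle",
    pref_labels=[("castle", "en"), ("Burg", "de")],
    alt_labels=[("castles", "en")],
    broader="http://example.org/vocab/fortification",
)
print(turtle)
```

In practice an RDF library would normally handle serialisation and escaping; plain string formatting is used here only to keep the example self-contained.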

Page 6:

Multilingual vocabulary issues (examples)

• structural problems: conceptual systems differ between languages

• equivalence problems: the lexicalisation of concepts differs between languages
  - e.g. bone – fish bone (en); Knochen – Gräten (de) [1]

• intra- and inter-language problems: terms differ in meaning (homographs); a given term can have more than one meaning in a language
  - e.g. Turkey (country) and turkey (animal)

[1] http://www.dsoergel.com/cv/B67.pdf (20th August, 2014)
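The homograph problem can be sketched as follows: labels that name more than one concept within a language are detected and shown with a qualifier. The concept identifiers, the data layout and the helper functions are hypothetical; only the example terms come from the slide.

```python
from collections import defaultdict

# Hypothetical concepts: identifier -> labels per language and an
# optional qualifier used to tell homographs apart.
CONCEPTS = {
    "c1": {"labels": {"en": "Turkey"}, "qualifier": "country"},
    "c2": {"labels": {"en": "turkey"}, "qualifier": "animal"},
    "c3": {"labels": {"en": "bone", "de": "Knochen"}, "qualifier": None},
    "c4": {"labels": {"en": "fish bone", "de": "Gräten"}, "qualifier": None},
}


def find_homographs(concepts, lang):
    """Return labels that, ignoring case, name more than one concept."""
    by_label = defaultdict(list)
    for concept_id, concept in concepts.items():
        label = concept["labels"].get(lang)
        if label is not None:
            by_label[label.casefold()].append(concept_id)
    return {label: ids for label, ids in by_label.items() if len(ids) > 1}


def display_label(concepts, concept_id, lang):
    """Append the qualifier only where the bare label would be ambiguous."""
    concept = concepts[concept_id]
    label = concept["labels"][lang]
    clashes = find_homographs(concepts, lang)
    if label.casefold() in clashes and concept["qualifier"]:
        return f"{label} ({concept['qualifier']})"
    return label


print(find_homographs(CONCEPTS, "en"))
print(display_label(CONCEPTS, "c1", "en"))
print(display_label(CONCEPTS, "c3", "de"))
```

Note that English forms "fish bone" as a compound of "bone", while German uses an unrelated word; keeping labels per concept and per language avoids forcing one language's structure onto the other.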

Page 7:

Federated Model

• LoCloud vocabulary based on a federated model

• independent vocabularies for the various languages in the same domain (no one language is dominant)

• alignment of vocabularies via concept identifiers; the end user can search in all linked indexing vocabularies

• AIT experimental application based on the TemaTres Vocabulary Tool
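A minimal sketch of the federated idea: each language keeps its own vocabulary, alignment happens through shared concept identifiers, and a query in one language finds records indexed in any other. The identifiers, terms, records and the `federated_search` function are hypothetical and do not describe the actual LoCloud implementation.

```python
# Independent vocabularies per language: label -> shared concept identifier.
VOCABULARIES = {
    "en": {"castle": "c100", "church": "c200"},
    "de": {"Burg": "c100", "Kirche": "c200"},
    "es": {"castillo": "c100"},
}

# Records indexed with terms from whichever vocabulary the provider uses.
RECORDS = [
    {"title": "Burg Kreuzenstein", "lang": "de", "term": "Burg"},
    {"title": "Castillo de Coca", "lang": "es", "term": "castillo"},
    {"title": "Stephansdom", "lang": "de", "term": "Kirche"},
]


def federated_search(query, query_lang):
    """Find records in any language that share the query term's concept."""
    concept = VOCABULARIES[query_lang].get(query)
    if concept is None:
        return []
    return [
        record["title"]
        for record in RECORDS
        if VOCABULARIES[record["lang"]].get(record["term"]) == concept
    ]


print(federated_search("castle", "en"))
```

No vocabulary acts as the pivot here: the English query is resolved to a concept identifier, and that identifier, not an English term, is what links the German and Spanish records.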

Page 8:

TemaTres ...

• supports distributed management models

• ensures consistency and integrity of data and relationships between terms

• has features specially designed to provide data traceability and quality control in the context of a controlled vocabulary

• supports the analysis and categorisation of terms for search

• enables vocabularies to be represented in a wide range of metadata standards relevant to knowledge management

http://www.vocabularyserver.com/

Page 9:

TemaTres functionalities

• no limits on the number of terms, alternative labels, levels of hierarchy, etc.

• import/export of data in text or SKOS format

• multilingualism

• SPARQL endpoint

• relationships between terms

• notes

• user management

• reports

• additionally: meta-terms (to define facets, collections or arrays of terms), exposure of vocabularies through powerful web services, search term suggestions ("did you mean...?"), display of terms at multiple levels of depth on the same screen, duplicate and free terms control, multilingual terminology mapping, etc.
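A SPARQL endpoint lets other applications query a vocabulary directly. The sketch below only builds a request URL and does not contact any server; the endpoint address is a placeholder, and the query assumes the vocabulary is exposed as SKOS concepts with English preferred labels.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL; replace it with the address of a real installation.
ENDPOINT = "http://example.org/vocab/sparql"

# List ten concepts together with their English preferred labels.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {
  ?concept a skos:Concept ;
           skos:prefLabel ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 10
"""


def build_request_url(endpoint, query):
    # The SPARQL protocol allows a query to be sent as the "query"
    # parameter of an HTTP GET request; urlencode handles the escaping.
    return f"{endpoint}?{urlencode({'query': query.strip()})}"


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL could then be fetched with any HTTP client; the result format depends on what the endpoint offers.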

Page 10:

Why TemaTres?

• fast to use

• makes vocabularies available as a web service in the enrichment process

• many vocabularies (like UNESCO, GEMET, PICO) have already been established with this tool and are usable in the LoCloud infrastructure (http://www.vocabularyserver.com/vocabularies.php, 175 vocabularies available)

• additionally, own vocabularies can be created

• best starting point: a SKOS file for import

Page 11:

Import in TemaTres

• Tabulated text

• Tagged text

• SKOS Core
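To illustrate what a tabulated text import might involve, the sketch below reads an indented term list and derives broader/narrower pairs. It assumes that each leading tab stands for one level of hierarchy; the exact format TemaTres expects should be checked against its documentation. The `parse_tabulated` function and the sample terms are hypothetical.

```python
def parse_tabulated(text):
    """Return (term, broader_term) pairs; broader_term is None for top terms.

    Assumes one leading tab per hierarchy level and no skipped levels.
    """
    pairs = []
    ancestors = []  # ancestors[d] is the most recent term seen at depth d
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        depth = len(line) - len(line.lstrip("\t"))
        term = line.strip()
        # Forget terms at this depth or deeper, then attach to the parent.
        del ancestors[depth:]
        broader = ancestors[-1] if ancestors else None
        pairs.append((term, broader))
        ancestors.append(term)
    return pairs


# Hypothetical sample input with tabs written out explicitly.
SAMPLE = "monument\n\tcastle\n\t\tmotte and bailey castle\n\tchurch\nperiod\n"

for term, broader in parse_tabulated(SAMPLE):
    print(term, "->", broader)
```

The same pairs could afterwards be written out as SKOS `broader` statements, which is why a SKOS file is a convenient starting point for an import.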

Page 12:

Vocabularies that can at present be used during LoCloud aggregation:

Author | Name of vocabulary
University of California, Santa Barbara | Alexandria Digital Library Feature Type Thesaurus
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Archeological Objects Thesaurus Scotland
English Heritage | Archeological Sciences Thesaurus
English Heritage | Building Materials Thesaurus
English Heritage | Components Thesaurus
American Folklore Society | Ethnographic Thesaurus
English Heritage | Event Type Thesaurus
English Heritage | Evidence Thesaurus
English Heritage | FISH Archeological Objects Thesaurus
Eionet European Environment Information and Observation Network | General Multilingual Environmental Thesaurus GEMET
Federation Internationale des Archives du Film (FIAF) | General Subject headings for Film Archives
The Discovery Programme | Irish Monuments
The Discovery Programme | Irish Periods
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Maritime Craft Thesaurus Scotland
English Heritage | Maritime Craft Type Thesaurus
English Heritage and Royal Commission on the Historical Monuments of England | MDA Archaeological Objects Thesaurus
Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW) | Monument Thesaurus Wales
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Monument Type Thesaurus
English Heritage | Period Thesaurus
Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW) | Period Thesaurus Wales
Bibliographic Standards Committee of the Rare Books and Manuscripts Section (ACRL/ALA) | Relator Terms for Use in Rare Book and Special Collections Cataloguing
Universidad de León | Tesauro de Ciencias de la Documentación
Library of Congress. Prints and Photographs Division | Thesaurus for Graphic Materials 1: Subject Terms
Library of Congress. Prints and Photographs Division | Thesaurus for Graphic Materials 2: Genre and Physical Characteristic Terms
Ministero per i Beni e le Attività Culturali | Thesaurus PICO 4.1
UKAT | UK Archival Thesaurus (UKAT)
UNESCO | UNESCO thesaurus

Page 13:

Tool for vocabulary training

• Mediathread is CCNMTL's [1] open-source platform for exploration, analysis, and organization of web-based multimedia content

• Launched at Columbia in 2010, Mediathread has now been used in over 300 courses across a wide range of subject domains, including Social Work, Journalism, East Asian Studies, Art History, Film Studies, History, Public Health, Education, and English. [2]

• Mediathread is in use today at over 25 colleges and universities, including MIT, Dartmouth College, Princeton University and Wellesley College. [3]

• Mediathread is under constant development

[1] Columbia Center for New Media Teaching and Learning, http://ccnmtl.columbia.edu/portfolio/custom_software_applications_and_tools/mediathread.html (2014-10-09)

[2] http://mediathread.info/content/cases-columbia (2014-10-09)

[3] http://getmediathread.com/index.html#who (2014-10-09)

Page 14:

Accessing Mediathread

• http://mediathread.ait.co.at

• user/password:

Page 15:

Next Parts:

Thesaurusmanagement:

• Part 1: Basics

• Part 2: Import/Export

• Part 3: Multilingual Vocabularies

Option:

• Mediathread in a Nutshell

Page 16:

Mediathread in a Nutshell

Page 17:

After logging into Mediathread

Page 18:

Mediathread sections (I)

• From Your Instructor (left side)
  - contains the compositions with the instructions
  - start with “How to use the Mediathread tool?”
  - followed by Chapter 0 to Chapter 6

• Compositions give instructions
  - after reading each composition, complete the associated Assignment (same chapter number and name)

Page 19:

Mediathread sections (II)

• Assignments contain exercises (middle)
  - complete them by clicking on “Respond to Assignment”
  - if necessary, check the instructions in the compositions again

Page 20:

Reading Compositions (I)

• After clicking on a Composition
  - read the text on the left side
  - click on the symbol or text to see the PowerPoint slides on the right side

Page 21:

Reading Compositions (II)

• Change the size and position of the slides by
  - using the arrow and plus/minus signs on the left
  - using the scroll function of your mouse (to change size)
  - dragging the slide while holding the left mouse button (to change position)

Page 22:

Reading Compositions (III)

• When finished, click on “LoCloud Vocabulary Training” to return to the course overview

• Here, click on the next Composition or on “Respond to Assignment”

• Or use the links at the bottom of each Composition or Assignment

Page 23:

The LoCloud Vocabulary Training ...

• is an English online tool workshop

• includes all features of the vocabulary tool TemaTres

• is too comprehensive to complete in this section

• can be started in class and finished any time online

Please use the time left to start with the Vocabulary Training ...

Page 24:

Starting Vocabulary Training ...

• Open Mediathread under http://mtp.ait.co.at

• Log in