Contextual Metadata

Jan Dvorak
• CERIF Task Group Leader @ euroCRIS
• Researcher @ Charles University in Prague, CZ
• Consultant @ InfoScience Praha, CZ

The 2013 euroCRIS Seminar :: September 9-10, 2013 in Brussels, Belgium

Research Metadata

Discovery metadata for information to be found
• Serve many specific use-cases, scenarios, niches

Many standards
• Tens of major ones
• Hundreds of domain-specific standards
• … Thousands on experiment-level

The Purpose of Metadata

Enable the re-use of resources
• Knowledge stored in publications
• Data in datasets
• Functionality in software
• Participation in events
• Infrastructure
  • Facilities
  • Equipment
  • Services

Common Grounds

Organisations
• Universities, Research institutes, Hi-tech companies
• Funding bodies & organisations
• Publishers
• Facility operators

People
• Researchers
• Management

One Domain

Research

Consistency

Several possible views of the same objects

Inconsistencies would be unprofessional (at the very least)

Common Metadata Format?

To drive all the discovery metadata views

A lingua franca for research

Requirements

Complete coverage of research information

Interlinked: the context

Allow for many perspectives on the research information

Accommodate multilinguality: support translations

Accept the world keeps changing: record history

Declared semantics: definitions rather than terms

Formal syntax – machine processable & understandable

… the answer

CERIF
• Common European Research Information Format
• Common Exchange Research Information Format

CERIF: a concise history

CERIF91 – flat file

CERIF 2000 – database structured

CERIF 2006 – semantics moved into Semantic Layer; XML exchange format

CERIF 1.5 (2012) – federated identifiers; XML exchange format polished

CERIF 1.6 (2013) – datasets supported

CERIF: Complete Coverage

• cfPerson, cfOrganisationUnit, cfProject
• cfResultPublication, cfResultPatent, cfResultProduct
• cfFacility, cfEquipment, cfService
• cfFunding, cfCitation, cfEvent
• cfCurriculumVitae, cfExpertiseAndSkills, cfQualification, cfPrize
• cfPostalAddress, cfElectronicAddress, cfGeographicBoundingBox
• cfLanguage, cfCurrency, cfCountry
• cfIndicator, cfMeasurement
• cfFederatedIdentifier

CERIF: Many Perspectives

Start from any entity:
• Project – funding, consortium, project team, outputs
• Publication – authors, publisher, funding
• Research dataset – creator/contributor, origin project, publications that build upon it
• Person – outputs, datasets, projects, events, …
• …

A mesh, a fully connected graph
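
One way to picture "start from any entity" is an index that files every link under both of its endpoints, so the same data can be read from the project's side or from the person's side. The sketch below is illustrative only: the `LinkIndex` class, the role strings, and the identifiers are hypothetical and are not taken from the CERIF specification.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical in-memory link index; not a CERIF API."""

    def __init__(self):
        # entity id -> list of (role as seen from that entity, other entity id)
        self._links = defaultdict(list)

    def add(self, left, role, inverse_role, right):
        # File the link under both endpoints so either one can be the starting point.
        self._links[left].append((role, right))
        self._links[right].append((inverse_role, left))

    def view(self, entity):
        """Return everything linked to `entity`, grouped by role."""
        grouped = defaultdict(list)
        for role, other in self._links.get(entity, []):
            grouped[role].append(other)
        return dict(grouped)


# Made-up example data.
index = LinkIndex()
index.add("project:P1", "has team member", "works on", "person:alice")
index.add("project:P1", "has output", "is output of", "publication:X")
index.add("person:alice", "is author of", "has author", "publication:X")

project_view = index.view("project:P1")
person_view = index.view("person:alice")
```

The project view and the person view are two readings of the same three links; no view has to be maintained separately.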

CERIF: Multilinguality

Any free-text attribute is treated as:
• Possibly multi-valued
• Each value qualified with
  • Language code
  • Translation mode
    • Original value
    • Human translation
    • Machine translation
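
A minimal sketch of such a multi-valued, qualified attribute follows. The names `LocalizedText`, `TranslationMode`, and `pick`, as well as the fallback order (original, then human, then machine), are assumptions made for illustration; they are not CERIF's own identifiers or rules, and the sample title values are made up.

```python
from dataclasses import dataclass
from enum import Enum


class TranslationMode(Enum):
    ORIGINAL = "original"
    HUMAN = "human"
    MACHINE = "machine"


@dataclass(frozen=True)
class LocalizedText:
    value: str
    lang: str               # language code, e.g. "en"
    mode: TranslationMode   # how this value came to be


# Assumed preference: lower rank is trusted more.
_RANK = {
    TranslationMode.ORIGINAL: 0,
    TranslationMode.HUMAN: 1,
    TranslationMode.MACHINE: 2,
}


def pick(values, preferred_lang):
    """Best value in the preferred language, else the original, else None."""
    candidates = [v for v in values if v.lang == preferred_lang]
    if candidates:
        return min(candidates, key=lambda v: _RANK[v.mode])
    originals = [v for v in values if v.mode is TranslationMode.ORIGINAL]
    return originals[0] if originals else None


# One free-text attribute holding three values (made-up data).
title = [
    LocalizedText("Titul v originále", "cs", TranslationMode.ORIGINAL),
    LocalizedText("Title, human translation", "en", TranslationMode.HUMAN),
    LocalizedText("Title, machine translation", "en", TranslationMode.MACHINE),
]

english = pick(title, "en")
german = pick(title, "de")
```

Here `english` is the human translation, and `german` falls back to the Czech original because no German value exists.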

CERIF: Interlinking

(Almost) any entity connected to any other entity

Most entity types are also connected to themselves:
• “is-part-of / has part”
• “builds upon / is used by”
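
The paired role names suggest that each link is stored once and read with a different name from each end. The following sketch assumes exactly that; `ROLE_PAIRS`, `describe`, `ancestors`, and all identifiers are hypothetical, and `ancestors` additionally assumes each entity has at most one "is part of" parent.

```python
# Hypothetical role pairs: kind -> (name read forwards, name read backwards).
ROLE_PAIRS = {
    "part": ("is part of", "has part"),
    "basis": ("builds upon", "is used by"),
}


def describe(link, seen_from):
    """Read one stored link from the point of view of either endpoint."""
    source, kind, target = link
    forward, backward = ROLE_PAIRS[kind]
    if seen_from == source:
        return f"{source} {forward} {target}"
    if seen_from == target:
        return f"{target} {backward} {source}"
    raise ValueError(f"{seen_from} is not an endpoint of this link")


def ancestors(links, entity):
    """Follow 'part' links upwards; stops if a cycle is detected."""
    parent = {s: t for (s, kind, t) in links if kind == "part"}
    chain, seen = [], {entity}
    while entity in parent and parent[entity] not in seen:
        entity = parent[entity]
        chain.append(entity)
        seen.add(entity)
    return chain


# Made-up data: two links between organisation units, one between publications.
links = [
    ("orgunit:dept", "part", "orgunit:faculty"),
    ("orgunit:faculty", "part", "orgunit:university"),
    ("publication:B", "basis", "publication:A"),
]

forward_reading = describe(links[0], "orgunit:dept")
backward_reading = describe(links[0], "orgunit:faculty")
```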

CERIF: Record History

Every relationship records the time interval in which it is/was/will be true

Open ends represented by effective ±∞

When something changes:
• the old relationship is not removed, only its end date is set
• a new relationship is inserted, starting now

Historic data accumulates
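
The update rule above can be sketched as follows. This is a simplified illustration, not CERIF's storage model: the `Link` class and the `change` and `valid_at` helpers are hypothetical, `date.min` and `date.max` stand in for the effective ±∞, and intervals are assumed to be half-open (start included, end excluded).

```python
from dataclasses import dataclass, replace
from datetime import date

OPEN_START = date.min  # stands in for "minus infinity"
OPEN_END = date.max    # stands in for "plus infinity"


@dataclass(frozen=True)
class Link:
    source: str
    role: str
    target: str
    start: date = OPEN_START
    end: date = OPEN_END


def change(links, source, role, new_target, when):
    """Close the link currently valid for (source, role) and append its successor."""
    updated = []
    for link in links:
        is_current = (
            link.source == source
            and link.role == role
            and link.start <= when < link.end
        )
        # Nothing is deleted: the old link only receives an end date.
        updated.append(replace(link, end=when) if is_current else link)
    updated.append(Link(source, role, new_target, start=when))
    return updated


def valid_at(links, when):
    """All links that hold on the given day."""
    return [link for link in links if link.start <= when < link.end]


# Made-up data: one affiliation that changes on 2013-09-01.
history = [Link("person:alice", "is employed by", "orgunit:A", start=date(2010, 1, 1))]
history = change(history, "person:alice", "is employed by", "orgunit:B", date(2013, 9, 1))

before = [link.target for link in valid_at(history, date(2012, 6, 1))]
after = [link.target for link in valid_at(history, date(2013, 9, 1))]
```

Both links remain in `history`, so a query for any past date still returns what was true at that time.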

CERIF: Declared Semantics

Terms can be misleading
• Senior researcher vs. Research associate

It’s the real meaning that matters
• Definition
• Description
• Examples
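
A small sketch of "definitions rather than terms": each classification carries an identifier and a mandatory definition, and meaning is compared by identifier, never by the human-readable term. The `Classification` class, the identifiers, and the placeholder definitions are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    class_id: str          # hypothetical identifier; comparison is based on this
    term: str              # human-readable label, possibly misleading
    definition: str        # what the class actually means
    description: str = ""
    examples: tuple = ()

    def __post_init__(self):
        # A term alone is not enough: insist on a definition.
        if not self.definition.strip():
            raise ValueError(f"classification {self.class_id} has no definition")


def same_meaning(a, b):
    """Two classifications mean the same only if they are the same class."""
    return a.class_id == b.class_id


# Two institutions use the same term for differently defined roles (made-up data).
role_a = Classification(
    "scheme-a:role-1", "Senior researcher", "Placeholder definition used by institution A."
)
role_b = Classification(
    "scheme-b:role-7", "Senior researcher", "Placeholder definition used by institution B."
)

terms_match = role_a.term == role_b.term
meanings_match = same_meaning(role_a, role_b)
```

The terms match while the meanings do not, which is the situation the slide warns about.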

Research Information Infrastructure

Discovery metadata: generated from CERIF

CERIF: references detailed (meta)data