METS Awareness Training An Introduction to METS Digital libraries – where are we now? Digitisation...

METS Awareness Training

An Introduction to METS

Digital libraries – where are we now?

Digitisation technology now well established and well-understood

Standards for digitisation processes have settled down and are widely recognised

Still a disparity in approaches to metadata: no 'MARC standard' for the digital library

Approaches to metadata - varied!

Proprietary packages, e.g. Olive software

Home-designed databases using Access or similar

SGML, including:

– TEI alone

– TEI + EAD

– Ad-hoc DTDs

The lack of a standard – what it means...

poor cross-searching

limited interchange facilities

metadata tied to proprietary packages

consequent obsolescence and costs of conversion

little chance of a 'hybrid library'

What is needed?

A standard for metadata content : analogous to AACR2

A standardised framework for holding and exchanging metadata : analogous to the MARC record

METS is designed to fulfil the latter function

Three types of metadata

The Digital Library Federation defines three types of metadata for a digital object:-

Descriptive: information about intellectual content (analogous to standard catalogue record)

Administrative: information needed to handle, deliver, maintain and archive an object

Structural: description of internal structure of object

What is METS?

● “Metadata Encoding and Transmission Standard”

● Produced by Library of Congress Standards Office and Digital Library Federation

● Provides framework for holding all types of metadata for digital object

● Does not prescribe content of metadata, but recommends a number of schemes for this

● Written in XML (eXtensible Markup Language)

Why XML?

An open W3C standard (derived from SGML, ISO 8879), not dependent on any given application

Interchangeable with other applications

Easy to integrate cataloguing information with text transcription, images etc.

Handles structural metadata easily

An overview of the METS file

Generally one METS file corresponds to one digital object (which may incorporate many files)

All metadata (descriptive, administrative and structural) encoded in single document

Each type is held in a separate section, linked by identifiers

All metadata and external data (e.g. images, text, video) can be either referenced from the METS file or held internally

The inside of a METS file

metsHdr: METS header

dmdSec: descriptive metadata

amdSec: administrative metadata

fileSec: file inventory

structMap: structural map

behaviorSec: behaviour metadata
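As a rough illustration of the layout above, the sketch below parses an empty METS skeleton and lists its top-level sections in order. The skeleton, its ID values and the function name `top_level_sections` are invented for this example; only the element names and the METS namespace come from the standard.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical, empty skeleton: the element names follow the METS schema,
# but the document and its IDs are invented for illustration.
SKELETON = f"""<mets xmlns="{METS_NS}">
  <metsHdr/>
  <dmdSec ID="dmd-1"/>
  <amdSec ID="amd-1"/>
  <fileSec/>
  <structMap><div/></structMap>
  <behaviorSec/>
</mets>"""


def top_level_sections(mets_xml):
    """Return the local names of the top-level sections, in document order."""
    root = ET.fromstring(mets_xml)
    # ElementTree reports tags as "{namespace}name"; keep only the name part.
    return [child.tag.split("}", 1)[-1] for child in root]


print(top_level_sections(SKELETON))
```

This only inspects the outline of a file; it does not validate the document against the METS schema.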

Title Page: title page

Preface: page i, page ii

Chapter 1: page 1, page 2, page 3, page 4, page 5

Chapter 2: page 7, page 8

<div LABEL="Title Page">

<div LABEL="Preface">

<div LABEL="Chapter 1">

<div LABEL="Chapter 2">

<div LABEL="Page 1"> <div LABEL="Page 2"> <div LABEL="Page 3">

<fptr FILEID="xxx"/>

<area BEGIN="xxx" END="xxx"/>

<structMap>
  <div ID="munahi010-aaa-div.1" LABEL="Section 1">
    <div ID="munahi010-aaa-div.1.1" LABEL="Plate 1">
      <fptr FILEID="munahi010-aaa-fgrp-0001"/>
    </div>
    <div ID="munahi010-aaa-div.1.2" LABEL="Plate 2">
      <fptr FILEID="munahi010-aaa-fgrp-0002"/>
    </div>
  </div>
</structMap>
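A structural map like the one above can be read as a simple tree walk. The following is a minimal sketch using Python's standard library; the function name `outline` is invented here, the div IDs are left out for brevity, and namespace declarations are omitted as on the slide (a real METS file would declare the METS namespace).

```python
import xml.etree.ElementTree as ET

# Same shape as the slide example; div IDs and namespaces omitted for brevity.
STRUCT_MAP = """<structMap>
  <div LABEL="Section 1">
    <div LABEL="Plate 1">
      <fptr FILEID="munahi010-aaa-fgrp-0001"/>
    </div>
    <div LABEL="Plate 2">
      <fptr FILEID="munahi010-aaa-fgrp-0002"/>
    </div>
  </div>
</structMap>"""


def outline(div, depth=0):
    """Yield (depth, label, file ids) for a div and all of its descendants."""
    file_ids = [fptr.get("FILEID") for fptr in div.findall("fptr")]
    yield depth, div.get("LABEL"), file_ids
    for child in div.findall("div"):
        yield from outline(child, depth + 1)


root = ET.fromstring(STRUCT_MAP)
rows = list(outline(root.find("div")))
for depth, label, file_ids in rows:
    # Indent each label according to its nesting level.
    print("  " * depth + label, file_ids)
```

Each row pairs a point in the object's hierarchy with the identifiers its fptr elements refer to, which is the link a viewer would follow into the file inventory.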

<fileSec>

fileSec > fileGrp > file (one per file) > FLocat

<fileGrp ID="munahi010-aaa-fgrp-0001">
  <file GROUPID="0" ID="munahi010-aaa-0001-0" MIMETYPE="image/tiff" ADMID="munahi010-aaa-tmd-0001-0">
    <FLocat LOCTYPE="URL" xlink:href="file://hfs.ox.ac.uk/data/odl/munahi010/digObjects/aaa/0/munahi010-aaa-0001.tiff"/>
  </file>
  <file GROUPID="6" ID="munahi010-aaa-0001-6" MIMETYPE="image/jpeg" ADMID="munahi010-aaa-tmd-0001-6">
    <FLocat LOCTYPE="URL" xlink:href="http:odl/munahi010/digObjects/aaa/6/munahi010-aaa-0001-6.jpg"/>
  </file>
  <file GROUPID="3" ID="munahi010-aaa-0001-3" MIMETYPE="image/jpeg" ADMID="munahi010-aaa-tmd-0001-3">
    <FLocat LOCTYPE="URL" xlink:href="http:odl/munahi010/digObjects/aaa/3/munahi010-aaa-0001-3.jpg"/>
  </file>
</fileGrp>
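To show how a file group might be read by software, here is a small sketch that lists each file's ID, MIME type and location. The IDs and the example.org URLs are placeholders invented for this example (they are not the project's real locations), and `list_files` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes as "{namespace}name".
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical file group: IDs and URLs are placeholders, not real locations.
FILE_GRP = """<fileGrp xmlns:xlink="http://www.w3.org/1999/xlink" ID="example-fgrp-0001">
  <file ID="example-0001-0" MIMETYPE="image/tiff">
    <FLocat LOCTYPE="URL" xlink:href="http://example.org/master/0001.tiff"/>
  </file>
  <file ID="example-0001-6" MIMETYPE="image/jpeg">
    <FLocat LOCTYPE="URL" xlink:href="http://example.org/web/0001.jpg"/>
  </file>
</fileGrp>"""


def list_files(file_grp):
    """Return (file ID, MIME type, location) for every file in a fileGrp."""
    rows = []
    for file_el in file_grp.findall("file"):
        flocat = file_el.find("FLocat")
        # A file may embed its content instead of pointing to it, so FLocat can be absent.
        href = flocat.get(XLINK_HREF) if flocat is not None else None
        rows.append((file_el.get("ID"), file_el.get("MIMETYPE"), href))
    return rows


files = list_files(ET.fromstring(FILE_GRP))
for row in files:
    print(row)
```

A delivery system could use a listing like this to pick, say, the JPEG derivative for display and the TIFF master for archiving.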

Descriptive and administrative metadata

Descriptive and administrative metadata may be handled in two ways:

embedding directly within the METS file within an <mdWrap> element

being held in an external file and referenced from the METS file using an <mdRef> element
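The two options above can be told apart mechanically. The sketch below is one possible way to do it, assuming un-namespaced METS elements for brevity; the sample sections, their IDs, the relative path and the function name `describe_metadata_section` are all invented for illustration.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def describe_metadata_section(section):
    """Report whether a metadata section embeds or references its metadata."""
    wrap = section.find("mdWrap")
    if wrap is not None:
        # Embedded: the metadata lives inside the METS file itself.
        return ("embedded", wrap.get("MDTYPE"), None)
    ref = section.find("mdRef")
    if ref is not None:
        # Referenced: the METS file only points at an external file.
        return ("referenced", ref.get("MDTYPE"), ref.get(XLINK_HREF))
    return ("empty", None, None)


# Hypothetical sections, one of each kind.
EMBEDDED = ET.fromstring(
    '<dmdSec ID="dmd-1"><mdWrap MDTYPE="MODS"><xmlData/></mdWrap></dmdSec>'
)
REFERENCED = ET.fromstring(
    '<techMD xmlns:xlink="http://www.w3.org/1999/xlink" ID="tmd-1">'
    '<mdRef MDTYPE="OTHER" LOCTYPE="URL" xlink:href="../example-tech.xml"/>'
    "</techMD>"
)

print(describe_metadata_section(EMBEDDED))
print(describe_metadata_section(REFERENCED))
```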

<mdWrap MIMETYPE="text/xml" MDTYPE="MODS" LABEL="MODS Metadata">
  <xmlData>
    <mods:mods>
      <mods:titleInfo>
        <mods:title>Cobbett's parliamentary history of England, from the Norman Conquest, in 1066 to the year, 1803 : from which last-mentioned epoch it is continued downwards in the work entitled, &quot;The parliamentary debates&quot;</mods:title>
      </mods:titleInfo>
      <mods:titleInfo type="alternative">
        <mods:title>Cobbett's Parliamentary History - volume 2</mods:title>
      </mods:titleInfo>
      <mods:name>
        <mods:namePart>Great Britain. Parliament.</mods:namePart>
        <mods:role>
          <mods:roleTerm type="code" authority="marcrelator">spn</mods:roleTerm>
        </mods:role>
      </mods:name>
    </mods:mods>
  </xmlData>
</mdWrap>
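Embedded descriptive metadata of this kind can be read back out with any namespace-aware XML parser. A minimal sketch follows, using a shortened version of the record above; the function name `titles` and the label "main" for an untyped title are choices made for this example, not part of MODS or METS.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used when searching for MODS elements.
NS = {"mods": "http://www.loc.gov/mods/v3"}

# Shortened version of the slide example, with the MODS namespace declared.
MD_WRAP = """<mdWrap xmlns:mods="http://www.loc.gov/mods/v3" MIMETYPE="text/xml" MDTYPE="MODS">
  <xmlData>
    <mods:mods>
      <mods:titleInfo>
        <mods:title>Cobbett's parliamentary history of England</mods:title>
      </mods:titleInfo>
      <mods:titleInfo type="alternative">
        <mods:title>Cobbett's Parliamentary History - volume 2</mods:title>
      </mods:titleInfo>
    </mods:mods>
  </xmlData>
</mdWrap>"""


def titles(md_wrap):
    """Return (kind, title) pairs for each MODS titleInfo inside an mdWrap."""
    found = []
    for info in md_wrap.findall("xmlData/mods:mods/mods:titleInfo", NS):
        # An untyped titleInfo is treated here as the main title.
        kind = info.get("type", "main")
        found.append((kind, info.findtext("mods:title", namespaces=NS)))
    return found


title_rows = titles(ET.fromstring(MD_WRAP))
for row in title_rows:
    print(row)
```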

<amdSec ID="munahi010-aaa-amd-0001">
  <techMD ID="munahi010-aaa-tmd-0001-0">
    <mdRef MDTYPE="MIX" LOCTYPE="URL" xlink:href="../munahi010-aaa-0001-0.xml"/>
  </techMD>
</amdSec>

IDs and METS

METS uses IDs to express the relations between its component parts

A coherent system of identifiers is therefore essential

Project ID: munahi010

Item ID: munahi010-aaa

Technical metadata: munahi010-aaa-tmd-0001

File groups: munahi010-aaa-fgrp-0001

File IDs: munahi010-aaa-0001-3

divs: munahi010-aaa-div.1
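Because the sections are tied together only by identifiers, it can be worth checking that every reference actually resolves. The sketch below is one way such a check might look; the cut-down document is assembled for illustration from the identifier patterns above (its second file deliberately points at a missing techMD), namespaces are omitted, and `dangling_references` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Cut-down document for illustration; the second file deliberately
# references a technical metadata section that is not present.
DOC = """<mets>
  <amdSec ID="munahi010-aaa-amd-0001">
    <techMD ID="munahi010-aaa-tmd-0001-0"/>
  </amdSec>
  <fileSec>
    <fileGrp ID="munahi010-aaa-fgrp-0001">
      <file ID="munahi010-aaa-0001-0" ADMID="munahi010-aaa-tmd-0001-0"/>
      <file ID="munahi010-aaa-0001-6" ADMID="munahi010-aaa-tmd-0001-6"/>
    </fileGrp>
  </fileSec>
  <structMap>
    <div ID="munahi010-aaa-div.1">
      <fptr FILEID="munahi010-aaa-fgrp-0001"/>
    </div>
  </structMap>
</mets>"""

# Attributes that point at an ID defined elsewhere in the same file.
REFERENCE_ATTRIBUTES = ("ADMID", "DMDID", "FILEID")


def dangling_references(root):
    """Return (tag, attribute, value) for every reference with no matching ID."""
    ids = {el.get("ID") for el in root.iter() if el.get("ID")}
    missing = []
    for el in root.iter():
        for attr in REFERENCE_ATTRIBUTES:
            # ADMID and DMDID may hold several space-separated IDs.
            for ref in el.get(attr, "").split():
                if ref not in ids:
                    missing.append((el.tag, attr, ref))
    return missing


problems = dangling_references(ET.fromstring(DOC))
print(problems)
```

A consistent naming scheme such as the one on this slide makes problems of this kind easy to spot by eye as well as by script.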

What to put in a METS file?

METS does not prescribe the content (particularly the descriptive metadata) which it can contain

However, the METS board does endorse some schemas as recommended for use with METS:-

Descriptive Metadata

– Dublin Core

– MODS (Metadata Object Description Schema)

– MARC 21 Schema (MARCXML)

Administrative Metadata

– Schema for Technical Metadata for Text (NYU)

– Library of Congress Audio-Visual Prototyping Project

– NISO Technical Metadata for Digital Still Images (MIX)

– METS Schema for Rights Declaration

METS and interoperability

METS is very flexible in its application – there are multiple ways of encoding everything:-

– metadata and data can be embedded or referenced

– any scheme can be used for this metadata

– file inventory can be organised in multiple ways (by referenced object, by type of file etc)

This all reduces interoperability of METS records.

METS Profiles (cont.)

● This can be countered to some extent by METS Profiles:-

– XML documents describing application of METS in a given project/institution

– each profile follows the METS Profile schema and has to validate against it

– registered with central repository at Library of Congress

● But profiles do not allow automated cross-mapping of METS files: this has yet to be explored

Next:-

A case study of METS in action