CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Page 1: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

CASPAR: Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval

Page 2: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

WHAT: objectives (from the call)

• Develop systems and tools which will support the accessibility and use over time of digital cultural and scientific resources.
  – Explore how to preserve the availability and authenticity of digital resources over time
  – Support emerging complexity of scientific, cultural and creative objects and associated repositories

Page 3: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Objectives

• Objective 1: to lay the foundation for all future preservation activities (CASPAR methodology)
• Objective 2: to create key advanced components to use in all the preservation activities (CASPAR components)
• Objective 3: to create the long-term autonomous system to support all the preservation activities (CASPAR framework)
• Objective 4: to demonstrate the validity of the CASPAR framework with heterogeneous data and a variety of innovative applications (CASPAR testbeds)

In addition to these fundamental objectives, CASPAR offers supporting activities in order to guarantee the successful execution of the project results even after the end of the project and the re-usability of outcomes in a wider domain than the testbed-related sectors:

• Objective 5: to build up the CASPAR preservation user community in order to create consensus around the initiative and gather a critical mass of potential users/customers

• Objective 6: to create a self-sustainable model for the CASPAR process and offer supporting activities in order to promote the successful exploitation of the project results after the end of the project.

Page 4: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

WHAT: vision

• CASPAR manages knowledge to keep archives alive through time:
  – Preserve information & knowledge – not just “the bits”
• Preservation is a process, not a one-shot event:
  – transforming content (migration, emulation, etc.) to adapt it to new constraints of rendition and playability, and
  – enriching content to preserve its intelligibility and (re)usability (not just rendering)
• OAIS provides a general framework:
  – current implementations deal more with format than the interpretation of data
  – CASPAR proposes a richer implementation for dealing with content interpretation

Page 5: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

WHAT: expected results

• CASPAR approach and framework to support the “end-to-end” lifecycle for scientific, cultural and creative digital resources
  – Infrastructure
  – Tools
  – Techniques
• Testbeds: science, culture, artistic to identify and test common infrastructure
  – Supported by discipline specific access
  – Embedded in long-lived institutions

Page 6: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

• must be relatively easy to use
• must have a low “buy-in” in terms of effort required to adopt the CASPAR paradigm

• must avoid requiring wholesale change of everyone else’s systems

• must be decentralised and reproducible so that it can live on after the formal end of the CASPAR project.

Page 7: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

FOR WHOM

• Potential USERS:
  – Creators of the resources
  – Funders of the resources and their preservation
  – Curators of the resources
  – Suppliers of preservation-related services
  – Users of the information

Page 8: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

...for WHOM

• Large user communities involved with
  – Science:
    • European Space Agency
    • CCLRC
  – Culture:
    • UNESCO
  – Artistic:
    • INA, IRCAM, CIANT …
• Creators
• Funders
• Curators
• Suppliers
• End-users

Page 9: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

.......for WHOM

• Multi-Industry perspectives
  – Software
  – Hardware
  – Middleware

Page 10: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

HOW: Foundations of Preservation approach

• OAIS Reference Model

• OAIS related standards work:
  – Producer-Archive interface
  – NARA/RLG Audit & Certification draft – now released for testing and comment
  – SIP, XFDU … others
• OAIS based projects
  – InterPARES
  – … many others

Page 11: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

HOW: Implementation plan structure (blocks of work)

Page 12: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

HOW (cont’d): S&T approach

• Component-based research
  – OAIS-based components
    • e.g. Storage
  – OAIS-based extensions
  – Next generation components
  – Focused research & testbeds: vertical threads

Page 13: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

HOW (cont’d): OAIS extensions

• Knowledge driven approach
• Knowledge management to support long-term preservation of concepts/information:
  – Single, complex, on demand, interactive objects
  – DRM
  – Authenticity
  – Access
  – Storage

Page 14: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Framework

• Integrated Framework: supports the development of the three vertical testbeds
  – Component-based research
  – Open standards & Open Source development methodology
  – Framework: integration of research components with existing off-the-shelf/modifiable-off-the-shelf components

• Service Oriented Architecture for service delivery

• Process control and composition

Page 15: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

CASPAR Testbeds

• Three testbeds: Cultural, Performing Arts, Scientific
  – Cultural <- UNESCO
  – Performing Arts <- INA, IRCAM
  – Scientific <- ESA (with CCLRC)
• Complex, multi-source, multifaceted data
• Specific requirements on preservation (technical, delivery, legal)
• Specific research issues: as a matter of fact, they represent three focused research streams
• Identifying and confirming common infrastructure elements

Page 16: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

CASPAR testbeds: Testing and Validation

• Common design & validation methodology
  – Uniform evaluation parameters

• Each testbed has its own user communities

• Continuous feeding to the Project Performance Evaluation process

Page 17: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.
Page 18: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

CASPAR Integrated architecture

Page 19: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

CCLRC Infrastructure Build-up

[Diagram: European Preservation Infrastructure Alliance, comprising the CCLRC Curation Facility, other Alliance members (e.g. ESA) and future Alliance members; the CCLRC Curation Facility is fed by CASPAR, other CCLRC projects and FP7 projects]

Page 20: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Registries

Page 21: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

UK DCC Organisation

[Diagram: DCC activities (community support & outreach; research; development co-ordination; service definition & delivery; management & admin support) and the groups they connect to (industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; curation organisations, e.g. DPC; Collaborative Associates Network of Data Organisations)]

Page 22: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

DCC Registry

Page 23: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Sharing RepInfo

• RepInfo is needed
• RepInfo is extensive
• May need to “extend” RepInfo as Designated Community and/or its knowledgebase changes
• How can we avoid every Repository repeating the work?
  – Need to control costs
• Need to share the effort

Page 24: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Requirements

• Data users: need to be able to obtain pre-identified RepInfo
• Curators: need to be able to
  – find suitable pre-existing RepInfo to re-use, or
  – create RepInfo

Page 25: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Registry for Representation Info

Example of use of Representation Information Labelling

The Digital Object could have RepInfo packed with it

Support automated access & processing

Page 26: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Use of RepInfo

[Diagram: each data object carries a CPID that points to a Label held in an External Registry; each Label lists Structure = CPID, Semantics = CPID and Rendering s/w = CPID, and each of those CPIDs can in turn point to a further Label]

Each “bag of bits” has an associated pointer (CPID) to a Label
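
The pointer scheme on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Label` and `DigitalObject` classes, the `resolve` function, the dictionary standing in for the external registry and every `cpid:` value are hypothetical names invented here, not part of the CASPAR or DCC software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """A Label collects pointers (CPIDs) to individual pieces of RepInfo."""
    structure: str
    semantics: str
    rendering_software: str


@dataclass(frozen=True)
class DigitalObject:
    """A "bag of bits" plus the CPID of its Label."""
    bits: bytes
    label_cpid: str


# Hypothetical stand-in for the external registry: CPID -> Label or plain RepInfo.
REGISTRY = {
    "cpid:label-1": Label("cpid:struct-1", "cpid:sem-1", "cpid:render-1"),
    "cpid:struct-1": "Binary layout description",
    "cpid:sem-1": "Meaning of each field",
    "cpid:render-1": "Software able to display the data",
}


def resolve(cpid, registry):
    """Follow a CPID; if it names a Label, resolve each of its pointers too."""
    entry = registry[cpid]
    if isinstance(entry, Label):
        return {
            "structure": resolve(entry.structure, registry),
            "semantics": resolve(entry.semantics, registry),
            "rendering_software": resolve(entry.rendering_software, registry),
        }
    return entry
```

Because `resolve` recurses, a pointer inside one Label may itself name another Label, which matches the nested boxes in the diagram.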

Page 27: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Registry Interface Requirements

• Give it an identifier, give me back something (e.g. RepInfo)
• Allow me to search for RepInfo
• Interoperable with other (format) registries
• Not limited to single protocols
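
One way to read these requirements is as a small interface that several back ends can satisfy. The sketch below is an assumption made for illustration: `Registry`, `InMemoryRegistry`, `FederatedRegistry` and their `get`/`search` methods are hypothetical names and do not reproduce the actual DCC Registry API.

```python
from typing import List, Optional, Protocol


class Registry(Protocol):
    """Hypothetical minimal interface: look up by identifier, or search."""

    def get(self, identifier: str) -> Optional[str]: ...

    def search(self, text: str) -> List[str]: ...


class InMemoryRegistry:
    """One possible implementation, backed by a dictionary."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def get(self, identifier):
        return self._entries.get(identifier)

    def search(self, text):
        needle = text.lower()
        return sorted(i for i, d in self._entries.items() if needle in d.lower())


class FederatedRegistry:
    """Queries several registries in turn, so a client is not tied to one."""

    def __init__(self, registries):
        self._registries = list(registries)

    def get(self, identifier):
        for registry in self._registries:
            found = registry.get(identifier)
            if found is not None:
                return found
        return None

    def search(self, text):
        results = []
        for registry in self._registries:
            results.extend(registry.search(text))
        return results
```

A client written against `get` and `search` does not need to know which registry, or which protocol behind it, answered the request.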

Page 28: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Registry API

API allows applications to talk to many different implementations

http://dev.dcc.ac.uk/cvs

Page 29: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

API

Page 30: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

ebXML Registry Version 3.0: Simplified View of Architecture

Source: ebXML Registry Services and Protocols Committee Draft, 10 February 2005

Page 31: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Labels and CPIDs

Page 32: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Example RepInfo Label

A Label is itself RepInfo. It provides a way to collect together in a sensible way lots of individual pieces of RepInfo

Page 33: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Re-using RepInfo

• Existing RepInfo can be used to build up further RepInfo
  – E.g. refer to existing RepInfo in labels

Page 34: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Versioning and LID

• Each object has a unique identifier
• Versions of an object share a “logical ID” (LID)
• Simply using the LID gives the latest version
• Can specify a particular version
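
The versioning rules above can be illustrated with a minimal sketch. The `VersionedRegistry` class, its method names and the `lid:version` identifier format are assumptions made for this example; the slide does not say how the registry encodes identifiers.

```python
from typing import Dict, List, Optional


class VersionedRegistry:
    """Hypothetical store in which all versions of an object share one LID."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def register(self, lid: str, content: str) -> str:
        """Store a new version and return its unique identifier."""
        versions = self._versions.setdefault(lid, [])
        versions.append(content)
        return f"{lid}:{len(versions)}"

    def get(self, lid: str, version: Optional[int] = None) -> str:
        """Return the latest version, or a particular one if requested."""
        versions = self._versions[lid]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{lid} has no version {version}")
        return versions[version - 1]
```

Here the unique identifier is the LID plus a version number, so asking for the bare LID returns the newest entry while older versions stay addressable.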

Page 35: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Clients

• DCC Registry:
  – Web browser
  – Thick client (http://registry.dcc.ac.uk)
• Any Registry
  – Applications using API

Page 36: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

GUI access to Registry

Page 37: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Classifications

• Many Classification Schemes
• Help to find RepInfo

Page 38: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Initial RepInfo

• Simple text
  – ASCII
  – Unicode
  – UTF7/8
• PDF, Word(!)
• FITS format
• FITS standard dictionaries
• Things that are “MISSING”

Page 39: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

RepInfo entry

• Simple command line tool

Page 40: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Creating RepInfo

• There are many tools which can be used to create RepInfo:
  – Simple text editor to create text describing the data
  – Complex tools to capture data description, e.g.
    • EAST (see next slides)
    • DFDL etc.
  – Programming languages of various sorts

Page 41: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

EAST descriptions

Page 42: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

OASIS screenshot

OASIS tool for creating EAST descriptions

Page 43: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Example of EAST description

Page 44: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Using RepInfo

• A pointer to RepInfo can be attached to data
• The RepInfo can be used to display, examine, process and re-use the data

Page 45: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Example of use of RepInfo

• Laser facility produces binary data normally used by proprietary software
• Describe using EAST data description language
• Use in generic application (shown here) to display/process
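
The idea of a generic application driven by a data description can be sketched as follows. This is not EAST: the description format, the field names and the `decode` function are hypothetical, and Python's standard `struct` module stands in for a real description language purely to show the principle.

```python
import struct

# Hypothetical structure description: (field name, struct format code).
# In this sketch the description plays the role that an EAST description
# plays on the slide; the field names are invented.
DESCRIPTION = [
    ("shot_id", "I"),        # unsigned 32-bit integer
    ("energy_joules", "d"),  # 64-bit float
    ("n_samples", "H"),      # unsigned 16-bit integer
]


def decode(data, description):
    """Decode one big-endian binary record using only its description."""
    fmt = ">" + "".join(code for _, code in description)
    values = struct.unpack(fmt, data)
    return dict(zip((name for name, _ in description), values))
```

The decoder contains no knowledge of the laser data itself; replacing `DESCRIPTION` is enough to read a different record layout, which is the property that makes a generic display or processing tool possible.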

Page 46: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Simple Buy-In

• Need to add RepInfo to your Data Objects?
• Does the RepInfo already exist?
  – Yes: get its ID and put that in a label
  – No: register what you have – be assigned an ID.
    • Add more details later when needed
    • Or others can add more details
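
The decision on this slide amounts to a "find or register" step. The sketch below assumes a hypothetical in-memory `SimpleRegistry` and a `make_label` helper; neither name, nor the `repinfo-N` identifier format, comes from the DCC Registry.

```python
import itertools


class SimpleRegistry:
    """Hypothetical in-memory registry used only to illustrate the workflow."""

    def __init__(self):
        self._ids = {}       # RepInfo name -> assigned ID
        self._details = {}   # assigned ID -> list of detail strings
        self._counter = itertools.count(1)

    def find(self, name):
        """Return the ID of existing RepInfo, or None if nothing is registered."""
        return self._ids.get(name)

    def register(self, name, detail):
        """Register what you have and be assigned an ID."""
        rid = f"repinfo-{next(self._counter)}"
        self._ids[name] = rid
        self._details[rid] = [detail]
        return rid

    def add_details(self, rid, detail):
        """Anyone can add more details later."""
        self._details[rid].append(detail)

    def details(self, rid):
        return list(self._details[rid])


def make_label(registry, name, what_you_have):
    """Re-use existing RepInfo if present; otherwise register and get an ID."""
    rid = registry.find(name)
    if rid is None:
        rid = registry.register(name, what_you_have)
    return {"repinfo_id": rid}
```

The label only stores the ID, so details added later by the owner or by others become visible to every data object that points at the same entry.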

Page 47: CASPAR Cultural, Artistic and Scientific knowledge for Preservation Access and Retrieval.

Operating Registries

• See http://dev.dcc.ac.uk/twiki/bin/view/Main/RegistryProcedures