Reference Model for an Open Archival Information System (OAIS) ESIP Summer Meeting John Garrett –...

21
Reference Model for an Open Archival Information System (OAIS) ESIP Summer Meeting John Garrett – ADNET Systems at NASA/GSFC 2009-07-09

Transcript of Reference Model for an Open Archival Information System (OAIS) ESIP Summer Meeting John Garrett –...

Reference Model for an Open Archival Information System

(OAIS)

Reference Model for an Open Archival Information System

(OAIS)

ESIP Summer Meeting

John Garrett – ADNET Systems at NASA/GSFC

2009-07-09

ESIP Summer Meeting

John Garrett – ADNET Systems at NASA/GSFC

2009-07-09

Topics (time permitting)Topics (time permitting)

• OAIS Reference Model

• Follow-on/Related Standards– Producer-Archive Interface Methodology Abstract Standard

(PAIMAS)

– Repository Audit and Certification Metrics (draft)

• Requirements for Bodies Providing Audits (draft)

– Producer-Archive Interface Specification (PAIS) (draft)

– XML Formatted Data Unit (XFDU)

Contributors: Don Sawyer, Daniele Boucon, Lou Reich, David Giaretta and many others involved in CCSDS Archiving and Packaging standards development

• OAIS Reference Model

• Follow-on/Related Standards– Producer-Archive Interface Methodology Abstract Standard

(PAIMAS)

– Repository Audit and Certification Metrics (draft)

• Requirements for Bodies Providing Audits (draft)

– Producer-Archive Interface Specification (PAIS) (draft)

– XML Formatted Data Unit (XFDU)

Contributors: Don Sawyer, Daniele Boucon, Lou Reich, David Giaretta and many others involved in CCSDS Archiving and Packaging standards development

OAIS Reference Model HomeOAIS Reference Model Home

• Consultative Committee for Space Data Systems (CCSDS)

• International group of space agencies• Develop variety of science discipline-independent standards• Became working body for an ISO TC 20/ SC 13 about 1990

TC20: Aircraft and Space Vehicles

SC13: Space Data and Information Transfer Systems

– http://www.ccsds.org/

– Ensured broad participation, including traditional archives, libraries, companies

(Not restricted to space communities; all participation was welcomed!)

• Consultative Committee for Space Data Systems (CCSDS)

• International group of space agencies• Develop variety of science discipline-independent standards• Became working body for an ISO TC 20/ SC 13 about 1990

TC20: Aircraft and Space Vehicles

SC13: Space Data and Information Transfer Systems

– http://www.ccsds.org/

– Ensured broad participation, including traditional archives, libraries, companies

(Not restricted to space communities; all participation was welcomed!)

OAIS RM

1. Negotiating and accepting information

2. Obtaining sufficient control of the information to ensure long-term preservation

3. Determining the "designated community"

4. Ensuring that information is independently understandable

5. Following documented policies and procedures

6. Making the preserved information available4-

1.2

MANAGEMENT

Ingest

Data Management

SIP

AIPDIP

queries

result sets

Access

PRODUCER

CONSUMER

Descriptive Info

AIP

orders

Descriptive Info

Archival Storage

Administration

Preservation Planning

ArchivalInformation

Package (AIP)

ContentInformation

PreservationDescriptionInformation

(PDI)

PackagingInformation

PackageDescription

further described by

delimited byderived from

Producer

Consumer

queries

resultsets

orders

OAIS

ArchivalInformationPackages

SubmissionInformationPackages

DisseminationInformationPackages

OAIS Information ModelOAIS Mandatory Responsibilities:

OAIS Functional ModelOAIS Environment and Data Flows

In the Beginning: OAIS Reference Model

CCSDS 650.0-B-1 Reference Model for an Open Archival Information System (OAIS)(ISO 14721:2003) http://public.ccsds.org/publications/archive/650x0b1.pdf

• Negotiates and accepts information from information producers

• Obtains sufficient control to ensure long-term preservation

• Determines which communities (designated) need to be able to understand the preserved information

• Ensures the information to be preserved is independently understandable to the Designated Communities

• Follows documented policies and procedures that ensure the information is preserved against all reasonable contingencies

• Makes the preserved information available to the Designated Communities in forms understandable to those communities

• Negotiates and accepts information from information producers

• Obtains sufficient control to ensure long-term preservation

• Determines which communities (designated) need to be able to understand the preserved information

• Ensures the information to be preserved is independently understandable to the Designated Communities

• Follows documented policies and procedures that ensure the information is preserved against all reasonable contingencies

• Makes the preserved information available to the Designated Communities in forms understandable to those communities

OAIS ResponsibilitiesOAIS Responsibilities

ArchivalInformation

Package (AIP)

ContentInformation

PreservationDescriptionInformation

(PDI)e.g., • Hardcopy document

• Document as an electronic file together with its format description • Scientific data set consisting of image file, text file, and format descriptions file describing the other files

e.g., • How the Content Information came into being, who has held it, how it relates to other information, and how its integrity is assured

OAIS Archival Information PackageOAIS Archival Information Package

PackagingInformation

PackageDescription

further described by

delimited byderived from

e.g., How to find Content information and PDI onsome medium

e.g., Informationsupporting customersearches for AIP

Preservation Description Information (PDI)Preservation Description Information (PDI)

• Reference Information

– Provides one or more identifiers, or systems of identifiers, by which the Content Information may be uniquely identified

– Bibliographic Description, Persistent IDs

• Provenance Information

– Describes the source of Content Information, who has had custody of it, what is its history

– Logs of migrations

• Context Information

– Describes how the Content Information relates to other information outside the Information Package

– Pointers to related collections

• Fixity Information

– Protects the Content Information from undocumented alteration– Digital signatures, Checksums

• Reference Information

– Provides one or more identifiers, or systems of identifiers, by which the Content Information may be uniquely identified

– Bibliographic Description, Persistent IDs

• Provenance Information

– Describes the source of Content Information, who has had custody of it, what is its history

– Logs of migrations

• Context Information

– Describes how the Content Information relates to other information outside the Information Package

– Pointers to related collections

• Fixity Information

– Protects the Content Information from undocumented alteration– Digital signatures, Checksums

View of an OAIS EnvironmentView of an OAIS Environment

OAIS(archive)

Management

Producer Consumer

• Producer provides the information to be preserved

• Management sets overall OAIS policy• Consumer seeks and acquires preserved

information of interest

OAIS Functional EntitiesOAIS Functional Entities

SIP = Submission Information PackageAIP = Archival Information PackageDIP = Dissemination Information Package

SIP

DescriptiveInfo.

AIP AIP DIP

Administration

PRODUCER

CONSUMER

queriesresult sets

MANAGEMENT

Ingest Access

DataManagement

ArchivalStorage

DescriptiveInfo.

Preservation Planning

orders

Producer

Consumer

queries

resultsets

orders

OAIS

ArchivalInformationPackages

External Data Flow ViewExternal Data Flow View

SubmissionInformationPackages

DisseminationInformationPackages

ConformanceConformance

• How does an archive conform?– It discharges the set of minimal responsibilities

– It supports the basic information concepts that address a definition of information and types of information packages

• How do other documents conform?– By using OAIS terms and concepts

• Certification Standard in progress

• How does an archive conform?– It discharges the set of minimal responsibilities

– It supports the basic information concepts that address a definition of information and types of information packages

• How do other documents conform?– By using OAIS terms and concepts

• Certification Standard in progress

OAIS UpdateOAIS Update

Many improvements including:

• Authenticity

• Information properties

• Risk management

• Emulation

• Federation

Many improvements including:

• Authenticity

• Information properties

• Risk management

• Emulation

• Federation

SIP = Submission Information Package

SIP

DIP

Administration

PRODUCER

CONSUMER

queriesresult sets

MANAGEMENT

Ingest Access

DataManagement

ArchivalStorage

DescriptiveInfo.

Preservation Planning

orders

AIP

AIP = Archival Information Package

DIP = Dissemination Information Package

PAIMAS Focus

Producer-Archive Interface Methodology Abstract Standard

PAIMAS MethodologyPAIMAS Methodology

•The Archive Project is broken into 4 main phases:• Preliminary Phase,

• Formal Definition Phase,

• Transfer Phase,

• Validation Phase.

• PAIMAS identifies: • the phases in the process of transferring information,

• the objective of the phases,

• Extensive action tables of actions that must be carried out,

• the expected results.

• PAIMAS is a basis: • for further specialization by a particular community

• for the identification of standards and implementation guides,

• for identification and development of a set of software tools.

CCSDS 651.0-B-1 Producer-Archive Interface Methodology Abstract Standard.(ISO 20652:2006) http://public.ccsds.org/publications/archive/651x0b1.pdf

Data ready to archive

PAIMAS PAIMAS phases & relationshipsphases & relationships

Preliminary Agreement

Submission Agreementincluding Dictionary andFormal Model

Transferred object files

Validation agreement

Ph

ase

obje

ctiv

e

Preliminary Phase

Formal Definition Phase

Transfer Phase

ValidationPhase

Anomalies

Validate

the

transfe

rred

objectsDefi

ne th

e

info

rmat

ion

to

be

arch

ived

•reso

urces

estim

atio

nDev

elop

agre

emen

t (dat

a to

be

deliv

ered

, com

plem

enta

ry

ele

men

ts, s

ched

ule)

Actual

tran

sfer

of t

he

ob

ject

s

Preliminary phase: sub-phasesPreliminary phase: sub-phases

Information to be archived, Quantification, Legal andcontractual aspects, permanent impact on the Archive,Summary of costs, etc.

Id Preliminary phase: quantification Involves

P-19 Estimate the data volume to be transmitted to the Archive Producer

P-20 Assess the permanent data volume to store Archive

P-21 Assess the storage capability need for the ingest process Archive

P-22 Assess the associated costs Archive

Action table

Description

First contact

Preliminary definition,feasibility and assessment

Establishment of apreliminary agreement

Repository Audit and Certification - MetricsRepository Audit and Certification - Metrics

• http://wiki.digitalrepositoryauditandcertification.org/bin/view

• Closing in on public draft that will be submitted to CCSDS and ISO

• Builds on previous audit work by TRAC and many others

• http://wiki.digitalrepositoryauditandcertification.org/bin/view

• Closing in on public draft that will be submitted to CCSDS and ISO

• Builds on previous audit work by TRAC and many others INCLUDED TOPICS

ORGANISATIONAL INFRASTRUCTURE GOVERNANCE & ORGANIZATIONAL VIABILITY ORGANIZATIONAL STRUCTURE & STAFFING PROCEDURAL ACCOUNTABILITY & .

PRESERVATION POLICY FRAMEWORK FINANCIAL SUSTAINABILITY CONTRACTS, LICENSES, & LIABILITIES

DIGITAL OBJECT MANAGEMENT

INGEST: ACQUISITION OF CONTENT

INGEST: CREATION OF THE AIP

PRESERVATION PLANNING AIP PRESERVATION INFORMATION MANAGEMENT ACCESS MANAGEMENTINFRASTRUCTURE AND SECURITY RISK MANAGEMENT Technical Infrastructure Risk Management Security risk management

4.2.8 The repository shall verify each AIP for completeness and correctness at the point it is created.

Supporting Text This is necessary in order to ensure that what is maintained over the long term is as it should be and can be traced to the information provided by the Producers.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Description of the procedure that verifies completeness and correctness of the AIPs; logs of the procedure.

DiscussionThe repository should be sure that the AIPs it creates are as they are expected to be by …

18

PAIS Objectives PAIS Objectives

• Producer-Archive Interface Specification

• Provide formal modelling of data objects that are to be transferred from Producer to Archives– XML-based interchange of the model and SIPs

• Implementation standard for Producer – Archive Interface

– Conformity with the OAIS Reference Model

– Conformity with the PAIMAS

– Conformity with the XFDU

• Aimed mainly at Formal Definition Phase with applicability to Transfer Phase with Validation

• Closing in on public draft that will be submitted to CCSDS and ISO possibly by end of year

• Producer-Archive Interface Specification

• Provide formal modelling of data objects that are to be transferred from Producer to Archives– XML-based interchange of the model and SIPs

• Implementation standard for Producer – Archive Interface

– Conformity with the OAIS Reference Model

– Conformity with the PAIMAS

– Conformity with the XFDU

• Aimed mainly at Formal Definition Phase with applicability to Transfer Phase with Validation

• Closing in on public draft that will be submitted to CCSDS and ISO possibly by end of year

19

PAIS Basic EntitiesPAIS Basic Entities

XFDU Packaging Standard RationaleXFDU Packaging Standard RationaleTechnology and Requirements Evolution

• Physical media -->Electronic Transfer

• No standard language for metadata--> XML

• Homogeneous Remote Procedure Call-->CORBA, SOAP

• Little understanding of long-term preservation-->OAIS RM

• Record formats-->Self describing data formats

New Requirements• Describe multiple encodings of a data object

• Better describe the relationships among a set of data objects.

Technology and Requirements Evolution• Physical media -->Electronic Transfer

• No standard language for metadata--> XML

• Homogeneous Remote Procedure Call-->CORBA, SOAP

• Little understanding of long-term preservation-->OAIS RM

• Record formats-->Self describing data formats

New Requirements• Describe multiple encodings of a data object

• Better describe the relationships among a set of data objects.

• Use of XML based technologies•Designed to be extensible to include new XML technologies as they emerge

• Linkage of data and software• Direct mapping to OAIS Information Models• Support both media and network exchange• Support for multiple encoding/compression on individual objects or on entire package• Mapping to current SFDU Packaging & Data Description Metadata where possible•Maximal use of existing standards and tools from similar efforts

Technical Drivers

XFDU Conceptual ViewXFDU Conceptual View

CCSDS 661.0-B-1 XML Formatted Data Unit (XFDU) Structure and Construction Rules. (ISO 13527:2009) http://public.ccsds.org/publications/archive/661x0b1.pdf

Open source XFDU Toolkit Library developed as a reference implementation

(available at: http://sindbad.gsfc.nasa.gov/xfdu• Interoperability testing completed with ESA XFDU Implementation

• Partnered with JPL/PDS to establish a NASA Testbed

• XFDU Briefing on Scalability at the Collaborative Expedition Workshop / Toward Scalable Data Management (available at: http://colab.cim3.net/cgi-bin/wiki.pl?ExpeditionWorkshop/TowardScalableDataManagement_2008_06_10)