Preservation Metadata: Implementation Strategies (PREMIS)
Rebecca Guenther
Library of Congress
rgue@loc.gov
IS&T Archiving Conference
April 28, 2005
Apr. 28, 2005 IS&T Archiving Conference 2005
Preservation Metadata: Implementation Strategies
Overview of presentation
Background to PREMIS
PREMIS membership and charge
Preservation repositories implementation survey
PREMIS Core elements group
• Development of data dictionary
• Data model
Next steps
Implementation issues
OCLC/RLG Preservation Metadata Framework Working Group
OCLC/RLG Preservation Metadata Working Group
• Convened March 2000
• Looked at CEDARS, NLA, NEDLIB, OCLC
Preservation metadata framework (June 2002)
• Synthesized elements from existing sets
• Based on OAIS information model
• Elaboration of OAIS
• Set of “prototype” preservation metadata elements
http://www.oclc.org/research/projects/pmwg/pm_framework.pdf
PREMIS
June 2003: OCLC/RLG sponsored new working group: PREMIS
• Preservation Metadata: Implementation Strategies
Need
• Practical and implementable, not broadly theoretical
• Independent of specific implementation
Objectives
• Define “core” set of preservation metadata elements, with supporting data dictionary, applicable to broad range of digital preservation activities
• Identify and evaluate alternative strategies for encoding, storing, managing, and exchanging preservation metadata
Membership
Priscilla Caplan, FCLA (Chair)
Rebecca Guenther, LC (Chair)
Michael Alexander, British Library
George Barnum, GPO
Charles Blair, U. of Chicago
Olaf Brandt, U. of Gottingen
Adam Farquhar, British Library
David Gewirtz, Yale
Kevin Glavash, MIT/Dspace
Cathy Hartman, U. of N. Texas
Helen Hodgart, British Library
Nancy Hoebelheinrich, Stanford
Roger Howard/Sally Hubbard, Getty Museum
Pam Kircher, OCLC
John Kunze, Calif. Digital Library
Brian Lavoie, OCLC liaison
Robin Dale, RLG liaison
Vicky McCarger, LA Times
Jerry McDonough, NYU/METS
Evan Owens, JSTOR
Erin Rhodes, NARA
Madi Solomon, Walt Disney Co.
Angela Spinazze, ATSPIN
Gunter Waibel, RLG
Lisa Weber, NARA
Robin Wendler, Harvard
Hilde van Wijngaarden, KB
Andrew Wilson, NAA
Advisory Committee
Howard Besser, UCLA
Liz Bishoff, OCLC (via Colorado Digitization Program)
Gerard Clifton, National Library of Australia
Gail Hodge, CENDI
Steve Knight, National Library of New Zealand
Maggie Jones, Digital Preservation Coalition
Nancy McGovern, Cornell
Cliff Morgan, Wiley UK
Richard Rinehart, U. of California, Berkeley
Implementation Survey Report
State of the art in Winter, 2003/2004
28 libraries, 7 archives, 3 museums, and 11 other
13 different countries; 45% from U.S.
38% in planning; 33% development; 46% production
Survey findings
Little experience with digital preservation
• Most didn’t have active preservation strategy
• Many not yet in production
• Cannot assess adequacy of metadata
Lack of common vocabulary and conceptual framework
• Informed by OAIS reference model
• Difference of opinion as to meaning of OAIS compliance
Metadata
• Many recording rights, provenance, technical, administrative, descriptive and structural
Most repositories serve goals of both preservation and access
Trends
Store metadata redundantly in XML or relational database and with content data objects
Use METS for structural metadata and as container for descriptive and administrative; MIX for images
Use OAIS as framework and starting point
Maintain multiple versions (originals, some normalized or migrated) in repository with complete metadata for all versions
Choose multiple strategies for digital preservation
Core Elements
Mission: Define a core set of implementable preservation metadata elements.
• Information that supports and documents the digital preservation process;
• Information that supports the viability, renderability, understandability, identity and authenticity of digital objects over time.
• What most working preservation repositories are likely to need to know
• Core does not imply mandatory
• As rigorous as possible
• As much explanation as possible
• Implementation neutral -- “This is what you have to know”
• Values can be automatically supplied and processed -- no lengthy textual descriptions
Core Elements: Data Model
Intellectual Entities
Rights
Objects
Agents
Events
Scope of data dictionary
Implementation independent
Descriptive metadata out of scope
Metadata about Agents is limited
Technical metadata applying to all or most format types
Media or hardware details are limited
Business rules are essential for working repositories, but not covered
Rights information for preservation actions, not access
Sample data dictionary entry
Semantic unit: size
Semantic components: None
Definition: The size in bytes of the file or bitstream stored in the repository.
Rationale: Size is useful for ensuring the correct number of bytes from storage have been retrieved and that an application has enough room to move or process files. It might also be used when billing for storage.
Data constraint: Integer
Object category: Representation; File; Bitstream
Applicability: Representation: Not applicable; File: Applicable; Bitstream: Applicable
Examples: 2038927
Repeatability: File: Not repeatable; Bitstream: Not repeatable
Obligation: File: Optional; Bitstream: Optional
Creation/Maintenance notes: Automatically obtained by the repository.
Usage notes: Defining this semantic unit as size in bytes makes it unnecessary to record a unit of measurement. However, for the purpose of data exchange the unit of measurement should be stated or understood by both partners.
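The entry above says size is obtained automatically and is used to confirm that the correct number of bytes was retrieved. A minimal Python sketch of that idea follows; the function names and the temporary file are illustrative assumptions, not part of the PREMIS data dictionary.

```python
import os
import tempfile


def size_in_bytes(path: str) -> int:
    """Return the size of a stored file in bytes (no unit recorded, per the entry)."""
    return os.path.getsize(path)


def retrieved_completely(data: bytes, recorded_size: int) -> bool:
    """Check that the number of bytes retrieved matches the recorded size."""
    return len(data) == recorded_size


if __name__ == "__main__":
    # Hypothetical stored file, created here only so the sketch is runnable.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"preservation")
        path = handle.name
    try:
        recorded = size_in_bytes(path)
        with open(path, "rb") as stored:
            print(recorded, retrieved_completely(stored.read(), recorded))
    finally:
        os.remove(path)
```

Recording a plain integer keeps the value machine-generated and machine-checkable, in line with the goal of avoiding lengthy textual descriptions.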
Semantic units pertaining to objects
objectIdentifier
preservationLevel
objectCategory
objectCharacteristics
creatingApplication
originalName
storage
environment
signatureInformation
relationship
linkingEventIdentifier
linkingIntellectualEntityIdentifier
linkingPermissionStatementIdentifier
objectCharacteristics
compositionLevel
fixity
size
format
significantProperties
inhibitors
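Two of these components, fixity and size, can be supplied automatically by a repository. The sketch below is one possible way to do that, under stated assumptions: the class layout, the function names, and the choice of SHA-256 as the digest algorithm are all hypothetical, and the other components (compositionLevel, format, significantProperties, inhibitors) are left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectCharacteristics:
    # Only fixity and size are modelled; the field layout is an assumption.
    fixity_algorithm: str
    fixity_digest: str
    size: int


def characterize(data: bytes, algorithm: str = "sha256") -> ObjectCharacteristics:
    """Compute a message digest and byte count for a file or bitstream."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return ObjectCharacteristics(algorithm, digest, len(data))


def fixity_ok(data: bytes, recorded: ObjectCharacteristics) -> bool:
    """Recompute the characteristics and compare them with the recorded values."""
    return characterize(data, recorded.fixity_algorithm) == recorded


if __name__ == "__main__":
    original = b"example bitstream"
    recorded = characterize(original)
    print(fixity_ok(original, recorded))          # unchanged content
    print(fixity_ok(original + b"!", recorded))   # altered content
```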
Semantic units pertaining to Events
eventIdentifier
eventType
eventDateTime
eventDetail
eventOutcome
eventOutcomeDetail
linkingAgentIdentifier
linkingObjectIdentifier
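As a rough illustration of how an Event record ties Objects and Agents together through the linking identifiers, the Python sketch below mirrors the semantic units listed above. The class layout, the helper function, and every identifier and event type value are hypothetical; treating the linking units as repeatable lists is also an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Event:
    event_identifier: str
    event_type: str
    event_date_time: str
    event_detail: str = ""
    event_outcome: str = ""
    event_outcome_detail: str = ""
    # Assumed repeatable: one event may involve several agents and objects.
    linking_agent_identifiers: List[str] = field(default_factory=list)
    linking_object_identifiers: List[str] = field(default_factory=list)


def events_for_object(events: List[Event], object_identifier: str) -> List[Event]:
    """Return the events whose linkingObjectIdentifier points at the given object."""
    return [e for e in events if object_identifier in e.linking_object_identifiers]


if __name__ == "__main__":
    now = datetime.now(timezone.utc).isoformat()
    history = [
        Event("evt-1", "fixity check", now, event_outcome="pass",
              linking_agent_identifiers=["agent-1"],
              linking_object_identifiers=["obj-1"]),
        Event("evt-2", "migration", now, event_outcome="success",
              linking_agent_identifiers=["agent-2"],
              linking_object_identifiers=["obj-1", "obj-2"]),
    ]
    print([e.event_identifier for e in events_for_object(history, "obj-2")])
```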
Semantic units pertaining to Agents
agentIdentifier
agentName
agentType
Semantic units pertaining to Rights
permissionStatement
permissionStatementIdentifier
relatedObject
grantingAgent
grantingAgreement
permissionGranted
act
restriction
termOfGrant
permissionNote
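To show how a repository might consult these units before carrying out a preservation action, here is a small Python sketch. It assumes that act, restriction, termOfGrant and permissionNote are components of permissionGranted, and that permissionGranted can repeat within a permissionStatement; neither assumption is stated on the slide, and all names and values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PermissionGranted:
    act: str
    restriction: str = ""
    term_of_grant: str = ""
    permission_note: str = ""


@dataclass
class PermissionStatement:
    permission_statement_identifier: str
    related_object: str
    granting_agent: str
    granting_agreement: str
    # Assumed nesting: each statement carries the permissions it grants.
    permissions_granted: List[PermissionGranted] = field(default_factory=list)


def acts_permitted(statements: List[PermissionStatement],
                   object_identifier: str) -> List[str]:
    """List the acts granted for one object across all permission statements."""
    acts = {
        granted.act
        for statement in statements
        if statement.related_object == object_identifier
        for granted in statement.permissions_granted
    }
    return sorted(acts)


if __name__ == "__main__":
    statement = PermissionStatement(
        "perm-1", "obj-1", "agent-1", "deposit agreement",
        [PermissionGranted("replicate"), PermissionGranted("migrate")],
    )
    print(acts_permitted([statement], "obj-1"))
```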
Next steps
PREMIS deliverables (May 2005)
• Data dictionary and report
• XML schemas
• Draft for experimentation to remain stable for a year
• Revisions will be based on results of testing
Follow-up activities
• Testbeds for implementation and exchange
• Community outreach
• Establish maintenance activity
• Consider formal standardization
Implementation considerations
Schema use with specific implementations (e.g. METS)
Machine generation of metadata
Tools
Role of registries (format, environment)
Prospects for collaboration and exchanging information content
Rights and permissions
Emergence of best practices
Support needed from PREMIS maintenance activity
For More Information:
PREMIS Web Site
• www.oclc.org/research/projects/pmwg
“Implementing Metadata in Digital Preservation Systems: The PREMIS Activity” D-Lib (April ‘04)
• www.dlib.org/dlib/april04/lavoie/04lavoie.html
RLG DigiNews October 2004 and December 2004 issues
• www.rlg.org/en/page.php?Page_ID=12081
Priscilla Caplan: pcaplan@ufl.edu
Rebecca Guenther: rgue@loc.gov