Metadata for Digital Preservation: A Status Report on PREMIS
Priscilla Caplan, FCLA
Nancy Hoebelheinrich, Stanford University
CNI Fall Task Force Meeting
December 6-7, 2004
CNI Fall 2004 Metadata for Digital Preservation
Preservation Metadata: Implementation Strategies
OCLC/RLG Preservation Metadata Framework Working Group
OCLC/RLG Preservation Metadata Working Group
• Convened March 2000
• Looked at CEDARS, NLA, NEDLIB, OCLC
Preservation metadata framework (June 2002)
• Synthesized elements from existing sets
• Based on OAIS information model
• Set of “prototype” preservation metadata elements
PREMIS
June 2003: OCLC/RLG sponsored new working group: PREMIS
• Preservation Metadata: Implementation Strategies
Objectives
• Define “core” set of preservation metadata elements, with supporting data dictionary, applicable to broad range of digital preservation activities
• Identify and evaluate alternative strategies for encoding, storing, managing, and exchanging preservation metadata
http://www.oclc.org/research/projects/pmwg/
Membership
Priscilla Caplan, FCLA (Chair)
Rebecca Guenther, LC (Chair)
Michael Alexander, British Library
George Barnum, GPO
Charles Blair, U. of Chicago
Olaf Brandt, U. of Gottingen
Adam Farquhar, British Library
David Gewirtz, Yale
Kevin Glavash, MIT/Dspace
Cathy Hartman, U. of N. Texas
Helen Hodgart, British Library
Nancy Hoebelheinrich, Stanford
Roger Howard/Sally Hubbard, Getty Museum
Pam Kircher, OCLC
John Kunze, Calif. Digital Library
Brian Lavoie, OCLC liaison
Robin Dale, RLG liaison
Vicky McCarger, LA Times
Jerry McDonough, NYU/METS
Evan Owens, JSTOR
Erin Rhodes, NARA
Madi Solomon, Walt Disney Co.
Angela Spinazze, ATSPIN
Stefan Strathmann, U. of Gottingen
Gunter Waibel, RLG
Lisa Weber, NARA
Robin Wendler, Harvard
Hilde van Wijngaarden, KB
Andrew Wilson, NAA
Advisory Committee
Howard Besser, UCLA
Liz Bishoff, OCLC (via Colorado Digitization Program)
Gerard Clifton, National Library of Australia
Gail Hodge, CENDI
Steve Knight, National Library of New Zealand
Maggie Jones, Digital Preservation Coalition
Nancy McGovern, Cornell
Cliff Morgan, Wiley UK
Richard Rinehart, U. of California, Berkeley
Implementation Survey Report
State of the art in winter 2003/2004
• 28 libraries, 7 archives, 3 museums, and 11 other
• 13 different countries; 45% from U.S.
• 38% in planning; 33% development; 46% production
Core Elements
Mission: Define a core set of implementable preservation metadata elements.
• Information that supports and documents the digital preservation process;
• Information that supports the viability, renderability, understandability, identity and authenticity of digital objects over time.
• What most working preservation repositories are likely to need to know.
• As rigorous as possible
• As much explanation as possible
• Implementation neutral -- “This is what you have to know”
• Values can be automatically supplied and processed -- no lengthy textual descriptions
Core Elements: Data Model
Sample data dictionary entry
Semantic unit: size
Semantic components: None
Definition: The size of a file or bitstream in bytes.
Rationale: Size is useful for knowing whether you have retrieved the correct number of bytes from storage and whether an application has enough room to move or process files. It might also be used when billing for storage.
Data constraint: Integer
Examples: 2038927

Level            Representation    File              Bitstream
Scope            Not applicable    Applicable        Applicable
Repeatability                      Not repeatable    Not repeatable
Obligation                         Optional          Optional

Notes: May be repeated for embedded files.
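The sample entry above could be captured in code roughly as follows. This is a minimal sketch, not part of the PREMIS draft: the class name `SemanticUnitEntry`, its field names, and the `accepts` helper are hypothetical, chosen only to mirror the rows of the entry. Treating a size as a non-negative integer is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical model of one data dictionary entry. The field names mirror
# the rows of the sample entry; they are not defined by PREMIS itself.
@dataclass
class SemanticUnitEntry:
    name: str
    definition: str
    data_constraint: str                      # e.g. "Integer"
    # Applicability per level: Representation, File, Bitstream.
    scope: Dict[str, bool] = field(default_factory=dict)
    repeatable: bool = False
    mandatory: bool = False
    components: List[str] = field(default_factory=list)

    def accepts(self, level: str, value: object) -> bool:
        """Return True if value may be recorded for this unit at the given level."""
        if not self.scope.get(level, False):
            return False
        if self.data_constraint == "Integer":
            # bool is a subclass of int in Python, so exclude it explicitly.
            # Requiring value >= 0 is an assumption made for this sketch.
            return (
                isinstance(value, int)
                and not isinstance(value, bool)
                and value >= 0
            )
        return True


size = SemanticUnitEntry(
    name="size",
    definition="The size of a file or bitstream in bytes.",
    data_constraint="Integer",
    scope={"Representation": False, "File": True, "Bitstream": True},
)

print(size.accepts("File", 2038927))            # True
print(size.accepts("Representation", 2038927))  # False: not applicable at this level
print(size.accepts("Bitstream", "2 MB"))        # False: not an integer
```

Keeping the constraint machine-checkable is in line with the stated goal that values be automatically supplied and processed.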
The evolution of a semantic unit: Format
What is a format?
What types of objects have format?
Is there a usable authority list of formats?
Is there a difference between a format and a profile?
Shouldn’t we plan for format registries?
First try
Format
• formatName
• formatScheme
Second try
formatName
• formatNameValue
• formatVersion
formatRegistry
• formatRegistryEntry
• formatRegistryKey
Third try
formatName
• formatNameValue
• formatVersion
formatRegistry
• formatRegistryIdentifier
  • formatRegistryIdentifierScheme
  • formatRegistryIdentifierValue
• formatRegistryName
• formatRegistryEntry
Fourth try
Format (Required, not repeatable)
• formatName (Optional, repeatable)
  • formatNameValue
  • formatNameVersion
  • formatNameRole
• formatRegistry (Optional, repeatable)
  • formatRegistryIdentifier
  • formatRegistryName
  • formatRegistryEntry
  • formatRegistryRole
Current draft
Format (Required, Not Repeatable)
• formatName (Optional, Not repeatable)
  • formatNameValue
  • formatVersion
• formatRegistry (Optional, Repeatable)
  • formatRegistryName
  • formatRegistryKey
  • formatRegistryRole
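One way to read the obligation and repeatability labels in the current draft is as type constraints: "Optional, Not repeatable" becomes a value that may be absent, and "Optional, Repeatable" becomes a list that may be empty. The sketch below illustrates that reading only. The Python class and attribute names are hypothetical, the draft as shown does not say which subcomponents are required (the defaults here are assumptions), and the sample values ("PDF", "example-registry", "fmt-0001", "specification") are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FormatName:
    format_name_value: str                       # formatNameValue
    format_version: Optional[str] = None         # formatVersion (assumed optional)


@dataclass
class FormatRegistry:
    format_registry_name: str                    # formatRegistryName
    format_registry_key: str                     # formatRegistryKey
    format_registry_role: Optional[str] = None   # formatRegistryRole (assumed optional)


@dataclass
class Format:
    # formatName: Optional, Not repeatable -> at most one value.
    format_name: Optional[FormatName] = None
    # formatRegistry: Optional, Repeatable -> zero or more values.
    format_registries: List[FormatRegistry] = field(default_factory=list)


# Invented sample values, for illustration only.
pdf = Format(
    format_name=FormatName("PDF", "1.4"),
    format_registries=[
        FormatRegistry("example-registry", "fmt-0001", "specification"),
    ],
)

print(pdf.format_name.format_name_value)   # PDF
print(len(pdf.format_registries))          # 1
```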
Semantic units pertaining to Objects
objectIdentifier
contentLocation
originalName
preservationLevel
objectCharacteristics
environment
objectCharacteristics
compositionLevel
fixity
size
format
inhibitors
significantProperties
creatingApplication
Semantic units pertaining to Events
eventIdentifier
• eventIdentifierScheme
• eventIdentifierValue
eventType
eventOutcome
eventOutcomeDetail
eventDetail
eventDateTime
relatedPermission
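As an illustration of how a repository might populate these semantic units automatically, here is a minimal sketch of an event record for a fixity check. The class layout, the `record_fixity_check` helper, the "local" identifier scheme, and the outcome strings are all hypothetical. The draft as shown does not say which event units are mandatory, so the optional defaults below are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EventIdentifier:
    scheme: str   # eventIdentifierScheme
    value: str    # eventIdentifierValue


@dataclass
class Event:
    identifier: EventIdentifier                  # eventIdentifier
    event_type: str                              # eventType
    event_date_time: datetime                    # eventDateTime
    event_outcome: Optional[str] = None          # eventOutcome
    event_outcome_detail: Optional[str] = None   # eventOutcomeDetail
    event_detail: Optional[str] = None           # eventDetail
    related_permission: Optional[str] = None     # relatedPermission


def record_fixity_check(seq: int, ok: bool, when: datetime) -> Event:
    """Build an event record for one fixity check (hypothetical helper)."""
    return Event(
        identifier=EventIdentifier(scheme="local", value=f"event-{seq:04d}"),
        event_type="fixity check",
        event_date_time=when,
        event_outcome="success" if ok else "failure",
        event_outcome_detail=None if ok else "stored and computed digests differ",
    )


event = record_fixity_check(1, True, datetime(2004, 12, 6, tzinfo=timezone.utc))
print(event.identifier.value)   # event-0001
print(event.event_outcome)      # success
```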
Semantic units pertaining to Agents
agentIdentifier
• agentIdentifierScheme
• agentIdentifierValue
agentName
Semantic units pertaining to Rights
permissionStatement
relatedObject
grantingAgent
grantingAgreement
permission
act
restriction
The context for rights statements
Two approaches addressed:
• (Preservation) rights metadata
• Formal agreements between depositors and digital archives / repositories
(Preservation) rights metadata
Documentary role by conveying
• What is allowed
• Record of changes made to content for preservation purposes
Predicated on getting the metadata in the first place
• From whom? (integrity of data)
• Business reason to provide?
At present, few satisfactory means for managing preservation, intellectual property (rightsholders), or authorized uses of content (DRM tools)
C. Ayre, “The right to preserve: the rights issues of digital preservation”, D-Lib Magazine, March 2004.
Focus upon permissions
C. Ayre article:
• Description of preservation strategies / copying requirements, e.g.,
  • From: Refreshing bits & media migration, for purposes of overcoming storage media deterioration or obsolescence, by periodic copying of bitstreams from one physical medium to another
  • To: Re-creation of content, for purposes of overcoming both hardware & software obsolescence, by re-keying data, reverse engineering original software & recreating or creating a new software environment
Use Cases
• Use cases gathered from active preservation repositories
• Necessary to allow:
  • Various kinds of copying
  • Retention of copies
  • Modification of the original material
  • Adaptation to new technologies
  • Transfer of all these permissions to another party
  • Withdrawing / deleting content
• Provide for various kinds of restrictions
  • Time, number related
  • Attribution
  • Format or quality
  • Purpose for actions to be taken
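The permissions and restrictions gathered from these use cases could be checked mechanically along the following lines. This is a rough sketch under stated assumptions, not the PREMIS rights model: it assumes that each permission pairs one act with one restriction, it covers only the time- and number-related restrictions, and every class, attribute, and sample identifier ("object-42", "depositor-7", the act names) is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Restriction:
    not_after: Optional[date] = None    # time-related restriction
    max_copies: Optional[int] = None    # number-related restriction


@dataclass
class Permission:
    act: str                            # e.g. "copy", "migrate", "delete"
    restriction: Restriction = field(default_factory=Restriction)


@dataclass
class PermissionStatement:
    related_object: str                 # relatedObject
    granting_agent: str                 # grantingAgent
    permissions: List[Permission] = field(default_factory=list)

    def allows(self, act: str, on: date, copies_so_far: int = 0) -> bool:
        """Return True if some permission covers the act within its restrictions."""
        for p in self.permissions:
            if p.act != act:
                continue
            r = p.restriction
            if r.not_after is not None and on > r.not_after:
                continue   # the grant has expired
            if r.max_copies is not None and copies_so_far >= r.max_copies:
                continue   # the copy limit is already reached
            return True
        return False


stmt = PermissionStatement(
    related_object="object-42",
    granting_agent="depositor-7",
    permissions=[
        Permission("copy", Restriction(max_copies=3)),
        Permission("migrate", Restriction(not_after=date(2010, 1, 1))),
    ],
)

print(stmt.allows("copy", date(2005, 1, 1), copies_so_far=2))  # True
print(stmt.allows("copy", date(2005, 1, 1), copies_so_far=3))  # False
print(stmt.allows("migrate", date(2011, 1, 1)))                # False
print(stmt.allows("delete", date(2005, 1, 1)))                 # False
```

Anything not explicitly granted is refused here, which is a design choice of the sketch rather than something the use cases prescribe.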
Formal agreements between depositors and digital archives / repositories
Acquisition of a work for a digital archive
• Copies received through mandatory deposit (LC)
• Copies obtained by gift or purchase
• Copies obtained through subscription or license
• Copies made or received under agreements with copyright owners
• J.M. Besek, “Copyright issues relevant to the creation of a digital archive: a preliminary assessment”, 2002.
Comparison of sample agreements
14 out of 49 respondents contributed agreements
Analyzed to determine how agreements treated:
• Rights granted, expressly
• Restrictions and conditions upon the user / uses of the materials submitted
• Repository permissions and actions expressly allowed (not covered herein)
• Warranties made
Rights granted, expressly to repository
Type of Library / Rights granted
Governmental agency
• Preserve and make accessible
• Archive, distribute and use
National library (by legal deposit)
• Retain in the archive and, subject to negotiated access conditions, provide public access to it in perpetuity
• Take preservation action necessary to keep publication accessible as hardware and software changes
State archive
• Non-exclusive, non-transferable and non-assignable right to make use of services of the Archive (not including access)
Open source repository system
• Non-exclusive right to reproduce, translate and/or distribute data worldwide in print & e-format & in any medium (incl abstract)
Rights granted, expressly to repository
Type of Library / Rights granted
Governmental agency
• Catalog, enhance, validate and document the data
• Distribute copies of the data in a variety of formats
• Incorporate metadata or documentation into public access catalogs
University Library
• Right to publish & continually an author’s work in digital format on the university website
• Transfer, in whole or in part, the rights & obligations included in the Agreement to a 3rd party
Restrictions upon the user / uses
Type of Library / Restrictions
Governmental agencies, private archive, open source repository system
• Non-commercial, research / educational purposes only
Governmental agencies, private archive
• Must be “authorised” user (e.g., subscribed, agreeing to license conditions, registered)
National Library (legal deposit)
• Upon commercial publications, access limited to physical premises of the Library on a single computer with copying & communication functions disabled (during time in which pub is commercially viable)
Fee for service archive
• Must not breach rightsholder’s copyright by selling all or part of data, or including in product which is sold
Conditions upon the user / uses
Type of Library / Conditions
Governmental agencies, private archive
• User must acknowledge or attribute rightsholder upon future publications based on use of the data
• State to users of the data that rightsholder is not responsible for the quality of the work users produce
• Deposit with the Archive copies of any published work based in whole or in part under negotiated conditions of use
Private archive
• Keep list of all persons to whom access of data has been given & supply to Archive Director when asked
Governmental agencies, National library / archive
• Adhere to privacy provisions, as applicable
Fee for service archive
• Attach notice of restrictions when making data available to end-users
Warranties made
Type of Library / Warranties
All types
• Depositor is copyright holder or authorized by
National library
• Will provide for the permanent storage and maintenance of data in a form that will provide security to data integrity and usability
• Will maintain the content of the data, not the format or functionality of the content
• Explicitly assumes the role of an official, non-exclusive archival agent
Governmental agencies, National library (federal deposit)
• Not warranted to be suitable for use
• Contributors to documents being archived have been notified of deposit into archive and agree
Range of approaches based on risk assessment
Implicit understanding: if you deposit it, we will preserve (safest for legal depository institutions?)
How dark is your archive?
It’s all stored in the agreement, see it here…
“Oh, sorry – we’ll just take it down” – asking for forgiveness (not permission)
Next steps:
PREMIS ACTIVITIES
• Complete data dictionary (January 2005)
• Write narrative report
• Develop XML schemas for exchanging metadata
FOLLOW-UP ACTIVITIES
• Community outreach
• Establish feedback/maintenance mechanism
• Testbeds for implementation and exchange
For More Information:
PREMIS Web Site
• www.oclc.org/research/projects/pmwg
“Implementing Metadata in Digital Preservation Systems: The PREMIS Activity” D-Lib (April ‘04)
• www.dlib.org/dlib/april04/lavoie/04lavoie.html
RLG DigiNews October 2004 and December 2004 issues
• www.rlg.org/en/page.php?Page_ID=12081
Priscilla Caplan: [email protected]
Rebecca Guenther: [email protected]