Core Issues in Digital Preservation
![Page 1: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/1.jpg)
Core Issues in Digital Preservation
Jacob Nadal, Preservation Officer, UCLA Library
![Page 2: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/2.jpg)
PERSPECTIVE
Is there a general preservation framework that applies to all records? How does it differ in application between artifactual and digital preservation?
![Page 3: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/3.jpg)
-ed
preserv- -ation
Preservation consists in sustainable efforts, optimized over time
![Page 4: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/4.jpg)
-ed -ing
-able preserv- -ation
Preservation consists in sustainable efforts, optimized over time
![Page 5: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/5.jpg)
Framework
• Materials: tangible substance that carries media
• Media: materials that record information
• Transport: means for perceiving media
• Language: system for interpretation of media
![Page 6: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/6.jpg)
LINEAR B: Digital Preservation Analog
Photo: British Museum http://www.britishmuseum.org/
![Page 7: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/7.jpg)
Linear B
• Bronze Age Cretan script: c. 1450 to 1375 B.C.
• No cribs, such as the Rosetta Stone; the decipherment was almost entirely logical
• This is the essential problem that digital preservation tries to avert or mitigate
• Shows all four parts of our preservation framework
• Discovered by Sir Arthur Evans, in spring of 1900 on numerous inscribed (media) clay (material) tablets.
![Page 8: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/8.jpg)
Linear B Tablet and Transcribed Glyphs
Photo: Dennis Jarvis http://www.flickr.com/photos/archer10/
![Page 9: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/9.jpg)
First successes
• Counting system was easy to determine
• Analog to Digital: Some formats & encodings are favored because they’re easy to identify
• 90 distinct characters, indicative of a syllabic system, with a writing direction from left to right
• Debate over relation to Greek or Cypriot. Most felt it was a unique Cretan language.
• Analog to Digital: Encoding and File Format
![Page 10: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/10.jpg)
Alice Kober: Pattern Recognition
• 1940 - Alice Kober identifies word triplets
• Same word stem with different endings, presumably for case (e.g., accusative or nominative)
• Kober separated symbols into modifiers and word stems
• Analog to Digital: Metadata, Headers, Content blocks, Structured data
![Page 11: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/11.jpg)
Michael Ventris: Patterns to Prose
• Consonant-vowel patterns established
• Problem of missing vowels and leading vowels: e.g. di-vi-si-b(i)-le or i-n(i)-di-vi-si-b(i)-le
• Analog to Digital: The problem of compression
• Developed refinements of Kober’s chart to manage these relationships
![Page 12: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/12.jpg)
![Page 13: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/13.jpg)
A few good guesses
• Refinement of relationships gave Ventris enough confidence to guess at three words, the towns of Amnisos, Knossos, and Tulissos
• Assigning consonant values opened up more words
• Greek philologist John Chadwick partnered with Ventris to carry forward the decoding of a Greek dialect from the time of the Trojan War.
![Page 14: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/14.jpg)
In effect, those are the issues in digital preservation:
• Began with identification of parts…
• Digital Forensics and Analysis
• … associated with possible informational content …
• Metadata and Contextual Information
• … then instantiated by a subject expert and translated into a known, contemporary language.
• Digital Curation and Migration
![Page 15: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/15.jpg)
Defining Digital Preservation
Photo: John Keogh http://www.flickr.com/people/jvk/
![Page 16: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/16.jpg)
Short Definition of Digital Preservation
• Digital preservation combines policies, strategies and actions that ensure access to digital content over time.
• Remove digital and it’s a generic definition of preservation
• The medium definition adds some strongly digital concepts that do not bear heavily on artifactual preservation.
![Page 17: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/17.jpg)
Medium Definition of Digital Preservation
Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.
![Page 18: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/18.jpg)
Long Definition Core
• Digital preservation combines policies, strategies and actions to ensure the accurate rendering of authenticated content over time, regardless of the challenges of media failure and technological change. Digital preservation applies to both born digital and reformatted content.
• Digital preservation policies document an organization’s commitment to preserve digital content for future use; specify file formats to be preserved and the level of preservation to be provided; and ensure compliance with standards and best practices for responsible stewardship of digital information.
• Digital preservation strategies and actions address content creation, integrity and maintenance.
![Page 19: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/19.jpg)
Long Form Details: Content Creation
• Content creation includes:
• Clear and complete technical specifications
• Production of reliable master files
• Sufficient descriptive, administrative and structural metadata to ensure future access
• Detailed quality control of processes
![Page 20: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/20.jpg)
Long Form Details: Content Integrity
• Content integrity includes:
• Documentation of all policies, strategies and procedures
• Use of persistent identifiers
• Recorded provenance and change history for all objects
• Verification mechanisms
• Attention to security requirements
• Routine audits
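One common way to implement "verification mechanisms" and "routine audits" is a fixity check: record a checksum for each file, then recompute and compare it later. The sketch below is a minimal illustration, not a method named in the slides. It assumes SHA-256 as the digest, and the function names (`sha256_of`, `build_manifest`, `audit`) are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map every file under root (by relative path) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing or whose digest has changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

In this sketch the manifest would be built once, at ingest, and stored apart from the files it describes. A routine audit then calls `audit` on a schedule and treats any returned path as needing investigation or repair from another copy.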
![Page 21: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/21.jpg)
Long Form Details: Content Maintenance
• Content maintenance includes:
• A robust computing and networking infrastructure
• Storage and synchronization of files at multiple sites
• Continuous monitoring and management of files
• Programs for refreshing, migration and emulation
• Creation and testing of disaster prevention and recovery plans
• Periodic review and updating of policies and procedures
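Storing files at multiple sites only helps if the copies are compared. As a rough sketch, assume each site reports a checksum for its copy of the same file. The function below flags the sites that disagree with the most common value. The function name and the site labels are hypothetical, and the majority rule is an illustrative assumption, not a description of any particular storage network.

```python
from collections import Counter
from typing import Dict, List


def find_divergent_copies(digests_by_site: Dict[str, str]) -> List[str]:
    """Return the sites whose digest differs from the most common digest.

    With an even split there is no real majority; this sketch then
    simply sides with whichever digest was seen first.
    """
    if not digests_by_site:
        return []
    majority_digest, _ = Counter(digests_by_site.values()).most_common(1)[0]
    return sorted(
        site
        for site, digest in digests_by_site.items()
        if digest != majority_digest
    )


# Hypothetical report: three sites hold a copy, one has drifted.
report = {"site-a": "9f2c", "site-b": "9f2c", "site-c": "07b1"}
divergent = find_divergent_copies(report)
```

A copy flagged this way would normally be replaced from a site that still matches.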
![Page 22: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/22.jpg)
![Page 23: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/23.jpg)
Create Good Files
Keep an Eye on Them
Store Them Safely
Don’t get lost in the fine print
![Page 24: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/24.jpg)
Text
![Page 25: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/25.jpg)
Text
• UTF-8, a way of representing Unicode, is standard
• Digital text is purely character data
• No font or layout information is stored in a pure text file
• Critical for searching and manipulation
• XML is a UTF-8 text format
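As a small illustration of "purely character data", the sketch below checks whether a byte string is valid UTF-8 by trying to decode it. The helper name `is_valid_utf8` and the sample strings are assumptions made for this example. The sample includes U+10000, the first code point of the Unicode Linear B Syllabary block.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# "Knossos " is 8 ASCII characters (1 byte each in UTF-8);
# U+10000 lies outside the Basic Multilingual Plane and takes 4 bytes.
sample = "Knossos \U00010000".encode("utf-8")

# The same kind of text saved in a legacy encoding is not valid UTF-8:
# a lone 0xE9 byte (Latin-1 "é") is an incomplete UTF-8 sequence.
legacy = "é".encode("latin-1")

sample_ok = is_valid_utf8(sample)
legacy_ok = is_valid_utf8(legacy)
```

A check like this can run at ingest, so that files in an unexpected encoding are caught while someone still knows what they were meant to say.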
![Page 26: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/26.jpg)
Images
![Page 27: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/27.jpg)
Images
• TIFF is the standard preservation format; JPEG 2000 (JP2K) is emerging as an alternative
• Must be uncompressed image data (TIFF and JP2K can both store compressed data)
• At least 300 pixels per inch (ppi/dpi), 24-bit color
• More pixels allow more magnification without pixelation
• Color should be calibrated and profiled with an ICC color profile.
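The storage cost of these specifications follows from simple arithmetic: pixel width times pixel height times bytes per pixel. The sketch below works this out for uncompressed image data. The function name and the letter-size page are assumptions for illustration, and file headers and metadata are ignored.

```python
def uncompressed_image_bytes(
    width_in: float,
    height_in: float,
    ppi: int = 300,
    bits_per_pixel: int = 24,
) -> int:
    """Estimate raw pixel-data size for a scan of the given physical size."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


# A letter-size page (8.5 x 11 inches) at the minimum and at double resolution.
page_300 = uncompressed_image_bytes(8.5, 11, ppi=300)
page_600 = uncompressed_image_bytes(8.5, 11, ppi=600)
```

Doubling the resolution quadruples the pixel count, so the 600 ppi scan needs four times the storage of the 300 ppi one.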
![Page 28: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/28.jpg)
Audio
![Page 29: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/29.jpg)
Audio
• Broadcast WAV (BWAV) – Wave file with a metadata header
• WAV audio is Pulse Code Modulation (PCM), the universal format for uncompressed audio
• Sample rate of at least 44.1 kHz (CD quality), preferably 96 kHz
• Bit Depth of at least 16-bit (CD quality), pref. 24-bit
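For uncompressed PCM, the data rate is sample rate times bit depth times channel count. The sketch below compares CD quality with the preferred 96 kHz / 24-bit settings. The function names are hypothetical, and stereo (two channels) is an assumption not stated in the slides.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Raw PCM data rate in bytes per second (headers ignored)."""
    return sample_rate_hz * bit_depth * channels // 8


def pcm_bytes_per_hour(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Raw PCM storage needed for one hour of audio."""
    return pcm_bytes_per_second(sample_rate_hz, bit_depth, channels) * 3600


cd_quality_hour = pcm_bytes_per_hour(44_100, 16)   # CD quality, stereo
preferred_hour = pcm_bytes_per_hour(96_000, 24)    # 96 kHz / 24-bit, stereo
```

Under these assumptions an hour of CD-quality stereo is about 0.64 GB and an hour at 96 kHz / 24-bit is about 2.07 GB, a little over three times as much.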
![Page 30: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/30.jpg)
Video & Moving Image
![Page 31: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/31.jpg)
Video & Moving Image
• Standards and practices developing
• Uncompressed desirable, but high storage costs
• Compression is normal in video, but may cause preservation problems
• Uncompressed .AVI is the current safe bet
• Motion JP2K & MPEG21 may be options
• H.264 becoming the standard for service copies
• Pick one, but plan on a migration
![Page 32: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/32.jpg)
Data and Interactivity
![Page 33: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/33.jpg)
Data and Interactivity
• Need to decide if fixed points in time are required: Are you storing an instance of data?
• Need to decide if active system is required: Are you maintaining an experience or immersive environment?
• Or, are you doing both?
• ICPSR: www.icpsr.umich.edu/icpsrweb
• CDL: www.cdlib.org/services/uc3/datamanagement
• Variable Media Network: variablemedia.net
![Page 34: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/34.jpg)
Metadata
![Page 35: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/35.jpg)
PREMIS: PREservation Metadata: Implementation Strategies
![Page 36: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/36.jpg)
Storage and Maintenance
![Page 37: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/37.jpg)
Storage and Maintenance
• Lots of options
• LOCKSS Networks
• Digital archives (OCLC digital archive, DuraSpace)
• DIY systems, from a couple removable hard drives, to cloud storage, to building your own data center.
![Page 38: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/38.jpg)
RLG: Trusted Digital Repositories
1. OAIS compliance
2. Administrative responsibility
3. Organizational viability
4. Financial sustainability
5. Technological and procedural suitability
6. System security
7. Procedural accountability
![Page 39: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/39.jpg)
The OAIS Reference Model
![Page 40: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/40.jpg)
Digital Preservation: Reasonable Expectations
• Digital preservation has strong points and weak points; so does artifactual preservation.
• With digital preservation, we should expect
• High day to day reliability
• Low incidence of acid decay, mold, or biohazards
• Some preservation problems in the future; stick to standards and the impact will be mitigated
![Page 41: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/41.jpg)
Methods of Preservation
• Digital Archaeology: Recovery and forensic analysis of data from damaged media
• Conservation: Maintaining original equipment for access
• Bit preservation: Storage, transfer and refresh of data
• Migration: Transformation of data into new formats to allow for continued access
• Emulation: Recreation of original operating environment for continued access
![Page 42: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/42.jpg)
Methods of Preservation
• Digital Archaeology: Recovery and forensic analysis of data from damaged media
• Conservation: Maintaining original equipment for access
• Bit preservation: Storage, transfer and refresh of data
• Migration: Transformation of data into new formats to allow for continued access
• Emulation: Recreation of original operating environment for continued access
Now, but also never.
This is what’s next.
Step one (and 2…n)
![Page 43: Core Issues in Digital Preservation](https://reader035.fdocuments.in/reader035/viewer/2022062807/568151eb550346895dc02606/html5/thumbnails/43.jpg)
THANK YOU!
Questions & Comments: jacobnadal.com/247