A centre of expertise in digital information management
www.ukoln.ac.uk
Digital Preservation
Michael Day
Research and Development Team Leader
UKOLN, University of Bath
Information Systems and Services, UWE, Bristol, 15 February 2011
Presentation outline
• Digital preservation overview
  – Some definitions
  – Technical challenges
  – Organisational challenges
• Approaches to solving the problem
  – Preservation strategies
  – Tools for:
    • Format characterisation (DROID)
    • Preservation planning (Plato)
  – The OAIS model:
    • Preservation metadata
    • Repository audit frameworks (TRAC, DRAMBORA)
Definitions
• Digital preservation:
  – Is mainly concerned with the sustainability of “content” for a given period of time (not forever)
  – Largely about ensuring “continued access” to content
  – “The series of managed activities necessary to ensure continued access to digital materials for as long as necessary” - Digital Preservation Coalition (DPC), Digital Preservation Definitions and Concepts list: http://www.dpconline.org/advice/preservationhandbook/introduction/definitions-and-concepts?q=definitions
  – A combination of technical, organisational and legal challenges
Digital preservation basics
• An ongoing (lifecycle) approach to managing digital content based on:
  – The identification and adoption of appropriate preservation strategies for content
  – The collection and management of appropriate metadata (explicit and implicit knowledge, contexts)
  – The ongoing monitoring of technical contexts and the application of preservation planning techniques
  – Continual monitoring of the organisation (audit)
A multi-faceted set of challenges
• Technical
  – Strategies needed to deal with ongoing obsolescence and scale
• Organisational
  – Access and reuse
  – Authenticity and integrity
  – Sustainability (costs)
  – Legal (see Andrew Charlesworth’s lecture)
Technical challenges (1)
• Physical
  – Bits stored on a physical medium (or in the cloud?)
  – Focus 20 years ago was on new media types (e.g. optical storage technologies) as a panacea
  – Bit-level preservation is still important – the first layer in a viable preservation strategy
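Bit-level preservation is commonly supported by fixity checking: recording a checksum for each file and re-computing it later to detect silent corruption. The following is a minimal sketch only; the manifest layout (relative path mapped to SHA-256 digest) and the function names are assumptions made for illustration, not part of the talk.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum.

    The manifest layout is a hypothetical choice for this sketch.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose checksum changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

A repository might build the manifest at ingest and run the verification on a schedule; any path reported back would then need to be restored from another copy.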
Obsolete media
Image courtesy of Frank Carey
Exhibition at NASA White Sands Test Facility, 2009
Technical challenges (2)
• Hardware and software dependence
  – Most digital objects are dependent on particular configurations of hardware and software
  – Relatively short obsolescence cycles
Hardware and software dependence
Exhibition at NASA White Sands Test Facility, 2009
Image courtesy of Frank Carey
Conceptual challenges (1)
• What is a digital object?
  – Some are analogues of traditional objects, e.g. meeting minutes, research papers
  – Others are not, e.g. Web pages, blogs, GIS, 3D models of chemical structures, research data more generally
    • Complexity
    • Dynamic nature
    • Interactivity
  – Born digital vs. product of digitisation initiatives
  – Logical layer between physical storage of bits and the conceptual objects that need preservation (includes data types, formats, etc.)
Conceptual challenges (2)
• Need to identify and document the “significant properties” (or characteristics) of content:
  – Recognises that preservation is context dependent, even user specific (OAIS concept of 'designated community')
  – Helps with choosing an acceptable preservation strategy
• Compare the ‘performance model’ developed by the National Archives of Australia (2002) - “The source of a record is a fixed message that interacts with technology. This message provides the record’s unique meaning, but by itself is meaningless to researchers since it needs to be combined with technology in order to be rendered as its creator intended. The process is the technology required to render meaning from the source”
  – Focus on re-use (data curation)
Organisational challenges (1)
• Sustainability:
  – Ultimately the sustainability of content depends upon the long-term sustainability of organisations
    • Focus on business models
  – Organisational commitment:
    • “An institutional repository needs to be a service with continuity behind it … Institutions need to recognise that they are making commitments for the long term” - Clifford Lynch
    • Need for policy development
  – Incentives for preservation:
    • Clarity on roles and responsibilities needed
    • Who benefits? Who pays? “Free riding?”
Organisational challenges (2)
• Economic perspectives:
  – Blue Ribbon Task Force on Sustainable Digital Preservation and Access: http://brtf.sdsc.edu/
    • Final report (Feb 2010): “Ensuring that valuable digital assets will be available for future use is not simply a matter of finding sufficient funds. It is about mobilizing resources - human, technical, and financial - across a spectrum of stakeholders diffuse over both space and time. But questions remain about what digital information we should preserve, who is responsible for preserving, and who will pay.”
  – JISC-funded LIFE (Life Cycle Information for E-Literature) has developed a predictive costing tool: http://www.life.ac.uk/
Organisational challenges (3)
• The challenge of scale:
  – The Web
  – Digitised content:
    • Google Books
  – The “data deluge” in e-Science:
    • New generations of instruments, computer simulations
    • Many terabytes generated per day, petabyte scale computing (and growing)
    • Cory Doctorow, “Welcome to the petacentre.” Nature, 455, pp 17-21, 4 Sep 2008
Organisational challenges (4)
• The need for collaboration:
  – Need for 'deep-infrastructure' for preservation recognised as far back as 1996 by the Task Force on Archiving of Digital Information
    • Digital preservation involves the "grander problem of organizing ourselves over time and as a society ... [to manoeuvre] effectively in a digital landscape" (p. 7)
  – Building on existing networks
  – Role for national-level co-ordination:
    • Digital Preservation Coalition (DPC), nestor (Germany), National Digital Information Infrastructure and Preservation Program (NDIIPP)
Organisational challenges (5)
• Learn the lessons from the past:
  – Things will go wrong
  – Do what you can to enable recovery from disaster
  – Digital technologies support replication (avoid a single point of failure)
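One way replication can support recovery is by comparing several copies of the same object and trusting the version that a majority of them agree on. The sketch below assumes the replicas are already in memory as byte strings and that a simple majority vote on SHA-256 digests is acceptable; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
import hashlib
from collections import Counter


def consensus_copy(replicas: list[bytes]) -> bytes:
    """Return the content that a strict majority of replicas agree on.

    Raises ValueError when there are no replicas or no strict majority,
    since in that case no copy can be trusted automatically.
    """
    if not replicas:
        raise ValueError("no replicas supplied")
    digests = [hashlib.sha256(r).hexdigest() for r in replicas]
    winner, votes = Counter(digests).most_common(1)[0]
    if votes <= len(replicas) // 2:
        raise ValueError("replicas disagree and no majority exists")
    return replicas[digests.index(winner)]
```

With three copies, one damaged copy can be outvoted and then overwritten from the agreed version; with only two copies, a disagreement can be detected but not resolved.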
Digital preservation strategies (1)
• Main approaches:
  – Technology preservation (e.g., computing museums)
  – Digital archaeology (a post hoc approach)
  – Emulation (focusing on the environment, often used where look-and-feel is important, e.g. computer games)
  – Migration (focusing on the content)
    • A mature approach: A set of organised tasks designed to achieve the periodic transfer of digital information from one hardware and software configuration to another, or from one generation of computer technology to a subsequent one - CPA/RLG report (1996)
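As a toy illustration of migration, the sketch below moves a text file from one character encoding to another and then checks that a few chosen properties of the content survived. The choice of encodings, the property set and the function names are all assumptions made for this example; real migrations involve far more complex formats and properties.

```python
def migrate_text(data: bytes,
                 source_encoding: str = "latin-1",
                 target_encoding: str = "utf-8") -> bytes:
    """Re-encode text from a source encoding into a target encoding."""
    return data.decode(source_encoding).encode(target_encoding)


def content_properties(data: bytes, encoding: str) -> dict:
    """Extract the properties this sketch treats as significant.

    Hypothetical property set: the decoded text itself, its length in
    characters, and its number of lines.
    """
    text = data.decode(encoding)
    return {
        "text": text,
        "characters": len(text),
        "lines": text.count("\n") + 1,
    }


def migration_preserved_content(original: bytes, migrated: bytes) -> bool:
    """Compare properties before and after migration."""
    return (content_properties(original, "latin-1")
            == content_properties(migrated, "utf-8"))
```

The bytes on disk change during migration, so a checksum comparison would fail by design; what is compared instead is the content the designated community cares about.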
Digital preservation strategies (2)
• Preservation strategies are not in competition
  – Different strategies will work together; there may be value in diversification
  – Migration strategies mean difficult choices need to be made about target formats
• But the strategy chosen has implications for:
  – The technical infrastructure required (and metadata)
  – Collection management priorities
  – Rights management
    • Owning the rights to re-engineer software
  – Costs
Digital preservation strategies (3)
• Tools for format characterisation and validation
  – DROID - Digital Record Object Identification (based on the PRONOM registry)
    • Very important to know what types (formats) of content exist in a particular collection (e.g., institutional repository or Web archive)
    • Performs batch identification of file formats
    • http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
  – JHOVE - JSTOR/Harvard Object Validation Environment
    • Used for format validation
    • http://hul.harvard.edu/jhove/
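The general idea behind batch format identification can be sketched by matching the leading bytes of each file against a table of known signatures. This is not DROID and does not use the PRONOM registry: the tiny signature table, the format labels and the function names below are assumptions chosen for illustration.

```python
from collections import Counter
from pathlib import Path

# Illustrative signature table: leading "magic" bytes -> format label.
# A real registry holds far richer signatures than this.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format label for the given leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_collection(root: Path) -> Counter:
    """Count how many files of each identified format sit under root."""
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            with path.open("rb") as handle:
                counts[identify_bytes(handle.read(16))] += 1
    return counts
```

Run over a repository's storage directory, a profile like this shows which formats dominate a collection and how many files could not be identified.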
Digital preservation strategies (4)
• Plato preservation planning tool
  – Developed by EU Planets project
  – A decision support tool that helps users evaluate potential preservation solutions against specific requirements and build a plan for preserving a given set of objects
  – Integrates file format identification (using DROID); some migration services; XML-based generic format characterisation using XCL (eXtensible Characterisation Languages)
  – More info: http://www.ifs.tuwien.ac.at/dp/plato/intro.html
  – Integration with repositories tested by JISC KeepIt project: http://preservation.eprints.org/keepit/
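Evaluating candidate solutions against requirements can be illustrated with a simple weighted score. This is a sketch under stated assumptions: the weighted-sum aggregation, the criteria, the weights and the candidate names are all invented for this example and do not describe how Plato itself works.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-criterion scores into one value, normalised by total weight."""
    total_weight = sum(weights.values())
    return sum(scores[criterion] * weight
               for criterion, weight in weights.items()) / total_weight


def rank_alternatives(alternatives: dict[str, dict[str, float]],
                      weights: dict[str, float]) -> list[tuple[str, float]]:
    """Return (name, score) pairs, best first."""
    ranked = [(name, weighted_score(scores, weights))
              for name, scores in alternatives.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical requirements (weights) and candidates (scores from 1 to 5).
WEIGHTS = {"content preserved": 0.5, "format openness": 0.3, "cost": 0.2}
ALTERNATIVES = {
    "migrate to format A": {"content preserved": 4, "format openness": 5, "cost": 3},
    "keep original and emulate": {"content preserved": 5, "format openness": 2, "cost": 2},
}
```

Calling `rank_alternatives(ALTERNATIVES, WEIGHTS)` orders the candidates by score; changing the weights shows how sensitive the recommendation is to institutional priorities.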
The OAIS Reference Model
[Figure: OAIS Functional Entities (Figure 4-1). Actors: PRODUCER, MANAGEMENT, CONSUMER. Functional entities: Ingest, Data Management, Archival Storage, Administration, Preservation Planning, Access. Labelled flows: SIP, AIP, DIP, Descriptive Info, queries, result sets, orders.]
http://public.ccsds.org/publications/archive/650x0b1.PDF
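The flow of information packages through an OAIS can be sketched in a few lines: a producer submits a SIP, Ingest turns it into an AIP held in Archival Storage with descriptive information passed to Data Management, and Access answers consumer queries and orders with a DIP. The class layout, field names and identifier scheme below are assumptions for illustration; the reference model does not prescribe an implementation.

```python
from dataclasses import dataclass


@dataclass
class SIP:
    """Submission Information Package, as delivered by a producer."""
    producer: str
    description: str
    files: dict[str, bytes]


@dataclass
class AIP:
    """Archival Information Package, as held in Archival Storage."""
    aip_id: str
    descriptive_info: dict[str, str]
    files: dict[str, bytes]


@dataclass
class DIP:
    """Dissemination Information Package, as delivered to a consumer."""
    aip_id: str
    files: dict[str, bytes]


class Archive:
    def __init__(self) -> None:
        self._storage: dict[str, AIP] = {}               # Archival Storage
        self._catalogue: dict[str, dict[str, str]] = {}  # Data Management

    def ingest(self, sip: SIP) -> str:
        """Ingest: build an AIP from a SIP and register its description."""
        aip_id = f"aip-{len(self._storage) + 1}"  # hypothetical identifier scheme
        info = {"producer": sip.producer, "description": sip.description}
        self._storage[aip_id] = AIP(aip_id, info, dict(sip.files))
        self._catalogue[aip_id] = info
        return aip_id

    def query(self, term: str) -> list[str]:
        """Access: return identifiers whose description mentions the term."""
        return [aip_id for aip_id, info in self._catalogue.items()
                if term.lower() in info["description"].lower()]

    def order(self, aip_id: str) -> DIP:
        """Access: package a stored AIP's content for delivery."""
        aip = self._storage[aip_id]
        return DIP(aip.aip_id, dict(aip.files))
```

Administration and Preservation Planning are left out of the sketch; in the model they govern and monitor the entities shown here.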
Preservation metadata
• Metadata and documentation are vitally important
  – Relates to OAIS concepts like Representation Information and Preservation Description Information
  – Functions:
    • Enables resource discovery - supports the development of finding aids
    • Records meaning (structure and semantics)
    • Records context and provenance (authenticity)
  – Standards that support digital preservation activities are under development:
    • PREMIS Data Dictionary (for core metadata): http://www.loc.gov/standards/premis/
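Recording provenance usually means keeping a log of the events that happen to each object, who or what carried them out, and with what outcome. The sketch below is loosely inspired by the object, event and agent ideas in preservation metadata, but the field names are simplified assumptions and the records are not valid PREMIS.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class PreservationEvent:
    """One action taken on a digital object (hypothetical field names)."""
    object_id: str
    event_type: str
    agent: str
    outcome: str
    timestamp: str


def record_event(log: list[PreservationEvent], object_id: str, event_type: str,
                 agent: str, outcome: str,
                 when: Optional[datetime] = None) -> PreservationEvent:
    """Append an event to the log, stamped with the given or current UTC time."""
    when = when or datetime.now(timezone.utc)
    event = PreservationEvent(object_id, event_type, agent, outcome,
                              when.isoformat())
    log.append(event)
    return event


def provenance_of(log: list[PreservationEvent],
                  object_id: str) -> list[PreservationEvent]:
    """Return the recorded history of a single object, in log order."""
    return [event for event in log if event.object_id == object_id]
```

A log like this is what lets a repository later show what was done to an object between ingest and delivery, which supports claims about authenticity.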
Repository audit frameworks (1)
• Repository audit frameworks first developed out of the OAIS Reference Model
  – OAIS Mandatory Responsibilities (only six of them):
    • The main focus was on technical and organisational aspects, e.g.:
      – That repositories ensure that preserved information (content) can be understood (independently understandable)
      – That documented policies and procedures are being followed
    • No clear concept of OAIS “compliance”
Repository audit frameworks (2)
• Trusted Repositories Audit and Certification (TRAC): Criteria and Checklist:
  – Source: http://www.crl.edu/archiving-preservation/digital-archives/metrics-assessing-and-certifying
  – RLG-NARA Digital Repository Certification Task Force checklist, revised by the Center for Research Libraries (CRL) and OCLC
  – Criteria cover three main things:
    • Organisational Infrastructure
      – Governance and viability, structure and staffing, financial sustainability, contracts, etc.
    • Digital Object Management
      – Ingest, preservation planning, archival storage, etc.
    • Technologies, Technical Infrastructure, & Security
      – Systems and infrastructure, etc.
Core repository principles (1)
• Ten Principles - agreed 2007 by CRL (US), Digital Curation Centre (UK), Nestor (Germany) and Digital Preservation Europe
  – The repository commits to continuing maintenance of digital objects for identified community/communities.
  – Demonstrates organizational fitness (including financial, staffing structure, and processes) to fulfill its commitment.
  – Acquires and maintains requisite contractual and legal rights and fulfills responsibilities.
  – Has an effective and efficient policy framework.
  – Acquires and ingests digital objects based upon stated criteria that correspond to its commitments and capabilities.
Core repository principles (2)
• Ten principles (continued)
  – Maintains/ensures the integrity, authenticity and usability of digital objects it holds over time.
  – Creates and maintains requisite metadata about actions taken on digital objects during preservation as well as about the relevant production, access support, and usage process contexts before preservation.
  – Fulfills requisite dissemination requirements.
  – Has a strategic program for preservation planning and action.
  – Has technical infrastructure adequate to continuing maintenance and security of its digital objects.
• Available: http://www.crl.edu/archiving-preservation/digital-archives/metrics-assessing-and-certifying/core-re
TRAC Checklist example page
Repository audit frameworks (3)
• DRAMBORA (Digital Repository Audit Method Based on Risk Assessment)
  – Digital Curation Centre / Digital Preservation Europe
  – “Presents a methodology for self-assessment, encouraging organisations to establish a comprehensive self-awareness of their objectives, activities and assets before identifying, assessing and managing the risks implicit within their organisation”
  – Identifying risks and scoring each one on likelihood and impact
  – Covers: organisational context, policies, assets, risks, etc.
  – Online tool (http://www.repositoryaudit.eu/about/)
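Scoring risks on likelihood and impact lends itself to a small risk register. The sketch below assumes a 1 to 5 scale for both scores and multiplies them to get a severity used for ranking; the scale, the multiplication, the class name and the example risks are assumptions for illustration rather than details taken from the DRAMBORA toolkit.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (catastrophic)

    @property
    def severity(self) -> int:
        """Assumed combination rule: likelihood multiplied by impact."""
        return self.likelihood * self.impact


def prioritise(risks: list[Risk]) -> list[Risk]:
    """Return the risks ordered from most to least severe."""
    return sorted(risks, key=lambda risk: risk.severity, reverse=True)


# Hypothetical register entries.
REGISTER = [
    Risk("Storage media failure", likelihood=3, impact=4),
    Risk("Loss of key staff", likelihood=2, impact=3),
    Risk("File format obsolescence", likelihood=4, impact=4),
]
```

Sorting the register this way gives an ordered list of where mitigation effort should go first, which can be revisited at each self-assessment.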
Repository audit frameworks (4)
• A means of "asking the right questions" about repositories and documenting appropriate procedures and risks
• Both TRAC and DRAMBORA are under consideration by ISO technical committees
  – External badge of quality (a "certified preservation repository")
  or
  – Management tool for self assessment
Digital preservation basics (reprise)
• An ongoing (lifecycle) approach to managing digital content based on:
  – The identification and adoption of appropriate preservation strategies for content
  – The collection and management of appropriate metadata (explicit and implicit knowledge, contexts)
  – The ongoing monitoring of technical contexts and the application of preservation planning techniques
  – Continual monitoring of the organisation (audit)
The Future ...
• “It is always a mistake for a historian to try and predict the future. Life, unlike science, is simply too full of surprises” - Richard J. Evans, In defence of history (1997, p. 62)
Web links:
• PRESERV project: http://preservation.eprints.org/
• KeepIt project: http://preservation.eprints.org/keepit/
• Plato Preservation Planning tool: http://www.ifs.tuwien.ac.at/dp/plato/intro.html
• DRAMBORA: http://www.repositoryaudit.eu/about/
• RSP briefing paper on preservation and storage formats: http://www.rsp.ac.uk/pubs/briefingpapers-docs/technical-preservformats.pdf
• WePreserve cartoons at: http://www.youtube.com/user/wepreserve
Available: http://www.youtube.com/watch?v=PGFOZLecjTc
Further reading
• Blue Ribbon Task Force on Sustainable Digital Preservation and Access, Final Report (NSF, 2010) http://brtf.sdsc.edu/
• Digital Preservation Coalition, Digital preservation handbook: http://www.dpconline.org/advice/preservationhandbook/
• JISC infoNet, Digital repositories infoKit: http://www.jiscinfonet.ac.uk/infokits/repositories
• Paradigm Project, Workbook on Digital Private Papers: http://www.paradigm.ac.uk/workbook/index.html
• Marieke Guy, JISC Beginner’s Guide to Digital Preservation (UKOLN, 2010) http://blogs.ukoln.ac.uk/jisc-beg-dig-pres/
• Digital Preservation Coalition and Digital Curation Centre, What’s New (monthly current awareness bulletin): http://www.dpconline.org/newsroom/whats-new
Further reading (research data)
• National Science Board, Long-lived digital data collections: enabling research and education in the 21st century (NSF, 2005) http://www.nsf.gov/pubs/2005/nsb0540/
• Liz Lyon, Dealing with data; roles, rights, responsibilities and relationships (JISC, 2007) http://www.jisc.ac.uk/whatwedo/programmes/digitalrepositories2005/dealingwithdata.aspx
• Neil Beagrie, Julia Chruszcz, and Brian Lavoie, Keeping research data safe: a cost model and guidance for UK universities (JISC, 2008) http://www.beagrie.com/publications.php
• Neil Beagrie, Brian Lavoie and Matthew Woollard, Keeping research data safe 2 (JISC, 2010) http://www.beagrie.com/publications.php
Questions?
Acknowledgments
• UKOLN is funded by the Joint Information Systems Committee (JISC) of the UK higher and further education funding councils, the Museums, Libraries and Archives Council (MLA), as well as by project funding from the JISC, the European Union, and other sources. UKOLN also receives support from the University of Bath, where it is based.
• More information: http://www.ukoln.ac.uk/
Thank you!