Emulation, Migration and Long-Term Preservation of Electronic Records Cal Lee University of Michigan School of Information ECURE 2001: Preservation and Access for Electronic College and University Records October 13, 2001

Emulation, Migration and Long-Term Preservation of Electronic Records

Cal Lee

University of Michigan School of Information

ECURE 2001: Preservation and Access for Electronic College and University Records

October 13, 2001

Outline

• The Digital Preservation Problem

• Base-Line Assumptions

• Major Approaches: Migration and Emulation

• Migration

• Emulation

• For Further Reference

The Digital Preservation Problem

Technological Dependency

• Digital objects are useless if we can’t interact with them

• Those interactions depend on numerous technical components.

Key Concept - Abstraction

“Computer science is largely a matter of abstraction: identifying a wide range of applications that include some overlapping functionality, and then working to abstract out that shared functionality into a distinct service layer (or module, or language, or whatever). That new service layer then becomes a platform on top of which many other functionalities can be built that had previously been impractical or even unimagined. How does this activity of abstraction work as a practical matter? It's technical work, of course, but it's also social work. It is unlikely that any one computer scientist will be an expert in every one of the important applications areas that may benefit from the abstract service. So collaboration will be required.” (emphasis added)

- Phil Agre, Red Rock Eater, March 25, 2000

Oh so many layers

• Physical medium - only layer yielding real consensus

• Bit

• Byte

• Character encoding

• Instruction set architecture

• Physical organization of bytes

• Logical organization of chunks

• Reading hardware

• Input/output hardware

• Input/output software

But, wait, there’s more

• Operating system kernel

• Network operating system

• Networking protocols

• Desktop and windowing environment

• Data syntax

• Data structure

• Data semantics

• Data content

• Data values

• Contextual linking within and between objects
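
To make the dependency on a single layer concrete, here is a minimal Python sketch of the character-encoding layer. The byte values and the `interpret` helper are illustrative assumptions, not part of the presentation. It shows that an identical byte stream yields different "content" depending on which encoding the reader assumes.

```python
from typing import Optional

# Hypothetical stored byte stream: five bytes with no encoding label attached.
RAW = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])


def interpret(raw: bytes, encoding: str) -> Optional[str]:
    """Decode the bytes under one encoding assumption.

    Returns None when the bytes are not valid in that encoding.
    """
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # cp037 is an IBM EBCDIC code page; latin-1 is ISO 8859-1.
    for name in ("cp037", "latin-1", "utf-8"):
        print(name, repr(interpret(RAW, name)))
```

Read as EBCDIC (`cp037`), the bytes spell "Hello". Read as Latin-1 they begin with "È" followed by control characters, and as UTF-8 they are not valid at all. Nothing in the bit stream itself records which reading was intended.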

Obsolescence

"Those who forget the past are condemned to reload it."

- Nick Montfort, July 2000

• All layers undergo change over time, at varying rates.

Some Base-Line Assumptions

• Several assumptions that I will take as given.

• Making them explicit can help us to be more precise about available options and their costs/benefits.

Assumption #1: Digital objects are instructions for future interaction

• Only a small part of preservation work is about treating them like physical artifacts.

• Jeff Rothenberg takes this even further, contending that all digital objects should be seen as programs.

Assumption #2: Bits will be Bits

• Bit rot and advantages of newer media both call for periodic refresh and reformatting.

• Ensuring the integrity of the bit stream in such transfers is extremely important.

• See Charles Dollar’s 1999 book for an excellent explanation of these processes.
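
One common way to check bit-stream integrity during a refresh is to compare a cryptographic digest of the file before and after the copy. The sketch below assumes SHA-256 and a simple file-to-file copy. The function names `sha256_of` and `refresh` are hypothetical, and this is not presented as the procedure described in Dollar's book.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, target: Path) -> str:
    """Copy source to target and confirm the copy is bit-for-bit identical.

    Raises IOError if the digests differ; otherwise returns the digest so it
    can be recorded alongside the object for the next refresh cycle.
    """
    before = sha256_of(source)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise IOError(f"digest mismatch copying {source} to {target}")
    return after
```

Storing the returned digest with the object lets a later refresh verify against the recorded value rather than trusting whatever is currently on the old medium.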

Assumption #3: Change Happens

• Any long-term strategy must recognize that any underlying technical platform will eventually be abandoned by the industry and will thereafter become increasingly difficult to support.

• Ongoing preservation effort is assumed, regardless of the strategy adopted.

• Goal is to minimize (rather than eliminate) work and maximize the benefits.

Assumption #4: Must identify what’s desirable and what’s possible

• Best, most informed guess about how objects will be used.

• Characteristics that support such use.

• Currently available technical approaches.

• Whether using any given approach can cost-effectively preserve those characteristics.

• All of these decisions should be well documented and revisited periodically.

Major Approaches: Migration and Emulation

Migration

• Periodic transformation of the bits/bytes to run directly on newer platforms.

• Used widely as an approach to actively managing legacy systems.

• Work can be expensive and introduce errors of translation.

• Since the resulting objects can run directly on newer platforms, layers of technology can be minimized.
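
A small sketch of what a migration step can look like in practice, under stated assumptions: the fixed-width record layout (`LAYOUT`), the field names, and the choice of CSV as the target format are all hypothetical. The point is the final check, which re-reads the migrated output and compares it with the source values so that errors of translation surface immediately.

```python
import csv
import io
from typing import Dict, List

# Hypothetical legacy layout: (field name, start column, end column).
LAYOUT = [("name", 0, 10), ("dept", 10, 14), ("year", 14, 18)]
FIELDS = [name for name, _, _ in LAYOUT]


def parse_fixed_width(line: str) -> Dict[str, str]:
    """Slice one legacy record into named fields.

    Assumption: trailing padding is not a significant property and may be dropped.
    """
    return {name: line[start:end].rstrip() for name, start, end in LAYOUT}


def to_csv(records: List[Dict[str, str]]) -> str:
    """Write the records in the newer format (CSV with a header row)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def migrate(lines: List[str]) -> str:
    """Migrate legacy records to CSV and verify that no field value changed."""
    records = [parse_fixed_width(line) for line in lines]
    migrated = to_csv(records)
    reread = [dict(row) for row in csv.DictReader(io.StringIO(migrated))]
    if reread != records:
        raise ValueError("migrated output does not match the source records")
    return migrated
```

Note that the decision to discard padding is itself a preservation decision: it is cheap to make, easy to forget, and worth documenting.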

Emulation - Oxford English Dictionary, Second Edition

“To reproduce the action of or behave like (a different type of computer) with the aid of hardware or software designed to effect this; to run (a program, etc., written for another type of computer) by this means.”
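
The definition can be illustrated with a toy example. The sketch below is a software emulator for an invented four-instruction machine. The instruction set (`HALT`, `LOAD`, `ADD`, `OUT`), the opcode values, and the single 8-bit accumulator are assumptions made up for illustration and do not correspond to any real computer.

```python
from typing import List

# Opcodes of the hypothetical machine.
HALT, LOAD, ADD, OUT = 0x00, 0x01, 0x02, 0x03


def emulate(program: bytes) -> List[int]:
    """Run a program written for the toy machine and return what it output.

    The host only needs to run Python; the program's original hardware
    does not have to exist.
    """
    accumulator = 0
    pc = 0  # program counter: index of the next instruction byte
    output: List[int] = []
    while pc < len(program):
        opcode = program[pc]
        if opcode == HALT:
            break
        elif opcode == LOAD:  # LOAD n: put the next byte in the accumulator
            accumulator = program[pc + 1]
            pc += 2
        elif opcode == ADD:   # ADD n: add the next byte, wrapping at 8 bits
            accumulator = (accumulator + program[pc + 1]) & 0xFF
            pc += 2
        elif opcode == OUT:   # OUT: emit the accumulator's current value
            output.append(accumulator)
            pc += 1
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {pc}")
    return output
```

Here the preserved object is the program bytes, unchanged. All of the adaptation to new platforms is concentrated in `emulate`, which is the part that would need to be kept running over time.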

Popular Examples from the History of Emulation

• Hardware and software - IBM System/360 (1964)

• Operating systems

– IBM MVS (1972)

– Amiga (1985)

– Microsoft Z80 SoftCard (1980)

– DOS emulation in Windows (1987)

– SoftWindows (for Macintosh)

– Virtual PC (1997)

– Wine (Windows Emulator, 1993)

More Emulation Examples

• Processors - Intel 8080 (1974)

• Virtual Machines - Java (1995)

• Terminal emulators - Telnet (1969), WinFrame (1995)

• Lots and lots of games

Broad Issues to Address

• What level to emulate

• When to create the emulator - now vs. later, once vs. periodically

• How to develop emulators - what language, what platform

• Intellectual property rights

Arguments for preservation using emulation

• Rothenberg - specification, interpreter, virtual machine

• IBM - distinction between preserving data files and programs, create emulators to run on Universal Virtual Computer (UVC)

• CEDARS - maintain byte stream, focus on preserving the significant properties of its underlying abstract form (UAF)

• CAMiLEON - create emulator in a (simplified) high-level language, migrate emulator across platforms when necessary

Critiques of Emulation

• David Bearman is the most vocal critic

• Metadata and functional requirements are what counts for preserving electronic records

• Emulation attempts to capture too much (full functionality of technical environment) and not enough (essential characteristics of records)

A Balanced Perspective on Preservation Strategies

• No single solution

• Identify requirements THEN evaluate the technical options.

• What attributes should be preserved (which differences matter)?

• Make (and document) educated guesses of costs and benefits.

For Further Reference

• Growing literature on these issues

• Several prominent projects now and in recent years

• Please see the bibliography associated with this presentation

Thank you!