
Slide 1 Experiences on Migration of Data in Digitization Projects Julián Bescós Presentation for the ERPANET Workshop Workflow in Digital Preservation Budapest, 13-15 October 2004


Page 1: Slide 1 Experiences on Migration of Data in Digitization Projects Julián Bescós Presentation for the ERPANET Workshop Workflow in Digital Preservation.

Slide 1

Experiences on Migration of Data in Digitization Projects

Julián Bescós

Presentation for the ERPANET Workshop
Workflow in Digital Preservation
Budapest, 13-15 October 2004


Slide 2

1. The Migration Issue
2. Our Experience
3. Migration Tasks
4. Best Practices for Preservation
5. Planning and Schedule

OVERVIEW


Slide 3

• Migration is the set of tasks that achieve the periodic transfer of digital materials from one hardware/software configuration to another

Purpose
• Long-term preservation of the digital information created and stored using digital technology
• Allow broad access
– Retrieve, display and use

Origin
• New devices, processes and software replace the methods used to record, store and access
• New standards
• Enhancement of service

MIGRATION


Slide 4

• Technology obsolescence
– Hardware
   More powerful computers and higher-density storage
   Elements for updating are not available (increase of storage, memory, etc.)
– Basic software
   Operating systems
   Database managers

• Media
– Lifetime is rarely the constraining factor for digital preservation (DP)
– Obsolescence of old storage media as newer and better media become available in the market

• Obsolescence of the access software
– Access on new platforms and media
– Programs are not available in the long term
– Changes in metadata and in image formats
– New functions of the software

ORIGIN OF MIGRATION


Slide 5

• In practice it is a combination of:
– Technology obsolescence
– New functionalities of the software
– Developments derived from information and communication technology
– Daily work on digitisation, storage and access, requiring:
   Higher-density storage
   Faster computers

• It is a consequence of:
– The digital world of information and communication technology is still relatively young and immature

ORIGIN OF MIGRATION


Slide 6

• Beginning in 1988 with the design and development of the Information System for the Archivo de Indias in Seville

• Computerization of 66 archives and libraries of different kinds and sizes in Spain and abroad

• Digitization of more than 20 million pages of ancient documents

• Installation of more than 320 workstations

• Development of our own products, ArchiDOC-ArchiGES, for archives

• With a team in the areas of consulting, management, development, installation, training and maintenance of systems for archives

EXPERIENCE IN DIGITALIZATION PROJECTS

Archivo General de Indias, Sevilla Access Room in 1992


Slide 7

Archivo General de Indias, Sevilla

Archivo General de Simancas

Archivo Histórico Nacional, Madrid

Archivo Histórico Nacional - Sección Nobleza, Toledo

Archivo Histórico Nacional Sección Guerra Civil, Salamanca

Archivo de la Corona de Aragón, Barcelona

Archivo General de Navarra

Archivo del Reino de Valencia

Archivo del Reino de Mallorca

Biblioteca Sancho el Sabio, Vitoria

Archivo Virtual de la Corona de Aragón (with images from the ACA and AHN)

Archivo Eclesiástico de Poblet

Archivo Histórico Universidad de Salamanca

Archivo Histórico de la Universidad de Santiago de Compostela

Archivo Histórico de la Universidad de Oviedo

Archivo General de la Nación, Colombia

Archivo Histórico Ultramarino, Lisboa

Archivo del Nacionalismo de la Fundación Sabino Arana, Vizcaya

Biblioteca Valenciana

Archivo del Ilustre Colegio Notarial de Granada

Real Academia Española (Historical Dictionaries)

Diccionario Biográfico Real Academia Historia

Archivo General Militar, Segovia

Archivo General Militar, Ávila

Instituto de Historia y Cultura Militar

Archivo General de la Marina, El Viso del Marqués, Ciudad Real

Archivo Histórico Provincial de Murcia

Information System of the Archive, Library, Photo Library and Video Library of Cruz Roja Española

Biblioteca de la Fundación Francisco de Zabalburu, Madrid

Biblioteca Parlamento Vasco

Archivo-Biblioteca de la Diputación de Cáceres

Digitization of 11 newspapers for 11 Basque institutions (retrospective and current press)

Archivo Municipal de Castellón de la Plana

Archivo Histórico del Excmo. Ayuntamiento de La Laguna, Tenerife

Archivo del Ayuntamiento Oviedo

Archivo del Komintern, Moscow, and its replicas in 6 national archives, the Library of Congress (LOC) and the Open Society Archives

MAIN PROJECTS WITH DIGITALIZATION

Archivo General de Navarra

Archivo General Militar, Segovia

Zabalburu Library


Slide 8

Date | Institution | Number of Images | Kind of Images
89-02 | Archivo General de Indias, Sevilla | 11.000.000 | Manuscripts XVI-XIX
97- | Archivo General de la Nación, Colombia | 1.000.000 | Manuscripts
94-00 | Archivo General de Simancas | 1.000.000 | Manuscripts
97-04 | Archivo General Militar, Ávila | 180.000 | Military files
97-04 | Archivo General Militar, Segovia | 300.000 | Military files
98-04 | Archivo General de Navarra | 450.000 | Medieval manuscripts
98- | Archivo General de la Marina, El Viso del Marqués, Ciudad Real | 150.000 | Manuscripts
96- | Archivo de la Real Chancillería de Valladolid | | Manuscripts
93-03 | Archivo Histórico Nacional, Madrid | 3.000.000 | Manuscripts
95-01 | Archivo Histórico Nacional - Sección Nobleza, Toledo | 300.000 | Manuscripts
96 | Archivo Histórico Nacional Sección Guerra Civil, Salamanca | | Manuscripts
96 | Archivo Histórico Provincial, Vizcaya | |
97 | Archivo Histórico Provincial de Murcia | 250.000 | Protocols
99-02 | Archivo Histórico Provincial de Oviedo | |
95 | Archivo Histórico Ultramarino, Lisboa | | Ancient manuscripts
95-04 | Archivo de la Corona de Aragón, Barcelona | 200.000 | Medieval manuscripts
94-01 | Archivo Histórico de la Universidad de Salamanca | 700.000 | Manuscripts
96-02 | Archivo Histórico de la Universidad de Oviedo | |
97-04 | Archivo Histórico de la Universidad de Santiago de Compostela | 400.000 | Manuscripts
98-02 | Archivo del Komintern, Moscow | 1.000.000 | Documents 1900-1945
93-04 | Biblioteca y Archivo de la Fundación Sancho el Sabio, Vitoria | 1.100.000 | Monographs XVI-XIX
96-02 | Biblioteca de la Fundación Francisco de Zabalburu, Madrid | 700.000 | Manuscripts and monographs
96-00 | Archivo del Nacionalismo de la Fundación Sabino Arana, Vizcaya | 100.000 |
97-01 | Archivo Histórico del Excmo. Ayuntamiento de La Laguna, Tenerife | 100.000 | Manuscripts
96 | Archivo del Ilustre Colegio Notarial de Granada | 200.000 | Protocols
1998 | Instituto de Historia y Cultura Militar | 100.000 | Manuscripts
95-00 | Archivo Eclesiástico de Poblet | 200.000 | Manuscripts
98- | Archivo-Biblioteca de la Diputación de Cáceres | 200.000 | Minutes
98 | Archivo Municipal de Castellón de la Plana | |
98 | Centro de Investigaciones Biológicas (CSIC) | |

FIGURES OF DIGITALIZATION


Slide 9

Date | Institution | Number of Images | Kind of Images
96-00 | Real Academia Española | | Historical dictionaries
96-00 | Digitization of 11 newspapers for 11 Basque institutions | 300.000/year | Historical newspapers
99-04 | Archivo Histórico Provincial Cantabria | |
2000 | Archivo Ayuntamiento Estella | |
00-02 | Archivo y Biblioteca Cruz Roja | | Photographs, monographs
00-04 | Archivo Virtual de Aragón (images from the ACA and AHN) | | Medieval manuscripts
00-01 | Proyecto AER (initially with AGI and AHN) | |
00-04 | Biblioteca Parlamento Vasco | 300.000 | Monographs
01-04 | Archivo del Reino de Valencia | | Manuscripts, protocols
01-02 | Diccionario Biográfico Real Academia Historia | |
01 | Archivo del Ayuntamiento Oviedo | | Census registers (padrones) XV
01-04 | Archivo del Reino de Mallorca | |
02 | Sistema Archivos Principado Asturias | |
02 | Archivo Casa de Alba | |

FIGURES OF DIGITALIZATION


Slide 10

1. Projects from 1988 – 1992: Computer System for Archivo General de Indias

• The Archive contains 86 million pages of original manuscripts related to the Spanish Administration in America (XV-XIX centuries), in 43.000 bundles

• The Computer System integrated:
– A textual database with 400.000 descriptive entries
– A digital image archive with 11 million digital images in 1995
– A module for user and document management: control of user management, consultation room, document movements and statistics

• Access by researchers and archivists from 50 workstations

• About 30% of present consultations are on the screen (1 million pages/year)

• About 35% of prints are digital (85.000/year)

• Access system in service since 1992

EXPERIENCES ON MIGRATION


Slide 11

Architecture

• The Data Base for Descriptions in SQL/400 keeps the hierarchical structure of fonds

• Standalone digitization workstations with flatbed scanners and optical disk drives under DOS

• Image servers based on PCs with optical disk drives

• Access from PCs under OS/2

Image Acquisition and Storage

• 11 million images digitized in gray levels with high fidelity with respect to the original manuscripts

• Low cost workstations

• Legibility Enhancements applied by users at the consultation time

• Non expert digitization operators

• Digitization: 100 dpi, 16 gray levels

• 1 page/minute, 15 workstations, 2 shifts, 4 years
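
The figures on this slide can be cross-checked with a little arithmetic. The sketch below estimates the total capture capacity and the uncompressed size of one A4 page. The shift length and the number of working days per year are assumptions made only for illustration; they do not come from the slides.

```python
# Rough consistency check of the digitization figures on this slide.
PAGES_PER_MINUTE = 1
WORKSTATIONS = 15
SHIFTS_PER_DAY = 2
YEARS = 4
HOURS_PER_SHIFT = 8          # assumption, not stated in the slides
WORKING_DAYS_PER_YEAR = 220  # assumption, not stated in the slides

pages_per_day = (PAGES_PER_MINUTE * 60 * HOURS_PER_SHIFT
                 * SHIFTS_PER_DAY * WORKSTATIONS)
total_pages = pages_per_day * WORKING_DAYS_PER_YEAR * YEARS

# Uncompressed size of an A4 page at 100 dpi with 16 gray levels (4 bits/pixel).
A4_INCHES = (8.27, 11.69)
DPI = 100
BITS_PER_PIXEL = 4  # 16 gray levels = 2**4
width_px = round(A4_INCHES[0] * DPI)
height_px = round(A4_INCHES[1] * DPI)
raw_bytes = width_px * height_px * BITS_PER_PIXEL // 8

print(f"capacity: {total_pages:,} pages over {YEARS} years")
print(f"raw A4 image: {width_px} x {height_px} px, {raw_bytes:,} bytes")
```

Under these assumptions the capacity comes out at roughly 12.7 million pages, the same order of magnitude as the 11 million images reported. The raw page is about 480 KB, so a compressed size of 300-350 KB corresponds to a lossless ratio of roughly 1.4:1 to 1.6:1.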

EXPERIENCES ON MIGRATION


Slide 12

Image Acquisition and Storage

• Images stored on WORM optical disks
– The low-level structure (bundle/documents) was also reflected in directories on the WORM disks
– Images on a disk are accessed through the call number of the document
– Image path as metadata: image names carried the document call number and the page number
– No standard compression was available for gray-level images; images were losslessly DPCM-compressed in software

• Compressed Image size of A4: 300-350 Kbytes

• Storage for 1 bundle: 2000 x 350 = 700 MB
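
To illustrate what lossless DPCM means for 16-level images, here is a minimal sketch. It is not the codec used in the project, whose details are not given in the slides. It uses the simplest possible predictor (the previous pixel in the row) and stores each difference modulo 16, so decoding recovers the row exactly. The function names and the sample row are hypothetical.

```python
def dpcm_encode(row, levels=16):
    """Replace each pixel by its difference from the previous pixel (mod levels)."""
    residuals = []
    previous = 0  # the predictor starts at 0 by convention
    for pixel in row:
        residuals.append((pixel - previous) % levels)
        previous = pixel
    return residuals


def dpcm_decode(residuals, levels=16):
    """Rebuild the original pixels by accumulating the residuals (mod levels)."""
    row = []
    previous = 0
    for residual in residuals:
        pixel = (previous + residual) % levels
        row.append(pixel)
        previous = pixel
    return row


row = [7, 7, 8, 8, 8, 9, 15, 15, 0, 1]   # one row of 4-bit gray values
encoded = dpcm_encode(row)
decoded = dpcm_decode(encoded)
print(encoded)
print(decoded == row)
```

In smooth image regions the residuals cluster around zero, which is what a subsequent entropy coder (not shown here) would exploit to reduce the file size without any loss.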

EXPERIENCES ON MIGRATION


Slide 13

Image Acquisition and Storage

• Media for storage of digital images:

Bundles | Media | Year beg. | Number of disks | Images
1.729 | IBM optical disks (200 MB) | 1989 | 6.916 | 3.458.000
3.732 | Plasmon optical disks (940 MB) | 1991 | 3.732 | 7.464.000
50 | CD-R (640 MB) | 1996 | | 100.000
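
The media figures above are mutually consistent, as a short check shows. The sketch assumes 2000 images per bundle, the same figure used in the per-bundle storage estimate (2000 x 350); the variable names are illustrative only.

```python
IMAGES_PER_BUNDLE = 2000  # from the storage estimate "2000 x 350 = 700 MB"

# (media, bundles, number of disks; None where the slide gives no figure)
media = [
    ("IBM optical disk (200 MB)", 1729, 6916),
    ("Plasmon optical disk (940 MB)", 3732, 3732),
    ("CD-R (640 MB)", 50, None),
]

total_images = sum(bundles * IMAGES_PER_BUNDLE for _, bundles, _ in media)
optical_disks = sum(disks for _, _, disks in media if disks is not None)
ibm_disks_per_bundle = 6916 / 1729

print(f"total images: {total_images:,}")
print(f"optical disks: {optical_disks:,}")
print(f"IBM disks per bundle: {ibm_disks_per_bundle}")
```

The totals match the rest of the presentation: about 11 million images, about 10.600 optical disks, and 4 IBM disks of 200 MB for a bundle of roughly 700 MB.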

EXPERIENCES ON MIGRATION


Slide 14


Slide 15


Slide 16


Slide 17

Example of blotches removal to be applied by the user


Slide 18


Slide 19

Example of reduction of ink bleeding through the paper


Slide 20

Archivo General de Indias

Digitization Room of Archivo de Indias in 1989


Slide 21

Archivo General de Indias

Shelf with optical disks


Slide 22

2. Projects from 1992 – 1996:

– Database server under OS/2 and DB2
– Access and digitization workstations based on PCs with OS/2
– The relational database keeps the hierarchical structure of the documentation
– Images stored on CD-Rs
   Directory structures and image names changed.
   Metadata in binary control files: each image has information about signature (call number), position in the hierarchical structure, page number, notes
   Image compression: JPEG
   Metadata in images: resolution, date, dimensions

EXPERIENCES ON MIGRATION


Slide 23

Example: metadata in Binary Control File

– The file keeps information about the hierarchical structure

– It maintains the relationship between each image file and its position in the document.

– The control file and its metadata can be imported into the database
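
As an illustration of the idea, the sketch below packs and unpacks one fixed-length control record per image with Python's `struct` module. The actual layout of the project's control files is not described in the slides; the field list, the field widths and the sample values here are hypothetical.

```python
import struct

# Hypothetical record layout: 32-byte call number, hierarchy level (uint16),
# page number (uint32), 64-byte free-text note. Little-endian, no padding.
RECORD = struct.Struct("<32sHI64s")


def pack_record(call_number, level, page, note=""):
    """Serialize one image's metadata as a fixed-length binary record."""
    return RECORD.pack(call_number.encode("utf-8"), level, page,
                       note.encode("utf-8"))


def unpack_record(blob):
    """Parse a binary record back into a dictionary ready for database import."""
    call_number, level, page, note = RECORD.unpack(blob)
    return {
        "call_number": call_number.rstrip(b"\x00").decode("utf-8"),
        "level": level,
        "page": page,
        "note": note.rstrip(b"\x00").decode("utf-8"),
    }


blob = pack_record("EXAMPLE-123/4", level=3, page=17, note="stained")
record = unpack_record(blob)
print(len(blob), record)
```

Fixed-length records make it easy to seek directly to the entry for a given image, but the layout must be documented separately: unlike a text format, nothing in the file itself explains what the bytes mean.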

EXPERIENCES ON MIGRATION


Slide 24

Migration of Images of Archivo de Indias from 10.600 optical disks to 6.000 CD-Rs

– The images of a bundle are stored on 1 or 2 CD-Rs
– Optical disks are read through the network
– No direct connectivity between the optical disks and Windows NT
– Main operation tasks:
   Decompression of the DPCM format
   Compression to JPEG format
   Temporary storage on magnetic disk
   All images of the bundle are copied to CD-R
   Verification of images by reading
   6.000 CD-Rs, plus a 6.000 CD-R backup copy
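
The copy-and-verify part of these tasks can be sketched as follows. This is an illustration under stated assumptions, not the migration software used in the project: verification is done here by comparing SHA-256 checksums after reading the copy back, and the `convert` argument is a hypothetical hook standing in for the DPCM-to-JPEG recompression, which is not implemented.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Read a file back from the target medium and return its checksum."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_bundle(source_dir, target_dir, convert=None):
    """Copy every file of a bundle, verify each copy, return the failures.

    convert: optional function (bytes -> bytes) applied before writing,
    a placeholder for the format conversion step.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for source in sorted(p for p in source_dir.iterdir() if p.is_file()):
        data = source.read_bytes()
        if convert is not None:
            data = convert(data)
        target = target_dir / source.name
        target.write_bytes(data)
        # Verification by reading: the bytes on the target must match.
        if sha256_of(target) != hashlib.sha256(data).hexdigest():
            failed.append(source.name)
    return failed


with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "bundle_0001"
    bundle.mkdir()
    (bundle / "page_0001.img").write_bytes(b"demo image bytes")
    failures = migrate_bundle(bundle, Path(tmp) / "staging")
    print("files needing recovery:", failures)
```

Any file name returned in the failure list would have to be recovered from alternative media or digitized again.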

EXPERIENCES ON MIGRATION


Slide 25

EXPERIENCES ON MIGRATION

IBM Disks to CD-R

Migration of images from 6.916 IBM WORM disks to CD-Rs
– Typically 4 WORM disks (200 MB each) onto 1 or 2 CD-Rs

Diagram components:
– IBM optical drives: Microchannel IBM PS/2; file system driver for OS/2; OS/2 1.3 and LAN Server; Token Ring Microchannel card
– Network: Token Ring
– CD-R drives: Pentium PC; Windows NT; Token Ring PCI card; 3 GB disk; SCSI interface


Slide 26

EXPERIENCES ON MIGRATION

Plasmon Disks to CD-R

Migration of images from 3.732 Plasmon WORM disks to CD-Rs
– 1 Plasmon WORM disk (940 MB) onto 1 or 2 CD-Rs

Diagram components:
– Plasmon drives: PC with i486; SCSI interface; file system driver for OS/2; OS/2 3.0; Ethernet card
– Network: Ethernet hub
– CD-R drives: Pentium PC; Windows NT; Token Ring PCI card; 3 GB disk; SCSI interface


Slide 27

Migration of Images of Archivo de Indias from 10.600 optical disks to 6.000 CD-Rs
– Requirements of personnel and time
   3 operators for 4 months

EXPERIENCES ON MIGRATION

Similar migration schemes with fewer images:

•Library Sancho el Sabio ( Vitoria) 1.000.000 images

•University of Salamanca 700.000 images

•Archivo General Militar, Segovia 200.000 images

•Archivo del Monasterio Poblet 100.000 images


Slide 28

3. Projects from 1996 to now:

– Oracle database
– Access and digitization workstations based on PCs with Windows NT ... Windows XP
– Images also captured with standard programs, along with their metadata
– Images stored on magnetic disks, with CD-ROMs as backup
   Metadata in the database: scanning operator, date of creation, signature (call number), path, size in bytes… Data about control of the information
   Metadata in the image: resolution, dimensions… Data for presentation on computers and for printing
   Image quality: 200 – 300 dpi, 256 gray levels; color images
   Standard formats: TIFF, CCITT Group IV, JPEG, PDF

EXPERIENCES ON MIGRATION


Slide 29

Example: metadata in database

EXPERIENCES ON MIGRATION

Management of Image Access

Modes of Image Display


Slide 30

Example: metadata XML File

– Same functionality as the binary control file

– Standard: virtually any program can import these metadata
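
A minimal sketch of such an XML metadata file, written and read back with Python's standard `xml.etree.ElementTree` module, is shown below. The element and attribute names (`document`, `image`, `callNumber`, `page`, `file`) and the sample values are assumptions chosen for illustration; the schema used in the project is not shown in the slides.

```python
import xml.etree.ElementTree as ET


def build_metadata(call_number, pages):
    """Serialize a document's image list as an XML string."""
    root = ET.Element("document", attrib={"callNumber": call_number})
    for number, filename in pages:
        ET.SubElement(root, "image",
                      attrib={"page": str(number), "file": filename})
    return ET.tostring(root, encoding="unicode")


def read_metadata(xml_text):
    """Parse the XML back into a call number and a list of (page, file)."""
    root = ET.fromstring(xml_text)
    pages = [(int(image.get("page")), image.get("file"))
             for image in root.iter("image")]
    return root.get("callNumber"), pages


xml_text = build_metadata("EXAMPLE-123/4",
                          [(1, "0001.jpg"), (2, "0002.jpg")])
call_number, pages = read_metadata(xml_text)
print(xml_text)
print(call_number, pages)
```

Because the file is plain, self-describing text, a future system can import it with any XML parser, which is the advantage over the binary control file.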

EXPERIENCES ON MIGRATION


Slide 31

Migration of Archivo de Indias from CD-R to magnetic disk in 2000

– Project for online access and Internet
   Just a copy: the images are already JPEG-compressed
   10 RAID cabinets of 350 GB each (8 disks x 50 GB)
   1 operator was required for 1 month to copy from a CD-ROM tower to magnetic disks
– Transfer rate from different media:

Media | Transfer rate | Image | Bundle
IBM optical disk | 60 KB/s | 6 seconds | 4 hours
Plasmon optical disk | 100 KB/s | 3 seconds | 1 hour
CD-R 16x | 2,5 MB/s | <1 second | 5 minutes
Magnetic disk | 80 MB/s | | 1 minute

Similar migrations:
   Sancho el Sabio Library (Vitoria): 1 million images
   Zabalburu Library: 700.000 images
   Military Archives: 500.000 images
   Archivo General Navarra: 600.000 images
   Komintern Archives (Moscow): 1 million images
   ...

EXPERIENCES ON MIGRATION

Komintern Archives, Moscow


Slide 32

Diagram components: Image Server with RAID Cabinets 1 to 10; Data Base Server; Domain Controller Server; Web Server; four UPS units

Archivo General de Indias

SERVERS AND IMAGE STORAGE


Slide 33

Diagram components: Data Base Server; Domain Controller Servers; Web Servers; Image Server with RAID Cabinets 1 and 2 (space reserved for RAID Cabinet 3); two UPS units plus one reserved UPS; auto-replicated online remote disk subsystem for backup and service; local network

Archivo General de Indias


Slide 34

• Analysis of origin and destination data models

• Equivalence between the fields in the origin and destination models

– New versions include new metadata not available before

• Development of migration software

• Testing with a limited number of objects

• Display of information in a destination card

• Application of migration to all data

• Verification of results

• Correction of errors:
– Sometimes some images cannot be copied and must be recovered from alternative media or even digitized again
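
The tasks above can be sketched for the descriptive metadata as a field-mapping routine that is first tried on a limited number of objects and then applied to all data. The field names in both models, and the `scan_operator` field standing for metadata that did not exist before, are hypothetical.

```python
# Equivalence between origin and destination fields (hypothetical names).
FIELD_MAP = {
    "signatura": "call_number",
    "titulo": "title",
    "fecha": "date",
}
# Metadata present only in the destination model start out empty.
NEW_FIELDS = {"scan_operator": None}


def migrate_record(origin):
    """Map one origin record; also report origin fields with no equivalent."""
    destination = dict(NEW_FIELDS)
    for old_name, new_name in FIELD_MAP.items():
        if old_name in origin:
            destination[new_name] = origin[old_name]
    unmapped = sorted(set(origin) - set(FIELD_MAP))
    return destination, unmapped


def migrate(records, limit=None):
    """Migrate all records, or only the first `limit` for a trial run."""
    batch = records if limit is None else records[:limit]
    migrated, problems = [], []
    for index, origin in enumerate(batch):
        destination, unmapped = migrate_record(origin)
        migrated.append(destination)
        if unmapped:
            problems.append((index, unmapped))
    return migrated, problems


records = [
    {"signatura": "EX-1", "titulo": "Letter", "fecha": "1550"},
    {"signatura": "EX-2", "titulo": "Report", "legajo": "12"},
]
trial, trial_problems = migrate(records, limit=1)   # testing with few objects
full, full_problems = migrate(records)              # application to all data
print(len(full) == len(records), full_problems)
```

Records listed in the problem list carry fields that the mapping does not cover; they are the ones to review during verification before the migration is accepted.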

MIGRATION TASKS

Komintern Archives, Moscow


Slide 35

Komintern Archives, Moscow

MAIN COST FACTORS

• Preparation of the system for migration
– Hardware and basic software:
   Magnetic disk storage for images
   PCs with appropriate OS and DB manager

• Development of software (1 programmer, 2-3 weeks of work)
– Software development for migration
– Testing of the migration of data

• Operation (usually less than 1 week)
– Significant operator effort when removable media are involved


Slide 36

• General principles
– Based on PCs and mainstream commercial equipment
– Key hardware provided by first-class IT companies
– Database managers of widespread use
– Consultations with institutions undertaking projects
– Based on standard elements and formats, official or de facto, such as TIFF, JPEG, XML, etc.
– Modular, allowing progressive installation and easy update of elements
– Selection of software:
   Functionalities
   Number of installations
   Maintenance
   Provided by an IT company established in the sector
– Key factors:
   Server, operating system, database manager
   Backup policies

BEST PRACTICES FOR PRESERVATION


Slide 37

• Digitization
– Capture systems:
   Robust flatbed scanners (A3)
   Overhead (zenithal) scanners. Digital cameras, with limitations.
– Use of standard compression formats: JPEG, CCITT Group IV
– Ensure that digital images will allow a broad range of future uses
– Capture the highest-quality image technically possible and economically feasible for large-scale production
– Capture the informational content / physical appearance
– Fast and easy correction of errors

• Criteria for holding selection
– Value
– Condition
– Use
– Acceptability of the digital object
– Access aids

BEST PRACTICES FOR PRESERVATION


Slide 38

• Storage
– Media of wide use and low cost:
   Magnetic disk for online image service (especially under high demand)
      Disks with redundancy
      Backup on high-capacity tapes (10/20 GB)
      One or two units available as hot-swap spares
      It allows migration without operator intervention
   In a distributed network they may need to be stored online in multiple locations
   CD-R or DVD as backup for offline access in case of system failure
– In general there is little experience in storing massive quantities of culturally valuable materials

• Backup and Recovery
– Use industry-standard backup and recovery procedures:
   Periodic backup to magnetic tape
   A copy held on site for near-term recovery
   A copy stored off site for disaster recovery

BEST PRACTICES FOR PRESERVATION


Slide 39

Traditional approach of Computer Science

• Migration of media
– Refreshing digital information by copying it from medium to medium
– Conversion of files to another format to be interpreted by new programs; to a reduced number of standard formats

• Migration of technology platform
– Server and PCs
– Peripherals
– Capture devices and CD-R writers
– Operating system and database manager

• Migration of the digitising and access software
– Maintenance of software on the new platform
– New software versions for digitising and access

APPLICATION OF MIGRATION


Slide 40

• Planning for migration is difficult due to:

– the limited experience

– we cannot predict when media, software and hardware will become obsolete

• No single strategy applies to all formats of digital information

• It varies across application environments, for different formats of digital materials and for preserving different degrees of computation, display and retrieval

• It requires a unique new solution for each new format and process

• Automatic conversion is only partially possible

• In general there are no firm plans for migration, other than staying up to date with current technologies by migrating the content

• Usually there is urgency involved in migration, caused by the obsolescence of software and hardware

PLANNING


Slide 41

• Schedule

– New releases of software, databases, etc. can be expected every 2-3 years, with minor updates more often

– Migration from one storage medium to another every 4-5 years, if not online

– Migration to new hardware and software occurs less frequently, but can be expected every 5-10 years
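
These intervals can be turned into a simple planning aid. The sketch below returns the earliest and latest year in which the next migration of each kind would fall due; the category labels and the function name are hypothetical, and the intervals are the rough ranges from this slide rather than firm rules.

```python
# Expected intervals in years (low, high), taken from the schedule above.
INTERVALS_YEARS = {
    "software release": (2, 3),
    "offline storage media": (4, 5),
    "hardware and software platform": (5, 10),
}


def next_due(last_year, kind):
    """Return the (earliest, latest) year the next migration is expected."""
    low, high = INTERVALS_YEARS[kind]
    return last_year + low, last_year + high


window = next_due(1996, "offline storage media")
print(window)
```

As an illustration, media first written in 1996 would fall due between 2000 and 2001 under this rule, which is in line with the CD-R to magnetic disk migration of 2000 described earlier.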

SCHEDULE


Slide 42

• Best practices for Digital Preservation

– Mainstream commercial equipment

– Use of standard formats

– Storage in magnetic disk with redundancy

– Backup policies

– Maintenance

• Periodical Update Policy

– Hardware

– Media

– Basic software

– Application software

SUMMARY