The New DRS (DRS 2) Introduction

What is DRS?

• Digital repository for preservation and access

– Maintains integrity of deposited content

– Preserves content for future reuse regardless of changes in technology

– Provides access through associated delivery services

• More about DRS: http://hul.harvard.edu/ois/systems/drs/

Why DRS 2?

• Support more formats & genres

• Improve collection management

• Improve access to content & metadata

• Support preservation planning & activities

• Implement digital preservation best practices and standards

• Improve metadata preservation

• Modern software to enable efficient processing and faster depositing

• More about DRS 2: http://hul.harvard.edu/ois/systems/drs/drs2.html

New functionality highlights

• Easier-to-use management interface

• Advanced searching on more than 200 metadata fields in DRS 2 Web Admin

• Support for rights management activities and deposit of rights documentation

• Ability to import descriptive metadata from the central catalog

• Drag-and-drop structure editor for page-turned objects

• Custom image captions

• Wordshack - a new central vocabulary registry

New functionality highlights

• Improvements for audio collections:

– Support for MP3 and MP4 audio

– Support for streaming audio as well as delivery-to-download

– A new web-based audio player with an improved user interface

• Support for text

– UTF-8 text, XML, CSV

• Support for Opaque objects (up to 50 GB per object)

– More about DRS opaque storage: http://hul.harvard.edu/ois/systems/drs/opaque-storage.html

• For the full list of new functionality see: http://hul.harvard.edu/ois/systems/drs/enhancements.html

Depositing and managing content in DRS 2

• Batch Builder 2 – depositing content

– Creates initial description of content

– Performs initial file format validity checks

– Packages files into ‘objects’ and ‘batches’ for deposit

• DRS 2 Web Admin – managing content

– Search for and view deposited content

– Make changes and additions to deposited content

• e.g. changes to metadata or access restrictions

• e.g. adding and replacing files in existing objects

• WordShack – for vocabulary management

– Create definitive terms for organizations and administrative categories

• Terms can be used in Batch Builder and Web Admin

Accessing deposited content

• Delivery Services (subject to restrictions)

– Image Delivery Service (IDS) – Images

– Page Delivery Service (PDS) – Page-turned objects

– Streaming Delivery Services (SDS) – Audio files

– File Delivery Service (FDS) – Documents (PDF), audio files, text files

• Web Admin (staff only)

– Download original content as deposited

– Only way to get some content that is not available in any delivery service

DRS Concepts

• Objects

• Files

• Metadata

• Content models

• Relationships

• Rights

• Controlled vocabulary

Objects and files

• Objects = groups of files that together make up a coherent unit of content

OBJECT: still image olvwork85592

Archival master TIFF image (FILE)

Delivery JPEG image (FILE)

Thumbnail JPEG image (FILE)

Metadata

Object descriptor is a METS file with descriptive, administrative, and technical metadata about the object and its files

OBJECT: still image olvwork85592

Object descriptor (FILE): all of the metadata for the object and its files

Archival master TIFF image (FILE)

Delivery JPEG image (FILE)

Thumbnail JPEG image (FILE)

Object Descriptors

• Every object will have one

• Metadata about the object and its files

– Descriptive, administrative, technical, preservation, structural, rights

– Significant events (ingest, deletion, etc.)

• A METS metadata file stored in the DRS

– Gives the metadata the same preservation services as the DRS content files

– Self-contained and portable

– Persists in the DRS even after the content it describes is deleted
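
Because the descriptor is a METS file, it can be read with any XML parser. The sketch below lists the files named in a generic METS document. The sample XML is hypothetical and heavily simplified (it omits sections a complete METS file needs, such as the structMap), and it is not the actual layout of a DRS object descriptor; `list_files` and the file names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical, simplified descriptor for illustration only.
SAMPLE_DESCRIPTOR = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           OBJID="example-object-1">
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="FILE1" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="master.tif"/>
      </mets:file>
      <mets:file ID="FILE2" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="delivery.jpg"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>
"""


def list_files(descriptor_xml: str) -> list[tuple[str, str, str]]:
    """Return (ID, MIME type, location) for every mets:file element."""
    root = ET.fromstring(descriptor_xml)
    results = []
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        flocat = file_el.find(f"{{{METS_NS}}}FLocat")
        # The location lives in the xlink:href attribute of mets:FLocat.
        href = flocat.get(f"{{{XLINK_NS}}}href", "") if flocat is not None else ""
        results.append((file_el.get("ID", ""), file_el.get("MIMETYPE", ""), href))
    return results


for file_id, mime_type, href in list_files(SAMPLE_DESCRIPTOR):
    print(file_id, mime_type, href)
```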

Types of Objects = Content Models

• Every DRS object conforms to a DRS content model

• Defines:

– valid file formats

– object structure and relationships

– technical metadata schemas

– known delivery and rendering applications

– associated preservation plans

• Enforces conformity – we know what we have in the DRS and can better monitor and preserve it
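
One way to picture "valid file formats" is a lookup from content model to accepted formats. This is a minimal sketch under loud assumptions: the format lists are only the examples mentioned in this deck, not the authoritative DRS lists, and it checks file extensions, whereas the deck says only that Batch Builder performs format validity checks without describing how. `CONTENT_MODEL_FORMATS` and `invalid_files` are hypothetical names.

```python
from pathlib import PurePath

# Illustrative subset: formats drawn from examples in this deck,
# NOT the authoritative list each content model accepts.
CONTENT_MODEL_FORMATS = {
    "Still Image": {".tif", ".tiff", ".jpg", ".jpeg"},
    "Audio": {".aif", ".aiff", ".mp3", ".mp4", ".xml"},
    "Text": {".txt", ".xml", ".csv"},
}


def invalid_files(content_model: str, filenames: list[str]) -> list[str]:
    """Return the filenames whose extension the content model does not list."""
    allowed = CONTENT_MODEL_FORMATS[content_model]
    return [
        name for name in filenames
        if PurePath(name).suffix.lower() not in allowed
    ]


rejected = invalid_files("Still Image", ["master.TIF", "delivery.jpg", "notes.docx"])
print(rejected)
```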

Content Models

1. Audio

2. Color Profile

3. Document

4. Opaque

5. Opaque Container

6. PDS Document

7. Still Image

8. Target Image

9. Text

Upcoming:

10. List (multi-part page-turned objects)

11. Collection

12. Email Message

13. Web Harvest

14. Google Document

15. Video

Content model: audio

Example: recorded interviews

• AIFF archival master and MP4 deliverable

• Audio Decision list in XML

Content model: color profile

Example: color profile

• ICC color profile

Image source: http://www.harddriverecoveryfix.com/

Content model: Document

Example: a report

• PDF deliverable file

Intergovernmental Panel on Climate Change (IPCC) WG1 Fourth Assessment Report, Environmental Science and Public Policy Archives, Harvard College Library

Content model: opaque

Example: The contents of a faculty member’s hard drive

• Files in various formats

– WordPerfect files

– Research datasets

– PowerPoint presentations

– Databases

– LaTeX documents

• Plus documentation about the collection

Image source: http://www.harddriverecoveryfix.com/

Content model: opaque container

Definition: Any opaque object that has been compressed into a single zip file

• All the rules for content in an opaque object apply:

– One directory for content

– One (optional) directory for documentation

• Once unzipped, the content must be depositable as an opaque object
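
The directory rules can be checked before deposit. The sketch below inspects a zip file and accepts it only if every entry sits under a content directory or an optional documentation directory. The directory names `content` and `documentation` are assumptions; the deck states only that there is one directory of each kind. `make_zip` exists purely to build test input.

```python
import io
import zipfile

# Hypothetical directory names; the deck does not name them.
CONTENT_DIR = "content"
DOCUMENTATION_DIR = "documentation"


def looks_like_opaque_container(zip_bytes: bytes) -> bool:
    """Check the one-content-directory, optional-documentation-directory rule."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = zf.namelist()
    # Reject files sitting loose at the top level of the archive.
    if any("/" not in name for name in names):
        return False
    top_level = {name.split("/", 1)[0] for name in names}
    # Content is required; documentation is optional; nothing else is allowed.
    return CONTENT_DIR in top_level and top_level <= {CONTENT_DIR, DOCUMENTATION_DIR}


def make_zip(paths: list[str]) -> bytes:
    """Build an in-memory zip with one small member per path (test helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for path in paths:
            zf.writestr(path, b"example")
    return buffer.getvalue()


print(looks_like_opaque_container(make_zip(["content/data.csv", "documentation/readme.txt"])))
```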

Content model: PDS Document

Example: a book

Zoeller, Karl William. Merchandising the plumbing business. Chicago : Domestic Engineering Co., c1921. Baker Library.

• JP2 archival master & deliverable JPEG image files per page

• Plain text files per page

Content model: still image

Example: scanned photo

• TIFF archival master and JPEG deliverable

Content model: target image

Example: scanned target

• TIFF

Content model: text

Examples: Tabular data, collection inventory list

• CSV or tab delimited file

Concepts in action

1. Harvard units digitize, create or acquire files

[Diagram: nine separate files]

Concepts in action

2. Depositors use Batch Builder to build objects from the files, generate the object descriptors, and group objects into batches for DRS deposit

[Diagram: the nine files grouped into three objects (OBJECT 1, OBJECT 2, OBJECT 3), each with its own object descriptor, all packaged in one BATCH]
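
The file, object, and batch nesting described in this step can be modeled as plain data structures. This is only a sketch of the concept, not how Batch Builder works internally: the class names, the `descriptor.xml` path, and the sample file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DRSObject:
    name: str
    content_model: str
    files: list[str]
    descriptor: str = ""  # hypothetical path of the generated object descriptor


@dataclass
class Batch:
    name: str
    objects: list[DRSObject] = field(default_factory=list)


def build_batch(batch_name: str, grouped_files: dict[str, list[str]],
                content_model: str) -> Batch:
    """Wrap each group of files in an object and collect the objects in a batch."""
    batch = Batch(batch_name)
    for object_name, files in grouped_files.items():
        batch.objects.append(
            DRSObject(
                name=object_name,
                content_model=content_model,
                files=list(files),
                # Assumed naming scheme, for illustration only.
                descriptor=f"{object_name}/descriptor.xml",
            )
        )
    return batch


batch = build_batch(
    "example-batch",
    {
        "object1": ["a_master.tif", "a_delivery.jpg", "a_thumb.jpg"],
        "object2": ["b_master.tif", "b_delivery.jpg", "b_thumb.jpg"],
        "object3": ["c_master.tif", "c_delivery.jpg", "c_thumb.jpg"],
    },
    content_model="Still Image",
)
print(len(batch.objects), sum(len(obj.files) for obj in batch.objects))
```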

Concepts in action

3. Depositors transfer batches of objects to their DRS drop boxes. The batches are loaded into the DRS.

[Diagram: BATCH 1 through BATCH 5 sent by sftp transfer to drop box directories such as falftp/incoming/ and lawftp/incoming/]

Concepts in action

4. Files and some objects can be accessed by the delivery services

Batch Builder 2

• deposit tool that creates DRS 2 object batch deposits

• takes still images, page-turned, audio, text, PDF, opaque objects, image targets and color profiles

• adds administrative, descriptive, technical metadata

• creates object descriptors

Administrative Metadata

Completed batch is ready to deposit

DRS Loader Report

DRS 2 Web Admin

• Search

• View

• Download

• Add / Replace

• Update

• Delete

• Recover

http://wcetblog.files.wordpress.com/2012/05/digital-content.jpg

New in Web Admin

• Changes to Web Admin access

• New search options

• New administrative metadata

• New relationship metadata

• New rights metadata

• WordShack term management

DRS 2 Web Admin - Managing content

• Search for and view deposited content

• Download content

– Objects, files, descriptors

• Make changes and additions to deposited content

– e.g. changes to metadata or access restrictions

– e.g. adding and replacing files in existing objects

• Delete and recover objects and files

– Delete one object at a time or up to 500 files at a time

• Delete batches

– Immediately for batches less than 5 days old

– Simple request for older batches
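
The two deletion limits above (up to 500 files at a time, and immediate deletion only for batches less than 5 days old) can be expressed as simple checks. This is a sketch, not Web Admin's logic: the function names are hypothetical, and measuring batch age from a deposit timestamp is an assumption, since the deck does not say how age is counted.

```python
from datetime import datetime, timedelta

MAX_FILES_PER_DELETE = 500
IMMEDIATE_BATCH_DELETE_WINDOW = timedelta(days=5)


def batch_delete_mode(deposited_at: datetime, now: datetime) -> str:
    """Return 'immediate' for batches under 5 days old, otherwise 'request'."""
    age = now - deposited_at  # assumption: age is counted from deposit time
    return "immediate" if age < IMMEDIATE_BATCH_DELETE_WINDOW else "request"


def chunk_file_ids(file_ids: list[str],
                   size: int = MAX_FILES_PER_DELETE) -> list[list[str]]:
    """Split a long list of file IDs into groups no larger than the limit."""
    return [file_ids[i:i + size] for i in range(0, len(file_ids), size)]


deposited = datetime(2024, 1, 1, 9, 0)
print(batch_delete_mode(deposited, datetime(2024, 1, 4, 9, 0)))
print([len(chunk) for chunk in chunk_file_ids([str(n) for n in range(1200)])])
```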

What kinds of things can you search for?

Files, objects, batches and events can be searched and managed in the DRS Web Admin

Who can do what? – Roles

• Staff will have access to view and edit content based on roles

– Specific activities like editing metadata

– Specific owner codes like FHCL.FAL
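
A role check of this kind can be sketched as a lookup on an (activity, owner code) pair. Pairing one activity with one owner code per role is an assumption about how the two bullets combine. The activity names and the `EXAMPLE.UNIT` owner code are hypothetical; `FHCL.FAL` is the example given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    activity: str    # hypothetical activity name, e.g. "edit_metadata"
    owner_code: str  # e.g. "FHCL.FAL"


def can_perform(roles: set[Role], activity: str, owner_code: str) -> bool:
    """A user may act only if they hold a role for that activity and owner code."""
    return Role(activity, owner_code) in roles


staff_roles = {
    Role("view", "FHCL.FAL"),
    Role("edit_metadata", "FHCL.FAL"),
}

print(can_perform(staff_roles, "edit_metadata", "FHCL.FAL"))
print(can_perform(staff_roles, "edit_metadata", "EXAMPLE.UNIT"))
```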

Web Admin - Searching

• Quick search by identifier.

• Advanced search in over 200 metadata fields.

• Searching full text of documents.

– Text, HTML, and PDF with embedded text

• Ability to save and restore search results.

Save & Restore Search Results

• Download search results for review, reporting

• Upload saved results (no search required)

• Instructions: http://hul.harvard.edu/ois/systems/drs/docs/wa2-userguide/search-downloadresults.html

Descriptive Metadata

• View current descriptive metadata

• Add or replace descriptive metadata

– Import with Aleph ID or upload MODS file

• Create links to

– HOLLIS records

– OASIS finding aids

– Other related external sites

Administrative Metadata

• Many editable fields in Web Admin

– Admin categories: curator-assigned label used to group files and objects

– IDS Captions: turn on or off at object level; can customize fields

– Non-public notes

Rights

• Rights information for objects and files can be stored in the DRS

• Mapped to PREMIS XML standard behind the scenes

• PREservation Metadata: Implementation Strategies

– 2003 international working group of experts

– The primary standard for preservation metadata

Rights

• Storing rights information requires a “Rights Basis”

– License, copyright, statute, policy, etc.

• Rights Documentation

– Explains the “Rights Basis”

– Deposited in a separate object

• Content Restrictions

– Embargoes, secure storage, audio downloading, image delivery size
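
The requirement that rights information always carries a rights basis can be sketched as a small record that refuses to exist without one. This is illustrative only: it is not the PREMIS schema or the DRS data model, and the class, field names, and sample identifiers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Examples named in the deck; the deck says "etc.", so this is not exhaustive
# and is not used to reject other values.
EXAMPLE_RIGHTS_BASES = ("license", "copyright", "statute", "policy")


@dataclass
class RightsStatement:
    basis: str
    # Hypothetical identifier of the separate object holding rights documentation.
    documentation_object_id: Optional[str] = None
    # Free-text restriction labels, e.g. "embargo".
    restrictions: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Rights information cannot be stored without a rights basis.
        if not self.basis.strip():
            raise ValueError("rights information requires a rights basis")


statement = RightsStatement(
    basis="license",
    documentation_object_id="rights-doc-1",
    restrictions=["embargo"],
)
print(statement.basis, statement.restrictions)
```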

Relationships

There are three types of explicit relationships in DRS 2

• File-to-file, for example:

– HAS_SOURCE – one file derived from another as a result of transformation (equivalent to DRS 1 “IS_DERIVATIVE_OF”)

• File-to-object, for example:

– HAS_WORLD_REFERENCE_DATA – points from an image file to a text object with geospatial coordinates for the image file

– HAS_DOCUMENTATION – has another object in DRS that serves as a documentation object (typically a PDF Document or Text object containing supporting documentation)

Relationships

• Object-to-object, for example:

– HAS_DOCUMENTATION – has another object in DRS that serves as a documentation object (typically a PDF Document or Text object containing supporting documentation)

– HAS_METHODOLOGY – has another object in DRS that serves as a methodology object (typically a PDF or Text file containing methodology)
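
The relationship examples above pair a relationship name with the kinds of things it may connect. A minimal sketch of that idea follows. It covers only the four relationships named in this deck, and the `Kind` enum, the table, and the function are hypothetical names, not part of DRS 2.

```python
from enum import Enum


class Kind(Enum):
    FILE = "file"
    OBJECT = "object"


# (source kind, target kind) pairs, limited to the examples in this deck.
ALLOWED_ENDPOINTS = {
    "HAS_SOURCE": {(Kind.FILE, Kind.FILE)},
    "HAS_WORLD_REFERENCE_DATA": {(Kind.FILE, Kind.OBJECT)},
    "HAS_DOCUMENTATION": {(Kind.FILE, Kind.OBJECT), (Kind.OBJECT, Kind.OBJECT)},
    "HAS_METHODOLOGY": {(Kind.OBJECT, Kind.OBJECT)},
}


def is_valid_relationship(name: str, source: Kind, target: Kind) -> bool:
    """Check that a relationship connects the kinds of endpoints listed for it."""
    return (source, target) in ALLOWED_ENDPOINTS.get(name, set())


print(is_valid_relationship("HAS_SOURCE", Kind.FILE, Kind.FILE))
print(is_valid_relationship("HAS_METHODOLOGY", Kind.FILE, Kind.OBJECT))
```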

Controlled vocabulary

• Wordshack

– new central vocabulary maintenance system for use in the context of digital repositories and digital preservation services

– offers controlled term management for admin categories, admin flags, email addresses, organizations, persons, software, and topics

Thanks

• Please get advice before starting projects

• Contact info: [email protected]

• More information about the new DRS: http://hul.harvard.edu/ois/systems/drs/drs2.html