Digital Collections, Repositories, and Archives

Digital Collection Services: Digital Collections, Repositories, and Archives. Greg Zick, Vice President, Digital Collection Services. ARL Research Library Leadership Fellows, November 1st, 2007.

Transcript of Digital Collections, Repositories, and Archives

Page 1: Digital Collections, Repositories, and Archives

Digital Collection Services

Digital Collections, Repositories, and Archives

Greg Zick, Vice President, Digital Collection Services

ARL Research Library Leadership Fellows

November 1st, 2007

Page 2: Digital Collections, Repositories, and Archives

Future Direction - New OCLC Strategy

Enterprise product architecture

Local

Global

Group

WorldCat Grid

Page 3: Digital Collections, Repositories, and Archives

Presentation Outline

• Digital Collection Creation
• Institutional Repositories
• Digital Repositories
• OCLC Future Directions

Page 4: Digital Collections, Repositories, and Archives

One way to digitize all the world’s collections

• List all the institutions that have collections: Institutional Registry
• List all the collections at each institution: Archive Grid
• Digitize the items in the collections: Digital Registry
• Create the Digital Collection: CONTENTdm
• Archive original TIFFs: Digital Archive
• Add to Global Digital Repository: Harvest metadata into WorldCat

Page 5: Digital Collections, Repositories, and Archives

Digital Collection Creation

CONTENTdm

Page 6: Digital Collections, Repositories, and Archives

Organize and store your digital collections

Expose your digital collections while providing Web access

Assess users’ needs and collections’ conditions

Convert your materials to digital collections and create metadata

Page 7: Digital Collections, Repositories, and Archives

What is CONTENTdm?

• A complete software solution developed by DiMeMa
• Store, manage and access your digital collections
• For organizations of all types and sizes, from academic libraries, historical societies, public libraries and other collaborators
• Showcase a wide range of media types, from photos and documents to audio and video files

Page 8: Digital Collections, Repositories, and Archives

Map: number of licensees by state and province

United States: Alaska 2, Washington 18, Oregon 8, Idaho 2, Montana 2, Wyoming 1, North Dakota 2, South Dakota 1, Nevada 4, Utah 7, California 20, Arizona 3, Colorado 2, Nebraska 3, Kansas 5, New Mexico 1, Texas 10, Oklahoma 1, LA 2, Arkansas 1, MS 1, AL 4, Tennessee 3, Missouri 4, Georgia 2, FL 4, SC 1, N Carolina 3, Iowa 6, KY 3, Illinois 10, Indiana 11, Ohio 18, WV 1, VA 2, Pennsylvania 16, Minnesota 5, Wisconsin 8, Michigan 10, New York 24, VT 3, NH 1, MA 3, RI 1, CT 3, NJ 2, DE 1, MD 3, DC 1

Shown without a count: Maine, Hawaii

Canada: 11 licensees (British Columbia 2, Alberta 4, Ontario 2, Quebec 1, Newfoundland – 1, Nova Scotia – 1)

Shown without a count: Saskatchewan, Manitoba, New Brunswick

Page 9: Digital Collections, Repositories, and Archives

Building partnerships and CONTENTdm digital collections

• Multi-regional or statewide consortia: AK, CA, FL, IA, IL, IN, NE, MN, MO, NY, OH, PA, WI
• Mountain West Digital Library – 20 members
• Western Waters Digital Library – 30 members
• Columbia River Basin Ethnic History Archive
• Digital Library of Appalachia – 12 members
• Historically Black Colleges and Universities Library Alliance
• Upper Mississippi Valley Digital Image Archive

REALIA Project – Assoc. Colleges of the Midwest, Assoc. Colleges of the South and the Great Lakes College Assoc.

Consortium of Liberal Arts Colleges (CLAC) – 62 members

These and many more…

Page 10: Digital Collections, Repositories, and Archives

Images: Photographs, posters, postcards

Page 11: Digital Collections, Repositories, and Archives

Maps

JPEG2000 supported

Page 12: Digital Collections, Repositories, and Archives

Zoom and Pan

Page 13: Digital Collections, Repositories, and Archives

Documents: Text-based letters, journals, diaries and more

Transcribed text becomes searchable!

Page 14: Digital Collections, Repositories, and Archives

CONTENTdm’s Integrated OCR Capability

Documents: Text-based letters, journals, diaries and more

Search term(s) highlighted

Page 15: Digital Collections, Repositories, and Archives

Bureau of Economic Analysis

OCLC Preservation Service Center completed scanning and provided digital images to BEA.

Use of CONTENTdm to search and access.

Page 16: Digital Collections, Repositories, and Archives

Documents: Newspapers

Page 17: Digital Collections, Repositories, and Archives

Audio/video

Page 18: Digital Collections, Repositories, and Archives

Browse

Page 19: Digital Collections, Repositories, and Archives

Advanced search

Select available collections

Search

Page 20: Digital Collections, Repositories, and Archives

Discovery: WorldCat harvesting & Open WorldCat

Page 21: Digital Collections, Repositories, and Archives

CONTENTdm provides comprehensive solutions for your historical digital archives

One system, many solutions

Photographs, diaries, slides, newspapers, books, letters, audio and video oral histories, maps – all your digital assets!

http://www.contentdm.com/customers/

Page 22: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

Connexion digital import
• New option for OCLC catalogers
• Add digital resources to CONTENTdm via the Connexion cataloging process

Benefit
• More options for building digital collections
• Catalog using the most convenient process for the organization

Page 23: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

Connexion digital import
• Add items to CONTENTdm via the Connexion Client
• Digital collection growth built into cataloging workflow
• WorldCat MARC record crosswalked to Qualified Dublin Core and added to CONTENTdm
• OCLC number stored in CONTENTdm – a global persistent identifier
• Digital items accessible by FirstSearch, WorldCat.org and WorldCat Local
• Requires OCLC Cataloging subscription, CONTENTdm license and CONTENTdm Hosting Services
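
The MARC to Dublin Core crosswalk mentioned on this slide can be sketched roughly as follows. This is a minimal illustration and not OCLC's actual mapping: the `CROSSWALK` table is a small, simplified subset of common MARC-to-Dublin-Core pairings, it uses plain elements without the refinements a full Qualified Dublin Core mapping would carry, and the function name, input shape, and `OCLC:` identifier prefix are assumptions made for the example.

```python
# Hypothetical, simplified (MARC tag, subfield) -> Dublin Core element table.
# A production crosswalk covers many more fields and qualifiers.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("520", "a"): "description",
    ("650", "a"): "subject",
}


def marc_to_dc(fields, oclc_number):
    """Map (tag, subfield, value) tuples to a dict of Dublin Core elements.

    The OCLC number is carried along as an identifier, mirroring the slide's
    point that it is stored with the item as a persistent identifier.
    """
    record = {"identifier": [f"OCLC:{oclc_number}"]}
    for tag, code, value in fields:
        element = CROSSWALK.get((tag, code))
        if element is None:
            continue  # unmapped fields are skipped in this sketch
        record.setdefault(element, []).append(value.strip())
    return record


# Made-up sample record, for illustration only.
sample = [
    ("245", "a", "Sample title "),
    ("100", "a", "Sample, Author"),
    ("650", "a", "Maps"),
    ("650", "a", "Photographs"),
    ("999", "a", "local note"),
]
dc = marc_to_dc(sample, 12345)
```

Repeatable MARC fields such as 650 become repeated values of the same element, which is why every element holds a list.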

Page 24: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

Connexion digital import

Metadata choices for cataloging
• Connexion client (MARC)
• CONTENTdm (DC, QDC, VRA)
• Acquisition Station
• Web-based Add option

Serials support
• Use “Attach Digital Object” in Connexion client for each issue in a serial item
• 856 link will automatically retrieve a search results page with links to each issue stored in CONTENTdm
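
As a rough illustration of the 856 link described above, the sketch below formats a MARC 856 field whose `$u` subfield points at a search-results URL. The URL shape, the query parameter names, and the function name are all hypothetical; the slide does not say how CONTENTdm actually builds these links. The indicators follow the MARC 21 definition of field 856 (first indicator 4 for HTTP access, second indicator 0 for the resource itself).

```python
from urllib.parse import urlencode


def build_856(base_url, collection_alias, oclc_number):
    """Return a display-form 856 field linking to a search-results page.

    base_url and the query parameters are placeholders for this example.
    """
    query = urlencode({"collection": collection_alias, "oclc": oclc_number})
    url = f"{base_url}?{query}"
    # 856 4_ = access by HTTP; 856 _0 = link is to the resource itself
    return f"856 40 $u {url}"


link_field = build_856("https://example.org/search", "serials", 12345)
```

Because every issue of the serial is attached to the same WorldCat record, one link keyed on the OCLC number is enough to reach all of the issues.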

Page 25: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

Connexion digital import

In Connexion Client:
• Attach Digital Content to existing record
• Select CONTENTdm collection
• Select file(s) from local computer/network
• Replace command

System processes metadata and file for import into CONTENTdm:
• Digital item sent to CONTENTdm collection
• MARC metadata mapped to Qualified Dublin Core
• Compound object creation, JPEG2000 conversion, and OCR or PDF processing, if applicable
• Thumbnails generated
• Link added to 856 field in WorldCat record

Page 26: Digital Collections, Repositories, and Archives

Access by Users

Cataloger w/ Connexion Client

CONTENTdm Collection Administrator

CONTENTdm

Connexion

WorldCat

WorldCat.org

CONTENTdm Import

Attach digital content to WorldCat record

Configure CONTENTdm collection with Qualified Dublin Core

OCLC# hyperlink to digital content

MARC, QDC; TIFF, JP2; OCR, PDF

Page 27: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

Improved PDF handling
• Convert multiple-page PDF files to CONTENTdm compound objects
• Subset print options
• Search term highlighting within PDF files
• Creation of thumbnail images from PDF files
• Improved text extraction from PDF files

Benefits
• More efficient processing of PDF documents
• Single database of ALL digital resources
• Better end user experience

Page 28: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

PDF files can be imported using standard options
• Single or batch import via Acquisition Station
• Web-based Add option
• Connexion digital import

Thumbnail images are automatically generated from the PDF when the item is added to the collection

Text is extracted from the PDF and inserted into the full text search field when the item is added to a collection

Collection must have full text search field and that field must be empty when PDF is added to the collection
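
The condition on this slide (the collection needs a full text search field, and that field must be empty) can be expressed as a small guard. This is only a sketch of the rule as stated: the record is modelled as a plain dict, and the function and field names are invented for the example rather than taken from CONTENTdm.

```python
def insert_extracted_text(record, fulltext_field, extracted_text):
    """Store extracted PDF text only when the slide's two conditions hold.

    record: dict of metadata field name -> value (hypothetical model)
    fulltext_field: name of the collection's full text search field,
                    or None if the collection has no such field
    Returns True if the text was inserted, False otherwise.
    """
    if fulltext_field is None:
        return False  # collection has no full text search field
    if record.get(fulltext_field):
        return False  # field already has content, so it is left alone
    record[fulltext_field] = extracted_text
    return True


empty_record = {"title": "Sample report", "fulltext": ""}
filled_record = {"title": "Sample report", "fulltext": "typed by hand"}
inserted = insert_extracted_text(empty_record, "fulltext", "extracted text")
skipped = insert_extracted_text(filled_record, "fulltext", "extracted text")
```

The second call leaves the existing transcript untouched, which is the practical effect of requiring an empty field.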

Page 29: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

Compound object conversion

When compound object conversion is enabled, CONTENTdm:

Creates a compound object based on the page order of the PDF.

Generates a page-level metadata record for each page.

Extracts text from the PDF, converts it to UTF-8, and inserts it into the full text field of the associated page level record.

Generates thumbnail images of each page of the PDF. The thumbnail image of the first page will also be used for the compound object.

Retains the original PDF file for export and printing.

Displays the PDF compound object in a compound object viewer with each page of the PDF accessible from the left navigation menu.

Highlights search terms in the PDF.

Provides an option to select a subset of the PDF to print or save.
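
The data-handling steps listed above can be modelled in a few lines. This is a sketch under stated assumptions, not CONTENTdm's implementation: the page text is assumed to have been extracted already and arrives as one byte string per page in an assumed source encoding, no PDF library is involved, and the class names, page titles, and thumbnail file names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageRecord:
    number: int        # 1-based position, following the PDF page order
    title: str         # hypothetical page label
    full_text: bytes   # page text stored as UTF-8
    thumbnail: str     # hypothetical thumbnail file name


@dataclass
class CompoundObject:
    title: str
    original_pdf: str  # original file retained for export and printing
    thumbnail: str     # reuses the first page's thumbnail
    pages: List[PageRecord] = field(default_factory=list)


def convert_pdf(title, pdf_name, page_texts, source_encoding="latin-1"):
    """Build a compound object with one page-level record per PDF page."""
    if not page_texts:
        raise ValueError("a compound object needs at least one page")
    stem = pdf_name.rsplit(".", 1)[0]
    pages = []
    for number, raw in enumerate(page_texts, start=1):
        text = raw.decode(source_encoding)  # assumed source encoding
        pages.append(
            PageRecord(
                number=number,
                title=f"Page {number}",
                full_text=text.encode("utf-8"),  # convert to UTF-8
                thumbnail=f"{stem}_p{number:04d}_thumb.jpg",
            )
        )
    return CompoundObject(title, pdf_name, pages[0].thumbnail, pages)


thesis = convert_pdf("Sample thesis", "thesis.pdf", [b"caf\xe9", b"page two"])
```

Keeping the text on the page-level records rather than on the document is what makes per-page search hits and subset printing possible.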

Page 30: Digital Collections, Repositories, and Archives

Compound object conversion

Page 31: Digital Collections, Repositories, and Archives

Compound object conversion

Page 32: Digital Collections, Repositories, and Archives

PDF Enhancements

Printing and downloading

Complete print version
• Original PDF file retained for printing and saving

Subset of print version
• Select a subset of pages from the PDF to view, save, or print
• Select all pages with search hits or pick individual pages or page ranges
• No need to wait for a large download if you only need a few pages
• Also available for non-PDF compound objects when they have been processed using the OCR Extension
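
The two ways of choosing a subset described above (pages with search hits, or explicit pages and page ranges) might look like this in outline. It is a hedged sketch: the range syntax ("1-3, 5"), the case-insensitive substring match, and both function names are assumptions for illustration, not the product's behavior.

```python
def parse_page_ranges(spec, page_count):
    """Turn a spec such as "1-3, 5" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(piece) for piece in part.split("-", 1))
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


def pages_with_hits(page_texts, term):
    """Return 1-based numbers of pages whose text contains the term."""
    needle = term.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


picked = parse_page_ranges("1-3, 5", page_count=6)
hits = pages_with_hits(["Cotton mills", "nothing here", "COTTON prices"], "cotton")
```

Either list of page numbers can then drive the print or save request, so only the selected pages need to be assembled and downloaded.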

Page 33: Digital Collections, Repositories, and Archives

Printing and downloading

Page 34: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

Compound object conversion
• Reduce the size of file that is downloaded for viewing
  • An entire PDF may be several MB but individual pages are much smaller
  • View a page within large PDF without downloading the full document
  • Increase speed of access to view
• Provide full text indexing by page, not document
  • No secondary search required to find specific content in PDF
• Print only the information you need
• Better end-user experience!

Page 35: Digital Collections, Repositories, and Archives

CONTENTdm 4.3

Compound object conversion
• Quick and efficient for collection builders!
• PDF pages of compound object do not count against total number of items on the server!
• Ideal for born digital documents!
• Theses, dissertations, government documents, e-publications, and more…

Page 36: Digital Collections, Repositories, and Archives

Digital Repository - future directions

Automatic WorldCat registration and links
• Register and replicate
• Web-based user controlled mapping DC, MARC, VRA
• Auto add WorldCat record #

Seamless/transparent WorldCat ingest
• Update of all digital items with unique WorldCat reference number
• Integration with all features and functions of WorldCat.org
• Local branding and customization

Page 37: Digital Collections, Repositories, and Archives

OCLC Enterprise Strategy

“Discovery to Delivery” Collection Curation

Consumer Platform
• WorldCat.org “Indiana Cotton Mills”

Management Systems
• WorldCat Local “Hegg Alaska Photographs”, “meed”

Content Platform
• Digital Repository
• Digital Archive

Network Services
• Use statistics
• Collection Analysis

Page 38: Digital Collections, Repositories, and Archives

Digital Archive – re-engineered

Development of storage infrastructure in Dublin

Redefinition and packaging of Digital Archive for market
• long term archiving of digital masters
• opportunity to reprocess as software improves (example: OCR)

Significantly reduce cost of service

Two modes
• With repository
• Mirroring

Page 39: Digital Collections, Repositories, and Archives

Digital Archive workflow for other systems

Type A service: batch mirror

Verify

Volume Staging

Archival Storage

Check-in

Ingest

• Vol ID Checkin

fixity, virus, format verification (JHOVE), etc.

• Verify manifest & files

OK to Move

ARCHIVE_ID

Copy to staging

Metadata, Master Files

manifest

From external content management system
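
The "verify manifest & files" and fixity steps in this workflow can be sketched as a checksum comparison. Everything specific here is an assumption: the slide does not name a checksum algorithm or manifest format, so the example uses SHA-256 and models the manifest as a dict of file name to expected digest. Virus scanning and JHOVE format validation are not shown.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large master files are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(volume_dir, manifest):
    """Check every manifest entry against the files in a staged volume.

    manifest: dict of relative file name -> expected SHA-256 hex digest
              (a hypothetical format chosen for this sketch)
    Returns a list of (file name, problem) pairs; an empty list means the
    volume passed this check.
    """
    problems = []
    root = Path(volume_dir)
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file():
            problems.append((name, "missing"))
        elif sha256_of(path) != expected.lower():
            problems.append((name, "checksum mismatch"))
    return problems
```

In a batch-mirror setting a check like this would run before the "OK to Move" decision, so that only volumes with an empty problem list are copied on to archival storage.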

Page 40: Digital Collections, Repositories, and Archives

Digital Repositories: cooperative advantage

Broaden use of the repository
• Cultural heritage organizations
• Institutional Repositories
• Government documents

Integrate with library tools and systems
• Connexion
• WorldCat.org
• WorldCat Local

Page 41: Digital Collections, Repositories, and Archives

Digital Repositories: cooperative advantage

Integrate with other content
• Journals
• Books, manuscripts
• Audio and video

Increase the value, reduce the costs
• Search across institutional collections
• Search within the larger scope of content
  • WorldCat.org
• Reduce cost through shared services

Page 42: Digital Collections, Repositories, and Archives

Future Direction - New OCLC Strategy

Enterprise product architecture

Local

Global

Group

WorldCat Grid

Page 43: Digital Collections, Repositories, and Archives

Future direction - Products and services to support the collections grid

Page 44: Digital Collections, Repositories, and Archives

dContent

eContent

“Digital Stacks”

Future direction - support for the full range of library content

Page 45: Digital Collections, Repositories, and Archives

Thank you

Questions?

email:[email protected]