Digital Collections, Repositories, and Archives
Transcript of Digital Collections, Repositories, and Archives
Digital Collection Services
Digital Collections, Repositories, and Archives
Greg Zick, Vice President, Digital Collection Services
ARL Research Library Leadership Fellows
November 1st, 2007
Future Direction - New OCLC Strategy
Enterprise product architecture
Local
Global
Group
WorldCat Grid
Presentation Outline
Digital Collection Creation
Institutional Repositories
Digital Repositories
OCLC Future Directions
One way to digitize all the world’s collections
List all the institutions that have collections: Institutional Registry
List all the collections at each institution: Archive Grid
Digitize the items in the collections: Digital Registry
Create the Digital Collection: CONTENTdm
Archive original TIFFs: Digital Archive
Add to Global Digital Repository: Harvest metadata into WorldCat
Digital Collection Creation
CONTENTdm
Organize and store your digital collections
Expose your digital collections while providing Web access
Assess users’ needs and collections’ conditions
Convert your materials to digital collections and create metadata
What is CONTENTdm?
A complete software solution developed by DiMeMa
Store, manage and access your digital collections
For organizations of all types and sizes, from academic libraries, historical societies, public libraries and other collaborators
Showcase a wide range of media types, from photos and documents to audio and video files
[Map: CONTENTdm licensees by state and province]
Alaska: 2
British Columbia: 2
Alberta: 4
Saskatchewan
Manitoba
Ontario: 2
Quebec: 1
New Brunswick
Washington: 18
Oregon: 8
Idaho: 2
Montana: 2
Wyoming: 1
North Dakota: 2
South Dakota: 1
Nevada: 4
Utah: 7
California: 20
Arizona: 3
Colorado: 2
Nebraska: 3
Kansas: 5
New Mexico: 1
Texas: 10
Oklahoma: 1
LA: 2
Arkansas: 1
MS: 1
AL: 4
Tennessee: 3
Missouri: 4
Georgia: 2
FL: 4
SC: 1
N Carolina: 3
Iowa: 6
KY: 3
Illinois: 10
Indiana: 11
Ohio: 18
WV: 1
VA: 2
Pennsylvania: 16
Minnesota: 5
Wisconsin: 8
Michigan: 10
New York: 24
Maine
VT: 3
NH: 1
MA: 3
RI: 1
CT: 3
NJ: 2
DE: 1
MD: 3
DC: 1
Hawaii
Canada: 11 licensees (Newfoundland – 1, Nova Scotia – 1)
Building partnerships and CONTENTdm digital collections
Multi-regional or statewide consortia: AK, CA, FL, IA, IL, IN, NE, MN, MO, NY, OH, PA, WI
Mountain West Digital Library – 20 members
Western Waters Digital Library – 30 members
Columbia River Basin Ethnic History Archive
Digital Library of Appalachia – 12 members
Historically Black Colleges and Universities Library Alliance
Upper Mississippi Valley Digital Image Archive
REALIA Project – Assoc. Colleges of the Midwest, Assoc. Colleges of the South and the Great Lakes College Assoc.
Consortium of Liberal Arts Colleges (CLAC) – 62 members
These and many more…
Images: Photographs, posters, postcards
Maps
JPEG2000 supported
Zoom and Pan
Documents: Text-based letters, journals, diaries and more
Transcribed text becomes searchable!
CONTENTdm’s Integrated OCR Capability
Documents: Text-based letters, journals, diaries and more
Search term(s) highlighted
Bureau of Economic Analysis
OCLC Preservation Service Center completed scanning and provided digital images to BEA.
Use of CONTENTdm to search and access.
Documents: Newspapers
Audio/video
Browse
Advanced search
Select available collections
Search
Discovery: WorldCat harvesting & Open WorldCat
CONTENTdm provides comprehensive solutions for your historical digital archives
One system, many solutions
Photographs, diaries, slides, newspapers, books, letters, audio and video oral histories, maps – all your digital assets!
http://www.contentdm.com/customers/
CONTENTdm 4.3
Connexion digital import
New option for OCLC catalogers
Add digital resources to CONTENTdm via Connexion cataloging process
Benefit
More options for building digital collections
Catalog using most convenient process for the organization
CONTENTdm 4.3
Connexion digital import
Add items to CONTENTdm via the Connexion Client
Digital collection growth built into cataloging workflow
WorldCat MARC record crosswalked to Qualified Dublin Core and added to CONTENTdm
OCLC number stored in CONTENTdm – a global persistent identifier
Digital items accessible by FirstSearch, WorldCat.org and WorldCat Local
Requires OCLC Cataloging subscription, CONTENTdm license and CONTENTdm Hosting Services
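The crosswalk step on this slide can be pictured with a small sketch. The mapping table below is an illustrative assumption (a few common MARC tags paired with Dublin Core element names), not OCLC's actual crosswalk. The record layout, the sample values, and the `crosswalk_marc_to_qdc` helper are hypothetical.

```python
# Hypothetical, simplified MARC-to-Qualified-Dublin-Core crosswalk.
# The tag/subfield choices are illustrative assumptions, not OCLC's mapping.
MARC_TO_QDC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
    ("520", "a"): "description",
}


def crosswalk_marc_to_qdc(marc_fields, oclc_number):
    """Map (tag, subfield, value) triples to a dict of QDC element lists."""
    qdc = {}
    for tag, subfield, value in marc_fields:
        element = MARC_TO_QDC.get((tag, subfield))
        if element is None:
            continue  # fields without a mapping are skipped in this sketch
        qdc.setdefault(element, []).append(value.strip())
    # Keep the OCLC number with the item as its persistent identifier.
    qdc["identifier"] = [str(oclc_number)]
    return qdc


# Made-up sample record, for illustration only.
record = [
    ("245", "a", "Sample photograph title "),
    ("100", "a", "Example, Author"),
    ("650", "a", "Gold mines and mining"),
    ("650", "a", "Alaska"),
    ("999", "z", "local note"),
]
item = crosswalk_marc_to_qdc(record, 123456789)
print(item)
```

Repeatable elements such as subject are kept as lists, so several MARC fields can map to one Dublin Core element without overwriting each other.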
CONTENTdm 4.3
Connexion digital import
Metadata choices for cataloging
Connexion client (MARC)
CONTENTdm (DC, QDC, VRA)
Acquisition Station
Web-based Add option
Serials support
Use “Attach Digital Object” in Connexion client for each issue in a serial item
856 link will automatically retrieve a search results page with links to each issue stored in CONTENTdm
CONTENTdm 4.3
Connexion digital import
In Connexion Client:
Attach Digital Content to existing record
Select CONTENTdm collection
Select file(s) from local computer/network
Replace command
System processes metadata and file for import into CONTENTdm
Digital item sent to CONTENTdm collection
MARC metadata mapped to Qualified Dublin Core
Compound object creation, JPEG2000 conversion, and OCR or PDF processing, if applicable
Thumbnails generated
Link added to 856 field in WorldCat record
Access by Users
Cataloger w/ Connexion Client
CONTENTdm Collection Administrator
CONTENTdm
Connexion
WorldCat
WorldCat.org
CONTENTdm Import
Attach digital content to WorldCat record
Configure CONTENTdm collection with Qualified Dublin Core
OCLC# hyperlink to digital content
MARC to QDC
TIFF to JP2
OCR, PDF
CONTENTdm 4.3
Improved PDF handling
Convert multiple-page PDF files to CONTENTdm compound objects
Subset print options
Search term highlighting within PDF files
Creation of thumbnail images from PDF files
Improved text extraction from PDF files
Benefits
More efficient processing of PDF documents
Single database of ALL digital resources
Better end user experience
CONTENTdm 4.3
PDF files can be imported using standard options
Single or batch import via Acquisition Station
Web-based Add option
Connexion digital import
Thumbnail images are automatically generated from the PDF when the item is added to the collection
Text is extracted from the PDF and inserted into the full text search field when the item is added to a collection
Collection must have full text search field and that field must be empty when PDF is added to the collection
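The precondition above (a full text search field must exist and be empty) can be expressed as a simple guard. This is a minimal sketch, not CONTENTdm code: the dictionary-based record, the `fulltext` field name, and the `insert_extracted_text` helper are assumptions made for illustration.

```python
def insert_extracted_text(record, extracted_text, full_text_field="fulltext"):
    """Return a copy of record with extracted_text in the full text field.

    Raises ValueError if the field is missing or already populated,
    mirroring the rule that the field must exist and be empty.
    """
    if full_text_field not in record:
        raise ValueError("collection has no full text search field")
    if record[full_text_field].strip():
        raise ValueError("full text field must be empty before import")
    updated = dict(record)  # leave the caller's record untouched
    updated[full_text_field] = extracted_text
    return updated


# Hypothetical record for a newly added PDF item.
new_item = {"title": "Annual report", "fulltext": ""}
print(insert_extracted_text(new_item, "Text extracted from the PDF"))
```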
CONTENTdm 4.3
Compound object conversion
When compound object conversion is enabled, CONTENTdm:
Creates a compound object based on the page order of the PDF.
Generates a page-level metadata record for each page.
Extracts text from the PDF, converts it to UTF-8, and inserts it into the full text field of the associated page level record.
Generates thumbnail images of each page of the PDF. The thumbnail image of the first page will also be used for the compound object.
Retains the original PDF file for export and printing.
Displays the PDF compound object in a compound object viewer with each page of the PDF accessible from the left navigation menu.
Highlights search terms in the PDF.
Provides an option to select a subset of the PDF to print or save.
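The conversion steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not CONTENTdm's implementation: page text is assumed to arrive as already-extracted byte strings in a known encoding, and the record layout, thumbnail naming, and `build_compound_object` helper are hypothetical.

```python
def build_compound_object(title, pdf_name, page_texts, source_encoding="latin-1"):
    """Build one page-level record per PDF page, in page order."""
    pages = []
    for number, raw in enumerate(page_texts, start=1):
        text = raw.decode(source_encoding)
        pages.append({
            "page": number,
            "title": f"Page {number}",
            # Full text is stored as UTF-8 in the page-level record.
            "fulltext": text.encode("utf-8"),
            # Hypothetical thumbnail file name for this page.
            "thumbnail": f"{pdf_name}-p{number}.jpg",
        })
    return {
        "title": title,
        "original_file": pdf_name,  # original PDF retained for export/printing
        # The first page's thumbnail also represents the compound object.
        "thumbnail": pages[0]["thumbnail"] if pages else None,
        "pages": pages,
    }


obj = build_compound_object("Sample thesis", "thesis.pdf", [b"caf\xe9", b"page two"])
print(obj["thumbnail"], len(obj["pages"]))
```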
Compound object conversion
PDF Enhancements
Printing and downloading
Complete print version
Original PDF file retained for printing and saving
Subset of print version
Select a subset of pages from the PDF to view, save, or print
Select all pages with search hits or pick individual pages or page ranges
No need to wait for a large download if you only need a few pages
Also available for non-PDF compound objects when they have been processed using the OCR Extension
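A subset selection of this kind (pages with search hits plus individually chosen pages or ranges) might be computed as in the sketch below. The range syntax (`"1-3, 12"`) and the `select_pages` helper are assumptions for illustration; the slides do not describe how CONTENTdm represents the selection.

```python
def select_pages(page_count, ranges="", hit_pages=()):
    """Return the sorted page numbers to include in a subset print version."""
    # Start from pages that contain search hits, ignoring out-of-range numbers.
    selected = {p for p in hit_pages if 1 <= p <= page_count}
    for part in ranges.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(p) for p in part.split("-", 1))
        else:
            first = last = int(part)
        if first > last:
            first, last = last, first
        # Clamp each range to the pages that actually exist.
        for page in range(max(first, 1), min(last, page_count) + 1):
            selected.add(page)
    return sorted(selected)


subset = select_pages(page_count=40, ranges="1-3, 12", hit_pages=[7, 12, 38])
print(subset)
```

Using a set means a page chosen both by range and by search hit appears only once in the output.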
Printing and downloading
CONTENTdm 4.3
Compound object conversion
Reduce the size of file that is downloaded for viewing
An entire PDF may be several MB but individual pages are much smaller
View a page within large PDF without downloading the full document
Increase speed of access to view
Provide full text indexing by page not document
No secondary search required to find specific content in PDF
Print only the information you need
Better end-user experience!
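Indexing by page rather than by document means a search can return the exact pages that contain a term. The sketch below shows one simple way to do that with an inverted index. It is a generic illustration, not CONTENTdm's indexer, and the function names and sample text are hypothetical.

```python
import re
from collections import defaultdict


def build_page_index(page_texts):
    """Map each lowercase word to the set of page numbers it appears on."""
    index = defaultdict(set)
    for number, text in enumerate(page_texts, start=1):
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(number)
    return index


def find_pages(index, term):
    """Return the sorted page numbers containing the term."""
    return sorted(index.get(term.lower(), set()))


# Made-up page text for illustration.
pages = ["Annual report 1929", "Income tables", "Report appendix"]
index = build_page_index(pages)
print(find_pages(index, "Report"))
```

With a document-level index, the same query would only say that the PDF matches, and the reader would need a second search inside the file to find the page.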
CONTENTdm 4.3
Compound object conversion
Quick and efficient for collection builders!
PDF pages of compound object do not count against total number of items on the server!
Ideal for born digital documents!
Theses, dissertations, government documents, e-publications, and more…
Digital Repository - future directions
Automatic WorldCat registration and links
Register and replicate
Web-based, user-controlled mapping: DC, MARC, VRA
Auto add WorldCat record #
Seamless/transparent WorldCat ingest
Update of all digital items with unique WorldCat reference number
Integration with all features and functions of WorldCat.org
Local branding and customization
OCLC Enterprise Strategy
“Discovery to Delivery” Collection Curation
Consumer Platform
WorldCat.org “Indiana Cotton Mills”
Management Systems
WorldCat Local “Hegg Alaska Photographs”, “meed”
Content Platform
Digital Repository
Digital Archive
Network Services
Use statistics
Collection Analysis
Digital Archive – re-engineered
Development of storage infrastructure in Dublin
Redefinition and packaging of Digital Archive for market
Long-term archiving of digital masters
Opportunity to reprocess as software improves (example: OCR)
Significantly reduce cost of service
Two modes
With repository
Mirroring
Digital Archive workflow for other systems
Type A service: batch mirror
Verify
Volume Staging
Archival Storage
Check-in
Ingest
• Vol ID Checkin
fixity, virus, format verification (JHOVE), etc.
• Verify manifest & files
OK to Move
ARCHIVE_ID
Copy to staging
Metadata Master Files
manifest
From external content management system
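The "verify manifest & files" and fixity checks in this workflow can be illustrated with a short sketch. The slides do not specify the manifest format or checksum algorithm, so the use of SHA-256, the filename-to-checksum manifest, and the `verify_manifest` helper are all assumptions. File contents are held in memory here only to keep the example self-contained.

```python
import hashlib


def sha256_of(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def verify_manifest(manifest, files):
    """Compare a batch against its manifest.

    manifest: {filename: expected SHA-256 hex digest} (assumed format)
    files:    {filename: file content as bytes}
    Returns a list of (filename, problem); an empty list means OK to move.
    """
    problems = []
    for name, expected in manifest.items():
        if name not in files:
            problems.append((name, "missing"))
        elif sha256_of(files[name]) != expected:
            problems.append((name, "checksum mismatch"))
    for name in files:
        if name not in manifest:
            problems.append((name, "not in manifest"))
    return problems


# Made-up batch for illustration.
batch = {"master_0001.tif": b"image bytes", "metadata.xml": b"<record/>"}
manifest = {name: sha256_of(data) for name, data in batch.items()}
print(verify_manifest(manifest, batch))
```

A real check-in step would also run the virus scan and format verification (JHOVE) named on the slide; those are outside this sketch.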
Digital Repositories: cooperative advantage
Broaden use of the repository
Cultural heritage organizations
Institutional Repositories
Government documents
Integrate with library tools and systems
Connexion
WorldCat.org
WorldCat Local
Digital Repositories: cooperative advantage
Integrate with other content
Journals
Books, manuscripts
Audio and video
Increase the value, reduce the costs
Search across institutional collections
Search within the larger scope of content
WorldCat.org
Reduce cost through shared services
Future Direction - New OCLC Strategy
Enterprise product architecture
Local
Global
Group
WorldCat Grid
Future direction - Products and services to support the collections grid
dContent
eContent
“Digital Stacks”
Future direction - support for the full range of library content