Smithsonian Institution Libraries Digital Library Program

Digitization Fair: Spotlight on Digitization @ the Smithsonian, October 30, 2006 / Smithsonian Institution
Martin R. Kalfatovic, New Media Office and Preservation Services, Smithsonian Institution Libraries

description

Presentation given at the Smithsonian Digitizing Fair: Spotlight on Digitizing @ The Smithsonian, October 30, 2006

Transcript of Smithsonian Institution Libraries Digital Library Program

Page 1: Smithsonian Institution Libraries Digital Library Program

Digitization Fair: Spotlight on Digitization @ the Smithsonian, October 30, 2006 / Smithsonian Institution

Martin R. Kalfatovic, Smithsonian Institution Libraries

Smithsonian Institution Libraries Digital Library Program

Martin R. Kalfatovic, New Media Office and Preservation Services

Smithsonian Institution Libraries

Page 2: Smithsonian Institution Libraries Digital Library Program

Overview of Library Digitizing

• Books are unique objects for scanning purposes
• Differ from two-dimensional works (e.g., photographs)
• Differ from three-dimensional works (e.g., artifacts)
• The codex has been around for over 1,600 years
• The book format (title page, text, index, etc.) has been in use since the mid-16th century
• Web delivery of book objects has interesting challenges

Page 3: Smithsonian Institution Libraries Digital Library Program

Book Digitizing Process

• Bound materials
  – Quality issues
  – Protects the material for future use
• Disbound materials
  – Generally better scans
  – Destructive

Page 4: Smithsonian Institution Libraries Digital Library Program

SIL Imaging Center

• Established 1999
• 2 digital scanning-back cameras
  – BetterLight
  – Phase I
• Flatbed scanners
• All Mac-based

Page 5: Smithsonian Institution Libraries Digital Library Program

Digitizing Vendors

• SIL has used a variety of commercial vendors for non- and semi-rare materials
  – Kirtas Technologies (Robotic APT 2400 Scanner)
  – Preservation Resources
  – PTFS, Inc.
  – JJT
  – TechBooks

Page 6: Smithsonian Institution Libraries Digital Library Program

Digitizing Partners

Internet Archive HQ, San Francisco

Internet Archive Scribe Book Scanner

Page 7: Smithsonian Institution Libraries Digital Library Program

Digitizing Standards: Page Images

DLF Benchmark for Faithful Digital Reproductions of Monographs and Serials

300 dpi, 24-bit color uncompressed TIFF, or lossless compressed images (e.g. LZW, JPEG2000)
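As a rough illustration of what this benchmark implies for storage, the sketch below estimates the uncompressed size of a single page image at 300 dpi and 24-bit color. The 6 x 9 inch page size and the function name are assumptions made for illustration, not figures from the presentation, and TIFF header overhead is ignored.

```python
def uncompressed_image_bytes(width_in, height_in, dpi=300, bits_per_pixel=24):
    """Estimate raw pixel data size for a scanned page.

    width_in / height_in are the page dimensions in inches (hypothetical
    values; the presentation does not give a page size).
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # 24-bit color means 3 bytes per pixel.
    return width_px * height_px * bits_per_pixel // 8


# Assumed 6 x 9 inch page scanned at the benchmark settings.
page_bytes = uncompressed_image_bytes(6, 9)
print(f"{page_bytes:,} bytes per page")
```

Under that assumed page size, one uncompressed page comes to about 14.6 MB, so the roughly 300,000 pages reported in the SIL statistics would need on the order of 4.4 TB before any lossless compression.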

Page 8: Smithsonian Institution Libraries Digital Library Program

Digitizing Standards: Text Conversion

Re-keying or OCR with correction to 99.997% accuracy

Standard mark-up schema (e.g. flavors of XML like TEI or structured databases)
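To make the 99.997% accuracy target concrete, the sketch below converts an accuracy percentage into an expected number of character errors. The function name and the 2,000-characters-per-page figure are assumptions made for illustration; the presentation does not say how accuracy is measured.

```python
from fractions import Fraction


def expected_errors(char_count, accuracy_percent="99.997"):
    """Expected number of wrong characters at a given accuracy level.

    accuracy_percent is passed as a string so the arithmetic stays exact.
    """
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return float(char_count * error_rate)


# Errors per 100,000 characters at the stated accuracy.
print(expected_errors(100_000))
# Errors on one page, assuming a hypothetical 2,000 characters per page.
print(expected_errors(2_000))
```

At this accuracy level the target allows about 3 wrong characters per 100,000, which on the assumed page length works out to roughly one error every 17 pages.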

Page 9: Smithsonian Institution Libraries Digital Library Program

SIL Digitizing Statistics

• Approximately 300,000 scanned pages
• 700+ titles
• 1,100 volumes

Page 10: Smithsonian Institution Libraries Digital Library Program

Who Is Using the SIL Digital Library?

• Sewing machine enthusiasts
• Researchers in Brazil
• School kids around the country
• Lepidopterists in Peru

Page 11: Smithsonian Institution Libraries Digital Library Program

Major Projects: Digital Editions

• History of Science
• Natural History
• History and Culture
• Art and Design

Page 12: Smithsonian Institution Libraries Digital Library Program

Major Projects: Trade Literature

• Trade Literature Collections
  – Over 350,000 pieces, only a small fraction digitized
  – 30,000 images from two collections
  – Among SI Libraries’ most popular sites with over 15,000 visitors per month

Page 13: Smithsonian Institution Libraries Digital Library Program

Major Projects: Image Galaxy

• SIL Image Galaxy
  – Over 7,000 of SI Libraries’ most interesting images
  – Serves as a gateway for product development and licensing
  – Assists students and teachers in locating images for use in the classroom and other projects

Page 14: Smithsonian Institution Libraries Digital Library Program

Major Projects: Scholarly Publications

• Smithsonian Contributions and Studies Series
  – Collaboration with Smithsonian Institution Scholarly Press
  – Soon to have over 65,000 pages online with another 80,000 in FY 2007

Page 15: Smithsonian Institution Libraries Digital Library Program

National and International Partnerships

• Aluka
  – African history and culture
• Open Content Alliance

Page 16: Smithsonian Institution Libraries Digital Library Program

National and International Partnerships

• Biodiversity Heritage Library
  – Consortium of 10 major natural history and botanical libraries
  – Over 1,000,000 total pages of taxonomic literature online
  – Model for international library cooperation

Page 17: Smithsonian Institution Libraries Digital Library Program

Digitizing Philosophy

• Digital Curation
  – Just as libraries keep books, so do libraries have a mission to preserve “born digital” material
  – Digital Preservation through assisting in the transmission of digital content to future generations

Page 18: Smithsonian Institution Libraries Digital Library Program

Smithsonian Digital Repository

• DSpace
  – Developed jointly by MIT and HP
  – Open source software used in hundreds of academic libraries
  – Preserves and makes available digital output of scientists, researchers, curators, historians, etc.

Page 19: Smithsonian Institution Libraries Digital Library Program

Needs for Enhancing the SIL Digital Library Program

• Petabyte storage system for source files

• Effective system for archiving of digital material (byte preservation)

• Enhanced capacity for storing/delivering web-deliverable images

• Central programming support for enhanced XML data delivery

Page 20: Smithsonian Institution Libraries Digital Library Program

Smithsonian Institution Libraries

Page 21: Smithsonian Institution Libraries Digital Library Program

Page 22: Smithsonian Institution Libraries Digital Library Program
