Digitizing a Local History Collection on a Shoestring
Case Memorial Library Local History Collection
http://www.orange.lioninc.org/local.htm
Jonathan Wiener, Head of Technical Services
Case Memorial Library
Why digitize local history materials?
• Dissemination: To share materials with a larger audience
• Preservation: To preserve rare or unique materials by limiting handling of the originals (worst case: to have digital copy on hand if original were damaged or lost)
In our case:
• Preservation was not a pressing issue.
• We hoped to increase awareness of our materials and the history they convey.
Suitable materials
We had on hand:
• Books that were out of copyright, or copyrighted by our parent body, the Town of Orange.
• Unpublished 19th-century manuscript notebooks (primary sources).
Materials of both kinds were sturdy enough to be placed safely on a flatbed scanner.
Equipment
Hardware I worked with:
• PC with 74.4 GB of storage & 0.99 GB of RAM (more than enough)
• Monitor capable of 800x600 or greater resolution
• Flatbed scanner with resolution 600x600 dpi, 24-bit depth (Hewlett Packard ScanJet 4200C)
• Access to web server maintained by our consortium, where our website resides
Equipment
Software I worked with:
• Scanning software that came with the scanner (HP PrecisionScan LT)
• OCR software: SimpleOCR (free); OmniPage 15
• Photo editing software: Photoshop Elements is nice, but MS Office Picture Manager is adequate
• HTML editing software: Dreamweaver is nice, but Notepad is adequate
Steps to create the digital presentation
• Create HTML shells for each page of the document
• Scan page images
• Transcribe and proofread text, using OCR where possible
• Reduce page images to size desirable for viewing on screen
• Insert transcribed text in each HTML file
Page design
I adapted a simple design that I created for a project in library school (“The Digital Library of Shelburne Falls Memorabilia,” http://home.southernct.edu/~wienerj1/index.html).
Using this design, I created a web page (HTML file) corresponding to each page of each item being digitized.
Basic HTML shell
<html>
<head><title>Constitution of the North Milford Temperance Society</title></head>
<body>
<p><a href="index.htm">Contents</a> <a href="01.htm">Previous Page</a> <a href="03.htm">Next Page</a><br>
<img src="02.jpg" width="450"><br>
<!-- Paste transcribed text here -->
</p>
</body>
</html>
Change the page numbers (the 01, 02, and 03 in the links and image filename above) and “save as” until you have an HTML file for each page in the document. This is the file for page 2 (filename: “02.htm”).
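For a long document, the "change the numbers and save as" routine could also be scripted. The following is only a sketch of one way to do that, not the method used for this project: it assumes two-digit page numbers as in the shell above, and the names SHELL, page_name, build_shell, and write_shells are hypothetical. Leaving out the Previous link on the first page and the Next link on the last page is an added design choice.

```python
import html
from pathlib import Path

# Template modeled on the basic HTML shell; the {fields} are filled in per page.
SHELL = """<html>
<head><title>{title}</title></head>
<body>
<p><a href="index.htm">Contents</a> {prev_link} {next_link}<br>
<img src="{page}.jpg" width="450"><br>
<!-- Paste transcribed text here -->
</p>
</body>
</html>
"""


def page_name(n: int) -> str:
    """Return the two-digit name used for page n, e.g. 2 -> '02'."""
    return f"{n:02d}"


def build_shell(title: str, n: int, total: int) -> str:
    """Build the HTML shell for page n of a document with `total` pages."""
    # Assumption: the first page has no Previous link and the last has no Next link.
    prev_link = (
        f'<a href="{page_name(n - 1)}.htm">Previous Page</a>' if n > 1 else ""
    )
    next_link = (
        f'<a href="{page_name(n + 1)}.htm">Next Page</a>' if n < total else ""
    )
    return SHELL.format(
        title=html.escape(title),
        prev_link=prev_link,
        next_link=next_link,
        page=page_name(n),
    )


def write_shells(title: str, total: int, out_dir: str) -> None:
    """Write 01.htm, 02.htm, ... into out_dir, one shell per page."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for n in range(1, total + 1):
        shell = build_shell(title, n, total)
        (out / f"{page_name(n)}.htm").write_text(shell, encoding="utf-8")


# Example call (the folder name and page count are made up):
# write_shells("Constitution of the North Milford Temperance Society", 12, "temperance")
```

Each generated file still needs its page image (for example "02.jpg") saved alongside it and its transcription pasted in by hand.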
Sample page viewed in browser
Navigation links: Contents, Previous Page, Next Page
Image of document page
Text of document page, transcribed in machine-readable (searchable) form
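If the proofread transcription is kept as plain text, dropping it into a shell could be automated along these lines. This is a minimal sketch under assumptions: the shell contains the placeholder comment from the basic HTML shell, blank lines in the text mark paragraph breaks, and the function name insert_transcription is hypothetical. Escaping matters because characters such as "&" or "<" in a transcription would otherwise be read as HTML.

```python
import html

# The placeholder comment used in the basic HTML shell.
PLACEHOLDER = "<!-- Paste transcribed text here -->"


def insert_transcription(shell_html: str, text: str) -> str:
    """Replace the placeholder comment with HTML-escaped transcribed text."""
    if PLACEHOLDER not in shell_html:
        raise ValueError("placeholder comment not found in shell")
    # Assumption: blank lines separate paragraphs in the plain-text transcription.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Keep the line breaks of the original page by turning newlines into <br>.
    escaped = "<br><br>\n".join(
        html.escape(p).replace("\n", "<br>\n") for p in paragraphs
    )
    return shell_html.replace(PLACEHOLDER, escaped, 1)
```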
Using HP scanning software
You can “send” the scan to your OCR software, or save image file to work with later
Output type: “b/w photo” preserves enough detail for OCR to work on text
Output size: scan at 100%; OCR works best with a large image file
Using OCR software
SimpleOCR (freeware)
• Not very accurate, but makes it easy to go through the text and correct errors
• Works from top to bottom regardless of layout
• Claims to work on handwriting
OmniPage (about $150)
• Very accurate
• Is “aware” of page layout and can process separate blocks of text separately
Sizing images for display
• Settled on uniform page width of 450 pixels as a compromise between readability and ease of navigation
• For a more preservation-oriented project, you would want to retain the original, larger image files in addition to the smaller ones used for Web display
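To see what the 450-pixel width means for a given scan, the display size can be worked out from the original pixel dimensions. The sketch below is only illustrative: the function name display_size is hypothetical, and the example of a letter-size page scanned at 300 dpi (2550 x 3300 pixels) is an assumption, not a figure from this project.

```python
def display_size(width_px: int, height_px: int, target_width: int = 450):
    """Return (width, height) scaled to target_width, keeping the aspect ratio."""
    if width_px <= 0 or height_px <= 0:
        raise ValueError("image dimensions must be positive")
    # Assumption: images already narrower than the target are left alone.
    if width_px <= target_width:
        return width_px, height_px
    # Scale the height by the same factor as the width.
    new_height = max(1, round(height_px * target_width / width_px))
    return target_width, new_height


# Hypothetical example: an 8.5 x 11 inch page scanned at 300 dpi.
print(display_size(2550, 3300))
```

The resulting dimensions can then be entered in whichever photo editor does the actual resizing; setting only the width in the HTML, as the shell does, lets the browser keep the same proportions.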
Documents are incorporated in the library’s website
Our website is indexed by FreeFind.com. This free utility searches text on all pages of the website, including local history documents.
This page serves as a table of contents for the digitized resources.
Each item is also linked from a record in our online catalog.
Findability
• The following Google searches lead directly to our digital items:
– “north milford temperance society” (only result)
– “orange washington temperance society” (only result)
• I placed links to our local history pages in the Wikipedia article on “Orange, Connecticut,” and in the “Digital Library” article at liswiki.org
Publicity
• We have a list of some 300 email addresses for people who want to receive news about events at the library. We sent this list an announcement about these resources being available online.
• Editor of the Orange Town News responded with enthusiasm, and the paper published our announcement twice.
Outcomes?
• No direct responses, comments, or questions from the public.
• Until Feb. 2009, we did not have access to most of the usage statistics for our site.
• In February, we had an average of 92.78 hits per day on pages of the local history collection, which is about 1.6% of total hits on our site.
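The two February figures also imply a rough site-wide total, which helps put the local history traffic in context. The short calculation below is a back-of-the-envelope estimate only: the function name implied_total is hypothetical, and because the 1.6% share is rounded, the result is approximate.

```python
def implied_total(part_hits: float, share_percent: float) -> float:
    """Estimate total hits per day from a part and its percentage share."""
    return part_hits / (share_percent / 100)


# 92.78 local history hits per day at about 1.6% of all hits
# suggests a total in the neighborhood of 5,800 hits per day.
print(round(implied_total(92.78, 1.6)))
```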
Sample pages from the digital collection
170 years ago, people cared enough about Temperance to attend meetings about it in the middle of winter and create this handwritten record of their activities.
1841: Temperance is still a live issue. Meanwhile North Milford has changed its name to Orange.
Mary R. Woodruff’s 1949 history of Orange includes a fold-out map from 1868 (a detail is shown here).
1910: The Congregational Church remains the center of the community and chief custodian of local history.
1972: Congratulations from President Nixon on the 150th anniversary of our founding.
1986: Detailed description of the Orange Historic District includes hand-mounted photographs of each historic structure.
Concluding thoughts
• Like other communities, Orange has a colorful history worth learning about.
• Someone does seem to be looking at this material online. We have just started collecting data, and we will get a better idea of usage as we go along.
• Under ideal conditions, we could probably do more to publicize this project.
Questions?
Jonathan Wiener
Acknowledgments: “Creating a Digital Archive on a Shoestring” by Laurie Thompson and Sarah Houghton served as a helpful model for this presentation. See: http://librarianinblack.typepad.com/DigitalArchive.ppt