Post on 30-Jun-2015
MAKING LINKS IN THE BHL
Primary Source Materials as a Window to a Scientist’s Methods
Constance Rinaldo, Librarian of the Ernst Mayr Library, MCZ, Harvard
TDWG Annual Meeting 2014, Jonkoping, Sweden
Connecting Content: Field Notes, Specimens & Published Literature
• Digitize
• Deposit
• Link
• Repurpose
Why Field Notes?
• Archival materials fill in the documentation of the full research cycle & are primary source material
• Field notes provide unpublished observations, sketches, weather reports and species lists
• Accessibility & adaptation for today’s tools and researchers
• We chose William Brewster, an ornithologist who worked during the late 19th and early 20th centuries
• Test case to connect old and current data: Brewster species lists & current EOL data
• Connect content from multiple sources to advance scientific and educational pursuits = open science
Life Cycle Completed
Image digitized for BHL
Observations in notes, later digitized for BHL
Original specimen record
Publication of species description, digitized for BHL
Full digital specimen record with links to digitized material
Purposeful Gaming
• Digitize horticultural catalogs
• Select tool for transcription of handwritten & multi-column formatted BHL content
• Transcribe field notes & catalogs (each page twice)
• Crowdsource transcription
• Compare digital outputs
• Extract problem words for game
• Build BHL technical framework for classifying, comparing & managing multiple OCR outputs
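The "compare digital outputs" and "extract problem words" steps can be illustrated with a minimal sketch. It assumes two independent transcriptions of the same page are available as plain strings and that word-level disagreement is a reasonable signal for a "problem word". The function name `problem_words` and the second sample string are hypothetical and not part of the project's actual framework.

```python
import difflib


def problem_words(text_a: str, text_b: str) -> list[tuple[str, str]]:
    """Return (words_from_a, words_from_b) pairs where two transcriptions disagree."""
    words_a = text_a.split()
    words_b = text_b.split()
    # Align the two word sequences; autojunk is disabled so that frequent
    # words on long pages are not silently ignored during alignment.
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    disagreements = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            disagreements.append(
                (" ".join(words_a[a_start:a_end]), " ".join(words_b[b_start:b_end]))
            )
    return disagreements


# Hypothetical pair: the second string is an invented variant of one line
# from the Brewster journal, used only to show a disagreement.
first = "To Revere Beach with Chadbourne by 9 a.m. train."
second = "To Revere Beach with Chadbourne by 9 am. train."
print(problem_words(first, second))  # [('a.m.', 'am.')]
```

Each disagreeing pair is a candidate for review in a game. Spans where the transcriptions agree can be accepted with more confidence.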
Transcription Tool Criteria
• Open source
• Crowdsourcing capability
• User-friendly
• Allow administrative oversight and editing (i.e., reviewing, correcting, and validating transcriptions)
• Provide transcription file exports that can be efficiently formatted for use by the game(s)
• Sustainable (tool selected will hopefully be used permanently for BHL)
• Code easy to install, manage, and troubleshoot
• Technical support
• Multiple transcriptions of a single page
Transcription Tools
• FromThePage & DigiVol
• Selected 2 tools to fulfill the need for 2 transcriptions of each page
• Built-in community of volunteers with DigiVol
"4058841","Jessica Mitchell","Joseph deVeer","JournalsWilliam00Brew_0013.jpg","Fully transcribed by Jessica Mitchell. Exported on 21-Oct-2014 from DigiVol (http://volunteer.ala.org.au)","05-Jun-2014 02:17:15","11-Jun-2014 23:02:51","0","MCZ","1888\nMarch 20\nRevere Beach, Massachusetts.\n Cloudy with occasional light showers; warm.\n To revere Beach with Chadbourne by 9 a.m. train.\nLeft the cars at Point of Pines and first inspected'\nthe pines behind the large hotel in hopes of finding\nCrossbills there. There were English Sparrows in\nabundance and four Tree Sparrows (S. monticola) but\nnothing else save a single Robin. In the bushy thickets\naround the outskirts of the grove Song Sparrows\nswarming as usual at this season and, despite\nthe gloomy weather, singing freely. We saw none\nelsewhere along the beach although they used to\nbe numerous during migration time at several\nplaces, especially Oak Island.\n[margin]S.monticola[/margin]\n Near the extreme end of the Point we came on\na flock of about 15 Pine Linnets feeding among\nweeds on the side of a dyke embankment. Firing\ntwo barrels into these killed eight.\n[margin]Chrysomitris\npinus[/margin]\n Retracing our steps to the station & crossing the\nrailroad we next tried the marshes. There were no\nsmall birds there but we saw a flock of about\n30 Crows (evidently migrants), about as many\nGolden-eye Duck feeding in the river, and numerous\nHerring Gulls.\n The rest of the way to Oak Island we kept along\nthe beach ridge. Pine Linnets are exceedingly\nnumerous the entire distance, in flocks of 5 to 15 birds\neach. We shot nine more specimens. I made one\ncapital shot at a single bird passing very swiftly\nbefore the strong S. E. wind.\n Besides the Linnets we saw a single Snow Bunting,\n& many English Sparrows, the latter feeding on the\nwet beach in flocks. Returned to the city at 12 n.","13"
http://www.tiltfactor.org/the-lab/
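The DigiVol export record shown above could be prepared for downstream use roughly as follows. This is a sketch under stated assumptions: the export has no header row here, so the field positions (index 3 for the image file name, index 9 for the transcription text) are inferred from that single record and may not hold for other exports. The function name `parse_export_row` is hypothetical, and the sample row is a shortened excerpt of the record.

```python
import csv
import io
import re

# Marginal annotations appear in the record as [margin]...[/margin].
MARGIN = re.compile(r"\[margin\](.*?)\[/margin\]", re.DOTALL)


def parse_export_row(line: str, image_index: int = 3, text_index: int = 9):
    """Split one export row into image name, body text, and margin notes.

    The default field positions are assumptions based on one sample record.
    """
    fields = next(csv.reader(io.StringIO(line)))
    # Line breaks are stored as the two characters backslash + n.
    text = fields[text_index].replace("\\n", "\n")
    # Collapse internal whitespace so a note split over two lines reads as one.
    margin_notes = [" ".join(note.split()) for note in MARGIN.findall(text)]
    body = MARGIN.sub("", text)
    return fields[image_index], body, margin_notes


# Shortened excerpt of the record above (most of the journal text omitted).
sample_row = (
    r'"4058841","Jessica Mitchell","Joseph deVeer",'
    r'"JournalsWilliam00Brew_0013.jpg",'
    r'"Fully transcribed by Jessica Mitchell. Exported on 21-Oct-2014 '
    r'from DigiVol (http://volunteer.ala.org.au)",'
    r'"05-Jun-2014 02:17:15","11-Jun-2014 23:02:51","0","MCZ",'
    r'"1888\nMarch 20\nRevere Beach, Massachusetts.\n'
    r'[margin]S.monticola[/margin]\n[margin]Chrysomitris\npinus[/margin]","13"'
)

image, body, notes = parse_export_row(sample_row)
print(image)
print(notes)
```

Keeping the margin notes separate from the body text preserves Brewster's species annotations (for example, Chrysomitris pinus) as candidate names to link against current taxonomic data.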
Access to Digitized Texts
• Improved OCR from crowdsourcing & gaming
• Technical infrastructure to manage & compare multiple text sources
Next steps
• Social media campaign: transcription
• Release games/more social media
• Operationalize crowdsourcing of OCR improvements: data mining possibilities
More to come