MCN2011 Crowdsourcing Transcription
Crowdsourcing Transcription: Who, Why, What, and How
Ben Brumfield, Perian Sully
Why?
• You can’t OCR cursive!
  – 1 in 3 letters correctly recognized at best
• Engagement/Outreach
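To see why a 1-in-3 letter rate makes OCR output unusable for handwriting, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that each letter is recognized independently and with the same probability; real OCR errors are correlated, so treat the numbers as indicative only. The function name `word_accuracy` is hypothetical.

```python
# Assumption (not from the slides): letters are recognized independently,
# each with the same probability of being correct.
LETTER_ACCURACY = 1 / 3  # "1 in 3 letters correctly recognized at best"


def word_accuracy(letter_accuracy: float, word_length: int) -> float:
    """Probability that every letter in a word is recognized correctly."""
    return letter_accuracy ** word_length


if __name__ == "__main__":
    for length in (3, 5, 8):
        chance = word_accuracy(LETTER_ACCURACY, length)
        print(f"{length}-letter word fully correct: {chance:.2%}")
```

Under these assumptions a five-letter word comes out fully correct well under 1% of the time, which is why human transcription is needed for cursive material.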
Who?
• Genealogy Community
• Natural Sciences
• Open Source/Creative Commons
• Libraries
• Archives
• Museums!
(I’m saving this for Perian.)
Why?
• Sense of Purpose
• Love of the Subject
• Immersion in the Text
• It’s Fun! (Gamification)
• Balance of Motivations
• Money
How?
• Choose your material:
  – Homogeneous format.
• Decide on uses for the transcription.
• Find sources of volunteers.
• Choose a tool.
Which Tool?
• Structured data or free-form text?
• CMS integration or stand-alone?
• What kind of mark-up?
• What underlying technology?
Which Tool?
• Free/Open-Source Options:
  – Wikisource (MediaWiki + ProofreadPage)
  – FromThePage
  – Scripto (War Department Papers)
  – Bentham Transcription Desk
  – Scribe (Zooniverse/OldWeather)
  – OpenScribe (PROV)
• Build your own
  – (But please share it with the rest of us!)
Thanks!
• Slides and links at
  – http://manuscripttranscription.blogspot.com