BHL: Assigning DOIs & Other Identifiers to Legacy Literature
ASSIGNING DOIs & OTHER IDENTIFIERS TO LEGACY LITERATURE
10.5962/bhl.title.24947 10.5962/bhl.title.55495 10.5962/bhl.title.35182 10.5962/bhl.title.16580 10.5962/bhl.title.20698 10.5962/bhl.title.34905 10.5962/bhl.title.17497 10.5962/bhl.title.38182 10.5962/bhl.title.37058
Chris Freeland Technical Director, Biodiversity Heritage Library
Director, Center for Biodiversity Informatics, Missouri Botanical Garden
@chrisfreeland
About Biodiversity Heritage Library
• International consortium of the world’s leading natural history libraries
• Funded to digitize books & journals in the public domain
@BioDivLibrary
Freeland. ALA Midwinter. 21 Jan 2012. @chrisfreeland #bhlib #alamw12
Purpose of exercise
• Make legacy scientific literature citable via modern systems
– Make 250+ years of scholarly communications available via new tools
• Assign contemporary identifiers to legacy literature
– DOI
– ISBN
– ISSN
Identify the Identifiers
• Digital Object Identifier (DOI)
– Resolvable identifier for a digital object
– Register through an agency: CrossRef is popular
• Provides citation metrics, reference linking
– Co$t
• International Standard Book Number (ISBN)
– US Agency for registration: Bowker
– Co$t
• International Standard Serial Number (ISSN)
– US assignments: Library of Congress
– Free
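To make "resolvable identifier" concrete, here is a minimal Python sketch that splits a DOI into its registrant prefix and its suffix and builds a resolver link. The function names are hypothetical, the resolver base is simply the http://dx.doi.org/ form used elsewhere in these slides, and the check is deliberately loose; it is an illustration, not how BHL or CrossRef validate DOIs.

```python
# Assumption: the resolver base is the form that appears on these slides.
RESOLVER = "http://dx.doi.org/"


def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash.

    The prefix identifies the registrant (10.5962 on the BHL slides);
    the suffix is chosen by that registrant.
    """
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Build a resolver link for a DOI (hypothetical helper)."""
    prefix, suffix = split_doi(doi)
    return f"{RESOLVER}{prefix}/{suffix}"


if __name__ == "__main__":
    print(split_doi("10.5962/bhl.title.13791"))
    print(resolver_url("10.5962/bhl.title.13791"))
```

Splitting only at the first slash matters because a suffix may itself contain slashes or dots.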
DOI, ISBN, ISSN, ei, ei,…uh oh…
• Ran into trouble with each agency
– BHL isn’t a publisher
– BHL is a consortium, not a separate legal entity
– BHL doesn’t own all the content it serves
– “We know we need a policy on that…”
…and on…and on…for more than 3 years…
CrossRef DOIs: path of least resistance
Challenge: Title Matching for Uniqueness
• Only want 1 DOI per intellectual, citable unit
• Differences between cataloging & publishing
– Libraries:
<datafield tag="245" ind1="1" ind2="4">
  <subfield code="a">The amoebae living in man;</subfield>
  <subfield code="b">a zoological monograph,</subfield>
  <subfield code="c">by Clifford Dobell.</subfield>
</datafield>
– Publishers:
<title>The amoebae living in man; a zoological monograph</title>
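The gap between the cataloger's MARC 245 field and the publisher's single title string can be bridged in code. The Python sketch below joins subfields a and b, trims trailing ISBD punctuation, and derives a crude comparison key. The function names and the normalization rules are assumptions for illustration, not BHL's actual matching logic, and the fragment is parsed without the XML namespace that full MARCXML records normally carry.

```python
import re
import xml.etree.ElementTree as ET

# The 245 field from the slide, without the MARCXML namespace.
MARC_245 = """<datafield tag="245" ind1="1" ind2="4">
  <subfield code="a">The amoebae living in man;</subfield>
  <subfield code="b">a zoological monograph,</subfield>
  <subfield code="c">by Clifford Dobell.</subfield>
</datafield>"""


def title_from_marc(xml_text: str) -> str:
    """Join subfields a (title) and b (remainder), dropping c (responsibility)."""
    field = ET.fromstring(xml_text)
    parts = [
        (sf.text or "").strip()
        for sf in field.findall("subfield")
        if sf.get("code") in ("a", "b")
    ]
    # Trailing ISBD punctuation belongs to the catalog record, not the title.
    return " ".join(parts).rstrip(" ,/:;.")


def match_key(title: str) -> str:
    """Lowercase and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


if __name__ == "__main__":
    library_title = title_from_marc(MARC_245)
    publisher_title = "The amoebae living in man; a zoological monograph"
    print(library_title)
    print(match_key(library_title) == match_key(publisher_title))
```

A key this blunt will merge titles that differ only in punctuation, so real matching would likely also compare author, date, or volume before deciding two records are the same citable unit.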
Challenge: Monographs/Series/Monographic Series
• “Report on the Rhynchota collected by the Wollaston Expedition in Dutch New Guinea”
– Published in 1914
– Bound & catalogued as a monograph
– Scanned as a monograph
– Assigned a DOI as a monograph in BHL: http://dx.doi.org/10.5962/bhl.title.13791
• Transactions of the Zoological Society of London, vol. 20, pt. 11.
– Presented by current publisher of that title as an article
– Assigned a DOI as a journal article: http://dx.doi.org/10.1111/j.1469-7998.1912.tb07839x
Guess who provides free access?
Challenge: Ownership of Backfiles
• Who owns public domain works?
– Some publishers consider they have ownership of backfiles for journals they currently publish
• Long running series in natural history
• Curtis’s Botanical Magazine, since 1787
• Those publishers assign DOIs to their current volumes
• BHL assigns DOIs to the public domain works we’ve digitized
Guess who is upset?
BHL DOIs in Use
Linked Data
• CrossRef DOIs are available as Linked Data, announced April 2011:
– http://www.crossref.org/CrossTech/2011/04/content_negotiation_for_crossr.html
• Awesome! :)
• But I couldn’t get it to work :(
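For readers who have not used it, the Linked Data access announced there works through HTTP content negotiation: the client asks the DOI resolver for a metadata format via the Accept header instead of following the redirect to a landing page. A minimal Python sketch follows, assuming the http://dx.doi.org/ resolver shown on these slides and the application/rdf+xml media type; the function names are hypothetical, and at the time of the talk this request failed for BHL titles.

```python
import urllib.request

# Assumption: the resolver base is the form that appears on these slides.
RESOLVER = "http://dx.doi.org/"


def negotiated_request(doi: str, media_type: str = "application/rdf+xml") -> urllib.request.Request:
    """Build (but do not send) a request asking the resolver for metadata."""
    return urllib.request.Request(RESOLVER + doi, headers={"Accept": media_type})


def fetch_metadata(doi: str, media_type: str = "application/rdf+xml", timeout: float = 30.0) -> str:
    """Send the request and return the response body as text (needs network access)."""
    request = negotiated_request(doi, media_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    req = negotiated_request("10.5962/bhl.title.13791")
    print(req.full_url, req.get_header("Accept"))
```

Separating request construction from the network call keeps the header logic easy to inspect when a response comes back empty or in the wrong format.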
So I asked Twitter for help…
• Any #lodlam #LOD ppl interested in taking a look at open #bhlib data? http://t.co/is1a2dUl
…and Twitter responded
• @chrisfreeland Got this ht.ly/8zTuk using morph.talis.com after converting BibTEX to RDF using ht.ly/8zTxq
@asaletourneau:
• @chrisfreeland I poked at making this work, but my results were not encouraging. See the comment on your blog post for details.
@cajunjoel
I also asked CrossRef support
• Turns out it’s a bug!
• CrossRef DOI API wasn’t returning results for ISBN-less books
– And remember, we don’t have those because of the ugly problems
• We identified & resolved a problem via Twitter!
– Crowdsourcing #FTW!
– Talk about #lod bringing people together!
• And all because of this panel
@cajunjoel
@asaletourneau
And then a dash of reality
• “…why bother with linked data? Obviously I know the stock answer, but the reality is that most "linked data" isn't linked.…CrossRef RDF for an article the only external link (eventually) is to http://periodicals.dataincubator.org/ via the ISSN. Author names have arbitrary, non-resolvable URIs. So effectively the data is a silo. Linked data, yes, RDF, yes, but still a silo.”
@rdmpage
Takeaway
• Assigning modern identifiers to legacy content is challenging
– “We know we need a policy on that…”
– Benefits are there, but may be slow to realize
• Working with an agency like CrossRef has its advantages
– Support
– Tech advances: they flip a switch, instaLOD
– Disadvantages: Co$t, may still need custom solutions for emerging technologies
• Linked Data is conceptually promising, but fraught with uncertainty in production systems
THANK YOU
10.5962/bhl.title.24947 10.5962/bhl.title.55495 10.5962/bhl.title.35182 10.5962/bhl.title.16580 10.5962/bhl.title.20698 10.5962/bhl.title.33773 10.5962/bhl.title.34905 10.5962/bhl.title.17497 10.5962/bhl.title.38182 10.5962/bhl.title.37058 10.5962/bhl.title.29660 10.5962/bhl.title.42458 10.5962/bhl.title.33261
Chris Freeland Technical Director, Biodiversity Heritage Library
Director, Center for Biodiversity Informatics, Missouri Botanical Garden
@chrisfreeland