Practical approaches to entification in library bibliographic data

Page 1: Practical approaches to entification in library bibliographic data

Practical approaches to entification in library bibliographic data

Page 2: Practical approaches to entification in library bibliographic data

BIBFRAME = Internet of Things

• BIBFRAME is the model, but the devil is in the details

• Reconciliation with legacy data

• Different flavors of the model (kind of like different flavors of MARC, but not really)

• How do we make our data semantic web friendly?

• How do we build links (down with strings!)?

• What services do we trust, and are these services available yet?

• How do we experiment to start learning what works and what doesn’t?

Page 3: Practical approaches to entification in library bibliographic data

Where do you start?

• If you are a developer?

• The current toolset is built for you. LC’s tools, SPARQL, system APIs: as a developer, the raw components that you need to start pulling together toolsets for experimentation can be found if you look for them.

• If you are a cataloger?

• Find a developer, or start writing scripts yourself…

• Today, very few resources are being developed for practitioners. Zepheira has a training set and is sponsoring LibHub, LC’s BIBFRAME site provides examples of data in context, and there is MarcEdit.

Page 4: Practical approaches to entification in library bibliographic data

Linked Data Tools in MarcEdit

• MARCNext

• The MARCNext toolset represents an effort to begin creating a set of tools that can integrate into existing workflows for libraries and catalogers interested in testing or implementing linked data concepts within their bibliographic environments today.

• Directly in MarcEditor

• Integration of the Linked Data tool as part of the cataloger’s workflow

• Via the command line

• As part of cmarcedit.exe, e.g. cmarcedit.exe -s [sourcefile] -d [destfile] -linkeddata
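For anyone scripting the command-line route, here is a minimal Python sketch that assembles the invocation shown above. It uses only the three flags from the slide; the function name and the default executable name are assumptions for illustration, and the sketch only builds the argument list rather than running MarcEdit.

```python
from typing import List


def build_linkeddata_command(source: str, dest: str,
                             exe: str = "cmarcedit.exe") -> List[str]:
    """Build the argument list for one linked-data run.

    Only the -s, -d and -linkeddata flags shown on the slide are used.
    The function name and the default executable name are illustrative.
    """
    return [exe, "-s", source, "-d", dest, "-linkeddata"]


if __name__ == "__main__":
    # Print the command for inspection. To actually run it, the list could be
    # handed to subprocess.run(cmd, check=True) on a machine with MarcEdit.
    print(" ".join(build_linkeddata_command("records.mrc", "records_linked.mrc")))
```

Keeping the arguments as a list (rather than one string) avoids quoting problems when file names contain spaces.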

Page 5: Practical approaches to entification in library bibliographic data

MarcEdit’s MARCNext Toolset

Page 6: Practical approaches to entification in library bibliographic data

MarcEdit’s MARCNext Toolset

• Main motivations for making this available

• Exposes a part of a larger framework presently within MarcEdit to support my research interests in emerging metadata models and linked data concepts in general.

• To place tools in the hands of catalogers, who are largely pushed to the sidelines when thinking about issues like BIBFRAME and Linked Data

• To lower the barriers for those interested in experimenting with their own data

Page 7: Practical approaches to entification in library bibliographic data

MarcEdit’s MARCNext Toolset

• BIBFRAME Testbed: a tool utilizing LC’s XQuery transformations to allow users the ability to visualize their own metadata within various BIBFRAME serializations.

• JSON Object View: a tool allowing users to open a JSON file and visualize the relationships between objects.

• Link Identifiers: a tool that catalogers can use now to embed URIs into the $0 of controlled terms

• SPARQL Browser: a Spartan interface for users wanting to test SPARQL endpoints

Page 8: Practical approaches to entification in library bibliographic data

Link Data Tool

• The Last Mile Problem: To take advantage of metadata models designed for the web, someone will need to “link” the data.

• EZ-Entification: Takes advantage of the current MARC structure to embed $0 subfields into the 1xx, 6xx, and 7xx fields.

• Process supports the generation of links to a wide range of authority sources.

• Presently:

• VIAF

• ID.LOC.GOV

• FAST

• MeSH

• Embedding OCLC Work IDs into records
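The embedding step can be sketched in a few lines of Python. This is an illustration only, not MarcEdit's code: it assumes records in MarcEdit's mnemonic text format (one field per line, starting with "=" and the tag), and the function name and regular expression are hypothetical.

```python
import re

# Fields the slide names as link targets: 1xx, 6xx and 7xx.
LINKABLE_TAG = re.compile(r"^=[167]\d\d\s")


def embed_uri(line: str, uri: str) -> str:
    """Append a $0 carrying `uri` to a mnemonic field line.

    The line is returned unchanged when the tag is not 1xx/6xx/7xx or when
    a $0 is already present (this sketch never overwrites existing data).
    """
    if not LINKABLE_TAG.match(line):
        return line
    if "$0" in line:
        return line
    return f"{line}$0{uri}"


if __name__ == "__main__":
    # Raw strings keep the backslash that stands for a blank indicator.
    print(embed_uri(r"=650 \7$aMedical policy.$2fast",
                    "http://id.worldcat.org/fast/1014505"))
```

The URI itself would come from one of the authority sources listed above; here it is simply passed in.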

Page 9: Practical approaches to entification in library bibliographic data

Link Data Tool

• How it works

• In March 2015, I formalized support for linked data resources and created the melinked_data.dll assembly. This assembly is the engine that drives MarcEdit’s Linked Data work.

• Within the assembly is a resolution framework, designed to enable plug & play networks for eventual user definition of new linked data services.

• Framework has been designed to support SPARQL, JSON-LD, and OpenSearch (with Atom or RSS responses)

• As part of the tool, the resolution algorithm has multiple validation layers, with basic data normalization to ensure optimal communication with the current linking services.
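To make the plug & play idea concrete, here is a small Python sketch of a resolver registry with a normalization step and one validation layer. It is not the melinked_data.dll design, which the slides do not show; every name here (register, normalize, resolve, the "demo" service) is hypothetical, and the demo resolver is an in-memory table rather than a real SPARQL, JSON-LD, or OpenSearch call.

```python
import unicodedata
from typing import Callable, Dict, Optional

Resolver = Callable[[str], Optional[str]]
_RESOLVERS: Dict[str, Resolver] = {}


def register(name: str) -> Callable[[Resolver], Resolver]:
    """Decorator that plugs a resolver into the registry under `name`."""
    def wrap(fn: Resolver) -> Resolver:
        _RESOLVERS[name] = fn
        return fn
    return wrap


def normalize(heading: str) -> str:
    """Basic normalization (an assumption): NFC, collapsed whitespace,
    and trailing punctuation removed before the heading is sent out."""
    text = unicodedata.normalize("NFC", heading)
    text = " ".join(text.split())
    return text.rstrip(" .,;:")


def resolve(service: str, heading: str) -> Optional[str]:
    """Look up `heading` with the named service; None when nothing usable."""
    resolver = _RESOLVERS.get(service)
    if resolver is None:
        return None
    uri = resolver(normalize(heading))
    # Validation layer: accept only values that look like HTTP(S) URIs.
    if uri and uri.startswith(("http://", "https://")):
        return uri
    return None


@register("demo")
def demo_resolver(label: str) -> Optional[str]:
    # Stand-in for a network lookup; the pair is taken from the FAST example.
    table = {"Medical policy": "http://id.worldcat.org/fast/1014505"}
    return table.get(label)


if __name__ == "__main__":
    print(resolve("demo", "  Medical   policy. "))
```

A new service is added by registering another function, which is the property the slide describes; the resolving code does not change.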

Page 10: Practical approaches to entification in library bibliographic data

Link Data Tool

• So what gets linked?

• Tool is looking for specific values

• VIAF and LCNAF linking occurs on 1xx and 7xx data elements

• Subject linking occurs on all 6xx fields

• Linking services are automatically evaluated and processed by utilizing data found within the second indicator and the $2.

• When working with known services, the tool will evaluate any data found in the $0 and, if a URI isn’t present, will update the value appropriately
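The selection rules above can be sketched as follows. The second-indicator meanings are the standard MARC 21 ones for 6xx fields (0 = LCSH, 2 = MeSH, 7 = source named in $2); everything else, including the function name, the service labels, and the assumption of MarcEdit mnemonic input, is illustrative rather than a description of the tool's internals.

```python
import re
from typing import Optional

# "=TAG", whitespace, two indicator characters, then the subfield data.
FIELD = re.compile(r"^=(\d{3})\s+(.)(.)(.*)$")

# Standard MARC 21 second-indicator values for 6xx fields.
SECOND_INDICATOR = {"0": "LCSH (id.loc.gov)", "2": "MeSH"}
# $2 source codes consulted when the second indicator is 7.
SOURCE_CODES = {"fast": "FAST", "mesh": "MeSH"}


def pick_service(line: str) -> Optional[str]:
    """Return a label for the linking service a field should be sent to."""
    match = FIELD.match(line)
    if match is None:
        return None
    tag, ind2, rest = match.group(1), match.group(3), match.group(4)
    # Split "$aValue$2fast" into {"a": "Value", "2": "fast"}.
    subfields = {chunk[0]: chunk[1:] for chunk in rest.split("$")[1:] if chunk}
    if tag[0] in "17":
        return "VIAF/LCNAF"
    if tag[0] == "6":
        if ind2 == "7":
            return SOURCE_CODES.get(subfields.get("2", "").lower())
        return SECOND_INDICATOR.get(ind2)
    return None


if __name__ == "__main__":
    for field in (r"=650 \7$aMedical policy.$2fast",
                  r"=650 \0$aMedicine.",
                  r"=100 1\$aReese, Terry."):
        print(field, "->", pick_service(field))
```

Fields with an unrecognized indicator or $2 code return None, so they are simply left alone.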

Page 11: Practical approaches to entification in library bibliographic data

Link Data Tool

• Creating Actionable Data

• $0 defined as: Authority record control number or standard number (R)

• The Linked Data Tool ignores this definition and uses URIs instead (and will actively convert control numbers to URIs)

Example:

=650 \7$aMedical policy.$2fast$0(OCoLC)fst01014505

• Converted to:

=650 \7$aMedical policy.$2fast$0http://id.worldcat.org/fast/1014505
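The conversion in this example can be reproduced with a short Python sketch. The control-number pattern and URI base are read directly off the FAST example above; the function names are hypothetical, and the line-level helper assumes the $0 is the last subfield in the field, as it is here.

```python
import re

# "(OCoLC)fst" followed by a zero-padded number, as in the example above.
FAST_CONTROL = re.compile(r"^\(OCoLC\)fst0*(\d+)$")


def fast_control_to_uri(value: str) -> str:
    """Turn a FAST control number into its URI; leave anything else as-is."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return value  # already a URI
    match = FAST_CONTROL.match(value)
    if match is None:
        return value  # not a FAST control number this sketch recognizes
    return f"http://id.worldcat.org/fast/{match.group(1)}"


def convert_field(line: str) -> str:
    """Rewrite the trailing $0 of a mnemonic field line, if there is one."""
    head, sep, tail = line.rpartition("$0")
    if not sep:
        return line
    return f"{head}$0{fast_control_to_uri(tail)}"


if __name__ == "__main__":
    print(convert_field(r"=650 \7$aMedical policy.$2fast$0(OCoLC)fst01014505"))
```

Values that are already URIs pass through untouched, so the function is safe to run more than once over the same record.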

Page 12: Practical approaches to entification in library bibliographic data

Link Data Tool

• Challenges

• Are the linking services ready?

• Honestly, many of these services are still evolving. Will a VIAF identifier continue to make the most sense when linking to OCLC person data, or will the Person Identifiers that they talked about at ALA be more appropriate?

• Id.loc.gov doesn’t handle redirects well through the API: there is (or was, last time I tested) a disconnect between terms that have been replaced.

Page 13: Practical approaches to entification in library bibliographic data

Link Data Tool

• Challenges

• Linking will also be local, and how will those services be implemented? I’m hoping SPARQL, but my experience has been all over the map.

• Where is OCLC in all of this? They are working hard on their own internal data streams, but it’s actually groups like Zepheira, BibFlow, and LD4P that are actively engaging catalogers.

Page 15: Practical approaches to entification in library bibliographic data

Source Code

• Zepheira BIBFRAME Testing Plugin Code

• Code is provided minus the API key

• Includes the linked data assembly from ME

• http://marcedit.reeset.net/software/plugins/source/libhub.zip

Page 16: Practical approaches to entification in library bibliographic data

Contact Me:

Terry Reese

Head of Digital Initiatives

University Libraries

175 West 18th Avenue, 320F 18th Avenue Library, Columbus, OH 43210

614-292-8263 Office / 614-407-4998

[email protected] / http://library.osu.edu / http://reeset.net