Online Presentation
Establishing the Connection: Creating the Linked Open British National Bibliography
Neil Wilson, Head of Metadata Services
Online Information Conference 30 November 2011
twitter.com/#!/BLMetadata
2
British Library Metadata Services - Background
Operated prior to the BL’s foundation as ‘The British National Bibliography’ (BNB) Ltd from 1950
Originally offered priced services for national & international libraries
Evolved through changes in format & delivery technologies
Offered free services from 2010 as part of the BL’s new open metadata strategy
3
BL Metadata Services Stakeholder Relationships
4
Library Sector Relevance?
Declining? Increasing?
“I did my PhD with only 12 visits to a library. That was 5 years ago; things have improved since then, now you don’t need to use one at all!”
“The release of library data offers the opportunity for it to be used in ways unthought-of by the library & information community…”
Changing Expectations - Putting Public Sector Data To Work
McKinsey forecasts that the benefits of open public data could be worth 250bn euros
“Putting the Frontline First” required “the majority of government-published information to be reusable, linked data” by June 2011.
Public data will be released under the same open licence which enables free re-use, including commercial re-use
6
Library Metadata - The Promise of Linked Data
Better web integration of resources increasing visibility & reaching new users
A global pool of reusable data for organisations to add unique value
New library leadership opportunities due to persistence, stability & authority
Such benefits cross national & sectoral boundaries but require huge cultural changes
7
How Are We Meeting The Challenge?
Our new open metadata strategy aims to:
Enable increased innovation without unnecessary barriers
Break from library formats & use cross domain standards
Obtain attribution while offering more permissive licensing
Deliver with decreasing resources while maintaining revenue
8
What Have We Achieved?
Signed over 450 organisations in 71 countries to free data services
Supplied XML datasets of 3-15 million items under Creative Commons licences
Worked with JISC & linked open data implementers on technical, standards & licensing challenges
Created a linked data version of the British National Bibliography
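Chart: User type
Categories: Academic, Charity, Commercial, Consortia, Government, Individual, Medical, National, Public, School
Shares shown (order as extracted, pairing with categories not recoverable): 42.7%, 1.5%, 9.9%, 1.0%, 3.5%, 6.2%, 2.0%, 4.2%, 6.7%, 22.3%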
9
Our Linked Data Journey… Why the British National Bibliography?
We wanted to:
Advance debate from theory to practice via release of a critical mass of data
Show commitment by using a core dataset - niche examples are not as compelling
Create a foundational service others can build upon & not a dead end
10
Our Linked Data Journey… Preliminaries
We first identified:
The best licensing model for our objectives (CC0)
A proven hosting platform (Talis)
Sources of expert knowledge & feedback (e.g. W3C, Open Bibliography etc.)
…in order to concentrate effort on adding new value to our data
11
Our Linked Data Journey… Additional Objectives
The project would be a staff & organisational development opportunity using:
In-house personnel i.e. librarians rather than IT experts
Pre-existing tools & technologies
Library MARC21 data
Established & trusted linked resources
12
Our Linked Data Journey… Migrating From a Flat Catalogue Card Model…
We aimed to:
Start simple & develop in line with evolving staff expertise
Utilise staff training & mentoring from Talis in:
Linked data concepts
RDF modelling
Presentation options
… and use the opportunity to blend the best of traditional & new approaches
13
Our Linked Data Journey… To Something New…
14
Our Linked Data Journey… Selecting Sites To Link To (for mutual benefit)
To position our data in a wider context
We blended general linked resources i.e.:
GeoNames
Lexvo
RDF Book Mashup
With key linked library resources i.e.:
Dewey.info
LCSH SKOS
VIAF
15
Our Linked Data Journey… Matching & Generating Links
Three approaches used:
Automatic generation from data elements in records
Automated text matching with linked data resource dumps
Two stage crosswalk matching process for coded data
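As a rough illustration of the second and third approaches, the sketch below matches record text against a small label-to-URI dump and resolves a coded value in two stages (code to term, then term to URI). It is not the BL's actual matching code: the sample dump, the code table, the example.org URIs and all function names are hypothetical.

```python
import unicodedata

def normalise(label):
    """Fold case, strip accents and punctuation so near-identical labels compare equal."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch if ch.isalnum() else " " for ch in no_marks.casefold())
    return " ".join(kept.split())

def build_index(dump):
    """Map normalised label -> URI for every (label, uri) pair in a resource dump."""
    return {normalise(label): uri for label, uri in dump}

def match_text(value, index):
    """Automated text matching: look the normalised value up in the dump index."""
    return index.get(normalise(value))

def match_code(code, code_to_term, index):
    """Two stage crosswalk: coded value -> term, then term -> URI."""
    term = code_to_term.get(code)
    return match_text(term, index) if term is not None else None

# Hypothetical sample data, not taken from any real linked data dump.
PLACE_DUMP = [
    ("London", "http://example.org/place/1"),
    ("Orléans", "http://example.org/place/2"),
]
CODE_TO_TERM = {"xx1": "London"}  # hypothetical code table

index = build_index(PLACE_DUMP)
print(match_text("LONDON.", index))
print(match_code("xx1", CODE_TO_TERM, index))
```

Normalising both sides before comparison is what lets catalogue strings with trailing punctuation or missing accents still find their target; values that do not match return None and can be set aside for review.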
16
Our Linked Data Journey… Embedding The Links
17
Our Linked Data Journey - The MARC to RDF/XML Conversion Workflow
MARC to RDF/XML conversion consists of multiple automated steps using a number of tools:
Selection
Pre-processing
Character set conversion
URI generation
Data transformation
Workflow (from diagram):
Full BNB MARC21 file
MARC pre-processing: select single volume published books only; normalise for improved matching & transforms; convert to pre-composed UTF-8; create BL URIs and add external URIs by matching
Transform to RDF/XML using XSLT
BNB RDF/XML file
Load to Linked Data Platform
Generate RDF triple dump
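The character set conversion and URI generation steps can be sketched in a few lines. This is an illustration only: the slides describe an XSLT transform to RDF/XML, whereas the sketch below uses plain Python and emits a single N-Triples line for brevity. The base URI, the record layout and the choice of the Dublin Core `title` property are assumptions, not the BL's data model.

```python
import unicodedata

BASE = "http://example.org/resource/"  # hypothetical base URI, not the BL's real pattern
DCT_TITLE = "http://purl.org/dc/terms/title"  # assumed property for the example

def to_precomposed(text):
    """Character set conversion: NFC merges a base letter and combining mark into one code point."""
    return unicodedata.normalize("NFC", text)

def mint_uri(record_id):
    """URI generation: build a URI from a trimmed record identifier."""
    return BASE + record_id.strip()

def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def title_triple(record):
    """Data transformation: turn one record's title into one N-Triples statement."""
    subject = mint_uri(record["id"])
    title = escape_literal(to_precomposed(record["title"]))
    return f'<{subject}> <{DCT_TITLE}> "{title}" .'

# Hypothetical record: the title uses "e" followed by a combining acute accent.
record = {"id": " 0001 ", "title": "Cafe\u0301 society"}
print(title_triple(record))
```

Converting to pre-composed form before matching matters because a decomposed and a pre-composed "é" look identical on screen but compare as different strings.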
18
Where Did We Get To?
Hosted on the Platform:
bnb.data.bl.uk/sparql
bnb.data.bl.uk/describe
bnb.data.bl.uk/search
BNB Books 1950-2011
2.5 Million Records
80 Million Unique RDF Triples
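For readers unfamiliar with SPARQL endpoints, the sketch below shows how a query might be encoded as a GET request URL, following the standard SPARQL protocol convention of a `query` parameter. The host and path come from the slide; the `http` scheme is an assumption, the query is deliberately generic so that it assumes nothing about the BNB vocabulary, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Host and path as shown on the slide; the scheme is an assumption.
ENDPOINT = "http://bnb.data.bl.uk/sparql"

# Generic query: any five triples, with no assumptions about the data model.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

def build_request_url(endpoint, query):
    """Encode a SPARQL query as the `query` parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query})

url = build_request_url(ENDPOINT, QUERY)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

The resulting URL can be pasted into a browser or passed to any HTTP client, provided the service is still available at that address.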
19
What Does It Look Like?
20
Lessons Learned - It’s a new way of thinking…
Legacy data wasn’t designed for this so take care with data modelling & sustainability
Everyone is still learning so you may be the best judge
There are often tools or expertise out there so don’t reinvent the wheel
21
Lessons Learned – Data Issues
Offer sample access to the community for feedback
Expect criticism in addition to positive feedback & continually improve
Any conversion inevitably identifies hidden data issues…& creates new ones!
…but it’s often better to release an imperfect something than a perfect nothing!
22
Lessons Learned - Staff and Resource Issues
It can be a steep learning curve so:
Exploit external expertise to work with or guide your own domain experts
Cultivate a staff culture of enquiry & innovation to widen perspectives
Identify & use pre-existing tools to save development time & assist data validation
23
Lessons Learned – Was It Worth It?
The benefits have been significant & the initiative has:
Given us a presence without distorting revenue streams …& may even offer new options
Gained us a 1st mover advantage within our sector & advanced discussion as hoped
Shown that if you offer useful data, people will use it, with over 3 million transactions in the first 3 months
24
Our Linked Data Journey - Where Next?
Release of further BNB material
Refine & document the new data model
Identify further resources to link to
Monthly updates on completion
Identify what else can be offered?
25
Final Thoughts…
It’s never going to be perfect first time
We expect to make mistakes
We aim to learn from them
We hope others will learn something too
& everyone benefits
So if anyone is thinking of undertaking a similar journey…
Just do it!