Post on 11-Jan-2017
British Library Labs:
Lessons learned in its first year
TRACK 2: EXPLOITING SEARCH, RESEARCH & DISCOVERY
Tools and e-resources for researchers
Online Information Show 2013
Victoria Park Plaza Hotel, London, SW1V 1EQ, UK
Wednesday 20th of November, 2013, 1130 - 1200
Mahendra Mahey
Manager of British Library Labs
http://labs.bl.uk 2 #bl_labs labs@bl.uk
http://goo.gl/wgzCrP
Overview
• What is Labs?
• Lessons learned…
– getting content
– data driven approach, lessons learned
– running the first competition
– other engagement
• Questions and discussion
What is British Library Labs?
• 2-year project funded by the Andrew W. Mellon Foundation.
• Encouraging scholars to do research and development with and
across British Library digital collections and data (born digital
and digitised). No digitisation involved in project.
• ‘Data driven’ approach through competitions, events and
creating an environment for scholars where they can work
intensively with British Library digital collections / data.
• Library will learn how better to support digital scholars and
build on existing or create new processes, tools (e.g. APIs etc.)
and services.
• Case studies for other research libraries around the world
wanting to build Labs for their digital content.
Our Brand…
At the beginning of the project: ‘Let loose on our digital collections’
Now: ‘Experiment with our digital collections’
How Labs works in pictures…
[Diagram: BL Labs]
• Audience: Researchers, Developers, ? (each bringing a research question / idea)
• Engagement with Labs: Competition, Contact, Events, Meetings and visits
• Experimenting with our digital collections: BL Digital Collection / Data, Other Digital Collection / Data
• Outputs from engagement: Data, Software, Publications, Tools & services to support Digital Scholarship, Other outputs…
Stakeholders involved in Labs
[Diagram: Labs and its stakeholders: Project Board, Advisory Board, Digital Scholarship Team, Digital Curators, Access & Reuse Group (©), 200 Curators, and Researchers and Developers both in the British Library and in universities & wider (e.g. companies, start-ups, independent scholars, etc.)]
Labs Project and Advisory Boards
Project Board
• Kristian Jensen - Head of Arts and Humanities, British Library
• Richard Boulderstone - Chief Digital Officer, British Library
• Maja Maricevic - Head of Higher Education, British Library
• Michele Burton - Head of Trusts & Foundations, British Library
Advisory Board
• Professor Tim Hitchcock (Digital History) - University of Sussex
• Professor Andrew Prescott (Digital Humanities) - King’s College London
• Bill Thompson (Head of Partnership of Archive Development Group) - BBC
• Professor Claire Warwick (Digital Humanities) - University College London
• David De Roure, Professor of e-research - Oxford e-research centre (University of Oxford)
Labs staff
Mahendra Mahey
- Labs Project Manager (Started in March 2013)
Ben O’Steen
- Labs Technical Lead (Started in August 2013)
Researchers / Interns / Volunteers / Curators
- Can start at any time
Digital Scholarship
Digital Curator Team
• Stella Wisdom - Digital Curator
• Nora McGregor - Digital Curator
• Aquiles Alencar Brayner - Digital Curator
• James Baker - Digital Curator
• Rossitza Atanassova - Digital Curator
Digital Scholarship Heads
• Adam Farquhar - Head of Digital Scholarship (Wrote Labs proposal)
• Aly Conteh - Head of Digital Research and Curator Team
200 British Library Curators
• Responsible for many different kinds of collections, though not all digital
• If they work with digital content, and the content is freely
available (or potentially), important to get the curators on board
with Labs and work together
• ‘Story’ behind a collection - detailed knowledge (e.g. rights,
how acquired, who uses them (potential researchers working
with Labs), etc.)
• Usually subject experts too, so have good ideas of what to do
with the content
• May know about curatorial decisions about which items were
chosen and answer questions around metadata
Access and Reuse Group
• Internal group within the Library responsible for granting open licences for digital content
• Meets around once every 2 months
• Curators submit an ‘Access and Reuse Authorisation Request’
• At the meeting, a decision is made (through risk assessment) on what kind of licence, if any, can be applied to the content
[Diagram: an Access and Reuse Authorisation Request goes to the Access & Reuse Group (©), which returns cleared content]
Researchers in the Library
• Library is a research organisation in its own right and is able to bid for funding from Research Councils
• Over 100 researchers working at the Library
• Labs engages with researchers internally (and externally)
through funding calls especially with digital content
Developers in the Library
• Over 30 software developers working in the Library
• Labs has helped facilitate the creation of a ‘Library
developers’ group to share ideas, best practice, exchange
knowledge, assess a list of web services / APIs for external
access
Researchers outside the Library
• Main target for Labs is researchers in UK academia (though there is interest from around the world)
• Researchers considering or already using digital content and associated research methods
• They do not necessarily have to be skilled in computational / digital research methods; they just need a good idea, and Labs can support them where possible
Developers outside the Library
• Of interest to developers in academia and the commercial sector.
• Labs has organised and participated in hack events, getting developers to use our content to build things, e.g.
– http://hackathoncentral.com/ (26-27 Oct, 2013, Google Campus, London, UK)
– http://labs.bl.uk/Competition+2013+-+Hack+Event (26-27 May, 2013, British Library, UK)
• Interest from start-ups / creatives working with Labs and creating new opportunities
• Some researchers have software development skills
• Pairing up developers with researchers at events can lead to potentially useful collaborations
British Library Digital Collections
Where do you start?
Over 600 digital collections and counting…
Some kind of filter needed
Finding openly licensed digital content can be like finding a needle in a haystack
Image: http://www.flickr.com/photos/t_buchtele/3422507814/
Sifting through
British Library Digital Collections
• Copyright cleared for research
and non commercial / commercial use, or close
to (cleared through Access and Reuse Group)?
• Curated (Is there someone who knows the
‘story’ about the collection?)
• Collection / Item Level Metadata available? (What state is it in? Does it need cleansing?)
• Where is it?
• Most content is in Arts and Humanities
domain
• Lots of meetings in the canteen!
[Diagram: where British Library digital content can be accessed]
– Available only in Reading Rooms due to ©
– Available on site only at the moment due to ©
– Digital but not online (various storage devices)
– Available only onsite (at the moment): local events, researchers in residence, remote access where possible
– Digital and online
• British National Bibliography
• UK Web Archive Data
• 19th Century Books
• Environmental Sounds
• Text-mining of electronic journals
• Book ordering and anonymised reader data
• Resonance FM: 10 year Community Arts Radio Show
Datasets, Books / Text, Images / Music, Maps, Sounds, Multimedia: http://labs.bl.uk/Digital+Collections
Planning to launch http://data.bl.uk
Example digital research methods
http://labs.bl.uk/Launch+Event (has some examples from researchers)
• Corpus analysis tools
• Visualisations
• Location based searching
• Geotagging
• Annotation
• Crowdsourcing / Human Computation
• Natural Language Processing
• Using APIs for datasets, e.g. Metadata, Images
• Transcribing
Engaging with Labs
• Events – Hack/ Data Days and Ideas Labs
• Funding calls – writing Labs into funding proposals
• Competitions – running competitions to encourage
researchers to come up with ideas, Labs will try to work with
them (resources permitting)
Engaging with Labs - events
• Hack and Data days - researchers, developers, curators and anyone interested in digital collections working together at events, solving problems and developing prototypes
• Ideas Labs – researchers together
over lunch, engaging with the
Library’s digital collections through
cards, coming up with ideas / research
questions, focussing on what outputs
might be generated
• Contact us…
[Images: brainstorm ideas & group; consider and choose; work into the night and show what has been done]
Image: http://www.flickr.com/photos/zoonie/5077408371/sizes/l/in/photostream/
[Images: Labs Data Cards; Social Media Top Trumps Cards]
Engagement with Labs…
written into funding calls
• Labs was written into the recent AHRC Big Data Call (August 2013) as a data partner, facilitating access to BL digital content
• Five potential projects
• Labs is regularly being considered for inclusion in other calls, e.g. other Research Councils, National Lottery funding, etc.
• Labs is another route to accessing digital content at the Library
• Contact us!
Engaging with Labs - competitions
• Labs will organise at least 2 Competitions
• Winners will work ‘in residence’ where possible
• ‘Data Driven’ approach, i.e. here is our data, come and do stuff with it!
• Focus particularly on cross collection research, research at
scale but other research and development encouraged too!
• Help develop tools and services to support digital
scholarship
• Approach will be re-examined each time, e.g. possibly
smaller rewards for shorter pieces of work, theme etc.
Labs Competition 2013
• Launched late April 2013, closed end of June 2013
• 22 high quality entries
– Text mining tool in the reading rooms
– Curatorial…repackaging metadata for teaching and learning
in a CMS e.g. Drupal, funded through another AHRC fund,
creating narratives on Oil paintings from the India Office /
Foreign and Commonwealth office
– Working to re-use a radio archive
• 2 winners chosen, ideas worked on in ‘residence’ at the Library
working with Labs (expenses paid) from Aug – Nov 2013,
presented at showcase event, 11 November 2013
The winners of the Labs 2013 competition
Dan Norton (left) and Pieter Francois (right) each receiving a cheque for £2000
as winners of the first British Library Labs competition (2013) from
Adam Farquhar, Head of Digital Scholarship, The British Library, on 11 November 2013 at the
Transforming Research through Digital Scholarship event, held at the British Library, London, UK
http://labs.bl.uk/Competition+2013+Showcase
Dr Pieter Francois
• The Sample Generator
• Postdoctoral Researcher at the University of
Oxford, interested in travel in the 19th Century in
Europe
• Creating a demonstrator which searches across 1.8 million metadata records from the 19th Century and, where possible, finds highly significant digital samples for further research from the books we have digitised so far
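One way to picture this kind of sampling is the minimal sketch below. It is not the Sample Generator's actual code: it simply filters metadata records by a search term and then draws a random sample whose spread across decades follows that of the matching records. The record fields (`title`, `year`), the function names and the example records are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def matching_records(records, term):
    """Return records whose title contains the search term (case-insensitive)."""
    term = term.lower()
    return [r for r in records if term in r["title"].lower()]


def sample_by_decade(records, fraction, seed=0):
    """Draw roughly `fraction` (between 0 and 1) of the records from each decade.

    At least one record is taken from every decade that has any, so
    sparsely covered decades are still represented in the sample.
    """
    rng = random.Random(seed)  # fixed seed so a sample can be reproduced
    by_decade = defaultdict(list)
    for record in records:
        by_decade[record["year"] // 10 * 10].append(record)

    sample = []
    for decade in sorted(by_decade):
        group = by_decade[decade]
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical records; real catalogue metadata would be far richer.
records = [
    {"title": "A Tour through Italy", "year": 1821},
    {"title": "Tour of the Rhine", "year": 1825},
    {"title": "A Summer Tour in Norway", "year": 1853},
    {"title": "Sermons for Sundays", "year": 1853},
]
hits = matching_records(records, "tour")
sample = sample_by_decade(hits, fraction=0.5)
```

Sampling per decade rather than from the whole result set keeps the sample from being dominated by the decades with the most surviving records.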
Sample Generator – distribution of items
http://samplegenerator.cloudapp.net
Sample Generator – with sample
and search terms around ‘tour’
http://samplegenerator.cloudapp.net
Sample Generator – with samples
(a closer look)
http://samplegenerator.cloudapp.net
Dr Dan Norton
• Mixing the Library:
The Disc Jockey and the Digital Collection
• PhD Researcher, University of Dundee, and Artist in Residence
at Hangar, Centre for Art and Research, Barcelona.
• Building a prototype interface for interacting with Library digital
collections, for building aesthetic, experimental, or logical links
between resources; and for developing ad hoc visualizations, or
publishing annotated data, developed from the DJ's interaction
with information.
• Working on a functioning prototype to collect URLs for different media types, e.g. text, video, sound and images, then compare two digital objects and annotate them in real time
Mixing the Library:
The Disc Jockey and the Digital Collection
http://www.tompro.co.uk
http://www.ablab.org/shetland
http://www.ablab.org/pd/di/
Prototype design (diagram labels)
– Annotation
– Preview ‘item’
– Selected ‘left’ channel ‘item’
– Selected ‘right’ channel ‘item’
– Collection ‘stalks’ made of ‘items’. Each ‘item’ is a URL.
– The order of the ‘items’ can be ‘shuffled’ and sent to the ‘left’ or ‘right’ channels
– ‘Play back’ of ‘items’ (Blue) and annotations (Yellow)
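The prototype design can be read as a small data model: stalks of URL items, two channels, and annotations recorded against what is on the channels. The sketch below is only an illustration of that reading, not the prototype's code. The class names `Stalk` and `Mixer`, their methods and the example URLs are hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stalk:
    """A collection 'stalk': an ordered list of items, each item a URL."""
    name: str
    items: List[str] = field(default_factory=list)

    def shuffle(self, seed=None):
        """Reorder the items in place."""
        random.Random(seed).shuffle(self.items)


@dataclass
class Mixer:
    """Two channels, each holding one selected item, plus annotations."""
    left: Optional[str] = None
    right: Optional[str] = None
    annotations: List[dict] = field(default_factory=list)

    def send(self, stalk, index, channel):
        """Load the item at `index` of a stalk into the 'left' or 'right' channel."""
        if channel not in ("left", "right"):
            raise ValueError("channel must be 'left' or 'right'")
        setattr(self, channel, stalk.items[index])

    def annotate(self, note):
        """Record a note against whatever is currently on both channels."""
        self.annotations.append(
            {"left": self.left, "right": self.right, "note": note}
        )


# Hypothetical example URLs.
stalk = Stalk("maps", ["http://example.org/map1", "http://example.org/map2"])
mixer = Mixer()
mixer.send(stalk, 0, "left")
mixer.send(stalk, 1, "right")
mixer.annotate("Same coastline, fifty years apart")
```

Storing each annotation together with the pair of URLs it refers to means the comparison can be replayed later, in the spirit of the ‘play back’ shown in the design.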
Mixing the Library:
The Disc Jockey and the Digital Collection
http://212.71.253.54:8000/a
First functioning prototype; Labs focussed on the backend. To be developed
by a further project, Living Lab: Library of the Future,
see: http://alturl.com/284zw
Other Labs’ developments
• Labs is working with Library stakeholders, particularly with Curators who engaged with the project (reward!)
• Working on technical ‘quick wins’ to support the Library and Labs remit and release as much content as possible; currently prioritising which ‘projects’ to focus on
• The Digital Research and Curator team and Labs were recently awarded £40,000 worth of cloud computing facilities by Microsoft Research to experiment with for a year (36 cores, 10TB), enabling parallel computing tasks to be carried out
The Mechanical Curator
• Posts small illustrations taken almost at random from the digitised book corpus to a Tumblr blog.
• This experiment with undirected engagement was a by-product of work to uncover the hidden wealth of illustrations within the digitised pages.
• Has grown to 100+ unique visitors per day since its launch (26/09/2013).
• http://mechanicalcurator.tumblr.com/
• Images now available on Flickr (420,000 plus images and API), http://goo.gl/OrCKZz
• Using other data types, e.g. sounds etc.
http://mechanicalcurator.tumblr.com/image/67461770133
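The core idea of picking illustrations at random can be sketched in a few lines. This is an illustration only and makes no claim about how the Mechanical Curator itself is built: the file names are invented, the selection here is purely random and never repeats, and the step of posting to a blog is left out entirely.

```python
import random


def pick_next(illustrations, already_posted, rng=random):
    """Choose one illustration that has not been posted yet, or None if all have."""
    candidates = [i for i in illustrations if i not in already_posted]
    if not candidates:
        return None
    choice = rng.choice(candidates)
    already_posted.add(choice)  # remember it so it is not picked again
    return choice


# Hypothetical file names standing in for extracted illustrations.
illustrations = ["book1_p012.jpg", "book1_p044.jpg", "book2_p003.jpg"]
posted = set()
rng = random.Random(42)
picks = [pick_next(illustrations, posted, rng) for _ in range(4)]
```

After every illustration has been used once, `pick_next` returns `None`, so the fourth pick above is empty.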
Other possible technical developments
• Some Labs projects being prioritised at the moment:
– Teletext corpus proof of concept
– Augmenting news metadata through text mining subtitles
– Working with Latin American digital books and AHRC mini project
– Working with colleagues on providing access to the computer
archive of John Maynard Smith (Evolutionary Biologist)
– Releasing more content by working with curators, via various
channels, e.g. Wikimedia, Flickr, other channels, e.g. Early Indian
photos, Russian posters etc.
• Parallel computing facilities, e.g. MapReduce activities for processing large datasets and other tasks that require significant compute power
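For readers unfamiliar with the MapReduce pattern mentioned above, here is a minimal single-machine sketch using a word count over page texts. The example pages and function names are assumptions for illustration; on real parallel facilities the map step would be distributed across many cores or machines rather than run with the built-in `map`.

```python
from collections import Counter
from functools import reduce


def map_page(text):
    """Map step: count the words on a single page."""
    return Counter(text.lower().split())


def merge_counts(total, partial):
    """Reduce step: fold one page's counts into the running total."""
    total.update(partial)
    return total


# Hypothetical page texts standing in for a digitised corpus.
pages = ["the tour of the rhine", "a tour in norway"]

# Each page is processed independently, which is what makes the
# map step easy to spread across many workers.
partial_counts = map(map_page, pages)
word_counts = reduce(merge_counts, partial_counts, Counter())
```

Because each page is counted on its own and the partial counts are merged afterwards, the work scales by adding workers for the map step.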
Lessons learned…in getting digital content
• A filter was necessary because of the amount of content and the size and duration of the project
• Getting the story behind the collection was crucial, usually from the curator
• Getting the curators on board (engaging with the competition, getting them to be judges) and rewarding them afterwards is important (e.g. technical quick wins by working with the Labs technical lead)
Lessons learned…metadata
• Cataloguing isn’t consistent, e.g. dates recorded as ‘1850?’
• Older records don’t have subject classification (only from the 1950s onwards), so have to rely on titles and text mining if possible
• Metadata cleansing needed: duplicate records, records not always linked when updated
• Lots of digital content doesn’t have metadata; initiate crowdsourcing perhaps?
Figure: Distribution of the use of Dewey Decimal in the British National Bibliography, http://benosteen.com/dewey/
There is limited subject classification for the 19th century metadata for books
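Two of the cleansing problems above, uncertain dates such as ‘1850?’ and duplicate records, can be illustrated with a short sketch. It is not the Library's cleansing process: the record fields (`id`, `title`, `date`), the rule that a question mark marks an uncertain date, and the matching of duplicates on normalised title plus year are all assumptions chosen for the example.

```python
import re
from collections import defaultdict


def parse_year(raw):
    """Return (year, uncertain) from strings such as '1850' or '1850?'.

    Returns (None, True) when no four-digit year can be found.
    """
    match = re.search(r"\d{4}", raw)
    if match is None:
        return None, True
    return int(match.group()), "?" in raw


def normalise_title(title):
    """Lower-case a title and strip punctuation and extra spaces for comparison.

    Only ASCII letters and digits are kept, which is a simplification.
    """
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def find_duplicates(records):
    """Group record ids that share a normalised title and year."""
    groups = defaultdict(list)
    for record in records:
        year, _ = parse_year(record["date"])
        key = (normalise_title(record["title"]), year)
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical catalogue records.
records = [
    {"id": 1, "title": "A Tour in Norway.", "date": "1850?"},
    {"id": 2, "title": "A tour in Norway", "date": "1850"},
    {"id": 3, "title": "Sermons", "date": "n.d."},
]
duplicates = find_duplicates(records)
```

Matches found this way are only candidates; a curator would still need to confirm that two records really describe the same item.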
Lessons learned…technical
• Some content is only available on site due to licensing
restrictions
• Labs highlights where systems don’t join up, and this can be flagged internally
• Some restrictions mean that workarounds have to be
developed for researchers to work with the content
Lessons learned…human
• Those who engaged with Labs early and regularly got better results all round
• Working on site brings internal systems and process challenges; the issues are not insurmountable, workarounds are possible, and there are lessons for the Library
• Starting a dialogue with the right person is the most
important lesson…it all starts with a conversation.
Lessons learned…working with researchers
• Release data early
• Research questions change once researchers see the data
• Researchers often don’t know their research questions until they see the data
• Researchers don’t always have the technical skills to do the research
• Pairing researchers with developers might be fruitful, as developers are waiting for problems to solve
Lessons learned from competition
• Have more lead time to allow for engagement with the
sector and longer to work on winning entries
• Data driven approach means research questions may
change depending on data
• Creating ‘themes’ linked to exhibitions
• Think about rewarding small wins at events, e.g. hacks etc.
• Allow more time for judging and for asking entrants to amend their ideas before they start working on them
Future developments on content
• Clear more digital content through Access and Reuse group (need help from volunteers / interns, very time consuming, please contact us!)
• Hope to develop data.bl.uk as part of wider Library Strategy
• Creating data ‘exhibits’ where people can take content for free, like dead drop USB
• Pass it on / pay it forward hard drives to universities / public libraries with Library digital content
Ideas from Ben O’Steen
[Images: a ‘USB’ digital exhibition, showing a timeline from the 19th Century (1800 to 1870) and a ‘USB’ slot; a map of UK universities]
http://goo.gl/cV0Rtf
http://www.notcot.org/post/35623/
http://www.studyin-uk.com/e/uk-university-map/
Future Labs engagement
• Data / hack events at the Library in London
– learn about the data well before the competition deadline and engage with Labs early
– 12 Dec 2013 – Data / Hack Event around images (‘Unseen
Illustrations’) and second competition launched
– 13 January 2014
– 12 February 2014
– 10 March 2014, possibly in April 2014 too
• Ideas Labs around the UK and virtually
– Organising several Ideas Labs in universities around the UK
focussing on early career researchers
– Virtual events to allow for international participation
BL Labs Virtual Event, May 2013 http://www.youtube.com/watch?v=RFt0NvbTFHs
Labs competition 2014
• Launches 12 Dec 2013
• Closes around late March / April 2014
• Start late May 2014 and finish Nov 2014
• Showcase Nov / Dec 2014
Other Labs engagement
• Links with successful funding calls
• Ad-hoc collaborations internally and externally with curators,
researchers and developers
Questions…
Email us
• Let us know your ideas for engaging with Labs!
labs@bl.uk
Licensing…
You are free to:
– Copy, share, adapt, or re-mix
– Photograph, film, or broadcast
– Blog, live-blog, or post video of;
this presentation provided that:
– You attribute the work to its author
and respect the rights and licences
associated with its components
This work is licensed under a
Creative Commons Attribution 3.0
Unported License unless stated
otherwise.