Transcript of BL Labs and Digikult 2016
Working with News Data across different Media
DIGIKULT, Gothenburg, Sweden.
Mahendra Mahey
25 Seconds (68 Words)
*
*
Stockton-on-Tees
are borrowed from public libraries.
St Pancras, London, UK
Legal Deposit Library – Reference only
Uses low oxygen and robots
Reading room and delivery to London
Document Supply and Storage at Boston Spa
140 seconds
The British Library is the national library of the UK and one of
the largest research libraries in the world. The Library moved to
a new purpose-built building in 1997, <click> the largest of
its kind built in the UK in the 20th century. Many
frequently used items are stored 5 storeys below the main building
at St Pancras in London, and many might not know that part of the
building is meant to look like a ship on a journey to
discovery!<click>. <click to switch off>
The building can seat 1,200 researchers at any one time across 5
reading rooms.
<click>Medium and long term requested items are held at
Boston Spa in Yorkshire in a low oxygen warehouse, using robots to
retrieve items. In total, the library has 625 km of shelving,
growing by 12 km every year.
*
*
To make our intellectual heritage accessible to everyone,
for research, inspiration and enjoyment and be the most open,
creative and innovative institution of its kind by 2023.
Roly Keating (Chief Executive Officer of the British Library)
Custodianship
Research
Business
Culture
Learning
International
Document: http://goo.gl/h41wW7
Speech: https://goo.gl/Py9uHK
85 seconds
The picture you can see is inside the main building in London. It’s
the King’s Library, King George the Third’s personal library,
sometimes known as the ‘stack’. I walk past this every day and I
sometimes forget that the collections the British Library holds are
truly staggering! We currently estimate them to exceed
<click>150 million items, representing every age of written
civilisation and every known language. Our archives now contain the
earliest surviving printed book in the world, the Diamond Sutra,
written in Chinese and dating from 868 AD….
So some big numbers…
<click>60 million patents
<click>8 million stamps
<click>4 million maps
<click>3 million sound recordings
<click>1.6 million music scores
<click>over 0.3 million manuscripts
*
*
33 Seconds (100 Words)
<Click>
to ‘experiment’ with our digital collections and data. We are
particularly interested in those who have questions which focus on
the potential to find and create NEW things through access to the
digital content. For example, being able to ask a question across
thousands of digitised books or newspapers using computational
techniques would not be feasible using manual methods. Let’s look at a
clear example.
<Click>
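As a rough illustration of asking one question across many digitised texts, here is a minimal Python sketch that counts how often a word appears in each document of a small corpus. The function name, the document names and the three sample texts are hypothetical stand-ins, not British Library data or tooling; a real corpus would be loaded from files or a data service.

```python
import re
from collections import Counter


def count_term_by_document(documents, term):
    """Count case-insensitive whole-word occurrences of `term` per document."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return Counter(
        {name: len(pattern.findall(text)) for name, text in documents.items()}
    )


# Hypothetical stand-in corpus (assumption: invented text, for illustration only).
sample_corpus = {
    "book_001": "The railway reached the town in 1840. The railway changed everything.",
    "book_002": "No trains are mentioned in this volume.",
    "newspaper_1850_03_02": "Railway shares fell sharply; railway investors were alarmed.",
}

counts = count_term_by_document(sample_corpus, "railway")

# List documents from most to fewest mentions.
for name, n in counts.most_common():
    print(name, n)
```

The same loop works unchanged over three documents or many thousands, which is the point made above about computational versus manual methods.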
17 Seconds (53 Words)
*
[Chart: cumulative size in GB by year, 2006 to 2014]
Year   New items   Total items    New GB   Total GB
2006      249137        249137     1,832      1,832
2007       13136        262273     1,721      3,553
2008      176148        438421    15,740     19,293
2009      145227        583648    26,457     45,750
2010      488731       1072379    74,265    120,015
2011      801956       1874335    62,277    182,292
2012     5082907       6957242    60,436    242,728
2013      825789       7783031   106,009    348,737
2014      871782       8654813    77,526    426,264
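The yearly figures above appear to pair a "new" count with a running total. A minimal Python sketch, assuming that reading of the columns (the slide does not label them explicitly), adds up the yearly item counts and compares the result with the totals shown:

```python
from itertools import accumulate

# Yearly "new items" figures copied from the slide data, 2006 to 2014.
years = list(range(2006, 2015))
new_items = [249137, 13136, 176148, 145227, 488731,
             801956, 5082907, 825789, 871782]

# Totals as shown on the slide (assumed to be cumulative item counts).
reported_totals = [249137, 262273, 438421, 583648, 1072379,
                   1874335, 6957242, 7783031, 8654813]

# Running sum of the yearly additions.
running_totals = list(accumulate(new_items))

for year, new, total in zip(years, new_items, running_totals):
    print(f"{year}: {new:>9,} new, {total:>9,} in total")

# True when the running sum reproduces every total on the slide.
totals_match = running_totals == reported_totals
```

The GB columns follow the same pattern, but they are rounded on the slide, so a running sum of the rounded yearly values can differ from the displayed total by one.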
Kings Cross, http://www.knowledgequarter.london
6 Seconds (20 Words)
*
*
Copyright cleared?
Internal Access and Reuse and Licensing Group (Risk assessment
group – Strategic, Commercial, Copyright, Curatorial,
Technical)
Curated?
Learn the story behind a collection!
Is there a human who knows the ‘story’ about the collection, who
wants it used, are there any surprises lurking?
Metadata available?
Finding openly licensed collections is sometimes like detective
work and, from lessons learned, Labs uses the following 4 methods
for filtering digital content:
<click>Is the Copyright cleared for research and non-commercial
use?
<click>Is it Curated (Is there someone who knows the ‘story’
about the collection?)
<click>Is there Collection / Item Level Metadata available?
And importantly what state is it in, does it need cleansing?
<click>Finally, where is it?
<click>These have been effective filters in doing the work of
Labs in an agile way.
<click>Labs has therefore identified several collections at
the website above, some are shown in the slide:
<click>Due to our licensing conditions, we are in the process
of text mining the abstracts for a large number of journal titles
in electronic form. The visualisation indicates the subject spread
of our collections.
<click>We have been harvesting the UK Web since 1993 and this
is available as a resource under specific conditions for
research.
<click>We are also investigating the use of our item request
data (around 17 million records) and anonymised reader data, data
protection allowing.
<click>The British National Bibliography has over 3 million
catalogue records from the British and Irish National Library
catalogues available as linked open data, licensed under CC0.
*
*
only on site due to © or ethical etc
not online / available – various storage devices, personal
data
online and open
online behind paywall
<click>The British Library faces many challenges of access to
our Digital collections!
<click> Sometimes digital content is only available onsite
due to license restrictions,
<click>or even only on a specific computer in a reading room!
Technically there are very few reasons why digital content can’t be
online
<click> though it might be too big or hasn’t been transferred
from other digital storage media.
<click>Sometimes access is through a paywall. Finally,
<click>some content is in the happy sunny place, online, open
and freely available.
The real reasons why there are challenges to accessing digital
content are of course human. They require different approaches from
the Library and may often involve an honest, open dialogue and
negotiation with the publishers.
*
*
Soon…http://data.bl.uk
35 Seconds overall
We have created collection guides detailing some of these digital
collections <Click>on our Digital Scholarship site.
<Click>and some on the Labs site.
<Click> Soon data.bl.uk will be the place where people can
directly access some of the digital collections we have
available.
<Click> Today we have brought data with us; see the guide on
how to access it and the printouts on the tables.
*
*
http://goo.gl/cwThHw
Books and newspapers
From The Digital Scholar: How technology is transforming scholarly
practice, Martin Weller, Bloomsbury Academic, 2011, page 4
Specialist who employs digital, networked and open approaches to
demonstrate their specialism.
Digital
Networked
Open
50 seconds
In his book, The Digital Scholar: How technology is transforming
scholarly practice, Martin Weller suggests that a shorthand term
should be used to loosely define a Digital Scholar. First of
all,
<click>the person does not necessarily need to be a
recognised academic or someone who posts online<click>. It is
someone who employs
<click>digital,
*
*
Adam Crymble
Adam Crymble was doing his PhD research on Distant Reading at
King’s College. He won a competition to explain his thesis in 2
minutes in the PhD Comics competition.
*
*
Annotation
*
*
Only a small amount of content is digitised!
Might not be the treasure expected at the end of a digital
journey!
This is where Labs works
*
6 Seconds (20 Words)
*
*
Competition
Awards
Projects
Tell us your ideas of what to do with our digital content
Show us what you have already done with our digital content
Talk to us about working on collaborative projects
*
Residency June – October 2016.
Showcase @ Symposium Monday 7 Nov 16.
Winner £3000 & Runner up £1000!
Pitch an idea and win a goody bag!
Competition
41 Seconds (123 Words)
One way is by running an annual competition which is open to the
world! All you have to do is
<Click>submit an idea by 11 April 2016.
<Click>The two finalists will be announced in late May
<Click>and they will work with us in residence between June and
October,
<Click>where they will get up to £3600 financial support,
together with technical, curatorial and other types of
support.
<Click>The winners will showcase their work and receive their
prizes at our symposium on Monday 7th of November.
*
*
@BL_Labs #bldigital #digikult http://goo.gl/Fp9SQW
Projects already using BL digital content in interesting and
innovative ways.
Submit projects (previous and new) by
5 September 2016.
Winners announced @Symposium 7 Nov 16.
£500 Winner & £100 Runner Up.
Awards
15 Seconds (45 Words)
The next way we try to engage those interested in using our digital
content is through our Awards.
<Click>These recognise work already carried out using our
digital content. <Click>The deadline for this year is 5
September. You can submit previous and new projects <Click>in
one of four categories: Artistic, Commercial, Research and Teaching
& Learning. <Click> Winners will be announced on Monday
7th of November.
*
*
Projects
Ideas change once you try to access, examine and use the
data!
Talk to us about working on potential ideas / projects.
8 Seconds (24 Words)
*
*
6 Seconds (20 Words)
*
*
7 Seconds (21 Words)
*
*
Mrs Folly
reliably in it automatically
Tweak if necessary
21 Seconds (65 Words)
*
*
The Red Lion Pub, Soho
Chartists Re-enactment
33 Seconds (101 Words)
*
*
<Click>
*
*
Bob Nicholson interviewed on
http://goo.gl/fmV9ep
http://goo.gl/xIDRhz
9 Seconds (25 Words)
*
*
26 Seconds (78 Words)
*
*
http://www.lancaster.ac.uk/fass/projects/spatialhum.wordpress/
Labs Symposium 2015: https://goo.gl/ZCU56a
12 Seconds (37 Words)
*
*
7 Seconds (21 Words)
*
*
105 seconds
Cheryl Tipp, Curator of Environment and Nature Sounds
<click>in Digital Scholarship, worked with the creative
industries department at the British Library and a company called
Ideas Tap to launch the <click>‘Sound Edit Wildlife Films
Competition’, which challenged animators, filmmakers and
photographers to create a short film inspired by the Library's
collection of 10 wildlife sound recordings.
<click>The winning entry was 'Dave's Wild Life' from Samuel
de Ceccatty, a fantastic short which follows Dave, an amateur
naturalist whose sole aim is to have his own TV show. The clip I
will show uses the ‘Haddock drumming calls’ to give a voice to the
cranes or, as Dave liked to call them, the Diplodocus longus
cranum.
Cue up video and play from 47 s to 1 min 58 s
*
*
*
*
Exhibition themed asset packs
Jackson Rolls-Gray, Sebastian Filby and Faye Allen.
Blog: http://goo.gl/mZ2X3T
Video: https://goo.gl/WGfJGo
Off Our Rockers: De Montfort University,
Dan Bullock, Freddy Canton, Luke Day, Denzil Forde, Amber Jamieson
and Braden May
Video: https://goo.gl/fPDZHE
Blog: http://goo.gl/MYih7C
Pudding Lane Productions: De Montfort University, Ian Hargreaves,
Joe Dempsey, Luc Fontenoy, Dominic Bell, Daniel Peacock and Chelsea
Lindsay.
Blog: http://goo.gl/XPBmq1
Video: https://goo.gl/jNrpj5
Closing date: Monday 6th June 2016
Off The Map challenges budding game makers to use collections
from The British Library as inspiration to create exciting
interactive digital media.
*
*
Just one digital collection
75 seconds
The work of Labs is really about a number of stories, stories about
digital collections and about researchers wanting to ask
fascinating research questions about them. Let’s now tell you a
story about one collection and the intended and unintended
consequences of working with it.
The Library digitised 65,000 17th to 19th century books from our
collections a few years ago (around 2.7 % of the physical total in
that period). You can view them from our catalogue or read them on
your <click>iPad via the Historical Books app developed by
BiblioLabs. We also captured 22 million individual page images,
along with full text scans of these images, all of which contain
an untold quantity of useful data such as names of people, places,
historical events and dates.
*
*
http://mechanicalcurator.tumblr.com
http://www.flickr.com/photos/britishlibrary/
from 65,000 Digitised Books
*
*
You can purchase
50 seconds
Here is the anatomy of a Flickr record. Importantly, we have created
links to many of the Library’s services, <click>so some of this
lovely traffic is going back to the Library and hopefully
generating more interest in our services, from downloading a pdf of
the book to purchasing a high res scan of the image.
*
*
Competition Winner 2014
Understanding value / impact of making the BL’s data open / in the
public domain
Analytics dashboard for the Library showing what is happening to
our Flickr Images
Presentation: https://goo.gl/phtgqv
http://goo.gl/8SkfM1
Spells trouble
(Notice the direction faces)
37 Seconds (112 Words)
*
*
James Heald – Wikimedia and Map work
Geotagging maps
18 Seconds (56 Words)
*
*
27 Seconds (82 Words)
*
*
You choose!
79 Seconds Video Clip
We are close to installing the machine at the National Video Arcade
in Nottingham to see how successful the games will be. If you’re
interested in having the machine in your institution, please
contact us.
https://www.youtube.com/watch?v=xoCgHo2rwN4 (Switch on
Subtitles)
1.47 – 3.06 1 min 19 seconds
<Click>
*
*
Play from 4m 50 seconds to 5m 19 seconds
*
*
1m 18seconds
Exhibited from
Music mix by DJ Yoda using British Library Sounds:
https://goo.gl/z3k4JT
David Normal’s art work: http://www.crossroadsofcuriosity.com and
http://www.davidnormal.com
Physical
Digital
Digital
Physical
1m30seconds
16 Seconds Video Clip
Square brackets to indicate inferred information
*
*
Everything has a URL
URL links to page which tells you about the thing
Link to other things
Not there yet!
Internet / APIs
*
Teaming up with Experts?
Many researchers have the domain knowledge but lack the technical
skills to use Digital Research methods
Should our support be more focused on training?
There are plenty of computational experts looking for problems to
solve
*
5 Seconds (15 Words)
*
*
Re-OCRing Newspapers
Flickr API
BL Explore – Search Catalogue
20 Seconds (62 Words)
*
*
only on site due to © or ethical
not online / available – various storage devices, personal
data
online and open
Labs Residency Model
online behind paywall
*
*
Lessons…
Huge appetite to use digital content & data (e.g. Flickr
Commons stats)
Identifying / bridging gaps for researchers to use data.
Can help researchers navigate through the Library to get the data
they want.
28 Seconds (86 Words)
*
*
Start those conversations, start small and simple, but think
big!
Embrace serendipity, work fast, give it energy.
Learn the lessons, tell the positive stories and move on!
Don’t be afraid to experiment and fail!
*
Perfection vs Imperfection
If we focus too much on perfection, we will never get anything
done!
Fear of failure seen as a negative thing.
*
*
Fail faster
Small experiments!
20 Seconds (62 Words)
The Labs is a place where we do many small experiments quickly.
Most importantly it’s where it’s OK to make mistakes and learn from
them. Fail faster and fail better! Perhaps Jimmy Wales’ advice
(founder of Wikipedia) can sum up what we have learned time and time
again.<Click>
40 Seconds
Video Clip
DIGIKULT, Gothenburg, Sweden.
Inspiration and Lessons from the British Library Labs
Mahendra Mahey
25 Seconds (68 Words)
*
*
Why did you…
What if…
What was the worst thing…
If you could have your time again, …
How did you…
What was the biggest challenge…
What was the most successful thing about…
Who did…