Digitised historic newspapers in Europe
![Page 1: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/1.jpg)
Surveying Newspaper Digitisation in European Libraries, Then Aggregating Them!
Europeana Newspapers
Alastair Dunning
Programme Manager, The European Library
@alastairdunning, alastair.dunning AT kb.nl
LIBER Conference, June 2013, Munich
This presentation is at http://www.slideshare.net/alastairdunning
![Page 2: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/2.jpg)
On November 3, 1948, the early edition of the Chicago Tribune proclaimed Thomas Dewey the winner of the US presidential election
http://www.chicagotribune.com/news/politics/chi-histdewey_defeats_an20080104104816,0,547284.photo
![Page 3: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/3.jpg)
In fact, the election was won by Harry Truman, the 33rd President of the United States
http://en.wikipedia.org/wiki/File:Deweytruman12.jpg
![Page 4: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/4.jpg)
Later editions of the Chicago Tribune corrected this mistake with the headline "DEMOCRATS MAKE SWEEP OF STATE OFFICES"
However, I cannot find these online!
http://en.wikipedia.org/wiki/File:Deweytruman12.jpg
![Page 5: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/5.jpg)
As we shall see, presenting comprehensive digital archives, where everything is digitised, is difficult, yet this is what users often demand!
![Page 6: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/6.jpg)
"This lack of collocation and collection presents efficiency challenges and deepens scholars’ concerns about comprehensiveness. The anxiety over “missing something” was quite common across interviews."
Ithaka S+R, Supporting the Changing Research Practices of Historians,
http://www.sr.ithaka.org/research-publications/supporting-changing-research-practices-historians
![Page 7: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/7.jpg)
"When lined up against the non-digital object upon which it is based, the digital object can only ever appear impoverished."
Jim Mussell, Historian at University of Birmingham
http://jimmussell.com/2013/05/23/the-proximal-past-digital-archives-and-the-here-and-now/
![Page 8: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/8.jpg)
Genealogists - those studying family history
"Genealogists represent the majority of users in many archives. And yet, the traditional archival information system does not meet their needs."
Wendy M. Duff, Catherine A. Johnson, Where Is the List with All the Names? Information-Seeking Behavior of Genealogists, American Archivist, Volume 66(1), 2003, http://archivists.metapress.com/content/L375UJ047224737N
![Page 9: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/9.jpg)
Despite this, European libraries have made great strides in digitising their newspapers
(These results are taken from the first Europeana Newspapers survey, 2012. 47 libraries responded.)
http://www.europeana-newspapers.eu/wp-content/uploads/2012/04/D4.1-Europeana-newspapers-survey-report.pdf
![Page 10: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/10.jpg)
129,041,663 pages from 23,987 titles
![Page 11: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/11.jpg)
![Page 12: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/12.jpg)
11 libraries have digitised more than 3m pages
1. National Library of Czech Republic
2. Koninklijke Bibliotheek van België
3. National Library of Spain
4. National Library of Norway
5. National and University Library of Iceland
6. BCU Lausanne
7. Hamburg State and University Library
8. Bibliothèque nationale de France
9. British Library
10. Koninklijke Bibliotheek
11. Austrian National Library
![Page 13: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/13.jpg)
But only 12 (26%) of the libraries had digitised more than 10% of their collection (either in terms of titles or page numbers)
![Page 14: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/14.jpg)
National Library of Luxembourg
620.000 pages digitised
4.000.000 pages in collection
![Page 15: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/15.jpg)
National Library of Finland
620.000 pages digitised
2.010.246 pages in collection
![Page 16: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/16.jpg)
Hamburg State and University Library
c. 2.000.000 pages digitised
c. 12.000.000 pages in collection
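The three examples above can be turned into digitised shares with simple arithmetic. A minimal Python sketch, using only the page counts quoted on these slides (the Hamburg figures are approximate, and the function name `digitised_share` is purely illustrative):

```python
# Page counts as quoted on the slides (the Hamburg figures are approximate).
collections = {
    "National Library of Luxembourg": (620_000, 4_000_000),
    "National Library of Finland": (620_000, 2_010_246),
    "Hamburg State and University Library": (2_000_000, 12_000_000),
}

def digitised_share(digitised: int, total: int) -> float:
    """Return the percentage of a collection that has been digitised."""
    return 100 * digitised / total

shares = {name: digitised_share(d, t) for name, (d, t) in collections.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}% digitised")
```

On these figures all three libraries are above the 10% mark, yet most of each collection remains undigitised.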
![Page 17: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/17.jpg)
What else did the survey discover?
![Page 18: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/18.jpg)
Access to digitised newspapers is nearly always free of charge. At least 40 libraries (85%) offered free access to their digitised newspapers.
One library had pay-per-view, whilst another three offered subscription services for users (i.e. paid access per day or per month).
Only four libraries licensed their newspaper content to other groups (e.g. schools, universities).
![Page 19: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/19.jpg)
Access to twentieth-century content remains problematic.
27 out of 47 libraries (57%) have a cut-off date beyond which they will not publish digitised newspapers on the web. Most frequently, this is based on a 70-year sliding scale.
11 out of 47 libraries (23%) had an agreement with a rights organisation so that in-copyright digitised newspapers could be published, but this was often restricted to individual titles.
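A 70-year sliding scale means the cut-off moves forward by one year every year. A minimal sketch, assuming the rule is applied to the publication year alone (individual libraries may well apply it differently); `cutoff_year` and `is_publishable` are hypothetical helpers, not something taken from the survey:

```python
from datetime import date

WINDOW_YEARS = 70  # the sliding window most frequently reported in the survey

def cutoff_year(today: date, window: int = WINDOW_YEARS) -> int:
    """Latest publication year that falls outside the sliding window."""
    return today.year - window

def is_publishable(publication_year: int, today: date) -> bool:
    """True if an issue is old enough to go on the web under this rule."""
    return publication_year <= cutoff_year(today)

# Assumed date within June 2013, chosen only for illustration.
conference_date = date(2013, 6, 1)

print(cutoff_year(conference_date))           # 1943
print(is_publishable(1948, conference_date))  # False
```

Under this assumed rule, the 1948 Chicago Tribune front page mentioned earlier would still fall inside the restricted window in 2013.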
![Page 20: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/20.jpg)
There is still much to be done to exploit the richness of digitised newspaper content
64% (37 out of 47) of libraries made use of OCR
But only 17 of these libraries (36%) exposed the resulting full text to the viewer
36% had undertaken zoning and segmentation, and only six libraries (13%) had included features such as faceted browsing or the extraction of entities such as places or names
![Page 21: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/21.jpg)
--> Motivation for Europeana Newspapers
Other work packages (WPs) will explain the process of improving digitised archives, but I want to return to one earlier quote
![Page 22: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/22.jpg)
"... the lack of comprehensive search tools for primary sources ..."
Locating primary sources presents a crucial challenge for researchers.
--> TEL aggregator as part of Europeana Newspapers project
![Page 23: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/23.jpg)
Timetable: Early version with limited content added to The European Library website in September 20
More content being added in 2013 and 2014
![Page 24: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/24.jpg)
http://theeuropeanlibrary.org will deliver a search interface to help locate 18m pages digitised at European libraries
Users will also be able to search over titles of newspapers. Title metadata will also be forwarded to Europeana
![Page 25: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/25.jpg)
Some Issues:
Copyright means that some images cannot be shared at all, only metadata (e.g. names and dates of newspapers)
![Page 26: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/26.jpg)
Some Issues:
OCR and zoning quality will affect search results significantly. E.g. pages with higher-quality OCR will be returned more often in search results
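To illustrate why OCR quality matters, here is a toy sketch of exact-term matching over OCR text. The page identifiers and the garbled text are invented for illustration, and the matching is a deliberately naive assumption, not a description of how The European Library's search actually works:

```python
import re

# Hypothetical OCR output for two pages carrying the same headline.
pages = {
    "page-clean": "Dewey defeats Truman in early edition",
    "page-noisy": "Dewcy dcfeats Trurnan in carly edition",  # simulated OCR errors
}

def tokenize(text: str) -> set[str]:
    """Lower-case the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def search(query: str) -> list[str]:
    """Return the ids of pages whose OCR text contains every query term."""
    terms = tokenize(query)
    return [pid for pid, text in pages.items() if terms <= tokenize(text)]

print(search("Truman"))   # ['page-clean']
print(search("edition"))  # ['page-clean', 'page-noisy']
```

The noisy page is only found for words the OCR happened to read correctly, so the same content surfaces less often.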
![Page 27: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/27.jpg)
Some Issues:
Some pages have no OCR whatsoever - more difficult to find
![Page 28: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/28.jpg)
Some Issues:
Different libraries are willing to share different amounts of content
Some libraries are happy for full content to be shared; for others it is just snippets of images
![Page 29: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/29.jpg)
Last Thoughts and What Next?
The European Library will sustain access beyond project funding, but adding more content will require membership of TEL
How can we allow for transcription?
What do non-academic users want?
How do we create full-text APIs?
![Page 30: Digitised historic newspapers in Europe](https://reader034.fdocuments.in/reader034/viewer/2022052315/554fc597b4c9050e7d8b500a/html5/thumbnails/30.jpg)
Oh, the results here were all based on the first edition of the project survey.
If your library wants to contribute to later editions, see the links below by July 2013
http://www.europeana-newspapers.eu/tell-us-about-your-newspaper-digitisation-project/
http://www.surveymonkey.com/s/BQ28579