European Metadata Initiatives: The METAe Metadata Engine

Simon Tanner

Higher Education Digitisation Service

http://heds.herts.ac.uk

Page 2: Overview

Introduction to HEDS.

Current metadata contexts in Europe.

METAe - The Metadata Engine Project

• Project summary
• Project objectives
• Description of work
• Benefits

Page 3: Introduction to HEDS

HEDS provides advice, consultancy and a complete production service for digitization and digital library development.

Recent projects include:

• Rekeying and tagging 35 million characters in Anthropology
• 17th century Trade Directories
• British newsreel scripts from the 1940s
• Transparencies - artwork, manuscripts, stained glass
• Photographic prints and postcards - local history collections
• Microfilm: manuscripts, political pamphlets
• Consultancy: The British Library, Oxford University, New Opportunities Fund applicants.

Page 4: Current metadata contexts

SCHEMAS: Forum for Metadata Schema Implementors

http://www.schemas-forum.org/

“SCHEMAS will inform schema implementers about the status and proper use of new and emerging metadata standards. The project will support development of good-practice guidelines for the use of standards in local implementations. It will investigate how metadata registries can support these aims.”

Page 5: Current metadata contexts

RSLP Collection Description

http://www.ukoln.ac.uk/metadata/rslp/

“Based on a thorough modelling of collections and their catalogues, the project will develop a collection description metadata schema and associated syntax using the Resource Description Framework (RDF). We will develop a simple Web-based tool in order that projects can describe their collections and prototype a search service.”

Page 6: Current metadata contexts

CEDARS: CURL Exemplars for Digital Archives

http://www.curl.ac.uk/projects/cedars.html

“There is a pressing need for a strategy for digital preservation… the CEDARS project aims to address the strategic, methodological and practical issues and will provide guidance for libraries in best practice for digital preservation.”

CEDARS are identifying the descriptive metadata elements that should be gathered to maximize the continued accessibility of digital resources.

Page 7

Presentation of an EU project within the 5th Framework Programme

http://meta-e.uibk.ac.at/

Page 8: Project summary

November 2000

http://meta-e.uibk.ac.at/ Simon Tanner, HEDS

METAe THE METADATA ENGINE PROJECT

To make the digital conversion of printed material

– more reliable in terms of digital preservation
– more cost-effective in terms of automation
– more attractive in terms of user-friendliness and accessibility.

METAe will develop a software package to extensively automate and improve the generation of metadata.

Page 9: Project summary

The goals will be achieved by applying new technologies for character, layout and document recognition.

The METAe package will convert the captured information into XML documents.

XML files serve as a basis for various applications, such as: new XML search engines, navigation tools, electronic books, audio books, or the automated production of HTML, XHTML, PDF or PS files.
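
As a rough illustration of the conversion idea on this slide, the sketch below wraps recognised text elements in an XML document and then derives an HTML rendering from it. The element names (`document`, `title`, `author`, `paragraph`), the placeholder values and both helper functions are assumptions made for this example only; the slides do not specify the schema or tooling METAe uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical result of the recognition steps: (element type, text) pairs.
# The values are placeholders, not data from the project.
recognised = [
    ("title", "Example Title"),
    ("author", "Example Author"),
    ("paragraph", "First paragraph of the scanned book."),
]


def build_document(elements):
    """Wrap recognised elements in a simple, made-up XML structure."""
    root = ET.Element("document")
    for kind, text in elements:
        child = ET.SubElement(root, kind)
        child.text = text
    return root


def to_html(root):
    """Derive a minimal HTML rendering from the XML document."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    # Assumed mapping from the made-up XML tags to HTML tags.
    tag_map = {"title": "h1", "author": "h2", "paragraph": "p"}
    for child in root:
        out = ET.SubElement(body, tag_map.get(child.tag, "div"))
        out.text = child.text
    return ET.tostring(html, encoding="unicode")


doc = build_document(recognised)
xml_text = ET.tostring(doc, encoding="unicode")
html_text = to_html(doc)
print(xml_text)
print(html_text)
```

The same XML tree could feed other outputs (for example a PDF or an e-book generator) by swapping `to_html` for another function, which is the point of keeping the XML as the single source.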

Page 10: Participants: co-ordinator & technical partners

Co-ordinator: Leopold-Franzens-Universität, Innsbruck (A)

Institut für Angewandte Informatik, University of Linz (A)

Mitcom Neue Medien GmbH (G)

CCS Compact Computer Systeme (G)

Dipartimento di Sistemi e Informatica, University of Florence (I)

Scuola Normale Superiore, Centro di Ricerche Informatiche per i Beni Culturali (I)

Page 11: Participants: library & research partners

Universidad de Alicante (S)

Friedrich-Ebert-Stiftung (G)

Cornell University Library (USA)

Bibliothèque nationale de France (F)

The National Library of Norway (N)

Biblioteca Statale A. Baldini (I)

Karl-Franzens-Universität Graz (A)

Higher Education Digitisation Service HEDS (UK)

Page 12: Project objectives

Introduction of layout and document analysis as a key technology in future digitisation software.

Development of capturing and conversion tools for the automated recording and generation of administrative and descriptive metadata.

Development of an omnifont OCR-engine specialised in processing old European typefaces of the 19th century („Fraktur“, Gothic fonts).

Page 13: Project objectives

Evaluation of digital preservation standards (i.e. XML, EAD, TEI or ISO 12083)

Development of an XML search engine for tagged full texts and images.
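
To give a feel for what searching tagged full text can mean, here is a minimal sketch that restricts a keyword search to particular elements of an XML document. The sample markup (`book`, `chapter`, `heading`, `paragraph`, `caption`) and the `search` function are hypothetical and chosen for this example; they do not represent the METAe search engine or any of the standards listed above.

```python
import xml.etree.ElementTree as ET

# Made-up sample of a tagged full text.
SAMPLE = """
<book>
  <chapter n="1">
    <heading>Introduction</heading>
    <paragraph>Digitisation of printed books.</paragraph>
  </chapter>
  <chapter n="2">
    <heading>Methods</heading>
    <paragraph>Layout analysis of printed pages.</paragraph>
    <caption>Figure of a printed page.</caption>
  </chapter>
</book>
"""


def search(root, term, tag=None):
    """Return (chapter number, tag, text) for elements whose text contains term.

    If tag is given, only elements with that tag are considered.
    """
    hits = []
    for chapter in root.iter("chapter"):
        for el in chapter:
            if tag is not None and el.tag != tag:
                continue
            if el.text and term.lower() in el.text.lower():
                hits.append((chapter.get("n"), el.tag, el.text))
    return hits


root = ET.fromstring(SAMPLE)
all_hits = search(root, "printed")
caption_hits = search(root, "printed", tag="caption")
print(all_hits)
print(caption_hits)
```

Because the text is tagged, the same query can be narrowed to captions, headings or any other structural element, which a plain full-text index cannot do.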

Page 14: Description of work

1. Input module for scanning and importing existing metadata

2. OCR-engine specialised in typefaces of the 19th century

3. Document analysis module

4. Page layout analysis module

5. Rules and controlled vocabulary for automated recognition process

6. Conversion module assembling an XML document containing all recognised metadata

7. Export module for the XML enriched document and the scanned image
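
The seven work items above can be pictured as a chain of processing stages. The following sketch is only a schematic: the `Job` structure, the stage names and the assumption that the items run strictly in their numbered order are all invented for illustration, and each stage is a placeholder that merely records that it ran.

```python
from dataclasses import dataclass, field
from functools import reduce


@dataclass
class Job:
    """State passed from module to module (hypothetical structure)."""
    pages: list                                   # scanned pages, here just file names
    metadata: dict = field(default_factory=dict)  # metadata gathered so far
    steps: list = field(default_factory=list)     # names of the stages that have run


def stage(name):
    """Create a placeholder stage that only records that it ran."""
    def run(job):
        job.steps.append(name)
        return job
    return run


# Assumed order: the numbering of the work items on the slide.
PIPELINE = [
    stage("input"),
    stage("ocr"),
    stage("document analysis"),
    stage("page layout analysis"),
    stage("rules and controlled vocabulary"),
    stage("xml conversion"),
    stage("export"),
]


def process(job):
    """Feed the job through every stage in turn."""
    return reduce(lambda current, run: run(current), PIPELINE, job)


result = process(Job(pages=["page-0001.tif"]))
print(result.steps)
```

In a real system each placeholder would be replaced by the corresponding module, and the rules and controlled vocabulary might be a shared resource consulted by the analysis stages rather than a step of its own.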

Page 16: Benefits

1. Reduce the need for manual post-processing of scanned content.

2. Produce a rich output, with metadata on all levels: administrative, structural and format metadata.

3. Offer new possibilities for successful long-term preservation.

4. New ways to enhance access, re-use and multi-versioning.

5. Selective and distributed correction of OCR'd content.

6. Benefits for the visually disabled and also in scenarios of functional disability.

Page 17

European Metadata Initiatives: The METAe Metadata Engine

Simon Tanner

Higher Education Digitisation Service

Email: [email protected]

http://heds.herts.ac.uk