Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library


Page 1: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library
Page 2: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

europeana cloud

Ingestion and Aggregation Workshop

Chiara Latronico, Operations Officer, The European Library

Europeana Cloud Kick-Off Meeting, Den Haag, 04-05 March 2013

Page 3: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Agenda

• The European Library
• Datasets life-cycle and workflows
• Content ingestion questionnaire
• Ingestion tools
• Aggregation and delivery to Europeana
• Europeana Data Model (EDM)
• Full-text index
• Questions

Page 4: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

• 48 National Libraries
• ~ 40 Research and University Libraries
• ~ 115 Million Bibliographic Records
• > 16 Million Digital Objects
• > 25 Million Pages of Full-text

The European Library

Page 5: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library
Page 6: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library
Page 7: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

The European Library

• Data access point for researchers
• Combination of bibliographic records and metadata for digital objects
• Aggregator for Europeana Cloud

Page 8: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

The European Library: Aggregation into Europeana

Page 9: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

The European Library: Ingestion Workflow

• Content ingestion questionnaire
• Scheduling of ingestion
• Datasets ready for harvesting
• Create case in CRM: case # to provider
• Harvesting metadata
• Enhance metadata (VIAF, Geonames, MACS, ...)
• Indexing in acceptance portal
• E-mail to provider to accept dataset
• Live index = live portal
• Delivery to Europeana
• Enhancing and publishing in Europeana
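The workflow steps on this slide can be read as an ordered sequence of dataset states. The following is a minimal sketch of that reading, assuming a strictly linear progression; the `Stage` names and the `advance` helper are hypothetical and are not taken from The European Library's systems.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical stage names, one per step listed on the slide."""
    QUESTIONNAIRE = 1
    SCHEDULING = 2
    READY_FOR_HARVESTING = 3
    CRM_CASE_CREATED = 4
    HARVESTED = 5
    ENHANCED = 6
    IN_ACCEPTANCE_PORTAL = 7
    AWAITING_PROVIDER_ACCEPTANCE = 8
    LIVE = 9
    DELIVERED_TO_EUROPEANA = 10
    PUBLISHED_IN_EUROPEANA = 11


def advance(stage: Stage) -> Stage:
    """Move a dataset to the next stage (assumes no skipping or rollback)."""
    if stage is Stage.PUBLISHED_IN_EUROPEANA:
        raise ValueError("dataset is already published in Europeana")
    return Stage(stage + 1)


if __name__ == "__main__":
    stage = Stage.QUESTIONNAIRE
    while stage is not Stage.PUBLISHED_IN_EUROPEANA:
        stage = advance(stage)
        print(stage.name)
```

In practice a dataset may be sent back to an earlier step, for example when a provider rejects it in the acceptance portal; the sketch leaves that out.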

Page 10: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Content Ingestion Questionnaire: Web-form

Personal Information (about the person filling in the web-form)
• Name & surname
• Job title
• E-mail address
• Skype address

Information about Organization
• Organization name
• Country
• Website
• Type of institution

Page 11: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Content Ingestion Questionnaire: Harvesting Details

Which protocol will be used to transfer data?
• OAI-PMH
• File
• Z39.50
• FTP
• HTTP

Harvesting time and dates preferences

How often will dataset(s) need to be updated?
• Weekly
• Monthly
• Quarterly
• Annually
• On demand
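For providers choosing OAI-PMH, a harvest is a series of `ListRecords` requests chained by resumption tokens. The sketch below shows how such a request URL is formed and how the token is read from a response. The base URL and the sample response are made up for illustration, and this is not The European Library's harvester.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # Per the protocol, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumption token of a response, or None on the last page."""
    root = ET.fromstring(response_xml)
    element = root.find(f".//{OAI_NS}resumptionToken")
    if element is None or not (element.text or "").strip():
        return None
    return element.text.strip()


# Hypothetical, heavily trimmed response used only to exercise next_token().
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    base = "http://example.org/oai"  # placeholder endpoint
    print(list_records_url(base, set_spec="maps"))
    print(list_records_url(base, resumption_token=next_token(SAMPLE_RESPONSE)))
```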

Page 12: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Content Ingestion Questionnaire: Information about dataset(s)

• Number of dataset(s) to be ingested
• Number of records to be expected
• Number of digital objects to be expected
• Contact person(s) per dataset(s)
  • Editorial: for collection description
  • Technical: for collection ingestion

Page 13: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Content Ingestion Questionnaire: Information about Metadata

Metadata standard(s) available to describe objects
• Marc21
• MarcXchange
• Unimarc
• ESE
• EDM
• METS
• MODS
• OAI_DC
• TEI

Number of formats available per dataset

Page 14: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Content Ingestion Questionnaire: Information about Metadata

Are the metadata ready?
• If yes, for which dataset(s)?
• If not, when will they be ready?

Type of digital objects per dataset(s)
• TEXT
• IMAGE
• AUDIO
• VIDEO

Page 15: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Content Ingestion Questionnaire: Information about Content

Will content be delivered in addition to metadata?
• If yes, for which dataset(s)?
• If yes, in which format(s)?

Has the content been digitized?
• If yes, for which dataset(s)?
• If not, when will the content be available?

Page 16: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Content Ingestion Questionnaire: Information about Authority

Will authority files be delivered?
• If yes, for which dataset(s)?
• If yes, in which format(s)?

Are controlled vocabularies utilized?
• If yes, which kind?
  • Classification
  • Thesauri
  • Subject Headings
  • Other
• If yes, for which dataset(s)?

Will full-text be delivered?
• If yes, for which dataset(s)?
• If yes, in which format(s)?

Page 17: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Content Ingestion Questionnaire: Submit

If you make a mistake, we can fix it!

After [email protected]

Page 18: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

SugarCRM: management of aggregation tasks

SugarCRM: Customer Relationship Management tool

Page 19: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

SugarCRM: generation of automated reports

SugarCRM: Customer Relationship Management tool

Page 20: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

In SugarCRM
• Organizations, contacts, datasets, projects and more

SugarCRM is utilized for
• Collections control
• Ingestion plans
• Automated reports
• Cases per specific datasets

SugarCRM: Customer Relationship Management tool

Page 21: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

The European Library: System architecture

Page 22: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

UIM: Unified Ingestion Manager

Page 23: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Dataset in Acceptance Portal

Acceptance Portal
• Test environment
• Providers to validate data

Reports via UIM workflows
• Link Validation
• Field Validation
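The two report types named on this slide can be illustrated with a small check over harvested records. This is only a sketch under stated assumptions: the record layout, the `REQUIRED_FIELDS` tuple, and the function names are hypothetical and do not reflect the actual UIM workflows, and the link check is purely syntactic (no network request is made).

```python
from urllib.parse import urlparse

# Hypothetical set of mandatory fields; the real rules are not given on the slide.
REQUIRED_FIELDS = ("title", "identifier")


def missing_fields(record):
    """Field validation: names of required fields that are absent or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def is_well_formed_link(url):
    """Link validation (syntax only): http(s) scheme and a host are present."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def validation_report(records):
    """Summarise missing fields and malformed links across a dataset."""
    report = {"records": 0, "missing_fields": {}, "bad_links": []}
    for record in records:
        report["records"] += 1
        for field in missing_fields(record):
            counts = report["missing_fields"]
            counts[field] = counts.get(field, 0) + 1
        for url in record.get("links", []):
            if not is_well_formed_link(url):
                report["bad_links"].append(url)
    return report


if __name__ == "__main__":
    sample = [
        {"title": "Atlas", "identifier": "rec-1",
         "links": ["http://www.theeuropeanlibrary.org/"]},
        {"title": " ", "links": ["htp:/broken"]},
    ]
    print(validation_report(sample))
```

A production link check would also request each URL and record the HTTP status; that step is omitted here so the sketch runs offline.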

Page 24: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Dataset in Acceptance Portal

Page 25: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

When a Dataset is in the Acceptance Portal

• Create an account on http://www.theeuropeanlibrary.org/
• Use your credentials to log in to the acceptance portal: http://www.tel.ulcc.ac.uk/acceptance/
• Validate data using tabs for Default XML

Page 26: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Dataset in Acceptance Portal

Page 27: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Dataset(s) in Live Index and Portal

When a provider accepts dataset(s)
• E-mail
• Dataset(s) ready for live index
• Dataset(s) ready for Europeana

Dataset(s) indexed into the live portal
• It takes ~ 24 hrs for dataset(s) to be searchable in the live portal

Page 28: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Dataset(s) Live in Europeana

When a provider accepts dataset(s)

Dataset(s) delivered to Europeana
• Europeana publishes live once a month
• Delivery deadline ~ 21st of each month
• Dataset(s) searchable in Europeana by the following month

Dataset(s) published live in Europeana
• E-mail to provider with link to dataset(s) in the Europeana portal
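The monthly publication cycle lends itself to a small date calculation. The sketch below assumes a hard cutoff on the 21st and assumes that a delivery after the cutoff rolls over to the next cycle; the slide only says "~ 21", so both points are assumptions, and the function name is hypothetical.

```python
from datetime import date

# "~ 21" on the slide; treating it as an exact cutoff is an assumption.
DELIVERY_DEADLINE_DAY = 21


def expected_europeana_month(delivered: date) -> tuple[int, int]:
    """Return (year, month) in which a delivered dataset should be searchable.

    On or before the deadline: the following month.
    After the deadline: assumed to wait for the next cycle (one month later).
    """
    months_ahead = 1 if delivered.day <= DELIVERY_DEADLINE_DAY else 2
    index = delivered.year * 12 + (delivered.month - 1) + months_ahead
    return index // 12, index % 12 + 1


if __name__ == "__main__":
    print(expected_europeana_month(date(2013, 3, 5)))
    print(expected_europeana_month(date(2013, 12, 25)))
```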

Page 29: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

EDM – Europeana Data Model

Europeana Libraries project
• EDM for library data

Europeana Cloud project
• EDM for museum and archive metadata & content

Delivery in EDM to Europeana
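To make "delivery in EDM" concrete, the sketch below builds a very small EDM record as RDF/XML: one `edm:ProvidedCHO` describing the object and one `ore:Aggregation` linking it to the provider's landing page. The class and property names come from the EDM specification; the URIs, the `build_edm` helper, and the choice of fields are illustrative assumptions, not The European Library's actual mapping.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "edm": "http://www.europeana.eu/schemas/edm/",
    "ore": "http://www.openarchives.org/ore/terms/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, name):
    """Qualified tag or attribute name in ElementTree's {uri}name form."""
    return f"{{{NS[prefix]}}}{name}"


def build_edm(cho_uri, title, landing_page, data_provider, rights_uri):
    """Serialise a minimal EDM record (hypothetical helper, TEXT objects only)."""
    root = ET.Element(q("rdf", "RDF"))

    cho = ET.SubElement(root, q("edm", "ProvidedCHO"), {q("rdf", "about"): cho_uri})
    ET.SubElement(cho, q("dc", "title")).text = title
    ET.SubElement(cho, q("edm", "type")).text = "TEXT"

    agg = ET.SubElement(root, q("ore", "Aggregation"),
                        {q("rdf", "about"): cho_uri + "#aggregation"})
    ET.SubElement(agg, q("edm", "aggregatedCHO"), {q("rdf", "resource"): cho_uri})
    ET.SubElement(agg, q("edm", "isShownAt"), {q("rdf", "resource"): landing_page})
    ET.SubElement(agg, q("edm", "dataProvider")).text = data_provider
    ET.SubElement(agg, q("edm", "provider")).text = "The European Library"
    ET.SubElement(agg, q("edm", "rights"), {q("rdf", "resource"): rights_uri})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # All URIs below are placeholders.
    print(build_edm("http://example.org/item/1", "Sample title",
                    "http://example.org/view/1", "Example Library",
                    "http://example.org/rights"))
```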

Page 30: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

EDM – Europeana Data Model

Page 31: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Europeana Preview

Page 32: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Europeana Preview

Page 33: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Full-Text (OCR)

Continue Full-text indexing

Page 34: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Full-Text (OCR)

Full-text & OCR
• URLs to OCR texts in the metadata
• Extraction of full-text
• Full-text indexing
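The three steps on this slide (OCR URLs in the metadata, text extraction, indexing) can be sketched as a tiny inverted index. Everything here is an assumption made for illustration: the record layout with an `ocr_urls` key, the injected `fetch` function standing in for an HTTP download, and the crude tag stripping. It is not the indexing stack used by The European Library.

```python
import re
from collections import defaultdict


def extract_text(raw_ocr):
    """Crude extraction: drop anything that looks like an XML/HTML tag."""
    return re.sub(r"<[^>]+>", " ", raw_ocr)


def build_index(records, fetch):
    """Map each lower-cased word to the ids of the records whose OCR contains it.

    `fetch` is any callable that returns the OCR document for a URL; it is
    injected so the sketch can run without network access.
    """
    index = defaultdict(set)
    for record in records:
        for url in record.get("ocr_urls", []):
            text = extract_text(fetch(url))
            for token in re.findall(r"\w+", text.lower()):
                index[token].add(record["id"])
    return index


if __name__ == "__main__":
    # Placeholder OCR store standing in for remote files.
    fake_store = {
        "http://example.org/ocr/1.xml": "<page>Den Haag newspaper</page>",
        "http://example.org/ocr/2.xml": "<page>Newspaper archive</page>",
    }
    records = [
        {"id": "rec-1", "ocr_urls": ["http://example.org/ocr/1.xml"]},
        {"id": "rec-2", "ocr_urls": ["http://example.org/ocr/2.xml"]},
    ]
    index = build_index(records, fake_store.__getitem__)
    print(sorted(index["newspaper"]))
```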

Page 35: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Full-Text (OCR)

Continue the work on full-text
• Europeana Newspapers
• Europeana Cloud

Page 36: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Summary

What we would like to have from you
• Richest possible metadata
• Content
• Full-text
• Authority files or ontologies

Page 37: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

Thank you! Questions?

For any questions or feedback: [email protected]

Chiara [email protected]

Page 38: Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library

www.theeuropeanlibrary.org