OAI-PMH

The Open Archives Initiative Protocol for Metadata Harvesting

Presenter: Knud Möller

Friday, 30.07.2004

30.07.2004 OAI-PMH - Knud Möller, DERI Galway

Content

• Basic idea behind OAI-PMH
• Architectural Overview
  – Repositories and Harvesters
  – Resources, Items and Records
• Internal Record Format
• Sets
• Selective Harvesting
• Response Format
• Command Overview

Basic idea behind OAI-PMH

• Provide a standard protocol for harvesting/querying metadata about any kind of resource: “What kind of resources can you provide, and what are their properties?”

• OAI-PMH is only a protocol; it still needs to be implemented.

• Some implementations exist:
  – Emblem Project Utrecht: http://emblems.let.uu.nl/emblems/html/techoai.html
  – Virginia Tech (VTOAI): http://www.dlib.vt.edu/projects/OAI/software/vtoai/vtoai.html

Architectural Overview

Repositories and Harvesters

[Diagram: one Repository in the centre, surrounded by four Harvesters. Each Harvester sends a Request to the Repository and receives a Response.]

• Harvesters issue OAI-PMH requests for metadata via HTTP.
• A Repository processes the OAI-PMH requests and has to implement the protocol.
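To make "requests via HTTP" concrete, here is a minimal sketch of how a harvester might assemble a request URL. It assumes the usual OAI-PMH convention of a repository base URL plus a `verb` argument and further key=value arguments. The base URL is the example one from the Response Format slide; `build_request_url` is a hypothetical helper name, and no network call is made.

```python
from urllib.parse import urlencode

# Example base URL taken from the Response Format slide; not a live endpoint.
BASE_URL = "http://an.oa.org/OAI-script"


def build_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Return the HTTP GET URL for a single OAI-PMH request.

    The verb always comes first, followed by any further arguments
    in the order they were passed.
    """
    params = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # A request without further arguments.
    print(build_request_url(BASE_URL, "Identify"))
    # A request that carries one additional argument.
    print(build_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

The repository answers each such URL with an XML document, so the harvester side needs little more than an HTTP client and an XML parser.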

Architectural Overview

Resources, Items and Records

• Resource: anything (a physical artifact, a digital resource, a concept, etc.); whatever the metadata is about.
• Item: representation of a resource in the repository. Can disseminate metadata in various formats, must always provide Dublin Core, and has a unique identifier, e.g. oai:arXiv.org:cs/0112017
• Record: XML-encoded byte stream of the actual metadata in one format, e.g. Record (oai_dc), Record (lom), Record (id3).
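The relationship between items and records can be pictured as a small data model. This is only an illustrative sketch: the class and method names (`Item`, `Record`, `provides_dublin_core`) are invented here and are not part of the protocol, which says nothing about how a repository stores its data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    """Metadata about an item, serialized as XML in one specific format."""
    metadata_prefix: str  # e.g. "oai_dc", "lom", "id3"
    xml: str


@dataclass
class Item:
    """Repository-side representation of a resource."""
    identifier: str  # unique identifier, e.g. "oai:arXiv.org:cs/0112017"
    records: Dict[str, Record] = field(default_factory=dict)

    def add_record(self, record: Record) -> None:
        # One record per metadata format; a newer one replaces the older.
        self.records[record.metadata_prefix] = record

    def formats(self) -> List[str]:
        """Metadata prefixes this item can be disseminated in."""
        return sorted(self.records)

    def provides_dublin_core(self) -> bool:
        # Dublin Core (prefix "oai_dc") is the one mandatory format.
        return "oai_dc" in self.records
```

One item can thus yield several records, one per metadata format, all sharing the item's identifier.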

Internal Record Format I

<record>
  <header> <!-- blabla --> </header>
  <metadata> <!-- blabla --> </metadata>
  <about> <!-- blabla --> </about>
</record>

<header>
  <identifier>oai:arXiv.org:cs/0112017</identifier>
  <datestamp>2002-02-28</datestamp>
  <setSpec>cs</setSpec>
  <setSpec>math</setSpec>
</header>
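A harvester typically reads the header first, since it carries the identifier, the datestamp and the set membership. The following sketch parses the header fragment from the slide with Python's standard library. It assumes the fragment as shown, without a namespace; in a full response the elements sit in the OAI-PMH namespace and the lookups would need it. `parse_header` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The header fragment from the slide, without namespace declarations.
HEADER_XML = """
<header>
  <identifier>oai:arXiv.org:cs/0112017</identifier>
  <datestamp>2002-02-28</datestamp>
  <setSpec>cs</setSpec>
  <setSpec>math</setSpec>
</header>
"""


def parse_header(xml_text: str) -> dict:
    """Extract identifier, datestamp and all setSpec values from a header."""
    header = ET.fromstring(xml_text.strip())
    return {
        "identifier": header.findtext("identifier"),
        "datestamp": header.findtext("datestamp"),
        # A record may belong to any number of sets, including none.
        "setSpecs": [element.text for element in header.findall("setSpec")],
    }


if __name__ == "__main__":
    print(parse_header(HEADER_XML))
```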

Internal Record Format II

<metadata>
  <oai_dc:dc
      xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
                          http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
    <dc:title>Using Structural Metadata to Localize Experience of Digital Content</dc:title>
    <dc:creator>Dushay, Naomi</dc:creator>
    <dc:subject>Digital Libraries</dc:subject>
    <dc:description>With the increasing [..bla..] to particular communities of users.</dc:description>
    <dc:date>2001-12-14</dc:date>
    <dc:type>e-print</dc:type>
    <dc:identifier>http://arXiv.org/abs/cs/0112017</dc:identifier>
  </oai_dc:dc>
</metadata>
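Because the Dublin Core elements are namespaced, reading them needs a little care. The sketch below collects all `dc:` elements from a shortened copy of the metadata part shown on the slide. `dublin_core_fields` is a hypothetical helper; every field is returned as a list because Dublin Core elements may repeat.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the slide's example (schema location and some fields omitted).
METADATA_XML = """
<metadata>
  <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Using Structural Metadata to Localize Experience of Digital Content</dc:title>
    <dc:creator>Dushay, Naomi</dc:creator>
    <dc:date>2001-12-14</dc:date>
    <dc:identifier> http://arXiv.org/abs/cs/0112017 </dc:identifier>
  </oai_dc:dc>
</metadata>
"""


def dublin_core_fields(xml_text: str) -> dict:
    """Map each Dublin Core element name to the list of its values."""
    root = ET.fromstring(xml_text.strip())
    container = root.find("oai_dc:dc", NAMESPACES)
    if container is None:
        raise ValueError("no oai_dc:dc element found")
    dc_prefix = "{" + NAMESPACES["dc"] + "}"
    fields: dict = {}
    for element in container:
        if element.tag.startswith(dc_prefix):
            name = element.tag[len(dc_prefix):]  # strip the namespace part
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


if __name__ == "__main__":
    for name, values in dublin_core_fields(METADATA_XML).items():
        print(name, values)
```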

Internal Record Format III

<about>
  <provenance
      xmlns="http://www.openarchives.org/OAI/2.0/provenance"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/provenance
                          http://www.openarchives.org/OAI/2.0/provenance.xsd">
    <originDescription harvestDate="2002-02-02T14:10:02Z" altered="true">
      <baseURL>http://the.oa.org</baseURL>
      <identifier>oai:r2.org:klik001</identifier>
      <datestamp>2002-01-01</datestamp>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </originDescription>
  </provenance>
</about>

Sets

• Items can be organized into sets.
• Sets can be organized either flat or hierarchically.

setName                                  setSpec
Institutions                             institution
  Oceanside University of Nebraska       institution:nebraska
  Valley View University of Florida      institution:florida
Subject                                  subject
  Existential Kenesiology                subject:kenesiology
  Quantum Psychology                     subject:quantum
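The colon in a setSpec expresses the hierarchy: institution:nebraska lies below institution. A repository could test set membership along these lines. This is a sketch with a hypothetical function name, not code from any existing implementation.

```python
def is_in_set(item_set_spec: str, requested_set_spec: str) -> bool:
    """Return True if item_set_spec equals requested_set_spec or lies below it.

    Comparison is done per colon-separated segment, so "institutional"
    is not mistaken for a member of "institution".
    """
    item_parts = item_set_spec.split(":")
    requested_parts = requested_set_spec.split(":")
    return item_parts[: len(requested_parts)] == requested_parts


if __name__ == "__main__":
    print(is_in_set("institution:nebraska", "institution"))
    print(is_in_set("subject:quantum", "institution"))
```

Comparing whole segments rather than raw string prefixes avoids false matches between sets whose names merely start alike.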

Selective Harvesting

• Harvesters can specify constraints on which items they are interested in.

• Regarding datestamps:
  – only items that were created, modified or deleted (deletion tracking is optional) within a certain time period

• Regarding sets:
  – only items that belong to a specific set (or any of its subsets)
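In a request, these constraints are expressed with the optional arguments `from`, `until` and `set`. A minimal sketch of a selective ListRecords URL follows; the base URL is the example one from the Response Format slide and the function name is hypothetical. Since `from` is a reserved word in Python, the arguments are collected in a dictionary instead of keyword arguments.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str,
    from_date: Optional[str] = None,
    until_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build a ListRecords URL, adding only the constraints that were given."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date    # lower bound on the datestamp
    if until_date is not None:
        params["until"] = until_date  # upper bound on the datestamp
    if set_spec is not None:
        params["set"] = set_spec      # restrict to one set and its subsets
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(
        list_records_url(
            "http://an.oa.org/OAI-script",
            "oai_dc",
            from_date="2002-01-01",
            until_date="2002-02-28",
            set_spec="institution:nebraska",
        )
    )
```

Note that `urlencode` percent-encodes the colon in the setSpec, so `institution:nebraska` appears as `institution%3Anebraska` in the URL.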

Response Format

<?xml version="1.0" encoding="UTF-8" ?>
<OAI-PMH
    xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
                        http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2002-05-01T19:20:30Z</responseDate>
  <request verb="GetRecord"
           identifier="oai:arXiv.org:hep-th/9901001"
           metadataPrefix="oai_dc">http://an.oa.org/OAI-script</request>
  <GetRecord>
    <record>...</record>
  </GetRecord>
</OAI-PMH>
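Every response has the same envelope: a response date, an echo of the request, and one element named after the verb. The sketch below reads the envelope of a shortened version of the example response. The XML declaration and schema location are left out for brevity, and `parse_envelope` is a hypothetical helper name. All elements live in the OAI-PMH namespace, so the lookups use a prefix mapping.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Shortened copy of the slide's example response.
RESPONSE_XML = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-05-01T19:20:30Z</responseDate>
  <request verb="GetRecord"
           identifier="oai:arXiv.org:hep-th/9901001"
           metadataPrefix="oai_dc">http://an.oa.org/OAI-script</request>
  <GetRecord>
    <record/>
  </GetRecord>
</OAI-PMH>
"""


def parse_envelope(xml_text: str) -> dict:
    """Return the response date, the base URL and the echoed request arguments."""
    root = ET.fromstring(xml_text.strip())
    request = root.find("oai:request", OAI_NS)
    if request is None:
        raise ValueError("response contains no request element")
    return {
        "responseDate": root.findtext("oai:responseDate", namespaces=OAI_NS),
        "baseURL": (request.text or "").strip(),
        # The attributes echo the arguments of the original request.
        "arguments": dict(request.attrib),
    }


if __name__ == "__main__":
    print(parse_envelope(RESPONSE_XML))
```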

Command Overview I

• GetRecord: get a specific record; must specify the item's identifier (a URI) and a metadata prefix

• Identify: retrieve information about a repository (name, protocol version, supports deletion, ...)

• ListRecords: get either all records or a subset, must specify metadata prefix

• ListIdentifiers: like ListRecords, but retrieves only headers

Command Overview II

• ListMetadataFormats: lists the available metadata formats of a repository

• ListSets: returns the set structure of a repository
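The six commands and their mandatory arguments can be summarized in a small lookup table, which a harvester might use to check a request before sending it. This is a sketch covering only the requirements named in this overview; optional arguments and other details of the specification are left out, and the helper name is hypothetical.

```python
# Mandatory arguments per verb, as stated in the command overview.
REQUIRED_ARGUMENTS = {
    "GetRecord": {"identifier", "metadataPrefix"},
    "Identify": set(),
    "ListRecords": {"metadataPrefix"},
    "ListIdentifiers": {"metadataPrefix"},
    "ListMetadataFormats": set(),
    "ListSets": set(),
}


def missing_arguments(verb: str, arguments: dict) -> list:
    """Return the mandatory arguments that are absent, in alphabetical order."""
    if verb not in REQUIRED_ARGUMENTS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    return sorted(REQUIRED_ARGUMENTS[verb] - set(arguments))


if __name__ == "__main__":
    print(missing_arguments("GetRecord", {"identifier": "oai:arXiv.org:cs/0112017"}))
    print(missing_arguments("Identify", {}))
```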

References

• OAI-PMH specification: http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm

Thanks and goodbye!