the OAI Protocol for Metadata Harvesting: an update

Herbert Van de Sompel, Los Alamos National Laboratory – Research Library

Transcript of the OAI Protocol for Metadata Harvesting: an update

  • the OAI Protocol for Metadata Harvesting: an update. Herbert Van de Sompel, Los Alamos National Laboratory, Research Library

  • The Open Archives Initiative has been set up to create a forum to discuss and solve matters of interoperability between preprint solutions, as a way to promote their global acceptance.

    Paul Ginsparg, Rick Luce & Herbert Van de Sompel

  • Luce * Van de Sompel * Ginsparg

  • 2 core motivations: as a systems librarian, change the system; as a researcher, find (technical) ways to facilitate the change

  • as a systems librarian: optimizing the output, while the input is far from optimal

  • eprint systems: xxx e-print archive (Physics - 1991 - Los Alamos - Ginsparg); RePEc (Economics - Surrey U - Krichel); NCSTRL (Computer Science - Cornell U - Lagoze); NDLTD (Theses - Virginia Tech - Fox); CogPrints (Cognitive Sciences - Southampton U - Harnad)

  • as a researcher: eprints are an attractive building block in the ongoing transformation of scholarly communication, but interoperability could increase the impact of e-prints: interoperability amongst e-print solutions; with building blocks that implement other functions of scholarly communication; with the established communication system

  • UPS Prototype: eprints discovery, 1999 (Van de Sompel, Krichel, Nelson). Results: insights regarding how un-interoperable the systems were; a cross-repository searching and linking service; recommendations to the Santa Fe meeting: data provider / service provider model, metadata harvesting, simplicity

  • evolution towards OAI-PMH v.2.0: Santa Fe Convention [02/2000]; OAI-PMH 1.0 [01/2001]; OAI-PMH 2.0 [06/2002]

  • Santa Fe Convention; OAI-PMH v.1.0/1.1; OAI-PMH v.2.0

  • OAI-PMH model: service provider; data provider; 6 OAI-PMH requests

  • OAI-PMH model: service provider and data provider. Supporting protocol requests: Identify, ListMetadataFormats, ListSets. Harvesting protocol requests: ListRecords, ListIdentifiers, GetRecord

  • OAI-PMH model: service provider; data provider; Datestamp; Identifier; Set; Records

  • federated services

  • metadata harvesting via OAI-PMH: metadata; FTXT

  • metadata harvesting via OAI-PMH: metadata

  • issue solved? no, just a tiny part of the technical challenges to support discovery; many more technical issues; even more non-technical issues

  • issue solved? technical: registration, awareness, archiving, certification, rewarding

  • issue solved? non-technical: I am happy to leave those to you, but even for non-technological issues, part of the answer might be found in applying technology

  • indicators of adoption of OAI-PMH: tools; structural support; service providers; data providers

  • data providers: 49 registered repositories [11/2001]; 65 registered repositories [03/2002]; 5+ million records; many unregistered repositories

  • service providers. Arc: cross-searching of registered repositories [Old Dominion U] [http://arc.cs.odu.edu]; OLAC: cross-searching of Language Archive Community repositories [http://www.language-archives.org/index.html]

  • service providers. Scirus: scientific search engine [Elsevier] [http://www.scirus.com]; my.OAI: user-tailorable cross-searching of registered repositories [FS Consulting, Inc.] [http://www.myoai.com]; growing interest from web search engines

  • OAI-PMH tools. Repository Explorer: interactive exploration of repositories [Virginia Tech] [http://www.purl.org/NET/oai_explorer]; eprints.org: generic OAI-PMH compliant repository software [U of Southampton] [http://www.eprints.org]; ALCME: repository and harvester software [OCLC] [http://alcme.oclc.org/index.html]

  • OAI-PMH flies: structural support. Metadata Harvesting Initiative of the Mellon Foundation; NSDL (NSF funded); UK FAIR call for proposals to support disclosure of institutional assets (papers, learning materials, etc.); Institute of Museum and Library Services; several EC projects exploring/supporting usage of OAI-PMH: TEL, Leaf, Cyclades, OA Forum

  • OAI-PMH flies: and also. Australian Museums Online & CIMI: OAI conference; NIMH white paper on data archiving for Animal Cognition Research; Library of Congress; National Library of Canada; OCLC thesis database; Illinois State Library Catalogue

  • future: OAI-PMH; communities; adoption

  • the OAI-PMH: release of OAI-PMH v.2.0 [06/2002]; no backwards compatibility with v.1.0/1.1; stable migration process for registered repos; formal standardization?; SOAP version ~ web services framework [SOAP, WSDL, UDDI]?

  • communities: proliferation of community-specific add-ons for: collection & set level metadata; expressive metadata formats (e.g. qualified DC XML Schema); shared set-structures; machine readable rights (about the metadata)

  • adoption: evolution from talking about OAI-PMH, to talking about projects that use OAI-PMH, to talking about projects and failing to mention they use OAI-PMH => OAI-PMH becomes part of the infrastructure

  • I just wanted to report what I consider an OAI success. I discovered that RLG had harvested records for two of the American Memory collections I had made available and integrated them into their Cultural Materials Initiative service without the need for a single e-mail or phone call. They reported that it was working very well for them.

    [Caroline Arms, Library of Congress]

  • http://www.openarchives.org

    [email protected]

  • the OAI: not really an organization. Executive: Carl Lagoze & Herbert Van de Sompel; 2000-2002 funding from CNI and DLF; Steering Committee; Technical Committee: protocol revision & stabilization; Alpha testers

  • OAI-tech. US representatives: Thomas Krichel (Long Island U), Jeff Young (OCLC), Tim Cole (U of Illinois at Urbana-Champaign), Hussein Suleman (Virginia Tech), Simeon Warner (Cornell U), Michael Nelson (NASA), Caroline Arms (LoC), Muhammad Zubair (Old Dominion U), Steven Bird (U Penn.). European representatives: Andy Powell (Bath U. & UKOLN), Mogens Sandfaer (DTV), Thomas Baron (CERN), Les Carr (U of Southampton)

  • OAI-PMH 2.0 alpha testers (1/2): The British Library; Cornell U. -- NSDL project & e-print arXiv; Ex Libris; FS Consulting Inc -- harvester for my.OAI; Humboldt-Universität zu Berlin; InQuirion Pty Ltd, RMIT University; Library of Congress; NASA; OCLC

  • OAI-PMH 2.0 alpha testers (2/2): Old Dominion U. -- ARC, DP9; U. of Illinois at Urbana-Champaign; U. of Southampton -- OAIA, CiteBase, eprints.org; UCLA, Johns Hopkins U., Indiana U., NYU -- sheet music collection; UKOLN, U. of Bath -- RDN; Virginia Tech -- repository explorer

    That great things do not always start out entirely seriously is illustrated by this photo. It was published in a US journal after the glasses and bottles had been edited out.

    The information chain connects authors with readers. As a systems librarian: optimize access to the collection. I began to feel ridiculous, because the input was far from optimal. So there you are: optimizing the output of a system whose input is far from optimal. I wanted to change that, and felt that the priority in library automation (and in libraries) had to shift from systems for access to information to systems that could achieve a less obstructed flow of information. Something therefore had to happen at the source: reposition libraries. Such systems had begun to develop outside the libraries. The Los Alamos arXiv was the best-known example, but there were others, such as NCSTRL, RePEc, NDLTD, CogPrints, etc.

    The terms data provider and service provider may be somewhat misleading; the reason they are there is that we do indeed think of the harvester as a system that wants to provide services for data collected from multiple repositories. Still, it is perfectly imaginable that the protocol would only be used as a means to sync metadata between 2 systems; as such, no real notion of service provision would be involved.
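The OAI-PMH model slides name six protocol requests and three record attributes (Datestamp, Identifier, Set), and the closing note mentions using the protocol simply to sync metadata between two systems. The sketch below is one possible way to picture that from the harvester side, not a reference implementation. It makes several assumptions: the base URL `http://repository.example.org/oai` and the sample response are invented for illustration, the helper names `build_request`, `parse_headers` and `next_from` are hypothetical, and the XML layout follows OAI-PMH 2.0 with day-granularity datestamps. No network access takes place.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

SUPPORTING_VERBS = ("Identify", "ListMetadataFormats", "ListSets")
HARVESTING_VERBS = ("ListRecords", "ListIdentifiers", "GetRecord")


def build_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in SUPPORTING_VERBS + HARVESTING_VERBS:
        raise ValueError("unknown OAI-PMH verb: " + verb)
    params = {"verb": verb}
    for name, value in arguments.items():
        if value is not None:
            # 'from' is a Python keyword, so callers pass it as 'from_'.
            params[name.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


# Hand-written sample response; the data is illustrative only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-06-14T10:00:00Z</responseDate>
  <request verb="ListIdentifiers">http://repository.example.org/oai</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:repository.example.org:1</identifier>
      <datestamp>2002-05-01</datestamp>
      <setSpec>physics</setSpec>
    </header>
    <header status="deleted">
      <identifier>oai:repository.example.org:2</identifier>
      <datestamp>2002-06-10</datestamp>
      <setSpec>math</setSpec>
    </header>
  </ListIdentifiers>
</OAI-PMH>
"""


def parse_headers(xml_text):
    """Extract identifier, datestamp and set membership from each header."""
    root = ET.fromstring(xml_text)
    headers = []
    for header in root.iter("{http://www.openarchives.org/OAI/2.0/}header"):
        headers.append({
            "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
            "sets": [s.text for s in header.findall("oai:setSpec", OAI_NS)],
            "deleted": header.get("status") == "deleted",
        })
    return headers


def next_from(headers, previous_from=None):
    """Latest datestamp seen so far, usable as 'from' in the next harvest.

    Plain string comparison is enough only because all datestamps are
    assumed to share the YYYY-MM-DD form.
    """
    stamps = [h["datestamp"] for h in headers]
    if previous_from is not None:
        stamps.append(previous_from)
    return max(stamps) if stamps else None


if __name__ == "__main__":
    base = "http://repository.example.org/oai"  # invented base URL
    print(build_request(base, "Identify"))
    print(build_request(base, "ListIdentifiers",
                        metadataPrefix="oai_dc", from_="2002-05-01"))
    harvested = parse_headers(SAMPLE_RESPONSE)
    for h in harvested:
        print(h["identifier"], h["datestamp"], h["sets"], h["deleted"])
    print("next harvest from:", next_from(harvested))
```

Since the `from` argument is inclusive in OAI-PMH 2.0, a follow-up harvest that starts at the last datestamp seen can return records the harvester already holds. A real sync job would therefore key its local store on the identifier and overwrite, rather than append.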