Core Integration Web Services

Dean Krafft, Cornell University
dean@cs.cornell.edu

CI Infrastructure: Past

Massive, monolithic infrastructure

User access through NSDL.org

MR (Metadata Repository) input/output: OAI-PMH – heavy

Search: SDLIP – Java package – heavy

Archive: only through NSDL.org search

NSDL.org portal framework: uPortal – large, complex Java system

CI Infrastructure: Future

Open, service-friendly infrastructure

User access: multiple portals, browser extensions, standard web search

MR I/O: SOAP/WSDL, REST, RSS

Search: SOAP/WSDL, REST

NSDL.org: PHP reimplementation – flexible, indexable, reusable

CI Philosophy

Open, lightweight mechanisms for access and contribution

Play well with the Internet – don’t be a silo

Synergize with existing web tools and infrastructure, don’t compete

Enable many forms of access and contribution – including ones we haven’t thought of yet

Accessing the MR

Given the OAI ID of a record ('xxx' below), REST access is available now:
http://services.nsdl.org:8080/nsdloai/OAI?verb=GetRecord&identifier='xxx'&metadataPrefix=oai_dc

What other queries should we support?
  Search engine style – but MR is structured, not full text
  SQL query – Exposes database structure
  XQuery – Dependent on full XML schema, expensive to implement
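
As a rough illustration of the GetRecord request shown above, the following Python sketch builds the same kind of URL from an OAI identifier. The function name `get_record_url` is hypothetical, and the identifier is only a sample value; the code constructs the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Base URL taken from the slide; whether it still responds is not assumed here.
BASE_URL = "http://services.nsdl.org:8080/nsdloai/OAI"


def get_record_url(identifier, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build an OAI-PMH GetRecord URL for one record (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Sample identifier used only for illustration.
url = get_record_url("oai:nsdl.org:pri:00109")
print(url)
```

Percent-encoding the identifier keeps the colons in the OAI ID from being misread as URL syntax.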

Strawman MR Access Proposal

SOAP/WSDL and REST access

FetchElementsLike(“dc:title”, “frog”) – returns IDs where title contains “frog”

FetchElementsStarting(“dc:author”, “bill”) – returns sorted list of IDs

FetchElements(“oai:nsdl.org:pri:00010”, “dc:title”, “dc:author”) – returns list of elements where OAI ID matches
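
To make the strawman operations concrete, here is a minimal in-memory sketch in Python. It is not the proposed service: the record store, the sample IDs, case-insensitive matching, and sorting by the matched value are all assumptions made for illustration.

```python
# Hypothetical stand-in for the MR: OAI ID -> {element name: [values]}.
RECORDS = {
    "oai:example:1": {"dc:title": ["Frog anatomy"], "dc:author": ["Bill Smith"]},
    "oai:example:2": {"dc:title": ["Bullfrogs of Ohio"], "dc:author": ["Ann Lee"]},
    "oai:example:3": {"dc:title": ["Plate tectonics"], "dc:author": ["Billie Jones"]},
}


def fetch_elements_like(element, text, records=RECORDS):
    """Return IDs of records where a value of `element` contains `text`.

    Case-insensitive matching is an assumption, not part of the proposal.
    """
    needle = text.lower()
    return [
        rid
        for rid, rec in records.items()
        if any(needle in value.lower() for value in rec.get(element, []))
    ]


def fetch_elements_starting(element, prefix, records=RECORDS):
    """Return IDs where a value of `element` starts with `prefix`.

    The result is sorted by the matching value (assumed sort order).
    """
    needle = prefix.lower()
    matches = sorted(
        (value.lower(), rid)
        for rid, rec in records.items()
        for value in rec.get(element, [])
        if value.lower().startswith(needle)
    )
    # dict.fromkeys drops duplicate IDs while keeping the sorted order.
    return list(dict.fromkeys(rid for _, rid in matches))


def fetch_elements(oai_id, *elements, records=RECORDS):
    """Return (element name, value) pairs for the record with this OAI ID."""
    rec = records.get(oai_id, {})
    return [(name, value) for name in elements for value in rec.get(name, [])]
```

With the sample data, `fetch_elements_like("dc:title", "frog")` matches both "Frog anatomy" and "Bullfrogs of Ohio", which shows that a "contains" query is broader than a word-level search.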

MR Access for Relationships

Committed to adding Annotations, Relationships (e.g. Equivalence), Organizational Structures

Can expose as links – slow, expensive traversal

What are the alternatives?
  Dump it as a (large) XML file?
  Support extended relationship queries?
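
One way to see why link-based exposure makes traversal expensive is to count requests. The Python sketch below walks a small, made-up relationship graph breadth-first and counts one fetch per record visited; the graph, the `fetch_links` callback, and the function name are hypothetical.

```python
from collections import deque

# Hypothetical relationship links: record ID -> IDs of related records.
LINKS = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def crawl_relationships(start_id, fetch_links):
    """Breadth-first walk over relationship links.

    `fetch_links(record_id)` stands in for one request to the service.
    Returns the visited IDs in order and the number of fetches made.
    """
    visited = []
    fetches = 0
    seen = {start_id}
    queue = deque([start_id])
    while queue:
        record_id = queue.popleft()
        related = fetch_links(record_id)  # one request per record
        fetches += 1
        visited.append(record_id)
        for other in related:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return visited, fetches


visited, fetches = crawl_relationships("A", lambda rid: LINKS[rid])
print(visited, fetches)
```

The fetch count grows with the number of records reached, which is the cost that a bulk XML dump or an extended relationship query would avoid.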

MR Ingest

Need a lightweight alternative to OAI-PMH

RSS (Rich Site Summary, Really Simple Syndication, RDF Site Summary) v0.9x, v1.0, v2.0

RSS supports Dublin Core (and some variants support arbitrary metadata)

Idea: create RSS/OAI gateway (in development)
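
The gateway idea can be sketched as a mapping from RSS items to Dublin Core records. The Python example below is not the gateway under development: the sample feed, the function name, and the choice of mapping title, link, and description to dc:title, dc:identifier, and dc:description are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Assumed mapping from RSS 2.0 item children to Dublin Core element names.
RSS_TO_DC = {"title": "title", "link": "identifier", "description": "description"}

# Made-up feed used only to exercise the sketch.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example collection</title>
    <item>
      <title>Frog life cycles</title>
      <link>http://example.org/frogs</link>
      <description>An introductory lesson.</description>
    </item>
  </channel>
</rss>"""


def rss_items_to_oai_dc(rss_xml):
    """Convert each RSS 2.0 <item> into an oai_dc:dc element (hypothetical)."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    channel = ET.fromstring(rss_xml).find("channel")
    if channel is None:
        return []
    records = []
    for item in channel.findall("item"):
        record = ET.Element(f"{{{OAI_DC_NS}}}dc")
        for rss_tag, dc_tag in RSS_TO_DC.items():
            value = item.findtext(rss_tag)
            if value:
                ET.SubElement(record, f"{{{DC_NS}}}{dc_tag}").text = value.strip()
        records.append(record)
    return records


records = rss_items_to_oai_dc(SAMPLE_FEED)
print(ET.tostring(records[0], encoding="unicode"))
```

A real gateway would also need to assign OAI identifiers and datestamps, which RSS items do not reliably provide.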

Search Access

Currently, WebDAV access available (underlying SDLIP protocol)

Need to use SDS query language

Full text search design collapses multiple fields (author, identifier)

SOAP interface will be forthcoming

Archive Access

Primary current access through NSDL.org search results

Available as HTML page through:
http://srb.npaci.edu/cgi-bin/mra-oai2.cgi?verb=GetArchive&identifier=oai:nsdl.org:pri:00109

SOAP interface almost complete

Playing well with the Web

Expose the MR as a crawlable, indexable tree – enable Google search

Expose MR relationships as web link structure

Support lightweight contribution, annotation, and NSDL membership check for resources

Enable new user services

What do you need?

How should we expose MR data and Core Integration services?

How should we support authenticated contributions (annotations, etc.)?

Is SOAP/WSDL the best choice, or should we use REST for query/data access?

How can we enable exciting new services?