2004-09-15NVO Summer School, Aspen Center for Physics1 Publishing and Resource Discovery with...

2004-09-15, NVO Summer School, Aspen Center for Physics: Publishing and Resource Discovery with Registries, by Ray Plante and Matthew Graham


Publishing and Resource Discovery with Registries

Ray Plante, Matthew Graham


All about Registries

• Overview of the Registry Framework

• Publishing to the NVO

• Visits to Registries (publishing)

• VOResource: Resource Metadata in XML

• Visits to Registries (searching)

• Exercise: query registry in an application

• IVOA Standard Registry Interface


The role of Resource Registries

• Used to discover and locate resources—data and services—that can be used in a VO application

• Resource: anything that is describable and identifiable.
  – Besides data and services: organizations, projects, software, …
  – Presently concerned with a simple set of resource types

• Registry: a list of resource descriptions
  – Expressed as structured metadata to enable automated processing and searching


An Overview of Data Discovery

• You can search the main NVO registry to find resources based on descriptive criteria

• NVO Registries are “coarse-grained”
  – You can find organizations, archives, catalogs
  – Won’t find images, celestial objects, table records

• Registry framework contains multiple registries:
  – searchable registries
  – publishing registries

Registry Framework

[Diagram, built up over six slides. Components: Data Centers, VO Projects, Specialized Portals & Services, Client Applications, Local Publishing Registries, a Local Searchable Registry, and Full Searchable Registries. Labeled links: harvest (pull), replicate, selective harvesting, search queries.]
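The links named in the Registry Framework diagram (harvest by pull, replicate, selective harvesting) can be sketched with a toy in-memory model. This is only an illustration of the idea, not the actual harvesting protocol: the class names, the `accept` filter, the record fields, and the `ivo://example.archive/...` identifiers are all assumptions made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class PublishingRegistry:
    """Holds the resource descriptions that one publisher maintains."""
    name: str
    records: dict = field(default_factory=dict)  # identifier -> description

    def publish(self, identifier, description):
        self.records[identifier] = description


@dataclass
class SearchableRegistry:
    """Collects descriptions from other registries so clients can search them."""
    name: str
    records: dict = field(default_factory=dict)

    def harvest(self, source, accept=lambda description: True):
        """Pull records from `source`; `accept` makes the harvest selective."""
        pulled = 0
        for identifier, description in source.records.items():
            if accept(description):
                self.records[identifier] = description
                pulled += 1
        return pulled

    def search(self, word):
        """Return identifiers whose title contains `word` (case-insensitive)."""
        word = word.lower()
        return sorted(i for i, d in self.records.items()
                      if word in d["title"].lower())


# Hypothetical publisher and records, invented for this sketch.
archive = PublishingRegistry("example-archive")
archive.publish("ivo://example.archive/images",
                {"title": "Example Image Collection", "type": "DataCollection"})
archive.publish("ivo://example.archive/cone",
                {"title": "Example Cone Search", "type": "ConeSearch"})

full = SearchableRegistry("full")
full.harvest(archive)                   # harvest (pull) everything

replica = SearchableRegistry("replica")
replica.harvest(full)                   # replicate between full registries

local = SearchableRegistry("local")
local.harvest(full, accept=lambda d: d["type"] == "ConeSearch")  # selective
```

A client application would then send its search queries to `full`, `replica`, or `local`, never to the publishing registry directly.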


Caution: Construction Ahead

• Registries in the NVO are currently operating and functional
  – Supports DataScope

but…

• Registries are at the leading edge of development
  – Standardized metadata
  – Standardized interfaces
  – Consistent behavior


NVO Public Registries

Registry                   URL                                              Searchable?  Publishing?
STScI/JHU NVO Registry     http://nvo.stsci.edu/voregistry/                 Yes          Yes
Caltech Carnivore          http://mercury.cacr.caltech.edu:8080/carnivore/  Yes          Yes
NCSA Registration Portal   http://nvo.ncsa.uiuc.edu/nvoregistration.html    No           Yes

Private Publishing Registries (only support the harvesting protocol)
• HEASARC
• CDS


Overview of Publishing

• Resources are published if one can use NVO facilities to find them.

• Multiple layers of publishing
  – Starts with registry description of resource
  – Data Access Services
  Incremental exposure for incremental effort

• Who are you? How you publish depends on what you want to publish.
  – An individual with a small data collection
  – An archive center
  – Someone with a cool service


Small collections: VO-ready Repositories

• Repositories that allow users to deposit data to share with the community
  – Guarantee long-term storage, availability

• Automatic support for VO publishing mechanisms
  – Entries into NVO Registry
  – Support for standard services: Cone Search, SIA, SSA, SkyNode

• Currently available Repositories
  – Images: NCSA Astronomy Digital Image Library, http://adil.ncsa.uiuc.edu/
  – Spectra: Spectrum Service for the VO, http://voservices.net/spectrum/

• More public repositories are expected to emerge. Check the NVO website (http://us-vo.org/) for the latest.


Persistent Archives: Tools for Federation

• Registering your resources with a VO publishing registry
  – Enter description into registration form at one of the available NVO registries:
    • STScI/JHU Registry: http://nvo.stsci.edu/voregistry/
    • NCSA Registration Portal: http://nvo.ncsa.uiuc.edu/nvoregistration.html
    • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/
  – If you have a large number of resources to register, you can run your own registry on your own site
    • NCSA VORegistry-in-a-Box: http://nvo.ncsa.uiuc.edu/VO/software/
    • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/


Caution: Construction Ahead

• Which registry should I register with?
  – Each registry is incomplete in different ways.
  – Behavior is slightly different
  – Different strategies for making it “easier” to register multiple resources

• Support for standard services is strongest
  – Cone Search, SIA
  – This week: use NCSA or STScI if you want to see integration into DataScope
  – SkyNode: go to STScI

• Next generation by Jan. 2005
  – Better consistency, more complete support

• Your feedback is valuable!


Persistent Archives: Tools for Federation

• What can/should you register?
  – Should: your Organization
    • Declares yourself as a publisher with an ID
  – Should: your Collection
  – Can: your existing services
    • Browser-based services: e.g. search page
    • Traditional CGI services
    • Web Services

The next level…

• Implement and register one or more standard services
  – Cone Search
  – Simple Image Access
  – SkyNode*
  – Simple Spectral Access*

*standard still in development


Cool Services: Integrating with the VO

1. Register your service at a registry
   • Currently as a generic resource
   • Improved support for non-standard services coming
   • (immediate future: contact NVO project)

2. Integrate support for standard VO formats, schemas
   • FITS and VOTable
   • Standard Data Model schemas (emerging)
     • VOResource, Space-time Coordinates, Spectra

3. Implement Standard Support Interface
   • a standard in development for: self-description, tracking health and usage


A word about Identifiers…

• IVOA Identifier: a globally-unique URI identifying a resource

Ex: ivo://adil.ncsa/targeted/SIA

• Required as part of a registered resource description

• As publisher, you control what it looks like

• Two components:
  – Authority ID: e.g. adil.ncsa
    Defines a namespace for identifiers
    Owned by a single publishing organization
  – Resource Key: e.g. targeted/SIA
    Name for the resource unique within the namespace
    Encourage re-use of local identifiers
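The two-component structure can be made concrete with a small parser. This is a minimal sketch, assuming the identifier always has the form `ivo://<authority ID>/<resource key>` as in the slide's example; the `IvoaId` type and `parse_ivoa_id` function are hypothetical names introduced here, and treating a missing resource key as an empty string is also an assumption.

```python
from typing import NamedTuple


class IvoaId(NamedTuple):
    authority_id: str   # namespace owned by one publishing organization
    resource_key: str   # name unique within that namespace (may be empty)


def parse_ivoa_id(identifier: str) -> IvoaId:
    """Split 'ivo://<authority ID>/<resource key>' into its two components."""
    prefix = "ivo://"
    if not identifier.startswith(prefix):
        raise ValueError(f"not an IVOA identifier: {identifier!r}")
    # Everything up to the first '/' is the authority ID; the rest is the key.
    authority_id, _, resource_key = identifier[len(prefix):].partition("/")
    if not authority_id:
        raise ValueError(f"missing authority ID: {identifier!r}")
    return IvoaId(authority_id, resource_key)


print(parse_ivoa_id("ivo://adil.ncsa/targeted/SIA"))
# IvoaId(authority_id='adil.ncsa', resource_key='targeted/SIA')
```

Because the resource key is everything after the first slash, a publisher can reuse a local path-like name such as `targeted/SIA` unchanged.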


Visits to the Publishing Registries

• Publishing Registries
  – STScI/JHU Registry: http://nvo.stsci.edu/voregistry/
  – NCSA Registration Portal: http://nvo.ncsa.uiuc.edu/nvoregistration.html
  – Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/

• Recommend you stick with one registry
  – Authority IDs are currently “stuck” to a single registry

• Can’t do something? Contact us!


Resource Metadata: XML Schema

• Classes of Resources: Organisation, DataCollection, Service, Registry
  – Specific classes inherit from generic <Resource>

• Organized into separate schemas:
  – Core resource metadata: VOResource
  – Various extension schemas containing specific types

• Capable of describing…
  – Data centers, research organizations, missions, observatories
  – Data collections, archives
  – VO standard services: Cone Search, Simple Image Access
  – Existing Browser/CGI-based services
  – Web Services


Describing Resources with XML: VOResource

• Model: types of Resources
  – Generic Resource
  – Extensions: e.g. DataCollection, Service, ConeSearch, …

• VOResource: Family of XML schemas
  – Core schema: VOResource
    • Common set of metadata applicable to all resources, including Dublin Core
    • Resource types: Resource, Service, Organisation
  – Extension schemas to describe specific kinds of resources
    • Extended type inherits generic metadata
    • Adds metadata specific to the type of resource
  – Extensibility allows for evolution
    • Developers only need to support types of interest to them
    • Allows developers to experiment with non-standard extensions
  – Currently transitioning from v0.9 to v0.10

• Latest status of metadata standards: http://www.ivoa.net/twiki/bin/view/IVOA/ResourceMetadata


Extension Schemas

VODataService: describing data and services
  – Types: DataCollection, SkyService, TabularSkyService

ConeSearch: describes location and behavior of a Cone Search service
  – Types: ConeSearch

SIA: describes location and behavior of a Simple Image Access service
  – Types: SimpleImageAccess

VORegistry: metadata for managing registries
  – Types: Registry, Authority
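The idea that an extended type inherits the generic metadata and adds its own can be illustrated with ordinary class inheritance. The sketch below is an analogy only, not the XML schemas themselves: the field names, the `summarize` helper, the example values, and the choice to derive `ConeSearch` directly from `Service` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Generic metadata shared by every resource type."""
    identifier: str
    title: str
    publisher: str


@dataclass
class Service(Resource):
    """A resource that can be invoked; adds where to reach it."""
    access_url: str


@dataclass
class ConeSearch(Service):
    """A Cone Search service; adds metadata specific to that service type."""
    max_search_radius_deg: float


def summarize(resource: Resource) -> str:
    """Works on any resource type because it uses only the generic metadata."""
    return f"{resource.title} ({type(resource).__name__}) from {resource.publisher}"


# Hypothetical record, invented for this sketch.
cone = ConeSearch(
    identifier="ivo://example.archive/cone",
    title="Example Cone Search",
    publisher="Example Archive",
    access_url="http://example.org/cone?",
    max_search_radius_deg=1.0,
)
print(summarize(cone))
```

Code such as `summarize` that only understands the generic type keeps working when new extension types appear, which is the sense in which developers only need to support the types of interest to them.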


Searching the Registry

Registry                   URL                                              Searchable?  Publishing?
STScI/JHU NVO Registry     http://nvo.stsci.edu/voregistry/                 Yes          Yes
Caltech Carnivore          http://mercury.cacr.caltech.edu:8080/carnivore/  Yes          Yes
NCSA Registration Portal   http://nvo.ncsa.uiuc.edu/nvoregistration.html    No           Yes


IVOA Standard Registry Interface

• IVOA Working Draft: 2 parts
  – Harvesting: sending descriptions from publishers to searchable registry
  – Searching

• Searching
  – methods
    • keywordSearch(string words, boolean combineByOr)
    • search(ADQLWhere constraints)
  – Returns a list of VOResource descriptions
  – Advanced searching with ADQL:
    • Just the “where” part (i.e. the search constraints) of ADQL
    • In place of column names, use XPath to VOResource element
      – Curation/Publisher like ‘%NASA%’
  – Query extensible to any VOResource XML extension
  – Maps readily to registry implementations based on RDBMS or XML-DB.
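The two search styles can be mimicked locally on a few XML records. This is a rough sketch, not the registry interface itself: the functions run in memory rather than calling a service, the records and the `Resource` and `Title` element names are invented, namespaces are omitted, and the matching rules (case-insensitive substring for keywords, case-sensitive `LIKE`) are assumptions. Only the `Curation/Publisher` path and the `%NASA%` pattern come from the slide.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified resource descriptions.
RECORDS = [ET.fromstring(xml) for xml in (
    "<Resource><Title>Example Image Archive</Title>"
    "<Curation><Publisher>NASA Example Center</Publisher></Curation></Resource>",
    "<Resource><Title>Example Spectra</Title>"
    "<Curation><Publisher>Example University</Publisher></Curation></Resource>",
)]


def keyword_search(records, words, combine_by_or):
    """Match records against space-separated words.

    combine_by_or=True  -> a record matches if ANY word is present
    combine_by_or=False -> a record matches only if ALL words are present
    """
    terms = words.lower().split()
    combine = any if combine_by_or else all
    matches = []
    for record in records:
        text = " ".join(record.itertext()).lower()
        if terms and combine(term in text for term in terms):
            matches.append(record)
    return matches


def like(value, pattern):
    """SQL-style LIKE: '%' matches any run of characters, '_' exactly one."""
    regex = "".join(".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
                    for ch in pattern)
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def where_like(records, xpath, pattern):
    """Keep records where some element at `xpath` is LIKE `pattern`."""
    return [record for record in records
            if any(like(element.text or "", pattern)
                   for element in record.findall(xpath))]


# Equivalent in spirit to the constraint: Curation/Publisher like '%NASA%'
for record in where_like(RECORDS, "Curation/Publisher", "%NASA%"):
    print(record.findtext("Title"))
```

Using an XPath in place of a column name is what lets the same constraint syntax reach into any extension schema: the path simply names a different element.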


Why might a client use the standard search interface?

• Uniform interface to all VO Registries

– Not dependent on a single registry

• Direct relationship between information you are querying and information you get back.

• Extensible to any type of resource description

• Re-use of ADQL

Custom interfaces to Registries:

• Registries can provide extended functionality

– More advanced capabilities: e.g. XQuery

– Simpler interfaces for specialized purposes

• Client toolkits can provide simplifying interfaces

– Support for ADQL/s

– XPath aliases


What you might use this week

• STScI or Caltech Registry portals to search for resources

• Any of the registration portals to register resources

• nvoregistry as an example of querying the registry from an application