26 October 2004 ADASS 2004 - Pasadena 1 Publishing and Resource Discovery with Registries Ray Plante T...

26 October 2004 ADASS 2004 - Pasadena 1 Publishing and Resource Discovery with Registries Ray Plante THE US NATIONAL VIRTUAL OBSERVATORY Kevin Benson Sebastien Derriere Pierre Fernique Matthew Graham Gretchen Greene Bob Hanisch Paul Harrison Martin Hill Jeongin Lee Gerard Lemson Tony Linde Tom McGlynn Wil O’Mullane Keith Noddle Ramon Williamson Visit the NVO Demo Booth

26 October 2004 ADASS 2004 - Pasadena 1

Publishing and Resource Discovery with Registries

Ray Plante

THE US NATIONAL VIRTUAL OBSERVATORY

Kevin Benson, Sebastien Derriere, Pierre Fernique, Matthew Graham, Gretchen Greene, Bob Hanisch, Paul Harrison, Martin Hill, Jeongin Lee, Gerard Lemson, Tony Linde, Tom McGlynn, Wil O’Mullane, Keith Noddle, Ramon Williamson

Visit the NVO Demo Booth

Summary (2003)

• We built a working prototype registry system to support an end-user VO service
  – Distributed Publishing and Searchable components
  – Encoded descriptions using emerging VO XML standard schemas
  – OAI Harvesting Standard deployed easily
  – Used to discover Cone Search and SIA services

• What’s next: Interoperable registries IVOA-wide
  – Stabilize XML metadata standard
  – Standardize registry interfaces

Summary (2004)

• We built a working production registry system to support end-user VO services
  – DataScope: discovers Cone Search, Simple Image Access services
  – OpenSkyQuery Portal: discovers OpenSkyNodes

• What’s next: Interoperable registries IVOA-wide
  – Stabilize XML metadata standard
  – Standardize registry interfaces

=> IVOA: Frozen working draft standard for January ’05 releases

Registries 2004

• Review of Registry architecture
• Resource Metadata Model
• IVOA Registry Interface Standard
  – Harvesting
  – Searching
• The NVO Publishing Process
• Searching for Resources
• Curation Issues

The role of Resource Registries

• Used to discover and locate resources (data and services) that can be used in a VO application

• Resource: anything that is describable and identifiable.
  – Besides data and services: organizations, projects, software, …
  – Presently concerned with a simple set of resource types

• Registry: a list of resource descriptions
  – Expressed as structured metadata to enable automated processing and searching

Selected Requirements

• Allow user to select resources that are likely to pertain to a scientific question

• Select resources based on characteristics…
  – Type of resource: catalogs, image archives, EPO, services
  – Coverage in space, time, and frequency
  – Where data comes from, who curates it

• Dynamic: resources will come and go

• Distributed: Should not depend on a single point of failure or single view of the VO.

• Preserve the data providers’ control over their data
  – Curators control what gets registered, content, updates
  – Allow integration with existing resource management

• Allow extension to new types of resources

IVOA Registry Working Group (RWG)

IVOA = International Virtual Observatory Alliance

• Common, global approach to registries

• Towards a standard framework
  – Registry Model
  – Resource Identifiers
  – Metadata schemas
  – Registry Interface

• Distributed model for registries

Registry Model

[Diagram, built up over slides 8-14. Boxes: Local Publishing Registry, Local Searchable Registry, and Full Searchable Registry, shown alongside Data Centers, VO Projects, and Specialized Portals & Services. Arrows added on successive slides: "harvest (pull)", "replicate", "selective harvesting", and "search queries" from Client Applications.]
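The harvest (pull) and selective-harvesting arrows in the diagram can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the class names, the integer `updated` counter standing in for a timestamp, and the `accept` predicate are all invented for illustration and are not part of any IVOA interface. Replication between full registries is not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    identifier: str      # hypothetical resource identifier
    resource_type: str   # e.g. "ConeSearch", "Organisation"
    updated: int         # hypothetical counter standing in for a timestamp


@dataclass
class PublishingRegistry:
    name: str
    records: List[Record] = field(default_factory=list)

    def list_records(self, since: int = 0) -> List[Record]:
        # The publisher never pushes; it only answers pull requests.
        return [r for r in self.records if r.updated > since]


@dataclass
class SearchableRegistry:
    # The default predicate keeps everything (a "full" registry);
    # a narrower predicate models selective harvesting.
    accept: Callable[[Record], bool] = lambda r: True
    store: Dict[str, Record] = field(default_factory=dict)
    last_seen: Dict[str, int] = field(default_factory=dict)

    def harvest(self, source: PublishingRegistry) -> int:
        """Pull records changed since the last harvest of `source`."""
        since = self.last_seen.get(source.name, 0)
        pulled = source.list_records(since)
        for rec in pulled:
            if self.accept(rec):
                self.store[rec.identifier] = rec
        if pulled:
            self.last_seen[source.name] = max(r.updated for r in pulled)
        return len(pulled)


pub = PublishingRegistry("pub", [
    Record("ivo://example/cone", "ConeSearch", 1),
    Record("ivo://example/org", "Organisation", 2),
])
full = SearchableRegistry()
local = SearchableRegistry(accept=lambda r: r.resource_type == "ConeSearch")
full.harvest(pub)
local.harvest(pub)
```

Because each harvester remembers what it last saw per source, a repeated harvest transfers only records that changed in the meantime, which is what keeps pull-based harvesting cheap.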

Registries in Use: DataScope

[Diagram, built up over slides 15-17. Labels: Local Publishing Registries (Caltech, NCSA, HEASARC, CDS); Full Searchable Registries (JHU/STScI, AstroGrid); arrows "harvest (pull)" and "search for services"; DataScope (DS); Data Providers with Cone Search Service and Simple Image Access boxes.]

Registries in Use

• Registries in the NVO are currently operating and functional
  – DataScope: discovers Cone Search, Simple Image Access (SIA) services
  – OpenSkyQuery Portal: discovers OpenSkyNodes
  – CDS Aladin/GLU: (Pierre Fernique)
    • harvests Cone Search and SIA services
    • converts them into GLU dictionary records
    • Accessible directly by the Aladin image and catalog viewer

• AstroGrid Registry: foundation for building workflows
  – Portal uses descriptions to stitch services together
  – (Previous talk by Keith Noddle)

• Cross-project harvesting
  – NVO, AstroGrid, AVO (Vizier, GLU)

• Registries are at the leading edge of VO development

Resource Metadata Model

IVOA Recommendation: Resource Metadata

IVOA Working Draft: VOResource (the Resource Metadata expressed as XML)

[Diagram, built up over slides 19-23: resource types grouped by the schema that defines them.]

• VOResource (core metadata): Resource, Organisation, Service
• VORegistry: Authority, Registry
• VODataService: DataCollection, SkyService, TabularSkyService
• SIA: SimpleImageAccess
• ConeSearch: ConeSearch
• VOCEA: CEAApplication, CEAService
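The point of the layered model is that new resource types extend existing ones, so a client that understands only the core types can still handle the extensions. A minimal sketch of that idea follows. The slides give only the type names: the inheritance arrangement shown here (SkyService extending Service, TabularSkyService extending SkyService) and the field names are assumptions made for illustration, not the schema definitions.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    identifier: str   # hypothetical field names, not the schema's
    title: str


@dataclass
class Organisation(Resource):
    pass


@dataclass
class Service(Resource):
    access_url: str = ""


@dataclass
class SkyService(Service):        # assumed to extend Service
    pass


@dataclass
class TabularSkyService(SkyService):   # assumed to extend SkyService
    pass


def select(records, kind):
    """Return the records that are `kind` or any type extending it."""
    return [r for r in records if isinstance(r, kind)]


records = [
    Organisation("ivo://example/org", "Example Data Center"),
    Service("ivo://example/page", "Search page", "http://example.org/search"),
    TabularSkyService("ivo://example/cat", "Example catalog service"),
]
services = select(records, Service)
```

Selecting on `Service` returns the plain service and the tabular sky service but not the organisation, which is how an application written against the core metadata keeps working as extension schemas are added.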

IVOA Working Draft: Registry Interface (RI) Standard

Kevin Benson (AstroGrid), Editor

• Harvesting: delivering resource descriptions from publishers to searchable registries
  – Adoption of Open Archives Initiative (OAI) standard: Protocol for Metadata Harvesting, http://www.openarchives.org/
  – RI defines application of OAI to VO resource records
    • Plug in VOResource as metadata format
  – Optional SOAP version to augment HTTP Get standard

• Searching
  – Returns XML VOResource records
  – Keyword search
  – Advanced search
    • Uses the Astronomical Data Query Language (ADQL)
    • Refer to metadata items via a simplified XPath
  – Easily mapped to either SQL for an RDBMS implementation or XQuery for an XML DB implementation
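For readers unfamiliar with OAI-PMH, the HTTP GET form of a harvest request is just a URL with a `verb` and a metadata format, and large result sets are paged with a `resumptionToken`. The sketch below builds such a request and extracts the token from a response. The verb, parameter names, and XML namespace come from OAI-PMH 2.0; the registry base URL and the `ivo_vor` metadata prefix used in the example call are assumptions for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request (HTTP GET form)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date   # incremental harvest: changes since this date
    if set_spec:
        params["set"] = set_spec     # selective harvest: one named set only
    return base_url + "?" + urlencode(params)


def resumption_token(response_xml):
    """Return the resumptionToken of a response, or None if the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not token.text:
        return None
    return token.text


# Hypothetical registry endpoint and metadata prefix.
url = list_records_url("http://registry.example.org/oai", "ivo_vor",
                       from_date="2004-10-01")
```

A harvester would fetch `url`, store the returned records, and repeat with the resumption token until none is returned.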

Publishing to the NVO

http://www.us-vo.org/publish.cfm

• Resources are published if one can use VO facilities to find them.

• Multiple layers of publishing
  – Starts with registry description of resource
  – Data Access Services
  Incremental exposure for incremental effort

• Who are you? How you publish depends on what you want to publish.
  – An individual with a small data collection
  – An archive center
  – Someone with a cool service
    • Extinction Correction Service
      – Developed by C. Miller, K. S. Krughoff
      – In one day of the NVO Summer School using VO tools

Small collections: VO-ready Repositories

• Repositories that allow users to deposit data to share with community
  – Guarantee long-term storage, availability

• Automatic support for VO publishing mechanisms
  – Entries into NVO Registry
  – Support for standard services: Cone Search, SIA, SSA, SkyNode

• Currently available Repositories
  – Images: NCSA Astronomy Digital Image Library, http://adil.ncsa.uiuc.edu/
  – Spectra: Spectrum Services for the VO, http://voservices.net/spectrum/

• More public repositories are expected to emerge
  Check NVO website (http://us-vo.org/) for latest

Persistent Archives: Tools for Federation

• Registering your resources with a public VO publishing registry

[Screenshots of registration pages. Labels: "Choose resource type", "Edit Form", "STScI Registry", "NCSA Registry".]

Persistent Archives: Tools for Federation

• Registering your resources with a VO publishing registry
  – Enter description into registration form at one of the available NVO registries:
    • STScI/JHU Registry: http://nvo.stsci.edu/voregistry/
    • NCSA Registration Portal: http://nvo.ncsa.uiuc.edu/nvoregistration.html
    • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/
  – If you have a large number of resources to register, you can run your own registry on your own site
    • NCSA VORegistry-in-a-Box: http://nvo.ncsa.uiuc.edu/VO/software/
    • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/

Persistent Archives: Tools for Federation

• What can/should you register?
  – Should: your Organization
    • Declares yourself as a publisher with an ID
  – Should: your Collection
    • Users at least know how to access it via a Browser
  – Can: your existing services
    • Browser-based services: e.g. search page
    • Traditional CGI services
    • Web Services

The next level…
• Implement and register one or more standard services
  – Cone Search
  – Simple Image Access
  – SkyNode*
  – Simple Spectral Access*
  *standard still in development
• NVO Summer School Software package: server-side templates and toolkits, http://www.us-vo.org/summer-school/
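To give a sense of how small the simplest of these standard services is from the client side: a Cone Search request is a base URL plus a position and radius in decimal degrees (`RA`, `DEC`, `SR`), and the response is a VOTable. The sketch below only builds the request URL. The parameter names follow the Cone Search specification; the archive base URL is hypothetical.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Cone Search request; all angles are in decimal degrees."""
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must be within [0, 360] degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must be within [-90, 90] degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    # A registered base URL may already end in '?' or '&'.
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    return base_url + sep + query


# Hypothetical archive endpoint.
url = cone_search_url("http://archive.example.org/cone", 180.0, -30.5, 0.1)
```

A registry record for such a service essentially stores the base URL, so an application like DataScope can append the position it is interested in.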

Cool Services: Integrating with the VO

1. Register your service at a registry

2. Integrate support for standard VO formats, schemas
  • FITS and VOTable
    Enable integration with existing tools & visualizers
  • Standard Data Model schemas (emerging)
    • VOResource, Space-time Coordinates, Spectra
    Enable integration with other services using these models

3. Implement Standard Support Interface
  • a standard in development for: self-description, tracking health and usage

Searching the Registry

• Use a searchable registry to find data and services
  – NVO has two searchable registries available:
    • STScI/JHU Registry: http://nvo.stsci.edu/voregistry/
    • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/

• Two types of searches:
  – Simple keyword-based search
  – Advanced search
    • STScI/JHU: SQL-based
    • Caltech: XQuery-based

• Currently working on user-oriented improvements to interactive interface (G. Greene & W. O’Mullane @ STScI)
  – Help with advanced searches
  – Improved organization of returned results
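The split between an SQL-based and an XQuery-based advanced search works because a constraint that names a metadata item by a simplified path can be translated to either back end. The following sketch shows one such translation for a single "contains" constraint. It is a hypothetical illustration, not the mapping used by either registry: the table name, the column-naming rule, the root element, and the example path `content/subject` are all assumptions.

```python
import re

# Accept only simple paths such as "content/subject" (no predicates, no axes).
_PATH = re.compile(r"^[A-Za-z_]\w*(/[A-Za-z_]\w*)*$")


def _check(path):
    path = path.strip("/")
    if not _PATH.match(path):
        raise ValueError(f"unsupported metadata path: {path!r}")
    return path


def to_sql(path, value, table="resource"):
    """Map the constraint to a parameterised SQL query (hypothetical schema)."""
    column = _check(path).replace("/", "_")
    return f"SELECT * FROM {table} WHERE {column} LIKE ?", (f"%{value}%",)


def to_xquery(path, value, root="Resource"):
    """Map the same constraint to an XQuery/XPath expression."""
    literal = value.replace('"', '""')   # XQuery escapes quotes by doubling them
    return f'//{root}[contains({_check(path)}, "{literal}")]'


sql, sql_params = to_sql("content/subject", "quasar")
xq = to_xquery("content/subject", "quasar")
```

The SQL form assumes the record has been flattened into columns, while the XQuery form runs against the stored XML directly; the client-facing constraint is the same in both cases.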

Accessing the Registry from Applications

• Custom Web Service Interfaces available
  – keyword and advanced search functions
  – Currently used by DataScope and SkyPortal

• IVOA Standard Web Service interface
  – Full support targeted for January 2005 roll-out
  – Beta support available from Caltech Carnivore

• Available Java client software
  – Currently available via NVO Summer School software distribution
    • Zip file: http://chart.stsci.edu/twiki/bin/view/Main/Software
    • HowTos: http://chart.stsci.edu/twiki/bin/view/Main/NVOSummerSchoolCourseNotes
  – Includes:
    • Client library for IVOA Standard search interface
    • Sample client code for both custom and standard interfaces

Curation Issues

• NVO Registries now contain over 3000 records
  Lots of problematic metadata:
  – Missing information, incorrect usage, truncated values
  – Duplicates, deprecated records, missing resources
  – Broken/non-compliant services

• People need to assume responsibility for curation
  – Software can help, but is not sufficient
  – Role of Registry administrator?
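Two of the problems listed above, missing information and duplicates, are the kind that software can flag mechanically before a human looks at the record. A minimal sketch of such a check follows. The record layout (plain dictionaries) and the list of required fields are assumptions for illustration, not the registries' actual rules.

```python
from collections import Counter

# Hypothetical set of fields every record is expected to fill in.
REQUIRED = ("identifier", "title", "publisher")


def lint(records):
    """Return (record index, problem) pairs for missing fields and duplicate IDs."""
    counts = Counter(r["identifier"] for r in records if r.get("identifier"))
    problems = []
    for index, rec in enumerate(records):
        for key in REQUIRED:
            if not str(rec.get(key) or "").strip():
                problems.append((index, f"missing {key}"))
        if counts.get(rec.get("identifier"), 0) > 1:
            problems.append((index, "duplicate identifier"))
    return problems


records = [
    {"identifier": "ivo://example/a", "title": "A", "publisher": "X"},
    {"identifier": "ivo://example/a", "title": "", "publisher": "X"},
    {"title": "C", "publisher": "Y"},
]
report = lint(records)
```

Checks like this catch only structural problems; whether a description is actually correct, or a service still works, is where the human curator and service verifiers come in.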

A practical approach to Curation

• Proposal: “VerificationLevel” tag attached to resource descriptions by a registry curator
  – 3 levels:
    • Unverified
    • Verified by software
    • Verified by human curator
  – Tag exposed to users/apps: e.g. select only highly verified resources
  – Tag is specific to a registry; can be overridden when harvested by another registry.

• Software verification
  – NCSA: building a suite of software verifiers
  – Can be incorporated directly into registries, either locally or by calling a remote web service
  – First example: Cone Search Verifier, http://nvo.ncsa.uiuc.edu/services/csvalidate.html
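The proposed tag behaves like an ordered value that applications filter on and that each registry sets for itself. A minimal sketch under stated assumptions: the three level names come from the proposal above, while the numeric values, the dictionary record layout, and the function names are hypothetical.

```python
from enum import IntEnum


class VerificationLevel(IntEnum):
    # Numeric values are an assumption; only the ordering matters here.
    UNVERIFIED = 0
    SOFTWARE = 1   # verified by software
    HUMAN = 2      # verified by a human curator


def select_verified(records, minimum):
    """Keep only records verified at `minimum` or better."""
    return [r for r in records if r["verification"] >= minimum]


def on_harvest(record, local_level=VerificationLevel.UNVERIFIED):
    """Copy a harvested record, replacing the source registry's tag with our own."""
    copy = dict(record)
    copy["verification"] = local_level
    return copy


records = [
    {"identifier": "ivo://example/a", "verification": VerificationLevel.HUMAN},
    {"identifier": "ivo://example/b", "verification": VerificationLevel.SOFTWARE},
    {"identifier": "ivo://example/c", "verification": VerificationLevel.UNVERIFIED},
]
trusted = select_verified(records, VerificationLevel.SOFTWARE)
harvested = on_harvest(records[0])
```

Resetting the tag on harvest reflects that the level is specific to a registry: the harvesting registry has not itself verified the record, however thoroughly the source did.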

Summary 2004

• NVO is operating production registries
  – serving end-user applications
  – greater emphasis on user interfaces
  – registry searches easily integrated into applications
  – Full release of latest improvements by January 2005

• Interoperable exchange between IVOA registries
• Extensible Resource Metadata model
• IVOA Registry Interface Standard is emerging

What’s next: shift from development to curation
• Finalize RI standard
• Address curation issues
• No talk on registries next year