
European Space Astronomy Centre (ESAC), Villafranca del Castillo, MADRID (SPAIN)

Beijing, China - May 2007

Homogeneous Access to Tabular Data

Aurélien Stébé

Iñaki Ortiz, Kona Andrews, Guy Rixon


Introduction

Three input query methods:
• Simple Access Query - Key Value Pair filters
• Complete Access Query - ADQL (synchronous)
• Asynchronous Querying - ADQL and UWS

Output result formats / error response handling common to all methods

New proposals for output format selection, empty results, error messages

http://esavo.esac.esa.int/doc/Homogeneous_Access.pdf


Simple Access Query

Intended to be easy to implement for server and client
Uses Key-Value Pairs to filter the data to be returned
Allows minimal control over quantity of data to be returned
Presents dataset as flat, single table, hiding inner structure

Invoked via HTTP GET to:
• http://service.endpoint.saq{?,&}PARAM=value[&…]

PARAMs should not modify output format or alter data
PARAMs may only limit the quantity of data (rows), the type of specific data or the quantity of information (columns)

Define generic types / families of parameters
Reserve a few parameter names (POS, SIZE, BAND, TIME, …)
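As an illustration of the GET invocation above, here is a minimal sketch of how a client might assemble a Simple Access Query URL. The helper name `build_saq_url` and the example values given to POS and SIZE are assumptions made for this sketch; the slides do not define the value syntax of the reserved parameters.

```python
from urllib.parse import urlencode


def build_saq_url(endpoint: str, **params: str) -> str:
    """Append key-value pair filters to a Simple Access Query endpoint.

    Hypothetical helper: it only follows the {?,&}PARAM=value[&...] pattern
    from the slide and does not know which PARAMs a service accepts.
    """
    # The {?,&} notation: start the query part with '?' unless the endpoint
    # already has one, in which case continue it with '&'.
    separator = "&" if "?" in endpoint else "?"
    # Keep ',' and '/' unescaped, since they act as list and interval separators.
    return endpoint + separator + urlencode(params, safe="/,")


# Example values are illustrative only.
url = build_saq_url("http://service.endpoint.saq", POS="180.0,-30.0", SIZE="0.5")
print(url)
```

The PARAMs only narrow what is returned; nothing in the URL selects an output format or changes the data.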


Simple Access Query - PARAMs

Single value type:
• PARAM_NAME=value
  • PARAM_column = “value”
  • PARAM_column > value (upper limit parameter)

List value type:
• PARAM_NAME=value1,value2,value3
  • PARAM_column IN (“value1”, “value2”, “value3”)

Interval value type:
• PARAM_NAME=valueMIN/valueMAX
  • PARAM_column BETWEEN valueMIN AND valueMAX
  • PARAM_column > valueMIN (open upper limit interval)

Interval PARAM type:
• PARAM_columnMIN < value AND PARAM_columnMAX > value
• PARAM_columnMIN < valueMAX AND PARAM_columnMAX > valueMIN
• PARAM_columnMAX > valueMIN (open upper limit interval)
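The mapping from PARAM values to column constraints listed above could be sketched on the server side roughly as follows. The function names are hypothetical. Writing an open upper limit as `valueMIN/` (empty maximum) is an assumption, since the slide does not show how it is written. Values are interpolated without escaping, so this is not safe for real SQL generation.

```python
def value_predicate(column: str, raw: str) -> str:
    """Constraint on a single column for single, list and interval values."""
    if "/" in raw:  # interval value: valueMIN/valueMAX
        low, high = raw.split("/", 1)
        if high == "":  # assumed spelling of an open upper limit: "valueMIN/"
            return f"{column} > {low}"
        return f"{column} BETWEEN {low} AND {high}"
    if "," in raw:  # list value: value1,value2,value3
        quoted = ", ".join(f"'{v}'" for v in raw.split(","))
        return f"{column} IN ({quoted})"
    return f"{column} = '{raw}'"  # single value


def interval_param_predicate(col_min: str, col_max: str, raw: str) -> str:
    """Constraint for an interval PARAM stored as a MIN/MAX column pair."""
    if "/" in raw:
        low, high = raw.split("/", 1)
        if high == "":  # open upper limit interval
            return f"{col_max} > {low}"
        # Row interval overlaps the requested interval.
        return f"{col_min} < {high} AND {col_max} > {low}"
    # Single value must fall inside the row interval.
    return f"{col_min} < {raw} AND {col_max} > {raw}"


print(value_predicate("band", "1.0/2.5"))
print(interval_param_predicate("time_min", "time_max", "100/200"))
```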


Complete Access Query

Intended to give total access and control over the dataset
Uses full query language (defined in other specifications)
Makes dataset’s inner structure available to the client
Users become responsible for data-level considerations

Invoked via HTTP POST to:
• http://service.endpoint.caq

Sending the message body:
• queryType=queryString

Three query types:
• nativeADQL - ADQL against the service’s table/column names
• uTypeADQL - ADQL against formal data model using uTypes
• directQuery - pass-through to the DBMS, any query language
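A client-side sketch of the POST invocation above might look like this. It only builds the request and does not send it. The form-urlencoded body encoding, the helper name `build_caq_request` and the example query (including the table name `catalogue`) are assumptions; the slides only state that the body is `queryType=queryString`.

```python
from urllib.parse import urlencode
from urllib.request import Request

# The three query types named on the slide.
QUERY_TYPES = {"nativeADQL", "uTypeADQL", "directQuery"}


def build_caq_request(endpoint: str, query_type: str, query: str) -> Request:
    """Build (but do not send) a Complete Access Query POST request."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"unknown query type: {query_type}")
    # Assumed encoding: a standard form body of the shape queryType=queryString.
    body = urlencode({query_type: query}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical query against a hypothetical table.
req = build_caq_request(
    "http://service.endpoint.caq", "nativeADQL", "SELECT * FROM catalogue"
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```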


Asynchronous Querying

Interface identical to the Complete Access Query, plus DEST
Using the UWS for job management and workflow

The DEST parameter for delivery:
• DEST=LOCAL
  • For local staging of the data at the service
• DEST=http://my.server/~john/out.vot
• DEST=ftp://my.server/~john/out.vot
• DEST=vos://my.server!vospace/john/out.vot
  • For HTTP, FTP or VOSpace delivery, respectively
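The DEST values above can be told apart by their URL scheme. The sketch below shows one way a service might classify them; the function name and the returned labels are hypothetical, and rejecting any other scheme is an assumption rather than something the slides specify.

```python
from urllib.parse import urlsplit

# Schemes shown on the slide, mapped to illustrative labels.
DELIVERY_BY_SCHEME = {
    "http": "HTTP delivery",
    "ftp": "FTP delivery",
    "vos": "VOSpace delivery",
}


def delivery_mode(dest: str) -> str:
    """Classify a DEST value as local staging or a remote delivery."""
    if dest == "LOCAL":
        return "local staging"
    scheme = urlsplit(dest).scheme
    if scheme not in DELIVERY_BY_SCHEME:
        raise ValueError(f"unsupported DEST: {dest}")
    return DELIVERY_BY_SCHEME[scheme]


print(delivery_mode("LOCAL"))
print(delivery_mode("vos://my.server!vospace/john/out.vot"))
```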


Output Result Formats

Various formats possible, default is VOTable-v1.1
Two methods to select the output format:
• HTTP header “Accept:” with MIME types
• OUTPUT parameter with fixed values

The OUTPUT method always overrides the “Accept:” one

Response to valid queries must have the “200 OK” status code and “Content-Type:” header with MIME type of the output format

Empty response to valid queries must have the “204 No Content” status code and no message body

Formats defined: VOTable, CSV or TSV, XML, …
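The selection rules above (OUTPUT overrides "Accept:", VOTable is the default, empty results give 204) can be sketched as follows. The fixed OUTPUT values used here (`VOTABLE`, `CSV`, `TSV`, `XML`) are assumptions, since the slides do not list them, and the "Accept:" parsing is simplified: it ignores q-values and wildcards.

```python
# Assumed OUTPUT values mapped to MIME types.
OUTPUT_TO_MIME = {
    "VOTABLE": "application/x-votable+xml",
    "CSV": "text/csv",
    "TSV": "text/tab-separated-values",
    "XML": "text/xml",
}
DEFAULT_MIME = OUTPUT_TO_MIME["VOTABLE"]


def select_format(output_param, accept_header):
    """Pick the response MIME type; OUTPUT always overrides 'Accept:'."""
    if output_param is not None:
        return OUTPUT_TO_MIME[output_param.upper()]  # KeyError if unknown
    if accept_header:
        # Simplified: take the first listed type the service supports.
        for item in accept_header.split(","):
            mime = item.split(";", 1)[0].strip().lower()
            if mime in OUTPUT_TO_MIME.values():
                return mime
    return DEFAULT_MIME


def response_head(row_count, mime):
    """Status code and headers for a valid query."""
    if row_count == 0:
        return 204, {}  # No Content: no message body, so no Content-Type
    return 200, {"Content-Type": mime}


print(select_format("csv", "application/x-votable+xml"))
print(response_head(0, "text/csv"))
```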


Output Result Formats

The use of OUTPUT and “Accept:” HTTP header
• Advantage: consistent across query methods, allows a direct human-readable webpage interface for a web browser
• Disadvantage: output format must be a MIME type, need two methods to select format, because “Accept:” cannot always be used
• Alternative: only use the OUTPUT parameter method
• NOTE: OUTPUT is not equivalent to FORMAT from DAL

The use of “204 No Content” for empty responses
• Advantage: consistent across result formats, processing power spared
• Disadvantage: current clients may not check for this status code
• Alternative: return empty VOTable (or equivalent in output format)


Metadata Access Format

All the information needed to call the service should come from the service itself
It should come from a unique endpoint, invoked via HTTP GET

For Simple Access Query, we need: input PARAMs, output FIELDs
For Complete Access Query, we need: tables/columns information

Encoding this information in Registry format would ease many things
If it fills those requirements, VOSI would be used for Metadata access


Error Responses

Error output is done using the HTTP error codes

Here are a few examples:
• 400 Bad Request - malformed input query
• 404 Not Found - query to unsupported method
• 500 Internal Server Error - general misc server error
• 501 Not Implemented - unimplemented optional method
• 502 Bad Gateway - backend error (DBMS, store, …)

The message body should contain the error text explanation

Advantage: same system regardless of output format
Disadvantage: limited list of codes we don’t control
Alternative: classical VOTable-based error output
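The status code examples above could be applied on the server side along these lines. The error category names (`malformed_query` and so on) are hypothetical labels invented for this sketch, and falling back to 500 for unknown categories is an assumption.

```python
from http import HTTPStatus

# Hypothetical internal error categories mapped to the codes listed above.
ERROR_STATUS = {
    "malformed_query": HTTPStatus.BAD_REQUEST,
    "unsupported_method": HTTPStatus.NOT_FOUND,
    "server_error": HTTPStatus.INTERNAL_SERVER_ERROR,
    "optional_method_missing": HTTPStatus.NOT_IMPLEMENTED,
    "backend_error": HTTPStatus.BAD_GATEWAY,
}


def error_response(kind: str, explanation: str):
    """Return (status line, message body) for an error category."""
    # Assumed fallback: anything unrecognised is a general server error.
    status = ERROR_STATUS.get(kind, HTTPStatus.INTERNAL_SERVER_ERROR)
    # The body carries the textual explanation, whatever the output format.
    return f"{status.value} {status.phrase}", explanation


print(error_response("malformed_query", "Unbalanced parenthesis in ADQL query"))
```

Because the status code travels in the HTTP response itself, a client can detect the failure without parsing the body in any particular output format.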


Conclusion

Work is still needed on Metadata access, the reserved PARAMs list, output format selection, empty responses and the error mechanism before the proposal is complete

Compliant with second generation DAL services: the Simple Access Query method represents the first step (queryData method)

Questions?