THREDDS Data Server
Unidata’s Common Data Model
Background / Summary
John Caron
Unidata/UCAR
Mar 2007
THREDDS Data Server
[Architecture diagram. Labels: HTTP Tomcat Server (motherlode.ucar.edu); THREDDS Server Application; NetCDF-Java library; catalog.xml; Datasets; IDD Data; services HTTPServer, NetcdfSubset, WCS, OPeNDAP.]
THREDDS Catalogs
• XML over HTTP
• Hierarchical listing of online resources (datasets)
• Container for arbitrary search metadata
  – Standard set maps to DC, GCMD, ADN
  – Unidata/CDP
• Metadata can be inherited
• Design goal: Make it easy for data providers
• TDS uses catalogs for configuration
  – Client view vs. server view
• Data Access URLs
  – “Crossing the protocol boundary”
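How a catalog yields data access URLs, and how inherited metadata avoids repeating the service on every dataset, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample catalog, the dataset names, and the helper `access_urls` are hypothetical, and the sketch handles just one inherited `serviceName` rather than the full catalog schema.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace used by THREDDS InvCatalog 1.0 documents.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

# Hypothetical sample catalog, made up for this sketch.
CATALOG_XML = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="Example">
  <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Model output">
    <metadata inherited="true">
      <serviceName>odap</serviceName>
    </metadata>
    <dataset name="Run A" urlPath="model/runA.nc"/>
    <dataset name="Run B" urlPath="model/runB.nc"/>
  </dataset>
</catalog>
"""


def access_urls(catalog_xml, catalog_url):
    """Map dataset name -> access URL (service base + dataset urlPath)."""
    root = ET.fromstring(catalog_xml.strip())
    bases = {s.get("name"): s.get("base") for s in root.findall("t:service", NS)}
    found = {}

    def walk(node, service):
        # Inherited metadata applies to this dataset and everything nested in it.
        for md in node.findall("t:metadata", NS):
            if md.get("inherited") == "true":
                name = md.find("t:serviceName", NS)
                if name is not None:
                    service = name.text.strip()
        url_path = node.get("urlPath")
        if url_path and service in bases:
            # Resolve the server-relative base against the catalog's own URL.
            found[node.get("name")] = urljoin(catalog_url, bases[service] + url_path)
        for child in node.findall("t:dataset", NS):
            walk(child, service)

    for dataset in root.findall("t:dataset", NS):
        walk(dataset, None)
    return found


urls = access_urls(CATALOG_XML, "http://hostname.edu/thredds/catalog.xml")
```

The container dataset has no `urlPath`, so only the two nested datasets receive access URLs; both pick up the OPeNDAP service from the inherited metadata.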
Motherlode catalog example
[Figure: catalog.xml]
THREDDS WCS 1.0 Server
• Each (gridded) Dataset is a WCS
• Each Grid is a Coverage
• Return formats
  – GeoTIFF: floating point, greyscale
  – NetCDF / CF-1.0 (same as NetcdfSubset Service)
• No reprojection or resampling
• GALEON 2
  – upgrade to WCS 1.1
  – Try returning point datasets
THREDDS OPeNDAP Server
• Current version 2.0; NASA ESE standard
  – Working on new 4.0 protocol spec
• Based on Java-OPeNDAP library
  – shared development by Unidata/opendap.org
• Any CDM dataset can be served
• Server4 (Hyrax):
  – latest version of opendap.org C++ library
  – uses THREDDS catalog generation code
  – THREDDS Catalogs replace dods_dir
Common Data Model
[Architecture diagram. Labels: HTTP Tomcat Server (hostname.edu); THREDDS Server Application; NetCDF-Java library; catalog.xml; Datasets; IDD Data; services HTTPServer, NetcdfSubset, WCS, OPeNDAP; annotation “Then a miracle happens”.]
NetCDF-Java version 2.2 architecture
[Diagram labels: Application; Scientific Datatypes; Datatype Adapter; NetcdfDataset; CoordSystem Builder; NcML; NetcdfFile; OPeNDAP; ADDE; THREDDS; Catalog.xml; I/O service provider; NetCDF-3; NetCDF-4; HDF5; GRIB; GINI; NIDS; Nexrad; DMSP; …]
I/O Service Provider Implementations
• General: NetCDF, HDF5, OPeNDAP
• Gridded: GRIB-1, GRIB-2
• Radar: NEXRAD level 2 and 3, DORADE, Chinese NEXRAD
• Point: BUFR, ASCII
• Satellite: DMSP, GINI, McIDAS AREA
• In development / tentative
  – NOAA CLASS legacy files
  – Barrodale DataBlade
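The I/O service provider idea, one small plug-in per file format behind a common interface, can be illustrated with a Python sketch. The class and function names here are hypothetical and do not reproduce the NetCDF-Java interface; the sketch only shows format detection from the leading bytes of a file, and it assumes the signature sits at offset 0.

```python
class NetCDF3Provider:
    name = "NetCDF-3"

    def is_valid_file(self, header: bytes) -> bool:
        # Classic netCDF files start with "CDF" plus a version byte (1 or 2).
        return header[:3] == b"CDF" and header[3:4] in (b"\x01", b"\x02")


class HDF5Provider:
    name = "HDF5"

    def is_valid_file(self, header: bytes) -> bool:
        # HDF5 signature; this sketch only checks offset 0.
        return header.startswith(b"\x89HDF\r\n\x1a\n")


class GribProvider:
    name = "GRIB"

    def is_valid_file(self, header: bytes) -> bool:
        # "GRIB" indicator, with the edition number (1 or 2) in the 8th byte.
        return header.startswith(b"GRIB") and header[7:8] in (b"\x01", b"\x02")


# Registry of providers, tried in order (hypothetical, for illustration).
PROVIDERS = [NetCDF3Provider(), HDF5Provider(), GribProvider()]


def find_provider(header: bytes):
    """Return the name of the first provider that recognizes the header."""
    for provider in PROVIDERS:
        if provider.is_valid_file(header):
            return provider.name
    return None
```

Supporting a new format then means adding one provider to the registry; nothing above that layer has to change.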
Common Data Model Layers
• Data Access
• Coordinate Systems
• Scientific Datatypes: Grid, Point, Radial, Trajectory, Swath, Station Profile
NetCDF-4 and Common Data Model (Data Access Layer)
NetCDF-4 C library
• 4.0 Beta implements CDM access layer
  – complete, but waiting for HDF5 release 1.8 to finalize file format (Maybe this month, 1.5 years late!)
  – Persistence format for complete CDM
• 4.1: adding Coordinate Systems
  – Optional layer, focus on CF-1 (libcf)
• 4.?: merge OPeNDAP access (pending funding)
Coordinate Systems UML
NcML: NetCDF Markup Language
XML representation of netCDF metadata
• Core: netCDF data access model
• Coordinate System: general and georeferencing coordinate system
• Dataset: redefine, aggregate, subset
Luca Cinquini (NCAR/SCD/ESG), John Caron, Ethan Davis, Bob Drach (LLNL), Stefano Nativi (Florence), Russ Rew
NcML
• NcML Coordinate Systems further developed into NcML-G by Stefano et al.
• NcML Core and Dataset combined into single schema to allow dataset modification
• Aggregation:
  – Union
  – Syntactic join on (existing or new) outer dimension
  – Semantic aggregation of (runtime, forecast time) = Forecast Model Run Collection
NcML example
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2" location="/data/nids/N0R_20041119_2147">
  <attribute name="cdm_datatype" value="Radial" />
  <remove type="attribute" name="password" />
  <variable name="Reflectivity" orgName="R34768">
    <attribute name="units" value="dBZ" />
  </variable>
</netcdf>
TDS / NcML example
<datasetScan name="Ocean Satellite Data" path="ocean/sat" dirLocation="R:/tds/netcdf/">
  <netcdf>
    <attribute name="Conventions" value="CF-1.0"/>
  </netcdf>
</datasetScan>
TDS / NcML aggregation
<dataset name="WEST-CONUS_4km Aggregation" urlPath="satellite/3.9/WEST-CONUS_4km">
  <netcdf>
    <aggregation dimName="time" type="joinNew">
      <scan location="/data/ldm/pub/satellite/3.9/WEST-CONUS_4km/" suffix=".gini" />
    </aggregation>
  </netcdf>
</dataset>
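The difference between the aggregation types can be shown on plain nested lists. This is a conceptual Python sketch only: the helper names are hypothetical, the tiny "images" stand in for the files found by a scan, and none of this is how the library itself implements aggregation.

```python
def shape(array):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)


def join_new(pieces):
    """joinNew: each piece becomes one slice along a NEW outer dimension."""
    return list(pieces)


def join_existing(pieces):
    """joinExisting: concatenate pieces along their EXISTING outer dimension."""
    return [slab for piece in pieces for slab in piece]


def union(datasets):
    """Union: variables from all datasets side by side (first one wins on a clash)."""
    merged = {}
    for dataset in datasets:
        for name, variable in dataset.items():
            merged.setdefault(name, variable)
    return merged


# Two single-time satellite images with dimensions (y=2, x=3).
image_a = [[1, 2, 3], [4, 5, 6]]
image_b = [[7, 8, 9], [10, 11, 12]]

# joinNew creates the time dimension: (time=2, y=2, x=3).
stacked = join_new([image_a, image_b])

# joinExisting needs pieces that already carry time: (1, 2, 3) + (1, 2, 3).
extended = join_existing([[image_a], [image_b]])

both = union([{"IR": image_a}, {"VIS": image_b}])
```

Both joins end with a `(2, 2, 3)` array; what differs is whether the input files already carry the outer dimension, which is why GINI images with no time dimension of their own call for `joinNew`.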
Datasets vs. Files
• Must hide actual location of data files on your server
• Would like to hide actual file format
• Must encapsulate collections of files into logical datasets
  – Homogeneous metadata
  – Hide arbitrary storage decisions
  – Minimize number of datasets
Forecast Model Run Collection (FMRC)
Data Model: Sampled Functions
Our phenomena are continuous functions:
$F: \text{Domain} \to \text{Range}$
where
Domain = subset of space-time (3 spatial dimensions, time), $E^4$
Range = $\mathbb{R}^n$ (product set of real numbers)
Our measurements are sampled functions; the domain is a point subset $\{p : p \in E^4\}$:
$M: E^4 \to \mathbb{R}^n$
Variables
Variable is a container for an Array of values
  dimensions:
    lat = 64; lon = 128;
  variables:
    float temperature(lat, lon);
Domain is a set of points in Index space:
  $\text{Temperature}: \{[0..63] \times [0..127]\} \to \mathbb{R}$
  $\text{Temperature}: I^2 \to \mathbb{R}$
  $\text{Variable}: I^m \to \mathbb{R}^n$
Coordinate Systems
Coordinate Axis: $I^m \to \mathbb{R}$
{Axis} = Coordinate System: $I^m \to E^4$
$V: I^m \to \mathbb{R}^n$
$CS: I^m \to E^4$
$V \circ CS^{-1}: E^4 \to \mathbb{R}^n$
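The composition of a variable with the inverse coordinate system can be made concrete with a tiny grid. The Python sketch below is illustrative only: the axis values, the data, and the function names are made up, and nearest-neighbor lookup is an assumption standing in for the inverse, since a sampled coordinate system has no exact inverse between grid points.

```python
# Coordinate axes: index -> coordinate value (hypothetical 3 x 4 grid).
lat = [10.0, 20.0, 30.0]
lon = [100.0, 110.0, 120.0, 130.0]

# Variable V: index space (i, j) -> value.
temperature = [
    [280.0, 281.0, 282.0, 283.0],
    [284.0, 285.0, 286.0, 287.0],
    [288.0, 289.0, 290.0, 291.0],
]


def nearest_index(axis, value):
    """Index of the axis point closest to the requested coordinate."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - value))


def cs_inverse(lat_value, lon_value):
    """Approximate inverse coordinate system: (lat, lon) -> (i, j)."""
    return nearest_index(lat, lat_value), nearest_index(lon, lon_value)


def value_at(lat_value, lon_value):
    """V composed with the inverse coordinate system: (lat, lon) -> value."""
    i, j = cs_inverse(lat_value, lon_value)
    return temperature[i][j]
```

A caller asks for a value at `(21.0, 128.0)` in coordinate space and never touches the indices `(1, 3)` directly.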
Scientific Data Types
• Trying to go beyond index-space subsetting
• Trying to satisfy $V \circ CS^{-1}: E^4 \to \mathbb{R}^n$
  – I.e. support subsetting using Space, Time “queries”
• Based on datasets Unidata is familiar with
  – APIs are evolving
• Intended to scale to large, multifile collections
• Corresponding “standard” NetCDF file format conventions
Implementations
Datatype
• Grid
• PointObs
• RadialSweep
• Swath
Dataset
• GridDataset
• FMRCDataset
• CollectionOfPointObs
• StationCollectionOfPointObs
• StationCollectionOfRadialSweep
Conclusions
• CDM is our implementation data model
• Map to data access models such as OGC
• Current work is to serve collections instead of individual files.
• Dataset is desired level of granularity
• Scientific data types are implementations with specialized access
Datatype Collection
• GridDataset: collection of GridDatatype
Gridded Datatype
  float gridData(t,z,y,x);
  float time(t);
  float y(y);
  float x(x);
  float lat(y,x);
  float lon(y,x);
  float z(z);
  float height(t,z,y,x);
• Cartesian coordinates
• All dimensions are connected
• horizontal: lat,lon or projection x,y
• time(time) orthogonal 1D
• separable: (x, y) X time X z
GridDatatype methods
CoordinateAxis getTaxis();
CoordinateAxis getXaxis();
CoordinateAxis getYaxis();
CoordinateAxis getZaxis();
Projection getProjection();
int[] findXYindexFromCoord(double x_coord, double y_coord);
LatLonRect getLatLonBoundingBox();
Array getDataSlice(Range[] …)
GridDatatype makeSubset(Range[] …)
Radial Data
  radialData(radial, gate):
    distance(gate)
    azimuth(radial)
    elevation(radial)
    time(radial)
• Polar coordinates
• All dimensions are connected
• No separate time dimension
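Georeferencing a radial gate means turning its polar coordinates into a position relative to the radar. The Python sketch below shows the idea under strong simplifying assumptions that are not from the slides: a flat earth, no atmospheric refraction, and azimuth measured clockwise from north. The function name is hypothetical.

```python
import math


def gate_to_xyz(azimuth_deg, elevation_deg, distance_m):
    """Polar (azimuth, elevation, distance) -> local Cartesian (x east, y north, z up).

    Simplified: flat earth, straight beam, azimuth clockwise from north.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    ground = distance_m * math.cos(el)   # horizontal range along the ground
    x = ground * math.sin(az)            # east component
    y = ground * math.cos(az)            # north component
    z = distance_m * math.sin(el)        # height above the radar
    return x, y, z


# A gate 1 km due east of the radar at zero elevation.
x, y, z = gate_to_xyz(90.0, 0.0, 1000.0)
```

Because azimuth, elevation and time vary per radial while distance varies per gate, every `(radial, gate)` cell gets its own position, which is why the data cannot be described by a separable Cartesian grid.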
Swath
  swathData(line,cell)
    lat(line,cell)
    lon(line,cell)
    time(line)
    z(line,cell) ??
• lat/lon coordinates
• not separate time dimension
• all dimensions are connected
Unstructured Grid
  float unstructGrid(t,z,pt);
  float lat(pt);
  float lon(pt);
  float time(t);
  float height(z);
• Pt dimension not connected
• Looks the same as point data
• Need to specify the connectivity explicitly
Point Observation Data
  Structure {
    lat, lon, z, time;
    v1, v2, ...
  } obs(pt);
• Set of measurements at the same point in space and time
• Point dimension not connected
  float obs1(pt);
  float obs2(pt);
  float lat(pt);
  float lon(pt);
  float z(pt);
  float time(pt);
PointObsDataset Methods
// Iterator<StructureData>
Iterator getData(
LatLonRect boundingBox,
Date start, Date end);
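A space/time query of this kind amounts to filtering records by a bounding box and a time window. The following Python sketch mirrors the shape of the method above but is not the library's implementation: the record layout, the sample observations, and the `(lat_min, lat_max, lon_min, lon_max)` tuple are assumptions, and the box is assumed not to cross the date line.

```python
from datetime import datetime

# Hypothetical point observations: one record per (lat, lon, time).
observations = [
    {"lat": 40.0, "lon": -105.3, "time": datetime(2007, 3, 1, 0), "v1": 2.5},
    {"lat": 35.2, "lon": -97.4, "time": datetime(2007, 3, 1, 6), "v1": 11.0},
    {"lat": 40.1, "lon": -104.9, "time": datetime(2007, 3, 2, 0), "v1": 4.0},
]


def get_data(obs, bounding_box, start, end):
    """Lazily yield the records inside the box and the time window (inclusive)."""
    lat_min, lat_max, lon_min, lon_max = bounding_box
    for record in obs:
        if (lat_min <= record["lat"] <= lat_max
                and lon_min <= record["lon"] <= lon_max
                and start <= record["time"] <= end):
            yield record


hits = list(get_data(
    observations,
    (39.0, 41.0, -106.0, -104.0),
    datetime(2007, 3, 1, 0),
    datetime(2007, 3, 1, 12),
))
```

Returning an iterator rather than a list lets a large collection be streamed without loading every record into memory.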
Time series Station Data
  Structure {
    name;
    lat, lon, z;
    Structure {
      time;
      v1, v2, ...
    } obs(*); // connected
  } stn(stn); // not connected
StationObs Methods
// List<Station>
List getStations(LatLonRect boundingBox);
// Iterator<StructureData>
Iterator getData(Station s, Date start, Date end);
Trajectory Data
  Structure {
    lat, lon, z, time;
    v1, v2, ...
  } obs(pt); // connected
• pt dimension is connected
• Collection dimension not connected
  Structure {
    name;
    Structure {
      lat, lon, z, time;
      v1, v2, ...
    } obs(*); // connected
  } traj(traj) // not connected
Profiler/Sounding Station Data
  Structure {
    name;
    lat, lon, time;
    Structure {
      z;
      v1, v2, ...
    } obs(*); // connected
  } loc(nloc); // not connected
  Structure {
    name;
    lat, lon;
    Structure {
      time,
      Structure {
        z;
        v1, v2, ...
      } obs(*); // connected
    } time(*); // connected
  } stn(stn); // not connected
Data Types Summary
• Data access through a standard API
• Convenient georeferencing
• Specialized subsetting methods
  – Efficiency for large datasets
Payoff
N + M instead of N * M things on your TODO List!
[Diagram: File Format #1, File Format #2, File Format #N; CDM; Visualization & Analysis, NetCDF file, OPeNDAP Server, WCS Service, Web Service.]
Next: DataType Aggregation
• Work at the CDM DataType level, know (some) data semantics
• Forecast Model Collection
  – Combine multiple model forecasts into single dataset with two time dimensions
  – With NOAA/IOOS (Steve Hankin)
• Point/Station/Trajectory/Profile Data
  – Allow space/time queries, return nested sequences
  – Start from / standardize “Dapper conventions”
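The two time dimensions of a forecast model collection, run time and forecast time, can be pictured as a small table. This Python sketch uses made-up run times and forecast offsets, and its function names are hypothetical; it only illustrates how one valid time can be covered by several runs.

```python
from datetime import datetime, timedelta

# Hypothetical collection: two model runs, each forecasting 0, 12 and 24 hours ahead.
run_times = [datetime(2007, 3, 1, 0), datetime(2007, 3, 1, 12)]
offsets_hours = [0, 12, 24]


def valid_times(run_times, offsets_hours):
    """2D table indexed by (run, forecast): valid time = run time + offset."""
    return [[run + timedelta(hours=h) for h in offsets_hours] for run in run_times]


def runs_valid_at(run_times, offsets_hours, when):
    """All (run index, forecast index) pairs whose valid time equals `when`."""
    table = valid_times(run_times, offsets_hours)
    return [
        (r, f)
        for r, row in enumerate(table)
        for f, t in enumerate(row)
        if t == when
    ]


table = valid_times(run_times, offsets_hours)
overlap = runs_valid_at(run_times, offsets_hours, datetime(2007, 3, 1, 12))
```

Here 12:00 on 1 March is covered twice: as the 12-hour forecast of the first run and as the analysis of the second. Knowing this much about the data's semantics is what lets an aggregation choose between them.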
Forecast Model Collections
Coordinate Systems: implicit/explicit
• NetCDF, OPeNDAP, HDF data models do not have explicit coordinate systems
  – so georeferencing not part of API
  – Need conventions to specify (e.g. CF-1, COARDS, etc.)
• GRIB, HDF-EOS (e.g.) are explicit
  – But no uniform API
NetCDF-4 C Library
[Diagram labels: netCDF-3 Interface; netCDF-4 Library; HDF5 Library.]
Conclusion
• Standardized Data Access in good shape
  – HDF5, NetCDF, OPeNDAP
  – Write an IOSP for proprietary formats (Java)
• But that’s not good enough!
• To do:
  – Standard representations of coordinate systems
  – Classifications of data types, standard services for them