Deutscher Wetterdienst, Lindenberg Meteorological Observatory
Richard Assmann Observatory
Data Management Plans from the Lindenberg Perspective
Michael Sommer
GRUAN Lead Centre, DWD
GRUAN Data Management Coordination Meeting, Asheville, North Carolina, USA, 28rd September 2009
Contents
Goals
Strategy of data handling
1. Collection & 2. Preprocessing
3. Archive → GRUAN Meta-DB
4. Processing
5. Dissemination & 6. Monitoring
Discussion
Goals of GRUAN data handling
Long-term stability & reference quality imply:
as accurately as possible → heterogeneous measuring (instruments/network)
quality quantification → error bar for all measured values
traceability → still in 20 years!
What do these facts mean for the data processing within GRUAN?
reprocessable → improvements of algorithms should be usable for the complete series
traceable → all steps of measuring and processing should be adequately documented (meta data)
Strategy of data handling (flow diagram)
GRUAN sites → 1. Collecting (data, meta data) → 2. Preprocessing (convert to internal standard) → 3. Archive → 4. Processing (modular, extensible) → 5. Dissemination → Customer
3. Archive consists of 3.A Data (file archive / data base) and 3.B Meta data (meta data base)
6. Monitoring
Further labels in the diagram: any options (e.g. data host, lead centre); decentral (lead centre, sites, ...); GRUAN scientists; lead centre; data host; official data centre; web application
1. Collecting measuring data
What do we collect?
any measuring data (raw data) relevant for GRUAN (first of all → priority one instruments)
meta data of measurements (e.g. conditions, equipment, used software, operator, problems, ...)
How do we collect?
many variants are possible
elaborate → special services to collect data, e.g. GTS
simple → email, ftp, http
agreement is necessary → optimal integration into the existing data flow of sites
Raw data & meta data
What are raw data in GRUAN?
A) Engineering raw data: measured signals (frequencies, voltage, ...)
B) Physical raw data: first calculated measures (pressure, temperature, …)
Additional information
Summarised
• raw data are not filtered, not corrected, …
• all data which are needed to calculate the target variables and to quantify their quality
What are meta data in GRUAN?
Summarised: meta data are all additional information needed to categorise and to understand the target variables.
Example: a radiosonde ascent
• basic facts: when, what, where, (who)
• how (assembly of rig): balloon, parachute, string length, position
• meteorological parameters
• ground check, coefficients, …
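The radiosonde example could be captured as a structured record. The following is only a minimal sketch in Python: the class name `AscentMetaData` and all field names and values are assumptions made for illustration, not a GRUAN definition.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class AscentMetaData:
    """Hypothetical meta data record for one radiosonde ascent."""
    launch_time: datetime          # when
    instrument: str                # what
    site: str                      # where
    operator: str = ""             # who (optional)
    rig: dict = field(default_factory=dict)           # balloon, parachute, string length, ...
    ground_check: dict = field(default_factory=dict)  # ground check results, coefficients, ...

    def missing_basic_facts(self) -> list:
        """Return the names of mandatory basic facts that are still empty."""
        return [name for name in ("instrument", "site") if not getattr(self, name)]


ascent = AscentMetaData(
    launch_time=datetime(2009, 9, 1, 12, 0, tzinfo=timezone.utc),  # made-up example values
    instrument="radiosonde",
    site="Lindenberg",
    rig={"balloon": "example type", "string_length_m": 30},
)
record = asdict(ascent)  # plain dict, e.g. for transfer to a meta data base
```

A record like this makes it easy to check at the site whether the basic facts are complete before the data are sent on.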
2. Preprocessing
Import: all collected raw data and meta data
Test: the integrity of data
Convert: to the standard GRUAN data format (e.g. netCDF)
Store: converted files in data archive (+ original files as backup)
Inform: “meta data database” of results from preprocessing
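The five preprocessing steps can be sketched as one function. This is a minimal illustration under stated assumptions: the function name `preprocess`, the SHA-256 integrity test, the whitespace-separated input format and the JSON output (a stand-in for the standard format, e.g. netCDF) are all hypothetical choices, not the Lead Centre's implementation.

```python
import hashlib
import json
import shutil
from pathlib import Path


def preprocess(raw_file: Path, archive_dir: Path, expected_sha256: str) -> dict:
    """Run import, test, convert, store and inform for one collected file."""
    data = raw_file.read_bytes()                       # import

    checksum = hashlib.sha256(data).hexdigest()        # test the integrity
    if checksum != expected_sha256:
        raise ValueError(f"integrity check failed for {raw_file.name}")

    # convert: JSON is only a stand-in for the standard format (e.g. netCDF)
    values = [float(v) for v in data.decode("ascii").split()]
    converted = {"source": raw_file.name, "values": values}

    archive_dir.mkdir(parents=True, exist_ok=True)     # store
    original_copy = archive_dir / raw_file.name
    shutil.copy2(raw_file, original_copy)              # original file kept as backup
    converted_file = archive_dir / (raw_file.stem + ".json")
    converted_file.write_text(json.dumps(converted))

    # inform: this dict is what would be reported to the meta data base
    return {
        "original": str(original_copy),
        "converted": str(converted_file),
        "sha256": checksum,
        "n_values": len(values),
    }
```

The returned report keeps the checksum and both file locations, so the preprocessing step itself stays traceable.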
3. Archive
GRUAN Archive
A) Data: file archive / database
Storage of:
• original data files (as backup additional to sites)
• converted standardised files (from preprocessing)
• processed data / data products
B) Meta data: meta data base
Information about:
• sites: location, measurement systems, instruments, …
• measurements: all “relevant” infos
• processing: level, versions, used algorithm, software version, ...
Capabilities of the Meta-database
Main goals of GRUAN: reprocessable & traceable
save “all relevant” information → question: What is relevant?
about sites, measurements, processing, data products, …
options for design of data base
static (content-concrete) layout
pros: fast, plain / clear
cons: extensible only with a change of the DB layout
dynamic (content-abstract) layout
pros: very flexible, easily extensible (no change of DB)
cons: complex layout, many interlinks between tables → abstract and therefore error-prone
I favour → a combination of static and dynamic parts
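One way to combine static and dynamic parts is a fixed table for the facts every measurement has, plus a key/value table for everything else. The sketch below uses SQLite from the Python standard library; the table and column names are assumptions chosen for illustration and do not reflect the actual GMDB layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- static part: fixed columns for facts every measurement has
    CREATE TABLE measurement (
        id         INTEGER PRIMARY KEY,
        site       TEXT NOT NULL,
        instrument TEXT NOT NULL,
        start_time TEXT NOT NULL
    );
    -- dynamic part: arbitrary key/value pairs, extensible without a layout change
    CREATE TABLE measurement_property (
        measurement_id INTEGER NOT NULL REFERENCES measurement(id),
        name           TEXT NOT NULL,
        value          TEXT NOT NULL,
        PRIMARY KEY (measurement_id, name)
    );
""")

cur = conn.execute(
    "INSERT INTO measurement (site, instrument, start_time) VALUES (?, ?, ?)",
    ("Lindenberg", "radiosonde", "2009-09-01T12:00:00Z"),  # made-up example row
)
mid = cur.lastrowid
conn.executemany(
    "INSERT INTO measurement_property VALUES (?, ?, ?)",
    [(mid, "balloon", "example type"), (mid, "string_length_m", "30")],
)


def properties(measurement_id: int) -> dict:
    """Collect the dynamic properties of one measurement."""
    rows = conn.execute(
        "SELECT name, value FROM measurement_property WHERE measurement_id = ?",
        (measurement_id,),
    )
    return dict(rows.fetchall())
```

Queries on the static columns stay fast and clear, while a new kind of property only needs a new row in `measurement_property`, not a new column.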
Current work on the GMDB (GRUAN Meta Data Base)
Define interfaces: e.g. use of HTTP, SOAP, XML
to get meta data
to put meta data
user management for access is important and necessary → Who can do what in the GMDB?
Develop the layout:
Overview of current parts and tables
Develop a GUI: AdminClient to manage the GMDB
easy access to meta data for overview, supply, update, repair, … → for administration (e.g. LC)
assistants to collect meta data (e.g. site, instrumentation, measuring, …) → for use at sites
Java software (online and offline usage possible)
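The question "Who can do what in the GMDB?" amounts to checking a role against an operation before each get or put. The sketch below is written in Python for brevity and uses an in-memory store; the role names, the permission table and the class `MetaDataStore` are hypothetical and say nothing about how the real interfaces (HTTP, SOAP, XML) or the Java client are built.

```python
# Hypothetical roles and the operations each may perform
PERMISSIONS = {
    "lead_centre": {"get", "put"},
    "site": {"get", "put"},
    "guest": {"get"},
}


class MetaDataStore:
    """In-memory stand-in for the meta data base."""

    def __init__(self):
        self._records = {}

    def _check(self, role: str, operation: str) -> None:
        # Unknown roles get an empty permission set and are always refused
        if operation not in PERMISSIONS.get(role, set()):
            raise PermissionError(f"role {role!r} may not {operation}")

    def put(self, role: str, key: str, record: dict) -> None:
        """Store a copy of the record if the role may write."""
        self._check(role, "put")
        self._records[key] = dict(record)

    def get(self, role: str, key: str) -> dict:
        """Return a copy of the record if the role may read."""
        self._check(role, "get")
        return dict(self._records[key])
```

Keeping the permission table in one place means the answer to "who can do what" can be changed without touching the get and put logic.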
Administration of meta data
Collection of meta data
4. Processing
Processing server
Modular layout
standard interface for archive communication
specific modules to: test, filter, calculation, error handling (QA, QQ), correction, interpolation, merging, etc.
Traceability
all processing infos in meta data base
definition of specific processing schemes
[Diagram: processing server. A specific processing scheme chains module 1, module 2, ..., module n. Archive communication: file archive / data base (get → data, put ← new data); meta data database (get → meta data, put ← meta data).]
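A processing scheme built from exchangeable modules, with every step logged for the meta data base, might look like the following. This is a minimal Python sketch: the two example modules, the missing-value marker `-999.0`, the offset of `0.5` and the log fields are all invented for illustration.

```python
from typing import Callable, List, Tuple

Module = Callable[[List[float]], List[float]]


def remove_missing(values: List[float]) -> List[float]:
    """Filter module: drop a hypothetical missing-value marker."""
    return [v for v in values if v != -999.0]


def apply_offset(values: List[float]) -> List[float]:
    """Correction module: subtract a made-up constant offset."""
    return [v - 0.5 for v in values]


def run_scheme(values: List[float], scheme: List[Tuple[str, str, Module]]):
    """Apply the modules in order and log each step for the meta data base."""
    log = []
    for name, version, module in scheme:
        n_in = len(values)
        values = module(values)
        log.append({"module": name, "version": version,
                    "n_in": n_in, "n_out": len(values)})
    return values, log


# A specific processing scheme: an ordered list of (name, version, module)
scheme = [("remove_missing", "1.0", remove_missing),
          ("apply_offset", "1.0", apply_offset)]
result, log = run_scheme([10.0, -999.0, 11.5], scheme)
```

Because the scheme is plain data, the same list that drives the processing can be stored with the product, which is what makes a later reprocessing with improved modules reproducible.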
Processing software
What do we need?
completely traceable processing (incl. quality quantification)
verifiable → by any customer who wishes to do so
Proposal for processing software → open to discussion
extensible (modular design with an open interface)
complete documentation
version control
free access and free use
Software developed by the Lead Centre is Open Source Software
Level of data “products”
0 Original data files
source for preprocessing
backup
1 Raw data
see → definition of raw data
2 Processed measuring data + error bar on all values
e.g. interpolated, filtered, corrected, …
no use of independent observations
3 “Best possible” profile – composite data (merging)
use of independent observations
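The four levels and the error-bar requirement of level 2 can be expressed as a small data model. The names `Level`, `Value` and the two helper functions below are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    ORIGINAL = 0    # original data files
    RAW = 1         # raw data
    PROCESSED = 2   # processed measuring data + error bar on all values
    COMPOSITE = 3   # "best possible" profile, composite data (merging)


@dataclass(frozen=True)
class Value:
    """One measured value together with its error bar."""
    value: float
    uncertainty: float


def uses_independent_observations(level: Level) -> bool:
    """Only level 3 products merge independent observations."""
    return level == Level.COMPOSITE


def has_error_bars(values) -> bool:
    """True if every entry carries a non-negative uncertainty (required at level 2)."""
    return all(isinstance(v, Value) and v.uncertainty >= 0 for v in values)
```

A check such as `has_error_bars` could run before a product is labelled level 2, so that no value leaves processing without its uncertainty.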
5. Dissemination of data products
Objectives of dissemination:
integration into the commonly used distribution channels
use of an existing data centre (→ NCDC)
How promptly should the data products be distributed?
On GRUAN web site (DWD) and / or data dissemination portal (NCDC):
search and download of data products
documentation → all information about measurements and data products
special software for easy usage of data (like a viewer)
explicit identification of the data products (reference) → Digital Object Identifier (DOI)
6. Monitoring
What is it for?
status of network
status of processing
detection of problems
Who has access to the monitoring?
all sites
Lead Centre
science partners
Implementation as a web application
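Behind such a web application, the network and processing status could be condensed from per-site reports. The following Python sketch assumes a hypothetical report format (a `site` name and a `status` string) and made-up example reports.

```python
from collections import Counter


def network_status(reports: list) -> dict:
    """Summarise per-site processing reports for a monitoring page."""
    counts = Counter(r["status"] for r in reports)
    problems = sorted(r["site"] for r in reports if r["status"] != "ok")
    return {"counts": dict(counts), "sites_with_problems": problems}


reports = [  # made-up example reports
    {"site": "site_a", "status": "ok"},
    {"site": "site_b", "status": "failed"},
    {"site": "site_c", "status": "ok"},
]
summary = network_status(reports)
```

The summary covers the purposes listed above: the counts give the status of network and processing, and the problem list supports the detection of problems.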
Discussion
How do we collect data and meta data?
Is the collection of the raw data feasible?
How do we handle “black-box” software? → quality quantification and error handling
How promptly should the data products be distributed?
Should / Can we make data usage traceable?
How could you contribute?
resources (archiving, dissemination, ...)
existing software solutions
processing methods
Who can develop and/or provide which part or service?