VERA-QC, a new Data Quality Control based on Self-Consistency
Dieter Mayer, Reinhold Steinacker, Andrea Steiner
University of Vienna, Department of Meteorology and Geophysics, Vienna, Austria
Presentation at the 10th European Conference on Applications of Meteorology (ECAM)
Berlin, 14 September 2011
Outline
10th European Conference on Applications of Meteorology (ECAM)Berlin, 12-16 September 2011
IMG ViennaMayer et.al.
• Motivation for VERA-QC
• Applicability and basis of VERA-QC
• Mathematical background of VERA-QC
• Deviations and error detection
• Handling special station alignments
• Conclusion and availability of VERA-QC
Motivation for VERA-QC
High quality data is needed as input for VERA
• What is VERA?
– Analysing observations to grid points (complex topography)
– Combining interpolation (TPS) & downscaling (fingerprints)
• Features of VERA
– Model independent
– No need for first guess fields
– Works on a real-time & operational basis
• Applications of VERA & VERA-QC
– Real-time model verification
– Basis for nowcasting
– Evaluation of case & field studies
– Computation of analysis ensembles
Selecting or designing a QC?
Properties of VERA & its applications:
• model independent
• no background fields
• model verification
• real time
• field studies
• complex topography
• analysis ensembles
Existing QC methods:
• Bayesian QC
• Variational QC
• QC using OI
• QC using ID
• QC using SR
• Limit checks
• Internal consistency checks
Requirements to select / design QC:
• fast (not iterative)
• no statistical information
• handle inhomogeneous station distribution
• propose deviations
Answer: there is a need for a new QC method
Applicability of VERA-QC
• Basis: spatial and/or temporal consistency of data
• Requirement: high degree of redundancy in observations
– Depends on station density & scale of the phenomenon
– Expressed as station distance d and decorrelation length d0
– QC applicable if d0 / d >> 1 (GTS: pMSL, Θ, Θe)
Example: VERA analysis for precipitation (green) & MSL pressure (black). Dots and stars: observations for precipitation & pressure.
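The redundancy criterion can be illustrated with a small sketch. It assumes that the station distance d is taken as the mean nearest-neighbour distance and that "much larger than 1" is approximated by a hypothetical threshold `min_ratio`; both choices, the function names, and the example decorrelation lengths are illustrative assumptions, not values from the presentation.

```python
import math

def mean_nearest_neighbour_distance(stations):
    """Mean distance from each station to its closest neighbour (input units)."""
    total = 0.0
    for i, (xi, yi) in enumerate(stations):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(stations) if j != i
        )
    return total / len(stations)

def redundancy_ratio(stations, decorrelation_length):
    """Ratio d0 / d of decorrelation length to typical station distance."""
    return decorrelation_length / mean_nearest_neighbour_distance(stations)

def qc_applicable(stations, decorrelation_length, min_ratio=5.0):
    # min_ratio is an assumed stand-in for the condition "d0 / d >> 1".
    return redundancy_ratio(stations, decorrelation_length) >= min_ratio

# Four stations on a 50 km square, coordinates in km (made-up layout).
stations_km = [(0.0, 0.0), (50.0, 0.0), (0.0, 50.0), (50.0, 50.0)]
ratio_large_scale = redundancy_ratio(stations_km, 500.0)  # assumed d0 = 500 km
ratio_small_scale = redundancy_ratio(stations_km, 20.0)   # assumed d0 = 20 km
```

With this layout a large-scale field passes the check, while a field whose decorrelation length is shorter than the station spacing does not.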
Basis of VERA-QC
• Error-affected observations → (rough) observation field Ψo
• Corrected observations → (smoother) analysis field Ψa = Ψo + ΔΨ
• Main task is to obtain the deviations ΔΨ
Example: south-west to north-east pressure gradient with some artificial errors
Note: ΔΨ is not a simple difference between observation and interpolation
Mathematical Core of VERA-QC
• Goal: obtain deviations that yield a smooth analysis field
– Define the cost function J as the squared curvature of the analysis field
– The curvature of the analysis field, CΨa, is not known → Taylor series expansion
– Build the global cost function (taking into account all stations and grid points)
– Solve the optimization problem for the deviations
• Indices: d1, d2, D: dimensions; n, N: grid points; p, P: primary neighbors; m, M: main stations; s, S: secondary neighbors
• Questions regarding the cost function:
– Q1: Where should the cost function be evaluated?
   A1: A regular grid is too expensive; take station points
– Q2: What are main stations, primary and secondary neighbors?
   A2: m: main station: one station after another
       p: (primary) direct neighbors of m
       s: (secondary) neighbors of m
– Q3: Which stations contribute to the Taylor series expansion?
   A3: A certain station and its natural neighbors. More than one station is allowed to be erroneous!
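The cost-function steps in this section are listed without their formulas. One plausible way to write them is sketched below; it is a reading of the bullet points, not the authors' exact equations. Here \Psi_o is the observation field, \Psi_a = \Psi_o + \Delta\Psi the analysis field, and C_p the curvature at evaluation point p. The summation ranges and the first-order form of the expansion are assumptions.

```latex
\begin{align}
J_p &= \left(C_{\Psi_a,p}\right)^2
  && \text{squared curvature of the analysis field at point } p \\
C_{\Psi_a,p} &\approx C_{\Psi_o,p}
  + \sum_{s} \frac{\partial C_p}{\partial \Psi_s}\,\Delta\Psi_s
  && \text{Taylor expansion around the observations} \\
J &= \sum_{m}\sum_{p} \left( C_{\Psi_o,p}
  + \sum_{s} \frac{\partial C_p}{\partial \Psi_s}\,\Delta\Psi_s \right)^{2}
  && \text{global cost function} \\
\frac{\partial J}{\partial \Delta\Psi_m} &= 0 \quad \text{for all } m
  && \text{conditions that determine the deviations}
\end{align}
```

If the curvature depends linearly on the station values, the expansion is exact, J is quadratic in the deviations, and the last line is a linear system.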
Concept of natural neighbors
Method connecting stations: Delaunay Triangulation
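To make the notion of natural neighbors concrete, the following sketch finds Delaunay triangles by brute force (a triangle is kept if no other station lies inside its circumcircle) and then treats stations that share a triangle edge as neighbors. This is only practical for a handful of stations; the function names and the five-station layout are illustrative assumptions, and a production system would use a dedicated triangulation library.

```python
from itertools import combinations

def circumcircle_contains(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle a, b, c."""
    # Put the triangle in counter-clockwise order so the sign test is valid.
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    if orient < 0:
        b, c = c, b
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 1e-12

def delaunay_triangles(points):
    """All index triples whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        area2 = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
        if abs(area2) < 1e-12:
            continue  # skip collinear triples
        if not any(circumcircle_contains(a, b, c, points[m])
                   for m in range(len(points)) if m not in (i, j, k)):
            triangles.append((i, j, k))
    return triangles

def natural_neighbours(points):
    """Map each station index to the indices it shares a Delaunay edge with."""
    neighbours = {i: set() for i in range(len(points))}
    for tri in delaunay_triangles(points):
        for u, v in combinations(tri, 2):
            neighbours[u].add(v)
            neighbours[v].add(u)
    return neighbours

# Four corner stations and one central station (made-up layout).
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
triangles = delaunay_triangles(points)
neighbours = natural_neighbours(points)
```

In this layout the central station is a neighbor of all four corners, while diagonally opposite corners are not connected to each other.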
Triangulation / Computing curvatures
Typical example of a realistic station distribution and its Delaunay triangulation
• Define local grids around stations
• Interpolate station values ΨS to grid points n (inverse distance interpolation)
• Compute curvatures
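The three steps above can be sketched as follows. The sketch assumes a 3x3 local grid with spacing `h`, an inverse-distance weighting exponent of 2, and a finite-difference Laplacian as the curvature measure; these choices and all names are illustrative assumptions rather than the settings used in VERA-QC.

```python
import math

def idw(stations, values, x, y, power=2.0):
    """Inverse distance weighted estimate of the field at (x, y)."""
    num = den = 0.0
    for (sx, sy), v in zip(stations, values):
        dist = math.hypot(x - sx, y - sy)
        if dist < 1e-12:
            return v  # grid point coincides with a station
        w = 1.0 / dist ** power
        num += w * v
        den += w
    return num / den

def local_curvature(stations, values, centre, h=1.0):
    """Five-point Laplacian of the interpolated field on a local grid."""
    cx, cy = centre

    def f(dx, dy):
        return idw(stations, values, cx + dx * h, cy + dy * h)

    return (f(1, 0) + f(-1, 0) + f(0, 1) + f(0, -1) - 4.0 * f(0, 0)) / h ** 2

# A central station with four neighbours at unit distance (made-up layout).
stations = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
# Linear field 2x + 3y + 1: a plane has zero curvature.
clean = [1.0, 3.0, -1.0, 4.0, -2.0]
# Same field with an artificial error of +5 at the central station.
spiked = [6.0, 3.0, -1.0, 4.0, -2.0]

curv_clean = local_curvature(stations, clean, (0.0, 0.0))
curv_spiked = local_curvature(stations, spiked, (0.0, 0.0))
```

The error-free plane yields zero curvature at the central station, whereas the artificial error produces a large negative curvature there, which is the signal the cost function penalizes.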
Weighting Deviations
• Simplest example: 1D, 1 spike
• The outlier is only partially corrected, and the neighbors overshoot in the opposite direction
• Solution: correcting an erroneous observation should reduce the cost function, so compute weighted deviations
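The 1D spike case can be explored with a deliberately simplified sketch. It assumes unit station spacing, takes the second difference as curvature, and computes each station's deviation independently by minimizing a local cost in which only that station is changed. The weight used here, the relative reduction of the local cost, is a hypothetical choice standing in for the weighting formula, which is not part of the transcript. Because the stations are not coupled as in VERA-QC, the numbers differ from those of the presented example.

```python
def curvature(y, i):
    """Second difference at interior index i (unit station spacing)."""
    return y[i - 1] - 2.0 * y[i] + y[i + 1]

def local_cost(y, m):
    """Sum of squared curvatures at station m and its two direct neighbours."""
    return sum(curvature(y, i) ** 2 for i in (m - 1, m, m + 1))

def unweighted_deviation(y, m):
    """Change of y[m] alone that minimises local_cost(y, m)."""
    return (2.0 * curvature(y, m)
            - curvature(y, m - 1) - curvature(y, m + 1)) / 6.0

def weighted_deviation(y, m):
    """Deviation scaled by the relative cost reduction it achieves (assumed weight)."""
    dev = unweighted_deviation(y, m)
    before = local_cost(y, m)
    if before == 0.0:
        return 0.0
    trial = list(y)
    trial[m] += dev
    weight = (before - local_cost(trial, m)) / before  # lies between 0 and 1
    return weight * dev

# Flat field with a single spike of height 6 at index 4.
obs = [0.0] * 9
obs[4] = 6.0
stations = range(2, 7)  # indices with enough neighbours on both sides
raw = [unweighted_deviation(obs, m) for m in stations]
damped = [weighted_deviation(obs, m) for m in stations]
```

In this toy setting the unweighted deviations also modify the error-free neighbors of the spike. The weighting leaves the correction of the spike itself unchanged and reduces the deviations proposed for its neighbors.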
Deviations and Gross Errors
• Three possibilities to handle an observation:
– No gross error: observation accepted
– No gross error: observation corrected
– Gross error: observation rejected
• a, b and c: parameter-dependent, user-defined thresholds
• VERA-QC is repeated without rejected observations
Cluster Treatment
• Error propagation is possible at close-by stations
• Example: circles of stations with a cluster in the center
• Both stations obtain significant deviations
• Combine both stations into one fictive cluster station
• Compute the deviation for the cluster station
• Add the deviation to both stations
• Repeat VERA-QC for the modified observations
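The cluster steps can be sketched as below. The sketch assumes that the fictive cluster station takes the mean position and mean value of its two members, and it receives the deviation routine as a parameter; `treat_cluster`, `compute_deviation` and the stand-in `mean_of_others_deviation` are hypothetical names, and the stand-in is not the VERA-QC deviation.

```python
def treat_cluster(positions, values, i, j, compute_deviation):
    """Apply one cluster-station deviation to the two close stations i and j."""
    # Assumption: the fictive station uses the members' mean position and value.
    cx = 0.5 * (positions[i][0] + positions[j][0])
    cy = 0.5 * (positions[i][1] + positions[j][1])
    cval = 0.5 * (values[i] + values[j])
    rest = [k for k in range(len(values)) if k not in (i, j)]
    merged_pos = [positions[k] for k in rest] + [(cx, cy)]
    merged_val = [values[k] for k in rest] + [cval]
    # Deviation of the cluster station, which is the last entry of the merged lists.
    dev = compute_deviation(merged_pos, merged_val, len(merged_val) - 1)
    corrected = list(values)
    corrected[i] += dev
    corrected[j] += dev
    return corrected, dev

def mean_of_others_deviation(positions, values, k):
    """Stand-in deviation for the demo: mean of all other stations minus own value."""
    others = [v for idx, v in enumerate(values) if idx != k]
    return sum(others) / len(others) - values[k]

# Ring of four stations plus two close stations in the centre (made-up data).
positions = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0),
             (0.0, 0.0), (0.05, 0.0)]
values = [10.0, 10.0, 10.0, 10.0, 14.0, 14.4]
corrected, cluster_dev = treat_cluster(positions, values, 4, 5,
                                       mean_of_others_deviation)
```

The corrected values would then be passed through the quality control again, as the last bullet describes. Both cluster members receive the same deviation, so the difference between them is preserved.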
Conclusions
• Properties of VERA-QC:
– Applicable to 1-, 2-, 3- and 4-dimensional problems
– High efficiency in detecting errors compared to other QC methods
– Not a simple averaging algorithm
– Can handle very inhomogeneous station distributions
– Model independent, fast, no iterations necessary
– Deviations can be stored to compute bias
– Implemented as a MATLAB stand-alone application; runs on server & PC
Availability of VERA-QC
• Further information:
– Publication: Steinacker, R., D. Mayer, and A. Steiner, 2011: Data Quality Control Based on Self-Consistency. Accepted in Monthly Weather Review.
– Poster presentation: A. Steiner, Operational Application of VERA-QC, Challenges and how to cope with them. Poster Hall, Thursday 16-17:00.
• Homepage: http://www.univie.ac.at/amk/veraflex/test/intern/
• VERA-QC is freely available for non-commercial use
The End
Acknowledgments: Austrian Science Fund (FWF), support under grant number P19658
Contact: [email protected], http://www.univie.ac.at/amk/veraflex/test/intern/
Thank you for your attention
Is VERA-QC an averaging technique?
• Considering a signal at only 3 stations (unlikely to be a gross error)
• Unweighted deviations smooth the signal
• Weighted deviations only soften the contrast
VERA-QC in higher dimensions
VERA in a nutshell
• Interpolate irregularly distributed station values to a regular grid (thin plate spline)
• Downscaling with the help of idealized, physically motivated patterns
Solution: unexplained field, explained field, weight, fingerprint