The XM Tool Project: The Lessons Learned in Non-Linear Statistical Modeling for Air Quality Forecasting in Canada
Andrew Teakles¹, Sean Perry¹, Aiming Wu¹, Qian Li¹, Stavros Antonopoulos¹, Ken Lau¹, Harry Yau¹, Rita So¹, Johannes Jenkner²
IWAQFR, Washington, DC, Nov 28th to Dec 1st, 2011
¹ Environment Canada, Canada
² University of British Columbia, Canada
Page 2 – 10-10-31
Overview
• XM Tool project and its goals
• Project History
• Optimizing MLR
• Use of Antecedent Conditions
• Non-Linear Methods
• Lessons Learned
• Future Work
XM Tool Project
Mandate:
• Develop and evaluate new non-linear tools for post-processing of air quality forecasts of O3, NO2, and PM2.5
Purpose:
• Supports AQHI forecast program
Key Objectives:
– Improved guidance for air quality episodes.
– Improved timing of air quality episodes.
– Improved overall model forecast skill.
Project History
• Project launch - Fall 2009
• Data gathering and Infrastructure - 2010
• Round 1 and 2 Testing - 2010 to 2011
• XM Tool prototype (pending)
Science Phases:
1. Round 1: Trials to narrow prospective non-linear techniques
2. Round 2: Explore the use of antecedent conditions
3. Round 3: Introduce adaptive learning algorithms and test learning rates relative to UMOS-AQ
Techniques Tested

| Method Tested | Status | Comments |
|---|---|---|
| Modified MLR with BIC predictors | Prototype | |
| Boosted Regression | Minor refinements | Optimization routine needed |
| Bayesian Neural Network | Needs refinements | Revise predictors; Matlab code |
| Support Vector Regression | Needs refinement | |
| Neural Network with cross-validation | Rejected | Data loss |
| MDA | Rejected | |
| CART | Rejected | |
List of Predictors
Predictors:
• 84 AQ Model Predictors
• 3 Persistence Predictors
• 27 Antecedent Predictors
Persistence Predictors:
• OBS at Hr 00
Antecedent Predictors:
• Lag 24/48/72 hrs, Max, Min, OBS
AQ Model Predictor types:
• Meteorological Variables
• Sine Julian Day
• Day of the week
• NO2, O3, PM2.5
• Spatial Average, Min, Max
6 AQ Sites Tested:
• Halifax
• Montreal
• Toronto
• Winnipeg
• Edmonton
• Vancouver
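As a rough illustration of how the antecedent predictors listed above might be derived from an hourly observation series, here is a minimal Python sketch. It assumes that "Max" and "Min" are taken over a 24-hour window ending at each lagged hour; the slides do not define the window, so `window_h`, the function name, and the feature names are all hypothetical.

```python
from typing import Optional, Sequence

LAGS_H = (24, 48, 72)  # lags named on the slide


def antecedent_predictors(obs: Sequence[Optional[float]], t: int,
                          window_h: int = 24) -> dict:
    """Build lagged OBS/Max/Min predictors for hour index t.

    Assumption: Max and Min are taken over the `window_h` hours ending
    at the lagged hour; the slides do not define the window.
    A predictor is None when any hour it needs is missing.
    """
    feats = {}
    for lag in LAGS_H:
        end = t - lag
        start = end - window_h + 1
        window = list(obs[start:end + 1]) if start >= 0 else None
        complete = window is not None and all(v is not None for v in window)
        feats[f"obs_lag{lag}"] = obs[end] if end >= 0 else None
        feats[f"max_lag{lag}"] = max(window) if complete else None
        feats[f"min_lag{lag}"] = min(window) if complete else None
    return feats
```

Under this reading each pollutant yields nine antecedent predictors. A single missing hour inside a window turns the corresponding Max and Min into `None`, which is one way antecedent predictors can shrink the usable training set.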
MLR approaches

| | UMOS-AQ | XM version |
|---|---|---|
| Technique | MLR | MLR |
| Predictors | AQ model + Persistence | AQ model + Persistence and/or Antecedents |
| Predictor Selection | Forward Stepwise Regression | Leaps Minimum BIC (Ken Lau) |
| Seasonality | 2 season | 1 season |
| QA/QC | Min, Max, rate | Min, Max, rate |
| Minimum cases | 250 | 250 |
| Updatable | Yes | Not yet |
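The QA/QC row ("Min, Max, rate") and the 250-case minimum could be applied along the following lines. This is only a sketch: it assumes "rate" means the absolute change from the previous valid hour, and the thresholds passed to the function are placeholders, not values from the project.

```python
MIN_CASES = 250  # minimum number of training cases, from the table above


def qc_flags(series, vmin, vmax, max_rate):
    """Return True for each hourly value that passes the Min, Max and rate checks.

    Assumption: "rate" is the absolute change from the previous valid
    hour; the slides do not define it.
    """
    flags = []
    prev = None
    for v in series:
        ok = v is not None and vmin <= v <= vmax
        if ok and prev is not None and abs(v - prev) > max_rate:
            ok = False  # jump from the last accepted value is too large
        flags.append(ok)
        if ok:
            prev = v
    return flags


def has_enough_cases(flags, minimum=MIN_CASES):
    """True when the number of accepted cases reaches the minimum."""
    return sum(flags) >= minimum
```

For example, `qc_flags([30, 32, 500, 33], vmin=0, vmax=300, max_rate=40)` rejects only the third value, since 500 exceeds the hypothetical maximum.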
Flow Diagram for Leaps BIC Selection
1. All Predictors
2. Leaps Forward Selection, reducing to 35 predictors
3. Data subset with 35 predictors
4. Leaps Exhaustive Search over model sizes (Model Size 3, Model Size 4, Model Size 11)
5. Minimum BIC
6. Optimized Predictors
• Leaps package in the R statistical programming language
• Uses a combination of Forward and Exhaustive Search
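The two-stage selection in the flow diagram can be sketched in plain Python. This is not the R leaps package and not the project's code. It is a small stand-in that uses a greedy forward pass to build a shortlist, then an exhaustive search over subsets of the shortlist scored by one common form of BIC, n·ln(RSS/n) + k·ln(n). The function names, the synthetic data, and the small shortlist and model sizes are assumptions chosen so the example runs quickly.

```python
import itertools
import math
import random


def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on `columns` plus an intercept.

    Solves the normal equations by Gaussian elimination with pivoting;
    it will fail on exactly collinear predictors.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][s] for i in range(n)) for s in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(k + 1, p):
            f = A[r][k] / A[k][k]
            for s in range(k, p + 1):
                A[r][s] -= f * A[k][s]
    b = [0.0] * p
    for k in reversed(range(p)):
        b[k] = (A[k][p] - sum(A[k][s] * b[s] for s in range(k + 1, p))) / A[k][k]
    return sum((y[i] - sum(X[i][j] * b[j] for j in range(p))) ** 2 for i in range(n))


def bic(columns, y):
    """BIC up to an additive constant, counting slopes plus intercept."""
    n = len(y)
    k = len(columns) + 1
    return n * math.log(rss(columns, y) / n) + k * math.log(n)


def forward_select(X, y, keep):
    """Greedy forward selection: indices of `keep` predictors, in order added."""
    chosen = []
    while len(chosen) < keep:
        rest = [j for j in range(len(X)) if j not in chosen]
        best = min(rest, key=lambda j: rss([X[c] for c in chosen + [j]], y))
        chosen.append(best)
    return chosen


def min_bic_subset(X, y, candidates, sizes):
    """Exhaustive search over subsets of `candidates` for each model size."""
    best = None
    for size in sizes:
        for subset in itertools.combinations(candidates, size):
            score = bic([X[j] for j in subset], y)
            if best is None or score < best[0]:
                best = (score, subset)
    return best


# Synthetic demo: 8 predictor columns, only columns 0 and 3 carry signal.
random.seed(0)
n = 200
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(8)]
y = [2 * X[0][i] - 3 * X[3][i] + random.gauss(0, 0.1) for i in range(n)]
shortlist = forward_select(X, y, keep=5)
score, subset = min_bic_subset(X, y, shortlist, sizes=range(1, 4))
print(sorted(subset))
```

The slides shortlist 35 predictors and search model sizes up to 11. A brute-force loop like this one would be far too slow at that scale, which is one reason to use a dedicated package with a more efficient search.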
MLR for O3: with Persistence
• Higher bias
• No updating
Test data from May 30, 2011 to Oct 28th, 2011
MLR for O3: with Persistence
• Reduced bias
• Lower RMSE
MLR for O3: Persistence + Antecedents
The value of antecedents
Bayesian Neural Network (BNN)
| Categories | Details | Comments |
|---|---|---|
| Predictor Selection | Stepwise Regression ensemble + frequency analysis | Select any predictor that occurs more than 20% of the time |
| Optimization | Weights by Bayesian Theory | Full use of training data |
| Scaling | Yes | |
| Seasonality | 1 Season | |
| Code | Matlab | R code exists to issue forecasts |
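The frequency analysis used for BNN predictor selection can be illustrated with a short sketch. It assumes each ensemble member is the set of predictors chosen by one stepwise regression run; how the runs were generated is not stated in the slides, and the predictor names in the usage note are hypothetical.

```python
from collections import Counter


def frequent_predictors(selections, min_fraction=0.20):
    """Keep predictors chosen in more than `min_fraction` of the ensemble runs.

    `selections` is a list with one collection of predictor names per run.
    """
    counts = Counter(name for chosen in selections for name in set(chosen))
    n_runs = len(selections)
    return sorted(name for name, c in counts.items() if c / n_runs > min_fraction)
```

With five runs that select `{"T", "O3"}`, `{"T"}`, `{"T", "wind"}`, `{"O3", "T"}` and `{"T", "NO2"}`, only `T` and `O3` exceed the 20% threshold, because `wind` and `NO2` each appear in exactly one run out of five.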
BNN vs MLR: Persistence + Antecedents
• Improved R² and lower RMSE
Impact of Data Loss on Predictor Selection
[Figure panels: MLR w/ Pers. + Ante.; MLR w/ Pers.; BNN w/ Pers. + Ante.; Availability of antecedent preds.]
• Loss of training data due to a model's choice of antecedent predictors
*Last column indicates data availability for a given model (black space = loss of data)
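The data-loss effect can be quantified as the fraction of training cases in which every predictor chosen by a model is present. The sketch below assumes training cases are stored as dictionaries with `None` for missing values. The predictor names are hypothetical.

```python
def complete_case_fraction(rows, predictors):
    """Fraction of training rows where every listed predictor is present."""
    usable = sum(all(row.get(p) is not None for p in predictors) for row in rows)
    return usable / len(rows)


rows = [
    {"pers": 1.0, "ante_lag24": 2.0},
    {"pers": 1.5, "ante_lag24": None},  # antecedent value missing
    {"pers": 2.0, "ante_lag24": 3.0},
    {"pers": None, "ante_lag24": 1.0},  # persistence value missing
]
print(complete_case_fraction(rows, ["pers"]))
print(complete_case_fraction(rows, ["pers", "ante_lag24"]))
```

Every predictor a model adds can only lower this fraction or leave it unchanged, so a model that selects many antecedent predictors may train on noticeably fewer cases than a persistence-only model.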
Generalized Boosted Regression Model (GBM)
[Figure: boosting schematic. Panels m=1, m=2, m=M plot fitted values against obs. Starting from the training data, the fit is built up as Basis 1, then Basis 1 + Basis 2, up to Basis 1 + ... + Basis M.]
• Iterative regression method
• Minimizes a certain loss function
• Applicable to many techniques [MLR, CART, SVR, …]
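The iterative idea can be shown with a minimal boosting loop. The sketch assumes squared-error loss and single-split regression stumps as the basis functions. It illustrates the general technique rather than the GBM configuration used in the project, and all names and settings in it are illustrative.

```python
import math


def fit_stump(x_rows, residual):
    """Best single-split stump (feature, threshold, left mean, right mean)."""
    best = None
    for j in range(len(x_rows[0])):
        values = sorted(set(row[j] for row in x_rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(x_rows, residual) if row[j] <= thr]
            right = [r for row, r in zip(x_rows, residual) if row[j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    return best[1:]


def boost(x_rows, y, n_iter=100, shrinkage=0.1):
    """Fit stumps to the current residuals and add them with a small step."""
    f0 = sum(y) / len(y)  # start from the mean of the observations
    fitted = [f0] * len(y)
    stumps = []
    for _ in range(n_iter):
        # For squared-error loss the negative gradient is the residual.
        residual = [yi - fi for yi, fi in zip(y, fitted)]
        j, thr, ml, mr = fit_stump(x_rows, residual)
        stumps.append((j, thr, ml, mr))
        fitted = [fi + shrinkage * (ml if row[j] <= thr else mr)
                  for row, fi in zip(x_rows, fitted)]
    return f0, shrinkage, stumps


def predict(model, row):
    """Sum the initial value and the shrunken contribution of every stump."""
    f0, shrinkage, stumps = model
    return f0 + shrinkage * sum(ml if row[j] <= thr else mr
                                for j, thr, ml, mr in stumps)


# Synthetic demo: one predictor, non-linear target.
xs = [[i / 10.0] for i in range(40)]
ys = [math.sin(row[0]) for row in xs]
model = boost(xs, ys)
```

A smaller shrinkage value takes smaller steps per iteration and therefore generally needs more iterations to reach the same fit.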
Verification Results for Above 75th Percentile cases
• Shrinkage: 0.001
• Iterations: 6000
• Interactions: 9
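Verification restricted to the upper quartile can be sketched as follows. The example assumes the 75th percentile is computed from the observations in the verification sample; the slides do not say which distribution defines the threshold. The function name and returned keys are hypothetical.

```python
import math
import statistics


def verify_upper_quartile(obs, fcst):
    """Bias and RMSE restricted to cases with obs above the 75th percentile."""
    threshold = statistics.quantiles(obs, n=4)[2]  # third quartile of the observations
    pairs = [(o, f) for o, f in zip(obs, fcst) if o > threshold]
    errors = [f - o for o, f in pairs]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"threshold": threshold, "n_cases": len(pairs), "bias": bias, "rmse": rmse}
```

Scoring only the high-concentration cases highlights skill during air quality episodes, which overall statistics can hide because most hours are low-concentration.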
Lessons Learned
• Quantile Thresholds for Verification
• Minimum BIC method based on “leaps” module in R
• Antecedent predictors are a double-edged sword
– Add skill but increase data reliability issues
– Need a good backup model
• GBM and BNN show promise
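One way to read the "good backup model" point is as a fallback rule: use the model with antecedent predictors when all of its inputs are available, otherwise fall back to a simpler model. The sketch below assumes linear models stored as tuples. The predictor names and coefficients in the usage note are made up for illustration.

```python
def forecast_with_backup(features, primary, backup):
    """Use `primary` when all of its inputs are present, else `backup`.

    Each model is a (predictor_names, coefficients, intercept) tuple.
    Returns None when neither model can be evaluated.
    """
    for names, coefs, intercept in (primary, backup):
        values = [features.get(name) for name in names]
        if all(v is not None for v in values):
            return intercept + sum(c * v for c, v in zip(coefs, values))
    return None
```

For example, with a hypothetical primary model on `("model_o3", "obs_hr00", "max_lag24")` and a backup on `("model_o3", "obs_hr00")`, a case with a missing `max_lag24` value is forecast by the backup instead of being dropped.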
Path Forward
• Address data loss issues in predictor selection and training stages
– Fill in missing data
– Account for data reliability in predictor selection
• Continue testing boosting and forecast calibration methods above case bias
• Test new predictors (observed meteorology, trajectories, new GEM-MACH fields)
• New statistical techniques (Recursive NN, Fuzzy Logic)
• Prototypes for Summer 2012
Thank You
MLR for NO2: Persistence + Antecedents
Pers. = Ante.
MLR for PM2.5: Persistence + Antecedents
Antecedent for PM2.5