It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena


Page 1: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

J. McLean Sloughter

This work was supported by the DoD Multidisciplinary University Research Initiative (MURI) program administered by the Office of Naval Research under Grant N00014-01-10745.

Based on research being conducted under Adrian E. Raftery and Tilmann Gneiting

Page 2: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Ensemble Forecasting

• Single forecast model is run multiple times with different initial conditions

• Ensemble mean tends to outperform individual members

• Spread-skill relationship: spread of forecasts tends to be correlated with magnitude of error

• Model is underdispersive (not calibrated)

Page 3: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Spread/Skill Plot

Spread = max forecast - min forecast

Skill = abs(forecast mean - observed)
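The two definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the five-member ensemble are made up for the example, not taken from the talk.

```python
from statistics import mean

def spread(forecasts):
    """Ensemble spread: largest member forecast minus smallest."""
    return max(forecasts) - min(forecasts)

def skill(forecasts, observed):
    """Absolute error of the ensemble mean against the observation."""
    return abs(mean(forecasts) - observed)

# Hypothetical ensemble of five member forecasts and one observation.
members = [3.0, 4.5, 5.0, 6.5, 4.0]
obs = 7.0
print(spread(members), skill(members, obs))
```

Plotting spread against skill over many forecast cases gives the kind of spread/skill plot shown on the slide.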

Page 4: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Ensemble Member Forecasts

48-hour forecasts for precipitation at 5pm Oct 20, 2003

From http://www.atmos.washington.edu/~emm5rt/ensemble.cgi

Page 5: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Bayesian Model Averaging

• Weighted average of multiple models

• Weights determined by posterior probabilities of models

• Posterior probabilities given by how well each member fits the training data

• Weights, then, give an indication of the relative usefulness of ensemble members

Page 6: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

BMA for ensembles

$$p(Y \mid f_1, \ldots, f_k) = \sum_{i=1}^{k} w_i \, p(Y \mid f_i)$$

where $f_i$ is the forecast from member i, $w_i$ is the weight associated with member i, and $p(Y \mid f_i)$ is the estimated distribution function for Y given member i

Picture taken from Raftery, Balabdaoui, Gneiting, and Polakowski (2003), “Calibrated Mesoscale Short-Range Ensemble Forecasting Using Bayesian Model Averaging.”
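As a rough illustration of the BMA mixture, the sketch below evaluates the predictive density as a weighted sum of member densities. It assumes each member density is a normal centred on that member's forecast; the weights, forecasts and standard deviations are hypothetical values chosen for the example.

```python
import math

def normal_pdf(y, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def bma_density(y, forecasts, weights, sds):
    """BMA predictive density: weighted sum of the member densities at y."""
    return sum(w * normal_pdf(y, f, s) for f, w, s in zip(forecasts, weights, sds))

def bma_mean(forecasts, weights):
    """Mean of the BMA mixture when each component is centred on its forecast."""
    return sum(w * f for f, w in zip(forecasts, weights))

# Hypothetical three-member ensemble; weights must sum to one.
forecasts = [10.0, 12.0, 15.0]
weights = [0.5, 0.3, 0.2]
sds = [2.0, 2.0, 2.0]
print(bma_density(11.0, forecasts, weights, sds), bma_mean(forecasts, weights))
```

Because the weights sum to one and each component is a proper density, the mixture is itself a proper density.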

Page 7: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

The Trouble With Our Models

• Forecasts never predict zero – artifact of differential equations used to create forecasts

• Observed wind speed is often zero

• Wind speed, even ignoring zeroes, is not normally distributed

Page 8: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

What Wind Speed Looks Like

Wind Speed Histogram – several exceptionally high values make it harder to see clearly

Wind Speed Histogram truncated to only go up to fifty – there is a spike at zero

Wind Speed Histogram without zeroes

Page 9: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

What Forecasts Look Like

Forecast histogram on left (all eight forecasts have similar histograms) and observed histogram on right – even after removing zeroes from the actual histogram, the shape is still not quite right – actual is more sharply skewed.

Page 10: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

The Problem With Reality

• As we saw, the histograms for forecasts do not match the histogram for observations very well

• Maximal observed value is 124.000

• Maximal values for each model:

AVN CMCG Eta GASP JMA NGPS TCWB UKMO

44.591 54.566 44.722 51.732 51.829 45.383 45.125 49.243

Page 11: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

More Trouble With Reality

AVN CMCG Eta GASP JMA NGPS TCWB UKMO Y

AVN 1.000 0.826 0.845 0.797 0.795 0.783 0.731 0.840 0.417

CMCG 0.826 1.000 0.822 0.807 0.819 0.797 0.770 0.805 0.402

Eta 0.845 0.822 1.000 0.789 0.793 0.781 0.747 0.800 0.406

GASP 0.797 0.807 0.789 1.000 0.788 0.779 0.747 0.786 0.394

JMA 0.795 0.819 0.793 0.788 1.000 0.788 0.756 0.783 0.400

NGPS 0.783 0.797 0.781 0.779 0.788 1.000 0.757 0.779 0.388

TCWB 0.731 0.770 0.747 0.747 0.756 0.757 1.000 0.721 0.384

UKMO 0.840 0.805 0.800 0.786 0.783 0.779 0.721 1.000 0.415

Y 0.417 0.402 0.406 0.394 0.400 0.388 0.384 0.415 1.000

Pairwise correlations – Y is observed value, others are the various forecasts

Page 12: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena
Page 13: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Time Trends

Left – average observed wind speed per day

Right – same, but smoothed to average over 3-week interval

Page 14: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Time Trend Troubles

• Higher winds in summer, lower in winter (note that this appears to be an odd trait of the northwest)

• Need model to reflect seasonal patterns

• Would still like to just have a simple model based on forecasts

Page 15: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

A Recap Of All The Things That Make Life Interesting and Miserable

• Distribution not normal

• Time

• Forecasts not very highly correlated with observations

• Zeroes

Page 16: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

What to do about distributions?

• Model using another distribution – Gammas and Weibulls are popular models for windspeed

• Can apply a transformation to the data

Left: Root of forecast windspeed from model 1

Right: Root of observed windspeed (excluding zeroes for easier visualization)

Page 17: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

What to do about time?

• Rather than using all available data as training set, only train on recent data

• Trade-off between lower variance of estimates with more data, and better picture of current trends with less data

• Previous research (on temperature and pressure) indicates 40-day window of training data seems to give a good balance
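A sliding training window of this kind can be sketched as follows. The helper name and the choice to index days by integer are assumptions for the example; any gap between the end of training data and the forecast date (for instance because of forecast lead time) is ignored here.

```python
def training_indices(t, window=40):
    """Indices of the `window` days immediately before day t (clipped at day 0)."""
    start = max(0, t - window)
    return list(range(start, t))

# Days used to fit the model that forecasts day 100 with a 40-day window.
print(training_indices(100)[0], training_indices(100)[-1])
```

A shorter window tracks seasonal shifts more closely; a longer one gives steadier parameter estimates.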

Page 18: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

What to do about the forecasts?

• We’re not making the forecasts, just using them

• We can apply a bias-correction by performing a linear regression of observed on forecasts (this is commonly done in forecasting already)

• We can see from our weight terms which models perform better and which perform worse, and report that to the folks making the forecasts

• We can hope that the science of meteorology continues to move forward as it has thus far
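The bias-correction step can be sketched as an ordinary least-squares regression of observations on one member's forecasts. The function names and the toy data are hypothetical; the talk does not specify how the regression was actually run.

```python
def fit_bias_correction(forecasts, observed):
    """Least-squares intercept and slope for regressing observed on forecasts."""
    n = len(forecasts)
    f_bar = sum(forecasts) / n
    o_bar = sum(observed) / n
    sxx = sum((f - f_bar) ** 2 for f in forecasts)
    sxy = sum((f - f_bar) * (o - o_bar) for f, o in zip(forecasts, observed))
    slope = sxy / sxx
    intercept = o_bar - slope * f_bar
    return intercept, slope

def bias_correct(forecast, intercept, slope):
    """Apply the fitted correction to a new forecast."""
    return intercept + slope * forecast

# Toy training data in which the member consistently over-forecasts.
train_forecasts = [2.0, 4.0, 6.0, 8.0]
train_observed = [2.0, 3.0, 4.0, 5.0]
a, b = fit_bias_correction(train_forecasts, train_observed)
print(a, b, bias_correct(10.0, a, b))
```

Each ensemble member would get its own intercept and slope, refitted on the current training window.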

Page 19: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

What to do about zeroes?

• We need a model that includes a point mass at zero

• Two main possibilities:

• We could model a weighted average of eight distributions, each of which is a normal plus a point mass at zero

• Or, we could first model probability of zero or non-zero, then, conditioned on non-zero, the weighted average of eight normals

• We will pursue the second option for now
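Written out, the two possibilities might look like the following. The notation is an assumption made for illustration: $\delta_0$ is a point mass at zero, $\phi(\cdot\,; \mu, \sigma^2)$ is a normal density, $\mu_i$ is the mean attached to member i, and $\pi$ (or $\pi_i$) is a probability of observing zero.

```latex
% Option 1: each of the eight components carries its own point mass at zero
p(Y \mid f_1, \ldots, f_8)
  = \sum_{i=1}^{8} w_i \left[ \pi_i \, \delta_0(Y)
      + (1 - \pi_i) \, \phi(Y; \mu_i, \sigma_i^2) \right]

% Option 2: one zero/non-zero split, then a mixture of normals given non-zero
p(Y \mid f_1, \ldots, f_8)
  = \pi \, \delta_0(Y)
      + (1 - \pi) \sum_{i=1}^{8} w_i \, \phi(Y; \mu_i, \sigma_i^2),
  \qquad \pi = P(Y = 0 \mid f_1, \ldots, f_8)
```

The second form separates the problem into two pieces that can be fitted one after the other, which is the route taken here.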

Page 20: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

So, let’s get to it then

• Probability of zero can be modeled by a logistic regression on the eight forecasts

• Then, the weighted average of normals can be determined by the EM algorithm

• Assume each normal has the bias-corrected forecast as its mean, and has a constant variance

• Alternate between predicting membership based on weights and variances, and weights and variances based on membership

• Make sure to also include probability of being non-zero when evaluating our functions
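The alternation described above can be sketched as a small EM routine for a mixture of normals whose means are fixed at the (bias-corrected) forecasts, so only the weights and standard deviations are estimated. This is a simplified illustration under stated assumptions, not the implementation used in the talk: it handles only the nonzero observations, uses a two-component synthetic example, and the starting values and stopping rule are arbitrary choices.

```python
import math
import random

def normal_pdf(y, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def em_fixed_means(ys, means, max_iter=200, tol=1e-6):
    """Estimate mixture weights and per-component sds; component means are given.

    ys    : nonzero observations
    means : for each observation, one forecast (component mean) per member
    """
    n, k = len(ys), len(means[0])
    weights = [1.0 / k] * k
    # Start each sd at the root-mean-square residual for that component.
    sds = [math.sqrt(sum((y - mu[j]) ** 2 for y, mu in zip(ys, means)) / n)
           for j in range(k)]
    prev_loglik = -math.inf
    for _ in range(max_iter):
        # E-step: membership probabilities given current weights and sds.
        resp, loglik = [], 0.0
        for y, mu in zip(ys, means):
            dens = [w * normal_pdf(y, m, s) for w, m, s in zip(weights, mu, sds)]
            total = sum(dens)
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: weights and sds given the membership probabilities.
        for j in range(k):
            rj = sum(r[j] for r in resp)
            weights[j] = rj / n
            sds[j] = math.sqrt(
                sum(r[j] * (y - mu[j]) ** 2 for r, y, mu in zip(resp, ys, means)) / rj)
        if loglik - prev_loglik < tol:
            break
        prev_loglik = loglik
    return weights, sds

# Synthetic check: two members, true weights 0.6 / 0.4 and sds 1 / 3.
rng = random.Random(1)
means = [(rng.uniform(30, 50), rng.uniform(10, 20)) for _ in range(2000)]
ys = [rng.gauss(m1, 1.0) if rng.random() < 0.6 else rng.gauss(m2, 3.0)
      for m1, m2 in means]
weights, sds = em_fixed_means(ys, means)
print(weights, sds)
```

In the full model the likelihood would also carry the probability of being non-zero from the logistic regression, as the last bullet notes.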

Page 21: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Let’s try out a simple test case first

• Generate a sample of 100,000 ordered triples (x1, x2, y), x1 uniform over 30 to 50, x2 uniform over 10 to 20

• logistic regression coefficients of a=10, b1=-.2, b2=-.6

• with probability determined by logistic regression, y=0

• otherwise, with probability .6, y is normal around x1 with sd of 1, and with probability .4 is normal around x2 with sd of 3.14
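The test case can be generated along these lines. The sketch assumes the logistic regression gives the probability that y is zero, i.e. P(y = 0) = 1 / (1 + exp(-(a + b1·x1 + b2·x2))); the slide does not spell out the parameterization, so treat that as an assumption. The function name, seed and smaller sample size are illustrative.

```python
import math
import random

def simulate(n, seed=0, a=10.0, b1=-0.2, b2=-0.6):
    """Draw n triples (x1, x2, y) from the zero-inflated two-component mixture."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.uniform(30, 50)
        x2 = rng.uniform(10, 20)
        # Assumed parameterization: the logistic model gives P(y = 0).
        p_zero = 1.0 / (1.0 + math.exp(-(a + b1 * x1 + b2 * x2)))
        if rng.random() < p_zero:
            y = 0.0
        elif rng.random() < 0.6:
            y = rng.gauss(x1, 1.0)    # component 1: centred on x1, sd 1
        else:
            y = rng.gauss(x2, 3.14)   # component 2: centred on x2, sd as on the slide
        rows.append((x1, x2, y))
    return rows

rows = simulate(20000)
frac_zero = sum(1 for _, _, y in rows if y == 0.0) / len(rows)
print(len(rows), frac_zero)
```

Under this parameterization zeroes are rare, since the linear predictor stays between -12 and -2 over the stated ranges of x1 and x2.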

Page 22: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

How did we do?

• Predicted logistic coefficients of a=10.257, b1=-0.206, b2=-0.600

• weights of 0.598 and 0.402

• sds of 1.000 and 3.414

• Seems to be able to model pretty well under ideal artificial conditions

• So now let’s try the real thing

Page 23: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

How do we do?

RMSE from creating forecasts for 33 days, using 40-day training periods – black is without modeling zeroes, red is with modeling zeroes

Page 24: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Iterations to convergence

Again, black is without modeling zeroes, red is with

Page 25: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

How do we feel about this?

• Including modeling of zeroes doesn’t appear to help our error much (p=0.4734), which is somewhat disappointing

• However, we get our model much faster (p=2.104e-08), which is a concern when having to do a lot of these

Page 26: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

What We Would Have Done Next Had We Not Been Distracted By More Pressing Matters

• Consider fitting different distributions rather than using a transformation

• Fit a model with point masses at zero for each individual component

• Try additional bias corrections

• Compare results for different training windows

• Investigate importance of starting values in EM algorithm

• Evaluate performances of prediction intervals rather than just prediction means

Page 27: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Three Months Later…

• In the distraction interim, precipitation data became available

• Precipitation forecasts tend to be better than wind forecasts

• Weather people tend to be more interested in precipitation forecasts

• And so, we have abandoned (for now) wind speed, and are looking at precipitation instead

Page 28: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

How is rain different?

• Distribution doesn’t look normal, even under transformations

• Models do predict zero, but we still see point masses at zero

Page 29: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Conditional Histograms

Observed given forecast from 1.5 to 4

Observed given forecast from 55 to 80

(in .01”)

Page 30: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Fitting Gammas

Plots: Shape Parameter, Rate Parameter, Coef. of Variation
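The slide does not say how the gammas were fitted. One simple possibility, shown here purely as an assumption, is the method of moments, which uses the facts that a gamma distribution has mean shape/rate, variance shape/rate², and coefficient of variation 1/sqrt(shape). The function name and the synthetic sample are hypothetical.

```python
import math
import random
from statistics import mean, pvariance

def fit_gamma_moments(xs):
    """Method-of-moments gamma fit: returns (shape, rate, coefficient of variation)."""
    m = mean(xs)
    v = pvariance(xs)
    shape = m * m / v
    rate = m / v
    cv = 1.0 / math.sqrt(shape)
    return shape, rate, cv

# Synthetic positive amounts from a gamma with shape 2 and scale 3 (rate 1/3).
rng = random.Random(0)
sample = [rng.gammavariate(2.0, 3.0) for _ in range(20000)]
shape, rate, cv = fit_gamma_moments(sample)
print(shape, rate, cv)
```

With this estimator the fitted mean, shape/rate, equals the sample mean by construction, which makes it a convenient baseline to compare against other fitting methods.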

Page 31: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Something’s Not Quite Right

Plots: Sample Mean, Estimated Mean

Page 32: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

At least something worked out nicely

Proportion of Zeroes

Page 33: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

And Now?

• First, find out what’s going wrong with our gamma fitting

• Then, try to come up with a way to do some sort of gamma fitting in the EM algorithm

• Then, look at all those things we wanted to look at before