RStan: Efficient MCMC in R - University of Warwick · Generated quantities (optional) generated...

RStan: Efficient MCMC in RJack Jewson

Warwick R user group: 2 March 2017

Problem

You're doing a statistical analysis:

Want to impose some prior information.

Ultimately you care about prediction.

So you decide to be Bayesian.

However… you don't have conjugate priors.

You could spend ages coding up a Metropolis-Hastings, optimising acceptancerates and proposals.

Or you could use RStan!

Original paper Alder and Wainwright (1959) or see Neal and others (2011) for aneasier read

HMC is a method to produce proposals for a MH algorithm that are acceptedwith high probability.

Rather than have a proposal distribution we appeal to Hamiltonian dynamics.

Consider the target distribution as an inverted ice rink:

Give the particle some momentum.

It slides around the ice rink spending most time where the density is high.

Taking snap shots of this trajectory gives a proposal sample for theposterior.

We then correct using Metropolis-Hastings.

HMC, like RWMH, requires some tuning, the number and size of the leapfrogsteps.

The "No-U-Turn Sampler" or NUTs (Hoffman and Gelman (2014)), optimisesthese adaptively.

Too small number and size of steps leads to RW type behaviour while too bigthe trajectory starts to come back on itself.

NUTS builds up a set of likely candidate points and stops immediately when atrajectory starts to come back on itself.

Advantages of Stan

Can produce high dimensional proposals that are accepted with high probabilitywithout having to spend time tuning.

Has inbuilt diagnostics to analyse the MCMC output.

Built in c++ so runs quickly but outputs to R.

Example

What to build a Bayesian linear regression model using LASSO shrinkage.

Build a Stan model:

see example bayes_LASSO.stan model file.

Data: , , , , prior parameters, hyper-parameters

Parameters: ,

Model: Gaussian likelihood, Laplacian and Gamma priors.

Output: Posterior samples, posterior predictive samples.

- n p Y X

- β σ2

data { int<lower=0> n; int<lower=0> p; matrix[n,p+1] X; vector[n] y; real<lower=0> a; real<lower=0> b; real<lower=0> w; }

Parameters

parameters { vector[p+1] beta; real<lower=0> sigma; }

Transformed parameters (optional)

transformed parameters { vector[n] linpred; linpred = X*beta; }

or without vectorisation,

model { beta ~ double_exponential(0,w); sigma ~ gamma(a,b); y~ normal(linpred,sigma); }

for(i in 1:n){ y[i]~normal(X[i,]*beta,sigma); }

Generated quantities (optional)

generated quantities { vector[n] y_predict; for(i in 1:n){ y_predict[i] = normal_rng(linpred[i],sigma); } }

evaluates this code once for every element of the posterior sample.·

Running in R

library(rstan) setwd("C:/Users/jack/Documents/OxWaSP_Yr2/Presentations") data<‐list(n=102,p=3,X=cbind(Prestige$education, Prestige$women, Prestige$prestige,rep(1,102)),y=Prestige$income,a=10,b=10,w=100) chain1 <‐ stan(file="bayes_LASSO.stan",data=data,iter=50000, chains=1, cores=1) params<‐extract(chain1)

Runs 25000 warmups and 25000 samples in 3.5 seconds.

Compiles c++ code the first time so that may take longer.

Plotting posterior distributions

par(mfrow=c(1,2)) plot(density(params$beta[,1]),xlab="beta1",ylab="Density",main="") plot(density(params$beta[,2]),xlab="beta2",ylab="Density",main="")

and predictive distributions

par(mfrow=c(1,2)) plot(density(params$y_predict[,1]),xlab="Income",ylab="Density",main="") plot(density(params$y_predict[,100]),xlab="Income",ylab="Density",main="")

Chain Diagnostics

sampler_params <‐ get_sampler_params(chain1, inc_warmup = FALSE) sampler_params[[1]][1:5,]

## accept_stat__ stepsize__ treedepth__ n_leapfrog__ divergent__ ## [1,] 0.9803107 0.1240111 4 15 0 ## [2,] 0.9025821 0.1240111 4 31 0 ## [3,] 0.9662435 0.1240111 5 31 0 ## [4,] 0.8553821 0.1240111 5 31 0 ## [5,] 0.8977977 0.1240111 5 31 0 ## energy__ ## [1,] 6588.685 ## [2,] 6587.883 ## [3,] 6589.927 ## [4,] 6591.606 ## [5,] 6589.096

Chain Diagnostics

traceplot(chain1,pars="beta" )

Chain Diagnostics

pairs(chain1,pars="beta")

more Chain Diagnostics

Stan can also extract various other diagnostics from the chain

for more information see Rstan Diagnostic Plots

Credibility intervals.

effective sample size and the Markov chain squared error.

Comparison plots between the values of the chain and various chain properties,log-likelihood, acceptance rate and step size.

When Stan goes wrong

"Divergent transitions after warmup"

Means Stan is taking steps that are too big.

Can fix by manually increasing the desired average acceptance probably,adapt_delta, above it's default of 0.8

chain1 <‐ stan(file="bayes_LASSO.stan",data=data,iter=50000, chains=1, cores=1,control = list(adapt_delta = 0.99, max_treedepth = 15))

This will slow your chain down but may result in a better sample

see http://mc-stan.org/misc/warnings.html for more info.

Self-made function

Stan also has compatibility with self-made function.

Useful if your prior or likelihood is not standard.

model { beta ~ double_exponential(0,w); sigma ~ gamma(a,b); for(i in 1:n){ increment_log_prob(‐0.5*fabs(1‐(exp(normal_log(y[i],linpred[i], sigma))/y_dense[i]))); } }

Conclusion

Don't waste your time coding and tuning RWMH!

Stan will run faster, is automatically tuned and should produce a superiorsamples.

Inbuilt functions make analysing your chain easy.

For more information

Stan manual: http://mc-stan.org/documentation/ (only 601 pages long).

Google groups: http://mc-stan.org/community/.

R-package documentation: https://cran.r-project.org/web/packages/rstan/index.html.

References

Alder, Berni J, and T E Wainwright. 1959. “Studies in Molecular Dynamics. I. GeneralMethod.” The Journal of Chemical Physics 31 (2). AIP: 459–66.

Hoffman, Matthew D, and Andrew Gelman. 2014. “The No-U-Turn Sampler:Adaptively Setting Path Lengths in Hamiltonian Monte Carlo.” Journal of MachineLearning Research 15 (1): 1593–1623.

Neal, Radford M, and others. 2011. “MCMC Using Hamiltonian Dynamics.”Handbook of Markov Chain Monte Carlo 2: 113–62.

RStan: Efficient MCMC in R - University of Warwick · Generated quantities (optional) generated...

Documents

Transcript of RStan: Efficient MCMC in R - University of Warwick · Generated quantities (optional) generated...

Goldman Sachs 2016 Global Energy Conference - crc.com€¦ · GS Energy Conference 0116 Cautionary Statements Regarding Hydrocarbon Quantities We have provided internally generated

Massachusetts Institute of Technologydedekind.mit.edu/~rstan/pubs/pubfiles/72.pdf · Created Date: 11/16/2006 10:59:12 PM

An Equivalence Relation on the Symmetric Group and ...rstan/papers/multfree.pdfRichard P. Stanley We consider the equivalence relation ∼ on the symmetric group S n generated by the

Estimated Quantities -- Letting August 19, 2016 Quantities ... · Estimated Quantities -- Letting August 19, 2016 Quantities shown are for information only and should not be used

Introduction - WordPress.com · 2020. 1. 8. · MICROBES IN SEWAGE TREATMENT • Sewage is the municipal/urban waste water i.e. large quantities of waste water are generated everyday

Electrical Quantities

Estimated Quantities -- Letting April 01, 2016 Quantities ...

Package ‘rstan’ · 2/20/2017 · Package ‘rstan’ December 28, 2016 Type Package Title R Interface to Stan Version 2.14.1 Date 2016-12-28 Description User-facing R functions

Physics Beyond 2000 Chapter 1 Kinematics Physical Quantities Fundamental quantities Derived quantities.

Concrete Quantities

Vector Quantities We will concern ourselves with two measurable quantities: Scalar quantities: physical quantities expressed in terms of a magnitude only.

Relative Invariants of Finite Groups Generated by ...rstan/pubs/pubfiles/32.pdfNote that in the course of the proof we have established the assertion RxG = fx . RG. I Problem. Classify

Stan and probabilistic programming RStan rstanarm and brms ... · Stan and probabilistic programming RStan rstanarm and brms Dynamic HMC used in Stan MCMC convergence diagnostics

Vectors 6001. INTRODUCTION SCALAR QUANTITIES: _______________________________________________________ VECTOR QUANTITIES: ________________________________________________________.

Long Beach ANALYST DAY & SITE TOURAnalyst Presentation – Day 1 CAUTIONARY STATEMENTS REGARDING HYDROCARBON QUANTITIES We have provided internally generated estimates for proved reserves

Establishing Traceability for Quantities Derived from Multiple Traceable Quantities

Guidelines for District Health Managers · injection activities remains problematic as significant quantities of disposable or auto-disable syringes and needles are generated, for

Establishing Traceability for Quantities Derived from ... · Quantities Derived from Multiple Traceable Quantities Jian Liu and Alberto Campillo Abstract: Some quantities are traceable

Rstan Manual

* INTRODUCTION Physical quantities Base quantities Derived quantities Prefixes Scientific notation (standard form Scalar quantities Vector quantities Dimensional.

Vectors 6001. INTRODUCTION SCALAR QUANTITIES: _ VECTOR QUANTITIES: __.