The Calibrated Bayes Approach to Sample Survey
Inference
Roderick Little
Department of Biostatistics, University of Michigan
Associate Director for Research & Methodology, U.S. Census Bureau
Learning Objectives
1. Understand basic features of alternative modes of inference for sample survey data.
2. Understand the mechanics of Bayesian inference for finite population quantities under simple random sampling.
3. Understand the role of the sampling mechanism in sample surveys and how it is incorporated in a Calibrated Bayes analysis.
4. More specifically, understand how survey design features, such as weighting, stratification, post-stratification and clustering, enter into a Bayesian analysis of sample survey data.
5. Introduction to Bayesian tools for computing posterior distributions of finite population quantities.
Models for complex surveys 1: introduction
Acknowledgement and Disclaimer
• These slides are based in part on a short course on Bayesian methods in surveys presented by Dr. Trivellore Raghunathan and me at the 2010 Joint Statistical Meetings.
• While taking responsibility for errors, I'd like to acknowledge Dr. Raghunathan's major contributions to this material.
• Opinions are my own and not the official position of the U.S. Census Bureau.
Module 1: Introduction
• Distinguishing features of survey sample inference
• Alternative modes of survey inference
  – Design-based, superpopulation models, Bayes
• Calibrated Bayes
Distinctive features of survey inference
1. Primary focus on descriptive finite population quantities, like overall or subgroup means or totals
  – Bayes – which naturally concerns predictive distributions – is particularly suited to inference about such quantities, since they require predicting the values of variables for non-sampled items
  – This finite population perspective is useful even for analytic model parameters:
    $\theta$ = model parameter (meaningful only in the context of the model)
    $\theta(Y)$ = "estimate" of $\theta$ from fitting the model to the whole population (a finite population quantity, which exists regardless of the validity of the model)
    A good estimate of $\theta$ should be a good estimate of $\theta(Y)$ (if not, then what's being estimated?)
Distinctive features of survey inference
2. Analysis needs to account for "complex" sampling design features such as stratification, differential probabilities of selection, and multistage sampling.
• Samplers reject theoretical arguments suggesting such design features can be ignored if the model is correctly specified.
• Models are always misspecified, and model answers are suspect even when model misspecification is not easily detected by model checks (Kish & Frankel 1974; Holt, Smith & Winter 1980; Hansen, Madow & Tepping 1983; Pfeffermann & Holmes 1985).
• Design features like clustering and stratification can and should be explicitly incorporated in the model to avoid sensitivity of inference to model misspecification.
Distinctive features of survey inference
3. A production environment that precludes detailed modeling.
• Careful modeling is often perceived as "too much work" in a production environment (e.g. Efron 1986).
• Some attention to model fit is needed to do any good statistics.
• "Off-the-shelf" Bayesian models can be developed that incorporate survey sample design features, and for a given problem the computation of the posterior distribution is prescriptive, via Bayes' Theorem.
• This aspect would be aided by a Bayesian software package focused on survey applications.
Distinctive features of survey inference
4. Antipathy towards methods/models that involve strong subjective elements or assumptions.
• Government agencies need to be viewed as objective and shielded from policy biases.
• Addressed by using models that make relatively weak assumptions, and noninformative priors that are dominated by the likelihood.
• The latter yields Bayesian inferences that are often similar to superpopulation modeling, with the usual differences in the interpretation of probability statements.
• Bayes provides superior inference in small samples (e.g. small area estimation).
Distinctive features of survey inference
5. Concern about the repeated sampling (frequentist) properties of the inference.
• Calibrated Bayes: models should be chosen to have good frequentist properties.
• This requires incorporating design features in the model (Little 2004, 2006).
Approaches to Survey Inference
• Design-based (randomization) inference
• Superpopulation modeling
  – Specifies a model conditional on fixed parameters
  – Frequentist inference based on repeated samples from the superpopulation and finite population (a hybrid approach)
• Bayesian modeling
  – Specifies a full probability model (prior distributions on fixed parameters)
  – Bayesian inference based on the posterior distribution of finite population quantities
  – I argue that this is the most satisfying approach
Design-Based Survey Inference
$Z = (Z_1, \ldots, Z_N)$ = design variables, known for the population
$Y = (Y_1, \ldots, Y_N)$ = population values, recorded only for the sample
$Q = Q(Y, Z)$ = target finite population quantity
$I = (I_1, \ldots, I_N)$ = sample inclusion indicators: $I_i = 1$ if unit $i$ is included in the sample, $I_i = 0$ otherwise
$Y_{inc}$ = part of $Y$ included in the survey; $Y_{exc}$ = the excluded part
$\hat{q} = \hat{q}(I, Y_{inc}, Z)$ = sample estimate of $Q$
$\hat{V} = \hat{V}(I, Y_{inc}, Z)$ = sample estimate of the variance of $\hat{q}$
$(\hat{q} - 1.96\sqrt{\hat{V}},\ \hat{q} + 1.96\sqrt{\hat{V}})$ = 95% confidence interval for $Q$
Note: here $I$ is the random variable; $(Y, Z)$ are fixed.
Random Sampling
• Random (probability) sampling is characterized by:
  – Every possible sample has a known chance of being selected
  – Every unit in the sample has a non-zero chance of being selected
  – In particular, for simple random sampling without replacement: "All possible samples of size n have the same chance of being selected"
$\{1, \ldots, N\}$ = set of units in the sample frame
$\Pr(I \mid Z) = \dfrac{n!(N-n)!}{N!}$ if $\sum_{i=1}^{N} I_i = n$; $0$ otherwise
$E(I_i \mid Z) = \Pr(I_i = 1 \mid Z) = n/N$
Example 1: Mean for a Simple Random Sample
$Q = \bar{Y} = \frac{1}{N}\sum_{i=1}^{N} y_i$ = population mean (a fixed quantity, not modeled)
$\hat{q} = \bar{y} = \sum_{i=1}^{N} I_i y_i / n$ = the sample mean ($I$ is the random variable)
Unbiased for $\bar{Y}$: $E_I\left(\sum_{i=1}^{N} I_i y_i / n\right) = \sum_{i=1}^{N} E_I(I_i)\, y_i / n = \sum_{i=1}^{N} (n/N)\, y_i / n = \bar{Y}$
$\mathrm{Var}_I(\bar{y}) = V = (1 - n/N)\, S^2 / n$, where $S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (y_i - \bar{Y})^2$ and $(1 - n/N)$ is the finite population correction
$\hat{V} = (1 - n/N)\, s^2 / n$, where $s^2 = \frac{1}{n-1}\sum_{i=1}^{N} I_i (y_i - \bar{y})^2$ = sample variance
$(\bar{y} - 1.96\sqrt{\hat{V}},\ \bar{y} + 1.96\sqrt{\hat{V}})$ = 95% confidence interval for $\bar{Y}$
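As a concrete sketch of these design-based formulas (Python is used for illustration throughout; the helper name `srs_mean_ci` is ours, not from the slides):

```python
import random
import statistics

def srs_mean_ci(y_sample, N, z=1.96):
    """Design-based inference for a population mean from an SRS:
    point estimate ybar, variance (1 - n/N) s^2 / n, and a z-interval."""
    n = len(y_sample)
    ybar = statistics.fmean(y_sample)
    s2 = statistics.variance(y_sample)       # sample variance s^2
    V = (1 - n / N) * s2 / n                 # includes the fpc (1 - n/N)
    half = z * V ** 0.5
    return ybar, V, (ybar - half, ybar + half)

# Synthetic population of N = 1000 values; SRS of n = 100
random.seed(7)
population = [random.gauss(50, 10) for _ in range(1000)]
sample = random.sample(population, 100)
ybar, V, ci = srs_mean_ci(sample, N=1000)
```

Note that the fpc shrinks the variance by 10% here, since a tenth of the population is sampled.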
Example 2: Horvitz-Thompson Estimator
$Q = T(Y) = Y_1 + \cdots + Y_N$ = population total
$\pi_i = E(I_i \mid Y) = \Pr(I_i = 1) > 0$ = inclusion probability
$\hat{t}_{HT} = \sum_{i=1}^{N} I_i Y_i / \pi_i$, with $E_I(\hat{t}_{HT}) = \sum_{i=1}^{N} E(I_i)\, Y_i / \pi_i = \sum_{i=1}^{N} Y_i = T$
$\hat{v}_{HT}$ = variance estimate, depends on the sample design
$(\hat{t}_{HT} - 1.96\sqrt{\hat{v}_{HT}},\ \hat{t}_{HT} + 1.96\sqrt{\hat{v}_{HT}})$ = 95% CI for $T$
• Pro: unbiased under minimal assumptions
• Cons:
  – variance estimator problematic for some designs (e.g. systematic sampling)
  – can have poor confidence coverage and be inefficient
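A sketch of the HT estimator under Poisson sampling (each unit included independently with probability $\pi_i$), a design under which the unbiasedness is easy to check by simulation; the population and probabilities below are invented for illustration:

```python
import random

def horvitz_thompson_total(y_sample, pi_sample):
    # HT estimate of the population total: sum of y_i / pi_i over the sample
    return sum(yi / p for yi, p in zip(y_sample, pi_sample))

rng = random.Random(3)
N = 2000
y = [rng.uniform(0, 10) for _ in range(N)]
pi = [0.05 + 0.25 * yi / 10 for yi in y]   # inclusion prob loosely tied to y
included = [i for i in range(N) if rng.random() < pi[i]]
t_hat = horvitz_thompson_total([y[i] for i in included],
                               [pi[i] for i in included])
true_total = sum(y)   # known here only because the population is synthetic
```

Averaged over repeated Poisson samples, `t_hat` centers on `true_total`; any single sample deviates with the design-dependent variance noted above.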
Role of Models in the Classical Approach
• Inference is not based on a model, but models are often used to motivate the choice of estimator. E.g.:
  – Regression model → regression estimator
  – Ratio model → ratio estimator
  – Generalized regression estimation: model estimates adjusted to protect against misspecification, e.g. HT estimation applied to residuals from the regression estimator (Cassel, Särndal and Wretman book)
• Estimates of standard error are then based on the randomization distribution
• This approach is design-based, model-assisted
Model-Based Approaches
• In our approach models are used as the basis for the entire inference: estimator, standard error, interval estimation
• This approach is more unified, but models need to be carefully tailored to features of the sample design such as stratification and clustering
• One might call this model-based, design-assisted
• Two variants:
  – Superpopulation modeling
  – Bayesian (full probability) modeling
• The common theme: "infer" or "predict" the non-sampled portion of the population conditional on the sample and model
Superpopulation Modeling
• Model distribution M: $Y \sim f(Y \mid Z, \theta)$, where $Z$ = design variables and $\theta$ = fixed parameters
• Predict the non-sampled values $Y_{exc}$:
  $\hat{y}_i = \hat{E}(y_i \mid z_i, \hat{\theta})$ = model prediction for unit $i$
  $q = Q(\tilde{Y})$, where $\tilde{y}_i = y_i$ if unit $i$ is sampled, $\hat{y}_i$ if not
  $v = \widehat{\mathrm{mse}}(q)$ over the distribution of $I$ and $M$
  $(q - 1.96\sqrt{v},\ q + 1.96\sqrt{v})$ = 95% CI for $Q$
• In the modeling approach, prediction of the non-sampled values is central.
• In the design-based approach, weighting is central: "the sample represents … units in the population".
Bayesian Modeling
• The Bayesian model adds a prior distribution for the parameters:
  $(Y, \theta) \sim p(\theta \mid Z)\, f(Y \mid Z, \theta)$, where $p(\theta \mid Z)$ = prior distribution
• In the superpopulation modeling approach, parameters are considered fixed and estimated; in the Bayesian approach, parameters are random and integrated out of the posterior distribution – this leads to better small-sample inference.
• Inference about a finite population quantity $Q(Y)$ is based on $p(Q(Y) \mid Y_{inc})$, the posterior predictive distribution of $Q(Y)$ given the sampled values.
• Inference about $\theta$ is based on the posterior distribution from Bayes' Theorem:
  $p(\theta \mid Z, Y_{inc}) \propto p(\theta \mid Z)\, L(\theta \mid Z, Y_{inc})$, where $L$ = likelihood
• Then
  $p(Q(Y) \mid Z, Y_{inc}) = \int p(Q(Y) \mid Z, Y_{inc}, \theta)\, p(\theta \mid Z, Y_{inc})\, d\theta$
  (integrates out the nuisance parameters $\theta$).
Bayesian Point Estimates
• A point estimate is often used as a single summary "best" value for the unknown Q
• Some choices are the mean, mode or median of the posterior distribution of Q
• For symmetric distributions an intuitive choice is the center of symmetry
• For asymmetric distributions the choice is less clear; it depends on the "loss" function
Models for complex surveys: simple random sampling
Bayesian Interval Estimation
• The Bayesian analog of the confidence interval is the posterior probability or credibility interval
  – Large samples: posterior mean ± z × posterior standard error
  – Or an interval based on lower and upper percentiles of the posterior distribution – 2.5% to 97.5% for a 95% interval
  – Optimal: fix the coverage rate $1 - \alpha$ in advance and determine the highest posterior density region C that includes the most likely values of Q totaling $1 - \alpha$ posterior probability
Bayes for Population Quantities Q
• Inferences about Q are conveniently obtained by first conditioning on $\theta$ and then averaging over the posterior of $\theta$. In particular, the posterior mean is
  $E(Q \mid Y_{inc}) = E\{E(Q \mid Y_{inc}, \theta) \mid Y_{inc}\}$
and the posterior variance is
  $\mathrm{Var}(Q \mid Y_{inc}) = E\{\mathrm{Var}(Q \mid Y_{inc}, \theta) \mid Y_{inc}\} + \mathrm{Var}\{E(Q \mid Y_{inc}, \theta) \mid Y_{inc}\}$
• The value of this technique will become clear in applications
• Finite population corrections are automatically obtained as differences between the posterior variances of $Q$ and of the superpopulation mean
• Inferences based on the full posterior distribution are useful in small samples (e.g. they provide "t corrections")
Simulating Draws from the Posterior Distribution
• For many problems, particularly with high-dimensional $\theta$, it is often easier to draw values from the posterior distribution and base inferences on these draws
• For example, if $(\theta_1^{(d)} : d = 1, \ldots, D)$ is a set of draws from the posterior distribution of a scalar parameter $\theta_1$, then:
  $\hat{\theta}_1 = \frac{1}{D}\sum_{d=1}^{D} \theta_1^{(d)}$ approximates the posterior mean
  $s^2 = \frac{1}{D-1}\sum_{d=1}^{D} (\theta_1^{(d)} - \hat{\theta}_1)^2$ approximates the posterior variance
  $\hat{\theta}_1 \pm 1.96 s$ (or the 2.5th to 97.5th percentiles of the draws) approximates a 95% posterior credibility interval for $\theta_1$
• Given a draw $\theta^{(d)}$ of $\theta$, it is usually easy to draw the non-sampled values of the data, and hence finite population quantities
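The draw-based approximations on this slide can be sketched as follows (the `summarize_draws` helper is ours; the draws are simulated from a known posterior so the answers can be checked):

```python
import random
import statistics

def summarize_draws(draws, level=0.95):
    """Posterior mean, variance, and equal-tail credibility interval
    approximated from D posterior draws, as on the slide."""
    d = sorted(draws)
    D = len(d)
    mean = statistics.fmean(d)
    var = statistics.variance(d)             # divisor D - 1
    lo = d[int((1 - level) / 2 * D)]
    hi = d[int((1 + level) / 2 * D) - 1]
    return mean, var, (lo, hi)

# Check against draws whose true posterior is N(2, 0.5^2)
random.seed(1)
draws = [random.gauss(2.0, 0.5) for _ in range(10_000)]
mean, var, ci = summarize_draws(draws)
```

With D = 10,000 draws, the approximations are accurate to two decimals or so; the Monte Carlo error shrinks at rate $1/\sqrt{D}$.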
Calibrated Bayes
• Any approach (including Bayes) has properties in repeated sampling
• We can study the properties of Bayes credibility intervals in repeated sampling – do 95% credibility intervals have 95% coverage?
• A Calibrated Bayes approach yields credibility intervals with close to nominal coverage
• Frequentist methods are useful for forming and assessing the model, but the inference remains Bayesian
• See Little (2004) for more discussion
Summary of approaches
• Design-based:
  – Avoids the need for models for survey outcomes
  – Robust approach for large probability samples
  – Less suited to small samples – inference basically assumes large samples
  – Models are needed for nonresponse, response errors and small areas – this leads to "inferential schizophrenia"
Summary of approaches
• Superpopulation/Bayes models:
  – Familiar: similar to modeling approaches to statistics in general
  – Models need to reflect the survey design
  – Unified approach for large and small samples, nonresponse and response errors
  – Frequentist superpopulation modeling has the limitation that uncertainty in predicting parameters is not reflected in prediction inferences
  – Bayes propagates uncertainty about parameters, making it preferable for small samples – but needs specification of a prior distribution
Module 2: Bayesian models for simple random samples
2.1 Continuous outcome: normal model
2.2 Difference of two means
2.3 Regression models
2.4 Binary outcome: beta-binomial model
2.5 Nonparametric Bayes
Models for simple random samples
• Consider Bayesian predictive inference for population quantities
• Focus here on the population mean, but posterior distributions of more complex finite population quantities Q can also be derived
• The key is to compute the posterior distribution of Q conditional on the data and model
  – Summarize the posterior distribution using the posterior mean, variance, HPD interval, etc.
• Modern Bayesian analysis uses simulation techniques to study the posterior distribution
• Here we consider simple random sampling; Module 3 considers complex design features
Diffuse priors
• In much practical analysis the prior information is diffuse, and the likelihood dominates the prior information.
• Jeffreys (1961) developed "noninformative priors" based on the notion of very little prior information relative to the information provided by the data.
• Jeffreys derived the noninformative prior by requiring invariance under parameter transformation.
• In general:
  $\pi(\theta) \propto |J(\theta)|^{1/2}$, where $J(\theta) = -E\left[\dfrac{\partial^2 \log f(y \mid \theta)}{\partial\theta\, \partial\theta^t}\right]$
Examples of noninformative priors
  Normal: $\pi(\mu, \sigma^2) \propto \sigma^{-2}$
  Binomial: $\pi(\theta) \propto \theta^{-1/2}(1 - \theta)^{-1/2}$
  Poisson: $\pi(\theta) \propto \theta^{-1/2}$
  Normal regression with slopes $\beta$: $\pi(\beta, \sigma^2) \propto \sigma^{-2}$
In simple cases these noninformative priors give numerically the same answers as standard frequentist procedures.
2.1 Normal simple random sample
Model: $Y_i \sim \text{iid } N(\mu, \sigma^2)$, $i = 1, 2, \ldots, N$; prior $\pi(\mu, \sigma^2) \propto \sigma^{-2}$
A simple random sample yields $Y_{inc} = (y_1, \ldots, y_n)$.
Derive the posterior distribution of
  $Q = \bar{Y} = \frac{1}{N}\left(n\bar{y} + (N - n)\bar{Y}_{exc}\right) = f\bar{y} + (1 - f)\bar{Y}_{exc}$,
where $f = n/N$ and $\bar{Y}_{exc}$ = mean of the $N - n$ non-sampled values.
2.1 Normal Example
Posterior distribution of $(\mu, \sigma^2)$:
  $p(\mu, \sigma^2 \mid Y_{inc}) \propto (\sigma^2)^{-n/2 - 1} \exp\left(-\dfrac{\sum_{i=1}^{n}(y_i - \mu)^2}{2\sigma^2}\right)$
  $\qquad\qquad = (\sigma^2)^{-n/2 - 1} \exp\left(-\dfrac{(n - 1)s^2 + n(\bar{y} - \mu)^2}{2\sigma^2}\right)$
The above expressions imply that:
(1) $\sigma^2 \mid Y_{inc} \sim (n - 1)s^2 / \chi^2_{n-1}$, where $s^2 = \sum_{i=1}^{n}(y_i - \bar{y})^2 / (n - 1)$
(2) $\mu \mid \sigma^2, Y_{inc} \sim N(\bar{y}, \sigma^2 / n)$
2.1 Posterior Distribution of Q
Conditional on the parameters:
  $\bar{Y}_{exc} \mid \mu, \sigma^2 \sim N\left(\mu, \dfrac{\sigma^2}{N - n}\right)$
  $\bar{Y}_{exc} \mid \sigma^2, Y_{inc} \sim N\left(\bar{y}, \dfrac{\sigma^2}{(1 - f)n}\right)$
Since $Q = f\bar{y} + (1 - f)\bar{Y}_{exc}$:
  $Q \mid \sigma^2, Y_{inc} \sim N\left(\bar{y}, \dfrac{(1 - f)\sigma^2}{n}\right)$
Integrating over $\sigma^2$:
  $\bar{Y}_{exc} \mid Y_{inc} \sim t_{n-1}\left(\bar{y}, \dfrac{s^2}{(1 - f)n}\right)$
  $Q \mid Y_{inc} \sim t_{n-1}\left(\bar{y}, \dfrac{(1 - f)s^2}{n}\right)$
2.1 HPD Interval for Q
Note the posterior t distribution of Q is symmetric and unimodal – values in the center of the distribution are more likely than those in the tails. Thus a $(1 - \alpha)100\%$ HPD interval is:
  $\bar{y} \pm t_{n-1,\, 1-\alpha/2}\sqrt{\dfrac{(1 - f)s^2}{n}}$
Like the frequentist confidence interval, but recovers the t correction.
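A simulation sketch of this posterior, drawing $\sigma^2$, then $\mu$, then $\bar{Y}_{exc}$, and forming Q (the function name is ours; only the Python standard library is used – a chi-squared deviate is twice a Gamma(df/2, 1) deviate):

```python
import math
import random
import statistics

def posterior_draws_Q(y, N, D=10_000, rng=random):
    """Draws of Q = f*ybar + (1-f)*Ybar_exc under the normal model
    with prior p(mu, sigma^2) proportional to 1/sigma^2."""
    n, f = len(y), len(y) / N
    ybar, s2 = statistics.fmean(y), statistics.variance(y)
    qs = []
    for _ in range(D):
        chi2 = 2 * rng.gammavariate((n - 1) / 2, 1)   # chi^2 with n-1 df
        sig2 = (n - 1) * s2 / chi2                    # sigma^2 | Y_inc
        mu = rng.gauss(ybar, math.sqrt(sig2 / n))     # mu | sigma^2, Y_inc
        ybar_exc = rng.gauss(mu, math.sqrt(sig2 / (N - n)))
        qs.append(f * ybar + (1 - f) * ybar_exc)
    return qs

random.seed(2)
y = [random.gauss(100, 15) for _ in range(30)]
qs = posterior_draws_Q(y, N=300)
```

The draws `qs` reproduce the $t_{n-1}(\bar{y},\, (1-f)s^2/n)$ posterior above, and percentiles of `qs` give the HPD interval without any further algebra.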
2.1 Some other Estimands
• Suppose Q = the median or some other percentile
• One is then better off inferring about all the non-sampled values $Y_{exc}$
• As we will see later, simulating values of $Y_{exc}$ adds enormous flexibility for drawing inferences about any finite population quantity
• Modern Bayesian methods rely heavily on simulating values from the posterior distribution of the model parameters and the posterior predictive distribution of the non-sampled values
• Computationally, if the population size N is very large, it suffices to simulate a pseudo-population of size K that is large relative to the sample size n
  – e.g. a national sample of size 2000 from a US population of 306 million
  – for numerical approximation, we can choose K = 2000/f for some small f = 0.01 or 0.001
2.1 Comments
• Even in this simple normal problem, Bayes is useful:
  – t-inference is recovered for small samples by putting a prior distribution on the unknown variance
  – Inference for other quantities, like Q = the median or some other percentile, is achieved very easily by simulating the non-sampled values (more on this below)
• Bayes is even more attractive for more complex problems, as discussed later.
2.2 Comparison of Two Means
• Population 1: size $N_1$, sample size $n_1$; model $Y_{1i} \sim \text{ind } N(\mu_1, \sigma_1^2)$, prior $p(\mu_1, \sigma_1^2) \propto \sigma_1^{-2}$
• Population 2: size $N_2$, sample size $n_2$; model $Y_{2i} \sim \text{ind } N(\mu_2, \sigma_2^2)$, prior $p(\mu_2, \sigma_2^2) \propto \sigma_2^{-2}$
• Sample statistics: $(\bar{y}_1, s_1^2)$ and $(\bar{y}_2, s_2^2)$
• Posterior distributions, independently for $j = 1, 2$:
  $(n_j - 1)s_j^2 / \sigma_j^2 \mid Y_{inc} \sim \chi^2_{n_j - 1}$
  $\mu_j \mid \sigma_j^2, Y_{inc} \sim N(\bar{y}_j, \sigma_j^2 / n_j)$
  $Y_{ji} \mid \mu_j, \sigma_j^2 \sim N(\mu_j, \sigma_j^2)$ for non-sampled units $i \in \text{exc}$
2.2 Estimands
• Examples:
  – $\bar{Y}_1 - \bar{Y}_2$ (a finite-sample version of the Behrens-Fisher problem)
  – The difference $\Pr(Y_1 \le c) - \Pr(Y_2 \le c)$
  – The difference in the population medians
  – The ratio of the means or medians
  – The ratio of the variances
• It is possible to analytically compute the posterior distribution of some of these quantities
• It is a whole lot easier to simulate values of the non-sampled $Y$'s in Population 1 and in Population 2
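A sketch of that simulation approach for the superpopulation contrast $\mu_1 - \mu_2$ (the finite-population versions just add draws of the non-sampled values, as in section 2.1); the data and seed below are invented:

```python
import math
import random
import statistics

def posterior_mu_draws(y, D, rng):
    """Draws of the superpopulation mean mu under the normal model of
    section 2.1 (prior proportional to 1/sigma^2)."""
    n, ybar, s2 = len(y), statistics.fmean(y), statistics.variance(y)
    draws = []
    for _ in range(D):
        sig2 = (n - 1) * s2 / (2 * rng.gammavariate((n - 1) / 2, 1))
        draws.append(rng.gauss(ybar, math.sqrt(sig2 / n)))
    return draws

rng = random.Random(11)
y1 = [rng.gauss(52, 8) for _ in range(25)]
y2 = [rng.gauss(47, 12) for _ in range(40)]
diff = [a - b for a, b in zip(posterior_mu_draws(y1, 5000, rng),
                              posterior_mu_draws(y2, 5000, rng))]
pr_mu1_larger = sum(d > 0 for d in diff) / len(diff)  # Pr(mu1 > mu2 | data)
```

Because the two sets of draws are independent, unequal variances pose no difficulty – the Behrens-Fisher problem dissolves into elementwise subtraction of draws.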
2.3 Ratio and Regression Estimates
• Population: $(y_i, x_i;\ i = 1, 2, \ldots, N)$
• Sample: $y_i$ observed only for the $n$ sampled units; $x_i$ observed for all $i = 1, 2, \ldots, N$
(The slide shows the data array: $y_1, \ldots, y_n$ observed and $y_{n+1}, \ldots, y_N$ missing, while $x_1, \ldots, x_N$ are fully observed.)
• Objective: infer about the population mean $Q = \frac{1}{N}\sum_{i=1}^{N} y_i$
• The excluded $y$'s are missing values
• For now assume SRS
2.3 Model Specification
$(Y_i \mid x_i, \beta, \sigma^2) \sim \text{ind } N(\beta x_i, \sigma^2 x_i^{2g})$, $i = 1, 2, \ldots, N$; $g$ known
Prior distribution: $p(\beta, \sigma^2) \propto \sigma^{-2}$
• g = 1/2: classical ratio estimator. The posterior variance equals the randomization variance for large samples.
• g = 0: regression through the origin. The posterior variance is nearly the same as the randomization variance.
• g = 1: HT model. The posterior variance equals the randomization variance for large samples.
• Note that no asymptotic arguments have been used in deriving the Bayesian inferences, which make small-sample corrections and use t distributions.
2.3 Posterior Draws for Normal Linear Regression, g = 0
$(\hat{\beta}, s^2)$ = least squares estimates of the slopes and residual variance
$\sigma^{2(d)} = (n - p)s^2 / \chi^2_{n-p}$, where $\chi^2_{n-p}$ = chi-squared deviate with $n - p$ df
$\beta^{(d)} = \hat{\beta} + \sigma^{(d)} A z^{(d)}$, where $z^{(d)} = (z_1, \ldots, z_p)^T$ with $z_i \sim N(0, 1)$, and $A$ = upper triangular Cholesky factor of $(X^T X)^{-1}$, i.e. $A A^T = (X^T X)^{-1}$
Non-sampled values: $y_i^{(d)} \mid \beta^{(d)}, \sigma^{(d)} \sim N(x_i^T \beta^{(d)}, \sigma^{2(d)})$
• Easily extends to weighted regression
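For the single-slope case (p = 1, regression through the origin) the algorithm reduces to a few lines, since the Cholesky factor A is just $(\sum x_i^2)^{-1/2}$; a sketch with invented data:

```python
import math
import random

def draw_beta_sigma(x, y, rng):
    """One posterior draw of (beta, sigma^2) for y_i ~ N(beta * x_i, sigma^2)
    with prior proportional to 1/sigma^2 (p = 1, so n - p = n - 1 df)."""
    n = len(y)
    sxx = sum(xi * xi for xi in x)
    bhat = sum(xi * yi for xi, yi in zip(x, y)) / sxx      # LS estimate
    s2 = sum((yi - bhat * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    chi2 = 2 * rng.gammavariate((n - 1) / 2, 1)            # chi^2, n - 1 df
    sig2 = (n - 1) * s2 / chi2
    beta = rng.gauss(bhat, math.sqrt(sig2 / sxx))          # A = sxx**-0.5
    return beta, sig2

rng = random.Random(5)
x = [rng.uniform(1, 10) for _ in range(40)]
y = [3.0 * xi + rng.gauss(0, 2) for xi in x]
draws = [draw_beta_sigma(x, y, rng) for _ in range(2000)]
beta_mean = sum(b for b, _ in draws) / len(draws)
# each non-sampled unit with covariate x_new would then be imputed as
# rng.gauss(beta * x_new, math.sqrt(sig2)), one value per draw
```

The same recipe with $p > 1$ only adds the matrix Cholesky step for $(X^T X)^{-1}$.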
2.4 Binary outcome: consulting example
• In India, any person possessing a radio, transistor or television has to pay a license fee.
• In a densely populated area with mostly makeshift houses practically no one was paying these fees.
• Target enforcement in areas where the proportion of households possessing one or more of these devices exceeds 0.3, with high probability.
2.4 Consulting example (continued)
• Conduct a small-scale survey to answer the question of interest
• Note that the question only makes sense under the Bayes paradigm
$N$ = population size in the particular area
$Y_i = 1$ if household $i$ has a device, 0 otherwise
$Q = \sum_{i=1}^{N} Y_i / N$ = proportion of households with a device
Question of interest: $\Pr(Q > 0.3)$
2.4 Consulting example
Model for observables: an srs of size $n$ yields $Y_{inc} = \{Y_1, \ldots, Y_n\}$, $Y_{exc} = \{Y_{n+1}, \ldots, Y_N\}$
  $Y_i \mid \theta \sim \text{iid Bernoulli}(\theta)$
  $x = \sum_{i=1}^{n} Y_i$, so $f(x \mid \theta) \propto \theta^x (1 - \theta)^{n-x}$
Prior distribution: $\pi(\theta) = 1$, $\theta \in (0, 1)$
Estimand: $Q = \bar{Y} = x/N + \sum_{i=n+1}^{N} Y_i / N$
2.4 Beta-Binomial model
The posterior distribution is
  $p(\theta \mid x) = \dfrac{f(x \mid \theta)\, \pi(\theta)}{\int f(x \mid \theta)\, \pi(\theta)\, d\theta} = \dfrac{\theta^x (1 - \theta)^{n-x}}{\int_0^1 \theta^x (1 - \theta)^{n-x}\, d\theta}$
that is, $\theta \mid x \sim \text{Beta}(x + 1,\ n - x + 1)$.
For the estimand:
  $Q = \left(x + \sum_{i=n+1}^{N} Y_i\right) / N$, where $\sum_{i=n+1}^{N} Y_i \mid x, \theta \sim \text{Bin}(N - n, \theta)$
2.4 Infinite Population
For $N \to \infty$, $\bar{Y}_N \to \theta$, so
  $\Pr(Q > 0.3 \mid x) \approx \Pr(\theta > 0.3 \mid x)$
Compute using the cumulative distribution function of a beta distribution, which is a standard function in most software such as SAS and R.
What is the maximum proportion of households with devices that can be asserted with great certainty? Solve $\Pr(\theta \le q \mid x) = 0.9$ for $q$ using the inverse CDF of the beta distribution.
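A Monte Carlo sketch of the finite-N calculation (the function name and survey numbers are invented; `random.betavariate` supplies the Beta(x+1, n−x+1) draws, and the binomial count is drawn as a sum of Bernoullis):

```python
import random

def pr_Q_exceeds(x, n, N, c=0.3, D=4000, seed=4):
    """Estimate Pr(Q > c | x) under the beta-binomial model:
    theta | x ~ Beta(x + 1, n - x + 1), then the N - n non-sampled
    Y's are Bernoulli(theta)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(D):
        theta = rng.betavariate(x + 1, n - x + 1)
        y_exc = sum(rng.random() < theta for _ in range(N - n))
        if (x + y_exc) / N > c:
            hits += 1
    return hits / D

# e.g. x = 15 households with a device out of n = 40, area of N = 400
p = pr_Q_exceeds(x=15, n=40, N=400)
```

For large N this agrees with the beta-CDF shortcut on the slide; the simulation version also handles any other function of the non-sampled counts for free.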
2.5 Bayesian Nonparametric Inference
• Population: $Y_1, Y_2, Y_3, \ldots, Y_N$
• All possible distinct values: $d_1, d_2, \ldots, d_K$
• Model: $\Pr(Y_i = d_k) = \theta_k$
• Prior: $\pi(\theta_1, \theta_2, \ldots, \theta_K) \propto \prod_k \theta_k^{-1}$ if $\sum_k \theta_k = 1$
• Mean and variance:
  $E(Y_i \mid \theta) = \mu = \sum_k d_k \theta_k$
  $\mathrm{Var}(Y_i \mid \theta) = \sigma^2 = \sum_k d_k^2 \theta_k - \mu^2$
2.5 Bayesian Nonparametric Inference
• SRS of size $n$, with $n_k$ equal to the number of $d_k$'s in the sample
• Objective is to draw inference about the population mean $Q = f\bar{y} + (1 - f)\bar{Y}_{exc}$
• As before, we need the posterior distribution of $\mu$ and $\sigma^2$
2.5 Nonparametric Inference
• The posterior distribution of $\theta$ is Dirichlet:
  $\pi(\theta \mid Y_{inc}) \propto \prod_k \theta_k^{n_k - 1}$ if $\sum_k \theta_k = 1$, $\sum_k n_k = n$
• Posterior mean, variance and covariance of $\theta$:
  $E(\theta_k \mid Y_{inc}) = \dfrac{n_k}{n}$
  $\mathrm{Var}(\theta_k \mid Y_{inc}) = \dfrac{n_k (n - n_k)}{n^2 (n + 1)}$
  $\mathrm{Cov}(\theta_k, \theta_l \mid Y_{inc}) = -\dfrac{n_k n_l}{n^2 (n + 1)}$
2.5 Inference for Q
From the Dirichlet posterior:
  $E(\mu \mid Y_{inc}) = \sum_k d_k n_k / n = \bar{y}$
  $\mathrm{Var}(\mu \mid Y_{inc}) = \dfrac{n - 1}{n + 1} \cdot \dfrac{s^2}{n}$, where $s^2 = \dfrac{1}{n - 1}\sum_i (y_i - \bar{y})^2$
  $E(\sigma^2 \mid Y_{inc}) = \dfrac{n - 1}{n + 1}\, s^2$
Hence the posterior mean and variance of Q are:
  $E(Q \mid Y_{inc}) = f\bar{y} + (1 - f) E(\bar{Y}_{exc} \mid Y_{inc}) = \bar{y}$
  $\mathrm{Var}(Q \mid Y_{inc}) = (1 - f)\, \dfrac{s^2}{n} \cdot \dfrac{n - 1}{n + 1}$
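Drawing from this Dirichlet posterior is equivalent to putting normalized Gamma(1) (i.e. exponential) weights on the observed values – the "Bayesian bootstrap". A sketch (the helper name is ours):

```python
import random
import statistics

def bayesian_bootstrap_mean_draws(y, D=5000, seed=9):
    """Draws of mu = sum_k d_k * theta_k, where theta | Y_inc is Dirichlet
    with counts n_k: equivalently, Dirichlet weights on the observed values."""
    rng = random.Random(seed)
    draws = []
    for _ in range(D):
        g = [rng.gammavariate(1.0, 1.0) for _ in y]   # Gamma(1) = Exp(1)
        total = sum(g)
        draws.append(sum(w * yi for w, yi in zip(g, y)) / total)
    return draws

random.seed(9)
y = [random.gauss(10, 3) for _ in range(50)]
mu_draws = bayesian_bootstrap_mean_draws(y)
```

The mean of `mu_draws` approximates $\bar{y}$ and their variance approximates $\frac{n-1}{n+1}\frac{s^2}{n}$, matching the closed forms above; applying any other statistic to the weighted values (a median, a ratio) gives its posterior draws just as easily.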
Module 3: complex sample designs
• So far: Bayesian predictive inference for population quantities
• Focused on the population mean, but posterior distributions of more complex finite population quantities Q can be derived
• The key is to compute the posterior distribution of Q conditional on the data and model
  – Summarize the posterior distribution using the posterior mean, variance, HPD interval, etc.
• Modern Bayesian analysis uses simulation techniques to study the posterior distribution
• Models need to incorporate complex design features like unequal selection, stratification and clustering
Models for complex sample designs
Modeling sample selection
• Role of the sample design in model-based (Bayesian) inference
• The key to understanding that role is to include the sample selection process as part of the model
• Modeling the sample selection process:
  – Simple and stratified random sampling
  – Cluster sampling, other mechanisms
  – See Chapter 7 of Bayesian Data Analysis (Gelman, Carlin, Stern and Rubin 1995)
Full model for Y and I
• Full model for the population and the inclusion mechanism:
  $p(Y, I \mid Z, \theta, \phi) = p(Y \mid Z, \theta) \times p(I \mid Y, Z, \phi)$
  (model for population × model for inclusion)
• Observed data: $(Y_{inc}, Z, I)$ (no missing values)
• Observed-data likelihood:
  $L(\theta, \phi \mid Y_{inc}, Z, I) \propto p(Y_{inc}, I \mid Z, \theta, \phi) = \int p(Y, I \mid Z, \theta, \phi)\, dY_{exc}$
• Posterior distribution of the parameters:
  $p(\theta, \phi \mid Y_{inc}, Z, I) \propto p(\theta, \phi \mid Z)\, L(\theta, \phi \mid Y_{inc}, Z, I)$
Ignoring the data collection process
• The likelihood ignoring the data-collection process is based on the model for Y alone:
  $L(\theta \mid Y_{inc}, Z) \propto p(Y_{inc} \mid Z, \theta) = \int p(Y \mid Z, \theta)\, dY_{exc}$
• The corresponding posteriors for $\theta$ and $Y_{exc}$ are:
  $p(\theta \mid Y_{inc}, Z) \propto p(\theta \mid Z)\, L(\theta \mid Y_{inc}, Z)$
  $p(Y_{exc} \mid Y_{inc}, Z) = \int p(Y_{exc} \mid Y_{inc}, Z, \theta)\, p(\theta \mid Y_{inc}, Z)\, d\theta$
  (the posterior predictive distribution of $Y_{exc}$)
• When the full posterior reduces to this simpler posterior, the data collection mechanism is called ignorable for Bayesian inference about $\theta$ and $Y_{exc}$.
Bayes inference for probability samples
• A sufficient condition for ignoring the selection mechanism is that selection does not depend on the values of Y, that is:
  $p(I \mid Y, Z, \phi) = p(I \mid Z, \phi)$ for all $Y$
• This holds for probability sampling with design variables Z
• But the model needs to appropriately account for the relationship of the survey outcomes Y with the design variables Z
• Consider how to do this for (a) unequal probability samples, and (b) clustered (multistage) samples
Ex 1: stratified random sampling
• Population divided into J strata
• Z is the set of stratum indicators: $z_{ij} = 1$ if unit $i$ is in stratum $j$, 0 otherwise
• Stratified random sampling: a simple random sample of $n_j$ units is selected from the population of $N_j$ units in stratum $j$
• This design is ignorable provided the model for the outcomes conditions on the stratum variables Z
• The same approach (conditioning on Z) works for post-stratification, with extensions to more than one margin
Inference for a mean from a stratified sample
• Consider a model that includes stratum effects:
  $[y_i \mid z_{ij} = 1] \sim \text{ind } N(\mu_j, \sigma_j^2)$
• For simplicity assume $\sigma_j^2$ is known and the flat prior $p(\mu_j \mid Z) \propto \text{const.}$
• Standard Bayesian calculations lead to
  $[\bar{Y} \mid Y_{inc}, Z, \{\sigma_j^2\}] \sim N(\bar{y}_{st}, \sigma_{st}^2)$
where:
  $\bar{y}_{st} = \sum_{j=1}^{J} P_j \bar{y}_j$, $P_j = N_j / N$, $\bar{y}_j$ = sample mean in stratum $j$
  $\sigma_{st}^2 = \sum_{j=1}^{J} P_j^2 (1 - f_j)\, \sigma_j^2 / n_j$, $f_j = n_j / N_j$
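These closed-form results are easy to compute directly; a sketch with hypothetical stratum inputs (the helper name and the numbers are ours):

```python
import math

def stratified_posterior(strata):
    """Posterior mean and variance of the population mean under the
    stratum-effects model with known variances and flat priors.
    Each stratum: dict with keys N, n, ybar, sigma2."""
    N = sum(s["N"] for s in strata)
    mean = sum(s["N"] / N * s["ybar"] for s in strata)
    var = sum((s["N"] / N) ** 2 * (1 - s["n"] / s["N"]) * s["sigma2"] / s["n"]
              for s in strata)
    return mean, var

# Two strata sampled at very different rates (5% vs 20%)
strata = [dict(N=800, n=40, ybar=20.0, sigma2=25.0),
          dict(N=200, n=40, ybar=35.0, sigma2=16.0)]
mean, var = stratified_posterior(strata)
ci = (mean - 1.96 * math.sqrt(var), mean + 1.96 * math.sqrt(var))
```

Here the posterior mean is 23.0, while the unweighted sample mean of the 80 observations would be 27.5 – exactly the bias that arises when the unequal selection rates are ignored.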
Bayes for stratified normal model
• Bayes inference for this model is equivalent to standard classical inference for the population mean from a stratified random sample
• The posterior mean weights each case by the inverse of its inclusion probability:
  $\bar{y}_{st} = \sum_{j=1}^{J} (N_j / N)\, \bar{y}_j = \frac{1}{N}\sum_{j=1}^{J} \sum_{i:\, z_i = j} y_i / \pi_j$,
  where $\pi_j = n_j / N_j$ = selection probability in stratum $j$
• With unknown variances, Bayes for this model with flat priors on the log(variances) yields useful t-like corrections for small samples
Suppose we ignore stratum effects?
• Suppose we assume instead that $[y_i \mid z_{ij} = 1] \sim \text{ind } N(\mu, \sigma^2)$, the previous model with no stratum effects
• With a flat prior on the mean, the posterior mean of $\bar{Y}$ is then the unweighted mean:
  $E(\bar{Y} \mid Y_{inc}, Z, \sigma^2) = \bar{y} = \sum_{j=1}^{J} p_j \bar{y}_j$, $p_j = n_j / n$
• This is a potentially very biased estimator if the selection rates $n_j / N_j$ vary across the strata
  – The problem is that results from this model are highly sensitive to violations of the assumption of no stratum effects … and stratum effects are likely in most realistic settings
  – Hence prudence dictates a model that allows for stratum effects, such as the model on the previous slide
Design consistency
• Loosely speaking, an estimator is design-consistent if (irrespective of the truth of the model) it converges to the true population quantity as the sample size increases, holding design features constant
• For stratified sampling, the posterior mean based on the stratified normal model converges to $\bar{Y}$, and hence is design-consistent
• For the normal model that ignores stratum effects, the posterior mean converges to
  $\sum_{j=1}^{J} N_j \pi_j \bar{Y}_j \Big/ \sum_{j=1}^{J} N_j \pi_j$
  and hence is not design-consistent unless $\pi_j = \text{const.}$
• We generally advocate Bayesian models that yield design-consistent estimates, to limit the effects of model misspecification
Ex 2. A continuous (post)stratifier Z
• Consider PPS sampling: Z = measure of size, $\pi_i \propto z_i$ = selection probability
• The standard design-based estimator is the weighted Horvitz-Thompson estimate:
  $\bar{y}_{HT} = \frac{1}{N}\sum_{i=1}^{n} y_i / \pi_i$
• $\bar{y}_{HT}$ is the model-based prediction estimate for $\bar{Y}$ under the "HT model":
  $y_i \sim \text{Nor}(\beta \pi_i,\ \sigma^2 \pi_i^2)$
• When the relationship between Y and Z deviates a lot from the HT model, the HT estimate is inefficient and CIs can have poor coverage
Ex 2 (continued). One continuous (post)stratifier Z
• Design-based estimator: $\bar{y}_{wt} = \frac{1}{N}\sum_{i=1}^{n} y_i / \pi_i$, $\pi_i$ = selection probability (HT)
• A modeling alternative to the HT estimator is to create predictions from a more robust model relating Y to Z:
  $\bar{y}_{mod} = \frac{1}{N}\left(\sum_{i=1}^{n} y_i + \sum_{i=n+1}^{N} \hat{y}_i\right)$, with predictions $\hat{y}_i$ from
  $y_i \sim \text{Nor}(S(\pi_i),\ \sigma^2 \pi_i^{2k})$, where $S(\cdot)$ = penalized spline of $y$ on $\pi$
  (Zheng and Little 2003, 2005)
Ex 3. Two-stage sampling
• Most practical sample designs involve selecting a cluster of units and measuring a subset of units within each selected cluster
• A two-stage sample is very efficient and cost-effective
• But outcomes for subjects within a cluster may be correlated (typically, positively)
• Models can easily incorporate the correlation among observations
Two-stage samples
• Sample design:
  – Stage 1: sample $c$ clusters from $C$ clusters
  – Stage 2: sample $k_i$ units from the $K_i$ units in selected cluster $i = 1, 2, \ldots, c$
• Population size $N = \sum_{i=1}^{C} K_i$, where $K_i$ = size of cluster $i$
• Estimand of interest: population mean Q
• Infer about the excluded clusters and the excluded units within the selected clusters
Models for two-stage samples
• Model for observables:
  $Y_{ij} \sim N(\mu_i, \sigma^2)$; $i = 1, \ldots, C$; $j = 1, 2, \ldots, K_i$
• Prior distribution:
  $\mu_i \sim \text{iid } N(\mu, \tau^2)$
• Assume $\sigma^2$ and $\tau^2$ are known
Estimand of interest and inference strategy
• The population total can be decomposed as
  $NQ = \sum_{i=1}^{c}\left[k_i \bar{y}_i + (K_i - k_i)\bar{Y}_{i,exc}\right] + \sum_{i=c+1}^{C} K_i \bar{Y}_i$
• Posterior mean given $Y_{inc}$:
  $E(NQ \mid Y_{inc}) = \sum_{i=1}^{c}\left[k_i \bar{y}_i + (K_i - k_i) E(\mu_i \mid Y_{inc})\right] + \sum_{i=c+1}^{C} K_i E(\mu \mid Y_{inc})$
where
  $E(\mu_i \mid Y_{inc}) = \dfrac{(k_i / \sigma^2)\, \bar{y}_i + (1/\tau^2)\, \hat{\mu}}{k_i / \sigma^2 + 1/\tau^2}$ (the cluster mean shrunk toward $\hat{\mu}$)
  $\hat{\mu} = E(\mu \mid Y_{inc}) = \dfrac{\sum_{i=1}^{c} \bar{y}_i / (\tau^2 + \sigma^2 / k_i)}{\sum_{i=1}^{c} 1 / (\tau^2 + \sigma^2 / k_i)}$
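The shrinkage structure of $E(\mu_i \mid Y_{inc})$ can be sketched directly (the helper name and the cluster numbers are invented; $\sigma^2$ and $\tau^2$ are treated as known, as on the slide):

```python
def cluster_posterior_means(clusters, sigma2, tau2):
    """E(mu | Y_inc) and the shrinkage estimates E(mu_i | Y_inc) for the
    sampled clusters, under the two-stage normal model with known
    sigma^2 (within) and tau^2 (between). clusters: list of (k_i, ybar_i)."""
    # precision-weighted overall mean: weights 1 / (tau^2 + sigma^2 / k_i)
    w = [1.0 / (tau2 + sigma2 / k) for k, _ in clusters]
    mu_hat = sum(wi * yb for wi, (_, yb) in zip(w, clusters)) / sum(w)
    post = []
    for k, yb in clusters:
        prec_data, prec_prior = k / sigma2, 1.0 / tau2
        post.append((prec_data * yb + prec_prior * mu_hat)
                    / (prec_data + prec_prior))
    return mu_hat, post

# three sampled clusters as (k_i, ybar_i)
mu_hat, post = cluster_posterior_means([(10, 4.0), (20, 6.0), (5, 9.0)],
                                       sigma2=4.0, tau2=1.0)
```

Each cluster mean is pulled toward `mu_hat`, more strongly when $k_i$ is small – the empirical-Bayes behavior noted two slides later for PROC MIXED.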
Posterior Variance
• The posterior variance can be computed from the usual decomposition, applied to each cluster:
  for sampled clusters $i = 1, 2, \ldots, c$:
  $\mathrm{Var}(\bar{Y}_{i,exc} \mid Y_{inc}) = E[\mathrm{Var}(\bar{Y}_{i,exc} \mid \mu_i, Y_{inc}) \mid Y_{inc}] + \mathrm{Var}[E(\bar{Y}_{i,exc} \mid \mu_i, Y_{inc}) \mid Y_{inc}]$
  for non-sampled clusters $i = c + 1, \ldots, C$:
  $\mathrm{Var}(\bar{Y}_i \mid Y_{inc}) = E[\mathrm{Var}(\bar{Y}_i \mid \mu_i, Y_{inc}) \mid Y_{inc}] + \mathrm{Var}[E(\bar{Y}_i \mid \mu_i, Y_{inc}) \mid Y_{inc}]$
• Combining:
  $\mathrm{Var}(NQ \mid Y_{inc}) = \sum_{i=1}^{c}(K_i - k_i)\left[\sigma^2 + (K_i - k_i)\,\mathrm{Var}(\mu_i \mid Y_{inc})\right] + \sum_{i=c+1}^{C} K_i\left[\sigma^2 + K_i\left(\tau^2 + \mathrm{Var}(\mu \mid Y_{inc})\right)\right]$
Inference with unknown σ and τ
• For unknown σ and τ:
  – Option 1: plug in the maximum likelihood estimates, which can be obtained using PROC MIXED in SAS. PROC MIXED actually gives estimates of $\mu$, $\sigma$, $\tau$ and $E(\mu_i \mid Y_{inc})$ (empirical Bayes)
  – Option 2: fully Bayes, with an additional prior of the form
    $p(\mu, \sigma^2, \tau^2) \propto \sigma^{-2} (\tau^2)^{-v} \exp\{-b / (2\tau^2)\}$,
    where $b$ and $v$ are small positive numbers
Extensions and Applications
• Relaxing the equal-variance assumption:
  $Y_{il} \sim N(\mu_i, \sigma_i^2)$, $(\mu_i, \log\sigma_i^2) \sim \text{iid BVN}$
• Incorporating covariates (a generalization of ratio and regression estimates):
  $Y_{il} \sim N(x_{il}^T \beta_i, \sigma_i^2)$, $(\beta_i, \log\sigma_i^2) \sim \text{iid MVN}$
• Small area estimation, an application of the hierarchical model. Here the quantity of interest is
  $E(\bar{Y}_i \mid Y_{inc}) = \left[k_i \bar{y}_i + (K_i - k_i) E(\bar{Y}_{i,exc} \mid Y_{inc})\right] / K_i$
Models for complex sample designs 69
Extensions
• Relaxing normal assumptions

  $Y_{il} \mid \beta_i \sim \text{Glim}(h(x_{il}\beta_i),\, v(\sigma_i^2))$
  $v$: a known function
  $\beta_i \sim \text{iid MVN}(\theta, \Omega)$

• Incorporate design features such as stratification and weighting by modeling the sampling mechanism explicitly.
Models for complex sample designs 70
Summary
• Bayes inference for surveys must incorporate design features appropriately
• Stratification and clustering can be incorporated in Bayes inference through design variables
• Unlike design-based inference, Bayes inference is not asymptotic, and delivers good frequentist properties in small samples
Models for Complex Surveys: Bayesian Computation 71
Module 4: Short introduction to Bayesian computation
• A Bayesian analysis uses the entire posterior distribution of the parameter of interest.
• Summaries of the posterior distribution are used for statistical inferences
  – Means, medians, modes or other measures of central tendency
  – Standard deviation, mean absolute deviation or other measures of spread
  – Percentiles or intervals
• Conceptually, all these quantities can be expressed analytically in terms of integrals of functions of the parameter with respect to its posterior distribution
• Computations
  – Numerical integration routines
  – Simulation techniques – outlined here
Models for Complex Surveys: Bayesian Computation 72
Types of Simulation
• Direct simulation (as for normal sample, regression)
• Approximate direct simulation
  – Discrete approximation of the posterior density
  – Rejection sampling
  – Sampling Importance Resampling
• Iterative simulation techniques
  – Metropolis algorithm
  – Gibbs sampler
  – Software: WinBUGS
Models for Complex Surveys: Bayesian Computation 73
Approximate Direct Simulation
• Approximate the posterior distribution by a normal distribution by matching the posterior mean and variance
  – Posterior mean and variance computed using numerical integration techniques
• An alternative is to use the mode and a measure of curvature at the mode
  – Mode and curvature can be computed using many different methods
• Approximate the posterior distribution using a grid of values of the parameter: compute the posterior density at each grid point, then draw values from the grid with probability proportional to the posterior density
Models for Complex Surveys: Bayesian Computation 74
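The grid method described above can be sketched in a few lines of Python. The target here is a toy unnormalized posterior (a binomial likelihood with 7 successes in 10 trials under a flat prior, i.e. a Beta(8,4) density), standing in for a real survey posterior:

```python
import numpy as np

rng = np.random.default_rng(0)

def post(theta):
    # Unnormalized toy posterior: theta^7 (1-theta)^3, i.e. Beta(8,4)
    return theta**7 * (1 - theta)**3

grid = np.linspace(0.001, 0.999, 1000)   # grid of parameter values
prob = post(grid)
prob = prob / prob.sum()                 # normalize over the grid

# Draw from the grid with probability proportional to the density
draws = rng.choice(grid, size=5000, p=prob)
```

The mean of `draws` should be close to the exact posterior mean 8/12.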
Normal Approximation

Posterior density: $f(\theta \mid x)$

It is easier to work with the log-posterior density

  $l(\theta) = \log f(\theta \mid x)$

At the mode $\hat{\theta}$: $l'(\hat{\theta}) = 0$
Curvature: $l''(\hat{\theta})$

For the logarithm of the normal density, the mode is the mean and the curvature at the mode is the negative of the precision (precision: reciprocal of variance)
Models for Complex Surveys: Bayesian Computation 75
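A numerical sketch of the mode-and-curvature route, using a toy Beta(8,4) log-posterior (illustration only); the curvature at the mode is approximated by a central second difference:

```python
import numpy as np

def logpost(t):
    # Toy log-posterior (Beta(8,4) up to an additive constant)
    return 7 * np.log(t) + 3 * np.log(1 - t)

grid = np.linspace(0.01, 0.99, 9801)
mode = grid[np.argmax(logpost(grid))]    # exact mode of Beta(8,4) is 0.7

h = 1e-4                                 # central second difference for l''(mode)
curv = (logpost(mode + h) - 2 * logpost(mode) + logpost(mode - h)) / h**2
approx_var = -1.0 / curv                 # variance of the N(mode, -1/l'') approximation
```

The resulting normal approximation is $N(\hat{\theta}, -1/l''(\hat{\theta}))$, matching the precision interpretation above.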
Rejection Sampling
• Target density from which to draw: $f(\theta \mid \text{data})$
• Candidate density from which it is easy to draw: $g(\theta)$, with $g(\theta) > 0$ for all $\theta$ with $f(\theta \mid \text{data}) > 0$
• The importance ratio is bounded:

  $\dfrac{f(\theta \mid \text{data})}{g(\theta)} \le M$

• Sample $\theta$ from $g$, accept $\theta$ with probability

  $p = \dfrac{f(\theta \mid \text{data})}{M\, g(\theta)}$

  otherwise redraw from $g$
Models for Complex Surveys: Bayesian Computation 76
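A minimal rejection-sampling sketch for a toy target (an unnormalized Beta(8,4) density, illustration only), with a Uniform(0,1) candidate so that g(θ) = 1 and M is the maximum of f, attained at the mode 0.7:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(t):
    # Unnormalized target density (Beta(8,4))
    return t**7 * (1 - t)**3

M = f(0.7)    # bound on f/g; f is maximized at the mode 0.7

draws = []
while len(draws) < 2000:
    theta = rng.uniform()                 # draw from candidate g = Uniform(0,1)
    if rng.uniform() < f(theta) / M:      # accept with probability f/(M g)
        draws.append(theta)
draws = np.array(draws)
```

With a tight bound M, the acceptance rate here is roughly one in three; a looser M still gives exact draws but wastes candidates.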
Sampling Importance Resampling
• Target density from which to draw: $f(\theta \mid \text{data})$
• Candidate density from which it is easy to draw: $g(\theta)$, such that $g(\theta) > 0$ for all $\theta$ with $f(\theta \mid \text{data}) > 0$
• The importance ratio:

  $w(\theta) = \dfrac{f(\theta \mid \text{data})}{g(\theta)}$

• Sample M values $\theta^*_1, \theta^*_2, \ldots, \theta^*_M$ from $g$
• Compute the M importance ratios $w(\theta^*_i),\; i = 1, 2, \ldots, M$, and resample with probability proportional to the importance ratios
Models for Complex Surveys: Bayesian Computation 77
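The SIR steps above can be sketched as follows, again for a toy unnormalized Beta(8,4) target with a Uniform(0,1) candidate (the densities and sample sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def f(t):
    # Unnormalized target density (Beta(8,4))
    return t**7 * (1 - t)**3

M = 20000
theta = rng.uniform(0, 1, M)    # M draws from the candidate g = Uniform(0,1)
w = f(theta)                    # importance ratios f/g (here g = 1)

# Resample with probability proportional to the importance ratios
idx = rng.choice(M, size=2000, replace=True, p=w / w.sum())
draws = theta[idx]
```

Unlike rejection sampling, SIR needs no bound M on the ratio, at the price of being only approximately exact for finite M.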
Markov Chain Simulation
• In real problems it may be hard to apply direct or approximate direct simulation techniques.
• Markov chain methods involve a random walk in the parameter space which converges to a stationary distribution that is the target posterior distribution.
  – Metropolis-Hastings algorithms
  – Gibbs sampling
Models for Complex Surveys: Bayesian Computation 78
Metropolis-Hastings algorithm
• Try to find a Markov chain whose stationary distribution is the desired posterior distribution.
• Metropolis et al (1953) showed how, and the procedure was later generalized by Hastings (1970). This is called the Metropolis-Hastings algorithm.
• Algorithm:
  – Step 1: At iteration t, draw

    $y \sim p(y \mid x^{(t)})$

    $y$: candidate point
    $p$: candidate density
Models for Complex Surveys: Bayesian Computation 79
– Step 2: Compute the ratio

  $w = \min\left(1,\; \dfrac{f(y)/p(y \mid x^{(t)})}{f(x^{(t)})/p(x^{(t)} \mid y)}\right)$

– Step 3: Generate a uniform random number u, and set

  $x^{(t+1)} = y$ if $u \le w$; $x^{(t+1)} = x^{(t)}$ otherwise

– This Markov chain has stationary distribution f(x).
– Any p(y|x) that has the same support as f(x) will work
– If p(y|x) = f(x) then we have independent samples
– The closer the proposal density p(y|x) is to the actual density f(x), the faster the convergence.
Models for Complex Surveys: Bayesian Computation 80
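The three steps can be sketched as random-walk Metropolis, the special case with a symmetric candidate density, p(y|x) = p(x|y), so the ratio in Step 2 reduces to f(y)/f(x). The target is again a toy unnormalized Beta(8,4) density (illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)

def f(t):
    # Unnormalized target density (Beta(8,4)); zero outside (0, 1)
    return t**7 * (1 - t)**3 if 0.0 < t < 1.0 else 0.0

x, chain = 0.5, []
for t in range(20000):
    y = x + rng.normal(0.0, 0.1)     # Step 1: symmetric random-walk candidate
    w = min(1.0, f(y) / f(x))        # Step 2: ratio (candidate terms cancel)
    if rng.uniform() <= w:           # Step 3: accept if u <= w
        x = y
    chain.append(x)
draws = np.array(chain[5000:])       # discard burn-in
```

Out-of-range candidates get f(y) = 0 and are automatically rejected, which keeps the chain inside (0, 1).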
Gibbs sampling
• Gibbs sampling is a particular case of Markov Chain Monte Carlo suitable for multivariate problems

Target: $(x_1, x_2, \ldots, x_p) \sim f(x)$
Full conditionals: $f(x_i \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_p)$

Gibbs sequence:

  $x_1^{(t+1)} \sim f(x_1 \mid x_2^{(t)}, x_3^{(t)}, \ldots, x_p^{(t)})$
  $x_2^{(t+1)} \sim f(x_2 \mid x_1^{(t+1)}, x_3^{(t)}, \ldots, x_p^{(t)})$
  $\vdots$
  $x_i^{(t+1)} \sim f(x_i \mid x_1^{(t+1)}, \ldots, x_{i-1}^{(t+1)}, x_{i+1}^{(t)}, \ldots, x_p^{(t)})$
  $\vdots$
  $x_p^{(t+1)} \sim f(x_p \mid x_1^{(t+1)}, \ldots, x_{p-1}^{(t+1)})$

1. This is also a Markov chain whose stationary distribution is f(x)
2. This is an easier algorithm if the conditional densities are easy to work with
3. If the conditionals are harder to sample from, then use MH or rejection techniques within the Gibbs sequence
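A sketch of the Gibbs sequence for a case where the full conditionals are known exactly: a standard bivariate normal with correlation ρ (a toy target, not a survey model), whose conditionals are $x_1 \mid x_2 \sim N(\rho x_2, 1-\rho^2)$ and symmetrically for $x_2 \mid x_1$:

```python
import numpy as np

rng = np.random.default_rng(4)

rho = 0.8                       # target: standard bivariate normal, correlation rho
x1, x2 = 0.0, 0.0
chain = []
for t in range(20000):
    # Alternate draws from the two full conditionals
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))   # x1 | x2
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))   # x2 | x1
    chain.append((x1, x2))
draws = np.array(chain[2000:])   # discard burn-in
```

The empirical means of the retained draws should be near 0 and their correlation near ρ, confirming the chain has reached its stationary distribution.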
Conclusion
• Design-based: limited, asymptotic
• Bayesian inference for surveys: flexible, unified, now feasible using modern computational methods
• Calibrated Bayes: build models that yield inferences with good frequentist properties – diffuse priors, strata and post-strata as covariates, clustering with mixed effects models
• Software: WinBUGS, but software targeted to surveys would help.
• The future may be Calibrated Bayes!
Models for complex surveys 1: introduction 81