Bayesian Analysis of the Ordered Probit Model with Endogenous Selection∗
Murat K. Munkin
Department of Economics
531 Stokely Management Center
University of Tennessee
Knoxville, TN 37919, U.S.A.
Email: [email protected]
Pravin K. Trivedi
Department of Economics
Wylie Hall 105
Indiana University
Bloomington, IN 47405, U.S.A.
Email: [email protected]
February 2, 2007
Abstract
This paper presents a Bayesian analysis of an ordered probit model with endogenous selection. The model can be applied when analyzing ordered outcomes that depend on endogenous covariates that are discrete choice indicators modeled by a multinomial probit model. The model is illustrated by analyzing the effects of different types of medical insurance plans on the level of hospital utilization, allowing for potential endogeneity of insurance status. The estimation is performed using the Markov Chain Monte Carlo (MCMC) methods to approximate the posterior distribution of the parameters in the model.
Key words: Treatment Effects; MCMC; Discontinuity Regression.
∗We thank Jeff Racine for comments on an earlier version of the paper presented at the 2004 meetings of the Southern Economic Association. In revising and rewriting the paper we have benefitted from the comments of two anonymous referees, an Associate Editor, and Co-Editor John Geweke. However, we remain responsible for the current version.
1. Introduction
This paper develops an estimation method for the ordered probit model with endogenous
covariates, termed the ordered probit model with endogenous selection (OPES). Specif-
ically, we analyze the effect of endogenous multinomial choice indicators on an ordinal
dependent variable. Endogeneity is modeled using a correlated latent variable structure,
with multinomial choice represented by the multinomial probit model. Markov chain
Monte Carlo (MCMC) methods are then used to approximate the posterior distribution
of the parameters and treatment effects. The application of the model is illustrated by
analyzing the effects of different types of medical insurance plans on the level of hospital
care utilization by the US adult population.
The ordered probit (OP) model with exogenous covariates is well established in the
literature. Extending it to the case where some covariates are endogenous is empirically
useful. Then it can be applied also to models with count dependent variables whose
frequencies are restricted to just a few support points. Thus, the OPES model may
serve as an alternative to the existing count models with endogenous treatment.
Our model analyzes the effect of a set of endogenous choice indicators on a count
variable whose distribution displays a very large proportion of zeros. Specifically we
consider cases when even extensions of the Poisson model that allow for overdispersion
do not provide an adequate fit. Examples of such extensions include the negative bi-
nomial and the Poisson-lognormal mixture models (Munkin and Trivedi, 2003). There
are at least two empirical considerations which motivate this paper. First, using obser-
vational data we want to model an outcome (the biannual number of hospitalizations)
which is a count variable, but more than 80 percent of observations are zeros, and the
distribution has a short tail. Second, the outcome depends on some categorical dummy
variables (e.g., types of health insurance plans) which are potentially endogenous, i.e.,
jointly determined with the outcome variable. This is simply a particular case of an
often-encountered model in which some of the covariates are endogenous dummy vari-
ables. We develop a model that generalizes the OP model by including endogenous
choice variables among the covariates.
Our approach is Bayesian. The full model consists of an ordered probit equation and
a set of discrete choice equations. The interdependence between the OP and discrete
choice equations is modeled using a correlated latent variable structure. The defined
latent variables are made a part of the parameter set. Augmenting full conditional
densities with latent variables, following Tanner and Wong (1987) and others, simpli-
fies the MCMC algorithm. Our analysis is related to several previous contributions,
including Albert and Chib (1993), Cowles (1996), Chib and Hamilton (2000), Geweke,
Gowrisankaran, and Town (2003), Poirier and Tobias (2003), and Li and Tobias (2006).
Albert and Chib (1993) present a Bayesian treatment of the OP model using the Gibbs
sampler. However, the proposed Gibbs sampler mixes poorly in the case of many thresh-
old parameters and large samples. Geweke et al. (2003) analyze the endogenous binary
probit model (EBP) to study the quality of hospitals based on mortality rates in treating
pneumonia. In their analysis the patients self-select hospitals, so choices are endoge-
nous. Our model can be interpreted as an extension or synthesis of both the OP model
and the EBP model.
The rest of the paper is organized as follows. Section 2 describes the OPES model.
Section 3 presents the MCMC estimation algorithm for the model. Section 4 presents
an illustrative application using the Medical Expenditure Panel Survey (MEPS) data
on hospitalizations and health insurance. Section 5 concludes.
2. An Ordered Probit Model with Endogenous Selection
Assume that we observe $N$ independent observations for individuals who choose the treatment variable among $J$ alternatives. Let $d_i = (d_{1i}, d_{2i}, \dots, d_{J-1,i})$ be binary random variables for individual $i$ ($i = 1, \dots, N$) representing this choice (category $J$ is the baseline) such that $d_{ji} = 1$ if alternative $j$ is chosen and $d_{ji} = 0$ otherwise. Define the multinomial probit model using the multinomial latent variable structure which represents gains in utility received from the choices, relative to the utility received from choosing alternative $J$. Let the $(J-1) \times 1$ random vector $Z_i$ be defined as
\[
Z_i = W_i\alpha + \varepsilon_i,
\]
where $W_i$ is a $(J-1) \times q$ matrix of exogenous regressors and $\alpha$ is a $q \times 1$ parameter vector, such that
\[
d_{ji} = \prod_{l=1}^{J} I_{[0,+\infty)}\left(Z_{ji} - Z_{li}\right), \quad j = 1, \dots, J,
\]
where $Z_{Ji} = 0$ and $I_{[0,+\infty)}$ is the indicator function for the set $[0,+\infty)$. The distribution of the error term $\varepsilon_i$ is $(J-1)$-variate normal $N(0, \Sigma)$. For identification it is customary to restrict the leading diagonal element of $\Sigma$ to unity. We will impose identifying restrictions after defining the entire model.
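The choice rule above can be illustrated with a minimal Python sketch. The function name `choice_indicators` and the example utility values are hypothetical and introduced only for illustration; ties between utilities, which occur with probability zero under the normal errors, are resolved in favor of the first maximizer.

```python
def choice_indicators(z):
    """Map latent utility gains z = (Z_1, ..., Z_{J-1}) to (d_1, ..., d_{J-1}).

    The baseline alternative J has its utility normalized to Z_J = 0, so an
    all-zero indicator vector means that the baseline was chosen.
    """
    utilities = list(z) + [0.0]                      # append Z_J = 0
    best = max(range(len(utilities)), key=lambda j: utilities[j])
    return [1 if j == best else 0 for j in range(len(z))]


# J = 3: alternative 1 has the largest utility gain.
print(choice_indicators([0.7, -0.2]))    # [1, 0]
# Both gains are negative, so the baseline alternative is chosen.
print(choice_indicators([-0.4, -1.1]))   # [0, 0]
```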
To model the ordered dependent variable we assume that there is another latent variable $Y_i^*$ that depends on the outcomes of $d_i$ such that
\[
Y_i^* = X_i\beta + d_i\rho + u_i,
\]
where $X_i$ is a $1 \times p$ vector of exogenous regressors, and $\beta$ ($p \times 1$) and $\rho$ ($(J-1) \times 1$) are parameter vectors. Define $Y_i$ as
\[
Y_i = \sum_{m=1}^{M} m\, I_{[\tau_{m-1},\tau_m)}\left(Y_i^*\right),
\]
where $\tau_0, \tau_1, \dots, \tau_M$ are threshold parameters and $m = 1, \dots, M$. In our application $Y_i$ is an ordered variable measuring the degree of medical service utilization. For identification, it is standard to set $\tau_0 = -\infty$ and $\tau_M = \infty$ and additionally restrict $\tau_1 = 0$. Denote $\tau = (\tau_2, \dots, \tau_{M-1})$. The choice of insurance is potentially endogenous to utilization and this endogeneity is modeled through correlation between $u_i$ and $\varepsilon_i$. Assume that they are jointly normally distributed such that $\mathrm{cov}(\varepsilon_i, u_i) = \delta$, with the variance of $u_i$ restricted for identification since $Y_i^*$ is latent. Assume that $\mathrm{Var}(u_i) = 1 + \delta'\Sigma^{-1}\delta$. Then $u_i | \varepsilon_i \sim N\left(\delta'\Sigma^{-1}\varepsilon_i, 1\right)$.
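The mapping from the latent variable to the observed ordered category can be sketched in Python as follows. The function name `ordered_outcome` and the threshold values are hypothetical; the only structure taken from the text is that category $m$ is observed when $\tau_{m-1} \le Y_i^* < \tau_m$, with $\tau_0 = -\infty$, $\tau_1 = 0$ and $\tau_M = \infty$.

```python
from bisect import bisect_right


def ordered_outcome(y_star, interior_thresholds):
    """Return m in {1, ..., M} such that tau_{m-1} <= y_star < tau_m.

    interior_thresholds holds (tau_1, ..., tau_{M-1}) in increasing order,
    with tau_1 = 0; tau_0 = -inf and tau_M = +inf are implicit.
    """
    # bisect_right counts the thresholds that are <= y_star.
    return bisect_right(interior_thresholds, y_star) + 1


tau = [0.0, 0.8, 1.5, 2.1]           # illustrative values, M = 5 categories
print(ordered_outcome(-0.3, tau))    # 1
print(ordered_outcome(1.0, tau))     # 3
print(ordered_outcome(5.0, tau))     # 5
```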
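We present our estimation strategy by first simplifying the exposition of the model to be consistent with the application and reparameterizing $\Sigma$. In the application the multinomial choice is among three alternatives so that $J = 3$. Let $Z_i = (Z_{1i}, \tilde{Z}_i)$ such that $\tilde{Z}_i = Z_{2i}$ and use tilde to denote all parameters and variables related to $\tilde{Z}_i$. Denote $\mathrm{Var}(\tilde{Z}_i) = \tilde\sigma_{22}$ where in fact $\tilde\sigma_{22} = \sigma_{22}$, $\mathrm{cov}(\tilde{Z}_i, Z_{1i}) = \sigma_{21}$, and restrict the variance of $Z_{1i}$ for identification such that $\mathrm{Var}(\varepsilon_{1i}) = 1 + \sigma_{21}^2\tilde\sigma_{22}^{-1}$. Then $\varepsilon_{1i} | \tilde\varepsilon_i \sim N(\sigma_{21}\tilde\sigma_{22}^{-1}\tilde\varepsilon_i, 1)$.
Denote $\pi' = \delta'\Sigma^{-1}$, $\pi' = (\pi_1, \tilde\pi)$ (where $\pi_1$ is $1 \times 1$ and $\tilde\pi$ is $1 \times 1$) and $\tilde\sigma_{21} = \sigma_{21}\tilde\sigma_{22}^{-1}$. There is a one-to-one correspondence between parameter sets $(\delta, \Sigma)$ and $(\pi_1, \tilde\pi', \tilde\sigma_{21}, \tilde\sigma_{22})$. Then the model can be presented as
\[
\begin{aligned}
Y_i^* &= X_i\beta + d_i\rho + (Z_{1i} - W_{1i}\alpha_1)\pi_1 + (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\pi + \zeta_i, \\
Z_{1i} &= W_{1i}\alpha_1 + (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\sigma_{21} + \eta_i, \\
\tilde{Z}_i &= \tilde{W}_i\tilde\alpha + \tilde\varepsilon_i,
\end{aligned}
\]
such that
\[
\begin{pmatrix} \zeta_i \\ \eta_i \\ \tilde\varepsilon_i \end{pmatrix}
\overset{i.i.d.}{\sim} N\left( 0_3, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tilde\sigma_{22} \end{pmatrix} \right).
\]
Let $\Delta_i = (X_i, W_i, \tau, \beta, \rho, \pi_1, \tilde\pi, \alpha_1, \tilde\alpha, \tilde\sigma_{21}, \tilde\sigma_{22})$. For each observation $i$ the joint density of the observable data and latent variables is
\[
\begin{aligned}
\Pr\left(Y_i^*, Y_i, Z_{1i}, \tilde{Z}_i, d_i \mid \Delta_i\right) = {} & (2\pi)^{-3/2}\, \tilde\sigma_{22}^{-1/2} \exp\left[-0.5\, \tilde\sigma_{22}^{-1} (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)^2\right] \\
& \times \exp\left\{-0.5 \left[Z_{1i} - W_{1i}\alpha_1 - (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\sigma_{21}\right]^2\right\} \times \sum_{j=1}^{3} d_{ji} \prod_{l=1}^{3} I_{[0,+\infty)}\left(Z_{ji} - Z_{li}\right) \\
& \times \exp\left[-0.5 \left(Y_i^* - X_i\beta - d_i\rho - (Z_{1i} - W_{1i}\alpha_1)\pi_1 - (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\pi\right)^2\right] \\
& \times \left[\sum_{m=1}^{M} I_{\{Y_i = m\}}\, I_{[\tau_{m-1},\tau_m)}\left(Y_i^*\right)\right].
\end{aligned}
\]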
The joint distribution of observable and latent variables for all observations is the
product of N such independent terms over i = 1, ...N . The posterior density is propor-
tional to the product of the prior density of the parameters and the joint distribution
of observables and included latent variables.
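To make the three-equation structure concrete, the following Python sketch simulates observations from the reparameterized model with $J = 3$. It is only an illustration under stated assumptions: the function names, the dictionary keys and all parameter values are hypothetical and are not estimates from the application; `tau` holds the interior thresholds $(\tau_1, \dots, \tau_{M-1})$ with $\tau_1 = 0$.

```python
import random
from bisect import bisect_right


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def simulate_observation(rng, x, w1, w2, p):
    """Draw (d, y) for one individual from the three-equation system, J = 3."""
    eps2 = rng.gauss(0.0, p["sigma22"] ** 0.5)   # error of the tilde-Z equation
    eta = rng.gauss(0.0, 1.0)
    zeta = rng.gauss(0.0, 1.0)
    e1 = eps2 * p["sigma21"] + eta               # equals Z_1i - W_1i alpha_1
    z1 = dot(w1, p["alpha1"]) + e1
    z2 = dot(w2, p["alpha2"]) + eps2
    utilities = [z1, z2, 0.0]                    # the baseline utility is zero
    best = utilities.index(max(utilities))
    d = [int(best == 0), int(best == 1)]
    y_star = (dot(x, p["beta"]) + dot(d, p["rho"])
              + e1 * p["pi1"] + eps2 * p["pi2"] + zeta)
    y = bisect_right(p["tau"], y_star) + 1       # ordered category in 1..M
    return d, y


# Hypothetical parameter values, chosen only to exercise the code.
params = {"alpha1": [0.2, 0.5], "alpha2": [-0.1, 0.4], "beta": [-1.0, 0.3],
          "rho": [0.2, -0.1], "pi1": 0.1, "pi2": 0.1,
          "sigma21": 1.0, "sigma22": 1.0, "tau": [0.0, 0.8, 1.5, 2.1]}
rng = random.Random(2007)
sample = [simulate_observation(rng, [1.0, rng.gauss(0, 1)],
                               [1.0, rng.gauss(0, 1)],
                               [1.0, rng.gauss(0, 1)], params)
          for _ in range(1000)]
share_lowest = sum(1 for _, y in sample if y == 1) / len(sample)
print(f"share of observations in the lowest category: {share_lowest:.2f}")
```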
In order to identify causal effects of the endogenous treatment variables on the out-
come variable one needs exclusion restrictions, which arise if there are variables which
affect the insurance choices but not utilization. We discuss such restrictions at greater
length in the application section. However, there is a further identification issue in that
the multinomial probit structure is known to be difficult to identify in the absence of
additional restrictions. There are two potential sources of such restrictions, those that
arise from restricting elements of the covariance matrix and those that are generated
through restrictions on covariates which affect the utility levels of certain alternatives.
Keane (1992) shows, in the maximum likelihood framework, that without such exclusion
restrictions the estimation of the covariance parameters is tenuous. Because the identi-
fication problem also arises in the current Bayesian context, we propose such exclusion
restrictions in the application section. Further, we also study properties of the Markov
chain in our application under the no-exclusion-restriction specification.
The leading diagonal element $\sigma_{11}$ of matrix
\[
\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{21} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}
\]
is restricted for identification. However, the model is not formally identified if $\sigma_{21} = 0$ since in that case variance parameter $\sigma_{22}$ must also be restricted. Bunch (1991) investigates estimability in the MNP model and recommends that both variance parameters be restricted in which case the model is formally identified. We restrict parameter $\tilde\sigma_{22}$ to unity. This restriction greatly improves the convergence properties of the resulting Markov chain. When unrestricted, even a visual examination of the chains for parameters $\tilde\sigma_{21}$, $\tilde\sigma_{22}$ in the application after considerably long runs indicates that convergence has not been achieved. This poor mixing is consistent with McCulloch, Polson, and Rossi (2000) who have a similar reparameterization of the covariance matrix in their analysis of the MNP model. They point out that more diffuse priors cause slower convergence in their MCMC algorithm.
We select proper prior distributions for all parameters. The prior distributions for parameters $\alpha_1$ and $\tilde\alpha$ are normal $N(\underline{\alpha}_1, \underline{H}_{\alpha_1}^{-1})$ and $N(\underline{\tilde\alpha}, \underline{H}_{\tilde\alpha}^{-1})$, respectively, and centered at zero vectors
\[
\alpha_1 \sim N(0, 10 I_{k_1}), \quad \tilde\alpha \sim N(0, 10 I_{\tilde{k}}).
\]
We also select proper normal priors centered at zero for parameters $\beta$ and $\rho$
\[
\beta \sim N(0, 10 I_p), \quad \rho \sim N(0, 10 I_2).
\]
For the covariances among the errors $\varepsilon_i$ in the MNP choice equations and the error $u_i$ in the latent variable $Y_i^*$, we select prior distributions such that $E(\delta) = 0$ and $E(\Sigma)$ is a diagonal matrix which means that $E(\sigma_{21}) = 0$. The last assumption implies that in the prior distributions parameters $\tilde\sigma_{21}$ and $\pi$ are centered at zeros. We choose the priors
\[
\pi_1 \sim N(0, \kappa), \quad \tilde\pi \sim N(0, \kappa), \quad \tilde\sigma_{21} \sim N(0, \kappa).
\]
We try two values $\kappa = 1/2$ and $\kappa = 1/8$ to evaluate the sensitivity of the tested null hypothesis of no endogeneity to the choice of the priors. While parameters $\delta$ and $\sigma_{21}$ must satisfy complicated restrictions such that $\Sigma$ and the covariance matrix of vector $(\varepsilon_i', u_i)'$ be positive definite respectively, the new parameters $\pi$ and $\tilde\sigma_{21}$ do not have such restrictions.
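The absence of restrictions on the new parameters can be checked numerically. The Python sketch below maps $(\pi_1, \tilde\pi, \tilde\sigma_{21}, \tilde\sigma_{22})$ back to $\Sigma$, $\delta$ and $\mathrm{Var}(u_i)$ using the definitions of Section 2 ($\sigma_{21} = \tilde\sigma_{21}\tilde\sigma_{22}$, $\sigma_{11} = 1 + \sigma_{21}^2\tilde\sigma_{22}^{-1}$, $\delta = \Sigma\pi$). The function name and the numerical inputs are hypothetical. Under these definitions $\det\Sigma = \tilde\sigma_{22}$, so $\Sigma$ is positive definite for any real $\tilde\sigma_{21}$ as long as $\tilde\sigma_{22} > 0$.

```python
def implied_covariances(pi1, pi_t, sigma21_t, sigma22_t):
    """Map (pi_1, pi~, sigma~_21, sigma~_22) back to Sigma, delta and Var(u_i)."""
    s22 = sigma22_t
    s21 = sigma21_t * sigma22_t                  # sigma_21 = sigma~_21 * sigma~_22
    s11 = 1.0 + sigma21_t ** 2 * sigma22_t       # 1 + sigma_21^2 / sigma~_22
    sigma = [[s11, s21], [s21, s22]]
    # pi' = delta' Sigma^{-1} implies delta = Sigma pi.
    delta = [s11 * pi1 + s21 * pi_t, s21 * pi1 + s22 * pi_t]
    var_u = 1.0 + pi1 * delta[0] + pi_t * delta[1]   # 1 + delta' Sigma^{-1} delta
    return sigma, delta, var_u


# Deliberately large, arbitrary inputs: the implied Sigma is still valid.
sigma, delta, var_u = implied_covariances(0.5, -2.0, 3.0, 1.0)
det_sigma = sigma[0][0] * sigma[1][1] - sigma[0][1] ** 2
print(sigma, delta, var_u, det_sigma)
```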
The priors for the threshold parameters must respect the order restrictions placed
on them. It is easier to choose priors by reparameterizing these parameters first. The
prior distributions for the threshold parameters are specified in the next section.
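The reparameterization used in the MCMC section, $\gamma_2 = \log(\tau_2)$ and $\gamma_j = \log(\tau_j - \tau_{j-1})$ with $\tau_1 = 0$, can be sketched in Python as below. The function names are hypothetical. The point of the sketch is that any real vector $\gamma$ maps to strictly increasing, positive thresholds, so an unrestricted normal prior on $\gamma$ respects the order restrictions automatically.

```python
import math


def thresholds_from_gamma(gamma):
    """Map gamma = (gamma_2, ..., gamma_{M-1}) to tau = (tau_2, ..., tau_{M-1})."""
    tau, running = [], 0.0               # the running sum starts at tau_1 = 0
    for g in gamma:
        running += math.exp(g)           # tau_j = sum_{k=2}^{j} exp(gamma_k)
        tau.append(running)
    return tau


def gamma_from_thresholds(tau):
    """Inverse map: gamma_j = log(tau_j - tau_{j-1}), with tau_1 = 0."""
    gamma, previous = [], 0.0
    for t in tau:
        gamma.append(math.log(t - previous))
        previous = t
    return gamma


print(thresholds_from_gamma([0.0, 0.0]))                # [1.0, 2.0]
print(thresholds_from_gamma([-1.0, 0.5, -2.0]))         # increasing for any real input
```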
3. MCMC algorithm
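We block the parameter set as $[Z_{1i}, \tilde{Z}_i]$, $[Y_i^*, \tau]$, $\alpha_1$, $\tilde\alpha$, $[\beta, \rho, \pi_1, \tilde\pi]$, $\tilde\sigma_{21}$ and adopt a hybrid Metropolis-Hastings/Gibbs algorithm. The steps of the MCMC algorithm are the following:
1. The latent variables $Z_{1i}$ ($i = 1, \dots, N$) are conditionally independent with normal distribution $Z_{1i} \sim N(\overline{Z}_{1i}, \overline{H}_{1i}^{-1})$ where
\[
\begin{aligned}
\overline{H}_{1i} &= 1 + \pi_1^2, \\
\overline{Z}_{1i} &= W_{1i}\alpha_1 + \overline{H}_{1i}^{-1}\left[\pi_1\left(Y_i^* - X_i\beta - d_i\rho - (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\pi\right) + (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\sigma_{21}\right],
\end{aligned}
\]
and subject to
\[
Z_{1i} > \max\{Z_{li} \mid l = 2, \dots, J\} \text{ if } d_{1i} = 1 \quad \text{and} \quad Z_{1i} < \max\{Z_{li} \mid l = 2, \dots, J\} \text{ if } d_{1i} = 0.
\]
The latent variables $\tilde{Z}_i$ ($i = 1, \dots, N$) are conditionally independent with the normal distribution $\tilde{Z}_i \sim N(\overline{Z}_i, \overline{H}_i^{-1})$ where
\[
\begin{aligned}
\overline{H}_i &= \tilde\sigma_{22}^{-1} + \tilde\pi^2 + \tilde\sigma_{21}^2, \\
\overline{Z}_i &= \tilde{W}_i\tilde\alpha + \overline{H}_i^{-1}\left[\tilde\pi\left(Y_i^* - X_i\beta - d_i\rho - (Z_{1i} - W_{1i}\alpha_1)\pi_1\right) + \tilde\sigma_{21}(Z_{1i} - W_{1i}\alpha_1)\right],
\end{aligned}
\]
and truncated such that
\[
Z_{ji} > \max\{Z_{li} \mid l = 1, \dots, J,\ l \neq j\} \text{ if } d_{ji} = 1 \quad \text{and} \quad Z_{ji} < \max\{Z_{li} \mid l = 1, \dots, J,\ l \neq j\} \text{ if } d_{ji} = 0.
\]
We use the rejection sampling algorithm of Geweke (1991) to draw values from all truncated normal distributions in our algorithm.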
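2. The full joint conditional density of block $[Y_i^*, \tau]$ is
\[
\begin{aligned}
\Pr\left(Y_i^*, \tau \mid Y_i, Z_i, d_i, \overline{\Delta}_i\right) = {} & \prod_{i=1}^{N} \left[\sum_{m=1}^{M} I_{\{Y_i = m\}}\, I_{[\tau_{m-1},\tau_m)}\left(Y_i^*\right)\right] \\
& \times \exp\left[-0.5 \left(Y_i^* - X_i\beta - d_i\rho - (Z_{1i} - W_{1i}\alpha_1)\pi_1 - (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\pi\right)^2\right],
\end{aligned}
\]
which we write as
\[
\Pr\left(Y_i^*, \tau \mid Y_i, Z_i, d_i, \overline{\Delta}_i\right) = \Pr\left(Y_i^* \mid \tau, Y_i, Z_i, d_i, \overline{\Delta}_i\right) \Pr\left(\tau \mid Y_i, Z_i, d_i, \overline{\Delta}_i\right),
\]
where $\overline{\Delta}_i$ includes the same parameters as $\Delta_i$ except for $\tau$. Latent variable $Y_i^*$ is $N\left(X_i\beta + d_i\rho + (Z_{1i} - W_{1i}\alpha_1)\pi_1 + (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\pi, 1\right)$ and conditional on $Y_i = m$ it is truncated on the left by $\tau_{m-1}$ and on the right by $\tau_m$. The full conditional density of vector $\tau = (\tau_2, \dots, \tau_{M-1})$ is
\[
\prod_{i=1}^{N} \left[\sum_{m=1}^{M} I_{\{Y_i = m\}} \Pr\left(\tau_{m-1} < Y_i^* < \tau_m \mid \overline{\Delta}_i, d_i, Z_i, Y_i\right)\right]
\]
where
\[
\begin{aligned}
\Pr\left(\tau_{m-1} < Y_i^* < \tau_m \mid \overline{\Delta}_i, d_i, Z_i, Y_i\right) = {} & \Phi\left[\tau_m - \left(X_i\beta + d_i\rho + (Z_{1i} - W_{1i}\alpha_1)\pi_1 + (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\pi\right)\right] \\
& - \Phi\left[\tau_{m-1} - \left(X_i\beta + d_i\rho + (Z_{1i} - W_{1i}\alpha_1)\pi_1 + (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\pi\right)\right].
\end{aligned}
\tag{3.1}
\]
As the elements of vector $(\tau_2, \dots, \tau_{M-1})$ are ordered, the prior assigned to the threshold parameters must be restricted. Instead, we follow Chib and Hamilton (2000) and reparameterize them as
\[
\gamma_2 = \log(\tau_2), \quad \gamma_j = \log(\tau_j - \tau_{j-1}), \quad 3 \le j \le M - 1,
\]
and assign a normal prior $N(\gamma_0, \Gamma_0)$ without any restrictions since elements of vector $\gamma = (\gamma_2, \dots, \gamma_{M-1})$ do not have to be ordered. The full conditional for vector $\gamma$ is the product of the prior and the full conditional (3.1) after substituting
\[
\tau_j = \sum_{k=2}^{j} \exp(\gamma_k).
\]
This density is intractable and we utilize the Metropolis-Hastings algorithm to sample from it, using a t-distribution centered at the modal value of the full conditional density as the proposal density. Note that it would be possible to avoid the Metropolis-Hastings step with a reparameterization similar to Nandram and Chen (1996) if the number of ordered categories did not exceed three. Let
\[
\hat\gamma = \arg\max \log p\left(\gamma \mid \overline{\Delta}_i, d_i, Z_i, Y_i\right),
\]
and let $V_{\hat\gamma} = -(H_{\hat\gamma})^{-1}$ be the negative inverse of the Hessian of $\log p\left(\gamma \mid \overline{\Delta}_i, d_i, Z_i, Y_i\right)$ evaluated at the mode $\hat\gamma$. Choose the proposal distribution $q(\gamma) = f_T(\gamma \mid \hat\gamma, \varphi V_{\hat\gamma}, \upsilon)$, a t-distribution with $\upsilon$ degrees of freedom and tuning parameter $\varphi$, an adjustable constant selected to obtain reasonable acceptance rates. When a proposal value $\gamma^*$ is drawn the chain moves to the proposal value with probability
\[
\Pr(\gamma, \gamma^*) = \min\left\{ \frac{p\left(\gamma^* \mid \overline{\Delta}_i, d_i, Z_i, Y_i\right) q(\gamma)}{p\left(\gamma \mid \overline{\Delta}_i, d_i, Z_i, Y_i\right) q(\gamma^*)},\ 1 \right\}.
\]
If the proposal value is rejected then the next state of the chain is at the current value $\gamma$.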
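3. Given the prior distribution of $\alpha_1$, $N(\underline{\alpha}_1, \underline{H}_{\alpha_1}^{-1})$, the full conditional distribution of $\alpha_1$ is $\alpha_1 \sim N(\overline{\alpha}_1, \overline{H}_{\alpha_1}^{-1})$ where
\[
\begin{aligned}
\overline{H}_{\alpha_1} = {} & \underline{H}_{\alpha_1} + \sum_{i=1}^{N} W_{1i}'\left(1 + \pi_1^2\right) W_{1i}, \\
\overline{\alpha}_1 = {} & \overline{H}_{\alpha_1}^{-1} \Bigg[ \underline{H}_{\alpha_1}\underline{\alpha}_1 + \sum_{i=1}^{N} \Big\{ W_{1i}'\left(1 + \pi_1^2\right) Z_{1i} - W_{1i}'(\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\sigma_{21} \\
& - W_{1i}'\pi_1\left(Y_i^* - X_i\beta - d_i\rho - (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\tilde\pi\right) \Big\} \Bigg].
\end{aligned}
\]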
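4. Given the prior distribution of $\tilde\alpha$, $N(\underline{\tilde\alpha}, \underline{H}_{\tilde\alpha}^{-1})$, the full conditional distribution of $\tilde\alpha$ is $\tilde\alpha \sim N(\overline{\tilde\alpha}, \overline{H}_{\tilde\alpha}^{-1})$ where
\[
\begin{aligned}
\overline{H}_{\tilde\alpha} = {} & \underline{H}_{\tilde\alpha} + \sum_{i=1}^{N} \tilde{W}_i'\left(\tilde\sigma_{22}^{-1} + \tilde\pi^2 + \tilde\sigma_{21}^2\right)\tilde{W}_i, \\
\overline{\tilde\alpha} = {} & \overline{H}_{\tilde\alpha}^{-1} \Bigg[ \underline{H}_{\tilde\alpha}\underline{\tilde\alpha} + \sum_{i=1}^{N} \Big\{ \tilde{W}_i'\left(\tilde\sigma_{22}^{-1} + \tilde\pi^2 + \tilde\sigma_{21}^2\right)\tilde{Z}_i \\
& - \tilde{W}_i'\tilde\pi\left(Y_i^* - X_i\beta - d_i\rho - (Z_{1i} - W_{1i}\alpha_1)\pi_1\right) - \tilde{W}_i'\tilde\sigma_{21}(Z_{1i} - W_{1i}\alpha_1) \Big\} \Bigg].
\end{aligned}
\]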
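5. Let $C_i = \left(X_i, d_i, (Z_{1i} - W_{1i}\alpha_1), (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)\right)$, $\chi' = (\beta', \rho')$, $\theta' = (\chi', \pi_1, \tilde\pi)$ and specify prior distributions $\chi \sim N(\underline{\chi}, \underline{H}_\chi^{-1})$ and $\pi \sim N(\underline{\pi}, \underline{H}_\pi^{-1})$. The full conditional distribution of $\theta$ is $\theta \sim N(\overline{\theta}, \overline{H}_\theta^{-1})$ where
\[
\begin{aligned}
\overline{H}_\theta &= \begin{pmatrix} \underline{H}_\chi & 0 \\ 0 & \underline{H}_\pi \end{pmatrix} + \sum_{i=1}^{N} C_i' C_i, \\
\overline{\theta} &= \overline{H}_\theta^{-1} \left[ \begin{pmatrix} \underline{H}_\chi \underline{\chi} \\ \underline{H}_\pi \underline{\pi} \end{pmatrix} + \sum_{i=1}^{N} C_i' Y_i^* \right].
\end{aligned}
\]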
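6. Given the prior distribution of $\tilde\sigma_{21}$, $N(\underline{\sigma}, \underline{H}_\sigma^{-1})$, the full conditional distribution of $\tilde\sigma_{21}$ is $\tilde\sigma_{21} \sim N(\overline{\sigma}, \overline{H}_\sigma^{-1})$ where
\[
\begin{aligned}
\overline{H}_\sigma &= \underline{H}_\sigma + \sum_{i=1}^{N} (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)^2, \\
\overline{\sigma} &= \overline{H}_\sigma^{-1} \left[ \underline{H}_\sigma \underline{\sigma} + \sum_{i=1}^{N} (\tilde{Z}_i - \tilde{W}_i\tilde\alpha)(Z_{1i} - W_{1i}\alpha_1) \right].
\end{aligned}
\]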
This concludes the MCMC algorithm.
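Two building blocks of the sampler can be sketched in Python: a truncated normal draw of the kind needed in steps 1 and 2, and the scalar normal update of step 6. This is a minimal sketch under stated assumptions, not the authors' implementation. In particular, the truncated draw uses inverse-CDF sampling rather than the rejection sampler of Geweke (1991), which is simpler but less accurate far in the tails, and all function and argument names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()   # standard normal, used for cdf and inverse cdf


def draw_truncated_normal(rng, mean, sd, lower=-math.inf, upper=math.inf):
    """Inverse-CDF draw from N(mean, sd^2) restricted to [lower, upper]."""
    a = _STD.cdf((lower - mean) / sd)
    b = _STD.cdf((upper - mean) / sd)
    u = a + (b - a) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)          # inv_cdf needs 0 < u < 1
    x = mean + sd * _STD.inv_cdf(u)
    return min(max(x, lower), upper)             # guard against rounding error


def sigma21_conditional(resid_tilde, resid_1, prior_mean, prior_precision):
    """Mean and precision of the normal full conditional in step 6.

    resid_tilde[i] stands for Z~_i - W~_i alpha~ and resid_1[i] for
    Z_1i - W_1i alpha_1.
    """
    precision = prior_precision + sum(e * e for e in resid_tilde)
    mean = (prior_precision * prior_mean
            + sum(e * r for e, r in zip(resid_tilde, resid_1))) / precision
    return mean, precision


rng = random.Random(1)
# A draw of Z_1i when alternative 1 is chosen and the largest rival utility is 0.
z1_draw = draw_truncated_normal(rng, mean=0.3, sd=1.0, lower=0.0)
# Prior N(0, kappa) with kappa = 1/2 read as a variance, so the precision is 2.
post_mean, post_precision = sigma21_conditional([1.0, 2.0], [2.0, 4.0], 0.0, 2.0)
new_sigma21 = rng.gauss(post_mean, post_precision ** -0.5)
print(z1_draw, post_mean, post_precision, new_sigma21)
```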
4. Application
We study the effects of different types of medical insurance plans on hospital admission
rates of the US population aged between 55 and 75 years. Hospital utilization shows
large increases in admission rates at the age of 65, the age at which the majority of
individuals become eligible for coverage under the Medicare public insurance program
for the elderly. Hence in the literature the transition to age 65 has been treated as an
important source of variation in modeling the probability of Medicare coverage.
4.1. Modeling Considerations
The standard regression discontinuity approach assumes that measures of hospital uti-
lization would change smoothly with age in the absence of Medicare insurance and,
therefore, any change in utilization would occur due to changes in insurance status.
Following Lichtenberg (2002) and Decker, Dushi, and Deb (2005), it is argued that
there is no reason to believe that at age 65 individuals experience any structural breaks
in their health status. Although some measures of individual's medical care use do show
a positive change at age 65, these movements are interpreted as reflecting the expression
of demand previously postponed by the uninsured individuals, and not as an indicator
of sharply deteriorating health. Hence the binary variable, DUM65, which indicates
whether an individual is older than or equal to 65 years of age, would affect utilization
only through the Medicare insurance status. In such a case DUM65 generates an ex-
clusion restriction; that is, under our specification DUM65 affects the insurance status,
but not utilization.
An example is the article by Card, Dobkin and Maestas (2004) which analyzes the
effect of medical insurance on utilization of medical services for the US subpopulation
aged 55-75 years in order to identify its causal effects. The article analyzes various
measures of demand for medical services, including hospital admission rates. As indi-
cated by Card et al. (2004), the study is subject to some limitations. First, it does not
distinguish among different types of medical insurance, but instead combines them in
a single insurance category. However, plan heterogeneity can potentially be a serious
issue. For example, if selection of insurance plans is partially driven by unobservable
attitude towards risk, with the privately insured being relatively more risk-averse and
the publicly insured being relatively more risk-inclined, then on average it could be dif-
ficult to first identify and then separate the selection effects if both plans are aggregated
into a single category. Second, the dependent variable in Card et al. (2004) is assumed
to be linear in covariates, a specification which would not accommodate nonlinearities,
such as variable marginal effects. Our specification allows for nonlinear relationships
between the dependent variable and covariates. Finally, the reference category in Card
et al. (2004) is the uninsured. Only a very small fraction of individuals is uninsured in
the over-65 group; in the MEPS sample less than one percent of the elderly is uninsured.
Potentially this makes it difficult to identify the coefficient of DUM65.
Our analysis is conditional on all individuals in the sample being insured. This solves
the problem of instability of parameter estimates arising from having very few uninsured
individuals aged over 65 years. Health economists generally expect that the treatment
effect of insurance relative to no insurance is positive. It seems more interesting to
identify signs and magnitudes of treatment effects for different insurance plans relative
to each other.
We consider only those individuals aged between 55 and 75 years who are insured,
either by Medicare only, or by a private insurance plan. Further, two types of private
insurance plans are considered: group and nongroup private plans. In order to have
a group plan an individual must be either employed or have it through an employed
spouse. This is referred to as an employer sponsored insurance (ESI) plan. A nongroup
plan must be purchased individually through a private insurance firm, and hence it is
referred to as an individually purchased insurance (IPI) plan. Group plans benefit from
a higher degree of risk pooling and typically involve lower premia than the correspond-
ing IPI plans. For the Medicare-eligible population, an ESI plan is a supplementary
insurance plan which may cover copayments for ambulatory visits and payments to-
wards prescription medications. An IPI plan may be a variant of a number of available
supplementary insurance plans. Some of these variants can be expected to have a higher
premium and less generous coverage. Goldman and Zissimopoulos (2003, p.198) show,
using 1998 Health and Retirement Survey data, that beneficiaries with ESI plans have
significantly lower average out-of-pocket expenditures than those with other supplemen-
tary plans. Atherly (2002, p.137) notes that in the past ESI plans have been a major
source of prescription drug coverage. Hence, it is possible that healthier individuals, or
those with a greater capacity for assuming health risks, would self-select into private
plans with lower coverage.
Individuals with Medicaid (public insurance for the low-income) are deleted from
the sample. To be eligible for Medicaid an individual must satisfy some strict income
criteria. In the health economics literature, Medicaid status is often assumed to be
exogenous (an important exception is related to nursing home care when some individuals
“spend down” their assets to qualify for Medicaid which provides coverage of
long-term care). We follow Fang, Keane, and Silverman (2006) who also exclude the
Medicaid patients from their analysis of selection into Medigap private insurance. This
reduces heterogeneity in the sample since Medicaid patients are likely to be different
from the rest of the insured population. Thus, the choice of insurance is trinomial in
our application.
Our data set pools observations from the 1996-2003 waves of the Medical Expendi-
ture Panel Survey, a nationally representative survey of health care use, expenditure,
sources of payment and insurance coverage for the US civilian non-institutionalized
population. The MEPS sampling frame is a two-year overlapping panel, i.e., in each
calendar year after the first survey year, one sample is in its second year of responses
while another sample is in its first year of responses. To avoid panel and clustering
issues, we only use observations on the second round of the survey respondents in each
year with the dependent variable and income measuring the outcomes for the two year
period. We use only those individuals whose insurance status did not change during the
survey period. We set the nongroup insurance (IPI) plans as the baseline choice with
Medicare only (MEDICARE) and group private plans (ESI) as the choice categories for
insurance plans.
4.2. Covariates and Exclusion Restrictions
Table 1 gives summary statistics of all variables used in our analysis. The sample has
11,432 observations. Table 2 describes the distribution of the hospital visits (HOSPVIS)
up to cell 4. The table shows that the dependent variable has about 80% of the cases
with zero utilization; it seems unlikely that even mean-preserving transformations of
the Poisson model that allow for overdispersion will provide a satisfactory fit to the
data. Because it is not feasible to estimate threshold parameters for very sparse cells in
the right tail we combine them so that the last cell (> 4) has 1.36 percent of the whole sample (155 observations). HOSPVIS has five cells and three threshold parameters to
estimate. The mean and standard deviation (in parentheses) of the dependent variable
are 0.43 (0.87) for those on Medicare only, 0.27 (0.693) for those with group plans and
0.34 (0.79) for nongroup plans.
The covariate vector X consists of self-perceived health status variables VEGOOD,
GOOD, FAIR, POOR (excellent health is the omitted category), measures of chronic dis-
eases and physical limitation, CHRONIC and PHYSLIM, geographical variables NORE-
AST, MIDWEST, SOUTH and MSA, demographic variables BLACK, HISPANIC, FAM-
SIZE, FEMALE, MARRIED, EDUC, AGE, year dummies YEAR98-YEAR03 and eco-
nomic variable INCOME. The insurance variables are MEDICARE and ESI. The year
dummies are intended to capture the variations induced by trend-like changes affecting
the sample.
Vectors W1 and W2 include all variables included in X plus exclusion restrictions
to identify three covariance parameters. Exclusion restrictions are different for vectors
W1 and W2. Since latent variables $Z_{1i}$ and $\tilde{Z}_i$ measure gains in utility received from MEDICARE and ESI choices, relative to the utility received from the baseline IPI
choice, we offer an explanation for why variables affect one utility level but not the
other. One of the exclusion restrictions included in W1 is DUM65, which was also used
by Card et al. (2004). The reasons why DUM65 should not directly affect utilization
have been given above. DUM65 indicates that the MEDICARE choice is available to
most individuals aged above 65 (One has to pay Medicare taxes for 40 quarters to be
eligible). However, DUM65 should not have a direct effect on the utility from the ESI
choice and, therefore, should not enter W2 since ESI is related to employment and there
is no evidence that the employment rates drop substantially at the age 65. In fact, in
our data the employment rate monotonically decreases with age.
Another restriction, entering vector W2 but not W1, is OFFER, an indicator of
whether the current employer offers a health insurance benefit. This variable should
affect the choice of insurance but does not have a direct impact on the utilization vari-
able. Some of those who are offered a group plan choose not to purchase it. Therefore,
being offered a group plan does not automatically make one insured with an ESI plan.
On the other hand, some of those with group insurance coverage receive it through a
spouse, and not as an employment benefit. Variable OFFER has a direct impact on
availability of employment-based private insurance but should not affect availability of
Medicare or nongroup plans.
Our final exclusion restriction is imposed through the variable SSIRATIO, which
enters both W1 and W2. It is defined as the ratio of the social security income received
over the two years to the individual's total income for that period. It is argued that
those individuals whose main source of income is social security are less likely to pur-
chase nongroup private insurance plans. That is, we assume that a high value of the
SSIRATIO, given income, reflects reduced affordability of such private insurance. Thus,
SSIRATIO should affect the utility level of the IPI choice and, since both $Z_{1i}$ and $\tilde{Z}_i$ are defined as measures of utility relative to the baseline choice, it should be included
in both equations. Note also that SSIRATIO is an attribute of an individual, whereas
OFFER and DUM65 may be thought of as attributes of insurance plans. Hence it seems
correct that SSIRATIO enters both insurance equations, whereas OFFER and DUM65,
respectively, enter only one insurance equation.
If our data had information on insurance premiums for the considered plans then we
could use them as our exclusion restrictions. The actual premiums have direct impacts
on the choices. Given the limitation of our data we rely on proxy variables which are
correlated with the premiums. In general, the main difficulty which arises from the use
of proxy variables as exclusion restrictions is that there may be arguments that would
question their validity, in which case all variables excluded from the outcome equations
should enter the treatment equations through both W1 and W2. We estimate such
a specification of the model, study the properties of the resulting Markov chain, and
compare our findings with those of a similar investigation in the maximum likelihood
framework by Keane (1992).
4.3. Some Computing Issues
Our program has been fully tested with the joint distribution tests of posterior simulators
developed by Geweke (2004). We generate data with 10 observations and specify the
model such that there are three threshold parameters to estimate γ = (γ2, γ3, γ4) and
the dependent variable takes five values. The regressors are generated as
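\[
W_{1i} = (1, w_{1i}),\ w_{1i} \sim N(0, 1); \quad \tilde{W}_i = (1, \tilde{w}_i),\ \tilde{w}_i \sim N(0, 1); \quad X_i = (1, x_i),\ x_i \sim N(0, 1),
\]
and fixed. All the selected prior distributions are proper:
\[
\begin{aligned}
& \alpha_1 \sim N(0, 0.25 I_2), \quad \tilde\alpha \sim N(0, 0.25 I_2); \\
& \beta \sim N(0, 0.25 I_2), \quad \rho \sim N(0, 0.25 I_2); \\
& \pi_1 \sim N(0, 0.25), \quad \tilde\pi \sim N(0, 0.25), \quad \tilde\sigma_{21} \sim N(0, 0.25); \\
& \gamma \sim N(0, 0.25 I_3).
\end{aligned}
\]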
The tests are based on the 14 first moments and 105 second moments of the 14 para-
meters and 200,000 iterations. The algorithm passes the joint distribution tests.
4.4. Results
We estimate the OPES model and report posterior means and standard deviations of
parameters $\beta$, $\alpha_1$ and $\tilde\alpha$, except for the geographical and year dummies, in Table 3 and parameters $\rho$, $\pi$, $\tilde\sigma_{21}$ and $\tau$ in Table 5. The reported results are for the set of priors
corresponding to κ = 1/2 and based on Markov chains run for 50,000 replications,
after discarding the first 1,000 draws of the burn-in phase. We collect every 50th iteration,
discarding the rest. For the tuning parameter the value of ϕ = 2 is selected so that
the acceptance rates are equal to 0.25. The model has 87 parameters to estimate and
the autocorrelation functions for most of them die off after at most 5 lags. The slowest
convergence is observed for parameters $\rho$, $\pi$, $\tilde\sigma_{21}$ and $\tau$ for which serial correlation
is much more considerable. Table 4 reports the relative numerical efficiencies (RNE)
corresponding to them, which are between 0.028 and 0.09. However, all these parameters
pass the formal test of convergence based on Geweke (1992). The relative numerical
efficiencies for 12 more parameters are between 0.1 and 0.3 with the rest exceeding 0.3.
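For readers who want a feel for what a relative numerical efficiency in this range means, the Python sketch below computes a simple autocorrelation-based approximation, $1/(1 + 2\sum_k \rho_k)$, truncating the sum at the first non-positive sample autocorrelation. This is an assumption made for illustration: it is not the spectral estimator of Geweke (1992) and not necessarily the calculation behind Table 4. The test chain is an artificial AR(1) series with coefficient 0.9, for which the theoretical value is $(1 - 0.9)/(1 + 0.9) \approx 0.05$.

```python
import random


def relative_numerical_efficiency(draws, max_lag=200):
    """Approximate RNE as 1 / (1 + 2 * sum of leading positive autocorrelations).

    Assumes the draws are not all identical (non-zero sample variance).
    """
    n = len(draws)
    mean = sum(draws) / n
    centered = [x - mean for x in draws]
    variance = sum(c * c for c in centered) / n
    total = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = sum(centered[t] * centered[t + lag]
                  for t in range(n - lag)) / (n * variance)
        if rho <= 0.0:            # stop at the first non-positive autocorrelation
            break
        total += rho
    return 1.0 / (1.0 + 2.0 * total)


# Artificial, strongly autocorrelated chain (not output from the OPES sampler).
rng = random.Random(0)
chain, state = [], 0.0
for _ in range(5000):
    state = 0.9 * state + rng.gauss(0.0, 1.0)
    chain.append(state)
rne = relative_numerical_efficiency(chain)
print(f"approximate RNE of the AR(1) chain: {rne:.3f}")
```

An RNE of 0.05 means that roughly twenty correlated draws carry the information of one independent draw, which is why the slow-mixing parameters need long runs.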
To facilitate discussion of the results we estimate and report in Table 3 the posterior
means and standard deviations of the average marginal effects for $E(Y|X,d)$, $\Pr(d_1 = 1|W)$ and $\Pr(d_2 = 1|W)$. The average is taken with respect to all sampled individuals
and the posterior distribution of the parameters. For the binary dummy variables the
marginal effects are calculated as the partial differences when the respective exogenous
variables change their values from 0 to 1.
The results for exogenous covariates are plausible. Health status indicators have
strong impacts on utilization as worsening health conditions increase the probability
of hospitalization. Age has a positive impact on probability of hospitalization. Being
female and living in a metro area have negative impacts. Income, race, family size, and
education have no impact. The variables that generate exclusion restrictions are strongly
correlated with the insurance choice variables. DUM65 has a strong positive impact
on the probability of being on Medicare only. OFFER has a strong impact on being
privately insured. SSIRATIO has a strong negative impact on ESI and positive impact
on Medicare only.
We perform formal tests for null hypotheses that set the covariance parameters to
zero. Denote by M1 the unconstrained model specification and by M0 the constrained
model. Assume that models M1 and M0 employ the same priors for parameters common
to both models. Since models M1 and M0 are nested, we test the hypothesis using the
Savage-Dickey density ratio approach (Verdinelli and Wasserman, 1995) to calculate the
Bayes factor as
$$B_{01} = \frac{m(y \mid M_0)}{m(y \mid M_1)}.$$
First we calculate the Bayes factor for H0 : π1 = π̃ = 0 against the alternative that
leaves these parameters unconstrained. According to this approach the Bayes factor can
be calculated as
$$B_{01} = \frac{p(\pi_1^{*}, \tilde{\pi}^{*} \mid y)}{p(\pi_1^{*}, \tilde{\pi}^{*})}, \tag{4.1}$$
where p(π1, π̃|y) is the joint posterior density and p(π1, π̃) is the prior for parameters
π1, π̃, both calculated for the unrestricted model at the point π1∗ = π̃∗ = 0. In general, less
informative priors for these parameters would always favor the null hypothesis. This
motivates our choice of proper priors and the use of a sensitivity check. The posterior
means and standard deviations of parameters π1 and π̃ are 0.092 (0.079) and 0.122
(0.165), respectively. The calculated Bayes factor, 12.8 (6.6), does not provide strong
evidence to support or reject the null hypothesis. This conclusion is robust to both
specifications of the priors, which correspond to κ = 1/2 and κ = 1/8, respectively.
Similarly, we calculate the Bayes factor for H0 : σ̃21 = 0 against the alternative that
leaves parameter σ̃21 unconstrained. The estimated posterior distribution of the covariance
parameter σ̃21 is centered at the posterior mean 1.011 with posterior standard
deviation 0.188. The estimated Bayes factor is 0.0 (0.0) and the null hypothesis is
overwhelmingly rejected. The positive sign of the covariance parameter indicates that the
common unobserved factors influencing the choices affect them in the same directions.
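A one-parameter version of the Savage-Dickey calculation can be sketched in Python as below. The posterior ordinate at the null value is approximated with a Gaussian kernel density estimate over posterior draws, which is a generic substitute and not necessarily how the ordinate was obtained in the paper. The normal prior, its standard deviation, the simulated draws and the function names are all assumptions made for illustration, so the printed number does not reproduce any Bayes factor reported here.

```python
import random
from statistics import NormalDist, fmean, stdev


def kde_at(point, draws):
    """Gaussian kernel density estimate at `point` with a rule-of-thumb bandwidth."""
    h = 1.06 * stdev(draws) * len(draws) ** (-0.2)
    return fmean([NormalDist(d, h).pdf(point) for d in draws])


def savage_dickey_b01(draws, prior, null_value=0.0):
    """B01 = posterior density at the null value / prior density at the null value."""
    return kde_at(null_value, draws) / prior.pdf(null_value)


if __name__ == "__main__":
    rng = random.Random(0)
    prior = NormalDist(0.0, 1.0)  # hypothetical proper prior, not the paper's
    # Stand-in posterior draws: normal shape assumed, centered far from zero
    draws = [rng.gauss(1.0, 0.2) for _ in range(5000)]
    print("B01:", savage_dickey_b01(draws, prior))
```

When the draws sit far from zero relative to their spread, the ratio is close to zero and the null is rejected; when the posterior coincides with the prior, the ratio is close to one. The joint hypothesis on two parameters would require a bivariate density estimate in place of kde_at.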
We also estimate the OPES model under the specification which imposes no choice
specific exclusion restrictions. The results are presented in Tables 4 and 5. It is interest-
ing to notice that variable DUM65 still has a strong positive impact on the probability
of being on Medicare only and has no impact on ESI which is consistent with our jus-
tification of the exclusion restrictions. Similarly, OFFER has a strong impact on being
privately insured and no impact on Medicare only. The Markov chain is constructed to
be of the same length as before but it has different convergence properties. The relative
numerical efficiencies in Table 5 indicate that the chain converges much more slowly.
Relative to the specification with choice specific exclusion restrictions the new posterior
standard deviations are larger in magnitude, with that of parameter σ̃21 increasing from
0.188 to 0.425. This is consistent with the findings of Keane (1992).
In addition, two competing specifications, denoted OP and EBP, respectively, are
estimated and compared with the OPES model. The posterior means and standard deviations
of parameters ρ, π, σ̃21 and τ of the OP and EBP models are given in Table 5.
The OP model ignores endogeneity of insurance status and, therefore, its estimates are
potentially subject to self-selection bias. The estimates indicate that insurance plans
have no strong impacts on utilization, 0.013 (0.058) and 0.061 (0.054). The estimated
Markov chains when endogeneity is ignored display almost no serial correlation and the
RNE values are high. The results are based on a shorter Markov chain run for 20,000
replications with every 20th iteration collected after discarding the first 1000 draws. Table
2 reports the actual cell frequencies and those predicted by the OPES and OP models.
Regardless of whether endogeneity is modeled, the OPES and OP models give similar
estimates of the cell frequencies. This result suggests that the main benefit of the
structural approach, which controls for selection on unobserved factors, is that the total
impact of insurance can be decomposed into selection and incentive components. The
incentive component can be interpreted as the pure effect of insurance.
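The closeness of the two sets of predicted cell frequencies can be checked directly from the numbers in Table 2. The short Python sketch below is only arithmetic on those published frequencies; the variable and function names are hypothetical.

```python
# Cell frequencies from Table 2, ordered as the cells 0, 1, 2, 3 and the top cell
actual = [0.8062, 0.1263, 0.0395, 0.0144, 0.0136]
pred_opes = [0.8052, 0.1268, 0.0402, 0.0145, 0.0134]
pred_op = [0.8054, 0.1267, 0.0401, 0.0145, 0.0134]


def max_abs_gap(predicted, observed):
    """Largest absolute difference between predicted and observed cell frequencies."""
    return max(abs(p - o) for p, o in zip(predicted, observed))


if __name__ == "__main__":
    print("OPES:", round(max_abs_gap(pred_opes, actual), 4))
    print("OP:  ", round(max_abs_gap(pred_op, actual), 4))
```

The largest gap is 0.0010 for OPES and 0.0008 for OP, both in the zero cell, so neither model is distinguishable from the other on fit to the marginal frequencies alone.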
The EBP model is applied to a binary dummy constructed as an indicator of zero and
positive hospital utilization. That is, the last (second) category includes all observations
with one or more hospital visits. In this case there is no need to estimate
the threshold parameter. We impose the same choice specific exclusion restrictions as
those of the original OPES model specification. The reported RNE values are estimated
based on Markov chains of the same length as those for the OPES model. The Markov
chain of the EBP model displays very high serial correlations and the corresponding
RNE values are between 0.007 and 0.015. It is interesting to note that the posterior
standard deviations for the EBP model reported in Table 5 are substantially greater than
those of the OPES model. This suggests a loss of precision or efficiency for the endogenous
parameters, as should be expected given that the EBP model uses less information on
cell frequencies than the OPES model.
The coefficients of MEDICARE and ESI, −0.494 (0.361) and −0.333 (0.329), do not
suggest strong incentive effects. However, the signs of coefficients in the ordered probit
model do not necessarily coincide with those of the corresponding marginal or treatment
effects. Thus, treatment effects must be formally calculated to assess the direction and
magnitude of the incentive effects. We estimate the average treatment effect (ATE),
which is such a measure.
Denote ηi = (Zi, β, ρ, π, α, τ) and define the expected utilization gain evaluated
at ηi for a randomly selected individual i between state j (dji = 1, j = 1, 2) and the
baseline choice (d3i = 1) as
$$E\left(Y_i^{j} - Y_i^{3} \mid X_i, \eta_i\right) = \sum_{m=1}^{M} m \left[\Pr\left(Y_i = m \mid d_{ji} = 1, \eta_i\right) - \Pr\left(Y_i = m \mid d_{3i} = 1, \eta_i\right)\right].$$
We calculate the ATE for the outcome variable as
$$E\left(Y^{j} - Y^{3} \mid X\right) = \frac{1}{N} \sum_{i=1}^{N} E_{\eta_i \mid Y}\left[E\left(Y_i^{j} - Y_i^{3} \mid X_i, \eta_i\right)\right],$$
where the expectation is taken with respect to the posterior distribution of the parame-
ters in the model and over N individuals.
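The double averaging in the ATE formula can be sketched in Python as follows. The sketch takes the category probabilities under each plan as given, since computing them requires the full OPES model, and only shows how the gain is formed and averaged over posterior draws and individuals. The array layout probs[r][i][m], the function names and the toy numbers are assumptions made for illustration.

```python
from statistics import fmean


def expected_gain(probs_j, probs_base, categories):
    """Sum over m of m * [Pr(Y = m | plan j) - Pr(Y = m | baseline plan)]."""
    return sum(m * (pj - pb) for m, pj, pb in zip(categories, probs_j, probs_base))


def average_treatment_effect(probs_j, probs_base, categories):
    """probs_*[r][i][m]: Pr(Y_i = categories[m] | plan) at posterior draw r."""
    n_draws, n_individuals = len(probs_j), len(probs_j[0])
    per_individual = []
    for i in range(n_individuals):
        # inner expectation: average the gain over posterior draws for individual i
        gains = [expected_gain(probs_j[r][i], probs_base[r][i], categories)
                 for r in range(n_draws)]
        per_individual.append(fmean(gains))
    # outer average over the N individuals
    return fmean(per_individual)


if __name__ == "__main__":
    categories = [1, 2, 3]  # toy example with M = 3
    probs_j = [[[0.5, 0.3, 0.2], [0.6, 0.3, 0.1]],
               [[0.4, 0.4, 0.2], [0.6, 0.2, 0.2]]]
    probs_base = [[[0.7, 0.2, 0.1], [0.7, 0.2, 0.1]],
                  [[0.7, 0.2, 0.1], [0.7, 0.2, 0.1]]]
    print(average_treatment_effect(probs_j, probs_base, categories))
```

Because both averages are linear, the order in which draws and individuals are averaged does not change the result when every individual has the same number of draws.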
The estimated ATEs are 0.096 (0.001) and 0.158 (0.002) for MEDICARE and ESI,
respectively, relative to nongroup private insurance status. The actual
differences in utilization are 0.09 and −0.07, respectively. Thus, the ATE and the unconditional
difference in utilization for Medicare only and nongroup plans are very close
to each other, which indicates almost no selection effect between these plans. However,
the estimated ATE parameter for ESI plans indicates a strong selection effect with fa-
vorable selection into such plans. In other words, individuals with ESI plans on average
use hospital care less than their IPI counterparts, perhaps because they are healthier,
even though ESI plans may provide better coverage for care than the other alternative
plans. Favorable or advantageous selection is the converse of adverse selection. Ad-
verse selection implies that high risk individuals purchase higher insurance coverage.
Favorable selection has been discussed in the health insurance literature; see Fang et
al. (2006). One explanation that has been offered is that those purchasing
supplementary insurance may be both healthier and more risk averse.
Risk aversion would increase the propensity for higher insurance coverage while being
healthy would contribute to lower utilization. It is also possible that ESI plans allow
substitution of other care for hospital care, e.g. nursing home care.
5. Conclusion
This paper proposes an OPES model to estimate the effect of endogenous treatment
variables on an ordinal dependent variable. As an illustration, the model is applied to analyze
the effects of different types of insurance plans on hospital utilization, allowing for
potential endogeneity of insurance status, a feature neglected by many previous studies.
In our illustration we find evidence that controlling for endogeneity is important.
References
Albert, J.H., Chib, S., 1993. Bayesian Analysis of Binary and Polychotomous Response
Data. Journal of the American Statistical Association, 88, 422, 669-679.
Atherly, A., 2001. Supplemental Insurance: Medicare’s Accidental Stepchild. Medical
Care Research and Review, 58, 2, 131-161.
Bunch, D.S., 1991. Estimability in the Multinomial Probit Model. Transportation
Research B, 25, 1-12.
Cameron, A.C., Trivedi, P.K., 1986. Econometric Models Based on Count Data: Com-
parisons and Applications of Some Estimators. Journal of Applied Econometrics,
1, 29-53.
Card, D., Dobkin, C., Maestas, N., 2004. The Impact of Nearly Universal Insurance
Coverage on Health Care Utilization and Health: Evidence from Medicare. NBER
Working Papers 10365, National Bureau of Economic Research, Inc.
Chib, S., Hamilton, B.H., 2000. Bayesian Analysis of Cross-Section and Clustered
Data Treatment Models. Journal of Econometrics, 97, 25-50.
Cowles, M.K., 1996. Accelerating Monte Carlo Markov Chain Convergence for Cumulative-
link Generalized Linear Models. Statistics and Computing, 6, 101-111.
Decker, S.L., Dushi, I., Deb, P., 2005. Medicare at 65: Does it Level the Playing
Field? Unpublished paper.
Fang, H., Keane, M.P., Silverman, D., 2006. Sources of Advantageous Selection:
Evidence from the Medigap Insurance Market. NBER Working Papers 12289,
National Bureau of Economic Research, Inc.
Geweke, J., 1991. Efficient Simulation from the Multivariate Normal and Student-t
Distributions Subject to Linear Constraints. In E.M. Keramidas, Editor, Computing
Science and Statistics: Proceedings of the Twenty-Third Symposium on
the Interface, 571-578.
Geweke, J., 1992. Evaluating the Accuracy of Sampling-Based Approaches to the
Calculation of Posterior Moments. In J.O. Berger, J.M. Bernardo, A.P. Dawid,
A.F.M. Smith, Eds., Bayesian Statistics 4, 169-194. Oxford: Oxford University
Press.
Geweke, J., Gowrisankaran, G., Town, R.J., 2003. Bayesian Inference for Hospital
Quality in a Selection Model. Econometrica, 71(4), 1215-38.
Geweke, J., 2004. Getting it Right: Joint Distribution Tests of Posterior Simulators.
Journal of the American Statistical Association, 99, 467, 799-804.
Goldman, D.P., Zissimopoulos, J.M., 2003. High Out-of-Pocket Health Care
Spending By the Elderly. Health Affairs, 22(3), 194-202.
Keane, M.P., 1992. A Note on Identification in the Multinomial Probit Model. Journal
of Business and Economic Statistics, 10, 193-200.
Li, M., Tobias, J., 2006. A Bayesian Analysis of Treatment Effects in an Ordered Po-
tential Outcomes Model, in Advances in Econometrics: Modelling and Evaluating
Treatment Effects in Econometrics Volume 21, Editors: Daniel Millimet, Jeffrey
Smith, and Ed Vytlacil.
Lichtenberg, F., 2002. The Effects of Medicare on Health Care Utilization and Out-
comes, in Frontiers in Health Policy Research, A. Garber, editor (Cambridge, MA:
MIT Press).
McCulloch, R.E., Polson, N.G., Rossi, P.E., 2000. A Bayesian Analysis of the Multino-
mial Probit Model with Fully Identified Parameters. Journal of Econometrics, 99,
173-193.
Munkin, M.K., Trivedi, P.K., 2003. Bayesian Analysis of Self-Selection Model with
Multiple Outcomes Using Simulation-Based Estimation: An Application to the
Demand for Healthcare. Journal of Econometrics, 114, 197-220.
Nandram, B., Chen, M.-H., 1996. Reparameterizing the Generalized Linear Model
to Accelerate Gibbs Sampler Convergence. Journal of Statistical Computation
and Simulation, 54, 129-144.
Poirier, D.J., Tobias, J.L., 2003. On the Predictive Distributions of Outcome Gains
in the Presence of an Unidentified Parameter. Journal of Business and Economic
Statistics, 21, 258-268.
Tanner, M.A., Wong, W.H., 1987. The Calculation of Posterior Distributions by Data
Augmentation. Journal of the American Statistical Association, 82, 528-540.
Verdinelli, I., Wasserman, L., 1995. Computing Bayes Factors Using a Generalization
of the Savage-Dickey Density Ratio. Journal of the American Statistical Association,
90, 614-618.
Table 1. Summary statistics

Variable    Definition                                      Mean      St. dev.
HOSPVIS     Number of hospital admissions                   0.303     0.734
Insurance plan types
ESI         = 1 if plan is group private                    0.766     0.423
MEDICARE    = 1 if Medicare only                            0.159     0.366
Demographic characteristics
FAMSIZE     family size                                     2.166     1.035
AGE         age/10                                          6.389     0.615
EDUC        years of schooling                              12.633    3.059
INCOME      $ income/1000                                   59.230    50.139
FEMALE      = 1 if female                                   0.526     0.499
BLACK       = 1 if black                                    0.107     0.310
HISPANIC    = 1 if hispanic                                 0.091     0.288
MARRIED     = 1 if married                                  0.716     0.451
NOREAST     = 1 if northeast                                0.184     0.388
MIDWEST     = 1 if midwest                                  0.243     0.429
SOUTH       = 1 if south                                    0.376     0.484
MSA         = 1 if metropolitan statistical area            0.757     0.429
Health characteristics
VEGOOD      = 1 if very good health                         0.316     0.465
GOOD        = 1 if good health                              0.309     0.462
FAIR        = 1 if fair health                              0.129     0.336
POOR        = 1 if poor health                              0.044     0.206
PHYSLIM     = 1 if physical limitation                      0.167     0.373
CHRONIC     = 1 if chronic conditions                       0.718     0.450
Instrumental Variables
DUM65       = 1 if AGE>6.5                                  0.447     0.497
OFFER       if the current employers offer insurance        0.325     0.468
SSIRATIO    = SSI/INCOME                                    0.239     0.310
Year dummies
YEAR98      = 1 if year 1998                                0.108     0.310
YEAR99      = 1 if year 1999                                0.104     0.306
YEAR00      = 1 if year 2000                                0.134     0.340
YEAR01      = 1 if year 2001                                0.103     0.304
YEAR02      = 1 if year 2002                                0.215     0.411
YEAR03      = 1 if year 2003                                0.146     0.353
Table 2. Utilization frequency

Cells    Actual    Predicted
                   Using OPES    Using OP
0        0.8062    0.8052        0.8054
1        0.1263    0.1268        0.1267
2        0.0395    0.0402        0.0401
3        0.0144    0.0145        0.0145
>4       0.0136    0.0134        0.0134

Note: Ordered Probit (OP), Ordered Probit Model with Endogenous Selection (OPES)
Table 3. Posterior means and standard deviations of parameters β, α1 and α̃ and their marginal
effects. Ordered Probit Model with Endogenous Selection with choice specific exclusion
restrictions. Standard deviations are in parentheses.

            Coefficients                     Marginal Effects
            HOSPVIS    MEDICARE   ESI        HOSPVIS    MEDICARE   ESI
CONST       -2.813     -1.003     1.880      −          −          −
            (0.366)    (0.487)    (0.309)
FAMSIZE     -0.005     0.171      0.113      -0.002     0.026      0.027
            (0.016)    (0.034)    (0.030)    (0.007)    (0.005)    (0.006)
AGE         0.231      0.115      -0.263     0.100      0.017      -0.063
            (0.037)    (0.075)    (0.046)    (0.016)    (0.011)    (0.009)
EDUCYR      0.0005     -0.039     0.022      0.0002     -0.006     0.005
            (0.0055)   (0.010)    (0.008)    (0.0024)   (0.0016)   (0.002)
INCOME      -0.0001    -0.0045    -0.0009    -0.0001    -0.0007    -0.0002
            (0.0004)   (0.0009)   (0.0005)   (0.0002)   (0.0001)   (0.0001)
FEMALE      -0.080     -0.219     -0.069     -0.035     -0.035     -0.018
            (0.031)    (0.054)    (0.040)    (0.013)    (0.009)    (0.010)
BLACK       -0.046     0.696      0.540      -0.019     0.124      0.110
            (0.049)    (0.110)    (0.095)    (0.021)    (0.023)    (0.017)
HISPANIC    -0.070     0.424      0.064      -0.029     0.072      0.015
            (0.056)    (0.106)    (0.084)    (0.023)    (0.018)    (0.021)
MARRIED     -0.034     -0.351     0.179      -0.015     -0.057     0.044
            (0.036)    (0.067)    (0.056)    (0.016)    (0.010)    (0.013)
VEGOOD      0.169      0.123      0.104      0.077      0.019      0.024
            (0.049)    (0.078)    (0.054)    (0.023)    (0.014)    (0.015)
GOOD        0.414      0.185      0.064      0.194      0.029      0.013
            (0.049)    (0.078)    (0.058)    (0.024)    (0.013)    (0.015)
FAIR        0.785      0.447      0.208      0.452      0.075      0.045
            (0.063)    (0.100)    (0.076)    (0.040)    (0.018)    (0.018)
POOR        1.236      0.251      0.034      0.909      0.041      0.006
            (0.081)    (0.136)    (0.106)    (0.074)    (0.025)    (0.025)
MSA         -0.089     0.150      0.203      -0.040     0.023      0.050
            (0.034)    (0.060)    (0.045)    (0.016)    (0.008)    (0.011)
PHYSLIM     0.329      0.177      0.057      0.159      0.029      0.015
            (0.038)    (0.074)    (0.059)    (0.020)    (0.012)    (0.013)
CHRONIC     0.518      0.046      0.141      0.190      0.007      0.034
            (0.044)    (0.066)    (0.046)    (0.012)    (0.010)    (0.011)
DUM65       −          0.646      −          −          0.099      −
                       (0.076)                          (0.011)
OFFER       −          −          1.121      −          −          0.215
                                  (0.063)                          (0.007)
SSIRATIO    −          0.309      -0.412     −          0.048      -0.098
                       (0.112)    (0.091)               (0.017)    (0.018)
Table 4. Posterior means and standard deviations of parameters β, α1 and α̃ and their marginal
effects. Ordered Probit Model with Endogenous Selection without choice specific
exclusion restrictions. Standard deviations are in parentheses.

            Coefficients                     Marginal Effects
            HOSPVIS    MEDICARE   ESI        HOSPVIS    MEDICARE   ESI
CONST       -2.816     -0.940     2.035      −          −          −
            (0.380)    (0.636)    (0.413)
FAMSIZE     -0.006     0.175      0.111      -0.003     0.026      0.025
            (0.016)    (0.037)    (0.030)    (0.007)    (0.006)    (0.008)
AGE         0.234      0.106      -0.285     0.101      0.015      -0.067
            (0.037)    (0.095)    (0.069)    (0.016)    (0.012)    (0.016)
EDUCYR      0.001      -0.045     0.021      0.0003     -0.007     0.005
            (0.006)    (0.014)    (0.010)    (0.0024)   (0.002)    (0.002)
INCOME      -0.0001    -0.005     -0.0008    -0.0001    -0.0007    -0.0002
            (0.0004)   (0.001)    (0.0005)   (0.0002)   (0.0001)   (0.00013)
FEMALE      -0.083     -0.228     -0.069     -0.036     -0.034     -0.016
            (0.031)    (0.066)    (0.045)    (0.013)    (0.008)    (0.011)
BLACK       -0.048     0.707      0.536      -0.020     0.121      0.107
            (0.049)    (0.120)    (0.092)    (0.020)    (0.023)    (0.018)
HISPANIC    -0.071     0.448      0.061      -0.029     0.075      0.015
            (0.056)    (0.124)    (0.091)    (0.023)    (0.019)    (0.020)
MARRIED     -0.033     -0.391     0.175      -0.015     -0.061     0.042
            (0.036)    (0.089)    (0.066)    (0.016)    (0.012)    (0.015)
VEGOOD      0.173      0.121      0.101      0.078      0.018      0.022
            (0.050)    (0.082)    (0.056)    (0.023)    (0.012)    (0.013)
GOOD        0.417      0.193      0.064      0.194      0.029      0.014
            (0.052)    (0.082)    (0.056)    (0.024)    (0.012)    (0.012)
FAIR        0.792      0.448      0.197      0.453      0.072      0.043
            (0.070)    (0.111)    (0.075)    (0.042)    (0.017)    (0.014)
POOR        1.249      0.257      0.029      0.913      0.040      0.005
            (0.087)    (0.145)    (0.107)    (0.073)    (0.024)    (0.026)
MSA         -0.089     0.149      0.201      -0.039     0.021      0.048
            (0.033)    (0.063)    (0.044)    (0.015)    (0.009)    (0.011)
PHYSLIM     0.330      0.184      0.059      0.158      0.027      0.013
            (0.040)    (0.081)    (0.059)    (0.020)    (0.011)    (0.013)
CHRONIC     0.524      0.037      0.138      0.191      0.005      0.033
            (0.047)    (0.063)    (0.043)    (0.012)    (0.009)    (0.011)
DUM65       −          0.705      0.018      −          0.102      0.005
                       (0.139)    (0.081)               (0.015)    (0.019)
OFFER       −          -0.266     1.045      −          -0.037     0.198
                       (0.174)    (0.081)               (0.020)    (0.014)
SSIRATIO    −          0.326      -0.423     −          0.047      -0.099
                       (0.154)    (0.118)               (0.019)    (0.024)
Table 5. Posterior means, standard deviations and RNE values of parameters ρ, π, σ̃21 and τ.
Standard deviations are in parentheses.

            OPES                OPES                EBP                 OP
            (MNP restrictions)  (no restrictions)
            Coef.      RNE      Coef.      RNE      Coef.      RNE      Coef.      RNE
Insurance dummies
MEDICARE    -0.494     0.040    -0.529     0.026    -1.536     0.007    0.013      0.334
            (0.361)             (0.381)             (0.953)             (0.058)
ESI         -0.333     0.041    -0.362     0.022    -1.157     0.007    0.061      0.334
            (0.329)             (0.365)             (0.811)             (0.054)
Covariance parameters between insurance and utilization equations
π1          0.092      0.090    0.119      0.022    0.367      0.007    −          −
            (0.079)             (0.112)             (0.279)
π̃           0.122      0.065    0.119      0.053    0.328      0.011    −          −
            (0.165)             (0.204)             (0.255)
Threshold parameters
τ2          0.748      0.085    0.753      0.028    −          −        0.724      0.334
            (0.031)             (0.036)                                 (0.018)
τ3          1.251      0.072    1.259      0.027    −          −        1.211      0.334
            (0.050)             (0.059)                                 (0.026)
τ4          1.612      0.077    1.624      0.028    −          −        1.563      0.334
            (0.064)             (0.074)                                 (0.036)
Covariance between insurance equations
σ̃21         1.011      0.028    1.055      0.010    0.876      0.015    −          −
            (0.188)             (0.425)             (0.201)

Note: Ordered Probit (OP), Endogenous Binary Probit (EBP), Ordered Probit Model
with Endogenous Selection (OPES). (MNP restrictions) indicates the OPES model with
choice specific exclusion restrictions; (no restrictions) indicates the OPES model without
choice specific exclusion restrictions.