Bivariate generalized Poisson regression model: applications on … et al 2016.pdf · BGPD, Famoye...



Empir Econ (2016) 51:1607–1621 DOI 10.1007/s00181-015-1051-7

Bivariate generalized Poisson regression model: applications on health care data

Hossein Zamani 1 · Pouya Faroughi 2 · Noriszura Ismail 3

Received: 9 December 2013 / Accepted: 3 December 2015 / Published online: 5 January 2016 © Springer-Verlag Berlin Heidelberg 2016

Abstract This paper introduces several forms of bivariate generalized Poisson regression model (BGPR) which can be fitted to bivariate and correlated count data with covariates. The main advantage of these forms of BGPR is that they are nested and thus they allow likelihood ratio tests to be performed to choose the best model. The BGPR can be fitted not only to bivariate count data with positive, zero, or negative correlations, but also to under- or overdispersed bivariate count data with flexible form of mean–variance relationship. Applications of several forms of the BGPR are illustrated on two sets of count data: the Australian health survey data and the US National Medical Expenditure Survey data.

Keywords Generalized Poisson · Bivariate · Correlation · Overdispersion · Underdispersion

JEL Classification C35

✉ Noriszura Ismail [email protected]

1 Hormozgan University, Bandar Abbas, Iran

2 Department of Statistics, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran

3 Faculty of Science and Technology, School of Mathematical Sciences, Universiti Kebangsaan Malaysia (UKM), 43600 Bangi, Selangor, Malaysia

1 Introduction

The Poisson regression model has been widely used for modeling count data with covariates. However, count data are often overdispersed, and the negative binomial regression model (NBR) has been used for handling overdispersion, whereas the generalized Poisson regression model (GPR) has been fitted for under- or overdispersed count data. The generalized Poisson distribution (GPD) is obtained from the limiting form of a generalized negative binomial distribution (Consul and Jain 1973). In the literature, different forms of GPR have been proposed using different parameterizations of the GPR (Consul and Famoye 1992; Famoye et al. 2004; Wang and Famoye 1997; Zamani and Ismail 2012, 2013).

If we have bivariate or multivariate count data, several forms of bivariate or multivariate distributions can be fitted such as bivariate Poisson distribution (BPD) (Campbell 1934) and multivariate Poisson distribution (Banerjee 1959; Sibuya et al. 1964). A few examples of bivariate discrete distribution based on the method of trivariate reduction can be found in Kocherlakota and Kocherlakota (1992) and Johnson et al. (1997). The bivariate generalized Poisson distribution (BGPD) based on the method of trivariate reduction was introduced by Famoye and Consul (1995). However, the trivariate reduction method only admits positive correlation, and studies on the limitation of bivariate distributions based on this method can be seen in Mitchell and Paulson (1981) who suggested the bivariate negative binomial distribution (BNBD) that allows a restricted range of negative correlations and Lee (1999) who defined the BNBD from copula functions.

The BPD which handles negative, zero, or positive correlation was introduced by Lakshminarayana et al. (1999) where the distribution is defined from the product of two Poisson marginals with a multiplicative factor parameter. The work of Lakshminarayana et al. (1999) was then continued by Famoye (2010a) who proposed the BGPD, Famoye (2010b) who introduced the bivariate negative binomial regression model (BNBR), and Famoye (2012) who defined the bivariate generalized Poisson regression model (BGPR).

This paper defines several forms of BGPR which have several advantages; the models are nested and allow likelihood ratio tests to be applied for choosing the best model; the models can be fitted to bivariate count data with positive, zero, or negative correlation; the models allow over- or underdispersion of the two dependent variables; and the models allow the two dependent variables to have flexible forms of mean–variance relationship.

The rest of this paper is organized as follows. Section 2 provides the joint p.m.f. of the bivariate Poisson regression model (BPR) which is derived from the product of two Poisson marginals with a multiplicative factor parameter. Several forms of the BGPR are proposed in Sect. 3, whereas Sect. 4 discusses several tests for testing overdispersion and independence. Numerical illustrations are provided in Sect. 5 where several forms of the BGPR are applied to two sets of data: the Australian health survey data (Cameron et al. 1988) and the US National Medical Expenditure Survey (NMES) data (Deb and Trivedi 1997).

The response variables for health care utilization in the Australian health survey data consist of several health services such as number of doctor or specialist consultations, number of hospital admissions, number of nights in a hospital, number of non-doctor consultations, number of prescribed medications, and number of non-prescribed medications. Similarly, the response variables for health services in the US NMES data consist of several measures of utilization such as number of visits to a physician in an office setting, number of visits to a non-physician in an office setting, number of visits to a physician in an outpatient setting, number of visits to a non-physician in an outpatient setting, and number of visits to an emergency room. Casual observation of both data sets suggests that the utilization of each health service may imply some dependent events. For the case of the Australian data as an example, different health states (ranging from good to bad) may result in several trips to the doctor, and this event may cause other dependent events such as non-doctor consultations or days spent in hospital. This study aims to fit our forms of BGPR to both the Australian and the US NMES data for choosing the best model, and for performing the test of independence, which can be used to indicate whether the two response variables should be fitted jointly under a bivariate model or independently under univariate models.

2 Bivariate Poisson regression model (BPR)

The joint p.m.f. of the BPD which allows the correlation structure to be positive, negative, or zero is derived from the product of two Poisson marginals with a multiplicative factor parameter (Lakshminarayana et al. 1999)

$$\Pr(y_1, y_2) = \frac{e^{-\mu_1-\mu_2}\mu_1^{y_1}\mu_2^{y_2}}{y_1!\,y_2!}\left\{1 + \alpha\left[(g_1(y_1) - \bar{g}_1)(g_2(y_2) - \bar{g}_2)\right]\right\}, \quad y_1, y_2 = 0, 1, 2, \ldots, \quad \mu_1, \mu_2 > 0 \tag{1}$$

where $g_1(y_1)$ and $g_2(y_2)$ are bounded functions in $y_1$ and $y_2$, respectively. To ensure nonnegativity in the value of $\{\cdot\}$ in (1),

$$g_t(y_t) = e^{-y_t} \quad \text{and} \quad \bar{g}_t = E[g_t(Y_t)] = E(e^{-Y_t}), \quad t = 1, 2. \tag{2}$$

Suppose $Y_{i1}$ and $Y_{i2}$ ($i = 1, 2, \ldots, n$) are count response variables. Following (1)–(2), the joint p.m.f. of the BPR is

$$\Pr(y_{i1}, y_{i2}) = \frac{e^{-\mu_{i1}-\mu_{i2}}\mu_{i1}^{y_{i1}}\mu_{i2}^{y_{i2}}}{y_{i1}!\,y_{i2}!}\left[1 + \alpha(e^{-y_{i1}} - e^{-d\mu_{i1}})(e^{-y_{i2}} - e^{-d\mu_{i2}})\right] \tag{3}$$

where $d = 1 - e^{-1}$ and $\alpha$ is the multiplicative factor (or correlation) parameter. The marginal means, marginal variances, and covariance of the BPR are $E(Y_{i1}) = \mathrm{Var}(Y_{i1}) = \mu_{i1}$, $E(Y_{i2}) = \mathrm{Var}(Y_{i2}) = \mu_{i2}$ and $\mathrm{Cov}(Y_{i1}, Y_{i2}) = \alpha\mu_{i1}\mu_{i2}d^2e^{-d(\mu_{i1}+\mu_{i2})}$. When $\alpha = 0$, the response variables $Y_1$ and $Y_2$ are independent, each is distributed as a marginal Poisson. When $\alpha > 0$ and $\alpha < 0$, we have positive and negative correlations, respectively.

The covariates can be included in the marginal means using log-link functions

$$E(Y_{i1}) = \mu_{i1} = \exp(x_i^T\beta) \quad \text{and} \quad E(Y_{i2}) = \mu_{i2} = \exp(x_i^T\gamma) \tag{4}$$

where $\beta$ and $\gamma$ are the regression parameters for $Y_{i1}$ and $Y_{i2}$, respectively, and $x_i$ is the vector of covariates.
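As a minimal sketch of Eqs. (3) and (4), the following Python code evaluates the BPR joint p.m.f. on a truncated grid and compares the numerical covariance with the closed form stated above. The covariate vector, the coefficients, the value of alpha, the truncation point, and all function names are assumptions chosen for illustration; none of them come from the paper.

```python
import math

D = 1.0 - math.exp(-1.0)  # d = 1 - e^{-1}, as defined under Eq. (3)


def poisson_pmf(y, mu):
    # Evaluate on the log scale to avoid overflow in mu**y and y!.
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def log_link_mean(x, coef):
    # Eq. (4): mu = exp(x' coef).
    return math.exp(sum(xj * bj for xj, bj in zip(x, coef)))


def bpr_pmf(y1, y2, mu1, mu2, alpha):
    # Eq. (3): product of two Poisson marginals times the multiplicative factor.
    factor = 1.0 + alpha * (math.exp(-y1) - math.exp(-D * mu1)) * (
        math.exp(-y2) - math.exp(-D * mu2)
    )
    return poisson_pmf(y1, mu1) * poisson_pmf(y2, mu2) * factor


# Hypothetical covariate vector and coefficients (not taken from the paper).
x_i = [1.0, 0.5]
mu1 = log_link_mean(x_i, [0.2, 0.4])
mu2 = log_link_mean(x_i, [-0.1, 0.3])
alpha = -0.8

# Truncate the infinite support; the Poisson tails beyond 60 are negligible here.
grid = [(a, b, bpr_pmf(a, b, mu1, mu2, alpha)) for a in range(60) for b in range(60)]
total = sum(p for _, _, p in grid)
mean1 = sum(a * p for a, _, p in grid)
mean2 = sum(b * p for _, b, p in grid)
cov_numeric = sum(a * b * p for a, b, p in grid) - mean1 * mean2
cov_formula = alpha * mu1 * mu2 * D**2 * math.exp(-D * (mu1 + mu2))

print(total, mean1, mean2, cov_numeric, cov_formula)
```

With a negative alpha the numerical covariance comes out negative and agrees with the closed form, while the marginal means stay at the values given by the log link.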


3 Bivariate generalized Poisson regression model (BGPR)

The p.m.f. of the GPD is (Consul and Famoye 1992)

$$\Pr(y) = \frac{\theta(\theta + vy)^{y-1}}{y!}e^{-\theta - vy} \tag{5}$$

where $\theta > 0$, and $v$ is the dispersion parameter with $\max(-1, -\theta/4) < v < 1$. The mean and variance are $E(Y) = \mu = \theta(1-v)^{-1}$ and $\mathrm{Var}(Y) = \theta(1-v)^{-3}$. The GPD reduces to Poisson distribution when $v = 0$, and handles under- and overdispersion when $v < 0$ and $v > 0$, respectively.

The p.g.f. of the GPD is $\varphi_Y(u) = E(u^Y) = e^{\theta(t-1)}$ where $t = ue^{v(t-1)}$ (Consul and Famoye 2006), so that the m.g.f. is

$$M_Y(u) = E(e^{uY}) = e^{\theta(e^t - 1)} \tag{6}$$

where $e^t = e^{v(e^t-1)+u}$. By setting $u = -1$ in (6), we obtain

$$E(e^{-Y}) = e^{\theta(s-1)} \tag{7}$$

where $\ln s - v(s-1) + 1 = 0$. By differentiating m.g.f. in (6) with respect to $u$ and letting $u = -1$, we have

$$\left.\frac{\partial M_Y(u)}{\partial u}\right|_{u=-1} = E(Ye^{-Y}) = \frac{\theta}{1 - vs}e^{(\theta+v)(s-1)-1} \tag{8}$$

where $\ln s - v(s-1) + 1 = 0$.

The generalized Poisson-1 regression model (GPR-1) can be obtained by replacing $\theta_i = (1-v)\mu_i$ in p.m.f. (5) producing

$$\Pr(y_i) = \frac{(1-v)\mu_i[(1-v)\mu_i + vy_i]^{y_i-1}}{y_i!}e^{-(1-v)\mu_i - vy_i}.$$
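The quantities in Eqs. (5), (7), and (8) can be checked numerically. The sketch below, in Python, evaluates the GPD p.m.f., finds $s$ by bisection (one possible root finder; the paper does not say how $s$ is computed), and compares the closed forms with sums over a truncated support. The values of `mu` and `v`, the truncation point, and the function names are illustrative assumptions.

```python
import math


def gpd_pmf(y, theta, v):
    # Eq. (5), evaluated on the log scale; the p.m.f. is set to zero when
    # theta + v*y <= 0, which can only occur for v < 0.
    base = theta + v * y
    if base <= 0.0:
        return 0.0
    log_p = math.log(theta) + (y - 1) * math.log(base) - base - math.lgamma(y + 1)
    return math.exp(log_p)


def solve_s(v, tol=1e-14):
    # Root in (0, 1) of ln s - v (s - 1) + 1 = 0, found by bisection.
    # The left side increases in s on (0, 1) when v < 1, so the root is unique.
    lo, hi = 1e-12, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.log(mid) - v * (mid - 1.0) + 1.0 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


mu, v = 2.0, 0.3          # hypothetical mean and dispersion parameter
theta = (1.0 - v) * mu    # GPR-1 substitution theta_i = (1 - v) mu_i
s = solve_s(v)

support = range(400)      # truncation point chosen for illustration
probs = [gpd_pmf(y, theta, v) for y in support]
mean = sum(y * p for y, p in zip(support, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(support, probs))
e_exp = sum(math.exp(-y) * p for y, p in zip(support, probs))
e_yexp = sum(y * math.exp(-y) * p for y, p in zip(support, probs))

eq7 = math.exp(theta * (s - 1.0))
eq8 = theta / (1.0 - v * s) * math.exp((theta + v) * (s - 1.0) - 1.0)

print(mean, var, e_exp, eq7, e_yexp, eq8)
```

For $v = 0$ the root is $s = e^{-1}$, so Eq. (7) reduces to the Poisson value $e^{-d\mu}$ used in Eq. (3).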

Using the same approach suggested by Lakshminarayana et al. (1999) for deriving the BPD, the joint p.m.f. of the bivariate generalized Poisson-1 regression model (BGPR-1) is

$$\Pr(y_{i1}, y_{i2}) = \left[\prod_{t=1}^{2}\frac{(1-v_t)\mu_{it}[(1-v_t)\mu_{it} + v_ty_{it}]^{y_{it}-1}e^{-(1-v_t)\mu_{it} - v_ty_{it}}}{y_{it}!}\right]\left[1 + \alpha\prod_{t=1}^{2}(e^{-y_{it}} - c_{it})\right] \tag{9}$$

where $v_t$, $t = 1, 2$, are the dispersion parameters, and $\alpha$ is the multiplicative factor (or correlation) parameter. From (7), $c_{it} = E(e^{-Y_{it}}) = e^{\mu_{it}(1-v_t)(s_t-1)}$, where $\ln s_t - v_t(s_t - 1) + 1 = 0$, $t = 1, 2$.


The marginal means, marginal variances, and covariance for the BGPR-1 are $E(Y_{it}) = \mu_{it}$, $\mathrm{Var}(Y_{it}) = \mu_{it}(1-v_t)^{-2}$, $t = 1, 2$, and $\mathrm{Cov}(Y_{i1}, Y_{i2}) = \alpha(c_{i11} - c_{i1}\mu_{i1})(c_{i22} - c_{i2}\mu_{i2})$. From (8), $c_{itt} = E(Y_{it}e^{-Y_{it}}) = \frac{(1-v_t)\mu_{it}}{1 - v_ts_t}e^{[(1-v_t)\mu_{it} + v_t](s_t-1)-1}$, where $\ln s_t - v_t(s_t - 1) + 1 = 0$, $t = 1, 2$. When $\alpha = 0$, the response variables $Y_1$ and $Y_2$ are independent, each is distributed as a marginal GPR-1. When $\alpha > 0$ and $\alpha < 0$, we have positive and negative correlations, respectively. The BGPR-1 reduces to the BPR when $v_1 = v_2 = 0$, and handles under- and overdispersion when $v_t < 0$ and $v_t > 0$, $t = 1, 2$, respectively.

This paper proposes a BGPR-1 which is based on the GPR-1 suggested in Zamani and Ismail (2012). The GPR-1 is obtained by replacing $v$ with $\frac{a}{1+a}$ and $\theta_i$ with $\frac{\mu_i}{1+a}$ in p.m.f. (5). The mean and variance for the GPR-1 are $E(Y_i) = \mu_i$ and $\mathrm{Var}(Y_i) = \mu_i(1+a)^2$, where $a$ is the dispersion parameter. The GPR-1 reduces to Poisson regression model when $a = 0$, and handles under- and overdispersion when $a < 0$ and $a > 0$, respectively.
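The stated mean and variance follow directly from the GPD moments given after Eq. (5). As a short worked step, the substitution gives $1 - v = 1/(1+a)$, so that:

```latex
\begin{align*}
E(Y_i) &= \frac{\theta_i}{1-v} = \frac{\mu_i}{1+a}\,(1+a) = \mu_i, \\
\mathrm{Var}(Y_i) &= \frac{\theta_i}{(1-v)^3} = \frac{\mu_i}{1+a}\,(1+a)^3 = \mu_i(1+a)^2 .
\end{align*}
```

The same two lines, with $a$ replaced by $a\mu_i$ or by $a\mu_i^{P-1}$, give the GPR-2 and GPR-P variances quoted later in this section.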

The p.g.f. of the GPR-1 is $\varphi_{Y_i}(u) = E(u^{Y_i}) = e^{\frac{\mu_i}{1+a}(t-1)}$ where $t = ue^{\frac{a}{1+a}(t-1)}$, so that the m.g.f. is $M_{Y_i}(u) = E(e^{uY_i}) = e^{\frac{\mu_i}{1+a}(e^t-1)}$ where $e^t = e^{\frac{a}{1+a}(e^t-1)+u}$. Therefore, $E(e^{-Y_i}) = e^{\frac{\mu_i}{1+a}(s-1)}$ and $E(Y_ie^{-Y_i}) = \frac{\mu_i}{1 - a(s-1)}e^{\frac{\mu_i+a}{1+a}(s-1)-1}$ where $\ln s - \frac{a}{1+a}(s-1) + 1 = 0$.

The joint p.m.f. of the BGPR-1 is

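$$\Pr(y_{i1}, y_{i2}) = \left[\prod_{t=1}^{2}\frac{\mu_{it}(\mu_{it} + a_ty_{it})^{y_{it}-1}}{(1+a_t)^{y_{it}}\,y_{it}!}e^{-\frac{\mu_{it} + a_ty_{it}}{1+a_t}}\right]\left[1 + \alpha\prod_{t=1}^{2}(e^{-y_{it}} - c_{it})\right] \tag{10}$$

where $a_t$, $t = 1, 2$, are the dispersion parameters, $\alpha$ is the multiplicative factor (or correlation) parameter, $c_{it} = E(e^{-Y_{it}}) = e^{\frac{\mu_{it}}{1+a_t}(s_t-1)}$ and $\ln s_t - \frac{a_t}{1+a_t}(s_t - 1) + 1 = 0$, $t = 1, 2$.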

The marginal means, marginal variances, and covariance for the BGPR-1 are $E(Y_{it}) = \mu_{it}$, $\mathrm{Var}(Y_{it}) = \mu_{it}(1+a_t)^2$, $t = 1, 2$, and $\mathrm{Cov}(Y_{i1}, Y_{i2}) = \alpha(c_{i11} - c_{i1}\mu_{i1})(c_{i22} - c_{i2}\mu_{i2})$, where $c_{itt} = E(Y_{it}e^{-Y_{it}}) = \frac{\mu_{it}}{1 - a_t(s_t-1)}e^{\frac{\mu_{it}+a_t}{1+a_t}(s_t-1)-1}$ and $\ln s_t - \frac{a_t}{1+a_t}(s_t - 1) + 1 = 0$, $t = 1, 2$. When $\alpha = 0$, the response variables $Y_1$ and $Y_2$ are independent, each is distributed as a marginal GPR-1. When $\alpha > 0$ and $\alpha < 0$, we have positive and negative correlations, respectively. The BGPR-1 reduces to the BPR when $a_1 = a_2 = 0$, and handles under- and overdispersion when $a_t < 0$ and $a_t > 0$, $t = 1, 2$, respectively.

The generalized Poisson-2 regression model (GPR-2) is obtained by replacing $v$ with $\frac{a\mu_i}{1+a\mu_i}$ and $\theta_i$ with $\frac{\mu_i}{1+a\mu_i}$ in p.m.f. (5). The mean and variance for the GPR-2 are $E(Y_i) = \mu_i$ and $\mathrm{Var}(Y_i) = \mu_i(1+a\mu_i)^2$. The GPR-2 also reduces to Poisson regression model when $a = 0$, and handles under- and overdispersion when $a < 0$ and $a > 0$, respectively.

The p.g.f. of the GPR-2 is $\varphi_{Y_i}(u) = E(u^{Y_i}) = e^{\frac{\mu_i}{1+a\mu_i}(t-1)}$ where $t = ue^{\frac{a\mu_i}{1+a\mu_i}(t-1)}$, so that the m.g.f. is $M_{Y_i}(u) = E(e^{uY_i}) = e^{\frac{\mu_i}{1+a\mu_i}(e^t-1)}$ where $e^t = e^{\frac{a\mu_i}{1+a\mu_i}(e^t-1)+u}$. Therefore, $E(e^{-Y_i}) = e^{\frac{\mu_i}{1+a\mu_i}(s-1)}$ and $E(Y_ie^{-Y_i}) = \frac{\mu_i}{1 - a\mu_i(s-1)}e^{\frac{\mu_i+a\mu_i}{1+a\mu_i}(s-1)-1}$, where $\ln s - \frac{a\mu_i}{1+a\mu_i}(s-1) + 1 = 0$. The joint p.m.f. of the bivariate generalized Poisson-2 regression model (BGPR-2) is already defined in Famoye (2012).

This paper also proposes a bivariate generalized Poisson-P regression model (BGPR-P) which is based on the generalized Poisson-P regression model (GPR-P) suggested in Zamani and Ismail (2012). The GPR-P is a flexible model that nests the GPR-1 and GPR-2 by including an additional parameter $P$ (functional parameter). The GPR-P is obtained by replacing $v$ with $\frac{a\mu_i^{P-1}}{1+a\mu_i^{P-1}}$ and $\theta_i$ with $\frac{\mu_i}{1+a\mu_i^{P-1}}$ in p.m.f. (5). The mean and variance for the GPR-P are $E(Y_i) = \mu_i$ and $\mathrm{Var}(Y_i) = \mu_i(1+a\mu_i^{P-1})^2$. The GPR-P reduces to Poisson regression model when $a = 0$, reduces to the GPR-1 and GPR-2 when $P = 1$ and $P = 2$, respectively, and handles under- and overdispersion when $a < 0$ and $a > 0$, respectively.

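The p.g.f. of the GPR-P is $\varphi_{Y_i}(u) = E(u^{Y_i}) = e^{\frac{\mu_i}{1+a\mu_i^{P-1}}(t-1)}$ where $t = ue^{\frac{a\mu_i^{P-1}}{1+a\mu_i^{P-1}}(t-1)}$, so that the m.g.f. is $M_{Y_i}(u) = E(e^{uY_i}) = e^{\frac{\mu_i}{1+a\mu_i^{P-1}}(e^t-1)}$ where $e^t = e^{\frac{a\mu_i^{P-1}}{1+a\mu_i^{P-1}}(e^t-1)+u}$. Therefore, $E(e^{-Y_i}) = e^{\frac{\mu_i}{1+a\mu_i^{P-1}}(s-1)}$ and $E(Y_ie^{-Y_i}) = \frac{\mu_i}{1 - a\mu_i^{P-1}(s-1)}e^{\frac{\mu_i+a\mu_i^{P-1}}{1+a\mu_i^{P-1}}(s-1)-1}$, where $\ln s - \frac{a\mu_i^{P-1}}{1+a\mu_i^{P-1}}(s-1) + 1 = 0$.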

The joint p.m.f. of the BGPR-P is

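$$\Pr(y_{i1}, y_{i2}) = \left[\prod_{t=1}^{2}\frac{\mu_{it}(\mu_{it} + a_t\mu_{it}^{P_t-1}y_{it})^{y_{it}-1}}{(1+a_t\mu_{it}^{P_t-1})^{y_{it}}\,y_{it}!}e^{-\frac{\mu_{it} + a_t\mu_{it}^{P_t-1}y_{it}}{1+a_t\mu_{it}^{P_t-1}}}\right]\left[1 + \alpha\prod_{t=1}^{2}(e^{-y_{it}} - c_{it})\right] \tag{11}$$

where $a_t$, $t = 1, 2$, are the dispersion parameters, $\alpha$ is the multiplicative factor (or correlation) parameter, $c_{it} = E(e^{-Y_{it}}) = e^{\frac{\mu_{it}}{1+a_t\mu_{it}^{P_t-1}}(s_t-1)}$ and $\ln s_t - \frac{a_t\mu_{it}^{P_t-1}}{1+a_t\mu_{it}^{P_t-1}}(s_t - 1) + 1 = 0$, $t = 1, 2$.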

The marginal means, marginal variances, and covariance for the BGPR-P are $E(Y_{it}) = \mu_{it}$, $\mathrm{Var}(Y_{it}) = \mu_{it}(1+a_t\mu_{it}^{P_t-1})^2$, $t = 1, 2$, and $\mathrm{Cov}(Y_{i1}, Y_{i2}) = \alpha(c_{i11} - c_{i1}\mu_{i1})(c_{i22} - c_{i2}\mu_{i2})$, where $c_{itt} = E(Y_{it}e^{-Y_{it}}) = \frac{\mu_{it}}{1 - a_t\mu_{it}^{P_t-1}(s_t-1)}e^{\frac{\mu_{it}+a_t\mu_{it}^{P_t-1}}{1+a_t\mu_{it}^{P_t-1}}(s_t-1)-1}$ and $\ln s_t - \frac{a_t\mu_{it}^{P_t-1}}{1+a_t\mu_{it}^{P_t-1}}(s_t - 1) + 1 = 0$, $t = 1, 2$. When $\alpha = 0$, the response variables $Y_1$ and $Y_2$ are independent, each is distributed as a marginal GPR-P. When $\alpha > 0$ and $\alpha < 0$, we have positive and negative correlations, respectively. The BGPR-P reduces to the BPR when $a_1 = a_2 = 0$, handles under- and overdispersion when $a_t < 0$ and $a_t > 0$, $t = 1, 2$, respectively, and reduces to the BGPR-1 and BGPR-2 when $P_1 = P_2 = 1$ and $P_1 = P_2 = 2$, respectively.
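A minimal Python sketch of the BGPR-P joint p.m.f. in Eq. (11) is given below. It writes each marginal through the GPD form (5) with the GPR-P substitutions for $\theta$ and $v$, obtains $s_t$ by bisection (an assumed root finder), and checks that the p.m.f. sums to one and collapses to the BPR of Eq. (3) when $a_1 = a_2 = 0$. All parameter values, the truncation point, and the function names are hypothetical.

```python
import math


def solve_s(v, tol=1e-14):
    # Root in (0, 1) of ln s - v (s - 1) + 1 = 0, found by bisection.
    lo, hi = 1e-12, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.log(mid) - v * (mid - 1.0) + 1.0 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gpr_p_to_gpd(mu, a, p):
    # GPR-P substitution: v = a mu^(P-1) / (1 + a mu^(P-1)),
    # theta = mu / (1 + a mu^(P-1)).
    w = a * mu ** (p - 1.0)
    return mu / (1.0 + w), w / (1.0 + w)


def gpd_pmf(y, theta, v):
    # Eq. (5) on the log scale; zero when theta + v*y <= 0.
    base = theta + v * y
    if base <= 0.0:
        return 0.0
    return math.exp(
        math.log(theta) + (y - 1) * math.log(base) - base - math.lgamma(y + 1)
    )


def bgpr_p_pmf(y, mu, a, p, alpha):
    # Eq. (11): product of the two GPR-P marginals times the correlation factor.
    marginals, factor = 1.0, 1.0
    for y_t, mu_t, a_t, p_t in zip(y, mu, a, p):
        theta, v = gpr_p_to_gpd(mu_t, a_t, p_t)
        c_t = math.exp(theta * (solve_s(v) - 1.0))  # c_it = E(exp(-Y_it))
        marginals *= gpd_pmf(y_t, theta, v)
        factor *= math.exp(-y_t) - c_t
    return marginals * (1.0 + alpha * factor)


# Hypothetical parameter values for one observation.
mu, a, p, alpha = (1.5, 0.8), (0.2, 0.1), (1.2, 2.0), -0.5
total = sum(bgpr_p_pmf((y1, y2), mu, a, p, alpha)
            for y1 in range(80) for y2 in range(80))

# With a1 = a2 = 0 the model should reduce to the BPR of Eq. (3).
d = 1.0 - math.exp(-1.0)
reduced = bgpr_p_pmf((2, 1), mu, (0.0, 0.0), p, alpha)
bpr = (math.exp(-mu[0] - mu[1]) * mu[0] ** 2 * mu[1] / 2.0
       * (1.0 + alpha * (math.exp(-2) - math.exp(-d * mu[0]))
          * (math.exp(-1) - math.exp(-d * mu[1]))))
print(total, reduced, bpr)
```

Setting `p = (1.0, 1.0)` or `p = (2.0, 2.0)` in the same function gives the BGPR-1 and BGPR-2 as special cases, which is the nesting the likelihood ratio tests rely on.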


The advantage of the BGPR-P is that the model allows the two response variables to have flexible forms of marginal mean–variance relationship. It can be seen that the marginal mean–variance relationships of the BGPR-1 are linear since $\mathrm{Var}(Y_{it}) = \mu_{it}(1+a_t)^2$, $t = 1, 2$, the BGPR-2 are cubic since $\mathrm{Var}(Y_{it}) = \mu_{it}(1+a_t\mu_{it})^2$, $t = 1, 2$, and the BGPR-P are to the $(2P-1)$th power since $\mathrm{Var}(Y_{it}) = \mu_{it}(1+a_t\mu_{it}^{P_t-1})^2$, $t = 1, 2$.

The covariates for the BGPR-1, BGPR-2, and BGPR-P can be included in the marginal means using log-link functions (4).
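The three mean–variance relationships can be compared with one small function. This is only an illustrative Python sketch; the function name and the numerical values of `mu`, `a`, and `p` are assumptions, not estimates from the paper.

```python
def gpr_p_variance(mu, a, p):
    # Marginal variance of the GPR-P: mu * (1 + a * mu^(P-1))^2.
    # P = 1 gives the GPR-1 (linear in mu); P = 2 gives the GPR-2 (cubic in mu).
    return mu * (1.0 + a * mu ** (p - 1.0)) ** 2


a = 0.5  # hypothetical dispersion parameter
for mu in (0.5, 1.0, 2.0, 4.0):
    ratios = [gpr_p_variance(mu, a, p) / mu for p in (1.0, 1.5, 2.0)]
    print(mu, ratios)
```

For P = 1 the variance-to-mean ratio stays at $(1+a)^2$ for every mean, whereas for P = 2 it grows with the mean; intermediate values of P interpolate between the two.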

4 Tests

4.1 Likelihood ratio test

A two-sided LRT can be performed to test the dispersion (over- or underdispersion) in BPR against BGPR alternatives (BGPR-1 or BGPR-2) where the hypothesis is $H_0: a_1 = a_2 = 0$. The LRT statistic is $T = 2(\ln L_1 - \ln L_0)$, where $\ln L_1$ and $\ln L_0$ are the models' log likelihood under their respective hypothesis. Since the BPR is nested within the BGPR-1 and BGPR-2, the statistic is asymptotically distributed as a Chi-square with two degrees of freedom.

A two-sided LRT for testing BGPR-1 (or BGPR-2) against BGPR-P, $H_0: P_1 = P_2 = 1$ (or $H_0: P_1 = P_2 = 2$), can also be performed using the LRT, where the statistic is asymptotically distributed as a Chi-square with two degrees of freedom.

Since the response variables $Y_{i1}$ and $Y_{i2}$ are independent when the correlation parameter, $\alpha$, is zero, we can also use a two-sided LRT to test independence, where the hypothesis is $H_0: \alpha = 0$ against $H_1: \alpha \neq 0$. The statistic is asymptotically distributed as a Chi-square with one degree of freedom.

4.2 Wald test

The test of dispersion (over- or underdispersion) in BPR against BGPR alternatives (BGPR-1 or BGPR-2), $H_0: a_t = 0$, $t = 1, 2$, can also be performed using the Wald test. The Wald test statistic is $\hat{a}_t^2 / \mathrm{Var}(\hat{a}_t)$, $t = 1, 2$, where $\hat{a}_t$, $t = 1, 2$, is the estimated dispersion parameter. The statistic is asymptotically distributed as a Chi-square with one degree of freedom.

The independence of response variables $Y_{i1}$ and $Y_{i2}$ can also be tested using the Wald test, where the statistic is $\hat{\alpha}^2 / \mathrm{Var}(\hat{\alpha})$, and $\hat{\alpha}$ is the estimated correlation parameter. The statistic is asymptotically distributed as a Chi-square with one degree of freedom.

For testing the adequacy of BGPR-1 against BGPR-P, $H_0: P_t = 1$, $t = 1, 2$, the Wald test statistic is $(\hat{P}_t - 1)^2 / \mathrm{Var}(\hat{P}_t)$, $t = 1, 2$, where $\hat{P}_t$ is the estimated functional parameter. For testing the adequacy of BGPR-2 against BGPR-P, $H_0: P_t = 2$, $t = 1, 2$, the Wald test statistic is $(\hat{P}_t - 2)^2 / \mathrm{Var}(\hat{P}_t)$. Both statistics asymptotically follow a Chi-square distribution with one degree of freedom.

In terms of preference between the LRT and the Wald test, the LRT may be preferable because it tests the two response variables jointly, whereas the Wald test is applied to each response variable separately. The hypothesis for testing dispersion under the LRT is $H_0: a_1 = a_2 = 0$, compared to $H_0: a_t = 0$, $t = 1, 2$, under the Wald test. For testing adequacy of BGPR-1 (or BGPR-2) against BGPR-P, the hypothesis is $H_0: P_1 = P_2 = 1$ (or $H_0: P_1 = P_2 = 2$) under the LRT, compared to $H_0: P_t = 1$ (or $H_0: P_t = 2$), $t = 1, 2$, under the Wald test.
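The Wald statistics of this section share one form, (estimate minus null value) squared over the estimated variance. A minimal Python sketch follows; the function name and the numerical inputs are hypothetical, and the p-value uses the identity that the upper tail of a Chi-square with one degree of freedom at $x$ equals $\mathrm{erfc}(\sqrt{x/2})$.

```python
import math


def wald_test(estimate, std_error, null_value=0.0):
    # Wald statistic (estimate - null)^2 / Var(estimate), referred to a
    # chi-square distribution with one degree of freedom.
    stat = ((estimate - null_value) / std_error) ** 2
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical estimate of a functional parameter P_t with its standard error,
# tested against H0: P_t = 1 (adequacy of BGPR-1 against BGPR-P).
stat, p_value = wald_test(1.30, 0.15, null_value=1.0)
print(round(stat, 2), round(p_value, 4))
```

Calling `wald_test` with the default `null_value=0.0` covers the dispersion and independence tests, and `null_value=2.0` covers the adequacy of BGPR-2.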

5 Applications

5.1 Australian health data (1977–1978)

The Australian health survey data (Cameron et al. 1988) are considered for the BPR and our forms of the BGPR. The same data were also used by Cameron and Johansson (1997) for fitting several univariate models, Gurmu and Elder (2000) who fitted the bivariate generalized negative binomial regression model, Famoye (2010b) who fitted the BNBR, and Famoye (2012) who fitted the BGPR-2. The health survey data contain a sample of 5190 single-person households based on the 1977–1978 Australian Health Survey.

In this study, we consider two possibly dependent and negatively correlated response variables namely Y1, which is the total number of prescribed medications used in past two days (PRESCRIBED), and Y2, which is the total number of non-prescribed medications used in past two days (NON-PRESCRIBED). We fit the BPR and our forms of BGPR, choose the best model, and perform the test of independence to indicate whether the two response variables should be fitted jointly under bivariate model or independently under univariate model. The mean and standard deviation for prescribed medications are 0.863 and 1.415, respectively, the mean and standard deviation for non-prescribed medications are 0.356 and 0.712, respectively, and the correlation between prescribed and non-prescribed medications is −0.043. The negative correlation indicates possible negative dependency between the two response variables. Further information on the explanatory variables is provided in Cameron et al. (1988).
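As a quick descriptive check that is not part of the paper's analysis, the reported means and standard deviations can be turned into raw variance-to-mean ratios. The Python sketch below uses only the summary statistics quoted above; the function name is an assumption, and the ratios ignore covariates, so they are only a rough indication of dispersion.

```python
def dispersion_index(mean, sd):
    # Sample variance divided by sample mean; a Poisson variable has ratio 1.
    return sd ** 2 / mean


prescribed = dispersion_index(0.863, 1.415)
non_prescribed = dispersion_index(0.356, 0.712)
print(round(prescribed, 2), round(non_prescribed, 2))  # 2.32 1.42
```

Both ratios exceed one, which is in line with fitting models that allow overdispersion in each response.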

Table 1 provides the estimates and standard errors for the BPR, BGPR-1, BGPR-2, and BGPR-P which are fitted jointly to the two response variables. To provide better convergence, the estimates from the BPR are used as initial values for fitting the BGPR.

The LRT statistics for testing BPR against BGPR-1 and BPR against BGPR-2 are 381.46 and 338.16, respectively, indicating overdispersion in the two response variables. The LRT statistic for testing BGPR-1 against BGPR-P is 4.40, which is insignificant since the p-value is 0.11. On the contrary, the LRT statistic for testing BGPR-2 against BGPR-P is 47.70, which is significant. It can also be seen that the difference in AIC between the BGPR-1 and BGPR-P is very minimal, which is 0.41. Based on the LRT, the best model is the BGPR-1, followed by the BGPR-P and BGPR-2.
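The LRT statistics quoted above can be recomputed from the log likelihoods reported in Table 1 using $T = 2(\ln L_1 - \ln L_0)$. The Python sketch below does this; the dictionary layout and function name are illustrative choices, and the p-value uses the fact that the upper tail of a Chi-square with two degrees of freedom at $T$ is $e^{-T/2}$.

```python
import math

# Log likelihoods as reported in Table 1.
loglik = {"BPR": -9522.59, "BGPR-1": -9331.86, "BGPR-2": -9353.51, "BGPR-P": -9329.66}


def lrt(null, alt):
    # T = 2 (ln L1 - ln L0); every comparison here restricts two parameters,
    # and the chi-square upper tail with 2 d.f. is exp(-T / 2).
    stat = 2.0 * (loglik[alt] - loglik[null])
    return stat, math.exp(-stat / 2.0)


comparisons = [("BPR", "BGPR-1"), ("BPR", "BGPR-2"),
               ("BGPR-1", "BGPR-P"), ("BGPR-2", "BGPR-P")]
results = {pair: lrt(*pair) for pair in comparisons}
for (null, alt), (stat, p_value) in results.items():
    print(f"{null} vs {alt}: T = {stat:.2f}, p = {p_value:.3g}")
```

The computed statistics are 381.46, 338.16, 4.40, and 47.70, and the p-value for BGPR-1 against BGPR-P is about 0.11, matching the values in the text.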

The estimates of correlation parameter under all models are negative, indicating negative dependence between the two response variables. The absolute values of t-ratio for the correlation parameter under the BPR, BGPR-1, BGPR-2, and BGPR-P, respectively, are 6.66, 6.94, 7.13, and 6.95, indicating that the two response variables are significantly dependent. Therefore, the two response variables are suggested to be fitted jointly under the BGPR-1, which is the best model compared to the BGPR-P and BGPR-2.

Table 1 BPR, BGPR-1, BGPR-2, and BGPR-P (Australian health care data)

| Parameter | BPR Est. | BPR SE | BGPR-1 Est. | BGPR-1 SE | BGPR-2 Est. | BGPR-2 SE | BGPR-P Est. | BGPR-P SE |
|---|---|---|---|---|---|---|---|---|
| Y1, prescribed | | | | | | | | |
| Intercept | −2.70 | 0.13 | −2.66 | 0.15 | −2.76 | 0.15 | −2.70 | 0.15 |
| Sex | 0.48 | 0.04 | 0.55 | 0.04 | 0.55 | 0.04 | 0.55 | 0.04 |
| Age | 2.41 | 0.62 | 2.33 | 0.71 | 2.36 | 0.73 | 2.34 | 0.72 |
| Agesq | −0.64 | 0.64 | −0.65 | 0.74 | −0.59 | 0.78 | −0.63 | 0.76 |
| Income | 0.00 | 0.06 | −0.01 | 0.06 | 0.03 | 0.06 | 0.01 | 0.06 |
| Levyplus | 0.29 | 0.05 | 0.27 | 0.06 | 0.27 | 0.06 | 0.27 | 0.06 |
| Freepoor | −0.05 | 0.12 | −0.09 | 0.14 | −0.06 | 0.13 | −0.09 | 0.14 |
| Freerepa | 0.30 | 0.06 | 0.28 | 0.07 | 0.30 | 0.07 | 0.28 | 0.07 |
| Illness | 0.20 | 0.01 | 0.20 | 0.01 | 0.21 | 0.00 | 0.20 | 0.01 |
| Actdays | 0.03 | 0.01 | 0.03 | 0.01 | 0.03 | 0.01 | 0.03 | 0.00 |
| Hscore | 0.02 | 0.01 | 0.02 | 0.01 | 0.02 | 0.01 | 0.20 | 0.01 |
| Chcond1 | 0.77 | 0.05 | 0.76 | 0.05 | 0.77 | 0.05 | 0.77 | 0.05 |
| Chcond2 | 1.01 | 0.05 | 1.00 | 0.06 | 1.02 | 0.06 | 1.01 | 0.06 |
| Y2, nonprescribed | | | | | | | | |
| Intercept | −2.03 | 0.17 | −1.95 | 0.19 | −2.04 | 0.19 | −1.97 | 0.19 |
| Sex | 0.27 | 0.05 | 0.26 | 0.06 | 0.27 | 0.06 | 0.27 | 0.06 |
| Age | 2.86 | 0.95 | 2.83 | 1.05 | 2.88 | 1.08 | 2.84 | 1.06 |
| Agesq | −3.90 | 1.07 | −3.89 | 1.19 | −3.87 | 1.22 | −3.90 | 1.20 |
| Income | 0.17 | 0.08 | 0.11 | 0.09 | 0.16 | 0.09 | 0.13 | 0.09 |
| Levyplus | −0.03 | 0.06 | −0.05 | 0.06 | −0.04 | 0.07 | −0.04 | 0.06 |
| Freepoor | 0.00 | 0.12 | −0.08 | 0.14 | −0.01 | 0.14 | −0.06 | 0.14 |
| Freerepa | −0.29 | 0.09 | −0.29 | 0.10 | −0.29 | 0.10 | −0.29 | 0.10 |
| Illness | 0.20 | 0.02 | 0.20 | 0.02 | 0.21 | 0.02 | 0.20 | 0.02 |
| Actdays | 0.01 | 0.01 | −0.00 | 0.01 | 0.01 | 0.01 | −0.00 | 0.01 |
| Hscore | 0.03 | 0.01 | 0.03 | 0.01 | 0.03 | 0.01 | 0.03 | 0.01 |
| Chcond1 | 0.15 | 0.06 | 0.14 | 0.06 | 0.15 | 0.06 | 0.14 | 0.06 |
| Chcond2 | 0.02 | 0.08 | 0.03 | 0.09 | 0.01 | 0.09 | 0.03 | 0.09 |
| a1, dispersion | − | − | 0.18 | 0.02 | 0.13 | 0.01 | 0.18 | 0.02 |
| a2, dispersion | − | − | 0.14 | 0.02 | 0.35 | 0.04 | 0.17 | 0.06 |
| P1, functional | − | − | 1.00 | − | 2.00 | − | 1.22 | 0.11 |
| P2, functional | − | − | 1.00 | − | 2.00 | − | 1.20 | 0.33 |
| α, correlation | −0.89 | 0.13 | −0.91 | 0.13 | −0.94 | 0.13 | −0.91 | 0.13 |
| Log likelihood | −9522.59 | | −9331.86 | | −9353.51 | | −9329.66 | |
| AIC | 19,099.18 | | 18,721.72 | | 18,765.01 | | 18,721.31 | |

For comparison purposes, the univariate models (Poisson regression model, GPR-1, GPR-2, and GPR-P) are fitted separately to the two response variables so that the LRT can be performed for testing independence. The estimates and standard errors for the fitted models are shown in Table 2.

Using the results from Tables 1 and 2, the LRT for testing independence, where $H_0: \alpha = 0$ against $H_1: \alpha \neq 0$, can be implemented for testing univariate Poisson regression model against BPR, univariate GPR-1 against BGPR-1, univariate GPR-2 against BGPR-2, and univariate GPR-P against BGPR-P. The LRT statistics are 38.58, 41.22, 44.22, and 41.52, respectively, indicating that the two response variables are dependent under all models.
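Under $H_0: \alpha = 0$ the joint likelihood factorizes, so the null log likelihood is the sum of the two univariate log likelihoods in Table 2. The Python sketch below recomputes the four independence statistics from the log likelihoods reported in Tables 1 and 2; the dictionary layout and function name are illustrative, and the p-value uses the one-degree-of-freedom identity $\mathrm{erfc}(\sqrt{T/2})$.

```python
import math

# Bivariate log likelihoods (Table 1) and univariate (Y1, Y2) pairs (Table 2).
bivariate = {"Poisson": -9522.59, "GPR-1": -9331.86,
             "GPR-2": -9353.51, "GPR-P": -9329.66}
univariate = {"Poisson": (-5530.77, -4011.11), "GPR-1": (-5425.49, -3926.98),
              "GPR-2": (-5445.46, -3930.16), "GPR-P": (-5423.72, -3926.70)}


def independence_lrt(name):
    # alpha = 0: the joint log likelihood is the sum of the univariate ones.
    loglik_null = sum(univariate[name])
    stat = 2.0 * (bivariate[name] - loglik_null)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square upper tail, 1 d.f.
    return stat, p_value


independence = {name: independence_lrt(name) for name in bivariate}
for name, (stat, p_value) in independence.items():
    print(f"{name}: T = {stat:.2f}, p = {p_value:.2g}")
```

The statistics come out as 38.58, 41.22, 44.22, and 41.52, far above the 5% critical value of a Chi-square with one degree of freedom.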

Comparison in terms of significance of the estimates of the regression parameters between the BGPR-1 (which is the best model) and the univariate GPR-1 shows that both models provide the same insignificant estimates at 0.05 level, namely Agesq, Income, and Freepoor in Y1 (PRESCRIBED), and Income, Levyplus, Freepoor, Actdays, and Chcond2 in Y2 (NON-PRESCRIBED).

Comparison of the estimates of the regression parameters between the bivariate and the univariate regression models shows that most regression parameters have similar estimates. However, there are several regression parameters that indicate otherwise. If we consider only significant regression parameters, the estimate of Age in prescribed medicines and the estimates of Age and Agesq in non-prescribed medicines are different under the BGPR-1 and univariate GPR-1. The use of prescribed medicines is higher, but the use of non-prescribed medicines is lower for older individuals under the BGPR-1 compared to the univariate GPR-1.

5.2 US NMES data (1987–1988)

The US National Medical Expenditure Survey (NMES) data (Deb and Trivedi 1997) are also considered for fitting the BPR and our forms of BGPR. The same data were used by Deb and Trivedi (1997) for fitting several univariate NB models (hurdle and finite mixtures). The health survey data contain a subsample of 4406 observations of individuals aged 66 and over covered by Medicare.

In this study, we consider two possibly dependent and positively correlated response variables which are the number of hospital stays (HOSP, Y1) and the number of non-physician hospital outpatient visits (OPNP, Y2). We fit the BPR and our forms of BGPR, choose the best model, and perform the test of independence to indicate whether the two response variables should be fitted jointly under bivariate model or independently under univariate model. The mean and standard deviation for hospital stays are 0.30 and 0.75, respectively; the mean and standard deviation for non-physician outpatient visits are 0.54 and 3.88, respectively; and the correlation between hospital stays and non-physician outpatient visits is 0.065. The positive correlation indicates possible positive dependency between the two response variables. Further information on the explanatory variables is provided in Deb and Trivedi (1997).


Table 2 Univariate Poisson, GPR-1, GPR-2, and GPR-P (Australian health care data)

| Parameter | Poisson Est. | Poisson SE | GPR-1 Est. | GPR-1 SE | GPR-2 Est. | GPR-2 SE | GPR-P Est. | GPR-P SE |
|---|---|---|---|---|---|---|---|---|
| Y1, prescribed | | | | | | | | |
| Intercept | −2.74 | 0.13 | −2.62 | 0.15 | −2.73 | 0.15 | −2.66 | 0.15 |
| Sex | 0.48 | 0.04 | 0.55 | 0.04 | 0.55 | 0.04 | 0.55 | 0.04 |
| Age | 2.65 | 0.61 | 2.06 | 0.71 | 2.12 | 0.73 | 2.07 | 0.72 |
| Agesq | −0.89 | 0.64 | −0.36 | 0.74 | −0.30 | 0.78 | −0.33 | 0.76 |
| Income | 0.00 | 0.06 | 0.00 | 0.06 | 0.03 | 0.06 | 0.01 | 0.07 |
| Levyplus | 0.28 | 0.05 | 0.27 | 0.06 | 0.27 | 0.06 | 0.28 | 0.06 |
| Freepoor | −0.05 | 0.12 | −0.10 | 0.14 | −0.06 | 0.13 | −0.09 | 0.14 |
| Freerepa | 0.30 | 0.06 | 0.28 | 0.07 | 0.29 | 0.07 | 0.29 | 0.07 |
| Illness | 0.20 | 0.01 | 0.20 | 0.01 | 0.21 | 0.01 | 0.20 | 0.01 |
| Actdays | 0.03 | 0.00 | 0.03 | 0.00 | 0.04 | 0.01 | 0.03 | 0.00 |
| Hscore | 0.02 | 0.01 | 0.02 | 0.01 | 0.02 | 0.01 | 0.02 | 0.01 |
| Chcond1 | 0.78 | 0.05 | 0.76 | 0.05 | 0.77 | 0.05 | 0.78 | 0.05 |
| Chcond2 | 1.01 | 0.05 | 1.00 | 0.06 | 1.02 | 0.06 | 1.02 | 0.06 |
| a, dispersion | − | − | 0.18 | 0.02 | 0.13 | 0.01 | 0.18 | 0.02 |
| P, functional | − | − | − | − | − | − | 1.21 | 0.11 |
| Log likelihood | −5530.77 | | −5425.49 | | −5445.46 | | −5423.72 | |
| AIC | 11,087.53 | | 10,878.98 | | 10,918.92 | | 10,877.44 | |
| Y2, nonprescribed | | | | | | | | |
| Intercept | −2.31 | 0.17 | −2.25 | 0.19 | −2.33 | 0.19 | −2.28 | 0.19 |
| Sex | 0.24 | 0.05 | 0.24 | 0.06 | 0.25 | 0.06 | 0.24 | 0.06 |
| Age | 4.69 | 0.94 | 4.82 | 1.04 | 4.79 | 1.08 | 4.85 | 1.06 |
| Agesq | −5.93 | 1.07 | −6.13 | 1.18 | −6.02 | 1.22 | −6.16 | 1.20 |
| Income | 0.12 | 0.08 | 0.05 | 0.08 | 0.11 | 0.09 | 0.07 | 0.09 |
| Levyplus | −0.03 | 0.06 | −0.04 | 0.06 | −0.04 | 0.07 | −0.04 | 0.06 |
| Freepoor | −0.02 | 0.12 | −0.08 | 0.14 | −0.02 | 0.14 | −0.07 | 0.14 |
| Freerepa | −0.28 | 0.09 | −0.28 | 0.10 | −0.29 | 0.10 | −0.29 | 0.10 |
| Illness | 0.20 | 0.02 | 0.20 | 0.02 | 0.21 | 0.02 | 0.21 | 0.02 |
| Actdays | 0.00 | 0.01 | −0.01 | 0.01 | 0.00 | 0.01 | 0.00 | 0.01 |
| Hscore | 0.03 | 0.01 | 0.03 | 0.01 | 0.03 | 0.01 | 0.03 | 0.01 |
| Chcond1 | 0.16 | 0.06 | 0.13 | 0.06 | 0.16 | 0.06 | 0.14 | 0.06 |
| Chcond2 | 0.01 | 0.08 | 0.01 | 0.09 | 0.00 | 0.09 | 0.01 | 0.09 |
| a, dispersion | − | − | 0.14 | 0.02 | 0.35 | 0.04 | 0.18 | 0.06 |
| P, functional | − | − | 1.00 | − | 2.00 | − | 1.24 | 0.31 |
| Log likelihood | −4011.11 | | −3926.98 | | −3930.16 | | −3926.70 | |
| AIC | 8048.209 | | 7881.967 | | 7888.313 | | 7883.403 | |


Table 3 provides the estimates and standard errors for the BPR, BGPR-1, BGPR-2,and BGPR-P which are fitted to the two response variables. The LRT statistics fortesting BPR against BGPR-1 and BPR against BGPR-2 are 8299.80 and 8268.54,respectively, indicating that the two response variables are overdispersed. The LRTstatistics for testing BGPR-1 against BGPR-P and BGPR-2 against BGPR-P are 17.40and 48.66, respectively, indicating that the BGPR-P is better than both the BGPR-1and BGPR-2. Based on the LRT and AIC, the best model is the BGPR-P, followed bythe BGPR-1 and BGPR-2.

The estimates of the correlation parameter under all models are positive, indicating positive dependence between the two response variables. The absolute values of the t-ratio under the BPR, BGPR-1, BGPR-2, and BGPR-P are 11.36, 7.75, 7.54, and 7.63, respectively, indicating that the two response variables are significantly dependent under these models. Therefore, the two response variables should be fitted jointly under the BGPR-P, which is the best model compared to the BGPR-1 and BGPR-2.

Although the estimates of the regression parameters are not shown here, the univariate GPR-1, GPR-2, and GPR-P are fitted to the two response variables so that the LRT can be performed for testing independence, where H0: α = 0 against H1: α ≠ 0. The LRT statistics for testing univariate GPR-1 against BGPR-1, univariate GPR-2 against BGPR-2, and univariate GPR-P against BGPR-P are 55.72, 63.58, and 64.94, respectively, indicating that the two response variables are dependent under all of the BGPR models.
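To make the size of these independence statistics concrete, the following sketch converts them into approximate p-values. It assumes that the independence hypothesis restricts a single parameter (α), so that the reference distribution is chi-square with 1 degree of freedom; the helper name chi2_sf_1df is introduced here for illustration only.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    If Z is standard normal, P(Z**2 > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# LRT statistics for independence quoted in the text (US NMES data).
independence_lrt = {
    "GPR-1 vs BGPR-1": 55.72,
    "GPR-2 vs BGPR-2": 63.58,
    "GPR-P vs BGPR-P": 64.94,
}

CRITICAL_5PCT_1DF = 3.841  # 95th percentile of the chi-square(1) distribution

for name, stat in independence_lrt.items():
    reject = stat > CRITICAL_5PCT_1DF
    print(f"{name}: LRT = {stat:.2f}, p = {chi2_sf_1df(stat):.2e}, reject H0: {reject}")
```

All three statistics lie far above the 5 % critical value, which is consistent with the conclusion that the two responses are dependent.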

If we compare the BGPR-P (which is the best model) with the univariate GPR-P, the estimate of Black in Y2 is insignificant under the BGPR-P (with p-value 0.28), but significant under the univariate GPR-P (with p-value 0.05). For other regression parameters, both the BGPR-P and univariate GPR-P provide the same significant estimates at 0.10 level, namely Exclhlth, Poorhlth, Numchron, Adldiff, Age, Male, and Privins in Y1 (HOSP), and Exclhlth, Numchron, Noreast, Midwest, Age, Male, School, and Privins in Y2 (OPNP).
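The p-value quoted for Black under the BGPR-P can be approximately reproduced from the rounded entries of Table 3. The sketch below assumes a Wald-type test based on the ratio of the estimate to its standard error with a standard normal reference distribution; the function name wald_p_value is hypothetical, and small discrepancies from the authors' figures are expected because the table values are rounded.

```python
import math


def wald_p_value(estimate, std_error):
    """Return the t-ratio and the two-sided p-value under a standard normal reference."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Rounded BGPR-P entries from Table 3 (US NMES data).
z_black, p_black = wald_p_value(-0.15, 0.14)      # Black, OPNP (Y2)
z_privins, p_privins = wald_p_value(0.19, 0.10)   # Privins, HOSP (Y1)

print(f"Black (OPNP):   t-ratio = {z_black:.2f}, p-value = {p_black:.2f}")
print(f"Privins (HOSP): t-ratio = {z_privins:.2f}, p-value = {p_privins:.3f}")
```

With these inputs the p-value for Black is about 0.28, in line with the value quoted above, and Privins in HOSP falls below the 0.10 level.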

If we consider only the significant estimates of the regression parameters, comparison between the BGPR-P and the univariate GPR-P shows that the estimates of Exclhlth in HOSP (Y1) and of Exclhlth and Privins in OPNP (Y2) are slightly different. It is interesting to observe that the BGPR-P, which accounts for the positive correlation between hospital stays and non-physician outpatient visits, predicts a lower number of hospital stays and non-physician outpatient visits for individuals with excellent health (Exclhlth) than the univariate GPR-P does. Another interesting observation is that individuals with private health insurance (Privins) have a higher number of non-physician hospital outpatient visits under the BGPR-P than under the univariate GPR-P.
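For readers who prefer multiplicative effects, the BGPR-P coefficients discussed above can be expressed as rate ratios. The sketch below assumes a log link for the marginal means, so that a one-unit change in a covariate multiplies the mean count by exp(β); this link is an assumption here and should be checked against the model specification. Only the BGPR-P estimates from Table 3 are used, because the univariate GPR-P estimates are not reported.

```python
import math

# BGPR-P estimates taken from Table 3 (US NMES data).
bgpr_p_estimates = {
    ("HOSP", "Exclhlth"): -0.68,
    ("OPNP", "Exclhlth"): -0.45,
    ("OPNP", "Privins"): 0.50,
}


def rate_ratio(beta):
    """Multiplicative change in the mean count per unit covariate change (log link assumed)."""
    return math.exp(beta)


for (response, covariate), beta in bgpr_p_estimates.items():
    print(f"{response} / {covariate}: exp({beta:+.2f}) = {rate_ratio(beta):.2f}")
```

Under this assumption, excellent health is associated with roughly half the expected number of hospital stays, and private insurance with roughly 65 % more non-physician outpatient visits, holding the other covariates fixed.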

Table 3 BPR, BGPR-1, BGPR-2, and BGPR-P (US NMES data)

Parameter                    BPR          BGPR-1          BGPR-2          BGPR-P
                    Est.      SE    Est.      SE    Est.      SE    Est.      SE

(HOSP, Y1)
Intercept          −2.62    0.36   −2.75    0.44   −3.43    0.49   −3.29    0.46
Exclhlth           −0.77    0.17   −0.58    0.19   −0.69    0.19   −0.68    0.20
Poorhlth            0.47    0.07    0.51    0.08    0.51    0.10    0.53    0.09
Numchron            0.25    0.02    0.25    0.02    0.28    0.03    0.26    0.02
Adldiff             0.43    0.07    0.32    0.08    0.35    0.09    0.33    0.09
Noreast            −0.06    0.08    0.02    0.09   −0.04    0.10    0.02    0.10
Midwest             0.13    0.07    0.09    0.09    0.12    0.09    0.08    0.09
West                0.04    0.08    0.06    0.10    0.13    0.11    0.09    0.10
Age                 0.06    0.04    0.08    0.05    0.17    0.06    0.14    0.06
Black               0.37    0.09    0.09    0.11    0.13    0.12    0.09    0.11
Male                0.20    0.06    0.18    0.07    0.24    0.08    0.19    0.08
Married             0.00    0.06   −0.04    0.08   −0.05    0.09    0.00    0.08
School              0.00    0.01    0.00    0.01    0.00    0.01    0.01    0.01
Faminc              0.00    0.01    0.01    0.01    0.00    0.01    0.00    0.01
Employed            0.00    0.10    0.03    0.12    0.04    0.13    0.05    0.13
Privins             0.30    0.08    0.15    0.09    0.15    0.11    0.19    0.10
Medicaid            0.21    0.10    0.19    0.12    0.12    0.14    0.20    0.13

(OPNP, Y2)
Intercept           2.66    0.30    2.32    0.57    1.22    0.89    0.94    0.66
Exclhlth           −0.97    0.14   −0.62    0.19   −0.78    0.24   −0.45    0.18
Poorhlth           −0.16    0.06    0.09    0.12    0.34    0.25    0.08    0.10
Numchron            0.16    0.01    0.14    0.03    0.30    0.07    0.11    0.03
Adldiff             0.68    0.05    0.21    0.10    0.77    0.19    0.12    0.11
Noreast            −0.08    0.06    0.34    0.10    0.17    0.18    0.29    0.09
Midwest             0.25    0.05    0.25    0.10    0.38    0.16    0.22    0.09
West               −0.34    0.07    0.04    0.12    0.04    0.20    0.07    0.09
Age                −0.60    0.04   −0.53    0.07   −0.49    0.11   −0.33    0.11
Black               1.09    0.05   −0.07    0.14    0.94    0.22   −0.15    0.14
Male                0.14    0.05   −0.19    0.08   −0.06    0.15   −0.18    0.07
Married             0.04    0.05   −0.04    0.09    0.07    0.15   −0.02    0.07
School             −0.01    0.01    0.03    0.01    0.01    0.02    0.03    0.01
Faminc              0.00    0.01    0.01    0.01   −0.02    0.03    0.00    0.01
Employed           −0.28    0.08   −0.21    0.13   −0.15    0.21   −0.16    0.12
Privins             0.67    0.06    0.54    0.13    0.76    0.19    0.50    0.16
Medicaid           −0.02    0.08   −0.08    0.18   −0.07    0.26   −0.04    0.15

a1, dispersion         −       −    0.29    0.03    0.76    0.07    0.39    0.07
a2, dispersion         −       −    2.13    0.17    3.36    0.16    1.82    0.43
P1, functional         −       −    1.00       −    2.00       −    1.29    0.16
P2, functional         −       −    1.00       −    2.00       −    0.67    0.35
α, correlation      1.74    0.15    1.54    0.20    1.56    0.21    1.53    0.20
Log likelihood          −9992.87        −5842.97        −5858.60        −5834.27
AIC                    20,055.75       11,759.95       11,791.20       11,746.53

6 Conclusions

This paper has defined several forms of nested BGPR which allow the LRT to be applied for choosing the best model. The proposed forms of BGPR allow positive, zero, or negative correlation, over- or underdispersion of the two response variables, and have flexible form of marginal mean–variance relationship of the two response variables.

The proposed forms of BGPR are fitted to the Australian health survey data (Cameron et al. 1988) and the US NMES data (Deb and Trivedi 1997). Based on the LRT, the best model for the Australian data is the BGPR-1, followed by the BGPR-P, BGPR-2, and BPR. The estimates of the correlation parameter under all models for the Australian data are significantly negative, suggesting that the two response variables should be fitted jointly under the BGPR-1, which is the best model compared to the BGPR-P, BGPR-2, and BPR. Comparison between the BGPR-1 and univariate GPR-1 shows that several significant estimates of regression parameters in Y1 and Y2 differ between the two models.

As for the US NMES data, the best model is the BGPR-P, followed by the BGPR-1, BGPR-2, and BPR. The estimates of the correlation parameter under all models are significantly positive, suggesting that the two response variables should be fitted jointly using the BGPR-P, which is the best model compared to the BGPR-1, BGPR-2, and BPR. Comparison between the BGPR-P and univariate GPR-P also shows that several significant estimates of regression parameters in Y1 and Y2 differ between the two models.

Acknowledgments The authors gratefully acknowledge the financial support received in the form of research grants (FRGS/1/2015/SG04/UKM/02/2 and GUP-2015-002) from the Ministry of Higher Education (MOHE), Malaysia.

References

Banerjee DP (1959) On some theorems on Poisson distribution. Proc Natl Acad Sci India Sec A 28:30–33

Cameron AC, Johansson P (1997) Count data regression using series expansion: with applications. J Appl Econom 12:203–223

Cameron AC, Trivedi PK, Milne F, Piggott J (1988) A microeconomic model of the demand for health care and health insurance in Australia. R Econ Stud 55:85–106

Campbell JT (1934) The Poisson correlation function. Proc Edinb Math Soc 4:18–26

Consul PC, Famoye F (1992) Generalized Poisson regression model. Commun Stat Theory 21(1):89–109

Consul PC, Famoye F (2006) Lagrangian probability distributions. Birkhäuser, Boston

Consul PC, Jain GC (1973) A generalization of Poisson distribution. Technometrics 15:791–799

Deb P, Trivedi PK (1997) Demand for medical care by the elderly: a finite mixture approach. J Appl Econom 12:313–336

Famoye F (2010a) A new bivariate generalized Poisson distribution. Stat Neerl 64(1):112–124

Famoye F (2010b) On the bivariate negative binomial regression model. J Appl Stat 37(6):969–981

Famoye F (2012) Comparisons of some bivariate regression models. J Stat Comput Simul 82(7):937–949

Famoye F, Consul PC (1995) Bivariate generalized Poisson distribution with some applications. Metrika 42:127–138

Famoye F, Wulu JT, Singh KP (2004) On the generalized Poisson regression model with an application to accident data. J Data Sci 2:287–295

Gurmu S, Elder J (2000) Generalized bivariate count data regression models. Econ Lett 68:31–36

Johnson NL, Kotz S, Balakrishnan N (1997) Discrete multivariate distributions. Wiley, New York

Kocherlakota S, Kocherlakota K (1992) Regression in the bivariate Poisson distribution. Commun Stat Theory 30(5):815–825

Lakshminarayana J, Pandit SNN, Rao KS (1999) On a bivariate Poisson distribution. Commun Stat Theory 28(2):267–276

Lee A (1999) Modelling rugby league data via bivariate negative binomial regression. Aust NZ J Stat 41(2):141–152

Mitchell CR, Paulson AS (1981) A new bivariate negative binomial distribution. Nav Res Logist Q 28:359–374

Sibuya M, Yoshimura I, Shimizu R (1964) Negative multinomial distribution. Ann I Stat Math 16:409–426

Wang WR, Famoye F (1997) Modeling household fertility decisions with generalized Poisson regression. J Popul Econ 10(3):273–283

Zamani H, Ismail N (2012) Functional form for the generalized Poisson regression model. Commun Stat Theory 41(20):3666–3675

Zamani H, Ismail N (2013) Score test for testing zero-inflated Poisson regression against zero-inflated generalized Poisson alternatives. J Appl Stat 40(9):2056–2068