Econometrics Paper

De La Salle University

A Regression Analysis on the Factors Affecting Total Health

Expenditure per Capita in Asian Countries

An Individual Report Presented to

The Faculty of Economics Department

In partial fulfillment of the course requirements in

Basic Econometrics

Submitted to: Dr. Cesar Rufino

Submitted by: Maria Pamela A. Ramos

September 6, 2013


INTRODUCTION
    Background of the Study
    Statement of the Problem
    Significance of the Study
    Objectives of the Study
    Scope and Limitation
REVIEW OF RELATED LITERATURE
THEORETICAL FRAMEWORK
OPERATIONAL FRAMEWORK
    Description of Variables
    A-Priori Expectations
    Introduction of Hypothesized Econometric Model
METHODOLOGY
    Presentation of Data
    Empirical Procedures
EMPIRICAL RESULTS AND INTERPRETATION OF RESULTS
    Summary Statistics
    Initial Regression
    Overall Test of Significance
    Test for Multicollinearity
    Test for Heteroscedasticity
    Test for Misspecification
CONCLUSION AND RECOMMENDATION
BIBLIOGRAPHY


I. Introduction

A. Statement of the Problem

Health care is one of the most important services a country provides. Total health care expenditure per capita is one way to measure whether health care is adequately provided in a country. This paper aims to determine what causes total health care expenditure per capita to grow or to decrease.

B. Significance of the Study

In every country, both government and private hospitals are expected to provide quality health care services to the people. The mix varies, however, because there are countries where government expenditure on health is greater than private expenditure on health. This study is important because it can help countries identify what to do to provide quality health care services.

C. Objectives of the Study

a.) To determine what affects total health care expenditure per capita.

b.) To understand the relationship of the government health expenditure,

private health expenditure and population towards the total health

expenditure per capita.

c.) To give policy recommendation that would help increase total health

expenditure per capita.

D. Scope and Limitation

This study uses cross-sectional data to compare the total health expenditure per capita of selected countries in Asia. Data from the year 2008 were used because that year had complete information and values. Only 33 Asian countries were included, because the data for some countries were still incomplete. The data shown are estimated or rounded values. Total health expenditure per capita is measured at purchasing power parity (NCU per US$), government and private health expenditure are measured in millions of current US$, and population is measured in thousands.


II. Review of Related Literature

The World Health Organization, together with other organizations, has conducted studies on the determinants of health expenditure. One example is the paper by Ke Xu, Priyanka Saksena and Alberto Holly entitled The Determinants of Health Expenditure: A Country-level Panel Data Analysis. The authors summarize their study as follows:

The rapid growth of health expenditure has become a great concern for both households and governments. There is extensive literature on the determinants of health expenditure in OECD countries, but the same is not true for developing countries. The aim of this study is to understand the trajectory of health expenditure in developing countries. We use panel data from 143 countries over 14 years, from 1995 to 2008 to study this. We apply both standard fixed effects and dynamic models to explore the factors associated with the growth of total health expenditure as well as its main components namely, government health expenditure and out-of-pocket payments. Our data show great variation across countries in health expenditure as a share of GDP, which ranges from less than 5% to 15%. Apart from income many factors contribute to this variation, ranging from demographic factors to health system characteristics. Our results suggest that health expenditure in general does not grow faster than GDP after taking other factors into consideration. Income elasticity is between 0.75 and 0.95 in the fixed effect model while it is much smaller in the dynamic model. We found no difference in health expenditure between tax-based and insurance based health financing mechanisms. The study also confirms the existence of fungibility, where external aid for health reduces government health spending from domestic sources. However, the decrease is much smaller than a dollar to dollar substitution. The study also finds that government health expenditure and out-of-pocket payments follow different paths and that the pace of health expenditure growth is different for countries at different levels of economic development.


III. Theoretical Framework

The concept of gross domestic product (GDP) per capita is used in this study. Blanchard (2010) defines gross domestic product in three equivalent ways: (1) GDP is the value of final goods and services produced in the economy during a given period; (2) GDP is the sum of value added in the economy during a given period; and (3) GDP is the sum of incomes in the economy during a given period. GDP has four components. The first is consumption, which is the acquisition of goods and services by consumers. The second is investment, the sum of nonresidential and residential investment. The third is government expenditure, the total procurement of goods and services by the government. The last component is net exports, the difference between exports and imports.

GDP per capita, according to Investopedia (retrieved August 18, 2013), is the quotient of a country's GDP and its population. A greater GDP per capita indicates growth in the economy and higher productivity. It is useful when comparing the performance of one country relative to another.

In this research, total expenditure on health per capita takes the role of GDP per capita.


IV. Operational Framework

A. Description of Variables

Table 1: Variable List and Description

Regressand or Dependent Variable

Total expenditure on health / capita at Purchasing Power Parity (NCU per US$): The quantitative variable that measures the quotient of a country's total health expenditure, which consists of total government health expenditure and total private expenditure on health, and the country's total population. It is measured at purchasing power parity.

Regressors or Independent Variables

General Government Expenditure on Health: Total government expenditure on health. It is measured in million current US$.

Private Expenditure on Health: Total outlays for health by households as direct payments, also called out-of-pocket expenditure. It is measured in million current US$.

Population: Total number of de facto resident population, as provided by the United Nations Population Division and reported by the World Health Organization. It is measured in thousands.

B. A-Priori Expectations of Regressors

Table 2: A-Priori Expectations

Endogenous Variable

thecap: Total expenditure on health / capita at Purchasing Power Parity (NCU per US$)

Exogenous Variables and A-Priori Expectations

geh (General Government Expenditure on Health): General government expenditure on health is expected to have a positive relationship with total health expenditure per capita. An increase in general government expenditure on health also increases the total expenditure on health of a country. A larger total expenditure, divided by the same total population, yields a higher total expenditure per capita.

peh (Private Expenditure on Health): Out-of-pocket expenditure on health is expected to have a positive effect on total health expenditure per capita. An increase in private expenditure on health also increases the total expenditure on health of a country. A larger total expenditure, divided by the same total population, yields a higher total expenditure per capita.

pop (Population): Population is expected to have a negative relationship with total health expenditure per capita. As population increases, total health expenditure per capita will decrease, because the expenditure is divided among more residents or citizens of the country.

C. Introduction to Hypothesized Econometric Model

Based on the economic theories discussed in the preceding chapters, the hypothesized econometric model is developed below. The model is specified in log-log form. This was done to put the variables on a comparable scale and to make the model less susceptible to data bias.

Model for Estimation:

$$\ln(\mathit{thecap}_i) = \beta_1 + \beta_2 \ln(\mathit{geh}_i) + \beta_3 \ln(\mathit{peh}_i) + \beta_4 \ln(\mathit{pop}_i) + u_i$$
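A short derivation may help show why the log-log form is convenient for interpretation. The block below is a sketch in assumed notation: $\beta_1$ for the intercept, $\beta_2$ to $\beta_4$ for the slopes on geh, peh and pop, and $u_i$ for the disturbance. These symbols are a labelling choice made here for illustration and are not necessarily the author's.

$$
\begin{aligned}
\ln(\mathit{thecap}_i) &= \beta_1 + \beta_2 \ln(\mathit{geh}_i) + \beta_3 \ln(\mathit{peh}_i) + \beta_4 \ln(\mathit{pop}_i) + u_i \\
\frac{\partial \ln(\mathit{thecap})}{\partial \ln(\mathit{geh})} &= \beta_2
\quad\Longrightarrow\quad
\frac{\Delta \mathit{thecap}}{\mathit{thecap}} \approx \beta_2 \, \frac{\Delta \mathit{geh}}{\mathit{geh}}
\end{aligned}
$$

Under this reading each slope is an elasticity: a 1% change in a regressor is associated with roughly a $\beta$% change in total health expenditure per capita, holding the other regressors constant. The same step applies to $\beta_3$ and $\beta_4$.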


V. Methodology

A. Data

The data used in this research come from the World Health Organization's (WHO) Global Health Expenditure Database. This database supplies internationally comparable figures on national health expenditures. WHO updates the data annually from publicly available reports such as national health accounts reports and reports from national statistics offices, central banks, the public expenditure accounts of the World Bank, the International Monetary Fund and the like.

The data taken for this empirical analysis are the values of total health expenditure per capita, general government expenditure on health, out-of-pocket expenditure and population of 33 Asian countries for the year 2008. The data are therefore cross-sectional.

Table 3: Data (thecap at PPP, NCU per US$; geh and peh in million current US$; pop in thousands)

Country                           thecap       geh       peh         pop
Afghanistan                           30        64       837      29,840
Armenia                              230       196       244       3,079
Azerbaijan                           373       403     1,734       8,944
Bangladesh                            19     1,003     1,812     145,478
Bhutan                               246        57         9         701
Cambodia                             111       105     2,579      13,823
China                                285   104,486   104,705   1,335,720
Georgia                              440       228       923       4,394
India                                112    13,383    37,468   1,190,864
Indonesia                            110     5,827     8,682     234,951
Iran (Islamic Republic of)           754     8,840    13,742      72,289
Israel                             1,971     9,582     5,300       7,309
Japan                              2,878   335,561    79,834     127,692
Jordan                               479     1,193       735       5,849
Kazakhstan                           440     3,019     2,145      15,655
Kuwait                             1,052     2,228       619       2,548
Kyrgyzstan                           137       161       151       5,204
Lao People's Democratic Republic      90        53       169       6,022
Lebanon                              886       915     1,312       4,167
Malaysia                             532     4,651     3,775      27,502
Maldives                             635       104        43         308
Mongolia                             225       189       138       2,667
Nepal                                 62       264       392      28,905
Oman                                 618       966       280       2,637
Pakistan                              84     1,263     3,581     167,442
Philippines                          142     2,171     4,559      90,173
Qatar                              1,472     1,815       346       1,396
Republic of Korea                  1,723    33,650    26,496      48,949
Russian Federation                 1,034    56,746    28,648     143,163
Singapore                          2,378     2,184     5,783       4,772
Sri Lanka                            160       686       759      20,474
Thailand                             318     8,236     2,579      68,268
Turkey                             1,034    56,746    11,971     143,163
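The log transformation applied to the variables before estimation can be illustrated on a few rows of Table 3. The following is a minimal Python sketch, not the Gretl workflow used in the paper; the dictionary layout and the names `rows`, `log_row` and `logged` are assumptions made for illustration.

```python
import math

# A few rows copied from Table 3: (thecap, geh, peh, pop).
rows = {
    "Bangladesh": (19, 1003, 1812, 145478),
    "China": (285, 104486, 104705, 1335720),
    "Japan": (2878, 335561, 79834, 127692),
    "Maldives": (635, 104, 43, 308),
}
names = ("l_thecap", "l_geh", "l_peh", "l_pop")


def log_row(values):
    """Return the natural logarithm of each variable, keyed by its log name."""
    return {name: math.log(value) for name, value in zip(names, values)}


logged = {country: log_row(values) for country, values in rows.items()}

for country, values in logged.items():
    print(country, {name: round(v, 4) for name, v in values.items()})
```

Because Table 3 reports rounded values, logs computed this way should only approximate the figures that Gretl produced from the underlying data.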

B. Empirical Procedures

To analyze the hypothesized econometric model, it will go through estimation and inference and will be tested for overall significance. For estimation, a regression analysis is run on the model to examine the statistical dependence of the dependent variable on one or more explanatory variables. For inference, a level of significance of α = 0.05, equivalent to a 95% confidence level, is used to judge the estimates that are generated. This helps determine whether the hypothesized econometric model is significant.

The software Gretl is used to run the multiple regression analysis for estimation and inference. The estimates obtained are expected to have properties such as sufficiency, unbiasedness, consistency and efficiency. To check whether the estimates meet these properties, tests will be conducted to detect multicollinearity, heteroscedasticity and misspecification. If these problems arise, remedies will be applied to correct them.
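The estimation step can be sketched in a few lines of plain Python. This is only an illustration of ordinary least squares through the normal equations, not the routine Gretl uses internally; the helper names `solve` and `ols` and the small data set are hypothetical and are not the paper's data.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(y, regressors):
    """Return OLS coefficients (intercept first) from the normal equations X'X b = X'y."""
    X = [[1.0] + list(row) for row in regressors]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical, noise-free data built from known coefficients (not the paper's data).
true_beta = [2.0, 0.5, 0.3, -0.8]
regressors = [(1.0, 2.0, 3.0), (2.0, 1.0, 5.0), (3.0, 4.0, 2.0),
              (4.0, 3.0, 7.0), (5.0, 6.0, 1.0), (6.0, 5.0, 4.0)]
y = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], row))
     for row in regressors]

beta_hat = ols(y, regressors)
print([round(b, 6) for b in beta_hat])
```

With noise-free data the estimates should reproduce the coefficients used to build `y`, which is a quick way to check that the estimator is wired correctly before applying it to logged data.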


V. Empirical Testing and Interpretation of Results

A. Summary of Data

Table 4: Summary Statistics

Variable        Mean    Median   Minimum   Maximum
l_thecap      5.8192    5.9225    2.9628    7.9674
l_geh         7.4097    7.1415    3.9788    12.724
l_peh         7.4737    7.5023    2.2180    11.559
l_pop         9.8367    9.6586    5.7289    14.105

Variable    Std. Dev.      C.V.   Skewness   Kurtosis
l_thecap      1.2473   0.21434   -0.26752   -0.53123
l_geh         2.3076   0.31144    0.40922   -0.57362
l_peh         2.1386   0.28615   -0.15258   -0.16750
l_pop         2.0281   0.20617    0.19814   -0.53410

Above are the summary statistics of the data used. The summary is computed on the log form of each variable. The table reports the moments of each variable together with related descriptive measures. The first moment is the mean, a measure of central tendency; the median and the minimum and maximum values are reported alongside it. The second moment is the variance, reported here through the standard deviation, which measures how dispersed the data are around the mean. The third moment is skewness, the measure of symmetry. The fourth moment is kurtosis, which measures the tail density or peakedness of the data.
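The measures described above can be computed as in the following Python sketch. It assumes a sample standard deviation (divisor n - 1) and moment-based skewness and excess kurtosis (divisor n); these conventions, the function name `summary` and the sample values are assumptions for illustration and may differ from what Gretl reports.

```python
import math
import statistics


def summary(values):
    """Descriptive statistics of a sample, in the spirit of Table 4."""
    n = len(values)
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    # Central moments with an n divisor (assumed convention).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "minimum": min(values),
        "maximum": max(values),
        "std_dev": std_dev,
        "cv": std_dev / abs(mean),        # coefficient of variation
        "skewness": m3 / m2 ** 1.5,
        "ex_kurtosis": m4 / m2 ** 2 - 3,  # excess kurtosis (normal = 0)
    }


sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values, not the paper's data
stats = summary(sample)
for name, value in stats.items():
    print(name, round(value, 4))
```

A negative excess kurtosis, as reported for all four variables in the table, indicates lighter tails than a normal distribution.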


B. Initial Regression

Table 5: Initial Regression

Variable   Coefficient   Standard Error   t-Ratio   p-value
const       7.74626      0.254278          30.46    1.43e-23 ***
l_geh       0.482429     0.0431294         11.19    4.90e-12 ***
l_peh       0.355901     0.0627896         5.668    3.97e-06 ***
l_pop      -0.829708     0.0447623        -18.54    1.28e-17 ***

Mean dependent var    5.819173    S.D. dependent var    1.247274
Sum squared resid     2.253002    S.E. of regression    0.278729
R-squared             0.954743    Adjusted R-squared    0.950061
F(3, 29)              203.9274    P-value(F)            1.39e-19
Log-likelihood       -2.534944    Akaike criterion      13.06989
Schwarz criterion     19.05592    Hannan-Quinn          15.08400

Log-likelihood for thecap = -194.568

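Given the generated estimates, substituting them into the hypothesized econometric model gives the following sample regression:

$$\ln(\mathit{thecap}_i) = 7.74626 + 0.482429 \ln(\mathit{geh}_i) + 0.355901 \ln(\mathit{peh}_i) - 0.829708 \ln(\mathit{pop}_i) + \hat{u}_i$$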

The results above are examined at the level of significance mentioned in the preceding part. Given the level of significance α = 0.05, an estimate whose p-value is less than 0.05 is statistically significant, and the null hypothesis that its coefficient is 0 is rejected. Conversely, when the p-value of the estimate is greater than 0.05, the estimate is insignificant and there is no strong evidence to reject the null hypothesis.
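The same decision rule can be checked with t-ratios instead of p-values. The sketch below recomputes each t-ratio as coefficient divided by standard error, using the figures in Table 5, and compares it with 2.045, the two-sided 5% critical value of Student's t with 29 degrees of freedom taken from standard tables. The l_pop coefficient is entered as negative, consistent with its reported t-ratio of -18.54; the variable names are assumptions for illustration.

```python
# (coefficient, standard error) as reported in Table 5.
# The l_pop coefficient is taken as negative, matching its t-ratio of -18.54.
estimates = {
    "const": (7.74626, 0.254278),
    "l_geh": (0.482429, 0.0431294),
    "l_peh": (0.355901, 0.0627896),
    "l_pop": (-0.829708, 0.0447623),
}

# Two-sided 5% critical value of Student's t with 33 - 4 = 29 degrees of freedom
# (value from standard statistical tables).
T_CRITICAL = 2.045

t_ratios = {name: coef / se for name, (coef, se) in estimates.items()}
significant = {name: abs(t) > T_CRITICAL for name, t in t_ratios.items()}

for name, t in t_ratios.items():
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name}: t = {t:.3f} ({verdict} at the 5% level)")
```

An absolute t-ratio above the critical value corresponds to a p-value below 0.05, so both routes lead to the same conclusion.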

To interpret the results, the significance of each estimate is discussed first. The intercept of the model has a positive value of 7.74626, which means that when the logarithms of the independent variables are all 0, the logarithm of total health expenditure per capita equals 7.74626. Given that its p-value is less than 0.05, or 5%, it is statistically significant.

The general government expenditure on health (geh) and the private expenditure on health (peh) are significant at the 5% level. Their p-values are 4.90e-12 and 3.97e-06 respectively. The regression also shows that both variables have positive coefficients, which means they have a positive relationship with total expenditure on health per capita. Because the model is in log-log form, a 1% increase in geh is associated with an increase of about 0.48% in total health expenditure per capita, and a 1% increase in peh with an increase of about 0.36%, holding the other variables constant.

Since the p-value of population is less than 0.05, it can be inferred that it is statistically significant, and there is strong evidence against the null hypothesis that its coefficient is 0; the null hypothesis is therefore rejected. Population has a negative coefficient, which implies that as population increases, total health expenditure per capita decreases. Specifically, for a 1% increase in population, total health expenditure per capita decreases by about 0.829708%, holding the other variables constant.
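Since the model is estimated in logarithms, its slope coefficients are read as elasticities, and the percentage effect of a change in a regressor can be worked out directly. The sketch below uses the reported magnitude of the population coefficient with a negative sign, as its t-ratio of -18.54 indicates; the function name `pct_change_in_y` is hypothetical.

```python
import math

ELASTICITY_POP = -0.829708  # l_pop coefficient (negative, per its t-ratio)
ELASTICITY_GEH = 0.482429   # l_geh coefficient


def pct_change_in_y(elasticity, pct_change_in_x):
    """Exact percentage change in y implied by a log-log slope coefficient."""
    growth_factor = math.exp(elasticity * math.log(1 + pct_change_in_x / 100))
    return (growth_factor - 1) * 100


pop_1pct = pct_change_in_y(ELASTICITY_POP, 1)
pop_10pct = pct_change_in_y(ELASTICITY_POP, 10)
geh_1pct = pct_change_in_y(ELASTICITY_GEH, 1)

print(f"1% more population:  {pop_1pct:.3f}% change in thecap")
print(f"10% more population: {pop_10pct:.3f}% change in thecap")
print(f"1% more geh:         {geh_1pct:.3f}% change in thecap")
```

For small changes the exact figure is close to the coefficient itself, which is why the coefficient is usually quoted as the percentage response to a 1% change; for larger changes the two drift apart.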

To measure the overall fit of the chosen model to the given data, the $R^2$ must be analyzed. The $R^2$ is a value that lies between 0 and 1. If it equals 1, the fitted regression line explains 100% of the variation of the dependent variable, and the closer $R^2$ is to 1, the better the fit of the model (Gujarati & Porter, 2009). From the regression analysis, the $R^2$ that was generated was 0.954743. This means that 95.4743% of the variation in the logarithm of total health expenditure per capita is explained by the model. The adjusted $R^2$, on the other hand, is 0.950061.
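Both goodness-of-fit figures can be reproduced from the sums of squares that Gretl reports in the analysis of variance table (regression 47.5292, total 49.7822). The following is a small sketch with assumed variable names; the adjustment uses the usual degrees-of-freedom correction with 33 observations and 3 regressors.

```python
ESS = 47.5292      # regression (explained) sum of squares
TSS = 49.7822      # total sum of squares
N_OBS = 33         # number of countries
N_REGRESSORS = 3   # l_geh, l_peh, l_pop (intercept not counted)

r_squared = ESS / TSS

# Adjusted R^2 penalizes the fit for the number of regressors used.
adj_r_squared = 1 - (1 - r_squared) * (N_OBS - 1) / (N_OBS - N_REGRESSORS - 1)

print(f"R-squared          = {r_squared:.6f}")
print(f"Adjusted R-squared = {adj_r_squared:.6f}")
```

The adjusted value is only slightly lower than the unadjusted one here because three regressors is small relative to 33 observations.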

C. Overall Test of Significance

Because a multiple regression analysis is being done, the null hypothesis is a joint hypothesis. The overall test of significance is used to test this hypothesis. It examines whether the dependent variable is linearly related to the independent variables. Analysis of Variance (ANOVA), also called the F-test, can be used for this. It decomposes the Total Sum of Squares (TSS) into the Explained Sum of Squares (ESS) and the Residual Sum of Squares (RSS).

The null hypothesis for this model is that all the coefficients of the independent variables are 0, while the alternative hypothesis is that not all of these coefficients are 0. The null hypothesis is rejected if the p-value of the F-statistic is less than the level of significance. The ANOVA table generated by Gretl is shown below:

Table 6: Analysis of Variance

              Sum of squares    df    Mean square
Regression           47.5292     3        15.8431
Residual               2.253    29      0.0776897
Total                49.7822    32        1.55569

R^2 = 47.5292 / 49.7822 = 0.954743
F(3, 29) = 15.8431 / 0.0776897 = 203.927 [p-value 1.39e-19]

From the results, the p-value is 1.39e-19, which is less than the level of significance; therefore the model passes the test for overall significance.
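The F-statistic can be rebuilt from the analysis of variance figures as the ratio of the two mean squares. The sketch below also shows the equivalent form based on $R^2$ and compares the result with 2.93, the approximate 5% critical value of F(3, 29) from standard tables; the variable names are assumptions for illustration.

```python
ess, rss = 47.5292, 2.253    # regression and residual sums of squares
df_model, df_resid = 3, 29   # degrees of freedom

# F as the ratio of the regression mean square to the residual mean square.
f_stat = (ess / df_model) / (rss / df_resid)

# Equivalent form written in terms of R^2.
r2 = ess / (ess + rss)
f_from_r2 = (r2 / df_model) / ((1 - r2) / df_resid)

# Approximate 5% critical value of F(3, 29), from standard tables.
F_CRITICAL = 2.93

print(f"F = {f_stat:.3f}, from R^2 = {f_from_r2:.3f}")
print("reject H0" if f_stat > F_CRITICAL else "do not reject H0")
```

The two forms agree because the ratio of $R^2$ to $1 - R^2$ equals the ratio of the explained to the residual sum of squares.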

D. Test for Multicollinearity

Ragnar Frisch coined the term multicollinearity in 1934. It refers to the condition where there is either an exact or a nearly exact linear relationship among the X variables. It violates the classical linear regression model assumption that there should be no multicollinearity among the independent variables. There are two types of multicollinearity. The first is perfect multicollinearity, in which the X'X matrix is singular and the regression cannot be estimated. The second is high but imperfect multicollinearity, in which the variables are highly correlated with each other; this can be dangerous for the model. Under imperfect multicollinearity OLS is still BLUE, but several consequences may arise: a coefficient may be wrongly judged insignificant because of a low t-ratio, confidence intervals will be wide, $R^2$ can be very high even though few t-ratios are significant, and the OLS estimators and their standard errors will be sensitive to small changes in the data (Gujarati & Porter, 2009).

One way to test for multicollinearity is to compute the Variance Inflation Factor (VIF). It reflects the speed with which variances and covariances increase, and it indicates how the presence of multicollinearity inflates the variance of an estimator. The value of the VIF should be less than or equal to 10; a variable with a VIF greater than 10 is considered highly collinear. Possible remedies include doing nothing, transforming the variables into logarithms, removing the culprit variable or using panel data.

The VIF for this model was generated using Gretl and the results are as follows:

Table 7: Variance Inflation Factors

Minimum possible value = 1.0
Values > 10.0 may indicate a collinearity problem

l_geh    4.080
l_peh    7.427
l_pop    3.395

VIF(j) = 1/(1 - R(j)^2), where R(j) is the multiple correlation coefficient between variable j and the other independent variables

Properties of matrix X'X:

1-norm = 8696.0278
Determinant = 8121844.9
Reciprocal condition number = 0.00011235508

All the exogenous variables have a VIF of less than 10, which indicates tolerable multicollinearity. The logarithm of private expenditure on health has the highest VIF, at 7.427, but because it is still below 10 it is not expected to cause any problem.
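The VIF formula can be inverted to see how much of each regressor is explained by the other two. The following sketch backs out the implied auxiliary $R^2$ from the reported VIFs and applies the threshold of 10; the function names are hypothetical and the auxiliary $R^2$ values are derived here, not reported in the Gretl output.

```python
# VIFs as reported by Gretl for the three regressors.
vif = {"l_geh": 4.080, "l_peh": 7.427, "l_pop": 3.395}
VIF_THRESHOLD = 10.0


def vif_from_r2(aux_r2):
    """VIF(j) = 1 / (1 - R(j)^2), with R(j)^2 from regressing x_j on the others."""
    return 1.0 / (1.0 - aux_r2)


def aux_r2_from_vif(value):
    """Invert the VIF formula to recover the implied auxiliary R^2."""
    return 1.0 - 1.0 / value


implied_r2 = {name: aux_r2_from_vif(value) for name, value in vif.items()}
flagged = [name for name, value in vif.items() if value > VIF_THRESHOLD]

for name, r2 in implied_r2.items():
    print(f"{name}: VIF = {vif[name]:.3f}, implied auxiliary R^2 = {r2:.4f}")
print("Highly collinear regressors:", flagged or "none")
```

A VIF of 10 corresponds to an auxiliary $R^2$ of 0.90, so the rule of thumb flags a regressor once the other regressors explain 90% or more of its variation.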

E. Test for Heteroscedasticity

If the classical linear regression model assumption that the disturbances $u_i$ all have the same variance $\sigma^2$ is not satisfied, then there is heteroscedasticity. The unbiasedness and consistency of the OLS estimators are not destroyed, but the estimators no longer have minimum variance, that is, they are no longer efficient; therefore OLS is not BLUE. If heteroscedasticity exists, the variances of the OLS estimators are not given by the usual OLS formulas, so the t and F tests based on them can be misleading and result in faulty conclusions. There are two ways to detect heteroscedasticity: the informal graphical method, and formal tests such as the Park Test, the Glejser Test, Spearman's Rank Correlation Test, the Goldfeld-Quandt Test, the Breusch-Pagan-Godfrey Test and White's General Heteroscedasticity Test (Gujarati & Porter, 2009).

Both the informal and formal methods will be shown with the use of Gretl.

Figure 1: Scattergram of estimated residuals plotted against the variables

Heteroscedasticity can be seen from a graph if the residuals exhibit a pattern. In the scattergrams of Figure 1 there is no systematic pattern, which suggests that the model is not heteroscedastic. However, graphs are too subjective to settle whether the model is truly heteroscedastic. A formal test, White's General Heteroscedasticity Test, was therefore conducted using Gretl, and the results are below:

Table 8: White's Test for Heteroscedasticity

OLS, using observations 1-33
Dependent variable: uhat^2

            coefficient   std. error   t-ratio   p-value
const       0.374249      0.808220     0.4631    0.6477
l_geh       0.0294932     0.182433     0.1617    0.8730
l_peh       0.359006      0.261361     1.374     0.1828
l_pop       0.344832      0.236222     1.460     0.1579
sq_l_geh    0.00935994    0.0147458    0.6348    0.5319
X2_X3       0.0263258     0.0359429    0.7324    0.4713
X2_X4       0.0118747     0.0259169    0.4582    0.6511
sq_l_peh    0.0537565     0.0297534    1.807     0.0839
X3_X4       0.106507      0.0478626    2.225     0.0362
sq_l_pop    0.0564436     0.0234462    2.407     0.0245

Unadjusted R-squared = 0.331738

Test statistic: TR^2 = 10.947346,
with p-value = P(Chi-square(9) > 10.947346) = 0.279335

White's General Heteroscedasticity Test does not depend on the normality assumption and is easy to implement. Its null hypothesis is homoscedasticity and its alternative hypothesis is heteroscedasticity. From the results above, the p-value is 0.279335, which is greater than 0.05; this means that there is not enough evidence to reject the null hypothesis. The model can therefore be treated as homoscedastic, which means that the variances of the disturbances are constant and the OLS assumption is satisfied.
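The test statistic and p-value of White's test can be recomputed from two numbers in the auxiliary regression: the number of observations (33) and its unadjusted R-squared (0.331738). The sketch below is a plain-Python illustration; the helper `chi2_sf` is a hypothetical name and uses the closed-form upper-tail formulas of the chi-square distribution for integer degrees of freedom rather than any Gretl routine.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(Chi-square(df) > x) for a positive integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite correction series.
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = math.erfc(math.sqrt(half))
    for j in range((df - 1) // 2):
        total += term
        term *= x / (2 * j + 3)
    return total


n_obs = 33
aux_r_squared = 0.331738  # unadjusted R-squared of the auxiliary regression
df_white = 9              # regressors in the auxiliary regression, excluding the constant

lm_stat = n_obs * aux_r_squared
p_value = chi2_sf(lm_stat, df_white)

print(f"TR^2 = {lm_stat:.4f}, p-value = {p_value:.4f}")
print("reject homoscedasticity" if p_value < 0.05 else "do not reject homoscedasticity")
```

The recomputed statistic differs from the Gretl figure only in the last decimals, because the R-squared used here is already rounded.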

F. Test for Mis-specification

Model specification error or bias is a violation of the classical linear regression model assumption that the regression model used in the analysis is correctly specified. There are several types of mis-specification errors, but the three most important ones are omitted variable bias, irrelevant variable bias and incorrect functional form (Gujarati & Porter, 2009). Omitted variable bias arises from underfitting a model by excluding a relevant variable. The OLS estimators become biased and inconsistent, which leads to misleading interpretations of the statistical significance of the estimates and of the confidence intervals. Irrelevant variable bias arises from overfitting a model by including an irrelevant variable. In this case the OLS estimators remain unbiased and consistent and the confidence intervals remain valid, but the variances of the estimates are larger than necessary, making the estimates less precise. Incorrect functional form means that the relationship is estimated in the wrong form, for example a lin-lin model where a log-log model is appropriate, so the model must be transformed into the correct form.

To check for specification error or bias, Ramsey's Regression Equation Specification Error Test (RESET) was conducted, and the results are as follows:

Table 9: Ramsey RESET Test

RESET test for specification (squares and cubes)
Test statistic: F = 1.791166,
with p-value = P(F(2, 27) > 1.79117) = 0.186

RESET test for specification (squares only)
Test statistic: F = 3.3501,
with p-value = P(F(1, 28) > 3.3501) = 0.0779

RESET test for specification (cubes only)
Test statistic: F = 3.17414,
with p-value = P(F(1, 28) > 3.17414) = 0.0857

The null hypothesis of this test is that the model is correctly specified, and the alternative hypothesis is that there is misspecification error or bias in the model. All the p-values are greater than 0.05, so there is not enough evidence to reject the null hypothesis. It can therefore be concluded that the model does not show misspecification error or bias.
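The RESET decisions can be cross-checked without statistical software. The sketch below does two things under stated assumptions: it recomputes the p-value of the joint test using the closed form that holds when the numerator degrees of freedom equal 2, and it compares each reported F statistic with approximate 5% critical values taken from standard F tables (3.35 for F(2, 27) and 4.20 for F(1, 28)). The function and variable names are hypothetical.

```python
def f_upper_tail_df1_2(f_stat, df2):
    """P(F(2, df2) > f_stat); this closed form is valid only when df1 = 2."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)


# Joint test on squares and cubes: F(2, 27) = 1.791166.
p_joint = f_upper_tail_df1_2(1.791166, 27)

# Approximate 5% critical values from standard F tables.
F_CRITICAL_5PCT = {(2, 27): 3.35, (1, 28): 4.20}

# Reported RESET statistics with their degrees of freedom.
reported = {
    "squares and cubes": (1.791166, (2, 27)),
    "squares only": (3.3501, (1, 28)),
    "cubes only": (3.17414, (1, 28)),
}

decisions = {
    name: "reject" if f_stat > F_CRITICAL_5PCT[df] else "do not reject"
    for name, (f_stat, df) in reported.items()
}

print(f"p-value of the joint test = {p_joint:.4f}")
for name, decision in decisions.items():
    print(f"{name}: {decision} the null of correct specification")
```

The squares-only and cubes-only statistics are closer to their critical value than the joint statistic, which matches their smaller p-values in the Gretl output.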


VI. Conclusion

This paper aimed to determine what causes total health care expenditure per capita to grow or to decrease. An empirical procedure was carried out to test whether general government expenditure on health, private expenditure on health and population affect total health care expenditure per capita. Based on the findings, it can be deduced that the hypothesized econometric model is valid: all three variables are statistically significant, and the model passed the tests for multicollinearity, heteroscedasticity and misspecification.

For further studies, additional variables such as prevalence of diseases, mortality rates or the percentage of health care services delivered should be added to better explain total health care expenditure per capita.

References:

Global Health Expenditure Database. (n.d.). World Health Organization. Retrieved August 31, 2013, from apps.who.int/nha/database/DataExplorerRegime.aspx

Gujarati, D., & Porter, D. (2009). Basic Econometrics (5th ed.). Singapore: McGraw-Hill.

Health financing for universal coverage. (n.d.). World Health Organization. Retrieved August 18, 2013, from www.who.int/health_financing/documents/cov-report_e_11-deter-he/en/

Per Capita GDP Definition | Investopedia. (n.d.). Investopedia - Educating the world about finance. Retrieved August 18, 2013, from http://www.investopedia.com/terms/p/per-capita-gdp.asp
