Econometrics Paper
De La Salle University
A Regression Analysis on the Factors Affecting Total Health
Expenditure per Capita in Asian Countries
An Individual Report Presented to
The Faculty of Economics Department
In partial fulfillment of the course requirements in
Basic Econometrics
Submitted to: Dr. Cesar Rufino
Submitted by: Maria Pamela A. Ramos
September 6, 2013
INTRODUCTION
    Background of the Study
    Statement of the Problem
    Significance of the Study
    Objectives of the Study
    Scope and Limitation
REVIEW OF RELATED LITERATURE
THEORETICAL FRAMEWORK
OPERATIONAL FRAMEWORK
    Description of Variables
    A-Priori Expectations
    Introduction of Hypothesized Econometric Model
METHODOLOGY
    Presentation of Data
    Empirical Procedures
EMPIRICAL RESULTS AND INTERPRETATION OF RESULTS
    Summary Statistics
    Initial Regression
    Overall Test of Significance
    Test for Multicollinearity
    Test for Heteroscedasticity
    Test for Misspecification
CONCLUSION AND RECOMMENDATION
BIBLIOGRAPHY
I. Introduction
A. Statement of the Problem
Health care is one of the most important services in a country. One way to gauge whether health care is provided adequately in a country is its total health care expenditure per capita. This paper aims to determine what causes total health care expenditure per capita to grow or to decrease.
B. Significance of the Study
In different countries, both government and private hospitals are supposed to provide quality health care services to their people. The mix varies, however, because there are countries where government expenditure on health is greater than private expenditure on health. This study is important because it helps countries know what to do to be able to provide quality health care services.
C. Objectives of the Study
a.) To determine what affects total health care expenditure per capita.
b.) To understand the relationship between government health expenditure, private health expenditure and population on the one hand and total health expenditure per capita on the other.
c.) To give policy recommendations that would help increase total health expenditure per capita.
D. Scope and Limitation
This study uses cross-sectional data to compare the total health expenditure per capita of selected countries in Asia. Data for the year 2008 were used because that year had complete information and values. Only 33 Asian countries were selected because the data for the other countries were still incomplete. The data shown are estimated or rounded values. Total health expenditure per capita is measured at purchasing power parity (NCU per US$), government and private health expenditure are measured in million current US$, and population is in thousands.
II. Review of Related Literature
The World Health Organization, together with other organizations, has conducted studies on the determinants of health expenditure. One of these is the paper by Ke Xu, Priyanka Saksena and Alberto Holly entitled The Determinants of Health Expenditure: A Country-level Panel Data Analysis. Its abstract reads:
The rapid growth of health expenditure has become a great concern for both households and governments. There is extensive literature on the determinants of health expenditure in OECD countries, but the same is not true for developing countries. The aim of this study is to understand the trajectory of health expenditure in developing countries. We use panel data from 143 countries over 14 years, from 1995 to 2008 to study this. We apply both standard fixed effects and dynamic models to explore the factors associated with the growth of total health expenditure as well as its main components namely, government health expenditure and out-of-pocket payments. Our data show great variation across countries in health expenditure as a share of GDP, which ranges from less than 5% to 15%. Apart from income many factors contribute to this variation, ranging from demographic factors to health system characteristics. Our results suggest that health expenditure in general does not grow faster than GDP after taking other factors into consideration. Income elasticity is between 0.75 and 0.95 in the fixed effect model while it is much smaller in the dynamic model. We found no difference in health expenditure between tax-based and insurance based health financing mechanisms. The study also confirms the existence of fungibility, where external aid for health reduces government health spending from domestic sources. However, the decrease is much smaller than a dollar to dollar substitution. The study also finds that government health expenditure and out-of-pocket payments follow different paths and that the pace of health expenditure growth is different for countries at different levels of economic development.
III. Theoretical Framework
The concept of gross domestic product (GDP) per capita is used in this study. Blanchard (2010) defines gross domestic product in three equivalent ways: (1) GDP is the value of the final goods and services produced in the economy during a given period; (2) GDP is the sum of value added in the economy during a given period; and (3) GDP is the sum of incomes in the economy during a given period. GDP is composed of four parts. The first is consumption, which is the acquisition of goods and services by consumers. The second is investment, the sum of nonresidential and residential investment. The third is government expenditure, the total procurement of goods and services by the government. The last component is net exports, the difference between exports and imports.
GDP per capita, according to Investopedia (retrieved August 18, 2013), is the quotient of a country's GDP and its population. A greater GDP per capita indicates growth in the economy and tends to reflect higher productivity. It is useful for comparing the relative performance of one country with another.
In this research, total expenditure on health per capita takes the role that GDP per capita plays in this framework.
IV. Operational Framework
A. Description of Variables
Table 1: Variable List and Description

Regressand or Dependent Variable
    Total expenditure on health per capita at Purchasing Power Parity (NCU per US$)
        The quantitative variable that measures the quotient of a country's total health expenditure, which consists of total government health expenditure and total private expenditure on health, and the country's total population. It is measured at purchasing power parity.

Regressors or Independent Variables
    General Government Expenditure on Health
        Total government expenditure on health. It is measured in million current US$.
    Private Expenditure on Health
        Total outlays for health by households as direct payments, also called out-of-pocket expenditure. It is measured in million current US$.
    Population
        Total de facto resident population, as provided by the United Nations Population Division through the World Health Organization. It is measured in thousands.
B. A-Priori Expectations of Regressors
Table 2: A-Priori Expectations

Endogenous Variable
    thecap: Total expenditure on health per capita at Purchasing Power Parity (NCU per US$)

Exogenous Variables and A-Priori Expectations
    geh: General Government Expenditure on Health
        General government expenditure on health is expected to have a positive relationship with total health expenditure per capita. An increase in general government expenditure on health increases the total expenditure on health of a country. For a given population, a larger total expenditure divided by the same number of people results in a higher total expenditure per capita.
    peh: Private Expenditure on Health
        Out-of-pocket expenditure on health is expected to have a positive effect on total health expenditure per capita. An increase in private expenditure on health increases the total expenditure on health of a country. For a given population, a larger total expenditure divided by the same number of people results in a higher total expenditure per capita.
    pop: Population
        Population is expected to have a negative relationship with total health expenditure per capita. As population increases, total health expenditure per capita decreases, because a given total expenditure is divided among more residents or citizens of the country.
C. Introduction to Hypothesized Econometric Model
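Based on the economic theories discussed in the preceding chapters, the hypothesized econometric model is developed below. The model is specified in log-log form. This puts variables measured in different units on a comparable scale, allows each slope coefficient to be read as an elasticity, and makes the model less sensitive to extreme values in the data.
Model for Estimation: $\ln(thecap_i) = \beta_0 + \beta_1 \ln(geh_i) + \beta_2 \ln(peh_i) + \beta_3 \ln(pop_i) + u_i$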
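A short derivation may make the elasticity reading of a log-log model concrete. The sketch below assumes the log-log specification described in this section, with thecap regressed on geh, peh and pop; the coefficient labels \beta_0 to \beta_3 and the disturbance u_i are assumed notation, not symbols recovered from the source.

```latex
\ln(\mathit{thecap}_i) = \beta_0 + \beta_1 \ln(\mathit{geh}_i)
    + \beta_2 \ln(\mathit{peh}_i) + \beta_3 \ln(\mathit{pop}_i) + u_i

\beta_3 = \frac{\partial \ln(\mathit{thecap})}{\partial \ln(\mathit{pop})}
        = \frac{\partial \mathit{thecap} / \mathit{thecap}}{\partial \mathit{pop} / \mathit{pop}}
        \approx \frac{\%\Delta\, \mathit{thecap}}{\%\Delta\, \mathit{pop}}
```

Under this reading, each slope is the approximate percentage change in total health expenditure per capita for a one percent change in the regressor, holding the other regressors constant. The same argument applies to \beta_1 and \beta_2.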
V. Methodology
A. Data
The data used in this research are from the World Health Organization's (WHO) Global Health Expenditure Database. This database supplies internationally comparable numbers on national health expenditures. WHO updates the data annually from publicly available reports such as national health accounts reports and public expenditure information from the National Statistics Office, the Central Bank, the World Bank, the International Monetary Fund and similar sources.
The data taken for this empirical analysis are values of total health expenditure per capita, general government expenditure on health, out-of-pocket expenditure, maternal mortality rate and population of 33 Asian countries for the year 2008. The data are therefore cross-sectional in nature.
Table 3: Data
(thecap at PPP, NCU per US$; geh and peh in million current US$; pop in thousands)
Country                            thecap        geh        peh         pop
Afghanistan                            30         64        837      29,840
Armenia                               230        196        244       3,079
Azerbaijan                            373        403      1,734       8,944
Bangladesh                             19      1,003      1,812     145,478
Bhutan                                246         57          9         701
Cambodia                              111        105      2,579      13,823
China                                 285    104,486    104,705   1,335,720
Georgia                               440        228        923       4,394
India                                 112     13,383     37,468   1,190,864
Indonesia                             110      5,827      8,682     234,951
Iran (Islamic Republic of)            754      8,840     13,742      72,289
Israel                              1,971      9,582      5,300       7,309
Japan                               2,878    335,561     79,834     127,692
Jordan                                479      1,193        735       5,849
Kazakhstan                            440      3,019      2,145      15,655
Kuwait                              1,052      2,228        619       2,548
Kyrgyzstan                            137        161        151       5,204
Lao People's Democratic Republic       90         53        169       6,022
Lebanon                               886        915      1,312       4,167
Malaysia                              532      4,651      3,775      27,502
Maldives                              635        104         43         308
Mongolia                              225        189        138       2,667
Nepal                                  62        264        392      28,905
Oman                                  618        966        280       2,637
Pakistan                               84      1,263      3,581     167,442
Philippines                           142      2,171      4,559      90,173
Qatar                               1,472      1,815        346       1,396
Republic of Korea                   1,723     33,650     26,496      48,949
Russian Federation                  1,034     56,746     28,648     143,163
Singapore                           2,378      2,184      5,783       4,772
Sri Lanka                             160        686        759      20,474
Thailand                              318      8,236      2,579      68,268
Turkey                              1,034     56,746     11,971     143,163
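The regression in the later sections works with the natural logarithm of each series (the variables named with an l_ prefix). A minimal sketch of that transformation for three rows of Table 3 is given below. The dictionary layout and the helper name `log_row` are illustrative assumptions; the paper itself did this step in Gretl. Because the tabulated values are rounded, the logs computed here need not match the Gretl summary statistics to every decimal.

```python
import math

# Three rows copied from Table 3: (thecap, geh, peh, pop).
# thecap is at PPP, geh and peh are in million current US$, pop is in thousands.
ROWS = {
    "Japan": (2878, 335561, 79834, 127692),
    "Bangladesh": (19, 1003, 1812, 145478),
    "Philippines": (142, 2171, 4559, 90173),
}


def log_row(values):
    """Return the natural log of each value (hypothetical helper)."""
    return tuple(math.log(v) for v in values)


LOGGED = {country: log_row(values) for country, values in ROWS.items()}

for country, (l_thecap, l_geh, l_peh, l_pop) in LOGGED.items():
    print(f"{country:12s} l_thecap={l_thecap:.4f} l_geh={l_geh:.4f} "
          f"l_peh={l_peh:.4f} l_pop={l_pop:.4f}")
```

All four series are strictly positive in Table 3, so the logarithm is defined for every observation.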
B. Empirical Procedures
To analyze the hypothesized econometric model, it undergoes estimation and inference, including a test of overall significance. For estimation, a regression analysis is run on the model to examine the statistical dependence of the dependent variable on one or more other variables, called the explanatory variables. For inference, a level of significance of α = 0.05 (equivalently, a 95% confidence level) is used to judge the values that are generated. This helps in determining whether the hypothesized econometric model is significant.
The software Gretl is used to run the multiple regression analysis for estimation and inference. The estimates acquired are expected to have properties such as sufficiency, unbiasedness, consistency and efficiency. To know whether the estimates meet these properties, tests are conducted to detect multicollinearity, heteroscedasticity and misspecification. If these problems arise, remedies are applied to correct them.
V. Empirical Testing and Interpretation of Results
A. Summary of Data
Table 4: Summary Statistics
Variable         Mean     Median    Minimum    Maximum
l_thecap       5.8192     5.9225     2.9628     7.9674
l_geh          7.4097     7.1415     3.9788     12.724
l_peh          7.4737     7.5023     2.2180     11.559
l_pop          9.8367     9.6586     5.7289     14.105

Variable    Std. Dev.       C.V.   Skewness   Kurtosis
l_thecap       1.2473    0.21434   -0.26752   -0.53123
l_geh          2.3076    0.31144    0.40922   -0.57362
l_peh          2.1386    0.28615   -0.15258   -0.16750
l_pop          2.0281    0.20617    0.19814   -0.53410
Above are the summary statistics of the data used. The summary is computed on the log form of every variable, dependent and independent alike. The table reports measures related to the first four moments of each variable. The mean is the first moment and, together with the median, the minimum and the maximum, describes the central tendency and range of the data. The standard deviation is the square root of the variance, the second central moment, and measures how dispersed the data are around the mean; the coefficient of variation (C.V.) is the standard deviation divided by the mean. Skewness, based on the third moment, measures asymmetry. Kurtosis, based on the fourth moment, measures the heaviness of the tails or the peakedness of the data; since the reported values are negative, they are excess kurtosis, that is, kurtosis measured relative to the value of 3 for the normal distribution.
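The C.V. column can be cross-checked from the other columns, since the coefficient of variation is the standard deviation divided by the mean. The sketch below uses the means and standard deviations reported in Table 4; the function name is an illustrative assumption, and the mean of l_thecap is taken as 5.8192, the "Mean dependent var" of the regression output (5.819173) rounded to four decimals.

```python
# (mean, standard deviation) of each logged variable, as reported in Table 4.
# The l_thecap mean follows the regression output's mean dependent variable.
SUMMARY = {
    "l_thecap": (5.8192, 1.2473),
    "l_geh": (7.4097, 2.3076),
    "l_peh": (7.4737, 2.1386),
    "l_pop": (9.8367, 2.0281),
}


def coefficient_of_variation(mean, std_dev):
    """Standard deviation expressed as a fraction of the mean."""
    return std_dev / mean


CV = {name: coefficient_of_variation(m, s) for name, (m, s) in SUMMARY.items()}

for name, value in CV.items():
    print(f"{name:9s} C.V. = {value:.5f}")
```

The computed ratios agree with the reported C.V. column up to rounding of the inputs.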
B. Initial Regression
Table 5: Initial Regression
Variable    Coefficient   Standard Error   t-Ratio   p-value
const           7.74626         0.254278     30.46   1.43e-23 ***
l_geh          0.482429        0.0431294     11.19   4.90e-12 ***
l_peh          0.355901        0.0627896     5.668   3.97e-06 ***
l_pop         -0.829708        0.0447623    -18.54   1.28e-17 ***

Mean dependent var    5.819173    S.D. dependent var    1.247274
Sum squared resid     2.253002    S.E. of regression    0.278729
R-squared             0.954743    Adjusted R-squared    0.950061
F(3, 29)              203.9274    P-value(F)            1.39e-19
Log-likelihood       -2.534944    Akaike criterion      13.06989
Schwarz criterion     19.05592    Hannan-Quinn          15.08400
Log-likelihood for thecap = -194.568

Substituting the generated estimates into the hypothesized econometric model gives the following sample regression:
$\ln(thecap_i) = 7.74626 + 0.482429 \ln(geh_i) + 0.355901 \ln(peh_i) - 0.829708 \ln(pop_i) + \hat{u}_i$
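As a worked check of the sample regression, the sketch below evaluates the fitted equation at the sample means of the regressors and recomputes the t-ratios as coefficient divided by standard error. The function names are illustrative assumptions. The population coefficient is entered as negative, consistent with its reported t-ratio of -18.54.

```python
# Estimates and standard errors from Table 5.
# The l_pop coefficient is negative, as its t-ratio of -18.54 indicates.
COEF = {"const": 7.74626, "l_geh": 0.482429, "l_peh": 0.355901, "l_pop": -0.829708}
STD_ERR = {"const": 0.254278, "l_geh": 0.0431294, "l_peh": 0.0627896, "l_pop": 0.0447623}


def predict_log_thecap(l_geh, l_peh, l_pop):
    """Fitted value of ln(thecap) for given logged regressors."""
    return (COEF["const"] + COEF["l_geh"] * l_geh
            + COEF["l_peh"] * l_peh + COEF["l_pop"] * l_pop)


def t_ratio(name):
    """t-ratio for the null hypothesis that the coefficient is zero."""
    return COEF[name] / STD_ERR[name]


# Sample means of l_geh, l_peh and l_pop from Table 4.
AT_MEANS = predict_log_thecap(7.4097, 7.4737, 9.8367)

print(f"fitted ln(thecap) at the sample means: {AT_MEANS:.4f}")
for name in COEF:
    print(f"{name:6s} t-ratio = {t_ratio(name):.2f}")
```

An OLS regression with an intercept passes through the point of sample means, so the fitted value at the means should be close to the reported mean of the dependent variable, 5.819173.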
The results above are examined at the level of significance mentioned in the preceding part. Given a level of significance of α = 0.05, if the p-value of an estimate is less than 0.05, the estimate is statistically significant and the null hypothesis that its coefficient is 0 is rejected. If the p-value of the estimate is greater than 0.05, the estimate is statistically insignificant and there is not enough evidence to reject the null hypothesis.
To interpret the results, the significance of each estimate is discussed first. The intercept of the model is 7.74626, which means that when the logarithms of all the independent variables are 0 (that is, when each regressor equals 1 in its own unit), the predicted logarithm of total health expenditure per capita is 7.74626. Since its p-value is less than 0.05, it is statistically significant.
General government expenditure on health (geh) and private expenditure on health (peh) are significant at the 5% level. Their p-values are 4.90e-12 and 3.97e-06 respectively. Both variables have positive coefficients, which means they have a positive relationship with total expenditure on health per capita. Because the model is in log-log form, the coefficients are elasticities: holding the other variables constant, a 1% increase in geh is associated with an increase of about 0.48% in total health expenditure per capita, and a 1% increase in peh with an increase of about 0.36%.
The p-value of population is also less than 0.05, so it is statistically significant and the null hypothesis that its coefficient is 0 is rejected. Population has a negative coefficient (its t-ratio is -18.54), which implies that as population increases, total health expenditure per capita decreases. Holding the other variables constant, a 1% increase in population is associated with a decrease of about 0.83% in total health expenditure per capita.
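Reading a log-log slope as "percent per percent" is an approximation that holds for small changes. The sketch below computes the exact change implied by the estimated population coefficient, taken as -0.829708 in line with its negative t-ratio; the function name is an illustrative assumption.

```python
def pct_change_in_y(elasticity, pct_change_in_x):
    """Exact % change in y implied by a log-log slope for a given % change in x."""
    return ((1 + pct_change_in_x / 100) ** elasticity - 1) * 100


# Estimated coefficient of l_pop (negative, as its t-ratio of -18.54 indicates).
POP_ELASTICITY = -0.829708

EFFECT_1PCT = pct_change_in_y(POP_ELASTICITY, 1)
EFFECT_10PCT = pct_change_in_y(POP_ELASTICITY, 10)

print(f"1% more population:  {EFFECT_1PCT:.3f}% change in thecap")
print(f"10% more population: {EFFECT_10PCT:.3f}% change in thecap")
```

For a 1% change the exact figure is very close to the coefficient itself; for larger changes the two drift apart, which is why the elasticity interpretation is usually stated for small percentage changes.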
To measure the overall fit of the chosen model to the given data, the R^2 must be analyzed. The R^2 is a value that lies between 0 and 1. If it is equal to 1, the fitted regression line explains 100% of the variation in the dependent variable, and the closer R^2 is to 1, the better the fit of the model (Gujarati & Porter, 2009). From the regression analysis, the R^2 that was generated was 0.954743. This means that 95.4743% of the variation in the logarithm of total health expenditure per capita is explained by the model. The adjusted R^2, which penalizes the number of regressors, is 0.950061.
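The adjusted R^2 can be reproduced from the unadjusted value, the sample size and the number of regressors. The sketch below uses the standard degrees-of-freedom correction with n = 33 observations and k = 3 slope coefficients; the function name is an illustrative assumption.

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjusted R^2 with an intercept and n_regressors slope coefficients."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


# R^2 from the regression output, 33 countries, 3 regressors.
ADJ_R2 = adjusted_r_squared(0.954743, 33, 3)

print(f"adjusted R^2 = {ADJ_R2:.6f}")
```

The result agrees with the adjusted R^2 of 0.950061 reported by Gretl.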
C. Overall Test of Significance
Since a multiple regression analysis is being done, the null hypothesis is a joint hypothesis. The overall test of significance is used to test this hypothesis. It examines whether the dependent variable is linearly related to the independent variables taken together. Analysis of Variance (ANOVA), also called the F-test, can be used for this. It decomposes the Total Sum of Squares (TSS) into the Explained Sum of Squares (ESS) and the Residual Sum of Squares (RSS).
The null hypothesis for this model is that all the slope coefficients of the independent variables are 0, while the alternative hypothesis is that not all of these coefficients are 0. The null hypothesis is rejected if the p-value of the F-statistic is less than the level of significance. The ANOVA table generated by Gretl is shown below:
Table 6: Analysis of Variance
              Sum of squares    df    Mean square
Regression           47.5292     3        15.8431
Residual               2.253    29      0.0776897
Total                49.7822    32        1.55569
R^2 = 47.5292 / 49.7822 = 0.954743
F(3, 29) = 15.8431 / 0.0776897 = 203.927 [p-value 1.39e-19]
From the results, the p-value is 1.39e-19, which is less than the level of significance; therefore the null hypothesis is rejected and the model passes the test for overall significance.
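The F-statistic and R^2 follow directly from the sums of squares in the ANOVA table. The sketch below recomputes both from the reported regression (explained), residual and total sums of squares with k = 3 regressors and n = 33 observations; the function name is an illustrative assumption, and the p-value is not recomputed here.

```python
# Sums of squares from the ANOVA table (Table 6).
ESS = 47.5292   # regression (explained) sum of squares
RSS = 2.253     # residual sum of squares
TSS = 49.7822   # total sum of squares


def f_statistic(ess, rss, n_regressors, n_obs):
    """F = (ESS / k) / (RSS / (n - k - 1)) for a model with an intercept."""
    mean_sq_regression = ess / n_regressors
    mean_sq_residual = rss / (n_obs - n_regressors - 1)
    return mean_sq_regression / mean_sq_residual


F_STAT = f_statistic(ESS, RSS, 3, 33)
R2 = ESS / TSS

print(f"F(3, 29) = {F_STAT:.3f}")
print(f"R^2      = {R2:.6f}")
```

The residual mean square is 2.253 / 29, about 0.0777, which is the denominator used in the F ratio.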
D. Test for Multicollinearity
Ragnar Frisch coined the term multicollinearity in 1934. It indicates the condition where there is either an exact or a nearly exact linear relationship among the X variables. It violates the classical linear regression model assumption that there should be no multicollinearity among the independent variables. There are two types of multicollinearity. The first is perfect multicollinearity, in which the X'X matrix is singular and the regression cannot be estimated. The second is imperfect but high multicollinearity, in which the variables are highly, though not perfectly, correlated with each other; this is the case that is dangerous for the model. Under imperfect multicollinearity OLS is still BLUE, but several consequences may arise: a coefficient may wrongly appear insignificant because of a small t-ratio, confidence intervals will be wide, R^2 may be very high even though few t-ratios are significant, and the OLS estimators and their standard errors will be sensitive to small changes in the data (Gujarati & Porter, 2009).
One way to test for multicollinearity is to compute the Variance Inflation Factor (VIF). It reflects the speed with which variances and covariances increase, and it indicates how the presence of multicollinearity inflates the variance of an estimator. As a rule of thumb, the VIF should be less than or equal to 10; a variable with a VIF greater than 10 is considered highly collinear. Possible remedies are to do nothing, to transform the variables (for example, into logarithms), to remove the culprit variable or to use panel data.
The VIF for this model was generated using Gretl and the results are as follows:
Table 7: Variance Inflation Factors
Minimum possible value = 1.0
Values > 10.0 may indicate a collinearity problem
    l_geh    4.080
    l_peh    7.427
    l_pop    3.395
VIF(j) = 1/(1 - R(j)^2), where R(j) is the multiple correlation coefficient between variable j and the other independent variables
Properties of matrix X'X:
    1-norm = 8696.0278
    Determinant = 8121844.9
    Reciprocal condition number = 0.00011235508
All the exogenous variables have a VIF of less than 10, which indicates tolerable multicollinearity. The logarithm of private expenditure on health has the highest VIF, at 7.427, but it is still below the threshold and is not expected to cause a problem.
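The VIF formula can be inverted to see how strongly each regressor is explained by the others. The sketch below applies VIF(j) = 1/(1 - R(j)^2) and its inverse to the three reported values; the function names and the constant `VIF_THRESHOLD` are illustrative assumptions, with the threshold set to the rule-of-thumb value of 10 used in the text.

```python
VIF_THRESHOLD = 10.0  # rule-of-thumb cut-off used in the text

# VIFs reported by Gretl in Table 7.
REPORTED_VIF = {"l_geh": 4.080, "l_peh": 7.427, "l_pop": 3.395}


def vif(r_squared_j):
    """VIF of regressor j given the R^2 of regressing it on the other regressors."""
    return 1.0 / (1.0 - r_squared_j)


def implied_r_squared(vif_j):
    """Invert the VIF formula to recover the auxiliary R^2."""
    return 1.0 - 1.0 / vif_j


IMPLIED_R2 = {name: implied_r_squared(v) for name, v in REPORTED_VIF.items()}

for name, r2 in IMPLIED_R2.items():
    flag = "problem" if REPORTED_VIF[name] > VIF_THRESHOLD else "tolerable"
    print(f"{name:6s} VIF={REPORTED_VIF[name]:.3f} auxiliary R^2={r2:.4f} ({flag})")
```

A VIF of 7.427 for l_peh corresponds to an auxiliary R^2 of roughly 0.87, meaning most of its variation is shared with the other two regressors, although it remains under the threshold.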
E. Test for Heteroscedasticity
If the classical linear regression model assumption that the disturbances u_i all have the same variance σ^2 is not satisfied, then there is heteroscedasticity. The unbiasedness and consistency of the OLS estimators are not destroyed, but the estimators no longer have minimum variance, that is, they are no longer efficient; therefore OLS is not BLUE. If heteroscedasticity exists, the variances of the OLS estimators are not given by the usual OLS formulas, so the t and F tests based on them can be misleading and result in faulty conclusions. There are two ways to detect heteroscedasticity: the informal one, which is the graphical method, and the formal one, which consists of tests such as the Park Test, the Glejser Test, Spearman's Rank Correlation Test, the Goldfeld-Quandt Test, the Breusch-Pagan-Godfrey Test and White's General Heteroscedasticity Test (Gujarati & Porter, 2009).
Both the informal and formal methods will be shown with the use of Gretl.
Figure 1: Scattergram of estimated residuals plotted against the variables
Heteroscedasticity can be seen from a graph if the residuals show a systematic pattern. In the graphs above there is no systematic pattern, which suggests that the model is not heteroscedastic. However, graphs are too subjective to settle whether the model is truly heteroscedastic. So the formal test, White's General Heteroscedasticity Test, was conducted using Gretl and the results are below:
Table 8: White's Test for Heteroscedasticity
OLS, using observations 1-33
Dependent variable: uhat^2
            coefficient    std. error    t-ratio    p-value
const       0.374249       0.808220      0.4631     0.6477
l_geh       0.0294932      0.182433      0.1617     0.8730
l_peh       0.359006       0.261361      1.374      0.1828
l_pop       0.344832       0.236222      1.460      0.1579
sq_l_geh    0.00935994     0.0147458     0.6348     0.5319
X2_X3       0.0263258      0.0359429     0.7324     0.4713
X2_X4       0.0118747      0.0259169     0.4582     0.6511
sq_l_peh    0.0537565      0.0297534     1.807      0.0839
X3_X4       0.106507       0.0478626     2.225      0.0362
sq_l_pop    0.0564436      0.0234462     2.407      0.0245
Unadjusted R-squared = 0.331738
Test statistic: TR^2 = 10.947346,
with p-value = P(Chi-square(9) > 10.947346) = 0.279335
White's General Heteroscedasticity Test does not depend on the normality assumption and is easy to implement. Its null hypothesis is homoscedasticity and its alternative hypothesis is heteroscedasticity. From the results above, the p-value is 0.279335, which is greater than 0.05, so there is not enough evidence to reject the null hypothesis. The model is therefore treated as homoscedastic: the variance of the disturbances is taken to be constant, in line with the OLS assumption.
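The test statistic of White's test is the number of observations times the R^2 of the auxiliary regression, compared with a chi-square distribution whose degrees of freedom equal the number of auxiliary regressors (9 here). The sketch below recomputes the statistic and its p-value using only the standard library. The helper `chi2_sf_odd_df` is a hypothetical function that relies on the finite-series form of the chi-square upper tail, which holds for odd degrees of freedom only; it is not how Gretl computes the value.

```python
import math


def white_lm_statistic(n_obs, aux_r_squared):
    """LM statistic of White's test: n times the auxiliary-regression R^2."""
    return n_obs * aux_r_squared


def chi2_sf_odd_df(x, df):
    """P(Chi-square(df) > x) for odd df, via the finite series for odd df."""
    if df % 2 != 1:
        raise ValueError("this closed form only covers odd degrees of freedom")
    chi = math.sqrt(x)
    term, total = chi, 0.0
    # Accumulate chi + chi^3/3 + chi^5/(3*5) + ... with (df - 1) / 2 terms.
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    normal_tail = math.erfc(chi / math.sqrt(2))  # two-sided normal tail
    return normal_tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


# 33 observations and the unadjusted R^2 of the auxiliary regression.
LM = white_lm_statistic(33, 0.331738)
# p-value at the statistic reported by Gretl, with 9 degrees of freedom.
P_VALUE = chi2_sf_odd_df(10.947346, 9)

print(f"TR^2 = {LM:.4f}")
print(f"p-value = {P_VALUE:.4f}")
```

The small difference between 33 x 0.331738 and the reported 10.947346 comes from the R^2 being rounded to six decimals.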
F. Test for Mis-specification
Model specification error or bias is a violation of the classical linear regression model assumption that the regression model used in the analysis is correctly specified. There are several types of misspecification error, but the three most important ones are omitted variable bias, irrelevant variable bias and incorrect functional form (Gujarati & Porter, 2009). Omitted variable bias arises from underfitting a model by excluding a relevant variable. The OLS estimators become biased and inconsistent, which results in misleading and questionable interpretations of the statistical significance of the estimates and of the confidence intervals. Overfitting a model by including an irrelevant variable causes irrelevant variable bias. In that case the OLS estimators remain unbiased and consistent and the confidence intervals remain valid, but the variances of the estimates are larger than necessary, so the estimates are less precise and OLS is no longer efficient. Incorrect functional form, on the other hand, means that the relationship is modelled in the wrong form, for example as lin-lin when a log-log form is appropriate.
To see if there is any specification error or bias, the Ramsey Regression Specification Error Test (RESET) is conducted and the results are as follows:
Table 9: Ramsey RESET Test
RESET test for specification (squares and cubes)
Test statistic: F = 1.791166,
with p-value = P(F(2, 27) > 1.79117) = 0.186
RESET test for specification (squares only)
Test statistic: F = 3.350097,
with p-value = P(F(1, 28) > 3.3501) = 0.0779
RESET test for specification (cubes only)
Test statistic: F = 3.174143,
with p-value = P(F(1, 28) > 3.17414) = 0.0857
The null hypothesis is that the model is correctly specified and the alternative hypothesis is that there is misspecification error or bias in the model. All the p-values are greater than 0.05, so there is not enough evidence to reject the null hypothesis at the 5% level. It can therefore be concluded that the tests do not detect misspecification error or bias in the model.
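The p-value of the squares-and-cubes RESET variant can be checked by hand, because the upper tail of an F distribution with 2 numerator degrees of freedom has a simple closed form. The sketch below assumes the statistic F = 1.791166 with (2, 27) degrees of freedom; the function name is an illustrative assumption, and the shortcut does not apply to the two F(1, 28) variants.

```python
def f_sf_numerator_df2(f_stat, denom_df):
    """P(F(2, denom_df) > f_stat), using the closed form for 2 numerator df."""
    return (1 + 2 * f_stat / denom_df) ** (-denom_df / 2)


ALPHA = 0.05

# RESET with squares and cubes: F(2, 27) = 1.791166.
P_VALUE = f_sf_numerator_df2(1.791166, 27)

print(f"p-value = {P_VALUE:.4f}")
print("reject H0" if P_VALUE < ALPHA else "do not reject H0")
```

The result matches the reported p-value of 0.186 to three decimals, so the null hypothesis of correct specification is not rejected at the 5% level.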
VI. Conclusion
This paper aimed to determine what causes total health care expenditure per capita to grow or to decrease. An empirical procedure was carried out to test whether general government expenditure on health, private expenditure on health and population affect total health care expenditure per capita. Based on the findings, all three regressors are statistically significant and the model passed the tests for multicollinearity, heteroscedasticity and misspecification, so it can be deduced that the hypothesized econometric model is valid.
For further studies, additional variables such as the prevalence of diseases, mortality rates or the percentage of health care services delivered should be added to better explain total health care expenditure per capita.
References:
Global Health Expenditure Database. (n.d.). World Health Organization. Retrieved August 31, 2013, from apps.who.int/nha/database/DataExplorerRegime.aspx
Gujarati, D., & Porter, D. (2009). Basic Econometrics (5th ed.). Singapore: McGraw-Hill.
Health financing for universal coverage. (n.d.). World Health Organization. Retrieved August 18, 2013, from www.who.int/health_financing/documents/cov-report_e_11-deter-he/en/
Per Capita GDP Definition | Investopedia. (n.d.). Investopedia - Educating the world about finance. Retrieved August 18, 2013, from http://www.investopedia.com/terms/p/per-capita-gdp.asp