Health and the Environment : A Case Study of Indonesian Children Claire Harper and Craig Harper...


Page 1: Health and the Environment : A Case Study of Indonesian Children Claire Harper and Craig Harper Sta242/Env255 April 19, 2001.

Introduction

Policy makers have long been interested in the relationship between the environment and human health. In some cases, such as water contamination, this relationship has been well studied. For other environmental factors, such as forest cover, health effects are not as easily quantifiable (we do not know an LD-50 level for forest cover). The purpose of this project was to determine what effects, if any, various environmental factors have on the health of Indonesian children. Do forest cover, water area, rainfall, and erosion affect the number of illnesses in children, after accounting for family, housing, and village characteristics?

Page 2:

Data Collection Methods

• Data were collected by Professor Subhrendu Pattanayak during his doctoral research in Indonesia.

• Observations were obtained through surveys of randomly selected households in several villages.

• Villages were selected to represent a variety of environmental characteristics.

• Data were only collected from households with a total family size less than eight.

• A GIS was used to measure village area, forest cover, and water area. Erosion and sedimentation rates were also derived using a GIS.

Page 3:

Variables Selected for Analysis

Dependent

• Annual number of illnesses per child

Independent

• Adult education (years)
• Total family size
• Annual number of illnesses per adult
• Size of farm (hectares)
• Condition of the floor: 1=stilts, 2=dirt, 3=cement
• Condition of the roof: 1=straw, 2=wood, 3=zinc
• Income from non-farm sources: 0=none, 1=greater than zero
• Annual expenditures per family member (rupiahs)
• Village density (people/hectare)
• Primary forest cover (hectares)
• Secondary forest cover (hectares)
• Annual rainfall in watershed (mm)
• Water area (hectares)
• Annual erosion and sedimentation rate in watershed (tons)

Page 4:

Summary Statistics

• Focused on subset of data with > 0 illnesses per child

Numeric Variables (n=357)

Variable                                   Min       Mean        Median      Max          STD
Illness per child                          0.25      1.08        1           4            0.53
Family size                                1.9       4.23        3.9         7.6          1.42
Adult education (years)                    2         6.42        5           21           3.5
Expenditures per family member (rupiahs)   4666.67   221391.43   161538.46   2101694.92   224346.31
Village density (people/hectare)           0.12      1.67        0.92        11.35        2.06
Primary forest (ha)                        0.22      23.64       16.46       156.54       24.02
Secondary forest (ha)                      0.05      99.14       83.13       526.96       90.17
Water area (ha)                            1.46      322.24      261.9       2030.44      316.71
Rain (mm)                                  6.38      972         678.9       6470         897.1
Erosion/sedimentation (tons)               0.01      1.86        0.94        18.55        2.59

Indicator Variables (Number of Observations per Code)

Code   Floor   Roof   Non-farm Income
1      110     44     151
2      182     29     206
3      62      279    -

Page 5:

Exploratory Analysis

• Log transformed all numeric variables except family size.

[Figure: histograms of selected variables before and after log transformation]
• Annual illnesses per child (x-axis: number of illnesses, 0.25 to 4.00) and Log(Annual illnesses per child) (x-axis: -1.39 to 1.39)
• Secondary Forest (x-axis: number of hectares, 0.05 to 526.96) and Log(Secondary Forest) (x-axis: log(number of hectares), -2.99 to 6.27)
• Rainfall (x-axis: mm of rain, 6.38 to 6469.89) and Log(Rainfall) (x-axis: log(mm of rain), 1.85 to 8.77)

Page 6:

Model Selection

• We considered plausible interaction effects and quadratic terms.

• Subset model selection (leaps) was used to determine which interaction effects were significant with all main effects included.

• Then, we used manual stepwise selection (backward) to analyze the significance of the main effects.
  – Farm size (p=0.4749), erosion rate (p=0.3694), and floor condition (p=0.5069) were not significant and were not involved in a potentially important interaction effect.

Model (all main effects, plus:)              BIC       Posterior Probability
Roof*Log(Rain)                               -637.66   0.1583
Log(Village Density)^2                       -637.01   0.1143
Log(Adult Education)*Log(Expenditures)       -636.13   0.0736
Log(Adult Education)*Family size             -635.92   0.0662
Roof*Log(Rain) + Log(Village Density)^2      -635.75   0.0608
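The posterior probabilities in this table are consistent with the common BIC approximation, in which a model's posterior probability is proportional to exp(-BIC/2) when all models have equal prior probability. The slides do not state that this is how the values were computed, so treat that as an assumption. Because the normalizing sum runs over every model considered (most of which are not listed), the sketch below only compares ratios relative to the top model.

```python
import math

# (BIC, reported posterior probability) for the five models listed on this slide.
models = {
    "Roof*Log(Rain)": (-637.66, 0.1583),
    "Log(Village Density)^2": (-637.01, 0.1143),
    "Log(Adult Education)*Log(Expenditures)": (-636.13, 0.0736),
    "Log(Adult Education)*Family size": (-635.92, 0.0662),
    "Roof*Log(Rain) + Log(Village Density)^2": (-635.75, 0.0608),
}


def relative_weight(bic, best_bic):
    """Approximate posterior odds against the best model: exp(-(BIC - BIC_best) / 2)."""
    return math.exp(-(bic - best_bic) / 2.0)


best_bic = min(bic for bic, _ in models.values())
best_prob = max(prob for _, prob in models.values())

# For each model: (ratio implied by BIC, ratio of the reported probabilities).
ratios = {}
for name, (bic, prob) in models.items():
    ratios[name] = (relative_weight(bic, best_bic), prob / best_prob)
    print(f"{name}: from BIC {ratios[name][0]:.3f}, reported {ratios[name][1]:.3f}")
```

The two ratios agree closely for every listed model, which is what one would expect if the probabilities were obtained this way and the BIC values were rounded to two decimals.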

Page 7:

Model Selection

• Subset model selection (leaps) was also used to determine which interaction effects were significant with the remaining main effects (minus farm size, erosion, and floor condition).

• Analyzed the importance of each interaction effect using manual stepwise regression.

• Tests for influential observations using Cook’s distance indicated that there were no influential points. All of the observations displayed Cook’s distance < 0.08.

• Finally, we considered additional transformations. Transforming family size (the final untransformed variable) did not significantly improve our model.

Model (significant main effects, plus:)      BIC       Posterior Probability
Roof*Log(Rain)                               -656.93   0.1422
Log(Village Density)^2                       -656.80   0.1330
Roof*Log(Rain) + Log(Village Density)^2      -656.06   0.0918
Log(Adult Education)*Log(Adult Illness)      -654.95   0.0526
Roof*Log(Adult Illness)                      -654.85   0.0501
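The Cook's distance check mentioned on this slide can be illustrated with a minimal sketch. It uses a simple one-predictor regression and made-up data (not the study's data or model), so that the leverage has a closed form and no matrix library is needed. The function name and the data are hypothetical.

```python
def cooks_distances(x, y):
    """Cook's distance for each point in a simple linear regression y = a + b*x."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate
    # Leverage of each point in a one-predictor model.
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * s2)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical data: a near-linear trend plus one discordant, high-leverage last point.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 12.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 4.0]

d = cooks_distances(x, y)
most_influential = max(range(len(d)), key=d.__getitem__)
print(most_influential, [round(v, 3) for v in d])
```

In this toy example the last point has a Cook's distance well above 1, a conventional warning level. By contrast, every observation in the study had a distance below 0.08.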

Page 8:

Fitted Model

• If the variable is highlighted in red, then an increase in this variable is associated with an increase in the annual number of illnesses per child.

• If the variable is highlighted in blue, then an increase in this variable is associated with a decrease in the annual number of illnesses per child.

• R-squared = 0.2943
• R-squared adjusted = 0.2688

• This model is homoskedastic, but has a heavy-tailed distribution (QQ Normal plot not shown).

Intercept: 3.02 (0.90)

Beta   Variable                Value (SE)     P-Value
1      Family Size             -0.04 (0.02)   0.0511
2      Log(Adult Illness)       0.35 (0.05)   0.0000
3      Roof                    -0.45 (0.21)   0.0347
4      Nonfarm Income           0.05 (0.02)   0.0113
5      Log(Expenditures)       -0.04 (0.02)   0.0877
6      Log(Village Density)    -0.06 (0.03)   0.0240
7      Log(Adult Education)     0.12 (0.06)   0.0410
8      Log(Rain)               -0.39 (0.13)   0.0028
9      Log(Primary Forest)      0.13 (0.06)   0.0210
10     Log(Secondary Forest)    0.10 (0.03)   0.0017
11     Log(Water Area)         -0.15 (0.06)   0.0094
12     Log(Rain) * Roof         0.07 (0.03)   0.0322

Page 9:

Question 1: How do the environmental factors considered affect child health as a group?

Hypotheses:
• H0: The model without env. factors is adequate.
• HA: The full model (with env. factors) is a significant improvement.

Statistical Technique:
• We used an Extra Sum of Squares F-test to test the joint significance of the environmental factors: log(rainfall), log(primary forest), log(secondary forest), log(water area), and log(rain)*roof.

Results:
• The addition of environmental factors significantly improved our model (F=3.24, p=0.01, ESS F-test).

Conclusions and Limitations:
• There was sufficient evidence to reject the null hypothesis and conclude that inclusion of environmental factors does significantly improve our understanding of the annual number of illnesses in Indonesian children.

• The residuals have a heavy-tailed distribution, even after transformations, and this may have undue influence on the results. Likewise, there may be lurking variables (e.g., watershed area) which were not accounted for in the data.
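As a rough illustration of the Extra Sum of Squares F-test used for Question 1, the sketch below computes the F statistic from the residual sums of squares of a reduced and a full model. The slides do not report the sums of squares or the degrees of freedom, so the inputs here are hypothetical numbers chosen only to show the arithmetic. They do not reproduce F=3.24.

```python
def extra_ss_f(sse_reduced, sse_full, df_reduced, df_full):
    """Extra-sum-of-squares F statistic.

    F = [(SSE_reduced - SSE_full) / (df_reduced - df_full)] / (SSE_full / df_full),
    where df_* are the residual degrees of freedom of each model.
    """
    extra_terms = df_reduced - df_full  # number of coefficients being tested
    numerator = (sse_reduced - sse_full) / extra_terms
    denominator = sse_full / df_full
    return numerator / denominator


# Hypothetical values (NOT from the study): five extra terms in the full model.
f_stat = extra_ss_f(sse_reduced=110.0, sse_full=100.0, df_reduced=205, df_full=200)
print(f_stat)
```

The p-value is the upper-tail area of an F distribution with (df_reduced - df_full, df_full) degrees of freedom. With SciPy installed, `scipy.stats.f.sf(f_stat, 5, 200)` would give it for these example inputs.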

Page 10:

Question 2: How does forest cover affect child health?

Hypotheses:
• H0: Primary forest cover is not a significant explanatory variable (β9 = 0).
• HA: Primary forest cover is significant (β9 ≠ 0).

and

• H0: Secondary forest cover is not a significant explanatory variable (β10 = 0).
• HA: Secondary forest cover is significant (β10 ≠ 0).

Statistical Technique:
• Two-sided t-tests were used to test the significance of each coefficient with α = 0.05, after accounting for the other variables in the model.
• We calculated 95% family-wise confidence intervals using Bonferroni techniques in order to simultaneously estimate the coefficients associated with environmental factors.
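A Bonferroni family-wise interval widens each individual interval by splitting α across the k coefficients in the family. The sketch below makes two assumptions that are not stated in the slides. First, the family is taken to be the five environmental terms tested in Question 1 (k = 5). Second, a standard normal quantile stands in for the t quantile, which is a close approximation at several hundred residual degrees of freedom. The inputs are the rounded estimate and standard error for Log(Secondary Forest) from the fitted-model table, so the result only approximates the interval the authors report.

```python
from statistics import NormalDist


def bonferroni_interval(estimate, std_error, family_size, alpha=0.05):
    """Family-wise interval: estimate +/- z(1 - alpha / (2k)) * SE (normal approximation)."""
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * family_size))
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# Log(Secondary Forest): 0.10 (0.03) as rounded in the fitted-model table; k = 5 is assumed.
low, high = bonferroni_interval(0.10, 0.03, family_size=5)
print(round(low, 4), round(high, 4))
```

With k = 5 the multiplier is about 2.58 instead of the usual 1.96, which is why a coefficient can pass an individual t-test and still have a family-wise interval that includes zero.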

Page 11:

Question 2: Results and Conclusions

Results:
• According to the t-tests, there is sufficient evidence to reject the null hypotheses and conclude that both primary and secondary forest are significant explanatory variables in this model (primary: p=0.0210; secondary: p=0.0017).

• However, under the more conservative approach of family-wise confidence intervals, primary forest does not appear to be significant (95% CI: -0.01551, 0.2819; includes zero).

• Secondary forest does appear to be significant, even with the conservative family-wise confidence interval (95% CI: 0.01824, 0.1846).

Conclusions:
• An increase in secondary forest cover is associated with an increase in the median annual number of illnesses per child in Indonesia. Doubling the amount of secondary forest cover is associated with a 7% (95% CI: 1%, 14%) increase in the median number of illnesses per child per year.

• We fail to reject the null hypothesis for primary forest cover under the Bonferroni confidence interval. However, our results are highly suggestive of an association between primary forest cover and child illness.
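The "doubling" statements follow from the log-log form of the model. If log(illnesses) changes by β for each unit change in log(forest), then doubling forest cover multiplies the median number of illnesses by 2^β, as long as both logs use the same base. The sketch below applies that to the reported Bonferroni interval endpoints for secondary forest. The function name is illustrative.

```python
def percent_change_for_doubling(beta):
    """Percent change in the median response when a log-transformed predictor doubles."""
    return (2.0 ** beta - 1.0) * 100.0


# Family-wise 95% CI endpoints for Log(Secondary Forest) reported on this slide.
low = percent_change_for_doubling(0.01824)
high = percent_change_for_doubling(0.1846)
print(f"{low:.1f}% to {high:.1f}%")
```

Rounded to whole percentages, these endpoints match the reported interval of 1% to 14%.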

Page 12:

Question 3: How does the amount of rainfall affect child health?

Hypotheses:
• H0: Rainfall is not a significant explanatory variable (β8 = 0).
• HA: Rainfall is significant (β8 ≠ 0).

and

• H0: The interaction between rainfall and roof type is not significant (β12 = 0).
• HA: The interaction effect is significant (β12 ≠ 0).

Statistical Technique:
• Two-sided t-tests with α = 0.05.

• 95% family-wise confidence interval (Bonferroni) for family of environmental variables.

• Set up dummy variables for roof to assess the degree of interaction between rainfall and each roof type.
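One way to set up the roof dummy variables is sketched below, with roof 3 (zinc) as the reference level so that the roof 1 and roof 2 coefficients are comparisons against roof 3, as on the results slide. The column names and helper functions are hypothetical and are not taken from the authors' analysis.

```python
ROOF_LABELS = {1: "straw", 2: "wood", 3: "zinc"}


def roof_dummies(roof_code, reference=3):
    """Indicator columns for every roof code except the reference level."""
    return {
        f"roof_{code}": int(roof_code == code)
        for code in sorted(ROOF_LABELS)
        if code != reference
    }


def design_row(roof_code, log_rain):
    """Main effects plus rain-by-roof interaction columns for one household."""
    row = {"log_rain": log_rain}
    for name, value in roof_dummies(roof_code).items():
        row[name] = value
        row[f"log_rain:{name}"] = value * log_rain  # interaction term
    return row


print(design_row(2, 6.5))  # a household with a wood roof
print(design_row(3, 6.5))  # reference level: all dummy and interaction columns are zero
```

Coding roof this way lets each roof type have its own intercept shift and its own rainfall slope, instead of forcing the effect to change by equal steps from code 1 to code 3.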

Page 13:

Question 3: Results

Results:
• According to the t-tests, there is sufficient evidence to reject the null hypotheses and conclude that both rainfall and the interaction between rainfall and roof type are significant (p=0.0028; p=0.0322, respectively).

• Under the more conservative approach of family-wise confidence intervals, the interaction effect does not appear to be significant (95% CI: -0.0140, 0.1518; includes zero). On the other hand, the rainfall variable is significant (95% CI: -0.7326, -0.0548).

• T-tests analyzing the significance of the dummy variables for both the roof variable and the interaction between roof and rain indicate that roofs 1 and 2 do not significantly differ from roof 3. Therefore, a reduced model without the dummy variables is a more appropriate model. The coefficients, t-values and p-values for the dummy variables are as follows:

Variable             Coefficient   t-value   p-value
Roof 1                0.175         0.3265   0.7442
Roof 2               -0.34         -1.807    0.0717
Log(Rain)*Roof 1     -0.0267       -0.342    0.7325
Log(Rain)*Roof 2      0.0514        1.8694   0.0625

Page 14:

Question 3: Conclusions

Conclusions:
• An increase in rainfall is associated with a decrease in the median number of illnesses per child per year. In fact, a doubling of annual rainfall is associated with a 23.9% (95% CI: 4%, 40%) decrease in the median number of illnesses per child per year. This is the strongest multiplicative change in median number of illnesses among the environmental factors.

• The negative results of the dummy variable tests are somewhat surprising. It was assumed by those collecting the data that the quality of roof increased from roof 1 (straw) to roof 3 (zinc). Interestingly, the coefficients of the dummy variables suggest that roof 2 (wood) is actually associated with a lower rate of child illness than roof 3. Unfortunately, the lack of significance of the coefficients prevents us from definitively answering this question.

• On the other hand, the fact that the family-wise (Bonferroni) confidence interval indicated that the interaction effect was not significant makes the lack of significance among the dummy variables less surprising.

Page 15:

General Conclusions

• Modeling Indonesian children’s health is an extremely complicated prospect. With all of the variables we have included, our model explains just 29% of the variation in annual number of illnesses (or 27% with adjusted R-squared).

• Our analysis indicates that environmental factors are important when attempting to explain child health, but the predictive power of such explanations is very low.

• Despite a lack of predictive power, however, the model does exhibit several interesting associations. For example, we expected the coefficient associated with water area to be positive because of a suspected increased number of insects; that coefficient turned out to be negative and significantly less than 0.

• We began this project in hopes of finding a human health argument for conservation of primary forest. On the contrary, the positive coefficient of the primary forest variable (significant in the individual t-test, p=0.0210, though not under the Bonferroni interval) creates a disincentive for conservation. Before promoting deforestation for health reasons, however, we must again consider the uncertainty inherent in the model.

Page 16:

Recommendations and Further Research

• The observational nature of the data prevents any inference of cause and effect relationships. Thus, we may only discuss associations between variables.

• We were highly suspicious of observations claiming to have had no illnesses among children for a year and focused only on families with counted illnesses. Future surveys identifying the type of illness in question would be helpful in building a more descriptive model.

• Future studies should consider focusing analyses on a specific type of illness to increase the predictive power of the model.

• Few policy recommendations can be drawn from this particular model. More research is needed into the environmental factors affecting the amount of disease among children. Increasing the predictive power of the model will be key to increasing the utility of the model as a policy tool.

Acknowledgments: We would like to thank Professor Subhrendu Pattanayak for supplying the data.

Page 17:

How does the rate of erosion and sedimentation in the watershed affect child health?

• The erosion/sedimentation variable was not significant in any of the models considered.

• It was also not significant in the final model (F=0.3142, p=0.5755, ESS F-test).

• Across the top 20 models, the total posterior probability of models that include the erosion and sedimentation rate is only 2%.

• Limitations: The erosion/sedimentation variable is correlated with rain (correlation coefficient = -0.75). This collinearity inflates the standard error of its coefficient, giving a lower t-statistic and a wider confidence interval, which reduces our ability to assess the significance of this variable.
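The effect of the -0.75 correlation on precision can be gauged with a variance inflation factor. The sketch below uses the two-predictor formula VIF = 1 / (1 - r²), which accounts only for the pairwise correlation with rain. That is a simplifying assumption: the study's model has many more predictors, so the actual inflation would be at least this large.

```python
import math


def variance_inflation_factor(r):
    """VIF for one of two predictors whose pairwise correlation is r: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r ** 2)


vif = variance_inflation_factor(-0.75)  # erosion vs. rain correlation from this slide
se_multiplier = math.sqrt(vif)          # factor by which the standard error grows
print(round(vif, 2), round(se_multiplier, 2))
```

Under this simplification, the standard error of the erosion coefficient is roughly one and a half times what it would be if erosion were uncorrelated with rain. The t-statistic shrinks and the confidence interval widens by the same factor.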

Page 18:

Does water area affect child health?

Hypotheses:
• H0: Water area is not a significant explanatory variable (β11 = 0).
• HA: Water area is significant (β11 ≠ 0).

Statistical Technique:
• Two-sided t-tests with α = 0.05.
• 95% family-wise confidence interval (Bonferroni) for family of environmental variables.

Results:
• According to the t-test, there is sufficient evidence to reject the null hypothesis and conclude water area is a significant explanatory variable in this model (p=0.0094).
• The family-wise confidence interval supports the conclusion that water area is significant (95% CI: -0.2907, -0.0011).

Conclusions:
• An increase in water area is associated with a decrease in the annual number of illnesses per child. Doubling the water area is associated with a 10% decrease in the median number of illnesses per child.

Page 19:

Possible Interactions

Matrix columns, in order: log.illadult, aduledu, fmlsz, log.farmsz, floor, roof, nonfarm.binary, log.exppermem, log.villdensity, log.primeforest, log.secondforest, log.waterarea, log.rain, log.erosion

Each X marks a candidate interaction between the row variable and one of the column variables.

Variable (row)                     Code name          Marks
illness per adult                  log.illadult
adult education                    aduledu            X
family size                        fmlsz              X
farm size                          log.farmsz
floor condition**                  floor              X X
roof condition**                   roof               X X X
income from nonfarm sources**      nonfarm.binary     X
expenditures per family member     log.exppermem      X X
village density                    log.villdensity    X
primary forest cover               log.primeforest
secondary forest cover             log.secondforest
water area                         log.waterarea      X
rain                               log.rain           X X
erosion                            log.erosion

** Indicator Variables

Possible Quadratic Terms: Village Density

Page 20:

Roof Type   Intercept: est. mean (95% CI)   Slope: est. mean (95% CI)
1: Straw    7.90 (1.61, 38.83)              -0.78 (-0.97, -0.62)
2: Wood     4.72 (1.43, 15.52)              -0.84 (-0.99, -0.71)
3: Zinc     6.63 (2.17, 20.27)              -0.80 (-0.94, -0.69)

Intercepts are in # child illnesses; slopes are in # child illnesses per mm of rain.

Prediction at future value of 800 mm of rain
• Between 0.33 and 1.4 illnesses per child per year (so approximately 1 illness).
• Methods: Centered log(rain) at log(800); used t(0.975, 332).
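Centering log(rain) at log(800) is a convenient trick. After centering, the fitted intercept is itself the predicted value at 800 mm, so its standard error can be read straight from the regression output. The sketch below shows the idea with a one-predictor least-squares fit on made-up numbers. The data and the function name are hypothetical and unrelated to the study's values.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical (rainfall in mm, log illnesses per child) pairs.
rain_mm = [200.0, 400.0, 800.0, 1600.0, 3200.0]
log_ill = [0.50, 0.30, 0.05, -0.20, -0.40]

log_rain = [math.log(r) for r in rain_mm]
centered = [v - math.log(800.0) for v in log_rain]

a_raw, b_raw = fit_line(log_rain, log_ill)
a_ctr, b_ctr = fit_line(centered, log_ill)

prediction_at_800 = a_raw + b_raw * math.log(800.0)
print(a_ctr, prediction_at_800)  # the two values agree up to rounding error
```

The slope is unchanged by centering; only the meaning of the intercept changes. Exponentiating the interval endpoints on the log scale then gives a range in illnesses per child, like the one quoted on the slide.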

Page 21:

Matrix of Variables 1

[Figure: scatterplot matrix of floor, roof, expperfammember, log.exppermem, villdensity, log.villdensity, illperchild, and log.illchild]

Page 22:

Matrix of Variables 2

[Figure: scatterplot matrix of illperchild, log.illchild, fmlsz, aduledu, log.illadult, illperadult, farmsz, log.farmsz, income.nonfarm, and log.income.nonfarm]

Page 23:

Matrix of Variables 3

[Figure: scatterplot matrix of illperchild, log.illchild, log.primeforest, primeforest, secondforest, log.secondforest, totalforest, and log.totalforest]

Page 24:

Matrix of Variables 4

[Figure: scatterplot matrix of illperchild, log.illchild, waterarea, log.waterarea, rain, log.rain, erosion, and log.erosion]

Page 25:

QQ Normal Plot and Residuals Plot for the Final Model

[Figure, two panels: (1) residuals against quantiles of the standard normal; (2) residuals against fitted values, with the x-axis labeled "Fitted : fmlsz + log.illadult + roof + nonfarm.binary + ..."]