STAT 700 Homework 5

  • 8/11/2019 STAT 700 Homework 5

    Steven Abel

    Fred Kaiser

    Loan Robinson

    STAT 700 Homework 5

    1. (a) Make boxplots of the data.

    > data <- ...
    > boxplot(weight ~ feed, data = data, ylab = "Weight", xlab = "Feed", main = "Weights of Chickens by Type of Feed")

    The boxplots suggest that the type of feed has a marked effect on the weight of newly hatched chicks after six weeks. Every feed except sunflower appears to differ substantially from casein. The type of feed also appears to affect the variance of the weights of the chickens fed with it.

    (b) Determine if there are differences in the weights of chicken according to their feed.

    The model we are using is a linear model:

    Y(ij) = mu + alpha(i) + epsilon(ij).

    We assume that the epsilon terms are independent random variables distributed N(0, sigma^2), and we constrain the alpha terms to sum to 0. The alpha terms represent treatment effects, in this case the effects of the type of feed given to the chickens.
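The sum-to-zero parameterization can be sketched in R. This is an illustration rather than the code used for the homework: it assumes the built-in chickwts data (columns weight and feed) as a stand-in for the data file, and the names chick, fit_sum, mu_hat, and alpha_hat are introduced here only for the example.

```r
# Assumption: the built-in chickwts data stands in for the homework's
# data file, whose loading call is not shown in the write-up.
chick <- chickwts

# Fit Y(ij) = mu + alpha(i) + epsilon(ij) with sum-to-zero constraints.
fit_sum <- lm(weight ~ feed, data = chick,
              contrasts = list(feed = "contr.sum"))

coefs <- coef(fit_sum)
mu_hat <- unname(coefs[1])                      # estimate of mu
alpha_first <- unname(coefs[-1])                # alpha(1), ..., alpha(5)
alpha_hat <- c(alpha_first, -sum(alpha_first))  # alpha(6) from the constraint
names(alpha_hat) <- levels(chick$feed)

# mu + alpha(i) reproduces each feed's sample mean.
group_means <- tapply(chick$weight, chick$feed, mean)
print(round(cbind(fitted = mu_hat + alpha_hat, observed = group_means), 1))
```

With this coding the intercept is the unweighted average of the six feed means, and each alpha(i) is that feed's deviation from it.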

    We will test for whether there are differences in weight caused by feed by doing a one-way ANOVA test.

    Ho: alpha(i) = 0 for all i = 1, ..., 6.

    Ha: the alpha terms are not all 0, or in other words, at least one alpha term is different from 0.

    > anova(fit.lm <- lm(weight ~ feed, data = data))

              Df Sum Sq Mean Sq F value    Pr(>F)
    feed       5 231129   46226  15.365 5.936e-10 ***
    Residuals 65 195556    3009
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    The P-value from the one-way ANOVA test is 5.936e-10, showing that feed has a strong effect on the response, weight, and explains a great deal of the variation in weight. We reject the null hypothesis and conclude that at least one alpha term is different from 0.
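As a check on the arithmetic, the F statistic and P-value can be recomputed from the sums of squares and degrees of freedom reported in the ANOVA table. The variable names below are chosen for this sketch only.

```r
# Values copied from the one-way ANOVA table.
ss_feed <- 231129; df_feed <- 5
ss_res  <- 195556; df_res  <- 65

ms_feed <- ss_feed / df_feed          # mean square for feed
ms_res  <- ss_res / df_res            # residual mean square
f_stat  <- ms_feed / ms_res           # F = MS(feed) / MS(residual)
p_value <- pf(f_stat, df_feed, df_res, lower.tail = FALSE)

# Share of the total variation in weight attributed to feed.
r_squared <- ss_feed / (ss_feed + ss_res)

cat("F =", round(f_stat, 3), " p =", signif(p_value, 4),
    " R^2 =", round(r_squared, 3), "\n")
```

The ratio of sums of squares is one way to quantify "a great deal of the variation": feed accounts for a little over half of the total sum of squares.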

    > par(mfrow=c(2,2))
    > plot(fit.lm)

    (c) Now test all possible two-group comparisons.

    > attach(data)
    > pairwise.t.test(weight, feed, p.adjust.method="bonf")

    Pairwise comparisons using t tests with pooled SD

    data:  weight and feed

              casein  horsebean linseed meatmeal soybean
    horsebean 3.1e-08 -         -       -        -
    linseed   0.00022 0.22833   -       -        -
    meatmeal  0.68350 0.00011   0.20218 -        -
    soybean   0.00998 0.00487   1.00000 1.00000  -
    sunflower 1.00000 1.2e-08   9.3e-05 0.39653  0.00447

    P value adjustment method: bonferroni

    We conclude that, at the alpha = 0.05 level, casein is significantly different from horsebean, linseed, and soybean in effect. Horsebean is significantly different from meatmeal, soybean, and sunflower. Linseed is significantly different from sunflower. Soybean is significantly different from sunflower. No other pairwise difference is significant at this level.

    All in all, there are many differences between the types of feed when compared pairwise, which is a stronger conclusion than we made in part (b), when we said only that at least one feed had a significant effect.

    15 pairwise comparisons were made for this problem, one for each pair among the six feeds.

    (d) The Bonferroni adjustment is a way to control the family-wise type I error rate when doing multiple comparisons, as we were in this case with our 15 pairwise comparisons. One way to apply it is to fix alpha and compare each P-value to alpha/n, where n is the number of comparisons. R instead multiplies each P-value by n, the number of comparisons (capping the result at 1), and compares it to the original fixed alpha. This leads to equivalent conclusions.
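The equivalence of the two forms of the Bonferroni rule can be checked directly. The sketch below assumes the built-in chickwts data as a stand-in for the homework's data file; raw, bonf, by_hand, and same_decisions are names introduced for the example.

```r
# Assumption: chickwts stands in for the homework's data file.
raw <- pairwise.t.test(chickwts$weight, chickwts$feed,
                       p.adjust.method = "none")$p.value
bonf <- pairwise.t.test(chickwts$weight, chickwts$feed,
                        p.adjust.method = "bonferroni")$p.value

n_comp <- choose(nlevels(chickwts$feed), 2)  # number of pairwise comparisons
by_hand <- pmin(raw * n_comp, 1)             # multiply by n, then cap at 1

# Rule 1: compare the raw P-value to alpha / n.
# Rule 2: compare the adjusted P-value to alpha.
alpha <- 0.05
same_decisions <- identical(raw < alpha / n_comp, bonf < alpha)

print(round(by_hand, 5))
print(same_decisions)
```

The upper triangle of each matrix is NA because each pair is tested once, so only the 15 entries on and below the diagonal carry P-values.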

    (e) Other methods exist to adjust the P-value when doing multiple comparisons. Holm's method, like Bonferroni, controls the family-wise error rate; it is less conservative and is also valid under arbitrary assumptions. Hochberg's and Hommel's methods also control the family-wise error rate, but are valid only under certain conditions.

    The alternative is to control the false discovery rate, the expected proportion of false discoveries among the rejected hypotheses. This is done in the methods of Benjamini, Hochberg, and Yekutieli. Because the false discovery rate is a less stringent criterion than the family-wise error rate, methods like this are more powerful and less conservative than the others.
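To see how these adjustments compare, the sketch below applies several of the methods available in R's p.adjust to one set of P-values. The P-values are hypothetical and chosen only for illustration; they are not taken from the homework data.

```r
# Hypothetical raw P-values, chosen only for illustration.
p_raw <- c(0.001, 0.008, 0.012, 0.030, 0.045, 0.200)

methods <- c("bonferroni", "holm", "hochberg", "hommel", "BH", "BY")
adjusted <- sapply(methods, function(m) p.adjust(p_raw, method = m))
rownames(adjusted) <- paste0("p", seq_along(p_raw))
print(round(adjusted, 4))

# Number of rejections at alpha = 0.05 under each method.
print(colSums(adjusted < 0.05))
```

Holm's adjusted P-values are never larger than Bonferroni's, and the Benjamini-Hochberg ("BH") values are never larger than Hochberg's, which is the sense in which the later methods are less conservative.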

    2. (a) Plot the data using strip charts and interaction plots.

    > data <- ...
    > data$Poison <- ...
    > data$Treatment <- ...
    > attach(data)

    > par(mfrow=c(1,1))
    > stripchart(Survival~Poison, ylab = "Poison", main = "Survival by Poison")

    This chart shows that the type of poison seems to have a strong effect on survival time. Poison 3's effect is the most pronounced: it is the most lethal poison, as no observation was over 4 hours, while the other two poisons had much more variation and clearly higher mean survival times.

    > stripchart(Survival~Treatment, ylab = "Treatment", main = "Survival by Treatment")

    This plot shows that the type of treatment also seems to have a strong effect on survival. Treatment 1 stands out as the least effective treatment, as its observations have a low mean and comparatively low variation. The other three treatments had rather high variation, but treatment 3 seemed to be less effective than 2 and 4.

    > stripchart(Survival~Treatment+Poison, ylab = "Treatment/Poison Combination", main = "Survival by Treatment/Poison")

    This plot shows that certain treatment/poison combinations could have a strong effect on survival compared to the others. The most striking feature is that poison 3 is very potent, as observations given poison 3 all had low survival, although treatment did seem to have an effect. Other effects appear, such as the strength of treatments 2 and 4, but there is more variation for any observation not given poison 3.

    > interaction.plot(Treatment, Poison, Survival, main="Interaction of Poison and Treatment")

    > interaction.plot(Poison, Treatment, Survival, main="Interaction of Poison and Treatment")

    These plots suggest that interaction is not a strong feature of the data. In both plots, survival follows almost the same pattern across the poisons or treatments, regardless of the level of the second factor. The lines are almost parallel to each other.

    (b) Conduct a two-way ANOVA to test the effects of the two main factors and their interactions.

    > anova(fit.lm <- lm(Survival ~ Poison * Treatment))

                     Df  Sum Sq Mean Sq F value    Pr(>F)
    Poison            2 103.043  51.521 23.5699 2.863e-07 ***
    Treatment         3  91.904  30.635 14.0146 3.277e-06 ***
    Poison:Treatment  6  24.745   4.124  1.8867      0.11
    Residuals        36  78.692   2.186
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Our model is another linear model:

    Y(ijk) = mu + alpha(i) + beta(j) + gamma(ij) + epsilon(ijk).

    We assume the alpha, beta, and gamma terms sum to 0, and that the epsilons are independent random

    variables distributed N(0, sigma^2).
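The structure of this model can be sketched on simulated data. Everything below is hypothetical: the effect sizes, the seed, and the names sim, fit_sim, and tab are invented for the example. Only the layout (3 poisons, 4 treatments, 4 replicates per cell) is chosen to match the degrees of freedom in the table.

```r
set.seed(700)  # arbitrary seed so the sketch is reproducible

# Hypothetical balanced layout: 3 poisons x 4 treatments x 4 replicates.
sim <- expand.grid(rep = 1:4,
                   Treatment = factor(1:4),
                   Poison = factor(1:3))

# Hypothetical main effects that each sum to zero, with every gamma(ij) = 0.
alpha <- c(1, 0.5, -1.5)[as.integer(sim$Poison)]
beta  <- c(-1, 1, -0.5, 0.5)[as.integer(sim$Treatment)]
sim$Survival <- 5 + alpha + beta + rnorm(nrow(sim), sd = 1)

# Poison * Treatment expands to both main effects plus their interaction.
fit_sim <- lm(Survival ~ Poison * Treatment, data = sim)
tab <- anova(fit_sim)
print(tab)
```

With 48 observations and 12 cell means, the residual degrees of freedom are 48 - 12 = 36, and the interaction uses (3 - 1)(4 - 1) = 6.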

    First we test for interaction:

    Ho: All gamma terms equal 0. Interaction is not significant.

    Ha: At least one gamma term does not equal 0. There is significant interaction between factors Poison

    and Treatment.

    Because the P-value is 0.11, we fail to reject the null at the 0.05 level. This is consistent with what the interaction plots suggested: there is no significant interaction between Poison and Treatment.

    Doing tests for the main effects in the same way as our test for one-way ANOVA, we see the P-values of

    2.863e-07 for the test of the Poison effect and 3.277e-06 for the Treatment factor. Thus, we reject the

    null in each case and conclude that both Poison and Treatment have significant effects on survival time.

    In other words, at least one alpha term is not equal to 0, and at least one beta term is not equal to 0.
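All three F tests share the same residual mean square as the denominator, so they can be recomputed from the sums of squares in the two-way ANOVA table. The data frame anova_tab and its column names are introduced for this sketch only.

```r
# Values copied from the two-way ANOVA table for survival time.
anova_tab <- data.frame(
  term   = c("Poison", "Treatment", "Poison:Treatment"),
  df     = c(2, 3, 6),
  sum_sq = c(103.043, 91.904, 24.745)
)
ss_res <- 78.692; df_res <- 36

ms_res <- ss_res / df_res  # common denominator for all three tests
anova_tab$f_value <- (anova_tab$sum_sq / anova_tab$df) / ms_res
anova_tab$p_value <- pf(anova_tab$f_value, anova_tab$df, df_res,
                        lower.tail = FALSE)
print(anova_tab)
```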

    > par(mfrow=c(2,2))
    > plot(fit.lm)

    We do all pairwise comparisons for both the levels of Poison and the levels of Treatment.

    > pairwise.t.test(Survival,Poison,p.adjust.method="bonf")

    Pairwise comparisons using t tests with pooled SD

    data:  Survival and Poison

      1       2
    2 0.9542  -
    3 9.3e-05 0.0022

    P value adjustment method: bonferroni

    From this we conclude that poison 3 is different from poisons 1 and 2, while 1 and 2 are not significantly

    different from each other in effect.

    > pairwise.t.test(Survival,Treatment,p.adjust.method="bonf")

    Pairwise comparisons using t tests with pooled SD

    data:  Survival and Treatment

      1      2      3
    2 0.0011 -      -
    3 1.0000 0.0147 -
    4 0.1051 0.6613 0.7235

    P value adjustment method: bonferroni

    From this we conclude that treatment 2 is significantly different from treatments 1 and 3, while all other

    pairs are not significantly different from each other in effect on survival time.

    (c) Conduct the two-way ANOVA using rate of death instead of survival time.

    > Deathrate <- ...
    > anova(fit.lm <- lm(Deathrate ~ Poison * Treatment))

                     Df  Sum Sq  Mean Sq F value    Pr(>F)
    Poison            2 0.34863 0.174316 72.8419 2.217e-13 ***
    Treatment         3 0.20396 0.067987 28.4100 1.336e-09 ***
    Poison:Treatment  6 0.01567 0.002611  1.0911    0.3864
    Residuals        36 0.08615 0.002393
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    > par(mfrow=c(2,2))
    > plot(fit.lm)

    We reach the same conclusions using rate of death instead of survival time: there is no significant interaction between Poison and Treatment, while both main effects are significant. However, the diagnostic plots look better using rate of death. In particular, the residuals vs. fitted values plot shows no pattern, while the same plot using survival time had a bit of a curve. Also, the normal Q-Q plot looks much better using rate of death.
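Why a change of scale can improve the diagnostics can be illustrated with simulated data. The sketch below is hypothetical throughout. It assumes that rate of death means 1 / survival time (the definition of Deathrate is not shown in the write-up), generates data that are additive with constant variance on the rate scale, and uses the trend in residual size as a crude numeric stand-in for the residuals vs. fitted plot. It does not measure the curvature noted for survival time.

```r
set.seed(5)  # arbitrary seed so the sketch is reproducible

# Hypothetical balanced layout: 3 poisons x 4 treatments x 4 replicates.
sim <- expand.grid(rep = 1:4, Treatment = factor(1:4), Poison = factor(1:3))

# Hypothetical data that are additive with constant variance on the
# rate scale. Assumption: rate of death = 1 / survival time.
rate_mean <- 0.30 + c(-0.10, -0.05, 0.15)[as.integer(sim$Poison)] +
  c(0.08, -0.06, 0.02, -0.04)[as.integer(sim$Treatment)]
sim$Deathrate <- rate_mean + rnorm(nrow(sim), sd = 0.02)
sim$Survival  <- 1 / sim$Deathrate

fit_surv <- lm(Survival ~ Poison * Treatment, data = sim)
fit_rate <- lm(Deathrate ~ Poison * Treatment, data = sim)

# Rank correlation between residual size and fitted value: values near 0
# suggest constant spread, positive values suggest spread that grows.
spread_trend <- function(fit) {
  cor(abs(residuals(fit)), fitted(fit), method = "spearman")
}
trends <- c(survival = spread_trend(fit_surv), rate = spread_trend(fit_rate))
print(trends)
```

In practice the plots from plot(fit.lm) remain the primary check; a single summary number like this is only a rough guide.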