Advanced Data Analysis 1
Stat 427/527
Clicker Questions
Erik B. Erhardt
Department of Mathematics and Statistics
MSC01 1115
1 University of New Mexico
Albuquerque, New Mexico, 87131-0001
Office: MSLC [email protected]
Fall 2014
Ch 00
Introduction and R + RStudio
Erik B. Erhardt, UNM Stat 427/527, ADA1, Ch 00, Clicker Questions 2/1
Ch 0, Learning outcomes, General
Q 1. More generally, thinking of what you want to get out of your college education and this course, which of the following is most important to you?
A Acquiring factual knowledge
B Learning how to use knowledge in new situations
C Developing skills to continue learning after college
Mark this number on your sheet.
Ch 0, R building blocks, Subset
Q 2. What value will R return for z?
x <- 3:7
y <- x[c(1, 2)] + x[-c(1:3)]
z <- prod(y)
z
A 99
B 20
C 91
D 54
E NA
Answer: Ch 0, R building blocks, Subset
x <- 3:7
x
## [1] 3 4 5 6 7
x[c(1, 2)]
## [1] 3 4
x[-c(1:3)]
## [1] 6 7
y <- x[c(1, 2)] + x[-c(1:3)]
y
## [1] 9 11
z <- prod(y)
z
## [1] 99
Ch 0, R building blocks, T/F selection 1
Q 3. What value will R return for z?
x <- seq(-3, 3, by = 2)
a <- x[(x > 0)]
b <- x[(x < 0)]
z <- a[1] - b[2]
z
A −2
B 0
C 1
D 2
E 6
Answer: Ch 0, R building blocks, T/F selection 1
x <- seq(-3, 3, by = 2)
x
## [1] -3 -1 1 3
a <- x[(x > 0)]
a
## [1] 1 3
b <- x[(x < 0)]
b
## [1] -3 -1
z <- a[1] - b[2]
z
## [1] 2
Ch 0, R building blocks, T/F selection 2
Q 4. What value will R return for z?
a <- 2:-3
b <- a[(a > 0) & (a <= 0)]
d <- a[!(a > 1) & (a <= -1)]
z <- sum(c(b,d))
z
A −6
B −3
C 0
D 3
E 6
Answer: Ch 0, R building blocks, T/F selection 2
a <- 2:-3
a
## [1] 2 1 0 -1 -2 -3
a[(a > 0)]
## [1] 2 1
a[(a <= 0)]
## [1] 0 -1 -2 -3
b <- a[(a > 0) & (a <= 0)]
b
## integer(0)
a[!(a > 1)]
## [1] 1 0 -1 -2 -3
a[(a <= -1)]
## [1] -1 -2 -3
d <- a[!(a > 1) & (a <= -1)]
d
## [1] -1 -2 -3
z <- sum(c(b, d))
z
## [1] -6
Ch 01
Summarizing and Displaying Data
Ch 1, Random variables, Darts
Q 5. Draw the following dart board: A dart board is constructed from three concentric circles with radii 1 inch, 2 inches, and 3 inches, respectively. If a dart lands in the innermost circle, the player receives 4 points. If the dart lands between the innermost circle and the middle circle, the player receives 2 points. If the dart lands between the middle circle and the outermost circle, the player receives 1 point. Assume that the probability of a dart landing in any particular region is proportional to the area of that region.
Define the random variable X to be the sum of the player’s score on two successive throws. Then X is what type of random variable?
A discrete
B continuous
Ok49
Answer: Ch 1, Random variables, Darts
(A). The possible values for X are 2, 3, 4, 5, 6, and 8, a countable number of values.
Ok49
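The list of possible values can be checked by enumeration. The sketch below is only an illustration: it assumes every dart lands somewhere on the board and that the two throws are independent, neither of which the question states explicitly. The variable names are made up for this example.

```r
# Region areas are proportional to (outer radius)^2 - (inner radius)^2
radii  <- c(1, 2, 3)
points <- c(4, 2, 1)                   # score for inner, middle, outer region
areas  <- pi * diff(c(0, radii^2))     # pi * c(1, 3, 5)
probs  <- areas / sum(areas)           # 1/9, 3/9, 5/9 (assumes dart hits the board)

# Every ordered pair of throws: total score and its probability
sums       <- outer(points, points, "+")
joint_prob <- outer(probs, probs)      # assumes independent throws

# Collapse to the distribution of X
x_dist <- tapply(as.vector(joint_prob), as.vector(sums), sum)
x_dist
```

X takes only the six values named above, which is why it is discrete; a total of 7 is impossible because no pair of region scores adds to it.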
Ch 1, Random variables
Q 6.
ABCDE
Ok50
Answer: Ch 1, Random variables
Ok50
Ch 1, Random variables, Radioactive 1
Q 7. A radioactive mass emits particles at an average rate of 15 particles per minute. Define the random variable X to be the number of particles emitted in a 10-minute time frame. Then X is what type of random variable?
A discrete
B continuous
Ok STT.04.03.030
Answer: Ch 1, Random variables, Radioactive 1
(A). The possible values for X are all integers between 0 and the number of particles in the mass. Even if there were an infinite number of particles in the mass, this would still be a discrete random variable, since the possible values are countable (0, 1, 2, . . .).
Ok STT.04.03.030
Ch 1, Random variables, Radioactive 2
Q 8. A radioactive mass emits particles at an average rate of 15 particles per minute. A particle is emitted at noon today. Define the random variable X to be the time elapsed between noon and the next emission. Then X is what type of random variable?
A discrete
B continuous
Ok STT.04.03.040
Answer: Ch 1, Random variables, Radioactive 2
(B) X can take on any positive value, which is an uncountable set of values.
Ok STT.04.03.040
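The contrast between the two radioactive questions can be seen in a small simulation. This is a sketch under an assumption the slides do not make: that emissions follow a Poisson process, so counts are Poisson and waiting times are exponential. The seed and sample sizes are arbitrary.

```r
set.seed(427)
rate_per_min <- 15

# Radioactive 1: number of particles in 10 minutes (assumed Poisson)
counts <- rpois(5, lambda = rate_per_min * 10)

# Radioactive 2: minutes from noon until the next emission (assumed exponential)
waits <- rexp(5, rate = rate_per_min)

counts   # whole numbers only: a discrete variable
waits    # positive real numbers: a continuous variable
```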
Ch 1, Numerical summaries, Unemployment
Q 9. Many individuals, after the loss of a job, receive temporary pay (unemployment compensation) until they are re-employed. Consider the distribution of time to re-employment as obtained in an employment survey. One broadcast reporting on the survey said that the average time until re-employment was 4.5 weeks. A second broadcast reported that the average was 9.9 weeks. One of your colleagues wanted a better understanding of the situation and learned (through a Google search) that one report was referring to the mean and the other to the median and also that the standard deviation was about 14 weeks. Knowing that you are a statistically-savvy person, your colleague asked you which is most likely the mean and which is the median?
A 4.5 is the mean and 9.9 is the median.
B 4.5 is the median and 9.9 is the mean.
C Neither (A) nor (B) is possible given the SD of the data.
D I am not a statistically-savvy person, so how should I know?
Ok STT.01.02.020
Answer: Ch 1, Numerical summaries, Unemployment
(B) The data must be right-skewed since the distribution is truncated at 0 weeks on the left side of the distribution. Data that are truncated at one end tend to have a skew in the direction away from the truncated end. In a right-skewed distribution the long upper tail pulls the mean above the median, so 9.9 is the mean and 4.5 is the median.
Ok STT.01.02.020
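To see how a distribution bounded below by 0 puts the mean above the median, here is a minimal sketch using an exponential distribution. The exponential model is purely an illustrative assumption; it is not claimed to fit the survey, and its standard deviation would equal its mean rather than the 14 weeks reported.

```r
# Illustrative right-skewed model for time to re-employment (an assumption)
mean_weeks   <- 9.9
rate         <- 1 / mean_weeks

# Median of an exponential distribution is mean * log(2)
median_weeks <- qexp(0.5, rate = rate)

c(mean = mean_weeks, median = median_weeks)
```

Under this model the median falls below the mean, the same ordering as in answer (B).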
Ch 1, Stem-and-leaf plot
Q 10. A data set consists of fifty three-digit numbers ranging from 180 to 510. The best choice for stems in a stem-and-leaf display would be to use ______.
A 1 digit stems (1, 2, . . . , 5)
B 2 digit stems (18, 19, . . . , 51)
C 3 digit stems (180, 181, . . . , 510)
Ok STT.01.01.010
Answer: Ch 1, Stem-and-leaf plot
(A) 1 digit stems (1, 2, . . . , 5)
Ok STT.01.01.010
Ch 1, Boxplot, Ok10, STT.01.02.070
Below are boxplots for two data sets.
[Figure: side-by-side boxplots of data sets 1 and 2; vertical axis from 0 to 8]
TRUE or FALSE: There is a greater proportion of values outside the box for the set on the right than for the set on the left.
A True, and I am very confident.
B True, and I am not very confident.
C False, and I am not very confident.
D False, and I am very confident.
Answer: Ch 1, Boxplot, Ok10, STT.01.02.070
Answer: (False).
These are boxplots, so the box represents the middle 50% of data in both cases, meaning that what’s outside of the box is also 50% in both cases. (The only exception is if the data set has a lot of repeated values right at the first or third quartile. These values would be “in” the box and could increase the proportion of data in the box beyond the standard 50%).
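The 50% claim can be checked numerically. The sketch below uses two made-up data sets with very different spreads (they are not the data behind the slide's figure) and R's default quartile definition.

```r
# Proportion of values strictly inside the box (between Q1 and Q3)
prop_in_box <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75))
  mean(x > q[1] & x < q[2])
}

narrow <- 1:100          # hypothetical data set 1
wide   <- (1:100)^2      # hypothetical data set 2, far more spread out

prop_in_box(narrow)
prop_in_box(wide)
```

Both calls give the same proportion even though the second box would be drawn much taller.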
Ch 1, Mean vs median, Histogram, Ok06, STT.01.02.030
For the data set displayed in the following histogram, which would be larger?
A mean
B median
C Can’t tell from the given histogram.
Answer: Ch 1, Mean vs median, Histogram, Ok06, STT.01.02.030
Answer: (A). The mean is larger because of the right skew.
Ch 2, Inference for a population mean, VLBW, Ok43, STT.03.02.010
Researchers believe that one possible cause of Very Low Birth Weight (VLBW) infants is the presence of undiagnosed infections in the mother. To assess this possibility, they collected data on all pregnant women presenting themselves for prenatal care at large urban hospitals. What is the appropriate population for this study?
A All infants.
B All infants born as VLBW infants.
C All infants born in large urban centers.
D All pregnant women.
E All pregnant women living in large urban centers.
Answer: Ch 2, Inference for a population mean, VLBW, Ok43, STT.03.02.010
Answer: (E).
(A), (B), (C) Infants are not the unit of analysis. The researchers believe that VLBW infants result from undiagnosed infections in the mother, thus pregnant women are the unit of analysis.
(D) This approach is not the most conservative because where the pregnant women live may have an impact on VLBW infants.
(E)* correct: This approach is the most conservative. Pregnant women at large urban centers is the target population.
Ch 2, Inference for a population mean, fundamental concept, Ok84, STT.06.01.010
The fundamental concept underlying statistical inference is that
A through the use of sample data we are able to draw conclusions about a sample from which the data were drawn.
B through the examination of sample data we can derive appropriate conclusions about a population from which the data were drawn.
C when generalizing results to a sample we must make sure that the correct statistical procedure has been applied.
D Two of the above are true.
E All of the above are true.
Answer: Ch 2, Inference for a population mean, fundamental concept, Ok84, STT.06.01.010
Answer: (B).
(A) With statistical inference, we use samples to draw conclusions about the population, not the sample.
(B)* correct: This statement is the definition of statistical inference.
(C) We do not generalize results to a sample but to a population. Furthermore, using the correct procedure (to generalize to a population) is not the fundamental concept of inferential statistics.
(D), (E) Only (B) is correct.
Ch 2, CI for µ, definition, Ok85, STT.06.01.020
A 95% confidence interval is an interval calculated from
A sample data that will capture the true population parameter for at least 95% of all samples randomly drawn from the same population.
B population data that will capture the true population parameter for at least 95% of all samples randomly drawn from the same population.
C sample data that will capture the true sample statistic for at least 95% of all samples randomly drawn from the same population.
D population data that will capture the true sample statistic for at least 95% of all samples randomly drawn from the same population.
Answer: Ch 2, CI for µ, definition, Ok85, STT.06.01.020
Answer: (A).
Note: One point of this question is that inferential statistics is about estimating population parameters from sample data.
(A)* correct: This statement refers to the ideas behind sampling and the Central Limit Theorem.
(B) A calculation from population data would capture the true population parameter with 100% confidence.
(C) Sample statistics have a sampling distribution so there is no one true sample statistic.
(D) See the explanations for (B) and (C).
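The repeated-sampling idea in (A) can be illustrated with a small simulation. This is a sketch, not part of the course material: the population (normal with mean 10 and standard deviation 2), the sample size, the number of repetitions, and the seed are all arbitrary assumptions.

```r
set.seed(427)
mu     <- 10      # true population mean (known only because we simulate)
sigma  <- 2
n      <- 30
n_sims <- 2000

# For each simulated sample, does the 95% t-interval capture mu?
covered <- replicate(n_sims, {
  x  <- rnorm(n, mean = mu, sd = sigma)
  ci <- t.test(x, conf.level = 0.95)$conf.int
  ci[1] <= mu && mu <= ci[2]
})

coverage <- mean(covered)
coverage
```

The proportion of intervals that capture µ should land near 0.95, varying a little from run to run.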
Ch 2, CI for µ, definition, Ok86, STT.06.01.050
A 95% confidence interval has been constructed around a sample mean of 28. The interval is (21, 35). Which of the following statement(s) is true?
A The margin of error in the interval is 7.
B 95 out of 100 confidence intervals constructed aroundsample means will contain the true population mean.
C The interval (21,35) contains the true population mean.
D Both (a) and (b) are true.
E (a), (b), and (c) are true.
Answer: Ch 2, CI for µ, definition, Ok86, STT.06.01.050
Answer: (A).
(A) The margin of error is half the interval width: (35 − 21)/2 = 7.
(B) Each interval has probability 0.95 of containing µ, but there is no guarantee that exactly 95 of 100 intervals will contain µ.
Ch 2, CI, quickie, Ok89, STT.06.01.080
A 95% confidence interval for birth weights is found to be (6.85, 7.61). Is it correct to say that 95% of all birth weights will be between 6.85 and 7.61 pounds?
A Yes
B No
Answer: Ch 2, CI, quickie, Ok89, STT.06.01.080
Answer: (B)
No. This confidence interval gives us a sense of where the population mean lies, not which individual observations are likely to occur (that’s called a prediction interval).
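A short simulation makes the distinction concrete. The birth weights below are simulated from an assumed normal distribution (mean 7.2 pounds, standard deviation 1.9, n = 100); these values are hypothetical and are not the data behind the interval in the question.

```r
set.seed(427)
weights <- rnorm(100, mean = 7.2, sd = 1.9)   # hypothetical birth weights (pounds)

# 95% confidence interval for the population mean
ci <- t.test(weights, conf.level = 0.95)$conf.int

# Share of individual observations that fall inside that interval
inside_ci <- mean(weights >= ci[1] & weights <= ci[2])

ci
inside_ci
```

Only a small share of the individual weights falls inside the interval, because the interval describes uncertainty about the mean, not the spread of the observations.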
Ch 2, One-sided tests on µ, Foster care, Ok95, STT.06.02.030
Child and Protective Services, a branch of the Department of Health and Human Services, is investigating the monthly average number of children in foster care over the last several years. They are interested in seeing if the average is dropping from 235 children per month in 2001. The null hypothesis for this problem would be:
A H0 : µ < 235
B H0 = 235
C H0 : p = 235
D H0 : µ = 235
E None of the above
Answer: Ch 2, One-sided tests on µ, Foster care, Ok95, STT.06.02.030
Answer: (D). The question asks for the null hypothesis, not the alternative, so it is the statement of equality about the population mean.
Ch 2, P-value, Ok100, STT.06.03.020
A P-value represents
A the probability, given the null hypothesis is true, that the results could have been obtained purely on the basis of chance alone.
B the probability, given the alternative hypothesis is true, that the results could have been obtained purely on the basis of chance alone.
C the probability that the results could have been obtained purely on the basis of chance alone.
D Two of the above are proper representations of a P-value.
E None of the above is a proper representation of a P-value.
Answer: Ch 2, P-value, Ok100, STT.06.03.020
Answer: (A).
(A)* correct: This answer gives the definition of p-value.
(B) The definition of p-value is not conditional on the alternative hypothesis because the probability that the alternative hypothesis is true is difficult to determine (The Bayesian Problem).
(C) A hypothesis test begins with the assumption that the null hypothesis is true (a conditional probability, not an unconditional probability).
(D) Only A is correct.
(E) A is correct.
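The conditioning on the null hypothesis is easy to see in a case where the P-value can be computed by hand. The sketch below uses a hypothetical experiment (20 trials, 15 successes, null success probability 0.5) that does not come from the slides, and compares a direct calculation with `binom.test()`.

```r
n     <- 20     # hypothetical number of trials
p0    <- 0.5    # success probability assumed under H0
x_obs <- 15     # hypothetical observed successes

# P(X >= x_obs), computed assuming H0 is true
p_manual <- sum(dbinom(x_obs:n, size = n, prob = p0))

# The same one-sided P-value from the built-in exact test
p_test <- binom.test(x_obs, n, p = p0, alternative = "greater")$p.value

c(manual = p_manual, binom.test = p_test)
```

Every probability in the sum uses `prob = p0`, that is, the calculation is carried out entirely under the null hypothesis.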
Ch 3, Independent or paired 1, Ok127, STT.08.02.030
Two catalysts are being analyzed to determine how they affect the mean yield of a chemical process. Catalyst 1 is used in the process eight times and the yield in percent is measured each time. Then catalyst 2 is used in the process eight times and the yield is measured each time. What kind of t-test should be used to compare these data?
A Independent t-test
B Paired t-test
Answer: Ch 3, Independent or paired 1, Ok127, STT.08.02.030
Answer: (a). In this case, catalyst 1 is applied to a different set of processes than catalyst 2, thus there is no way to match data from the first set with data from the second set.
Ch 3, Independent or paired 2, Ok128, STT.08.02.040
Six river locations are selected and the zinc concentration is determined for both surface water and bottom water at each location. What kind of t-test should be used to compare these data?
A Independent t-test
B Paired t-test
Answer: Ch 3, Independent or paired 2, Ok128, STT.08.02.040
Answer: (b). In this case, each pair of data has something in common: the two measurements are taken at the same river location.
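In R the two designs differ only in the `paired` argument of `t.test()`. The zinc concentrations below are hypothetical numbers invented for illustration; they are not data from the course.

```r
# Hypothetical zinc concentrations at six river locations
surface <- c(0.41, 0.24, 0.39, 0.41, 0.61, 0.61)
bottom  <- c(0.43, 0.27, 0.57, 0.53, 0.71, 0.72)

# Paired t-test: equivalent to a one-sample t-test on the differences
paired     <- t.test(bottom, surface, paired = TRUE)
one_sample <- t.test(bottom - surface)

# Independent (Welch) t-test, which ignores the pairing by location
independent <- t.test(bottom, surface)

c(paired = paired$p.value, independent = independent$p.value)
```

For these made-up values the paired test gives a much smaller P-value, because differencing removes the location-to-location variation.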
Ch 3, t-interval, Ok125, STT.08.02.010, variation
A two-sample t-interval was constructed for the difference in the two population means, µ1 − µ2. The resulting 99% confidence interval was (−0.004, 0.12). A conclusion that could be drawn is:
A There is no significant difference between µ1 and µ2.
B There is a significant difference between µ1 and µ2.
C The range of possible differences between the two means could be from a difference of 0.004 with µ2 being larger up to a difference of 0.12 with µ1 being larger.
D Both (a) and (c) are correct.
E Both (b) and (c) are correct.
Answer: Ch 3, t-interval, Ok125, STT.08.02.010, variation
Answer: (a). The interval contains 0, so the difference is not significant at the 1% level.
Answer (c) is almost correct, but we could only say that with 99% confidence.
Ch 3, Reporting results, Ok99, STT.06.03.010
Robert is asked to conduct a clinical trial on the comparative efficacy of Aleve versus Tylenol for relieving the pain associated with muscle strains. He creates a carefully controlled study and collects the relevant data. To be most informative in his presentation of the results, Robert should report
A whether a statistically significant difference was found between the two drug effects.
B a P-value for the test of no drug effect.
C the mean difference and the variability associated with each drug’s effect.
D a confidence interval constructed around the observed difference between the two drugs.
Answer: Ch 3, Reporting results, Ok99, STT.06.03.010
Answer: (d).
(A) Reporting only a statistically significant difference is the least informative.
(B) Reporting a p-value is more informative than reporting only a statistically significant difference (answer (A)) and more informative than reporting the mean difference and variability (answer (C)), but not as informative as reporting a confidence interval (answer (D)).
(C) Reporting the mean difference and the variability gives no indication of statistical significance.
(D)* correct: A confidence interval simultaneously provides information about the mean differences, variability, direction, a sense of minimum and maximum effect, as well as a conservative and unconservative estimate.
Ch 5, ANOVA, Fat 1/2
What are the correct hypotheses for testing for a difference between the mean doughnut fat absorption amounts among the four types?
A H0 : µ1 = µ2 = µ3 = µ4 vs HA : µ1 ≠ µ2 ≠ µ3 ≠ µ4.
B H0 : µ1 ≠ µ2 ≠ µ3 ≠ µ4 vs HA : µ1 = µ2 = µ3 = µ4.
C H0 : µ1 = µ2 = µ3 = µ4 vs HA : At least one pair of means is different.
D H0 : µ1 = µ2 = µ3 = µ4 = 0 vs HA : µ1 ≠ µ2 ≠ µ3 ≠ µ4.
E H0 : µ1 = µ2 = µ3 = µ4 vs HA : µ1 > µ2 > µ3 > µ4.
Answer: Ch 5, ANOVA, Fat 1/2
Answer: (c).
Ch 5, ANOVA, Fat 2/2
What is the conclusion of the hypothesis test? The data provide convincing evidence that the mean fat absorption amounts
A are different for all types.
B is lower for fat4 than the other fats.
C are different for at least two of the types.
D are the same for all types.
Answer: Ch 5, ANOVA, Fat 2/2
Answer: (c).
Ch 5, ANOVA, Bonferroni
In the doughnut data set fat has 4 types: 1, 2, 3, and 4. If α = 0.05, what should be the modified Bonferroni significance level for two-sample t-tests for determining which pairs of groups have significantly different means?
A α∗ = 0.05
B α∗ = 0.05/2 = 0.0250
C α∗ = 0.05/3 = 0.0167
D α∗ = 0.05/4 = 0.0125
E α∗ = 0.05/6 = 0.0083
Answer: Ch 5, ANOVA, Bonferroni
Answer: (e). There are 6 comparisons: (1,2), (1,3), (1,4), (2,3), (2,4), and (3,4).
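The count of comparisons and the adjusted level can be reproduced with base R. The variable names here are chosen for this example only.

```r
alpha <- 0.05
k     <- 4                       # number of fat types

pairs         <- combn(k, 2)     # each column is one pair of groups
n_comparisons <- choose(k, 2)    # number of pairwise comparisons
alpha_star    <- alpha / n_comparisons

t(pairs)                         # the pairs, one per row
round(alpha_star, 4)
```

The same adjustment is available through `p.adjust(p, method = "bonferroni")`, which multiplies the P-values by the number of comparisons instead of dividing α.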
Ch 5, ANOVA, Multiple Comparisons
Goal: Create a summary for the multiple comparisons of 4 groups.
Story: Percent of a Standard 50-word list heard correctly in the presence of background noise. 24 subjects with normal hearing listened to standard audiology tapes of English words at low volume with a noisy background. They repeated the words and were scored correct or incorrect in their perception of the words. The order of list presentation was randomized.
(5 slides). . .
Ch 5, ANOVA, Multiple Comparisons
[Figure: Hearing, the score received on the hearing test (axis ticks at 20, 30, 40), plotted against ListID, the code for each list played (List1, List2, List3, List4)]
(Consider the order of the means). . .
Ch 5, ANOVA, Multiple Comparisons
ANOVA results:
## Df Sum Sq Mean Sq F value Pr(>F)
## ListID 3 920 306.8 4.92 0.0033 **
## Residuals 92 5738 62.4
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
. . .
Ch 5, ANOVA, Multiple Comparisons
Bonferroni multiple comparisons:
##
## Pairwise comparisons using t tests with pooled SD
##
## data: hearing$Hearing and hearing$ListID
##
## List1 List2 List3
## List2 1.0000 - -
## List3 0.0085 0.3347 -
## List4 0.0135 0.4594 1.0000
##
## P value adjustment method: bonferroni
. . .
Ch 5, ANOVA, Multiple Comparisons
A) List: 3 4 2 1
B) List: 1 2 4 3
C) List: 1 2 3 4
D) List: 3 4 2 1
E) None of A–D
Answer: Ch 5, ANOVA, Multiple Comparisons
Answer: (d). Only List 1 is different from 3 and 4.
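The significant pairs can be read off the Bonferroni table programmatically. The sketch below re-enters the adjusted P-values from the output above by hand (the `hearing` data frame itself is not reproduced here) and uses 0.05 as the cutoff, which is an assumption about the intended significance level.

```r
lists <- c("List1", "List2", "List3", "List4")

# Lower-triangular matrix of Bonferroni-adjusted P-values, typed in from the output
p_adj <- matrix(NA_real_, nrow = 3, ncol = 3,
                dimnames = list(lists[2:4], lists[1:3]))
p_adj["List2", "List1"]             <- 1.0000
p_adj["List3", c("List1", "List2")] <- c(0.0085, 0.3347)
p_adj["List4", ]                    <- c(0.0135, 0.4594, 1.0000)

# Pairs whose adjusted P-value is below 0.05 (NA cells are ignored)
sig <- which(p_adj < 0.05, arr.ind = TRUE)
sig_pairs <- data.frame(group1 = colnames(p_adj)[sig[, "col"]],
                        group2 = rownames(p_adj)[sig[, "row"]])
sig_pairs
```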
Ch 7, CI for proportions, Ok114, STT.08.01.010
To estimate the proportion of students at a university who watch reality TV shows, a random sample of 50 students was selected and resulted in a sample proportion of .3. A 95% confidence interval for the proportion that watches reality TV would be ______ a 90% confidence interval.
A narrower than
B the same width as
C wider than
Answer: Ch 7, CI for proportions, Ok114, STT.08.01.010
Answer: (C).
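The widths can be compared directly. This sketch assumes the usual large-sample (Wald) interval, p̂ ± z·sqrt(p̂(1 − p̂)/n); the slides do not say which interval method is intended.

```r
p_hat <- 0.3
n     <- 50
se    <- sqrt(p_hat * (1 - p_hat) / n)

# Width of a Wald interval at a given confidence level
ci_width <- function(conf_level) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  2 * z * se
}

c(width_95 = ci_width(0.95), width_90 = ci_width(0.90))
```

A higher confidence level uses a larger critical value z, so the 95% interval is wider than the 90% interval for the same data.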
Ch 7, Test statistic, Ok112, STT.07.01.057
In a random sample of 2013 adults, 1283 indicated that they believe that rudeness is a more serious problem than in past years. Which of the test statistics shown below would be appropriate to determine if there is sufficient evidence to conclude that more than three-quarters of U.S. adults believe that rudeness is a worsening problem?
A $\dfrac{\hat{p} - 0.5}{\sqrt{(0.5)(1 - 0.5)/2013}}$
B $\dfrac{\hat{p} - 0.75}{\sqrt{(0.75)(1 - 0.75)/2013}}$
C $\dfrac{\bar{x} - 0.75}{\sqrt{s/2013}}$
Answer: Ch 7, Test statistic, Ok112, STT.07.01.057
Answer: (b).
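For reference, the statistic in (B) can be evaluated with the numbers given in the question. This is a sketch of the arithmetic only; the variable names are made up here.

```r
x  <- 1283      # adults who see rudeness as a worsening problem
n  <- 2013
p0 <- 0.75      # null value: three-quarters

p_hat <- x / n
z     <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

c(p_hat = p_hat, z = z)
```

The sample proportion is below 0.75, so z is negative and these data give no support to the claim that more than three-quarters hold this belief.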
Ch 7, Parachute null hypothesis, Ok117, STT.08.01.040
A parachute manufacturer is concerned that the failure rate of 0.1% advertised by his company may in fact be higher. What is the null hypothesis for the test he would run to address his worries?
A H0 : µ = 0.001
B H0 : p > 0.001
C H0 : µ < 0.001
D H0 : p = 0.001
Answer: Ch 7, Parachute null hypothesis, Ok117, STT.08.01.040
Answer: (d).
Ch 7, Parachute conclusion, Ok117, STT.08.01.050
A parachute manufacturer is concerned that the failure rate of 0.1% advertised by his company may in fact be higher. A hypothesis test was run and the result was a P-value of 0.03333. The most likely conclusion the manufacturer might make is:
A My parachutes are safer than I claim.
B My parachutes are not as safe as I claim them to be.
C I can make no assumption of safety based on a statistical test.
D The probability of a parachute failure is 0.03333.
E Both (b) and (d) are true.
Answer: Ch 7, Parachute conclusion, Ok117, STT.08.01.050
Answer: (b).
Ch 7, Parachute p-value, Ok119, STT.08.01.060
To explain the meaning of a P-value of 0.033, you could say:
A There is approximately a 96.7% chance of obtaining my sample results.
B Assuming the null hypothesis is accurate, results like those found in my sample should occur only 3.3% of the time.
C We can’t say anything for sure without knowing the sample results.
D There is approximately a 3.3% chance of obtaining my sample results.
Answer: Ch 7, Parachute p-value, Ok119, STT.08.01.060
Answer: (b).
Ch 7, Excess successes, Ok125, STT.05.01.030
In 1938, Duke University researchers Pratt and Woodruff conducted an experiment looking for evidence of ESP (extrasensory perception). In the experiment, students were presented with five standard ESP symbols (square, wavy lines, circle, star, cross). The experimenter shuffled a deck of ESP cards, each of which had one of the five symbols on it. The experimenter drew a card from this deck, looked at it, and concentrated on the symbol on the card. The student would then guess the symbol, perhaps by reading the experimenter’s mind. This experiment was repeated with 32 students for a total of 60,000 trials. The students were correct 12,489 times.
If the students were selecting one of the five symbols at random, the probability of success would be p = 0.2 and we would expect the students to be correct 12,000 times out of 60,000. Should we write off the observed excess of 489 as nothing more than random variation?
A Yes
B No
Answer: Ch 7, Excess successes, Ok125, STT.05.01.030
Answer: (b). The Central Limit Theorem gives us that if X ∼ Bin(n, p), then X is approximately normal with the same mean and standard deviation. This fact can be used to compute P(X ≥ 12489), which turns out to be a very small number.
binom.test(x = 12489, n = 60000, p = 0.2, alternative = "two.sided")
##
## Exact binomial test
##
## data: 12489 and 60000
## number of successes = 12489, number of trials = 60000, p-value
## = 6.85e-07
## alternative hypothesis: true probability of success is not equal to 0.2
## 95 percent confidence interval:
## 0.2049 0.2114
## sample estimates:
## probability of success
## 0.2082
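The normal approximation mentioned in the answer can be carried out in a few lines. This sketch omits a continuity correction, which is a simplification chosen here rather than something the slides specify.

```r
n     <- 60000
p0    <- 0.2
x_obs <- 12489

mu    <- n * p0                      # expected successes under guessing: 12000
sigma <- sqrt(n * p0 * (1 - p0))     # binomial standard deviation

z       <- (x_obs - mu) / sigma      # excess of 489 in standard deviation units
p_upper <- pnorm(z, lower.tail = FALSE)   # approximate P(X >= 12489)

c(z = z, p_upper = p_upper)
```

The excess of 489 is about five standard deviations above what random guessing predicts, which is why the upper-tail probability is so small.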
Ch 7, Comparing two proportions, Ok125, STT.08.02.010
A two-proportion z-interval was constructed for the difference in the two population proportions, p1 and p2. The resulting 99% confidence interval was (−0.004, 0.12). A conclusion that could be drawn is:
A There is no significant difference between p1 and p2.
B There is a significant difference between p1 and p2.
C The range of possible differences between the two proportions could be from a 0.4% difference with p2 being larger up to a 12% difference with p1 being larger.
D Both (a) and (c) are correct.
E Both (b) and (c) are correct.
Answer: Ch 7, Comparing two proportions, Ok125, STT.08.02.010
Answer: (a). Answer (c) is almost correct, but we could only say that with 99% confidence.
Ch 8, Correlation coefficients, Ok STT.02.02.010
The scatterplots below display three bivariate data sets. The correlation coefficients for these data sets are 0.03, 0.68, and 0.89. Which scatter plot corresponds to the data set with r = 0.03?
A Plot 1
B Plot 2
C Plot 3
Answer: Ch 8, Correlation coefficients, Ok STT.02.02.010
Answer: (b).
Ch 8, Strong correlation, Ok STT.02.02.020
Joe Bob found a strong correlation in an empirical study showing that individuals’ physical ability decreased significantly with age. Which numerical result below best describes this situation?
A −1.2
B −1.0
C −0.8
D +0.8
E +1.0
F +1.2
Answer: Ch 8, Strong correlation, Ok STT.02.02.020
Answer: (c).
(A), (F) The range of the correlation coefficient is −1 ≤ r ≤ 1.
(B) This is a perfect negative correlation, which is unlikely to happen with empirical data.
(C)* correct: The problem statement assumes increasing age, so the best answer is a strong, negative correlation.
(D) Although this correlation is strong, it is also positive, whereas the problem statement implies that the correlation should be negative.
(E) This correlation is both perfect (unlikely with empirical data) and positive (whereas the problem statement implies that the correlation should be negative).
Ch 8, Coney Island, Ok STT.02.02.050
A researcher found that r = +.92 between the high temperature of the day and the number of ice cream cones sold at Coney Island. This result tells us that
A high temperatures cause people to buy ice cream.
B buying ice cream causes the temperature to go up.
C some extraneous variable causes both high temperatures and high ice cream sales.
D temperature and ice cream sales have a strong positive linear relationship.
Answer: Ch 8, Coney Island, Ok STT.02.02.050
Answer: (d).
(A) This claim may be true, but correlation tells us only about the strength and direction of a relationship, not about the cause-effect aspect of the relationship.
(B) Correlation does not imply causation, in either direction.
(C) Correlation does not imply the existence of a lurking variable.
(D)* correct: A correlation of r = +.92 implies a strong, positive, linear relationship.
Ch 8, Equation, Ok STT.02.03.010
A store manager conducted an experiment in which he systematically varied the width of a display for toothpaste from 3 ft. to 6 ft. and recorded the corresponding number of tubes of toothpaste sold per day. The data were used to fit a regression line, which was
tubes sold per day = 20 + 10(display width)
What is the predicted number of tubes sold per day for a display width of 12 feet?
A 120
B 140
C It would be unwise to use the regression line to make a prediction for a display width of 12 ft.
Answer: Ch 8, Equation, Ok STT.02.03.010
Answer: (c).
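The reason behind (c) is extrapolation: 12 ft lies outside the 3 ft to 6 ft range in which the line was fitted. A minimal sketch with a hypothetical helper, `predict_tubes()`, that encodes the fitted line from the question:

```r
# Fitted line from the question: tubes sold per day = 20 + 10 * (display width)
predict_tubes <- function(width_ft) 20 + 10 * width_ft

studied_range <- c(3, 6)    # display widths used in the experiment (ft)
new_width     <- 12

prediction       <- predict_tubes(new_width)
is_extrapolation <- new_width < studied_range[1] || new_width > studied_range[2]

prediction          # the arithmetic gives 140
is_extrapolation    # TRUE: 12 ft is outside the studied widths
```

The equation returns 140, but nothing in the data shows that the linear pattern continues to a display twice as wide as the widest one tested.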
Ch 8, Salaries, coefficient of determination, Ok STT.02.02.060
The salary and the number of years of teaching experience were recorded for 20 social studies teachers in rural west Texas. When the data points were plotted, there was a roughly linear relationship and a positive correlation between salary and number of years of teaching experience, with r = 0.8. What percentage of the variation in the salaries is explained by the linear relationship between salary and years of service?
A 80%
B 64%
C 36%
D 20%
Answer: Ch 8, Salaries, coefficient of determination, Ok STT.02.02.060
Answer: (b).
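The answer uses the fact that, in simple linear regression, the proportion of variation explained equals r squared, so 0.8² = 0.64. The teacher salary data are not available here, so the sketch below checks the identity on R's built-in `cars` data set, which stands in purely for illustration.

```r
# r^2 for the question
0.8^2

# Check that R-squared equals the squared correlation in simple regression,
# using the built-in cars data (stopping distance vs speed)
fit       <- lm(dist ~ speed, data = cars)
r         <- cor(cars$speed, cars$dist)
r_squared <- summary(fit)$r.squared

c(r_squared_from_lm = r_squared, r_squared_from_cor = r^2)
```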
Ch 8, Outliers, Ok STT.02.04.040
Why is it important to look for outliers in data prior to applying regression?
A Outliers always affect the magnitude of the regression slope.
B Outliers are always bad data.
C Outliers should always be eliminated from the data set.
D Outliers should always be considered because of their potential influence.
E We shouldn’t look for outliers, because all the data must be analyzed.
Answer: Ch 8, Outliers, Ok STT.02.04.040
Answer: (d).
(A) Outliers don’t always affect the regression slope.
(B), (C) Outliers may be the data of most interest and are certainly not always bad data.
(D)* correct: Outliers should always be considered but are not always influential.
(E) Even if one analyzes all the data, one should be aware of outliers because of their impact.
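Points (A) and (D) can be illustrated with a toy example. All values below are made up: ten points lie exactly on y = 2x, and a single outlier is added in two different places.

```r
x <- 1:10
y <- 2 * x                                  # ten points exactly on a line with slope 2

slope <- function(x, y) coef(lm(y ~ x))[["x"]]

# Outlier in y located at the mean of x: the slope is unchanged
slope_mid <- slope(c(x, mean(x)), c(y, 50))

# Outlier far from the other x values: the slope changes drastically
slope_far <- slope(c(x, 20), c(y, 0))

c(original = slope(x, y), outlier_at_mean_x = slope_mid, outlier_far_in_x = slope_far)
```

The first outlier shifts only the intercept, while the second has high leverage and pulls the fitted slope toward zero, so an outlier's influence depends on where it sits.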