
Confidence Intervals for Two Proportions and One Sample Mean
Sections 7.3 & 7.4

Cathy Poliak, Ph.D.

Office hours: T Th 2:30 pm - 5:15 pm 620 PGH

Department of Mathematics
University of Houston

April 5, 2016


Outline

1 Beginning Questions

2 Comparing Two Proportions

3 Inference for Means

4 The T-distribution

5 Confidence Interval for Population Mean

6 Sample Size


Popper Set Up

Fill in all of the proper bubbles.

Use a #2 pencil.

This is popper number 17.


Items Used in Confidence Intervals

Point estimate

Confidence level

Critical value

Standard error

Margin of error = critical value × standard error

Interpretation: The confidence interval is

point estimate ± margin of error

We are C% confident that the population parameter is between (point estimate − margin of error) and (point estimate + margin of error).


Popper #17 Questions

For the Wisconsin State GOP primary election, a poll was conducted by the Marquette Law School (3/24 - 3/28). From a sample of 768 likely voters in the Republican primary, 40% said they would vote for Cruz, while 30% said they would vote for Trump, with a 5.8% margin of error.

1. Which statement is correct about the results of this poll?
a) Of all likely Republican primary voters, Cruz will win over Trump.
b) There is no statistical difference between who will vote for Cruz and who will vote for Trump.
c) 40% of all likely Republican primary voters will vote for Cruz.
d) 30% of all likely Republican primary voters will not vote at all.


Comparing Two Proportions

What is the difference between the proportion of m&ms that are blue in the plain m&ms compared to the peanut m&ms?

From a random sample of plain m&ms and peanut m&ms we get the following results.

Candy type   n     Number of blue   Sample proportion (p̂)
plain        81    28               p̂plain = 28/81 = 0.3457
peanut       100   20               p̂peanut = 20/100 = 0.2

We want to know the difference between the proportions of m&ms that are blue among all plain and all peanut m&ms. That is, estimate:

ppeanut − pplain


Two-sample problems assumptions

The goal of inference is to compare the responses in two groups.

1. Each group is considered to be a simple random sample from two distinct populations.
2. The population sizes are both at least ten times the sizes of the samples.
3. The number of successes and failures in both samples must all be ≥ 10.


Confidence intervals for comparing two proportions

Choose an SRS of size n1 from a large population having proportion p1 of successes and an independent SRS of size n2 from another population having proportion p2 of successes.

1. Point estimate: $D = \hat{p}_1 - \hat{p}_2 = \frac{X_1}{n_1} - \frac{X_2}{n_2}$

2. Confidence level: C, a percentage given in the problem; if none is given, use 95%.

3. Critical value: z∗ is the value for which the area under the standard Normal density curve between −z∗ and z∗ is C.

4. Confidence interval:

$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$

5. Interpret
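As a rough check on the steps above, the interval can be computed directly from the formula in R. This is only a sketch using the m&m counts from the earlier table (28 blue out of 81 plain, 20 blue out of 100 peanut). The variable names are illustrative, and treating plain as group 1 is an assumption made here.

```r
# Counts from the m&m samples; group 1 = plain, group 2 = peanut (assumed ordering)
x1 <- 28; n1 <- 81
x2 <- 20; n2 <- 100

# Step 1: point estimate
p1_hat <- x1 / n1
p2_hat <- x2 / n2
point_est <- p1_hat - p2_hat

# Steps 2 and 3: confidence level and critical value
conf_level <- 0.95
z_star <- qnorm((1 + conf_level) / 2)

# Step 4: standard error, margin of error, and interval
std_err <- sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
margin <- z_star * std_err
ci <- c(lower = point_est - margin, upper = point_est + margin)
print(round(ci, 4))
```

With this ordering the interval estimates pplain − ppeanut; swapping the two groups only flips the signs of the endpoints.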


Determine a 95% confidence interval for the difference of the proportion of m&ms that are blue for all of plain and peanut m&ms.

From a random sample of plain m&ms and peanut m&ms we get the following results.

Candy type   n     Number of blue   Sample proportion (p̂)
plain        81    28               p̂plain = 28/81 = 0.3457
peanut       100   20               p̂peanut = 20/100 = 0.2


R code

prop.test(x=c(x1,x2),n=c(n1,n2),conf.level = C, correct = FALSE)

prop.test(x=c(28,20),n=c(81,100),conf.level = 0.95,correct=FALSE)

2-sample test for equality of proportions without continuity correction

data: c(28, 20) out of c(81, 100)
X-squared = 4.8738, df = 1, p-value = 0.02727
alternative hypothesis: two.sided
95 percent confidence interval:
0.01578192 0.27557610
sample estimates:
prop 1   prop 2
0.345679 0.200000


TI-83(84)

STAT → TESTS → B:2-PropZInt



This is from The Practice of Statistics for Business and Economics, 3rd ed., by Moore et al., p. 483. A Pew Internet Project Data Memo presented data comparing adult gamers with teen gamers with respect to the devices on which they play. The data are from two surveys. The adult survey had 1063 gamers, and the teen survey had 1064 gamers. The memo reports that 574 of adult gamers played on consoles (Xbox, PlayStation, Wii, etc.), and 947 of teen gamers played on game consoles.

2. Find the estimate of the difference between the proportion of teen gamers who played on game consoles and the proportion of adults who played on these devices. That is, find p̂teen − p̂adult.
a) 373 b) 1 c) 0.35 d) 0

3. Find the 95% confidence interval for the difference of the proportions.
a) (313, 387) b) (0.315, 0.385) c) (0.54, 0.89) d) (574, 947)



A coffee machine dispenses coffee into paper cups. Here are the amounts measured in a random sample of 20 cups.

9.9, 9.7, 10.0, 10.1, 9.9, 9.6, 9.8, 9.8, 10.0, 9.5,
9.7, 10.1, 9.9, 9.6, 10.2, 9.8, 10.0, 9.9, 9.5, 9.9

4. Determine the mean amount from these 20 cups.
a) 10 b) 9.845 c) 9 d) 0

5. Determine the standard deviation of the amount from these 20 cups.
a) 9.845 b) 0.1986 c) 3.137 d) 0

6. Are the mean and standard deviation you calculated parameters or statistics?
a) parameter b) statistic


Assumptions for Estimating the Population Mean

1. The sample must be a simple random sample (SRS).

2. The distribution of the population must be Normal. If it is not, the Central Limit Theorem implies that when the sample size is larger than 30 the sample mean is approximately Normally distributed.


Point estimate for µ

When estimating the population mean µ, the point estimate is the sample mean

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.


Standard Error

When the standard deviation of a statistic is estimated from the data, the result is called the standard error of the statistic.

The standard error of the sample mean is

$SE_{\bar{X}} = \frac{s}{\sqrt{n}}$

where s is the sample standard deviation computed from the data.

From our example: $SE_{\bar{X}} = \frac{0.1986}{\sqrt{20}} = 0.0444$.
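The standard error in this example can be reproduced in R. The sketch below assumes the 20 coffee amounts listed earlier in these slides; the variable names are illustrative.

```r
# Amounts (oz) from the 20 sampled cups
coffee <- c(9.9, 9.7, 10.0, 10.1, 9.9, 9.6, 9.8, 9.8, 10.0, 9.5,
            9.7, 10.1, 9.9, 9.6, 10.2, 9.8, 10.0, 9.9, 9.5, 9.9)

n <- length(coffee)
x_bar <- mean(coffee)   # point estimate of the population mean
s <- sd(coffee)         # sample standard deviation (divides by n - 1)
se <- s / sqrt(n)       # standard error of the sample mean

print(round(c(mean = x_bar, sd = s, se = se), 4))
```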


The T-distribution

The problem is that the sample standard deviation s varies from sample to sample.

William Gosset, a quality control engineer for the Guinness Brewery, discovered this problem and worked out a new distribution whose critical value changes with the sample size.

This new distribution is called Student's t distribution because Guinness would not allow Gosset to publish under his own name, so he published under the pseudonym "Student".

The shape of this distribution changes with the sample size, so it depends on a parameter called the degrees of freedom (df).

The degrees of freedom for the T-distribution of the sample mean is the sample size minus one (n − 1), because we are using the sample standard deviation

$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$.


T distribution

Used for inference about the population mean when the population standard deviation σ is unknown.

The distribution of the population is basically bell-shaped.

Formula for t:

$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

Use t-table, or qt(probability,df) in R.

Degrees of freedom: df = n − 1.


Normal Distribution vs T distribution

The red graph is the Normal density curve and the blue graph is the T density curve with 4 degrees of freedom.

[Figure: Normal and T density curves; horizontal axis x from −3 to 3, vertical axis density from 0.0 to 0.4.]
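A comparison like the one in this figure can be redrawn in R. This is a minimal sketch using base graphics, with the colors and the 4 degrees of freedom taken from the caption; the grid size and legend placement are arbitrary choices.

```r
# Grid of x values matching the horizontal axis of the figure
x <- seq(-3, 3, length.out = 200)

normal_density <- dnorm(x)     # standard Normal density
t_density <- dt(x, df = 4)     # t density with 4 degrees of freedom

plot(x, normal_density, type = "l", col = "red", ylab = "density")
lines(x, t_density, col = "blue")
legend("topright", legend = c("Normal", "t (df = 4)"),
       col = c("red", "blue"), lty = 1)
```

The t curve is lower at the center and higher in the tails, which is why t critical values are larger than the corresponding z critical values.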


Using T-table

The top margin is the area in the right tail.

The left margin is the degrees of freedom n − 1.

The values inside the table are the t values.


Critical value σ unknown

When σ is unknown we use t-distribution.

With degrees of freedom, df = n − 1.

The critical value is t∗, where the area between −t∗ and +t∗ under the T-curve is the confidence level C = 1 − α.

t∗ is found in the T-table using the row for the degrees of freedom and the column for the confidence level at the bottom of the table.

In R use qt((1 + C)/2, df).
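For instance, the qt call can be tried with a 90% confidence level and a sample of size 8. Both values are chosen here purely for illustration, and the Normal critical value is shown alongside for comparison.

```r
conf_level <- 0.90   # illustrative confidence level
n <- 8               # illustrative sample size
df <- n - 1          # degrees of freedom

t_star <- qt((1 + conf_level) / 2, df)    # t critical value
z_star <- qnorm((1 + conf_level) / 2)     # Normal critical value, for comparison

print(round(c(t_star = t_star, z_star = z_star), 3))
```

The argument (1 + C)/2 is the area to the left of t∗: the middle area C plus the left tail of (1 − C)/2.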


Critical value for µ with σ known

If σ is known, the critical value is z∗, where the area under the Normal curve between −z∗ and +z∗ is the confidence level C = 1 − α. This critical value is found at the bottom of the T-table.

The following table gives the common confidence levels with their z-scores.

C    80%    90%     95%    99%
z∗   1.28   1.645   1.96   2.576

In R qnorm((1 + C)/2).
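The table of common critical values can be regenerated with that qnorm call. The short sketch below applies it to the four confidence levels listed; the variable names are illustrative.

```r
conf_levels <- c(0.80, 0.90, 0.95, 0.99)

# qnorm is vectorized, so one call gives all four critical values
z_star <- qnorm((1 + conf_levels) / 2)

print(data.frame(C = conf_levels, z_star = round(z_star, 3)))
```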


Margin of Error

The margin of error is

m = critical value × standard error

If σ is known, then the margin of error for estimating the mean µ is

$m = z^* \times \frac{\sigma}{\sqrt{n}}$

If σ is unknown, then the margin of error for estimating the mean µ is

$m = t^* \times \frac{s}{\sqrt{n}}$

With df = n − 1.


What is the mean population monthly cell phone bill?

A survey taken in 2010 polled 400 randomly chosen cell phone users. They answered the question: "What is your average monthly cell phone bill?"

The following are the characteristics of the sample:

- The sample mean is x̄ = $71.
- Assume the population standard deviation to be σ = $20.
- The sample size is n = 400.

Determine a 96.5% confidence interval.
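One way to work this example in R is sketched below. Because σ is assumed known, it uses the z critical value from qnorm; the variable names are illustrative.

```r
x_bar <- 71          # sample mean ($)
sigma <- 20          # assumed population standard deviation ($)
n <- 400             # sample size
conf_level <- 0.965

z_star <- qnorm((1 + conf_level) / 2)   # critical value for 96.5% confidence
margin <- z_star * sigma / sqrt(n)      # margin of error
ci <- c(lower = x_bar - margin, upper = x_bar + margin)

print(round(c(z_star = z_star, margin = margin, ci), 2))
```

Here σ/√n = 20/20 = 1, so the margin of error equals z∗ itself, a little over $2.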


Snickers bars

Suppose your class is investigating the weights of Snickers 1-ounce fun-size candy bars to see if the customers are getting full value for their money. Assume the weights are Normally distributed. Several candy bars are randomly selected and weighed with sensitive balances borrowed from the physics lab. The weights are:

0.95 1.02 0.98 0.97 1.05 1.01 0.98 1.00

We want to determine a 90% confidence interval for the true mean weight of these candy bars.


Using R if Given the Data

R code: t.test(name of x, conf.level = C)

> snickers <- c(0.95,1.02,0.98,0.97,1.05,1.01,0.98,1.00)
> t.test(snickers, conf.level=0.9)

One Sample t-test

data: snickers
t = 88.996, df = 7, p-value = 5.957e-12
alternative hypothesis: true mean is not equal to 0
90 percent confidence interval:
0.973818 1.016182
sample estimates:
mean of x
0.995
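The interval reported by t.test can also be rebuilt from the margin of error formula m = t∗ × s/√n. The following is a sketch of that hand calculation on the same Snickers data; the variable names are illustrative.

```r
snickers <- c(0.95, 1.02, 0.98, 0.97, 1.05, 1.01, 0.98, 1.00)

n <- length(snickers)
x_bar <- mean(snickers)                    # point estimate
se <- sd(snickers) / sqrt(n)               # standard error of the mean
t_star <- qt((1 + 0.90) / 2, df = n - 1)   # critical value, df = 7

ci <- c(lower = x_bar - t_star * se, upper = x_bar + t_star * se)
print(round(ci, 6))
```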


TI-83(84)

1. STAT, Edit, enter the data into L1.
2. STAT → TESTS
3. 7:ZInterval if we are given the population standard deviation σ, otherwise use 8:TInterval.





A coffee machine dispenses coffee into paper cups. From a simple random sample of 20 cups we found the mean to be x̄ = 9.845 oz. and the sample standard deviation to be s = 0.1986.

7. Determine a 99% confidence interval for the mean amount of coffee dispensed from this machine.
a) (9.7306, 9.9594)
b) (9.718, 9.972)
c) (9.762, 9.928)
d) (9.277, 10.413)


Choosing Sample Size

You can have both a high confidence level and a small margin of error by taking enough observations.

The confidence interval for a population mean will have a specified margin of error m when the sample size is

$n = \left(\frac{z^* \times \sigma}{m}\right)^2$

rounded up to the next whole number.


Starting Salary

We want to estimate annual starting salaries for college graduates. To determine this we need a sample.

- Assume that a 95% confidence interval estimate of the population mean annual starting salary is desired.
- Assume the standard deviation is σ = $7,500.
- How large a sample should be taken if the desired margin of error is m = $500?
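The sample size formula can be applied to this example in R. The sketch below takes σ and m from the problem statement; the variable names are illustrative.

```r
sigma <- 7500        # assumed population standard deviation ($)
m <- 500             # desired margin of error ($)
conf_level <- 0.95

z_star <- qnorm((1 + conf_level) / 2)
n_exact <- (z_star * sigma / m)^2   # value of the formula before rounding
n_needed <- ceiling(n_exact)        # round up to the next whole number

print(c(n_exact = n_exact, n_needed = n_needed))
```

Rounding up rather than to the nearest integer ensures the margin of error is no larger than m. Using the table value z∗ = 1.96 in place of qnorm leads to the same sample size here.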



8. Given σ = 7500 and a 95% confidence level, what should the sample size be if we want the margin of error to be m = $100?
a) 7125
b) 147
c) 21,609
d) 21,610

