L8 Estimate 2014


  • 1

    Chapter 8

    Confidence Interval

    Estimation

    David Chow

    Oct 2014

  • 2

    Learning Objectives

    To construct and interpret confidence interval estimates for the mean and the proportion.

    To determine the necessary sample size for a confidence interval.

    Section 8.5: Applications in Auditing (NOT covered)

  • 3

    Basic Concepts

    A point estimate is a single number

    Eg: For the population mean (μ), a point estimate is ____

    A confidence interval is an interval estimate. It provides additional information about variability.

    Eg: Giant pandas mean age = 22 yrs old

    [Figure: number line with the point estimate at the center, the lower and upper confidence limits on either side, and the width of the confidence interval spanning the two limits]

    Eg: According to a survey, the 95% confidence interval for the mean wage of private tutoring is $110 to $150 per hour

    I.e., μ = $130 ± $20

  • 4

    Basic Concepts

    The general formula for all confidence intervals (C.I.) is:

    Point Estimate ± Margin of Error, where

    Margin of Error (e) = (Critical Value) × (Standard Error)

    Critical values (Z) are related to the level of confidence (1 − α, also called the confidence level).

    Eg: 95% confidence: (1 − α) = 0.95, or α = 5%.

    With a given α, critical values can be obtained from the Z-table.

    Then, a C.I. can be computed.
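As a rough illustration of how a critical value follows from the confidence level, the sketch below uses Python's standard-library `statistics.NormalDist` in place of a printed Z-table. The helper name `z_critical` is made up for this example and is not part of the lecture.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-tailed critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1.0 - confidence
    # Leave alpha/2 in the upper tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> Z = {z_critical(level):.3f}")
```

The three values agree with the usual table entries of about 1.645, 1.960 and 2.576.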

  • 5

    Remarks

    This chapter focuses on two parameters, μ and π

    Let's start with the easiest case: estimating μ with a known population standard deviation (σ)

    A more realistic case (σ unknown) follows

    Concepts versus Computation:

    As always, statistical concepts can be a bit abstract at first, but computations have standard steps to follow

    We will work out a few examples and master the computations first, then go back to think about the rationale and interpretation behind the math

  • 6

    Estimating μ (σ Known)

  • 7

    Confidence Interval for μ (σ Known)

    Assume the population standard deviation σ is known.

    Also assume n is large enough (n > 30), or the population is normally distributed. Such assumptions ensure ____.

    A two-tailed confidence interval estimate:

    $$\bar{X} \pm Z\,\frac{\sigma}{\sqrt{n}}$$

    Z, also written as $Z_{\alpha/2}$, is the standardized normal distribution critical value for a probability of ____ in each tail.

  • 8

    Critical Values of Z

    Consider a 95% confidence interval:

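    1 − α = 0.95, so α/2 = 0.025 in each tail

    [Figure: standard normal curve centered at 0 (the point estimate), with critical values Z = −1.96 at the lower confidence limit and Z = 1.96 at the upper confidence limit; the central area is 0.95 and each tail area is 0.025]

    Find the critical values $Z_{0.05}$ and $Z_{0.005}$.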

  • 9

    Eg: Length of A4 Paper

    A paper producer wants to check if the paper produced has the correct mean length of 11 inches

    Find the 95% confidence interval of the population mean paper length based on a sample of 100, with sample mean x̄ = 10.998 in

    σ is known to be 0.02 in

  • 10

    Eg: Length of A4 Paper

    The 95% confidence interval is given by:

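    $$\mu = \bar{x} \pm Z_{\alpha/2}\,\sigma_{\bar{x}}, \quad \text{where } \sigma_{\bar{x}} = \sigma/\sqrt{n}$$

    Step 1: Find $Z_{0.025} = 1.96$

    Step 2: $Z_{\alpha/2}\,\sigma_{\bar{x}} = 1.96\,(0.02)/\sqrt{100} = 1.96\,(0.02)/10 = 0.00392$

    The required confidence interval is:

    μ = 10.998 ± 0.00392 inches, or

    10.99408 < μ < 11.00192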

    Find the 99% interval. What is the effect of raising the confidence level?
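The A4 paper calculation can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (σ known, large n); the function name `z_interval` is invented here, and the critical value comes from `statistics.NormalDist` rather than a table, so the last digits may differ slightly from hand calculations with Z = 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """CI for the mean when the population standard deviation is known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / sqrt(n)  # critical value x standard error
    return x_bar - margin, x_bar + margin


# Figures from the A4 paper example: n = 100, sample mean 10.998 in, sigma 0.02 in.
for level in (0.95, 0.99):
    low, high = z_interval(10.998, 0.02, 100, level)
    print(f"{level:.0%}: ({low:.5f}, {high:.5f}), width = {high - low:.5f}")
```

The 99% interval comes out wider than the 95% one, because the critical value rises from about 1.96 to about 2.576 while the standard error stays the same.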

  • 11

    Eg: Mean Resistance

    A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. Past testing shows that the population standard deviation is 0.35 ohms.

    Determine a 95% confidence interval for the true mean resistance of the population.

    ANSWER

    $$\bar{X} \pm Z_{0.025}\,\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068 = (1.9932,\ 2.4068)$$

    We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms

    I.e., 95% of intervals formed in this manner will contain the true population mean.

    Is it correct to use the Z-distribution?

  • 12

    Recap: Choosing Confidence Level

    A higher confidence level raises the confidence (of the interval containing the true mean), but it also widens the interval

    A wider interval estimate means ____ precision

    95% is the most common choice

    It provides a good balance between precision and confidence

  • 13

    Example: Body Temperature

    n = 106, x̄ = 98.20°F, σ = 0.62°F

    1. Find the 95% confidence interval

    2. How to obtain a narrower interval estimate?

    ANSWER

    1. Margin of error = ____ = 0.12

    CI: 98.08 to 98.32

    2. Smaller σ, bigger n, or smaller (1 − α)
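To see how each of the three factors narrows the interval, the sketch below recomputes the margin of error for the body temperature data while changing one input at a time. The alternative values (σ = 0.31, n = 424, 90% confidence) are arbitrary choices for illustration, and `margin_of_error` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Margin of error for a mean with known sigma: Z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return z * sigma / sqrt(n)


base = margin_of_error(sigma=0.62, n=106, confidence=0.95)
print(f"baseline margin:      {base:.3f}")

# Change one factor at a time; the alternative values are arbitrary illustrations.
print(f"smaller sigma (0.31): {margin_of_error(0.31, 106, 0.95):.3f}")
print(f"bigger n (424):       {margin_of_error(0.62, 424, 0.95):.3f}")
print(f"lower level (90%):    {margin_of_error(0.62, 106, 0.90):.3f}")
```

Halving σ or quadrupling n both halve the margin, since the margin is proportional to σ/√n.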

  • 14

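    Confidence Intervals and Sampling Distribution

    Sampling Distribution

    [Figure: sampling distribution of x̄ with central area 1 − α and area α/2 in each tail; below it, confidence intervals built around different sample means x̄₁, x̄₂, ...]

    Confidence Intervals

    Intervals extend from

    $$\bar{X} - Z\,\frac{\sigma}{\sqrt{n}} \quad \text{to} \quad \bar{X} + Z\,\frac{\sigma}{\sqrt{n}}$$

    (1 − α) × 100% of intervals constructed contain μ;

    α × 100% do not.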

  • 15

    Interpreting Confidence Level

    Suppose we select many different samples of size n from a population.

    A 95% confidence interval is constructed for each sample.

    Then 95% of those interval estimates would actually contain the true value of μ.
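The repeated-sampling interpretation can be checked with a small simulation. The sketch below is only an illustration: the population (normal, mean 100, standard deviation 15), the sample size of 25 and the number of trials are all made-up values, and `coverage` is a hypothetical helper.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated Z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


# Hypothetical population: mean 100, standard deviation 15, samples of size 25.
rate = coverage(mu=100.0, sigma=15.0, n=25, confidence=0.95, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to 0.95, though not exactly on it, because the simulation itself is subject to sampling variation.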

  • 16

    Estimating μ (σ Unknown)

  • 17

    Confidence Interval for μ (σ Unknown)

    Usually σ is unknown

    Use the sample standard deviation S instead

    This introduces extra uncertainty because S varies from sample to sample

    So another distribution (the t distribution) is used

    It is flatter than the standard normal distribution

    The t distribution requires that the original population is normally distributed

    This is assumed in most cases

    Strictly speaking, this assumption should be checked first

  • 18

    Confidence Interval for μ (σ Unknown)

    With an unknown σ, you need to be sure that

    (1) the sample size is large enough (n ≥ 30), or

    (2) the population is normal

    Such assumptions enable the use of Student's t distribution.

    Confidence Interval Estimate:

    $$\bar{X} \pm t_{n-1}\,\frac{S}{\sqrt{n}}$$

    where t, also written as $t_{\alpha/2,\,n-1}$, is the critical value of the t distribution with n − 1 degrees of freedom and an area of α/2 in each tail.

  • 19

    Critical Values of t

    The critical value of t is characterized by two elements:

    The confidence level (1 − α), and

    The degrees of freedom (df).

    What is d.f.?

    It is the number of observations that are free to vary after the sample mean has been calculated.

    In this section, df = n-1.

  • 20

    Degrees of Freedom

    Eg: Suppose the mean of 3 numbers is 8.0

    Let X1 = 7, X2 = 8

    What is X3?

    Given a mean value of 8.0, X3 must be 9 (i.e., X3 is not free to vary)

    Here, n = 3, so degrees of freedom = n − 1 = 3 − 1 = 2

    In this example d.f. = 2. What does it mean?

    You are free to choose 2 values (X1 and X2), but the third is set for a given mean.
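The same arithmetic can be written as a tiny Python sketch: once the mean and the first n − 1 values are fixed, the last value is forced. The helper name `last_value` is invented for this illustration.

```python
def last_value(mean, free_values):
    """Value forced on the final observation once the mean is fixed."""
    n = len(free_values) + 1
    # The n values must sum to n * mean, so the last one is whatever is left.
    return mean * n - sum(free_values)


x3 = last_value(8.0, [7, 8])
print(x3)  # 9.0
```

Any other choice of X1 and X2 would give a different X3, but it would still be determined, which is why only two values are free.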

  • 21

    Degrees of Freedom

    [Figure: density curves centered at 0 for t (df = 5), t (df = 13), and the standard normal (t distribution with df = ∞)]

    t-distributions are bell-shaped and symmetric, but have fatter tails than Z

    Note: t → Z as n increases

  • 22

    Critical Values of t

    Upper Tail Area

    df    .25     .10     .05
    1     1.000   3.078   6.314
    20    0.687   1.325   1.724
    21    0.686   1.323   1.721

    The body of the table contains t values, not ____

    Suppose n = 21, and α = 0.10.

    Then df = ____,

    upper-tail area = ____

    [Figure: t distribution centered at 0 with d.f. = 20; the upper-tail area α/2 = 0.05 lies to the right of t = 1.724]

  • 23

    Eg: Mean Age of Retirement

    A random sample of 25 retirees has mean age = 50 and std = 8. Find the 95% confidence interval for μ.

    Must assume a normal population.

    From the t-table, $t_{0.025,\,24} = 2.0639$

    The confidence interval is

    $$\bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} = 50 \pm (2.0639)\,\frac{8}{\sqrt{25}}$$

    (46.698, 53.302)
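A minimal Python sketch of this calculation follows. Python's standard library has no t-distribution quantile function, so the critical value is passed in as read from the t-table; `t_interval` is a hypothetical helper name.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """CI for the mean with sigma unknown; t_crit is read from a t-table (df = n - 1)."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Retirement example: n = 25, mean 50, S = 8, t_{0.025, 24} = 2.0639 from the table.
low, high = t_interval(50, 8, 25, 2.0639)
print(f"({low:.3f}, {high:.3f})")
```

Passing the table value explicitly keeps the example dependency-free; a statistics package that provides t quantiles could supply `t_crit` instead.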

  • 24

    Eg: Heating Oil Consumption

    A random sample of 35 households has mean consumption of heating oil x̄ = 1122.75 gallons, and S = 295.72 gallons.

    Find the 95% confidence interval for μ.

    ANSWER

    The critical value is $t_{0.025,\,34} = 2.0322$.

    μ = 1122.75 ± 101.58 gallons.

    Based on the sample evidence, we are 95% confident that the interval 1122.75 ± 101.58 gallons covers the population mean.

    NOTE: Z or t?

    If n ≥ 30, it is commonly acceptable to use Z (instead of t) as an approximation.

    But if you can find a more precise answer (using t-values), why not?
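To make the Z-versus-t note concrete, the sketch below computes both margins of error for the heating oil sample. The t critical value is the table value quoted in the answer; the Z value comes from `statistics.NormalDist`. Variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 35, 1122.75, 295.72
std_error = s / sqrt(n)

t_crit = 2.0322                       # t_{0.025, 34}, taken from the t-table
z_crit = NormalDist().inv_cdf(0.975)  # Z_{0.025}, about 1.96

t_margin = t_crit * std_error
z_margin = z_crit * std_error
print(f"t margin: {t_margin:.2f} gallons")
print(f"Z margin: {z_margin:.2f} gallons")
```

The Z-based margin is a few gallons smaller, so the Z approximation gives a slightly narrower interval than the t interval at this sample size.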

  • 25

    Estimating

    Population Proportion

  • 26

    Confidence Intervals for the

    Population Proportion

    Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation

    $$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$$

    We will estimate this with sample data:

    $$\sqrt{\frac{p(1-p)}{n}}$$

  • 27

    Confidence Intervals for the

    Population Proportion

    The confidence interval for the population proportion is given by:

    $$p \pm Z\,\sqrt{\frac{p(1-p)}{n}}$$

    where

    Z = critical Z-value given the level of confidence

    p = sample proportion

    n = sample size

    Such an interval estimate for π is based on a point estimate (p), plus an allowance for uncertainty arising from sampling

  • 28

    Example: Vegetarians

    1. A random sample of 100 people shows that 25 of them are vegetarians. Form a 95% confidence interval for the true proportion of vegetarians in the population.

    2. Compute the 95% confidence interval if n = 1000.

    $$p \pm Z\,\sqrt{p(1-p)/n} = 25/100 \pm 1.96\,\sqrt{0.25(0.75)/100} = 0.25 \pm 1.96\,(0.0433) = (0.1651,\ 0.3349)$$

    Interpretation

    95% of intervals formed from samples of size 100 in this manner will cover the true proportion
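Both parts of the vegetarian example can be sketched in Python as below. For part 2 the code assumes the sample proportion stays at 0.25 (that is, 250 vegetarians out of 1000), which the slide does not state explicitly; `proportion_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample CI for a population proportion: p +/- Z * sqrt(p(1-p)/n)."""
    p = successes / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sqrt(p * (1.0 - p) / n)
    return p - margin, p + margin


# Part 1: 25 vegetarians out of 100.
print(proportion_interval(25, 100, 0.95))
# Part 2: assumes the sample proportion stays at 0.25, i.e. 250 out of 1000.
print(proportion_interval(250, 1000, 0.95))
```

With ten times the sample size the standard error shrinks by a factor of √10, so the second interval is roughly a third as wide as the first.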

  • 29

    Sample Size

    Determination

  • 30

    Sample Size Determination

    Recall that sample size (n) affects the margin of error (e, also called sampling error), where

    $$e = Z\,\frac{\sigma}{\sqrt{n}}$$

    If e is set before conducting a survey, this equation helps you determine the sample size for a pre-set value of e (the acceptable level of error):

    $$n = \frac{Z^2\sigma^2}{e^2}$$

  • 31

    Sample Size Determination

    If σ = 45, what sample size is needed to estimate the mean within ±5 with 90% confidence?

    $$n = \frac{Z^2\sigma^2}{e^2} = \frac{(1.645)^2(45)^2}{5^2} = 219.19$$

    Round up to the next integer to get the required sample size n = 220

  • 32

    Eg: A4 Paper Again

    In the paper manufacturer example, σ = 0.02, n = 100, and the 95% interval estimate is μ = 10.998 ± 0.00392 inches.

    Suppose the manufacturer wants to limit the error to 0.003 by choosing a larger sample. What is n?

    ANSWER

    $$n = \frac{Z^2\sigma^2}{e^2} = \frac{(1.96)^2(0.02)^2}{(0.003)^2} = 170.7$$

    The required sample size is n = 171.
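The two sample-size calculations for the mean can be reproduced with a short Python sketch. The helper name `sample_size_for_mean` is invented here, and the critical values are the rounded table values used on the slides.

```python
from math import ceil


def sample_size_for_mean(z, sigma, e):
    """Smallest n whose margin of error Z * sigma / sqrt(n) does not exceed e."""
    return ceil((z * sigma / e) ** 2)


print(sample_size_for_mean(z=1.645, sigma=45, e=5))       # 220
print(sample_size_for_mean(z=1.96, sigma=0.02, e=0.003))  # 171
```

Rounding up rather than to the nearest integer matters: a smaller n would leave the margin of error slightly above the target e.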

  • 33

    Sample Size Determination

    To determine the required sample size for the proportion, you must know:

    The critical value Z (from a confidence level of 1 − α),

    The acceptable sampling error (e), and

    The true proportion π.

    If π is unknown, use the sample value p, or set π = 0.50.

    $$e = Z\,\sqrt{\frac{\pi(1-\pi)}{n}}$$

    Now solve for n to get

    $$n = \frac{Z^2\,\pi(1-\pi)}{e^2}$$

  • 34

    Eg: Quality Control

    Out of a population of 1,000 light bulbs, we randomly selected 100, of which 30 were defective. What sample size is needed to be within ±0.05 with 90% confidence?

    (a) Since the true population proportion is unknown, use the sample value here.

    (b) Now, set π = 0.50 and compare the result with (a).

    ANSWER

    (a)

    $$n = \frac{Z^2\,p(1-p)}{e^2} = \frac{(1.645)^2(0.3)(0.7)}{(0.05)^2} = 227.3 \rightarrow 228$$

    (b) The required sample size increases to 271.

    NOTE: The product π(1 − π) ranges from 0 to 0.25. By assuming a value of 0.25, we are in fact playing safe by sampling more than necessary.
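Parts (a) and (b) can be compared side by side with the sketch below. The function name `sample_size_for_proportion` is hypothetical, and Z = 1.645 is the rounded table value for 90% confidence.

```python
from math import ceil


def sample_size_for_proportion(z, pi, e):
    """Smallest n with Z * sqrt(pi * (1 - pi) / n) <= e."""
    return ceil(z ** 2 * pi * (1.0 - pi) / e ** 2)


print(sample_size_for_proportion(z=1.645, pi=0.30, e=0.05))  # 228
print(sample_size_for_proportion(z=1.645, pi=0.50, e=0.05))  # 271
```

Setting π = 0.50 maximizes π(1 − π) at 0.25, so it gives the largest, most conservative sample size for a given Z and e.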

  • 35

    More on the

    t Distribution

  • 36

    t Distribution

    The t distribution is a family of probability distributions. It is bell-shaped, symmetric, and flatter than the Z distribution.

    A specific t distribution depends on a parameter known as the degrees of freedom (d.f.).

    Degrees of freedom refer to the number of independent pieces of information that go into the computation of s.

  • 37

    A t distribution with more degrees of freedom has ____ dispersion. As the number of d.f. increases, the difference between the t distribution and the Z distribution becomes smaller and smaller.

  • 38

    t Distribution

    Degrees of    Area in Upper Tail
    Freedom       .20     .10     .05     .025    .01     .005
    .             .       .       .       .       .       .
    50            .849    1.299   1.676   2.009   2.403   2.678
    60            .848    1.296   1.671   2.000   2.390   2.660
    80            .846    1.292   1.664   1.990   2.374   2.639
    100           .845    1.290   1.660   1.984   2.364   2.626
    ∞             .842    1.282   1.645   1.960   2.326   2.576

    Look familiar? The values in the last row (d.f. = ∞) are ____.

    What is this 2.009?

  • Review Questions

    1. A population has a standard deviation of 50. A random sample of 100 from this population is selected, and the sample mean is 600. At 95% confidence, the margin of error is ____

    2. As the number of degrees of freedom for a t distribution ____, the difference between the t distribution and the standard normal distribution becomes smaller

    3. For the interval estimation of μ when σ is known and the sample is large, the proper distribution to use is ____

    ANSWER

    1. 9.8

    2. Increases

    3. The normal distribution

  • Review Questions

    4. The t value for a 95% confidence interval estimation with 24 degrees of freedom is ____

    5. A 95% confidence interval for a population mean is determined to be 100 to 120. If the confidence coefficient is reduced to 0.90, the interval for μ

    a. becomes narrower

    b. becomes wider

    c. does not change

    d. becomes 0.1

    6. In a random sample of 144 observations, sample proportion p = 0.6. The 95% confidence interval for π is

    a. 0.52 to 0.68

    b. 0.144 to 0.200

    c. 0.60 to 0.70

    d. 0.50 to 0.70

    ANSWER

    4. 2.064

    5. A

    6. A