10. Parameter Estimation


  • 7/28/2019 10. Parameter Estimation

    PARAMETER ESTIMATION

    Parameters

    Populations are described by their probability distributions and/or parameters.

    For quantitative populations, the location and spread are described by μ and σ.

    Binomial populations are determined by a single parameter, p.

    If the values of parameters are unknown, we make inferences about them using sample information.

    Statistical inference is the process by which we acquire information about populations from samples.

    There are two types of inference:

    Estimation

    Hypothesis testing

    Types of Inference

    Estimation: Estimating or predicting the value of the parameter. What is (are) the most likely value(s) of μ or p?

    Hypothesis testing: Deciding about the value of a parameter based on some preconceived idea. Did the sample come from a population with μ = 5 or p = .2?

    Types of Inference

    Examples:

    A consumer wants to estimate the average price of similar homes in her city before putting her home on the market.

    Estimation: Estimate μ, the average home price.

    A manufacturer wants to know if a new type of steel is more resistant to high temperatures than an old type was.

    Hypothesis test: Is the new average resistance, μ_N, greater than the old average resistance, μ_O?

    Types of Inference

    Whether you are estimating parameters or testing hypotheses, statistical methods are important because they provide:

    Methods for making the inference

    A numerical measure of the goodness or reliability of the inference

    Concepts of Estimation

    The objective of estimation is to determine the value of a population parameter on the basis of a sample statistic.

    There are two types of estimators:

    Point estimator

    Interval estimator

    Definitions

    An estimator is a rule, usually a formula, that tells you how to calculate the estimate based on the sample.

    Point estimation: A single number is calculated to estimate the parameter.

    Interval estimation / confidence interval: Two numbers are calculated to create an interval within which the parameter is expected to lie. It is constructed so that, with a chosen degree of confidence, the true unknown parameter will be captured inside the interval.

    Point Estimator

    A point estimator draws inference about a population by estimating the value of an unknown parameter using a single value or point.

    Point Estimator of Population Mean

    A sample of weights of 34 male freshman students was obtained.

    185 161 174 175 202 178 202 139 177
    170 151 176 197 214 283 184 189 168
    188 170 207 180 167 177 166 231 176
    184 179 155 148 180 194 176

    If one wanted to estimate the true mean weight of all male freshman students, one might use the sample mean as a point estimate for the true mean.

    $$\bar{x} = \frac{\sum x_i}{n} = 182.44$$

    A point estimate of the population mean, μ, is the sample mean.
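The sample mean quoted for the 34 weights can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the slides.

```python
from statistics import mean

# Weights of the 34 male freshman students listed on the slide.
weights = [
    185, 161, 174, 175, 202, 178, 202, 139, 177,
    170, 151, 176, 197, 214, 283, 184, 189, 168,
    188, 170, 207, 180, 167, 177, 166, 231, 176,
    184, 179, 155, 148, 180, 194, 176,
]

n = len(weights)
x_bar = mean(weights)  # sum of the observations divided by n
print(n, round(x_bar, 2))
```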

    Point Estimation of Population Proportion

    A sample of 200 students at a large university is selected to estimate the proportion of students that wear contact lenses. In this sample 47 wear contact lenses.

    A point estimate of the population proportion, p, is the sample proportion

    $$\hat{p} = x/n$$

    where x is the number of successes in the sample.

    $$\hat{p} = 47/200 = .235$$

    [Figure: Point estimator. Labels: population distribution, parameter (?), sampling distribution, point estimator.]

    Properties of Point Estimators

    Since an estimator is calculated from sample values, it varies from sample to sample according to its sampling distribution.

    An estimator is unbiased if the mean of its sampling distribution equals the parameter of interest. It does not systematically overestimate or underestimate the target parameter.

    Both sample mean and sample proportion are unbiased estimators of population mean and proportion. The following sample variance is an unbiased estimator of population variance.

    $$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
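Unbiasedness of the sample variance can be illustrated by listing every possible sample from a tiny population. The sketch below assumes a hypothetical population {1, 2, 3} and samples of size 2 drawn with replacement, neither of which comes from the slides. It averages the n - 1 version of the variance over all equally likely samples and compares it with the divisor-n version.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [1, 2, 3]          # hypothetical toy population
sigma2 = pvariance(population)  # true population variance

# Every equally likely sample of size 2 drawn with replacement.
samples = list(product(population, repeat=2))

# Average of s^2 (divisor n - 1) over all samples.
avg_s2 = mean(variance(s) for s in samples)

# Average of the divisor-n version, for comparison.
avg_biased = mean(pvariance(s) for s in samples)

print(sigma2, avg_s2, avg_biased)
```

In this toy case the n - 1 version averages exactly to the population variance, while the divisor-n version averages to half of it.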

    Properties of Point Estimators

    Of all the unbiased estimators, we prefer the estimator whose sampling distribution has the smallest spread or variability.

    Interval Estimator

    An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.

    [Figure: Interval estimator. Labels: population distribution, parameter, sample distribution, interval estimator.]

    Estimators' Characteristics

    Selecting the right sample statistic to estimate a parameter value depends on the characteristics of the statistic.

    Desirable characteristics of estimators:

    Unbiasedness: An unbiased estimator is one whose expected value is equal to the parameter it estimates.

    Consistency: An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size increases.

    Relative efficiency: For two unbiased estimators, the one with a smaller variance is said to be relatively efficient.

    Uncertainty Analysis

    The estimate of the error is called the uncertainty. It includes both bias and precision errors.

    We need to identify all the potential significant errors for the instrument(s).

    All measurements should be given in three parts:

    Mean value

    Uncertainty

    Confidence interval on which that uncertainty is based (typically 95% C.I.)

    Uncertainty can be expressed in either absolute terms (e.g., 5 Volts ± 0.5 Volts) or in percentage terms (e.g., 5 Volts ± 10%).

    We will use a confidence interval throughout this course.

    Estimating the Population Mean when the Population Variance is Known

    How is an interval estimator produced from a sampling distribution?

    A sample of size n is drawn from the population, and its mean x̄ is calculated.

    By the central limit theorem, x̄ is normally distributed (or approximately normally distributed), thus

    $$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

    We have established before that

    $$P\left(\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

    This leads to the following equivalent statement:

    $$P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

    The confidence interval for μ (σ is known) is the interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ inside this statement.

    CONFIDENCE INTERVAL FOR μ (σ KNOWN)

    Confidence level: 1 - α = the probability that a stated interval contains the unknown parameter.

    $$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

    Lower Confidence Limit (LCL): $\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}$

    Upper Confidence Limit (UCL): $\bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}$

    Interpreting the Confidence Interval for μ

    A fraction 1 - α of all the values of x̄ obtained in repeated sampling from a given distribution yield an interval

    $$\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$

    that includes (covers) the expected value of the population.

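    Graphical Demonstration of the Confidence Interval for μ

    [Figure: the confidence level 1 - α shown as the central area; the interval of width $2 z_{\alpha/2}\,\sigma/\sqrt{n}$ runs from the lower confidence limit $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$ to the upper confidence limit $\bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$.]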

    The Confidence Interval for μ (σ is known)

    Four commonly used confidence levels

    Confidence level (1 - α)    α       α/2      z_{α/2}
    0.90                        0.10    0.05     1.645
    0.95                        0.05    0.025    1.96
    0.98                        0.02    0.01     2.33
    0.99                        0.01    0.005    2.575
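The critical values in this table can be reproduced from the standard normal quantile function. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}, the upper alpha/2 point of the standard normal."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```

The computed values agree with the table up to rounding: the tabled 2.33 and 2.575 are coarser roundings of roughly 2.326 and 2.576.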

    The Confidence Interval for μ (σ is known)

    Example: Estimate the mean value of the distribution resulting from the throw of a fair die. It is known that σ = 1.71. Use a 90% confidence level, and 100 repeated throws of the die.

    Solution: The confidence interval is

    $$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = \bar{x} \pm 1.645\frac{1.71}{\sqrt{100}} = \bar{x} \pm .28$$

    The mean values obtained in repeated draws of samples of size 100 result in interval estimators of the form

    [sample mean - .28, sample mean + .28],

    90% of which cover the real mean of the distribution.
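The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch only: the seed, the number of repeated samples, and the variable names are assumptions for illustration, while the half-width .28 and the true mean 3.5 of a fair die come from the example.

```python
import random

random.seed(0)  # arbitrary seed so the run is repeatable

MU = 3.5           # true mean of a fair die
HALF_WIDTH = 0.28  # 1.645 * 1.71 / sqrt(100), from the example
N_THROWS = 100
N_SAMPLES = 2000   # hypothetical number of repeated samples

covered = 0
for _ in range(N_SAMPLES):
    # One sample: the mean of 100 throws of a fair die.
    x_bar = sum(random.randint(1, 6) for _ in range(N_THROWS)) / N_THROWS
    if x_bar - HALF_WIDTH <= MU <= x_bar + HALF_WIDTH:
        covered += 1

coverage = covered / N_SAMPLES
print(coverage)  # fraction of intervals that cover the true mean
```

The printed fraction should land close to 0.90. It will not be exactly 0.90, because the simulation is finite and the die outcomes are discrete.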

    The Confidence Interval for μ (σ is known)

    Recalculate the confidence interval for 95% confidence level.

    Solution:

    $$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = \bar{x} \pm 1.96\frac{1.71}{\sqrt{100}} = \bar{x} \pm .34$$

    [Figure: nested intervals; the 90% interval x̄ ± .28 lies inside the 95% interval x̄ ± .34.]

    The Confidence Interval for μ (σ is known)

    The width of the 90% confidence interval = 2(.28) = .56

    The width of the 95% confidence interval = 2(.34) = .68

    Because the 95% confidence interval is wider, it is more likely to include the value of μ.
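The two half-widths for the die example follow directly from z · σ / √n. A minimal sketch, with a helper name chosen here for illustration:

```python
from math import sqrt

def z_interval_half_width(z, sigma, n):
    """Half-width z * sigma / sqrt(n) of a z-interval for the mean."""
    return z * sigma / sqrt(n)

half_90 = z_interval_half_width(1.645, 1.71, 100)  # 90% level
half_95 = z_interval_half_width(1.96, 1.71, 100)   # 95% level
print(round(half_90, 2), round(half_95, 2))
```

The widths .56 and .68 quoted on the slide are twice these rounded half-widths. Doubling the unrounded 95% half-width gives about .67 instead.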

    The Confidence Interval for μ (σ is known)

    Example

    Doll Computer Company delivers computers directly to its customers who order via the Internet.

    To reduce inventory costs in its warehouses, Doll employs an inventory model that requires the estimate of the mean demand during lead time.

    It is found that lead time demand is normally distributed with a standard deviation of 75 computers per lead time.

    Estimate the lead time demand with 95% confidence.
    The Confidence Interval for μ (σ is known)

    Example Solution

    The parameter to be estimated is μ, the mean demand during lead time.

    We need to compute the interval estimate for μ.

    From the data provided in the data file, the sample mean is x̄ = 370.16.

    Since 1 - α = .95, α = .05. Thus α/2 = .025, and z_{.025} = 1.96.

    $$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 370.16 \pm z_{.025}\frac{75}{\sqrt{25}} = 370.16 \pm 1.96\frac{75}{\sqrt{25}} = 370.16 \pm 29.40 = [340.76,\ 399.56]$$
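The arithmetic of this example can be traced step by step. The sketch below uses only the numbers stated in the example (x̄ = 370.16, σ = 75, n = 25, z = 1.96); the variable names are illustrative.

```python
from math import sqrt

x_bar, sigma, n = 370.16, 75.0, 25  # values from the example
z = 1.96                            # z_{.025} for 95% confidence

standard_error = sigma / sqrt(n)    # 75 / 5 = 15
half_width = z * standard_error     # 1.96 * 15 = 29.4
lcl, ucl = x_bar - half_width, x_bar + half_width
print(round(lcl, 2), round(ucl, 2))
```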

    Example (MINITAB Output)

    Stat > Basic Statistics > 1-Sample Z

    Z Confidence Intervals

    The assumed sigma = 75.0

    Variable    N     Mean    StDev   SE Mean   95.0 % CI
    Demand      25    370.2   80.8    15.0      (340.8, 399.6)

    Example

    To help make a decision about expansion plans, the president of a music company needs to know how many CDs teenagers buy annually. 250 teenagers are selected; each reports the number of CDs bought within the last 12 months. The responses have a sample mean of 4.26. The (population) standard deviation is 3. Construct 99, 95, and 80 percent confidence intervals for the true average number of CDs bought by teenagers in a year.

    ANSWER

    99% CI:

    $$\bar{X} - z_{.005}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{.005}\frac{\sigma}{\sqrt{n}}$$

    $$4.26 - 2.575\frac{3}{\sqrt{250}} \le \mu \le 4.26 + 2.575\frac{3}{\sqrt{250}}$$

    $$3.771 \le \mu \le 4.749$$

    95% CI: $3.89 \le \mu \le 4.63$

    80% CI: $4.017 \le \mu \le 4.503$
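All three intervals for the CD example can be computed with one helper, which also reports whether a candidate value of the mean (here 3.8) falls inside each interval. This is a sketch under the example's stated values (x̄ = 4.26, σ = 3, n = 250); the function name is an assumption. It uses unrounded normal quantiles, so the last digit can differ slightly from hand calculations based on table values.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Two-sided z-interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half = z * sigma / sqrt(n)
    return x_bar - half, x_bar + half

guess = 3.8  # candidate value for the mean
for level in (0.99, 0.95, 0.80):
    lo, hi = z_interval(4.26, 3, 250, level)
    print(level, round(lo, 3), round(hi, 3), lo <= guess <= hi)
```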

    INTERPRETATION

    3.8 as a guess regarding the average number of CDs bought by teenagers is:

    a) reasonable corresponding to 99% confidence,

    b) unreasonable corresponding to 95% confidence, and

    c) unreasonable corresponding to 80% confidence.

    Information and the Width of the Interval

    A wide interval estimator provides little information.

    Where is μ?

    Here is a much narrower interval. If the confidence level remains unchanged, the narrower interval provides more meaningful information.

    [Figure: a wide interval ("Where is μ?") contrasted with a much narrower interval ("Ahaaa!").]

    The Width of the Confidence Interval

    The width of the confidence interval is affected by

    the population standard deviation (σ)

    the confidence level (1 - α)

    the sample size (n).

    The Effects of σ on the Interval Width

    Confidence level: 90% (α/2 = .05 in each tail)

    $$2 z_{.05}\frac{\sigma}{\sqrt{n}} = 2(1.645)\frac{\sigma}{\sqrt{n}}$$

    Suppose the standard deviation has increased by 50%.

    $$2 z_{.05}\frac{1.5\sigma}{\sqrt{n}} = 2(1.645)\frac{1.5\sigma}{\sqrt{n}}$$

    To maintain a certain level of confidence, a larger standard deviation requires a wider confidence interval.

    The Effects of Changing the Confidence Level

    Let us increase the confidence level from 90% to 95%.

    Confidence level 90% (α/2 = 5%):

    $$2 z_{.05}\frac{\sigma}{\sqrt{n}} = 2(1.645)\frac{\sigma}{\sqrt{n}}$$

    Confidence level 95% (α/2 = 2.5%):

    $$2 z_{.025}\frac{\sigma}{\sqrt{n}} = 2(1.96)\frac{\sigma}{\sqrt{n}}$$

    A larger confidence level produces a wider confidence interval.

    The Effects of Changing the Sample Size

    Confidence level: 90%

    $$2 z_{.05}\frac{\sigma}{\sqrt{n}} = 2(1.645)\frac{\sigma}{\sqrt{n}}$$

    Increasing the sample size decreases the width of the confidence interval while the confidence level can remain unchanged.
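The three effects on the interval width (σ, confidence level, sample size) can be compared numerically. The baseline values below (σ = 1, n = 100, 90% confidence) are hypothetical and chosen only to make the ratios easy to read.

```python
from math import sqrt
from statistics import NormalDist

def interval_width(sigma, n, confidence):
    """Full width 2 * z_{alpha/2} * sigma / sqrt(n) of the z-interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)

base = interval_width(sigma=1.0, n=100, confidence=0.90)         # hypothetical baseline
wider_sigma = interval_width(sigma=1.5, n=100, confidence=0.90)  # sigma up by 50%
wider_level = interval_width(sigma=1.0, n=100, confidence=0.95)  # 90% -> 95%
larger_n = interval_width(sigma=1.0, n=400, confidence=0.90)     # n quadrupled

print(wider_sigma / base, wider_level / base, larger_n / base)
```

A 50% larger σ widens the interval by 50%, raising the level from 90% to 95% widens it by the ratio 1.96/1.645, and quadrupling n halves the width.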

    Selecting the Sample Size

    The required sample size to estimate the mean to within w is

    $$n = \left(\frac{z_{\alpha/2}\,\sigma}{w}\right)^2$$

    Selecting the Sample Size

    Example

    To estimate the amount of lumber that can be harvested in a tract of land, the mean diameter of trees in the tract must be estimated to within one inch with 99% confidence.

    What sample size should be taken? Assume that diameters are normally distributed with σ = 6 inches.

    Selecting the Sample Size

    Solution

    The estimate accuracy is ±1 inch. That is, w = 1.

    The confidence level 99% leads to α = .01, thus z_{α/2} = z_{.005} = 2.575.

    We compute

    $$n = \left(\frac{z_{\alpha/2}\,\sigma}{w}\right)^2 = \left(\frac{2.575(6)}{1}\right)^2 = 238.7$$

    which is rounded up to n = 239.

    If the standard deviation is really 6 inches, the interval resulting from the random sampling will be of the form x̄ ± 1. If the standard deviation is greater than 6 inches, the actual interval will be wider than ±1.

    EXAMPLE

    A sample survey of kindergarten children in New York City is being planned to estimate, among other things, the mean number of older siblings of such children. It is desired to estimate this mean within ±0.08, with a 90% confidence level. A reasonable value for σ is 0.6.

    a) What sample size is needed to estimate the mean number of older siblings of such children?

    w = 0.08, σ = 0.6, z_{0.05} = 1.645

    $$n = \left(\frac{1.645 \times 0.6}{0.08}\right)^2 = 152.2$$

    so n = 153.

    EXAMPLE (contd.)

    b) If w = 0.06, how will n change?

    $$n = \left(\frac{1.645 \times 0.6}{0.06}\right)^2 = 270.6$$

    so n = 271.

    So, when w decreases from 0.08 to 0.06, the required sample size increases substantially.
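The sample-size formula n = (z σ / w)², rounded up to the next whole observation, can be sketched as a small function. The function name is an assumption; the inputs are the z values, standard deviations, and margins from the tree-diameter and older-siblings examples.

```python
from math import ceil

def sample_size_for_mean(z, sigma, w):
    """Smallest whole n with z * sigma / sqrt(n) <= w."""
    return ceil((z * sigma / w) ** 2)

print(sample_size_for_mean(2.575, 6, 1))       # tree diameters, 99%
print(sample_size_for_mean(1.645, 0.6, 0.08))  # older siblings, w = 0.08
print(sample_size_for_mean(1.645, 0.6, 0.06))  # older siblings, w = 0.06
```

Rounding up rather than to the nearest integer keeps the half-width no larger than w.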

    Inference about the Population Mean when σ is Unknown

    When σ is unknown, we need to estimate it to construct confidence intervals for the population mean. σ is estimated by s.

    When σ is unknown, the standardized sample mean of a small sample, $(\bar{x} - \mu)/(s/\sqrt{n})$, is no longer normally distributed. That is, we use t-scores (i.e., scores from a t distribution) in place of z-scores.

    Inference About the Population Mean when σ is Unknown

    The Student t Distribution

    [Figure: standard normal and Student t density curves, both centered at 0.]

    Effect of the Degrees of Freedom on the t Density Function

    [Figure: Student t densities with 2, 10, and 30 degrees of freedom, each centered at 0.]

    The degrees of freedom (a function of the sample size) determine how spread out the distribution is compared to the normal distribution.

    Finding t-scores Under a t-Distribution (t-tables)

    Degrees of Freedom   t.100    t.05     t.025     t.01      t.005
    1                    3.078    6.314    12.706    31.821    63.657
    2                    1.886    2.920    4.303     6.965     9.925
    3                    1.638    2.353    3.182     4.541     5.841
    4                    1.533    2.132    2.776     3.747     4.604
    5                    1.476    2.015    2.571     3.365     4.032
    6                    1.440    1.943    2.447     3.143     3.707
    7                    1.415    1.895    2.365     2.998     3.499
    8                    1.397    1.860    2.306     2.896     3.355
    9                    1.383    1.833    2.262     2.821     3.250
    10                   1.372    1.812    2.228     2.764     3.169
    11                   1.363    1.796    2.201     2.718     3.106
    12                   1.356    1.782    2.179     2.681     3.055

    For example, t_{.05, 10} = 1.812.

    [Figure: t density with 10 degrees of freedom; the area to the right of t = 1.812 is .05.]

    Confidence Intervals on μ (σ unknown)

    A (1 - α)100% confidence interval on μ (when σ is unknown), based on samples of size n drawn from a normal population, is given by:

    $$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$

    (t_{α/2} with n - 1 d.f.)

    EXAMPLE

    A new breakfast cereal is test-marketed for 1 month at stores of a large supermarket chain. The results for a sample of 16 stores indicate average sales of $1200 with a sample standard deviation of $180. Set up a 99% confidence interval estimate of the true average sales of this new breakfast cereal. Assume normality.

    n = 16, x̄ = $1200, s = $180, α = 0.01

    t_{α/2, n-1} = t_{0.005, 15} = 2.947

    ANSWER

    99% CI for μ:

    $$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} = 1200 \pm 2.9467\frac{180}{\sqrt{16}} = 1200 \pm 132.6015$$

    (1067.3985, 1332.6015)

    With 99% confidence, the limits 1067.3985 and 1332.6015 cover the true average sales of the new breakfast cereal.
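The t-interval for the cereal example can be sketched as follows. The Python standard library has no t quantile function, so the critical value is passed in from a t-table. The value 2.9467 is t_{0.005, 15} to four decimals (2.947 to three), and the function name is an assumption.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """x_bar +/- t_crit * s / sqrt(n); t_crit is read from a t-table with n - 1 d.f."""
    half = t_crit * s / sqrt(n)
    return x_bar - half, x_bar + half

# t_{0.005, 15} = 2.9467 (2.947 to three decimals)
lo, hi = t_interval(1200, 180, 16, 2.9467)
print(round(lo, 4), round(hi, 4))
```

With the three-decimal table value 2.947 the half-width becomes 132.615, a difference that only affects the second decimal of the limits.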

    Checking the required conditions

    We need to check that the population is normally distributed, or at least not extremely nonnormal.

    There are statistical methods to test for normality.

    From the sample histograms we see

    [Figure: sample histogram; vertical axis 0 to 30 in steps of 5; horizontal axis bins -4, 2, 8, 14, 22, 30, More.]

    Example

    A random sample of n = 8 E-glass fiber test specimens of a certain type yielded a sample mean interfacial shear yield stress of 30.2 and a sample standard deviation of 3.1. Assuming that interfacial shear yield stress is normally distributed, compute a 95% CI for true average stress.
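A worked sketch of this exercise, assuming the t-table value t_{.025, 7} = 2.365 for n - 1 = 7 degrees of freedom (the slide itself gives no solution):

$$\bar{x} \pm t_{.025,\,7}\frac{s}{\sqrt{n}} = 30.2 \pm 2.365\frac{3.1}{\sqrt{8}} \approx 30.2 \pm 2.59$$

which gives an interval of roughly (27.61, 32.79) for the true average stress.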

    One Sided Confidence Intervals

    An upper confidence bound for μ:

    $$\mu \le \bar{x} + t_{\alpha,\,n-1}\frac{s}{\sqrt{n}}$$

    A lower confidence bound for μ:

    $$\mu \ge \bar{x} - t_{\alpha,\,n-1}\frac{s}{\sqrt{n}}$$

    Example

    A sample of 14 joint specimens of a particular type gave a sample mean proportional limit stress of 8.48 MPa and a sample standard deviation of 0.79 MPa. Calculate and interpret a 95% lower confidence bound for the true average proportional limit stress of all such joints.

    What, if any, assumptions did you make about the distribution of proportional limit stress?
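One way to approach this exercise is sketched below. It assumes that proportional limit stress is normally distributed, and it takes t_{.05, 13} = 1.771 from a standard t-table, since the table reproduced in these slides stops at 12 degrees of freedom. The function name is illustrative.

```python
from math import sqrt

def lower_confidence_bound(x_bar, s, n, t_crit):
    """One-sided lower bound x_bar - t_{alpha, n-1} * s / sqrt(n)."""
    return x_bar - t_crit * s / sqrt(n)

# t_{.05, 13} = 1.771 comes from a standard t-table (an assumption:
# it is not among the values listed in these slides).
bound = lower_confidence_bound(8.48, 0.79, 14, 1.771)
print(round(bound, 2))
```

Under these assumptions, one can be 95% confident that the true average proportional limit stress is at least about 8.11 MPa.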

    Estimation of a Population Proportion

    When the population consists of nominal data, the only inference we can make is about the proportion of occurrence of a certain value.

    The parameter p was used before to calculate these probabilities under the binomial distribution.

    Inference About a Population Proportion

    Statistic and sampling distribution

    The statistic used when making inference about p is:

    $$\hat{p} = \frac{x}{n}$$

    where x = the number of successes and n = sample size.

    Under certain conditions [np > 5 and n(1 - p) > 5], p̂ is approximately normally distributed, with μ = p and σ² = p(1 - p)/n.

    Estimating the Proportion

    Interval estimator for p (1 - α confidence level)

    100(1 - α)% confidence interval for p:

    $$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$$

    provided n p̂ > 5 and n(1 - p̂) > 5.

    Nielsen Ratings

    In a survey of 2000 TV viewers at 11:40 p.m. on a certain night, 226 indicated they watched The Tonight Show.

    Estimate the number of TVs tuned to The Tonight Show on a typical night, if there are 100 million potential television sets. Use a 95% confidence level.

    Solution:

    x = 226, n = 2000

    $$\hat{p} = \frac{x}{n} = \frac{226}{2000} = 0.113$$

    $$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n} = .113 \pm 1.96\sqrt{.113(1 - .113)/2000} = .113 \pm .014$$

    Solution

    z-Estimate: Proportion

                         Viewers
    Sample Proportion    0.113
    Observations         2000
    LCL                  0.099
    UCL                  0.127

    A confidence interval estimate of the number of viewers who watched The Tonight Show:

    LCL = .099(100 million) = 9.9 million

    UCL = .127(100 million) = 12.7 million
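The Nielsen calculation can be reproduced from the survey counts. This sketch uses only the values in the example (x = 226, n = 2000, z = 1.96, 100 million sets); the variable names are illustrative.

```python
from math import sqrt

x, n = 226, 2000
z = 1.96                   # z_{.025} for 95% confidence
p_hat = x / n              # sample proportion
half = z * sqrt(p_hat * (1 - p_hat) / n)
lcl, ucl = p_hat - half, p_hat + half

sets = 100_000_000         # potential television sets, from the example
print(round(lcl, 3), round(ucl, 3))
print(round(lcl * sets / 1e6, 1), round(ucl * sets / 1e6, 1))  # in millions
```

The normal approximation is reasonable here because n p̂ = 226 and n(1 - p̂) = 1774 are both far above 5.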

    Selecting the Sample Size to Estimate the Proportion

    Recall: The confidence interval for the proportion is

    $$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$$

    Thus, to estimate the proportion to within W, we can write

    $$W = z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$$

    Selecting the Sample Size to Estimate the Proportion

    The required sample size is

    $$n = \left(\frac{z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})}}{W}\right)^2$$

    Sample Size to Estimate the Proportion

    Example

    Suppose we want to estimate the proportion of customers who prefer our company's brand to within .03 with 95% confidence. Find the sample size.

    Solution: W = .03; 1 - α = .95, therefore α/2 = .025, so z_{.025} = 1.96.

    $$n = \left(\frac{1.96\sqrt{\hat{p}(1 - \hat{p})}}{.03}\right)^2$$

    Since the sample has not yet been taken, the sample proportion is still unknown.

    We proceed using either one of the following two methods:

    Sample Size to Estimate the Proportion

    Method 1: There is no knowledge about the value of p̂.

    Let p̂ = .5. This results in the largest possible n needed for a 1 - α confidence interval of the form p̂ ± .03.

    If the sample proportion does not equal .5, the actual W will be narrower than .03 with the n obtained by the formula below.

    $$n = \left(\frac{1.96\sqrt{.5(1 - .5)}}{.03}\right)^2 = 1{,}068$$

    Method 2: There is some idea about the value of p̂.

    Use the value of p̂ to calculate the sample size. With p̂ = .2:

    $$n = \left(\frac{1.96\sqrt{.2(1 - .2)}}{.03}\right)^2 = 683$$
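Both methods amount to plugging a planning value of the proportion into the same formula and rounding up. A minimal sketch, with a function name chosen here for illustration:

```python
from math import ceil

def sample_size_for_proportion(z, p, w):
    """n = (z * sqrt(p(1 - p)) / w)^2, rounded up to a whole observation."""
    return ceil(z ** 2 * p * (1 - p) / w ** 2)

print(sample_size_for_proportion(1.96, 0.5, 0.03))  # Method 1: p = .5
print(sample_size_for_proportion(1.96, 0.2, 0.03))  # Method 2: p = .2
```

Because p(1 - p) is largest at p = .5, Method 1 gives the most conservative (largest) sample size.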