Confidence Limits for the Mean


  • 8/2/2019 Confidence Limits for the Mean


    Confidence Limits for the Mean

    Purpose: Interval Estimate for Mean

    Confidence limits for the mean (Snedecor and Cochran, 1989) are an interval estimate for the mean. Interval estimates are often desirable because the estimate of the mean varies from sample to sample. Instead of a single estimate for the mean, a confidence interval generates a lower and upper limit for the mean. The interval estimate gives an indication of how much uncertainty there is in our estimate of the true mean. The narrower the interval, the more precise is our estimate.

    Confidence limits are expressed in terms of a confidence coefficient. Although the choice of confidence coefficient is somewhat arbitrary, in practice 90 %, 95 %, and 99 % intervals are often used, with 95 % being the most commonly used.

    As a technical note, a 95 % confidence interval does not mean that there is a 95 % probability that the interval contains the true mean. The interval computed from a given sample either contains the true mean or it does not. Instead, the level of confidence is associated with the method of calculating the interval. The confidence coefficient is simply the proportion of samples of a given size that may be expected to contain the true mean. That is, for a 95 % confidence interval, if many samples are collected and the confidence interval computed for each, in the long run about 95 % of these intervals would contain the true mean.
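The long-run interpretation can be checked with a small simulation. The sketch below is illustrative only: the true mean, standard deviation, number of trials, and seed are assumptions chosen for the demonstration, and the critical value 1.9723 is the t percentile this document quotes for N = 195.

```python
import math
import random
import statistics

T_CRIT = 1.9723  # t percentile for 95 % confidence, 194 degrees of freedom (from the text)


def coverage(true_mean=10.0, sigma=2.0, n=195, trials=2000, seed=0):
    """Fraction of simulated 95 % intervals that contain true_mean."""
    rng = random.Random(seed)  # assumed seed, only for repeatability
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = T_CRIT * statistics.stdev(sample) / math.sqrt(n)
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / trials


observed = coverage()
print(observed)  # expected to land near 0.95, not exactly on it
```

Any single interval either covers the true mean or misses it. The 95 % figure describes the fraction of intervals that cover it across many repetitions.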

    Definition: Confidence Interval

    Confidence limits are defined as:

    $$\bar{Y} \pm t_{1-\alpha/2,\,N-1}\,\frac{s}{\sqrt{N}}$$

    where $\bar{Y}$ is the sample mean, $s$ is the sample standard deviation, $N$ is the sample size, $\alpha$ is the desired significance level, and $t_{1-\alpha/2,\,N-1}$ is the $100(1-\alpha/2)$ percentile of the t distribution with $N-1$ degrees of freedom. Note that the confidence coefficient is $1-\alpha$.

    From the formula, it is clear that the width of the interval is controlled by two factors:

    1. As $N$ increases, the interval gets narrower from the $\sqrt{N}$ term. That is, one way to obtain more precise estimates for the mean is to increase the sample size.

    2. The larger the sample standard deviation, the larger the confidence interval. This simply means that noisy data, i.e., data with a large standard deviation, are going to generate wider intervals than data with a smaller standard deviation.
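As a rough worked illustration of the first factor, write the half-width of the interval as $h(N)$. Treating the t percentile and $s$ as approximately unchanged (an assumption that is reasonable only for moderately large $N$), quadrupling the sample size halves the half-width:

```latex
h(N) = t_{1-\alpha/2,\,N-1}\,\frac{s}{\sqrt{N}},
\qquad
\frac{h(4N)}{h(N)} \approx \frac{s/\sqrt{4N}}{s/\sqrt{N}} = \frac{1}{2}
```

The second factor is linear: doubling $s$ at fixed $N$ doubles the half-width.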

    Definition: Hypothesis Test

    To test whether the population mean has a specific value, $\mu_0$, against the two-sided alternative that it does not have a value $\mu_0$, the confidence interval is converted to hypothesis-test form. The test is a one-sample t-test, and it is defined as:


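    H0: $\mu = \mu_0$

    Ha: $\mu \neq \mu_0$

    Test Statistic: $T = (\bar{Y} - \mu_0)/(s/\sqrt{N})$

    where $\bar{Y}$, $N$, and $s$ are defined as above.

    Significance Level: $\alpha$. The most commonly used value for $\alpha$ is 0.05.

    Critical Region: Reject the null hypothesis that the mean is a specified value, $\mu_0$, if

    $T < t_{\alpha/2,\,\nu}$ or $T > t_{1-\alpha/2,\,\nu}$

    where $\nu = N - 1$ is the degrees of freedom.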

    Confidence Interval Example

    We generated a 95 %, two-sided confidence interval for the ZARR13.DAT data set based on the following information.

    N = 195
    MEAN = 9.261460
    STANDARD DEVIATION = 0.022789
    $t_{1-0.025,\,N-1}$ = 1.9723

    LOWER LIMIT = 9.261460 - 1.9723*0.022789/$\sqrt{195}$
    UPPER LIMIT = 9.261460 + 1.9723*0.022789/$\sqrt{195}$

    Thus, a 95 % confidence interval for the mean is (9.258242, 9.264679).
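The arithmetic in this example can be reproduced from the quoted summary values. The sketch below uses a hypothetical helper name, `mean_confidence_limits`, and takes the critical value 1.9723 from the text instead of computing it. Because the inputs are rounded, the limits may differ from the quoted ones in the last decimal place.

```python
import math


def mean_confidence_limits(mean, sd, n, t_crit):
    """Return (lower, upper) = mean -/+ t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Summary values quoted in the example above (ZARR13.DAT).
lower, upper = mean_confidence_limits(9.261460, 0.022789, 195, 1.9723)
print(f"({lower:.5f}, {upper:.5f})")
```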

    t-Test Example

    We performed a two-sided, one-sample t-test using the ZARR13.DAT data set to test the null hypothesis that the population mean is equal to 5.

    H0: $\mu = 5$
    Ha: $\mu \neq 5$
    Test statistic: T = 2611.284
    Degrees of freedom: $\nu$ = 194
    Significance level: $\alpha$ = 0.05
    Critical value: $t_{1-\alpha/2,\,\nu}$ = 1.9723
    Critical region: Reject H0 if |T| > 1.9723

    We reject the null hypothesis for our two-tailed t-test because the absolute value of the test statistic is greater than the critical value. If we were to perform an upper, one-tailed test, the critical value would be $t_{1-\alpha,\,\nu}$ = 1.6527, and we would still reject the null hypothesis.

    The confidence interval provides an alternative to the hypothesis test. If the confidence interval contains 5, then H0 cannot be rejected. In our example, the confidence interval (9.258242, 9.264679) does not contain 5, indicating that the population mean does not equal 5 at the 0.05 level of significance.
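A minimal sketch of the same test from the summary statistics follows. The function name `one_sample_t` is hypothetical, and both critical values are copied from the text. Because the mean and standard deviation are quoted to limited precision, the computed statistic agrees with 2611.284 only to within a few hundredths.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """T = (mean - mu0) / (sd / sqrt(n))."""
    return (mean - mu0) / (sd / math.sqrt(n))


T = one_sample_t(9.261460, 0.022789, 195, 5.0)
reject_two_sided = abs(T) > 1.9723  # two-sided critical value quoted in the text
reject_upper = T > 1.6527           # upper one-tailed critical value quoted in the text
print(round(T, 2), reject_two_sided, reject_upper)
```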

    In general, there are three possible alternative hypotheses and rejection regions for the one-sample t-test:

    Alternative Hypothesis    Rejection Region
    Ha: $\mu \neq \mu_0$      $|T| > t_{1-\alpha/2,\,\nu}$
    Ha: $\mu > \mu_0$         $T > t_{1-\alpha,\,\nu}$
    Ha: $\mu < \mu_0$         $T < t_{\alpha,\,\nu}$

    The rejection regions for the three possible alternative hypotheses using our example data are shown in the following graphs.
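The three rejection regions can also be written as a small decision function. This is a sketch under stated assumptions: the name `rejects_h0` and the labels for the alternatives are hypothetical, and the default critical values are the ones quoted in the example for 194 degrees of freedom at the 0.05 level. The lower-tail critical value is obtained from the symmetry of the t distribution.

```python
def rejects_h0(T, alternative, t_two_sided=1.9723, t_one_sided=1.6527):
    """Return True if T falls in the rejection region for the given alternative.

    alternative: "two-sided" (mu != mu0), "greater" (mu > mu0) or "less" (mu < mu0).
    """
    if alternative == "two-sided":
        return abs(T) > t_two_sided
    if alternative == "greater":
        return T > t_one_sided
    if alternative == "less":
        # The lower-tail critical value is the negative of the upper-tail one.
        return T < -t_one_sided
    raise ValueError(f"unknown alternative: {alternative!r}")


for alt in ("two-sided", "greater", "less"):
    print(alt, rejects_h0(2611.284, alt))
```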


    Questions: Confidence limits for the mean can be used to answer the following questions:

    1. What is a reasonable estimate for the mean?
    2. How much variability is there in the estimate of the mean?
    3. Does a given target value fall within the confidence limits?

    Related Techniques: Two-Sample t-Test

    Confidence intervals for other location estimators such as the median or mid-mean tend to be mathematically difficult or intractable. For these cases, confidence intervals can be obtained using the bootstrap.
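One common variant is the percentile bootstrap, sketched below for the median. The text does not say which bootstrap method is intended, so this choice is an assumption, as are the function name, the number of resamples, the seed, and the data values (which are invented for illustration).

```python
import random
import statistics


def bootstrap_percentile_ci(data, stat=statistics.median, n_boot=2000,
                            alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo_index = int((alpha / 2) * n_boot)
    hi_index = int((1 - alpha / 2) * n_boot) - 1
    return estimates[lo_index], estimates[hi_index]


# Hypothetical measurements, used only to exercise the function.
data = [9.21, 9.24, 9.25, 9.26, 9.26, 9.27, 9.28, 9.29, 9.30, 9.33]
low, high = bootstrap_percentile_ci(data)
print(low, high)
```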

    Case Study: Heat flow meter data.

    Software: Confidence limits for the mean and one-sample t-tests are available in just about all general purpose statistical software programs. Both Dataplot code and R code can be used to generate the analyses in this section.

    Probability Sampling Methods

    The main types of probability sampling methods are simple random sampling, stratified sampling, cluster sampling, multistage sampling, and systematic random sampling. The key benefit of probability sampling methods is that every member of the population has a known chance of selection, so the sample can be expected to be representative of the population and the sampling error can be quantified. This supports the validity of the statistical conclusions.

    Simple random sampling. Simple random sampling refers to any sampling method that has the following properties.

    The population consists of N objects.
    The sample consists of n objects.
    If all possible samples of n objects are equally likely to occur, the sampling method is called simple random sampling.

    There are many ways to obtain a simple random sample. One way would be the lottery method. Each of the N population members is assigned a unique number. The numbers are placed in a bowl and thoroughly mixed. Then, a blind-folded researcher selects n numbers. Population members having the selected numbers are included in the sample.
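The lottery method maps directly onto drawing without replacement from a numbered list. A minimal sketch, assuming a hypothetical population of 100 numbered members and a fixed seed used only for repeatability:

```python
import random

population = list(range(1, 101))      # hypothetical population: members numbered 1..100
rng = random.Random(42)               # fixed seed only so the draw is repeatable
sample = rng.sample(population, k=10) # drawn without replacement
print(sorted(sample))
```

`random.sample` draws without replacement, so no member can be selected twice, and every subset of 10 members has the same chance of being chosen.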

    Stratified sampling. With stratified sampling, the population is divided into groups, based on some characteristic. Then, within each group, a probability sample (often a simple random sample) is selected. In stratified sampling, the groups are called strata.

    As an example, suppose we conduct a national survey. We might divide the population into groups, or strata, based on geography - north, east, south, and west. Then, within each stratum, we might randomly select survey respondents.
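The geographic example could be sketched as follows. The respondent list, the helper name `stratified_sample`, and the equal allocation of five respondents per stratum are all assumptions made for illustration. Real surveys often allocate in proportion to stratum size.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of n_per_stratum units from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


# Hypothetical respondents: (id, region) pairs, 25 per region.
regions = ["north", "east", "south", "west"]
people = [(i, regions[i % 4]) for i in range(100)]
chosen = stratified_sample(people, stratum_of=lambda p: p[1], n_per_stratum=5)
print({region: len(sample) for region, sample in chosen.items()})
```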


    Cluster sampling. With cluster sampling, every member of the population is assigned to one, and only one, group. Each group is called a cluster. A sample of clusters is chosen, using a probability method (often simple random sampling). Only individuals within sampled clusters are surveyed.

    Note the difference between cluster sampling and stratified sampling. With stratified sampling, the sample includes elements from each stratum. With cluster sampling, in contrast, the sample includes elements only from sampled clusters.

    Multistage sampling. With multistage sampling, we select a sample by using combinations of different sampling methods.

    For example, in Stage 1, we might use cluster sampling to choose clusters from a population. Then, in Stage 2, we might use simple random sampling to select a subset of elements from each chosen cluster for the final sample.
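The two-stage example can be sketched in a few lines. The cluster names, cluster sizes, and sample sizes below are hypothetical. The intermediate `cluster_sample` shows what plain cluster sampling would survey, and `two_stage_sample` adds the second stage.

```python
import random

# Hypothetical population: 8 clusters (for example, schools) of 20 members each.
clusters = {f"cluster_{c}": [f"c{c}_m{m}" for m in range(20)] for c in range(8)}

rng = random.Random(1)  # fixed seed only for repeatability

# Stage 1 (cluster sampling): choose 3 whole clusters at random.
chosen_clusters = rng.sample(sorted(clusters), k=3)
cluster_sample = [m for name in chosen_clusters for m in clusters[name]]

# Stage 2 (multistage): simple random sample of 5 members within each chosen cluster.
two_stage_sample = [m for name in chosen_clusters
                    for m in rng.sample(clusters[name], k=5)]

print(chosen_clusters, len(cluster_sample), len(two_stage_sample))
```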

    Systematic random sampling. With systematic random sampling, we create a list of every member of the population. From the list, we randomly select the first sample element from the first k elements on the population list. Thereafter, we select every kth element on the list.

    This method is different from simple random sampling since not every possible sample of n elements is equally likely.
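A minimal sketch of the procedure, assuming a hypothetical ordered list of 100 members and k = 10. The function name and seed are illustrative.

```python
import random


def systematic_sample(population, k, seed=0):
    """Pick a random start among the first k elements, then every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1, i.e. one of the first k elements
    return population[start::k]


population = list(range(1, 101))  # hypothetical ordered list of 100 members
sample = systematic_sample(population, k=10)
print(sample)
```

Only k distinct samples can ever be produced this way (one per starting position), which is why the method differs from simple random sampling.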

    A statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study (not controlled). In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level. The phrase "test of significance" was coined by Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first."

    The testing process

    In the statistical literature, statistical hypothesis testing plays a fundamental role.[7] The usual line of reasoning is as follows:

    1. There is an initial research hypothesis of which the truth is unknown.

    2. The first step is to state the relevant null and alternative hypotheses. This is important as mis-stating the hypotheses will muddy the rest of the process. Specifically, the null hypothesis should be chosen in such a way that it allows us to conclude whether the alternative hypothesis can either be accepted or stays undecided as it was before the test.[8]

    3. The second step is to consider the statistical assumptions being made about the sample in doing the test; for example, assumptions about the statistical independence or about the form of the distributions of the observations. This is equally important as invalid assumptions will mean that the results of the test are invalid.

    4. Decide which test is appropriate, and state the relevant test statistic T.


    5. Derive the distribution of the test statistic under the null hypothesis from the assumptions. In standard cases this will be a well-known result. For example the test statistic may follow a Student's t distribution or a normal distribution.

    6. The distribution of the test statistic partitions the possible values of T into those for which the null hypothesis is rejected, the so-called critical region, and those for which it is not.

    7. Compute from the observations the observed value $t_{\mathrm{obs}}$ of the test statistic T.

    8. Decide to either fail to reject the null hypothesis or reject it in favor of the alternative. The decision rule is to reject the null hypothesis H0 if the observed value $t_{\mathrm{obs}}$ is in the critical region, and to accept or "fail to reject" the hypothesis otherwise.

    An alternative process is commonly used:

    6. Select a significance level ($\alpha$), a probability threshold below which the null hypothesis will be rejected. Common values are 5% and 1%.

    7. Compute from the observations the observed value $t_{\mathrm{obs}}$ of the test statistic T.

    8. From the statistic calculate a probability of the observation under the null hypothesis (the p-value).

    9. Reject the null hypothesis or not. The decision rule is to reject the null hypothesis if and only if the p-value is less than the significance level (the selected probability threshold).

    The two processes are equivalent.[9] The former process was advantageous in the past when only tables of test statistics at common probability thresholds were available. It allowed a decision to be made without the calculation of a probability. It was adequate for classwork and for operational use, but it was deficient for reporting results.
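The equivalence of the two decision rules can be illustrated with a test whose null distribution is available in the Python standard library. The sketch below swaps in a two-sided z-test (normal null distribution, known sigma) in place of the t-test discussed earlier. That substitution, the function name, and all input values are assumptions made so the example stays self-contained.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, the null distribution of Z here


def z_test_both_ways(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test decided by critical region and by p-value."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
    reject_by_region = abs(z) > critical
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    reject_by_p = p_value < alpha
    return z, p_value, reject_by_region, reject_by_p


# Hypothetical inputs chosen only for illustration.
for xbar in (10.1, 10.5):
    z, p, by_region, by_p = z_test_both_ways(xbar, mu0=10.0, sigma=1.0, n=25)
    print(f"z={z:.2f} p={p:.4f} region={by_region} p-rule={by_p}")
```

In both cases the critical-region rule and the p-value rule reach the same decision. The p-value additionally reports how far the observation lies from the threshold.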
