Interval Estimation for Binomial Proportion, Poisson Mean, and Negative-binomial Mean
Luchen Liu
Supervisor: Rolf Larsson
Master Thesis in Statistics, May 2012
Uppsala University, Sweden



  • Contents

    Abstract
    1. Introduction
    2. Methodology
        2.1 The Actual Coverage Probability
        2.2 The Expected Length
    3. Confidence Intervals for the Binomial Proportion
        3.1 The Standard Wald Interval
        3.2 Frequentist Alternative Confidence Intervals
        3.3 The Bayesian Alternative Confidence Interval
        3.4 The Expected Length
        3.5 Discussion
    4. The Confidence Interval for the Poisson Mean
        4.1 The Wald Interval
        4.2 Alternative Confidence Intervals
        4.3 The Expected Length of The Poisson Intervals
    5. The Confidence Interval for the Negative binomial Mean
        5.1 The Wald Interval
        5.2 Alternative Confidence Intervals
        5.3 The Expected Length of The Negative-binomial Intervals
    6. Conclusions and Discussions
        6.1 Conclusions
        6.2 Discussion
    7. Empirical examples
        7.1 Female births proportion
        7.2 The horse kick data


    Abstract

    This paper studies interval estimation for three discrete distributions: the binomial distribution, the Poisson distribution and the negative-binomial distribution. The problem is the chaotic behavior of the coverage probability of the Wald interval. To address this problem, alternative confidence intervals are introduced. Coverage probability and expected length are chosen as the criteria for evaluating the intervals.

    In this paper, I first examined the chaotic behavior of the coverage probability of the Wald interval and introduced the alternative confidence intervals. I then calculated the coverage probability and expected length of those intervals, compared them, and recommended confidence intervals for the three cases. The paper also discusses the relationship among the three discrete distributions, and in the end illustrates the applications to binomial and Poisson data with brief examples.

    Key words: interval estimation, coverage probability, expected length


    1. Introduction

    Interval estimation is one of the most basic methodologies in statistics. There are a variety of ways to construct confidence intervals, and the best known is the Wald interval. It is the interval presented in textbooks, and it is also the confidence interval most commonly used in practice. In the binomial case, however, several studies have shown that this interval has poor coverage probability. Agresti and Coull (1998) and Brown et al. (2001) both considered this problem and found that the chaotic behavior of the standard interval persists even when n is large.

    Many other ways of constructing intervals have been considered recently. These include the Wilson interval (also called the score interval), the Jeffreys equal-tailed interval, the Clopper-Pearson interval, the likelihood ratio interval, and the Bayesian HPD interval. Agresti and Coull (1998) recommended the confidence intervals constructed by approximation methods because they are more efficient than the exact Clopper-Pearson interval, although the exact interval is always conservative.

    The confidence interval for a binomial proportion is the simplest case, but it illustrates many of the problems. In this paper, I examined the chaotic behavior of the Wald interval for the binomial proportion, introduced some alternative intervals, compared these intervals, and discussed the relationship between some of them. The evaluation criteria I used to compare the confidence intervals are the coverage probability and the expected length. To make the comparison of expected length fair, I plotted the expected length against the actual coverage probability rather than the nominal confidence level, and got the same conclusion as Agresti and Coull.

    I also turned to the confidence intervals for the Poisson mean and the negative-binomial mean, which belong to two other commonly used discrete distributions. The coverage probability of the Wald interval is also quite chaotic in these cases, so I tried the score interval, the likelihood ratio interval, the exact interval, and the Jeffreys interval as alternatives. I then compared them using the coverage probability and the expected length as criteria, and recommended which confidence interval to choose in these two cases. In the end I gave two empirical examples to illustrate the applications to binomial and Poisson data.

    2. Methodology

    The coverage probability and the expected length are reasonable criteria for finding confidence intervals that have a relatively high probability of containing the true value and a relatively short length.

    2.1 The Actual Coverage Probability

    The coverage probability of a confidence interval is the proportion of the time that the interval contains the true value of the parameter.

    The coverage probability for a confidence interval CI of the parameter $\theta$ from a distribution $X \sim f_X(x \mid \theta)$, where $f_X(x \mid \theta)$ is the probability mass function, is calculated by the equation

    $$C(\theta) = \sum_k I(k, \theta)\, f(k \mid \theta) \tag{1}$$

    where, with CI computed from the observation $X = k$,

    $$I(k, \theta) = \begin{cases} 1, & \theta \in CI, \\ 0, & \theta \notin CI. \end{cases} \tag{2}$$

    2.2 The Expected Length

    The expected length, also called the expected width, measures the precision of a confidence interval.

    The expected length for a confidence interval CI of the parameter $\theta$ from a distribution $X \sim f_X(x \mid \theta)$ is calculated by the equation

    $$EW(\theta) = \sum_k \bigl(U(k) - L(k)\bigr)\, f(k \mid \theta) \tag{3}$$

    where $U(k)$ and $L(k)$ are the upper and lower limits of CI.


    Agresti and Coull (1998) suggested that we should choose an approach

    giving narrower intervals for which the actual coverage probability could be

    less than but usually quite close to the nominal confidence level, so they

    recommended the approximation intervals instead of the exact interval.

    3. Confidence Intervals for the Binomial Proportion

    The interval estimation for the binomial proportion is the simplest of the three cases and the most discussed one. Illustrating the problem in the binomial case helps us understand the problem for the other two distributions, and a variety of alternative confidence intervals are available for the binomial case. I therefore introduce the binomial case first.

    3.1 The Standard Wald Interval

    The standard Wald interval is the most widely used interval in applied statistical analysis and econometric research.

    The standard normal approximation confidence interval for the binomial proportion p has the following form.

    $$CI_S = \hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n} \tag{4}$$

    where $z_{\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution, $\hat{p} = X/n$ and $\hat{q} = 1 - \hat{p}$.

    The coverage probability for any confidence interval CI at a fixed value of p is

    $$C_n(p) = \sum_{k=0}^{n} I(k, p) \binom{n}{k} p^k (1-p)^{n-k}, \tag{5}$$

    where

    $$I(k, p) = \begin{cases} 1, & p \in CI, \\ 0, & p \notin CI. \end{cases} \tag{6}$$
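    Equations (4) to (6) can be evaluated directly, since the sum in (5) has only n+1 terms. The following is a minimal Python sketch, not the code used in the thesis; the function names `wald_interval` and `coverage` are hypothetical, and only the standard library is assumed.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_interval(k, n, alpha=0.05):
    """Wald limits p_hat -/+ z * sqrt(p_hat * q_hat / n), as in equation (4)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = k / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def coverage(p, n, interval=wald_interval, alpha=0.05):
    """Equation (5): add the binomial pmf over every k whose interval contains p."""
    total = 0.0
    for k in range(n + 1):
        lower, upper = interval(k, n, alpha)
        if lower <= p <= upper:  # indicator I(k, p) of equation (6)
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


if __name__ == "__main__":
    for n in (20, 40):
        print(n, coverage(0.5, n))
```

    Any other interval can be evaluated the same way by passing a different function with the signature `(k, n, alpha)` as `interval`.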

    Plotting the coverage probability with either p or n held fixed shows that a larger n may give a lower coverage probability than a smaller n (Figure 1(a)). At p=0.5 the coverage probability is below 0.92 when n=40 but above 0.95 when n=20 (Figure 1(b)), contrary to the textbook view that a larger sample gives better coverage. For a large sample size such as n=100, the coverage probability can still be poor when p is near the boundary (Figure 1(c)). The coverage probability is still chaotic for n=2000 when p is near 0 (Figure 1(d)).

    Figure 1: Plot of the coverage probability for the standard interval at: (a) fixed

    p=0.2 and n=25 to 100, (b) fixed p=0.5 and n=10 to 50, (c) fixed n=100 and

    0

  • 6

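    $$CI_W = \frac{X + z_{\alpha/2}^2/2}{n + z_{\alpha/2}^2} \pm \frac{z_{\alpha/2}\, n^{1/2}}{n + z_{\alpha/2}^2} \left( \hat{p}\hat{q} + \frac{z_{\alpha/2}^2}{4n} \right)^{1/2} \tag{7}$$

    The Wilson interval is the inversion of the equal-tail test of $H_0: p = p_0$ based on the CLT approximation. The coverage probability of the Wilson interval for n=50 is shown in Figure 2(a).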
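    As an illustration of the Wilson limits in equation (7), here is a short Python sketch. The function name `wilson_interval` is hypothetical and the code is not taken from the thesis.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(x, n, alpha=0.05):
    """Wilson (score) limits for a binomial proportion, following equation (7)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    centre = (x + z * z / 2) / (n + z * z)
    half = z * sqrt(n) / (n + z * z) * sqrt(p_hat * (1 - p_hat) + z * z / (4 * n))
    return centre - half, centre + half


if __name__ == "__main__":
    print(wilson_interval(10, 50))
```

    Because the interval inverts the score test, each endpoint $p_0$ satisfies $|\hat{p} - p_0| = z_{\alpha/2}\sqrt{p_0(1-p_0)/n}$, which gives a quick numerical check of the formula.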

    3.2.2 The Clopper-Pearson Interval

    The Clopper-Pearson (1934) interval is also called the exact interval, because it is the inversion of the equal-tail binomial test. It is based on the exact cumulative probabilities of the binomial distribution rather than on any approximation. The Clopper-Pearson interval can be written as

    $$\{\, p \mid P(Bin(n; p) \le X) \ge \alpha/2 \,\} \cap \{\, p \mid P(Bin(n; p) \ge X) \ge \alpha/2 \,\}, \tag{8}$$

    where X is the number of successes observed in the sample and $Bin(n; p)$ is a binomial random variable.

    Brown et al. (2001) mentioned that, because of a relationship between the cumulative binomial distribution and the beta distribution, the Clopper-Pearson interval can be presented in an alternate format that uses quantiles of the beta distribution. For the observed data $X = x$ the Clopper-Pearson interval has the following form.

    $$CI_{CP} = [L_{CP}(x), U_{CP}(x)] \tag{9}$$

    where $L_{CP}(x)$ is the $\alpha/2$ quantile of the Beta(x, n-x+1) distribution, and $U_{CP}(x)$ is the $1-\alpha/2$ quantile of the Beta(x+1, n-x) distribution. The coverage probability of the Clopper-Pearson interval for n=50 is shown in Figure 2(b). The plot shows that the exact 95% confidence intervals are rather conservative. It

    was proved by Wang (2006) that the Clopper-Pearson interval is the smallest

    two-sided conservative interval if and only if for 0 1 , (0; ,0.5)BF n ,


    where ( ; , )BF x n p is the cumulative distribution function of the binomial

    distribution.
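    The Clopper-Pearson limits can also be found without beta quantiles, by solving the two tail equations behind (8) numerically. The sketch below is an illustration under that assumption, using bisection and only the Python standard library; the helper names `binom_cdf`, `_bisect` and `clopper_pearson` are hypothetical.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(Bin(n, p) <= x)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def _bisect(f, lo, hi, steps=100):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    lo_positive = f(lo) > 0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    # Lower limit solves P(Bin(n, p) >= x) = alpha/2; it is 0 when x = 0.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2, 0.0, 1.0)
    # Upper limit solves P(Bin(n, p) <= x) = alpha/2; it is 1 when x = n.
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper


if __name__ == "__main__":
    print(clopper_pearson(10, 50))
```

    For x = 0 the upper limit has the closed form $1 - (\alpha/2)^{1/n}$, which is convenient for checking the numerical solution.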

    3.2.3 The Agresti-Coull Interval

    Agresti and Coull (1998) pointed out that the Clopper-Pearson interval is inefficient, and suggested an adjusted Wald interval of the following form.

    $$CI_{AC} = \tilde{p} \pm z_{\alpha/2} \sqrt{\tilde{p}\tilde{q}/\tilde{n}} \tag{10}$$

    where $\tilde{X} = X + z_{\alpha/2}^2/2$ and $\tilde{n} = n + z_{\alpha/2}^2$, then $\tilde{p} = \tilde{X}/\tilde{n}$ and $\tilde{q} = 1 - \tilde{p}$. The coverage probability of the Agresti-Coull interval for n=50 is shown in Figure 2(c). For a 95% confidence interval, $z_{\alpha/2}^2 = 1.96^2 = 3.84 \approx 4$, so if we use 2 instead of 1.96, this is the "add 2 successes and 2 failures" interval.

    The standard interval is simple for classroom presentation, but unfortunately it performs poorly. Agresti and Coull (1998) suggested the adjusted interval because it has the familiar form $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, and is thus a compromise alternative to the Wald interval.
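    The adjustment in equation (10) amounts to a few lines of code. The sketch below is illustrative only; the function name `agresti_coull` and the optional `z` argument (which lets the caller use 2 instead of 1.96) are assumptions of this example.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull(x, n, alpha=0.05, z=None):
    """Adjusted Wald limits of equation (10)."""
    if z is None:
        z = NormalDist().inv_cdf(1 - alpha / 2)
    x_tilde = x + z * z / 2   # add z^2/2 successes
    n_tilde = n + z * z       # add z^2 trials in total
    p_tilde = x_tilde / n_tilde
    half = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half, p_tilde + half


if __name__ == "__main__":
    # With z = 2 the interval is centred at (x + 2) / (n + 4).
    print(agresti_coull(10, 50, z=2.0))
```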

    3.2.4 The Likelihood Ratio Interval

    The likelihood ratio interval was introduced by Rao (1973). It is constructed by inverting the likelihood ratio test of $H_0: p = p_0$, which accepts $H_0$ if $-2\log(\Lambda_n) \le z_{\alpha/2}^2$, where $\Lambda_n$ is the likelihood ratio of the following form.

    $$\Lambda_n = \frac{L(p_0)}{\sup_p L(p)} = \frac{p_0^{X} (1-p_0)^{n-X}}{(X/n)^{X} (1 - X/n)^{n-X}} \tag{11}$$

    where $L(\cdot)$ is the likelihood function. The coverage probability of the likelihood ratio interval for n=50 is shown in Figure 2(d). We can see from the plot that the coverage probability gets quite chaotic when p is close to the boundary, but when 0.2

  • 9

    Beta(1/2, 1/2), thus the equal-tailed Jeffreys prior interval has the following form.

    $$CI_J = [L_J(x), U_J(x)] \tag{13}$$

    where $L_J(x)$ is the $\alpha/2$ quantile and $U_J(x)$ is the $1-\alpha/2$ quantile of the Beta(X+1/2, n-X+1/2) distribution. The coverage probability of the Jeffreys interval for n=50 is shown in Figure 3(a).

    As the lower limit of the Clopper-Pearson interval is the $\alpha/2$ quantile of the Beta(x, n-x+1) distribution, and the upper limit is the $1-\alpha/2$ quantile of the Beta(x+1, n-x) distribution, Brown et al. (2001) pointed out that the Jeffreys interval is always contained in the exact interval, and thus corrects the conservativeness of the exact interval.

    3.3.2 The Bayesian HPD Interval

    The highest posterior density (HPD) region consists of the values of p that fulfill $f(p \mid x) \ge c$, where $f(p \mid x)$ is the posterior density of p. The highest posterior density (HPD) interval can be denoted as:

    2 ( ) { :[ ( ) ( ) ( ) ( )] }H b p P l p l p h p h p b , (14)

    where $l(\cdot)$ is the log-likelihood function, and $h(p)$ is the prior of p (Severini, 1991). The coverage probability of the Bayesian HPD interval for n=50 with Beta(1/2, 1/2) prior is shown in Figure 3(b). We can see from the plot that the coverage probability gets quite chaotic when p is close to the boundary, but when 0.2

  • 10

    Figure 3: Plot of the coverage probability for n=50, 0.5 and 0

  • 11

    Figure 4: Plot of the expected lengths for confidence intervals at n=50, 0.5

    and 0

  • 12

    p

  • 13

    Figure 6: Plot of the expected length for Bayesian interval at n=50, 0.5

    and 0

  • 14

    boundary the coverage probabilities have a little difference.

    Figure 7: Plot of the coverage probability for Bayesian HPD interval with

    Beta(1/2, 1/2) prior (blue dotted line) and the Likelihood-Ratio Interval (black

    solid line) at n=50, 0.5 and 0

  • 15

    has the longest expected length. But this comparison is made at the same nominal confidence level, not at the same actual coverage probability. So I plotted the actual coverage probability against the expected length for the Wald interval and its alternative intervals (Figure 8).

    Figure 8: Plot of the actual coverage probability (vertical axis) against the expected length (horizontal axis) of the Clopper-Pearson interval (the black solid line) and (a) the Wilson interval, (b) the Agresti-Coull interval, (c) the likelihood ratio interval and (d) the Jeffreys interval (the purple dotted line).

    We can see from the plot that, when the actual coverage probability is between 0.96 and 0.97, the expected lengths of the Wilson interval, the Agresti-Coull interval, the likelihood ratio interval and the Jeffreys interval are somewhat shorter than that of the exact interval. But the plot also shows that, at the same expected length, the Clopper-Pearson interval always has the largest actual coverage probability. So it is not proper to say that the Clopper-Pearson interval is inefficient, because the expected lengths of the Agresti-Coull interval and the Clopper-Pearson interval are not very different at the same actual coverage probability.

    So I think we should choose the confidence interval for the binomial proportion according to the requirements and the purpose of the study. If the analysis requires a conservative confidence interval, we should choose the exact interval, and if we want a narrower interval, we should choose the Agresti-Coull interval instead.

    4. The Confidence Interval for the Poisson Mean

    The Poisson distribution is another discrete distribution in the exponential family. The estimation of the Poisson mean is also a commonly discussed problem in statistical analysis. The Poisson distribution describes the probability of a given number of events occurring in a fixed interval of time or space if those events occur with a known average rate and independently of the time since the last event. Let $\{X_1, \ldots, X_n\}$ be independent, identically distributed Poisson($\lambda$) random variables; then the probability mass function of $X_i$ is

    $$P(X_i = k) = \frac{e^{-\lambda} \lambda^{k}}{k!} \tag{17}$$

    where k is a non-negative integer, and $\lambda$ is a positive real number, which equals the expectation of X.

    4.1 The Wald Interval

    The simplest and most widely used confidence interval for a Poisson mean is still the Wald interval.

    For an independent, identically distributed Poisson($\lambda$) random sample $\{X_1, \ldots, X_n\}$, the standard normal approximation confidence interval for the Poisson mean has the following form.

    $$\bar{X} \pm z_{\alpha/2} (\bar{X}/n)^{1/2} \tag{18}$$

    where $\bar{X} = \sum_{i=1}^{n} X_i / n$, and $z_{\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution.

    Barker (2002) gives a method of computing the coverage probability and expected length of confidence intervals for the Poisson mean. For $T = \sum_{i=1}^{n} X_i$, which is sufficient for $\lambda$, the coverage probability of a confidence interval has the form:

    $$C_n(\lambda) = \sum_{i=0}^{\infty} I\{L(i) \le n\lambda \le U(i)\}\, \frac{e^{-n\lambda} (n\lambda)^{i}}{i!} \tag{19}$$

    where $I\{\cdot\}$ is the indicator function of the bracketed event, $L(T) = T - z_{\alpha/2} (T)^{1/2}$, and $U(T) = T + z_{\alpha/2} (T)^{1/2}$.

    By plotting the coverage probability for the Poisson mean we can see that the behavior of the Wald interval is also quite chaotic in the Poisson case (Figure 9). The plot also shows that the coverage behavior is less chaotic when $n\lambda$ is large (Figure 9(a), (b)), and for big values of $\lambda$ the coverage behavior is less erratic than for small values of $\lambda$ (Figure 9(c), (d)).
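    The infinite sum in equation (19) can be approximated by truncating it far in the upper tail of the Poisson distribution. The sketch below does this for the Wald limits on the scale of $T$; the names `poisson_pmf`, `wald_limits` and `poisson_coverage`, and the truncation rule, are assumptions of this example rather than part of Barker's method.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist


def poisson_pmf(i, m):
    """P(T = i) for T ~ Poisson(m), m > 0, computed on the log scale."""
    return exp(-m + i * log(m) - lgamma(i + 1))


def wald_limits(t, alpha=0.05):
    """L(T) and U(T) of equation (19): T -/+ z * sqrt(T)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return t - z * sqrt(t), t + z * sqrt(t)


def poisson_coverage(m, limits=wald_limits, alpha=0.05):
    """Equation (19) with m = n * lambda; terms beyond the cutoff are negligible."""
    cutoff = int(m + 12 * sqrt(m) + 30)  # truncation point, an assumption
    total = 0.0
    for i in range(cutoff + 1):
        lower, upper = limits(i, alpha)
        if lower <= m <= upper:
            total += poisson_pmf(i, m)
    return total


if __name__ == "__main__":
    for m in (0.05, 5.0, 100.0):
        print(m, poisson_coverage(m))
```

    For a very small value such as $n\lambda = 0.05$, the observation $T = 0$ gives the degenerate interval [0, 0], so the Wald interval misses the true value whenever no events are observed.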

    Figure 9: Plot of the coverage probability for the standard interval at: (a) $n\lambda$ = 10 to 500, (b) $n\lambda$ = 1 to 50, (c) fixed $\lambda$ = 5 and n = 5 to 100, and (d) fixed $\lambda$ = 0.2 and n = 5 to 100.

    4.2 Alternative Confidence Intervals

    4.2.1 The Score Interval

    The score interval is formed by inverting Rao's equal-tail test (Rao, 1973) of $H_0: \lambda = \lambda_0$. Barker (2002) gives the bounds of the score interval in the following form.

    $$\bar{X} + \frac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2} \left[ 4\bar{X} + \frac{z_{\alpha/2}^2}{n} \right]^{0.5} \Big/ (4n)^{0.5} \tag{20}$$

    Brown et al. (2003) mentioned that, in the Poisson case, the coverage probabilities are actually functions of $n\lambda$. So I plotted the coverage probability of the score interval for $n\lambda$ from 2 to 50 (Figure 10(a)).
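    A short Python sketch of the score limits in equation (20) follows; the function name `score_interval` is hypothetical and the code is only an illustration of the formula.

```python
from math import sqrt
from statistics import NormalDist


def score_interval(x_bar, n, alpha=0.05):
    """Score limits for a Poisson mean, following equation (20)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = x_bar + z * z / (2 * n)
    half = z * sqrt(4 * x_bar + z * z / n) / sqrt(4 * n)
    return centre - half, centre + half


if __name__ == "__main__":
    print(score_interval(2.0, 10))
```

    Each endpoint $\lambda_0$ solves $(\bar{X} - \lambda_0)^2 = z_{\alpha/2}^2 \lambda_0 / n$, the boundary of the score test, so the lower limit is 0 when $\bar{X} = 0$.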


    4.2.2 The Exact Method

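    As in the binomial case, the $(1-\alpha)100\%$ lower confidence limit for $\lambda$ is the value $\lambda_l$ that satisfies

    $$\sum_{i=S}^{\infty} \frac{e^{-n\lambda_l} (n\lambda_l)^{i}}{i!} = \frac{\alpha}{2} \tag{21}$$

    where S is the observed value of $T = \sum_{i=1}^{n} X_i$. If no such $\lambda_l$ exists (which happens when S = 0), the lower confidence limit is 0. The $(1-\alpha)100\%$ upper confidence limit for $\lambda$ is the value $\lambda_u$ that satisfies

    $$\sum_{i=0}^{S} \frac{e^{-n\lambda_u} (n\lambda_u)^{i}}{i!} = \frac{\alpha}{2} \tag{22}$$

    Fay and Feuer (1997) give the solution to (21) and (22) in terms of the $\chi^2$ distribution as:

    $$CI_E = [L_E(x), U_E(x)] = \left[ \tfrac{1}{2} (\chi^2_{2x})^{-1}(\alpha/2),\ \tfrac{1}{2} (\chi^2_{2(x+1)})^{-1}(1-\alpha/2) \right] \tag{23}$$

    where $(\chi^2_n)^{-1}(p)$ is the pth quantile of the $\chi^2$ distribution with n degrees of freedom, x is the observed total S, and the limits are on the scale of $n\lambda$.

    The coverage probability of the exact confidence interval for the Poisson mean with $n\lambda$ from 2 to 50 is plotted in Figure 10(b).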
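    The exact limits can be obtained numerically from the two Poisson tail conditions, without chi-square quantiles. The sketch below uses bisection and assumes the usual equal-tailed construction, in which the lower limit makes $P(T \ge S) = \alpha/2$ and the upper limit makes $P(T \le S) = \alpha/2$; the names `poisson_cdf`, `_bisect` and `exact_poisson_interval`, and the search bracket, are hypothetical.

```python
from math import exp, lgamma, log, sqrt


def poisson_cdf(t, m):
    """P(T <= t) for T ~ Poisson(m)."""
    if m == 0:
        return 1.0
    return sum(exp(-m + i * log(m) - lgamma(i + 1)) for i in range(t + 1))


def _bisect(f, lo, hi, steps=100):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    lo_positive = f(lo) > 0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_interval(t, n, alpha=0.05):
    """Exact limits for lambda given the total count t from n observations."""
    far = t + 10 * sqrt(t + 1) + 20  # crude upper bracket for n * lambda (assumption)
    lower = 0.0 if t == 0 else _bisect(
        lambda m: (1 - poisson_cdf(t - 1, m)) - alpha / 2, 0.0, far)
    upper = _bisect(lambda m: poisson_cdf(t, m) - alpha / 2, 0.0, far)
    return lower / n, upper / n  # convert from the n * lambda scale to lambda


if __name__ == "__main__":
    print(exact_poisson_interval(0, 1))
    print(exact_poisson_interval(20, 10))
```

    For a total count of 0 from a single observation, the upper limit equals $-\log(\alpha/2)$, which is also half the $1-\alpha/2$ quantile of the $\chi^2$ distribution with 2 degrees of freedom.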

    4.2.3 The Likelihood Ratio Interval

    The likelihood ratio interval for the Poisson case is constructed by inverting the likelihood ratio test of $H_0: \lambda = \lambda_0$. The interval covers $\lambda_0$ if $-2\log(\Lambda_n) \le z_{\alpha/2}^2$, where $\Lambda_n$ is the likelihood ratio given by

    $$\Lambda_n = \frac{e^{-n\lambda_0}\, \lambda_0^{\,n\bar{X}}}{e^{-n\bar{X}}\, \bar{X}^{\,n\bar{X}}} \tag{24}$$

    The coverage probability of the likelihood ratio confidence interval for the Poisson mean with $n\lambda$ from 2 to 50 is plotted in Figure 10(c).
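    The likelihood ratio interval has no closed form, but its endpoints are the two roots of $-2\log(\Lambda_n) = z_{\alpha/2}^2$. From (24), $-2\log(\Lambda_n) = 2n[\lambda_0 - \bar{X} + \bar{X}\log(\bar{X}/\lambda_0)]$. The sketch below finds the roots by bisection; the names `lr_statistic`, `_bisect` and `lr_interval`, and the search brackets, are assumptions of this example, which also presumes $n\bar{X}$ is a whole count.

```python
from math import log, sqrt
from statistics import NormalDist


def lr_statistic(lam0, x_bar, n):
    """-2 log(Lambda_n) for H0: lambda = lam0, derived from equation (24)."""
    if x_bar == 0:
        return 2 * n * lam0
    return 2 * n * (lam0 - x_bar + x_bar * log(x_bar / lam0))


def _bisect(f, lo, hi, steps=100):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    lo_positive = f(lo) > 0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def lr_interval(x_bar, n, alpha=0.05):
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    g = lambda lam: lr_statistic(lam, x_bar, n) - crit
    far = x_bar + 10 * sqrt((x_bar + 1) / n) + 10 / n  # right bracket (assumption)
    upper = _bisect(g, x_bar, far)
    # The statistic grows without bound as lam0 -> 0 when x_bar > 0.
    lower = 0.0 if x_bar == 0 else _bisect(g, x_bar * 1e-12, x_bar)
    return lower, upper


if __name__ == "__main__":
    print(lr_interval(2.0, 10))
```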

    4.2.4 The Jeffreys Interval

    The non-informative Jeffreys prior of the Poisson distribution is proportional to $\lambda^{-1/2}$, so the posterior distribution of $\lambda$ is $\lambda \mid X \sim Gamma(X + 1/2, 1/n)$, and the equal-tailed Jeffreys interval has the following form.

    $$CI_J = [L_J(x), U_J(x)] \tag{25}$$

    where $L_J(x)$ is the $\alpha/2$ quantile and $U_J(x)$ is the $1-\alpha/2$ quantile of the $Gamma(X + 1/2, 1/n)$ distribution.

    The coverage probability of the equal-tailed Jeffreys confidence interval for the Poisson mean with $n\lambda$ from 2 to 50 is plotted in Figure 10(d).

    Figure 10: Plot of the coverage probability (vertical axis) for the Poisson mean with $n\lambda$ (horizontal axis) from 2 to 50 of: (a) the score interval, (b) the exact interval, (c) the likelihood ratio interval, and (d) the Jeffreys interval.

    From the plots we can see that the likelihood ratio interval and the Jeffreys interval are more chaotic than the score interval and the exact interval, and the exact interval is the most conservative of these four intervals.

    The likelihood ratio interval and the Jeffreys interval in the Poisson case are also close to each other when $n\lambda$ is large. As in the binomial case, when $n\lambda$ is close to the boundary 0, the coverage probabilities are slightly different.

    4.3 The Expected Length of The Poisson Intervals

    The expected length for the Poisson mean is computed by:

    $$EW_n(\lambda) = \sum_{i=0}^{\infty} \bigl(U(i) - L(i)\bigr)\, \frac{e^{-n\lambda} (n\lambda)^{i}}{i!} \tag{26}$$

    The expected lengths of the Wald interval for the Poisson mean and of the alternative intervals are plotted below (Figure 11).
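    Equation (26) can be evaluated in the same way as the coverage probability, by truncating the sum in the upper tail. The sketch below compares the Wald and score intervals on the scale of $T = n\bar{X}$ at the 95% level; the names `wald_limits`, `score_limits` and `expected_length`, and the truncation rule, are assumptions of this example.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # z for a 95% interval


def wald_limits(t):
    """Wald limits on the scale of T: T -/+ z * sqrt(T)."""
    return t - Z * sqrt(t), t + Z * sqrt(t)


def score_limits(t):
    """Score limits of equation (20), multiplied by n so they bound n * lambda."""
    centre = t + Z * Z / 2
    half = Z * sqrt(t + Z * Z / 4)
    return centre - half, centre + half


def expected_length(m, limits):
    """Equation (26) with m = n * lambda > 0, truncated far in the upper tail."""
    cutoff = int(m + 12 * sqrt(m) + 30)  # truncation point, an assumption
    total = 0.0
    for i in range(cutoff + 1):
        lower, upper = limits(i)
        pmf = exp(-m + i * log(m) - lgamma(i + 1))
        total += (upper - lower) * pmf
    return total


if __name__ == "__main__":
    for m in (2.0, 10.0, 50.0):
        print(m, expected_length(m, wald_limits), expected_length(m, score_limits))
```

    On this scale the score interval has width $2z\sqrt{T + z^2/4}$ against $2z\sqrt{T}$ for the Wald interval, so its expected length is always somewhat larger.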

    Figure 11: Plot of the expected length (vertical axis) for the Poisson mean with $n\lambda$ (horizontal axis) from 2 to 50 of: the Wald interval (black solid line), the score interval (blue dashed line), the likelihood ratio interval (red dashed line), the Jeffreys interval (green dashed line), and the exact interval (purple dashed line).

    We can see from the plot that the expected lengths of the Wald interval and the likelihood ratio interval are the shortest, the expected length of the Jeffreys interval is close to that of the Wald interval, the expected length of the score interval is a little longer, and that of the exact interval is the longest.

    We can also plot the expected length against the actual coverage probability (Figure 12).

    Figure 12: Plot of the actual coverage probability (vertical axis) against the expected length (horizontal axis) of the exact interval (the black solid line) and (a) the Wald interval, (b) the score interval, (c) the likelihood ratio interval and (d) the Jeffreys interval (the purple dotted line).

    We can see from the plot that, when the actual coverage probability is close to 0.95, the score interval, the likelihood ratio interval and the Jeffreys interval are somewhat shorter than the exact interval. We can also see that, at the same expected length, the actual coverage probability of the exact interval is always the largest.

    So the choice of confidence interval should follow the same reasoning as in the binomial case. The approximation methods give shorter intervals but smaller coverage probability, and are therefore suitable for studies requiring efficiency. The exact method should be used if the study requires conservativeness.

    5. The Confidence Interval for the Negative binomial Mean

    The probability mass function of the negative-binomial variable $X \sim NB(r, p)$ has the form

    $$P(X = x \mid p) = \binom{r + x - 1}{x} p^{r} (1-p)^{x}, \quad x = 0, 1, \ldots;\ 0 < p < 1 \tag{27}$$

    The mean and variance are $E(X) = r(1-p)/p$ and $Var(X) = r(1-p)/p^2$. The negative-binomial variable $X \sim NB(r, p)$ describes the number of failures before the r-th success in a sequence of Bernoulli trials.
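    The stated mean and variance can be checked numerically from the probability function in (27). The following is a small Python sketch for that check; the function names `nb_pmf` and `nb_moments`, and the truncation of the infinite sums at 500 terms, are assumptions of this example.

```python
from math import comb


def nb_pmf(x, r, p):
    """P(X = x | p) for X ~ NB(r, p): x failures before the r-th success."""
    return comb(r + x - 1, x) * p**r * (1 - p) ** x


def nb_moments(r, p, terms=500):
    """Mean and variance obtained by summing the pmf over x = 0, ..., terms - 1."""
    mean = sum(x * nb_pmf(x, r, p) for x in range(terms))
    var = sum((x - mean) ** 2 * nb_pmf(x, r, p) for x in range(terms))
    return mean, var


if __name__ == "__main__":
    r, p = 3, 0.4
    print(nb_moments(r, p))                       # numerical sums
    print(r * (1 - p) / p, r * (1 - p) / p**2)    # closed forms
```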

    5.1 The Wald Interval

    Let $\{X_1, \ldots, X_n\}$ be independent, identically distributed NB(1, p) random variables; then $X_i \sim NB(1, p)$ and $\sum_{i=1}^{n} X_i \sim NB(n, p)$.

    The point estimate of the probability of success p is $\hat{p} = 1/(1 + \bar{X})$. The estimate of the mean is $\bar{X}$, and the estimated variance of the mean is $\hat{\sigma}^2 = (1-\hat{p})/(n\hat{p}^2)$. So the standard Wald confidence interval for the negative-binomial mean has the following form.

    $$CI_S = \bar{X} \pm z_{\alpha/2}\, \hat{\sigma} = \bar{X} \pm z_{\alpha/2} \sqrt{\frac{1-\hat{p}}{n\hat{p}^2}} \tag{28}$$

    where $z_{\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution.

    The coverage probability of a confidence interval for the negative-binomial mean is calculated by the equation

    $$CP_r(\mu) = \sum_{i=0}^{\infty} I\{L(i) \le \mu \le U(i)\} \binom{r+i-1}{i} p^{r} (1-p)^{i} = \sum_{i=0}^{\infty} I\{L(i) \le \mu \le U(i)\} \binom{r+i-1}{i} \left(\frac{r}{r+\mu}\right)^{r} \left(1 - \frac{r}{r+\mu}\right)^{i} \tag{29}$$

    where $I\{\cdot\}$ is the indicator function of the bracketed event, and $U(i)$ and $L(i)$ are the upper and lower bounds of the confidence interval.

    The behavior of the coverage probability of the standard Wald interval for the negative-binomial mean is quite chaotic (Figure 13), and the coverage probability never reaches the nominal confidence level, even when n=100. For large values of n (Figure 13(d)) and $\mu$ (Figure 13(b)) the coverage probability behaves less chaotically than for small values (Figure 13(a), (c)). The coverage probability does reach 0.95 when n=1000 (Figure 17).

    Figure 13: Plot of the coverage probability of the Wald interval for the negative-binomial mean at: (a) fixed $\mu$ = 5 and n = 2 to 50, (b) fixed $\mu$ = 100 and n = 2 to 50, (c) fixed n = 20 and $\mu$ = 0 to 100, and (d) fixed n = 100 and $\mu$ = 0 to 100.

    5.2 The Alternative Confidence Intervals

    5.2.1 The Score Interval

    Rao's score interval for the negative-binomial mean given by Brown et al. (2003) has the form

    $$CI_R = \frac{n\hat{\mu} + z_{\alpha/2}^2/2}{n - z_{\alpha/2}^2} \pm \frac{z_{\alpha/2}\, n^{1/2}}{n - z_{\alpha/2}^2} \left( \hat{\mu} + \hat{\mu}^2 + \frac{z_{\alpha/2}^2}{4n} \right)^{1/2} \tag{30}$$

    where $\hat{\mu} = \bar{X}$ is the estimate of the mean.

    The coverage probability of the score interval for $\mu$ from 0 to 100 and n=50 is plotted in Figure 14(a). The coverage probability of the score interval is larger and less chaotic than that of the standard interval.

    5.2.2 The Exact Method

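    As in the former cases, the $(1-\alpha)100\%$ lower confidence limit for $\mu$ is the value $\mu_l$ that satisfies

    $$\sum_{i=S}^{\infty} \binom{n+i-1}{i} \left(\frac{n}{n+\mu_l}\right)^{n} \left(\frac{\mu_l}{n+\mu_l}\right)^{i} = \frac{\alpha}{2} \tag{31}$$

    where S is the observed value of $T = \sum_{i=1}^{n} X_i$ (the lower limit is taken as 0 when S = 0). The $(1-\alpha)100\%$ upper confidence limit for $\mu$ is the value $\mu_u$ that satisfies

    $$\sum_{i=0}^{S} \binom{n+i-1}{i} \left(\frac{n}{n+\mu_u}\right)^{n} \left(\frac{\mu_u}{n+\mu_u}\right)^{i} = \frac{\alpha}{2} \tag{32}$$

    The coverage probability of the exact confidence interval for the negative-binomial mean with $\mu$ from 0 to 100 and n=50 is plotted in Figure 14(b). As for the former two distributions, the exact interval for the negative-binomial mean is also always conservative.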

    5.2.3 The Likelihood Ratio Interval

    The likelihood ratio interval for the negative-binomial mean is constructed by inverting the likelihood ratio test of $H_0: \mu = \mu_0$. The interval covers $\mu_0$ if $-2\log(\Lambda_n) \le z_{\alpha/2}^2$, where $\Lambda_n$ is the likelihood ratio given by

    $$\Lambda_n = \frac{(1-p_0)^{n\bar{X}}\, p_0^{\,n}}{(1-\hat{p})^{n\bar{X}}\, \hat{p}^{\,n}} \tag{33}$$

    where $p_0$ is the value of p under $H_0$ and the maximum likelihood estimate of p is $\hat{p}_{MLE} = 1/(1 + \bar{X})$.

    The coverage probability of the likelihood ratio confidence interval for the negative-binomial mean with $\mu$ from 0 to 100 and n=50 is plotted in Figure 14(c).

    5.2.4 The Jeffreys Interval

    Because the non-informative Jeffreys prior for the negative-binomial mean is proportional to $\mu^{-1/2}(1+\mu)^{-1/2}$, the conjugate prior for $p$ is proportional to $(1-p)^{-1/2}\,p^{-1}$. So the posterior distribution of $p$ is $p \mid X \sim \mathrm{Beta}(n, X+1/2)$, and the equal-tailed Jeffreys interval for $p$ has the following form.

    $$CI_J(p) = [L_J(x), U_J(x)] \qquad (34)$$

    where $L_J(x)$ is the $\alpha/2$ quantile and $U_J(x)$ is the $1-\alpha/2$ quantile of the $\mathrm{Beta}(n, X+1/2)$ distribution. Since $\mu = (1-p)/p$ is decreasing in $p$, the Jeffreys interval for $\mu$ is

    $$CI_J(\mu) = \left[\frac{1-U_J(x)}{U_J(x)}, \frac{1-L_J(x)}{L_J(x)}\right] \qquad (35)$$

    The coverage probability of the equal-tailed Jeffreys confidence interval for the negative-binomial mean with $\mu$ from 0 to 100 and n=50 is plotted in Figure 14(d).

    The likelihood ratio interval and the Jeffreys interval in the negative-binomial case are close to each other when $\mu$ is large, and differ when $\mu$ is close to 0, which coincides with the former two cases.

    Figure 14: Plot of the coverage probability for the negative-binomial mean with $\mu$ from 0 to 100 and n=50 of: (a) the score interval, (b) the exact interval, (c) the likelihood ratio interval, and (d) the Jeffreys interval.

    5.3 The Expected Length of The Negative-binomial Intervals

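    The expected length for the negative-binomial mean is computed by:

    $$EW(\mu) = \sum_{i=0}^{\infty} \left(U(i) - L(i)\right) \binom{r+i-1}{i} \left(\frac{r}{\mu+r}\right)^{r} \left(1 - \frac{r}{\mu+r}\right)^{i} \qquad (36)$$

    where $r$ is the number of successes and $U(i)$ and $L(i)$ are the upper and lower limits of the interval when $i$ is observed.

    The expected lengths of the Wald interval for the negative-binomial mean and of the alternative intervals are plotted below (Figure 15).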
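    Equation (36) can be evaluated by truncating the infinite sum. The sketch below is illustrative only: the truncation point i_max, the function names, and in particular the Wald-type interval i +/- z * sqrt(i + i^2 / r) are assumptions made here (the variance of a negative-binomial count with mean mu and r successes is mu + mu^2 / r), and the limits are not truncated at 0.

```python
import math
from statistics import NormalDist


def nb_pmf(i, r, mu):
    """C(r+i-1, i) (r/(mu+r))^r (1 - r/(mu+r))^i, evaluated on the log scale."""
    if mu == 0.0:
        return 1.0 if i == 0 else 0.0
    log_pmf = (math.lgamma(r + i) - math.lgamma(i + 1) - math.lgamma(r)
               + r * math.log(r / (mu + r)) + i * math.log(mu / (mu + r)))
    return math.exp(log_pmf)


def wald_interval(i, r, alpha=0.05):
    """Assumed Wald-type limits i +/- z * sqrt(i + i^2 / r), not truncated at 0."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2)
    half = z * math.sqrt(i + i * i / r)
    return i - half, i + half


def expected_length(mu, r, interval, i_max=2000):
    """Sum of (U(i) - L(i)) * P(X = i) over i = 0..i_max, a truncation of (36)."""
    total = 0.0
    for i in range(i_max + 1):
        lower, upper = interval(i, r)
        total += (upper - lower) * nb_pmf(i, r, mu)
    return total


for mu in (10.0, 50.0, 100.0):
    print(mu, expected_length(mu, 50, wald_interval))
```

    Any other interval can be passed in place of wald_interval, as long as it is a function of the observed count and r that returns the lower and upper limits.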

    [Figure 14, panels (a) to (d): coverage probability (vertical axis, 0.86 to 0.98) against mu (horizontal axis, 0 to 100).]

    Figure 15: Plot of the expected length for the negative-binomial mean with $\mu$ from 0 to 100 of: the Wald interval (black solid line), the score interval (grey dotted line), the likelihood ratio interval (green dotted line), the Jeffreys interval (blue dotted line), and the exact interval (red dotted line).

    We can see from the plot that the score interval has the longest expected length most of the time, and the Wald interval is always the shortest. The expected lengths of the likelihood ratio interval and the Jeffreys interval are close to each other.

    We can also plot the expected length against the actual coverage probability (Figure 16).

    [Figure 15, titled "Expected Length of the Negative-binomial Mean": expected length (vertical axis, 0 to 70) against mu (horizontal axis, 0 to 100).]

    Figure 16: Plot of the expected length against the actual coverage probability of the exact interval (the black solid line) and (a) the Wald interval, (b) the score interval, (c) the likelihood ratio interval and (d) the Jeffreys interval (the purple dotted line).

    We can see from the plot that, when the actual coverage probability is close to 0.95, the likelihood ratio interval and the Jeffreys interval are relatively shorter than the exact interval. We can also see that at the same expected length, the actual coverage probability of the exact interval is always the largest. In the negative-binomial case, the expected length of the exact method is closer to those of the approximation methods than in the former two cases, and the exact interval is always conservative.

    In the negative-binomial case, the choice should still depend on the requirements of the study: choose the exact method for conservativeness, and an approximation method for accuracy.

    [Figure 16, panels (a) to (d): coverage probability (vertical axis, 0.85 to 1.00) against expected length (horizontal axis, 10 to 70).]

    6. Conclusions and Discussions

    6.1 Conclusions

    From all three cases we can draw the following conclusions:

    1. The higher the coverage probability a confidence interval attains, the longer its expected length will be.

    2. The exact intervals are always conservative, and consequently have longer expected lengths.

    3. The approximation methods and the Bayesian methods have relatively shorter expected lengths, and correspondingly have more chaotic coverage probabilities.

    4. At the same expected length, the exact method attains the largest coverage probability.

    6.2 Discussion

    The three discrete distributions studied in this paper can be asymptotically approximated by each other. For the binomial case, if the number of trials n is very large while p is sufficiently small, such as $n \ge 100$ while $np \le 10$, then the distribution can be approximated by the Poisson distribution $X \sim Pois(np)$. For the negative-binomial case, if the number of successes n is very large while 1-p is sufficiently small, then the distribution can be approximated by the Poisson distribution $X \sim Pois(n(1-p))$. So we can also use asymptotic approximation to solve the confidence interval problem. I plotted the coverage probability for the three cases when n=1000 (Figure 17(a), (b)) and when n=100 (Figure 17(c), (d)), both with $np \le 10$.
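    The quality of the Poisson approximation for the binomial case can be quantified directly. The following sketch is an illustration added here, not part of the original analysis: it computes the total variation distance between Binomial(n, p) and Poisson(np) for the two settings n=100 and n=1000 with np = 10. The function names are hypothetical, and the negative-binomial case is not covered.

```python
import math


def binom_pmf(k, n, p):
    """Binomial(n, p) probability of k, evaluated on the log scale (0 < p < 1)."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def pois_pmf(k, lam):
    """Poisson(lam) probability of k, evaluated on the log scale (lam > 0)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def total_variation(n, p):
    """Total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    on_support = sum(abs(binom_pmf(k, n, p) - pois_pmf(k, lam))
                     for k in range(n + 1))
    # Poisson mass above n, where the binomial has no mass.
    poisson_tail = max(0.0, 1.0 - sum(pois_pmf(k, lam) for k in range(n + 1)))
    return 0.5 * (on_support + poisson_tail)


print(total_variation(100, 0.1), total_variation(1000, 0.01))
```

    With np held at 10, the distance is smaller for n=1000 than for n=100, in line with the statement that the approximation improves as n grows and p shrinks.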

    Figure 17: The coverage probability of the Wald interval for: (a) the binomial proportion at p = 0 to 0.01 (black solid line), n=1000, (b) the negative-binomial mean at (1-p) = 0 to 0.01 (black solid line), n=1000, and the Poisson mean at $\lambda$ = 0 to 10 (blue dotted line), (c) the binomial proportion at p = 0 to 0.1 (black solid line), n=100, (d) the negative-binomial mean at (1-p) = 0 to 0.1 (black solid line), n=100, and the Poisson mean at $\lambda$ = 0 to 10 (blue dotted line).

    We can see from the plot that when p (or 1-p in the negative-binomial case) is close to 0, the coverage probabilities of the binomial and negative-binomial cases coincide with that of the Poisson case. The coverage probabilities are closer when n is larger and $\lambda$ is smaller, which agrees with the knowledge that the three discrete distributions can be asymptotically approximated by each other. This suggests that in the study of confidence intervals, when one of the three cases is hard to calculate, we can use the asymptotic approximation by one of the other two.

    [Figure 17, panels (a) to (d): coverage probability (vertical axis, 0.82 to 0.94) against p in panels (a) and (c) and against 1-p in panels (b) and (d); horizontal axis ticks run from 0.000 to 0.008 in panels (a) and (b) and from 0.00 to 0.08 in panels (c) and (d).]

    7. Empirical examples

    7.1 Female births proportion

    To illustrate the differences between the confidence intervals, I give a brief example of the binomial case below. The data are the numbers of girls and boys born in Paris from 1745 to 1770, taken from Gelman (2004). They were first studied by Laplace, who used them to estimate the proportion of female births in a population and to see whether the proportion of female births in European populations was less than 0.5. A total of 241,945 girls and 251,527 boys were born in that period.

    So in this example n=493,472 and the point estimate of the proportion is $\hat{p} = X/n = 0.4902912$. I calculated the limits and length of the Wald interval and the alternative confidence intervals for this binomial proportion, and the results are in Table 1.

    Table 1: 95% Confidence intervals for the female births proportion

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.488896465   0.491686020   0.002789555
    Wilson                0.488896546   0.491686090   0.002789544
    Agresti-Coull         0.488896546   0.491686090   0.002789544
    Clopper-Pearson       0.488895512   0.491687087   0.002791575
    Likelihood ratio      0.488896538   0.491686050   0.002789512
    Jeffreys              0.488896525   0.491686074   0.002789548
    Bayesian HPD          0.488896525   0.491686074   0.002789548

    We can see from the table that the Agresti-Coull interval and the Wilson score interval have the same limits and length, and that the Jeffreys interval and the Bayesian HPD interval also give the same results. The Clopper-Pearson interval is the longest, and the likelihood-ratio interval is the shortest. Among these intervals, only the Wald interval is symmetric. Although the Clopper-Pearson interval is the longest, it is the better choice if the aim is to see whether the proportion of female births was less than 0.5: at the 95% confidence level, 0.5 lies above its upper limit. Because the Clopper-Pearson interval is always conservative, it supports this result best, even though 0.5 is above the upper limits of all these intervals.
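    The Wald and Wilson rows of Table 1 can be reproduced with a few lines of code. This is a sketch using the textbook forms of the two intervals, which is assumed to match what was used for the table; the function names are chosen for this illustration.

```python
import math
from statistics import NormalDist


def wald_interval(x, n, alpha=0.05):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(x, n, alpha=0.05):
    """Wilson score interval, obtained by inverting the score test."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2)
    p_hat = x / n
    center = (p_hat + z * z / (2 * n)) / (1.0 + z * z / n)
    half = (z / (1.0 + z * z / n)) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4 * n * n))
    return center - half, center + half


girls, births = 241945, 493472
for name, (lower, upper) in [("Wald", wald_interval(girls, births)),
                             ("Wilson", wilson_interval(girls, births))]:
    print(f"{name:8s} {lower:.9f} {upper:.9f} {upper - lower:.9f}")
```

    With n this large the two intervals differ only from the seventh decimal place on, which is why all rows of Table 1 are so close.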

    7.2 The horse kick data

    Quine and Seneta (2006) introduced Bortkiewicz's (1898) horse kick data set, which describes the numbers of men killed by horse kicks in the Prussian Army from 1875 to 1894, and Bortkiewicz showed in his book that these data follow a Poisson distribution.

    There are 280 observations from 14 corps in the army over 20 years, so the variable $X_{ij}$ (i = 1, ..., 20; j = 1, ..., 14) records the number of deaths by horse kick in year 1874+i for corps j. Bortkiewicz showed in his book that $X_{ij} \sim Pois(\lambda_j)$, and calculated the point estimator of $\lambda_j$ by $\hat{\lambda}_j = \sum_i X_{ij} / 20$. I calculated the limits and length of the Wald interval and the alternative confidence intervals for these Poisson means. The results for corps G are in Table 2, and the results for the other 13 corps are in the appendix (Tables 3 to 15).

    We can see from these tables that, when n is small, the differences between the intervals are larger than in the former example. The Wald interval is the shortest, and the exact interval is the longest. In this example I would recommend the likelihood ratio interval, because it is relatively short and its coverage probability behaves less chaotically than that of the Wald interval.

    Table 2: 95% Confidence intervals for the horse kick numbers of the corps G

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    G       16                0.8

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.408         1.192         0.784
    Score                 0.492         1.300         0.807
    Likelihood-ratio      0.469         1.258         0.789
    Jeffreys              0.476         1.268         0.792
    Exact                 0.457         1.299         0.842
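    The Wald, score and exact rows of Table 2 can be checked with a short script. This is a sketch under assumptions: the Wald and score intervals are written in their usual Poisson forms, the exact limits are found by bisection on the Poisson distribution of the 20-year total, and all function names are chosen for this illustration.

```python
import math
from statistics import NormalDist


def pois_cdf(k, m):
    """P(X <= k) for X ~ Poisson(m); 0 when k < 0 and 1 when m = 0."""
    if k < 0:
        return 0.0
    if m == 0.0:
        return 1.0
    terms = (math.exp(-m + i * math.log(m) - math.lgamma(i + 1))
             for i in range(k + 1))
    return min(1.0, sum(terms))


def bisect(f, lo, hi, iters=80):
    """Approximate root of f on [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def wald_interval(total, n, z):
    lam = total / n
    half = z * math.sqrt(lam / n)
    return lam - half, lam + half


def score_interval(total, n, z):
    lam = total / n
    center = lam + z * z / (2 * n)
    half = z * math.sqrt(lam / n + z * z / (4 * n * n))
    return center - half, center + half


def exact_interval(total, n, alpha=0.05):
    """Exact limits for the yearly mean; the total count is Poisson(n * lambda)."""
    hi = total + 10.0
    while pois_cdf(total, hi) > alpha / 2:  # widen until hi exceeds the upper limit
        hi *= 2.0
    lower = 0.0 if total == 0 else bisect(
        lambda m: 1.0 - pois_cdf(total - 1, m) - alpha / 2, 0.0, hi)
    upper = bisect(lambda m: pois_cdf(total, m) - alpha / 2, 0.0, hi)
    return lower / n, upper / n


z = NormalDist().inv_cdf(0.975)
for name, (lower, upper) in [("Wald", wald_interval(16, 20, z)),
                             ("Score", score_interval(16, 20, z)),
                             ("Exact", exact_interval(16, 20))]:
    print(f"{name:6s} {lower:.3f} {upper:.3f} {upper - lower:.3f}")
```

    Replacing the total 16 by the totals of the other corps gives the corresponding rows of the appendix tables.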

    References

    [1] Agresti, A. & Coull, B.A. 1998, "Approximate is Better than "Exact" for Interval Estimation of Binomial Proportions", American Statistician, vol. 52, no. 2, pp. 119-126.

    [2] Barker, L. 2002, "A comparison of nine confidence intervals for a Poisson parameter when the expected number of events is ≤ 5", American Statistician, vol. 56, no. 2, pp. 85-89.

    [3] Bortkiewicz, L. von 1898, Das Gesetz der kleinen Zahlen, Teubner, Leipzig.

    [4] Brown, L.D., Cai, T.T. & DasGupta, A. 2001, "Interval estimation for a binomial proportion", Statistical Science, vol. 16, no. 2, pp. 101-133.

    [5] Brown, L.D., Cai, T.T. & DasGupta, A. 2002, "Confidence intervals for a binomial proportion and asymptotic expansions", Annals of Statistics, vol. 30, no. 1, pp. 160-201.

    [6] Brown, L.D., Cai, T.T. & DasGupta, A. 2003, "Interval estimation for exponential families", Statistica Sinica, vol. 13, pp. 19-49.

    [7] Clopper, C.J. & Pearson, E.S. 1934, "The use of confidence or fiducial limits illustrated in the case of the binomial", Biometrika, vol. 26, pp. 404-413.

    [8] Fay, M.P. & Feuer, E.J. 1997, "Confidence intervals for directly standardized rates: a method based on the gamma distribution", Statistics in Medicine, vol. 16, pp. 791-801.

    [9] Gelman, A. 2004, Bayesian Data Analysis, 2nd ed., Chapman & Hall, Boca Raton.

    [10] Quine, M.P. & Seneta, E. 2006, "Bortkiewicz's data and the law of small numbers", International Statistical Institute, vol. 55, pp. 173-181.

    [11] Rao, C.R. 1973, Linear Statistical Inference and Its Applications, Wiley, New York.

    [12] Severini, T.A. 1991, "On the Relationship Between Bayesian and Non-Bayesian Interval Estimates", Journal of the Royal Statistical Society. Series B (Methodological), vol. 53, no. 3, pp. 611-618.

    [13] Wang, W. 2006, "Smallest confidence intervals for one binomial proportion", Journal of Statistical Planning and Inference, vol. 136, no. 12, pp. 4293-4306.

    [14] Wilson, E.B. 1927, "Probable inference, the law of succession, and statistical inference", Journal of the American Statistical Association, vol. 22, pp. 209-212.

    Appendix

    A.1 Interval estimation results of the horse kick numbers for the 13 corps.

    Table 3: 95% Confidence intervals for the horse kick numbers of the corps I

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    I       16                0.8

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.408         1.192         0.784
    Score                 0.492         1.300         0.807
    Likelihood-ratio      0.469         1.258         0.789
    Jeffreys              0.476         1.268         0.792
    Exact                 0.457         1.299         0.842

    Table 4: 95% Confidence intervals for the horse kick numbers of the corps II

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    II      12                0.6

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.261         0.939         0.679
    Score                 0.343         1.049         0.706
    Likelihood-ratio      0.321         1.006         0.685
    Jeffreys              0.328         1.016         0.688
    Exact                 0.310         1.048         0.738

    Table 5: 95% Confidence intervals for the horse kick numbers of the corps III

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    III     12                0.6

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.261         0.939         0.679
    Score                 0.343         1.049         0.706
    Likelihood-ratio      0.321         1.006         0.685
    Jeffreys              0.328         1.016         0.688
    Exact                 0.310         1.048         0.738

    Table 6: 95% Confidence intervals for the horse kick numbers of the corps IV

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    IV      8                 0.4

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.123         0.677         0.554
    Score                 0.203         0.789         0.587
    Likelihood-ratio      0.183         0.745         0.562
    Jeffreys              0.189         0.755         0.566
    Exact                 0.173         0.788         0.615

    Table 7: 95% Confidence intervals for the horse kick numbers of the corps V

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    V       11                0.55

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.225         0.875         0.650
    Score                 0.307         0.985         0.678
    Likelihood-ratio      0.286         0.942         0.656
    Jeffreys              0.292         0.952         0.660
    Exact                 0.275         0.984         0.710

    Table 8: 95% Confidence intervals for the horse kick numbers of the corps VI

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    VI      17                0.85

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.446         1.254         0.808
    Score                 0.531         1.361         0.831
    Likelihood-ratio      0.507         1.320         0.813
    Jeffreys              0.514         1.330         0.816
    Exact                 0.495         1.361         0.866

    Table 9: 95% Confidence intervals for the horse kick numbers of the corps VII

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    VII     12                0.6

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.261         0.939         0.679
    Score                 0.343         1.049         0.706
    Likelihood-ratio      0.321         1.006         0.685
    Jeffreys              0.328         1.016         0.688
    Exact                 0.310         1.048         0.738

    Table 10: 95% Confidence intervals for the horse kick numbers of the corps VIII

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    VIII    7                 0.35

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.091         0.609         0.519
    Score                 0.170         0.723         0.553
    Likelihood-ratio      0.150         0.677         0.526
    Jeffreys              0.157         0.687         0.531
    Exact                 0.141         0.721         0.580

    Table 11: 95% Confidence intervals for the horse kick numbers of the corps IX

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    IX      13                0.65

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.297         1.003         0.707
    Score                 0.380         1.112         0.732
    Likelihood-ratio      0.358         1.070         0.712
    Jeffreys              0.364         1.080         0.716
    Exact                 0.346         1.112         0.765

    Table 12: 95% Confidence intervals for the horse kick numbers of the corps X

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    X       15                0.75

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.370         1.130         0.759
    Score                 0.455         1.238         0.783
    Likelihood-ratio      0.432         1.196         0.765
    Jeffreys              0.438         1.206         0.767
    Exact                 0.420         1.237         0.817

    Table 13: 95% Confidence intervals for the horse kick numbers of the corps XI

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    XI      25                1.25

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.760         1.740         0.980
    Score                 0.847         1.845         0.999
    Likelihood-ratio      0.822         1.806         0.984
    Jeffreys              0.829         1.815         0.986
    Exact                 0.809         1.845         1.036

    Table 14: 95% Confidence intervals for the horse kick numbers of the corps XIV

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    XIV     24                1.2

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.720         1.680         0.960
    Score                 0.806         1.786         0.979
    Likelihood-ratio      0.782         1.746         0.964
    Jeffreys              0.789         1.756         0.967
    Exact                 0.769         1.786         1.017

    Table 15: 95% Confidence intervals for the horse kick numbers of the corps XV

    Corps   $\sum_i X_{ij}$   $\hat{\lambda}_j$
    XV      8                 0.4

    Confidence Interval   Lower limit   Upper limit   Length
    Wald                  0.123         0.677         0.554
    Score                 0.203         0.789         0.587
    Likelihood-ratio      0.183         0.745         0.562
    Jeffreys              0.189         0.755         0.566
    Exact                 0.173         0.788         0.615

    A.2 Codes

    #Codes for calculating the coverage probability of the Wald interval for the binomial proportion#

    Cwb

  • i

  • pq