Post on 27-Dec-2019
Chapter 5, Probability Distributions
5.1 Introduction
- In this chapter, we discuss various probability distributions, both discrete and continuous.
- A discrete probability distribution is used when the sample space is discrete, that is, finite or countably infinite. Common discrete probability distributions include: discrete uniform; binomial and multinomial; hypergeometric; negative binomial; geometric; Poisson.
- A continuous probability distribution is used when the sample space is continuous. Common continuous probability distributions include: uniform; normal (or Gaussian); Gamma; Beta; t distribution; F distribution; $\chi^2$ distribution.
5.2 Discrete uniform distribution
- The definition: if a r. v. X assumes the values $x_1, x_2, \ldots, x_k$ with equal probabilities, then X follows the discrete uniform distribution, and its probability function is:
$$f(x; k) = \frac{1}{k}, \quad x = x_1, x_2, \ldots, x_k$$
- The mean and variance:
$$\mu = \frac{1}{k}\sum_{i=1}^{k} x_i, \qquad \sigma^2 = \frac{1}{k}\sum_{i=1}^{k} (x_i - \mu)^2$$
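The mean and variance formulas for the discrete uniform distribution can be checked with a short sketch. The fair die used here is a hypothetical example, not one from these notes, and the function name is illustrative only.

```python
from fractions import Fraction

def discrete_uniform_stats(values):
    """Return (mean, variance) of a discrete uniform r.v. over `values`."""
    k = len(values)
    mean = sum(Fraction(v) for v in values) / k
    variance = sum((Fraction(v) - mean) ** 2 for v in values) / k
    return mean, variance

# Hypothetical example: the six faces of a fair die, each with probability 1/6.
mean, variance = discrete_uniform_stats([1, 2, 3, 4, 5, 6])
print(mean, variance)  # 7/2 35/12
```

Exact fractions are used so that the result can be compared directly with a hand calculation.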
5.3 Binomial and multinomial distributions
- First, let us introduce the Bernoulli process. If:
  the outcome of the process is either success (X = 1) or failure (X = 0), and
  the probability of success is P(X = 1) = p and the probability of failure is P(X = 0) = 1 - p = q,
  then the process is a Bernoulli process.
- The probability distribution of the Bernoulli process: $p(x) = p^x (1 - p)^{1-x}$, x = 0, 1 and 0 < p < 1
- The mean and the variance: E(X) = p, V(X) = p(1 - p)
- An example: what is the probability of picking a male student (8 of the 12 students are male)?
  X = 1: male student, with probability p = 8/12 = 2/3
  X = 0: female student, with probability 1 - p = 1/3
  Thus, the probability distribution is: $P(x) = (2/3)^x (1/3)^{1-x}$, x = 0, 1
  In addition, the mean is p = 2/3 and the variance is V = (2/3)(1/3) = 2/9
- Binomial distribution: the binomial distribution is built on the Bernoulli process. It is made up of n independent Bernoulli trials. Suppose that $X_1, X_2, \ldots, X_n$ are independent Bernoulli random variables with the same success probability p; then $Y = \sum_{i=1}^{n} X_i$ follows the binomial distribution. (Note that Y is the number of successes among the n trials.)
- The probability distribution of the binomial distribution is:
$$b(y; n, p) = \binom{n}{y} p^y (1 - p)^{n-y}, \quad y = 0, 1, \ldots, n$$
- The student example: pick three students from the 12 students. (Note that we must sample with replacement to keep the probability constant and the picks independent.)
  none of the 3 is male: the possibility: FFF; the probability: $(1-p)^3 = 0.037$
  one of the 3 is male: the possibilities: MFF, FMF, FFM; the probability: $3p(1-p)^2 = 0.222$
  two of the 3 are male: the possibilities: MMF, MFM, FMM; the probability: $3p^2(1-p) = 0.444$
  three of the 3 are male: the possibility: MMM; the probability: $p^3 = 0.296$
  In general, the formula is:
$$P(Y = y) = \binom{3}{y} p^y (1-p)^{3-y}, \quad y = 0, 1, 2, 3$$
  The general formula for n trials can be derived in the same manner.
- Mean and variance of the binomial distribution: $E(Y) = \sum E(X_i) = \sum p = np$, $V(Y) = \sum V(X_i) = \sum p(1 - p) = np(1 - p)$
- The example: find the mean and variance of the number of male students picked, and then use Chebyshev's theorem to interpret the interval $\mu \pm 2\sigma$.
  $\mu$ = (3)(2/3) = 2, $\sigma^2$ = (3)(2/3)(1/3) = 2/3, $\sigma$ = 0.816
  at k = 2: $\mu + 2\sigma$ = 2 + (2)(0.816) = 3.63 and $\mu - 2\sigma$ = 2 - (2)(0.816) = 0.37, so the integer values inside the interval are 1, 2 and 3
  $1 - 1/k^2 = 3/4$. Therefore, the probability that the number of male students picked is between 1 and 3 should be at least 3/4. Indeed, the actual probability is p(1) + p(2) + p(3) = 0.963.
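The student example can be reproduced numerically. The following is a minimal sketch, assuming three picks with replacement and p = 2/3; the function and variable names are illustrative only.

```python
from math import comb, sqrt

def binomial_pmf(y, n, p):
    """P(Y = y) for n independent Bernoulli trials with success probability p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

n, p = 3, 2 / 3  # three picks with replacement, P(male) = 2/3
pmf = [binomial_pmf(y, n, p) for y in range(n + 1)]

mu = n * p
sigma = sqrt(n * p * (1 - p))

# Probability that Y lies within mu +/- k*sigma, versus Chebyshev's lower bound.
k = 2
inside = sum(q for y, q in enumerate(pmf) if abs(y - mu) < k * sigma)
chebyshev_bound = 1 - 1 / k**2

for y, q in enumerate(pmf):
    print(f"P(Y = {y}) = {q:.3f}")
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
print(f"P(|Y - mu| < {k} sigma) = {inside:.3f}, Chebyshev bound = {chebyshev_bound}")
```

Chebyshev's theorem only gives a lower bound, so the exact probability printed on the last line is expected to be well above it.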
- Using the Binomial distribution table: a function of n and p.
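- Multinomial distribution: this is an extension of the binomial distribution. Suppose each of n independent trials results in one of k possible outcomes with probabilities $p_1, p_2, \ldots, p_k$, and let $x_1, x_2, \ldots, x_k$ be the number of times each outcome occurs, where
$$\sum_{i=1}^{k} p_i = 1, \qquad \sum_{i=1}^{k} x_i = n$$
  then they follow the multinomial distribution with the probability distribution:
$$f(x_1, \ldots, x_k; p_1, \ldots, p_k, n) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$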
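5.4 Hypergeometric Distribution
- The example: what is the probability of picking three male students in a row? Note that this time the picks are not independent, because we sample without replacement. As a result, we need to use the hypergeometric distribution. With 8 male and 4 female students among the 12, the distribution is formed as follows:
  no male student among the 3 students picked:
  total $\binom{12}{3} = 220$, male $\binom{8}{0} = 1$, female $\binom{4}{3} = 4$
  probability = (1)(4)/220 = 0.018
  one male student among the 3 students picked:
  total $\binom{12}{3} = 220$, male $\binom{8}{1} = 8$, female $\binom{4}{2} = 6$
  probability = (8)(6)/220 = 0.218
  two male students among the 3 students picked:
  total $\binom{12}{3} = 220$, male $\binom{8}{2} = 28$, female $\binom{4}{1} = 4$
  probability = (28)(4)/220 = 0.509
  three male students among the 3 students picked:
  total $\binom{12}{3} = 220$, male $\binom{8}{3} = 56$, female $\binom{4}{0} = 1$
  probability = (56)(1)/220 = 0.255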
In general, the probability distribution is as follows:
$$P(Y = y) = \frac{\binom{8}{y}\binom{4}{3-y}}{\binom{12}{3}}, \quad y = 0, 1, 2, 3$$
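- The general formula of the hypergeometric distribution:
$$P(Y = y) = \frac{\binom{k}{y}\binom{N-k}{n-y}}{\binom{N}{n}}, \quad y = 0, 1, 2, \ldots, n$$
  where N is the population size, k is the number of successes in the population, and n is the sample size.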
- The mean and the variance of the hypergeometric distribution:
$$\mu = \frac{nk}{N}, \qquad \sigma^2 = \frac{N-n}{N-1} \cdot \frac{nk}{N}\left(1 - \frac{k}{N}\right)$$
  As a special case, let N tend to infinity with k/N = p held fixed; then (N - n)/(N - 1) tends to 1. Hence $\mu = np$ and $\sigma^2 = np(1 - p)$.
  That is, the hypergeometric distribution becomes the binomial distribution.
- We can also define the multivariate hypergeometric distribution
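The hypergeometric example can be checked with a short sketch. It assumes the class of 12 students with 8 males and a sample of 3 drawn without replacement; the names are illustrative only.

```python
from math import comb

def hypergeom_pmf(y, N, k, n):
    """P(Y = y): y successes in n draws without replacement
    from N items, k of which are successes."""
    return comb(k, y) * comb(N - k, n - y) / comb(N, n)

N, k, n = 12, 8, 3  # 12 students, 8 male, 3 picked
pmf = [hypergeom_pmf(y, N, k, n) for y in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean = sum(y * q for y, q in enumerate(pmf))
var = sum((y - mean) ** 2 * q for y, q in enumerate(pmf))

# Closed-form mean and variance of the hypergeometric distribution.
mean_formula = n * k / N
var_formula = (N - n) / (N - 1) * n * (k / N) * (1 - k / N)

print([round(q, 3) for q in pmf])
print(mean, mean_formula, var, var_formula)
```

The factor (N - n)/(N - 1) makes the variance smaller than the binomial value np(1 - p), which reflects sampling without replacement from a finite population.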
5.5 Negative Binomial and Geometric Distributions
- An example: when picking three students, what is the probability that the third student is the second male?
  one possibility is FMM and its probability is $(1-p)p^2$
  the other possibility is MFM and its probability is $(1-p)p^2$
  note that there are $\binom{3-1}{2-1} = 2$ combinations, and hence the probability is:
$$f(X = 3; k = 2) = \binom{3-1}{2-1}(1 - p)\,p^2$$
- The general formula for the negative binomial distribution is as follows:
$$f(X = x) = \binom{x-1}{k-1} p^k (1 - p)^{x-k}, \quad x = k, k+1, k+2, \ldots$$
  where x is the number of trials needed to obtain the kth success.
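- The mean and variance of the negative binomial distribution, with X counted as the number of trials: E(X) = k/p, V(X) = k(1-p)/p^2. (The expected number of failures before the kth success is k(1-p)/p.)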
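- Another example: picking until a male student is obtained:
  success on the first pick: p
  success on the second pick: (1-p)p
  success on the third pick: $(1-p)^2 p$
- The general formula is: $f(X = x) = (1 - p)^{x-1} p$, x = 1, 2, 3, ...
  This is the geometric distribution, the special case of the negative binomial distribution with k = 1.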
- The mean and variance of the geometric distribution: E(X) = 1/p, V(X) = (1-p)/p^2
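A minimal sketch of the negative binomial and geometric calculations, assuming p = 2/3 from the student example and X counted as the number of trials (successes included). The infinite sums for the means are truncated at 200 trials, which is an approximation chosen for this illustration.

```python
from math import comb

def neg_binomial_pmf(x, k, p):
    """P(the k-th success occurs on trial x), for x = k, k+1, ..."""
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)

p = 2 / 3  # P(male) from the student example

# Third student picked is the second male (x = 3, k = 2).
third_is_second_male = neg_binomial_pmf(3, 2, p)

# The geometric distribution is the negative binomial with k = 1.
# Means are approximated by truncating the infinite sums.
geometric_mean = sum(x * neg_binomial_pmf(x, 1, p) for x in range(1, 200))
negbin_mean = sum(x * neg_binomial_pmf(x, 2, p) for x in range(2, 200))

print(third_is_second_male)         # compare with 2 (1 - p) p^2
print(geometric_mean, 1 / p)        # compare with 1/p
print(negbin_mean, 2 / p)           # compare with k/p for k = 2
```

With trials counted this way, the numerical means agree with 1/p for the geometric case and k/p for the negative binomial case.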
5.6 Poisson Distribution
- A Poisson process is a random process in which a discrete event takes place over continuous intervals of time or space. Examples of Poisson processes include:
  the arrival of telephone calls at a switchboard,
  cars passing an electronic checking device.
  Note that all these examples involve a discrete random event. In any given small period of time (or region), the probability that the event occurs is small; however, over a long time (or large region), the number of occurrences is large.
- Poisson distribution plays an extremely important role in science and engineering, since it represents an appropriate probabilistic model for a large number of observational phenomena.
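- The Poisson distribution can be described by the following formula:
$$p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots$$
  where $\lambda$ is the average number of outcomes per unit time or region. Hence, $\lambda t$ represents the average number of outcomes in the time interval or region t.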
Proof: refer to the textbook.
- The Poisson distribution can be considered as an approximation to the binomial distribution when n is large and p is small, with $\lambda t = np$.
- From a physical point of view, consider a time interval of length T divided into n equal sub-intervals of length $\Delta t$ ($\Delta t \to 0$; note that $T = n\Delta t$), and assume:
  The probability of a success in any sub-interval $\Delta t$ is given by $\lambda \Delta t$.
  The probability of more than one success in any sub-interval $\Delta t$ is negligible.
  The probability of a success in any sub-interval does not depend on what happened prior to that time.
  Then, we have the Poisson distribution.
- Mean and variance of the Poisson distribution: $\mu = \lambda t$, $\sigma^2 = \lambda t$
- An example: in a large company, industrial accidents occur at a mean rate of three per week ($\lambda t = 3$) (note that accidents occur independently).
  the probability distribution: $p(y) = 3^y e^{-3} / y!$, y = 0, 1, 2, ...
  the probabilities can be computed directly or looked up in the Poisson distribution table.
  the probability of four or fewer accidents in a week: $P(Y \le 4)$ = p(0) + p(1) + p(2) + p(3) + p(4) = 0.815
  the probability of four or more accidents: $P(Y \ge 4) = 1 - P(Y \le 3)$ = 0.353
  the probability of exactly four accidents: $P(Y = 4) = P(Y \le 4) - P(Y \le 3)$ = 0.168; note that this is the same as p(4) = 0.168
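The accident example can be computed directly instead of using the Poisson table. This is a minimal sketch with illustrative function names, assuming a mean of 3 accidents per week.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """P(X = x) for a Poisson distribution with the given mean."""
    return exp(-mean) * mean**x / factorial(x)

def poisson_cdf(x, mean):
    """P(X <= x), summing the probability function from 0 to x."""
    return sum(poisson_pmf(i, mean) for i in range(x + 1))

mean = 3  # three accidents per week
p_at_most_4 = poisson_cdf(4, mean)
p_at_least_4 = 1 - poisson_cdf(3, mean)
p_exactly_4 = poisson_cdf(4, mean) - poisson_cdf(3, mean)

print(f"{p_at_most_4:.3f} {p_at_least_4:.3f} {p_exactly_4:.3f}")
```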
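5.7 Uniform Distribution
- The uniform distribution is a continuous probability distribution.
  the assumption: the random event is equally likely anywhere in an interval [a, b]
  an example: receiving an express mail between 1 and 5 pm
- The probability density function (pdf):
$$f(x) = \frac{1}{b - a}, \quad a \le x \le b; \qquad f(x) = 0 \text{ elsewhere}$$
- By integration, we obtain the probability function (pf), that is, the cumulative distribution function:
$$F(x) = \frac{x - a}{b - a}, \quad a \le x \le b$$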
- A comparison between discrete and continuous distributions:
  for a discrete r. v., we have the probability function: P(X = x) = p(x)
  for a continuous r. v.: P(X = x) = 0, and instead
$$F(x) = \int_{-\infty}^{x} f(u)\,du, \qquad f(x) = \frac{dF(x)}{dx}$$
- An example: receiving an express mail equally likely between 1 and 5 pm.
  f(x) = 1/4, $1 \le x \le 5$; f(x) = 0, elsewhere
  hence, the probability of receiving an express mail between 2 and 5 pm is $P(2 \le X \le 5) = F(5) - F(2)$ = (5 - 1)/(5 - 1) - (2 - 1)/(5 - 1) = 3/4.
- The mean and the variance: E(X) = (a + b)/2, V(X) = $(b - a)^2/12$
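The express mail example can be written as a small sketch. The function name is illustrative, and the interval is taken as a = 1 and b = 5 (hours, pm).

```python
def uniform_cdf(x, a, b):
    """F(x) for a continuous uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 1, 5  # mail arrives between 1 pm and 5 pm
p_2_to_5 = uniform_cdf(5, a, b) - uniform_cdf(2, a, b)
mean = (a + b) / 2
variance = (b - a) ** 2 / 12

print(p_2_to_5, mean, variance)
```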
5.8 Normal Distribution
- In the natural world, there are many cases where the possible values are not equally likely. Instead, there is a most likely value, and the likelihood decreases symmetrically on either side of it. This leads to the normal distribution.
- The normal distribution is by far the most widely used probability distribution. Why is the normal distribution so popular?
  the central limit theorem: the sum of a large number of independent random effects is approximately normal
  a linear combination of independent normal r. v.'s is still normal
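- The probability density function:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right], \quad -\infty < x < \infty$$
  note that the probability function F(x) does not have a closed analytical form; hence, we rely on numerical calculation (Table A.3)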
- The mean, variance and standard deviation of a normal distribution: E(X) = $\mu$, V(X) = $\sigma^2$, and the standard deviation is $\sigma$.
  These two parameters uniquely determine the normal distribution. Hence, a normal distribution is often denoted as N($\mu$, $\sigma$).
- Illustration of the normal distribution: the bell shape; the mean $\mu$; the standard deviation $\sigma$: $\mu \pm \sigma$ (68% of the area), $\mu \pm 2\sigma$ (95.4% of the area), and $\mu \pm 3\sigma$ (99.7% of the area).
- In particular, with E(X) = $\mu$ = 0 and V(X) = $\sigma^2$ = 1, we have the standard normal distribution N(0, 1).
- Calculating probabilities through the standard normal distribution:
  transform the normal distribution to the standard normal distribution by: $Z = \frac{X - \mu}{\sigma}$
  use the normal distribution table (Table A.3)
- An example: given N(16, 1), P(X > 17) = ?
  Z = (X - 16)/1
  P[Z > (17 - 16)/1] = P(Z > 1) = 1 - P(Z < 1) = 1 - 0.8413 (from Table A.3) = 0.1587
- Questions:
  given $\mu$ and $\sigma$, how to calculate $P(c_1 \le X \le c_2)$?
  given p, $\mu$ and $\sigma$, how to calculate x so that P(X > x) = p?
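Both questions can be answered numerically instead of with Table A.3. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the limits c1, c2 and the probability p are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=16, sigma=1)

# The table-lookup example: P(X > 17).
p_above_17 = 1 - X.cdf(17)

# Question 1: P(c1 <= X <= c2), for hypothetical limits c1 and c2.
c1, c2 = 15, 17
p_between = X.cdf(c2) - X.cdf(c1)

# Question 2: x such that P(X > x) = p, for a hypothetical p.
p = 0.05
x = X.inv_cdf(1 - p)

print(f"{p_above_17:.4f} {p_between:.4f} {x:.3f}")
```

`cdf` plays the role of the table lookup, and `inv_cdf` reads the table in reverse.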
- Given a set of data, it is often necessary to check whether the data set follows a normal distribution.
- The student example: the number of hours of study of the 12 students:
  sorting the data: 10, 12, 12, 14, 14, 14, 15, 15, 15, 20, 20, 25
  note that there are just 6 different values, so each value takes a step of 100/6 = 16.7 percentile points (rounded to 16 below)
  finding the percentiles of the data: 16, 32, 32, 48, 48, 48, 64, 64, 64, 80, 80, 96
  finding the z-values of the percentiles: -1.0, -0.47, -0.47, -0.05, -0.05, -0.05, 0.36, 0.36, 0.36, 0.85, 0.85, 1.75
  plotting:
  [Figure: normal probability plot of the hours of study (vertical axis, 10 to 25) against the z-values (horizontal axis, -1.5 to 2)]
Because the horizontal axis is from a normal distribution, the linear relationship indicates that the distribution of the data can be approximated by a normal distribution.
- If a data set follows a normal distribution, then the related probability calculations can be done easily. Following the 12 students example:
  $\mu$ = 15.5, $\sigma^2$ = 16 ($\sigma$ = 4)
  Question: what is the probability of picking a student who studies at least 15 hours per week?
  Answer: we first calculate the z value: z = (15 - 15.5)/4 = -0.125
  hence, the probability is: P(Z > -0.125) = 1 - P(Z < -0.125) = 1 - 0.45 = 0.55
- As another example, assume that an exam is coming and everybody puts in an extra 3 hours of study per week, so that the mean becomes 18.5. What is the probability of picking a student who studies at least 20 hours per week? We first calculate the z value:
  z = (20 - 18.5)/4 = 0.375
  hence, P(X > 20) = P(Z > 0.375) = 1 - P(Z < 0.375) = 1 - 0.65 = 0.35.
- As an exercise, find the range of hours of study per week that contains a picked student with a probability of 95%.
- Normal approximation to the binomial. If n is large and p is not too close to 0 or 1, then
$$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$
  is approximately standard normal. This can be demonstrated by an example. In the students example, the probability of picking a student who studies more than 15 hours per week is p = 3/12 = 1/4. Consider sampling with replacement: the probability that exactly 3 of 12 picked students study more than 15 hours per week is:
  b(X = 3, n = 12, p = 1/4) = 0.258
  Use the normal distribution to approximate: $\mu$ = np = (12)(1/4) = 3, $\sigma^2$ = np(1 - p) = (12)(1/4)(3/4) = 9/4 = 2.25 ($\sigma$ = 1.5)
  hence, P(2.5 < X < 3.5) = P[(2.5 - 3)/1.5 < Z < (3.5 - 3)/1.5]
  = P(-0.333 < Z < 0.333) = 0.631 - 0.369
  = 0.261
  It is seen that the results are rather similar. The remaining approximation error is caused by the small n (n = 12).
- The normal approximation of binomial distribution is very useful when n is large because binomial distribution will then require tedious calculation.
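The comparison between the exact binomial probability and its normal approximation can be sketched as follows, assuming n = 12, p = 1/4 and the event X = 3, with a continuity correction of 0.5 on each side. Variable names are illustrative only.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, x = 12, 0.25, 3

# Exact binomial probability P(X = x).
exact = comb(n, x) * p**x * (1 - p) ** (n - x)

# Normal approximation with the same mean and standard deviation.
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx_dist = NormalDist(mu, sigma)

# Continuity correction: P(X = x) is approximated by P(x - 0.5 < X < x + 0.5).
approx = approx_dist.cdf(x + 0.5) - approx_dist.cdf(x - 0.5)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

Changing n and p in this sketch is a quick way to see when the approximation starts to degrade.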
5.9 Exponential distribution, Gamma distribution and Chi-Square ($\chi^2$) distribution
- There are cases, for example failure rates, in which the likelihood decreases exponentially. This leads to the exponential distribution.
- The probability density function of the exponential distribution:
$$f(x) = \frac{1}{\beta} e^{-x/\beta}, \quad x > 0, \ \beta > 0$$
- The probability function:
$$F(x) = 1 - e^{-x/\beta}, \quad x > 0, \ \beta > 0$$
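- To calculate the mean and variance, we need the Gamma ($\Gamma$) function:
$$\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\,dx$$
  using integration by parts: $(uv)' = u'v + uv'$, so $uv = \int u'v + \int uv'$, or $\int uv' = uv - \int u'v$
  let $u = x^{\alpha-1}$ and $dv = e^{-x}dx$ (so $v = -e^{-x}$); it follows that, for $\alpha > 1$:
$$\Gamma(\alpha) = \left[-e^{-x} x^{\alpha-1}\right]_0^\infty + \int_0^\infty e^{-x} (\alpha - 1) x^{\alpha-2}\,dx = (\alpha - 1)\,\Gamma(\alpha - 1)$$
  In particular: $\Gamma(\alpha + 1) = \alpha\Gamma(\alpha)$, $\Gamma(n) = (n-1)!$ for a positive integer n, $\Gamma(1/2) = \sqrt{\pi}$
  In general:
$$\int_0^\infty x^{\alpha-1} e^{-x/\beta}\,dx = \beta^\alpha\,\Gamma(\alpha)$$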
  for the exponential distribution, which corresponds to $\alpha = 1$ with scale $\beta$:
  E(X) = $\beta$, V(X) = $\beta^2$
- The exponential distribution is related to the Poisson distribution: given a Poisson process with mean $\lambda t$, the waiting time until the first occurrence is exponentially distributed with $\beta = 1/\lambda$.
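- Another common case is that the likelihood is low close to zero. This leads to the Gamma distribution. The probability density function of the Gamma distribution:
$$f(x) = \frac{1}{\beta^\alpha\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, \quad x > 0, \ \alpha > 0, \ \beta > 0$$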
- The mean and variance: E(X) = $\alpha\beta$, V(X) = $\alpha\beta^2$
- Note that the exponential distribution is a special case of the Gamma distribution with $\alpha = 1$.
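- Another special case of the Gamma distribution is the $\chi^2$ distribution. Letting $\alpha = \nu/2$ and $\beta = 2$ results in the $\chi^2$ distribution with $\nu$ degrees of freedom:
$$f(x) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2}, \quad x > 0$$
  its mean and variance are as follows: $\mu = \nu$, $\sigma^2 = 2\nu$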
- Illustration.
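The Gamma-family means and variances can be checked by numerical integration. This is a rough sketch, assuming the shape/scale form of the Gamma density with parameters alpha and beta, a simple midpoint rule, and a hypothetical choice of 4 degrees of freedom for the chi-square case.

```python
from math import exp, gamma

def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta, for x > 0."""
    return x ** (alpha - 1) * exp(-x / beta) / (beta**alpha * gamma(alpha))

def moments(pdf, upper=100.0, steps=100_000):
    """Total area, mean and variance by midpoint-rule integration on (0, upper)."""
    h = upper / steps
    xs = [(i + 0.5) * h for i in range(steps)]
    total = sum(pdf(x) for x in xs) * h
    mean = sum(x * pdf(x) for x in xs) * h
    second = sum(x * x * pdf(x) for x in xs) * h
    return total, mean, second - mean**2

nu = 4  # hypothetical degrees of freedom
# Chi-square with nu degrees of freedom: Gamma with alpha = nu/2, beta = 2.
total, mean, var = moments(lambda x: gamma_pdf(x, nu / 2, 2))
print(total, mean, var)  # close to 1, nu and 2*nu

# Exponential with scale beta: Gamma with alpha = 1.
beta = 3.0  # hypothetical scale
_, exp_mean, exp_var = moments(lambda x: gamma_pdf(x, 1, beta))
print(exp_mean, exp_var)  # close to beta and beta**2
```

The upper integration limit of 100 is an assumption that works for these parameter values; heavier tails would need a larger limit.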
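5.10 Weibull distribution
- The assumption: similar to Gamma
- The probability density function:
$$f(x) = \frac{\alpha}{\beta}\, x^{\alpha-1} e^{-x^\alpha/\beta}, \quad x > 0; \qquad f(x) = 0 \text{ otherwise}$$
- The probability function: $F(x) = 1 - \exp(-x^\alpha/\beta)$, x > 0
  [Figure labels: Exponential; Gamma or $\chi^2$]
- The mean and variance:
$$E(X) = \beta^{1/\alpha}\,\Gamma\left(1 + \frac{1}{\alpha}\right), \qquad V(X) = \beta^{2/\alpha}\left\{\Gamma\left(1 + \frac{2}{\alpha}\right) - \left[\Gamma\left(1 + \frac{1}{\alpha}\right)\right]^2\right\}$$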
- Application in reliability, defining:
  f(t) - the pdf of failure
  F(t) - the pf of failure
  R(t) = 1 - F(t) - the probability of no failure (reliability function)
  r(t) = f(t) / R(t) - the failure rate function
  if the failure rate is constant, $r(t) = 1/\beta$, then f(t) will be exponential.
- Proof: since dF(t)/dt = f(t) and $f(t) = r(t)R(t) = [1 - F(t)]/\beta$, we have $\beta F'(t) = 1 - F(t)$, that is, $\beta F'(t) + F(t) = 1$
  solving the above with F(0) = 0 gives: $F(t) = 1 - \exp(-t/\beta)$, $t \ge 0$
  or $f(t) = (1/\beta)\exp(-t/\beta)$, $t \ge 0$
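The failure rate function can be explored with a short sketch. It assumes the Weibull parameterization $F(t) = 1 - \exp(-t^\alpha/\beta)$, which is a reconstruction; the scale value and the time points are hypothetical.

```python
from math import exp

def weibull_pdf(t, alpha, beta):
    """f(t) under the assumed form F(t) = 1 - exp(-t^alpha / beta)."""
    return (alpha / beta) * t ** (alpha - 1) * exp(-(t**alpha) / beta)

def weibull_reliability(t, alpha, beta):
    """R(t) = 1 - F(t) = exp(-t^alpha / beta)."""
    return exp(-(t**alpha) / beta)

def failure_rate(t, alpha, beta):
    """r(t) = f(t) / R(t)."""
    return weibull_pdf(t, alpha, beta) / weibull_reliability(t, alpha, beta)

beta = 2.0  # hypothetical scale
times = [0.5, 1.0, 2.0, 4.0]

# alpha = 1 reduces to the exponential distribution: constant rate 1/beta.
constant_rates = [failure_rate(t, 1.0, beta) for t in times]
# alpha = 2 gives a failure rate that grows with time.
increasing_rates = [failure_rate(t, 2.0, beta) for t in times]

print(constant_rates)
print(increasing_rates)
```

A constant failure rate corresponds to the exponential case, while a shape parameter above 1 models components that wear out over time.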
5.11 Summary
- Discrete distributions
  discrete uniform: equally likely
  binomial and multinomial: number of successes in n independent Bernoulli experiments
  hypergeometric: sampling is dependent (finite sample space, sampling without replacement)
  negative binomial: number of trials until the kth success
  geometric: number of trials until the first success
  Poisson: discrete events in continuous intervals
- Continuous distributions
  uniform: equally likely
  normal: has a most likely value and decreases symmetrically
  exponential: gradually decreasing
  Gamma: small when close to zero (generalized exponential)
  Beta: contained in a finite interval
  Weibull: similar to Gamma (another generalization of the exponential)