SOME CONTINUOUS PROBABILITY DISTRIBUTIONS

Upload
nathanmooney 
Category
Documents

view
49 
download
0
Embed Size (px)
description
Transcript of SOME CONTINUOUS PROBABILITY DISTRIBUTIONS
1
SOME CONTINUOUS PROBABILITY
DISTRIBUTIONSUniform, Normal, Exponential,
Gamma and ChiSquare Distributions
2
– A random variable X is said to be uniformly distributed if its density function is
– The expected value and the variance are
12)ab(
)X(V2
baE(X)
.bxaab
1)x(f
2
Uniform Distribution
Indicator functions
• It is sometimes convenient to express the p.m.f. or p.d.f. by using indicator functions. This is especially true when the range of random variable depends on a parameter.
• Ex: Uniform distribution
3
4
• Example 1– The daily sale of gasoline is uniformly distributed
between 2,000 and 5,000 gallons. Find the probability that sales are:
– Between 2,500 and 3,000 gallons– More than 4,000 gallons– Exactly 2,500 gallons
2000 5000
1/3000
f(x) = 1/(50002000) = 1/3000 for x: [2000,5000]
x2500 3000
P(2500X3000) = (30002500)(1/3000) = .1667
Uniform Distribution
5
• Example 1– The daily sale of gasoline is uniformly distributed
between 2,000 and 5,000 gallons. Find the probability that sales are:
– Between 2,500 and 3,500 gallons– More than 4,000 gallons– Exactly 2,500 gallons
2000 5000
1/3000
f(x) = 1/(50002000) = 1/3000 for x: [2000,5000]
x4000
P(X4000) = (50004000)(1/3000) = .333
Uniform Distribution
6
• Example 1– The daily sale of gasoline is uniformly distributed
between 2,000 and 5,000 gallons. Find the probability that sales are:
– Between 2,500 and 3,500 gallons– More than 4,000 gallons– Exactly 2,500 gallons
2000 5000
1/3000
f(x) = 1/(50002000) = 1/3000 for x: [2000,5000]
x2500
P(X=2500) = (25002500)(1/3000) = 0
Uniform Distribution
7
Normal Distribution
• This is the most popular continuous distribution.
– Many distributions can be approximated by a normal distribution.
– The normal distribution is the cornerstone distribution of statistical inference.
8
• A random variable X with mean and variance is normally distributed if its probability density function is given by
...71828.2eand...14159.3where
0;xe2
1)x(f
2x)2/1(
Normal Distribution
The Shape of the Normal Distribution
The normal distribution is bell shaped, and symmetrical around
Why symmetrical? Let = 100. Suppose x = 110.
2210
)2/1(100110
)2/1(e
21
e2
1)110(f
Now suppose x = 90210)2/1(
210090)2/1(e
21e
21)90(f
11090
10
The Effects of and
How does the standard deviation affect the shape of f(x)?
= 2
=3 =4
= 10 = 11 = 12How does the expected value affect the location of f(x)?
11
• Two facts help calculate normal probabilities:– The normal distribution is symmetrical.– Any normal distribution can be transformed into
a specific normal distribution called… “STANDARD NORMAL DISTRIBUTION”ExampleThe amount of time it takes to assemble a computer
is normally distributed, with a mean of 50 minutes and a standard deviation of 10 minutes. What is the probability that a computer is assembled in a time between 45 and 60 minutes?
Finding Normal Probabilities
12
STANDARD NORMAL DISTRIBUTION
• NORMAL DISTRIBUTION WITH MEAN 0 AND VARIANCE 1.
• IF X~N( , 2), THEN
NOTE: Z IS KNOWN AS Z SCORES.• “ ~ “ MEANS “DISTRIBUTED AS”
~ (0,1)XZ N
13
• Solution– If X denotes the assembly time of a computer, we
seek the probability P(45<X<60).– This probability can be calculated by creating a new
normal variable the standard normal variable.
x
xXZ
E(Z) = = 0 V(Z) = 2 = 1
Every normal variablewith some and, canbe transformed into this Z.
Therefore, once probabilities for Zare calculated, probabilities of any normal variable can be found.
Finding Normal Probabilities
Standard normal probabilities
14Copied from Walck, C (2007) Handbook on Statistical Distributions for
experimentalists
Standard normal table 1
15
Standard normal table 2
16
17
Standard normal table 3
18
• Example  continued
P(45<X<60) = P( < < )45 X 60 50  5010 10
= P(0.5 < Z < 1)
To complete the calculation we need to compute the probability under the standard normal distribution
Finding Normal Probabilities
19
z 0 0.01 ……. 0.05 0.060.0 0.0000 0.0040 0.0199 0.02390.1 0.0398 0.0438 0.0596 0.0636. . . . .. . . . .
1.0 0.3413 0.3438 0.3531 0.3554. . . . .. . . . .
1.2 0.3849 0.3869 ……. 0.3944 0.3962. . . . . .. . . . . .
Standard normal probabilities have been calculated and are provided in a table .
The tabulated probabilities correspondto the area between Z=0 and some Z = z0 >0 Z = 0 Z = z0
P(0<Z<z0)
Using the Standard Normal Table
20
• Example  continued
P(45<X<60) = P( < < )45 X 60 50  5010 10
= P(.5 < Z < 1)
z0 = 1z0 = .5
We need to find the shaded area
Finding Normal Probabilities
21
P(.5<Z<0)+ P(0<Z<1)
P(45<X<60) = P( < < )45 X 60 50  5010 10
z 0 0.01 ……. 0.05 0.060.0 0.0000 0.0040 0.0199 0.02390.1 0.0398 0.0438 0.0596 0.636. . . . .. . . . .
1.0 0.3413 0.3438 0.3531 0.3554. . . . .
P(0<Z<1
• Example  continued
= P(.5<Z<1) =
z=0 z0 = 1z0 =.5
.3413
Finding Normal Probabilities
22
• The symmetry of the normal distribution makes it possible to calculate probabilities for negative values of Z using the table as follows:
z0 +z00
P(z0<Z<0) = P(0<Z<z0)
Finding Normal Probabilities
23
z 0 0.01 ……. 0.05 0.060.0 0.0000 0.0040 0.0199 0.02390.1 0.0398 0.0438 0.0596 0.636
. . . . .
. . . . .0.5 0.1915 …. …. ….
. . . . .
• Example  continued
Finding Normal Probabilities
.3413
.5.5
.1915
24
z 0 0.1 ……. 0.05 0.060.0 0.0000 0.0040 0.0199 0.02390.1 0.0398 0.0438 0.0596 0.636
. . . . .
. . . . .0.5 0.1915 …. …. ….
. . . . .
• Example  continued
Finding Normal Probabilities
.1915.1915.1915.1915
.3413
.5.5
P(.5<Z<1) = P(.5<Z<0)+ P(0<Z<1) = .1915 + .3413 = .5328
1.0
25
z 0 0.01 ……. 0.05 0.060.0 0.5000 0.5040 0.5199 0.52390.1 0.5398 0.5438 0.5596 0.5636. . . . .. . . . .
0.5 0.6915 …. …. ….. . . . .
• Example  continued
Finding Normal Probabilities
.3413
P(Z<0.5)=1P(Z>0.5)=10.6915=0.3085By Symmetry
P(Z<0.5)
This table provides probabilities from ∞ to z0
26
10%0%
202
(i) P(X< 0 ) = P(Z< ) = P(Z<  2)0  10
5
=P(Z>2) =Z
X
• Example– The rate of return (X) on an investment is normally distributed
with a mean of 10% and standard deviation of (i) 5%, (ii) 10%.– What is the probability of losing money?
.4772
0.5  P(0<Z<2) = 0.5  .4772 = .0228
Finding Normal Probabilities
27
10%0%
1
(ii) P(X< 0 ) = P(Z< ) 0  10
10
= P(Z<  1) = P(Z>1) =Z
X
• Example– The rate of return (X) on an investment is normally
distributed with mean of 10% and standard deviation of (i) 5%, (ii) 10%.
– What is the probability of losing money?
.3413
0.5  P(0<Z<1) = 0.5  .3413 = .1587
Finding Normal Probabilities
1
28
AREAS UNDER THE STANDARD NORMAL DENSITY
P(0<Z<1)=.3413
Z0 1
29
AREAS UNDER THE STANDARD NORMAL DENSITY
.3413 .4772
P(1<Z<2)=.4772.3413=.1359
30
EXAMPLES
• P( Z < 0.94 ) = 0.5 + P( 0 < Z < 0.94 ) = 0.5 + 0.3264 = 0.8264
0.940
0.8264
31
EXAMPLES
• P( Z > 1.76 ) = 0.5 – P( 0 < Z < 1.76 ) = 0.5 – 0.4608 = 0.0392
1.760
0.0392
32
EXAMPLES
• P( 1.56 < Z < 2.13 ) = = P( 1.56 < Z < 0 ) + P( 0 < Z < 2.13 )
= 0.4406 + 0.4834 = 0.9240
P(0 < Z < 1.56)
1.56 2.13
0.9240
Because of symmetry
33
STANDARDIZATION FORMULA
• If X~N( , 2), then the standardized value Z of any ‘Xscore’ associated with calculating probabilities for the X distribution is:
• The standardized value Z of any ‘Xscore’ associated with calculating probabilities for the X distribution is:
(Converse Formula)
XZ
.x z
34
• Sometimes we need to find the value of Z for a given probability
• We use the notation zA to express a Z value for which P(Z > zA) = A
Finding Values of Z
zA
A
35
PERCENTILE• The pth percentile of a set of measurements
is the value for which at most p% of the measurements are less than that value.
• 80th percentile means P( Z < a ) = 0.80 • If Z ~ N(0,1) and A is any probability, then
P( Z > zA) = A
A
zA
36
• Example – Determine z exceeded by 5% of the population– Determine z such that 5% of the population is below
• Solutionz.05 is defined as the z value for which the area on its right
under the standard normal curve is .05.
0.05
Z0.050
0.45
1.645
Finding Values of Z
0.05
Z0.05
37
EXAMPLES
• Let X be rate of return on a proposed investment. Mean is 0.30 and standard deviation is 0.1.
a) P(X>.55)=?b) P(X<.22)=?c) P(.25<X<.35)=?d) 80th Percentile of X is?e) 30th Percentile of X is?
Standardization formula
Converse Formula
38
ANSWERSa)
b)
c)
d)
e)
X  0.3 0.55  0.3P(X 0.55) P = Z > = 2.5 = 0.5  0.4938 = 0.00620.1 0.1
X  0.3 0.22  0.3P(X 0.22) P = Z = 0.8 = 0.5  0.2881 = 0.21190.1 0.1
0.25 0.3 X  0.3 0.35  0.3P(0.25 X 0.35) P 0.5 = Z = 0.50.1 0.1 0.1
= 2.*(0.1915) 0.3830
80th Percentile of X is 0.20. .3+(.85)*(.1)=.385x z
30th Percentile of X is 0.70. .3+(.53)*(.1)=.247x z
39
The Normal Approximation to the Binomial Distribution
• The normal distribution provides a close approximation to the Binomial distribution when n (number of trials) is large and p (success probability) is close to 0.5.
• The approximation is used only whennp 5 andn(1p) 5
40
The Normal Approximation to the Binomial Distribution
• If the assumptions are satisfied, the Binomial random variable X can be approximated by normal distribution with mean = np and 2 = np(1p).
• In probability calculations, the continuity correction improves the results. For example, if X is Binomial random variable, then
P(X a) ≈ P(X<a+0.5)
P(X a) ≈ P(X>a0.5)
41
EXAMPLE• Let X ~ Binomial(25,0.6), and want to find P(X ≤ 13). • Exact binomial calculation:
• Normal approximation (w/o correction): Y~N(15,2.45²)
13
0x
x25x 267.0)4.0()6.0(x25
)13X(P
206.0)82.0Z(P)45.21513Z(P)13Y(P)13X(P
Normal approximation is good, but not great!
EXAMPLE, cont.
42Copied from Casella & Berger (1990)
Bars – Bin(25,0.6); line – N(15, 2.45²)
EXAMPLE, cont.
• Normal approximation (w correction): Y~N(15,2.45²)
43
271.0)61.0Z(P)5.13Y(P)5.13X(P)13X(P
Much better approximation to the exact value: 0.267
44
Exponential Distribution• The exponential distribution can be used to model
– the length of time between telephone calls– the length of time between arrivals at a service station– the lifetime of electronic components.
• When the number of occurrences of an event follows the Poisson distribution, the time between occurrences follows the exponential distribution.
45
A random variable is exponentially distributed if its probability density function is given by
f(x) = ex, x>=0.is the distribution parameter >0).
E(X) = V(X) = 2
Exponential Distribution
The cumulative distribution function isF(x) =1ex/, x0
/1 , 0, 0x
Xf x e x
is a distribution parameter.
46
0
0.5
1
1.5
2
2.5f(x) = 2e2x
f(x) = 1e1x
f(x) = .5e.5x
0 1 2 3 4 5
Exponential distribution for = .5, 1, 2
0
0.5
1
1.5
2
2.5
a bP(a<X<b) = ea/ eb/
47
• Finding exponential probabilities is relatively easy:– P(X < a) = P(X ≤ a)=F(a)=1 – e –a/
– P(X > a) = e–a/
– P(a< X < b) = e – a/ – e – b/
Exponential Distribution
48
• Example– The lifetime of an alkaline battery is
exponentially distributed with mean 20 hours.– Find the following probabilities:
• The battery will last between 10 and 15 hours.• The battery will last for more than 20 hours.
Exponential Distribution
49
• Solution– The mean = standard deviation =
20 hours.– Let X denote the lifetime.
• P(10<X<15) = e.05(10) – e.05(15) = .1341• P(X > 20) = e.05(20) = .3679
Exponential Distribution
50
• Example The service rate at a supermarket checkout
is 6 customers per hour. – If the service time is exponential, find the
following probabilities:• A service is completed in 5 minutes,• A customer leaves the counter more than 10
minutes after arriving• A service is completed between 5 and 8 minutes.
Exponential Distribution
51
• Solution– A service rate of 6 per hour =
A service rate of .1 per minute ( = .1/minute).– P(X < 5) = 1e.lx = 1 – e.1(5) = .3935– P(X >10) = e.lx = e.1(10) = .3679– P(5 < X < 8) = e.1(5) – e.1(8) = .1572
Exponential Distribution
52
Exponential Distribution
• If pdf of lifetime of fluorescent lamp is exponential with mean 0.10, find the life for 95% reliability?
The reliability function = R(t) = 1F(t) = et/
R(t) = 0.95 t = ?
53
• X~ Gamma(,)
GAMMA DISTRIBUTION
1 /1 , 0, 0, 0xf x x e x
2 and E X Var X
11 , M t t t
54
GAMMA DISTRIBUTION
• Gamma Function:
1
0
xx e dx
where is a positive integer.Properties:
1 , 0
1 ! for any integer 1n n n 12
55
)()()(
)1(
)()1(
0
1
0
1
0
dxexx
dxexxdxex
x
xx
GAMMA DISTRIBUTION
since the last integral is the expectation of a Gamma distribution with parameters alpha and 1.
56
• Let X1,X2,…,Xn be independent rvs with Xi~Gamma(i, ). Then,
GAMMA DISTRIBUTION
1 1
~ ,n n
i ii i
X Gamma
•Let X be an rv with X~Gamma(, ). Then, ~ , where is positive constant.cX Gamma c c
• Let X1,X2,…,Xn be a random sample with Xi~Gamma(, ). Then,
1 ~ ,
n
ii
XX Gamma n
n n
GAMMA DISTRIBUTION
• Special cases: Suppose X~Gamma(α,β)– If α=1, then X~ Exponential(β)– If α=p/2, β=2, then X~ 2 (p) (will come back in a min.)
– If Y=1/X, then Y ~ inverted gamma.• Gamma approximates to Normal
distribution on the limit as alpha goes to infinity.
57
Integral tricks
• Recall: h(t) is an odd function if h(t)=h(t);It is an even function if h(t)=h(t).• Ex: h(x)=x exp{x²/2} odd; h(x)= exp{x²/2x} neither odd nor even
• odd function =0
• even function = 2* even function58
0
59
CHISQUARE DISTRIBUTION
• X~ 2()= Gamma(/2,2)
and 2E X Var X
Chisquare with degrees of freedom
2/1)21()( 2/ tttM
,...2,1,0,)2/(2
1)( 2/12/2/
xexxf x
60
DEGREES OF FREEDOM• In statistics, the phrase degrees of freedom is
used to describe the number of values in the final calculation of a statistic that are free to vary.
• The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom (df) .
• How many components need to be known before the vector is fully determined?
• Chisquare (2) ≡ Exponential (2)
61
CHISQUARE DISTRIBUTION
62
CHISQUARE DISTRIBUTION
• If rv X has Gamma(,) distribution, then Y=2X/ has Gamma(,2) distribution. If 2 is positive integer, then Y has
distribution.
2
2
•Let X1,X2,…,Xn be a r.s. with Xi~N(0,1). Then,2 2
1~
n
i ni
X
•Let X be an rv with X~N(0, 1). Then,2 2
1~X
63
BETA DISTRIBUTION
• The Beta family of distributions is a continuous family on (0,1) and often used to model proportions.
111 1 , 0 1, 0, 0.,
f x x x xB
where
2 and
1E X Var X
)()()()1(),(
1
0
11
dxxxB
64
WEIBULL DISTRIBUTION• To model the failure time data or hazard
functions.• Hazard function:Rate of change in prob that subject survives a little past time t, given that subject survives upto time t.
• If X~Exp(), then Y=X1/ has Weibull(, ) distribution.
1 / , 0, 0, 0y
Yf y y e y
)tTtTt(Pmil)t(h0
65
CAUCHY DISTRIBUTION• It is a symmetric and bellshaped
distribution on (,) with pdf
E X Since , the mean does not exist.• The mgf does not exist.• measures the center of the distribution and it is the median.
• If X and Y have N(0,1) distribution, then Z=X/Y has a Cauchy distribution with =0 and σ=1.
0,)x(1
11)x(f2
66
CAUCHY DISTRIBUTION• When studying hypothesis tests that assume
normality, seeing how the tests perform on data from a Cauchy distribution is a good indicator of how sensitive the tests are to heavytail departures from normality.
• Likewise, it is a good check for robust techniques that are designed to work well under a wide variety of distributional assumptions.
67
LOGNORMAL DISTRIBUTION
• An rv X is said to have the lognormal distribution, with parameters µ and 2, if Y=ln(X) has the N(µ, 2) distribution.
•The lognormal distribution is used to model continuous random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables.
68
STUDENT’S T DISTRIBUTION• This distribution will arise in the study of
population mean when the underlying distribution is normal.
• Let Z be a standard normal rv and let U be a chisquare distributed rv independent of Z, with degrees of freedom. Then,
~/
ZX tU
When n, XN(0,1).
69
F DISTRIBUTION
• Let U and V be independent rvs with chisquare distributions with 1 and 2 degrees of freedom. Then,
1 2
1,
2
/ ~/
UX FV
MULTIVARIATEDISTRIBUTIONS
70
BIVARIATE NORMAL DISTRIBUTION
• A pair of continuous rvs X and Y is said to have a bivariate normal distribution if it has a joint pdf of the form
71
2
2
y
2
y
1
x2
1
x22
21
yyx2x12
1exp12
1y,xf
.11,0,0,y,x 21
BIVARIATE NORMAL DISTRIBUTION
72
If ,,,,BVN~Y,X yx22
21
22
21 ,N~Y and ,N~X yx
, then
and is the correlation coefficient btw X and Y.1. Conditional on X=x,
22
2X1
2y 1),x(N~xY
2. Conditional on Y=y,
22
1Y2
1x 1),y(N~yX
Mixture of distributions
• Let f1(y) and f2(y) be density functions, and let a be a constant such that 0≤ a ≤1. Consider the function
f(y)=a f1(y) + (1a) f2(y)
Such a density function is often called a mixture distribution. Here are some examples of real life applications for such distributions.
73
Mixture of distributions
• Example 1: Financial returns often behave differently in normal situations and during crisis times. A mixture of two normal distributions with different means and variances can be assumed for returns, one for the returns during normal situations, and another for during crises.
74
Mixture of distributions
• Example 2: The prices of houses in a particular neighborhood will tend to be similar, while prices in a different neighborhood may be extremely different. For instance, if we are to collect prices in both Cukurambar and Sincan, we will need a mixture of two distributions to describe these prices.
75
Mixture of distributions• Note that f(y) is a valid density function.• Suppose that Y1 is a random variable with density
function f1(y), and E(Y1)=μ1 and Var(Y1)= . Similarly, suppose that Y2 is a random variable with density function f2(y), and E(Y2)=μ2 and Var(Y2)= . Assume that Y is random variable whose density is a mixture of the densities corresponding to Y1 and Y2.
i) It can be shown that E(Y)= a μ1 + (1a) μ2
ii) It can be shown that Var(Y)= a + (1a) + a(1a) (μ1  μ2)2
76
21
22
21 2
2
Problems
1. Mensa (from the Latin word for “mind”) is an international society devoted to intellectual pursuits. Any person who has an IQ in the upper 2% of the general population is eligible to join. If we assume that IQs are normally distributed with μ = 100 and σ = 16, what is the lowest IQ that will qualify a person for membership?
77
Problems
2. Suppose that numerical grades in a statistics class are values of a r.v. X which is Normally distributed with mean μ = 65 and standard deviation 15. Suppose that letter grades are assigned according to the following rule: student receives an A if X ≥ 85; B if 70 ≤ X < 85; C if 55 ≤ X < 70; D if 45 ≤ X < 55; and F if X ≤ 45. If a student is chosen at random from this class, calculate the probability that the student will earn i) A; ii) B; iii) F.
78
Problems
3. Economic conditions cause fluctuations in the prices of raw commodities as well as in finished products. Let X denote the price paid for a barrel of crude oil by the initial carrier, and let Y denote the price paid by the refinery purchasing the product from the carrier. Assume that the joint density for (X,Y) is given by f(x,y)=c 20< x < y < 40
79
Problems
a) Find the value of c that makes this a joint density for a twodimensional random variable.
b) Find the probability that the carrier will pay at least $25 per barrel and the refinery will pay at most $30 per barrel for the oil.
c) Find the probability that the price paid by the refinery exceeds that of the carrier by at least $10 per barrel.
80
Problems
d) Are X and Y independent? Explain.e) Find E(YX). Interpret this expectation in a
practical sense.
81