Probability theory 2 Tron Anders Moger September 13th 2006.
Probability theory 2
Tron Anders Moger
September 13th 2006
The Binomial distribution
• Bernoulli distribution: one experiment with two possible outcomes, probability of success P.
• If the experiment is repeated n times,
• the probability of success P is constant in all experiments, and
• the experiments are independent,
• then the number of successes follows a Binomial distribution.
The Binomial distribution
If X has a Binomial distribution, its PDF is defined as:

P(X = x) = n!/(x!(n−x)!) · P^x (1−P)^(n−x)

E(X) = nP
Var(X) = nP(1−P)
Example
• Since the early 50s, 10000 UFOs have been reported in the U.S.
• Assume P(real observation) = 1/100000
• Binomial experiment: n = 10000, P = 1/100000
• X counts the number of real observations
P(At least one observation is real) = 1 − P(X = 0)
= 1 − (10000 choose 0) · (1/100000)⁰ · (1 − 1/100000)^10000
= 1 − 0.905 = 9.5%
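As a quick sanity check (not part of the original slides), the UFO calculation can be reproduced with Python's standard library; `binom_pmf` is a small helper implementing the Binomial PDF above:

```python
import math

def binom_pmf(x, n, p):
    # Binomial PDF: n!/(x!(n-x)!) * p^x * (1-p)^(n-x)
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# UFO example: n = 10000 reports, P(real observation) = 1/100000
n, p = 10000, 1 / 100000

p_none = binom_pmf(0, n, p)      # = (1 - p)^n, about 0.905
p_at_least_one = 1 - p_none      # about 0.095, i.e. ca 9.5%
print(round(p_none, 3), round(p_at_least_one, 3))
```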
The Hypergeometric distribution
• Randomly sample n objects from a group of N, S of which are successes. The distribution of the number of successes, X, in the sample, is hypergeometric distributed:
P(X = x) = (S choose x) · (N−S choose n−x) / (N choose n)
         = [S!/(x!(S−x)!)] · [(N−S)!/((n−x)!(N−S−(n−x))!)] / [N!/(n!(N−n)!)]
Example
• What is the probability of winning the lottery, that is, getting all 7 numbers on your coupon correct out of the total 34?
P(X = 7) = (7 choose 7) · (34−7 choose 7−7) / (34 choose 7)
         = [7!/(7!·0!)] · [27!/(0!·27!)] / [34!/(7!·27!)]
         = 7!(34−7)!/34! ≈ 1.86·10⁻⁷
![Page 7: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/7.jpg)
The distribution of rare events: The Poisson distribution
• Assume successes happen independently, at a rate λ per time unit. The probability of x successes during a time unit is given by the Poisson distribution:
P(x) = λ^x e^(−λ) / x!

E(X) = λ
Var(X) = λ
Example: AIDS cases in 1991 (47 weeks)
• Cases per week:
1 1 0 1 2 1 3 0 0 0 0 0 0 2 1 2 2 1 3 0 1 0 0 0
1 1 1 1 1 0 2 1 0 2 0 2 1 6 1 0 0 1 0 2 0 0 0
• Mean number of cases per week:
λ=44/47=0.936
• Can model the data as a Poisson process with rate λ=0.936
![Page 9: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/9.jpg)
Example cont'd:

| No. of cases | No. observed | Expected no. (from Poisson dist.) |
|---|---|---|
| 0 | 20 | 18.4 |
| 1 | 16 | 17.2 |
| 2 | 8 | 8.1 |
| 3 | 2 | 2.5 |
| 4 | 0 | 0.6 |
| 5 | 0 | 0.11 |
| 6 | 1 | 0.017 |

• Calculation: P(X=2) = 0.936² · e^(−0.936)/2! = 0.17
• Multiply by the number of weeks: 0.17 · 47 = 8.1
• The Poisson distribution fits the data fairly well!
![Page 10: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/10.jpg)
The Poisson and the Binomial
• Assume X is Bin(n, P); then E(X) = nP
• Probability of 0 successes: P(X=0) = (1−P)^n
• Writing λ = nP, we get P(X=0) = (1−λ/n)^n
• If n is large and P is small, this converges to e^(−λ), the probability of 0 successes in a Poisson distribution!
• One can show that this also applies for the other probabilities. Hence, the Poisson distribution approximates the Binomial when n is large and P is small (n > 5, P < 0.05).
![Page 11: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/11.jpg)
Bivariate distributions
• If X and Y are a pair of discrete random variables, their joint probability function expresses the probability that they simultaneously take specific values:
  P(x, y) = P(X = x ∩ Y = y)
• Marginal probability: P(x) = Σ_y P(x, y)
• Conditional probability: P(x | y) = P(x, y) / P(y)
• X and Y are independent if, for all x and y: P(x, y) = P(x)·P(y)
Example
• The probabilities for – A: Rain tomorrow
– B: Wind tomorrow
are given in the following table:
|            | No wind | Some wind | Strong wind | Storm |
|------------|---------|-----------|-------------|-------|
| No rain    | 0.1     | 0.2       | 0.05        | 0.01  |
| Light rain | 0.05    | 0.1       | 0.15        | 0.04  |
| Heavy rain | 0.05    | 0.1       | 0.1         | 0.05  |
Example cont'd:
• Marginal probability of no rain: 0.1+0.2+0.05+0.01 = 0.36
• Similarly, marginal probabilities of light and heavy rain: 0.34 and 0.3. Hence the marginal distribution of rain is a PDF (the probabilities sum to 1)!
• Conditional probability of no rain given storm: 0.01/(0.01+0.04+0.05) = 0.1
• Similarly, conditional probabilities of light and heavy rain given storm: 0.4 and 0.5. Hence the conditional distribution of rain given storm is a PDF!
• Are rain and wind independent? Marginal probability of no wind: 0.1+0.05+0.05 = 0.2. But P(no rain)·P(no wind) = 0.36·0.2 = 0.072 ≠ 0.1 = P(no rain, no wind), so they are not independent.
![Page 14: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/14.jpg)
Covariance and correlation
• Covariance measures how two variables vary together:
  Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)·E(Y)
• Correlation is always between −1 and 1:
  Corr(X, Y) = Cov(X, Y)/(σ_X σ_Y) = Cov(X, Y)/√(Var(X)·Var(Y))
• If X, Y are independent, then E(XY) = E(X)·E(Y)
• If X, Y are independent, then Cov(X, Y) = 0
• If Cov(X, Y) = 0, then Var(X + Y) = Var(X) + Var(Y)
![Page 15: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/15.jpg)
Continuous random variables
• Used when the outcomes can take any number (with decimals) on a scale
• Probabilities are assigned to intervals of numbers; individual numbers generally have probability zero
• Area under a curve: Integrals
![Page 16: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/16.jpg)
Cdf for continuous random variables
• As before, the cumulative distribution function F(x) is equal to the probability of all outcomes less than or equal to x.
• Thus we get
  P(a < X ≤ b) = F(b) − F(a)
• The probability density function is, however, now defined so that
  P(a < X ≤ b) = ∫_a^b f(x) dx
• We get that
  F(x) = ∫_{x₀}^{x} f(x) dx
Expected values
• The expectation of a continuous random variable X is defined as
• The variance, standard deviation, covariance, and correlation are defined exactly as before, in terms of the expectation, and thus have the same properties
E(X) = ∫ x·f(x) dx
Example: The uniform distribution on the interval [0,1]
• f(x)=1
• F(x)=x
•
•
1 1121 1
2 200 0
( ) ( )E X xf x dx xdx x
2 2
122 1 1 1
3 4 120
( ) ( ) ( )
( ) 0.5
Var X E X E X
x d x
![Page 19: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/19.jpg)
The normal distribution
• The most used continuous probability distribution:
  – Many observations tend to approximately follow this distribution
  – It is easy and nice to do computations with
  – BUT: Using it can result in wrong conclusions when it is not appropriate
Histogram of weight with normal curve displayed
[Histogram: "Distribution of weight among 95 students" — weight (kg) from 40 to 95 on the x-axis, number of students on the y-axis, with a fitted normal curve]
The normal distribution
• The probability density function is

  f(x) = 1/√(2πσ²) · e^(−(x−µ)²/(2σ²))

• where E(X) = µ and Var(X) = σ²
• Notation: N(µ, σ²)
• Standard normal distribution: N(0, 1)
• Using the normal density is often OK unless the actual distribution is very skewed
• Also: µ±σ covers ca 68% of the distribution
• µ±2σ covers ca 95% of the distribution
![Page 22: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/22.jpg)
The normal distribution with small and large standard deviation σ
[Plot: normal density curves with small and large σ, x from 2 to 20, density from 0 to 0.4 on the y-axis]
Simple method for checking if data are well approximated by a normal
distribution: Explore
• As before, choose Analyze->Descriptive Statistics->Explore in SPSS.
• Move the variable to Dependent List (e.g. weight).
• Under Plots, check Normality Plots with tests.
![Page 24: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/24.jpg)
Histogram of lung function for the students
[Histogram: "Average PEF value measured in a sitting position" from 300 to 800 on the x-axis, number of students on the y-axis. Std. Dev = 120.12, Mean = 503, N = 95]
Q-Q plot for lung function
[Normal Q-Q plot of PEFSITTM: observed values from 200 to 800 against expected normal scores from −3 to 3]
Age – not normal
[Histogram of age, 20 to 35 on the x-axis, frequency on the y-axis. Std. Dev = 3.11, Mean = 22.4, N = 95]
Q-Q plot of age
[Normal Q-Q plot of AGE: observed values from 10 to 40 against expected normal scores from −2 to 3]
[Histogram of a skewed variable (SKEWED), frequency on the y-axis. Std. Dev = 1.71, Mean = 1.50, N = 106]
Skewed distribution, with e.g. the observations 0.40, 0.96, 11.0
A trick for data that are skewed to the right: Log-transformation!
Log-transformed data
[Histogram of LNSKEWD, frequency on the y-axis. Std. Dev = 1.05, Mean = −0.12, N = 106]

ln(0.40) = −0.91
ln(0.96) = −0.04
ln(11)   = 2.40

• Do the analysis on the log-transformed data
• SPSS: Transform → Compute
OK, the data follows a normal distribution, so what?
• First lecture, pairs of terms:
  – Sample – population
  – Histogram – distribution
  – Mean – expected value
• In statistics, we would like the results from analyzing a small sample to apply to the population
• We therefore have to collect a sample that is representative w.r.t. age, gender, place of residence, etc.
![Page 31: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/31.jpg)
New way of reading tables and histograms:
• Histograms show that data can be described by a normal distribution
• Want to conclude that data in the population are normally distributed
• Mean calculated from the sample is an estimate of the expected value µ of the population normal distribution
• Standard deviation in the sample is an estimate of σ in the population normal distribution
• Mean±2*(standard deviation) as estimated from the sample (hopefully) covers 95% of the population normal distribution
![Page 32: Probability theory 2 Tron Anders Moger September 13th 2006.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649e705503460f94b6e1d2/html5/thumbnails/32.jpg)
In addition:
• Most standard methods for analyzing continuous data assume a normal distribution.
• When n is large and P is not too close to 0 or 1, the Binomial distribution can be approximated by the normal distribution
• A similar phenomenon holds for the Poisson distribution
• This happens for all distributions that can be seen as a sum of independent observations.
• This means that the normal distribution appears whenever you want to do statistics
The Exponential distribution
• The exponential distribution is a distribution for positive numbers (parameter λ):
• It can be used to model the time until an event, when events arrive randomly at a constant rate
f(t) = λ·e^(−λt)

E(T) = 1/λ
Var(T) = 1/λ²
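A quick simulation check of these moments (a sketch; the seed and the rate λ = 2 are chosen arbitrarily):

```python
import random

random.seed(1)
lam = 2.0                  # rate: two events per time unit
n = 200_000
ts = [random.expovariate(lam) for _ in range(n)]

mean = sum(ts) / n
var = sum((t - mean) ** 2 for t in ts) / n

print(round(mean, 2))   # close to 0.5  = 1/lam
print(round(var, 2))    # close to 0.25 = 1/lam^2
```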
Next time:
• Sampling and estimation
• Will talk much more in depth about the topics mentioned in the last few slides today