Probability distributionv1
![Page 1: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/1.jpg)
Beatrice van Eden
Probability Distribution Part 1
![Page 2: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/2.jpg)
Topics
- Probability Distribution
- Probability Distribution Equations
- Descriptive parameters for Probability Distributions
- Probability Theorems
- Binary Variables
- The beta distribution
- Multinomial Variables
- The Dirichlet distribution
![Page 3: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/3.jpg)
Probability Distribution
A function that describes all the possible values and likelihoods that a random variable can take within a given range.
This range lies between the minimum and maximum possible values, but where a given value is likely to fall on the probability distribution depends on a number of factors, including the distribution's mean, standard deviation, skewness and kurtosis.
![Page 4: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/4.jpg)
Probability Distributions Equations
The section on probability equations explains the equations that define probability distributions.
- Cumulative distribution function (cdf)
- Probability mass function (pmf)
- Probability density function (pdf)
![Page 5: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/5.jpg)
Probability Distributions Equations
Cumulative distribution function (cdf)
The (cumulative) distribution function, or probability distribution function, F(x) is the mathematical equation that describes the probability that a variable X is less than or equal to x, i.e.
F(x) = P(X≤x) for all x
where P(X≤x) means the probability of the event X≤x.
![Page 6: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/6.jpg)
Probability Distributions Equations
Cumulative distribution function for the normal distributions.
Probability density function for several normal distributions. The red line denotes the standard normal distribution.
![Page 7: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/7.jpg)
Probability Distributions Equations
A cumulative distribution function has the following properties:
- F(x) is always non-decreasing, i.e. $\frac{d}{dx}F(x) \ge 0$
- F(x) = 0 at x = -∞ or minimum
- F(x) = 1 at x = ∞ or maximum
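The three cdf properties can be checked numerically. The following is a minimal sketch, assuming a standard normal distribution as the example; the helper name `normal_cdf` and the grid of x-values are illustrative choices, not part of the slides.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cdf of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Evaluate F(x) on a grid from -8 to 8 (far into both tails).
xs = [-8.0 + 0.5 * i for i in range(33)]
values = [normal_cdf(x) for x in xs]

# Non-decreasing: each value is at least as large as the previous one.
is_non_decreasing = all(b >= a for a, b in zip(values, values[1:]))

# F approaches 0 on the far left and 1 on the far right.
print(is_non_decreasing, values[0], values[-1])
```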
![Page 8: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/8.jpg)
Probability Distributions Equations
Probability mass function (pmf)
If a random variable X is discrete, i.e. it may take any of a specific set of n values xi, i = 1 to n, then:
P(X=xi) = p(xi)
p(x) is called the probability mass function
![Page 9: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/9.jpg)
Probability Distributions Equations
The graph of a probability mass function. All the values of this function must be non-negative and sum up to 1.
x: 1, 3, 7
p(x): 0.2, 0.5, 0.3
The probability mass function of a fair die. All the numbers on the die have an equal chance of appearing on top when the die stops rolling.
x: 1, 2, 3, 4, 5, 6
p(x): 1/6, 1/6, 1/6, 1/6, 1/6, 1/6
![Page 10: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/10.jpg)
Probability Distributions Equations
Note that
$$\sum_{i=1}^{n} p(x_i) = 1$$
and
$$F(x_k) = \sum_{i=1}^{k} p(x_i)$$
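Both identities on this slide (the pmf sums to 1, and the cdf is the running sum of the pmf) can be sketched for the fair die. Exact fractions are used here only to avoid rounding noise; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# Fair six-sided die: each face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

total = sum(pmf)              # sum of p(x_i) over all i
cdf = list(accumulate(pmf))   # F(x_k) = p(x_1) + ... + p(x_k)

for face, F in zip(faces, cdf):
    print(face, F)
```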
![Page 11: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/11.jpg)
Probability Distributions Equations
Probability density function (pdf)
If a random variable X is continuous, i.e. it may take any value within a defined range (or sometimes ranges), the probability of X having any precise value within that range is vanishingly small because a total probability of 1 must be distributed between an infinite number of values. In other words, there is no probability mass associated with any specific allowable value of X.
![Page 12: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/12.jpg)
Probability Distributions Equations
Instead, we define a probability density function f(x) as:
$$f(x) = \frac{d}{dx}F(x)$$
i.e. f(x) is the rate of change (the gradient) of the cumulative distribution function. Since F(x) is always non-decreasing, f(x) is always non-negative.
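The relationship between the pdf and the cdf can be illustrated numerically. This is a rough sketch, assuming the standard normal distribution as the example; it compares a central-difference estimate of the gradient of F with the known normal density. The function names and the step size `h` are illustrative.

```python
import math

def normal_cdf(x):
    """Standard normal cdf F(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    """Standard normal pdf f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def numerical_derivative(F, x, h=1e-5):
    """Central-difference estimate of dF/dx at x."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

# The two columns should agree closely at every point.
for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, numerical_derivative(normal_cdf, x), normal_pdf(x))
```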
![Page 13: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/13.jpg)
Probability Distributions Equations
For a continuous distribution we cannot define the probability of observing any exact value. However, we can determine the probability of lying between any two exact values (a, b):
$$P(a \le x \le b) = F(b) - F(a)$$
where b > a.
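As a worked example of the interval rule, the sketch below computes the probability that a standard normal variable lies within one standard deviation of its mean. The choice of distribution and the helper names are assumptions made for illustration.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cdf of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probability(a, b, cdf):
    """P(a <= X <= b) = F(b) - F(a), valid for b > a."""
    if b <= a:
        raise ValueError("b must be greater than a")
    return cdf(b) - cdf(a)

p_one_sigma = interval_probability(-1.0, 1.0, normal_cdf)
print(round(p_one_sigma, 4))  # 0.6827
```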
![Page 14: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/14.jpg)
Descriptive parameters for Probability Distributions
The section on probability parameters explains the meaning of standard statistics like mean and variance within the context of probability distributions.
![Page 15: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/15.jpg)
Descriptive parameters for Probability Distributions
Location
Mode: the x-value with the greatest probability p(x) for a discrete distribution, or the greatest probability density f(x) for a continuous distribution.
Median: the value that the variable has a 50% probability of exceeding, i.e. F(x50) = 0.5
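For a discrete distribution the mode and median can be read straight off the pmf. The sketch below uses the three-point pmf from the earlier graph (values 1, 3, 7). Because F(x) = 0.5 may have no exact solution for a discrete variable, it assumes the common convention of taking the smallest x with F(x) ≥ 0.5; that convention is not stated in the slides.

```python
values = [1, 3, 7]
probs = [0.2, 0.5, 0.3]

# Mode: the value with the greatest probability p(x).
mode = max(zip(values, probs), key=lambda pair: pair[1])[0]

def discrete_median(values, probs):
    """Smallest x whose cumulative probability reaches 0.5 (assumed convention)."""
    cumulative = 0.0
    for x, p in zip(values, probs):
        cumulative += p
        if cumulative >= 0.5:
            return x
    raise ValueError("probabilities must sum to at least 0.5")

median = discrete_median(values, probs)
print(mode, median)
```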
![Page 16: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/16.jpg)
Descriptive parameters for Probability Distributions
Mean: also known as the expected value, is given by:
$$\mu = \sum_{i=1}^{n} x_i\, p_i$$ for discrete variables
$$\mu = \int_{-\infty}^{\infty} x\, f(x)\, dx$$ for continuous variables
The mean is known as the first moment about zero. It can be considered to be the centre of gravity of the distribution.
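Both forms of the mean can be sketched in a few lines. The discrete case uses the fair die; the continuous case assumes a hypothetical density f(x) = 2x on [0, 1] and approximates the integral with a midpoint rule. The function names and the example density are illustrative.

```python
from fractions import Fraction

def discrete_mean(values, probs):
    """Sum of x_i * p_i over all outcomes."""
    return sum(x * p for x, p in zip(values, probs))

def continuous_mean(pdf, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of x * f(x) dx."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        total += x * pdf(x) * width
    return total

die_mean = discrete_mean(range(1, 7), [Fraction(1, 6)] * 6)
triangle_mean = continuous_mean(lambda x: 2.0 * x, 0.0, 1.0)
print(die_mean, triangle_mean)
```

For the assumed density the exact mean is 2/3, so the numerical result should be very close to that value.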
![Page 17: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/17.jpg)
Descriptive parameters for Probability Distributions
Spread
Standard Deviation: measures the amount of variation or dispersion from the average or mean. The standard deviation is the positive square root of the variance.
The standard deviation has the same dimension as the data, and hence is comparable with deviations from the mean.
![Page 18: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/18.jpg)
Descriptive parameters for Probability Distributions
Variance: measures how far a set of numbers is spread out.
An equivalent measure is the square root of the variance, called the standard deviation.
The variance is one of several descriptors of a probability distribution. In particular, the variance is one of the moments of a distribution.
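The slides give no formula for the variance, so the sketch below assumes the standard definition for a discrete variable, the probability-weighted average of squared deviations from the mean, and takes its square root to obtain the standard deviation. The fair die is used as the example.

```python
from fractions import Fraction
import math

values = range(1, 7)
probs = [Fraction(1, 6)] * 6

mean = sum(x * p for x, p in zip(values, probs))
# Variance: second moment about the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
# Standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

print(variance, std_dev)
```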
![Page 19: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/19.jpg)
Descriptive parameters for Probability Distributions
Shape
Skewness: The skewness statistic is calculated from the following formulae:
Discrete variable:
$$S = \frac{\sum_{i=1}^{n} (x_i - \mu)^3\, p_i}{\sigma^3}$$
Continuous variable:
$$S = \frac{\int_{\min}^{\max} (x - \mu)^3\, f(x)\, dx}{\sigma^3}$$
![Page 20: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/20.jpg)
Descriptive parameters for Probability Distributions
Kurtosis: The kurtosis statistic is calculated from the following formulae:
Discrete variable:
$$K = \frac{\sum_{i=1}^{n} (x_i - \mu)^4\, p_i}{\sigma^4}$$
Continuous variable:
$$K = \frac{\int_{\min}^{\max} (x - \mu)^4\, f(x)\, dx}{\sigma^4}$$
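The discrete skewness and kurtosis formulae share one pattern: a central moment of order 3 or 4 divided by the matching power of the standard deviation. A minimal sketch of that pattern for the fair die follows; the helper name `standardized_moment` is an illustrative choice.

```python
import math

def standardized_moment(values, probs, order):
    """Central moment of the given order divided by sigma**order."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    sigma = math.sqrt(variance)
    central = sum((x - mean) ** order * p for x, p in zip(values, probs))
    return central / sigma ** order

die = list(range(1, 7))
die_probs = [1 / 6] * 6

skewness = standardized_moment(die, die_probs, 3)  # order 3 gives S
kurtosis = standardized_moment(die, die_probs, 4)  # order 4 gives K
print(skewness, kurtosis)
```

The die is symmetric about its mean, so its skewness should come out as zero up to rounding, and its kurtosis is below 3, the value for a normal distribution.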
![Page 21: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/21.jpg)
Probability Theorems
The section on probability theorems explains some fundamental probability theorems most often used in modelling risk, and some other mathematical concepts that help us manipulate and explore probabilistic problems.
- The strong law of large numbers
- Central limit theorem
- Binomial theorem
- Bayes' theorem
![Page 22: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/22.jpg)
Probability Theorems
The strong law of large numbers
The strong law of large numbers says that the larger the sample size (i.e. the greater the number of iterations), the closer the distribution of the samples (i.e. the risk analysis output) will be to the theoretical distribution (i.e. the exact distribution of the model's output if it could be mathematically derived).
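A small simulation can illustrate this behaviour. The sketch below rolls a simulated fair die and reports the largest gap between an observed face frequency and the theoretical value 1/6 for increasing sample sizes. The seed, sample sizes and function name are arbitrary illustrative choices, and a single run only suggests the trend rather than proving it.

```python
import random
from collections import Counter

def max_frequency_error(n_rolls, rng):
    """Largest gap between an observed face frequency and 1/6."""
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return max(abs(counts[face] / n_rolls - 1 / 6) for face in range(1, 7))

rng = random.Random(0)  # fixed seed so the run is repeatable
errors = {n: max_frequency_error(n, rng) for n in (100, 10_000, 100_000)}

for n, err in errors.items():
    print(n, round(err, 4))
```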
![Page 23: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/23.jpg)
Probability Theorems
Central Limit Theorem (CLT)
The distribution of the sum of N i.i.d. random variables becomes increasingly Gaussian as N grows.
Example: N uniform [0,1] random variables.
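The uniform example can be simulated directly. A sum of N uniform [0,1] variables has mean N/2 and variance N/12, so the sketch below standardizes each sum and measures how often it falls within one standard deviation of the mean; for a Gaussian that fraction is about 0.683. N = 12, the seed and the sample count are assumptions chosen for illustration.

```python
import math
import random

def standardized_uniform_sum(n_terms, rng):
    """Sum of n_terms uniform[0,1] draws, shifted and scaled to mean 0, variance 1."""
    total = sum(rng.random() for _ in range(n_terms))
    return (total - n_terms / 2) / math.sqrt(n_terms / 12)

rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [standardized_uniform_sum(12, rng) for _ in range(20_000)]

sample_mean = sum(samples) / len(samples)
within_one_sd = sum(abs(z) <= 1.0 for z in samples) / len(samples)
print(round(sample_mean, 3), round(within_one_sd, 3))
```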
![Page 24: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/24.jpg)
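Probability Theorems
Binomial Theorem
A formula for finding any power of a binomial without multiplying at length:
$$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^{i}$$
where the binomial coefficient is
$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$
Properties of binomial coefficient
$$\binom{n}{0} = \binom{n}{n} = 1, \qquad \binom{n}{x} = \binom{n}{n-x}$$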
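The binomial theorem and the factorial form of the binomial coefficient can be checked with integer arithmetic. This is a minimal sketch using `math.comb` from the Python standard library (Python 3.8 or later); the function name `binomial_expansion` and the test values are illustrative.

```python
from math import comb, factorial

def binomial_expansion(a, b, n):
    """Evaluate (a + b)**n term by term using binomial coefficients."""
    return sum(comb(n, i) * a ** (n - i) * b ** i for i in range(n + 1))

n, x = 7, 3
coefficient = comb(n, x)
from_factorials = factorial(n) // (factorial(x) * factorial(n - x))

print(coefficient, from_factorials)            # both give the same coefficient
print(binomial_expansion(2, 3, 5), (2 + 3) ** 5)  # both give the same power
```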
![Page 25: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/25.jpg)
Probability Theorems
Bayes' theorem
A theorem describing how the conditional probability of each of a set of possible causes for a given observed outcome can be computed from knowledge of the probability of each cause and the conditional probability of the outcome given each cause.
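In symbols, the probability of cause C given outcome O is P(O|C)·P(C) divided by the sum of P(O|C')·P(C') over all causes C'. The sketch below applies this to two hypothetical causes, "A" and "B"; the prior and likelihood values are invented purely for illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return P(cause | outcome) for every cause via Bayes' theorem."""
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # total probability of the outcome
    return {c: j / evidence for c, j in joint.items()}

# Hypothetical numbers, chosen only for illustration.
priors = {"A": Fraction(1, 100), "B": Fraction(99, 100)}       # P(cause)
likelihoods = {"A": Fraction(9, 10), "B": Fraction(1, 20)}     # P(outcome | cause)

result = posterior(priors, likelihoods)
print(result["A"])  # 2/13
```

Even though the outcome is far more likely under cause A, the small prior for A keeps its posterior probability modest.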
![Page 26: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/26.jpg)
Topics
- Binary Variables
- The beta distribution
- Multinomial Variables
- The Dirichlet distribution
![Page 27: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/27.jpg)
Binary Variables
Binary variable: observations (i.e., dependent variables) that occur in one of two possible states, often labelled zero and one. E.g., “improved/not improved” and “completed task/failed to complete task.”
Coin flipping: heads=1, tails=0
$$p(x = 1 \mid \mu) = \mu$$
Bernoulli Distribution
$$\mathrm{Bern}(x \mid \mu) = \mu^{x} (1 - \mu)^{1 - x}$$
$$\mathbb{E}[x] = \mu$$
$$\operatorname{var}[x] = \mu (1 - \mu)$$
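The Bernoulli mean and variance follow directly from summing over the two outcomes. A minimal sketch, assuming an illustrative value μ = 0.3 for the probability of heads:

```python
def bernoulli_pmf(x, mu):
    """Bern(x | mu) = mu**x * (1 - mu)**(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return mu ** x * (1 - mu) ** (1 - x)

mu = 0.3  # assumed probability of heads, for illustration only
mean = sum(x * bernoulli_pmf(x, mu) for x in (0, 1))
variance = sum((x - mean) ** 2 * bernoulli_pmf(x, mu) for x in (0, 1))

print(mean, variance)  # compare with mu and mu * (1 - mu)
```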
![Page 28: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/28.jpg)
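Binary Variables
N coin flips:
$$p(m \text{ heads} \mid N, \mu)$$
Binomial distribution
$$\mathrm{Bin}(m \mid N, \mu) = \binom{N}{m} \mu^{m} (1 - \mu)^{N - m}$$
$$\mathbb{E}[m] \equiv \sum_{m=0}^{N} m\, \mathrm{Bin}(m \mid N, \mu) = N\mu$$
$$\operatorname{var}[m] \equiv \sum_{m=0}^{N} (m - \mathbb{E}[m])^{2}\, \mathrm{Bin}(m \mid N, \mu) = N\mu(1 - \mu)$$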
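The closed forms Nμ and Nμ(1 − μ) can be compared against the defining sums over m. The sketch below assumes N = 10 flips and μ = 0.25 as illustrative values.

```python
from math import comb

def binomial_pmf(m, n, mu):
    """Bin(m | N, mu): probability of m heads in n flips."""
    return comb(n, m) * mu ** m * (1 - mu) ** (n - m)

n, mu = 10, 0.25  # assumed values, for illustration only
pmf = [binomial_pmf(m, n, mu) for m in range(n + 1)]

total = sum(pmf)
mean = sum(m * p for m, p in enumerate(pmf))
variance = sum((m - mean) ** 2 * p for m, p in enumerate(pmf))

print(total, mean, variance)  # compare with 1, N*mu and N*mu*(1 - mu)
```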
![Page 29: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/29.jpg)
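Beta distribution
Beta is a continuous distribution defined on the interval between 0 and 1, i.e., $\mu \in [0, 1]$, parameterized by two positive parameters a and b.
$$\mathrm{Beta}(\mu \mid a, b) = \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)} \mu^{a - 1} (1 - \mu)^{b - 1}$$
$$\mathbb{E}[\mu] = \frac{a}{a + b}$$
$$\operatorname{var}[\mu] = \frac{ab}{(a + b)^{2} (a + b + 1)}$$
where Γ(·) is the gamma function. The beta distribution is conjugate to the binomial and Bernoulli distributions.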
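The beta density and its mean and variance can be checked numerically. This is a rough sketch, assuming illustrative parameters a = 2 and b = 5 and a simple midpoint rule on [0, 1]; the helper names are hypothetical.

```python
import math

def beta_pdf(mu, a, b):
    """Beta(mu | a, b) with the gamma-function normalizer."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * mu ** (a - 1) * (1 - mu) ** (b - 1)

def integrate(func, steps=20_000):
    """Midpoint-rule integral of func over [0, 1]."""
    width = 1.0 / steps
    return sum(func((i + 0.5) * width) for i in range(steps)) * width

a, b = 2.0, 5.0  # assumed parameters, for illustration only
area = integrate(lambda mu: beta_pdf(mu, a, b))
mean = integrate(lambda mu: mu * beta_pdf(mu, a, b))
variance = integrate(lambda mu: (mu - mean) ** 2 * beta_pdf(mu, a, b))

print(area, mean, variance)
```

The three results should be close to 1, a/(a+b) and ab/((a+b)²(a+b+1)) respectively.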
![Page 30: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/30.jpg)
Beta distribution
Illustration of one step of sequential Bayesian inference. The prior is given by a beta distribution with parameters a = 2, b = 2, and the likelihood function, given by (2.9) with N = m = 1, corresponds to a single observation of x = 1, so that the posterior is given by a beta distribution with parameters a = 3, b = 2.
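Because the beta prior is conjugate to the Bernoulli likelihood, the update in this illustration amounts to adding the number of observed ones to a and the number of observed zeros to b. A minimal sketch of that step; the function name `update_beta` is an illustrative choice.

```python
def update_beta(a, b, observations):
    """Posterior beta parameters after binary observations (1 = heads, 0 = tails)."""
    heads = sum(observations)
    tails = len(observations) - heads
    return a + heads, b + tails

# Prior Beta(2, 2) and a single observation x = 1, as on the slide.
posterior = update_beta(2, 2, [1])
print(posterior)  # (3, 2)
```

Calling the function again with the posterior as the new prior gives the sequential form of the same inference.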
![Page 31: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/31.jpg)
Beta distribution
Example
Beta1.odt
![Page 32: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/32.jpg)
Multinomial Distribution
The multinomial distribution is a generalization of the binomial distribution. Unlike the binomial distribution, where the random variable assumes two outcomes, the random variable for the multinomial distribution can assume k (k>2) possible outcomes.
Let N be the total number of independent trials, and let m_i, i = 1, 2, ..., k, be the number of times outcome i appears. Then, performing N independent trials, the probability that outcome 1 appears m_1 times, outcome 2 appears m_2 times, ..., and outcome k appears m_k times is
![Page 33: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/33.jpg)
Multinomial Distribution
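$$\mathrm{Mult}(m_1, m_2, \ldots, m_K \mid \boldsymbol{\mu}, N) = \binom{N}{m_1\, m_2 \ldots m_K} \prod_{k=1}^{K} \mu_k^{m_k}$$
$$\mathbb{E}[m_k] = N\mu_k$$
$$\operatorname{var}[m_k] = N\mu_k (1 - \mu_k)$$
$$\operatorname{cov}[m_j, m_k] = -N\mu_j \mu_k$$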
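The multinomial probability can be sketched as a multinomial coefficient, N! divided by the product of the m_k!, times the product of the outcome probabilities raised to their counts. The counts and probabilities below are hypothetical values chosen for illustration.

```python
from math import comb, factorial, prod

def multinomial_pmf(counts, mus):
    """Probability of observing the given outcome counts in sum(counts) trials."""
    n = sum(counts)
    coefficient = factorial(n)
    for m in counts:
        coefficient //= factorial(m)  # exact integer division at every step
    return coefficient * prod(mu ** m for mu, m in zip(mus, counts))

# Hypothetical example: 4 trials, 3 outcomes.
p = multinomial_pmf((2, 1, 1), (0.5, 0.3, 0.2))
print(p)

# With two outcomes the result matches the binomial distribution.
two_outcome = multinomial_pmf((3, 7), (0.25, 0.75))
binomial = comb(10, 3) * 0.25 ** 3 * 0.75 ** 7
print(two_outcome, binomial)
```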
![Page 34: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/34.jpg)
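The Dirichlet Distribution
The Dirichlet distribution is a continuous multivariate probability distribution parametrized by a vector of positive reals α. It is the multivariate generalization of the beta distribution.
Conjugate prior for the multinomial distribution.
$$\mathrm{Dir}(\boldsymbol{\mu} \mid \boldsymbol{\alpha}) = \frac{\Gamma(\alpha_0)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_K)} \prod_{k=1}^{K} \mu_k^{\alpha_k - 1}$$
$$\alpha_0 = \sum_{k=1}^{K} \alpha_k$$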
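The Dirichlet density can be sketched with the gamma function from the standard library. The example assumes the usual form of the density, a ratio of gamma functions times the product of each component raised to its parameter minus one, and checks that with two components it agrees with the beta density. The function name and the example parameters are illustrative.

```python
import math

def dirichlet_pdf(mus, alphas):
    """Dirichlet density at a point mus on the probability simplex."""
    if abs(sum(mus) - 1.0) > 1e-9:
        raise ValueError("components of mus must sum to 1")
    alpha_0 = sum(alphas)
    norm = math.gamma(alpha_0) / math.prod(math.gamma(a) for a in alphas)
    return norm * math.prod(mu ** (a - 1) for mu, a in zip(mus, alphas))

# All parameters equal to 1 gives a flat density over the simplex.
flat = dirichlet_pdf((0.2, 0.3, 0.5), (1.0, 1.0, 1.0))

# With K = 2 the density reduces to Beta(mu | a, b) with a = 2, b = 5.
two_component = dirichlet_pdf((0.3, 0.7), (2.0, 5.0))
print(flat, two_component)
```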
![Page 35: Probability distributionv1](https://reader033.fdocuments.in/reader033/viewer/2022061204/547f0ab0b4af9fc0148b45cb/html5/thumbnails/35.jpg)