Special Distributions - Data Science Initiative...2020/09/04 · Bernoulli Dist. Binomial Dist....
04 - Special Distributions
Discrete RVs
Bernoulli Dist.
Binomial Dist.
Poisson Dist.
Continuous RVs
Uniform Distribution
Normal Distribution
Gamma Distribution
Beta Distribution
Bivariate Normal
Road Map to Distributions
References
Special Distributions
Brian Vegetabile
2017 Statistics Bootcamp
Department of Statistics
University of California, Irvine
September 14th, 2016
Bernoulli Distribution I
• One of the most important distributions in statistics is the Bernoulli distribution.
• The Bernoulli distribution is used to describe experiments with binary outcomes, say 0 and 1.
  • Think ‘heads’ or ‘tails’, ‘yes’ or ‘no’, ‘win’ or ‘loss’
  • Often called a ‘Bernoulli trial’
• Ultimately, there is some probability p of ‘succeeding’ and a corresponding probability (1 − p) of failing, based upon the rules of probability.
Bernoulli Distribution II
• If we define the value 1 as being a success, we can write this as follows
$$X = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1 - p \end{cases}, \qquad 0 \le p \le 1$$
• To create a probability mass function, consider
$$P[X = 1] = p, \qquad P[X = 0] = 1 - p$$
therefore one way to write the mass function is as follows
$$P[X = x] = p_X(x) = \begin{cases} p^x (1 - p)^{1 - x} & x = 0, 1 \\ 0 & \text{otherwise} \end{cases}$$
• Show properties of this distribution: CDF, expectation, variance, MGF...
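As a quick illustration, the mass function above can be written as a small Python function. This is only a minimal sketch; the name `bernoulli_pmf` is a hypothetical helper chosen for this example and does not come from the slides.

```python
def bernoulli_pmf(x, p):
    """Return P[X = x] for X ~ Bernoulli(p) (hypothetical helper name)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must satisfy 0 <= p <= 1")
    if x in (0, 1):
        # p^x (1 - p)^(1 - x) picks out p when x = 1 and 1 - p when x = 0
        return p**x * (1 - p) ** (1 - x)
    return 0.0  # no mass outside {0, 1}


for x in (0, 1, 2):
    print(x, bernoulli_pmf(x, 0.3))
```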
Bernoulli Distribution III
• It is easy to see that this is a probability mass function.
  • $p_X(x) \ge 0$ for all x, and
  • $\sum_x p_X(x) = p + (1 - p) = 1$.
• We can also easily find the mean and variance,
$$E(X) = \sum_x x \, p_X(x) = 1 \times p + 0 \times (1 - p) = p$$
$$E(X^2) = \sum_x x^2 \, p_X(x) = 1^2 \times p + 0^2 \times (1 - p) = p$$
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = p - p^2 = p(1 - p)$$
• Additionally, we can find the moment generating function for this random variable
$$E(e^{tX}) = \sum_x e^{tx} \, p_X(x) = e^{t(1)} p + e^{t(0)} (1 - p) = (1 - p) + p e^t$$
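The mean, variance, and MGF calculations above can be checked numerically by summing over the support {0, 1}. The sketch below assumes illustrative values p = 0.3 and t = 0.5; the function names are hypothetical.

```python
import math


def bernoulli_summary(p):
    """Mean and variance of Bernoulli(p), summing over the support {0, 1}."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(x * q for x, q in pmf.items())              # E(X)
    second_moment = sum(x**2 * q for x, q in pmf.items())  # E(X^2)
    return mean, second_moment - mean**2                   # Var(X) = E(X^2) - E(X)^2


def bernoulli_mgf(t, p):
    """E(e^{tX}) computed directly from the definition."""
    pmf = {0: 1 - p, 1: p}
    return sum(math.exp(t * x) * q for x, q in pmf.items())


p, t = 0.3, 0.5  # illustrative values, not from the slides
mean, var = bernoulli_summary(p)
print(mean, var)  # compare with p and p(1 - p)
print(bernoulli_mgf(t, p), (1 - p) + p * math.exp(t))  # the two values should agree
```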
Binomial Distribution I
• Related to the Bernoulli distribution is the Binomial distribution.
• A binomial random variable can arise from a sequence of Bernoulli trials with the properties that
  • Trials are independent events
  • Each trial results in exactly one of the same two mutually exclusive outcomes
  • The probability of success (and consequently of failure) remains constant from trial to trial.
• Therefore a binomial random variable can be considered as the sum of n Bernoulli random variables, that is, the number of successes in n Bernoulli trials.
• Example: Number of ‘heads’ in ten independent coin tosses.
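The "sum of n Bernoulli trials" view can be sketched with a short simulation of the coin-toss example. The seed, the number of repetitions, and the helper names are assumptions made for illustration only.

```python
import random


def bernoulli_trial(p, rng):
    """One Bernoulli trial: 1 (success) with probability p, else 0."""
    return 1 if rng.random() < p else 0


def binomial_draw(n, p, rng):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(bernoulli_trial(p, rng) for _ in range(n))


rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [binomial_draw(10, 0.5, rng) for _ in range(10_000)]  # heads in ten fair tosses
sample_mean = sum(draws) / len(draws)
print(min(draws), max(draws), sample_mean)  # counts stay within 0..10
```

The sample mean should land near n × p = 5, although any single run will differ slightly from that value.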
Binomial Distribution II
• We can write the probability mass function in a similar way to the Bernoulli distribution
$$P[X = x] = p_X(x) = \begin{cases} \binom{n}{x} p^x (1 - p)^{n - x} & x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases}$$
• Note: Showing that this is indeed a distribution requires the use of the binomial theorem, where
$$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^{n - i} y^i$$
• The expectation and variance are also similar
$$E(X) = np, \qquad \mathrm{Var}(X) = np(1 - p)$$
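A small numerical check of these claims, assuming illustrative values n = 10 and p = 0.3 (the helper name `binomial_pmf` is hypothetical): the mass function should sum to 1, with mean np and variance np(1 − p).

```python
import math


def binomial_pmf(x, n, p):
    """P[X = x] for X ~ Binomial(n, p)."""
    if isinstance(x, int) and 0 <= x <= n:
        return math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return 0.0


n, p = 10, 0.3  # illustrative values
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
total = sum(probs)                                            # should be 1
mean = sum(x * q for x, q in enumerate(probs))                # should be n p
var = sum(x**2 * q for x, q in enumerate(probs)) - mean**2    # should be n p (1 - p)
print(total, mean, var)
```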
Poisson Distribution I
• Another important discrete distribution is the Poisson distribution.
• While the Binomial distribution counts the number of successes in a series of trials, the Poisson distribution counts the number of events in a given time interval.
  • Binomial ‘counts’ are bounded by the number of trials
  • Poisson counts in an interval are not bounded.
• Examples that generally can be modeled with a Poisson distribution
  • The number of misprints on a page (or a group of pages) of a book
  • The number of customers entering a post office on a given day
  • The number of α-particles discharged in a fixed period of time from some radioactive material
Poisson Distribution II
• Additionally, the Poisson distribution can be used to model the number of events that occur in a spatial region.
• The distribution is parameterized by a value λ, often referred to as the rate or intensity of the distribution, which governs the mean of the distribution.
• The mass function is given as follows
$$f(x|\lambda) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!} & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}$$
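The Poisson mass function can be sketched in Python as follows. The helper name `poisson_pmf` is hypothetical, and evaluating on the log scale is a design choice made here to avoid overflow in x! for large x; it is not part of the slides.

```python
import math


def poisson_pmf(x, lam):
    """f(x | lam) = e^{-lam} lam^x / x! for x = 0, 1, 2, ..., and 0 otherwise."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if isinstance(x, int) and x >= 0:
        # log f = -lam + x log(lam) - log(x!), using lgamma(x + 1) = log(x!)
        return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))
    return 0.0


for x in range(5):
    print(x, poisson_pmf(x, 2.0))
```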
Poisson Distribution III
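• To verify that this is a distribution, we must show that $\sum_{x=0}^{\infty} f(x|\lambda) = 1$. From calculus, we know the power series characterization $e^a = \sum_{n=0}^{\infty} \frac{a^n}{n!}$. Thus,
$$\sum_{x=0}^{\infty} \frac{e^{-\lambda} \lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda} e^{\lambda} = 1$$
• We can use similar mathematical tricks to derive the mean and variance, both of which equal λ.
• The Poisson distribution can be used to approximate the Binomial distribution when n is large and p is small, taking λ = np.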
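The normalization, the moments, and the Binomial approximation can all be checked numerically. In the sketch below the infinite sum is truncated at 100 terms, and λ = 3 and n = 1000 are illustrative assumptions; the helper names are hypothetical.

```python
import math


def poisson_pmf(x, lam):
    """Poisson mass function, evaluated on the log scale for stability."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def binomial_pmf(x, n, p):
    """Binomial mass function for 0 <= x <= n."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


lam = 3.0
xs = range(100)  # truncation of the infinite sum; the tail is negligible for lam = 3
total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum(x * x * poisson_pmf(x, lam) for x in xs) - mean**2
print(total, mean, var)  # compare with 1, lam, lam

# Binomial(n, lam / n) versus Poisson(lam) for a large n
n = 1000
max_gap = max(abs(binomial_pmf(x, n, lam / n) - poisson_pmf(x, lam)) for x in range(20))
print(max_gap)  # largest pointwise difference over x = 0..19
```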
Self-Study: Review Poisson Process
• The Poisson distribution can be derived from a few basic assumptions, which we list below without showing the derivation:
  i) Start with no arrivals
  ii) Arrivals in disjoint time periods are independent
  iii) Number of arrivals depends only on the period length
  iv) Arrival probability is proportional to the period length, if length is small
  v) No simultaneous arrivals
Uniform Distribution
• The simplest continuous distribution is when mass is spread out ‘uniformly’ on some interval [a, b]
• The density function is as follows:
$$f(x|a, b) = \begin{cases} \dfrac{1}{b - a} & \text{for } x \in [a, b] \\ 0 & \text{otherwise} \end{cases}$$
• Quickly show CDF and Expected Values
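The slide leaves the CDF and expected value to be shown in lecture. The sketch below uses the standard results, F(x) = (x − a)/(b − a) on [a, b] and E(X) = (a + b)/2, which are not stated on the slide, and checks the mean with a simple midpoint rule. The endpoints a = 2 and b = 5 and all function names are illustrative assumptions.

```python
def uniform_pdf(x, a, b):
    """Density 1 / (b - a) on [a, b], zero elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """P[X <= x]: 0 below a, linear on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def midpoint_integral(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


a, b = 2.0, 5.0  # illustrative endpoints
mean = midpoint_integral(lambda x: x * uniform_pdf(x, a, b), a, b)
print(mean, (a + b) / 2)        # numerical mean versus (a + b) / 2
print(uniform_cdf(3.5, a, b))   # CDF at the midpoint of the interval
```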
Normal Distribution I
• The most “famous” distribution is the Normal distribution, often informally referred to as the ‘bell curve’
• The distribution is symmetric and unbounded on the real line, and concentrates mass at its mean/mode/median.
• It is very useful and can be used to satisfactorily represent many phenomena in the world, such as
  • Distribution of heights of Air Force pilots
  • Distribution of IQ scores
  • Distribution of measurement errors
• The distribution plays an important role in the central limit theorem, which is used in much of statistics.
Normal Distribution II
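• The density of the distribution is
$$f(x|\mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \quad \text{for } -\infty < x < \infty$$
• The following are the mean and variance of the distribution
$$E(X) = \mu, \qquad \mathrm{Var}(X) = \sigma^2$$
• $\sqrt{\sigma^2} = \sigma$ is often referred to as the standard deviation of the distribution.
• We do not derive these properties here.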
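Although the properties are not derived here, they can be checked numerically. The sketch below integrates the density with a midpoint rule over μ ± 10σ, a truncation assumed to be wide enough for the tails to be negligible; μ = 1 and σ² = 4 are illustrative values and the function names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Normal density with mean mu and variance sigma2."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def midpoint_integral(f, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


mu, sigma2 = 1.0, 4.0  # illustrative values
sigma = math.sqrt(sigma2)
lo, hi = mu - 10 * sigma, mu + 10 * sigma  # truncated integration range

total = midpoint_integral(lambda x: normal_pdf(x, mu, sigma2), lo, hi)
mean = midpoint_integral(lambda x: x * normal_pdf(x, mu, sigma2), lo, hi)
var = midpoint_integral(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, sigma2), lo, hi)
print(total, mean, var)  # compare with 1, mu, sigma2
```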
Gamma Distribution I
• The Gamma distribution is an important positive-valued distribution.
• The Gamma distribution, under various parameter settings, is related to many other named distributions (exponential, Weibull, χ², etc.).
• The Gamma distribution also plays important roles throughout Bayesian statistics.
Gamma Distribution II
• An important mathematical relationship for this distribution is that of the gamma function; specifically, provided α is positive,
$$\Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} e^{-t} \, dt.$$
• Related are two important properties of this function
  1. $\Gamma(\alpha + 1) = \alpha \Gamma(\alpha)$
  2. For any integer $n \ge 1$, $\Gamma(n) = (n - 1)!$.
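Both properties, and the integral definition itself, can be checked against Python's built-in `math.gamma`. The integration helper below is a hypothetical, deliberately crude midpoint rule on a truncated range; it is only intended for α ≥ 1, where the integrand is bounded near zero.

```python
import math


def gamma_by_integration(alpha, upper=50.0, steps=100_000):
    """Midpoint-rule approximation of the integral of t^(alpha-1) e^(-t) over (0, upper)."""
    h = upper / steps
    return h * sum(
        ((i + 0.5) * h) ** (alpha - 1) * math.exp(-(i + 0.5) * h) for i in range(steps)
    )


alpha = 2.5  # illustrative value
print(gamma_by_integration(alpha), math.gamma(alpha))    # integral definition
print(math.gamma(alpha + 1), alpha * math.gamma(alpha))  # property 1
print([(math.gamma(n), math.factorial(n - 1)) for n in range(1, 6)])  # property 2
```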
Gamma Distribution III
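• The density of the gamma distribution is
$$f(x|\alpha, \beta) = \frac{1}{\Gamma(\alpha) \beta^{\alpha}} x^{\alpha - 1} \exp\left(-\frac{x}{\beta}\right), \qquad x > 0,$$
where α > 0 is the shape parameter, since it controls the ‘peakedness’ of the distribution, and β > 0 is the scale parameter, since it mainly influences the spread of the distribution.
• There is also an alternative parameterization that uses the rate 1/β in place of the scale β; see Wikipedia. (This will trip you up.)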
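To make the two parameterizations concrete, the sketch below writes the shape-scale density from the slide next to the shape-rate form, where the rate is assumed to be 1/β. Both function names are hypothetical; the point is only that the two forms agree once the parameter is converted.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Shape-scale gamma density, as written on the slide (beta is the scale)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta**alpha)


def gamma_pdf_rate(x, alpha, rate):
    """Shape-rate gamma density; rate = 1 / beta in terms of the slide's scale."""
    if x <= 0:
        return 0.0
    return rate**alpha * x ** (alpha - 1) * math.exp(-rate * x) / math.gamma(alpha)


x, alpha, beta = 2.0, 3.0, 2.0  # illustrative values
print(gamma_pdf(x, alpha, beta), gamma_pdf_rate(x, alpha, 1.0 / beta))  # should agree
```

Passing a scale where a rate is expected (or the reverse) gives a different distribution without raising any error, which is why it is worth checking which convention a given library or text uses.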
Kernel Trick for Integration I
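• To illustrate the ‘kernel trick’ for integration, we find the expected value of the gamma distribution.
$$E(X) = \int_0^{\infty} x \, \frac{1}{\Gamma(\alpha) \beta^{\alpha}} x^{\alpha - 1} \exp\left(-\frac{x}{\beta}\right) dx = \frac{1}{\Gamma(\alpha) \beta^{\alpha}} \int_0^{\infty} x^{(\alpha + 1) - 1} \exp\left(-\frac{x}{\beta}\right) dx$$
• We notice, though, that if we multiply and divide by $\frac{1}{\Gamma(\alpha + 1) \beta^{\alpha + 1}}$, then the integrand becomes the pdf of a Gamma(α + 1, β) distribution.
$$E(X) = \frac{\Gamma(\alpha + 1) \beta^{\alpha + 1}}{\Gamma(\alpha) \beta^{\alpha}} \int_0^{\infty} \frac{1}{\Gamma(\alpha + 1) \beta^{\alpha + 1}} x^{(\alpha + 1) - 1} \exp\left(-\frac{x}{\beta}\right) dx$$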
Kernel Trick for Integration II
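• The integral on the right is that of a Gamma(α + 1, β) density, so it equals 1, and we are left with the following expression.
$$E(X) = \frac{\Gamma(\alpha + 1) \beta^{\alpha + 1}}{\Gamma(\alpha) \beta^{\alpha}} = \alpha \beta$$
where the last equality holds by the gamma function property $\Gamma(\alpha + 1) = \alpha \Gamma(\alpha)$.
• The kernel trick will become invaluable through the course of the year.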
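The result E(X) = αβ can be sanity-checked by brute-force numerical integration, which is exactly the work the kernel trick avoids. The sketch assumes illustrative values α = 3 and β = 2 and truncates the integral at 200; all names are hypothetical.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Shape-scale gamma density (beta is the scale)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta**alpha)


def midpoint_integral(f, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


alpha, beta = 3.0, 2.0  # illustrative values
# Direct numerical evaluation of E(X) = integral of x f(x | alpha, beta)
numeric_mean = midpoint_integral(lambda x: x * gamma_pdf(x, alpha, beta), 0.0, 200.0)
# Ratio of normalizing constants produced by the kernel trick
ratio = math.gamma(alpha + 1) * beta ** (alpha + 1) / (math.gamma(alpha) * beta**alpha)
print(numeric_mean, ratio, alpha * beta)  # all three should agree closely
```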
Special Gamma Distributions
• The gamma(α, β) family has many special distributions.
• When α = 1, the gamma distribution reduces to the exponential distribution.
• If α = p/2, where p is a positive integer, and β = 2, then the gamma distribution becomes a χ² distribution with p degrees of freedom.
• The χ² distribution will become very important throughout the year.
• The list goes on and on....
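Both special cases can be checked pointwise. The sketch compares the gamma density with the standard exponential density (scale β) and the standard χ² density with p degrees of freedom; these two closed forms are textbook formulas rather than something given on the slides, and the function names are hypothetical.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Shape-scale gamma density (beta is the scale)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta**alpha)


def exponential_pdf(x, beta):
    """Exponential density with scale beta: (1 / beta) e^(-x / beta) for x > 0."""
    return math.exp(-x / beta) / beta if x > 0 else 0.0


def chi2_pdf(x, p):
    """Chi-square density with p degrees of freedom."""
    if x <= 0:
        return 0.0
    return x ** (p / 2 - 1) * math.exp(-x / 2) / (2 ** (p / 2) * math.gamma(p / 2))


for x in (0.5, 1.0, 3.0):
    print(gamma_pdf(x, 1.0, 2.0), exponential_pdf(x, 2.0))  # alpha = 1 case
    print(gamma_pdf(x, 5 / 2, 2.0), chi2_pdf(x, 5))         # alpha = p / 2, beta = 2 case
```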
Beta Distribution I
• Another important distribution that will come up often is the Beta distribution, which describes a continuous and bounded random variable.
• The density is continuous on the interval (0, 1) and is indexed by the parameters α and β.
• Most frequently used in Bayesian statistics to model a priori beliefs about proportions.
• There is a more general family of beta distributions for general intervals.
Beta Distribution II
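• The distribution relies on the relationship
$$B(\alpha, \beta) = \int_0^1 x^{\alpha - 1} (1 - x)^{\beta - 1} \, dx,$$
where $B(\alpha, \beta) = \dfrac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha + \beta)}$.
• Thus the density is
$$f(x|\alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1} \quad \text{for } x \in (0, 1), \ \alpha > 0, \ \beta > 0.$$
• When β = α = 1 the beta reduces to the Uniform distribution on (0, 1).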
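The relationship between the integral and the gamma-function form of B(α, β) can be checked numerically, along with the uniform special case. The sketch assumes illustrative values α = 2 and β = 3; the function names are hypothetical.

```python
import math


def beta_function(alpha, beta):
    """B(alpha, beta) via the gamma-function identity."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)


def beta_pdf(x, alpha, beta):
    """Beta density on (0, 1), zero elsewhere."""
    if not 0.0 < x < 1.0:
        return 0.0
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / beta_function(alpha, beta)


def midpoint_integral(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


alpha, beta = 2.0, 3.0  # illustrative values
numeric_B = midpoint_integral(lambda x: x ** (alpha - 1) * (1 - x) ** (beta - 1), 0.0, 1.0)
print(numeric_B, beta_function(alpha, beta))  # integral versus gamma-function form
print(beta_pdf(0.3, 1.0, 1.0))                # alpha = beta = 1: flat density on (0, 1)
```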
Bivariate Normal Distributions
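• To introduce multivariate distributions, we define the bivariate normal distribution.
• A RV X = (X1, X2) has the bivariate normal distribution $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$ if (for some $\sigma_i > 0$, $-1 < \rho < 1$, and real-valued $\mu_i$)
$$f(x|\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left(-\frac{1}{2(1 - \rho^2)} \left\{ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho \left(\frac{x_1 - \mu_1}{\sigma_1}\right) \left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right\}\right)$$
• When ρ = 0 the density factors into the product of two normal densities, so X1 and X2 are independent normal random variables.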
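The factorization at ρ = 0 can be seen by evaluating the joint density and the product of the two marginal normal densities at the same point. The parameter values and function names below are illustrative assumptions.

```python
import math


def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


def bivariate_normal_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho):
    """Bivariate normal density for sigma1, sigma2 > 0 and -1 < rho < 1."""
    z1 = (x1 - mu1) / sigma1
    z2 = (x2 - mu2) / sigma2
    quad = z1**2 - 2 * rho * z1 * z2 + z2**2  # the term in braces
    norm = 2 * math.pi * sigma1 * sigma2 * math.sqrt(1 - rho**2)
    return math.exp(-quad / (2 * (1 - rho**2))) / norm


x1, x2 = 0.4, -1.2  # illustrative evaluation point
joint = bivariate_normal_pdf(x1, x2, 0.0, 1.0, 1.5, 2.0, rho=0.0)
product = normal_pdf(x1, 0.0, 1.5) * normal_pdf(x2, 1.0, 2.0)
print(joint, product)  # equal up to rounding when rho = 0
```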
Roadmap of Univariate Distributions
• https://en.wikipedia.org/wiki/Relationships_among_probability_distributions