Probability Distributions and Frequentist Statistics

“A single death is a tragedy, a million deaths is a statistic” (Joseph Stalin)
  • date post

    22-Dec-2015

Page 1

Probability Distributions and Frequentist Statistics

“A single death is a tragedy, a million deaths is a statistic”

Joseph Stalin

Page 2

Can we answer that?

An urn holds N balls in total: M red and N - M blue. Balls are drawn one at a time without replacement.

1st draw: P(R1 | I) = M/N

2nd draw: P(R2 | I) = ?

Page 3

The Red and the Blue

Urn: N balls in total, M red and N - M blue.

Red on the second draw splits into two mutually exclusive cases:

R2 = (R1 + B1), R2 = R1, R2 + B1, R2

Using the product rule:

$$
\begin{aligned}
P(R_2 \mid I) &= P(R_1, R_2 \mid I) + P(B_1, R_2 \mid I) \\
&= P(R_1 \mid I)\, P(R_2 \mid R_1, I) + P(B_1 \mid I)\, P(R_2 \mid B_1, I) \\
&= \frac{M}{N}\,\frac{M-1}{N-1} + \frac{N-M}{N}\,\frac{M}{N-1} \\
&= \frac{M}{N} \\
&= P(R_1 \mid I)
\end{aligned}
$$

... = P(R3 | I) etc.

The outcome of the first draw is a “nuisance” parameter. Marginalize = integrate over all options.
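The urn result can be checked with exact arithmetic. The sketch below assumes draws without replacement, as the (M - 1)/(N - 1) factor in the derivation implies; the function name `p_red_second` is illustrative and not from the slides.

```python
from fractions import Fraction

def p_red_second(M: int, N: int) -> Fraction:
    """P(R2 | I) for an urn of N balls, M of them red, drawn without replacement."""
    p_r1 = Fraction(M, N)                    # P(R1 | I)
    p_b1 = Fraction(N - M, N)                # P(B1 | I)
    p_r2_given_r1 = Fraction(M - 1, N - 1)   # one red ball already removed
    p_r2_given_b1 = Fraction(M, N - 1)       # one blue ball already removed
    # Marginalize over the outcome of the first draw.
    return p_r1 * p_r2_given_r1 + p_b1 * p_r2_given_b1

print(p_red_second(3, 10))  # 3/10, the same as P(R1 | I)
```

Using `Fraction` rather than floats keeps the equality with M/N exact instead of approximate.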

Page 4

Marginalization

                  RAIN    NO RAIN    Chance of Cloud
CLOUDS            1/6     1/3        1/2
NO CLOUDS         0       1/2        1/2
Chance of Rain    1/6     5/6
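The marginal values in the rain and cloud table can be reproduced by summing the joint probabilities over the variable that is not of interest. This is a small sketch that assumes the joint table reads P(clouds, rain) = 1/6, P(clouds, no rain) = 1/3, P(no clouds, rain) = 0 and P(no clouds, no rain) = 1/2; the helper name `marginal` is illustrative.

```python
from fractions import Fraction as F

# Joint probabilities, keyed by (sky, weather). Assumed reading of the slide's table.
joint = {
    ("clouds", "rain"): F(1, 6),
    ("clouds", "no rain"): F(1, 3),
    ("no clouds", "rain"): F(0),
    ("no clouds", "no rain"): F(1, 2),
}

def marginal(joint, axis):
    """Sum the joint table over every key component except `axis`."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], F(0)) + p
    return out

p_sky = marginal(joint, 0)    # chance of cloud, rain marginalized out
p_rain = marginal(joint, 1)   # chance of rain, cloud marginalized out
print(p_sky, p_rain)
```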

Page 5

Marginalization

Where the A_i represent a set of mutually exclusive and exhaustive possibilities, marginalization, or integrating out of “nuisance parameters”, takes the form (writing θ for the quantity of interest):

$$ P(\theta \mid D, I) = \sum_i P(\theta, A_i \mid D, I) $$

In the limit of a continuously variable parameter A (rather than the discrete case above), P becomes a probability density function and the sum becomes an integral:

$$ P(\theta \mid D, I) = \int dA\, P(\theta, A \mid D, I) $$

This technique is often required in inference. For example, we may be interested in the frequency of a sinusoidal signal in noisy data but not in its amplitude (a nuisance parameter).

Page 6

Probability Distributions

We denote probability distributions over all possible values of a variable x by p(x).

Discrete

Continuous

$$ f(x) = \lim_{\delta x \to 0} \frac{p(x < X < x + \delta x)}{\delta x} $$

Cumulative

Page 7

Properties of Probability Distributions

The expectation value for a function g(X) is the weighted average:

$$
\langle g(X) \rangle =
\begin{cases}
\sum_{\text{all } x} g(x)\, p(x) & \text{(discrete)} \\
\int g(x)\, f(x)\, dx & \text{(continuous)}
\end{cases}
$$

For g(X) = X, if it exists, this is the first moment, or mean, of the distribution.

The rth moment for a random variable X about the origin (x = 0) is:

$$
\mu'_r = \langle X^r \rangle =
\begin{cases}
\sum_{\text{all } x} x^r\, p(x) & \text{(discrete)} \\
\int x^r\, f(x)\, dx & \text{(continuous)}
\end{cases}
$$

The mean $\mu = \mu'_1 = \langle X \rangle$ is the 1st moment about the origin.

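Page 8

Properties of Probability Distributions

The rth central moment for a random variable X about the mean (origin = μ) is:

$$
\mu_r = \langle (X-\mu)^r \rangle =
\begin{cases}
\sum_{\text{all } x} (x-\mu)^r\, p(x) & \text{(discrete)} \\
\int (x-\mu)^r\, f(x)\, dx & \text{(continuous)}
\end{cases}
$$

First central moment: $\mu_1 = \langle X - \mu \rangle = 0$

Second central moment: $\mathrm{Var}(X) = \sigma_x^2 = \langle (X - \mu)^2 \rangle$

$$
\begin{aligned}
\sigma_x^2 &= \langle (X-\mu)^2 \rangle = \langle X^2 - 2\mu X + \mu^2 \rangle = \langle X^2 \rangle - 2\mu \langle X \rangle + \mu^2 \\
&= \langle X^2 \rangle - 2\mu^2 + \mu^2 = \langle X^2 \rangle - \mu^2 = \langle X^2 \rangle - \langle X \rangle^2
\end{aligned}
$$

Therefore the variance $\sigma_x^2 = \langle X^2 \rangle - \langle X \rangle^2$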
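The moment definitions and the variance identity can be checked on a small discrete distribution. The sketch below uses a fair six-sided die as an assumed example (it does not appear in the slides); the helper name `moment` is illustrative.

```python
from fractions import Fraction

def moment(pmf, r, about=0):
    """r-th moment of a discrete distribution about the point `about`."""
    return sum(p * (x - about) ** r for x, p in pmf.items())

# Fair die: p(x) = 1/6 for x = 1..6 (example chosen for illustration).
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = moment(die, 1)                      # first moment about the origin
second = moment(die, 2)                    # second moment about the origin
first_central = moment(die, 1, about=mean)
var_central = moment(die, 2, about=mean)   # second central moment
var_identity = second - mean ** 2          # <X^2> - <X>^2

print(mean, first_central, var_central, var_identity)
```

Both routes to the variance give the same exact fraction, and the first central moment vanishes as stated.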

Page 9

Properties of Probability Distributions

Third central moment: $\mu_3 = \langle (X-\mu)^3 \rangle$ (Skewness)

Fourth central moment: $\mu_4 = \langle (X-\mu)^4 \rangle$ (Kurtosis)

The median and the mode both provide estimates of central tendency for a distribution, and are in many cases more robust against outliers than the mean.
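A short illustration of that robustness, using made-up measurements: adding a single wild outlier drags the mean far from the bulk of the data while the median barely moves.

```python
import statistics

# Made-up measurements clustered near 10, for illustration only.
data = [9.8, 10.1, 10.0, 9.9, 10.2]
with_outlier = data + [1000.0]   # one corrupted reading

for label, values in (("clean", data), ("with outlier", with_outlier)):
    print(label, statistics.mean(values), statistics.median(values))
```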

Page 10

Example: Mean and Median filtering

[Figure: an image degraded by salt noise, shown with the outputs of a mean filter and a median filter.]
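The same comparison can be sketched in one dimension. The code below is an assumed 1-D analogue of the image example, with an invented signal in which the value 255 plays the role of salt noise; `sliding_filter` is an illustrative helper, not a library function.

```python
import statistics

def sliding_filter(signal, width, reducer):
    """Apply `reducer` to a centred window of `width` samples (truncated at the edges)."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(reducer(window))
    return out

# Flat signal of value 10 with isolated "salt" spikes of 255 (made-up data).
signal = [10, 10, 10, 255, 10, 10, 10, 255, 10, 10]

mean_filtered = sliding_filter(signal, 3, statistics.mean)
median_filtered = sliding_filter(signal, 3, statistics.median)

print(mean_filtered)    # spikes are smeared into their neighbours
print(median_filtered)  # isolated spikes are removed
```

With a window of three samples, the median discards any single spike, whereas the mean spreads each spike over three output samples.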

Page 11

The Uniform Distribution

A flat distribution with peak value normalized so that the area under the curve = 1.

[Figure: uniform PDF and cumulative uniform distribution.]

• Commonly used as an ignorance prior to express impartiality (a lack of bias) about the value of a quantity over the given interval.

• Round-off and quantization errors are typically modeled as uniformly distributed.
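The round-off claim can be probed numerically. A uniform distribution of unit width has variance 1/12, so if rounding errors are uniform on [-0.5, 0.5] their sample variance should be close to that value. The sketch assumes inputs spread evenly over a range much wider than the rounding step; the seed and sample size are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Error made when rounding a real number to the nearest integer.
errors = []
for _ in range(100_000):
    x = random.uniform(0.0, 100.0)
    errors.append(x - round(x))

mean_err = sum(errors) / len(errors)
var_err = sum((e - mean_err) ** 2 for e in errors) / len(errors)

print(mean_err, var_err, 1 / 12)
```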

Page 12

The Binomial Distribution

Binomial statistics apply when there are exactly two mutually exclusive outcomes of a trial (labelled “success” and “failure”). The binomial distribution gives the probability of observing k successes in n trials, with the probability of success on a single trial denoted by p (p is assumed fixed for all trials).

[Figure: binomial distributions for fixed n with varying p, and for fixed p with varying n.]

• Among the most useful discrete distribution functions in statistics.

• The multinomial distribution is a generalization to trials with more than two possible outcomes.
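For reference, the standard binomial probability of k successes in n independent trials is C(n, k) p^k (1 - p)^(n - k). A minimal sketch of that formula follows; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Fixed n, varying p: probability of 2 successes in 4 trials.
for p in (0.1, 0.5, 0.9):
    print(p, binomial_pmf(2, 4, p))
```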

Page 13

The Negative Binomial Distribution

Closely related to the binomial distribution, the negative binomial distribution applies under the same circumstances, but the variable of interest is the number of trials n needed to obtain k successes (and hence n - k failures), rather than the number of successes in n trials. For Bernoulli trials each with success probability p, it gives the probability of observing k successes and n - k failures in n trials, with the kth success falling on the last trial.
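Under the convention that n counts the trials needed to reach the kth success, the last trial must be a success and the other k - 1 successes can fall anywhere among the first n - 1 trials, which gives C(n - 1, k - 1) p^k (1 - p)^(n - k). The sketch below assumes that convention (other references parameterize by the number of failures instead); `neg_binomial_pmf` is an illustrative name.

```python
from math import comb

def neg_binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability that the k-th success occurs exactly on trial n."""
    if n < k:
        return 0.0  # k successes cannot fit into fewer than k trials
    return comb(n - 1, k - 1) * p ** k * (1 - p) ** (n - k)

# With k = 1 this reduces to the geometric distribution.
print(neg_binomial_pmf(3, 1, 0.5))  # two failures then one success
```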

Page 14

The Poisson Distribution

Another crucial discrete distribution function, the Poisson expresses the probability of a number of events k (e.g. failures, arrivals, occurrences ...) occurring in a fixed period of time (or fixed area of space), provided these events occur with a known mean rate λ (events/time) and independently of the time since the previous event.

• The Poisson distribution is the limiting case of a binomial distribution in which the probability of success p goes to zero while the number of trials n grows such that λ = np stays finite.

• Examples: photons received from a star in an interval; meteorite impacts over an area; pedestrians crossing at an intersection, etc.
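The limiting relationship can be seen numerically using the standard Poisson probability λ^k e^(-λ) / k!. The sketch holds λ = np fixed at 3 (an arbitrary choice) and measures the largest gap between the binomial and Poisson probabilities as n grows; the function names are illustrative.

```python
from math import comb, exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of k events when the expected count is lam."""
    return lam ** k * exp(-lam) / factorial(k)

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

lam = 3.0
gaps = []
for n in (10, 100, 10_000):
    p = lam / n  # shrink p as n grows so that n * p stays equal to lam
    gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(10))
    gaps.append(gap)
    print(n, gap)
```

The printed gap shrinks as n increases, which is the behaviour the limit statement predicts.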

Page 15

The Normal (Gaussian) Distribution

The Normal or Gaussian distribution is probably the best-known statistical distribution. A Gaussian with mean zero and standard deviation one is known as the Standard Normal Distribution. Given mean μ and standard deviation σ it has the PDF:

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) $$

• A continuous distribution that is the limiting case of the binomial as the number of trials (and successes) becomes very large.

• Its pivotal role in statistics is partly due to the Central Limit Theorem (see later).
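The binomial limit can be illustrated by comparing binomial probabilities with a Gaussian of matching mean np and standard deviation sqrt(np(1 - p)). The values n = 1000 and p = 0.5 below are arbitrary choices for the sketch, and the function names are illustrative.

```python
from math import comb, exp, pi, sqrt

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Gaussian density with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 1000, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))  # matching mean and standard deviation

for k in (470, 500, 530):
    print(k, binomial_pmf(k, n, p), normal_pdf(k, mu, sigma))
```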

Page 16

Examples: Gaussian Distributions

[Figure: Human IQ distribution.]

Page 17

The Power Law Distribution

Power law distributions are ubiquitous in science, occurring in diverse phenomena including city sizes, incomes, word frequencies, and earthquake magnitudes. A power law implies that small occurrences are extremely common, whereas large instances are extremely rare. This “law” takes a number of forms (it is sometimes referred to as Zipf's law or as a Pareto distribution). A simple illustrative power law is:

[Figure: power law PDF on a linear scale and on a log-log scale, for k = 0.5, 1.0, 2.0.]
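A power law appears as a straight line on log-log axes, with slope equal to minus the exponent. The sketch below assumes the unnormalized form p(x) = c x^(-k), which is an assumption made here for illustration because the slide's own formula did not survive in this transcript; the function names are likewise illustrative.

```python
from math import log

def power_law(x: float, k: float, c: float = 1.0) -> float:
    """Unnormalized power law c * x**(-k) (assumed form, for illustration)."""
    return c * x ** (-k)

def loglog_slope(k: float, x1: float = 1.0, x2: float = 100.0) -> float:
    """Slope of log p(x) against log x between two points."""
    return (log(power_law(x2, k)) - log(power_law(x1, k))) / (log(x2) - log(x1))

for k in (0.5, 1.0, 2.0):
    print(k, loglog_slope(k))  # slope is -k up to floating-point rounding
```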

Page 18

Example Power Laws from Nature

Page 19

Physics Example: Cosmic Ray Spectrum

Page 20

The Exponential Distribution

The exponential distribution is a continuous probability distribution with an exponential falloff controlled by the rate parameter λ: larger values of λ entail a more rapid falloff in the distribution.

• The exponential distribution is used to model times between independent events which happen at a constant average rate (e.g. lifetimes, waiting times).
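Waiting times of this kind can be simulated by inverting the exponential cumulative distribution 1 - e^(-λx). The sketch below assumes the standard density λ e^(-λx) and checks that the sample mean approaches 1/λ; the rate, seed and sample size are arbitrary, and `sample_exponential` is an illustrative name.

```python
import random
from math import log

def sample_exponential(lam: float, rng: random.Random) -> float:
    """Inverse-transform sample: solve u = 1 - exp(-lam * x) for x."""
    u = rng.random()              # uniform on [0, 1)
    return -log(1.0 - u) / lam    # 1 - u lies in (0, 1], so the log is finite

rng = random.Random(1)  # fixed seed so the run is repeatable
lam = 2.0
waits = [sample_exponential(lam, rng) for _ in range(100_000)]
mean_wait = sum(waits) / len(waits)

print(mean_wait, 1 / lam)  # the two values should be close
```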

Page 21

The Gamma Distribution

The gamma distribution is a continuous PDF characterized by two parameters, usually designated the shape parameter k and the scale parameter θ. When k = 1 it coincides with the exponential distribution; it is also closely related to the Poisson and chi-squared distributions.

Gamma PDF:

$$ f(x; k, \theta) = \frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\, \theta^{k}}, \qquad x > 0 $$

Where the Gamma function is defined:

$$ \Gamma(k) = \int_0^\infty t^{k-1} e^{-t}\, dt $$

• The Gamma distribution gives a flexible class of PDFs for nonnegative phenomena, often used in modeling waiting times.

• Conjugate prior for the rate parameter of the Poisson distribution.
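The k = 1 special case can be verified directly. The sketch below uses the standard shape-scale form of the gamma density, x^(k-1) e^(-x/θ) / (Γ(k) θ^k), and compares it with an exponential density of rate λ = 1/θ; the function names and test points are illustrative.

```python
from math import exp, gamma

def gamma_pdf(x: float, k: float, theta: float) -> float:
    """Gamma density with shape k and scale theta, for x > 0."""
    if x <= 0:
        return 0.0
    return x ** (k - 1) * exp(-x / theta) / (gamma(k) * theta ** k)

def exponential_pdf(x: float, lam: float) -> float:
    """Exponential density with rate lam, for x >= 0."""
    return lam * exp(-lam * x) if x >= 0 else 0.0

theta = 2.0
for x in (0.1, 1.0, 5.0):
    # With k = 1 the gamma density equals an exponential with rate 1 / theta.
    print(x, gamma_pdf(x, 1.0, theta), exponential_pdf(x, 1.0 / theta))
```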

Page 22

The Beta Distribution

The family of beta probability distributions is defined on the fixed interval [0,1] and parameterized by two positive shape parameters, α and β. In Bayesian statistics it is frequently encountered as a prior for the success probability of the binomial distribution.

Beta PDF:

$$ f(x; \alpha, \beta) = \frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha, \beta)} $$

Where the Beta function is defined:

$$ B(\alpha, \beta) = \int_0^1 t^{\alpha-1} (1-t)^{\beta-1}\, dt $$

• The family of Beta distributions allows for a wide variety of shapes over a fixed interval.

• If the likelihood function is a binomial, then a Beta prior will lead to another Beta distribution for the posterior.

• The role of the Beta function can be thought of as a simple normalization to ensure that the total PDF integrates to 1.0.
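The conjugacy statement amounts to a simple update rule: a Beta(α, β) prior combined with s successes and f failures gives a Beta(α + s, β + f) posterior. The sketch below assumes a uniform Beta(1, 1) prior and invented counts of 7 successes and 3 failures, and checks that prior times likelihood is proportional to the updated Beta density. It evaluates B(α, β) through the identity Γ(α)Γ(β)/Γ(α + β); the function names are illustrative.

```python
from math import gamma

def beta_pdf(x: float, a: float, b: float) -> float:
    """Beta density on [0, 1]; B(a, b) computed as Gamma(a) Gamma(b) / Gamma(a + b)."""
    norm = gamma(a) * gamma(b) / gamma(a + b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / norm

def beta_posterior(a: float, b: float, successes: int, failures: int):
    """Conjugate update of a Beta(a, b) prior with binomial data."""
    return a + successes, b + failures

successes, failures = 7, 3                           # made-up data
a_post, b_post = beta_posterior(1, 1, successes, failures)

# prior * likelihood divided by the posterior density should not depend on x.
ratios = [
    beta_pdf(x, 1, 1) * x ** successes * (1 - x) ** failures / beta_pdf(x, a_post, b_post)
    for x in (0.2, 0.5, 0.8)
]
print(a_post, b_post, ratios)
```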

Page 23

Central Limit Theorem: Experimental demonstration
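One possible experimental demonstration, sketched here under assumptions not taken from the slides: average n independent uniform(0, 1) variates and measure what fraction of the averages falls within one standard deviation of their mean. For a single uniform variate that fraction is about 0.577, while for a Gaussian it is about 0.683, so the drift towards 0.683 as n grows indicates convergence to normality. The seed, sample sizes and function names are arbitrary.

```python
import random
from math import sqrt

rng = random.Random(42)  # fixed seed so the run is repeatable

def sample_means(n_terms: int, n_samples: int = 20_000):
    """Draw n_samples averages of n_terms independent uniform(0, 1) variates."""
    return [sum(rng.random() for _ in range(n_terms)) / n_terms
            for _ in range(n_samples)]

def fraction_within_one_sigma(values) -> float:
    """Fraction of values lying within one sample standard deviation of the sample mean."""
    mu = sum(values) / len(values)
    sigma = sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return sum(abs(v - mu) <= sigma for v in values) / len(values)

fractions_by_n = {n: fraction_within_one_sigma(sample_means(n)) for n in (1, 2, 12)}
print(fractions_by_n)
```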

Page 24

Central Limit Theorem: A Bayesian demonstration

Define the propositions:

X1 ≡ x1 lies in the interval x1 to x1 + dx1

X2 ≡ x2 lies in the interval x2 to x2 + dx2

Y ≡ y lies in the interval y to y + dy

I ≡ Y is the sum of X1 and X2

with P(x1 | I) = f1(x1) and P(x2 | I) = f2(x2).

$$
\begin{aligned}
P(Y \mid I) &= \int dX_1 \int dX_2\, P(Y, X_1, X_2 \mid I) \\
&= \int dX_1 \int dX_2\, P(X_1 \mid I)\, P(X_2 \mid I)\, P(Y \mid X_1, X_2, I)
\end{aligned}
$$

using the product rule and the independence of X1, X2.

$$ P(Y \mid X_1, X_2, I) = \delta(y - x_1 - x_2) $$

because y = x1 + x2. Therefore

$$ P(Y \mid I) = \int dX_1\, f_1(x_1) \int dX_2\, f_2(x_2)\, \delta(y - x_1 - x_2) = \int dX_1\, f_1(x_1)\, f_2(y - x_1) $$

which is a convolution integral.
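The discrete counterpart of this convolution integral is easy to compute by hand. The sketch below uses two fair dice as an assumed example: the distribution of their sum is the convolution of the two individual distributions. The function name `convolve` is illustrative.

```python
def convolve(f1, f2):
    """Discrete convolution: out[y] = sum over x1 of f1[x1] * f2[y - x1]."""
    out = [0.0] * (len(f1) + len(f2) - 1)
    for i, a in enumerate(f1):
        for j, b in enumerate(f2):
            out[i + j] += a * b
    return out

die = [1 / 6] * 6               # index 0..5 stands for faces 1..6
two_dice = convolve(die, die)   # index 0..10 stands for sums 2..12

for total, prob in enumerate(two_dice, start=2):
    print(total, round(prob, 4))
```

Two flat distributions convolve into a triangular one peaked at a sum of 7, which is already a step towards the bell shape.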

Page 25

Central Limit Theorem: Convolution Demonstration
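A convolution-based demonstration can be sketched by convolving a distribution with itself repeatedly and comparing the result with a Gaussian of the same mean and variance. The example below assumes ten fair dice, whose sum has mean 35 and variance 10 × 35/12; the choice of dice and all names are illustrative rather than taken from the slides.

```python
from math import exp, pi, sqrt

def convolve(f1, f2):
    """Discrete convolution of two probability lists."""
    out = [0.0] * (len(f1) + len(f2) - 1)
    for i, a in enumerate(f1):
        for j, b in enumerate(f2):
            out[i + j] += a * b
    return out

def normal_pdf(x, mu, sigma):
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

die = [1 / 6] * 6     # faces 1..6
n_dice = 10

pmf = die
for _ in range(n_dice - 1):
    pmf = convolve(pmf, die)   # distribution of the sum after adding one more die

mu = 3.5 * n_dice                  # mean of the sum
sigma = sqrt(n_dice * 35 / 12)     # a single die has variance 35/12

# Index i of pmf corresponds to a total of i + n_dice.
max_gap = max(abs(p - normal_pdf(i + n_dice, mu, sigma)) for i, p in enumerate(pmf))
print(max_gap)
```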