Discrete Probability Distributions Lecture

Discrete Probability Distributions

A CEE3030 lecture prepared by

Gilberto E. Urroz

February 2006

Reference

The subjects presented are taken from the Maple worksheet entitled DiscreteProbabilityDistributions, available for download in the class schedule

Quick review of concepts for discrete random variables - 1

Let X be a discrete random variable, then

f(x) = P(X = x) is the probability mass function (pmf)

$F(x) = P(X \le x) = \sum_{u \le x} f(u)$ is the cumulative distribution function (CDF)

Calculation of probabilities (for X taking integer values):

P(X < x) = F(x-1) -- P(X ≤ x) = F(x)
P(X > x) = 1-F(x) -- P(X ≥ x) = 1-F(x-1)
P(a < X < b) = F(b-1)-F(a)
P(a ≤ X < b) = F(b-1)-F(a-1)
P(a < X ≤ b) = F(b)-F(a)
P(a ≤ X ≤ b) = F(b)-F(a-1)
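The CDF shortcuts for integer-valued X can be checked numerically. The following is a minimal Python sketch (plain standard library, not Maple); the binomial pmf and the values of n, p, a and b are assumptions chosen only for illustration. Each probability is computed twice, once by summing the pmf directly and once through the CDF rule.

```python
from math import comb

# Hypothetical example: X ~ Binomial(n=10, p=0.3), chosen only for illustration.
n, p = 10, 0.3

def f(x):
    """pmf f(x) = P(X = x); zero outside 0..n."""
    return comb(n, x) * p**x * (1 - p)**(n - x) if 0 <= x <= n else 0.0

def F(x):
    """CDF F(x) = P(X <= x), accumulated from the pmf."""
    return sum(f(u) for u in range(0, x + 1))

a, b = 2, 6
# Each entry: (direct sum over the pmf, value from the CDF rule)
checks = {
    "P(X < b)":       (sum(f(x) for x in range(0, b)),         F(b - 1)),
    "P(X > a)":       (sum(f(x) for x in range(a + 1, n + 1)), 1 - F(a)),
    "P(X >= a)":      (sum(f(x) for x in range(a, n + 1)),     1 - F(a - 1)),
    "P(a < X < b)":   (sum(f(x) for x in range(a + 1, b)),     F(b - 1) - F(a)),
    "P(a <= X < b)":  (sum(f(x) for x in range(a, b)),         F(b - 1) - F(a - 1)),
    "P(a < X <= b)":  (sum(f(x) for x in range(a + 1, b + 1)), F(b) - F(a)),
    "P(a <= X <= b)": (sum(f(x) for x in range(a, b + 1)),     F(b) - F(a - 1)),
}
for label, (direct, via_cdf) in checks.items():
    print(f"{label:15s} direct={direct:.6f}  via CDF={via_cdf:.6f}")
```

The two columns should agree to rounding error for every row.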

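Quick review of concepts for discrete random variables - 2

Let X be a discrete random variable, then f(x) = P(X = x) is the probability mass function (pmf)

Calculation of measures:

Mean: $\mu = \sum_{i=1}^{n} x_i f(x_i)$

Variance: $\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 f(x_i)$

Skewness: $\gamma_3 = \frac{1}{\sigma^3} \sum_{i=1}^{n} (x_i - \mu)^3 f(x_i)$

Kurtosis: $\gamma_4 = \frac{1}{\sigma^4} \sum_{i=1}^{n} (x_i - \mu)^4 f(x_i)$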
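As a rough illustration of these measures, here is a minimal Python sketch (standard library only, not Maple). It assumes the skewness and kurtosis are the standardized third and fourth central moments, which matches the Bernoulli and Poisson values quoted later in the lecture. The function name `measures` and the fair-die example are hypothetical.

```python
from math import sqrt

def measures(xs, fs):
    """Return (mean, variance, skewness, kurtosis) for a pmf given as
    values xs with probabilities fs."""
    mu = sum(x * f for x, f in zip(xs, fs))
    var = sum((x - mu) ** 2 * f for x, f in zip(xs, fs))
    sigma = sqrt(var)
    skew = sum((x - mu) ** 3 * f for x, f in zip(xs, fs)) / sigma ** 3
    kurt = sum((x - mu) ** 4 * f for x, f in zip(xs, fs)) / sigma ** 4
    return mu, var, skew, kurt

# Hypothetical example: a fair six-sided die.
xs = [1, 2, 3, 4, 5, 6]
fs = [1 / 6] * 6
mu, var, skew, kurt = measures(xs, fs)
print(mu, var, skew, kurt)
```

For the symmetric die the skewness comes out as zero up to rounding, which is a quick sanity check on the third-moment formula.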

Discrete distributions in Maple

Use the command ?Statistics,Distributions for a list of available distributions

Discrete distributions of interest are:

Bernoulli              Bernoulli distribution
Binomial               binomial distribution
DiscreteUniform        discrete uniform distribution
EmpiricalDistribution  empirical distribution
Geometric              geometric distribution
Hypergeometric         hypergeometric distribution
NegativeBinomial       negative binomial (Pascal) distribution
Poisson                Poisson distribution
ProbabilityTable       probability table

Using the Maple Statistics package to define a discrete random variable

To load the Statistics package use: with(Statistics)

Use ? for help, e.g., ?Geometric

Define a random variable with a distribution name and appropriate parameters with function RandomVariable

e.g., X := RandomVariable(Binomial(n,p))
e.g., X := RandomVariable(Poisson(3.2))


Calculating measures of a distribution - 1

After defining a random variable X in Maple, you can calculate the following measures:

μ := Mean(X)
σ² := Variance(X)
σ := StandardDeviation(X)
γ₃ := Skewness(X)
γ₄ := Kurtosis(X)

Calculating measures of a distribution - 2

To obtain floating-point (decimal) results for the measures of a distribution you may use:

μ := evalf(Mean(X))
σ² := evalf(Variance(X))
σ := evalf(StandardDeviation(X))
γ₃ := evalf(Skewness(X))
γ₄ := evalf(Kurtosis(X))

Calculating probabilities - 1

To calculate probabilities use the following basic functions:

ProbabilityFunction(X,a) for the pmf, i.e., f(a) = P(X = a)

CDF(X,a) for the CDF, i.e., F(a) = P(X ≤ a)

Calculating probabilities - 2

To calculate more complex probabilities use function CDF as follows:

P(X < x) = F(x-1) => use CDF(X,x-1)
P(X > x) = 1-F(x) => use 1-CDF(X,x)
P(X ≥ x) = 1-F(x-1) => use 1-CDF(X,x-1)
P(a < X < b) = F(b-1)-F(a) => use CDF(X,b-1)-CDF(X,a)
P(a ≤ X < b) = F(b-1)-F(a-1) => use CDF(X,b-1)-CDF(X,a-1)
P(a < X ≤ b) = F(b)-F(a) => use CDF(X,b)-CDF(X,a)
P(a ≤ X ≤ b) = F(b)-F(a-1) => use CDF(X,b)-CDF(X,a-1)

The Bernoulli distribution

Random variable X can take only the values x = 0 and x = 1

Probability mass function: $f(x) = p^x (1-p)^{1-x}$ for x = 0, 1, with 0 < p < 1

Possible association of the values of x:

Variable X        X=0 equivalent    X=1 equivalent
Binary logical    No                Yes
Voltage level     Low voltage       High voltage
Success/failure   Failure           Success

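Measures of the Bernoulli distribution

Mean: $\mu = p$

Variance: $\sigma^2 = p(1-p)$

Standard deviation: $\sigma = \sqrt{p(1-p)}$

Skewness: $\gamma_3 = \frac{1-2p}{\sqrt{p(1-p)}}$

Kurtosis: $\gamma_4 = \frac{1-3p+3p^2}{p(1-p)}$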


The binomial distribution: X~B(n,p)

Consider n repetitions of a Bernoulli process with parameter p

Let X = number of successes in n repetitions

Probability mass function: $f(x) = \binom{n}{x} p^x (1-p)^{n-x}$, for x = 0, 1, ..., n

Binomial coefficient: $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$

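Measures of the binomial distribution

Mean: $\mu = np$

Variance: $\sigma^2 = np(1-p)$

Standard deviation: $\sigma = \sqrt{np(1-p)}$

Skewness: $\gamma_3 = \frac{1-2p}{\sqrt{np(1-p)}}$

Kurtosis: $\gamma_4$ = long expression, see worksheet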
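The closed-form binomial measures can be compared against direct sums over the pmf. This is a minimal Python sketch (standard library, not Maple); the parameter values n = 12 and p = 0.25 are hypothetical, and the skewness is taken as the standardized third central moment.

```python
from math import comb, sqrt

n, p = 12, 0.25  # hypothetical parameters, for illustration only

def f(x):
    """Binomial pmf: C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

xs = range(n + 1)
mu = sum(x * f(x) for x in xs)
var = sum((x - mu)**2 * f(x) for x in xs)
skew = sum((x - mu)**3 * f(x) for x in xs) / var**1.5

# Each line prints the summed value next to the closed-form expression.
print(mu, n * p)
print(var, n * p * (1 - p))
print(skew, (1 - 2 * p) / sqrt(n * p * (1 - p)))
```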

Approximating the binomial distribution with the normal distribution, X~N(μ,σ)

Applies for relatively large values of n and relatively small values of p so that np ≥ 5 or n(1-p) ≥ 5

Use X := RandomVariable(Normal(μ,σ)) to define a normal random variable (continuous)

Main reason for the approximation: to avoid calculating large factorial values. No longer an impediment with modern calculators and software
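To get a feel for how close the normal approximation is, the following minimal Python sketch (standard library, not Maple) compares an exact binomial CDF value with the normal CDF. The values n = 50, p = 0.4 and k = 22 are hypothetical. The normal parameters are taken as the binomial mean and standard deviation, and a continuity correction of 0.5 is applied; that correction is an assumption added here and is not mentioned in the lecture.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.4  # hypothetical parameters; n*p = 20 and n*(1-p) = 30
mu = n * p
sigma = sqrt(n * p * (1 - p))

def binom_cdf(k):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

k = 22
exact = binom_cdf(k)
# Assumption: continuity correction of 0.5 (not part of the lecture text)
approx = NormalDist(mu, sigma).cdf(k + 0.5)
print(f"exact P(X <= {k}) = {exact:.4f}, normal approximation = {approx:.4f}")
```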

The Poisson distribution

Used to define the discrete random variable X = number of occurrences of a certain phenomenon per unit time, unit length, etc.

Probability mass function: $f(x) = \frac{e^{-\lambda} \lambda^x}{x!}$, for x = 0, 1, ...

Parameter λ represents the average number of occurrences per unit time, length, etc.

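Measures of the Poisson distribution

Mean: $\mu = \lambda$

Variance: $\sigma^2 = \lambda$

Standard deviation: $\sigma = \sqrt{\lambda}$

Skewness: $\gamma_3 = \frac{1}{\sqrt{\lambda}}$

Kurtosis: $\gamma_4 = 3 + \frac{1}{\lambda}$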

Poisson distribution with scaling

Let X = number of occurrences of a phenomenon, say, per unit time

Let ν = average number of occurrences per unit time

Let T = period of interest for the analysis

Use λ = νT as the parameter in the Poisson distribution

See example of scaling in worksheet
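The scaling step amounts to multiplying the average rate by the length of the period before evaluating the Poisson pmf. A minimal Python sketch (standard library, not Maple, and not the worksheet's example) follows; the rate of 2.5 occurrences per hour and the 4-hour period are hypothetical numbers.

```python
from math import exp, factorial

rate = 2.5      # hypothetical: average occurrences per hour
T = 4.0         # hypothetical: period of interest, in hours
lam = rate * T  # scaled Poisson parameter for the whole period

def poisson_pmf(x, lam):
    """f(x) = exp(-lam) lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """F(x) = P(X <= x)"""
    return sum(poisson_pmf(u, lam) for u in range(x + 1))

print(lam)                       # parameter for the 4-hour period
print(poisson_pmf(8, lam))       # P(exactly 8 occurrences in the period)
print(1 - poisson_cdf(12, lam))  # P(more than 12 occurrences in the period)
```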


Approximating the binomial distribution with the Poisson distribution

Applies for np ≤ 5 or n(1-p) ≤ 5

Use X := RandomVariable(Poisson(λ)) to define a Poisson random variable (discrete)

Main reason for the approximation: to avoid calculating large factorial values. No longer an impediment with modern calculators and software

Read details in worksheet
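A quick numerical comparison shows how the two pmfs line up when p is small. This minimal Python sketch (standard library, not Maple) assumes the Poisson parameter is chosen to match the binomial mean, λ = np; the values n = 200 and p = 0.01 are hypothetical.

```python
from math import comb, exp, factorial

n, p = 200, 0.01   # hypothetical values with n*p = 2
lam = n * p        # assumption: match the means, lambda = n*p

def binom_pmf(x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x):
    return exp(-lam) * lam**x / factorial(x)

# Side-by-side pmf values for the first few counts
for x in range(7):
    print(x, round(binom_pmf(x), 5), round(poisson_pmf(x), 5))

# Largest pointwise difference over x = 0..20 (the tail beyond is negligible here)
max_diff = max(abs(binom_pmf(x) - poisson_pmf(x)) for x in range(21))
print(max_diff)
```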

Approximating the Poisson distribution with the normal distribution

Similar to the approximation of the binomial distribution with the normal distribution

Main reason for the approximation: to avoid calculating exponential functions in the Poisson distribution. No longer an impediment with modern calculators and software

Read details in worksheet

The geometric distribution

Consider several repetitions of a Bernoulli process with parameter p

Let X = number of repetitions required for the first success

Probability mass function: $f(x) = (1-p)^{x-1} p$, for x = 1, 2, ...

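Measures of the geometric distribution

Mean: $\mu = \frac{1}{p}$ for X counting the repetitions up to and including the first success (x = 1, 2, ...); the value $\frac{1-p}{p}$ applies when X counts only the failures before the first success (x = 0, 1, ...)

Variance: $\sigma^2 = \frac{1-p}{p^2}$

Standard deviation: $\sigma = \frac{\sqrt{1-p}}{p}$

Skewness: $\gamma_3 = \frac{2-p}{\sqrt{1-p}}$

Kurtosis: $\gamma_4 = \frac{p^2-9p+9}{1-p}$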
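The two conventions for the geometric mean can be told apart numerically. The following minimal Python sketch (standard library, not Maple) sums the pmf for the number of repetitions, x = 1, 2, ..., over a truncated support; p = 0.2 is a hypothetical value. Shifting by one (counting failures instead of repetitions) changes the mean but not the variance.

```python
p = 0.2  # hypothetical success probability

def f(x):
    """Geometric pmf for the number of repetitions x = 1, 2, ..."""
    return (1 - p) ** (x - 1) * p

xs = range(1, 2001)  # truncated support; the neglected tail is negligible here
total = sum(f(x) for x in xs)
mean_trials = sum(x * f(x) for x in xs)               # compare with 1/p
var = sum((x - mean_trials) ** 2 * f(x) for x in xs)  # compare with (1-p)/p^2
mean_failures = sum((x - 1) * f(x) for x in xs)       # compare with (1-p)/p

print(total, mean_trials, var, mean_failures)
```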

Period of return - 1

Let X_i = maximum value of an event in period i, independent random variables

Let q = P(X_i ≥ x) = probability of exceedance of value x in period i, thus q + p = 1, q = 1 - p

Let T = number of periods that pass before the value x is exceeded

P(T = t) = P(X_1 ...


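The hypergeometric distribution

Consider the figure: a finite population of size N with a objects of a given type

Draw a sample of size n

Let X = number of objects of that type in the sample

Probability mass function: $f(x) = \frac{\binom{a}{x} \binom{N-a}{n-x}}{\binom{N}{n}}$

Mean: $\mu = \frac{n a}{N}$

Variance: $\sigma^2 = \frac{n a (N-a)(N-n)}{N^2 (N-1)}$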
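The hypergeometric pmf is built from three binomial coefficients, so it is easy to evaluate directly. Here is a minimal Python sketch (standard library, not Maple) that also compares the summed mean and variance with the closed-form expressions; the values N = 20, a = 7 and n = 5 are hypothetical.

```python
from math import comb

N, a, n = 20, 7, 5  # hypothetical population size, objects of the type, sample size

def f(x):
    """Hypergeometric pmf: C(a, x) C(N-a, n-x) / C(N, n)."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

xs = range(0, min(n, a) + 1)  # support for these values (n <= N - a)
mu = sum(x * f(x) for x in xs)
var = sum((x - mu) ** 2 * f(x) for x in xs)

print(mu, n * a / N)
print(var, n * a * (N - a) * (N - n) / (N**2 * (N - 1)))
```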

The discrete uniform distribution

Let X = random variable taking the values x = a, a+1, ..., b, each value with equal probability

The probability mass function is $f(x) = \frac{1}{b-a+1}$, for x = a, a+1, ..., b

Mean: μ = (a+b)/2

Variance: σ² = (a-b)(a-b-2)/12

Inverse cumulative distribution function

The CDF of a random variable X is defined as F(x) = P(X ≤ x).

Given a probability p = F(x), the value of x is defined as $x = F^{-1}(p)$

$F^{-1}$ is the inverse cumulative distribution function (ICDF) of X

Example - ICDF for the exponential distribution (continuous variable case)

The probability density function (pdf) for this case is given by $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$

The corresponding cumulative distribution function (CDF) is $F(x) = 1 - e^{-\lambda x}$

For p = F(x), the ICDF is given by $F^{-1}(p) = -\ln(1-p)/\lambda$
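The exponential ICDF can be verified by a round trip: applying the CDF to the ICDF of p should give back p. A minimal Python sketch (standard library, not Maple) follows; the rate value 0.5 and the function names `F` and `F_inv` are hypothetical.

```python
from math import exp, log

lam = 0.5  # hypothetical rate parameter

def F(x):
    """Exponential CDF: 1 - exp(-lam x), for x >= 0."""
    return 1 - exp(-lam * x)

def F_inv(p):
    """Exponential ICDF: -ln(1 - p) / lam, for 0 <= p < 1."""
    return -log(1 - p) / lam

for p in (0.1, 0.5, 0.9):
    x = F_inv(p)
    print(p, x, F(x))  # the last column should reproduce p
```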

ICDF and Maple function Quantile

For a discrete random variable X the p quantile is defined by

$Q(p) = \inf\{x \mid F(x) \ge p\}$

i.e., the smallest value of x such that F(x) is greater than or equal to p. This is calculated using Maple's function Quantile(X,p)

If X takes only integer values, the ICDF for X is calculated using Maple's function Quantile as

$F^{-1}(p)$ = Quantile(X,p) - 1
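The definition of the p quantile as the smallest x with F(x) at least p can be sketched as a simple search. This is a minimal Python illustration of that definition only (standard library); it is not Maple's Quantile and makes no claim about what Maple returns. The Poisson distribution with parameter 3 and the function name `quantile` are hypothetical choices.

```python
from math import exp, factorial

lam = 3.0  # hypothetical Poisson parameter

def F(x):
    """Poisson CDF, F(x) = P(X <= x); an empty sum gives F(-1) = 0."""
    return sum(exp(-lam) * lam**u / factorial(u) for u in range(x + 1))

def quantile(p):
    """Smallest integer x >= 0 with F(x) >= p, for 0 < p < 1
    (p is assumed not to be extremely close to 1)."""
    x = 0
    while F(x) < p:
        x += 1
    return x

q = quantile(0.5)
print(q, F(q - 1), F(q))  # F(q-1) < 0.5 <= F(q)
```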

Fitting a distribution to a sample

$X_s = \{x_1, x_2, \ldots, x_{n_s}\}$, numerical sample of size $n_s$.

Mean of the sample: $x_{mean} = \frac{1}{n_s} \sum_{i=1}^{n_s} x_i$

Variance of the sample: $s^2 = \frac{1}{n_s - 1} \sum_{i=1}^{n_s} (x_i - x_{mean})^2$

Select a distribution, make $\mu = x_{mean}$ and $\sigma^2 = s^2$, and solve for the parameters of the distribution
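As an illustration of matching the sample mean and variance to a distribution's measures, the following minimal Python sketch (standard library, not Maple) fits a binomial distribution. The sample values are hypothetical. For the binomial, mean = np and variance = np(1-p), so the ratio variance/mean gives 1-p, and n follows from the mean.

```python
from statistics import mean, variance

# Hypothetical sample of counts, for illustration only.
xs = [3, 5, 4, 6, 2, 4, 5, 3, 4, 4]

x_mean = mean(xs)   # sample mean
s2 = variance(xs)   # sample variance, divides by (ns - 1)

# Binomial case: mu = n p and sigma^2 = n p (1 - p), so sigma^2 / mu = 1 - p.
p_hat = 1 - s2 / x_mean
n_hat = x_mean / p_hat

print(x_mean, s2, p_hat, n_hat)
```

In practice n must be an integer, so the fitted value would be rounded; this approach also only makes sense for a binomial fit when the sample variance is smaller than the sample mean.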


Random numbers

Numbers generated by random processes, e.g., numbers out of a roulette, or lottery

Computers use deterministic algorithms that produce pseudo-random numbers

Use Maple function Sample(X,ns), within package Statistics, to produce a sample (vector) of size ns for the random variable X, e.g., Xs := Sample(X,ns)

To convert from a vector to a list, use: convert(Xs,list)
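The deterministic nature of pseudo-random numbers is easy to see by fixing the generator's seed. The following minimal Python sketch (standard library; it is not Maple's Sample) draws a binomial sample by counting successes in simulated Bernoulli trials. The function name `sample_binomial`, the parameters and the seed value are hypothetical.

```python
import random

def sample_binomial(n, p, ns, seed=None):
    """Draw ns pseudo-random values from Binomial(n, p) by counting
    successes in n simulated Bernoulli trials."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) for _ in range(ns)]

xs = sample_binomial(n=10, p=0.3, ns=8, seed=2006)
print(xs)

# The same seed reproduces the same "random" sample.
print(sample_binomial(n=10, p=0.3, ns=8, seed=2006) == xs)
```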

Statistical simulation or Monte-Carlo simulation

Generating synthetic data out of a given distribution to use as input for a model

Example 1 - generating precipitation data for a hydrological model

Example 2 - generating hydraulic conductivity data for an aquifer in groundwater simulation

Example 3 - generating traffic data for a highway operation simulation
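A toy version of this workflow can be sketched in Python (standard library, not Maple). Everything specific here is an assumption for illustration: the synthetic data are Poisson counts with a hypothetical mean of 3 storms per year, generated by inverse-transform sampling, and the "model" is just the fraction of simulated years with more than 5 storms.

```python
import random
from math import exp

def sample_poisson(lam, rng):
    """Inverse-transform sampling: return the smallest x with F(x) >= u,
    where u is a uniform pseudo-random number in [0, 1)."""
    u = rng.random()
    x = 0
    term = exp(-lam)   # f(0)
    cdf = term         # F(0)
    # The term > 0 guard stops the loop if rounding keeps cdf just below u.
    while cdf < u and term > 0:
        x += 1
        term *= lam / x   # f(x) = f(x-1) * lam / x
        cdf += term
    return x

rng = random.Random(42)   # fixed seed so the run is repeatable
lam = 3.0                 # hypothetical mean number of storms per year
years = [sample_poisson(lam, rng) for _ in range(5000)]

# Stand-in "model": fraction of simulated years with more than 5 storms.
frac = sum(y > 5 for y in years) / len(years)
print(sum(years) / len(years), frac)
```

The sample mean should land close to the chosen parameter, and the simulated fraction can be compared with the exact Poisson tail probability as a check on the generator.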
