Discrete Probability Distributions Lecture


  • 8/2/2019 Discrete Probability Distributions Lecture

    Discrete Probability Distributions

    A CEE3030 lecture prepared by

    Gilberto E. Urroz

    February 2006

    Reference

    The subjects presented are taken from the Maple worksheet entitled DiscreteProbabilityDistributions, available for download in the class schedule

    Quick review of concepts for discrete random variables - 1

    Let X be a discrete random variable; then f(x) = P(X = x) is the probability mass function (pmf), and the cumulative distribution function (CDF) is

    $$F(x) = P(X \le x) = \sum_{u \le x} f(u)$$

    Calculation of probabilities (X taking integer values):

    P(X < x) = F(x-1) -- P(X ≤ x) = F(x)

    P(X > x) = 1-F(x) -- P(X ≥ x) = 1-F(x-1)

    P(a < X < b) = F(b-1)-F(a)

    P(a ≤ X < b) = F(b-1)-F(a-1)

    P(a < X ≤ b) = F(b)-F(a)

    P(a ≤ X ≤ b) = F(b)-F(a-1)

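    Quick review of concepts for discrete random variables - 2

    Let X be a discrete random variable; then f(x) = P(X = x) is the probability mass function (pmf)

    Calculation of measures:

    Mean: $\mu = \sum_{i=1}^{n} x_i f(x_i)$

    Variance: $\sigma^2 = \sum_{i=1}^{n} (x_i-\mu)^2 f(x_i)$

    Skewness: $\gamma_3 = \frac{1}{\sigma^3} \sum_{i=1}^{n} (x_i-\mu)^3 f(x_i)$

    Kurtosis: $\gamma_4 = \frac{1}{\sigma^4} \sum_{i=1}^{n} (x_i-\mu)^4 f(x_i)$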
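The four measures are probability-weighted sums over the values of X, so they can be checked by hand or with a few lines of code. The following is a minimal Python sketch (the lecture itself works in Maple; Python is used here only for illustration). The function name `measures` and the fair-die example are assumptions, not part of the lecture.

```python
from math import sqrt

def measures(values, probs):
    """Mean, variance, skewness and kurtosis of a discrete distribution.

    values, probs: the x_i and the corresponding f(x_i); probs must sum to 1.
    """
    pairs = list(zip(values, probs))
    mean = sum(x * f for x, f in pairs)
    var = sum((x - mean) ** 2 * f for x, f in pairs)
    sigma = sqrt(var)
    skew = sum((x - mean) ** 3 * f for x, f in pairs) / sigma ** 3
    kurt = sum((x - mean) ** 4 * f for x, f in pairs) / sigma ** 4
    return mean, var, skew, kurt

# Hypothetical example: a fair six-sided die, f(x) = 1/6 for x = 1..6
mean, var, skew, kurt = measures(range(1, 7), [1 / 6] * 6)
print(mean, var, skew, kurt)
```

Note that the kurtosis computed this way is the plain fourth standardized moment (no subtraction of 3).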

    Discrete distributions in Maple

    Use the command ?Statistics,Distributions for a list of available distributions

    Discrete distributions of interest are:

    Bernoulli               Bernoulli distribution
    Binomial                binomial distribution
    DiscreteUniform         discrete uniform distribution
    EmpiricalDistribution   empirical distribution
    Geometric               geometric distribution
    Hypergeometric          hypergeometric distribution
    NegativeBinomial        negative binomial (Pascal) dist.
    Poisson                 Poisson distribution
    ProbabilityTable        probability table

    Using the Maple Statistics package to define a discrete random variable

    To load the Statistics package use: with(Statistics)

    Use ? for help, e.g., ?Geometric

    Define a random variable with a distribution name and appropriate parameters using the function RandomVariable

    e.g., X := RandomVariable(Binomial(n,p))

    e.g., X := RandomVariable(Poisson(3.2))

    Calculating measures of a distribution - 1

    After defining a random variable X in Maple, you can calculate the following measures:

    μ := Mean(X)
    σ² := Variance(X)
    σ := StandardDeviation(X)
    γ3 := Skewness(X)
    γ4 := Kurtosis(X)

    Calculating measures of a distribution - 2

    To obtain floating-point (decimal) results for the measures of a distribution you may use:

    μ := evalf(Mean(X))
    σ² := evalf(Variance(X))
    σ := evalf(StandardDeviation(X))
    γ3 := evalf(Skewness(X))
    γ4 := evalf(Kurtosis(X))

    Calculating probabilities - 1

    To calculate probabilities use the following basic functions:

    ProbabilityFunction(X,a) for the pmf, i.e., f(a) = P(X = a)

    CDF(X,a) for the CDF, i.e., F(a) = P(X ≤ a)

    Calculating probabilities - 2

    To calculate more complex probabilities use function CDF as follows:

    P(X < x) = F(x-1) => use CDF(X,x-1)

    P(X > x) = 1-F(x) => use 1-CDF(X,x)

    P(X ≥ x) = 1-F(x-1) => use 1-CDF(X,x-1)

    P(a < X < b) = F(b-1)-F(a) => use CDF(X,b-1)-CDF(X,a)

    P(a ≤ X < b) = F(b-1)-F(a-1) => use CDF(X,b-1)-CDF(X,a-1)

    P(a < X ≤ b) = F(b)-F(a) => use CDF(X,b)-CDF(X,a)

    P(a ≤ X ≤ b) = F(b)-F(a-1) => use CDF(X,b)-CDF(X,a-1)
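The CDF rules for an integer-valued X can be verified by comparing each of them with a direct sum of the pmf. The Python sketch below does this for a binomial variable; it is an illustration only, not the Maple workflow of the lecture, and the parameters n = 10, p = 0.3 and the limits a = 2, b = 6 are hypothetical.

```python
from math import comb

def pmf(x, n, p):
    """Binomial pmf f(x) = P(X = x); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def cdf(x, n, p):
    """F(x) = P(X <= x), built by summing the pmf up to x."""
    return sum(pmf(k, n, p) for k in range(0, x + 1))

n, p = 10, 0.3   # hypothetical parameters
a, b = 2, 6      # hypothetical interval limits

def F(x):
    return cdf(x, n, p)

def direct(lo, hi):
    """Sum the pmf over lo..hi inclusive, for comparison."""
    return sum(pmf(k, n, p) for k in range(lo, hi + 1))

# Each entry: (value from the CDF rule, value from direct summation)
rules = {
    "P(X < b)":       (F(b - 1),            direct(0, b - 1)),
    "P(X > a)":       (1 - F(a),            direct(a + 1, n)),
    "P(X >= a)":      (1 - F(a - 1),        direct(a, n)),
    "P(a < X < b)":   (F(b - 1) - F(a),     direct(a + 1, b - 1)),
    "P(a <= X < b)":  (F(b - 1) - F(a - 1), direct(a, b - 1)),
    "P(a < X <= b)":  (F(b) - F(a),         direct(a + 1, b)),
    "P(a <= X <= b)": (F(b) - F(a - 1),     direct(a, b)),
}
for name, (via_cdf, by_sum) in rules.items():
    print(f"{name:15s} {via_cdf:.6f} {by_sum:.6f}")
```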

    The Bernoulli distribution

    Random variable X can take only the values x = 0 and x = 1

    Probability mass function: $f(x) = p^x (1-p)^{1-x}$, for x = 0, 1, with 0 < p < 1

    Possible association of the values of x:

    Variable X         X=0 equivalent    X=1 equivalent
    Binary logical     No                Yes
    Voltage level      Low voltage       High voltage
    Success/failure    Failure           Success

    Measures of the Bernoulli distribution

    $\mu = p$

    $\sigma^2 = p(1-p)$

    $\sigma = \sqrt{p(1-p)}$

    $\gamma_3 = \frac{1-2p}{\sqrt{p(1-p)}}$

    $\gamma_4 = \frac{1-3p+3p^2}{p(1-p)}$

    The binomial distribution: X~B(n,p)

    Consider n repetitions of a Bernoulli process with parameter p

    Let X = number of successes in n repetitions

    Probability mass function:

    $$f(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n$$

    Binomial coefficient:

    $$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$

    Measures of the binomial distribution

    $\mu = np$

    $\sigma^2 = np(1-p)$

    $\sigma = \sqrt{np(1-p)}$

    $\gamma_3 = \frac{1-2p}{\sqrt{np(1-p)}}$

    $\gamma_4$ = long expression, see worksheet

    Approximating the binomial distribution with the normal distribution, X~N(μ,σ)

    Applies for relatively large values of n and relatively small values of p so that np ≥ 5 and n(1-p) ≥ 5

    Use X := RandomVariable(Normal(μ,σ)), with μ = np and σ = sqrt(np(1-p)), to define a normal random variable (continuous)

    Main reason for the approximation: to avoid calculating large factorial values

    No longer an impediment with modern calculators and software
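To get a feel for the quality of the normal approximation, the exact binomial probability P(X ≤ k) can be compared with the normal CDF. This is a minimal Python sketch, not the lecture's Maple procedure. The values n = 50, p = 0.4, k = 22 are hypothetical, and the continuity correction (evaluating the normal CDF at k + 0.5) is an assumption added here; the slides do not mention it.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.4          # hypothetical values; np = 20 and n(1-p) = 30
k = 22                  # we want P(X <= k)

# Exact binomial probability by summing the pmf
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

# Normal approximation with mu = np and sigma = sqrt(np(1-p))
mu = n * p
sigma = sqrt(n * p * (1 - p))
# Continuity correction (k + 0.5): an assumption, not stated in the lecture
approx = NormalDist(mu, sigma).cdf(k + 0.5)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```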

    The Poisson distribution

    Used to define a discrete random variable X = number of occurrences of a certain phenomenon per unit time, unit length, etc.

    Probability mass function:

    $$f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, \ldots$$

    Parameter λ represents the average number of occurrences per unit time, length, etc.

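    Measures of the Poisson distribution

    $\mu = \lambda$

    $\sigma^2 = \lambda$

    $\sigma = \sqrt{\lambda}$

    $\gamma_3 = \frac{1}{\sqrt{\lambda}}$

    $\gamma_4 = 3 + \frac{1}{\lambda}$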

    Poisson distribution with scaling

    Let X = number of occurrences of a phenomenon, say, per unit time

    Let λ = average number of occurrences per unit time

    Let T = period of interest for the analysis

    Use λT as the parameter in the Poisson distribution

    See example of scaling in worksheet
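Scaling amounts to multiplying the average rate by the length of the period before evaluating the Poisson pmf. The Python sketch below is only an illustration (the worksheet example is in Maple and is not reproduced here); the rate of 2 occurrences per hour and the 3-hour period are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson pmf f(x) = exp(-lam) * lam**x / x!"""
    return exp(-lam) * lam**x / factorial(x)

rate = 2.0          # hypothetical: average of 2 occurrences per hour
T = 3.0             # hypothetical: period of interest, 3 hours
lam_T = rate * T    # scaled parameter for the whole period

p_none = poisson_pmf(0, lam_T)                               # P(X = 0) during T
p_at_most_4 = sum(poisson_pmf(x, lam_T) for x in range(5))   # P(X <= 4) during T
print(p_none, p_at_most_4)
```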

    Approximating the binomial distribution with the Poisson distribution

    Applies for np ≤ 5 or n(1-p) ≤ 5

    Use X := RandomVariable(Poisson(λ)), with λ = np, to define a Poisson random variable (discrete)

    Main reason for the approximation: to avoid calculating large factorial values

    No longer an impediment with modern calculators and software

    Read details in worksheet
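The closeness of the two pmfs can be checked numerically for a large n and a small p. This Python sketch is an illustration only; the values n = 1000 and p = 0.002 (so that np = 2) are hypothetical and are not taken from the worksheet.

```python
from math import comb, exp, factorial

n, p = 1000, 0.002      # hypothetical values: large n, small p
lam = n * p             # Poisson parameter, here 2.0

def binomial_pmf(x):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x):
    return exp(-lam) * lam**x / factorial(x)

# Largest pointwise difference between the two pmfs for x = 0..10
max_diff = max(abs(binomial_pmf(x) - poisson_pmf(x)) for x in range(11))

for x in range(5):
    print(x, round(binomial_pmf(x), 5), round(poisson_pmf(x), 5))
print("largest difference for x = 0..10:", max_diff)
```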

    Approximating the Poisson distribution with the normal distribution

    Similar to the approximation of the binomial distribution with the normal distribution

    Main reason for the approximation: to avoid calculating exponential functions in the Poisson distribution

    No longer an impediment with modern calculators and software

    Read details in worksheet

    The geometric distribution

    Consider several repetitions of a Bernoulli process with parameter p

    Let X = number of repetitions required for the first success

    Probability mass function:

    $$f(x) = (1-p)^{x-1}\, p, \quad x = 1, 2, \ldots$$

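    Measures of the geometric distribution

    $\mu = \frac{1}{p}$ for X as defined above (x = 1, 2, ...); the value $\frac{1-p}{p}$ applies when X counts only the failures before the first success (x = 0, 1, ...)

    $\sigma^2 = \frac{1-p}{p^2}$

    $\sigma = \frac{\sqrt{1-p}}{p}$

    $\gamma_3 = \frac{2-p}{\sqrt{1-p}}$

    $\gamma_4 = \frac{p^2-9p+9}{1-p}$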
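The mean and variance of the geometric distribution can be checked by summing the pmf f(x) = (1-p)^(x-1) p numerically. The Python sketch below is an illustration only; p = 0.25 and the truncation of the support at x = 2000 are assumptions made for the example.

```python
def geometric_pmf(x, p):
    """f(x) = (1-p)**(x-1) * p, for x = 1, 2, ... (trials up to the first success)."""
    return (1 - p) ** (x - 1) * p

p = 0.25                 # hypothetical success probability
xs = range(1, 2001)      # truncated support; the neglected tail is negligible here

total = sum(geometric_pmf(x, p) for x in xs)
mean = sum(x * geometric_pmf(x, p) for x in xs)
var = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)

print(total, mean, var)   # compare with 1, 1/p and (1-p)/p**2
print(mean - 1)           # mean number of failures; compare with (1-p)/p
```

The last line shows how the two conventions differ: counting trials gives a mean of 1/p, while counting only the failures before the first success shifts the mean down by one and leaves the variance unchanged.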

    Period of return - 1

    Let X_i = maximum value of an event in period i, independent random variables

    Let q = P(X_i ≥ x) = probability of exceedance of value x in period i, thus q+p = 1, q = 1-p

    Let T = number of periods passed before the value of x is exceeded

    P(T=t) = P(X_1 ...

    The hypergeometric distribution

    Consider figure

    Finite population of size N with a objects of a type

    Draw a sample of size n

    Let X = number of objects of the type in the sample

    Probability mass function:

    $$f(x) = \frac{\binom{a}{x} \binom{N-a}{n-x}}{\binom{N}{n}}$$

    Mean: $\mu = \frac{na}{N}$

    Variance: $\sigma^2 = \frac{n\,a\,(N-a)(N-n)}{N^2 (N-1)}$
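The hypergeometric mean and variance formulas can be confirmed exactly for a small case by summing the pmf with rational arithmetic. This Python sketch is an illustration only; the values N = 20, a = 7, n = 5 are hypothetical.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, a, n):
    """f(x) = C(a,x) C(N-a,n-x) / C(N,n), as an exact fraction."""
    return Fraction(comb(a, x) * comb(N - a, n - x), comb(N, n))

N, a, n = 20, 7, 5                 # hypothetical population, marked objects, sample size
xs = range(0, min(a, n) + 1)       # possible counts in the sample (valid here since n <= N - a)

total = sum(hypergeom_pmf(x, N, a, n) for x in xs)
mean = sum(x * hypergeom_pmf(x, N, a, n) for x in xs)
var = sum((x - mean) ** 2 * hypergeom_pmf(x, N, a, n) for x in xs)

# Closed-form expressions for comparison
mean_formula = Fraction(n * a, N)
var_formula = Fraction(n * a * (N - a) * (N - n), N**2 * (N - 1))

print(total, mean, var)
print(mean_formula, var_formula)
```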

    The discrete uniform distribution

    Let X = random variable taking the values x = a, a+1, ..., b, each value with equal probability

    The probability mass function is

    $$f(x) = \frac{1}{b-a+1}, \quad x = a, a+1, \ldots, b$$

    Mean: μ = (a+b)/2

    Variance: σ² = (a-b)(a-b-2)/12

    Inverse cumulative distribution function

    The CDF of a random variable X is defined as F(x) = P(X ≤ x).

    Given a probability p = F(x), the value of x is defined as

    $x = F^{-1}(p)$

    $F^{-1}$ is the inverse cumulative distribution function (ICDF) of X

    Example - ICDF for the exponential distribution (continuous variable case)

    The probability density function (pdf) for this case is given by

    $f(x) = \lambda e^{-\lambda x}$, x ≥ 0

    The corresponding cumulative distribution function (CDF) is

    $F(x) = 1 - e^{-\lambda x}$

    For p = F(x), the ICDF is given by

    $F^{-1}(p) = -\ln(1-p)/\lambda$
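A quick way to check the exponential ICDF is to apply the CDF to its result and confirm that the original probability comes back. The Python sketch below is an illustration only; the rate 0.5 is a hypothetical value and the function names are not from the lecture.

```python
from math import exp, log

def exp_cdf(x, lam):
    """F(x) = 1 - exp(-lam * x), for x >= 0."""
    return 1 - exp(-lam * x)

def exp_icdf(p, lam):
    """F^{-1}(p) = -ln(1 - p) / lam, for 0 <= p < 1."""
    return -log(1 - p) / lam

lam = 0.5                          # hypothetical rate parameter
median = exp_icdf(0.5, lam)        # value of x with F(x) = 0.5
round_trip = exp_cdf(median, lam)  # should recover 0.5
print(median, round_trip)
```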

    ICDF and Maple function Quantile

    For a discrete random variable X the p quantile is defined by

    Q(p) = inf{x | F(x) ≥ p}

    i.e., the smallest value of x such that F(x) is larger than or equal to p. This is calculated using Maple's function Quantile(X,p)

    If X takes only integer values, the ICDF for X is calculated using Maple's function Quantile as

    $F^{-1}(p)$ = Quantile(X,p) - 1
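The definition Q(p) = inf{x | F(x) ≥ p} can be implemented directly by accumulating the pmf until the running total reaches p. The Python sketch below follows that mathematical definition only; it makes no claim about how Maple's Quantile behaves and does not reproduce the offset of 1 mentioned in the lecture. The function name and the fair-die example are assumptions.

```python
from fractions import Fraction

def quantile(pmf, p):
    """Smallest x with F(x) >= p, i.e. Q(p) = inf{x | F(x) >= p}.

    pmf: dict mapping each integer value x to f(x); p: probability, 0 < p <= 1.
    """
    cumulative = 0
    for x in sorted(pmf):
        cumulative += pmf[x]        # cumulative is now F(x)
        if cumulative >= p:
            return x
    raise ValueError("p exceeds the total probability")

# Hypothetical example: fair six-sided die; exact fractions avoid rounding issues
die = {x: Fraction(1, 6) for x in range(1, 7)}
q_half = quantile(die, Fraction(1, 2))       # F(3) = 1/2, so Q(1/2) = 3
q_above = quantile(die, Fraction(51, 100))   # F(3) < 51/100 <= F(4), so Q = 4
print(q_half, q_above)
```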

    Fitting a distribution to a sample

    $X_s = \{x_1, x_2, \ldots, x_{n_s}\}$, numerical sample of size $n_s$.

    Mean of the sample:

    $$x_{mean} = \frac{1}{n_s} \sum_{i=1}^{n_s} x_i$$

    Variance of the sample:

    $$s^2 = \frac{1}{n_s-1} \sum_{i=1}^{n_s} (x_i - x_{mean})^2$$

    Select a distribution, make $\mu = x_{mean}$ and $\sigma^2 = s^2$, and solve for the parameters of the distribution
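As an example of solving for the parameters, matching the sample mean and variance to a binomial distribution gives μ = np and σ² = np(1-p), hence p = 1 - s²/x_mean and n = x_mean/p. The Python sketch below works through this for a made-up sample; the data and the choice of the binomial distribution are assumptions for illustration only.

```python
from statistics import mean, variance

# Hypothetical sample of counts
Xs = [2, 3, 3, 4, 5, 2, 3, 4, 3, 1]

x_mean = mean(Xs)        # (1/ns) * sum of x_i
s2 = variance(Xs)        # sample variance, divides by ns - 1

# Binomial: mu = n*p and sigma^2 = n*p*(1-p)  =>  p = 1 - s2/x_mean, n = x_mean/p
p_hat = 1 - s2 / x_mean
n_hat = x_mean / p_hat   # generally not an integer; rounding it is a modelling choice

print(x_mean, s2, p_hat, n_hat)
```

This only works when s² is smaller than the sample mean; otherwise the binomial distribution is not a sensible candidate for the sample.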

    Random numbers

    Numbers generated by random processes, e.g., numbers drawn from a roulette wheel or a lottery

    Computers use deterministic algorithms that produce pseudo-random numbers

    Use Maple function Sample(X,ns), within package Statistics, to produce a sample (vector) of size ns for the random variable X, e.g., Xs := Sample(X,ns)

    To convert from a vector to a list, use: convert(Xs,list)
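For readers without Maple, the idea behind drawing a sample can be sketched in Python: generate uniform pseudo-random numbers and map each one to the smallest x whose CDF value reaches it. This is one possible technique (inversion of the CDF), chosen here for illustration; it is not claimed to be what Maple's Sample does internally. The function name `poisson_sample` and the seed are assumptions.

```python
import random
from math import exp

def poisson_sample(lam, ns, seed=0):
    """Draw ns pseudo-random Poisson(lam) values by inverting the CDF."""
    rng = random.Random(seed)        # seeded generator: a repeatable, deterministic sequence
    sample = []
    for _ in range(ns):
        u = rng.random()             # uniform number in [0, 1)
        x = 0
        f = exp(-lam)                # f(0)
        F = f                        # F(0)
        while u > F and f > 0.0:     # step up until F(x) >= u (stop if f underflows)
            x += 1
            f *= lam / x             # f(x) = f(x-1) * lam / x
            F += f
        sample.append(x)
    return sample

Xs = poisson_sample(3.2, 10000)      # 3.2 echoes the lecture's Poisson(3.2) example
print(sum(Xs) / len(Xs))             # sample mean, to compare with lam = 3.2
```

Because the generator is seeded, running the function twice with the same arguments returns the same sample, which illustrates the deterministic nature of pseudo-random numbers.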

    Statistical simulation or Monte-Carlo simulation

    Generating synthetic data out of a given distribution to use as input for a model

    Example 1 - generating precipitation data for a hydrological model

    Example 2 - generating hydraulic conductivity data for an aquifer in groundwater simulation

    Example 3 - generating traffic data for a highway operation simulation