Discrete Distributions


Page 1: Discrete Distributions

Discrete Distributions

Page 2: Discrete Distributions

Random Variable

A random variable X is a function that maps the possible outcomes of an experiment to real numbers.

That is, $X : C \to \mathbb{R}$, where $C$ is the set of all outcomes of an experiment and $\mathbb{R}$ is the set of real numbers.

The space of X is the set of real numbers $S = \{x : X(c) = x,\ c \in C\}$.

Page 3: Discrete Distributions

An Example of Random Variable

If we toss a coin one time, then there are two possible outcomes, namely “head up” and “tail up”.

We can define a random variable X that maps “head up” to 1 and “tail up” to 0.

We also can define a random variable Y that maps “head up” to 0 and “tail up” to 1.

Page 4: Discrete Distributions

The spaces of both random variables X and Y are {0,1}.
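To make the mapping concrete, here is a minimal Python sketch of these two random variables; the names are illustrative.

```python
# A random variable is just a function from outcomes to real numbers.
# Two different random variables on the same outcome set of one coin toss.
outcomes = ["head up", "tail up"]

def X(outcome):
    return 1 if outcome == "head up" else 0

def Y(outcome):
    return 0 if outcome == "head up" else 1

# The space of each random variable is the set of values it can take.
space_X = {X(c) for c in outcomes}   # {0, 1}
space_Y = {Y(c) for c in outcomes}   # {0, 1}
print(space_X, space_Y)
```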

Page 5: Discrete Distributions

Further Illustration of Random Variables

A random variable corresponds to a quantitative interpretation of the outcomes of an experiment.

For example, a company offers its employees a drawing at its year-end party. A computer will randomly select an employee for the first prize of $100,000 based on the employees' ID numbers, which range from 1 to 100.

Page 6: Discrete Distributions

In addition, the computer will randomly select two more employees for the second and third prizes of $50,000 and $10,000, respectively.

Assume that each employee can receive only one award and the drawing starts with the third prize and ends with the first prize.

Then, there are in total 100 × 99 × 98 = 970,200 possible outcomes.

Page 7: Discrete Distributions

To Edward, whose employee ID number is 10, the random variable of his interest is as follows:

X(<10, *, *>) = 10,000
X(<*, 10, *>) = 50,000
X(<*, *, 10>) = 100,000
X(all other outcomes) = 0

To Grace, whose employee ID number is 30, the random variable of her interest is as follows:

Y(<30, *, *>) = 10,000
Y(<*, 30, *>) = 50,000
Y(<*, *, 30>) = 100,000
Y(all other outcomes) = 0

Page 8: Discrete Distributions

The outcome spaces of random variables X and Y are identical. However, X and Y map some outcomes to different real numbers.

The spaces of X and Y are also identical and both are {0, 10000, 50000, 100000}.

The probability functions of X and Y are also equal:

Prob(X=10,000) = Prob(Y=10,000) = 0.01
Prob(X=50,000) = Prob(Y=50,000) = 0.01
Prob(X=100,000) = Prob(Y=100,000) = 0.01
Prob(X=0) = Prob(Y=0) = 0.97

Page 9: Discrete Distributions

The expected values of X and Y are equal:

E[X] = E[Y] = 10,000 × 0.01 + 50,000 × 0.01 + 100,000 × 0.01 = 1,600.
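These probabilities and the expected value can be checked by brute-force enumeration of all 970,200 ordered outcomes. A Python sketch for Edward's random variable X, with outcomes written as <third prize, second prize, first prize>:

```python
from itertools import permutations

# Outcomes are ordered triples <third prize, second prize, first prize>
# drawn from employee IDs 1..100 without replacement.
def X(outcome):                      # Edward's winnings (ID 10)
    third, second, first = outcome
    if third == 10:  return 10_000
    if second == 10: return 50_000
    if first == 10:  return 100_000
    return 0

n_outcomes = 0
total_winnings = 0
wins_10k = 0
for o in permutations(range(1, 101), 3):
    n_outcomes += 1
    w = X(o)
    total_winnings += w
    wins_10k += (w == 10_000)

print(n_outcomes)                    # 970200
print(wins_10k / n_outcomes)         # 0.01
print(total_winnings / n_outcomes)   # 1600.0
```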

Page 10: Discrete Distributions

Discrete Random Variables

Given a random variable X, let S denote the space of X.

If S is a finite or countably infinite set, then X is said to be a discrete random variable.

Page 11: Discrete Distributions

Countably Infinite

A set is said to be countably infinite if it contains an infinite number of elements and there exists a one-to-one mapping between the elements of the set and the positive integers.

Page 12: Discrete Distributions

Examples of Countable / Uncountable Infinite Sets

The set of integers is countable.

The set of rational numbers (fractions) is countable.

The set of real numbers is uncountable.

Page 13: Discrete Distributions

Probability Mass Function

The probability mass function (p.m.f.) of a discrete random variable X is defined to be

$P_X(k) = \mathrm{Prob}(X = k) = \mathrm{Prob}(Q_k),$

where $Q_k$ contains all outcomes that are mapped to $k$ by random variable $X$.

Page 14: Discrete Distributions

In the previous drawing example,

$P_X(10{,}000) = \mathrm{Prob}(X = 10{,}000) = \mathrm{Prob}(\{\langle 10, i, j \rangle : i \neq 10,\ j \neq 10,\ i \neq j\}) = \frac{99 \times 98}{100 \times 99 \times 98} = \frac{1}{100} = 0.01.$

Page 15: Discrete Distributions

In fact, the p.m.f. of a random variable is defined on a set of events of the experiment conducted.

In the previous drawing example, the set of outcomes that are mapped to 10,000 by X is an event.

Page 16: Discrete Distributions

Furthermore, in the previous drawing example, random variables X and Y map some outcomes to different real numbers. However, X and Y have the same distribution, i.e. the p.m.f. of X and the p.m.f. of Y are equal. More precisely,

$P_X(k) = P_Y(k)$ for every $k \in \{0,\ 10000,\ 50000,\ 100000\}.$

Page 17: Discrete Distributions

Properties of the Probability Mass Function

The p.m.f. of a random variable X satisfies the following three properties, where S denotes the space of X:

(1) $0 \le P_X(x_i) \le 1$ for every $x_i \in S$ (and $P_X(x) = 0$ for every $x \notin S$).

(2) $\sum_{x_i \in S} P_X(x_i) = 1.$

(3) $\mathrm{Prob}(X \in A) = \sum_{x_j \in A} P_X(x_j)$, where $A \subseteq S$.

Page 18: Discrete Distributions

Probability Distribution Function

For a random variable X, we define its probability distribution function $F_X$ as

$F_X(t) = \mathrm{Prob}(X \le t).$

Page 19: Discrete Distributions

Properties of a Probability Distribution Function

A probability distribution function satisfies the following conditions:

(1) $\lim_{t \to \infty} F_X(t) = 1.$

(2) $\lim_{t \to -\infty} F_X(t) = 0.$

(3) $F_X(w) \ge F_X(t)$ if $w \ge t.$

Any function that satisfies the conditions above can be a distribution function.

Page 20: Discrete Distributions

An Example of the Probability Distribution Function of a Discrete Random Variable

Assume that we toss a 4-sided die twice. Then, we have 16 possible outcomes:

⟨1,1⟩, ⟨2,1⟩, ⟨3,1⟩, ⟨4,1⟩, ⟨1,2⟩, ⟨2,2⟩, ⟨3,2⟩, ⟨4,2⟩,
⟨1,3⟩, ⟨2,3⟩, ⟨3,3⟩, ⟨4,3⟩, ⟨1,4⟩, ⟨2,4⟩, ⟨3,4⟩, ⟨4,4⟩.

Page 21: Discrete Distributions

Let random variable X be the sum of the two tosses. Then,

$\mathrm{Prob}(X=2) = \frac{1}{16},\quad \mathrm{Prob}(X=3) = \frac{2}{16},\quad \mathrm{Prob}(X=4) = \frac{3}{16},\quad \mathrm{Prob}(X=5) = \frac{4}{16},$

$\mathrm{Prob}(X=6) = \frac{3}{16},\quad \mathrm{Prob}(X=7) = \frac{2}{16},\quad \mathrm{Prob}(X=8) = \frac{1}{16}.$

Accordingly,

$F_X(5) = \mathrm{Prob}(X \le 5) = \frac{1}{16} + \frac{2}{16} + \frac{3}{16} + \frac{4}{16} = \frac{5}{8}.$
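A short Python check of this p.m.f. and of $F_X(5)$, enumerating the 16 equally likely outcomes with exact fractions:

```python
from itertools import product
from fractions import Fraction

# Enumerate the 16 equally likely outcomes of two tosses of a 4-sided die
# and tabulate the p.m.f. of the sum X.
pmf = {}
for a, b in product(range(1, 5), repeat=2):
    s = a + b
    pmf[s] = pmf.get(s, Fraction(0)) + Fraction(1, 16)

print(pmf)   # e.g. pmf[5] == Fraction(1, 4), i.e. 4/16

# Distribution function F_X(5) = Prob(X <= 5)
F5 = sum(p for x, p in pmf.items() if x <= 5)
print(F5)    # 5/8
```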

Page 22: Discrete Distributions

Operations of Random Variables

Let X and Y be two random variables defined on the same outcome space of an experiment.

Then, we can define a new random variable Z=f(X,Y).

Page 23: Discrete Distributions

For example, in the example of drawing, if Edward and Grace are husband and wife, then we can define a new random variable Z=X+Y.

We have

X(<30, 10, *>) = 50,000
Y(<30, 10, *>) = 10,000
Z(<30, 10, *>) = 60,000

Page 24: Discrete Distributions

Function of Random Variables

Let X be a random variable and G be a function. Then, random variable Y = G(X) maps an outcome ν in the outcome space of X to the value G(X(ν)).

With respect to the probability distribution functions, if G is a monotonically increasing one-to-one mapping, then

$F_Y(t) = \mathrm{Prob}(Y \le t) = \mathrm{Prob}(G(X) \le t) = \mathrm{Prob}(X \le G^{-1}(t)) = F_X(G^{-1}(t)).$

Page 25: Discrete Distributions

An Example of Functions of Random Variables

Let random variable X be the sum of two tosses of a 4-sided die and $Y = X^2$.

Then,

$F_Y(16) = \mathrm{Prob}(Y \le 16) = \mathrm{Prob}(X^2 \le 16) = \mathrm{Prob}(X \le 4) = P_X(2) + P_X(3) + P_X(4) = \frac{1 + 2 + 3}{16} = \frac{6}{16} = \frac{3}{8}.$

Page 26: Discrete Distributions

Expected Value of a Discrete Random Variable

Let X be a discrete random variable and S be its space. Then, the expected value of X is

$\mu = E[X] = \sum_{z \in C} X(z)\,\mathrm{Prob}(\{z\}) = \sum_{x_i \in S} x_i\,P_X(x_i).$

μ is a widely used symbol for the expected value.

Page 27: Discrete Distributions

Expected Value of a Function of a Random Variable

Let X be a random variable and G be a function. Then, the expected value of random variable $Y = G(X)$ is equal to

$E[Y] = \sum_{x_i \in S} G(x_i)\,P_X(x_i).$

Page 28: Discrete Distributions

Expected Value of a Function of a Random Variable

Proof:

$E[Y] = \sum_{y_i \in S'} y_i\,P_Y(y_i) = \sum_{y_i \in S'} y_i\,\mathrm{Prob}(Y = y_i) = \sum_{y_i \in S'} \sum_{\substack{x_j \in S \\ G(x_j) = y_i}} G(x_j)\,\mathrm{Prob}(X = x_j) = \sum_{x_j \in S} G(x_j)\,P_X(x_j),$

where $S'$ is the space of Y and S is the space of X.

Page 29: Discrete Distributions

For example, let X correspond to the outcome of tossing a die once. Then, $P_X(1) = P_X(2) = P_X(3) = P_X(4) = P_X(5) = P_X(6) = 1/6$ and $E[X] = 3.5$.

If we are concerned about the difference between the observed outcome and the mean, we can define $Y = |X - E[X]|$. Then $P_Y(1/2) = 1/3$, $P_Y(3/2) = 1/3$, $P_Y(5/2) = 1/3$.

Page 30: Discrete Distributions

Therefore,

$E[Y] = \frac{1}{2} \cdot \frac{1}{3} + \frac{3}{2} \cdot \frac{1}{3} + \frac{5}{2} \cdot \frac{1}{3} = \frac{3}{2}.$

On the other hand,

$E[\,|X - E[X]|\,] = \sum_{x_i} |x_i - 3.5|\,P_X(x_i) = \frac{1}{6}(2.5 + 1.5 + 0.5 + 0.5 + 1.5 + 2.5) = \frac{9}{6} = \frac{3}{2}.$
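A small Python sketch verifying that computing E[Y] via the p.m.f. of Y and computing E[|X − E[X]|] directly over the space of X give the same answer:

```python
from fractions import Fraction

# One toss of a fair 6-sided die.
pmf_X = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf_X.items())          # 7/2

# E[Y] via the p.m.f. of Y = |X - mu| ...
pmf_Y = {}
for x, p in pmf_X.items():
    y = abs(x - mu)
    pmf_Y[y] = pmf_Y.get(y, Fraction(0)) + p
E_Y = sum(y * p for y, p in pmf_Y.items())

# ... and E[|X - mu|] directly over the space of X.
E_G = sum(abs(x - mu) * p for x, p in pmf_X.items())

print(E_Y, E_G)   # 3/2 3/2
```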

Page 31: Discrete Distributions

Theorems about the Expected Value

(a) If c is a constant, then $E[c] = c$.

(b) If c is a constant and g is a function, then $E[c\,g(X)] = c\,E[g(X)]$.

(c) If $c_1$ and $c_2$ are constants and $g_1$ and $g_2$ are functions, then

$E[c_1 g_1(X) + c_2 g_2(X)] = c_1 E[g_1(X)] + c_2 E[g_2(X)].$

Page 32: Discrete Distributions

Proof of (a): Trivial.

Proof of (b):

$E[c\,g(X)] = \sum_{x_i \in S} c\,g(x_i)\,P_X(x_i) = c \sum_{x_i \in S} g(x_i)\,P_X(x_i) = c\,E[g(X)],$

where S is the space of X and $P_X$ is the p.m.f. of X.

Page 33: Discrete Distributions

Proof of (c):

$E[c_1 g_1(X) + c_2 g_2(X)] = \sum_{x_i \in S} \left(c_1 g_1(x_i) + c_2 g_2(x_i)\right) P_X(x_i) = c_1 \sum_{x_i \in S} g_1(x_i)\,P_X(x_i) + c_2 \sum_{x_i \in S} g_2(x_i)\,P_X(x_i) = c_1 E[g_1(X)] + c_2 E[g_2(X)].$

An extension of (c):

$E\left[\sum_{i=1}^{k} c_i g_i(X)\right] = \sum_{i=1}^{k} c_i\,E[g_i(X)].$

Page 34: Discrete Distributions

Variance of a Discrete Random Variable

The variance of a random variable is defined to be $E[(X - \mu)^2]$ and is typically denoted by $\sigma^2$.

For a discrete random variable X,

$\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - \mu^2.$

σ is normally called the standard deviation.

Page 35: Discrete Distributions

Let X be a random variable with mean $\mu_X$ and variance $\sigma_X^2$. Let $Y = aX + b$, where a and b are constants. Then,

$\mu_Y = E[Y] = E[aX + b] = a\,E[X] + b = a\mu_X + b,$

$\mathrm{Var}(Y) = E[(Y - \mu_Y)^2] = E[(aX + b - a\mu_X - b)^2] = E[a^2(X - \mu_X)^2] = a^2 \sigma_X^2.$

Page 36: Discrete Distributions

Variance of a Random Variable

The variance of a random variable measures the deviation of its distribution from the mean.

For example, in one drawing, Robert has a 0.1% chance to win $100,000, while in another drawing, he has a 0.01% chance to win $1,000,000.

Page 37: Discrete Distributions

The expected amounts of the award in these two drawings are equal:

0.001 × 100,000 = 100
0.0001 × 1,000,000 = 100

However, their variances are different:

0.001 × (100,000 − 100)² + 0.999 × (0 − 100)² = 9,990,000
0.0001 × (1,000,000 − 100)² + 0.9999 × (0 − 100)² = 99,990,000
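A quick numeric check of both variances (floating-point arithmetic, so the printed results may show tiny rounding error):

```python
# Two drawings with the same expected award but different variances.
def mean_var(pmf):
    mu = sum(x * p for x, p in pmf.items())
    var = sum(p * (x - mu) ** 2 for x, p in pmf.items())
    return mu, var

drawing1 = {100_000: 0.001, 0: 0.999}
drawing2 = {1_000_000: 0.0001, 0: 0.9999}

print(mean_var(drawing1))   # (100.0, 9990000.0)
print(mean_var(drawing2))   # (100.0, 99990000.0)
```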

Page 38: Discrete Distributions

In many distributions, the mean and variance together uniquely determine the parameters of the random variables.

Page 39: Discrete Distributions

The Bernoulli Experiment and Distribution

A Bernoulli experiment is a random experiment, the outcome of which can be classified in one of two mutually exclusive and exhaustive ways, say, success and failure.

A sequence of Bernoulli trials occurs when a Bernoulli experiment is performed several independent times, so that the probability of success, say p, remains the same from trial to trial.

Page 40: Discrete Distributions

Let X be a Bernoulli random variable. The p.m.f. of X can be written as

$P_X(k) = p^k (1 - p)^{1-k},$

where k = 0 or 1 and p is the probability of success.

The expected value of X is

$\mu = \sum_{k=0}^{1} k\,p^k (1 - p)^{1-k} = p.$

The variance of X is

$\sigma^2 = \sum_{k=0}^{1} (k - p)^2\,p^k (1 - p)^{1-k} = p(1 - p).$

Page 41: Discrete Distributions

The Binomial Distribution

Let X be the random variable corresponding to the number of successes in a sequence of Bernoulli trials.

Then,

$P_X(k) = \mathrm{Prob}(X = k) = \binom{n}{k}\,p^k (1 - p)^{n-k},$

where n is the number of Bernoulli trials and p is the probability of success in one trial.

X is said to have a binomial distribution, normally denoted by b(n, p).

Page 42: Discrete Distributions

Example of the Binomial Distribution

Assume that Tiger and Whale are the two teams that enter the championship series of the professional basketball league. Based on prior records, Tiger has a 60% chance of beating Whale in a single game. Larry, who is a fan of Tiger, makes a bet with Peter, who is a fan of Whale. According to their agreement, Larry will pay Peter $1,000 should Whale win the 5-game series. In order to make the bet fair, how much should Peter pay Larry if Tiger wins the series?

Page 43: Discrete Distributions

The probability that Tiger wins the series is

$\binom{5}{3}(0.6)^3(0.4)^2 + \binom{5}{4}(0.6)^4(0.4) + (0.6)^5 = 0.6826.$

A fair bet requires $Z \times 0.6826 = 1000 \times (1 - 0.6826)$, which gives $Z \approx 465$.

Page 44: Discrete Distributions

If the championship series consists of 3 games, then what is the probability that Tiger wins the series?

$\binom{3}{2}(0.6)^2(0.4) + (0.6)^3 = 0.648 < 0.6826.$
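A short Python sketch of these computations; series_win_prob is an illustrative helper that sums the binomial tail for a best-of-n series:

```python
from math import comb

def series_win_prob(p, n_games):
    """Probability that a team with per-game win probability p
    wins a majority of n_games (n_games odd)."""
    need = n_games // 2 + 1
    return sum(comb(n_games, k) * p**k * (1 - p)**(n_games - k)
               for k in range(need, n_games + 1))

p5 = series_win_prob(0.6, 5)
print(round(p5, 4))                       # 0.6826
# Fair stake for Peter: Z * p5 = 1000 * (1 - p5)
print(round(1000 * (1 - p5) / p5))        # 465
print(round(series_win_prob(0.6, 3), 3))  # 0.648
```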

Page 45: Discrete Distributions

The Moment-Generating Function

Let X be a discrete random variable with p.m.f. $P_X(x)$ and space S. If there is a positive number h such that

$E[e^{tX}] = \sum_{x_i \in S} e^{tx_i}\,P_X(x_i)$

exists and is finite for $-h < t < h$, then the function of t defined by

$M_X(t) = E[e^{tX}]$

is called the moment-generating function of X, often abbreviated as m.g.f.

Page 46: Discrete Distributions

Let X and Y be two discrete random variables with the same space S.

If $E[e^{tX}] = E[e^{tY}]$, then the probability mass functions of X and Y are equal.

Insight of the argument above: Assume that S = {s₁, s₂, ..., s_k} contains only positive integers. Then, we have

$P_X(s_1)e^{ts_1} + P_X(s_2)e^{ts_2} + \cdots + P_X(s_k)e^{ts_k} = P_Y(s_1)e^{ts_1} + P_Y(s_2)e^{ts_2} + \cdots + P_Y(s_k)e^{ts_k}.$

Therefore, $P_X(s_i) = P_Y(s_i)$, i.e. X and Y have the same p.m.f.

Page 47: Discrete Distributions

Let $M_X(t)$ be the m.g.f. of a discrete random variable X. Then,

$\frac{d^k M_X(t)}{dt^k} = \sum_{x_i \in S} x_i^k\,e^{tx_i}\,P_X(x_i).$

Furthermore,

$\left.\frac{d^k M_X(t)}{dt^k}\right|_{t=0} = \sum_{x_i \in S} x_i^k\,P_X(x_i) = E[X^k].$

In particular,

$\mu_X = M_X'(0) \quad \text{and} \quad \sigma_X^2 = M_X''(0) - \left(M_X'(0)\right)^2.$

Page 48: Discrete Distributions

The Moment-Generating Function of the Binomial Distribution

Let X be b(n, p). The sums

$E[X] = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} = \sum_{k=0}^{n} k\,\frac{n!}{k!(n-k)!}\,p^k (1-p)^{n-k}$

and

$E[X^2] = \sum_{k=0}^{n} k^2 \binom{n}{k} p^k (1-p)^{n-k}$

are both difficult to compute directly.

Page 49: Discrete Distributions

On the other hand, we can easily derive the m.g.f. of a binomial distribution:

$M_X(t) = E[e^{tX}] = \sum_{k=0}^{n} e^{tk} \binom{n}{k} p^k (1-p)^{n-k} = \sum_{k=0}^{n} \binom{n}{k} (pe^t)^k (1-p)^{n-k} = \left(1 - p + pe^t\right)^n.$

Page 50: Discrete Distributions

$M_X'(t) = n\left(1 - p + pe^t\right)^{n-1} pe^t,$

$M_X''(t) = n(n-1)\left(1 - p + pe^t\right)^{n-2} (pe^t)^2 + n\left(1 - p + pe^t\right)^{n-1} pe^t,$

so that

$M_X'(0) = np, \qquad M_X''(0) = n(n-1)p^2 + np.$

Therefore,

$\mu_X = M_X'(0) = np,$

$\sigma_X^2 = M_X''(0) - \left(M_X'(0)\right)^2 = n(n-1)p^2 + np - n^2p^2 = np(1 - p).$
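The differentiation can be verified symbolically; a sketch assuming the SymPy library is available:

```python
import sympy as sp

t, p, n = sp.symbols('t p n', positive=True)

# m.g.f. of the binomial distribution b(n, p)
M = (1 - p + p * sp.exp(t)) ** n

M1 = sp.diff(M, t)       # M'(t)
M2 = sp.diff(M, t, 2)    # M''(t)

mu = sp.simplify(M1.subs(t, 0))            # n*p
var = sp.simplify(M2.subs(t, 0) - mu**2)   # n*p - n*p**2
print(mu)
print(sp.factor(var))                      # n*p*(1 - p), up to sign ordering
```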

Page 51: Discrete Distributions

The Poisson Process

A Poisson process models the number of times that a particular type of event occurs during a time interval.

The Poisson process is based on the following three assumptions:

(1) The numbers of event occurrences in non-overlapping intervals are independent.

(2) $\lim_{\Delta t \to 0} \dfrac{\mathrm{Prob}(\text{one occurrence between times } t \text{ and } t + \Delta t)}{\Delta t} = \lambda.$

Page 52: Discrete Distributions

(3) $\lim_{\Delta t \to 0} \dfrac{\mathrm{Prob}(\text{two or more occurrences between times } t \text{ and } t + \Delta t)}{\Delta t} = 0.$

λ is the only parameter of the Poisson process.

One example of the Poisson process is to model the number of Web accesses that a Web server receives between 8 AM and 9 AM.

Page 53: Discrete Distributions

The Basis of the Assumptions of the Poisson Process

Assume that an ideal random number generator generates λ numbers in [0, 1].

If we divide [0, 1] evenly into n subintervals, then the probability that exactly one of the generated numbers falls in [0, 1/n] is

$\binom{\lambda}{1} \frac{1}{n} \left(1 - \frac{1}{n}\right)^{\lambda - 1}.$

Page 54: Discrete Distributions

The Basis of the Assumptions of the Poisson Process

The probability that exactly two of the numbers fall in [0, 1/n] is

$\binom{\lambda}{2} \frac{1}{n^2} \left(1 - \frac{1}{n}\right)^{\lambda - 2} = \frac{\lambda(\lambda - 1)}{2} \frac{1}{n^2} \left(1 - \frac{1}{n}\right)^{\lambda - 2}.$

Let $\Delta t = 1/n$. Then,

$\lim_{\Delta t \to 0} \frac{\mathrm{Prob}(\text{one occurrence in } [0, \Delta t])}{\Delta t} = \lim_{n \to \infty} \lambda \left(1 - \frac{1}{n}\right)^{\lambda - 1} = \lambda,$

$\lim_{\Delta t \to 0} \frac{\mathrm{Prob}(\text{two occurrences in } [0, \Delta t])}{\Delta t} = \lim_{n \to \infty} \frac{\lambda(\lambda - 1)}{2} \frac{1}{n} \left(1 - \frac{1}{n}\right)^{\lambda - 2} = 0.$

Page 55: Discrete Distributions

The Poisson Distribution

Assume that we are concerned about a Poisson process with parameter λ and want to count the number of event occurrences during one time interval.

We can divide the time interval evenly into n subintervals, each of length 1/n, as the following figure shows.

[Figure: the interval from Time = 0 to Time = 1 divided into n subintervals of length 1/n.]

Page 56: Discrete Distributions

The probability that the event occurs k times during the time interval is

$\lim_{n \to \infty} \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k} = \lim_{n \to \infty} \frac{n!}{k!(n-k)!} \frac{\lambda^k}{n^k} \left(1 - \frac{\lambda}{n}\right)^{n-k} = \frac{\lambda^k}{k!} \lim_{n \to \infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k}.$

Since $\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^n = e^{-\lambda}$ and $\lim_{n \to \infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} \left(1 - \frac{\lambda}{n}\right)^{-k} = 1$, the final result is

$\frac{\lambda^k}{k!}\,e^{-\lambda}.$

Page 57: Discrete Distributions

We say that a random variable X has a Poisson distribution if

$P_X(k) = \frac{\lambda^k}{k!}\,e^{-\lambda}, \qquad k = 0, 1, 2, \ldots$

By the Maclaurin series,

$e^{\lambda} = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!},$

we have

$\sum_{k=0}^{\infty} P_X(k) = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}\,e^{-\lambda} = e^{-\lambda} e^{\lambda} = 1.$

Page 58: Discrete Distributions

The moment-generating function of a random variable with the Poisson distribution is

$M_X(t) = E[e^{tX}] = \sum_{k=0}^{\infty} e^{tk} \frac{\lambda^k}{k!}\,e^{-\lambda} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^t)^k}{k!} = e^{-\lambda} e^{\lambda e^t} = e^{\lambda(e^t - 1)}.$

Therefore,

$M_X'(t) = \lambda e^t\,e^{\lambda(e^t - 1)},$

$M_X''(t) = (\lambda e^t)^2\,e^{\lambda(e^t - 1)} + \lambda e^t\,e^{\lambda(e^t - 1)}.$

Page 59: Discrete Distributions

Therefore,

$\mu = M_X'(0) = \lambda \quad \text{and} \quad \sigma^2 = M_X''(0) - \left(M_X'(0)\right)^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.$

Therefore, λ is the average rate of event occurrence per unit of time.

Let Y be the random variable corresponding to the number of event occurrences during a time interval of length t. Then,

$P_Y(k) = \frac{(\lambda t)^k}{k!}\,e^{-\lambda t}.$

Page 60: Discrete Distributions

The Poisson Distribution

The probability that the event occurs k times during a time interval of length t is

$\lim_{n \to \infty} \binom{n}{k} \left(\frac{\lambda t}{n}\right)^k \left(1 - \frac{\lambda t}{n}\right)^{n-k} = \frac{(\lambda t)^k}{k!}\,e^{-\lambda t}.$
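A numeric illustration of this limit: for a fixed rate and count (λ = 2 and k = 3 here, chosen only for illustration), the binomial probabilities b(n, λ/n) approach the Poisson probability as n grows.

```python
from math import comb, exp, factorial

lam, k = 2.0, 3   # illustrative parameter and count

def binom_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

for n in (10, 100, 1000, 10000):
    print(n, binom_pmf(n, lam / n, k))

# Poisson limit: lambda^k / k! * e^(-lambda)
print("limit:", lam**k / factorial(k) * exp(-lam))
```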

Page 61: Discrete Distributions

Joint Distributions

Page 62: Discrete Distributions

Joint Probability Mass Function

Let X and Y be two discrete random variables defined on the same outcome set. The probability that X = x and Y = y is denoted by $P_{X,Y}(x, y) = \mathrm{Prob}(X = x, Y = y)$ and is called the joint probability mass function (joint p.m.f.) of X and Y. $P_{X,Y}(x, y)$ satisfies the following three properties:

(1) $0 \le P_{X,Y}(x, y) \le 1.$

(2) $\sum_{(x, y) \in S} P_{X,Y}(x, y) = 1.$

(3) $\mathrm{Prob}((X, Y) \in A) = \sum_{(x, y) \in A} P_{X,Y}(x, y)$, where A is a subset of S.

Page 63: Discrete Distributions

Example of Joint Distributions

Assume that a supermarket collected the following statistics of customers' purchasing behavior:

          Purchasing Wine    Not Purchasing Wine
Male            45                   255
Female          70                   630

          Purchasing Juice   Not Purchasing Juice
Male            60                   240
Female         210                   490

Page 64: Discrete Distributions

Example of Joint Distributions

Let random variable M correspond to whether a customer is male, random variable W correspond to whether a customer purchases wine, and random variable J correspond to whether a customer purchases juice.

Page 65: Discrete Distributions

The joint p.m.f. of M and W is

P_MW(1,1) = 0.045
P_MW(1,0) = 0.255
P_MW(0,1) = 0.07
P_MW(0,0) = 0.63

Page 66: Discrete Distributions

The joint p.m.f. of M and J is

P_MJ(1,1) = 0.06
P_MJ(1,0) = 0.24
P_MJ(0,1) = 0.21
P_MJ(0,0) = 0.49

Page 67: Discrete Distributions

Marginal Probability Mass Function

Let PXY(x,y) be the joint p.m.f. of discrete random variables X and Y.

$P_X(x_i) = \mathrm{Prob}(X = x_i) = \sum_{y_j} \mathrm{Prob}(X = x_i, Y = y_j) = \sum_{y_j} P_{XY}(x_i, y_j)$

is called the marginal p.m.f. of X. Similarly,

$P_Y(y_j) = \sum_{x_i} P_{XY}(x_i, y_j)$

is called the marginal p.m.f. of Y.

Page 68: Discrete Distributions

More on Joint Probability Mass Functions

Note that we can always create a common outcome set for any two or more random variables. For example, let X and Y correspond to the outcomes of the first and second tosses of a coin, respectively. Then, the outcome set of X is {head up, tail up} and the outcome set of Y is also {head up, tail up}. The common outcome set of X and Y is {(head up, head up), (head up, tail up), (tail up, head up), (tail up, tail up)}.

Page 69: Discrete Distributions

Independent Random Variables

Two discrete random variables X and Y are said to be independent if and only if, for all possible combinations of x and y,

$P_{X,Y}(x, y) = P_X(x)\,P_Y(y).$

Otherwise, X and Y are said to be dependent.

Page 70: Discrete Distributions

Example of Independent Random Variables

Assume that a supermarket collected the following statistics of customers' purchasing behavior:

          Purchasing soft drinks   Not purchasing soft drinks
Male             90                        210
Female          210                        490

Page 71: Discrete Distributions

Example of Independent Random Variables

Let random variable M correspond to whether a customer is male or not and random variable S correspond to whether a customer purchases soft drinks or not.

Then, M and S are independent, since for all possible combinations of the values of M and S, we have

Prob(M=i, S=j) = Prob(M=i) Prob(S=j).
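A short Python check that the joint p.m.f. factors into the product of its marginals for every cell of this table:

```python
# Counts from the soft-drink table (1000 customers in total);
# cell keys are (m, s) with m = 1 for male, s = 1 for purchasing.
counts = {(1, 1): 90, (1, 0): 210, (0, 1): 210, (0, 0): 490}
n = sum(counts.values())

joint = {k: v / n for k, v in counts.items()}
P_M = {m: sum(v for (mm, s), v in joint.items() if mm == m) for m in (0, 1)}
P_S = {s: sum(v for (m, ss), v in joint.items() if ss == s) for s in (0, 1)}

independent = all(abs(joint[(m, s)] - P_M[m] * P_S[s]) < 1e-12
                  for m in (0, 1) for s in (0, 1))
print(independent)   # True
```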

Page 72: Discrete Distributions

Another Example of Joint Distribution

Object X Y Class Object X Y Class

1 7.1 9.1 1 11 10.9 8.8 2

2 6.7 10.2 1 12 10.8 10.3 2

3 7.5 10.6 1 13 11.1 11 2

4 7.6 8.8 1 14 12.3 9.1 2

5 8.1 10.3 1 15 12.1 9.7 2

6 8.0 11.0 1 16 12 10.9 2

7 8.6 8.9 1 17 13.1 8.9 2

8 8.7 9.8 1 18 12.8 10.1 2

9 9.2 11.2 1 19 13.2 11.3 2

10 6.5 10.1 1 20 13.7 9.9 2

Average 7.8 10.0 - Average 12.2 10.0 -

Page 73: Discrete Distributions

Joint p.m.f. of X, Y, and C

[Scatter plot: the 20 objects plotted with X on the horizontal axis (6 to 14) and Y on the vertical axis (8 to 12), each point labeled by its class; class-1 points cluster at smaller X values and class-2 points at larger X values.]

Page 74: Discrete Distributions

Joint p.m.f. of X and C

[Plot: class C on the vertical axis (0 to 2) versus X on the horizontal axis (6 to 14); class-1 points lie at smaller X values and class-2 points at larger X values, with no overlap.]

Page 75: Discrete Distributions

Joint p.m.f. of Y and C

[Plot: class C on the vertical axis (0 to 2) versus Y on the horizontal axis; class-1 and class-2 points overlap across the whole range of Y.]

Page 76: Discrete Distributions

Joint Distribution Function

Let X and Y be two random variables. The joint distribution function is defined as follows:

$F_{XY}(x, y) = \mathrm{Prob}(X \le x,\ Y \le y).$

Note that this definition applies to both discrete and continuous random variables.

Page 77: Discrete Distributions

Joint Probability Density Function

Assume that X and Y are two continuous random variables defined on the same space S. The joint probability density function of X and Y is defined as follows:

$f_{XY}(x, y) = \frac{\partial^2 F_{XY}(x, y)}{\partial x\,\partial y}.$

X and Y are said to be independent if and only if

$f_{XY}(x, y) = f_X(x)\,f_Y(y).$

Page 78: Discrete Distributions

In some textbooks, it is defined that two random variables are independent if and only if

$F_{XY}(x, y) = F_X(x)\,F_Y(y).$

We then have

$f_{XY}(x, y) = \frac{\partial^2 F_{XY}(x, y)}{\partial x\,\partial y} = \frac{\partial^2 F_X(x) F_Y(y)}{\partial x\,\partial y} = f_X(x)\,f_Y(y).$

Page 79: Discrete Distributions

The marginal p.d.f. of X is

$f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy,$

and the marginal p.d.f. of Y is

$f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx.$

Page 80: Discrete Distributions

Jointly Independent and Pairwise Independent

Note that even if we have

$P_{X,Y}(x, y) = P_X(x)\,P_Y(y),$
$P_{Y,Z}(y, z) = P_Y(y)\,P_Z(z),$
$P_{X,Z}(x, z) = P_X(x)\,P_Z(z),$

it is not necessarily true that

$P_{X,Y,Z}(x, y, z) = P_X(x)\,P_Y(y)\,P_Z(z).$

Page 81: Discrete Distributions

An Example of Pairwise Independence

Let X and Y be two random variables that correspond to tossing an unbiased coin two times, and let Z = X ⊕ Y (the exclusive-or of X and Y). Then

Prob(Z=0) = Prob(X=0, Y=0) + Prob(X=1, Y=1) = 1/2,

Prob(X=0, Z=0) = Prob(X=0, Y=0) = 1/4 = Prob(X=0) Prob(Z=0).

Page 82: Discrete Distributions

Therefore, X, Y and Z are pairwise independent. However, Prob(X=0, Y=0, Z=1) = 0, while Prob(X=0) Prob(Y=0) Prob(Z=1) = 1/8.

Hence, X, Y and Z are not jointly independent.

Page 83: Discrete Distributions

On the other hand, joint independence implies pairwise independence. For example,

$P_{X,Y}(x, y) = \sum_{z} P_{X,Y,Z}(x, y, z) = \sum_{z} P_X(x)\,P_Y(y)\,P_Z(z) = P_X(x)\,P_Y(y) \sum_{z} P_Z(z) = P_X(x)\,P_Y(y).$
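A brute-force Python check of the coin example above, enumerating the four equally likely outcomes (Z is computed with the integer XOR operator ^):

```python
from itertools import product

# X, Y: two independent tosses of an unbiased coin; Z = X xor Y.
# Each of the 4 outcomes (x, y) has probability 1/4.
triples = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

def prob(pred):
    return sum(1 for t in triples if pred(t)) / len(triples)

# Pairwise independence, e.g. X and Z:
print(prob(lambda t: t[0] == 0 and t[2] == 0))                # 0.25
print(prob(lambda t: t[0] == 0) * prob(lambda t: t[2] == 0))  # 0.25

# But not jointly independent:
print(prob(lambda t: t == (0, 0, 1)))                         # 0.0
print(prob(lambda t: t[0] == 0) * prob(lambda t: t[1] == 0)
      * prob(lambda t: t[2] == 1))                            # 0.125
```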

Page 84: Discrete Distributions

Addition of Two Random Variables

Let X and Y be two random variables. Then, E[X+Y] = E[X] + E[Y]. Note that the above equation holds even if X and Y are dependent.

Proof of the discrete case:

$E[X+Y] = \sum_{x}\sum_{y} (x + y)\,P_{XY}(x, y) = \sum_{x}\sum_{y} x\,P_{XY}(x, y) + \sum_{x}\sum_{y} y\,P_{XY}(x, y) = \sum_{x} x \sum_{y} P_{XY}(x, y) + \sum_{y} y \sum_{x} P_{XY}(x, y) = \sum_{x} x\,P_X(x) + \sum_{y} y\,P_Y(y) = E[X] + E[Y].$

Page 85: Discrete Distributions

On the other hand,

$\mathrm{Var}(X+Y) = E\left[\left((X+Y) - (\mu_X + \mu_Y)\right)^2\right] = E\left[\left((X - \mu_X) + (Y - \mu_Y)\right)^2\right] = E\left[(X - \mu_X)^2 + 2(X - \mu_X)(Y - \mu_Y) + (Y - \mu_Y)^2\right] = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,E[(X - \mu_X)(Y - \mu_Y)] = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\left(E[XY] - E[X]E[Y]\right).$

Page 86: Discrete Distributions

Note that if X and Y are independent, then

$E[XY] = \sum_{x}\sum_{y} xy\,P_{XY}(x, y) = \sum_{x}\sum_{y} xy\,P_X(x)\,P_Y(y) = \left(\sum_{x} x\,P_X(x)\right)\left(\sum_{y} y\,P_Y(y)\right) = E[X]\,E[Y].$

Therefore, if X and Y are independent, then Var[X+Y] = Var[X] + Var[Y].

Page 87: Discrete Distributions

Covariance

Let X and Y be two random variables. Then, E[(X-µX)(Y- µY)] is called the covariance of X and Y, and is denoted by σXY, where µX and µY are the means of X and Y, respectively.

Page 88: Discrete Distributions

Covariance

$E[(X - \mu_X)(Y - \mu_Y)] = E[XY - \mu_Y X - \mu_X Y + \mu_X\mu_Y] = E[XY] - \mu_Y E[X] - \mu_X E[Y] + \mu_X\mu_Y = E[XY] - \mu_X\mu_Y.$

Therefore, if X and Y are independent, then Cov[X,Y]=0.

Page 89: Discrete Distributions

Examples of Correlated Random Variables

Assume that a supermarket collected the following statistics of customers' purchasing behavior:

          Purchasing Wine    Not Purchasing Wine
Male            45                   255
Female          70                   630

          Purchasing Juice   Not Purchasing Juice
Male            60                   240
Female         210                   490

Page 90: Discrete Distributions

Examples of Correlated Random Variables

Let random variable M correspond to whether a customer is male, random variable W correspond to whether a customer purchases wine, and random variable J correspond to whether a customer purchases juice.

Page 91: Discrete Distributions

The joint p.m.f. of M and W is

P_MW(1,1) = 0.045
P_MW(1,0) = 0.255
P_MW(0,1) = 0.07
P_MW(0,0) = 0.63

Cov(M,W) = E[MW] − E[M]E[W] = 0.045 − 0.3 × 0.115 = 0.0105 > 0, so M and W are positively correlated.

Page 92: Discrete Distributions

The joint p.m.f. of M and J is

P_MJ(1,1) = 0.06
P_MJ(1,0) = 0.24
P_MJ(0,1) = 0.21
P_MJ(0,0) = 0.49

Cov(M,J) = E[MJ] − E[M]E[J] = 0.06 − 0.3 × 0.27 = −0.021 < 0, so M and J are negatively correlated.

Page 93: Discrete Distributions

Covariance of Independent Random Variables

Assume that the supermarket also collected the following statistics of customers' purchasing behavior:

          Purchasing soft drinks   Not purchasing soft drinks
Male             90                        210
Female          210                        490

Page 94: Discrete Distributions

The joint p.m.f. of M and S is

P_MS(1,1) = 0.09
P_MS(1,0) = 0.21
P_MS(0,1) = 0.21
P_MS(0,0) = 0.49

Cov(M,S) = E[MS] − E[M]E[S] = 0.09 − 0.3 × 0.3 = 0, due to the fact that M and S are independent.

Page 95: Discrete Distributions

Correlation Coefficient

The correlation coefficient of two random variables X and Y is defined as follows:

$\rho = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}.$

Page 96: Discrete Distributions

Bounds of a Correlation Coefficient

Let

$K(b) = E\left[\left((Y - \mu_Y) - b(X - \mu_X)\right)^2\right] = \sigma_Y^2 - 2b\,\sigma_{XY} + b^2\sigma_X^2.$

Since K(b) is the expected value of a square, $K(b) \ge 0$ for all $b \in \mathbb{R}$.

Taking $b = \sigma_{XY}/\sigma_X^2$, we have

$K(b) = \sigma_Y^2\left(1 - \rho^2\right) \ge 0.$

Therefore, $-1 \le \rho \le 1$.

Page 97: Discrete Distributions

Implication of the Value of the Correlation Coefficient

Assume that the supermarket collected the following statistics of customers' purchasing behavior:

          Purchasing cosmetics   Not purchasing cosmetics
Male             10                      290
Female          260                      440

Page 98: Discrete Distributions

Implication of the Value of the Correlation Coefficient

Let random variable M correspond to whether a customer is male and random variable C correspond to whether a customer purchases cosmetics.

Then, the correlation coefficient of M and C is −0.349.

Page 99: Discrete Distributions

Implication of the Value of the Correlation Coefficient

On the other hand, we also have the following dataset:

          Purchasing juice   Not purchasing juice
Male            60                   240
Female         210                   490

The correlation coefficient of M and J is −0.103.
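Both coefficients can be computed directly from the 2-by-2 counts; a sketch with an illustrative helper corr_2x2:

```python
from math import sqrt

def corr_2x2(counts):
    """Correlation coefficient of two 0/1 random variables given
    counts[(a, b)] for the four cells of a 2-by-2 table."""
    n = sum(counts.values())
    p = {k: v / n for k, v in counts.items()}
    E_A = p[(1, 1)] + p[(1, 0)]          # marginal mean of the first variable
    E_B = p[(1, 1)] + p[(0, 1)]          # marginal mean of the second variable
    cov = p[(1, 1)] - E_A * E_B          # E[AB] - E[A]E[B]
    return cov / (sqrt(E_A * (1 - E_A)) * sqrt(E_B * (1 - E_B)))

cosmetics = {(1, 1): 10, (1, 0): 290, (0, 1): 260, (0, 0): 440}
juice = {(1, 1): 60, (1, 0): 240, (0, 1): 210, (0, 0): 490}
print(round(corr_2x2(cosmetics), 3))  # -0.349
print(round(corr_2x2(juice), 3))      # -0.103
```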

Page 100: Discrete Distributions

Another Example of Correlation Coefficient

Object X Y Class Object X Y Class

1 7.1 9.1 1 11 10.9 8.8 2

2 6.7 10.2 1 12 10.8 10.3 2

3 7.5 10.6 1 13 11.1 11 2

4 7.6 8.8 1 14 12.3 9.1 2

5 8.1 10.3 1 15 12.1 9.7 2

6 8.0 11.0 1 16 12 10.9 2

7 8.6 8.9 1 17 13.1 8.9 2

8 8.7 9.8 1 18 12.8 10.1 2

9 9.2 11.2 1 19 13.2 11.3 2

10 6.5 10.1 1 20 13.7 9.9 2

Average 7.8 10.0 - Average 12.2 10.0 -

Page 101: Discrete Distributions

Joint p.m.f. of X, Y, and C

[Scatter plot: the 20 objects plotted with X on the horizontal axis (6 to 14) and Y on the vertical axis (8 to 12), each point labeled by its class; class-1 points cluster at smaller X values and class-2 points at larger X values.]

Page 102: Discrete Distributions

Joint p.m.f. of X and C

[Plot: class C on the vertical axis (0 to 2) versus X on the horizontal axis (6 to 14); class-1 points lie at smaller X values and class-2 points at larger X values, with no overlap.]

Page 103: Discrete Distributions

Joint p.m.f. of Y and C

[Plot: class C on the vertical axis (0 to 2) versus Y on the horizontal axis; class-1 and class-2 points overlap across the whole range of Y.]

Page 104: Discrete Distributions

Another Example of Correlation Coefficients

The correlation coefficient of X and C is

$\frac{E[XC] - E[X]E[C]}{\sigma_X \sigma_C} = \frac{16.1 - 10 \times 1.5}{2.379 \times 0.5} = 0.925.$

On the other hand, the covariance of Y and C is

$E[YC] - E[Y]E[C] = 15 - 10 \times 1.5 = 0,$

and therefore the correlation coefficient of Y and C is 0.

Page 105: Discrete Distributions

With respect to data analysis, random variable X provides valuable information about the class of an object.

On the other hand, random variable Y essentially provides no information about the class of an object.

Page 106: Discrete Distributions

Example of Uncorrelated Random Variables

Assume X and Y have the following joint p.m.f.:

$P_{XY}(0, 1) = P_{XY}(1, 0) = P_{XY}(2, 1) = 1/3.$

We have the following marginal p.m.f.s:

$P_X(0) = \sum_{y} P_{XY}(0, y) = 1/3;\quad P_X(1) = \sum_{y} P_{XY}(1, y) = 1/3;\quad P_X(2) = \sum_{y} P_{XY}(2, y) = 1/3;$

$P_Y(0) = \sum_{x} P_{XY}(x, 0) = 1/3;\quad P_Y(1) = \sum_{x} P_{XY}(x, 1) = 2/3.$

Page 107: Discrete Distributions

Example of Uncorrelated Random Variables

Since $P_{XY}(0, 1) = 1/3$ while $P_X(0) \times P_Y(1) = 1/3 \times 2/3 = 2/9$, X and Y are not independent.

However, $\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] = \left[\tfrac{1}{3}(0 \times 1) + \tfrac{1}{3}(1 \times 0) + \tfrac{1}{3}(2 \times 1)\right] - \left[1 \times \tfrac{2}{3}\right] = \tfrac{2}{3} - \tfrac{2}{3} = 0.$

Therefore, independence implies uncorrelatedness, but the converse is not true.
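A final Python check of this example, using exact fractions:

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, 1): third, (1, 0): third, (2, 1): third}

# Marginal p.m.f.s from the joint p.m.f.
P_X = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1, 2)}
P_Y = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

E_X = sum(x * p for x, p in P_X.items())              # 1
E_Y = sum(y * p for y, p in P_Y.items())              # 2/3
E_XY = sum(x * y * p for (x, y), p in joint.items())  # 2/3

print(E_XY - E_X * E_Y)                # 0   -> uncorrelated
print(joint[(0, 1)], P_X[0] * P_Y[1])  # 1/3 vs 2/9 -> dependent
```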