Sections 4.1, 4.2, 4.3


Page 1: Sections 4.1, 4.2, 4.3

Sections 4.1, 4.2, 4.3

Important Definitions in the Text:

The definition of joint probability mass function (joint p.m.f.): Definition 4.1-1

The definitions of marginal probability mass function (marginal p.m.f.) and the independence of random variables: Definition 4.1-2

If the joint p.m.f. of (X, Y) is f(x,y), and S is the corresponding outcome space, then the mathematical expectation, or expected value, of u(X,Y) is

If the marginal p.m.f. of X is f1(x), and S1 is the corresponding outcome space, then E[v(X)] can be calculated from either

An analogous statement can be made about E[v(Y)] .

Page 2: Sections 4.1, 4.2, 4.3

1. Twelve bags each contain two pieces of candy, one red and one green. In two of the bags each piece of candy weighs 1 gram; in three of the bags the red candy weighs 2 grams and the green candy weighs 1 gram; in three of the bags the red candy weighs 1 gram and the green candy weighs 2 grams; in the remaining four bags each piece of candy weighs 2 grams. One bag is selected at random and the following random variables are defined:

X = weight of the red candy ,

Y = weight of the green candy .

The space of (X, Y) is {(1,1), (1,2), (2,1), (2,2)}.

The joint p.m.f. of (X, Y) is

$$f(x, y) = \begin{cases} 1/6 & \text{if } (x, y) = (1, 1) \\ 1/4 & \text{if } (x, y) = (1, 2), (2, 1) \\ 1/3 & \text{if } (x, y) = (2, 2) \end{cases}$$

Page 3: Sections 4.1, 4.2, 4.3

The marginal p.m.f. of X is

$$f_1(x) = \begin{cases} 5/12 & \text{if } x = 1 \\ 7/12 & \text{if } x = 2 \end{cases}$$

The marginal p.m.f. of Y is

$$f_2(y) = \begin{cases} 5/12 & \text{if } y = 1 \\ 7/12 & \text{if } y = 2 \end{cases}$$

A formula for the joint p.m.f. of (X,Y) is

$$f(x, y) = \frac{x + y}{12} \quad \text{if } (x, y) = (1, 1), (1, 2), (2, 1), (2, 2)$$

A formula for the marginal p.m.f. of X is

$$f_1(x) = f(x, 1) + f(x, 2) = \frac{x + 1}{12} + \frac{x + 2}{12} = \frac{2x + 3}{12} \quad \text{if } x = 1, 2$$

A formula for the marginal p.m.f. of Y is

$$f_2(y) = f(1, y) + f(2, y) = \frac{1 + y}{12} + \frac{2 + y}{12} = \frac{2y + 3}{12} \quad \text{if } y = 1, 2$$
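The marginal sums for Problem 1 can be checked mechanically. The following is a minimal Python sketch that is not part of the original slides; the dictionary `joint` and the helper `marginals` are illustrative names, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction

# Joint p.m.f. of Problem 1, keyed by (x, y) = (red weight, green weight).
joint = {
    (1, 1): Fraction(1, 6),
    (1, 2): Fraction(1, 4),
    (2, 1): Fraction(1, 4),
    (2, 2): Fraction(1, 3),
}


def marginals(joint):
    """Sum the joint p.m.f. over the other variable to obtain f1 and f2."""
    f1, f2 = {}, {}
    for (x, y), p in joint.items():
        f1[x] = f1.get(x, Fraction(0)) + p
        f2[y] = f2.get(y, Fraction(0)) + p
    return f1, f2


f1, f2 = marginals(joint)

# Compare the tabulated values with the closed-form formulas.
formula_joint_ok = all(p == Fraction(x + y, 12) for (x, y), p in joint.items())
formula_f1_ok = all(f1[x] == Fraction(2 * x + 3, 12) for x in f1)
formula_f2_ok = all(f2[y] == Fraction(2 * y + 3, 12) for y in f2)
print(f1, f2, formula_joint_ok, formula_f1_ok, formula_f2_ok)
```

The same `marginals` helper works for any finite joint p.m.f. stored as a dictionary, so it can be reused for the later discrete problems.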

Page 4: Sections 4.1, 4.2, 4.3

Sections 4.1, 4.2, 4.3

Important Definitions in the Text:

The definition of joint probability mass function (joint p.m.f.): Definition 4.1-1

The definitions of marginal probability mass function (marginal p.m.f.) and the independence of random variables: Definition 4.1-2

If the joint p.m.f. of (X, Y) is f(x,y), and S is the corresponding outcome space, then the mathematical expectation, or expected value, of u(X,Y) is

$$E[u(X,Y)] = \sum_{(x,y) \in S} u(x,y) f(x,y)$$

If the marginal p.m.f. of X is f1(x), and S1 is the corresponding outcome space, then E[v(X)] can be calculated from either

$$\sum_{x \in S_1} v(x) f_1(x) \quad \text{or} \quad \sum_{(x,y) \in S} v(x) f(x,y)$$

An analogous statement can be made about E[v(Y)].

Page 5: Sections 4.1, 4.2, 4.3

1. - continued

E(X) = (1)(5/12) + (2)(7/12) = 19/12

E(X^2) = (1)^2(5/12) + (2)^2(7/12) = 11/4

Var(X) = 11/4 – (19/12)^2 = 35/144

E(Y) = (1)(5/12) + (2)(7/12) = 19/12

E(Y^2) = (1)^2(5/12) + (2)^2(7/12) = 11/4

Var(Y) = 11/4 – (19/12)^2 = 35/144

Since f(x, y) ≠ f1(x)f2(y), the random variables X and Y are not independent.

Using the joint p.m.f., E(X + Y) = (1+1)(1/6) + (1+2)(1/4) + (2+1)(1/4) + (2+2)(1/3) = 19/6

Alternatively, E(X + Y) = E(X) + E(Y) = 19/12 + 19/12 = 19/6
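As a cross-check of the moments and the independence test for Problem 1, here is a short Python sketch under the same assumptions as the slide. The helper name `expect` is hypothetical; it simply implements the sum of u(x, y) f(x, y) over the outcome space.

```python
from fractions import Fraction

# Joint p.m.f. of Problem 1, keyed by (x, y).
joint = {
    (1, 1): Fraction(1, 6),
    (1, 2): Fraction(1, 4),
    (2, 1): Fraction(1, 4),
    (2, 2): Fraction(1, 3),
}


def expect(u, joint):
    """E[u(X, Y)]: sum of u(x, y) * f(x, y) over the outcome space."""
    return sum(u(x, y) * p for (x, y), p in joint.items())


mean_x = expect(lambda x, y: x, joint)
var_x = expect(lambda x, y: x * x, joint) - mean_x ** 2
mean_sum = expect(lambda x, y: x + y, joint)

# Marginals, obtained by summing the joint p.m.f. over the other variable.
f1 = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (1, 2)}
f2 = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (1, 2)}

# Independence requires f(x, y) = f1(x) f2(y) at every point of the space.
independent = all(joint[(x, y)] == f1[x] * f2[y] for (x, y) in joint)
print(mean_x, var_x, mean_sum, independent)
```

A single failing point is enough to rule out independence; here f(1, 1) = 1/6 while f1(1) f2(1) = 25/144.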

Page 6: Sections 4.1, 4.2, 4.3

Using the joint p.m.f., E(X – Y) = (1–1)(1/6) + (1–2)(1/4) + (2–1)(1/4) + (2–2)(1/3) = 0

Alternatively, E(X – Y) = E(X) – E(Y) = 19/12 – 19/12 = 0

E(X + Y) can be interpreted as the mean of the total weight of candy in the bag.

E(X – Y) can be interpreted as the mean of how much more the red candy in the bag weighs than the green candy.

E(XY) = (1)(1)(1/6) + (1)(2)(1/4) + (2)(1)(1/4) + (2)(2)(1/3) = 5/2

Cov(X,Y) =

Page 7: Sections 4.1, 4.2, 4.3

1. - continued

=

The least squares line for predicting Y from X is

The least squares line for predicting X from Y is

Page 8: Sections 4.1, 4.2, 4.3

The conditional p.m.f. of

Y | X = 1 is

Y | X = 2 is

For x = 1, 2, a formula for the conditional p.m.f. of Y | X = x is

Page 9: Sections 4.1, 4.2, 4.3

The conditional p.m.f. of

X | Y = 1 is

X | Y = 2 is

For y = 1, 2, a formula for the conditional p.m.f. of X | Y = y is

1. - continued

Page 10: Sections 4.1, 4.2, 4.3

E(Y | X = 1) =

E(Y2 | X = 1) =

Var(Y | X = 1) =

E(Y | X = 2) =

E(Y2 | X = 2) =

Var(Y | X = 2) =

Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ?

Page 11: Sections 4.1, 4.2, 4.3

E(X | Y = 1) =

E(X2 | Y = 1) =

Var(X | Y = 1) =

E(X | Y = 2) =

E(X2 | Y = 2) =

Var(X | Y = 2) =

Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ?

1. - continued

Page 12: Sections 4.1, 4.2, 4.3

2. An urn contains six chips, one $1 chip, two $2 chips, and three $3 chips. Two chips are selected at random and without replacement. The following random variables are defined:

X = dollar value of the first chip selected ,

Y = dollar value of the second chip selected .

The space of (X, Y) is {(1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}.

The joint p.m.f. of (X, Y) is

$$f(x, y) = \begin{cases} 1/15 & \text{if } (x, y) = (1, 2), (2, 1), (2, 2) \\ 1/10 & \text{if } (x, y) = (1, 3), (3, 1) \\ 1/5 & \text{if } (x, y) = (2, 3), (3, 2), (3, 3) \end{cases}$$

Page 13: Sections 4.1, 4.2, 4.3

2. - continued

The marginal p.m.f. of X is

$$f_1(x) = \begin{cases} 1/6 & \text{if } x = 1 \\ 1/3 & \text{if } x = 2 \\ 1/2 & \text{if } x = 3 \end{cases}$$

The marginal p.m.f. of Y is

$$f_2(y) = \begin{cases} 1/6 & \text{if } y = 1 \\ 1/3 & \text{if } y = 2 \\ 1/2 & \text{if } y = 3 \end{cases}$$

A formula for the joint p.m.f. of (X,Y) is f(x, y) = (There seems to be no easy formula.)

A formula for the marginal p.m.f. of X is f1(x) = x / 6 if x = 1, 2, 3

A formula for the marginal p.m.f. of Y is f2(y) = y / 6 if y = 1, 2, 3

Page 14: Sections 4.1, 4.2, 4.3

E(X) = 7/3

E(X^2) = 6

Var(X) = 6 – (7/3)^2 = 5/9

E(Y) = 7/3

E(Y^2) = 6

Var(Y) = 6 – (7/3)^2 = 5/9

Since f(x, y) ≠ f1(x)f2(y), the random variables X and Y are not independent.

P(X + Y < 4) = P[(X,Y) = (1,2)] + P[(X,Y) = (2,1)] = 1/15 + 1/15 = 2/15

Using the joint p.m.f., E(XY) = (1)(2)(2/30) + (1)(3)(3/30) + (2)(1)(2/30) + (2)(2)(2/30) + (2)(3)(6/30) + (3)(1)(3/30) + (3)(2)(6/30) + (3)(3)(6/30) = 16/3
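The joint p.m.f. for sampling without replacement can also be obtained by brute-force enumeration of the 30 equally likely ordered draws. The sketch below is illustrative only (the list `chips` and the variable names are assumptions, not taken from the slides).

```python
from fractions import Fraction
from itertools import permutations

# Dollar values of the six chips in the urn of Problem 2.
chips = [1, 2, 2, 3, 3, 3]

# permutations() treats chips as distinct by position, so this lists all
# 6 * 5 = 30 equally likely ordered draws without replacement.
draws = list(permutations(chips, 2))

joint = {}
for x, y in draws:
    joint[(x, y)] = joint.get((x, y), Fraction(0)) + Fraction(1, len(draws))

p_sum_below_4 = sum(p for (x, y), p in joint.items() if x + y < 4)
e_xy = sum(x * y * p for (x, y), p in joint.items())
print(sorted(joint.items()), p_sum_below_4, e_xy)
```

Because there is only one $1 chip, the pair (1, 1) never appears, which is why the space has eight points rather than nine.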

Page 15: Sections 4.1, 4.2, 4.3

Cov(X,Y) =

=

The least squares line for predicting Y from X is

The least squares line for predicting X from Y is

2. - continued

Page 16: Sections 4.1, 4.2, 4.3

The conditional p.m.f. of

Y | X = 1 is

Y | X = 2 is

Y | X = 3 is

For x = 1, 2, 3, a formula for the conditional p.m.f. of Y | X = x is

Page 17: Sections 4.1, 4.2, 4.3

The conditional p.m.f. of

X | Y = 1 is

X | Y = 2 is

X | Y = 3 is

2. - continued

For y = 1, 2, 3, a formula for the conditional p.m.f. of X | Y = y is

Page 18: Sections 4.1, 4.2, 4.3

E(Y | X = 1) =

E(Y2 | X = 1) =

Var(Y | X = 1) =

E(Y | X = 2) =

E(Y2 | X = 2) =

Var(Y | X = 2) =

E(Y | X = 3) =

E(Y2 | X = 3) =

Var(Y | X = 3) =

E(X | Y = 1) =

E(X2 | Y = 1) =

Var(X | Y = 1) =

E(X | Y = 2) =

E(X2 | Y = 2) =

Var(X | Y = 2) =

E(X | Y = 3) =

E(X2 | Y = 3) =

Var(X | Y = 3) =

Page 19: Sections 4.1, 4.2, 4.3

2. - continued

Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ?

Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ?

Page 20: Sections 4.1, 4.2, 4.3

3. An urn contains six chips, one $1 chip, two $2 chips, and three $3 chips. Two chips are selected at random and with replacement. The following random variables are defined:

X = dollar value of the first chip selected ,

Y = dollar value of the second chip selected .

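The space of (X, Y) is {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}.

The joint p.m.f. of (X, Y) is

$$f(x, y) = \frac{xy}{36} \quad \text{if } x = 1, 2, 3 \text{ and } y = 1, 2, 3$$

that is, f(1,1) = 1/36, f(1,2) = f(2,1) = 1/18, f(1,3) = f(3,1) = 1/12, f(2,2) = 1/9, f(2,3) = f(3,2) = 1/6, and f(3,3) = 1/4.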

Page 21: Sections 4.1, 4.2, 4.3

3. - continued

The marginal p.m.f. of X is

$$f_1(x) = \begin{cases} 1/6 & \text{if } x = 1 \\ 1/3 & \text{if } x = 2 \\ 1/2 & \text{if } x = 3 \end{cases}$$

The marginal p.m.f. of Y is

$$f_2(y) = \begin{cases} 1/6 & \text{if } y = 1 \\ 1/3 & \text{if } y = 2 \\ 1/2 & \text{if } y = 3 \end{cases}$$

A formula for the joint p.m.f. of (X,Y) is f(x, y) = (The formula was found previously)

A formula for the marginal p.m.f. of X is f1(x) = x / 6 if x = 1, 2, 3

A formula for the marginal p.m.f. of Y is f2(y) = y / 6 if y = 1, 2, 3

Page 22: Sections 4.1, 4.2, 4.3

E(X) = 7/3

E(X^2) = 6

Var(X) = 6 – (7/3)^2 = 5/9

E(Y) = 7/3

E(Y^2) = 6

Var(Y) = 6 – (7/3)^2 = 5/9

Since f(x, y) = f1(x)f2(y), the random variables X and Y are independent.

P(X + Y < 4) = P[(X,Y) = (1,1)] + P[(X,Y) = (1,2)] + P[(X,Y) = (2,1)] = 1/36 + 1/18 + 1/18 = 5/36

Page 23: Sections 4.1, 4.2, 4.3

3. - continued

$$E(XY) = \sum_{x=1}^{3} \sum_{y=1}^{3} (xy)\,\frac{xy}{36} = \sum_{x=1}^{3} \sum_{y=1}^{3} (xy)\,\frac{x}{6}\,\frac{y}{6} = \left(\sum_{x=1}^{3} x\,\frac{x}{6}\right)\left(\sum_{y=1}^{3} y\,\frac{y}{6}\right) = E(X)\,E(Y) = (7/3)(7/3) = 49/9$$
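The factorization E(XY) = E(X) E(Y) for sampling with replacement can be confirmed by enumerating the 36 equally likely ordered draws. This is a small Python sketch with illustrative names, not something taken from the slides.

```python
from fractions import Fraction
from itertools import product

# Dollar values of the six chips; with replacement, every ordered pair of
# positions is equally likely, giving 6 * 6 = 36 draws.
chips = [1, 2, 2, 3, 3, 3]
draws = list(product(chips, repeat=2))

joint = {}
for x, y in draws:
    joint[(x, y)] = joint.get((x, y), Fraction(0)) + Fraction(1, len(draws))

# The joint p.m.f. should match xy/36 and factor as (x/6)(y/6).
matches_formula = all(p == Fraction(x * y, 36) for (x, y), p in joint.items())
factors = all(p == Fraction(x, 6) * Fraction(y, 6) for (x, y), p in joint.items())

e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
print(matches_formula, factors, e_xy, e_x * e_y)
```

Comparing this with the without-replacement enumeration shows the only change is the set of equally likely draws, which is what turns dependence into independence.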

Cov(X,Y) =

=

The least squares line for predicting Y from X is

The least squares line for predicting X from Y is

Page 24: Sections 4.1, 4.2, 4.3

For x = 1, 2, 3,

the conditional p.m.f. of Y | X = x is

E(Y | X = x) =

Var(Y | X = x) =

For y = 1, 2, 3,

the conditional p.m.f. of X | Y = y is

E(X | Y = y) =

Var(X | Y = y) =

Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ?

Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ?

Page 25: Sections 4.1, 4.2, 4.3

For continuous type random variables (X, Y), the definitions of joint probability density function (joint p.d.f.), independence of X and Y, and mathematical expectation are each analogous to those for discrete type random variables, with summation signs replaced by integral signs.

The covariance between random variables X and Y is

The correlation between random variables X and Y is

Consider the equation of a line y = a + bx which comes “closest” to predicting the values of the random variable Y from the random variable X in the sense that E{[Y – (a + bX)]^2} is minimized.

Page 26: Sections 4.1, 4.2, 4.3

We let k(a,b) = E{[Y – (a + bX)]^2} =

To minimize k(a,b), we set the partial derivatives with respect to a and b equal to zero. (Note: This is textbook exercise 4.2-5.)

$$\frac{\partial k}{\partial a} =$$

$$\frac{\partial k}{\partial b} =$$

(Multiply the first equation by μ_X, subtract the resulting equation from the second equation, and solve for b. Then substitute in place of b in the first equation to solve for a.)
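A numerical illustration of this minimization may help. The sketch below assumes the standard closed-form solution of the two equations, b = Cov(X,Y) / Var(X) and a = μ_Y – b μ_X, and applies it to the joint p.m.f. of Problem 1; the function names `expect`, `least_squares_line`, and `k` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction


def expect(u, joint):
    """E[u(X, Y)] for a finite joint p.m.f. stored as {(x, y): probability}."""
    return sum(u(x, y) * p for (x, y), p in joint.items())


def least_squares_line(joint):
    """Return (a, b) minimizing k(a, b) = E{[Y - (a + bX)]^2}."""
    mu_x = expect(lambda x, y: x, joint)
    mu_y = expect(lambda x, y: y, joint)
    var_x = expect(lambda x, y: x * x, joint) - mu_x ** 2
    cov = expect(lambda x, y: x * y, joint) - mu_x * mu_y
    b = cov / var_x          # slope
    a = mu_y - b * mu_x      # intercept
    return a, b


def k(a, b, joint):
    """Mean squared prediction error of the line y = a + bx."""
    return expect(lambda x, y: (y - (a + b * x)) ** 2, joint)


# Joint p.m.f. of Problem 1 (candy weights).
joint = {
    (1, 1): Fraction(1, 6),
    (1, 2): Fraction(1, 4),
    (2, 1): Fraction(1, 4),
    (2, 2): Fraction(1, 3),
}

a, b = least_squares_line(joint)
step = Fraction(1, 100)
# Moving away from (a, b) in any of these directions should not decrease k.
is_local_min = all(
    k(a, b, joint) <= k(a + da, b + db, joint)
    for da in (-step, 0, step)
    for db in (-step, 0, step)
)
print(a, b, k(a, b, joint), is_local_min)
```

Since k(a, b) is a convex quadratic in a and b, a point where both partial derivatives vanish is the global minimum, so the perturbation check is only a sanity test.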

Page 27: Sections 4.1, 4.2, 4.3

b =

a =

The least squares line for predicting Y from X can be written

The least squares line for predicting X from Y can be written

The least squares line for predicting Y from X is

Page 28: Sections 4.1, 4.2, 4.3

The conditional p.m.f./p.d.f. of Y given X = x is defined to be

The conditional p.m.f./p.d.f. of X given Y = y is defined to be

The conditional mean of Y given X = x is defined to be

The conditional variance of Y given X = x is defined to be

Page 29: Sections 4.1, 4.2, 4.3

The conditional mean of X given Y = y and the conditional variance of X given Y = y are each defined similarly.

For continuous type random variables (X, Y), the definitions of conditional mean and variance are each analogous to those for discrete type random variables, with summation signs replaced by integral signs.

Suppose X and Y are two discrete type random variables, and E(Y | X = x) = a + bx. Then, for each possible value of x,

Multiplying each side by f1(x),

Summing each side over all x,

Page 30: Sections 4.1, 4.2, 4.3

Summing each side over all x,

Now, multiplying each side of by x f1(x),

Page 31: Sections 4.1, 4.2, 4.3

The two equations and

are essentially the same as those in the derivation of the least squares line for predicting Y from X. This derivation is analogous for continuous type random variables with summation signs replaced by integral signs.

Consequently, if E(Y | X = x) = a + bx (i.e., if E(Y | X = x) is a linear function of x), then a and b must be respectively the intercept and slope in the least squares line for predicting Y from X.

Similarly, if E(X | Y = y) = c + dy (i.e., if E(X | Y = y) is a linear function of y), then c and d must be respectively the intercept and slope in the least squares line for predicting X from Y.

Page 32: Sections 4.1, 4.2, 4.3

Suppose a set contains N = N1 + N2 + N3 items, where N1 items are of one type, N2 items are of a second type, and N3 items are of a third type; n items are selected from the N items at random and without replacement. If the random variable X1 is defined to be the number of selected n items that are of the first type, the random variable X2 is defined to be the number of selected n items that are of the second type, and the random variable X3 is defined to be the number of selected n items that are of the third type, then the joint distribution of (X1 , X2 , X3) is called a trivariate hypergeometric distribution. Since X3 = n – X1 – X2 , X3 is totally determined by X1 and X2 .The joint p.m.f. of (X1 , X2) is

Each Xi has a distribution.

If the number of types of items is any integer k > 1 with (X1 , X2 , … , Xk) defined in the natural way, then the joint p.d.f. is called a multivariate hypergeometric distribution.

Page 33: Sections 4.1, 4.2, 4.3

Suppose each in a sequence of independent trials must result in one of outcome 1, outcome 2, or outcome 3. The probability of outcome 1 on each trial is p1 , the probability of outcome 2 on each trial is p2 , and the probability of outcome 3 on each trial is p3 = 1 – p1 – p2 . If the random variable X1 is defined to be the number of the n trials resulting in outcome 1, the random variable X2 is defined to be the number of the n trials resulting in outcome 2, and the random variable X3 is defined to be the number of the n trials resulting in outcome 3, then the joint distribution of (X1 , X2 , X3) is called a trinomial distribution. Since X3 = n – X1 – X2 , X3 is totally determined by X1 and X2 .The joint p.m.f. of (X1 , X2) is

Each Xi has a distribution.

If the number of outcomes is any integer k > 1 with (X1 , X2 , … , Xk) defined in the natural way, then the joint p.d.f. is called a multinomial distribution.

Page 34: Sections 4.1, 4.2, 4.3

4. An urn contains 15 red chips, 10 blue chips, and 5 white chips. Eight chips are selected at random and without replacement. The following random variables are defined:

X1 = number of red chips selected ,

X2 = number of blue chips selected ,

X3 = number of white chips selected .

(a) Find the joint p.m.f. of (X1 , X2 , X3) .

(X1 , X2 , X3) have a trivariate hypergeometric distribution, and X3 = 8 – X1 – X2 is totally determined by X1 and X2 . The joint p.m.f. of (X1 , X2) is

$$f(x_1, x_2) = \frac{\binom{15}{x_1}\binom{10}{x_2}\binom{5}{8 - x_1 - x_2}}{\binom{30}{8}} \quad \text{if } x_1 = 0, 1, \dots, 8,\ x_2 = 0, 1, \dots, 8,\ 3 \le x_1 + x_2 \le 8$$

Page 35: Sections 4.1, 4.2, 4.3

(b) Find the marginal p.m.f. for each of X1 , X2 , and X3 .

Each of X1 , X2 , and X3 has a hypergeometric distribution.

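$$f_1(x_1) = \frac{\binom{15}{x_1}\binom{15}{8 - x_1}}{\binom{30}{8}} \quad \text{if } x_1 = 0, 1, \dots, 8$$

$$f_2(x_2) = \frac{\binom{10}{x_2}\binom{20}{8 - x_2}}{\binom{30}{8}} \quad \text{if } x_2 = 0, 1, \dots, 8$$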

Page 36: Sections 4.1, 4.2, 4.3

4. - continued

$$f_3(x_3) = \frac{\binom{5}{x_3}\binom{25}{8 - x_3}}{\binom{30}{8}} \quad \text{if } x_3 = 0, 1, \dots, 5$$

(c) Are X1 , X2 , and X3 independent? Why or why not?

X1, X2, X3 cannot possibly be independent, because any one of these random variables is totally determined by the other two.

Page 37: Sections 4.1, 4.2, 4.3

(d) Find the probability that at least two of the selected chips are blue or at least two chips are white.

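$$P(\{X_2 \ge 2\} \cup \{X_3 \ge 2\}) = 1 - P(\{X_2 \le 1\} \cap \{X_3 \le 1\})$$

$$= 1 - [P(X_2 = 0, X_3 = 0) + P(X_2 = 1, X_3 = 0) + P(X_2 = 0, X_3 = 1) + P(X_2 = 1, X_3 = 1)]$$

$$= 1 - \left[\frac{\binom{15}{8}}{\binom{30}{8}} + \frac{\binom{15}{7}\binom{10}{1}}{\binom{30}{8}} + \frac{\binom{15}{7}\binom{5}{1}}{\binom{30}{8}} + \frac{\binom{15}{6}\binom{10}{1}\binom{5}{1}}{\binom{30}{8}}\right]$$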
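The complement calculation in part (d) of Problem 4 can be evaluated exactly, and compared against a direct sum over the event itself. The following Python sketch is illustrative; the function name `trivariate_hypergeom_pmf` and its default arguments are assumptions chosen to match the urn in the problem.

```python
from fractions import Fraction
from math import comb


def trivariate_hypergeom_pmf(x1, x2, n1=15, n2=10, n3=5, n=8):
    """P(X1 = x1, X2 = x2) when n chips are drawn without replacement."""
    x3 = n - x1 - x2
    if min(x1, x2, x3) < 0 or x1 > n1 or x2 > n2 or x3 > n3:
        return Fraction(0)
    ways = comb(n1, x1) * comb(n2, x2) * comb(n3, x3)
    return Fraction(ways, comb(n1 + n2 + n3, n))


# Complement event: at most one blue chip AND at most one white chip.
complement = sum(
    trivariate_hypergeom_pmf(8 - x2 - x3, x2)
    for x2 in (0, 1)
    for x3 in (0, 1)
)
answer = 1 - complement

# Direct sum over the event {X2 >= 2} or {X3 >= 2}, as a cross-check.
direct = sum(
    trivariate_hypergeom_pmf(x1, x2)
    for x1 in range(9)
    for x2 in range(9)
    if x2 >= 2 or (8 - x1 - x2) >= 2
)
print(answer, float(answer), direct == answer)
```

The complement route needs only four terms, while the direct sum touches every point of the joint space; both give the same exact fraction.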

Page 38: Sections 4.1, 4.2, 4.3

4. - continued

(e) Find the conditional p.m.f. of X1 | x2 .

For x2 =

X1 | x2 can be treated as “the number of red chips selected when

X1 | x2 has a

distribution with p.m.f.

(f) E(X1 | x2) can be written as a linear function of x2 , since

E(X1 | x2) =

Therefore, the least squares

Page 39: Sections 4.1, 4.2, 4.3

line for predicting X1 from X2 must be

(g) E(X2 | x1) can be written as a linear function of x1 , since

E(X2 | x1) =

Therefore, the least squares line for predicting X2 from X1 must be

(h) Find the covariance and correlation between X1 and X2 by making use of the following facts (instead of using direct formulas):

The slope in the least squares line for predicting X1 from X2 is

The slope in the least squares line for predicting X2 from X1 is

The product of the slope in the least squares line for predicting X1 from X2 and the slope in the least squares line for predicting X2 from X1 is equal to .

Page 40: Sections 4.1, 4.2, 4.3

Suppose a set contains N = N1 + N2 + N3 items, where N1 items are of one type, N2 items are of a second type, and N3 items are of a third type; n items are selected from the N items at random and without replacement. If the random variable X1 is defined to be the number of selected n items that are of the first type, the random variable X2 is defined to be the number of selected n items that are of the second type, and the random variable X3 is defined to be the number of selected n items that are of the third type, then the joint distribution of (X1 , X2 , X3) is called a trivariate hypergeometric distribution. Since X3 = n – X1 – X2 , X3 is totally determined by X1 and X2 . The joint p.m.f. of (X1 , X2) is

$$f(x_1, x_2) = \frac{\binom{N_1}{x_1}\binom{N_2}{x_2}\binom{N - N_1 - N_2}{n - x_1 - x_2}}{\binom{N}{n}} \quad \text{if } x_1 \text{ and } x_2 \text{ are “appropriate” integers}$$

Each Xi has a hypergeometric distribution.

If the number of types of items is any integer k > 1 with (X1 , X2 , … , Xk) defined in the natural way, then the joint p.d.f. is called a multivariate hypergeometric distribution.

Page 41: Sections 4.1, 4.2, 4.3

5. An urn contains 15 red chips, 10 blue chips, and 5 white chips. Eight chips are selected at random and with replacement. The following random variables are defined:

X1 = number of red chips selected ,

X2 = number of blue chips selected ,

X3 = number of white chips selected .

(a) Find the joint p.m.f. of (X1 , X2 , X3) .

(X1 , X2 , X3) have a trinomial distribution, and X3 = 8 – X1 – X2 is totally determined by X1 and X2 . The joint p.m.f. of (X1 , X2) is

$$f(x_1, x_2) = \frac{8!}{x_1!\, x_2!\, (8 - x_1 - x_2)!} \left(\frac{1}{2}\right)^{x_1} \left(\frac{1}{3}\right)^{x_2} \left(\frac{1}{6}\right)^{8 - x_1 - x_2} \quad \text{if } x_1 = 0, 1, \dots, 8,\ x_2 = 0, 1, \dots, 8,\ x_1 + x_2 \le 8$$

Page 42: Sections 4.1, 4.2, 4.3

(b) Find the marginal p.m.f. for each of X1 , X2 , and X3 .

Each of X1 , X2 , and X3 has a binomial distribution.

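$$f_1(x_1) = \frac{8!}{x_1!\, (8 - x_1)!} \left(\frac{1}{2}\right)^{8} \quad \text{if } x_1 = 0, 1, \dots, 8$$

$$f_2(x_2) = \frac{8!}{x_2!\, (8 - x_2)!} \left(\frac{1}{3}\right)^{x_2} \left(\frac{2}{3}\right)^{8 - x_2} \quad \text{if } x_2 = 0, 1, \dots, 8$$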

Page 43: Sections 4.1, 4.2, 4.3

5. - continued

$$f_3(x_3) = \frac{8!}{x_3!\, (8 - x_3)!} \left(\frac{1}{6}\right)^{x_3} \left(\frac{5}{6}\right)^{8 - x_3} \quad \text{if } x_3 = 0, 1, \dots, 8$$

(c) Are X1 , X2 , and X3 independent? Why or why not?

X1, X2, X3 cannot possibly be independent, because any one of these random variables is totally determined by the other two.

Page 44: Sections 4.1, 4.2, 4.3

(d) Find the probability that at least two of the selected chips are blue or at least two chips are white.

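$$P(\{X_2 \ge 2\} \cup \{X_3 \ge 2\}) = 1 - P(\{X_2 \le 1\} \cap \{X_3 \le 1\})$$

$$= 1 - [P(X_2 = 0, X_3 = 0) + P(X_2 = 1, X_3 = 0) + P(X_2 = 0, X_3 = 1) + P(X_2 = 1, X_3 = 1)]$$

$$= 1 - \left[\left(\frac{1}{2}\right)^{8} + \frac{8!}{7!\,1!}\left(\frac{1}{2}\right)^{7}\left(\frac{1}{3}\right) + \frac{8!}{7!\,1!}\left(\frac{1}{2}\right)^{7}\left(\frac{1}{6}\right) + \frac{8!}{6!\,1!\,1!}\left(\frac{1}{2}\right)^{6}\left(\frac{1}{3}\right)\left(\frac{1}{6}\right)\right]$$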
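For the with-replacement version, the same complement calculation uses the trinomial p.m.f. The sketch below evaluates it exactly and cross-checks it against a direct sum; the function name `trinomial_pmf` and the constant names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

# Per-draw probabilities for red and blue chips (15/30 and 10/30); the
# white probability is the remainder, 5/30 = 1/6.
P_RED, P_BLUE = Fraction(1, 2), Fraction(1, 3)
N_DRAWS = 8


def trinomial_pmf(x1, x2, n=N_DRAWS, p1=P_RED, p2=P_BLUE):
    """P(X1 = x1, X2 = x2) for n independent draws with replacement."""
    x3 = n - x1 - x2
    if x1 < 0 or x2 < 0 or x3 < 0:
        return Fraction(0)
    coeff = factorial(n) // (factorial(x1) * factorial(x2) * factorial(x3))
    return coeff * p1 ** x1 * p2 ** x2 * (1 - p1 - p2) ** x3


# Complement event: at most one blue chip AND at most one white chip.
complement = sum(
    trinomial_pmf(N_DRAWS - x2 - x3, x2) for x2 in (0, 1) for x3 in (0, 1)
)
answer = 1 - complement

# Direct sum over the event {X2 >= 2} or {X3 >= 2}, as a cross-check.
direct = sum(
    trinomial_pmf(x1, x2)
    for x1 in range(N_DRAWS + 1)
    for x2 in range(N_DRAWS + 1 - x1)
    if x2 >= 2 or (N_DRAWS - x1 - x2) >= 2
)
print(answer, float(answer), direct == answer)
```

Only the p.m.f. changes relative to the hypergeometric case; the event decomposition is identical.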

Page 45: Sections 4.1, 4.2, 4.3

5. - continued

(e) Find the conditional p.m.f. of X1 | x2 .

For x2 =

X1 | x2 can be treated as “the number of red chips selected when

X1 | x2 has a

distribution with p.m.f.

(f) E(X1 | x2) can be written as a linear function of x2 , since

E(X1 | x2) =

Therefore, the least squares

Page 46: Sections 4.1, 4.2, 4.3

line for predicting X1 from X2 must be

(g) E(X2 | x1) can be written as a linear function of x1 , since

E(X2 | x1) =

Therefore, the least squares line for predicting X2 from X1 must be

(h) Find the covariance and correlation between X1 and X2 by making use of the following facts (instead of using direct formulas):

The slope in the least squares line for predicting X1 from X2 is

The slope in the least squares line for predicting X2 from X1 is

The product of the slope in the least squares line for predicting X1 from X2 and the slope in the least squares line for predicting X2 from X1 is equal to .

Page 47: Sections 4.1, 4.2, 4.3

Suppose each in a sequence of independent trials must result in one of outcome 1, outcome 2, or outcome 3. The probability of outcome 1 on each trial is p1 , the probability of outcome 2 on each trial is p2 , and the probability of outcome 3 on each trial is p3 = 1 – p1 – p2 . If the random variable X1 is defined to be the number of the n trials resulting in outcome 1, the random variable X2 is defined to be the number of the n trials resulting in outcome 2, and the random variable X3 is defined to be the number of the n trials resulting in outcome 3, then the joint distribution of (X1 , X2 , X3) is called a trinomial distribution. Since X3 = n – X1 – X2 , X3 is totally determined by X1 and X2 . The joint p.m.f. of (X1 , X2) is

$$f(x_1, x_2) = \frac{n!}{x_1!\, x_2!\, (n - x_1 - x_2)!}\, p_1^{x_1}\, p_2^{x_2}\, (1 - p_1 - p_2)^{n - x_1 - x_2} \quad \text{if } x_1 \text{ and } x_2 \text{ are non-negative integers such that } x_1 + x_2 \le n$$

Each Xi has a b(n, pi) distribution.

If the number of outcomes is any integer k > 1 with (X1 , X2 , … , Xk) defined in the natural way, then the joint p.d.f. is called a multinomial distribution.

Page 48: Sections 4.1, 4.2, 4.3

6. One chip is selected from each of two urns, one containing three chips labeled distinctively with the integers 1 through 3 and the other containing two chips labeled distinctively with the integers 1 and 2. The following random variables are defined:

X = largest integer among the labels on the selected chips ,

Y = smallest integer among the labels on the selected chips .

The space of (X, Y) is {(1,1), (2,1), (3,1), (2,2), (3,2)}.

The joint p.m.f. of (X, Y) is

$$f(x, y) = \begin{cases} 1/6 & \text{if } (x, y) = (1, 1), (3, 1), (2, 2), (3, 2) \\ 1/3 & \text{if } (x, y) = (2, 1) \end{cases}$$

(Note: We immediately see that X and Y cannot be independent, since the joint space is not “rectangular”.)

Page 49: Sections 4.1, 4.2, 4.3

The marginal p.m.f. of X is

$$f_1(x) = \begin{cases} 1/6 & \text{if } x = 1 \\ 1/x & \text{if } x = 2, 3 \end{cases}$$

The marginal p.m.f. of Y is

$$f_2(y) = \frac{3 - y}{3} \quad \text{if } y = 1, 2$$

E(X) = 13/6

E(X^2) = 31/6

Var(X) = 31/6 – (13/6)^2 = 17/36

E(Y) = 4/3

E(Y^2) = 2

Var(Y) = 2 – (4/3)^2 = 2/9

Page 50: Sections 4.1, 4.2, 4.3

6. - continued

Since f(x, y) ≠ f1(x)f2(y), the random variables X and Y are not independent (as we previously noted).

Using the joint p.m.f., E(XY) = (1)(1)(1/6) + (3)(1)(1/6) + (2)(2)(1/6) + (3)(2)(1/6) + (2)(1)(1/3) = 3
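The joint p.m.f. of the largest and smallest labels in Problem 6 can be rebuilt by listing the six equally likely selections. This is a minimal Python sketch with illustrative variable names (`urn_a`, `urn_b`), not part of the original slides.

```python
from fractions import Fraction
from itertools import product

urn_a = [1, 2, 3]   # chips in the first urn
urn_b = [1, 2]      # chips in the second urn

# One chip from each urn: 3 * 2 = 6 equally likely selections.
selections = list(product(urn_a, urn_b))

joint = {}
for first, second in selections:
    key = (max(first, second), min(first, second))   # (X, Y) = (largest, smallest)
    joint[key] = joint.get(key, Fraction(0)) + Fraction(1, len(selections))

e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
print(sorted(joint.items()), e_x, e_y, e_xy)
```

The point (2, 1) is reached by two selections, (1, 2) and (2, 1), which is why it carries probability 1/3 while the other points carry 1/6.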

Cov(X,Y) =

=

The least squares line for predicting Y from X is

Page 51: Sections 4.1, 4.2, 4.3

The least squares line for predicting X from Y is

The conditional p.m.f. of

Y | X = 1 is

Y | X = 2 is

Y | X = 3 is

Page 52: Sections 4.1, 4.2, 4.3

6. - continued

The conditional p.m.f. of

X | Y = 1 is

X | Y = 2 is

Page 53: Sections 4.1, 4.2, 4.3

E(Y | X = 1) =

E(Y2 | X = 1) =

Var(Y | X = 1) =

E(Y | X = 2) =

E(Y2 | X = 2) =

Var(Y | X = 2) =

E(Y | X = 3) =

E(Y2 | X = 3) =

Var(Y | X = 3) =

Page 54: Sections 4.1, 4.2, 4.3

E(X | Y = 1) =

E(X2 | Y = 1) =

Var(X | Y = 1) =

E(X | Y = 2) =

E(X2 | Y = 2) =

Var(X | Y = 2) =

6. - continued

Page 55: Sections 4.1, 4.2, 4.3

Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ?

Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ?


Page 57: Sections 4.1, 4.2, 4.3

9. Random variables X and Y have joint p.d.f.

$$f(x, y) = \frac{5xy^2}{2} \quad \text{if } 0 < x/2 < y < 1$$

The space of (X, Y) displayed graphically is as follows:

[Figure: the space of (X, Y) in the x-y plane, bounded below by the line y = x / 2]

(Note: We immediately see that X and Y cannot be independent, since the joint space is not “rectangular”.)

Page 58: Sections 4.1, 4.2, 4.3

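Event A = {(x,y) | 1/2 < x < 1 , 1/2 < y < 3/2} displayed graphically is as follows:

[Figure: the rectangle A with corners (1/2, 1/2), (1, 1/2), (1, 3/2), (1/2, 3/2) drawn over the space of (X, Y)]

$$P(A) = \iint_A f(x, y)\,dx\,dy = \int_{1/2}^{1}\int_{1/2}^{1} \frac{5xy^2}{2}\,dx\,dy = \int_{1/2}^{1} \left[\frac{5x^2y^2}{4}\right]_{x=1/2}^{1} dy = \int_{1/2}^{1} \frac{15y^2}{16}\,dy = \left[\frac{5y^3}{16}\right]_{y=1/2}^{1} = \frac{35}{128}$$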
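The value of P(A) in Problem 9 can be sanity-checked numerically. The sketch below uses a plain midpoint rule over the whole rectangle A and lets the density return zero outside the space of (X, Y); the function names and the grid size `n` are assumptions made for illustration, and the result is an approximation rather than an exact value.

```python
def density(x, y):
    """Joint p.d.f. of Problem 9: 5xy^2/2 on 0 < x/2 < y < 1, zero elsewhere."""
    return 2.5 * x * y * y if 0 < x / 2 < y < 1 else 0.0


def midpoint_double_integral(g, x_lo, x_hi, y_lo, y_hi, n=400):
    """Approximate the integral of g over a rectangle with an n-by-n midpoint grid."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += g(x, y)
    return total * hx * hy


# A = {1/2 < x < 1, 1/2 < y < 3/2}; the part with y > 1 contributes nothing.
p_a = midpoint_double_integral(density, 0.5, 1.0, 0.5, 1.5)
print(p_a, 35 / 128)
```

Integrating over all of A, including the strip 1 < y < 3/2 where the density is zero, shows why the upper limit for y in the worked integral is 1 rather than 3/2.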

Page 59: Sections 4.1, 4.2, 4.3

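9. - continued

The marginal p.d.f. of X is

$$f_1(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{x/2}^{1} \frac{5xy^2}{2}\,dy = \left[\frac{5xy^3}{6}\right]_{y=x/2}^{1} = \frac{40x - 5x^4}{48} \quad \text{if } 0 < x < 2$$

$$E(X) = \int_{0}^{2} x\,\frac{40x - 5x^4}{48}\,dx = \int_{0}^{2} \frac{40x^2 - 5x^5}{48}\,dx = \left[\frac{5x^3}{18} - \frac{5x^6}{288}\right]_{x=0}^{2} = \frac{10}{9}$$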

Page 60: Sections 4.1, 4.2, 4.3

$$E(X^2) = \int_{0}^{2} x^2\,\frac{40x - 5x^4}{48}\,dx = \int_{0}^{2} \frac{40x^3 - 5x^6}{48}\,dx = \left[\frac{5x^4}{24} - \frac{5x^7}{336}\right]_{x=0}^{2} = \frac{10}{7}$$

$$\mathrm{Var}(X) = \frac{110}{567}$$

Page 61: Sections 4.1, 4.2, 4.3

9. - continued

The marginal p.d.f. of Y is

$$f_2(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{2y} \frac{5xy^2}{2}\,dx = \left[\frac{5x^2y^2}{4}\right]_{x=0}^{2y} = 5y^4 \quad \text{if } 0 < y < 1$$

E(Y) = 5/6

Page 62: Sections 4.1, 4.2, 4.3

E(Y^2) = 5/7

Var(Y) = 5/252

Page 63: Sections 4.1, 4.2, 4.3

9. - continued

Since f(x, y) ≠ f1(x)f2(y), the random variables X and Y are not independent (as we previously noted).

$$E(XY) = \iint xy\,f(x, y)\,dx\,dy = \int_{0}^{1}\int_{0}^{2y} xy\,\frac{5xy^2}{2}\,dx\,dy = \int_{0}^{1}\int_{0}^{2y} \frac{5x^2y^3}{2}\,dx\,dy = \int_{0}^{1} \left[\frac{5x^3y^3}{6}\right]_{x=0}^{2y} dy = \int_{0}^{1} \frac{20y^6}{3}\,dy = \left[\frac{20y^7}{21}\right]_{y=0}^{1} = \frac{20}{21}$$

Cov(X,Y) =

ρ =

Page 64: Sections 4.1, 4.2, 4.3

The least squares line for predicting Y from X is

The least squares line for predicting X from Y is

For 0 < x < 2, the conditional p.d.f. of Y | X = x is

Page 65: Sections 4.1, 4.2, 4.3

For 0 < x < 2, E(Y | X = x) =

For 0 < x < 2, E(Y2 | X = x) =

For 0 < x < 2, Var(Y | X = x) =

9. - continued

Page 66: Sections 4.1, 4.2, 4.3

For 0 < y < 1, the conditional p.d.f. of X | Y = y is

For 0 < y < 1, E(X | Y = y) =

For 0 < y < 1, E(X2 | Y = y) =

Page 67: Sections 4.1, 4.2, 4.3

For 0 < y < 1, Var(X | Y = y) =

Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ?

Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ?

9. - continued

Page 68: Sections 4.1, 4.2, 4.3

7. Random variables X and Y have joint p.d.f.

$$f(x, y) = \frac{x + y}{8} \quad \text{if } 0 < x < 2,\ 0 < y < 2$$

The space of (X, Y) displayed graphically is as follows:

[Figure: the square with corners (0,0), (2,0), (2,2), (0,2) in the x-y plane]

Page 69: Sections 4.1, 4.2, 4.3

The set A = {(x, y) | 1/2 < x < 1 , 1/2 < y < 3/2} is graphically displayed as follows:

[Figure: the rectangle A with corners (1/2, 1/2), (1, 1/2), (1, 3/2), (1/2, 3/2) inside the square with corners (0,0), (2,0), (2,2), (0,2)]

The set B = {(x, y) | x > y} is graphically displayed as follows:

[Figure: the region B below the line y = x inside the same square]

Page 70: Sections 4.1, 4.2, 4.3

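7. - continued

$$P(A) = P(1/2 < X < 1,\ 1/2 < Y < 3/2) = \iint_A f(x, y)\,dx\,dy = \int_{1/2}^{3/2}\int_{1/2}^{1} \frac{x + y}{8}\,dx\,dy = \int_{1/2}^{3/2} \left[\frac{x^2 + 2xy}{16}\right]_{x=1/2}^{1} dy$$

$$= \int_{1/2}^{3/2} \left(\frac{1 + 2y}{16} - \frac{1/4 + y}{16}\right) dy = \int_{1/2}^{3/2} \frac{3 + 4y}{64}\,dy = \left[\frac{3y + 2y^2}{64}\right]_{y=1/2}^{3/2} = \frac{7}{64}$$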

Page 71: Sections 4.1, 4.2, 4.3

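$$P(B) = P(X > Y) = \iint_{x > y} f(x, y)\,dx\,dy = \int_{0}^{2}\int_{y}^{2} \frac{x + y}{8}\,dx\,dy \quad \text{or} \quad \int_{0}^{2}\int_{0}^{x} \frac{x + y}{8}\,dy\,dx$$

$$= \int_{0}^{2} \left[\frac{2xy + y^2}{16}\right]_{y=0}^{x} dx = \int_{0}^{2} \frac{3x^2}{16}\,dx = \left[\frac{x^3}{16}\right]_{x=0}^{2} = \frac{1}{2}$$

(As a check, f(x, y) is symmetric in x and y, so P(X > Y) = P(Y > X) = 1/2.)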

Page 72: Sections 4.1, 4.2, 4.3

7. - continued

The marginal p.d.f. of X is

$$f_1(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{0}^{2} \frac{x + y}{8}\,dy = \left[\frac{2xy + y^2}{16}\right]_{y=0}^{2} = \frac{x + 1}{4} \quad \text{if } 0 < x < 2$$

E(X) = 7/6

E(X^2) = 5/3

Var(X) = 11/36

Page 73: Sections 4.1, 4.2, 4.3

The marginal p.d.f. of Y is

$$f_2(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{2} \frac{x + y}{8}\,dx = \frac{y + 1}{4} \quad \text{if } 0 < y < 2$$

E(Y) = 7/6

E(Y^2) = 5/3

Var(Y) = 11/36

Since f(x, y) ≠ f1(x)f2(y), the random variables X and Y are not independent.

Page 74: Sections 4.1, 4.2, 4.3

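7. - continued

$$E(XY) = \iint xy\,f(x, y)\,dx\,dy = \int_{0}^{2}\int_{0}^{2} \frac{x^2y + xy^2}{8}\,dx\,dy = \int_{0}^{2} \left[\frac{2x^3y + 3x^2y^2}{48}\right]_{x=0}^{2} dy = \int_{0}^{2} \frac{4y + 3y^2}{12}\,dy = \left[\frac{2y^2 + y^3}{12}\right]_{y=0}^{2} = \frac{4}{3}$$

Cov(X,Y) =

ρ =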
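The expectations for Problem 7 can be approximated numerically as a check on the integration. The sketch below is an assumption-laden illustration (hypothetical helper `expect`, midpoint rule on a 200-by-200 grid), so its outputs are approximate and only expected to agree with the exact values to several decimal places.

```python
def density(x, y):
    """Joint p.d.f. of Problem 7: (x + y) / 8 on the square 0 < x, y < 2."""
    return (x + y) / 8 if 0 < x < 2 and 0 < y < 2 else 0.0


def expect(u, n=200):
    """Midpoint-rule approximation of E[u(X, Y)] over the square (0, 2) x (0, 2)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += u(x, y) * density(x, y)
    return total * h * h


total_mass = expect(lambda x, y: 1.0)   # should be close to 1
mean_x = expect(lambda x, y: x)         # exact value 7/6
mean_y = expect(lambda x, y: y)         # exact value 7/6
e_xy = expect(lambda x, y: x * y)       # exact value 4/3
cov_xy = e_xy - mean_x * mean_y         # E(XY) - E(X)E(Y)
print(total_mass, mean_x, mean_y, e_xy, cov_xy)
```

The covariance line uses the identity Cov(X,Y) = E(XY) – E(X)E(Y); a nonzero value is consistent with X and Y not being independent.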

Page 75: Sections 4.1, 4.2, 4.3

The least squares line for predicting Y from X is

The least squares line for predicting X from Y is

For 0 < x < 2, the conditional p.d.f. of Y | X = x is

Page 76: Sections 4.1, 4.2, 4.3

For 0 < x < 2, E(Y | X = x) =

For 0 < x < 2, E(Y2 | X = x) =

For 0 < x < 2, Var(Y | X = x) =

7. - continued

Page 77: Sections 4.1, 4.2, 4.3

For 0 < y < 2, the conditional p.d.f. of X | Y = y is

For 0 < y < 2, E(X | Y = y) =

For 0 < y < 2, E(X2 | Y = y) =

Page 78: Sections 4.1, 4.2, 4.3

For 0 < y < 2, Var(X | Y = y) =

Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx ?

Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy ?

7. - continued

Page 79: Sections 4.1, 4.2, 4.3

8. Random variables X and Y have joint p.d.f.

$$f(x, y) = \frac{y - 1}{2x^2} \quad \text{if } 1 < x,\ 1 < y < 3$$

The space of (X, Y) displayed graphically is as follows:

[Figure: the semi-infinite strip x > 1, 1 < y < 3 in the x-y plane, with corners (1,1) and (1,3)]

Page 80: Sections 4.1, 4.2, 4.3

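8. - continued

Event A = {(x,y) | 1 < x < 3 , 1 < y < (x+1)/2} displayed graphically is as follows:

[Figure: the triangle A bounded by y = 1, x = 3, and the line y = (x + 1) / 2, with corners (1,1), (3,1), (3,2)]

$$P(A) = \iint_A f(x, y)\,dx\,dy = \int_{1}^{3}\int_{1}^{(x+1)/2} \frac{y - 1}{2x^2}\,dy\,dx =$$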

Page 81: Sections 4.1, 4.2, 4.3

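$$\int_{1}^{3}\int_{1}^{(x+1)/2} \frac{y - 1}{2x^2}\,dy\,dx = \int_{1}^{3} \left[\frac{y^2 - 2y}{4x^2}\right]_{y=1}^{(x+1)/2} dx = \int_{1}^{3} \left(\frac{(x + 1)^2 - 4(x + 1)}{16x^2} + \frac{1}{4x^2}\right) dx = \int_{1}^{3} \frac{x^2 - 2x + 1}{16x^2}\,dx$$

$$= \int_{1}^{3} \left(\frac{1}{16} - \frac{1}{8x} + \frac{1}{16x^2}\right) dx = \left[\frac{x}{16} - \frac{\ln x}{8} - \frac{1}{16x}\right]_{x=1}^{3} = \frac{3}{16} - \frac{\ln 3}{8} - \frac{1}{48} - 0 = \frac{1}{6} - \frac{\ln 3}{8}$$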

Page 82: Sections 4.1, 4.2, 4.3

8. - continued

Event B = {(x,y) | x > y} displayed graphically is as follows:

[Figure: the part of the space of (X, Y) below the line y = x, which passes through (1,1) and (3,3)]

Note that describing B as {1 < x < 3 , 1 < y < x} ∪ {3 < x < ∞ , 1 < y < 3} makes the integration more work than describing B as {1 < y < 3 , y < x < ∞}.

$$P(B) = \iint_{x > y} f(x, y)\,dx\,dy = \int_{1}^{3}\int_{y}^{\infty} \frac{y - 1}{2x^2}\,dx\,dy = \int_{1}^{3} \left[\frac{1 - y}{2x}\right]_{x=y}^{\infty} dy =$$

Page 83: Sections 4.1, 4.2, 4.3

$$\int_{1}^{3} \left[\frac{1 - y}{2x}\right]_{x=y}^{\infty} dy = \int_{1}^{3} \frac{y - 1}{2y}\,dy = \left[\frac{y}{2} - \frac{\ln y}{2}\right]_{y=1}^{3} = 1 - \frac{\ln 3}{2}$$

Page 84: Sections 4.1, 4.2, 4.3

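8. - continued

The marginal p.d.f. of X is

$$f_1(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{1}^{3} \frac{y - 1}{2x^2}\,dy = \left[\frac{y^2 - 2y}{4x^2}\right]_{y=1}^{3} = \frac{1}{x^2} \quad \text{if } 1 < x$$

$$E(X) = \int_{1}^{\infty} x\,\frac{1}{x^2}\,dx = \int_{1}^{\infty} \frac{1}{x}\,dx = \Big[\ln(x)\Big]_{x=1}^{\infty} =$$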

Page 85: Sections 4.1, 4.2, 4.3

$$E(X^2) = \int_{1}^{\infty} x^2\,\frac{1}{x^2}\,dx = \int_{1}^{\infty} dx =$$

Var(X) =

Page 86: Sections 4.1, 4.2, 4.3

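8. - continued

The marginal p.d.f. of Y is

$$f_2(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{1}^{\infty} \frac{y - 1}{2x^2}\,dx = \left[\frac{1 - y}{2x}\right]_{x=1}^{\infty} = \frac{y - 1}{2} \quad \text{if } 1 < y < 3$$

$$E(Y) = \int_{1}^{3} y\,\frac{y - 1}{2}\,dy = \int_{1}^{3} \frac{y^2 - y}{2}\,dy = \left[\frac{y^3}{6} - \frac{y^2}{4}\right]_{y=1}^{3} = \frac{7}{3}$$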

Page 87: Sections 4.1, 4.2, 4.3

$$E(Y^2) = \int_{1}^{3} y^2\,\frac{y - 1}{2}\,dy = \int_{1}^{3} \frac{y^3 - y^2}{2}\,dy = \left[\frac{y^4}{8} - \frac{y^3}{6}\right]_{y=1}^{3} = \frac{17}{3}$$

$$\mathrm{Var}(Y) = \frac{2}{9}$$

Page 88: Sections 4.1, 4.2, 4.3

8. - continued

Since f(x, y) = f1(x)f2(y), the random variables X and Y are independent.

$$E(XY) = \int_{1}^{3}\int_{1}^{\infty} xy\,\frac{y - 1}{2x^2}\,dx\,dy = \int_{1}^{\infty} \frac{1}{x}\,dx \int_{1}^{3} \frac{y^2 - y}{2}\,dy = \frac{7}{3} \int_{1}^{\infty} \frac{1}{x}\,dx =$$

Cov(X,Y) =

ρ =

Page 89: Sections 4.1, 4.2, 4.3

The least squares line for predicting Y from X is

The least squares line for predicting X from Y is

For 1 < x , the conditional p.d.f. of Y | X = x is

For 1 < y < 3, the conditional p.d.f. of X | Y = y is

For 1 < x , E(Y | X = x) =

For 1 < x , Var(Y | X = x) =

For 1 < y < 3, E(X | Y = y) =

For 1 < y < 3, Var(X | Y = y) =