Topic 5 - Joint distributions and the CLT Joint distributions - pages 145 - 156 Central Limit Theorem - pages 183 - 185

Transcript of Topic 5 - Joint distributions and the CLT

Page 1:

Topic 5 - Joint distributions and the CLT

• Joint distributions - pages 145 - 156
• Central Limit Theorem - pages 183 - 185

Page 2:

Joint distributions

• Oftentimes, we are interested in more than one random variable at a time.

• For example, what is the probability that a car will have at least one engine problem and at least one blowout during the same week?

• X = # of engine problems in a week
• Y = # of blowouts in a week
• P(X ≥ 1, Y ≥ 1) is what we are looking for
• To understand these sorts of probabilities, we need to develop joint distributions.

Page 3:

Discrete distributions

• A discrete joint probability mass function is given by

f(x,y) = P(X = x, Y = y), where

1. f(x,y) ≥ 0 for all x, y
2. Σ_{all (x,y)} f(x,y) = 1
3. P((X,Y) ∈ A) = Σ_{all (x,y) ∈ A} f(x,y)
4. E(h(X,Y)) = Σ_{all (x,y)} h(x,y) f(x,y)

Page 4:

Return to the car example

• Consider the following joint pmf for X and Y

• P(X ≥ 1, Y ≥ 1) =
• P(X ≥ 1) =
• E(X + Y) =

(One way to compute these from the table is sketched below it.)

X\Y 0 1 2 3 4

0 1/2 1/16 1/32 1/32 1/32

1 1/16 1/32 1/32 1/32 1/32

2 1/32 1/32 1/32 1/32 1/32
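A minimal Python sketch (not part of the original slides) of how the table above can be stored as a dictionary and used to compute the three quantities asked for; the variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

# Joint pmf from the table: keys are (x, y), values are P(X = x, Y = y).
pmf = {(0, 0): F(1, 2),  (0, 1): F(1, 16), (0, 2): F(1, 32), (0, 3): F(1, 32), (0, 4): F(1, 32),
       (1, 0): F(1, 16), (1, 1): F(1, 32), (1, 2): F(1, 32), (1, 3): F(1, 32), (1, 4): F(1, 32),
       (2, 0): F(1, 32), (2, 1): F(1, 32), (2, 2): F(1, 32), (2, 3): F(1, 32), (2, 4): F(1, 32)}

assert sum(pmf.values()) == 1                                        # condition 2: probabilities sum to 1

p_both = sum(p for (x, y), p in pmf.items() if x >= 1 and y >= 1)    # P(X >= 1, Y >= 1)
p_x    = sum(p for (x, y), p in pmf.items() if x >= 1)               # P(X >= 1)
e_sum  = sum((x + y) * p for (x, y), p in pmf.items())               # E(X + Y), property 4 with h(x,y) = x + y

print(p_both, p_x, e_sum)
```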

Page 5:

Joint to marginals

• The probability mass functions for X and Y individually (called marginals) are given by

fX(x) = Σ_{all y} f(x,y),    fY(y) = Σ_{all x} f(x,y)

• Returning to the car example:

fX(x) =

fY(y) =

E(X) =

E(Y) =
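A short Python sketch (illustrative, not from the slides) of how these marginals and their means follow from the joint table; the pmf is rebuilt compactly since all remaining cells equal 1/32.

```python
from fractions import Fraction as F
from collections import defaultdict

# Joint pmf from the car-example table: P(0,0)=1/2, P(0,1)=P(1,0)=1/16, every other cell 1/32.
pmf = {(x, y): F(1, 32) for x in range(3) for y in range(5)}
pmf[(0, 0)], pmf[(0, 1)], pmf[(1, 0)] = F(1, 2), F(1, 16), F(1, 16)

fx, fy = defaultdict(F), defaultdict(F)          # marginal pmfs of X and Y
for (x, y), p in pmf.items():
    fx[x] += p                                   # f_X(x) = sum over y of f(x, y)
    fy[y] += p                                   # f_Y(y) = sum over x of f(x, y)

ex = sum(x * p for x, p in fx.items())           # E(X)
ey = sum(y * p for y, p in fy.items())           # E(Y)
print(dict(fx), dict(fy), ex, ey)
```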

Page 6:

Continuous distributions

• A joint probability density function for two continuous random variables, (X,Y), has the following four properties:

1. f(x,y) ≥ 0 for all x, y
2. ∫_{-∞}^{∞} ∫_{-∞}^{∞} f(x,y) dx dy = 1
3. P((X,Y) ∈ A) = ∫∫_A f(x,y) dx dy
4. E(h(X,Y)) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} h(x,y) f(x,y) dx dy

Page 7:

Continuous example

• Consider the following joint pdf:

f(x,y) = x(1 + 3y²)/4,    0 ≤ x ≤ 2, 0 ≤ y ≤ 1

• Show condition 2 holds on your own.
• Show P(0 < X < 1, ¼ < Y < ½) = 23/512
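A numerical check of both claims, sketched with scipy (assuming it is available); note that scipy.integrate.dblquad integrates over its first argument (y here) for each fixed value of the second (x).

```python
from scipy import integrate

f = lambda y, x: x * (1 + 3 * y**2) / 4           # joint pdf on 0 <= x <= 2, 0 <= y <= 1

total, _ = integrate.dblquad(f, 0, 2, 0, 1)       # condition 2: should come out to 1
prob,  _ = integrate.dblquad(f, 0, 1, 0.25, 0.5)  # P(0 < X < 1, 1/4 < Y < 1/2)

print(total)               # ~1.0
print(prob, 23 / 512)      # both ~0.0449
```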

Page 8:

Joint to marginals

• The marginal pdfs for X and Y can be found by

fX(x) = ∫_{-∞}^{∞} f(x,y) dy,    fY(y) = ∫_{-∞}^{∞} f(x,y) dx

• For the previous example, find fX(x) and fY(y).
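An illustrative sympy sketch (assuming sympy is available) that carries out these two integrals for the pdf from the previous example, integrating over each variable's support:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x * (1 + 3 * y**2) / 4            # joint pdf on 0 <= x <= 2, 0 <= y <= 1

f_X = sp.integrate(f, (y, 0, 1))      # marginal of X: integrate y out over 0..1
f_Y = sp.integrate(f, (x, 0, 2))      # marginal of Y: integrate x out over 0..2
print(f_X, f_Y)                       # expect x/2 and (3*y**2 + 1)/2 on the stated supports
```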

Page 9:

Independence of X and Y

• The random variables X and Y are independent if f(x,y) = fX(x) fY(y) for all pairs (x,y).

• For the discrete clunker car example, are X and Y independent? (A check using the table is sketched after this list.)

• For the continuous example, are X and Y independent?
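For the discrete table, independence can be checked cell by cell against the definition; a minimal sketch (illustrative names, exact arithmetic with fractions):

```python
from fractions import Fraction as F

# Joint pmf from the car-example table.
pmf = {(x, y): F(1, 32) for x in range(3) for y in range(5)}
pmf[(0, 0)], pmf[(0, 1)], pmf[(1, 0)] = F(1, 2), F(1, 16), F(1, 16)

fx = {x: sum(p for (a, y), p in pmf.items() if a == x) for x in range(3)}   # marginal of X
fy = {y: sum(p for (x, b), p in pmf.items() if b == y) for y in range(5)}   # marginal of Y

# X and Y are independent only if f(x,y) = fX(x) fY(y) holds for every cell.
independent = all(pmf[(x, y)] == fx[x] * fy[y] for x in range(3) for y in range(5))
print(independent)   # False already at (0, 0): 1/2 != fx[0] * fy[0]
```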

Page 10:

Sampling distributions

• We assume that each data value we collect represents a random selection from a common population distribution.

• The collection of these independent random variables is called a random sample from the distribution.

• A statistic is a function of these random variables that is used to estimate some characteristic of the population distribution.

• The distribution of a statistic is called a sampling distribution.

• The sampling distribution is a key component to making inferences about the population.

Page 11:

StatCrunch example

• StatCrunch subscriptions are sold for 6 months ($5) or 12 months ($8).
• From past data, I can tell you that roughly 80% of subscriptions are $5 and 20% are $8.
• Let X represent the amount in $ of a purchase.
• E(X) =
• Var(X) =
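A quick Python sketch (illustrative, not from the slides) of these two computations from the stated pmf:

```python
pmf = {5: 0.8, 8: 0.2}                                   # purchase amount and its probability

mean = sum(x * p for x, p in pmf.items())                # E(X)
var  = sum((x - mean)**2 * p for x, p in pmf.items())    # Var(X)
print(mean, var)                                         # 5.6 and 1.44
```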

Page 12:

StatCrunch example continued

• Now consider the amounts of a random sample of two purchases, X1, X2.

• A natural statistic of interest is X1 + X2, the total amount of the purchases.

Outcomes    X1 + X2    Probability

Page 13:

StatCrunch example continued

• E(X1 + X2) =

• E([X1 + X2]²) =

• Var(X1 + X2) =
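A small Python sketch (illustrative) that enumerates the outcomes for two independent purchases, builds the distribution of X1 + X2, and computes the three moments above:

```python
from itertools import product

values = {5: 0.8, 8: 0.2}                            # pmf of a single purchase amount

# Distribution of the total T = X1 + X2 for two independent purchases.
totals = {}
for (x1, p1), (x2, p2) in product(values.items(), repeat=2):
    totals[x1 + x2] = totals.get(x1 + x2, 0) + p1 * p2

e_t   = sum(t * p for t, p in totals.items())        # E(X1 + X2)
e_t2  = sum(t**2 * p for t, p in totals.items())     # E([X1 + X2]^2)
var_t = e_t2 - e_t**2                                # Var(X1 + X2)
print(totals, e_t, e_t2, var_t)
```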

Page 14:

StatCrunch example continued

• If I have n purchases in a day, what is
– my expected earnings?
– the variance of my earnings?
– the shape of my earnings distribution for large n?

• Let’s experiment by simulating 1000 days with 100 purchases per day.

• StatCrunch
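In place of the StatCrunch demo, a hedged numpy sketch of the same experiment (1000 simulated days, 100 purchases per day); the seed and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each day's earnings: the total of 100 purchases of $5 (prob 0.8) or $8 (prob 0.2).
daily_totals = rng.choice([5, 8], size=(1000, 100), p=[0.8, 0.2]).sum(axis=1)

print(daily_totals.mean(), daily_totals.var())   # compare to 100*5.6 = 560 and 100*1.44 = 144
# A histogram of daily_totals should look approximately Normal, as the CLT predicts.
```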

Page 15:

Central Limit Theorem

• We have just illustrated one of the most important theorems in statistics.
• As the sample size, n, becomes large, the distribution of the sum of a random sample from a distribution with mean μ and variance σ² converges to a Normal distribution with mean nμ and variance nσ².

• A sample size of at least 30 is typically required to use the CLT.

• The amazing part of this theorem is that it is true regardless of the form of the underlying distribution.
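Stated a bit more formally (standard form of the theorem, not taken from the slides): for a random sample X1, ..., Xn with mean μ and variance σ²,

$$
\frac{(X_1 + \cdots + X_n) - n\mu}{\sigma\sqrt{n}} \;\xrightarrow{d}\; N(0,1) \quad \text{as } n \to \infty,
$$

so for large n the sum X1 + ... + Xn is approximately N(nμ, nσ²).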

Page 16:

Airplane example

• Suppose the weight of an airline passenger has a mean of 150 lbs. and a standard deviation of 25 lbs. What is the probability the combined weight of 100 passengers will exceed the maximum allowable weight of 15,500 lbs?

• How many passengers should be allowed on the plane if we want this probability to be at most 0.01?
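A hedged sketch of both calculations using the CLT normal approximation and scipy.stats; the downward search loop for the second question is just one simple way to find the largest allowable n.

```python
from scipy.stats import norm

mu, sigma = 150, 25

# Part 1: by the CLT, the total weight of 100 passengers is approx Normal(100*mu, 100*sigma^2).
n = 100
print(norm.sf(15500, loc=n * mu, scale=sigma * n**0.5))   # about 0.0228 (z = 2)

# Part 2: largest n with P(total > 15500) <= 0.01 under the same approximation.
n = 100
while norm.sf(15500, loc=n * mu, scale=sigma * n**0.5) > 0.01:
    n -= 1
print(n)
```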

Page 17:

The sample mean

• For constant c, E(cY) = cE(Y) and Var(cY) = c²Var(Y)
• E(X̄) =
• Var(X̄) =
• The CLT says that for large samples, X̄ is approximately normal with a mean of μ and a variance of σ²/n.
• So, the variance of the sample mean decreases with n.
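The two blanks above follow from the constant-multiple rules together with independence of the sample; a standard derivation (not from the slides):

$$
E(\bar{X}) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \mu,
\qquad
\mathrm{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^2}{n}.
$$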

Page 18:

Sampling applet