08 Continuous

36
Hadley Wickham Stat310 Continuous random variables Thursday, 4 February 2010

Transcript of 08 Continuous

Page 1: 08 Continuous

Hadley Wickham

Stat310Continuous random variables

Thursday, 4 February 2010

Page 2: 08 Continuous

1. Finish off Poisson

2. New conditions & definitions

3. Uniform distribution

4. Exponential and gamma

Thursday, 4 February 2010

Page 3: 08 Continuous

Challenge

If X ~ Poisson(109.6), what is P(X > 100) ?

What’s the probability they score over 100 points? (How could you work this out?)

Thursday, 4 February 2010

Page 4: 08 Continuous

grid

cumulative

0.0

0.2

0.4

0.6

0.8

1.0

●●●●●●●●●●●●●●●●●●●●●

●●●●

●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●

80 100 120 140

Thursday, 4 February 2010

Page 5: 08 Continuous

Multiplication

X ~ Poisson(λ)

Y = tX

Then Y ~ Poisson(λt)

Thursday, 4 February 2010

Page 6: 08 Continuous

Continuous random variables

Recall: have uncountably many outcomes

Thursday, 4 February 2010

Page 7: 08 Continuous

f(x) For continuous x, f(x) is a probability density function.

Not a probability!(may be greater than one)

Thursday, 4 February 2010

Page 8: 08 Continuous

P (a < X < b) =� b

af(x)dx

What does P(X = a) = ?

P (a ≤ X ≤ b) =� b

af(x)dx

Thursday, 4 February 2010

Page 9: 08 Continuous

f(x) ≥ 0 ∀x ∈ R�

Rf(x) = 1

Conditions

Thursday, 4 February 2010

Page 10: 08 Continuous

MX(t) =�

Retxf(x)dx

E(u(X)) =�

Ru(x)f(x)dx

Thursday, 4 February 2010

Page 11: 08 Continuous

AsideWhy do we need different methods for discrete and continuous variables?

Our definition of integration (Riemanian) isn’t good enough, because it can’t deal with discrete pdfs.

Can generalise to get Lebesgue integration. More general approach called measure theory.

Thursday, 4 February 2010

Page 12: 08 Continuous

F (x) = P (X ≤ x) =� x

−∞f(t)dt

P (X ∈ [a, b]) =� b

af(x)dx = F (b)− F (a)

Cumulative distribution function

Thursday, 4 February 2010

Page 13: 08 Continuous

f(x)DifferentiateIntegrate

F(x)Thursday, 4 February 2010

Page 14: 08 Continuous

Uniform distribution

Thursday, 4 February 2010

Page 15: 08 Continuous

Assigns probability uniformly in an interval [a, b] of the real line

X ~ Uniform(a, b)

What are F(x) and f(x) ?

The uniform

� b

af(x)dx = 1

Thursday, 4 February 2010

Page 16: 08 Continuous

Intuition

X ~ Unif(a, b)

How do you expect the mean to be related to a and b?

How do you expect the variance to be related to a and b?

Thursday, 4 February 2010

Page 17: 08 Continuous

f(x) =1

b− ax ∈ [a, b]

F (x) = 0 x < a

F (x) = x−ab−a x ∈ [a, b]

F (x) = 1 x > b

Thursday, 4 February 2010

Page 18: 08 Continuous

integrate e^(x * t) 1/(b-a) dx from a to b

MX(t) = E[eXt] =� b

aext 1

b− adx

Thursday, 4 February 2010

Page 19: 08 Continuous

Mean & varianceFind mgf: integrate e^(x * t) 1/(b-a) dx from a to b

Find 1st moment: d/dt (E^(a_1 t) - E^(b_1 t))/(a_1 t - b_1 t) where t = 0

Find 2nd moment d^2/dt^2 (E^(a_1 t) - E^(b_1 t))/(a_1 t - b_1 t) where t = 0

Find var: simplify 1/3 (a_1 b_1+a_1^2+b_1^2) - ((a_1 + b_1)/2) ^ 2

Thursday, 4 February 2010

Page 20: 08 Continuous

E(X) =a + b

2

V ar(X) =(b− a)2

12

Thursday, 4 February 2010

Page 21: 08 Continuous

Write out in a similar way when using wolfram

alpha in homework

Thursday, 4 February 2010

Page 22: 08 Continuous

Exponential and gamma distributions

Thursday, 4 February 2010

Page 23: 08 Continuous

Conditions

Let X ~ Poisson(λ)

Let Y be the time you wait for the next event to occur. Then Y ~ Exponential(β = 1 / λ)

Let Z be the time you wait for α events to occur. Then Z ~ Gamma(α, β = 1 / λ)

Thursday, 4 February 2010

Page 24: 08 Continuous

LA Lakers

Last time we said the distribution of scores looked pretty close to Poisson. Does the distribution of times between scores look exponential?

(Downloaded play-by-play data for all NBA games in 08/09 season, extracted all point scoring plays by Lakers, computed times between them - 4660 in total)

Thursday, 4 February 2010

Page 25: 08 Continuous

value

Proportion

0.0000.0050.0100.0150.0200.0250.030

0.0000.0050.0100.0150.0200.0250.030

0.0000.0050.0100.0150.0200.0250.030

1

4

7

0 50 100 150 200 250 300

2

5

8

0 50 100 150 200 250 300

3

6

9

0 50 100 150 200 250 300

8 are exponential. 1 is real.

Thursday, 4 February 2010

Page 26: 08 Continuous

dur

Proportion

0.000

0.005

0.010

0.015

0.020

0.025

0.030

0 50 100 150 200 250 300

What causes this spike?

Thursday, 4 February 2010

Page 27: 08 Continuous

dur

Proportion

0.000

0.005

0.010

0.015

0.020

0.025

0.030

0 50 100 150 200 250 300

β = 44.9 s

Thursday, 4 February 2010

Page 28: 08 Continuous

Why?

Why is it easy to see that the waiting times are not exponential, when we couldn’t see that the scores were not Poisson?

Thursday, 4 February 2010

Page 29: 08 Continuous

Next

Remove free throws.

Try using a gamma instead (it’s a bit more flexible because it has two parameters, but it maintains the basic idea)

Thursday, 4 February 2010

Page 30: 08 Continuous

value

Proportion

0.000

0.005

0.010

0.015

0.020

0.000

0.005

0.010

0.015

0.020

0.000

0.005

0.010

0.015

0.020

1

4

7

0 100 200 300 400

2

5

8

0 100 200 300 400

3

6

9

0 100 200 300 400

8 are gamma. 1 is real.

Thursday, 4 February 2010

Page 31: 08 Continuous

dur

Proportion

0.000

0.005

0.010

0.015

0.020

0 100 200 300 400

Thursday, 4 February 2010

Page 32: 08 Continuous

dur

Proportion

0.000

0.005

0.010

0.015

0.020

0 100 200 300 400

β = 28.3α = 2.28

Out of 6611 shots, 3140 made it:

1 went in for every 2.1 attempts

Thursday, 4 February 2010

Page 33: 08 Continuous

f(x) =1θe−x/θ

M(t) =1

1− θt

Exponential Gamma

pdf

mgf

Mean

Variance

θ

θ2

f(x) =1

Γ(α)θαxα−1e−x/θ

M(t) =1

(1− θt)α

αθ

αθ2

Thursday, 4 February 2010

Page 34: 08 Continuous

Another derivation

Find mgf: integrate 1 / (gamma(alpha) theta^alpha) x^(alpha - 1) e^(-x / theta) e^(x*t) from x = 0 to infinity

Find mean: d/dt theta^(-alpha) (1/theta - t)^(-alpha) where t = 0

Rewrite: d/dt x_2^(-x_1) (1/x_2 - t)^(-x_1) where t = 0

Thursday, 4 February 2010

Page 35: 08 Continuous

Your turn

Assume the distribution of offensive points by the LA lakers is Poisson(109.6). A game is 48 minutes long.

Let W be the time you wait between the Lakers scoring a point. What is the distribution of W?

How long do you expect to wait?

Thursday, 4 February 2010

Page 36: 08 Continuous

Next week

Normal distribution: Read 3.2.4

Calculating probabilities

Transformations: Read 3.4 up to 3.4.3

Thursday, 4 February 2010