08 Continuous

Post on 25-May-2015

300 views 0 download

Tags:

Transcript of 08 Continuous

Hadley Wickham

Stat310Continuous random variables

Thursday, 4 February 2010

1. Finish off Poisson

2. New conditions & definitions

3. Uniform distribution

4. Exponential and gamma

Thursday, 4 February 2010

Challenge

If X ~ Poisson(109.6), what is P(X > 100) ?

What’s the probability they score over 100 points? (How could you work this out?)

Thursday, 4 February 2010

grid

cumulative

0.0

0.2

0.4

0.6

0.8

1.0

●●●●●●●●●●●●●●●●●●●●●

●●●●

●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●

80 100 120 140

Thursday, 4 February 2010

Multiplication

X ~ Poisson(λ)

Y = tX

Then Y ~ Poisson(λt)

Thursday, 4 February 2010

Continuous random variables

Recall: have uncountably many outcomes

Thursday, 4 February 2010

f(x) For continuous x, f(x) is a probability density function.

Not a probability!(may be greater than one)

Thursday, 4 February 2010

P (a < X < b) =� b

af(x)dx

What does P(X = a) = ?

P (a ≤ X ≤ b) =� b

af(x)dx

Thursday, 4 February 2010

f(x) ≥ 0 ∀x ∈ R�

Rf(x) = 1

Conditions

Thursday, 4 February 2010

MX(t) =�

Retxf(x)dx

E(u(X)) =�

Ru(x)f(x)dx

Thursday, 4 February 2010

AsideWhy do we need different methods for discrete and continuous variables?

Our definition of integration (Riemanian) isn’t good enough, because it can’t deal with discrete pdfs.

Can generalise to get Lebesgue integration. More general approach called measure theory.

Thursday, 4 February 2010

F (x) = P (X ≤ x) =� x

−∞f(t)dt

P (X ∈ [a, b]) =� b

af(x)dx = F (b)− F (a)

Cumulative distribution function

Thursday, 4 February 2010

f(x)DifferentiateIntegrate

F(x)Thursday, 4 February 2010

Uniform distribution

Thursday, 4 February 2010

Assigns probability uniformly in an interval [a, b] of the real line

X ~ Uniform(a, b)

What are F(x) and f(x) ?

The uniform

� b

af(x)dx = 1

Thursday, 4 February 2010

Intuition

X ~ Unif(a, b)

How do you expect the mean to be related to a and b?

How do you expect the variance to be related to a and b?

Thursday, 4 February 2010

f(x) =1

b− ax ∈ [a, b]

F (x) = 0 x < a

F (x) = x−ab−a x ∈ [a, b]

F (x) = 1 x > b

Thursday, 4 February 2010

integrate e^(x * t) 1/(b-a) dx from a to b

MX(t) = E[eXt] =� b

aext 1

b− adx

Thursday, 4 February 2010

Mean & varianceFind mgf: integrate e^(x * t) 1/(b-a) dx from a to b

Find 1st moment: d/dt (E^(a_1 t) - E^(b_1 t))/(a_1 t - b_1 t) where t = 0

Find 2nd moment d^2/dt^2 (E^(a_1 t) - E^(b_1 t))/(a_1 t - b_1 t) where t = 0

Find var: simplify 1/3 (a_1 b_1+a_1^2+b_1^2) - ((a_1 + b_1)/2) ^ 2

Thursday, 4 February 2010

E(X) =a + b

2

V ar(X) =(b− a)2

12

Thursday, 4 February 2010

Write out in a similar way when using wolfram

alpha in homework

Thursday, 4 February 2010

Exponential and gamma distributions

Thursday, 4 February 2010

Conditions

Let X ~ Poisson(λ)

Let Y be the time you wait for the next event to occur. Then Y ~ Exponential(β = 1 / λ)

Let Z be the time you wait for α events to occur. Then Z ~ Gamma(α, β = 1 / λ)

Thursday, 4 February 2010

LA Lakers

Last time we said the distribution of scores looked pretty close to Poisson. Does the distribution of times between scores look exponential?

(Downloaded play-by-play data for all NBA games in 08/09 season, extracted all point scoring plays by Lakers, computed times between them - 4660 in total)

Thursday, 4 February 2010

value

Proportion

0.0000.0050.0100.0150.0200.0250.030

0.0000.0050.0100.0150.0200.0250.030

0.0000.0050.0100.0150.0200.0250.030

1

4

7

0 50 100 150 200 250 300

2

5

8

0 50 100 150 200 250 300

3

6

9

0 50 100 150 200 250 300

8 are exponential. 1 is real.

Thursday, 4 February 2010

dur

Proportion

0.000

0.005

0.010

0.015

0.020

0.025

0.030

0 50 100 150 200 250 300

What causes this spike?

Thursday, 4 February 2010

dur

Proportion

0.000

0.005

0.010

0.015

0.020

0.025

0.030

0 50 100 150 200 250 300

β = 44.9 s

Thursday, 4 February 2010

Why?

Why is it easy to see that the waiting times are not exponential, when we couldn’t see that the scores were not Poisson?

Thursday, 4 February 2010

Next

Remove free throws.

Try using a gamma instead (it’s a bit more flexible because it has two parameters, but it maintains the basic idea)

Thursday, 4 February 2010

value

Proportion

0.000

0.005

0.010

0.015

0.020

0.000

0.005

0.010

0.015

0.020

0.000

0.005

0.010

0.015

0.020

1

4

7

0 100 200 300 400

2

5

8

0 100 200 300 400

3

6

9

0 100 200 300 400

8 are gamma. 1 is real.

Thursday, 4 February 2010

dur

Proportion

0.000

0.005

0.010

0.015

0.020

0 100 200 300 400

Thursday, 4 February 2010

dur

Proportion

0.000

0.005

0.010

0.015

0.020

0 100 200 300 400

β = 28.3α = 2.28

Out of 6611 shots, 3140 made it:

1 went in for every 2.1 attempts

Thursday, 4 February 2010

f(x) =1θe−x/θ

M(t) =1

1− θt

Exponential Gamma

pdf

mgf

Mean

Variance

θ

θ2

f(x) =1

Γ(α)θαxα−1e−x/θ

M(t) =1

(1− θt)α

αθ

αθ2

Thursday, 4 February 2010

Another derivation

Find mgf: integrate 1 / (gamma(alpha) theta^alpha) x^(alpha - 1) e^(-x / theta) e^(x*t) from x = 0 to infinity

Find mean: d/dt theta^(-alpha) (1/theta - t)^(-alpha) where t = 0

Rewrite: d/dt x_2^(-x_1) (1/x_2 - t)^(-x_1) where t = 0

Thursday, 4 February 2010

Your turn

Assume the distribution of offensive points by the LA lakers is Poisson(109.6). A game is 48 minutes long.

Let W be the time you wait between the Lakers scoring a point. What is the distribution of W?

How long do you expect to wait?

Thursday, 4 February 2010

Next week

Normal distribution: Read 3.2.4

Calculating probabilities

Transformations: Read 3.4 up to 3.4.3

Thursday, 4 February 2010