Probability Distributions and Statistics - University of St....

Copyright © 2006 Brooks/Cole, a division of Thomson Learning, Inc. Probability Distributions and Statistics 8 Distributions of Random Variables Expected Value Variance and Standard Deviation Binomial Distribution Normal Distribution Applications of the Normal Distribution

Chapter 8: Probability Distributions and Statistics

• Distributions of Random Variables

• Expected Value

• Variance and Standard Deviation

• Binomial Distribution

• Normal Distribution

• Applications of the Normal Distribution

Random Variable

A random variable is a rule that assigns a number to each outcome of a chance experiment.

• Finite discrete – variable can assume only finitely many values.

• Infinite discrete – variable can assume infinitely many values that may be arranged in a sequence.

• Continuous – variable can assume values that make up an interval of real numbers.

Probability Distribution for the Random Variable X

A probability distribution for a random variable X:

x          –8     –3     –1     0      1      4      6
P(X = x)   0.13   0.15   0.17   0.20   0.15   0.11   0.09

Find

a. P(X ≤ 0) = 0.65

b. P(–3 ≤ X ≤ 1) = 0.67
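The two answers come from adding the table entries whose x-values fall in the requested range. A minimal Python sketch of that bookkeeping follows; the names `dist` and `prob` are illustrative and not part of the original slides.

```python
# Distribution from the example: value of X -> probability.
dist = {-8: 0.13, -3: 0.15, -1: 0.17, 0: 0.20, 1: 0.15, 4: 0.11, 6: 0.09}

def prob(dist, lo=float("-inf"), hi=float("inf")):
    """Return P(lo <= X <= hi) by summing the matching table entries."""
    return sum(p for x, p in dist.items() if lo <= x <= hi)

total = prob(dist)            # all probabilities together should give 1
p_a = prob(dist, hi=0)        # P(X <= 0)
p_b = prob(dist, lo=-3, hi=1) # P(-3 <= X <= 1)
print(round(total, 2), round(p_a, 2), round(p_b, 2))
```

Checking that the probabilities sum to 1 is a quick way to catch a mistyped table entry.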

Ex. Students from a small college were asked how many charge cards they carry. X is the random variable representing the number of cards, and the results are below.

x           0      1      2      3      4      5      6
# people    12     42     57     24     9      4      2
P(X = x)    0.08   0.28   0.38   0.16   0.06   0.03   0.01

The row P(X = x) is the probability distribution.

Histograms

A way to represent a probability distribution of a random variable graphically.

Credit card results:

x           0      1      2      3      4      5      6
P(X = x)    0.08   0.28   0.38   0.16   0.06   0.03   0.01

[Figure: histogram of P(X = x) against Number of Cards (0 to 6), vertical axis from 0 to 0.4]

Mean

The average (mean) of the n numbers \( x_1, x_2, \ldots, x_n \) is \( \bar{x} \), where

\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]

Median

The median is the middle value in a set of data that is arranged in increasing or decreasing order. For an even number of data points the median is the average of the middle two.

Mode

The mode is the number that occurs most frequently in a set of data.

Ex. The quiz scores for a particular student are given below:

22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18

Find the mean, median and mode.

In increasing order: 12, 18, 18, 20, 20, 20, 20, 22, 24, 24, 25, 25, 25

Mean: \( \dfrac{\text{sum of entries}}{\text{number of data points}} = \dfrac{273}{13} = 21 \)

Median: Middle number = 20

Mode (most frequent): 20 (occurs 4 times)
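The same three summaries can be checked with Python's standard `statistics` module. This is only a sketch for verifying the hand computation; the variable names are illustrative.

```python
import statistics

# Quiz scores from the example, in the order given.
scores = [22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18]

mean = statistics.mean(scores)      # sum of entries / number of data points
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value
print(sum(scores), len(scores), mean, median, mode)
```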

Expected Value of a Random Variable X

Let X denote a random variable that assumes the values \( x_1, x_2, \ldots, x_n \) with associated probabilities \( p_1, p_2, \ldots, p_n \), respectively. Then the expected value of X, E(X), is given by

\[ E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n \]

Ex. Use the data below to find out the expected number of credit cards that a student will possess.

x = # credit cards

x           0      1      2      3      4      5      6
P(X = x)    0.08   0.28   0.38   0.16   0.06   0.03   0.01

\[
\begin{aligned}
E(X) &= x_1 p_1 + x_2 p_2 + \cdots + x_n p_n \\
     &= 0(.08) + 1(.28) + 2(.38) + 3(.16) + 4(.06) + 5(.03) + 6(.01) \\
     &= 1.97
\end{aligned}
\]

About 2 credit cards
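The expected value is a probability-weighted sum, which is a one-line computation. The sketch below assumes the distribution is stored as a dictionary; `expected_value` and `cards` are illustrative names.

```python
def expected_value(dist):
    """E(X) = x1*p1 + x2*p2 + ... + xn*pn for a dict of value -> probability."""
    return sum(x * p for x, p in dist.items())

# Credit card distribution from the example.
cards = {0: 0.08, 1: 0.28, 2: 0.38, 3: 0.16, 4: 0.06, 5: 0.03, 6: 0.01}

e_cards = expected_value(cards)
print(round(e_cards, 2))
```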

Ex. Jackson and Max are playing a dice game where a single die is rolled. Jackson pays Max $2 for rolling a 1, 2, 3, or 4 and Max pays Jackson $D for a 5 or 6. Determine the value of D if the game is to be fair.

\[ P(\text{Jackson loses}) = \frac{4}{6} = \frac{2}{3} \]

We want the expected value of the game to be zero for it to be fair (the –2 is Jackson's loss):

\[
\begin{aligned}
(-2)\left(\frac{2}{3}\right) + D\left(\frac{1}{3}\right) &= 0 \\
-4 + D &= 0 \\
D &= 4
\end{aligned}
\]

D should be $4
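The fair-game condition can be checked with exact fractions. This is a small sketch written from Jackson's point of view; the variable names are illustrative.

```python
from fractions import Fraction

p_lose = Fraction(4, 6)  # die shows 1, 2, 3, or 4: Jackson pays $2
p_win = Fraction(2, 6)   # die shows 5 or 6: Jackson receives $D
loss = 2

# Fair game: (-loss) * p_lose + D * p_win = 0, solved for D.
D = loss * p_lose / p_win

# Expected value of the game for Jackson with this D.
expected = -loss * p_lose + D * p_win
print(D, expected)
```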

Odds

If P(E) is the probability of an event E occurring, then

1. The odds in favor of E occurring are given by the ratio (E occurs over E doesn't occur)

\[ \frac{P(E)}{1 - P(E)} = \frac{P(E)}{P(E^C)} \]

2. The odds against E occurring are given by the ratio (E doesn't occur over E occurs)

\[ \frac{1 - P(E)}{P(E)} = \frac{P(E^C)}{P(E)} \]

Ex. If the news has just announced that the probability of rain is 0.65 (65%), find

a. the odds in favor of rain

\[ \frac{P(E)}{1 - P(E)} = \frac{.65}{1 - .65} = \frac{.65}{.35} = \frac{13}{7} \]

b. the odds against rain

\[ \frac{1 - P(E)}{P(E)} = \frac{.35}{.65} = \frac{7}{13} \]

Probability of an Event (Given Odds)

If the odds in favor of an event E occurring are a to b, then the probability of E occurring is

\[ P(E) = \frac{a}{a + b} \]

Ex. The odds that the horse Gluebound will win a particular race are 2 to 16. Find the probability that Gluebound wins the race.

\[ P(\text{win}) = \frac{a}{a + b} = \frac{2}{2 + 16} = \frac{2}{18} = \frac{1}{9} \]
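Converting between probabilities and odds is easy to get backwards, so a short sketch with exact fractions may help. The function names `odds_in_favor` and `prob_from_odds` are hypothetical helpers, not taken from the slides.

```python
from fractions import Fraction

def odds_in_favor(p):
    """Odds in favor of E as the ratio P(E) / (1 - P(E)); requires P(E) != 1."""
    p = Fraction(p)
    return p / (1 - p)

def prob_from_odds(a, b):
    """Probability of E when the odds in favor of E are a to b."""
    return Fraction(a, a + b)

rain_for = odds_in_favor(Fraction(65, 100))  # rain example: 13 to 7
rain_against = 1 / rain_for                  # odds against are the reciprocal
gluebound = prob_from_odds(2, 16)            # horse race example
print(rain_for, rain_against, gluebound)
```

Using `Fraction` keeps the ratios in lowest terms, which matches how odds are usually quoted.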

Variance

Variance is a measure of the spread of the data. The larger the variance, the larger the spread.

Suppose a random variable has the probability distribution

x           x1    x2    x3    …    xn
P(X = x)    p1    p2    p3    …    pn

and expected value \( E(X) = \mu \).

The variance of a random variable X is defined by:

\[ \operatorname{Var}(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2 \]

Standard Deviation

Standard deviation is a measure of the spread of the data using the same units as the data.

The standard deviation of a random variable X is defined by:

\[ \sigma = \sqrt{\operatorname{Var}(X)} = \sqrt{p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2} \]

where each \( x_i \) denotes the value assumed by the random variable X and \( p_i \) is the probability associated with \( x_i \).

Ex. The quiz scores for a particular student are given below:

22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18

Find the variance and standard deviation.

Value          12     18     20     22     24     25
Frequency      1      2      4      1      2      3
Probability    .08    .15    .31    .08    .15    .23

The expected value \( \mu \approx 21 \)

\[ \operatorname{Var}(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2 \]

\[ \sigma = \sqrt{\operatorname{Var}(X)} \]

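\[
\begin{aligned}
\operatorname{Var}(X) &= .08(12 - 21)^2 + .15(18 - 21)^2 \\
                      &\quad + .31(20 - 21)^2 + .08(22 - 21)^2 \\
                      &\quad + .15(24 - 21)^2 + .23(25 - 21)^2
\end{aligned}
\]

\[ \operatorname{Var}(X) = 13.25 \]

\[ \sigma = \sqrt{\operatorname{Var}(X)} = \sqrt{13.25} \approx 3.64 \]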
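The variance computation can be reproduced in a few lines. This sketch uses the two-decimal probabilities and the mean of 21 from the example; the names `probs`, `variance`, and `sigma` are illustrative.

```python
import math

# Quiz score distribution from the example: value -> rounded probability.
probs = {12: 0.08, 18: 0.15, 20: 0.31, 22: 0.08, 24: 0.15, 25: 0.23}
mu = 21  # expected value used in the example

# Var(X) = sum of p_i * (x_i - mu)^2 over all values.
variance = sum(p * (x - mu) ** 2 for x, p in probs.items())
sigma = math.sqrt(variance)
print(round(variance, 2), round(sigma, 2))
```

Using the exact frequencies (1/13, 2/13, 4/13, and so on) instead of the two-decimal probabilities would give 170/13, about 13.08, so the 13.25 figure reflects the rounding in the table.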

Chebychev's Inequality

Let X be a random variable, where \( \mu \) = the expected value and \( \sigma \) = the standard deviation. Then

\[ P(\mu - k\sigma \le X \le \mu + k\sigma) \ge 1 - \frac{1}{k^2} \]

Ex. A probability distribution has a mean of 40 and a standard deviation of 12. Use Chebychev's inequality to estimate the probability that an outcome of the experiment lies between 10 and 70.

We want \( P(10 \le X \le 70) \).

Notice that \( \mu - k\sigma = 10 \) and \( \mu + k\sigma = 70 \), so \( 12k = 30 \) and \( k = 2.5 \).

\[ P(10 \le X \le 70) \ge 1 - \frac{1}{k^2} = 1 - \frac{1}{(5/2)^2} = \frac{21}{25} \]

So at least 84%
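The bound can be computed directly once k is recovered from the interval endpoints. The sketch below uses the interval from the worked solution (10 to 70); `chebychev_lower_bound` is a hypothetical helper name.

```python
def chebychev_lower_bound(mu, sigma, lo, hi):
    """Lower bound 1 - 1/k^2 on P(lo <= X <= hi) for an interval
    symmetric about mu, where k = (hi - mu) / sigma."""
    k = (hi - mu) / sigma
    if abs((mu - lo) / sigma - k) > 1e-12:
        raise ValueError("interval must be symmetric about the mean")
    return 1 - 1 / k ** 2

bound = chebychev_lower_bound(40, 12, 10, 70)  # k = 2.5
print(round(bound, 2))
```

The bound holds for any distribution with the given mean and standard deviation, which is why it is usually much weaker than what a specific distribution would give.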

Binomial (Bernoulli) Trials

A binomial experiment has the properties:

1. The number of trials in the experiment is fixed.

2. The only outcomes are “success” and “failure.”

3. The probability of success in each trial is the same.

4. The trials are independent of each other.

Probabilities in Bernoulli Trials

In a binomial experiment in which the probability of success in any trial is p, the probability of exactly x successes in n independent trials is given by

\[ C(n, x)\, p^x q^{n - x} \]

where q = 1 – p is the probability of failure in any trial.

Ex. A card is drawn from a standard 52-card deck. If drawing a club is considered a success, find the probability of

a. exactly one success in 4 draws (with replacement).

b. no successes in 5 draws (with replacement).

Use \( C(n, x)\, p^x q^{n - x} \), where \( p = \frac{1}{4} \) and \( q = 1 - \frac{1}{4} = \frac{3}{4} \).

a. \( C(4, 1) \left(\frac{1}{4}\right)^1 \left(\frac{3}{4}\right)^3 \approx 0.422 \)

b. \( C(5, 0) \left(\frac{1}{4}\right)^0 \left(\frac{3}{4}\right)^5 \approx 0.237 \)
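The binomial formula maps directly onto `math.comb` from the Python standard library. A minimal sketch, with `binomial_pmf` as an illustrative function name:

```python
from math import comb

def binomial_pmf(n, x, p):
    """Probability of exactly x successes in n independent trials:
    C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

p_club = 0.25                          # 13 clubs out of 52 cards
part_a = binomial_pmf(4, 1, p_club)    # exactly one club in 4 draws
part_b = binomial_pmf(5, 0, p_club)    # no clubs in 5 draws
print(round(part_a, 3), round(part_b, 3))
```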

Mean, Variance, and Standard Deviation of a Random Variable X

If X is a binomial random variable associated with a binomial experiment consisting of n trials with probability of success p and probability of failure q, then the mean, variance, and standard deviation of X are

\[ \mu = E(X) = np \]

\[ \operatorname{Var}(X) = npq \]

\[ \sigma_X = \sqrt{npq} \]

Ex. 5 cards are drawn, with replacement, from a standard 52-card deck. If drawing a club is considered a success, find the mean, variance, and standard deviation of X (where X is the number of successes).

\[ p = \frac{1}{4}, \quad q = 1 - \frac{1}{4} = \frac{3}{4} \]

\[ \mu = np = 5\left(\frac{1}{4}\right) = 1.25 \]

\[ \operatorname{Var}(X) = npq = 5\left(\frac{1}{4}\right)\left(\frac{3}{4}\right) = 0.9375 \]

\[ \sigma_X = \sqrt{npq} = \sqrt{0.9375} \approx 0.968 \]

Ex. If the probability of a student successfully passing this course (C or better) is 0.82, find the probability that given 8 students

a. all 8 pass.

\[ C(8, 8)(0.82)^8 (0.18)^0 \approx 0.2044 \]

b. none pass.

\[ C(8, 0)(0.82)^0 (0.18)^8 \approx 0.0000011 \]

c. at least 6 pass (so 6, 7, and 8 successes).

\[
\begin{aligned}
& C(8, 6)(0.82)^6 (0.18)^2 + C(8, 7)(0.82)^7 (0.18)^1 + C(8, 8)(0.82)^8 (0.18)^0 \\
& \approx 0.2758 + 0.3590 + 0.2044 \\
& = 0.8392
\end{aligned}
\]
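An "at least" probability is a sum of individual binomial terms. The sketch below recomputes the three parts of this example; `binomial_pmf` and `at_least` are illustrative helper names.

```python
from math import comb

def binomial_pmf(n, x, p):
    """C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def at_least(n, k, p):
    """P(X >= k): add the terms for k, k + 1, ..., n successes."""
    return sum(binomial_pmf(n, x, p) for x in range(k, n + 1))

p_pass = 0.82
all_pass = binomial_pmf(8, 8, p_pass)
none_pass = binomial_pmf(8, 0, p_pass)
at_least_6 = at_least(8, 6, p_pass)
print(round(all_pass, 4), none_pass, round(at_least_6, 4))
```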

Probability Density Function

A probability density function, f, defines a continuous probability distribution; its domain coincides with the interval of values taken on by the random variable associated with an experiment.

1. f(x) is nonnegative for all values of x.

2. The area of the region between the graph of f and the x-axis is equal to 1.

[Figure: graph of y = f(x); the area under the curve is 1]

Probability Density Function

P(a < X < b) is given by the area of the shaded region.

[Figure: graph of y = f(x) with the region between x = a and x = b shaded]

Normal Distributions

Normal distributions are a special class of continuous probability density functions. Many phenomena have probability density functions that are normal.

The probability density function associated with the normal curve:

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(1/2)[(x - \mu)/\sigma]^2} \]

The graph of this distribution is called a normal curve.

Normal Curve Properties

1. The peak is at x = µ.

2. There is symmetry with respect to the line x = µ.

3. The curve lies above and approaches the x-axis.

4. The area under the curve is 1.

5. 68.27% of the area lies within 1 standard deviation of the mean, 95.45% within 2, and 99.73% within 3 (see curve on next slide).
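The percentages in property 5 can be checked numerically. For any normal curve, the area within k standard deviations of the mean equals erf(k/√2), where erf is the error function available as `math.erf`. The helper name `area_within` below is illustrative.

```python
from math import erf, sqrt

def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

# Percentages for 1, 2, and 3 standard deviations.
percentages = [round(100 * area_within(k), 2) for k in (1, 2, 3)]
print(percentages)
```

The result does not depend on µ or σ, which is why a single table for the standard normal curve is enough.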

Normal Curve

[Figure: normal curve showing the percentage of area within given standard deviations of the mean: 68.27% within 1, 95.45% within 2, 99.73% within 3]

Normal curves with the same standard deviation but different means

Normal curves with the same mean but different standard deviations.

Standard Normal Distribution

Denoted by the variable Z, with µ = 0 and σ = 1.

Ex. Let Z be the standard normal variable. Find (from table)

a. P(Z < 0.85)

This is the area to the left of 0.85: 0.8023

b. P(Z > 1.32)

Use the fact that this area is equivalent to finding P(Z < –1.32): 0.0934

c. P(–2.1 < Z < 1.78)

Find the area to the left of 1.78, then subtract the area to the left of –2.1.

P(Z < 1.78) – P(Z < –2.1) = 0.9625 – 0.0179 = 0.9446

Ex. Let Z be the standard normal variable. Find z if

a. P(Z < z) = 0.9278.

Look at the table and find an entry = 0.9278, then read back to find z = 1.46.

b. P(–z < Z < z) = 0.8132

By symmetry, P(–z < Z < z) = 2P(0 < Z < z) = 2[P(Z < z) – ½] = 2P(Z < z) – 1.

Setting 2P(Z < z) – 1 = 0.8132 gives P(Z < z) = 0.9066, so from the table z = 1.32.
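The table lookups in the standard normal examples can be cross-checked with `statistics.NormalDist` from the Python standard library, whose `cdf` method gives the area to the left of a value and whose `inv_cdf` method reads the table in reverse. This is a sketch for verification, not a replacement for the table method shown on the slides.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_less = Z.cdf(0.85)                   # P(Z < 0.85)
p_greater = 1 - Z.cdf(1.32)            # P(Z > 1.32)
p_between = Z.cdf(1.78) - Z.cdf(-2.1)  # P(-2.1 < Z < 1.78)

z_a = Z.inv_cdf(0.9278)                # z with P(Z < z) = 0.9278
z_b = Z.inv_cdf((1 + 0.8132) / 2)      # z with P(-z < Z < z) = 0.8132
print(round(p_less, 4), round(p_greater, 4), round(p_between, 4))
print(round(z_a, 2), round(z_b, 2))
```

For the symmetric interval, the target area 0.8132 is first converted to a left-tail area, (1 + 0.8132)/2 = 0.9066, which mirrors the algebra in part b.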

Transforming Other Normal Distributions into a Standard Normal Distribution

Given X, a normal random variable with mean µ and standard deviation σ, we can transform X to Z using:

\[ Z = \frac{X - \mu}{\sigma} \]

Ex. Let X be a normal random variable with µ = 80 and σ = 20. Find

a. P(X < 65)

b. P(X > 60)

a. P(X < 65)

Convert to standard normal:

\[ P(X < 65) = P\left(Z < \frac{65 - 80}{20}\right) = P(Z < -.75) = 0.2266 \]

b. P(X > 60)

Convert to standard normal:

\[ P(X > 60) = P\left(Z > \frac{60 - 80}{20}\right) = P(Z > -1) = 1 - P(Z < -1) = 1 - 0.1587 = 0.8413 \]

Ex. A particular rash has shown up at an elementary school. It has been determined that the length of time that the rash will last is normally distributed with µ = 6 days and σ = 1.5 days.

a. Find the probability that for a student selected at random, the rash will last for less than 3 days.

b. Find the probability that for a student selected at random, the rash will last for between 3.75 and 9 days.

a. The rash lasts for less than 3 days:

\[ P(X < 3) = P\left(Z < \frac{3 - 6}{1.5}\right) = P(Z < -2) = 0.0228 \]

b. The rash lasts for between 3.75 and 9 days:

\[
\begin{aligned}
P(3.75 < X < 9) &= P\left(\frac{3.75 - 6}{1.5} < Z < \frac{9 - 6}{1.5}\right) \\
                &= P(-1.5 < Z < 2) \\
                &= P(Z < 2) - P(Z < -1.5) \\
                &= 0.9772 - 0.0668 \\
                &= 0.9104
\end{aligned}
\]
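The standardization step can be made explicit in code: convert each value of X to a z-score, then look up the area under the standard normal curve. The sketch below uses `statistics.NormalDist` in place of the printed table; `to_z` is an illustrative helper name.

```python
from statistics import NormalDist

mu, sigma = 6, 1.5  # days, from the rash example
Z = NormalDist()    # standard normal: mean 0, standard deviation 1

def to_z(x):
    """Standardize a value of X: z = (x - mu) / sigma."""
    return (x - mu) / sigma

p_a = Z.cdf(to_z(3))                      # P(X < 3) = P(Z < -2)
p_b = Z.cdf(to_z(9)) - Z.cdf(to_z(3.75))  # P(-1.5 < Z < 2)
print(to_z(3), to_z(3.75), to_z(9))
print(round(p_a, 4), round(p_b, 4))
```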

Approximating Binomial Distributions

Suppose we are given a binomial distribution associated with a binomial experiment involving n trials, each with probability of success p and failure q. If n is large and p is not close to 1 or 0, the binomial distribution may be approximated by a normal distribution with:

\[ \mu = np \]

\[ \sigma = \sqrt{npq} \]

Ex 1. PAR Bearings manufactures ball bearings packaged in lots of 100 each. The company’s quality-control department has determined that 2% of the ball bearings manufactured do not meet the specifications imposed by a buyer. Find the average number of ball bearings per package that fail to meet the specifications imposed by the buyer.

The experiment under consideration is binomial. The average number of ball bearings per package that fail to meet the specifications is therefore given by the expected value of the associated binomial random variable.

\[ \mu = E(X) = np = 100(.02) = 2 \]

Ex 2. At a particular small college the pass rate of Intermediate Algebra is 72%. If 500 students enroll in a semester, determine the probability that

a. at most 375 students pass.

\[ \mu = np = 500(.72) = 360 \]

\[ \sigma = \sqrt{npq} = \sqrt{500(.72)(.28)} \approx 10 \]

Approximate X by the continuous variable Y, then convert to Z:

\[ P(X \le 375) \approx P(Y < 375.5) = P\left(Z < \frac{375.5 - 360}{10}\right) = P(Z < 1.55) = 0.9394 \]

b. between 355 and 390, inclusive, of the students pass.

\[
\begin{aligned}
P(355 \le X \le 390) &\approx P(354.5 < Y < 390.5) \\
                     &= P\left(\frac{354.5 - 360}{10} < Z < \frac{390.5 - 360}{10}\right) \\
                     &= P(-0.55 < Z < 3.05) \\
                     &= P(Z < 3.05) - P(Z < -0.55) \\
                     &= 0.9989 - 0.2912 \\
                     &= 0.7077 \text{ or } 71\%
\end{aligned}
\]
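To see how good the normal approximation is here, the sketch below compares it with the exact binomial sums. It uses the unrounded σ (about 10.04) rather than the rounded value 10 used in the worked solution, so its approximations differ slightly from 0.9394 and 0.7077. The names `exact` and `Y` are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.72
mu = n * p                     # 360
sigma = sqrt(n * p * (1 - p))  # about 10.04; the worked solution rounds to 10
Y = NormalDist(mu, sigma)      # approximating continuous variable

def exact(lo, hi):
    """Exact binomial P(lo <= X <= hi) by summing individual terms."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x)
               for x in range(lo, hi + 1))

# Continuity correction: widen each integer endpoint by 0.5.
approx_a = Y.cdf(375.5)
approx_b = Y.cdf(390.5) - Y.cdf(354.5)
exact_a = exact(0, 375)
exact_b = exact(355, 390)
print(round(approx_a, 4), round(exact_a, 4))
print(round(approx_b, 4), round(exact_b, 4))
```

The half-unit adjustment accounts for the fact that the binomial variable takes only whole-number values, while the approximating normal variable is continuous.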