COMMONLY USED PROBABILITY DISTRIBUTION


CHAPTER 2

BCT2053

Introduction

Probability – the chance of an event occurring

Distribution – a function which assigns a probability to each possible value of the random variable

Probability Distribution – the values/function that a random variable can assume and the corresponding probabilities of the values

Types of probability distribution:

1. Discrete – describes a discrete random variable

2. Continuous – describes a continuous random variable

CONTENT

2.1 Review on Binomial and Poisson Distributions

2.2 Poisson Approximation for Binomial Distribution

2.3 Review on Normal Distribution

2.4 Central Limit Theorem

2.5 Normal Approximation to the Binomial Distribution

2.6 Normal Approximation to the Poisson Distribution

2.7 Normal Probability Plots

OBJECTIVE

At the end of this chapter, you should be able to:

1. Explain what a Binomial distribution is, identify Binomial experiments and compute Binomial probabilities.

2. Find the expected value (mean), variance, and standard deviation of a Binomial experiment.

2.1 Binomial Distribution

Binomial Distribution

A Binomial distribution results from a procedure that meets all the following requirements:

The procedure has a fixed number of trials (the same trial is repeated)

The trials must be independent

Each trial must have outcomes classified into 2 relevant categories only (success & failure)

The probability of success remains the same in all trials

• Examples: tossing a coin, the birth of a baby, a true/false question, inspecting a product, etc.

Notation for the Binomial Distribution

If X is the number of successes in n trials, then X has the Binomial distribution with parameters n and p, denoted by X ~ Bin (n, p), which is read as

‘‘X is Binomial distributed with number of trials n and probability of success p’’

Binomial Experiment or not ?

1. An advertisement for Vantin claims a 77% end of treatment clinical success rate for flu sufferers. Vantin is given to 15 flu patients who are later checked to see if the treatment was a success.

2. A study showed that 83% of the patients receiving liver transplants survived at least 3 years. The files of 6 liver recipients were selected at random to see if each patient was still alive.

3. In a study of frequent fliers (those who made at least 3 domestic trips or one foreign trip per year), it was found that 67% had an annual income over RM35000. 12 frequent fliers are selected at random and their income level is determined.

For each problem, state what are X, n, p, and q.

Binomial Probability Formula
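The formula itself did not survive on this slide. Assuming the standard result for X ~ Bin (n, p) with q = 1 – p, namely P (X = x) = C(n, x) p^x q^(n – x) for x = 0, 1, ..., n, with mean np and variance npq, a minimal Python sketch looks like this. The function names `binom_pmf` and `binom_summary` and the coin-toss values are illustrative, not part of the course material.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p), using C(n, x) * p^x * q^(n - x)."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def binom_summary(n, p):
    """Mean, variance and standard deviation of X ~ Bin(n, p)."""
    mean = n * p
    var = n * p * (1 - p)
    return mean, var, sqrt(var)

# Illustrative values only: 10 tosses of a fair coin.
print(binom_pmf(5, 10, 0.5))     # probability of exactly 5 heads
print(binom_summary(10, 0.5))    # (mean, variance, standard deviation)
```

Summing `binom_pmf(x, n, p)` over x = 0, ..., n should give 1, which is a quick check that the values form a probability distribution.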

EXERCISE 2.1

1. A fair coin is tossed 10 times. Let X be the number of heads that appear. What is the distribution of X?

2. A lot contains several thousand components. 10% of the components are defective. 7 components are sampled from the lot.

Let X represent the number of defective components in the sample. What is the distribution of X?

Solve problems involving linear inequalities

At least, minimum of, no less than: ≥

At most, maximum of, no more than: ≤

Is greater than, more than: >

Is less than, smaller than, fewer than: <

EXERCISE 2.1

3. Find the probability distribution of the random variable X if X ~ Bin (10, 0.4). Find also P(X = 5) and P(X < 2). Then find the mean and variance for X.

4. A fair die is rolled 8 times. Let X be the number of sixes. Find the probability that no more than 2 sixes come up. Then find the mean and variance for X.

EXERCISE 2.1

5. A survey found that one out of five Malaysians says he or she has visited a doctor in any given month. If 10 people are selected at random, find the probability that exactly 3 will have visited a doctor last month.

6. A survey found that 30% of teenage consumers receive their spending money from part time jobs. If 5 teenagers are selected at random, find the probability that at least 3 of them will have part time jobs.

Solve Binomial problems by statistics table

Use the Cumulative Binomial Probabilities Table

n – number of trials

p – probability of success

k – number of successes in n trials (the value of X)

The table gives P (X ≤ k) for various values of n and p

Example: n = 2 , p = 0.3

Then P (X ≤ 1) = 0.9100

Then P (X = 1) = P (X ≤ 1) - P (X ≤ 0) = 0.9100 – 0.4900 = 0.4200

Then P (X ≥ 1) = 1 - P (X <1) = 1 - P (X ≤ 0) = 1 – 0.4900 = 0.5100

Then P (X < 1) = P (X ≤ 0) = 0.4900

Then P (X > 1) = 1 - P (X ≤ 1) = 1- 0.9100 = 0.0900
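The five table manipulations in this example can be reproduced numerically. The sketch below computes P (X ≤ k) directly from the standard binomial formula in place of the printed table; `binom_cdf` is an illustrative helper name, not something defined in the course.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p): the quantity a cumulative table lists."""
    if k < 0:
        return 0.0
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(min(k, n) + 1))

n, p = 2, 0.3
p_le_1 = binom_cdf(1, n, p)                       # P(X <= 1)
p_eq_1 = binom_cdf(1, n, p) - binom_cdf(0, n, p)  # P(X = 1)
p_ge_1 = 1 - binom_cdf(0, n, p)                   # P(X >= 1)
p_lt_1 = binom_cdf(0, n, p)                       # P(X < 1)
p_gt_1 = 1 - binom_cdf(1, n, p)                   # P(X > 1)

for name, value in [("P(X<=1)", p_le_1), ("P(X=1)", p_eq_1),
                    ("P(X>=1)", p_ge_1), ("P(X<1)", p_lt_1), ("P(X>1)", p_gt_1)]:
    print(name, round(value, 4))
```

Every strict or "at least" inequality is first rewritten in terms of P (X ≤ k), because that is the only form the cumulative table provides.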

Using symmetry properties to read Binomial tables

In general,

P (X = k | X ~ Bin (n, p)) = P (X = n - k | X ~ Bin (n, 1 - p))

P (X ≤ k | X ~ Bin (n, p)) = P (X ≥ n - k | X ~ Bin (n, 1 - p))

P (X ≥ k | X ~ Bin (n, p)) = P (X ≤ n - k | X ~ Bin (n, 1 - p))

Example: n = 8 , p = 0.6

Then P (X ≤ 1) = P (X ≥ 7 | p = 0.4) = 1 - P (X ≤ 6 | p = 0.4)

= 1 – 0.9915 = 0.0085

Then P (X = 1) = P (X = 7 | p = 0.4) = P (X ≤ 7 | p = 0.4) - P (X ≤ 6 | p = 0.4)

= 0.9993 – 0.9915 = 0.0078

Then P (X ≥ 1) = P (X ≤ 7 | p = 0.4) = 0.9993

Then P (X < 1) = P (X > 7 | p = 0.4) = 1 - P (X ≤ 7 | p = 0.4) = 1 – 0.9993 = 0.0007

Then P (X > 1) = P (X < 7 | p = 0.4) = P (X ≤ 6 | p = 0.4) = 0.9915
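As a rough check of the symmetry property, the sketch below computes P (X ≤ 1) for n = 8, p = 0.6 in two ways: directly, and through the p = 0.4 distribution as one would with a table that only lists p ≤ 0.5. The helper `binom_cdf` is an illustrative name and uses the standard binomial formula, which is assumed here.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(min(k, n) + 1))

n, p = 8, 0.6
direct = binom_cdf(1, n, p)              # P(X <= 1 | p = 0.6)
via_table = 1 - binom_cdf(6, n, 1 - p)   # 1 - P(X <= 6 | p = 0.4)

print(round(direct, 4), round(via_table, 4))
```

The two results agree because counting at most 1 success with p = 0.6 is the same event as counting at least 7 failures, and failures follow Bin (8, 0.4).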

EXERCISE 2.1

7. Given that n = 12, p = 0.25, find

P (X ≤ 3), P (X = 7), P (X ≥ 5), P (X < 2), P (X > 10)

8. Given that n = 9, p = 0.7, find

P (X ≤ 4), P (X = 8), P (X ≥ 3), P (X < 5), P (X > 6)

9. A large industrial firm allows a discount on any invoice that is paid within 30 days. Of all invoices, 10% receive the discount. In a company audit, 12 invoices are sampled at random.

a) What is the probability that fewer than 4 of the 12 sampled invoices receive the discount?

b) What is the probability that more than 1 of the 12 sampled invoices receive the discount?

EXERCISE 2.1

10. A report shows that 5% of Americans are afraid of being alone in a house at night. If a random sample of 20 Americans is selected, find the probability that

a) There are exactly 5 people in the sample who are afraid of being alone at night

b) There are at most 3 people in the sample who are afraid of being alone at night

c) There are at least 4 people in the sample who are afraid of being alone at night

OBJECTIVE

At the end of this chapter, you should be able to:

1. Explain what a Poisson distribution is, identify Poisson experiments and compute Poisson probabilities.

2. Find the expected value (mean), variance, and standard deviation of a Poisson experiment.

2.1 Poisson Distribution

Poisson Distribution

The Poisson distribution is a discrete probability distribution that applies to occurrences of some event over a specified interval (time, volume, area, etc.)

The random variable X is the number of occurrences of an event over some interval

The occurrences must be random

The occurrences must be independent of each other

The occurrences must be uniformly distributed over the interval being used

Examples of Poisson distribution

1. The number of emergency calls received by an ambulance control in an hour.

2. The number of vehicles approaching a bus stop in a 5-minute interval.

3. The number of flaws in a meter length of material.

Poisson Probability Formula

λ, the mean number of occurrences in the given interval, is known and finite

Then the variable X is said to be ‘Poisson distributed with mean λ’, written X ~ Po (λ)
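The formula did not survive on this slide. Assuming the standard result for X ~ Po (λ), namely P (X = x) = e^(–λ) λ^x / x! for x = 0, 1, 2, ..., with mean and variance both equal to λ, a minimal Python sketch follows. The names `poisson_pmf` and `poisson_cdf` and the value λ = 2 are illustrative only.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam), using e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), the quantity a cumulative table lists."""
    return sum(poisson_pmf(x, lam) for x in range(k + 1))

lam = 2.0  # hypothetical mean number of occurrences per interval
print(poisson_pmf(0, lam))        # P(X = 0)
print(poisson_cdf(2, lam))        # P(X <= 2)
print(1 - poisson_cdf(2, lam))    # P(X > 2)
```

When the interval changes, λ is rescaled in proportion before the formula is applied; for example a mean of 2 per hour corresponds to a mean of 4 per two hours.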

EXERCISE 2.2

1. A student finds that the average number of amoebas in 10 ml of pond water from a particular pond is 4. Assuming that the number of amoebas follows a Poisson distribution, find the probability that in a 10 ml sample,

a) there are exactly 5 amoebas

b) there are no amoebas

c) there are fewer than three amoebas

2. On average, the school photocopier breaks down 8 times during the school week (Monday - Friday). Assume that the number of breakdowns can be modeled by a Poisson distribution. Find the probability that it breaks down

a) 5 times in a given week

b) Once on Monday

c) 8 times in a fortnight (2 weeks)

EXERCISE 2.2

Solve Poisson problems by statistics table

3. Given that X ~ Po (1.6). Use cumulative Poisson probabilities table to find

a) P (X ≤ 6)

b) P (X = 5)

c) P (X ≥ 3)

d) P (X < 1)

e) P (X > 10)

Find also the smallest integer n such that P ( X > n) < 0.01

4. A sales firm receives, on the average, three calls per hour on its toll-free number. For any given hour, find the probability that it will receive the following:

a) At most three calls

b) At least three calls

c) 5 or more calls

EXERCISE 2.2

5. The number of accidents occurring in a week in a certain factory follows a Poisson distribution with variance 3.2. Find the probability that in a given fortnight,

a) Exactly seven accidents happen.

b) More than 5 accidents happen.

2.2 Using the Poisson distribution as an approximation to the Binomial distribution

When n is large (n > 50) and p is small (p < 0.1), the Binomial distribution X ~ Bin (n, p) can be approximated by a Poisson distribution X ~ Po (λ), where the mean λ = np < 5.

The larger the value of n and the smaller the value of p, the better the approximation.
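To see how close the approximation is, the sketch below compares exact Binomial probabilities with their Poisson counterparts for hypothetical values n = 100 and p = 0.02 (so λ = np = 2), which satisfy the conditions stated above. Both formulas are the standard ones and the function names are illustrative.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Exact P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

n, p = 100, 0.02   # hypothetical: n > 50, p < 0.1
lam = n * p        # lam = 2 < 5

for x in range(5):
    print(x, round(binom_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))

# Largest absolute difference between the two probability functions.
max_gap = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(n + 1))
print(max_gap)
```

Increasing n while holding np fixed (for example n = 1000, p = 0.002) should shrink `max_gap`, in line with the remark that larger n and smaller p give a better approximation.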

EXERCISE 2.2

6. Eggs are packed into boxes of 500. On average 0.7% of the eggs are found to be broken when the eggs are unpacked. Find the probability that in a box of 500 eggs,

a) Exactly three are broken

b) At least two are broken

7. If 2% of the people in a room of 200 people are left-handed, find the probability that

a) Exactly five people are left-handed.

b) At least two people are left-handed.

c) At most seven people are left-handed.

OBJECTIVE

At the end of this chapter, you should be able to:

1. Identify the properties of the normal distribution and find the area under the standard normal distribution, given various Z values.

3. Find probabilities for a normally distributed variable by transforming it into a standard normal variable.

4. Find specific data values for given percentages, using the standard normal distribution.

2.3 Normal Distribution

Continuous Distribution

A discrete variable cannot assume all values between any two given values of the variable.

A continuous variable can assume all values between any two given values of the variable.

Examples of continuous variables are the heights of adult men, body temperatures of rats, and cholesterol levels of adults.

Many continuous variables, such as the examples just mentioned, have distributions that are bell-shaped, and these are called approximately normally distributed variables.

Properties of Normal Distribution

Also known as the bell curve or the Gaussian distribution, named for the German mathematician Carl Friedrich Gauss (1777–1855), who derived its equation.

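X is continuous, where $X \sim N(\mu, \sigma^2)$ and

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$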

Example: Histograms for the Distribution of Heights of Adult Women

Observation

The larger the data size, the more closely the distribution of the data approximates a bell shape (normal).

No variable fits normal distribution perfectly, since a normal distribution is a theoretical distribution.

However, a normal distribution can be used to describe many variables, because the deviations from normal distribution are very small.

The Normal Probability Curve

The curve is bell-shaped.

The mean, median, and mode are equal and located at the center of the distribution.

The curve is unimodal (i.e., it has only one mode).

The curve is symmetric about the mean (its shape is the same on both sides of a vertical line passing through the center).

The curve is continuous (there are no gaps or holes). For each value of X, there is a corresponding value of Y.

The curve never touches the x axis. Theoretically, no matter how far in either direction the curve extends, it never meets the x axis, but it gets increasingly closer.

The total area under the normal distribution curve is equal to 1.00, or 100%.

A Normal Distribution is a continuous, symmetric, bell shaped distribution of a variable.

Area Under a Normal Distribution Curve

The area under the part of the normal curve that lies

within 1 standard deviation of the mean is approximately 0.68, or 68%;

within 2 standard deviations, about 0.95, or 95%;

within 3 standard deviations, about 0.997, or 99.7%.

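Other Characteristics

Finding the probability: area under the curve

$P(\mu - \sigma < X < \mu + \sigma) \approx 0.68$

$P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95$

$P(\mu - 3\sigma < X < \mu + 3\sigma) \approx 0.997$

$P(a < X < b)$ is the area under the curve between a and b

Example

Given $X \sim N(110, 144)$, find the values of a and b if $P(a < X < b) = 0.68$.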
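The 68%, 95% and 99.7% figures can be checked numerically, and the example with X ~ N(110, 144) follows from the first of them. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes that the interval (a, b) in the example is meant to be symmetric about the mean; the helper name `within` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(k):
    """P(mu - k*sigma < X < mu + k*sigma), the same for every normal X."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k), 4))

# Example: X ~ N(110, 144), so sigma = sqrt(144) = 12.
# Assuming a symmetric interval, 68% of the area lies within one sigma.
mu, sigma = 110, 144 ** 0.5
a, b = mu - sigma, mu + sigma
print(a, b)
```

Note that the second parameter in N(110, 144) is the variance, so the standard deviation must be taken as its square root before the rule is applied.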

Shapes of Normal Distributions

The Standard Normal Distribution

The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.

The standard normal variable Z is given by $Z \sim N(0, 1)$, where $Z = \frac{X - \mu}{\sigma}$ and $X \sim N(\mu, \sigma^2)$.

Use the statistical table to obtain probabilities for Z: $P(X < x_0) = P(Z < z_0)$, where $z_0 = \frac{x_0 - \mu}{\sigma}$.

TIPS

$P(0 < Z < 0.12) = 0.0478$

$P(0 < Z < 0.123) = 0.0478 + \frac{3}{10}(0.0517 - 0.0478) = 0.0490$
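The TIPS entry for z = 0.123 appears to be a linear interpolation between the table values 0.0478 (at z = 0.12) and 0.0517 (at z = 0.13). A minimal sketch of that step, with an illustrative function name and a comparison against the value computed by `statistics.NormalDist`, is:

```python
from statistics import NormalDist

def interpolate_table(z, z_lo, area_lo, z_hi, area_hi):
    """Linear interpolation between two adjacent table entries."""
    frac = (z - z_lo) / (z_hi - z_lo)
    return area_lo + frac * (area_hi - area_lo)

# Table entries quoted in the TIPS: P(0 < Z < 0.12) and P(0 < Z < 0.13).
approx = interpolate_table(0.123, 0.12, 0.0478, 0.13, 0.0517)

# Reference value without a table: P(0 < Z < 0.123) = Phi(0.123) - 0.5.
exact = NormalDist().cdf(0.123) - 0.5

print(round(approx, 4), round(exact, 4))
```

Interpolation is only needed because the printed table lists z to two decimal places; software evaluates the area at any z directly.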

Difference between the 2 curves

Area Under the Normal Distribution Curve

Area Under the Standard Normal Distribution Curve

Finding Area under the Standard Normal Distribution

GENERAL PROCEDURE

STEP 1 Draw a picture.

STEP 2 Shade the area desired.

STEP 3 Find the correct figure in the following Procedure Table (the figure that is similar to the one you’ve drawn).

STEP 4 Follow the directions given in the appropriate block of the Procedure Table to get the desired area.

EXAMPLE 1

P (0 < Z < 2.34) = 0.4904
P (-2.34 < Z < 0) = 0.4904
P (0 < Z < 0.156) = 0.062
P (-1.738 < Z < 0) = 0.4589

EXAMPLE 2

P (Z > 1.25) = 0.1056
P (Z < -2.13) = 0.0166
P (Z > 2.099) = 0.0179
P (Z < -0.087) = 0.4653

EXAMPLE 3

P (0.21 < Z < 2.34) = 0.4072
P (-2.134 < Z < -0.21) = 0.4004
P (0.67 < Z < 1.156) = 0.1276
P (-1.738 < Z < -0.79) = 0.1737

EXAMPLE 4

P (-0.21 < Z < 2.34) = 0.5736
P (-2.134 < Z < 0.21) = 0.5668
P (-0.67 < Z < 1.156) = 0.6248
P (|Z| < 0.79) = 0.5704

EXAMPLE 5

P (Z < 1.21) = 0.8869
P (Z < 2.099) = 0.9821
P (Z < 0.512) = 0.6957

EXAMPLE 6

P (Z > -1.25) = 0.8944
P (Z > -2.13) = 0.9834
P (Z > -0.087) = 0.5347

EXAMPLE 7

P (|Z| > 2.34) = 0.0192
P (|Z| > 0.147) = 0.8832
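The areas in these examples can be cross-checked without a table. The sketch below uses the cumulative distribution function of the standard normal from `statistics.NormalDist`; the helper `area_between` is an illustrative name, and small differences in the fourth decimal place are expected because the slides interpolate a four-decimal table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def area_between(lo, hi):
    """P(lo < Z < hi) as a difference of cumulative probabilities."""
    return Z.cdf(hi) - Z.cdf(lo)

print(round(area_between(0, 2.34), 4))       # area between 0 and a positive z
print(round(1 - Z.cdf(1.25), 4))             # right tail, P(Z > 1.25)
print(round(area_between(0.21, 2.34), 4))    # between two positive z values
print(round(area_between(-0.21, 2.34), 4))   # between z values of opposite sign
print(round(Z.cdf(1.21), 4))                 # left of a positive z
print(round(2 * (1 - Z.cdf(2.34)), 4))       # both tails, P(|Z| > 2.34)
```

Each line corresponds to one block of the Procedure Table: a tail is 1 minus a cumulative value, and an area between two points is a difference of two cumulative values.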

Transform the original variable X, where $X \sim N(\mu, \sigma^2)$, to a standard normal distribution variable Z, where $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$.

EXERCISE 2.3

1. Given X ~ N(110,144), find

(a) P (110 < X < 128)
(b) P (X < 150)
(c) P (X > 130)
(d) P (X > 170)
(e) P (98 < X < 128)
(f) P (X < 60)
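The transformation from X to Z can be sketched as follows. The distribution N(50, 16) and the limits 46 and 56 are hypothetical values chosen for illustration, not taken from the exercise, and `standardize` is an illustrative name.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Convert a value x of X ~ N(mu, sigma^2) to z = (x - mu) / sigma."""
    return (x - mu) / sigma

# Hypothetical example: X ~ N(50, 16), so sigma = 4. Find P(46 < X < 56).
mu, sigma = 50.0, 4.0
z1 = standardize(46, mu, sigma)
z2 = standardize(56, mu, sigma)

# Probability via the standard normal, as one would do with a Z table.
p_via_z = NormalDist().cdf(z2) - NormalDist().cdf(z1)

# Same probability computed on the original scale, as a consistency check.
X = NormalDist(mu, sigma)
p_direct = X.cdf(56) - X.cdf(46)

print(z1, z2, round(p_via_z, 4), round(p_direct, 4))
```

Both routes give the same answer, which is why a single table for Z is enough for every normal distribution.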

EXERCISE 2.3

2. If Z ~ N(0,1), find the value of a if

a) P(Z < a) = 0.9693

b) P(Z < a) = 0.3802

c) P(Z < a) = 0.7367

d) P(Z < a) = 0.0793

3. If X ~ N(μ,36) and P (X > 82) = 0.0478, find μ.

4. If X ~ N(100, σ²) and P (X < 82) = 0.0478, find σ.

TIPS

$P(0 < Z < a) = 0.0478 \Rightarrow a = 0.12$

$P(0 < Z < a) = 0.0490 \Rightarrow a = 0.12 + \frac{0.0490 - 0.0478}{0.0517 - 0.0478} \times \frac{1}{100} \approx 0.123$
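Reading the table backwards (from an area to a z value) uses the same interpolation in reverse. The sketch below takes the table entries 0.0478 at z = 0.12 and 0.0517 at z = 0.13 and finds the z whose area is 0.0490, then compares it with `NormalDist.inv_cdf`. The function name `inverse_interpolate` is illustrative.

```python
from statistics import NormalDist

def inverse_interpolate(area, z_lo, area_lo, z_hi, area_hi):
    """Find z between two table entries whose area matches the target."""
    frac = (area - area_lo) / (area_hi - area_lo)
    return z_lo + frac * (z_hi - z_lo)

# Target: P(0 < Z < a) = 0.0490, between the entries for z = 0.12 and z = 0.13.
a_table = inverse_interpolate(0.0490, 0.12, 0.0478, 0.13, 0.0517)

# Reference without a table: P(Z < a) = 0.5 + 0.0490, then invert the cdf.
a_exact = NormalDist().inv_cdf(0.5 + 0.0490)

print(round(a_table, 4), round(a_exact, 4))
```

The 0.5 is added because the table in these slides gives the area between 0 and z, while `inv_cdf` expects the whole area to the left of z.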

EXERCISE 2.3 Applications of the Normal Distribution

5. The mean number of hours an American worker spends on the computer is 3.1 hours per workday. Assume the standard deviation is 0.5 hour. Find the percentage of workers who spend less than 3.5 hours on the computer. Assume the variable is normally distributed.

6. The lengths of metal strips produced by a machine are normally distributed with a mean length of 150 cm and a standard deviation of 10 cm. Find the probability that the length of a randomly selected strip is

a) Shorter than 165 cm

b) Within 5 cm of the mean

7. The time taken by the milkman to deliver to Jalan Indah is normally distributed with a mean of 12 minutes and a standard deviation of 2 minutes. He delivers milk every day. Estimate the number of days during the year when he takes

a) Longer than 17 minutes

b) Less than ten minutes

c) Between 9 and 13 minutes

8. To qualify for a police academy, candidates must score in the top 10% on a general abilities test. The test has a mean of 200 and a standard deviation of 20. Find the lowest possible score to qualify. Assume the test scores are normally distributed.

EXERCISE 2.3 Applications of the Normal Distribution

9. The heights of female students at a particular college are normally distributed with a mean of 169 cm and a standard deviation of 9 cm.

a) Given that 80% of these female students have a height less than h cm, find the value of h.

b) Given that 60% of these female students have a height greater than y cm, find the value of y.

10. For a medical study, a researcher wishes to select people in the middle 60% of the population based on blood pressure. If the mean systolic blood pressure is 120 and the standard deviation is 8, find the upper and lower readings that would qualify people to participate in the study.

OBJECTIVE

At the end of this chapter, you should be able to:

1. Use the central limit theorem to solve problems involving sample means for large samples (probability of mean)

2.4 Central Limit Theorem

The Central Limit Theorem

As the sample size n increases without limit, the shape of the distribution of the sample means taken with replacement from a population with mean µ and standard deviation σ will approach a normal distribution.

This distribution (for sample mean) will have a mean µ and a standard deviation σ/√n.

MATHEMATICAL EXPLANATION

If $X \sim N(\mu, \sigma^2)$ and a sample of size n is selected, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

Use a standard normal distribution variable Z where $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$.

EXTRA: If the distribution of X is not normal, a sample size of 30 or more is needed to use the central limit theorem.
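A probability about a sample mean is found exactly like a probability about a single value, except that the standard deviation σ is replaced by σ/√n. The sketch below uses hypothetical values (μ = 50, σ = 10, n = 25) that are not taken from the exercises, and the helper name `sample_mean_dist` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_dist(mu, sigma, n):
    """Distribution of the sample mean: N(mu, sigma^2 / n)."""
    return NormalDist(mu, sigma / sqrt(n))

# Hypothetical population: mean 50, standard deviation 10; sample size 25.
mu, sigma, n = 50, 10, 25
xbar = sample_mean_dist(mu, sigma, n)

# P(sample mean > 53), first via Z and then directly.
z = (53 - mu) / (sigma / sqrt(n))
p_via_z = 1 - NormalDist().cdf(z)
p_direct = 1 - xbar.cdf(53)

print(xbar.stdev, z, round(p_via_z, 4), round(p_direct, 4))
```

Forgetting the √n is the usual mistake here: for a single observation the same question would use z = (53 – 50)/10 = 0.3 instead of 1.5.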

EXERCISE 2.4

1. A. C. Neilsen reported that children between the ages of 2 and 5 watch an average of 25 hours of television per week. Assume the variable is normally distributed and the standard deviation is 3 hours. If 20 children between the ages of 2 and 5 are randomly selected, find the probability that the mean of the number of hours they watch television will be greater than 26.3 hours.

2. The average age of a vehicle registered in the United States is 8 years, or 96 months. Assume the standard deviation is 16 months. If a random sample of 36 vehicles is selected, find the probability that the mean of their age is between 90 and 100 months.

EXERCISE 2.4

3. The average number of pounds of meat that a person consumes in a year is 218.4 pounds. Assume that the standard deviation is 25 pounds and the distribution is approximately normal.

a. Find the probability that a person selected at random consumes less than 224 pounds per year.

b. If a sample of 40 individuals is selected, find the probability that the mean of the sample will be less than 224 pounds per year.

OBJECTIVE

At the end of this chapter, you should be able to:

1. Use the normal approximation to compute probabilities for a Binomial variable.

2.5 Normal approximation to the Binomial Distribution

Procedure

1. Check to see whether the normal approximation can be used

2. Find the mean and standard deviation

3. Write the problem in probability notation using X

4. Rewrite the problem by using the continuity correction factor, and show the corresponding area under the normal distribution

5. Find the corresponding Z values

6. Find the solution

Normal Approximation to the Binomial Distribution

If X ~ Bin (n, p) and n and p are such that np ≥ 5 and nq ≥ 5, where q = 1 – p, then X ~ N (np, npq) approximately.

The continuity correction is needed when using a continuous distribution (normal) as an approximation for a discrete distribution (binomial), i.e. each integer value k of X is treated as the interval from k – 0.5 to k + 0.5.

TIPS: class boundary
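The procedure can be sketched in a few lines. With the continuity correction, P (X ≤ k) for the Binomial variable is approximated by the normal area to the left of k + 0.5. The values n = 50, p = 0.4 and k = 22 are hypothetical, and the function names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) by P(Y < k + 0.5), where Y ~ N(np, npq)."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("np and nq should both be at least 5")
    return NormalDist(n * p, sqrt(n * p * q)).cdf(k + 0.5)

n, p = 50, 0.4   # hypothetical: np = 20 and nq = 30
exact = binom_cdf(22, n, p)
approx = normal_approx_cdf(22, n, p)
print(round(exact, 4), round(approx, 4))
```

Other inequalities are handled by moving the boundary half a unit in the direction that keeps the intended integers, for example P (X ≥ k) becomes the area to the right of k – 0.5 and P (X = k) the area between k – 0.5 and k + 0.5.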

EXERCISE 2.5

1. In a sack of mixed grass seeds, the probability that a seed is ryegrass is 0.35. Find the probability that in a random sample of 400 seeds from the sack,

a) fewer than 120 are ryegrass seeds

b) between 120 and 150 (inclusive) are ryegrass seeds

c) more than 160 are ryegrass seeds

2. Find the probability of obtaining 4, 5, 6 or 7 heads when a fair coin is tossed 12 times, using a normal approximation to the binomial distribution.

OBJECTIVE

At the end of this chapter, you should be able to:

1. Use the normal approximation to compute probabilities for a Poisson variable.

2.6 Normal approximation to the Poisson Distribution

If X ~ Po (λ) and λ > 15, then X can be approximated by a Normal distribution, X ~ N (λ, λ).

The continuity correction is also needed.

EXERCISE 2.6

1. If X ~ Po (35), use the normal approximation to find

a) P (X ≤ 33)

b) P (X > 37)

c) P (33 < X < 37)

d) P (X = 37)

2. A radioactive disintegration gives counts that follow a Poisson distribution with a mean count of 25 per second. Find the probability that in a one-second interval the count is between 23 and 27 inclusive.

3. The number of hits on a website follows a Poisson distribution with mean 27 hits per hour. Find the probability that there will be 90 or more hits in three hours.

OBJECTIVE

At the end of this chapter, you should be able to:

1. Plot and interpret a Normal Probability Plot

2.7 Normal Probability Plots

A normal probability plot is used to determine whether the sample might have come from a normal population or not.

The most plausible normal distribution is the one whose mean and standard deviation are the same as the sample mean and standard deviation.

How to plot?

Arrange the data sample in ascending (increasing) order

Assign the value (i - 0.5) / n to xi to reflect the position of xi in the ordered sample. There are i - 1 values less than xi, and i values less than or equal to xi. The quantity (i - 0.5) / n is a compromise between the proportions (i - 1) / n and i / n.

Plot xi versus (i - 0.5) / n

If the sample points lie approximately on a straight line, it is plausible that they came from a normal population.

Other than plotting manually, we can obtain the plot from software such as SPSS, Minitab, Excel, etc. The normality of the data can be tested by using the Kolmogorov-Smirnov and Anderson-Darling tests before applying parametric tests.
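The plotting steps can be sketched without any plotting software by computing the coordinates. The sketch below assumes one common way of getting a straight-line comparison: each position (i - 0.5) / n is converted to the corresponding quantile of the normal distribution whose mean and standard deviation match the sample. The data values and the name `normal_plot_points` are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def normal_plot_points(sample):
    """Return (x_i, (i - 0.5) / n, fitted normal quantile) for each ordered x_i."""
    xs = sorted(sample)                                   # ascending order
    n = len(xs)
    positions = [(i - 0.5) / n for i in range(1, n + 1)]  # plotting positions
    fitted = NormalDist(mean(xs), stdev(xs))              # most plausible normal
    quantiles = [fitted.inv_cdf(pos) for pos in positions]
    return list(zip(xs, positions, quantiles))

data = [4.1, 5.3, 2.9, 6.0, 4.8, 3.7]   # hypothetical sample
for x, pos, q in normal_plot_points(data):
    print(x, round(pos, 4), round(q, 3))
```

If the sample is roughly normal, each ordered value should be close to its fitted quantile, so plotting one against the other gives points near a straight line.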

EXERCISE 2.7

1. A sample of size 5 is drawn. The sample, arranged in increasing order, is

3.01 3.35 4.79 5.96 7.89

Do these data appear to come from an approximately normal distribution?

2. The data shown represent the number of movies in the US for a 14-year period.

2084 1497 1014 910 899 870 859 848 837 826 815 750 737 637

Do these data appear to come from an approximately normal distribution?

Conclusion

Statistical Inference involves drawing a sample from a population and analyzing the sample data to learn about the population.

In many situations, one has an approximate knowledge of the probability mass function (discrete) or probability density function (continuous) of the population.

In these cases, the probability mass or density function can often be well approximated by one of several standard families of curves or functions discussed in this chapter.

Thank You

NEXT: CHAPTER 3 Sampling Distribution and Confidence Interval