Probability & Prob. Distributions



Stats Notes for MBA


Probability & Probability Distributions

Probability Meaning: It is the chance of occurrence of an event.

Probability Theory: Measure of uncertainty.

Experiment: A process which results in some well-defined outcome is known as an experiment. e.g. when a coin is tossed, we get either a Head or a Tail, i.e., its outcome is a Head or a Tail, which is well defined.

Random experiment: All the outcomes of the experiment are known in advance, but any specific outcome of the experiment is not known in advance. e.g. tossing of a coin is a random experiment, since the outcomes of the experiment are known in advance (i.e., it can be a Head or a Tail), but which of the two will come up is not known in advance.

Sample Space: It is the list of all possible outcomes of an experiment. e.g. when 2 coins are tossed together, the random experiment may result in any of the following:

Head (H) on the first coin and Head (H) on the second coin
Head (H) on the first coin and Tail (T) on the second coin
Tail (T) on the first coin and Head (H) on the second coin
Tail (T) on the first coin and Tail (T) on the second coin

Thus the corresponding sample space (for this example) is denoted as S = {(H, H), (H, T), (T, H), (T, T)}
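The listing above can also be generated mechanically. The following is a minimal sketch in Python (the variable names are illustrative, not part of the notes), assuming each coin has exactly the two faces H and T:

```python
from itertools import product

# Each coin can land Head ("H") or Tail ("T").
faces = ("H", "T")

# The sample space of tossing 2 coins is every ordered pair of faces.
sample_space = list(product(faces, repeat=2))

print(sample_space)
print("Number of outcomes:", len(sample_space))
```

Changing `repeat=2` to `repeat=3` would give the 8 outcomes of tossing 3 coins.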

Equally Likely Outcomes: In case of tossing a coin, it is known in advance that the coin will land with its Head or Tail up. It is reasonably assumed that each outcome, a Head or a Tail, is as likely to occur as the other. In other words, there are equal chances for the coin to land with its Head or Tail up. Hence we say that the outcomes Head and Tail are equally likely.

Event: An outcome (or a set of outcomes) of a random experiment is called an event. In other words, an event is something that happens. e.g. Consider tossing of 2 similar coins. The possible outcomes are:

First coin | Second coin | Outcome
H | H | HH
H | T | HT
T | H | TH
T | T | TT

Since the coins are similar, HT and TH are counted as the same event. The events of the above experiment are therefore HH, HT and TT, whose frequencies are 1, 2 and 1 respectively.

Mutually exclusive events: 2 events are said to be mutually exclusive if both cannot occur simultaneously. e.g. If a single coin is tossed, the Head and the Tail cannot occur in the same trial. Hence the event Head and the event Tail are mutually exclusive. If 2 coins are tossed, then the events (H, H), (H, T), (T, H) and (T, T) are mutually exclusive.

Exhaustive Events: The set of all possible outcomes of a random experiment constitutes an exhaustive set of events. e.g. in tossing of a coin, there are 2 exhaustive events, Head and Tail, and in throwing of a die the exhaustive events are 1, 2, 3, 4, 5, 6. In drawing a single card from a pack of 52 playing cards, the events "card is red" and "card is black" are collectively exhaustive.

Classical approach to Probability (Mathematical / A priori approach)

Probability of an event = (Number of favorable outcomes) / (Total number of all possible outcomes)

In this approach, the probability is known in advance, before conducting the experiment. e.g. if a die is rolled once and an even number is required on its upper face, then in this experiment:

Total number of all possible outcomes = 6 (since any of 1, 2, 3, 4, 5, 6 can come on the upper face)
Number of favorable outcomes = 3 (since the even number can be any one of 2, 4, 6)

Hence probability of getting an even number on the upper face = 3/6 = 1/2
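The die example can be checked by counting. This is a small sketch, assuming a fair six-faced die; the names `outcomes` and `favorable` are only illustrative:

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]                    # all faces of the die
favorable = [n for n in outcomes if n % 2 == 0]  # even faces: 2, 4, 6

# Classical probability = favorable outcomes / all possible outcomes
p_even = Fraction(len(favorable), len(outcomes))
print(p_even)
```

Using `Fraction` keeps the answer exact (3/6 is reduced to 1/2) instead of a rounded decimal.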

Relative frequency approach to probability (Statistical / A posteriori approach)

Probability of an event = (Number of trials in which the event occurred) / (Total number of trials)

e.g. If a coin is tossed 100 times and the outcomes of this experiment are 57 heads and 43 tails, then the Probability of a Head = 57 / 100 and the Probability of a Tail = 43 / 100
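The relative frequency idea can be illustrated with a simulated coin. The sketch below is only an illustration: the function name, the seed and the number of tosses are assumptions, and a pseudo-random generator stands in for a real coin.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Toss a simulated fair coin n_tosses times and return heads / tosses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# Observed relative frequency from the example above: 57 heads in 100 tosses
observed = 57 / 100

# Relative frequency from a much longer simulated run
simulated = relative_frequency_of_heads(10_000)
print(observed, simulated)
```

With more tosses, the relative frequency usually settles closer to 0.5 for a fair coin.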

Subjective approach to probability

Probability of an event =

In this approach, probabilities are assigned to the events based on experience or past records. This type of approach is suitable for a sample size

Rules of Probability

Rule of Addition

If $A_1, A_2, \ldots, A_m$ are m mutually exclusive events, then $P(A_1 \cup A_2 \cup \cdots \cup A_m) = P(A_1) + P(A_2) + \cdots + P(A_m)$

If the events $A_1, A_2, \ldots, A_m$ are mutually exclusive and also exhaustive, then $P(A_1 \cup A_2 \cup \cdots \cup A_m) = P(S) = 1 = P(A_1) + P(A_2) + \cdots + P(A_m)$

The event $A$ and its complement $A'$ are mutually exclusive and hence $P(A \cup A') = P(A) + P(A')$. Since $A \cup A' = S$, it follows that $P(A \cup A') = P(S) = 1$; therefore $P(A') = 1 - P(A)$.

Probability of the event A or B or C (where A, B, C are any events in the sample space S, not necessarily mutually exclusive):

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
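The two-event addition rule can be verified on a small sample space. As an illustration (the choice of events is an assumption made for this sketch), take a number selected at random from 1 to 30, with A = "divisible by 3" and B = "divisible by 7":

```python
from fractions import Fraction

S = set(range(1, 31))              # sample space: numbers 1 to 30
A = {n for n in S if n % 3 == 0}   # divisible by 3
B = {n for n in S if n % 7 == 0}   # divisible by 7

def prob(event):
    """Classical probability of an event (a subset of S)."""
    return Fraction(len(event), len(S))

lhs = prob(A | B)                       # P(A U B) counted directly
rhs = prob(A) + prob(B) - prob(A & B)   # addition rule
print(lhs, rhs)
```

The term P(A ∩ B) is subtracted because the number 21 belongs to both events and would otherwise be counted twice.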

Conditional Probability: It is the probability of occurrence of an event given that another event has already occurred.

e.g. Consider the experiment of tossing 3 coins. The sample space of the experiment is S = {HHH, HTH, THH, TTH, HHT, HTT, THT, TTT}

Since the coins are fair, we can assign the probability 1/8 to each sample point. Let E be the event "at least 2 Heads appear" and let F be the event "first coin shows Tail". Then

E = {HHH, HTH, THH, HHT}
F = {THH, TTH, THT, TTT}
$E \cap F$ = {THH}

So P(E) = P(HHH) + P(HTH) + P(THH) + P(HHT) = 1/8 + 1/8 + 1/8 + 1/8 = 4/8 = 1/2;
P(F) = P(THH) + P(TTH) + P(THT) + P(TTT) = 1/8 + 1/8 + 1/8 + 1/8 = 4/8 = 1/2, and $P(E \cap F)$ = 1/8

Now, suppose we are given that the event F occurs (i.e., the first coin shows Tail). What then is the probability of occurrence of E? This information reduces our sample space from the set S to its subset F for the event E. Thus, the probability of E considering F as the sample space = 1/4. This probability of the event E is called the conditional probability of E given that F has already occurred, and is denoted by P(E|F) = 1/4

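So $P(E|F) = \dfrac{\text{number of sample points common to } E \text{ and } F}{\text{number of sample points in } F} = \dfrac{n(E \cap F)}{n(F)}$

Now dividing the numerator and denominator by n(S), we get

$P(E|F) = \dfrac{n(E \cap F)/n(S)}{n(F)/n(S)} = \dfrac{P(E \cap F)}{P(F)}$, where $P(F) \neq 0$, i.e., $F \neq \emptyset$ (empty set)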
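The 3-coin example can be reproduced by enumeration. This is a minimal sketch with E = "at least 2 Heads" and F = "first coin shows Tail", assuming fair coins so that every sample point has probability 1/8; the helper `prob` is an illustrative name:

```python
from fractions import Fraction
from itertools import product

# Sample space of tossing 3 coins, written as strings such as "HTH"
S = ["".join(t) for t in product("HT", repeat=3)]

E = {s for s in S if s.count("H") >= 2}   # at least 2 Heads
F = {s for s in S if s[0] == "T"}         # first coin shows Tail

def prob(event):
    return Fraction(len(event), len(S))

# Conditional probability: P(E|F) = P(E and F) / P(F)
p_E_given_F = prob(E & F) / prob(F)
print(p_E_given_F)
```

The same value is obtained by counting inside F alone: 1 of the 4 points in F (THH) also lies in E.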

Problem: In an organization, out of 200 employees, 40 have a monthly salary of more than Rs.15000 and 120 are regular takers of Alpha Brand Tea. Out of those 40 who have a monthly salary of more than Rs.15000, 20 are regular takers of Alpha Brand Tea. If a particular employee is selected, what is the probability that he has a monthly salary of more than Rs.15000, given that he is a regular taker of Alpha Brand Tea?

Rule of Multiplication

$P(A \cap B) = P(A)\,P(B|A) = P(B)\,P(A|B)$

If A, B & C are 3 events of the sample space, then we have $P(A \cap B \cap C) = P(A)\,P(B|A)\,P(C|A \cap B) = P(A)\,P(B|A)\,P(C|AB)$

Independent events: 2 events are said to be independent if the occurrence of one event does not influence the occurrence of the other event. e.g. Successive tosses of a fair coin are independent. If a fair coin is tossed twice, the event "Head in the first toss" (assume this as event A) and the event "Head in the 2nd toss" (assume this as event B) are independent, since the occurrence of Head in any toss does not influence the occurrence of Head in the other toss, and the probability of getting a Head in, say, the second toss, which is 1/2, does not change whether we get a Head or a Tail in the 1st toss. Hence here P(B|A) = P(B), since the occurrence of event A does not alter the probability of event B.

So when the events A and B are independent, $P(A \cap B) = P(A)\,P(B)$

In general, when a finite number of events $A_1, A_2, \ldots, A_m$ are independent, we have $P(A_1 \cap A_2 \cap \cdots \cap A_m) = P(A_1)\,P(A_2) \cdots P(A_m)$
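The two-toss example can be checked directly. The following sketch, assuming a fair coin and using illustrative names, confirms that P(A ∩ B) = P(A) P(B) and P(B|A) = P(B) for A = "Head in the first toss" and B = "Head in the 2nd toss":

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=2))    # two tosses of a fair coin

A = {s for s in S if s[0] == "H"}    # Head in the first toss
B = {s for s in S if s[1] == "H"}    # Head in the second toss

def prob(event):
    return Fraction(len(event), len(S))

p_joint = prob(A & B)                # P(A and B)
p_product = prob(A) * prob(B)        # P(A) * P(B)
p_B_given_A = prob(A & B) / prob(A)  # P(B|A)
print(p_joint, p_product, p_B_given_A)
```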

Problems:

1. What is the probability of getting exactly 2 Heads when 3 coins are tossed?
2. What is the probability of getting at least 1 Head when 3 coins are tossed?
3. What is the probability of getting a sum of 9 when two dice are thrown?
4. What is the probability of getting a sum of at least 9 when two dice are thrown?
5. A number is selected at random from the numbers 1 to 30. What is the probability that (i) it is divisible by either 3 or 7 (ii) it is divisible by 5 or 13?
6. The board of directors of a company wants to form a quality management committee to monitor the quality of their products. The company has 5 scientists, 4 engineers & 6 accountants. Find the probability that the committee will contain 2 scientists, 1 engineer & 2 accountants.
7. A box contains 5 red & 4 blue similar shaped balls. 2 balls are drawn at random from the box. Find the probability that both of them are red if (i) the balls are drawn together (ii) the balls are drawn one after the other, with replacement (iii) the balls are drawn one after the other, without replacement.
8. The probabilities that A & B will tell the truth are respectively. What is the probability that (i) they agree with each other (ii) they contradict each other?
9. The probabilities that component A & component B of a machine will fail are 0.09 & 0.06 respectively. The machine will fail if any one of them fails. Find the probability that it will fail.
10. What is the probability of getting 53 Mondays in a leap year?
11. Find the probability of selecting 2 y's from the letters x, x, x, x, y, y, y.
12. Find the probability of selecting a King and a Queen from a pack of playing cards, when 2 cards are drawn at a time.
13. The probability of Mr.Sunil solving a problem is . The probability of Mr.Anish solving is . What is the probability that a given problem will be solved?
14. The probability that a company A will survive for 20 years is 0.6. The probability that its sister concern will survive for 20 years is 0.8. What is the probability that at least one of them will survive for 20 years?
15. The probabilities that drivers A, B, C will drive home safely after consuming liquor are respectively. What is the probability that they will drive home safely after consuming liquor?

Random Variable: A random variable (r.v.) is a real-valued function whose domain is the sample space of a random experiment.

e.g. Let us consider the experiment of tossing a fair coin 2 times in succession. The sample space of the experiment is S = {(H, H), (H, T), (T, H), (T, T)}

If X denotes the number of Heads obtained, then X is a r.v. and for each outcome, its value is as given below:

X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0

More than one r.v. can be defined on the same sample space. e.g. Let Y denote the no. of heads minus the no. of tails for each outcome of the above sample space S. Then Y is also a r.v. and for each outcome, its value is as given below:

Y(HH) = 2, Y(HT) = 0, Y(TH) = 0, Y(TT) = -2
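Because a r.v. is simply a function on the sample space, X and Y can be written as ordinary functions. This is a small sketch; the outcome strings such as "HT" are an illustrative encoding of (H, T):

```python
S = ["HH", "HT", "TH", "TT"]   # sample space of 2 tosses

def X(outcome):
    """Number of Heads in the outcome."""
    return outcome.count("H")

def Y(outcome):
    """Number of Heads minus number of Tails in the outcome."""
    return outcome.count("H") - outcome.count("T")

x_values = {s: X(s) for s in S}
y_values = {s: Y(s) for s in S}
print(x_values)
print(y_values)
```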

Discrete Random Variable: This r.v. takes a countable no. of values (as in the above example).

Continuous Random Variable: Let us take the example of measuring the exact amount of rain, in inches, tomorrow. Here we can't say it is exactly 2. When we say it is 2, then it should not be even 2.000001 or 1.99999. Instead, we can describe this r.v. over an interval, e.g. "about 2" means 1.9 < X < 2.1, where X denotes the amount of rain.

Probability Distributions

In some research studies, after data collection, the next step is to present the data in the form of a probability distribution, which facilitates further analysis of the data in more meaningful ways. Probability distributions can be classified into discrete probability distributions and continuous probability distributions.

Some examples of discrete probability distributions are the Binomial and Poisson distributions.

Some examples of continuous probability distributions are the Exponential distribution, Uniform distribution, Normal distribution and t-distribution.

Discrete Probability Distribution

In an experiment, events can be represented in the form of frequencies, which can be easily converted into the respective probabilities by dividing them by the total no. of outcomes.

Consider the case of tossing 3 coins simultaneously. The no. of outcomes of this experiment is 8 and the outcomes are {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}, i.e.,

HHH: 3 Heads
HHT: 2 Heads and 1 Tail
HTH: 2 Heads and 1 Tail
HTT: 1 Head and 2 Tails
THH: 2 Heads and 1 Tail
THT: 1 Head and 2 Tails
TTH: 1 Head and 2 Tails
TTT: 3 Tails

Therefore, the events of this experiment are 3 Heads, 2 Heads and 1 Tail, 1 Head and 2 Tails, 3 Tails, and the frequencies are 1, 3, 3, 1 respectively.

Probability Distribution of the experiment

Event | No. of outcomes of the event | Probability of occurrence of the event
HHH | 1 | 1/8
HHT | 3 | 3/8
HTT | 3 | 3/8
TTT | 1 | 1/8
Total | 8 | 1

The discrete probability distribution of tossing 3 coins is calculated as follows: The events of this experiment are HHH, HHT, HTT & TTT. Let X be a r.v. defined as the no. of Heads in the event occurrence, which is probabilistic. The events HHH, HHT, HTT, TTT can be defined as follows:

X = 0, when the event is TTT (i.e., no. of Heads is 0)
X = 1, when the event is HTT (i.e., no. of Heads is 1)
X = 2, when the event is HHT (i.e., no. of Heads is 2)
X = 3, when the event is HHH (i.e., no. of Heads is 3)

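Based on this, the probability distribution is

$P(X = x) = \begin{cases} 1/8, & x = 0 \\ 3/8, & x = 1 \\ 3/8, & x = 2 \\ 1/8, & x = 3 \end{cases}$

This is called the probability mass function (p.m.f.) & $\sum P(X) = 1$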
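The p.m.f. of the no. of Heads in 3 tosses can be tabulated by counting outcomes. A minimal sketch, assuming fair coins (so all 8 outcomes are equally likely); the variable names are illustrative:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))          # 8 equally likely outcomes

# Frequency of each value of X = number of Heads
counts = Counter(o.count("H") for o in outcomes)

# Convert frequencies to probabilities by dividing by the no. of outcomes
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)
print("Total:", sum(pmf.values()))
```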

Continuous Probability DistributionLet us consider the following function to demonstrate the concept of the continuous distribution:f(x) =

For any probability distribution, the value of the cumulative distribution over the specified range should be 1. The value of the cumulative distribution can be obtained by integrating f(x) over that range, as shown below:

$\int_0^1 f(x)\,dx = 1$

Since the value of the cumulative function of f(x) is 1, f(x) is a probability distribution (also called a probability density function). In this distribution, the variable X is a continuous r.v. because its value is continuous in the range from 0 to 1. Hence the probability distribution is a continuous probability distribution. The cumulative function of a probability density function is called the cumulative density function (c.d.f.), more commonly known as the cumulative distribution function.

Problem: Let X denote the no. of hours you study during a randomly selected school day. The probability that X takes the value x has the following form, where k is some unknown constant.

$P(X = x) = \begin{cases} 0.1, & \text{if } x = 0 \\ kx, & \text{if } x = 1 \text{ or } 2 \\ k(5 - x), & \text{if } x = 3 \text{ or } 4 \\ 0, & \text{otherwise} \end{cases}$

a. Find the value of k?

b. What is the probability that you study at least 2 hours? Exactly 2 hours? At most 2 hours?

Problem: Find the probability distribution of the no. of doublets in 3 throws of a pair of dice.

Mean of a r.v. X = Expected value of X = Expectation of X = $E(X) = \sum x_i p_i$

Variance of a r.v. X = $Var(X) = E(X^2) - [E(X)]^2$, where $E(X^2) = \sum x_i^2 p_i$
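The two formulas translate directly into code. The sketch below applies them to the p.m.f. of the no. of Heads in 3 tosses of a fair coin; the function names `mean` and `variance` are illustrative:

```python
from fractions import Fraction

def mean(pmf):
    """E(X) = sum of x_i * p_i over all values x_i."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    e_x2 = sum(x * x * p for x, p in pmf.items())
    return e_x2 - mean(pmf) ** 2

# p.m.f. of X = no. of Heads in 3 tosses of a fair coin
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

print("E(X)   =", mean(pmf))
print("Var(X) =", variance(pmf))
```

The standard deviation is the square root of the variance.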

Problem: 2 cards are drawn simultaneously (or successively without replacement) from a well shuffled pack of 52 cards. Find the mean, variance and standard deviation of the no. of kings.

Theoretical Probability Distributions

Binomial Distribution

It is a discrete probability distribution based on the Bernoulli process. In a game between 2 persons using a coin-tossing experiment, let the occurrence of Head in a trial be a success to one person & the occurrence of Tail be a failure to the same person.

So, in a trial of tossing a coin, the probability of a success (p) to the person is 0.5 & the probability of failure ($q = 1 - p$) to the same person is 0.5

If n repeated trials are performed, then the objective of the game may be to find the probability of having x successes for the person. This experiment is termed a Bernoulli Process, in which n repeated trials are performed with the following assumptions:

1. In the experiment, there are only 2 mutually exclusive & collectively exhaustive events.
2. The probabilities of occurrence of the events of the experiment are the same in all the trials.
3. In all the n trials, the observations are independent of one another.

Based on these fundamentals, the binomial probability distribution is represented as follows:

P(x successes in n trials, given p is the probability of success) = ${}^nC_x\, p^x q^{n-x}$, where x = 0, 1, 2, ..., n; n = no. of trials; p = probability of success in a trial; q = probability of failure in a trial (= 1 - p). Here n & p are parameters.

In short, the probability distribution is represented as $P(X = x) = {}^nC_x\, p^x q^{n-x}$

Summed over all values of x, the cumulative probability of the binomial distribution is $\sum_{x=0}^{n} P(X = x) = \sum_{x=0}^{n} {}^nC_x\, p^x q^{n-x} = 1$
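The binomial p.m.f. can be evaluated with the standard library. The following is a minimal sketch; the function name `binomial_pmf` and the values n = 5, p = 0.5 (five tosses of a fair coin) are illustrative assumptions:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * q^(n-x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

n, p = 5, 0.5   # illustrative: 5 tosses, Head counted as success
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, pr in enumerate(probs):
    print(x, pr)
print("Sum over all x:", sum(probs))
```

"At most" and "at least" probabilities are obtained by adding the relevant terms, e.g. P(X ≤ 1) = P(X = 0) + P(X = 1).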

Problem: Based on past experience, the quality control engineer of Heavy Electrical Limited has estimated that the probability of commissioning each project in time at a client site is 0.9. The Company is planning to commission 5 such projects in the forthcoming year. Find the probability of commissioning (a) no project in time (b) 2 projects in time (c) at most one project in time & (d) at least 2 projects in time.

Mean & Variance of binomial distribution

Mean = np & Variance = npq

Problem: If the probability of a defective bolt is 0.1, find the mean & the s.d. of defective bolts in a total of 900.

Probability of a defective bolt = p = 0.1, so q = 1 - 0.1 = 0.9
Mean = np = 900 * 0.1 = 90
Variance = npq = (np)q = 90(0.9) = 81
Hence Standard Deviation = 9

Problem:
a. Eight coins are tossed at a time, 256 times. Find the expected frequencies of successes (getting a head) and tabulate the results obtained.
b. Also obtain the values of the mean & SD of the theoretical (fitted) distribution.

Problem: Fit a binomial distribution to the following data:

x | 0 | 1 | 2 | 3 | 4
f | 28 | 62 | 46 | 10 | 4

Poisson distribution

It is a discrete probability distribution. This is usually used to represent the no. of occurrences of an event in one unit of time. The Poisson distribution may be obtained as a limiting case of the Binomial probability distribution under the following conditions:

1. n, the number of trials, is indefinitely large, i.e., $n \to \infty$
2. p, the probability of success for each trial, is indefinitely small, i.e., $p \to 0$
3. $np = \lambda$ (say) is finite

Under the above 3 conditions, the binomial probability function tends to the probability function of the Poisson distribution given below:

$P(X = x) = \dfrac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$

In this probability mass function, $\lambda$ is the only parameter.

Mean & Variance of Poisson distribution

Mean = $\lambda$, Variance = $\lambda$
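A short sketch of the Poisson p.m.f. follows, with a numerical check that the mean and the variance both equal the parameter. The name `poisson_pmf`, the value 4 for the parameter and the cut-off of 60 terms are illustrative assumptions (terms beyond the cut-off are negligibly small for this parameter value):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!  for x = 0, 1, 2, ..."""
    return exp(-lam) * lam**x / factorial(x)

lam = 4   # illustrative parameter value

# Truncate the infinite sum at 60 terms for the numerical check
terms = [poisson_pmf(x, lam) for x in range(60)]
mean = sum(x * p for x, p in enumerate(terms))
var = sum(x * x * p for x, p in enumerate(terms)) - mean**2

print("P(X = 0) =", round(poisson_pmf(0, lam), 4))
print("mean =", round(mean, 6), "variance =", round(var, 6))
```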

Approximation of binomial distribution to Poisson distribution

The binomial distribution can be approximated by the Poisson distribution under any of the following conditions:

1. $n \geq 20$ & $p \leq 0.05$
2. $n \geq 100$ & $np \leq 10$

If any one of the above 2 conditions is satisfied, then the mean of the Poisson distribution is $\lambda = np$
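How close the approximation is can be seen by computing both distributions side by side. The sketch below uses n = 25 and p = 0.04 as illustrative values (chosen here because they satisfy the first condition); the function names are assumptions:

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

n, p = 25, 0.04   # illustrative values: n >= 20 and p <= 0.05
lam = n * p       # mean of the approximating Poisson distribution

# Compare exact binomial and approximate Poisson probabilities
for x in range(4):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))

# Largest absolute difference between the two over all x
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam))
              for x in range(n + 1))
print("largest difference:", round(max_gap, 4))
```

For these values the two sets of probabilities differ by less than 0.01 at every x, which is why the simpler Poisson formula is often used in such cases.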

Problem: The arrival of customers at a bank counter follows the Poisson distribution with a mean arrival rate of 4 per 10-minute interval. Find the probability that

1. exactly 0 customers will arrive in a 10-minute interval
2. exactly 2 customers will arrive in a 10-minute interval
3. at most 2 customers will arrive in a 10-minute interval
4. at least 3 customers will arrive in a 10-minute interval

Given $e^{-4} = 0.0183$

Problem: The QC assistant takes a sample of 25 units of a product at a particular work station of a production line & inspects them one by one. Based on past experience, he has estimated that the probability that a unit will be defective is 0.04. Find the probability that

1. no piece in the sample is defective
2. 3 pieces in the sample will be defective
3. at most 2 pieces will be defective
4. at least 3 pieces will be defective

Given $e^{-1} = 0.3678$

Problem: It is known from past experience that in a certain plant there are on the average 4 industrial accidents per month. Find the probability that in a given year, there will be less than 4 accidents. Assume Poisson distribution (Given $e^{-4} = 0.0183$)

Problem: Suppose on an average 1 house in 1000 in a certain district has a fire during a year. If there are 2000 houses in that district, what is the probability that exactly 5 houses will have a fire during the year? Given $e^{-2} = 0.1353$

Problem: The following table gives the number of days in a 100 day period during which automobile accidents occurred in a city. Fit a Poisson distribution to the data.

No. of accidents | 0 | 1 | 2 | 3 | 4
No. of days | 40 | 35 | 15 | 6 | 4

Normal Distribution

It is a continuous probability distribution. The behavior of many real-life situations can be modeled as a normal distribution. Some examples which follow the normal distribution are as follows:

Monthly salary of employees in a locality
Internal diameter of bearings produced in a company
Marks of students in an entrance test
Height of employees in a company
Weight of employees in a company

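A continuous r.v. X is defined to follow the normal distribution with parameters $\mu$ & $\sigma^2$, denoted as $X \sim N(\mu, \sigma^2)$, if the p.d.f. of the r.v. X is given by

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{(x - \mu)^2}{2\sigma^2}\right], \quad -\infty < x < \infty$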

If the observations of a real-life problem follow the normal distribution with mean ($\mu$) and variance ($\sigma^2$), then its r.v. can be converted into a standard normal r.v. using the following transformation:

$Z = \dfrac{X - \mu}{\sigma}$

where Z is a standard normal variable. The corresponding distribution is called the standard normal distribution, whose formula is as given below:

$P(Z) = \dfrac{1}{\sqrt{2\pi}} \exp\left[-\dfrac{Z^2}{2}\right], \quad -\infty < Z < \infty$

The mean and variance of this standard normal distribution are 0 and 1, respectively.
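The transformation to Z can be sketched in code. The notes do not say how areas under the standard normal curve are to be obtained (normally a Z table is used); the sketch below assumes the standard identity that the cumulative probability of Z equals 0.5[1 + erf(z/√2)]. The function names and the values 15000, 3000 and 12000 are illustrative:

```python
from math import erf, sqrt

def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma"""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """P(Z <= z), computed via the error function (assumed identity)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 15000, 3000           # illustrative mean and s.d.
z = standardize(12000, mu, sigma)
p_below = standard_normal_cdf(z)  # P(X < 12000)

print("z =", z)
print("P(X < 12000) =", round(p_below, 4))
```

Multiplying such a probability by the sample size gives the expected no. of observations in that range.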

Problem: In a survey with a sample of 300 respondents, the monthly income of the respondents follows the normal distribution with its mean and s.d. as Rs.15000 and Rs.3000 respectively.
(a) What is the probability that the monthly income is less than Rs.12000? Also, find the no. of respondents having income less than Rs.12000.
(b) What is the probability that the monthly income is more than Rs.16000? Also, find the no. of respondents having income less than Rs.16000.
(c) What is the probability that the monthly income is in between Rs.10000 & Rs.17000? Also, find the no. of respondents having income in between Rs.10000 & Rs.17000.

Problem: The marks obtained by 300 students in an examination are estimated to be normally distributed with mean of 60 & standard deviation of 8. How many students are expected to score (a) More than 70 marks (b) Between 50 & 75 marks (c) If top 5% of the students are to be given scholarships, what is the eligible mark for the scholarships?

Problem: Steel rods are manufactured to be 3 inches in diameter but they are acceptable if they are inside the limits 2.99 inches & 3.01 inches. It is observed that 5% are rejected as oversize & 5% are rejected as undersize. Assuming that the diameters are normally distributed, find the standard deviation of the distribution. Hence find, what the proportion of rejects would be, if the permissible limits were widened to 2.985 inches and 3.015 inches.

Problem: In a certain exam, 31% of the students got less than 45 marks & 8% of the students got more than 64 marks. Assuming the distribution to be normal, find the mean & SD of the marks

Problem: The frequency distribution of a national survey on cars is shown below:

No. of cars | 0.00-0.49 | 0.50-0.99 | 1.00-1.49 | 1.50-1.99 | 2.00-2.49 | 2.50-2.99
Frequency | 2 | 14 | 23 | 7 | 4 | 2

a) Calculate the variance and SD.
b) How many of the observations should theoretically fall between 0.7 and 1.8, if the distribution is bell-shaped?

Problem: The manager of a small postal substation is trying to quantify the variation in the weekly demand for mailing bags. She has decided to assume that this demand is normally distributed. She knows that on an average 100 bags are purchased weekly and that, 90% of the time, weekly demand is below 115.
a) What is the standard deviation of this distribution?
b) The manager wants to stock enough mailing bags each week so that the probability of running out of stock of bags is no higher than 0.05. How much should she stock?

Uniform Distribution

This is a continuous probability distribution which has wide practical applications, particularly in simulation. Let us assume that the daily demand of a product is uniformly distributed with 10 +/- 2 units. The minimum daily demand is 8 and the maximum daily demand is 12.

Now in general let the minimum daily demand & maximum daily demand be a & b respectively.

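Let X be a continuous r.v. such that the probability density for the values of X in the range a to b is constant, and it is 0 for all values of the r.v. outside the interval a to b. The formula for the corresponding uniform distribution is

$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \leq x \leq b \\ 0, & \text{otherwise} \end{cases}$

The cumulative density function is $\int_a^b f(x)\,dx = \int_a^b \dfrac{1}{b - a}\,dx = \left[\dfrac{x}{b - a}\right]_a^b = \dfrac{b - a}{b - a} = 1$

The mean & variance of the uniform distribution are $\dfrac{a + b}{2}$ & $\dfrac{(b - a)^2}{12}$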
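For the daily demand example (minimum 8, maximum 12), the density, the cumulative probability, the mean and the variance can be sketched as below. The function names are illustrative, and the mean and variance are the standard results (a + b)/2 and (b − a)²/12 for a uniform distribution on the interval a to b:

```python
def uniform_pdf(x, a, b):
    """Constant density 1/(b - a) inside [a, b], 0 outside."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """P(X <= x): area under the density from a up to x."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 8, 12                    # minimum and maximum daily demand
mean = (a + b) / 2              # standard result for the uniform mean
variance = (b - a) ** 2 / 12    # standard result for the uniform variance

print("density at 10:", uniform_pdf(10, a, b))
print("P(X <= 10):", uniform_cdf(10, a, b))
print("mean:", mean, "variance:", round(variance, 4))
```

A service-level question is answered by solving (x − a)/(b − a) = service level for x.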

Problem: In a private canteen, the daily demand for packed meals follows the uniform distribution as below:

$f(x) = \begin{cases} \dfrac{1}{450 - 230}, & 230 \leq x \leq 450 \\ 0, & \text{otherwise} \end{cases}$

If the service level of satisfying the demand of the canteen is 0.8, find the highest possible demand which can be satisfied w.r.t. the given service level (cumulative probability).

Problem: Let the continuous r.v. X denote the current measured in a thin copper wire in milliamperes. Assume that the range of X is [0, 20 mA], and assume that the p.d.f. of X is f(x) = 0.05, $0 \leq x \leq 20$. What is the probability that a measurement of current is between 5 and 10 mA?

Exponential Distribution

It is a continuous probability distribution. This distribution is used to represent the time interval between consecutive occurrences of an event, like the inter-arrival time of customers, service time for customers, mean time between failures in maintenance activity, etc.

Consider the example of the service time of customers in a queuing system. Generally the service time in a queuing example is a random variable which follows the exponential distribution. Let the no. of customers served per unit time (service rate) be $\lambda$. Therefore, the mean service time is $1/\lambda$.

If a r.v. X follows the exponential distribution, then the probability density function is $f(x) = \lambda e^{-\lambda x}, \; x \geq 0$

The r.v. X that equals the distance between successive events of a Poisson process with mean $\lambda > 0$ is an exponential random variable with parameter $\lambda$. The p.d.f. of X is $f(x) = \lambda e^{-\lambda x}$ for $0 \leq x < \infty$, with $E(X) = \dfrac{1}{\lambda}$ & $\sigma^2 = V(X) = \dfrac{1}{\lambda^2}$

The cumulative density functions are

$P(X \leq x) = 1 - e^{-\lambda x}$, for $x \geq 0$
$P(X \geq x) = e^{-\lambda x}$, for $x \geq 0$
$P(x_1 \leq X \leq x_2) = e^{-\lambda x_1} - e^{-\lambda x_2}$, for $0 \leq x_1 < x_2$
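The cumulative probabilities of the exponential distribution can be sketched as follows. The function names, the symbol `lam` for the rate parameter and the value 0.2 (one event per 5 time units on average) are illustrative assumptions:

```python
from math import exp

def exponential_cdf(x, lam):
    """P(X <= x) = 1 - e^(-lam * x) for x >= 0, and 0 for x < 0."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

def exponential_between(x1, x2, lam):
    """P(x1 <= X <= x2) = e^(-lam * x1) - e^(-lam * x2), for 0 <= x1 < x2."""
    return exp(-lam * x1) - exp(-lam * x2)

lam = 0.2   # illustrative rate; the mean of X is then 1 / lam = 5

p_less_4 = exponential_cdf(4, lam)       # P(X <= 4)
p_more_6 = 1 - exponential_cdf(6, lam)   # P(X >= 6)
p_4_to_6 = exponential_between(4, 6, lam)

print(round(p_less_4, 4), round(p_more_6, 4), round(p_4_to_6, 4))
```

The three probabilities add up to 1, since the intervals below 4, from 4 to 6 and above 6 cover the whole range of X.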

Problem:In an international airport, the service time for servicing flights by a terminal follows exponential distribution. The service rate of a terminal servicing the flights is 20 per day. Find the probability that the service time of the terminal in clearing a flight is less than 0.45 hour.

Problem: In a mainframe computer centre, the execution time of programs follows the exponential distribution. The average execution time of the programs is 5 minutes. Find the probability that the execution time of programs is
(a) Less than 4 minutes
(b) More than 6 minutes
(Hint: The average execution time is 5 minutes, which means the execution rate is (1/5) = 0.2 per minute)