Lecture Notes On CENG 235 Introduction To Probability and Statistics Prepared by: Dr. Emre Sermutlu Based on the book: Probability and Statistics for Engineers and Scientists, Ninth Edition, Walpole, Myers, Myers, Ye, Pearson Education Last Update: March 1, 2016


Page 1: CENG 235 Introduction To Probability and Statisticsacademic.cankaya.edu.tr/~sermutlu/lectures/235lecnot.pdf · 2016-03-01 · Lecture Notes On CENG 235 Introduction To Probability

Lecture Notes On

CENG 235

Introduction To Probability and Statistics

Prepared by: Dr. Emre Sermutlu

Based on the book: Probability and Statistics for Engineers and Scientists, Ninth Edition, Walpole, Myers, Myers, Ye, Pearson Education

Last Update: March 1, 2016


Week 1– Introduction

A population is a collection of individual items of a particular type.

A sample is a subset of the population, selected by a definite procedure.

In a biased sample, the probability of being selected is not equal for each member of the population.

Sample Mean:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

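Sample Median: If the observations in the sample are ordered as $x_1 \le x_2 \le \cdots \le x_n$, the median is:

$$\tilde{x} = \begin{cases} x_{(n+1)/2} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(x_{n/2} + x_{n/2+1}\right) & \text{if } n \text{ is even} \end{cases}$$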

An alternative to the mean and median is the trimmed mean. For example, we can eliminate the largest 10% and the smallest 10% of the data and find the mean of the remaining elements. This is called the 10% trimmed mean.

Variance:

$$s^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n - 1}$$

Standard Deviation:

$$s = \sqrt{s^2}$$
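The sample statistics defined above can be tried out with a short Python sketch. The data set and the helper name `trimmed_mean` are hypothetical, chosen only for illustration; the other calls come from Python's standard `statistics` module.

```python
import statistics

def trimmed_mean(data, fraction):
    """Mean after dropping the smallest and largest `fraction` of the data."""
    ordered = sorted(data)
    k = int(len(ordered) * fraction)  # number of points cut from each end
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return statistics.mean(kept)

# Hypothetical sample, chosen only for illustration
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15, 40]

mean = statistics.mean(sample)          # (x_1 + ... + x_n) / n
median = statistics.median(sample)      # n is even: average of the two middle values
variance = statistics.variance(sample)  # sample variance, divides by n - 1
std_dev = statistics.stdev(sample)      # square root of the sample variance
trimmed = trimmed_mean(sample, 0.10)    # 10% trimmed mean

print(mean, median, trimmed, variance, std_dev)
```

In this sample the single large value 40 pulls the mean well above the median, while the trimmed mean stays close to the median.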

In statistics, any process that generates a set of data is called an experiment.

The set of all possible outcomes of a statistical experiment is called the sample space and is denoted by S. For example, if we toss a coin twice, the sample space is: S = {HH, HT, TH, TT}.

An event is a subset of a sample space. The complement of an event A is the set of all elements of S that are not in A, denoted by A′.

Two events A and B are mutually exclusive or disjoint if A ∩ B = ∅.

Exercise 1-1: An experiment consists of tossing a die and then flipping a coin. Describe the sample space.

Exercise 1-2: An experiment consists of tossing a die and then flipping a coin once if the number is even, twice if it is odd. Describe the sample space.

Exercise 1-3: A student is registered for 2 courses. He can get one of 5 different letter grades for each course (A, B, C, D, F). Describe the sample space of his grade distributions. Find the events "he passes all" and "he fails all", and the complements of these two.


Multiplication Rule: If an operation can be performed in n ways, and if for each of these ways a second operation can be performed in m ways, then the two operations can be performed together in nm ways.

Permutation: A permutation is an arrangement of a set of objects. The number of permutations of n objects is n!. The number of permutations of n objects taken r at a time is:

$$P(n, r) = \frac{n!}{(n - r)!}$$

The number of distinct permutations of n things of which $n_1$ are of one kind, $n_2$ are of a second kind, etc. is:

$$\frac{n!}{n_1!\, n_2! \cdots n_k!}$$

Combination: The number of combinations of n distinct objects taken r at a time is:

$$\binom{n}{r} = \frac{n!}{r!\,(n - r)!}$$
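As a quick check of the three counting formulas, here is a minimal Python sketch. `math.perm` and `math.comb` are standard-library functions (Python 3.8 or later); `multiset_permutations` is a hypothetical helper name, and the MISSISSIPPI example is not from the notes.

```python
import math

def multiset_permutations(counts):
    """n! / (n1! n2! ... nk!) where counts = [n1, n2, ..., nk]."""
    n = sum(counts)
    result = math.factorial(n)
    for c in counts:
        result //= math.factorial(c)  # each partial quotient is an integer
    return result

print(math.perm(5, 2))  # P(5, 2) = 5!/3! = 20
print(math.comb(5, 2))  # C(5, 2) = 5!/(2! 3!) = 10

# Distinct arrangements of the letters of MISSISSIPPI: 1 M, 4 I, 4 S, 2 P
print(multiset_permutations([1, 4, 4, 2]))  # 34650
```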

Exercise 1-4: How many 12-digit numbers contain exactly four 9’s?

Exercise 1-5: A football team plays 20 matches in a season. Each match results in a win, loss or tie. In how many different ways can the team end the season with:

a) No loss?

b) 10 wins, 4 losses, 6 ties?

Exercise 1-6: 6 people, A, B, C, D, E, F, will sit around a circular table.

a) In how many ways can they do that?

b) A wants to sit together with B. In how many ways can they do that?

c) C does not want to sit together with D. In how many ways can they do that?


Probability

The probability of an event A denotes the weight of A in S. Therefore:

$$P(S) = 1, \qquad P(\emptyset) = 0, \qquad 0 \le P(A) \le 1$$

Furthermore, if A and B are mutually exclusive events:

$$P(A \cup B) = P(A) + P(B)$$

In general, for two events A and B we have:

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

and for three events A, B and C we have:

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$

For complementary events:

$$P(A) + P(A') = 1$$

Exercise 1-7: There are 5 black and 4 red balls in a bag. We randomly choose three balls without replacement. Find the probabilities that we get

a) 3 black

b) At least 2 black

c) At least 1 black

Exercise 1-8: We toss a pair of dice. What is the probability that:

a) The sum is 7?

b) The maximum number is 4?

c) We have a double number?

Exercise 1-9: A fair coin is tossed 5 times. Find the probability of getting

a) No heads

b) Exactly one head

c) 3 or more heads.

Exercise 1-10: We choose a number from {1, 2, . . . , n} randomly. We repeat this n times. What is the probability that we choose 1 at least once?

Exercise 1-11: In a game of chance, your probability of winning is 0.7. You play this game five times. What is the probability that

a) You win 3 or more games?

b) You lose all of them?

Solution:

a) $\binom{5}{3}\, 0.7^3\, 0.3^2 + \binom{5}{4}\, 0.7^4\, 0.3 + \binom{5}{5}\, 0.7^5 = 0.83692$

b) $0.3^5 = 0.00243$
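The two answers of Exercise 1-11 can be reproduced numerically. This is a minimal sketch that assumes the five games are independent; the function name `binomial_pmf` is a hypothetical helper.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x wins in n independent games, each won with probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_win = 0.7
n_games = 5

# a) 3, 4 or 5 wins
p_three_or_more = sum(binomial_pmf(x, n_games, p_win) for x in range(3, n_games + 1))
# b) no wins at all
p_lose_all = binomial_pmf(0, n_games, p_win)

print(round(p_three_or_more, 5))
print(round(p_lose_all, 5))
```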


Exercise 1-12: There are 17 balls in a box. 5 are blue, 8 are red and 4 are green. We randomly choose 5 balls. What is the probability that we choose an equal number of blue and red balls?

Solution: Since there are only 4 green balls, the possible choices are 1 blue, 1 red, 3 green and 2 blue, 2 red, 1 green.

$$\frac{\binom{5}{1}\binom{8}{1}\binom{4}{3}}{\binom{17}{5}} + \frac{\binom{5}{2}\binom{8}{2}\binom{4}{1}}{\binom{17}{5}} = 0.0259 + 0.1810 = 0.2069$$

Exercise 1-13: A file server has 4 hard disks. Each disk has a 5% probability of failure within one year. If one (or more) disk fails, the whole system fails.

a) What is the probability that the system will fail within one year?

b) We decide to improve the system reliability by adding an extra disk. Now we have 5 disks, and the system works if 4 or 5 disks are working, fails otherwise. What is the probability that the new system will fail within one year?

Solution (assuming the disks fail independently):

a) $1 - 0.95^4 = 0.1855$

b) $1 - 0.95^5 - 5 \cdot 0.95^4 \cdot 0.05 = 0.0226$

Exercise 1-14: In a computer game, there are three results: Win, Draw, Lose. The probabilities are 0.4, 0.5 and 0.1, respectively. You get 2 points for Win, 1 for Draw and 0 for Lose.

What is the probability that you get 16 points after playing this game for 10 rounds?

Solution: To get 16 points, we may get 8W+0D+2L or 7W+2D+1L or 6W+4D+0L. We can find the probabilities using the multinomial distribution:

$$\frac{10!}{8!\,0!\,2!}\, 0.4^8\, 0.5^0\, 0.1^2 + \frac{10!}{7!\,2!\,1!}\, 0.4^7\, 0.5^2\, 0.1^1 + \frac{10!}{6!\,4!\,0!}\, 0.4^6\, 0.5^4\, 0.1^0$$

$$= 0.0003 + 0.0147 + 0.0538 = 0.0688$$
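The case analysis in Exercise 1-14 can also be left to a short enumeration. The sketch below loops over every (wins, draws, losses) split of the 10 rounds and keeps those worth 16 points; `multinomial_pmf` is a hypothetical helper name, and rounds are assumed independent.

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """n!/(n1! n2! ... nk!) * p1^n1 * ... * pk^nk for independent rounds."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    prob = float(coef)
    for c, p in zip(counts, probs):
        prob *= p**c
    return prob

probs = (0.4, 0.5, 0.1)  # Win, Draw, Lose
rounds, target = 10, 16

total = 0.0
for wins in range(rounds + 1):
    for draws in range(rounds - wins + 1):
        losses = rounds - wins - draws
        if 2 * wins + draws == target:  # 2 points per win, 1 per draw
            total += multinomial_pmf((wins, draws, losses), probs)

print(round(total, 4))
```

Only the three splits listed in the solution satisfy the points condition, so the loop adds exactly the three terms of the formula.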


Week 2– Conditional Probability

The conditional probability of B, given A, denoted by $P(B \mid A)$, is defined as:

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}, \quad \text{provided } P(A) > 0$$

Two events A and B are independent if and only if

$$P(B \mid A) = P(B) \quad \text{or} \quad P(A \mid B) = P(A)$$

Otherwise, A and B are dependent.

Theorem: Two events A and B are independent if and only if

$$P(A \cap B) = P(A)\,P(B)$$

Exercise 2-1: In a classroom of 50 students, 28 are girls and 22 are boys. 16 of the girls are from Ankara, and 10 of the boys are from Ankara. We randomly choose a student.

a) Given that the student is a girl, what is the probability that she’s from Ankara?

b) Given that the student is from Ankara, what is the probability that the student is a girl?

c) Are these events independent?

Exercise 2-2: In a city, cars are colored black, white or red. 10% of all cars are black, 60% are white, the rest are red. In the past one year, 4% of all cars had an accident. 15% of all cars that had an accident are black, 45% are white, the rest are red.

a) Given that a car is red, what is the probability that it had an accident?

b) Are these events independent?

Solution: Using the values 0.04 × 0.15 = 0.006, 0.04 × 0.45 = 0.018 and 0.04 × 0.40 = 0.016, we can fill the table as follows:

              Black   White   Red
Accident      0.006   0.018   0.016
No Accident   0.094   0.582   0.284

a) $P(\text{Acc.} \mid R) = \dfrac{P(\text{Acc.} \cap R)}{P(R)} = \dfrac{0.016}{0.016 + 0.284} = 0.053$

b) These events are dependent: $P(\text{Acc.}) = 4\%$ and $P(\text{Acc.} \mid R) = 5.3\%$, so $P(\text{Acc.}) \ne P(\text{Acc.} \mid R)$.


Exercise 2-3: In a country, people are unemployed with 0.20 probability. 60% of the population is young. The probability that a person is unemployed given that he/she is young is 0.25.

What is the probability that an old person is unemployed? Are these events independent?
(Answer: 0.125. No, they are dependent.)

Exercise 2-4: A driver uses road 1 with probability 0.3 and road 2 with probability 0.7. On road 1, the probability that he sees a police car is 0.5, on road 2 it is 0.2. Given that he saw a police car, what is the probability he took road 1?

Exercise 2-5: There are 20 balls in a box. There is a 50% probability that all are white, 30% probability that 18 are white and 2 are black, 20% probability that 15 are white and 5 are black. We randomly choose two balls and see that both are white. What is the probability that all balls are white?

Exercise 2-6: The probability that a married man watches Muhtesem Yuzyıl is 0.5. The probability that a married woman watches it is 0.6. The probability that a man watches it, given that his wife does, is 0.7.

Given that a married man watches Muhtesem Yuzyıl, what is the probability that his wife watches it? Are these events independent?

Exercise 2-7: There are two roads, A and B, that I can take in the mornings. I prefer A 80% of the time. If I choose A, I arrive at work early with probability 0.1, on time with probability 0.8 and late with probability 0.1. For road B, these probabilities are 0.5, 0.3 and 0.2.

a) What is the probability that I arrive at work early?

b) Given that I arrived early, what is the probability that I have taken B?

Solution:

a) $0.8 \cdot 0.1 + 0.2 \cdot 0.5 = 0.18$

b) $\dfrac{0.2 \cdot 0.5}{0.18} = \dfrac{0.1}{0.18} = 0.5556$


Exercise 2-8: 65% of the customers of a coffee shop are women, the rest are men. A woman orders cappuccino with 50% probability, espresso with 30% probability and something else with 20% probability. For a man these numbers are 25%, 50% and 25%.

Given that a customer ordered espresso, what is the probability that the customer is a man?

Solution: Let E denote that the customer orders espresso, M denote that the customer is a man and W denote that the customer is a woman.

$$P(M \mid E) = \frac{P(M \cap E)}{P(E)} = \frac{0.35 \times 0.5}{0.35 \times 0.5 + 0.65 \times 0.3} = 0.47$$

Exercise 2-9: Three assistants, Bugra, Alphan and Oguz, are grading homeworks. Bugra grades 30% of the homeworks, Alphan grades 45% and Oguz grades the rest. Bugra makes 2 mistakes per 100 homeworks, Alphan makes 3 and Oguz makes 5.

I have a homework that was graded wrongly, but I don’t know who graded it. What is the probability it was Oguz?

Solution: Let O denote that Oguz has graded the homework and M denote that a mistake was made.

$$P(O \mid M) = \frac{P(O \cap M)}{P(M)} = \frac{0.25 \times 0.05}{0.25 \times 0.05 + 0.45 \times 0.03 + 0.30 \times 0.02} = 0.39$$
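Exercises 2-8 and 2-9 follow the same pattern: multiply each prior by its conditional probability, then divide by the total. A minimal sketch of that pattern, with a hypothetical helper name `posterior`, applied to the numbers of Exercise 2-9:

```python
def posterior(priors, likelihoods):
    """P(hypothesis | evidence) for each hypothesis, from P(hypothesis) and P(evidence | hypothesis)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}  # P(hypothesis and evidence)
    evidence = sum(joint.values())                           # total probability of the evidence
    return {h: joint[h] / evidence for h in joint}

# Share of homeworks graded by each assistant (Oguz grades the remaining 25%)
priors = {"Bugra": 0.30, "Alphan": 0.45, "Oguz": 0.25}
# Probability that a homework graded by each assistant contains a mistake
mistake_rate = {"Bugra": 0.02, "Alphan": 0.03, "Oguz": 0.05}

result = posterior(priors, mistake_rate)
print(round(result["Oguz"], 2))
```

The same function answers Exercise 2-8 when called with the priors for men and women and the espresso probabilities 0.5 and 0.3.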

Exercise 2-10: I feel sad 5%, happy 35% and normal 60% of the time. On any given day, if I feel normal, I do not go to the canteen. If I feel sad, I go to the canteen with probability 70%. If I feel happy, I go to the canteen with probability 30%.

At the canteen, I order either black coffee or coffee with milk. On happy days the probabilities are 50% - 50%, on sad days 10% - 90%.

Today I was at the canteen, drinking coffee with milk. What is the probability I am feeling happy?

Solution: The event Sad+Canteen+Coffee with Milk has probability 0.05 × 0.70 × 0.90 = 0.0315. The alternative event Happy+Canteen+Coffee with Milk has probability 0.35 × 0.30 × 0.50 = 0.0525. We know one of these happened, so using conditional probability formulas, I am happy with probability:

$$\frac{0.0525}{0.0525 + 0.0315} = 0.625 = 62.5\%$$


Exercise 2-11: Cagdas, Dogukan and Utku are working on a project. The manager, on average, gives 35% of jobs to Cagdas, 25% to Dogukan and 40% to Utku.

Cagdas completes 90%, Dogukan completes 80% and Utku completes 70% of his jobs within the same week.

At the end of the week, the manager is considering an incomplete job, but he forgot whose job that was. What is the probability that it belongs to Cagdas?

Solution:

             Cagdas   Dogukan   Utku
Complete     0.315    0.20      0.28
Incomplete   0.035    0.05      0.12

Given that the job is incomplete, the probability that it is Cagdas’s job is:

$$\frac{0.035}{0.035 + 0.05 + 0.12} = 0.1707$$

Exercise 2-12: In a single-player game, we toss a fair die twice, and we win if the sum is 10 or more.

a) Given that Erdi gets 6 in the first toss, what is the probability that he wins?

b) Given that Volkan wins, what is the probability that he got 6 in the first toss?

Solution:

a) Winning results for the second toss: {4, 5, 6}, all results: {1, 2, . . . , 6}

$$\Rightarrow \text{Winning probability} = \frac{3}{6} = \frac{1}{2}$$

b) Winning combinations: 6-6, 6-5, 6-4, 5-6, 5-5, 4-6. Three of these six have 6 in the first toss.

$$\Rightarrow P(\text{6 in the first toss} \mid \text{win}) = \frac{3}{6} = \frac{1}{2}$$
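Both parts of Exercise 2-12 can be checked by listing the 36 equally likely outcomes and counting. The sketch below does this with exact fractions; the function name `prob` and the two predicates are hypothetical names used only here.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first toss, second toss) pairs
outcomes = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given), by counting outcomes inside the conditioning event."""
    conditioned = [o for o in outcomes if given(o)]
    favourable = [o for o in conditioned if event(o)]
    return Fraction(len(favourable), len(conditioned))

def wins(o):
    return o[0] + o[1] >= 10

def first_is_six(o):
    return o[0] == 6

print(prob(wins, given=first_is_six))  # part a)
print(prob(first_is_six, given=wins))  # part b)
print(prob(wins))                      # unconditional winning probability
```

The two conditional probabilities happen to be equal here, but they answer different questions: the first conditions on 6 outcomes with a 6 in the first toss, the second on the 6 winning outcomes.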

Exercise 2-13: We ask 80 people their preferences about beverages. 33 prefer cola zero while the rest prefer normal cola. 55 prefer coffee without sugar, the rest prefer coffee with sugar. 27 people prefer both cola zero and coffee without sugar.

Given that a person prefers coffee with sugar, what is the probability he/she drinks normal cola?

Solution:

                Zero   Normal
Without Sugar   27     28
With Sugar      6      19

$$P(\text{Normal} \mid \text{With Sugar}) = \frac{19}{19 + 6} = 0.76$$

Exercise 2-14: 70% of the students in a class study hard, the rest don’t. For students studying hard, the probability of passing the course is 85%, for others, 25%. Given that a student has passed the course, what is the probability that he/she studied hard?

Solution:

       Studied   Not Studied
Pass   0.595     0.075
Fail   0.105     0.225

$$P(\text{Studied} \mid \text{Passed}) = \frac{0.595}{0.595 + 0.075} = 0.89$$


Week 3– Probability Distributions

A Random Variable is a function that associates a real number with each element in the sample space. Random variables can be discrete or continuous.

Example: There are 10 balls in a box. 4 of them are black, 6 are white. We draw two balls without replacement. The number of black balls is a random variable. (discrete)

Exercise 3-1: We take cell phones of 4 students and redistribute them randomly. The number of students getting the correct phone is a random variable. (discrete)

Example: The time between the passing of two trucks along the way is a random variable. (continuous)

Discrete Probability Distributions

f(x) is a probability function, or probability distribution, of the discrete random variable X if:

• $f(x) \ge 0$

• $\sum_x f(x) = 1$

• $P(X = x) = f(x)$

Let the discrete random variable X have the probability distribution f(x). The cumulative distribution function F(x) is defined as

$$F(x) = P(X \le x) = \sum_{t \le x} f(t)$$

Exercise 3-2: Among a shipment of 20 laptops, 3 are defective. We purchase 2. Find the probability distribution for the number of defectives.

Exercise 3-3: A box contains 2 black and 5 white balls. We randomly select 3. If X is the number of black balls we choose, find the probability distribution of X. Then, find the cumulative distribution function.
(Answer: f(0) = 2/7, f(1) = 4/7, f(2) = 1/7; F(x) = 0 for x < 0, 2/7 for 0 ≤ x < 1, 6/7 for 1 ≤ x < 2, 1 for x ≥ 2)


Continuous Probability Distributions

f(x) is a probability density function of the continuous random variable X if:

• $f(x) \ge 0$

• $\int_{-\infty}^{\infty} f(x)\,dx = 1$

• $P(a < X < b) = \int_a^b f(x)\,dx$

The Cumulative Distribution Function F(x) of a continuous random variable X with density function f(x) is:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$

Therefore $P(a < X < b) = F(b) - F(a)$.

Exercise 3-4: Determine c such that $f(x) = c(x^2 + 4)$ for x = 0, 1, 2, 3 is a probability distribution.
(Answer: 1/30)

Exercise 3-5: Let the error in an experiment be given by

$$f(x) = \begin{cases} \dfrac{x^2}{3} & -1 < x < 2 \\ 0 & \text{elsewhere} \end{cases}$$

a) Verify that f(x) is a density function.

b) Find $P(0 < X \le 1)$

Exercise 3-6: Consider the density function

$$f(x) = \begin{cases} k\sqrt{x} & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$

a) Find k.

b) Find $P(0.3 < X < 0.6)$
(Answer: 3/2, 0.3)


Exercise 3-7: The time to failure, in years, of a piece of electronic equipment has the density

$$f(t) = \begin{cases} 0 & t < 0 \\ \dfrac{e^{-t/3}}{3} & t \ge 0 \end{cases}$$

The company will replace any product that had a lifetime of less than 1 year. What proportion of the products will they replace?

Solution:

$$\int_0^1 \frac{e^{-t/3}}{3}\,dt = -e^{-t/3}\,\Big|_0^1 = 1 - e^{-1/3} = 0.283$$

They will replace 28.3% of the products.
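When an antiderivative is not at hand, a probability such as the one in Exercise 3-7 can be approximated numerically and compared with the closed form. The sketch below uses a plain midpoint rule; `density` and `integrate` are hypothetical helper names, and the step count is an arbitrary choice.

```python
import math

def density(t):
    """f(t) = exp(-t/3)/3 for t >= 0, and 0 for t < 0."""
    return math.exp(-t / 3) / 3 if t >= 0 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

numeric = integrate(density, 0.0, 1.0)  # P(0 < T < 1) by numerical integration
closed_form = 1 - math.exp(-1 / 3)      # value obtained from the antiderivative

print(round(numeric, 4), round(closed_form, 4))
```

The same `integrate` helper can be pointed at the other densities of this week by replacing `density` and the limits.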

Exercise 3-8: The probability density function of a continuous random variable X is:

$$f(x) = \begin{cases} k(1 - x)^4 & 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$$

a) Find k

b) Find $P(0.8 < X)$

Solution:

a) $\int_0^1 k(1 - x)^4\,dx = k \int_1^0 u^4\,(-du) = \dfrac{k}{5} = 1 \;\Rightarrow\; k = 5$

b) $P(0.8 < X) = \int_{0.8}^{\infty} f(x)\,dx = \int_{0.8}^{1} 5(1 - x)^4\,dx = 0.2^5 = 0.00032$

Exercise 3-9: The waiting time, in hours, for a police radar is a continuous random variable with probability density function:

$$f(x) = \begin{cases} 0 & x < 0 \\ 8 \exp(-8x) & x \ge 0 \end{cases}$$

Find the probability of waiting less than 12 minutes.
(Answer: 0.7981)


Exercise 3-10: The particle size (in micrometers) distribution in a chemical mixture is given by

$$f(x) = \begin{cases} 3x^{-4} & x > 1 \\ 0 & \text{elsewhere} \end{cases}$$

Find the probability that the particle size is greater than 4 micrometers.

Exercise 3-11: A continuous random variable X has the probability density function

$$f(x) = \begin{cases} 0 & x < 2 \\ k(1 + x) & 2 \le x \le 5 \\ 0 & 5 < x \end{cases}$$

a) Find k.

b) Find $P(4 \le X \le 8)$

Solution:

a) $\int_{-\infty}^{\infty} f(x)\,dx = \int_2^5 k(1 + x)\,dx = \dfrac{k(1 + x)^2}{2}\,\Big|_2^5 = \dfrac{27k}{2} = 1 \;\Rightarrow\; k = \dfrac{2}{27}$

b) $P(4 \le X \le 8) = \int_4^8 f(x)\,dx = \int_4^5 \dfrac{2}{27}(1 + x)\,dx = \dfrac{11}{27} = 0.4074$


Exercise 3-12: Emre hoca announces exam results x hours after the exam ends. The probability density function of x is:

$$f(x) = \begin{cases} k e^{-x/24} & 18 < x \\ 0 & \text{elsewhere} \end{cases}$$

a) Find k.

b) Find the probability that an exam result is announced within 36 hours of the end of the exam.

Solution:

a) $k \int_{18}^{\infty} e^{-x/24}\,dx = 1 \;\Rightarrow\; k = \dfrac{e^{3/4}}{24} = 0.0882$

b) $P = k \int_{18}^{36} e^{-x/24}\,dx = 0.5276$

Exercise 3-13: The probability density function of a random variable is:

$$f(x) = \begin{cases} k e^{-x/7} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$

a) Find k.

b) Find $P(4 < X < 5)$

c) Find $P(5 < X)$

Solution:

a) $\int_0^{\infty} k e^{-x/7}\,dx = 1 \;\Rightarrow\; k = \dfrac{1}{7}$

b) $P(4 < X < 5) = \int_4^5 \dfrac{e^{-x/7}}{7}\,dx = -e^{-5/7} + e^{-4/7} = 0.0752$

c) $P(5 < X) = \int_5^{\infty} \dfrac{e^{-x/7}}{7}\,dx = e^{-5/7} = 0.4895$


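Exercise 3-14: The concentration of a pollutant is a continuous random variable with probability density function:

$$f(x) = \begin{cases} \dfrac{c}{x^4} & x > 1 \\ 0 & \text{elsewhere} \end{cases}$$

a) Find c.

b) Find $P(3 < X < 4)$

Solution:

a) $\int_1^{\infty} \dfrac{c}{x^4}\,dx = \dfrac{c}{-3x^3}\,\Big|_1^{\infty} = \dfrac{c}{3} = 1 \;\Rightarrow\; c = 3$

b) $P(3 < X < 4) = \int_3^4 \dfrac{3}{x^4}\,dx = \dfrac{-1}{x^3}\,\Big|_3^4 = -\dfrac{1}{64} + \dfrac{1}{27} = 0.0214$

Exercise 3-15: A random variable X has density function

$$f(x) = \begin{cases} x & 0 < x < 1 \\ 2 - x & 1 \le x < 2 \\ 0 & \text{elsewhere} \end{cases}$$

Find the variance of X.

Solution:

$$\mu = \int_0^1 x^2\,dx + \int_1^2 (2x - x^2)\,dx = \frac{x^3}{3}\,\Big|_0^1 + \left(x^2 - \frac{x^3}{3}\right)\Big|_1^2 = 1$$

$$\sigma^2 = E(X^2) - \mu^2 = \int_0^1 x^3\,dx + \int_1^2 (2x^2 - x^3)\,dx - 1^2 = \frac{7}{6} - 1 = 0.1667$$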
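Because the density in Exercise 3-15 is piecewise polynomial, its moments can be computed exactly with rational arithmetic. The sketch below is one possible way to do that; the `pieces` representation and the function name `moment` are assumptions made for this illustration.

```python
from fractions import Fraction

# Each piece: (lower limit, upper limit, coefficients c0, c1, ... of c0 + c1*x + ...)
pieces = [
    (0, 1, [0, 1]),   # f(x) = x      on 0 < x < 1
    (1, 2, [2, -1]),  # f(x) = 2 - x  on 1 <= x < 2
]

def moment(k):
    """E(X^k) = integral of x^k f(x) dx, integrated term by term on each piece."""
    total = Fraction(0)
    for lower, upper, coeffs in pieces:
        for power, c in enumerate(coeffs):
            exponent = power + k + 1  # antiderivative of c * x^(power + k)
            total += Fraction(c, exponent) * (upper**exponent - lower**exponent)
    return total

area = moment(0)                  # should be 1 for a valid density
mean = moment(1)                  # mu
variance = moment(2) - mean**2    # E(X^2) - mu^2

print(area, mean, variance)
```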


Week 4– Joint Probability Distributions

Discrete case:

f(x, y) is a probability mass function, or joint probability distribution, of the discrete random variables X and Y if:

• $f(x, y) \ge 0$

• $\sum_x \sum_y f(x, y) = 1$

• $P(X = x, Y = y) = f(x, y)$

Continuous case:

f(x, y) is a joint density function of the continuous random variables X and Y if:

• $f(x, y) \ge 0$

• $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$

• $P((X, Y) \in A) = \iint_A f(x, y)\,dx\,dy$ for any region A in the xy-plane

Exercise 4-1: A box contains 3 blue, 2 red and 3 green pens. We randomly choose 2 pens. If X is the number of blue and Y is the number of red pens, find

a) the joint probability function f(x, y).

b) $P((X, Y) \in A)$ where A is the region $\{(x, y) \mid x + y \le 1\}$

Exercise 4-2: Let

$$f(x, y) = \begin{cases} \dfrac{2}{5}(2x + 3y) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{elsewhere} \end{cases}$$

a) Verify that it is a probability density function.

b) Calculate the probability that $0 < x < \frac{1}{2}$ and $\frac{1}{4} < y < \frac{1}{2}$.
(Answer: 13/160)


Marginal Distributions

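Given a joint probability distribution f(x, y), we can find the probability distribution of x only or y only as follows. In the discrete case:

$$g(x) = \sum_y f(x, y) \quad \text{and} \quad h(y) = \sum_x f(x, y)$$

In the continuous case:

$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{and} \quad h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$$

Exercise 4-3: Let

$$f(x, y) = \begin{cases} \dfrac{x(1 + 3y^2)}{4} & 0 < x < 2,\ 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$

Find marginal distributions g(x) and h(y).
(Answer: $x/2$, $(1 + 3y^2)/2$)

Exercise 4-4: Let

$$f(x, y) = \begin{cases} 10xy^2 & 0 < x < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$

Find marginal distributions g(x) and h(y).
(Answer: $10x(1 - x^3)/3$, $5y^4$)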

Statistical Independence

The random variables X and Y are said to be statistically independent if and only if

f(x, y) = g(x)h(y)

Exercise 4-5: Let X and Y have the distribution given in the table:

f(x, y)   x = 1   x = 2
y = 1     0.2     0.3
y = 2     0.4     0.1

a) Find the marginal distributions of X and Y.

b) Are they statistically independent?


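Exercise 4-6: Given a joint density function

$$f(x, y) = \begin{cases} \dfrac{3y^2}{26\,e^x} & 0 \le x,\ 1 \le y \le 3 \\ 0 & \text{elsewhere} \end{cases}$$

are X and Y statistically independent?

Exercise 4-7: Given a joint density function

$$f(x, y) = \begin{cases} \dfrac{4 + 6x + 3y}{128} & 0 \le x + y \le 4 \\ 0 & \text{elsewhere} \end{cases}$$

are X and Y statistically independent?

Exercise 4-8: Given a joint density function

$$f(x, y) = \begin{cases} \dfrac{x^2 + 2y^2}{2500} & -5 < x < 5,\ -5 < y < 5 \\ 0 & \text{elsewhere} \end{cases}$$

what is the probability that 2 < x and 3 < y?

Solution:

$$\int_3^5 \int_2^5 \frac{x^2 + 2y^2}{2500}\,dx\,dy = \frac{1}{2500} \int_3^5 \left(\frac{x^3}{3} + 2y^2 x\right)\Big|_{x=2}^{x=5}\,dy = \frac{1}{2500} \int_3^5 \left(\frac{117}{3} + 6y^2\right) dy$$

$$= \frac{1}{2500} \left(\frac{117y}{3} + 2y^3\right)\Big|_3^5 = \frac{274}{2500} = 0.1096$$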


Exercise 4-9: A coffee factory investigates the relation between wind speed and quality of coffee produced that day. They obtain the following table for probabilities:

                     Wind
                     Calm (No wind)   Light Wind   Strong Wind
Quality   Low        0.03             0.05         0.02
          Average    0.225            0.375        0.15
          High       0.045            0.075        0.03

a) Find the marginal distributions for quality and wind speed.

b) Are they independent?

c) Find the probability that we obtain high quality coffee, given that there is strong wind.

d) Find the probability that there is strong wind, given that we obtain high quality coffee.

Solution:

a) Quality g(x): Low: 0.1, Average: 0.75, High: 0.15
Wind h(y): Calm: 0.3, Light: 0.5, Strong: 0.2

b) Multiplication of these numbers gives exactly the above table. In other words, f(x, y) = g(x) · h(y). Therefore, wind speed and quality are independent.

c) $\dfrac{0.03}{0.03 + 0.02 + 0.15} = 0.15$

d) $\dfrac{0.03}{0.045 + 0.075 + 0.03} = 0.20$

Exercise 4-10: Age and income distribution in a country is given by the following table in percentages:

                         Age
Income                   20-34   35-49   50-64   65-
Less than $20 000        8       7       4       3
$20 000-$40 000          13      10      8       6
$40 000-$60 000          5       6       8       7
Greater than $60 000     2       2       5       6

a) Find the marginal distributions for age and income.

b) Are they independent?


Exercise 4-11: Given a joint density function

$$f(x, y) = \begin{cases} \dfrac{x^2 + 2y^2}{2500} & -5 < x < 5,\ -5 < y < 5 \\ 0 & \text{elsewhere} \end{cases}$$

are X and Y statistically independent?

Solution:

$$g(x) = \int_{-5}^{5} \frac{x^2 + 2y^2}{2500}\,dy = \frac{x^2}{250} + \frac{1}{15}, \quad -5 < x < 5$$

$$h(y) = \int_{-5}^{5} \frac{x^2 + 2y^2}{2500}\,dx = \frac{y^2}{125} + \frac{1}{30}, \quad -5 < y < 5$$

$g(x) \cdot h(y) \ne f(x, y)$ ⇒ They are dependent

Exercise 4-12: A video download site classifies videos as short, medium or long in terms of time and as music or other in terms of content. These are the statistics:

          Content
Length    Music   Other
Short     0.20    0.20
Medium    0.10    0.15
Long      0.30    0.05

Are the parameters of length and content statistically independent?

Solution: Marginal distributions are:

Short   Medium   Long
0.40    0.25     0.35

and

Music   Other
0.60    0.40

But 0.40 × 0.60 = 0.24 ≠ 0.20, therefore they are dependent.
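The table check in Exercise 4-12 (sum rows and columns, then compare each cell with the product of its marginals) is easy to automate. A minimal sketch, with hypothetical function names `marginals` and `is_independent` and a small tolerance for floating-point rounding:

```python
import math

# Joint probabilities from Exercise 4-12, keyed by (length, content)
joint = {
    ("Short", "Music"): 0.20, ("Short", "Other"): 0.20,
    ("Medium", "Music"): 0.10, ("Medium", "Other"): 0.15,
    ("Long", "Music"): 0.30, ("Long", "Other"): 0.05,
}

def marginals(table):
    """Return the row and column marginal distributions of a joint table."""
    rows, cols = {}, {}
    for (r, c), p in table.items():
        rows[r] = rows.get(r, 0.0) + p
        cols[c] = cols.get(c, 0.0) + p
    return rows, cols

def is_independent(table, tol=1e-9):
    """True if every cell equals the product of its two marginals (within tol)."""
    rows, cols = marginals(table)
    return all(math.isclose(p, rows[r] * cols[c], abs_tol=tol)
               for (r, c), p in table.items())

length, content = marginals(joint)
print(length, content)
print(is_independent(joint))
```

A single mismatching cell is enough to conclude dependence, which is why the written solution stops after checking the Short/Music entry.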


Exercise 4-13: At the end of a semester, 80% of the students taking a statistics course have passed. 60% of all students have attended the lectures regularly. Suppose the distribution of students into four categories is as follows. Are these factors statistically independent or not? Explain separately for each case.

a)
         Attended   Did Not Attend
Passed   0.50       0.30
Failed   0.10       0.10

b)
         Attended   Did Not Attend
Passed   0.48       0.32
Failed   0.12       0.08

c)
         Attended   Did Not Attend
Passed   0.60       0.20
Failed   0          0.20

Solution: For all three cases, the marginal distributions are:

f(x):   Attended 0.60,   Did Not Attend 0.40

g(y):   Passed 0.80,   Failed 0.20

If they are statistically independent, then h(x, y) = f(x) · g(y) and we obtain the table

         Attended   Did Not Attend
Passed   0.48       0.32
Failed   0.12       0.08

Therefore

a) h(x, y) ≠ f(x) · g(y): Statistically Dependent

b) h(x, y) = f(x) · g(y): Statistically Independent

c) h(x, y) ≠ f(x) · g(y): Statistically Dependent


Week 5– Mathematical Expectation

Expected Value

Let X be a random variable with probability distribution f(x). The mean, or expected value, of X is:

$$\mu = E(X) = \sum_x x f(x)$$

if X is discrete and

$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

if X is continuous.

Exercise 5-1: A lot of 7 components contains 4 good and 3 defective ones. We take a sample of 3. Find the expected value of the number of good components.
(Answer: 12/7)

Exercise 5-2: Let X be the random variable that denotes the life in hours of a certain electronic device. The probability density function is

$$f(x) = \begin{cases} \dfrac{20000}{x^3} & x > 100 \\ 0 & \text{elsewhere} \end{cases}$$

Find the expected life of this type of device.
(Answer: 200)

Let X be a random variable with probability distribution f(x). The expected value of g(X) is:

$$E(g(X)) = \sum_x g(x) f(x)$$

if X is discrete and

$$E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$

if X is continuous.

Exercise 5-3: The number of sales per month has the probability distribution:

x      4      5      6     7     8     9
f(x)   1/12   1/12   1/4   1/4   1/6   1/6

If the salesman is paid a bonus of 2X − 1, find the expected amount of bonus.
(Answer: 12.67)

Theorem: If a and b are constants, E(aX + b) = aE(X) + b.

Exercise 5-4: Solve the previous problem with a second method.
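The two routes to E(2X − 1) can be compared side by side with exact fractions. The sketch below assumes the sales distribution of Exercise 5-3 reads f(4) = f(5) = 1/12, f(6) = f(7) = 1/4, f(8) = f(9) = 1/6, which sums to 1 and agrees with the stated answer; `expectation` is a hypothetical helper name.

```python
from fractions import Fraction as F

# Assumed reading of the distribution in Exercise 5-3
dist = {4: F(1, 12), 5: F(1, 12), 6: F(1, 4), 7: F(1, 4), 8: F(1, 6), 9: F(1, 6)}

def expectation(dist, g=lambda x: x):
    """E(g(X)) = sum of g(x) f(x) over all x."""
    return sum(g(x) * p for x, p in dist.items())

direct = expectation(dist, lambda x: 2 * x - 1)  # definition of E(g(X))
via_theorem = 2 * expectation(dist) - 1          # E(aX + b) = a E(X) + b

print(direct, via_theorem, float(direct))
```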


Variance

Let X be a random variable with probability distribution f(x) and mean µ. The variance of X is

$$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 f(x), \quad \text{if } X \text{ is discrete}$$

$$\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx, \quad \text{if } X \text{ is continuous}$$

The square root of the variance, σ, is called the standard deviation of X.

Theorem: The variance of a random variable X is $\sigma^2 = E(X^2) - \mu^2$.

Exercise 5-5: Let the random variable X represent the number of typographical errors on a page. The probability distribution is given as:

x      0      1      2      3
f(x)   0.51   0.38   0.10   0.01

Calculate $\sigma^2$.
(Answer: 0.4979)

Exercise 5-6: The weekly demand for a product is a continuous random variable X having the probability density

$$f(x) = \begin{cases} 2(x - 1) & 1 < x < 2 \\ 0 & \text{elsewhere} \end{cases}$$

Find the mean and variance of X.
(Answer: 5/3, 1/18)

Exercise 5-7: A random variable X has density function

$$f(x) = \begin{cases} x & 0 < x < 1 \\ 2 - x & 1 \le x < 2 \\ 0 & \text{elsewhere} \end{cases}$$

Find the mean and the variance of X.

Exercise 5-8: A random variable X has density function

$$f(x) = \begin{cases} x & 0 < x < 1 \\ 2 - x & 1 \le x < 2 \\ 0 & \text{elsewhere} \end{cases}$$

Find the expected value of $Y = 3X^2 - 4X$.
(Answer: $3 \cdot \frac{7}{6} - 4 \cdot 1 = -\frac{1}{2}$)


Chebyshev’s Theorem

Theorem: The probability that any random variable X will take a value within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$. That is:

$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$$

Exercise 5-9: A random variable X has a mean µ = 10 and a variance $\sigma^2 = 4$. Using Chebyshev’s theorem, find a lower bound for P(5 < X < 15).
(Answer: p ≥ 21/25)

Exercise 5-10: Compute $P(\mu - 2\sigma < X < \mu + 2\sigma)$ where X has the density function

$$f(x) = \begin{cases} 6x(1 - x) & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$

and compare with the result given in Chebyshev’s theorem.
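A sketch of the comparison asked for in Exercise 5-10, under the assumption that the moments are computed by hand first: integrating gives µ = 1/2, E(X²) = 3/10 and F(x) = 3x² − 2x³ on (0, 1). The function name `cdf` is a hypothetical helper.

```python
import math

def cdf(x):
    """F(x) = 3x^2 - 2x^3 for the density f(x) = 6x(1 - x) on 0 < x < 1."""
    x = min(max(x, 0.0), 1.0)  # F is 0 below the interval and 1 above it
    return 3 * x**2 - 2 * x**3

mu = 0.5              # integral of x * 6x(1 - x) over (0, 1)
second_moment = 0.3   # integral of x^2 * 6x(1 - x) over (0, 1)
sigma = math.sqrt(second_moment - mu**2)

k = 2
exact = cdf(mu + k * sigma) - cdf(mu - k * sigma)  # P(mu - 2 sigma < X < mu + 2 sigma)
bound = 1 - 1 / k**2                               # Chebyshev's lower bound

print(round(exact, 4), bound)
```

The exact probability comes out well above the Chebyshev bound of 0.75, which is expected: the theorem holds for every distribution, so it is rarely tight for a specific one.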

Exercise 5-11: Find the mean and variance of a random variable X whose probability distribution is:

x      0      5      10     20
f(x)   0.17   0.33   0.41   0.09

Solution: $\mu = E(X) = 0 \times 0.17 + 5 \times 0.33 + 10 \times 0.41 + 20 \times 0.09 = 7.55$

$\sigma^2 = E(X^2) - \mu^2 = 0 \times 0.17 + 25 \times 0.33 + 100 \times 0.41 + 400 \times 0.09 - 7.55^2 = 28.2475$
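The arithmetic in Exercise 5-11 follows the shortcut σ² = E(X²) − µ² directly, and a few lines of Python reproduce it. `mean_and_variance` is a hypothetical helper name.

```python
def mean_and_variance(dist):
    """Return (mu, sigma^2) of a discrete distribution given as {x: f(x)}."""
    mu = sum(x * p for x, p in dist.items())
    second_moment = sum(x**2 * p for x, p in dist.items())
    return mu, second_moment - mu**2

dist = {0: 0.17, 5: 0.33, 10: 0.41, 20: 0.09}
mu, var = mean_and_variance(dist)

print(round(mu, 2), round(var, 4))
```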

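Exercise 5-12: The probability density function of a random variable is:

$$f(x) = \begin{cases} \dfrac{3}{116}\left(1 + 7x - x^2\right) & 0 < x < 4 \\ 0 & \text{elsewhere} \end{cases}$$

Find $\sigma^2$ (the variance).

Solution:

$$\mu = \int_0^4 \frac{3}{116}\left(x + 7x^2 - x^3\right) dx = 2.4138$$

$$\sigma^2 = E(X^2) - \mu^2 = \int_0^4 \frac{3}{116}\left(x^2 + 7x^3 - x^4\right) dx - \mu^2 = 1.0150$$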


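Exercise 5-13: The length of time (in seconds) that cars have to wait at a traffic light has the density function:

$$f(x) = \begin{cases} \frac{1}{5} e^{-x/5} & 0 < x \\ 0 & \text{elsewhere} \end{cases}$$

a) Find $E(X)$

b) Find $E(X^2)$

Solution: Using integration by parts, we can show that, for any nonzero a:

$$\int x e^{ax}\, dx = \frac{x e^{ax}}{a} - \frac{e^{ax}}{a^2}$$

$$\int x^2 e^{ax}\, dx = \frac{x^2 e^{ax}}{a} - \frac{2x e^{ax}}{a^2} + \frac{2 e^{ax}}{a^3}$$

a) $E(X) = \int_0^\infty \frac{x e^{-x/5}}{5}\, dx = 5$

b) $E(X^2) = \int_0^\infty \frac{x^2 e^{-x/5}}{5}\, dx = 50$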

Exercise 5-14: Find $\sigma^2$ (the variance) for a discrete random variable with the probability distribution:

x   5     6     7     8
p   0.50  0.20  0.20  0.10

Solution: $\mu = 5 \times 0.5 + 6 \times 0.2 + 7 \times 0.2 + 8 \times 0.1 = 5.9$

$\sigma^2 = E(X^2) - \mu^2 = 25 \times 0.5 + 36 \times 0.2 + 49 \times 0.2 + 64 \times 0.1 - 5.9^2 = 1.09$

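Exercise 5-15: A random variable X has density function

$$f(x) = \begin{cases} x^2/9 & 0 < x < 3 \\ 0 & \text{elsewhere} \end{cases}$$

Find the variance of X.

Solution:

$$\mu = \int_0^3 \frac{x^3}{9}\, dx = \left.\frac{x^4}{36}\right|_0^3 = \frac{9}{4}$$

$$\sigma^2 = E(X^2) - \mu^2 = \int_0^3 \frac{x^4}{9}\, dx - \left(\frac{9}{4}\right)^2 = \frac{27}{80} = 0.3375$$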


Week 6– Binomial Distributions

Bernoulli Process: In a Bernoulli process, we make repeated independent trials. The result of each trial is either success or failure (there are only two options). The probability of success (p) remains constant from trial to trial.

Exercise 6-1: We select a card from a standard deck. We replace the card and shuffle after each trial. What is the probability that we get 3 hearts after 6 trials?

(If we do this without replacement, it is no longer a Bernoulli process.)

Binomial Distribution: In a Bernoulli process, if the probability of success is p and the probability of failure is q = 1 − p, the probability of having x successes after n trials is given by:

$$b(x; n, p) = \binom{n}{x} p^x q^{n-x}$$

Note that

$$\sum_{x=0}^{n} b(x; n, p) = 1$$

The mean of the binomial distribution is µ = np and the variance is σ2 = npq.

Exercise 6-2: The probability that a patient recovers after a heart operation is 0.9. Find the probability that,

a) Out of the next 10 patients, 5 or more recover.

b) Out of the next 8 patients, 4 or more recover.

Exercise 6-3: Tests show that only 30% of the cars have correct tire pressure. We test 7 cars. Find the probability that

a) 2 or more have correct pressure.

b) 3 to 6 (inclusive) have correct pressure.

Exercise 6-4: According to statistics of the finance ministry, one in five cars has unpaid tax. Suppose we check 10 randomly chosen cars.

a) What is the probability that exactly 4 of them have unpaid tax?

b) What is the probability that 4 or more of them have unpaid tax?

(Answer: 0.088, 0.121)
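The binomial formula is easy to evaluate directly. The following is a minimal sketch using only the Python standard library; the helper names `binomial_pmf` and `binomial_tail` are chosen here for illustration. It reproduces the two answers of Exercise 6-4.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_tail(x_min: int, n: int, p: float) -> float:
    """Probability of x_min or more successes in n trials."""
    return sum(binomial_pmf(x, n, p) for x in range(x_min, n + 1))

# Exercise 6-4: n = 10 cars, p = 1/5 chance of unpaid tax
exactly_four = binomial_pmf(4, 10, 0.2)
four_or_more = binomial_tail(4, 10, 0.2)
print(round(exactly_four, 3), round(four_or_more, 3))  # 0.088 0.121
```

The same two helpers can be used for the "k or more" questions in Exercises 6-2 and 6-3 by changing the arguments.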


Multinomial Distribution

If each trial has more than 2 possible outcomes, we have a multinomial distribution. If k outcomes result with probabilities $p_1, \ldots, p_k$, then after n independent trials

$$f(x_1, \ldots, x_k; p_1, \ldots, p_k; n) = \frac{n!}{x_1! \cdots x_k!}\, p_1^{x_1} \cdots p_k^{x_k}$$

where $\sum x_i = n$ and $\sum p_i = 1$.

Exercise 6-5: At a traffic light, the green signal stays on for 15 seconds, the yellow for 5 seconds and the red for 40 seconds. We pass through it 5 times. We encounter the green light $X_1$ times, the yellow $X_2$ times and the red $X_3$ times. Find the distribution of $X_1, X_2, X_3$.

(Answer: $\frac{5!}{x_1!\, x_2!\, x_3!}\, 0.25^{x_1}\, 0.083^{x_2}\, 0.67^{x_3}$)

Exercise 6-6: In a large classroom, 55% of the students are from CENG, 35% are from ECE and 10% are from IE departments. We randomly choose 6 students. What is the probability that 3 are from CENG, 2 are from ECE and 1 is from IE?

Solution: Using the multinomial distribution,

$$p = \frac{6!}{3!\, 2!\, 1!}\, 0.55^3 \times 0.35^2 \times 0.10 = 0.1223$$

Exercise 6-7: In a city, 40% of the cars use gasoline as fuel, 35% use LPG and 25% use diesel. We randomly choose

a) 4 cars

b) 40 cars.

Find the probability that half use gasoline and half use LPG.

Solution:

a) $\binom{4}{2}\, 0.4^2 \times 0.35^2 = 0.1176$

b) $\binom{40}{20}\, 0.4^{20} \times 0.35^{20} = 1.15 \times 10^{-6}$
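The multinomial formula can be checked with a few lines of code. This is a sketch only; the function name `multinomial_pmf` is an illustrative choice, not something defined in the notes. It evaluates Exercise 6-6 and both parts of Exercise 6-7, treating "no diesel cars" as a count of 0 for the third outcome.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """n! / (x_1! ... x_k!) * p_1^x_1 ... p_k^x_k with n = sum(counts)."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)  # stays an integer at every step
    return coeff * prod(p**c for p, c in zip(probs, counts))

p_66 = multinomial_pmf((3, 2, 1), (0.55, 0.35, 0.10))     # Exercise 6-6
p_67a = multinomial_pmf((2, 2, 0), (0.40, 0.35, 0.25))    # Exercise 6-7 a)
p_67b = multinomial_pmf((20, 20, 0), (0.40, 0.35, 0.25))  # Exercise 6-7 b)
print(f"{p_66:.4f} {p_67a:.4f} {p_67b:.3e}")
```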


Exercise 6-8: In a court, there are 9 judges. They make the decision "Guilty" or "Innocent" independently. Each judge has the same rate of error. They find an innocent person guilty 20% of the time, and a guilty person innocent 30% of the time. An accused person is considered guilty if 7 or more judges find him guilty.

a) Suppose you are innocent. What is the probability that the court will find you guilty?

b) Suppose you are guilty. What is the probability that the court will find you innocent?

Solution:

a) $\binom{9}{7}\, 0.2^7 \times 0.8^2 + \binom{9}{8}\, 0.2^8 \times 0.8 + \binom{9}{9}\, 0.2^9 = 0.000314$

b) $1 - \left[\binom{9}{7}\, 0.7^7 \times 0.3^2 + \binom{9}{8}\, 0.7^8 \times 0.3 + \binom{9}{9}\, 0.7^9\right] = 1 - 0.4628 = 0.5372$

Exercise 6-9: You receive a large shipment of electronic components. It is either "good", which means 5% are defective, or "bad", which means 15% are defective. You randomly choose a sample of 20 components and test them. You reject the shipment if there are 2 or more defectives, and accept it otherwise.

a) Suppose the shipment is good. What is the probability of rejecting?

b) Suppose the shipment is bad. What is the probability of accepting?

Solution:

a) $1 - \left[0.95^{20} + \binom{20}{1}\, 0.05 \times 0.95^{19}\right] = 1 - 0.7358 = 0.2642$

b) $0.85^{20} + \binom{20}{1}\, 0.15 \times 0.85^{19} = 0.1756$


Exercise 6-10: Jale and Secil are testing some equipment. Jale estimates that 10% are defective, Secil estimates that 15% are defective. They test 30 items and find 4 defective ones. What is the probability of this outcome

a) Assuming Jale is right?

b) Assuming Secil is right?

c) Who is right? (Assuming one of them is right)

Solution:

a) $\binom{30}{4}\, 0.1^4 \times 0.9^{26} = 0.1771$

b) $\binom{30}{4}\, 0.15^4 \times 0.85^{26} = 0.2028$

c) If both estimates are considered equally likely beforehand, Secil is right with probability:

$$\frac{0.2028}{0.2028 + 0.1771} = 0.53$$

Exercise 6-11: You want to cross a desert with a jeep. You need 8 containers of gasoline for this. But during the trip, a container will be punctured with probability 0.04. (Assume punctures are independent for all containers.)

What is your probability of success if you start with

a) 8 containers?

b) 9 containers?

c) 10 containers?

Solution:

a) $0.96^8 = 0.7214$

b) $\binom{9}{8}\, 0.96^8 \times 0.04 + \binom{9}{9}\, 0.96^9 = 0.9522$

c) $\binom{10}{8}\, 0.96^8 \times 0.04^2 + \binom{10}{9}\, 0.96^9 \times 0.04 + 0.96^{10} = 0.9938$

Exercise 6-12: Bits are sent over a communications channel in packets of 20. The probability of a bit being corrupted is 0.07. Errors are independent. What is the probability that

a) Exactly 2 bits in a packet are corrupted?

b) At most 2 bits in a packet are corrupted?

Solution:

a) $\binom{20}{2}\, 0.07^2 \times 0.93^{18} = 0.2521$

b) $\binom{20}{0}\, 0.93^{20} + \binom{20}{1}\, 0.07 \times 0.93^{19} + \binom{20}{2}\, 0.07^2 \times 0.93^{18} = 0.8390$


Week 7– Hypergeometric and Negative Binomial Dist.

Hypergeometric Distribution

The hypergeometric distribution is based on sampling without replacement.

Exercise 7-1: There are 3 black and 7 white balls in a basket. We randomly choose 3. What is the probability that 2 are white and 1 is black?

(Answer: $\dfrac{\binom{3}{1}\binom{7}{2}}{\binom{10}{3}} = 0.525$)

In general, there are N items. We consider k of them as successes and N − k as failures. We randomly choose n items without replacement. What is the probability that there are x successes?

$$h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \quad \max\{0, n - (N - k)\} \le x \le \min\{n, k\}$$

Exercise 7-2: A lot of 40 components is unacceptable if there are 3 or more defectives. We test 5 randomly chosen components and reject the lot if one is defective. What is the probability that exactly one defective is found, assuming there are 3 defectives in total?

(Answer: 0.3011)

Theorem: The mean and variance of the hypergeometric distribution $h(x; N, n, k)$ are

$$\mu = \frac{nk}{N}, \qquad \sigma^2 = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)$$

There is a close relationship between the binomial and hypergeometric distributions. If $n \ll N$, the distinction between sampling with and without replacement disappears.

Exercise 7-3: A factory reports that of the 5000 tires sent to a local distributor, 1000 are slightly blemished. You purchase 10. What is the probability that exactly 3 are blemished?

(Answer: 0.2015 with the hypergeometric distribution ≈ 0.2013 with the binomial approximation)
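The closeness of the two distributions when n is much smaller than N can be seen numerically. The snippet below is an illustrative sketch; the helper names are chosen here and are not from the notes. It evaluates Exercise 7-3 both ways and also checks the small example of Exercise 7-1.

```python
from math import comb

def hypergeometric_pmf(x: int, N: int, n: int, k: int) -> float:
    """h(x; N, n, k): x successes in a sample of n drawn without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Exercise 7-3: N = 5000 tires, k = 1000 blemished, sample of n = 10
exact = hypergeometric_pmf(3, N=5000, n=10, k=1000)
approx = binomial_pmf(3, n=10, p=1000 / 5000)

# Exercise 7-1: 2 white out of 3 drawn, with 7 white among 10 balls
small_case = hypergeometric_pmf(2, N=10, n=3, k=7)

print(f"{exact:.4f} {approx:.4f} {small_case:.3f}")
```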


Exercise 7-4: There are 500 students in a CENG department. 150 use Linux and the rest use Windows on their personal computers. We randomly choose 7 students. What is the probability that 4 of them use Linux? Answer this using

a) Hypergeometric distribution.

b) Binomial distribution approximation.

(Answer: 0.09659, 0.09724)

Exercise 7-5: A network makes errors in 1500 bits per 100000 bits transmitted. Each packet consists of 100 bits. If there are 4 or more errors per packet, we request retransmission.

a) Assuming we can detect all errors, what is the probability of a retransmission request?

b) Assuming we can detect at most 6 errors per packet, what is the probability of a retransmission request? What is the probability of accepting a packet with 7 or more errors?

Solution: For one bit, the error probability is q = 1500/100000 = 0.015 and the correct arrival probability is p = 1 − q = 0.985.

a) If there are 0, 1, 2 or 3 errors, we do not request a retransmission. If there are 4, 5, 6, ... or 100 errors, we do.

$$1 - \left[p^{100} + \binom{100}{1} p^{99} q + \binom{100}{2} p^{98} q^2 + \binom{100}{3} p^{97} q^3\right] = 0.0642$$

b) Now we assume we request a retransmission only if there are 4, 5 or 6 errors:

$$\binom{100}{4} p^{96} q^4 + \binom{100}{5} p^{95} q^5 + \binom{100}{6} p^{94} q^6 = 0.0634$$

The probability that there are 7 or more errors is: 0.0642 − 0.0634 = 0.0008

Exercise 7-6: There are 17 defectives in a shipment of 1200 items. The testing engineer chooses 10 items randomly, tests them, and accepts the shipment if none of them are defective. Find the probability of acceptance using

a) Exact method.

b) An approximate method.

Solution:

a) There are 1200 − 17 = 1183 non-defective items, so

$$\frac{\binom{1183}{10}}{\binom{1200}{10}} = 0.8666$$

b) Assuming the probability of choosing a defective item is fixed at $p = \frac{17}{1200} = 0.014167$, we obtain

$$(1 - p)^{10} = 0.9858^{10} = 0.8670$$


Exercise 7-7:

a) Of the 50 cars in the parking lot, 13 are using diesel fuel and 37 gasoline. We randomly choose 10. What is the probability that 5 are using diesel?

b) Of the 500 people working at a hospital, 220 are female and 280 are male. We randomly choose 10. What is the probability that 5 are female?

c) If you had to solve one of the above problems using an approximation, which one would you choose, a) or b)? Which approximation would you use? Explain.

Solution: Using the hypergeometric distribution,

a) $p = \dfrac{\binom{13}{5}\binom{37}{5}}{\binom{50}{10}} = 0.0546$

b) $p = \dfrac{\binom{220}{5}\binom{280}{5}}{\binom{500}{10}} = 0.2309$

c) We can use the binomial approximation to the hypergeometric distribution. We should prefer part b) because the population size N is larger while the sample size n is the same, so n/N is smaller and we expect a better approximation.

This approximation gives 0.0664 for part a), which means 22% relative error. It gives 0.2289 for part b), which means 0.8% relative error.


Negative Binomial Distribution

Consider an experiment where the probability of success is fixed, as in the binomial case. We are interested in k successes in x trials, but this time, we want the kth success to occur on the xth trial.

Exercise 7-8: In the NBA championship, the team that wins four out of seven games is the winner. Suppose team A has probability 0.55 of winning a game over team B.

a) What is the probability that team A will win the series in 6 games?

b) What is the probability that team A will win the series?

(Answer: 0.1853, 0.6083)

If the probabilities of success (p) and failure (q = 1 − p) are fixed, the probability that the kth success occurs at trial x is

$$b^*(x; k, p) = \binom{x - 1}{k - 1} p^k q^{x-k}, \quad x = k, k + 1, \ldots$$

The average number of trials until the kth success is:

$$\mu = \frac{k}{p}$$

We can prove this starting with

$$\mu = \sum_{i=k}^{\infty} i \binom{i - 1}{k - 1} p^k (1 - p)^{i-k}$$

and using derivatives of geometric series.

Exercise 7-9: In a sports tournament, the team that wins 5 out of 9 games advances past that round. Team A has probability 0.6 of winning any one game against team B. What is the probability that this round ends in exactly 7 games?

Solution: Team A may win in 7 games or team B may win in 7 games. The winner must win the 7th game, so

$$p = \binom{6}{4}\, 0.6^5 \times 0.4^2 + \binom{6}{4}\, 0.4^5 \times 0.6^2 = 0.1866 + 0.0553 = 0.2419$$

Exercise 7-10: Suppose that the probability of male or female birth is 0.5. A couple wishes to have exactly two daughters, and they will continue to have babies until this condition is satisfied. What is the probability that the family has 2 sons?

(Answer: 0.188)


Exercise 7-11: We throw a pair of dice until we get 6-6. What is the expected value of the number of throws?

(Answer: 36)

Exercise 7-12: An oil company drills wells. Their probability of success is 0.2. They will stop at the third success. What is the average number of wells they drill?

(Answer: 15)

Exercise 7-13: You have started using your father's car today. On each day, there is a probability of 0.01 that you have an accident. Your father says "You can make a mistake at most twice. At your third mistake, I will take back the car". If you use the car every day, what is the probability you lose it on day 100?

Solution: $\binom{99}{2} \times 0.01^3 \times 0.99^{97} = 0.0018$

Exercise 7-14: A biased coin has probability 0.7 of coming up Heads. We start tossing this coin. We will stop when we obtain 10 Tails. What is the probability we stop after 20 tosses?

Solution: $\binom{19}{10} \times 0.7^{10} \times 0.3^{10} = 0.0154$

Exercise 7-15: You are playing a game with your friend. You win with 70% probability, your friend with 30%. There is no draw. You decide to play a series of games. What is your probability of winning the series if the player with

a) 3 wins

b) 4 wins

is considered the winner of the series?

Solution:

a) $0.7^3 + \binom{3}{2}\, 0.7^3 \times 0.3 + \binom{4}{2}\, 0.7^3 \times 0.3^2 = 0.8369$

b) $0.7^4 + \binom{4}{3}\, 0.7^4 \times 0.3 + \binom{5}{3}\, 0.7^4 \times 0.3^2 + \binom{6}{3}\, 0.7^4 \times 0.3^3 = 0.8740$
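The series questions above are sums of negative binomial terms, one for each possible length of the series. The following is a minimal sketch; the names `neg_binomial_pmf` and `series_win_probability` are illustrative and not from the notes. It reproduces both parts of Exercise 7-15.

```python
from math import comb

def neg_binomial_pmf(x: int, k: int, p: float) -> float:
    """b*(x; k, p): probability that the k-th success occurs on trial x."""
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

def series_win_probability(wins_needed: int, p: float) -> float:
    """Probability of reaching wins_needed wins before the opponent does.

    The series lasts at most 2 * wins_needed - 1 games, so we sum over
    the game on which the final (winning) success can occur.
    """
    return sum(neg_binomial_pmf(x, wins_needed, p)
               for x in range(wins_needed, 2 * wins_needed))

best_of_5 = series_win_probability(3, 0.7)  # Exercise 7-15 a)
best_of_7 = series_win_probability(4, 0.7)  # Exercise 7-15 b)
print(f"{best_of_5:.4f} {best_of_7:.4f}")
```

Calling `series_win_probability(4, 0.55)` in the same way gives the answer to Exercise 7-8 b).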


Exercise 7-16: On a Saturday night, Alphan is playing a game on his phone. He wins with probability 0.23. His friends are waiting for him to go out, but Alphan says "I will continue until I win 7 times".

Cagatay says: "We will wait exactly 25 games." Oguz says: "We will wait exactly 30 games." Alphan is more optimistic, he thinks his friends will wait at most 10 games.

a) What is the probability that Cagatay is right?

b) What is the probability that Oguz is right?

c) What is the probability that Alphan is right?

d) What is the probability that all are wrong?

Solution:

a) $\binom{24}{6}\, 0.23^7 \times 0.77^{18} = 0.0415$

b) $\binom{29}{6}\, 0.23^7 \times 0.77^{23} = 0.0396$

c) $\binom{9}{6}\, 0.23^7 \times 0.77^3 + \binom{8}{6}\, 0.23^7 \times 0.77^2 + \binom{7}{6}\, 0.23^7 \times 0.77 + \binom{6}{6}\, 0.23^7 = 0.0021$

d) $1 - 0.0415 - 0.0396 - 0.0021 = 0.9168$

Exercise 7-17: You are playing a game against the computer. The game has 3 subgames. The player winning 2 subgames wins the game. Your probability of winning any subgame is 0.7. You start by saying "I will quit when I lose my fifth game". What is the probability you play 10 games?

Solution: Probability of winning after two subgames: $0.7^2 = 0.49$

Probability of winning after three subgames: $\binom{2}{1}\, 0.7^2 \times 0.3 = 0.294$

Probability of winning a game: 0.49 + 0.294 = 0.784

Probability of losing a game: 1 − 0.784 = 0.216

Probability of the 10th game being the 5th lost game: $\binom{9}{4}\, 0.216^5 \times 0.784^5 = 0.0175$


Week 8– Poisson Distribution

Properties of Poisson Process

• The number of outcomes occurring in one time interval is independent of the number that occurs in any other interval.

• The probability that a single outcome will occur in a very short time interval is proportional to the length of the time interval.

• The probability that more than one outcome will occur in such a short time interval is negligible.

The probability distribution of the Poisson random variable X is:

$$p(x; \mu) = \frac{e^{-\mu} \mu^x}{x!}$$

where $\mu$ is the average number of outcomes per unit time.

Exercise 8-1: During an experiment, the average number of radioactive particles passing through a counter in 1 millisecond is 4. What is the probability that 6 particles enter the counter in any given millisecond?

(Answer: 0.1042)

Exercise 8-2: The average number of tankers arriving at a port each day is 10. The facilities can handle at most 15 tankers per day. What is the probability that tankers have to be turned away on any given day?

(Answer: 0.0487)

Theorem: Both the mean and the variance of the Poisson distribution are $\mu$.

Theorem: Let X be a binomial random variable with probability distribution $b(x; n, p)$. When $n \to \infty$, $p \to 0$ and $np \to \mu$ remains constant,

$$b(x; n, p) \to p(x; \mu)$$

Exercise 8-3: In a factory, the probability of an accident on a given day is 0.005 and accidents are independent of each other. What is the probability that in any given period of 400 days

a) There will be one accident?

b) There will be at most three accidents?

(Answer: 0.271, 0.857)
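The limit theorem above can be illustrated with Exercise 8-3, where n = 400 and p = 0.005 give $\mu = np = 2$. The snippet is a sketch only; the helper names are chosen here for illustration. It computes the Poisson answers and the exact binomial values side by side.

```python
from math import comb, exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """p(x; mu) = e^(-mu) mu^x / x!"""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(x: int, mu: float) -> float:
    """P(X <= x) for a Poisson random variable with mean mu."""
    return sum(poisson_pmf(i, mu) for i in range(x + 1))

def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 400, 0.005
mu = n * p

one_accident = poisson_pmf(1, mu)
at_most_three = poisson_cdf(3, mu)
one_accident_exact = binomial_pmf(1, n, p)
at_most_three_exact = sum(binomial_pmf(i, n, p) for i in range(4))

print(f"Poisson:  {one_accident:.3f} {at_most_three:.3f}")
print(f"Binomial: {one_accident_exact:.3f} {at_most_three_exact:.3f}")
```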


Exercise 8-4: For a certain type of copper wire, it is known that, on average, 1.5 flaws occur per millimeter. Assuming that the number of flaws is a Poisson random variable, what is the probability that in a certain portion of the wire of length 5 millimeters

a) No flaw occurs?

b) 10 or more flaws occur?

Solution: $\lambda t = 5 \times 1.5 = 7.5$

a) With x = 0: $\dfrac{e^{-\lambda t} (\lambda t)^x}{x!} = e^{-7.5} = 5.5308 \times 10^{-4}$

b) $\displaystyle\sum_{x=10}^{\infty} \frac{e^{-7.5}\, 7.5^x}{x!} = 1 - 0.7764 = 0.2236$

Exercise 8-5: Of all the computers on the campus, 2% have Ubuntu installed. We randomly select 250 and test them. What is the probability that we observe Ubuntu on 13 of them? Answer using

a) Binomial distribution.

b) Poisson distribution.

Solution:

a) $\binom{250}{13}\, 0.02^{13} \times 0.98^{237} = 1.189 \times 10^{-3}$

b) $\lambda = 250 \times 0.02 = 5$, so $\dfrac{e^{-5}\, 5^{13}}{13!} = 1.321 \times 10^{-3}$

Exercise 8-6: On average, 1 person in 1000 makes a numerical error while preparing an income tax form. If 10000 forms are selected at random and examined, what is the probability that 15 or more contain an error?

Exercise 8-7: The number of customers arriving per hour at an auto service follows a Poisson distribution with mean $\lambda = 7$.

a) Find the probability that, in a certain hour, no customers come.

b) Find the probability that, within two hours, at least 10 and at most 20 customers come.

c) Find the mean number of arrivals during a 2-hour period.

(Answer: $9.12 \times 10^{-4}$, 0.8427, 14)


Exercise 8-8: Aysun has analyzed several years' lists and found that Emre hoca fails 15 students per semester on average. But this year he failed 25. So Aysun thinks Emre hoca must have started using different limits, because the probability of such an outcome is very low assuming he is using the old system.

Nesib points out that the probability that exactly 15 students fail is also low. He says Emre hoca is probably using the usual system.

Let the number of failed students be n. Assuming $\mu = 15$, find the probability that

a) n = 25

b) n = 15

c) n ≥ 25

d) 10 ≤ n ≤ 20

e) Who is right, Aysun or Nesib?

Solution: We have to use the Poisson distribution, because only the average is given.

a) 0.9938 − 0.9888 = 0.0050

b) 0.5681 − 0.4657 = 0.1024

c) 1 − 0.9888 = 0.0112

d) 0.9170 − 0.0699 = 0.8471

e) Probably Aysun is right, because part c) gives about 1% probability for such a result.

Exercise 8-9: The probability that a cell phone rings in any given second is 0.0025. Find the probability that it rings 4 times or more in an hour, using:

a) Exact method.

b) An approximation.

Solution:

a) Using the binomial distribution, we find the probability as:

$$1 - \left[\binom{3600}{0} 0.0025^0 \times 0.9975^{3600} + \binom{3600}{1} 0.0025^1 \times 0.9975^{3599} + \binom{3600}{2} 0.0025^2 \times 0.9975^{3598} + \binom{3600}{3} 0.0025^3 \times 0.9975^{3597}\right]$$

$$= 1 - [0.0001 + 0.0011 + 0.0050 + 0.0149] = 0.9789$$

b) Using the Poisson distribution, the average per hour is 0.0025 × 3600 = 9. Using the table, we find the probability as:

1 − 0.0212 = 0.9788


Exercise 8-10: You are in the real estate business and on average, you sell 17 houses per month. Find the probability of

a) Good month. (25 or more sales)

b) Normal month. (10 to 24 sales)

c) Bad month. (2 to 9 sales)

d) Terrible month. (0 or 1 sales)

(Include 8 digits for part d)

Solution: Using the Table on Poisson Probability Sums, we obtain:

a) 1 − 0.9594 = 0.0406

b) 0.9594 − 0.0261 = 0.9333

c) 0.0261 − 0.0000 = 0.0261

Using the Poisson formula, we obtain:

d) $e^{-17}\left(\dfrac{17^0}{0!} + \dfrac{17^1}{1!}\right) = 7.45 \times 10^{-7}$

Exercise 8-11: You work in a warehouse which receives 2 orders per hour on average. It is open 8 hours per day. If on any given day you receive 24 or more orders, you call it a difficult day. If you receive 8 to 23 orders, you call it a normal day. If you receive 7 or fewer, you call it an easy day. Find the probability of experiencing

a) A difficult day

b) A normal day

c) An easy day

d) No orders. (8 digits)

Solution: Average per day = 2 × 8 = 16. Using the Table on Poisson Probability Sums, we obtain:

a) 1 − 0.9633 = 0.0367

b) 0.9633 − 0.0100 = 0.9533

c) 0.0100

Using the Poisson formula, we obtain:

d) $\dfrac{e^{-16}\, 16^0}{0!} = 1.12 \times 10^{-7}$


Exercise 8-12:

a) On average, there are 25 flights per day from airport A. Find the probability that there are between 20 and 22 flights (inclusive) on any given day.

b) On average, there are 15 flights per day from airport B. Find the probability that there are 20 or more flights on any given day.

Solution:

a) $e^{-25}\left(\dfrac{25^{20}}{20!} + \dfrac{25^{21}}{21!} + \dfrac{25^{22}}{22!}\right) = 0.1840$

b) Using the Poisson table: 1 − 0.8752 = 0.1248

Exercise 8-13: Ali Can and Ahmet will make 1000 tests. The probability of success in one test is 0.0035. They have to find the probability that they have 4 or more successes.

a) Ali Can uses the binomial distribution. What will he find?

b) Ahmet uses the Poisson distribution. What will he find?

Solution:

a) The probability of 0, 1, 2 or 3 successes:

$$0.9965^{1000} + \binom{1000}{1} 0.0035 \times 0.9965^{999} + \binom{1000}{2} 0.0035^2 \times 0.9965^{998} + \binom{1000}{3} 0.0035^3 \times 0.9965^{997} = 0.5364$$

Probability of 4 or more successes:

1 − 0.5364 = 0.4636

b) On average, we expect 1000 × 0.0035 = 3.5 successes. Using the Poisson table, we obtain:

1 − 0.5366 = 0.4634


Exercise 8-14: The number of clicks a web page gets per minute follows a Poisson distribution. On average, the page gets 960 clicks per hour. What is the probability that, during any one-minute interval, the page gets

a) 0 clicks?

b) 1 to 10 clicks?

c) 11 to 20 clicks?

d) 21 or more clicks?

Solution: Average number of clicks per minute: $\mu = 960/60 = 16$

a) $\dfrac{e^{-16}\, 16^0}{0!} = 1.125 \times 10^{-7}$

We can use the Poisson distribution table for parts b), c) and d):

b) $e^{-16}\left(\dfrac{16^1}{1!} + \dfrac{16^2}{2!} + \cdots + \dfrac{16^{10}}{10!}\right) = 0.0774$

c) $e^{-16}\left(\dfrac{16^{11}}{11!} + \dfrac{16^{12}}{12!} + \cdots + \dfrac{16^{20}}{20!}\right) = 0.8682 - 0.0774 = 0.7908$

d) $e^{-16}\left(\dfrac{16^{21}}{21!} + \dfrac{16^{22}}{22!} + \cdots\right) = 1 - 0.8682 = 0.1318$


Week 9– Normal Distribution

The normal distribution is the most important continuous probability distribution in statistics. The density of the normal random variable X, with mean $\mu$ and variance $\sigma^2$, is

$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(x - \mu)^2}, \quad -\infty < x < \infty$$

The curve is symmetric about $x = \mu$, where it attains its maximum. It asymptotically approaches the x-axis as we move away from the center. The total area under the curve is 1. We can prove this using the substitution $z = \frac{x - \mu}{\sigma}$.

Areas Under the Normal Curve

To find the probability that $x_1 < X < x_2$, we have to compute

$$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^2}(x - \mu)^2}\, dx$$

This can be transformed into

$$P(z_1 < Z < z_2) = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{1}{2} z^2}\, dz$$

where Z is a normal random variable with mean 0 and variance 1. This is called the standard normal distribution.

Using polar coordinates, we can prove that

$$\int_{-\infty}^{\infty} e^{-ax^2}\, dx = \sqrt{\frac{\pi}{a}}$$

Taking the derivative with respect to a gives

$$\int_{-\infty}^{\infty} x^2 e^{-ax^2}\, dx = \frac{\sqrt{\pi}}{2a\sqrt{a}}$$

Exercise 9-1: Given a standard normal distribution, find the area under the curve

a) to the right of z = 1.84

b) between z = −1.97 and z = 0.86

(Answer: 0.0329, 0.7807)

Exercise 9-2: Given a standard normal distribution, find k such that $P(Z > k) = 0.3015$.

(Answer: k = 0.52)

Exercise 9-3: A certain type of battery lasts, on average, 3 years with a standard deviation of 0.5 years. Assuming battery life is normally distributed, find the probability that a given battery lasts less than 2.3 years.

(Answer: 0.0808)
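The table lookups in these exercises can be cross-checked in code. The sketch below expresses the standard normal CDF through the error function from Python's `math` module, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$; the names `phi` and `normal_cdf` are chosen here for illustration and are not part of the notes.

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal CDF, P(Z < z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X < x) for a normal variable, via the transformation z = (x - mu) / sigma."""
    return phi((x - mu) / sigma)

right_tail = 1 - phi(1.84)            # Exercise 9-1 a)
between = phi(0.86) - phi(-1.97)      # Exercise 9-1 b)
battery = normal_cdf(2.3, 3.0, 0.5)   # Exercise 9-3

print(f"{right_tail:.4f} {between:.4f} {battery:.4f}")
```

Small differences in the last digit relative to the printed answers can occur, because the table works with z rounded to two decimals.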


Exercise 9-4: The average grade for an exam is 74 and the standard deviation is 7. If 12% of the class get A, what is the lowest possible A and the highest possible B? Assume grades are distributed normally.

(Answer: 83, 82)

Exercise 9-5: Find the value of k such that the area under the standard normal curve between −k < z < k is equal to 0.762.

Solution: $P(-k < z < k) = 0.762$

$P(0 < z < k) = 0.381$

$P(-\infty < z < k) = 0.5 + 0.381 = 0.881$

Using the table we find k = 1.18

Exercise 9-6: The IQs of 600 applicants to a certain college are approximately normally distributed with $\mu = 115$ and $\sigma = 12$. If the college requires an IQ of at least 95, how many of them will be rejected? Note that IQs are rounded to the nearest integer.

Solution: $Z = \dfrac{94.5 - 115}{12} = -1.71$

$P(Z < -1.71) = 0.0436$

$0.0436 \times 600 \approx 26$

Exercise 9-7: The average time for a trip from your home to work is 24 minutes with a standard deviation of 3.8 minutes. Assume the trip times are normally distributed. You leave home at 08:35 and you must be at work by 09:00. What is the probability that you will be late?

(Answer: $P(z > 0.26) = 1 - 0.6026 = 0.3974$)

Exercise 9-8: The average life of a small motor is 10 years with a standard deviation of 2 years. The manufacturer replaces free of charge all motors that fail while under guarantee. To replace only 3%, how long a guarantee should be offered? Assume the lifetime of a motor follows a normal distribution.


Exercise 9-9: Let the random variable x have a normal distribution with $\mu = 710$ and $\sigma = 93$.

a) Find a such that $P(710 - a < x < 710 + a) = 0.76$.

b) Find b such that $P(710 < x < b) = 0.36$.

c) Find c such that $P(x < c) = 0.14$.

d) Find the probability that x > 1000.

Solution:

a) $\dfrac{0.76}{2} + 0.5 = 0.88$, $P(z < k) = 0.88 \Rightarrow k = 1.175$, $a = 1.175 \times 93 = 109.275$

b) $0.36 + 0.5 = 0.86$, $P(z < k) = 0.86 \Rightarrow k = 1.08$, $b = 710 + 1.08 \times 93 = 810.44$

c) $z = -1.08$, $c = 710 - 1.08 \times 93 = 609.56$

d) $\dfrac{1000 - 710}{93} = 3.12$, $P(z > 3.12) = 1 - 0.9991 = 0.0009$

Exercise 9-10: If the function $f(x) = k e^{-x^2/3}$ is a probability distribution, what is k?

Solution: Let $I = \displaystyle\int_{-\infty}^{\infty} e^{-x^2/3}\, dx$. Then

$$I^2 = \int_{-\infty}^{\infty} e^{-x^2/3}\, dx \int_{-\infty}^{\infty} e^{-y^2/3}\, dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2 + y^2)/3}\, dx\, dy = \int_0^{2\pi}\int_0^{\infty} e^{-r^2/3}\, r\, dr\, d\theta = 3\pi$$

Therefore $I = \sqrt{3\pi}$ and $k = \dfrac{1}{\sqrt{3\pi}}$.

Second Method: We know that the normal distribution

$$\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(x - \mu)^2}$$

is a probability distribution. If we choose $\mu = 0$ and $2\sigma^2 = 3$ we obtain the given function, therefore

$$\sigma = \sqrt{\frac{3}{2}}, \qquad k = \frac{1}{\sqrt{2\pi}} \sqrt{\frac{2}{3}} = \frac{1}{\sqrt{3\pi}}$$


Exercise 9-11: The average height of women is 161 cm with a standard deviation of 6 cm and the average height of men is 173 cm with a standard deviation of 7 cm. A mirror in a shopping mall has dimensions such that 85% of women (equally distributed between higher and lower than average values) can use it comfortably.

What percentage of men can use it comfortably? (Assume the height distribution is normal.)

Solution: $\dfrac{0.85}{2} + 0.5 = 0.925$

$P(-k < z < k) = 0.85 \Rightarrow k = 1.44$

$161 + 1.44 \times 6 = 169.64$

$161 - 1.44 \times 6 = 152.36$

So the mirror was designed for people with height in the interval [152.36, 169.64]. For men, these correspond to the z values:

$$\frac{169.64 - 173}{7} = -0.48, \qquad \frac{152.36 - 173}{7} = -2.95$$

$P(-2.95 < z < -0.48) = 0.3156 - 0.0016 = 0.314 = 31.4\%$

Exercise 9-12: In a large scale international examination, students in the top 1.5% get A and the students in the top 3.5% after them get B. We are given that the limits of B are [509.44, 534.16].

a) What is the average ($\mu$) of this distribution?

b) What is the standard deviation ($\sigma$) of this distribution?

(Assume the grade distribution is normal.)

Solution: $P(z > z_1) = 1.5\% = 0.015 \Rightarrow z_1 = 2.17$

$P(z > z_2) = 5\% = 0.05 \Rightarrow z_2 = 1.645$

$$2.17 = \frac{534.16 - \mu}{\sigma}, \qquad 1.645 = \frac{509.44 - \mu}{\sigma}$$

$\Rightarrow \mu = 431.98, \quad \sigma = 47.09$


Exercise 9-13:

• The average level of cholesterol in blood is 214 mg/dL and the standard deviation is 29.5. Cholesterol over 250 is considered too high.

• The average level of blood sugar is 97 mg/dL, the standard deviation is 12.2 and a level below 75 is considered too low.

Assuming both these distributions are normal and also assuming they are independent, what is the probability that a randomly chosen individual has high cholesterol and low sugar at the same time?

Solution:

$$z_1 = \frac{250 - 214}{29.5} = 1.22$$

Using the normal distribution table, the probability of high cholesterol is:

$P(z_1 < z) = 1 - 0.8888 = 0.1112$

$$z_2 = \frac{75 - 97}{12.2} = -1.8$$

Similarly, the probability of low sugar is:

$P(z < z_2) = 0.0359$

The probability that both events happen simultaneously is:

$0.1112 \times 0.0359 = 0.003992 = 3.992 \times 10^{-3}$

Exercise 9-14: Suppose the class average for our course is $\mu = 56.4$ and the standard deviation is $\sigma = 14.7$. Also assume this distribution is normal. If Emre hoca wants to give CC to the 40% of the students in the middle, what are the limits for CC?

Solution: $\dfrac{0.4}{2} + 0.5 = 0.7$

Using the normal distribution table, we find the z value corresponding to 0.7 as z = 0.52.

$56.4 \pm 14.7 \times 0.52$ gives [48.756, 64.044]


Exercise 9-15: The average of an exam is 355.4 and the standard deviation is 66.2. Assume grades are normally distributed. The students in the bottom 15% fail. What is the failing grade?

Solution: Using the table, we find z = −1.04

$$z = \frac{x - \mu}{\sigma} \;\Rightarrow\; -1.04 = \frac{x - 355.4}{66.2}$$

$$x = 286.55$$

Exercise 9-16: The average of an exam is 355.4 and the standard deviation is 66.2. Assume grades are normally distributed. What is the probability that a randomly selected student’s grade is in the interval [400.0, 500.0]?

Solution:

$$z = \frac{x - \mu}{\sigma}$$

$$z_1 = \frac{400 - 355.4}{66.2} = 0.67, \qquad z_2 = \frac{500 - 355.4}{66.2} = 2.18$$

Using the table, $P(z_1 < z < z_2) = 0.9854 - 0.7486 = 0.2368$


Week 10– Normal Approximation to the Binomial

Theorem: If X is a binomial random variable with mean µ = np and variance σ² = npq, then the limiting form of the distribution of

$$Z = \frac{X - np}{\sqrt{npq}}$$

as n → ∞ is the standard normal distribution n(z; 0, 1).

Exercise 10-1: The probability that a patient recovers from a rare blood disease is 0.4. If 100 people contract this disease, what is the probability that fewer than 30 survive?

(Answer: x = 29.5, z = −2.14, P = 0.0162. Exact result = 0.0148)
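The comparison between the approximate and exact results in Exercise 10-1 can be reproduced with a short script. This is a sketch under the assumption of Python 3.8 or later; the helper names `binom_cdf` and `normal_approx_cdf` are hypothetical. The normal value differs in the fourth decimal from the table answer because z is not rounded to −2.14.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) using the continuity correction k + 0.5."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)

n, p = 100, 0.4
# "Fewer than 30" means X <= 29, so the corrected boundary is 29.5.
print(round(normal_approx_cdf(29, n, p), 4))
print(round(binom_cdf(29, n, p), 4))
```

The same two helpers apply to the other exercises of this week by changing n, p and the boundary.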

Exercise 10-2: In a multiple choice exam, a student answers 80 questions randomly. There are 4 answers for each question. What is the probability that the student guesses between 25 and 30 (inclusive) of the questions correctly?

(Answer: 0.1196. Exact result = 0.1193)

Exercise 10-3: A company produces component parts for an engine. Part specifications suggest that 95% of items meet specifications. The parts are shipped to customers in lots of 100.

a) What is the probability that more than 2 items in a lot will be defective?

b) What is the probability that more than 10 items in a lot will be defective?

(Answer: 0.8749, 0.0059)

Exercise 10-4: In a digital communication channel, the probability that a bit is received in error is $10^{-5}$. If 16 million bits are transmitted, what is the probability that more than 150 errors occur?

(Answer: 0.7734)


Exercise 10-5: Statistics show that on a Saturday night 1 out of every 10 drivers on the road is drunk. 400 drivers are randomly checked. Let’s call the number of drunk drivers n. What is the probability that

a) n < 32?

b) 49 < n?

c) 35 ≤ n ≤ 46?

Solution: Using the normal approximation to the binomial, we find

$$\mu = 400 \times 0.1 = 40, \qquad \sigma = \sqrt{400 \times 0.1 \times 0.9} = 6$$

a) $x = 31.5, \quad z = \dfrac{31.5 - 40}{6} = -1.42$

$$P(z < -1.42) = 0.0778$$

b) $x = 49.5, \quad z = \dfrac{49.5 - 40}{6} = 1.58$

$$P(1.58 < z) = 1 - 0.9429 = 0.0571$$

c) $x = 34.5, \; z = -0.92, \quad x = 46.5, \; z = 1.08$

$$P(-0.92 < z < 1.08) = 0.8599 - 0.1788 = 0.6811$$

Exercise 10-6: In a shipment of 500 identical products, 30 are defective. We randomly choose 20. Find the probability that exactly 2 of the 20 are defective

a) Using the exact method

b) Using an approximation.


Exercise 10-7: A coin is tossed 400 times. We obtain n heads. Use the normal curve approximation to find the probability that 185 ≤ n ≤ 210.

Exercise 10-8: Suppose 15% of all cars in Ankara are white. We observe the Eskisehir Road and count passing cars. We observe n white cars in a total number of 400. What is the probability that 50 ≤ n ≤ 70?

(Answer: P(−1.47 < z < 1.47) = 0.8584)

Exercise 10-9: There are 3000 students in a university and 750 are freshmen. We randomly choose 10 students. What is the probability that 8 of them are freshmen? Solve in two different ways. (6 digits after the decimal point)

Solution: The hypergeometric distribution gives:

$$\frac{\binom{750}{8}\binom{2250}{2}}{\binom{3000}{10}} = 0.000377$$

The binomial approximation with $p = \frac{750}{3000} = 0.25$ gives

$$\binom{10}{8}\, 0.25^{8}\, 0.75^{2} = 0.000386$$
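Both calculations in Exercise 10-9 involve only binomial coefficients, so they can be checked with exact integer arithmetic. This is a sketch assuming Python 3.8 or later (for `math.comb`); the function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes) when drawing without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

def binom_pmf(k, n, p):
    """P(exactly k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

exact = hypergeom_pmf(8, 3000, 750, 10)      # sampling without replacement
approx = binom_pmf(8, 10, 750 / 3000)        # treats the draws as independent
print(f"{exact:.6f} {approx:.6f}")
```

The binomial approximation works here because the sample of 10 is very small compared to the population of 3000.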


Exercise 10-10: We toss a single die 90 times. What is the probability we obtain 20 or more sixes?

Use the normal approximation to the binomial:

$$\mu = 90 \cdot \frac{1}{6} = 15, \qquad \sigma = \sqrt{90 \cdot \frac{1}{6} \cdot \frac{5}{6}} = 3.54$$

$$z = \frac{19.5 - 15}{3.54} = 1.27$$

$$P(z > 1.27) = 1 - 0.8980 = 0.1020$$

Exercise 10-11: We ask a random sample of students if they want to take a course in summer. (Assume that the proportion of students who actually want to is 60%.) What is the probability that,

a) if we ask 100 students, 50 or fewer say yes?

b) if we ask 200 students, 100 or fewer say yes?

Solution: We have to use the normal approximation to the binomial.

a) $\mu = np = 100 \times 0.6 = 60$

$$\sigma = \sqrt{npq} = \sqrt{100 \times 0.6 \times 0.4} = 4.899$$

$$z_1 = \frac{50.5 - 60}{4.899} = -1.94$$

$$P(z < z_1) = 0.0262$$

b) $\mu = np = 200 \times 0.6 = 120$

$$\sigma = \sqrt{npq} = \sqrt{200 \times 0.6 \times 0.4} = 6.928$$

$$z_2 = \frac{100.5 - 120}{6.928} = -2.81$$

$$P(z < z_2) = 0.0025$$


Week 11– Gamma and Exponential Distributions

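The gamma function is defined by:

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\, dx, \qquad \alpha > 0$$

Using integration by parts, we can show that

$$\Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$$

$$\Gamma(1) = 1$$

Therefore $\Gamma(n) = (n-1)!$ for a positive integer n.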
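The recurrence and the factorial identity can be verified numerically with the standard library. This is only an illustrative sketch; the helper name `recurrence_gap` is hypothetical, while `math.gamma` and `math.factorial` are standard functions.

```python
import math

def recurrence_gap(a):
    """Return |Gamma(a) - (a - 1) * Gamma(a - 1)|, which should be near zero."""
    return abs(math.gamma(a) - (a - 1) * math.gamma(a - 1))

# Gamma(n) = (n - 1)! for positive integers n
for n in range(1, 8):
    print(n, math.gamma(n), math.factorial(n - 1))

# The recurrence also holds for non-integer arguments greater than 1
print(recurrence_gap(3.5))
```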

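Gamma Distribution: The continuous random variable X has a gamma distribution with parameters α and β if its density function is given by

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)} & x > 0 \\[2mm] 0 & \text{elsewhere} \end{cases}$$

where α > 0, β > 0

Theorem: The mean and variance of the gamma distribution are

$$\mu = \alpha\beta, \qquad \sigma^2 = \alpha\beta^2$$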

Exercise 11-1: Let X be a random variable having a gamma distribution with parameters α = 3, β = 2. Find P(4 < X < 5).

Hint: You may use the formula

$$\int x^2 e^{ax}\, dx = \frac{e^{ax}}{a^3}\left(a^2x^2 - 2ax + 2\right)$$

Solution:

$$P(4 < X < 5) = \int_4^5 \frac{x^2 e^{-x/2}}{2^3\, \Gamma(3)}\, dx$$

Substituting u = x/2:

$$= \int_2^{2.5} \frac{u^2 e^{-u}}{2}\, du$$

$$= \frac{-e^{-u}}{2}\left(u^2 + 2u + 2\right)\Big|_2^{2.5}$$

$$= 0.1329$$
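When no antiderivative formula is at hand, the same probability can be obtained by integrating the density numerically. The sketch below uses composite Simpson's rule, which is a swapped-in technique and not the method of the notes; the function names `gamma_pdf` and `integrate_simpson` are hypothetical.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Density f(x; alpha, beta) of the gamma distribution."""
    if x <= 0:
        return 0.0
    return x**(alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))

def integrate_simpson(f, a, b, steps=1000):
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)   # weights 4, 2, 4, ...
    return total * h / 3

p = integrate_simpson(lambda x: gamma_pdf(x, 3, 2), 4, 5)
print(round(p, 4))   # agrees with the value 0.1329 found above
```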


Exercise 11-2: A random variable X is modeled by a gamma distribution with α = 2, β = 8. Find P(X < 7).

Solution:

$$P(X < 7) = \int_0^7 \frac{x\, e^{-x/8}}{8^2\, \Gamma(2)}\, dx$$

Γ(2) = 1. Using integration by parts with $u = x$, $dv = e^{-x/8}\, dx$ we obtain:

$$P(X < 7) = -e^{-x/8}\left(\frac{x}{8} + 1\right)\Big|_0^7 = 1 - \frac{15}{8}\, e^{-7/8} = 0.2184$$

Exponential Distribution: The continuous random variable X has an exponential distribution with parameter β if its density function is given by

$$f(x; \beta) = \begin{cases} \dfrac{e^{-x/\beta}}{\beta} & x > 0 \\[2mm] 0 & \text{elsewhere} \end{cases}$$

where β > 0

Theorem: The mean and variance of the exponential distribution are

$$\mu = \beta, \qquad \sigma^2 = \beta^2$$

Exercise 11-3: A system contains a component with time to failure T. The random variable T is modeled by an exponential distribution with mean time to failure β = 5 years. If 5 of these components are installed, what is the probability that at least 2 are functioning at the end of 8 years?

(Answer: 0.2667)


Exercise 11-4: The length of time for one individual to be served at a cafeteria is a random variable having an exponential distribution with a mean of 6 minutes. What is the probability that a person is served in less than 4 minutes on at least 5 of the next 7 days?

(Answer: 0.2052)

Exercise 11-5: The length of time you have to wait at the cafeteria is a random variable having an exponential distribution with a mean of 120 seconds. If you wait more than 400 seconds, you call it an unlucky day. If you eat at the cafeteria 20 days a month, what is the probability that you experience 2 or more unlucky days in a month?

Solution: First we have to find the probability of an unlucky day:

$$\frac{1}{120}\int_{400}^{\infty} e^{-x/120}\, dx = -e^{-x/120}\Big|_{400}^{\infty} = e^{-400/120} = 0.0357$$

The probability that a day is not unlucky is 1 − 0.0357 = 0.9643. The number of unlucky days among 20 is binomial, so the probability of 2 or more is:

$$1 - \left[\binom{20}{0}\, 0.0357^{0}\, 0.9643^{20} + \binom{20}{1}\, 0.0357^{1}\, 0.9643^{19}\right] = 0.1588$$
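The two-stage pattern used in Exercises 11-3 to 11-5 (an exponential probability for one trial, then a binomial count over several trials) can be sketched as follows. The helper names `exp_tail` and `binom_at_least` are hypothetical. Because the script keeps the unrounded single-day probability, the final result differs from 0.1588 in the fourth decimal.

```python
from math import comb, exp

def exp_tail(x, beta):
    """P(X > x) for an exponential distribution with mean beta."""
    return exp(-x / beta)

def binom_at_least(k, n, p):
    """P(at least k successes in n independent trials)."""
    return 1 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

p_unlucky = exp_tail(400, 120)                     # one unlucky day
print(round(p_unlucky, 4))                         # 0.0357
print(round(binom_at_least(2, 20, p_unlucky), 4))  # 2 or more in 20 days
```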


Exercise 11-6: Suppose that the length of a telephone call (in minutes) is exponentially distributed with mean 8. Find how long the longest 25% of the calls are.

In other words, find k such that P(t > k) = 0.25.

Solution:

$$\int_k^{\infty} \frac{e^{-t/8}}{8}\, dt = 0.25$$

$$-e^{-t/8}\Big|_k^{\infty} = 0.25$$

$$e^{-k/8} = 0.25$$

$$k = 8\ln 4 = 11.09$$

Exercise 11-7: The lifetime of an electronic component has µ = 40 and σ = 20√2. Nilay thinks that the distribution is gamma, but Mehmet thinks it is normal. The only other information they have about the population is that 1.7% of the components have a lifetime larger than 120.

Who is right and why?

Solution: According to Nilay:

$$\alpha\beta = 40, \quad \alpha\beta^2 = 800 \;\Rightarrow\; \alpha = 2, \quad \beta = 20$$

$$P(x > 120) = \int_{120}^{\infty} \frac{1}{20^2\, \Gamma(2)}\, x^{1} e^{-x/20}\, dx = 7e^{-6} = 0.017$$

According to Mehmet:

$$Z = \frac{120 - 40}{20\sqrt{2}} = 2.83$$

$$P(Z > 2.83) = 1 - 0.9977 = 0.0023$$

The observed proportion is 1.7% = 0.017, which agrees with the gamma model and not with the normal model. Clearly, Nilay is right.
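The comparison in Exercise 11-7 can be replayed numerically. This sketch assumes Python 3.8 or later; the variable names are illustrative. The closed form used for the gamma tail, $e^{-x/\beta}(1 + x/\beta)$, is valid only for α = 2, which is the value found in the solution.

```python
from math import exp, sqrt
from statistics import NormalDist

mu, sigma, x = 40, 20 * sqrt(2), 120

# Gamma model: match the mean alpha*beta and the variance alpha*beta**2
beta = sigma**2 / mu          # 800 / 40 = 20
alpha = mu / beta             # 40 / 20 = 2

# Upper tail of the gamma distribution, closed form for alpha = 2 only
gamma_tail = exp(-x / beta) * (1 + x / beta)      # equals 7 * exp(-6)

# Normal model with the same mean and standard deviation
normal_tail = 1 - NormalDist(mu, sigma).cdf(x)

print(round(gamma_tail, 4), round(normal_tail, 4))
```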


Week 12– Sampling Distributions

A population consists of the totality of the observations with which we are concerned. A sample is a subset of the population.

Any sampling procedure that produces inferences that consistently overestimate or underestimate some characteristic of the population is said to be biased.

Any function of the random variables constituting a random sample is called a statistic. The probability distribution of a statistic is called a sampling distribution.

Theorem: (Central Limit Theorem) If $\bar{X}$ is the mean of a random sample of size n taken from a population with mean µ and finite variance σ², then the limiting form of the distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

as n → ∞ is the standard normal distribution n(z; 0, 1).

In other words, for large n the sampling distribution of $\bar{X}$ will be approximately normal even if the population distribution is not.
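A small simulation can make the theorem concrete. The sketch below is an illustration only: it assumes an exponential population with mean 1 (a clearly non-normal choice made here, not taken from the notes), and the function name, sample size, number of trials and seed are all hypothetical. If the theorem applies, the standardized sample means should have mean near 0, standard deviation near 1, and roughly 95% of them should fall inside (−1.96, 1.96).

```python
import random
from math import sqrt
from statistics import mean, stdev

def standardized_sample_means(n, trials, beta=1.0, seed=0):
    """Draw `trials` samples of size n from an exponential population
    (mean beta, sd beta) and return Z = (xbar - mu) / (sigma / sqrt(n))."""
    rng = random.Random(seed)
    zs = []
    for _ in range(trials):
        xbar = sum(rng.expovariate(1 / beta) for _ in range(n)) / n
        zs.append((xbar - beta) / (beta / sqrt(n)))
    return zs

zs = standardized_sample_means(n=50, trials=4000)
inside = sum(abs(z) < 1.96 for z in zs) / len(zs)
print(round(mean(zs), 2), round(stdev(zs), 2), round(inside, 3))
```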

Exercise 12-1: An electrical firm manufactures light bulbs that have a lifetime of mean 800 hours and standard deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an average lifetime less than 775 hours.

(Answer: 0.0062)

Exercise 12-2: An auto part must have a diameter of 5 mm. We know that the population σ = 0.1 mm. We choose 100 parts randomly. The sample average is $\bar{x} = 5.027$ mm. Can we say the population mean is 5 mm?

(Answer: 0.007)

Exercise 12-3: The bus trip to a campus takes, on average, 28 minutes with a standard deviation of 5 minutes. In a week, the bus makes 40 trips. What is the probability that the weekly average is above 30 minutes? Assume the mean is measured to the nearest minute.

(Answer: P(z > 3.16) = 0.0008)


Exercise 12-4: A certain machine makes electrical resistors having a mean resistance of 50 ohms and a standard deviation of 3 ohms. We choose a random sample of size n. What is the probability that the average resistance of the sample is less than 49.7 ohms, if the sample size is

a) n = 10?

b) n = 50?

c) n = 250?

(Answer: P(z < −0.32) = 0.3745, P(z < −0.71) = 0.2389, P(z < −1.58) = 0.0571)

Exercise 12-5: Average lifetime of an electronic component is 87.0 months, with a standard deviation of 9.0 months. Assume normal distribution.

a) What is the probability that a single component will have a lifetime between 86.5 and 87.5 months?

b) What is the same probability for a sample average if the sample size is 100?

Solution:

a) $z_1 = \dfrac{86.5 - 87}{9} = -0.06, \qquad z_2 = \dfrac{87.5 - 87}{9} = 0.06$

$$P(-0.06 < z < 0.06) = 0.5239 - 0.4761 = 0.0478$$

b) $z_3 = \dfrac{86.5 - 87}{9/\sqrt{100}} = -0.56, \qquad z_4 = \dfrac{87.5 - 87}{9/\sqrt{100}} = 0.56$

$$P(-0.56 < z < 0.56) = 0.7123 - 0.2877 = 0.4246$$

Exercise 12-6: A machine part has average dimension µ = 17 cm. We know that the population standard deviation is σ = 0.5 cm. We choose a sample of n parts randomly. We want the probability that the sample average $\bar{X}$ is more than 17.2 cm to be 1% or less.

What should n be?

Solution: If the probability is 1% or less, using the normal distribution table, we find z = 2.33 or more.

$$z = \frac{17.2 - 17}{0.5/\sqrt{n}} \ge 2.33$$

$$0.4\sqrt{n} \ge 2.33$$

$$n \ge 33.9$$

We need a sample size n ≥ 34.
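The inequality above rearranges to $n \ge (z\sigma/d)^2$, where d is the allowed deviation of the sample mean (0.2 cm here). A minimal sketch of that calculation follows, assuming Python 3.8 or later; the function name `min_sample_size` and its parameter names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(sigma, margin, tail_prob):
    """Smallest n with P(Xbar - mu > margin) <= tail_prob,
    assuming Xbar is normal with standard deviation sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - tail_prob)   # about 2.33 for tail_prob = 0.01
    return ceil((z * sigma / margin) ** 2)    # round up to a whole sample

print(min_sample_size(sigma=0.5, margin=0.2, tail_prob=0.01))  # 34
```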


Theorem: If independent samples of size $n_1$ and $n_2$ are drawn at random from two populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, then the sampling distribution of the difference of means $\bar{X}_1 - \bar{X}_2$ is approximately normally distributed with mean and variance given by $\mu_1 - \mu_2$ and $\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$. In other words

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

is approximately a standard normal variable.

Exercise 12-7: We test the strength of steel cables manufactured by companies A and B. The standard deviations of both are 5 and we test 30 cables from each. The results are:

$$\bar{x}_A = 49.5, \quad \bar{x}_B = 45.5, \quad \bar{x}_A - \bar{x}_B = 4$$

Company B claims the population means are the same. What is the probability of seeing this result if they are really the same?

Solution:

$$Z = \frac{4 - 0}{\sqrt{\dfrac{25}{30} + \dfrac{25}{30}}} = 3.10$$

$$P(Z > 3.10) = 1 - 0.999 = 0.001$$

Exercise 12-8: The televisions of manufacturer A have a mean lifetime of 6.5 years and a standard deviation of 0.9 years. Those of manufacturer B have a mean lifetime of 6.0 years and a standard deviation of 0.8 years. We take a random sample of 36 from A and 49 from B. What is the probability that the sample from A will have a mean lifetime at least 1 year more than the sample from B?

(Answer: P(z > 2.65) = 0.0040)


Exercise 12-9: We randomly choose 35 students from school A and 45 students from school B. (There are thousands of students in each school.) We give them a mathematics test and find that the sample averages are 55 and 60. The standard deviations are 18 and 17 respectively.

What is the probability of seeing this result if the schools have the same average?

Solution:

$$Z = \frac{(60 - 55) - 0}{\sqrt{\dfrac{18^2}{35} + \dfrac{17^2}{45}}} = 1.26$$

$$P(Z > 1.26) = 1 - 0.8962 = 0.1038$$


Week 13– Confidence Intervals

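Suppose we know the variance σ² of a population and we are trying to find the mean. The sample mean is distributed normally around the population mean, so

$$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$$

where

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

and $z_{\alpha/2}$ denotes the z-value such that the area to the right is α/2.

[Figure: standard normal curve with area 1 − α between $-z_{\alpha/2}$ and $z_{\alpha/2}$, and area α/2 in each tail.]

We can rewrite this as:

$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

This is called the 100(1 − α)% confidence interval for µ.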

Exercise 13-1: Average zinc concentration from a sample of 36 measurements is 2.6 grams per milliliter. Find the 95% and 99% confidence intervals for the mean zinc concentration in the river. Assume σ = 0.3.

(Answer: [2.50, 2.70], [2.47, 2.73])
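The confidence interval formula translates directly into a few lines of code. This is a sketch assuming Python 3.8 or later and a known population σ; the function name `z_confidence_interval` is hypothetical. It is applied here to the numbers of Exercise 13-1.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence):
    """Two-sided interval for mu when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

for level in (0.95, 0.99):
    low, high = z_confidence_interval(2.6, 0.3, 36, level)
    print(level, round(low, 2), round(high, 2))
```

A higher confidence level gives a larger $z_{\alpha/2}$ and therefore a wider interval, as the two answers show.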

Exercise 13-2: A random sample of 130 units has average 36 and standard deviation 0.7.

a) Find a 90% confidence interval.

b) How large a sample do we need if we want to be 90% sure that the sample mean is within 0.05 of the true mean?

(Answer: [35.90, 36.10], 531)


Exercise 13-3: A population has σ = 40 and we are trying to determine the mean. How large a sample do we need if we want to be 95% sure that we are making an error of 15 or less?

Solution: P(Z < k) = 0.975 ⇒ k = 1.96

$$1.96 \times \frac{40}{\sqrt{n}} \le 15 \;\Rightarrow\; n \ge 28$$

Exercise 13-4: 200 high school students in a city are randomly chosen and given a mathematics test. The mean and standard deviation of the sample are 46 and 14.

a) Find a 99% confidence interval for the mean.

b) Find the necessary sample size if we want the 99% confidence interval to have size 1.

Exercise 13-5: Ayca, Aydan and Cansu are given a sample of 77 units. The sample average is $\bar{X} = 446$ and the population standard deviation is σ = 69. They are asked to find a confidence interval for the population mean µ, and they come up with three different results:

a) Ayca finds [435.9, 456.1]

b) Aydan finds [433.1, 458.9]

c) Cansu finds [428.9, 463.1]

What are their confidence levels?

Solution:

a) $z \times \dfrac{69}{\sqrt{77}} = 456.1 - 446 = 10.1 \;\Rightarrow\; z = 1.28$

(Alternatively, we can find 10.1 from $\dfrac{456.1 - 435.9}{2}$)

Using the table, we find α/2 = 1 − 0.8997 ≈ 0.10

Therefore α ≈ 0.20 and the 100(1 − α)% confidence is 80% in this case.

Similarly, we find:

b) z = 1.64, Confidence: 90%

c) z = 2.17, Confidence: 97%
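The reverse calculation in Exercise 13-5 (from an interval back to its confidence level) can be sketched as follows, assuming Python 3.8 or later. The function name `confidence_level` is hypothetical; it computes z from the half-width and then the central area between −z and z.

```python
from math import sqrt
from statistics import NormalDist

def confidence_level(low, high, sigma, n):
    """Confidence level implied by a symmetric z-interval [low, high]."""
    half_width = (high - low) / 2
    z = half_width * sqrt(n) / sigma
    return 2 * NormalDist().cdf(z) - 1        # area between -z and z

intervals = {"Ayca": (435.9, 456.1),
             "Aydan": (433.1, 458.9),
             "Cansu": (428.9, 463.1)}
for name, (low, high) in intervals.items():
    print(name, round(100 * confidence_level(low, high, 69, 77)), "%")
```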


Exercise 13-6: We wish to find the average weight of a bag of sugar. We test a sample of 56 bags. The sample average is 498 grams. Assume the population standard deviation is 3.3.

a) Find a 99.6% confidence interval for the population average µ.

b) If we want to reduce the confidence interval’s size to 1/3 of what you found in part a), what should n (the sample size) be?

c) What can you say about the claim that µ = 500?

Solution:

a) $1 - \alpha = 0.996 \;\Rightarrow\; z_{\alpha/2} = 2.88$

$$498 \pm 2.88 \times \frac{3.3}{\sqrt{56}} \quad \text{gives: } [496.730, 499.270]$$

b) Because of the $\dfrac{1}{\sqrt{n}}$ dependence, n must be increased 9 times.

$$\Rightarrow\; n = 9 \times 56 = 504$$

c) The claim is very probably wrong, because 500 ∉ [496.730, 499.270] and the confidence level is 99.6%.

Exercise 13-7: A sample of apple juice is tested for arsenic content. The standard deviation is 1.8 ppb (parts per billion), the sample size is 94 and the sample average is 9 ppb. The distribution is normal.

a) Find an 80% confidence interval for the population average.

b) Find a sample size such that the 80% confidence interval will be half the size of what you found in part a).

c) Find a sample size such that the 99.5% confidence interval will be the same size as what you found in part a).

Solution: We will use Area = 0.8/2 + 0.5 = 0.9 ⇒ z = 1.28.

a) $\dfrac{\sigma}{\sqrt{n}}\, z = \dfrac{1.8}{\sqrt{94}} \times 1.28 = 0.2376$, therefore the confidence interval is:

$$[9 - 0.2376,\; 9 + 0.2376] = [8.7624, 9.2376]$$

b) We have to find n such that

$$\frac{1.8}{\sqrt{n}} \times 1.28 = 0.2376/2 = 0.1188$$

$$n = 376$$

Or, we can simply multiply 94 by 4: 94 × 4 = 376

c) Area = 0.995/2 + 0.5 = 0.9975 ⇒ z = 2.81.

$$\frac{1.8}{\sqrt{n}} \times 2.81 = 0.2376$$

$$n \approx 453$$


Week 14– Prediction Intervals

Suppose we know the variance σ² of a population but do not know the mean µ. We have a random sample of size n with average $\bar{x}$. We want to predict the value of a single future observation $x_0$.

If we define a new variable $\bar{x} - x_0$, its variance will be

$$\frac{\sigma^2}{n} + \sigma^2$$

therefore the standard deviation is:

$$\sigma\sqrt{1 + \frac{1}{n}}$$

So the 100(1 − α)% prediction interval is:

$$\bar{X} - z_{\alpha/2}\, \sigma\sqrt{1 + \frac{1}{n}} < x_0 < \bar{X} + z_{\alpha/2}\, \sigma\sqrt{1 + \frac{1}{n}}$$

Exercise 14-1: Let the sample size be 100, the sample average 290, and the population standard deviation 32.

a) Construct a 99% confidence interval.

b) Construct a 99% prediction interval.

(Answer: [282, 298], [207, 373])

Exercise 14-2: The average weight gain for a sample of 40 mice is 5.6 grams. The population standard deviation is 1.3.

a) Construct a 90% confidence interval.

b) Construct a 90% prediction interval.

(Answer: [5.26, 5.94], [3.43, 7.77])
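The difference between the two kinds of interval is only the factor under the square root, which the following sketch makes explicit. It assumes Python 3.8 or later and a known population σ; the function name `z_intervals` is hypothetical. It is run on the numbers of Exercise 14-1.

```python
from math import sqrt
from statistics import NormalDist

def z_intervals(xbar, sigma, n, confidence):
    """Return (confidence interval for mu, prediction interval for x0)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci_half = z * sigma / sqrt(n)             # uncertainty in the mean only
    pi_half = z * sigma * sqrt(1 + 1 / n)     # adds the spread of one new value
    return ((xbar - ci_half, xbar + ci_half),
            (xbar - pi_half, xbar + pi_half))

ci, pi = z_intervals(290, 32, 100, 0.99)
print([round(v) for v in ci], [round(v) for v in pi])  # [282, 298] [207, 373]
```

As n grows, the confidence interval shrinks toward a point, while the prediction interval approaches $\bar{x} \pm z_{\alpha/2}\,\sigma$ and never becomes narrower than that.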
