Stats for Engineers Lecture 6


Transcript of Stats for Engineers Lecture 6

Page 1: Stats for Engineers Lecture 6

Stats for Engineers Lecture 6

Answers for Question sheet 1 are now online: http://cosmologist.info/teaching/STAT/

Answers for Question sheet 2 should be available Friday evening

Page 2: Stats for Engineers Lecture 6

Summary From Last Time

Continuous Random Variables

Probability Density Function (PDF) $f(x)$:

$P(a \le X \le b) = \int_a^b f(x')\,dx'$, with normalization $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Exponential distribution: $f(y) = \begin{cases} \nu e^{-\nu y}, & y > 0 \\ 0, & y < 0 \end{cases}$

Probability density for separation of random independent events with constant rate $\nu$

Normal/Gaussian distribution: mean $\mu$, standard deviation $\sigma$

[Figure: Normal pdf $f(x)$ with the area corresponding to $P(-1.5 < X < -0.7)$ shaded]

Page 3: Stats for Engineers Lecture 6

[Chart: clicker responses for options 1 to 6: 12%, 3%, 3%, 82%, 0%, 0% (order as extracted)]

Normal distribution

Consider the continuous random variable X = the weight in pounds of a randomly selected new-born baby. Suppose that X can be modelled with a normal distribution with mean μ = 7.57 and standard deviation σ = 1.06.

If the standard deviation were σ = 1.26 instead, how would that change the graph of the pdf of X?

Question from Derek Bruff

1. The graph would be narrower and have a greater maximum value.

2. The graph would be narrower and have a lesser maximum value.

3. The graph would be narrower and have the same maximum value.

4. The graph would be wider and have a greater maximum value.

5. The graph would be wider and have a lesser maximum value.

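6. The graph would be wider and have the same maximum value.

Since $\int_{-\infty}^{\infty} f(x)\,dx = 1$, a larger $\sigma$ spreads the same total area over a wider range, so the graph is wider and its maximum is lower (option 5).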

Page 4: Stats for Engineers Lecture 6

$P(a < X < b) = \int_a^b f(x)\,dx$

BUT: for the Normal distribution this integral cannot be done analytically.

If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$

Instead use tables for the standard Normal distribution: $f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$

Page 5: Stats for Engineers Lecture 6

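Change of variable

The probability for $X$ in a range $dx$ around $x$ for a distribution $f(x)$ is given by $f(x)\,dx$. The probability should be the same if it is written in terms of another variable $y = y(x)$. Hence

$f(y)\,dy = f(x)\,dx \Rightarrow f(y) = f(x)\frac{dx}{dy}$.

Why does this work?

i.e. change $X \sim N(\mu, \sigma^2)$ to $Z = \frac{X - \mu}{\sigma}$

$\Rightarrow x = \mu + \sigma z \Rightarrow \frac{dx}{dz} = \sigma$

$f(z) = f(x)\frac{dx}{dz} = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \sigma = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}$

i.e. $Z \sim N(0, 1)$, the standard Normal distribution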
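The change of variable can be checked numerically. This is a minimal sketch, assuming illustrative values μ = 7.57 and σ = 1.06 (any positive σ would do); `transformed_pdf` is a hypothetical helper name.

```python
from statistics import NormalDist

mu, sigma = 7.57, 1.06          # illustrative parameters (assumed)
x_dist = NormalDist(mu, sigma)  # X ~ N(mu, sigma^2)
z_dist = NormalDist(0.0, 1.0)   # Z ~ N(0, 1)


def transformed_pdf(z: float) -> float:
    """f(z) = f(x) * dx/dz with x = mu + sigma * z, so dx/dz = sigma."""
    return x_dist.pdf(mu + sigma * z) * sigma


for z in (-2.0, -0.5, 0.0, 1.0, 2.5):
    print(f"z = {z:+.1f}: transformed = {transformed_pdf(z):.6f}, "
          f"standard = {z_dist.pdf(z):.6f}")
```

The two columns should agree to rounding error, which is the statement that $Z$ has the standard Normal pdf.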

Page 6: Stats for Engineers Lecture 6

Use Normal tables for $Q(z)$ [also called $\Phi(z)$]

$Q(z) = P(Z \le z) = \int_{-\infty}^{z} f(x)\,dx$

Outside of exams this is probably best evaluated using a computer package (e.g. Maple, Mathematica, Matlab, Excel); for historical reasons you still have to use tables.
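For instance, a minimal Python sketch of the table function, using the standard identity $Q(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$ and only the standard library. The function name `Q` simply mirrors the lecture's notation.

```python
import math


def Q(z: float) -> float:
    """Standard Normal cumulative probability P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (0.5, 1.0, 1.5):
    print(f"Q({z}) = {Q(z):.4f}")
```

These three values can be compared directly with the corresponding entries in the printed table.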

𝑸

𝑧

Page 7: Stats for Engineers Lecture 6

𝑸

0≀𝑍 ≀3.59

z

Page 8: Stats for Engineers Lecture 6

Example: If Z ~ N(0, 1):

(a)

888

Page 9: Stats for Engineers Lecture 6

(b) $\ldots = P(Z \le 0.5) = Q(0.5) = 0.6915$.

Page 10: Stats for Engineers Lecture 6

(c) $\ldots = P(Z > 1.0) = 1 - Q(1.0) = 1 - 0.8413 = 0.1587$

Page 11: Stats for Engineers Lecture 6

[Chart: clicker responses for options 1 to 4: 10%, 17%, 60%, 13% (order as extracted)]

Symmetries

If $Z \sim N(0, 1)$, which of the following is NOT the same as $P(Z < 0.7)$?

Page 12: Stats for Engineers Lecture 6

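Symmetries

If $Z \sim N(0, 1)$, which of the following is NOT the same as $P(Z < 0.7)$?

1. $1 - P(Z > 0.7)$: the same, $= P(Z < 0.7)$
2. $P(Z > -0.7)$: the same, by symmetry about zero
3. $1 - P(Z > -0.7)$: NOT the same, $= P(Z < -0.7)$
4. $1 - P(Z < -0.7)$: the same, $= P(Z > -0.7) = P(Z < 0.7)$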

Page 13: Stats for Engineers Lecture 6

(d) $P(0.5 < Z < 1.5) = P(Z < 1.5) - P(Z < 0.5) = Q(1.5) - Q(0.5) = 0.9332 - 0.6915 = 0.2417$

Page 14: Stats for Engineers Lecture 6

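(e) $Q(1.356)$

Between $Q(1.35) = 0.9115$ and $Q(1.36) = 0.9131$

Using interpolation

$Q(1.356) = A\,Q(1.35) + B\,Q(1.36)$

Fraction of distance between 1.35 and 1.36:

$B = \frac{1.356 - 1.35}{1.36 - 1.35} = 0.6$

$A = 1 - B = 0.4$

$Q(1.356) = 0.4\,Q(1.35) + 0.6\,Q(1.36) = 0.4 \times 0.9115 + 0.6 \times 0.9131 = 0.9125$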
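The interpolation step can be sketched in a few lines of Python. The table entries 0.9115 and 0.9131 are the standard four-figure values for $Q(1.35)$ and $Q(1.36)$; the function name `interpolate` is illustrative.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation of y at x between the points (x0, y0) and (x1, y1)."""
    b = (x - x0) / (x1 - x0)  # fraction of the distance from x0 to x1
    a = 1.0 - b
    return a * y0 + b * y1


# Standard Normal table entries for z = 1.35 and z = 1.36
q_1356 = interpolate(1.356, 1.35, 0.9115, 1.36, 0.9131)
print(f"Q(1.356) ~ {q_1356:.4f}")
```

Linear interpolation is adequate here because $Q(z)$ changes smoothly over a table step of 0.01.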

Page 15: Stats for Engineers Lecture 6

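(f) $0.8 = P(Z \le c) = Q(c)$

What is $c$?

Use table in reverse: $c$ lies between 0.84 and 0.85, since $Q(0.84) = 0.7995$ and $Q(0.85) = 0.8023$.

Interpolating as before

$c = A \times 0.84 + B \times 0.85$, with $B = \frac{0.8 - 0.7995}{0.8023 - 0.7995} \approx 0.18$ and $A = 1 - B \approx 0.82$

$\Rightarrow c = 0.82 \times 0.84 + 0.18 \times 0.85 \approx 0.842$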

Page 16: Stats for Engineers Lecture 6

[Chart: clicker responses for options 1 to 4: 43%, 2%, 19%, 36% (order as extracted)]

Using Normal tables

The error (in Celsius) on a cheap digital thermometer has a normal distribution, with What is the probability that a given temperature measurement is too cold by more than C?

1. 0.0618
2. 0.9382
3. 0.1236
4. 0.0735

Page 17: Stats for Engineers Lecture 6

Using Normal tables

The error (in Celsius) on a cheap digital thermometer has a normal distribution, with What is the probability that a given temperature measurement is too cold by more than C?

Answer:

Want $\ldots = P(Z > 1.54) = 1 - P(Z < 1.54) = 1 - Q(1.54) = 1 - 0.9382 = 0.0618$

Page 18: Stats for Engineers Lecture 6

(g) Finding a range of values within which $Z$ lies with probability 0.95:

The answer is not unique; but suppose we want an interval which is symmetric about zero, i.e. between $-d$ and $d$.

[Figure: standard Normal pdf with probability 0.95 between $-d$ and $d$, and $0.05/2 = 0.025$ in each tail]

So $d$ is where $Q(d) = 0.025 + 0.95 = 0.975$

Page 19: Stats for Engineers Lecture 6

Use table in reverse:

$Q(d) = 0.975 \Rightarrow d = 1.96$

95% of the probability is in the range $-1.96 < Z < 1.96$

Page 20: Stats for Engineers Lecture 6

[Figure: Normal pdf with probability P = 0.025 in each tail]

In general 95% of the probability lies within $1.96\sigma$ of the mean $\mu$.

The range $\mu \pm 1.96\sigma$ is called a 95% confidence interval.
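Outside of exams, the reverse table lookup can be done with `NormalDist.inv_cdf` from the Python standard library. This is a minimal sketch; `central_interval` is an illustrative helper name, not something from the lecture.

```python
from statistics import NormalDist

d = NormalDist().inv_cdf(0.975)  # value with Q(d) = 0.975
print(f"d = {d:.2f}")


def central_interval(mu: float, sigma: float, prob: float = 0.95):
    """Symmetric interval about mu containing `prob` of an N(mu, sigma^2) variable."""
    half_width = NormalDist().inv_cdf(0.5 + prob / 2.0) * sigma
    return mu - half_width, mu + half_width


low, high = central_interval(0.0, 1.0)
print(f"{low:.2f} < Z < {high:.2f}")
```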

Page 21: Stats for Engineers Lecture 6

[Chart: clicker responses for options 1 to 4: 20%, 45%, 2%, 32% (order as extracted)]

Normal distribution

If has a Normal distribution with mean and standard deviation , which of the following could be a graph of the pdf of ?

Question from Derek Bruff

1. Too wide
2. OK
3. Wrong mean
4. Too narrow

Page 22: Stats for Engineers Lecture 6

Normal distribution

If has a Normal distribution with mean and standard deviation , which of the following could be a graph of the pdf of ?

1. Too wide
2. Correct
3. Wrong mean
4. Too narrow

i.e. Mean at , 95% inside (5% outside) of i.e.

Page 23: Stats for Engineers Lecture 6

Example: Manufacturing variability

The outside diameter, X mm, of a copper pipe is $N(15.00, 0.02^2)$ and the fittings for joining the pipe have inside diameter Y mm, where $Y \sim N(15.07, 0.022^2)$.

(i) Find the probability that X exceeds 14.99 mm.

(ii) Within what range will X lie with probability 0.95?

(iii) Find the probability that a randomly chosen pipe fits into a randomly chosen fitting (i.e. X < Y).

[Figure: pdfs of $X$ and $Y$]

Page 24: Stats for Engineers Lecture 6

Example: Manufacturing variability

The outside diameter, X mm, of a copper pipe is $N(15.00, 0.02^2)$ and the fittings for joining the pipe have inside diameter Y mm, where $Y \sim N(15.07, 0.022^2)$.

(i) Find the probability that X exceeds 14.99 mm.

Answer:

Reminder: $Z = \frac{X - \mu}{\sigma}$

$P(X > 14.99) = P\left(Z > \frac{14.99 - 15.0}{0.02}\right) = P(Z > -0.5) = P(Z < 0.5) = Q(0.5) \approx 0.6915$

Page 25: Stats for Engineers Lecture 6

Example: Manufacturing variability

The outside diameter, X mm, of a copper pipe is $N(15.00, 0.02^2)$ and the fittings for joining the pipe have inside diameter Y mm, where $Y \sim N(15.07, 0.022^2)$.

(ii) Within what range will X lie with probability 0.95?

Answer

From the previous example, $-1.96 < Z < 1.96$ with probability 0.95

$\Rightarrow X = 15 \pm 1.96 \times 0.02$

$\Rightarrow 14.96\,\text{mm} < X < 15.04\,\text{mm}$

i.e. $X$ lies in $\mu \pm 1.96\sigma$ with probability 0.95
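Parts (i) and (ii) can be cross-checked with the Python standard library. This sketch assumes the pipe diameter is $N(15.00, 0.02^2)$, i.e. standard deviation 0.02 mm; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=15.00, sigma=0.02)  # pipe outside diameter in mm

# (i) P(X > 14.99)
p_exceeds = 1.0 - X.cdf(14.99)

# (ii) symmetric range containing X with probability 0.95
lower, upper = X.inv_cdf(0.025), X.inv_cdf(0.975)

print(f"P(X > 14.99) = {p_exceeds:.4f}")
print(f"95% range: {lower:.2f} mm < X < {upper:.2f} mm")
```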

Page 26: Stats for Engineers Lecture 6

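[Chart: clicker responses for options 1 to 4: 71%, 11%, 4%, 14% (order as extracted)]

Where is the probability

We found 95% of the probability lies within $14.96\,\text{mm} < X < 15.04\,\text{mm}$.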

What is the probability that 15.04mm?

1. 0.025
2. 0.05
3. 0.95
4. 0.975

[Figure: Normal pdf with probability P = 0.025 in each tail outside the 95% range]

Page 27: Stats for Engineers Lecture 6

Example: Manufacturing variability

The outside diameter, X mm, of a copper pipe is $N(15.00, 0.02^2)$ and the fittings for joining the pipe have inside diameter Y mm, where $Y \sim N(15.07, 0.022^2)$.

(iii) Find the probability that a randomly chosen pipe fits into a randomly chosen fitting (i.e. X < Y).

Answer

For $X < Y$ we want $P(Y - X > 0)$.

To answer this we need to know the distribution of $Y - X$, where $X$ and $Y$ both have (different) Normal distributions

Page 28: Stats for Engineers Lecture 6

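Distribution of the sum of Normal variates

Means and variances of independent random variables just add:

$\Rightarrow \mu_{X_1 + X_2} = \mu_{X_1} + \mu_{X_2}$, $\quad \sigma^2_{X_1 + X_2} = \sigma^2_{X_1} + \sigma^2_{X_2}$, etc.

A special property of the Normal distribution is that the distribution of the sum of Normal variates is also a Normal distribution. [stated without proof]

If $X_1, X_2, \ldots, X_n$ are independent and each have a normal distribution $X_i \sim N(\mu_i, \sigma_i^2)$, and if $c_1, c_2, \ldots, c_n$ are constants, then:

$c_1 X_1 + c_2 X_2 + \ldots + c_n X_n \sim N(c_1\mu_1 + \ldots + c_n\mu_n,\; c_1^2\sigma_1^2 + \ldots + c_n^2\sigma_n^2)$

E.g.

$X_1 + X_2 \sim N(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2)$

$X_1 - X_2 \sim N(\mu_1 - \mu_2,\; \sigma_1^2 + \sigma_2^2)$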

Page 29: Stats for Engineers Lecture 6

Example: Manufacturing variability

The outside diameter, X mm, of a copper pipe is $N(15.00, 0.02^2)$ and the fittings for joining the pipe have inside diameter Y mm, where $Y \sim N(15.07, 0.022^2)$.

(iii) Find the probability that a randomly chosen pipe fits into a randomly chosen fitting (i.e. X < Y).

Answer

For $X < Y$ we want $P(Y - X > 0)$.

$Y - X \sim N(\mu_Y - \mu_X,\; \sigma_Y^2 + \sigma_X^2) = N(15.07 - 15,\; 0.02^2 + 0.022^2) = N(0.07,\; 0.000884)$

Hence $P(Y - X > 0) = P\left(Z > \frac{0 - 0.07}{\sqrt{0.000884}}\right) = P(Z > -2.35) = P(Z < 2.35) \approx 0.991$
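The same calculation can be sketched with `statistics.NormalDist`, which supports subtracting two independent Normal variables directly (the variances add). The parameters assume standard deviations of 0.02 mm and 0.022 mm; the name `gap` is illustrative.

```python
from statistics import NormalDist

X = NormalDist(15.00, 0.02)    # pipe outside diameter (mm)
Y = NormalDist(15.07, 0.022)   # fitting inside diameter (mm)

gap = Y - X                  # difference of independent Normal variables
p_fit = 1.0 - gap.cdf(0.0)   # P(Y - X > 0)

print(f"gap ~ N({gap.mean:.2f}, {gap.variance:.6f})")
print(f"P(X < Y) = {p_fit:.3f}")
```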

Page 30: Stats for Engineers Lecture 6

[Chart: clicker responses for options 1 to 4: 16%, 16%, 55%, 14% (order as extracted)]

Which of the following would make a random pipe more likely to fit into a random fitting?

The outside diameter, X mm, of a copper pipe is $N(15.00, 0.02^2)$ and the fittings for joining the pipe have inside diameter Y mm, where $Y \sim N(15.07, 0.022^2)$.

1. Decreasing mean of Y
2. Increasing the variance of X
3. Decreasing the variance of X
4. Increasing the variance of Y

[Figure: pdfs of $X$ and $Y$]

Page 31: Stats for Engineers Lecture 6

Which of the following would make a random pipe more likely to fit into a random fitting?

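The outside diameter, X mm, of a copper pipe is $N(15.00, 0.02^2)$ and the fittings for joining the pipe have inside diameter Y mm, where $Y \sim N(15.07, 0.022^2)$.

[Figure: pdfs of $X$ and $Y$]

Common sense.

Or use $d = Y - X \sim N(\mu_Y - \mu_X,\; \sigma_X^2 + \sigma_Y^2)$:

$P(X < Y) = P(d > 0) = P\left(Z > \frac{0 - \mu_d}{\sigma_d}\right) = P\left(Z < \frac{\mu_d}{\sigma_d}\right)$

Larger probability if
- $\mu_d$ larger (bigger average gap between pipe and fitting)
- $\sigma_d$ smaller (less fluctuation in gap size)

$\sigma_d^2 = \sigma_X^2 + \sigma_Y^2$, so $\sigma_d$ is smaller if the variance of $X$ is decreased

Answer: 3. Decreasing the variance of X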

Page 32: Stats for Engineers Lecture 6

Normal approximations

Central Limit Theorem: If $X_1, X_2, \ldots, X_n$ are independent random variables with the same distribution, which has mean $\mu$ and variance $\sigma^2$ (both finite), then the sum $\sum_{i=1}^{n} X_i$ tends to the distribution $N(n\mu, n\sigma^2)$ as $n \to \infty$. Hence: The sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is distributed approximately as $N(\mu, \sigma^2/n)$.

For the approximation to be good, n has to be about 30 or more for skewed distributions, but can be quite small for simple symmetric distributions.

The approximation tends to have much better fractional accuracy near the peak than in the tails: don't rely on the approximation to estimate the probability of very rare events.

It often also works for the sum of non-independent random variables, i.e. the sum tends to a normal distribution (but the variance is harder to calculate).

Page 33: Stats for Engineers Lecture 6

Example: Average of n samples from a uniform distribution:
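A small simulation gives a feel for this example. The sketch below assumes the samples are drawn from Uniform(0, 1), which has mean 1/2 and variance 1/12, and uses n = 12 with a fixed random seed; these choices and the name `sample_means` are illustrative assumptions, not taken from the slide.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible


def sample_means(n: int, trials: int = 20000):
    """Return `trials` averages, each over n independent Uniform(0, 1) draws."""
    return [sum(random.random() for _ in range(n)) / n for _ in range(trials)]


n = 12
means = sample_means(n)
# Uniform(0, 1) has mean 1/2 and variance 1/12, so the CLT suggests N(1/2, 1/(12 n))
print(f"simulated mean     = {statistics.fmean(means):.4f} (CLT: 0.5)")
print(f"simulated variance = {statistics.pvariance(means):.5f} (CLT: {1 / (12 * n):.5f})")
```

Repeating the run with smaller n (say 1, 2, 4) and plotting a histogram of `means` would show the shape moving from flat towards the Normal bell curve.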

Page 34: Stats for Engineers Lecture 6

Example:

The mean weight of people in England is μ = 72.4 kg, with standard deviation σ = 15 kg.

The London Eye at capacity holds 800 people at once.

What is the distribution of the weight of the passengers at any random time when the Eye is full?

Answer:

The total weight of passengers $W$ is the sum of $n = 800$ individual weights. Assuming independent:

$W \sim N(n\mu, n\sigma^2)$ by the central limit theorem

$= N(800 \times 72.4\,\text{kg},\; 800 \times 15^2\,\text{kg}^2) \approx N(58000\,\text{kg},\; 180000\,\text{kg}^2)$

i.e. Normal with $\mu_W \approx 58000\,\text{kg}$, $\sigma_W = \sqrt{180000}\,\text{kg} \approx 424\,\text{kg}$

[usual caveat: people visiting the Eye unlikely to actually have independent weights, e.g. families, school trips, etc.]
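The arithmetic behind this answer is short enough to sketch directly, under the same independence assumption as the caveat above. Variable names are illustrative.

```python
import math

n = 800         # passengers when the Eye is full
mu = 72.4       # mean weight of one person (kg)
sigma = 15.0    # standard deviation of one person's weight (kg)

total_mean = n * mu             # means add
total_variance = n * sigma**2   # variances add for independent weights
total_sd = math.sqrt(total_variance)

print(f"mean = {total_mean:.0f} kg, variance = {total_variance:.0f} kg^2, "
      f"sd = {total_sd:.0f} kg")
```

Note that the standard deviation of the total grows like $\sqrt{n}$, so it is small compared with the mean total weight.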

Page 35: Stats for Engineers Lecture 6

Course Feedback

Which best describes your experience of the lectures so far?

1. Too slow
2. Speed OK, but struggling to understand many things
3. Speed OK, I can understand most things
4. A bit fast, I can only just keep up
5. Too fast, I don't have time to take notes though I still follow most of it
6. Too fast, I feel completely lost most of the time
7. I switch off and learn on my own from the notes and doing the questions
8. I can't hear the lectures well enough (e.g. speech too fast to understand or other people talking)

[Chart: clicker responses for options 1 to 8: 0%, 13%, 31%, 0%, 38%, 0%, 13%, 6% (order as extracted)]

Stopped prematurely, not many answers

Page 36: Stats for Engineers Lecture 6

Course Feedback

What do you think of clickers?

1. I think they are a good thing, help me learn and make lectures more interesting
2. I enjoy the questions, but don't think they help me learn
3. I think they are a waste of time
4. I think they are a good idea, but better questions would make them more useful
5. I think they are a good idea, but need longer to answer questions

[Chart: clicker responses for options 1 to 5: 65%, 7%, 12%, 14%, 2% (order as extracted)]

Page 37: Stats for Engineers Lecture 6

Course Feedback

How did you find the question sheets so far?

1. Challenging but I managed most of it OK
2. Mostly fairly easy
3. Had difficulty, but workshops helped me to understand
4. Had difficulty and workshops were very little help
5. I've not tried them

[Chart: clicker responses for options 1 to 5: 20%, 4%, 16%, 14%, 47% (order as extracted)]

Page 38: Stats for Engineers Lecture 6

Normal approximation to the Binomial

If $X \sim B(n, p)$ and $n$ is large and $p$ is not too near 0 or 1, then $X$ is approximately $N(np, np(1-p))$

Page 39: Stats for Engineers Lecture 6

p=0.5

Page 40: Stats for Engineers Lecture 6

p=0.5

Page 41: Stats for Engineers Lecture 6

Approximating a range of possible results from a Binomial distribution

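e.g. if $X \sim B(10, 0.5)$

$P(X \le 6) = P(k=0) + P(k=1) + \ldots + P(k=6) = 0.8281$

$\approx \int_{-\infty}^{6.5} \frac{e^{-\frac{(x-\mu)^2}{2\sigma^2}}}{\sqrt{2\pi\sigma^2}}\,dx = Q\left(\frac{6.5 - \mu}{\sigma}\right)$, with $\mu = np = 5$ and $\sigma^2 = np(1-p) = 2.5$

$\approx Q\left(\frac{1.5}{\sqrt{2.5}}\right) = Q(0.9487) = 0.8286$ [not always so accurate at such low $n$!]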
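This comparison is easy to reproduce. The sketch assumes $n = 10$ and $p = 0.5$, which is consistent with $\mu = np = 5$ and $\sigma^2 = 2.5$ above; `binomial_cdf` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

n, p = 10, 0.5  # assumed parameters, consistent with mu = np = 5 and sigma^2 = 2.5


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


exact = binomial_cdf(6, n, p)
normal = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
approx = normal.cdf(6.5)  # continuity correction: integrate up to 6.5

print(f"exact = {exact:.4f}, Normal approximation = {approx:.4f}")
```

The upper limit 6.5 rather than 6 is the continuity correction: the discrete outcome $k = 6$ is treated as covering the interval from 5.5 to 6.5.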

Page 42: Stats for Engineers Lecture 6

[Chart: clicker responses for options 1 to 3: 60%, 27%, 13% (order as extracted)]

If what is the best approximation for?

i.e. If , , , what is the best approximation for ?

Page 43: Stats for Engineers Lecture 6

Quality control example: The manufacturing of computer chips produces 10% defective chips. 200 chips are randomly selected from a large production batch. What is the probability that fewer than 15 are defective?

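Answer: mean $np = 200 \times 0.1 = 20$, variance $np(1-p) = 200 \times 0.1 \times 0.9 = 18$.

So if $X$ is the number of defective chips, approximately $X \sim N(20, 18)$.

Hence $P(X < 15) \approx P\left(Z < \frac{14.5 - 20}{\sqrt{18}}\right) = P(Z < -1.296) = 1 - P(Z < 1.296) = 1 - [0.9015 + 0.6 \times (0.9032 - 0.9015)] \approx 0.097$

This compares to the exact Binomial answer $\sum_{k=0}^{14} \binom{200}{k}\, 0.1^k\, 0.9^{200-k}$. The Binomial answer is easy to calculate on a computer, but the Normal approximation is much easier if you have to do it by hand. The Normal approximation is about right, but not accurate.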