Chapter 6: Probability Plots
homepage.stat.uiowa.edu/~rdecook/stat2020/notes/ch6_qq... · 2019-10-30

Probability Plots

Specifically, normal probability plots.

Section 6.7

1 / 14

Were my data drawn from a normal distribution?

   7 | 6
   8 | 7
   9 | 7
  10 | 15
  11 | 058
  12 | 013
  13 | 133455
  14 | 12356899
  15 | 001344678888
  16 | 0003357789
  17 | 0112445668
  18 | 0011346
  19 | 034699
  20 | 0178
  21 | 8
  22 | 189
  23 | 7
  24 | 5

The decimal point is 1 digit(s) to the right of the |

The compressive strength data above looks ‘normally distributed’, but is it really?

It is unimodal and has a bell shape, but do the probabilities line up with a normal distribution?

Having the correct general shape is a start, but there are specific probabilities that coincide with the normal distribution.
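
One informal way to read "specific probabilities" is to compare the share of observations within 1 and 2 standard deviations of the mean with the shares a normal distribution would give (about 68% and 95%). The sketch below decodes the stem-and-leaf display and makes that comparison. The 1 SD and 2 SD check is an illustration chosen here, not a method from the slides, and the names `STEM_AND_LEAF` and `fraction_within` are hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Stem-and-leaf display from the slide; each leaf is the ones digit
# ("The decimal point is 1 digit(s) to the right of the |").
STEM_AND_LEAF = {
    7: "6", 8: "7", 9: "7", 10: "15", 11: "058", 12: "013",
    13: "133455", 14: "12356899", 15: "001344678888",
    16: "0003357789", 17: "0112445668", 18: "0011346",
    19: "034699", 20: "0178", 21: "8", 22: "189", 23: "7", 24: "5",
}

# Decode: stem 7 with leaf 6 is the observation 76, and so on.
data = [10 * stem + int(leaf)
        for stem, leaves in STEM_AND_LEAF.items()
        for leaf in leaves]

xbar, s = mean(data), stdev(data)
std_normal = NormalDist()  # mean 0, standard deviation 1


def fraction_within(k):
    """Share of observations within k sample standard deviations of the mean."""
    return sum(abs(x - xbar) <= k * s for x in data) / len(data)


# Assumption: 1 and 2 SD are illustrative cut-offs, not taken from the slides.
for k in (1, 2):
    expected = std_normal.cdf(k) - std_normal.cdf(-k)
    print(f"within {k} SD: observed {fraction_within(k):.3f}, normal {expected:.3f}")
```

Agreement at two cut-offs is only a rough check; the normal probability plot compares every ordered observation with its normal counterpart.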

2 / 14

Were my data drawn from a normal distribution?

Definition

A probability plot is a graphical method for determining whether sample data conform to some specific distribution (such as normal, exponential, etc.).

This is more reliable than basing the decision on a histogram.

Some examples of probability plots...

NOTE: Different software will label the axes differently, but it is the pattern of the data points in the plot that matters, and that is similar across software.

3 / 14

Were my data drawn from a normal distribution?

In this course, we will limit discussion to the normal probability plot, as we want to know if our data conform to a normal distribution.

We’re just briefly discussing this topic (there are more details to explore).

In a normal probability plot, we plot the ordered observed data points against those that would have been observed if we had sampled from a truly normal distribution.

Ordered observations: $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$, where $x_{(1)}$ is the minimum and $x_{(n)}$ is the maximum.

Then each $x_{(j)}$ is plotted against its corresponding ‘hypothetical’ z-score, $z_j$, the value expected at that rank if the data were truly normally distributed.

4 / 14

Normal Probability Plots

Example (n=15 observed data points)

Find z such that $\Phi(z) = (j - 0.5)/n$.

Ordered observed value    j   (j-0.5)/n       z
                    -8    1      0.0333   -1.83
                  -5.5    2      0.1      -1.28
                 -2.25    3      0.1667   -0.97
                 -1.25    4      0.2333   -0.73
                 -0.75    5      0.3      -0.52
                 -0.75    6      0.3667   -0.34
                 -0.25    7      0.4333   -0.17
                 -0.25    8      0.5       0
                 -0.25    9      0.5667    0.17
                     0   10      0.6333    0.34
                   0.5   11      0.7       0.52
                  0.75   12      0.7667    0.73
                     1   13      0.8333    0.97
                   4.5   14      0.9       1.28
                    24   15      0.9667    1.83
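
The z column can be reproduced by inverting the standard normal CDF at each plotting position (j-0.5)/n. The sketch below does this with Python's standard library; the function name `normal_scores` is hypothetical, and the data are the 15 ordered observations from the table.

```python
from statistics import NormalDist

# The 15 ordered observations from the example table.
observed = [-8, -5.5, -2.25, -1.25, -0.75, -0.75, -0.25, -0.25, -0.25,
            0, 0.5, 0.75, 1, 4.5, 24]


def normal_scores(n):
    """Return z_j with Phi(z_j) = (j - 0.5)/n for j = 1, ..., n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]


ordered = sorted(observed)
z = normal_scores(len(ordered))

# Each (z_j, x_(j)) pair is one point on the normal probability plot.
for j, (zj, xj) in enumerate(zip(z, ordered), start=1):
    print(f"{xj:>6} {j:>3} {(j - 0.5) / len(ordered):.4f} {zj:6.2f}")
```

Other software may use slightly different plotting positions, so its z values can differ a little from this table while the overall pattern of the plot stays the same.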

5 / 14

Normal Probability Plots

Example (n=15 observed data points)

[Figure: Normal Q-Q Plot; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]

This normal probability plot suggests the data were NOT drawn from a normal distribution.

6 / 14

Normal Probability Plots

Example (n=15 observed data points)

[Figure: histogram; x-axis: observed value, y-axis: Frequency]

The histogram also suggests there’s a very large value, which wouldn’t have been expected if the data were truly from a normal distribution.

7 / 14

Normal Probability Plots - what to look for

If the data were generated from a normal distribution, then the data points in the normal probability plot will fall approximately on a straight diagonal line.
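
A short sketch of why a straight line is expected. The symbols $\mu$, $\sigma$ and $z_p$ are introduced here for illustration and do not appear in the slides: $z_p$ is the standard normal value with $\Phi(z_p) = p$.

```latex
\[
X \sim N(\mu, \sigma^2)
\;\Longrightarrow\;
P\left(X \le \mu + \sigma z_p\right) = \Phi(z_p) = p ,
\]
\[
\text{so with } p_j = \frac{j - 0.5}{n}: \qquad x_{(j)} \approx \mu + \sigma z_j .
\]
```

Under that assumption, the plotted points $(z_j, x_{(j)})$ lie near a line with intercept $\mu$ and slope $\sigma$; systematic curvature signals a departure from normality.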

8 / 14

Normal Probability Plots - what to look for

The patterns below suggest non-normality...

“S” shape: light tails compared to normal

“S” shape: heavy tails compared to normal

“J” shape: right-skewed

All the above patterns are signs of non-normality.

9 / 14

Normal Probability Plots (not a ‘best-fit-line’)

NOTE: The diagonal line is NOT A ‘BEST FIT LINE’ to the data.

The line is simply a ‘reference line’ for your eye.

In R statistical software, the line is drawn by simply connecting the two (x, y) points determined by the values at the 25th and 75th percentiles.
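
As a rough sketch of that construction, the code below finds the line through the two points (theoretical quartile, sample quartile) at the 25th and 75th percentiles, using the n=15 example data. The function name `reference_line` is hypothetical, and the choice `method="inclusive"` is an assumption made here to approximate R's default quantile rule; other quantile definitions shift the line slightly.

```python
from statistics import NormalDist, quantiles

# The 15 observations from the earlier example (n=15).
observed = [-8, -5.5, -2.25, -1.25, -0.75, -0.75, -0.25, -0.25, -0.25,
            0, 0.5, 0.75, 1, 4.5, 24]


def reference_line(data):
    """Slope and intercept of the line through the quartile points.

    The two points are (z_0.25, Q1) and (z_0.75, Q3), where Q1 and Q3 are
    the sample quartiles and z_p is the standard normal quantile.
    """
    # Assumption: "inclusive" interpolation stands in for R's default rule.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    std_normal = NormalDist()
    z1, z3 = std_normal.inv_cdf(0.25), std_normal.inv_cdf(0.75)
    slope = (q3 - q1) / (z3 - z1)
    intercept = q1 - slope * z1
    return slope, intercept


slope, intercept = reference_line(observed)
print(f"reference line: y = {intercept:.4f} + {slope:.4f} * z")
```

Because only the two quartile points determine the line, the extreme value 24 has no influence on it, which is one reason it is a reference line rather than a best-fit line.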

10 / 14

Normal Probability Plots

Example (non-normality)

The normal probability plot below suggests there are non-normality issues because of the points at the bottom left. Normality is questionable.

[Figure: Normal Q-Q Plot; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles. The reference line (in blue) connects the values at the 25th and 75th percentiles.]

11 / 14

Normal Probability Plots - Can a transformation help?

Sometimes we can use a transformation of the data to improve normality (but you’ll be working on the transformed scale after that).

Below, a log-transformation helped, but didn’t quite get us to normality.

[Figure, panel 1: NPP plot - original scale; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]

[Figure, panel 2: NPP plot - log scale; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]
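
The effect of a log-transformation can be sketched numerically. The code below uses a hypothetical simulated right-skewed sample (not the data behind the figure) and compares how straight the normal probability plot is before and after taking logs. The correlation used as a "straightness" score is a stand-in for judging the plot by eye, not something the slides use, and the names `qq_points` and `straightness` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def qq_points(data):
    """Pairs (z_j, x_(j)) using the plotting positions (j - 0.5)/n."""
    n = len(data)
    std_normal = NormalDist()
    z = [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]
    return list(zip(z, sorted(data)))


def straightness(points):
    """Pearson correlation of the plotted points; near 1 means nearly straight."""
    zs, xs = zip(*points)
    n = len(points)
    mz, mx = sum(zs) / n, sum(xs) / n
    szx = sum((z - mz) * (x - mx) for z, x in points)
    szz = sum((z - mz) ** 2 for z in zs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return szx / math.sqrt(szz * sxx)


# Assumption: simulated log-normal data, chosen only to be right-skewed.
random.seed(0)
raw = [random.lognormvariate(3.0, 1.0) for _ in range(50)]
logged = [math.log(x) for x in raw]  # requires strictly positive data

r_raw = straightness(qq_points(raw))
r_log = straightness(qq_points(logged))
print(f"original scale: {r_raw:.3f}, log scale: {r_log:.3f}")
```

With real data the improvement is usually less clean than in this simulation, as in the figure, where the log scale helps without reaching normality.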

12 / 14

Normal Probability Plots

The normal probability plot below looks pretty good. Not perfect, but it is reasonable to assume approximate normality.

[Figure: Normal Q-Q Plot; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles. The reference line (in blue) connects the values at the 25th and 75th percentiles.]

And yes, I simulated the above data from a normal distribution.

13 / 14

Normal Probability Plots

I’d like to spend more time with normal probability plots, but due to time constraints, I want you to know two main things...

1. We use a normal probability plot to check for normality.

2. What the normal probability plot looks like when the data are normally distributed (and when they are not).

14 / 14