The Importance of Normality

Orange Empire – February 8, 2016

Larry Bartkus

What is Normality?

– The normal (or Gaussian) distribution is a very commonly occurring continuous probability distribution
– Normal distributions are extremely important in statistics and are often used in the natural and social sciences
– The Gaussian distribution is sometimes informally called the bell curve
– Physical quantities that are expected to be the sum of many independent processes (such as measurement errors) often have a distribution very close to the normal
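The last point can be checked with a small simulation. The sketch below is illustrative only: the uniform error source, the count of 12 sources, the sample size, and the seed are all assumptions chosen for the example, not values from the presentation. It compares the shape of one flat error source with the shape of a sum of many such sources, using skewness and excess kurtosis (both are 0 for a normal distribution).

```python
import random

def moment_shape(xs):
    """Return (skewness, excess kurtosis) from simple population moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

rng = random.Random(2016)  # fixed seed (arbitrary) so the run is repeatable

# One hypothetical error source: uniform on [-1, 1], flat rather than bell-shaped.
single = [rng.uniform(-1, 1) for _ in range(20000)]

# Total error: the sum of 12 independent sources of the same kind.
summed = [sum(rng.uniform(-1, 1) for _ in range(12)) for _ in range(20000)]

skew_1, kurt_1 = moment_shape(single)
skew_12, kurt_12 = moment_shape(summed)
print(f"single source: skewness={skew_1:+.2f}, excess kurtosis={kurt_1:+.2f}")
print(f"sum of 12:     skewness={skew_12:+.2f}, excess kurtosis={kurt_12:+.2f}")
```

The single source has a clearly negative excess kurtosis (a flat shape), while the sum of 12 sources lands much closer to 0 on both measures, which is the bell-curve behaviour the bullet describes.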

Normal Data vs Non-normal Data

Normal Data
– Commonly found in nature
– Easy to identify
– No major mathematical manipulations are necessary
– Easy to explain and justify
– Solid predictions can readily be made

Non-normal Data
– Often difficult to recognize and sometimes hidden
– Can be confused with outliers
– Statistical manipulations are sometimes needed to utilize the distribution
– Difficult to explain or justify

Normal Distribution

Non-normal Distribution

[Figure: Probability Density Function, f(t) versus Time (t), for the Lognormal, Weibull, and Exponential distributions, each labeled λ = 0]

Why Graph Data

Summary for A

[Figure: graphical summary of A, with a panel of 95% confidence intervals for the mean and median]

Anderson-Darling Normality Test
A-Squared: 0.27
P-Value: 0.660

Mean: 7.0000
StDev: 2.4495
Variance: 6.0000
Skewness: -0.000000
Kurtosis: -0.544920
N: 36

Minimum: 2.0000
1st Quartile: 5.0000
Median: 7.0000
3rd Quartile: 9.0000
Maximum: 12.0000

95% Confidence Interval for Mean: 6.1712 to 7.8288
95% Confidence Interval for Median: 6.0000 to 8.0000
95% Confidence Interval for StDev: 1.9867 to 3.1952

Why Graph Data

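Summary for B

[Figure: graphical summary of B, with a panel of 95% confidence intervals for the mean and median]

Anderson-Darling Normality Test
A-Squared: 0.61
P-Value: 0.101

Mean: 7.0000
StDev: 2.4260
Variance: 5.8857
Skewness: -0.216101
Kurtosis: -0.923506
N: 36

Minimum: 2.0000
Median: 7.5000
Maximum: 11.0000

95% Confidence Interval for Mean: 6.1791 to 7.8209
95% Confidence Interval for Median: 5.7354 to 8.0000
95% Confidence Interval for StDev: 1.9677 to 3.1646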

Assessing for Normality – Distribution Analyzer

Assessing for Normality – Anderson Darling

Summary for C2

[Figure: graphical summary of C2, with a panel of 95% confidence intervals for the mean and median]

Anderson-Darling Normality Test
A-Squared: 2.77
P-Value: < 0.005

Mean: 3.9883
StDev: 4.1086
Variance: 16.8803
Skewness: 1.90297
Kurtosis: 3.02385
N: 30

Minimum: 0.3286
Median: 2.5724
Maximum: 16.4155

95% Confidence Interval for Mean: 2.4541 to 5.5224
95% Confidence Interval for Median: 1.8680 to 3.7050
95% Confidence Interval for StDev: 3.2721 to 5.5232

Project: NORMALITY EXAMPLES 2011-06-22.MPJ
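The A-Squared and P-Value entries in these summaries can be approximated by hand. The sketch below is a minimal version, assuming the standard Anderson-Darling formula with mean and standard deviation estimated from the sample, and a commonly published piecewise approximation for the p-value (attributed to D'Agostino and Stephens). Whether the software behind the slides uses exactly this approximation is an assumption; the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def ad_p_value(a2, n):
    """Approximate p-value from A-squared, after a small-sample adjustment."""
    a = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    if a >= 0.6:
        return math.exp(1.2937 - 5.709 * a + 0.0186 * a * a)
    if a > 0.34:
        return math.exp(0.9177 - 4.279 * a - 1.38 * a * a)
    if a > 0.2:
        return 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a * a)
    return 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a * a)

def anderson_darling(data):
    """Return (A-squared, approximate p-value) for a test of normality."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # normal fitted to the sample
    cdf = [fitted.cdf(x) for x in xs]
    # A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln F(x_i) + ln(1 - F(x_{n+1-i}))]
    # (written with a 0-based index below; very extreme outliers could make
    # a CDF value round to exactly 0 or 1 and break the logarithm)
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    return a2, ad_p_value(a2, n)

# Recompute p-values from the A-Squared values reported for A and B (N = 36).
print(round(ad_p_value(0.27, 36), 3))
print(round(ad_p_value(0.61, 36), 3))
```

Under these assumptions the two recomputed p-values come out at roughly 0.66 and 0.10, in line with the reported 0.660 and 0.101; the small differences are consistent with A-Squared being shown to only two decimals.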

Assessing for Normality – Ryan Joiner

[Figure: normal probability plot of C2, Percent versus C2]

Probability Plot of C2 (Normal)
Mean: 3.988
StDev: 4.109
N: 30
RJ: 0.863
P-Value: < 0.010
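The RJ value on a probability plot measures how straight the plotted points are. A minimal sketch follows, assuming RJ is the correlation between the ordered data and approximate normal scores taken at plotting positions (i - 3/8) / (n + 1/4). That choice of plotting position, the function name, and the generated example data are assumptions for illustration, not details from the slides.

```python
import math
from statistics import NormalDist

def ryan_joiner(data):
    """Correlation between the ordered data and approximate normal scores."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    # Normal scores at plotting positions (i - 3/8) / (n + 1/4), i = 1..n
    scores = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mx = sum(xs) / n
    ms = sum(scores) / n
    sxy = sum((x - mx) * (s - ms) for x, s in zip(xs, scores))
    sxx = sum((x - mx) ** 2 for x in xs)
    sss = sum((s - ms) ** 2 for s in scores)
    return sxy / math.sqrt(sxx * sss)

# Hypothetical data: 30 evenly spaced normal quantiles, and a right-skewed
# version of the same values obtained by exponentiating them.
bell = [NormalDist().inv_cdf((i + 0.5) / 30) for i in range(30)]
skewed = [math.exp(v) for v in bell]

rj_bell = ryan_joiner(bell)
rj_skewed = ryan_joiner(skewed)
print(f"bell-shaped data:  RJ = {rj_bell:.3f}")
print(f"right-skewed data: RJ = {rj_skewed:.3f}")
```

A value close to 1 means the points fall near a straight line on the probability plot; the skewed sample gives a visibly lower value, which is the pattern the test uses to reject normality.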

Assessing for Normality – Ryan Joiner

[Figure: normal probability plot of EW Data 4, Percent versus EW Data 4]

Probability Plot of EW Data 4 (Normal)
Mean: 0.2581
StDev: 0.0006815
N: 60
RJ: 1.000
P-Value: > 0.100

Sensitive to Normality

Non-Sensitive
– Confidence intervals on means
– T-tests
– ANOVA, including DOE and Regression
– X-bar Charts

Sensitive
– Confidence Intervals on standard deviations
– Tolerance Intervals, Reliability Intervals
– Variance tests
– Cpk, Cp, Ppk, Pp
– I-Charts
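One way to see why the items in the Sensitive list depend so heavily on normality is that they rely on the tails of the distribution. The sketch below is a hypothetical example, not taken from the slides: it assumes an exponential process with mean 1 and an upper limit placed three standard deviations above the mean, then compares the out-of-limit fraction predicted under a normal assumption with the exact fraction for the exponential.

```python
import math
from statistics import NormalDist

# Hypothetical process: exponential with mean 1, so its standard deviation is also 1.
mean, sd = 1.0, 1.0
upper_limit = mean + 3 * sd  # hypothetical upper limit, 3 standard deviations above the mean

# Fraction above the limit if the data are (wrongly) assumed to be normal.
assumed_normal = 1.0 - NormalDist(mean, sd).cdf(upper_limit)

# Exact fraction above the limit for the exponential: P(X > x) = exp(-x / mean).
actual = math.exp(-upper_limit / mean)

print(f"normal assumption:    {assumed_normal:.5f}")
print(f"actual (exponential): {actual:.5f}")
print(f"ratio:                {actual / assumed_normal:.1f}")
```

In this example the normal assumption understates the fraction beyond the limit by more than a factor of ten, even though the mean and standard deviation are exactly right. Statistics built on averages are far less affected, because averages of many values tend toward normality.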

7 Common Reasons for Failing Normality

1) Measurement Resolution
Equipment lacks granularity

2) Data Shift
A shift occurred in the process while the data were being collected

3) Multiple Sources of Data
Multiple operators, machines, lots of material, etc.

4) Truncated Data
There’s a stop somewhere or there’s a sorting process

[Figure: Probability Plot of EW Data 4 (Normal). Mean 0.2581, StDev 0.0006815, N 60, RJ 1.000, P-Value > 0.100]

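Summary for N(10,1) Truncate at 9

[Figure: graphical summary of the truncated data, with a panel of 95% confidence intervals for the mean and median]

Anderson-Darling Normality Test
A-Squared: 2.11
P-Value: < 0.005

Mean: 10.245
StDev: 0.826
Variance: 0.682
Skewness: 0.656745
Kurtosis: -0.222344
N: 172

Minimum: 9.002
1st Quartile: 9.610
Median: 10.083
Maximum: 12.618

95% Confidence Interval for Mean: 10.121 to 10.370
95% Confidence Interval for Median: 9.934 to 10.264
95% Confidence Interval for StDev: 0.747 to 0.923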
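The effect of truncation is easy to reproduce. The sketch below is a rough simulation under stated assumptions: it draws from N(10,1) and discards everything below 9, as in the example's title, but the sample size, seed, and the simple moment-based skewness formula are choices made for illustration and will not reproduce the exact values in the summary.

```python
import random

def skewness(xs):
    """Simple moment-based skewness (0 for a symmetric distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(9)  # fixed seed (arbitrary) so the run is repeatable
full = [rng.gauss(10, 1) for _ in range(20000)]

# Sorting process: everything below 9 is removed before measurement.
truncated = [x for x in full if x >= 9]

print(f"kept {len(truncated)} of {len(full)} values")
print(f"skewness before truncation: {skewness(full):+.2f}")
print(f"skewness after truncation:  {skewness(truncated):+.2f}")
```

Removing the lower tail leaves a right-skewed sample, so a process that is normal before sorting can still fail a normality test afterward.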


7 Common Reasons for Failing Normality (continued)

5) Presence of Outliers
The data have one or more anomalous points that differ greatly from the rest of the population

6) Too Much Data
Too many data points (more than 100) are involved in the assessment; with a large sample, a normality test can flag even small departures from normality

7) Underlying Distribution is Not Normal
Some measurements are not expected to be normal, such as time, microbiological data, and pull force

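Summary for EW Data 2

[Figure: graphical summary of EW Data 2, with a panel of 95% confidence intervals for the mean and median]

Anderson-Darling Normality Test
A-Squared: 0.70
P-Value: 0.060

Mean: 110.71
StDev: 7.99
Variance: 63.87
Skewness: -1.17564
Kurtosis: 1.99926
N: 26

Minimum: 86.20
Median: 112.90
Maximum: 122.40

95% Confidence Interval for Mean: 107.48 to 113.94
95% Confidence Interval for Median: 107.73 to 116.08
95% Confidence Interval for StDev: 6.27 to 11.03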


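Summary for N(5,1) Rounded to 0.1

[Figure: graphical summary of the rounded data, with a panel of 95% confidence intervals for the mean and median]

Anderson-Darling Normality Test
A-Squared: 0.54
P-Value: 0.167

Mean: 4.9674
StDev: 1.0190
Variance: 1.0384
Skewness: -0.0380120
Kurtosis: 0.0171328
N: 1000

Minimum: 1.5000
Median: 5.0000
Maximum: 8.2000

95% Confidence Interval for Mean: 4.9042 to 5.0306
95% Confidence Interval for Median: 4.9000 to 5.1000
95% Confidence Interval for StDev: 0.9762 to 1.0658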


Non-normal Data

Let’s First Treat the Data as Normal

Group Discussion

– What are the consequences of failing a capability study that should be acceptable due to an error in the normality assumption?
– What are the consequences of accepting a capability study as satisfactory when it should fail due to an error in the normality assumption?

Any Questions?