Graphical Analysis. Why Graph Data? Graphical methods Require very little training Easy to use...

Post on 27-Dec-2015

Graphical Analysis

Why Graph Data?

Graphical methods:

Require very little training

Easy to use

Massive amounts of data can be presented more readily

Can provide an understanding of the distribution of the data

May be easier to interpret for individuals with less mathematical background than engineers

Graphical methods

Quantitative data (numerical data):
Cost of a computer (continuous)
Number of production defects (discrete)
Weight of a person (continuous)
Parts produced this month (discrete)
Temperature of etch bath (continuous)

Graphical tools:
Line charts
Histograms
Scatter charts

Graphical methods

Qualitative data (categorical and attribute):
Type of equipment (manual, automated, semi-automated)
Operator (Tom, Nina, Jose)

Graphical tools:
Bar charts
Pie charts
Pareto charts

Getting Started

Classify data:
Quantitative vs. qualitative
Continuous or discrete (quantitative)

Choose the right graphical tool

Choose axes and scales to provide the best “view” of data

Label graphs to eliminate ambiguity
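The classify-then-choose step can be sketched in a few lines of Python. This is only an illustration: the function names `classify` and `suggest_charts` are hypothetical, and the type-based rule (integers are discrete, other numbers continuous, everything else qualitative) is a crude assumption. A cost recorded in whole dollars, for example, would be labeled discrete even though the slides treat cost as continuous.

```python
from numbers import Real

# Chart families for each data type, as listed in the slides.
CHART_OPTIONS = {
    "quantitative": ["line chart", "histogram", "scatter chart"],
    "qualitative": ["bar chart", "pie chart", "Pareto chart"],
}


def classify(values):
    """Return (data type, subtype) for a list of observations.

    Assumed heuristic: all-numeric data is quantitative (discrete if
    every value is an int, otherwise continuous); anything else is
    treated as qualitative.
    """
    numeric = all(isinstance(v, Real) and not isinstance(v, bool) for v in values)
    if numeric:
        subtype = "discrete" if all(isinstance(v, int) for v in values) else "continuous"
        return "quantitative", subtype
    return "qualitative", None


def suggest_charts(values):
    """Look up the candidate chart types for the classified data."""
    data_type, _ = classify(values)
    return CHART_OPTIONS[data_type]


print(classify([3, 0, 2, 5]))                   # defect counts
print(classify([71.5, 80.2, 65.0]))             # etch bath temperatures
print(suggest_charts(["Tom", "Nina", "Jose"]))  # operators
```

The final choice still depends on what the graph is meant to show; the lookup only narrows the options.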

Graphical Analysis

Examples

Bar or Column Graph

Displays frequency of observations that fall into nominal categories

[Figure: bar chart, "Color distribution for a random package of M&Ms"; x-axis: Color (brown, red, yellow, green, orange, blue); y-axis: Qty, 0 to 25]

Line Chart

Shows trends in data at equal intervals

[Figure: line chart; x-axis: Performance Category (Max Skew Average, Max Pitch Average, Controlled Scan, Freehand Scan, Bright Light, Normal Light, Low Light); y-axis: Scan Time (Seconds), 0 to 4.5; series: CCD1, CCD2, LR, LCCD, CMOS]

Graphical methods

Acceptable graph

[Figure: chart titled "EDC Warehouse Test Results for Read Time, ALL SYSTEMS"; x-axis: RFID System (1 to 8); y-axis: Read Time (secs/read), ticks at 0, 1, 2; data labels shown: 0.64, 0.20, 0.52, 0.81, 0.66, N/A, 1.46, 0.88]

Graphical methods

Better graph

[Figure: chart titled "EDC Warehouse Test Results for Read Time, ALL SYSTEMS"; x-axis: RFID System (A to H); y-axis: Read Time (secs/read), ticks at 0 and 2; data labels shown: 0.88, 1.46, N/A, 0.66, 0.81, 0.52, 0.20, 0.64]

Graphical Analysis Details

Always label axes with titles and units
Always use chart titles
Use scales that are appropriate to the range of data being plotted
Use legends only when they add value
Use both points and lines on line graphs only when appropriate; do not do so if the data is discrete

Histograms

Histograms are pictorial representations of the distribution of a measured quantity or of counted items. They are a quick tool for displaying the average and the amount of variation present.
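As a rough sketch of what a histogram computes, the Python below sorts measurements into equal-width bins and prints a text version next to the mean and standard deviation. The `histogram_counts` helper and the measurements are hypothetical, made up for illustration; they are not the data behind the slides' figures.

```python
from statistics import mean, stdev


def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins.

    Assumes the values are not all identical (bin width would be zero).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The largest value is placed in the last bin, not a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical line-width measurements in micrometres.
widths = [1.0, 1.2, 1.9, 2.0, 2.1, 2.2, 2.4, 2.6, 3.0, 3.1, 3.4, 5.0]

edges, counts = histogram_counts(widths, 4)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:4.1f} to {right:4.1f} | {'#' * count}")
print(f"mean = {mean(widths):.2f}, std dev = {stdev(widths):.2f}")
```

The tallest bar sits near the average, and the spread of the bars gives a visual sense of the variation that the standard deviation summarizes.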

Histogram example

The Pareto principle

Dr. Joseph Juran (of total quality management fame) formulated the Pareto Principle after expanding on the work of Vilfredo Pareto, a nineteenth-century economist and sociologist. The Pareto Principle states that a small number of causes is responsible for a large percentage of the effect, usually in roughly a 20-percent to 80-percent ratio (about 20 percent of the causes account for about 80 percent of the effect).
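The table behind a Pareto chart can be sketched as follows: sort the categories from largest to smallest and accumulate their share of the total. The `pareto_table` helper and the defect tallies are hypothetical, invented for illustration only.

```python
def pareto_table(counts):
    """Sort categories by count (largest first) and add cumulative percent."""
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((category, count, 100 * running / total))
    return rows


# Hypothetical defect tallies.
defects = {
    "contamination": 12,
    "scratch": 42,
    "other": 3,
    "misalignment": 30,
    "crack": 8,
    "discoloration": 5,
}

rows = pareto_table(defects)
for category, count, cumulative in rows:
    print(f"{category:15s} {count:3d} {cumulative:6.1f}%")
```

In this made-up data, two of the six categories account for 72 percent of the defects, which is the kind of imbalance a Pareto chart is meant to expose.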

Pareto example

Histogram Example in Excel

[Figure: "Line Width Histogram"; x-axis: Line Width (um), bins from 0.75 to 4.96; y-axis: Frequency, 0 to 70]

ENGR 112

Fitting Equations to Data

Introduction

Engineers frequently collect paired data in order to understand:
Characteristics of an object
Behavior of a system

Relationships between paired data are often developed graphically

Mathematical relationships between paired data can provide additional insight

Regression Analysis

Regression analysis is a mathematical analysis technique used to determine something about the relationship between random variables.

Regression Analysis Goal

To develop a statistical model that can be used to predict the value of a variable based on the value of another

Regression Analysis

Regression models are used primarily for the purpose of prediction

Regression models typically involve:
A dependent or response variable, represented as y
One or more independent or explanatory variables, represented as x1, x2, …, xn

Regression Analysis

Our focus?

Models with only one explanatory variable

These models are called simple linear regression models

Regression Analysis

A scatter diagram is used to plot an independent variable vs. a dependent variable

[Figure: scatter diagram, "Mail-Order House: Relationship between Weight of Mail and No. of Orders"; x-axis: Weight of Mail (lbs), 0 to 800; y-axis: No. of Orders (thousands), 0 to 25]

Regression Analysis

Remember!!

Relationships between variables can take many forms

Selection of the proper mathematical model is influenced by the distribution of the X and Y values on the scatter diagram

Regression Analysis

[Figure: four scatter diagram panels, each plotting Y against X]

Regression Analysis Model

SIMPLE LINEAR REGRESSION MODEL

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

$\varepsilon_i$ represents the random error in Y for each observation i that occurs

However, both $\beta_0$ and $\beta_1$ are population parameters

Regression Analysis Model

Since we will be working with samples, the previous model becomes

$$\hat{Y}_i = b_0 + b_1 X_i$$

Where
$b_0$ = Y intercept (estimate of $\beta_0$): value of Y when X = 0
$b_1$ = slope (estimate of $\beta_1$): expected change in Y per unit change in X
$\hat{Y}_i$ = predicted (estimated) value of Y

Regression Analysis Model

What happened with the error term?

Unfortunately, it is not gone. We still have errors in the estimated values:

$$e_i = Y_i - \hat{Y}_i$$

Regression Analysis

Find the straight line that BEST fits the data

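Regression Analysis

[Figure: "Positive Straight-Line Relationship"; Y plotted against X with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$, intercept $b_0$, slope $b_1$, and errors $e_1$ to $e_5$ marked]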

Least Squares Method

Mathematical technique that determines the values of b0 and b1

It does so by minimizing the following expression

$$\min \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$$
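The slides do not show how the minimization is carried out. A standard calculus argument, sketched here as a supplement and not taken from the deck, writes the sum of squared errors as a function S of b0 and b1 and sets both partial derivatives to zero:

```latex
S(b_0, b_1) = \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2

\frac{\partial S}{\partial b_0} = -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) = 0
\;\Longrightarrow\;
\sum_{i=1}^{n} Y_i = n b_0 + b_1 \sum_{i=1}^{n} X_i

\frac{\partial S}{\partial b_1} = -2 \sum_{i=1}^{n} X_i (Y_i - b_0 - b_1 X_i) = 0
\;\Longrightarrow\;
\sum_{i=1}^{n} X_i Y_i = b_0 \sum_{i=1}^{n} X_i + b_1 \sum_{i=1}^{n} X_i^2
```

Each condition is linear in b0 and b1, so the two together form a pair of simultaneous linear equations.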

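Least Squares Method

Resulting equations

$$\sum_{i=1}^{n} Y_i = n b_0 + b_1 \sum_{i=1}^{n} X_i \qquad (1)$$

$$\sum_{i=1}^{n} X_i Y_i = b_0 \sum_{i=1}^{n} X_i + b_1 \sum_{i=1}^{n} X_i^2 \qquad (2)$$

Equations (1) and (2) are called the “normal equations”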
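One way to apply the normal equations to raw paired data is sketched below in Python: compute the four sums, then solve the two equations for b0 and b1. The function name `fit_line` and the data points are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for intercept b0 and slope b1."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Determinant of the 2x2 system; zero when every x is identical.
    det = n * sum_x2 - sum_x ** 2
    if det == 0:
        raise ValueError("all x values are identical; slope is undefined")

    b1 = (n * sum_xy - sum_x * sum_y) / det
    b0 = (sum_y - b1 * sum_x) / n  # rearranged from equation (1)
    return b0, b1


# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
print(f"Y-hat = {b0:.2f} + {b1:.2f} X")
```

For this made-up data the sums are n = 5, sum of x = 15, sum of y = 20, sum of x squared = 55, and sum of xy = 66, which gives a slope of 0.6 and an intercept of 2.2.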

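Least Squares Method

Assume the following values

$$n = 5, \quad \sum x = 2, \quad \sum y = 20, \quad \sum x^2 = 10, \quad \sum xy = 15$$

Resulting equations

$$5 b_0 + 2 b_1 = 20 \qquad (1)$$

$$2 b_0 + 10 b_1 = 15 \qquad (2)$$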
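The slides stop at the two equations. Reading them as 5b0 + 2b1 = 20 and 2b0 + 10b1 = 15, one way to finish the example is by elimination. This working is added here as an illustration and is not part of the deck:

```latex
5 \times (1): \quad 25 b_0 + 10 b_1 = 100

5 \times (1) - (2): \quad 23 b_0 = 85
\;\Longrightarrow\; b_0 = \frac{85}{23} \approx 3.70

\text{From } (1): \quad b_1 = \frac{20 - 5 b_0}{2} = \frac{35}{46} \approx 0.76

\hat{Y}_i \approx 3.70 + 0.76 X_i
```

As a check, substituting into the second equation gives 2(85/23) + 10(35/46) = 170/23 + 175/23 = 15.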

Assessing Fit

How do we know how good a regression model is?

Sum of squares of errors (SSE)
Good if we have additional models to compare against

Coefficient of determination $r^2$
A value close to 1 suggests a good fit

$$r^2 = 1 - \frac{SSE}{SST}$$

Where do we get these values?
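Under the standard definitions, which this excerpt does not spell out, SSE is the sum of squared differences between each observed Y and its predicted value, and SST is the sum of squared differences between each observed Y and the mean of Y. The Python sketch below computes both along with the coefficient of determination. The `goodness_of_fit` helper, the data, and the coefficients are hypothetical; the coefficients are the least-squares fit of this made-up data.

```python
def goodness_of_fit(ys, predictions):
    """Return (SSE, SST, r squared) for observed and predicted values."""
    mean_y = sum(ys) / len(ys)
    # SSE: variation left unexplained by the fitted line.
    sse = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    # SST: total variation of Y around its mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    return sse, sst, 1 - sse / sst


# Hypothetical paired data and its least-squares line Y-hat = 2.2 + 0.6 X.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

predictions = [b0 + b1 * x for x in xs]
sse, sst, r2 = goodness_of_fit(ys, predictions)
print(f"SSE = {sse:.2f}, SST = {sst:.2f}, r^2 = {r2:.2f}")
```

Here SSE is 2.4 and SST is 6, so the coefficient of determination is 0.6: the line accounts for about 60 percent of the variation in Y for this made-up data.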