Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected...

15
Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental unit. Special techniques that can be used in describing these variables will help you identify possible relationships between them. ©1998 Brooks/Cole Publishing/ITP

Transcript of Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected...

Page 1: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

Chapter 3 Describing Bivariate Data

General Objectives:

Sometimes the data that are collected consist of observations for two variables on the same experimental unit. Special techniques that can be used in describing these variables will help you identify possible relationships between them.

©1998 Brooks/Cole Publishing/ITP

Page 2: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

Specific Topics

1. Bivariate data

2. Side-by-side pie charts, comparative line charts

3. Side-by-side bar charts, stacked bar charts

4. Scatterplots for two quantitative variables

5. Covariance and the correlation coefficient

6. The best-fitting line

©1998 Brooks/Cole Publishing/ITP

Page 3: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

3.1 and 3.2 Bivariate Data and Graphs for Qualitative Variables

When two variables are measured on a single experimental unit, the resulting data are called bivariate data.

You may want to explore the relationship between the two variables.

If you have one qualitative and one quantitative variable, you can use two side-by-side charts or a bar chart in which the bars for the two populations are placed side by side.

Another option is to use a stacked bar chart, where the bars for each category are stacked on top of each other.

See Figures 3.1, 3.2, and 3.3 for examples of techniques.

©1998 Brooks/Cole Publishing/ITP

Page 4: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

©1998 Brooks/Cole Publishing/ITP

Figure 3.1 Comparative bar charts

Page 5: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

©1998 Brooks/Cole Publishing/ITP

Figure 3.2 Comparative pie charts

Page 6: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

©1998 Brooks/Cole Publishing/ITP

Figure 3.3 Stacked bar chart

Page 7: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

3.3 Scatterplots for Two Quantitative Variables

Describing patterns shown in a scatterplot:

- What type of pattern do you see?

Upward or downward? Curved? None or random?

- How strong is the pattern?

All points follow it? Only weakly visible?

- Are there any unusual observations?

Outliers? Clusters? Explanation for groupings? See Figures 3.4 and 3.5 for examples of the use of a scatterplot.

©1998 Brooks/Cole Publishing/ITP

Page 8: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

©1998 Brooks/Cole Publishing/ITP

Figure 3.4

Page 9: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

©1998 Brooks/Cole Publishing/ITP

Figure 3.5

Page 10: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

3.4 Numerical Measures for Quantitative Bivariate Data

Sometimes two variables exhibit a linear relationship, e.g., the relationship between height and weight in humans.

©1998 Brooks/Cole Publishing/ITP

Page 11: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

A simple measure of the relationship between two variables x and y is the correlation coefficient, r :

where sx and sy are the standard deviations for the variables

x and y.

The new quantity sxy is called the covariance between x and

y and is defined as:

A computing formula for the covariance:

where xiyi is the sum of the products xiyi for each of the n pairs of measurements.

©1998 Brooks/Cole Publishing/ITP

)/( yxxy sssr

1)()(

__

nyyxx

s iixy

1

))((

nn

yxyx

s

iiii

xy

Page 12: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

If the points in the x vs y plot tend to run from lower left to upper right, then sxy and r will be positive.

If the points tend to run from upper left to lower right, then sxy and r will be negative.

If the points are scattered high and low and left and right, then sxy and r will be close to zero.

The value of y typically depends on the value of x ; y is called the dependent variable and x is called the independent variable.

Sometimes it is possible to describe the relationship relating x to y using a straight line given by the equation y ax b.

©1998 Brooks/Cole Publishing/ITP

Page 13: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

The best-fitting line relating y to x, called the regression or least-squares line, is found by minimizing the sum of the squared differences between the data points and the line itself.

The formulas for computing b and a are

When r is positive, b is positive; when r is negative, b is negative; when r is zero, b is zero.

©1998 Brooks/Cole Publishing/ITP

and xbyas

srb

x

y

Page 14: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

Example

An electrical engineer is interested in calibrating a volt meter to be used on a production line. For the test, 10 different flow rates are fixed and the corresponding meter readings observed.

Measured Voltage Actual Voltage

14 10

23 20

31 30

42 40

51 50

58 60

68 70

76 80

87 90

95 100©1998 Brooks/Cole Publishing/ITP

Page 15: Chapter 3 Describing Bivariate Data General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental.

Compute the correlation coefficient between the observed and actual voltages.

Solution

©1998 Brooks/Cole Publishing/ITP

67.065)SQRD554(09364)( 22

..n

yysyy

82.50(55)SQRD/1385)( 22

n

xxsxx

7435055(54.5)/1374.1))()(( 2

n

yxxysxy

.999665))(82.5(67.074.35/SQRT yyxxxy sss