1.4 Data in 2 Variables Definitions. 5.3 Data in 2 Variables: Visualizing Trends When data is...

1.4 Data in 2 Variables Definitions

Page 1

1.4 Data in 2 Variables

Definitions

Page 2

5.3 Data in 2 Variables: Visualizing Trends

• When data is collected over a long period of time, it may show trends

• Trends allow you to make predictions about future events

• Trends can be over time, or over a change in some other variable (e.g. mass)

• One effective way to visualize trends: a scatterplot

– Shows the joint distribution of 2 variables

Page 3

Scatterplots

[Figure: scatterplot of Distance (m) against Time (s)]

Independent variable

• Variable whose values are arbitrarily chosen

Dependent variable

• Variable whose values depend on the independent variable

Page 4

• Scatterplots can help determine if there is a relationship in the data

– Is there a pattern in the data?

[Figure: two scatterplots of y against x]

• There is a relationship

– As x increases, y increases

• There is no relationship

– As x increases, y stays pretty much the same

Page 5

We can show the relationship using a line of best fit

[Figure: scatterplot of Distance (m), axis 0 to 30, against Time (s), axis 0 to 12, with a line of best fit]
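The slides do not say how the line of best fit is obtained. A minimal sketch, assuming the usual least-squares method (suggested by the "Sum of squares" values reported later for the long-jump data), is shown below. The function names and the sample time/distance values are hypothetical and chosen only for illustration.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x about its mean, and joint spread of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def sum_of_squares(xs, ys, m, b):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Made-up measurements (not from the slides): time in s, distance in m
times = [0, 2, 4, 6, 8, 10, 12]
distances = [1.0, 5.5, 10.0, 16.0, 19.5, 26.0, 29.0]

m, b = line_of_best_fit(times, distances)
print(f"distance = {m:.3f} * time + {b:.3f}")
print(f"sum of squares = {sum_of_squares(times, distances, m, b):.3f}")
```

A smaller sum of squares means the points sit closer to the line, which is the sense in which this line is the "best" fit.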

Page 6

If the data is nonlinear, we use a curve of best fit

[Figure: two scatterplots of Distance (m), axis 0 to 250, against Time (s), axis 0 to 12]

Page 7

Correlation

• Measure of the strength of the apparent relationship between two variables

• Look at upward/downward/horizontal trend

– Positive/Negative/No correlation

• Look at how closely the points fit the curve of best fit

– Strong/Moderate/Weak correlation

• Note: trend (direction) and fit (strength) are independent of each other
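The slides describe correlation qualitatively. One common way to put a number on it for linear relationships is the Pearson correlation coefficient r, which is not mentioned in the slides and is introduced here as an assumption. In the sketch below, the sign of r gives the direction and its size gives the strength. The cut-offs 0.33 and 0.67 are illustrative choices, not values from the source.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / math.sqrt(sxx * syy)


def classify(r, moderate=0.33, strong=0.67):
    """Return (strength, direction); the thresholds are assumed, not standard."""
    # Direction depends only on the sign of r
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    # Strength depends only on the size of r
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= moderate:
        strength = "moderate"
    else:
        strength = "weak"
    return strength, direction


# Made-up data lying exactly on a rising line
print(classify(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])))
```

Because direction and strength are computed separately, a correlation can be strong and negative or weak and positive, which matches the note that trend and fit are separate questions.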

Page 8

Classifying Linear Relationships

[Figure: scatterplot of y against x]

Strong positive correlation

• Positive slope

• Tightly clustered around the line of best fit

Page 9

Classifying Linear Relationships

[Figure: scatterplot of y against x]

Strong negative correlation

• Negative slope

• Tightly clustered around the line of best fit

Page 10

Classifying Linear Relationships

[Figure: scatterplot of y against x]

No correlation

• Zero slope

• Randomly scattered

Page 11

Classifying Linear Relationships

[Figure: scatterplot of y against x]

Moderate positive correlation

Page 12

Classifying Linear Relationships

[Figure: scatterplot of y against x]

Weak positive correlation

Page 13

Classifying Linear Relationships

[Figure: scatterplot of y against x]

Weak negative correlation

Page 14

Warning!!!

• Correlation does not necessarily mean causation

• Just because there is a relationship between A and B does not mean A causes B

– More on this next day

Page 15

Using trends for predictions

• Use the equation of the line of best fit

• Extrapolation

– Estimation of a value outside the known data set

• Interpolation

– Estimation of a value between two known values

Page 16

[Figure: line of best fit y = mx + b on a scatterplot of y against x, with regions labelled Extrapolation and Interpolation]
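As a rough illustration of the two kinds of estimate, the sketch below evaluates y = mx + b at a requested x and reports whether that x lies inside the range of the known data (interpolation) or outside it (extrapolation). The function name `predict` and the sample numbers are hypothetical.

```python
def predict(x, m, b, known_xs):
    """Estimate y = m*x + b and say which kind of estimate it is."""
    if not known_xs:
        raise ValueError("known_xs must contain at least one value")
    y = m * x + b
    # Inside the observed x-range counts as interpolation, outside as extrapolation
    if min(known_xs) <= x <= max(known_xs):
        kind = "interpolation"
    else:
        kind = "extrapolation"
    return y, kind


# Made-up line y = 2x + 1 fitted to data observed for x between 0 and 10
observed_x = [0, 2, 4, 6, 8, 10]
print(predict(5, 2.0, 1.0, observed_x))
print(predict(12, 2.0, 1.0, observed_x))
```

Extrapolated values deserve more caution, since nothing in the data confirms that the trend continues beyond the observed range.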

Page 17

Go to “Go For the Gold!”

Page 18

Go for the Gold! Line of Best Fit: Men

• Mensdistance = 0.016 × Year – 24.04

• Sum of squares = 0.8308

• Slope is 0.016: change in distance (in metres) per year

– Every year, the distance should increase by 1.6 cm

• Y-intercept is –24.04

– In year zero, they jumped backwards!?

– Meaningless: year zero is far outside the range of the data

Page 19

Go for the Gold! Line of Best Fit: Women

• Womensdistance = 0.021 × Year – 35

• Sum of squares = 0.7447

• Slope is 0.021; y-intercept is –35

– Every year, the winning women’s distance should increase by 2.1 cm

– Y-intercept is meaningless for this case

• In 2008, the winning men’s distance should be 8.85 m and the women’s distance should be 7.17 m (actual distances: 8.34 m and 7.04 m)

• In 2012?
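The predictions above come from substituting a year into each line of best fit. The sketch below does this with the coefficients exactly as printed on the slides. The function names are hypothetical. With these rounded coefficients the women's 2008 value comes out at about 7.17 m, matching the slide, but the men's comes out at about 8.09 m rather than 8.85 m. One possible explanation, which is an assumption and not stated in the source, is that the original software used more decimal places than the slide shows: a slope near 0.01638 would reproduce 8.85 m.

```python
def mens_distance(year):
    """Predicted winning men's distance in metres (coefficients as printed)."""
    return 0.016 * year - 24.04


def womens_distance(year):
    """Predicted winning women's distance in metres (coefficients as printed)."""
    return 0.021 * year - 35


for year in (2008, 2012):
    men = mens_distance(year)
    women = womens_distance(year)
    print(f"{year}: men {men:.2f} m, women {women:.2f} m")
```

Small rounding in the slope matters here because it is multiplied by a year value around 2000, so an error of 0.0004 in the slope shifts the prediction by roughly 0.8 m.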