1.3 Trends in Data
![Page 1: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/1.jpg)
We’re ‘NUT’ Giving Up Fundraiser
One Grand Prize:
- Airline tickets Montreal/Ft. Lauderdale return
- 3 nights’ accommodation at Marriott Fort Lauderdale
- 2 tickets to Florida Panthers Alumni Box
- Dinner with Florida Panthers’ Jesse Winchester
- 2 signed Florida Panthers jerseys
- 2 tickets to Miami Dolphins game
500 tickets sold at $100 each. Should Mr. Lieff buy one?
![Page 2: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/2.jpg)
We’re ‘NUT’ Giving Up Fundraiser
- Airline tickets Montreal/Ft. Lauderdale return: $800
- 3 nights’ accommodation at Marriott Fort Lauderdale: $450
- 2 tickets to Florida Panthers Alumni Box: $400
- Dinner with Florida Panthers’ Jesse Winchester: $250
- 2 signed Florida Panthers jerseys: $400
- 2 tickets to Miami Dolphins game: $200
- TOTAL: $2500
E(X) = 2500 × 1/500 = 5
So your expected winnings are $5 per $100 ticket. You are better off taking your $100 to a blackjack table, where E(X) = 98.5!
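The expected-value calculation above can be checked with a short Python sketch. The prize values, ticket count and ticket price come from the slide; the function and variable names are illustrative, and the sketch assumes a single winning ticket takes the whole prize package.

```python
from fractions import Fraction

# Prize values in dollars, as listed on the slide.
PRIZES = {
    "Airline tickets": 800,
    "Accommodation": 450,
    "Alumni Box tickets": 400,
    "Dinner": 250,
    "Signed jerseys": 400,
    "Dolphins tickets": 200,
}
TICKETS_SOLD = 500
TICKET_PRICE = 100


def expected_winnings(prize_total, tickets_sold):
    """Expected prize value per ticket when one ticket wins everything."""
    return Fraction(prize_total, tickets_sold)


prize_total = sum(PRIZES.values())
ev = expected_winnings(prize_total, TICKETS_SOLD)
net = ev - TICKET_PRICE  # expected gain after paying for the ticket

print(f"Total prize value: ${prize_total}")
print(f"Expected winnings per ticket: ${ev}")
print(f"Expected net result per ticket: ${net}")
```

The net figure makes the comparison explicit: on average a ticket returns $5 of its $100 price, so the expected loss is $95.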
![Page 3: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/3.jpg)
1.3 Trends in Data
Questions? pp. 20–24 #1, 4, 9, 11, 14
Learning goals:
- Describe the trend and correlation in a scatter plot
- Use a line of best fit to make predictions
MSIP / Home Learning: p. 37 #2, 3, (6-7 or 8)
![Page 4: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/4.jpg)
Variables
Variable (Mathematics)
- a symbol denoting an unknown quantity (x, y, θ, etc.)
Variable (Statistics)
- a measurable attribute; these typically vary over time or between individuals
- e.g., height, age, favourite hockey team
- can be discrete, continuous or categorical
  - Continuous: weight (digital scale)
  - Discrete: number of siblings
  - Categorical: hair colour
![Page 5: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/5.jpg)
Scatter Plot
- a graph that shows two numeric variables
- each axis represents a variable
- each point indicates a pair of values (x, y)
- may show a trend
![Page 6: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/6.jpg)
The Two Types of Variables on a Scatter Plot
Independent Variable
- Horizontal axis
- Time is independent (why?)
- Timing is dependent (e.g., time to run 100 m)
Dependent Variable
- Values depend on the independent variable
- Vertical axis
Format: “dependent vs. independent”
- e.g., a graph of arm span vs. height means arm span is the dependent variable and height is the independent variable
![Page 7: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/7.jpg)
What is a trend?
- the ‘direction’ of the data
- a pattern of average behaviour that occurs over time
- e.g., costs tend to increase over time (inflation)
- need two variables to exhibit a trend (time can be one)
![Page 8: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/8.jpg)
An Example of a trend
U.S. population from 1780 to 1960
Describe the trend
[Scatter plot: U.S. population in millions (axis 0 to 140) vs. year (axis 1780 to 1960); data set labelled Pearl, Reed and Kish (1940), U.S. population from 1790 to 1940]
![Page 9: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/9.jpg)
Correlations
Strength can be…
- None: no clear pattern in the data
- Weak: data loosely follows a pattern
- Strong: data follows a clear pattern
If strong or weak, the direction can be…
- Positive: data rises from left to right (overall); as x increases, y increases
- Negative: data drops from left to right (overall); as x increases, y decreases
http://www.seeingstatistics.com/seeing1999/gallery/CorrelationPicture.html
Strong, positive linear correlation
![Page 10: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/10.jpg)
AGENDA for Fri-Mon
1.3 Median-Median Line
- Using a regression equation
- Fathom Activity: Predict your weight as an NHL player
1.4 Trends With Technology
- Correlation Coefficient (R)
- Coefficient of Determination (R²)
- Residuals / Least-Squares Line
- Fathom Investigation: finding the Least-Squares Line
![Page 11: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/11.jpg)
Line of Best Fit
- A straight line that represents the trend in the data
- Can be used to make predictions (graph or equation)
- Can be drawn or calculated
  - Fathom has 3: movable, median-median, least squares
- Gives no measurement of the strength of the trend (that’s next class!)
![Page 12: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/12.jpg)
An example line of best fit
this is temperature recycling data with a median-median line added
what type of trend are we looking at?
![Page 13: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/13.jpg)
Median-Median Line
![Page 14: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/14.jpg)
Creating a Median-Median Line
- Divide the points, ordered by x-value, into 3 symmetric groups
  - If there is 1 extra point, include it in the middle group
  - If there are 2 extra points, include one in each end group
- Calculate the median x- and y-coordinates for each group and plot the 3 median points (x, y)
- If the median points are in a straight line, connect them
- Otherwise, draw a line through the two outer median points, then shift it 1/3 of the way toward the middle median point; this shifted line is the line of best fit
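The construction above can be sketched in Python. This is a minimal illustration rather than Fathom's implementation; the function name and the sample points are made up, and the grouping follows the extra-point rule stated on the slide.

```python
from statistics import median


def median_median_line(points):
    """Return (slope, intercept) of the median-median line for (x, y) pairs."""
    pts = sorted(points)  # order the points by x-value
    n = len(pts)
    if n < 3:
        raise ValueError("need at least 3 points")

    # Split into 3 symmetric groups: 1 extra point goes to the middle
    # group, 2 extra points go one to each end group.
    size, extra = divmod(n, 3)
    end_n = size + (1 if extra == 2 else 0)
    groups = [pts[:end_n], pts[end_n:n - end_n], pts[n - end_n:]]

    # Median point (median x, median y) of each group.
    (x1, y1), (x2, y2), (x3, y3) = [
        (median(x for x, _ in g), median(y for _, y in g)) for g in groups
    ]
    if x3 == x1:
        raise ValueError("outer median points share the same x-value")

    # Line through the two outer median points...
    slope = (y3 - y1) / (x3 - x1)
    outer_intercept = y1 - slope * x1
    # ...shifted 1/3 of the way toward the middle median point.
    gap = y2 - (slope * x2 + outer_intercept)
    return slope, outer_intercept + gap / 3


# Illustrative data lying exactly on y = 2x + 1.
sample = [(x, 2 * x + 1) for x in range(1, 10)]
print(median_median_line(sample))  # (2.0, 1.0)
```

Because only the three median points are used, a single extreme value inside a group has little effect on the resulting line.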
![Page 15: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/15.jpg)
Median-Median Line (10 points)
![Page 16: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/16.jpg)
Lines of Best Fit – why 3?
- Drawing a line of best fit is arbitrary
  - Hit as many points as possible
  - Have the same number of points above and below the line
  - Outliers tend to be ignored
- The median-median line is easy to construct and takes the spread of the data into consideration
- The least-squares line takes every point into consideration but is based on a complicated formula
- Good-Better-Best is a recurring theme in this course
  - 3.3 Measures of Spread (Range, IQR, StdDev)
![Page 17: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/17.jpg)
Using a regression equation
- The equation of a line of best fit will be in the form y = mx + b
- e.g., Toronto Maple Leafs roster on 3-Oct-13: W = 7.25H – 332 (intercept rounded from 331.8)
- Mr. Lieff is 73.5” tall. His weight as a Maple Leaf would be:
  W = 7.25(73.5) – 331.8 = 201.075, or about 201 lbs.
![Page 18: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/18.jpg)
Fathom Activity – How much would you weigh as an NHL player?
To Generate and Import Data:
- Click http://www.nhl.com/ice/playerstats.htm
- Pick a group of players that you want to associate with
  - TEAM: Pick your favourite OR select Position, Country, Status, etc.
- Select REPORT: BIOS
- Click GO>
- Copy the URL
- Open Fathom
- Click File > Import > Import From URL
- Paste the URL
- Double-click the Collection name and shorten it
- Expand the Collection, right-click the first case and click Cut Case.
![Page 19: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/19.jpg)
To create a graph of Weight vs. Height
- Create a scatter plot of Weight vs. Height
  - Double-click the Collection icon (cardboard box)
  - Click the Cases tab
  - Create a graph in the workspace
  - Drag Weight and Height to the respective axes (which is dependent?)
- Right-click and select Median-Median Line
- Use the equation to:
  - Predict your weight based on your height
  - Discuss with a neighbour: is the prediction reasonable? Are there any limitations to the model?
- Extension: How would you predict your NHL height based on your current weight?
![Page 20: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/20.jpg)
Scatter Plots - Summary
- A graph that compares two numeric variables
  - One is dependent on the other
- May show a correlation
  - positive/negative
  - strong/weak
- A line may be a good model
  - Median-Median and Least-Squares
- If not, non-linear (can be quadratic, exponential, logarithmic, etc.)
  - Excel can do these
![Page 21: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/21.jpg)
1.4 Trends in Data Using Technology
Learning goal: Describe and measure the strength of trends
Questions? p. 37 #2, 3, (6-7 or 8)
MSIP / Home Learning: p. 51 #1-2, 3-5 (Fathom), 8
![Page 22: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/22.jpg)
Regression
- The process of fitting a line or curve to a set of data
- A line of best fit is a linear regression (Excel or Fathom)
- A curve can be quadratic, cubic, exponential, logarithmic, etc. (Excel)
- We do this to generate a mathematical model (graph or equation)
- We can use the equation to make predictions
  - Interpolation: within the span of the data
  - Extrapolation: outside of the span of the data
![Page 23: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/23.jpg)
Example
armspan = 0.87 height + 22
y = 0.87x + 22
What is the arm span of a student who is 175 cm tall?
y = 0.87(175) + 22 = 174.25 cm
How tall is a student with a 160 cm arm span?
y = 0.87x + 22
160 = 0.87x + 22
160 – 22 = 0.87x
138 = 0.87x
x = 138 ÷ 0.87 ≈ 158.6 cm
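Both directions of this example, predicting y from x and solving back for x, can be written as two small Python functions. The slope 0.87 and intercept 22 are taken from the slide; the function names are illustrative.

```python
SLOPE = 0.87      # cm of arm span per cm of height (from the slide)
INTERCEPT = 22.0  # cm (from the slide)


def arm_span_from_height(height_cm):
    """Predict arm span with y = 0.87x + 22."""
    return SLOPE * height_cm + INTERCEPT


def height_from_arm_span(arm_span_cm):
    """Rearrange y = 0.87x + 22 to x = (y - 22) / 0.87."""
    return (arm_span_cm - INTERCEPT) / SLOPE


print(f"{arm_span_from_height(175):.2f} cm")  # 174.25 cm
print(f"{height_from_arm_span(160):.1f} cm")  # 158.6 cm
```

Whether either prediction is an interpolation or an extrapolation depends on the span of the original data, which the slide does not give.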
![Page 24: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/24.jpg)
Correlation Coefficient
r² is the coefficient of determination
- Takes on values from 0 to 1
- r² is the proportion of the variation in the y-variable that is explained by the linear relationship with x
- If r² = 0.52 for the Leafs weight vs. height, 52% of the variation in weight is explained by height
r is the correlation coefficient
- Indicates the strength and direction of a linear relationship
- r = 0: no linear relationship
- r = 1: perfect positive correlation
- r = -1: perfect negative correlation
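As a rough sketch of how r and r² can be computed by hand, the Python below uses the standard Pearson formula. The height and weight values are invented for illustration and are not the Leafs roster data.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / sqrt(sxx * syy)


# Made-up heights (in) and weights (lb), for illustration only.
heights = [70, 71, 72, 73, 74, 75]
weights = [180, 195, 188, 205, 199, 215]

r = correlation(heights, weights)
r_squared = r ** 2
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

The sign of r gives the direction of the trend, while r² is the figure to quote when describing how much of the variation the line accounts for.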
![Page 25: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/25.jpg)
Residuals
- A residual is the vertical distance between a point and the line of best fit (observed y minus predicted y)
- If the model you are considering is a good fit, the residuals should be small and have no noticeable pattern
- The least-squares line minimizes the sum of the squares of the residuals
[Figure: “Collection 1” scatter plot of y vs. x with the line y = 0.0804x + 3.5; r^2 = 0.021, and a residual plot (Residual vs. x) beneath it]
http://www.math.csusb.edu/faculty/stanton/m262/regress/
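To make the link between residuals and the least-squares line concrete, here is a small Python sketch using the standard closed-form slope and intercept. The data points and function names are invented for illustration; this is not the Fathom procedure.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys, slope, intercept):
    """Observed y minus predicted y for each point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def sum_of_squares(values):
    return sum(v * v for v in values)


# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
print(f"y = {slope:.2f}x + {intercept:.2f}")
print("residuals:", [round(e, 2) for e in res])
print("sum of squared residuals:", round(sum_of_squares(res), 2))
```

Nudging the slope or intercept away from these values and recomputing the sum of squared residuals gives a larger total, which is what "least squares" means.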
![Page 26: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/26.jpg)
Least Squares Line
Weight vs. Height (NHL)
w = 7.23h – 325
![Page 27: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/27.jpg)
Using the equation
How much does a player who is 71 in. tall weigh?
w = 7.23(71) – 325 = 188.33 lbs
How tall is a player who weighs 180 lbs?
w = 7.23h – 325
h = (w + 325) ÷ 7.23
So h = (180 + 325) ÷ 7.23 = 69.85” or 177.4 cm
![Page 28: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/28.jpg)
NHL Least-Squares Line Activity See handout
![Page 29: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/29.jpg)
1.5 Comparing Apples to Oranges
http://www.smarter.org/research/apples-to-oranges/
![Page 30: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/30.jpg)
The Power of Data
Chapter 1.5 – The Media
Mathematics of Data Management (Nelson)
MDM 4U
There are 3 kinds of lies: lies, damn lies and statistics.
![Page 31: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/31.jpg)
Example 1 – Changing the scale on the axis
Why is the following graph misleading?
[Bar graph: “Favourite Teacher”; categories Mr. Lieff, Mr. Winter, Mr. Dickie, Mr. Frey; vertical axis from 40 to 52]
![Page 32: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/32.jpg)
Example 1 – Scale from 0
Consider that this is a bar graph – could it still be misleading?
[Bar graph: “Favourite Teacher”; categories Mr. Lieff, Mr. Winter, Mr. Dickie, Mr. Frey; vertical axis from 0 to 60]
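One way to quantify the effect of the axis scale is to compare the ratio of the drawn bar heights with the ratio of the actual values. The sketch below uses two hypothetical vote counts (50 and 44), since the transcript does not preserve the real bar values; the function name is also illustrative.

```python
def drawn_height_ratio(value_a, value_b, axis_start):
    """Ratio of the drawn bar heights when the axis begins at axis_start."""
    return (value_a - axis_start) / (value_b - axis_start)


# Hypothetical vote counts, chosen only to illustrate the effect.
votes_a, votes_b = 50, 44

from_zero = drawn_height_ratio(votes_a, votes_b, axis_start=0)
truncated = drawn_height_ratio(votes_a, votes_b, axis_start=40)

print(f"Axis from 0:  bar A looks {from_zero:.2f} times as tall as bar B")
print(f"Axis from 40: bar A looks {truncated:.2f} times as tall as bar B")
```

With these numbers, a lead of about 14% is drawn as a bar two and a half times as tall once the axis starts at 40.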
![Page 33: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/33.jpg)
Include every category!
[Bar graph: “Favourite Teacher”; categories Mr. Lieff, Mr. Winter, Mr. Dickie, Mr. Frey, Mr. Villanueva; vertical axis from 0 to 70]
![Page 34: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/34.jpg)
Example 2 – Using a Small Sample
For the following surveys, consider:
- The sample size
- If there is any (mis)leading language
![Page 35: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/35.jpg)
Example 2 – Using a Small Sample
- “4 out of 5 dentists recommend Trident sugarless gum to their patients who chew gum.”
- “In the past, we found errors in 4 out of 5 of the returns people brought in for a Second Look review.” (H&R Block)
- “Did you know that 1 in 4 women can misread a traditional pregnancy test result?” (Clearblue Easy Digital Pregnancy Test)
- “Using Pedigree® DentaStix® daily can reduce the build up of tartar by up to 80%.”
- “Did you know that the average Canadian wastes $500 of food in a year?” (Ziploc freezer bags)
![Page 36: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/36.jpg)
Details on the Trident Survey
- How many dentists did they ask?
  - Actual number: 1200
- 4 out of 5 is convincing but reasonable
  - 5 out of 5 is preposterous
  - 3 out of 5 is good but not great
  - Actual statistic: 85%
- Recommend Trident over what?
  - There were 2 other options:
    - Chewing sugared gum
    - Not chewing gum
![Page 37: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/37.jpg)
Misleading Statements(?)
How could these statements be misleading?
- “More people stay with Bell Mobility than any other provider.”
- “Every minute of every hour of every business day, someone comes back to Bell.”
![Page 38: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/38.jpg)
“More people stay with Bell Mobility than any other provider.”
- Does not specify how many more customers stay with Bell.
  - e.g., percentage of customers renewing their plan: Bell: 30%, Rogers: 29%, Telus: 25%, Fido: 28%
- Did they compare percentages or totals?
- What does it mean to “stay with Bell”? Honour entire contract? Renew contract at the end of a term?
- Are early terminations factored in? If so, does Bell have a higher cost for early terminations?
- Competitors’ renewal rates may have decreased due to family plans / bundling
- Does the data include Private / Corporate plans?
![Page 39: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/39.jpg)
“Every minute of every hour of every business day, someone comes back to Bell.”
- 60 mins × 7 hours × 5 days = 2 100/wk
- What does it mean to “come back to Bell”?
- How many hours in a business day?
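The weekly count implied by the slogan depends entirely on how long a "business day" is taken to be. The Python sketch below reproduces the slide's 7-hour figure and shows how the total changes under other assumed day lengths; the function name and the alternative hours are illustrative.

```python
MINUTES_PER_HOUR = 60
BUSINESS_DAYS_PER_WEEK = 5


def returns_per_week(hours_per_business_day):
    """Customers per week if exactly one comes back every business minute."""
    return MINUTES_PER_HOUR * hours_per_business_day * BUSINESS_DAYS_PER_WEEK


# 7 hours is the slide's assumption; 8 and 10 are alternative guesses.
for hours in (7, 8, 10):
    print(f"{hours}-hour business day: {returns_per_week(hours)} per week")
```

Without knowing the company's total customer base, none of these totals says whether the return rate is large or small.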
![Page 40: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/40.jpg)
How does the media use (misuse) data?
- To inform the public about world events in an objective manner
- It sometimes gives misleading or false impressions to sway the public or to increase ratings
It is important to:
- Study statistics to understand how information is represented or misrepresented
- Correctly interpret tables/charts presented by the media
![Page 41: 1.3 Trends in Data](https://reader035.fdocuments.in/reader035/viewer/2022062410/5681624d550346895dd297de/html5/thumbnails/41.jpg)
MSIP / Homework
- Read pp. 57 – 60 Ex. 1-2
- Complete p. 60 #1-6
- Final Project Example – Manipulating Data (on wiki)
Examples
- http://junkcharts.typepad.com/
- http://www.coolschool.ca/lor/AMA11/unit1/U01L02.htm
- http://mediamatters.org/research/200503220005