Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods ...
-
Upload
desmond-harward -
Category
Documents
-
view
222 -
download
1
Transcript of Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods ...
![Page 1: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/1.jpg)
Chapter 10
Two Quantitative Variables
Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and Regression: Theory-Based
Methods
![Page 2: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/2.jpg)
Section 10.1 and 10.2: Summarizing Two Quantitative Variables and Inference for Correlation (Simulation)
Time 30 41 41 43 47 48 51 54 54 56 56 56 57 58
Score 100 84 94 90 88 99 85 84 94 100 65 64 65 89
Time 58 60 61 61 62 63 64 66 66 69 72 78 79
Score 83 85 86 92 74 73 75 53 91 85 62 68 72
I was interested in knowing the relationship between the time it takes a student to take a test and the resulting score. So I collected some data …
![Page 3: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/3.jpg)
Scatterplot
Put explanatory variable on the horizontal axis.
Put response variable on the vertical axis.
50
60
70
80
90
100
time30 40 50 60 70 80
![Page 4: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/4.jpg)
Describing Scatterplots When we describe data in a scatterplot,
we describe the Direction (positive or negative) Form (linear or not) Strength (strong-moderate-weak, we will let
correlation help us decide) How would you describe the time and test
scatterplot?
![Page 5: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/5.jpg)
Correlation Correlation measures the strength and
direction of a linear association between two quantitative variables.
Correlation is a number between -1 and 1. With positive correlation one variable increases, on
average, as the other increases. With negative correlation one variable decreases,
on average, as the other increases. The closer it is to either -1 or 1 the closer the points
fit to a line. The correlation for the test data is -0.56.
![Page 6: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/6.jpg)
Correlation GuidelinesCorrelation Value
Strength of Association
What this means
0.5 to 1.0 Strong The points will appear to be nearly a straight line
0.3 to 0.5 Moderate When looking at the graph the increasing/decreasing pattern will be clear, but it won’t be nearly a line
0.1 to 0.3 Weak With some effort you will be able to see a slightly increasing/decreasing pattern
0 to 0.1 None No discernible increasing/decreasing pattern
Same Strength Results with Negative Correlations
![Page 7: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/7.jpg)
Back to the test data I was not completely honest. There were
three more test scores. The last three people to finish the test had scores of 93, 93, and 97.
50
60
70
80
90
100
time30 40 50 60 70 80 90 100
Scatter Plot
What does this do to the correlation?
![Page 8: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/8.jpg)
Influential Observations The correlation changed from -0.56 (a fairly
strong negative correlation) to -0.12 (a fairly weak negative correlation).
Points that are far to the left or right and not in the overall direction of the scatterplot can greatly change the correlation. (influential observations)
50
60
70
80
90
100
time30 40 50 60 70 80 90 100
50
60
70
80
90
100
time30 40 50 60 70 80
Scatter Plot
![Page 9: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/9.jpg)
Correlation Correlation measures the strength and
direction of a linear association between two quantitative variables. -1 < r < 1 Correlation makes no distinction between
explanatory and response variables. Correlation has no unit. Correlation is not a resistant measure. Correlation based on averages is usually higher
than that for individuals. Correlation game
![Page 10: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/10.jpg)
Inference for Correlation with Simulation1. Compute the observed statistic. (Correlation)
2. Scramble the response variable, compute the simulated statistic, and repeat this process many times.
3. Reject the null hypothesis if the observed statistic is in the tail of the null distribution.
![Page 11: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/11.jpg)
Temperature and Heart RateHypotheses
Null: There is no association between heart rate and body temperature. (ρ = 0)
Alternative: There is a positive linear association between heart rate and body temperature. (ρ > 0)
ρ = rho
![Page 12: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/12.jpg)
Temperature and Heart Rate
Tmp 98.3 98.2 98.7 98.5 97.0 98.8 98.5 98.7 99.3 97.8
HR 72 69 72 71 80 81 68 82 68 65
Tmp 98.2 99.9 98.6 98.6 97.8 98.4 98.7 97.4 96.7 98.0
HR 71 79 86 82 58 84 73 57 62 89
Collect the Data
![Page 13: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/13.jpg)
Temperature and Heart Rate
r = 0.378
Explore the Data
![Page 14: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/14.jpg)
Temperature and Heart Rate If there was no association between heart
rate and body temperature, what is the probability we would get a correlation as high as 0.378 just by chance?
If there is no association, we can break apart the temperatures and their corresponding heart rates. We will do this by shuffling one of the variables. (The applet shuffles the response.)
![Page 15: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/15.jpg)
Temp HR98.30 8198.20 7398.70 5798.50 8697 7998.80 7298.50 7198.70 6899.30 5897.80 8998.20 8299.90 6998.60 7198.60 7297.80 6298.40 8498.70 8297.40 8096.70 6598 68
Shuffled r =-0.216
Temp HR98.30 5898.20 8698.70 7998.50 7397 7198.80 8298.50 5798.70 8099.30 7197.80 7298.20 8199.90 8298.60 8998.60 6297.80 8498.40 7298.70 6897.40 6896.70 6998 65
Shuffled r =0.212
Temp HR98.30 8698.20 6598.70 6998.50 7197 7398.80 5898.50 8198.70 7299.30 5797.80 8298.20 7199.90 8098.60 6898.60 7997.80 8498.40 8998.70 6897.40 8296.70 6298 72
Shuffled r =-0.097
![Page 16: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/16.jpg)
Temperature and Heart Rate In three shuffles, we obtained correlations
of -0.216, 0.212 and -0.097. These are all a bit closer to 0 than our sample correlation of 0.378
Let’s scramble 1000 times and compute 1000 simulated correlations.
![Page 17: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/17.jpg)
Temperature and Heart Rate Notice our null
distribution is centered at 0 and somewhat symmetric.
We found that 530/10000 times we had a simulated correlation greater than or equal to 0.378.
![Page 18: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/18.jpg)
Temperature and Heart Rate With a p-value of 0.053 we don’t have
strong evidence there is a positive linear association between body temperature and heart rate. However, we do have moderate evidence of such an association and perhaps a larger sample give a smaller p-value.
If this was significant, to what population can we make our inference?
![Page 19: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/19.jpg)
Temperature and Heart Rate Let’s look at a different data set comparing
temperature and heart rate (Example 10.5A) using the Analyzing Two Quantitative Variables applet.
Pay attention to which variable is explanatory and which is response.
The default statistic used is not correlation, so we need to change that.
![Page 20: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/20.jpg)
Work on exploration 10.2 to see if there is an association between birth date and the 1970 draft lottery number.
![Page 21: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/21.jpg)
Least Squares Regression
Section 10.3
![Page 22: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/22.jpg)
If we decide an association is linear, it’s helpful to develop a mathematical model of that association.
Helps make predictions about the response variable.
The least-squares regression line is the most common way of doing this.
Introduction
![Page 23: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/23.jpg)
Unless the points form a perfect line, there won’t be a single line that goes through every point.
We want a line that gets as close as possible to all the points
Introduction
50
60
70
80
90
100
time30 40 50 60 70 80
Scatter Plot
![Page 24: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/24.jpg)
We want a line that minimizes the vertical distances between the line and the points These distances are called residuals. The line we will find actually minimizes the
sum of the squares of the residuals. This is called a least-squares regression
line.
Introduction
![Page 25: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/25.jpg)
Are Dinner Plates Getting Larger?
Example 10.3
![Page 26: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/26.jpg)
There are many recent articles and TV reports about the obesity problem.
One reason some have given is that the size of dinner plates are increasing.
Are these black circles the same size, or is one larger than the other?
Growing Plates?
![Page 27: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/27.jpg)
They appear to be the same size for many, but the one on the right is about 20% larger than the left.
Growing Plates?
This suggests that people will put more food on larger dinner plates without knowing it.
![Page 28: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/28.jpg)
Portion Distortion There is name for this phenomenon:
Delboeuf illusion
![Page 29: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/29.jpg)
Researchers gathered data to investigate the claim that dinner plates are growing
American dinner plates sold on ebay on March 30, 2010 (Van Ittersum and Wansink, 2011)
Size and year manufactured.
Growing Plates?
Year 1950 1956 1957 1958 1963 1964 1969 1974 1975 1978size 10 10.8 10.1 10 10.6 10.8 10.6 10 10.5 10.1
Year 1980 1986 1990 1995 2004 2004 2007 2008 2008 2009Size 10.38 10.8 10.4 11 10.8 10.1 11.5 11 11.1 11
![Page 30: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/30.jpg)
Both year (explanatory variable) and size (response variable) are quantitative.
Each dot represents one plate in this scatterplot.
Describe the association here.
Growing Plates?
9.5
10.0
10.5
11.0
11.5
year1950 1960 1970 1980 1990 2000 2010
plate size Scatter Plot
![Page 31: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/31.jpg)
The association appears to be roughly linear The least squares regression line is added How can we describe this line?
Growing Plates?
9.5
10.0
10.5
11.0
11.5
1950 1960 1970 1980 1990 2000 2010year
size = 0.0128year - 14.8 2
plate size Scatter Plot
![Page 32: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/32.jpg)
The regression equation is : a is the y-intercept b is the slope x is a value of the explanatory variable ŷ is the predicted value for the response
variable For a specific value of x, the corresponding
distance is a residual
Growing Plates?
![Page 33: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/33.jpg)
The least squares line for the dinner plate data is
Or This allows us to predict plate diameter for
a particular year. Substitute 2000 for year and we predict a
diameter of -14.8 + 0.0128(2000) = 10.8 in.
Growing Plates?
![Page 34: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/34.jpg)
What is the predicted diameter for a plate manufactured in the year 2001? -14.8 + 0.0128(2001) = 10.8128 in.
How does this compare to our prediction for the year 2000 (10.8 in.)? 0.0128 larger
Slope b = 0.0128 means that diameters are predicted to increase by 0.0128 inches per year on average
Growing Plates?
![Page 35: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/35.jpg)
Slope is the predicted change in the response variable for one-unit change in the explanatory variable.
Both the slope and the correlation coefficient for this study were positive.
The slope and correlation coefficient will always have the same sign.
Growing Plates?
![Page 36: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/36.jpg)
The y-intercept is where the regression line crosses the y-axis or the predicted response when the explanatory variable equals 0.
We had a y-intercept of -14.8 in the dinner plate equation. What does this tell us about our dinner plate example? Dinner plates in year 0 were -14.8 inches.
How can it be negative? The equation works well within the range of values given
for the explanatory variable, but fails outside that range. Our equation should only be used to predict the
size of dinner plates from about 1950 to 2010.
Growing Plates?
![Page 37: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/37.jpg)
Predicting values for the response variable for values of the explanatory variable that are outside of the range of the original data is called extrapolation.
Growing Plates?
![Page 38: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/38.jpg)
Predicting Height from Footprints
Exploration 10.3
![Page 39: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/39.jpg)
Inference for the Regression Slope: Theory-
Based Approach
Section 10.5
![Page 40: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/40.jpg)
Beta vs Rho Testing the slope of the regression line is
equivalent to testing the correlation. Hence these hypotheses are equivalent.
Ho: β = 0 Ha: β > 0 (Slope) Ho: ρ = 0 Ha: ρ > 0 (Correlation)
Sample slope (b) Population (β: beta) Sample correlation (r) Population (ρ: rho)
When we do the theory based test, we will be using the t-statistic which can be calculated from either the slope or correlation.
![Page 41: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/41.jpg)
We have seen in this chapter that our null distributions are again bell-shaped and centered at 0 (for either correlation or slope as our statistic). This is also true if we use the t-statistic.
Introduction
![Page 42: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/42.jpg)
Under certain conditions, theory-based inference for correlation or slope of the regression line use t-distributions.
We are going to use the theory-based methods for the slope of the regression line. (The applet allows us to do confidence intervals for slope.)
We would get the same p-value if we used correlation as our statistic.
Validity Conditions
![Page 43: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/43.jpg)
Validity Conditions for theory-based inference for slope of the regression equation.
1. The values of the response variable at each possible value of the explanatory variable must have a normal distribution in the population from which the sample was drawn.
2. Each normal distribution at each x-value must have the same standard deviation.
Validity Conditions
![Page 44: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/44.jpg)
Suppose the graph represents a scatterplot for an entire population where the explanatory variable is heart rate and the response is body temperature.
Normal distributions oftemps for each BPM
Each normal distributionhas the same SD
Validity Conditions
![Page 45: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/45.jpg)
How realistic are these conditions? In practice, you can check these conditions
by examining your scatterplot. Let’s look at some scatterplots that do not
meet the requirements.
Validity Conditions
![Page 46: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/46.jpg)
0
20
40
60
80
100
120
140
Alcohol_Drinks
0 10 20 30 40 50 60 70
Smoked = 0.328Alcohol_Drinks - 0.24 2
The relationship between number of drinks and cigarettes per week for a random sample of students at Hope College.
The dot at (0,0)represents 524 students
Are conditions met?
Smoking and Drinking
![Page 47: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/47.jpg)
Since these conditions aren’t met, we shouldn’t apply theory-based inference
Smoking and Drinking
![Page 48: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/48.jpg)
When checking conditions for traditional tests use the following logic. Assume that the condition is met in your
data Examine the data in the appropriate
manner If there is strong evidence to contradict
that the condition is met, then don’t use the test.
![Page 49: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/49.jpg)
What do you do when validity conditions aren’t met for theory-based inference? Use the simulated-based approach
Another strategy is to “transform” the data on a different scale so conditions are met. The logarithmic scale is common
Validity Conditions
![Page 50: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/50.jpg)
Predicting Heart Rate from Body Temperature
Example 10.5A
![Page 51: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/51.jpg)
Earlier we looked at the relationship between heart rate and body temperature with 130 healthy adults
r = 0.257
Heart Rate and Body Temp
![Page 52: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/52.jpg)
We tested to see if we had convincing evidence that there is a positive association between heart rate and body temp in the population using a simulation-based approach. (We will make it 2-sided this time.)
Null Hypothesis: There is no association between heart rate and body temperature in the population. β = 0
Alternative Hypothesis: There is an association between heart rate and body temperature in the population. β ≠ 0
Heart Rate and Body Temp
![Page 53: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/53.jpg)
We get a very small p-value (≈ 0.006) since anything as extreme as our observed slope of 2.44 is very rare
Heart Rate and Body Temp
![Page 54: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/54.jpg)
We can also approximate a 95% confidence interval
observed statistic + 2 SD of statistic 2.44 ± 2(0.850) = 0.74 to 4.14
What does this mean? We’re 95% confident that, in the
population of healthy adults, each 1º increase in body temp is associated with an increase in heart rate of between 0.74 to 4.14 beats per minute
Heart Rate and Body Temp
![Page 55: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/55.jpg)
The theory-based approach should work well since the distribution has a nice bell shape
Also check the scatterplot
Heart Rate and Body Temp
![Page 56: Chapter 10 Two Quantitative Variables Inference for Correlation: Simulation-Based Methods Least-Squares Regression Inference for Correlation and.](https://reader035.fdocuments.in/reader035/viewer/2022062515/56649cad5503460f94970202/html5/thumbnails/56.jpg)
Let’s do the theory-based test using the applet. We will use the t-statistic to get our theory-
based p-value. We will find a theory-based confidence interval
for the slope. After we are done with that, do Exploration
10.5: Predicting Brain Density from Facebook Friends
Heart Rate and Body Temp