Chapter 12 Examining Relationships in Quantitative Research Copyright © 2013 by The McGraw-Hill...
Chapter 12
Examining Relationships in Quantitative Research
Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin
12-2
Relationships between Variables
• Is there a relationship between the two variables we are interested in?
• How strong is the relationship?
• How can that relationship be best described?
12-3
Relationships between Variables
– Linear relationship: The strength and nature of the relationship remain the same over the range of both variables
– Curvilinear relationship: The strength and/or direction of the relationship changes over the range of both variables
12-4
Covariation and Variable Relationships
• Covariation: The amount of change in one variable that is consistently related to the change in another variable of interest
– Scatter diagram: A graphic plot of the relative position of two variables using a horizontal and a vertical axis to represent the values of the respective variables
• A way of visually describing the covariation between two variables
12-5
Positive Relationship between X and Y
12-6
Negative Relationship between X and Y
12-7
Curvilinear Relationship between X and Y
12-8
No Relationship between X and Y
12-9
Correlation Analysis
• Pearson correlation coefficient: Statistical measure of the strength of a linear relationship between two numerical variables
– Varies between –1.00 and +1.00
• 0 represents no linear association between two variables
• –1.00 or +1.00 represents a perfect association between two variables
– A value of –1.00 represents a perfect negative (inverse) association
– A value of +1.00 represents a perfect positive (direct) association
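As a minimal sketch of how the coefficient is computed, the function below divides the covariation of X and Y by the square root of the product of their individual variations. The function name and the rating data are hypothetical and are not taken from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations: the covariation of X and Y.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations: the variation of each variable on its own.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings, for illustration only.
satisfaction = [1, 2, 3, 4, 5]
recommend = [2, 4, 6, 8, 10]

print(pearson_r(satisfaction, recommend))        # perfect positive association
print(pearson_r(satisfaction, recommend[::-1]))  # perfect negative association
```

Reversing one of the two lists flips the sign of the coefficient but not its size, which is the sense in which the sign carries direction and the magnitude carries strength.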
12-10
Rules of Thumb about the Strength of Correlation Coefficients
12-11
Assumptions for Calculating Pearson’s Correlation Coefficient
• The two variables have been measured using interval- or ratio-scaled measures
• Relationship is linear
• Variables come from a normally distributed population
12-12
SPSS Pearson Correlation Example
What is the extent of the relationship between ‘Satisfaction’ and ‘Likelihood of Recommending’?
12-13
Coefficient of Determination
• Coefficient of determination (r²): A number measuring the proportion of variation in one variable accounted for by the variation in another variable
– Can be thought of as a percentage and varies from 0% to 100% (0.0 to 1.0)
– The larger the size of the coefficient of determination:
• The stronger the linear relationship between the two variables being examined
• The greater the proportion of variation in the dependent variable (DV) that is explained by variation in the independent variable (IV)
– How is this the same as / different from the correlation coefficient (r)?
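One way to explore that question is to square a few correlation values. The sketch below uses hypothetical values of r and a hypothetical function name; it shows that r² keeps the strength of the relationship but drops its direction.

```python
def coefficient_of_determination(r):
    """Square a correlation coefficient to get the proportion of shared variation."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r * r

# Hypothetical correlation values, for illustration only.
for r in (0.70, -0.70, 0.30):
    r2 = coefficient_of_determination(r)
    print(f"r = {r:+.2f}   r squared = {r2:.2f}   ({r2:.0%} of variation accounted for)")
```

Correlations of +0.70 and -0.70 give the same r², so the sign has to be read from r itself.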
12-14
Correlating Rank Data
• Spearman rank order correlation coefficient: A statistical measure of the linear association between two variables where both have been measured using rank order scales
– Measures essentially the same thing as the Pearson correlation coefficient, but is calculated on ranks
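A minimal sketch of the calculation, assuming the common approach of converting each variable to ranks (tied values share the average rank) and then applying the Pearson formula to those ranks. The function names and the rankings are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman rank order correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical rankings from five customers, for illustration only.
food_quality_rank = [1, 2, 3, 4, 5]
service_rank = [2, 1, 4, 3, 5]
print(spearman_rho(food_quality_rank, service_rank))
```

Because only the ordering matters, any transformation that preserves order (for example squaring positive scores) leaves the coefficient unchanged.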
12-15
SPSS Spearman Rank Order Correlation
What is the extent of the relationship between ‘Food Quality’ and ‘Service’ rankings by customers of Santa Fe Grill?
12-16
What is Regression Analysis?
• A method for arriving at more mathematically detailed relationships (predictions) than those provided by the correlation coefficient
• Allows numerical predictions of DVs from IVs
• Assumptions
– Variables are measured on interval or ratio scales
– Variables come from a normal population
– Error terms are normally and independently distributed
12-17
Bivariate regression analysis
• Bivariate regression analysis: A statistical technique that analyzes the linear relationship between two variables by estimating coefficients for an equation of a straight line
– One variable is designated as the dependent variable (DV)
– The other is designated the independent or predictor variable (IV)
12-18
Fundamentals of Bivariate Regression
• General formula for a straight line: Y = a + bX + ei
• Where,
– Y = The dependent variable
– a = The intercept (point where the straight line intersects the Y-axis when X = 0)
– b = The slope (the change in Y for every 1 unit change in X)
– X = The independent variable used to predict Y
– ei = The error of the prediction
12-19
The Straight Line Relationship in Regression
12-20
Fitting the Regression Line Using the “Least Squares” Procedure
12-21
Ordinary Least Squares
• A statistical procedure that estimates regression equation coefficients that produce the lowest sum of squared differences between the actual and predicted values of the dependent variable
Regression Coefficient
• Same as “slope coefficient”
• An indicator of the importance of an independent variable in predicting a dependent variable
• Large coefficients indicate good predictors and small coefficients indicate weak predictors (this only applies to bivariate regression!)
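The least squares idea can be sketched for the bivariate case with the standard closed-form solution: the slope is the covariation of X and Y divided by the variation of X, and the intercept makes the line pass through the two means. The function names and data are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b): the least squares intercept and slope for Y = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("X has no variation, so the slope is undefined")
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def sum_squared_errors(x, y, a, b):
    """Sum of squared differences between actual and predicted Y values."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, for illustration only.
reasonable_prices = [1, 2, 3, 4, 5]
satisfaction = [2, 4, 5, 4, 5]

a, b = fit_line(reasonable_prices, satisfaction)
print("intercept:", a, "slope:", b)
print("SSE of fitted line:", sum_squared_errors(reasonable_prices, satisfaction, a, b))
# Any other line, such as one with a slightly steeper slope, has a larger SSE.
print("SSE of a nearby line:", sum_squared_errors(reasonable_prices, satisfaction, a, b + 0.1))
```

Comparing the two SSE values is a quick way to see what "lowest sum of squared differences" means in practice.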
12-22
SPSS Results for Bivariate Regression
What is the mathematical relationship between ‘Satisfaction’ and customers’ perception of ‘Reasonable Prices’ at Santa Fe Grill?
12-23
Multiple Regression Analysis
• A statistical technique that analyzes the linear relationship between a dependent variable and multiple independent variables
– Estimates multiple slope coefficients for the equation of a straight line
– Each IV has a slope coefficient that partially predicts the DV
– Much more complicated than bivariate regression
12-24
Fundamentals of Multiple Regression
• General formula for a straight line:
• Y = a + b1X1 + b2X2 + b3X3 + … + ei
• Where,
– Y = The dependent variable
– a = The intercept (point where the straight line intersects the Y-axis when all X = 0)
– b1 = The slope (the change in Y for every 1 unit change in X1)
– X1 = The first independent variable used to predict Y
– b2 = The slope (the change in Y for every 1 unit change in X2)
– X2 = The second independent variable used to predict Y
– ei = The error of the prediction
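The coefficients in this formula can be estimated by least squares. The sketch below is one possible approach rather than the procedure SPSS uses: it builds the normal equations and solves them with Gaussian elimination, using only the standard library. All function names and data are hypothetical; the Y values were constructed as exactly 1 + 2·X1 + 3·X2 so the recovered coefficients are easy to check.

```python
def solve_linear_system(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be perfectly collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def fit_multiple_regression(X, y):
    """X is a list of rows of predictor values. Returns [a, b1, b2, ...]."""
    rows = [[1.0] + list(r) for r in X]  # leading 1 estimates the intercept a
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical data: each row is (X1, X2); Y was built as 1 + 2*X1 + 3*X2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [9, 8, 19, 18, 26]
a, b1, b2 = fit_multiple_regression(X, y)
print("a =", round(a, 6), " b1 =", round(b1, 6), " b2 =", round(b2, 6))
```

With real survey data the errors ei are not zero, so the estimates will not reproduce Y exactly; the same code still returns the coefficients with the lowest sum of squared errors.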
12-25
Standardized Beta Coefficient
• An estimated regression coefficient recalculated after the variables have been rescaled to have a mean of 0 and a standard deviation of 1
• Enables independent variables with different units of measurement to be directly compared on the strength of their association with the dependent variable
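To illustrate why standardization helps, the sketch below uses the usual conversion for a single predictor: the standardized beta equals the unstandardized slope multiplied by the standard deviation of X and divided by the standard deviation of Y. The function names, the data, and the dollars-versus-cents rescaling are hypothetical.

```python
import statistics

def slope(x, y):
    """Unstandardized least squares slope b for Y = a + bX."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def standardized_beta(x, y):
    """Slope expressed in standard deviation units: b * sd(X) / sd(Y)."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)

# Hypothetical data: the same predictor recorded in dollars and in cents.
price_dollars = [1, 2, 3, 4, 5]
price_cents = [100 * v for v in price_dollars]
satisfaction = [2, 4, 5, 4, 5]

print("b (dollars):", slope(price_dollars, satisfaction))
print("b (cents):  ", slope(price_cents, satisfaction))
print("beta (dollars):", standardized_beta(price_dollars, satisfaction))
print("beta (cents):  ", standardized_beta(price_cents, satisfaction))
```

The unstandardized slope shrinks by a factor of 100 when the unit changes, while the standardized beta stays the same, which is what makes betas comparable across predictors measured in different units.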
12-26
Examining the Omnibus Statistical Significance of the Regression Model
• Model F statistic: The magnitude of the “Model F” is used to determine whether the entire regression is significant
– A significant F statistic indicates that the regression model as a whole is “significant” (i.e. can be trusted!)
– Look for a p-value of the F statistic less than .05
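For reference, the Model F can be written in terms of the multiple r², the number of observations n, and the number of independent variables k. The sketch below uses that standard textbook relationship; the function name and the input values are hypothetical, and the p-value itself still has to come from an F table or from software such as SPSS.

```python
def model_f(r_squared, n, k):
    """F statistic for the whole model: explained variation per predictor
    divided by unexplained variation per residual degree of freedom."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be at least 0 and less than 1")
    if k < 1 or n - k - 1 <= 0:
        raise ValueError("need at least one predictor and n > k + 1")
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))

# Hypothetical model: r squared of 0.50 from 23 respondents and 2 predictors.
print(model_f(0.50, n=23, k=2))
```

The same r² produces a larger F when it is achieved with more observations or fewer predictors.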
12-27
Substantive Significance
• The multiple r² (coefficient of determination) describes the strength of the relationship between all the independent variables as a group and the dependent variable
– The larger the r² measure, the more the behavior of the dependent measure that is explained by the group of independent variables
– 1 - r² = “coefficient of alienation” or the portion of DV variation that remains unexplained
12-28
Examining the Statistical Significance of Each Coefficient
• Each regression coefficient is divided by its standard error to produce a t statistic
• The t-test for each coefficient tests whether that coefficient equals 0; p-values less than .05 are typically regarded as “significant”
– Significantly different from “0”
– If “significant”, we are confident in the coefficient’s (i.e. variable’s) mathematical effect on the DV
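A minimal sketch of the t statistic for the slope in the bivariate case. It assumes the usual standard error formula, the square root of (SSE / (n - 2)) divided by the variation of X, which is not given on the slides. The function name and data are hypothetical; the p-value would then be read from a t distribution with n - 2 degrees of freedom.

```python
import math

def slope_t_statistic(x, y):
    """Return (b, standard error of b, t) for the bivariate regression Y = a + bX."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Unexplained variation left over after fitting the line.
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    standard_error = math.sqrt((sse / (n - 2)) / sxx)
    return b, standard_error, b / standard_error

# Hypothetical data, for illustration only.
b, se, t = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"b = {b:.3f}, SE = {se:.3f}, t = {t:.3f}")
```

A coefficient that is large relative to its standard error gives a large t statistic and therefore a small p-value.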
12-29
Multiple Regression Assumptions
• Linear relationship between DV and IVs
• Homoskedasticity: The pattern of the covariation is constant (the same) around the regression line, whether the values are small, medium, or large
– Heteroskedasticity: The pattern of covariation around the regression line is not constant, and varies in some way when the values change from smaller to larger
• Normal distribution: All variables are normally distributed
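One informal way to look for heteroskedasticity is to fit the line, then compare the average squared residual for the larger X values with that for the smaller X values. This is only a rough illustrative check, not a formal statistical test, and the function names and data below are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b): the least squares intercept and slope for Y = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def residual_spread_ratio(x, y):
    """Mean squared residual in the upper half of X divided by that in the lower half.
    Values near 1 suggest a constant spread; values far from 1 suggest it changes."""
    a, b = fit_line(x, y)
    pairs = sorted(zip(x, y))  # order observations from smallest to largest X
    squared = [(yi - (a + b * xi)) ** 2 for xi, yi in pairs]
    half = len(squared) // 2
    low, high = squared[:half], squared[-half:]
    return (sum(high) / len(high)) / (sum(low) / len(low))

# Hypothetical data: the same trend with constant scatter and with growing scatter.
x = list(range(1, 11))
signs = [-1 if xi % 2 else 1 for xi in x]
y_constant = [2 * xi + 0.5 * s for xi, s in zip(x, signs)]
y_growing = [2 * xi + 0.1 * xi * s for xi, s in zip(x, signs)]

print("constant scatter:", residual_spread_ratio(x, y_constant))
print("growing scatter: ", residual_spread_ratio(x, y_growing))
```

In practice a scatter plot of residuals against predicted values is the more common first look; the ratio simply puts a number on what that plot shows.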
12-30
Example of Heteroskedasticity
12-31
Example of a Normally Distributed Variable
12-32
SPSS Results for Multiple Regression
What is the mathematical relationship between ‘Satisfaction’ and customers’ perception of ‘Fresh Food’, ‘Food Taste’ and ‘Proper Food Temperature’ at Santa Fe Grill? Which is the best regression model?
12-33
Evaluating a Regression Analysis - Summary
• Assess the statistical significance of the overall (omnibus) regression model using the “Model F” statistic and its associated p-value
• Evaluate the regression’s adjusted multiple R-squared (i.e. coefficient of determination)
• Examine the individual regression coefficients and their p-values to see which are statistically significant
– Look for p < .05, but consider “marginal” cases
• Look at values of the standardized beta coefficients to assess relative influence of each predictor (IV) on the dependent variable (DV)
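The adjusted multiple R-squared mentioned in this summary is not defined on the slides. A minimal sketch, assuming the standard adjustment that penalizes R-squared for the number of independent variables k relative to the number of observations n, is shown below with a hypothetical function name and hypothetical inputs.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R-squared for the number of predictors k and observations n."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical model: R-squared of 0.50 from 23 respondents.
print(adjusted_r_squared(0.50, n=23, k=2))  # two predictors
print(adjusted_r_squared(0.50, n=23, k=6))  # six predictors, same R-squared
```

For the same R-squared, a model that needs more predictors receives a lower adjusted value, which is useful when comparing candidate regression models.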
12-34
Multicollinearity
• A situation in which several independent variables are highly correlated with each other
• Can result in difficulty in estimating independent regression coefficients for the correlated variables
• Standard errors of Beta coefficients become unreasonably high
• Beta coefficients will typically not be significant
12-35
Multicollinearity – How to Avoid or Fix it!!
• Eliminate or replace highly correlated IVs
– Compute a correlation matrix
– Typically search for correlations higher than .5 in absolute value
• Factor analysis techniques (for example, “Principal Components Analysis”)
• “Live with it”
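The correlation screening step can be sketched as follows: compute the Pearson correlation for every pair of independent variables and list the pairs whose absolute value exceeds the .5 rule of thumb. The function names and the ratings are hypothetical; only the variable labels echo the Santa Fe Grill example.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def flag_correlated_predictors(predictors, threshold=0.5):
    """predictors maps a variable name to its list of values.
    Returns (name_1, name_2, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_1, values_1), (name_2, values_2) in combinations(predictors.items(), 2):
        r = pearson_r(values_1, values_2)
        if abs(r) > threshold:
            flagged.append((name_1, name_2, r))
    return flagged

# Hypothetical ratings from six customers, for illustration only.
ratings = {
    "fresh_food": [1, 2, 3, 4, 5, 6],
    "food_taste": [2, 1, 4, 3, 6, 5],
    "food_temperature": [3, 6, 1, 4, 2, 5],
}
for name_1, name_2, r in flag_correlated_predictors(ratings):
    print(f"{name_1} and {name_2}: r = {r:.2f}")
```

A flagged pair is a candidate for dropping or replacing one of the two variables, or for combining them, as the list above suggests.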