Galambos N Analysis Of Survey Results
Analysis of Survey Results
Nora Galambos, PhD
Office of Institutional Research
Stony Brook University
Survey Planning: Questions to Ask
» What hypotheses are being tested?
» What types of analyses are planned to test the hypotheses?
» Look over the instrument and create a map or outline of possible analysis methods
» What is the magnitude of the differences you would like to detect?
Importance of Pilot Testing
» The most obvious reason for pilot testing is to be able to estimate the sample size.
» Find potential sources of bias
» Assists in power calculations
» Discover possible distribution problems prior to surveying the entire sample
Bonferroni Adjustment
» When multiple tests are performed in an experiment, the experiment-wise error rate increases.
» For example, if tests are performed with α = 0.05 and the null hypothesis is true, then about 1 out of every 20 tests will be significant by chance.
» Let’s say we are comparing males and females on 15 independent factors with α = 0.05, where the null hypothesis holds. The chance of at least one being significant is actually 0.54.
» Study error rate = 1 - (1 - α)^n, where n is the number of independent tests; here, 1 - (0.95)^15 ≈ 0.54
» To control this error rate the significance level can be adjusted using α/n
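The arithmetic above can be checked with a short sketch. It assumes the tests are independent, as in the 15-factor example; the function names are illustrative and do not come from the slides.

```python
def study_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni adjustment (alpha / n)."""
    return alpha / n_tests


if __name__ == "__main__":
    # 15 independent comparisons, each tested at alpha = 0.05
    print(round(study_error_rate(0.05, 15), 2))  # 0.54

    # Testing each comparison at alpha / 15 keeps the study-wide rate below 0.05
    adjusted = bonferroni_alpha(0.05, 15)
    print(study_error_rate(adjusted, 15) < 0.05)  # True
```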
Type I and Type II Errors
» A Type I error occurs when a true null hypothesis is rejected. The probability of a Type I error is denoted by α, and is the significance level of the hypothesis test, with 0.05 being a common value for α.
» On the other hand, a Type II error occurs when the null hypothesis is false and it is not rejected. The probability of a Type II error is denoted by β and is often set to 0.20.
Hypothesis Testing Table
                        True Results
Experimental Results    Ho is true                  Ho is false
Reject Ho               α (Type I error rate)       Power = 1 - β
Accept Ho               1 - α (correct decision)    β (Type II error rate)
Power Calculations
» Statistical Power Analysis for the Behavioral Sciences—Jacob Cohen
» The power of a significance test is the probability of rejecting a false null hypothesis, and is equal to 1 - β. If β is 0.20, the power = 0.80.
» A power of 0.80 is generally considered adequate.
» Since sample size and power are related, a small sample size results in less power, or reduced probability of rejecting a false null hypothesis.
Using Sample Size Tables
» Based on initial planning, what differences do you hope or need to detect?
» For example, you may want to find the sample size needed for a t test to evaluate the difference between two means (where the standard deviation is the same in both groups).
» Calculate the effect size: d = (μ1 - μ2) / σ, where μ1 and μ2 are the two group means and σ is the common standard deviation
Power for a two-sided test, α = 0.01

n (for each group)    d = 0.2    d = 0.5    d = 0.8
30                    0.03       0.24       0.66
40                    0.04       0.35       0.82
50                    0.06       0.45       0.91
60                    0.07       0.55       >0.995
80                    0.12       0.82       >0.995
100                   0.29       0.99       >0.995
200                   0.29       >0.995     >0.995
500                   0.72       >0.995     >0.995

d = 0.2, 0.5, 0.8 (small, medium, and large effects)
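When a printed table is not at hand, power for the two-group comparison can be approximated directly. The sketch below is an assumption-laden illustration, not the method behind the table: it uses a normal approximation to the two-sample t test with equal group sizes, so its values can differ from published tables. The function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample test for effect size d."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability of landing in either rejection region
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def approx_n_per_group(d: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Approximate sample size per group needed to reach the requested power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_crit + z_power) / d) ** 2)


if __name__ == "__main__":
    # Small effect, 500 per group, two-sided alpha = 0.01
    print(round(approx_power(0.2, 500, alpha=0.01), 2))  # 0.72
    # Medium effect, power 0.80, two-sided alpha = 0.05
    print(approx_n_per_group(0.5))  # 63
```

Because the approximation ignores the extra uncertainty from estimating the standard deviation, an exact t-based calculation typically asks for one or two more subjects per group.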
Types of Missing Data
» Missing Completely at Random (MCAR)
˃ Given two variables X and Y, the missingness is unrelated to either. The missing values in X are independent of Y and vice versa.
˃ If the data are MCAR, then listwise deletion is appropriate
» Missing at Random (MAR)
˃ Given two variables X and Y, the missingness is related to or dependent upon X, but not Y. Suppose X = age and Y = income, and income is more often missing in certain age groups. If, within each age group, no income group is missing more often than any other, then the data are MAR.
» Nonignorable
˃ Given two variables X and Y, the missingness is related to X, but may also be related to Y. In our age-income example, certain income groups within an age group may be less likely to respond.
Evaluating Missing Data
» Select items with a missing percentage greater than 1% or 2%.
» Recode them into binary variables, where 1 = missing and 0 = non-missing.
» Analyze these variables by the demographic variables using t-tests or chi-square, as appropriate.
» Significant results indicate that missingness is associated with one or more of the demographic variables.
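The steps above can be sketched for one survey item and one two-category demographic variable. This is a minimal illustration under stated assumptions: missing responses are stored as None, the demographic has exactly two groups, and the test is a Pearson chi-square on the 2x2 table without a continuity correction. The function names and the data are made up.

```python
import math


def missing_indicator(values):
    """Recode a survey item: 1 = missing (None), 0 = non-missing."""
    return [1 if v is None else 0 for v in values]


def chi_square_2x2(indicator, groups):
    """Pearson chi-square statistic and p-value (1 df) for a 2x2 table."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two groups")
    pairs = list(zip(indicator, groups))
    a = sum(1 for m, g in pairs if g == labels[0] and m == 1)
    b = sum(1 for m, g in pairs if g == labels[0] and m == 0)
    c = sum(1 for m, g in pairs if g == labels[1] and m == 1)
    d = sum(1 for m, g in pairs if g == labels[1] and m == 0)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical data: income is missing for 10 of 20 men and 2 of 20 women
    income = [None] * 10 + [50] * 10 + [None] * 2 + [50] * 18
    gender = ["M"] * 20 + ["F"] * 20
    chi2, p = chi_square_2x2(missing_indicator(income), gender)
    print(round(chi2, 2), p < 0.05)  # 7.62 True
```

A significant result here would suggest that missingness on the income item is associated with gender, so the item is unlikely to be missing completely at random.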
Data Reduction Methods
» Used to uncover relationship patterns among a group of variables with the goal of reducing the variables to a smaller group
» Two types of data reduction methods: confirmatory and exploratory
» Exploratory factor analysis does not assume any particular structure prior to the analysis and is used to “explore” relationships between variables
» Confirmatory factor analysis is used to test hypotheses regarding the underlying structure of a group of variables
» Traditional factor analysis and principal components analysis are exploratory data reduction methods
Principal Components Analysis
» Principal components analysis is a method often used for reducing the number of variables
» Principal components analysis is part of the factor analysis procedures in SAS and SPSS
» Although factor analysis (FA) and principal components analysis (PCA) have mathematical differences, the results are often similar
» Many authors loosely use the term “factor analysis” to refer to data reduction methods, in general
Principal Components Analysis
» Finds groups of variables that are correlated with each other, possibly measuring the same construct.
» Reduces the variables in the data to a smaller number of items that account for most of the variance of all of the variables in the data
» The first component accounts for the greatest amount of variance. The second one accounts for the greatest amount not accounted for by the first component and is uncorrelated with the first component.
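To make the idea of successive components concrete, the sketch below extracts the first two components from a small correlation matrix. It is an illustration only: statistical packages use full eigen-decomposition routines, whereas this version substitutes power iteration with deflation so that it needs nothing beyond the standard library. The matrix and all function names are hypothetical.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def top_component(matrix, iterations=1000):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    n = len(matrix)
    vector = [float(i + 1) for i in range(n)]  # arbitrary non-symmetric start
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


def deflate(matrix, eigenvalue, vector):
    """Remove the variance already explained by one component."""
    n = len(matrix)
    return [[matrix[i][j] - eigenvalue * vector[i] * vector[j] for j in range(n)]
            for i in range(n)]


def principal_components(corr, n_components):
    """Return (eigenvalue, eigenvector) pairs in decreasing order of variance."""
    results = []
    current = [row[:] for row in corr]
    for _ in range(n_components):
        eigenvalue, vector = top_component(current)
        results.append((eigenvalue, vector))
        current = deflate(current, eigenvalue, vector)
    return results


if __name__ == "__main__":
    # Hypothetical correlations: items 1 and 2 are strongly related, item 3 is not
    corr = [[1.0, 0.8, 0.1],
            [0.8, 1.0, 0.1],
            [0.1, 0.1, 1.0]]
    for eigenvalue, _ in principal_components(corr, 2):
        # Each eigenvalue divided by the number of variables is the share of variance
        print(round(eigenvalue, 3), round(100 * eigenvalue / len(corr), 1))
```

In this example the first component is dominated by the two strongly correlated items, and the second is orthogonal to it, which is the sense in which the components are uncorrelated.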
Necessary Assumptions
» Suggested sample size: at least 100 subjects and 10 observations per variable
» A correlation analysis of the variables should result in most correlations greater than 0.3
» Bartlett’s test of sphericity is significant (p < 0.05)
» Kaiser-Meyer-Olkin (KMO) test of sampling adequacy ≥ 0.6
» Determinant of the correlation matrix > 0.00001, which indicates that multicollinearity is not a problem
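Two of these checks can be computed by hand from the correlation matrix. The sketch below is a rough illustration: it computes the determinant by Gaussian elimination and the usual form of Bartlett's test statistic, -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom, where n is the number of subjects and p the number of variables. It does not compute a p-value or the KMO measure. The matrix and function names are hypothetical.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def bartlett_sphericity(corr, n_subjects):
    """Bartlett's chi-square statistic and its degrees of freedom."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix must have a positive determinant")
    statistic = -((n_subjects - 1) - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return statistic, df


if __name__ == "__main__":
    corr = [[1.0, 0.8, 0.1],
            [0.8, 1.0, 0.1],
            [0.1, 0.1, 1.0]]
    print(determinant(corr) > 0.00001)  # True
    statistic, df = bartlett_sphericity(corr, n_subjects=100)
    print(round(statistic, 1), df)
```

The statistic is compared against a chi-square critical value for the given degrees of freedom (7.815 for 3 degrees of freedom at the 0.05 level); a larger value means the test is significant.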
Obtaining a PCA
» In SPSS select principal components under “extraction method”
» Select varimax rotation.
˃ A rotation uses a transformation to aid in the interpretation of the factor solution
˃ A varimax rotation is orthogonal, so the components remain uncorrelated; it maximizes the variance of the squared loadings within each column of the loading matrix
Evaluating PCA Results
» Kaiser criterion: choose components with eigenvalues greater than one.
» Scree plot: plot of eigenvalues
˃ Retain the eigenvalues before the leveling off point of the plot.
» The proportion of variance accounted for by each retained factor (or component) should be at least 5% to 10%
» Cumulative variance accounted for should be 70% to 80%
Abbreviated Table of Variance Explained
Total Variance Explained

           | Initial Eigenvalues                  | Extraction Sums of Squared Loadings  | Rotation Sums of Squared Loadings
Component  | Total  % of Variance  Cumulative %   | Total  % of Variance  Cumulative %   | Total  % of Variance  Cumulative %
1          | 14.26  47.53          47.53          | 14.26  47.53          47.53          | 7.22   24.06          24.06
2          | 2.55   8.49           56.02          | 2.55   8.49           56.02          | 5.79   19.31          43.37
3          | 1.37   4.56           60.58          | 1.37   4.56           60.58          | 4.41   14.70          58.07
4          | 1.09   3.64           64.22          | 1.09   3.64           64.22          | 1.84   6.15           64.22
5          | 0.98   3.26           67.48          |
6          | 0.86   2.86           70.33          |
7          | 0.80   2.67           73.00          |
8          | 0.75   2.51           75.51          |
9          | 0.68   2.25           77.76          |
10         | 0.62   2.06           79.82          |
11         | 0.58   1.93           81.75          |
12         | 0.56   1.88           83.63          |
13         | 0.49   1.64           85.27          |
14         | 0.48   1.59           86.85          |
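The retention rules can be applied to the initial eigenvalues in the table with a few lines of code. This sketch assumes 30 variables in the analysis, a figure inferred from the first row (14.26 / 0.4753 ≈ 30) and not stated in the slides; the function name is illustrative.

```python
# Initial eigenvalues for the 14 components listed in the table
EIGENVALUES = [14.26, 2.55, 1.37, 1.09, 0.98, 0.86, 0.80,
               0.75, 0.68, 0.62, 0.58, 0.56, 0.49, 0.48]
N_VARIABLES = 30  # assumption: inferred from 14.26 being 47.53% of the total


def kaiser_summary(eigenvalues, n_variables):
    """Rows of (component, eigenvalue, % of variance, cumulative %) for
    components that meet the Kaiser criterion (eigenvalue > 1)."""
    rows = []
    cumulative = 0.0
    for number, eigenvalue in enumerate(eigenvalues, start=1):
        if eigenvalue <= 1.0:
            break  # eigenvalues are sorted, so the rest are smaller still
        percent = 100.0 * eigenvalue / n_variables
        cumulative += percent
        rows.append((number, eigenvalue, percent, cumulative))
    return rows


if __name__ == "__main__":
    for number, eigenvalue, percent, cumulative in kaiser_summary(EIGENVALUES, N_VARIABLES):
        print(number, eigenvalue, round(percent, 1), round(cumulative, 1))
```

Four components pass the Kaiser criterion. Components 3 and 4 each explain less than 5% of the variance before rotation, and the cumulative figure of about 64% falls short of the 70% to 80% guideline, so the criteria do not all agree for this data set.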
Scree Plot
More about PCA Results
» There should be at least three items with significant loadings on each component
» Check the conceptualization of the component items
» With an orthogonal rotation, each factor loading equals the correlation between the variable and the component
» A communality is the proportion of variance in a variable that is accounted for by the retained components or factors. A variable’s communality is large if the variable loads heavily on at least one component.
Obtaining Scores
» Factor score
˃ Save the regression scores as variables
˃ Standardize the survey responses
˃ For each subject’s response, multiply the standardized survey response by the corresponding regression weights, then add the results
» Factor-based score
˃ Average the responses of the items in the component
˃ Check for reverse codings and missing data.
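Both kinds of score can be sketched in a few lines. The example assumes a 1-to-5 response scale, missing responses stored as None, and regression weights supplied by the statistical package; the item names, weights, and function names are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert one item's responses to z-scores (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def factor_score(standardized_responses, regression_weights):
    """Weighted sum of one subject's standardized responses."""
    return sum(z * w for z, w in zip(standardized_responses, regression_weights))


def factor_based_score(responses, items, reverse_coded=(), scale_min=1, scale_max=5):
    """Average of the answered items in a component, after reverse coding."""
    values = []
    for item in items:
        value = responses.get(item)
        if value is None:
            continue  # skip missing responses; other missing-data rules are possible
        if item in reverse_coded:
            value = scale_min + scale_max - value  # e.g. 2 becomes 4 on a 1-5 scale
        values.append(value)
    if not values:
        return None  # no usable responses for this subject
    return sum(values) / len(values)


if __name__ == "__main__":
    subject = {"q1": 4, "q2": 2, "q3": None}
    print(factor_based_score(subject, ["q1", "q2", "q3"], reverse_coded={"q2"}))  # 4.0
    print(standardize([1, 2, 3]))  # [-1.0, 0.0, 1.0]
```

Averaging only the answered items is one possible choice; a study may instead require a minimum number of answered items before a factor-based score is reported.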
Cronbach’s Alpha
» Cronbach’s Alpha is used to measure the reliability or the internal consistency of the factors or components.
» The variables in a scale are all entered into the calculation to obtain the alpha score.
» A Cronbach’s alpha > 0.7 is considered sufficient for demonstrating internal consistency in most social science research, while values > 0.6 are marginally acceptable
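For reference, alpha can be computed from the item variances and the variance of the summed scale: alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score), where k is the number of items. The sketch below assumes complete data with one row per respondent and reverse-coded items already recoded; the data and function name are made up.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical responses from five subjects to a three-item scale
    responses = [
        [2, 3, 3],
        [4, 4, 5],
        [3, 3, 4],
        [5, 5, 5],
        [1, 2, 2],
    ]
    print(round(cronbach_alpha(responses), 3))  # 0.975
```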