Education 795 Class Notes Factor Analysis Note set 6.
Education 795 Class Notes
Factor Analysis
Note set 6
Today’s agenda
Announcements (ours and yours)
Q/A
Introducing factor analysis
History and general goals
Attempts to represent a set of measured variables in terms of a smaller number of hypothetical constructs
Spearman (1904) invented factor analysis as a way of studying correlations between mental test scores – the G-factor
Modern uses
Data reduction (tends toward EFA)
Identification of latent structure (tends toward CFA)
Practice of Factor Analysis
Differs a bit from other techniques, in that a lot of judgment calls are required
Rules of thumb abound
There are, however, general standards of practice
Iteration is usually required
Courtesy of Pedhazur
Gould (1981) characterized factor analysis as a real pain, “although we think a more apt description is that of a forest in which one can get lost in no time” (p. 590)
GIGO: garbage in, garbage out. The theoretical underpinnings of the latent variables are key.
Correlational Technique
We try to identify dimensions that underlie relations among a set of observed variables
Factor analysis is applied to the correlations among variables
Correlation matrix is a square matrix (equal number of rows and columns)
No matter how many subjects, correlation matrix has as many rows as variables
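A minimal sketch of the point above, assuming NumPy and made-up random scores (the helper name `correlation_matrix` and the sizes 200 and 5 are hypothetical, not from the notes). It shows that the correlation matrix has one row and one column per variable, however many subjects there are.

```python
import numpy as np

def correlation_matrix(scores):
    """Correlation matrix for a subjects-by-variables array of scores."""
    # rowvar=False: each column is a variable, each row is a subject
    return np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)

rng = np.random.default_rng(0)           # fixed seed so the sketch is repeatable
scores = rng.normal(size=(200, 5))       # hypothetical: 200 subjects, 5 variables
R = correlation_matrix(scores)

print(R.shape)                           # rows and columns both equal the number of variables
```

Doubling the number of subjects changes the entries of `R` but not its shape.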
Correlation Matrix
It is a good idea to do a thorough examination of the correlation matrix before beginning a FA
You can compute a FA using either raw data OR just the correlation matrix
Appropriate data types include anything for which a correlation can be properly computed (continuous, interval, ordinal; nominal only if dichotomous)
Bartlett’s Test of Sphericity
A necessary but not sufficient piece of evidence that the correlation matrix is appropriate for FA
Null hypothesis: The correlation matrix is an identity matrix (i.e., 1s on the diagonal and 0s everywhere else)
This test is affected by sample size: with large samples, the null hypothesis is almost always rejected. That is why this is necessary but not sufficient evidence.
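A minimal sketch of the test statistic, assuming the usual chi-square approximation, chi-square = -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom. The function name `bartlett_sphericity` and the example values are hypothetical; in practice SPSS reports this test for you.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Return (chi_square, df) for Bartlett's test of sphericity.

    R is a p x p correlation matrix; n is the number of subjects.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, log_det = np.linalg.slogdet(R)            # ln|R|, computed stably
    chi_square = -((n - 1) - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return chi_square, df

# Hypothetical example: the same weak correlation (r = .10), two sample sizes
R_weak = [[1.0, 0.1],
          [0.1, 1.0]]
chi_small, df = bartlett_sphericity(R_weak, n=50)
chi_large, _ = bartlett_sphericity(R_weak, n=1000)
print(df, chi_small, chi_large)
```

With one degree of freedom the .05 critical value is about 3.84, so the same weak correlation fails to reject the null at n = 50 but rejects it at n = 1000. That is the sample-size sensitivity the slide warns about.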
PCA vs FA
Principal Components vs Factor Analysis
There is no agreement in the field of practice as to which one is better or more appropriate
They are distinct techniques with different goals
PCA: data reduction method
Factors are extracted distinctly to explain maximum variance
PCA vs. FA
FA—latent variable methodology. The “unobserved” factors predict the “observed” variables.
Aimed at explaining common variances shared by the observed variables
Difference: PCA analyzes total variance (common variance AND unique/error variance), so it tends to inflate the apparent association between variables and factors.
PCA vs. FA
In reality, a PCA will yield similar results to a FA but PCA will have larger factor loadings (inflated estimated communalities)
General Rule: We want the first 2-3 factors to explain at least 50% of the total variance.
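A minimal sketch of the loading inflation described above, assuming NumPy and a made-up correlation matrix in which three variables all correlate .50. The function names are hypothetical, and the iterated principal-axis routine is a bare-bones one-factor illustration, not SPSS's implementation.

```python
import numpy as np

def first_pca_loadings(R):
    """Loadings on the first principal component (1s left on the diagonal)."""
    eigvals, eigvecs = np.linalg.eigh(np.asarray(R, dtype=float))  # ascending order
    v = eigvecs[:, -1]
    v = v * np.sign(v.sum())                  # make the loadings positive
    return v * np.sqrt(eigvals[-1])

def first_paf_loadings(R, n_iter=200):
    """One-factor principal axis factoring with iterated communalities."""
    R = np.asarray(R, dtype=float)
    # starting communalities: squared multiple correlations
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)         # communalities replace the 1s
        eigvals, eigvecs = np.linalg.eigh(reduced)
        v = eigvecs[:, -1]
        v = v * np.sign(v.sum())
        loadings = v * np.sqrt(max(eigvals[-1], 0.0))
        h2 = loadings ** 2                    # update communalities and repeat
    return loadings

R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
pca = first_pca_loadings(R)
paf = first_paf_loadings(R)
print(pca, paf)
print(np.sum(pca ** 2) / R.shape[0])          # share of total variance for component 1
```

For this matrix the PCA loadings come out near .82 and the PAF loadings near .71, and the first component accounts for two thirds of the total variance.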
In Our World
We will use primarily Principal Axis Factoring (FA).
Once the Extraction Method is decided upon, we rotate the matrix.
Orthogonal: keeps factors independent, perfect for regression
Oblique: allows some dependence between factors, ok if it makes conceptual sense.
The Necessary Steps
1. Identify and gather data appropriate for factor analysis
2. Decide upon extraction approach and selection criteria
   Extraction approach: PCA vs. PAF
   Selection criteria: (1) Eigenvalue >= 1, (2) Scree plot
3. Rotate extracted factors after deciding upon rotational approach
   1. Varimax
   2. Oblimin
4. Before naming factors, cycle through steps 2 and 3 until you have achieved a reasonable statistical and conceptual solution
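The selection criteria in step 2 can be sketched as follows, assuming NumPy and a made-up correlation matrix with two clusters of variables (the matrix and function names are hypothetical). The sorted eigenvalues are the values a scree plot would display, and the count applies the eigenvalue >= 1 rule.

```python
import numpy as np

def eigenvalues_descending(R):
    """Eigenvalues of a correlation matrix, largest first (scree-plot order)."""
    return np.sort(np.linalg.eigvalsh(np.asarray(R, dtype=float)))[::-1]

def kaiser_count(R):
    """Number of factors retained under the eigenvalue >= 1 rule."""
    return int(np.sum(eigenvalues_descending(R) >= 1.0))

# Hypothetical data: variables 1-2 correlate .60, variables 3-4 correlate .60,
# and the two pairs are unrelated to each other
R = np.array([[1.0, 0.6, 0.0, 0.0],
              [0.6, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.6],
              [0.0, 0.0, 0.6, 1.0]])

eigvals = eigenvalues_descending(R)
cumulative = np.cumsum(eigvals) / R.shape[0]   # cumulative share of total variance
print(eigvals)
print(cumulative)
print(kaiser_count(R))
```

Here two eigenvalues exceed 1 and together account for 80% of the total variance, so both criteria point to a two-factor solution. With real data the criteria often disagree, which is why the steps are cycled.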
Simple Example (SES)
X1 = Family Income
X2 = Father’s Education
How is the correlation between X1 and X2 best represented?
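[Path diagrams: (a) X1 and X2 joined by a correlation r; (b) a common factor F pointing to X1 and X2 with coefficients b1 and b2, and unique factors U1 and U2 pointing to X1 and X2 with coefficients d1 and d2]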
The SES Equations
X1 =
X2 =
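The slide leaves the right-hand sides blank, presumably to be filled in during class. Reading them off the diagram labels (F, b1, b2, U1, U2, d1, d2), and assuming standardized variables with F, U1 and U2 mutually uncorrelated, a plausible completion is:

```latex
\begin{aligned}
X_1 &= b_1 F + d_1 U_1 \\
X_2 &= b_2 F + d_2 U_2 \\
r_{X_1 X_2} &= b_1 b_2
\end{aligned}
```

Under those assumptions the correlation between family income and father's education is reproduced entirely by the two loadings on the common factor.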
Factor Extraction
Assumes factors will be uncorrelated
How many factors?
Less than the number of variables being analyzed
Specific theorized number
Amount of variance explained (Eigenvalue)
Scree plot
Different approaches to extracting factors
Principal components
Principal axis factoring
Rotating Extracted Factors
Unrotated factor matrix is only one of many possible ones; transformations can clarify meaning without changing the underlying relationship amongst the variables
Rotation is typically necessary to ease interpretation
Desire to approach “simple structure”
Orthogonal or oblique? Varimax or Oblimin?
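A minimal sketch of an orthogonal (varimax) rotation, assuming NumPy and a made-up loading matrix. This is the common SVD-based iteration without Kaiser normalization, not necessarily what SPSS runs, and the name `varimax` and the example loadings are hypothetical. It illustrates that rotation changes the loadings but not the communalities.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a variables-by-factors loading matrix toward simple structure.

    Returns (rotated_loadings, rotation_matrix).
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    T = np.eye(k)                              # start from no rotation
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ T
        col_ss = np.sum(rotated ** 2, axis=0)  # sum of squared loadings per factor
        gradient = L.T @ (rotated ** 3 - rotated @ np.diag(col_ss) / p)
        u, s, vt = np.linalg.svd(gradient)
        T = u @ vt                             # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                              # no meaningful improvement
        criterion = new_criterion
    return L @ T, T

# Hypothetical unrotated solution: a clean two-factor pattern turned by 30 degrees
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.5]])
angle = np.deg2rad(30)
turn = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ turn

rotated, T = varimax(unrotated)
print(np.round(rotated, 3))
print(np.sum(unrotated ** 2, axis=1))          # communalities before rotation
print(np.sum(rotated ** 2, axis=1))            # communalities after rotation
```

An oblique rotation such as Oblimin relaxes the requirement that `T` be orthogonal, so the rotated factors are allowed to correlate.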
Interpreting and Naming Rotated Factors
Appropriate after cycling through various solutions and identifying the one that makes both statistical and conceptual sense
Naming should capture the essence of the variables that are most closely associated with each factor
Should take the relative strength of loading into account in naming factors
Let’s Move on to a More Complex Example
Assume there is a latent structure in describing why people go to college.
Theoretically we can make an argument that there are intrinsic and extrinsic reasons students go to college.
Let’s say we decide to use PAF and we decide we want to try both rotations, orthogonal and oblique…
Reasons people go to college
SPSS Syntax
/SORT
PAF
Results
Results: Orthogonal
Results: Oblique
Discussion
Are the two rotations different in any way?
How do we decide which one to use?
Questions?
Next Week
Read Pedhazur Ch 4 p79-80
Read Pedhazur Ch 5 p81-83
Read Pedhazur Ch 22 p607-627
Read Pedhazur Ch 23 p631-632