Education 795 Class Notes Factor Analysis Note set 6.

Today’s agenda

Announcements (ours and yours)

Q/A

Introducing factor analysis

History and general goals

Attempts to represent a set of measured variables in terms of a smaller number of hypothetical constructs

Spearman (1904) invented factor analysis as a way of studying correlations between mental test scores (the g factor)

Modern uses

Data reduction (tends toward EFA)

Identification of latent structure (tends toward CFA)

Practice of Factor Analysis

Differs a bit from other techniques, in that a lot of judgment calls are required

Rules of thumb abound

There are, however, general standards of practice

Iteration is usually required

Courtesy of Pedhazur

Gould (1981) characterized factor analysis as a real pain, “although we think a more apt description is that of a forest in which one can get lost in no time” (p. 590)

GIGO: garbage in, garbage out. The theoretical underpinnings of the latent variables are key.

Correlational Technique

We try to identify dimensions that underlie relations among a set of observed variables

Factor analysis is applied to the correlations among variables

Correlation matrix is a square matrix (equal number of rows and columns)

No matter how many subjects, correlation matrix has as many rows as variables
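
A minimal Python sketch of the last two points, using made-up scores. The data and the `pearson` and `correlation_matrix` helpers are illustrative assumptions, not part of the course materials. Three variables give a 3 x 3 correlation matrix whether six subjects or four are analyzed.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(rows):
    """rows: one list per subject, one entry per variable."""
    columns = list(zip(*rows))  # transpose: one tuple per variable
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

# Hypothetical data: 6 subjects (rows) measured on 3 variables (columns).
scores = [
    [52, 12, 3.1],
    [61, 14, 3.4],
    [45, 10, 2.7],
    [70, 16, 3.9],
    [58, 12, 3.0],
    [66, 15, 3.6],
]

R_six = correlation_matrix(scores)       # all 6 subjects
R_four = correlation_matrix(scores[:4])  # only the first 4 subjects

for row in R_six:
    print([round(value, 2) for value in row])
print(len(R_six), "x", len(R_six[0]), "and", len(R_four), "x", len(R_four[0]))
```

The matrix is also symmetric with 1s on the diagonal, which is why only the entries below (or above) the diagonal need to be inspected.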

Correlation Matrix

It is a good idea to do a thorough examination of the correlation matrix before beginning a FA

You can compute a FA using either raw data OR just the correlation matrix

Appropriate data types include anything for which a correlation can be properly computed (continuous, interval, ordinal, and nominal only if dichotomous)

Bartlett’s Test of Sphericity

A necessary but not sufficient piece of evidence that the correlation matrix is appropriate for FA

Null hypothesis: The correlation matrix is an identity matrix (i.e., 1s on the diagonal and 0s everywhere else)

This test is affected by sample size: with large samples, the null hypothesis is almost always rejected. That is why rejecting it is necessary but not sufficient evidence.
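
The sample-size effect can be sketched in Python. This assumes the commonly cited chi-square approximation for Bartlett's test, chi2 = -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom. The correlation matrix, the two sample sizes, and the helper names are hypothetical.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical matrix: three variables that barely correlate (r = .10).
R = [
    [1.0, 0.1, 0.1],
    [0.1, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

CRITICAL_05_DF3 = 7.815  # chi-square critical value, alpha = .05, df = 3

for n_obs in (50, 5000):
    chi2, df = bartlett_sphericity(R, n_obs)
    decision = "reject" if chi2 > CRITICAL_05_DF3 else "do not reject"
    print(f"n = {n_obs}: chi2 = {chi2:.2f}, df = {df}, {decision} the null")
```

The same weak correlations fail to reject the null with 50 subjects but reject it easily with 5,000, even though the matrix is no better suited to factor analysis in the second case.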

PCA vs FA

Principal Components vs Factor Analysis

There is no agreement in the field of practice as to which one is better or more appropriate

They are distinct techniques with different goals

PCA: data reduction method

Factors are extracted distinctly to explain maximum variance

PCA vs. FA

FA: latent variable methodology. The “unobserved” factors predict the “observed” variables.

Aimed at explaining common variances shared by the observed variables

Difference: PCA extracts total variance (common variance AND unique/error variance), so it tends to inflate the apparent association between variables and factors.

PCA vs. FA

In practice, a PCA will often yield results similar to a FA, but the PCA will have larger loadings (inflated estimated communalities)

General Rule: We want the first 2-3 factors to explain at least 50% of the total variance.
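
A small worked case of the loading inflation, sketched in Python under a deliberately simple assumption: p variables that all correlate r with one another. In that special case both solutions have closed forms. The first principal component has eigenvalue 1 + (p - 1)r, and a one-factor model with equal loadings b must satisfy b * b = r. The values p = 3 and r = .50 are hypothetical.

```python
import math

def pca_first_loading(p, r):
    """Loading of every variable on the first principal component when all
    p variables correlate r with one another (closed form for this case)."""
    first_eigenvalue = 1 + (p - 1) * r
    return math.sqrt(first_eigenvalue / p)

def common_factor_loading(r):
    """Loading under a one-factor model with equal loadings b, where the
    model-implied correlation between any two variables is b * b = r."""
    return math.sqrt(r)

p, r = 3, 0.5  # hypothetical: 3 variables, every correlation equal to .50
pca_loading = pca_first_loading(p, r)
fa_loading = common_factor_loading(r)

print(f"PCA loading: {pca_loading:.3f}, squared: {pca_loading ** 2:.3f}")
print(f"FA loading:  {fa_loading:.3f}, communality: {fa_loading ** 2:.3f}")
print(f"Share of total variance, first component: {(1 + (p - 1) * r) / p:.0%}")
```

Here the PCA loading is about .82 against a factor loading of about .71, because the component also absorbs part of each variable's unique variance.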

In Our World

We will use primarily Principal Axis Factoring (FA).

Once the Extraction Method is decided upon, we rotate the matrix.

Orthogonal: keeps factors independent, perfect for regression

Oblique: allows some dependence between factors, ok if it makes conceptual sense.

The Necessary Steps

1. Identify and gather data appropriate for factor analysis

2. Decide upon extraction approach and selection criteria

   Extraction approach: PCA vs. PAF

   Selection criteria: 1. Eigenvalue ≥ 1; 2. Scree plot

3. Rotate extracted factors after deciding upon rotational approach

   1. Varimax

   2. Oblimin

4. Before naming factors, cycle through steps 2 and 3 until you have achieved a reasonable statistical and conceptual solution

Simple Example (SES)

X1 = Family Income

X2 = Father’s Education

How is the correlation between X1 and X2 best represented?

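[Figure: two path diagrams for X1 and X2. In one, a common factor F points to X1 and X2 with loadings b1 and b2, and unique factors U1 and U2 point to X1 and X2 with loadings d1 and d2. In the other, X1 and X2 are joined directly by their correlation r.]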

The SES Equations

X1 = b1 F + d1 U1

X2 = b2 F + d2 U2
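
Under the usual common-factor reading of this example, each standardized variable is a weighted sum of the shared factor F and its own unique factor (U1 or U2), and the correlation between X1 and X2 is then the product of the two loadings, b1 * b2. The simulation below is a sketch of that idea only: the loadings .8 and .6, the sample size, and the seed are hypothetical, and F, U1, and U2 are assumed to be independent standard normal variables.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(795)  # fixed seed so the run is repeatable

b1, b2 = 0.8, 0.6            # hypothetical loadings on the common factor F
d1 = math.sqrt(1 - b1 ** 2)  # unique loadings chosen so each X has variance 1
d2 = math.sqrt(1 - b2 ** 2)

x1, x2 = [], []
for _ in range(20000):
    f = random.gauss(0, 1)   # common factor (e.g., SES)
    u1 = random.gauss(0, 1)  # unique part of X1 (family income)
    u2 = random.gauss(0, 1)  # unique part of X2 (father's education)
    x1.append(b1 * f + d1 * u1)
    x2.append(b2 * f + d2 * u2)

observed_r = pearson(x1, x2)
implied_r = b1 * b2
print(f"observed r = {observed_r:.3f}, implied b1 * b2 = {implied_r:.2f}")
print(f"communalities: X1 = {b1 ** 2:.2f}, X2 = {b2 ** 2:.2f}")
```

The squared loading b1 * b1 is the communality of X1, the share of its variance attributable to F, and 1 minus that value is its unique variance.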

Factor Extraction

Assumes factors will be uncorrelated

How many factors?

Less than the number of variables being analyzed

Specific theorized number

Amount of variance explained (Eigenvalue)

Scree plot

Different approaches to extracting factors

Principal components

Principal axis factoring
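
The eigenvalue and scree criteria can be illustrated with a short Python sketch. The six eigenvalues are invented for illustration (they sum to 6, as the eigenvalues of a 6-variable correlation matrix must), and the cutoff of 1.0 is the eigenvalue rule mentioned in these notes.

```python
# Hypothetical eigenvalues from a correlation matrix of 6 variables.
eigenvalues = [2.8, 1.4, 0.7, 0.5, 0.4, 0.2]
total_variance = sum(eigenvalues)  # equals the number of variables

# Eigenvalue rule: keep factors whose eigenvalue is at least 1.
retained = [ev for ev in eigenvalues if ev >= 1.0]

# Proportion and cumulative proportion of total variance per factor.
cumulative = []
running = 0.0
for ev in eigenvalues:
    running += ev / total_variance
    cumulative.append(running)

# Text-only scree plot: look for the point where the bars level off.
for index, ev in enumerate(eigenvalues, start=1):
    bar = "#" * round(ev * 10)
    print(f"Factor {index}: {ev:4.1f} {bar}  cumulative {cumulative[index - 1]:.0%}")

print(f"Factors retained by the eigenvalue rule: {len(retained)}")
```

With these made-up values both criteria agree on two factors, which together account for 70% of the total variance. Real data are rarely this tidy, which is where the judgment calls come in.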

Rotating Extracted Factors

Unrotated factor matrix is only one of many possible ones; transformations can clarify meaning without changing the underlying relationship amongst the variables

Rotation is typically necessary to ease interpretation

Desire to approach “simple structure”

Orthogonal or oblique? Varimax or Oblimin?
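
A sketch of why orthogonal rotation changes the picture but not the fit, for two factors only. The unrotated loadings are hypothetical, and the angle is found by a brute-force search over whole degrees using the raw varimax criterion (variance of the squared loadings within each factor, without Kaiser normalization). This is an illustration of the idea, not the algorithm SPSS uses.

```python
import math

def rotate(loadings, degrees):
    """Orthogonally rotate a two-factor loading matrix by the given angle."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c - b * s, a * s + b * c) for a, b in loadings]

def communalities(loadings):
    """Sum of squared loadings for each variable."""
    return [a * a + b * b for a, b in loadings]

def varimax_criterion(loadings):
    """Raw varimax criterion: variance of squared loadings, summed over factors."""
    p = len(loadings)
    total = 0.0
    for j in range(2):
        squared = [row[j] ** 2 for row in loadings]
        mean_sq = sum(squared) / p
        total += sum((value - mean_sq) ** 2 for value in squared) / p
    return total

# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [
    (0.60, 0.50), (0.70, 0.40), (0.65, 0.45),
    (0.60, -0.50), (0.70, -0.40), (0.65, -0.45),
]

# Brute-force search over whole-degree angles for the largest criterion.
best_angle = max(range(0, 90),
                 key=lambda deg: varimax_criterion(rotate(unrotated, deg)))
rotated = rotate(unrotated, best_angle)

print("angle chosen:", best_angle)
for before, after, h2 in zip(unrotated, rotated, communalities(rotated)):
    print(before, "->", tuple(round(v, 2) for v in after), "h2 =", round(h2, 3))
```

After rotation each variable loads mainly on one factor, which is the sense of "simple structure", while every communality and every reproduced correlation is unchanged.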

Interpreting and Naming Rotated Factors

Appropriate after cycling through various solutions and identifying the one that makes both statistical and conceptual sense

Naming should capture the essence of the variables that are most closely associated with each factor

Should take the relative strength of loading into account in naming factors

Let’s Move on to a More Complex Example

Assume there is a latent structure in describing why people go to college.

Theoretically we can make an argument that there are intrinsic and extrinsic reasons students go to college.

Let’s say we decide to use PAF and we decide we want to try both rotations, orthogonal and oblique…

Reasons people go to college

SPSS Syntax

/SORT

PAF

Results

Results: Orthogonal

Results: Oblique

Discussion

Are the two rotations different in any way?

How do we decide which one to use?

Questions?

Next Week

Read Pedhazur Ch 4 p79-80



