An Introduction to Factor Analysis


Page 1: An Introduction to Factor Analysis

An Introduction to Factor Analysis

Reducing variables and/or detecting underlying structures

Page 2: An Introduction to Factor Analysis

Books you’ll never see . . .

Page 3: An Introduction to Factor Analysis

Uses

• Data reduction

[Diagram: 24 actual variables reduced to two latent variables, Factor 1 and Factor 2]

Page 4: An Introduction to Factor Analysis

Uses

• Create composites/scales for psychometric instruments

[Diagram: two scales, Depression and Anxiety]

Page 5: An Introduction to Factor Analysis

Uses

• Validate composites/scales for psychometric instruments

[Diagram: two scales, Depression and Anxiety]

Page 6: An Introduction to Factor Analysis

Summary of uses

• Factor analytic techniques are most commonly used to reduce many items into a more usable number of factors. This way, the simplified data can be used more easily in research.

• Also used in the development or exploration of questionnaires or other psychometric instruments.

Page 7: An Introduction to Factor Analysis

Latent variables

A metaphor

Page 8: An Introduction to Factor Analysis

An example of common variance using bivariate relationships

• I measure a sample of kindergarten children’s ability to recognize the sound(s) at the beginning of words, e.g., /k/ in “cat”

• I also measure the children’s ability to segment (break apart) sounds, e.g., “cat” = /k/ /a/ /t/

• I correlate these two measures
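To make the idea of common variance concrete, here is a minimal Python sketch. The scores are simulated (they are not data from this presentation) and the variable names are hypothetical. Squaring the correlation gives the proportion of variance the two measures share.

```python
import numpy as np

# Simulated scores for 200 hypothetical kindergartners (not real data).
rng = np.random.default_rng(0)
shared_skill = rng.normal(size=200)                      # unobserved common ability
beginning_sounds = shared_skill + rng.normal(size=200)   # measure 1 = skill + noise
segmentation = shared_skill + rng.normal(size=200)       # measure 2 = skill + noise

# Pearson correlation between the two observed measures.
r = np.corrcoef(beginning_sounds, segmentation)[0, 1]

# r squared is the proportion of variance the two measures have in common.
shared_variance = r ** 2
print(f"r = {r:.2f}, shared variance = {shared_variance:.2f}")
```

In this simulation both measures depend on the same unobserved skill, which is why they correlate; that shared part is what a factor is meant to capture.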

Page 9: An Introduction to Factor Analysis

[Figure: Beginning letter sounds and Phoneme Segmentation]

Page 10: An Introduction to Factor Analysis

Not useful when

• A vast array of variables with no theoretical association is forced into the analysis just to see what turns up

• The variables have inadequate reliability. This lack of stability of measurement affects the meaningfulness of the derived factors.

Page 11: An Introduction to Factor Analysis

Approaches to Factor Analytic Techniques

Exploratory

• Mathematically driven technique

• Seeks to identify the underlying structure of a set of items or variables

• Use of scholarly intuition to figure out what the factors mean

Confirmatory

• Starts with a theory of what you expect to confirm (a priori)

• Do the items load as you expected on the factors that you predicted?

• Much more involved Structural Equation Modeling approach (test of model fit)

Page 12: An Introduction to Factor Analysis

Methodological Considerations

1. Selection of variables

2. Size of sample

3. Reliability of measures

4. Appropriateness of using Factor Analytic techniques (given the goal of the research)

5. Choice of method (how to extract the factors)

6. How many factors to retain

7. Methods of rotation (to ease interpretability)

Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.

Page 13: An Introduction to Factor Analysis

Methodological Considerations

1. Selection of variables

Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.

Page 14: An Introduction to Factor Analysis

Assumptions and Requirements of Factor Analytic Techniques

• More than one variable involved

• Sample acquired through random selection

• Robust bivariate relationships among variables

• Variables are measured using either interval or ratio (or ordinal, i.e., quasi-interval?) level data

• Data approximate a normal distribution (multivariate normality is also nice)

• Relationships among variables are linear

• Variables are measured reliably

• No multicollinearity (e.g., bivariate r above 0.90)

• Few missing observations

• “Large” number of observations
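The multicollinearity check can be sketched in a few lines of Python. This is only an illustration: the function name, the example matrix, and the choice to test absolute values are assumptions; the 0.90 cut-off is the one given in the list above.

```python
import numpy as np

def flag_multicollinear_pairs(corr, threshold=0.90):
    """Return (i, j, r) for every pair of variables with |r| above threshold."""
    corr = np.asarray(corr, dtype=float)
    flagged = []
    n = corr.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) > threshold:
                flagged.append((i, j, float(corr[i, j])))
    return flagged

# Hypothetical 3 x 3 correlation matrix: variables 0 and 1 are nearly redundant.
example_corr = [
    [1.00, 0.95, 0.40],
    [0.95, 1.00, 0.35],
    [0.40, 0.35, 1.00],
]
print(flag_multicollinear_pairs(example_corr))  # [(0, 1, 0.95)]
```

A flagged pair suggests dropping or combining one of the two variables before factoring.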

Page 15: An Introduction to Factor Analysis

Methodological Considerations

1. Selection of variables

2. Size of sample

3. Reliability of measures

4. Appropriateness of using Factor Analytic techniques (given the goal of the research)

5. Choice of method (how to extract the factors)

6. How many factors to retain

7. Methods of rotation (to ease interpretability)

Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.

Page 16: An Introduction to Factor Analysis

Size of sample

What is a reasonable sample size? How many observations do you need?

• Old school: Ten observations per planned extracted factor (with a minimum of 100 recommended)

• “More is better” rule. Similar reasoning as other parametric statistical techniques, but less can be okay under some circumstances.

• More recently, it has become more widely recognized that smaller samples can be reasonably factor analyzed, but this is still hotly debated.

Page 17: An Introduction to Factor Analysis

Methodological Considerations

1. Selection of variables

2. Size of sample

3. Reliability of measures

4. Appropriateness of using Factor Analytic techniques (given the goal of the research)

5. Choice of method (how to extract the factors)

6. How many factors to retain

7. Methods of rotation (to ease interpretability)

Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.

Page 18: An Introduction to Factor Analysis

Reliability of measures

• Factor analysis is a correlational technique (multiple regression)

• Low reliabilities attenuate correlations

• Low reliabilities introduce “noise” and obscure “signal” for the factors you are trying to detect and extract
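The attenuation point can be illustrated with the classical (Spearman) attenuation formula, in which the observed correlation equals the true correlation times the square root of the product of the two reliabilities. The sketch below is illustrative only; the function name and the numbers are hypothetical.

```python
import math

def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Expected observed correlation given the true correlation and two reliabilities."""
    return true_r * math.sqrt(reliability_x * reliability_y)

# Hypothetical values: a true correlation of .60 between two constructs.
print(round(attenuated_correlation(0.60, 0.90, 0.90), 2))  # 0.54
print(round(attenuated_correlation(0.60, 0.50, 0.50), 2))  # 0.3
```

With reliabilities of .50, half of the correlation is lost before any factor is extracted, which is the "noise obscuring signal" problem described above.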

Page 19: An Introduction to Factor Analysis

Researcher as Quality Control

Page 20: An Introduction to Factor Analysis

Methodological Considerations

1. Selection of variables

2. Size of sample

3. Reliability of measures

4. Appropriateness of using Factor Analytic techniques (given the goal of the research)

5. Choice of method (how to extract the factors)

6. How many factors to retain

7. Methods of rotation (to ease interpretability)

Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.

Page 21: An Introduction to Factor Analysis

Appropriateness of Factor Analysis

• Test development and instrument validation

  – Create composites/sub-scales for psychometric instruments

  – Detect underlying structures within

• Construct validity

• Evaluation of a theory

• Data reduction

  – Reduce multiple variables to a smaller group, while maintaining the diversity of information offered.

  – Demonstrate that multiple instruments test the same thing

  – Demonstrate that items load on one factor, or no factors, or multiple factors

Page 22: An Introduction to Factor Analysis

Methodological Considerations

1. Selection of variables

2. Size of sample

3. Reliability of measures

4. Appropriateness of using Factor Analytic techniques (given the goal of the research)

5. Choice of method (how to extract the factors)

6. How many factors to retain

7. Methods of rotation (to ease interpretability)

Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.

Page 23: An Introduction to Factor Analysis

Partitioning Variance

1. Variance common to other variables

2. Variance specific to that variable

3. Random measurement error

Page 24: An Introduction to Factor Analysis

Most common methods of extracting factors?

Common Factor Analysis (CFA)

Assumption: The factors explain the correlations among the variables (variance in common)

Finds common variance among many items, groups it, and then it must be appropriately labeled

Goal: To find the fewest number of factors that account for the relationships among variables

Kahn 2006

[Diagram: three overlapping items, each with its own unique variance (item); the overlap is the common variance. CFA considers this common variance.]

DeCoster (1998) Overview of Factor Analysis

Page 25: An Introduction to Factor Analysis

Principal Components Analysis (PCA)

Assumption: Components explain the variance in common among the variables and the amount of unique variance (item & error) present

Goal: To find the fewest components that account for the relationships among variables

[Diagram: three overlapping items, each with its own unique variance (item+error)]

Page 26: An Introduction to Factor Analysis

Comparisons

Common Factor Analysis

• Seeks the factors that account for the common variance among the variables

• Used for Exploratory Factor Analysis (EFA) or Confirmatory Factor Analysis (CFA)

• Easier to generalize to other samples/populations since the unique and error variance of items isn’t considered

• Most often used to detect underlying structures among variables.

Principal Components Analysis

• Seeks factors that account for all of the common and other variance among the variables

• Harder to generalize since other sources of variance (that are item specific and not shared) are included in the model

• Most often used for data reduction to use in research

Page 27: An Introduction to Factor Analysis

Factor Analytic Techniques

Exploratory questions:

• What factors exist among the variables?

• To what degree are the variables (items) related to the factors that were extracted? (FACTOR LOADINGS)

[Diagram: two latent (unobserved) variables, Factor 1 and Factor 2, linked to ten observed variables. Items 1, 4, 5, 8, and 7 are drawn with Factor 1; Items 2, 3, 6, 10, and 9 are drawn with Factor 2. Each item also has its own unique variance.]

Kahn 2006

Page 28: An Introduction to Factor Analysis

Common Factor Analysis

• CFA takes into account shared (common) and item specific variance and uses the squared multiple correlation (R squared) as the measure of communality.

• Communality is the variance in one variable that is shared with the other variables.

• The factors extracted by CFA, therefore, explain the shared variance common to more than one variable.

Page 29: An Introduction to Factor Analysis

Common Factor Analysis

1. Variance common to other variables

Multicultural Counseling Inventory—Item 6:

“I include the facts of age, gender roles, and socioeconomic status in my understanding of different minority cultures.”

The measured overlap (R square) between this item and the other items on the MCI is the communality.

Page 30: An Introduction to Factor Analysis

Common Factor Analysis

Partitions out the variance in each variable that is in common with the other variables. How?

Uses Multiple Regression.

a. Use each item as an outcome in MR

b. Use all other items as predictors

c. Finds the communality among all of the variables, relative to one another
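Steps a through c can be sketched in Python. This is a minimal illustration on simulated data, not the procedure of any particular statistics package; the function names are hypothetical. It computes each item's squared multiple correlation (SMC) two ways: by actually regressing the item on all the others, and by a standard shortcut that uses the inverse of the correlation matrix.

```python
import numpy as np

def squared_multiple_correlations(corr):
    """SMC of each variable with all others: 1 - 1 / diag(inverse of corr)."""
    corr = np.asarray(corr, dtype=float)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(corr))

def smc_by_regression(data, item):
    """R squared from regressing one item (column) on all the other items."""
    y = data[:, item]
    others = np.delete(data, item, axis=1)
    X = np.column_stack([np.ones(len(y)), others])   # add an intercept
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return 1.0 - residuals.var() / y.var()

# Simulated responses to 5 hypothetical items driven by one common factor.
rng = np.random.default_rng(1)
factor = rng.normal(size=(500, 1))
data = 0.7 * factor + rng.normal(scale=0.7, size=(500, 5))

corr = np.corrcoef(data, rowvar=False)
print(np.round(squared_multiple_correlations(corr), 3))
print(round(smc_by_regression(data, item=0), 3))  # matches the first value above
```

The shortcut avoids running one regression per item, which matters when an instrument has dozens of items.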

Page 31: An Introduction to Factor Analysis

Common Factor Analysis

Outcome: Item 1

Predictors: Items 2 through 10

The R square is the average shared variance for that item with the other items

Page 32: An Introduction to Factor Analysis

Common Factor Analysis

Outcome: Item 2

Predictors: Item 1 and Items 3 through 10

The R square is the average shared variance for that item with the other items

Page 33: An Introduction to Factor Analysis

Common Factor Analysis

Outcome: Item 3

Predictors: Items 1 and 2 and Items 4 through 10

The R square is the average shared variance for that item with the other items

Page 34: An Introduction to Factor Analysis

How is communality reported with CFA?

          Item 1   Item 2   Item 3   Item 4   Item 5   Item 6
Item 1     .76
Item 2     .60      .56
Item 3     .43      .76      .87
Item 4     .34      .45      .64      .56
Item 5     .33      .32      .34      .65      .52
Item 6     .82      .81      .45      .57      .33      .41

Squared multiple correlations (R square) are on the diagonal of the correlation matrix

Page 35: An Introduction to Factor Analysis

What makes a good factor?

• It is consistent with the literature regarding past investigations of variable relationships

• It is easy to understand and interpret

• It adheres to the “simple structure” model

Page 36: An Introduction to Factor Analysis

Principal Component Analysis

Data reduction

Page 37: An Introduction to Factor Analysis

Principal Component Analysis

How many components are there that can account for all or most of the information contained in the original data?

[Diagram: two components, Component 1 and Component 2, linked to ten items. Items 1, 4, 5, 8, and 7 are drawn with Component 1; Items 2, 3, 6, 10, and 9 are drawn with Component 2. Each item also has its own unique variance.]

Kahn 2006

Page 38: An Introduction to Factor Analysis

How is communality reported with PCA?

          Item 1   Item 2   Item 3   Item 4   Item 5   Item 6
Item 1     1.0
Item 2     .71      1.0
Item 3     .62      .76      1.0
Item 4     .34      .45      .64      1.0
Item 5     .33      .32      .34      .65      1.0
Item 6     .82      .81      .45      .57      .33      1.0
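The difference between the two diagonals can be sketched in Python. The example uses simulated data with hypothetical loadings, and it shows only the starting point of common factor analysis (squared multiple correlations on the diagonal, with no iteration), so treat it as an illustration rather than a full principal axis factoring routine.

```python
import numpy as np

# Simulated data: 6 hypothetical items, two underlying factors.
rng = np.random.default_rng(2)
factors = rng.normal(size=(400, 2))
loadings = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                     [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
data = factors @ loadings.T + rng.normal(scale=0.6, size=(400, 6))

corr = np.corrcoef(data, rowvar=False)        # PCA analyzes this: 1.0 on the diagonal

# Common factor analysis: put the squared multiple correlations on the diagonal.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))
reduced = corr.copy()
np.fill_diagonal(reduced, smc)

pca_eigenvalues = np.linalg.eigvalsh(corr)[::-1]          # largest first
factor_eigenvalues = np.linalg.eigvalsh(reduced)[::-1]

print("PCA eigenvalues:   ", np.round(pca_eigenvalues, 2))
print("Factor eigenvalues:", np.round(factor_eigenvalues, 2))
```

The PCA eigenvalues sum to the number of items because every item contributes its full variance of 1.0; the factor eigenvalues sum to less because only the shared variance is analyzed.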

Page 39: An Introduction to Factor Analysis

CFA vs. PCA

• Common factor analysis and principal components analysis often yield similar results when sample sizes are large and/or if item communalities are large.

• Common factor analysis is preferred in situations in which these criteria are not met, especially when the researcher wishes to better understand the latent variables that underlie a mass of items.

Page 40: An Introduction to Factor Analysis

Factor Analytic Family of Techniques

Metaphors for extraction of factors/components

Page 41: An Introduction to Factor Analysis

• With each extraction of a component, less and less variance is unaccounted for.

[Figure: components 1 through 8, extracted in sequence]

Page 42: An Introduction to Factor Analysis

Factor Analysis Metaphor

ITEM POOL: Variance-covariance matrix for an instrument

First factor: Extracts the shared variation only (i.e., plusses)

ITEM POOL: There is still shared variance left, but it is different than the first batch

Second factor: Extracts the shared variation only (i.e., plusses)

[Diagram: an item pool drawn as a field of plus and minus signs; each factor draws off a batch of plusses]

Page 43: An Introduction to Factor Analysis

The Principle of Parsimony

• Goal: We often want to use the smallest number of separate variables to convey the most information about the relationships among constructs.

“Less is more”

Kahn 2006

Page 44: An Introduction to Factor Analysis

Methodological Considerations

1. Selection of variables

2. Size of sample

3. Reliability of measures

4. Appropriateness of using Factor Analytic techniques (given the goal of the research)

5. Choice of method (how to extract the factors)

6. How many factors to retain?

7. Methods of rotation (to ease interpretability)

Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.

Page 45: An Introduction to Factor Analysis

How many factors to retain?

If you keep letting the program extract factors, it will extract as many factors as there are items.

So how do you decide how many factors to extract?

Bryant & Yarnold (1995). Principal-Components and Factor Analysis from Grimm & Yarnold’s (Eds.) Reading and Understanding Multivariate Statistics

Page 46: An Introduction to Factor Analysis

You want the fewest factors necessary to account for the most variance.

Factor Analytic techniques will give you as many factors as you want (even if they’re complete nonsense). The aim is to find the real factors that are consistent with the theoretical structure, not just factors that pop up and have no logical explanation.

Ferketich & Muller (1999) Readings in Research Methodology, Second Edition

Page 47: An Introduction to Factor Analysis

How many factors to retain?

A priori criterion

• Replication criterion

• Percentage criterion

Stopping rules

• Kaiser rule

• Cattell’s scree plot

• Parallel analysis

Bryant & Yarnold (1995). Principal-Components and Factor Analysis from Grimm & Yarnold’s (Eds.) Reading and Understanding Multivariate Statistics

Page 48: An Introduction to Factor Analysis

A priori criterion

1. When you are replicating research and you want to use the same number of factors to retain as previous researchers.

2. You decide a cut-off point, based on some theoretical rationale (e.g., retain factors until 80% of the variance is explained by the extracted factors).
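The percentage criterion in point 2 can be sketched as follows. The function name and the eigenvalues are hypothetical; the 80% target is the example given above.

```python
import numpy as np

def n_factors_for_variance(eigenvalues, target=0.80):
    """Smallest number of components whose cumulative share of variance reaches target."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Small tolerance guards against floating-point round-off at the cut-off.
    return int(np.argmax(cumulative >= target - 1e-12) + 1)

# Hypothetical eigenvalues for a 6-item instrument (they sum to 6).
eigenvalues = [3.0, 1.5, 0.6, 0.4, 0.3, 0.2]
print(n_factors_for_variance(eigenvalues))  # 3
```

Here the first two components explain 75% of the variance, so a third is needed to pass the 80% cut-off.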

Page 49: An Introduction to Factor Analysis

Eigenvalues

The eigenvalue is the amount of variance, summed across all the variables, that is accounted for by the factor in question.

The sum of all eigenvalues = number of variables/items in component analysis
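Both statements can be checked numerically. The sketch below uses a small hypothetical correlation matrix and NumPy's symmetric eigendecomposition; it is an illustration of component analysis only.

```python
import numpy as np

# Hypothetical correlation matrix for three items.
corr = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]            # sort from largest to smallest
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component loadings: eigenvector scaled by the square root of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

print(np.round(eigenvalues, 3), "sum =", round(float(eigenvalues.sum()), 3))
# Each eigenvalue equals the sum of squared loadings on that component.
print(np.round((loadings ** 2).sum(axis=0), 3))
```

The eigenvalues sum to 3, the number of items, and each one equals the column sum of squared loadings for its component.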

Ferketich & Muller (1999) Readings in Research Methodology, Second Edition

Page 50: An Introduction to Factor Analysis

How many factors to retain?

Kaiser criterion: Retain all factors with an eigenvalue greater than 1.0.

This sets the limit so that a component must account for at least as much variance as a single variable (to be considered useful).

Kahn 2006

(For CFA, which SPSS calls principal axis factoring, this would be “factor” instead of “component”)
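A minimal sketch of the Kaiser criterion for component analysis, using a hypothetical correlation matrix and a hypothetical function name:

```python
import numpy as np

def kaiser_count(corr):
    """Number of components whose eigenvalue exceeds 1.0."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(corr, dtype=float))
    return int((eigenvalues > 1.0).sum())

# Hypothetical correlation matrix: items 0-1 correlate, items 2-3 correlate.
corr = [
    [1.0, 0.7, 0.0, 0.0],
    [0.7, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
]
print(kaiser_count(corr))  # 2
```

The eigenvalues here are 1.7, 1.6, 0.4, and 0.3, so two components account for more variance than a single item does.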

Page 51: An Introduction to Factor Analysis

How many factors to retain?

Cattell’s scree test: Retain all factors up to the big drop (change in slope) in the plot of eigenvalues. Can be combined with the Kaiser criterion (factors with an eigenvalue greater than 1.0).

This includes the limit so that a factor must show that it accounts for a chunk of unique variance that is more than the variance of a single item.

Page 52: An Introduction to Factor Analysis

Parallel Analysis

• You generate a scree plot (with eigenvalues) based on random data that uses the same number of variables (items) and the same number of cases.

• Retain the factors with eigenvalues higher than the random eigenvalues.

• Not an option in SPSS

Kahn 2006
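Parallel analysis is straightforward to sketch outside SPSS. The Python version below is an illustration under stated assumptions: it uses component eigenvalues, random normal data, and the 95th percentile of the random eigenvalues as the benchmark (one common variant; the mean is another). The function name and the simulated data are hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_simulations=100, percentile=95, seed=0):
    """Count components whose eigenvalue beats the random-data benchmark."""
    rng = np.random.default_rng(seed)
    n_cases, n_vars = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues from random normal data with the same number of cases and variables.
    random_eigs = np.empty((n_simulations, n_vars))
    for s in range(n_simulations):
        noise = rng.normal(size=(n_cases, n_vars))
        random_eigs[s] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    benchmark = np.percentile(random_eigs, percentile, axis=0)

    # Retain components until the first one that fails to exceed the benchmark.
    n_retain = 0
    for obs, rand in zip(observed, benchmark):
        if obs <= rand:
            break
        n_retain += 1
    return n_retain, observed, benchmark

# Simulated data with two hypothetical factors behind six items.
rng = np.random.default_rng(3)
factors = rng.normal(size=(300, 2))
pattern = np.array([[0.8, 0], [0.8, 0], [0.8, 0], [0, 0.8], [0, 0.8], [0, 0.8]])
data = factors @ pattern.T + rng.normal(scale=0.6, size=(300, 6))

n_retain, observed, benchmark = parallel_analysis(data)
print("Retain", n_retain, "components")
```

Plotting `observed` and `benchmark` on the same axes gives the two scree lines described above; the components to keep are those where the observed line sits above the random one.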

Page 53: An Introduction to Factor Analysis

Factor Rotation

Obtaining a clearer pattern of factor loadings

Page 54: An Introduction to Factor Analysis

The Goal of Rotating Factors

To create high factor loadings for each item on one factor

And create low factor loadings on all other factors

THIS COMBINATION OF CHARACTERISTICS IS REFERRED TO AS THE SIMPLE STRUCTURE.

IT MAKES THE FACTORS MORE INTERPRETABLE

Ferketich & Muller (1999) Readings in Research Methodology, Second Edition

Page 55: An Introduction to Factor Analysis

Factor Structure Coefficients

• These are correlations between the item and its associated factor.

• The simple structure dictates that factor coefficients are best if they are very high (in reference to their own factor) and very low (in reference to any other retained factor).

• Rotating factors will change their structure coefficients, thus better approximating the simple structure being sought.

Page 56: An Introduction to Factor Analysis

Thurstone’s Rule

• Good items (variables) should only load onto one factor

• Items should load on that one factor with a magnitude of at least 0.30.

• The factor the item loads on should not have an eigenvalue of less than 1.0
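The first two points of the rule can be checked mechanically. The sketch below counts, for each variable, how many factors it loads on at 0.30 or above. The loadings are the rotated values tabulated later in this presentation; the function name and the wording of the labels are hypothetical.

```python
import numpy as np

def salient_loading_counts(loadings, cutoff=0.30):
    """For each item, count the factors on which |loading| is at least the cutoff."""
    return (np.abs(np.asarray(loadings, dtype=float)) >= cutoff).sum(axis=1)

# Rotated loadings for five variables on two factors, as tabulated in this presentation.
rotated = [
    [0.65, 0.45],
    [0.62, 0.09],
    [0.05, 0.69],
    [0.02, 0.68],
    [0.10, 0.82],
]
counts = salient_loading_counts(rotated)
for item, count in enumerate(counts, start=1):
    label = "clean" if count == 1 else "cross-loads" if count > 1 else "no salient loading"
    print(f"Variable {item}: {label}")
```

Variable 1 loads on both factors (.65 and .45), so it cross-loads; variables 2 through 5 each load cleanly on a single factor.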

Page 57: An Introduction to Factor Analysis

Distillation

[Diagram: Items 1 through 8 sorted into two groups, Factor 1 and Factor 2]

Page 58: An Introduction to Factor Analysis

Kirby, J. R., Parrila, R., & Pfeiffer, S. (2003). Naming speed and phonological awareness as predictors of reading development. Journal of Educational Psychology, 95(3), 453-464.

Page 59: An Introduction to Factor Analysis

Kirby, J. R., Parrila, R., & Pfeiffer, S. (2003). Naming speed and phonological awareness as predictors of reading development. Journal of Educational Psychology, 95(3), 453-464.

[Figure: loadings of Picture naming, Color naming, Sound isolation, Phoneme elision, Blending onset-rime, and Blending phonemes on two factors (Factor 1 and Factor 2), labeled Rapid automatized naming and Phonological awareness. Values shown: .96, .90, .77, .63, .90, .75, .47 and -.10, -.05, .06, .15, .03, -.05.]

Page 60: An Introduction to Factor Analysis

Common rotations

Orthogonal - factors are at 90 degree angles (i.e., uncorrelated)

• *Varimax

• Quartimax

• Equimax

*most popular

Oblique - factors may be correlated with each other.

Ferketich & Muller (1999) Readings in Research Methodology, Second Edition

Page 61: An Introduction to Factor Analysis

Factor Extraction

Because the first factor extracted accounts for the most variance among the variables, the next factor extracted will capture variance not accounted for by the first factor. This helps the latent variables be “orthogonal,” meaning that the extracted factors are generally uncorrelated with each other.

Page 62: An Introduction to Factor Analysis

Orthogonal Rotations

Varimax: Most common. Maximizes loadings on one factor while minimizing loadings on other factors.

Quartimax: Uncommon. Maximizes factor loading on the first factor only.

Equimax: Also less common. Combines other techniques and because of this, is more difficult to interpret than the other two options.
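For readers who want to see what varimax does numerically, here is a compact Python sketch of the commonly used SVD-based varimax iteration. It is an illustration, not the routine of any particular package; the function name, iteration limit, and tolerance are assumptions. The input is the unrotated loading table shown later in this presentation, and the output is not expected to reproduce the rotated table there digit for digit.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix toward simple structure (varimax)."""
    L = np.asarray(loadings, dtype=float)
    n_items, n_factors = L.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = L.T @ (rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / n_items)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                      # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                              # no meaningful improvement
        criterion = new_criterion
    return L @ rotation, rotation

# Unrotated loadings of five variables on two factors, from this presentation.
unrotated = [[0.62, 0.52], [0.54, 0.25], [0.25, 0.59], [0.39, 0.66], [0.35, 0.68]]
rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each variable's communality (its row sum of squared loadings) is unchanged; only the way that variance is divided between the factors changes.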

Ferketich & Muller (1999) Readings in Research Methodology, Second Edition

Page 63: An Introduction to Factor Analysis

Oblique rotations

Not used frequently but should be when factors are correlated.

Promax is the most popular of the oblique methods

• First rotates orthogonally

• Then followed by oblique rotation

• Minimizes small loadings

• Simple structure is best approximated

Ferketich & Muller (1999) Readings in Research Methodology, Second Edition

Page 64: An Introduction to Factor Analysis

How to decide?

• You want what will give you the most interpretable result, with the simplest solution, consistent with an underlying theoretical structure.

• You can use different rotational techniques and compare results. Similar results strengthen confidence in the outcome.

Ferketich & Muller (1999) Readings in Research Methodology, Second Edition

Page 65: An Introduction to Factor Analysis

How to clarify factor loadings using rotation

[Figure: Items 1, 2, 3, and 4 plotted against the Factor 1 axis and the Factor 2 axis]

Page 66: An Introduction to Factor Analysis

Rotation

[Figure: Items 1, 2, and 4 plotted against the Factor 1 axis and the Factor 2 axis as the axes are rotated]


Page 72: An Introduction to Factor Analysis

Factor Rotation

[Figure: Items 1, 2, 3, and 4 plotted against the original Factor 1 and Factor 2 axes and against the Rotated Factor 1 and Rotated Factor 2 axes]

Page 73: An Introduction to Factor Analysis

• Factor loading coefficients define the eigenvector. The factor loading coefficient represents the correlation between the item and the eigenvector

Before orthogonal rotation

             Eigenvectors
Variables      1      2
    1         .62    .52
    2         .54    .25
    3         .25    .59
    4         .39    .66
    5         .35    .68

Page 74: An Introduction to Factor Analysis

After orthogonal rotation

• Factor loading coefficients define the eigenvector. The factor loading coefficient represents the correlation between the item and the eigenvector

             Eigenvectors
Variables      1      2
    1         .65    .45
    2         .62    .09
    3         .05    .69
    4         .02    .68
    5         .10    .82

Page 75: An Introduction to Factor Analysis

Factor coefficients: before and after

Before orthogonal rotation

             Eigenvectors
Variables      1      2
    1         .62    .52
    2         .54    .25
    3         .25    .59
    4         .39    .66
    5         .35    .68

After orthogonal rotation

             Eigenvectors
Variables      1      2
    1         .65    .45
    2         .62    .09
    3         .05    .69
    4         .02    .68
    5         .10    .82

Page 76: An Introduction to Factor Analysis

Uses of Factor Analytic Techniques

• All of the techniques associated with creating factors from many variables are sample specific; however, the better the quality of your sample (size, representativeness, etc.), the more likely your results will generalize to other samples, and theoretically, to the population of interest.

Page 77: An Introduction to Factor Analysis

Floyd & Widaman (1995)

“Thus, common factor analysis can provide valuable insights into the multivariate structure of a measuring instrument, isolating the theoretical constructs [i.e., factors] whose effects are reflected in responses on the instrument.” (p. 287)

Page 78: An Introduction to Factor Analysis

Cross Validation

• Randomly divide your sample (2/3, 1/3)

• Try to replicate factor solutions across groups

• Explore for part of the sample, then confirm with the other portion
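The random 2/3 and 1/3 split can be sketched as follows. The function name, the seed, and the simulated responses are hypothetical; the exploratory and confirmatory analyses themselves would then be run separately on the two parts.

```python
import numpy as np

def split_sample(data, explore_fraction=2 / 3, seed=0):
    """Randomly split rows into an exploratory part and a confirmatory part."""
    data = np.asarray(data)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(data))
    n_explore = int(round(len(data) * explore_fraction))
    return data[shuffled[:n_explore]], data[shuffled[n_explore:]]

# Hypothetical data set: 300 respondents, 10 items.
responses = np.random.default_rng(4).normal(size=(300, 10))
explore, confirm = split_sample(responses)
print(explore.shape, confirm.shape)  # (200, 10) (100, 10)
```

Fixing the seed makes the split reproducible, so the same cases land in the same part each time the analysis is rerun.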

Page 79: An Introduction to Factor Analysis

EFA vs. CFA

Exploratory

• Find and retain factors (no test of significance, per se)

Confirmatory

• See how well the constructed model fits the data

• Chi-square goodness of fit test

Page 80: An Introduction to Factor Analysis

Confirmatory Factor Analysis and Model Fit

The researcher specifies in advance (predicts) how many factors will be found and which items should load on which factors.

[Diagram: Factor 1, Factor 2, Factor 3, Factor 4]

Page 81: An Introduction to Factor Analysis

Links and Resources

• http://www.siu.edu/~epse1/pohlmann/factglos/