Introduction to Factor Analysis

Professor Ron Fricker
Naval Postgraduate School
Monterey, California
3/2/13

Reading Assignment: Fricker, Kulzy & Appleget (2012)

Slides: faculty.nps.edu/rdfricke/OA4109/Lecture 10-1 -- Factor Analysis.pdf

Goals for this Lecture

•  Learn about factor analysis as a tool for:
   –  Deriving unobserved latent variables from observed survey question responses
   –  Data reduction
•  Understand the steps in conducting factor analysis and the R functions/syntax
•  Illustrate the application of factor analysis to survey data

Why Factor Analysis?

•  Factor analysis is a method for identifying latent traits from question-level survey data
   –  Useful in survey analysis whenever the phenomenon of interest is complex and not directly measurable via a single question
•  In such situations, must ask a series of questions about the phenomenon, then appropriately combine the resulting responses into a single measure or “factor”
   –  Such factors, then, become the observed measures of the unobservable or latent phenomenon

Goal of Factor Analysis

•  Factor analysis is a hybrid of social and statistical science
•  Dates to the early 1900s, when the goal was multivariate data reduction
•  Idea is to explain the correlation structure observed in p dimensions via a linear combination of r factors, where:
   –  the number of factors is smaller than the number of observed variables (r < p), and
   –  the factors achieve both “statistical simplicity and scientific meaningfulness” (Harman, 1976)

Factor Analysis and Survey Data

•  Common use of exploratory factor analysis is to “determine what sets of items hang together in a questionnaire” (DeCoster, 1998)
   –  Particularly important for instruments with a large number of items (i.e., for data reduction)
   –  Also useful when there is a need to summarize sets of items in terms of their commonalities (i.e., express results in terms of latent variables)
•  Practically, can make interpreting and summarizing (complex) survey results easier, more meaningful, and more efficient

Three Types of Factor Analysis

•  Principal components
   –  Empirical data reduction methodology, but not focused on achieving “scientific meaningfulness”
•  Exploratory factor analysis
   –  Also an empirical data reduction methodology, one that often does derive scientifically meaningful factors
   –  Focus of this lecture
•  Confirmatory factor analysis
   –  Variety of methods focused on testing hypotheses about structure of factors
   –  See Maj Steve Jones’ thesis (2012) for more info.

A Bit About Principal Components

•  Standard statistical method for data reduction
•  Seeks to explain as much variance as possible in a small number of orthogonal linear combinations of the original data
•  Useful when the goal is to reduce the number of variables in a model/analysis while capturing much of the variability
•  However, as just stated, resulting components do not necessarily achieve “scientific meaningfulness”
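A minimal sketch of principal components in R, using prcomp() from the stats package. The data are simulated stand-ins (the item names q1 to q6 and the generating weights are assumptions for illustration, not the lecture's data). It shows the two properties listed above: the leading components capture most of the variance, and the components are orthogonal linear combinations of the items.

```r
# Simulated stand-in data: 200 respondents, 6 correlated items (hypothetical).
set.seed(1)
n <- 200
f <- matrix(rnorm(n * 2), n, 2)                       # two latent drivers
w <- matrix(c(0.8, 0.7, 0.6, 0,   0,   0,
              0,   0,   0,   0.8, 0.7, 0.6), nrow = 2, byrow = TRUE)
x <- f %*% w + matrix(rnorm(n * 6, sd = 0.5), n, 6)   # items = signal + noise
colnames(x) <- paste0("q", 1:6)

# Principal components of the standardized items.
pc <- prcomp(x, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each component, and the running total.
var_explained <- pc$sdev^2 / sum(pc$sdev^2)
print(round(cumsum(var_explained), 3))

# The component weights form an orthonormal set: t(rot) %*% rot is the identity.
rot <- unclass(pc$rotation)
orthogonality_error <- max(abs(crossprod(rot) - diag(6)))
print(orthogonality_error)
```

Nothing in this computation asks whether a component is interpretable, which is the limitation the last bullet points to.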

A Bit About Confirmatory Factor Analysis

•  Intended as a way to test theories/hypotheses about factor constructs
•  My preference: Whenever possible, test results via reproducibility (on separate data) vice confirmatory factor analysis (CFA)
   –  “Finally, the process of reproducing Factor Analysis on out-of-sample data (the 2011 survey) proved much more useful than conducting CFA. Although CFA most undoubtedly has uses for some models and some data sets, it is neither powerful enough, nor informative enough, to justify its use compared to the reproduction of Factor Analysis” (Jones, 2012).

✓  Reproducibility is the appropriate scientific standard and important to do for any statistical analysis
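One way to sketch a reproducibility check in R is to fit the same factor model to two separate data sets and compare the loadings. The example below is an illustration under stated assumptions, not the procedure used in Jones (2012): the data are simulated, simulate_survey() is a hypothetical helper, and the use of Tucker's congruence coefficient to compare loadings is a choice made here for the example.

```r
set.seed(2012)

# Hypothetical generator: six items driven by two independent latent factors.
simulate_survey <- function(n) {
  F1 <- rnorm(n)
  F2 <- rnorm(n)
  data.frame(
    q1 = 0.9 * F1 + rnorm(n, sd = 0.5),
    q2 = 0.8 * F1 + rnorm(n, sd = 0.5),
    q3 = 0.7 * F1 + rnorm(n, sd = 0.5),
    q4 = 0.9 * F2 + rnorm(n, sd = 0.5),
    q5 = 0.8 * F2 + rnorm(n, sd = 0.5),
    q6 = 0.7 * F2 + rnorm(n, sd = 0.5)
  )
}

original  <- simulate_survey(400)   # stands in for the original survey
follow_up <- simulate_survey(400)   # stands in for separate, out-of-sample data

# Fit the same two-factor model (default varimax rotation) to each data set.
load_a <- unclass(factanal(original,  factors = 2)$loadings)
load_b <- unclass(factanal(follow_up, factors = 2)$loadings)

# Tucker's congruence coefficient between every pair of factors.
congruence <- crossprod(load_a, load_b) /
  sqrt(outer(colSums(load_a^2), colSums(load_b^2)))

# Factor order and sign are arbitrary, so take the best absolute match per factor.
best_match <- apply(abs(congruence), 1, max)
print(round(best_match, 3))
```

Values of best_match near 1 suggest the same factors were recovered from both data sets; low values suggest the structure did not reproduce.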

Exploratory Factor Analysis in a Picture

•  Example: Six questions that are functions of two underlying (unobserved) factors:

[Figure not recovered in this transcript]
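The six-questions, two-factors setup can be imitated with simulated data. The sketch below is an assumption-laden illustration (the weights, noise levels, and names q1 to q6 are invented here): questions q1 to q3 are built from one unobserved factor and q4 to q6 from another, and the correlation matrix of the observed questions then shows two blocks.

```r
set.seed(42)
n <- 500

# Two unobserved factors (hypothetical), independent standard normals.
F1 <- rnorm(n)
F2 <- rnorm(n)

# Six observed "questions": q1-q3 depend on F1, q4-q6 on F2, plus unique noise.
q <- data.frame(
  q1 = 0.9 * F1 + rnorm(n, sd = 0.4),
  q2 = 0.8 * F1 + rnorm(n, sd = 0.5),
  q3 = 0.7 * F1 + rnorm(n, sd = 0.6),
  q4 = 0.9 * F2 + rnorm(n, sd = 0.4),
  q5 = 0.8 * F2 + rnorm(n, sd = 0.5),
  q6 = 0.7 * F2 + rnorm(n, sd = 0.6)
)

# Only the questions are observed; their correlations reveal the grouping.
R <- cor(q)
print(round(R, 2))

# Average correlation within the first block versus across the two blocks.
within  <- mean(R[1:3, 1:3][upper.tri(diag(3))])
between <- mean(abs(R[1:3, 4:6]))
print(c(within = within, between = between))
```

Factor analysis works in the reverse direction: it starts from a correlation matrix like R and tries to recover the grouping.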

Mathematically

•  The idea is to find a set of r common factors, F1,…,Fr, such that when used to estimate the data the correlation structure of the estimated data is close to the correlation structure of the actual data

[Equation not recovered in this transcript; its annotations label the common factors, the loadings, and the unique loading (and its factor)]
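Since the slide's equation did not survive extraction, the following is a standard way of writing the common factor model, offered as an assumption about what was shown rather than a reconstruction. The symbols \lambda_{ij}, \psi_i and U_i are notation chosen here to match the three annotations.

```latex
X_i = \underbrace{\lambda_{i1}}_{\text{loading}} \underbrace{F_1}_{\text{common factor}}
      + \lambda_{i2} F_2 + \cdots + \lambda_{ir} F_r
      + \underbrace{\psi_i U_i}_{\text{unique loading and its factor}},
\qquad i = 1, \ldots, p
```

Here X_i is the response to question i, the F_j are shared by all p questions, and U_i affects question i only. In matrix notation the loadings are collected into a p by r matrix \Lambda and the unique terms into a single term \Psi.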

Steps in (Exploratory) Factor Analysis

•  Determine the number of factors
   –  Seems like a Catch-22 (“How can I know the number of factors if they’re unobserved?”), but there is a way that works well
•  Fit the exploratory factor analysis model
•  Rotate the model to achieve desired solution
   –  Two main approaches: promax and varimax
   –  Decide whether to keep all variables in each factor or use a cut-off for the loadings
•  Interpret the resulting factors
   –  Re-rotate as necessary

Determining the Number of Factors

•  Getting the number of factors right is critical
   –  Too few and factors load with irrelevant items
   –  Too many and items spread out over many factors
   –  Both make interpreting the resulting factors hard and may obscure the real underlying factors
•  Variety of methods proposed:
   –  Kaiser rule, scree plot, etc.
•  What works well is parallel analysis
   –  Idea: Factors derived from real data should have larger eigenvalues than equivalent factors derived from equivalent simulated data
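The parallel analysis idea can be sketched in a few lines of base R. This is a simplified illustration that compares principal-component eigenvalues of the correlation matrix against eigenvalues from simulated uncorrelated data; packaged implementations differ in details. The data, the 95th-percentile threshold, and all variable names are assumptions made for the example.

```r
set.seed(7)
n <- 400
p <- 8

# Stand-in data (hypothetical): two latent factors drive eight items.
f   <- matrix(rnorm(n * 2), n, 2)
lam <- rbind(matrix(c(0.8, 0), 4, 2, byrow = TRUE),   # items 1-4 load on factor 1
             matrix(c(0, 0.8), 4, 2, byrow = TRUE))   # items 5-8 load on factor 2
x <- f %*% t(lam) + matrix(rnorm(n * p, sd = 0.6), n, p)

# Eigenvalues of the observed correlation matrix.
ev_actual <- eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues from uncorrelated normal data of the same size, repeated many times.
n_sim  <- 100
ev_sim <- replicate(n_sim, {
  z <- matrix(rnorm(n * p), n, p)
  eigen(cor(z), symmetric = TRUE, only.values = TRUE)$values
})
ev_threshold <- apply(ev_sim, 1, quantile, probs = 0.95)

# Retain the leading eigenvalues that exceed what random data would produce.
exceeds  <- ev_actual > ev_threshold
n_retain <- if (all(exceeds)) p else which(!exceeds)[1] - 1

# For comparison, the Kaiser rule keeps eigenvalues greater than 1.
n_kaiser <- sum(ev_actual > 1)

print(round(rbind(actual = ev_actual, threshold = ev_threshold), 2))
print(c(parallel = n_retain, kaiser = n_kaiser))
```

In this clean simulated case both rules agree; on real survey data they can disagree.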

Parallel Analysis with QOL Data

•  Consider question 7 from QOL survey
   –  5-point Likert rating of 15 NPS services

[Table of the 15 services not recovered in this transcript. Four services were annotated as removed:]
•  Removed – too much missing (950 out of 1,368)
•  Removed – too much missing (864 out of 1,368)
•  Removed – too much missing (818 out of 1,368)
•  Removed – too much missing (505 out of 1,368)

Data Preparation

•  Re-coded Likert scale: 1=Very Satisfied to 5=Very Unsatisfied
•  Deleted all records where respondents failed to answer one or more of the 11 parts (casewise deletion)
   –  Only did it here for convenience to illustrate factor analysis
   –  Would have used nearest neighbor hot deck imputation, based on demographics, in a real analysis
•  Final result: 11 questions for 555 respondents
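The two preparation steps can be sketched in base R as follows. The raw data frame, its column names, and the assumption that the raw coding ran from 1 = Very Unsatisfied to 5 = Very Satisfied are all hypothetical; the lecture does not state the original coding.

```r
# Hypothetical raw responses: 1 = Very Unsatisfied ... 5 = Very Satisfied, NA = no answer.
set.seed(3)
raw <- as.data.frame(matrix(sample(c(1:5, NA), 20 * 4, replace = TRUE), 20, 4))
names(raw) <- paste0("q7_", letters[1:4])

# Re-code so that 1 = Very Satisfied ... 5 = Very Unsatisfied (NA stays NA).
recoded <- 6 - raw

# Casewise deletion: keep only respondents who answered every part.
complete <- recoded[complete.cases(recoded), ]

print(c(respondents = nrow(raw), retained = nrow(complete)))
```

The drop from respondents to retained shows how quickly casewise deletion discards data, which is why the lecture treats it as a convenience only.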

Results for QOL Q7 Data

[Figure: “Parallel Analysis Scree Plots”. x-axis: Factor Number (2 to 10); y-axis: eigenvalues of principal components and factor analysis (0 to 3); legend: PC Actual Data, PC Simulated Data, PC Resampled Data, FA Actual Data, FA Simulated Data, FA Resampled Data]

•  Indicates 6 factors appropriate for Q7

Fitting the Model

•  Idea: Find factors and associated loadings so that covariance of their linear combination is “close to” covariance of the original data
   –  I.e., find the estimated data $\hat{X} = \hat{\Lambda}\hat{F} + \hat{\Psi}$ so that $\operatorname{cor}(X) \approx \operatorname{cor}(\hat{X})$
•  Mathematics beyond the scope of what we’ll cover today
•  Because factors and their loadings are all unknown, there is no unique solution
   –  In fact, there are an infinite number of solutions
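The lack of a unique solution can be seen numerically: rotating a loading matrix by any orthogonal matrix leaves the implied correlation matrix unchanged. The sketch below uses a made-up loading matrix and, as an assumption of this example, the common convention that Psi is the diagonal matrix of unique variances, so that the implied correlation is Lambda %*% t(Lambda) + Psi.

```r
# Hypothetical 4 x 2 loading matrix.
Lambda <- matrix(c(0.8, 0.1,
                   0.7, 0.2,
                   0.1, 0.9,
                   0.2, 0.6), ncol = 2, byrow = TRUE)

# Unique variances chosen so that the implied matrix has a unit diagonal.
Psi <- diag(1 - rowSums(Lambda^2))

# An orthogonal rotation by 30 degrees; any other angle would work as well.
theta <- pi / 6
Tm <- matrix(c(cos(theta), -sin(theta),
               sin(theta),  cos(theta)), ncol = 2, byrow = TRUE)
Lambda_rot <- Lambda %*% Tm

# Both sets of loadings imply exactly the same correlation matrix.
implied     <- Lambda     %*% t(Lambda)     + Psi
implied_rot <- Lambda_rot %*% t(Lambda_rot) + Psi
max_diff <- max(abs(implied - implied_rot))

print(round(Lambda_rot, 3))
print(max_diff)
```

Because every angle gives an equally good fit, an extra criterion is needed to pick one solution, which is the role of rotation.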

Fitting the Model in R

•  Given the desired number of factors, use the factanal() function in base R
•  Basic syntax is factanal(dataframe, nr_factors)
   –  Here “dataframe” contains only those variables to be used in the factor analysis
   –  And “nr_factors” is an integer
   –  Default rotation is varimax, but can also specify promax
      •  Varimax results in orthogonal factors
      •  Promax allows for correlated factors
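A short sketch of the factanal() call with both rotations. The survey data frame is simulated for illustration (its columns, weights, and the 0.4 correlation between the two latent factors are assumptions); the factors and rotation arguments are part of factanal() in the stats package.

```r
set.seed(11)
n <- 500

# Two correlated latent factors (hypothetical).
F1 <- rnorm(n)
F2 <- 0.4 * F1 + sqrt(1 - 0.4^2) * rnorm(n)

# "survey" holds only the variables to be used in the factor analysis.
survey <- data.frame(
  q1 = 0.9 * F1 + rnorm(n, sd = 0.5),
  q2 = 0.8 * F1 + rnorm(n, sd = 0.5),
  q3 = 0.7 * F1 + rnorm(n, sd = 0.5),
  q4 = 0.9 * F2 + rnorm(n, sd = 0.5),
  q5 = 0.8 * F2 + rnorm(n, sd = 0.5),
  q6 = 0.7 * F2 + rnorm(n, sd = 0.5)
)
nr_factors <- 2L

# Default rotation is varimax (orthogonal factors).
fit_varimax <- factanal(survey, factors = nr_factors)

# Promax allows the factors to be correlated.
fit_promax <- factanal(survey, factors = nr_factors, rotation = "promax")

print(fit_varimax$loadings)
print(fit_promax$loadings)
```

Comparing the two printed loading tables shows how the choice of rotation changes the cross-loadings when the underlying factors are correlated.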

Varimax Rotation

•  Varimax finds the rotation that makes the high loadings as high as possible while also making the low loadings as low as possible
•  I.e., varimax finds an orthogonal transformation that maximizes:

[Equation not recovered in this transcript; its annotations read:]
   –  Essentially, the variance of the jth factor’s (rescaled) loadings over the p questions
   –  Sum of the “variances” over the r factors
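The slide's formula was lost in extraction. The standard (Kaiser-normalized) varimax criterion is given below as an assumption about what was shown; it matches the two annotations, with \tilde{\lambda}_{ij} denoting the rescaled loading of question i on factor j.

```latex
V = \sum_{j=1}^{r}
    \left[
      \frac{1}{p} \sum_{i=1}^{p} \tilde{\lambda}_{ij}^{4}
      - \left( \frac{1}{p} \sum_{i=1}^{p} \tilde{\lambda}_{ij}^{2} \right)^{2}
    \right],
\qquad
\tilde{\lambda}_{ij} = \frac{\lambda_{ij}}{h_i},
\quad
h_i^{2} = \sum_{j=1}^{r} \lambda_{ij}^{2}
```

The bracketed term is the variance of the squared rescaled loadings of factor j over the p questions, and the outer sum adds these variances over the r factors. The variance is large when a factor's loadings are either near zero or large in magnitude, which is what produces the "high as high as possible, low as low as possible" pattern.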

Example #1: QOL Results

•  In the end, I found the following 6 factors using a loadings cut-off of 0.4 (a subjective choice):
   –  Healthcare Service
   –  Exchange and Comm.
   –  Auto Services
   –  MWR Services
   –  NPS Student Services
   –  Fitness Services

[Loadings table not recovered in this transcript]
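Applying a loadings cut-off amounts to keeping, for each factor, only the items whose absolute loading reaches the threshold. The sketch below uses an invented 5 by 2 loading matrix (item names and values are hypothetical, not the QOL results) with the 0.4 cut-off mentioned above.

```r
# Hypothetical rotated loadings for five items on two factors.
L <- matrix(c(0.82, 0.10,
              0.65, 0.25,
              0.15, 0.71,
              0.05, 0.48,
              0.30, 0.35), ncol = 2, byrow = TRUE,
            dimnames = list(paste0("item", 1:5), c("Factor1", "Factor2")))

cutoff <- 0.4                      # subjective choice, as in the lecture
keep <- abs(L) >= cutoff           # TRUE where an item belongs to a factor

# Items assigned to each factor.
members <- lapply(colnames(L), function(f) rownames(L)[keep[, f]])
names(members) <- colnames(L)

# Items that fail the cut-off on every factor.
unassigned <- rownames(L)[rowSums(keep) == 0]

print(members)
print(unassigned)
```

With a fitted model, print(fit$loadings, cutoff = 0.4) gives a similar view by blanking out loadings below the threshold; fit here stands for any object returned by factanal().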

Compare to Principal Components

[Figure not recovered in this transcript]

Example #1: Discussion

•  This is only an illustration
   –  Use of casewise deletion was extreme
      •  Better to use demographics and nearest neighbor hot deck imputation
   –  Also, only running factor analysis on a small subset of the survey questions was extreme
      •  Better to run factor analysis on all the questions
   –  How might the additional information have affected the factor formulation? What else might have entered into the factors?
•  Compared to principal components, resulting factors are more intuitively interpretable

Example #2: Three National Surveys

•  140 questions common across four countries
•  Fielded in 2010 to:
   –  3,770 respondents in Country “A”
   –  1,661 respondents in Country “B”
   –  1,874 respondents in Country “C”
   –  1,481 respondents in Country “D”
•  Survey asked about
   –  quality of life
   –  governance, politics, and international relations
   –  security, social tolerance

Example #2 (continued)

[Figure not recovered in this transcript]

•  Figure shows the results from fa.parallel for Country A, which resulted in setting r = 27
   –  Sensitivity analysis using other values of r confirmed that r = 27 was appropriate
•  For Country B: r = 28; for Country C: r = 25; etc.
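For reference, fa.parallel() is provided by the psych package, which is not part of base R. The sketch below shows a call on simulated stand-in data (the data, sizes, and variable names are assumptions; this is not the national survey data). The call is guarded so the script still runs when psych is not installed.

```r
set.seed(5)
n <- 300

# Stand-in data (hypothetical): two latent factors drive eight items.
f   <- matrix(rnorm(n * 2), n, 2)
lam <- rbind(matrix(c(0.8, 0), 4, 2, byrow = TRUE),
             matrix(c(0, 0.8), 4, 2, byrow = TRUE))
x <- as.data.frame(f %*% t(lam) + matrix(rnorm(n * 8, sd = 0.6), n, 8))

r <- NA_integer_
if (requireNamespace("psych", quietly = TRUE)) {
  # fa = "both" reports principal-component and factor-analysis eigenvalues;
  # plot = FALSE suppresses the scree plot for non-interactive use.
  pa <- psych::fa.parallel(x, fa = "both", n.iter = 20, plot = FALSE)
  r  <- pa$nfact                  # suggested number of factors
}
print(r)
```

The suggested r is a starting point; as in the lecture, refitting with nearby values of r is a reasonable sensitivity check.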

“Government Trust” Factors

[Table not recovered in this transcript; columns: Country “A”, Country “B”, Country “C”, Country “D”]

“Trustor Propensity” Factors

[Table not recovered in this transcript; columns: Country “A”, Country “B”, Country “C”, Country “D”]

“Ability” Factors

[Table of ability factors and loadings not recovered in this transcript; columns: Country “A”, Country “B”, Country “C”, Country “D”]

“Benevolence/Integrity” Factors

[Table not recovered in this transcript; columns: Country “A”, Country “B”, Country “C”, Country “D”]

What We Have Just Learned

•  Learned about factor analysis as a tool for:
   –  Deriving unobserved latent variables from observed survey question responses
   –  Data reduction
•  Discussed the steps in conducting factor analysis and the R functions/syntax
•  Illustrated the application of factor analysis to survey data