The European Commission’s science and knowledge service · 2017-12-11 · Principal Component...
The European Commission’s science and knowledge service
Joint Research Centre
Step 5: Weighting methods (I)
Principal Component Analysis
Hedvig Norlén
COIN 2017 - 15th JRC Annual Training on Composite Indicators & Scoreboards 06-08/11/2017, Ispra (IT)
3 JRC-COIN © | Step 5: Weighting methods (I) Principal Component Analysis
Step 5: Weighting methods (I)
Importance
• Choosing the weights for the components is not a trivial task. The weights should agree with the underlying theoretical framework
• Diverse weighting methods exist, and there is no established methodology for selecting the best one
• Weights usually have an important impact on the composite indicator value and on the resulting ranking
• The weighting method should be made clear and transparent
Commonly used weighting methods
I. Equal weights (EW)
II. Weights based on statistical methods
• Principal component analysis (PCA) and factor analysis (FA)
• Data envelopment analysis (DEA)
III. Weights based on public/expert opinion
• Budget allocation process (BAP)
• Analytic hierarchy process (AHP)
• Conjoint analysis (CA)
I. Equal weights (EW)
• Many composite indicators use an equal weighting scheme (EW). Why? There is no a priori knowledge and no clear reference about the importance of the elements in the composite indicator
• EW does not mean that there are no weights: the weights are all equal
Equal weighting does not guarantee equal importance or equal contribution of the indicators to the composite indicator!
I. Example: Innovation Output Indicator (IOI)
Vertesy, The Innovation Output Indicator 2017: Methodology Report, forthcoming

Correlation between IOI and components, equal weights (EW = 0,25; 4 components in IOI)
Weight  Component  Pearson correlation coefficient  R^2   R^2 (rounded)
0,25    PCT        0,79                             0,62  0,6
0,25    KIABI      0,76                             0,58  0,6
0,25    COMP       0,74                             0,55  0,6
0,25    DYN        0,55                             0,30  0,3
EW results in an unbalanced contribution of components to the IOI

Correlation between IOI and components, adjusted weights
Weight  Component  Pearson correlation coefficient  R^2   R^2 (rounded)
0,22    PCT        0,72                             0,52  0,5
0,22    KIABI      0,70                             0,49  0,5
0,22    COMP       0,71                             0,51  0,5
0,34    DYN        0,67                             0,45  0,4
Adjusted weights result in a more balanced contribution of components to the IOI
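The check behind these two tables can be sketched as follows. This is a minimal illustration on synthetic data (an assumption, not the IOI data): three components share a common factor and a fourth is nearly independent, which is enough to reproduce the pattern of one component contributing less under equal weights. The function name `component_correlations` is hypothetical.

```python
import numpy as np

def component_correlations(X, weights):
    """Pearson correlation and R^2 between each column of X and the weighted composite."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    composite = X @ w  # weighted arithmetic average (weights sum to 1)
    r = np.array([np.corrcoef(X[:, j], composite)[0, 1] for j in range(X.shape[1])])
    return r, r ** 2

# Synthetic data (assumption): columns 0-2 share a common factor, column 3 does not.
rng = np.random.default_rng(0)
common = rng.normal(size=200)
X = np.column_stack([
    common + 0.5 * rng.normal(size=200),
    common + 0.5 * rng.normal(size=200),
    common + 0.5 * rng.normal(size=200),
    rng.normal(size=200),
])

# Same weight patterns as on the slide: equal weights, then a larger weight on the last component.
r_equal, r2_equal = component_correlations(X, [0.25, 0.25, 0.25, 0.25])
r_adj, r2_adj = component_correlations(X, [0.22, 0.22, 0.22, 0.34])
```

Under equal weights the weakly correlated component ends up with the lowest correlation to the composite; raising its weight narrows the gap, which is the sense in which adjusted weights give a more balanced contribution.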
I. Equal weights (EW)
• Highly correlated indicators: caution!
$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
$$
n = number of objects (countries, individuals, …)
p = number of indicators
$w_i$ = weight for indicator i, i = 1, …, p
E.g. the correlation between indicators 1 and 2 is very high. What should we do?
- Disregard one of the two indicators
- Reduce the weights of both indicators
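A quick way to spot such pairs is to scan the correlation matrix of the n × p data matrix X. The sketch below is illustrative only: the function name and the 0.9 cut-off are assumptions, not values prescribed in the training material, and the data are synthetic.

```python
import numpy as np

def highly_correlated_pairs(X, threshold=0.9):
    """Return (i, j, r) for indicator pairs whose |Pearson r| exceeds the threshold."""
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)  # p x p correlation matrix
    p = R.shape[0]
    return [
        (i, j, float(R[i, j]))
        for i in range(p)
        for j in range(i + 1, p)
        if abs(R[i, j]) > threshold
    ]

# Synthetic data (assumption): indicator 1 is almost a copy of indicator 0.
rng = np.random.default_rng(1)
x0 = rng.normal(size=100)
X = np.column_stack([x0, x0 + 0.1 * rng.normal(size=100), rng.normal(size=100)])

pairs = highly_correlated_pairs(X)  # column indices are 0-based
```

Each flagged pair is a candidate for one of the two options above: dropping one indicator or reducing the weights of both.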
I. Example EW: The Global Talent Competitiveness Index 2017 (GTCI 2017)
Measures the ability of countries to compete for talent
GTCI 2017 model
• Overall index GTCI: Equal Weights (EW) & Arithmetic Averaging (AA)
• 2 sub-indices (implied weights in the overall index: 66,7% and 33,3%)
• 6 pillars (16,7% each)
• 14 sub-pillars with 3-7 indicators each (5,6% or 8,3% each)
• 65 variables (indicators)
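The percentages follow from nesting equal weights: each of the 6 pillars carries 1/6 of the index, and each sub-pillar an equal share of its pillar. The sketch below reproduces that arithmetic. The grouping of pillars into sub-indices and the number of sub-pillars per pillar are assumptions chosen to be consistent with the totals on the slide (6 pillars, 14 sub-pillars), not figures taken from it; the function name and the sub-index labels "Input" and "Output" are hypothetical.

```python
from fractions import Fraction

def nested_equal_weights(structure):
    """structure: {sub_index: {pillar: n_subpillars}}.

    Returns the weights in the overall index when every pillar gets the same
    weight and sub-pillars share their pillar's weight equally.
    """
    n_pillars = sum(len(pillars) for pillars in structure.values())
    pillar_w = Fraction(1, n_pillars)
    sub_index_w = {s: pillar_w * len(pillars) for s, pillars in structure.items()}
    sub_pillar_w = {
        pillar: pillar_w / k
        for pillars in structure.values()
        for pillar, k in pillars.items()
    }
    return sub_index_w, pillar_w, sub_pillar_w

# Assumed layout: 4 + 2 pillars, with 3 or 2 sub-pillars each (14 in total).
structure = {
    "Input": {"Enable": 3, "Attract": 2, "Grow": 3, "Retain": 2},
    "Output": {"VT_skills": 2, "GK_skills": 2},
}

sub_index_w, pillar_w, sub_pillar_w = nested_equal_weights(structure)
# pillar_w is 1/6 (about 16,7%); sub-pillars get 1/18 (about 5,6%) or 1/12 (about 8,3%)
```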
II. Weights based on PCA
• Empirical and objective option for weight selection
• Identify a small number of “averages” (PCs) that explain most of the variance observed
• Each principal component $PC_i$ is a new variable computed as a linear combination of the original (standardized) variables
• Weights are obtained from the coefficients of the first principal component (1-dim case)
$$
PC_1 = w_1 x_1 + w_2 x_2 + \dots + w_p x_p
$$
[Figure: observed indicators are reduced into components. Items X1, X2, …, Xp are combined through PCA into a single component with weights $w_1, w_2, \dots, w_p$]
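One common way to obtain these coefficients is an eigendecomposition of the correlation matrix of the indicators. The sketch below is a minimal illustration on synthetic data, not the procedure used by any particular index; the function name `pca_loadings` and the sign convention are assumptions. The loadings it returns (eigenvector times the square root of the eigenvalue) equal the Pearson correlations between each standardized indicator and each principal component.

```python
import numpy as np

def pca_loadings(X):
    """PCA on the correlation matrix of X (rows = objects, columns = indicators).

    Returns the eigenvalues in descending order and the loadings matrix, whose
    column k holds the correlations between the indicators and PC(k+1).
    """
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary; make each column sum non-negative.
    signs = np.sign(eigvecs.sum(axis=0))
    signs[signs == 0] = 1.0
    eigvecs = eigvecs * signs
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, loadings

# Synthetic data (assumption): six indicators driven by one common factor.
rng = np.random.default_rng(2)
common = rng.normal(size=(150, 1))
X = common + 0.6 * rng.normal(size=(150, 6))

eigvals, loadings = pca_loadings(X)
pc1_coefficients = loadings[:, 0]  # candidate (not yet normalized) weights
```

The column sums of the squared loadings reproduce the eigenvalues, and the eigenvalues sum to p because the analysis is run on standardized variables.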
II. Weights based on PCA
• Pro: good mathematical properties, determining the set of weights which explains the largest variation in the original indicators
• (potential) Con: small weights are assigned to indicators which have little variation, irrespective of their possible related contextual importance (“Elitist index”)
• The first PC is often used as the “best” composite index
• The first PC may account for only a limited part of the variance in the data, so a considerable amount of information can be lost
II. Example PCA weights - GTCI 2017
Aggregation step           Default GTCI  Ex 1) PCA weights at pillar level  Ex 2) PCA weights at 3 levels
Indicators to sub-pillars  EW & AA       EW & AA                            PC1
Sub-pillars to pillars     EW & AA       EW & AA                            PC1
Pillars to GTCI            EW & AA       PC1                                PC1
Ex 1) PCA weights - GTCI 2017
• PC1 weights at pillar level

Total variance explained
Component  Eigenvalue  Variance  Cumulative variance
1          4,95        0,82      0,82
2          0,45        0,07      0,90
3          0,25        0,04      0,94
4          0,13        0,02      0,96
5          0,12        0,02      0,98
6          0,10        0,02      1,00

Pearson correlation coefficients between pillar and principal component
Pillar        PC1   PC2    PC3    PC4    PC5    PC6
1. Enable     0,94  0,18   0,03   -0,08  -0,10  -0,25
2. Attract    0,83  0,53   0,05   0,10   0,08   0,10
3. Grow       0,92  -0,04  -0,30  -0,22  0,05   0,09
4. Retain     0,94  -0,14  0,12   0,03   -0,25  0,13
5. VT_skills  0,90  -0,25  0,32   -0,06  0,18   0,00
6. GK_skills  0,91  -0,23  -0,21  0,25   0,06   -0,06
SS            4,95  0,45   0,25   0,13   0,12   0,10

Stopping criterion: Eigenvalue > 0,90 (or cumulative variance over 0,70)

Weights (not normalized):
$$
PC1 = 0.94\,X_{\text{Enab}} + 0.83\,X_{\text{Attr}} + 0.92\,X_{\text{Grow}} + 0.94\,X_{\text{Rtain}} + 0.90\,X_{\text{VTSki}} + 0.91\,X_{\text{GKSki}}
$$
Scaled to unity sum:
$$
PC1_{\text{norm}} = 0.17\,X_{\text{Enab}} + 0.15\,X_{\text{Attr}} + 0.17\,X_{\text{Grow}} + 0.17\,X_{\text{Rtain}} + 0.16\,X_{\text{VTSki}} + 0.17\,X_{\text{GKSki}}
$$
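The scaling to unity sum is a division of each PC1 coefficient by their total. A minimal sketch using the two-decimal PC1 correlations from the table above; because these inputs are already rounded, a recomputed weight can differ from the slide in the last digit (the slide presumably worked from unrounded values, which is an assumption).

```python
import numpy as np

pillars = ["Enable", "Attract", "Grow", "Retain", "VT_skills", "GK_skills"]

# PC1 coefficients as printed on the slide (rounded to two decimals).
pc1 = np.array([0.94, 0.83, 0.92, 0.94, 0.90, 0.91])

# Scale so that the weights sum to one.
weights = pc1 / pc1.sum()

# Largest gap between any PCA weight and the equal weight 1/6.
max_gap_to_equal = float(np.abs(weights - 1 / 6).max())

for name, w in zip(pillars, weights):
    print(f"{name:10s} {w:.3f}")
```

The normalized weights all lie within about 0.015 of 1/6, which is why the PCA-weighted index and the equally weighted one behave so similarly at pillar level.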
Ex 1 & 2) PCA weights - GTCI 2017
• Ex 1) PC1norm and the default GTCI give very similar rankings: 98 out of 118 countries keep the same rank, and the others shift by just one rank position
$$
PC1_{\text{norm}} = 0.17\,X_{\text{Enab}} + 0.15\,X_{\text{Attr}} + 0.17\,X_{\text{Grow}} + 0.17\,X_{\text{Rtain}} + 0.16\,X_{\text{VTSki}} + 0.17\,X_{\text{GKSki}}
$$
$$
GTCI = \sum_{j=1}^{6} \frac{1}{6} X_j \qquad (1/6 \approx 0.17)
$$
Select the simplest method!
• Ex 2) PC1norm at 3 levels and the default GTCI give more divergent rankings: only 28 out of 118 countries keep the same rank, with differences of up to 9 positions
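Comparisons of this kind ("how many countries keep the same rank, and how far do the others move?") can be computed directly from two score vectors. The sketch below is illustrative: the function names are hypothetical, the four scores are made up, and ties are broken by order of appearance, which is an assumption.

```python
import numpy as np

def ranks(scores):
    """Rank 1 = highest score; ties are broken by order of appearance."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    r = np.empty(len(scores), dtype=int)
    r[order] = np.arange(1, len(scores) + 1)
    return r

def compare_rankings(scores_a, scores_b):
    """Return (number of objects with identical rank, largest rank shift)."""
    shift = np.abs(ranks(scores_a) - ranks(scores_b))
    return int((shift == 0).sum()), int(shift.max())

# Made-up scores for four countries under two weighting schemes.
default_scores = [70.0, 65.0, 60.0, 55.0]
pca_scores = [69.0, 66.0, 54.0, 56.0]

same_rank, max_shift = compare_rankings(default_scores, pca_scores)
```

In this toy case two countries keep their rank and the other two swap places, so the largest shift is one position.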
Multiple principal components
• What if there are multiple principal components?
• 2 PCs are included with the current stopping criterion
• Fairly low correlations between PC2 and the 6 pillars

Total variance explained
Component  Eigenvalue  Variance  Cumulative variance
1          4,18        0,6973    0,6973
2          0,91        0,1512    0,8486
3          0,42        0,0700    0,9186
4          0,24        0,0400    0,9586
5          0,19        0,0317    0,9902
6          0,06        0,0098    1,0000

Pearson correlation coefficients between pillar and PC
Pillar        PC1   PC2
1. Enable     0,85  0,44
2. Attract    0,84  0,46
3. Grow       0,82  0,27
4. Retain     0,84  0,41
5. VT_skills  0,83  0,26
6. GK_skills  0,83  0,44
SS            4,18  0,91

Stopping criterion: Eigenvalue > 0,90 (or cumulative variance over 0,70)
Multiple principal components

Total variance explained
Component  Eigenvalue  Variance  Cumulative variance
1          4,18        0,6973    0,6973
2          0,91        0,1512    0,8486
3          0,42        0,0700    0,9186
4          0,24        0,0400    0,9586
5          0,19        0,0317    0,9902
6          0,06        0,0098    1,0000

Pearson correlation coefficients between pillar and PC
Pillar        PC1
1. Enable     0,85
2. Attract    0,84
3. Grow       0,82
4. Retain     0,84
5. VT_skills  0,83
6. GK_skills  0,83
SS            4,18

Stopping criterion: Eigenvalue > 1

Recommendation:
• Change stopping criterion
• Reduce the threshold to get a single PC
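The effect of the stopping criterion can be checked directly on the eigenvalues from the two tables. The sketch below treats the eigenvalue rule and the cumulative-variance rule as two separate functions; how the slides combine them ("or") is not spelled out, so that separation, like the function names, is an assumption. The cumulative rule assumes a threshold below 1.

```python
import numpy as np

def retained_by_eigenvalue(eigenvalues, threshold):
    """Number of components whose eigenvalue exceeds the threshold."""
    return int((np.asarray(eigenvalues, dtype=float) > threshold).sum())

def retained_by_cumulative_variance(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative share of
    variance exceeds the threshold (threshold assumed to be below 1)."""
    ev = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(ev) / ev.sum()
    return int(np.argmax(cumulative > threshold) + 1)

# Eigenvalues as printed on the slides (rounded to two decimals).
ev_example_1 = [4.95, 0.45, 0.25, 0.13, 0.12, 0.10]   # Ex 1) pillar-level table
ev_multiple = [4.18, 0.91, 0.42, 0.24, 0.19, 0.06]    # "Multiple principal components" table

n_090 = retained_by_eigenvalue(ev_multiple, 0.90)            # 2 components
n_100 = retained_by_eigenvalue(ev_multiple, 1.00)            # 1 component
n_cum = retained_by_cumulative_variance(ev_multiple, 0.70)   # 2 components
```

With these eigenvalues, an eigenvalue threshold of 1 leaves a single component, as does a cumulative-variance threshold slightly below 0,70, since PC1 alone explains just under 70% of the variance.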
Concluding words
• Different weighting systems imply different results/rankings
• No explicit weighting method is the norm
• Important to check the choice of weighting method (uncertainty and sensitivity analysis)
THANK YOU
COIN in the EU Science Hub: https://ec.europa.eu/jrc/en/coin
COIN tools are available at: https://composite-indicators.jrc.ec.europa.eu/
The European Commission’s Competence Centre on Composite Indicators and Scoreboards