lecture 13 finalpubdocs.worldbank.org/en/940961593193977328/lecture-13... · LECTURE 13 1 1...
Transcript of lecture 13 finalpubdocs.worldbank.org/en/940961593193977328/lecture-13... · LECTURE 13 1 1...
1
Training
Measuring inequalityLECTURE 13
1
1
Training
Outline for final lectures
§ Once datasets have been finalized, it is time to produce results, with the aim of representing the patterns emerging from the data.
§ In practice?
§ Inequality
§ Poverty
§ Basic summary statistics on household demographics, education, access to services, etc.
§ Average expenditures and incomes2
this lecture
next lecture
final lecture
2
Training
Inequality and poverty measurement
1) a measure of living standards
2) high-quality data on households’ living standards
3) a distribution of living standards (inequality)
4) a critical level (a poverty line) below which individuals are classified as “poor”
5) one or more poverty measures
living standard
persons
3
3
2
Training
Cowell (2011)
99.9% of this lecture is explained with better words in Cowell’s work: this book and other (countless) journal articles
4
Training
Warning
§ During the course we stressed the distinction between the concepts of living standard, income, expenditure, consumption, etc.
§ In this lecture we make an exception, and use these terms interchangeably
§ Similarly, I will not make a distinction between income per household, per capita, or per adult equivalent
§ For once, and for today only, we will be (occasionally) inconsistent
5
5
Training
Focus on the term 'inequality'
§ “When we say income inequality, we mean simply differences in income, without regard to their desirability as a system of reward or undesirability as a scheme running counter to some ideal of equality” (Kuznets 1953: xxvii)
§ In practice, how can we appraise the inequality of a given income distribution? Three main options:
① Tables
② Graphs
③ Summary statistics
8
3
Training
Tables: an assessment
§ In general, tables are not recommended when the focus is inequality
§Difficult to get a clue of the extent of inequality in the distribution by looking at a table.
§ To illustrate, let us look at the the distribution of income and taxable income in South Africa.
9
9
Training
Source: 2018 Tax statistics, National Treasury and the South African Revenue Service
Can you tell how high or low inequality is?
10
Training
Another candidate: graphs
Do graphs (diagrams) helprepresent and understand inequality?
11
11
4
Training
Histograms
§ Let the interval [𝑥#, 𝑥%] denote the range of the data.
§ Partition [𝑥#, 𝑥%] into 𝑚∗ non-overlapping bins (intervals) of equal width h = (𝑥% −𝑥#)/𝑚∗.
§ A histogram estimate of the density 𝑓(𝑥) is the fraction of observations falling in the bin containing 𝑥, divided by the bin width h:
0𝑓 𝑥 =(fraction of sample obs. in same bin as x)
ℎ
§ The area of each bar (= ℎ× 0𝑓 𝑥 ) is interpreted as the fraction of sample observations within the bin. All bar areas sum up to unity.
12
12
Training
Histograms?Note: extreme values (top 1%) have been removed
0
5.0e-05
1.0e-04
1.5e-04
2.0e-04
Den
sity
0 5,000 10,000 15,000 20,000Per capita expenditure
(2015 Rupees/person/month)
10 bins
0
1.0e-04
2.0e-04
3.0e-04
Den
sity
0 5,000 10,000 15,000 20,000Per capita expenditure
(2015 Rupees/person/month)
80 bins
0
1.0e-04
2.0e-04
3.0e-04
Den
sity
0 5,000 10,000 15,000 20,000Per capita expenditure
(2015 Rupees/person/month)
240 bins10
bins80
bins240 bins
13
Training
Histograms: an assessment
§ The position and number of bins is arbitrary
§ Inherently lumpy: discontinuities at the edge of each bin
§ Can provide very different pictures of the same distribution
§ Look of graphs highly dependent on treatment of extreme values
§ Read Cowell, Jenkins and Litchfield (1996) for more.
14
5
Training
Beyond histograms
§ A probability density function (PDF) is the ‘continuous version’ of a histogram
§ A convenient way to introduce the PDF is by starting from the cumulative distribution function (CDF)
15
15
Training
The cumulative distribution function (CDF)
§ The cumulative distribution function (CDF) of a random variable 𝑋 is defined as follows:
𝐹 𝑥∗ = EF
G∗
𝑓 𝑥 𝑑𝑥
§ 𝐹 𝑥∗ is the proportion of individuals having 𝑋 less than or equal to 𝑥∗ .
§ If 𝑋 is income and, say, 𝑥∗ = 2,000 Rps., then 𝐹 𝑥∗ = Pr(𝑋 < 2,000), that is the fraction of people with less than 2,000 Rps.
16
Training
The empirical cumulative distribution function (ECDF)Iraq IHSES 2007/2012., Cumulative distribution of welfare aggregate (p.21)
§ Pick any income level on the 𝑥 -axis, and the curve 𝐹 𝑥 will tell you the percentage of individuals in the population having a level of income lower than 𝑥.
§ The CDF is more effective in telling you about the incidence of poverty than about inequality.
17
IRAQ, The Unfulfilled Promise of O il and GrowthPoverty, Inclusion and Welfare in Iraq, 2007-2012
Volume 1: Main Report
17
6
Training
The probability density function (PDF)
§ The probability distribution function (PDF) is the derivative of the CDF:
𝑓 𝑥 =𝑑𝐹(𝑥)𝑑𝑥
§ Simply, the pdf describes the likelihood that a random variable 𝑋 takes on a given value.
§ Note: for continuous random variables, the probability that 𝑋 takes on any particular value 𝑥 is 0, the probability being defined only over (a,b) intervals.
18
Training
Iraq, 2007/12PDF of welfare aggregate (p.21)
§Does the PDF work any better than the CDF when describing inequality?
§ Probably, yes.What do you think?
IRAQ, The Unfulfilled Promise of O il and GrowthPoverty, Inclusion and Welfare in Iraq, 2007-2012
Volume 1: Main Report
21
Training
PDFs: an assessment
§ Similar drawbacks as histograms
§ The bandwidth (how “smooth” the graph is) is arbitrary
§ In most cases, PDFs require some trimming of top values to avoid looking “squished” and being unreadable
§ In general, the PDF is not very effective in showing what is going on in the upper tail, but that is important information when focusing on inequality
23
7
Training
The quantile function
§ Let p = F(x) be the proportion of people in the population with income lower than x.
§ The quantile function Q(p) is defined as:
§ Q(p) is the income level below which we find a proportion p of the population.
F Q p( )!" #$= p or Q p( ) = F−1 p( )inverse cdfcdf
Source: Haughton and Khandker (2009).
Pen’s Parade (Quantile Function) for Expenditure per Capita, Vietnam, 1998
24
Training
0
.2
.4
.6
.8
1
Prop
ortio
n of
tota
l exp
endi
ture
(cum
ulat
ive,
%)
0 .2 .4 .6 .8 1Proportion of population (cumulative, %)
Equality
2015
The Lorenz Curve (1905)Picture and intuition
§ Horizontal axis: cumulative % of population(individuals ordered poorest to the richest)
§ Vertical axis: cumulative % of income received by each cumulative % of population.
§ 45-degree line: Lorenz curve if perfect equality.
§ The overall distance between the 45-degree line and the Lorenz curve is indicative of the amount of inequality present in the population.
26
Training
The Lorenz Curve (1905)Mathematically
§ The Lorenz curve L(p) is defined as follows:
𝐿 𝑝 = ∫PQ R S TU
∫PV R S TU
§ The numerator sums the incomes of the poorest p% of the population;
§ The denominator sums the incomes of all.
§ The ratio L(p) indicates the cumulative % of total income held by a cumulative proportion p of the population.
§ Example: if L(0.5) = 0.3, then we know that the 50% poorest individuals hold 30%of the total income in the population.
27
8
Training
Quantile function and Lorenz curve: an assessment
§ These graphical tools emphasize the ranking of shares of the population on the basis of income
§ The Lorenz curve clearly shows how far the distribution is from perfect equality
§ Still, no graph is as straightforward and easily comparable as a scalar measure of inequality
28
28
Training
Recap and next steps
§ Not all graphs are OK to represent inequality
§ Lorenz curve is the most popular
§ A better conceptual understanding comes from constructing inequality measures from first principles.
§ The most straightforward approach: inequality measures as pure statistical measures of dispersion.
29
Training
Inequality indicators
30
30
9
Training
Measures of dispersion
§ Range R = xXYZ − xX[\
§ What do you think?
§ Variance σ^ = _\∑[a_
\ (x[ − µ)^
§ Coefficient of Variation 𝐶𝑉 = efg
§ …
31
Training
Quantiles, Quintiles, Quartiles, …
§ Take a group of observations for variable 𝑥 (e.g. 𝑥 = expenditure)
§ Order observations from smallest 𝑥 to largest 𝑥 (poor to rich)
§ Partition the set of observations into 𝑛 subsets of equal size
§ 𝑄_, 𝑄^, … , 𝑄j#_ are values of 𝑥 that identify the cut points between the 𝑛 subsamples, and are called quantiles.
§ Some quantiles have special names
32
Training
Example with n = 4
25% of data 25% of data 25% of data 25% of data
lowest x highest x
𝑄_ 𝑄^ 𝑄k
also called first quartile
or p25
also called second quartile
or p50or median
also called third quartile
or p75
Note that p indicates percentiles, that is, quantiles when n = 100
33
10
Training
Quantile ratios
§ A quantile ratio measures the gap between the rich and the poor.
§ It is defined as the ratio of two quantiles, Q(p2)=Q(p1) using percentiles p1and p2.
§ Three popular indices are:
§ The quintile ratio (p2 = 80 and p1=20):
QR = Q(p80)/Q(p20)
§ the decile ratio (p2 = 90 and p1=10):
DR = Q(p90)/Q(p10)
34
Training
The decile ratioper adult equivalent income (OECD def.)
2.9
25.6
6.34.9
4.2
35
Training
Quantile share ratios
§ Let S20 denote the share of (equivalised disposable) income received by the bottom 20% of the population, and S80 the income share received by the top 20% of the population.
§ The quintile share ratio is defined as follows:S80-20 = S80/S20
§ The quintile share ratio is the level-1 Laeken indicator, chosen by the EU to monitor income distribution.
36
11
Training
QSR around the world
37Source: (WDI) Income share held by highest 20% over Income share held by lowest 20%, last available 2010-2017
37
Training
S80/S20 in Sub-Saharan Africa
38
010
2030
Burkina
Faso
Maurita
nia
Sierra
Leon
eNige
r
Guinea
Gambia
, The
Liberi
a
Tanza
nia
Burund
i
Ethiop
iaGab
onKen
ya
Seneg
al
Malawi
Ugand
a
Cote d'I
voire
Rwanda
Zimba
bwe
Congo
, Dem
. Rep
.Tog
oCha
dGha
na
Camero
on
Guinea
-Bissau
Congo
, Rep
.
Mozam
bique
Botswan
aBen
in
Leso
tho
Zambia
Namibia
South
Africa
28
5
Source: (WDI) Income share held by highest 20% over Income share held by lowest 20%, last available 2010-2017
38
Training
The Gini CoefficientA definition
§ Yitzhaki (1997) counts more than a dozen formulas available for the Gini index.
§ A classic definition of the Gini coefficient:
§ The Gini coefficient ranges from 0 (all recipients have the same income: maximum equality), to 100 (all income is received by one recipient: maximum inequality).
G =1
2n2µxi − x j
j=1
n
∑i=1
n
∑
39
12
Training
The Gini CoefficientInterpretation – Farris 2010: 857
§ Consider the following experiment.
§ Pick two households at random and record the lower of their two incomes: call the result 𝑦 . Let µ denote the mean income. It turns out that:
§m
g= 1 − 𝐺
§ Assuming that the Gini index is 40%, we can conclude that the lower of two family incomes, chosen at random, is about 60% (=100-40%) of the national mean: on average, the poorest of any two families earns only about 60% of the national mean.
40
Training
The Gini CoefficientGraphical interpretation
§ The Gini index is two times the area A between the Lorenz curve and the equality diagonal:
𝐺𝑖𝑛𝑖 =𝐴
(𝐴 + 𝐵)= 2𝐴
= 2 _^− 𝐵 = 1 − 2𝐵
41
Training
Gini around the world
42Source: (WDI), GINI index (World Bank estimate) last available year 2010-2017
42
13
Training
Gini in Sub-Saharan Africa
43
020
4060
Maurita
nia
Guinea
Sierra
Leon
eNige
r
Liberi
a
Burkina
Faso
Gambia
, The
Tanza
niaGab
on
Burund
i
Ethiop
ia
Seneg
al
Kenya
Cote d'I
voire
Congo
, Dem
. Rep
.
Ugand
aTog
o
Zimba
bwe
ChadGha
na
Rwanda
Malawi
Camero
onBen
in
Congo
, Rep
.
Guinea
-Bissau
Botswan
a
Mozam
bique
Leso
tho
Zambia
Namibia
South
Africa
63%
33%
Source: (WDI), GINI index (World Bank estimate) last available year 2010-2017
43
Training
Gini around the WorldA selection of countries
44Source: (WDI), GINI index (World Bank estimate) last available year 2010-2017
Min
Max
44
Training
Recap
§Quantile ratios, quantile share ratios, Gini, are all popular inequality measures
§ They do a fine job at representing inequality with a number
§ Problemthey do not always have all the properties that we would want for an inequality measure
§ Solutionsolve the problem backwards. First lay out some desirable properties, then construct a measure that complies with them
46
46
14
Training
Deriving inequality measures from axioms
§ Axiom: a statement accepted as true as the basis for argument or inference.
§ The axiomatic approach allows us to “custom-build” inequality measures that fit our needs:
1. We define a set of elementary properties (axioms) that we think inequality measures ought to have
2. We obtain a mathematical formula that delivers a class of inequality measures satisfying the axioms
47
Training
Five axioms of inequality measures
A. Anonymity (or Symmetry)Who is earning the income does not matter
B. The Population PrinciplePopulation size does not matter
C. Scale Invariance (or Relative Income Principle)Income levels do not matter
D. The (Pigou-Dalton) Principle of TransfersRank-preserving rich-to-poor transfers reduce inequality
E. Decomposability (or Subgroup Consistency)The measure is additively decomposable
48
48
Training
Generalized Entropy Indices (GEI)Shorrocks (1980)
§ Inequality measures that satisfy all axioms (A to E), must have the following form:
§𝐺𝐸 𝜃 = _vf#v
_j∑wa_j Gx
G
v− 1
§where 𝜃 is a parameter that may be given any value (positive, zero or negative).
§Depending on the value assigned to 𝜃, different indices are obtained.
53
15
Training
The family of Generalized Entropy Indices
§ If 𝜃 = 0Mean Logarithmic Deviation: 𝐺𝐸 0 = 𝑀𝐿𝐷 = _
j∑wa_j log G
Gx
§ If 𝜃 = 1Theil Index: 𝐺𝐸 1 = 𝑇𝐻𝐸𝐼𝐿 = _
j∑wa_j Gx
G logGxG
§ If 𝜃 = 2Half Coeff. of Variation Squared: 𝐺𝐸 2 = 𝒦f
^
§ The case of Uganda illustrates
54
Training
UgandaNational Household Survey 2012/2013
55
55
Training
§ Inequality decompositions are typically used to estimate the extent to which the heterogeneity of the population affects overall inequality. Two popular techniques are:1. Decomposition by population sub-group
2. Decomposition by income source
§ We focus on the former:§ Societies can often be partitioned into groups (e.g. North-South). We would like to be able to
decompose total inequality into two components, namely the inequality within the constituent groups, and inequality between the groups:
𝐼����� = 𝐼������ + 𝐼�������
Inequality decomposition
56
16
Training
Rwanda 2016/17Fifth Integrated Household Living Conditions Survey, EICV5 (2016/17)
58
58
Training
Lessons learned
61
§Many ways to describe inequality, some more effective than others
§Graphs: most notable are quantile functions and Lorenz curves
§Measures: different inequality measures lead to different results
§ The Gini index is an extremely popular measure - we recommend its use.
§ The GEI (generalized entropy indices), and in particular the MLD(mean log deviation), are recommended because of their theoretical properties.
61
Training
ReferencesRequired readings
Cowell, F. (2011). Measuring inequality. OxfordUniversity Press. (Chapter 1 & 2)
Suggested readingsAtkinson, A. B. (1970). On the measurement of inequality.Journal of economic theory, 2(3), 244-263.Cowell, F., Jenkins, S. P., and Litchfield, J. A. (1996). Thechanging shape of the UK income distribution: kernel densityestimates (pp. 49-75). Cambridge University Press.Farris, F. A. (2010). The Gini index and measures ofinequality. The American Mathematical Monthly, 117(10),851-864.Haughton and Khandker (2009). Handbook on Povertyand Inequality, Chapter 6.
Pen, J. (1971). Income Distribution: facts, theories, policies.Praeger.Pyatt, G. (1976). On the interpretation and disaggregation ofGini coefficients. The Economic Journal, 86(342), 243-255.Shorrocks, A. F. (1980). The class of additively decomposableinequality measures. Econometrica: Journal of theEconometric Society, 613-625.
62
17
Training
Thank you for your attention
63
63
Training
Homework
64
64
Training
Exercise 1 – Engaging with the literature
Considering equations (10) to (12) in Farris (2010) give a brief interpretation of a Gini index of 63% for South Africa
65
18
Training
Exercise 2 - Inequality in South Asia
§ Turn to page 2 of this report (see next slide)
§What criticisms would you make to this chart?
67
Training
Note: Orange and light brown bars indicate countries where inequality is estimated based on consumption per capita. Light blue bars indicate countries with estimates based on income per capita.
68
Training
Exercise 3 - Functional vs Personal distribution of income
§ The nature of the relationship that links the evolution of income shares to income inequality is complex and still widely debated among researchers.
§ In that context, comment on Figure 19 of the ILO Global Wage Report 2016/2017.
69
69