www.lisdatacenter.org Joint World Bank-LIS Workshop on database creation and survey harmonization Thursday, June 6, 2013


www.lisdatacenter.org

Joint World Bank-LIS Workshop on database creation and survey harmonization

 Thursday, June 6, 2013


LIS: an overview

LIS: Cross-National Data Center
• parent organization
• located in Luxembourg
• independent, chartered non-profit organization
• cross-national, participatory governance
• acquires, harmonizes, and disseminates data for research
• venue for research, conferences, and user training
• staff: approximately 10 persons

LIS Center @ CUNY
• satellite office
• located at the Graduate Center of the City University of New York
• administrative, managerial, development support to parent office
• venue for research, teaching, and graduate student supervision
• staff: approximately 10 persons (mostly part-time PhD students)


History

• LIS was founded in 1983 by two US academics (Tim Smeeding and Lee Rainwater) and a team of multi-disciplinary researchers in Europe. It began as a “study”, which later grew and was institutionalized as “LIS”.

• For nearly 20 years, LIS was part of a local research institute, CEPS (Centre d'Etudes de Populations, de Pauvreté et de Politiques Socio-Economiques). In 2002, LIS became an independent non-profit institution.

• LIS is supported by the Luxembourg government, by the national science foundations and other funders in many of the participating countries, and by several supranational organizations.

• We are building a growing partnership with the new University of Luxembourg.


Our mission

To enable, facilitate, promote, and conduct cross-national comparative research on socio-economic outcomes and on the institutional factors that shape those outcomes.


What we do

Step 1. We identify appropriate datasets.

Data must be neutral, reliable, and high-quality.

Step 2. We negotiate with each data provider.

Step 3. We collect, harmonize and document the data.

LIS’ data experts harmonize the data into a common, cross-national template, and create comprehensive documentation. (Teresa will discuss.)

Step 4. We double-check the harmonized data.

Step 5. We make the data available to researchers via remote execution, and other user-friendly pathways.

(Thierry will discuss.)


LIS and LWS Databases

Luxembourg Income Study Database (LIS)
• First and largest available database of harmonized income data, available at the household and person levels
• In existence since 1983
• Data mostly start in 1980, some go back to the 1960s (recollected every 3-5 years)
• 45 countries
• 205 datasets
• Used to study: poverty; income inequality; labor market outcomes; policy effects

Luxembourg Wealth Study Database (LWS)
• First available database of harmonized wealth data, available at the household level
• In existence since 2007
• Data going back to 1994
• 12 countries
• 20 datasets (planned expansion underway)
• Used to study: household assets, debt, and expenditures; wealth portfolios; policy effects


Pathways to the data


Remote-execution system (“LISSY”)

This is the primary means of access; it uses a software system that was designed specifically for LIS.

Researchers write programs (in SPSS, SAS, or Stata) and send them to the LIS server; results are returned to the researcher, with an average processing time of under two minutes.


Two other pathways to the LIS data

Web-based tabulator (“the WebTab”)

LIS Key Figures (no registration needed)


Current coverage: 62% of world population, 84% of world GDP

Current axis of growth: middle-income countries (now 17 out of 47 countries)

Australia, Austria, Belgium, Brazil, Canada, Chile *, China, Colombia, Cyprus, Czech Republic,
Denmark, Dominican Republic *, Egypt *, Estonia, Finland, France, Germany, Greece, Guatemala, Hungary,
India, Ireland, Israel, Italy, Japan, Luxembourg, Mexico, Netherlands, Norway, Panama *,
Paraguay *, Peru, Poland, Romania, Russia, Serbia *, Slovak Republic, Slovenia, South Africa, South Korea,
Spain, Sweden, Switzerland, Taiwan, United Kingdom, United States, Uruguay


Our leadership

Janet Gornick
Director of LIS | Director of LIS Center (CUNY)
Professor of Political Science and Sociology, Graduate Center, City University of New York

Markus Jäntti
Research Director of LIS
Professor of Economics, Stockholm University

Tony Atkinson
President of LIS Board
Economist at Nuffield College, Oxford University

Serge Allegrezza
President of LIS Local Advisory Board
Director of Luxembourg National Statistical Office

We are governed by an elected Executive Committee and an international Board, comprising representatives from our funders and data providers.


LIS’ partners

Our partners include data providers, data users, and funders, in more than 40 countries, and in major supranational organizations, including:

Financial contributors:
• The World Bank (WB)
• The Organization for Economic Cooperation and Development (OECD)
• The International Monetary Fund (IMF)
• The United Nations Development Program (UNDP)

Dataset exchange; joint research projects; joint fundraising:
• The European Central Bank (ECB)
• The United Nations Children’s Fund (UNICEF)
• EUROMOD
• Harvard Population Center


Users, products, services

Thousands of data users - and growing
• remote execution enables use around the world
• free access for students in all countries
• free access for data providers and their staffs

Pedagogical activities
• annual training workshops in Luxembourg
• local workshops
• self-teaching lessons online

Research activities and support
• visiting scholar program
• working paper series (600+)
• research conferences
• edited books (new one coming in July!)


Research using the LIS and LWS data: some highlights


LIS provides evidence for comparative research on socio-economic outcomes
• assessing income inequality
• measuring poverty
• comparing employment outcomes
• analyzing assets and debt
• researching policy impacts


Assessing Income Inequality: Inequality Across Households

Income inequality in the US is the highest among 25 high-income countries included in the LIS Database.

[Figure: bar chart by country. Axis: Inequality Indicator: Gini Index (0.00 to 0.40). Countries in the order shown: Denmark, Slovenia, Sweden, Slovak Republic, Finland, Norway, Czech Republic, Netherlands, Switzerland, Luxembourg, Belgium, France, Hungary, Germany, Taiwan, Korea, Austria, Ireland, Poland, Spain, Canada, Greece, Italy, United Kingdom, United States.]

Source: Luxembourg Income Study Key Figures (publicly available online – www.lisdatacenter.org).
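The Gini index used as the inequality indicator on this slide can be illustrated with a short Python sketch. This is a generic textbook computation (one minus twice the area under the Lorenz curve), not the LIS Key Figures code; the function name `gini` and the optional survey weights are assumptions made for illustration, and any equivalisation or top- and bottom-coding applied by LIS is left out.

```python
def gini(incomes, weights=None):
    """Gini index of non-negative incomes, optionally weighted.

    Generic illustration only: computes 1 - 2 * (area under the Lorenz curve)
    with the trapezoid rule. Not the LIS Key Figures implementation.
    """
    if weights is None:
        weights = [1.0] * len(incomes)
    if len(incomes) != len(weights) or not incomes:
        raise ValueError("incomes and weights must be non-empty and equal in length")

    # Sort units from poorest to richest.
    pairs = sorted(zip(incomes, weights))
    total_weight = sum(w for _, w in pairs)
    total_income = sum(y * w for y, w in pairs)
    if total_weight <= 0 or total_income <= 0:
        raise ValueError("total weight and total income must be positive")

    cumulative_income = 0.0
    area = 0.0
    for y, w in pairs:
        previous = cumulative_income
        cumulative_income += y * w
        # Trapezoid between consecutive points of the Lorenz curve.
        area += (w / total_weight) * (previous + cumulative_income) / (2.0 * total_income)
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    print(gini([1, 1, 1, 1]))   # perfectly equal incomes
    print(gini([0, 0, 0, 10]))  # one household holds all income
```

A value of 0 means all incomes are equal; with a finite sample, the maximum is (n - 1) / n, which approaches 1 as n grows.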


Measuring Poverty - I: Household Poverty Rates

The poverty rate in the US is the highest among 25 high-income countries included in the LIS Database.

[Figure: bar chart by country. Axis: Poverty Rate (50% of median disposable household income), 0 to 18. Countries in the order shown: Denmark, Sweden, Czech Republic, Netherlands, Finland, Norway, Slovenia, Hungary, Slovak Republic, Switzerland, Belgium, France, Luxembourg, Germany, Taiwan, Poland, United Kingdom, Greece, Italy, Austria, Canada, Ireland, Korea, Spain, United States.]

Source: Luxembourg Income Study Key Figures (publicly available online – www.lisdatacenter.org).
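The relative poverty measure named on the chart axis (income below 50% of median disposable household income) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the optional weights are hypothetical, and steps such as equivalising income by household size, which the slide does not describe, are omitted.

```python
def weighted_median(values, weights):
    """Smallest value at which cumulative weight reaches half of the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("weights must sum to a positive number")


def poverty_rate(incomes, weights=None, fraction=0.5):
    """Percent of units with income below `fraction` of the median income."""
    if not incomes:
        raise ValueError("incomes must not be empty")
    if weights is None:
        weights = [1.0] * len(incomes)
    poverty_line = fraction * weighted_median(incomes, weights)
    poor_weight = sum(w for y, w in zip(incomes, weights) if y < poverty_line)
    return 100.0 * poor_weight / sum(weights)


if __name__ == "__main__":
    # Median is 30, so the poverty line is 15; one of five households is below it.
    print(poverty_rate([10, 20, 30, 40, 100]))
```

Because the line moves with the median, this is a relative measure: raising every income by the same factor leaves the rate unchanged.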


Measuring Poverty - II: “Real Income Levels” of Children

US children: the rich are richer, and the poor are poorer.

Source: Timothy Smeeding and Lee Rainwater. 2002. Comparing Living Standards Across Nations: Real Incomes at the Top, the Bottom and the Middle, LIS Working Paper 266.

As Percent of Low US Child Income (axis: 0 to 180)

United Kingdom    89
United States    100
Australia        103
Germany          114
Netherlands      120
Belgium          126
Canada           126
France           126
Finland          131
Denmark          137
Sweden           137
Switzerland      146
Norway           157

As Percent of High US Child Income (axis: 0 to 120)

Sweden            54
Netherlands       61
Denmark           63
Germany           68
Australia         69
Norway            70
United Kingdom    71
Belgium           71
Finland           76
France            77
Canada            87
Switzerland       92
United States    100


Comparing Employment Outcomes: Earnings Equality between Women and Men

Earnings equality between working men and women ranks 18th among 25 high-income countries in the LIS Database.

Source: Luxembourg Income Study Key Figures (publicly available online – www.lisdatacenter.org).

[Figure: bar chart by country. Axis: Ratio of Women’s Earnings to Men’s Earnings (0.0 to 1.0). Countries in the order shown: Switzerland, France, Spain, Hungary, Taiwan, Germany, Sweden, Finland, Austria, Ireland, Slovenia, United Kingdom, Luxembourg, Denmark, Netherlands, Belgium, Canada, Slovak Republic, United States, Korea, Czech Republic, Poland, Greece, Norway, Italy.]


Analyzing Assets and Debt: Older Women’s Income and Asset Poverty

In the US, 27% of older women are both income poor and asset poor – a higher share than among older women in several other countries.

[Figure: stacked bar chart (0% to 100%) for United States, Finland, Germany, Italy, Sweden, United Kingdom. Segments: Neither Income nor Asset Poor; Income Poor, NOT Asset Poor; Income Poor AND Asset Poor; Asset Poor, NOT Income Poor.]

Totals labelled on the chart, in country order:

                  Income Poor   Asset Poor
United States     39%           45%
Finland           16%           64%
Germany           18%           55%
Italy             19%           52%
Sweden            20%           39%
United Kingdom    26%           56%

Source: Gornick, Janet C., et al. 2009. “The Income and Wealth Packages of Older Women in Cross-National Perspective.” Journal of Gerontology: Social Sciences 64B(3): 402-414.
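The four categories in the chart legend come from crossing two yes/no indicators, income poverty and asset poverty. The sketch below shows that cross-classification in Python. The thresholds `income_line` and `asset_line` and the function names are hypothetical; the slide does not say how either poverty line is defined, so the cited article should be consulted for the actual definitions.

```python
from collections import Counter

CATEGORIES = (
    "Neither Income nor Asset Poor",
    "Income Poor, NOT Asset Poor",
    "Income Poor AND Asset Poor",
    "Asset Poor, NOT Income Poor",
)


def classify(income, assets, income_line, asset_line):
    """Assign one person to one of the four legend categories."""
    income_poor = income < income_line   # hypothetical threshold
    asset_poor = assets < asset_line     # hypothetical threshold
    if income_poor and asset_poor:
        return "Income Poor AND Asset Poor"
    if income_poor:
        return "Income Poor, NOT Asset Poor"
    if asset_poor:
        return "Asset Poor, NOT Income Poor"
    return "Neither Income nor Asset Poor"


def category_shares(people, income_line, asset_line):
    """Percent of people in each category; `people` is a list of (income, assets)."""
    if not people:
        raise ValueError("people must not be empty")
    counts = Counter(classify(i, a, income_line, asset_line) for i, a in people)
    return {cat: 100.0 * counts[cat] / len(people) for cat in CATEGORIES}


if __name__ == "__main__":
    sample = [(5, 0), (5, 100), (50, 0), (50, 100)]  # made-up data
    shares = category_shares(sample, income_line=10, asset_line=10)
    for category, share in shares.items():
        print(f"{category}: {share:.0f}%")
```

The total income poverty rate is the sum of the two income-poor categories, and the total asset poverty rate is the sum of the two asset-poor categories, which is how the totals printed on the chart relate to its segments.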


Researching Policy Impacts: Income Inequality and Redistribution

The US government does less than other rich countries to reduce income inequality.

Source: Andrea Brandolini et al, 2007, Inequality in Western Democracies: Cross-Country Differences and Time Changes, LIS Working Paper 458.

Gini indices: income before taxes and transfers (market income) and after taxes and transfers (disposable income)

Country           Gini index of    Gini index of        Reduction in Gini index
                  market income    disposable income    through taxes and transfers
Denmark           42               23                   47%
Finland           38               25                   36%
Netherlands       39               25                   36%
Norway            41               25                   39%
Sweden            46               25                   45%
Czech Rep.        44               26                   41%
Germany           48               28                   43%
Romania           38               28                   27%
Switzerland       36               28                   22%
Poland            50               29                   41%
Taiwan            33               30                   9%
Canada            42               30                   28%
Australia         48               32                   34%
United Kingdom    51               34                   33%
Israel            52               35                   33%
United States     48               37                   23%
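The "reduction in Gini index through taxes and transfers" can be read as the relative drop from the market-income Gini to the disposable-income Gini. The sketch below assumes that definition; the function name is hypothetical, and the inputs 48 and 37 are illustrative values chosen to be consistent with the 23% label shown for the United States.

```python
def gini_reduction(market_gini, disposable_gini):
    """Percent reduction in the Gini index from market to disposable income.

    Assumed definition: 100 * (market - disposable) / market.
    """
    if market_gini <= 0:
        raise ValueError("market_gini must be positive")
    return 100.0 * (market_gini - disposable_gini) / market_gini


if __name__ == "__main__":
    # Illustrative inputs on a 0-100 scale.
    print(f"{gini_reduction(48, 37):.1f}%")
```

Small differences between a value computed this way and a label on the chart are expected when the plotted Gini values are rounded to whole numbers.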


Linking LIS Data with Other Data: Income Inequality and Earnings Mobility

Countries with higher levels of income inequality have lower levels of intergenerational economic mobility.

Source: OECD 2008. Growing Unequal: Income Distribution and Poverty in OECD Countries. Paris: OECD.

[Figure: chart relating income inequality to earnings mobility. Axis label: Income inequality (from LIS).]


Harmonisation


Data harmonisation at LIS: an overview

Harmonisation

The origins of the LIS data

The harmonisation process

The final output: LIS data


Harmonisation process in 5 steps

Step 1. Data acquisition
Get the original data and documentation

Step 2. Opening of the original data
Understand the original data and concepts

Step 3. Data harmonisation
- Conceptual: map original variables into LIS variables
- Technical: create uniform file structure and variables

Step 4. Checking of the LIS data
Check final LIS files for consistency

Step 5. Creation of LIS metadata
Create harmonised user documentation of the LIS files
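The conceptual mapping and the consistency check in the steps above can be sketched in Python. Everything named here is hypothetical: the source variable names, the target names (`labour_income`, `transfer_income`, `total_income`), and the rule that the total equals the sum of its components are illustrative assumptions, not the actual LIS template or its checks.

```python
# Hypothetical mapping from one country's original survey variables
# to harmonised target variables. Names are illustrative only.
MAPPING = {
    "labour_income": ["wages", "self_employment_profit"],
    "transfer_income": ["pension", "child_benefit"],
}


def harmonise(record, mapping):
    """Build harmonised variables by summing the mapped source variables.

    A target is None when none of its source variables is available.
    """
    harmonised = {}
    for target, sources in mapping.items():
        values = [record[s] for s in sources if record.get(s) is not None]
        harmonised[target] = sum(values) if values else None
    # Aggregate variable built from the harmonised components.
    components = [v for v in harmonised.values() if v is not None]
    harmonised["total_income"] = sum(components) if components else None
    return harmonised


def is_consistent(harmonised, tolerance=1e-9):
    """Check that the aggregate equals the sum of its available components."""
    components = [v for k, v in harmonised.items()
                  if k != "total_income" and v is not None]
    total = harmonised.get("total_income")
    if total is None:
        return not components
    return abs(total - sum(components)) <= tolerance


if __name__ == "__main__":
    original = {"wages": 20000, "self_employment_profit": None,
                "pension": 0, "child_benefit": 1500}
    result = harmonise(original, MAPPING)
    print(result, is_consistent(result))
```

Keeping the mapping as data rather than code is one possible design choice: each country and year gets its own mapping table, while the harmonising and checking routines stay the same.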


The challenges of harmonisation

Make comparable original data that are:
• from various countries: different institutional / societal setups
• over time: changes in institutions and original surveys
• household / individual level data: confidentiality issues
• from various existing datasets: output (or ex-post) harmonisation


The challenges of ex-post harmonisation

Different types/purposes of original collection instrument
• Survey versus administrative data (coverage and contents)
• Cross-sections versus panels (sample selection)

The concepts used in the original data collection are different
• Different definitions (employment definition)
• Different universes and reference periods
• Country-specific classifications (education, occupation, industry, social security benefits)

The level of detail of information collected differs
• Labor market (e.g., LFS type of survey)
• Incomes / wealth (detailed breakdown vs. overall questions)

Different statistical techniques
• Different sampling procedures (e.g., oversampling of the rich)
• Weighting procedures (self-weighted, sampling weights, etc.)
• Treatment of missing values, imputation methods


The challenges of harmonising income data

Income sources included in total household disposable income (irregular payments, non-cash incomes, imputed rents, non-taxable incomes, “informal” incomes)

Current versus annual

Net versus gross (or in between...)

Top- and bottom-coding

Level of detail (e.g., total pensions) and different aggregation (e.g. pensions by type of system versus by function)

Classification of incomes:
• Public versus private
• Social insurance versus universal versus social assistance systems


The challenges of harmonising data from middle income countries

Urban versus rural (sample composition, population coverage)

Household membership and treatment of incomes (live-in domestic servants, family members temporarily absent)

Complex households (multigenerational households, definition of head, polygamy)

Employment definition and labour market characteristics (informal employment, child labour, multiple jobs, status in employment)

Education (attended versus completed, highest level versus highest qualification)

Enlargement of income concept to in-kind incomes (consumption from own production, in-kind individual public goods, subsidies)

Classification of income:
• Employer-provided pensions and benefits (labour income, social security)
• Social insurance versus assistance versus universal benefits

Treatment of taxes


LIS golden rules for harmonisation

Set clear definitions for LIS variables
• Maximise comparability by setting clear definitions for each variable (and trying to stick to them as much as possible)
• Document very well any deviation from the general definition

Complement ease of use with flexibility of use
• Enhance user-friendliness by providing fully standardised variables (standard variables, recodes, dummies, aggregate variables)
• Allow users the flexibility to create other concepts by leaving a large amount of detailed information

Adapt the LIS template to the changing environment (over time and space)
• The 2011 template
• Backwards rerun

Overall guiding principle: COMPARABILITY


Remote Execution System


Pathway, output, and accessibility

Key Figures: ready-made indicators; publicly available
Web Tabulator: cross-national descriptive tables; registration required
LISSY System (primary pathway; programming): any advanced statistics; researchers only


The LISSY system

Remote Execution System (Version 8)
• Fully automated, running 24 hours/day and 7 days/week
• Researchers analyse microdata at their own place of work
• Statistical programs (e.g., Stata, R) automatically processed; outcomes automatically sent back

Restricted to social science research purposes only
• Micro-databases cannot be downloaded and no direct access to the data is permitted
• Users must register with LIS. LIS grants access to databases for a limited time period (1 year), renewable annually

Over 4,500 users from 55 countries ever registered
In 2012, 1015 applications (new and renewed)


Security and confidentiality

Working with LISSY
• Write, submit and view requests
• Track status of job requests
• Access and manage history of all jobs you ever submitted

55,000 jobs per year to monitor
• Security settings define an automatic scan of each incoming request
• Suspicious jobs are sent to a review queue for manual review
• All incoming jobs and outputs are stored, making it possible to trace back researchers’ job history
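The scan, review queue, and job history described above can be pictured with a small Python sketch. It is purely illustrative: the slides do not describe LISSY's actual security rules or architecture, so the pattern list, the function names, and the in-memory history below are all hypothetical stand-ins.

```python
import re

# Hypothetical rules: statistical-package commands that could print or
# write out individual records. These are NOT LISSY's real security settings.
SUSPICIOUS_PATTERNS = [r"\blist\b", r"\bbrowse\b", r"\bsave\b", r"\bexport\b"]

job_history = []  # stand-in for the stored record of all jobs


def scan_job(program_text):
    """Return the patterns that match anywhere in the submitted program."""
    return [p for p in SUSPICIOUS_PATTERNS
            if re.search(p, program_text, flags=re.IGNORECASE)]


def route_job(user, program_text):
    """Store the job, then send it to automatic processing or manual review."""
    hits = scan_job(program_text)
    decision = "review" if hits else "run"
    job_history.append({"user": user, "program": program_text,
                        "hits": hits, "decision": decision})
    return decision


if __name__ == "__main__":
    print(route_job("researcher_a", "summarize income [aweight = weight]"))
    print(route_job("researcher_b", "list household_id income in 1/10"))
    print(len(job_history), "jobs stored")
```

Routing doubtful jobs to a person rather than rejecting them outright is one way to balance the three concerns named on the slide: data providers' legal constraints, technical implementation, and researchers' needs.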

Data providers’ legal constraints

Technical implementation

Researchers’ needs


Ancillary support services

Extensive documentation is available on the LIS website
• Detailed information on original surveys, LIS variables’ content and availability, etc., allowing users to understand the context in which LIS outcomes should be analysed
• Information on how to access and work with micro-data:
  – Data accreditation (access, confidentiality rules…)
  – Data access system (how-to and FAQ sections)
  – Learning materials (self-teaching packages…)

Support
• Support facilities as a means to improve researchers’ ability to work with LISSY and to reduce risks of breaching confidentiality rules
• User support (500 emails per year) and training sessions through workshops


Challenges still to face

• Challenges to face include revising the LIS databases’ documentation system by supplying a new metadata system that will allow LIS users to create tailored documentation extracts fitted to their individual needs

• The key objective to work on: constantly adjusting the microdata access services to fulfill researchers’ needs while maintaining the same level of security and communication


Ideas for afternoon discussion

Possible collaborative activities:
• Exchange of information and expertise regarding dataset selection/acquisition; harmonisation; micro-simulation/imputation; design and construction of metadata (etc.)
• Joint data harmonisation opportunities?
• Joint research opportunities?
• Joint fundraising opportunities?
• Any other possibilities that arise!


Thank You

Janet Gornick, Teresa Munzi, Thierry Kruten

www.lisdatacenter.org