Essays on the measurement and analysis of educational and skill … · 2013-09-03 · Leerzeile...

Essays on the measurement and analysis of educational and skill inequalities

Inaugural dissertation

for the award of the academic degree of Doctor

of Economics and Social Sciences

(Dr. rer. pol.)

at the Friedrich-Alexander-Universität Erlangen-Nürnberg

submitted by: Dipl.-Volksw. Manfred Antoni
from Arad, Romania

First reviewer: Prof. Dr. Claus Schnabel
Second reviewer: Prof. Dr. Lutz Bellmann
Final examination: 13 November 2012

Acknowledgements

First of all, I would like to express my gratitude to my supervisor, Prof. Claus Schnabel. Since before I even started working on this thesis, he never failed to inspire ideas, provide critical counsel or respond readily to any questions or issues that arose during the last few years. I am also indebted to my co-supervisor, Prof. Lutz Bellmann, who always had an open ear for my ideas and provided valuable advice throughout my whole doctoral studies.

This work was supported by the Graduate Programme of the Institute for Employment Research (IAB) and the School of Business and Economics of the University of Erlangen-Nuremberg (GradAB). I am grateful for the programme’s generous funding, the comprehensive learning opportunities it provided, for the cooperative and encouraging atmosphere among its participants and for the help and advice I received from Sandra Huber, the coordinator and kind soul of the programme.

Special thanks go to each of the department and project heads I had or still have the pleasure of working with as a researcher at the IAB. They granted me all the time for my dissertation my other responsibilities permitted and fostered my work with advice and critical comments on a broad range of topics. One of them, Prof. Guido Heineck, was daring enough to co-author an essay with me, which is now part of this dissertation.

Additionally, I would like to thank current and former colleagues and fellow doctoral candidates for their support but also for helping me to enjoy my profession as much as I do. Particularly, but without making any claim of being exhaustive, I would like to thank Gerhard Krug, Hans-Dieter Gerner, Jens Stegmaier, Karin Simon, Katrin Drasch, Lena Koller, Silvia Melzer and Wolfgang Dauth.

I dedicate this dissertation to my partner Heidi and to my family for their constant support. They enabled me to withstand all hardships and doubts. For that and so much more, I am deeply grateful.

Contents

1 Introduction
1.1 Motivation
1.2 Operationalisation of central elements of human capital
1.3 Organisation of the dissertation

2 Data, record linkage and selectivity analysis
2.1 Data sets
2.1.1 ALWA survey
2.1.2 Measuring basic skills: ALWA-LiNu
2.1.3 Administrative data
2.2 Record linkage
2.2.1 Legal and ethical aspects of record linkage
2.2.2 The process of record linkage
2.3 State of research on record linkage consent and success
2.3.1 Determinants of consent
2.3.1.1 Respondent characteristics
2.3.1.2 Interviewer characteristics
2.3.1.3 Interview situation
2.3.2 Determinants of record linkage success
2.4 Selectivity analyses
2.4.1 Descriptive results
2.4.2 Multivariate results
2.4.2.1 Results on determinants of consent
2.4.2.2 Results on determinants of record linkage success
2.4.2.3 Sensitivity analyses
2.5 Summary and conclusions
2.5.1 Implications for data users
2.5.2 Implications for survey practice
2.5.3 Implications for the process of record linkage
2.5.4 Further avenues of research

3 Lifelong learning inequality?
3.1 Introduction
3.2 Theory and hypotheses
3.3 Data and descriptives
3.3.1 Independent variables
3.3.2 Dependent variable
3.4 Econometric strategy
3.5 Results
3.6 Predictions and sensitivity analysis
3.6.1 Simulation of training participation by family background
3.6.2 Sensitivity analysis
3.7 Summary and conclusions

4 Do literacy and numeracy pay off?
4.1 Introduction
4.2 Background and previous research
4.3 Hypotheses
4.4 Data
4.5 Empirical analysis
4.5.1 Method
4.5.2 Results
4.5.3 Sensitivity analyses
4.6 Conclusions

5 General summary

Bibliography

A List of abbreviations

B Appendix to Chapter 2

C Appendix to Chapter 3

D Appendix to Chapter 4

E Short German summary

List of tables

2.1 Identifiers used for record linkage
2.2 Number of observations over the stages of record linkage
2.3 Existing results on determinants of consent to record linkage with employment-related register data
2.4 Consent and linkage success rate by subgroups, expressed as percentages
2.5 Mean characteristics by consent status, t-test of difference
2.6 Determinants of consent to record linkage, probit regression models with and without respondent-interviewer interactions, respectively
2.7 Determinants of record linkage success on all stages, separate probit regression models
2.8 Significantly differing coefficients between matching success probit regressions

3.1 On-the-job training, education and endowments by dichotomous family background, measured as educational level of parent of the same sex as respondent, t-test of difference
3.2 Probability and frequency of on-the-job training by individual, parental and job characteristics (only dummy variables)
3.3 Determinants of non-participation in training and of the number of courses respectively, zero-inflated negative binomial regression, probit inflation
3.4 Determinants of non-participation in training and of the number of courses respectively, zero-inflated negative binomial regression, probit inflation
3.5 Wald tests of variable groups based on hypotheses and estimation results from Model 4
3.6 Predicted frequency of training per spell (y) and probabilities of counts, respectively, by selected parental educational levels
3.7 Determinants of non-participation in training and of the number of courses respectively, comparing different measures of family background, zero-inflated negative binomial regression, probit inflation
3.8 Determinants of non-participation in training and of the number of courses respectively, excluding spells starting below the age of 26 or of public service respectively, zero-inflated negative binomial regression, probit inflation

4.1 Sample statistics of independent variables
4.2 Augmented Mincer-type earnings equation, random effects GLS estimates with and without basic skills scores, respectively (2007-2010)
4.3 Augmented Mincer-type earnings equation, random effects GLS estimates including squared basic skills scores (2007-2010)
4.4 Augmented Mincer-type earnings equation, random effects GLS estimates including interaction terms (2007-2010)

B.1 Sample statistics of independent variables
B.2 Consent and linkage success rate by subgroups, expressed as percentages of all German language interview participants
B.3 Determinants of consent to record linkage, probit regression models with and without respondent-interviewer interactions, respectively, weighting variables included
B.4 Determinants of consent to record linkage, separate probit regression models with respondent-interviewer interactions for female and male respondents, respectively

C.1 Sample statistics of independent variables
C.2 Loss of observations due to exclusion restrictions and missing values based on Models 1 to 4 (not mutually exclusive)

D.1 Numbers of cases by step of data preparation
D.2 Augmented Mincer-type earnings equation, random effects GLS estimates, restricted to years 2007/2008
D.3 Mincer-type earnings equation, random effects GLS estimate based on all linked cases, regardless of test-participation (2007-2010)

1. Introduction

1.1. Motivation

“[T]hose attempting to guide the economy and our societies are like pilots trying to steering a course without a reliable compass. The decisions they (and we as individual citizens) make depend on what we measure, how good our measurements are and how well our measures are understood. We are almost blind when the metrics on which action is based are ill-designed or when they are not well understood. For many purposes, we need better metrics.”

(Stiglitz et al., 2009)

Ever since the seminal works of Mincer (1958), Schultz (1961) and Becker (1962), human capital theory has played a major role in both labour economics and the economics of education. In both fields, the value of, or investments in, human capital have been at the centre of a multitude of research questions. The concept of human capital continues to develop theoretically as well as in the methodology of its analysis (see e.g., Card, 1999, 2001; Folloni and Vittadini, 2010). In spite of this development, some challenges remain for the accurate analysis of human capital and its rewards for individuals or societies.

One of these challenges arises in the operationalisation and measurement of human capital. Any definition of a person’s human capital comprises several or all of the following elements: innate abilities, education, experience, and some measure of skill or productivity derived from the composition of these elements. However, empirical analyses are rarely able to comprehensively account for all of these elements. Some of these elements may simply be missing from the data used, for instance because they have been unobservable during data collection; some of these elements of human capital may have been observed, but only with some degree of measurement error. Such shortcomings in data sets do not make empirical analyses unfeasible, but they increase econometric challenges considerably.

Such challenges regarding the accurate measurement of human capital are the starting point for my dissertation. One of my goals is to deal with these issues by using novel data and, more importantly, by demonstrating how their potential for empirical work can be increased by combining them with other data sources. On that basis I will, first, take into account inter-generational determinants of human capital development. To do so, I ask whether the parental background of adults persistently influences their non-formal training participation. I therefore examine the determinants of on-the-job training, with an emphasis on parental background variables.

Second, I will examine intra-generational aspects of different elements of human capital and how they are rewarded in the German labour market. I do this by examining the relationship between a person’s basic skills and her earnings. The underlying data allow me to differentiate between two skill domains, literacy and numeracy.

1.2. Operationalisation of central elements of human capital

To show how the above elements of human capital will be operationalised in the following chapters, I will first provide definitions and naming conventions for the central concepts. The first distinction is between types of learning activities, which are named according to their labour market relevance: general education is termed schooling, whereas labour-market-oriented, professional education is termed training. The term education, without a qualifier, will include both general schooling and professional training. Among professional training degrees, a further distinction is made between vocational degrees and academic degrees. The terms occupational degree and training degree are used as synonyms for a vocational degree. As there are no universally valid conventions, these definitions are not generalisable beyond the context of this dissertation.

Another dimension which I will use to differentiate educational activities is their degree of formalisation (see European Commission, 2000, p. 8): formal training henceforth denotes activities at institutions that provide access to recognised professional certificates. These are vocational training or higher education. Non-formal activities are courses that are offered by a variety of institutions but do not lead to recognised certificates. Contrary to most of the Anglo-Saxon literature, I use the term on-the-job training only for non-formal training activities.

I further differentiate between different types of abilities. Along the lines of Heckman et al. (2006), I differentiate between non-cognitive and cognitive abilities. As a synonym for non-cognitive abilities, I will use the term personality traits. Among cognitive abilities I will distinguish between innate abilities, such as intelligence, and basic skills like literacy or numeracy.

1.3. Organisation of the dissertation

The dissertation is based on three papers that have been or will be published as discussion papers and submitted to peer-reviewed journals. As the emphasis of this dissertation lies on issues of the measurement of human capital, a considerable part of the text is devoted to the data used in the empirical analyses. Chapter 2 therefore goes beyond a mere description of the different data sources, as it comprises empirical analyses on methodological aspects of their linkage. These analyses are presented in Sections 2.2-2.5, which are mainly based on Antoni (2013a).

Chapter 3, which is based on Antoni (2011), is focused on the relevance of the parental background for adults’ non-formal training participation. Chapter 4 is based on Antoni and Heineck (2012).1 Therein I examine the relationship between basic skills and earnings in the German labour market. In Chapter 5, I conclude the dissertation with a short summary of its empirical findings.

1 As this paper was written in collaboration with a coauthor, Chapter 4 is written in the first-person plural perspective.

2. Data, record linkage and selectivity analysis

2.1. Data sets

To measure individual educational and skill inequalities, and to examine the research questions mentioned above, I combine several data sources that have been gathered by very different methods and with different purposes in mind. As with the measurement of human capital, the accurate combination of such data sources poses a challenge in itself. This chapter will therefore describe the different data sources and their respective contributions to the analyses at hand, show the procedure of the data linkage and its methodological challenges and examine the success of this procedure empirically.

2.1.1. ALWA survey

The starting point for the data used in all of the analyses below is the ALWA survey (Working and Learning in a Changing World)2 of the Institute for Employment Research (IAB). ALWA was conducted from August 2007 up to and including April 2008 and led to 10,404 retrospective, computer-assisted telephone interviews (CATI) with people born between 1956 and 1988. The target population was German residents, regardless of their nationality.

Longitudinal information was gathered on residential, educational, employment and partnership histories as well as on children and times of parental leave. All these events are dated to the month. Educational activities were one of the major topics of interest during the longitudinal part of the interview. This included the levels of schooling and training achieved over time as well as their timing in the life course. Non-formal training is measured by its frequency during spells of employment, unemployment and other situations.

2 The acronym is derived from the study’s German name “Arbeiten und Lernen im Wandel”. See Kleinert et al. (2011) for an overview of the study or Antoni et al. (2010) and http://fdz.iab.de/en/FDZ_Individual_Data/ALWA.aspx for more detailed information on the available data.

A potential shortcoming of retrospectively collected, longitudinal survey data is stated by Reimer (2005), who summarises results showing that respondents might encounter recall problems when asked to report events from their past. They may remember start or end dates and other details of episodes of their life course incorrectly, or they may be unable to recollect and report events at all. To improve the recollection of autobiographical information, aided recall techniques were used during the longitudinal parts of the ALWA interview. Based on event history calendars (see Glasner and van der Vaart, 2009) of the episodes reported so far, interviewers and respondents could collaboratively check the completeness and consistency of the biographical data, and correct them if necessary. Drasch and Matthes (2013) provide more details on the techniques used in ALWA and examine their effectiveness. They report that accuracy of dating and completeness of events are indeed improved by the implemented aided recall techniques.

These longitudinal data are complemented by a rich set of cross-sectional variables. The interview covered standard socio-demographic characteristics but also topics such as place and date of birth, immigrant background, religiousness, language skills, family background, importance of different domains of life as well as informal learning and cultural activities.

The full set of questions was only directed at respondents with sufficient German language proficiency. A shorter questionnaire, lacking most of the longitudinal elements, was available in Turkish and Russian. It was answered by 227 of the respondents. That is why I will only use the answers of 10,177 of the total 10,404 ALWA respondents in most of the empirical analyses below.
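The completeness and consistency checks that event history calendars support can be illustrated with a small sketch. It is not the procedure implemented in the ALWA interview software; the episode format, the function names and the example biography are assumptions made purely for illustration. The sketch sorts episodes dated to the month and flags gaps and overlaps between chronologically adjacent episodes, which an interviewer could then clarify with the respondent.

```python
# Hypothetical episode format: (label, (start_year, start_month), (end_year, end_month)).

def month_index(year_month):
    """Convert (year, month) into a running month count."""
    year, month = year_month
    return year * 12 + (month - 1)


def check_episodes(episodes):
    """Return (gaps, overlaps) between chronologically adjacent episodes.

    Each entry is (earlier_label, later_label, number_of_months).
    """
    ordered = sorted(episodes, key=lambda e: month_index(e[1]))
    gaps, overlaps = [], []
    for prev, nxt in zip(ordered, ordered[1:]):
        prev_end = month_index(prev[2])
        next_start = month_index(nxt[1])
        next_end = month_index(nxt[2])
        if next_start > prev_end + 1:
            # At least one calendar month is not covered by any episode.
            gaps.append((prev[0], nxt[0], next_start - prev_end - 1))
        elif next_start <= prev_end:
            # Both episodes claim the same calendar month(s).
            shared = min(prev_end, next_end) - next_start + 1
            overlaps.append((prev[0], nxt[0], shared))
    return gaps, overlaps


# Invented example biography, not taken from the ALWA data.
biography = [
    ("school", (1975, 9), (1984, 6)),
    ("apprenticeship", (1984, 9), (1987, 8)),
    ("job A", (1987, 8), (1995, 3)),
]
gaps, overlaps = check_episodes(biography)
print(gaps)      # [('school', 'apprenticeship', 2)]
print(overlaps)  # [('apprenticeship', 'job A', 1)]
```

Overlaps are not necessarily errors, since parallel episodes are plausible in a life course; the sketch only surfaces them for review.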

2.1.2. Measuring basic skills: ALWA-LiNu

Of the 10,177 respondents that have answered the full questionnaire, 3,980 also took part in basic skills tests in the domains of prose literacy and numeracy. These tests had originally been designed for the International Adult Literacy Survey (IALS) and the Adult Literacy and Life Skills Survey (ALL)3 and were adapted for ALWA by the Educational Testing Service (ETS), Princeton. Both domains were tested in separate, fully fledged batteries of tasks, with an average duration of 30 minutes per domain. See Wölfel and Kleinert (2012) for an overview of the “ALWA Literacy and Numeracy Data” (ALWA-LiNu) and Kleinert et al. (2012b) for details on the design of its skills tests and the scaling of the scores derived from them.

3 For details on and results from both surveys, see http://nces.ed.gov/surveys/all/index.asp. In 2011, the OECD ran a third survey on adults’ skills, the “Programme for the International Assessment of Adult Competencies” (PIAAC), covering 25 OECD and partner countries. First results of this study are expected to be published in 2013.

The results of the tests in both domains reflect the respondents’ knowledge and skills relevant in situations of everyday life as well as on the job. As the basic skills scores in our data are based on tests originally included in ALL, its definitions for prose literacy and numeracy are also valid for ALWA-LiNu (Statistics Canada and OECD, 2005, p. 16):

Prose literacy: “the knowledge and skills needed to understand and use information from texts including editorials, news stories, brochures and instruction manuals.”

Numeracy: “the knowledge and skills required to effectively manage the mathematical demands of diverse situations.”

Data access to ALWA-LiNu is provided by the Research Data Centre (FDZ) of the German Federal Employment Agency at the IAB as a distinct data product, but it can be linked to the data set described below by a common identifier variable.

2.1.3. Administrative data

The administrative data of the German Federal Employment Agency have been a major data source for labour market research in Germany for several years (Heining, 2010). They are based on mandatory social security notifications by employers and data from internal processes of the local employment agencies (see Jacobebbinghaus and Seth, 2007). The consolidated research data contain daily longitudinal information on dependent employment and registered unemployment as well as on job search activities and participation in measures of active labour market policy. Their information on wages and transfer payments is highly accurate, as it is computed on the basis of social security contributions. The resulting data are called Integrated Employment Biographies (IEB).

One shortcoming of these data is that they do not cover all possible labour market states. Episodes during which a person neither contributes to nor receives benefits from the social security system are not included. The most notable groups excluded from the data are therefore the self-employed, civil servants4 and people in formal full-time education. People in dual vocational training relationships are included in the data as they have an employment relationship relevant for the social security system.

4 From here on, I will consistently use the term civil servants for German “Beamte” to differentiate them from the encompassing group of public service employees.

Employment spells can be supplemented by yearly firm-level characteristics. These include the economic sector of the firm, the qualification and age structure of its employees as well as its wage distribution. The firm-level data can also be enriched by information on worker flows for different subgroups of employees as well as on the founding and closing of the firms under consideration. These data draw from the Establishment History Panel (BHP) of the IAB (see Hethey-Maier and Seth, 2011; Spengler, 2008). The yearly reference date is the end of June.

Both the IEB and the BHP can be linked to the survey data of the ALWA participants. However, the link between the survey and administrative data sets is only valid at the person level. It is not yet possible to link related episodes of a given person’s employment history between survey and administrative data. The combined data product is available under the name of “ALWA survey data linked to administrative data of the IAB” (ALWA-ADIAB). Access to the data is provided by the FDZ via on-site use and subsequent remote data access. See Antoni and Seth (2012) for an overview or Antoni et al. (2011) for details on the data set. More details on the actual process of record linkage that was the basis for ALWA-ADIAB will be provided in the following sections.
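What a link that is valid only at the person level implies for data handling can be sketched as follows. This is an illustration under stated assumptions, not the structure of ALWA-ADIAB: the field names (`person_id`, `begin`, `end`, `daily_wage`) and the function `link_person_level` are hypothetical. All administrative spells of a person are attached to that person's survey record, but no attempt is made to pair an individual administrative spell with a particular survey episode.

```python
from collections import defaultdict


def link_person_level(survey_rows, admin_spells, key="person_id"):
    """Attach all administrative spells of a person to that person's survey row.

    The key name is a hypothetical stand-in for a common identifier variable.
    """
    spells_by_person = defaultdict(list)
    for spell in admin_spells:
        spells_by_person[spell[key]].append(spell)

    linked = []
    for row in survey_rows:
        # Spells are attached to the person as a whole, not to survey episodes.
        linked.append({**row, "admin_spells": spells_by_person.get(row[key], [])})
    return linked


# Invented example records.
survey = [{"person_id": 1, "birth_year": 1970}, {"person_id": 2, "birth_year": 1982}]
admin = [
    {"person_id": 1, "begin": "2007-01-01", "end": "2007-12-31", "daily_wage": 80.0},
    {"person_id": 1, "begin": "2008-01-01", "end": "2008-06-30", "daily_wage": 85.0},
]
for person in link_person_level(survey, admin):
    print(person["person_id"], len(person["admin_spells"]))
```

A respondent without any administrative spells, for example someone who was self-employed throughout, simply ends up with an empty list in this sketch.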

2.2. Record linkage

For various research questions and methods of inference in economics and the social sciences, rich data sets are required. Since survey and administrative data sets have their respective advantages, a combination of both data sources enhances the potential for research. Furthermore, record linkage has several advantages from a survey methodological perspective. By omitting aspects from a survey interview that can be supplemented from other data sources, the length of the questionnaire can be reduced in advance. This saves interview time and reduces survey costs as well as the burden for respondents, which in turn might make interview terminations or panel attrition less likely (e.g., Hartmann and Krug, 2009). Record linkage may also lead to improved data quality, for instance by validation of survey data (e.g., Jäckle et al., 2004) or analyses of measurement error or nonresponse bias (e.g., Sakshaug and Kreuter, 2012).

These potentials can only be realised fully if linkage rates are sufficiently high and if we understand and are able to control for possible selectivity that may arise at different stages of the record linkage process. First, the consent of respondents to the linkage, which is necessary in many countries, might be refused. Too high or selective refusal in terms of respondent characteristics might impede statistical inference based on the linked data. Second, the success of the actual record linkage crucially depends on the available information to identify a respondent in administrative records and on the quality of such identifiers. Both consent to and success of record linkage may vary substantially by the individual characteristics of the respondent. Consent may also be influenced by characteristics of the interviewer and of the interview situation.

This has implications for potential data users, for survey practice and for the linkage of the data itself. The goal of this study is thus threefold. First, I attempt to answer the following questions: does a sufficient number of observations remain in the linked data set for substantial research? Is there selectivity in the linked data compared to the overall survey population, and if so, in which way is the sample selective? At which stages of the linkage process does the selectivity arise? Second, implications for survey design and field administration are shown by examining how the interviewer staff may be composed to maximise consent rates, and how field management may be optimised to ensure high and stable consent rates. Finally, advice for practitioners of record linkage is provided by showing what may be gained from probabilistic and manual linkage compared to mere deterministic record linkage in terms of numbers of observations, and if and how this additional effort influences the selectivity of the linked sample.

Apart from answering these questions, the study contributes to the literature in several ways. Not only was the consent rate to record linkage in the ALWA survey well above the average of comparable surveys (see Sakshaug and Kreuter, 2012), but its wealth of respondent characteristics also allows me to control for several variables that have not been considered in existing studies, such as self-reported cognitive skills or the native language of the respondents. The analysis also benefits from exceedingly rich paradata (see e.g., Kreuter and Casas-Cordero, 2010) to control for characteristics of the interviewers and the interview situation. Furthermore, this is the first German study to examine record linkage selectivity and success based on personal and address data of the respondents instead of previously known and unique identifiers like Social Security Numbers. That is why the paper goes beyond most existing papers by examining all stages of the process of record linkage in a consistent modelling framework.

2.2.1. Legal and ethical aspects of record linkage

For any attempt to link survey data with other micro-level data sources on the same individuals, it is crucial whether data protection regulations make the consent of respondents necessary. If that is the case, interviewers are legally bound to inform respondents about the nature and amount of information that is going to be matched as well as how the combined data is going to be used. Interviewers are obliged to ask for consent to that procedure explicitly, which can be given verbally or in written form, depending on the mode of data collection. This legal requirement also applies to surveys conducted in Germany (see Metschke, 2010). For administrative data of the Federal Employment Agency, this requirement is governed by §75 Social Code Book X (SGB X).

The ALWA questionnaire fulfils this requirement by asking for consent to record linkage during the longitudinal part of the interview. The translation of the question for consent in the ALWA study (Matthes and Trahms, 2010, p. 110) reads as follows:

“To keep the interview hereafter as brief as possible, we would like to include data in our analyses of the survey that are held by the Institute for Employment Research of the Federal Employment Agency in Nuremberg. These include, for instance, information on previous periods of employment and unemployment and on participation in measures of active labour market policy. To this end I kindly ask you for your permission to merge this data to the survey data. It is guaranteed that all rules of data protection are strictly applied whenever these information are analysed. It goes without saying that your consent is voluntary. You can withdraw it at any time. Is that ok?”

Truly informed consent is also a matter of ethical considerations, which can be illustrated by the following aspects (see Lessof, 2009): record linkage with a data source as comprehensive as that of the Federal Employment Agency implies that administrative employment information dating back several decades before the time of interview will be added; it also implies that newly gathered administrative information will be added to the combined data in the future, as long as consent is not withdrawn. Concerns and objections of respondents should thus be respected, regardless of the legal situation. However, to alleviate confidentiality and privacy concerns of its participants, the ALWA study included an advance letter that explained in detail how data protection and respondent anonymity would be ensured during and after data collection.

2.2.2. The process of record linkage

Records from different data sources can either be matched by means of a unique identifier such as the Social Security Number or on the basis of a combination of ambiguous and error-prone identifiers. The two data sets linked in this project are the ALWA survey and the Integrated Employment Biographies (IEB) (see Jacobebbinghaus and Seth, 2007) of the Institute for Employment Research (IAB). As the population of interest in the ALWA survey consisted of individuals living in Germany regardless of their labour market status or nationality, the sample was drawn from registers of the residents’ registration offices of 250 German municipalities. The result was a sample of addresses without a unique identifier related to the administrative records of the Federal Employment Agency. Record linkage is therefore performed on the basis of the identifiers given in Table 2.1.

Table 2.1.: Identifiers used for record linkage

Name        First and last name
Sex         Dummy variable
Birth date  Day, month and year of birth
Address     Postal code, place name, street name and house number

This information was provided by the survey institute, but only for respondents who had consented to record linkage. The corresponding data from the administrative records were provided by the IAB department IT Services and Information Management. Data retrieval considered all people who had been registered in any data source of the Federal Employment Agency at any time during the year 2007, the year in which the ALWA sample was drawn and the field phase started. Given the considerable amount of data that resulted from this procedure, the data from the administrative records were restricted beforehand to the birth cohorts of the survey population.

Both sets of address data could be brought together specifically for the purpose of record linkage, as the Federal Employment Agency and the IAB had been named as the contracting authorities of ALWA in the advance letter to the respondents and in the question for consent. Otherwise, data protection legislation would have prevented the survey institute from transferring the above identifiers to the IAB. In that case, the linkage could still have been conducted using methods of privacy preserving record linkage (see Schnell et al., 2009).

Before the records from both data sources were actually compared, extensive pre-processing was conducted to clean up typographical errors, to minimise the number of different spellings of names, places and street names as well as to fill in missing information in postal codes or place names. These steps of standardisation were done consistently for both the administrative and survey address records.

The actual comparison of address records started with deterministic matches on the complete match key, i.e. by an exact character-by-character comparison of all identifier variables mentioned above. To increase the number of successful matches, probabilistic record linkage was used in an additional step (see Herzog et al., 2007; Winkler, 2009).
It computes the degree of similarity between two records from different data sources over all identifiers.5 Based on the decision rule proposed by Fellegi and Sunter (1969), pairs of records were classified into links, potential links and non-links after the comparison. Pairs that were classified as links were directly used for the retrieval of administrative data. Those that were classified as potential links were subsequently coded as either links or non-links by hand.

5 The comparison with the Jaro-Winkler string comparator metric and blocking on the postal code was done by using the software Merge ToolBox (MTB), version 0.7. A newer version of MTB as well as its documentation can be retrieved from the German Record Linkage Center,

During each of these steps, some observations were lost for the final research data due to lack of consent or of success in record linkage. The remaining number of observations at each stage is documented in Table 2.2. Consent for record linkage was given by 9,531 (92%) of the respondents. For 53% of the consenters, deterministic matches on the complete match key could be found. Together with matches that could successfully be identified by probabilistic record linkage, this figure reaches 83%. When considering manual matches as well, 86% of all consenters and 79% of all respondents are included in the linked data. Descriptive statistics for the whole survey population on all explanatory variables considered later on are given in Table B.1 in the appendix.

Table 2.2.: Number of observations over the stages of record linkage

                                                      N     N/Nc     N/Nr
CATI respondents (Nr)                            10,404           100.00%
Consenting CATI respondents (Nc)                  9,531  100.00%   91.61%
Deterministic matches                             5,035   52.83%   48.39%
Deterministic and probabilistic matches           7,919   83.09%   76.11%
Deterministic, probabilistic and manual matches   8,243   86.49%   79.23%

Source: ALWA address data, address data from Federal Employment Agency statistics, own calculations.

Other than Table 2.2, all tables and analyses below do not include the 227 cases of foreign language interviews, leading to a total of 10,177 observations. If interviews could not be conducted in German, shorter Turkish and Russian interviews were done instead. These questionnaires also included the question of consent, but they lack several aspects that are important for the following analyses.
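The sequence described above (an exact comparison on the complete match key, followed by a classification of the remaining pairs into links, potential links and non-links in the spirit of Fellegi and Sunter) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in MTB: the field names, the m- and u-probabilities and the two thresholds are hypothetical, candidate pairs are assumed to have been blocked already, and field agreement is reduced to exact equality instead of a Jaro-Winkler similarity.

```python
import math

# Hypothetical m- and u-probabilities per identifier (NOT the values used in the study).
# m = P(field agrees | both records belong to the same person)
# u = P(field agrees | records belong to different persons)
M_U = {
    "first_name": (0.95, 0.01),
    "last_name": (0.95, 0.005),
    "sex": (0.99, 0.50),
    "birth_date": (0.97, 0.001),
    "street": (0.85, 0.01),
    "house_number": (0.85, 0.05),
}

# Hypothetical thresholds on the summed log2 weight.
UPPER, LOWER = 15.0, 0.0


def is_deterministic_match(a, b):
    """Exact comparison of all identifiers, i.e. the complete match key."""
    return all(a[field] == b[field] for field in M_U)


def match_weight(a, b):
    """Sum of agreement and disagreement weights over all identifiers."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if a[field] == b[field]:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def classify(a, b):
    """Return 'link', 'potential link' or 'non-link' for one record pair."""
    if is_deterministic_match(a, b):
        return "link"
    weight = match_weight(a, b)
    if weight >= UPPER:
        return "link"
    if weight > LOWER:
        return "potential link"  # would be resolved by manual review
    return "non-link"


survey = {"first_name": "anna", "last_name": "schmidt", "sex": "f",
          "birth_date": "1970-05-01", "street": "hauptstrasse", "house_number": "12"}
register = dict(survey, street="hauptstr")  # same person, differently spelled street
print(classify(survey, register))
```

In such a scheme, identifiers that rarely agree by chance (here the birth date) contribute most to the weight, so a single deviating street spelling does not prevent a link, whereas several disagreeing fields move a pair into the range that is reviewed by hand.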

2.3. State of research on record linkage consent and success

Selective success in linking different data sources may arise at different stages of the process. First, when the informed consent of the entity the data relates to is necessary, differences in the willingness to provide consent may introduce bias to the potentially linked sample in a manner similar to unit non-response. Second, even when consent for record linkage has been acquired, it might be impossible to find the corresponding records in the administrative data for some of the respondents. These are, for instance, people who have never experienced an employment spell or have never been registered as unemployed up to the time of interview. Linkage based on personal information as identifiers instead of unique identifiers such as the Social Security Number may also fail due to wrong or partially missing address information in one of the data sources. Both consent and linkage are prone to be selective in ways that are described in the following sections.

see http://soz-159.uni-duisburg.de/linkage. Probabilistic matching parameters (m- and u-probabilities) have been chosen according to prior experience with IAB data (see Bachteler, 2008).

2.3.1. Determinants of consent

The consent of respondents to record linkage may be influenced by a wealth of factors. There is a considerable amount of literature on this topic, but the majority of previous studies consider surveys that ask for consent to the linkage of health records. Dunn et al. (2004), Huang et al. (2007) and Kho et al. (2009) provide recent overviews of results on record linkage in the context of specific epidemiological or health studies, whereas Knies et al. (2012) and Sakshaug and Kreuter (2012) add findings from health data record linkage to general population surveys. Although linked health data could be considered as sensitive as, if not more sensitive than, administrative employment or income information, it is an open question whether the results from these studies can be transferred to the population of the ALWA survey and its focus on educational and labour market activities.

The following sections will summarise results of studies that did have comparable populations and which also linked survey data with administrative employment data. I will assess the relevance of these studies for the question at hand, show the mechanisms determining consent to record linkage they propose and add my own hypotheses. A quick overview of the results of relevant studies is given in Table 2.3. Empty cells mean that the respective study did not control for the characteristic mentioned in the given row, and rows without any results from previous studies represent characteristics that have never been controlled for before.

2.3.1.1. Respondent characteristics

The most commonly examined potential predictors of consent are characteristics of the respondent. However, results on respondent characteristics from different studies often contradict each other. This is best exemplified by the relationship of respondents being female with linkage consent, which is reported to be either positive (Haider and Solon, 2000), negative (Hartmann and Krug, 2009; Sala et al., 2012), or non-existent (Beste, 2011; Gustman and Steinmeier, 1999; Jenkins et al., 2006; Olson, 1999; Singer et al., 2003), sometimes with contradicting results for the very same survey. Therefore, an analysis regarding the characteristics of consenting respondents is in order. The existence of a partner and that of children in the respondent’s household will be included as control variables without explicit hypotheses on their influence.

Table 2.3.: Existing results on determinants of consent to record linkage with employment-related register data

Studies (columns): Beste (2011); Gustman and Steinmeier (1999); Haider and Solon (2000); Hartmann and Krug (2009); Jenkins et al. (2006); Olson (1999); Sakshaug and Kreuter (2012); Sala et al. (2012); Singer et al. (2003)
Results in each row are listed in column order; empty cells are omitted.

Respondent
  Male                                     ns ns ns + ns ns + ns
  Foreign, ethnic minority                 - - - - - - - -
  Native language
  Region of residence                      ns sig sig ns sig ns
  Age                                      ns ns sig ns - - +
  Qualification                            ns - ns ns ns - + ns
  Cognitive skills
  Labour market status                     ns sig ns sig ns ns
  Income                                   + + + ns + ns ns +
  Refused income information               - - - ns -
  Wealth, assets                           - - -
  Existing relationship/marriage           + ns + + ns
  Children                                 ns + ns
  Cooperation in other consent questions   +
  Share of refused answers
  Share of answers like "don’t know"
Interviewer
  Male                                     + ns ns
  Age                                      + + ns
  Qualification                            - + ns
  Experience before study                  ns
  Prior interviews within actual study     ns -
Respondent-interviewer similarity
  Sex                                      ns
  Age                                      ns
  Qualification                            ns
Interview situation
  Weekday/time of interview
  Duration of interview                    ns +
  Disturbances/problems during interview   -

Notes: +/-/ns/sig denote significantly positive/significantly negative/no significant/overall significant influence on consent, respectively. Lines may represent groups of variables with potentially varying levels of significance within. Lines without any results indicate determinants that have not been considered in any multivariate analysis before.

More consideration will be given to characteristics which, if they are in fact related to selective consent, have a high potential of biasing estimation results based on the linked data. These are, for one thing, groups that are only weakly represented in the survey from the outset, such as ethnic minority groups. For another thing, I will examine variables that will most likely be of central interest for future data users. Given the core themes of the ALWA survey, these are mainly educational and employment related variables.

The relevant studies so far produce inconclusive results on the respondent’s age. In the given context, the amount of register data that is available on a person is positively related to her labour market experience, thus usually also to her age. If one assumes that the reluctance to provide consent for record linkage grows with this amount of data due to privacy concerns, consent should be negatively related to age.

Cognitive aspects may be relevant for the first step of the response process, the comprehension of the question for consent (see Tourangeau and Bradburn, 2010, p. 317). First, German language problems on the part of the respondent might prevent her from fully understanding the meaning of the question. The comprehension of either the importance of the linked data for research or the extent of information that is to be matched might be insufficient. It is not clear from the outset how this should influence the likelihood of consent; both directions are possible.
A comprehensive understanding of the amount of data that can be matched might well lead to a rejection of consent.

Second, even when the question of consent is fully understood, a lack of experience with or understanding of the function or functioning of the Federal Employment Agency and its local offices might impede full comprehension of the consequences of record linkage. This might be the case for foreign respondents, whose residence in Germany and contact with its institutions so far have been limited. Again, it is unclear whether this would make consent more or less likely. Someone who would have refused consent given full comprehension of its consequences might provide consent when comprehension is deficient and vice versa. Studies that include an ethnic minority or foreign background as a control variable unanimously agree on its negative relationship with consent. To investigate this matter in more detail the dummy variable for foreign nationality is supplemented by an indicator for German as the native language of the respondent.

These considerations do not imply a general lack of cognitive abilities as a source for selective consent, though these might be relevant as well. Apart from a potential language barrier, deficient comprehension of what the linking of different data sources implies technically or for the richness of the resulting data might trigger different responses to consent questions. For instance, the risk of a breach of data confidentiality might be under- or overestimated, leading to a higher or lower likelihood of consent, respectively. The expected influence on consent is ambiguous. Apart from educational levels or income as proxy variables, no study so far has considered the direct influence of the respondent’s cognition on consent to record linkage. To achieve this, scores for self-reported prose and document literacy as well as for numeracy are computed with principal component analyses (see Jolliffe, 2002) based on several 5-point items.

Low cognitive abilities or educational achievements may also be considered as proxy variables for recall problems. Respondents with low cognitive sophistication (see Krosnick, 1991) may have problems retrieving dates or other information on past events. To compensate for their lack of recall, they might be inclined to allow such data to be supplemented from other sources. Assuming that this motivation is relevant during a telephone interview, higher cognitive abilities and higher educational levels should be negatively related to consent.

Instead of being a possible proxy for cognitive abilities the educational level may also influence the respondent’s attitude towards the survey. An interview with numerous questions on educational success might be regarded as important and worthwhile by well educated respondents, whereas it might be experienced as unpleasant if not embarrassing by poorly educated respondents.
I thus expect a positive relationship between the educational level and consent to record linkage. However, because this statement contradicts the hypothesis put forward in the preceding paragraph, the expected overall influence of the respondent’s education remains ambiguous. Insignificant or inconclusive results on that matter are presented by Haider and Solon (2000), Hartmann and Krug (2009), Olson (1999) and Singer et al. (2003), whereas Gustman and Steinmeier (1999) report a negative relationship.

General attitudes towards the survey at hand might influence consent as well, as argued by Singer et al. (1993, 2003) and Hartmann and Krug (2009). The more sympathetic and trusting the respondent feels towards the survey or its contracting authority, and the more interested she is in its topics, the more likely she will be to cooperate when it comes to the consent question. Singer et al. (2003) indeed find a positive relationship between the respondents’ feeling of obligation to cooperate and their consent to linkage. Sala et al. (2012) lack a measure of obligation to cooperate in their data but find a negative relationship between the number of prior waves of the given survey and linkage consent. Prior survey participation is thus used as a proxy for observed cooperation in the past. Beste (2011) uses the willingness to participate in subsequent panel waves as a proxy variable and does find a positive relationship with consent. This strategy is also applied here as I include questions on the willingness to participate in later survey waves and in subsequent cognitive tests as control variables. I assume there to be a positive relationship between both questions and consent to linkage. However, when using these proxy variables, it is hard to distinguish between a lack of interest in the survey and a potentially underlying attitude of distrust.

Lack of trust towards the interviewer, towards the specific survey or towards surveys in general might be an important driver for refused consent. Respondents who explicitly express privacy concerns during the interview would therefore be less inclined to allow record linkage. This hypothesis is supported by the results of Sala et al. (2012) and Singer et al. (2003). Contrary to the surveys used there, ALWA did not comprise explicit measures of confidentiality or privacy concerns. The literature provides examples for several potential proxy measures. Beste (2011) and Hartmann and Krug (2009) use the number of refused answers to sensitive questions as a proxy variable and indeed find a negative relationship with consent to record linkage. The results from Jenkins et al. (2006) are ambiguous, as they show no significant relationship between item non-response in income questions and consent to linkage with administrative records, but a negative relationship with consent to contact the employer for further information. In the absence of an actual measure of confidentiality or privacy concerns, I include the share of refused answers in my analyses as well as the refusal of income information as a separate income category.

Another characteristic potentially related to trust, though very specific to German surveys, is whether a respondent was born in East Germany (Beste, 2011; Hartmann and Krug, 2009). Given the birth cohorts included in the ALWA sample, any respondent who reports being born in East Germany (without West Berlin) is very likely to have grown up in the former German Democratic Republic (GDR). Recent empirical evidence shows that, even nearly two decades after the German reunification, East Germans still show more social distrust than West Germans (Heineck and Süßmuth, 2010). People born in East Germany should therefore be less inclined to provide consent, regardless of their place of residence at the time of the interview. Beste (2011) finds no influence of residence in East Germany at the time of interview whereas Hartmann and Krug (2009) find a significant positive relationship. Neither analysis, though, accounts for potential mobility of respondents, as they do not control for the respondent’s place of residence before the German reunification.

Since longitudinal earnings information is one of the main advantages of administrative employment data, selectivity regarding the income of respondents could be a major problem when empirical inference is based on the linked data. The reported personal net income of the respondent across all income sources at the time of the interview will therefore be included in the analysis. I expect consent to record linkage to be positively related to income. That is because respondents with very low or no own income might be unwilling to grant access to additional information on their actual or previous labour market success, or lack thereof. Existing results mainly agree that linkage consent is positively related to the respondent’s income. This is only contradicted by Jenkins et al. (2006) who find a higher willingness to consent among respondents that were eligible for means-tested benefits, which by definition implies that they have a low income or none at all.

Instead of or in addition to current income, some existing studies include the respondent’s wealth or monetary assets in their analyses. The implications for consent are different from those of actual income as the accumulated wealth may be considered as sensitive if not secret. The hypothesised influence is negative, which is corroborated by the results of Gustman and Steinmeier (1999), Haider and Solon (2000) and Olson (1999). As the ALWA questionnaire does not consider the wealth of the respondent or her household, I use the degree of participation in high-cultural activities as a proxy measure. This is calculated by means of principal component analysis.
I argue that activities such as visits to theatres or operas or the number of books in the household are strongly correlated with wealth as they make considerable monetary resources necessary.6 I therefore expect people with a high score in high-cultural activity to be less willing to consent to record linkage.

Finally, consent might depend on the relevance of the survey’s main topics or the data that are to be matched for the current situation of the respondent, including the labour market status. Respondents that are satisfied with their employment situation might be more willing to provide consent, as the information in the register data is favourable for them. On the other hand, one could argue that a study on employment histories has a higher relevance for unemployed respondents or those with benefit receipt of some kind, as they might expect the research based on the linked data to lead to political actions that improve their labour market chances. They should have higher incentives to provide the necessary information, even by giving consent to linkage with additional data sources. The majority of existing results corroborates this hypothesis, the exceptions being the studies from Beste (2011) and Hartmann and Krug (2009), which are both related to Germany. As one of the main topics of the ALWA survey was educational activities, a similar reasoning applies for people currently in formal education. They should have a strong interest in the survey, which would imply high consent rates among these respondents.

6 One might argue that these activities are also related to the actual income. However, as the actual income is directly controlled for by information on the personal net income and indirectly by the educational level, the variable on high-cultural activities should only capture the influence of previous earnings and existing wealth. A test on Cramer’s V indeed shows that there is only a weak correlation between high-cultural activities and the personal net income.
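Footnote 6 refers to a test on Cramér's V for the association between high-cultural activities and personal net income. As a rough illustration of how such a statistic is obtained from a cross-tabulation, the following sketch computes it from the chi-square statistic. The categories and counts are made up for the example and are not taken from the ALWA data; the function assumes that no row or column of the table is empty.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    Assumes every row total and column total is positive.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical counts: rows = low/medium/high cultural activity score,
# columns = low/medium/high personal net income.
example = [
    [120, 100, 80],
    [100, 100, 100],
    [80, 100, 120],
]
print(round(cramers_v(example), 3))
```

The statistic ranges from 0 (no association) to 1 (perfect association), so values close to 0 correspond to the weak relationship described in the footnote.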

2.3.1.2. Interviewer characteristics

A successful interaction between interviewer and respondent is crucial to achieving cooperation by the respondent. Thus, consent should be strongly influenced by characteristics of the interviewer. The time-invariant socio-demographic characteristics included in the analysis were provided by the survey institute, whereas variables that may change between different interviews of a single interviewer are computed from paradata.

The sex of the interviewer in itself does not necessarily lend itself to a specific hypothesis. This changes when it is considered in interaction with the sex of the respondent, which so far has not been done in many studies. Hartmann and Krug (2009) include a dummy variable that indicates a male respondent who is interviewed by a younger female interviewer. The consent rate in this constellation does not significantly differ from that in others. One possible hypothesis states that an interviewer of the opposite sex might unconsciously be considered as a candidate for a romantic relationship. In that case, respondents’ answers might be influenced by considerations of social desirability, with the socially more desirable behaviour being the provision of consent. Consent would be more likely if the sex of the interviewer differs from that of the respondent. However, given that ALWA interviews were conducted as computer-assisted telephone interviews, the implied rationale of social desirability might prove irrelevant in the given study.

Experience of the interviewer, be it life or job experience, is expected to positively influence her success in creating cooperation of respondents. I therefore control for the interviewer’s age, the years of experience as an interviewer before the ALWA study and the number of ALWA interviews before the current interview. The interviewer-specific consent rate up to the current interview measures the interviewer’s prior success in achieving respondent cooperation. Previous studies have included similar experience-related variables, though never as comprehensively as in the work at hand. Beste (2011) and Hartmann and Krug (2009) corroborate a positive relationship of consent with the age of the interviewer, a relationship that is missing in the results of Sala et al. (2012). While Beste (2011) finds no significant relationship with the number of previous interviews, Sala et al. (2012) find a positive one.
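The two interviewer-level paradata variables mentioned above, the number of prior ALWA interviews and the interviewer-specific consent rate up to the interview in question, could be derived along the following lines. This is only a sketch: the record layout (keys `interviewer_id`, `start`, `consent`) is hypothetical, and leaving the rate undefined (`None`) for an interviewer's first interview is an assumption, since the text does not state how that case is handled.

```python
from collections import defaultdict


def add_interviewer_history(interviews):
    """Add prior interview count and prior consent rate to each interview.

    `interviews` is a list of dicts with the hypothetical keys
    'interviewer_id', 'start' (any sortable timestamp) and 'consent' (bool).
    """
    done = defaultdict(int)       # interviews completed so far, per interviewer
    consented = defaultdict(int)  # consents obtained so far, per interviewer
    result = []

    # Process interviews chronologically so that only earlier ones count.
    for row in sorted(interviews, key=lambda r: r["start"]):
        interviewer = row["interviewer_id"]
        prior_n = done[interviewer]
        prior_rate = consented[interviewer] / prior_n if prior_n else None
        result.append({**row,
                       "prior_interviews": prior_n,
                       "prior_consent_rate": prior_rate})
        # Update the running totals only after the current row is recorded.
        done[interviewer] += 1
        consented[interviewer] += int(row["consent"])
    return result


sample = [
    {"interviewer_id": "A", "start": 1, "consent": True},
    {"interviewer_id": "A", "start": 2, "consent": False},
    {"interviewer_id": "B", "start": 3, "consent": True},
    {"interviewer_id": "A", "start": 4, "consent": True},
]
for row in add_interviewer_history(sample):
    print(row["interviewer_id"], row["prior_interviews"], row["prior_consent_rate"])
```

Updating the totals after each row is recorded keeps the outcome of the current interview out of its own predictor, which matters because that outcome is the dependent variable.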

An interviewer’s qualification may be related to her rhetorical abilities, and may thereby influence her ability to convey the importance of record linkage and to convince respondents to cooperate. Linkage consent should thus be positively related to the educational level of the interviewer, which is measured by dummy variables regarding general schooling and training certificates. Existing studies find inconclusive results, ranging from significantly negative (Beste, 2011) through non-existent (Sala et al., 2012) to significantly positive (Hartmann and Krug, 2009).

To consider the degree of similarity between interviewer and respondent, a number of interaction variables are added in an alternative model. Age of the interviewer is included in relation to the age of the respondent to take into consideration that strong differences may have adverse effects on cooperation. Dummy variables indicate whether the interviewer is more than ten years younger or more than ten years older than the respondent, respectively. Differences in general schooling levels are also measured by dummy variables. Potential interactions between the sex of the respondent and that of the interviewer will be examined by estimating separate models for female and male respondents.
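The similarity indicators described above can be illustrated with a short sketch. The ten-year age thresholds follow the text; the variable names and the coding of schooling as an ordinal integer (higher meaning a higher general schooling level) are assumptions made for the example, as the exact coding is not given here.

```python
def similarity_dummies(resp_age, int_age, resp_school, int_school):
    """Return 0/1 indicators comparing the interviewer with the respondent.

    Schooling levels are hypothetical ordinal codes (higher = higher level).
    """
    return {
        # Interviewer more than ten years younger than the respondent.
        "int_10y_younger": int(resp_age - int_age > 10),
        # Interviewer more than ten years older than the respondent.
        "int_10y_older": int(int_age - resp_age > 10),
        # Interviewer has a lower / higher schooling level than the respondent.
        "int_lower_school": int(int_school < resp_school),
        "int_higher_school": int(int_school > resp_school),
    }


print(similarity_dummies(resp_age=50, int_age=35, resp_school=2, int_school=3))
```

With this coding, the reference category in each pair of dummies is an interviewer of similar age (within ten years) and of the same schooling level.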

2.3.1.3. Interview situation

Characteristics of the survey in general are important, such as its topic, the nature and amount of the information that is going to be matched from other data sources, the context of the question of consent in the course of the interview, the purported use of the combined data or the client of the survey institute. This enumeration is far from exhaustive, but it shows that findings of other studies with similar topics might not be applicable to different survey contexts. These characteristics are specific to each study and their respective influence on linkage consent cannot be examined on the basis of a single survey.

Several aspects of a specific interview situation might also influence the willingness of a person to consent to record linkage. The duration of the interview until the question for consent is asked may affect the willingness to cooperate in different ways. On the one hand, the respondent might be dissatisfied with the length of the interview so far, in which case a long interview duration might increase the reluctance to give consent to linkage. On the other hand, the willingness of the respondent might increase with duration, as she considers the length of the conversation so far as a sign of the high importance of her answers. In that case, she would be more willing to consent in order to give more meaning to what she has already reported. Existing results on the influence of the elapsed interview duration are as inconclusive as the hypothetical expectations. Hartmann and Krug (2009) find no significant effects, whereas Jenkins et al. (2006) find a positive relationship with the duration of the
previous interview with a given respondent. To allow a detailed analysis of this topic, I consider the elapsed duration of the interview up to the consent question, measured to the minute. The question is asked directly after the employment-related longitudinal questionnaire module. Before that, the longitudinal modules on the residential, general schooling and training histories as well as on times of mandatory military or civilian service had already been finished.

The relationship between problems or disturbances during the interview and linkage consent is not clear from the outset. Problems may be caused by a general or growing dissatisfaction of the respondent over the course of the interview, which would also lead to a lower probability of consent. They might also be due to external sources of disturbance that are not related to the cooperation of the respondent. There would thus be no influence on the provision of consent. The evidence on this issue so far is rather scarce, which might be due to a lack of paradata. Jenkins et al. (2006) report that problems are negatively related to consent probability, though they only measure problems during the previous interview with the given respondent instead of the current one. I examine this issue by including indicators of comprehension problems on the part of the respondent, disturbances or other unspecified problems during the interview. They are noted by the interviewer after the end of the actual interview. These variables do not provide information on the timing of any problems, so they may have happened before or after the question for linkage consent.

2.3.2. Determinants of record linkage success

Hypotheses so far were based on factors that influence the respondent’s decision regarding consent to record linkage. When examining linkage success these considerations no longer apply. The following hypotheses are related to the procedure of identifying and linking records of a given person in different data sources. Two aspects are important in this context. The first is the availability and validity of identifier variables common to both data sources. Since a valid address is necessary to contact potential respondents before and during the survey, the personal information available to the survey institute is to be considered rather valid. This is not necessarily true for all address information in the administrative data. Given that record linkage in other surveys with a labour market context is usually done by means of unique identifiers like the Social Security Number, the results from most of the studies mentioned above are not informative for the following hypotheses. Second, respondents can only be found if they have ever experienced labour market states that are registered in any of the sources of administrative data. Otherwise, their names and addresses will not show up in the register address data. Both aspects are measurable to some extent in the survey and field information of the respondents.


Labour market related events such as employment, registered unemployment, job search or participation in active labour market measures usually lead to the registration of the address of the person in question. All characteristics influencing the labour market status of the respondent during or before the interview thus indirectly influence matching success. For instance, the older a respondent, the more likely she is to have entered the labour market at some time. This makes it more likely for her to be represented in the administrative employment data and therefore also in the address data. Only respondents that are near the retirement age might be less likely to be recorded in the register data, as the Federal Employment Agency usually no longer gathers information on people once they are retired. This is in line with the results from Beste (2011), who finds an inverse U-shaped relationship of age and linkage success. Since the age range of ALWA ends at 52 years, early retirement should play no major role in this analysis.

The nationality of the respondent might influence linkage success for several reasons, though the direction of the influence is not clear from the outset. On the one hand, foreign names are more likely to be misunderstood or misspelled during the survey process or even in the offices of the employment agencies; foreigners or refugees might sometimes provide inaccurate birth dates to the authorities, either because they do not know the exact date or because they refrain from revealing it to authorities or employers. On the other hand, precisely because names that are uncommon in Germany have a higher risk of being misspelled, either the person providing the name or the person asking for it might be more inclined to have the name spelled letter by letter during its registration. Being born in East Germany should not influence linkage success as names from native citizens of East and West Germany ordinarily do not differ much.

The employment status of the respondent at the time of interview or during the months before is important, as it determines whether and what kind of administrative records exist for the year 2007. People who are or have recently been unemployed should have accurate address information in the administrative records, because it is necessary to mail them job offers or information on benefit receipt. This is not the case for people that are out of the labour force but not registered as unemployed in the administrative data. Respondents in dependent employment should be found successfully, as their employers are obliged to report addresses or address changes of their employees to the social security authorities. Self-employed respondents and civil servants, who usually do not contribute to the social security system, should be found less successfully. People currently in formal education are also not covered by the data of the Federal Employment Agency, which should make them harder to find during the linkage procedure. However, this only applies to those in general
schools or higher education, as people in dual vocational training are registered by their training company. To sum up, the highest success in record linkage should be found for unemployed or dependently employed respondents; the lowest success should be found for self-employed respondents, civil servants and people outside the labour force, including those in formal education. The results from Beste (2011) support these hypotheses.

If the educational level exhibits any influence on linkage success, it should be indirectly through its impact on the labour market status, which is controlled for by the respective variables. The same applies to measures of cognitive ability, as they may also positively influence labour market attachment. However, as both unemployed respondents and those in dependent employment should be equally likely to be found by record linkage, the relationship between linkage success and the human capital variables is not clear from the outset.

Dependent employees are registered in administrative data regardless of their level of income, as long as this income stems from legal employment. All income groups should thus be represented equally well in the linked data. As self-employed respondents are not included in the register data, and as it can be assumed that they are more likely to be part of the upper income brackets,7 these brackets are likely to be linked with less success than the lower income groups. This is also in line with Beste (2011).

Apart from personal characteristics of the respondents, aspects of the variables that are used for the actual linkage may be included as well; these are not relevant at the stage of consent. Beste (2011) argues that names that occur only once in the whole survey address list either are very likely to be misspellings or are so rare and thus unfamiliar that there is a good chance that they were spelled incorrectly in the address data. This would reduce the likelihood of a respondent being found in the administrative data, which is corroborated by the results of Beste (2011).
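An indicator for names that occur only once in an address list, in the spirit of the argument attributed to Beste (2011), could be built as in the following sketch. The function name, the normalisation step (trimming whitespace and ignoring case) and the example surnames are assumptions for illustration; the source does not describe how such a variable was actually constructed.

```python
from collections import Counter


def flag_unique_names(names):
    """Return one 0/1 flag per name: 1 if the name occurs only once.

    Names are compared after a light normalisation (assumed here, not
    documented in the source): surrounding whitespace is removed and
    letter case is ignored.
    """
    normalised = [n.strip().casefold() for n in names]
    counts = Counter(normalised)
    return [int(counts[n] == 1) for n in normalised]


# Hypothetical address list; 'Schmitd' stands in for a likely misspelling.
surnames = ["Müller", "Schmidt", "müller ", "Schmitd", "Schmidt"]
flags = flag_unique_names(surnames)
```

Only the fourth entry is flagged, which illustrates why a singleton name may signal either a genuinely rare name or a spelling error.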

2.4. Selectivity analyses

2.4.1. Descriptive results

To get a first impression of possible selectivity of consent and matching success rates in ALWA, a descriptive analysis is in order. While multivariate analyses in Section 2.4.2 allow ceteris-paribus statements regarding the influence of single variables, the following descriptive results are informative in terms of how certain groups from the overall survey population are represented in the data set at each

7 This is corroborated by the correlation between reported income and the labour market status. Self-employed are over-represented among the two highest income classes.


step. Table 2.4 compares those rates over different subgroups of respondents.8 The p-values resulting from Pearson χ2-tests indicate significant differences between groups. Consent rates are computed based on all German language interviews, whereas match rates are computed based on consenting respondents only. That way, any selectivity that may arise from the second stage onward can be distinguished more clearly from a possible consent bias.

As the process of linkage itself can be subdivided into the different stages of deterministic, probabilistic and manual record linkage, their resulting matches may yield different selectivity. The respective results for these stages are depicted in separate columns to discern whether subsequent steps of the linkage process introduce additional selectivity compared to the previous one or vice versa. The second column shows deterministic match rates by subgroups; the third column shows rates of successful matches by either deterministic or probabilistic record linkage; the final column shows all matches by adding the manual matches as well.

Overall, 92% of the German language interview respondents gave consent. Among those, 53% could be matched to register address records by deterministic record linkage and 80% either by deterministic or probabilistic record linkage. This figure increases to 86% when including the manual matches as well. These values slightly differ from those in Table 2.2 because at this stage only those cases are included that are part of the estimation sample used in Section 2.4.2.

The results for subgroups show that the youngest respondents and those born in East Germany are significantly over-represented among consenters as well as at all stages of the matching process. The gender structure is similar to that of the overall survey population up to the step of probabilistic linkage. Manual matches have added more matches for female relative to male respondents. Although Germans and foreigners are about equally represented among consenters and deterministic matches, foreigners show significantly higher probabilistic linkage rates than Germans, with 85% compared to 80%. This is even more pronounced after adding the manual matches. Respondents with a native language other than German are matched with significantly higher rates than German native speakers. The relative scarcity of foreign names in the address data seems to have led to a higher manual linkage success for foreigners compared to that for native Germans. No selectivity of consent is found when considering the qualification, but some significant structural differences emerge at the stages of matching success.

When considering the labour market status of the respondent, the picture is even less clear. Self-employed and unemployed are the least likely to provide consent (both with 89%), but whereas self-employed are still weakly represented among all linked

8 Most tables in this document were produced using estout.ado (see Jann, 2005, 2007).


Table 2.4.: Consent and linkage success rate by subgroups, expressed as percentages

                                     consenters  deterministic      determ. +            all
                                                       matches  probabilistic        matches
                                                                      matches

Total                                      92.1           53.1           80.4           86.4

Aged 18-24                                 93.9           59.9           83.9           90.0
     25-34                                 91.6           58.3           83.6           90.7
     35-44                                 92.0           50.4           79.4           85.5
     45-52                                 91.5           48.8           77.5           82.7
                                        (0.014)        (0.000)        (0.000)        (0.000)

Female                                     92.1           52.5           80.6           87.1
Male                                       92.2           53.7           80.3           85.7
                                        (0.879)        (0.243)        (0.726)        (0.050)

German nationality                         92.1           53.0           80.3           86.2
Foreign nationality                        90.9           55.3           87.4           95.5
                                        (0.489)        (0.532)        (0.012)        (0.000)

Native language not German                 92.1           65.8           85.9           93.8
Native language German                     92.1           52.6           80.2           86.2
                                        (0.991)        (0.000)        (0.015)        (0.000)

Born in West Germany                       91.7           52.3           80.0           85.8
Born in East Germany                       93.8           56.5           82.4           89.1
                                        (0.003)        (0.002)        (0.022)        (0.000)

No training                                93.3           57.9           81.7           87.1
Training + lower secondary                 92.0           58.9           86.3           92.4
Training + intermediate                    91.8           57.6           84.0           90.1
Training + upper secondary                 93.1           53.5           83.1           88.9
Master craftsman                           92.5           48.0           73.1           79.0
Higher Education                           91.0           41.8           72.6           79.0
                                        (0.106)        (0.000)        (0.000)        (0.000)

Self employed                              89.2           42.4           64.6           72.6
Freelancer                                 93.8           52.9           82.0           87.4
In dependent employment                    92.1           56.3           87.3           92.9
Civil servant                              93.3           13.1           21.7           25.7
Unemployed                                 89.3           68.4           86.9           94.2
In formal education                        94.7           56.6           79.8           84.9
Other activity                             92.3           47.5           74.0           82.4
                                        (0.000)        (0.000)        (0.000)        (0.000)

Personal net income <500EUR                93.2           54.7           80.6           86.5
  500-999EUR                               92.7           59.4           85.0           92.5
  1000-1499EUR                             92.6           56.4           83.8           90.4
  1500-1999EUR                             92.8           57.2           83.5           88.9
  2000-2999EUR                             92.9           46.1           76.3           81.3
  More than 3000EUR                        92.5           40.2           69.3           74.4
Income refused                             63.6           47.3           81.3           86.7
                                        (0.000)        (0.000)        (0.000)        (0.000)

Observations                              9,790          9,024          9,024          9,024

Source: ALWA, own calculations. Notes: p-values of Pearson χ2-test in parentheses. Percentages in columns related to linkage success (col. 2-4) are based on consenters.


respondents (73%), unemployed are the most successfully matched respondents at any stage of linkage (68%, 87% and 94%). This implies that the selective processes at different stages counteract each other for some groups of respondents while they amplify each other for other groups.

The above findings do not depend on whether the Pearson χ2-tests are only based on consenters or on all German language interview participants. Table B.2 in the appendix replicates Table 2.4 but considers all relevant CATI participants. This allows a direct comparison of the distribution of groups of respondents in the linked data with their distribution in the survey data alone. The only noteworthy difference from the results in Table 2.4 is that people that refuse income information during the interview are strongly under-represented at all stages of the matching process. This is only driven by the reluctance of these respondents to provide consent. The refusal has no influence on the matching success itself, as shown at the bottom of Table 2.4.

Table 2.5 compares mean values of respondent characteristics stemming from paradata, characteristics of the interviewer and the interview situation for non-consenting and consenting respondents. The last two columns show the differences of these means and the test statistics of the corresponding t-tests. They indicate significant differences between both groups in several variables. Most results are in line with the hypotheses stated in Section 2.3. The most notable deviation from the hypotheses is related to the number of previous interviews within the ALWA survey. Contrary to expectations, interviewers with more experience within the specific survey are not more likely to achieve consent. The expected positive relationship between interviewers’ education and consent is not corroborated by the descriptive results. Hypothetical considerations regarding privacy concerns or trust of the respondents find support in the results. Consenters exhibit fewer refused answers or recall problems than non-consenters. Consent to record linkage is positively related to the general attitude of the respondent towards the survey, as people who consented to record linkage are also more willing to participate in subsequent cognitive paper-and-pencil tests or in later panel waves. Problems during the interviews coincide with a lower probability of consent. However, a direct relationship between these matters cannot be examined in more detail, as interviewer-reported problems could also have occurred sometime after the consent question. The timing of any problems or disturbances during a given interview is not given in the paradata.
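Because the match rates in Table 2.4 are conditional on consent, the share of all respondents who end up in the linked data is approximately the product of the consent rate and the final match rate. The sketch below works through this arithmetic for a few rows of Table 2.4. It is a back-of-the-envelope illustration only and does not reproduce Table B.2, whose exact values may differ; the function name is hypothetical.

```python
# Rates copied from Table 2.4 (in percent): consent rate among all
# respondents and final match rate ("all matches") among consenters.
table_2_4 = {
    "Total": (92.1, 86.4),
    "Income refused": (63.6, 86.7),
    "Unemployed": (89.3, 94.2),
    "Self employed": (89.2, 72.6),
}


def linked_share(consent_pct, match_pct):
    """Approximate percentage of all respondents present in the linked data."""
    return consent_pct * match_pct / 100


for group, (consent, match) in table_2_4.items():
    print(f"{group:15s} {linked_share(consent, match):5.1f}%")
```

The product is roughly 80% overall but only about 55% for respondents who refused the income question, although their match rate conditional on consent is close to average. This is the pattern described above: their under-representation in the linked data stems from the consent stage, not from the matching stage.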

2.4.2. Multivariate results

The descriptive analysis of possible selectivity of consent and linkage success is now complemented by multivariate analyses. As the research questions aim at the determinants of consent to and actual success of record linkage, separate probit


Table 2.5.: Mean characteristics by consent status, t-test of difference

                                            no consent   consent  difference            t

Consent to follow-up survey (d)                  0.778     0.949       0.171***    18.873
Consent to cognitive tests (d)                   0.332     0.585       0.254***    13.776
Share of refused answers                         0.175     0.040      −0.135***   −14.595
Share of ’don’t know’                            0.937     0.598      −0.339***    −7.377
Int: male (d)                                    0.624     0.556      −0.068***    −3.664
Int: aged up to 24 (d)                           0.183     0.119      −0.063***    −5.125
Int: aged 25-34 (d)                              0.180     0.172      −0.008       −0.569
Int: aged 35-44 (d)                              0.209     0.199      −0.010       −0.634
Int: aged 45-54 (d)                              0.299     0.364       0.065***     3.624
Int: aged 55 and more (d)                        0.130     0.145       0.016        1.195
Int: no training (d)                             0.150     0.171       0.021        1.497
Int: training, below upp. secondary (d)          0.140     0.162       0.022        1.585
Int: training, upper secondary (d)               0.149     0.165       0.016        1.125
Int: higher education (d)                        0.354     0.333      −0.021       −1.191
Int: education unknown (d)                       0.207     0.170      −0.037***    −2.636
Experience as interviewer (years)                1.684     1.828       0.144***     3.898
No. of previous interviews 0-25 (d)              0.325     0.353       0.028        1.567
No. of previous interviews 26-50 (d)             0.171     0.164      −0.007       −0.470
No. of previous interviews 51-100 (d)            0.234     0.219      −0.015       −0.965
No. of previous interviews >100 (d)              0.269     0.263      −0.007       −0.394
Int: consent rate in prior interviews            0.888     0.904       0.016***     2.813
Interview on weekend (d)                         0.194     0.196       0.001        0.095
Duration before consent quest. (min.)           26.026    25.461      −0.565       −1.109
Disturbance during int. (d)                      0.071     0.058      −0.013       −1.471
Comprehension problems during int. (d)           0.060     0.050      −0.010       −1.178
Other problems during int. (d)                   0.124     0.081      −0.043***    −4.168

Source: ALWA, own unweighted calculations. Notes: 9,790 observations. ***/ **/ * indicates significant difference at the 1/ 5/ 10% level. d denotes dummy variable.
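For the dummy variables in Table 2.5, the t statistics can be approximated from the reported means alone, because the variance of a 0/1 variable follows from its mean. The sketch below assumes a two-sample t-test with pooled variance and takes the group sizes from the observation counts in Tables 2.4 and 2.5 (9,024 consenters among 9,790 interviews). Both are assumptions, since the text does not state which variant of the test was used.

```python
import math


def pooled_t_for_dummy(p0, n0, p1, n1):
    """Two-sample t statistic (equal variances) for the difference p1 - p0
    between two group means of a 0/1 variable."""
    # For a dummy variable, the sum of squared deviations from the group
    # mean equals n * p * (1 - p).
    ss = n0 * p0 * (1 - p0) + n1 * p1 * (1 - p1)
    pooled_var = ss / (n0 + n1 - 2)
    se = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    return (p1 - p0) / se


N_CONSENT = 9024            # consenters (Observations row of Table 2.4)
N_NO_CONSENT = 9790 - 9024  # assumption: remainder of the 9,790 interviews

# Means of "Consent to follow-up survey" and "Int: male" from Table 2.5.
t_followup = pooled_t_for_dummy(0.778, N_NO_CONSENT, 0.949, N_CONSENT)
t_int_male = pooled_t_for_dummy(0.624, N_NO_CONSENT, 0.556, N_CONSENT)
```

Under these assumptions the two statistics come out close to the reported values of 18.873 and −3.664. Small deviations are to be expected because the table reports means rounded to three decimals.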


regressions with different dichotomous dependent variables are estimated. Since the two stages of the process happen sequentially and consent on the first stage is a prerequisite for success on the second stage, any models considering the linkage success as the dependent variable will be based on the sample of consenters rather than the whole survey population.

To account for the potential influence of unobserved interviewer characteristics that are common over all interviews of single interviewers, cluster-robust standard errors are computed for models with consent as dependent variable. Without taking this into account, standard errors could be underestimated and statistical inference could be misleading (Moulton, 1990). As interviewer characteristics are no longer relevant at the stage of record linkage, the variance-covariance matrices in models concerning linkage success are not modified to account for interviewer-clustering.9 All models are estimated without survey weights. To infer whether this decision influences the results, the main specifications are re-estimated including weights in the sensitivity analyses.
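The effect of clustering on standard errors can be illustrated in the simplest possible setting, the standard error of a consent rate. The sketch below is not the estimator used for the probit models. It shows the cluster-robust variance for an intercept-only linear model, with hypothetical data and a simple G/(G−1) small-sample factor chosen for illustration.

```python
import math
from collections import defaultdict


def naive_and_cluster_se(y, cluster):
    """Standard error of the mean of y, ignoring and respecting clusters.

    The cluster-robust version sums the residuals within each cluster
    before squaring, so correlated outcomes of one interviewer no longer
    count as independent pieces of information.
    """
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]

    naive_var = sum(r * r for r in resid) / (n - 1) / n

    sums = defaultdict(float)
    for r, c in zip(resid, cluster):
        sums[c] += r
    g = len(sums)
    # Small-sample factor G / (G - 1); other corrections are in use as well.
    cluster_var = g / (g - 1) * sum(s * s for s in sums.values()) / n ** 2
    return math.sqrt(naive_var), math.sqrt(cluster_var)


# Hypothetical data: three interviewers whose respondents' consent
# decisions are strongly grouped by interviewer.
consent = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
interviewer = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
naive_se, cluster_se = naive_and_cluster_se(consent, interviewer)
```

In this toy example the cluster-robust standard error is clearly larger than the naive one, which is the direction of bias that Moulton (1990) warns about when outcomes within groups are positively correlated.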

2.4.2.1. Results on determinants of consent

The results of different specifications of probit regressions with consent to record linkage as the dependent variable are shown in Table 2.6. Model 1 includes all potential determinants of consent mentioned in Section 2.3 excluding respondent-interviewer interactions. These interactions are controlled for in Model 2. The results for most of the control variables common to both models are unaffected by the additional variables in Model 2. Deviations from that pattern will be interpreted specifically below.

Previous results on the influence of the respondent’s age have been inconclusive. In the present analysis there is a negative but only weakly significant relationship between consent and some age classes above the reference category of the 18 to 24 year-olds. Wald tests do not reject the null hypothesis that the coefficients of the age classes are jointly zero (χ2(3): 3.76, p: 0.289 for Model 1 and χ2(3): 4.83, p: 0.185 for Model 2).

In line with the inconclusive hypotheses on foreign nationality or native language, these characteristics show no significant influence on a respondent’s consent decision. This result contradicts those from all existing studies, which unanimously find a negative impact of a foreign or ethnic minority background. I argue that this might be a result of omitted variable bias in some of those studies. They might have failed to control for important respondent characteristics that are often correlated

9 The interviewer effects on the first stage have not been analysed within a multilevel framework to keep the modelling strategy consistent across all stages examined in this paper.


Table 2.6.: Determinants of consent to record linkage, probit regression models with and without respondent-interviewer interactions, respectively

                                           without interaction    with interaction
                                                           (1)                 (2)

Male (d)                                     −0.009    (0.048)   −0.003    (0.049)
Aged 25-34 (d)                               −0.167*   (0.093)   −0.183*   (0.095)
Aged 35-44 (d)                               −0.095    (0.095)   −0.166    (0.103)
Aged 45-52 (d)                               −0.129    (0.101)   −0.231**  (0.116)
Foreign nationality (d)                      −0.137    (0.116)   −0.140    (0.116)
Native language German (d)                   −0.182    (0.121)   −0.180    (0.121)
Born in East Germany (d)                      0.176*** (0.062)    0.172*** (0.064)
Partner in household (d)                      0.081    (0.049)    0.083*   (0.049)
Children in household (d)                    −0.009    (0.055)   −0.013    (0.056)
Training + lower secondary (d)                0.044    (0.097)    0.017    (0.099)
Training + intermediate (d)                  −0.001    (0.082)   −0.016    (0.086)
Training + upper secondary (d)                0.108    (0.093)    0.128    (0.095)
Master craftsman (d)                          0.063    (0.120)    0.056    (0.121)
Higher Education (d)                          0.015    (0.087)    0.033    (0.087)
Prose literacy score                         −0.031    (0.023)   −0.030    (0.023)
Document literacy score                      −0.032    (0.020)   −0.028    (0.020)
Numeracy score                               −0.036    (0.023)   −0.035    (0.023)
High-cultural activity                       −0.076*** (0.019)   −0.074*** (0.019)
Self employed (d)                             0.127    (0.113)    0.127    (0.114)
Freelancer (d)                                0.407*** (0.142)    0.403*** (0.144)
In dependent employment (d)                   0.275*** (0.086)    0.277*** (0.086)
Civil servant (d)                             0.394*** (0.132)    0.401*** (0.133)
In formal education (d)                       0.388*** (0.111)    0.395*** (0.111)
Other activity (d)                            0.283*** (0.106)    0.290*** (0.106)
Personal net income <500EUR (d)               0.010    (0.078)    0.009    (0.078)
  500-999EUR (d)                             −0.066    (0.074)   −0.067    (0.074)
  1000-1499EUR (d)                           −0.039    (0.067)   −0.038    (0.067)
  2000-2999EUR (d)                            0.002    (0.075)    0.003    (0.075)
  More than 3000EUR (d)                       0.062    (0.080)    0.063    (0.080)
Income refused (d)                           −0.647*** (0.101)   −0.643*** (0.101)
Consent to follow-up survey (d)               0.650*** (0.069)    0.649*** (0.069)
Consent to cognitive tests (d)                0.412*** (0.049)    0.413*** (0.049)
Share of refused answers                     −0.294*** (0.075)   −0.295*** (0.075)
Share of ’don’t know’                        −0.051*** (0.015)   −0.051*** (0.015)
Int: male (d)                                −0.121**  (0.059)   −0.122**  (0.059)
Int: aged 25-34 (d)                           0.112    (0.109)    0.107    (0.101)
Int: aged 35-44 (d)                           0.123    (0.122)    0.176    (0.138)
Int: aged 45-54 (d)                           0.192**  (0.097)    0.275**  (0.122)
Int: aged 55 and more (d)                     0.188    (0.120)    0.332**  (0.152)
Int: training, below upp. secondary (d)       0.016    (0.106)    0.071    (0.116)
Int: training, upper secondary (d)            0.061    (0.104)    0.060    (0.107)
Int: higher education (d)                    −0.118    (0.092)   −0.114    (0.094)
Int: education unknown (d)                   −0.065    (0.114)   −0.049    (0.126)
Experience as interviewer (years)             0.075*** (0.029)    0.075*** (0.028)
No. of previous interviews 26-50 (d)         −0.164**  (0.064)   −0.164**  (0.064)
No. of previous interviews 51-100 (d)        −0.139**  (0.068)   −0.142**  (0.065)
No. of previous interviews >100 (d)          −0.152**  (0.067)   −0.151**  (0.065)
Int: consent rate in prior interviews         0.152    (0.141)    0.141    (0.140)
Int: different sex than respondent (d)                            0.029    (0.039)
Int. at least 10 years younger (d)                                0.032    (0.080)
Int. at least 10 years older (d)                                 −0.144*   (0.078)
Same schooling level (d)                                          0.080    (0.121)
Higher schooling than respondent (d)                              0.120    (0.140)
Unknown relation of schooling levels (d)                          0.077    (0.191)
Interview on weekend (d)                      0.055    (0.059)    0.053    (0.058)
Duration before consent quest. (min.)        −0.002    (0.002)   −0.002    (0.002)
Disturbance during int. (d)                  −0.002    (0.076)   −0.002    (0.076)
Comprehension problems during int. (d)        0.001    (0.080)    0.002    (0.080)
Other problems during int. (d)               −0.197*** (0.067)   −0.201*** (0.067)
Constant                                      0.534**  (0.257)    0.472    (0.310)

Wald-statistic (χ2) [p-value]                 1,295 [0.000]       1,416 [0.000]
AIC                                           4,896               4,902
pseudo R2                                     0.114               0.115
Observations                                  9,790               9,790

Source: ALWA, own unweighted calculations. Notes: Robust standard errors in parentheses based on 210 interviewers as clusters. ***/ **/ * indicates significance at the 1/ 5/ 10% level. Reference categories in both specifications: respondent aged 18-24, no training, unemployed, net household income of 1500-1999 EUR, interviewer aged up to 24, no training, 0-25 previous ALWA interviews. Additional reference categories in interacted specification: interviewer aged the same (+/-10 years) and same schooling level as respondent. d denotes dummy variable.

with nationality or ethnicity, such as the native language or labour market and educational success.

The hypothetical considerations on cognitive skills also did not allow a clear-cut prediction on their influence on consent, and results indeed show no significant impact of literacy or numeracy skills. The same is true for the educational level of the respondent. Contrary to the hypothesis of high consent among the well-educated respondents, they do not seem to show greater interest in the goals or success of a survey strongly focusing on educational activities. Alternatively, a lack of education might have led to low cognitive sophistication in terms of reporting and dating past events. In that case, respondents could have been inclined to make up for insufficient recall by allowing the register data to be linked. These two contradicting processes might have counterbalanced each other over the whole survey population.

In contradiction to existing results and to my hypothesis, reported personal net income is not positively related to linkage consent. However, caution is in order
when interpreting this result as there is a strong negative relationship between consent and the refusal of income information. The possibility that this refusal is not evenly distributed over all income classes cannot be ruled out with the data at hand. However, descriptive analyses of the distribution of educational levels (a common proxy for earnings potential) among those who refuse the answer on personal income show no clear relationship. A small Cramér’s V of 0.05 corroborates that finding. Moreover, Sakshaug and Kreuter (2012) were able to examine this potential bias in more depth, as they observe the income based on administrative data of all respondents of the PASS survey,10 regardless of whether they provided consent to record linkage. They do not find a significant bias when comparing incomes between all respondents of the PASS survey and those respondents who consented to record linkage.

Another result potentially related to privacy concerns is that of a negative relationship between consent and the degree of participation in high-cultural activities, at least when one is willing to accept that the latter is a valid proxy variable for monetary wealth. In that case, the result is in line with both the relevant hypothesis and the existing literature.

In contrast to the respective hypothesis, respondents born in East Germany are significantly more likely to consent to the linkage of their data than those born in West Germany. This is mostly in line with the results of other German studies, but can hardly be explained theoretically. It is possible though that, after being obliged to cooperate with government agencies and their representatives for several decades, people born in the former socialist GDR are too accustomed to this situation to simply disregard it after a few years. The ALWA survey may have received above-average cooperation from respondents born in the GDR simply because the advance letter and the questionnaire stated that the IAB is the research institute of the Federal Employment Agency, which is a well-known public body.

Finally, when considering the respondent’s labour market status, the group that is the least likely to provide linkage consent is the unemployed. The hypothesis that unemployed respondents are reluctant to disclose more details of their potentially unflattering employment history to the data producer or users seems to be sustained. Most previous studies did not obtain significant results on this matter, but whereas the results of Haider and Solon (2000) concur with mine, the results of Jenkins et al. (2006) on the influence of means-tested benefit receipt are contradicted.

Consent decreases with the share of refused answers, which hints at an underlying lack of trust that also fosters the refusal of linkage consent. Cooperation with regard

10 PASS is an acronym for the panel study “Labour Market and Social Security” of the IAB. See Trappmann et al. (2010) for an overview.


to other consent decisions during the interview correlates positively with linkage consent. This signifies a latent propensity of the respondent to cooperate with the interviewer. This matter will be examined in more detail in Section 2.4.2.3.

The results on interviewer characteristics first reveal that female interviewers fare better in terms of achieving consent than their male colleagues. Whether this holds for both male and female respondents will be examined in Section 2.4.2.3. Success in achieving linkage consent also increases with the age and the general survey experience of an interviewer but is unaffected by her educational level. The latter result is as unexpected as the finding that the likelihood of consent decreases with the number of interviews an interviewer has already conducted. This might imply that interviewers wear out over the course of a study. Variables indicating the interaction between respondent and interviewer are mostly insignificant. We learn, however, that it may have adverse effects on consent when the interviewer is much older than her interview partner.

Characteristics of the interview situation mostly show the expected outcomes. Most importantly, the inconclusive hypotheses regarding the elapsed duration of the interview before the consent question are mirrored in an insignificant result in the model. The burden of the retrospective interview up to the consent decision obviously was not so big as to discourage respondents from giving consent to record linkage.

2.4.2.2. Results on determinants of record linkage success

At the stage of the actual record linkage neither the interviewer nor the process of the interview plays any role for the outcome. Linkage success is only determined by characteristics of the respondent; only they influence how successfully the underlying string variables are standardised and how likely the respondents' addresses are found in the register address data. The model depicted in Table 2.7 thus only includes control variables regarding the respondent. They are identical to those included in Model 1 apart from variables that indicate the rareness of the respondent's name, which are only included in models related to linkage success.

As non-consenting respondents may not be linked to their register data, the following models are restricted to consenters, which is why the number of observations differs from that of previous models. That way, only selectivity that emerges at this stage is captured by the model. To examine differences in matching success between the subsequent steps of linkage described in Section 2.2.2, Model 3 uses an indicator for a deterministic match as the dependent variable, and Model 4 uses an indicator for a successful match by either deterministic or probabilistic record linkage. Finally, in Model 5 the dependent variable denotes successful matches by any of the methods, i.e. it also includes manual matches.
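The nesting of the three dependent variables can be illustrated with a short sketch. It assumes that, for each consenting respondent, it is known at which stage (if any) a match was found; the class and function names are hypothetical and not taken from the ALWA data documentation.

```python
from dataclasses import dataclass


@dataclass
class LinkageOutcome:
    """Hypothetical per-respondent record of which linkage stage produced a match."""
    deterministic: bool
    probabilistic: bool
    manual: bool


def dependent_variables(outcome: LinkageOutcome) -> tuple:
    """Return the 0/1 outcomes used in Models 3, 4 and 5 (in that order)."""
    # Model 3: matched at the deterministic stage only.
    y3 = int(outcome.deterministic)
    # Model 4: matched by deterministic or probabilistic linkage.
    y4 = int(outcome.deterministic or outcome.probabilistic)
    # Model 5: matched by any method, including manual review.
    y5 = int(outcome.deterministic or outcome.probabilistic or outcome.manual)
    return y3, y4, y5


# A respondent who was only found during manual review counts as a
# success in Model 5 but as a failure in Models 3 and 4.
example = dependent_variables(LinkageOutcome(False, False, True))
```

Because the indicators are nested, a success in Model 3 is by construction also a success in Models 4 and 5, so differences between the models reflect only the cases added at each later stage.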

38

Page 39: Essays on the measurement and analysis of educational and skill … · 2013-09-03 · Leerzeile Essays on the measurement and analysis of educational and skill inequalities Inaugural-Dissertation

2.4. Selectivity analyses

Table 2.7.: Determinants of record linkage success on all stages, separate probit regression models

                                  (3)                  (4)                  (5)
                                  deterministic        determ. +            all matches
                                  matches              probabilistic
                                                       matches

Male (d)                           0.049     (0.033)    0.027     (0.039)   −0.005     (0.044)
Aged 25-34 (d)                     0.015     (0.056)   −0.034     (0.068)   −0.170**   (0.084)
Aged 35-44 (d)                    −0.121**   (0.059)   −0.132*    (0.072)   −0.411***  (0.088)
Aged 45-52 (d)                    −0.122**   (0.060)   −0.179**   (0.073)   −0.512***  (0.089)
Foreign nationality (d)            0.030     (0.104)    0.200     (0.131)    0.378**   (0.179)
Native language German (d)        −0.517***  (0.089)   −0.061     (0.105)   −0.100     (0.133)
Born in East Germany (d)           0.001     (0.037)   −0.015     (0.044)    0.043     (0.051)
Partner in household (d)          −0.111***  (0.038)   −0.166***  (0.045)   −0.072     (0.052)
Children in household (d)          0.013     (0.037)    0.085**   (0.043)    0.124**   (0.049)
Training + lower secondary (d)     0.056     (0.063)    0.150**   (0.076)    0.349***  (0.091)
Training + intermediate (d)        0.030     (0.053)    0.084     (0.064)    0.200***  (0.074)
Training + upper secondary (d)    −0.042     (0.060)    0.089     (0.072)    0.195**   (0.083)
Master craftsman (d)              −0.074     (0.075)   −0.123     (0.086)   −0.035     (0.096)
Higher Education (d)              −0.179***  (0.059)   −0.029     (0.068)    0.126     (0.077)
Prose literacy score               0.003     (0.015)   −0.005     (0.018)   −0.023     (0.021)
Document literacy score            0.000     (0.015)   −0.034**   (0.017)   −0.024     (0.020)
Numeracy score                    −0.008     (0.014)    0.004     (0.017)   −0.004     (0.019)
High-cultural activity            −0.055***  (0.016)   −0.040**   (0.018)   −0.075***  (0.021)
Self employed (d)                 −0.490***  (0.083)   −0.578***  (0.094)   −0.721***  (0.115)
Freelancer (d)                    −0.289***  (0.096)   −0.125     (0.112)   −0.323**   (0.135)
In dependent employment (d)       −0.260***  (0.070)    0.082     (0.084)   −0.006     (0.107)
Civil servant (d)                 −1.416***  (0.110)   −1.750***  (0.113)   −1.989***  (0.130)
In formal education (d)           −0.412***  (0.082)   −0.364***  (0.096)   −0.678***  (0.120)
Other activity (d)                −0.496***  (0.080)   −0.443***  (0.092)   −0.625***  (0.114)
Personal net income <500 EUR (d)  −0.091     (0.056)   −0.054     (0.066)   −0.071     (0.076)
500-999 EUR (d)                    0.008     (0.053)    0.032     (0.064)    0.204***  (0.077)
1000-1499 EUR (d)                 −0.085*    (0.050)   −0.065     (0.060)   −0.030     (0.070)
2000-2999 EUR (d)                 −0.141***  (0.052)   −0.008     (0.062)   −0.037     (0.070)
More than 3000 EUR (d)            −0.182***  (0.061)   −0.168**   (0.069)   −0.238***  (0.076)
Income refused (d)                −0.226**   (0.111)   −0.032     (0.131)   −0.064     (0.146)
First name unique (d)             −0.859***  (0.067)   −0.216***  (0.070)   −0.055     (0.081)
Last name unique (d)              −0.001     (0.031)    0.028     (0.036)    0.028     (0.041)
Both parts of name unique (d)     −0.773***  (0.051)   −0.112**   (0.056)    0.094     (0.066)
Constant                           1.281***  (0.129)    1.299***  (0.153)    1.772***  (0.189)

Wald statistic (χ2) [p-value]      1,010 [0.000]        1,043 [0.000]        1,294 [0.000]
Pseudo R2                          0.081                0.117                0.180
Observations                       9,024                9,024                9,024

Source: ALWA, own unweighted calculations. Notes: Standard errors in parentheses. ***/**/* indicates significance at the 1/5/10% level. Reference category: respondent aged 18-24, no training, unemployed, net household income of 1500-1999 EUR, no part of name unique. d denotes dummy variable. Model 4 includes successful matches based on both deterministic and probabilistic linkage, Model 5 also includes manual matches.


In contrast to the hypothesis, linkage success is negatively related to the age of the respondent across all stages of linkage. The selectivity regarding age at the stage of consent is amplified at the stages of deterministic and probabilistic matches and even more so after adding manual matches. This might be due to more old-fashioned names among older respondents that make errors or missing information in either of the address sources more likely. However, this should be controlled for by variables directly considering the rareness of a respondent's name in the survey address data. Another explanation might be that respondents in the reference group of the 18 to 24-year-olds are predominantly found in dependent employment, in dual vocational training or in registered unemployment while older respondents could be more likely to have moved on to being self-employed or civil servants. In that case, the influence of the respondent's age could be confounded by that of her current labour market status, which might be due to incorrect or incomplete information on the latter.

A counterintuitive picture also emerges when considering the nationality and native language of the respondent. While there is no significant relationship between the nationality and deterministic or probabilistic linkage success, respondents with a nationality other than German are much more likely to be linked after including the manual matches in Model 5. Manual review of possible matches might have increased the coverage of people with names that are rarely used in Germany. The degree of conventionality of the respondent's name might also be related to her native language. Consequently, German native speakers are much less likely to be matched deterministically than respondents with another language background. The general result is that respondents with an immigrant background are more likely to be matched successfully across different stages of linkage. This is good news, given that they are under-represented among the ALWA participants compared to the general population (see Kleinert et al., 2012a).

Family related variables are also relevant for linkage success though not consistently across the different stages. Having a partner in one's household is negatively related to linkage in Models 3 and 4. This might indicate that respondents who are in partnerships are more likely to have a partner who is the sole bread-winner in the household. In these cases the respondent would be more likely to be out of the labour force and thus not registered in the address data of the Federal Employment Agency. An alternative explanation comes to mind when considering the fact that linkage success is no longer related to the existence of a partner in the household in Model 5. In some cases the manual comparison of potential matches obviously revealed that respondents had changed their last names due to marriage, and these cases were coded as manual matches based on the remaining identifiers.11 This shows that

11 See Antoni (2013b) for the decision rules that were applied during the stage of manual matching.


the additional step of manual matching is worthwhile particularly when the cultural knowledge of the reviewers involved allows them to consider reasons for deviations between address pairs that could not be considered by other linkage techniques. The existence of children in the household is positively related to linkage success, but only in Models 4 and 5. A possible explanation is that children increase household expenditures, which increases the necessity of own income. This applies all the more for lone parent households. This increased need for money makes dependent employment more likely.

While there is no clear relationship between linkage success and the educational level of the respondent in Models 3 and 4, this changes in Model 5. When including manual matches, respondents with a vocational training degree are more likely to be matched in either of the stages than respondents without any schooling or training degree in the reference group. As with other variable groups before, this might reflect a higher likelihood of the respective group of being in dependent employment rather than being self-employed or in full-time education.

The hypothesis on the influence of the labour market status is mainly corroborated by the results. Respondents who actually were in labour market states that are included in the data of the Federal Employment Agency, i.e. registered unemployment or dependent employment, are linked most successfully across the stages of linkage.

As expected, the highest income class shows the lowest likelihood of being matched, whereas all other groups show no systematic differences from the reference group. The most likely explanation is that self-employed respondents with above-average income misreported their actual employment status as dependent employment or that they have been classified incorrectly during the process of data preparation. Apart from the stage of deterministic matching, respondents who consented to record linkage but refused to answer the question on personal net income are not less likely to be matched than others.

The only variables that were included in the linkage success models but not in the consent models are dummy variables on the rareness of the respondent's name in the survey address data. These are highly significant in Models 3 and 4 but not in Model 5. Respondents with unique first names or for whom both parts of their names are unique are less likely to be matched deterministically or probabilistically. A unique last name alone does not significantly influence linkage success though that might be due to the low number of observations in this cell. In Model 5, none of these dummy variables is individually significant. Manual review of address pairs seems to be better able to deal with very rare or misspelled names than the purely technical steps of deterministic and probabilistic matching.

When comparing the determinants of success across the different stages of linkage,


the picture is ambivalent. The additional step of probabilistic record linkage did not counterbalance the influence of most of the control variables, but the two characteristics related to the immigrant background no longer significantly influence linkage success. The latter finding is corroborated by a Wald test, which shows that foreign nationality and a German native language are not jointly significant in Model 4 (χ2(2): 1.38, p: 0.503). The important message at this point is that probabilistic record linkage did not introduce additional selectivity for the linked respondents compared to the result of the deterministic linkage.
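The reported p-value of the joint Wald test can be checked by hand. The sketch below relies on the fact that the upper-tail probability of a χ2 distribution with two degrees of freedom has the closed form exp(−x/2); the function name is hypothetical, and only the test statistic of 1.38 is taken from the text.

```python
import math


def chi2_upper_tail_2df(statistic: float) -> float:
    """Upper-tail probability of a chi-squared distribution with 2 degrees of freedom.

    For 2 degrees of freedom the survival function reduces to exp(-x / 2),
    so no statistics library is needed.
    """
    if statistic < 0:
        raise ValueError("A chi-squared statistic cannot be negative.")
    return math.exp(-statistic / 2.0)


# Wald statistic as printed in the text (rounded to two decimals).
p_value = chi2_upper_tail_2df(1.38)
```

With the rounded statistic of 1.38 the result is close to 0.50; the printed value of 0.503 would be consistent with an unrounded statistic slightly below 1.38. Either way, the two variables are far from joint significance at conventional levels.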

Adding the manual matches leads to considerable changes in the relationship between some respondent characteristics and the likelihood of finding someone in the register address data. This is particularly remarkable as these manual matches only amount to 324 additional cases, which is a rather low proportion of the total number of 8,243 successful matches. Formal tests show that the relationships between some respondent characteristics and linkage success indeed change more strongly and more often from the probabilistic stage to the manual stage than from the deterministic stage to the probabilistic stage. This is measured by a comparison of coefficients given in Models 3 and 4 as well as Models 4 and 5, respectively. The results presented in Table 2.8 are based on seemingly unrelated estimation12 of Models 3 to 5 and subsequent Wald tests on equality of related coefficients.
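The comparison of coefficients across models can be written compactly. As a sketch, and assuming that the joint estimation provides an estimate of the covariance between the two estimates of the same coefficient, the Wald statistic for the equality of coefficient j in two models a and b would take the following general form. The notation is introduced here for illustration and does not appear in the original analysis.

```latex
W_j \;=\;
\frac{\bigl(\hat\beta_j^{(a)} - \hat\beta_j^{(b)}\bigr)^2}
     {\widehat{\operatorname{Var}}\bigl(\hat\beta_j^{(a)}\bigr)
      + \widehat{\operatorname{Var}}\bigl(\hat\beta_j^{(b)}\bigr)
      - 2\,\widehat{\operatorname{Cov}}\bigl(\hat\beta_j^{(a)}, \hat\beta_j^{(b)}\bigr)}
\;\overset{a}{\sim}\; \chi^2(1)
\quad \text{under } H_0\colon \beta_j^{(a)} = \beta_j^{(b)}
```

The covariance term matters here because the models are estimated on the same respondents with nested dependent variables, so the coefficient estimates cannot be treated as independent.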

Table 2.8.: Significantly differing coefficients between matching success probit regressions

                                  (3) vs. (4)    (4) vs. (5)

Aged 35-44 (d)                    1.000          0.001
Aged 45-52 (d)                    1.000          0.000
Native language German (d)        0.000          1.000
Training + lower secondary (d)    1.000          0.066
Higher Education (d)              0.584          0.089
In dependent employment (d)       0.000          1.000
Civil servant (d)                 0.017          0.254
In formal education (d)           1.000          0.018
500-999 EUR (d)                   1.000          0.072
First name unique (d)             0.000          0.172
Both parts of name unique (d)     0.000          0.001

Source: ALWA, own unweighted calculations. Notes: Numbers denote p-values of Wald tests on equality of coefficients across the regression models given in Table 2.7 based on seemingly unrelated estimation. The Bonferroni correction of p-values has been applied to account for multiple testing of all variables in the original models. Only p-values of significantly different coefficients are shown. d denotes dummy variable.

12 See the pioneering work of Zellner (1962) on seemingly unrelated regression and Davidson and MacKinnon (2004, pp. 518 sqq.) for its extension to non-linear models.


The pairwise comparison of all related coefficients between the respective models involves a large number of distinct hypothesis tests, which is why the level of acceptable alpha-error has to be corrected. This problem of multiple testing is accounted for by correcting the p-values using the method of Bonferroni (see Korn and Graubard, 1990). As this method is considered to be very conservative, it might lead to too few differences to be reported as significant. However, when correcting the p-values with the less conservative method of Holm (1979), the coefficients reported as significantly different across the models remain the same.

Detailed analysis of the single p-values reveals that not all of the changes in coefficients that were accompanied by changes of significance levels are indeed statistically significant. The changes in coefficients regarding the respondent's nationality, her family related characteristics and her refusal of income information thus are not practically relevant.
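The two corrections mentioned above can be sketched in a few lines. The functions below implement the textbook Bonferroni and Holm step-down adjustments of p-values; the function names and the example p-values are illustrative assumptions and are not taken from the analysis.

```python
def bonferroni_adjust(p_values):
    """Multiply every p-value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm_adjust(p_values):
    """Holm (1979) step-down adjustment.

    The smallest p-value is multiplied by m, the next by m - 1, and so on.
    A running maximum keeps the adjusted values monotone in the raw ones.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[index])
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted


# Hypothetical raw p-values from four coefficient comparisons.
raw = [0.001, 0.02, 0.04, 0.30]
bonferroni = bonferroni_adjust(raw)
holm = holm_adjust(raw)
```

Holm-adjusted values are never larger than the Bonferroni-adjusted ones, which is why Holm is the less conservative of the two. The cap at 1 in both procedures also explains why adjusted p-values can take the value 1.000 when many tests are corrected at once.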

2.4.2.3. Sensitivity analyses

Decisions regarding the empirical strategy, model specification and how to deal with potential unit nonresponse of the ALWA survey may affect the results of this analysis. This potential impact is examined in the following section. First, the general decision to use the probit regression model is evaluated against two alternatives. To this end, each of the main specifications (Models 1 and 5) is re-estimated by using logistic regression and the linear probability model. While the significance levels of some coefficients change, none of the substantive results from the original probit regression models is put into question by results from alternative methods.13

All models so far have been estimated without considering survey weights. However, as the ALWA survey, like any other survey, experienced some unit nonresponse (see Kleinert et al., 2012a), its participants possibly had an above-average willingness to cooperate with the survey institute or the interviewers. Could this have influenced the results on consent? To examine this, both models from Table 2.6 are re-estimated with the calibration weight provided with the ALWA data set, the outcome of which is presented in Table B.3 in the appendix. With three exceptions, the results from the original models are unaffected. Both in Model 1a and in Model 2a, a foreign nationality shows a negative impact on linkage consent, which was not the case in prior models without the sample weights. This result is in line with both the relevant hypothesis and the findings from other studies. Moreover, respondents born in East Germany are no longer more likely to consent to linkage than those born in West Germany (Model 2a), although a weakly significant relationship remains in Model 1a.

13 The results are not presented in this paper. They and all of the following results that are interpreted but not shown in the paper are available from the author upon request.


Finally, respondents with a partner in the household show a significantly higher willingness to consent compared to singles after the weights are taken into account. This is most likely an effect of unit nonresponse for which the calibration weight fails to correct.

To examine the gender interaction of interviewer and respondent further, separate estimations for female and male respondents are conducted. Table B.4 in the appendix shows that male interviewers perform worse than female interviewers in achieving consent from both female and male respondents, though this result is only weakly significant in both of the separate estimations.

The previous section stated that individual interviewers might wear out after completing a high number of interviews. An alternative explanation is that a high number of previous interviews per interviewer rather indicates that the specific interview took place very late in the field phase. The declining consent rate would then reflect the fact that only the phone numbers of less cooperative respondents are left over for further contact attempts. This would consequently also decrease cooperation regarding record linkage. To investigate this, I included the elapsed time since the given phone number had first entered the field management system and the number of contact attempts before the actual interview. Both show no significant relationship with the respondents' consent to record linkage. Furthermore, the inclusion of these variables did not affect the results on the influence of the number of previous interviews per interviewer.

An alternative explanation would be that interviewers with a very low number of conducted interviews are in fact supervisors. As they usually are the most experienced members of the interviewer staff, they might have been appointed occasionally to question the least cooperative respondents. To test this hypothesis I re-estimated Model 1 after excluding all observations of interviewers with a total number of conducted interviews below ten. The results regarding the number of previous interviews per interviewer remained stable.

In estimating separate univariate probit regression models for the different dependent variables, I deliberately chose a different strategy than Jenkins et al. (2006) or Sala et al. (2012). They estimate multivariate probit models (see e.g., Cappellari and Jenkins, 2003) to explicitly allow for correlation between the error terms of different equations. By doing so, they take into account that unobserved characteristics of respondents potentially co-determine different decisions regarding cooperation during the interview. In the study at hand, this rationale cannot be applied, as only the consent decision is made by the respondent. The success of the actual record linkage can no longer be influenced by the survey participants. To verify that this assumption is valid, a four-variate probit regression is estimated with full correlation


of all error terms. The dependent variables are linkage consent, the willingness to participate in a subsequent face-to-face interview including tests of cognitive skills, the willingness to participate in additional panel waves and the successful match of the data to the register data. As expected, while some of the relationships slightly vary in terms of their significance levels, none of the substantive results reported so far has to be revised.

2.5. Summary and conclusions

2.5.1. Implications for data users

Linked survey and administrative data sets provide additional research opportunities by unifying their respective wealth of variables. In the case of the ALWA survey, this was achieved with a comparatively low loss of observations over the different stages of the process. The remaining number of observations should be sufficient for a multitude of research questions that are usually examined using survey data. This study demonstrates potential sources of bias related to respondent consent or to linkage success, thereby providing potential data users with the means to assess and counteract any influence on their own empirical work. The main drivers of selectivity in terms of linkage rates are the respondents' age and employment status.

2.5.2. Implications for survey practice

Lessons for survey practice are related to the composition of the interviewer staff and to field administration. Female interviewers significantly outperform their male colleagues in achieving consent to record linkage, regardless of whether the respondent is male or female. This is not to say that survey institutes should only employ female interviewers. The sex of the interviewer may also be relevant for other dimensions of survey quality, and in some of them the relationship might be different. We can conclude that, first, female interviewers should at least not be strongly under-represented in the survey staff, as this might have adverse effects on overall consent rates. Second, female interviewers should be appointed specifically to respondent groups for which consent rates are known to be low. This decision might be based on studies such as this, or survey specific knowledge may be gained by an immediate monitoring of consent rates during the field phase. A strategy like this obviously involves a field management that is able to react quickly to experiences from prior interviews as well as to assign specific interviewers to respondents with previously known characteristics such as age or sex.


High numbers of interviews per interviewer seem to have adverse effects on their ability to achieve cooperation by the respondents. Thus, an effective field monitoring should notice and react to declining success rates of single interviewers over the field phase. An a priori limit on interviews per interviewer seems advisable to keep consent rates stable.

The results show that the older and more experienced the interviewers, the more likely they are to achieve consent to record linkage. This offers implications for the recruiting practice of the survey institute. Field management should bear in mind that strong differences in age ought to be avoided, at least when the interviewer would be much older than the respondent.

2.5.3. Implications for the process of record linkage

The results show that the step of probabilistic record linkage after the standardisation of addresses and the deterministic record linkage increased the number of matches between survey and administrative records substantially. Linkage rates could be increased by 30 percentage points; nearly 3,000 respondents could be added to the linked data set. Descriptive and multivariate analyses demonstrate that these additional observations overall did not increase selectivity of the resulting sample. Thus, the effort invested in the additional step of probabilistic record linkage has been worthwhile, as it further increased the research potential of the linked data set.

For the step of manual record linkage, the results are somewhat less promising. An additional 324 links could be found through a rather time-consuming process of manual comparisons of promising record pairs previously identified by the probabilistic step of record linkage. The fact that this step increases the linkage rate among the consenters only by around 3 percentage points is not surprising, as the rates achieved by deterministic and probabilistic linkage have already been very high, and only very few record pairs have been found as being promising enough to enter the manual process. The influence that these few additional links had on the structure of the combined data in terms of socioeconomic characteristics is somewhat more worrisome, given that the influence of several respondent characteristics significantly changed after the classification of these cases as matches. Most notably, the negative relationship with the respondent's age became more pronounced, whereas a positive relationship arose with the educational level of the respondent. Reasons for these changes will be subject to future research.
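The size of the manual step can be put into perspective with the figures reported in this chapter. The sketch below assumes that the 9,024 observations in Table 2.7 are the consenters and that the 8,243 successful matches all stem from this group; the function and variable names are hypothetical.

```python
def manual_step_summary(consenters: int, all_matches: int, manual_matches: int) -> dict:
    """Summarise what the manual step adds on top of the automated linkage stages."""
    # Matches found by deterministic or probabilistic linkage alone.
    automated_matches = all_matches - manual_matches
    rate_automated = automated_matches / consenters
    rate_all = all_matches / consenters
    return {
        "automated_matches": automated_matches,
        "rate_automated_pct": 100 * rate_automated,
        "rate_all_pct": 100 * rate_all,
        # Gain in the linkage rate among consenters, in percentage points.
        "gain_pp": 100 * (rate_all - rate_automated),
        # Share of manual matches among all successful matches, in percent.
        "manual_share_pct": 100 * manual_matches / all_matches,
    }


# Figures taken from the text and from Table 2.7.
summary = manual_step_summary(consenters=9024, all_matches=8243, manual_matches=324)
```

Under these assumptions, the manual step raises the linkage rate among consenters by between 3 and 4 percentage points and accounts for roughly 4 percent of all successful matches, which underlines how few cases drive the changes in coefficients discussed above.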


2.5.4. Further avenues of research

There are further steps to be taken in the ALWA project. The number of respondents for which administrative records can be found will be increased. Several labour market states are not registered in the data of the Federal Employment Agency. Respondents who have been in one of these states during the whole year of 2007 might have no administrative records available for the sampling period. However, spells may exist for earlier or subsequent years, when other employment states may have applied. Therefore, addresses from administrative records from before and after the year 2007 will be drawn and submitted to the linkage procedure. Subsequent research will determine which influence this additional effort will have on the resulting sample, for instance whether it affects the selectivity of the linked data set.

Moreover, analyses of the validity of survey data can be conducted. The combined data sets have overlapping longitudinal information both from the view of the respondent and their representation in administrative records. By comparing these, it is possible to identify inconsistencies in terms of the dating and duration of events (see e.g., Huber and Schmucker, 2009). This might facilitate methodological improvements regarding the gathering of longitudinal information.

ALWA participants who had been willing to participate in subsequent panel waves were included in a sub-study of the German National Educational Panel Study (NEPS) (see Allmendinger et al., 2011). Those who did not provide consent to record linkage during the ALWA interview were asked for it again and new sample members were asked for the first time. Possible changes in consent behaviour of single respondents or between the different survey populations will be examined. This will provide methodological evidence on record linkage from a longitudinal perspective.

Section 2.3 made it obvious that research on record linkage would benefit from theoretical advances related to consent behaviour. Hypotheses often had to be built upon the results of previous record linkage studies or on theoretical concepts explaining item or unit nonresponse. The decision on providing consent to record linkage, however, does not necessarily follow the same rules as nonresponse. For instance, accurate information retrieval is less important in the context of record linkage than for retrospective questions on longitudinal information. On the other hand, consent to record linkage involves a higher level of trust in data protection than answers to single items such as the actual personal or household income. A comprehensive theoretical framework for these specific matters has yet to be devised.


3. Lifelong learning inequality? The relevance of family background for on-the-job training

3.1. Introduction

Intergenerational transmission of educational chances and success is documented by a number of studies. The educational achievements of children, youths and even adolescents correlate heavily with the success of their parents in this field. Children of better educated parents achieve higher schooling certificates and attempt as well as attain formal occupational training or higher education more often than children from less educated family backgrounds. That holds for various countries and has been a stable phenomenon for decades (e.g. Björklund and Salvanes, 2011; Heineck and Riphahn, 2009; Hertz et al., 2007).

This relationship is not restricted to general schooling. Early decisions on the attainment of formal training sustain an intergenerational relationship of educational levels, even when parents no longer have direct influence on subsequent decisions. Part of this can be explained by path dependency between general schooling and formal occupational or higher training (see Pallas, 2004). Unequal chances in schooling attainment lead to even more unequal opportunities when it comes to education that is directly relevant for labour market success. This is of particular relevance for countries in which the majority of occupational or academic educational activities strictly require certain schooling certificates prior to admission. Thus, unequal chances in early education have a lasting influence on educational success during the whole life course.

Education, or a lack thereof, heavily influences several aspects of life, including labour market success (see e.g., Trostel et al., 2002) and a wealth of non-monetary outcomes (see e.g., Grossman, 2006). Policymakers therefore strive to minimise inequality in educational chances. Among other measures, non-formal training, henceforth used interchangeably with on-the-job training, is regarded as a means to compensate for lacking formal educational achievements. Formal and non-formal training activities are highly dissimilar in terms of, for instance, the amount of time or money they exact, the extent of knowledge and signalling value they offer, or the entry barriers they present. Nevertheless, non-formal training has proven to increase wages, lower the risk of unemployment and facilitate career advancement (e.g., Asplund, 2005; Büchel and Pannenberg, 2004; Dieckhoff, 2007). This compensates for lower chances of access to formal education to some extent. People from less educated family backgrounds might catch up in terms of educational and subsequently labour market chances if only they participated enough in non-formal training. However, for this catching-up to happen they would have to participate in on-the-job training more often or more intensely than people from better educated family backgrounds. Otherwise the gap widens even more due to the path dependency of formal education.

When we turn our attention to adults’ participation in on-the-job training, the empirical knowledge appears to be sound. For Germany, for instance, results based on a variety of data sets show that participants mainly do not have an immigrant background, are better educated, middle-aged and gainfully employed (see e.g., Büchel and Pannenberg, 2004; Schömann and Leschke, 2008). Given this ostensibly ample evidence, it is startling that the relation of formal parental education and an adult’s own non-formal training participation has never been analysed explicitly. The literature on intergenerational transmission of education concentrates on formal education and thus provides no insights into the question at hand.

There are only a few studies that even remotely touch on this issue. Those that do so measure family background poorly or include it in the analysis without interpreting it or motivating it theoretically. Pannenberg (2001) examines returns to on-the-job training in Germany. In a first step of estimating the selection into training, he includes the educational level of the father. This is positively related to the offspring’s training participation. Buchmann et al. (1999) include the qualification and labour market status of the father in their analysis of determinants of non-formal training in Switzerland. They find that the father’s labour market status plays no role, regardless of the type of training. A lack of a vocational certificate on the part of the father, though, lessens the probability of his offspring’s further training. The measurement of family background in both studies is so undifferentiated that a thorough analysis of the relationship of parental formal education and the offspring’s non-formal training participation is not feasible.

Instead of examining a mono-causal link between parental education and the offspring’s non-formal education, the present study shows that on-the-job training during adulthood is determined by a wealth of factors, many of which are related to parental education. Family background taken by itself plays a significant role for on-the-job training. Factors like cognitive skills, personality traits, cultural capital and the detailed formal educational history allow a distinction between the influence of the parental education and other inheritable characteristics.


By means of count data models, the analysis differentiates between what determines non-participation and what is important for the rate of on-the-job training participation over the course of employment spells. The results show that not all relevant factors determine both measures in the same way. Data from the German ALWA survey with a strong focus on formal and non-formal education as well as a wealth of social background information on respondents make this analysis possible.

Although the results presented here stem from Germany, they are informative from an international perspective for two reasons. First, Germany shares the phenomenon of low intergenerational educational mobility with a multitude of countries (see Chevalier et al., 2009; Schütz et al., 2008). Lessons on the long-term consequences of this phenomenon should be relevant to educational and labour market policy in just as many countries. Second, the institutional barriers inherent in the German vocational training system¹⁴ also exist in countries with similar systems such as Austria, France, Switzerland and a growing number of Asian countries. They may benefit from these results when evaluating their educational systems’ capabilities to increase intergenerational educational mobility.

This chapter is organised as follows. Section 3.2 shows the theoretical foundation and derives some hypotheses. Section 3.3 introduces the data set and shows descriptive results. The econometric strategy is described in Section 3.4. This is followed by a presentation of the results in Section 3.5. Section 3.6 adds predictions and results from sensitivity analyses. Section 3.7 concludes.

¹⁴ See Frick et al. (2007) for a description of the German educational system.

3.2. Theory and hypotheses

Since there is no unified theory on how the family background might influence a person’s educational decisions in adulthood, one has to draw on theories on different aspects of the proposed relationship. Indeed, there are some theories that provide the basis for hypotheses, as they border on the topic or concentrate on some aspects of it. The framework for the hypotheses will be provided by human capital theory (e.g., Becker, 1962). Additional assumptions based on theories and results from disciplines such as psychology and genetics are introduced below.

According to Becker (1962), human capital denotes a bundle of skills that determine a person’s labour market productivity. In the standard model, this productivity can be observed accurately by both the person holding it and potential employers at no cost. Human capital can be augmented by training, which in turn increases future income prospects, as a person is assumed to achieve a wage according to her marginal productivity. Investment in human capital should take place as long as the expected present value of wage gains at least equals the present value of direct and opportunity costs. Another assumption is that the costs of training are negatively related to innate abilities of the learner. More able people thus invest more in their human capital.

Becker and Tomes (1979, 1986) bring the family into the theoretical framework by developing a model that allows for utility maximisation over time on the family level. Rather than being treated as isolated beings, people are assumed to be members of families that consist of several generations. The utility of future generations inside a family is considered in consumption and investment decisions of each current family member. Parents strive to maximise the utility of the family by creating inheritable wealth, but also by investing in the human capital of their offspring. Models in a similar fashion or developments thereof have been proposed by Becker and Chiswick (1966), Solon (2004) and Checchi (2006).

Since training costs depend on the ability of the learner and the amount of investment in the offspring’s human capital depends on the expected returns, parents are willing to invest more in the education of more able children. Signals of high ability like cognitive skills or schooling success, which are both observable by parents, would lead to higher investments in the children’s human capital. Thus, children who perform well in school are more likely to receive financial assistance from their parents during initial formal schooling and even after its end.

Bowles and Gintis (2002) stress that parents do not only provide the financial means for the education of their offspring. They also pass on other important endowments to their children. Surveying decades of research, Plomin et al. (2008, 156 sqq. and 238 sqq.) show that both cognitive ability and personality traits are genetically inheritable to some extent.
Economic studies corroborate that for cognitive ability (Anger and Heineck, 2010; Black et al., 2009). After birth, the development of the offspring’s abilities and personality is affected by a wealth of factors including the parents’ education or parenting skills (see e.g., Cunha and Heckman, 2008; Feinstein et al., 2004). In terms of the model of Becker and Tomes (1986) it is not relevant which part or how much of the inherited endowment is due to genetic inheritance or due to imitating and learning from parents. Both channels are equally included in their model.

Ability and personality, in turn, are related to the propensity to attain education. The survey by Colquitt et al. (2000) on personality traits that influence training participation reveals that locus of control (see Rotter, 1966), anxiety (see McCrae and John, 1992) and self-efficacy (see Bandura, 1994) are the most predictive traits. Moreover, Fouarge et al. (2013) empirically find several personality traits that strongly influence the willingness to participate in on-the-job training. Cawley et al. (2001) and Heckman et al. (2006) give an overview of studies that show the strong positive relationship between cognitive ability and educational attainment. Cognitive skills and personality traits also positively influence wages (e.g., Green and Riddell, 2003; Heineck, 2011; Heineck and Anger, 2010). By reducing liquidity constraints, these endowments also foster human capital investments indirectly.

Moreover, ability and personality are interrelated. Results on the relationship between cognition and temporal discounting indicate that the higher one’s abilities, the lower one’s discount rate (Dohmen et al., 2010; Frederick, 2005; Kirby et al., 2005). Applying this to human capital theory, which states that the higher the discount rate the lower the probability of investment in human capital, it follows that the higher a person’s ability, the lower her discount rate and the more likely she is to participate in education.

To sum up, family background influences the incidence of non-formal training both indirectly and directly. Smarter parents are better educated and thus have higher incomes. They are able to invest more in their offspring’s formal education. This in turn puts children on a path of continued education in later life. It follows that they have more access to non-formal training in later life just by having enjoyed better initial conditions than people from a less educated family background.

Apart from this indirect effect, ability and favourable personality traits inherited from parents directly increase educational attainment. These endowments lead to higher training motivation, a higher willingness to finance training and more learning success.

However, the decision whether someone is trained and, if she is, how this is financed is not entirely up to the potential learner herself. In contrast to human capital theory, theories of segmented labour markets (see e.g., Leontaridi, 1998; Taubman and Wachter, 1986) stress the influence of job characteristics. They argue that training participation mainly depends on the job, or more specifically, the labour market segment someone is employed in. Jobs in some segments mostly provide well-paid and stable employment with good career prospects as well as training opportunities. Other segments consist of badly paid jobs which lack stability or career chances. Due to the short expected duration of these employment relationships, the time to cash in returns on investments in human capital would be too short. Anticipating this, employers are not willing to finance such investments. Furthermore, as people in secondary segment jobs earn less, their own financial means to invest in their human capital are smaller, making training even more unlikely for them. Along the lines of these theories, I argue that a strong influence of job and firm characteristics is to be expected. The hypotheses and implications are summarised below.
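
The investment rule of human capital theory used in this section, together with the discounting argument, can be illustrated numerically. The following is only a minimal sketch: the function names, the constant yearly wage gain, the cost figure and the two discount rates are hypothetical values chosen for illustration and are not taken from the ALWA data or from Becker (1962).

```python
def present_value(yearly_amounts, discount_rate):
    """Discount a stream of yearly amounts (first amount due after one year)."""
    return sum(
        amount / (1.0 + discount_rate) ** year
        for year, amount in enumerate(yearly_amounts, start=1)
    )


def invests_in_training(wage_gains, total_cost, discount_rate):
    """Investment rule: train if the present value of the expected wage gains
    at least equals the present value of direct and opportunity costs
    (total_cost is assumed to be paid up front, i.e. already discounted)."""
    return present_value(wage_gains, discount_rate) >= total_cost


# Hypothetical example: training that costs 4,000 today and raises the
# yearly wage by 1,000 for five years.
gains = [1000.0] * 5
cost = 4000.0

patient = invests_in_training(gains, cost, discount_rate=0.03)
impatient = invests_in_training(gains, cost, discount_rate=0.10)
print(patient, impatient)
```

With identical costs and wage gains, the person with the lower discount rate invests while the person with the higher discount rate does not. This is the mechanism through which ability, via temporal discounting, is argued to raise training participation.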


H1: The better educated and the more economically successful the family background, the more likely is on-the-job training participation (intergenerational persistence hypothesis).

According to the model of Becker and Tomes (1986), better educated and wealthier parents are more able to support their offspring financially. This support has two implications. First, the theory allows for continued financial support even after the end of formal education or after the offspring has moved out of the parents’ household. Second, early financial transfers in wealthy families may lead to higher accumulated wealth on the part of the offspring. This in turn makes it easier for the offspring to finance on-the-job training themselves in the long run. As neither financial support of that kind nor the offspring’s wealth is observed in the data set at hand, their impact cannot be distinguished from that of parental education.

H2: Cognitive skills and favourable personality traits positively influence training participation (endowment hypothesis).

The smarter a person, the higher her probability of investing in any kind of education. Cognitive skills and on-the-job training should thus be positively related. The probability also depends on personality traits. A high value on, for instance, an external locus of control signifies the belief that events in life or their outcomes are determined by external factors such as luck rather than by own decisions, actions or effort. This belief should lead to a low expected return to education and thus to less investment in on-the-job training. Finally, the higher the cultural capital, measured as the participation in high-cultural activities and the number of books in the household, the more likely on-the-job training will be.

H3: The higher the formal education, the higher the probability of on-the-job training (path dependency hypothesis).

The individual wage depends on one’s productivity, which in turn depends on the training one has received. Therefore, the higher one’s schooling and training, the greater the financial means to invest in further education. The investment decisions predicted by human capital theory can then be realised, as they are not impeded by liquidity constraints.

H4: Participation in on-the-job training is the more likely the higher the job requirements, the higher the weekly working hours and the larger the firm (job segment hypothesis).


As proposed by theories of segmented labour markets, the job one is employed in strongly determines training participation. In particular, the probability of on-the-job training should rise with weekly working time, job requirements and firm size. All of those characteristics are seen as proxy variables for stable employment in a favourable labour market segment or the presence of an internal labour market in the firm. This in turn increases the probability of positive returns to human capital investments and makes on-the-job training more likely.

3.3. Data and descriptives

The analyses in this chapter are based only on the longitudinal life course information gathered during the ALWA survey. Neither the administrative employment data from ALWA-ADIAB nor the basic skills scores from ALWA-LiNu can be used, for the following reasons. First, the questionnaire covered non-formal training activities, measured by their frequency during spells of employment, unemployment and other life course episodes. Those activities reported during employment, henceforth referred to as on-the-job training, will be of central interest in the sections to come. That is why the employment spells and their related variables reported by the ALWA respondents are indispensable in this context. Although one might be inclined to enrich these data with information from administrative records, particularly their earnings information, this is not feasible. While ALWA-ADIAB represents a link between a respondent’s survey answers and her administrative employment records on the person level, it does not provide a match on the spell level. That means that there is no link between a given survey-based employment spell, for which the information on on-the-job training is given, and its representation in the administrative data.

Second, the basic skills tests in ALWA, which are the basis for the ALWA-LiNu data, were only taken by a sub-sample of the ALWA respondents. The restriction of the estimation sample to employment spells of these tested participants, on top of the restrictions presented in the next section, would lead to a severe loss of observations. This would strongly impede statistical inference. More importantly, the basic skills were tested shortly after the interview, whereas all reported on-the-job training activities took place before the time of the interview, in most cases several years earlier. Consequently, there is no way of telling what the level of basic skills had been during the employment spell including the on-the-job training, or whether or how the skill levels at the time of the interview had been influenced by these activities.


3.3.1. Independent variables

Standard socio-demographic characteristics include age classes, sex and an immigrant background. These are supplemented by a dummy variable for being employed in East Germany and the unemployment rate, differentiated between East and West Germany. Education is measured by dummy variables for formal schooling and training levels. Full-time equivalent employment experience and its square are included as an additional measure of human capital. They are computed as cumulated work experience over all jobs before the beginning of the spell under consideration.

Family background is measured by formal schooling and training levels of the parent of the same sex as the respondent as well as this parent’s employment status during the youth of the survey respondent. Along the lines of the sex-role model (see Sinnott, 1994), the parent of the same sex is likely to have a stronger influence on educational or career decisions of the offspring than the parent of the opposite sex. To examine whether this decision has an influence on the results, Section 3.6 presents results based on the highest training level of either parent.

To test the endowment hypothesis, several self-reported measures of cognitive skills, personality traits and cultural capital are included in the analysis. These have been computed by principal component analyses (see e.g., Jolliffe, 2002) using several 5-point items for each score. Cognition is measured by scores in the three domains prose literacy, document literacy and numeracy. These have been calculated based on the self-reported success in the school subjects mathematics and German as well as on several items on self-assessed literacy and numeracy. Information on personality traits includes scores on an external versus an internal locus of control. Personality is also measured by a score of employment-related self-confidence. Finally, scores on the importance of life domains such as work and occupation on the one hand and family and friends on the other hand are included. I argue that these scores represent proxy variables for time preference, as they relate to the trade-off between work and leisure time. The more value one assigns to the family or friends, the higher the time preference and the lower the likelihood of investing in work-related training. A proxy variable for cultural capital is constructed as a principal component score based on items on the participation in high-cultural activities and the number of books in the household.

As only spells of dependent employment will be considered, firm and job characteristics control for the employer’s influence on human capital investments. The data provide information on the working time in four categories, the qualification level required for the job, the firm size and whether the worker is employed in the public service. Sample statistics for all independent variables are given in Table C.1 in the appendix.
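
How a principal component score condenses several 5-point items into one measure can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually applied to the ALWA items: the answers are invented, the items are standardised before the analysis, only the first component is extracted, and the eigenvector is obtained by simple power iteration.

```python
import math


def standardise(values):
    """Scale one item to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def first_component(answers, iterations=500):
    """Return (weights, scores) of the first principal component.

    answers: one list of item answers per respondent.
    """
    columns = [standardise(col) for col in zip(*answers)]
    n, k = len(answers), len(columns)
    # Correlation matrix of the standardised items.
    corr = [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
        for ci in columns
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    weights = [1.0] * k
    for _ in range(iterations):
        weights = [sum(corr[i][j] * weights[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(w * w for w in weights))
        weights = [w / norm for w in weights]
    # Each respondent's score is the weighted sum of her standardised answers.
    scores = [sum(w * x for w, x in zip(weights, row)) for row in zip(*columns)]
    return weights, scores


# Hypothetical answers of six respondents to three 5-point items.
answers = [[1, 2, 1], [2, 2, 3], [3, 3, 3], [4, 3, 4], [5, 5, 4], [4, 5, 5]]
weights, scores = first_component(answers)
print([round(w, 3) for w in weights])
print([round(s, 3) for s in scores])
```

Because the invented items are positively correlated, all weights are positive and respondents with consistently high answers receive high scores. Scores built in this way have mean zero in the sample they are computed on.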


The analysis sample considered in Table C.1 is a result of the following restrictions: spells that originate from outside Germany are excluded; spells that originate from East Germany starting before reunification are excluded, as training decisions in the former GDR did not necessarily follow the cost-benefit considerations laid down by human capital theory. Although the same could apply to public sector workers, they are not excluded from the sample. The implications of this decision are presented in Section 3.6. Finally, self-employment spells are excluded, as the notion of co-financing of training by the employer cannot be examined for them. The number of observations lost due to each of these exclusions and to missing values is given in Table C.2 in the appendix.

Although descriptive in nature, Table 3.1 provides a first empirical indication of whether individual characteristics are associated with family background variables, as stated in Section 3.2. The table reports t-tests comparing the means of important individual characteristics of low qualified and highly qualified family backgrounds. The offspring of parents without formal schooling or training are described in the column denoted by “low”, whereas the offspring of parents who have both formal schooling and formal training are described in the column denoted by “high”. The table is based on spells rather than individuals because some independent variables may vary between different spells of a given person in the sample.

The first striking difference between columns 1 and 2 is related to the probability of participation in and frequency of on-the-job training per spell. The offspring from more educated family backgrounds are significantly more likely to participate in on-the-job training. On average, they also experience on-the-job training more often per employment spell than the offspring from less educated backgrounds, with a frequency of 2.41 compared to 1.97.

The following rows shed some light on how people from different family backgrounds differ in terms of education, cognitive skills and personality. Given the well-documented low educational mobility in Germany, it is hardly surprising that the offspring of better educated parents are better educated themselves. This is demonstrated by, for instance, the significantly lower share of observations without formal schooling or training in the second column. Higher cognitive skills, higher employment-related self-confidence and higher cultural capital in column 2 also corroborate the assumption of inheritability of cognition and personality. All in all, the offspring of better educated parents seem to be better equipped with endowments and skills relevant for the labour market. Based on the assumptions stated in Section 3.2, these differences should make them more likely to participate in on-the-job training.


Table 3.1.: On-the-job training, education and endowments by dichotomous family background, measured as educational level of parent of the same sex as respondent, t-test of difference

                                               low     high   difference         t
Participation in on-the-job training (d)     0.293    0.338    0.045***      4.985
Training frequency per spell                 1.966    2.405    0.439***      3.223
No schooling (d)                             0.012    0.006   −0.006***     −3.718
Lower secondary schooling (d)                0.302    0.184   −0.118***    −15.165
Intermediate schooling (d)                   0.439    0.392   −0.046***     −4.885
Upper secondary schooling (d)                0.247    0.418    0.170***     18.268
No training (d)                              0.058    0.046   −0.012***     −2.962
Apprenticeship (d)                           0.782    0.630   −0.152***    −16.709
Master craftsman/technician (d)              0.027    0.056    0.030***      7.010
Higher education (d)                         0.134    0.269    0.135***     16.351
Prose literacy (score)                      −0.002    0.051    0.053***      2.721
Document literacy (score)                   −0.191    0.081    0.272***     14.236
Numeracy (score)                            −0.130    0.036    0.165***      8.741
High-cultural activity (score)               0.014    0.090    0.076***      4.099
Importance of work (score)                  −0.135   −0.029    0.106***      5.498
Importance of occupation (score)            −0.010    0.149    0.159***      9.568
Importance of friends (score)               −0.159   −0.064    0.096***      4.890
Importance of family (score)                 0.147    0.033   −0.114***     −6.086
External locus of control (score)           −0.006   −0.041   −0.035*       −1.862
Internal locus of control (score)            0.104   −0.026   −0.130***     −6.670
Employment related self-confidence (score)  −0.000    0.052    0.052***      2.745

Source: ALWA, own unweighted calculations. Notes: 17,254 observations. ***/ **/ * indicates significant difference at the 1/ 5/ 10% level. d denotes dummy variable. low denotes a lack of parental formal schooling or training degree (n=3,307), high denotes existing parental formal schooling as well as training (n=13,947).
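
The logic of the t-tests in Table 3.1 can be retraced for a dummy variable from the published group means and group sizes alone. The sketch below rests on assumptions that the table does not confirm: it uses a pooled-variance two-sample test, treats all spells as independent observations and works with the rounded means from the first row. The function name is hypothetical, and the result is therefore only expected to come close to the reported statistic, not to reproduce it exactly.

```python
import math


def pooled_t_for_dummy(share_low, n_low, share_high, n_high):
    """Two-sample t statistic (pooled variance) for the difference in the
    means of a 0/1 variable, computed from group shares and group sizes."""
    # The variance of a dummy variable with mean p is approximately p * (1 - p).
    var_low = share_low * (1.0 - share_low)
    var_high = share_high * (1.0 - share_high)
    pooled = ((n_low - 1) * var_low + (n_high - 1) * var_high) / (n_low + n_high - 2)
    standard_error = math.sqrt(pooled * (1.0 / n_low + 1.0 / n_high))
    return (share_high - share_low) / standard_error


# Participation in on-the-job training, values as printed in Table 3.1.
t_participation = pooled_t_for_dummy(0.293, 3307, 0.338, 13947)
print(round(t_participation, 2))
```

Under these assumptions the statistic is close to, though not identical with, the value of 4.985 reported in the table. The remaining gap is plausibly due to rounding of the printed means.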


3.3.2. Dependent variable

The number of on-the-job training activities during each employment spell is considered as the dependent variable. Table 3.2 provides descriptive statistics for this variable and differentiates between several individual, parental and job characteristics. The first column reports on the likelihood of experiencing on-the-job training in a given spell; the second column shows the number of on-the-job training activities per spell. As employment relationships strongly differ in duration, column 3 also presents the frequency of on-the-job training per year.

Over the whole estimation sample, the probability of experiencing on-the-job training during an employment relationship equals 32.9%, the mean number of on-the-job training activities per spell is 2.32, and the mean frequency per year is 0.56. By considering how these numbers vary by subgroup, these descriptive results provide a first impression of the influence of individual, parental and job characteristics on on-the-job training.

First, the better educated a person, the higher is the likelihood of on-the-job training as well as its frequency. This applies to formal schooling and formal training levels. Second, on-the-job training probability and frequency are associated with parental schooling and training. The better educated the family background the more likely and frequent is own on-the-job training. While, for instance, a parental background without formal schooling is associated with a probability of on-the-job training of 22.3%, a parental upper secondary schooling degree is associated with a training probability of 37.1%.

The amount of on-the-job training is also related to job and firm characteristics. The higher the weekly working time, the job requirement or the number of employees in the firm, the more likely and frequent is on-the-job training. Thus, the descriptive results corroborate the hypotheses given in Section 3.2.
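
The three descriptive measures used here can be made concrete with a small example. The spells below are invented, and the yearly frequency is computed as the mean of spell-specific yearly rates, which is an assumption because the text does not state how the per-year figure was aggregated.

```python
# Hypothetical employment spells: (number of training activities, duration in years).
spells = [(0, 2.0), (3, 4.0), (0, 0.5), (1, 1.0), (4, 8.0)]

# Share of spells with at least one on-the-job training activity.
participation = sum(1 for count, _ in spells if count > 0) / len(spells)

# Mean number of training activities per spell.
frequency_per_spell = sum(count for count, _ in spells) / len(spells)

# Mean of the spell-specific yearly training rates (assumed aggregation).
frequency_per_year = sum(count / years for count, years in spells) / len(spells)

print(participation, frequency_per_spell, frequency_per_year)
```

The per-year measure corrects for the fact that long spells offer more time to accumulate training. In the example, the eight-year spell has the most activities but a lower yearly rate than the one-year spell.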

3.4. Econometric strategyThe dependent variable is the number of on-the-job training activities during agiven employment spell, denoted by y. Since the underlying random variable is anon-negative integer with a strongly right-skewed distribution, a count data model(see Cameron and Trivedi, 1998) is the usual approach for this analysis. Such modelsallow inference about the influence of explanatory variables on the number of eventsduring a measurable amount of time.The most basic variant of count data models assumes a probability distribution forthe number of events based on the Poisson distribution. The Poisson regressionmodel can only be justified if the properties of the underlying Poisson distribution are

58

Page 59: Essays on the measurement and analysis of educational and skill … · 2013-09-03 · Leerzeile Essays on the measurement and analysis of educational and skill inequalities Inaugural-Dissertation

3.4. Econometric strategy

Table 3.2.: Probability and frequency of on-the-job training by individual, parentaland job characterstics (only dummy variables)

                                        participation  frequency      freq./year     obs.
                                        (1)            (2)            (3)

Total                                   0.329          2.320          0.563          17,254
Age: 18-21 (d)                          0.230          1.639          0.359           4,472
Age: 22-25 (d)                          0.308          2.190          0.527           3,725
Age: 26-35 (d)                          0.397          2.974          0.668           6,241
Age: 36-51 (d)                          0.365          2.125          0.703           2,816
Male (d)                                0.330          2.390          0.561           8,713
Immigrant background (d)                0.319          2.185          0.582           2,812
No schooling (d)                        0.145          0.661          0.190             124
Lower secondary schooling (d)           0.226          1.517          0.355           3,562
Intermediate schooling (d)              0.308          2.092          0.444           6,925
Upper secondary schooling (d)           0.410          3.021          0.806           6,643
No training (d)                         0.138          0.668          0.279             826
Apprenticeship (d)                      0.282          1.896          0.439          11,367
Master craftsman/technician (d)         0.405          2.830          0.493             873
Higher education (d)                    0.480          3.693          0.970           4,188
P: no/unknown schooling degree (d)      0.223          1.657          0.379             443
P: lower secondary schooling (d)        0.322          2.232          0.484          11,002
P: intermediate schooling (d)           0.338          2.427          0.678           3,490
P: upper secondary schooling (d)        0.371          2.708          0.800           2,319
P: no/unknown vocational degree (d)     0.291          1.910          0.389           3,222
P: apprenticeship (d)                   0.329          2.329          0.566          10,641
P: master craftsman/technician (d)      0.368          2.605          0.607           1,447
P: higher education (d)                 0.367          2.741          0.804           1,944
P: not employed (d)                     0.333          2.134          0.606           2,615
P: employed (d)                         0.325          2.327          0.545          12,880
P: self-employed (d)                    0.357          2.547          0.630           1,759
Firm in East Germany (d)                0.274          1.999          0.499           2,195
Public service (d)                      0.464          4.118          0.835           3,730
Working .25 full-time (d)               0.145          0.633          0.214           1,429
Working .5 full-time (d)                0.280          1.898          0.537           1,662
Working .75 full-time (d)               0.353          2.672          0.696             931
Working full time (d)                   0.354          2.531          0.595          13,232
No training required (d)                0.090          0.349          0.113           1,913
Induction period required (d)           0.169          0.891          0.244           1,983
Vocational training required (d)        0.298          1.917          0.468           8,438
Vocational schooling required (d)       0.468          3.483          0.805           1,100
Master craftsman/technician requ. (d)   0.453          3.702          0.653             678
Higher education required (d)           0.584          4.802          1.189           3,142
Firm size: less than 5 (d)              0.197          1.049          0.322           1,394
Firm size: 5-9 (d)                      0.243          1.439          0.417           2,222
Firm size: 10-19 (d)                    0.265          1.613          0.473           2,186
Firm size: 20-99 (d)                    0.294          1.927          0.478           3,999
Firm size: 100-199 (d)                  0.330          2.283          0.714           1,830
Firm size: 200-1,999 (d)                0.432          3.184          0.711           3,629
Firm size: 2,000 and more (d)           0.472          4.219          0.755           1,994

Source: ALWA, own unweighted calculations based on 17,254 observations. Note: d denotes dummy variable.


fulfilled, most notably the assumption of equality of conditional mean and conditional variance:

\[ \mathrm{Var}(y_i \mid x_i) = \mathrm{E}(y_i \mid x_i) = \mu_i \tag{3.1} \]

If this assumption of equidispersion is violated, the single-parameter Poisson distribution may be too inflexible to account for the real-life data at hand. As a result, biased standard errors make statements on the significance of regressors unreliable. A glance at the distribution of the count variable in the estimation sample indeed reveals substantial overdispersion in the data. Its variance of 49.56 is more than 20 times as high as its mean of 2.32.

A common remedy is a generalisation by using a model based on the negative binomial distribution (e.g., Cameron and Trivedi, 1986; Hilbe, 2008). This allows for more flexibility due to an additional distributional parameter (Cameron and Trivedi, 1998). The variance function from Equation 3.1 therefore turns into

\[ \mathrm{Var}(y_i \mid x_i) = \mu_i + \alpha \mu_i^k . \tag{3.2} \]
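As a quick check of the equidispersion assumption, the following minimal Python sketch computes the variance-to-mean ratio from the two sample moments quoted above and evaluates the variance function of Equation 3.2 for both Negbin variants. The function name and the value of α are illustrative assumptions, not estimates from this chapter.

```python
def negbin_variance(mu, alpha, k):
    """Variance function of Equation 3.2: Var(y|x) = mu + alpha * mu**k."""
    return mu + alpha * mu ** k


# Sample moments of the training count as reported in the text.
sample_mean = 2.32
sample_variance = 49.56

# Under equidispersion (Equation 3.1) this ratio would be close to 1.
dispersion_ratio = sample_variance / sample_mean

# alpha = 0 collapses both variants to the Poisson case (variance = mean).
poisson_case = negbin_variance(sample_mean, alpha=0.0, k=2)

# Hypothetical alpha, chosen only to show how the two variants differ.
alpha = 1.5
negbin1 = negbin_variance(sample_mean, alpha, k=1)  # variance linear in the mean
negbin2 = negbin_variance(sample_mean, alpha, k=2)  # variance quadratic in the mean

print(f"variance/mean ratio: {dispersion_ratio:.1f}")
print(f"Negbin I variance:   {negbin1:.2f}")
print(f"Negbin II variance:  {negbin2:.2f}")
```

With k = 2 the extra variance grows with the square of the mean, so the gap between the two variants widens for observations with a high expected count.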

One possible interpretation of α is that it captures unobserved heterogeneity: the negative binomial distribution arises as a mixture of Poisson distributions with a gamma-distributed heterogeneity term. The dispersion parameter α can be estimated from the data. The two most common variants of the negative binomial model are often called Negbin I (k = 1) and Negbin II (k = 2). Both are better suited to deal with overdispersed data than the Poisson model, but the decision for either of them depends on the data at hand. Information criteria provide a suitable basis for this decision.

Since Poisson and negative binomial models are nested, different tests can be used to determine whether the null hypothesis of α being equal to zero has to be rejected. Highly significant test statistics of both Pearson's χ2 goodness-of-fit test (199229.8, df=17,206) and the likelihood ratio test (86089.4, df=17,206) reject the null hypothesis and reveal that the Poisson regression model is not appropriate for the data at hand. Information criteria show that the Negbin II model (BIC=49338.49, AIC=48958.45) is superior to the Negbin I variant (BIC=49840.55, AIC=49460.52) for the analysis at hand. Due to this result, the Poisson and Negbin I regression models will no longer be considered throughout the text.

Both Poisson and negative binomial models produce biased results when the share of zero-counts in the data is much higher than predicted by the underlying probability distribution (Hardin and Hilbe, 2007). The most common solutions in econometrics are the hurdle model (Mullahy, 1986) and the zero-inflated model (Heilbron, 1994; Lambert, 1992). Both models are flexible enough to allow for a substantial number of non-participants in the sample and can be based either on the Poisson or the negative binomial distribution. Thus, the issues of missing equidispersion and an excess of zeros can both be tackled with these models.

The models' interpretations of what drives the number of zero-counts differ. The hurdle model assumes that counts below a given hurdle (in our case zero-counts) and counts above the threshold stem from different data generating processes. They are potentially driven by different explanatory factors. The hurdle model thus is a two-part model with one part modelling the probability to encounter an event at all and a second part considering positive counts of events. The interpretation would be that of a two-stage decision process. Empirical applications of this model to on-the-job training include Arulampalam and Booth (2001) and Pannenberg (1998).

The zero-inflated model enables us to distinguish between a subgroup that is not subject to the risk of any event and a subgroup that can experience any number of events, including zero-counts. Covariates that describe the first group explain the so-called inflation of the number of zero-counts. The covariate vector describing the group that experiences any number of events, including zeros, may differ from the former. Empirical evidence provides support for this method in the current context. Backes-Gellner et al. (2007) show that some people persistently refrain from on-the-job training, in contrast to those who merely participate less frequently than others. They identify characteristics that make it more likely to belong to the group of chronic non-participants. This is more in line with the interpretation of the zero-inflated model than that of the hurdle model.

Given that the two equations of the zero-inflated model are estimated simultaneously, it is generally more efficient than the hurdle model. Bearing that in mind and using both Akaike and Bayesian information criteria to compare the two non-nested models, the hurdle model is rejected in favour of the zero-inflated model. The test based on the work of Vuong (1989) confirms that the data indeed show an excess of zeros as the zero-inflated negative binomial model is preferred over the negative binomial model. Therefore, all results presented in Sections 3.5 and 3.6 are based on the zero-inflated negative binomial model and maximum likelihood estimation.15 For an application of this method see Gerner and Stegmaier (2009) who examine the provision of on-the-job training by firms in Germany on the basis of firm data.

The longer an employment spell the more occasions may arise in which on-the-job training is necessary or profitable. Thus, a given count may represent different rates of training per period of time. The natural logarithm of the spell duration in months is included in the model to indicate the amount of time during which a person is exposed to the possibility of participating in on-the-job training. Its coefficient is constrained to 1.

15 Details on the tests and the results of all discarded models are available upon request.
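The structure of the chosen model can be sketched in a few lines of Python: a probit inflation equation for the probability of never participating, a Negbin II count equation, and the log of spell duration entering with a coefficient fixed at 1. This is a sketch under stated assumptions, not the estimation code behind the results; the function names and all parameter values (`xb`, `zg`, `alpha`) are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, used as the probit link of the inflation equation."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def negbin2_pmf(y, mu, alpha):
    """Negbin II probability of count y with mean mu and dispersion alpha > 0."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def zinb_pmf(y, xb, zg, months, alpha):
    """Zero-inflated Negbin II probability of y training events in one spell.

    xb     : linear index of the count equation (without the offset)
    zg     : linear index of the probit inflation equation
    months : spell duration; ln(months) enters with its coefficient fixed at 1
    """
    mu = math.exp(xb + math.log(months))  # equals months * exp(xb)
    pi = normal_cdf(zg)                   # probability of never participating
    p = (1.0 - pi) * negbin2_pmf(y, mu, alpha)
    return pi + p if y == 0 else p


def zinb_mean(xb, zg, months):
    """Expected number of events per spell: (1 - pi) * mu."""
    return (1.0 - normal_cdf(zg)) * months * math.exp(xb)


# Hypothetical parameter values for illustration only.
xb, zg, alpha = -3.5, 0.2, 2.0
total = sum(zinb_pmf(y, xb, zg, 24, alpha) for y in range(2000))
ratio = zinb_mean(xb, zg, 48) / zinb_mean(xb, zg, 24)
excess_zero = zinb_pmf(0, xb, zg, 24, alpha) - negbin2_pmf(0, 24 * math.exp(xb), alpha)
```

Because the offset coefficient is fixed at 1, doubling the spell duration doubles the expected count, so the remaining coefficients describe training rates per month rather than raw counts per spell.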


Some of the individuals under consideration experience more than one employmentspell during the observation period. These observations are correlated due to commonunobserved characteristics (Moulton, 1990). This usually leads to an underestimationof standard errors. To achieve robust standard errors, the variance-covariance matrixis estimated by the modified, cluster-robust version of the sandwich estimator basedon Huber (1967) and White (1980).
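The effect of clustering on standard errors can be illustrated with the simplest possible case, an intercept-only model, where the sandwich estimator reduces to sums of residuals. The sketch below is a simplified stand-in for the estimator used in the chapter: it omits any small-sample correction, and the data are hypothetical.

```python
import math


def mean_with_standard_errors(clusters):
    """Return the overall mean plus naive and cluster-robust sandwich
    standard errors for an intercept-only model (no small-sample correction)."""
    values = [y for group in clusters.values() for y in group]
    n = len(values)
    mean = sum(values) / n
    # "Meat" that treats every observation as independent.
    naive_meat = sum((y - mean) ** 2 for y in values)
    # "Meat" that sums residuals within each person before squaring.
    cluster_meat = sum(sum(y - mean for y in group) ** 2
                       for group in clusters.values())
    # The "bread" of an intercept-only model is 1/n on each side.
    return mean, math.sqrt(naive_meat) / n, math.sqrt(cluster_meat) / n


# Hypothetical training counts per employment spell, grouped by person.
spells = {"A": [0, 0, 0], "B": [5, 6, 4], "C": [1, 0, 1], "D": [9, 8]}
mean, naive_se, cluster_se = mean_with_standard_errors(spells)
print(f"mean={mean:.3f}  naive SE={naive_se:.3f}  cluster-robust SE={cluster_se:.3f}")
```

When spells of the same person resemble each other, as in this example, the residuals within a cluster do not cancel and the cluster-robust standard error exceeds the naive one.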

3.5. Results

The estimation results given in Tables 3.3 and 3.4 are structured according to the hypotheses given in Section 3.2. In every model specification, the column “non-participation” represents the probability never to participate in on-the-job training during a given spell, whereas the column “frequency” is related to the number of training events per spell. Results are presented as odds ratios, so that values larger than one indicate a higher probability of never participating in on-the-job training (columns 1 and 3) and a higher number of on-the-job training activities (columns 2 and 4) per spell, respectively. The covariates common to all specifications include age-class, sex, immigrant background, employment in East Germany and the unemployment rate in East or West Germany at the beginning of the spell. Their odds ratios will only be shown for the first two models.

Table 3.3.: Determinants of non-participation in training and of the number of courses respectively, zero-inflated negative binomial regression, probit inflation

                                            Model 1                                 Model 2
                                            non-part.           freq.               non-part.           freq.

Age: 22-25 (d)                              0.684*** (0.070)    1.184** (0.097)     0.777*** (0.065)    1.184** (0.098)
Age: 26-35 (d)                              0.418*** (0.117)    1.224** (0.117)     0.598*** (0.060)    1.204** (0.107)
Age: 36-51 (d)                              0.430*** (0.100)    1.398*** (0.143)    0.629*** (0.071)    1.378*** (0.144)
Male (d)                                    1.364 (0.277)       1.017 (0.145)       1.081 (0.099)       1.075 (0.128)
Immigrant background (d)                    1.099 (0.118)       1.093 (0.098)       1.030 (0.093)       1.136 (0.096)
Firm in East Germany (d)                    1.520** (0.316)     0.839 (0.137)       1.465** (0.227)     0.792 (0.126)
Regional unemployment rate                  1.014 (0.017)       1.007 (0.016)       1.003 (0.014)       1.018 (0.015)
P: lower secondary schooling (d)            0.649** (0.127)     0.919 (0.166)       0.629*** (0.103)    0.964 (0.170)
P: intermediate schooling (d)               0.591** (0.135)     1.177 (0.225)       0.577*** (0.114)    1.158 (0.222)
P: upper secondary schooling (d)            0.498** (0.140)     1.182 (0.252)       0.567** (0.140)     1.103 (0.237)
P: apprenticeship (d)                       0.820* (0.097)      1.324** (0.163)     0.883 (0.081)       1.284** (0.138)
P: master craftsman/technician (d)          0.712** (0.117)     1.366** (0.183)     0.818 (0.116)       1.328** (0.168)
P: higher education (d)                     0.880 (0.200)       1.570** (0.286)     1.039 (0.215)       1.509** (0.266)
P: employed (d)                             0.811 (0.185)       0.812 (0.231)       0.882 (0.112)       0.825 (0.182)
P: self-employed (d)                        0.782 (0.188)       0.893 (0.259)       0.866 (0.134)       0.920 (0.210)
Prose literacy (score)                                                              0.898*** (0.034)    1.121*** (0.038)
Document literacy (score)                                                           0.929** (0.029)     1.062* (0.039)
Numeracy (score)                                                                    0.985 (0.036)       0.998 (0.033)
High-cultural activity (score)                                                      0.743*** (0.042)    1.114*** (0.041)
Importance of work (score)                                                          0.943 (0.045)       0.943 (0.073)
Importance of occupation (score)                                                    0.895*** (0.032)    1.026 (0.048)
Importance of friends (score)                                                       0.960 (0.038)       1.037 (0.034)
Importance of family (score)                                                        0.941 (0.036)       0.990 (0.039)
External locus of control (score)                                                   1.104** (0.048)     0.958 (0.032)
Internal locus of control (score)                                                   1.043 (0.051)       1.122** (0.060)
Employment related self-confidence (score)                                          1.054 (0.045)       1.042 (0.050)
Constant                                    1.237 (0.311)       0.041*** (0.012)    1.314 (0.286)       0.035*** (0.009)

AIC                                         50,081.5                                49,619.3
Wald-statistic (χ2) [p-value]               68.774 [0.000]                          170.215 [0.000]

Source: ALWA, own unweighted calculations based on 17,254 observations from 6,490 individuals. Notes: exponentiated coefficients, cluster-robust standard errors in parentheses, ***/ **/ * indicates significance at the 1/ 5/ 10% level, d denotes dummy variable. Reference category: parent of own sex without formal schooling or training and not employed, no own formal schooling or training, working ≤ 25% of full-time, no training required, firm size: 1-19 employees.
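To make the table entries easier to read, the sketch below converts an exponentiated coefficient from the frequency column into a percentage change of the expected count and recovers an approximate z-statistic. It assumes that the standard errors in parentheses refer to the exponentiated coefficients and were obtained by the delta method (standard error of exp(b) equals exp(b) times the standard error of b); the table notes do not state this, so the z-values are indicative only. The two input pairs are the Model 1 entries for the age group 22-25.

```python
import math


def percent_change(exp_coef):
    """Percentage change in the expected count implied by an exponentiated coefficient."""
    return (exp_coef - 1.0) * 100.0


def z_from_exponentiated(exp_coef, exp_se):
    """z-statistic of the raw coefficient, assuming a delta-method standard error."""
    coef = math.log(exp_coef)
    se = exp_se / exp_coef
    return coef / se


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Age 22-25 (d), Model 1 of Table 3.3: non-participation and frequency columns.
z_nonpart = z_from_exponentiated(0.684, 0.070)
z_freq = z_from_exponentiated(1.184, 0.097)

print(f"frequency effect: {percent_change(1.184):+.1f}% expected training events")
print(f"non-participation: z={z_nonpart:.2f}, p={two_sided_p(z_nonpart):.4f}")
print(f"frequency:         z={z_freq:.2f}, p={two_sided_p(z_freq):.4f}")
```

Under this assumption the recovered p-values fall below 1% and between 1% and 5%, which matches the three and two stars printed for these two entries.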


Model 1 presents the most parsimonious specification. It only includes parental characteristics and those socio-demographic covariates that serve as control variables in all specifications. Model 1 corroborates the descriptive results on the relationship between family background and training participation. A better educated family background is related to a decreased likelihood of totally refraining from on-the-job training as indicated by odds ratios below 1. This relationship is driven by parental schooling as well as their training level. For the number of training events per spell, this finding no longer holds. Only the dummy variables representing parental training levels show a significant positive influence on the respondent's own training frequency. The parental labour market status bears no relevance in either of the two equations. Nevertheless, a Wald test on joint significance of the background variables shows a significant influence of these variables (χ2(16): 106.13, p: 0.00). So far, the intergenerational persistence hypothesis cannot be rejected.

Model 2 extends the specification by including scores based on self-reported measures of cognitive skills, personality traits and cultural capital. In line with the endowment hypothesis, literacy skills as well as cultural capital are positively related to both the occurrence and the number of training activities. Self-reported personality traits are only weakly related to on-the-job training. The importance attached to one's own occupation decreases the likelihood of never participating in on-the-job training whereas this likelihood increases with external locus of control. As expected, the stronger the internal locus of control the higher the frequency of on-the-job training. The respective Wald tests show joint significance of the endowment as well as the family background variables (χ2(22): 190.79, p: 0.00 and χ2(16): 68.22, p: 0.00, respectively). Both the intergenerational persistence hypothesis and the endowment hypothesis can be maintained.

Additional covariates on educational level and full-time equivalent employment experience prior to the current spell are included in Model 3 in Table 3.4. The results support the path dependency hypothesis. Having achieved at least intermediate schooling and any level of vocational degree makes abstinence from non-formal training less likely. There is no influence of training levels on the frequency of on-the-job training. A Wald test confirms the joint significance of own formal educational levels for on-the-job training (χ2(12): 94.78, p: 0.00). The employment experience on the other hand influences the number of training events per spell and hints at a u-shaped relationship. This could be explained by growing opportunity costs due to increasing wages over the life course.

Finally, Model 4 also includes job and firm characteristics of the employment spell at hand. As expected, they exhibit a strong influence on both the probability never to participate in on-the-job training and the number of training events per spell. Public service employees are less likely to experience training abstinence and they participate in courses more often than employees in private firms. The higher the working time or the job requirement, the more likely and frequent is on-the-job training. Finally, on-the-job training is more likely in firms with 200 or more employees than in smaller firms. Given these results, it is not surprising that the results of Wald tests reported in Table 3.5 indicate joint significance of job characteristics. Although the job segment hypothesis is corroborated by these results in general, the firm size only influences training probability, but not its frequency. Being employed in a small firm might therefore be an obstacle to getting on-the-job training at all. As soon as a firm is able and determined to provide training at all, the number of training activities might no longer depend on its size.

Table 3.4.: Determinants of non-participation in training and of the number of courses respectively, zero-inflated negative binomial regression, probit inflation

                                            Model 3                                 Model 4
                                            non-part.           freq.               non-part.           freq.

P: lower secondary schooling (d)            0.648** (0.111)     0.951 (0.181)       0.652*** (0.101)    0.823 (0.174)
P: intermediate schooling (d)               0.658** (0.128)     1.133 (0.231)       0.643** (0.112)     0.995 (0.223)
P: upper secondary schooling (d)            0.597** (0.149)     1.047 (0.231)       0.596** (0.126)     0.918 (0.222)
P: apprenticeship (d)                       0.957 (0.086)       1.263** (0.132)     0.977 (0.076)       1.246** (0.119)
P: master craftsman/technician (d)          1.060 (0.160)       1.361** (0.177)     0.994 (0.120)       1.291** (0.156)
P: higher education (d)                     1.307 (0.275)       1.458** (0.242)     1.229 (0.207)       1.401** (0.230)
P: employed (d)                             0.834* (0.090)      0.835 (0.161)       0.895 (0.077)       0.884 (0.150)
P: self-employed (d)                        0.824 (0.121)       0.907 (0.186)       0.804* (0.100)      0.959 (0.174)
Prose literacy (score)                      0.948 (0.036)       1.095*** (0.035)    0.965 (0.032)       1.088*** (0.034)
Document literacy (score)                   0.997 (0.036)       1.048 (0.036)       1.006 (0.029)       1.040 (0.034)
Numeracy (score)                            1.031 (0.039)       0.985 (0.033)       1.041 (0.033)       0.979 (0.032)
High-cultural activity (score)              0.818*** (0.037)    1.102*** (0.040)    0.876*** (0.033)    1.068* (0.036)
Importance of work (score)                  0.936* (0.034)      0.951 (0.062)       0.972 (0.031)       0.972 (0.053)
Importance of occupation (score)            0.920** (0.032)     1.022 (0.046)       0.965 (0.032)       1.021 (0.041)
Importance of friends (score)               0.942 (0.035)       1.039 (0.033)       0.981 (0.030)       1.053* (0.032)
Importance of family (score)                0.969 (0.041)       1.011 (0.038)       0.959 (0.031)       0.995 (0.033)
External locus of control (score)           1.057 (0.040)       0.952 (0.030)       1.019 (0.031)       0.953 (0.029)
Internal locus of control (score)           1.073 (0.052)       1.147*** (0.053)    1.033 (0.036)       1.123*** (0.045)
Employment related self-confidence (score)  0.991 (0.039)       1.034 (0.043)       0.998 (0.033)       1.034 (0.039)
Lower secondary schooling (d)               0.815 (0.228)       1.173 (0.522)       0.826 (0.220)       1.288 (0.584)
Intermediate schooling (d)                  0.506** (0.141)     0.946 (0.384)       0.609* (0.158)      1.015 (0.418)
Upper secondary schooling (d)               0.486** (0.138)     1.119 (0.461)       0.598* (0.160)      1.156 (0.485)
Apprenticeship (d)                          0.459*** (0.065)    1.091 (0.256)       0.755* (0.112)      0.866 (0.206)
Master craftsman/technician (d)             0.238*** (0.069)    0.920 (0.229)       0.610** (0.127)     0.612* (0.161)
Higher education (d)                        0.179*** (0.088)    1.214 (0.293)       0.864 (0.161)       0.855 (0.220)
Employment experience (in years)            0.974 (0.022)       0.973** (0.013)     0.978 (0.016)       0.977* (0.013)
Employment experience squared               1.000 (0.000)       1.000** (0.000)     1.000* (0.000)      1.000* (0.000)
Public service (d)                                                                  0.728*** (0.059)    1.114* (0.069)
Working .5 full-time (d)                                                            0.735** (0.092)     1.587*** (0.253)
Working .75 full-time (d)                                                           0.566*** (0.082)    1.889*** (0.313)
Working full time (d)                                                               0.487*** (0.056)    1.723*** (0.254)
Induction period required (d)                                                       0.818 (0.105)       1.546** (0.342)
Vocational training required (d)                                                    0.568*** (0.064)    1.786*** (0.356)
Vocational schooling required (d)                                                   0.329*** (0.061)    2.097*** (0.441)
Master craftsman/technician required (d)                                            0.498*** (0.086)    2.628*** (0.585)
Higher education required (d)                                                       0.213*** (0.047)    2.449*** (0.536)
Firm size: 20-99 (d)                                                                0.868* (0.064)      0.969 (0.073)
Firm size: 100-199 (d)                                                              0.958 (0.091)       1.346 (0.282)
Firm size: 200-1,999 (d)                                                            0.697*** (0.056)    1.079 (0.090)
Firm size: 2,000 and more (d)                                                       0.597*** (0.064)    1.091 (0.088)
Constant                                    4.410*** (1.499)    0.033*** (0.016)    13.016*** (4.475)   0.015*** (0.008)

AIC                                         49,303.7                                48,280.3
Wald-statistic (χ2) [p-value]               200.283 [0.000]                         285.084 [0.000]

Source: ALWA, own unweighted calculations based on 17,254 observations from 6,490 individuals. Notes: exponentiated coefficients, cluster-robust standard errors in parentheses, ***/ **/ * indicates significance at the 1/ 5/ 10% level, d denotes dummy variable. Additional controls: age-class, sex, immigrant background, East Germany and regional unemployment rate. Reference category: parent of own sex without formal schooling or training and not employed, no own formal schooling or training, working ≤ 25% of full-time, no training required, firm size: 1-19 employees.

Table 3.5.: Wald tests of variable groups based on hypotheses and estimation results from Model 4

                            non-participation   p-value   frequency   p-value

Family background (H1)      13.819              0.087     24.110      0.002
  Parental schooling        8.153               0.043     7.259       0.064
  Parental training         2.175               0.537     6.560       0.087
  Parental employment       3.100               0.212     1.298       0.522

Endowment (H2)              21.147              0.032     38.486      0.000
  Cognition                 3.077               0.380     8.795       0.032
  Personality               17.963              0.022     24.300      0.002

Own education (H3)          24.688              0.000     14.065      0.029
  Schooling                 17.573              0.001     3.850       0.278
  Training                  7.082               0.069     10.088      0.018

Job characteristics (H4)    235.296             0.000     52.539      0.000
  Working time              43.721              0.000     16.032      0.001
  Job requirement           71.196              0.000     27.630      0.000
  Firm size                 36.129              0.000     3.664       0.453

Source: ALWA, own calculations. Notes: columns show χ2-statistics and significance levels, respectively.
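The p-values in Table 3.5 follow from the χ2-statistics once the degrees of freedom are known. The table does not report them, so the sketch below assumes that each group has as many degrees of freedom as it has dummy variables in Table 3.4 (two for parental employment, three for parental schooling, four for firm size). The helper `chi2_sf` is a hypothetical name; it uses the standard recursion for integer degrees of freedom and only the Python standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square statistic x with integer df.

    Uses Q(x; k) = Q(x; k - 2) + (x/2)**(k/2 - 1) * exp(-x/2) / Gamma(k/2),
    starting from Q(x; 2) = exp(-x/2) or Q(x; 1) = erfc(sqrt(x/2)).
    """
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    while k < df:
        k += 2
        q += half ** (k / 2 - 1) * math.exp(-half) / math.gamma(k / 2)
    return q


# Non-participation equation; degrees of freedom are assumptions (see lead-in).
p_employment = chi2_sf(3.100, 2)   # Parental employment
p_schooling = chi2_sf(8.153, 3)    # Parental schooling
# Frequency equation.
p_firm_size = chi2_sf(3.664, 4)    # Firm size

print(f"{p_employment:.3f} {p_schooling:.3f} {p_firm_size:.3f}")
```

Under these assumed degrees of freedom the three computed values agree with the tabulated p-values of 0.212, 0.043 and 0.453 to three decimals.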

Despite the strong explanatory power of the job and firm variables, the main results of the more parsimonious models remain valid in the full specification given in Model 4, as can be seen in Table 3.5. Most importantly, on-the-job training is still strongly related to parental characteristics. This corroborates the intergenerational persistence hypothesis. Even in adulthood there is a channel through which an educated family background fosters educational attainment. A closer look at the results reveals an even clearer picture of this influence. Whereas the parents' formal schooling achievements are associated with both a lower likelihood of on-the-job training abstinence and a higher frequency of training activities, parental training levels are only positively related to their offspring's frequency of on-the-job training. The latter might be due to networks of parents, continued financial assistance or a higher cumulated wealth among the offspring of better educated and wealthier parents. The schooling of parents, on the other hand, might be better suited as an indicator to isolate wealth-related aspects from inherited cultural capital or intrinsic learning motivation. These different channels, however, cannot be investigated in more detail due to a lack of data.

Although not strongly significant, Wald tests on cognition and personality generally support the endowment hypothesis. Endowments are less relevant for a lack of on-the-job training than for the frequency of training. While cognitive skills are significantly related to the frequency of training, they are not relevant for abstinence from training. Cognitive skills are therefore neither a prerequisite for receiving training in the first place, nor do they significantly foster participation when there is some other obstacle that leads to a lack of training. For those who receive training at all, prose literacy skills in particular contribute to a higher number of training activities, which is in line with human capital theory and the endowment hypothesis. Personality traits are jointly significant for both equations. This is mainly driven by the fact that a high score in cultural capital increases both the likelihood and frequency of participation in on-the-job training. Having a strong internal locus of control leads to a higher frequency of on-the-job training activities per spell.

Although the variables regarding respondents' own formal education are jointly significant in Model 4, some results no longer concur with the path dependency hypothesis. Both formal schooling and training make the absence of on-the-job training significantly less likely, but formal schooling no longer has a jointly significant influence on the frequency of on-the-job training. Existing formal training certificates even have a jointly significant negative relationship with the number of training activities per spell.

This counterintuitive result can be explained by the assumptions given in Section 3.2. If training participation depends on the financial means of a person, higher human capital measured by educational certificates leads to more training because liquidity constraints are less likely to be relevant. If, on the other hand, the wage is strongly related to the job segment one is employed in, as claimed by the theories of segmented labour markets, job characteristics would be more important for wages than education. In that case, the training level as a proxy for the unobserved wage becomes less important. If the job segment is controlled for, a complementary explanation becomes relevant: a better educated and thus more productive worker has higher opportunity costs of training. This influences his investment decision towards less on-the-job training. Finally, human capital theory also offers an explanation of the negative relationship between formal training level and the number of on-the-job training activities per spell. A person that has already achieved a high level of human capital is more likely to refrain from additional investments, because she might already have achieved the optimal level of human capital. Investing more would yield future returns below the actual costs.

3.6. Predictions and sensitivity analysis

3.6.1. Simulation of training participation by family background

To infer the economic relevance of the results, Table 3.6 shows predicted frequencies of on-the-job training per spell, denoted by y, by parental educational level as well as the probabilities of realising certain values for y. The predictions are computed based on the results from Model 4; the rows represent different parental educational levels. The column denoted by E(y) represents the expected frequency of on-the-job training activities per spell, whereas the remaining columns show the probability to experience y training courses during a spell with y ranging from 0 to 3. Predictions for higher values of y are not shown due to their small share in the estimation sample. The confidence intervals in square brackets are estimated by bootstrapping with 1,000 replications.
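A percentile bootstrap of the kind mentioned above can be sketched as follows. The sketch makes two assumptions that the text does not confirm: persons rather than single spells are resampled, to respect the clustering of spells within individuals, and the statistic is a simple mean instead of a model prediction. Function names and data are hypothetical.

```python
import random


def cluster_bootstrap_ci(clusters, statistic, replications=1000, level=0.95, seed=0):
    """Percentile confidence interval from resampling whole clusters with replacement."""
    rng = random.Random(seed)
    ids = list(clusters)
    draws = []
    for _ in range(replications):
        # Draw as many persons as in the sample and pool all of their spells.
        sample = [y for i in rng.choices(ids, k=len(ids)) for y in clusters[i]]
        draws.append(statistic(sample))
    draws.sort()
    lower = draws[int((1 - level) / 2 * replications)]
    upper = draws[int((1 + level) / 2 * replications) - 1]
    return lower, upper


def mean(values):
    return sum(values) / len(values)


# Hypothetical training counts per spell, grouped by person.
spells = {"A": [0, 0, 0], "B": [5, 6, 4], "C": [1, 0, 1], "D": [9, 8],
          "E": [0, 2], "F": [3], "G": [0, 0], "H": [2, 1, 4]}
point = mean([y for group in spells.values() for y in group])
lower, upper = cluster_bootstrap_ci(spells, mean)
print(f"E(y) = {point:.3f} [{lower:.3f}-{upper:.3f}]")
```

For model-based predictions, the statistic would re-estimate the model on each resample and evaluate the prediction at the chosen covariate values, which is considerably more expensive than this toy example.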

Table 3.6.: Predicted frequency of training per spell (y) and probabilities of counts, respectively, by selected parental educational levels

                                                     E(y)           Pr(y=0)        Pr(y=1)        Pr(y=2)        Pr(y=3)
                                                     (1)            (2)            (3)            (4)            (5)

No schooling or training                             0.727          0.753          0.095          0.052          0.032
                                                     [0.545-0.978]  [0.705-0.805]  [0.065-0.114]  [0.040-0.063]  [0.026-0.040]
Lower secondary schooling, apprenticeship            1.119          0.636          0.134          0.076          0.048
                                                     [1.046-1.352]  [0.606-0.648]  [0.120-0.141]  [0.072-0.081]  [0.046-0.052]
Intermediate schooling, master craftsman/technician  1.505          0.575          0.141          0.083          0.055
                                                     [1.169-2.045]  [0.522-0.619]  [0.110-0.159]  [0.072-0.093]  [0.050-0.062]
Upper secondary schooling, higher education          1.446          0.590          0.136          0.080          0.053
                                                     [1.129-1.948]  [0.537-0.633]  [0.107-0.154]  [0.069-0.091]  [0.047-0.060]

Source: ALWA, own unweighted calculations. Notes: predictions based on Model 4, remaining covariates fixed at their mean, 95% confidence intervals in square brackets calculated by bootstrapping with 1,000 replications.

The expected number of training activities per spell tends to increase with parental education. For instance, the expected value is roughly twice as high for people with a highly educated family background (rows 3 and 4) compared to people from a family background without any formal education (row 1). However, the expected numbers of training activities per spell do not significantly differ between the three formally educated groups (rows 2 to 4), as shown by the overlapping confidence intervals.
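The comparison of rows can be reproduced directly from column 1 of Table 3.6. The short sketch below computes each group's expected count relative to the group without parental education and checks which confidence intervals overlap. The dictionary keys are shortened row labels, and the lower bound of the second row is read as 1.046.

```python
# (E(y), lower bound, upper bound) from column 1 of Table 3.6.
expected = {
    "none": (0.727, 0.545, 0.978),
    "lower secondary, apprenticeship": (1.119, 1.046, 1.352),
    "intermediate, master craftsman": (1.505, 1.169, 2.045),
    "upper secondary, higher education": (1.446, 1.129, 1.948),
}


def intervals_overlap(a, b):
    """True if the confidence intervals of two (estimate, lower, upper) rows overlap."""
    return a[1] <= b[2] and b[1] <= a[2]


baseline = expected["none"]
educated = [name for name in expected if name != "none"]

# Expected count relative to the group without parental education.
ratios = {name: expected[name][0] / baseline[0] for name in educated}

# Overlap with the baseline, and among the three formally educated groups.
overlaps_baseline = {name: intervals_overlap(expected[name], baseline) for name in educated}
overlaps_educated = [
    intervals_overlap(expected[a], expected[b])
    for i, a in enumerate(educated) for b in educated[i + 1:]
]

for name in educated:
    print(f"{name}: ratio {ratios[name]:.2f}, overlaps baseline: {overlaps_baseline[name]}")
```

Overlap of two intervals is a conservative heuristic rather than a formal test of equality, which fits the cautious wording of the comparison between rows 2 to 4.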


The same applies when considering the rest of the predicted values. Column 2 shows the predicted probability of experiencing no on-the-job training based on both equations of the zero-inflated negative binomial model. The result can be interpreted as the probability of experiencing no on-the-job training either because of a lack of opportunity or because of not using or not having been offered on-the-job training although the opportunity would have existed. The predicted probability of a lack of on-the-job training for people from a family background without formal education is 75%. For people from an academic family background, this probability is 16 percentage points lower. Thus, when comparing the different parental educational levels, the conclusions are similar to those from column 1. The same applies for the remaining columns 3 to 5, albeit with smaller absolute differences between the educational levels.

This shows that family background plays an important role for on-the-job training, but this applies mainly to the lowest end of the parental educational distribution. A total lack of parental education is associated with low probabilities and frequencies of on-the-job training. Variation in parental education beyond that threshold is no major explanatory factor for non-formal training.

3.6.2. Sensitivity analysis

To check the robustness of the results, alternative specifications are tested. The remaining control variables from Model 4 are included in the estimations but not shown in the tables hereafter. Depending on the groups and variables included in the respective models, the number of observations varies. First, Model 4 is re-estimated using alternative specifications of the family background to infer whether the results depend on assumptions regarding the measurement of these variables. Model 4a includes schooling and training of the parent of the same sex as the respondent as cumulated durations of educational activities instead of categorical educational levels. These durations are computed based on mean durations per educational level that have been tested empirically for Germany (see Helberger, 1988). Other empirical applications can be found in Black et al. (2005) and Plug (2004). Model 4b includes the highest schooling and training levels achieved by any of the parents. Table 3.7 shows the results of Model 4a.

Similar to the results from Model 4, more parental schooling is related to a smaller probability of non-participation and more parental training leads to a higher number of training events. Contrary to both the previous results and the intergenerational persistence hypothesis, the cumulated training duration is weakly associated with a higher probability of non-participation in training. This could be interpreted as an inverse U-shaped relationship between parental schooling duration and non-participation in on-the-job training. Alternatively, it could be due to the fact that the variable representing parental cumulated training duration has only five different values. The influence of this variable can therefore hardly be interpreted as linear. The Wald test on the overall relevance of the family background, nevertheless, still corroborates the intergenerational persistence hypothesis.

Table 3.7.: Determinants of non-participation in training and of the number of courses respectively, comparing different measures of family background, zero-inflated negative binomial regression, probit inflation

                                        Model 4a                Model 4b
                                        non-part.   freq.       non-part.   freq.
P: schooling duration same sex (d)      0.959***    1.002
                                        (0.015)     (0.019)
P: training duration same sex (d)       1.034*      1.041**
                                        (0.020)     (0.019)
P: employed (d)                         0.884       0.902
                                        (0.075)     (0.150)
P: self-employed (d)                    0.793*      0.967
                                        (0.096)     (0.174)
P: lower secondary (d)                                          0.591**     0.726
                                                                (0.126)     (0.273)
P: intermediate schooling (d)                                   0.598**     0.863
                                                                (0.135)     (0.329)
P: upper secondary schooling (d)                                0.641*      0.796
                                                                (0.154)     (0.306)
P: apprenticeship (d)                                           0.895       1.008
                                                                (0.090)     (0.115)
P: master craftsman/technician (d)                              0.777**     1.039
                                                                (0.095)     (0.131)
P: higher education (d)                                         0.898       1.172
                                                                (0.130)     (0.173)
P: (self-)employed (d)                                          0.506***    0.142*
                                                                (0.082)     (0.147)
AIC                                     48,304.7                48,201.7
Wald-statistic (χ2) [p-value]           246.897 [0.000]         277.115 [0.000]

Source: ALWA, own unweighted calculations based on 17,254 observations from 6,490 individuals. Notes: exponentiated coefficients, cluster-robust standard errors in parentheses, ***/**/* indicates significance at the 1/5/10% level, d denotes dummy variable, rest of control variables from Model 4 not shown. Reference category: parent of own sex not employed (Model 4a)/no parent formally educated or employed (Model 4b), no own formal schooling or training, working ≤ 25% of full-time, no training required, firm size: 1-19 employees.

Whereas all parental characteristics so far have been based on information on the parent of the same sex as the respondent, Model 4b replicates Model 4 but uses the highest parental schooling and training levels given in the family as well as a dummy variable that indicates whether at least one parent had been employed or self-employed when the respondent was 15 years old. Parental schooling and, to a lesser extent, parental training make non-participation in training less likely, but none of these regressors is significantly associated with the frequency of on-the-job training. Having had at least one parent who was employed or self-employed, on the other hand, is negatively related to the number of training events. This counterintuitive finding was already present in the previous specifications, although never statistically significant. Except for this result, the intergenerational persistence hypothesis can be sustained, and there is no indication that this specification should be preferred over the specification based on the sex-role model in Model 4.
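
As a reading aid for the exponentiated coefficients in Table 3.7, the following minimal sketch converts such a coefficient back to its raw value and, for the count part, into a percentage change in the expected number of training events. The helper names are hypothetical and not part of the original analysis; the two values are taken from the "freq." columns of Table 3.7. The percentage reading is applied to the count part only, where an exponentiated coefficient acts as a rate ratio; for the probit inflation part only the position relative to one (the sign of the raw coefficient) is informative here.

```python
import math


def raw_coefficient(exp_coef: float) -> float:
    """Recover the untransformed coefficient from its exponentiated form."""
    return math.log(exp_coef)


def percent_change(rate_ratio: float) -> float:
    """Percentage change in the expected count implied by a rate ratio."""
    return (rate_ratio - 1.0) * 100.0


# Values from Table 3.7 (count part, "freq." columns).
training_duration_4a = 1.041   # Model 4a, parental training duration
self_employed_4b = 0.142       # Model 4b, at least one parent (self-)employed

for label, value in [("training duration (4a)", training_duration_4a),
                     ("(self-)employed (4b)", self_employed_4b)]:
    # A raw coefficient above zero means more events, below zero fewer.
    direction = "more" if raw_coefficient(value) > 0 else "fewer"
    print(f"{label}: {percent_change(value):+.1f}% ({direction} training events)")
```

A value above one thus points to more training events than in the reference category, a value below one to fewer.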

To test whether the influence of family background diminishes over time, spells from early in the respondents’ life courses are excluded, i.e. all that began before the age of 26. That way, own educational and labour market achievements gain relevance and path dependency becomes more important. This is tested by Model 4c in Table 3.8. The number of observations decreases markedly and fewer family background regressors remain individually statistically significant. It cannot be discerned whether this is because the relationship described before actually differs between the two age groups or because of the reduced number of observations. The intergenerational persistence hypothesis still receives support from a Wald test, which indicates that the family background variables are jointly significant.
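
How a Wald statistic translates into a p-value can be sketched as follows. The sketch uses the closed-form chi-squared survival function that holds for even degrees of freedom. The number of restrictions (16, i.e. eight family background dummies in each of the two model parts) is an assumption made for illustration only; the degrees of freedom are not reported in the text. The statistic is the family background Wald statistic reported for Model 4c in Table 3.8.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Survival function P(X > x) of a chi-squared variable with even df.

    For even df the survival function has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0          # k = 0 term
    for k in range(1, df // 2):
        term *= half / k            # builds (x/2)^k / k! iteratively
        total += term
    return math.exp(-half) * total


# Family background Wald statistic of Model 4c (Table 3.8).
wald_background_4c = 39.373
assumed_df = 16  # assumption: 8 background dummies x 2 model parts

p_value = chi2_sf_even_df(wald_background_4c, assumed_df)
print(f"p-value under {assumed_df} restrictions: {p_value:.4f}")
```

Under this assumed number of restrictions the p-value is roughly 0.001, in line with the value reported in Table 3.8.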

A closer look reveals that this result is mainly driven by parental schooling rather than parental training. The influence of socialisation by the parents regarding the learning orientation seems to be more strongly and more persistently explained by parental schooling. It is also to be expected that the parental training level is more strongly related to unobserved wealth of the family. Since it is likely that financial support from parents decreases over the life course of the offspring, the result from Model 4c also reflects a decreasing influence of unobserved family wealth on on-the-job training over time. The offspring become more independent of their parents, and that also applies to the financing of human capital investments.

Section 3.3 states that training decisions for public service employees might follow rules that are not in line with human capital theory. Public employers are less inclined to decide on training provision based on cost-benefit considerations. Instead, their decisions on the provision of on-the-job training are often oriented solely towards the administrative requirements of a given service grade. In that case, training participation would be determined more strongly by the training motivation of the given employee. To test this, Model 4d excludes spells of public service jobs. The age groups that are included are similar to those of Model 4. The number of observations decreases compared to Model 4, but less markedly than in Model 4c.

Table 3.8.: Determinants of non-participation in training and of the number of courses respectively, excluding spells starting below the age of 26 or of public service respectively, zero-inflated negative binomial regression, probit inflation

                                        Model 4c                Model 4d
                                        non-part.   freq.       non-part.   freq.
P: lower secondary schooling (d)        0.576***    0.769       0.658**     0.733
                                        (0.100)     (0.177)     (0.108)     (0.191)
P: intermediate schooling (d)           0.725*      0.994       0.652**     0.853
                                        (0.141)     (0.241)     (0.119)     (0.234)
P: upper secondary schooling (d)        0.710       0.942       0.632**     0.982
                                        (0.158)     (0.249)     (0.139)     (0.283)
P: apprenticeship (d)                   1.019       1.171*      0.967       1.261**
                                        (0.097)     (0.111)     (0.081)     (0.133)
P: master craftsman/technician (d)      0.847       1.167       0.984       1.411**
                                        (0.141)     (0.150)     (0.127)     (0.194)
P: higher education (d)                 1.084       1.276       1.116       1.182
                                        (0.197)     (0.225)     (0.194)     (0.197)
P: employed (d)                         1.027       0.951       0.858*      0.793
                                        (0.109)     (0.135)     (0.080)     (0.168)
P: self-employed (d)                    0.887       1.037       0.767**     0.806
                                        (0.148)     (0.170)     (0.103)     (0.179)
AIC                                     28,649.1                33,707.6
Wald-statistic (χ2) [p-value]           255.592 [0.000]         234.326 [0.000]
Background (Wald-statistic) [p-value]   39.373 [0.001]          38.456 [0.001]
Individuals                             4,322                   5,435
Observations                            9,057                   13,524

Source: ALWA, own unweighted calculations. Notes: exponentiated coefficients, cluster-robust standard errors in parentheses, ***/**/* indicates significance at the 1/5/10% level, d denotes dummy variable, rest of control variables from Model 4 not shown. Reference category similar to Model 4. Model 4c: exclusion of all spells beginning before the age of 26, Model 4d: exclusion of employment spells in the public service.

The relationship between variables on parental training and the frequency of on-the-job training is slightly weaker than in Model 4, whereas parental schooling still strongly decreases the likelihood of a lack of on-the-job training. The parental labour market status, though, increases in relevance compared to Model 4, as employed parents are related to a reduced likelihood of a total lack of on-the-job training. If the assumption holds that public employers provide on-the-job training or its financing more willingly than private, for-profit employers, this result is not surprising. Private sector employees may be more likely to be forced to finance all or parts of their on-the-job training themselves. In that case, the wealth of the family of origin, proxied by parental employment, may influence whether on-the-job training can be afforded in the first place. Stronger professional networks of economically successful parents may also positively influence an employer’s decision to finance or at least co-finance the offspring’s on-the-job training.

These sensitivity analyses brought about more differentiated results, but did not contradict the general conclusions. On-the-job training participation is related to both formal parental schooling and training. Whereas parental schooling is mainly associated with the general likelihood of training, parental training is particularly relevant for the number of training activities per spell. Both relationships diminish over the life course of the offspring.
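
The two-part structure behind the "non-part." and "freq." columns of the tables in this section can be illustrated with a minimal sketch of a zero-inflated negative binomial probability function with probit inflation. The NB2 parameterisation (mean mu, dispersion alpha), the function names and all parameter values are assumptions chosen for illustration; they are not estimates from the ALWA data.

```python
import math
from statistics import NormalDist


def nb2_pmf(k: int, mu: float, alpha: float) -> float:
    """Negative binomial (NB2) probability of k events, mean mu, dispersion alpha."""
    r = 1.0 / alpha                      # "size" parameter
    log_p = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
             + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))
    return math.exp(log_p)


def zinb_pmf(k: int, mu: float, alpha: float, inflation_index: float) -> float:
    """Zero-inflated NB2 probability with a probit inflation equation.

    inflation_index is the linear predictor of the inflation part; its
    standard normal CDF is the probability of structural non-participation.
    """
    pi = NormalDist().cdf(inflation_index)
    count_part = (1.0 - pi) * nb2_pmf(k, mu, alpha)
    return pi + count_part if k == 0 else count_part


# Hypothetical values: expected 1.5 courses per spell, dispersion 0.8,
# and an inflation index of -0.5 (about 31% structural non-participants).
probs = [zinb_pmf(k, mu=1.5, alpha=0.8, inflation_index=-0.5) for k in range(200)]
print(f"P(no training) = {probs[0]:.3f}, total probability = {sum(probs):.6f}")
```

In such a model, a regressor can shift the probability of never training (inflation part) without affecting the expected number of courses among potential participants (count part), which is the distinction the tables draw.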

3.7. Summary and conclusions

The aim of this study was to examine whether people from low-qualified family backgrounds make up for any inherited lack of formal education by means of non-formal training. If this is not the case, one would have to conclude that a lack of intergenerational mobility in educational attainment is persistent over the life course. The high relevance of education for both labour market success and social participation makes this a key issue for economic research and from a policy perspective.

The results confirm that educational inequality is a lifelong phenomenon. In accordance with the intergenerational persistence hypothesis, family background is associated with a person’s human capital investments over the whole life. Growing up in a poorly educated household impedes one’s prospects of formal and non-formal educational attainment even during adulthood. The often cited blessings of lifelong learning are misleading as long as non-formal training does not contribute to the catching-up of people from low-qualified family backgrounds. They lack formal education themselves, and as they do not attain non-formal training often or intensively enough, the gap in human capital becomes even wider.

However, the matter is not as simple as that. It is not only the education of parents that influences someone’s adult educational achievement. The endowment hypothesis stated that personality traits and cognitive skills, which are in part heritable, can also explain a substantial part of the differences in on-the-job training participation between people from different family backgrounds. The analyses corroborate this hypothesis. One channel of intergenerational persistence therefore works through the inheritance of endowments that make educational attainment more likely.

The path dependency hypothesis stated that a higher level of formal education is associated with a higher likelihood and frequency of on-the-job training. This receives only partial support from the results in that existing formal schooling and training in fact make a total lack of on-the-job training less likely. They seem to be a prerequisite for further education. The frequency of on-the-job training per
employment spell is influenced by previous formal training only. When job and firm characteristics are controlled for, the fact that someone has undergone formal training is associated with a lower frequency of non-formal training. This might be due to high opportunity costs resulting from a level of human capital that is already very high. Additional non-formal training might exceed the optimal level of human capital, in which case such investments would not pay off in the long run.

Finally, the job segment hypothesis stated that employment in a favourable labour market segment leads to more on-the-job training. This is corroborated by the finding that both the likelihood and the frequency of training rise with working time and with job requirements. Firm size is positively related only to the likelihood of on-the-job training.

This study makes two contributions to the literature. First, it shows that some factors contribute to a lack of non-formal training, i.e. they constitute an obstacle to training, whereas other factors are associated with both the likelihood and the frequency of training. The negative binomial regression model has proven useful for more differentiated analyses than would have been possible with the standard Poisson regression model. Second, and more importantly, the present analysis has been the first to bring together the issue of intergenerational educational persistence with an analysis of the determinants of non-formal training. In doing so, intergenerational mobility, or a lack thereof, could be analysed more comprehensively.

These results have several policy implications. In the short run, human capital deficits could be reduced in part by incentives to participate in non-formal training and by information on its benefits aimed at low-qualified people. Although these short-run remedies might be small steps towards more equal educational chances, a more advantageous cost-benefit ratio may be achieved by investments in human capital earlier in the life course. Findings reviewed, for instance, by Cunha and Heckman (2010) stress how profitable early investments in human capital are. It is therefore not surprising that the following long-term policy implications resemble those from the literature on intergenerational persistence in formal education.

Family policy can play a major role in fostering intergenerational mobility. Possible measures include compulsory and cost-free kindergarten attendance or subsidised child-care facilities. Both foster the labour supply of parents of young children who would otherwise be unable to finance daycare for their children. The resulting increase in income would diminish financial constraints and allow additional investments in the human capital of the offspring. Moreover, daycare facilities with well-educated staff offer favourable role models and learning-oriented peer groups. Both may foster the development of children’s skills that are important for future educational attainment
and economic success (e.g., Heckman et al., 2010). These findings make a particularly strong point against any policy measures that prevent or discourage parents from making use of child-care facilities outside the family. In the light of these findings, the introduction of the child-care benefit currently discussed in Germany,[16] which would have a particularly discouraging effect on the labour supply of low-educated parents, would be ill-advised.

Educational policy has an equally important role to play. The comprehensive survey by Björklund and Salvanes (2011) identifies key obstacles in educational systems that lead to a lack of educational mobility. For one thing, the duration of compulsory school attendance proves to be negatively related to the influence of the family background on educational attainment. The longer children are obligated by law to learn independently of parental decisions, the more human capital will be attained even after the period of compulsory school attendance. For another, the system of school tracking is important. The longer all pupils learn together before being separated into different tracks, the lower the influence of the family background on early educational decisions.

This study is a first step towards a better understanding of the long-term influence of family background on non-formal training participation. Further steps should include a replication of the present study for other countries. By comparing the results for countries with educational systems similar to that of Germany with countries that show a higher intergenerational educational mobility, research on the influence of the institutional background could be conducted. Moreover, as the data set includes only cross-sectional measures of cognition and personality, the assumption of their time invariance is crucial for the analysis. To overcome this, repeated measurements of these dimensions would be necessary.

[16] See http://www.economist.com/node/21554245 on the debate over the proposed “Betreuungsgeld”.

4. Do literacy and numeracy pay off? On the relationship between basic skills and earnings∗

4.1. Introduction

Ever since its establishment in the early 1960s (Becker, 1962; Schultz, 1961), research on human capital has focused on the labour market returns to individuals’ schooling, which was considered to represent individuals’ productivity. However, it was soon recognised that the indicator typically used, time spent in education, may not necessarily be a good proxy for individuals’ capabilities, and that the resulting coefficients will be biased if ability cannot be accounted for (Griliches and Mason, 1972).

Though there is a small but established literature on the returns to cognitive ability,[17] it might be argued that adults’ basic skills, i.e. literacy and numeracy, are at least as relevant for earnings. Not only may they be considered better indicators of individuals’ productivity than schooling credentials, but they are also more malleable than innate cognitive abilities. The latter are partly determined by prenatal circumstances, are mainly developed in childhood and early youth and cannot be altered easily later on in the life course (Cunha and Heckman, 2007). In fact, numerous policy programmes that are embedded in the “lifelong-learning” debate show that the removal of deficiencies in and the enhancement of adults’ basic skills are of high interest to policymakers.[18] But similar to the literature on cognitive abilities, there is only little evidence on the role of adults’ literacy and numeracy. This is not surprising since data on adults’ basic skills are rather scarce and have a focus on Anglo-Saxon countries, with only very limited evidence from other countries.

∗ This chapter is based on Antoni and Heineck (2012), which is joint work with Guido Heineck.

[17] It is worth noting that the available evidence is far from conclusive: on the one hand, there is a large number of studies that reveal substantial returns to cognitive abilities (e.g., Bronars and Oettinger, 2006; Cameron and Heckman, 1993; Green and Riddell, 2003). On the other hand, there are as many studies suggesting that cognitive abilities have barely any additional effect on earnings (Bound et al., 1986; Murnane et al., 1995) and that they are a poor predictor of earnings compared to a direct measure of education, family background, and environment (Cawley et al., 2001; Zax and Rees, 2002).

[18] See, for example, Recommendation 2006/962/EC of the European Parliament and of the Council on key competencies for lifelong learning.

We add to the literature by providing first evidence on the association between adults’ literacy and numeracy skills and individual earnings for Germany. In contrast to evidence on the formation of basic skills in the school student population, for which there has been a surge in research interest after the German “PISA shock” in 2000 (see Klieme et al., 2010), our analysis will shed some light upon the situation of adult workers, for whom there is barely any research. To be able to do this, we base our analysis on the ALWA-ADIAB and ALWA-LiNu data, a combination of data sources that is unique in several aspects, even beyond the German context.

To be clear, the set of basic skills as given in our data will not allow us to separate innate abilities from literacy and numeracy skills, nor can we disentangle causal mechanisms that might lead to differences between the returns to educational attainment and to basic skills. We primarily aim at providing evidence on whether literacy and numeracy add to the explanation of variation in earnings beyond individuals’ educational attainment or whether potential effects are fully absorbed by educational credentials. Despite its more descriptive character, our analysis is relevant for both economists and policy makers as it furthers our understanding of the returns to education and skills beyond formal schooling.

The remainder of this chapter is organised as follows: we review prior research on the returns to basic skills in Section 4.2, derive hypotheses in Section 4.3, introduce the data sets in Section 4.4 and provide results in Section 4.5. Section 4.6 draws conclusions.

4.2. Background and previous research

As noted above, interest in individuals’ abilities is nothing new in the empirical literature on human capital. Where available, researchers used information on individuals’ cognitive abilities for technical reasons, i.e. in order to remove potential issues caused by omitted variables. Researchers are also interested in the labour market value of cognitive abilities and, along the same line of argument, of basic skills, as they may reflect individuals’ productivity as well as formal schooling, if not better. If so, human capital theory would predict that such a set of skills will pay off.

It is therefore not surprising that the interest in adults’ basic skills has also been around for some time, and particularly so in the political sphere. Yet it was only in the more recent past that the OECD, faced with the challenges of increasing demands for a skilled workforce in knowledge-based societies, implemented two comparative studies: the International Adult Literacy Survey (IALS), which was conducted in three periods between 1994 and 1998 and covered 21 countries, and the 2003 Adult Literacy and Lifeskills Survey (ALL), which covered six countries.

Some of the studies that examined the returns to basic skills are based on either ALL or IALS data. Yet, before looking at the findings in more detail, it is useful to first address the concepts of numeracy and literacy, which are, as Dougherty (2003, p. 512) puts it, “... susceptible to definitional variability”. It is furthermore necessary to distinguish basic skills measures as given in the database we use from cognitive abilities as included in, for example, the National Longitudinal Survey of Youth (NLSY). Such measures are typically used as proxies for individuals’ innate abilities and are mainly taken from a common cohort and when individuals are still in education.

We, however, use data drawn from basic skills tests conducted during a survey with adult respondents. Our measures therefore are mainly from individuals who have completed their education and vocational training and are of different ages. Beyond that, and more importantly, the tests aim at capturing skills that are needed in individuals’ everyday life. Since the data do not contain actual measures of innate ability, our basic skills measures should be interpreted as a compound measure of an individual’s innate abilities, educational gains, and post-education experiences in both working and private life.

As outlined above, there is only little evidence on the association between workers’ basic skills and labour market success, measured by either employment participation or earnings. While there are a few studies from the 1970s (see Dougherty, 2003), we concentrate on more recent studies to allow some comparison to the current German setting. As most of the research on the association between adults’ basic skills and labour market success focuses on Anglo-Saxon countries, we first outline a few relevant studies from North America and the UK before we look at other countries, including Germany.

Dougherty (2003) uses data for the US from the NLSY, which provides test scores from the Armed Services Vocational Aptitude Battery (ASVAB). Rather than using the composite Armed Forces Qualification Test (AFQT) score that can be derived from the ASVAB scores, the author employs numeracy scores, based on the individuals’ arithmetic reasoning attainment, and literacy scores, a joint verbal composite based on word knowledge and paragraph comprehension. The results suggest that numeracy is strongly related to earnings, working indirectly via its effect on college attainment but also directly, controlling for educational attainment. While the effect is small in absolute terms, it appears to increase over the 1988-1996 period covered in the analysis. Compared to that of numeracy, the literacy earnings gradient is smaller and less significant.

Ishikawa and Ryan (2002) use only the prose literacy measures from the 1992 National Adult Literacy Survey (NALS) for the US and examine both the formation of basic
skills and their association with earnings. They attempt to disentangle basic skills that are learned in school from those acquired in the post-school periods and conclude that it is the “substance of learning in school [...] that counts” (ibid., p. 241), which emphasises the need to account for both schooling and basic skills. Their results further suggest that, compared to their white or Hispanic counterparts, black workers do not benefit from basic skills acquired in school.

McIntosh and Vignoles (2001) use data from the 1991 British National Child Development Study (NCDS) and data on the UK from the 1994 IALS.[19] The authors focus on individuals in the bottom part of the skills distribution, and their findings for numeracy imply that low skilled individuals are substantially more likely to be employed and, if employed, earn some 16-21% more than the lowest skilled. The results for literacy are more heterogeneous and there are large differences between results based on either data set, so that the authors refrain from drawing a definite conclusion.

Another recent study for the UK is by Vignoles et al. (2011), who use data from the British Cohort Study (BCS) for 2004 and, for comparison over time, NCDS data for 1995. Similar to the study by McIntosh and Vignoles (2001), the authors find that both numeracy and literacy skills measured at age 16, 21, and 34 are positively related to the earnings of 34-year-old workers. The earnings premia are about 11% for a standard deviation increase in numeracy and 14% for literacy, with no substantial differences between men and women. These results are robust to a range of different specifications. The findings further imply that despite numerous policy efforts to increase the supply of skills over the period 1995 to 2004, the value of basic skills has remained stable in the UK.

Evidence for full-time working male Canadians is provided by Green and Riddell (2003), who use the country component of the 1994 IALS. They employ the average of test scores on document, prose and quantitative skills and run quantile regressions in order to examine whether the returns to basic skills vary across the wage distribution. Their results suggest that this is not the case, though they also find a strong association between their skills measure and the earnings of male workers.

The analysis of Shomos (2010) complements the overall picture of the positive relationship between adults’ basic skills and labour market success as measured by earnings. Using data from the Australian Adult Literacy and Life Skills Survey (ALLS)[20] for 2006, his results suggest a 14 percentage point increase in wages for an increase in skills, which are measured in categories, from the lower to the next
higher skill level. Separated by gender, there are somewhat stronger effects for men than for women.

[19] This is interesting on its own, since the authors compare the methodological strengths and, above all, weaknesses of both data sources. The authors in fact conclude that both data sets “... suffer from significant, but different, measurement problems” (McIntosh and Vignoles, 2001, p. 474).

[20] Note that while this survey bears the same name, Australia was not one of the six countries in the comparative survey in 2003.

One of the few studies on non-Anglo-Saxon countries is by Denny and Doyle (2010), who use the 1998 IALS components for the Czech Republic, Hungary, and Slovenia. The authors employ semi-parametric econometric techniques and find that returns to basic skills are significant in Slovenia and the Czech Republic, but to a lesser extent in Hungary. Based on the flexible semi-parametric part of their analysis, the authors conclude that the returns vary considerably between numeracy and literacy as well as across the countries in their sample.

For Germany, there is only very limited evidence: measures of individuals’ cognitive abilities, as approximated by an ultra-short IQ test that was implemented in the 2006 wave of the German Socio-Economic Panel Study (SOEP), are used by Heineck and Anger (2010). Their results indicate that, controlling for education, speed of cognition is positively related to the earnings of males, but not relevant for females. However, since their measure is a proxy for individuals’ innate abilities rather than for basic skills, the comparability to our analysis is limited.

The German 1994 IALS component has, to our knowledge, been explored only by Freeman and Schettkat (2001). They use the IALS numeracy scores, and compare Germany to the US with a focus on how much the skills distribution contributes to the differences in the wage distributions in both countries. So, again, comparability to what we do in this analysis is limited, and even more so since the German IALS provided information only on net rather than gross income, and only the workers’ position in one of 20 income brackets rather than actual income.

4.3. Hypotheses

Both theoretical considerations and implications from the empirical findings mentioned above allow us to derive hypotheses on the relationship between basic skills and earnings. One of the fundamental implications of human capital theory is that one’s productivity, and thereby one’s wage, depends on the skills one is endowed with. This includes cognitive and non-cognitive abilities (Heckman et al., 2006; Heineck and Anger, 2010), as well as basic skills as given in our data. The assumed relationship is positive, which leads straightforwardly to Hypothesis 1:

H1: Conditional on education and other productivity-related characteristics, basic skills are positively related to labour market earnings.

Although signalling (Spence, 1973) and screening (Stiglitz, 1975) theories are related to innate abilities rather than basic skills, some of their extensions offer implications different from those of human capital theory. Riley (1976) and Psacharopoulos (1979) laid the ground for what Farber and Gibbons (1996) later on termed “employer learning”: employers may base hiring decisions and initial wage levels on formal educational certificates as signals for productivity; over time, they will increasingly take observable productivity into account for wage setting decisions. Hypothesis 2 summarises the implications of employer learning for our analysis:

H2: The longer the tenure at a given firm, the higher the rewards for basic skills.

Based solely on theoretical considerations we would not expect any differences in the strength of the relationship between earnings and either of the domains of basic skills we consider. After all, human capital theory does not predict that any aspect of human capital should generally yield higher monetary payoffs than others.

Contrary to that, several studies mentioned in the previous section do find significant differences. Based on these results of higher earnings payoffs to numeracy than to literacy we put forward Hypothesis 3:

H3: If there are any differences between the rewards to literacy and numeracy, there should be a stronger relationship of earnings with numeracy than with literacy.

Some of the findings mentioned above indicate differences in rewards to basic skills between socio-demographic groups, e.g. by gender or ethnicity. Theories that attempt to explain individual earnings, though, do not offer any explanation for such heterogeneity. In fact, we argue that such findings result from omitted variable bias as they are confounded by the influence of unobserved determinants of earnings. As our linked data set allows us to comprehensively control for important characteristics such as the educational and employment history, as well as tenure and job characteristics, we do not expect to find significant heterogeneity in the rewards to basic skills.

Consequently, Hypothesis 4 reads as follows:

H4: Rewards to literacy and numeracy do not vary between different subgroups in the labour market.

4.4. Data

This chapter finally uses all data sources described in chapter 2, thereby utilising a combination of data sources that is innovative and unique in several aspects, even beyond the German context. It brings together longitudinal survey data on


educational activities with actual measures of basic skills and tops that off with accurate and reliable earnings information from administrative employment data.

Moreover, one of the main foci of the ALWA survey has been on educational activities. That allows for a particularly accurate measurement of human capital investments over the life courses of the respondents. Finally, a broad set of cross-sectional variables includes, for instance, the immigrant background as well as language skills.

Some limitations of the data have to be borne in mind. The tests on these basic skills were run only once, shortly after the longitudinal interview in ALWA, so that we have a cross-sectional measure only. Although these types of skills are potentially subject to change over time (Desjardins and Warnke, 2012; Hertzog et al., 2008), we have to assume that they are constant over our observation period on the individual level. To limit the reach of this assumption, we only consider employment spells in the year of measurement and up to three years after that. During this relatively short time-span, we have to rely on the assumption that basic skills are not likely to change substantially.

The administrative data have some shortcomings, though we argue that they are not relevant for our analysis. First, the data do not include spells of the self-employed, civil servants or students in higher education, as these groups do not contribute to the pension insurance schemes underlying the data of the Federal Employment Agency. However, none of these groups is relevant when examining the valuation of basic skills in dependent employment. The majority of German civil servants are remunerated according to strict rules and mainly on the basis of their position. This does not leave much room for differentiation on the basis of individual skills, nor is career advancement of civil servants mainly determined by such skills. The income of the self-employed is determined by the success of their enterprise and is therefore not comparable to income from dependent employment.

Second, information on educational levels provided by employers is not relevant for social security entitlements, which is why it is reported less accurately and less comprehensively than information on earnings. This leads to missing or inconsistent information on educational levels in about 10% of the administrative employment spells (see e.g., Fitzenberger et al., 2006). As we combine these data with ALWA, we are able to compensate for this shortcoming by using the detailed educational histories given in the survey data.

For our empirical analyses we are only able to use data on ALWA respondents who both participated in the skills tests and could be identified in the administrative data. The intersection of these cases, the removal of cases with missing values, a restriction to spells of full-time employment as well as the exclusion of spells of current vocational training relationships finally lead to data on 1,818 individuals.


This leaves us with a total of 5,924 observations over the period 2007-2010, with a reference date of June 30.[21] Table 4.1 provides descriptive sample statistics and indicates the source of each variable.

Table 4.1.: Sample statistics of independent variables

                                source   mean     s.d.       min      max
Literacy (ML estimate)          S        308      (34)       153      500
Numeracy (ML estimate)          S        308      (54)       0        500
No professional degree (d)      S        .073     (.26)      0        1
Vocational degree (d)           S        .67      (.47)      0        1
Academic degree (d)             S        .25      (.43)      0        1
Employment experience in years  A        18       (8.3)      .033     36
Male (d)                        S        .65      (.48)      0        1
Age in years                    S        41       (8.6)      19       54
Immigrant background (d)        S        .18      (.39)      0        1
Native language German (d)      S        .97      (.18)      0        1
Job in East Germany (d)         A        .18      (.38)      0        1
Tenure in years                 A        6.9      (6.8)      .0028    36
Firm size (no. of employees)    A        1,214    (4,932)    1        52,156
2007 (d)                        A        .24      (.43)      0        1
2008 (d)                        A        .25      (.43)      0        1
2009 (d)                        A        .26      (.44)      0        1
2010 (d)                        A        .26      (.44)      0        1

Source: ALWA-ADIAB, ALWA-LiNu, own unweighted calculations based on 5,924 observations. Notes: Variables not shown: occupation and industry dummies. Letters S or A denote survey or administrative data, respectively. d denotes dummy variable.

[21] See Table D.1 in the appendix on the loss of observations due to each of the data preparation steps.

Our dependent variable reflects the log monthly gross earnings computed for each of the above spells, deflated to the year 2005 by the consumer price index. Our central interest lies in the relationship of these earnings with individuals’ basic skills, prose literacy and numeracy. To ease interpretation of the estimated coefficients, the scores from the tests are standardised and included as z-scores with a mean of zero and unit standard deviation. The second important aspect of human capital in our analysis, professional qualification, is measured by the levels “no professional degree”, “vocational degree” and “academic degree”. Another element of human capital is employment experience. We are able to differentiate between the total employment experience and tenure in a given job, both of which we compute based on the employment history given in the administrative data and include as a quadratic polynomial.

In addition to that we control for the respondent’s age, as most elements of cognitive functioning, including both cognitive abilities and basic skills, are subject to decline


over the life course (Desjardins and Warnke, 2012; Hertzog et al., 2008). That is why it is important to control for both employment experience and age in the given context. To account for the fact that the gender wage gap is considerable in Germany (see Antonczyk et al., 2010), the gender of the respondent is included as a dummy variable. A first, second or third generation immigrant background is captured by a single dummy variable and is supplemented by an indicator of whether the respondent’s native language is German.

Contrary to the human capital theory, job and firm characteristics are considered as important predictors of individual earnings in theories of segmented labour markets (see Leontaridi, 1998). In line with these theories we include a number of variables reflecting the demand side of the job relationship. These variables are the occupational segment (10 categories), the firm’s size and its industry (18 categories) as well as the year of the employment spell (years 2007-2010). As labour market outcomes in Germany are strongly related to regional aspects, we include a dummy variable for being employed in East Germany.
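The two transformations described in this section, deflating earnings to base-year prices before taking logs and standardising test scores, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and all numbers are hypothetical, and the text does not say whether the population or the sample standard deviation was used for the z-scores (the population variant is assumed here).

```python
import math
from statistics import mean, pstdev


def log_real_earnings(nominal_earnings, cpi, base_cpi=100.0):
    """Deflate nominal monthly earnings to base-year prices, then take logs."""
    return math.log(nominal_earnings * base_cpi / cpi)


def z_scores(scores):
    """Standardise raw test scores to mean zero and unit standard deviation."""
    m = mean(scores)
    s = pstdev(scores)  # assumption: population s.d.; the source does not specify
    return [(x - m) / s for x in scores]


# Hypothetical raw numeracy scores (not taken from ALWA-LiNu).
raw_numeracy = [254.0, 308.0, 362.0, 290.0, 326.0]
numeracy_z = z_scores(raw_numeracy)

# Hypothetical spell: 3,120 euros at a CPI of 104 relative to a 2005 base of 100.
ln_y = log_real_earnings(3120.0, cpi=104.0)
```

With z-scores as regressors, a coefficient reads directly as the change in log earnings associated with a one standard deviation higher score.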

4.5. Empirical analysis

4.5.1. Method

We estimate a Mincer-type earnings equation (Mincer, 1958) augmented by the scores of prose literacy and numeracy as well as the set of control variables as outlined in the previous section. This is summarised by

$$\ln y_{it} = x_{it}'\beta + c_i'\gamma + u_{it}, \qquad i = 1, \dots, N, \quad t = 1, \dots, T. \tag{4.1}$$

The dependent variable $\ln y_{it}$ is the log of monthly gross earnings, $x_{it}$ is the vector of controls, and $c_i$ represents the proficiency scores in numeracy and prose literacy, standardised as z-scores. The error term is denoted by $u_{it}$, and the parameters of interest are $\beta$ and $\gamma$. The data provide us with an unbalanced panel structure with up to four observations per person.

We estimate the earnings equation using feasible generalised least squares (GLS) random effects regression (Balestra and Nerlove, 1966). As we have to assume that there is correlation between different observations of a given respondent, we introduce $\alpha_i$ as an individual-specific component of the error term:

$$u_{it} = \alpha_i + \varepsilon_{it}. \tag{4.2}$$
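As a short worked step, the decomposition in (4.2) pins down the correlation between two spells of the same person. The derivation below relies on the textbook random effects assumptions, which the text does not spell out and which are therefore assumptions here: $\alpha_i$ and $\varepsilon_{it}$ are uncorrelated with each other, have constant variances $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$, and $\varepsilon_{it}$ is serially uncorrelated. For $t \neq s$:

$$\operatorname{Var}(u_{it}) = \sigma_\alpha^2 + \sigma_\varepsilon^2, \qquad \operatorname{Cov}(u_{it}, u_{is}) = \sigma_\alpha^2, \qquad \operatorname{Corr}(u_{it}, u_{is}) = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_\varepsilon^2}.$$

Under these assumptions the within-person error correlation vanishes exactly when $\sigma_\alpha^2 = 0$, so a test of that restriction is a test of whether pooling the spells without an individual component would be adequate.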


To test whether this improves our specification compared to a regression with a single error term we use a modified version of the Breusch-Pagan Lagrange multiplier test (Breusch and Pagan, 1980) for random effects that is suitable for unbalanced panels (Baltagi and Li, 1990). The null hypothesis is that the variance of the individual-specific component of the error term is not different from zero ($H_0: \sigma_\alpha^2 = 0$). This would imply that the error correlation between different observations of a given individual is negligible and pooled ordinary least squares regression is adequate for our analysis. This null hypothesis is clearly rejected ($\chi^2$: 4615.06, p: 0.000).

By computing standard errors that correct for clustering on the individual level (see Huber, 1967; White, 1980) we specifically consider the intra-individual correlation between different observations. If we did not, the standard errors would be biased downwards despite the fact that we already consider the panel structure by using the random effects model.
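The logic of the Lagrange multiplier test can be illustrated with a small simulation. The sketch below uses the unbalanced-panel form of the statistic as it is commonly stated for the Baltagi and Li (1990) extension, $LM = \frac{n^2}{2}\,\frac{A_1^2}{\sum_i T_i^2 - n}$ with $A_1 = 1 - \sum_i (\sum_t e_{it})^2 / \sum_i \sum_t e_{it}^2$; this formula is quoted from memory of the standard textbook presentation and should be checked against the original before being relied on. The data are simulated, the residuals come from an intercept-only pooled regression, and all function names are hypothetical.

```python
import random

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value, chi-squared with 1 degree of freedom


def lm_random_effects(residuals_by_person):
    """LM statistic for H0: var(alpha_i) = 0, allowing unequal spell counts.

    residuals_by_person: list of lists of pooled-OLS residuals, one list per person.
    """
    n_obs = sum(len(e) for e in residuals_by_person)
    sum_t_sq = sum(len(e) ** 2 for e in residuals_by_person)
    sq_of_sums = sum(sum(e) ** 2 for e in residuals_by_person)
    sum_of_sq = sum(x * x for e in residuals_by_person for x in e)
    a1 = 1.0 - sq_of_sums / sum_of_sq
    return (n_obs ** 2 / 2.0) * a1 ** 2 / (sum_t_sq - n_obs)


def simulated_residuals(n_persons, sd_alpha, seed):
    """Simulate u_it = alpha_i + eps_it with one to four spells per person."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_persons):
        alpha = rng.gauss(0.0, sd_alpha)
        n_spells = rng.randint(1, 4)
        panel.append([alpha + rng.gauss(0.0, 1.0) for _ in range(n_spells)])
    # Intercept-only pooled OLS: residuals are deviations from the grand mean.
    flat = [x for spells in panel for x in spells]
    grand_mean = sum(flat) / len(flat)
    return [[x - grand_mean for x in spells] for spells in panel]


lm_stat = lm_random_effects(simulated_residuals(500, sd_alpha=1.0, seed=1))
print(lm_stat > CHI2_1_CRIT_5PCT)
```

For a balanced panel the expression reduces to the familiar $\frac{NT}{2(T-1)}\,A_1^2$, which is a useful sanity check on the implementation.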

4.5.2. Results

The first panel of Table 4.2 shows our estimates including the basic skills measures, which we compare to a more standard earnings regression given in the second panel in order to examine whether and to what extent the returns to education are affected by omitting individuals’ capabilities. As basic skills are included as standardised terms, we learn that an increase of literacy or numeracy by one standard deviation is related to an increase in monthly earnings by 3% or 6%, respectively, which is in line with Hypothesis 1. Compared to prior research for other countries, the skills premia are lower. This is, however, not problematic as it may be caused by, for example, differences in wage setting schemes, or differences in the sample structures. The differential in the coefficients furthermore hints at a stronger relationship of earnings with numeracy than with literacy, but a Wald test shows that the difference between the two coefficients is not statistically significant ($\chi^2$: 2.37, p: 0.124). Our results therefore show that basic skills do matter but, in line with Hypothesis 3, that rewards do not differ significantly by skill type.

As for the standard human capital covariates, we find that in neither of the two specifications do respondents with a vocational degree earn significantly more than those without any professional degree. This may seem puzzling but is likely due to the sample’s restriction to spells of full-time employment. Those individuals without any professional degree that manage to become full-time employed are most likely a positive selection in terms of their productivity. Investing in an academic degree is related to substantially higher earnings compared to people without a formal professional qualification. The earnings premium is larger than 48% in the standard Mincer earnings regression (column 2), which decreases to 41% when controlling


Table 4.2.: Augmented Mincer-type earnings equation, random effects GLS estimates with and without basic skills scores, respectively (2007-2010)

                                     with scores           without scores
                                     (1)                   (2)
Literacy (z-score)                    0.031*** (0.012)
Numeracy (z-score)                    0.058*** (0.011)
Vocational degree (d)                 0.109    (0.109)      0.128    (0.108)
Academic degree (d)                   0.407*** (0.098)      0.479*** (0.094)
Employment experience in years        0.041*** (0.005)      0.040*** (0.005)
Employment experience squared/100    −0.039*** (0.012)     −0.038*** (0.012)
Male (d)                              0.265*** (0.028)      0.282*** (0.028)
Age in years                          0.012    (0.010)      0.016    (0.010)
Age squared/100                      −0.028**  (0.013)     −0.034*** (0.013)
Immigrant background (d)              0.036    (0.029)      0.018    (0.029)
Native language German (d)           −0.042    (0.057)     −0.010    (0.059)
Job in East Germany (d)              −0.092*** (0.027)     −0.106*** (0.027)
Tenure in years                       0.009*** (0.003)      0.009*** (0.003)
Tenure squared/100                   −0.031**  (0.013)     −0.031**  (0.013)
Firm size (no. of employees)          0.000*** (0.000)      0.000*** (0.000)
Firm size squared/100                −0.000*** (0.000)     −0.000*** (0.000)
Constant                              6.614*** (0.224)      6.480*** (0.222)

R2 overall                            0.488                 0.475
R2 between                            0.511                 0.498
R2 within                             0.086                 0.084
Observations                          5,924                 5,924

Source: ALWA-ADIAB, ALWA-LiNu, own unweighted calculations. Notes: log monthly earnings as dependent variable, cluster-robust standard errors in parentheses, ***/ **/ * indicates significance at the 1/ 5/ 10% level, d denotes dummy variable. Reference: no professional degree. Variables not shown: occupation, industry and year dummies.


for basic skills (column 1). Thus, some part of the earnings differences between people with different levels of qualification does indeed stem from differences in skill endowments. Not all of the earnings gap can be attributed to mere differences in educational credentials, as the signalling theory would imply. However, the coefficients for vocational and academic degrees significantly differ in the specifications both with and without the basic skills measures ($\chi^2$: 68.37, p: 0.000 and $\chi^2$: 78.95, p: 0.000, respectively), showing that university graduates earn more than vocationally trained workers, even when controlling for basic skills.

There is evidence for a considerable gender earnings gap. Although a coefficient of 27% may appear particularly high, it is in line with the findings of Antonczyk et al. (2010). It might also be explained by the fact that we include spells in our analysis on the basis of an administrative variable that differentiates between full-time and part-time jobs. Among those classified as full-time employed, the regular weekly working hours may still vary considerably. As we do not observe such deviations, and because our dependent variable measures the gross monthly earnings instead of the hourly wage, the gender earnings gap may be due to different working time patterns between women and men. Moreover, we cannot control for overtime or shiftwork payments, which might also vary strongly between women and men.

Furthermore, there is an earnings gap of 9% for people working in the eastern part of Germany, compared to their West German counterparts. Moreover, although not statistically significant, the coefficient on individuals with an immigrant background would imply a remarkable result inasmuch as it indicates higher earnings compared to native Germans. Since well educated people are over-represented in our data (Kleinert et al., 2012a), we assume that to be a result of sample selectivity in terms of willingness to take part in the skills tests.

We also examine whether there is non-linearity in the relationship between basic skills and earnings. In the specification shown in Table 4.3 we additionally include squared terms of literacy and numeracy. Both literacy and numeracy remain positively related to earnings, as indicated by the individual significance of both linear terms, but neither of the squared terms is statistically significant. Furthermore, Wald tests clearly reject the hypothesis that the linear and the quadratic terms for literacy and numeracy are jointly equal to zero ($\chi^2$: 6.80, p: 0.033 and $\chi^2$: 25.72, p: 0.000, respectively). As the positive relationship between basic skills and earnings seems to be linear, we conclude that there would be a monetary payoff to efforts of increasing a person’s basic skill endowments, regardless of where she might initially be located in the skill distribution.

To examine potential heterogeneity in the skills-earnings relationship between different groups in the labour market, Table 4.4 provides the results of a specification that


Table 4.3.: Augmented Mincer-type earnings equation, random effects GLS estimates including squared basic skills scores (2007-2010)

Literacy (z-score)                    0.030**  (0.014)
Literacy (z-score, squared)          −0.000    (0.003)
Numeracy (z-score)                    0.059*** (0.013)
Numeracy (z-score, squared)          −0.002    (0.004)
Vocational degree (d)                 0.107    (0.110)
Academic degree (d)                   0.406*** (0.099)
Employment experience in years        0.041*** (0.005)
Employment experience squared/100    −0.039*** (0.012)
Male (d)                              0.265*** (0.028)
Age in years                          0.012    (0.010)
Age squared/100                      −0.028**  (0.013)
Immigrant background (d)              0.037    (0.029)
Native language German (d)           −0.044    (0.056)
Job in East Germany (d)              −0.092*** (0.027)
Tenure in years                       0.009*** (0.003)
Tenure squared/100                   −0.031**  (0.013)
Constant                              6.658*** (0.225)

R2 overall                            0.488
R2 between                            0.510
R2 within                             0.126
Observations                          5,924

Source: ALWA-ADIAB, ALWA-LiNu, own unweighted calculations. Notes: log monthly earnings as dependent variable, cluster-robust standard errors in parentheses. ***/ **/ * indicates significance at the 1/ 5/ 10% level, d denotes dummy variable. Reference: no professional degree. Variables not shown: occupation, industry, firm size and year dummies.


includes regressors interacting the skills measures with other important control variables. The single coefficient of numeracy strongly increases to 18%, whereas that of literacy is almost reduced to zero. Only the non-interacted coefficient of numeracy remains significantly larger than zero, though with a strongly reduced level of statistical significance. The results on the main effects of the level of qualification, sex, an immigrant background, a job in East Germany and tenure remain fairly stable compared to the results given in Table 4.2, column 1.

Table 4.4.: Augmented Mincer-type earnings equation, random effects GLS estimates including interaction terms (2007-2010)

Literacy (z-score)                      0.006    (0.075)
Numeracy (z-score)                      0.178*   (0.107)
Vocational degree (d)                   0.074    (0.077)
Academic degree (d)                     0.386*** (0.077)
IA vocational degree x literacy         0.055    (0.085)
IA academic degree x literacy           0.021    (0.076)
IA vocational degree x numeracy        −0.137    (0.130)
IA academic degree x numeracy          −0.091    (0.107)
Male (d)                                0.266*** (0.028)
IA male x literacy                     −0.021    (0.024)
IA male x numeracy                     −0.004    (0.024)
Immigrant background (d)                0.029    (0.029)
IA immigrant background x literacy     −0.050*   (0.029)
IA immigrant background x numeracy     −0.004    (0.035)
Job in East Germany (d)                −0.090*** (0.028)
IA job in East Germany x literacy       0.026    (0.031)
IA job in East Germany x numeracy      −0.056**  (0.027)
Tenure in years                         0.009*** (0.004)
IA tenure x literacy                    0.002    (0.001)
IA tenure x numeracy                   −0.001    (0.003)
Constant                                6.662*** (0.209)

R2 overall                              0.487
R2 between                              0.507
R2 within                               0.095
Observations                            5,924

Source: ALWA-ADIAB, ALWA-LiNu, own unweighted calculations. Notes: log monthly earnings as dependent variable, cluster-robust standard errors in parentheses. ***/ **/ * indicates significance at the 1/ 5/ 10% level, d denotes dummy variable. Reference: no professional degree. Variables not shown: age (squared), employment exp. (squared), native language, tenure squared, occupation, industry, firm size and year dummies.
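In a specification with interaction terms, the payoff to a skill for a particular worker is the main effect plus every interaction that applies to that worker. The sketch below adds up the numeracy point estimates from Table 4.4 for a few profiles. It is an arithmetic illustration only: the function name and the profiles are hypothetical, standard errors are ignored, and since most of the interaction terms are statistically insignificant, the differences between profiles should not be read as findings.

```python
# Point estimates for numeracy copied from Table 4.4.
NUMERACY = {
    "main": 0.178,
    "vocational": -0.137,
    "academic": -0.091,
    "male": -0.004,
    "immigrant": -0.004,
    "east": -0.056,
    "tenure_per_year": -0.001,
}


def numeracy_payoff(degree="none", male=False, immigrant=False,
                    east=False, tenure_years=0.0):
    """Implied coefficient on the numeracy z-score for one worker profile."""
    if degree not in ("none", "vocational", "academic"):
        raise ValueError(f"unknown degree level: {degree}")
    payoff = NUMERACY["main"]
    if degree != "none":
        payoff += NUMERACY[degree]
    if male:
        payoff += NUMERACY["male"]
    if immigrant:
        payoff += NUMERACY["immigrant"]
    if east:
        payoff += NUMERACY["east"]
    payoff += NUMERACY["tenure_per_year"] * tenure_years
    return payoff


reference = numeracy_payoff()                      # no degree, female, native, West, zero tenure
vocational_west = numeracy_payoff(degree="vocational")
vocational_east = numeracy_payoff(degree="vocational", east=True)
```

The main effect of 0.178 therefore describes only the reference group; for the much larger group of vocationally trained workers the implied point estimate is far smaller, which is one reason the main effect cannot be compared directly with the 0.058 in Table 4.2.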

Although the joint significance of the four interaction terms of the basic skills measures with the dummies on levels of qualification is rejected by a Wald test ($\chi^2(4)$: 3.06, p: 0.547), we should pay attention to the economic significance of the


results. People in the reference category, i.e. those without any formal professional degree, do show payoffs to basic skills, as indicated by the jointly significant main effects of the basic skills, and their payoffs are not significantly different from those of the better qualified. In a sense this result conveys a positive message, as low-qualified workers experience the same payoffs to basic skills as do better qualified workers.

Judging by the significantly negative coefficient of the regressor interacting a person’s immigrant background with literacy, migrants seem to gain less from literacy than German natives. Furthermore, a Wald test shows that both migration-related interaction terms are jointly significant at the 10% level ($\chi^2(2)$: 4.76, p: 0.092). These are unexpected findings, especially as we also control for whether the native language of a given person is German, so a lack of language skills should not be the reason behind our findings. The result is also not confounded with any potential selection of migrants into certain jobs, as we also control for the occupational segment and the industry of the firm. We attribute the finding to unobserved individual or job characteristics, such as discrimination against migrants by their employers, and to the selection of respondents into the test sample.

The interaction between the dummy variable indicating a job in East Germany with numeracy yields a significantly negative coefficient, indicating a smaller reward for this basic skill than in jobs in West Germany. This finding cannot be due to differences in the occupational or industry structures between East and West Germany, as we do control for these characteristics. One possible explanation would be persistent differences in collective bargaining coverage between East and West Germany (Addison et al., 2011). Unfortunately, our data do not include information on the wage setting regime of a given firm. Neither do they include information on overtime or shiftwork payments. Differences in paid working hours at a given earnings level between East and West Germany may thus partially explain this finding. Apart from the results regarding the immigrant background and East Germany, the lack of heterogeneity in rewards for different socio-economic groups corroborates Hypothesis 4.

We also included an interaction of the continuous variable tenure with both basic skill domains to infer whether actual skills are rewarded more strongly the longer someone is employed at a given firm. However, we do not find any evidence in support of employer learning (Hypothesis 2) in our data. The related interaction terms are not significant, neither taken by themselves nor tested jointly ($\chi^2(2)$: 1.62, p: 0.445).

Although there are only few significant individual coefficients, both literacy with its related interaction terms and numeracy with its interaction terms are jointly significant ($\chi^2(7)$: 18.94, p: 0.008 and $\chi^2(7)$: 32.21, p: 0.000, respectively). This is also true if testing both basic skills coefficients with all their interaction terms


combined in one Wald test ($\chi^2(14)$: 66.00, p: 0.000). This tells us that there is a robust positive relationship between basic skills and earnings, even in a specification as highly differentiated as in Table 4.4.
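The percentages quoted in this section read the coefficients of the log earnings equation directly as approximate percentage changes. As a reading aid, the exact change implied by a coefficient $b$ on a log outcome is $100\,(e^{b} - 1)$ per cent. The sketch below applies this standard conversion to three coefficients from Table 4.2, column 1; the function name is hypothetical and the conversion is not part of the original analysis.

```python
import math


def pct_effect(coef):
    """Exact percentage change in earnings implied by a log-earnings coefficient."""
    return 100.0 * (math.exp(coef) - 1.0)


# Coefficients from Table 4.2, column 1.
for label, coef in [("literacy", 0.031), ("numeracy", 0.058), ("academic degree", 0.407)]:
    print(f"{label}: coefficient {coef:.3f}, exact effect {pct_effect(coef):.1f}%")
```

For coefficients as small as those on the skill scores the two readings coincide at the rounding used in the text; for a coefficient as large as the one on the academic degree dummy the exact figure lies noticeably above the coefficient itself.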

4.5.3. Sensitivity analyses

To infer whether our findings are a result of our specific estimation sample, of the model specification we have chosen or of general assumptions we have made, we ran a number of sensitivity analyses. The most important results are discussed in this section. Estimation results not shown here are available from the authors upon request.

We argue that it is plausible to assume that basic skills are constant at least over the short run. To test whether our four-year observation period is short enough to fulfil that assumption, we ran additional regressions based on a sample restricted to the years 2007 and 2008. If the results should differ from those of the main model, we might have to call our assumption into question. The results, which are shown in Table D.2, do not differ substantially from those shown in Table 4.2, column 1.

A further test examines whether the loss of individuals from the whole survey population to the intersection of the tested and successfully linked respondents shown in Table D.1 reduced the generalisability of our results. To do so, we re-estimated the standard Mincer earnings equation without basic skills scores based on all respondents with linked administrative data, regardless of whether they have been tested. The results (see Table D.3) do not differ substantially from results based on the smaller sample shown in Table 4.2, column 2.

Despite the highly reliable income information given in the administrative data, there is one remaining shortcoming: earnings are only measured accurately between the lower and the upper earnings limits for social security contributions, which vary over subsequent years. A few spells show values of gross monthly earnings that are outside that range for their given year. We therefore estimated our main model without those 47 cases (0.8% of the original sample). Again, our previous results remain robust. Due to this result, and because of the very low share of spells affected by these two thresholds, we decided against applying multiple imputation (see e.g., Büttner and Rässler, 2008; Rässler et al., 2008) for spells with earnings above the upper limits for social security contributions.

Finally, we tested whether our results are robust to different specifications of a worker’s formal educational level, as it is one of the central characteristics of a Mincer earnings equation. We therefore ran analyses including “years of education” instead of levels of professional education. We chose the latter for our main specification as they are more appropriate in the German context with its distinctive dual apprenticeship


training system. The results of the alternative specification corroborate our main findings.

4.6. Conclusions

We add to the literature by analysing whether basic skills are rewarded on the German labour market in terms of gross earnings differentials by adults’ literacy and numeracy skills. Evidence on this is relevant from both the economist’s and the policy maker’s perspective. For economists, knowing whether there are rewards for basic skills beyond formal schooling enhances our understanding of returns to education. For policy makers, learning about the relationship between labour market outcomes and different types of basic skills may help to improve the design of, for example, “lifelong learning” programmes.

Our analysis is based on a unique data set, a combination of the ALWA-ADIAB and ALWA-LiNu data, which enables us to measure both domains of basic skills and to combine them with comprehensive information on educational histories and reliable and accurate register earnings data for Germany. We therefore provide the first evidence for Germany regarding this question. Conditional on formal education, socio-economic controls as well as job and firm characteristics, our results show significant payoffs to both skills. In particular, a one standard deviation increase in numeracy is, on average, associated with a 6% increase in gross earnings of full-time employed workers, whereas there is an earnings differential of 3% for a standard deviation increase in literacy. Compared to prior research for other countries, these premia are lower. This may, however, be caused by differences in wage setting schemes between Germany and these countries, or differences in the sample structures. Moreover, to put this into perspective, the payoff to numeracy is larger than the return to the first year of a worker’s job experience, which, according to our main specification, amounts to about 4%.

Findings from extensions of our main specification show a linear relationship between earnings and both numeracy and literacy. Beyond that, there is only little evidence of heterogeneity for different groups. There is, in particular, no significant differential in the returns to basic skills by gender, and only small gradients by immigrant background or by region of employment (East-West). Given that our theoretical considerations have not given us any reason to expect such heterogeneity, this result is not surprising. The contrast to other studies, which do find heterogeneity in payoffs to basic skills between different groups in the labour market, may be because of differences in the sample composition or because of the fact that our linked data set allows us to control for otherwise unobservable individual and job characteristics.


We may thus have been able to avoid omitted variable bias more comprehensively than some of the previous studies.

There may be no significant differences between literacy and numeracy, no non-linearity and no measurable heterogeneity across different groups of workers regarding the relationship between earnings and basic skills. But there may well be differences in the cost of, or the effort necessary for, an increase in skill endowments between the skill domains. These costs may also depend on the initial place in the skill distribution of a person, or on any other characteristic considered in this analysis for that matter. To infer whether efforts to increase basic skills of a given person are a worthwhile investment in her human capital, one would have to compare the potential payoffs of these efforts with their costs. However, there is no empirical evidence on how much effort it takes to raise a person’s literacy or numeracy level by, say, one standard deviation. We would thus argue that empirical evidence on the cost of skill improvements, differentiated by characteristics of a potential learner, is necessary for conclusions regarding the cost-effectiveness of investments in basic skills. Policy advice regarding a targeting or tailoring of learning programmes for specific groups in the labour market cannot be given on the basis of these results.

Although a generalisation of our results is not possible in a straightforward manner, and policy implications thus have to be handled carefully, our analysis is a useful starting point for further discussion on the rewards of adults’ basic skills in the labour market.


5. General summary

After introducing the different data sources used in this dissertation in Chapter 2, I evaluate the linkage of the ALWA survey data with the administrative data of the IAB. I examine both consent bias and selectivity in linkage success based on survey and paradata. My results are informative for potential data users, for survey practice as well as for practitioners of record linkage. My main findings are the following: contrary to previous results, linkage consent in ALWA is not significantly related to the respondents’ reported educational or income levels. Consent bias is mainly driven by the labour market status of the respondent. The highest consent rates are achieved by older, female and more experienced interviewers. Probabilistic record linkage substantially increases the number of observations without introducing additional selectivity to the linked sample compared to the result of deterministic matching. Manual matching further increases the number of observations at the cost of more pronounced selectivity of the resulting sample. Selectivity of the successfully linked data is mainly driven by the age, the immigrant background and the employment status of the respondents.

In Chapter 3, I examine whether people from low-qualified family backgrounds make up for any inherited lack of formal education by means of non-formal training. Hypotheses based on economic theory and findings from various other disciplines suggest otherwise. I use the ALWA survey data to estimate the influence of family background on non-formal training participation. Count data analyses show that a low-qualified family background is negatively related to both the likelihood and the frequency of on-the-job training. This result holds when controlling for education, ability and personality as well as job and firm characteristics.

Chapter 4, which is mainly based on a collaborative paper with one coauthor, asks whether there is a reward for basic skills in the German labour market. To answer this question, we examine the relationship between literacy, numeracy and monthly gross earnings of full-time employed workers. We use data from the ALWA survey, augmented by test scores on basic cognitive skills as well as administrative earnings data. Our results indicate that earnings are positively related to both types of skills. Furthermore, there is no evidence for non-linearity in this relationship and only little heterogeneity when differentiating by sub-groups.



White, Halbert (1980). ‘A heteroskedasticity-consistent covariance matrix estimatorand a direct test for heteroskedasticity’. In: Econometrica 48(4), pp. 817–838.

109

Page 110: Essays on the measurement and analysis of educational and skill … · 2013-09-03 · Leerzeile Essays on the measurement and analysis of educational and skill inequalities Inaugural-Dissertation

Bibliography

Winkler, William E. (2009). ‘Record Linkage’. In: Sample Surveys: Design, Methodsand Applications. Ed. by C. R. Rao. Vol. 29A. Handbook of Statistics. Amsterdamet al.: Elsevier, pp. 351–380.

Wölfel, Oliver and Corinna Kleinert (2012). Working and Learning in a ChangingWorld. Part VII: Description of the ALWA literacy and numeracy data (ALWA-LiNu). FDZ Datenreport 05/2012 (en).

Zax, Jeffrey S. and Daniel I. Rees (2002). ‘IQ, Academic Performance, Environment,and Earnings’. In: Review of Economics and Statistics 84(4), pp. 600–616.

Zellner, Arnold (1962). ‘An Efficient Method of Estimating Seemingly UnrelatedRegressions and Tests for Aggregation Bias’. In: Journal of the American StatisticalAssociation 57(298), pp. 348–368.

110

Page 111: Essays on the measurement and analysis of educational and skill … · 2013-09-03 · Leerzeile Essays on the measurement and analysis of educational and skill inequalities Inaugural-Dissertation

A. List of abbreviations

AFQT        Armed Forces Qualification Test
ALL         Adult Literacy and Life Skills Survey
ALLS        Adult Literacy and Life Skills Survey (Australian survey)
ALWA        Survey Working and Learning in a Changing World
ALWA-ADIAB  ALWA survey data linked to administrative data of the IAB
ALWA-LiNu   ALWA Literacy and Numeracy Data
ASVAB       Armed Services Vocational Aptitude Battery
BCS         British Cohort Study
BHP         Establishment History Panel
CATI        Computer assisted telephone interview
ETS         Educational Testing Service
FDZ         Research Data Centre
GDR         German Democratic Republic
GLS         Generalised least squares
IAB         Institute for Employment Research
IALS        International Adult Literacy Survey
IEB         Integrated Employment Biographies
ML          Maximum likelihood
MTB         Merge ToolBox
NALS        National Adult Literacy Survey
NEPS        National Educational Panel Study
NLSY        National Longitudinal Survey of Youth
OECD        Organisation for Economic Co-operation and Development
PASS        Panel Study Labour Market and Social Security
PIAAC       Programme for the International Assessment of Adult Competencies
PISA        Programme for International Student Assessment
SGB         Social Code
SOEP        Socio-Economic Panel

B. Appendix to Chapter 2

Table B.1.: Sample statistics of independent variables

                                          mean    s.d.      min   max
Male (d)                                  0.50    (0.50)    0     1
Aged 18-24 (d)                            0.19    (0.39)    0     1
Aged 25-34 (d)                            0.17    (0.38)    0     1
Aged 35-44 (d)                            0.35    (0.48)    0     1
Aged 45-52 (d)                            0.30    (0.46)    0     1
Foreign nationality (d)                   0.02    (0.15)    0     1
Native language German (d)                0.97    (0.18)    0     1
Born in East Germany (d)                  0.19    (0.39)    0     1
Partner in household (d)                  0.63    (0.48)    0     1
Children in household (d)                 0.51    (0.50)    0     1
No training (d)                           0.18    (0.39)    0     1
Training + lower secondary (d)            0.12    (0.33)    0     1
Training + intermediate (d)               0.28    (0.45)    0     1
Training + upper secondary (d)            0.11    (0.32)    0     1
Master craftsman (d)                      0.06    (0.25)    0     1
Higher Education (d)                      0.23    (0.42)    0     1
Prose literacy score                      0.00    (1.00)    -4.5  2.1
Document literacy score                   0.00    (1.00)    -4    2.7
Numeracy score                            0.00    (1.00)    -3.6  2.4
High-cultural activity                    -0.00   (1.00)    -2.2  3.7
Self employed (d)                         0.09    (0.29)    0     1
Freelancer (d)                            0.04    (0.19)    0     1
In dependent employment (d)               0.59    (0.49)    0     1
Civil servant (d)                         0.04    (0.20)    0     1
Unemployed (d)                            0.05    (0.22)    0     1
In formal education (d)                   0.11    (0.31)    0     1
Other activity (d)                        0.08    (0.27)    0     1
Personal net income <500EUR (d)           0.22    (0.41)    0     1
500-999EUR (d)                            0.17    (0.38)    0     1
1000-1499EUR (d)                          0.17    (0.38)    0     1
1500-1999EUR (d)                          0.14    (0.35)    0     1
2000-2999EUR (d)                          0.16    (0.36)    0     1
More than 3000EUR (d)                     0.11    (0.31)    0     1
Income refused (d)                        0.02    (0.15)    0     1
Consent to follow-up survey (d)           0.94    (0.25)    0     1
Consent to cognitive tests (d)            0.57    (0.50)    0     1
Share of refused answers                  0.05    (0.25)    0     8
Share of ’don’t know’                     0.62    (1.23)    0     20
No part of name unique (d)                0.38    (0.49)    0     1
First name unique (d)                     0.05    (0.22)    0     1
Last name unique (d)                      0.47    (0.50)    0     1
Both parts of name unique (d)             0.10    (0.30)    0     1
Interview on weekend (d)                  0.20    (0.40)    0     1
Duration before consent quest. (min.)     25.51   (13.59)   1.1   119
Disturbance during int. (d)               0.06    (0.24)    0     1
Comprehension problems during int. (d)    0.05    (0.22)    0     1
Other problems during int. (d)            0.08    (0.28)    0     1
Int: male (d)                             0.56    (0.50)    0     1
Int: aged up to 24 (d)                    0.12    (0.33)    0     1
Int: aged 25-34 (d)                       0.17    (0.38)    0     1
Int: aged 35-44 (d)                       0.20    (0.40)    0     1
Int: aged 45-54 (d)                       0.36    (0.48)    0     1
Int: aged 55 and more (d)                 0.14    (0.35)    0     1
Int: no training (d)                      0.17    (0.38)    0     1
Int: training, below upp. secondary (d)   0.16    (0.37)    0     1
Int: training, upper secondary (d)        0.16    (0.37)    0     1
Int: higher education (d)                 0.33    (0.47)    0     1
Int: education unknown (d)                0.17    (0.38)    0     1
Experience as interviewer (years)         1.82    (0.99)    1     4
No. of previous interviews 0-25 (d)       0.35    (0.48)    0     1
No. of previous interviews 26-50 (d)      0.16    (0.37)    0     1
No. of previous interviews 51-100 (d)     0.22    (0.41)    0     1
No. of previous interviews >100 (d)       0.26    (0.44)    0     1
Int: consent rate in prior interviews     0.90    (0.15)    0     1

Source: ALWA, own unweighted calculations based on 9,790 observations. Note: d denotes dummy variable.

Table B.2.: Consent and linkage success rate by subgroups, expressed as percentages of all German language interview participants

                              consenters  deterministic  determ.+       all
                                          matches        probabilistic  matches
                                                         matches
Total                         92.1        48.9           74.1           79.6

Aged 18-24                    93.9        56.3           78.9           84.5
25-34                         91.6        53.4           76.6           83.1
35-44                         92.0        46.3           73.0           78.6
45-52                         91.5        44.6           70.9           75.7
                              (0.014)     (0.000)        (0.000)        (0.000)
Female                        92.1        48.3           74.2           80.2
Male                          92.2        49.5           74.0           79.0
                              (0.879)     (0.245)        (0.836)        (0.144)
German nationality            92.1        48.9           74.0           79.4
Foreign nationality           90.9        50.2           79.5           86.8
                              (0.489)     (0.695)        (0.067)        (0.008)
Native language not German    92.1        60.5           79.0           86.3
Native language German        92.1        48.5           73.9           79.4
                              (0.991)     (0.000)        (0.038)        (0.002)
Born in West Germany          91.7        48.0           73.3           78.7
Born in East Germany          93.8        53.0           77.4           83.6
                              (0.003)     (0.000)        (0.000)        (0.000)
No training                   93.3        54.0           76.2           81.2
Training + lower secondary    92.0        54.2           79.4           85.0
Training + intermediate       91.8        52.9           77.2           82.8
Training + upper secondary    93.1        49.8           77.4           82.8
Master craftsman              92.5        44.4           67.7           73.1
Higher Education              91.0        38.1           66.0           71.9
                              (0.106)     (0.000)        (0.000)        (0.000)
Self employed                 89.2        37.9           57.8           64.9
Freelancer                    93.8        49.7           76.9           82.0
In dependent employment       92.1        51.9           80.4           85.6
Civil servant                 93.3        12.2           20.2           23.9
Unemployed                    89.3        61.1           77.7           84.2
In formal education           94.7        53.6           75.6           80.5
Other activity                92.3        43.9           68.4           76.0
                              (0.000)     (0.000)        (0.000)        (0.000)
Personal net income <500EUR   93.2        50.9           75.1           80.6
500-999EUR                    92.7        55.1           78.9           85.9
1000-1499EUR                  92.6        52.2           77.6           83.7
1500-1999EUR                  92.8        53.1           77.5           82.5
2000-2999EUR                  92.9        42.9           70.9           75.5
More than 3000EUR             92.5        37.2           64.0           68.8
Income refused                63.6        30.1           51.7           55.1
                              (0.000)     (0.000)        (0.000)        (0.000)

Observations                  9,790       9,790          9,790          9,790

Source: ALWA, own unweighted calculations. Notes: p-values of Pearson χ²-test in parentheses. Percentages in columns related to linkage success (col. 2-4) are based on all German language interview participants.
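
The p-values in parentheses stem from Pearson χ² tests of independence between subgroup membership and the respective outcome. As a rough illustration of how such a test works for a comparison of two groups, the following Python sketch computes the statistic and its p-value for a 2×2 table with the standard library only. The function name and the counts are hypothetical and are not taken from ALWA, and comparisons of more than two groups require more degrees of freedom than this sketch handles.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 d.f.) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected cell counts under independence of rows and columns
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 d.f.: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (not ALWA data): consenters and non-consenters in two groups
chi2, p = pearson_chi2_2x2(460, 40, 450, 50)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

With these made-up counts the statistic is about 1.22 and the p-value about 0.27, so independence would not be rejected at conventional levels.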

Table B.3.: Determinants of consent to record linkage, probit regression models with and without respondent-interviewer interactions, respectively, weighting variables included

                                          without interactions   with interactions
                                          (1a)                   (2a)
Male (d)                                  -0.042      (0.064)    -0.033      (0.067)
Aged 25-34 (d)                            -0.166      (0.107)    -0.181*     (0.109)
Aged 35-44 (d)                            -0.049      (0.119)    -0.122      (0.127)
Aged 45-52 (d)                            -0.158      (0.123)    -0.258*     (0.140)
Foreign nationality (d)                   -0.280*     (0.146)    -0.283*     (0.147)
Native language German (d)                -0.120      (0.134)    -0.125      (0.134)
Born in East Germany (d)                   0.103*     (0.063)     0.102      (0.063)
Partner in household (d)                   0.126**    (0.063)     0.130**    (0.062)
Children in household (d)                 -0.089      (0.065)    -0.089      (0.065)
Training + lower secondary (d)             0.055      (0.104)     0.068      (0.106)
Training + intermediate (d)               -0.025      (0.088)    -0.021      (0.089)
Training + upper secondary (d)             0.055      (0.110)     0.033      (0.111)
Master craftsman (d)                       0.025      (0.168)     0.019      (0.166)
Higher Education (d)                      -0.041      (0.103)    -0.057      (0.104)
Prose literacy score                      -0.042      (0.029)    -0.042      (0.029)
Document literacy score                   -0.027      (0.024)    -0.026      (0.023)
Numeracy score                            -0.033      (0.027)    -0.033      (0.027)
High-cultural activity                    -0.058**    (0.022)    -0.061***   (0.022)
Self employed (d)                          0.115      (0.114)     0.119      (0.114)
Freelancer (d)                             0.430***   (0.136)     0.427***   (0.138)
In dependent employment (d)                0.317***   (0.095)     0.318***   (0.096)
Civil servant (d)                          0.385**    (0.195)     0.389**    (0.196)
In formal education (d)                    0.436***   (0.120)     0.440***   (0.121)
Other activity (d)                         0.177      (0.129)     0.184      (0.130)
Personal net income <500EUR (d)           -0.022      (0.100)    -0.025      (0.100)
500-999EUR (d)                            -0.083      (0.091)    -0.087      (0.091)
1000-1499EUR (d)                          -0.024      (0.074)    -0.021      (0.073)
2000-2999EUR (d)                           0.067      (0.087)     0.065      (0.087)
More than 3000EUR (d)                      0.084      (0.085)     0.081      (0.087)
Income refused (d)                        -0.666***   (0.128)    -0.668***   (0.128)
Consent to follow-up survey (d)            0.606***   (0.085)     0.611***   (0.085)
Consent to cognitive tests (d)             0.402***   (0.053)     0.402***   (0.052)
Share of refused answers                  -0.315***   (0.086)    -0.314***   (0.084)
Share of ’don’t know’                     -0.040**    (0.016)    -0.040**    (0.016)
Int: male (d)                             -0.131*     (0.073)    -0.131*     (0.073)
Int: aged 25-34 (d)                        0.126      (0.120)     0.131      (0.115)
Int: aged 35-44 (d)                        0.157      (0.128)     0.232      (0.147)
Int: aged 45-54 (d)                        0.241**    (0.105)     0.350**    (0.144)
Int: aged 55 and more (d)                  0.170      (0.129)     0.332*     (0.177)
Int: training, below upp. secondary (d)    0.093      (0.126)     0.118      (0.133)
Int: training, upper secondary (d)         0.072      (0.127)     0.074      (0.129)
Int: higher education (d)                 -0.113      (0.106)    -0.104      (0.107)
Int: education unknown (d)                 0.002      (0.130)     0.005      (0.138)
Experience as interviewer (years)          0.063**    (0.031)     0.064**    (0.031)
No. of previous interviews 26-50 (d)      -0.191**    (0.078)    -0.192**    (0.079)
No. of previous interviews 51-100 (d)     -0.174**    (0.074)    -0.177**    (0.073)
No. of previous interviews >100 (d)       -0.171**    (0.076)    -0.169**    (0.076)
Int: consent rate in prior interviews      0.162      (0.153)     0.155      (0.153)
Int: different sex than respondent (d)                            0.045      (0.047)
Int. at least 10 years younger (d)                                0.049      (0.091)
Int. at least 10 years older (d)                                 -0.151*     (0.092)
Same schooling level (d)                                          0.114      (0.135)
Higher schooling than respondent (d)                              0.049      (0.145)
Unknown relation of schooling levels (d)                          0.075      (0.205)
Interview on weekend (d)                   0.039      (0.064)     0.039      (0.063)
Duration before consent quest. (min.)      0.000      (0.002)     0.000      (0.002)
Disturbance during int. (d)                0.009      (0.089)     0.012      (0.089)
Comprehension problems during int. (d)     0.086      (0.087)     0.088      (0.087)
Other problems during int. (d)            -0.176**    (0.078)    -0.181**    (0.078)
Constant                                   0.467*     (0.268)     0.397      (0.316)

Wald-statistic (χ²) [p-value]              1,225 [0.000]          1,379 [0.000]
AIC                                        4,736                  4,740
Pseudo R²                                  0.110                  0.111
Observations                               9,790                  9,790

Source: ALWA, own weighted calculations. Notes: Robust standard errors in parentheses based on 210 interviewers as clusters. ***/**/* indicates significance at the 1/5/10% level. Reference categories in both specifications: respondent aged 18-24, no training, unemployed, net household income of 1500-1999 EUR, interviewer aged up to 24, no training, 0-25 previous ALWA interviews. Additional reference categories in interacted specification: interviewer aged the same (+/-10 years) and same schooling level as respondent. d denotes dummy variable.
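
Probit coefficients are measured on the scale of the latent index, not in probability units. The following Python sketch illustrates one way to translate a coefficient into a change in the predicted probability. It rests on loudly stated assumptions: the baseline probability is simply set to the unweighted overall consent rate of 92.1% from Table B.2, and the function name is hypothetical. It is an illustration only, not the marginal-effect calculation used in the dissertation.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, s.d. 1)


def probit_probability_change(baseline_prob, coefficient):
    """Change in predicted probability when a dummy with the given probit
    coefficient switches from 0 to 1, starting from baseline_prob."""
    baseline_index = STD_NORMAL.inv_cdf(baseline_prob)
    return STD_NORMAL.cdf(baseline_index + coefficient) - baseline_prob


# Assumption: baseline equals the overall consent rate of 92.1% (Table B.2).
# Coefficient: 'Income refused (d)' in model (1a) of Table B.3, i.e. -0.666.
change = probit_probability_change(0.921, -0.666)
print(f"{change:+.3f}")
```

Under these assumptions, refusing the income question goes along with a predicted consent probability that is roughly 15 percentage points lower. A different baseline would give a different number, because the normal distribution function is non-linear.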

Table B.4.: Determinants of consent to record linkage, separate probit regression models with respondent-interviewer interactions for female and male respondents, respectively

                                          female respondents     male respondents
Aged 25-34 (d)                            -0.298**    (0.128)    -0.118      (0.129)
Aged 35-44 (d)                            -0.406***   (0.150)     0.001      (0.145)
Aged 45-52 (d)                            -0.553***   (0.147)     0.038      (0.166)
Foreign nationality (d)                   -0.220      (0.189)    -0.111      (0.167)
Native language German (d)                -0.445**    (0.181)     0.004      (0.164)
Born in East Germany (d)                   0.185**    (0.088)     0.157*     (0.084)
Partner in household (d)                   0.197***   (0.064)    -0.102      (0.076)
Children in household (d)                 -0.138*     (0.072)     0.124      (0.090)
Training + lower secondary (d)            -0.037      (0.132)    -0.009      (0.118)
Training + intermediate (d)                0.013      (0.125)    -0.087      (0.105)
Training + upper secondary (d)             0.209*     (0.126)    -0.014      (0.146)
Master craftsman (d)                       0.329      (0.290)    -0.064      (0.139)
Higher Education (d)                       0.061      (0.125)    -0.050      (0.110)
Prose literacy score                      -0.024      (0.030)    -0.034      (0.032)
Document literacy score                   -0.028      (0.029)    -0.030      (0.028)
Numeracy score                            -0.050*     (0.030)    -0.014      (0.028)
High-cultural activity                    -0.070***   (0.027)    -0.071**    (0.033)
Self employed (d)                          0.197      (0.157)     0.072      (0.147)
Freelancer (d)                             0.386*     (0.198)     0.380**    (0.171)
In dependent employment (d)                0.307***   (0.114)     0.223*     (0.135)
Civil servant (d)                          0.220      (0.169)     0.651***   (0.219)
In formal education (d)                    0.166      (0.159)     0.511***   (0.150)
Other activity (d)                         0.391***   (0.136)     0.062      (0.158)
Personal net income <500EUR (d)            0.065      (0.101)    -0.153      (0.146)
500-999EUR (d)                            -0.039      (0.101)    -0.160      (0.136)
1000-1499EUR (d)                          -0.028      (0.101)    -0.041      (0.096)
2000-2999EUR (d)                           0.011      (0.134)    -0.035      (0.097)
More than 3000EUR (d)                      0.154      (0.172)     0.004      (0.102)
Income refused (d)                        -0.780***   (0.160)    -0.567***   (0.172)
Consent to follow-up survey (d)            0.653***   (0.086)     0.646***   (0.100)
Consent to cognitive tests (d)             0.494***   (0.063)     0.339***   (0.060)
Share of refused answers                  -0.298***   (0.103)    -0.300***   (0.113)
Share of ’don’t know’                     -0.060***   (0.019)    -0.038*     (0.022)
Int: male (d)                             -0.111*     (0.066)    -0.133*     (0.074)
Int: aged 25-34 (d)                        0.152      (0.107)     0.068      (0.138)
Int: aged 35-44 (d)                        0.272*     (0.146)     0.100      (0.181)
Int: aged 45-54 (d)                        0.370**    (0.154)     0.193      (0.158)
Int: aged 55 and more (d)                  0.621***   (0.178)     0.097      (0.201)
Int. at least 10 years younger (d)         0.029      (0.121)     0.051      (0.104)
Int. at least 10 years older (d)          -0.282***   (0.095)    -0.030      (0.099)
Int: training, below upp. secondary (d)    0.063      (0.143)     0.089      (0.163)
Int: training, upper secondary (d)         0.077      (0.102)     0.039      (0.153)
Int: higher education (d)                 -0.098      (0.094)    -0.119      (0.120)
Int: education unknown (d)                -0.037      (0.107)    -0.060      (0.173)
Same schooling level (d)                   0.144      (0.184)     0.026      (0.113)
Higher schooling than respondent (d)       0.190      (0.193)     0.058      (0.151)
Unknown relation of schooling levels (d)  -0.019      (0.220)     0.194      (0.232)
Experience as interviewer (years)          0.085***   (0.033)     0.065*     (0.038)
No. of previous interviews 26-50 (d)      -0.113      (0.093)    -0.209***   (0.079)
No. of previous interviews 51-100 (d)     -0.198***   (0.074)    -0.100      (0.087)
No. of previous interviews >100 (d)       -0.133*     (0.072)    -0.174*     (0.093)
Int: consent rate in prior interviews      0.130      (0.161)     0.176      (0.186)
Interview on weekend (d)                   0.106      (0.092)     0.008      (0.063)
Duration before consent quest. (min.)     -0.001      (0.003)    -0.004      (0.002)
Disturbance during int. (d)                0.012      (0.114)    -0.031      (0.132)
Comprehension problems during int. (d)    -0.013      (0.132)     0.006      (0.138)
Other problems during int. (d)            -0.262***   (0.097)    -0.130      (0.096)
Constant                                   0.721**    (0.352)     0.498      (0.390)

Wald-statistic (χ²) [p-value]              811 [0.000]            911 [0.000]
Pseudo R²                                  0.150                  0.104
Observations                               4,920                  4,870

Source: ALWA, own unweighted calculations. Notes: Robust standard errors in parentheses based on 197 and 203 interviewers as clusters, respectively. ***/**/* indicates significance at the 1/5/10% level. Reference categories: respondent aged 18-24, no training, unemployed, net household income of 1500-1999 EUR, interviewer aged up to 24, no training, aged the same (+/-10 years) and same schooling level as respondent, 0-25 previous ALWA interviews. d denotes dummy variable.

C. Appendix to Chapter 3

Table C.1.: Sample statistics of independent variables

                                            mean    s.d.      min   max
Age (in years)                              28.00   (7.35)    18    51
Male (d)                                    0.50    (0.50)    0     1
Immigrant background (d)                    0.16    (0.37)    0     1
Prose literacy (score)                      0.04    (1.00)    -3.8  2.1
Document literacy (score)                   0.03    (0.99)    -4    2.3
Numeracy (score)                            0.00    (0.98)    -3.2  2.4
High-cultural activity (score)              0.08    (0.96)    -2    3.7
Importance of work (score)                  -0.05   (1.00)    -4.7  2.2
Importance of occupation (score)            0.12    (0.86)    -4.5  1.2
Importance of friends (score)               -0.08   (1.01)    -5.2  2.4
Importance of family (score)                0.05    (0.97)    -5.9  1.4
External locus of control (score)           -0.03   (0.98)    -4.4  2.4
Internal locus of control (score)           -0.00   (1.01)    -7.6  1.5
Employment related self-confidence (score)  0.04    (0.98)    -5.5  1.7
No schooling (d)                            0.01    (0.08)    0     1
Lower secondary schooling (d)               0.21    (0.40)    0     1
Intermediate schooling (d)                  0.40    (0.49)    0     1
Upper secondary schooling (d)               0.39    (0.49)    0     1
No training (d)                             0.05    (0.21)    0     1
Apprenticeship (d)                          0.66    (0.47)    0     1
Master craftsman/technician (d)             0.05    (0.22)    0     1
Higher education (d)                        0.24    (0.43)    0     1
Employment experience (in years)            4.85    (5.88)    0     33
P: no/unknown schooling degree (d)          0.03    (0.16)    0     1
P: lower secondary schooling (d)            0.64    (0.48)    0     1
P: intermediate schooling (d)               0.20    (0.40)    0     1
P: upper secondary schooling (d)            0.13    (0.34)    0     1
P: no/unknown vocational degree (d)         0.19    (0.39)    0     1
P: apprenticeship (d)                       0.62    (0.49)    0     1
P: master craftsman/technician (d)          0.08    (0.28)    0     1
P: higher education (d)                     0.11    (0.32)    0     1
P: not employed (d)                         0.15    (0.36)    0     1
P: employed (d)                             0.75    (0.44)    0     1
P: self-employed (d)                        0.10    (0.30)    0     1
Firm in East Germany (d)                    0.13    (0.33)    0     1
Regional unemployment rate                  9.52    (3.58)    2     23
Public service (d)                          0.22    (0.41)    0     1
Working .25 full-time (d)                   0.08    (0.28)    0     1
Working .5 full-time (d)                    0.10    (0.30)    0     1
Working .75 full-time (d)                   0.05    (0.23)    0     1
Working full time (d)                       0.77    (0.42)    0     1
No training required (d)                    0.11    (0.31)    0     1
Induction period required (d)               0.11    (0.32)    0     1
Vocational training required (d)            0.49    (0.50)    0     1
Vocational schooling required (d)           0.06    (0.24)    0     1
Master craftsman/technician required (d)    0.04    (0.19)    0     1
Higher education required (d)               0.18    (0.39)    0     1
Firm size: less than 5 (d)                  0.08    (0.27)    0     1
Firm size: 5-9 (d)                          0.13    (0.33)    0     1
Firm size: 10-19 (d)                        0.13    (0.33)    0     1
Firm size: 20-99 (d)                        0.23    (0.42)    0     1
Firm size: 100-199 (d)                      0.11    (0.31)    0     1
Firm size: 200-1,999 (d)                    0.21    (0.41)    0     1
Firm size: 2,000 and more (d)               0.12    (0.32)    0     1

Source: ALWA, own unweighted calculations based on 17,254 observations. d denotes dummy variable.

Table C.2.: Loss of observations due to exclusion restrictions and missing values based on Models 1 to 4 (not mutually exclusive)

                                      No. of excluded spells   Share of total spells
Before exclusion restrictions         30,594                   1.000
Spell from abroad                        983                   0.032
Former East Germany                    2,033                   0.066
Self-employed                          2,034                   0.066
Exclusion restrictions combined        5,046                   0.165

Remaining after restrictions          25,707                   1.000
Participation in on-the-job training   1,056                   0.041
Training frequency per spell           1,150                   0.045
Age class                              1,072                   0.042
Highest schooling degree                  50                   0.002
Highest training degree                  672                   0.026
Schooling of parent with same sex      1,912                   0.074
Training of parent with same sex       1,532                   0.060
Occ. status of parent with same sex      590                   0.023
Prose score                              601                   0.023
Numeracy score                           601                   0.023
Document literacy score                  601                   0.023
Importance of work                       457                   0.018
Importance of occupation                 457                   0.018
Importance of friends                    457                   0.018
Importance of family                     457                   0.018
External locus of control                 93                   0.004
Internal locus of control                 93                   0.004
Employment related self-confidence        85                   0.003
High-cultural activity                    49                   0.002
Public service                           577                   0.022
Working time                             280                   0.011
Training requirement of job              243                   0.009
Firm size                              2,132                   0.083
East Germany                             654                   0.025
Regional unemployment rate               928                   0.036
All missings after restrictions        8,453                   0.329

Remaining after all dropouts          17,254                   0.564

Source: ALWA, own unweighted calculations.
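
The shares in Table C.2 can be retraced with simple arithmetic. The sketch below is only a reading aid with a hypothetical helper name. It assumes, as the reported figures suggest, that shares in the upper panel refer to the 30,594 spells before restrictions, that shares in the lower panel refer to the 25,707 spells remaining after restrictions, and that the final share again refers to the initial total.

```python
def share(count, base):
    """Share of `count` in `base`, rounded to three decimals as in Table C.2."""
    return round(count / base, 3)


TOTAL_SPELLS = 30_594        # before exclusion restrictions
AFTER_RESTRICTIONS = 25_707  # remaining after restrictions, as reported
ALL_MISSINGS = 8_453         # spells dropped because of missing values
FINAL_SAMPLE = 17_254        # remaining after all dropouts

restriction_share = share(5_046, TOTAL_SPELLS)            # exclusion restrictions combined
missing_share = share(ALL_MISSINGS, AFTER_RESTRICTIONS)   # all missings after restrictions
final_share = share(FINAL_SAMPLE, TOTAL_SPELLS)           # remaining after all dropouts
remaining = AFTER_RESTRICTIONS - ALL_MISSINGS             # spells left for estimation

print(restriction_share, missing_share, final_share, remaining)
```

The individual reasons for exclusion are not mutually exclusive, which is why the single rows add up to more than the combined rows.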

D. Appendix to Chapter 4

Table D.1.: Numbers of cases by step of data preparation

                                                  Persons   Spells
ALWA respondents (German interviews)              10,177
Test participants                                  3,980
Respondents with linked administrative data        8,022
Respondents with administrative and skills data    3,263
After selection of cases                           1,818    5,924
2007                                               1,412    1,412
2008                                               1,478    1,478
2009                                               1,515    1,515
2010                                               1,526    1,526

Source: ALWA-ADIAB, ALWA-LiNu, own calculations.

Table D.2.: Augmented Mincer-type earnings equation, random effects GLS estimates, restricted to years 2007/2008

Literacy (z-score)                   0.028**    (0.014)
Numeracy (z-score)                   0.053***   (0.014)
Vocational degree (d)                0.293      (0.180)
Academic degree (d)                  0.547***   (0.166)
Employment experience in years       0.046***   (0.006)
Employment experience squared/100   -0.062***   (0.015)
Male (d)                             0.282***   (0.035)
Age in years                         0.008      (0.012)
Age squared/100                     -0.018      (0.015)
Immigrant background (d)             0.049      (0.033)
Native language German (d)          -0.082      (0.068)
Job in East Germany (d)             -0.104***   (0.030)
Tenure in years                      0.006      (0.004)
Tenure squared/100                  -0.026*     (0.015)
Firm size (no. of employees)         0.000***   (0.000)
Firm size squared/100               -0.000***   (0.000)
Constant                             6.632***   (0.306)

R² overall                           0.473
R² between                           0.479
R² within                            0.229
Observations                         2,887

Source: ALWA-ADIAB, ALWA-LiNu, own unweighted calculations. Notes: log monthly earnings as dependent variable, cluster-robust standard errors in parentheses. ***/**/* indicates significance at the 1/5/10% level, d denotes dummy variable. Reference: no professional degree. Variables not shown: occupation, industry and year dummies.
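
Because the dependent variable is log monthly earnings, each coefficient is a difference in log points. The following Python sketch shows the standard conversion into an approximate percentage difference in earnings. The function name is hypothetical, and the conversion is a reading aid that is not part of the original table.

```python
import math


def log_points_to_percent(coefficient):
    """Percentage difference in earnings implied by a log-earnings coefficient."""
    return 100.0 * (math.exp(coefficient) - 1.0)


# Coefficients taken from Table D.2 (skills measured as z-scores)
numeracy_pct = log_points_to_percent(0.053)
literacy_pct = log_points_to_percent(0.028)
print(f"numeracy: {numeracy_pct:.1f}%, literacy: {literacy_pct:.1f}%")
```

Read this way, numeracy that is one standard deviation higher goes along with earnings that are about 5.4% higher, and the corresponding figure for literacy is about 2.8%, holding the other covariates fixed.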

Table D.3.: Mincer-type earnings equation, random effects GLS estimate based on all linked cases, regardless of test-participation (2007-2010)

Vocational degree (d)                0.059      (0.037)
Academic degree (d)                  0.445***   (0.035)
Employment experience in years       0.033***   (0.003)
Employment experience squared/100   -0.027***   (0.008)
Male (d)                             0.285***   (0.017)
Age in years                         0.021***   (0.006)
Age squared/100                     -0.036***   (0.008)
Immigrant background (d)             0.014      (0.016)
Native language German (d)           0.023      (0.030)
Job in East Germany (d)             -0.136***   (0.017)
Tenure in years                      0.009***   (0.002)
Tenure squared/100                  -0.031***   (0.007)
Firm size (no. of employees)         0.000***   (0.000)
Firm size squared/100               -0.000***   (0.000)
Constant                             6.375***   (0.142)

R² overall                           0.476
R² between                           0.491
R² within                            0.039
Observations                         15,297

Source: ALWA-ADIAB, own unweighted calculations. Notes: log monthly earnings as dependent variable, cluster-robust standard errors in parentheses. ***/**/* indicates significance at the 1/5/10% level, d denotes dummy variable. Reference: no professional degree. Variables not shown: occupation, industry and year dummies.
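
Experience, age and tenure enter the equation with a linear and a squared term, so each implies a hump-shaped profile. As a small illustration, the Python sketch below computes where such a profile peaks, given that the squared term is scaled by 1/100 as in the table. The function name is hypothetical, and the calculation holds all other covariates fixed, which is a simplification because age, experience and tenure move together in practice.

```python
def profile_peak(linear_coef, squared_per_100_coef):
    """Value of x that maximises linear_coef * x + squared_per_100_coef * x**2 / 100."""
    if squared_per_100_coef >= 0:
        raise ValueError("profile has no interior maximum")
    # First-order condition: linear_coef + 2 * squared_per_100_coef * x / 100 = 0
    return -linear_coef * 100.0 / (2.0 * squared_per_100_coef)


# Coefficients taken from Table D.3
peak_experience = profile_peak(0.033, -0.027)
peak_age = profile_peak(0.021, -0.036)
print(f"experience peak: {peak_experience:.1f} years, age peak: {peak_age:.1f} years")
```

The experience profile peaks only at about 61 years, which is beyond any plausible working life, so within the sample the estimated experience profile is increasing but flattening.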

E. Short German summary

In my dissertation I pursue three empirical research questions. To do so I use different combinations of datasets, each of which builds on the survey “Arbeiten und Lernen im Wandel” (ALWA, Working and Learning in a Changing World). The ALWA study was conducted in 2007 and 2008 by the Institute for Employment Research (IAB) of the Federal Employment Agency. A further data source resulted from a face-to-face survey, likewise conducted as part of the ALWA study, in which a subset of the ALWA respondents worked through test booklets that measured their numeracy and literacy skills. The third element of the data basis is the process-generated labour market data of the IAB, which I linked to ALWA by means of record linkage methods.

For the first research question, in Chapter 2, I deal with the methodological aspects of linking ALWA to the process data of the IAB. After a brief general discussion of the legal and ethical aspects of linking such data sources, I describe the concrete procedure used for the ALWA linkage. I then examine empirically what determines the ALWA respondents' consent to the linkage. Regarding respondent characteristics, I can show that consent decreases with the respondent's age and that it is least likely to be given by the unemployed, but also that consent does not depend on the respondent's income.

The analysis of interviewer characteristics shows, first, that female interviewers are more likely than their male colleagues to obtain consent to data linkage. This holds regardless of whether the respondents are female or male. In addition, the success of interviewers increases with their experience in conducting surveys and with their age. The latter holds, however, only if the interviewer is not considerably older than the respondent, because this in turn has a negative effect on the respondents' willingness to consent.

Furthermore, I investigate which respondent characteristics determine whether the linkage of the two data sources concerning them succeeds, given that they previously consented to this linkage. It turns out that the determinants of linkage success partly differ between the various record linkage methods. Across all stages of the procedure, registered unemployed persons and dependent employees in particular are linked most successfully. Moreover, this success decreases with the respondent's age, while the respondent's income plays largely no role.

In Chapter 3 I investigate whether there is an association between the participation of adults in further training and their family background. I operationalise the latter through the highest schooling and vocational degree as well as the employment status of one parent of the person under study. Using existing theories, I derive how family background can affect a person's participation in non-formal education in the long run. The empirical analyses here are based exclusively on the survey data of the ALWA study. The employment episodes recorded there contain, among other things, information on how often respondents took part in non-formal job-related training during the job in question. I use count data models to examine the determinants of the frequency of participation. My particular focus is on the influence of the family background characteristics.

The results show that the participation of adults in non-formal training is still significantly associated with, in particular, the educational level of their parents. Adults whose parents have little formal education themselves take part in non-formal education significantly less often. Thus, instead of making up a partly background-related shortfall in formal qualifications through non-formal education, people from an educationally disadvantaged family background fall even further behind, in relative terms, in their human capital endowment.

In Chapter 4, which is based on an essay co-authored with Guido Heineck, I finally investigate whether basic skills pay off in the German labour market. To answer this question, my co-author and I examine the association of literacy and numeracy skills with the gross monthly earnings of full-time employees. We use data from the ALWA survey, which we enrich with test results on the respondents' basic skills and with administrative data on their earnings. Our results point to a positive association between earnings and both skill domains. Beyond that, however, we find no evidence of non-linearity in this association and only little heterogeneity between different subgroups in the labour market.
