I: NLSY79 I.A. NLSY79 Variables

DATA APPENDIX

SMART AND ILLICIT: WHO BECOMES AN ENTREPRENEUR AND DO THEY EARN MORE?

ROSS LEVINE AND YONA RUBINSTEIN

July 2016

I: NLSY79

I.A. NLSY79 Variables

Earnings:

Annual Hours Worked
NUMBER OF HOURS WORKED IN PAST CALENDAR YEAR (R24456.00).

Full-time, Full-Year
If the respondent (1) works 50 or more weeks per year, (2) works 2000 or more hours per year, and (3) works 40 or more hours per week, then Full-time, Full-Year is set equal to one; otherwise it is set equal to zero.

Earnings
Wages plus income from business, deflated by the CPI corresponding to when those earnings were realized. Earnings are in 2010 prices.

Demographics and Family:

Age
The age of the respondent: current year minus year of birth.

College Graduate (or more)
Graduated from college or obtained an advanced degree.

Educational attainment (six categories)
The six educational attainment categories: (i) high school dropouts: less than 12 years of schooling; (ii) GED degree; (iii) high school graduates: 12 years of schooling; (iv) had some college education: 13-15 years of schooling; (v) college education: 16 years of schooling; (vi) advanced studies: 17+ years of schooling. These are measured at the end of the respondent's educational experience, so they do not vary over time for a respondent.

Family Income in 1979
The income of the respondent's family in 1979 in 2010-year prices. Where the 1979 value is missing, we use the earliest year between 1980 and 1981 with a non-missing value. See Section I.E. NLSY79 Family Income in 1979 for details.

Father's Education
Years of schooling of the respondent's father. See Section I.D. NLSY79 Imputation of Mother and Father Education for details.

Female
Equals one if the respondent reports being female and zero otherwise.

Mother's Education
Years of schooling of the respondent's mother. See Section I.D. NLSY79 Imputation of Mother and Father Education for details.

Potential experience
Equals the age of the respondent minus the years of schooling minus six; if this computation is less than zero, potential experience is set equal to zero.
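The two rule-based definitions above (Full-time, Full-Year and Potential experience) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function and argument names are hypothetical, and the inputs are assumed to be already-cleaned numeric values.

```python
def full_time_full_year(weeks_worked, annual_hours, hours_per_week):
    """Return 1 if all three full-time, full-year conditions hold, else 0."""
    return int(weeks_worked >= 50 and annual_hours >= 2000 and hours_per_week >= 40)


def potential_experience(age, years_of_schooling, school_start_age=6):
    """Age minus schooling minus the school starting age, floored at zero."""
    return max(age - years_of_schooling - school_start_age, 0)


# Hypothetical respondents, for illustration only.
print(full_time_full_year(52, 2080, 40))   # meets all three conditions
print(full_time_full_year(48, 2080, 40))   # fails the weeks condition
print(potential_experience(30, 16))        # 30 - 16 - 6
print(potential_experience(20, 16))        # negative result is floored at zero
```

The `school_start_age` parameter is exposed only because the CPS section of this appendix subtracts seven rather than six.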

Two parent family (14)
Equals one if the respondent lived in a two-parent family at the age of 14.

White
Equals one if the respondent reports being white and zero otherwise.

Year of birth
The calendar year in which the respondent was born.

Years of Schooling
The respondent's maximum number of years of schooling, so it does not vary over time for a respondent.

Traits, etc.

AFQT
The Armed Forces Qualification Test score measures the aptitude and trainability of the respondent. Collected during the 1980 NLSY79 survey, the AFQT score is based on arithmetic reasoning, word knowledge, paragraph comprehension, and numerical operations. It is frequently employed as a general indicator of cognitive skills. This AFQT score is measured as a percentile of the NLSY79 survey, with a median value of 50.

Applied for Patent (residual standardized)
We set Applied for Patent equal to one if the respondent in 2010 answered "yes" to the question, "Has anyone, including yourself, ever applied for a patent for work that you significantly contributed to?" Since Applied for Patent is obtained decades after a person becomes prime age, we collect the residuals from a regression of Applied for Patent on education, AFQT, Rosenberg Self-Esteem, Rotter Locus of Control, the Illicit Index, and year of birth. We then standardize these residuals to obtain Applied for Patent (residual standardized), which has a mean of zero and a standard deviation of one.

Entrepreneur (residual standardized)
We set the variable Entrepreneur equal to one if the respondent in 2010 answers "yes" to the question, "Do you consider yourself to be an entrepreneur?" In posing the question, the NLSY79 defines an entrepreneur as "someone who launches a business enterprise, usually with considerable initiative and risk." Since Entrepreneur is obtained decades after a person becomes prime age, we collect the residuals from a regression of Entrepreneur on education, AFQT, Rosenberg Self-Esteem, Rotter Locus of Control, the Illicit Index, and year of birth. We then standardize these residuals to obtain Entrepreneur (residual standardized), which has a mean of zero and a standard deviation of one.
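The "residual standardized" construction used for both variables above is a two-step recipe: regress the 0/1 indicator on the covariates, then standardize the residuals. The sketch below illustrates that recipe under stated assumptions: it uses ordinary least squares with a constant, the population standard deviation, two made-up covariates instead of the six listed in the text, and invented data. The appendix does not specify the estimator details, so none of this should be read as the authors' implementation.

```python
from statistics import mean, pstdev


def ols_residuals(y, X):
    """Residuals from an OLS regression of y on a constant and the columns of X."""
    rows = [[1.0] + [float(v) for v in x] for x in X]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]


def standardize(values):
    """Rescale values to mean zero and (population) standard deviation one."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Invented data: a 0/1 indicator and two hypothetical covariates
# (years of schooling, AFQT percentile).
entrepreneur = [0, 1, 0, 1, 1, 0]
covariates = [(12, 40), (16, 80), (12, 55), (18, 90), (14, 35), (10, 20)]

z = standardize(ols_residuals(entrepreneur, covariates))
print([round(v, 3) for v in z])
```

Because the regression includes a constant, the residuals already average zero; the standardization step then only rescales them to unit standard deviation.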

Force (raw)
This equals one if the respondent reports in the 1980 survey ever using force to obtain something.

Illicit Activity Index (standardized)
This is constructed based on the answers to 20 questions in the 1980 survey, where 17 are questions about "delinquency" and 3 are about run-ins with the "police." The delinquency questions cover issues associated with damaging property, fighting at school, shoplifting, robbery, using force to obtain things, assault, threatening to assault somebody, drug use, dealing drugs, gambling, etc. The "police" questions involve being stopped by the police, charged with an illegal activity, or convicted, all for activities other than minor traffic offenses. For each question, we assign the value one if the person engaged in that activity and zero otherwise. For each respondent, we then add these values and divide by 20. We then standardize the values by subtracting the sample mean and dividing by the standard deviation, so that the Illicit Activity Index has a mean of zero and a standard deviation of one. We provide a more detailed explanation in Section I.B. NLSY79 Illicit Activity Index.

Rosenberg Self-Esteem (standardized)
The Rosenberg Self-Esteem score is based on a ten-part questionnaire given to all NLSY79 participants in 1980. It measures the degree of approval or disapproval of one's self. The values range from six to 30, where higher values signify greater self-approval. Rosenberg Self-Esteem (standardized) standardizes the score, so that it has a mean of zero and a standard deviation of one.

Rotter Locus of Control (standardized)
The Rotter Locus of Control measures the degree to which respondents believe they have internal control of their lives through self-determination relative to the degree that external factors, such as chance, fate, and luck, shape their lives. It was collected as part of a psychometric test in the 1979 NLSY79 survey. The Rotter Locus of Control ranges from 4 to 16, where higher values signify less internal control and more external control. This is standardized, so that it has a mean of zero and a standard deviation of one.

Steal 50 or less (raw)
This equals one if the respondent reports in the 1980 survey stealing something worth $50 or less during the year.

Stopped by Police (raw)
This equals one if the respondent reports in the 1980 survey ever being stopped by the police.

Employment type

Salaried
From the NLSY79's unified class of worker (R24455.10), there are four responses for working respondents: (1) private company, including non-profit, (2) government, (3) self-employed, and (4) those working without pay, including in family businesses. We set Salaried equal to one if the respondent's class of worker is either "(1)" or "(2)" and zero otherwise.

Self-employed
From the NLSY79's unified class of worker (R24455.10), there are four responses for working respondents: (1) private company, including non-profit, (2) government, (3) self-employed, and (4) those working without pay, including in family businesses. We set Self-employed equal to one if the respondent's class of worker is "(3)" and zero otherwise.

Incorporated Self-employed
If a respondent is self-employed, the NLSY79 further asks whether the business is incorporated or not. If the respondent is self-employed and the business is incorporated, then Incorporated Self-employed equals one, and it is zero otherwise. See Section I.C. NLSY79 Incorporated Self-Employment Coding Details.

Unincorporated Self-employed
If a respondent is self-employed, the NLSY79 further asks whether the business is incorporated or not. If the respondent is self-employed and the business is unincorporated, then Unincorporated Self-employed equals one, and it is zero otherwise.
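As a compact restatement of the four employment-type indicators defined above, the following sketch maps one class-of-worker response to the four 0/1 variables. The codes 1 to 4 follow the numbering of the list in this appendix, which is an assumption made for illustration and need not match the raw NLSY79 codes; the function name and the `incorporated` argument are hypothetical.

```python
def employment_type(class_of_worker, incorporated=None):
    """Map a class-of-worker response (1-4, numbered as in the appendix list)
    and an optional incorporation answer to four 0/1 indicators."""
    self_employed = int(class_of_worker == 3)
    return {
        "salaried": int(class_of_worker in (1, 2)),
        "self_employed": self_employed,
        # Incorporation only matters for the self-employed; None means unknown.
        "incorporated_se": int(self_employed == 1 and incorporated is True),
        "unincorporated_se": int(self_employed == 1 and incorporated is False),
    }


print(employment_type(1))                      # private company
print(employment_type(3, incorporated=True))   # incorporated self-employed
print(employment_type(3, incorporated=False))  # unincorporated self-employed
print(employment_type(4))                      # working without pay
```

A self-employed respondent with an unknown incorporation status gets zero on both incorporation indicators here; Section I.C describes how the appendix fills in such missing values before coding.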

I.B. NLSY79 Illicit Activity Index

In this subsection, we first describe the core data from the NLSY79 survey and then provide details on the construction of the index.

I.B.1. The Core Data

The Illicit Activity Index is based on questions from the 1980 survey. We use two types of questions on illicit activities: "delinquency" questions and "police" questions.

We use data on 17 of the 20 questions on "delinquency" provided by the NLSY79. We do not use the other three questions, which were posed only to people who were 17 years old or younger in 1979. These questions were therefore asked of only about 30% of the sample (3,898 out of 12,686), and including them would reduce the sample by about 70%. The survey asks whether, and how many times, the respondent engaged in the delinquent act. For example, one of the questions asks how many times the respondent smoked marijuana/hashish in the past year. We list all 20 questions below and indicate which ones we use in constructing the Illicit Activity Index. (This is titled "Table on the distribution of responses to questions on delinquency and police.")

We use two versions of the responses to the delinquency questions:

(1) The "0-6 intensive" version uses the actual number of times the respondent engaged in the delinquent act, where there are seven answers categorized from zero to six; and

(2) Our primary, core ("extensive") version uses the values of zero or one in coding the responses to the delinquency questions, i.e., we code whether the person did or did not engage in the act at least once.

With respect to the "police" questions, we use the three questions on interactions with the police provided by the NLSY79. These three questions offer zero/one options for responses. For example, one of the questions asks, were you "… ever convicted on illegal activity charges other than minor traffic offense?" We list these questions below also.

Here is a listing of the core data:

NLSY79 Reference Number  Question Name  Definition  Year of survey  Sample

Delinquency questions used in constructing the Illicit Activity Index

R03049.00  DELIN-4   ILLEGAL ACTIVITY 80 INT - TIMES INTENTIONALLY DAMAGED PROPERTY IN PAST YEAR  1980  ALL
R03050.00  DELIN-5   ILLEGAL ACTIVITY 80 INT - TIMES FOUGHT AT SCHOOL OR WORK IN PAST YEAR  1980  ALL
R03051.00  DELIN-6   ILLEGAL ACTIVITY 80 INT - TIMES SHOPLIFTED IN PAST YEAR  1980  ALL
R03052.00  DELIN-7   ILLEGAL ACTIVITY 80 INT - TIMES STOLEN OTHER'S BELONGINGS PAST YR (WORTH <$50)  1980  ALL
R03053.00  DELIN-8   ILLEGAL ACTIVITY 80 INT - TIMES STOLEN OTHER'S BELONGINGS PAST YR (WORTH >$50)  1980  ALL
R03054.00  DELIN-9   ILLEGAL ACTIVITY 80 INT - TIMES USED FORCE TO OBTAIN THINGS IN PAST YEAR  1980  ALL
R03055.00  DELIN-10  ILLEGAL ACTIVITY - TIMES SERIOUSLY THREATENED TO HIT/HIT SOMEONE PAST YEAR  1980  ALL
R03056.00  DELIN-11  ILLEGAL ACTIVITY 80 INT - TIMES ATTACKED W/INTENT TO INJURE/KILL IN PAST YEAR  1980  ALL
R03057.00  DELIN-12  ILLEGAL ACTIVITY 80 INT - TIMES SMOKED MARIJUANA/HASHISH IN PAST YEAR  1980  ALL
R03058.00  DELIN-13  ILLEGAL ACTIVITY - TIMES USED OTHER DRUGS/CHEMICALS TO GET HIGH IN PAST YEAR  1980  ALL
R03059.00  DELIN-14  ILLEGAL ACTIVITY 80 INT - TIMES SOLD MARIJUANA/HASHISH IN PAST YEAR  1980  ALL
R03060.00  DELIN-15  ILLEGAL ACTIVITY 80 INT - TIMES SOLD HARD DRUGS IN PAST YEAR  1980  ALL
R03061.00  DELIN-16  ILLEGAL ACTIVITY 80 INT - TIMES ATTEMPTED TO "CON" SOMEONE IN PAST YEAR  1980  ALL
R03062.00  DELIN-17  ILLEGAL ACTIVITY 80 INT - TIMES TAKEN AUTO W/OUT OWNER PERMISSION IN PAST YEAR  1980  ALL
R03063.00  DELIN-18  ILLEGAL ACTIVITY 80 INT - TIMES BROKEN INTO A BUILDING IN PAST YEAR  1980  ALL
R03064.00  DELIN-19  ILLEGAL ACTIVITY 80 INT - TIMES KNOWINGLY SOLD/HELD STOLEN GOODS IN PAST YEAR  1980  ALL
R03065.00  DELIN-20  ILLEGAL ACTIVITY 80 INT - TIMES AIDED IN GAMBLING OPERATION IN PAST YEAR  1980  ALL

Police questions (zero/one answers)

R03067.00  POLICE-1  EVER "STOPPED" BY POLICE FOR OTHER THAN MINOR TRAFFIC OFFENSE?  1980  ALL
R03071.00  POLICE-2  EVER CHARGED WITH ILLEGAL ACTIVITY? 80 INT (EXC MINOR TRAFFIC OFFENSE)  1980  ALL
R03078.00  POLICE-3  EVER CONVICTED ON ILLEGAL ACTIVITY CHARGES OTHER THAN MINOR TRAFFIC OFFENSE?  1980  ALL

Delinquency questions NOT used in constructing the Illicit Activity Index

NLSY79 Reference Number  Question Name  Definition  Year of survey  Sample

R03046.00  DELIN-1   ILLEGAL ACTIVITY 80 INT - TIMES RUN AWAY FROM HOME IN PAST YR (AGE 17 OR UNDER)  1980  R<=17
R03047.00  DELIN-2   ILLEGAL ACTIVITY 80 INT - TIMES SKIPPED SCHOOL DAY IN PAST YR (AGE 17 OR UNDER)  1980  R<=17
R03048.00  DELIN-3   ILLEGAL ACTIVITY - TIMES DRANK ALCOHOLIC BEVERAGES PAST YR (AGE 17 OR UNDER)  1980  R<=17

These last three questions are excluded because they were only asked of respondents who were 17 years or younger in 1979 (born between 1962 and 1964). Including them would reduce the sample by 70%.

I.B.2. Constructing the Index

The Illicit Index

1. Sample: Of the 12,686 individuals in the NLSY79 survey, 1,310 have missing data on one of the 17 delinquency questions that we include in our Index. Of the remaining 11,376 individuals, 19 have missing data on one of the three police questions.

This is tabulated as follows:

Variable/Selection Criteria            Persons   Dropped
Initial number of persons               12,686         0
Missing one of the 17 questions         11,376     1,310
Missing one of the 3 questions          11,357        19
Final Number of Person-Observations     11,357

2. We use (i) the responses to the three police questions (which offer zero/one responses in the NLSY79 survey) and (ii) the "one/zero" responses to the 17 delinquency questions, i.e., we use the extensive versions of the answers to the delinquency questions.

3. For each respondent, we sum the twenty zero/one responses and divide by 20. We then standardize the values by subtracting the sample mean and dividing by the standard deviation, so that the Illicit Activity Index has a mean of zero and a standard deviation of one.
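Steps 2 and 3 above amount to a short computation per respondent followed by a sample-wide standardization. The sketch below illustrates it with invented responses. The function names are hypothetical, delinquency answers are assumed to arrive as the 0-6 category codes, and the population standard deviation is an assumption, since the appendix does not say which variant is used.

```python
from statistics import mean, pstdev


def illicit_share(delinquency_codes, police_flags):
    """Share of the 20 questions on which the respondent reports any activity."""
    if len(delinquency_codes) != 17 or len(police_flags) != 3:
        raise ValueError("expected 17 delinquency answers and 3 police answers")
    # Extensive coding: one if the act was committed at least once.
    engaged = [1 if code > 0 else 0 for code in delinquency_codes]
    return (sum(engaged) + sum(police_flags)) / 20


def standardize(values):
    """Subtract the sample mean and divide by the (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Three invented respondents.
shares = [
    illicit_share([0] * 17, [0, 0, 0]),
    illicit_share([2, 0, 1] + [0] * 14, [1, 0, 0]),
    illicit_share([6] * 17, [1, 1, 1]),
]
index = standardize(shares)
print(shares)
print([round(v, 3) for v in index])
```

Dividing by 20 before standardizing does not change the standardized index, because standardization is invariant to rescaling; it only makes the intermediate value readable as a share.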

The Illicit Activity Index 0-6 Intensive

1. We use the full answers to the 17 delinquency variables concerning the number of times the respondent engaged in the activity. Specifically, there are seven possible answers: (0) never, (1) once, (2) twice, (3) 3-5 times, (4) 6-10 times, (5) 11-50 times, and (6) more than 50 times.

2. For the Illicit Activity Index 0-6 Intensive, we assign the values 0, 1, 2, 4, 8, 30, and 50 to the seven answers.

3. We then (a) compute the standardized value of each of the 20 questions ((value - mean)/standard deviation) and then (b) sum the values and divide by 20.

The two indices, the Illicit Activity Index (standardized) and the Illicit Activity Index 0-6 Intensive (standardized), are very highly correlated (0.91), and the paper's results hold when using either Index. We illustrate this robustness in the Appendix Tables: APPENDIX TABLE VIIA, APPENDIX TABLE VIII, and APPENDIX TABLE XI.
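The intensive variant differs from the extensive one in two ways: each 0-6 answer is first mapped to an approximate count, and standardization happens question by question before averaging. The sketch below follows the three steps listed above with invented data. The function name is hypothetical, and setting a question with zero sample variance to zero is an assumption added so that the toy example runs; the appendix does not discuss that case.

```python
from statistics import mean, pstdev

# Values assigned to the seven answer categories (0) never ... (6) more than 50 times.
TIMES = [0, 1, 2, 4, 8, 30, 50]


def intensive_index(delinquency, police):
    """delinquency: per-respondent lists of 17 codes 0-6;
    police: per-respondent lists of 3 zero/one flags.
    Returns the average of the 20 question-level z-scores for each respondent."""
    rows = [[TIMES[c] for c in d] + list(p) for d, p in zip(delinquency, police)]
    z_columns = []
    for column in zip(*rows):
        m, s = mean(column), pstdev(column)
        # Assumption: a question with no variation contributes zero.
        z_columns.append([(v - m) / s if s > 0 else 0.0 for v in column])
    return [sum(z[i] for z in z_columns) / len(z_columns) for i in range(len(rows))]


# Three invented respondents.
delinquency = [[0] * 17, [1] + [0] * 16, [6, 3] + [0] * 15]
police = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]

index = intensive_index(delinquency, police)
print([round(v, 3) for v in index])
```

Because every question-level z-score has mean zero, the averaged index also has mean zero across respondents; a final standardization to unit standard deviation would be needed to match the "(standardized)" label used in the text.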

The Distribution of Responses to Questions on Delinquency and Police:

Question             N  Min  Median  Mean  Max  NEVER  ONCE  TWICE  TIMES_3_5  TIMES_1_10  TIMES_11_50  TIMES_50P

Delinquency questions used in constructing the Illicit Activity Index
DAMAGED          11734    0       0  0.36    6   0.82  0.09   0.04       0.04        0.01         0.00       0.00
FOUGHT           11800    0       0  0.56    6   0.72  0.13   0.06       0.06        0.02         0.01       0.00
SHOPLIFTED       11788    0       0  0.53    6   0.74  0.12   0.05       0.05        0.02         0.01       0.00
STEEL50M         11788    0       0  0.37    6   0.81  0.09   0.04       0.04        0.01         0.01       0.00
STEEL50P         11776    0       0  0.11    6   0.94  0.03   0.01       0.01        0.00         0.00       0.00
FORCE            11794    0       0  0.10    6   0.95  0.03   0.01       0.01        0.00         0.00       0.00
THREAT           11785    0       0  0.82    6   0.63  0.16   0.08       0.08        0.03         0.02       0.01
ATTACK           11792    0       0  0.20    6   0.89  0.06   0.02       0.02        0.01         0.00       0.00
MARIJUANA        11722    0       0  1.84    6   0.53  0.09   0.04       0.07        0.05         0.07       0.15
DRUGES           11698    0       0  0.60    6   0.81  0.05   0.03       0.04        0.03         0.03       0.02
MARIJUANA_SOLD   11693    0       0  0.33    6   0.89  0.03   0.02       0.02        0.02         0.01       0.01
DRUGES_SOLD      11717    0       0  0.07    6   0.97  0.01   0.00       0.00        0.00         0.00       0.00
CON              11721    0       0  0.48    6   0.78  0.09   0.05       0.05        0.02         0.01       0.01
AUTO             11752    0       0  0.14    6   0.92  0.04   0.01       0.01        0.00         0.00       0.00
BROKEN           11748    0       0  0.12    6   0.94  0.03   0.01       0.01        0.00         0.00       0.00
SOLD             11749    0       0  0.23    6   0.89  0.06   0.02       0.02        0.01         0.00       0.00
GAMBELING        11737    0       0  0.06    6   0.98  0.01   0.00       0.00        0.00         0.00       0.00

Police questions (zero/one answers) used in constructing the Illicit Activity Index
STOPPED_POLICE   12129    0       0  0.19    1   0.81  0.19   0.00       0.00        0.00         0.00       0.00
CHARGED          12136    0       0  0.11    1   0.89  0.11   0.00       0.00        0.00         0.00       0.00
CONVICTED        12130    0       0  0.06    1   0.94  0.06   0.00       0.00        0.00         0.00       0.00

AVG                                              0.84  0.07   0.03       0.03        0.01         0.01       0.01

I.C. NLSY79 Incorporated Self-employment Coding Details

We use the following process for coding incorporated self-employment.

1. The NLSY79 provides information on (a) the class of worker, including whether the respondent (R) is salaried or self-employed, and (b) whether R's business is incorporated.

2. From 1994 onward, the NLSY79 notes that whenever R's job is the same as the job in the last interview, class of worker and incorporation status are only reported if the information has changed. It is coded as missing if there has been no change since the last interview.

3. For example, if person A has been continuously self-employed by "AL Consulting, Inc." for several years, A's "raw" data might look like this:

Year   COW (job #1)   INCORP (job #1)
2000   4 (SE)         1 (yes)
2002   -4             -4
2004   -4             -4
2006   -4             -4
2008   -4             -4
2010   -4             -4

Even though job #1 refers to the same job (AL Consulting, Inc.) in each of these interviews, COW and INCORP are missing after the first year because they are not re-asked.

4. The NLSY79 solves this problem for class of worker. They appropriately "fill in" the information on class of worker for each R rather than leaving data entries as "missing" (e.g., see the class of worker variable for Job #1 in the NLSY79 Navigator). The COWALL variables (e.g., R4587905 = COWALL-EMP1_1994, which is COW for job #1 in 1994) have been "filled in" to carry the old information forward.

5. The NLSY79, however, did not "fill in" the incorporation status. The INCORP variables (e.g., R4587000 = QES1-56E_1994, which is INCORP for job #1 in 1994) have not been filled in. So if one is pairing (created) COWALL variables with (raw) INCORP variables, the (fake) data will look like this:

Year   COWALL-EMP1   INCORP (job #1)
2000   4 (SE)        1 (yes)
2002   4             -4
2004   4             -4
2006   4             -4
2008   4             -4
2010   4             -4

Clearly, there are many cases where INCORP is missing even though COWALL = SE.

6. There is a straightforward procedure for addressing this coding issue. If R is self-employed, use the incorporation status of the last interview to appropriately "fill in" missing values (i.e., fill in the "valid skips"). After following this procedure, incorporation status has 1.5% missing values based on the sample of individuals in the Table I summary statistics.
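The fill-in procedure in step 6 is a carry-forward of the last reported incorporation status across interviews in which the respondent remains self-employed. A minimal sketch follows, using the AL Consulting example from step 5. The record layout and names are hypothetical, -4 is taken from the example as the valid-skip code, and resetting the carried value when the respondent is not self-employed is an assumption rather than something the appendix states.

```python
VALID_SKIP = -4  # code shown for "not re-asked" in the example above


def fill_incorporation(records):
    """records: one respondent's interviews sorted by year, each a dict with
    'year', 'self_employed' (bool) and 'incorp' (1, 0 or VALID_SKIP).
    Returns copies with valid skips replaced by the last reported status."""
    filled, last_status = [], None
    for record in records:
        record = dict(record)
        if not record["self_employed"]:
            last_status = None            # assumption: spell ends, nothing to carry
        elif record["incorp"] == VALID_SKIP:
            if last_status is not None:
                record["incorp"] = last_status
        else:
            last_status = record["incorp"]
        filled.append(record)
    return filled


# Person A from the example: incorporated in 2000, then not re-asked.
person_a = [{"year": 2000, "self_employed": True, "incorp": 1}] + [
    {"year": year, "self_employed": True, "incorp": VALID_SKIP}
    for year in (2002, 2004, 2006, 2008, 2010)
]

for record in fill_incorporation(person_a):
    print(record["year"], record["incorp"])
```

A valid skip with no earlier report in the same spell is left as is, which corresponds to the residual missing values that step 7 of the appendix handles case by case.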

7. We make a few additional adjustments for Rs who (a) are self-employed but (b) have missing values of incorporated business status after following the above procedure in a survey year.

7.a. We find 3 person-year observations after 1994 where R is self-employed, R reports having the same job as last year, and R was incorporated last year. We code R as incorporated. (Same Job as Last Year)

7.b. We find 8 person-year observations (after 1994) in which the NLSY79 indicates in survey t+1 that (a) R is incorporated and (b) has the same job as last year, so we code R as incorporated in survey t. (Same Job as Next Year)

7.c. We find 7 person-year observations prior to 1994 in which a self-employed person in period t and t-1 has a missing value for incorporated status in survey t, and was incorporated self-employed in survey t-1. We code these Rs as incorporated in survey t. (Pre-1994: Same Job as Last Year)

7.d. We find 9 person-year observations in which a self-employed person in survey t and t+1 has a missing value for incorporated status in survey t but is incorporated self-employed in survey t+1. We code these Rs as incorporated in survey t. (Pre-1994: Same Job as Next Year)

7.e. After this, there are 2 additional person-year observations in which the incorporated status is missing, but in which one of R's other jobs, i.e., Job 2 to Job 5, is incorporated. We code these two observations as incorporated. (Across Job Categories)

7.f. The results are robust to keeping these 29 person-year observations as missing, as shown in the Appendix Tables. See APPENDIX TABLE I, APPENDIX TABLE II, APPENDIX TABLE IV, and APPENDIX TABLE VIIB.

Specifically, from the STATA program for the 1,936 person-year observations coded as incorporated self-employed, we tabulate the following:

Incorporated Source                   Freq.       %     Cum.
NLSY79 raw data                       1,501   77.53    77.53
NLSY79 post 1994 procedure              406   20.97    98.50
Same job as last year                     3    0.15    98.66
Same job as next year                     8    0.41    99.07
Pre-1994: same job as last year           7    0.36    99.43
Pre-1994: same job as next year           9    0.46    99.90
Across job categories                     2    0.10    100.0

I.D. NLSY79: Imputation of Mother and Father Education

Of the 132,681 person-year observations (10,719 individuals) covered in Table I, 125,291 (10,093) have mother's education and 115,216 (9,263) have father's education.

If the NLSY79 does not report data on the education of the mother or the father, we use the following two imputation procedures.

1. Partner Imputation. If one parent's education is missing, we use the other parent's.

2. Mean Imputation. If both parents' education is missing, we use the mean education of parents, differentiating by race (Black, Hispanic, White) and gender.

As shown in the Appendix Tables, the results are robust to excluding these imputed measures.
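The two imputation rules can be sketched as follows. This is an illustration under explicit assumptions, not the authors' procedure: the text does not say exactly how the group means are formed, so the sketch assumes cells defined by the respondent's race and gender, computed separately for mothers and fathers from reported values only. All names and data are hypothetical.

```python
from collections import defaultdict


def impute_parent_education(people):
    """people: list of dicts with 'race', 'female', 'mother_ed', 'father_ed'
    (None when missing). Returns copies with missing values imputed."""
    # Group totals from reported values: (race, female, parent) -> [sum, count].
    totals = defaultdict(lambda: [0.0, 0])
    for p in people:
        for key in ("mother_ed", "father_ed"):
            if p[key] is not None:
                cell = totals[(p["race"], p["female"], key)]
                cell[0] += p[key]
                cell[1] += 1
    result = []
    for p in people:
        q = dict(p)
        mother, father = p["mother_ed"], p["father_ed"]
        if mother is None and father is not None:
            q["mother_ed"] = father                      # partner imputation
        elif father is None and mother is not None:
            q["father_ed"] = mother                      # partner imputation
        elif mother is None and father is None:
            for key in ("mother_ed", "father_ed"):       # mean imputation
                total, n = totals[(p["race"], p["female"], key)]
                q[key] = total / n if n else None
        result.append(q)
    return result


# Invented respondents in a single race-by-gender cell.
sample = [
    {"race": "White", "female": 0, "mother_ed": 12, "father_ed": 16},
    {"race": "White", "female": 0, "mother_ed": None, "father_ed": 14},
    {"race": "White", "female": 0, "mother_ed": 10, "father_ed": None},
    {"race": "White", "female": 0, "mother_ed": None, "father_ed": None},
]
for person in impute_parent_education(sample):
    print(person["mother_ed"], person["father_ed"])
```

Computing the group means before any partner imputation keeps imputed values from feeding into the means; whether the original procedure does the same is not stated in the text.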

The following tables detail each imputation procedure (one observation per individual) for the 10,719 individuals in our base sample.

Mother Education Source               Freq.       %     Cum.
NLSY79 raw data                      10,093   94.16    94.16
Imputed using father's education        294    2.74    96.90
Imputed using group's mean              332    3.10   100.00

Father Education Source               Freq.       %     Cum.
NLSY79 raw data                       9,263   86.42    86.42
Imputed using mother's education      1,124   10.49    96.90
Imputed using group's mean              332    3.10   100.00

I.E. NLSY79: Family Income in 1979

For Family Income in 1979, we use the non-zero values of the variable R0217900, which is truncated at $75,000 in 2010-year prices. We use the earliest non-missing value in 1979-1981. In 81% of the cases, this is 1979. In 13%, it is 1980; and in 3.3%, it is 1981.

Thus, of the 132,681 person-year observations in Table I, 8,676, we have non-missing data for Family Income in 1979 for 97.3% of those observations.

For the remaining 2.7% (3,598 person-year observations), we impute family income by using the mean value of family income by race (Black, Hispanic, White).
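The family income rule combines an "earliest available year" lookup with a mean-by-race fallback. The sketch below shows both pieces with invented values. Function names and the record layout are hypothetical, and treating a zero as missing is an assumption based on the statement that only non-zero values are used.

```python
from collections import defaultdict


def earliest_income(income_by_year, years=(1979, 1980, 1981)):
    """Return the earliest non-missing, non-zero income among the given years, or None."""
    for year in years:
        value = income_by_year.get(year)
        if value:  # None and 0 are both treated as missing here
            return value
    return None


def impute_by_race(people):
    """people: list of dicts with 'race' and 'income' (None when missing).
    Missing incomes are replaced by the mean reported income within race."""
    totals = defaultdict(lambda: [0.0, 0])
    for p in people:
        if p["income"] is not None:
            totals[p["race"]][0] += p["income"]
            totals[p["race"]][1] += 1
    filled = []
    for p in people:
        q = dict(p)
        total, n = totals[q["race"]]
        if q["income"] is None and n > 0:
            q["income"] = total / n
        filled.append(q)
    return filled


# Invented respondents.
people = [
    {"race": "Black", "income": earliest_income({1979: 20000})},
    {"race": "Black", "income": earliest_income({1979: None, 1980: 30000, 1981: 32000})},
    {"race": "Black", "income": earliest_income({})},
]
for person in impute_by_race(people):
    print(person["race"], person["income"])
```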

I.F. Data Processing NLSY79: Sample Selection Criteria

The table below details how we arrive at the number of observations in each table using NLSY79 data.

Data Processing NLSY79

Variable/Selection Criteria        Year-Person   Individuals     Dropped
Initial number of observations         317,150        12,686           0
Interviewed                            243,641        12,686      73,509
Age between 25 to 55                   166,250        12,264      77,391
Salaried or Self-Employed              143,583        11,780      22,667
AFQT                                   137,272        11,133       6,311
Rotter Score                           136,037        11,020       1,235
Rosenberg Score                        132,681        10,719       3,356
School Years                           132,681        10,719           0

Number of Observations                 132,681        10,719     184,469
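The Dropped column is the difference between consecutive person-year counts, and the final Dropped entry is their total. A short check of that arithmetic, using the person-year counts reported in the table, is sketched below; the function name is hypothetical.

```python
def dropped_per_step(steps):
    """Return (label, remaining, dropped) for each sequential selection step."""
    rows, previous = [], None
    for label, remaining in steps:
        dropped = 0 if previous is None else previous - remaining
        rows.append((label, remaining, dropped))
        previous = remaining
    return rows


# Person-year counts as reported in the NLSY79 data-processing table.
NLSY79_STEPS = [
    ("Initial number of observations", 317150),
    ("Interviewed", 243641),
    ("Age between 25 to 55", 166250),
    ("Salaried or Self-Employed", 143583),
    ("AFQT", 137272),
    ("Rotter Score", 136037),
    ("Rosenberg Score", 132681),
    ("School Years", 132681),
]

rows = dropped_per_step(NLSY79_STEPS)
total_dropped = sum(dropped for _, _, dropped in rows)
for label, remaining, dropped in rows:
    print(f"{label:<32}{remaining:>9,}{dropped:>9,}")
print(f"{'Total dropped':<32}{total_dropped:>18,}")
```

The per-step differences reproduce the Dropped column, and their sum equals the initial count minus the final count.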

Tables and Figures

Variable/Selection Criteria                        Year-Person   Individuals     Dropped

Tables I and II: Demographics, Labor Market Outcomes and Home Environment
Number of Observations                                 132,681        10,719     184,469

Table III: Switching Between Unincorporated and Incorporated Self-Employment
Table I and                                            132,681        10,719     184,469
Self-employed, first year in spell                       4,118         2,799     128,563
At most one switch within self-employment spell          4,083         2,786          35
Number of Observations                                   4,083         2,786           0

Table IV: Job Task Requirements by Employment Type
Table I and                                            132,681        10,719     184,469
Excluding missing occupation                           131,949        10,674         732
Number of Observations Panel B.1                       131,949        10,674
Last job as salaried worker                            120,156        10,218      12,525 [1]
Number of Observations Panel B.2                       120,156        10,218

Table VII: Selection into Employment Types on Cognitive, Noncognitive and Family Traits
Table I and                                            132,681        10,719
Illicit Index                                          125,166        10,055       7,515
Number of Observations                                 125,166        10,055

Table VIII: Differences in Job Task Requirements of Businesses by Individual Traits
Table VII and                                          125,166        10,055
Whites                                                  69,503         5,981      55,663
Males                                                   35,012         2,964      34,491
Salaried two years ago                                  29,754         2,818       5,258
Valid industry codes in year t                          29,412         2,817         342
Number of Observations                                  29,412         2,817

Tables IX, X, XI and Figures I and II: Earnings, Levels and First Differences
Table VII and                                          125,166        10,055
Whites                                                  69,503         5,981      55,663
Males                                                   35,012         2,964      34,491
Hourly Earnings                                         32,768         2,924       2,244
Full-Time; Full-Year                                    23,657         2,595       9,111
Number of Observations                                  23,657         2,595
- First differences regressions                         17,479         2,227       6,178

[1] This drop is relative to the observations in Table I.

II: CPS

II.A. Variables

Earnings:

Annual Hours Worked
Number of hours worked during the past calendar year.

Full-time, Full-Year
If the respondent (1) works 50 or more weeks per year, (2) works 2000 or more hours per year, and (3) works 40 or more hours per week, then Full-time, Full-Year is set equal to one; otherwise it is set equal to zero.

Earnings
Wages plus income from business, deflated by the CPI corresponding to when those earnings were realized. Earnings are in 2010 prices.

Demographics:

Age
The age of the respondent.

College Graduate (or more)
Graduated from college or obtained an advanced degree.

Educational attainment (six categories)
The six educational attainment categories are: (i) completed less than 9th grade, (ii) completed between 9th and 11th grade, (iii) graduated from high school, (iv) had some college education, (v) graduated from college, and (vi) obtained an advanced degree.

Female
Equals one if the respondent reports being female and zero otherwise.

Potential experience
Equals the age of the respondent minus the years of schooling minus seven; if this computation is less than zero, potential experience is set equal to zero.

White
Equals one if the respondent reports being white and zero otherwise.

Year of birth
The calendar year in which the respondent was born.

Years of Schooling
Total years of educational attainment.

Employment type

Salaried
The CPS classifies all workers in each year as either salaried or self-employed. Salaried equals one if the respondent is salaried and zero otherwise.

Self-employed
The CPS classifies all workers in each year as either salaried or self-employed. Self-employed equals one if the respondent is self-employed and zero otherwise.

Incorporated Self-employed
The CPS classifies all workers in each year as either salaried or self-employed, and among the self-employed, indicates whether individuals are incorporated or unincorporated. Specifically, individuals are asked about their employment class for their main job: "Were you employed by a government, by a private company, a nonprofit organization, or were you self-employed (or working in a family business)?" Those responding that they are self-employed are further asked, "Is this business incorporated?" Incorporated Self-employed equals one if the person answers yes, and zero otherwise.

Unincorporated Self-employed
The CPS classifies all workers in each year as either salaried or self-employed, and among the self-employed, indicates whether individuals are incorporated or unincorporated. Specifically, individuals are asked about their employment class for their main job: "Were you employed by a government, by a private company, a nonprofit organization, or were you self-employed (or working in a family business)?" Those responding that they are self-employed are further asked, "Is this business incorporated?" Unincorporated Self-employed equals one if the person answers no to this question and yes to being self-employed, and zero otherwise.

II.B. Data Processing CPS: Sample Selection Criteria

The table below details how we arrive at the number of observations in each table using CPS data in the paper.

Data Processing CPS, 1996-2013

Variable/Selection Criteria        Year-Person   Individuals [2]     Dropped
Initial number of observations       3,384,125        --                   0
Adult Civilians                      2,551,860        --             832,265
Households                           2,551,836        --                  24
With positive sample weight          2,550,441        --               1,395

Age between 25 to 55                 1,500,103        --           1,050,338
Gender, race, education              1,500,103        --                   0
Potential experience 0-50            1,500,103        --                   0
Valid industry code                  1,257,925        --             242,178
Valid occupation code (<997)         1,257,925        --                   0
School Years                         1,257,925        --                   0

Excluding:
- Farmers and Farm Laborers          1,240,776        --              17,149
- Agriculture                        1,226,658        --              14,118

Salaried or self-employed            1,225,886        --                 772

Number of Observations               1,225,886       893,780       2,126,200

[2] The number of individuals is the base for our CPS panel data analyses.

Tables Selection Criteria              Year-Person   Individuals     Dropped

Table I: Demographics and Labor Market Outcomes by Employment Type
Number of Observations                   1,225,886       893,780   2,126,200

Table IV: Job Task Requirements by Employment Type
Table I                                  1,225,886       893,780
Number of Observations panel A.1         1,225,886       893,780
Panel                                      513,701       257,017     712,185
Salaried worker last year                  230,330       230,330     283,371
Number of Observations panel A.2           230,330       230,330

Table V: Selection into Unincorporated and Incorporated Self-Employment
Table IV panel A.2                         230,330       230,330
Number of Observations                     230,330       230,330

Table VI: Top and Bottom Industries by Nonroutine Job Task Requirements
Table I                                  1,225,886       893,780
Number of Observations                   1,225,886       893,780

II.C. Matched Sample

We construct a two-year matched panel. The CPS interviews a household for four consecutive months. The next year, the CPS returns to the same location. In most cases, the second interview involves the same household as the first interview.

We follow the guidelines in Madrian and Lefgren (2000) for matching CPS households across time. This involves checking the age, race, gender, education, etc. of those interviewed and dropping individuals (for the matched panel sample) where these do not match across the CPS interviews.
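The consistency check described above can be pictured as a predicate applied to each candidate pair of interviews. The sketch below is purely illustrative: the specific tolerances (age rising by zero to two years, education not falling) and the field names are assumptions made here for the example, and are not taken from Madrian and Lefgren (2000) or from this appendix.

```python
def is_valid_match(first, second, max_age_gain=2):
    """Check that two interviews one year apart plausibly describe the same person.
    The tolerances are illustrative assumptions, not the published criteria."""
    same_sex = first["sex"] == second["sex"]
    same_race = first["race"] == second["race"]
    age_gain = second["age"] - first["age"]
    plausible_age = 0 <= age_gain <= max_age_gain
    education_ok = second["education"] >= first["education"]
    return same_sex and same_race and plausible_age and education_ok


# Invented interview pairs.
year_1 = {"sex": "F", "race": "White", "age": 34, "education": 16}
same_person = {"sex": "F", "race": "White", "age": 35, "education": 16}
new_household = {"sex": "M", "race": "White", "age": 52, "education": 12}

print(is_valid_match(year_1, same_person))
print(is_valid_match(year_1, new_household))
```

Pairs that fail the check would be dropped from the matched panel only; they remain in the cross-section sample.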

We do not find differential selection into the matched-CPS sample once we condition on demographics (and FTFY when conducting the earnings analyses), as shown in the table below.

More specifically, selection into the panel sub-sample is not random. Whites and individuals with larger earnings are more likely to be observed two consecutive years than others. The incorporated and unincorporated self-employed are more likely to be selected into the two-year panel (5.85% and 1.7%) than others. Yet, conditional on standard demographics, such as gender, race, education and potential experience, these differences disappear.

A simple (non-parametric) way to observe that differential selection is not a problem when using the matched-CPS sample is to compare the cross-section CPS sample with the Matched-CPS sample by demographic groups. When we restrict the sample to whites, we find much smaller gaps in key measures between the Matched-CPS sample and the cross-section sample. For instance, the gaps in years of schooling completed, annual hours worked, and the DOT measures are negligible. The gap in annual earnings between the Matched-CPS sample and the cross-section sample drops by half to 4%. Furthermore, when restricting the sample to FTFY white men, the gap in earnings drops to 1%. The differential selection on earnings into the CPS-Matched sample drops from approximately 7% (more for salaried than incorporated and unincorporated self-employed) to approximately 2%.

For a comparison of the earnings when using the full and matched samples, see Appendix Table IXB: CPS: Earnings Full & Matched Samples, which shows that the results are very similar.

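Differences between the Cross-Section CPS and the Matched-Panel CPS

Columns are samples. For each variable, the "Abs." row reports the difference in absolute terms, (All - Matched), and the "%" row reports the difference in percent, (All - Matched)/All.

Panel A: All Types of Workers

| Variable | Measure | All | Whites | White Males | White Males FTFY | White Males FTFY 2K |
|---|---|---|---|---|---|---|
| Observations | Abs. | 712185 | 439193 | 224825 | 183224 | 177032 |
| | % | 58.1% | 52.2% | 51.8% | 50.3% | 50.3% |
| Age | Abs. | -1.4 | -1.2 | -1.1 | -1.0 | -1.0 |
| | % | -3.5% | -2.9% | -2.7% | -2.4% | -2.4% |
| White | Abs. | -0.10 | 0.00 | 0.00 | 0.00 | 0.00 |
| | % | -14.4% | 0.0% | 0.0% | 0.0% | 0.0% |
| Female | Abs. | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | % | -0.1% | 0.5% | | | |
| Years of schooling | Abs. | -0.2 | -0.1 | 0.0 | 0.0 | 0.0 |
| | % | -1.6% | -0.4% | -0.3% | -0.2% | -0.2% |
| Mean earnings | Abs. | -3853 | -2217 | -2222 | -867 | -821 |
| | % | -8.1% | -4.3% | -3.5% | -1.2% | -1.2% |
| Median earnings | Abs. | -3930 | -2225 | -2010 | -1158 | -1171 |
| | % | -10.9% | -5.6% | -4.1% | -2.2% | -2.2% |
| Annual worked hours | Abs. | -61 | -49 | -35 | 3 | 3 |
| | % | -3.0% | -2.4% | -1.6% | 0.1% | 0.1% |
| Full-Time Full-Year | Abs. | -0.03 | -0.03 | -0.03 | 0.00 | 0.00 |
| | % | -5.0% | -4.4% | -3.4% | -0.1% | 0.0% |
| Nonroutine Analytical | Abs. | -0.16 | -0.06 | -0.05 | -0.03 | -0.03 |
| | % | -4.0% | -1.5% | -1.2% | -0.6% | -0.6% |
| Nonroutine DCP | Abs. | -0.23 | -0.10 | -0.09 | -0.05 | -0.05 |
| | % | -7.8% | -3.1% | -2.7% | -1.5% | -1.4% |
| Nonroutine Manual | Abs. | 0.04 | 0.01 | 0.01 | 0.01 | 0.01 |
| | % | 3.6% | 1.5% | 1.0% | 0.7% | 0.7% |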

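Panel B: Salaried

| Variable | Measure | All | Whites | White Males | White Males FTFY | White Males FTFY 2K |
|---|---|---|---|---|---|---|
| Observations | Abs. | 647422 | 392704 | 196143 | 160633 | 155473 |
| | % | 58.4% | 52.3% | 52.1% | 50.5% | 50.5% |
| Age | Abs. | -1.4 | -1.2 | -1.2 | -0.6 | -0.7 |
| | % | -3.6% | -3.1% | -2.9% | -1.4% | -1.6% |
| White | Abs. | -0.10 | 0.00 | 0.00 | 0.00 | 0.00 |
| | % | -14.9% | 0.0% | 0.0% | 0.0% | 0.0% |
| Female | Abs. | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | % | -0.6% | 0.2% | | | |
| Years of schooling | Abs. | -0.2 | -0.1 | 0.0 | 0.0 | -0.1 |
| | % | -1.7% | -0.4% | -0.3% | -0.3% | -0.4% |
| Mean earnings | Abs. | -3802 | -2236 | -2321 | 126 | -124 |
| | % | -8.2% | -4.4% | -3.8% | 0.2% | -0.2% |
| Median earnings | Abs. | -3770 | -2067 | -2187 | 0 | -405 |
| | % | -10.4% | -5.2% | -4.5% | 0.0% | -0.8% |
| Annual worked hours | Abs. | -59 | -48 | -35 | -1 | -4 |
| | % | -3.0% | -2.4% | -1.6% | 0.0% | -0.2% |
| Full-Time Full-Year | Abs. | -0.04 | -0.03 | -0.03 | 0.00 | 0.00 |
| | % | -5.1% | -4.5% | -3.6% | 0.0% | 0.0% |
| Nonroutine Analytical | Abs. | -0.16 | -0.06 | -0.05 | -0.02 | -0.01 |
| | % | -4.0% | -1.5% | -1.2% | -0.5% | -0.3% |
| Nonroutine DCP | Abs. | -0.23 | -0.10 | -0.09 | -0.02 | 0.05 |
| | % | -8.0% | -3.2% | -2.8% | -0.6% | 1.3% |
| Nonroutine Manual | Abs. | 0.04 | 0.01 | 0.01 | 0.02 | 0.01 |
| | % | 3.6% | 1.5% | 0.8% | 1.5% | 0.5% |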

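Panel C: Self-Employed

| Variable | Measure | All | Whites | White Males | White Males FTFY | White Males FTFY 2K |
|---|---|---|---|---|---|---|
| Observations | Abs. | 64763 | 46489 | 28682 | 22591 | 21559 |
| | % | 55.2% | 50.8% | 49.8% | 49.0% | 49.0% |
| Age | Abs. | -0.9 | -0.7 | -0.6 | -0.6 | -0.6 |
| | % | -2.0% | -1.6% | -1.4% | -1.3% | -1.3% |
| White | Abs. | -0.08 | 0.00 | 0.00 | 0.00 | 0.00 |
| | % | -10.2% | 0.0% | 0.0% | 0.0% | 0.0% |
| Female | Abs. | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 |
| | % | 3.2% | 3.1% | | | |
| Years of schooling | Abs. | -0.2 | -0.1 | -0.1 | 0.0 | 0.0 |
| | % | -1.4% | -0.4% | -0.4% | -0.3% | -0.3% |
| Mean earnings | Abs. | -3553 | -1735 | -978 | 126 | 150 |
| | % | -6.1% | -2.8% | -1.3% | 0.1% | 0.2% |
| Median earnings | Abs. | -2749 | -1627 | -1239 | 0 | -821 |
| | % | -8.0% | -4.5% | -2.6% | 0.0% | -1.5% |
| Annual worked hours | Abs. | -64 | -51 | -31 | -1 | -1 |
| | % | -3.1% | -2.4% | -1.3% | 0.0% | 0.0% |
| Full-Time Full-Year | Abs. | -0.03 | -0.02 | -0.02 | 0.00 | 0.00 |
| | % | -4.3% | -3.6% | -2.0% | 0.0% | 0.0% |
| Nonroutine Analytical | Abs. | -0.13 | -0.06 | -0.04 | -0.02 | -0.02 |
| | % | -3.1% | -1.3% | -0.8% | -0.4% | -0.4% |
| Nonroutine DCP | Abs. | -0.17 | -0.08 | -0.06 | -0.02 | -0.02 |
| | % | -4.5% | -2.1% | -1.3% | -0.5% | -0.4% |
| Nonroutine Manual | Abs. | 0.03 | 0.01 | 0.02 | 0.02 | 0.02 |
| | % | 3.5% | 1.4% | 1.9% | 1.7% | 1.7% |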

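Panel D: Self-Employed Unincorporated

| Variable | Measure | All | Whites | White Males | White Males FTFY | White Males FTFY 2K |
|---|---|---|---|---|---|---|
| Observations | Abs. | 42785 | 29506 | 16440 | 11804 | 11101 |
| | % | 56.7% | 51.7% | 50.3% | 49.3% | 49.4% |
| Age | Abs. | -1.0 | -0.8 | -0.7 | -0.7 | -0.7 |
| | % | -2.4% | -1.9% | -1.7% | -1.6% | -1.5% |
| White | Abs. | -0.09 | 0.00 | 0.00 | 0.00 | 0.00 |
| | % | -12.2% | 0.0% | 0.0% | 0.0% | 0.0% |
| Female | Abs. | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 |
| | % | 2.2% | 3.0% | | | |
| Years of schooling | Abs. | -0.2 | -0.1 | -0.1 | -0.1 | -0.1 |
| | % | -1.8% | -0.6% | -0.6% | -0.4% | -0.4% |
| Mean earnings | Abs. | -3150 | -1905 | -1375 | -210 | -124 |
| | % | -7.7% | -4.4% | -2.5% | -0.3% | -0.2% |
| Median earnings | Abs. | -2415 | -1632 | -1026 | -102 | -405 |
| | % | -9.8% | -6.3% | -3.0% | -0.3% | -1.0% |
| Annual worked hours | Abs. | -66 | -57 | -38 | -2 | -4 |
| | % | -3.4% | -3.0% | -1.8% | -0.1% | -0.2% |
| Full-Time Full-Year | Abs. | -0.03 | -0.02 | -0.02 | 0.00 | 0.00 |
| | % | -4.4% | -4.0% | -2.2% | 0.3% | 0.0% |
| Nonroutine Analytical | Abs. | -0.14 | -0.07 | -0.04 | -0.02 | -0.01 |
| | % | -3.7% | -1.7% | -1.0% | -0.5% | -0.3% |
| Nonroutine DCP | Abs. | -0.13 | -0.05 | -0.02 | 0.04 | 0.05 |
| | % | -4.1% | -1.6% | -0.5% | 1.1% | 1.3% |
| Nonroutine Manual | Abs. | 0.03 | 0.01 | 0.02 | 0.01 | 0.01 |
| | % | 2.7% | 0.6% | 1.2% | 0.6% | 0.5% |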

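Panel E: Self-Employed Incorporated

| Variable | Measure | All | Whites | White Males | White Males FTFY | White Males FTFY 2K |
|---|---|---|---|---|---|---|
| Observations | Abs. | 21978 | 16983 | 12242 | 10787 | 10458 |
| | % | 52.6% | 49.4% | 49.1% | 48.6% | 48.6% |
| Age | Abs. | -0.6 | -0.5 | -0.4 | -0.4 | -0.4 |
| | % | -1.4% | -1.1% | -1.0% | -1.0% | -1.0% |
| White | Abs. | -0.06 | 0.00 | 0.00 | 0.00 | 0.00 |
| | % | -6.7% | 0.0% | 0.0% | 0.0% | 0.0% |
| Female | Abs. | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 |
| | % | 3.1% | 2.1% | | | |
| Years of schooling | Abs. | -0.1 | 0.0 | 0.0 | 0.0 | 0.0 |
| | % | -0.4% | -0.1% | -0.1% | -0.1% | -0.1% |
| Mean earnings | Abs. | -1394 | -147 | 467 | 963 | 1003 |
| | % | -1.6% | -0.2% | 0.4% | 0.9% | 0.9% |
| Median earnings | Abs. | -1390 | -609 | -592 | -310 | -728 |
| | % | -2.5% | -1.1% | -0.9% | -0.4% | -1.0% |
| Annual worked hours | Abs. | -37 | -30 | -15 | 2 | 3 |
| | % | -1.6% | -1.3% | -0.6% | 0.1% | 0.1% |
| Full-Time Full-Year | Abs. | -0.02 | -0.02 | -0.01 | 0.00 | 0.00 |
| | % | -2.4% | -2.3% | -1.4% | -0.2% | 0.0% |
| Nonroutine Analytical | Abs. | -0.05 | -0.02 | -0.01 | -0.01 | -0.01 |
| | % | -1.1% | -0.4% | -0.2% | -0.2% | -0.2% |
| Nonroutine DCP | Abs. | -0.13 | -0.09 | -0.08 | -0.07 | -0.06 |
| | % | -2.5% | -1.7% | -1.4% | -1.2% | -1.2% |
| Nonroutine Manual | Abs. | 0.03 | 0.02 | 0.02 | 0.03 | 0.03 |
| | % | 3.3% | 2.1% | 2.4% | 3.0% | 2.9% |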


III: Job Task Requirements

III.A. Basics

The Dictionary of Occupational Titles (DOT) was first constructed in 1939 to help employment offices match job seekers with job openings. It provides information on the skills demanded of over 12,000 occupations. The DOT was updated in 1949, 1964, 1977, and 1991, and replaced by the O*NET in 1998.

Given the timing of our study, we use the 1991 DOT, and confirm the results with the 1977 DOT.

The DOT aggregates information into five skill categories. We use these aggregated job task requirements of individual occupations. To link the DOT measures to the CPS and NLSY79 data, we follow Autor, Levy, and Murnane (2003) and use the codes provided on David Autor’s website. (Reference: Autor, David H., Frank Levy, and Richard J. Murnane. 2003. “The skill content of recent technological change: An empirical investigation.” Quarterly Journal of Economics 118(4): 1279-1333.) We then use the unified IPUMS occupation codes to have consistent coding of occupations of individuals over time. This gives DOT measures for each person-year.
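The linking step described above can be pictured as two lookups: from a year-specific occupation code to a unified occupation code, and from the unified code to DOT task measures. The sketch below is only a toy illustration of that idea. The crosswalk entries, occupation labels, task values, and field names (`code_year`, `occ`, `occ_unified`) are all hypothetical, and the actual crosswalk files and their structure are not reproduced here.

```python
# Hypothetical crosswalk: (coding-scheme year, raw occupation code) -> unified code.
# Real crosswalks come from the sources named in the text; these entries are made up.
UNIFIED_OCC = {
    (1980, "occ_a"): "unified_1",
    (1990, "occ_b"): "unified_1",  # same occupation, different raw code
    (1990, "occ_c"): "unified_2",
}

# Hypothetical DOT task measures keyed by unified occupation code.
DOT_BY_OCC = {
    "unified_1": {"nr_analytical": 6.0, "nr_dcp": 7.0, "nr_manual": 0.5},
    "unified_2": {"nr_analytical": 3.0, "nr_dcp": 1.0, "nr_manual": 2.5},
}


def attach_dot(person_years):
    """Attach a unified occupation code and DOT measures to each person-year."""
    linked = []
    for row in person_years:
        unified = UNIFIED_OCC.get((row["code_year"], row["occ"]))
        tasks = DOT_BY_OCC.get(unified, {})  # empty if the code is unmapped
        linked.append({**row, "occ_unified": unified, **tasks})
    return linked


person_years = [
    {"person": 1, "year": 1985, "code_year": 1980, "occ": "occ_a"},
    {"person": 1, "year": 1995, "code_year": 1990, "occ": "occ_b"},
    {"person": 2, "year": 1995, "code_year": 1990, "occ": "occ_z"},  # unmapped
]

linked = attach_dot(person_years)
for row in linked:
    print(row)
```

In this toy example the first two person-years map to the same unified occupation despite different raw codes, which is the kind of consistency over time that a unified coding scheme is meant to provide. Unmapped codes are left without task measures rather than imputed.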

III.B. Industry

To calculate the job task requirements by industry, we use the weighted average job task requirements of workers in the industry. We weight by the number of hours worked (divided by 2000) of each worker.
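A minimal sketch of this hours-weighted average follows. Each worker's weight is annual hours worked divided by 2000, so a worker with 2000 hours counts as one unit. The function name `industry_task_means`, the record fields, and the sample values are hypothetical and serve only to illustrate the weighting.

```python
from collections import defaultdict


def industry_task_means(workers, task):
    """Hours-weighted mean of one task measure by industry.

    Each worker's weight is annual hours worked / 2000.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for worker in workers:
        weight = worker["hours"] / 2000
        weighted_sum[worker["industry"]] += weight * worker[task]
        weight_total[worker["industry"]] += weight
    # Skip industries whose workers have zero total hours.
    return {
        industry: weighted_sum[industry] / weight_total[industry]
        for industry in weighted_sum
        if weight_total[industry] > 0
    }


# Hypothetical workers, for illustration only.
workers = [
    {"industry": "A", "hours": 2000, "nr_analytical": 4.0},
    {"industry": "A", "hours": 1000, "nr_analytical": 1.0},
    {"industry": "B", "hours": 1500, "nr_analytical": 2.0},
]

means = industry_task_means(workers, "nr_analytical")
print(means)
```

In industry A the full-year worker carries twice the weight of the half-year worker, so the industry mean is (1.0 × 4.0 + 0.5 × 1.0) / 1.5 = 3.0 rather than the unweighted 2.5.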

III.C. Variables

Nonroutine Analytical: The degree to which the task demands analytical flexibility, creativity, reasoning, and generalized problem-solving.

Nonroutine Direction, Control, Planning (DCP): The degree to which the task demands complex interpersonal communications such as persuading, selling, and managing others.

Nonroutine Manual: The degree to which the task demands eye, hand, and foot coordination.

Routine Analytical: The degree to which the task requires the precise attainment of set standards.

Routine Manual: The degree to which the task requires repetitive manual tasks.