Call Me Maybe: Experimental Evidence on Using Mobile Phones to Survey African Microenterprises*

    Rob Garlick†, Kate Orkin‡, and Simon Quinn§

    March 15, 2016

LATEST VERSION: http://www.robgarlick.com/MobilePhonesPaper.pdf · ONLINE APPENDIX: http://www.robgarlick.com/MobilePhonesAppendix.pdf · PRE-ANALYSIS PLAN: https://www.socialscienceregistry.org/trials/346 · QUESTIONNAIRES: http://www.robgarlick.com/MobilePhonesQuestionnaires.zip

    Abstract

We provide the first experimental evidence on the usefulness and viability of high-frequency phone surveys of microenterprises. We randomly assign microenterprises to three groups, who are interviewed face-to-face at monthly intervals (mimicking a standard method of collecting data from microenterprises), face-to-face at weekly intervals, and by mobile phone at weekly intervals. We find high-frequency data collection is useful: it captures extensive volatility in a number of measures not visible in less frequent data collection. We also find it viable: on most measures and at most quantiles of the distribution, data patterns are indistinguishable in interviews conducted weekly or monthly and face-to-face or by phone. We administer a uniform face-to-face endline survey and verify that high-frequency data collection (either over a mobile phone or in person) does not substantially alter microenterprise owner behaviour over a 12-week period. These results demonstrate an important role for phone-based data collection in development research, particularly given the substantial cost differences that we document.

*We benefitted from comments from Nicola Branson, Markus Eberhardt, Simon Franklin, David Lam, Murray Leibbrandt, Gareth Roberts, Volker Schoer, Duncan Thomas and seminar audiences at the Development Economics Network Berlin, Duke University, NEUDC 2016, the University of Cape Town, the University of Oxford, and the University of the Witwatersrand. We thank Bongani Khumalo, Thembela Manyathi, Mbuso Moyo, Mohammed Motala, Egines Mudzingwa and the staff and fieldworkers at the Community Agency for Social Enquiry (CASE) for their survey fieldwork; Mzi Shabangu, Arul Naidoo and the GIS section at Statistics South Africa for enumeration area maps; and Rose Page, Richard Payne and Gail Wilkins at the CSAE. This project was funded by the Private Enterprise Development for Low-Income Countries (PEDL), a joint research initiative of the Centre for Economic Policy Research (CEPR) and the Department for International Development (DFID), and we thank Chris Woodruff and the PEDL team for their support.

†Department of Economics, Duke University; [email protected].

‡Department of Economics, Centre for the Study of African Economies and Merton College, University of Oxford; [email protected].

§Department of Economics, Centre for the Study of African Economies and St Antony’s College, University of Oxford; [email protected].

1 Introduction

We run the first randomized controlled trial to test the usefulness and viability of phone interviews and high-frequency interviews of microenterprises. We draw a representative sample of microenterprises in Soweto, South Africa, and randomly assign respondents to three treatment groups. The first group is interviewed face-to-face at monthly intervals, to mimic a standard method of collecting data from microenterprises. The second group is interviewed face-to-face at weekly intervals. This preserves one key feature of the benchmark method — that respondents are interviewed in person — but allows us to test the consequences of collecting data at a much higher frequency. The third group is interviewed at weekly intervals by mobile phone. We find mobile phone interviews to be accurate: on most measures and at most quantiles of the distribution, data patterns are indistinguishable in interviews conducted weekly, whether by mobile phone or face-to-face. Using an endline administered face-to-face, we find little evidence that high-frequency data collection (either over a mobile phone or in person) alters microenterprise behaviour. We are powered to detect relatively small differences between different interview methods (the median minimum detectable effect across outcomes is 0.07 standard deviations), so our findings are not explained by low sample sizes or noisy data. We also find interviews at weekly intervals to be useful: they capture extensive volatility in a number of measures that is not visible in less frequent data collection. We conclude that mobile phone data collection offers considerable cost savings with little, if any, reduction in data quality.

    1.1 Motivation

Development economists are likely to find high-frequency data collection useful. First, high-frequency interviews improve statistical power, particularly for noisy and relatively less autocorrelated outcomes such as business profits, household incomes and expenditures, or for relatively rare and episodic outcomes, like shocks (McKenzie, 2012). Multiple measures at short intervals enable researchers to average out noise, increasing power. Second, high-frequency follow-up measures in experiments can be very useful for measuring how treatment effects evolve over time: see, for example, Frison and Pocock (1992), McKenzie (2012) and Franklin (2015).1 Third, high-frequency data is useful for measuring the volatility faced by microenterprises — and, in turn, for characterising heterogeneity in volatility faced by different microenterprises. Households and microenterprises in developing countries often face highly variable flows, lumpy expenses and extreme seasonal variation.2 Indeed, many arguments in favour of certain policy interventions — such as microfinance — centre on their ability to enable households to smooth consumption (Rosenzweig and Wolpin, 1993; Gertler, Levine, and Moretti, 2009; Banerjee, Duflo, Glennerster, and Kinnan, 2015; Karlan and Zinman, 2011). A reliable and cheap method of high-frequency data collection will enable researchers to characterize volatility; better understand patterns in volatility over time, seasonally and in response to shocks; better understand whether interventions reduce volatility; and examine whether interventions work differently for households experiencing more or less volatility.

Against these advantages, there are several large potential costs to using weekly surveys. First, some evidence suggests that different survey methods induce substantially different reporting by respondents in developing country contexts. For example, Das, Hammer, and Sánchez-Paramo (2012) find that length of recall period has a large impact on reported health and use of health services. Similarly, Beegle, De Weerdt, Friedman, and Gibson (2012) find significantly lower reported consumption using recall modules than consumption diaries, particularly among poorer households, while Caeyers, Chalmers, and De Weerdt (2012) find that reported consumption is higher for paper-based interviewing than computer-assisted interviewing. Baird and Özler (2012) and Barrera-Osorio, Bertrand, Linden, and Perez Calle (2011) find significant over-reporting in self-reports on school attendance and enrolment compared to administrative records and observational data. Relatedly, enumerator error may differ by data capture method. Patnaik, Brunskill, and Thies (2009) find lower error rates when enumerators dictate survey answers over the phone to data capturers than when using PDAs. Of six RCTs comparing PDAs to paper in developed world contexts, two found PDAs more accurate, one found paper more accurate, and three found no differences (Lane, Heddle, Arnold, and Walker, 2006).

1 Similarly, high-frequency measures can be useful to test how treatment effects differ between periods of higher and lower microenterprise volatility: for example, Drexler, Fischer, and Schoar (2014) and Karlan and Valdivia (2011) find business training programs improve sales in bad months, but on average find no significant treatment effects.

2 For example, McKenzie and Woodruff (2008) report that the percentage change in monthly profits from one calendar quarter to the next in data from Mexico ranges from -97.6 to +4110. The limited evidence suggests the majority of large changes reported are genuine, rather than measurement error. In one panel of firms with no paid employees in Accra, Ghana, Fafchamps, McKenzie, Quinn, and Woodruff (2012) queried owners if they reported changes in profits or sales from one quarter to the next over a threshold, but owners corrected only between 3 and 13 per cent of the large changes in profits and sales in a survey round. See also Collins, Morduch, Rutherford, and Ruthven (2009).

Second, different survey frequencies may even change actual respondent behaviour, although results vary by the type of outcome. Reminder effects have only been found for outcomes where respondents pay limited attention to their behaviour in the absence of the survey. Zwane, Zinman, Van Dusen, Pariente, Null, Miguel, Kremer, Karlan, Hornbeck, Gine, Duflo, Devoto, Crepon, and Banerjee (2011) randomize the frequency of face-to-face surveys on health behaviour and diarrhoea incidence during a trial of water chlorination products. Among households surveyed biweekly for 18 rounds, there is less reported child diarrhoea and more uptake of the product, compared to households surveyed in just three of the 18 rounds; the authors attribute this to the effects of being reminded more frequently to invest in water purification. Uptake of the product is measured through sampling chlorine in the household’s water as well as self-reports, so differences are not due to social desirability bias. In the US, Stango and Zinman (2013) find that taking a survey which asks about whether clients take an overdraft and pay fees reduces the probability of incurring any overdraft fee by an estimated 3.7 percentage points on a base of 30 per cent, conditional on being in a population which takes surveys for a market research company. Each survey within a two-year period reduces the probability by 1.7 percentage points. Effects are stronger for respondents with lower education and financial literacy, which is consistent with the explanation that respondents do not pay attention to overdrafts. Relatedly, market researchers find that asking subjects directly to forecast their likelihood of engaging in a targeted behaviour increases the likelihood that they engage in that behaviour (Dholakia, 2010).

However, high-frequency surveys on economic outcomes in developing countries find that being surveyed frequently does not change behaviour. Beaman, Magruder, and Robinson (2014) found Kenyan microenterprise owners often forgot to bring cash in small denominations to break larger bills, losing sales worth between 5 and 8 per cent of weekly profits. They administered a weekly survey asking about profits, sales, sales and profits lost to missing change and time spent looking for change, for between 8 and 18 weeks. Simply being surveyed three times about cash management had no significant effect on profits, sales or the amount of change brought to work, although it reduced the number and amount of lost sales due to not having change. However, an information intervention telling firms how much money they were losing reduced lost sales and increased profits. Franklin (2015) finds surveying randomly selected unemployed Addis Ababa youth every week by phone for three months has no effect on job search and employment outcomes compared to surveying them only at the start and end of the period: he argues that respondent awareness is not likely to be an important determinant of employment status.3

Conducting high-frequency interviews by phone is cheaper and easier than revisiting respondents multiple times in a short period for face-to-face interviews.4 Mobile phone data collection may reduce attrition that arises when respondents migrate or travel temporarily, provided migrants keep the same mobile phone number, and is likely to reduce tracking costs. Answering a phone call may be more convenient for respondents than being at a pre-arranged location for an interview.

3 Research which finds effects on behaviour of being surveyed at all also finds effects differ by the type of outcome. In another part of their study, Zwane, Zinman, Van Dusen, Pariente, Null, Miguel, Kremer, Karlan, Hornbeck, Gine, Duflo, Devoto, Crepon, and Banerjee (2011) find that being surveyed at all had no effect on take-up of loans but did affect take-up of medical insurance. They argue consumption needs were likely salient anyway, but that a survey might make medical risks more salient.

4 Researchers surveying conflict areas (van der Windt and Humphreys, 2013), remote areas (Bauer, Akakpo, Enlund, and Passeri, 2013) or areas suffering contagious disease outbreaks (Himelein, 2014) have used mobile phone surveys, and these are particularly attractive if existing sampling frames are available.

But can researchers rely on phone interviews? An extensive literature from developed countries suggests differences between phone and face-to-face interviews are mostly small. Jackle, Roberts, and Lynn (2006) randomly assigned European Social Survey respondents in Hungary and Portugal, sampled from a single sampling frame, to be interviewed either face-to-face or over a fixed line or mobile phone. Although variables which were most likely to be affected were selected for the experiment, there were differences in only 8 of 33 items, and differences were small; they were larger than one standard deviation in only one variable. Other randomised trials also find significant differences by mode in only a few variables, no more than would be expected by chance (Groves, 1979; Sykes and Hoinville, 1985; Körmendi, 2001; de Leeuw, 1992; Nord and Hopwood, 2007). Jackle, Roberts, and Lynn (2006) also find no evidence that mode effects are stronger for respondents with lower cognitive ability.5

Even if responses on average do not differ, phenomena such as item non-response, non-differentiation (Krosnick, 1991), acquiescence (Smith and Fischer, 2008) or response order effects (Schwarz, Hippler, and Noelle-Neumann, 1992) may be more prevalent in phone surveys and thus affect certain types of outcomes (such as attitude questions using scales). However, randomised trials comparing phone and face-to-face surveys find no differences (Jackle, Roberts, and Lynn, 2006; de Leeuw, 1992).

    1.2 Contribution and Results

There is very limited research to date on the viability of mobile phone interviewing in developing countries – and, to our knowledge, no research regarding microenterprises. (A household survey conducted biweekly by the World Bank and Gallup in Honduras found no differences between weeks when data was collected using voice calls and weeks when it was collected face-to-face from the same respondents (Gallup, 2012).6)

5 Some political scientists in the US do find more significant differences (Holbrook, Green, and Krosnick, 2003; Shanks, Sanchez, and Morton, 1983). However, most research is done to compare polling results from samples chosen using different methods – face-to-face surveys sampled via conventional listing of randomly selected street blocks and telephone surveys sampled by random digit dialling – and cannot separate the effects of differences in sampling from differences in interview mode. This literature is less relevant if one is concerned with the effects of mode of interview on samples chosen using the same method.

We collect a random sample of microenterprises in a low-income urban area outside Johannesburg, South Africa, by screening all households within sampled census enumeration areas to find households who were running at least one microenterprise at the beginning of the study. We remove enterprises that had more than two full-time employees, provided professional services, operated fewer than three days a week, or whose owners did not have a mobile phone. We randomly allocate enterprises to be surveyed either face-to-face weekly for 12 weeks, over the phone weekly for 12 weeks, or face-to-face every 4 weeks for 12 weeks. All surveys use an identical questionnaire, which takes approximately 15 minutes to administer and measures seventeen enterprise outcomes. We then conduct a common face-to-face endline with all microenterprises.

First, we find little evidence that interview frequency (weekly vs monthly) or interview medium (face-to-face vs phone) changes behaviour. We test whether endline responses differ between treatment groups and interpret this as a test of whether our interview techniques changed actual microenterprise outcomes. Behavioural changes may arise if regular surveys make information about enterprise performance more salient to owners. Owners interviewed weekly report a slightly but statistically significantly higher stock of fixed assets than owners interviewed monthly. Some insignificant but non-trivial differences in measures are weakly consistent with reminder effects but do not provide strong evidence of behavioural changes. Owners interviewed by phone report significantly lower profits and slightly lower spending on stock/inventory, withdrawals for household use, and number of full-time employees. We do not believe these differences by interview medium are consistent with simple behavioural change explanations.

6 For field notes describing the use of mobile phones in surveys, see Dillon (2012) and Croke, Dabalen, Demombynes, Giugale, and Hoogeveen (2014).

Second, we find mobile phone interviews are accurate compared to a benchmark of face-to-face interviews. We compare the reported outcomes for the panel surveys conducted after the baseline and before the endline. On most measures and at most quantiles of the distribution, data patterns are very similar across weekly phone and weekly face-to-face interviews. Owners interviewed by phone report working fewer hours and taking less money from the enterprise for their household (as well as lower stock/inventory levels), which may reflect differential strength of social desirability bias across interview media.

Third, we provide evidence on the extent and implications of attrition from a high-frequency microenterprise panel. Most microenterprises miss multiple scheduled interviews but the level of attrition does not vary with the medium of interview. Attrition is slightly higher for high- than low-frequency interviews but attrition does not increase through time. We interpret this pattern as evidence of the logistical challenges of conducting high-frequency interviews, not fatigue or irritation by respondents. The reasons for attrition are generally balanced across groups, though phone interviews yield slightly higher rates of refusal and incorrect contact information (presumably due to changing phone numbers). Attrition is predicted by few baseline microenterprise characteristics and the results reported above are robust to adjustment using inverse-probability-of-attrition weights. We conclude that attrition is an important component of panel data collection in this setting, but that high-frequency phone interviews do not produce substantially worse data quality due to attrition than conventional methods.

Our results show that development researchers studying microenterprises should seriously consider phone interviews. These are substantially cheaper than face-to-face interviews and produce data of comparable quality. These results may also apply to studies of households, particularly given the intertwined nature of consumption and production behaviour in many developing countries (Singh et al., 1986). The viability of phone surveys depends on phone penetration, but this is high and rising rapidly in most developing countries. South Africa, and the population we study, are hardly outliers in this respect.

2 Design and Data

    2.1 Context

The study takes place in the township of Soweto, in South Africa’s Gauteng province.7 Soweto’s population in October 2011 was approximately 1.28 million people. Residents are almost all Black Africans (99%) and most speak one of South Africa’s local African languages (96%).8 Of the 0.9 million residents aged 15 or older, 41% engage in some form of economic activity (including occasional informal work) and 78% of these adults work primarily in the formal sector. 19% of households report receiving zero annual income and another 42% report receiving less than $10 per day. These income figures reflect South Africa’s middle income status and the fact that Soweto includes relatively few recent migrants.9

    2.2 Sample definition and sampling strategy

We define an eligible microenterprise as any enterprise that: (i) has zero, one or two full-time employees (in addition to the owner); and (ii) does not provide a professional service (e.g. medicine); and (iii) operates at least three days each week; and (iv) whose owner has a mobile phone. The first two conditions operationalize our definition of ‘microenterprise’, consistent with the general approach in the development economics literature. The third condition excludes microenterprises that are seasonal, occasional (e.g. selling food at soccer games), or run over weekends in addition to wage employment. We impose this condition partly to ensure week-to-week variation in the outcomes of interest. The fourth condition is necessary to allow phone surveys and does not bind for any otherwise eligible microenterprises.10

7 ‘Township’ in South Africa typically refers to low-income urban areas that include both formal and informal housing and were designated as Black African living areas under apartheid’s residential segregation laws. Soweto is one of the older townships around Johannesburg. Relative to other townships, it includes more middle-income areas and houses fewer recent migrants.

8 Statistics South Africa asks people to describe themselves in the Population Census in terms of five racial population groups: Black African, White, Coloured, Indian or Asian, and Other or unspecified. We follow this terminology.

9 Authors’ own calculations, from the 2011 Census public release data.

We used a three-stage clustered sampling scheme, designed so that our sample is representative of the population of households who live in ‘low-income’ areas of Soweto and own eligible microenterprises. This is discussed in detail in the Online Appendix.

    2.3 Data collection and assignment to data collection methods

We employed a four-stage data collection process.11 In the first stage, we conducted a screening survey with all households in the sampled small area layers (SALs). This short survey identified whether anyone in the household owned a microenterprise and, if so, asked further questions about the microenterprise and owner. This survey established whether the microenterprise met the eligibility criteria laid out in section 2.2 and collected phone numbers for the owner and other household members. The screening process took place between September 2013 and February 2014 and realized a sample of 1081 eligible microenterprises. Where a household included multiple eligible microenterprises, we randomly selected one for the final sample, which reduced the sample to 1046.
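As a concrete illustration, the sketch below applies the section 2.2 eligibility criteria and the one-enterprise-per-household rule. It is a minimal sketch under assumed data structures and field names (`sampled_sals`, `owner_has_mobile`, and so on), not the fieldwork protocol itself.

```python
import random

def is_eligible(ent):
    """Eligibility criteria from section 2.2 (field names are illustrative)."""
    return (
        ent["full_time_employees"] <= 2        # (i) at most two employees
        and not ent["professional_service"]    # (ii) no professional services
        and ent["days_open_per_week"] >= 3     # (iii) operates 3+ days a week
        and ent["owner_has_mobile"]            # (iv) owner owns a mobile phone
    )

def screen(sampled_sals, seed=2013):
    """Screen every household in each sampled SAL and build the final sample."""
    rng = random.Random(seed)
    final_sample = []
    for sal in sampled_sals:
        for household in sal["households"]:
            eligible = [e for e in household["enterprises"] if is_eligible(e)]
            if eligible:
                # Where a household runs several eligible enterprises,
                # randomly select one for the final sample.
                final_sample.append(rng.choice(eligible))
    return final_sample
```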

In the second stage, we conducted a baseline survey with all eligible microenterprise owners identified in the screening stage. These interviews were conducted at the enterprise premises to verify that the enterprises existed, whereas the screening survey was conducted at the owners’ homes. The baseline survey asked 30 questions about the microenterprise and owner and recorded the GPS coordinates for the enterprise location. We completed the baseline questionnaire between December 2013 and February 2014 with only 895 of the 1046 microenterprise owners (85 percent) identified in the screening stage. Of the remaining 183 owners, 67% could not be contacted using phone calls or home visits, 18% closed their enterprise between screening and baseline, 8% relocated outside Soweto, 6% refused to be re-interviewed, and 1% did not answer key questions in the baseline survey.

10 This pattern is unsurprising given that 87% of South Africans aged 18 or older own a mobile phone (Mitullah and Kama, 2013). This rate is only slightly higher than other African countries where comparable surveys were conducted by Afrobarometer between 2011 and 2013: 76% in Ghana, 81% in Kenya, 74% in Nigeria, 65% in Tanzania, and 62% in Uganda. Political polling firms in developed countries consider 80% phone ownership to be the threshold above which reliable collection of representative surveys over the phone can occur, with some reweighting (Croke, Dabalen, Demombynes, Giugale, and Hoogeveen, 2014).

11 All questionnaires are available for download at www.robgarlick.com/research.

Between the second and third stages, we randomly divided the 895 baseline microenterprises into three data collection groups: monthly in-person surveys (298 microenterprises), weekly in-person surveys (299 microenterprises), and weekly phone surveys (298 microenterprises). Following Bruhn and McKenzie (2009), we first created strata based on (i) gender, (ii) number of employees, (iii) microenterprise sector and (iv) enterprise location.12 This yielded 149 strata with 1-51 microenterprises each. We then split each stratum randomly between the three data collection groups. This generated some residual microenterprises in each stratum (as not all strata sizes are multiples of three). We randomly assigned these microenterprises to data collection groups, with the restriction that a pair of residual microenterprises in a stratum would always go into separate groups. We finally assigned microenterprises to fieldworkers to ensure that each owner would be interviewed in her or his preferred language (English, Sotho, Tswana, or Zulu) and to minimize fieldworkers’ travel time across microenterprise locations. Fieldworkers were thus randomly assigned to data collection groups but not to the microenterprises within those groups.
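The following sketch illustrates this stratify-then-split procedure: an even three-way split within each stratum, with residual firms drawn into distinct groups. The stratum variables follow the text, but the record layout, group labels, and seed are illustrative assumptions.

```python
import random
from collections import defaultdict

GROUPS = ["monthly_in_person", "weekly_in_person", "weekly_phone"]

def assign_groups(firms, seed=2014):
    """Stratified random assignment of firms to the three data collection groups."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for f in firms:
        key = (f["gender"], f["n_employees"], f["sector"], f["location"])
        strata[key].append(f)

    assignment = {}
    for members in strata.values():
        rng.shuffle(members)
        n_full, n_resid = divmod(len(members), 3)
        # Split the evenly divisible part of each stratum across the groups.
        for j, f in enumerate(members[: 3 * n_full]):
            assignment[f["id"]] = GROUPS[j % 3]
        # Residual firms: draw distinct groups, so a pair of residuals in the
        # same stratum never lands in the same group.
        for f, g in zip(members[3 * n_full:], rng.sample(GROUPS, n_resid)):
            assignment[f["id"]] = g
    return assignment
```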

In the third stage, we conducted repeated surveys with each microenterprise owner. These were conducted in-person or on mobile phones every week or every four weeks. We randomly staggered the start dates for this stage of the data collection to allow fieldworkers to acclimatise to the high-frequency surveys. Within each data collection group, 25% of enterprises were interviewed between March 10 and May 30, 25% between March 17 and June 6, 25% between March 24 and June 13, and 25% between March 31 and June 20. We successfully completed 4070 of 8058 repeated surveys and we discuss the pattern of attrition in detail in section 2.5.

12 We used the subplace in which the microenterprise was located as the location block. This generally differed from the subplace in which the household was located, which we used for the initial sampling scheme.

In the fourth stage, we conducted an endline survey with each microenterprise owner. This survey was conducted in person at the microenterprise location, irrespective of the assigned data collection method for the repeated surveys. This common endline format means that observed endline differences across randomly assigned data collection groups must reflect persistent effects of the data collection method; we interpret differences at endline as ‘real’ differences in microenterprise outcomes, rather than measurement effects. We successfully completed 591 of 895 endline interviews and we discuss the pattern of attrition in detail in section 2.5. Throughout the third and fourth stages of data collection, microenterprise owners received a mobile phone airtime voucher for every fourth interview that they completed (ZAR12, approximately USD0.97).13 This equates the per-interview payout across data collection groups.14

    2.4 Data description

Summary statistics for the final sample of all 895 microenterprises are shown in the first two columns of Table 1. We draw two conclusions from this table. First, the randomisation is well balanced; see columns 3 to 6 of the table.15 Second, our sample is broadly similar to samples of microenterprises in other contexts, so our results are likely to hold external validity. In particular, the microenterprises are relatively well-established (average age 7 years) and have a diversified client base (mean and median numbers of clients are 34 and 20 respectively, though this varies by sector). By design, our microenterprises have at most 2 employees beside the owner: 61% have no other employees, while 28% and 11% have respectively one and two other employees. Most microenterprises operate in food services (43%) or retail (32%). Very few are formally registered in any sense but 20% keep written financial records and 57% conduct business over the phone at least weekly.

13 We use an exchange rate of USD1 to ZAR10.27, the South African Reserve Bank rate at the start of the survey (31 August 2013).

14 Croke, Dabalen, Demombynes, Giugale, and Hoogeveen (2014) found that varying the amount of the mobile phone airtime voucher did not increase the response rate in mobile phone surveys in either Tanzania (either US$0.17 or US$0.42) or South Sudan (either US$2 or US$4).

Microenterprise owners live in households with an average of 3.8 other people, though this is widely dispersed with an interdecile range of 1 to 7. These households accrue a mean monthly income of ZAR4050 (approximately USD380 at the time of the survey) across all sources, which falls in the fourth decile for all households across the country and the seventh decile of all households in Soweto.16

    < Table 1 here. >

    2.5 Interview completion

We show attrition rates for the repeated and endline interviews in Table 2 (columns 1 and 2 respectively). We define ‘attrition’ as a binary variable equal to one if and only if a scheduled interview is not successfully completed with the target microenterprise owner. In short, we find high attrition for repeated interviews, and marginally higher endline attrition for respondents who have been interviewed weekly by phone. The attrition rate increases very slightly over the course of the survey period (Figure 1) but a substantial fraction of microenterprises complete all their assigned interviews (Figure 2). Under our definition of attrition, a microenterprise can attrit in week t and return to the sample in subsequent weeks. Attrition is moderately autocorrelated: microenterprises surveyed in week t are 39 percentage points more likely to be surveyed in week t + 1 than microenterprises not surveyed in week t.

16 This is the average across the 87% of microenterprise owners who are willing to answer this question. There are essentially no missing values for the other variables.
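The persistence figure above can be computed directly from a long-format completion indicator; a minimal sketch, assuming a dataframe with one row per microenterprise-week and illustrative column names:

```python
import pandas as pd

def completion_persistence(df):
    """Difference in P(completed in week t+1) between firms that did and did
    not complete their scheduled interview in week t; the text reports roughly
    39 percentage points. Assumes one row per firm-week with a 0/1 `completed`."""
    df = df.sort_values(["firm_id", "week"]).copy()
    df["completed_next"] = df.groupby("firm_id")["completed"].shift(-1)
    means = (
        df.dropna(subset=["completed_next"])
        .groupby("completed")["completed_next"]
        .mean()
    )
    return means.loc[1] - means.loc[0]
```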

    < Table 2 here. >

    < Figure 1 here. >

    < Figure 2 here. >

We show attrition rates by screening and baseline characteristics in Table 3. We report marginal effects from a fractional logit regression of the proportion of missed repeated interviews in column 1 and from a logit regression of binary endline attrition in column 2 (Papke and Wooldridge, 1996). Attrition in both the repeated and endline data collection stages does vary systematically but the repeated attrition in particular is driven by a relatively small number of characteristics. Repeated stage attrition is higher for microenterprise owners who are not Black African (though this group is tiny), with more education, and who do not answer the baseline survey question about total household income. The same set of variables predict endline attrition (though with slightly different coefficients), as do some home languages, number of employees, owners’ growth plans for their microenterprise, and whether owners regularly conducted business over the phone before the survey period. We interpret the results in Table 3 as evidence that attrition is non-random. However, the extent of the non-randomness does not strongly vary across data collection methods: the goodness-of-fit measures for group-specific fractional logit regressions are essentially identical across data collection groups.

    < Table 3 here. >
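The Table 3 estimates can be reproduced in outline with a Papke-Wooldridge fractional logit, that is, a binomial GLM with a logit link applied to the proportion of missed interviews, alongside a standard logit for endline attrition. The sketch below uses statsmodels; the dataframe `firms` and all covariate names are illustrative assumptions, not the paper's exact specification.

```python
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Column 1: fractional logit (Papke and Wooldridge, 1996) for the share of
# scheduled repeated interviews that were missed. A binomial GLM with logit
# link handles an outcome in [0, 1]; robust SEs guard against the fractional
# outcome not following the binomial variance.
frac = smf.glm(
    "missed_share ~ education + not_black_african + refused_income_question",
    data=firms,                      # assumed one-row-per-firm dataframe
    family=sm.families.Binomial(),
).fit(cov_type="HC1")

# Column 2: ordinary logit for binary endline attrition.
endline = smf.logit(
    "endline_attrit ~ education + not_black_african + refused_income_question",
    data=firms,
).fit()

print(frac.summary())
print(endline.get_margeff().summary())  # marginal effects, as reported in Table 3
```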

We do not believe that there is a clear economically-interpretable pattern to the non-random attrition. An obvious model implies that attrition will be higher for microenterprise owners whose time is (self-perceived to be) more valuable or who find it more difficult to intertemporally shift their time expenditure from other activities to answer survey questions. But attrition does not vary systematically with childcare responsibilities, the number of other employees in the microenterprise, or microenterprise sector. Similarly, attrition does not vary systematically with variables that might proxy for the difficulty owners face in determining the answers to our survey questions: microenterprise age, the presence of written records, registration for tax, financial literacy test results, or digit span recall test results.

    < Table 4 here. >

Reasons given for attrition differ for phone interviews, but do not systematically differ between weekly and monthly in-person interviews. We show these reasons for attrition — as recorded by fieldworkers — in Table 4. Refusals and owners reporting that they are too busy to complete the interviews are more common for phone interviews than in-person interviews. This may be due to a weaker rapport between fieldworkers and respondents without face-to-face contact. Wrong contact information leads to more attrition in the phone than in-person interviews, which presumably reflects owners changing their mobile phone numbers. Although we recorded multiple phone numbers (for owners and their household members), the phone interviews relied on time-invariant phone numbers whereas the in-person interviews required that respondents kept either their phone numbers or their enterprise location. However, the 2 percentage point higher rate of incorrect contact information for the phone interviews is reassuringly low. Attrition due to enterprise closure and owner relocation does not differ by data collection method, consistent with the fact that these phenomena should not be influenced by the surveying methods.

    2.6 Attrition benchmarks

The attrition rates in our sample are in line with those found in a number of other studies using mobile phone interviews. In their surveys of 1,500 households in Honduras and Peru, Gallup (2012) have broadly similar rates of attrition, despite following standard Gallup protocols for recontacting households who could not be reached initially.17 In their mobile phone survey in Dar es Salaam, Croke, Dabalen, Demombynes, Giugale, and Hoogeveen (2014) lost 16 per cent of respondents between baseline and round one (92 of 550 respondents). The authors then completed 66 per cent of planned interviews with the remaining 458 respondents. Franklin (2015) surveyed 551 unemployed young men and women in Addis Ababa available for work by phone each week for 11 weeks. With the subsample of respondents who agreed to participate further in the survey, he then conducted 4510 successful interviews out of a planned 6061: a success rate of 74.4 per cent.18 Beaman, Magruder, and Robinson (2014) identified 1195 microenterprise owners in their census in small towns in Kenya, but only 793 could be found again for invitation into their study. They enrolled 508 of these (64 per cent) in the study. They thus provided two opportunities for owners to opt out of the study before beginning, and 46 per cent of their random sample did so. They lost a much larger portion of their sample between screening and the first round. They planned 5180 weekly visits and only missed 7 per cent of these.19 In sum, we do not believe that attrition in our sample affects our estimation results (as we discuss further shortly). However, relatively high attrition is likely to be a challenge for any researchers using mobile phone collection for microenterprises, and our study is no exception.

17 In Peru, the authors were able to contact only 33 per cent of households identified in their random sample baseline by phone in Round 1, dropping to 25 per cent by Round 6 (with higher attrition among poorer, less educated and more rural households). In Honduras, the authors contacted 59 per cent of households in Round 1, dropping to 50 per cent in Round 6.

    3 Results

This section discusses our estimating methods and presents our results. Our estimation methods follow closely our pre-analysis plan.20 We consider the following questions:

1. Does interview method affect responses throughout the survey?

2. Does interview method affect microenterprise performance (as measured through a face-to-face endline)?

18 Because these individuals are unemployed, they may have a lower marginal cost of time than the individuals in our study, who are running small enterprises.

19 We should expect attrition rates to differ depending on whether a study is urban or rural. For example, much lower attrition rates were reported in a 14-round mobile phone survey in rural Tanzania, with an average of 191.2 respondents of 195 interviewed in each round (Dillon, 2012). However, rounds were only held every three weeks, rather than every week. The author notes that the sample was highly clustered in small villages, so that other villagers helped enumerators find respondents who did not answer calls or whose phones were lost or not charged.

20 Our pre-analysis plan is available at https://www.socialscienceregistry.org/trials/346.

We are primarily interested in testing the effect of different measurement techniques on a set of enterprise performance outcomes (outlined shortly). We also construct several measures of consistency in reporting. First, we construct the absolute value of (sales minus costs, minus profits). In principle, this should be zero for all microenterprises; we can therefore interpret this as a measure of accounting error and/or misunderstanding of these concepts (as they are commonly applied in microenterprise surveys). Second, we use two measures of enumerator perceptions: whether the enumerator believes that the respondent answered honestly and whether the enumerator believes that the respondent answered carefully. Each question is recorded on a five-point Likert scale; for each question separately, we code a dummy variable for (i) whether the response is at or above the sample median for the question, or (ii) whether the response is below that sample median. Finally, we use a dummy variable for whether the respondent reports having referred to written financial records or notes in answering the questions.
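In code, these consistency measures amount to a few column transformations; a minimal sketch, with illustrative column names on an assumed round-level dataframe `rounds`:

```python
import pandas as pd

# Accounting error: |sales - costs - profits|, zero for fully consistent answers.
rounds["acct_error"] = (rounds["sales"] - rounds["costs"] - rounds["profits"]).abs()

# Enumerator-perception dummies: at/above vs below the sample median of each
# five-point Likert item.
for col in ["enum_honest", "enum_careful"]:
    rounds[f"{col}_high"] = (rounds[col] >= rounds[col].median()).astype(int)

# Whether the respondent consulted written financial records or notes.
rounds["used_records"] = rounds["referred_to_records"].astype(int)
```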

    3.1 Does interview method affect survey responses?

We begin by testing effects on reported responses during the repeated interviews — that is, for the questionnaires after the baseline and before the endline. For respondents in the first and second treatment groups, this refers to the 12 surveys conducted at weekly intervals. For respondents in the control group, this refers to the three surveys conducted at monthly intervals. In this section, we use $t$ to index the calendar week of interview.21 We index microenterprises by $i$ and outcome variables by $k$; $Y_{kit}$ therefore refers to the response of microenterprise $i$ to outcome $k$ in week $t$. We use $T_{1i}$ and $T_{2i}$ respectively as dummies for being assigned to treatment method 1 (i.e. face-to-face interviews at weekly intervals) and treatment method 2 (i.e. phone interviews at weekly intervals). If $Y_{kit}$ is a continuous variable, we normalise first by the mean and standard deviation of the control group (i.e. those interviewed face-to-face at monthly intervals).

21 We use time dummy variables for actual calendar weeks in our main specification to allow for common shocks. In the pre-analysis plan, we intended that $t$ would refer to weeks since the start of the survey: for microenterprises in the first and second treatment groups, this would be $t \in \{1, \dots, 12\}$. For a microenterprise in the control group, this would be either $t \in \{1, 5, 9\}$, $t \in \{2, 6, 10\}$, $t \in \{3, 7, 11\}$, or $t \in \{4, 8, 12\}$ (depending on the random survey timing for that microenterprise). We made this decision before analysing any of the data.

Survey responses may differ across interview methods in several ways. For example, different methods may generate left- or right-shifted distributions or generate more or less dispersion around a common mean. If responses only differ in dispersion (e.g. one introduces more classical measurement error), then some summary statistics will be robust across different interview methods. If responses differ in either dispersion or mean, then multivariate analyses including regression will generally be sensitive to the choice of interview method. We explore differences in three stages. We begin by examining the empirical CDFs of survey responses by method, pooling observations across microenterprises. This provides the most general overview of the differences in responses. We then run mean regressions; these test only for differences in mean survey responses but allow us to use the panel structure more effectively. We finally test if the microenterprise-specific standard deviations through the panel differ by interview method. Taken together, these provide a flexible and comprehensive description of possible differences in survey responses by interview method. Differences by survey method in these responses may reflect differences in reporting or in underlying microenterprise outcomes. We return to the distinction between these issues in section 3.2.

We begin by inspecting empirical CDFs; for each empirical CDF, we superimpose ‘∗’ and ‘+’ to indicate significant differences in the quantiles of $Y_{kit}$.22 We begin by examining the absolute value of sales minus costs minus profits — which should be zero — as a suggestive measure of the coherence of respondents’ answers. Respondents in weekly in-person interviews are more likely than those in weekly phone interviews to give answers close to zero, followed by the control group (Figure 3). We interpret this as providing suggestive evidence that weekly interviews are useful for reducing measurement error in microenterprise performance.

22 Formally, for each outcome $k$, we estimate the following quantile regression:

$$Q_\theta(Y_{kit} \mid T_{1i}, T_{2i}) = \beta_{\theta 0} + \beta_{\theta 1} \cdot T_{1i} + \beta_{\theta 2} \cdot T_{2i}, \qquad (1)$$

where we estimate for quantiles $\theta \in \{0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95\}$. We use ‘+’ to indicate significance on the null hypothesis $\beta_{\theta 1} = \beta_{\theta 2}$ (i.e. the null hypothesis that weekly surveys in-person and by phone are equivalent) and we use ‘∗’ to indicate significance on the null hypothesis $\beta_{\theta 1} = \beta_{\theta 2} = 0$ (i.e. the null hypothesis that all three treatments are equivalent). ‘+’ and ‘∗’ denote p < 0.1; ‘++’ and ‘∗∗’ denote p < 0.05; ‘+++’ and ‘∗∗∗’ denote p < 0.01. We report significance tests for each level of $\theta$, where we cluster by microenterprise (Silva and Parente, 2013) and use the False Discovery Rate (Benjamini, Krieger, and Yekutieli, 2006) to control for multiple testing across quantiles. In the pre-analysis plan we planned to use strata and time dummy variables; this proved computationally infeasible. We also planned to use simultaneous-quantile regression and to jointly test for coefficient equality across all quantiles. However, we do not believe an estimator has been proposed for systems of simultaneous quantile regression models with clustered standard errors.
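A rough sketch of equation (1) follows. statsmodels' QuantReg does not implement the analytical clustered standard errors of Silva and Parente (2013), so this version approximates the clustering with a firm-level block bootstrap; `panel` and its column names are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def qreg_diff(df, theta):
    """Estimate equation (1) at quantile theta and return beta1 - beta2,
    the weekly in-person vs weekly phone contrast (the '+' test)."""
    fit = smf.quantreg("y ~ T1 + T2", df).fit(q=theta)
    return fit.params["T1"] - fit.params["T2"]

def cluster_bootstrap_se(panel, theta, reps=500, seed=0):
    """Block bootstrap: resample whole firms with replacement."""
    rng = np.random.default_rng(seed)
    ids = panel["firm_id"].unique()
    draws = []
    for _ in range(reps):
        sample_ids = rng.choice(ids, size=len(ids), replace=True)
        boot = pd.concat([panel[panel["firm_id"] == i] for i in sample_ids])
        draws.append(qreg_diff(boot, theta))
    return np.std(draws, ddof=1)

for theta in [0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95]:
    est = qreg_diff(panel, theta)
    se = cluster_bootstrap_se(panel, theta)
    print(f"q={theta}: beta1 - beta2 = {est:.3f} (bootstrap SE {se:.3f})")
```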

    < Figure 3 here. >

Respondents report higher profits in the control group than for either type of weekly interview (which do not differ significantly from each other): see Figure 4. This pattern is explained by a larger difference between profits and sales minus costs in the control group, not by higher sales or lower costs in the control group. We interpret this as evidence of less accurate recall over longer time periods, rather than changes in real outcomes.

    < Figure 4 here. >

In several other outcomes, we find suggestive evidence of a ‘social desirability’ mechanism when respondents are interviewed face-to-face. Most prominently, both kinds of in-person respondents report being open for more hours yesterday than respondents interviewed over the phone (Figure 5). The same behaviour may explain the differences – at the top end of the distribution – in reporting of enterprise assets (i.e. stocks and inventories): see Figure 6. Finally, we find significant differences in the value given to the household. Respondents interviewed weekly in person report higher value given to the household, compared to respondents interviewed monthly in person – who, in turn, report higher value to the household than respondents interviewed weekly on the phone (Figure 7). This would be consistent with both a recall mechanism and a social desirability mechanism working simultaneously.

    < Figure 5 here. >

    < Figure 6 here. >

    < Figure 7 here. >

We do not find significant differences between the empirical CDFs for our other measures.23 We show these empirical CDFs together in Figure 8: for fixed assets, sales for the last four weeks, sales for the last week, total costs, total employees, paid employees, full-time employees and money kept by the entrepreneur.

    < Figure 8 here. >

In sum, we find that, for most measures of microenterprise performance, weekly phone interviews do not induce different reporting than either weekly in-person interviews or monthly in-person interviews. We therefore conclude that weekly phone interviews are a useful and viable way to track microenterprise performance at high frequency; indeed, Figures 3 and 4 even suggest that weekly interviews may reduce measurement error in profit. Where we see differences by interview method, they are left- or rightward shifts of the applicable CDF. We see no cases of mean-preserving spreads, which would arise if interview methods differed purely in the amount of noise they generated.

To check these conclusions, we estimate the effect of survey methodology on mean response values:

$$Y_{kit} = \beta_1 \cdot T_{1i} + \beta_2 \cdot T_{2i} + \eta_g + \phi_t + \varepsilon_{kit}, \qquad (2)$$

where $\eta_g$ are dummy variables for the matched randomisation blocks, and $\phi_t$ are dummy variables for the survey week. We cluster errors by microenterprise. We test whether either treatment induces different average reporting. We do this by testing $H_0^1: \beta_1 = 0$, $H_0^2: \beta_2 = 0$, $H_0^3: \beta_1 = \beta_2$, and $H_0^4: \beta_1 = \beta_2 = 0$. This model is more restrictive than the quantile regressions but the mean regressions have several desirable features. We can include randomisation block and survey week dummies, reducing the residual variation. We can also calculate exact minimum detectable effect sizes for all outcomes, which is more complex for quantile regression.24

23 We occasionally reject at one of the quantiles, but never to indicate any pattern that suggests anything but random noise.
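Equation (2) and the four hypothesis tests map directly onto a clustered OLS; a minimal sketch with an assumed long-format dataframe `panel` (column names illustrative):

```python
import statsmodels.formula.api as smf

# OLS of the (normalised) outcome on the treatment dummies with
# randomisation-block and survey-week dummies, clustering by microenterprise.
fit = smf.ols("y ~ T1 + T2 + C(block) + C(week)", data=panel).fit(
    cov_type="cluster", cov_kwds={"groups": panel["firm_id"]}
)

print(fit.t_test("T1 = 0"))              # H0^1: beta1 = 0
print(fit.t_test("T2 = 0"))              # H0^2: beta2 = 0
print(fit.f_test("T1 = T2"))             # H0^3: beta1 = beta2
print(fit.f_test(["T1 = 0", "T2 = 0"]))  # H0^4: beta1 = beta2 = 0 (joint)
```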

We find some significant differences between method and frequency of interview in the repeated interviews. The pattern of differences reinforces the findings from the quantile regression analysis. The weekly interviews generate marginally lower reported profits and higher reported fixed assets and stock and inventory (though only the latter difference is statistically significant). We again see some suggestive evidence of social desirability bias: phone respondents report opening their microenterprises for fewer hours yesterday25 but they do not keep more money for themselves or give more to their household. Enumerator perceptions differ sharply by survey method: enumerators report less care, less honesty, and less use of written records for phone respondents than in-person respondents and for weekly respondents than monthly respondents. This may reflect real differences in data quality or simply differences in enumerator perceptions. We see no evidence that phone interviews have greater profit-sales-cost discrepancies than in-person interviews, which is our most direct test for data quality differences.

These insignificant differences between reported enterprise outcomes occur despite relatively small minimum detectable effects. We are powered to detect differences between either weekly interview group and the monthly interview group of 0.06 to 0.08 standard deviations of most outcomes.26 The key exception is stock and inventory. This measure is relatively noisy and only weakly correlated with the randomisation block dummies, so the MDEs and the (insignificant) estimated differences between groups are quite large.

24 We calculate the minimum difference in the mean survey response we can detect for a test with 5% size and 80% power. We use the formula

$$\mathrm{MDE} = 2.8\sqrt{\frac{\sigma_Y^2}{\sigma_T^2} \cdot \frac{1}{N_M}\left[\rho_Y + \frac{1 - \rho_Y}{N_W}\right]},$$

where 2.8 is derived from the chosen test size and power, $N_M$ and $N_W$ are respectively the number of microenterprises in the sample and the mean number of completed panel interviews per microenterprise, the variance of each treatment indicator $\sigma_T^2$ is calculated from the sample of completed panel interviews, the variance of the outcome $\sigma_Y^2$ is calculated for the control group after conditioning on randomization block and survey week dummies, and the intertemporal correlation coefficient $\rho_Y$ is calculated for the control group after conditioning on randomization block and survey week dummies. This approach updates prospective MDE calculations by using values of $N_W$, $\sigma_T^2$, $\sigma_Y^2$, and $\rho_Y$ from the trial. But it calculates MDEs for chosen test size and power, rather than calculating power for the observed treatment effects. The latter approach is uninformative, as the retrospective power of a test is a one-to-one function of the p-value. See Scheiner (2001) for a detailed discussion on this issue. Note that the MDEs calculated using this approach may be smaller than insignificant differences between treatment arms estimated from the sample data.

25 In-person interviews were conducted at the microenterprise premises, so respondents who keep their business open later may be easier to interview and hence less likely to attrit. This could account for part of the difference in this outcome. However, the result is robust to two different adjustments for differential attrition, as we discuss in subsection 3.3.
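The MDE formula in footnote 24 is straightforward to compute once the panel moments are in hand; the sketch below is a direct translation, with purely illustrative input values (not the paper's):

```python
import numpy as np

def mde(sigma2_y, sigma2_t, n_firms, n_waves, rho_y):
    """Minimum detectable effect for a mean comparison in the panel.
    2.8 is approximately z_{0.975} + z_{0.80}, i.e. 5% size and 80% power."""
    return 2.8 * np.sqrt(
        (sigma2_y / sigma2_t) * (1.0 / n_firms) * (rho_y + (1.0 - rho_y) / n_waves)
    )

# Illustrative inputs only: 895 firms and a hypothetical mean of 4.5 completed
# waves per firm; sigma2_t, sigma2_y, rho_y would come from the realised data.
print(mde(sigma2_y=1.0, sigma2_t=0.22, n_firms=895, n_waves=4.5, rho_y=0.4))
```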

    < Table 5 here. >

We find limited evidence of heterogeneity (see Online Appendix Tables A6-A11). We test for heterogeneous responses on six pre-specified dimensions: gender, education, digit recall span score, numeracy score, use of written records at baseline, and having at least one employee other than the owner. Owners with higher numerical performance (from digit recall span and numeracy tests) report hiring more employees, specifically more paid employees, when assigned to either of the weekly survey groups. Responses from owners who kept written business records at baseline display the opposite pattern. Male owners and owners of enterprises with multiple employees at baseline report (not always significantly) higher stock/inventory when assigned to either of the weekly survey groups. These results provide little evidence of heterogeneity in reporting differences across interview media and frequencies. In particular, we might expect that reporting behaviour would differ for enterprises with better record-keeping capacity (those with multiple employees or written records at baseline) or owners with better numerical performance (digit recall span and numeracy scores). We find at most very weak evidence to support this hypothesis.

For each outcome $k$, we also test for differences in dispersion. We calculate the sample standard deviation of each outcome for each microenterprise, and we then estimate:

$$S_{ki} = \beta_1 \cdot T_{1i} + \beta_2 \cdot T_{2i} + \eta_g + \mu_{ki}, \qquad (3)$$

using heteroscedasticity-robust standard errors. We then test $H_0^1: \beta_1 = 0$, $H_0^2: \beta_2 = 0$, $H_0^3: \beta_1 = \beta_2$, and $H_0^4: \beta_1 = \beta_2 = 0$. We show in Table 6 that weekly reports of stock and inventory, household takings, and fixed assets are more dispersed than monthly reports (the latter difference is large but imprecisely estimated). The higher-frequency interviews may pick up large short-term fluctuations in these measures. The phone respondents’ responses show more dispersion in costs, number of (paid) employees, and hours of operation.

26 Recall that continuous outcomes are normalised to have mean 0 and standard deviation 1 in the monthly interview group. Categorical measures (# employees) and binary measures (enterprise closure, written records, enumerator assessments) are not normalised.

    < Table 6 here. >
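For concreteness, a minimal sketch of the dispersion test in equation (3) follows, under the reading that S_{ik} is the within-enterprise standard deviation over the panel. Variable names and the file layout are illustrative, not the authors' code.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format panel: one row per completed enterprise-week
# interview, with treatment dummies T1 and T2, randomisation block, and a
# normalised outcome y. All variable names are illustrative.
panel = pd.read_csv("panel.csv")

# S_ik: standard deviation of each enterprise's responses over the panel.
# Enterprises with a single completed interview yield NaN and drop out.
s = (panel.groupby(["firm_id", "block", "T1", "T2"])["y"]
          .std()
          .rename("S")
          .reset_index())

# Equation (3), with heteroscedasticity-robust (HC1) standard errors.
fit = smf.ols("S ~ T1 + T2 + C(block)", data=s).fit(cov_type="HC1")

print(fit.t_test("T1 = 0"))             # H0^1: beta1 = 0
print(fit.t_test("T2 = 0"))             # H0^2: beta2 = 0
print(fit.t_test("T1 - T2 = 0"))        # H0^3: beta1 = beta2
print(fit.wald_test("T1 = 0, T2 = 0"))  # H0^4: joint test
```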

Finally, we test whether different interview methods have implications for estimating the dynamics of microenterprise performance. To do this, we report Blundell-Bond estimates for log(profit) and log(capital), assuming an AR(1) structure on the error term.²⁷ Table 7 reports these estimates, disaggregated between weekly phone interviews and weekly in-person interviews. At the bottom of the table, we report Wald tests for the null hypothesis that the estimated dynamics are equivalent between phone and in-person interviews; in each case, we comfortably fail to reject the null hypothesis of equivalence.²⁸ Weekly interviews are thus a useful mechanism for estimating microenterprise dynamics at high frequency, and phone interviews are no worse than in-person interviews for this purpose.

    < Table 7 here. >
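For readers less familiar with the Blundell-Bond (system GMM) estimator used in table 7, the generic moment conditions for an AR(1) panel are sketched below; this is the textbook construction, not necessarily the authors' exact instrument set:

```latex
% Generic system-GMM (Blundell-Bond) setup for an AR(1) panel; a sketch,
% not necessarily the authors' exact instrument set.
y_{it} = \gamma \, y_{i,t-1} + \eta_i + \varepsilon_{it}
% Difference-equation moments (Arellano-Bond): lagged levels as instruments.
\mathbb{E}\left[ y_{i,t-s} \, \Delta\varepsilon_{it} \right] = 0, \quad s \ge 2
% Levels-equation moments (the Blundell-Bond addition): lagged differences.
\mathbb{E}\left[ \Delta y_{i,t-1} \left( \eta_i + \varepsilon_{it} \right) \right] = 0
```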

    3.2 Does interview method affect microenterprise performance?

    The previous section described tests for whether responses differ during the repeated surveys,

    not including baseline and endline. We conduct a common endline with all microenterprises,

    face-to-face. We test whether endline responses differ between treatment groups. We interpret

27 In the Online Appendix, we show that two lags provide a reasonable lag structure for log(profit) and four lags for log(capital).

    28 Since the sample collected by phone and the sample collected in person are non-overlapping, we run this test byimposing that the cross-equation covariance terms are zero.



    this as a test of whether our interview techniques changed actual microenterprise outcomes (for

    example, by prompting microenterprise managers to pay more attention to their capital stock).

We begin by repeating the analysis of empirical CDFs, and we then test again for differences in means using regression. For each outcome Y_{ki}, we estimate:

Y_{ki} = β_1 · T_{1i} + β_2 · T_{2i} + η_g + ε_{ki},    (4)

where we allow for heteroscedasticity-robust standard errors. As before, we then test H_0^1: β_1 = 0, H_0^2: β_2 = 0, H_0^3: β_1 = β_2, and H_0^4: β_1 = β_2 = 0.²⁹

We find few differences in endline firm outcomes based on earlier interview method, as shown in table 8. Weekly respondents report slightly lower values of fixed assets than monthly respondents. Phone respondents' responses are not significantly different from face-to-face respondents' responses on any outcome, though some outcomes do differ between weekly phone and weekly face-to-face respondents. In particular, weekly phone respondents report taking less money from the enterprise for their household and family members. We show in figure 5 that this difference is driven by the upper tail of the distribution. This is consistent with the suggestive evidence of lower social desirability bias in phone interviews visible from the earlier results. Comparisons of the empirical CDFs for the other outcomes provide little useful information and are shown in the online appendix.

    < Figure 5 here. >

< Table 8 here. >

29 Our pre-analysis plan specified that we would use a bootstrap method to test whether the variance of measured Y_{ki} varies between treatments. We later decided that quantile regression is a simpler and more intuitive approach to the same issue.
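A minimal sketch of that quantile-regression check follows, under assumed variable names, with statsmodels' quantreg as one possible implementation:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical endline data: one row per enterprise, with treatment dummies
# T1 and T2, randomisation block, and an endline outcome y. All names are
# illustrative, not the authors' code.
endline = pd.read_csv("endline.csv")

# Compare treatment arms at several quantiles rather than only at the mean.
for q in (0.25, 0.50, 0.75):
    fit = smf.quantreg("y ~ T1 + T2 + C(block)", data=endline).fit(q=q)
    print(q, fit.params[["T1", "T2"]].round(3).to_dict())
```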



    3.3 Accounting for differential interview completion rates

We observe some differences in interview success rates across groups. This may influence the patterns of results reported in this section. We check the robustness of our findings in two ways. First, we estimate the effects of interview method and frequency on responses assuming that the differential success rates are concentrated in the lower and upper tails of the observed outcome distribution. These generate, respectively, upper and lower bounds on the effects of interview method and frequency (Lee, 2009). We report these bounds in table 5. Researchers typically interpret results as "robust" if the bounds exclude zero. This interpretation is inappropriate for our research question, where zero and non-zero coefficients are both of interest. Instead, we use the width of the bounded set as a measure of the vulnerability of our findings to differential interview success rates across interview media and frequencies. The median width is 0.28 standard deviations for the continuous outcomes and 23 percentage points for the binary outcomes. The only extremely wide sets are for the fat-tailed stock/inventory and asset measures. In almost all other cases, we can rule out differences between interview media and frequencies of more than 0.4 standard deviations for continuous outcomes and 30 percentage points for binary outcomes.
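A minimal sketch of the trimming construction behind these bounds follows, for a single pairwise contrast. This is the generic Lee (2009) procedure, not the paper's exact implementation, and the example data are synthetic.

```python
import numpy as np

def lee_bounds(y_hi, y_lo, rate_hi, rate_lo):
    """Trimming bounds (Lee, 2009) on a mean difference when one arm
    ('hi') completes interviews at a higher rate than the other ('lo').

    y_hi, y_lo: outcomes for completed interviews in each arm.
    rate_hi, rate_lo: interview completion rates in each arm.
    """
    q = (rate_hi - rate_lo) / rate_hi      # excess share to trim from 'hi'
    y = np.sort(np.asarray(y_hi, dtype=float))
    k = int(np.floor(q * len(y)))          # number of observations to drop
    lo_mean = float(np.mean(y_lo))
    lower = y[: len(y) - k].mean() - lo_mean  # trim top    -> lower bound
    upper = y[k:].mean() - lo_mean            # trim bottom -> upper bound
    return lower, upper

# Synthetic example: the width of the bounded set is the statistic the text
# uses to judge vulnerability to differential completion rates.
rng = np.random.default_rng(0)
lb, ub = lee_bounds(rng.normal(0.1, 1, 500), rng.normal(0, 1, 450), 0.62, 0.56)
print(f"bounds: [{lb:.3f}, {ub:.3f}], width: {ub - lb:.3f}")
```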

Second, we estimate the probability of successfully completing each repeated interview (excluding the endline and baseline) as a function of the baseline characteristics discussed in section 2.2: P̂ = Pr(Interview Success_{it} | X_{i0}). We then construct inverse probability of interview success weights Ŵ = 1/P̂ and re-estimate equation 2 using these weights. If interview completion does not covary with latent outcomes after conditioning on these baseline characteristics, the weighted regressions fully correct for missed interviews. The weighted results are shown in the online appendix and differ only slightly from the unweighted results. They provide no evidence that calls for revising the conclusions reached above.
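A minimal sketch of this reweighting procedure follows. Variable names are illustrative, and the completion-probability model and fixed-effects structure are assumptions; equation 2 refers to the response regression earlier in the paper.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per scheduled repeated interview, with a
# completion indicator and baseline covariates x1..x3. Names are illustrative.
df = pd.read_csv("interviews.csv")

# Step 1: estimate P(interview completed | baseline X) by logit.
p_hat = smf.logit("completed ~ x1 + x2 + x3", data=df).fit(disp=0).predict(df)
df["w"] = 1.0 / p_hat

# Step 2: re-estimate the response equation on completed interviews only,
# weighting each observation by its inverse completion probability.
done = df[df["completed"] == 1].copy()
fit = smf.wls("y ~ T1 + T2 + C(block) + C(week)",
              data=done, weights=done["w"]).fit(
                  cov_type="cluster", cov_kwds={"groups": done["firm_id"]})
print(fit.params[["T1", "T2"]])
```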


4 Cost effectiveness

    Phone surveys offer the prospect of lower per-interview costs, which allows larger samples or

    longer panels. We provide a detailed breakdown of our data collection costs to inform other

    researchers about the potential savings. We use the survey firm’s general ledger entries, which

    give a detailed breakdown of expenditure by date and purpose. We do not examine the costs

    of the screening, baseline and endline, which were conducted face-to-face for all respondents,

    nor fixed costs, such as the survey manager’s salary and the survey firm’s rent and office costs.

Costs were classified into nine categories. The general ledger allows some costs to be easily allocated to different treatment arms. Costs are recorded separately for each enumerator, so we can allocate salaries, per diems and fieldworker transport costs to treatment arms. Only the face-to-face interview teams used a car and driver and paid for fuel; the monthly team and the weekly team each had their own car. The survey firm separated phone calls made during fieldwork (conducted on a project mobile phone) from phone calls made to interview firms in the phone arm (conducted on landlines). To split the total costs for printing, data capture and respondent incentives across the arms, we divide by the number of successful interviews in each treatment arm; these last three costs are essentially the same across the arms.

Once a baseline has been conducted to collect contact details, a phone interview cost roughly ZAR49 (US$4.76 at August 2013 exchange rates). A face-to-face interview conducted weekly cost ZAR63 (US$6.12) and one conducted monthly cost ZAR75 (US$7.30). Phone surveys therefore reduced the per-interview cost of high-frequency data collection by approximately 25%.
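The implied exchange rate and the relative saving can be checked directly from these figures (a worked version of the arithmetic above):

```python
# Per-interview fieldwork costs reported above (ZAR, August 2013 rates).
costs_zar = {"weekly phone": 49, "weekly in-person": 63, "monthly in-person": 75}
zar_per_usd = costs_zar["weekly phone"] / 4.76  # implied rate, ~ZAR10.3/US$

for mode, zar in costs_zar.items():
    print(f"{mode}: ZAR{zar} = US${zar / zar_per_usd:.2f}")

# Saving from phone interviews against the weekly face-to-face benchmark:
saving = 1 - costs_zar["weekly phone"] / costs_zar["weekly in-person"]
print(f"per-interview saving: {saving:.0%}")  # ~22%, the 'approximately 25%' above
```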

Cost savings from phone interviews are likely to increase when interviews are short relative to the travel time between them, as the time and expense of travelling between interviews increases, and as the costs of calling mobile phones decrease. Here, our in-person survey costs are relatively low because we were interviewing in a high-density urban area, and South African call costs are relatively high (calls cost roughly US$1.30 per 15-minute interview). A similar phone survey in urban Dar es Salaam, Tanzania, cost US$4.10 per interview without including the consultants hired to maintain a website, supervise data collection and analyse the data, and US$7.30 including the consultants (Croke, Dabalen, Demombynes, Giugale, and Hoogeveen, 2014). In contrast, in a Tanzanian high-frequency household survey of farmers in remote rural areas, phone interviews cost US$6.98 each, while face-to-face interviews cost US$97 each (Dillon, 2012).

Cost savings on salaries and per diems depend on how many phone interviews can be conducted in the time taken by one face-to-face interview. Here, twice as many enumerators were used to survey roughly 300 firms face-to-face weekly (8) as were used to survey firms by phone (4). The per-interview costs in salary and per diem are much lower for phone (US$2.02) than for face-to-face (US$2.93) interviews. We could have made even greater savings on salaries: we overestimated how many calls it would take to reach businesses and underestimated the number of interviews phone fieldworkers could complete per day, so we assigned too large a team of enumerators to the phone arm. We could have lowered costs further with fewer fieldworkers, but we had already allocated businesses to each fieldworker and did not change the arrangement.

In other contexts, cost savings will also depend on whether fieldworkers are away from home, whether they are paid a per diem, and whether they are paid for transport to the office. Here we had no accommodation costs. Enumerators were paid the same daily rate and per diem regardless of survey method, to avoid differences in motivation and incentives. We also paid the enumerators conducting the phone calls for their transport to the office (the face-to-face enumerators met at a central place close to their houses and were dropped off at the businesses).

Finally, data capture and printing are likely to be easier and cheaper for phone surveys. Here, all surveys were done on paper and data-captured afterwards, to avoid differences in data quality between interview methods. Both phone and face-to-face interviews could be conducted on tablets, but phone interviews can be captured directly on computers using more advanced data capture software, and fewer fieldworkers are required for phone surveys, so equipment costs will be lower.

    5 Conclusion

    We provide the first experimental evidence on the usefulness and viability of high-frequency

    phone surveys of microenterprises. We draw a representative sample of microenterprises in

    Soweto (South Africa), and randomly assign respondents to three treatment groups. The first

    group is interviewed face-to-face at monthly intervals, to mimic a standard method of col-

    lecting data from microenterprises. The second group is interviewed face-to-face at weekly

    intervals. This preserves one key feature of the benchmark method — that respondents are

    interviewed in person — but allows us to test the consequences of collecting data at a much

higher frequency. The third group is interviewed at weekly intervals by mobile phone.

Our results show that mobile phone interviews are accurate: on most measures and at most quantiles of the distribution, data patterns are indistinguishable in interviews conducted weekly, whether by mobile phone or face-to-face. Using an endline administered face-to-face, we find little evidence that high-frequency data collection (either over a mobile phone or in person) alters microenterprise behaviour. We find some differences in attrition (levels and reasons) across data collection methods, but these are generally small and are not correlated with microenterprise characteristics. Mobile phone interviews are, however, substantially cheaper than face-to-face interviews and offer a far larger volume of data for the same price.

Our results also show that weekly phone interviews are useful: they capture extensive volatility in a number of measures that is not visible in less frequent data collection. Low-frequency data collection both fails to observe this volatility and captures measures that do not account for the past history of volatility. This can lead to measurement error problems that produce biased and less precise coefficient estimates in regression analysis of microenterprise outcomes. We conclude that mobile phone data collection offers considerable cost savings with little, if any, reduction in data quality.



References

BAIRD, S., AND B. ÖZLER (2012): "Examining the Reliability of Self-reported Data on School Participation," Journal of Development Economics, 98(1), 89–93.

BANERJEE, A., E. DUFLO, R. GLENNERSTER, AND C. KINNAN (2015): "The Miracle of Microfinance? Evidence from a Randomized Evaluation," American Economic Journal: Applied Economics, 7(1), 22–53.

BARRERA-OSORIO, F., M. BERTRAND, L. L. LINDEN, AND F. PEREZ CALLE (2011): "Improving the Design of Conditional Transfer Programs: Evidence from a Randomized Education Experiment in Colombia," American Economic Journal: Applied Economics, 3(2), 167–195.

BAUER, J.-M., K. AKAKPO, M. ENLUND, AND S. PASSERI (2013): "A New Tool in the Toolbox: Using Mobile Text for Food Security Surveys in a Conflict Setting," Humanitarian Practice Network Online Exchange (http://www.odihpn.org/the-humanitarian-space/news/announcements/blog-articles/a-new-tool-in-the-toolbox-using-mobile-text-for-food-security-surveys-in-a-conflict-setting), pp. 1–2.

BEAMAN, L., J. MAGRUDER, AND J. ROBINSON (2014): "Minding Small Change: Limited Attention among Small Firms in Kenya," Journal of Development Economics, 108, 69–86.

BEEGLE, K., J. DE WEERDT, J. FRIEDMAN, AND J. GIBSON (2012): "Methods of Household Consumption Measurement Through Surveys: Experimental Results from Tanzania," Journal of Development Economics, 98(1), 3–18.

BENJAMINI, Y., A. M. KRIEGER, AND D. YEKUTIELI (2006): "Adaptive Linear Step-Up Procedures that Control the False Discovery Rate," Biometrika, 93(3), 491–507.

BRUHN, M., AND D. MCKENZIE (2009): "In Pursuit of Balance: Randomization in Practice in Development Field Experiments," American Economic Journal: Applied Economics, pp. 200–232.

CAEYERS, B., N. CHALMERS, AND J. DE WEERDT (2012): "Improving Consumption Measurement and Other Survey Data through CAPI: Evidence from a Randomized Experiment," Journal of Development Economics, 98(1), 19–33.

COLLINS, D., J. MORDUCH, S. RUTHERFORD, AND O. RUTHVEN (2009): Portfolios of the Poor: How the World's Poor Live on $2 a Day. Princeton University Press, Princeton.

CROKE, K., A. DABALEN, G. DEMOMBYNES, M. GIUGALE, AND J. HOOGEVEEN (2014): "Collecting High Frequency Panel Data in Africa using Mobile Phone Interviews," Canadian Journal of Development Studies, 35(1), 186–207.

DAS, J., J. HAMMER, AND C. SÁNCHEZ-PARAMO (2012): "The Impact of Recall Periods on Reported Morbidity and Health Seeking Behavior," Journal of Development Economics, 98(1), 76–88.

DE LEEUW, E. D. (1992): Data Quality in Mail, Telephone and Face to Face Surveys. TT Publikaties, Amsterdam.

DHOLAKIA, U. (2010): "A Critical Review of Question-behavior Effect Research," Review of Marketing Research, 7, 147–99.

DILLON, B. (2012): "Using Mobile Phones to Collect Panel Data in Developing Countries," Journal of International Development, 24, 518–27.



DREXLER, A., G. FISCHER, AND A. SCHOAR (2014): "Keeping it Simple: Financial Literacy and Rules of Thumb," American Economic Journal: Applied Economics, 6(2), 1–31.

FAFCHAMPS, M., D. MCKENZIE, S. QUINN, AND C. WOODRUFF (2012): "Using PDA Consistency Checks to Increase the Precision of Profits and Sales Measurement in Panels," Journal of Development Economics, 98(1), 51–57.

FRANKLIN, S. (2015): "Job Search, Transport Costs and Youth Unemployment: Evidence from Urban Ethiopia," Working paper, University of Oxford.

FRISON, L., AND S. POCOCK (1992): "Repeated Measures in Clinical Trials: Analysis using Mean Summary Statistics and its Implications for Design," Statistics in Medicine, 11.

GALLUP (2012): "The World Bank Listening to LAC (L2L) Pilot: Final Report," Discussion paper, Gallup, London.

GERTLER, P., D. I. LEVINE, AND E. MORETTI (2009): "Do Microfinance Programs Help Families Insure Consumption Against Illness?," Health Economics, 18(3), 257–273.

GROVES, R. M. (1979): "Actors and Questions in Telephone and Personal Interview Surveys," Public Opinion Quarterly, 43(2), 190–205.

HIMELEIN, K. (2014): "The Socio-Economic Impacts of Ebola in Liberia," Working paper, World Bank, Gallup and Liberia Institute of Statistics and Geo-Information Services, pp. 1–16.

HOLBROOK, A. L., M. C. GREEN, AND J. A. KROSNICK (2003): "Telephone vs. Face-to-Face Interviewing of National Probability Samples With Long Questionnaires: Comparisons of Respondent Satisficing and Social Desirability Response Bias," Public Opinion Quarterly, 67, 79–125.

JACKLE, A., C. ROBERTS, AND P. LYNN (2006): "Telephone Versus Face-to-Face Interviewing: Mode Effects on Data Quality and Likely Causes. Report on Phase II of the ESS-Gallup Mixed Mode Methodology Project," Discussion Paper 2006-41, University of Essex, Colchester.

KARLAN, D., AND M. VALDIVIA (2011): "Teaching Entrepreneurship: Impact of Business Training on Microfinance Clients and Institutions," The Review of Economics and Statistics, 93(2), 510–27.

KARLAN, D., AND J. ZINMAN (2011): "Microcredit in Theory and Practice: Using Randomized Credit Scoring for Impact Evaluation," Science, 332(6035), 1278–84.

KÖRMENDI, E. (2001): "The Quality of Income Information in Telephone and Face-to-Face Surveys," in Telephone Survey Methodology, ed. by R. M. Groves, P. P. Biemer, L. E. Lyberg, J. T. Massey, W. L. Nicholls, and J. Waksberg. John Wiley and Sons, New York.

KROSNICK, J. A. (1991): "Response Strategies for Coping with the Cognitive Demands of Attitude Measures in Surveys," Applied Cognitive Psychology, 5(3), 213–236.

LANE, S. J., N. M. HEDDLE, E. ARNOLD, AND I. WALKER (2006): "A Review of Randomized Controlled Trials Comparing the Effectiveness of Hand Held Computers with Paper Methods for Data Collection," BMC Medical Informatics and Decision Making, 6(23), 1–10.

MCKENZIE, D. (2012): "Beyond Baseline and Follow-up: The Case for More T in Experiments," Journal of Development Economics, 99(2), 210–221.



MCKENZIE, D., AND C. WOODRUFF (2008): "Experimental Evidence on Returns to Capital and Access to Finance in Mexico," World Bank Economic Review, 22(3), 457–82.

MITULLAH, W., AND P. KAMA (2013): The Partnership of Free Speech and Good Governance in Africa, vol. 3. Afrobarometer, University of Cape Town, Cape Town.

NORD, M., AND H. HOPWOOD (2007): "Does Interview Mode Matter for Food Security Measurement? Telephone Versus In-Person Interviews in the Current Population Survey Food Security Supplement," Public Health Nutrition, 10(12), 1474–80.

PAPKE, L., AND J. WOOLDRIDGE (1996): "Econometric Methods for Fractional Response Variables with an Application to 401(k) Plan Participation Rates," Journal of Applied Econometrics, 11, 619–632.

PATNAIK, S., E. BRUNSKILL, AND W. THIES (2009): "Evaluating the Accuracy of Data Collection on Mobile Phones: A Study of Forms, SMS, and Voice," Proceedings of the 3rd International Conference on Information and Communication Technologies and Development, pp. 74–84.

ROSENZWEIG, M., AND K. WOLPIN (1993): "Credit Market Constraints, Consumption Smoothing and the Accumulation of Durable Production Assets in Low-Income Countries: Investments in Bullocks in India," Journal of Political Economy, 101, 223–244.

SCHWARZ, N., H. HIPPLER, AND E. NOELLE-NEUMANN (1992): "A Cognitive Model of Response-Order Effects in Survey Measurement," in Context Effects in Social and Psychological Research, ed. by N. Schwarz, and S. Sudman, pp. 187–201. Springer-Verlag.

SHANKS, J. M., M. SANCHEZ, AND B. MORTON (1983): "Alternative Approaches to Survey Data Collection for the National Election Studies: A Report on the 1982 NES Method Comparison Project," Discussion paper, University of Michigan Survey Research Center.

SILVA, J. S., AND P. M. PARENTE (2013): "Quantile Regression with Clustered Data," Economics Discussion Papers 728, University of Essex, Department of Economics.

SMITH, P. B., AND R. FISCHER (2008): "Acquiescence, Extreme Response Bias and Culture: A Multilevel Analysis," in Multilevel Analysis of Individuals and Cultures, ed. by F. J. R. van de Vijver, D. A. van Hemert, and Y. H. Poortinga, pp. 285–314. Taylor and Francis Group and Lawrence Erlbaum Associates.

STANGO, V., AND J. ZINMAN (2013): "Limited and Varying Consumer Attention: Evidence from Shocks to the Salience of Bank Overdraft Fees," Working paper, Dartmouth College.

SYKES, W., AND G. HOINVILLE (1985): Telephone Interviewing on a Survey of Social Attitudes: A Comparison with Face-to-Face Procedures. Social and Community Planning Research, London.

VAN DER WINDT, P., AND M. HUMPHREYS (2013): "Crowdseeding Conflict Data," Working paper, Columbia University.

ZWANE, A. P., J. ZINMAN, E. VAN DUSEN, W. PARIENTE, C. NULL, E. MIGUEL, M. KREMER, D. KARLAN, R. HORNBECK, X. GINE, E. DUFLO, F. DEVOTO, B. CREPON, AND A. BANERJEE (2011): "Being Surveyed can Change Later Behavior and Related Parameter Estimates," Proceedings of the National Academy of Sciences, 108(5), 1821–1826.



    Tables and Figures



Table 1: Sample Description and Balance Test Results

Columns: (1) full-sample mean; (2) full-sample standard deviation; (3) monthly in-person mean; (4) weekly in-person mean; (5) weekly phone mean; (6) p-value for the balance test.

Panel A: Variables Used in Stratification
Owner age: 44.8; 12.7; 44.5; 44.7; 45.2; 0.805
% owners female: 0.617; 0.486; 0.601; 0.629; 0.621; 0.769
# employees at enterprise: 0.498; 0.685; 0.510; 0.492; 0.493; 0.937
% enterprises in trade: 0.318; 0.466; 0.312; 0.311; 0.332; 0.824
% enterprises in food: 0.426; 0.495; 0.423; 0.438; 0.416; 0.857
% enterprises in light manufacturing: 0.103; 0.304; 0.104; 0.100; 0.104; 0.985
% enterprises in services: 0.088; 0.284; 0.094; 0.084; 0.087; 0.904
% enterprises in agriculture/other sector: 0.065; 0.246; 0.067; 0.067; 0.060; 0.929

Panel B: Other Owner Demographic Variables
% owners Black African: 0.993; 0.082; 0.990; 0.997; 0.993; 0.576
% owners another race: 0.007; 0.082; 0.010; 0.003; 0.007; 0.576
% owners from South Africa: 0.923; 0.267; 0.916; 0.936; 0.916; 0.533
% owners from Mozambique: 0.046; 0.209; 0.047; 0.037; 0.054; 0.597
% owners from another country: 0.031; 0.174; 0.037; 0.027; 0.030; 0.778
% owners who speak English: 0.065; 0.246; 0.064; 0.087; 0.044; 0.096
% owners who speak Sotho: 0.213; 0.410; 0.211; 0.217; 0.211; 0.979
% owners who speak Tswana: 0.084; 0.277; 0.077; 0.087; 0.087; 0.876
% owners who speak Zulu: 0.482; 0.500; 0.493; 0.482; 0.470; 0.849
% owners who speak another language: 0.156; 0.363; 0.154; 0.127; 0.188; 0.124
# years lived in Gauteng: 40.2; 16.7; 39.9; 40.2; 40.3; 0.956
# years lived in Soweto: 39.2; 17.2; 39.3; 39.3; 39.1; 0.990

Panel C: Other Owner Education & Experience Variables
% with at most primary education: 0.152; 0.359; 0.124; 0.181; 0.151; 0.157
% with some secondary education: 0.469; 0.499; 0.487; 0.482; 0.440; 0.450
% with completed secondary education: 0.304; 0.460; 0.322; 0.244; 0.346; 0.015
% with some tertiary education: 0.075; 0.263; 0.067; 0.094; 0.064; 0.353
% financial numeracy questions correct: 0.511; 0.264; 0.513; 0.508; 0.512; 0.970
Digit recall test score: 6.271; 1.489; 6.333; 6.220; 6.260; 0.632
% owners with previous wage employment: 0.760; 0.427; 0.785; 0.773; 0.721; 0.169

Panel D: Other Owner Household Variables
Owner's HH size: 4.785; 2.683; 4.745; 4.756; 4.856; 0.852
# HH members with jobs: 0.720; 0.979; 0.728; 0.716; 0.715; 0.984
Owner's total HH income: 4049; 4285; 3994; 3957; 4191; 0.799
% owners whose business supplies at most half of HH income: 0.554; 0.497; 0.581; 0.515; 0.567; 0.238
% owners with primary care responsibility for children: 0.544; 0.498; 0.493; 0.542; 0.597; 0.038
% owners who perceive pressure within HH to share profits: 0.634; 0.482; 0.607; 0.635; 0.658; 0.444
% owners who perceive pressure outside HH to share profits: 0.565; 0.496; 0.581; 0.605; 0.510; 0.053

Panel E: Other Enterprise Variables
Enterprise age: 7.187; 7.511; 7.302; 7.278; 6.980; 0.842
% enterprises registered for payroll tax or VAT: 0.079; 0.270; 0.081; 0.060; 0.097; 0.232
% owners who keep written financial records for enterprise: 0.196; 0.397; 0.195; 0.167; 0.225; 0.207
% owners who want to grow business in next five years: 0.762; 0.426; 0.752; 0.766; 0.768; 0.876
% owners who conduct business on the phone at least weekly: 0.568; 0.496; 0.554; 0.579; 0.570; 0.823
# clients for the enterprise: 33.7; 71.4; 28.9; 40.8; 31.3; 0.189

Sample size: 895 (full sample); 298 (monthly in-person); 299 (weekly in-person); 289 (weekly phone)
Joint balance test statistic over treatment groups: 70.949 (0.380)
Joint balance test statistic over fieldworkers: 793.050 (0.000)

Notes: This table shows summary statistics for 40 variables collected in the screening and baseline interviews in columns 1 and 2. Columns 3–5 show the mean values of the variables for each of the three data collection groups. Column 6 shows the p-value for the test that all three groups have equal means. The first eight variables are used in the stratified random assignment algorithm and so are balanced by construction.



Table 2: Attrition Rates for Repeated and Endline Interviews by Data Collection Group

Columns: (1) repeated interviews; (2) endline interview. Standard errors in parentheses.

Weekly in-person: 0.058** (0.028); -0.061 (0.039)
Weekly phone: 0.095*** (0.028); 0.074* (0.039)
Control mean: 0.427*** (0.020); 0.336*** (0.027)
# enterprises: 895; 895
R²: 0.012; 0.014
F test statistic for H0 (equal attrition): 5.6; 6.2
p-value for H0 (equal attrition): 0.004; 0.002

Notes: This table shows attrition rates for repeated and endline interviews from linear regressions with heteroscedasticity-robust standard errors. ***, **, and * denote significance at the 1, 5, and 10% levels.



    Figure 1: Interview success rate by treatment arm

    Figure 2: Total number of interviews by treatment arm



Table 3: Predictors of Attrition in Endline and Repeated Interviews

Columns: (1) repeated interviews; (2) endline interview. Standard errors in parentheses; (d) denotes a binary variable.

Owner's age: -0.003 (0.002); 0.003 (0.002)
Owner female (d): -0.026 (0.028); -0.035 (0.039)
Owner is another race (d): 0.292** (0.118); 0.353* (0.206)
Owner was born in Mozambique (d): 0.055 (0.066); 0.112 (0.100)
Owner was born in another country (d): 0.104 (0.079); 0.145 (0.116)
Owner speaks Sotho (d): 0.024 (0.051); 0.148 (0.097)
Owner speaks Tswana (d): -0.011 (0.060); 0.081 (0.109)
Owner speaks Zulu (d): 0.005 (0.049); 0.182** (0.084)
Owner speaks another language (d): 0.054 (0.057); 0.053 (0.099)
Owner has some secondary education (d): -0.108*** (0.037); -0.083* (0.049)
Owner has finished secondary education (d): -0.053 (0.043); -0.036 (0.058)
Owner has some tertiary education (d): -0.006 (0.062); -0.091 (0.075)
Years owner has lived in Gauteng: 0.002 (0.003); 0.004 (0.004)
Years owner has lived in Soweto: -0.002 (0.003); -0.008** (0.003)
% of financial literacy questions owner correctly answers: -0.020 (0.015); 0.018 (0.021)
Owner's digit recall test score: -0.002 (0.008); -0.005 (0.012)
Owner has ever held regular paid employment (d): -0.005 (0.030); -0.025 (0.044)
Owner's household size: 0.002 (0.005); -0.008 (0.007)
Owner's household's total income: 0.000 (0.000); 0.000 (0.000)
Missing value for owner's household's total income (d): 0.129*** (0.040); 0.090 (0.059)
Enterprise provides at most half of household income (d): 0.027 (0.026); 0.016 (0.036)
Owner has primary responsibility for childcare (d): -0.005 (0.026); -0.017 (0.036)
Owner perceives pressure within HH to share profits (d): 0.017 (0.027); 0.049 (0.037)
Owner perceives pressure outside HH to share profits (d): -0.025 (0.026); 0.011 (0.037)
Food sector (d): 0.030 (0.028); 0.001 (0.040)
Light manufacturing sector (d): 0.034 (0.047); 0.082 (0.065)
Services sector (d): -0.002 (0.046); 0.031 (0.065)
Agriculture/other sector (d): -0.014 (0.056); 0.014 (0.073)
# employees: 0.015 (0.019); 0.061** (0.025)
Enterprise age: 0.002 (0.002); -0.001 (0.002)
Owner keeps written financial records (d): 0.003 (0.032); 0.036 (0.045)
Enterprise is registered for payroll tax or VAT (d): 0.064 (0.046); 0.023 (0.068)
Owner plans to grow enterprise in next five years (d): -0.001 (0.029); -0.089** (0.042)
Owner conducts business by phone at least weekly (d): 0.011 (0.025); 0.082** (0.035)
# clients: -0.000 (0.000); -0.000 (0.000)

# enterprises: 895; 895
# regressors: 35; 35
Pseudo-R²: 0.071
χ² test statistic for H0 (no covariates predict attrition): 67.3; 79.5
p-value for H0 (no covariates predict attrition): 0.001; 0.000

Notes: This table shows attrition rates for repeated and endline interviews with different model specifications, all with heteroscedasticity-robust standard errors. Column 1 shows marginal effects from a fractional logit regression of the proportion of missed repeated interviews. Column 2 shows marginal effects from a logit regression of the endline attrition indicator. Marginal effects for continuous variables are evaluated at the sample mean of the relevant variable, with standard errors calculated using the delta method. Discrete marginal effects are calculated for binary variables, denoted by (d). Omitted categories are Black African for race, South Africa for country of birth, English for home language, incomplete primary for education and trade/retail for business type. The unconditional probability of endline attrition is 34.0% and the unconditional mean and interquartile range of missed repeated interviews are respectively 47.8% and [16.7%,7