
Child apps, personal data regulation and home-country compliance

PRELIMINARY VERSION

G. Cecere∗, F. Le Guel†, V. Lefrere‡, C. Tucker§, P.L. Yin¶

January 20, 2018

Abstract

This article uses an original dataset on apps targeted at very young children to explore the types and scope of data that is collected about children when they use online mobile applications. We show that in the global economy of app developers, the geographical location of the developer influences whether they collect sensitive data, such as precise location, about their child users. Developers based in the US or in the OECD are less likely to collect sensitive data, while developers in countries that have no privacy law are most likely to collect sensitive data. We also distinguish the effects of an official Google program which encourages developers to comply with US child privacy regulation. We find that 10% of apps that are targeted at children under 5 that certify themselves via the program collect sensitive data from their child users. By contrast, 47% of apps which are targeted at children under 5 through keywords such as ‘toddler’ or ‘preschool’ which do not self-certify collect sensitive data about their users.

JEL CODE: D82, D83, M31, M37

∗ Telecom Ecole de Management, Institut Mines Telecom, RITM-University of Paris Sud and Digital Society Institute. Email: [email protected]

† RITM-University of Paris Sud. Email: [email protected]

‡ Telecom Ecole de Management, Institut Mines Telecom-RITM-University of Paris Sud. Email: [email protected]

§ Massachusetts Institute of Technology (MIT) - Management Science (MS). Email: [email protected]

¶ Greif Center for Entrepreneurial Studies, Marshall School of Business, University of Southern California. Email: [email protected]


1 Introduction

Many mobile applications are targeted at very young children, even toddlers and preschoolers.

Much as with an adult target audience, these apps automate collection of detailed data about

these children. This article analyzes what influences the sensitivity of data collected by child-

targeted applications. This is important because of the widespread use of mobile applications

by children. According to a recent Common Sense report, 98% of children under 8 use mobile

devices; they spend an average of 48 minutes per day on them (Rideout, 2017). However,

to our knowledge, there have been no empirical studies of the market for kids’ apps and the

effect of privacy regulation.1

Reflecting the global app economy, developers of children’s apps are located across the

world. We analyze whether the country that a developer is located in affects the sensitivity

of the data it collects about children. In particular, we measure whether there are spillovers

from child privacy regulation in the US and OECD countries that affect foreign developers’

strategies. In the United States, digital content aimed at children aged under 13 years must

comply with COPPA, a statute aimed at protecting children’s privacy.2 In January 2013, the

US Federal Trade Commission (FTC) published a definition of children’s personal data. This

definition includes persistent identifiers such as cookies or mobile device identifiers, photos,

videos and audio recordings, and geolocation data.3

Within this regulatory framework, the FTC promotes self-regulatory principles based

on notice and consent (Acquisti et al., 2016). In May 2015, the Google Play Store introduced a

form of self-regulation called the “Designed for Families” program, which encourages developers to

comply with COPPA and also helps parents identify content appropriate for children.4

Strong privacy protection can protect kids and reassure parents, which might increase use

of digital services but also might hamper innovative developers’ market access. We exploit

1 The recent Mobile Kids Report published by Nielsen (2017) shows that 59% of the children interviewed used mobile devices to download apps: http://www.nielsen.com/us/en/insights/news/2017/mobile-kids--the-parent-the-child-and-the-smartphone.html.

2 Children’s Online Privacy Protection Act of 1998, 16 CFR Part 312, www.ftc.gov/enforcement/rules/rulemaking-regulatory-reform-proceedings/childrens-online-privacy-protection-rule.

3 See the Children’s Online Privacy Protection Rule: https://www.ecfr.gov/cgi-bin/text-idx?SID=cbe35c6ccc2aaf22d50f0087848c30c8&mc=true&node=pt16.1.312&rgn=div5

4 During Google’s 2015 Annual Conference, app developers were introduced to the “Family star” icon. Note that in 2013, the Apple App Store introduced a kids app category (Apple’s WWDC 2013 Keynote).


information on developers’ geographical location to estimate the effect of spillovers from US

regulation on foreign developers’ strategies. This is important because, though protection of

children’s personal data is a stated priority for policymakers and companies, to our knowledge

there have been no analyses of the market for children’s apps testing for differences in online

kids’ apps produced worldwide, and for whether they collect sensitive data.

To study this question, we collected weekly data on Google Play from July to Septem-

ber 2017 from the “Google Family” category. We compare this to apps that, rather than

choosing to certify, instead target children through keywords such as ‘preschool’ and ‘toddler.’

Our dataset includes 10,280 apps, corresponding to 4,516 different developers located in 88

countries, and a panel of 93,227 observations. Identification of the developer’s country is

based on the address provided by the developer.

The results show that developers located in regions not covered by privacy regulation

collect more sensitive data about children, relative to developers based in the US or OECD.

However, developers who comply with the Google self-certification program and are located

in countries without strong privacy regulation also tend to collect less data about

children. This suggests there are spillover effects on the behavior of foreign developers from

platform efforts to facilitate developer compliance with US privacy regulation. The results

are robust whether we look at broader definitions of sensitive data, or in particular at the

collection of granular location data.

We contribute to three literatures: the economics of privacy, the economics of smartphone

applications, and a more general literature on children’s Internet usage.

The first literature we contribute to examines the economic effects of privacy regulation.

This highlights the tradeoff between the protection of individuals and the development

of further innovation (Goldfarb and Tucker, 2012), which builds on specific studies of the

effects of privacy regulation on firm performance (Goldfarb and Tucker, 2011), competition

(Campbell et al., 2015) and welfare outcomes (Miller and Tucker, 2009). This is the first

study to our knowledge which documents the effects of privacy regulation focused on protect-

ing the privacy of children. It also builds on the finding by Rochelandet and Tai (2016) that

there is a relationship between privacy regulation and location. We show that in the global

app economy, developers are influenced by the presence or lack of protections, and that

privacy regulation can also have international spillovers on behavior.

Our findings have direct relevance for a second literature on the economics of mobile

applications. This body of work focuses mainly on the characteristics of killer apps, and

estimation of demand and supply conditions. For instance, Ghose and Han (2014) use a

structural model to estimate the factors influencing consumers’ demand for apps. Their

results suggest that demand for children’s apps is higher than demand for adult apps. They

show also that kids’ apps have lower marginal costs of production compared to other age

restricted categories. Yin et al. (2014) investigate the differences between game and non-

game apps in relation to achieving ‘killer app’ status. They find that developers of non-game

apps have a higher chance of developing a killer app if they focus on a single app and improve

it via updates. In the case of game apps, the probability of a particular app being successful

increases with the developer’s experience. We build in particular on a literature which shows

the role of platform design on the strategies of app developers. Ershov (2017) investigates

how the design of the Google Play platform changed entry dynamics, and shows that splitting

games categories into different subcategories reduces search costs and lowers the quality of

new entrants. Kummer and Schulte (2016) show that there is a trade-off for both the demand

side and the app suppliers, between the amount of personal information collected to monetize

a given app and the success of the focal application, measured by the number of installations. While

there is empirical evidence showing the importance of game categories in the smartphone market,

there is no published research on the characteristics of apps aimed at children.

Last, our research also builds on a third literature on children’s use of the Internet more

broadly, and especially on two research streams. One stream has questioned how internet

access affects educational outcomes (Bulman and Fairlie, 2016; Belo et al., 2013) and generally

has suggested mixed effects. The other stream of work has studied the relationship between

the presence of children in the household and Internet use. There is empirical evidence

that Internet use in school affects the level of Internet penetration in households (Belo et al.,

2016). We contribute to this literature by highlighting children’s participation in the mobile

app economy.

This paper has several implications for policy. First, the statistics we provide about the

scope and depth of data collection about children improve upon a variety of existing policy


studies. Two FTC policy reports (FTC, 2012a,b) provide some initial summary statistics

surrounding data collection by apps, but they evaluate only 364 apps and focus on the extent

to which these apps disclose data collection via privacy policies. Another study of web-

sites conducted by the Global Privacy Enforcement Network analyzed the privacy practices

of 1,494 websites worldwide targeting children.5 It found that 67% of these websites required

personal information: 29% asked for names, 20% asked for dates of birth, 12% asked for

phone numbers, 11% asked for addresses, and 9% gathered photos or videos (GPEN, 2015).

We show that in the mobile applications economy, which is increasingly replacing

desktop-oriented websites, data collection, especially for very young children, may be even more

pervasive. This is because, unlike websites, mobile applications do not rely on children being

able to type or report information, but instead automate its collection, meaning they collect

data on particularly young children.

As well as providing some of the first and most comprehensive data about automated data

collection practices surrounding very young children, our empirical analysis also provides

suggestive evidence for policy. First, we identify spillover effects from platform compliance

efforts surrounding US policy regulation on the behavior of foreign developers. Second, as a

baseline matter, our analysis suggests that in a global app economy, even if some developers

are covered by regulation, children’s data may still be collected in a pervasive manner by

developers based in non-regulated countries.

The article is structured as follows: Section 2 reviews the relevant literature. Section

3 presents the econometric models. Section 4 describes the data sources and presents the

descriptive statistics. Section 5 discusses the econometric results and provides some

robustness checks. Section 6 concludes.

5 GPEN includes 29 Data Protection Authorities worldwide. See ‘2015 GPEN Sweep - Children’s Privacy’: http://194.242.234.211/documents/10160/0/GPEN+Privacy+Sweep+2015.pdf


2 Description of the sample

We collected weekly data on smartphone applications for children from the US Google Play

store using the Google KID category and a keyword search. First, we collected the

characteristics of apps in the “Designed for Families” category, aimed at children aged under 13 years.

These included three age subcategories: 5 & under, 6 - 8, and 9 & over.6 Second,

we constructed a benchmark group of applications aimed at children by simulating the user’s

(parent’s) browsing of Google Play to identify children’s apps. Using Google AdWords, we

identified three groups of keywords most frequently associated with children’s

applications: the SEARCH group of keywords for children aged 5 & under, including “2 year old”, “3

year old”, “4 year old”, “5 year old”, “babies”, “baby”, “kindergarten”, “kindergartners”,

“preschool”, “preschoolers”, “toddler”, “toddlers”; the SEARCH group of keywords for chil-

dren aged between 6 and 8 years including “6 year old”, “7 year old”, “8 year old”, and the

SEARCH group of keywords for children aged 9 & over including “9 year old”, “10 year old”,

“11 year old”, and “12 year old”.
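The three keyword groups can be written down as a small lookup table. The sketch below is only illustrative: the keyword lists are the ones given in the text, while the dictionary labels and the helper `age_group_for_keyword` are hypothetical names introduced here, not part of the authors' collection code.

```python
# Keyword groups as listed in the text. The dictionary keys are
# illustrative labels chosen to mirror the Google Play age subcategories.
SEARCH_KEYWORDS = {
    "5 & under": [
        "2 year old", "3 year old", "4 year old", "5 year old",
        "babies", "baby", "kindergarten", "kindergartners",
        "preschool", "preschoolers", "toddler", "toddlers",
    ],
    "6 - 8": ["6 year old", "7 year old", "8 year old"],
    "9 & over": ["9 year old", "10 year old", "11 year old", "12 year old"],
}


def age_group_for_keyword(keyword):
    """Return the age group whose SEARCH list contains the keyword, else None."""
    normalized = keyword.strip().lower()
    for group, keywords in SEARCH_KEYWORDS.items():
        if normalized in keywords:
            return group
    return None
```

An app returned by a search on “toddler” would then be assigned to the SEARCH group for children aged 5 & under.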

Our sample consists of apps included on Google Play or identified in the keywords searches

at least once during the period of study. Over 12 weeks, we tracked each application starting

from its first appearance to the end of the sample period. New apps appear over time while

others become unavailable: the number of apps available in the Google Play category or

identified by the keyword search increased from 5,154 to 10,280. Our sample includes 93,227

observations; 80% of the applications included a clear developer address. Developers were

located in 88 countries. Table 1 presents the descriptive statistics of the overall sample.
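The tracking rule described above (each app is followed from its first appearance to the end of the sample period) can be sketched as follows. This is a minimal illustration under stated assumptions: the input is a hypothetical list of weekly sets of app identifiers, and keeping a row with an availability flag for weeks in which an app has disappeared is our assumption, not a documented feature of the dataset.

```python
def build_panel(weekly_snapshots):
    """Build (app_id, week, available) rows from weekly sets of app ids.

    Each app enters the panel in the week it is first observed and is
    tracked until the last week. `available` is False in weeks where the
    app no longer appears (an assumption about how exits are recorded).
    """
    first_seen = {}
    for week, apps in enumerate(weekly_snapshots):
        for app in apps:
            first_seen.setdefault(app, week)

    rows = []
    for app, start in sorted(first_seen.items()):
        for week in range(start, len(weekly_snapshots)):
            rows.append((app, week, app in weekly_snapshots[week]))
    return rows
```

Because apps enter at different weeks, the resulting panel is unbalanced, which is consistent with 10,280 apps yielding fewer than 12 observations per app on average.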

The Designed for Families program includes six broad categories: Action & Adventure,

Brain Games, Creativity, Education, Music and Video, and Pretend Play with an additional

three categories aimed at children aged 5 & under, 6 - 8 and 9 & over. The content

included in Google Play Family is rated “Everyone” according to the Entertainment Software

Rating Board (ESRB) definition (see Figure 2 in appendix).

6 Figure 2 shows an example of the type of data collected.


Table 1: Summary statistics for the full sample of apps (panel data of 12 weeks)

Variable                                  Mean     Std. Dev.   Min.   Max.   N

DEPENDENT VARIABLES
  Sensitive data                          0.627    1.241       0      15     93227
  Users’ location data                    0.213    0.607       0      5      93227

INDEPENDENT VARIABLES
Apps’ characteristics
  Users interact                          0.048    0.213       0      1      93227
  Unrestricted internet                   0.001    0.038       0      1      93227
  Contains Ad                             0.578    0.494       0      1      93227
  Freemium                                0.355    0.479       0      1      93227
  Log nbr reviews                         5.30     3.48        0      17.9   93227
  Exit                                    0.348    0.476       0      1      93227
Data collection
  Google KID category                     0.230    0.421       0      1      93227
  Search by keywords                      0.340    0.474       0      1      93227
  Search by both (KID and Keywords)       0.081    0.273       0      1      93227
Macro level
  Without developer address (Reference)   0.197    0.405       0      1      93227
  OECD                                    0.378    0.485       0      1      93227
  No OECD                                 0.424    0.494       0      1      93227
COMPLIANCE WITH EU REGULATION:
  Member of the EU                        0.263    0.440       0      1      93163
  Recognized by the EU                    0.287    0.452       0      1      93163
  Independent authority                   0.070    0.255       0      1      93163
  With legislation                        0.143    0.350       0      1      93163
  No privacy law                          0.029    0.169       0      1      93163
PRIVACY LEGISLATION:
  Heavy                                   0.518    0.500       0      1      91482
  Robust                                  0.135    0.342       0      1      91482
  Moderate                                0.052    0.222       0      1      91482
  Limited                                 0.083    0.275       0      1      91482
INCOME LEVEL:
  High income                             0.635    0.481       0      1      93227
  Upper middle income                     0.078    0.268       0      1      93227
  Low and middle income                   0.079    0.269       0      1      93227

Notes: Sensitive data and Users’ location data are the two variables of interest. Other variables are regressors for econometric estimations including macro-economic variables. The dummy variable Freemium takes value 1 if the application allows in-app purchases and/or digital purchases.

Our empirical strategy allows us to measure whether the platform policy related to chil-

dren’s content provides effective protection for their personal data, compared to the bench-

mark group. We collected all publicly available data such as app characteristics (number of

installations, free or paid apps), developer’s name and address, type of interactive elements


proposed by the app, and number and type of permissions required by developers.

We are interested in 1) measuring the effectiveness of the platform policy to protect

children, and 2) testing whether foreign developers collect more sensitive data and adhere to

the Google Play program.

2.1 Dependent variables: Pieces of sensitive data collected and users’ location data

To measure whether children’s apps comply with United States privacy legislation we identify

two measures of sensitive data collection. First, to measure whether the apps ask for sensitive

data, we create the variable Sensitive Data to count the number of pieces of sensitive data

collected by each app. This variable is created using two sources of information: the number

of sensitive permissions required by each app, and the interactive elements ‘Shares Location’

and ‘Shares Info’. To identify the list of sensitive permissions, we

use the classification in Sarma et al. (2012) which evaluates the privacy intrusiveness of the

permissions related to the Android system.
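As a rough illustration of how such a count variable could be built, the sketch below sums the sensitive permissions an app requests and the sensitive interactive elements it declares. The permission and element labels are the ones reported in Table 2; the function name, the input format and the exact string matching are assumptions made for this example, not the authors' code.

```python
# Labels taken from Table 2. Matching on exact label strings is an
# assumption about how the scraped Google Play data is encoded.
SENSITIVE_PERMISSIONS = {
    "Access Extra Location Provider",
    "Approximate Network Based Location",
    "Read Text Messages Sms/Mms",
    "Precise Gps Location",
    "Read Calendar Events Plus Conf",
    "Read Call Log",
    "Read Sensitive Log Data",
    "Read Contacts",
    "Read Own Contact Card",
    "Read Owner Data",
    "Read Phone Status And Identity",
    "Edit Text Messages Sms/Mms",
    "Read Web Bookmarks And Hi",
    "Record Audio",
    "Reroute Outgoing Calls",
}
SENSITIVE_INTERACTIVE_ELEMENTS = {"Shares location", "Shares info"}


def sensitive_data_count(permissions, interactive_elements):
    """Count distinct sensitive permissions plus sensitive interactive elements."""
    n_permissions = len(set(permissions) & SENSITIVE_PERMISSIONS)
    n_elements = len(set(interactive_elements) & SENSITIVE_INTERACTIVE_ELEMENTS)
    return n_permissions + n_elements
```

For example, an app requesting “Precise Gps Location” and “Record Audio” and declaring “Shares location” would receive a Sensitive Data value of 3, while permissions outside the list do not add to the count.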

Table 2 presents the descriptive statistics of the sensitive data collected by developers.

Column 1 presents the statistics for the whole sample while columns 2 and 3 present the app

statistics respectively without and with developer addresses. All the standard deviations are

higher than the means, suggesting substantial heterogeneity among apps in terms of sensitive

data requested. Approximate Network Based Location and Precise GPS Location are gen-

erally more often requested by developers who do not declare a geographical address. It is

possible that users’ location data are more valuable, and developers who request them hide

their identity because these are sensitive data.

Second, to measure whether the apps collect users’ location data, we create a second

variable: Users’ location data. This counts the number of permissions requiring the user’s

location, and also considers whether the app declares the interactive element ‘Shares

Location’. Table 3 lists the location data items required by apps. Columns 2 and 3 respectively


Table 2: List of permissions and interactive elements used to construct the dependent variable Sensitive data

Columns: (1) Overall; (2) No Developer Address; (3) With Developer Address

                                        Mean(1)  sd(1)    Mean(2)  sd(2)    Mean(3)  sd(3)
SENSITIVE PERMISSIONS
  Access Extra Location Provider        0.004    0.062    0.013    0.111    0.002    0.039
  Approximate Network Based Location    0.104    0.305    0.172    0.377    0.086    0.281
  Read Text Messages Sms/Mms            0.010    0.097    0.013    0.111    0.009    0.093
  Precise Gps Location                  0.088    0.283    0.144    0.351    0.073    0.261
  Read Calendar Events Plus Conf        0.004    0.066    0.006    0.080    0.004    0.062
  Read Call Log                         0.008    0.089    0.010    0.100    0.007    0.086
  Read Sensitive Log Data               0.007    0.085    0.012    0.108    0.006    0.078
  Read Contacts                         0.024    0.152    0.038    0.192    0.020    0.139
  Read Own Contact Card                 0.004    0.063    0.005    0.073    0.004    0.060
  Read Owner Data                       0.001    0.037    0.001    0.024    0.002    0.039
  Read Phone Status And Identity        0.245    0.430    0.249    0.433    0.244    0.429
  Read Text Messages Sms/Mms            0.010    0.097    0.013    0.111    0.009    0.093
  Edit Text Messages Sms/Mms            0.003    0.058    0.003    0.057    0.003    0.059
  Read Web Bookmarks And Hi             0.005    0.070    0.007    0.085    0.004    0.066
  Record Audio                          0.069    0.254    0.072    0.259    0.069    0.253
  Reroute Outgoing Calls                0.005    0.070    0.006    0.076    0.005    0.069
INTERACTIVE ELEMENTS
  Interactive elements Shares location  0.015    0.123    0.021    0.143    0.014    0.117
  Interactive elements Shares info      0.021    0.142    0.017    0.131    0.021    0.145
N                                       93227             19415             73812

Notes: This table depicts the summary statistics of the permissions and interactive elements used to construct the dependent variable Sensitive data. Sd is the standard deviation. Column 1 shows the descriptive statistics of the overall sample. Column 2 shows the descriptive statistics for the apps without developer address. Column 3 shows the summary statistics for the apps with developer address.


present the statistics of the apps without and with developer addresses. Developers that do

not disclose their address on apps in Google Play collect more user location data compared

to other developers.

Table 3: List of permissions and interactive elements used to construct the dependent variable Users’ location data

Columns: (1) Overall; (2) No developer address; (3) With developer address

                                    Mean(1)  sd(1)    Mean(2)  sd(2)    Mean(3)  sd(3)
LOCATION PERMISSIONS
  Access Extra Location Provider    0.003    0.061    0.012    0.111    0.001    0.038
  Approximate Network based Loc     0.104    0.305    0.171    0.377    0.086    0.280
  Mock Location Sources For Test    0.001    0.036    0.002    0.047    0.001    0.032
  Precise Gps Location              0.088    0.283    0.144    0.351    0.073    0.260
INTERACTIVE ELEMENTS
  Shares location                   0.015    0.123    0.020    0.143    0.013    0.117
N                                   93227             19415             73812

Notes: This table depicts the summary statistics of the permissions and interactive elements used to construct the dependent variable Users’ location data. Sd is the standard deviation. Column 1 shows the descriptive statistics of the overall sample. Column 2 shows the descriptive statistics for the apps without developer address. Column 3 shows the summary statistics for the apps with developer address.

2.2 App characteristics

Google Play provides a large set of information for all apps, and this information allows a

better understanding of the children’s apps market. In particular, the dummy variable

Everyone indicates suitability for both children and adults. The set of Family (sub)category

variables indicates the Google Family category: Action & Adventure, Brain Games,

Creativity, Education, Music and Video, and Pretend Play. To measure app success, we include in

the regression Log nbr reviews, the log of the number of reviews received by each app, which

measures real usage of the app rather than the number of installations.
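A short sketch of the review transformation follows. The text does not say how apps with zero reviews are handled; using log(1 + n), so that such apps take the value 0, is an assumption made here for illustration, and `log_nbr_reviews` is a hypothetical helper name.

```python
import math


def log_nbr_reviews(n_reviews):
    """Return log(1 + n_reviews); the +1 offset is an assumption so that
    apps with no reviews map to 0 instead of an undefined log(0)."""
    if n_reviews < 0:
        raise ValueError("number of reviews cannot be negative")
    return math.log1p(n_reviews)
```

The log scale compresses the very skewed distribution of review counts, so that a handful of extremely popular apps do not dominate the estimates.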

The variables Freemium and Contains Ad capture the business model. The binary variable

Freemium indicates whether the application offers in-app purchases or digital purchases.

The binary variable Contains Ad takes the value 1 if the app displays ads to users.


The rating given by users (variable User rating) measures the application’s

quality on a 1 to 5 scale. To measure app popularity, we use the number of ratings rather

than the number of installations provided by Google, mainly because Google reports

download counts within a range rather than as exact numbers.

To measure the application’s behavior, we use the set of dummy variables for

the interactive elements available on the Play Store, based on the ESRB ratings7:

• Users Interact - Indicates possible exposure to unfiltered/uncensored user-generated

content including user-to-user communications and media sharing via social media and

networks

• Unrestricted Internet - Product provides access to the internet

2.3 Geographical location of developers

To explore regulation spillovers to other countries, we retrieved geographic information dis-

closed by developers of apps in the Google Play store. First, we collected location latitudes

and longitudes to identify the country, using Google Maps APIs. Second, we created an

algorithm to search for country name in the developer address provided. Third, we checked

the match between location identified using the Google Maps APIs and the country name

identified by the algorithm. Fourth, we identified missing geographical location and created

the variable Without developer address. In our sample, 20% of apps contain no information

on the developers’ geographical address.
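The four steps can be sketched as below. This is not the authors' algorithm: the geocoded country is passed in as a plain string (we make no call to any mapping API here), the country list is a hypothetical input, and flagging mismatches for manual review is an assumption about how disagreements between the two sources might be handled.

```python
import re


def find_country_in_address(address, country_names):
    """Return the first country name found in the address text, else None.

    Longer names are tried first so that a name contained in another one
    does not shadow it. Matching is case-insensitive on word boundaries.
    """
    if not address:
        return None
    for name in sorted(country_names, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, address, flags=re.IGNORECASE):
            return name
    return None


def resolve_developer_country(address, geocoded_country, country_names):
    """Combine the geocoded country and the country found in the address.

    Assumes `geocoded_country` uses the same spelling as `country_names`.
    Returns a dict with the resolved country, the Without developer
    address dummy, and a flag for cases where the two sources disagree.
    """
    text_country = find_country_in_address(address, country_names)
    candidates = {c for c in (geocoded_country, text_country) if c}
    if not candidates:
        return {"country": None, "without_developer_address": 1, "needs_review": False}
    if len(candidates) == 1:
        return {"country": candidates.pop(), "without_developer_address": 0, "needs_review": False}
    return {"country": None, "without_developer_address": 0, "needs_review": True}
```

Cross-checking the two sources guards against geocoding errors on ambiguous street addresses and against addresses that mention a country name in a street or company name.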

The average number of applications per country is about 89.5. US developers are

responsible for some 29% of the applications in our sample. While India, the United Kingdom and

the United States account for more than 400 apps each, some other countries such as Qatar,

Tunisia and Costa Rica produce only a single app.

7 ESRB is a non-profit, self-regulatory body that assigns ratings to video games and apps to classify content according to its target. http://www.esrb.org/ratings/ratings_guide.aspx#elements


2.3.1 National privacy regulation

Privacy regulation rules vary across countries, and we exploit this variation to characterize

country privacy policy. To assess differences in national regulatory frameworks, we augment

these data with a Privacy regulation index which is a vector of the country’s privacy regu-

lation, and thus, is associated with the developer’s address. We use two privacy regulation

indexes to measure the level of privacy regulation in the developer’s country. First, we use

DLA Piper’s Global Data Protection Laws of the World to compare national data protection

laws. This measures the level of regulation and enforcement in each country on a scale from

Heavy to Limited. Heavy indicates strong privacy protection, and Limited indicates a low

level of privacy protection.8

Second, we use a measure of privacy regulation which indicates the country’s level of

compliance with EU privacy legislation9. This index is computed by the French Privacy Reg-

ulation Authority (CNIL)10. The dummy variable Member of the EU identifies the developer

country as belonging to the EU or the EEA11, and indicates that the country’s privacy laws

are compatible with EU legislation. The binary variable Independent authority and law(s)

indicates the existence of an independent authority regulating privacy. The binary variable

With legislation indicates that the country has some level of privacy legislation while the

dummy variable No privacy law indicates absence of privacy laws in the developer’s country.

Table 11 in the Appendix presents countries categorized according to their level of compli-

ance with EU privacy legislation.

2.3.2 Macro-economic indicators: Graphical evidence

The developer’s strategy might also be associated with the home-country institutional framework. To

measure these effects, we include two sets of variables. First, we consider whether OECD

country developers demonstrate behavior that is different from that displayed by developers

8 https://www.dlapiperdataprotection.com

9 Table 11 indicates the countries that belong to each group of privacy legislation.

10 https://www.cnil.fr/fr/la-protection-des-donnees-dans-le-monde, last retrieved on 8 January 2018.

11 European Economic Area


located in non-OECD countries with weaker institutions and regulation. Second, we include

the country income level computed by the World Bank, in order to measure the effect of the

level of economic development of the developer’s home country. This variable captures the relative costs

associated with collecting and storing personal data for developers located in low-income

countries.

Figure 1 depicts the average number of pieces of sensitive data as a percentage of total

possible pieces of sensitive data per group of countries, and highlights the average percentage

of sensitive data items collected by developers that do not indicate their geographical address.

The statistics show that overall, developers that do not indicate their geographical address

collect more data compared to developers that declare their location. The top left histogram

in Figure 1 shows the distribution of sensitive data items in OECD and non-OECD countries.

The bottom left histogram shows the distribution of sensitive data according to the privacy

index. The bottom right histogram depicts the distribution of sensitive data according to

the level of income. The amount of sensitive data collected tends to decrease if the developer

is based in an OECD country. The top right histogram shows the distribution of sensitive

data collected according to the privacy regulation regime. Developers from countries with

no privacy laws collect the largest amounts of data compared to other developers, followed

by developers who do not indicate their geographic location.


Figure 1: Distribution of sensitive data per group of countries

[Figure 1: four bar-chart panels, each with vertical axis “Avg sensitive data” ranging from 0 to 1. Top left: No address, OECD, No OECD, USA. Top right: No address, Member of the EU or EEA, Recognized by the EU, Independent authority & law, With legislation, No privacy law, USA. Bottom left: No address, Heavy, Robust, Moderate, Limited, USA. Bottom right: No address, High income, Upper middle income, Lower middle income, USA.]

Notes: The vertical axis is the percentage of sensitive data collected by developers.

2.4 Sensitive data and users’ location data by age group

Table 4 presents descriptive statistics of sensitive data by age group and data source: the Google Kid category versus organic Search by keywords. Among apps in the Google Kid category (see Columns 1, 3, and 5), those targeting Age 9 & up collect more sensitive data (0.32 on average) than those aimed at Age 5 & under (0.10). The pattern differs for apps collected using organic Search by keywords: sensitive data collection is highest for Age 5 & under (0.47), and in every age group it is higher than in the Google Kid category. Table 5 presents descriptive statistics of the dependent variables and the explanatory variables by data source. Overall, apps selected via Search by keywords collect more sensitive data and more data on users’ location.


Table 4: Average number of sensitive data and users’ location data collected: Full sample

                        Google       Search       Google   Search   Google   Search
                        5 & Under    5 & Under    6-8      6-8      9 & Up   9 & Up
                        (1)          (2)          (3)      (4)      (5)      (6)
ALL COUNTRIES
Sensitive data          0.10         0.47         0.15     0.30     0.32     0.40
Users’ location data    0.05         0.27         0.08     0.15     0.18     0.20

Notes: This table shows the average number of sensitive data and users’ location data required by developers, by age group and data source (Google Kid classification versus Search by keywords).

Table 5: Detailed descriptive statistics per source of data

Columns: (1) Google kid; (2) Search keyword; (3) Google Kids & keyword; (4) Exit

Variable                (1)             (2)             (3)             (4)
                        Mean    sd      Mean    sd      Mean    sd      Mean    sd
Sensitive data          0.455   0.791   0.819   1.583   0.404   0.737   0.606   1.169
Users’ location data    0.123   0.445   0.302   0.729   0.096   0.388   0.211   0.595
Users interact          0.037   0.190   0.069   0.253   0.031   0.173   0.038   0.192
Unrestricted internet   0.000   0.007   0.004   0.059   0.000   0.000   0.001   0.024
Contains ad             0.472   0.499   0.687   0.464   0.591   0.492   0.540   0.498
Freemium                0.323   0.468   0.407   0.491   0.527   0.499   0.286   0.452
Free                    0.511   0.500   0.945   0.228   0.783   0.412   0.658   0.474
Log nbr reviews         5.732   3.518   6.800   3.144   8.397   2.489   3.828   3.228
OECD                    0.411   0.492   0.344   0.475   0.474   0.499   0.368   0.482
No OECD                 0.472   0.499   0.398   0.489   0.472   0.499   0.407   0.407
Member of the EU        0.259   0.438   0.246   0.431   0.329   0.470   0.266   0.442
Recognized by the EU    0.386   0.487   0.217   0.412   0.395   0.489   0.265   0.441
Independent authority   0.093   0.291   0.057   0.232   0.092   0.289   0.062   0.241
With legislation        0.104   0.306   0.181   0.385   0.110   0.313   0.138   0.345
No privacy law          0.030   0.172   0.029   0.169   0.018   0.134   0.031   0.173
Heavy                   0.652   0.476   0.406   0.491   0.659   0.474   0.505   0.500
Robust                  0.116   0.321   0.163   0.370   0.194   0.396   0.107   0.309
Moderate                0.049   0.216   0.054   0.225   0.054   0.225   0.051   0.221
Limited                 0.054   0.226   0.101   0.302   0.038   0.191   0.094   0.292
High income             0.745   0.436   0.557   0.497   0.804   0.397   0.601   0.490
Upper middle income     0.075   0.263   0.080   0.272   0.087   0.281   0.075   0.264
Low and middle income   0.054   0.226   0.093   0.291   0.055   0.228   0.086   0.280
Observations            21476           31723           7561            32467

Notes: The table presents the summary statistics of all the variables. Column 1 shows the descriptive statistics of the apps collected via the Google Play Family group. Column 2 shows the descriptive statistics of applications collected via organic search by keywords. Column 3 shows the descriptive statistics of the apps identified via both search methods. Column 4 shows the descriptive statistics of the applications that exit at one point from the Google KID category and search by keywords.

2.5 Descriptive statistics of apps without geographical address of developers

COPPA legislation requires that parents be informed about the companies that collect children’s data and, in particular, that companies indicate their contact details: name, email, and geographical address. In our sample, 20% of apps do not include a developer address. Table 6 presents descriptive statistics for apps without a developer address (column (1)) and with one (column (2)).

As Table 6 shows, developers who do not indicate their address collect more sensitive data and more user location data. They also rely more on advertising. Apps without addresses are mostly in the categories Casual, Entertainment, Lifestyle, and Health and Fitness.

Table 6: Breakdown statistics of applications with addresses and without addresses

Columns: (1) No developer address; (2) Developer address

Variable                (1)             (2)
                        Mean    sd      Mean    sd
Sensitive data          0.803   1.386   0.584   1.196
Users’ data location    0.352   0.756   0.175   0.554
Users interact          0.058   0.233   0.046   0.210
Unrestricted internet   0.003   0.056   0.001   0.031
Contains ad             0.716   0.451   0.537   0.499
Freemium                0.066   0.248   0.431   0.495
Free                    0.972   0.164   0.662   0.473
Log nbr reviews         4.238   3.172   5.580   3.507
Search by both          0.021   0.144   0.098   0.297
Google KID category     0.140   0.347   0.256   0.436
Search by keyword       0.440   0.496   0.311   0.463
Exit                    0.398   0.490   0.335   0.472
Action and Adventure    0.052   0.223   0.106   0.308
Brain Games             0.092   0.289   0.137   0.344
Creativity              0.076   0.266   0.078   0.268
Education               0.130   0.336   0.333   0.471
Music and Video         0.027   0.163   0.032   0.176
Pretend Play            0.073   0.260   0.110   0.313
Observations            19415           71749

Notes: Column 1 shows the descriptive statistics of the characteristics of apps that do not have geographical information. Column 2 shows the statistics of the characteristics of apps with a developer address.


3 Model specification

Our econometric analysis estimates the effect of regulation in the developer’s country of origin on the amount of sensitive data collected from children. The dependent variable, Sensitive data, measures the amount of sensitive data collected by application i (i = 1 to N = 10,280) in week t (t = 1 to T = 12) in country j (j = 1 to 88):

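\begin{equation}
\begin{aligned}
\mathit{SensitiveData}_{itj} ={} & \beta_0 + \beta_1 \mathit{Apps}_{itj} + \beta_2 \mathit{SourceData}_{itj} + \beta_3 \mathit{WithoutDeveloperAddress}_{itj} \\
& + \beta_4 \mathit{PrivacyRegulation}_{itj} + \rho_t + \alpha_j + \varepsilon_{itj}
\end{aligned}
\tag{1}
\end{equation}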

where Apps is the vector of characteristics of app i at time t in country j. The explanatory variables in the vector Apps include the variable Exit, which measures whether the app exited from one of the data sources during the observation period. SourceData includes the set of variables for the source of the data, namely Google KID Category, Search by Keywords, or Both. The dummy variable WithoutDeveloperAddress indicates whether the developer fails to display a geographical address.

To measure whether differences in the number of sensitive data items required by developers reflect differences in privacy regulation, Privacy Regulation is included as a vector of macro-economic variables. We use two measures of privacy regulation. First, we include the level of privacy protection according to European legislation, using a set of dummy variables that capture the country’s level of compliance. Second, we consider the international privacy index computed by DLA Piper, a global law firm. We alternately include the income level available from the World Bank and a dummy variable for whether the developer’s country belongs to the Organisation for Economic Co-operation and Development (OECD). The equation also includes time (week) and country fixed effects, ρ_t and α_j, respectively. We cluster the standard errors at the country level to account for correlation among observations within the same country.


The number of pieces of sensitive personal data required by an app is a count variable, for which the Poisson model is the natural starting point. An important condition of the Poisson model is that the conditional variance equals the conditional mean (equidispersion). Because our dependent variable is overdispersed (see Table 1), our empirical strategy is based on a negative binomial model, which can be considered a modified Poisson model (Greene (1994)). Overdispersion is corrected by adding an error term that captures between-subject heterogeneity.
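The overdispersion argument can be checked informally against the moments reported in Table 5. The minimal sketch below (Python, standard library only) compares the variance of Sensitive data with its mean in each sample. Several things here are assumptions rather than the authors’ procedure: the helper name `dispersion_summary` is hypothetical, the moments are unconditional rather than conditional on covariates, and the variance function Var(y) = μ + αμ² is one common negative binomial parameterisation (NB2) that the paper does not specify.

```python
# Informal overdispersion check using the means and standard deviations of
# "Sensitive data" reported in Table 5. These are unconditional moments, so
# the result is only suggestive of the conditional overdispersion that the
# negative binomial model is meant to address.

TABLE5_SENSITIVE_DATA = {
    "Google kid": (0.455, 0.791),
    "Search keyword": (0.819, 1.583),
    "Google Kids & keyword": (0.404, 0.737),
    "Exit": (0.606, 1.169),
}


def dispersion_summary(mean, sd):
    """Return (variance/mean ratio, method-of-moments alpha)."""
    variance = sd ** 2
    ratio = variance / mean  # equals 1 under a Poisson distribution
    # Assumed NB2 variance function: Var(y) = mean + alpha * mean**2
    alpha = (variance - mean) / mean ** 2
    return ratio, alpha


summary = {
    name: dispersion_summary(mean, sd)
    for name, (mean, sd) in TABLE5_SENSITIVE_DATA.items()
}

for name, (ratio, alpha) in summary.items():
    print(f"{name:<22} variance/mean = {ratio:.2f}, implied alpha = {alpha:.2f}")
```

In all four samples the variance exceeds the mean, which is in line with the choice of a negative binomial over a Poisson model.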

4 Estimation of the pieces of sensitive data collected by developers

Table 7 presents the estimation results for the number of pieces of sensitive data collected by developers. We identify developers that do not indicate their geographic location with the variable Without developer address. COPPA legislation requires that each company or third party that collects user data provide contact information, such as name and address, to allow parents to contact them. We investigate the impact of privacy regulation and macro-economic characteristics on the number of pieces of sensitive data requested by developers. We include several app characteristics to account for app heterogeneity. All the specifications include country-level controls and category and time fixed effects.

Table 7 column 1 estimates the model with the variable for developers located in an OECD country. It suggests that developers from OECD countries request fewer pieces of sensitive personal data than developers that do not include location information. In other words, developers that fail to declare their address collect more data than other developers. Table 7 column 2 includes a set of dummies for compliance with EU legislation. Developers in EU countries or countries whose privacy laws are compatible with EU legislation request fewer pieces of sensitive data than developers that do not indicate their location, although only the coefficient for EU or EEA members is statistically significant (at the 10% level). This is consistent with the previous estimates. Column 3 estimates the model including the variables for enforcement of privacy legislation and shows no statistically significant effect on the number of pieces of sensitive data collected. Column 4 estimates the model including a set of variables for a country’s income level according to the World Bank. No significant effects were found. Finally, column 5 includes country fixed effects.
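Because a negative binomial model with a log link is multiplicative in the expected count, the coefficients in Table 7 can be read as incidence-rate ratios by exponentiating them. The sketch below does this for three coefficients from columns 1 and 2. The log link and the helper name `incidence_rate_ratio` are assumptions: the paper does not state the link function, although the log link is the usual default for this model.

```python
import math

# Selected coefficients from Table 7 (comparison group: apps without a
# developer address). Assumes the usual log link for count models.
TABLE7_COEFFICIENTS = {
    "OECD (col. 1)": -0.202,
    "Member of the EU or EEA (col. 2)": -0.217,
    "No privacy law (col. 2)": 0.394,
}


def incidence_rate_ratio(beta):
    """Multiplicative change in the expected count for a one-unit change."""
    return math.exp(beta)


irr = {name: incidence_rate_ratio(beta) for name, beta in TABLE7_COEFFICIENTS.items()}

for name, ratio in irr.items():
    change = (ratio - 1.0) * 100.0
    print(f"{name:<34} IRR = {ratio:.3f} ({change:+.1f}% vs. no address)")
```

Under these assumptions, the OECD coefficient corresponds to roughly 18% fewer pieces of sensitive data than the no-address group, and the No privacy law coefficient to roughly 48% more.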

The present study highlights that developers located in countries with reasonable privacy policies tend to comply with their national privacy regulation. This is an important finding, which contributes to our understanding of the global apps market. A developer from any country can offer its apps in a specific market such as the United States. Our results suggest a form of “home-country compliance”: developers from countries with strong regulation appear to collect less personal data, which allows them to comply with the regulation in the target market.


Table 7: Estimation of the pieces of sensitive personal data collected as a function of Privacy regulation, Income Level and Country fixed effects. Reference category is the group of apps without developer address

Variable                      (1)               (2)               (3)               (4)               (5)
Log number of reviews         0.019* (0.010)    0.020** (0.010)   0.025*** (0.009)  0.021** (0.010)   0.017* (0.009)
User rating                   -0.002 (0.016)    -0.005 (0.015)    -0.002 (0.014)    -0.003 (0.016)    -0.007 (0.013)
Everyone                      -0.069 (0.099)    -0.058 (0.096)    -0.050 (0.097)    -0.087 (0.101)    -0.060 (0.088)
Users interact                0.815*** (0.085)  0.839*** (0.084)  0.830*** (0.085)  0.832*** (0.087)  0.794*** (0.102)
Unrestricted internet         1.283*** (0.200)  1.297*** (0.196)  1.276*** (0.199)  1.275*** (0.201)  1.240*** (0.181)
Contains ad                   0.044 (0.152)     0.005 (0.148)     -0.001 (0.150)    0.020 (0.149)     -0.075 (0.145)
Freemium                      0.169* (0.093)    0.172* (0.090)    0.160* (0.087)    0.164* (0.089)    0.204*** (0.076)
Google KID category           0.198* (0.120)    0.171 (0.113)     0.192* (0.114)    0.193 (0.118)     0.149 (0.109)
Search by keyword             0.022 (0.104)     0.013 (0.107)     0.017 (0.109)     0.016 (0.105)     0.045 (0.095)
Exit                          0.188** (0.095)   0.170* (0.097)    0.176* (0.097)    0.176* (0.095)    0.151 (0.096)
With developer address        ref.
OECD                          -0.202* (0.114)
No OECD                       -0.047 (0.076)
With developer address                          ref.
Member of the EU or EEA                         -0.217* (0.129)
Recognized by the EU                            -0.149 (0.097)
Independent authority & law                     0.112 (0.173)
With legislation                                -0.062 (0.124)
No privacy law                                  0.394* (0.207)
With developer address                                            ref.
Heavy                                                             -0.164 (0.112)
Robust                                                            -0.186 (0.160)
Moderate                                                          0.218 (0.250)
Limited                                                           0.092 (0.112)
With developer address                                                              ref.
High income                                                                         -0.141 (0.093)
Upper middle income                                                                 -0.063 (0.181)
Low and middle income                                                               0.118 (0.113)
Period fixed effect           Yes               Yes               Yes               Yes               Yes
Google Category fixed effect  Yes               Yes               Yes               Yes               Yes
Country fixed effect          No                No                No                No                Yes
N                             93227             93163             91482             93227             93227
R2                            0.054             0.055             0.055             0.054             0.074

Notes: We show negative binomial estimates. The dependent variable is the number of pieces of sensitive data collected by apps. Estimations include the dummy variable Without developer address. Robust standard errors clustered at country level are reported in parentheses. Reference category: Search on both. Column 1 estimates the model with the dummy variable OECD. Column 2 includes a set of dummies measuring EU country compliance with EU legislation. Column 3 includes a set of variables measuring privacy regulation and enforcement (reference category: heavy privacy legislation). Column 4 includes World Bank income classification with High Income as the reference. Column 5 includes country fixed effects (reference country: Morocco). All the regressions include week fixed effects. * p < .10, ** p < .05, *** p < .01

Table 8 reports the marginal effects at the mean for the main specification. Column 1 shows that developers in OECD countries collect 0.125 fewer pieces of sensitive data than developers that do not indicate their address. To test our main hypothesis, we investigate how the Google KID category moderates the effect of privacy regulation and the institutional framework, in order to measure the spillover effects of US legislation. To do this, we re-estimate the model with interaction effects between the data source and the variables OECD and Compliance with EU law. Table 8 column 2 includes the interaction No OECD x Google KID category: if the developer is not located in an OECD country and decides to comply with the Google KID program, the number of pieces of sensitive data collected decreases by 0.165. This suggests a spillover effect of US regulation on non-OECD countries when the developer decides to comply with the Google Family program.
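The link between the coefficients in Table 7 and the marginal effects in Table 8 can be illustrated with a short calculation. In a log-link count model, the marginal effect of a regressor at the mean is its coefficient multiplied by the predicted count at the mean, so the ratio of the two recovers that predicted count. The sketch below applies this to column 1 of both tables. Treating the reported values as derivative-based marginal effects (rather than discrete changes for dummy variables) is an assumption, and the helper name `implied_predicted_count` is hypothetical.

```python
# (coefficient from Table 7 col. 1, marginal effect from Table 8 col. 1)
COLUMN1_PAIRS = {
    "Users interact": (0.815, 0.511),
    "Unrestricted internet": (1.283, 0.805),
    "Freemium": (0.169, 0.106),
    "Google KID category": (0.198, 0.124),
    "Exit": (0.188, 0.118),
    "OECD": (-0.202, -0.125),
}


def implied_predicted_count(beta, marginal_effect):
    """With a log link, ME = beta * mu_bar, so ME / beta recovers mu_bar,
    the predicted count evaluated at the sample means."""
    return marginal_effect / beta


implied = {
    name: implied_predicted_count(beta, me)
    for name, (beta, me) in COLUMN1_PAIRS.items()
}

for name, mu_bar in implied.items():
    print(f"{name:<22} implied predicted count at the mean = {mu_bar:.3f}")
```

The implied values all lie between 0.60 and 0.65, so each marginal effect in Table 8 column 1 is roughly the matching Table 7 coefficient scaled by the same predicted count. Small differences are expected from rounding in the published tables.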

Table 8 column 4 shows the interaction effects between the EU compliance index and the data source. While the interaction Recognized by the EU x Search by keyword is positive and statistically significant, the interaction Recognized by the EU x Google KID category is negative and statistically significant, which suggests a spillover of US legislation if developers decide to participate in the Google KID program. Compliance with the Google KID program reduces the number of pieces of sensitive data collected by 0.224 units for developers located in Argentina, Canada, Israel, New Zealand, Switzerland, the United States and Uruguay (Recognized by the EU). Similarly, the interaction With legislation x Google KID category indicates a spillover effect: developers located in countries with privacy legislation that comply with the Google self-certification program collect on average 0.116 fewer pieces of sensitive data. The interaction No privacy law x Google KID category indicates that there is no spillover effect for developers located in countries with no privacy laws that are in the Google KID category: these developers request on average 0.220 additional pieces of sensitive data.


Table 8: Estimation of number of Sensitive data (Marginal effects): Moderating effects of the Google KID category vs. Search by Keywords. Reference category is the group of apps without developer address

Variable                                            (1)               (2)               (3)               (4)
Log number of reviews                               0.012* (0.006)    0.012* (0.007)    0.012** (0.006)   0.012* (0.006)
User rating                                         -0.001 (0.010)    -0.001 (0.010)    -0.003 (0.010)    -0.004 (0.010)
Everyone                                            -0.043 (0.062)    -0.048 (0.060)    -0.037 (0.060)    -0.044 (0.058)
Users interact                                      0.511*** (0.047)  0.509*** (0.046)  0.527*** (0.054)  0.519*** (0.051)
Unrestricted internet                               0.805*** (0.137)  0.801*** (0.135)  0.815*** (0.136)  0.765*** (0.119)
Contains ad                                         0.028 (0.096)     0.025 (0.097)     0.003 (0.093)     -0.010 (0.092)
Freemium                                            0.106* (0.058)    0.104* (0.058)    0.108* (0.057)    0.102* (0.057)
Google KID category                                 0.124* (0.075)    0.221*** (0.061)  0.107 (0.069)     0.181*** (0.062)
Search by keyword                                   0.014 (0.065)     0.019 (0.068)     0.008 (0.067)     -0.014 (0.067)
Exit                                                0.118** (0.060)   0.121* (0.063)    0.107* (0.061)    0.099 (0.063)
OECD                                                -0.125* (0.068)   -0.114 (0.079)
No OECD                                             -0.032 (0.050)    -0.007 (0.072)
OECD X Search by keyword                                              -0.025 (0.077)
OECD X Google KID category                                            -0.049 (0.070)
No OECD X Search by keyword                                           0.012 (0.114)
No OECD X Google KID category                                         -0.165*** (0.061)
Member of the EU or EEA                                                                 -0.132* (0.074)   -0.135 (0.087)
Recognized by the EU                                                                    -0.094 (0.060)    -0.143*** (0.053)
Independent authority & law                                                             0.080 (0.130)     0.177 (0.138)
With legislation                                                                        -0.041 (0.079)    0.021 (0.105)
No privacy law                                                                          0.327 (0.206)     0.307 (0.207)
Member of the EU X Search by keyword                                                                      -0.021 (0.089)
Member of the EU X Google KID category                                                                    -0.012 (0.066)
Recognized by the EU X Search by keyword                                                                  0.255*** (0.029)
Recognized by the EU X Google KID category                                                                -0.224*** (0.036)
Independent authority & law X Search by keyword                                                           -0.301*** (0.102)
Independent authority & law X Google KID category                                                         -0.045 (0.087)
With legislation X Search by keyword                                                                      -0.093 (0.066)
With legislation X Google KID category                                                                    -0.116** (0.051)
No privacy law X Search by keyword                                                                        -0.224 (0.155)
No privacy law X Google KID category                                                                      0.220*** (0.078)
Observations                                        93227             93227             93163             93163
Pseudo R2                                           0.0535            0.0539            0.0551            0.0591

Notes: The marginal effects of the negative binomial estimates are shown. The dependent variable is the number of pieces of sensitive data collected by apps. Robust standard errors clustered at country level are reported in parentheses. Column 1 estimates the model with the dummy variable OECD. Column 2 estimates the interaction effects between the OECD dummy and the source of the data. Column 3 estimates the set of dummies measuring compliance with EU legislation. Column 4 estimates the interaction effects between the set of dummies measuring compliance with EU legislation and the variable for data source. The main regressions include week fixed effects and Google KID category fixed effects. Significance level: * p < .10, ** p < .05, *** p < .01

4.1 Falsification test: Regression excluding apps without developer address

To disentangle the effects of hidden address information, we estimate the regressions excluding apps with no address details. Table 9 presents these estimations, which can be compared with Tables 7 and 8, which include all observations. The variable Google KID category is no longer significant, perhaps because the characteristics of developers that show their address differ from those of developers that hide it. In particular, developers without addresses could be located in countries without stringent privacy laws, which is consistent with the higher significance of the coefficient for the No privacy law variable. Whether or not a developer is based in an OECD country has no impact on the number of pieces of sensitive data collected: for developers showing their address, OECD membership is not a differentiating factor. We note that Contains ad becomes significant, which suggests that developers without addresses are less embedded in the online advertising industry. Column (2) shows that if the developer is located in a country with no privacy laws, the amount of sensitive data collected increases.

Table 9: Estimation of the pieces of sensitive data collected as a function of Privacy regulation, Income Level, and Country fixed effects. The apps without addresses are excluded.

Variable                              (1)               (2)               (3)               (4)               (5)
Log number of reviews                 0.010 (0.010)     0.010 (0.010)     0.016 (0.010)     0.011 (0.010)     0.005 (0.009)
User rating                           -0.011 (0.022)    -0.014 (0.021)    -0.009 (0.021)    -0.012 (0.022)    -0.016 (0.019)
Everyone                              -0.034 (0.128)    -0.013 (0.120)    0.000 (0.119)     -0.053 (0.131)    -0.018 (0.111)
Users interact                        0.778*** (0.122)  0.804*** (0.122)  0.796*** (0.122)  0.797*** (0.126)  0.744*** (0.147)
Unrestricted internet                 1.579*** (0.105)  1.590*** (0.105)  1.572*** (0.114)  1.576*** (0.113)  1.514*** (0.100)
Contains ad                           0.218*** (0.075)  0.181** (0.071)   0.172** (0.080)   0.195** (0.078)   0.095 (0.072)
Freemium                              0.152 (0.095)     0.161* (0.093)    0.146 (0.089)     0.150 (0.092)     0.193** (0.080)
Google KID category                   0.168 (0.133)     0.139 (0.123)     0.165 (0.128)     0.163 (0.131)     0.102 (0.114)
Search by keyword                     -0.024 (0.110)    -0.027 (0.112)    -0.020 (0.116)    -0.028 (0.111)    0.010 (0.098)
Exit                                  0.147 (0.103)     0.130 (0.104)     0.139 (0.107)     0.135 (0.104)     0.092 (0.102)
OECD                                  ref.
No OECD                               0.167 (0.106)
Member of the EU                                        ref.
Recognized by the EU                                    0.098 (0.104)
Independent authority & law                             0.328* (0.179)
With legislation                                        0.115 (0.166)
No privacy law                                          0.545** (0.241)
Heavy                                                                     ref.
Robust                                                                    -0.056 (0.165)
Moderate                                                                  0.355 (0.239)
Limited                                                                   0.177 (0.139)
High income                                                                                 ref.
Upper middle income                                                                         0.047 (0.181)
Low and middle income                                                                       0.189 (0.127)
Constant                              -0.251 (0.212)    -0.251 (0.205)    -0.252 (0.202)    -0.166 (0.196)    -0.178 (0.197)
Period fixed effect                   Yes               Yes               Yes               Yes               Yes
Country fixed effect                  No                No                No                No                Yes
Google Family Category fixed effect   Yes               Yes               Yes               Yes               Yes
N                                     73812             73748             72067             73812             73812
R2                                    0.061             0.063             0.063             0.061             0.087

Notes: Negative binomial estimates are shown. The dependent variable is the number of pieces of sensitive data collected by the app. Regressions only include apps with geographic addresses. Robust standard errors clustered at country level are reported in parentheses. Column 1 estimates the model with the dummy variable OECD; the reference group is OECD countries. Column 2 includes the set of dummies measuring compliance with EU legislation; the reference group is Member of the EU. Column 3 estimates the model with the set of variables measuring privacy regulation and enforcement, with Heavy privacy legislation as the reference category. Column 4 estimates the model including the World Bank income classification, with High Income as the reference. Column 5 estimates the model with country fixed effects; Morocco is the reference country. All the regressions include week fixed effects. * p < .10, ** p < .05, *** p < .01

4.2 Robustness check: Number of Users’ location data

We show the robustness of our results to an alternative dependent variable: users’ location data. Table 10 column 2 includes a set of dummies measuring compliance with EU legislation. Developers in EU countries or countries whose privacy laws are compatible with EU legislation request less user location data. This is consistent with the previous estimates. Column 3 estimates the model with the variables for enforcement of privacy legislation; apps that do not provide a developer address collect more user location data than apps whose developers are in countries with strict privacy regulation. In the estimations for privacy legislation in specific countries, these variables might capture underlying effects such as infrastructure or wealth. We address this in column (4), which estimates the model including a set of variables measuring the country’s income level according to the World Bank. The results suggest that location in a high or upper middle income country has a negative impact on the number of risky permissions requested. Finally, column (5) includes country fixed effects.

Table 10: Estimation of the pieces of users’ location data collected as a function of Privacy regulation, Income Level and Country fixed effects. Reference category is the group of apps without developer address

Variable                      (1)               (2)               (3)               (4)               (5)
Log number of reviews         -0.037** (0.018)  -0.032* (0.017)   -0.028 (0.017)    -0.035* (0.018)   -0.032* (0.018)
User rating                   0.019 (0.022)     0.011 (0.024)     0.017 (0.018)     0.018 (0.022)     0.020 (0.017)
Everyone                      -0.078 (0.150)    -0.056 (0.150)    -0.022 (0.135)    -0.111 (0.155)    -0.024 (0.137)
Users interact                0.922*** (0.147)  0.977*** (0.153)  1.002*** (0.154)  0.969*** (0.159)  1.015*** (0.149)
Unrestricted internet         0.922*** (0.153)  0.961*** (0.155)  0.904*** (0.121)  0.890*** (0.122)  0.957*** (0.168)
Contains ad                   0.176 (0.260)     0.116 (0.249)     0.065 (0.236)     0.138 (0.248)     0.055 (0.247)
Freemium                      0.623*** (0.138)  0.603*** (0.129)  0.603*** (0.120)  0.615*** (0.140)  0.528*** (0.108)
Google KID category           0.115 (0.188)     0.065 (0.184)     0.087 (0.185)     0.098 (0.194)     0.061 (0.181)
Search by keyword             0.106 (0.203)     0.092 (0.214)     0.078 (0.223)     0.097 (0.208)     0.096 (0.231)
Exit                          0.184 (0.158)     0.151 (0.169)     0.135 (0.179)     0.156 (0.166)     0.147 (0.190)
OECD                          -0.841*** (0.148)
No OECD                       -0.324*** (0.099)
With developer address                          ref.
Member of the EU                                -0.585*** (0.179)
Recognized by the EU                            -0.620*** (0.181)
Independent authority & law                     -0.895*** (0.274)
With legislation                                -0.477*** (0.143)
No privacy law                                  0.498 (0.307)
With developer address                                            ref.
Heavy                                                             -0.705*** (0.187)
Robust                                                            -0.891*** (0.183)
Moderate                                                          0.391 (0.307)
Limited                                                           -0.168 (0.213)
With developer address                                                              ref.
High income                                                                         -0.612*** (0.141)
Upper middle income                                                                 -0.599*** (0.200)
Low and middle income                                                               -0.062 (0.197)
Constant                      -0.899*** (0.292) -0.868*** (0.295) -0.902*** (0.281) -0.865*** (0.288) -0.887*** (0.297)
Period fixed effect           Yes               Yes               Yes               Yes               Yes
Country fixed effect          No                No                No                No                Yes
Group fixed effect            Yes               Yes               Yes               Yes               Yes
N                             93227             93163             91482             93227             93227
R2                            0.075             0.076             0.082             0.072             0.105

Notes: Negative binomial estimates are shown. The dependent variable is the number of pieces of location data collected by the app. Estimations include the reference variable Without developer address. Robust standard errors clustered at country level are reported in parentheses. Column 1 estimates the model with the dummy variable OECD. Column 2 includes the set of dummies measuring compliance with EU legislation for an EU country. Column 3 estimates the model with the set of variables measuring privacy regulation and enforcement, with No address as the reference category. Column 4 estimates the model including the World Bank income classification, with No address as the reference. Column 5 estimates the model with country fixed effects. All the regressions include week fixed effects. Statistical significance of the coefficients: * p < .10, ** p < .05, *** p < .01


5 Conclusion

We investigate whether the developer’s location affects the amount of sensitive data collected. We rely on original data from the Google Play Store, collected using keywords associated with child applications. The content included in the category “Designed for Families” should comply with Google’s guidelines for age-appropriate content and advertising and comply more closely with COPPA.

We find that developers from countries with weak privacy regulation collect more sensitive data. For example, our results show that developers from OECD countries (including the USA) and EU countries are more likely to comply with COPPA than developers from non-member countries. We observe that national income has no impact on the app’s intrusiveness. Together, these findings confirm that “home country” privacy regulation has an impact on the privacy behavior of developers. US regulation is likely to have an impact on foreign developers if they comply with the Google KID program.

We observe that disclosing the country of location has an impact on the amount of user data collected. More precisely, developers who do not reveal their geographic location behave worse with regard to children’s privacy. This is an important result from a policy perspective. For instance, the platform might make provision of an address a condition for approval, which could affect the collection of children’s personal data.

It is reassuring that Google’s privacy policy, via the category “Designed for Families”, is effective in encouraging developers to request fewer pieces of sensitive data. The self-regulation of platforms could reinforce the Children’s Online Privacy Protection Act.

Overall, our results suggest that the child apps market does not respect children’s personal data and that data can be transferred to countries outside the US market where privacy regulation is lacking. This can result in a lack of control over the use of children’s data.


References

Acquisti, A., Taylor, C. and Wagman, L. (2016). The Economics of Privacy. Journal of

Economic Literature. 54(2), 442–92.

Belo, R., Ferreira, P. and Telang, R. (2013). Broadband in school: Impact on student

performance. Management Science. 60(2), 265–282.

Belo, R., Ferreira, P. and Telang, R. (2016). Spillovers from Wiring Schools with Broadband:

The Critical Role of Children. Management Science. 62(12), 3450–3471.

Bulman, G. and Fairlie, R. W. (2016). Technology and education: Computers, software, and

the internet. Working Paper 22237. National Bureau of Economic Research.

Campbell, J., Goldfarb, A. and Tucker, C. (2015). Privacy regulation and market structure.

Journal of Economics & Management Strategy. 24(1), 47–73.

Ershov, D. (2017). The Effect of Consumer Search Costs on Entry and Quality in the Mobile

App Market. Working Paper.

FTC (2012a). Mobile Apps for Kids: Current Privacy Disclosures are Disappointing. Technical report.

FTC (2012b). Mobile Apps for Kids: Disclosures Still Not Making the Grade. Technical

report.

Ghose, A. and Han, S. P. (2014). Estimating Demand for Mobile Applications in the New Economy. Management Science. 60(6), 1470–1488.

Goldfarb, A. and Tucker, C. (2012). Privacy and innovation. Innovation Policy and the Economy. 12(1), 65–90.

Goldfarb, A. and Tucker, C. E. (2011). Privacy regulation and online advertising. Management Science. 57(1), 57–71.

GPEN (2015). 2015 GPEN Sweep - Children’s Privacy. Technical report.


Greene, W. (1994). Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. Working papers.

Kummer, M. and Schulte, P. (2016). When private information settles the bill: Money and

privacy in Google’s market for smartphone applications. Working Paper.

Miller, A. R. and Tucker, C. (2009). Privacy Protection and Technology Diffusion: The Case of Electronic Medical Records. Management Science. 55(7), 1077–1093. doi:10.1287/mnsc.1090.1014.

Nielsen (2017). Mobile kids: the parent, the child and the smartphone. Technical report. Last

seen: January 2017.

Rideout, V. (2017). The Common Sense census: Media use by kids age zero to eight. San

Francisco, CA: Common Sense Media, 263–283.

Rochelandet, F. and Tai, S. H. T. (2016). Do privacy laws affect the location decisions of

internet firms? Evidence for privacy havens. European Journal of Law and Economics.

42(2).

Sarma, B. P., Li, N., Gates, C., Potharaju, R., Nita-Rotaru, C. and Molloy, I. (2012). Android permissions: a perspective combining risks and benefits. In Proceedings of the 17th ACM Symposium on Access Control Models and Technologies. June. ACM, 13–22.

Yin, P. L., Davis, J. P. and Muzyrya, Y. (2014). Entrepreneurial Innovation: Killer Apps in

the iPhone Ecosystem. American Economic Review. 104(5), 255–59.


6 Appendix

Figure 2: Screenshot of Google Play Family

Table 11: Countries belonging to each group of compliance with EU privacy regulation

Member of the EU: Austria; Belgium; Bulgaria; Croatia; Cyprus; Czech Republic; Denmark; Estonia; Finland; France; Germany; Greece; Hungary; Iceland; Ireland; Italy; Latvia; Lithuania; Malta; Netherlands; Norway; Poland; Portugal; Romania; Slovak Republic; Slovenia; Spain; Sweden; United Kingdom

Recognized by the EU: Argentina; Canada; Israel; New Zealand; Switzerland; United States; Uruguay

Independent authority: Armenia; Azerbaijan; Brazil; Chile; China; Dominican Republic; India; Indonesia; Japan; Kazakhstan; Malaysia; Mali; Paraguay; Philippines; Qatar; Russian Federation; Singapore; South Africa; Thailand; Turkey; Vietnam

With legislation: Australia; Colombia; Costa Rica; Georgia; Hong Kong SAR, China; Korea, Rep.; Macedonia, FYR; Mexico; Moldova; Morocco; Serbia; Tunisia; Ukraine

No privacy law: Bahrain; Bangladesh; Belarus; Cambodia; Ecuador; Egypt, Arab Rep.; El Salvador; Jordan; Kuwait; Nigeria; Oman; Pakistan; Peru; Puerto Rico; Saudi Arabia; Sri Lanka; United Arab Emirates
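As an illustration of how the grouping in Table 11 could be turned into a categorical variable for each developer, the following is a minimal sketch. It is not the authors' code: the names `GROUPS`, `COUNTRY_TO_GROUP` and `regulation_group`, the labels "Undisclosed" and "Not listed", and the use of only three countries per group are all assumptions made for brevity.

```python
# Hypothetical coding of the Table 11 groups (illustrative subset only;
# the full country lists are given in Table 11).
GROUPS = {
    "Member of the EU": ["Austria", "France", "Germany"],
    "Recognized by the EU": ["Argentina", "Canada", "United States"],
    "Independent authority": ["Brazil", "India", "Japan"],
    "With legislation": ["Australia", "Mexico", "Ukraine"],
    "No privacy law": ["Bahrain", "Nigeria", "Pakistan"],
}

# Invert the mapping so that each country points to its group.
COUNTRY_TO_GROUP = {
    country: group
    for group, countries in GROUPS.items()
    for country in countries
}


def regulation_group(country):
    """Return the Table 11 group for a developer's country.

    A missing or empty country is labelled "Undisclosed" (an assumed label
    for developers who do not reveal their location); a country absent from
    the lookup is labelled "Not listed".
    """
    if not country:
        return "Undisclosed"
    return COUNTRY_TO_GROUP.get(country.strip(), "Not listed")
```

Keeping undisclosed locations as their own category, rather than dropping them, would allow the behavior of developers who do not reveal their address to be compared with the other groups.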
