
Yan Shvartzshnaider†*, Noah Apthorpe†, Nick Feamster, and Helen Nissenbaum

Analyzing Privacy Policies Using Contextual Integrity Annotations

Abstract: In this paper, we demonstrate the effectiveness of using the theory of contextual integrity (CI) to annotate and evaluate privacy policy statements. We perform a case study using CI annotations to compare Facebook’s privacy policy before and after the Cambridge Analytica scandal. The updated Facebook privacy policy provides additional details about what information is being transferred, from whom, by whom, to whom, and under what conditions. However, some privacy statements prescribe an incomprehensibly large number of information flows by including many CI parameters in single statements. Other statements result in incomplete information flows due to the use of vague terms or omitting contextual parameters altogether. We then demonstrate that crowdsourcing can effectively produce CI annotations of privacy policies at scale. We test the CI annotation task on 48 excerpts of privacy policies from 17 companies with 141 crowdworkers. The resulting high precision annotations indicate that crowdsourcing could be used to produce a large corpus of annotated privacy policies for future research.

Keywords: Privacy policies, contextual integrity, annotation

1 Introduction

Many online services operate by collecting and sharing users’ information. To protect consumers, the U.S. Federal Trade Commission (FTC) devised fair information practice principles (FIPPs) based on the “notice and choice” framework [9]. These principles, in concert with state regulations, require companies to notify consumers about their information collection and sharing practices through privacy policies. These privacy policies, which often include details about the type of information collected, the entities that receive or store the information, and the conditions governing data acquisition and handling, serve two main purposes: 1) informing consumers about data collection practices, which they can consider when deciding whether or not to use a service, and 2) offering regulators, such as the FTC, a way to audit online services for misleading privacy practices.

*Corresponding Author: Yan Shvartzshnaider†: New York University & Princeton University, E-mail: [email protected]
Noah Apthorpe†: Princeton University, E-mail: [email protected]
Nick Feamster: Princeton University, E-mail: [email protected]
Helen Nissenbaum: Cornell Tech, E-mail: [email protected]
† These authors contributed equally to this work.

As we write this paper, the European General Data Protection Regulation (GDPR) [2] is coming into effect, forcing companies to adapt their behavior and rewrite their privacy policies or face strict penalties. The changes are largely based on GDPR Articles 13, 14, and 15, which outline the details companies need to provide to consumers when collecting, processing and sharing their information. The regulation puts an emphasis on providing this information to the “subject in a concise, transparent, intelligible and easily accessible form, using clear and plain language” [1]. As a result, consumers are receiving an avalanche of updated privacy policies as companies strive for GDPR compliance [4]. However, just because the GDPR has pushed companies to update their privacy policies does not necessarily mean that these updated policies address the issues of previous versions.

In this paper, we make a case for using the theory of contextual integrity (CI) [22] to annotate, assess, and compare information sharing practices disclosed in privacy policies, both within and across updates. We showcase this technique with a case study in which we use the CI framework to manually annotate Facebook’s previous and updated privacy policies to identify the senders, recipients and subjects of information, information types (attributes), and the conditions under which information may be transferred or collected (transmission principles). We then use these annotations to gain insight into the privacy policy and amendments.

arXiv:1809.02236v1 [cs.CY] 6 Sep 2018

Our analysis shows that while the updated privacy policy includes statements that describe almost twice as many information flows as the previous policy, it fails to provide more clarity to the consumer. In many cases, the updated policy has more incomplete and ambiguous information flow statements than the previous policy. Incomplete information flow statements (45% of all statements in the previous policy and 63% in the updated policy) omit one or more information flow parameters. This allows readers to interpret the missing parameters according to their own expectations, which may not match the actual practices of the company. In contrast, some statements in both the previous and updated policies suffer from what we refer to as “parameter bloating,” i.e., they contain more than one instance of each CI parameter. This increases the cognitive load required for consumers to fully comprehend all possible information flows allowed by the statement. Finally, we identified privacy statements (over 50% in both the previous and updated policies) that use vague and ambiguous language.
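To make the cost of parameter bloating concrete, the following minimal sketch enumerates the flows implied by a single statement under the assumption that every combination of annotated parameter values is a possible flow. The statement, its values, and the function name `enumerate_flows` are hypothetical illustrations, not taken from the annotated policies, and the combinatorial reading is one interpretation rather than the paper's counting procedure.

```python
from itertools import product

# Hypothetical annotation of one "bloated" statement: each CI parameter
# has several annotated instances (all values are illustrative only).
statement = {
    "sender": ["you", "your device"],
    "recipient": ["Facebook", "partners", "advertisers"],
    "attribute": ["contact information", "location"],
    "transmission_principle": ["to provide services", "with your consent"],
}


def enumerate_flows(annotation):
    """Return every combination of one value per CI parameter."""
    names = list(annotation)
    return [dict(zip(names, combo)) for combo in product(*annotation.values())]


flows = enumerate_flows(statement)
print(len(flows))  # 2 * 3 * 2 * 2 = 24 candidate flows from one statement
```

Under this reading, the number of flows a reader must consider grows multiplicatively with the number of instances of each parameter.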

To help streamline our approach beyond the Facebook case study, we present a methodology for crowdsourcing CI privacy policy annotations. We construct CI annotation as an Amazon Mechanical Turk (AMT) Human Intelligence Task (HIT) and compare crowdsourced annotations against ground-truth expert annotations. We test the annotation task on 48 excerpts of privacy policies from 17 companies with 141 AMT workers. The crowdsourced annotations have an average word-based precision score of 0.9 across CI information flow parameters. This high precision indicates that CI annotation is both understandable and easily applicable by those with no prior exposure to CI, despite the often legalistic language employed by privacy policies. This provides further evidence that CI successfully expresses how most people intuitively reason about information privacy. Finally, the high precision of crowdsourced annotations indicates that crowdsourcing could be applied at scale to evaluate future privacy policy updates or to build a dataset for training a machine learning model to perform automatic CI annotations.
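As a rough illustration of what a word-based precision score can look like, the sketch below assumes precision is the fraction of words in a crowdworker's annotated span that also appear in the expert's span for the same parameter. This definition, the function name `word_precision`, and the example spans are assumptions made for illustration; the exact scoring procedure is not given in this section.

```python
def word_precision(crowd_span: str, expert_span: str) -> float:
    """Fraction of crowd-annotated words that also occur in the expert span.

    Assumed definition for illustration only.
    """
    crowd_words = crowd_span.lower().split()
    expert_words = set(expert_span.lower().split())
    if not crowd_words:
        return 0.0
    hits = sum(1 for word in crowd_words if word in expert_words)
    return hits / len(crowd_words)


# Hypothetical spans for the "attribute" parameter of one excerpt.
expert = "contact information"
crowd = "also collect contact information"
print(word_precision(crowd, expert))  # 2 of 4 crowd words match: 0.5
```

A score near 1 under this definition would mean crowdworkers rarely highlight words outside the expert's span, although it says nothing about words they fail to highlight.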

In summary, this work makes the following contributions:

1. We present a method for annotating privacy policies using the contextual integrity framework. The use of a structured framework allows rigorous analysis of difficult privacy policy texts that is applicable to policies across companies and sectors.

2. We describe a case study using CI annotation to analyze recent updates to Facebook’s privacy policy, which identifies several issues with information flow descriptions across versions.

3. We demonstrate that crowdsourcing can produce precise CI annotations of legalistic privacy policy excerpts for future CI annotation research at scale.

2 CI Primer

The theory of CI is based on two central premises: 1) privacy is defined as the appropriateness of information flows, which 2) is defined by contextual norms governing particular settings (contexts) in which information is transmitted [22].

CI offers a template for describing information flows using 5-parameter tuples, which include specific actors (senders, recipients, and subjects) involved in the information flow, the type (attribute) of the information, and the condition (transmission principle) under which the information flow occurs. This combination of five parameters defines contexts which determine privacy norms. For example, while someone might consider sharing Fitbit¹ data with their doctor, they might view the sharing of this same data with advertising or insurance companies as a privacy violation. The entire context, including recipient and information type, affects how we think about privacy.
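The 5-parameter tuple can be sketched as a small data structure. The class name `InformationFlow`, the field names, and the transmission principle used in the Fitbit example are assumptions made for illustration; only the five parameter roles come from the CI framework as described above.

```python
from dataclasses import dataclass, fields, replace
from typing import List, Optional


@dataclass(frozen=True)
class InformationFlow:
    """One CI information flow; None marks a parameter that is not stated."""
    sender: Optional[str]
    recipient: Optional[str]
    subject: Optional[str]
    attribute: Optional[str]
    transmission_principle: Optional[str]

    def missing(self) -> List[str]:
        """Names of the parameters that the flow leaves unspecified."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# The Fitbit example; the transmission principle is a hypothetical value.
to_doctor = InformationFlow(
    sender="patient",
    recipient="doctor",
    subject="patient",
    attribute="Fitbit data",
    transmission_principle="to support medical care",
)

# Same attribute, different recipient and no stated condition: a different
# flow, which may be judged against different contextual norms.
to_advertiser = replace(
    to_doctor, recipient="advertising company", transmission_principle=None
)

print(to_doctor == to_advertiser)   # False
print(to_advertiser.missing())      # ['transmission_principle']
```

Treating the tuple as a whole, rather than the attribute alone, mirrors the point that the entire context determines whether a flow is appropriate.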

The CI framework was previously used as a lens for examining Android permissions [34], online platform practices [14, 37], and the GDPR regulations themselves [13]. In more recent efforts, CI was employed to capture individuals’ privacy expectations, which can then be checked for inconsistencies or used to inform policymakers and manufacturers [5, 29].

3 Related work

Privacy policies are notoriously hard to read. As a result, average users find them difficult to comprehend and correctly interpret. This leads to gaps between users’ expectations and the stated policy [20].

The problem of privacy policy comprehension has been the focus of many previous studies. Some efforts focused on lexical [11, 27] and semantic [28] analysis of the privacy policies. Other works [36] used crowdsourcing to provide annotations that allow users to more easily parse privacy policies and identify sections related to specific concerns, as well as to help researchers assess policies from different websites.

¹ https://www.fitbit.com/home

The Usable Privacy Policy Project (UPPP) [26] has recruited law students to hand-annotate 115 privacy policies with metadata tags such as “first party collection/use,” “user choice/control,” “data retention,” and “data security.” They then used the hand-labeled policies to train a machine learning algorithm that has annotated over 7,000 policies with the same metadata tags [35]. While extracting relevant paragraphs saves time for the interested reader, it does not provide a way of identifying issues with the policy itself. It remains up to the reader to interpret the text. This tends to create gaps between privacy expectations and policy statements, especially when policy statements are ambiguous or incomplete [20].

Recent work has shown evidence that privacy policies often elide or obscure crucial contextual information that could help users formulate their privacy expectations. In 2016, Martin and Nissenbaum [21] showed that when confronted with a privacy-related scenario that was missing some contextual information, respondents mentally supplemented the information, essentially generating a different version of the scenario. Martin and Nissenbaum also conducted a survey of 569 respondents presented with 40 scenarios with random combinations of contextual factors. The results showed that the “context of information exchange – how information is used and transmitted, the sender and receiver of the information – all impact the privacy expectations of individuals” [21].

The importance of including contextual factors was also reported by Rao et al. in a 2016 study that compared users’ privacy expectations with existing companies’ practices [24]. 240 participants were asked to state their expectations for the data collection, sharing, and deletion practices of 16 websites across finance, health, and dictionary categories. The results showed that users’ privacy expectations depend on the type of website and the type of information being exchanged. For example, respondents expected medical data to be shared with a medical website, but not a financial website. These findings provide further evidence to support the importance of contextual factors in how individuals perceive privacy practices, motivating a contextual analysis of privacy policies to identify gaps which might result in mismatched privacy expectations.

Another body of work has explored using crowdsourcing to annotate privacy policies, thereby splitting the cognitive load of understanding an individual policy over multiple workers. In 2016, Wilson et al. [36] explored the feasibility of asking crowdworkers to answer questions on data collection practices. In the experiment, 218 crowdworkers were assigned the task of reading through 12 privacy policies and answering 9 questions about data collection, sharing, and deletion practices stated in the policies. To support their answers, respondents needed to annotate the relevant text in the privacy policies. The results showed that the answers of the crowdworkers agreed with those of skilled annotators over 80% of the time. The results indicate that crowdsourcing can be used to identify paragraphs describing specific practices in privacy policies. Our results support this conclusion, but extend it to even more sophisticated annotations of individual components of contextual information flows described in privacy policies.

4 Annotation Methodology

We use the CI framework to annotate policy statements that describe contextual information exchanges. Our use of a CI flow-based abstraction is an important distinction from previous privacy policy annotation research, as it serves as a useful semantic abstraction for checking privacy statements for more complex properties than previously attempted. For the remainder of the paper, we denote a privacy statement with a single set of CI parameters as an “information flow.” For example, we consider the following statement an information flow, or simply a “flow”:

[We [Facebook]]_recipient also collect [contact information]_attribute that [you]_sender provide [if you upload, sync or import this information (such as an address book) from a device]_TP.

This flow contains an explicit sender, recipient, attribute, and transmission principle. The subject parameter is not included, but is implicitly the consumer agreeing to the privacy policy.

We use the following guidelines to identify CI parameters within individual flows for annotation:

– Sender. Any entity (person, company, website, device, etc.) that transfers or shares the information. This may be a pronoun or a specific entity, such as “Company A,” “strategic partners,” or “publisher.”

– Recipient. Any entity (person, company, website, device, etc.) that ultimately receives the information. This may be a pronoun or a specific entity, such as “third party,” “developer,” “other users,” or “Company B and its affiliates.”


Fig. 1. Screenshot of the Multi-document Annotation Environment tool configured for CI privacy policy annotation.

– Transmission principle. Any clause describing the “terms and conditions under which [...] transfers ought (or ought not) to occur” [22]. This includes descriptions of how information may be used or collected. Examples include “if the user gives consent,” “when an update occurs,” or “to perform specified functions.”

– Attribute. Any description of information type, instance, and/or example, such as “date of birth,” “credit card number,” “photos,” or, more generally, “personal information.”

– Subject. Any subjects of the information exchanged in a flow. Subjects may be explicitly stated or implicitly described using pronouns and possessives.

We perform manual policy annotation using Document Type Definitions (DTD) markup and the Multi-document Annotation Environment [3] (Figure 1).

5 Facebook Case Study

Recent revelations about the misuse of consumer data by Facebook and Cambridge Analytica [12] have rekindled the debate around users’ privacy and informed consent on such platforms. Facebook claims [10] that they provide users with the right level of control to keep their information private. They also claim that consumers are well informed by the disclosure of information handling practices in the company’s privacy framework. Assuming that this is indeed the case, i.e., ignoring the complexity and sporadic evolution of Facebook controls [15, 17, 19], we see the Cambridge Analytica scandal as another example of how things can go wrong when consumers’ privacy expectations are misaligned with privacy policy statements.

Much of the problem stems from not having a coherent higher-level abstraction that can help reason about privacy policies. While talented legal scholars and professionals are trained to identify relevant privacy policy excerpts and mentally stitch them into coherent flows, so to speak, the average consumer is usually overwhelmed by the legal language of privacy policies [31]. Even experts themselves find some privacy policy statements confusing [25].

Furthermore, research shows that consumers tend to “[project] the important factors to their privacy expectations onto the privacy notice” [20]. In other words, consumers implicitly fill in the blanks left by difficult-to-interpret policies, which inadvertently widens the gap between their expectations and actual company behaviors.

As a result of public outcry [12], Facebook has amended its privacy policy to include a more detailed account of its information sharing practices. It is therefore timely and instructive to apply our CI annotation technique to the previous and updated Facebook privacy policies in order to demonstrate the power of the method and highlight issues with both policy versions.

5.1 Analysis

Using the methodology described in Section 4, we manually annotated Facebook’s previous privacy policy (data policy) as well as the official updated version². The following sections demonstrate the range of analyses that can be performed using CI annotations but are not exhaustive. We anticipate a variety of additional analytic techniques building on these annotations in future work.

5.1.1 Comparison of CI parameters

We compared the number of information flows prescribed by both previous and updated Facebook privacy policies (Figure 2) and the CI parameters they contain. We matched CI parameters across policies using fuzzy string matching [8] with the following thresholds for each CI parameter: sender (70%), attribute (65%), recipient (70%), and transmission principle (55%). While the fuzzy string matching worked well, some corner cases required manual validation. The rest of this section describes notable differences between information flows in the previous and updated policies on a parameter-by-parameter basis.
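The matching step can be sketched as follows. This is a minimal illustration that uses Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure; it is not necessarily the library referenced in [8], and its scores may differ from those used in the study. The thresholds are the ones stated above, while the function names and the example strings are hypothetical.

```python
from difflib import SequenceMatcher

# Per-parameter similarity thresholds (in percent), as stated in the text.
THRESHOLDS = {
    "sender": 70,
    "attribute": 65,
    "recipient": 70,
    "transmission_principle": 55,
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity score between 0 and 100."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_parameters(kind, previous, updated):
    """Map each previous-policy value to its best updated-policy match,
    keeping only matches that reach the threshold for this parameter kind."""
    threshold = THRESHOLDS[kind]
    matches = {}
    for old in previous:
        best = max(updated, key=lambda new: similarity(old, new), default=None)
        if best is not None and similarity(old, best) >= threshold:
            matches[old] = best
    return matches


# Hypothetical parameter values, for illustration only.
print(match_parameters("recipient",
                       ["research partner"],
                       ["research partners", "regulators"]))
```

Values left unmatched by such a procedure are candidates for parameters that were added or dropped between versions, which is where manual validation of corner cases would come in.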

2 https://www.facebook.com/about/privacy/update


| CI Parameter | Previous | Updated |
|---|---|---|
| Sender | people you share and communicate with | specific friends or accounts, friends and followers, other people using Facebook and Instagram, people |
| Sender | devices, phones, computers, devices where you install or access our Services | connected TVs, web-connected devices you use that integrate with our Products |
| Recipient | family of companies that are part of Facebook | Facebook companies, Facebook company products |
| Recipient | people you share and communicate | audience they choose, specific friends or accounts, those you connect and share with around the world, people in your networks, friends and followers, people and businesses outside the audience that you shared with, anyone who can see the other person’s content, anyone on or off our products |
| Recipient | partners conducting academic research, partners conducting surveys | research partners, research partners who we collaborate with, academics |
| Recipient | third-party companies who help us provide and improve our services or who use advertising or related products | websites that integrate with our products, other services that integrate with our products, companies that aggregate |
| Recipient | N/A | systems, devices and operating systems providing native versions of Facebook and Instagram (i.e. where we have not developed our own first-party apps), anyone on or off our product, content creator, seller, page admins, regulators, network |
| Attribute | information about how you use our services, how you use and interact with our services | information about any of your Instagram followers, the ads you see and how you use their services, other web-connected devices you use that integrate with our products, when you last used our products, whether a window is foregrounded or backgrounded, when you’re using and have last used our products, identifiers from apps or accounts that you use, actions that you have taken on our products |
| Attribute | content about you | the features you use, life events, racial or ethnic origin, activities, where you live, what games you play, information about your interests actions and connections, who you are “interested in”, your health, events you attend, interests, preferences, your religious views, general demographic, the places you like to go and the businesses and people you’re near, whether you are currently active on Instagram messenger or Facebook, check-ins, websites you visit, other information about your Facebook friends from you, political views, trade union membership, philosophical beliefs |
| Attribute | information about the reach and effectiveness of their advertising | reports about the kinds of people seeing their ads, which Facebook ads led you to make a purchase or take an action with an advertiser, ads you see, family device ids |
| Attribute | Device information | information about operations and behaviours performed on the device, other identifiers unique to Facebook company products associated with the same device or account, available storage space |
| Attribute | N/A | information about nearby wi-fi access points beacons and cell towers |
| Transmission Principle | N/A | to detect when someone needs help, to recognise you in photos videos and camera experiences, help you stream a video from your phone to your tv, combat harmful conduct, can help distinguish humans from bots, to aid relief efforts, whether or not you have a Facebook account or are logged in to Facebook, reshared or downloaded through APIs, to have lawful rights to collect, use and share your data before providing any data to us and many others. |

Table 1. List of notable CI parameters that were introduced or refined between the previous and updated Facebook privacy policies.


[Figure 2: bar chart, y-axis: Count. Previous policy: Total Flows 42, Senders 15, Attributes 89, Recipients 36, Transmission Principles 68. Updated policy: Total Flows 72, Senders 20, Attributes 179, Recipients 61, Transmission Principles 121.]

Fig. 2. Distribution of unique CI parameters identified in the previous and updated Facebook privacy policies.

| CI Param | Version | Instances (frequency) |
|---|---|---|
| Recipients | Previous | we [Facebook] (22), Third party service, vendors, partners (20) |
| Recipients | Updated | we [Facebook] (32), Third party service, vendors, partners (24) |
| Senders | Previous | we [Facebook] (14), you (6) |
| Senders | Updated | we [Facebook] (17), you (11) |
| Attributes | Previous | information (8), information about you (2), information we have (2), non-personally identifiable information only (2) |
| Attributes | Updated | information (15), content (5), information about you (4), information that we have (4), public information (4), communications (2), shipping and contact details (2) |

Table 2. The most frequent recipients, senders, and attributes mentioned in the previous and updated Facebook privacy policies.

Sender. The updated policy offers a more detailed account of the sources of information transfer. It elaborates on categories from the previous privacy policy and also includes several new senders, such as “WhatsApp,” “connected TV,” and “a business,” which were not specified in the previous policy. Not surprisingly, the most frequent senders in both policies are Facebook and the consumer (Table 2).

Recipient. Similarly to the sender parameter, the updated version introduces new recipients, such as “people and businesses outside the audience that you shared with,” “content creators,” “page admin,” “Instagram business profiles,” and “companies that aggregate.” As expected, the most common recipients in both versions are “Facebook” and “third party service, vendors, partners” (Table 2).

Attribute. When describing the types of information being transferred or collected, the updated policy contains more attributes (179) than the previous policy (86). However, we note that some attributes from the previous policy were omitted in the update. The updated policy does not mention “user id” (opting for “username” instead), or “age range” (instead providing the example “. . . ad was seen by a woman between the ages of 25 and 34”). Generally, the updated policy describes new types of information and/or elaborates on information that was previously generic or abstract (Table 1). For example, the updated draft provides significantly more details about the type of content that is being collected about the user, including “racial or ethnic origins,” “health,” “events attended,” “interests,” “religious views,” “general demographics,” “political views,” “trade union membership,” and “philosophical beliefs.” Furthermore, the updated policy describes attributes not discussed in the previous policy, such as “connected TVs,” “information about nearby Wi-Fi access points,” “beacons,” and “cell towers.”

Transmission Principle. When specifying conditions under which information transfer may be performed, the updated policy includes all conditions and information flow constraints in the previous policy. In addition, the updated policy also contains new transmission principles, such as “whether or not you have a Facebook account or are logged in to Facebook,” “to recognise you in photos, videos and camera experiences,” “reshared or downloaded through APIs,” “to have lawful rights to collect, use and share your data before providing any data to us,” and many others (Table 1).

Subject. The subject of most flows in both policies is the consumer. We therefore do not include the subject parameter in our analysis.

5.1.2 Incomplete Information Flows

Our analysis of the Facebook privacy policies finds many prescribed information flows with missing (non-specified) parameters (Figure 3). Failing to specify parameters introduces ambiguity, leaving consumers uninformed about company behavior. In the previous privacy policy, 45% (19/42) of flows are missing one or more parameters. In the updated policy, this number increases to 68% (49/72).
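The incompleteness measure can be sketched as a simple count over annotated flows. In this minimal illustration, a flow is a dictionary and a parameter counts as missing when it is absent or empty; the subject is left out, following the analysis above. The names `missing_parameters` and `percent_incomplete` and the two toy flows are assumptions made for illustration. The last two lines only reproduce the arithmetic for the counts quoted in the preceding paragraph.

```python
# Parameters checked for completeness (the subject is excluded, as above).
PARAMETERS = ("sender", "recipient", "attribute", "transmission_principle")


def missing_parameters(flow):
    """Names of the CI parameters that are absent or empty in a flow."""
    return [p for p in PARAMETERS if not flow.get(p)]


def percent_incomplete(flows):
    """Percentage of flows that are missing at least one parameter."""
    if not flows:
        return 0.0
    incomplete = sum(1 for flow in flows if missing_parameters(flow))
    return 100 * incomplete / len(flows)


# Two toy flows: one complete, one "use-of-data" flow with no stated sender.
toy_flows = [
    {"sender": "you", "recipient": "Facebook",
     "attribute": "contact information",
     "transmission_principle": "if you upload, sync or import it"},
    {"sender": None, "recipient": "Facebook",
     "attribute": "information about your connections",
     "transmission_principle": "to personalise features"},
]
print(percent_incomplete(toy_flows))   # 50.0

# Arithmetic for the counts quoted in the text.
print(round(100 * 19 / 42))  # 45
print(round(100 * 49 / 72))  # 68
```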

Missing Recipient. Table 3 lists the flows from both policies with a missing recipient parameter. The previous policy has only one flow without an explicit recipient, while the updated policy has two. Not stating information recipients forces users to infer what entities will have access to their information from other sources, often leading to incorrect notions of company behavior [21, 32]. Identifying the recipient can sometimes be difficult, as in the flow “We are able to suggest that your friend tags you in a picture by comparing your friend’s pictures to information we’ve put together from your profile pictures and the other photos in which you’ve been tagged.”

[Figure 3: bar chart, y-axis: Percent of Flows with Missing Parameters. Previous policy: Any Parameter 45%, Senders 33%, Attributes 0%, Recipients 2%, Transmission Principles 14%. Updated policy: Any Parameter 63%, Senders 45%, Attributes 2%, Recipients 2%, Transmission Principles 20%.]

Fig. 3. Percentage of incomplete information flows in Facebook’s previous and updated privacy policies with missing CI parameters.

| Information Flow | Version |
|---|---|
| When you comment on another person’s post or like their content on Facebook, that person decides the audience who can see your comment or like | Previous |
| You can choose to provide information in your Facebook profile fields or life events about your religious views, political views, who you are “interested in” or your health. This and other information (such as racial or ethnic origin, philosophical beliefs or trade union membership) could be subject to special protections under the laws of your country | Updated |
| For example, people can share a photo of you in a story or mention, tag you at a location in a post or share information about you in their posts or messages | Updated |

Table 3. Information flows in the previous and updated Facebook privacy policies with missing recipient parameters.

Missing Sender. The sender parameter is not specified in 14 (33%) flows in the previous policy and in 33 (45%) flows in the updated policy. Many of the statements with missing senders describe “use-of-data,” i.e., they inform the consumer how the collected information will be used but not from where it is collected. Missing senders can easily lead to misinterpretations and false privacy expectations. For example, the source of the information in the following statement is unclear: “We collect information about the people, Pages, accounts, hashtags and groups you are connected to and how you interact with them across our Products, such as people you communicate with the most or groups you are part of.”

Missing Transmission Principle. We identified 6 information flows in the previous policy where the transmission principle was missing. For example, the statement “We share information we have about you within the family of companies that are part of Facebook” does not specify under what conditions/constraints the information is being shared. Likewise, the statement “We also collect information about how you use our Services, such as the types of content you view or engage with or the frequency and duration of your activities. Things others do and information they provide” does not contain any transmission principles. These statements force consumers to guess when and for what reason information is collected.

The updated policy contains even more (15) flows with missing transmission principles. Without a transmission principle, flows like “We also receive information about your online and offline actions and purchases from third-party data providers who have the rights to provide us with your information” become ambiguous because it is not clear when or why this information is being collected.

5.1.3 CI Parameter Bloating

Our CI annotation analysis also identifies several flows in both previous and updated policies with multiple CI parameters of the same type. We refer to this phenomenon as CI parameter bloating. Parameter bloating adds to the cognitive effort required to isolate single information flows from privacy policy statements, because it is often not clear which combinations of parameters describe information flows that actually take place.

Consider the following flow from the updated policy:


Advertisers, app developers and publishers [senders] can send us [recipient] information through Facebook Business Tools that they use, including our social plug-ins (such as the Like button), Facebook Login, our APIs and SDKs or the Facebook pixel [TP]. These partners provide information about your [subject] activities off Facebook including information about your device, websites you visit, purchases you make, the ads you see and how you use their services whether or not you have a Facebook account or are logged in to Facebook [attributes].

At first glance, the above privacy statement seems transparent and informative. It explicitly specifies the type of information that is being exchanged, between what actors and under what conditions. However, this is an example of CI parameter bloating. The prescribed information flow is overloaded with CI parameters. Note the many senders (advertisers, app developers and publishers), attributes (information about your device, websites you visit, purchases you make, the ads you see and how you use their services), and transmission principles (when you use Like, Facebook login, APIs, SDKs and through Facebook Pixel). How does the consumer reason about this information flow? Do all listed senders transfer all of these information types to Facebook or does each particular sender transmit a specific information type? Do flows with each sender/information pair occur under each listed TP or only specific ones? Even technically savvy users will have difficulty reasoning about the many possible information flows with all combinations of each parameter type.
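To make the combinatorial burden concrete, the following minimal sketch enumerates every candidate flow implied by the statement above, assuming each listed sender, attribute, and transmission principle could combine freely. The parameter lists are copied from the discussion above; the function name `candidate_flows` is hypothetical and not part of our annotation tooling.

```python
from itertools import product

# CI parameters listed in the discussion of the Facebook Business Tools statement.
senders = ["advertisers", "app developers", "publishers"]
attributes = [
    "information about your device",
    "websites you visit",
    "purchases you make",
    "the ads you see",
    "how you use their services",
]
recipients = ["Facebook"]
transmission_principles = [
    "Like button",
    "Facebook Login",
    "APIs",
    "SDKs",
    "Facebook pixel",
]


def candidate_flows(senders, attributes, recipients, tps):
    """Return every (sender, attribute, recipient, TP) combination.

    Assumption: the policy text does not rule out any combination,
    so each one is a flow the reader has to consider.
    """
    return list(product(senders, attributes, recipients, tps))


flows = candidate_flows(senders, attributes, recipients, transmission_principles)
print(len(flows))  # 3 senders * 5 attributes * 1 recipient * 5 TPs = 75
```

Under this reading, a single two-sentence statement prescribes up to 75 distinct information flows, none of which the policy confirms or excludes individually.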

We would like to emphasize that specifying multiple instances of the same parameter does not automatically lead to parameter bloating. Specifically, parameter bloating does not include instances where a single parameter is enumerated to clarify a given category, as in the following statement, which elaborates on several attributes:

We [recipients] collect information about how you use our Products, such as types of content you view or engage with, the features you use, the actions you take, the people or accounts you interact with and the time, frequency and duration of your activities [attributes].

Figure 4 shows the number of CI parameters per flow in both previous and updated policies. In the previous policy, there are 10 information flows that mention more than one recipient, with one information flow standing out, listing 10 potential recipients. Three flows mention more than one sender, and 16 flows mention multiple attributes, ranging from 2 to 18 attributes per flow. Multiple transmission principles appear in 16 flows, ranging from 2 to 5 TPs per flow.

[Figure 4: two histograms (top: previous policy; bottom: updated policy). x-axis: Number of Parameters in Flow (0–10); y-axis: Number of Flows (0–50); series: Sender, Attribute, Recipient, Transmission Principle.]

Fig. 4. Number of CI parameters per flow in Facebook’s previous (top) and updated (bottom) privacy policies. The previous policy had one flow with 18 attributes and the updated policy has one flow with 40 attributes; these are omitted for readability.

The updated policy contains even more bloated flows. Multiple senders appear in 8 information flows (2 senders in 6 flows, 3 in 1 flow, and 4 in 1 flow). Multiple attributes occur in 36 flows, ranging from 2 attributes in 18 flows to 40 attributes in a single flow. Nineteen of the flows include more than one recipient (2 recipients in 14 flows, 3 in 4 flows, and 7 in 1 flow). Finally, the number of flows with multiple transmission principles increased to 30, ranging from 2 TPs in 14 flows to 8 TPs in a single flow.

Given that an average consumer today spends little to no time reading privacy policies, it is unreasonable to assume that even the most privacy-concerned citizen will dissect all possible combinations of this many multi-parameter flows.

5.1.4 Vague and Ambiguous Flows

CI annotation analysis also allows us to identify information flows that use vague terminology as defined in [6] (Table 4).

| Category | Definition | Example Terms |
|---|---|---|
| Conditionality | It is not clear what condition is associated with the information transfer | “as needed”, “as necessary”, “as appropriate”, “depending”, “sometimes”, “as applicable”, “otherwise reasonably determined”, “from time to time” |
| Generalization | Action or information types are too abstract or vague | “typically”, “normally”, “often”, “general”, “usually”, “generally”, “commonly”, “among other things”, “widely”, “primarily”, “largely”, “mostly” |
| Modality | Hard to estimate the possibility of occurrence | “likely”, “may”, “can”, “could”, “would”, “might”, “could”, “possibly” |
| Numeric Quantifier | Vague numeric quantifier | “certain”, “most”, “majority”, “many”, “some”, “few” |

Table 4. Summary of four vagueness categories as defined in [6] and associated example terms.

[Figure 5: bar chart. x-axis: Modality, Modality & Numeric Quantifier, Numeric Quantifier, Generalization; y-axis: Percent of Flows with Vague Wording (0–100). Bar labels, previous policy: 42%, 2%, 2%, 0%; updated policy: 45%, 0%, 2%, 1%.]

Fig. 5. Percentage of information flows in Facebook’s previous and updated privacy policies with various categories of vague wording (categories defined in Table 4).

Figure 5 shows the percentage of flows in Facebook’s previous and updated policies that use vague terminology. In both policies, “modality” vagueness dominates, occurring in close to 45% of all flows. The updated policy does not represent a reduction in vague terminology from the previous version. Rather, the percentage of flows with vague terms remains roughly the same. This supports our initial claim that the updated data policy does not contribute to clarity. The widespread occurrence of flows with vague wording further underscores the problem that privacy policies are too often “obtuse and noncommittal [and] make it difficult for people to know what information a site collects and how it will be used” [31].
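As an illustration of how flows might be tagged with the vagueness categories of Table 4, the sketch below matches the example terms against a statement using whole-word search. This is a simplified stand-in, not the procedure used in our analysis: the names `VAGUE_TERMS` and `vague_categories` are hypothetical, and plain keyword matching will over-count words such as “can” or “may” when they are not used as modal verbs.

```python
import re

# Example terms per vagueness category, copied from Table 4.
VAGUE_TERMS = {
    "Conditionality": [
        "as needed", "as necessary", "as appropriate", "depending",
        "sometimes", "as applicable", "otherwise reasonably determined",
        "from time to time",
    ],
    "Generalization": [
        "typically", "normally", "often", "general", "usually", "generally",
        "commonly", "among other things", "widely", "primarily", "largely",
        "mostly",
    ],
    "Modality": [
        "likely", "may", "can", "could", "would", "might", "possibly",
    ],
    "Numeric Quantifier": [
        "certain", "most", "majority", "many", "some", "few",
    ],
}


def vague_categories(statement):
    """Map each vagueness category to the terms found in the statement."""
    lowered = statement.lower()
    found = {}
    for category, terms in VAGUE_TERMS.items():
        # \b anchors keep "general" from matching inside "generally".
        hits = [t for t in terms
                if re.search(r"\b" + re.escape(t) + r"\b", lowered)]
        if hits:
            found[category] = hits
    return found


print(vague_categories("We may share certain information as needed."))
```

A flow would then count toward a category in Figure 5 whenever that category appears in the returned mapping.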

5.2 Summary

The updated Facebook privacy policy has nearly twice as many information flows as the previous policy (72 versus 42; Figure 2). However, more information flows do not necessarily mean less confusion. Our analysis shows that many of the newly introduced information flows are incomplete, overloaded with CI parameters, and/or use vague terms.

Rather than fix fundamental issues in their privacy policy in the wake of the Cambridge Analytica scandal, Facebook seems to have opted to add more terms, entities, and conditions. While this may initially seem to provide additional information to the consumer, CI annotation analysis reveals that there are still many issues preventing users from interpreting clear information flows from these new details and from understanding how their data is being collected and shared.

6 Crowdsourcing CI Annotations

The ability to effectively crowdsource CI annotation would allow researchers to efficiently pursue two primary goals: 1) collect a large dataset of annotations in order to train a machine learning model to perform CI annotation automatically, and 2) perform a large-scale analysis of information flows across the privacy policies of many companies. This would provide a broad sense of information flow disclosure practices across the technology sector via many of the same analysis methods used in the Facebook case study.

We have developed a crowdsourcing technique that poses CI annotation as an Amazon Mechanical Turk (AMT) Human Intelligence Task (HIT). We crowdsourced the annotation of 48 privacy policy excerpts, including 16 excerpts from the Google privacy policy circa October 2017 and 16 pairs of excerpts from the pre-GDPR and post-GDPR privacy policies of 16 well-known companies³. This choice of policy excerpts provides evidence that our crowdsourcing technique is effective within a single policy as well as for privacy policies across the technology sector. The excerpt pairs were selected as representative statements from the pre-GDPR policies of each company and the corresponding statements from the GDPR-compliant version of each policy updated in May 2018. The excerpts ranged from 21 to 113 words⁴ and from 1 to 4 sentences for a total of 2621 annotated words over 103 annotated sentences.

We compared the crowdsourced annotations to ground-truth annotations from a CI expert. The crowdsourced annotations had an average word-based precision of 0.9 across CI information flow parameters, indicating that the crowdworkers understood the relatively complex notion of information flow parameters and were able to correctly identify them in real privacy policy text. These results show that crowdsourcing can be an effective tool for CI annotation. We will release the crowdsourced annotations as a public dataset for further analysis upon publication.

Sections 6.1–6.7 describe the design and evaluation of our CI annotation crowdsourcing method in more detail.

6.1 Annotation Task Design

We developed the annotation task as a Qualtrics [23] survey deployed on AMT. The task was designed to optimize annotation accuracy while minimizing cost.

Consent and Instructions. The first page of the annotation task is a consent form. Participants who do not consent are prevented from proceeding. The annotation task collects no personal information about crowdworkers and was approved by our university’s Institutional Review Board.

The task next presents annotation instructions (Appendix Figure 9), including a description of each information flow parameter that should be annotated (sender, attribute, recipient, and transmission principle) and an example annotated flow. The information flow parameter descriptions match those used by expert annotators as described in Section 4.

³ Amazon, Fitbit, Indiegogo, LinkedIn, The New York Times, Microsoft, Shapeways, Slack, Spotify, Steam, Stripe, Tinder, Twitter, Uber, WhatsApp, Yelp
⁴ Mean: 55 words/excerpt, SD: 23 words/excerpt

Fig. 6. Screening questions to identify AMT workers who are able to perform high-accuracy annotations. The ground truth annotations are shown with sender in blue, recipient in green, attribute in red, and transmission principle in purple. Common English stopwords (except “you,” “your,” “them,” and “we”) are not counted when comparing crowdworker annotations to the ground truth.

Screening Questions. Each crowdworker is asked to annotate (highlight and label) all words and phrases corresponding to CI information flow parameters in three privacy policy excerpts (Figure 6). These excerpts serve as screening questions to identify workers who are able to perform high-accuracy annotations. Workers whose annotations have at least a 0.7 word-based F1 score (Section 6.4) compared to ground-truth expert annotations on the first screening question (for which the correct answer is given) and either of the next two screening questions are allowed to proceed with the task. Workers whose annotations do not meet this accuracy threshold do not proceed. This helps limit the effect and cost of workers who do not understand the task or who attempt to “cheat” by performing minimal annotations (e.g., highlighting just the first word in each excerpt).
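The screening rule can be written as a short predicate. The sketch below is one reading of the rule described above, assuming the three word-based F1 scores are already computed and given in question order; the function name `passes_screening` is hypothetical.

```python
def passes_screening(f1_scores, threshold=0.7):
    """Decide whether a worker may proceed to the main annotation task.

    f1_scores: word-based F1 scores on the three screening questions,
    in the order they were presented. The worker must reach the
    threshold on the first question and on at least one of the other two.
    """
    first, second, third = f1_scores
    return first >= threshold and (second >= threshold or third >= threshold)


print(passes_screening([0.8, 0.5, 0.75]))  # True: first and third questions pass
print(passes_screening([0.6, 0.9, 0.9]))   # False: first question is below 0.7
```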

Annotations. Each worker who passes the screening questions is then asked to annotate 5 of the 48 excerpts of interest, although these could be replaced with arbitrary privacy policy excerpts for future research. The format of these annotation questions is identical to that of the screening questions (Figure 6). The instructions are also repeated at the top of the page for workers to refer to if they wish.

Annotations of all excerpts from multiple workers are collected, analyzed, and processed into the final crowdsourced annotation for each privacy policy (Section 6.3).


The task concludes with a field for optional open-ended comments if participants have anything they wish to communicate to the researchers.

6.2 Task Deployment

We first tested the annotation task on UserBob [33], a usability-testing service where users narrate their experience while performing tasks. We collected seven UserBob responses. All UserBob workers completed the task in less than 15 minutes. We used the UserBob responses to adjust task instructions to ameliorate worker confusion. Performing such “cognitive interviews” is common practice in survey design and development [30].

We deployed the annotation task as a HIT on AMT using TurkPrime [18], an online tool for researchers to easily manage AMT tasks. We limited the HIT to AMT workers in the United States with a HIT approval rating of 90–100% and at least 100 HITs approved. In total, 141 workers accepted the HIT. Of these workers, 99 passed the screener questions. All 48 excerpts were annotated by between 7 and 12 workers (mean 10.2). AMT workers who did not pass the screening questions were automatically reimbursed $0.25. AMT workers who passed the screening test and completed the entire annotation task were reimbursed $1.50. Collecting all responses took approximately 4 hours.

6.3 Majority Vote Annotations

We are ultimately interested in acquiring the single highest-accuracy annotation for each privacy policy independent of individual workers. We therefore combine multiple annotations of each privacy policy excerpt into a “majority vote” annotation, which assigns each word in an excerpt to the CI parameter annotated by at least 50% of the participants presented with that excerpt. If fewer than 50% of workers labeled a word with the same parameter, then the word is given no label in the majority vote annotation.
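The majority vote rule can be sketched as follows, assuming each worker's annotation is represented as a list with one label (or `None`) per word of the excerpt. This representation and the name `majority_vote` are illustrative assumptions. The rule above does not say how to break an exact 50/50 split between two parameters; this sketch keeps whichever label was seen first.

```python
from collections import Counter


def majority_vote(worker_labels, min_share=0.5):
    """Combine per-word labels from several workers into one annotation.

    worker_labels: one list per worker, each holding a CI parameter label
    (e.g. "sender") or None for every word in the excerpt.
    A word keeps a label only if at least `min_share` of all workers
    presented with the excerpt chose that same label.
    """
    n_workers = len(worker_labels)
    n_words = len(worker_labels[0])
    merged = []
    for i in range(n_words):
        votes = Counter(w[i] for w in worker_labels if w[i] is not None)
        label, count = votes.most_common(1)[0] if votes else (None, 0)
        merged.append(label if count / n_workers >= min_share else None)
    return merged


workers = [
    ["sender", "attribute", None],
    ["sender", None, None],
    ["recipient", "attribute", "tp"],
    ["sender", "attribute", None],
]
print(majority_vote(workers))  # ['sender', 'attribute', None]
```

In the example, the third word is labeled by only one of four workers, so it receives no label in the combined annotation.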

6.4 Evaluation Metrics

We had one of the authors perform expert ground truth annotations of all excerpts prior to seeing the crowdsourced results. We use the following evaluation metrics to compare the crowdsourced majority vote annotations to the expert annotations.

Parameter-based scoring. We manually counted all instances of each CI parameter labeled in both the crowdsourced majority vote and expert annotations (true positives), in the expert annotation only (false negatives), and in the crowdsourced annotation only (false positives). We further categorized the false positives and false negatives to better understand crowdworker mistakes and how to improve the annotation task in future studies (Section 6.6).

Word-based scoring. We also applied an automated word-based scoring method that did not require manually comparing variable-length parameters and could be used to easily evaluate future large-scale CI annotation efforts.

We first removed common English stopwords from all annotations to prevent variations in article or preposition highlighting from affecting annotation comparisons. We used the stopword list in the Python NLTK library [7], excluding “you,” “your,” “them,” and “we,” as these pronouns could have been senders or recipients in the privacy policy excerpts.

True positives are then words labeled by both the participant and the expert. False positives are words labeled by the participant only. False negatives are words labeled by the expert only. This allows us to calculate word-based precision, recall, and F1 scores for each CI parameter and excerpt. Some CI parameters do not occur in every excerpt. If the expert did not label a particular parameter in an excerpt, the participant’s recall was defined as 1 for the corresponding annotation. If a participant did not label a particular parameter in an excerpt, the participant’s precision was defined as 1 for the corresponding annotation. These are standard definitions of precision and recall for edge cases.
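The word-based scoring described above can be sketched for a single CI parameter in a single excerpt. Several details here are assumptions made for illustration: annotations are represented as sets of token indices, the small `STOPWORDS` set stands in for the full NLTK stopword list so that the example has no external dependencies, and F1 is set to 0 when precision and recall are both 0. The name `word_scores` is hypothetical.

```python
# Tiny stand-in for the NLTK English stopword list (assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "in", "with", "about",
             "you", "your", "them", "we"}
# Pronouns kept because they can be senders or recipients.
KEPT_PRONOUNS = {"you", "your", "them", "we"}
EFFECTIVE_STOPWORDS = STOPWORDS - KEPT_PRONOUNS


def word_scores(tokens, participant, expert):
    """Return (precision, recall, F1) for one CI parameter in one excerpt.

    tokens: the excerpt split into words.
    participant, expert: sets of token indices labeled with the parameter.
    """
    keep = {i for i, t in enumerate(tokens)
            if t.lower() not in EFFECTIVE_STOPWORDS}
    p, e = participant & keep, expert & keep
    tp = len(p & e)   # labeled by both
    fp = len(p - e)   # labeled by the participant only
    fn = len(e - p)   # labeled by the expert only
    precision = 1.0 if not p else tp / (tp + fp)  # edge case: nothing labeled
    recall = 1.0 if not e else tp / (tp + fn)     # edge case: nothing to find
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


tokens = "We collect information about your device".split()
expert_attr = {2, 3, 4, 5}   # "information about your device"
worker_attr = {3, 4, 5}      # "about your device"
print(word_scores(tokens, worker_attr, expert_attr))
```

Because “about” is dropped as a stopword, the worker is scored on “your” and “device” against the expert’s “information,” “your,” and “device,” giving precision 1.0 and recall 2/3.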

6.5 Annotation Accuracy

Figure 7 shows the counts of correctly and incorrectly annotated CI parameters across all excerpts from parameter-based scoring. The incorrect annotations are divided into categories to better understand the source of crowdworker errors. The crowdsourced majority vote annotations correctly labeled 43% of the senders, 89% of the attributes, 68% of the recipients, and 60% of the transmission principles across all excerpts. False negatives were by far the most common error, with the crowdsourced annotations missing 30% of the senders, 9% of the attributes, 21% of the recipients, and 34% of the transmission principles across all excerpts. Finally, false positive errors comprised 26% of the senders, 2% of the attributes, 11% of the recipients, and 6% of the transmission principles across all excerpts.⁵

[Figure 7: bar chart titled “Crowdsourced CI Parameter Evaluation”. x-axis: Correct, Skipped, Ambiguous, Overlapping, True Error, Expert Error; y-axis: Number Across All Excerpts (0–200); series: Sender, Attribute, Recipient, Transmission Principle.]

Fig. 7. Parameter-based evaluation of crowdsourced majority vote annotations compared to expert ground truth. Correct (true positive) annotations are parameters labeled to match the expert annotation. Skipped (false negative) annotations are parameters only labeled by the expert. All other incorrect annotations (false positives) are described in Section 6.6. Note that most errors are skipped parameters (false negatives), indicating that the crowdworkers understood the task, but that further work is needed to improve recall.

Figure 8 shows the distributions of word-based precision and recall scores for the majority vote annotations across all excerpts and for each CI parameter. The average precision across all excerpts is 0.95 for attributes, 0.80 for senders, 0.89 for recipients, and 0.94 for transmission principles. The corresponding average recall across all excerpts is 0.87 for attributes, 0.82 for senders, 0.83 for recipients, and 0.59 for transmission principles.

Overall, the high precision of the majority vote crowdworker annotations (by both parameter-based and word-based scoring methods) indicates that the majority of crowdworkers understood the CI annotation task and were able to correctly identify and highlight CI parameters in short privacy policy excerpts. However, the many false negatives indicate that the framing of the task could potentially be improved to help crowdworkers avoid missing or intentionally skipping some parameters.

⁵ Percentages were rounded to the nearest whole value and may not add to 100%.

Fig. 8. Word-based precision and recall scores of majority vote crowdsourced annotations compared to expert ground truth for each CI element.

6.6 Evaluating Crowdworker Errors

Analyzing the crowdsourced annotations raises the question “What causes particular excerpts or CI parameters to be more difficult for crowdworkers to annotate than others?”

One intuitive explanation is that excerpts that are longer, more difficult to read, or contain more CI parameters are more difficult for crowdworkers to annotate. To test this hypothesis, we calculated Spearman correlations of the majority vote annotation word-based F1 scores versus text length, Flesch-Kincaid Reading Ease [16], FOG Index [16], and number of CI parameters (Appendix Table 5). However, all of the resulting correlation coefficients had absolute values less than 0.5, indicating no strong correlations with F1 score. This suggests that crowdworker difficulties with certain excerpts or parameters are due to more nuanced factors than length or readability.
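For readers unfamiliar with the statistic, the sketch below computes a Spearman rank correlation in plain Python by ranking both variables (ties receive the average rank) and taking the Pearson correlation of the ranks. The excerpt lengths and F1 scores are made-up placeholder values chosen only to exercise the code; they are not our measurements, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder data (NOT from the study): excerpt length in words vs. F1 score.
lengths = [21, 40, 55, 80, 113]
f1_scores = [0.9, 0.7, 0.85, 0.8, 0.75]
rho = spearman(lengths, f1_scores)
print(round(rho, 3), abs(rho) < 0.5)
```

A coefficient whose absolute value stays below 0.5, as in this toy example, is what we treat above as no strong correlation.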

We further investigate these factors by manually comparing the crowdsourced majority vote annotations to the expert annotations. We noticed that crowdworkers had more difficulty annotating senders and recipients than attributes and transmission principles. Attributes and transmission principles are generally nouns or verbs, occur in lists, and require less semantic parsing to identify. In contrast, senders and recipients are often pronouns that occur singly and require more complex sentence parsing to distinguish between them.

More detailed analysis indicated that the 160 parameter-based annotation errors fall into five main categories. Each category has corresponding implications for crowdsourcing CI annotations.

6.6.1 Expert Errors

We identified 11 cases where the majority vote crowdsourced annotation was correct while the “ground-truth” expert annotation was incorrect. Most of these cases were due to the expert missing a one-word sender or recipient, e.g., “we.” We did not adjust recall or precision scores to reflect the incorrect expert annotations, as these judgments were made after, and could have been influenced by, viewing the crowdsourced annotations. However, the presence of these incorrect expert annotations demonstrates the non-triviality of the annotation task.

6.6.2 Skipped Parameters

The most common error occurred when the crowdworkers simply neglected to annotate some or all instances of a given parameter. These errors were the primary contributor to lowering recall scores without affecting precision. We identified 117 skipped parameter errors. There are three possible reasons why crowdworkers might have neglected to annotate all instances of each parameter: 1) the workers may have considered an excerpt and honestly thought that it didn’t contain the parameter, 2) the workers may have intentionally skipped entire parameters, or 3) the workers may have found one or two instances of each parameter and then moved on to the next excerpt without double-checking to ensure that none were missed. This could be due to cognitive fatigue or the fact that crowdworkers are incentivized to finish the annotations as quickly as possible to optimize their hourly compensation rate.

As an example of reason 1, consider the sentence “We collect information when you sync non-content like your email address book, mobile device contacts, or calendar with your account.” Both the expert and the crowdworkers correctly labeled “email address book,” “mobile device contacts,” and “calendar” as attributes. However, the expert also labeled “information” as an attribute, while the majority vote annotation did not. This was marked as a false negative “skipped parameter” error, but “information” does not provide any specific details about the attribute, so it is understandable that the crowdworkers omitted this label. This specific skipped parameter error (“information” not labeled as attribute) occurred in 6 of the annotated excerpts.

Skipped errors could potentially be reduced in future crowdsourcing tasks by using previous crowdworker annotations to provide “hints” for successive workers. For example, the number of parameters annotated by previous workers could be shown (likely as a range) to indicate approximately how many parameters the current worker should find. This would help address reason 3 for skipped errors above, providing a nudge for workers finding fewer parameters to continue searching for additional annotations. However, such hints would have to be carefully applied to prevent individual crowdworker errors from negatively influencing the collective annotation effort.

6.6.3 Ambiguous Parameters

Ambiguous parameter errors occurred when a CI parameter was mislabeled compared to the expert annotation, but the correct labeling is ultimately open to interpretation. Consider the sentence “If you want to take full advantage of the sharing features we offer, we might also ask you to create a publicly visible Google Profile, which may include your name and photo.” In this sentence, “publicly” could be interpreted as a recipient, i.e., the public would receive the data in the Google Profile. However, “publicly” could also be interpreted as a transmission principle, i.e., the flow is from “you” to your “Google Profile” and the condition on the flow is that it is public. The expert labeled “publicly” as a recipient, while the crowdsourced majority did not. We only identified 3 such ambiguous parameter errors, indicating that CI information flow descriptions map naturally to privacy policy texts.

6.6.4 Overlapping Parameters

Overlapping parameter errors occurred when a CI parameter was mislabeled compared to the expert annotation, but the text in question is part of two or more CI parameters simultaneously. We identified 16 overlapping parameter errors. Consider the excerpt “When you use our services or view content provided by Google, we automatically collect and store certain information in server logs.” The first clause (before the comma) could be interpreted as a single transmission principle, but the “you” could also be a sender. Variations on this issue were the primary cause of false positive errors for the “sender” parameter, i.e., the expert annotated an entire clause as a transmission principle but the majority vote annotation instead labeled a single word in the clause as a sender.

The presence of overlapping parameter errors is due to a tradeoff in our implementation of the CI annotation task. We chose to allow only one CI parameter annotation per word in each excerpt to simplify the task for workers. This tradeoff could be avoided in future work by asking each crowdworker to annotate only a single CI parameter type, simplifying the task from multi-class classification to binary classification. However, this would require more crowdworkers per policy and could lead to higher rates of false positives if crowdworkers aren’t forced to discriminate between different parameters.

6.6.5 True Errors

True errors occurred when the crowdworkers unambiguously misannotated a CI parameter. Fortunately, true errors accounted for only 13 out of 160 total errors in the majority vote annotation. This implies that when a label makes it into the majority vote annotation (with sufficient workers contributing to the vote), it is very likely to be correct. The low frequency of true errors indicates that, with improvements to reduce the number of skipped parameter errors, crowdsourcing can be a high-accuracy method of obtaining CI annotations of privacy policies.

6.7 Summary

Our proof-of-concept experiment shows that crowdworkers with no prior exposure to CI are able to quickly understand and perform CI annotations of legalistic privacy policies. Labels that make it into a majority vote annotation compiled from several individual crowdworkers are very likely correct. This supports the notion that CI-style information flows are a natural way for people to think about privacy and thereby a useful framework for analyzing privacy policies and privacy policy updates.

7 Discussion

We present a CI annotation methodology to help researchers and regulators assess and evaluate privacy policies. This work is a stepping stone in a larger effort to improve readability and increase transparency in disclosure of information handling practices. While philosophical in origin, the theory of CI offers a practical framework to reason about privacy implications in a given context and therefore serves as a powerful tool for reasoning about privacy-preserving efforts in technical fields.

The notion of an appropriate information flow in the CI framework lends itself well to user data privacy policies; privacy statements are essentially information flows prescribed by the policy. Annotating privacy policies with CI parameter labels offers a way to apply a full-fledged formal theory of privacy to their analysis. Relevant stakeholders (consumers, legal scholars, and regulators) can perform qualitative, quantitative and normative analysis to find incomplete, vague and ambiguous privacy statements. This also enables leveraging other applications of the CI framework. For example, it is possible to compare which flows prescribed by the policy align or do not align with consumers’ privacy expectations [5].

As privacy policies evolve, CI annotations assist comparative analyses of new updates to identify which information flows were amended, added or removed. These analyses will ideally help companies write more coherent and complete privacy policies by identifying privacy statements containing missing, vague and/or bloated CI parameters.
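Once flows have been extracted from two annotated policy versions, such a comparison reduces to set operations. The sketch below is a simplification under stated assumptions: the Flow type, its four fields, and the rule that a flow counts as amended when its sender, recipient and attribute match but its transmission principle differs are all hypothetical choices. It also assumes at most one flow per (sender, recipient, attribute) combination in each version.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    sender: str
    recipient: str
    attribute: str
    transmission_principle: str


def flow_key(flow):
    # Flows sharing this key are treated as the same flow across versions.
    return (flow.sender, flow.recipient, flow.attribute)


def diff_flows(old_flows, new_flows):
    """Return (added, removed, amended) flows between two policy versions."""
    old_by_key = {flow_key(f): f for f in old_flows}
    new_by_key = {flow_key(f): f for f in new_flows}
    added = sorted(new_by_key[k] for k in new_by_key.keys() - old_by_key.keys())
    removed = sorted(old_by_key[k] for k in old_by_key.keys() - new_by_key.keys())
    amended = sorted(
        (old_by_key[k], new_by_key[k])
        for k in old_by_key.keys() & new_by_key.keys()
        if old_by_key[k] != new_by_key[k]
    )
    return added, removed, amended


# Hypothetical flows, not taken from any real policy.
old = [
    Flow("user", "service", "email address", "with consent"),
    Flow("service", "advertisers", "age", "in aggregate"),
]
new = [
    Flow("user", "service", "email address", "at account creation"),
    Flow("service", "partners", "location", "if enabled"),
]
added, removed, amended = diff_flows(old, new)
```

Exact string matching is a deliberate simplification; real policy text would likely need normalization or fuzzy matching of parameter phrases before flows can be compared this way.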

Furthermore, we can use our CI annotation crowdsourcing methodology to produce a large corpus of privacy policy annotations and discover trends and patterns in the types of flows that are being prescribed by policies within and across industries. This corpus could also be used as a training set to build tools for automatically identifying CI flows and parameters in privacy policies.

8 Limitations and Future Work

We have identified the following opportunities for further research to improve and streamline the CI annotation process:

First, privacy policies are not written to intentionally fit the CI framework. As discussed in Sections 5 and 6, privacy policy terms can be ambiguous, vague, compound and even missing. This complicates the task of annotating privacy policy text with CI parameter labels. Nevertheless, our crowdsourcing annotations showed promising results on a diverse set of privacy statements from the privacy policies of 17 companies. In future work, we intend to continue validating the CI annotation approach on larger policy samples.

Second, our annotation methodology deals only with statements describing information transfers. These statements comprise the majority of privacy policy text and lend themselves to the CI framework. However, other statements, such as those describing how long information is stored, when and how information is purged, and what features allow users to fine tune privacy settings, fall outside the reasoning of the CI framework. Annotating these statements will require additional methodologies to complement our approach. A blended technique, such as combining CI annotation for information transfer statements with more general tags like those used by the Usable Privacy Project [26], could provide the rigor of our CI technique with the flexibility to account for the diversity of information included in privacy policies.

9 Conclusion

This paper presents a methodology for analyzing privacy policies using annotations based on the theory of contextual integrity [22]. We perform a case study annotation of pre- and post-GDPR Facebook privacy policies and demonstrate that CI offers a rigorous way to examine privacy statements. We find that Facebook’s post-GDPR privacy policy describes more total information flows with more parameters than the pre-GDPR version, but the updates do not improve the percentage of flows that contain vague language, omit parameters, or include many parameters of the same type. These issues impede interpretability, preventing users from clearly understanding how their information is being collected and shared.

To further scale our approach, we present a method for crowdsourcing CI annotation of privacy policies. We test this method on 48 excerpts from 17 policies with 141 Amazon Mechanical Turk workers. The resulting high-precision crowdsourced annotations indicate that CI annotation is an intuitive method for interpreting privacy policies and that crowdsourcing could be used to obtain a large corpus of annotated privacy policies for future analysis.

References

[1] Art. 12 GDPR Transparent information, communication and modalities for the exercise of the rights of the data subject. https://gdpr-info.eu/art-12-gdpr/.

[2] EU GDPR Information Portal. https://www.eugdpr.org.

[3] Multi-document Annotation Environment. https://keighrim.github.io/mae-annotation/.

[4] Last-minute frenzy of GDPR emails unleashes ‘torrent’ of spam – and memes. https://www.eugdpr.org, 2018.

[5] N. Apthorpe, Y. Shvartzshnaider, A. Mathur, D. Reisman, and N. Feamster. Discovering smart home internet of things privacy norms using contextual integrity. ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT/UbiComp), 2018.

[6] J. Bhatia, T. D. Breaux, J. R. Reidenberg, and T. B. Norton. A theory of vagueness and privacy risk perception. In Requirements Engineering Conference (RE), 2016 IEEE 24th International, pages 26–35. IEEE, 2016.

[7] S. Bird, E. Klein, and E. Loper. Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media, Inc., 2009.

[8] A. Cohen. FuzzyWuzzy: Fuzzy string matching in python. https://github.com/seatgeek/fuzzywuzzy, 2011.

[9] F. T. Commission et al. Privacy online: A report to congress. Washington, DC, June, pages 10–11, 1998.

[10] E. Dwoskin and T. Romm. Facebook makes its privacy controls simpler as company faces data reckoning. https://www.washingtonpost.com/news/the-switch/wp/2018/03/28/facebooks-makes-its-privacy-controls-simpler-as-company-faces-data-reckoning/, 2018.

[11] M. C. Evans, J. Bhatia, S. Wadkar, and T. D. Breaux. An evaluation of constituency-based hyponymy extraction from privacy policies. In Requirements Engineering Conference (RE), 2017 IEEE 25th International, pages 312–321. IEEE, 2017.

[12] S. Frier. Facebook Updates Policies After Privacy Outcry, Limits Data Use. https://www.bloomberg.com/news/articles/2018-04-04/facebook-updates-policies-after-privacy-outcry-limits-data-use, 2018.

[13] A. Guinchard. Contextual integrity and eu data protection law: Towards a more informed and transparent analysis. SSRN, 2017.

[14] G. Hull, H. R. Lipford, and C. Latulipe. Contextual gaps: privacy issues on facebook. Ethics and information technology, 13(4):289–302, 2011.

[15] M. Johnson, S. Egelman, and S. M. Bellovin. Facebook and privacy: it’s complicated. In Proceedings of the eighth symposium on usable privacy and security, page 9. ACM, 2012.

[16] J. P. Kincaid, R. P. Fishburne Jr, R. L. Rogers, and B. S. Chissom. Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. Technical report, Naval Technical Training Command Millington TN Research Branch, 1975.

[17] H. R. Lipford, A. Besmer, and J. Watson. Understanding privacy settings in facebook with an audience view. UPSEC, 8:1–8, 2008.

[18] L. Litman, J. Robinson, and T. Abberbock. TurkPrime.com: A versatile crowdsourcing data acquisition platform for the behavioral sciences. Behavior research methods, 49(2):433–442, 2017.

[19] Y. Liu, K. P. Gummadi, B. Krishnamurthy, and A. Mislove. Analyzing facebook privacy settings: user expectations vs. reality. In Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference, pages 61–70. ACM, 2011.

[20] K. Martin. Privacy notices as tabula rasa: An empirical investigation into how complying with a privacy notice is related to meeting privacy expectations online. Journal of Public Policy & Marketing, 34(2):210–227, 2015.

[21] K. Martin and H. Nissenbaum. Measuring privacy: an empirical test using context to expose confounding variables. Colum. Sci. & Tech. L. Rev., 18:176, 2016.

[22] H. Nissenbaum. Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press, 2010.

[23] Qualtrics. www.qualtrics.com, 2018.

[24] A. Rao, F. Schaub, N. Sadeh, A. Acquisti, and R. Kang. Expecting the unexpected: Understanding mismatched privacy expectations online. In Twelfth Symposium on Usable Privacy and Security (SOUPS 2016), pages 77–96, Denver, CO, 2016. USENIX Association.

[25] J. R. Reidenberg, T. Breaux, L. F. Cranor, B. French, A. Grannis, J. T. Graves, F. Liu, A. McDonald, T. B. Norton, and R. Ramanath. Disagreeable privacy policies: Mismatches between meaning and users’ understanding. Berkeley Tech. LJ, 30:39, 2015.

[26] N. Sadeh, A. Acquisti, T. D. Breaux, L. F. Cranor, A. M. McDonald, J. R. Reidenberg, N. A. Smith, F. Liu, N. C. Russell, F. Schaub, et al. The usable privacy policy project. Technical Report CMU-ISR-13-119, Carnegie Mellon University, 2013.

[27] K. M. Sathyendra, F. Schaub, S. Wilson, and N. Sadeh. Automatic extraction of opt-out choices from privacy policies. In AAAI Fall Symposium on Privacy and Language Technologies, 2016.

[28] K. M. Sathyendra, S. Wilson, F. Schaub, S. Zimmeck, and N. Sadeh. Identifying the provision of choices in privacy policy text. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2764–2769, 2017.

[29] Y. Shvartzshnaider, S. Tong, T. Wies, P. Kift, H. Nissenbaum, L. Subramanian, and P. Mittal. Learning privacy expectations by crowdsourcing contextual informational norms. In Fourth AAAI Conference on Human Computation and Crowdsourcing, 2016.

[30] S. Sudman, N. M. Bradburn, N. Schwarz, and T. Gullickson. Thinking about answers: The application of cognitive processes to survey methodology. Psyccritiques, 42(7):652, 1997.

[31] J. Turow, M. Hennessy, and A. Bleakley. Consumers’ understanding of privacy rules in the marketplace. Journal of consumer affairs, 42(3):411–424, 2008.

[32] J. Turow, M. Hennessy, and N. Draper. Persistent Misperceptions: Americans’ Misplaced Confidence in Privacy Policies, 2003–2015. Journal of Broadcasting & Electronic Media, 62(3):461–478, 2018.

[33] UserBob. https://userbob.com/, 2018.

[34] P. Wijesekera, A. Baokar, A. Hosseini, S. Egelman, D. Wagner, and K. Beznosov. Android permissions remystified: A field study on contextual integrity. In USENIX Security Symposium, pages 499–514, 2015.

[35] S. Wilson, F. Schaub, A. A. Dara, F. Liu, S. Cherivirala, P. G. Leon, M. S. Andersen, S. Zimmeck, K. M. Sathyendra, N. C. Russell, et al. The creation and analysis of a website privacy policy corpus. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 1330–1340, 2016.

[36] S. Wilson, F. Schaub, R. Ramanath, N. Sadeh, F. Liu, N. A. Smith, and F. Liu. Crowdsourcing annotations for websites’ privacy policies: Can it really work? In Proceedings of the 25th International Conference on World Wide Web, WWW ’16, pages 133–143, Republic and Canton of Geneva, Switzerland, 2016. International World Wide Web Conferences Steering Committee.

[37] M. Zimmer. Privacy on planet google: Using the theory of contextual integrity to clarify the privacy threats of google’s quest for the perfect search engine. J. Bus. & Tech. L., 3:109, 2008.


Appendix

Fig. 9. CI annotation task instructions.


Statistic                      CI Parameter   Corr. coeff.   p-value
Total # words                  Attribute      -0.03          0.82
                               Sender          0.03          0.86
                               Recipient      -0.11          0.46
                               TP             -0.15          0.30
# words labeled as CI          Attribute       0.07          0.62
parameters by expert           Sender          0.10          0.48
                               Recipient       0.01          0.96
                               TP             -0.02          0.89
Flesch-Kincaid Reading Ease    Attribute       0.14          0.35
                               Sender          0.20          0.18
                               Recipient       0.10          0.49
                               TP             -0.05          0.76
FOG Index                      Attribute       0.15          0.32
                               Sender          0.19          0.19
                               Recipient       0.10          0.50
                               TP             -0.06          0.67

Table 5. Spearman correlations of majority vote annotation word-based F1 scores for each CI parameter versus various statistics of corresponding privacy policy excerpts.
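For readers unfamiliar with the statistic in Table 5, the following minimal sketch computes a Spearman correlation coefficient using only the standard library: both variables are converted to ranks (ties share the average rank) and the Pearson correlation of the ranks is taken. The function names and input values are hypothetical, p-values are not computed, and the sketch assumes neither input is constant. It is not the analysis code used to produce the table.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Hypothetical excerpt word counts and F1 scores, for illustration only.
word_counts = [40, 55, 55, 90]
f1_scores = [0.9, 0.7, 0.8, 0.6]
rho = spearman(word_counts, f1_scores)
```

Because the statistic depends only on ranks, it captures monotonic rather than strictly linear relationships, which suits bounded scores such as F1.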