An Analysis of United States Online Political Advertising … · 2019-02-13 · exact ad...

An Analysis of United States Online Political Advertising Transparency

Laura Edelson, New York University
Shikhar Sakhuja, New York University
Ratan Dey, New York University
Damon McCoy, New York University

arXiv:1902.04385v1 [cs.SI] 12 Feb 2019

ABSTRACT
During the summer of 2018, Facebook, Google, and Twitter created policies and implemented transparent archives that include U.S. political advertisements which ran on their platforms. Through our analysis of over 1.3 million ads with political content, we show how different types of political advertisers are disseminating U.S. political messages using Facebook, Google, and Twitter's advertising platforms. We find that in total, ads with political content included in these archives have generated between 8.67 billion - 33.8 billion impressions and that sponsors have spent over $300 million USD on advertising with U.S. political content.

We are able to improve our understanding of political advertisers on these platforms. We have also discovered a significant amount of advertising by quasi for-profit media companies that appeared to exist for the sole purpose of creating deceptive online communities focused on spreading political messaging and not for directly generating profits. Advertising by such groups is a relatively recent phenomenon, and appears to be thriving on online platforms due to the lower regulatory requirements compared to traditional advertising platforms.

We have found through our attempts to collect and analyze this data that there are many limitations and weaknesses that enable intentional or accidental deception and bypassing of the current implementations of these transparency archives. We provide several suggestions of how these archives could be made more robust and useful. Overall, these efforts by Facebook, Google, and Twitter have improved political advertising transparency of honest and, in some cases, possibly dishonest advertisers on their platforms. We thank the people at these companies who have built these archives and continue to improve them.

1 INTRODUCTION
Online advertising plays an increasingly important role in political elections. As part of the 2016 U.S. national elections there were a number of controversies regarding an ad-driven propaganda campaign to influence elections [4] and privacy violations [12]. In response to these controversies, Facebook, Google, and Twitter have all created policies and implemented products to make transparent and archive U.S. political advertisements that have run on their platforms. A report by Upturn points out in their analysis of Facebook's then-proposed political transparency archive plans that providing effective transparency of only political ads can be tricky in the face of a complex online ad network [31].

In this paper, we analyze Facebook [5], Google [11], and Twitter's [29] political ad transparency implementations along with the political ad data included in each archive. Through the lens of these transparency efforts, we perform what is to our knowledge the first large-scale analysis of U.S. online political advertising. Our analysis enables us to begin to understand and describe the parts of online political advertising in the U.S. that have been made accessible and transparent. We collected as much data as was possible from these archives, collecting 75% of ads archived by Facebook and 100% of ads archived by Twitter and Google from May 2018 to October 21st, 2018. In total, we collected and analyzed over 1.3 million political ads from over 24 thousand sponsors. We additionally connect this data to a public dataset published by ProPublica which was gathered by Facebook users via a browser plugin. The ProPublica dataset provides partial information on how Facebook ads have been targeted to the user seeing them.

Our analysis was hampered by our inability to collect all of the ads in Facebook's transparency archive due to limitations of their current beta API. It was also hampered by Facebook and Google releasing ranges instead of exact impression data. It is also unclear if spend is the best metric for measuring the impact of an online political advertisement. Thus, there is some level of uncertainty in much of our analysis, especially that related to Facebook's platform. We acknowledge that all three of these political advertising transparency archives were rapidly deployed and this has caused some of the issues with what and how it was released. We have worked with Facebook to improve access to the ads in their transparency archive and we hope to work with Google and Twitter so that they can include more political advertisers in their transparency archives.

Given these limitations and biases of our data, we perform an initial large-scale analysis of U.S. online political advertising. As part of this study, we provide information about the audience size of individual political ads based on impression data for each of the platforms. We find that across all three platforms the majority of political ads are small, costing their sponsors less than $100 USD, with 82% of all Facebook political ads costing between $0 - $99. This confirms and quantifies the prevalence of small, likely highly targeted ads that can contain custom political messaging. We also create a taxonomy of political ad types based on their intent (i.e., connect, donation, inform, move) and an effective ad type classification methodology based on labeling URLs included in political ads. Using our methodology, we are able to provide a longitudinal analysis of political ads based on the type of ad. Finally, we are able to identify many likely dishonest advertisers that are not correctly disclosing or are obfuscating the real ad sponsor. We categorize these types of advertisers into quasi for-profit media companies and corporate astroturfers. We discuss the limitations and weaknesses that enable intentional or accidental deception and
bypassing of the current implementations of these transparency archives. Based on our experiences analyzing these transparency archives, we provide several suggestions of how these archives could be made more robust and useful.

We have made public as much of the data that we have collected as possible, as well as all of our labeling data, collection, and analysis scripts. Our initial reports have been the basis for many journalists' stories, which have improved online political advertising transparency.

2 BACKGROUND
Facebook, Google, and Twitter have all deployed political advertising transparency archives. However, each of these transparency archives has different criteria for the inclusion of ads and different modes of access, which we summarize as a taxonomy in Table 1. Each of these transparency archive implementations has strengths and weaknesses, and there is currently no "best" archive.

                    Facebook                   Google                              Twitter
Ads Included        All candidates, issue ads  Federal candidate related           Federal candidate, issue ads
Sponsor Info        Name                       Name, FEC/EIN                       Name, billing info
Ad Contents         Ranges                     Ranges                              Exact
Viewed Audience     Gender, age, geolocation   NA                                  Gender, age, geolocation
Targeting Info      NA                         Age, gender, geolocation, keywords  Age, gender, geolocation
Data Availability   Portal, API (w/ NDA)       Portal, database                    Portal, API

Table 1: Transparency Implementations

What is an ad? For purposes of this study, an ad is a record in an archive with a unique id assigned by the platform. While each platform has slightly different information associated with each record, each has 3 categories of information associated with it: content, context, and results. The contents of an ad consist of any text, images, and/or videos seen by an ad viewer. The context of the ad is the information specified by the advertiser about how, when, where, and by whom they want the ad to be seen, and how much they are willing to spend for the ad to be seen. Both the content and context of the ad are specified by the advertiser at the time of ad creation. The results of the ad consist of information about who ultimately interacted with the ad, when those interactions happened, and how much was ultimately spent on the ad. Not all of this information is made available about each record in any of the archives, but all of this information is made available by at least one archive.

Archive Completeness: All archives contain a unique id for each ad, impression counts and amount spent for each advertisement, as well as the dates the ad was active. Facebook and Google release ad impressions and spend in ranges, while Twitter releases these as exact numbers. Twitter's implementation of releasing exact impression and spend information for every ad enables us to more precisely measure political advertising on their platform. It would help remove uncertainty if Facebook and Google would also release exact ad impressions and spend amounts instead of ranges. We also note that impressions is an imperfect metric to measure the "reach" of an ad, and it would be useful to also include click and other interaction metrics recorded by the platforms. Facebook and Twitter include the ad text, image, and video content. Google's archive contains a link to a webpage where this content is viewable, but does not contain the content itself. Twitter is the only platform to release targeting information, including whether users were targeted by geography, age, or gender. Facebook and Google have detailed information about how users are targeted for each ad, based on advertiser-produced lists of personally identifiable information, groups they belong to, demographic or income information that the platform has about the user, geography, or keyword search information; however, they do not currently make any of this information available in their archives.

Ads Included: Facebook has the most inclusive policy and includes in their archive ads that meet any of the following criteria: "(1) Is made by, on behalf of, or about a current or former candidate for public office, a political party, a political action committee, or advocates for the outcome of an election to public office; (2) Relates to any election, referendum, or ballot initiative, including "get out the vote" or election information campaigns; (3) Relates to any national legislative issue of public importance in any place where the ad is being run [8]; (4) Is regulated as political advertising" [7]. Facebook allows advertisers to opt into including their ads in the archive. In order to enforce their policy, Facebook uses a combination of user reports and machine learning algorithms to "catch" political ads where the sponsor did not opt into making them transparent.

Google is only including "ads related to elections or issues that feature a federal candidate or officeholder" [11]. Google has stated that they plan on expanding the set of ads included in their archive. It is unclear how Google enforces their policy.

Twitter's original policy was limited to only including ads sponsored directly by federal candidates. However, Twitter has since expanded their policy to include "(1) Ads that refer to an election or a clearly identified candidate. (2) Ads that advocate for legislative issues of national importance. A clearly identified candidate refers to any candidate running for federal, state, or local election [30]." Based on our analysis, it appears that Twitter is currently not enforcing their policy well.

Sponsors' Info: Facebook only displays a text string, which the sponsor provides and which is intended to identify who is paying for the ad. We went through the vetting process of becoming a political advertiser on Facebook's platform. This entailed uploading a U.S. identification card, which was approved approximately five minutes later, at which time we could start posting political ads. Facebook also validates the address of the advertiser by sending them a postcard, which must be replied to within 30 days or else the advertiser will be suspended. However, during this 30 day grace period, political advertisers can post ads without validating their address. We disclosed to Facebook that it does not currently vet this text string, and a subsequent independent experiment by Turton, a reporter from Vice News, showed the same, which allowed the reporter to post ads appearing to be from U.S. Senators [27]. This is a security issue that Facebook has acknowledged, but Facebook claims there are no effective and scalable vetting techniques [21]. We will describe the issues that we found with this self-reported text string later in this paper.

Google provides both a text string and a Federal Election Candidate (FEC) ID or EIN (U.S. Tax ID), which is vetted, for every political advertiser in their archive. Twitter provides a text string and, when available, the billing info of the political sponsors. Twitter initially did not vet this information but started vetting sponsors' EINs, similar to Google, on September 30th, 2018. Providing a consistent and easy to reference identifier, such as FEC or EIN, for each sponsor enables us to better study sponsors in Google and Twitter's archives.

Viewed Audience: Facebook and Twitter both include breakdowns of the impression viewing audience by gender, age range, and state-level geolocation. Google provides this information as a heat map image from which we cannot currently extract the data for our analysis; thus, we mark Google as "NA" for this category. Google has replied that they will work on releasing this information in a format that we can analyze.

Targeting Info: Facebook does not include any explicit targeting information in their archive. Google and Twitter make transparent age, gender, and geolocation based targeting, but do not appear to release other types of targeting criteria, such as audience and content, which are allowed by their advertising platforms. Google also makes transparent some aggregated keyword targeting data. All of the platforms only release partial targeting information at best, which obscures a key facet of online political advertising.

Data Availability: Facebook initially only provided a keyword-based portal that was designed for small-scale interactive user exploration of the ads in the archive. Facebook enabled anti-scraping functionality in July 2018 that makes it difficult to collect large-scale data by scraping this portal. In September 2018, Facebook released an API that is currently in beta testing, which we have access to after signing an NDA stipulating that we will not publicly release raw data collected from the API. This effectively means that only a small set of U.S. organizations participating in the API beta test have large-scale data collection access to ads in the Facebook archive.

Twitter has provided an open API and a list of all accounts included in their transparency archive, which allows us to effectively collect all ads included in their archive. Google implemented a portal similar to Facebook's and also releases a BigQuery (SQL-like) database of all the ads included in their archive, which is updated weekly. For our use case of large-scale data analysis, this database format is ideal.

3 DATA COLLECTION METHODOLOGY
3.1 Facebook
Initially, we scraped Facebook's archive using a list of keywords that included elected positions (e.g., governor, judge, senator), U.S. state names, and key political issues (e.g., health care, immigration, taxes). Around the end of July 2018, Facebook implemented anti-scraping measures which blocked our scraper. Thus, we had no viable means of collecting large-scale ad data until Facebook implemented their API in September 2018. We have publicly released a report and all of the data that we collected by scraping Facebook's archive user portal before our scraper was blocked [17].

We are part of Facebook's Political Ad Archive API beta testing program [9], which allows us to query Facebook's Political Ad Archive for specific keyword terms, which are matched against the page name, the disclaimer, or the ad text. Ads returned by Facebook's API are ordered using a proprietary ranking algorithm whose workings were not described to us. However, most advertisements appear to be returned in chronological order. A single query to Facebook's API returns at most 1000 ads, and we can page through to collect additional ads using the pagination functionality of the API. Currently, there is a limitation in Facebook's Political Ad Archive API beta that prevents us from paging past 8000 ads. This is problematic because many searches will return far more than 8000 results.

Information on spend and impressions per ad is only available in broad ranges. For impressions, the ranges presented are 0 - 999, 1000 - 4999, 5000 - 9999, 10000 - 49999, 50000 - 99999, 100000 - 199999, 200000 - 499999, and 500000 - 999999. For spend, the ranges presented are 0 - 99, 100 - 499, 500 - 999, 1000 - 4999, 5000 - 9999, 10000 - 49999, 50000 - 99999, 100000 - 199999, 200000 - 499999, and 500000 - 999999.
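Because only ranges are available, any total over a set of ads is itself a range. The following is a minimal sketch of that arithmetic, assuming each ad carries a bucket label written exactly as in the list above; the function names are illustrative and are not taken from our released scripts.

```python
from typing import Iterable, Tuple


def parse_range(label: str) -> Tuple[int, int]:
    """Turn a bucket label such as '1000 - 4999' into (1000, 4999)."""
    low, high = (int(part.strip()) for part in label.split("-"))
    return low, high


def total_bounds(labels: Iterable[str]) -> Tuple[int, int]:
    """Sum the lower and upper bounds of every ad's bucket."""
    low_total = 0
    high_total = 0
    for label in labels:
        low, high = parse_range(label)
        low_total += low
        high_total += high
    return low_total, high_total


# Three hypothetical ads: two in the smallest spend bucket, one in the next.
print(total_bounds(["0 - 99", "100 - 499", "0 - 99"]))  # (100, 697)
```

The lower bound of such a sum corresponds to a minimum estimate, and the gap between the two bounds grows with the number of ads.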

Additionally, the API has very low rate limits. We have found that, functionally, we could make at most 3 requests per minute on average before hitting these rate limits. Our goal was to create as comprehensive and representative a dataset as possible. Given the very low rate limits and the limits on the number of responses for a given search, our approach was to search by advertising page as much as possible in order to reduce the bias in our data. We are currently able to keep up with the rate of new advertisements appearing in Facebook's political ad transparency archive. We cannot publicly release the raw data that we have collected from Facebook's API due to the agreement that we have signed with Facebook as a requirement for access to their API.
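The interaction between the page size, the paging cap, and the rate limit can be illustrated with a short sketch. It is not our collection script: `fetch_page` is a hypothetical callable standing in for one API request, and only the constants (1000 ads per query, no paging past 8000 ads, about 3 requests per minute) come from the description above.

```python
import time
from typing import Callable, Dict, List, Tuple

PAGE_SIZE = 1000      # most ads returned by a single query
MAX_OFFSET = 8000     # the beta API does not page past this many ads
MIN_INTERVAL = 20.0   # seconds between requests, roughly 3 per minute


def collect_ads(
    query: str,
    fetch_page: Callable[[str, int], List[Dict]],  # hypothetical request function
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Dict], bool]:
    """Page through one query; return (ads, possibly_truncated)."""
    ads: List[Dict] = []
    last_request = None
    offset = 0
    while offset < MAX_OFFSET:
        if last_request is not None:
            # Wait until at least MIN_INTERVAL has passed since the last request.
            wait = MIN_INTERVAL - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        page = fetch_page(query, offset)
        ads.extend(page)
        if len(page) < PAGE_SIZE:
            return ads, False  # a short page means the results are exhausted
        offset += PAGE_SIZE
    return ads, True  # hit the paging cap; more results may exist
```

A query that comes back flagged as possibly truncated is a candidate for splitting into narrower searches, which is one reason searching by advertising page is attractive under these limits.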

We created a separate approach for discovering pages that are linked to sponsored political ads. Our approach to discovering pages involved scraping Facebook's Political Ad Archive user portal interface. We chose a scraping method for page discovery since our access to Facebook's API is highly rate limited and it would be logistically infeasible to perform the queries required for both page discovery and ad collection using our API access. Our list was not a complete list of advertisers using Facebook's platform, since it depends on good coverage by our keyword searches.

Facebook started publishing a comprehensive list after the cutoff for the data analyzed in this study. Our data collection from Facebook's archive is our best effort and was incomplete, based on analysis of what is contained in Facebook's transparency report. We have changed our data collection methodology moving forward to discover Facebook pages running political ads using Facebook's weekly transparency reports. This, combined with improvements that we requested and Facebook implemented in their API after the data collection period for this study, will improve our coverage to a mostly complete set of the U.S. political ads Facebook has included in their transparency archive.

3.2 Google
Google published their archive as a public dataset in a BigQuery (SQL-like) format and committed to keeping it public. However, we observed that ad spend and impression values for ads, and occasionally advertiser information, were being changed after their entry into the archive. For this reason, we created separate archives of the dataset on a weekly basis. Additionally, ad text information was not available in the dataset itself, but was viewable at a summary page for each ad. We scraped all of these associated pages to collect ad text to be associated in our dataset with the underlying ad data. Unfortunately, many of these summary pages did not render correctly, so we were only able to collect ad text for approximately 66% of pages which contained it. Two separate issues prevented collection. First, some ad pages display the message:
"Advertisers are able to use approved third party vendors to serve ads on Google. While we are able to review these ads for compliance with advertising policies, due to technical limitations, we are currently unable to display the content of the ad in the Transparency Report."

Second, some ad pages displayed the message:
"Policy violation. This ad violated Google's Advertising Policy."
We recommend that Google change their implementation so that ads served by third party vendors, or which were deleted for compliance reasons, are still accessible through their transparency archive. Google can place a click-through disclaimer to avoid accidental exposure to policy-violating content, similar to what Facebook has implemented for deleted advertisements.
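The weekly archiving mentioned above makes it possible to detect records that were altered after publication. A minimal sketch follows, assuming each weekly snapshot has been loaded into a dictionary keyed by ad id; the field names `spend` and `impressions` are placeholders rather than the column names of Google's dataset.

```python
from typing import Any, Dict, Tuple

Snapshot = Dict[str, Dict[str, Any]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> Dict[str, Dict[str, Tuple[Any, Any]]]:
    """Return, per ad id, the fields whose values differ between two snapshots.

    Ads that appear only in the newer snapshot are treated as new entries,
    not as changes, and are skipped.
    """
    changes: Dict[str, Dict[str, Tuple[Any, Any]]] = {}
    for ad_id, new_row in new.items():
        old_row = old.get(ad_id)
        if old_row is None:
            continue
        changed = {
            field: (old_row.get(field), value)
            for field, value in new_row.items()
            if old_row.get(field) != value
        }
        if changed:
            changes[ad_id] = changed
    return changes


# Hypothetical snapshots taken one week apart.
week_1 = {"ad-1": {"spend": "< 100", "impressions": "<= 10k"}}
week_2 = {
    "ad-1": {"spend": "100-1k", "impressions": "<= 10k"},
    "ad-2": {"spend": "< 100", "impressions": "<= 10k"},
}
print(diff_snapshots(week_1, week_2))  # {'ad-1': {'spend': ('< 100', '100-1k')}}
```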

Information on spend and impressions per ad is only available in broad ranges. For impressions, the ranges presented are "<= 10k", "10k-100k", "100k-1M", "1M-10M", and "> 10M". For spend, the ranges presented are "< 100", "100-1k", "1k-50k", "50k-100k", and "> 100k". The ads in this dataset are a combination of text-only ads that are displayed alongside Google search results and image or video-only ads that are displayed as banner or sidebar ads on Google's AdSense network.

In addition to per-ad data, Google also published some aggregate data on a per-advertiser and geographic basis. One of these aggregations was exact weekly spend per advertiser. Throughout this paper, we present minimum numbers for impressions and spend, because both Google and Facebook publish ranges for impressions and spend for each ad instead of exact numbers. These total aggregations give us a sense of how much error there is when we use these minimum estimates for Google. According to Google, advertisers spent $45 M on political ads, but our minimum estimate of spending was only $11 M.

3.3 Twitter
Twitter publishes a list of all political campaigning advertisers [28], which we scrape daily to discover new political campaign advertisers' Twitter accounts. In addition to this list provided by Twitter, we have also manually attempted to identify every federal election candidate's personal or campaign Twitter account. We then query each account daily using Twitter's API to collect updated information on all promoted tweets and to detect federal election candidates which are not listed on Twitter's political campaigning advertisers page but are sponsoring tweets. During our scraping, we have noticed that some promoted tweets were deleted and replaced with the text:
"This Tweet is not available because it includes content that violated Twitter Ads Policies."

The information for these deleted promoted tweets is no longer accessible through Twitter's political transparency archive. However, if we scraped them before they were deleted, we have retained the content and information about these promoted tweets. We recommend that Twitter change their implementation so that promoted tweets which were deleted are still accessible through their transparency archive. Twitter can place a click-through disclaimer to avoid accidental exposure to policy-violating content, similar to what Facebook has implemented for deleted advertisements. We have made public all of the data that we have collected from Twitter's transparency archive.

Additionally, we noted that there were several accounts of federal candidates that were not being archived according to Twitter's policies. We would find these ads during our regular scrapes for ads by all federal candidates, but no billing or impression data would be available, and the ads would disappear from Twitter's archive after 7 days, as is typical for non-political ads. We notified Twitter about 4 accounts, which they subsequently added to their transparency archive. However, Twitter did not retroactively include the prior promoted tweets from these accounts, and there are currently 11 additional federal candidate accounts which have promoted tweets not included in the archive. Thus, it appears that Twitter's process for flagging federal election candidates' accounts that should be included in their archive is not working correctly. We will disclose this new set of 11 accounts to Twitter and continue to work with them to improve their process for discovering and including relevant promoted tweets in their transparency archive.
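The check for unlisted candidate accounts amounts to a set difference between the accounts we track and the accounts Twitter lists, restricted to accounts that have promoted tweets. The sketch below assumes the three inputs have already been collected; the account handles and the function name are invented for illustration.

```python
from typing import Dict, Iterable, List


def unlisted_sponsors(
    candidate_accounts: Iterable[str],
    listed_advertisers: Iterable[str],
    promoted_tweets: Dict[str, List[str]],
) -> List[str]:
    """Candidate accounts that promote tweets but are missing from the published list."""
    listed = {name.lower() for name in listed_advertisers}
    return sorted(
        account
        for account in candidate_accounts
        if account.lower() not in listed and promoted_tweets.get(account)
    )


# Hypothetical handles: one listed, one unlisted sponsor, one with no promoted tweets.
candidates = ["cand_a", "cand_b", "cand_c"]
listed = ["Cand_A"]
promoted = {"cand_a": ["t1"], "cand_b": ["t2", "t3"], "cand_c": []}
print(unlisted_sponsors(candidates, listed, promoted))  # ['cand_b']
```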

4 DATASETS
We have collected all of the U.S. political ad data that Google and Twitter have made transparent and archived as of October 21st, 2018. In addition, we have made our best effort to collect as much as possible of the U.S. political ad data that Facebook has made transparent and archived as of October 21st, 2018. For Facebook, we were not able to collect all of the ad data from their transparency archive due to the limitations in their API; our data is therefore a subset of the U.S. political ads that ran on Facebook. Note that our scraper was blocked by Facebook in mid-July 2018 and we were not able to collect data until the beginning of September 2018, when we began to use their beta API. This means that we do not have good coverage of Facebook ads during that period, since it is difficult to retrieve older ads from Facebook's current beta API. On October 23rd, 2018, shortly after we froze our dataset, Facebook released their Ad Archive Report [6]. From this we know that, as of the cutoff date for data analyzed in this study, Facebook had a total of 1.67M ads in their archive, with a total of $256M spent across 78K pages. We have captured over 75% of all ads in the archive, but only 49% of the pages.

On the Facebook platform, ads that run without a 'Paid for by' label but are later deemed to be political are removed from circulation and added to the archive. We have been able to find 96,106 such ads in the archive, with a total spend of at least $428 million and 670 million impressions. It does not appear that Google or Twitter have any mechanism for retroactively marking an ad as political if it is discovered after the fact, and we would encourage them to develop this capacity.

Table 2 shows all of the data that we have collected from each of the platforms. Most of the political ads in these archives are from late May 2018 to October 21st, 2018, but there are several older ads from Twitter and Facebook that have been included in their transparency archives. Facebook has the most advertisers, ads, impressions, and spend. However, Facebook also includes many political issue ads in their transparency archive that are not included in Google and Twitter's transparency archives, so this is not a fair comparison of political advertising activity across all three platforms. An important difference between the datasets is that while Facebook and Twitter are publishing breakdowns of impressions along geographic and demographic lines, Google is instead publishing geographic and demographic targetings. These should not be considered equivalent. Below, in the analysis section, we will present a more accurate comparison of political advertising activity across all three platforms.

Additionally, we use a dataset published by ProPublica of political ads that have been viewed by their users, who have installed browser extensions that automatically collected advertisements on their Facebook pages and sent them to ProPublica's servers [23]. Table 3 provides an overview of this dataset. We were able to connect ads in the ProPublica dataset to ads in our dataset of archived political ads by mapping the ad IDs used in the ProPublica dataset to the ad archive IDs used in the archive. To do this, we scraped Facebook's web-based political ad archive, where both the ad IDs and the ad archive IDs were available. Each record in this dataset contains, among other things, the text of the ad, the various targetings received by the different users who saw the ad, the page associated with the ad, and the 'Paid by' ad sponsor string associated with the ad. Of the 33,308 ads in the ProPublica dataset with a creation date after May 7th, 2018, the official start of the Facebook dataset, we were able to find 18,010. Because the users who contribute data to this dataset are self-selecting, these ads should not be considered a representative sample of ads in the larger Facebook Ad Archive. Among other things, the average ad spend on ads in this dataset was $644, compared to $107 for the larger dataset.
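The linking step described above can be illustrated with a minimal sketch. It assumes the scraped mapping is available as a plain dictionary from ProPublica ad ID to archive ID; the function name, field names, and sample records are hypothetical and are not taken from either dataset's actual schema.

```python
def link_propublica_ads(propublica_ads, id_map, archive_ads):
    """Pair each ProPublica ad with its archive record, when one can be found.

    propublica_ads: list of dicts with a hypothetical "ad_id" key
    id_map:         dict mapping ProPublica ad ID -> ad archive ID (from scraping)
    archive_ads:    dict mapping ad archive ID -> archive record
    """
    linked = []
    for ad in propublica_ads:
        archive_id = id_map.get(ad["ad_id"])
        # Keep the ad only if the mapping exists and the archive record was collected.
        if archive_id is not None and archive_id in archive_ads:
            linked.append((ad, archive_ads[archive_id]))
    return linked


# Illustrative records only.
propublica_ads = [{"ad_id": "pp-1"}, {"ad_id": "pp-2"}, {"ad_id": "pp-3"}]
id_map = {"pp-1": "arch-10", "pp-3": "arch-30"}
archive_ads = {"arch-10": {"sponsor": "Example PAC"}}

linked = link_propublica_ads(propublica_ads, id_map, archive_ads)
match_rate = len(linked) / len(propublica_ads)
```

An ad can fail to link either because no ID mapping was scraped for it or because the corresponding archive record was never collected, which is one reason the matched set is smaller than the full ProPublica dataset.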

As part of our analysis, we manually categorized the top advertisers on all three platforms. We categorized these advertisers by organization type (political candidate, Political Action Committee (PAC), Union, For Profit, etc.). For Facebook, we were able to classify the organizations of the advertisers who were responsible for at least 75% of the total number of ads in the Facebook archive. For Google, we labeled the organization of the top advertisers who were responsible for 80% of the total number of ads, and for Twitter we were able to label all 88 advertisers with their organization type. We were able to categorize 12,833 of the top ad sponsors. If we were not able to categorize an advertiser, it is marked as 'Unknown'.

Figure 1: Distribution of ads by size

We also classified the ads themselves into 5 categories: Inform, Connect, Donate, Move, or Commercial. Inform ads seek to persuade the viewer but do not make an explicit ask. Connect ads seek the user's contact information. Donate ads seek the user's money. Move ads attempt to motivate the user to take some action in the physical world, such as attending a rally or voting. Commercial ads seek to sell the user goods or services. We classified the ads based on the outgoing links from the ads. Ads that had no outgoing links were always classified as Inform ads, as they could not make any further ask of the user. Ads that linked directly to third-party sites for event management (eventbrite.com), contact management (Google Docs), or payments management (actblue.com) were solely classified as Move, Connect, or Donate ads, respectively. Ads that linked to general campaign sites were usually multiple-classed as some combination of the three, as these ads and pages typically made multiple asks. Ads by For Profit Media organizations were classified as Inform ads, as these advertisers do not sell goods or services to users. Ads by For Profit organizations that linked to store sites or sites selling services were classified as Commercial. We were able to categorize 907,840 ads with these methods. Heavy use of third-party service providers by advertisers was extremely helpful in making these classifications. If we were not able to categorize an ad, it was marked as 'Unknown'. We validated this method of ad categorization by taking a random sample of 300 categorized ads from each platform and manually verifying them. The error rate for Facebook was 4%, for Google was 3.7%, and for Twitter was 3.7%.
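The link-based rules for third-party services can be sketched as follows. This is an illustration under stated assumptions, not the classifier used in the study: the domain table is hypothetical (in particular, docs.google.com is our guess at how Google Docs links would appear), and the sketch omits the rules for general campaign sites, For Profit Media advertisers, and store sites.

```python
from urllib.parse import urlparse

# Hypothetical mapping from third-party service domains to ad categories.
SERVICE_CATEGORIES = {
    "eventbrite.com": "Move",      # event management
    "docs.google.com": "Connect",  # contact management (assumed domain)
    "actblue.com": "Donate",       # payments management
}


def classify_ad(outgoing_links):
    """Return the set of categories implied by an ad's outgoing links."""
    if not outgoing_links:
        # No outgoing link means the ad cannot make a further ask.
        return {"Inform"}
    categories = set()
    for link in outgoing_links:
        host = (urlparse(link).hostname or "").lower()
        for domain, category in SERVICE_CATEGORIES.items():
            # Match the domain itself or any subdomain of it.
            if host == domain or host.endswith("." + domain):
                categories.add(category)
    return categories or {"Unknown"}
```

Returning a set rather than a single label leaves room for ads that make several asks at once, which the text describes for ads linking to general campaign sites.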

A limitation that applies to all our datasets is that we do not know when the spend and impressions for each ad occurred during the lifetime of the ad. Some ads run for several weeks and some for only a day, but in either case we attribute their entire spend and total impressions to the creation date of the ad.
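A small sketch of this attribution rule, as it might feed a weekly time series: each ad's entire spend is assigned to the week of its creation date. Grouping by ISO week is our assumption; the text does not say how weeks were delimited, and the input format is hypothetical.

```python
from collections import defaultdict
from datetime import date


def weekly_spend(ads):
    """Sum spend per (ISO year, ISO week), keyed by each ad's creation date.

    ads: iterable of (creation_date, spend_usd) pairs.
    """
    totals = defaultdict(float)
    for created, spend in ads:
        iso = created.isocalendar()
        # The whole spend goes to the creation week, however long the ad ran.
        totals[(iso[0], iso[1])] += spend
    return dict(totals)


example = weekly_spend([
    (date(2018, 10, 1), 100.0),
    (date(2018, 10, 3), 50.0),
    (date(2018, 10, 8), 25.0),
])
```

Under this rule, an ad created late in one week that runs into the next is counted entirely in the earlier week, so weekly figures lead actual delivery slightly.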

5 RESULTS
We calculate the total spend and impression minimum and maximum for Facebook ads by summing, respectively, the smallest and largest value of the range given for each ad. For Google, advertiser weekly spend data was aggregated for all advertisers, so we did not have to estimate that number. For Twitter, exact numbers for impressions and spend were available, so no estimation was needed. We also note that we are only able to collect a subset of political advertisements from Facebook's transparency archive due to accessibility issues with their beta API. We stress that because the criteria for inclusion in these archives differed across the platforms, the figures on relative proportions of ad types and advertiser types should be seen as a reflection of what the platforms chose to make transparent, in addition to what is organically present on these platforms.
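The range-summing estimate for Facebook can be sketched in a few lines. The record layout and the bucket values below are hypothetical; only the rule (sum the lower bounds for the minimum, the upper bounds for the maximum) comes from the text.

```python
def total_range(ads, field):
    """Return (minimum, maximum) totals for a field reported as a (low, high) range."""
    low = sum(ad[field][0] for ad in ads)
    high = sum(ad[field][1] for ad in ads)
    return low, high


# Illustrative ads with made-up range buckets.
ads = [
    {"spend": (0, 99), "impressions": (0, 999)},
    {"spend": (100, 499), "impressions": (1000, 4999)},
]

spend_min, spend_max = total_range(ads, "spend")
impressions_min, impressions_max = total_range(ads, "impressions")
```

Because every ad contributes its full range, the gap between minimum and maximum grows with the number of ads, which is why the Facebook totals are reported as wide intervals.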

With that in mind, we can see clear differences between the platforms. Of particular note is the difference in ad size visible in Figure 1, with Facebook having a much larger share of the smallest size of ad. Also of note is the differing prevalence of types of advertisers in Figure 3, with PACs making up a much larger percentage of spend on Google compared with the other platforms.

Platform   Total Ads   Total Sponsors   Total Pages   Impressions        Spend             First Ad Date         Last Ad Date
Facebook   1.26 M      24 K             38 K          7.35 B - 21.12 B   $135 M - $567 M   July 14th, 2014       October 21st, 2018
Google     41 K        616              N/A           1.3 B - 11.6 B     $45 M             May 31st, 2018        October 21st, 2018
Twitter    1,808       88               N/A           118 M              $1.6 M            December 21st, 2016   October 21st, 2018

Table 2: Overall Datasets

Total Ads           81,052
Total Pages         2,363
Total Ad Sponsors   2,395
Earliest Ad Date    July 31st, 2017
Latest Ad Date      October 18th, 2018

Table 3: ProPublica Political Advertisements From Facebook

Figure 2: Distribution of ads by type

Figure 3: Distribution of spend by advertiser type

5.1 Data Over Time
The time period during which we were collecting data coincided with the 2018 midterm elections in the United States. Thus, we were able to observe changing patterns in spend leading up to a major election. Figure 4 shows spend by week for the 5-month period leading up to the election, and Figure 5 shows raw ad count for the same period. We note that our data, particularly for Facebook spend, is right-censored for the final two weeks. This is caused by Facebook's API limitations, which only allow us to recheck ad spends weekly. Thus, newly created ads have likely not spent much of their budget when we initially discover them. This right-censoring effect also likely affects Google and Twitter to a lesser degree, due to ads with larger budgets that take several days to spend down completely. This can be corrected by periodically rechecking the ads until they have all spent their budgets, which is normally within a week. If the paper is accepted, we will update the data to include ads up to the U.S. midterm elections.

Results                Facebook        Google         Twitter
Total Advertisers      1 K             534            54
Total Ads              161 K           15 K           1 K
Total Impressions      800 M - 2.4 B   280 M - 3 B    100 M
Total USD Spend        $12 M - $60 M   $13.5 M        $1.4 M
Ave. Impressions/Ad    5 K - 15 K      32 K - 283 K   65 K
Ave. USD Spend/Ad      $74 - $373      $1 K           $885

Table 4: Federal Candidate Only Results

We can see the expected increases in the number of ads on all three platforms as the U.S. midterm elections approach. On Facebook's platform there is an increase in Connect ads, and on Twitter there is an increase in Move ads. Both of these are related to sophisticated "get out the vote" efforts that many groups have deployed. These Move ads include images showing specific polling place addresses and websites that provide polling place directions and information. The Connect ads often provide users with instructions on how they can volunteer to help with early and day-of voter turnout efforts. The spending spikes on Facebook's platform can be attributed to a few unknown sponsors that we could not link to a legally registered entity, but that were likely quasi for-profit advertisers, which we will discuss further later in the paper. The spending spikes on Twitter's platform can be attributed to candidates who ran a few ads with larger budgets.

5.2 Federal Candidate Comparison
In order to understand how political advertising differs across these platforms, we attempted to create a comparable subset of advertisers and ads. This is difficult because each platform has slightly different criteria for inclusion. To do this, we present results only for advertising paid for by candidates for federal office, which was the broadest set that was reliably included in all three archives. Note that this does not include ads by current officeholders who are not seeking re-election, or ads that merely mention a federal candidate but are paid for by another party. Results for these advertisers are presented in Table 4.

Figure 4: Platform Ad Count By Ad Type By Week

Figure 5: Platform Spend By Advertiser Type By Week

Figure 6: Federal Candidate Ads by Size

Figure 7: Federal Candidate Spend by Ad Type

Table 4 shows that Facebook is the platform with the broadest appeal to federal candidate advertisers, with far more advertisers and ads than Google. However, political advertising by this group on Google appears to generate more spend and possibly more impressions than ads on Facebook. The average ad size on Facebook, in terms of both impressions and spend, is the smallest based on our minimum estimates, indicating that advertisers are running smaller, likely more targeted ads on Facebook. These small ads on Facebook are what are called micro-targeted, which we define as less than 1,000 impressions or a spend of less than $100. For Facebook, micro-targeted ads make up 81% of the overall number of ads for federal candidates in our dataset. For Twitter this number is 62%, and for Google it is 54%. Figure 6 shows the share of ads by size of spend, and here we can begin to see how federal candidates use these platforms in different ways. We note that the distribution of ads by size for federal candidates in Figure 6 is very similar to the overall distribution of ads by size seen in Figure 1.
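The micro-targeting threshold stated above can be written as a simple predicate. The sketch assumes each ad has a single impressions figure and a single spend figure; for Facebook, where only ranges are published, which bound to use is a choice the text does not fully specify, so the inputs here are hypothetical point values.

```python
def is_micro_targeted(impressions, spend_usd):
    """Apply the stated definition: under 1,000 impressions or under $100 spend."""
    return impressions < 1000 or spend_usd < 100


def micro_targeted_share(ads):
    """Fraction of ads that meet the micro-targeted definition.

    ads: list of (impressions, spend_usd) pairs.
    """
    if not ads:
        return 0.0
    hits = sum(1 for impressions, spend in ads if is_micro_targeted(impressions, spend))
    return hits / len(ads)


# Illustrative values only.
sample = [(500, 50), (1500, 80), (20000, 900), (999, 100)]
share = micro_targeted_share(sample)
```

Note that the definition is a disjunction, so an ad with many impressions but a very small spend still counts as micro-targeted.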

Figure 7 shows the relative spend on different ad platforms, where we see very different percentages for types of ads. Commercial ads are not shown in this figure because there were too few of them to be visible. Particularly of note is the fact that ads seeking donations were far more common on the Google platform, and ads seeking to spread a message ('Inform') were much more common on Facebook.

Seeing these differences in both ad size and the types of ads that were run, we wanted to understand whether advertisers were trying to reach different geographic audiences with different types of ads. To do this, we compared the number of regions in which various ads had impressions on Facebook and Twitter, and the number of regions targeted for Google. Figure 8 shows that a variation in targeting strategy is visible on Facebook and Twitter. On Facebook, 'Move' ads that encouraged people to attend a rally, volunteer for a candidate, or take part in some other in-person activity were viewed on average in 4 regions, while 'Donate' ads were viewed in 24 regions on average. This makes a certain amount of intuitive sense: people are willing to travel only so far to attend a rally, but can donate to candidates anywhere in the United States.

Figure 8: CDF of Regions by Ad Type for Federal Candidates

5.3 Ad Targeting
One of the deficiencies of the Facebook political ad archive is that while it does share geographic and demographic information about who saw a particular ad, we have no way of knowing how that ad was targeted. However, we were able to connect our dataset, which contains information about who consumed ads, with one published by ProPublica, which contains some data about how ads were targeted. The ProPublica data was collected by a browser plugin operated by ProPublica, which anyone can install. ProPublica's browser plugin [22] uses a supervised Natural Language Processing (NLP) classifier to detect political ads, in addition to allowing users to manually classify ads they see as political. The browser plugin then collects the partial ad targeting explanations Facebook provides by automatically clicking on the "Why am I seeing this?" button for political ads, and sends them to ProPublica to make public.

We first provide a brief background on Facebook's audience targeting options [34]. Facebook exposes prospective advertisers to a plethora of options. First, advertisers can target users based on age, gender, location, and the languages they speak. Second, advertisers can choose to send their ads to users in a custom audience or a lookalike audience. Custom audiences contain a list of identifiers of specific users. Advertisers can use various types of data to create a custom audience list, ranging from specifying the emails, phone numbers, or physical addresses of people they want to reach, to users that have visited their website, installed their mobile application, or liked their Facebook Page. Lookalike audiences allow advertisers to let Facebook choose to whom to send their ads based on previous campaigns. Finally, advertisers can choose, from a long list of targeting attributes, the characteristics they want users who receive their ads to have (e.g., users interested in the Catholic Church). Targeting attributes are categorized in types such as demographics, behaviors, and interests. Advertisers can choose multiple attributes to target.

Figure 9: ProPublica Spend by Ad Type

A prior study by Andreou et al. [2] reverse engineered what Facebook chooses to show in, and the limitations of, the ad targeting explanations it provides. This study showed that ad explanations are incomplete: each explanation shows at most one targeting attribute (plus age/gender/location information), regardless of how many attributes the advertisers use. This means that explanations reveal only part of the targeting attributes that were used, providing us, and the users, with an incomplete picture of the attributes that advertisers were using. However, in the same study the authors performed a number of controlled experiments that suggest (but do not conclusively prove) that there is a logic behind which attributes appear in an explanation and which do not. Given a targeting audience A obtained from two attributes a1 and a2, if a1 and a2 come from different attribute categories (e.g., Demographic, Behavior, Interest, etc.), the attribute shown follows a specific precedence (Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors). If a1 and a2 come from the same attribute category, the one that appears in the explanation is the one with the highest estimated audience size. This will result in a systematic under-counting of lower-priority targeting types.
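The suggested selection logic can be made concrete with a short sketch. It encodes only the rule as summarized above from the prior study, which that study suggests but does not prove; the category labels, tuple layout, and audience sizes are hypothetical.

```python
# Category precedence as summarized above, highest priority first.
PRECEDENCE = ["Demographics", "Interests", "PII-based lists", "Behaviors"]


def shown_attribute(attributes):
    """Pick the single attribute an explanation would show under the suggested rule.

    attributes: non-empty list of (name, category, estimated_audience_size).
    Lower precedence index wins; within a category, the larger audience wins.
    """
    return min(attributes, key=lambda a: (PRECEDENCE.index(a[1]), -a[2]))


# Different categories: the Interests attribute hides the Behaviors one.
mixed = shown_attribute([
    ("frequent travelers", "Behaviors", 10_000_000),
    ("Catholic Church", "Interests", 5_000),
])

# Same category: the attribute with the larger estimated audience is shown.
same = shown_attribute([
    ("niche interest", "Interests", 100),
    ("broad interest", "Interests", 900),
])
```

In the first case the Behaviors attribute is never revealed even though it was used, which illustrates why lower-priority targeting types are under-counted in data built from these explanations.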

There are two main sources of biases and limitations in ProPublica's dataset. One comes from which users installed ProPublica's plugin and which political ads they were shown. The other comes from the way Facebook provides ad explanations. The ProPublica dataset is the only publicly available source of targeting information for Facebook political ads. Thus, we present these results to provide an initial insight into how Facebook political advertisers are targeting their ads, with the understanding that there are likely biases and limitations.

With these caveats in mind, we proceed to an analysis of the 18,010 ads which we were able to connect between the ProPublica dataset and ours. In Figure 9, we see that different types of ads do indeed rely on different targeting strategies. Of particular note is the divergence between 'Commercial' ads, of which 74% rely on targeting by interest groups, and 'Donate' ads, of which only 24% do. The average ad size did not differ significantly between targeting types, but was significantly larger than the average for the Facebook archive as a whole. We believe this to be an artifact of the collection mechanism, which is biased toward finding larger ads. 91% of ads in the overall ProPublica dataset had some kind of geographic targeting, and 92% had age or gender targeting. On average, ads in this dataset had 41 different targeting parameters, so for Facebook these should be thought of as minimum criteria. By contrast, 58% of ads in the Google archive had no geographic targeting whatsoever, and 70% had neither age nor gender targeting.

Figure 10: ProPublica Targeting by Advertiser Type

In Figure 10, we also see diverging strategies between advertisers, with Political Candidates and PACs making heavy use of custom lists of users, and For Profit and For Profit Media companies relying far more on targeting users by their interests. Campaigns have numerous potential sources from which to compile lists of users. In addition to their own lists of donors and voter rolls, campaigns can rent lists from other candidates [15].

Both Google and Twitter offer advertisers similar targeting criteria to what we have described for Facebook, including custom audiences and lookalike audiences. Both even allow targeting of users based on interests, although they infer these interests in different ways. Both have made transparent demographic and geographic targeting information for ads in their archive, but without other targeting information this is an incomplete picture at best. We encourage Google and Twitter to, at minimum, follow Facebook's example and make transparent to users information about why they have been targeted for ads that they are seeing.

5.4 New Types of Political Advertisers
5.4.1 For-Profit Media. One advertiser type in particular proved to be an interesting outlier. The category 'For Profit Media' contains advertisers whose ads are not considered traditional news by Facebook (those ads are in a separate part of the archive that we did not include) but have content intended solely to entertain or sway the opinion of the viewer. Over the Facebook dataset as a whole, the average ad sponsor ran ads on 1.6 pages. Advertisers in the for-profit media category, however, ran ads on 3.2 pages on average. We have examined many of these for-profit media companies to understand why they are running ads across many Facebook pages. What we have found in numerous instances is unknown for-profit media companies creating disingenuous communities, made to look like "grassroots movements", that target different demographics and interests with a combination of paid and organic political messaging.

A good example of this type of advertiser is "New American Media Group LLC". This ad sponsor ran left-leaning ads on 10 different pages. These pages were designed to appeal to different demographics ("Melanin" for people of color, "The Soldier Network" for veterans, "Raising Tomorrow" for parents, etc.) but often ran the same content on multiple pages. While this LLC has an extremely similar name to a now-defunct genuine left-leaning media outlet (New America Media), it appears to have no connection to that prior group and also appears to have no activity off of Facebook.

While some advertisers in this category were fairly traditional entertainment websites (e.g., Comedy Central), some were "for-profit" companies in name only that appeared to exist for no purpose other than to spread a particular political message and had no way of generating an actual profit. We also discovered "News for Democracy", an LLC that ran left-leaning ads on 14 different Facebook pages, most of which were designed to appeal to groups with traditionally conservative viewpoints, such as "The Holy Tribune". Journalists investigated this LLC and linked it to MotiveAI, which is a liberal political advertising company.

5.4.2 Corporate Astroturfing. Corporations paying for political advertising is not an entirely new phenomenon, and such advertising has traditionally been funded through industry trade groups and PACs. However, the reporting requirements set by the FCC for U.S. political advertising on television often made this political messaging traceable to the real sponsor. These stricter reporting requirements do not apply to online political advertising, and the ad-hoc reporting requirements that online platforms have enacted are being abused by corporations and industry trade groups to undo transparency efforts.

We discovered in our analysis 355 ads sponsored by "Citizens for Tobacco Rights", which is not a registered company in the U.S. but does disclose on its website and Facebook page that it is operated by the cigarette company Philip Morris. However, someone who only saw the Facebook ad disclaimer would not be able to connect the ads to Philip Morris without further investigation. Journalists have found instances of oil and insurance lobbying groups that also provided sponsor names that did not match the legally incorporated entity sponsoring the ads [21]. These organizations are seemingly taking advantage of Facebook's policy of not vetting sponsor names, since some of these entities also ran political ads on Google's ad platform but provided Google with their EIN (tax ID) and the correct legally incorporated names of their organizations [21].

5.5 Discussion
The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives such that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion in the different platforms' archives and what can be done to improve these issues.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online sponsorship disclosure policies. As we discussed in Section 5.4.1, it is extremely easy for groups such as 'New American Media' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1], which struck down restrictions on election spending by for-profit corporations. However, political messaging advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business which paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so with no intent of making a profit. As a private company, they do not need to publicly disclose their investors in the way that PACs are required to disclose their donors. Then the for-profit company can advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead when providing the 'ad sponsor' string associated with their ads, either intentionally or accidentally. Thus, it is effectively free to circumvent Facebook's transparency implementation. We see numerous instances of this occurring on Facebook's platform. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J Trump For President Inc sponsored ads on both the Donald J Trump page and the Mike Pence page. However, when sponsoring ads on the Donald J Trump page, the organization is known as 'Donald J Trump For President Inc', and when sponsoring ads on the Mike Pence page it is known as 'Donald J Trump For President Inc'. Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (i.e., LLCs, PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that we need more discussion about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content but are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent the advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data, even with other researchers, or even discussing our findings directly with non-U.S. persons.

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require investment of time and money, both in the technology of these platforms and in the corporate culture of the organizations that own them.

6 RELATED WORK
6.1 Online Advertising
Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof-of-concept systems and were not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented their transparency archive and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising
Analysis of political television ads has been the focus of most prior political advertising studies, likely because this data is published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest work to ours is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS
We have performed an analysis of the ads related to U.S. politics that we were able to collect from Facebook, Google, and Twitter's transparency archives. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES

[1] 2009-03-24. Citizens United v. Federal Election Commission.

[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018 - Network and Distributed Systems Security Symposium. San Diego, United States, 18–21.

[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531–1542.

[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency

[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive

[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report

[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506

[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974

[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api

[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127–150.

[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library

[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html

[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81–87.

[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265–278.

[15] Maggie Haberman, Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online

[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474–482.

[17] Shikhar Sakhuja, Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis

[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.

[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554–566. https://doi.org/10.1145/2810103.2813614

[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783

[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules

[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online

[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook

[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.

[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676–703.

[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5–19.

[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them

[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers

[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency

[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html

[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads

[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79–91. https://doi.org/10.1145/2815675.2815694

[33] G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. P. Gummadi, P. Loiseau, and O. Goga. 2018. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In 2018 IEEE Symposium on Security and Privacy (SP). 89–107.

[34] Giridhari Venkatadri, Yabing Liu, Athanasios Andreou, Oana Goga, Patrick Loiseau, Alan Mislove, and Krishna P. Gummadi. 2018. Privacy Risks with Facebook's PII-based Targeting: Auditing a Data Broker's Advertising Interface. In Proceedings of the IEEE Symposium on Security and Privacy (IEEE S&P'18). San Francisco, CA, USA.

[35] David West. 2018. Air wars: television advertising and social media in election campaigns, 1952-2016. SAGE/CQ Press.



bypassing of the current implementations of these transparency archives. Based on our experiences analyzing these transparency archives, we provide several suggestions of how these archives could be made more robust and useful.

We have made public as much of the data that we have collected as possible, as well as all of our labeling, data collection, and analysis scripts. Our initial reports have been the basis for many journalists' stories, which have improved online political advertising transparency.

2 BACKGROUND

Facebook, Google, and Twitter have all deployed political advertising transparency archives. However, each of these transparency archives has different criteria for the inclusion of ads and different modes of access, which we summarize as a taxonomy in Table 1. Each of these transparency archive implementations has strengths and weaknesses, and there is currently no "best" archive.

                     Facebook                     Google                              Twitter
Ads Included         All candidates, issue ads    Federal candidate related           Federal candidate, issue ads
Sponsor Info         Name                         Name, FEC/EIN                       Name, billing info
Ad Contents          Ranges                       Ranges                              Exact
Viewed Audience      Gender, age, geolocation     N/A                                 Gender, age, geolocation
Targeting Info       N/A                          Age, gender, geolocation, keywords  Age, gender, geolocation
Data Availability    Portal, API (w/ NDA)         Portal, Database                    Portal, API

Table 1: Transparency Implementations

What is an ad? For purposes of this study, an ad is a record in an archive with a unique id assigned by the platform. While each platform has slightly different information associated with each record, each has 3 categories of information associated with it: content, context, and results. The contents of an ad consist of any text, images, and/or videos seen by an ad viewer. The context of the ad is the information specified by the advertiser about how, when, where, and by whom they want the ad to be seen, and how much they are willing to spend for the ad to be seen. Both the content and context of the ad are specified by the advertiser at the time of ad creation. The results of the ad consist of information about who ultimately interacted with the ad, when those interactions happened, and how much was ultimately spent on the ad. Not all of this information is made available about each record in any of the archives, but all of this information is made available by at least one archive.

Archive Completeness. All archives contain a unique id for each ad, impression counts and amount spent for each advertisement, as well as the dates the ad was active. Facebook and Google release ad impressions and spend in ranges, while Twitter releases these as exact numbers. Twitter's implementation of releasing exact impression and spend information for every ad enables us to more precisely measure political advertising on their platform. It would help remove uncertainty if Facebook and Google would also release exact ad impressions and spend amounts instead of ranges. We also note that impressions are an imperfect metric for measuring the "reach" of an ad, and it would be useful to also include click and other interaction metrics recorded by the platforms. Facebook and Twitter include the ad text, image, and video content. Google's archive contains a link to a webpage where this content is viewable, but does not contain the content itself. Twitter is the only platform to release targeting information, including whether users were targeted by geography, age, or gender. Facebook and Google have detailed information about how users are targeted for each ad based on advertiser-produced lists of personally identifiable information, groups they belong to, demographic or income information that the platform has about the user, geography, or keyword search information; however, they do not currently make any of this information available in their archives.

Ads Included. Facebook has the most inclusive policy and includes in their archive ads that meet any of the following criteria: "(1) Is made by, on behalf of, or about a current or former candidate for public office, a political party, a political action committee, or advocates for the outcome of an election to public office; (2) Relates to any election, referendum, or ballot initiative, including 'get out the vote' or election information campaigns; (3) Relates to any national legislative issue of public importance in any place where the ad is being run [8]; (4) Is regulated as political advertising" [7]. Facebook allows advertisers to opt into including their ads in the archive. In order to enforce their policy, Facebook uses a combination of user reports and machine learning algorithms to "catch" political ads where the sponsor did not opt into making them transparent.

Google includes only "ads related to elections or issues that feature a federal candidate or officeholder" [11]. Google has stated that they plan on expanding the set of ads included in their archive. It is unclear how Google enforces their policy.

Twitter's original policy was limited to only including ads sponsored directly by federal candidates. However, Twitter has since expanded their policy to include "(1) Ads that refer to an election or a clearly identified candidate; (2) Ads that advocate for legislative issues of national importance. A clearly identified candidate refers to any candidate running for federal, state, or local election" [30]. Based on our analysis, it appears that Twitter is currently not enforcing their policy well.

Sponsors' Info. Facebook only displays a text string which the sponsor provides and which is intended to identify who is paying for the ad. We went through the vetting process of becoming a political advertiser on Facebook's platform. This entailed uploading a U.S. identification card, which was approved approximately five minutes later, at which time we could start posting political ads. Facebook also validates the address of the advertiser by sending them a postcard, which must be replied to within 30 days or else the advertiser will be suspended. However, during this 30-day grace period, political advertisers can post ads without validating their address. Our own disclosure to Facebook, and a subsequent independent experiment by Turton, a reporter from Vice News, showed that Facebook does not currently vet this text string, which allowed the reporter to post ads appearing to be from U.S. Senators [27]. This is a security issue that Facebook has acknowledged, but it claims there are no effective and scalable vetting techniques [21]. We will describe the issues that we found with this self-reported text string later in this paper.

Google provides both a text string and a Federal Election Commission (FEC) ID or EIN (U.S. Tax ID), which is vetted, for every political advertiser in their archive. Twitter provides a text string and, when available, the billing info of the political sponsors. Twitter initially did not vet this information but started vetting sponsors' EINs, similar to Google, on September 30th, 2018. Providing a consistent and easy-to-reference identifier, such as an FEC ID or EIN, for each sponsor enables us to better study sponsors in Google and Twitter's archives.

Viewed Audience. Facebook and Twitter both include breakdowns of the impression viewing audience by gender, age range, and state-level geolocation. Google provides this information as a heat map image from which we cannot currently extract data for our analysis; thus we mark Google as "N/A" for this category. Google has replied that they will work on releasing this information in a format that we can analyze.

Targeting Info. Facebook does not include any explicit targeting information in their archive. Google and Twitter make transparent age, gender, and geolocation-based targeting, but do not appear to release other types of targeting criteria, such as audience and content, which are allowed by their advertising platforms. Google also makes transparent some aggregated keyword targeting data. All of the platforms release only partial targeting information at best, which obscures a key facet of online political advertising.

Data Availability. Facebook initially only provided a keyword-based portal that was designed for small-scale interactive user exploration of the ads in the archive. Facebook enabled anti-scraping functionality in July 2018 that makes it difficult to collect large-scale data by scraping this portal. In September 2018, Facebook released an API that is currently in beta testing, which we have access to after signing an NDA stipulating that we will not publicly release raw data collected from the API. This effectively means that only a small set of U.S. organizations participating in the API beta test have large-scale data collection access to ads in the Facebook archive.

Twitter has provided an open API and a list of all accounts included in their transparency archive, which allows us to effectively collect all ads included in their archive. Google implemented a portal similar to Facebook's and also releases a BigQuery (SQL-like) database of all the ads included in their archive, which is updated weekly. For our use case of large-scale data analysis, this database format is ideal.

3 DATA COLLECTION METHODOLOGY

3.1 Facebook

Initially, we scraped Facebook's archive using a list of keywords that included elected positions (e.g., governor, judge, senator), U.S. state names, and key political issues (e.g., health care, immigration, taxes). Around the end of July 2018, Facebook implemented anti-scraping measures which blocked our scraper. Thus we had no viable means of collecting large-scale ad data until Facebook implemented their API in September 2018. We have publicly released a report and all of the data that we collected by scraping Facebook's archive user portal before our scraper was blocked [17].

We are part of Facebook's Political Ad Archive API beta testing program [9], which allows us to query Facebook's Political Ad Archive for specific keyword terms, which are matched against the page name, the disclaimer, or the ad text. Ads returned by Facebook's API are ordered using a proprietary ranking algorithm whose functioning was not described to us. However, most advertisements appear to be returned in chronological order. A single query to Facebook's API returns at most 1,000 ads, and we can collect additional ads using the pagination functionality of the API. Currently, there is a limitation in Facebook's Political Ad Archive API beta that prevents us from paging past 8,000 ads. This is problematic because many searches will return far more than 8,000 results.
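As an illustration only, the paging behavior described above can be sketched in Python. The function `fetch_page` is a hypothetical stand-in for a single API call (it is not part of any real client library), and the 8,000-ad cap is taken from the text. The sketch flags queries that hit the cap so that they can be re-issued with narrower search terms.

```python
from typing import Callable, Dict, List, Optional, Tuple

PAGE_CAP = 8000  # paging limit reported for the beta API in the text

# Hypothetical signature for one API call: given a search term and an opaque
# cursor, return (ads on this page, next cursor or None when exhausted).
FetchPage = Callable[[str, Optional[str]], Tuple[List[Dict], Optional[str]]]


def collect_ads(term: str, fetch_page: FetchPage) -> Tuple[List[Dict], bool]:
    """Page through results for one term; report whether the cap cut us off."""
    ads: List[Dict] = []
    cursor: Optional[str] = None
    while len(ads) < PAGE_CAP:
        page, cursor = fetch_page(term, cursor)
        ads.extend(page)
        if not page or cursor is None:
            # No further pages: the result set for this term is complete.
            return ads, False
    # Cap reached with pages remaining: results are possibly truncated.
    return ads[:PAGE_CAP], True
```

A query that returns `True` as its second value is a candidate for splitting, for example by searching per advertising page rather than per keyword.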

Information on spend and impressions per ad is only available in broad ranges. For impressions, the ranges presented are: 0 - 999; 1,000 - 4,999; 5,000 - 9,999; 10,000 - 49,999; 50,000 - 99,999; 100,000 - 199,999; 200,000 - 499,999; 500,000 - 999,999. For spend, the ranges presented are: 0 - 99; 100 - 499; 500 - 999; 1,000 - 4,999; 5,000 - 9,999; 10,000 - 49,999; 50,000 - 99,999; 100,000 - 199,999; 200,000 - 499,999; 500,000 - 999,999.

Additionally, the API has very low rate limits. We have found that, functionally, we could make at most 3 requests per minute on average before hitting these rate limits. Our goal was to create as comprehensive and representative a dataset as possible. Given the very low rate limits and limits on the number of responses for a given search, our approach was to search by advertising page as much as possible in order to reduce the bias in our data. We are currently able to keep up with the rate of new advertisements appearing in Facebook's political ad transparency archive. We cannot publicly release the raw data that we have collected from Facebook's API due to the agreement that we have signed with Facebook as a requirement for access to their API.
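To make the rate limit concrete, a minimal client-side throttle is sketched below. It simply spaces calls at least 20 seconds apart (3 per minute, the figure reported above). The class name and its parameters are illustrative assumptions, not part of any platform SDK; the clock and sleep functions are injectable only so the behavior can be checked without waiting.

```python
import time
from typing import Callable


class Throttle:
    """Space successive calls at least 60 / per_minute seconds apart."""

    def __init__(self,
                 per_minute: float = 3.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.interval = 60.0 / per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0  # earliest time the next call may start

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve its slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Calling `wait()` immediately before each request keeps a collector under the observed limit on average, at the cost of roughly 4,320 requests per day at most.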

We created a separate approach for discovering pages that are linked to sponsored political ads. Our approach to discovering pages involved scraping Facebook's Political Ad Archive user portal interface. We chose a scraping method for page discovery since our access to Facebook's API is highly rate limited, and it would be logistically infeasible to perform the queries required for both page discovery and ad collection using our API access. Our list was not a complete list of advertisers using Facebook's platform, since it depends on good coverage from our keyword searches.

Facebook started publishing a comprehensive list after the cutoff for the data analyzed in this study. Our data collection from Facebook's archive is our best effort and was incomplete, based on analysis of what is contained in Facebook's transparency report. We have changed our data collection methodology moving forward to discover Facebook pages running political ads using Facebook's weekly transparency reports. This, combined with improvements to the API that we requested and Facebook implemented after the data collection period for this study, will improve our coverage to a mostly complete set of the U.S. political ads Facebook has included in their transparency archive.

3.2 Google

Google published their archive as a public dataset in a BigQuery (SQL-like) format and committed to keeping it public. However, we observed that ad spend and impression values for ads, and occasionally advertiser information, were being changed after their entry into the archive. For this reason, we created separate archives of the dataset on a weekly basis. Additionally, ad text information was not available in the dataset itself but was viewable at a summary page for each ad. We scraped all of these associated pages to collect ad text to be associated in our dataset with the underlying ad data. Unfortunately, many of these summary pages did not render correctly, so we were only able to collect ad text for approximately 66% of pages which contained it. Two separate issues prevented collection. First, some ad pages display the message:

"Advertisers are able to use approved third party vendors to serve ads on Google. While we are able to review these ads for compliance with advertising policies, due to technical limitations, we are currently unable to display the content of the ad in the Transparency Report."

Second, some ad pages displayed the message:

"Policy violation: This ad violated Google's Advertising Policy."

We recommend that Google change their implementation so that ads served by third party vendors, or which were deleted for compliance reasons, are still accessible through their transparency archive. Google can place a click-through disclaimer to avoid accidental exposure to policy-violating content, similar to what Facebook has implemented for deleted advertisements.

Information on spend and impressions per ad is only available in broad ranges. For impressions, the ranges presented are "<= 10k", "10k-100k", "100k-1M", "1M-10M", and "> 10M". For spend, the ranges presented are "< 100", "100-1k", "1k-50k", "50k-100k", and "> 100k". The ads in this dataset are a combination of text-only ads that are displayed alongside Google search results and image or video-only ads that are displayed as banner or sidebar ads on Google's AdSense network.
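For analysis, range labels of this kind have to be turned into numeric bounds. The following sketch shows one way to do so, assuming the labels appear exactly in the forms listed above; the function names are illustrative. Open-ended ranges get an upper bound of `None`, and "<" is treated like "<=" for simplicity, which slightly overstates the upper bound.

```python
from typing import Optional, Tuple

_SUFFIX = {"k": 1_000, "M": 1_000_000}


def _to_int(token: str) -> int:
    """Convert a token such as '10k' or '1M' or '100' to an integer."""
    token = token.strip()
    if token and token[-1] in _SUFFIX:
        return int(float(token[:-1]) * _SUFFIX[token[-1]])
    return int(token)


def parse_range(label: str) -> Tuple[int, Optional[int]]:
    """Return (lower bound, upper bound); upper is None when open-ended."""
    label = label.strip()
    if label.startswith("<="):
        return 0, _to_int(label[2:])
    if label.startswith("<"):
        return 0, _to_int(label[1:])
    if label.startswith(">"):
        return _to_int(label[1:]), None
    low, high = label.split("-")
    return _to_int(low), _to_int(high)
```

The `None` upper bound is the reason a maximum total cannot be computed from per-ad ranges alone when any ad falls in the top bucket.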

In addition to per-ad data, Google also published some aggregate data on a per-advertiser and geographic basis. One of these aggregations was exact weekly spend per advertiser. Throughout this paper, we present minimum numbers for impressions and spend because both Google and Facebook publish ranges for impressions and spend for each ad instead of exact numbers. These total aggregations give us a sense of how much error there is when we use these minimum estimates for Google. According to Google, advertisers spent $45 M on political ads, but our minimum estimate of spending was only $11 M.

3.3 Twitter

Twitter publishes a list of all political campaigning advertisers [28], which we scrape daily to discover new political campaign advertisers' Twitter accounts. In addition to this list provided by Twitter, we have also manually attempted to identify every federal election candidate's personal or campaign Twitter account. We then query each account daily using Twitter's API to collect updated information on all promoted tweets and to detect federal election candidates who are not listed on Twitter's political campaigning advertisers page but are sponsoring tweets. During our scraping, we have noticed that some promoted tweets were deleted and replaced with the text:

"This Tweet is not available because it includes content that violated Twitter Ads Policies."

The information for these deleted promoted tweets is no longer accessible through Twitter's political transparency archive. However, if we scraped them before they were deleted, we have retained the content and information about these promoted tweets. We recommend that Twitter change their implementation so that promoted tweets which were deleted are still accessible through their transparency archive. Twitter can place a click-through disclaimer to avoid accidental exposure to policy-violating content, similar to what Facebook has implemented for deleted advertisements. We have made public all of the data that we have collected from Twitter's transparency archive.

Additionally, we noted that there were several accounts of federal candidates that were not being archived according to Twitter's policies. We would find these ads during our regular scrapes for ads by all federal candidates, but no billing or impression data would be available, and the ads would disappear from Twitter's archive after 7 days, as is typical for non-political ads. We notified Twitter about 4 accounts, which they subsequently added to their transparency archive. However, Twitter did not retroactively include the prior promoted tweets from these accounts, and there are currently 11 additional federal candidate accounts which have promoted tweets not included in the archive. Thus it appears that Twitter's process for flagging federal election candidates' accounts that should be included in their archive is not working correctly. We will disclose this new set of 11 accounts to Twitter and continue to work with them to improve their process for discovering and including relevant promoted tweets in their transparency archive.
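The check described above amounts to a set difference between candidate accounts seen promoting tweets and the accounts on the platform's published list. A minimal sketch follows; the function and argument names are hypothetical, and the keys of `promoted_counts` are assumed to be lower-cased account handles.

```python
from typing import Dict, Iterable, List, Set


def unlisted_sponsors(candidate_accounts: Iterable[str],
                      listed_accounts: Iterable[str],
                      promoted_counts: Dict[str, int]) -> List[str]:
    """Candidate accounts with promoted tweets that are absent from the list."""
    listed: Set[str] = {a.lower() for a in listed_accounts}
    candidates: Set[str] = {a.lower() for a in candidate_accounts}
    return sorted(
        account for account in candidates
        if account not in listed and promoted_counts.get(account, 0) > 0
    )
```

Handles are compared case-insensitively because the same account may be written with different capitalization in different sources.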

4 DATASETS

We have collected all of the U.S. political ad data that Google and Twitter have made transparent and archived as of October 21st, 2018. In addition, we have made our best effort to collect as much as possible of the U.S. political ad data that Facebook has made transparent and archived as of October 21st, 2018. For Facebook, we are not able to collect all of the ad data from their transparency archive due to the limitations in their API; our data is therefore a subset of the U.S. political ads that ran on Facebook. Note that our scraper was blocked by Facebook in mid-July 2018, and we were not able to collect data until the beginning of September 2018, when we began to use their beta API. This means that we do not have good coverage of Facebook ads during that period, since it is difficult to retrieve older ads from Facebook's current beta API. On October 23rd, 2018, shortly after we froze our dataset, Facebook released their Ad Archive Report [6]. From this we know that, as of the cutoff date for data analyzed in this study, Facebook had a total of 1.67M ads in their archive, from a total of $256M spent across 78K pages. We have captured over 75% of all ads in the archive, but only 49% of the pages.

On the Facebook platform, ads that run without a 'Paid for by' label but are later deemed to be political are removed from circulation and added to the archive. We have been able to find 96,106 such ads in the archive, with a total spend of at least $428 million and 670 million impressions. It does not appear that Google or Twitter have any mechanism for retroactively marking an ad as political if it is discovered after the fact, and we would encourage them to develop this capacity.


Table 2 shows all of the data that we have collected from each of the platforms. Most of the political ads in these archives are from late May 2018 to October 21st, 2018, but there are several older ads from Twitter and Facebook that have been included in their transparency archives. Facebook has the most advertisers, ads, impressions, and spend. However, Facebook also includes many political issue ads in their transparency archive that are not included in Google and Twitter's transparency archives, so this is not a fair comparison of political advertising activity across all three platforms. An important difference between the datasets is that while Facebook and Twitter are publishing breakdowns of impressions along geographic and demographic lines, Google is instead publishing geographic and demographic targetings. These should not be considered equivalent. Below, in the analysis section, we will present a more accurate comparison of political advertising activity across all three platforms.

Additionally, we use a dataset published by ProPublica of political ads that have been viewed by their users who have installed browser extensions that automatically collected advertisements on their Facebook pages and sent them to ProPublica's servers [23]. Table 3 provides an overview of this dataset. We were able to connect ads in the ProPublica dataset to ads in our dataset of archived political ads by mapping the ad IDs used in the ProPublica dataset to the ad archive IDs used in the archive. To do this, we scraped Facebook's web-based political ad archive, as both the ad IDs and ad archive IDs were available there. Each record in this dataset contains, among other things, the text of the ad, the various targetings received by the different users who saw the ad, the page associated with the ad, and the 'Paid for by' ad sponsor string associated with the ad. Of the 33,308 ads in the ProPublica dataset with a creation date after May 7th, 2018, the official start of the Facebook dataset, we were able to find 18,010. Because the users who contribute data to this dataset are self-selecting, these ads should not be considered a representative sample of ads in the larger Facebook Ad Archive. Among other things, the average ad spend on ads in this dataset was $644, compared to $107 for the larger dataset.
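The linkage step can be pictured as a two-stage lookup: ProPublica ad ID to archive ID, then archive ID to the archived record. The sketch below is illustrative only; the field name `ad_id` and the dictionary shapes are assumptions rather than the actual schema of either dataset.

```python
from typing import Dict, List, Tuple


def link_ads(propublica_ads: List[dict],
             ad_id_to_archive_id: Dict[str, str],
             archive_ads: Dict[str, dict]
             ) -> Tuple[List[Tuple[dict, dict]], List[str]]:
    """Pair each ProPublica ad with its archive record where one is found.

    Returns (linked pairs, ad IDs that could not be matched).
    """
    linked: List[Tuple[dict, dict]] = []
    unmatched: List[str] = []
    for ad in propublica_ads:
        archive_id = ad_id_to_archive_id.get(ad["ad_id"])
        record = archive_ads.get(archive_id) if archive_id is not None else None
        if record is None:
            unmatched.append(ad["ad_id"])
        else:
            linked.append((ad, record))
    return linked, unmatched
```

Keeping the unmatched IDs makes the match rate (18,010 of 33,308 in the text) directly reportable.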

As part of our analysis, we manually categorized the top advertisers on all three platforms. We categorized these advertisers by organization type (political candidate, Political Action Committee (PAC), Union, For Profit, etc.). For Facebook, we were able to classify the organizations of the advertisers who were responsible for at least 75% of the total number of ads in the Facebook archive. For Google, we labeled the organization of top advertisers who were responsible for 80% of the total number of ads, and for Twitter, we were able to label all 88 advertisers with their organization type. We were able to categorize 12,833 of the top ad sponsors. If we were not able to categorize an advertiser, it is marked as 'Unknown'.

We also classified the ads themselves into 5 categories: Inform, Connect, Donate, Move, or Commercial. Inform ads seek to persuade the viewer but do not make an explicit ask. Connect ads seek the user's contact information. Donate ads seek the user's money. Move ads attempt to motivate the user to take some action in the physical world, such as attending a rally or voting. Commercial ads seek to sell the user goods or services. We classified the ads based on the outgoing links from the ads. Ads that had no outgoing links were always classified as Inform ads, as they could not have any further ask from the user. Ads that linked directly to third-party sites for event management (eventbrite.com), contact management (Google Docs), or payments management (actblue.com) were solely classified as Move, Connect, or Donate ads, respectively. Ads that linked to general campaign sites were usually multiple-classed as some combination of the three, as these ads and pages typically made multiple asks. Ads by For Profit Media organizations were classified as Inform ads, as these advertisers do not sell goods or services to users. Ads by For Profit organizations that linked to store sites or sites selling services were classified as Commercial. We were able to categorize 907,840 ads with these methods. Heavy use of third-party service providers by advertisers was extremely helpful in making these classifications. If we were not able to categorize an ad, it was marked as 'Unknown'. We validated this method of ad categorization by taking a random sample of 300 categorized ads from each platform and manually verifying them. The error rate for Facebook was 4, for Google was 37, and for Twitter was 37.

Figure 1: Distribution of ads by size
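The link-based part of this classification lends itself to a small rule table. The sketch below covers only the rules for ads with no links and for the three third-party services named above; the host `docs.google.com` is an assumption about where Google Docs forms are served, and the handling of campaign sites, media organizations, and store sites described in the text is not modeled.

```python
from typing import Iterable, Set
from urllib.parse import urlparse

# Third-party service domain -> ad category, following the rules in the text.
DOMAIN_CATEGORY = {
    "eventbrite.com": "Move",
    "docs.google.com": "Connect",  # assumed host for Google Docs
    "actblue.com": "Donate",
}


def classify_ad(links: Iterable[str]) -> Set[str]:
    """Return the set of categories implied by an ad's outgoing links."""
    links = list(links)
    if not links:
        # No outgoing link means no further ask of the user.
        return {"Inform"}
    categories: Set[str] = set()
    for link in links:
        host = (urlparse(link).hostname or "").lower()
        for domain, category in DOMAIN_CATEGORY.items():
            if host == domain or host.endswith("." + domain):
                categories.add(category)
    return categories or {"Unknown"}
```

Returning a set rather than a single label leaves room for ads that make several asks at once.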

A limitation that applies to all our datasets is that we do not know when the spend and impressions for each ad occurred during the lifetime of the ad. Some ads run for several weeks and some for only a day, but in either case we attribute their entire spend and total impressions to the creation date of the ad.

5 RESULTS

We calculate total spend and impression minimums and maximums for Facebook ads by summing, respectively, the smallest and largest value of the range given for each ad. For Google, advertiser weekly spend data was aggregated for all advertisers, so we did not have to estimate that number. For Twitter, exact numbers for impressions and spend were available, so no estimation was needed. We also note that we are only able to collect a subset of political advertisements from Facebook's transparency archive due to accessibility issues with their beta API. We stress that because the criteria for inclusion in these archives differed across the platforms, the figures on relative proportions of ad types and advertiser types should be seen as a reflection of what the platforms chose to make transparent, in addition to what is organically present on these platforms.
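The minimum and maximum totals for range-valued data reduce to two sums, one over the lower ends of the per-ad ranges and one over the upper ends. A minimal sketch, with an illustrative function name and example values drawn from the Facebook range buckets:

```python
from typing import Iterable, Tuple


def total_bounds(ranges: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Sum per-ad (low, high) ranges into an overall (minimum, maximum)."""
    minimum = 0
    maximum = 0
    for low, high in ranges:
        minimum += low
        maximum += high
    return minimum, maximum


# Three ads falling in the 0 - 999, 1,000 - 4,999 and 0 - 999 buckets.
example = total_bounds([(0, 999), (1000, 4999), (0, 999)])
```

Here `example` is `(1000, 6997)`, which shows how quickly the gap between the bounds widens when most ads sit in the smallest bucket.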

With that in mind, we can see clear differences between the platforms. Of particular note is the difference in ad size visible in Figure 1, with Facebook having a much larger share of the smallest ad size. Also of note is the differing prevalence of types of advertisers in Figure 3, with PACs making up a much larger percentage of spend on Google compared with the other platforms.


Platform   Total Ads   Total Sponsors   Total Pages   Impressions        Spend             First Ad Date         Last Ad Date
Facebook   1.26 M      24 K             38 K          7.35 B - 21.12 B   $135 M - $567 M   July 14th, 2014       October 21st, 2018
Google     41 K        616              N/A           1.3 B - 11.6 B     $45 M             May 31st, 2018        October 21st, 2018
Twitter    1808        88               N/A           118 M              $16 M             December 21st, 2016   October 21st, 2018

Table 2: Overall Datasets

Total Ads           81,052
Total Pages         2,363
Total Ad Sponsors   2,395
Earliest Ad Date    July 31st, 2017
Latest Ad Date      October 18th, 2018

Table 3: ProPublica Political Advertisements From Facebook

Figure 2: Distribution of ads by type

Figure 3: Distribution of spend by advertiser type

5.1 Data Over Time

The time period during which we were collecting data coincided with the 2018 midterm elections in the United States. Thus we were able to observe changing patterns in spend leading up to a major election. Figure 4 shows spend by week for the 5 month period leading up to the election, and Figure 5 shows raw ad count for the same period. We note that our data, particularly for Facebook spend, is right-censored for the final two weeks. This is caused by Facebook's API limitations, which only enable us to recheck ad spends weekly. Thus newly created ads have likely not spent much of their budget when we initially discover them. This right-censoring effect also likely affects Google and Twitter to a lesser degree, due to ads with larger budgets that take several days to spend down completely. This can be corrected by periodically rechecking the ads until they have all spent their budgets, which is normally within a week. If the paper is accepted, we will update the data to include ads up to the U.S. midterm elections.

Results                Facebook        Google         Twitter
Total Advertisers      1 K             534            54
Total Ads              161 K           15 K           1 K
Total Impressions      800 M - 24 B    280 M - 3 B    100 M
Total USD Spend        $12 M - $60 M   $135 M         $14 M
Ave. Impressions/Ad    5 K - 15 K      32 K - 283 K   65 K
Ave. USD Spend/Ad      $74 - $373      $1 K           $885

Table 4: Federal Candidate Only Results

We can see the expected increases in the number of ads on all three platforms as the U.S. midterm elections approach. On Facebook's platform there is an increase in connect ads, and on Twitter there is an increase in move ads. Both of these are related to sophisticated "get out the vote" efforts that many groups have deployed. These move ads include images with specific polling place addresses and websites that provide polling place directions and information. The connect ads often provide users with instructions on how they can volunteer to help with early and day-of voter turnout efforts. The spending spikes on Facebook's platform can be attributed to a few unknown sponsors that we could not link to a legally registered entity, but that were likely quasi for-profit advertisers, which we will discuss further later in the paper. The spending spikes on Twitter's platform can be attributed to candidates who ran a few ads with larger budgets.

5.2 Federal Candidate Comparison
In order to understand how political advertising differs across these platforms, we attempted to create a comparable subset of advertisers and ads. This is difficult because each platform has slightly different criteria for inclusion. To do this, we present results only for advertising paid for by candidates for federal office, which was the broadest set that was reliably included in all three archives. Note that this does not include ads by current officeholders who are not seeking re-election, or ads that merely mention a federal candidate but are paid for by another party. Results for these advertisers are presented in Table 4.

Figure 4: Platform Ad Count By Ad Type By Week

Figure 5: Platform Spend By Advertiser Type By Week

Figure 6: Federal Candidate ads by Size

Figure 7: Federal Candidate Spend by Ad Type

Table 4 shows that Facebook is the platform with the broadest appeal to federal candidate advertisers, with far more advertisers and ads than Google. However, political advertising by this group on Google appears to generate more spend and possibly more impressions than ads on Facebook. The average ad size on Facebook, in terms of both impressions and spend, is the smallest based on our minimum estimates, indicating that advertisers are running smaller, likely more targeted ads on Facebook. These small ads on Facebook are what are called micro-targeted, which we define as less than 1000 impressions or a spend of less than $100. For Facebook, micro-targeted ads make up 81% of the overall number of ads for federal candidates in our dataset. For Twitter this number is 62%, and for Google it is 54%. Figure 6 shows the share of ads by size of spend, and here we can begin to see how federal candidates use these platforms in different ways. We note that the distribution of ads by size for federal candidates in Figure 6 is very similar to the overall distribution of ads by size seen in Figure 1.
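The micro-targeting threshold defined above (fewer than 1000 impressions, or a spend of less than $100) can be expressed as a simple predicate. The sketch below is illustrative only: the function names are hypothetical, and it assumes each ad has already been reduced to a single impressions estimate and a single spend estimate, which for range-valued Facebook data requires choosing a bound.

```python
from typing import Iterable, Tuple

IMPRESSION_THRESHOLD = 1000  # from the definition in the text
SPEND_THRESHOLD_USD = 100    # from the definition in the text


def is_micro_targeted(impressions: int, spend_usd: float) -> bool:
    """True if the ad falls under either threshold."""
    return impressions < IMPRESSION_THRESHOLD or spend_usd < SPEND_THRESHOLD_USD


def micro_targeted_share(ads: Iterable[Tuple[int, float]]) -> float:
    """Fraction of (impressions, spend) pairs that are micro-targeted."""
    ads = list(ads)
    if not ads:
        return 0.0
    return sum(is_micro_targeted(i, s) for i, s in ads) / len(ads)


# Hypothetical ads as (impressions, spend in USD); values are illustrative.
sample = [(500, 50.0), (5000, 80.0), (20000, 1500.0), (999, 2000.0)]
print(micro_targeted_share(sample))  # 0.75
```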

Figure 7 shows the relative spend on different ad platforms, where we see very different percentages for types of ads. Commercial ads are not shown in this figure because there were too few commercial ads to be visible. Particularly of note is the fact that ads seeking donations were far more common on the Google platform, and ads seeking to spread a message ('Inform') were much more common on Facebook.

Seeing these differences in both ad size and the types of ads that were run, we wanted to understand if advertisers were trying to reach different geographic audiences with different types of ads. To do this, we compared the number of regions in which various ads had impressions on Facebook and Twitter, and the number of regions targeted for Google. Figure 8 shows that a variation in targeting strategy is visible on Facebook and Twitter. On Facebook, 'Move' ads that encouraged people to attend a rally, volunteer for a candidate, or take part in some other in-person activity were viewed on average in 4 regions, while 'Donate' ads were viewed in 24 regions on average. This makes a certain amount of intuitive sense: people are willing to travel only so far to attend a rally, but can donate to candidates anywhere in the United States.

Figure 8: CDF of Regions by Ad Type for Federal Candidates

5.3 Ad Targeting
One of the deficiencies of the Facebook political ad archive is that while it does share geographic and demographic information about who saw a particular ad, we have no way of knowing how that ad was targeted. However, we were able to connect our dataset, which contains information about who consumed ads, with one published by ProPublica, which contains some data about how ads were targeted. The ProPublica data was collected by a browser plugin operated by ProPublica, which anyone can install. ProPublica's browser plugin [22] uses a supervised Natural Language Processing (NLP) classifier to detect political ads, in addition to allowing users to manually classify ads they see as political. The browser plugin then collects the partial ad targeting explanations Facebook provides by automatically clicking on the "Why am I seeing this?" button for political ads, and sends them to ProPublica to make public.

We first provide a brief background on Facebook targeting audience options [34]. Facebook exposes prospective advertisers to a plethora of options. First, advertisers can target users based on age, gender, location, and languages they speak. Second, advertisers can choose to send their ads to users in a custom audience or lookalike audience. Custom audiences contain a list of identifiers of specific users. Advertisers can use various types of data to create a custom audience list, ranging from specifying the emails, phone numbers, or physical addresses of people they want to reach, to users that have visited their website, installed their mobile application, or liked their Facebook Page. Lookalike audiences allow advertisers to let Facebook choose to whom to send their ads based on previous campaigns. Finally, advertisers can choose, from a long list of targeting attributes, the characteristics they want users who receive their ads to have (e.g., users interested in Catholic Church). Targeting attributes are categorized in types such as demographics, behaviors, and interests. Advertisers can choose multiple attributes to target.

Figure 9: ProPublica Spend by Ad Type

A prior study by Andreou et al. [2] reverse engineered what Facebook chooses to show and the limitations of the ad targeting explanations Facebook provides. This study showed that ad explanations are incomplete: each explanation shows at most one targeting attribute (plus age/gender/location information), regardless of how many attributes the advertisers use. This means that explanations reveal only part of the targeting attributes that were used, providing us, and the users, with an incomplete picture of the attributes that advertisers were using. However, in the same study the authors performed a number of controlled experiments that suggest, but do not conclusively prove, that there is a logic behind which attributes appear in an explanation and which do not. Given a targeting audience A obtained from two attributes a1 and a2, if a1 and a2 come from different attribute categories (e.g., Demographic, Behavior, Interest, etc.), the attribute shown follows a specific precedence (Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors). If a1 and a2 come from the same attribute category, the one that appears in the explanation is the one with the highest estimated audience size. This will result in a systematic under-counting of lower priority targeting types.
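The selection logic suggested by that prior study can be sketched as follows. This is a hypothetical model of the reported behavior, not Facebook's implementation: the category labels, the `Attribute` type, and the function name are assumptions made for illustration, and the study itself notes that the rule is suggested rather than proven.

```python
from dataclasses import dataclass
from typing import List

# Precedence order reported in the text, highest priority first.
# The string labels are illustrative names, not Facebook identifiers.
PRECEDENCE = ["demographics", "interests", "pii_list", "behaviors"]


@dataclass(frozen=True)
class Attribute:
    name: str
    category: str       # one of PRECEDENCE
    audience_size: int  # estimated audience size for this attribute


def shown_attribute(attrs: List[Attribute]) -> Attribute:
    """Pick the single attribute an explanation would reveal.

    Higher-precedence categories win; within one category, the
    attribute with the largest estimated audience wins.
    """
    if not attrs:
        raise ValueError("at least one targeting attribute is required")
    return min(
        attrs,
        key=lambda a: (PRECEDENCE.index(a.category), -a.audience_size),
    )


# Hypothetical targeting for one ad (names and sizes are made up).
used = [
    Attribute("frequent travelers", "behaviors", 1_000_000),
    Attribute("hiking", "interests", 1_000),
    Attribute("camping", "interests", 5_000),
]
print(shown_attribute(used).name)  # camping
```

In this example the behavior-based attribute is never revealed even though it has the largest audience, which illustrates why lower-priority targeting types are systematically under-counted in explanation-derived data.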

There are two main sources of biases and limitations in ProPublica's dataset. One comes from which users installed ProPublica's plugin and which political ads they were shown. Another is from the way Facebook provides ad explanations. The ProPublica dataset is the only publicly available source of targeting information for Facebook political ads. Thus, we present these results to provide an initial insight into how Facebook political advertisers are targeting their ads, with the understanding of likely biases and limitations.

Figure 10: ProPublica Targeting by Advertiser Type

With these caveats in mind, we proceed to an analysis of the 18010 ads which we were able to connect between the ProPublica dataset and ours. In Figure 9 we see that different types of ads do indeed rely on different targeting strategies. Of particular note is the divergence of 'Commercial' ads, of which 74% rely on targeting by interest groups, and of 'Donate' ads, of which only 24% do. The average ad size did not differ significantly between targeting types, but was significantly larger than the average for the Facebook archive as a whole. We believe this to be an artifact of the collection mechanism, which is biased toward finding larger ads. 91% of ads in the overall ProPublica dataset had some kind of geographic targeting, and 92% had age or gender targeting. Ads in this dataset had on average 41 different targeting parameters, so for Facebook these should be thought of as a minimum criteria. By contrast, 58% of ads in the Google archive had no geographic targeting whatsoever, and 70% had neither age nor gender targeting.

In Figure 10 we also see diverging strategies between advertisers, with Political Candidates and PACs making heavy use of custom lists of users, and For Profit and For Profit Media companies relying far more on targeting users by their interests. Campaigns have numerous potential sources from which to compile lists of users. In addition to their own lists of donors and voter rolls, campaigns can rent lists from other candidates [15].

Both Google and Twitter offer advertisers similar targeting criteria to what we have described for Facebook, including custom audiences and lookalike audiences. Both even allow targeting of users based on interests, although they infer these interests in different ways. Both have made transparent demographic and geographic targeting information for ads in their archive, but without other targeting information this is an incomplete picture at best. We encourage Google and Twitter to, at minimum, follow Facebook's example and make transparent to users information about why they have been targeted for ads that they are seeing.

5.4 New Types of Political Advertisers
5.4.1 For-Profit Media. One advertiser type in particular proved to be an interesting outlier. The category 'For Profit Media' contains advertisers whose ads are not considered traditional news by Facebook (those ads are in a separate part of the archive that we did not include), but have content intended solely to entertain or sway the opinion of the viewer. Over the Facebook dataset as a whole, the average ad sponsor ran ads on 16 pages. Advertisers in the for-profit media category, however, ran ads on 32 pages on average. We have examined many of these for-profit media companies to understand why they are running ads across many Facebook pages. What we have found in numerous instances is unknown for-profit media companies that appear to be creating disingenuous communities, presented as "grassroots movements", to target different demographics and interests with a combination of paid and organic political messaging.

A good example of this type of advertiser is "New American Media Group LLC". This ad sponsor ran left-leaning ads on 10 different pages. These pages were designed to appeal to different demographics ("Melanin" for people of color, "The Soldier Network" for veterans, "Raising Tomorrow" for parents, etc.), but often ran the same content on multiple pages. While this LLC has an extremely similar name to a now-defunct genuine left-leaning media outlet (New America Media), it appears to have no connection to that prior group, and also appears to have no activity off of Facebook.

While some advertisers in this category were fairly traditional entertainment websites (e.g., Comedy Central), some were "for-profit" companies in name only that appeared to exist for no purpose other than to spread a particular political message and had no way of generating an actual profit. We also discovered "News for Democracy", an LLC that ran left-leaning ads on 14 different Facebook pages, most of which were designed to be appealing to groups with traditionally conservative viewpoints, such as "The Holy Tribune". Journalists investigated this LLC and linked it to MotiveAI, which is a liberal political advertising company.

5.4.2 Corporate Astroturfing. Corporations paying for political advertising is not an entirely new phenomenon, and it has traditionally been funded through industry trade groups and PACs. However, the reporting requirements by the FCC for U.S. political advertising on television often made this political messaging traceable to the real sponsor. These stricter reporting requirements do not apply to online political advertising, and the ad-hoc reporting requirements that online platforms have enacted are being abused by corporations and industry trade groups to undo transparency efforts.

We discovered in our analysis 355 ads sponsored by "Citizens for Tobacco Rights", which is not a registered company in the U.S. but does disclose on their website and Facebook page that it is operated by cigarette company Philip Morris. However, someone who only saw the Facebook ad disclaimer would not be able to connect the ads to Philip Morris without further investigation. Other journalists have found instances of oil and insurance lobbying groups that also provided sponsor names that did not match the legally incorporated entity sponsoring the ads [21]. These organizations are seemingly taking advantage of Facebook's policy of not vetting sponsor names, since some of these entities also ran political ads on Google's ad platform but provided Google with their EIN (tax ID) and the correct legally incorporated names of their organizations [21].

5.5 Discussion
The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives such that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts, and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion into the different platforms' archives, and what can be done to improve these issues.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online transparency sponsorship disclosure policies. As we discussed in Section 5.4.1, it is extremely easy for groups such as 'New American Media' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1], which struck down restrictions on election spending by for-profit corporations. However, political messaging advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business which paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so, with no intent of making a profit. As a private company, they do not need to publicly disclose their investors in the way that PACs are required to disclose their donors. Then the for-profit company can advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead when providing the 'ad sponsor' string associated with their ads, either intentionally or accidentally. Thus, it is effectively free to circumvent Facebook's transparency implementation. We see numerous instances of this occurring on Facebook's platform. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J Trump For President Inc sponsored ads on both the Donald J Trump page and the Mike Pence page. However, when sponsoring ads on the Donald J Trump page, the organization is known as 'Donald J Trump For President Inc', and when sponsoring ads on the Mike Pence page it is known as 'Donald J Trump For President Inc'. Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.
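Accidental variation in a self-reported sponsor string, such as differences in case, punctuation, or spacing, can be partly absorbed by normalizing names before grouping ads by sponsor. The sketch below is one hypothetical approach and is not described in the paper; the function names and the example variants are made up for illustration. It does nothing against intentionally misleading sponsor names.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set


def normalize_sponsor(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    lowered = name.casefold()
    no_punct = re.sub(r"[^\w\s]", "", lowered)
    return re.sub(r"\s+", " ", no_punct).strip()


def group_sponsors(names: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each normalized name to the raw variants that produced it."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for name in names:
        groups[normalize_sponsor(name)].add(name)
    return dict(groups)


# Hypothetical raw sponsor strings (illustrative variants only).
raw = [
    "Example Committee For Senate, Inc.",
    "EXAMPLE COMMITTEE FOR SENATE INC",
    "Example  Committee for Senate Inc",
    "Another Sponsor LLC",
]
for key, variants in group_sponsors(raw).items():
    print(key, len(variants))
```

A stable identifier such as an FEC ID or EIN, where a platform provides one, is a more reliable grouping key than any string normalization.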

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (e.g., LLCs, PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that we need more discussion about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content but are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data even with other researchers, or even discussing our findings directly with non-U.S. Persons.

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require investment of time and money, both in the technology of these platforms and in the corporate culture of the organizations that own them.

6 RELATED WORK
6.1 Online Advertising
Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof-of-concept systems and were not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented their transparency archive, and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising
Analysis of political television ads has been the focus of most prior political advertising studies, likely due to this data being publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest related to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS
We have performed an analysis of the ads related to U.S. politics that we were able to collect from Facebook, Google, and Twitter's transparency archives. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES
[1] 2009-03-24. Citizens United v. Federal Election Commission.
[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018 - Network and Distributed Systems Security Symposium. San Diego, United States, 18-21.
[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531-1542.
[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency
[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive
[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report
[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506
[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974
[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api
[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127-150.
[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library
[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html
[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81-87.
[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265-278.
[15] Maggie Haberman, Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online
[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474-482.
[17] Shikhar Sakhuja, Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis
[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.
[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554-566. https://doi.org/10.1145/2810103.2813614
[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783
[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules
[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online
[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook
[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.
[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676-703.
[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5-19.
[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them
[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers
[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency
[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html
[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads
[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79-91. https://doi.org/10.1145/2815675.2815694
[33] G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. P. Gummadi, P. Loiseau, and O. Goga. 2018. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In 2018 IEEE Symposium on Security and Privacy (SP). 89-107.
[34] Giridhari Venkatadri, Yabing Liu, Athanasios Andreou, Oana Goga, Patrick Loiseau, Alan Mislove, and Krishna P. Gummadi. 2018. Privacy Risks with Facebook's PII-based Targeting: Auditing a Data Broker's Advertising Interface. In Proceedings of the IEEE Symposium on Security and Privacy (IEEE S&P '18). San Francisco, CA, USA.
[35] David West. 2018. Air wars: television advertising and social media in election campaigns, 1952-2016. SAGE/CQ Press.


and scalable vetting techniques [21]. We will describe the issues that we found with this self-reported text string later in this paper.

Google provides both a text string and a Federal Election Commission (FEC) ID or EIN (U.S. Tax ID), which is vetted, for every political advertiser in their archive. Twitter provides a text string and, when available, the billing info of the political sponsors. Twitter initially did not vet this information, but started vetting sponsors' EINs, similar to Google, on September 30th, 2018. Providing a consistent and easy-to-reference identifier, such as an FEC ID or EIN, for each sponsor enables us to better study sponsors in Google and Twitter's archives.

Viewed Audience: Facebook and Twitter both include breakdowns of the impression viewing audience by gender, age range, and state-level geolocation. Google provides this information as a heat map image from which we cannot currently extract the information for our analysis; thus we mark Google as "NA" for this category. Google has replied that they will work on releasing this information in a format that we can analyze.

Targeting Info: Facebook does not include any explicit targeting information in their archive. Google and Twitter make transparent age, gender, and geolocation based targeting, but do not appear to release other types of targeting criteria, such as audience and content, which are allowed by their advertising platforms. Google also makes transparent some aggregated keyword targeting data. All of the platforms only release partial targeting information at best, which obscures a key facet of online political advertising.

Data Availability: Facebook initially only provided a keyword-based portal that was designed for small-scale interactive user exploration of the ads in the archive. Facebook enabled anti-scraping functionality in July 2018 that makes it difficult to collect large-scale data by scraping this portal. In September 2018, Facebook released an API that is currently in beta testing, which we have access to after signing an NDA stipulating that we will not publicly release raw data collected from the API. This effectively means that only a small set of U.S. organizations participating in the API beta test have large-scale data collection access to ads in the Facebook archive.

Twitter has provided an open API and a list of all accounts included in their transparency archive, which allows us to effectively collect all ads included in their archive. Google implemented a portal similar to Facebook's and also releases a BigQuery (SQL-like) database of all the ads included in their archive, which is updated weekly. For our use case of large-scale data analysis, this database format is ideal.

3 DATA COLLECTION METHODOLOGY

3.1 Facebook

Initially, we scraped Facebook's archive using a list of keywords that included elected positions (e.g., governor, judge, senator), U.S. state names, and key political issues (e.g., health care, immigration, taxes). Around the end of July 2018, Facebook implemented anti-scraping measures which blocked our scraper. Thus, we had no viable means of collecting large-scale ad data until Facebook implemented their API in September 2018. We have publicly released a report and all of the data that we collected by scraping Facebook's archive user portal before our scraper was blocked [17].

We are part of Facebook's Political Ad Archive API beta testing program [9], which allows us to query Facebook's Political Ad Archive for specific keyword terms, which are matched against the page name, the disclaimer, or the ad text. Ads returned by Facebook's API are ordered using a proprietary ranking algorithm whose workings were not described to us. However, most advertisements appear to be returned in chronological order. A single query to Facebook's API returns at most 1000 ads, and we can collect additional ads using the API's pagination functionality. Currently, there is a limitation in Facebook's Political Ad Archive API beta that prevents us from paging past 8000 ads. This is problematic because many searches return far more than 8000 results.

Information on spend and impressions per ad is only available in broad ranges. For impressions, the ranges presented are: 0 - 999, 1000 - 4999, 5000 - 9999, 10000 - 49999, 50000 - 99999, 100000 - 199999, 200000 - 499999, 500000 - 999999. For spend, the ranges presented are: 0 - 99, 100 - 499, 500 - 999, 1000 - 4999, 5000 - 9999, 10000 - 49999, 50000 - 99999, 100000 - 199999, 200000 - 499999, 500000 - 999999.

Additionally, the API has very low rate limits. We have found that, functionally, we could make at most 3 requests per minute on average before hitting these rate limits. Our goal was to create as comprehensive and representative a dataset as possible. Given the very low rate limits and the limits on the number of responses for a given search, our approach was to search by advertising page as much as possible in order to reduce the bias in our data. We are currently able to keep up with the rate of new advertisements appearing in Facebook's political ad transparency archive. We cannot publicly release the raw data that we have collected from Facebook's API due to the agreement that we signed with Facebook as a requirement for access to their API.
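The collection constraints described above (at most 1000 ads per query, no paging past 8000 ads, and roughly 3 requests per minute) can be sketched as a throttled paging loop. This is a minimal illustration under stated assumptions, not the collector used in this study: `fetch_page` is a hypothetical callable standing in for one API request, and the cursor-style interface is assumed rather than taken from Facebook's documentation.

```python
import time
from typing import Callable, Iterator, Optional, Tuple

PAGE_SIZE = 1000        # maximum ads returned by one query (from the text)
MAX_PAGEABLE = 8000     # the beta API does not page past this many ads
MIN_INTERVAL = 60 / 3   # about 3 requests per minute, in seconds

# Hypothetical signature: fetch_page(term, cursor) -> (ads, next_cursor)
FetchPage = Callable[[str, Optional[str]], Tuple[list, Optional[str]]]


def collect_ads(term: str, fetch_page: FetchPage,
                sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield ads for one search term, throttled and capped at MAX_PAGEABLE."""
    cursor, seen = None, 0
    while True:
        ads, cursor = fetch_page(term, cursor)
        # Never yield more than the pageable cap allows.
        for ad in ads[:MAX_PAGEABLE - seen]:
            yield ad
        seen += len(ads)
        if not ads or cursor is None or seen >= MAX_PAGEABLE:
            break
        sleep(MIN_INTERVAL)  # wait before issuing the next request
```

Because any single search stops at the cap, narrower searches (for example, one search per advertising page) are less likely to be truncated than broad keyword searches.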

We created a separate approach for discovering pages that are linked to sponsored political ads. Our approach to discovering pages involved scraping Facebook's Political Ad Archive user portal interface. We chose a scraping method for page discovery since our access to Facebook's API is highly rate limited, and it would be logistically infeasible to perform the queries required both for page discovery and for collecting ads using our API access. Our list was not a complete list of advertisers using Facebook's platform, since it depends on good coverage from our keyword searches.

Facebook started publishing a comprehensive list after the cutoff for the data analyzed in this study. Our data collection from Facebook's archive is our best effort and was incomplete, based on analysis of what is contained in Facebook's transparency report. We have changed our data collection methodology moving forward to discover Facebook pages running political ads using Facebook's weekly transparency reports. This, combined with improvements that we requested and that Facebook implemented in their API after the data collection period for this study, will improve our coverage to a mostly complete set of the U.S. political ads that Facebook has included in their transparency archive.

3.2 Google

Google published their archive as a public dataset in a BigQuery (SQL-like) format and committed to keeping it public. However, we observed that ad spend and impression values for ads, and occasionally advertiser information, were being changed after their entry into the archive. For this reason, we created separate archives of the dataset on a weekly basis. Additionally, ad text was not available in the dataset itself but was viewable on a summary page for each ad. We scraped all of these associated pages to collect ad text and associate it in our dataset with the underlying ad data. Unfortunately, many of these summary pages did not render correctly, so we were only able to collect ad text for approximately 66% of the pages which contained it. Two separate issues prevented collection. First, some ad pages display the message:

"Advertisers are able to use approved third party vendors to serve ads on Google. While we are able to review these ads for compliance with advertising policies, due to technical limitations, we are currently unable to display the content of the ad in the Transparency Report."

Second, some ad pages displayed the message:

"Policy violation. This ad violated Google's Advertising Policy."

We recommend that Google change their implementation so that ads served by third party vendors, or which were deleted for compliance reasons, are still accessible through their transparency archive. Google can place a click-through disclaimer to avoid accidental exposure to policy-violating content, similar to what Facebook has implemented for deleted advertisements.
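The weekly archiving of the Google dataset described in this section makes it possible to see which records changed after they entered the archive. The following is a minimal sketch of such a comparison between two weekly snapshots. The record layout and the field names (`spend_range`, `impressions_range`, `advertiser`) are hypothetical and are not the column names of Google's dataset.

```python
from typing import Dict, Iterable, Set

Snapshot = Dict[str, dict]  # ad_id -> record (field names are hypothetical)


def changed_ads(previous: Snapshot, current: Snapshot,
                fields: Iterable[str] = ("spend_range", "impressions_range",
                                         "advertiser")) -> Dict[str, Set[str]]:
    """For each ad present in both snapshots, return the fields that differ."""
    fields = tuple(fields)
    changes = {}
    for ad_id, old in previous.items():
        new = current.get(ad_id)
        if new is None:
            continue  # ad absent from the newer snapshot; handle separately
        diff = {f for f in fields if old.get(f) != new.get(f)}
        if diff:
            changes[ad_id] = diff
    return changes
```

Keeping every weekly snapshot, rather than only the latest, is what allows a change like this to be noticed at all.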

Information on spend and impressions per ad is only available in broad ranges. For impressions, the ranges presented are "<= 10k", "10k-100k", "100k-1M", "1M-10M", and "> 10M". For spend, the ranges presented are "< 100", "100-1k", "1k-50k", "50k-100k", and "> 100k". The ads in this dataset are a combination of text-only ads that are displayed alongside Google search results and image- or video-only ads that are displayed as banner or sidebar ads on Google's AdSense network.

In addition to per-ad data, Google also published some aggregate data on a per-advertiser and geographic basis. One of these aggregations was exact weekly spend per advertiser. Throughout this paper, we present minimum numbers for impressions and spend, because both Google and Facebook publish ranges for impressions and spend for each ad instead of exact numbers. These total aggregations give us a sense of how much error there is when we use these minimum estimates for Google. According to Google, advertisers spent $45 M on political ads, but our minimum estimate of spending was only $11 M.

3.3 Twitter

Twitter publishes a list of all political campaigning advertisers [28], which we scrape daily to discover new political campaign advertisers' Twitter accounts. In addition to this list provided by Twitter, we have also manually attempted to identify every federal election candidate's personal or campaign Twitter account. We then query each account daily using Twitter's API to collect updated information on all promoted tweets and to detect federal election candidates who are not listed on Twitter's political campaigning advertisers page but are sponsoring tweets. During our scraping, we noticed that some promoted tweets were deleted and replaced with the text:

"This Tweet is not available because it includes content that violated Twitter Ads Policies."

The information for these deleted promoted tweets is no longer accessible through Twitter's political transparency archive. However, if we scraped them before they were deleted, we have retained the content and information about these promoted tweets. We recommend that Twitter change their implementation so that promoted tweets which were deleted are still accessible through their transparency archive. Twitter can place a click-through disclaimer to avoid accidental exposure to policy-violating content, similar to what Facebook has implemented for deleted advertisements. We have made public all of the data that we have collected from Twitter's transparency archive.

Additionally, we noted that there were several accounts of federal candidates that were not being archived according to Twitter's policies. We would find these ads during our regular scrapes for ads by all federal candidates, but no billing or impression data would be available, and the ads would disappear from Twitter's archive after 7 days, as is typical for non-political ads. We notified Twitter about 4 accounts, which they subsequently added to their transparency archive. However, Twitter did not retroactively include the prior promoted tweets from these accounts, and there are currently 11 additional federal candidate accounts which have promoted tweets not included in the archive. Thus, it appears that Twitter's process for flagging federal election candidates' accounts that should be included in their archive is not working correctly. We will disclose this new set of 11 accounts to Twitter and continue to work with them to improve their process for discovering and including relevant promoted tweets in their transparency archive.
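The check described above, finding candidate accounts that sponsor tweets but do not appear on Twitter's list of political campaigning advertisers, amounts to a set comparison. The sketch below is illustrative only. The three inputs are assumed to be plain lists of account handles gathered separately, and handles are compared case-insensitively.

```python
def unlisted_sponsors(listed_accounts, candidate_accounts, promoting_accounts):
    """Return candidate handles that promote tweets but are not on the list.

    listed_accounts:    handles on the platform's political advertiser list
    candidate_accounts: manually identified federal candidate handles
    promoting_accounts: handles observed with at least one promoted tweet
    """
    listed = {a.lower() for a in listed_accounts}
    candidates = {a.lower() for a in candidate_accounts}
    promoting = {a.lower() for a in promoting_accounts}
    return sorted((candidates & promoting) - listed)
```

For example, `unlisted_sponsors(["CandA"], ["canda", "CandB", "CandC"], ["CANDB", "canda"])` returns `["candb"]`.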

4 DATASETS

We have collected all of the U.S. political ad data that Google and Twitter have made transparent and archived as of October 21st, 2018. In addition, we have made our best effort to collect as much as possible of the U.S. political ad data that Facebook has made transparent and archived as of October 21st, 2018. For Facebook, we are not able to collect all of the ad data from their transparency archive due to the limitations in their API; our data is a subset of the U.S. political ads that ran on Facebook. Note that our scraper was blocked by Facebook in mid-July 2018, and we were not able to collect data until the beginning of September 2018, when we began to use their beta API. This means that we do not have good coverage of Facebook ads during that period, since it is difficult to retrieve older ads from Facebook's current beta API. On October 23rd, 2018, shortly after we froze our dataset, Facebook released their Ad Archive Report [6]. From this, we know that as of the cutoff date for data analyzed in this study, Facebook had a total of 1.67M ads in their archive, with a total of $256M spent across 78K pages. We have captured over 75% of all ads in the archive, but only 49% of the pages.

On the Facebook platform, ads that run without a 'Paid for by' label but are later deemed to be political are removed from circulation and added to the archive. We have been able to find 96106 such ads in the archive, with a total spend of at least $428 million and 670 million impressions. It does not appear that Google or Twitter have any mechanism for retroactively marking an ad as political if it is discovered after the fact, and we would encourage them to develop this capacity.


Table 2 shows all of the data that we have collected from each of the platforms. Most of the political ads in these archives are from late May 2018 to October 21st, 2018, but there are several older ads from Twitter and Facebook that have been included in their transparency archives. Facebook has the most advertisers, ads, impressions, and spend. However, Facebook also includes many political issue ads in their transparency archive that are not included in Google and Twitter's transparency archives, so this is not a fair comparison of political advertising activity across all three platforms. An important difference between the datasets is that, while Facebook and Twitter are publishing breakdowns of impressions along geographic and demographic lines, Google is instead publishing geographic and demographic targeting. These should not be considered equivalent. Below, in the analysis section, we will present a more accurate comparison of political advertising activity across all three platforms.

Additionally, we use a dataset published by ProPublica of political ads that have been viewed by users who have installed browser extensions that automatically collected advertisements on their Facebook pages and sent them to ProPublica's servers [23]. Table 3 provides an overview of this dataset. We were able to connect ads in the ProPublica dataset to ads in our dataset of archived political ads by mapping the ad IDs used in the ProPublica dataset to the ad archive IDs used in the archive. To do this, we scraped Facebook's web-based political ad archive, as both the ad IDs and the ad archive IDs were available there. Each record in this dataset contains, among other things, the text of the ad, the various targetings received by the different users who saw the ad, the page associated with the ad, and the 'Paid by' ad sponsor string associated with the ad. Of the 33308 ads in the ProPublica dataset with a creation date after May 7th, 2018, the official start of the Facebook dataset, we were able to find 18010. Because the users who contribute data to this dataset are self-selecting, these ads should not be considered a representative sample of ads in the larger Facebook Ad Archive. Among other things, the average ad spend on ads in this dataset was $644, compared to $107 for the larger dataset.
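The linking step described above can be sketched as a lookup from ProPublica ad IDs to archive IDs. This is an illustration under assumptions: the dictionary keys `ad_id` and `ad_archive_id` are hypothetical field names, and `id_map` stands in for the ID pairs obtained by scraping the web-based archive.

```python
def link_ads(propublica_ads, id_map):
    """Split ProPublica records into those found in the archive and the rest.

    propublica_ads: iterable of dicts, each with a (hypothetical) "ad_id" key
    id_map:         dict mapping ad_id -> ad archive ID
    """
    linked, unmatched = [], []
    for ad in propublica_ads:
        archive_id = id_map.get(ad["ad_id"])
        if archive_id is None:
            unmatched.append(ad)
        else:
            # Copy the record and attach the archive ID for later joins.
            linked.append({**ad, "ad_archive_id": archive_id})
    return linked, unmatched


# Match rate implied by the counts reported in the text.
match_rate = 18010 / 33308
```

With the counts given above, `match_rate` works out to roughly 54%, which is one more reason to treat the linked ads as a partial view rather than a representative sample.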

As part of our analysis, we manually categorized the top advertisers on all three platforms. We categorized these advertisers by organization type (political candidate, Political Action Committee (PAC), Union, For Profit, etc.). For Facebook, we were able to classify the organizations of the advertisers who were responsible for at least 75% of the total number of ads in the Facebook archive. For Google, we labeled the organizations of the top advertisers who were responsible for 80% of the total number of ads, and for Twitter we were able to label all 88 advertisers with their organization type. We were able to categorize 12833 of the top ad sponsors. If we were not able to categorize an advertiser, it is marked as 'Unknown'.

Figure 1: Distribution of ads by size

We also classified the ads themselves into 5 categories: Inform, Connect, Donate, Move, or Commercial. Inform ads seek to persuade the viewer but do not make an explicit ask. Connect ads seek the user's contact information. Donate ads seek the user's money. Move ads attempt to motivate the user to take some action in the physical world, such as attending a rally or voting. Commercial ads seek to sell the user goods or services. We classified the ads based on the outgoing links from the ads. Ads that had no outgoing links were always classified as Inform ads, as they could not have any further ask from the user. Ads that linked directly to third-party sites for event management (eventbrite.com), contact management (Google Docs), or payments management (actblue.com) were solely classified as Move, Connect, or Donate ads, respectively. Ads that linked to general campaign sites were usually multiple-classed as some combination of the three, as these ads and pages typically made multiple asks. Ads by For Profit Media organizations were classified as Inform ads, as these advertisers do not sell goods or services to users. Ads by For Profit organizations that linked to store sites or sites selling services were classified as Commercial. We were able to categorize 907840 ads with these methods. Heavy use of third-party service providers by advertisers was extremely helpful in making these classifications. If we were not able to categorize an ad, it was marked as 'Unknown'. We validated this method of ad categorization by taking a random sample of 300 categorized ads from each platform and manually verifying them. The error rate for Facebook was 4%, for Google was 3.7%, and for Twitter was 3.7%.
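The link-based part of the classification described above can be sketched as a small rule table. This is a simplified illustration, not the classifier used in this study: it covers only the no-link rule and the three third-party services named in the text, `docs.google.com` is an assumed hostname for Google Docs, and the rules that depend on advertiser type or on reading a campaign site are left out.

```python
from urllib.parse import urlparse

# Service domain -> category. The services are the examples named in the text;
# "docs.google.com" is an assumed hostname for Google Docs.
SERVICE_CATEGORIES = {
    "eventbrite.com": "Move",
    "docs.google.com": "Connect",
    "actblue.com": "Donate",
}


def classify_ad(outgoing_links):
    """Return the set of categories implied by an ad's outgoing links.

    Links are expected to be absolute URLs (with a scheme such as https://).
    """
    if not outgoing_links:
        return {"Inform"}  # no outgoing link means no further ask
    categories = set()
    for link in outgoing_links:
        host = (urlparse(link).hostname or "").lower()
        for domain, category in SERVICE_CATEGORIES.items():
            # Match the domain itself or any of its subdomains.
            if host == domain or host.endswith("." + domain):
                categories.add(category)
    return categories or {"Unknown"}
```

Returning a set rather than a single label leaves room for ads that make several asks at once, as the text notes for links to general campaign sites.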

A limitation that applies to all our datasets is that we do not know when the spend and impressions for each ad occurred during the lifetime of the ad. Some ads run for several weeks and some for only a day, but in either case we attribute their entire spend and total impressions to the creation date of the ad.

5 RESULTS

We calculate the total spend and impression minimum and maximum for Facebook ads by summing, respectively, the smallest and largest value of the range given for each ad. For Google, advertiser weekly spend data was aggregated for all advertisers, so we did not have to estimate that number. For Twitter, exact numbers for impressions and spend were available, so no estimation was needed. We also note that we are only able to collect a subset of political advertisements from Facebook's transparency archive due to accessibility issues with their beta API. We stress that, because the criteria for inclusion in these archives differed on the different platforms, the figures on relative proportions of ad types and advertiser types should be seen as a reflection of what the platforms chose to make transparent, in addition to what is organically present on these platforms.
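The minimum and maximum totals described above follow directly from the per-ad ranges. The sketch below assumes each ad's range has already been parsed into a `(low, high)` pair; how the ranges are encoded in the API response is not specified here.

```python
def total_bounds(ranges):
    """Sum the lower and upper ends of per-ad ranges.

    ranges: iterable of (low, high) pairs, one per ad.
    Returns (minimum_total, maximum_total).
    """
    ranges = list(ranges)
    minimum_total = sum(low for low, _ in ranges)
    maximum_total = sum(high for _, high in ranges)
    return minimum_total, maximum_total


# Three ads with spend ranges 0 - 99, 100 - 499, and 1000 - 4999.
print(total_bounds([(0, 99), (100, 499), (1000, 4999)]))  # (1100, 5597)
```

Because the lowest range starts at 0, an archive dominated by very small ads will have a minimum total far below its maximum total.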

With that in mind, we can see clear differences between the platforms. Of particular note is the difference in ad size visible in Figure 1, with Facebook having a much larger share of ads of the smallest size. Also of note is the differing prevalence of types of advertisers in Figure 3, with PACs making up a much larger percentage of spend on Google compared with the other platforms.


| Platform | Total Ads | Total Sponsors | Total Pages | Impressions | Spend | First Ad Date | Last Ad Date |
|---|---|---|---|---|---|---|---|
| Facebook | 1.26 M | 24 K | 38 K | 7.35 B - 21.12 B | $135 M - $567 M | July 14th, 2014 | October 21st, 2018 |
| Google | 41 K | 616 | NA | 1.3 B - 11.6 B | $45 M | May 31st, 2018 | October 21st, 2018 |
| Twitter | 1808 | 88 | NA | 118 M | $1.6 M | December 21st, 2016 | October 21st, 2018 |

Table 2: Overall Datasets

| Total Ads | 81052 |
|---|---|
| Total Pages | 2363 |
| Total Ad Sponsors | 2395 |
| Earliest Ad Date | July 31st, 2017 |
| Latest Ad Date | October 18th, 2018 |

Table 3: ProPublica Political Advertisements From Facebook

Figure 2: Distribution of ads by type

Figure 3: Distribution of spend by advertiser type

5.1 Data Over Time

The time period during which we were collecting data coincided with the 2018 midterm elections in the United States. Thus, we were able to observe changing patterns in spend leading up to a major election. Figure 5 shows spend by week for the 5-month period leading up to the election, and Figure 4 shows raw ad count for the same period. We note that our data, particularly for Facebook spend, is right-censored for the final two weeks. This is caused by Facebook's API limitations, which only enable us to recheck ad spends weekly. Thus, newly created ads have likely not spent much of their budget when we initially discover them. This right-censoring effect also likely affects Google and Twitter to a lesser degree, due to ads with larger budgets that take several days to spend down completely. This can be corrected by periodically rechecking the ads until they have all spent their budgets, which is normally within a week. If the paper is accepted, we will update the data to include ads up to the U.S. midterm elections.

| Results | Facebook | Google | Twitter |
|---|---|---|---|
| Total Advertisers | 1 K | 534 | 54 |
| Total Ads | 161 K | 15 K | 1 K |
| Total Impressions | 800 M - 2.4 B | 280 M - 3 B | 100 M |
| Total USD Spend | $12 M - $60 M | $13.5 M | $1.4 M |
| Ave. Impressions/Ad | 5 K - 15 K | 32 K - 283 K | 65 K |
| Ave. USD Spend/Ad | $74 - $373 | $1 K | $885 |

Table 4: Federal Candidate Only Results

We can see the expected increases in the number of ads on all three platforms as the U.S. midterm elections approach. On Facebook's platform there is an increase in connect ads, and on Twitter there is an increase in move ads. Both of these are related to sophisticated "get out the vote" efforts that many groups have deployed. These move ads include images with specific polling place addresses and websites that provide polling place directions and information. The connect ads often provide users with instructions on how they can volunteer to help with early and day-of voter turnout efforts. The spending spikes on Facebook's platform can be attributed to a few unknown sponsors that we could not link to a legally registered entity, but that were likely quasi for-profit advertisers, which we will discuss further later in the paper. The spending spikes on Twitter's platform can be attributed to candidates who ran a few ads with larger budgets.

5.2 Federal Candidate Comparison

In order to understand how political advertising differs across these platforms, we attempted to create a comparable subset of advertisers and ads. This is difficult because each platform has slightly different criteria for inclusion. To do this, we present results only for advertising paid for by candidates for federal office, which was the broadest set that was reliably included in all three archives. Note that this does not include ads by current officeholders who are not seeking re-election, or ads that merely mention a federal candidate but are paid for by another party. Results for these advertisers are presented in Table 4.

Figure 4: Platform Ad Count By Ad Type By Week

Figure 5: Platform Spend By Advertiser Type By Week

Figure 6: Federal Candidate ads by Size

Figure 7: Federal Candidate Spend by Ad Type

Table 4 shows that Facebook is the platform with the broadest appeal to federal candidate advertisers, with far more advertisers and ads than Google. However, political advertising by this group on Google appears to generate more spend and possibly more impressions than ads on Facebook. The average ad size on Facebook, in terms of impressions and spend, is the smallest based on our minimum estimates, indicating that advertisers are running smaller, likely more targeted, ads on Facebook. These small ads on Facebook are what are called micro-targeted, which we define as less than 1000 impressions or a spend of less than $100. For Facebook, micro-targeted ads make up 81% of the overall number of ads for federal candidates in our dataset. For Twitter, this number is 62%, and for Google it is 54%. Figure 6 shows the share of ads by size of spend, and here we can begin to see how federal candidates use these platforms in different ways. We note that the distribution of ads by size for federal candidates in Figure 6 is very similar to the overall distribution of ads by size seen in Figure 1.
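The micro-targeting definition above (less than 1000 impressions or a spend of less than $100) can be written as a simple predicate. How the thresholds are applied to range-valued data is not spelled out in the text; the sketch below assumes the upper end of each reported range is used, so that an ad in the lowest Facebook range (0 - 999 impressions or 0 - 99 spend) qualifies.

```python
IMPRESSION_THRESHOLD = 1000  # from the definition in the text
SPEND_THRESHOLD = 100        # USD, from the definition in the text


def is_micro_targeted(impressions_upper, spend_upper):
    """True if the ad falls under either threshold (assumed: upper bounds)."""
    return impressions_upper < IMPRESSION_THRESHOLD or spend_upper < SPEND_THRESHOLD


def micro_share(ads):
    """Fraction of ads that are micro-targeted.

    ads: iterable of (impressions_upper, spend_upper) pairs.
    """
    ads = list(ads)
    if not ads:
        return 0.0
    return sum(is_micro_targeted(i, s) for i, s in ads) / len(ads)
```

Under this reading, an ad with up to 4999 impressions still counts as micro-targeted if its spend stays under $100, because the definition joins the two conditions with "or".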

Figure 7 shows the relative spend on different ad platforms, where we see very different percentages for types of ads. Commercial ads are not shown in this figure because there were too few commercial ads to be visible. Particularly of note is the fact that ads seeking donations were far more common on the Google platform, and ads seeking to spread a message ('Inform') were much more common on Facebook.

Seeing these differences in both ad size and the types of ads that were run, we wanted to understand if advertisers were trying to reach different geographic audiences with different types of ads. To do this, we compared the number of regions in which various ads had impressions on Facebook and Twitter, and the number of regions targeted for Google. Figure 8 shows that a variation in targeting strategy is visible on Facebook and Twitter. On Facebook, 'Move' ads that encouraged people to attend a rally, volunteer for a candidate, or take part in some other in-person activity were viewed on average in 4 regions, while 'Donate' ads were viewed in 24 regions on average. This makes a certain amount of intuitive sense: people are willing to travel only so far to attend a rally, but can donate to candidates anywhere in the United States.


Figure 8: CDF of Regions by Ad Type for Federal Candidates

5.3 Ad Targeting

One of the deficiencies of the Facebook political ad archive is that, while it does share geographic and demographic information about who saw a particular ad, we have no way of knowing how that ad was targeted. However, we were able to connect our dataset, containing information about who consumed ads, with one published by ProPublica, which contains some data about how ads were targeted. The ProPublica data was collected by a browser plugin operated by ProPublica, which anyone can install. ProPublica's browser plugin [22] uses a supervised Natural Language Processing (NLP) classifier to detect political ads, in addition to allowing users to manually classify ads they see as political. The browser plugin then collects the partial ad targeting explanations Facebook provides by automatically clicking on the "Why am I seeing this" button for political ads, and sends them to ProPublica to make public.

We first provide a brief background on Facebook's targeting audience options [34]. Facebook exposes prospective advertisers to a plethora of options. First, advertisers can target users based on age, gender, location, and the languages they speak. Second, advertisers can choose to send their ads to users in a custom audience or a lookalike audience. Custom audiences contain a list of identifiers of specific users. Advertisers can use various types of data to create a custom audience list, ranging from specifying the emails, phone numbers, or physical addresses of people they want to reach, to users that have visited their website, installed their mobile application, or liked their Facebook Page. Lookalike audiences allow advertisers to let Facebook choose to whom to send their ads based on previous campaigns. Finally, advertisers can choose, from a long list of targeting attributes, the characteristics they want users who receive their ads to have (e.g., users interested in the Catholic Church). Targeting attributes are categorized in types such as demographics, behaviors, and interests. Advertisers can choose multiple attributes to target.

Figure 9: ProPublica Spend by Ad Type

A prior study by Athanasioshas et al. [2] reverse engineered what Facebook chooses to show and the limitations of the ad targeting explanation Facebook provides. This study showed that ad explanations are incomplete: each explanation shows at most one targeting attribute (plus age/gender/location information), regardless of how many attributes the advertisers use. This means that explanations reveal only part of the targeting attributes that were used, providing us, and the users, with an incomplete picture of the attributes that advertisers were using. However, in the same study the authors performed a number of controlled experiments that suggest, but do not conclusively prove, that there is a logic behind which attributes appear in an explanation and which do not. Given a targeting audience A obtained from two attributes a1 and a2, if a1 and a2 come from different attribute categories (e.g., Demographic, Behavior, Interest, etc.), the attribute shown follows a specific precedence (Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors). If a1 and a2 come from the same attribute category, the one that appears in the explanation is the one with the highest estimated audience size. This will result in a systematic under-counting of lower-priority targeting types.
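The precedence rule suggested by those experiments can be sketched as a selection function. This only encodes the rule as summarized above. The category labels and the tuple layout are illustrative assumptions, and the rule itself is suggested by the prior study rather than conclusively established.

```python
# Highest precedence first, following the ordering reported in the text.
# Age/Gender/Location is grouped with "Demographics" here.
PRECEDENCE = ["Demographics", "Interests", "PII", "Behaviors"]


def shown_attribute(attributes):
    """Pick the attribute an explanation would show under the reported rule.

    attributes: non-empty list of (name, category, estimated_audience_size).
    Categories are compared by precedence; ties within a category go to the
    attribute with the largest estimated audience size.
    """
    return min(attributes, key=lambda a: (PRECEDENCE.index(a[1]), -a[2]))


example = [("frequent travelers", "Behaviors", 5_000_000),
           ("gardening", "Interests", 10_000)]
print(shown_attribute(example)[0])  # gardening
```

In the example, the behavior attribute is never shown even though its audience is far larger, which illustrates why lower-priority targeting types are under-counted in the explanations.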

There are two main sources of biases and limitations in ProPublica's dataset. One comes from which users installed ProPublica's plugin and which political ads they were shown. Another is from the way Facebook provides ad explanations. The ProPublica dataset is the only publicly available source of targeting information for Facebook political ads. Thus, we present these results to provide an initial insight into how Facebook political advertisers are targeting their ads, with the understanding of the likely biases and limitations.

With these caveats in mind, we proceed to an analysis of the 18010 ads which we were able to connect between the ProPublica dataset and ours. In Figure 9, we see that different types of ads do indeed rely on different targeting strategies. Of particular note is the divergence of 'Commercial' ads, of which 74% rely on targeting by interest groups, and of 'Donate' ads, of which only 24% do. The average ad size did not differ significantly between targeting types, but was significantly larger than the average for the Facebook archive as a whole. We believe this to be an artifact of the collection mechanism, which is biased toward finding larger ads. 91% of ads in the overall ProPublica dataset had some kind of geographic targeting, and 92% had age or gender targeting. On average, ads in this dataset had 41 different targeting parameters, so for Facebook these should be thought of as minimum criteria. By contrast, 58% of ads in the Google archive had no geographic targeting whatsoever, and 70% had neither age nor gender targeting.

Figure 10: ProPublica Targeting by Advertiser Type

In Figure 10, we also see diverging strategies between advertisers, with Political Candidates and PACs making heavy use of custom lists of users, and For Profit and For Profit Media companies relying far more on targeting users by their interests. Campaigns have numerous potential sources from which to compile lists of users. In addition to their own lists of donors and voter rolls, campaigns can rent lists from other candidates [15].

Both Google and Twitter offer advertisers targeting criteria similar to what we have described for Facebook, including custom audiences and lookalike audiences. Both even allow targeting of users based on interests, although they infer these interests in different ways. Both have made transparent demographic and geographic targeting information for ads in their archive, but without other targeting information this is an incomplete picture at best. We encourage Google and Twitter to, at minimum, follow Facebook's example and make transparent to users information about why they have been targeted for the ads that they are seeing.

54 New Types of Political Advertisers541 For-Profit Media One advertiser type in particular proved

to be an interesting outlier e category rsquoFor Prot Mediarsquo con-tains advertisers whose ads are not considered traditional news byFacebook (those ads are in a separate part of the archive that wedid not include) but have content intended solely to entertain orsway the opinion of the viewer Over the Facebook dataset as awhole the average ad sponsor ran ads on 16 pages Advertisersin the for-prot media category however ran ads on 32 pages onaverage We have examined many of these for-prot media compa-nies to understand why they are running across many Facebookpages What we have found in numerous instances is unknownfor-prot media companies that appear to be creating disingenuouscommunities that appear to be ldquograssroots movementsrdquo to targetdierent demographics and interests with a combination of paidand organic political messaging

A good example of this type of advertiser is "New American Media Group LLC". This ad sponsor ran left-leaning ads on 10 different pages. These pages were designed to appeal to different demographics ("Melanin" for people of color, "The Soldier Network" for veterans, "Raising Tomorrow" for parents, etc.), but often ran the same content on multiple pages. While this LLC has an extremely similar name to a now-defunct genuine left-leaning media outlet (New America Media), it appears to have no connection to that prior group, and also appears to have no activity off of Facebook.

While some advertisers in this category were fairly traditional entertainment websites (e.g., Comedy Central), some were "for-profit" companies in name only that appeared to exist for no purpose other than to spread a particular political message and had no way of generating an actual profit. We also discovered "News for Democracy", an LLC that ran left-leaning ads on 14 different Facebook pages, most of which were designed to be appealing to groups with traditionally conservative viewpoints, such as "The Holy Tribune". Journalists investigated this LLC and linked it to MotiveAI, which is a liberal political advertising company.

5.4.2 Corporate Astroturfing. Corporations paying for political advertising is not an entirely new phenomenon, and it has traditionally been funded through industry trade groups and PACs. However, the reporting requirements of the FCC for U.S. political advertising on television often made this political messaging traceable to the real sponsor. These stricter reporting requirements do not apply to online political advertising, and the ad-hoc reporting requirements that online platforms have enacted are being abused by corporations and industry trade groups to undo transparency efforts.

We discovered in our analysis 355 ads sponsored by "Citizens for Tobacco Rights", which is not a registered company in the U.S. but does disclose on its website and Facebook page that it is operated by the cigarette company Philip Morris. However, someone who only saw the Facebook ad disclaimer would not be able to connect the ads to Philip Morris without further investigation. Other journalists have found instances of oil and insurance lobbying groups that also provided sponsor names that did not match the legally incorporated entity sponsoring the ads [21]. These organizations are seemingly taking advantage of Facebook's policy of not vetting sponsor names, since some of these entities also ran political ads on Google's ad platform but provided Google with their EIN (tax ID) and the correct legally incorporated names of their organizations [21].

5.5 Discussion
The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives such that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion in the different platforms' archives and what can be done to address these issues.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online transparency sponsorship disclosure policies. As we discussed in the Ad Targeting section, it is extremely easy for groups such as 'New American Media' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1], which struck down restrictions on election spending by for-profit corporations. However, political advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business that paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so with no intent of making a profit. As a private company, it does not need to publicly disclose its investors in the way that PACs are required to disclose their donors. The for-profit company can then advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead, either intentionally or accidentally, when providing the 'ad sponsor' string associated with their ads. Thus it is effectively free to circumvent Facebook's transparency implementation. We see numerous instances of this occurring on Facebook's platform. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J. Trump For President, Inc. sponsored ads on both the Donald J. Trump page and the Mike Pence page. However, when sponsoring ads on the Donald J. Trump page, the organization is known as 'Donald J Trump For President Inc', and when sponsoring ads on the Mike Pence page, it is known as 'Donald J Trump For President Inc'. Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.
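Sponsor strings that differ only through data-entry variation can be grouped before analysis. The following is a minimal sketch of one such normalization, assuming that case, punctuation, and spacing are the only differences that should be ignored; the function names and the example variants are illustrative and are not taken from the archive or from our pipeline.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_sponsor(name: str) -> str:
    """Return a canonical key for a free-text sponsor string (illustrative)."""
    # Fold Unicode compatibility forms and case differences.
    text = unicodedata.normalize("NFKC", name).casefold()
    # Replace punctuation with spaces so "Inc." and "Inc" compare equal.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace left behind by the substitution.
    return " ".join(text.split())


def group_sponsors(names):
    """Map each canonical key to the set of raw strings that produced it."""
    groups = defaultdict(set)
    for raw in names:
        groups[normalize_sponsor(raw)].add(raw)
    return dict(groups)


# Hypothetical variants of one sponsor string, for illustration only.
variants = [
    "Donald J. Trump For President, Inc.",
    "DONALD J. TRUMP FOR PRESIDENT, INC",
]
groups = group_sponsors(variants)
```

This only catches mechanical variation. It cannot link a sponsor string to the legal entity behind it, which is the harder attribution problem discussed in this section.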

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (i.e., LLCs, PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that we need more discussion about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content but are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent the advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data, even with other researchers, or even discussing our findings directly with non-U.S. persons.

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require investment of time and money, both in the technology of these platforms and the corporate culture of the organizations that own them.

6 RELATED WORK
6.1 Online Advertising
Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof-of-concept systems and were not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented their transparency archive and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising
Analysis of political television ads has been the focus of most prior political advertising studies, likely because this data is publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS
We have performed an analysis of the ads that we were able to collect from Facebook, Google, and Twitter's transparency archives related to U.S. politics. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES
[1] 2009-03-24. Citizens United v. Federal Election Commission.

[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018 - Network and Distributed Systems Security Symposium. San Diego, United States, 18–21.

[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531–1542.

[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency

[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive

[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report

[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506

[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974

[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api

[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127–150.

[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library

[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html

[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81–87.

[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265–278.

[15] Maggie Haberman, Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online

[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474–482.

[17] Shikhar Sakhuja, Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis

[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.

[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554–566. https://doi.org/10.1145/2810103.2813614

[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783

[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules

[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online

[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook

[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.

[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676–703.

[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5–19.

[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them

[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers

[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency

[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html

[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads

[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79–91. https://doi.org/10.1145/2815675.2815694

[33] G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. P. Gummadi, P. Loiseau, and O. Goga. 2018. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In 2018 IEEE Symposium on Security and Privacy (SP). 89–107.

[34] Giridhari Venkatadri, Yabing Liu, Athanasios Andreou, Oana Goga, Patrick Loiseau, Alan Mislove, and Krishna P. Gummadi. 2018. Privacy Risks with Facebook's PII-based Targeting: Auditing a Data Broker's Advertising Interface. In Proceedings of the IEEE Symposium on Security and Privacy (IEEE S&P '18). San Francisco, CA, USA.

[35] David West. 2018. Air wars: television advertising and social media in election campaigns, 1952-2016. SAGE/CQ Press.


observed that ad spend and impression values for ads, and occasionally advertiser information, were being changed after their entry into the archive. For this reason, we created separate archives of the dataset on a weekly basis. Additionally, ad text information was not available in the dataset itself but was viewable at a summary page for each ad. We scraped all of these associated pages to collect ad text to be associated in our dataset with the underlying ad data. Unfortunately, many of these summary pages did not render correctly, so we were only able to collect ad text for approximately 66% of the pages which contained it. Two separate issues prevented collection. First, some ad pages displayed the message: "Advertisers are able to use approved third party vendors to serve ads on Google. While we are able to review these ads for compliance with advertising policies, due to technical limitations, we are currently unable to display the content of the ad in the Transparency Report."

Second, some ad pages displayed the message: "Policy violation: This ad violated Google's Advertising Policy." We recommend that Google change their implementation so that ads which were served by third-party vendors or which were deleted for compliance reasons are still accessible through their transparency archive. Google can place a click-through disclaimer to avoid accidental exposure to policy-violating content, similar to what Facebook has implemented for deleted advertisements.

Information on spend and impressions per ad is only available in broad ranges. For impressions, the ranges presented are "<= 10k", "10k-100k", "100k-1M", "1M-10M", and "> 10M". For spend, the ranges presented are "< 100", "100-1k", "1k-50k", "50k-100k", and "> 100k". The ads in this dataset are a combination of text-only ads that are displayed alongside Google search results and image or video-only ads that are displayed as banner or sidebar ads on Google's AdSense network.

In addition to per-ad data, Google also published some aggregate data on a per-advertiser and geographic basis. One of these aggregations was exact weekly spend per advertiser. Throughout this paper, we present minimum numbers for impressions and spend, because both Google and Facebook publish ranges for impressions and spend for each ad instead of exact numbers. These total aggregations give us a sense of how much error there is when we use these minimum estimates for Google. According to Google, advertisers spent $45 M on political ads, but our minimum estimate of spending was only $11 M.

3.3 Twitter
Twitter publishes a list of all political campaigning advertisers [28], which we scrape daily to discover new political campaign advertisers' Twitter accounts. In addition to this list provided by Twitter, we have also manually attempted to identify every federal election candidate's personal or campaign Twitter account. We then query each account daily using Twitter's API to collect updated information on all promoted tweets and to detect federal election candidates who are not listed on Twitter's political campaigning advertisers page but are sponsoring tweets. During our scraping, we noticed that some promoted tweets were deleted and replaced with the text: "This Tweet is not available because it includes content that violated Twitter Ads Policies."

The information for these deleted promoted tweets is no longer accessible through Twitter's political transparency archive. However, if we scraped them before they were deleted, we have retained the content and information about these promoted tweets. We recommend that Twitter change their implementation so that promoted tweets which were deleted are still accessible through their transparency archive. Twitter can place a click-through disclaimer to avoid accidental exposure to policy-violating content, similar to what Facebook has implemented for deleted advertisements. We have made public all of the data that we have collected from Twitter's transparency archive.

Additionally, we noted that there were several accounts of federal candidates that were not being archived according to Twitter's policies. We would find these ads during our regular scrapes for ads by all federal candidates, but no billing or impression data would be available, and the ads would disappear from Twitter's archive after 7 days, as is typical for non-political ads. We notified Twitter about 4 accounts, which they subsequently added to their transparency archive. However, Twitter did not retroactively include the prior promoted tweets from these accounts, and there are currently 11 additional federal candidate accounts which have promoted tweets not included in the archive. Thus it appears that Twitter's process for flagging federal election candidates' accounts that should be included in their archive is not working correctly. We will disclose this new set of 11 accounts to Twitter and continue to work with them to improve their process for discovering and including relevant promoted tweets in their transparency archive.

4 DATASETS
We have collected all of the U.S. political ad data that Google and Twitter have made transparent and archived as of October 21st, 2018. In addition, we have made our best effort to collect as much as possible of the U.S. political ad data that Facebook has made transparent and archived as of October 21st, 2018. For Facebook, we are not able to collect all of the ad data from their transparency archive due to limitations in their API, so our data is a subset of the U.S. political ads that ran on Facebook. Note that our scraper was blocked by Facebook in mid-July 2018, and we were not able to collect data until the beginning of September 2018, when we began to use their beta API. This means that we do not have good coverage of Facebook ads during that period, since it is difficult to retrieve older ads from Facebook's current beta API. On October 23rd, 2018, shortly after we froze our dataset, Facebook released their Ad Archive Report [6]. From this we know that, as of the cutoff date for data analyzed in this study, Facebook had a total of 1.67M ads in their archive, from a total of $256M spent across 78K pages. We have captured over 75% of all ads in the archive, but only 49% of the pages.

On the Facebook platform, ads that run without a 'Paid for by' label but are later deemed to be political are removed from circulation and added to the archive. We have been able to find 96,106 such ads in the archive, with a total spend of at least $428 million and 670 million impressions. It does not appear that Google or Twitter have any mechanism for retroactively marking an ad as political if it is discovered after the fact, and we would encourage them to develop this capacity.


Table 2 shows all of the data that we have collected from each of the platforms. Most of the political ads in these archives are from late May 2018 to October 21st, 2018, but there are several older ads from Twitter and Facebook that have been included in their transparency archives. Facebook has the most advertisers, ads, impressions, and spend. However, Facebook also includes many political issue ads in their transparency archive that are not included in Google and Twitter's transparency archives, so this is not a fair comparison of political advertising activity across all three platforms. An important difference between the datasets is that while Facebook and Twitter are publishing breakdowns of impressions along geographic and demographic lines, Google is instead publishing geographic and demographic targetings. These should not be considered equivalent. Below, in the analysis section, we will present a more accurate comparison of political advertising activity across all three platforms.

Additionally, we use a dataset published by ProPublica of political ads that have been viewed by users who have installed browser extensions that automatically collected advertisements on their Facebook pages and sent them to ProPublica's servers [23]. Table 3 provides an overview of this dataset. We were able to connect ads in the ProPublica dataset to ads in our dataset of archived political ads by mapping the ad IDs used in the ProPublica dataset to the ad archive IDs used in the archive. To do this, we scraped Facebook's web-based political ad archive, as both the ad IDs and ad archive IDs were available there. Each record in this dataset contains, among other things, the text of the ad, the various targetings received by the different users who saw the ad, the page associated with the ad, and the 'Paid by' ad sponsor string associated with the ad. Of the 33,308 ads in the ProPublica dataset with a creation date after May 7th, 2018, the official start of the Facebook dataset, we were able to find 18,010. Because the users who contribute data to this dataset are self-selecting, these ads should not be considered a representative sample of ads in the larger Facebook Ad Archive. Among other things, the average ad spend on ads in this dataset was $644, compared to $107 for the larger dataset.
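The linking step described above amounts to a lookup join between the two identifier spaces. The following is a minimal sketch, assuming the scraped web archive has already been reduced to a dictionary from ad ID to ad archive ID; the field names `ad_id` and `ad_archive_id` and the sample values are hypothetical and do not reflect the actual schema of either dataset.

```python
def link_datasets(propublica_ads, id_map):
    """Attach archive IDs to ProPublica records.

    propublica_ads: iterable of dicts with a (hypothetical) "ad_id" key.
    id_map: dict from ad ID to ad archive ID, built from the scraped archive.
    Returns (matched, unmatched) lists of records.
    """
    matched, unmatched = [], []
    for ad in propublica_ads:
        archive_id = id_map.get(ad["ad_id"])
        if archive_id is None:
            # The ad was not found in the archive pages we scraped.
            unmatched.append(ad)
        else:
            matched.append({**ad, "ad_archive_id": archive_id})
    return matched, unmatched


# Made-up identifiers, for illustration only.
sample_ads = [{"ad_id": "a1"}, {"ad_id": "a2"}, {"ad_id": "a3"}]
sample_map = {"a1": "arch-100", "a3": "arch-300"}
matched, unmatched = link_datasets(sample_ads, sample_map)
```

Keeping the unmatched records makes the match rate easy to report, in the same way that the text reports how many ProPublica ads were found in the archive.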

As part of our analysis, we manually categorized the top advertisers on all three platforms. We categorized these advertisers by organization type (political candidate, Political Action Committee (PAC), Union, For Profit, etc.). For Facebook, we were able to classify the organizations of the advertisers who were responsible for at least 75% of the total number of ads in the Facebook archive. For Google, we labeled the organization of the top advertisers who were responsible for 80% of the total number of ads, and for Twitter we were able to label all 88 advertisers with their organization type. We were able to categorize 12,833 of the top ad sponsors. If we were not able to categorize an advertiser, it is marked as 'Unknown'.

Figure 1: Distribution of ads by size

We also classified the ads themselves into 5 categories: Inform, Connect, Donate, Move, or Commercial. Inform ads seek to persuade the viewer but do not make an explicit ask. Connect ads seek the user's contact information. Donate ads seek the user's money. Move ads attempt to motivate the user to take some action in the physical world, such as attending a rally or voting. Commercial ads seek to sell the user goods or services. We classified the ads based on the outgoing links from the ads. Ads that had no outgoing links were always classified as Inform ads, as they could not make any further ask of the user. Ads that linked directly to third-party sites for event management (eventbrite.com), contact management (Google Docs), or payments management (actblue.com) were solely classified as Move, Connect, or Donate ads, respectively. Ads that linked to general campaign sites were usually classified into multiple categories, as some combination of the three, since these ads and pages typically made multiple asks. Ads by For Profit Media organizations were classified as Inform ads, as these advertisers do not sell goods or services to users. Ads by For Profit organizations that linked to store sites or sites selling services were classified as Commercial. We were able to categorize 907,840 ads with these methods. Heavy use of third-party service providers by advertisers was extremely helpful in making these classifications. If we were not able to categorize an ad, it was marked as 'Unknown'. We validated this method of ad categorization by taking a random sample of 300 categorized ads from each platform and manually verifying them. The error rate for Facebook was 4%, for Google was 3.7%, and for Twitter was 3.7%.
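The link-based rules above can be sketched as a small lookup over the host names of an ad's outgoing links. This is an illustrative sketch, not our classifier: the domain table covers only the three third-party services named in the text, `docs.google.com` is an assumed host for Google Docs, and the handling of general campaign sites and of advertiser type (For Profit, For Profit Media) is omitted.

```python
from urllib.parse import urlparse

# Third-party service domains named in the text, mapped to ad categories.
# "docs.google.com" is an assumption about how Google Docs links appear.
THIRD_PARTY_CATEGORIES = {
    "eventbrite.com": "Move",
    "docs.google.com": "Connect",
    "actblue.com": "Donate",
}


def classify_ad(links):
    """Return the set of categories implied by an ad's outgoing links."""
    if not links:
        # No outgoing link means no further ask can be made of the user.
        return {"Inform"}
    categories = set()
    for url in links:
        # hostname is None for URLs without a scheme; treat those as unknown.
        host = (urlparse(url).hostname or "").lower()
        for domain, category in THIRD_PARTY_CATEGORIES.items():
            if host == domain or host.endswith("." + domain):
                categories.add(category)
    return categories or {"Unknown"}
```

For example, `classify_ad([])` yields `{"Inform"}`, while an ad whose only link points at a subdomain of `actblue.com` yields `{"Donate"}`. Links to sites outside the table fall through to `{"Unknown"}`, which is where the manual rules for campaign and store sites would apply.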

A limitation that applies to all our datasets is that we do not know when the spend and impressions for each ad occurred during the lifetime of the ad. Some ads run for several weeks and some for only a day, but in either case we attribute their entire spend and total impressions to the creation date of the ad.
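The attribution rule above can be sketched as follows: each ad's whole spend is added to the week containing its creation date, regardless of how long the ad ran. This is a minimal illustration that assumes weeks start on Monday and that each ad is given as a (creation date, minimum spend) pair; both choices, and the sample values, are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


def weekly_min_spend(ads):
    """Sum each ad's entire minimum spend into the week of its creation date.

    ads: iterable of (creation_date, min_spend_usd) pairs.
    """
    totals = defaultdict(int)
    for created, min_spend in ads:
        totals[week_start(created)] += min_spend
    return dict(sorted(totals.items()))


# Made-up ads, for illustration only.
sample = [
    (date(2018, 10, 17), 100),
    (date(2018, 10, 21), 1000),
    (date(2018, 10, 22), 50),
]
by_week = weekly_min_spend(sample)
```

Because the spend is not spread over the ad's lifetime, a long-running ad created late in one week is counted entirely in that week.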

5 RESULTS
We calculate the total spend and impression minimum and maximum for Facebook ads by summing, respectively, the smallest and largest values of the range given for each ad. For Google, weekly spend data was aggregated for all advertisers, so we did not have to estimate that number. For Twitter, exact numbers for impressions and spend were available, so no estimation was needed. We also note that we are only able to collect a subset of political advertisements from Facebook's transparency archive due to accessibility issues with their beta API. We stress that because the criteria for inclusion in these archives differed across the platforms, the figures on relative proportions of ad types and advertiser types should be seen as a reflection of what the platforms chose to make transparent, in addition to what is organically present on these platforms.
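The minimum and maximum totals described above are sums over the per-ad range bounds. A minimal sketch, assuming each ad's published range has already been parsed into a (low, high) pair; the example bounds are hypothetical and are not the archive's actual buckets.

```python
def total_bounds(ranges):
    """Sum per-ad (low, high) bounds into an overall (min, max) estimate."""
    ranges = list(ranges)  # allow generators, since we iterate twice
    minimum = sum(low for low, _ in ranges)
    maximum = sum(high for _, high in ranges)
    return minimum, maximum


# Hypothetical per-ad spend ranges in USD, for illustration only.
example_ranges = [(0, 99), (100, 499), (1000, 4999)]
spend_min, spend_max = total_bounds(example_ranges)
```

The true total lies somewhere between the two sums, and the gap widens with every ad whose range is broad. This is why the reported totals appear as wide intervals rather than point estimates.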

With that in mind, we can see clear differences between the platforms. Of particular note is the difference in ad size visible in Figure 1, with Facebook having a much larger share of the smallest size of ad. Also of note is the differing prevalence of types of advertisers in Figure 3, with PACs making up a much larger percentage of spend on Google compared with the other platforms.


| Platform | Total Ads | Total Sponsors | Total Pages | Impressions | Spend | First Ad Date | Last Ad Date |
|---|---|---|---|---|---|---|---|
| Facebook | 1.26 M | 24 K | 38 K | 7.35 B - 21.12 B | $135 M - $567 M | July 14th, 2014 | October 21st, 2018 |
| Google | 41 K | 616 | N/A | 1.3 B - 11.6 B | $45 M | May 31st, 2018 | October 21st, 2018 |
| Twitter | 1,808 | 88 | N/A | 118 M | $1.6 M | December 21st, 2016 | October 21st, 2018 |

Table 2: Overall Datasets

| Metric | Value |
|---|---|
| Total Ads | 81,052 |
| Total Pages | 2,363 |
| Total Ad Sponsors | 2,395 |
| Earliest Ad Date | July 31st, 2017 |
| Latest Ad Date | October 18th, 2018 |

Table 3: ProPublica Political Advertisements From Facebook

Figure 2: Distribution of ads by type

Figure 3: Distribution of spend by advertiser type

5.1 Data Over Time
The time period during which we were collecting data coincided with the 2018 midterm elections in the United States. Thus we were able to observe changing patterns in spend leading up to a major election. Figure 5 shows spend by week for the 5-month period leading up to the election, and Figure 4 shows raw ad count for the same period. We note that our data, particularly for Facebook spend, is right-censored for the final two weeks. This is caused by Facebook's API limitations, which only enable us to recheck ad spends weekly. Thus newly created ads have likely not spent much of their budget when we initially discover them. This right-censoring effect also likely affects Google and Twitter to a lesser degree, due to ads with larger budgets that take several days to spend down completely. This can be corrected by periodically rechecking the ads until they have all spent their budgets, which normally happens within a week. If the paper is accepted, we will update the data to include ads up to the U.S. midterm elections.

| Results | Facebook | Google | Twitter |
|---|---|---|---|
| Total Advertisers | 1 K | 534 | 54 |
| Total Ads | 161 K | 15 K | 1 K |
| Total Impressions | 800 M - 2.4 B | 280 M - 3 B | 100 M |
| Total USD Spend | $12 M - $60 M | $13.5 M | $1.4 M |
| Ave. Impressions/Ad | 5 K - 15 K | 32 K - 283 K | 65 K |
| Ave. USD Spend/Ad | $74 - $373 | $1 K | $885 |

Table 4: Federal Candidate Only Results

We can see the expected increases in the number of ads on all three platforms as the U.S. midterm elections approach. On Facebook's platform there is an increase in Connect ads, and on Twitter there is an increase in Move ads. Both of these are related to sophisticated "get out the vote" efforts that many groups have deployed. These Move ads include images which show specific polling place addresses, and websites that provide polling place directions and information. The Connect ads often provide users with instructions on how they can volunteer to help with early and day-of voter turnout efforts. The spending spikes on Facebook's platform can be attributed to a few unknown sponsors that we could not link to a legally registered entity but that were likely quasi for-profit advertisers, which we will discuss further later in the paper. The spending spikes on Twitter's platform can be attributed to candidates who ran a few ads with larger budgets.

5.2 Federal Candidate Comparison
In order to understand how political advertising across these platforms differs, we attempted to create a comparable subset of advertisers and ads. This is difficult because each platform has slightly different criteria for inclusion. To do this, we present results only for advertising paid for by candidates for federal office, which was the broadest set that was reliably included in all three archives. Note that this does not include ads by current officeholders who are not seeking re-election, or ads that merely mention a federal candidate but are paid for by another party. Results for these advertisers are presented in Table 4.

Figure 4: Platform Ad Count By Ad Type By Week

Figure 5: Platform Spend By Advertiser Type By Week

Figure 6: Federal Candidate ads by Size

Figure 7: Federal Candidate Spend by Ad Type

Table 4 shows that Facebook is the platform with the broadest appeal to federal candidate advertisers, with far more advertisers and ads than Google. However, political advertising by this group on Google appears to generate more spend and possibly more impressions than ads on Facebook. The average ad size on Facebook, in terms of both impressions and spend, is the smallest based on our minimum estimates, indicating that advertisers are running smaller, likely more targeted ads on Facebook. These small ads on Facebook are what are called micro-targeted, which we define as less than 1000 impressions or a spend of less than $100. For Facebook, micro-targeted ads make up 81% of the overall number of ads for federal candidates in our dataset. For Twitter this number is 62%, and for Google it is 54%. Figure 6 shows the share of ads by size of spend, and here we can begin to see how federal candidates use these platforms in different ways. We note that the distribution of ads by size for federal candidates in Figure 6 is very similar to the overall distribution of ads by size seen in Figure 1.
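The micro-targeting definition above can be sketched as a simple predicate. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes a single impressions value and a single spend value per ad (for platforms that only publish ranges, a choice such as the lower bound would have to be made first).

```python
from typing import Iterable, Tuple


def is_micro_targeted(impressions: float, spend_usd: float) -> bool:
    """Apply the paper's stated definition: fewer than 1000 impressions
    OR a spend of less than $100."""
    return impressions < 1000 or spend_usd < 100


def micro_targeted_share(ads: Iterable[Tuple[float, float]]) -> float:
    """Fraction of (impressions, spend_usd) pairs that are micro-targeted.

    Returns 0.0 for an empty input (an assumption made for convenience).
    """
    ads = list(ads)
    if not ads:
        return 0.0
    hits = sum(1 for impressions, spend in ads if is_micro_targeted(impressions, spend))
    return hits / len(ads)


if __name__ == "__main__":
    # Hypothetical ads: (impressions, spend in USD)
    sample = [(500, 250.0), (5000, 50.0), (5000, 500.0), (20000, 1200.0)]
    print(micro_targeted_share(sample))
```

Because the definition is a disjunction, an ad with many impressions but a very small spend still counts as micro-targeted.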

Figure 7 shows the relative spend on the different ad platforms, where we see very different percentages for each type of ad. Commercial ads are not shown in this figure because there were too few of them to be visible. Particularly of note is the fact that ads seeking donations were far more common on the Google platform, and ads seeking to spread a message ('Inform') were much more common on Facebook.

Seeing these differences in both ad size and the types of ads that were run, we wanted to understand whether advertisers were trying to reach different geographic audiences with different types of ads. To do this, we compared the number of regions in which various ads had impressions on Facebook and Twitter, and the number of regions targeted for Google. Figure 8 shows that a variation in targeting strategy is visible on Facebook and Twitter. On Facebook, 'Move' ads that encouraged people to attend a rally, volunteer for a candidate, or take part in some other in-person activity were viewed on average in 4 regions, while 'Donate' ads were viewed in 24 regions on average. This makes a certain amount of intuitive sense: people are willing to travel only so far to attend a rally, but can donate to candidates anywhere in the United States.


Figure 8: CDF of Regions by Ad Type for Federal Candidates

5.3 Ad Targeting
One of the deficiencies of the Facebook political ad archive is that, while it does share geographic and demographic information about who saw a particular ad, we have no way of knowing how that ad was targeted. However, we were able to connect our dataset, which contains information about who consumed ads, with one published by ProPublica, which contains some data about how ads were targeted. The ProPublica data was collected by a browser plugin operated by ProPublica, which anyone can install. ProPublica's browser plugin [22] uses a supervised Natural Language Processing (NLP) classifier to detect political ads, in addition to allowing users to manually classify ads they see as political. The browser plugin then collects the partial ad targeting explanations Facebook provides by automatically clicking on the "Why am I seeing this?" button for political ads, and sends them to ProPublica to make public.

We first provide a brief background on Facebook's audience targeting options [34]. Facebook offers prospective advertisers a plethora of options. First, advertisers can target users based on age, gender, location, and the languages they speak. Second, advertisers can choose to send their ads to users in a custom audience or lookalike audience. Custom audiences contain a list of identifiers of specific users. Advertisers can use various types of data to create a custom audience list, ranging from specifying the emails, phone numbers, or physical addresses of people they want to reach, to users that have visited their website, installed their mobile application, or liked their Facebook Page. Lookalike audiences allow advertisers to let Facebook choose to whom to send their ads based on previous campaigns. Finally, advertisers can choose, from a long list of targeting attributes, the characteristics they want users who receive their ads to have (e.g., users interested in the Catholic Church). Targeting attributes are categorized into types such as demographics, behaviors, and interests. Advertisers can choose multiple attributes to target.

Figure 9: ProPublica Spend by Ad Type

A prior study by Andreou et al. [2] reverse-engineered what Facebook chooses to show, and the limitations of the ad targeting explanation Facebook provides. This study showed that ad explanations are incomplete: each explanation shows at most one targeting attribute (plus age/gender/location information), regardless of how many attributes the advertisers use. This means that explanations reveal only part of the targeting attributes that were used, providing us, and the users, with an incomplete picture of the attributes that advertisers were using. However, in the same study, the authors performed a number of controlled experiments that suggest, but do not conclusively prove, that there is a logic behind which attributes appear in an explanation and which do not. Given a targeting audience A obtained from two attributes a1 and a2, if a1 and a2 come from different attribute categories (e.g., Demographic, Behavior, Interest), the attribute shown follows a specific precedence (Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors). If a1 and a2 come from the same attribute category, the one that appears in the explanation is the one with the highest estimated audience size. This will result in a systematic under-counting of lower-priority targeting types.
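The selection logic suggested by those controlled experiments can be sketched as follows. This is only an illustration of the rule as described in the text, not Facebook's implementation: the category keys, the `Attribute` type, and the generalization from two attributes to a list are all assumptions made here.

```python
from dataclasses import dataclass
from typing import Sequence

# Precedence as described in the text; a lower number means higher precedence.
# The category keys themselves are hypothetical labels.
CATEGORY_PRECEDENCE = {
    "demographics": 0,  # includes age/gender/location
    "interests": 1,
    "pii_list": 2,
    "behaviors": 3,
}


@dataclass(frozen=True)
class Attribute:
    name: str
    category: str       # one of the CATEGORY_PRECEDENCE keys
    audience_size: int  # estimated audience size for this attribute


def shown_attribute(attributes: Sequence[Attribute]) -> Attribute:
    """Return the single attribute an explanation would reveal.

    Across categories the highest-precedence category wins; within a
    category the attribute with the largest estimated audience wins.
    """
    if not attributes:
        raise ValueError("at least one targeting attribute is required")
    return min(
        attributes,
        key=lambda a: (CATEGORY_PRECEDENCE[a.category], -a.audience_size),
    )


if __name__ == "__main__":
    a1 = Attribute("Catholic Church", "interests", 1_000_000)
    a2 = Attribute("Frequent travelers", "behaviors", 50_000_000)
    print(shown_attribute([a1, a2]).name)
```

Under this rule a behavior-based attribute is hidden whenever any interest or demographic attribute is also used, which is the source of the under-counting noted above.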

There are two main sources of bias and limitations in ProPublica's dataset. One comes from which users installed ProPublica's plugin and which political ads they were shown. The other comes from the way Facebook provides ad explanations. The ProPublica dataset is the only publicly available source of targeting information for Facebook political ads. Thus, we present these results to provide an initial insight into how Facebook political advertisers are targeting their ads, with the understanding that they are subject to likely biases and limitations.

Figure 10: ProPublica Targeting by Advertiser Type

With these caveats in mind, we proceed to an analysis of the 18,010 ads which we were able to connect between the ProPublica dataset and ours. In Figure 9 we see that different types of ads do indeed rely on different targeting strategies. Of particular note is the divergence between 'Commercial' ads, of which 74% rely on targeting by interest groups, and 'Donate' ads, of which only 24% do. The average ad size did not differ significantly between targeting types, but was significantly larger than the average for the Facebook archive as a whole. We believe this to be an artifact of the collection mechanism, which is biased toward finding larger ads. 91% of ads in the overall ProPublica dataset had some kind of geographic targeting, and 92% had age or gender targeting. Ads in this dataset had on average 41 different targeting parameters, so for Facebook these should be thought of as minimum criteria. By contrast, 58% of ads in the Google archive had no geographic targeting whatsoever, and 70% had neither age nor gender targeting.

In Figure 10 we also see diverging strategies between advertisers, with Political Candidates and PACs making heavy use of custom lists of users, and For Profit and For Profit Media companies relying far more on targeting users by their interests. Campaigns have numerous potential sources from which to compile lists of users. In addition to their own lists of donors and voter rolls, campaigns can rent lists from other candidates [15].

Both Google and Twitter offer advertisers targeting criteria similar to what we have described for Facebook, including custom audiences and lookalike audiences. Both even allow targeting of users based on interests, although they infer these interests in different ways. Both have made transparent demographic and geographic targeting information for ads in their archives, but without other targeting information this is an incomplete picture at best. We encourage Google and Twitter to, at minimum, follow Facebook's example and make transparent to users information about why they have been targeted for the ads that they are seeing.

5.4 New Types of Political Advertisers
5.4.1 For-Profit Media. One advertiser type in particular proved to be an interesting outlier. The category 'For Profit Media' contains advertisers whose ads are not considered traditional news by Facebook (those ads are in a separate part of the archive that we did not include), but have content intended solely to entertain or sway the opinion of the viewer. Over the Facebook dataset as a whole, the average ad sponsor ran ads on 16 pages. Advertisers in the for-profit media category, however, ran ads on 32 pages on average. We have examined many of these for-profit media companies to understand why they are running ads across many Facebook pages. What we have found in numerous instances is unknown for-profit media companies that appear to be creating disingenuous communities, presented as "grassroots movements", to target different demographics and interests with a combination of paid and organic political messaging.

A good example of this type of advertiser is "New American Media Group LLC". This ad sponsor ran left-leaning ads on 10 different pages. These pages were designed to appeal to different demographics ("Melanin" for people of color, "The Soldier Network" for veterans, "Raising Tomorrow" for parents, etc.), but often ran the same content on multiple pages. While this LLC has an extremely similar name to a now-defunct genuine left-leaning media outlet (New America Media), it appears to have no connection to that prior group, and also appears to have no activity off of Facebook.

While some advertisers in this category were fairly traditional entertainment websites (e.g., Comedy Central), some were "for-profit" companies in name only that appeared to exist for no purpose other than to spread a particular political message, and had no way of generating an actual profit. We also discovered "News for Democracy", an LLC that ran left-leaning ads on 14 different Facebook pages, most of which were designed to appeal to groups with traditionally conservative viewpoints, such as "The Holy Tribune". Journalists investigated this LLC and linked it to MotiveAI, which is a liberal political advertising company.

5.4.2 Corporate Astroturfing. Corporations paying for political advertising is not an entirely new phenomenon, and such advertising has traditionally been funded through industry trade groups and PACs. However, the FCC's reporting requirements for U.S. political advertising on television often made this political messaging traceable to the real sponsor. These stricter reporting requirements do not apply to online political advertising, and the ad-hoc reporting requirements that online platforms have enacted are being abused by corporations and industry trade groups to undo transparency efforts.

We discovered in our analysis 355 ads sponsored by "Citizens for Tobacco Rights", which is not a registered company in the U.S. but does disclose on its website and Facebook page that it is operated by cigarette company Philip Morris. However, someone who only saw the Facebook ad disclaimer would not be able to connect the ads to Philip Morris without further investigation. Journalists have found instances of oil and insurance lobbying groups that also provided sponsor names that did not match the legally incorporated entity sponsoring the ads [21]. These organizations are seemingly taking advantage of Facebook's policy of not vetting sponsor names, since some of these entities also ran political ads on Google's ad platform but provided Google with their EIN (tax ID) and the correct legally incorporated names of their organizations [21].

5.5 Discussion
The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives such that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts, and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion in the different platforms' archives and what can be done to address these issues.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online sponsorship disclosure policies. As we discussed in Section 5.4.1, it is extremely easy for groups such as 'New American Media' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1], which struck down restrictions on election spending by for-profit corporations. However, political messaging advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business which paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so with no intent of making a profit. As a private company, they do not need to publicly disclose their investors in the way that PACs are required to disclose their donors. Then the for-profit company can advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead when providing the 'ad sponsor' string associated with their ads, either intentionally or accidentally. Thus, it is effectively free to circumvent Facebook's transparency implementation. We see numerous instances of this occurring on Facebook's platform. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J. Trump For President, Inc. sponsored ads on both the Donald J. Trump page and the Mike Pence page. However, when sponsoring ads on the Donald J. Trump page the organization is known as 'Donald J Trump For President Inc', and when sponsoring ads on the Mike Pence page it is known as 'Donald J Trump For President Inc'. Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.
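One way an analyst might cope with accidental variation in sponsor strings is to group them under a normalized key. The sketch below is an illustration only and is not described in the paper: the normalization rules (case folding, stripping punctuation, collapsing whitespace) and the function names are assumptions, and the example strings are hypothetical variants.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set


def normalize_sponsor(name: str) -> str:
    """Build a comparison key: case-fold, drop punctuation, collapse spaces."""
    without_punct = re.sub(r"[^\w\s]", "", name.casefold())
    return " ".join(without_punct.split())


def group_sponsors(names: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each normalized key to the set of raw sponsor strings sharing it."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for raw in names:
        groups[normalize_sponsor(raw)].add(raw)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical data-entry variants of one sponsor, plus an unrelated one.
    raw_names = [
        "Donald J. Trump For President, Inc.",
        "DONALD J TRUMP FOR PRESIDENT INC",
        "Citizens for Tobacco Rights",
    ]
    for key, variants in group_sponsors(raw_names).items():
        print(key, sorted(variants))
```

Normalization of this kind only helps with accidental variation; it does nothing against a sponsor that deliberately supplies an unrelated name.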

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (i.e., LLCs, PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that we need more discussion about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content but are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent the advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data, even with other researchers, or even discussing our findings directly with non-U.S. persons.

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require investment of time and money, both in the technology of these platforms and in the corporate culture of the organizations that own them.

6 RELATED WORK
6.1 Online Advertising
Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof-of-concept systems and were not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented their transparency archive, and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising
Analysis of political television ads has been the focus of most prior political advertising studies, likely due to this data being publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS
We have performed an analysis of the ads related to U.S. politics that we were able to collect from Facebook, Google, and Twitter's transparency archives. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES
[1] 2009-03-24. Citizens United v. Federal Election Commission.
[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018, Network and Distributed Systems Security Symposium. San Diego, United States, 18–21.
[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531–1542.
[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency
[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive
[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report
[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506
[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974
[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api
[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127–150.
[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library
[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html
[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81–87.
[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265–278.
[15] Maggie Haberman, Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online
[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474–482.
[17] Shikhar Sakhuja, Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis
[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.
[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554–566. https://doi.org/10.1145/2810103.2813614
[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783
[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules
[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online
[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook
[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.
[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676–703.
[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5–19.
[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them
[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers
[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency
[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html
[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads
[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79–91. https://doi.org/10.1145/2815675.2815694
[33] G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. P. Gummadi, P. Loiseau, and O. Goga. 2018. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In 2018 IEEE Symposium on Security and Privacy (SP). 89–107.
[34] Giridhari Venkatadri, Yabing Liu, Athanasios Andreou, Oana Goga, Patrick Loiseau, Alan Mislove, and Krishna P. Gummadi. 2018. Privacy Risks with Facebook's PII-based Targeting: Auditing a Data Broker's Advertising Interface. In Proceedings of the IEEE Symposium on Security and Privacy (IEEE S&P'18). San Francisco, CA, USA.
[35] David West. 2018. Air wars: television advertising and social media in election campaigns, 1952-2016. SAGE/CQ Press.



Table 2 shows all of the data that we have collected from each of the platforms. Most of the political ads in these archives are from late May 2018 to October 21st, 2018, but there are several older ads from Twitter and Facebook that have been included in their transparency archives. Facebook has the most advertisers, ads, impressions, and spend. However, Facebook also includes many political issue ads in their transparency archive that are not included in Google and Twitter's transparency archives, so this is not a fair comparison of political advertising activity across all three platforms. An important difference between the datasets is that while Facebook and Twitter are publishing breakdowns of impressions along geographic and demographic lines, Google is instead publishing geographic and demographic targetings. These should not be considered equivalent. Below, in the analysis section, we will present a more accurate comparison of political advertising activity across all three platforms.

Additionally, we use a dataset published by ProPublica of political ads that have been viewed by users who have installed browser extensions that automatically collected advertisements on their Facebook pages and sent them to ProPublica's servers [23]. Table 3 provides an overview of this dataset. We were able to connect ads in the ProPublica dataset to ads in our dataset of archived political ads by mapping the ad IDs used in the ProPublica dataset to the ad archive IDs used in the archive. To do this, we scraped Facebook's web-based political ad archive, as both the ad IDs and ad archive IDs were available there. Each record in this dataset contains, among other things, the text of the ad, the various targetings received by the different users who saw the ad, the page associated with the ad, and the 'Paid by' ad sponsor string associated with the ad. Of the 33,308 ads in the ProPublica dataset with a creation date after May 7th, 2018, the official start of the Facebook dataset, we were able to find 18,010. Because the users who contribute data to this dataset are self-selecting, these ads should not be considered a representative sample of ads in the larger Facebook Ad Archive. Among other things, the average ad spend on ads in this dataset was $644, compared to $107 for the larger dataset.
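The linkage described above amounts to a two-step lookup: ProPublica ad ID to archive ID, then archive ID to archive record. The sketch below illustrates that join under stated assumptions: the record layouts, field names (`"id"`), and function name are hypothetical, and the ID mapping is assumed to have already been scraped into a dictionary.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def link_datasets(
    propublica_ads: List[Record],
    ad_id_to_archive_id: Dict[str, str],
    archive_ads: Dict[str, Record],
) -> List[Tuple[Record, Record]]:
    """Pair each ProPublica ad with its archive record, when one exists.

    Ads with no known archive ID, or whose archive ID is missing from our
    archive dataset, are skipped rather than guessed at.
    """
    linked = []
    for ad in propublica_ads:
        archive_id = ad_id_to_archive_id.get(ad["id"])
        if archive_id is None or archive_id not in archive_ads:
            continue
        linked.append((ad, archive_ads[archive_id]))
    return linked


if __name__ == "__main__":
    # Hypothetical toy records
    propublica = [{"id": "pp-1", "targeting": ["Interest"]}, {"id": "pp-2", "targeting": []}]
    id_map = {"pp-1": "arch-9"}
    archive = {"arch-9": {"sponsor": "Example Sponsor", "spend_range": (0, 99)}}
    pairs = link_datasets(propublica, id_map, archive)
    print(len(pairs), "of", len(propublica), "ads linked")
```

Skipping unmatched ads mirrors the situation in the text, where only a subset of the ProPublica ads could be found in the archive.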

As part of our analysis, we manually categorized the top advertisers on all three platforms. We categorized these advertisers by organization type (political candidate, Political Action Committee (PAC), Union, For Profit, etc.). For Facebook, we were able to classify the organizations of the advertisers who were responsible for at least 75% of the total number of ads in the Facebook archive. For Google, we labeled the organization of the top advertisers who were responsible for 80% of the total number of ads, and for Twitter we were able to label all 88 advertisers with their organization type. In total, we were able to categorize 12,833 of the top ad sponsors. If we were not able to categorize an advertiser, it is marked as 'Unknown'.

Figure 1: Distribution of ads by size

We also classified the ads themselves into 5 categories: Inform, Connect, Donate, Move, or Commercial. Inform ads seek to persuade the viewer but do not make an explicit ask. Connect ads seek the user's contact information. Donate ads seek the user's money. Move ads attempt to motivate the user to take some action in the physical world, such as attending a rally or voting. Commercial ads seek to sell the user goods or services. We classified the ads based on the outgoing links from the ads. Ads that had no outgoing links were always classified as Inform ads, as they could not have any further ask from the user. Ads that linked directly to third-party sites for event management (eventbrite.com), contact management (Google Docs), or payments management (actblue.com) were solely classified as Move, Connect, or Donate ads, respectively. Ads that linked to general campaign sites were usually multiple-classed as some combination of the three, as these ads and pages typically made multiple asks. Ads by For Profit Media organizations were classified as Inform ads, as these advertisers do not sell goods or services to users. Ads by For Profit organizations that linked to store sites or sites selling services were classified as Commercial. We were able to categorize 907,840 ads with these methods. Heavy use of third-party service providers by advertisers was extremely helpful in making these classifications. If we were not able to categorize an ad, it was marked as 'Unknown'. We validated this method of ad categorization by taking a random sample of 300 categorized ads from each platform and manually verifying them. The error rate for Facebook was 4%, for Google 3.7%, and for Twitter 3.7%.
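The link-based rules above can be sketched as a small domain lookup. This is an illustration under assumptions rather than the classifier we ran: the host used for Google Docs (`docs.google.com`), the function name `classify_ad`, and the set-of-labels return type are hypothetical. The handling of general campaign sites, which depends on the multiple asks made on the landing page, and the advertiser-type rules for For Profit organizations are not modelled, so such ads fall through to 'Unknown' here.

```python
from urllib.parse import urlparse

# Third-party service domains mapped to the single category they imply.
DOMAIN_TO_CATEGORY = {
    "eventbrite.com": "Move",      # event management
    "docs.google.com": "Connect",  # contact management (assumed host for Google Docs)
    "actblue.com": "Donate",       # payments management
}


def classify_ad(outgoing_links):
    """Return the set of categories implied by an ad's outgoing links."""
    if not outgoing_links:
        # No outgoing link means no further ask of the user.
        return {"Inform"}
    categories = set()
    for link in outgoing_links:
        host = (urlparse(link).hostname or "").lower()
        for domain, category in DOMAIN_TO_CATEGORY.items():
            # Match the domain itself or any of its subdomains.
            if host == domain or host.endswith("." + domain):
                categories.add(category)
    return categories or {"Unknown"}
```

Returning a set rather than a single label leaves room for the multiple-classed ads described above.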

A limitation that applies to all our datasets is that we do not know when the spend and impressions for each ad occurred during the lifetime of the ad. Some ads run for several weeks and some for only a day, but in either case we attribute their entire spend and total impressions to the creation date of the ad.
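To make this attribution concrete, the following minimal sketch buckets each ad's entire spend into the calendar week of its creation date. The record fields `created` and `spend`, the function `weekly_spend`, and the choice of Monday as the start of a week are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_spend(ads):
    """Sum each ad's whole spend into the week (Monday start) of its creation date."""
    totals = defaultdict(float)
    for ad in ads:
        created = ad["created"]
        # date.weekday() is 0 for Monday, so this rewinds to the week's Monday.
        week_start = created - timedelta(days=created.weekday())
        totals[week_start] += ad["spend"]
    return dict(totals)


# Hypothetical records: an ad that runs for weeks still lands in one bucket.
example = [
    {"created": date(2018, 10, 1), "spend": 100.0},
    {"created": date(2018, 10, 3), "spend": 50.0},
    {"created": date(2018, 10, 8), "spend": 10.0},
]
by_week = weekly_spend(example)
```

Because the whole spend is assigned to a single week, a long-running ad inflates its creation week and contributes nothing to later weeks.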

5 RESULTS

We calculate the total spend and impression minimum and maximum for Facebook ads by summing, respectively, the smallest and largest value of the range given for each ad. For Google, advertiser weekly spend data was aggregated for all advertisers, so we did not have to estimate that number. For Twitter, exact numbers for impressions and spend were available, so no estimation was needed. We also note that we are only able to collect a subset of political advertisements from Facebook's transparency archive due to accessibility issues with their beta API. We stress that because the criteria for inclusion in these archives differed on the different platforms, the figures on relative proportions of ad types and advertiser types should be seen as a reflection of what the platforms chose to make transparent, in addition to what is organically present on these platforms.
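The bounds computation for range-valued data can be illustrated with a short sketch. The function name `total_bounds` and the example ranges are hypothetical; the only step taken from the text is summing the per-ad lower bounds and, separately, the per-ad upper bounds.

```python
def total_bounds(ranges):
    """Given per-ad (low, high) ranges, return (sum of lows, sum of highs)."""
    ranges = list(ranges)
    lower = sum(low for low, _ in ranges)
    upper = sum(high for _, high in ranges)
    return lower, upper


# Three hypothetical ads reported only as impression ranges.
impression_ranges = [(0, 999), (1000, 4999), (0, 999)]
bounds = total_bounds(impression_ranges)
```

The true total lies somewhere between the two sums, and the gap grows with the number of ads, which is why the Facebook totals are reported as wide intervals.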

With that in mind, we can see clear differences between the platforms. Of particular note is the difference in ad size visible in Figure 1, with Facebook having a much larger share of the smallest size of ad. Also of note is the differing prevalence of types of advertisers in Figure 3, with PACs making up a much larger percentage of spend on Google compared with the other platforms.

Platform  Total Ads  Total Sponsors  Total Pages  Impressions     Spend            First Ad Date        Last Ad Date
Facebook  126 M      24 K            38 K         735 B - 2112 B  $135 M - $567 M  July 14th, 2014      October 21st, 2018
Google    41 K       616             NA           13 B - 116 B    $45 M            May 31st, 2018       October 21st, 2018
Twitter   1808       88              NA           118 M           $16 M            December 21st, 2016  October 21st, 2018

Table 2: Overall Datasets

Total Ads          81,052
Total Pages        2,363
Total Ad Sponsors  2,395
Earliest Ad Date   July 31st, 2017
Latest Ad Date     October 18th, 2018

Table 3: ProPublica Political Advertisements From Facebook

Figure 2: Distribution of ads by type

Figure 3: Distribution of spend by advertiser type

5.1 Data Over Time

The time period during which we were collecting data coincided with the 2018 midterm elections in the United States. Thus, we were able to observe changing patterns in spend leading up to a major election. Figure 5 shows spend by week for the 5 month period leading up to the election, and Figure 4 shows raw ad count for the same period. We note that our data, particularly for Facebook spend, is right-censored for the final two weeks. This is caused by Facebook's API limitations, which only enable us to recheck ad spends weekly. Thus, newly created ads have likely not spent much of their budget when we initially discover them. This right-censoring effect also likely affects Google and Twitter to a lesser degree, due to ads with larger budgets that take several days to spend down completely. This can be corrected by periodically rechecking the ads until they have all spent their budgets, which is normally within a week. If the paper is accepted, we will update the data to include ads up to the U.S. midterm elections.

Results              Facebook       Google        Twitter
Total Advertisers    1 K            534           54
Total Ads            161 K          15 K          1 K
Total Impressions    800 M - 24 B   280 M - 3 B   100 M
Total USD Spend      $12 M - $60 M  $135 M        $14 M
Ave. Impressions/Ad  5 K - 15 K     32 K - 283 K  65 K
Ave. USD Spend/Ad    $74 - $373     $1 K          $885

Table 4: Federal Candidate Only Results

We can see the expected increases in the number of ads on all three platforms as the U.S. midterm elections approach. On Facebook's platform there is an increase in connect ads, and on Twitter there is an increase in move ads. Both of these are related to sophisticated "get out the vote" efforts that many groups have deployed. These move ads include images with specific polling place addresses and websites that provide polling place directions and information. The connect ads often provide users with instructions on how they can volunteer to help with early and day-of voter turnout efforts. The cause of the spending spikes for Facebook's platform can be attributed to a few unknown sponsors that we could not link to a legally registered entity, but that were likely quasi for-profit advertisers, which we will discuss further later in the paper. The spending spikes on Twitter's platform can be attributed to candidates who ran a few ads with larger budgets.

5.2 Federal Candidate Comparison

In order to understand how political advertising across these platforms differs, we attempted to create a comparable subset of advertisers and ads. This is difficult because each platform has slightly different criteria for inclusion. To do this, we present results for advertising paid for only by candidates for federal office, which was the broadest set that was reliably included in all three archives. Note that this does not include ads by current officeholders who are not seeking re-election, or ads that merely mention a federal candidate but are paid for by another party. Results for these advertisers are presented in Table 4.

Figure 4: Platform Ad Count By Ad Type By Week

Figure 5: Platform Spend By Advertiser Type By Week

Figure 6: Federal Candidate ads by Size

Figure 7: Federal Candidate Spend by Ad Type

Table 4 shows that Facebook is the platform with the broadest appeal to federal candidate advertisers, with far more advertisers and ads than Google. However, political advertising by this group on Google appears to generate more spend and possibly more impressions than ads on Facebook. The average ad size on Facebook, in terms of impressions and spend, is the smallest based on our minimum estimates, indicating that advertisers are running smaller, likely more targeted ads on Facebook. These small ads on Facebook are what are called microtargeted, which we define as fewer than 1,000 impressions or a spend of less than $100. For Facebook, microtargeted ads make up 81% of the overall number of ads for federal candidates in our dataset. For Twitter this number is 62%, and for Google it is 54%. Figure 6 shows the share of ads by size of spend, and here we can begin to see how federal candidates use these platforms in different ways. We note that the distribution of ads by size for federal candidates in Figure 6 is very similar to the overall distribution of ads by size seen in Figure 1.
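The microtargeting definition above translates directly into a predicate. The sketch below is illustrative only: the names `is_microtargeted` and `microtargeted_share` are hypothetical, and it assumes a single impressions value and a single spend value per ad, so for range-valued data the caller must first choose which bound to use.

```python
def is_microtargeted(impressions, spend_usd):
    """Fewer than 1,000 impressions, or a spend of less than $100."""
    return impressions < 1000 or spend_usd < 100


def microtargeted_share(ads):
    """Fraction of (impressions, spend_usd) pairs that count as microtargeted."""
    ads = list(ads)
    if not ads:
        return 0.0
    hits = sum(is_microtargeted(imp, spend) for imp, spend in ads)
    return hits / len(ads)


# Hypothetical ads as (impressions, spend in USD).
sample = [(500, 50), (5000, 80), (20000, 1500), (999, 100)]
share = microtargeted_share(sample)
```

Note that the definition is a disjunction, so an ad with many impressions still counts as microtargeted when its spend is below $100.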

Figure 7 shows the relative spend on different ad platforms, where we see very different percentages for types of ads. Commercial ads are not shown in this figure because there were too few commercial ads to be visible. Particularly of note is the fact that ads seeking donations were far more common on the Google platform, and ads seeking to spread a message ('Inform') were much more common on Facebook.

Seeing these differences in both ad size and the types of ads that were run, we wanted to understand if advertisers were trying to reach different geographic audiences with different types of ads. To do this, we compared the number of regions in which various ads had impressions on Facebook and Twitter, and the number of regions targeted for Google. Figure 8 shows that a variation in targeting strategy is visible on Facebook and Twitter. On Facebook, 'Move' ads that encouraged people to attend a rally, volunteer for a candidate, or take part in some other in-person activity were viewed on average in 4 regions, while 'Donate' ads were viewed in 24 regions on average. This makes a certain amount of intuitive sense: people are willing to travel only so far to attend a rally, but can donate to candidates anywhere in the United States.

Figure 8: CDF of Regions by Ad Type for Federal Candidates

5.3 Ad Targeting

One of the deficiencies of the Facebook political ad archive is that, while it does share geographic and demographic information about who saw a particular ad, we have no way of knowing how that ad was targeted. However, we were able to connect our dataset, containing information about who consumed ads, with one published by ProPublica, which contains some data about how ads were targeted. The ProPublica data was collected by a browser plugin operated by ProPublica, which anyone can install. ProPublica's browser plugin [22] uses a supervised Natural Language Processing (NLP) classifier to detect political ads, in addition to allowing users to manually classify ads they see as political. The browser plugin then collects the partial ad targeting explanations Facebook provides by automatically clicking on the "Why am I seeing this?" button for political ads, and sends them to ProPublica to make public.

We first provide a brief background on Facebook's audience targeting options [34]. Facebook exposes a plethora of options to prospective advertisers. First, advertisers can target users based on age, gender, location, and the languages they speak. Second, advertisers can choose to send their ads to users in a custom audience or lookalike audience. Custom audiences contain a list of identifiers of specific users. Advertisers can use various types of data to create a custom audience list, ranging from specifying the emails, phone numbers, or physical addresses of people they want to reach, to users that have visited their website, installed their mobile application, or liked their Facebook Page. Lookalike audiences allow advertisers to let Facebook choose to whom to send their ads based on previous campaigns. Finally, advertisers can choose, from a long list of targeting attributes, the characteristics they want users who receive their ads to have (e.g., users interested in Catholic Church). Targeting attributes are categorized in types such as demographics, behaviors, and interests. Advertisers can choose multiple attributes to target.

Figure 9: ProPublica Spend by Ad Type

A prior study by Andreou et al. [2] reverse engineered what Facebook chooses to show in the ad targeting explanations it provides, and the limitations of those explanations. This study showed that ad explanations are incomplete: each explanation shows at most one targeting attribute (plus age/gender/location information), regardless of how many attributes the advertisers use. This means that explanations reveal only part of the targeting attributes that were used, providing us, and the users, with an incomplete picture of the attributes that advertisers were using. However, in the same study, the authors performed a number of controlled experiments that suggest, but do not conclusively prove, that there is a logic behind which attributes appear in an explanation and which do not. Given a targeting audience A obtained from two attributes a1 and a2, if a1 and a2 come from different attribute categories (e.g., Demographic, Behavior, Interest, etc.), the attribute shown follows a specific precedence (Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors). If a1 and a2 come from the same attribute category, the one that appears in the explanation is the one with the highest estimated audience size. This will result in a systematic under-counting of lower priority targeting types.
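The selection logic suggested by those experiments can be written as a two-level sort key. This is a sketch of the precedence as summarized above, not Facebook's implementation: the function name `shown_attribute`, the tuple layout, and the category labels are hypothetical, and the Age/Gender/Location information that is shown in addition to the selected attribute is left out.

```python
# Categories from highest to lowest precedence, as reported in [2].
PRECEDENCE = ["Demographics", "Interests", "PII-based lists", "Behaviors"]


def shown_attribute(attributes):
    """Pick the attribute an explanation would show.

    attributes: non-empty list of (name, category, estimated_audience_size).
    Higher-precedence categories win; within a category, the attribute
    with the largest estimated audience size wins.
    """
    rank = {category: i for i, category in enumerate(PRECEDENCE)}
    return min(attributes, key=lambda a: (rank[a[1]], -a[2]))


# Hypothetical targeting: a large behavior audience loses to a smaller interest.
used = [
    ("Visited advertiser website", "Behaviors", 5_000_000),
    ("Catholic Church", "Interests", 1_000_000),
]
shown = shown_attribute(used)
```

Under this rule a behavior attribute is reported only when no demographic, interest, or PII-based attribute was used alongside it, which is the source of the under-counting noted above.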

There are two main sources of biases and limitations in ProPublica's dataset. One comes from which users installed ProPublica's plugin and which political ads they were shown. Another is from the way Facebook provides ad explanations. The ProPublica dataset is the only publicly available source of targeting information for Facebook political ads. Thus, we present these results to provide an initial insight into how Facebook political advertisers are targeting their ads, with the understanding of likely biases and limitations.

Figure 10: ProPublica Targeting by Advertiser Type

With these caveats in mind, we proceed to an analysis of the 18,010 ads which we were able to connect between the ProPublica dataset and ours. In Figure 9, we see that different types of ads do indeed rely on different targeting strategies. Of particular note is the divergence of 'Commercial' ads, of which 74% rely on targeting by interest groups, and of 'Donate' ads, of which only 24% do. The average ad size did not differ significantly between targeting types, but was significantly larger than the average for the Facebook archive as a whole. We believe this to be an artifact of the collection mechanism, which is biased toward finding larger ads. 91% of ads in the overall ProPublica dataset had some kind of geographic targeting, and 92% had age or gender targeting. On average, ads in this dataset had 4.1 different targeting parameters, so for Facebook these should be thought of as minimum criteria. By contrast, 58% of ads in the Google archive had no geographic targeting whatsoever, and 70% had neither age nor gender targeting.

In Figure 10, we also see diverging strategies between advertisers, with Political Candidates and PACs making heavy use of custom lists of users, and For Profit and For Profit Media companies relying far more on targeting users by their interests. Campaigns have numerous potential sources from which to compile lists of users. In addition to their own lists of donors and voter rolls, campaigns can rent lists from other candidates [15].

Both Google and Twitter offer advertisers similar targeting criteria to what we have described for Facebook, including custom audiences and lookalike audiences. Both even allow targeting of users based on interests, although they infer these interests in different ways. Both have made transparent demographic and geographic targeting information for ads in their archive, but without other targeting information this is an incomplete picture at best. We encourage Google and Twitter to, at minimum, follow Facebook's example and make transparent to users information about why they have been targeted for ads that they are seeing.

5.4 New Types of Political Advertisers

5.4.1 For-Profit Media. One advertiser type in particular proved to be an interesting outlier. The category 'For Profit Media' contains advertisers whose ads are not considered traditional news by Facebook (those ads are in a separate part of the archive that we did not include), but have content intended solely to entertain or sway the opinion of the viewer. Over the Facebook dataset as a whole, the average ad sponsor ran ads on 1.6 pages. Advertisers in the for-profit media category, however, ran ads on 3.2 pages on average. We have examined many of these for-profit media companies to understand why they are running ads across many Facebook pages. What we have found, in numerous instances, is unknown for-profit media companies that appear to be creating disingenuous communities that pose as "grassroots movements" to target different demographics and interests with a combination of paid and organic political messaging.

A good example of this type of advertiser is "New American Media Group LLC". This ad sponsor ran left leaning ads on 10 different pages. These pages were designed to appeal to different demographics ("Melanin" for people of color, "The Soldier Network" for Veterans, "Raising Tomorrow" for parents, etc.), but often run the same content on multiple pages. While this LLC has an extremely similar name to a now-defunct genuine left-leaning media outlet (New America Media), it appears to have no connection to that prior group, and also appears to have no activity off of Facebook.

While some advertisers in this category were fairly traditional entertainment websites (e.g., Comedy Central), some were "for-profit" companies in name only that appeared to exist for no purpose other than to spread a particular political message and had no way of generating an actual profit. We also discovered "News for Democracy", an LLC that ran left leaning ads on 14 different Facebook pages, most of which were designed to be appealing to groups with traditionally conservative viewpoints, such as "The Holy Tribune". Journalists investigated this LLC and linked it to MotiveAI, which is a liberal political advertising company.

5.4.2 Corporate Astroturfing. Corporations paying for political advertising is not an entirely new phenomenon, and such advertising has traditionally been funded through industry trade groups and PACs. However, the reporting requirements by the FCC for U.S. political advertising on television often made this political messaging traceable to the real sponsor. These stricter reporting requirements do not apply to online political advertising, and the ad-hoc reporting requirements that online platforms have enacted are being abused by corporations and industry trade groups to undo transparency efforts.

We discovered in our analysis 355 ads sponsored by "Citizens for Tobacco Rights", which is not a registered company in the U.S., but does disclose on its website and Facebook page that it is operated by cigarette company Philip Morris. However, someone who only saw the Facebook ad disclaimer would not be able to connect the ads to Philip Morris without further investigation. Other journalists have found instances of oil and insurance lobbying groups that also provided sponsor names that did not match the legally incorporated entity sponsoring the ads [21]. These organizations are seemingly taking advantage of Facebook's policy of not vetting sponsor names, since some of these entities also ran political ads on Google's ad platform but provided Google with their EIN (tax ID) and the correct legally incorporated names of their organizations [21].

5.5 Discussion

The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives such that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts, and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion into the different platforms' archives, and what can be done to improve these issues.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online transparency sponsorship disclosure policies. As we discussed in the For-Profit Media section, it's extremely easy for groups such as 'New American Media' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1], which struck down restrictions on election spending by for-profit corporations. However, political messaging advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business which paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so, with no intent of making a profit. As a private company, they do not need to publicly disclose their investors in the way that PACs are required to disclose their donors. Then the for-profit company can advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead when providing the 'ad sponsor' string associated with their ads, either intentionally or accidentally. Thus, it is effectively free to circumvent Facebook's transparency implementation. We see numerous instances on Facebook's platforms of this occurring. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J. Trump For President, Inc. sponsored ads on both the Donald J. Trump page and the Mike Pence page. However, when sponsoring ads on the Donald J. Trump page, the organization is known as 'Donald J Trump For President Inc', and when sponsoring ads on the Mike Pence page it is known as 'Donald J Trump For President Inc'. Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (e.g., LLC, PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that we need more discussion about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content that are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data, even with other researchers, or even discussing our findings directly with non-U.S. Persons.

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require investment of time and money, both in the technology of these platforms and the corporate culture of the organizations that own them.

6 RELATED WORK

6.1 Online Advertising

Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof of concept systems and not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented their transparency archive, and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising

Analysis of political television ads has been the focus of most prior political advertising studies, likely due to this data being publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest related to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS

We have performed an analysis of the ads that we were able to collect from Facebook, Google, and Twitter's transparency archives related to U.S. politics. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES

[1] 2009-03-24. Citizens United v. Federal Election Commission.
[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018, Network and Distributed Systems Security Symposium. San Diego, United States, 18-21.
[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531-1542.
[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency
[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive
[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report
[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506
[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974
[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api
[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127-150.
[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library
[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html
[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81-87.
[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265-278.
[15] Maggie Haberman, Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online
[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474-482.
[17] Shikhar Sakhuja, Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis
[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.
[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554-566. https://doi.org/10.1145/2810103.2813614
[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783
[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules
[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online
[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook
[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.
[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676-703.
[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5-19.
[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them
[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers
[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency
[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html
[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads
[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79-91. https://doi.org/10.1145/2815675.2815694

[33] G Venkatadri A Andreou Y Liu A Mislove K P Gummadi P Loiseau and OGoga 2018 Privacy Risks with Facebookrsquos PII-Based Targeting Auditing a DataBrokerrsquos Advertising Interface In 2018 IEEE Symposium on Security and Privacy(SP) 89ndash107

[34] Giridhari Venkatadri Yabing Liu Athanasios Andreou Oana Goga PatrickLoiseau Alan Mislove and Krishna P Gummadi 2018 Privacy Risks withFacebookrsquos PII-based Targeting Auditing a Data Brokerrsquos Advertising InterfaceIn Proceedings of the IEEE Symposium on Security and Privacy (IEEE SampPrsquo18) San

11

Francisco CA USA[35] David West 2018 Air wars television advertising and social media in election

campaigns 1952-2016 SAGECQ Press

12

  • Abstract
  • 1 Introduction
  • 2 Background
  • 3 Data Collection Methodology
    • 3.1 Facebook
    • 3.2 Google
    • 3.3 Twitter
  • 4 Datasets
  • 5 Results
    • 5.1 Data Over Time
    • 5.2 Federal Candidate Comparison
    • 5.3 Ad Targeting
    • 5.4 New Types of Political Advertisers
    • 5.5 Discussion
  • 6 Related Work
    • 6.1 Online Advertising
    • 6.2 Political Advertising
  • 7 Conclusions
  • References

| Platform | Total Ads | Total Sponsors | Total Pages | Impressions | Spend | First Ad Date | Last Ad Date |
|---|---|---|---|---|---|---|---|
| Facebook | 1.26 M | 24 K | 38 K | 7.35 B - 21.12 B | $135 M - $567 M | July 14th, 2014 | October 21st, 2018 |
| Google | 41 K | 616 | NA | 1.3 B - 11.6 B | $45 M | May 31st, 2018 | October 21st, 2018 |
| Twitter | 1,808 | 88 | NA | 118 M | $1.6 M | December 21st, 2016 | October 21st, 2018 |

Table 2: Overall Datasets

| Statistic | Value |
|---|---|
| Total Ads | 81,052 |
| Total Pages | 2,363 |
| Total Ad Sponsors | 2,395 |
| Earliest Ad Date | July 31st, 2017 |
| Latest Ad Date | October 18th, 2018 |

Table 3: ProPublica Political Advertisements From Facebook

Figure 2: Distribution of ads by type

Figure 3: Distribution of spend by advertiser type

| Results | Facebook | Google | Twitter |
|---|---|---|---|
| Total Advertisers | 1 K | 534 | 54 |
| Total Ads | 161 K | 15 K | 1 K |
| Total Impressions | 800 M - 2.4 B | 280 M - 3 B | 100 M |
| Total USD Spend | $12 M - $60 M | $13.5 M | $1.4 M |
| Ave. Impressions/Ad | 5 K - 15 K | 32 K - 283 K | 65 K |
| Ave. USD Spend/Ad | $74 - $373 | $1 K | $885 |

Table 4: Federal Candidate Only Results

5.1 Data Over Time

The time period during which we were collecting data coincided with the 2018 midterm elections in the United States. Thus, we were able to observe changing patterns in spend leading up to a major election. Figure 4 shows raw ad count by week for the 5-month period leading up to the election, and Figure 5 shows spend for the same period. We note that our data, particularly for Facebook spend, is right-censored for the final two weeks. This is caused by Facebook's API limitations, which only allow us to recheck ad spend weekly. Thus, newly created ads have likely not spent much of their budget when we initially discover them. This right-censoring effect also likely affects Google and Twitter to a lesser degree, due to ads with larger budgets that take several days to spend down completely. It can be corrected by periodically rechecking the ads until they have all spent their budgets, which normally happens within a week. If the paper is accepted, we will update the data to include ads up to the U.S. midterm elections.
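The rechecking rule described above can be sketched as a simple filter. This is an illustrative sketch only, not the authors' collection code: the ad IDs, the `first seen` dates, and the one-week settle window (taken from the "normally within a week" observation) are assumptions.

```python
from datetime import date, timedelta

# Assumption: spend is treated as settled one week after an ad is first seen,
# following the "normally within a week" observation in the text.
SETTLE_WINDOW = timedelta(days=7)

def is_right_censored(first_seen: date, last_checked: date) -> bool:
    """Return True if the ad has not yet been rechecked a full window after discovery."""
    return (last_checked - first_seen) < SETTLE_WINDOW

def ads_to_recheck(ads: dict[str, date], today: date) -> list[str]:
    """ads maps a hypothetical ad ID to the date the ad was first seen."""
    return sorted(ad_id for ad_id, seen in ads.items() if is_right_censored(seen, today))

# Made-up example: one ad seen three weeks ago, one seen three days ago.
pending = ads_to_recheck(
    {"ad-1": date(2018, 10, 1), "ad-2": date(2018, 10, 18)},
    today=date(2018, 10, 21),
)
```

Ads returned by `ads_to_recheck` would have their spend treated as provisional until a later check falls outside the window.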

We can see the expected increases in the number of ads on all three platforms as the U.S. midterm elections approach. On Facebook's platform there is an increase in connect ads, and on Twitter there is an increase in move ads. Both of these are related to sophisticated "get out the vote" efforts that many groups have deployed. These move ads include images that show specific polling place addresses, and websites that provide polling place directions and information. The connect ads often provide users with instructions on how they can volunteer to help with early and day-of voter turnout efforts. The spending spikes on Facebook's platform can be attributed to a few unknown sponsors that we could not link to a legally registered entity, but that were likely quasi for-profit advertisers, which we discuss further later in the paper. The spending spikes on Twitter's platform can be attributed to candidates who ran a few ads with larger budgets.

5.2 Federal Candidate Comparison

In order to understand how political advertising differs across these platforms, we attempted to create a comparable subset of advertisers and ads. This is difficult because each platform has slightly different criteria for inclusion. To do this, we present results only for advertising paid for by candidates for federal office, which was the broadest set that was reliably included in all three archives. Note that this does not include ads by current officeholders who are not seeking re-election, or ads that merely mention a federal candidate but are paid for by another party. Results for these advertisers are presented in Table 4.
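As a quick sanity check on how the per-ad averages in Table 4 relate to the totals, the short sketch below divides the rounded Facebook spend range by the rounded Facebook ad count. The variable names are illustrative, and the inputs are the rounded figures printed in the table, so the result only approximates the published $74 - $373 range, which was presumably computed from unrounded totals.

```python
# Rounded Facebook figures as printed in Table 4 (federal candidates only).
total_ads = 161_000
spend_low_usd, spend_high_usd = 12_000_000, 60_000_000

# Dividing each end of the spend range by the ad count gives a per-ad range.
avg_spend_low = spend_low_usd / total_ads    # about 74.5
avg_spend_high = spend_high_usd / total_ads  # about 372.7

print(f"average spend per ad: ${avg_spend_low:.1f} - ${avg_spend_high:.1f}")
```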

Figure 4: Platform Ad Count By Ad Type By Week

Figure 5: Platform Spend By Advertiser Type By Week

Figure 6: Federal Candidate ads by Size

Figure 7: Federal Candidate Spend by Ad Type

Table 4 shows that Facebook is the platform with the broadest appeal to federal candidate advertisers, with far more advertisers and ads than Google. However, political advertising by this group on Google appears to generate more spend and possibly more impressions than ads on Facebook. The average ad size on Facebook, in terms of both impressions and spend, is the smallest based on our minimum estimates, indicating that advertisers are running smaller, likely more targeted ads on Facebook. These small ads on Facebook are what are called microtargeted, which we define as less than 1,000 impressions or a spend of less than $100. For Facebook, microtargeted ads make up 81% of the overall number of ads for federal candidates in our dataset. For Twitter this number is 62%, and for Google it is 54%. Figure 6 shows the share of ads by size of spend, and here we can begin to see how federal candidates use these platforms in different ways. We note that the distribution of ads by size for federal candidates in Figure 6 is very similar to the overall distribution of ads by size seen in Figure 1.
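The microtargeting definition above (under 1,000 impressions or under $100 of spend) can be sketched as a small classifier. This is a hedged illustration, not the authors' code: the `Ad` record and its field names are hypothetical, and because the archives report ranges, the sketch assumes the lower bound of each range is the value compared against the thresholds.

```python
from dataclasses import dataclass

# Thresholds taken from the definition in the text.
MAX_IMPRESSIONS = 1000   # fewer than 1,000 impressions
MAX_SPEND_USD = 100      # or less than $100 spent

@dataclass
class Ad:
    # Hypothetical record; field names are illustrative, not an archive schema.
    ad_id: str
    min_impressions: int   # lower bound of the reported impression range
    min_spend_usd: float   # lower bound of the reported spend range

def is_microtargeted(ad: Ad) -> bool:
    """True if the ad falls under either threshold (using lower-bound estimates)."""
    return ad.min_impressions < MAX_IMPRESSIONS or ad.min_spend_usd < MAX_SPEND_USD

def microtargeted_share(ads: list[Ad]) -> float:
    """Fraction of ads classified as microtargeted; 0.0 for an empty list."""
    if not ads:
        return 0.0
    return sum(is_microtargeted(a) for a in ads) / len(ads)

# Made-up ads for illustration only.
sample = [
    Ad("a", 500, 250.0),     # few impressions
    Ad("b", 20000, 50.0),    # low spend
    Ad("c", 20000, 900.0),   # neither
    Ad("d", 100, 10.0),      # both
]
share = microtargeted_share(sample)
```

Applying `microtargeted_share` separately to each platform's federal candidate ads is one way the per-platform percentages quoted above could be computed.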

Figure 7 shows the relative spend on different ad platforms, where we see very different percentages for types of ads. Commercial ads are not shown in this figure because there were too few commercial ads to be visible. Particularly of note is the fact that ads seeking donations were far more common on the Google platform, and ads seeking to spread a message ('Inform') were much more common on Facebook.

Seeing these differences in both ad size and the types of ads that were run, we wanted to understand whether advertisers were trying to reach different geographic audiences with different types of ads. To do this, we compared the number of regions in which various ads had impressions on Facebook and Twitter, and the number of regions targeted for Google. Figure 8 shows that a variation in targeting strategy is visible on Facebook and Twitter. On Facebook, 'Move' ads that encouraged people to attend a rally, volunteer for a candidate, or take part in some other in-person activity were viewed on average in 4 regions, while 'Donate' ads were viewed in 24 regions on average. This makes a certain amount of intuitive sense: people are willing to travel only so far to attend a rally, but can donate to candidates anywhere in the United States.

Figure 8: CDF of Regions by Ad Type for Federal Candidates

5.3 Ad Targeting

One of the deficiencies of the Facebook political ad archive is that, while it does share geographic and demographic information about who saw a particular ad, we have no way of knowing how that ad was targeted. However, we were able to connect our dataset, which contains information about who consumed ads, with one published by ProPublica, which contains some data about how ads were targeted. The ProPublica data was collected by a browser plugin operated by ProPublica, which anyone can install. ProPublica's browser plugin [22] uses a supervised Natural Language Processing (NLP) classifier to detect political ads, in addition to allowing users to manually classify ads they see as political. The browser plugin then collects the partial ad targeting explanations Facebook provides by automatically clicking on the "Why am I seeing this?" button for political ads, and sends them to ProPublica to make public.

We first provide a brief background on Facebook's audience targeting options [34]. Facebook offers prospective advertisers a plethora of options. First, advertisers can target users based on age, gender, location, and the languages they speak. Second, advertisers can choose to send their ads to users in a custom audience or a lookalike audience. Custom audiences contain a list of identifiers of specific users. Advertisers can use various types of data to create a custom audience list, ranging from the emails, phone numbers, or physical addresses of people they want to reach, to users who have visited their website, installed their mobile application, or liked their Facebook Page. Lookalike audiences allow advertisers to let Facebook choose to whom to send their ads based on previous campaigns. Finally, advertisers can choose, from a long list of targeting attributes, the characteristics they want users who receive their ads to have (e.g., users interested in the Catholic Church). Targeting attributes are categorized into types such as demographics, behaviors, and interests. Advertisers can choose multiple attributes to target.

Figure 9: ProPublica Spend by Ad Type

A prior study by Andreou et al. [2] reverse-engineered what Facebook chooses to show in the ad targeting explanations it provides, and the limitations of those explanations. This study showed that ad explanations are incomplete: each explanation shows at most one targeting attribute (plus age/gender/location information), regardless of how many attributes the advertiser used. This means that explanations reveal only part of the targeting attributes that were used, providing us, and the users, with an incomplete picture of the attributes that advertisers were using. However, in the same study the authors performed a number of controlled experiments that suggest, but do not conclusively prove, that there is a logic behind which attributes appear in an explanation and which do not. Given a targeting audience A obtained from two attributes a1 and a2, if a1 and a2 come from different attribute categories (e.g., Demographic, Behavior, Interest, etc.), the attribute shown follows a specific precedence (Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors). If a1 and a2 come from the same attribute category, the one that appears in the explanation is the one with the highest estimated audience size. This will result in a systematic under-counting of lower-priority targeting types.
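The selection logic suggested by the prior study [2] can be sketched as follows. This is a minimal illustration of the reported precedence, not Facebook's actual implementation (which the study itself could only infer); the category labels, tuple layout, and audience sizes are assumptions made for the example, and the age/gender/location information that is always shown alongside the attribute is left out.

```python
# Precedence reported by the prior study [2]; a lower rank is shown first.
# The category labels are illustrative names chosen for this sketch.
PRECEDENCE = {"demographics": 0, "interests": 1, "pii_list": 2, "behaviors": 3}

def shown_attribute(attributes):
    """attributes: list of (name, category, estimated_audience_size) tuples.

    Returns the name of the single attribute expected to appear in the
    explanation, or None if no attributes were used.
    """
    if not attributes:
        return None
    # Best category first; within a category, the largest audience wins.
    best = min(attributes, key=lambda a: (PRECEDENCE[a[1]], -a[2]))
    return best[0]

# Made-up targeting for illustration only.
targeting = [
    ("Frequent travelers", "behaviors", 5_000_000),
    ("Catholic Church", "interests", 2_000_000),
    ("Politics", "interests", 9_000_000),
]
revealed = shown_attribute(targeting)
```

In this example the behavior attribute is never revealed as long as any interest attribute is present, which is the under-counting of lower-priority targeting types noted above.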

There are two main sources of bias and limitation in ProPublica's dataset. One comes from which users installed ProPublica's plugin and which political ads they were shown. The other comes from the way Facebook provides ad explanations. The ProPublica dataset is the only publicly available source of targeting information for Facebook political ads. Thus, we present these results to provide an initial insight into how Facebook political advertisers are targeting their ads, with the understanding that they are subject to these likely biases and limitations.

With these caveats in mind, we proceed to an analysis of the 18,010 ads which we were able to connect between the ProPublica dataset and ours. In Figure 9 we see that different types of ads do indeed rely on different targeting strategies. Of particular note is the divergence between 'Commercial' ads, of which 74% rely on targeting by interest groups, and 'Donate' ads, of which only 24% do. The average ad size did not differ significantly between targeting types, but was significantly larger than the average for the Facebook archive as a whole. We believe this to be an artifact of the collection mechanism, which is biased toward finding larger ads. 91% of ads in the overall ProPublica dataset had some kind of geographic targeting, and 92% had age or gender targeting. Ads in this dataset had on average 4.1 different targeting parameters, so for Facebook these should be thought of as minimum criteria. By contrast, 58% of ads in the Google archive had no geographic targeting whatsoever, and 70% had neither age nor gender targeting.

Figure 10: ProPublica Targeting by Advertiser Type

In Figure 10 we also see diverging strategies between advertisers, with Political Candidates and PACs making heavy use of custom lists of users, and For Profit and For Profit Media companies relying far more on targeting users by their interests. Campaigns have numerous potential sources from which to compile lists of users. In addition to their own lists of donors and voter rolls, campaigns can rent lists from other candidates [15].

Both Google and Twitter offer advertisers targeting criteria similar to what we have described for Facebook, including custom audiences and lookalike audiences. Both even allow targeting of users based on interests, although they infer these interests in different ways. Both have made demographic and geographic targeting information transparent for ads in their archives, but without other targeting information this is an incomplete picture at best. We encourage Google and Twitter to, at minimum, follow Facebook's example and make transparent to users information about why they have been targeted for the ads that they are seeing.

5.4 New Types of Political Advertisers

5.4.1 For-Profit Media. One advertiser type in particular proved to be an interesting outlier. The category 'For Profit Media' contains advertisers whose ads are not considered traditional news by Facebook (those ads are in a separate part of the archive that we did not include), but have content intended solely to entertain or sway the opinion of the viewer. Over the Facebook dataset as a whole, the average ad sponsor ran ads on 1.6 pages. Advertisers in the for-profit media category, however, ran ads on 3.2 pages on average. We have examined many of these for-profit media companies to understand why they are running ads across many Facebook pages. What we have found, in numerous instances, is unknown for-profit media companies that appear to be creating disingenuous communities that pose as "grassroots movements" in order to target different demographics and interests with a combination of paid and organic political messaging.

A good example of this type of advertiser is "New American Media Group LLC". This ad sponsor ran left-leaning ads on 10 different pages. These pages were designed to appeal to different demographics ("Melanin" for people of color, "The Soldier Network" for veterans, "Raising Tomorrow" for parents, etc.), but often ran the same content on multiple pages. While this LLC has an extremely similar name to a now-defunct genuine left-leaning media outlet (New America Media), it appears to have no connection to that prior group, and also appears to have no activity off of Facebook.

While some advertisers in this category were fairly traditional entertainment websites (e.g., Comedy Central), some were "for-profit" companies in name only that appeared to exist for no purpose other than to spread a particular political message, and had no way of generating an actual profit. We also discovered "News for Democracy", an LLC that ran left-leaning ads on 14 different Facebook pages, most of which were designed to be appealing to groups with traditionally conservative viewpoints, such as "The Holy Tribune". Journalists investigated this LLC and linked it to MotiveAI, which is a liberal political advertising company.

5.4.2 Corporate Astroturfing. Corporations paying for political advertising is not an entirely new phenomenon, and such advertising has traditionally been funded through industry trade groups and PACs. However, the FCC's reporting requirements for U.S. political advertising on television often made this political messaging traceable to the real sponsor. These stricter reporting requirements do not apply to online political advertising, and the ad-hoc reporting requirements that online platforms have enacted are being abused by corporations and industry trade groups to undo transparency efforts.

In our analysis we discovered 355 ads sponsored by "Citizens for Tobacco Rights", which is not a registered company in the U.S., but does disclose on its website and Facebook page that it is operated by the cigarette company Philip Morris. However, someone who only saw the Facebook ad disclaimer would not be able to connect the ads to Philip Morris without further investigation. Journalists have found instances of oil and insurance lobbying groups that also provided sponsor names that did not match the legally incorporated entity sponsoring the ads [21]. These organizations are seemingly taking advantage of Facebook's policy of not vetting sponsor names, since some of these entities also ran political ads on Google's ad platform but provided Google with their EIN (tax ID) and the correct legally incorporated names of their organizations [21].

5.5 Discussion

The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives such that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts, and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion in the different platforms' archives, and what can be done to address them.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online sponsorship disclosure policies. As we discussed in Section 5.4.1, it is extremely easy for groups such as 'New American Media Group LLC' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1], which struck down restrictions on election spending by for-profit corporations. However, political messaging advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business which paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so, with no intent of making a profit. As a private company, they do not need to publicly disclose their investors in the way that PACs are required to disclose their donors. Then the for-profit company can advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead when providing the 'ad sponsor' string associated with their ads, either intentionally or accidentally. Thus, it is effectively free to circumvent Facebook's transparency implementation. We see numerous instances of this occurring on Facebook's platform. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J. Trump For President, Inc. sponsored ads on both the Donald J. Trump page and the Mike Pence page. However, the sponsor string the organization uses when sponsoring ads on the Donald J. Trump page is not identical to the one it uses on the Mike Pence page; both are minor variants of 'Donald J. Trump For President, Inc.' Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.
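One way an analyst might cope with sponsor strings that differ only through data-entry variation is to group them under a normalized key. The sketch below is an assumption-laden illustration, not the method used in this paper: the normalization rules are our own choice, and the variant strings in the example are invented, since the exact strings recorded in the archive are not reproduced here.

```python
import re
from collections import defaultdict

def normalize_sponsor(name: str) -> str:
    """Case-fold, drop punctuation, and collapse runs of whitespace."""
    name = name.casefold()
    name = re.sub(r"[^\w\s]", "", name)       # remove punctuation such as '.' and ','
    return re.sub(r"\s+", " ", name).strip()  # collapse repeated spaces

def group_sponsors(names):
    """Map each normalized key to the set of raw sponsor strings that share it."""
    groups = defaultdict(set)
    for raw in names:
        groups[normalize_sponsor(raw)].add(raw)
    return dict(groups)

# Invented variants for illustration only.
variants = [
    "Donald J. Trump For President, Inc.",
    "DONALD J TRUMP FOR PRESIDENT INC",
    "Donald J. Trump for President,  Inc",
]
groups = group_sponsors(variants)
```

Normalization of this kind only handles punctuation, case, and spacing. It does not catch misspellings, and it cannot reveal a sponsor that deliberately supplies an unrelated name, which is why vetting by the platform remains necessary.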

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (i.e., LLCs, PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that more discussion is needed about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content but are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent the advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data, even with other researchers, or even discussing our findings directly with non-U.S. persons.

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require an investment of time and money, both in the technology of these platforms and in the corporate culture of the organizations that own them.

6 RELATED WORK

6.1 Online Advertising

Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof-of-concept systems and were not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented its transparency archive, and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising

Analysis of political television ads has been the focus of most prior political advertising studies, likely due to this data being publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest related work to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS

We have performed an analysis of the ads related to U.S. politics that we were able to collect from Facebook, Google, and Twitter's transparency archives. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES

[1] 2009-03-24. Citizens United v. Federal Election Commission.

[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018, Network and Distributed Systems Security Symposium. San Diego, United States, 18–21.

[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531–1542.

[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency

[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive

[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report

[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506

[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974

[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api

[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127–150.

[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library

[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html

[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81–87.

[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265–278.

[15] Maggie Haberman, Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online

[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474–482.

[17] Shikhar Sakhuja, Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis

[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.

[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554–566. https://doi.org/10.1145/2810103.2813614

[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783

[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules

[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online

[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook

[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.

[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676–703.

[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5–19.

[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them

[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers

[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency

[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html

[31] Upturn 2018 Leveling the Platform Real Transparency for Paid Messages onFacebook hpswwwteamupturnorgreports2018facebook-ads

[32] Bhanu C Vaikonda Vacha Dave Saikat Guha and Alex C Snoeren 2015Empirical Analysis of Search Advertising Strategies In Proceedings of the 2015Internet Measurement Conference (IMC rsquo15) ACM New York NY USA 79ndash91hpsdoiorg10114528156752815694

[33] G Venkatadri A Andreou Y Liu A Mislove K P Gummadi P Loiseau and OGoga 2018 Privacy Risks with Facebookrsquos PII-Based Targeting Auditing a DataBrokerrsquos Advertising Interface In 2018 IEEE Symposium on Security and Privacy(SP) 89ndash107

[34] Giridhari Venkatadri Yabing Liu Athanasios Andreou Oana Goga PatrickLoiseau Alan Mislove and Krishna P Gummadi 2018 Privacy Risks withFacebookrsquos PII-based Targeting Auditing a Data Brokerrsquos Advertising InterfaceIn Proceedings of the IEEE Symposium on Security and Privacy (IEEE SampPrsquo18) San

11

Francisco CA USA[35] David West 2018 Air wars television advertising and social media in election

campaigns 1952-2016 SAGECQ Press

12


Figure 4: Platform Ad Count By Ad Type By Week

Figure 5: Platform Spend By Advertiser Type By Week

Figure 6: Federal Candidate Ads by Size

Figure 7: Federal Candidate Spend by Ad Type

minimum estimates, indicating that advertisers are running smaller, likely more targeted ads on Facebook. These small ads on Facebook are what are called micro-targeted, which we define as fewer than 1,000 impressions or a spend of less than $100. For Facebook, micro-targeted ads make up 81% of the overall number of ads for federal candidates in our dataset. For Twitter this number is 62%, and for Google it is 54%. Figure 6 shows the share of ads by size of spend, and here we can begin to see how federal candidates use these platforms in different ways. We note that the distribution of ads by size for federal candidates in Figure 6 is very similar to the overall distribution of ads by size seen in Figure 1.
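The micro-targeting definition given in the text can be written as a simple predicate. The following is a minimal sketch rather than the authors' code: the `Ad` record and its field names are hypothetical, and applying the thresholds to the lower bounds of the reported ranges is an assumption made here because the archives often report impressions and spend as ranges.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Ad:
    """Hypothetical ad record; the field names are illustrative only."""
    min_impressions: int   # lower bound of the reported impressions range
    min_spend_usd: float   # lower bound of the reported spend range


def is_micro_targeted(ad: Ad) -> bool:
    """Definition from the text: fewer than 1,000 impressions or under $100 spend."""
    return ad.min_impressions < 1000 or ad.min_spend_usd < 100


def micro_targeted_share(ads: Iterable[Ad]) -> float:
    """Fraction of ads that meet the micro-targeting definition."""
    ads = list(ads)
    if not ads:
        return 0.0
    return sum(is_micro_targeted(ad) for ad in ads) / len(ads)
```

Computing `micro_targeted_share` separately for each platform's federal candidate ads would yield per-platform shares of the kind reported in the text, subject to the lower-bound assumption.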

Figure 7 shows the relative spend on different ad platforms, where we see very different percentages for types of ads. Commercial ads are not shown in this figure because there were too few commercial ads to be visible. Particularly of note is the fact that ads seeking donations were far more common on the Google platform, and ads seeking to spread a message ('Inform') were much more common on Facebook.

Seeing these differences in both ad size and the types of ads that were run, we wanted to understand if advertisers were trying to reach different geographic audiences with different types of ads. To do this, we compared the number of regions in which various ads had impressions on Facebook and Twitter, and the number of regions targeted for Google. Figure 8 shows that a variation in targeting strategy is visible on Facebook and Twitter. On Facebook, 'Move' ads that encouraged people to attend a rally, volunteer for a candidate, or take part in some other in-person activity were viewed on average in 4 regions, while 'Donate' ads were viewed in 24 regions on average. This makes a certain amount of intuitive sense: people are willing to travel only so far to attend a rally, but can donate to candidates anywhere in the United States.

Figure 8: CDF of Regions by Ad Type for Federal Candidates

5.3 Ad Targeting

One of the deficiencies of the Facebook political ad archive is that while it does share geographic and demographic information about who saw a particular ad, we have no way of knowing how that ad was targeted. However, we were able to connect our dataset, which contains information about who consumed ads, with one published by ProPublica, which contains some data about how ads were targeted. The ProPublica data was collected by a browser plugin operated by ProPublica, which anyone can install. ProPublica's browser plugin [22] uses a supervised Natural Language Processing (NLP) classifier to detect political ads, in addition to allowing users to manually classify ads they see as political. The browser plugin then collects the partial ad targeting explanations Facebook provides by automatically clicking on the "Why am I seeing this?" button for political ads, and sends them to ProPublica to make public.

We first provide a brief background on Facebook's audience targeting options [34]. Facebook offers prospective advertisers a plethora of options. First, advertisers can target users based on age, gender, location, and the languages they speak. Second, advertisers can choose to send their ads to users in a custom audience or lookalike audience. Custom audiences contain a list of identifiers of specific users. Advertisers can use various types of data to create a custom audience list, ranging from specifying the emails, phone numbers, or physical addresses of people they want to reach, to users that have visited their website, installed their mobile application, or liked their Facebook Page. Lookalike audiences allow advertisers to let Facebook choose to whom to send their ads based on previous campaigns. Finally, advertisers can choose from a long list of targeting attributes the characteristics they want users who receive their ads to have (e.g., users interested in the Catholic Church). Targeting attributes are categorized in types such as demographics, behaviors, and interests. Advertisers can choose multiple attributes to target.

Figure 9: ProPublica Spend by Ad Type

A prior study by Andreou et al. [2] reverse-engineered what Facebook chooses to show in the ad targeting explanations it provides, and the limitations of those explanations. This study showed that ad explanations are incomplete: each explanation shows at most one targeting attribute (plus age/gender/location information), regardless of how many attributes the advertisers use. This means that explanations reveal only part of the targeting attributes that were used, providing us, and the users, with an incomplete picture of the attributes that advertisers were using. However, in the same study the authors performed a number of controlled experiments that suggest, but do not conclusively prove, that there is a logic behind which attributes appear in an explanation and which do not. Given a targeting audience A obtained from two attributes a1 and a2, if a1 and a2 come from different attribute categories (e.g., Demographic, Behavior, Interest), the attribute shown follows a specific precedence (Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors). If a1 and a2 come from the same attribute category, the one that appears in the explanation is the one with the highest estimated audience size. This will result in a systematic under-counting of lower-priority targeting types.
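The precedence rule described above can be sketched as a small selection function. This is an illustrative reading of the rule, not Facebook's implementation: the category labels, the `Attribute` type, and the extension from two attributes to an arbitrary list are assumptions made for this example.

```python
from typing import NamedTuple, Sequence


class Attribute(NamedTuple):
    """Hypothetical targeting attribute; names and fields are illustrative."""
    name: str
    category: str        # one of the keys in CATEGORY_RANK
    audience_size: int   # estimated audience size for this attribute


# Lower rank means higher precedence, following the order stated in the text:
# Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors.
CATEGORY_RANK = {
    "demographics": 0,
    "interests": 1,
    "pii_list": 2,
    "behaviors": 3,
}


def explained_attribute(attributes: Sequence[Attribute]) -> Attribute:
    """Return the attribute expected to appear in the ad explanation.

    Categories are compared first; within a category the attribute with the
    largest estimated audience wins.
    """
    if not attributes:
        raise ValueError("at least one targeting attribute is required")
    return min(
        attributes,
        key=lambda a: (CATEGORY_RANK[a.category], -a.audience_size),
    )
```

Under this reading, an ad targeted by both an interest and a behavior would always be explained by the interest, which is why behavior-based targeting would be under-counted in explanation data.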

There are two main sources of bias and limitation in ProPublica's dataset. One comes from which users installed ProPublica's plugin and which political ads they were shown. The other is the way Facebook provides ad explanations. The ProPublica dataset is the only publicly available source of targeting information for Facebook political ads. Thus, we present these results to provide an initial insight into how Facebook political advertisers are targeting their ads, with the understanding that biases and limitations are likely.

With these caveats in mind, we proceed to an analysis of the 18,010 ads which we were able to connect between the ProPublica dataset and ours. In Figure 9, we see that different types of ads do indeed rely on different targeting strategies. Of particular note is the divergence between 'Commercial' ads, of which 74% rely on targeting by interest groups, and 'Donate' ads, of which only 24% do. The average ad size did not differ significantly between targeting types, but was significantly larger than the average for the Facebook archive as a whole. We believe this to be an artifact of the collection mechanism, which is biased toward finding larger ads. 91% of ads in the overall ProPublica dataset had some kind of geographic targeting, and 92% had age or gender targeting. Ads in this dataset had on average 4.1 different targeting parameters, so for Facebook these should be thought of as minimum criteria. By contrast, 58% of ads in the Google archive had no geographic targeting whatsoever, and 70% had neither age nor gender targeting.

Figure 10: ProPublica Targeting by Advertiser Type

In Figure 10, we also see diverging strategies between advertisers, with Political Candidates and PACs making heavy use of custom lists of users, and For Profit and For Profit Media companies relying far more on targeting users by their interests. Campaigns have numerous potential sources from which to compile lists of users. In addition to their own lists of donors and voter rolls, campaigns can rent lists from other candidates [15].

Both Google and Twitter offer advertisers similar targeting criteria to what we have described for Facebook, including custom audiences and lookalike audiences. Both even allow targeting of users based on interests, although they infer these interests in different ways. Both have made transparent demographic and geographic targeting information for ads in their archive, but without other targeting information this is an incomplete picture at best. We encourage Google and Twitter to, at minimum, follow Facebook's example and make transparent to users information about why they have been targeted for ads that they are seeing.

5.4 New Types of Political Advertisers

5.4.1 For-Profit Media. One advertiser type in particular proved to be an interesting outlier. The category 'For Profit Media' contains advertisers whose ads are not considered traditional news by Facebook (those ads are in a separate part of the archive that we did not include), but have content intended solely to entertain or sway the opinion of the viewer. Over the Facebook dataset as a whole, the average ad sponsor ran ads on 1.6 pages. Advertisers in the for-profit media category, however, ran ads on 3.2 pages on average. We have examined many of these for-profit media companies to understand why they are running ads across many Facebook pages. What we have found in numerous instances is unknown for-profit media companies that appear to be creating disingenuous communities, presented as "grassroots movements", to target different demographics and interests with a combination of paid and organic political messaging.
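The pages-per-sponsor comparison above amounts to counting distinct pages for each sponsor and averaging. A minimal sketch follows, assuming each ad is available as a (sponsor, page) pair; this input shape and the function names are hypothetical and not taken from the archive's API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def pages_per_sponsor(ads: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count the distinct pages on which each sponsor ran ads."""
    pages: Dict[str, Set[str]] = defaultdict(set)
    for sponsor, page in ads:
        pages[sponsor].add(page)
    return {sponsor: len(page_set) for sponsor, page_set in pages.items()}


def mean_pages_per_sponsor(ads: Iterable[Tuple[str, str]]) -> float:
    """Average number of distinct pages per sponsor (0.0 if there are no ads)."""
    counts = pages_per_sponsor(ads)
    if not counts:
        return 0.0
    return sum(counts.values()) / len(counts)
```

Running `mean_pages_per_sponsor` once over all ads and once over only the ads of sponsors labeled as for-profit media would give the two averages being compared.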

A good example of this type of advertiser is "New American Media Group LLC". This ad sponsor ran left-leaning ads on 10 different pages. These pages were designed to appeal to different demographics ("Melanin" for people of color, "The Soldier Network" for veterans, "Raising Tomorrow" for parents, etc.), but often run the same content on multiple pages. While this LLC has an extremely similar name to a now-defunct genuine left-leaning media outlet (New America Media), it appears to have no connection to that prior group, and also appears to have no activity off of Facebook.

While some advertisers in this category were fairly traditional entertainment websites (e.g., Comedy Central), some were "for-profit" companies in name only that appeared to exist for no purpose other than to spread a particular political message and had no way of generating an actual profit. We also discovered "News for Democracy", an LLC that ran left-leaning ads on 14 different Facebook pages, most of which were designed to be appealing to groups with traditionally conservative viewpoints, such as "The Holy Tribune". Journalists investigated this LLC and linked it to MotiveAI, which is a liberal political advertising company.

5.4.2 Corporate Astroturfing. Corporations paying for political advertising is not an entirely new phenomenon, and it has traditionally been funded through industry trade groups and PACs. However, the reporting requirements set by the FCC for U.S. political advertising on television often made this political messaging traceable to the real sponsor. These stricter reporting requirements do not apply to online political advertising, and the ad-hoc reporting requirements that online platforms have enacted are being abused by corporations and industry trade groups to undo transparency efforts.

We discovered in our analysis 355 ads sponsored by "Citizens for Tobacco Rights", which is not a registered company in the U.S. but does disclose on its website and Facebook page that it is operated by cigarette company Philip Morris. However, someone who only saw the Facebook ad disclaimer would not be able to connect the ads to Philip Morris without further investigation. Journalists have also found instances of oil and insurance lobbying groups that provided sponsor names that did not match the legally incorporated entity sponsoring the ads [21]. These organizations are seemingly taking advantage of Facebook's policy of not vetting sponsor names, since some of these entities also ran political ads on Google's ad platform but provided Google with their EIN (tax ID) and the correct legally incorporated names of their organizations [21].

5.5 Discussion

The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives such that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts, and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion into the different platforms' archives, and what can be done to improve these issues.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online transparency and sponsorship disclosure policies. As we discussed in Section 5.4.1, it is extremely easy for groups such as 'New American Media' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1], which struck down restrictions on election spending by for-profit corporations. However, political messaging advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business which paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so, with no intent of making a profit. As a private company, they do not need to publicly disclose their investors in the way that PACs are required to disclose their donors. Then the for-profit company can advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead when providing the 'ad sponsor' string associated with their ads, either intentionally or accidentally. Thus, it is effectively free to circumvent Facebook's transparency implementation. We see numerous instances of this occurring on Facebook's platform. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J. Trump For President, Inc. sponsored ads on both the Donald J. Trump page and the Mike Pence page. However, the sponsor strings recorded for the two pages, both variants of 'Donald J. Trump For President, Inc.', are not identical and differ only in punctuation. Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.
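Accidental variation of this kind in sponsor strings can be partly absorbed by normalizing the strings before grouping ads by sponsor. The sketch below is one possible approach and not a method described by the authors; it assumes that case, punctuation, and spacing are the only differences, and the function names are hypothetical. It does nothing against sponsors who deliberately supply an unrelated name.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def normalize_sponsor(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace in a sponsor string."""
    without_punctuation = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(without_punctuation.split())


def sponsor_variants(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group raw sponsor strings that normalize to the same key.

    Only keys with more than one distinct raw spelling are returned, since
    those are the candidates for accidental data-entry variation.
    """
    groups = defaultdict(set)
    for name in names:
        groups[normalize_sponsor(name)].add(name)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }
```

For instance, two illustrative spellings such as "Example Committee, Inc." and "Example Committee Inc" would fall under the single key "example committee inc".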

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (i.e., LLCs or PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that we need more discussion about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content but are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data, even with other researchers, or even discussing our findings directly with non-U.S. persons.

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require investment of time and money, both in the technology of these platforms and in the corporate culture of the organizations that own them.

6 RELATED WORK

6.1 Online Advertising

Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof-of-concept systems and were not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented their transparency archive, and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising

Analysis of political television ads has been the focus of most prior political advertising studies, likely due to this data being publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest related to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS

We have performed an analysis of the ads that we were able to collect from Facebook, Google, and Twitter's transparency archives related to U.S. politics. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES

[1] 2009-03-24. Citizens United v. Federal Election Commission.

[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018 - Network and Distributed Systems Security Symposium. San Diego, United States, 18–21.

[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531–1542.

[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency/

[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive

[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report

[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506

[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974

[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api/

[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127–150.

[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library

[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html

[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81–87.

[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265–278.

[15] Maggie Haberman and Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online

[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474–482.

[17] Shikhar Sakhuja and Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis/

[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.

[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554–566. https://doi.org/10.1145/2810103.2813614

[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783

[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules

[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online

[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook

[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.

[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676–703.

[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5–19.

[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them

[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers

[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency

[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html

[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads

[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79–91. https://doi.org/10.1145/2815675.2815694

[33] G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. P. Gummadi, P. Loiseau, and O. Goga. 2018. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In 2018 IEEE Symposium on Security and Privacy (SP). 89–107.

[34] Giridhari Venkatadri, Yabing Liu, Athanasios Andreou, Oana Goga, Patrick Loiseau, Alan Mislove, and Krishna P. Gummadi. 2018. Privacy Risks with Facebook's PII-based Targeting: Auditing a Data Broker's Advertising Interface. In Proceedings of the IEEE Symposium on Security and Privacy (IEEE S&P'18). San Francisco, CA, USA.

[35] David West. 2018. Air wars: television advertising and social media in election campaigns, 1952-2016. SAGE/CQ Press.


Figure 8: CDF of Regions by Ad Type for Federal Candidates

5.3 Ad Targeting

One of the deficiencies of the Facebook political ad archive is that, while it does share geographic and demographic information about who saw a particular ad, we have no way of knowing how that ad was targeted. However, we were able to connect our dataset, which contains information about who consumed ads, with one published by ProPublica, which contains some data about how ads were targeted. The ProPublica data was collected by a browser plugin, operated by ProPublica, which anyone can install. ProPublica's browser plugin [22] uses a supervised Natural Language Processing (NLP) classifier to detect political ads, in addition to allowing users to manually classify ads they see as political. The browser plugin then collects the partial ad targeting explanations Facebook provides by automatically clicking on the "Why am I seeing this?" button for political ads, and sends them to ProPublica to make public.

We first provide a brief background on Facebook's audience targeting options [34]. Facebook offers prospective advertisers a plethora of options. First, advertisers can target users based on age, gender, location, and the languages they speak. Second, advertisers can choose to send their ads to users in a custom audience or lookalike audience. Custom audiences contain a list of identifiers of specific users. Advertisers can use various types of data to create a custom audience list, ranging from specifying the emails, phone numbers, or physical addresses of people they want to reach, to users that have visited their website, installed their mobile application, or liked their Facebook Page. Lookalike audiences allow advertisers to let Facebook choose to whom to send their ads based on previous campaigns. Finally, advertisers can choose, from a long list of targeting attributes, the characteristics they want users who receive their ads to have (e.g., users interested in the Catholic Church). Targeting attributes are categorized into types such as demographics, behaviors, and interests. Advertisers can choose multiple attributes to target.

Figure 9: ProPublica Spend by Ad Type

A prior study by Andreou et al. [2] reverse engineered what Facebook chooses to show in the ad targeting explanation it provides, and the limitations of that explanation. This study showed that ad explanations are incomplete: each explanation shows at most one targeting attribute (plus age/gender/location information) regardless of how many attributes the advertiser uses. This means that explanations reveal only part of the targeting attributes that were used, providing us, and the users, with an incomplete picture of the attributes that advertisers were using. However, in the same study the authors performed a number of controlled experiments that suggest, but do not conclusively prove, that there is a logic behind which attributes appear in an explanation and which do not. Given a targeting audience A obtained from two attributes a1 and a2, if a1 and a2 come from different attribute categories (e.g., Demographic, Behavior, Interest, etc.), the attribute shown follows a specific precedence (Demographics and Age/Gender/Location > Interests > PII-based lists > Behaviors). If a1 and a2 come from the same attribute category, the one that appears in the explanation is the one with the highest estimated audience size. This will result in a systematic under-counting of lower priority targeting types.
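The selection logic suggested by those controlled experiments can be sketched in a few lines. This is only an illustration of the hypothesised rule as summarised above, not Facebook's actual implementation; the category labels, the Attribute class, and the explained_attribute function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass

# Precedence suggested by the prior study [2]; a lower rank is shown first.
# The category labels are illustrative, not identifiers used by Facebook.
PRECEDENCE = {
    "demographics": 0,  # includes age/gender/location
    "interests": 1,
    "pii_list": 2,
    "behaviors": 3,
}


@dataclass(frozen=True)
class Attribute:
    name: str
    category: str       # one of the PRECEDENCE keys
    audience_size: int  # estimated audience size for this attribute


def explained_attribute(attributes):
    """Pick the single attribute an explanation would show under the
    hypothesised rule: highest-precedence category first, and within a
    category the attribute with the largest estimated audience."""
    if not attributes:
        return None
    return min(
        attributes,
        key=lambda a: (PRECEDENCE[a.category], -a.audience_size),
    )


if __name__ == "__main__":
    used = [
        Attribute("Frequent travelers", "behaviors", 5_000),
        Attribute("Catholic Church", "interests", 1_000),
    ]
    # The behavior attribute is never surfaced here, which is the source
    # of the under-counting of lower priority targeting types.
    print(explained_attribute(used).name)
```

Under this rule an ad targeted by both an interest and a behavior always reports the interest, so counts of behavior-based targeting derived from explanations are lower bounds.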

There are two main sources of biases and limitations in ProPublica's dataset. One comes from which users installed ProPublica's plugin and which political ads they were shown. The other is from the way Facebook provides ad explanations. The ProPublica dataset is the only publicly available source of targeting information for Facebook political ads. Thus, we present these results to provide an initial insight into how Facebook political advertisers are targeting their ads, with the understanding that they carry these likely biases and limitations.

With these caveats in mind, we proceed to an analysis of the 18,010 ads which we were able to connect between the ProPublica dataset and ours. In Figure 9, we see that different types of ads do indeed rely on different targeting strategies. Of particular note is the divergence of 'Commercial' ads, of which 74% rely on targeting by interest groups, and of 'Donate' ads, of which only 24% do. The average ad size did not differ significantly between targeting types, but was significantly larger than the average for the Facebook archive as a whole. We believe this to be an artifact of the collection mechanism, which is biased toward finding larger ads. 91% of ads in the overall ProPublica dataset had some kind of geographic targeting, and 92% had age or gender targeting. On average, ads in this dataset had 41 different targeting parameters, so for Facebook these should be thought of as a minimum criteria. By contrast, 58% of ads in the Google archive had no geographic targeting whatsoever, and 70% had neither age nor gender targeting.

Figure 10: ProPublica Targeting by Advertiser Type

In Figure 10, we also see diverging strategies between advertisers, with Political Candidates and PACs making heavy use of custom lists of users, and For Profit and For Profit Media companies relying far more on targeting users by their interests. Campaigns have numerous potential sources from which to compile lists of users. In addition to their own lists of donors and voter rolls, campaigns can rent lists from other candidates [15].

Both Google and Twitter offer advertisers similar targeting criteria to what we have described for Facebook, including custom audiences and lookalike audiences. Both even allow targeting of users based on interests, although they infer these interests in different ways. Both have made transparent demographic and geographic targeting information for ads in their archive, but without other targeting information this is an incomplete picture at best. We encourage Google and Twitter to, at minimum, follow Facebook's example and make transparent to users information about why they have been targeted for ads that they are seeing.

5.4 New Types of Political Advertisers

5.4.1 For-Profit Media. One advertiser type in particular proved to be an interesting outlier. The category 'For Profit Media' contains advertisers whose ads are not considered traditional news by Facebook (those ads are in a separate part of the archive that we did not include), but have content intended solely to entertain or sway the opinion of the viewer. Over the Facebook dataset as a whole, the average ad sponsor ran ads on 16 pages. Advertisers in the for-profit media category, however, ran ads on 32 pages on average. We have examined many of these for-profit media companies to understand why they are running ads across many Facebook pages. What we have found, in numerous instances, is unknown for-profit media companies that appear to be creating disingenuous communities, presented as "grassroots movements", to target different demographics and interests with a combination of paid and organic political messaging.
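A pages-per-sponsor comparison of this kind can be computed directly from per-ad records. The sketch below is a minimal illustration assuming each ad is available as a (sponsor, page, advertiser category) tuple; that record layout and the function name mean_pages_per_sponsor are assumptions made for the example and do not describe the authors' actual pipeline.

```python
from collections import defaultdict
from statistics import mean


def mean_pages_per_sponsor(records):
    """Average number of distinct pages per sponsor, by advertiser category.

    `records` is an iterable of (sponsor, page, category) tuples, one per
    ad. This layout is assumed for illustration only.
    """
    pages_by_sponsor = defaultdict(set)
    category_of = {}
    for sponsor, page, category in records:
        pages_by_sponsor[sponsor].add(page)  # distinct pages only
        category_of[sponsor] = category

    counts_by_category = defaultdict(list)
    for sponsor, pages in pages_by_sponsor.items():
        counts_by_category[category_of[sponsor]].append(len(pages))

    return {cat: mean(counts) for cat, counts in counts_by_category.items()}


if __name__ == "__main__":
    # Entirely made-up records, used only to exercise the function.
    ads = [
        ("Sponsor A", "page-1", "For Profit Media"),
        ("Sponsor A", "page-2", "For Profit Media"),
        ("Sponsor A", "page-1", "For Profit Media"),
        ("Sponsor B", "page-3", "PAC"),
    ]
    print(mean_pages_per_sponsor(ads))
```

Counting distinct pages rather than ads keeps a sponsor that runs many ads on a single page from inflating the average.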

A good example of this type of advertiser is "New American Media Group LLC". This ad sponsor ran left-leaning ads on 10 different pages. These pages were designed to appeal to different demographics ("Melanin" for people of color, "The Soldier Network" for veterans, "Raising Tomorrow" for parents, etc.), but often ran the same content on multiple pages. While this LLC has an extremely similar name to a now-defunct genuine left-leaning media outlet (New America Media), it appears to have no connection to that prior group, and also appears to have no activity off of Facebook.

While some advertisers in this category were fairly traditional entertainment websites (e.g., Comedy Central), some were "for-profit" companies in name only that appeared to exist for no purpose other than to spread a particular political message, and had no way of generating an actual profit. We also discovered "News for Democracy", an LLC that ran left-leaning ads on 14 different Facebook pages, most of which were designed to appeal to groups with traditionally conservative viewpoints, such as "The Holy Tribune". Journalists investigated this LLC and linked it to MotiveAI, which is a liberal political advertising company.

5.4.2 Corporate Astroturfing. Corporations paying for political advertising is not an entirely new phenomenon, and such advertising has traditionally been funded through industry trade groups and PACs. However, the FCC's reporting requirements for U.S. political advertising on television often made this political messaging traceable to the real sponsor. These stricter reporting requirements do not apply to online political advertising, and the ad-hoc reporting requirements that online platforms have enacted are being abused by corporations and industry trade groups to undo transparency efforts.

We discovered in our analysis 355 ads sponsored by "Citizens for Tobacco Rights", which is not a registered company in the U.S. but does disclose on its website and Facebook page that it is operated by cigarette company Philip Morris. However, someone who only saw the Facebook ad disclaimer would not be able to connect the ads to Philip Morris without further investigation. Journalists have found instances of oil and insurance lobbying groups that also provided sponsor names that did not match the legally incorporated entity sponsoring the ads [21]. These organizations are seemingly taking advantage of Facebook's policy of not vetting sponsor names, since some of these entities also ran political ads on Google's ad platform but provided Google with their EIN (tax ID) and the correct legally incorporated names of their organizations [21].

5.5 Discussion

The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives such that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts, and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion into the different platforms' archives, and what can be done to improve these issues.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online sponsorship disclosure policies. As we discussed in Section 5.4.1 (For-Profit Media), it is extremely easy for groups such as 'New American Media Group LLC' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1], which struck down restrictions on election spending by for-profit corporations. However, political messaging advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business which paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so with no intent of making a profit. As a private company, they do not need to publicly disclose their investors in the way that PACs are required to disclose their donors. Then the for-profit company can advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead when providing the 'ad sponsor' string associated with their ads, either intentionally or accidentally. Thus, it is effectively free to circumvent Facebook's transparency implementation. We see numerous instances of this occurring on Facebook's platform. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J Trump For President Inc sponsored ads on both the Donald J Trump page and the Mike Pence page. However, when sponsoring ads on the Donald J Trump page the organization is known as 'Donald J Trump For President Inc', and when sponsoring ads on the Mike Pence page it is known as 'Donald J Trump For President Inc'. Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.
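For the accidental case (typos and data-entry variation), an analyst working with the free-text sponsor string might group spelling variants before counting sponsors. The sketch below is one possible approach under stated assumptions, not a method described in this paper: it folds case, strips punctuation, and collapses whitespace. The function names and the sample sponsor strings are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_sponsor(name):
    """Reduce a free-text sponsor string to a comparison key by folding
    case, replacing punctuation with spaces, and collapsing whitespace."""
    text = unicodedata.normalize("NFKC", name).casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def group_sponsors(names):
    """Map each normalized key to the set of raw strings that produced it."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize_sponsor(name)].add(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical sponsor strings, invented for this example.
    raw = [
        "Example Action Fund, Inc.",
        "EXAMPLE ACTION FUND INC",
        "Another Committee",
    ]
    for key, variants in group_sponsors(raw).items():
        print(key, sorted(variants))
```

Normalization of this kind only merges typographic variants of the same name. It does nothing about a sponsor string that is deliberately chosen to differ from the legal entity paying for the ad, which still requires vetting by the platform.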

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (i.e., LLCs or PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that we need more discussion about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content that are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data even with other researchers, or even discussing our findings directly with non-U.S. Persons.

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require investment of time and money, both in the technology of these platforms and in the corporate culture of the organizations that own them.

6 RELATED WORK

6.1 Online Advertising

Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof-of-concept systems and were not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented their transparency archive and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising

Analysis of political television ads has been the focus of most prior political advertising studies, likely due to this data being publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest related work to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS

We have performed an analysis of the ads that we were able to collect from Facebook, Google, and Twitter's transparency archives related to U.S. politics. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES

[1] 2009-03-24. Citizens United v. Federal Election Commission.

[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018 - Network and Distributed Systems Security Symposium. San Diego, United States, 18–21.

[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531–1542.

[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency

[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive

[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report

[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506

[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974

[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api

[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127–150.

[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library

[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html

[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81–87.

[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265–278.

[15] Maggie Haberman and Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online

[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474–482.

[17] Shikhar Sakhuja and Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis

[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.

[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554–566. https://doi.org/10.1145/2810103.2813614

[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783

[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules

[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online

[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook

[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.

[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676–703.

[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5–19.

[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them

[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers

[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency

[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html

[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads

[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79–91. https://doi.org/10.1145/2815675.2815694

[33] G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. P. Gummadi, P. Loiseau, and O. Goga. 2018. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In 2018 IEEE Symposium on Security and Privacy (SP). 89–107.

[34] Giridhari Venkatadri, Yabing Liu, Athanasios Andreou, Oana Goga, Patrick Loiseau, Alan Mislove, and Krishna P. Gummadi. 2018. Privacy Risks with Facebook's PII-based Targeting: Auditing a Data Broker's Advertising Interface. In Proceedings of the IEEE Symposium on Security and Privacy (IEEE S&P'18). San Francisco, CA, USA.

[35] David West. 2018. Air wars: television advertising and social media in election campaigns, 1952-2016. SAGE/CQ Press.

12

  • Abstract
  • 1 Introduction
  • 2 Background
  • 3 Data Collection Methodology
    • 31 Facebook
    • 32 Google
    • 33 Twitter
      • 4 Datasets
      • 5 Results
        • 51 Data Over Time
        • 52 Federal Candidate Comparison
        • 53 Ad Targeting
        • 54 New Types of Political Advertisers
        • 55 Discussion
          • 6 Related Work
            • 61 Online Advertising
            • 62 Political Advertising
              • 7 Conclusions
              • References
Page 9: An Analysis of United States Online Political Advertising … · 2019-02-13 · exact ad impressions and spend amounts instead of ranges. We also note that impressions is an imperfect

Figure 10 ProPublica Targeting by Advertiser Type

archive as a whole We believe this to be an artifact of the collectionmechanism which is biased toward nding larger ads 91 ofads in the overall ProPublica dataset had some kind of geographictargeting and 92 had age or gender targeting On average adsin this dataset had on average 41 dierent targeting parametersso for Facebook these should be thought of as a minimum criteriaBy contrast 58 of ads in the Google archive had no geographictargeting whatsoever and 70 had neither age nor gender targeting

In Figure 10 we also see diverging strategies between advertiserswith Political Candidates and PACs making heavy use of customlists of users and For Prot and For Prot Media companies relyingfar more on targeting users by their interests Campaigns havenumerous potential sources from which to compile lists of usersIn addition to their own lists of donors and voter rolls campaignscan rent lists from other candidates [15]

Both Google and Twier oer advertisers similar targeting cri-teria to what we have described for Facebook including customaudiences and lookalike audiences Both even allow targeting ofusers based on interests although they infer these interests in dier-ent ways Both havemade transparent demographic and geographictargeting information for ads in their archive but without othertargeting information this is an incomplete picture at best Weencourage Google and Twier to at minimum follow Facebookrsquosexample and make transparent to users information about whythey have been targeed for ads that they are seeing

54 New Types of Political Advertisers541 For-Profit Media One advertiser type in particular proved

to be an interesting outlier e category rsquoFor Prot Mediarsquo con-tains advertisers whose ads are not considered traditional news byFacebook (those ads are in a separate part of the archive that wedid not include) but have content intended solely to entertain orsway the opinion of the viewer Over the Facebook dataset as awhole the average ad sponsor ran ads on 16 pages Advertisersin the for-prot media category however ran ads on 32 pages onaverage We have examined many of these for-prot media compa-nies to understand why they are running across many Facebookpages What we have found in numerous instances is unknownfor-prot media companies that appear to be creating disingenuouscommunities that appear to be ldquograssroots movementsrdquo to targetdierent demographics and interests with a combination of paidand organic political messaging

A good example of this type of advertiser is rdquoNew AmericanMedia Group LLCrdquo is ad sponsor ran le leaning ads on 10dierent pages ese pages were designed to appeal to dierent

demographics (rdquoMelaninrdquo for people of color rdquoe Soldier Networkrdquofor Veterans rdquoRaising Tomorrowrdquo for parents etc) but oen run thesame content on multiple pages While this LLC has an extremelysimilar name to a now-defunct genuine le-leaning media outlet(New America Media) it appears to have no connection to thatprior group and also appears to have no activity o of Facebook

While some advertisers in this category were fairly traditionalentertainment websites (ie Comedy Central) some were ldquofor-protrdquo companies in name only that appeared to exist for no otherpurpose other than to spread a particular political message and hadno way of generating an actual prot We also discovered ldquoNewsfor Democracyrdquo is an LLC that ran le leaning ads on 14 dierentFacebook pages most of which were designed to be appealing togroups with traditionally conservative view points such as ldquoeHoly Tribunerdquo Journalists investigated this LLC and linked it toMotiveAI which is a liberal political advertising company

542 Corporate Astroturfing Corporations paying for politicaladvertising is not an entirely new phenomenon and has traditionallybeen funded through industry trade groups and PACs Howeverthe reporting requirements by the FCC for US political advertisingon television oen made this political messaging traceable to thereal sponsor ese stricter reporting requirements do not apply toonline political advertising and the ad-hoc reporting requirementsthat online platforms have enacted are being abused by corporationsand industry trade groups to undo transparency eorts

We discovered in our analysis 355 ads sponsored by ldquoCitizens forTobacco Rightsrdquo which is not a registered company in the US butdoes disclose on their website and Facebook page that it is operatedby cigaree company Philip Morris However someone who onlysaw the Facebook ad disclaimer would not be able to connect theads to Philip Morris without further investigation Other journalistshave found instances of oil and insurance lobbying groups that alsoprovided sponsor names that did not match the legally incorporatedentity sponsoring the ads [21] ese organizations are seeminglytaking advantage of Facebookrsquos policy of not veing sponsor namessince some of these entities also ran political ads on Googlersquos adplatform but provided Google with their EIN (tax ID) and correctlegally incorporated names of their organizations [21]

5.5 Discussion

The different policies, bugs, idiosyncrasies, and security weaknesses of each transparency archive implementation present challenges to our analysis efforts. We find that many of the issues with these archives likely stem from a combination of their hasty creation and the fact that the platforms are still working out how to improve the security of these archives so that they are difficult to deceive or evade. We will first discuss issues related to accidentally or intentionally deceiving these transparency efforts and how they might be improved by implementing more robust sponsor attribution techniques. The second part of our discussion will focus on issues related to bypassing inclusion into the different platforms' archives and what can be done to improve these issues.

5.5.1 Sponsor Attribution. The for-profit political advertisers appear to be the ones that are accidentally or intentionally skirting and violating the spirit of online transparency sponsorship disclosure policies. As we discussed in the Ad Targeting section, it's extremely easy for groups such as 'New American Media' to obscure who they actually are from users and researchers.

It is worth noting that such advertising by for-profit corporations was not legal until the Citizens United Supreme Court decision in 2010 [1] that struck down restrictions on election spending by for-profit corporations. However, political messaging advertisers who run ads on television or radio stations governed by the FCC must still report the name and contact information of the business which paid for the ad, including the company's officers and directors. Such data is published by the FCC in a public database. Political advertisers who send direct mail through the U.S. Postal Service (USPS) must also report their activities through the FEC, with similar public disclosure of the name and contact information of the business. The regulations that require such disclosure for ads that mention candidates do not apply to online advertising, largely because the laws that mandate such public disclosures were drafted before these platforms were as ubiquitous as they have become.

What this means in practice, though, is that people who want to publicize a political message can form a for-profit company for doing so, with no intent of making a profit. As a private company, it does not need to publicly disclose its investors in the way that PACs are required to disclose their donors. The for-profit company can then advertise on social media, also without disclosing the legal entity providing the funds to pay for the ad.

On Facebook's platform, advertisers can easily mislead, either intentionally or accidentally, when providing the 'ad sponsor' string associated with their ads. Thus it is effectively free to circumvent Facebook's transparency implementation, and we see numerous instances of this occurring on Facebook's platform. Sometimes the unreliability of the ad sponsor label appeared to be caused purely by human error, such as typos or variation during data entry. For example, Donald J. Trump For President, Inc. sponsored ads on both the Donald J. Trump page and the Mike Pence page. However, the sponsor label recorded for ads on the Donald J. Trump page and the label recorded for ads on the Mike Pence page are two slightly different punctuation variants of the name 'Donald J Trump For President Inc'. Facebook has not publicly stated plans to implement additional vetting of political sponsors. Facebook's argument is that anything they might implement for additional vetting would not be scalable because of their broader inclusion policy, which extends to political issue ads [21]. However, this has created a weakness in Facebook's transparency implementation that greatly diminishes its effectiveness for studying dishonest political advertisers.
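The data-entry variation described above can be partly absorbed at analysis time by grouping sponsor labels under a normalized key. The following is a minimal sketch, not the procedure used in this study: the function names `normalize_sponsor` and `group_sponsors` are hypothetical, and the punctuation variants in `labels` are illustrative strings rather than values copied from the archive. Normalization of this kind only merges accidental variants; it does nothing against a sponsor who deliberately supplies an unrelated name.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_sponsor(label: str) -> str:
    """Build a grouping key: Unicode-normalize, case-fold, drop punctuation,
    and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", label).casefold()
    text = re.sub(r"[^\w\s]", "", text)  # remove characters such as '.' and ','
    return " ".join(text.split())


def group_sponsors(labels):
    """Map each normalized key to the set of raw labels that share it."""
    groups = defaultdict(set)
    for label in labels:
        groups[normalize_sponsor(label)].add(label)
    return dict(groups)


# Hypothetical label variants, for illustration only.
labels = [
    "Donald J. Trump For President, Inc.",
    "Donald J Trump For President Inc",
    "donald j. trump for president,  inc",
    "News for Democracy",
]
groups = group_sponsors(labels)
for key, variants in sorted(groups.items()):
    print(key, "<-", sorted(variants))
```

Keeping the raw labels alongside the key lets an analyst review each merge by hand, which matters because two distinct legal entities could in principle normalize to the same string.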

Google and Twitter both vet sponsors, so companies must either reveal their legally incorporated name, pay existing third parties to create ads on their behalf, or create shell organizations (e.g., LLCs, PACs). We should note that we see instances of political ads on Facebook and Twitter where the sponsor is a third-party advertising agency instead of the actual entity that paid for the ads. This is an example of the complexities of correctly attributing political ads to the real sponsors. It is clear from our analysis that more discussion is needed about how to implement sponsorship disclosure and vetting in a way that makes it practical to deploy at scale and more difficult to circumvent.

5.5.2 Transparency Infrastructure. As we have noted, we appreciate the speed with which these transparency archives were created. However, the lack of full integration of these archives into the broader ad platforms of these companies is currently hurting the efficacy of these transparency efforts.

We believe that there are ads on the Google and Twitter platforms that would be considered political content but are not included in their transparency archives, because their criteria for inclusion are too narrow or their mechanisms for finding this content are insufficient. More research needs to be done into exactly what the general population considers to be political advertising. We would encourage these platforms to create policies and enforcement mechanisms that will make transparent the advertising content that the general population would consider political.

We also encountered several technical and policy issues with the archives as they currently exist. Many ads, particularly in the Google archive, were missing content information. Information on spend and impressions was only available in broad ranges from Facebook and Google. No targeting information, or very little targeting information, was available from any of the platforms. Facebook required us to sign an NDA that prohibited us from sharing our raw data, even with other researchers, or even discussing our findings directly with non-U.S. Persons.
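Because spend and impressions are reported only as ranges, any total over a set of ads is itself a range: the sum of the lower bounds up to the sum of the upper bounds. The sketch below illustrates that arithmetic under stated assumptions. The helper `total_bounds` is hypothetical, and the three buckets are invented values, not the buckets actually published by Facebook or Google.

```python
from typing import Iterable, Tuple

Range = Tuple[int, int]  # (lower bound, upper bound), both inclusive


def total_bounds(ranges: Iterable[Range]) -> Range:
    """Sum per-ad ranges into one overall (lowest possible, highest possible) total."""
    low_total = 0
    high_total = 0
    for low, high in ranges:
        if low > high:
            raise ValueError(f"invalid range: ({low}, {high})")
        low_total += low
        high_total += high
    return low_total, high_total


# Hypothetical impression buckets for three ads, for illustration only.
impression_ranges = [(0, 999), (1_000, 4_999), (10_000, 49_999)]
bounds = total_bounds(impression_ranges)
print(f"total impressions between {bounds[0]:,} and {bounds[1]:,}")
```

The width of the resulting interval grows with every ad added, which is why aggregate figures derived from range data can span a factor of several between their lower and upper estimates.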

We call on these organizations to re-architect their platforms and policies to support full transparency of all political ads. We realize that making the changes we recommend will require investment of time and money, both in the technology of these platforms and in the corporate culture of the organizations that own them.

6 RELATED WORK

6.1 Online Advertising

Korolova [16] was the first to point out privacy attacks based on micro-targeted online ads. Follow-up work has reverse-engineered the targeting options provided by major online ad networks [33] and explored privacy [2] and bias [26] issues of these online ad networks. There has also been work on designing improved ad transparency mechanisms [20]. For our study, we leverage this prior work on reverse-engineering online advertising networks' targeting options and on how Facebook's ad targeting explanation is likely implemented.

To the best of our knowledge, there has been no systematic analysis of online advertisers to this point, likely due to the difficulty of collecting large-scale data from online ad networks [13]. One of the only prior large-scale quantitative studies of online advertisers focused on how their strategies affected conversion rates, based on aggregate analysis of advertisers on Microsoft's ad network [32]. XRay [18] and Sunlight [19] are two techniques that were created to detect and infer online ad targeting methods. However, these were proof-of-concept systems and were not deployed at large scale. An initial analysis of Facebook's proposed ad transparency archive implementation pointed out the issue of only including political ads and not revealing targeting information [31]. This report was released before Facebook implemented their transparency archive and therefore did not analyze the ad data archived by Facebook or issues with the actual implementation. We have conducted the first large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising

Analysis of political television ads has been the focus of most prior political advertising studies, likely due to this data being publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest related work to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS

We have performed an analysis of the ads related to U.S. politics that we were able to collect from Facebook, Google, and Twitter's transparency archives. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES

[1] 2009-03-24. Citizens United v. Federal Election Commission.

[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018 - Network and Distributed Systems Security Symposium. San Diego, United States, 18–21.

[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531–1542.

[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency

[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive

[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report

[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506

[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974

[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api

[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127–150.

[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library

[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html

[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81–87.

[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265–278.

[15] Maggie Haberman and Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online

[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474–482.

[17] Shikhar Sakhuja and Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis

[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.

[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554–566. https://doi.org/10.1145/2810103.2813614

[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783

[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules

[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online

[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook

[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.

[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676–703.

[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5–19.

[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them

[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers

[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency

[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html

[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads

[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79–91. https://doi.org/10.1145/2815675.2815694

[33] G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. P. Gummadi, P. Loiseau, and O. Goga. 2018. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In 2018 IEEE Symposium on Security and Privacy (SP). 89–107.

[34] Giridhari Venkatadri, Yabing Liu, Athanasios Andreou, Oana Goga, Patrick Loiseau, Alan Mislove, and Krishna P. Gummadi. 2018. Privacy Risks with Facebook's PII-based Targeting: Auditing a Data Broker's Advertising Interface. In Proceedings of the IEEE Symposium on Security and Privacy (IEEE S&P '18). San Francisco, CA, USA.

[35] David West. 2018. Air wars: television advertising and social media in election campaigns, 1952-2016. SAGE/CQ Press.

Page 10: An Analysis of United States Online Political Advertising … · 2019-02-13 · exact ad impressions and spend amounts instead of ranges. We also note that impressions is an imperfect

policies As we discussed in the Ad Targeting section itrsquos extremelyeasy for groups such as rsquoNew American Mediarsquo to obscure whothey actually are from users and researchers

It is worth noting that such advertising by for-prot corpora-tions was not legal until the Citizens United Supreme Court decisionin 2010 [1] that struck down restrictions on election spending byfor-prot corporations However political messaging advertiserswho run ads on television or radio stations governed by the FCCmust still report the name and contact information of the busi-ness which paid for the ad including the companyrsquos ocers anddirectors Such data is published by the FCC in a public databasePolitical advertisers who send direct mail through the US PostalService (USPS) must also report their activities through the FECwith similar public disclosure of the name and contact informationof the business e regulations that require such disclosure for adsthat mention candidates do not apply to online advertising largelybecause the laws that mandate such public disclosures were draedbefore these platforms were as ubiquitous as they have become

What this means in practice though is that people who wantto publicize a political message can form a for-prot company fordoing so with no intent of making an prot As a private companythey do not need to publicly disclose their investors in the waythat PACs are required to disclose their donors en the for-protcompany can advertise on social media also without disclosing thelegal entity providing the funds to pay for the ad

On Facebookrsquos platform advertisers can easily mislead whenproviding the rsquoad sponsorrsquo string associated with their ads eitherintentionally or accidentally us it is eectively free to circum-vent Facebookrsquos transparency implementation We see numerousinstances on Facebookrsquos platforms of this occurring Sometimes theunreliability of the ad sponsor label appeared to be caused purelyby human error such as typos or variation during data entry Forexample Donald J Trump For President Inc sponsored ads onboth the Donald J Trump page and the Mike Pence page Howeverwhen sponsoring ads on the Donald J Trump page the organiza-tion is known as rsquoDonald J Trump For President Incrsquo and whensponsoring ads on the Mike Pence page is known as rsquoDonald JTrump For President Incrsquo Facebook has not publicly stated plansto implement additional veing of political sponsors Facebookrsquosargument is that anything they might implement for additional vet-ting would not be scalable because of their broader inclusion policywhich extends to political issue ads [21] However this has cre-ated a weakness in Facebookrsquos transparency implementation thatgreatly diminishes its eectiveness for studying dishonest politicaladvertisers

Google and Twier both vet sponsors so companies must eitherreveal their legally incorporated name pay existing third-partiesto create ads on their behalf or create shell organizations (ie LLCPACs) We should note that we see instances of political ads onFacebook and Twier where the sponsor is a third-party advertisingagency instead of the actual entity that paid for the ads is is anexample of the complexities of correctly aributing political adsto the real sponsors It is clear from analysis that we need morediscussion about how to implement sponsorship disclosure andveing in a way that makes it practical to deploy at scale and moredicult to circumvent

552 Transparency Infrastructure As we have noted we ap-preciate the speed with which these transparency archives werecreated However the lack of full integration of these archives intothe broader ad platforms of these companies is currently hurtingthe ecacy of these transparency eorts

We believe that there are ads on the Google and Twier platformsthat would be considered political content that are not includedin their transparency archives because their criteria for inclusionare too narrow or their mechanisms for nding this content areinsucient More research needs to be done into exactly whatthe general population considers to be political advertising Wewould encourage these platforms to create policies and enforcementmechanisms that will make transparent advertising content thatthe general population would consider political

We also encountered several technical and policy issues withthe archives as they currently exist Many ads particularly in theGoogle archive were missing content information Information onspend and impressions were only available in broad ranges fromFacebook and Google No targeting information or very lile target-ing information was available from any of the platforms Facebookrequired us to sign an NDA that prohibited us from sharing our rawdata even with other researchers or even discussing our ndingsdirectly with non- US Persons

We call on these organizations to re-architect their platforms andpolicies to support full transparency of all political ads We realizethat making the changes we recommend will require investment oftime and money both in the technology of these platforms and thecorporate culture of the organizations that own them

6 RELATEDWORK61 Online AdvertisingKorolova [16] was the rst to point out privacy aacks based onmicro-targeted online ads Followup work has reverse-engineeredthe targeting options provided by major online ad networks [33]and explored privacy [2] and bias [26] issues of these online adnetworks ere has also been work on designing improved adtransparency mechanisms [20] For our study we leverage thisprior work on reverse-engineering online advertising networksrsquotargeting options and how Facebookrsquos ad targeting explanationlikely is implemented

To the best of our knowledge there has been no systematicanalysis of online advertisers to this point likely due to the dicultyof collecting large-scale data from online ad networks [13] One ofthe only prior large-scale quantitative studies of online advertisersfocused on how their strategies eected conversion rates based onaggregate analysis of advertisers on Microsorsquos ad network [32]XRay [18] and Sunshine [19] are two techniques that were createdto detect and infer online ad targeting methods However thesewere proof of concept systems and not deployed at large-scale Aninitial analysis of Facebookrsquos proposed ad transparency archiveimplementations pointed out the issue of only including politicalads and not revealing targeting information [31] is report wasreleased before Facebook implemented their transparency archiveand therefor did not analyze the ad data archived by Facebook orissues with the actual implementation We have conducted the rst

10

large-scale analysis of online political advertising based on the datarecently made transparent by Facebook Google and Twier

62 Political AdvertisingAnalysis of political television ads has been the focus of most priorpolitical advertising studies likely due to this data being publiclypublished by the FCC and easy to access [14 35] ere is at leastone prior study that explored the inuence of political televisionads on online discussion [25] ere have also been studies of inves-tigating the polarization of online political discourse [3 10] eclosest related to our study is a prior study which showed thatuploading political video advertisements to YouTube generated un-paid organic-views and improved their eectiveness [24] Howeverto the best of our knowledge ours is the rst large-scale study ofonline political advertising

7 CONCLUSIONSWe have performed an analysis of the ads that we were able tocollect from Facebook Google and Twierrsquos transparency archivesrelated to US politics Based on the data we collected we providean initial understanding and taxonomies of online political advertis-ing strategies for both honest and possibly dishonest US politicaladvertisers We also point out limitations and weaknesses of thepolicies and current implementations of these archives As part ofour analysis we demonstrate how advertisers are intentionally oraccidentally deceiving and bypassing these political transparencyarchives We provide a concrete list of suggestions that wouldlikely make them more robust and useful for enabling a beer un-derstanding of political advertising We are actively working witheach archive product teams to improve their implementations

We commend Facebook Google and Twier for their eorts sofar in improving transparency into political advertising on theirplatforms We note the speed with which these archives weremade available aer public concern about this issue was raised andthat these transparency eorts have improved a great deal in theshort time that these tools have been available We encourage theplatforms to continue to improve

REFERENCES[1] 2009-03-24 Citizens United v Federal Election Commission[2] Athanasios Andreou Giridhari Venkatadri Oana Goga Krishna P Gummadi

Patrick Loiseau and Alan Mislove 2018 Investigating ad transparency mecha-nisms in social media A case study of Facebookrsquos explanations InNDSS 2018 Net-work and Distributed Systems Security Symposium San Diego UNITED STATES18 ndash 21

[3] Pablo Barber John T Jost Jonathan Nagler Joshua A Tucker and RichardBonneau 2015 Tweeting From Le to Right Is Online Political CommunicationMore an an Echo Chamber Psychological Science 26 10 (2015) 1531ndash1542

[4] Brian Barre 2018 For Russia Unraveling US DemocracyWas Just Another Day Job hpswwwwiredcomstorymueller-indictment-internet-research-agency

[5] Facebook 2018 Ad Archive hpswwwfacebookcomadsarchive[6] Facebook 2018 Facebook Ad Archive Report Retrieved October 25th 2018

from hpswwwfacebookcomadsarchivereport[7] Facebook 2018-07-03 About ads that include political content rdquohpswww

facebookcombusinesshelp167836590566506rdquo[8] Facebook 2018-07-03 National issues of public importance rdquohpswww

facebookcombusinesshelp214754279118974rdquo[9] Facebook 2018-08-22 Introducing the Ad Archive API hpsnewsroom

comnews201808introducing-the-ad-archive-api[10] DJ Flynn Brendan Nyhan and Jason Reier [n d] e Nature and Origins of

Misperceptions Understanding False and Unsupported Beliefs About PoliticsPolitical Psychology 38 S1 ([n d]) 127ndash150

[11] Google 2018-08-15 Transparency Report - Political Ads hpstransparencyreportgooglecompolitical-adslibrary

[12] Kevin Granville 2018 Facebook and Cambridge Analytica What You Needto Know as Fallout Widens hpswwwnytimescom20180319technologyfacebook-cambridge-analytica-explainedhtml

[13] Saikat Guha Bin Cheng and Paul Francis 2010 Challenges in Measuring OnlineAdvertising Systems In Proceedings of the 10th ACM SIGCOMM Conference onInternet Measurement (IMC rsquo10) ACM 81ndash87

[14] Lynda Lee Kaid and Monica Postelnicu 2005 Political Advertising in the 2004Election Comparison of Traditional Television and Internet Messages AmericanBehavioral Scientist 49 2 (2005) 265ndash278

[15] Maggie Haberman Kenneth P Vogel 2018-10-13hpswwwnytimescom20181013uspoliticstrump-political-datahtmlhpswwwpropublicaorgarticlehelp-us-monitor-political-ads-online

[16] A Korolova 2010 Privacy Violations Using Microtargeted Ads A Case StudyIn 2010 IEEE International Conference on Data Mining Workshops 474ndash482

[17] Shikhar Sakhuja Laura Edelson 2018 An Analysis of UnitedStates Online Political Advertising hpsonline-pol-adsgithubioOnline-Political-Ads-Analysis

[18] Mathias Lecuyer Guillaume Ducoe Francis Lan Andrei Papancea eolosPetsios Riley Spahn Augustin Chaintreau and Roxana Geambasu 2014 XRayEnhancing the Webrsquos Transparency with Dierential Correlation In USENIXSecurity Symposium San Diego United States

[19] Mathias Lecuyer Riley Spahn Yannis Spiliopolous Augustin Chaintreau RoxanaGeambasu and Daniel Hsu 2015 Sunlight Fine-grained Targeting Detectionat Scale with Statistical Condence In Proceedings of the 22Nd ACM SIGSACConference on Computer and Communications Security (CCS rsquo15) ACM New YorkNY USA 554ndash566 hpsdoiorg10114528101032813614

[20] Bin Liu Anmol Sheth Udi Weinsberg Jaideep Chandrashekar and RameshGovindan 2013 AdReveal Improving Transparency into Online TargetedAdvertising In Proceedings of the TwelhACMWorkshop onHot Topics in Networks(HotNets-XII) ACM New York NY USA Article 12 7 pages hpsdoiorg10114525357712535783

[21] Jeremy B Merrill 2018 How Big Oil Dodges FacebooksNew Ad Transparency Rules hpswwwpropublicaorgarticlehow-big-oil-dodges-facebooks-new-ad-transparency-rules

[22] ProPublica 2017-09-07 Political Advertisements from Facebook hpswwwpropublicaorgarticlehelp-us-monitor-political-ads-online

[23] ProPublica 2018-10-21 Political Advertisements from Facebook hpswwwpropublicaorgdatastoredatasetpolitical-advertisements-from-facebook

[24] Travis N Ridout Erika Franklin Fowler and John Bransteer 2010 PoliticalAdvertising in the 21st Century e Rise of the YouTube Ad In APSA 2010Annual Meeting

[25] Dhavan V Shah Jaeho Cho Seungahn Nah Melissa R Gotlieb Hyunseo HwangNam-Jin Lee Rosanne M Scholl and Douglas M McLeod [n d] Campaign AdsOnline Messaging and Participation Extending the Communication MediationModel Journal of Communication 57 4 ([n d]) 676ndash703



large-scale analysis of online political advertising based on the data recently made transparent by Facebook, Google, and Twitter.

6.2 Political Advertising
Analysis of political television ads has been the focus of most prior political advertising studies, likely because this data is publicly published by the FCC and easy to access [14, 35]. There is at least one prior study that explored the influence of political television ads on online discussion [25]. There have also been studies investigating the polarization of online political discourse [3, 10]. The closest related to our study is a prior study which showed that uploading political video advertisements to YouTube generated unpaid organic views and improved their effectiveness [24]. However, to the best of our knowledge, ours is the first large-scale study of online political advertising.

7 CONCLUSIONS
We have performed an analysis of the ads that we were able to collect from Facebook, Google, and Twitter's transparency archives related to U.S. politics. Based on the data we collected, we provide an initial understanding and taxonomies of online political advertising strategies for both honest and possibly dishonest U.S. political advertisers. We also point out limitations and weaknesses of the policies and current implementations of these archives. As part of our analysis, we demonstrate how advertisers are intentionally or accidentally deceiving and bypassing these political transparency archives. We provide a concrete list of suggestions that would likely make them more robust and useful for enabling a better understanding of political advertising. We are actively working with each archive's product team to improve their implementations.

We commend Facebook, Google, and Twitter for their efforts so far in improving transparency into political advertising on their platforms. We note the speed with which these archives were made available after public concern about this issue was raised, and that these transparency efforts have improved a great deal in the short time that these tools have been available. We encourage the platforms to continue to improve.

REFERENCES

[1] 2009-03-24. Citizens United v. Federal Election Commission.

[2] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating ad transparency mechanisms in social media: A case study of Facebook's explanations. In NDSS 2018, Network and Distributed Systems Security Symposium. San Diego, United States, 18–21.

[3] Pablo Barberá, John T. Jost, Jonathan Nagler, Joshua A. Tucker, and Richard Bonneau. 2015. Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber? Psychological Science 26, 10 (2015), 1531–1542.

[4] Brian Barrett. 2018. For Russia, Unraveling US Democracy Was Just Another Day Job. https://www.wired.com/story/mueller-indictment-internet-research-agency

[5] Facebook. 2018. Ad Archive. https://www.facebook.com/ads/archive

[6] Facebook. 2018. Facebook Ad Archive Report. Retrieved October 25th, 2018 from https://www.facebook.com/ads/archive/report

[7] Facebook. 2018-07-03. About ads that include political content. https://www.facebook.com/business/help/167836590566506

[8] Facebook. 2018-07-03. National issues of public importance. https://www.facebook.com/business/help/214754279118974

[9] Facebook. 2018-08-22. Introducing the Ad Archive API. https://newsroom.fb.com/news/2018/08/introducing-the-ad-archive-api

[10] D.J. Flynn, Brendan Nyhan, and Jason Reifler. [n. d.]. The Nature and Origins of Misperceptions: Understanding False and Unsupported Beliefs About Politics. Political Psychology 38, S1 ([n. d.]), 127–150.

[11] Google. 2018-08-15. Transparency Report - Political Ads. https://transparencyreport.google.com/political-ads/library

[12] Kevin Granville. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html

[13] Saikat Guha, Bin Cheng, and Paul Francis. 2010. Challenges in Measuring Online Advertising Systems. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, 81–87.

[14] Lynda Lee Kaid and Monica Postelnicu. 2005. Political Advertising in the 2004 Election: Comparison of Traditional Television and Internet Messages. American Behavioral Scientist 49, 2 (2005), 265–278.

[15] Maggie Haberman, Kenneth P. Vogel. 2018-10-13. https://www.nytimes.com/2018/10/13/us/politics/trump-political-data.html https://www.propublica.org/article/help-us-monitor-political-ads-online

[16] A. Korolova. 2010. Privacy Violations Using Microtargeted Ads: A Case Study. In 2010 IEEE International Conference on Data Mining Workshops. 474–482.

[17] Shikhar Sakhuja, Laura Edelson. 2018. An Analysis of United States Online Political Advertising. https://online-pol-ads.github.io/Online-Political-Ads-Analysis

[18] Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. 2014. XRay: Enhancing the Web's Transparency with Differential Correlation. In USENIX Security Symposium. San Diego, United States.

[19] Mathias Lecuyer, Riley Spahn, Yannis Spiliopolous, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. 2015. Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 554–566. https://doi.org/10.1145/2810103.2813614

[20] Bin Liu, Anmol Sheth, Udi Weinsberg, Jaideep Chandrashekar, and Ramesh Govindan. 2013. AdReveal: Improving Transparency into Online Targeted Advertising. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). ACM, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/2535771.2535783

[21] Jeremy B. Merrill. 2018. How Big Oil Dodges Facebook's New Ad Transparency Rules. https://www.propublica.org/article/how-big-oil-dodges-facebooks-new-ad-transparency-rules

[22] ProPublica. 2017-09-07. Political Advertisements from Facebook. https://www.propublica.org/article/help-us-monitor-political-ads-online

[23] ProPublica. 2018-10-21. Political Advertisements from Facebook. https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook

[24] Travis N. Ridout, Erika Franklin Fowler, and John Branstetter. 2010. Political Advertising in the 21st Century: The Rise of the YouTube Ad. In APSA 2010 Annual Meeting.

[25] Dhavan V. Shah, Jaeho Cho, Seungahn Nah, Melissa R. Gotlieb, Hyunseo Hwang, Nam-Jin Lee, Rosanne M. Scholl, and Douglas M. McLeod. [n. d.]. Campaign Ads, Online Messaging, and Participation: Extending the Communication Mediation Model. Journal of Communication 57, 4 ([n. d.]), 676–703.

[26] Till Speicher, Muhammad Ali, Giridhari Venkatadri, Filipe Nunes Ribeiro, George Arvanitakis, Fabrício Benevenuto, Krishna P. Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Potential for Discrimination in Online Targeted Advertising. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), Sorelle A. Friedler and Christo Wilson (Eds.), Vol. 81. PMLR, New York, NY, USA, 5–19.

[27] William Turton. 2018. We posed as 100 Senators to run ads on Facebook. Facebook approved all of them. https://news.vice.com/en_us/article/xw9n3q/we-posed-as-100-senators-to-run-ads-on-facebook-facebook-approved-all-of-them

[28] Twitter. 2018. Political campaigning advertisers. https://ads.twitter.com/transparency/i/political_advertisers

[29] Twitter. 2018-06-28. Ad Transparency Center. https://ads.twitter.com/transparency

[30] Twitter. 2018-06-28. Political Content in the United States. https://business.twitter.com/en/help/ads-policies/restricted-content-policies/political-campaigning/US-political-content.html

[31] Upturn. 2018. Leveling the Platform: Real Transparency for Paid Messages on Facebook. https://www.teamupturn.org/reports/2018/facebook-ads

[32] Bhanu C. Vattikonda, Vacha Dave, Saikat Guha, and Alex C. Snoeren. 2015. Empirical Analysis of Search Advertising Strategies. In Proceedings of the 2015 Internet Measurement Conference (IMC '15). ACM, New York, NY, USA, 79–91. https://doi.org/10.1145/2815675.2815694

[33] G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. P. Gummadi, P. Loiseau, and O. Goga. 2018. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In 2018 IEEE Symposium on Security and Privacy (SP). 89–107.

[34] Giridhari Venkatadri, Yabing Liu, Athanasios Andreou, Oana Goga, Patrick Loiseau, Alan Mislove, and Krishna P. Gummadi. 2018. Privacy Risks with Facebook's PII-based Targeting: Auditing a Data Broker's Advertising Interface. In Proceedings of the IEEE Symposium on Security and Privacy (IEEE S&P '18). San Francisco, CA, USA.

[35] David West. 2018. Air wars: television advertising and social media in election campaigns, 1952-2016. SAGE/CQ Press.
