Auditing Algorithms : Towards Transparency in the Age of ...
![Page 1: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/1.jpg)
Auditing Algorithms: Towards Transparency in the Age of Big Data
Christo Wilson
Assistant Professor
@[email protected]
![Page 2: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/2.jpg)
Personalization on the Web
Santa Barbara, California
Amherst, Massachusetts
![Page 3: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/3.jpg)
Personalization is Ubiquitous
• Search Results
• Goods and Services
• Music, Movies, Media
• Social Media
![Page 4: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/4.jpg)
Dangers of Personalization?
![Page 5: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/5.jpg)
Racial Discrimination
[Figure: two example searches and the ads shown with them]
• Search: Chris Wilson. Ad: "Looking for Chris Wilson? Find People Near You!" www.yellowpages.com
• Search: Trevon Jones. Ad: "Trevon Jones, Arrested? Search Criminal Records, Sex Offender Registry, and More." www.instantcheckmate.com
Racial bias in Google’s AdSense system uncovered by Latanya Sweeney in 2013
Example of unintended consequences of big data
• People exhibit racial bias in their search and click patterns
• The ad-placement algorithm observed and learned these behaviors
![Page 6: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/6.jpg)
Price Discrimination
Showing users different prices
• In econ: differential pricing
Example: Amazon in 2001
• DVDs were sold for $3-4 more to some users
Surprisingly, not illegal in the US
• Anti-Discrimination Act does not protect consumers
Article 20(2) of the Services Directive protects EU residents
• But companies seem to be flouting the regulation :(
[News headline: "Websites Vary Prices, Deals Based on Users’ Information"]
![Page 7: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/7.jpg)
Price Steering
Altering the order or composition of products
• E.g. high priced items rank higher for some people
Example: Orbitz in 2012
• Users received hotels in a different order when searching
• Normal users: cheap hotels first; Mac users: expensive hotels first
[News headline: "On Orbitz, Mac Users Steered to Pricier Hotels"]
![Page 8: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/8.jpg)
Auditing Algorithms
Governments and regulators are concerned about big data and algorithms
• White House reports:
  • Big Data: Seizing Opportunities, Preserving Values
  • Big Data and Differential Pricing
• FTC’s new Office of Technology Research and Investigation
  • Tasked with monitoring the applications of big data and algorithms
How do we measure and understand algorithms?
• Algorithms may be trade secrets, constantly changing
• Access to source code is not enough, data is equally important
Emerging scientific area: Auditing Algorithms
![Page 9: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/9.jpg)
Goals of Our Work
1. Understanding how companies collect and share data about users
• Online and offline retailers
• Advertisers and marketers
• Data brokers like Acxiom, Datalogix, Equifax, Experian, etc…
2. Reverse-engineering online algorithms to assess their impact
• Search engines
• Online advertisements
• E-commerce
• Social networks
• etc…
![Page 10: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/10.jpg)
Measuring Personalization
Case Study: Google Search
Case Study: E-commerce
![Page 11: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/11.jpg)
Measuring Personalization
Case Study: E-commerce
![Page 12: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/12.jpg)
Are All Differences Personalization?
[Figure: two mock result pages listing Products 1 to 4 in different orders, labeled "Compare"]
Not necessarily! It could be:
• Updates to inventory/prices
• Tax/Shipping differences
• Distributed infrastructure
• Load-balancing
How can we reliably identify and quantify personalization?
Personalization?
![Page 13: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/13.jpg)
Controlling for Noise
[Figure: clients at 129.10.115.14, 129.10.115.15, and 129.10.115.16 send queries to a server at 74.125.225.67 and the returned product lists are compared; one comparison is labeled "Noise"]
• Queries run at the same time
• Same Amazon IP address
• IP addresses in the same /24
Difference – Noise = Personalization
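The subtraction on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the measurement code behind the talk: the function names, the rank-by-rank comparison, and the sample product lists are all invented for the example.

```python
def fraction_changed(a, b):
    """Fraction of ranks at which two result lists differ (0.0 to 1.0)."""
    n = max(len(a), len(b))
    if n == 0:
        return 0.0
    # Pad the shorter list so that missing items count as differences.
    a = list(a) + [None] * (n - len(a))
    b = list(b) + [None] * (n - len(b))
    return sum(x != y for x, y in zip(a, b)) / n


def personalization(treatment, control_1, control_2):
    """Difference - Noise, floored at zero.

    Assumption: both controls are identical clients, so any difference
    between them is treated as noise.
    """
    difference = fraction_changed(treatment, control_1)
    noise = fraction_changed(control_1, control_2)
    return max(0.0, difference - noise)


# Hypothetical result lists for one query.
treatment = ["p1", "p3", "p2", "p4"]
control_1 = ["p1", "p2", "p3", "p4"]
control_2 = ["p1", "p2", "p3", "p5"]
score = personalization(treatment, control_1, control_2)  # 0.5 - 0.25 = 0.25
```

Here the treatment differs from the first control at two of four ranks, while the two controls differ from each other at one rank, so only the remaining quarter is attributed to personalization.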
![Page 14: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/14.jpg)
Dual Methodology
Questions we want to answer:
1. To what extent is content personalized?
2. What user features drive personalization?
REAL USER ACCOUNTS
• Leverage real user accounts with lots of history
• Measure personalization in real life
SYNTHETIC USER ACCOUNTS
• Create accounts that each vary by one feature
• Measure the impact of specific features
![Page 15: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/15.jpg)
Real User Experiment
Task on Amazon Mechanical Turk (AMT)
• Over 1000s of participants
• Each executed hundreds of search queries
Every query paired with two control queries
• Run from empty accounts, i.e. no history
• Baseline results for comparison
[Figure: an HTTP Proxy carrying each User Query alongside its Control Queries]
![Page 16: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/16.jpg)
Measuring Personalization
Case Study: Google Search
Case Study: E-commerce
![Page 17: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/17.jpg)
Results from Real Users
[Figure: Results Changed (%), 0 to 50, by Search Result Rank, 1 to 10, for Control/Control and Real User/Control]
• Difference between results is personalization
• Top ranks are less personalized
• Lower ranks are more personalized
• On average, real users have a 12% higher chance of differing than the controls
• Most changes are due to location
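A per-rank comparison like the one plotted on this slide could be computed roughly as follows. This is a hedged sketch: `change_rate_by_rank` and the sample pairs are hypothetical, and the slides do not say how the plotted percentages were actually calculated.

```python
def change_rate_by_rank(pairs, ranks=10):
    """Percent of result-list pairs that differ at each rank position.

    `pairs` is an iterable of (list_a, list_b); index 0 of the returned
    list corresponds to rank 1.
    """
    changed = [0] * ranks
    total = [0] * ranks
    for a, b in pairs:
        for r in range(ranks):
            if r >= len(a) and r >= len(b):
                continue  # neither list reaches this rank
            total[r] += 1
            x = a[r] if r < len(a) else None
            y = b[r] if r < len(b) else None
            if x != y:
                changed[r] += 1
    return [100.0 * c / t if t else 0.0 for c, t in zip(changed, total)]


# Hypothetical data: two query pairs, three ranks each.
sample_pairs = [
    (["a", "b", "c"], ["a", "b", "d"]),
    (["a", "b", "c"], ["a", "x", "c"]),
]
rates = change_rate_by_rank(sample_pairs, ranks=3)
```

Running the same function once over Real User/Control pairs and once over Control/Control pairs, then subtracting rank by rank, gives the gap that the slide labels as personalization.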
![Page 18: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/18.jpg)
What Causes Personalization?
AMT results reveal extensive personalization
Next question: what user features drive this?
Static Features
• Gender
• Age
• Browser
• Operating System
• Location (IP Address)
• Logged In/Out
Historical Features
• Logged In/Out
• History of Searches
• History of Search Result Clicks
• Browsing History
Methodology: use synthetic (fake) accounts
![Page 19: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/19.jpg)
Logged In/Out to Google
[Figure: Average Jaccard Index (0.5 to 1) and Average Edit Distance (0 to 5) by Day (1 to 7), for No Cookies / No Cookies, Logged In / No Cookies, and Logged Out / No Cookies]
Same results… But in a different order
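The two metrics on this slide capture different things: the Jaccard index compares which results appear, while edit distance also reacts to their order. A minimal sketch follows, assuming plain Levenshtein distance over result lists; the slides only say "edit distance", so the exact variant is an assumption, as are the function names and sample URLs.

```python
def jaccard_index(a, b):
    """Size of the intersection over size of the union of two result sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # two empty pages are treated as identical
    return len(sa & sb) / len(sa | sb)


def edit_distance(a, b):
    """Levenshtein distance between two lists (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical pages: the same four results with the top two swapped.
page_a = ["r1", "r2", "r3", "r4"]
page_b = ["r2", "r1", "r3", "r4"]
same_set = jaccard_index(page_a, page_b)    # 1.0: same results
reordered = edit_distance(page_a, page_b)   # 2: but in a different order
```

A Jaccard index of 1 together with a non-zero edit distance is exactly the "same results, different order" pattern described above.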
![Page 20: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/20.jpg)
IP Address Geolocation
[Figure: Jaccard Index (0.5 to 1) and Average Edit Distance (0 to 5) by Day (1 to 7), for location pairs MA / MA, CA / MA, UT / MA, IL / MA, and NC / MA]
On average, 1 different result
…Plus 1 pair of reordered results
![Page 21: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/21.jpg)
What About Search History?
[Figure: results of a search for ‘healthcare’ next to results of a search for ‘obama,’ then ‘healthcare’]
Subsequent queries may “carry over”
![Page 22: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/22.jpg)
Impact of Search History
[Figure: Average Jaccard Index (0 to 1) by Time Between Queries (0 to 20 minutes). Overlap in Results, Searching for ‘healthcare’ and ‘obama’ + ‘healthcare’]
10 minute cutoff
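One way a cutoff like this could be read off such measurements is sketched below. Everything here is an assumption made for illustration: the function name, the 0.9 threshold, and especially the sample values, which are invented and are not the data behind the plot.

```python
def carryover_cutoff(samples, threshold=0.9):
    """Smallest gap (minutes) from which overlap stays at or above threshold.

    `samples` is a list of (minutes_between_queries, average_jaccard).
    Returns None if the overlap never settles above the threshold.
    """
    cutoff = None
    for gap, overlap in sorted(samples):
        if overlap >= threshold:
            if cutoff is None:
                cutoff = gap  # first gap of the current run above threshold
        else:
            cutoff = None  # overlap dropped again, so restart the search
    return cutoff


# Invented values, chosen only so the sketch shows a 10 minute cutoff.
made_up_samples = [
    (2.5, 0.60), (5, 0.65), (7.5, 0.70),
    (10, 0.95), (12.5, 0.97), (15, 0.96),
]
cutoff_minutes = carryover_cutoff(made_up_samples)
```

Requiring the overlap to stay above the threshold, rather than just reach it once, avoids reporting a cutoff from a single noisy measurement.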
![Page 23: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/23.jpg)
Measuring Personalization
Case Study: Google Search
Case Study: E-commerce
![Page 24: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/24.jpg)
Measuring Personalization
Case Study: E-commerce
![Page 25: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/25.jpg)
Targeted Retailers
10 General retailers
• Best Buy, CDW, Home Depot, JCPenney, Macy’s, NewEgg, Office Depot, Sears, Staples, Walmart
6 travel sites (hotels & car rental)
• CheapTickets, Expedia, Hotels.com, Priceline, Orbitz, Travelocity
Focus on products returned by searches, 20 search terms/site
![Page 26: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/26.jpg)
Do Users See the Same Prices for the Same Products?
Many sites show inconsistencies for real users
• Up to 3.6% of all products
[Figure: % of Products with Inconsistent Prices, for Retailers, Hotels, and Rental Cars]
![Page 27: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/27.jpg)
How Much Money Are We Talking About?
[Figure: Difference in $ (0 to 1000) for Retailers, Hotels, and Rental Cars, showing the 5th, 25th, 75th, and 95th percentiles and the mean]
Inconsistencies can be $100s! (per day/night for hotels/cars)
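The two quantities on these slides, the share of products with inconsistent prices and the spread of the dollar differences, could be computed along these lines. This is a sketch under assumptions: the function names and the sample prices are hypothetical, and a real measurement would also subtract control-versus-control noise before counting a gap as an inconsistency.

```python
from statistics import mean, quantiles


def compare_prices(user_prices, control_prices):
    """Return (% of shared products with differing prices, list of gaps in $)."""
    shared = sorted(user_prices.keys() & control_prices.keys())
    gaps = [abs(user_prices[p] - control_prices[p]) for p in shared]
    inconsistent = [g for g in gaps if g > 0]
    share = 100.0 * len(inconsistent) / len(shared) if shared else 0.0
    return share, inconsistent


def summarize(gaps):
    """5th/25th/75th/95th percentiles and mean of the price gaps."""
    if len(gaps) < 2:
        return None  # too few points to estimate percentiles
    cuts = quantiles(gaps, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return {"5th": cuts[0], "25th": cuts[4], "mean": mean(gaps),
            "75th": cuts[14], "95th": cuts[18]}


# Hypothetical nightly hotel prices keyed by product id.
user = {"h1": 100.0, "h2": 80.0, "h3": 60.0, "h4": 50.0}
control = {"h1": 112.0, "h2": 80.0, "h3": 60.0, "h5": 40.0}
share, gaps = compare_prices(user, control)
```

Only products seen by both the user and the control are compared, so a hotel that appears on one page but not the other does not count as a price inconsistency.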
![Page 28: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/28.jpg)
What Features Trigger Personalization?
Methodology: use synthetic (fake) accounts
• Give them different features, look for personalization
• Each day for 1 month, run standard set of searches

| Category | Feature | Tested Features |
| --- | --- | --- |
| Account | Cookie | No Account, Logged In, No Cookies |
| User-Agent | OS | Win XP, Win 7, OS X, Linux |
| User-Agent | Browser | Chrome 33, Android Chrome 34, IE 8, Firefox 25, Safari 7, iOS Safari 6 |
| History | Click | Big Spender, Low Spender |
| History | Purchase | Big Spender, Low Spender |
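The idea of accounts that each differ in one feature can be sketched as below. The tested values come from the table above, but the choice of baseline profile, the dictionary layout, and the function name are assumptions made for this example rather than details given in the slides.

```python
# Tested values, copied from the feature table.
TESTED = {
    "cookie": ["No Account", "Logged In", "No Cookies"],
    "os": ["Win XP", "Win 7", "OS X", "Linux"],
    "browser": ["Chrome 33", "Android Chrome 34", "IE 8",
                "Firefox 25", "Safari 7", "iOS Safari 6"],
    "click": ["Big Spender", "Low Spender"],
    "purchase": ["Big Spender", "Low Spender"],
}

# Assumed control profile; the slides do not say which values were the baseline.
BASELINE = {
    "cookie": "No Cookies",
    "os": "Win 7",
    "browser": "Chrome 33",
    "click": None,      # no click history
    "purchase": None,   # no purchase history
}


def one_feature_variants(baseline, tested):
    """Build profiles that differ from the baseline in exactly one feature."""
    variants = []
    for feature, values in tested.items():
        for value in values:
            if value == baseline.get(feature):
                continue  # identical to the control profile, skip it
            profile = dict(baseline)
            profile[feature] = value
            variants.append((feature, value, profile))
    return variants


variants = one_feature_variants(BASELINE, TESTED)
```

Because every variant changes a single feature, any consistent difference between a variant and the baseline (beyond noise) can be attributed to that feature.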
![Page 29: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/29.jpg)
Home Depot
• Smartphone users see totally different products than desktop users
• 7% of products have different prices on Android
• …but the prices only go up by $0.50 on average
![Page 30: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/30.jpg)
Travel Sites
Cheaptickets and Orbitz offer lower prices on hotels for users who log in to the sites
• 1 hotel per page, $12 off per night on average
Travelocity offers discounts on hotels for users on mobile devices
• 1 hotel per page, $15 off per night on average
Priceline changes the order of search results based on click and purchase history
• Example of price steering
• 2 accounts click/reserve high price hotels
• 2 accounts click/reserve low price hotels
• 2 accounts do nothing
![Page 31: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/31.jpg)
Cheaptickets/Orbitz
![Page 32: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/32.jpg)
Cheaptickets/Orbitz
Cheaptickets and Orbitz offer lower prices on hotels for users who log in to the sites
• About 1 hotel per page has a lower price
• Prices drop by around $12 per night
[Figure: Avg. Price Difference ($)]
![Page 33: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/33.jpg)
Travelocity
Travelocity offers discounts on hotels for users on mobile devices
• iOS users see different hotels
• About 1 hotel per page has a lower price
• Price drops by around $15/night
![Page 34: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/34.jpg)
Priceline
Priceline changes the order of search results based on click and purchase history
• 2 accounts click/reserve high price hotels
• 2 accounts click/reserve low price hotels
• 2 accounts do nothing
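To check whether account groups like these are steered toward pricier results, one simple option is to compare the average price of the top few results each group sees. This is an assumed stand-in metric, not the one used in the study (the slides do not name it), and the function names and prices below are hypothetical.

```python
from statistics import mean


def mean_top_k_price(result_prices, k=5):
    """Average price of the first k results on one page."""
    top = result_prices[:k]
    return mean(top) if top else 0.0


def steering_gap(treatment_pages, control_pages, k=5):
    """Average top-k price seen by treatment accounts minus that of controls.

    A clearly positive value suggests pricier items are ranked higher
    for the treatment group.
    """
    t = mean(mean_top_k_price(page, k) for page in treatment_pages)
    c = mean(mean_top_k_price(page, k) for page in control_pages)
    return t - c


# Hypothetical pages: each inner list is hotel prices in ranked order.
high_spender_pages = [[300.0, 250.0, 100.0], [280.0, 270.0, 90.0]]
do_nothing_pages = [[100.0, 120.0, 300.0], [90.0, 110.0, 280.0]]
gap = steering_gap(high_spender_pages, do_nothing_pages, k=2)
```

In this made-up example both groups see the same kinds of hotels, but the high-spender accounts see the expensive ones first, which is what distinguishes price steering from price discrimination.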
![Page 35: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/35.jpg)
Hotels.com/Expedia
Hotels.com and Expedia are conducting large-scale A/B tests on their users
• When you visit the site, you are randomly placed in a “bucket”
• 2 out of 3 buckets see high-price hotels at the top of search results
• The remaining bucket sees low-price hotels at the top of the page
Exemplifies price steering
• The only way to see the hidden hotel results is to clear your cookies and reload the site
![Page 36: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/36.jpg)
Conclusions and Future Work
![Page 37: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/37.jpg)
The Era of Big Data
Algorithms driven by big data shape your world
• Search results you are given
• Prices and products you are shown
• Movie, music, and book recommendations
• The directions you use to drive
In many cases, these systems are wonderful
In other cases, they may be detrimental
• Unintended consequences
• Intentional manipulation
• Eligibility for social services
• Access to credit and banking
• Allocation of police forces
![Page 38: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/38.jpg)
Our Goal: Transparency
Personalization is problematic when it is not transparent
• How is data being collected and shared?
• How is data being used to alter content?
Use algorithm audits to investigate deployed systems, assess their impact
Our goal is to increase transparency
• Building tools to help users and regulators
• Reverse-engineering systems to understand how they work
• Raising public awareness of these issues
![Page 39: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/39.jpg)
Peeking Beneath the Hood of Uber
![Page 40: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/40.jpg)
Borders on Google Maps
![Page 41: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/41.jpg)
Discrimination in the Gig-economy
![Page 42: Auditing Algorithms : Towards Transparency in the Age of ...](https://reader030.fdocuments.in/reader030/viewer/2022012916/61c70901f188654b7e3b4dc1/html5/thumbnails/42.jpg)
All of our code, data, and papers are available at:
http://personalization.ccs.neu.edu