Abusing Social Networks for Automated User Profiling
Marco Balduzzi, Christian Platzer, Thorsten Holz, Engin Kirda, Davide Balzarotti, and Christopher Kruegel - 2010
4MMSR - Network Security 2011-2012
Benjamin LAVIALLE, Damien GRUEL
3/21/12
Student seminar
Abusing Social Networks for Automated User Profiling - Summary
● Introduction
● Ethical and legal Considerations
● Abusing E-mail Querying
● Evaluation with real-world Experiments
● Countermeasures
● Conclusion
Introduction
Social networks:
- Chatting
- Sharing personal information
Some users are careless about, or unaware of, the visibility of their information:
○ Real, complete profiles are online
○ The confidentiality of their information is violated
○ Interesting for hackers: spammers, hustlers, real-life attackers (kidnapping)
Ethical and Legal Considerations
○ Realistic experiments
○ Protect the users’ privacy
○ No accounts, passwords, protected areas, or protected information were broken into
○ No impact on the social networks' performance
○ Legal statement confirming
Abusing E-Mail Querying
Goals:
○ Validate a list of e-mail addresses for spamming
○ Create complete profiles by cross-checking
○ Social phishing: identity theft
○ Generate detailed profiles of a company's employees to prepare an attack on the company (e.g. guessing a password from the collected information)
Abusing E-Mail Querying
Implementation of the attack:
○ Address Prober: submits e-mail addresses to social networks, 1,000-5,000 addresses per 30 s => 500,000 addresses/day
○ Profile Crawler: visits users' profile pages and stores them in a database; 50,000 pages/day with one 'standard' computer
○ Correlator: combines and correlates profiles from different social networks
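The Correlator step can be illustrated with a minimal sketch. It assumes that profiles have already been collected as simple records and that the e-mail address is the join key. The field names (`email`, `network`, `name`, `age`) and the sample data are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def correlate(profiles):
    """Group profile records from different networks by e-mail address."""
    by_email = defaultdict(dict)
    for profile in profiles:
        # Normalise the address so that case differences do not split a user.
        key = profile["email"].strip().lower()
        # Keep every field except the join key and the network name.
        fields = {k: v for k, v in profile.items() if k not in ("email", "network")}
        by_email[key][profile["network"]] = fields
    return dict(by_email)

# Hypothetical sample records (not real data).
sample = [
    {"email": "Alice@example.com", "network": "NetworkA", "name": "Alice Example", "age": 31},
    {"email": "alice@example.com", "network": "NetworkB", "name": "al1ce", "age": 25},
    {"email": "bob@example.com", "network": "NetworkA", "name": "Bob Example", "age": 40},
]

merged = correlate(sample)
for email, per_network in merged.items():
    print(email, sorted(per_network))
```

Grouping by a normalised address is what lets a real name on one network be linked to a pseudonym on another.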
Abusing E-Mail Querying
Possibilities:
○ Association real name <=> pseudonym
○ Association real name <=> personal information hidden behind a fake name (e.g. sexual orientation)
○ Detection of inconsistent values (e.g. fake age)
Overview of system architecture
Evaluation with Real-World Experiments
10,427,982 e-mail addresses
Total of 1,228,644 profiles
Network      Method   Size efficiency   Speed efficiency   Accounts
Facebook     Direct   5000              10M/day            517,747
MySpace      GMail    1000              500K/day           209,627
Twitter      GMail    1000              500K/day           124,398
LinkedIn     Direct   5000              9M/day             246,093
Friendster   GMail    1000              400K/day           42,236
Badoo        Direct   1000              5M/day             12,689
Netlog       GMail    1000              800K/day           69,971
XING         Direct   500               3.5M/day           5,883
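As a quick consistency check, the per-network account counts in the table can be summed and compared with the totals quoted on the slide. The short sketch below uses only the figures from the table. The ratio it prints is profiles per queried address, not unique users, since one address may be registered on several networks.

```python
# Accounts found per network, copied from the table above.
accounts = {
    "Facebook": 517_747,
    "MySpace": 209_627,
    "Twitter": 124_398,
    "LinkedIn": 246_093,
    "Friendster": 42_236,
    "Badoo": 12_689,
    "Netlog": 69_971,
    "XING": 5_883,
}

queried_addresses = 10_427_982

total_profiles = sum(accounts.values())
profiles_per_address = total_profiles / queried_addresses

print(total_profiles)  # matches the slide's total of 1,228,644
print(f"{profiles_per_address:.1%}")
```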
Evaluation with Real-World Experiments
○ Lots of addresses on several social networks
○ Information found: photo, location, friends, age, sex, job, current relation...
○ Many users do not change the default privacy settings
Evaluation with Real-World Experiments
Automated guessing of user profiles
○ Some profiles...
○ ...automatic generation of possible e-mail addresses...
○ ...check addresses on eight social networks.
650 profiles -> 20,000 users with their profile and e-mail address
Detecting anomalous profiles by cross-correlation: Mismatched information
○ On Age
Range    #       %
2-10     4,163   60.17
11-30    1,790   25.87
31+      966     13.96
Profiles with age             20,085
Total with mismatched age     6,919
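The age-mismatch bucketing can be sketched as follows. The sketch assumes that "Range" is the gap, in years, between the youngest and oldest age that one user reports across networks, and that a gap below 2 years is not counted as a mismatch. Both points are interpretations of the table rather than statements from the slides, and the helper names are hypothetical.

```python
def age_spread(ages):
    """Gap in years between the oldest and youngest age reported by one user."""
    return max(ages) - min(ages)

def bucket(spread):
    """Map an age gap to the ranges used in the table (None = no mismatch)."""
    if spread < 2:
        return None  # assumption: small gaps are not counted as mismatches
    if spread <= 10:
        return "2-10"
    if spread <= 30:
        return "11-30"
    return "31+"

# Counts copied from the table above.
counts = {"2-10": 4_163, "11-30": 1_790, "31+": 966}
total_mismatched = sum(counts.values())
shares = {rng: round(100 * n / total_mismatched, 2) for rng, n in counts.items()}

print(bucket(age_spread([25, 31])))  # ages reported on two hypothetical networks
print(total_mismatched, shares)
```

Recomputing the shares from the counts reproduces the percentage column, so the percentages are relative to the 6,919 mismatched profiles and not to the 20,085 profiles that list an age.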
Detecting anomalous profiles by cross-correlation: Mismatched information
○ Into hidden profiles
Example of cross-correlation
Countermeasures
○ Raising Awareness: Mitigation From the User’s Perspective
○ CAPTCHAs
○ Contextual Information: the querier must supply personal information about the friend being searched for (location, age...)
Countermeasures
○ Limiting Information Exposure
○ Incremental Updates
○ Rate-limiting Queries
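The rate-limiting countermeasure can be illustrated with a small sliding-window limiter that caps how many e-mail addresses one client may look up per time window. This is only a sketch under stated assumptions: the class name, the parameters, and the limits in the usage example are hypothetical and do not describe what any social network actually deployed.

```python
import time
from collections import defaultdict, deque

class QueryRateLimiter:
    """Cap the number of addresses a client may query per sliding window."""

    def __init__(self, max_addresses, window_seconds, clock=time.monotonic):
        self.max_addresses = max_addresses
        self.window_seconds = window_seconds
        self.clock = clock                  # injectable for testing
        self.history = defaultdict(deque)   # client -> deque of (timestamp, count)
        self.used = defaultdict(int)        # client -> addresses used in window

    def allow(self, client_id, n_addresses):
        now = self.clock()
        events = self.history[client_id]
        # Forget lookups that have fallen out of the window.
        while events and now - events[0][0] >= self.window_seconds:
            _, count = events.popleft()
            self.used[client_id] -= count
        # Reject the whole query if it would exceed the budget.
        if self.used[client_id] + n_addresses > self.max_addresses:
            return False
        events.append((now, n_addresses))
        self.used[client_id] += n_addresses
        return True

# Hypothetical policy: at most 100 address lookups per client per hour.
limiter = QueryRateLimiter(max_addresses=100, window_seconds=3600)
print(limiter.allow("client-1", 80))   # within budget
print(limiter.allow("client-1", 5000)) # a bulk query far above the budget
```

Counting addresses rather than requests matters here, because the probing described earlier packs thousands of addresses into a single query.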
Conclusion
○ A common weakness, present in many popular social networking sites in 2010
○ The sixth countermeasure (rate-limiting queries) was the one chosen by Facebook and XING
○ 10.4 million e-mail addresses, 1.2 million user profiles identified