Group 4 1. Maithili Gokhale 2. Swati Sisodia 3. Aman Chanana 4. Piyush Agade “Uncovering Social Network Sybils in the Wild” - Zhi Yang, Christo Wilson, Xiao Wang, Tingting Gao, Ben Y. Zhao, Yafei Dai


The Renren Network

o Renren is one of the most popular (220 million users) OSNs in China.

o Functions: maintain personal profiles, upload photos, write diary entries (blogs), and establish bidirectional social links with friends.

o The most popular type of user activity is sharing blog entries, which can be forwarded across social hops like “retweets” on Twitter.


What are Sybils?

o Sybils are fake identities created to unfairly increase the power or resources of a single malicious user.

o Sybil accounts on Renren blend in extremely well with normal users to effectively attract friends and disseminate advertisements.

o They have completely filled user profiles with realistic background information, coupled with attractive profile photos.

o As its user population has grown, Renren has become an attractive venue for companies to disseminate information about their products.

o This has created opportunities for Sybil accounts to spam advertisements for companies.


Previous detectors on Renren

o Previously, Renren had already deployed a few techniques to detect Sybil accounts:

• using thresholds to detect spamming
• scanning content for suspect keywords and blacklisted URLs
• providing Renren users with the ability to flag accounts and content as abusive

o Disadvantages of these techniques:
• generally ad hoc
• require significant human effort
• effective only after spam content has been posted


Identifying Malicious Activities

o Definition: Malicious activities are actions taken by an attacker that directly or indirectly support a monetization strategy.

o Example: targeting users with spam and phishing attacks.

o The definition does not cover legitimate monetization strategies, such as keyword, banner, or news-feed advertising.

o In order for attackers to reach a user on OSNs, the attacker must first be friends with that user.

Profiles that are NOT considered

o Benign Fake Accounts:
• Although it is possible that an attacker could create benign Sybils that behave identically to normal users and appear on the surface to be real, we are only interested in detecting Sybil accounts that perform attacks.

o Inactive Accounts:
• Determining whether an inactive account is a malicious Sybil is challenging because there is no behavioral data (e.g., friend requests, status updates).
• The goal of the detector is to catch these accounts as quickly as possible once they become active to minimize the amount of damage they can do to normal users.


Characterizing Sybil Accounts

The features that would help distinguish Sybil accounts from normal users are:

o Invitation Frequency
o Outgoing Requests Accepted
o Incoming Requests Accepted
o Clustering Coefficient

Characterizing Sybil Accounts

1) Invitation Frequency

o The number of friend requests that a user has sent within a fixed time period.

o Figure shows the friend invitation frequency of our dataset, averaged over long-term (400-hour) and short-term (1-hour) time scales.

o Sybil accounts are much more aggressive in sending requests than normal users. There is a clear separation: accounts sending more than 20 invites per time interval are Sybils.
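As a rough illustration of the invitation-frequency feature, the sketch below counts friend requests per fixed time window. The function name, the use of timestamps in seconds, and the choice to report the busiest window are assumptions made for this example; the slides only state the 20-invites-per-interval separation.

```python
from collections import Counter

def invitation_frequency(request_times, window_seconds=3600):
    """Return the largest number of friend requests sent in any fixed window.

    request_times: iterable of timestamps in seconds (hypothetical input format).
    window_seconds: window size; 3600 corresponds to the 1-hour scale.
    """
    # Assign every request to the window it falls into and count per window.
    buckets = Counter(int(t // window_seconds) for t in request_times)
    if not buckets:
        return 0
    return max(buckets.values())

# Hypothetical account that sends one request per minute for 25 minutes.
burst = [i * 60 for i in range(25)]
looks_aggressive = invitation_frequency(burst) > 20
```

Reporting the busiest window rather than an average is a design choice of this sketch; an averaged variant would divide the total number of requests by the number of windows observed.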

Characterizing Sybil Accounts

2) Outgoing Requests Accepted

o It is the fraction of outgoing friend requests confirmed by the recipient.

o Figure shows a distinct difference between Sybils and normal users.

o Non-Sybil users have high accepted percentages, with an average of 79%.

o On average, only 26% of all friend requests sent by Sybil accounts are accepted.

Characterizing Sybil Accounts

3) Incoming Requests Accepted

o It is the fraction of incoming friend requests that users accept.

o Sybil accounts are nearly uniform: they accept all incoming friend requests (e.g., 80% of Sybils accepted all friend requests).

o Because Sybil accounts receive few friend requests, this detection mechanism can incur significant delay.

o The incoming requests accepted by non-Sybil users are spread across the board.

Characterizing Sybil Accounts

4) Clustering Coefficient

o A graph metric that measures the mutual connectivity of a user’s friends.

o Sybil accounts are likely to befriend users with no mutual friendships.

o Figure plots the CDF of cc values for each user’s first 50 friends (sorted by time).

o Non-Sybil users have cc values orders of magnitude larger than Sybil users.
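The clustering coefficient over a user's first 50 friends can be sketched as follows. This is a minimal version assuming an undirected friendship graph stored as a dictionary of sets and a friend list already sorted by friendship time; the function and parameter names are hypothetical.

```python
from itertools import combinations

def clustering_coefficient(friends, adjacency, first_k=50):
    """Fraction of pairs among a user's first_k friends that are friends themselves.

    friends: the user's friends, assumed sorted by the time the link was created.
    adjacency: dict mapping each user to the set of that user's friends.
    """
    subset = friends[:first_k]
    if len(subset) < 2:
        return 0.0  # no pairs to compare
    pairs = list(combinations(subset, 2))
    linked = sum(1 for a, b in pairs if b in adjacency.get(a, set()))
    return linked / len(pairs)

# Toy graph: a, b, c all know each other; d knows nobody in the list.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": set(),
}
```

A user whose friends are mutual strangers scores 0, which is the pattern the slides attribute to Sybils.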

Building and Running a Sybil Detector

o A Support Vector Machine (SVM) classifier is applied to a dataset of 1,000 normal users and 1,000 Sybils.

o Partition: five subsamples, four for training the classifier and one for testing it.

o The results show that the classifier is very accurate, correctly identifying 99% of both Sybil and non-Sybil accounts.

o Threshold values:
outgoing requests accepted % < 0.5 ∧ frequency > 20 ∧ cc < 0.01

o A properly tuned threshold-based detector can achieve performance similar to the computationally expensive SVM.
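The threshold rule on this slide translates directly into a predicate. The sketch below uses the three cut-offs quoted above as defaults; the function and parameter names are assumptions, and a deployed detector would tune these values rather than hard-code them.

```python
def is_sybil(accept_ratio, invite_freq, cc,
             max_accept=0.5, min_freq=20, max_cc=0.01):
    """Flag an account only when all three conditions from the slide hold.

    accept_ratio: fraction of outgoing friend requests that were accepted.
    invite_freq: friend requests sent per time interval.
    cc: clustering coefficient of the account's friends.
    """
    return accept_ratio < max_accept and invite_freq > min_freq and cc < max_cc

# Hypothetical feature vectors for illustration.
suspicious = is_sybil(accept_ratio=0.26, invite_freq=40, cc=0.001)
ordinary = is_sybil(accept_ratio=0.79, invite_freq=5, cc=0.1)
```

Because the conditions are joined with a logical AND, an account that trips only one or two of the thresholds is not flagged.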

Real-time Sybil Detection

o Uses the ground-truth dataset to build an adaptive, threshold-based Sybil detector.

o Monitors characteristics of Sybil accounts.

o After the detector has been bootstrapped, it uses an adaptive feedback scheme (drawn from the customer complaint rate) to dynamically tune the threshold parameters on the fly.

o Tuning the thresholds minimizes the likelihood of false-positive classifications of normal accounts as Sybils.

o It is unlikely to detect Sybils that behave like normal users.

o Drawback: it will not catch benign inactive Sybils. Inactive Sybils will not be detected until after they begin friending normal users.

Real-time Sybil Detection

o The detector incorporates real-time changes in friendship links when calculating acceptance percentages.

o In some cases, normal users accept friend requests from Sybils only to later revoke the friendship. This causes the accept percentage for the Sybil to drop.

o When Renren bans Sybils, all of their edges are destroyed. This causes the acceptance percentages for other Sybils with which they are linked to drop.

o In both cases, the decrease in acceptance percentage helps the detector to more accurately detect Sybils.
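One way to picture the effect of revoked or destroyed edges is to recompute the acceptance percentage from the links that exist right now instead of from historical accept events. The sketch below assumes that interpretation; the function name and data layout are hypothetical.

```python
def outgoing_accept_ratio(sent_to, current_friends):
    """Share of outgoing friend requests whose friendship link still exists.

    sent_to: users this account has sent friend requests to.
    current_friends: users currently linked to this account.
    """
    recipients = set(sent_to)
    if not recipients:
        return 0.0
    return len(recipients & set(current_friends)) / len(recipients)

sent = ["u1", "u2", "u3", "u4"]
before = outgoing_accept_ratio(sent, {"u1", "u2"})  # two requests accepted
after = outgoing_accept_ratio(sent, {"u1"})         # u2 later revoked the link
```

In this toy case the ratio falls from 0.5 to 0.25 after a single revocation, moving the account further below the acceptance threshold.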

False Positives

o To assess false positives, feedback to Renren’s customer support department was examined to determine whether accounts were banned erroneously.

o Computation: the complaint rate is used as an upper bound on false positives.

o During the 2-week period between December 13 and 26, 2010, Renren received nearly 50 complaints per day, with the complaint rate being almost 0.015, which is extremely low.

o Manual inspection confirms that 48% of the complaining accounts are Sybils, meaning that attackers attempted to recover Sybils by abusing the account recovery process.

o The true false-positive rate is therefore even lower than the daily complaint rate.


Analysis of Structural and Behavioral Attributes of Sybils

1. Topological Analysis

2. Clickstream Analysis

Topological Analysis

[Figure: graph of honest nodes and Sybil nodes, with normal edges, Sybil edges, and attack edges labeled]

Topological Analysis

Community Detection

o Given the network structure, is it possible to detect Sybil nodes?

o Community detection algorithms work under the assumption that Sybils form tight-knit communities.

Topological Analysis

o Normal users follow the same general trend as Sybil users.

o Only 20% of Sybils are connected by one or more Sybil edges.

Topological Analysis

Is it still possible that the connected minority is vulnerable to community detection?

o Community detection is not a viable option.

o Is this edge creation intentional?


Topological Analysis

o Most Sybil edge creation is interspersed randomly with edges created to normal users.

o For each Sybil, the sequence of edges is plotted, with the edges sorted chronologically by creation time.

Topological Analysis

o The majority of Sybils do not form communities.

o Even the Sybil Edges that are formed are unintentional.

Clickstream Analysis

o Each click is characterized by USER ID : TIMESTAMP : URL

o Clicks were grouped into five categories: Photo, Message, Share, Friending, Profile

Various aspects of the clickstream were analyzed:

o Number of clicks for each category
o Sequence of clicks for a particular session
o Session Duration: time between first and last click
o Session Frequency: how often a user logs in
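To make the session metrics concrete, the sketch below splits one user's click timestamps into sessions and measures each session's duration as the time between its first and last click. The 20-minute idle gap used to separate sessions is purely an assumption of this example, as are the function names; the slides do not say how sessions were delimited.

```python
def split_sessions(timestamps, idle_gap=1200):
    """Group click timestamps (seconds) into sessions.

    A new session starts when the gap since the previous click exceeds
    idle_gap seconds (hypothetical cut-off of 20 minutes).
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= idle_gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions

def session_durations(timestamps, idle_gap=1200):
    """Duration of each session: last click time minus first click time."""
    return [s[-1] - s[0] for s in split_sessions(timestamps, idle_gap)]

# Hypothetical clicks: a 48-second burst, then a 6-minute visit much later.
clicks = [0, 30, 48, 5000, 5360]
durations = session_durations(clicks)
```

Session frequency then follows by counting the sessions that fall on each day.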

Clickstream Analysis

Session Frequency

o Sixty-four percent of normal users access Renren no more than once per day.

o Only 8% of Sybils fall in this low-frequency range.

o Sybils averaged 3.9 sessions per day versus 1.5 for normal users.

Clickstream Analysis

Session Duration

o The median session duration for normal users is 6 minutes, whereas the median for Sybils is 48 seconds.

o Less than 25% of normal sessions are 48 s long.

o A very small percentage of Sybils exhibit sessions that are hours long.


Clickstream Analysis

Click Activity

Clickstream Analysis

Clickstream Modelling

To analyze the sequence of clicks from normal and Sybil nodes, a Markov model was created.

o Each state represents a category.

o Initial and final states are added to mark the beginning and end of each click sequence.

o Each edge represents the probability of transition from one state to the next.
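The Markov model described on this slide can be estimated by counting transitions between consecutive click categories. The following is a minimal sketch under that assumption; the state labels `INITIAL` and `FINAL`, the function name, and the toy sequences are hypothetical.

```python
from collections import defaultdict

START, END = "INITIAL", "FINAL"  # hypothetical labels for the added states

def transition_probabilities(sequences):
    """Estimate a first-order Markov model from click-category sequences.

    sequences: list of sessions, each a list of category names.
    Returns a dict: state -> {next_state: probability}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        path = [START, *seq, END]
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    model = {}
    for state, outgoing in counts.items():
        total = sum(outgoing.values())
        model[state] = {nxt: n / total for nxt, n in outgoing.items()}
    return model

# Two toy sessions dominated by friending and profile clicks.
model = transition_probabilities([
    ["Friending", "Friending", "Profile"],
    ["Friending", "Profile"],
])
```

Fitting one such model on Sybil sessions and another on normal sessions yields the two transition graphs that the later slides compare.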


Clickstream Analysis

There is a stark difference in the click activity, click sequences, and sessions of normal and Sybil users.

Can this difference be leveraged ?

Clickstream Analysis: SVM (Support Vector Machine)

Train an SVM on the following clickstream features:

o Session-level features, including:
• Average session length
• Average sessions per day

o Features from click activities:
• Percentage of clicks in each category
• Transition probabilities between categories

Clickstream Analysis

MLE (Maximum Likelihood Estimation)

MLE categorizes a user from their clickstream by examining which clickstream model better explains the user’s click sequence.

For a click sequence {s1, s2, ..., sn}:

Individual likelihood PM(si, si+1) = probability that the user transits from category si to category si+1 according to the model M.

Likelihood that model M reproduces the given clickstream = ∏ PM(si, si+1)
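The likelihood comparison can be sketched by summing log-probabilities (the logarithm of the product above) under each model and picking the larger one. The two models and their numbers below are invented for illustration, and the small floor probability for unseen transitions is an assumption of this sketch rather than something stated in the slides.

```python
import math

def log_likelihood(model, clicks, floor=1e-6):
    """Log of the product of P_M(s_i, s_{i+1}) over consecutive clicks.

    Transitions missing from the model get a small floor probability
    (an assumption of this sketch) so the logarithm stays defined.
    """
    total = 0.0
    for current, following in zip(clicks, clicks[1:]):
        total += math.log(model.get(current, {}).get(following, floor))
    return total

def classify(clicks, sybil_model, normal_model):
    """Label a click sequence by the model that explains it better."""
    if log_likelihood(sybil_model, clicks) > log_likelihood(normal_model, clicks):
        return "sybil"
    return "normal"

# Hypothetical transition probabilities, not taken from the paper.
sybil_model = {
    "Friending": {"Friending": 0.8, "Profile": 0.2},
    "Profile": {"Friending": 0.7, "Profile": 0.3},
}
normal_model = {
    "Photo": {"Photo": 0.7, "Profile": 0.3},
    "Profile": {"Photo": 0.6, "Profile": 0.4},
}

label = classify(["Friending", "Friending", "Profile"], sybil_model, normal_model)
```

Working in log space avoids numerical underflow when sequences are long, since the raw product of many probabilities quickly approaches zero.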

Spam Strategies and Collusion

o Share Spam on Renren
o Case Study: Spam Blogs
o Content-Based Sybil Components
o Temporal Correlation Between Sybils

Share Spam on Renren

o Sybils predominantly disseminate spam by sharing links to spam content.

o The number of shares per Sybil is much greater than the number of status updates or wall posts.

Share Spam on Renren

o 25% of the 237K Sybils share once before they are caught and banned.

o Less than 1% of Sybils go uncaught long enough to share 100 or more links.

Share Spam on Renren

o The shares of a random sample of 1,000 Sybils were manually examined.

o Sybils on Renren share two types of links:
• Blogs (62.5% of shares link to spam blog posts)
• Videos (37.5% of shares link to bogus online videos)

Case Study: Spam Blogs

o Classifying Spam Blogs
o Identifying Collusion
o Information Dissemination

Classifying Spam Blogs

The subset of blogs shared by Sybils was manually verified to be spam.

These blogs:
o Include links to phishing sites.
o Include links to websites selling contraband goods.
o The majority of them were banned by Renren’s security system.

Identifying Collusion

o Fundamental question: are Sybils colluding to promote spam blogs, or is each Sybil operating independently?

o Approach: the amount of duplication among the spam blogs was calculated.

o Only 302,333 unique spam blogs were promoted among the 3 million individual spam shares in the dataset.

Identifying Collusion

o The top 30 spam blogs were shared more than 10,000 times.

o 25% of spam blogs received 2 or more shares from Sybils.

Information Dissemination

o Sybils collude so that the spam blogs get featured in the trending content section on Renren.

o Sybils can inflate the popularity of spam blogs by making them artificially trend.

o Currently, Renren relies on manual inspection by humans to filter spam out of the trending section.

Content-Based Sybil Components

o Question: can content similarity be used to group Sybils into connected components?

o Intuitively, a strongly connected component is likely to be controlled by a single attacker.

o Understanding these components makes it possible to estimate the number of attackers threatening Renren.

o Collusion between Sybils is modeled as a content similarity graph.

Content-Based Sybil Components

o In a content similarity graph, Sybils are nodes and two Sybils are connected if they share similar content.

o Content similarity sij is computed between two sets si and sj, where si and sj are the sets of contents shared by Sybils i and j, respectively.

o It ranges from 0 to 1, where:
• 0: no duplication
• 1: Sybils share exactly the same content
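The similarity formula itself is not preserved in this transcript. A set-overlap measure that matches the stated properties (0 for no duplication, 1 for identical content) is the Jaccard index, so the sketch below assumes that definition; treat it as an illustration, not as the paper's confirmed formula.

```python
def content_similarity(s_i, s_j):
    """Jaccard index of two sets of shared content (assumed definition).

    Returns |s_i ∩ s_j| / |s_i ∪ s_j|, or 0.0 when both sets are empty.
    """
    s_i, s_j = set(s_i), set(s_j)
    union = s_i | s_j
    if not union:
        return 0.0
    return len(s_i & s_j) / len(union)

# Hypothetical blog identifiers shared by two Sybils.
partial = content_similarity({"blog1", "blog2", "blog3"}, {"blog2", "blog3", "blog4"})
```

Here two of the four distinct blogs are shared by both accounts, giving a similarity of 0.5.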

Content-Based Sybil Components

o Two Sybils i and j share similar content if sij is larger than some threshold Ts (or equal to Ts in the special case of Ts = 1).

o Ts = 0 is the most lax threshold.
o Ts = 1 is the strictest threshold.

o For Ts = 1, more than 50% of Sybils have at least one Sybil partner forwarding exactly the same content.

Content-Based Sybil Components

o Figure shows the quantity and sizes of connected components for different thresholds, ordered from largest to smallest.

Ts    Connected components    Giant component
0     4.9K                    219K (90%) Sybils
0.5   76K                     84K (35%) Sybils
1     114K                    3700 Sybils
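The way components shrink as Ts grows can be reproduced on a toy example. The sketch below links two Sybils when their similarity exceeds Ts (or equals it when Ts = 1) and groups them with a union-find structure. It assumes the Jaccard index as the similarity measure and compares all pairs, which is quadratic and only suitable for small illustrations; the names and data are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Assumed similarity measure: |a ∩ b| / |a ∪ b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def sybil_components(shared, threshold):
    """Connected components of the content similarity graph, largest first.

    shared: dict mapping a Sybil id to the set of content it shared.
    """
    parent = {s: s for s in shared}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(shared, 2):
        sim = jaccard(shared[a], shared[b])
        if sim > threshold or (threshold == 1 and sim == 1):
            parent[find(a)] = find(b)

    groups = {}
    for s in shared:
        groups.setdefault(find(s), set()).add(s)
    return sorted(groups.values(), key=len, reverse=True)

# Hypothetical shares: s1 and s2 are identical, s3 overlaps partly, s4 is unrelated.
shared = {"s1": {"x", "y"}, "s2": {"x", "y"}, "s3": {"y", "z"}, "s4": {"q"}}
lax = sybil_components(shared, 0)
strict = sybil_components(shared, 1)
```

With Ts = 0 the first three Sybils merge into one component, while Ts = 1 keeps only the exact duplicates together, mirroring the trend in the table above.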

Temporal Correlation Between Sybils

o Are there temporal correlations between Sybils that exhibit content similarity?

o We suspect that Sybils under the control of a single attacker will be active at similar times.

o If ti and tj are the sets of links that two Sybils i and j share during time interval ‘S’, the temporal similarity between them is defined in terms of the overlap between ti and tj.

Temporal Correlation Between Sybils

o Temporal similarity ranges from 0 to 1, with 0 meaning no overlap and 1 meaning exact overlap.

o The size of the time interval ‘S’ can be varied to control the granularity of comparisons.

o We evaluate time similarity over two time intervals: 1 hour and 1 day.
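The exact temporal similarity formula is missing from this transcript. One reading consistent with the 0-to-1 range and the adjustable interval size is to bucket each Sybil's share times into slots of width S and take the Jaccard index of the two slot sets. The sketch below assumes exactly that; both the definition and the names are hypothetical.

```python
def time_similarity(times_i, times_j, interval_seconds=3600):
    """Overlap of the time slots in which two Sybils shared content.

    times_i, times_j: share timestamps in seconds for Sybils i and j.
    interval_seconds: slot width S (3600 for 1 hour, 86400 for 1 day).
    Assumed definition: Jaccard index over occupied slots.
    """
    slots_i = {int(t // interval_seconds) for t in times_i}
    slots_j = {int(t // interval_seconds) for t in times_j}
    union = slots_i | slots_j
    if not union:
        return 0.0
    return len(slots_i & slots_j) / len(union)

# Hypothetical share times for two Sybils.
a = [100, 4000]
b = [200, 90000]
hourly = time_similarity(a, b, 3600)
daily = time_similarity(a, b, 86400)
```

Under this reading a coarser interval can only merge slots, which is why the day-scale similarity in the toy example is higher than the hour-scale one.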

Temporal Correlation Between Sybils

o Each line plots the average time similarity for discrete sets of Sybil pairs with close content similarity.

o For example, the first point of the hour-scale line represents the average time similarity for all pairs of Sybils with content similarity in the range of 0 to 0.1.

Temporal Correlation Between Sybils

o Figure reveals that time similarity is roughly proportional to content similarity.

o Sybils that share similar content tend to do so at similar times.

o Under the 1-day interval, Sybils that share near-identical content also exhibit a time similarity of nearly 0.92.

Making Sybil Defense Future-proof

o We discussed a scalable and accurate system that has been really effective in detecting Sybils in the Renren OSN.

o Can attackers adapt and circumvent the defense strategy discussed earlier?

o If yes, what options does an attacker have? What can an attacker control and manipulate?
• Invitation frequency?
• Incoming requests acceptance rate?
• Outgoing requests acceptance rate?
• Clustering coefficient?

Making Sybil Defense Future-proof

o Outgoing requests acceptance rate?
o Clustering coefficient?

o The only way a Sybil can influence these two features is by forming tight-knit communities with other Sybils.

o What will sending friend requests to other Sybils accomplish?
• Other Sybils will accept the requests; hence, the outgoing acceptance rate of the sender will inflate.
• A tight community of Sybils will imply a high clustering coefficient.

Have the Sybils won? There should be something more that could be done.

Fortunately, there is!

Making Sybil Defense Future-proof

o Before we discuss the new defense strategy, let us first formally describe the new attack model.

o An attacker controls N Sybils.

o Each Sybil sends a friend request to another Sybil with probability p and to normal users with probability (1-p).

o To avoid detection, let each Sybil maintain an outgoing acceptance rate of at least β, and let α be the probability that normal users accept their friend requests.

o So, to avoid being detected, each Sybil must send requests so that its overall outgoing acceptance rate, combining requests to Sybils and to normal users, stays at or above β.
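The inequality itself is not preserved in this transcript. Under the attack model above, and assuming that fellow Sybils always accept each other's requests, the expected acceptance rate is p + (1 - p)·α, so one consistent reconstruction of the constraint is p + (1 - p)·α ≥ β. The sketch below solves it for the smallest admissible p; the function name is hypothetical.

```python
def min_sybil_request_share(alpha, beta):
    """Smallest p satisfying p + (1 - p) * alpha >= beta.

    alpha: probability that a normal user accepts a Sybil's request (alpha < 1).
    beta: acceptance rate the Sybil wants to stay above.
    Assumes Sybils always accept requests from other Sybils.
    """
    if beta <= alpha:
        return 0.0  # requests to normal users alone already meet the target
    return (beta - alpha) / (1.0 - alpha)

# Parameters quoted in these slides for the simulations.
p_min = min_sybil_request_share(alpha=0.26, beta=0.5)
```

With α = 0.26 and β = 0.5 the bound comes out at roughly 0.32, which is consistent with the value p = 0.33 used in the simulations described in the slides.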

Making Sybil Defense Future-proof

o A study in which the new attack model was simulated (on a regional network in Renren with 170K nodes) suggests that the Sybil graph structure changes according to the input parameters.

o In the simulations, two models for directing the creation of Sybil edges were used:
• Erdős-Rényi: the attacker links randomly chosen Sybils.
• Preferential Attachment: the destination of each Sybil edge is chosen proportionally to the destination Sybil’s degree.

o In the simulations, α = 0.26, β = 0.5, and p = 0.33, and Blondel’s algorithm was used to detect communities in the regional graph.
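The two edge-creation models differ only in how the destination Sybil is picked. The sketch below shows one possible selection routine; it is not the simulator used in the study. The function name, the model labels, and the +1 added to each degree (so that Sybils with no edges can still be chosen) are assumptions of this example.

```python
import random

def pick_sybil_target(source, sybils, degree, model, rng):
    """Choose the destination Sybil for a new Sybil edge.

    model: "erdos-renyi" picks uniformly at random;
           anything else is treated as preferential attachment.
    degree: dict of current degrees per Sybil (missing entries count as 0).
    rng: a random.Random instance, passed in for reproducibility.
    """
    candidates = [s for s in sybils if s != source]
    if model == "erdos-renyi":
        return rng.choice(candidates)
    # Preferential attachment: weight each candidate by degree + 1 (assumed smoothing).
    weights = [degree.get(s, 0) + 1 for s in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(0)
sybils = ["a", "b", "c"]
degree = {"b": 1000}  # hypothetical: b is already very well connected
picks = [pick_sybil_target("a", sybils, degree, "preferential", rng) for _ in range(200)]
```

In the toy run nearly all preferential picks land on the high-degree Sybil, whereas the uniform model would split them evenly between the two candidates.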

Making Sybil Defense Future-proof

o For various values of n and N the following table was obtained.

N: number of Sybil nodes
n: number of friend requests sent per Sybil node

o The results are mixed. For n ≤ 300, the community detector is able to identify Sybils with high accuracy. However, as n grows, so does the false-positive rate.

* Uncovering Social Network Sybils in the Wild, Zhi Yang, et al.

So, community detection algorithms alone are not as precise as we want them to be: as n increases, the number of false positives increases.

Making Sybil Defense Future-proof

o In order for Sybil community detectors to be accurate (i.e., not generate false positives), they must leverage additional features beyond the graph topology (detecting communities).

o External Acceptance Rate: the external acceptance percentage is the fraction of friend requests sent by members of a community to users outside the community that are accepted.

This should work. Why? Because for Sybils the vast majority of accepted friend requests are from other Sybils inside the local community. Conversely, rejections are from normal users outside the local community.
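The external acceptance rate can be computed from a log of friend requests once a community has been identified. The sketch below assumes each request is recorded as a (sender, recipient, accepted) tuple; that layout and the function name are hypothetical.

```python
def external_acceptance_rate(community, requests):
    """Fraction of requests from community members to outsiders that were accepted.

    community: iterable of user ids in the detected community.
    requests: iterable of (sender, recipient, accepted) tuples.
    Returns None when the community sent no requests to outsiders.
    """
    members = set(community)
    outcomes = [
        accepted
        for sender, recipient, accepted in requests
        if sender in members and recipient not in members
    ]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)

# Hypothetical log: s1 and s2 are Sybils, u1..u3 are normal users.
requests = [
    ("s1", "s2", True),   # internal request, ignored
    ("s1", "u1", False),
    ("s2", "u2", False),
    ("s2", "u3", True),
    ("u1", "u2", True),   # sent by an outsider, ignored
]
rate = external_acceptance_rate({"s1", "s2"}, requests)
```

Requests inside the community are excluded, so colluding Sybils cannot raise this number by accepting each other's invitations.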

Conclusion

o We discussed the behaviour of Sybils to create a feature-based Sybil detector that can catch 99% of Sybils, with low false-positive and false-negative rates.

o Next we saw a characterization of Sybil graph topology on a major OSN (Renren), and we found that Sybils on Renren do not obey the behavioural assumptions that underlie previous work on decentralized Sybil detectors. 80% of Sybils do not connect to other Sybils; instead they focus on connecting with normal users.

o We also analyzed Sybil clickstream and learnt that Sybils do not waste time browsing photos or viewing profiles; they prefer visiting profiles.

o Finally, we learnt that social links between Sybils are inadequate for identifying colluding behaviour. Sybils with no social connections still act in concert to spread spam.


Question?


Thank you!