PhishAri: Automatic Realtime Phishing Detection on Twitter
Anupama Aggarwal, Ashwin Rajadesingan, Ponnurangam Kumaraguru
Motivation: Some Statistics
• $520 million were lost worldwide from phishing attacks in 2011 alone. (RSA Report)
• In 2012, around 20% of all phishing attacks targeted Facebook
• Social network phishing attacks jumped 221% during Q1 of 2012
Phishing Detection on Online Social Media (OSM): Current State of the Art
• Offline Spam Characterization & Detection Studies
• No characterization of phishing on OSM
• Lack of Realtime detection mechanisms
• Absence of end-user deployed systems
• Dependence on Spam/Phishing Blacklists
What Did We Do to Fill the Gap?
• Built a mechanism to Automatically detect phishing on Twitter in Realtime
• No dependency on Blacklists
• Deployed end-user system for Twitter users - Chrome Extension
Twitter 101
• Tweets - at most 140 characters
• Example: "Hey, I am in Puerto Rico attending @APWG eCrime research. Talking about #phishing on OSN"
• Example: "Earn Money #help #money http://bit.ly/Pw637z"
• @Tag - To mention/reply to a Twitter user
• #Tag - To mention a topic
• URL in Tweet - To link external media
• Followees - accounts a user follows ("I’ll follow Grey1!", "I’ll follow Grey2!")
• Followers - accounts that follow a user ("We’ll follow Blue!")
• Retweet (RT) - sharing someone else's tweet ("Nice! I’ll share this tweet in my network!")
• Twitter Timeline - Tweets and Retweets by Followees, Tweets and Retweets by Self, Tweets with @Blue
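The tweet anatomy on this slide (@Tag, #Tag, URL) can be pulled apart with a few regular expressions. The sketch below is only an illustration: the patterns are simplified assumptions rather than Twitter's official entity rules, and `parse_tweet` is a hypothetical helper name.

```python
import re

# Simplified patterns (assumptions), not Twitter's official entity grammar.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")


def parse_tweet(text):
    """Split a tweet into its mentions, hashtags, and URLs."""
    urls = URL_RE.findall(text)
    # Strip URLs first so characters inside a link are not read as tags.
    without_urls = URL_RE.sub(" ", text)
    return {
        "mentions": MENTION_RE.findall(without_urls),
        "hashtags": HASHTAG_RE.findall(without_urls),
        "urls": urls,
        "length": len(text),
    }


tweet = "Earn Money #help #money http://bit.ly/Pw637z"
print(parse_tweet(tweet))
```

Removing the URL before matching tags also handles a tag glued directly to a link, as in "#moneyhttp://bit.ly/Pw637z".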
Challenges of Phishing Detection on Twitter
• Only 140 Characters - very little information
• Use of short URLs in tweets
• 100,000 Tweets per minute - quick spread
• Phishing Blacklists are slow - not reliable
Our Contribution
• PhishAri: Automatic realtime phishing detection mechanism for Twitter
• More efficient than plain blacklisting method
• Better than Twitter’s own phishing detection mechanism
• Real-world implementation of the system - Chrome Extension for Twitter
Methodology
• Step 1: Classification Model for Phishing Detection
  • Data Collection
  • Feature Extraction
  • Classification
• Step 2: Realtime end-user Interface
  • Using pre-trained classification model
  • Chrome Browser Extension
Data Collection
• Wait for 3 days
• 1,589 Phishing Tweets
• 903 Unique phishing URLs
Features Used
• URL Features - Length, number of dots, characters, redirections
• WHOIS Features - domain name, ownership period
• Tweet Features - Number of #tags, @mentions, length, trending topics
• Network Features - Follower/Followee ratio, Age of account, Number of Tweets
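A minimal sketch of how some of the listed URL and Network features might be computed. The function names, the dictionary keys, and the choice of which characters to count are assumptions made for illustration; redirection and WHOIS features need network lookups and are left out.

```python
from urllib.parse import urlparse


def url_features(url):
    """Lexical URL features: length, dots, and simple character counts."""
    host = urlparse(url).netloc
    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Anything that is not a letter or digit counts as "special" here.
        "num_special": sum(not ch.isalnum() for ch in url),
        "host_length": len(host),
    }


def network_features(followers, followees, account_age_days, num_tweets):
    """Account-level features of the user who posted the tweet."""
    return {
        # Assumption: treat zero followees as one to avoid division by zero.
        "follower_followee_ratio": followers / max(followees, 1),
        "account_age_days": account_age_days,
        "num_tweets": num_tweets,
    }


print(url_features("http://bit.ly/Pw637z"))
# Hypothetical account numbers, chosen only to show the call.
print(network_features(followers=10, followees=200,
                       account_age_days=3, num_tweets=50))
```

The two dictionaries can be merged into one feature vector per tweet before being passed to a classifier.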
Classification Results
Evaluation Metric      Naive Bayes   Decision Tree   Random Forest
Accuracy               87.02%        89.28%          92.52%
Precision (Phishing)   89.21%        88.05%          95.24%
Precision (Safe)       92.12%        94.15%          97.23%
Recall (Phishing)      68.32%        74.51%          92.21%
Recall (Safe)          85.68%        89.20%          95.54%
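For readers unfamiliar with the metrics in this table, the sketch below shows how accuracy and per-class precision and recall follow from a confusion matrix, with "phishing" as the positive class. The counts are hypothetical and are not the study's data; `classification_metrics` is an illustrative helper name.

```python
def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when a class has no samples."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Metrics with phishing as the positive class.

    tp: phishing tweets flagged as phishing
    fp: safe tweets flagged as phishing
    fn: phishing tweets flagged as safe
    tn: safe tweets flagged as safe
    """
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision_phishing": safe_div(tp, tp + fp),
        "recall_phishing": safe_div(tp, tp + fn),
        "precision_safe": safe_div(tn, tn + fn),
        "recall_safe": safe_div(tn, tn + fp),
    }


# Hypothetical counts, chosen only to illustrate the formulas.
for name, value in classification_metrics(tp=90, fp=5, fn=10, tn=95).items():
    print(f"{name}: {value:.2%}")
```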
Evaluation
• Comparison with Blacklists
  • PhishAri detected 80.6% more phishing tweets at zero hour; blacklists caught these tweets only after 3 days
• Comparison with Twitter’s defense mechanism
  • PhishAri detected 84.6% more phishing tweets at zero hour; Twitter marked these tweets as suspicious only after 3 days
Time Evaluation
• Used Intel Xeon 16 core Ubuntu server with 2.67 GHz processor and 32 GB RAM
• Multiprocessing Modules for faster processing
• Time required for the feature extraction & classification of a tweet is a maximum of 0.522 seconds (Min: 0.167 sec, Avg: 0.425 sec, Median 0.384 sec)
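The min, average, median, and maximum figures above are the kind of summary one gets by timing each tweet separately. A minimal sketch of such a measurement follows; `classify_stub` is a hypothetical stand-in for the real feature extraction and classification step, so the numbers it prints say nothing about PhishAri itself.

```python
import statistics
import time


def time_per_item(func, items):
    """Run func on each item and record the wall-clock time of each call."""
    timings = []
    for item in items:
        start = time.perf_counter()
        func(item)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Summary statistics in seconds."""
    return {
        "min": min(timings),
        "avg": statistics.mean(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def classify_stub(tweet):
    """Hypothetical placeholder for feature extraction plus classification."""
    return "http" in tweet


tweets = ["Earn Money #help #money http://bit.ly/Pw637z",
          "Talking about #phishing on OSN"]
print(summarize(time_per_item(classify_stub, tweets)))
```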
Text Analysis
[Figure: text analysis of Legitimate Tweets vs. Phishing Tweets]
PhishAri: RESTful API
• Use above classification model to create a RESTful API
• POST requests can be made to API to query a tweet
• Pre-trained classifier model used for classification of new tweets
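A client query to such an API could look roughly like the sketch below. The endpoint URL, the JSON field name `tweet`, and the helper `build_query` are assumptions for illustration; the slides do not give the actual request format.

```python
import json
import urllib.request

# Hypothetical endpoint; the real PhishAri API address is not given here.
API_URL = "http://localhost:8000/classify"


def build_query(tweet_text, api_url=API_URL):
    """Build a POST request carrying one tweet as a JSON body."""
    payload = json.dumps({"tweet": tweet_text}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query("Earn Money #help #money http://bit.ly/Pw637z")
print(request.get_method(), request.full_url)

# Sending is left commented out because the endpoint is a placeholder:
# with urllib.request.urlopen(request, timeout=5) as response:
#     verdict = json.loads(response.read())
```

Keeping the classifier behind an API means clients only send the tweet and read back a verdict, while the pre-trained model stays on the server.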
PhishAri Chrome Extension
• Red / Green Indicators in front of Tweets with URLs
• Detects phishing tweets on
  • User Timeline
  • Twitter search results
  • Profile of other users
  • DMs (limited support for now)
PhishAri Chrome Extension: Demo
How Does the Extension Work?
• Integration of API with the Browser Extension
PhishAri Extension: User Experience and Statistics
• 78 Active Users
• User study shows that
  • users want support for other browsers and mobile apps
  • users found the extension useful
  • users want more robustness
Conclusion
• “Phish” + “Ari” = Realtime Automatic Detection
• 92.52% Accuracy with Random Forest Classifier
• Efficient - takes at most 0.522 seconds for the indicator to appear
• No dependency on Blacklists
• Faster than Blacklists
• Faster than Twitter’s own detection mechanism
Future Work
• Backend database for faster lookup
• Increase the scope of PhishAri from public to all tweets
• Reduce the response time of PhishAri so that indicators appear sooner
• Support for other browsers and mobile apps
Thank You!
Questions? Suggestions?