John P John Fang Yu, Yinglian Xie, Arvind Krishnamurthy, Martin ...
deSEO: Combating Search-Result Poisoning
John P John, Fang Yu, Yinglian Xie,
Arvind Krishnamurthy, Martin Abadi
University of Washington & MSR, Silicon Valley
The malware pipeline
• Malware links spread through:
• spam emails, spam IMs, social networks, search results, etc.
• We look at search results
bad stuff
spread malicious links via email, IM, search results
compromise web servers and host malicious content
find vulnerable web servers
Is this really a problem?
• ~40% of popular searches contain at least one malicious link in top results
• Scareware fraud made $150 million in profit last year
Contributions
• How does the search poisoning attack work?
• What can we learn about such attacks?
• How can we defend against them?
- examined a live attack involving 5,000 compromised sites
- identified common features in search poisoning attacks
- developed deSEO, which detected new live SEO attacks on 1,000+ domains
Anatomy of SEO attack
search query
search engine
redirection server
exploit server
compromised Web server
Analysis of an attack
• Examine a specific attack
• August - October 2010
• 5,000 compromised domains
• Tens of thousands of compromised keywords
• Millions of SEO pages generated
How are servers compromised?
• Sites running osCommerce
• Unpatched vulnerabilities
• Allows attackers to host any file on the Web server, including executables
www.example.com/admin/file_manager.php/login.php?action=processuploads
What files are uploaded?
• PHP shell to manage file operations
• HTML templates, images
• PHP script to generate SEO web pages
The main php script
• Obfuscated script
• Simple encryption using nested evals
www.example.com/images/page.php?page=kobayashi+arrested
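The URL above carries the targeted search phrase in its `page` argument. A minimal sketch of how an analyst might recover that phrase from such a URL, assuming the phrase always sits in a single query parameter (the helper name `keyword_from_url` is hypothetical, not part of deSEO):

```python
from urllib.parse import urlsplit, parse_qs

def keyword_from_url(url: str, param: str = "page") -> str:
    """Return the decoded search phrase carried in the query string, or ''."""
    # Add a scheme so the host is not mistaken for part of the path.
    if "://" not in url:
        url = "http://" + url
    query = urlsplit(url).query
    # parse_qs decodes '+' as a space and percent-escapes as characters.
    values = parse_qs(query).get(param, [])
    return values[0] if values else ""

print(keyword_from_url("www.example.com/images/page.php?page=kobayashi+arrested"))
# prints: kobayashi arrested
```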
The main script (de-obfuscated)
Check if search crawler
Generate page for keyword
Fetch: snippets from Google, images from Bing
Add links to other compromised sites
Cache page
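The control flow listed on this slide can be modelled in a few lines, which helps when reasoning about what a crawler sees. This is an illustrative model only, not the attackers' script: the crawler markers, the function names, and the `build_page` callback (standing in for the fetch and link steps) are all assumptions.

```python
from typing import Callable, Dict, Optional

# Assumption: crawlers are recognized by substrings of the User-Agent header.
CRAWLER_MARKERS = ("googlebot", "bingbot")

def is_search_crawler(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)

def serve(user_agent: str, keyword: str, cache: Dict[str, str],
          build_page: Callable[[str], str]) -> Optional[str]:
    """Model of the slide's steps: crawler check, page generation, caching."""
    if not is_search_crawler(user_agent):
        return None  # handling of other visitors is not described on this slide
    if keyword not in cache:
        # build_page stands in for fetching snippets/images and adding links.
        cache[keyword] = build_page(keyword)
    return cache[keyword]

cache: Dict[str, str] = {}
page = serve("Googlebot/2.1", "halloween 2010", cache, lambda kw: f"<h1>{kw}</h1>")
print(page is not None, len(cache))
```

Because the page is cached per keyword, repeated crawler visits see identical content, which is consistent with the similarity features discussed later in the deck.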
Dense link structure
• Other compromised domains found by crawling included links
• Each site linked to 200 other sites
• ~5,000 compromised domains identified
• Each site hosted 8,000 SEO pages
• 40 million pages total
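Following the included links from one known site is a graph traversal. A minimal sketch, assuming the link structure has already been extracted into a dictionary (the toy domains and the function name `discover_domains` are hypothetical); the last line checks the slide's page-count arithmetic:

```python
from collections import deque
from typing import Dict, List, Set

def discover_domains(seed: str, links: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first traversal over the links found on each site's SEO pages."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        domain = queue.popleft()
        for neighbour in links.get(domain, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# Toy link graph; real sites linked to about 200 others each.
toy_links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["a.example", "d.example"],
    "c.example": [],
    "d.example": ["a.example"],
}
print(sorted(discover_domains("a.example", toy_links)))

# Scale check from the slide: 5,000 domains x 8,000 pages each.
print(5_000 * 8_000)  # 40000000
```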
Poisoned keywords
• 20,000+ popular search terms poisoned
• Google Trends + Bing related searches
• haiti earthquake
• senate elections
• veterans day 2010
• halloween 2010
• thanksgiving 2010 ...
• 95% of Google Trends keywords poisoned
Redirection servers
• Three domains used for redirection
• Over 1,000 exploit URLs fetched
Almost 100,000 victims over 10 weeks
Evasive techniques
• Why can’t redirection behavior be easily detected?
• Cloaking
• Requiring user interaction
• Redirection through JavaScript or Flash
What are prominent features in search poisoning?
• Dense link structure
• Automatic generation of relevant pages
• Large number of pages with popular keywords
• Behavior of compromised sites
• before: diverse content and behavior
• after: similar content and behavior
deSEO steps
1. History-based filtering: select domains where many new pages are set up, different from older pages
2. Clustering suspicious domains: using K-means++
3. Group similarity analysis: select groups where new pages are similar across domains
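Step 2 names K-means++, whose distinguishing part is how the initial cluster centers are chosen. A minimal sketch of that seeding step only, assuming each domain has already been turned into a numeric feature vector (the vectors and function names below are hypothetical, and this is not deSEO's implementation):

```python
import random
from typing import List, Sequence

Point = Sequence[float]

def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_seeds(points: List[Point], k: int, rng: random.Random) -> List[Point]:
    """Pick k initial centers: the first uniformly at random, each later one
    with probability proportional to its squared distance to the nearest
    center chosen so far."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with an existing center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

# Hypothetical 2-D feature vectors for four domains.
vectors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9)]
print(kmeans_pp_seeds(vectors, 2, random.Random(0)))
```

The squared-distance weighting makes it unlikely that both seeds land in the same tight group, which is the usual motivation for preferring K-means++ over uniform seeding.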
Sample web URLs with trendy keywords
http://www.askania-fachmaerkte.de/images/news.php?page=justin+bieber+breaks+neck
History-based detection
Domain clustering - lexical features of URLs
• String features: keyword separators, arguments, filename, path
• Numerical features: number of arguments, length of arguments, length of keywords
• Bag of words: set of keywords
Sample web URLs with trendy keywords
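The lexical features listed on this slide can be read directly off a URL. A minimal sketch, assuming keywords live in the query values and are separated by `+`, `-`, or `_` (the separator set, the dictionary keys, and the function name are assumptions, not deSEO's actual feature definitions):

```python
import re
from urllib.parse import urlsplit

def lexical_features(url: str) -> dict:
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    # Work on the raw query so separators such as '+' remain visible.
    raw_pairs = [pair.partition("=") for pair in parts.query.split("&") if pair]
    arg_names = [name for name, _, _ in raw_pairs]
    raw_values = [value for _, _, value in raw_pairs]
    separators = sorted({ch for v in raw_values for ch in v if ch in "+-_"})
    keywords = [w for v in raw_values for w in re.split(r"[+\-_]+", v) if w]
    return {
        # String features
        "path": directory,
        "filename": filename,
        "arguments": arg_names,
        "separators": separators,
        # Numerical features
        "num_arguments": len(arg_names),
        "argument_length": sum(len(v) for v in raw_values),
        "keyword_length": sum(len(w) for w in keywords),
        # Bag of words
        "bag_of_words": set(keywords),
    }

sample = "http://www.askania-fachmaerkte.de/images/news.php?page=justin+bieber+breaks+neck"
for name, value in lexical_features(sample).items():
    print(name, value)
```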
History-based detection
Domain clustering - lexical features of URLs
Group analysis - web page feature similarity
Regular expressions - to match URLs not in our sample
.*\/xmlrpc\.php\/\?showc=\w+(\+\w+)+$
Sample web URLs with trendy keywords
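The regular expression on this slide can be applied as-is to URLs outside the sample. A minimal sketch using Python's `re` module; the pattern is copied from the slide, while the test URLs and the helper name `matches_signature` are hypothetical:

```python
import re

# Pattern from the slide: an xmlrpc.php path followed by a 'showc' argument
# holding at least two '+'-separated words.
SHOWC_PATTERN = re.compile(r".*\/xmlrpc\.php\/\?showc=\w+(\+\w+)+$")

def matches_signature(url: str) -> bool:
    return SHOWC_PATTERN.match(url) is not None

candidates = [
    "http://blog.example.com/xmlrpc.php/?showc=justin+bieber",  # multi-word keyword
    "http://blog.example.com/xmlrpc.php/?showc=justin",         # single word, no '+'
    "http://blog.example.com/index.php?showc=justin+bieber",    # different script
]
for url in candidates:
    print(matches_signature(url), url)
```

Requiring at least one `+` means single-word arguments do not match, which narrows the signature to URLs that carry multi-word phrases.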
deSEO findings
• 11 malicious groups from sampled web graph in January 2011
• 957 domains
• 15,482 URLs
• Revealed a new search poisoning attack
• compromised WordPress installations
• cloaking to avoid detection
• different link topology
Applying to search results
• 120 keyword searches in Google and Bing
• 163 malicious URLs detected in results
• 43 search terms affected
Conclusion
• Malware and SEO are big problems
• Analyzed an ongoing scareware campaign
• Identified thousands of compromised domains
• Identified prominent features in SEO attacks and used them to build deSEO
• Promising results on a partial dataset from Bing
• Identified multiple live SEO attacks