Web Spambot Detection Based on Web Navigation Behaviour


Page 1: Web Spambot Detection  Based on  Web Navigation Behaviour

Web Spambot Detection Based on Web Navigation Behaviour

Pedram Hayati, Vidyasagar Potdar, Kevin Chai, Alex Talevski

Anti-Spam Research Lab (ASRL), Digital Ecosystem and Business Intelligence Institute

Curtin University, Perth, Western Australia

Page 2: Web Spambot Detection  Based on  Web Navigation Behaviour

www.AntiSpamResearchLab.com

Introduction

• Spam is junk, unrelated, unwelcome, anonymous content.

• Spam now spreads not only through email but also through Web 2.0 applications.

• This new trend of spamming is called Spam 2.0.

Page 3: Web Spambot Detection  Based on  Web Navigation Behaviour

Examples of Spam 2.0

• Hosting spam content in Web applications on legitimate websites¹.

¹ P. Hayati, V. Potdar, A. Talevski, N. Firoozeh, S. Sarenche, E. A. Yeganeh. Spam 2.0 Definition, New Spamming Boom. DEST 2010, Dubai, UAE, April 2010.

Page 4: Web Spambot Detection  Based on  Web Navigation Behaviour

Web SpamBot

• A tool used by spammers to distribute Spam 2.0.

• Uses the idea of Web robots.

• Mimics human user behaviour.

• Wastes useful resources.

To counter Spam 2.0, we can concentrate on detecting Web spambots, the source of the Spam 2.0 problem.

Page 5: Web Spambot Detection  Based on  Web Navigation Behaviour

Spam 2.0

Page 6: Web Spambot Detection  Based on  Web Navigation Behaviour

Countermeasures

• Mostly focused on email spam detection.

• Content-based and meta-content-based.

• Some are applicable to the Web environment, such as link-based detection.

• CAPTCHA

– Possible to bypass using ML.

– Machines can be better at deciphering it than humans.

– Inconveniences human users.

Page 7: Web Spambot Detection  Based on  Web Navigation Behaviour

Problem

• Existing countermeasures are not suitable for Web 2.0 platforms

– Spam is hosted on legitimate websites (parasitic nature).

– We cannot blacklist a whole website because of its spam posts.

Page 8: Web Spambot Detection  Based on  Web Navigation Behaviour

Our Solution

• Study Web spambot behaviour in order to stop Spam 2.0.

• Fundamental assumption:

– Spambot behaviour is intrinsically different from that of humans.

• Use Web usage data.

– Contains information about user navigation through the website.

– Can be gathered implicitly.

• Convert Web usage data into a format that is

– Extendible

– Discriminative
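A minimal sketch of one possible first step, assuming the Web usage data arrives as individual request records: group the records into one time-ordered navigation sequence per session. The `Request` fields and `build_sessions` are hypothetical names chosen for illustration; the slides do not specify the record format.

```python
from collections import defaultdict
from typing import NamedTuple


class Request(NamedTuple):
    """One web usage record; all field names are hypothetical."""
    session_id: str   # assumed identifier tying requests to one visit
    timestamp: float  # seconds since the epoch
    url: str          # requested page


def build_sessions(records):
    """Group records by session and order each group by request time."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.session_id].append(rec)
    return {
        sid: [r.url for r in sorted(reqs, key=lambda r: r.timestamp)]
        for sid, reqs in grouped.items()
    }


# Illustrative records only, not data from the study.
records = [
    Request("s1", 10.0, "/board/5"),
    Request("s2", 11.0, "/register"),
    Request("s1", 12.5, "/board/5/new-thread"),
]
sessions = build_sessions(records)
```

Each resulting sequence is the kind of per-user navigation trail that later feature extraction could work from.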

Page 9: Web Spambot Detection  Based on  Web Navigation Behaviour

Our Solution

• Propose a new feature set called Action.

– An action is a set of webpages a user requests to achieve a certain goal.

• Example

– In an online forum, a user navigates to a specific board, then goes to the New Thread page to start a new topic.

– This navigation can be formulated as a "submitting new content" action.
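The forum example can be sketched in code. This is an illustration under stated assumptions, not the authors' algorithm: the `ACTIONS` table, the page-type labels and the contiguous-match rule are all hypothetical. It maps a session's page sequence to actions and counts how often each action occurs.

```python
from collections import Counter

# Hypothetical action definitions: each action is an ordered list of page
# types requested to reach a goal. Names are illustrative only.
ACTIONS = {
    "submit_new_content": ["board", "new_thread"],
    "register": ["register_form", "register_submit"],
}


def extract_actions(pages):
    """Return the actions found in a session, scanning left to right.

    An action matches when its page sequence appears as a contiguous run;
    pages that start no known action are skipped.
    """
    found = []
    i = 0
    while i < len(pages):
        for name, pattern in ACTIONS.items():
            if pages[i:i + len(pattern)] == pattern:
                found.append(name)
                i += len(pattern)
                break
        else:
            i += 1
    return found


def action_frequencies(pages):
    """Count how often each known action occurs in one session."""
    counts = Counter(extract_actions(pages))
    return {name: counts.get(name, 0) for name in ACTIONS}


session = ["index", "board", "new_thread", "board", "new_thread"]
freqs = action_frequencies(session)
```

For this made-up session, `freqs` records two occurrences of `submit_new_content` and none of `register`. A vector of such counts per session is one way the action feature could be fed to a classifier.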

Page 10: Web Spambot Detection  Based on  Web Navigation Behaviour

Framework

Page 11: Web Spambot Detection  Based on  Web Navigation Behaviour

Action Extraction

Page 12: Web Spambot Detection  Based on  Web Navigation Behaviour

Algorithm

Page 13: Web Spambot Detection  Based on  Web Navigation Behaviour

Dataset

• 60-day study of Web spambot behaviour on a live discussion board (HoneySpam 2.0 Project).

• 1-month study of human user behaviour.

Page 14: Web Spambot Detection  Based on  Web Navigation Behaviour

Action Frequency of Humans and Spambots

Page 15: Web Spambot Detection  Based on  Web Navigation Behaviour

Performance Measurement

• Matthews Correlation Coefficient (MCC)
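For reference, MCC is computed from the four confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below implements that standard formula; treating spambots as the positive class and returning 0.0 when the denominator is zero are assumptions made here, not details taken from the slides.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    tp/fn: spambot sessions classified correctly/incorrectly (assumed
    positive class); tn/fp: human sessions classified correctly/incorrectly.
    Returns a value in [-1, 1]; 0.0 by convention if the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

A value of 1 means perfect agreement between predictions and labels, 0 is no better than chance, and -1 is total disagreement.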

Page 16: Web Spambot Detection  Based on  Web Navigation Behaviour

Results

Page 17: Web Spambot Detection  Based on  Web Navigation Behaviour

Conclusion

• We proposed an innovative approach: focusing on spambot identification to manage spam rather than analysing spam content.

• We proposed a novel framework to detect spambots inside Web 2.0 applications, which leads to Spam 2.0 detection.

• We proposed a new feature set, i.e. action navigations, to detect spambots.

• We validated our framework against an online forum and achieved 96.24% accuracy using the MCC measure.

Page 18: Web Spambot Detection  Based on  Web Navigation Behaviour

Thank YOU!

Web Spambot Detection Based on Web Navigation Behaviour

• Pedram Hayati – [email protected]

• Vidyasagar Potdar – [email protected]

• Kevin Chai – [email protected]

• Alex Talevski – [email protected]

• Anti-Spam Research Lab (ASRL)

• Digital Ecosystem and Business Intelligence Institute

• Curtin University, Perth, Western Australia

• www.antispamresearchlab.com