Automating Analysis of Large-Scale Botnet Probing Events

Automating Analysis of Large-Scale Botnet Probing Events Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson* Lab for Internet and Security Technology (LIST) Northwestern University * UC Berkeley / ICSI


Transcript of Automating Analysis of Large-Scale Botnet Probing Events

Page 1: Automating Analysis of Large-Scale Botnet Probing Events

Automating Analysis of Large-Scale Botnet Probing Events

Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson*

Lab for Internet and Security Technology (LIST), Northwestern University

* UC Berkeley / ICSI

Page 2: Automating Analysis of Large-Scale Botnet Probing Events


Motivation

[Figure: botnets in the IPv4 space probing an enterprise; the enterprise's administrators ask the question below.]

Does this attack specifically target us?

Can we answer this question with only the limited information observed locally in the enterprise?

Page 3: Automating Analysis of Large-Scale Botnet Probing Events


Motivation

• Can we infer the probe strategy used by botnets?

• Can we infer whether a botnet probing attack specifically targets a certain network, or whether we are just part of a larger, indiscriminate attack?

• Can we extrapolate botnet global properties given limited local information?

Page 4: Automating Analysis of Large-Scale Botnet Probing Events


Agenda

• Motivation
• Basic framework
• Discover the botnet probing strategies
• Extrapolate global properties
• Evaluation
• Conclusions

Page 5: Automating Analysis of Large-Scale Botnet Probing Events


Botnet Probing Events

Big spikes in the number of probers are mainly caused by botnets.

Page 6: Automating Analysis of Large-Scale Botnet Probing Events


System Framework

See the paper for subtle system details.

[Figure: system framework, with the following components.]

• Input: honeynet/honeyfarm traffic
• Botnet Detection Subsystem
  – Traffic classification
  – Event extraction
  – Misconfiguration separation
  – Worm separation
  – Events are labeled as misconfiguration, worm, or botnet
• Botnet Inference Subsystem
  – Model checking: monotonic trend checking, hit list checking, uniformity checking, independency checking
  – Botnet with uniform scan model
  – Global property extrapolation

Page 7: Automating Analysis of Large-Scale Botnet Probing Events


Agenda

• Motivation
• Basic framework
• Discover the botnet probing strategies
• Extrapolate global properties
• Evaluation
• Conclusions

Page 8: Automating Analysis of Large-Scale Botnet Probing Events


Discover the Botnet Probing Strategies

• Use statistical tests to understand probing strategies
  – Leverage existing statistical tests
    • Monotonic trend checking: detect whether bots probe the IP space monotonically
    • Uniformity checking: detect whether bots scan the IP range uniformly
  – Design our own
    • Hitlist (liveness) checking: detect whether they avoid the dark IP space
    • Dependency checking: do the bots scan independently or are they coordinated?

Page 9: Automating Analysis of Large-Scale Botnet Probing Events


Design Space

[Figure: design space of probing strategies. The same categories apply to both "Hit List" and "Not Hit List" events.]

• W/ mono trend
  – Monotonic Trend
  – Partial Monotonic Trend
• No mono trend
  – Non-Uniform
  – Uniform & Independent
  – Uniform & Non-independent

Page 10: Automating Analysis of Large-Scale Botnet Probing Events


Hitlist Checking

• Configure the sensor to be half darknet and half honeynet
• Use metric θ = (# sources in darknet) / (# sources in honeynet)
• Threshold: 0.5

[Figure: two panels, "Hit-list" and "Uniform random". Each plots # scans per IP (y-axis, 0 to 10) against destination IPs in the sensor (x-axis, 0 to 2000).]
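The θ metric on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `(source, destination)` probe representation, and the reading that θ below the 0.5 threshold indicates a hit list (bots avoiding dark space) are all assumptions.

```python
def hitlist_theta(probes, darknet_ips, honeynet_ips):
    """Return theta = (# sources seen in darknet) / (# sources seen in honeynet).

    probes: iterable of (source, destination) pairs (hypothetical format).
    Returns None when no source touched the honeynet half.
    """
    dark_sources = {src for src, dst in probes if dst in darknet_ips}
    honey_sources = {src for src, dst in probes if dst in honeynet_ips}
    if not honey_sources:
        return None
    return len(dark_sources) / len(honey_sources)


def uses_hitlist(probes, darknet_ips, honeynet_ips, threshold=0.5):
    """Assumption: theta below the threshold means the bots avoid dark space."""
    theta = hitlist_theta(probes, darknet_ips, honeynet_ips)
    return theta is not None and theta < threshold


# Toy sensor: addresses 0-4 are darknet, 5-9 are honeynet.
darknet = set(range(0, 5))
honeynet = set(range(5, 10))

# Ten sources probe the honeynet; only two of them also touch the darknet.
hitlist_probes = [(s, 5 + s % 5) for s in range(10)] + [(0, 1), (1, 2)]
# Ten sources probe both halves alike.
uniform_probes = [(s, d) for s in range(10) for d in (s % 5, 5 + s % 5)]
```

With the toy data above, `hitlist_theta(hitlist_probes, darknet, honeynet)` gives 0.2 and the uniform case gives 1.0.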

Page 11: Automating Analysis of Large-Scale Botnet Probing Events


Agenda

• Motivation
• Basic framework
• Discover the botnet probing strategies
• Extrapolate global properties
  – Global scan scope, total # of bots, total # of scans, total scan rate for each bot
• Evaluation
• Conclusions

Page 12: Automating Analysis of Large-Scale Botnet Probing Events


Extrapolate Global Properties: Basic Ideas and Validation

• Observe the packet fields that change with certain patterns in continuous probes
  – IPID: a packet field in the IP header used for IP defragmentation
  – Ephemeral port number: the source port used by bots
  – Both increment by a fixed amount per scan
• Validation
  – IPID continuity: all versions of Windows and MacOS
  – Ephemeral port number continuity: botnet source code study
    • Agobot, Phatbot, Spybot, SDbot, rxBot, etc.
  – Control experiments with NAT

Page 13: Automating Analysis of Large-Scale Botnet Probing Events


Estimate Global Scan Rate of Each Bot

• Count the IPID & ephemeral port # changes
  – Recover the overflow of the IPID and ephemeral port number
  – Estimate the rate with linear regression when the correlation coefficient > 0.99
  – Counter overestimation: use the lesser of the two estimates

[Figure: IPID plotted against time T.]
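The steps on this slide (undo counter overflow, fit a line, accept the fit only above a correlation of 0.99, keep the smaller estimate) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the rule "a decrease means exactly one wraparound" is an assumption, and the wrap modulus for ephemeral ports depends on the operating system, so it is left as a parameter.

```python
def unwrap_counter(values, modulus=65536):
    """Undo wraparound, assuming each decrease corresponds to one overflow."""
    out, offset, prev = [], 0, None
    for v in values:
        if prev is not None and v < prev:
            offset += modulus
        out.append(v + offset)
        prev = v
    return out


def fit_line(t, y):
    """Least-squares slope and Pearson correlation of y against t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    sxx = sum((a - mt) ** 2 for a in t)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mt) * (b - my) for a, b in zip(t, y))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0
    return sxy / sxx, sxy / (sxx * syy) ** 0.5


def estimate_rate(times, counter_values, modulus=65536, min_corr=0.99):
    """Counter increments per unit time, or None if the fit is not linear enough."""
    if len(times) < 2:
        return None
    slope, r = fit_line(list(times), unwrap_counter(counter_values, modulus))
    return slope if r > min_corr else None


def bot_scan_rate(times, ipids, ports, port_modulus=65536):
    """Use the smaller of the two estimates to counter overestimation."""
    rates = [estimate_rate(times, ipids),
             estimate_rate(times, ports, modulus=port_modulus)]
    rates = [r for r in rates if r is not None]
    return min(rates) if rates else None


# Toy trace: IPID grows by 100 per second and wraps once; port grows by 50.
times = list(range(10))
ipids = [(65000 + 100 * i) % 65536 for i in range(10)]
ports = [1024 + 50 * i for i in range(10)]
```

On the toy trace, the IPID fit gives about 100 per second and the port fit about 50, so `bot_scan_rate` returns the port-based estimate.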

Page 14: Automating Analysis of Large-Scale Botnet Probing Events


Extrapolate Global Scan Scope

[Figure: botnets scanning the IPv4 space.]

• Total scans from bot i: scan rate R_i × scan time T_i = 100 × 1000 = 100,000
• Scans from bot i observed locally: n_i = 100
• Local/global ratio: $n_i / (R_i T_i)$
• Aggregating multiple bots
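A short sketch of how the local/global ratio could be turned into a scope estimate. Two steps here are assumptions that the slide does not spell out: bots are aggregated by dividing the summed local counts by the summed global counts, and the global scope is taken as the local sensor size divided by the ratio, which presumes uniform scanning. All names are hypothetical.

```python
def local_global_ratio(bots):
    """bots: list of (n_i, R_i, T_i) = (probes seen locally, scan rate, scan time).

    Assumed aggregation: sum of local counts over sum of estimated global counts.
    """
    seen = sum(n for n, _, _ in bots)
    sent = sum(r * t for _, r, t in bots)
    return seen / sent


def global_scope(bots, local_size):
    """Assumed extrapolation under uniform scanning: local size / ratio."""
    seen = sum(n for n, _, _ in bots)
    sent = sum(r * t for _, r, t in bots)
    return local_size * sent / seen


# The slide's single-bot example: R_i = 100, T_i = 1000, n_i = 100.
example = [(100, 100, 1000)]
# A sensor of ten /24 networks covers 2560 addresses.
SENSOR_SIZE = 10 * 256
```

For the slide's example the ratio is 100 / 100,000 = 0.001, and a 2560-address sensor would then imply a scanned range of about 2.56 million addresses.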

Page 15: Automating Analysis of Large-Scale Botnet Probing Events


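Extrapolate Global # of Bots

• Idea: similar to Mark and Recapture
• Assumption: all bots have the same global scan range
• Estimate: M = m1 × m2 / m12
• Example: first half m1 = 1000, second half m2 = 1000, observed by both m12 = 250, so total bots M = 4000

[Figure: the total bot population M, containing the overlapping sets m1 and m2 with intersection m12.]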
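The estimate M = m1 × m2 / m12 is the classic mark-and-recapture (Lincoln-Petersen) form. A minimal sketch, assuming each half of the event is represented as a set of observed source addresses (the function name and the integer stand-ins for addresses are illustrative):

```python
def estimate_total_bots(first_half_sources, second_half_sources):
    """Mark-and-recapture estimate M = m1 * m2 / m12.

    Returns None when no source appears in both halves, since the
    estimate is undefined without an overlap.
    """
    m1 = len(first_half_sources)
    m2 = len(second_half_sources)
    m12 = len(first_half_sources & second_half_sources)
    if m12 == 0:
        return None
    return m1 * m2 / m12


# The slide's numbers: 1000 sources per half, 250 seen in both.
first = set(range(0, 1000))
second = set(range(750, 1750))
total = estimate_total_bots(first, second)
```

With these sets `total` is 4000.0, matching the slide's example.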

Page 16: Automating Analysis of Large-Scale Botnet Probing Events


Agenda

• Motivation
• Basic framework
• Discover the botnet probing strategies
• Extrapolate global properties
• Evaluation
• Conclusions

Page 17: Automating Analysis of Large-Scale Botnet Probing Events


Dataset

• Based on a honeynet of 10 /24 networks in a national lab (LBNL)
• 293 GB of packet traces over 24 months (2006-07)
• 203 botnet probing events observed in total
  – Average observed # of bots per event is 980
• Mainly on SMB/WINRPC, VNC, Symantec, MSSQL, HTTP, Telnet
• Size of the system: 13,900 lines: Bro (6,000), Python (4,000), C++ (2,500), R (1,400)

Page 18: Automating Analysis of Large-Scale Botnet Probing Events


Property Checking Results

                              Hit List        Not Hit List
                              16.3% (33)      83.7% (170)
W/ mono trend (3.0%)
  Monotonic Trend             0%              0%
  Partial Monotonic Trend     0%              3.0% (6)
No mono trend (97.0%)
  Non-Uniform                 2.5% (5)        14.2% (29)
  Uniform & Independent       13.8% (28)      66.5% (135)
  Uniform & Non-independent   0%              0%

• More than 80% of events use uniform scanning
• We validated the results through visualization and found them highly accurate

Page 19: Automating Analysis of Large-Scale Botnet Probing Events


Extrapolation Results

• Most of the extrapolated global scopes are at /8 size, which means the botnets do not target the enterprise (LBNL).
• Validation based on DShield data
  – DShield: the largest Internet alert repository
  – Find the /8 prefixes in DShield with sufficient source (bot) overlap with the honeynet events
    • Due to the incompleteness of DShield data, 12 events were validated
  – Calculate the scan scope in each /8 based on the sensor coverage ratio

Page 20: Automating Analysis of Large-Scale Botnet Probing Events


Extrapolation Validation

• Define scope factor as max(DShield/Honeynet, Honeynet/DShield)

[Figure: CDF of the scope factor. x-axis: scope factor (1.0 to 1.4); y-axis: cumulative probability.]

• 75% within 1.35
• All within 1.5
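The scope factor is a symmetric ratio, so it is 1.0 when the two estimates agree and grows with disagreement in either direction. A minimal sketch follows; the function names are illustrative and the sample factors are made up purely to exercise the code, not taken from the evaluation.

```python
def scope_factor(dshield_scope, honeynet_scope):
    """max(DShield/Honeynet, Honeynet/DShield); always >= 1.0."""
    if dshield_scope <= 0 or honeynet_scope <= 0:
        raise ValueError("scopes must be positive")
    return max(dshield_scope / honeynet_scope, honeynet_scope / dshield_scope)


def fraction_within(factors, bound):
    """Empirical CDF value: share of events whose scope factor is <= bound."""
    return sum(1 for f in factors if f <= bound) / len(factors)


# Hypothetical factors for four events (illustrative values only).
sample_factors = [1.0, 1.2, 1.3, 1.45]
```

Here `fraction_within(sample_factors, 1.35)` is 0.75 and `fraction_within(sample_factors, 1.5)` is 1.0.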

Page 21: Automating Analysis of Large-Scale Botnet Probing Events


Conclusions

• Developed a set of statistical approaches to assess four properties of botnet probing strategies

• Designed approaches to extrapolate the global properties of a scan event based on a limited local view

• Through real-world validation based on DShield, we show that our schemes are promisingly accurate

Page 22: Automating Analysis of Large-Scale Botnet Probing Events


Backup

Page 23: Automating Analysis of Large-Scale Botnet Probing Events


Event size distribution

[Figure: CDF of event size. x-axis: # of sources per event (0 to 10000); y-axis: cumulative probability.]

Page 24: Automating Analysis of Large-Scale Botnet Probing Events


Extrapolate the scope

Local/global ratio: $n_i / (R_i T_i)$

• n_i: probes observed locally
• R_i: estimated global probing rate
• T_i: probing time window

Page 25: Automating Analysis of Large-Scale Botnet Probing Events


Monotonic trend checking

• Goal: detect whether the bots probe the IP space monotonically
  – E.g., simple sequential probing
• Technique:
  – Mann-Kendall trend test
  – Intuition: check whether the aggregated sign value, the sum of sign(A_{i+1} − A_i), falls outside the range that randomness can produce
  – When most (>80%) senders in an event follow a trend, we label the event as following a trend
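A compact sketch of this check in Python. It uses the textbook Mann-Kendall statistic, which sums sign(A_j − A_i) over all pairs i < j rather than only adjacent pairs, with the normal approximation and no tie correction. The per-sender significance level `alpha` is an assumption (the slide gives only the 80% sender threshold), and the function names are illustrative.

```python
import math


def mann_kendall_p(series):
    """Two-sided p-value of the Mann-Kendall trend test (no tie correction)."""
    n = len(series)
    if n < 3:
        return 1.0
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = series[j] - series[i]
            s += (d > 0) - (d < 0)
    var = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var)
    elif s < 0:
        z = (s + 1) / math.sqrt(var)
    else:
        z = 0.0
    return math.erfc(abs(z) / math.sqrt(2))


def sender_has_trend(dest_sequence, alpha=0.05):
    """alpha is an assumed significance level, not taken from the slides."""
    return mann_kendall_p(dest_sequence) < alpha


def event_has_trend(per_sender_sequences, min_fraction=0.8):
    """Label the event as trending when more than 80% of senders show a trend."""
    flags = [sender_has_trend(seq) for seq in per_sender_sequences]
    return sum(flags) / len(flags) > min_fraction


sequential = list(range(20))                 # simple sequential probing
zigzag = [0, 9, 1, 8, 2, 7, 3, 6, 4, 5]      # no overall direction
event = [sequential] * 9 + [zigzag]          # 90% of senders follow a trend
```

In this toy event nine of ten senders probe sequentially, so `event_has_trend(event)` is true.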

Page 26: Automating Analysis of Large-Scale Botnet Probing Events


Uniformity Checking

• Goal: detect whether the botnet scans the IP range uniformly
• Technique:
  – Chi-square test
  – Intuition: put addresses into bins; the number of scans observed in each bin should be similar
  – Significance level of 0.5%
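The binning idea can be sketched without a statistics library by fixing the number of bins and hard-coding the matching critical value. The choice of 10 bins is an assumption made for illustration; 23.589 is the chi-square critical value for 9 degrees of freedom at the 0.5% significance level. Function names and the use of address offsets within the sensor are likewise illustrative.

```python
N_BINS = 10                       # assumed bin count
CHI2_CRIT_DF9_0_005 = 23.589      # chi-square critical value, df = 9, alpha = 0.005


def bin_counts(dest_offsets, sensor_size, n_bins=N_BINS):
    """Count scans per equal-width bin; offsets are 0 .. sensor_size - 1."""
    counts = [0] * n_bins
    for d in dest_offsets:
        counts[d * n_bins // sensor_size] += 1
    return counts


def chi_square_stat(counts):
    """Pearson chi-square statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def looks_uniform(dest_offsets, sensor_size):
    """Uniformity is not rejected when the statistic stays below the critical value."""
    stat = chi_square_stat(bin_counts(dest_offsets, sensor_size))
    return stat <= CHI2_CRIT_DF9_0_005


SENSOR_SIZE = 2560
even_scan = list(range(SENSOR_SIZE))     # one scan per address
skewed_scan = list(range(256)) * 10      # every scan lands in the first bin
```

The evenly spread scan yields a statistic of 0, while the skewed scan is rejected by a wide margin.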

Page 27: Automating Analysis of Large-Scale Botnet Probing Events


Dependency Checking

• Goal: do the bots try to stay out of each other's way?

• Idea: count the number of addresses that receive zero scans and compare it with the confidence interval for the independent random case.
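One way to obtain the interval for the independent random case is the classical occupancy result: if k scans fall independently and uniformly on N addresses, the number of untouched addresses X has E[X] = N(1 − 1/N)^k and E[X(X − 1)] = N(N − 1)(1 − 2/N)^k. The sketch below builds a normal-approximation interval from these moments. The 95% level (z = 1.96) and the function names are assumptions, since the slide does not say how the interval is computed.

```python
import math


def zero_scan_interval(num_addresses, num_scans, z=1.96):
    """Approximate confidence interval for the count of unscanned addresses
    under independent uniform scanning (normal approximation, assumed level)."""
    n, k = num_addresses, num_scans
    mean = n * (1 - 1 / n) ** k
    second_factorial = n * (n - 1) * (1 - 2 / n) ** k      # E[X(X-1)]
    var = max(second_factorial + mean - mean ** 2, 0.0)
    half_width = z * math.sqrt(var)
    return mean - half_width, mean + half_width


def scans_look_independent(observed_zero_count, num_addresses, num_scans):
    """True when the observed zero-scan count falls inside the interval."""
    low, high = zero_scan_interval(num_addresses, num_scans)
    return low <= observed_zero_count <= high


# 2560 scans over a 2560-address sensor: about 942 addresses stay untouched
# if the bots scan independently at random.
low, high = zero_scan_interval(2560, 2560)
```

A perfectly coordinated botnet that hits every address exactly once would leave zero untouched addresses, far below the interval, so `scans_look_independent(0, 2560, 2560)` is false.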