1 Using Failure Information Analysis to Detect Enterprise Zombies Zhaosheng Zhu, Vinod Yegneswaran,...


1

Using Failure Information Analysis to Detect Enterprise Zombies

Zhaosheng Zhu, Vinod Yegneswaran, Yan Chen

Lab of Internet and Security Technology, Northwestern University

SRI International

2

Motivation

• Increasing prevalence and sophistication of malware
• Current solutions are a day late and a dollar short
  • NIDS
  • Firewalls
  • AV systems
• Conficker is a great example!
  • Over 10M hosts infected across variants A/B/C

3

Related Work

• BotHunter [Usenix Security 2007]
  • Dialog Correlation Engine to detect enterprise bots
  • Models lifecycle of bots: Inbound Scan / Exploit / Egg download / C&C / Outbound Scans
  • Relies on Snort signatures to detect different phases
• Rishi [HotBots 07]: Detects IRC bots based on nickname patterns
• BotSniffer [NDSS 08]
  • Uses spatio-temporal correlation to detect C&C activity
• BotMiner [Usenix Security 08]
  • Combines clustering with BotHunter and BotSniffer heuristics
• Focus on successful bot communication patterns

4

Objective and Approach

• Develop a complement to existing network defenses to improve their resilience and robustness
  • Signature independent
  • Malware family independent: no prior knowledge of malware semantics or C&C mechanisms needed
  • Malware class independent (detect more than bots)
• Key idea: Failure Information Analysis
  • Observation: malware communication patterns result in abnormally high failure rates
  • Correlates network and application failures at multiple points

5

Outline

• Motivations and Key Idea
• Empirical Failure Pattern Study: Malware and Normal Applications
• Netfuse Design
• Evaluations
• Conclusions

6

Malware Failure Patterns

• Empirical survey of 32 malware instances with long-lived traces (5 to 8 hours)
  • SRI honeynet, spamtrap and Offensive Computing
  • Spyware, HTTP botnet, IRC botnet, P2P botnet, Worm
• Application protocols studied: DNS, HTTP, FTP, SMTP, IRC
• 24/32 generated failures
• 18/32 generated DNS failures
  • Mostly NXDOMAINs
  • DNS failures are part of normal behavior for some bots like Kraken and Conficker (which generate a new list of C&C rendezvous points every day)

7

Malware Failure Patterns (2)

• SMTP failures part of most spam bots
  • Storm, Bobax etc.
  • 550: recipient address rejected
• HTTP failures
  • Generated by worms: Virut (DoS attacks) and Weby
  • Weby contacts remote servers to get configuration info
• IRC failures
  • Channel removed from a public IRC server
  • Channel is full due to too many bots

8

[Table: failure rates per malware instance. Columns: Malware, Class, DNS rate, HTTP rate, ICMP rate, SMTP rate, TCP rate. The cell layout did not survive extraction; values are listed per class in extraction order.]

SPYWARE: Look2me, Wsnpoem
  Values: 5, 15

HTTP BOTNET: Bobax, Kraken
  Values: 148, 348, 191

IRC BOTNET: Agobot, Gobot, Sdbot I+II, Spybot I/II/III, Wootbot, Webloit
  Values: 5312, 2241, 315, 891, 9539, 1556, 275, 477

P2P BOTNET: Nugache, Storm I/II
  Values: 26, 32583, 284, 291, 73

WORM: Allaple, Grum, Kwbot, Mytob, Netsky, Protoride, Virut, Weby
  Values: 9, 60, 37, 221, 51012, 503, 222, 10, 67, 33413, 160, 385, 409, 5738, 31330, 53, 151, 14, 24

9

Normal Applications Studied

• Webcrawler
  • news.sohu.com, amazon.com, bofa.com, imdb.com
• P2P
  • BitTorrent, eMule
• Video
  • Youtube.com
• HTTP 304/Not Modified errors whitelisted

10

Normal Applications Studied

• For video traffic, no transport-layer failures
• At the application level, only "HTTP 304/Not Modified" failures

11

Normal Application Failure Patterns

[Table: hourly failure rates of normal applications. Columns: Application, HTTP hourly rate, ICMP hourly rate, TCP # ports / hourly rate. The cell layout did not survive extraction; values are listed per group in extraction order.]

Sohu.com, Amazon.com, Imdb.com, Bofa.com
  Values: 1.4, 0.04, 0.8, 1/0.04, 1/1.4, 1/0.2, 1/0.9

BitTorrent, eMule
  Values: 0.6, 68, 382/333, 839/370

12

Empirical Analysis Summary

• High-volume failures are good indicators of malware
• DNS failures (NXDomain messages) are common among malware
• Malware failures tend to be persistent
• Malware failure patterns tend to be repetitive (low entropy), while those of normal applications are not

13

Outline

• Motivations and Key Idea
• Empirical Failure Pattern Study: Malware and Normal Applications
• Netfuse Design
• Evaluations
• Conclusions

14

Netfuse Design

• Netfuse: a behavior-based network monitor
  • Correlates network and application failures
  • Wireshark and L7 filters for protocol parsing
  • Multi-point failure monitoring
• Netfuse components
  • FIA (Failure Information Analysis) Engine
  • DNSMon
  • SVM-based Correlation Engine
  • Clustering

15

Multi-point Deployment

[Figure: deployment diagram. Labels: Enterprise Network, DNSMon, Gateway FIA, Failure Scores, SVM Correlation, Clustering]

16

FIA Engine

• Wireshark: open source protocol analyzer / dissector
  • Analyzes online and offline pcap captures
  • Supports most protocols
  • Uses port numbers to choose dissectors
• Augment Wireshark with L7 protocol signatures
  • Automated decoding with payload signatures
  • Sample sig for HTTP:

    http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*(connection:|content-type:|content-length:|date:)|post [\x09-\x0d -~]* http/[01]\.[019]
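As a rough illustration of payload-signature decoding, the sketch below applies the sample HTTP signature to a payload and reports the status code when the response looks like a failure. The helper name `http_failure_code`, the `STATUS_LINE` pattern, case-insensitive matching, and the choice to treat only 4xx/5xx codes as failures are assumptions of this sketch, not details taken from Netfuse.

```python
import re
from typing import Optional

# L7 payload signature for HTTP, as quoted on the slide. Matching it
# case-insensitively is an assumption of this sketch.
HTTP_SIG = re.compile(
    r"http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*"
    r"(connection:|content-type:|content-length:|date:)"
    r"|post [\x09-\x0d -~]* http/[01]\.[019]",
    re.IGNORECASE,
)

# Hypothetical helper pattern: extracts the status code from a response line.
STATUS_LINE = re.compile(r"^HTTP/\d\.\d (\d{3})", re.IGNORECASE)


def http_failure_code(payload: str) -> Optional[int]:
    """Return the status code if the payload is an HTTP failure response."""
    if not HTTP_SIG.search(payload):
        return None  # the signature does not recognize this payload as HTTP
    m = STATUS_LINE.match(payload)
    if m is None:
        return None  # an HTTP request (e.g. POST), not a response
    code = int(m.group(1))
    # Assumption: 4xx/5xx count as failures, so 304 Not Modified is ignored.
    return code if code >= 400 else None
```

Because the signature inspects the payload rather than the port, the same check would apply to HTTP traffic on non-standard ports, which is the reason for adding signatures on top of port-based dissector selection.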

17

DNSMon

• DNS servers typically located inside enterprise networks
  • Suspicious domain lookups can't be tracked back to original clients from gateway traces
  • Especially true for NXDomain lookups
  • DNS caching
• DNSMon tracks traffic between clients and the resolving DNS server
  • More comprehensive view of failure activity
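A minimal sketch of the kind of per-client accounting this vantage point makes possible: counting NXDOMAIN responses by the internal client that issued the lookup. The `DnsResponse` record and the `nxdomain_counts` function are hypothetical names; the sketch assumes responses have already been parsed from traffic between clients and the resolver.

```python
from collections import Counter
from typing import Iterable, NamedTuple

NXDOMAIN = 3  # DNS RCODE 3 is "name error" (NXDOMAIN)


class DnsResponse(NamedTuple):
    # Hypothetical record for one response seen between a client and the resolver.
    client_ip: str
    qname: str
    rcode: int


def nxdomain_counts(responses: Iterable[DnsResponse]) -> Counter:
    """Count NXDOMAIN responses per internal client."""
    return Counter(r.client_ip for r in responses if r.rcode == NXDOMAIN)
```

At the gateway the same failures would appear to come from the resolver itself, so a per-client count like this is only available from a monitor placed between the clients and the DNS server.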

18

Correlation Engine

• Integrates four failure scores
  • Composite Failure Score
  • Failure Divergence Score
  • Failure Entropy Score
  • Failure Persistence Score
    • Malware failures tend to be long-lived
• SVM-based correlation using Weka

19

Composite Failure Score

• Estimates severity of each host based on failure volume
• Consider hosts with
  • Large # of application failures (e.g., > 15 per min) or
  • TCP RST, ICMP failures > 2 std. dev from mean of all hosts
• Compute weighted failure score based on failure frequency of protocol
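The host-selection step above can be sketched as follows. Only the two thresholds come from the slide; the function name `candidate_hosts`, the input dictionaries, and the use of the population standard deviation are assumptions. The protocol weighting is not specified on the slide, so it is left out.

```python
from statistics import mean, pstdev
from typing import Dict, List


def candidate_hosts(app_fail_per_min: Dict[str, float],
                    transport_fails: Dict[str, float],
                    app_threshold: float = 15.0,
                    k: float = 2.0) -> List[str]:
    """Select hosts with many application failures, or with transport
    (TCP RST / ICMP) failures more than k std. dev. above the all-host mean."""
    values = list(transport_fails.values())
    # Assumption: population standard deviation over all monitored hosts.
    cutoff = mean(values) + k * pstdev(values) if values else float("inf")
    hosts = set(app_fail_per_min) | set(transport_fails)
    return sorted(
        h for h in hosts
        if app_fail_per_min.get(h, 0.0) > app_threshold
        or transport_fails.get(h, 0.0) > cutoff
    )
```

With ten hosts at 5 transport failures and one at 100, the cutoff lands well below 100, so only the outlier passes the transport test; a host with 20 application failures per minute passes the other test independently.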

20

Failure Persistence Score

• Motivated by the observation that malware failures tend to be long-lived
• Split time horizon into N parts and compute the number of parts where a failure occurs
  • In our experiments N = 24
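A small sketch of the persistence computation: the horizon is split into N equal parts and the parts containing at least one failure are counted. Dividing by N to get a value in [0, 1] is an assumption of this sketch, as are the function and parameter names.

```python
from typing import Iterable


def persistence_score(failure_times: Iterable[float],
                      start: float,
                      end: float,
                      n: int = 24) -> float:
    """Fraction of the n equal time slices in [start, end) that contain a failure."""
    width = (end - start) / n
    active = set()
    for t in failure_times:
        if start <= t < end:
            # Clamp guards against floating-point rounding at the upper edge.
            active.add(min(int((t - start) / width), n - 1))
    # Assumption: normalize the slice count by n.
    return len(active) / n
```

For a 24-hour horizon with N = 24, a burst of failures inside a single hour scores 1/24, while one failure in every hour scores 1.0, which matches the intuition that long-lived failure activity should rank higher than a short burst.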

21

Failure Divergence Score

• Measure degree of uptick in a host's failure profile
  • Newly infected hosts would demonstrate strong and positive dynamics
• EWMA Algorithm
  • α = 0.5
  • For each host, protocol and date compute difference between expected and actual value.
  • Add divergence of each protocol for that host
  • Normalize by dividing with the maximum divergence value for all hosts
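The steps above can be sketched as follows, assuming each host has a list of daily failure counts per protocol. The function names, the data layout, seeding the EWMA with the first day, and clamping negative differences to zero (so that only upticks count) are assumptions of this sketch.

```python
from typing import Dict, List

ALPHA = 0.5  # smoothing factor from the slide


def ewma_divergence(daily_counts: List[float], alpha: float = ALPHA) -> float:
    """Gap between the latest day's count and the EWMA forecast from earlier days."""
    if len(daily_counts) < 2:
        return 0.0
    expected = float(daily_counts[0])  # assumption: seed with the first day
    for actual in daily_counts[1:-1]:
        expected = alpha * actual + (1 - alpha) * expected
    # Assumption: only upticks matter, so negative gaps are clamped to zero.
    return max(daily_counts[-1] - expected, 0.0)


def divergence_scores(history: Dict[str, Dict[str, List[float]]]) -> Dict[str, float]:
    """Sum per-protocol divergences per host, then normalize by the maximum."""
    raw = {
        host: sum(ewma_divergence(series) for series in protocols.values())
        for host, protocols in history.items()
    }
    top = max(raw.values(), default=0.0)
    return {host: (value / top if top > 0 else 0.0) for host, value in raw.items()}
```

A host whose DNS failures jump from a steady 10 per day to 50 gets a divergence of 40 for that protocol; after normalization, the host with the largest total uptick scores 1.0 and every other host is expressed relative to it.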

22

Failure Entropy Score

• Measure degree of diversity in a host's failure profile
  • Malware failures tend to be redundant (low diversity)
  • TCP: track server/port distribution of each client receiving failures
  • DNS: track domain name diversity
  • HTTP/SMTP/FTP: track failure types and host names
  • Ignore ICMP
• Compute weighted average failure entropy score
  • Protocols that dominate failure volume of a host get higher weights
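One way to read the weighted average is sketched below: Shannon entropy is computed over the failure labels seen for each protocol, and each protocol is weighted by its share of the host's failure volume. The use of Shannon entropy in bits, the label format, and the function names are assumptions; the slide does not give the exact formula.

```python
from collections import Counter
from math import log2
from typing import Dict, List


def shannon_entropy(labels: List[str]) -> float:
    """Shannon entropy (bits) of the empirical distribution of labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def failure_entropy_score(failures: Dict[str, List[str]]) -> float:
    """Volume-weighted average of per-protocol failure entropy for one host.

    `failures` maps a protocol name to the labels of its failures, e.g. the
    failed domain for DNS or "server:port" for TCP (hypothetical encoding).
    """
    volume = sum(len(labels) for labels in failures.values())
    if volume == 0:
        return 0.0
    return sum(
        (len(labels) / volume) * shannon_entropy(labels)
        for labels in failures.values()
    )
```

A host that fails eight times on the same domain and twice on two different HTTP targets scores 0.2 bits: the repetitive DNS failures dominate the volume and pull the score toward zero, which is the low-diversity pattern associated with malware.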

23

Outline

• Motivations and Key Idea
• Empirical Failure Pattern Study: Malware and Normal Applications
• Netfuse Design
• Evaluations
• Conclusions

24

Evaluation Traces

• Malware I: 24 malware traces from failure pattern study
• Malware II: 5 new malware families (Peacomm, Mimail, Rbot, Bifrose, Kraken) + 3 trained families
  • Run for 8 to 10 hours each.
• Malware III: 242 traces selected from 5000 malware sandbox traces based on duration & trace size
• Institute Traces: Benign traces from well-administered Class B (/16) network with hundreds of machines (5-day and 12-day)

25

Evaluation Methodology

[Figure: training/testing split of the traces. Labels: 5-day Institute Trace, 12-day Institute Trace, Malware Trace I, Training, Testing; Malware Trace II, Testing; Malware Trace III, Testing]

26

Detection Rate

27

False Positive Rate

28

Performance Summary

• Detection rate > 92% for traces I/II
• Detection rate under 40% for trace III
  • Trace includes many types of malware, including adware with failure patterns similar to benign applications
  • Traces are short, many under 15 mins
• False positive rate < 5%

29

Clustering Results

Peacomm    999905 pkts    3/3    100%
Bifrose    30635          3/3    100%
Mimail     279962         3/3    100%
Kraken     49505          3/3    100%
Sdbot      312796         3/3    100%
Spybot     79750          3/3    100%
Rbot       1175083        3/3    100%
Weby       9000           3/3    100%

• Cluster detected hosts based on their failure profile
• 24 instances belong to 8 different types of malware

30

Conclusions

• Failure Information Analysis
  • Signature-independent methodology for detecting infected enterprise hosts
• Netfuse system
  • Four components: FIA Engine, DNSMon, Correlation Engine, Clustering
  • Correlation metrics: Composite Failure Score, Divergence Score, Failure Entropy Score, Persistence Score
• Useful complement to existing network defenses