Information fusion


Page 1: Information fusion

Information fusion

By

Ganesh Godavari

Page 2: Information fusion

outline

• Hacking methodology

• Paper review

• Work report

• questions

• Simple Intrusion detection example

Page 3: Information fusion

Anatomy of a Hack[1]

• Footprinting: information gathering.

• Scanning: focus on promising avenues of entry.

• Enumeration: identify valid user accounts or poorly protected resource shares.

• Gaining Access: based on the information gathered so far, make an informed attempt to access the target.

• Escalating Privilege: if user-level access was obtained in the last step, seek to gain complete control of the system.

• Pilfering: gather information to identify mechanisms that allow access to trusted systems.

• Covering Tracks: once total ownership of the target is secured, hide this fact from system administrators.

• Creating Back Doors: plant trap doors in the system to ensure that privileged access can be easily regained.

• Denial of Service: if unsuccessful in gaining access, use readily available exploit code to disable or disrupt the target.

[1] http://cs.uccs.edu/~cs691/penetrateTest/penetrateTest.ppt

Page 4: Information fusion

Paper

Techniques and Tools for Analyzing Intrusion Alerts

By PENG NING, YUN CUI, DOUGLAS S. REEVES, and DINGBANG XU

Published in ACM Transactions on Information and System Security (TISSEC), Volume 7, Issue 2, pp. 274–318, 2004

Page 5: Information fusion

Paper outline

• How IDS works

• Related work

• Alert correlation model

• Interactive utilities

• references

Page 6: Information fusion

How does IDS work

• Intrusion detection systems (IDS)

– Focus on low-level attacks or anomalies

– Mix actual alerts with false alerts

– Generate an unmanageable number of alerts

• 10-20,000 alerts per day per sensor is common [2]

• How to reduce this unmanageable number of alerts?

[2] Manganaris, S., Christensen, M., Zerkle, D., and Hermiz, K. 2000. A data mining analysis of RTID alarms. Computer Networks 34, 571–577

Page 7: Information fusion

Related work

• Alert correlation based on similarities between alert attributes [3][4]

– alerts with the same source and destination IP addresses

– cannot fully discover the causal relationships between related alerts

• alert correlation based on attack scenarios specified by human users, or learned from training datasets [5][6][7]

– restricted to known attack scenarios or generalized from known scenarios

• Use pre- and post-conditions of attacks [8][9]

– recognition of multi-stage attacks

Page 8: Information fusion

Motivation

• Observation

– Attacks are not isolated, but related as different stages of attack sequences, with the early stages preparing for the later ones.

– Ex.: in Distributed Denial of Service (DDoS) attacks, the attacker has to install the DDoS daemon programs on vulnerable hosts before he can instruct the daemons to launch an attack.

– Ex.: if attack A learns that a vulnerable service exists, and attack B exploits the same vulnerable service, then correlate A and B.

• Correlate attacks

– Prerequisite of an attack: necessary condition for an intrusion to be successful

– Consequence of an attack: possible outcome of an intrusion

Page 9: Information fusion

Alert Correlation model

• Hyper-alert type: definition of a potential attack, including its prerequisites and consequences

• Example: SadmindBufferOverflow = ({VictimIP, VictimPort}, ExistHost(VictimIP) ∧ VulnerableSadmind(VictimIP), {GainRootAccess(VictimIP)})

• Hyper-alert: a set of occurrences of a hyper-alert type, and the times at which they occurred

• Example: hSadmindBOF = {(VictimIP = 152.1.19.5, VictimPort = 1235), (VictimIP = 152.1.19.7, VictimPort = 1235)}
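
The two definitions above can be sketched as data structures. This is a minimal illustration, not the paper's or TIAA's implementation: the class names, the `(predicate, attribute)` template encoding, and the timestamps are assumptions made for the example; only the SadmindBufferOverflow fields and the two victim tuples come from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HyperAlertType:
    name: str
    fact: tuple          # attribute names carried by each alert
    prerequisite: tuple  # (predicate, attribute) templates, read as a conjunction
    consequence: tuple   # (predicate, attribute) templates


@dataclass
class HyperAlert:
    alert_type: HyperAlertType
    tuples: list = field(default_factory=list)  # (attrs, begin_time, end_time)

    def add(self, attrs, begin_time, end_time):
        """Record one occurrence; every fact attribute must be present."""
        missing = set(self.alert_type.fact) - set(attrs)
        if missing:
            raise ValueError(f"missing attributes: {sorted(missing)}")
        self.tuples.append((dict(attrs), begin_time, end_time))

    def instantiate(self, templates):
        """Fill predicate templates with the attribute values of every tuple."""
        return {(pred, attrs[arg])
                for attrs, _, _ in self.tuples
                for pred, arg in templates}


SADMIND_BOF = HyperAlertType(
    name="SadmindBufferOverflow",
    fact=("VictimIP", "VictimPort"),
    prerequisite=(("ExistHost", "VictimIP"), ("VulnerableSadmind", "VictimIP")),
    consequence=(("GainRootAccess", "VictimIP"),),
)

h_sadmind_bof = HyperAlert(SADMIND_BOF)
# Timestamps below are invented for illustration; the slide gives none.
h_sadmind_bof.add({"VictimIP": "152.1.19.5", "VictimPort": 1235}, 10, 12)
h_sadmind_bof.add({"VictimIP": "152.1.19.7", "VictimPort": 1235}, 11, 13)

consequences = h_sadmind_bof.instantiate(SADMIND_BOF.consequence)
print(sorted(consequences))
```

Instantiating the consequence templates yields one `GainRootAccess` predicate per victim, which is the form that gets compared against the prerequisites of later hyper-alerts.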

Page 10: Information fusion

Model contd …

• Given hyper-alerts h1 and h2, h1 prepares for h2 if

– h1 occurred before h2, and

– the consequence of h1 implies (part of) the prerequisite of h2

• For a hyper-alert correlation graph HG = (V, E)

– V represents a set of hyper-alerts

– for all h1, h2 ∈ V, there is a directed edge from h1 to h2 if and only if h1 prepares for h2
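
A rough sketch of the prepares-for test and the graph construction follows. It makes a simplifying assumption: "implies" is approximated by an instantiated consequence predicate of h1 being identical to an instantiated prerequisite predicate of h2. The `HyperAlert` record, the `SadmindPing` alert, and all timestamps are hypothetical and only serve the example.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class HyperAlert:
    name: str
    begin: float
    end: float
    prerequisite: frozenset  # instantiated predicates, e.g. ("ExistHost", ip)
    consequence: frozenset


def prepares_for(h1, h2):
    """h1 ends before h2 begins and contributes to h2's prerequisite.

    Assumption: implication is approximated by predicate equality.
    """
    return h1.end < h2.begin and bool(h1.consequence & h2.prerequisite)


def correlation_graph(alerts):
    """Return the edge set E of HG = (V, E) as (name, name) pairs."""
    return {(a.name, b.name)
            for a, b in permutations(alerts, 2)
            if prepares_for(a, b)}


ip = "152.1.19.5"
ping = HyperAlert("SadmindPing", 1, 2,
                  frozenset(),
                  frozenset({("VulnerableSadmind", ip)}))
bof = HyperAlert("SadmindBufferOverflow", 5, 6,
                 frozenset({("ExistHost", ip), ("VulnerableSadmind", ip)}),
                 frozenset({("GainRootAccess", ip)}))

edges = correlation_graph([ping, bof])
print(edges)
```

The only edge runs from the probe to the overflow: the probe's consequence supplies one of the overflow's prerequisites, and it happened first.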

Page 11: Information fusion

Hyper-Alert correlation graph

Page 12: Information fusion

Interactive utilities

• Analysis of very large attack scenarios is cumbersome

– six utilities were provided

• Alert aggregation and disaggregation

• Focused analysis

• Clustering analysis

• Frequency analysis

• Link analysis

• Association analysis

Page 13: Information fusion

Alert Aggregation/Disaggregation

• Aggregate hyper-alerts of the same type only when they occur close to each other in time

• Interval constraint: given a time interval I, a hyper-alert h satisfies interval constraint I if

– 1) h has only one alert, or

– 2) for every alert ai in h, there is at least one other alert aj in h which overlaps ai in time, or which is separated from ai by no more than I units of time
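
The interval constraint can be checked directly from its definition. In this sketch each alert is assumed to be a `(begin, end)` pair in arbitrary time units; the function name and the sample values are illustrative.

```python
def satisfies_interval_constraint(alerts, interval):
    """alerts: list of (begin, end) pairs; interval: the bound I."""
    if len(alerts) <= 1:
        return True  # case 1: a single alert trivially satisfies the constraint
    for i, (b1, e1) in enumerate(alerts):
        has_neighbour = False
        for j, (b2, e2) in enumerate(alerts):
            if i == j:
                continue
            # gap <= 0 means the two alerts overlap in time
            gap = max(b1, b2) - min(e1, e2)
            if gap <= interval:
                has_neighbour = True
                break
        if not has_neighbour:
            return False  # this alert is isolated from all others
    return True


close = [(0, 1), (2, 3)]
spread = [(0, 1), (2, 3), (10, 11)]
print(satisfies_interval_constraint(close, 2))
print(satisfies_interval_constraint(spread, 2))
```

With I = 2 the third alert in `spread` is 7 time units away from its nearest neighbour, so that hyper-alert would have to be split (disaggregated) before it satisfies the constraint.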

Page 14: Information fusion

Focused Analysis

• Focus on the hyper-alerts of interest according to user’s specification

• A focusing constraint is a logical combination of comparisons between attribute names and constants.

– Example: srcIP = 129.174.142.2 ∨ destIP = 129.174.142.2

– Only correlate hyper-alerts that evaluate to true w.r.t. the focusing constraint, i.e., filter out irrelevant hyper-alerts
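
As a small illustration, a focusing constraint can be written as a predicate over alert attributes and applied as a filter. The dictionary representation and the sample alerts are assumptions; only the constraint itself comes from the slide.

```python
def focusing_constraint(alert):
    """srcIP = 129.174.142.2 OR destIP = 129.174.142.2"""
    return alert["srcIP"] == "129.174.142.2" or alert["destIP"] == "129.174.142.2"


def focus(alerts, constraint):
    """Keep only the alerts that evaluate to true under the constraint."""
    return [a for a in alerts if constraint(a)]


# Hypothetical alerts, for illustration only.
alerts = [
    {"srcIP": "129.174.142.2", "destIP": "10.0.0.5"},
    {"srcIP": "10.0.0.9", "destIP": "129.174.142.2"},
    {"srcIP": "10.0.0.9", "destIP": "10.0.0.5"},
]

focused = focus(alerts, focusing_constraint)
print(len(focused))
```

Only the filtered alerts would then be passed on to correlation, which keeps the resulting graph small.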

Page 15: Information fusion

Cluster analysis

• Cluster analysis or graph decomposition

– cluster the hyper-alerts based on common features shared by them, and decompose a large graph into smaller, more meaningful graphs (clusters)

– Given sets of attribute names A1 and A2 for two hyper-alerts h1 and h2, a clustering constraint CC(A1, A2) is a logical combination of comparisons between constants and attribute names in A1 and A2

Page 16: Information fusion

Contd..

• Example

– A1 = A2 = {srcIP, destIP}

– CC(A1, A2) = (A1.srcIP = A2.srcIP) ∧ (A1.destIP = A2.destIP)

– i.e., two hyper-alerts are clustered if they have the same source and destination IP addresses
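
For a clustering constraint made only of equality comparisons, as in this example, clustering reduces to grouping by the compared attributes. The sketch below assumes that special case; a general CC with arbitrary comparisons would need pairwise evaluation instead. The sample alerts are hypothetical.

```python
from collections import defaultdict


def cluster_by(alerts, attrs=("srcIP", "destIP")):
    """Group alerts whose values agree on every attribute in attrs."""
    clusters = defaultdict(list)
    for alert in alerts:
        key = tuple(alert[name] for name in attrs)
        clusters[key].append(alert)
    return dict(clusters)


# Hypothetical alerts, for illustration only.
alerts = [
    {"id": 1, "srcIP": "10.0.0.1", "destIP": "10.0.0.9"},
    {"id": 2, "srcIP": "10.0.0.1", "destIP": "10.0.0.9"},
    {"id": 3, "srcIP": "10.0.0.2", "destIP": "10.0.0.9"},
]

clusters = cluster_by(alerts)
for key, members in clusters.items():
    print(key, [a["id"] for a in members])
```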

Page 17: Information fusion

Frequency analysis

• Identify patterns in a collection of alerts by counting the number of raw alerts that share some common features

• Two modes

– count mode

• counts the number of raw intrusion alerts that fall into the same cluster

– weighted analysis

• adds all the values of a given numerical attribute (weight attribute) of all the alerts in the same cluster

• Example: use the priority of an alert type as the weight attribute, and learn the weighted frequency of alerts for all destination IP addresses
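
Both modes fit in one small function: count mode adds 1 per alert, weighted mode adds the value of the weight attribute. The attribute names (`destIP`, `priority`) and the sample values are assumptions for illustration.

```python
from collections import Counter


def frequency(alerts, key, weight=None):
    """Count mode when weight is None, weighted mode otherwise."""
    totals = Counter()
    for alert in alerts:
        totals[alert[key]] += 1 if weight is None else alert[weight]
    return totals


# Hypothetical raw alerts; 'priority' plays the role of the weight attribute.
alerts = [
    {"destIP": "172.16.115.20", "priority": 1},
    {"destIP": "172.16.115.20", "priority": 3},
    {"destIP": "172.16.115.87", "priority": 2},
]

counts = frequency(alerts, "destIP")
weighted = frequency(alerts, "destIP", weight="priority")
print(counts)
print(weighted)
```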

Page 18: Information fusion

Association analysis

• Find frequent co-occurrences of values belonging to different attributes that represent various entities.

• Identify patterns

– Example: how many attacks are from source IP address 152.14.51.14 to destination IP address 129.14.1.31 at destination port 80?
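
A simple way to approximate this is to count how often each combination of attribute values occurs and keep the combinations above a threshold. This is a sketch of the counting idea only, not the association-mining procedure used in the paper; the attribute names, the threshold, and the third alert are assumptions.

```python
from collections import Counter


def frequent_combinations(alerts, attrs, min_count):
    """Return value combinations over attrs seen at least min_count times."""
    counts = Counter(tuple(a[name] for name in attrs) for a in alerts)
    return {combo: n for combo, n in counts.items() if n >= min_count}


# Hypothetical alerts built around the slide's example addresses.
alerts = [
    {"srcIP": "152.14.51.14", "destIP": "129.14.1.31", "destPort": 80},
    {"srcIP": "152.14.51.14", "destIP": "129.14.1.31", "destPort": 80},
    {"srcIP": "152.14.51.14", "destIP": "129.14.1.31", "destPort": 22},
]

patterns = frequent_combinations(alerts, ("srcIP", "destIP", "destPort"), 2)
print(patterns)
```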

Page 19: Information fusion

References

[3] Valdes, A. and Skinner, K. 2001. Probabilistic alert correlation. In Proceedings of the 4th International Symposium on Recent Advances in Intrusion Detection (RAID 2001). 54–68.

[4] Staniford, S., Hoagland, J., and McAlerney, J. 2002. Practical automated detection of stealthy portscans. Journal of Computer Security 10, 1/2, 105–136.

[5] Debar, H. and Wespi, A. 2001. Aggregation and correlation of intrusion-detection alerts. In Recent Advances in Intrusion Detection. LNCS 2212. 85–103.

[6] Cuppens, F. and Ortalo, R. 2000. LAMBDA: A language to model a database for detection of attacks. In Proc. of Recent Advances in Intrusion Detection (RAID 2000). 197–216.

[7] Dain, O. and Cunningham, R. 2001. Fusing a heterogeneous alert stream into scenarios. In Proceedings of the 2001 ACM Workshop on Data Mining for Security Applications. 1–13.

[8] Templeton, S. and Levitt, K. 2000. A requires/provides model for computer attacks. In Proceedings of New Security Paradigms Workshop. ACM Press, 31–38.

[9] Cuppens, F. and Miege, A. 2002. Alert correlation in a cooperative intrusion detection framework. In Proceedings of the 2002 IEEE Symposium on Security and Privacy.

Page 20: Information fusion

Last week's report

• Downloaded the Toolkit for Intrusion Alert Analysis (TIAA) from NCSU

– Available at http://cs.uccs.edu/~infofuse/src/tools/ncsu/

• Downloaded the DEFCON 8 and 9 data sets

– Available at http://cs.uccs.edu/~infofuse/src/datasets/

– Problem with the downloaded DEFCON 8 and 9 CTF files: they are in .gz format, but no unzip tool can open them. The files appear to be encrypted, since viewing them directly shows unreadable content.

– Emailed Dr. Peng Ning of NCSU; he replied that there was a problem when the dataset was captured, because the packet length was not set correctly.

Page 21: Information fusion

Information fusion system architecture

Page 22: Information fusion

[Figure: system architecture diagram. Labeled elements: Decision Support System; Information Fusion Component 1 to Information Fusion Component n; Enhanced Firewall Router 1 to Enhanced Firewall Router n; NIDS (behavior-based); NIDS (Snort); HIDS (Tripwire). Labeled flows: Detection Result; Monitor Control (IDMEX); Fusion Result; Decision Results. Traffic classes: Legitimate Traffic; Suspicious Traffic; Attack Traffic.]

Listed in the figure:

• Alert Aggregation

• Remove False Positives

• Alert Refinement

• Alert Correlation/Filtering (Derive Temporal Relationship)

Page 23: Information fusion

neural networks and admission control

• Interaction of neural networks and admission control

– 0: block (the pattern definitely indicates an intrusion)

– 1: uncertain but highly suspicious (the pattern partially matches, say more than 50% of, the event sequence in an attack scenario? Or the consequences of the potential attack are severe? We need to come up with a reasoned analysis for the rating.)

– 2: uncertain but unlikely to be an attack (the pattern matches less than 50% of the event sequence in an attack scenario? Deserves further monitoring, ...)

– 3: false alarm (the pattern is benign)
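
One possible reading of the 0 to 3 rating is sketched below, assuming the rating is driven purely by how much of an attack scenario's event sequence has been observed in order. The slide leaves the thresholds as open questions, so the choices here (a full match gives 0, an exact 50% match falls into rating 2, no match gives 3) are assumptions, as is the scenario itself, whose event names are borrowed from the sample table later in the deck.

```python
def matched_fraction(observed, scenario):
    """Fraction of the scenario's events seen, in order, in the observed stream."""
    matched = 0
    for event in observed:
        if matched < len(scenario) and event == scenario[matched]:
            matched += 1
    return matched / len(scenario)


def rate(observed, scenario):
    """Map the matched fraction to the 0..3 rating (assumed thresholds)."""
    fraction = matched_fraction(observed, scenario)
    if fraction == 1.0:
        return 0  # block: the whole scenario was observed
    if fraction > 0.5:
        return 1  # uncertain but highly suspicious
    if fraction > 0.0:
        return 2  # uncertain, unlikely to be an attack
    return 3      # false alarm / benign


# Hypothetical scenario for illustration.
scenario = ["ipsweep", "rsh", "BOF", "mstream"]

print(rate(["ipsweep"], scenario))
print(rate(["ipsweep", "rsh", "BOF"], scenario))
print(rate(["ipsweep", "rsh", "BOF", "mstream"], scenario))
```

Severity of the potential consequence, which the slide also raises, is not modeled here; it would need an additional input per scenario.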

Page 24: Information fusion

Detection result and information fusion

• Detection results are produced in IDMEF format.

• Correlation information can be supplied to the neural network!

• Does the fusion agent need to store the alerts? Will that be taken care of by the neural networks?

Page 25: Information fusion

Questions

• IDS agents produce only observable data.

• This observable data can be categorized into 4 types

– Alert is right, agent says wrong

– Alert is right, agent says right

– Alert is wrong, agent says wrong (never occurs)

– Alert is wrong, agent says right (never occurs)

– Cannot use statistical methods such as Fisher's exact test or the chi-square test

– How about conditional probability?

• Need to check if there is a method!

• If a -> b -> c ... -> z, then intrusion = 1

• Given that a happened, what is the probability that events b and c occur? Can c occur independently of b?

• This observable data cannot be used for any rating.

Page 26: Information fusion

Training dataset

• DARPA 2000 dataset; its training set is the 1999 dataset

• Downloaded the 1999 dataset for training.

• Trying to use the netpoke utility to generate data from the tcpdump files

• By next week, will try to have the week 1 training dataset together with the alerts Snort raises when the data is sent to it

Page 27: Information fusion

Sample intrusion attack scenario

IDS-id  Src ip          Dest ip         protocol  destport  Desc     ranking  Action
Ids1    202.77.162.213  172.16.115.20   udp       111       ipsweep  -1       No action
Ids2    202.77.162.213  172.16.115.20   udp       32773     ipsweep  -1       No action
Ids2    202.77.162.213  172.16.115.87   udp       111       ipsweep  -1       No action
Ids3    172.16.112.194  202.77.162.213  idu       0         ipsweep  3        monitor
Ids1    202.77.162.213  172.16.115.87   tcp       111       rsh      3        monitor
Ids1    202.77.162.213  172.16.115.87   tcp       111       rsh      2        LPQ1
Ids1    202.77.162.213  172.16.115.87   tcp       111       rsh      2        LPQ1
Ids1    202.77.162.213  172.16.112.87   tcp       111       BOF      1        LPQ1
Ids1    202.77.162.213  172.16.112.87   tcp       67898     mstream  0        block
Ids2    202.77.162.213  172.16.112.87   tcp       111       rsh      0        block

idu = icmp-destination-unreachable

Low Priority Queue: LPQ1 > LPQ2

Page 28: Information fusion

Cyber Defense System Configuration

[Figure: network diagram. Labeled elements: Internet; Outer Firewall/Router; Inner Firewall/Router; DMZ; two switches (SW); DNS server; mail server; two web servers; DB server; user1; user2; sensors NIDS_out, NIDS_dmz, and NIDS_in.]

Every node is configured with HIDS

Page 29: Information fusion

How the ranking is decided

• Ranking

  – Timed history

    • Number of times the event has occurred from the same source

    • Number of times the event has occurred from the same source to the same destination

    • Number of times an event of the same type is reported by various IDSs

  – History

    • Coordinated event mapping

      – If a -> b -> c is a complete intrusion, with each event reported by an independent IDS
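
The three timed-history counts could be kept as follows. This is only a sketch: the sliding time window, the class and method names, and the timestamps are assumptions, and "reported by various IDSs" is read here as the set of distinct IDS ids that reported the event type. The addresses and event names are taken from the sample tables in this deck.

```python
class TimedHistory:
    """Counts over the events seen in the last `window` time units."""

    def __init__(self, window):
        self.window = window
        self.events = []  # (timestamp, ids_id, src, dst, event)

    def record(self, timestamp, ids_id, src, dst, event):
        self.events.append((timestamp, ids_id, src, dst, event))

    def _recent(self, now):
        return [e for e in self.events if now - self.window <= e[0] <= now]

    def count_from_source(self, now, event, src):
        return sum(1 for _, _, s, _, ev in self._recent(now)
                   if ev == event and s == src)

    def count_src_to_dst(self, now, event, src, dst):
        return sum(1 for _, _, s, d, ev in self._recent(now)
                   if ev == event and s == src and d == dst)

    def reporting_ids(self, now, event):
        return {ids_id for _, ids_id, _, _, ev in self._recent(now) if ev == event}


history = TimedHistory(window=60)
attacker = "202.77.162.213"
# Timestamps are invented for illustration.
history.record(0, "Ids1", attacker, "172.16.115.20", "ipsweep")
history.record(10, "Ids2", attacker, "172.16.115.20", "ipsweep")
history.record(20, "Ids2", attacker, "172.16.115.87", "ipsweep")
history.record(200, "Ids1", attacker, "172.16.115.87", "rsh")

print(history.count_from_source(30, "ipsweep", attacker))
print(history.count_src_to_dst(30, "ipsweep", attacker, "172.16.115.20"))
print(sorted(history.reporting_ids(30, "ipsweep")))
```

Keeping raw events and filtering on each query is the simplest design; a production version would expire old events instead of rescanning the full list.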

Page 30: Information fusion

Sample example using darpa 2000 dataset

Idu = icmp-destination-unreachable

Low Priority Queue Lpq1>Lpq2

IDS-id  Src ip          Dest ip         protocol  destport  ranking  Action
Ids1    202.77.162.213  172.16.115.20   udp       111       -1       No action
Ids2    202.77.162.213  172.16.115.20   udp       32773     -1       No action
Ids2    202.77.162.213  172.16.115.87   udp       111       -1       No action
Ids3    172.16.112.194  202.77.162.213  idu       0         3        monitor
Ids1    172.16.115.87   202.77.162.213  idu       0         3        monitor
Ids1    172.16.112.194  202.77.162.213  idu       0         2        LPQ1
Ids1    172.16.115.87   202.77.162.213  idu       0         2        LPQ1
Ids2    172.16.112.194  202.77.162.213  idu       0         1        LPQ2
Ids2    202.77.162.213  172.16.113.50   udp       111       1        LPQ2
Ids1    172.16.112.194  202.77.162.213  idu       0         1        LPQ2
Ids2    172.16.113.105  202.77.162.213  idu       0         0        block
Ids2    172.16.117.132  202.77.162.213  idu       0         0        block
Ids2    172.16.112.194  202.77.162.213  idu       0         0        block

Every two times an error occurs, reduce its rank by 1.
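
The rank-reduction rule can be sketched as a counter per error type and destination. Several points are assumptions read off the table rather than stated on the slide: errors start at rank 3, the rank never drops below 0, occurrences are counted per (event, destination IP), and the rank-to-action mapping is 3 = monitor, 2 = LPQ1, 1 = LPQ2, 0 = block. The udp rows of the table are not modeled.

```python
from collections import Counter

START_RANK = 3          # assumed initial rank for a newly seen error
OCCURRENCES_PER_STEP = 2  # "every 2 times ... reduce its rank by 1"
ACTIONS = {3: "monitor", 2: "LPQ1", 1: "LPQ2", 0: "block"}


class ErrorRanker:
    def __init__(self):
        self.seen = Counter()

    def observe(self, event, dest_ip):
        """Record one occurrence and return its (rank, action)."""
        key = (event, dest_ip)
        self.seen[key] += 1
        steps = (self.seen[key] - 1) // OCCURRENCES_PER_STEP
        rank = max(0, START_RANK - steps)
        return rank, ACTIONS[rank]


ranker = ErrorRanker()
results = [ranker.observe("idu", "202.77.162.213") for _ in range(9)]
ranks = [rank for rank, _ in results]
print(ranks)
```

Under these assumptions the nine idu alerts addressed to 202.77.162.213 receive ranks 3, 3, 2, 2, 1, 1, 0, 0, 0, which is the sequence shown for the idu rows of the table.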

Page 31: Information fusion

<IDMEF-Message version="0.1">
  <Alert alertid="1" impact="unknown" version="1">
    <Time>
      <date>03/07/2000</date>
      <time>10:08:07</time>
      <sessionduration>00:00:00</sessionduration>
    </Time>
    <Analyzer ident="tcpdump_inside">
      <name>tcpdump_inside</name>
    </Analyzer>
    <Source spoofed="unknown">
      <Node>
        <Address category="ipv4-addr">
          <address>202.77.162.213</address>
        </Address>
      </Node>
    </Source>
    <Target>
      <Node>
        <Address category="ipv4-addr">
          <address>172.16.115.20</address>
        </Address>
      </Node>
      <Service>
        <name>udp</name>
        <sport>54790</sport>
        <dport>111</dport>
      </Service>
    </Target>
  </Alert>
</IDMEF-Message>
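
A message in this layout can be turned into one row of the sample tables with Python's standard `xml.etree.ElementTree`. The sketch assumes exactly the element structure shown on this slide (an early IDMEF draft layout); the function name and the output field names are illustrative.

```python
import xml.etree.ElementTree as ET

IDMEF_TEXT = """
<IDMEF-Message version="0.1">
  <Alert alertid="1" impact="unknown" version="1">
    <Time><date>03/07/2000</date><time>10:08:07</time>
      <sessionduration>00:00:00</sessionduration></Time>
    <Analyzer ident="tcpdump_inside"><name>tcpdump_inside</name></Analyzer>
    <Source spoofed="unknown"><Node><Address category="ipv4-addr">
      <address>202.77.162.213</address></Address></Node></Source>
    <Target><Node><Address category="ipv4-addr">
      <address>172.16.115.20</address></Address></Node>
      <Service><name>udp</name><sport>54790</sport><dport>111</dport></Service>
    </Target>
  </Alert>
</IDMEF-Message>
"""


def parse_alert(xml_text):
    """Extract the fields used in the sample tables from one message."""
    alert = ET.fromstring(xml_text.strip()).find("Alert")

    def text(path):
        return alert.findtext(path).strip()

    return {
        "ids_id": alert.find("Analyzer").get("ident"),
        "src_ip": text("Source/Node/Address/address"),
        "dest_ip": text("Target/Node/Address/address"),
        "protocol": text("Target/Service/name"),
        "destport": int(text("Target/Service/dport")),
        "date": text("Time/date"),
        "time": text("Time/time"),
    }


row = parse_alert(IDMEF_TEXT)
print(row)
```

The extracted values correspond to the first row of the sample tables: source 202.77.162.213, destination 172.16.115.20, udp, destination port 111.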

Page 32: Information fusion

Questions

?

Page 33: Information fusion

simple Network case

[Figure: simple network diagram. Labeled elements: Internet; sniffer network; sniffer server (monitoring/analysis).]

Page 34: Information fusion

Sample example using darpa 2000 dataset

Src ip Dest ip protocol destport ranking

202.77.162.213 172.16.115.20 udp 111 -1

202.77.162.213 172.16.115.20 udp 32773 -1

202.77.162.213 172.16.115.87 udp 111 -1

172.16.112.194 202.77.162.213 idu 0 3

172.16.115.87 202.77.162.213 idu 0 3

172.16.112.194 202.77.162.213 idu 0 2

172.16.115.87 202.77.162.213 idu 0 2

172.16.112.194 202.77.162.213 idu 0 1

202.77.162.213 172.16.113.50 udp 111 1

172.16.112.194 202.77.162.213 idu 0 1

172.16.113.105 202.77.162.213 idu 0 0

172.16.117.132 202.77.162.213 idu 0 0

172.16.112.194 202.77.162.213 idu 0 0

Every two times an error occurs, reduce its rank by 1.