Network Mapping and Anomaly Detection Athina Markopoulou (Irvine)

Network Mapping and Anomaly Detection Athina Markopoulou (Irvine) Robert Calderbank (Princeton) Rob Nowak (Madison) MURI Kickoff Meeting September 19, 2009


Page 1

Network Mapping and Anomaly Detection

Athina Markopoulou (Irvine)

Robert Calderbank (Princeton)

Rob Nowak (Madison)

MURI Kickoff Meeting September 19, 2009

Page 2

Outline

Challenges

- Applications
- Mathematics

Preliminary Results

- Detecting Malicious Traffic Sources (Athina Markopoulou)
- Network Topology Identification
- Network-wide Anomaly Detection

Research Directions

Page 3

Application Challenges

Network Mapping: Infer network topology/connectivity from minimal measurements

Detecting Topology Changes: Quickly sense changes in routing or connectivity

Network-wide Anomalies: Detect weak and distributed patterns of anomalous network activity

Predicting Malicious Traffic: Identify network sources that are likely to launch future attacks

Page 4

Mathematical Challenges

Vastly Incomplete Data: Impossible to monitor a network everywhere and all the time. Where and when should we measure?

Large-scale Inference: Inference of high-dimensional signals/graphs from noisy and incomplete data. Robust statistical data analysis and scalable algorithms are crucial.

Network Representations: Statistical analysis matched to network structures. Can network data be ‘sparsified’ using new representations and transformations?

Network Prediction Models: New ‘network-centric’ statistical methods are needed to cluster network nodes for robust prediction from limited datasets.

Page 5

Predicting Malicious Traffic Sources

Page 6

Predictive Blacklisting as an Implicit Recommendation System

• Problem: predict sources of malicious traffic on the Internet
  – Blacklists:
    • list of worst offenders (source IP addresses or prefixes)
    • used to block (or to further scrub) traffic originating from those sources
  – Goal:
    • Predict malicious sources that are likely to attack a victim in the future, based on past logs

• Prediction vs. Detection
  – strictly speaking, this is not “detection”
  – but it does require finding patterns in the data

Page 7

Traditional Blacklist Generation

• Local Worst Offender List (LWOL)
  – Most prolific local offenders
  – Reactive but not proactive

• Global Worst Offender List (GWOL)
  – Most prolific global offenders
  – Might contain irrelevant offenders
  – Non-prolific attackers are elusive to GWOL

• State of the art: Collaborative Blacklisting
  – J. Zhang, P. Porras, and J. Ullrich, “Highly Predictive Blacklisting”, USENIX Security 2008 (best paper award)
  – Key idea: A victim is likely to be attacked not only by sources that attacked this particular victim, but also by sources that attacked “similar victims”
  – Methodology: Use link analysis (PageRank) on the victim similarity graph to predict future attacks
  – First methodological development on this problem in a long time!
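To make the LWOL/GWOL distinction concrete, here is a minimal sketch that builds both lists from a toy attack log. The log format (pairs of source and victim), the function name `worst_offenders`, and the tie-breaking rule are assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter

def worst_offenders(logs, k, victim=None):
    """Return the k most prolific sources.

    logs   : iterable of (source, victim) attack records (hypothetical format)
    victim : restrict to one victim for an LWOL; None gives a GWOL
    """
    counts = Counter(src for src, dst in logs if victim is None or dst == victim)
    # Sort by attack count (descending), then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [src for src, _ in ranked[:k]]

# Toy log: source "a" hits v1 twice and v2 once, "b" hits v1 once,
# "c" hits v2 twice and v3 once.
logs = [("a", "v1"), ("a", "v1"), ("b", "v1"),
        ("c", "v2"), ("c", "v2"), ("c", "v3"), ("a", "v2")]

lwol_v1 = worst_offenders(logs, 2, victim="v1")  # local list for victim v1
gwol = worst_offenders(logs, 2)                  # global list over all victims
```

In this toy log the global list contains "c", which never attacked v1, and omits "b", which did. That mirrors the two GWOL drawbacks listed on the slide.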

Page 8

Formulating Predictive Blacklisting as an Implicit Recommendation System

Recommendation system (e.g., Netflix, Amazon)

R = Rating Matrix (axes: Users, Items (movies))

3 2 ? ?
1 ? ? 4
6 3 1 9
? ? 2 ?

Predictive Blacklisting

R(t) = Attack Matrix (axes: Victims, Attackers), shown as a sequence of matrices along a Time axis; the last matrix has all off-diagonal entries unknown

- 13 4 ?
? - 3 ?
? ? - 2
3 8 ? -

- ? ? 1
? - 12 1
4 ? - 27
2 ? 6 -

- 7 ? 1
3 - ? 9
? 21 - ?
11 2 ? -

- ? ? ?
? - ? ?
? ? - ?
? ? ? -

Page 9

Collaborative Filtering (CF): different techniques capture different patterns in the data

• Multi-level Prediction
  – Individual level: (attacker, victim)
    • use time series to project past trends
  – Local level: neighborhood-based CF
    • group “similar” victim networks (kNN)
      – notion of similarity accounts for common attackers and time
    • groups of attackers attacking the same victims
      – find them using the cross-association (CA) method
  – [Global level: factorization-based CF (in progress)]
    • find latent factors in the data using, e.g., SVD

• Combine ratings from different predictors
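The multi-level idea can be sketched with two simple predictors and a blend. This is an illustration under stated assumptions, not the authors' method: the individual level is approximated by an exponentially weighted moving average, the local level by cosine-similarity kNN over victims (ignoring the time component the slide mentions), and the combination by a fixed weight. All function names and parameters are hypothetical.

```python
import math

def ewma(series, alpha=0.5):
    """Individual level: exponentially weighted average of past attack counts."""
    level = 0.0
    for count in series:
        level = alpha * count + (1 - alpha) * level
    return level

def cosine(u, v):
    """Cosine similarity between two victims' attack-count rows."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_score(R, victim, attacker, k=1):
    """Local level: similarity-weighted average over the k most similar victims."""
    sims = sorted(((cosine(R[victim], R[u]), u)
                   for u in range(len(R)) if u != victim), reverse=True)[:k]
    total = sum(s for s, _ in sims)
    return sum(s * R[u][attacker] for s, u in sims) / total if total else 0.0

def combined_score(series, R, victim, attacker, weight=0.5):
    """Blend the two predictors with a fixed, hypothetical weight."""
    return weight * ewma(series) + (1 - weight) * knn_score(R, victim, attacker)

# Rows are victims, columns are attackers (aggregate attack counts, toy data).
R = [[2, 0, 1],
     [2, 3, 1],
     [0, 0, 4]]
# Attacker 1 has never hit victim 0, so its own time series is all zeros,
# but the most similar victim (row 1) has seen it.
score = combined_score([0, 0, 0], R, victim=0, attacker=1)
```

The time-series predictor alone would give this pair a score of zero; the neighborhood predictor raises it because a similar victim was attacked by the same source.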

Page 10

Tested our approach on Dshield data: 6 months of logs

• Dshield.org is a central repository of shared logs
  – Several victim organizations submit their IDS logs (flow data)
  – The repository analyzes the logs and provides a predictive blacklist, tailored to each victim

[Figure labels: UCI, Princeton, Dshield.org]

Page 11

Several different patterns co-exist in the data and should be detected and used for prediction

Page 12

Preliminary results

• A combination of methods significantly improves the hit count of the blacklist
  – up to 70% (57% on average) compared to the state of the art (HPB)

• and there is much room for improvement

[Figure legend: Combined method; State-of-the-art (HPB); Older method (GWOL)]
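For readers unfamiliar with the metric, the following sketch shows one plausible way to compute a hit count and a relative improvement. The definition used here (the number of blacklisted sources that are later observed attacking) and the toy data are assumptions; the slides do not spell out the exact evaluation procedure.

```python
def hit_count(blacklist, future_attackers):
    """Number of blacklisted sources that do attack in the prediction window."""
    return len(set(blacklist) & set(future_attackers))

def relative_improvement(new_hits, baseline_hits):
    """Fractional gain of one blacklist over a baseline (0.5 means +50%)."""
    return (new_hits - baseline_hits) / baseline_hits

future = {"a", "b", "c", "d"}        # sources seen attacking later (toy data)
baseline = ["a", "b", "x", "y"]      # hypothetical baseline blacklist
combined = ["a", "b", "c", "z"]      # hypothetical combined-method blacklist

gain = relative_improvement(hit_count(combined, future),
                            hit_count(baseline, future))
```

Here the baseline scores 2 hits and the combined list scores 3, a gain of 0.5 in this made-up example.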

Page 13

Challenges & Future Directions

• Get closer to the upper bound
  – Latent factor techniques
  – Dealing with missing data
• Adversarial model
• Scalability
• Hopefully interactions with other people in this group

• F. Soldo, A. Le, A. Markopoulou, http://arxiv.org/abs/0908.2007

Page 14

Network Mapping

Page 15

Network Mapping

Existing Methods: Active probing (e.g., traceroute)

New Approach: Passive monitoring

Lumeta Corporation

Page 16

The Data

Hop-counts from end-hosts to monitors; extracted from TTL fields of traffic at monitors

[Figure: hop-count matrices indexed by end-hosts and monitors; unobserved entries are marked “?”]
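One common way to turn an observed TTL into a hop-count is to assume the sender used a standard initial TTL and subtract. The sketch below follows that heuristic; the list of initial TTL values is an assumption about typical operating-system defaults and does not come from the slides, which only say that hop-counts are extracted from TTL fields.

```python
COMMON_INITIAL_TTLS = (32, 64, 128, 255)  # assumed OS defaults, not from the slides

def hop_count(observed_ttl):
    """Infer hops traveled as (smallest plausible initial TTL) - (observed TTL)."""
    if not 0 <= observed_ttl <= 255:
        raise ValueError("TTL must be in 0..255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial - observed_ttl

# Three packets seen at a monitor with these TTL values (toy data).
hops = [hop_count(ttl) for ttl in (54, 117, 250)]
```

The heuristic is passive: it needs only packets that already arrive at the monitor, which matches the passive-monitoring approach described earlier in the deck.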

Page 17

Clustering End-Hosts

Problem: Use hop-count data to automatically cluster end-hosts into topologically relevant groups (e.g., subnets)

Intuition: End-hosts with similar hop-counts are probably close together

Challenge: Clustering with missing data

[Figure: hop-count matrix; axes labeled “end-hosts” and “honeypots”]

[Figure: 2-d histogram of hop-counts; ellipses indicate end-hosts from different subnets]
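As a rough illustration of clustering with missing data, the sketch below runs a k-means variant that measures distance only over the monitors for which a hop-count was observed. This is a simplified stand-in chosen for brevity, not the method used in the slides (a later slide mentions a mixture model). The `None` convention for missing entries, the function names, and the toy data are all hypothetical.

```python
def masked_dist(x, center):
    """Mean squared difference over the coordinates of x that were observed."""
    pairs = [(a, c) for a, c in zip(x, center) if a is not None]
    if not pairs:
        return float("inf")
    return sum((a - c) ** 2 for a, c in pairs) / len(pairs)

def kmeans_missing(X, centers, iters=10):
    """k-means where None marks an unobserved hop-count."""
    centers = [list(c) for c in centers]  # work on a copy
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center using observed coordinates only.
        labels = [min(range(len(centers)),
                      key=lambda j: masked_dist(x, centers[j]))
                  for x in X]
        # Update step: per-coordinate mean over observed values in each cluster.
        for j, center in enumerate(centers):
            for d in range(len(center)):
                vals = [x[d] for x, lab in zip(X, labels)
                        if lab == j and x[d] is not None]
                if vals:
                    center[d] = sum(vals) / len(vals)
    return labels, centers

# Rows are end-hosts, columns are monitors; None is an unobserved hop-count.
X = [[5, 10, None], [6, None, 12], [5, 11, 13],
     [14, None, 4], [None, 3, 5], [15, 2, 4]]
labels, centers = kmeans_missing(X, centers=[[5, 10, 12], [15, 3, 4]])
```

End-hosts with similar observed hop-counts end up in the same group even though no two rows share exactly the same set of observed monitors.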

Page 18

Matrix Completion

Observed hop-counts are random samples of entries of the complete hop-count matrix

SVD of the hop-count matrix is low-rank (rank r)

[Figure: observed hop-count matrix with missing entries, the complete hop-count matrix, and its low-rank SVD factorization]
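The low-rank idea can be illustrated with a deliberately small example. The sketch below fits a rank-1 model by alternating least squares and uses it to fill the missing entries. Rank 1 and plain ALS are simplifications chosen to keep the example short and dependency-free; the slides state only that the matrix is low-rank with rank r and do not name a completion algorithm. The function name and toy matrix are hypothetical.

```python
def complete_rank1(M, iters=100):
    """Fill None entries of M with a rank-1 fit u[i] * v[j] (alternating least squares)."""
    n, m = len(M), len(M[0])
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Fix v, solve a 1-d least-squares problem for each row factor u[i].
        for i in range(n):
            obs = [j for j in range(m) if M[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den:
                u[i] = sum(M[i][j] * v[j] for j in obs) / den
        # Fix u, solve for each column factor v[j].
        for j in range(m):
            obs = [i for i in range(n) if M[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den:
                v[j] = sum(M[i][j] * u[i] for i in obs) / den
    # Keep observed entries; fill the rest from the fitted model.
    return [[M[i][j] if M[i][j] is not None else u[i] * v[j]
             for j in range(m)] for i in range(n)]

# Toy matrix: the outer product of [1, 2, 3] and [2, 4, 6] with three entries hidden.
M = [[2, 4, None],
     [4, None, 12],
     [None, 12, 18]]
filled = complete_rank1(M)
```

Because the toy matrix is exactly rank 1 and every row and column has observed entries, the hidden values are recovered; real hop-count data would need a higher rank and some regularization.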

Page 19

Results

[Figure panels: clusters from complete data; clusters from 25% data]

[Plot: axis “fraction of complete data”, from 0 to 1; legend: mixture model]

Page 20

Network-wide Anomaly Detection

Page 21

Distributed Network Anomaly Detection

Observation model: [equation lost in extraction; labeled terms: “Binary pattern 0/1”, “Signal strength”, “unknown”]

Weak in strength (signal): invisible in per-node signature

Weak in extent (# affected nodes): invisible in network-wide aggregate

Page 22

Detecting weak and sparse patterns

Prior work: Can detect weak and unstructured patterns by exploiting multiplicity. (Ingster, Jin-Donoho ’03, Abramovich et al. ’01)

Subtle adaptive testing procedures: Higher criticism, False discovery control

[Figure: “Now you see it, now you don’t”; axes: signal strength, sparsity (# active nodes); region labeled “Localizable”]
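As an illustration of a multiplicity-based test, here is a minimal sketch of a higher-criticism statistic computed from per-node z-scores. The one-sided Gaussian p-values, the `alpha0` cutoff, and the toy inputs are assumptions made for the example; the slides name the procedure but give no formula, and no detection threshold is implied here.

```python
import math

def higher_criticism(z_scores, alpha0=0.5):
    """Maximum standardized gap between empirical and expected p-value quantiles."""
    n = len(z_scores)
    # One-sided p-values under a standard normal null.
    pvals = sorted(0.5 * math.erfc(z / math.sqrt(2)) for z in z_scores)
    hc = float("-inf")
    # Scan the smallest alpha0 * n p-values.
    for i, p in enumerate(pvals[: max(1, int(alpha0 * n))], start=1):
        if 0.0 < p < 1.0:
            hc = max(hc, math.sqrt(n) * (i / n - p) / math.sqrt(p * (1 - p)))
    return hc

# Toy inputs: 100 nodes with no deviation, then the same with 5 nodes at z = 4.
hc_null = higher_criticism([0.0] * 100)
hc_sparse = higher_criticism([4.0] * 5 + [0.0] * 95)
```

The statistic stays near zero when no node deviates and grows sharply when a small fraction of nodes carries signal, which is the sense in which multiplicity is exploited.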

Page 23

Network anomaly patterns

In addition to multiplicity, can we exploit the (possibly non-local) dependencies between node measurements to boost performance?

Method must be adaptive to network interaction structure.

How do node interactions affect thresholds of detectability/localizability?

[Figure labels: Interactions, Nodes]

Page 24

Modeling network anomaly patterns

Latent multi-scale Ising model: [equation lost in extraction; labeled terms: “# edge agreements”, “strength of interaction”]

[Figure labels: Observed network node measurements; Latent multi-scale dependencies]

Page 25

Hierarchical clustering and basis learning

Theorem: Consider a latent multi-scale Ising model with uniform node interaction strength [symbol lost in extraction]. With probability [value lost in extraction],

(1) the correct dependency structure (tree) can be learnt using i.i.d. network observations x by hierarchical correlation clustering.

(2) the number of non-zero basis coefficients for an x drawn at random is [bound lost in extraction].

[Figure labels: Hierarchical correlation clustering; Unbalanced Haar Basis]
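To show what an unbalanced Haar basis looks like once a cluster tree is available, the sketch below builds one from a binary tree given as nested tuples. The tree is assumed to be known here; learning it by hierarchical correlation clustering is not shown. The nested-tuple encoding, the function names, and the toy tree are hypothetical.

```python
import math

def leaves(tree):
    """Leaf node indices of a binary tree given as nested 2-tuples of ints."""
    return [tree] if isinstance(tree, int) else leaves(tree[0]) + leaves(tree[1])

def unbalanced_haar_basis(tree):
    """One constant vector plus one vector per internal node of the tree."""
    n = len(leaves(tree))
    basis = [[1 / math.sqrt(n)] * n]           # constant (scaling) vector
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, int):
            continue
        left, right = leaves(node[0]), leaves(node[1])
        a, b = len(left), len(right)
        vec = [0.0] * n
        # Positive on the left child, negative on the right child,
        # scaled so the vector has zero sum and unit norm.
        for i in left:
            vec[i] = math.sqrt(b / (a * (a + b)))
        for i in right:
            vec[i] = -math.sqrt(a / (b * (a + b)))
        basis.append(vec)
        stack.extend(node)                      # visit both children
    return basis

tree = ((0, 1), (2, (3, 4)))                    # toy tree over nodes 0..4
basis = unbalanced_haar_basis(tree)

# A pattern that activates exactly one cluster of the tree: nodes 0 and 1.
x = [1.0, 1.0, 0.0, 0.0, 0.0]
coeffs = [sum(b * v for b, v in zip(vec, x)) for vec in basis]
```

A pattern aligned with a cluster of the tree needs only two non-zero coefficients in this example, which illustrates the kind of sparsity the theorem refers to.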

Page 26

Detection of anomalies in transform domain

Weak patterns are amplified by the sparsifying transform adapted to network topology, while noise characteristics remain the same.

[Figure: network data and its wavelet coefficients; in the wavelet domain the signal is focused and its strength is amplified]

[Plot: “Detection vs. Signal Strength”; curves: detection using original data, detection using wavelet coefficients; axis ticks run from 0.01 to 0.10 and from 0.1 to 1]

Coherent activation patterns result in few non-zero basis coefficients and can be detected with much smaller signal strength
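The amplification effect can be seen in a small noiseless example. The sketch below uses an ordinary balanced orthonormal Haar transform as a stand-in for the topology-adapted basis described in the slides; the network size, the per-node strength `mu`, and the function name are assumptions made for illustration.

```python
import math

def haar_transform(x):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    if len(x) == 0 or len(x) & (len(x) - 1):
        raise ValueError("length must be a power of two")
    coeffs = []
    approx = list(x)
    while len(approx) > 1:
        nxt, det = [], []
        for a, b in zip(approx[0::2], approx[1::2]):
            nxt.append((a + b) / math.sqrt(2))   # local average
            det.append((a - b) / math.sqrt(2))   # local difference
        coeffs = det + coeffs
        approx = nxt
    return approx + coeffs

mu = 0.5                                   # weak per-node signal strength (toy value)
x = [mu] * 16 + [0.0] * 48                 # 16 coherent active nodes out of 64
c = haar_transform(x)

nonzero = sum(1 for v in c if abs(v) > 1e-9)
largest = max(abs(v) for v in c)
energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in c)
```

In this example the 16 weak entries collapse into 3 coefficients, the largest of which exceeds `mu`, while total energy is preserved. Since the transform is orthonormal, independent unit-variance noise keeps the same variance in the coefficient domain.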

Page 27

Internet anomaly detection

Example basis vectors learnt from O(log p) network measurements using hierarchical clustering

[Figure labels: Sample delay covariance matrix; Monitor; unknown network]

Compression achieved for real Internet RTT data

Page 28

Research Directions

Active Sensing: Sequential algorithms that automatically decide where, what, and when to measure.

Online Large-scale Inference: on-line and near real-time network monitoring to detect topology changes and traffic anomalies.

Wireless Network Sensing: Exploitation of sparsity and diversity in wireless networks for fast and robust identification of network-wide characteristics.

New Network Representations: Relationships between wavelet representations and persistent homology.

Page 29

Extra slides

Page 30

Network Discovery

Network structure is unknown; infer network routing/topology by ‘triangulation’

[Figure: example hop-counts 5, 10, 6, 11, 13]

Page 31

Network Discovery

Unfortunately, many hop-counts are not observed

[Figure: example hop-counts 5, 10, 6, 11, 13, with unobserved values marked “?”]