Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify...

47
Background Visual Motifs Traffic Classification Evaluation Towards Reliable Traffic Classification Using Visual Motifs Wilson Lian 1 John McHugh 1,2 Fabian Monrose 1 1 University of North Carolina at Chapel Hill 2 RedJack, LLC FloCon 2010 Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Transcript of Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify...

Page 1: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Towards Reliable Traffic Classification UsingVisual Motifs

Wilson Lian1 John McHugh1,2 Fabian Monrose1

1University of North Carolina at Chapel Hill

2RedJack, LLC

FloCon 2010

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 2: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Overview

Background

Visual Motifs

Traffic Classification

Evaluation

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 3: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Motivation

Internet

Network Administrator

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 4: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Motivation

GET /index.ht...

d3b07384d113e...

d41d8cd98f00b...

7d8ad5cb9c940...

MAIL FROM: foo@...

d41d8cd98f00b...

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 5: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Goals

Port 22

36fd6d8c3f5af4...Port 22

fc2394c1a922...Port 22

f4d6d8c3f5a36...Port 22

222394c1a9fc...Port 22

5ad6d8c3ff436...Port 22

a92394c122fc...

Port 25

MAIL FROM: [email protected] 25

f98698466c3ef...Port 25

ef8698466c3f9...Port 25

DATA\r\nSubject: fo...Port 25

f98698466c3ef...

Port 80

b314caafaa3e...Port 80

b314caafaa3e...Port 80

POST /login.ph...Port 80

b314caafaa3e...Port 80

POST /AuthSv...Port 80

GET /index.ht...

Port 1214

113edec49eaa...Port 1214

006f7b3db8f4f...Port 1214

aa3edec49e11...Port 1214

4f006f7b3db8f0...Port 1214

9e3edec4aa11...Port 1214

b8006f7b3d4ff0...

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 6: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Assumptions

Reliable transport via TCP

Stream Cipher

No access to payloadLength preservation

Negligible packet loss & retransmission

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 7: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Related Work

Scatter (and other) Plots for Visualizing User Profiling Dataand Network Traffic, Goldring 2004.

Using Visual Motifs to Classify Encrypted Traffic, Wright etal. 2006

Intelligent Classification and Visualization of Network ScansMuelder et al. 2008.

FloVis: A Network Security Visualization Framework, Taylor2009.

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 8: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Timeline Heatmaps

Image credit: Wright et al. 2006

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 9: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Timeline Heatmaps

Image credit: Wright et al. 2006

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 10: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Timeline Heatmaps

Image credit: Wright et al. 2006

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 11: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Timeline Heatmaps

Image credit: Wright et al. 2006

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 12: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Timeline Heatmaps

Image credit: Wright et al. 2006

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 13: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Timeline Heatmaps

Image credit: Wright et al. 2006

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 14: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Timeline Heatmaps

Image credit: Wright et al. 2006

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 15: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Unigram Heatmaps

Image credit: Wright et al. 2006

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 16: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Bigram Heatmaps

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 17: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Heatmap Construction

Time

SYN 48 bytes

SYN-ACK 48 bytes

ACK 40 bytesHTTP Request 891 bytes

1500 bytes

40 bytes

40 bytes

1500 bytes270 bytes40 bytes

Client Server

48-4840

891-40

-270-1500

40-1500

40

(48, -48)(-48, 40)(40, 891)(891, -40)(-40, -270)

(-270, -1500)(-1500, 40)(40, -1500)(-1500, 40)

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 18: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Heatmap Construction

(48, -48)(-48, 40)(40, 891)(891, -40)(-40, -270)

(-270, -1500)(-1500, 40)(40, -1500)(-1500, 40)

(40, 891)

(48, -48)

(-48, 40)(40, 891)

(891, -40)(-40, -270)(-270, -1500)

(-1500, 40)

(40, -1500)

(-1500, 40)

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 19: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Heatmap Construction

(48, -48)(-48, 40)(40, 891)(891, -40)(-40, -270)

(-270, -1500)(-1500, 40)(40, -1500)(-1500, 40)

(40, 891)

(48, -48)

(-48, 40)(40, 891)

(891, -40)(-40, -270)(-270, -1500)

(-1500, 40)

(40, -1500)

(-1500, 40)3/9 = 33.3%

2/9 = 22.2%

1/9 = 11.1%

3/9 = 33.3%

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 20: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Heatmap Construction

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 21: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Bigram Heatmaps

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 22: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Modeling Protocol Behavior

(40, 891)

(48, -48)

(-48, 40)(40, 891)

(891, -40)(-40, -270)(-270, -1500)

(-1500, 40)

(40, -1500)

(-1500, 40)3/9 = 33.3%

2/9 = 22.2%

1/9 = 11.1%

3/9 = 33%

1

3 4

2

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 23: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Modeling Protocol Behavior

0 1 2 3 4

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Bin

Prob

abili

ty

.333

.111

.222

.333

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 24: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Comparing Protocol Models

1 2 3 4

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Bin

Prob

abili

ty

.333

.100 .111

.222

.333

.700

.150

.050

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 25: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Comparing Protocol Models

Atotal =n∑

k=1

Ak

Btotal =n∑

k=1

Bk

ScoreA↔B =n∑

i=1

∣∣∣∣ Ai

Atotal− Bi

Btotal

∣∣∣∣=

1

Atotal · Btotal

n∑i=1

|Ai · Btotal − Bi · Atotal |

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 26: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Comparing Protocol Models

50 1 2 3 4

Prob

abili

ty

.333

.100 .111

.222

.333

.700

.150

.050

Score = .233+.589+.072+.283 = 1.177

|.333-.100| = .233

|.111-.700| = .589

|.333-.050| = .283

|.222-.150| = .072

Diff

eren

ce

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 27: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Comparing Protocol Models

1 2 3 4

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Bin

Prob

abili

ty

.333

.400

.111

.222

.333

.150 .150

.300

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 28: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Comparing Protocol Models

50 1 2 3 4

-0.2

-0.1

-0

0.1

0.2

0.3

0.4

0.5

0.6

0.7Pr

obab

ility

.333

.400

.111

.222

.333

.150 .150

.300

Score = .067+.039+.072+.033 = .211

|.333-.400| = .067

|.111-.150| = .039 |.333-.300| = .033

|.222-.150| = .072

Diff

eren

ce

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 29: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Classifying Samples: Easy as 1-2-3

1 Create training models for desired protocols

2 Build distribution for sample network trace

3 Find training model with lowest difference score

ScoreA↔B =n∑

i=1

∣∣∣∣ Ai

Atotal− Bi

Btotal

∣∣∣∣=

1

Atotal · Btotal

n∑i=1

|Ai · Btotal − Bi · Atotal |

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 30: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Evaluation

How much traffic must be collected for:

TrainingTesting

Precision?

true positives

true positives + false positives

Recall?

true positives

true positives + false negatives

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 31: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Data

CRAWDAD Dataset

Weekdays: January 19, 2004 – February 6, 2004

Ports with sufficient traffic

≥ 1M packets0.3% of ports → 95.21% of packets

Keep top 10 ports by number of sessions observed

No ground truth

Total Packets 1.3 Billion

Traffic Volume 707 GB

Observed Ports 64,214

Sessions 5.2 Million

Port 80 Sessions 1.7 Million

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 32: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Methodology

Trial :=1 Randomly sample some percentage of available data for each

port and train classifier2 Randomly sample some number of the remaining data points

for each port and create testing samples3 Classify testing samples

50 Trials

80 110 445

TrainingData

TestingData

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 33: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Training Size Selection

160 2 4 6 8 10 12 14

0.88

0.89

0.9

0.91

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1

Training Sample Size (%)

Aver

age

Reca

ll

.952

.884

Average Recall for Varying Training Size

6.8% improvement

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 34: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Training Size Selection

160 2 4 6 8 10 12 14

0.88

0.89

0.9

0.91

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1

Training Sample Size (%)

Aver

age

Prec

isio

n

.887

.953

Average Precision for Varying Training Size

6.6% improvement

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 35: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Testing Size Selection

0 10,000 20,000 30,000 40,000 50,000

0.88

0.89

0.9

0.91

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1

Testing Sample Size (Data Points)

Aver

age

Reca

ll

.898

.926

.954

Average Recall for Varying Testing Size

5.6% improvement

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 36: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Testing Size Selection

0 10,000 20,000 30,000 40,000 50,000

0.88

0.89

0.9

0.91

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1

Testing Sample Size (Data Points)

Aver

age

Prec

isio

n

.961

.914

4.7% improvement

Average Precision for Varying Testing Size

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 37: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Results

50 Trials

15% TrainingSet Size

50,000 DataPoints TestingSet Size

96.5% Precision

96.0% Recall

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 38: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Classification Confidence Threshold

Goal: Eliminate close calls

Require 1st place candidate to lead 2nd place by certainamount to make decision

Standard deviation of scores

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 39: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Methodology v2.0

Randomly sample some percentage of available data for eachport and train classifier

Randomly sample some number of the remaining data pointsfor each port and create testing samples

Attempt to classify testing samples

If all testing samples reach threshold, done.If any testing sample fails, rebuild testing samples and tryagain.

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 40: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Classification Confidence

50 Trials

5% Training SetSize

35,000 DataPoints TestingSet Size

1.0 LeadThreshold

96.9% Precision

96.6% Recall

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 41: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Classification Confidence

0 0.25 0.5 0.75 1 1.25

0

500,000

1,000,000

1,500,000

2,000,000

2,500,000

3,000,000

3,500,000

4,000,000

4,500,000

5,000,000

5,500,000

Lead Threshold

Test

ing

Sess

ions

Sam

pled

1.0 M 1.0 M 1.2 M

2.4 M

Testing Session Sampled For 50 Trials35,000 Data Point Testing Samples

5.4 M

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 42: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Ground Truth Testing

MIT Lincoln Labs DARPA Data50 trials, 5% training sample size, 35,000 data point testingsample size, 1.25 lead threshold

Precision: 98.3%

Recall: 98.0%

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 43: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Results

50 Trials

5% Training SetSize

35,000 DataPoints TestingSize

1.25 LeadThreshold

Ground Truth

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 44: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Evasion

One might attempt to thwart our technique by padding all packetsto MTU.Reduces problem to 4-quadrant problem.Can still make decisions based on relative prevalence of eachquadrant.

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 45: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Current/Future Work

Packet loss/re-transmission may cause unpredictable results

On-line classification

Training and testing from separate datasets

UDP

Subcategorization

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 46: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Conclusion

Modeling protocol behavior using only packet size, direction,and order

Resistant to encryption and padding

Average precision and recall > 97%

Quick and reliable traffic inspection

Useful for pre-screening traffic for deeper analysis

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs

Page 47: Towards Reliable Traffic Classification Using Visual Motifs€¦ · Using Visual Motifs to Classify Encrypted Tra c, Wright et al. 2006 Intelligent Classi cation and Visualization

BackgroundVisual Motifs

Traffic ClassificationEvaluation

Questions?

Thanks for listening.Q & A

[email protected]

Wilson Lian, John McHugh, Fabian Monrose Towards Reliable Traffic Classification UsingVisual Motifs