RelSamp : Preserving Application Structure in Sampled Flow Measurements
![Page 1: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/1.jpg)
RelSamp: Preserving Application Structure in Sampled Flow Measurements
Myungjin Lee, Mohammad Hajjat, Ramana Rao Kompella, Sanjay Rao
![Page 2: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/2.jpg)
Internet
A plethora of Internet applications
1) Emergence of new applications
2) Measure/Monitor
3) Characterization
Objectives
- Re-provision networks
- Detect undesirable behaviors of applications
- Better prepare the network for major application trends
![Page 3: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/3.jpg)
Monitoring applications at an edge
- Goal: Monitoring application behavior
  - Identify number of flows
  - Identify number of packets
- Current solution: Sampled NetFlow
  - Supported by most modern routers
- Key limitation: Application session structure gets distorted
  - Small # of flows per application session
  - Small # of packets per application session
[Diagram: Enterprise Network, Edge Router running Sampled NetFlow, Internet]
![Page 4: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/4.jpg)
Preserving application structure in flow measurements
- Benefit 1: Enables continuous monitoring of applications
  - Better understanding of communication patterns
  - Better understanding of characteristics (# of flows, packets)
- Benefit 2: Application classification becomes easier
  - Statistical machine learning techniques: SVM, C4.5, etc.
  - Social behavior-based classifier: BLINC
- Benefit 3: Detecting undesirable traffic patterns of an application
![Page 5: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/5.jpg)
Contributions
- Introduce the notion of related sampling
  - Flows belonging to the same application session are sampled with higher probability
- Propose RelSamp architecture for realizing related sampling
  - Uses three stages of sampling to preserve application structure
- Show efficacy in preserving application structure
  - Captures more flows per application session
  - Significant increase in application classification accuracy
![Page 6: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/6.jpg)
Related sampling
[Figure: flows of App1, App2, and App3 shown in three panels: original application structure, Sampled NetFlow, and related sampling]
Key idea: Sample more flows from fewer application sessions
![Page 7: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/7.jpg)
Realizing related sampling
Question 1: How to sample an application session?
Question 2: How to sample packets within an application session?
![Page 8: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/8.jpg)
Defining application session
- A sequence of packets from an application on a given host with inter-arrival time ≤ τ seconds
  - Packets may belong to different flows to different destinations
- Example 1: BitTorrent connections to several destinations within a short span of time constitute an application session
- Example 2: Web connections from a browser several seconds apart constitute different application sessions
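The definition above can be illustrated offline with a minimal sketch. It assumes each packet already carries a host and an application label (which a router does not have online), and the tuple layout `(timestamp, host, app, flow)` and the function name `split_sessions` are hypothetical choices for illustration only.

```python
from collections import defaultdict


def split_sessions(packets, tau):
    """Group packets into application sessions.

    packets: iterable of (timestamp, host, app, flow) tuples (assumed layout).
    A new session starts whenever the gap since the previous packet of the
    same (host, app) pair exceeds tau seconds.
    """
    sessions = defaultdict(list)  # (host, app) -> list of sessions
    last_seen = {}                # (host, app) -> timestamp of previous packet
    for ts, host, app, flow in sorted(packets):
        key = (host, app)
        if key not in last_seen or ts - last_seen[key] > tau:
            sessions[key].append([])  # gap too large: open a new session
        sessions[key][-1].append((ts, flow))
        last_seen[key] = ts
    return dict(sessions)


packets = [
    (0.0, "h1", "bittorrent", "f1"),
    (0.5, "h1", "bittorrent", "f2"),
    (1.0, "h1", "bittorrent", "f3"),
    (0.0, "h1", "web", "w1"),
    (10.0, "h1", "web", "w2"),
]
sessions = split_sessions(packets, tau=5.0)
```

With these inputs the three BitTorrent flows fall into one session, while the two web connections, 10 seconds apart, form separate sessions, mirroring Examples 1 and 2.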
![Page 9: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/9.jpg)
Sampling an application session
- One possible approach: Similar to Sampled NetFlow
  - Sample packets with some probability
  - Create an application session record if no record exists
  - Update the application session record
- Problem: Hard to do in an online fashion
  - No application session identifier (like flow key)
  - Need to know all flows that constitute an application session
  - DPI-based techniques are both difficult and incomplete
![Page 10: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/10.jpg)
Our approach: sampling hosts
- Observation: Host is a super-set of an application session
  - Sample more flows from the same host
- Flows originating at the same host close together in time typically belong to few application sessions
  - About 80% of hosts run fewer than 2 applications in our study
  - More details in the paper
![Page 11: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/11.jpg)
RelSamp design
- Three-stage sampling process consisting of host, flow, and packet selection stages
- Host stage: hash-based sampling
  - No state maintained on a per-application basis
  - Many application sessions for a given host are possibly sampled
  - Change hash function periodically to track different hosts
- Flow and packet stages: random packet sampling
  - Controls fraction of flows sampled in an application session and packets sampled in a flow
- Post processing: Can separate flow records into application sessions using port-based/statistical classifiers
![Page 12: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/12.jpg)
RelSamp architecture
[Diagram: a copy of each packet passes through three bias stages backed by a flow memory; Ph, Pf, and Pp are the tunable parameters]
- Host-level bias stage (Ph)
  - Selection range within the hash space of H(SrcIP)
  - Ph = selection range / hash space
- Flow-level bias stage (Pf)
  - if (random no. ≤ Pf && no flow record) create a flow record
- Pkt-level bias stage (Pp)
  - if (random no. ≤ Pp && flow record) update the flow record
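The three stages on this slide can be sketched roughly as follows. This is an illustrative reading of the slide, not the authors' implementation: the class name `RelSampSketch`, the salted SHA-256 hash, the 32-bit hash space, dropping packets from unselected hosts, and counting the packet that creates a flow record are all assumptions.

```python
import hashlib
import random

HASH_SPACE = 2 ** 32  # assumed size of the hash space


def host_hash(src_ip, salt):
    """H(SrcIP): map a source IP into [0, HASH_SPACE). Changing the salt
    stands in for periodically changing the hash function."""
    digest = hashlib.sha256(f"{salt}:{src_ip}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


class RelSampSketch:
    def __init__(self, ph, pf, pp, salt=0, rng=None):
        # Ph = selection range / hash space, so the range is Ph * hash space.
        self.selection_range = int(ph * HASH_SPACE)
        self.pf = pf
        self.pp = pp
        self.salt = salt
        self.rng = rng if rng is not None else random.Random()
        self.flow_memory = {}  # flow key -> number of sampled packets

    def process(self, src_ip, flow_key):
        """Return True if this packet is recorded in flow memory."""
        # Host stage: keep only hosts whose hash falls in the selection range.
        if host_hash(src_ip, self.salt) >= self.selection_range:
            return False
        if flow_key not in self.flow_memory:
            # Flow stage: create a flow record with probability Pf.
            if self.rng.random() <= self.pf:
                self.flow_memory[flow_key] = 1  # assumption: count this packet
                return True
            return False
        # Packet stage: update an existing record with probability Pp.
        if self.rng.random() <= self.pp:
            self.flow_memory[flow_key] += 1
            return True
        return False


sampler = RelSampSketch(ph=1.0, pf=1.0, pp=1.0)
for _ in range(5):
    sampler.process("10.0.0.1", ("10.0.0.1", "192.0.2.7", 51000, 80, "tcp"))
```

Because the host stage is a deterministic hash rather than a coin flip, every flow from a selected host reaches the flow stage, which is what biases sampling toward related flows.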
![Page 13: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/13.jpg)
Exploring parametric space
- Router sampling budget Pe = f(Ph, Pf, Pp)
- Trade-off between accuracy of flow statistics and # flows/application session
- Parameters can be tuned depending on
  - Objective
  - Network environment
- Examples of tuning parameters by objective
  - Application classification: low Ph, high Pf, low Pp
  - Application characterization: lower Ph, high Pf, high Pp
  - Flow statistics of all flows: Ph = Pf = Pp = Pe
![Page 14: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/14.jpg)
Evaluation goals
- Application characterization
  - Question 1: Is RelSamp effective for sampling more flows in an application session?
  - Question 2: Can RelSamp estimate statistics of an application session?
- Application classification
  - Question 3: Is sampling more flows in an application session beneficial for application classification?
![Page 15: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/15.jpg)
Experimental setup
- Evaluation of effectiveness for capturing more flows
  - Trace 1: 1-hour packet trace collected at an edge
  - RelSamp configuration (other settings in paper): Capture more flows of app session from many hosts , , ()
- Evaluation of application classification accuracy
  - Trace 2: 13-hour full-payload trace captured at a dorm network
  - RelSamp setting: Similar setting, but varies from 0.1 to 1.0
  - Classifiers: BLINC [SIGCOMM ’05], SVM, and C4.5
  - Ground truth is obtained using a DPI-based classifier (tstat)
![Page 16: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/16.jpg)
Flows per application session
[Figure: CDF of #captured flows / #total flows in an app session; annotation: more flows per app session]
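The metric plotted on this slide, the fraction of an application session's flows that a sampler captured, can be computed from ground-truth and sampled flow sets along these lines. This is a minimal sketch; the dictionary layout (session id mapped to a set of flow keys) and the function names are hypothetical.

```python
def capture_ratios(total_flows, captured_flows):
    """Per-session ratio: #captured flows / #total flows in the session.

    total_flows, captured_flows: dict mapping session id -> set of flow keys
    (assumed layout). Sessions with no flows are skipped.
    """
    return {
        sid: len(captured_flows.get(sid, set()) & flows) / len(flows)
        for sid, flows in total_flows.items()
        if flows
    }


def empirical_cdf(values):
    """Return (value, cumulative fraction) points of the empirical CDF."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


total = {"s1": {"a", "b", "c", "d"}, "s2": {"e", "f"}}
captured = {"s1": {"a", "b"}, "s2": {"e", "f"}}
ratios = capture_ratios(total, captured)
cdf = empirical_cdf(ratios.values())
```

A sampler that preserves application structure better shifts this CDF to the right, since more sessions have a large share of their flows captured.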
![Page 17: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/17.jpg)
Accuracy of BLINC classifier
[Figure: accuracy (%) vs. sampling rate; annotation: ~50% increase]
Note: classification results on flows using non-standard ports
![Page 18: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/18.jpg)
Related work
- Flow Sampling [ToN ’06]
  - Samples flows once flow record is created
- Flow Slices [IMC ’05]
  - Focuses on controlling router resources (CPU and memory)
- cSamp [NSDI ’08]
  - Supports sampling of all traffic by coordinating various vantage points in a network
- FlexSample [IMC ’08]
  - Supports monitoring of traffic subpopulations, but needs to maintain extra state for approximate checking of predicates
![Page 19: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/19.jpg)
Summary
- Introduced the notion of related sampling
  - Samples related flows in the same application session with higher probability
- Proposed RelSamp architecture
  - Preserves application structure in sampled flow records
- Effective at preserving application session structure
  - 5-10x more flows per application session compared to Sampled NetFlow
  - Up to 50% higher classification accuracy than Sampled NetFlow
![Page 20: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/20.jpg)
Thank you! Questions?
![Page 21: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/21.jpg)
Evaluation method of classification techniques
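[Diagram: a packet trace feeds a DPI-based classifier (Tstat), which produces the ground truth, and feeds RelSamp, Sampled NetFlow, and Flow Sampling, which produce Flow Records 1, 2, and 3; a classification algorithm (e.g., BLINC, SVM, C4.5) runs on the flow records and its output is compared with the ground truth in a report]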
![Page 22: RelSamp : Preserving Application Structure in Sampled Flow Measurements](https://reader033.fdocuments.in/reader033/viewer/2022050911/568166bb550346895ddac420/html5/thumbnails/22.jpg)
Comparison with other solutions using BLINC
[Figure: # of accurately classified flows vs. sampling rate]
Note: classification results on flows using non-standard ports