PBS: Periodic Behavioral Spectrum of P2P Applications
Tom Z.J. Fu, Yan Hu, Xingang Shi, Dah Ming Chiu and John C.S. Lui
The Chinese University of Hong Kong
Outline
Where the idea comes from
Periodic patterns of P2P applications
Methods to extract the patterns
Application of PBS
Related work
Discussion
Conclusion
Idea comes from ...
Discovery by chance:
Measuring a PPLive packet trace locally and analyzing its performance.
[Figure: measurement setup showing the IE Dept., the campus network, and outside campus]
Idea comes from ...
Measurement results of PPLive traffic:
Every 5 seconds!
Like a periodic sequence!
Idea comes from ...
Based on this discovery, several questions arise:
1. Does this phenomenon always occur, or did it appear just by chance?
2. If so, do other P2P streaming / file-sharing systems have this property?
3. If so, do they have the same period or different periods?
4. If the periods differ, one period represents a particular P2P application -> identification of P2P applications
Idea comes from ...
Borrowed idea: element identification by spectrum analysis
[Figure: spectrum plotted over frequency f]
Periodic group communication patterns
Why do P2P applications have periodicity?
In order to form and maintain the overlay topology.
Two classes of periodic group communication pattern:
1) Control plane – overlay formation and maintenance
2) Data plane – content multicasting
This holds for both structured overlays and data-driven overlays.
Periodic group communication patterns
Why do P2P applications have periodicity?
Two kinds of overlays formed by P2P applications:
Structured overlays:
a) Mesh-based: End System Multicast
b) Tree-based: NICE, Yoid, Scribe
Data-driven overlays (e.g. BitTorrent):
a) Periodically update chunk information with the tracker
b) Periodic choking (every 10 s) and optimistic unchoking (every 30 s)
Periodic group communication patterns
Pattern 1: Gossip of Buffer Maps
Pattern 2: Content Flow Control
Periodic group communication patterns
Pattern 3: Synchronized Link Activation and Deactivation
Peers' tit-for-tat mechanism, as in BitTorrent
[Figure: users Alice, Bob, Christ, David and Elaine, with links labeled "Choked" and "Unchoke"]
Three types of sequence generator
SG1: Sequence generator for gossip pattern
Pipeline: SG1 -> ACF (autocorrelation function) -> FFT
Three types of sequence generator
SG2: Sequence generator for content flow control pattern
Pipeline: SG2 -> ACF -> FFT
Three types of sequence generator
SG3: Sequence generator for synchronized start and end of flows
Pipeline: SG3 -> ACF -> FFT
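The sequence-generator, ACF, FFT chain can be illustrated with a minimal sketch. The slides do not define how each generator builds its sequence, so `bin_counts` below is a hypothetical stand-in that simply counts packets per time bin, and the trace is synthetic. To stay dependency-free, a plain O(n^2) DFT is used in place of an FFT; it produces the same spectrum.

```python
import cmath
import math

def bin_counts(timestamps, bin_width, duration):
    """Hypothetical sequence generator: number of packets per time bin."""
    n_bins = int(duration / bin_width)
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def autocorrelation(seq):
    """Biased autocorrelation of the mean-removed sequence, lags 0..n-1."""
    n = len(seq)
    mean = sum(seq) / n
    x = [v - mean for v in seq]
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) / n
            for lag in range(n)]

def dominant_frequency(signal, sample_rate):
    """Frequency (Hz) of the largest non-DC DFT magnitude (plain DFT, not FFT)."""
    n = len(signal)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate / n

# Synthetic trace (assumption): a short burst of 4 packets every 5 s for 100 s.
timestamps = [5.0 * i + off for i in range(20) for off in (0.2, 0.6, 1.2, 1.6)]
counts = bin_counts(timestamps, bin_width=1.0, duration=100.0)
acf = autocorrelation(counts)
freq = dominant_frequency(acf, sample_rate=1.0)
print(f"dominant period: {1.0 / freq:.1f} s")
```

Taking the autocorrelation first suppresses uncorrelated noise before the transform, so the periodic component stands out more clearly in the spectrum.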
FFT results of selected P2P applications
Pipeline: packet trace -> SG1, SG2, SG3 -> FFT1, FFT2, FFT3 -> analyzer
PBS of known P2P applications
Apply PBS to identify P2P traffic
a) Filtering
b) Sequencing
c) Transforming
d) Analyzing
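The four steps above can be sketched roughly as follows. Everything concrete here is an assumption for illustration: the packet tuple layout `(timestamp, source, size)`, the small-packet filter, the 5% matching tolerance, and the `KNOWN_PERIODS_S` table, which only reuses periods mentioned on these slides and is not the measured PBS. A plain DFT stands in for the FFT.

```python
import cmath
import math

# Illustrative table only: periods mentioned on the slides, not the measured PBS.
KNOWN_PERIODS_S = {"PPLive": [5.0], "BitTorrent": [10.0, 30.0]}

def filter_packets(packets, host, max_size):
    """a) Filtering: keep timestamps of small packets sent by `host`."""
    return [ts for ts, src, size in packets if src == host and size <= max_size]

def to_sequence(timestamps, bin_width, n_bins):
    """b) Sequencing: number of packets per time bin."""
    seq = [0] * n_bins
    for ts in timestamps:
        idx = int(ts / bin_width)
        if 0 <= idx < n_bins:
            seq[idx] += 1
    return seq

def spectrum(seq):
    """c) Transforming: DFT magnitudes of the mean-removed sequence, bins 1..n/2."""
    n = len(seq)
    mean = sum(seq) / n
    return {
        k: abs(sum((seq[t] - mean) * cmath.exp(-2j * math.pi * k * t / n)
                   for t in range(n)))
        for k in range(1, n // 2 + 1)
    }

def identify(spec, n_bins, bin_width, tolerance=0.05):
    """d) Analyzing: match the strongest peak's period against the table."""
    k = max(spec, key=spec.get)
    period = n_bins * bin_width / k
    return [app for app, periods in KNOWN_PERIODS_S.items()
            if any(abs(period - p) <= tolerance * p for p in periods)]

# Synthetic header-only trace (assumption): host 10.0.0.1 sends small packets
# in a burst every 10 s; large packets and another host act as noise.
packets = []
for period_idx in range(12):
    base = 10.0 * period_idx
    for offset in (0.2, 1.2, 2.2):
        packets.append((base + offset, "10.0.0.1", 60))
    packets.append((base + 4.5, "10.0.0.1", 1400))
    packets.append((base + 7.3, "10.0.0.2", 60))

n_bins, bin_width = 120, 1.0
kept = filter_packets(packets, host="10.0.0.1", max_size=100)
seq = to_sequence(kept, bin_width, n_bins)
result = identify(spectrum(seq), n_bins, bin_width)
print(result)
```

Only timestamps, source addresses and packet sizes are used, in line with the point that header information suffices.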
Heuristic Algorithm overview
Detection runs on a target host and is an iterative process.
[Figure: filter stage, with a feedback step for configuring the filtering parameters]
Identification Results
Two days of traffic traces collected at the IE Department gateway
Validation against packet payload signatures:
Over these two days, four hosts running (1) PPStream live streaming, (2) PPLive live streaming, (3) eMule and (4) BT
were identified by our method with 100% accuracy.
Discussions
When applying PBS to identify P2P applications:
1. Only packet header information is needed.
2. It targets specific P2P applications.
3. It can be used as a validation method.
4. The data collection position affects performance (it may work well on campus-level traffic traces).
5. The identification results are host-level, not flow-level.
6. Packet sampling may cause problems.
7. There is a lack of ways to validate identification results.
Related Work
P2P traffic identification is a hot topic.
Existing approaches:
1. Transport-layer port number based
• Simplest method, easy to implement, and works in real time
• Effective and efficient for conventional applications (WEB, DNS, MAIL, FTP …)
• Today's P2P applications do not use fixed, predefined well-known port numbers.
• Applications sometimes tunnel through well-known ports.
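A minimal sketch of port-based classification shows both its simplicity and the two weaknesses listed above. The port table uses the standard assignments for HTTP, DNS, SMTP and FTP control; the function name and the example port numbers are hypothetical.

```python
# Standard well-known ports for the applications named above.
WELL_KNOWN_PORTS = {80: "WEB", 53: "DNS", 25: "MAIL", 21: "FTP"}

def classify_by_port(src_port, dst_port):
    """Label a flow by whichever endpoint uses a well-known port."""
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "unknown"

# A web flow is recognized from its destination port.
print(classify_by_port(51234, 80))
# A P2P flow on arbitrary ports cannot be labeled.
print(classify_by_port(51234, 43210))
```

A P2P flow tunneled through port 80 would be labeled "WEB" by the same lookup, which is the misclassification the last bullet describes.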
Related Work
Existing approaches:
2. Packet payload-based
• More reliable than port-based method
• Adopted by commercial products
• Detects specific applications (BT, eDonkey, etc.)
• Raises privacy and legal issues
• Ineffective when payloads are encrypted
• Finding appropriate signatures for newly released applications and keeping signatures up to date are daunting tasks (in our experience).
Existing approaches:
3. Host traffic pattern based: BLINC
• Needs only flow-level information; no payload or port number information is required.
• Host-level identification (a new way of thinking)
• Aims at all kinds of applications, not specifically at P2P traffic
Related Work
Figure from "BLINC: Multilevel Traffic Classification in the Dark", SIGCOMM '05
Conclusion
In this paper:
1) Periodic communication patterns of P2P applications
2) Three sequence generators to catch the periodic patterns
3) Illustrating Frequency Characteristics of several existing P2P applications
4) Heuristic identification method by applying PBS
5) Discussions
6) Related work
Thanks !
Q & A