On the Way IN: DC Forensics
M.Zhanikeev -- [email protected] -- Design and Algorithms for Multicore Capture in Data Center Forensics-- http://bit.do/marat140603 -- 2/28...
Forensics Basics
(Traditional) forensics stages are collection, examination, analysis, and reporting.
• many challenges in data centers
• collection: real-time collection is very difficult
• examination: you can't examine what you can't collect; flexibility is also important
• analysis: a deeper form of examination, with the same problems
• reporting: this part is actually easy, but DCs have no standards
  ◦ one standard is offered later in this presentation
Forensics : All is Traffic
Statement: all information in data centers can be reduced to the form of traffic
• logs are information carried in packets
• logging, storage, etc. are distributed, so they have to communicate using traffic
• a corollary: if something is not traffic, it might be useful to convert it into traffic
Practical DC Forensics
• we want Deep Packet Inspection (DPI) back on the table
• we want to capture everything rather than sample
• we want to vary the attention spent on different classes of traffic
  ◦ called context-based sampling
  ◦ probability of capture/inspection depends on the current context
• note: all of these have gradually been dropped from practice as infeasible
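Context-based sampling can be sketched as follows. This is a minimal illustration, not the method from the slides: the class names, the probabilities, and the `alert_level` context signal are all assumptions made up for the example.

```python
import random

# Hypothetical per-class capture probabilities; the slides give no values.
CAPTURE_PROBABILITY = {
    "suspicious": 1.0,  # always capture and inspect
    "normal": 0.2,
    "bulk": 0.01,
}


def should_capture(traffic_class, rng, alert_level=0.0):
    """Decide whether to capture one packet, given its class and the context.

    alert_level in [0, 1] is an assumed context signal: it pushes every
    probability toward 1.0 (capture everything) as the alert level rises.
    """
    base = CAPTURE_PROBABILITY.get(traffic_class, 1.0)  # unknown class: capture
    probability = base + (1.0 - base) * alert_level
    return rng.random() < probability


def capture_rate(traffic_class, alert_level, trials=10000, seed=1):
    """Fraction of packets of one class that would be captured."""
    rng = random.Random(seed)
    hits = sum(should_capture(traffic_class, rng, alert_level) for _ in range(trials))
    return hits / trials
```

With `alert_level=1.0` every class is captured in full, which corresponds to the "capture everything" wish; lower levels spend less attention on the cheaper classes.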
Conventional Multicore
Generic Multicore Design
Generic Multicore Capture
• 2 roles: manager and core
• traditional parallel processing: message passing or shared memory [05] [06]
[05] M.Aldinucci+2 "FastFlow: Efficient Parallel Streaming Applications on Multi-core" U.Pisa Techreport (2009)
[06] R.Brightwell "Workshop on Managed Many-Core Systems" 1st Managed Many-Core Systems (2008)
Conventional Shortcomings
Reality is that traditional parallel processing designs are extremely inefficient on multicore
• overhead from parallelization is too high
• unit of processing is too small
• streamlined designs are rare but have recently been discussed in Big Data [08]
The solution is to use a lock-free (message-less) parallelization design
[08] R.Chen+2 "Tiled-MapReduce: Optimizing Resource Usages ... on Multicore with Tiling" 19th PACT (2010)
Conventional → Proposed
• spawn, but don't wait to merge
• collect results from cores continuously to avoid lumps
• get used to not being able to communicate with cores (no messages)
  ◦ relatively short tasks diminish this effect [02]
[02] myself+0 "Experiments with Practical On-Demand Multi-Core Packet Capture" APNOMS (2013)
Proposal : the New Multicore
Proposal : Mission Statement
Proposal components:
• lock-free design
• tasks-into-cores packing problem and optimization
• implementation that supports the lock-free design
• remember: the easiest way to aggregate traffic is to use IP address prefixes
• again, the design is generic, so we do not care about the contents
Proposal : Shared Memory
• communication happens over shared memory [04]
• C/C++ implementation is common, but it will work in other languages as well
• shared memory is persistent, but cores come and go
[04] M.Kerrisk "The Linux Programming Interface" No Starch Press (2010)
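A minimal sketch of the "persistent memory, transient cores" idea, written in Python rather than C/C++ for brevity. The slot layout (one 64-bit counter per core) and all function names are assumptions for illustration; the only library used is the standard `multiprocessing.shared_memory` module. Each core writes only to its own slot, so no lock or message is needed, and the manager reads whatever is there without waiting.

```python
import struct
from multiprocessing import shared_memory

SLOT = struct.Struct("<Q")  # assumed layout: one unsigned 64-bit counter per core


def create_segment(num_cores):
    """Manager side: create a zero-filled segment with one slot per core."""
    size = SLOT.size * num_cores
    shm = shared_memory.SharedMemory(create=True, size=size)
    shm.buf[:size] = bytes(size)
    return shm


def core_report(segment_name, core_id, packets_seen):
    """Core side: attach by name, overwrite only this core's slot, detach."""
    shm = shared_memory.SharedMemory(name=segment_name)
    try:
        SLOT.pack_into(shm.buf, core_id * SLOT.size, packets_seen)
    finally:
        shm.close()  # the core goes away; the segment stays


def manager_poll(shm, num_cores):
    """Manager side: read every slot without waiting for any core."""
    return [SLOT.unpack_from(shm.buf, i * SLOT.size)[0] for i in range(num_cores)]


def demo():
    shm = create_segment(3)
    try:
        core_report(shm.name, 0, 120)  # core 0 appears, reports, disappears
        core_report(shm.name, 2, 45)   # core 1 never reports
        return manager_poll(shm, 3)
    finally:
        shm.close()
        shm.unlink()
```

In a real deployment the cores would be separate processes attaching by segment name; here both roles run in one process only to keep the example short.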
Proposal : DLL is Key
DLL stands for Doubly Linked List
• common in C/C++ designs
• extremely flexible: you can swap elements by reassigning pointers
• a sideways DLL is a method to avoid collisions in hashing
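One possible reading of the "sideways DLL" is a hash table whose colliding keys hang off each bucket as a doubly linked list. The sketch below assumes that reading; the class and method names are hypothetical. It also shows reordering by pointer reassignment: a lookup moves the found node to the front of its chain without copying any data.

```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class ChainedTable:
    """Hash table whose buckets are doubly linked lists of colliding keys."""

    def __init__(self, num_buckets=8):
        self.heads = [None] * num_buckets

    def _bucket(self, key):
        return hash(key) % len(self.heads)

    def _find(self, b, key):
        node = self.heads[b]
        while node is not None and node.key != key:
            node = node.next
        return node

    def _unlink(self, b, node):
        # Detach the node by rewiring its neighbours around it.
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.heads[b] = node.next
        if node.next is not None:
            node.next.prev = node.prev
        node.prev = node.next = None

    def _push_front(self, b, node):
        node.next = self.heads[b]
        if node.next is not None:
            node.next.prev = node
        self.heads[b] = node

    def put(self, key, value):
        b = self._bucket(key)
        node = self._find(b, key)
        if node is None:
            self._push_front(b, Node(key, value))
        else:
            node.value = value

    def get(self, key):
        b = self._bucket(key)
        node = self._find(b, key)
        if node is None:
            raise KeyError(key)
        if self.heads[b] is not node:  # move to front: pointers only, no copying
            self._unlink(b, node)
            self._push_front(b, node)
        return node.value

    def chain(self, b):
        """Keys in bucket b, front to back."""
        keys, node = [], self.heads[b]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys
```

With 4 buckets, the integer keys 1, 5, and 9 all collide in bucket 1, so they end up in one chain that can be reordered freely.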
Optimization
Optimization Targets
• few cores, many data units
• need to pack the latter into the former
• moreover: a scheduling problem, which is packing along the timeline
• moreover (2): when packing, do you randomize the input (hashing) or not
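The raw-versus-hashed choice can be made concrete with a small sketch. It assumes IPv4 addresses as the 32-bit key and uses CRC32 as a stand-in hash; neither the hash function nor the helper names come from the slides.

```python
import ipaddress
import zlib
from collections import Counter


def raw_key(ip):
    """32-bit key taken directly from the IPv4 address."""
    return int(ipaddress.IPv4Address(ip))


def hashed_key(ip):
    """32-bit key from CRC32 of the address bytes (an assumed stand-in hash)."""
    return zlib.crc32(ipaddress.IPv4Address(ip).packed) & 0xFFFFFFFF


def prefix_of(key, length):
    """Top `length` bits of a 32-bit key (1 <= length <= 32)."""
    return key >> (32 - length)


def count_per_prefix(ips, length, key_fn):
    """Number of items (addresses) falling under each prefix."""
    return Counter(prefix_of(key_fn(ip), length) for ip in ips)
```

Raw keys keep neighbouring addresses under the same prefix, while hashed keys scatter them; the per-prefix counts produced here are the data units that then have to be packed into cores.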
Prefix Packing Problem
$$\text{minimize}\quad w_1\,\mathrm{count}(P) + w_2\,\max(M) + w_3\,\mathrm{var}(C)$$
$$\text{subject to}\quad k_1 < p_i < k_2 \quad \forall\, p_i \in P$$
[Figure: a 32-bit hashkey with the effective range of prefix lengths between k1 (shortest) and k2 (longest); prefixes p1 to p8 packed into Core0, Core1, Core2, ...; m marks the maximum and n the total]
• prefix length is between k1 and k2
  ◦ applied to the hashkey or the raw key
  ◦ fixed in each run in this paper
• pi is a pack (group) of items
• n total items, mapped to a set M of prefixes in each of m cores
• C is a set of item counts c across prefixes
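The objective can be evaluated for a candidate packing as sketched below. The slide leaves M and C loosely defined, so this is one reading and an assumption: max(M) is taken as the largest number of prefixes on any core, and var(C) as the population variance of item counts across prefixes. The function name and the default weights are hypothetical.

```python
from statistics import pvariance


def packing_cost(assignment, item_counts, w1=1.0, w2=1.0, w3=1.0):
    """Cost of one packing under the assumed reading of the objective.

    assignment:  list with one entry per core, each a list of prefixes
    item_counts: dict mapping prefix -> number of items under that prefix
    """
    prefixes = [p for core in assignment for p in core]
    count_p = len(prefixes)                        # count(P)
    max_m = max(len(core) for core in assignment)  # max(M), assumed meaning
    counts = [item_counts[p] for p in prefixes]
    var_c = pvariance(counts) if counts else 0.0   # var(C), assumed meaning
    return w1 * count_p + w2 * max_m + w3 * var_c
```

For example, three prefixes with 10, 20, and 30 items packed as two on one core and one on another give count(P) = 3, max(M) = 2, and var(C) = 200/3.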
Prefix Packing GA Heuristic
• Genetic Algorithm (GA) [12]
• a chromosome is a tuple of prefixes packed into one core
$$g_i = \langle p_{i,1}, p_{i,2}, \ldots, p_{i,m} \rangle \qquad (1)$$
• one gene (whole solution) is a tuple containing all chromosomes
$$G_j = \langle g_1, g_2, \ldots, g_n \rangle \qquad (2)$$
[12] D.Knysh+1 "Parallel Genetic Algorithms: a Survey and Problem State of the Art" IJCSS (2010)
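A toy version of such a heuristic is sketched below, under stated assumptions: the fitness is simplified to the largest item load on any core, crossover is omitted so the search is mutation-only with elitist selection, and every name and parameter value is hypothetical. Following the slide's terminology, a solution is a list of per-core prefix lists.

```python
import random


def fitness(solution, item_counts):
    """Assumed cost: the largest number of items landing on any one core."""
    return max(sum(item_counts[p] for p in core) for core in solution)


def random_solution(prefixes, num_cores, rng):
    """Scatter the prefixes over the cores uniformly at random."""
    cores = [[] for _ in range(num_cores)]
    for p in prefixes:
        cores[rng.randrange(num_cores)].append(p)
    return cores


def mutate(solution, rng):
    """Move one randomly chosen prefix to a randomly chosen core."""
    child = [list(core) for core in solution]
    donors = [i for i, core in enumerate(child) if core]
    src = rng.choice(donors)
    prefix = child[src].pop(rng.randrange(len(child[src])))
    child[rng.randrange(len(child))].append(prefix)
    return child


def evolve(item_counts, num_cores, generations=200, population=20, seed=0):
    """Keep the better half each generation and refill it with mutants."""
    rng = random.Random(seed)
    prefixes = sorted(item_counts)
    pool = [random_solution(prefixes, num_cores, rng) for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=lambda s: fitness(s, item_counts))
        survivors = pool[: population // 2]  # elitist selection
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        pool = survivors + children
    return min(pool, key=lambda s: fitness(s, item_counts))
```

Because the best solutions always survive, the best fitness in the pool never gets worse from one generation to the next.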
Analysis
Analysis Setup
• actual packet traces: trace-based simulation [16]
• input: 2 cases, hashed versus raw
• items are individual packets
  ◦ packets are packed into prefixes
  ◦ prefixes are packed into cores
• optimization uses the GA heuristic above
[16] myself "MAWI Working Group Traffic Archive" http://mawi.wide.ad.jp/mawi (2014)
Analysis (1) Cores
[Figure: log(max item count / core) over the time sequence (0 to 9), one curve each for 1 to 7 cores]
Analysis (2) Hashing
[Figure: number of unique prefixes (0 to 240) versus increasing cutoff parameter (0 to 1), for hashed and raw input]
Forensics 2.0
Forensics 2.0
• reporting part: let's use sketches from data streaming [11]
[Figure: TABID Manager and Core 1 ... Core X over Shared Memory (read/prepare); a "Now (replay)" cursor on the big data timeline along the time direction; one sketch per start-end interval]
[11] M.Sung+3 "Scalable and Efficient Data Streaming Algorithms for Detecting Common Content..." ICDE (2006)
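As an illustration of sketch-based reporting, the code below keeps one Count-Min sketch per time interval. Count-Min is a swapped-in, well-known streaming sketch chosen for brevity, not necessarily the one used in [11]; the width, depth, interval length, and all names are assumptions.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency summary: estimates never undercount."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # A different salt per row gives one independent-looking hash per row.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(2, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for r, row in enumerate(self.rows):
            row[self._index(r, key)] += count

    def estimate(self, key):
        return min(row[self._index(r, key)] for r, row in enumerate(self.rows))


def sketch_timeline(packets, interval):
    """Build one sketch per time interval.

    packets: iterable of (timestamp, key) pairs
    returns: dict mapping interval index -> CountMinSketch
    """
    sketches = {}
    for timestamp, key in packets:
        slot = int(timestamp // interval)
        if slot not in sketches:
            sketches[slot] = CountMinSketch()
        sketches[slot].add(key)
    return sketches
```

Each sketch has a fixed size regardless of how many packets fall into its interval, which is what makes it a plausible unit for a reporting timeline.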
Wrapup
• a natively multicore technology is proposed
• performance is optimized using a packing heuristic
• raw input is found to be preferable to randomization
• future topics:
  1. variable-length prefixes
  2. optimization along the timeline
  3. jitter minimization (fewer reassignments)
  4. further lookup optimization: fast hashing
That’s all, thank you ...
[01] myself+0 (2013) "...community-based architecture for measuring E2E QoS at DCc" IJCSE
[02] myself+0 (2013) "Experiments with Practical On-Demand Multi-Core Packet Capture" APNOMS
[03] myself+1 (2013) "A Graphical Method for Detection of Flash Crowds in Traffic" Telecom. Systems (TM)
[04] M.Kerrisk (2010) "The Linux Programming Interface" No Starch Press
[05] M.Aldinucci+2 (2009) "FastFlow: Efficient Parallel Streaming Applications on Multi-core" U.Pisa Techreport
[06] R.Brightwell (2008) "Workshop on Managed Many-Core Systems" 1st Managed Many-Core Systems
[07] X.Sui+3 (2010) "Parallel Graph Partitioning on Multicore Architectures" 23rd LCPC
[08] R.Chen+2 (2010) "Tiled-MapReduce: Optimizing Resource Usages ... on Multicore with Tiling" 19th PACT
[09] I.Machdi+2 (2009) "Executing parallel TwigStack algorithm on a multi-core system" 11th IIWAS
[10] S.Stoichev+1 (2009) "Parallel Algorithm for Integer Sorting with Multi-Core Processors" IT and Control
[11] M.Sung+3 (2006) "Scalable and Efficient Data Streaming Algorithms for Detecting Common Content..." ICDE
[12] D.Knysh+1 (2010) "Parallel Genetic Algorithms: a Survey and Problem State of the Art" IJCSS
[13] Luca Deri (2009) "Modern Packet Capture and Analysis: Multi-Core, Multi-Gigabit, and Beyond" IM
[14] myself (2014) "MCoreMemory project page" https://github.com/maratishe/mcorememory
[15] myself (2013) "Rings-on-Cores project page" https://github.com/maratishe/ringsNcores
[16] myself (2014) "MAWI Working Group Traffic Archive" http://mawi.wide.ad.jp/mawi
Extras (1) Per-Unit Cost
[Figure: Stages 1 to 3 with increasing per-unit cost; labels: Hashing, Manager, Prefix Matching, Cores that do not match, Process]
Extras (2) Shared Memory Trick