Data Mining Challenges for Network Management Nick Feamster, Georgia Tech Dave Andersen, CMU http://www.datapository.net/ (joint with Jay Lepreau and Emulab)

Page 1:

Data Mining Challenges for Network Management

Nick Feamster, Georgia Tech

Dave Andersen, CMU

http://www.datapository.net/

(joint with Jay Lepreau and Emulab)

Page 2:

Reactive Operation

• Problems cause downtime

• Problems often not immediately apparent

What happens if I tweak this policy…?

[Diagram: Configure → Observe → Desired Effect? If no: Revert. If yes: Wait for Next Problem.]

Page 3:

Better: Proactive Operation

• Idea: Analyze configuration before deployment

[Diagram: Configure → Proactive Techniques (rcc: Detect Faults; Predict Traffic Flow) → Deploy]

Many faults can be detected with static analysis.

Page 4:

Dynamics for Network Management

• Problem: Many problems can’t be detected from static configuration analysis of a single AS

• Dependencies on neighboring ASes
  – Contract violations
  – Route hijacks
  – BGP “wedgies”
  – Filtering

• Dependencies on route arrivals
  – Simple network configurations can oscillate, but operators can’t tell until the routes actually arrive.

Threshold-based “anomaly detection” schemes cannot detect these problems.

Page 5:

Network Management Challenges

• Infrastructure support for data management
  – Heterogeneous
    • DB support for longest-prefix match would make correlation of routing and traffic data (“joint analysis”) much easier
  – Large volumes
  – Need for real-time analysis (e.g., for anomalies/intrusion detection)

• Algorithmic support for data mining
  – Support for joint analysis
  – Threshold-based schemes don’t work for
    • Small traffic blips
    • Small routing blips

• Support for proactive, offline analysis of routing dynamics
  – Analyzing configuration changes, etc.

• Support for online control
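To make the longest-prefix-match point concrete, here is a minimal Python sketch of the kind of "joint analysis" the slide has in mind: attributing traffic bytes to the most specific routing prefix that covers each destination. The table layout, the function names, and the sample routes and flows are all illustrative assumptions, not part of the Datapository schema, and a linear scan stands in for the database or trie support the slide asks for.

```python
import ipaddress

def build_table(routes):
    """Parse (prefix, next_hop) pairs; most specific prefixes come first."""
    table = [(ipaddress.ip_network(prefix), hop) for prefix, hop in routes]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table

def longest_prefix_match(table, address):
    """Return the (network, next_hop) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(address)
    for network, hop in table:
        if addr in network:
            return network, hop
    return None

def bytes_per_prefix(table, flows):
    """Join flow records (dst_ip, byte_count) against the routing table."""
    totals = {}
    for dst, nbytes in flows:
        match = longest_prefix_match(table, dst)
        key = str(match[0]) if match else "unrouted"
        totals[key] = totals.get(key, 0) + nbytes
    return totals

# Hypothetical sample data (documentation and private address ranges).
ROUTES = [
    ("10.0.0.0/8", "AS65001"),
    ("10.1.0.0/16", "AS65002"),
    ("192.0.2.0/24", "AS65003"),
]
FLOWS = [
    ("10.1.2.3", 1500),
    ("10.2.0.1", 400),
    ("192.0.2.77", 60),
    ("198.51.100.9", 40),
]

if __name__ == "__main__":
    for prefix, total in bytes_per_prefix(build_table(ROUTES), FLOWS).items():
        print(prefix, total)
```

The scan is linear in the number of prefixes per lookup, which is exactly why native support for this join would help at the data volumes the slide mentions.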

Page 6:

Challenge 1: Infrastructure Support

• Separate: collection, storage, analysis

• Collection: abstract type, format, and access method

Page 7: Data Mining Challenges for Network Management Nick Feamster, Georgia Tech Dave Andersen, CMU  (joint with Jay Lepreau and Emulab)

Challenge 2: Algorithmic Support

Blips across signals may be more operationally interesting than any spike in one.
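One simple way to illustrate this claim is to score each signal against its own baseline and then look for time bins where every signal deviates modestly at once. The Python sketch below is an assumption-laden illustration, not the authors' algorithm: the z-score scoring, the thresholds (3.0 for a single-signal spike, 2.0 for a joint blip), and the sample traffic and BGP update counts are all hypothetical.

```python
from statistics import mean, pstdev

def zscores(baseline, window):
    """Score each value in `window` against the baseline's mean and std dev.
    Assumes the baseline is not constant (non-zero standard deviation)."""
    mu, sigma = mean(baseline), pstdev(baseline)
    return [(x - mu) / sigma for x in window]

def single_signal_spikes(scores, threshold=3.0):
    """Classic threshold scheme: bins where one signal deviates strongly."""
    return [i for i, z in enumerate(scores) if abs(z) > threshold]

def joint_blips(score_lists, threshold=2.0):
    """Bins where every signal shows a smaller deviation at the same time."""
    return [
        i
        for i, zs in enumerate(zip(*score_lists))
        if all(abs(z) > threshold for z in zs)
    ]

# Hypothetical per-bin measurements: a quiet baseline, then a window to score.
TRAFFIC_BASELINE = [100, 104, 96, 102, 98, 100]
TRAFFIC_WINDOW = [101, 106, 99]
BGP_BASELINE = [20, 22, 18, 21, 19, 20]
BGP_WINDOW = [20, 23, 21]

traffic_z = zscores(TRAFFIC_BASELINE, TRAFFIC_WINDOW)
bgp_z = zscores(BGP_BASELINE, BGP_WINDOW)

if __name__ == "__main__":
    print("traffic spikes:", single_signal_spikes(traffic_z))
    print("bgp spikes:", single_signal_spikes(bgp_z))
    print("joint blips:", joint_blips([traffic_z, bgp_z]))
```

In this sample, neither signal crosses the per-signal threshold, yet both deviate in the same bin, which is the situation a threshold scheme on either signal alone would miss.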

Page 8:

Challenge 3: Proactive Fault Detection

[Diagram: Configure → Static Fault Detection → Construct Network Model → Dynamic Analysis in Emulation → Deploy. The intermediate stages are labeled “Proactive Techniques”. Input: Existing Routes (e.g., from Datapository).]

A possibility: detect configuration faults by observing “playback” of routing dynamics

“What-if” analysis in a safe sandbox.
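As a rough sketch of what such a playback could look like, the Python below replays a recorded update stream through a toy single-router decision process under two candidate configurations and counts how often each prefix's best route changes. Everything here is a simplifying assumption for illustration: the update format, the `replay` and `best_route` helpers, the three-step tie-breaking (local preference, AS-path length, neighbor name), and the sample updates. It is far simpler than a real BGP implementation or an Emulab experiment.

```python
def best_route(candidates, local_pref):
    """Pick the best (neighbor, as_path): highest local preference first,
    then shortest AS path, then neighbor name as a deterministic tie-break."""
    if not candidates:
        return None
    return min(
        candidates.items(),
        key=lambda kv: (-local_pref.get(kv[0], 100), len(kv[1]), kv[0]),
    )

def replay(updates, local_pref):
    """Replay (prefix, neighbor, as_path) updates; as_path None = withdrawal.
    Returns the final best routes and the number of best-route changes."""
    rib, best, changes = {}, {}, {}
    for prefix, neighbor, path in updates:
        routes = rib.setdefault(prefix, {})
        if path is None:
            routes.pop(neighbor, None)
        else:
            routes[neighbor] = tuple(path)
        new_best = best_route(routes, local_pref)
        if new_best != best.get(prefix):
            best[prefix] = new_best
            changes[prefix] = changes.get(prefix, 0) + 1
    return best, changes

def unstable_prefixes(changes, limit):
    """Prefixes whose best route changed more than `limit` times."""
    return sorted(p for p, n in changes.items() if n > limit)

# Hypothetical recorded updates (documentation prefixes, private AS numbers).
UPDATES = [
    ("192.0.2.0/24", "peerA", (65001, 65010)),
    ("192.0.2.0/24", "peerB", (65002, 65020, 65010)),
    ("192.0.2.0/24", "peerA", None),
    ("192.0.2.0/24", "peerA", (65001, 65010)),
    ("198.51.100.0/24", "peerB", (65002, 65030)),
]
CURRENT_CONFIG = {"peerA": 100, "peerB": 100}
CANDIDATE_CONFIG = {"peerA": 100, "peerB": 200}

if __name__ == "__main__":
    for name, config in [("current", CURRENT_CONFIG), ("candidate", CANDIDATE_CONFIG)]:
        best, changes = replay(UPDATES, config)
        print(name, changes, unstable_prefixes(changes, limit=2))
```

Replaying the same stream under both configurations shows how a policy change alters route stability before anything is deployed, which is the "what-if" question the slide poses.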

Page 9:

Challenge 4: Support for Online Control

[Diagram: Inputs (Probes, BGP updates, IGP updates, Netflow, Router Configs) → Compute Engine (input processing) → Storage and DB → Anomaly Detection → Network-Wide Route Selection, Filter deployment, etc.]

Given a system to monitor, why not also use it for control?