Troubleshooting WAN with Distributed Network Monitoring


Troubleshooting Wide Area Networks with Distributed Network Monitoring

netbeez.net

AGENDA

• Monitoring of Wide Area Networks
  • Challenges
  • Monitoring Options
• Distributed Network Monitoring
  • Deployment
  • Benefits
• Use Cases
  • Routing Change
  • Wireless Performance
  • Local versus Global
• Q&A

[Diagram: Pittsburgh Headquarters IT connected over the Internet to sites in New York City, Miami, Chicago, San Francisco, Atlanta, and Baltimore. User complaints shown: "I cannot send emails!", "The Internet is slow!", "Salesforce is down!", "WiFi is down!"]


THE NETWORK MONITORING STACK

Infrastructure: SNMP Data
• No end-to-end view
• Device status only
• No performance data

Applications: Flow Data
• Passive data
• Limited historical data
• Requires tap/span/*flow

Users/Services: Active Data
✓ Active End-to-End Tests
✓ Performance Measurement
✓ Distributed Monitoring


[Diagram: Pittsburgh Headquarters IT, the Internet, and sites in New York City, Miami, Chicago, San Francisco, Atlanta, and Baltimore, plus Cloud Beez.]

FAULT ISOLATION

Remote Application Issue
• Local Issue (One Location)
  • One User: End-User Connection, End-User Workstation
  • All Users: Branch Router, ISP Connection, Power Loss
• Global Issue (All Locations)
  • Network Layer: Data Center Network, Wide Area Network, Internet Connection
  • Application Layer: Server Hardware, Operating System, Application, Cloud Provider
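The fault isolation breakdown (one location versus all locations, one user versus all users) can be sketched as a small classifier. This is a minimal illustration, not a description of how any product implements it. The function name `isolate_fault`, the input shape, and the labels it returns are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical input shape: results[location][user] is True when that
# user's test against the remote application succeeded.
Results = Dict[str, Dict[str, bool]]


def isolate_fault(results: Results) -> dict:
    """Classify failures as local or global, and per location as one/some/all users."""
    affected = {}
    for location, users in results.items():
        failed = [user for user, ok in users.items() if not ok]
        if not failed:
            continue
        # A location with a single monitored user that fails counts as
        # "all users", because there is nothing to compare against.
        if len(failed) == len(users):
            affected[location] = "all users"
        elif len(failed) == 1:
            affected[location] = "one user"
        else:
            affected[location] = "some users"

    if not affected:
        scope = "none"
    elif len(affected) == len(results):
        # Every monitored location reports failures.
        scope = "global"
    else:
        scope = "local"
    return {"scope": scope, "locations": affected}


if __name__ == "__main__":
    sample = {
        "Miami": {"agent-1": True, "agent-2": True},
        "Chicago": {"agent-3": False, "agent-4": False},
    }
    print(isolate_fault(sample))
```

A "local / all users" result points toward the branch router, ISP connection, or power at that site, while a "global" result points toward the network or application layer behind the remote application.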


✓ Active End-to-End Tests
✓ Performance Measurement
✓ Distributed Monitoring
✓ Real-time and Historical Data
✓ Wireless Client Monitoring

Agent types: GigE wired, FastE wired, Wireless, External, Virtual


TESTS, ALERTS AND TARGETS

TEST          Primary Metric                     Secondary Metric
PING          RTT                                Packet Loss
DNS           Query Time                         Failed Queries
HTTP          GET Time                           Failed Requests
Traceroute    Number of Hops                     Hop Count, Hop RTT, Path-MTU
Iperf         TCP/UDP/Multicast data transfer    Bandwidth, Jitter, Packet Loss

ALERT                   Purpose                                 Triggering Rule
Up Down                 Failure detection                       X consecutive tests failed
Performance Baseline    Degradation based on historical data    Short-term average is y times greater than long-term average
Performance Watermark   Degradation based on threshold          Short-term average crosses a specific value
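The three alert triggering rules (Up Down, Performance Baseline, Performance Watermark) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the window sizes, and the reading of "crosses" as "rises above" are not taken from the slides.

```python
from statistics import mean
from typing import Sequence


def up_down_alert(results: Sequence[bool], x: int) -> bool:
    """Up Down: fire when the last x tests all failed (True means success)."""
    if x < 1 or len(results) < x:
        return False
    return not any(results[-x:])


def baseline_alert(samples: Sequence[float], short_window: int,
                   long_window: int, y: float) -> bool:
    """Performance Baseline: short-term average exceeds y times the long-term average."""
    if short_window < 1 or len(samples) < long_window:
        return False
    short_avg = mean(samples[-short_window:])
    long_avg = mean(samples[-long_window:])
    return short_avg > y * long_avg


def watermark_alert(samples: Sequence[float], short_window: int,
                    threshold: float) -> bool:
    """Performance Watermark: short-term average rises above a fixed threshold.

    Treating "crosses" as "rises above" is an assumption for this sketch.
    """
    if short_window < 1 or len(samples) < short_window:
        return False
    return mean(samples[-short_window:]) > threshold


if __name__ == "__main__":
    # Illustrative RTT samples in milliseconds: stable, then degraded.
    rtts = [20.0] * 20 + [100.0] * 5
    print(up_down_alert([True, False, False, False], x=3))
    print(baseline_alert(rtts, short_window=5, long_window=25, y=2.0))
    print(watermark_alert(rtts, short_window=5, threshold=80.0))
```

The baseline rule adapts to each target's normal behavior, while the watermark rule suits targets with a known acceptable limit.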

Examples of resources that can be monitored

Websites and Cloud Services
• Availability and performance
• Connectivity to Internet
• Internal versus external issue

Full Mesh Network
• End-to-End latency and packet loss
• Connectivity across sites
• Configuration changes

Domain Name Services (DNS)
• Availability and performance
• Changes in DNS configurations
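For the full mesh case, every agent tests every other agent. A minimal sketch of how the test pairs could be enumerated is shown here. The function name `full_mesh_tests` is hypothetical, and the site names are used purely as sample agent labels.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def full_mesh_tests(agents: Sequence[str]) -> List[Tuple[str, str]]:
    """Return every ordered (source, destination) pair of distinct agents.

    Ordered pairs are used because latency and loss can differ by direction.
    For n agents this yields n * (n - 1) tests.
    """
    return list(permutations(agents, 2))


if __name__ == "__main__":
    for source, destination in full_mesh_tests(["New York City", "Miami", "Chicago"]):
        print(f"PING {source} -> {destination}")
```

The number of tests grows quadratically with the number of agents, which is worth keeping in mind when choosing test intervals for a large mesh.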


IMPACT OF NETWORK PERFORMANCE ON APPLICATION

From 7 hops to 14 hops due to a routing change
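A routing change like this one can be spotted by comparing two traceroute results. The sketch below assumes each result is already available as a list of hop addresses. The function name `path_change` and the addresses (taken from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24) are illustrative only.

```python
from typing import List, Optional


def path_change(before: List[str], after: List[str]) -> dict:
    """Compare two traceroute paths given as ordered lists of hop addresses."""
    first_divergent: Optional[int] = None
    for index, (old_hop, new_hop) in enumerate(zip(before, after)):
        if old_hop != new_hop:
            first_divergent = index + 1  # hop numbers are 1-based
            break
    if first_divergent is None and len(before) != len(after):
        # One path is a strict prefix of the other.
        first_divergent = min(len(before), len(after)) + 1
    return {
        "changed": before != after,
        "hops_before": len(before),
        "hops_after": len(after),
        "first_divergent_hop": first_divergent,
    }


if __name__ == "__main__":
    old_path = [f"192.0.2.{i}" for i in range(1, 8)]                       # 7 hops
    new_path = old_path[:2] + [f"198.51.100.{i}" for i in range(1, 13)]   # 14 hops
    print(path_change(old_path, new_path))
```

The first divergent hop gives a starting point for finding where the new route was introduced.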


WIRELESS PERFORMANCE DURING PEAK HOURS

High packet loss for WiFi agents

Increased HTTP response time during peak hours
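To compare WiFi agents against wired agents, packet loss can be aggregated per agent group. This is a rough sketch with made-up sample counts. The function name `packet_loss_by_group` and the `(group, sent, received)` tuple layout are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def packet_loss_by_group(samples: Iterable[Tuple[str, int, int]]) -> Dict[str, float]:
    """Return packet loss percentage per group from (group, sent, received) samples."""
    sent: Dict[str, int] = defaultdict(int)
    received: Dict[str, int] = defaultdict(int)
    for group, packets_sent, packets_received in samples:
        sent[group] += packets_sent
        received[group] += packets_received
    # Groups with no packets sent are skipped to avoid dividing by zero.
    return {
        group: 100.0 * (sent[group] - received[group]) / sent[group]
        for group in sent
        if sent[group] > 0
    }


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the event.
    samples = [("wifi", 100, 90), ("wifi", 100, 80), ("wired", 100, 100)]
    print(packet_loss_by_group(samples))
```

Running the same aggregation per hour of the day would show whether the loss is concentrated in peak hours.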

[Diagram: InteropNET Las Vegas 2015, Layer 1. PED nodes (PED2 through PED224) serving the NOC, ERBCC, Media Ctr, Core, SDN 1 to SDN 4, Show Floor, and CC2/CC3 locations, with Clink SFO and Clink DEN / LAS links. Legend: Single Mode, Multi Mode (MMF 3pr each PED), Cat6, MPO, Show Floor Booths.]

DISTRIBUTED NETWORK MONITORING AT INTEROP

• Zone 1 (show floor)
• Zone 2 (off show floor)
• Communication Point (patch panel)
• Uplinks


LOCAL VERSUS GLOBAL

• PING test zone 1 agent to zone 2 agent
• PING test zone 1 agent to Google
• PING test zone 1 agent to zone 1 agent

Root Cause: Disconnection of the patch panel, the communication point between the two zones and uplink to Internet (planned outage)
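The reasoning behind the three PING tests can be sketched as a small decision function. The slides do not state which tests failed, so this example assumes that the test within zone 1 kept passing while the tests to zone 2 and to Google failed, which is consistent with the stated root cause. The function name `locate_fault` and its return labels are hypothetical.

```python
def locate_fault(intra_zone_ok: bool, cross_zone_ok: bool, internet_ok: bool) -> str:
    """Narrow down a fault from three PING tests run by a zone 1 agent.

    intra_zone_ok: zone 1 agent to another zone 1 agent
    cross_zone_ok: zone 1 agent to a zone 2 agent
    internet_ok:   zone 1 agent to an Internet target
    """
    if intra_zone_ok and cross_zone_ok and internet_ok:
        return "no fault"
    if not intra_zone_ok:
        # Even the nearest target is unreachable: the problem is local.
        return "local to zone 1"
    if not cross_zone_ok and not internet_ok:
        # Everything beyond zone 1 fails: suspect the shared communication point.
        return "shared path"
    if not cross_zone_ok:
        return "inter-zone link or zone 2"
    return "uplink or Internet target"


if __name__ == "__main__":
    # Assumed outcome during the planned patch panel disconnection.
    print(locate_fault(intra_zone_ok=True, cross_zone_ok=False, internet_ok=False))
```

Running the same tests from several agents in each zone makes this kind of elimination possible, since a single vantage point cannot tell a local fault from a global one.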

Request a free trial

https://netbeez.net/request-trial/


Q&A