NetFlow: Digging Flows Out of the Traffic
Evandro de Souza
ESnet
ESnet Site Coordinating Committee Meeting
Columbus/OH – July/2004
July/2004 ESCC Meeting - Columbus/OH
Outline
- Motivation
- Possible Approaches
- What is NetFlow
- Solution Design
- Snapshots
- Trouble-Shooting Example
- Present State
Motivation
- CHALLENGE
  - Steve Wolf challenge: "Show me all traffic exchanged between ESnet and Abilene."
  - Generalized challenge: show ingress and egress traffic exchanged with ESnet, broken down by AS.
- MAIN REQUIREMENTS
  - Ability to identify the top 100 flows involving institutions directly using ESnet
  - Ability to identify AS-AS traffic
  - Ability to visualize the top 10 flows and their evolution during a period of time
  - Scalability to process data from all ESnet border routers
Solutions Available
- Hardware Solutions
  - Dedicated Router Monitoring Board
    - Example: Juniper's Monitoring Services PIC
    - Manufacturer dependent
    - Very expensive
  - Dedicated Link Monitoring Box
    - Example: BSD box using Bro
    - Scalability issues
    - Real-time information about routing tables
- Software Solutions
  - Example: NetFlow
    - Adopted by several router and switch products (Cisco, Juniper, etc.)
    - May require huge computing power to process data from large networks
NetFlow Characteristics (1)
- What is a Flow?
  - A flow is defined as a unidirectional stream of packets. It is uniquely identified by the combination of the following seven key fields:
    - Source IP address
    - Destination IP address
    - Source port number
    - Destination port number
    - Layer 3 protocol type
    - ToS byte
    - Input logical interface (ifIndex)
  - It’s not a TCP flow.
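The seven-field definition above can be illustrated with a small Python sketch. The `FlowKey` type, the dict-shaped packets, and the addresses and ports are all hypothetical, chosen only for illustration. The sketch shows why a NetFlow flow is not a TCP flow: the two directions of one TCP connection produce two different keys.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven key fields that identify a NetFlow flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int       # Layer 3 protocol type, e.g. 6 for TCP
    tos: int            # ToS byte
    input_ifindex: int  # input logical interface (ifIndex)


def flow_key(packet: dict) -> FlowKey:
    """Build the flow key from a packet given as a plain dict (hypothetical shape)."""
    return FlowKey(
        packet["src_ip"], packet["dst_ip"],
        packet["src_port"], packet["dst_port"],
        packet["protocol"], packet["tos"], packet["input_ifindex"],
    )


# Two directions of one TCP connection (illustrative values only).
forward = {"src_ip": "192.0.2.1", "dst_ip": "198.51.100.7", "src_port": 40000,
           "dst_port": 80, "protocol": 6, "tos": 0, "input_ifindex": 1}
reverse = {"src_ip": "198.51.100.7", "dst_ip": "192.0.2.1", "src_port": 80,
           "dst_port": 40000, "protocol": 6, "tos": 0, "input_ifindex": 2}

# Each direction has its own key, so one connection yields two flows.
flows = {flow_key(forward), flow_key(reverse)}
print(len(flows))
```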
NetFlow Characteristics (2)
- NetFlow Packet Version 5
  - Packet Count
  - Byte Count
  - Start sysUpTime
  - End sysUpTime
  - Input ifIndex
  - Output ifIndex
  - Type of Service
  - TCP Flags
  - Protocol
  - Source IP Address
  - Destination IP Address
  - Source TCP/UDP Port
  - Destination TCP/UDP Port
  - Next Hop Address
  - Source AS Number
  - Destination AS Number
  - Source Prefix Mask
  - Destination Prefix Mask
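As a rough illustration of how these fields travel on the wire, the sketch below unpacks one 48-byte NetFlow version 5 flow record with Python's `struct` module. The field order follows the commonly documented v5 export format. The function name, the dictionary keys, and the sample values are assumptions made for this example, and the 24-byte export header that precedes the records is not handled.

```python
import socket
import struct

# One NetFlow v5 flow record: 48 bytes, network byte order.
# srcaddr, dstaddr, nexthop, input, output, dPkts, dOctets, first, last,
# srcport, dstport, pad, tcp_flags, prot, tos, src_as, dst_as,
# src_mask, dst_mask, pad
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def parse_v5_record(data: bytes) -> dict:
    """Decode a single v5 flow record into a dict (key names are illustrative)."""
    (src, dst, nexthop, in_if, out_if, packets, octets, first, last,
     sport, dport, _pad1, tcp_flags, proto, tos,
     src_as, dst_as, src_mask, dst_mask, _pad2) = V5_RECORD.unpack(data)
    return {
        "src_ip": socket.inet_ntoa(src),
        "dst_ip": socket.inet_ntoa(dst),
        "next_hop": socket.inet_ntoa(nexthop),
        "input_ifindex": in_if, "output_ifindex": out_if,
        "packets": packets, "octets": octets,
        "start_uptime_ms": first, "end_uptime_ms": last,
        "src_port": sport, "dst_port": dport,
        "tcp_flags": tcp_flags, "protocol": proto, "tos": tos,
        "src_as": src_as, "dst_as": dst_as,
        "src_mask": src_mask, "dst_mask": dst_mask,
    }


# Synthetic record built only to exercise the parser (made-up values).
sample = V5_RECORD.pack(
    socket.inet_aton("192.0.2.1"), socket.inet_aton("198.51.100.7"),
    socket.inet_aton("203.0.113.1"), 1, 2, 10, 1500, 1000, 2000,
    40000, 80, 0, 0x18, 6, 0, 64500, 64501, 24, 24, 0,
)
record = parse_v5_record(sample)
print(record["src_ip"], record["dst_ip"], record["octets"])
```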
System Architecture
- Network Statistics System (Linux Cluster)
  - Collectors
  - Web Servers
  - Computing Nodes
  - Disk Storage
- Software Tools
  - Flow-Tools (OSU)
  - Perl
  - MySQL
- Data Flow Processing
  - Router sends NetFlow
  - Collectors scale up and store raw data
  - Cluster performs:
    - Intercloud filtering
    - Aggregation
    - Sorting
    - Truncation (Top 100)
    - SQL Store
  - Display Data
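The cluster steps listed above (filtering, aggregation, sorting, truncation) can be sketched in a few lines of Python. This is not the system's actual code, which the slides say is built on Flow-Tools, Perl, and MySQL. The record shape, the `keep` predicate standing in for "intercloud filtering" (assumed here to mean keeping traffic between different ASes), and the AS numbers are all hypothetical. The SQL store step is left out.

```python
from collections import defaultdict


def top_flows(records, keep, n=100):
    """Filter records, sum octets per (src AS, dst AS) pair, sort, keep the top n."""
    totals = defaultdict(int)
    for rec in records:
        if keep(rec):                                   # filtering
            totals[(rec["src_as"], rec["dst_as"])] += rec["octets"]  # aggregation
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)  # sorting
    return ranked[:n]                                   # truncation


# Illustrative records using AS numbers reserved for documentation.
records = [
    {"src_as": 64500, "dst_as": 64501, "octets": 1000},
    {"src_as": 64500, "dst_as": 64501, "octets": 500},
    {"src_as": 64501, "dst_as": 64500, "octets": 700},
    {"src_as": 64500, "dst_as": 64500, "octets": 9000},  # same AS, filtered out
]

top = top_flows(records, keep=lambda r: r["src_as"] != r["dst_as"], n=2)
for (src_as, dst_as), octets in top:
    print(src_as, dst_as, octets)
```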
Data Accuracy
- ESnet has a variety of router models from Cisco and Juniper. The two companies take different approaches to generating NetFlow information.
- Cisco
  - Conditions for end of a flow
    - end of TCP connection (FIN/RST)
    - no traffic seen on a flow for 15 seconds
    - 30 minutes after the flow starts
    - when the flow table fills
  - No sampling for models lower than 12000
- Juniper
  - Statistical sampling per interface
- We compared the information obtained from NetFlow data against SNMP data
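The Cisco end-of-flow conditions on this slide can be expressed as a small decision function. This is a sketch under stated assumptions, not router code: the `FlowState` type and the reason strings are hypothetical, the end of a TCP connection is assumed to be signaled by a FIN or RST flag, and the 15-second and 30-minute values are taken from the slide.

```python
from dataclasses import dataclass
from typing import Optional

INACTIVE_TIMEOUT = 15      # seconds without traffic on the flow
ACTIVE_TIMEOUT = 30 * 60   # seconds since the flow started
TCP_FIN, TCP_RST = 0x01, 0x04
PROTO_TCP = 6


@dataclass
class FlowState:
    start: float      # time the first packet was seen
    last_seen: float  # time the most recent packet was seen
    protocol: int     # IP protocol number
    tcp_flags: int    # OR of all TCP flags seen so far


def expiry_reason(flow: FlowState, now: float, table_full: bool = False) -> Optional[str]:
    """Return why the flow should be exported now, or None if it stays in the table."""
    if flow.protocol == PROTO_TCP and flow.tcp_flags & (TCP_FIN | TCP_RST):
        return "tcp-end"
    if now - flow.last_seen >= INACTIVE_TIMEOUT:
        return "inactive"
    if now - flow.start >= ACTIVE_TIMEOUT:
        return "active-timeout"
    if table_full:
        return "table-full"
    return None


# A long-lived TCP transfer that is still sending: it is cut after 30 minutes.
bulk = FlowState(start=0.0, last_seen=1799.0, protocol=PROTO_TCP, tcp_flags=0x10)
print(expiry_reason(bulk, now=1800.0))
```

One consequence worth noting is that a single long transfer is reported as several flow records, one per 30-minute slice, so records must be aggregated before flows are ranked.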
SNMP Comparison (Juniper)
SNMP Comparison (Cisco)
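The comparison charts are not reproduced in this transcript. As a hedged sketch of how such a check might be computed, the function below scales the octets reported by NetFlow by the sampling rate and compares the result with the change in an SNMP interface octet counter over the same interval. The function name, its parameters, and the sample numbers are assumptions, not the method used for the charts.

```python
def netflow_vs_snmp(flow_octets, sampling_rate, snmp_start, snmp_end, counter_bits=64):
    """Compare a NetFlow byte estimate with an SNMP counter delta.

    flow_octets   -- octet counts of the flow records in the interval
    sampling_rate -- 1 for unsampled export, N when 1 in N packets is sampled
    snmp_start/snmp_end -- interface octet counter at both ends of the interval
    """
    estimated = sum(flow_octets) * sampling_rate
    # The modulo handles a counter that wrapped once during the interval.
    snmp_delta = (snmp_end - snmp_start) % (2 ** counter_bits)
    ratio = estimated / snmp_delta if snmp_delta else None
    return estimated, snmp_delta, ratio


# Made-up numbers: sampled export at 1 in 10, SNMP counter read twice.
estimated, snmp_delta, ratio = netflow_vs_snmp([100, 200], 10, 1000, 4000)
print(estimated, snmp_delta, ratio)
```

A ratio near 1 suggests the NetFlow data accounts for the traffic SNMP saw on the interface. With sampled export the estimate is statistical, so some deviation is expected.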
User Interface
- Long Term Analysis
  - Use data stored in SQL database
  - Trend analysis
- Short Term Analysis
  - Use raw data collected from routers
  - Network troubleshooting
Top Flows Screenshot - 1
Top Flows Screenshot - 2
Top Flows Screenshot - 3
Top Flows Screenshot - 4
Trouble-Shooting Example (1)
- Issue
  - Regular egress discards on OC12 POS between FNAL-RT1 router and CHI-CR1 router.
- Topology
  - FNAL CE --(GE)--> FNAL-RT1 --(OC12 POS)--> CHI-CR1
- Hypothesis
  - Traffic from FNAL GE connection (FNAL CE -> FNAL-RT1) was over-running OC12 POS (FNAL-RT1 -> CHI-CR1)
Trouble-Shooting Example (2)
- Flow Analysis
  - Isolate flows within discard time window
  - Mark time window by referencing "originating file"
  - Sort by "octets" field
- Verification
  - Reroute 198.49.208.10 (dmzmon0.deemz.net) via an alternate route

#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 3
# Name:      Source/Destination IP
#
# Args:      flow-stat -f10 -S3
#
#
# src IPaddr     dst IPaddr       flows  octets      packets  originating file
#
129.105.21.229   198.49.208.10    193    1140264700  1014000  fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
129.105.21.229   198.49.208.10    174    1138227500  1014600  fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
198.49.208.10    129.105.21.229   196    1106719500  1114000  fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
129.105.21.229   198.49.208.10    175    1086035800  980500   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
198.49.208.10    128.100.190.11   182    1085264900  980500   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
198.49.208.10    128.100.190.11   213    1062479100  960000   fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
198.49.208.10    129.105.21.229   180    1051220800  1093500  fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
128.100.190.11   198.49.208.10    242    1012027800  842100   fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
198.49.208.10    128.100.190.11   206    1007483100  916300   fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
128.100.190.11   198.49.208.10    200    1001671900  842300   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
128.100.190.11   198.49.208.10    231    989225200   817700   fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
198.49.208.10    129.105.21.229   211    957567200   1050100  fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
131.215.144.227  198.49.208.10    198    946292400   876500   fnal-rt1.burst.2004-06-23.2050-2004-06-23.2055
131.215.144.227  198.49.208.10    209    936021800   882900   fnal-rt1.burst.2004-06-24.0850-2004-06-24.0855
131.215.144.227  198.49.208.10    196    932688300   857700   fnal-rt1.burst.2004-06-24.0250-2004-06-24.0255
131.215.144.227  198.49.208.10    206    904774900   848500   fnal-rt1.burst.2004-06-24.0650-2004-06-24.0655
…
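The flow-analysis steps on this slide (isolate by time window, sort by octets) can be sketched against the flow-stat report itself. The snippet below is an illustration only: it embeds four rows copied from the report, and the helper names `parse_report` and `in_window` are hypothetical. It also counts how often each host appears in the selected window, which is one way to arrive at a suspect such as 198.49.208.10.

```python
from collections import Counter

# Four rows copied from the flow-stat report; "#" lines are report comments.
REPORT = """\
# src IPaddr dst IPaddr flows octets packets originating file
129.105.21.229 198.49.208.10 193 1140264700 1014000 fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
129.105.21.229 198.49.208.10 175 1086035800 980500 fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
198.49.208.10 128.100.190.11 213 1062479100 960000 fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
128.100.190.11 198.49.208.10 242 1012027800 842100 fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
"""


def parse_report(text):
    """Turn report rows into dicts, skipping blank and comment lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst, flows, octets, packets, origin = line.split()
        rows.append({"src": src, "dst": dst, "flows": int(flows),
                     "octets": int(octets), "packets": int(packets),
                     "file": origin})
    return rows


def in_window(rows, window):
    """Keep rows whose originating file names the window, largest octets first."""
    selected = [r for r in rows if window in r["file"]]
    return sorted(selected, key=lambda r: r["octets"], reverse=True)


burst = in_window(parse_report(REPORT), "2004-06-23.2120-2004-06-23.2125")
hosts = Counter(ip for r in burst for ip in (r["src"], r["dst"]))
print(hosts.most_common(1))
```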
Present State of Development
- Porting application to Cluster
  - Some problems on the OS and Disk Array
- Testing Scalability of the System
  - Amount of disk space necessary per day to store data for all border routers
  - CPU and Memory necessary to process data
  - Other issues
- Developing a Web Interface to display the stored data
Small Flows Percentage