100G Network Monitoring with Bro and Time Machine

Vincent Stoffer
Cyber Security Engineer

CENIC Conference
March 11th, 2015
Irvine, CA

UNIVERSITY OF CALIFORNIA

CENIC 2015

Agenda

● Intro / overview
● 100G monitoring challenges
● Bro!
● Time Machine
● Questions

Overview

Lawrence Berkeley National Laboratory
● Located in Berkeley, CA
● "Bringing science solutions to the world"
● Unclassified DoE research facility operated by University of California
● Functions much like a research university

Computing overview

● ~5,000 users, ~10,000 hosts
● Distributed computing resources
● Many guests and visitors
● Open network to enable collaboration and research

100G monitoring challenges

● Expensive hardware
  ○ No “product” solution
● Overall traffic volume
  ○ Overwhelming sensors
  ○ Log volumes
● Elephant flows
  ○ Scaling up and down
● Maintain same visibility and protections

Overview

● Optical taps
  ○ 100G, 10G, 1G
● Collect at packet broker
  ○ Previously expensive proprietary hardware
  ○ Merchant silicon changed the game
● Send out to monitoring devices

Apcon, 10G monitor devices, installed @ LBL 2007

cPacket cVu, 10G monitor devices, installed @ LBL 2011

Arista, 100G monitor devices, installed @ LBL 2015

10G @ LBL since 2007

● Mostly flat network
● Simple tapping setup
  ○ External & Internal
  ○ Dynamic “firewall” in the middle
● Apcon -> cPacket tapping infrastructure


100G Berkeley Lab approach

● Scale up our setup on 10G
● Moving from duplication to advanced aggregation
● New device needed
● Based on previous work from Scott Campbell at NERSC

100G Device requirements

● 100G and 10G ports
● Filtering at ingress & egress
● Port speed agnostic
● Aggregation, symmetric load-balancing
● No oversubscription limits
● API for dynamic filtering/shunting
● Filtering for arbitrary IP headers and TCP flags
● Every port can be input/output
● Create port groups
● Send output to load-balanced groups and single ports
● IPv6 support
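The "symmetric load-balancing" requirement means both directions of a connection must land on the same output port, so one sensor sees the whole conversation. A minimal sketch of the idea follows; the CRC32 hash, the key layout, and the example addresses are assumptions chosen for illustration, not the algorithm any particular switch uses.

```python
import zlib

def symmetric_bucket(src_ip, src_port, dst_ip, dst_port, proto, n_outputs):
    """Map a flow to one of n_outputs ports, identically for both directions."""
    # Sort the two endpoints so (A -> B) and (B -> A) produce the same key.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a[0]}:{a[1]}|{b[0]}:{b[1]}|{proto}".encode()
    return zlib.crc32(key) % n_outputs

# Hypothetical flow, hashed in both directions across 5 output ports.
fwd = symmetric_bucket("128.3.0.10", 45191, "207.62.80.166", 80, "tcp", 5)
rev = symmetric_bucket("207.62.80.166", 80, "128.3.0.10", 45191, "tcp", 5)
print(fwd == rev)  # True
```

Sorting the endpoints before hashing is the whole trick: any hash works as long as its input does not depend on packet direction.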

100G Monitoring device options

● Commercial / Appliance
● Commodity network (proprietary / hybrid)
● Commodity network + SDN (SciPass / FlowScale)

100G Monitoring device eval

● Endace Access
● Brocade MLXe
● Arista 7504+7150

We chose Arista

● Flexible interface including GUI
● High density - 6 port 100G line card (supports LR-4) plus 144 10G ports!
● Easy to use API
  ○ dynamic shunting!
● Relatively low cost
● Lots of peers using it


Arista 7504

Arista 7150

10G cluster (cPacket + Force10 + 12 Supermicro servers), at LBL since 2007

Cluster-in-a-box (Arista + Myricom + 1 Supermicro server)

General Architecture

● Split the 100Gb link into 5 (or more) streams of 10G, one to feed each node
● Further divide each 10G stream into 10x1Gb, so each of the worker nodes sees 1/50th of the traffic
● When our sustained traffic is 20Gbps (high estimate), each worker sees about 400 Mbps of the traffic
● Scale up as necessary
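The per-worker figure follows directly from the two-stage split. A small sketch of the arithmetic, with the function and parameter names invented for illustration:

```python
def per_worker_mbps(sustained_gbps, streams=5, workers_per_stream=10):
    """Traffic seen by one worker when load is spread evenly over all workers."""
    workers = streams * workers_per_stream   # 5 x 10 = 50 workers
    return sustained_gbps * 1000 / workers   # Gb/s -> Mb/s, then divide

print(per_worker_mbps(20))  # 400.0
```

Raising `streams` (the "or more" above) is the scale-up knob: more 10G streams means more workers and a smaller share for each.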

Network cards - Myricom

● Sniffer10G
  ○ Support for Linux, FreeBSD
  ○ Myricom 10G cards only
  ○ Supports only one tool in 2.0 (multiple tools in 3.0)
  ○ Company/IP in some flux


Traffic Distribution to the Cluster


Traffic per “node”

Shunting

● “Heavy Tail Effect*” is the observation that a small number of network flows will dominate the overall volume of data transferred for a given time
● By detecting and removing the data component of these “heavy tail” flows, analysis load is dramatically reduced without sacrificing security

*Scott Campbell’s work
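To make the heavy-tail observation concrete, the sketch below measures what fraction of all bytes the largest few flows carry. The byte counts are made up for illustration and are not LBL measurements.

```python
def heavy_tail_share(flow_bytes, top_n):
    """Fraction of total bytes carried by the top_n largest flows."""
    ordered = sorted(flow_bytes, reverse=True)
    total = sum(ordered)
    return sum(ordered[:top_n]) / total if total else 0.0

# Hypothetical byte counts: two elephant flows among many small ones.
flows = [40_000_000_000, 25_000_000_000] + [50_000] * 10_000
share = heavy_tail_share(flows, top_n=2)
print(f"top 2 of {len(flows)} flows carry {share:.1%} of bytes")
```

When a handful of flows account for nearly all bytes, dropping their payload removes most of the analysis load while every other connection is still seen in full.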

Filters for Shunting

● Exclusions (IP pairs, netblocks, ports/protocols)
  ○ Research networks / affiliates
  ○ Resnet?
● Identify Elephant flows
  ○ Allow control traffic
● Dynamic - Holy Grail
  ○ Bro, API, near real time

Dumbno

● Python program for shunting
● Written by Justin Azoff
● Uses Arista JSON API to add ACLs which allow only control packets
● Bro’s reaction framework feeds data in real time
● Connection details are preserved
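The effect of an ACL that "allows only control packets" can be modelled in a few lines. This is a sketch of the filtering logic only, not Dumbno's code and not Arista ACL syntax; treating "control packets" as TCP segments with SYN, FIN, or RST set is an assumption, and the flow tuple is hypothetical.

```python
# TCP flag bits as defined in the TCP header (RFC 793).
FIN, SYN, RST = 0x01, 0x02, 0x04

def passes_shunt(flow, tcp_flags, shunted_flows):
    """Return True if the packet should still reach the sensor."""
    if flow not in shunted_flows:
        return True                               # unshunted flows are seen in full
    return bool(tcp_flags & (FIN | SYN | RST))    # shunted: control packets only

# Hypothetical elephant flow that has been shunted.
shunted = {("128.3.0.10", 50000, "198.51.100.7", 2811, "tcp")}
flow = next(iter(shunted))
print(passes_shunt(flow, 0x10, shunted))  # False: plain ACK segment is dropped
print(passes_shunt(flow, 0x11, shunted))  # True: FIN+ACK still reaches the sensor
```

Under this assumption the sensor still observes the teardown of a shunted connection, which is consistent with connection details being preserved.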


Load Balancer        Traffic split/node             IDS   UNIX OS
Arista (7504+7150)   Myricom 10G-PCIE2-8C2,         Bro   FreeBSD-10.1
                     Myricom 10G sniffer drivers

Load Balancer   Traffic split/node         IDS          UNIX OS
● Arista        ● PF_RING                  ● Snort      ● Linux
● Brocade       ● Packet Bricks + netmap   ● Suricata   ● FreeBSD
● Endace        ● Endace DAG
● Gigamon
● OpenFlow
● others?

This table provides alternative tools and technologies for various parts of a 100G monitoring system.


Questions??

BROverview

Open Source Network Monitoring Philosophy

● Know thy network
● Focus on people not products
● Commodity hardware
● UNIX/Linux focused
● Free & open source software
● Super adaptable


What is Bro? www.bro.org

Not your typical IDS/IPS

● A monitoring platform
  ○ A standalone network monitor
  ○ A programmable framework
  ○ An ecosystem

Bro History

Hardware

● Commodity servers (Supermicro)
● Linux/FreeBSD
● Network cards (Intel, Myricom, high end DAG)

[Diagram: Bro platform. Layers: Tap, Network Traffic, Packet Processing, Programming Language, Standard Library. Apps: Intrusion Detection, Vuln Mgmt, File Analysis, Log Recording, Custom Logic]


Bro log types

● Connection logs
● Protocol logs
● Custom logs
● Alerting and debug logs
● Log formats:
  ○ ASCII (plain text, default)
  ○ Elasticsearch
  ○ SQLite
  ○ Dataseries (HP) binary output

> ls *.log

app_stats.log       notice.log
communication.log   reporter.log
conn.log            smtp.log
dhcp.log            socks.log
dns.log             software.log
dpd.log             ssh.log
files.log           ssl.log
ftp.log             stderr.log
http.log            stdout.log
irc.log             syslog.log
known_certs.log     traceroute.log
known_hosts.log     tunnel.log
weird.log           modbus.log

Bro connection logs (conn.log)

● Netflow ++
● Stateful connection records
● Includes “originator” and “responder”
● Total byte counts, connection times, history and more

conn.log

Mar 3 16:35:36 ClmuHr1gC6p76JbdVl 128.3.x.x 45191 207.62.80.166 80 tcp http 0.023945 351 9886 SF T 0 ShADadfF 6 671 11 10466 (empty) worker-2-5

Field        Value                Description
ts           1425429336.809148    UNIX timestamp
uid          ClmuHr1gC6p76JbdVl   Unique ID
id.orig_h    128.3.x.x            Originator IP
id.orig_p    45191                Originator port
id.resp_h    207.62.80.166        Responder IP
id.resp_p    80                   Responder port
proto        tcp                  IP Protocol
service      http                 Application protocol
duration     0.023945             Duration
orig_bytes   351                  Bytes by originator
resp_bytes   9886                 Bytes by responder
history      ShADadfF             State history
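Because the ASCII log is tab-separated, a record can be turned into a typed dictionary with a few lines of Python. The sketch below uses only the leading columns listed in the field table and the values from the example record; `parse_conn_line` is a name made up here. It assumes every numeric field is set, whereas real logs mark unset fields with "-", which would need extra handling.

```python
from datetime import datetime, timezone

# Leading columns from the field table; a real conn.log has more columns after these.
FIELDS = ["ts", "uid", "id.orig_h", "id.orig_p", "id.resp_h", "id.resp_p",
          "proto", "service", "duration", "orig_bytes", "resp_bytes"]

def parse_conn_line(line):
    """Split one tab-separated record and convert the numeric columns."""
    rec = dict(zip(FIELDS, line.rstrip("\n").split("\t")))
    rec["ts"] = datetime.fromtimestamp(float(rec["ts"]), tz=timezone.utc)
    for name in ("id.orig_p", "id.resp_p", "orig_bytes", "resp_bytes"):
        rec[name] = int(rec[name])
    rec["duration"] = float(rec["duration"])
    return rec

line = "\t".join(["1425429336.809148", "ClmuHr1gC6p76JbdVl", "128.3.x.x", "45191",
                  "207.62.80.166", "80", "tcp", "http", "0.023945", "351", "9886"])
rec = parse_conn_line(line)
print(rec["ts"].isoformat(), rec["id.resp_p"], rec["resp_bytes"])
```

The timestamp converts to 2015-03-04 00:35:36 UTC, which is Mar 3 16:35:36 in Pacific Standard Time, matching the human-readable form of the record.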

Bro application logs

● Full protocol level details
● Configurable
● Unique ID consistent across all logs
● Contents based on protocol

dns.log

Mar 3 16:35:36 CHlGTa39L4ViNKf5wb 128.3.x.x 32609 131.243.5.1 53 udp 52600 cenic2015.cenic.org 1 C_INTERNET 1 A 0 NOERROR F F T T 0 207.62.80.166 7973.000000 F

http.log

Mar 3 16:35:36 ClmuHr1gC6p76JbdVl 128.3.x.x 45191 207.62.80.166 80 1 GET cenic2015.cenic.org / - Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36 0 9695 200 OK - - - (empty) - - - - - FrQ9Ct3IucTKymFao7 text/html HOST,CONNECTION,ACCEPT,USER-AGENT,DNT,ACCEPT-ENCODING,ACCEPT-LANGUAGE - - /
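The unique ID is what ties the logs together: the http.log record here carries the same uid as the earlier conn.log record. A minimal sketch of correlating already-parsed records by uid follows; the dictionaries are hand-built from the example records, and `group_by_uid` is a hypothetical helper, not part of Bro.

```python
from collections import defaultdict

def group_by_uid(*logs):
    """Merge records from several logs into {uid: {log_name: [records]}}."""
    merged = defaultdict(lambda: defaultdict(list))
    for log_name, records in logs:
        for rec in records:
            merged[rec["uid"]][log_name].append(rec)
    return merged

conn = [{"uid": "ClmuHr1gC6p76JbdVl", "service": "http", "resp_bytes": 9886}]
http = [{"uid": "ClmuHr1gC6p76JbdVl", "method": "GET",
         "host": "cenic2015.cenic.org", "status_code": 200}]
dns = [{"uid": "CHlGTa39L4ViNKf5wb", "query": "cenic2015.cenic.org"}]

by_uid = group_by_uid(("conn", conn), ("http", http), ("dns", dns))
print(sorted(by_uid["ClmuHr1gC6p76JbdVl"]))  # ['conn', 'http']
```

The DNS lookup has its own uid because it is a separate connection, so linking it to the HTTP request would need a different key, such as the queried host name.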

Great, but what do I need all that for?

● Ground truth for your network (Know thy network)
● Troubleshooting
● Analytics / reporting
● DFIR
● Use to build alerts and take actions

Know thy network - examples

● Basic logs
  ○ Connections
  ○ HTTP
  ○ SMTP
  ○ DNS


Notices / Alerts

● Bro is event based
● Almost any event can trigger a notice (notice.log)
● Then you can take action
● More typical IDS function

Some example notices

Address_Seen
Scan::Address_Scan
Scan::Port_Scan
SSH::Password_Guessing
Traceroute::Detected
NTP::NTP_Monlist_Queries
SSL::Invalid_Server_Cert
SMTPurl::SMTP_Link_in_EMAIL_Clicked
SMTPurl::SMTP_WatchedFileType
SMTPurl::SMTP_Embeded_Malicious_URL
HTTP::HTTP_SensitiveURI
HTTP::SQL_Injection_Attacker
Software::Vulnerable_Version
TeamCymruMalwareHashRegistry::Match

Alert actions

● Notify via email/SMS/etc.
● Shell scripts
● Firewall/device integration
● ACLd
● Total flexibility


Bro policy

● Core - Generates events
● Scripting - Does stuff with them

Not a “signature”, though of course there is a way to do that :)

Bro policy philosophy

● Don’t ask what Bro can do; better to ask: what do you want to do?
  ○ NTP monlist
  ○ SIP scanners
  ○ Tor ban
  ○ SMTP URL
  ○ SSH foreign login

Beyond Bro?

● But Bro can do everything??!!
● Bro provides us amazing metadata and beyond, but we sometimes need more
● Enter Time Machine


Time machine??

Time Machine background

● Stefan Kornexl
● Graduate thesis project
● Technische Universität München

Stefan Kornexl, Vern Paxson, Holger Dreger, Anja Feldmann, and Robin Sommer. 2005. Building a time machine for efficient recording and retrieval of high-volume network traffic. In Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement (IMC '05). USENIX Association, Berkeley, CA, USA

Time Machine

● Creates pcap files with indexes
● Killer feature: "connection cutoff"
● Cutoffs defined per port
● Assumption: interesting stuff in the first N bytes

Time Machine config

class "smtp" {
    filter "port 25 or port 587";
    cutoff 25m;
    filesize 2000m;
}

class "encrypted" {
    filter "port 22 or port 443";
    cutoff 500k;
    filesize 2000m;
}
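The cutoff semantics in these classes can be sketched as a per-connection byte counter: packets are stored until the connection has exceeded its class cutoff, then skipped. This models the idea only and is not Time Machine's implementation; the `CutoffRecorder` class is invented for illustration, and reading `25m` and `500k` as decimal bytes is an assumption.

```python
# Cutoffs in bytes, taken from the config classes (decimal units assumed).
CUTOFFS = {"smtp": 25 * 1000 * 1000, "encrypted": 500 * 1000}

class CutoffRecorder:
    def __init__(self, cutoffs):
        self.cutoffs = cutoffs
        self.seen = {}       # connection id -> bytes seen so far
        self.stored = 0      # total bytes written to disk

    def offer(self, conn_id, traffic_class, packet_len):
        """Store the packet only while the connection is under its cutoff."""
        seen = self.seen.get(conn_id, 0)
        self.seen[conn_id] = seen + packet_len
        if seen >= self.cutoffs[traffic_class]:
            return False     # past the cutoff: skip the rest of this connection
        self.stored += packet_len
        return True

# A hypothetical 1 MB SSH transfer sent as 1000-byte packets.
rec = CutoffRecorder(CUTOFFS)
for _ in range(1000):
    rec.offer("conn-1", "encrypted", 1000)
print(rec.stored)  # 500000
```

Only the first half of the transfer is kept, which is where handshakes and negotiation live; the bulk encrypted payload that follows is rarely useful to an analyst.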

Traffic numbers

● Average 2-4 Gb/s
● Spikes to 10-20 Gb/s
● Roughly 25 TB / day full traffic
● 750 TB / month!
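These figures are consistent with each other, as a quick check shows. The sketch assumes decimal units (1 TB = 10^12 bytes) and a 30-day month; the function name is made up.

```python
def avg_gbps(tb_per_day):
    """Average rate in Gb/s for a daily volume given in decimal terabytes."""
    return tb_per_day * 1e12 * 8 / 86_400 / 1e9

print(round(avg_gbps(25), 2))  # 2.31
print(25 * 30)                 # 750
```

25 TB per day works out to about 2.3 Gb/s sustained, inside the stated 2-4 Gb/s average, and thirty such days give the 750 TB monthly figure.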

Storage

● Our goal was 6 months of packet capture
● With full traffic, we could do <1 week
● After multiple iterations/tuning of our buckets

March 2015 config

bucket       capture (MB)   daily GB   6mo TB
http         5              500
smtp         25             50
encrypted    500k           200
udp          5              20
icmp         64k            1
53 tcp/udp   5              15
else         5              150
TOTAL                       936        170

From 750TB / month!
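The totals can be reproduced from the per-bucket daily volumes. A minimal check, assuming about 182 days in six months and decimal units (1 TB = 1000 GB):

```python
daily_gb = {"http": 500, "smtp": 50, "encrypted": 200, "udp": 20,
            "icmp": 1, "53 tcp/udp": 15, "else": 150}

total_daily_gb = sum(daily_gb.values())
six_month_tb = total_daily_gb * 182 / 1000   # assuming ~182 days of capture
full_capture_tb = 750 * 6                    # 750 TB/month kept for six months

print(total_daily_gb)                         # 936
print(round(six_month_tb))                    # 170
print(round(full_capture_tb / six_month_tb))  # 26
```

Under these assumptions, six months of cutoff-limited capture needs roughly 26 times less storage than six months of full capture would.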

But it’s not full packet capture...

● Unless you are under regulatory requirements, doing full packet capture is probably wrong
● Once tuned, we want more horizontal but not more vertical (shallow TM)
● Incidents (SIP)

Bucket   Number of conns   Threshold   Conns < threshold   Conns > threshold   Capture coverage with threshold (%)   Capture size   Actual traffic on the wire
udp      13,149,143        5M          13,142,093          7,050               99.94                                 20 G           400 G
http     21,586,940        5M          21,568,519          18,421              99.91                                 480 G          6100 G
https    8,332,603         500K        8,207,340           125,263             98.49                                 200 G          2300 G
icmp     5,168,723         64K         5,168,004           719                 99.98                                 935 M          984 M
smtp     1,005,569         25M         1,005,400           169                 99.98                                 60 G           66 G
dns      53,450,492        5M          53,450,434          58                  99.99                                 17 G           9 G
ssh      4,445,375         500K        4,443,373           2,002               99.95                                 2 G            2100 G
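The coverage column is the share of connections that finish below the cutoff and are therefore captured in full. A short sketch of that calculation, using the udp and https rows as input (the function name is made up):

```python
def coverage_pct(total_conns, conns_over_threshold):
    """Percentage of connections captured in full (size below the cutoff)."""
    return 100 * (total_conns - conns_over_threshold) / total_conns

print(f"{coverage_pct(13_149_143, 7_050):.3f}")    # udp row
print(f"{coverage_pct(8_332_603, 125_263):.3f}")   # https row
```

The results agree with the table when truncated to two decimals, and they show why the approach works: the vast majority of connections are smaller than the cutoff, so only a small fraction lose any data.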

Time machine - retrieval

● Indexes may be helpful
● TCPdump as the retrieval interface (BPF)
● Command line ‘find’ in your buckets
● Off to wireshark or whatever
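Retrieval with tcpdump amounts to reading a bucket file and writing out the packets that match a BPF filter. The sketch below only builds the command line; the bucket file name and output name are hypothetical, and the flags used (`-n`, `-r`, `-w`) are standard tcpdump options.

```python
import shlex

def tcpdump_extract_cmd(bucket_file, bpf_filter, out_file):
    """Build a tcpdump command that copies matching packets to out_file."""
    # -n: no name resolution, -r: read from a pcap file, -w: write matches to a pcap
    return ["tcpdump", "-n", "-r", bucket_file, "-w", out_file, bpf_filter]

cmd = tcpdump_extract_cmd("class_http_0001.pcap",   # hypothetical bucket file
                          "host 207.62.80.166 and tcp port 80",
                          "extract.pcap")
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the command as a list keeps the filter as a single argument, so no shell quoting is needed; the resulting extract.pcap can then be opened in Wireshark.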

Time machine - Bro

● Bro connects to Time Machine
● Bro can request data from TM to pass to an analyst or to perform retroactive processing

Time machine - shortcomings

● IPv6 support (LBL branch)
● Indexes don’t persist between restarts (Fix coming?)
● Searching and collating can be a pain
● No searching above layer 4

Time machine - future

● Persistent indexes
● Shunted traffic
● Load-balanced TM?

How to get started

● Download Bro: www.bro.org
● Check out Security Onion: www.securityonion.net
● Time Machine: www.bro.org/community/time-machine.html
● Berkeley Lab 100G technical doc


Discussion / Questions?

Vincent Stoffer - [email protected] [email protected]