Open Security Operations Center - OpenSOC

OpenSOC The Open Security Operations Center for Analyzing 1.2 Million Network Packets per Second in Real Time James Sirota, Big Data Architect Cisco Security Solutions Practice [email protected] Sheetal Dolas Principal Architect Hortonworks [email protected] June 3, 2014

Transcript of Open Security Operations Center - OpenSOC

Page 1: Open Security Operations Center - OpenSOC

OpenSOC: The Open Security Operations Center for Analyzing 1.2 Million Network Packets per Second in Real Time

James Sirota, Big Data Architect Cisco Security Solutions Practice [email protected]

Sheetal Dolas Principal Architect Hortonworks [email protected]

June 3, 2014

Page 2: Open Security Operations Center - OpenSOC

Over Next Few Minutes

§ Problem Statement & Business Case for OpenSOC
§ Solution Architecture and Design
§ Best Practices and Lessons Learned
§ Q & A

Page 3: Open Security Operations Center - OpenSOC

Business Case

Page 4: Open Security Operations Center - OpenSOC

“There's now a growing sense of fatalism: It's no longer if or when you get hacked, but the assumption is that you've already been hacked, with a focus on minimizing the damage.”

Source: Dark Reading, “Security’s New Reality: Assume The Worst”

Page 5: Open Security Operations Center - OpenSOC

Breaches Happen in Hours… But Go Undetected for Months or Even Years

Timespan of events by percent of breaches (Source: 2013 Data Breach Investigations Report)

Event                                     Seconds  Minutes    Hours     Days    Weeks   Months    Years
Initial Attack to Initial Compromise          10%      75%      12%       2%       0%       1%       1%
Initial Compromise to Data Exfiltration        8%      38%      14%      25%       8%       8%       0%
Initial Compromise to Discovery                0%       0%       2%      13%      29%      54%       2%
Discovery to Containment/Restoration           0%       1%       9%      32%      38%      17%       4%

§ In 60% of breaches, data is stolen in hours
§ 54% of breaches are not discovered for months
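The two callouts on this slide can be checked against the percentages it quotes. The sketch below is only an illustration of that arithmetic; the variable and function names are hypothetical, and the numbers are copied from the slide.

```python
# Percent of breaches per time bucket, copied from the slide.
BUCKETS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]
COMPROMISE_TO_EXFILTRATION = [8, 38, 14, 25, 8, 8, 0]
COMPROMISE_TO_DISCOVERY = [0, 0, 2, 13, 29, 54, 2]


def share_within(percentages, bucket):
    """Sum the share of breaches that fall in `bucket` or any faster bucket."""
    end = BUCKETS.index(bucket) + 1
    return sum(percentages[:end])


def share_in(percentages, bucket):
    """Return the share of breaches that fall exactly in `bucket`."""
    return percentages[BUCKETS.index(bucket)]


# Data stolen within hours: 8 + 38 + 14
print(share_within(COMPROMISE_TO_EXFILTRATION, "hours"))  # 60
# Breaches that take months to discover
print(share_in(COMPROMISE_TO_DISCOVERY, "months"))  # 54
```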

Page 6: Open Security Operations Center - OpenSOC

Cisco Global Cloud Index

Source: 2014 Cisco Global Cloud Index

Page 7: Open Security Operations Center - OpenSOC

Introducing OpenSOC: Intersection of Big Data and Security Analytics

Big Data Platform (Hadoop, Elastic Search)
§ Multi Petabyte Storage
§ Interactive Query
§ Real-Time Search
§ Scalable Stream Processing
§ Unstructured Data
§ Data Access Control
§ Scalable Compute

OpenSOC
§ Real-Time Alerts
§ Anomaly Detection
§ Data Correlation
§ Rules and Reports
§ Predictive Modeling
§ UI and Applications

Page 8: Open Security Operations Center - OpenSOC

OpenSOC Journey

§ Sept 2013: First prototype
§ Dec 2013: Hortonworks joins the project
§ March 2014: Platform development finished
§ April 2014: First beta test at customer site
§ May 2014: CR work off
§ Sept 2014: General availability

Page 9: Open Security Operations Center - OpenSOC

Solution Architecture & Design

Page 10: Open Security Operations Center - OpenSOC

OpenSOC Conceptual Architecture

Telemetry sources: Raw Network Stream, Network Metadata Stream, Netflow, Syslog, Raw Application Logs, Other Streaming Telemetry

Processing stages: Parse + Format, Enrich, Alert

Processing inputs: Threat Intelligence Feeds, Enrichment Data

Storage: Hive, HBase (Raw Packet Store, Long-Term Store); Elastic Search (Real-Time Index)

Applications + Analyst Tools: Network Packet Mining and PCAP Reconstruction; Log Mining and Analytics; Big Data Exploration, Predictive Modeling

Page 11: Open Security Operations Center - OpenSOC

Key Functional Capabilities

§ Raw Network Packet Capture, Store, Traffic Reconstruction
§ Telemetry Ingest, Enrichment and Real-Time Rules-Based Alerts
§ Real-Time Telemetry Search and Cross-Telemetry Matching
§ Automated Reports, Anomaly Detection and Anomaly Alerts
§ Rich Analytics Apps and Integration with Existing Analytics Tools

Page 12: Open Security Operations Center - OpenSOC

The OpenSOC Advantage

§ Fully Backed by Cisco and Used Internally for Multiple Customers
§ Free, Open Source and Apache Licensed
§ Built on Highly Scalable and Proven Platforms (Hadoop, Kafka, Storm)
§ Extensible and Pluggable Design
§ Flexible Deployment Model (On-Premise or Cloud)
§ Centralize Your Processes, People and Data

Page 13: Open Security Operations Center - OpenSOC

OpenSOC Deployment at Cisco

Hardware footprint (40U)
§ 14 Data Nodes (UCS C240 M3)
§ 3 Cluster Control Nodes (UCS C220 M3)
§ 2 ESX Hypervisor Hosts (UCS C220 M3)
§ 1 PCAP Processor (UCS C220 M3 + Napatech NIC)
§ 2 Sourcefire Threat Alert Processors
§ 1 Anue Network Traffic Splitter
§ 1 Router
§ 1 48-Port 10GE Switch

Software Stack
§ HDP 2.1
§ Kafka 0.8
§ Elastic Search 1.1
§ MySQL 5.5

Page 14: Open Security Operations Center - OpenSOC

OpenSOC - Stitching Things Together

Source Systems: Passive Tap (Traffic Replicator); Telemetry Sources (Syslog, HTTP, File System, Other)

Data Collection: PCAP; Flume (Agent A, Agent B, Agent N)

Messaging System: Kafka (PCAP Topic, DPI Topic, A Topic, B Topic, N Topic)

Real Time Processing: Storm (PCAP Topology, DPI Topology, A Topology, B Topology, N Topology)

Storage: Hive (Raw Data, ORC); Elastic Search (Index); HBase (PCAP Table)

Access: Analytic Tools (R / Python, Power Pivot, Tableau); Web Services (Search, PCAP Reconstruction)

Page 15: Open Security Operations Center - OpenSOC

OpenSOC - Stitching Things Together (Deeper Look)

Same diagram as the previous slide, with a "Deeper Look" callout added next to the PCAP Topology and DPI Topology.

Page 16: Open Security Operations Center - OpenSOC

PCAP Topology

Real Time Processing (Storm): Kafka Spout, Parser Bolt, HDFS Bolt, HBase Bolt, ES Bolt

Storage: Hive (Raw Data, ORC); HBase (PCAP Table); Elastic Search (Index)

Page 17: Open Security Operations Center - OpenSOC

DPI Topology & Telemetry Enrichment

Real Time Processing (Storm): Kafka Spout, Parser Bolt, GEO Enrich, Whois Enrich, CIF Enrich, HDFS Bolt, ES Bolt

Storage: Hive (Raw Data, ORC); HBase (PCAP Table); Elastic Search (Index)

Page 18: Open Security Operations Center - OpenSOC

Enrichments

Stages: Parser Bolt, GEO Enrich, Who Is Enrich, CIF Enrich

RAW Message

    {
      "msg_key1": "msg value1",
      "src_ip": "10.20.30.40",
      "dest_ip": "20.30.40.50",
      "domain": "mydomain.com"
    }

Enriched Message (fields added by the enrichment stages)

    "geo": [ {
      "region": "CA",
      "postalCode": "95134",
      "areaCode": "408",
      "metroCode": "807",
      "longitude": -121.946,
      "latitude": 37.425,
      "locId": 4522,
      "city": "San Jose",
      "country": "US"
    } ],
    "whois": [ {
      "OrgId": "CISCOS",
      "Parent": "NET-144-0-0-0-0",
      "OrgAbuseName": "Cisco Systems Inc",
      "RegDate": "1991-01-171991-01-17",
      "OrgName": "Cisco Systems",
      "Address": "170 West Tasman Drive",
      "NetType": "Direct Assignment"
    } ],
    "cif": "Yes"

Enrichment data sources
§ GEO Enrich: Cache, MySQL (Geo Lite Data)
§ Who Is Enrich: Cache, HBase (Who Is Data)
§ CIF Enrich: Cache, HBase (CIF Data)
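The enrichment flow on this slide (look up each message in a cached reference store and attach the result as a new field) might be sketched as follows. This is a simplified illustration, not the OpenSOC bolt code: the dictionaries stand in for the MySQL and HBase stores, `functools.lru_cache` stands in for the cache boxes, and the choice of which message field each lookup keys on, as well as the "No" value for a CIF miss, are assumptions.

```python
from functools import lru_cache

# Hypothetical in-memory stand-ins for the backing stores named on the slide.
GEO_DATA = {"10.20.30.40": {"city": "San Jose", "region": "CA", "country": "US"}}
WHOIS_DATA = {"mydomain.com": {"OrgName": "Cisco Systems", "NetType": "Direct Assignment"}}
CIF_DATA = {"mydomain.com"}


@lru_cache(maxsize=10_000)
def geo_lookup(ip):
    """Cached geo lookup; assumed to key on the source IP."""
    return GEO_DATA.get(ip)


@lru_cache(maxsize=10_000)
def whois_lookup(domain):
    """Cached whois lookup; assumed to key on the domain."""
    return WHOIS_DATA.get(domain)


@lru_cache(maxsize=10_000)
def cif_lookup(indicator):
    """Cached threat-intelligence membership check (assumed key: domain)."""
    return indicator in CIF_DATA


def enrich(message):
    """Return a copy of the parsed message with geo, whois and cif fields added."""
    enriched = dict(message)
    geo = geo_lookup(message.get("src_ip"))
    if geo is not None:
        enriched["geo"] = [dict(geo)]
    whois = whois_lookup(message.get("domain"))
    if whois is not None:
        enriched["whois"] = [dict(whois)]
    enriched["cif"] = "Yes" if cif_lookup(message.get("domain")) else "No"
    return enriched


raw = {
    "msg_key1": "msg value1",
    "src_ip": "10.20.30.40",
    "dest_ip": "20.30.40.50",
    "domain": "mydomain.com",
}
print(enrich(raw))
```

Caching in front of each store matters here because the same IPs and domains recur across many messages, so most lookups can be answered without a round trip to MySQL or HBase.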

Page 19: Open Security Operations Center - OpenSOC

Applications: Telemetry Matching and DPI

Step 1: Search
Step 2: Match
Step 3: Analyze
Step 4: Build PCAP

Page 20: Open Security Operations Center - OpenSOC

Integration with Analytics Tools

Dashboards Reports

Page 21: Open Security Operations Center - OpenSOC

Best Practices and Lessons Learned

Page 22: Open Security Operations Center - OpenSOC

Journey Towards Highly Scalable Application

Page 23: Open Security Operations Center - OpenSOC

Kafka Tuning

Page 24: Open Security Operations Center - OpenSOC

This is where we began

Page 25: Open Security Operations Center - OpenSOC

Some code optimizations and increased parallelism

Page 26: Open Security Operations Center - OpenSOC

Kafka Tuning

§ Kafka is disk I/O heavy
§ Kafka 0.8+ supports replication and JBOD
   § Better performance compared to RAID
§ Parallelism is largely driven by the number of disks and the number of partitions per topic
§ Key configuration parameters:
   § num.io.threads: keep it at least equal to the number of disks provided to Kafka
   § num.network.threads: adjust it based on the number of concurrent producers, consumers and the replication factor
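As a rough illustration of the num.io.threads guidance, the sketch below derives broker settings from a JBOD disk layout. The helper names and mount paths are hypothetical, and the floor of 8 is an assumption based on Kafka's documented default for num.io.threads. num.network.threads is deliberately left out because the slide gives no formula for it.

```python
def suggest_broker_settings(log_dirs, io_threads_floor=8):
    """Suggest broker settings for a JBOD layout with one log directory per disk.

    Follows the rule of thumb from the slide: num.io.threads should be at
    least the number of disks given to Kafka. The floor is an assumption.
    """
    if not log_dirs:
        raise ValueError("at least one log directory is required")
    return {
        "log.dirs": ",".join(log_dirs),
        "num.io.threads": max(io_threads_floor, len(log_dirs)),
    }


def to_properties(settings):
    """Render the settings as lines for a server.properties file."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


# Hypothetical layout: 12 data disks, each mounted separately (JBOD).
disks = [f"/data/{i}/kafka" for i in range(12)]
print(to_properties(suggest_broker_settings(disks)))
```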

Page 27: Open Security Operations Center - OpenSOC

After Kafka Tuning

Page 28: Open Security Operations Center - OpenSOC

Bottleneck Isolation, Resource Profiling, Load Balancing

Page 29: Open Security Operations Center - OpenSOC

HBase Tuning

Page 30: Open Security Operations Center - OpenSOC

This is where we began

Page 31: Open Security Operations Center - OpenSOC

Row Key Design

§ Row key design is critical (gets, scans, or both?)
§ Keys with IP addresses in dotted-decimal form
   § Most IP addresses start with 1 or 2, so the first key character has very little variation
   § Key length varies: minimum 7 characters, maximum 15, with a typical average of 12
   § Subnet range scans become difficult: keys sort as strings, so a scan from 90 to 220 excludes 112
§ IP converted to hex (10.20.30.40 => 0a141e28)
   § Gives 16 variations of the first key character
   § Consistently an 8-character key
   § Easy to search for subnet ranges
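The hex conversion on this slide can be sketched with Python's standard `ipaddress` module. The function names are hypothetical, and the sketch assumes the row key is the hex-encoded IPv4 address alone, with no prefix or suffix.

```python
import ipaddress


def ip_to_hex_key(ip):
    """Encode a dotted-decimal IPv4 address as a fixed-width, 8-character hex key."""
    return format(int(ipaddress.IPv4Address(ip)), "08x")


def subnet_scan_range(cidr):
    """Return the first and last hex keys that belong to a subnet."""
    net = ipaddress.IPv4Network(cidr)
    return (
        format(int(net.network_address), "08x"),
        format(int(net.broadcast_address), "08x"),
    )


print(ip_to_hex_key("10.20.30.40"))       # 0a141e28
print(subnet_scan_range("10.20.0.0/16"))  # ('0a140000', '0a14ffff')

# Dotted-decimal strings sort lexicographically, not numerically...
ips = ["90.0.0.1", "112.0.0.1", "220.0.0.1"]
print(sorted(ips))  # ['112.0.0.1', '220.0.0.1', '90.0.0.1']
# ...while the hex keys sort in numeric order.
print(sorted(ip_to_hex_key(ip) for ip in ips))  # ['5a000001', '70000001', 'dc000001']
```

Because every address in a subnet falls between the two returned keys, a subnet lookup becomes a single contiguous range scan.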

Page 32: Open Security Operations Center - OpenSOC

Experiments with Row Key

Page 33: Open Security Operations Center - OpenSOC

Region Splits

§ Know your data
§ Auto split under high workload can result in hotspots and split storms
§ Understand your data and presplit the regions
§ Identify how many regions a region server (RS) can have to perform optimally. Use the formula below:

   (RS memory) * (total memstore fraction) / ((memstore size) * (# column families))
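A worked instance of the formula, together with one simple way to generate presplit points for 8-character hex row keys, might look like the sketch below. The heap size, memstore fraction and memstore size are illustrative values rather than figures from the talk, and evenly spaced split points assume keys are roughly uniformly distributed, which real traffic may not be.

```python
def max_regions_per_region_server(rs_heap_bytes, memstore_fraction,
                                  memstore_size_bytes, column_families):
    """(RS memory) * (total memstore fraction) / ((memstore size) * (# column families))."""
    return int(rs_heap_bytes * memstore_fraction
               / (memstore_size_bytes * column_families))


def hex_split_points(num_regions, key_chars=8):
    """Evenly spaced split keys over the hex key space (assumes uniform keys)."""
    space = 16 ** key_chars
    return [format(space * i // num_regions, f"0{key_chars}x")
            for i in range(1, num_regions)]


GIB = 2 ** 30
MIB = 2 ** 20

# Illustrative values: 16 GiB heap, 40% for memstores, 128 MiB memstore, 1 column family.
print(max_regions_per_region_server(16 * GIB, 0.4, 128 * MIB, 1))  # 51
print(hex_split_points(4))  # ['40000000', '80000000', 'c0000000']
```

Doubling the number of column families halves the result, which is one reason to keep the column family count low on write-heavy tables.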

Page 34: Open Security Operations Center - OpenSOC

With Region Pre-Splits

Page 35: Open Security Operations Center - OpenSOC

Know Your Application

§ Enable micro batching (client-side buffer)
§ Smart shuffle/grouping in Storm
§ Understand your data and situationally exploit the various WAL options
§ Watch for many minor compactions
§ For write-heavy workloads, increase hbase.hstore.blockingStoreFiles (we used 200)
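The micro batching idea, buffering writes on the client and sending them in groups, can be illustrated independently of any HBase client library. In this sketch `BufferedWriter` and `sink` are hypothetical names; `sink` stands in for whatever call actually ships a batch to the store.

```python
class BufferedWriter:
    """Collect rows on the client and hand them to `sink` in batches."""

    def __init__(self, sink, batch_size=1000):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._sink = sink
        self._batch_size = batch_size
        self._buffer = []

    def put(self, row):
        """Buffer one row and flush automatically when the batch is full."""
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send any buffered rows as one batch."""
        if self._buffer:
            self._sink(list(self._buffer))
            self._buffer.clear()


batches = []
writer = BufferedWriter(batches.append, batch_size=3)
for i in range(7):
    writer.put({"row": i})
writer.flush()  # ship the final partial batch
print([len(b) for b in batches])  # [3, 3, 1]
```

The trade-off is durability: rows still in the buffer are lost if the process dies before a flush, so the batch size should be chosen with the application's tolerance for loss or replay in mind.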

Page 36: Open Security Operations Center - OpenSOC

And Finally

Page 37: Open Security Operations Center - OpenSOC

Kafka Spout

Page 38: Open Security Operations Center - OpenSOC

Kafka Spout

§ Parallelism is controlled by the number of partitions per topic
§ Set Kafka spout parallelism equal to the number of partitions in the topic
§ Other key parameters that drive performance:
   § fetchSizeBytes
   § bufferSizeBytes
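Why spout parallelism should match the partition count can be seen with a simplified round-robin model. This is not the storm-kafka assignment code, only an illustration under the assumption that each partition is read by exactly one spout task; the function name is hypothetical.

```python
def assign_partitions(num_partitions, num_spout_tasks):
    """Round-robin model: each partition is read by exactly one spout task."""
    assignment = {task: [] for task in range(num_spout_tasks)}
    for partition in range(num_partitions):
        assignment[partition % num_spout_tasks].append(partition)
    return assignment


def idle_tasks(assignment):
    """Tasks that were given no partition and therefore read nothing."""
    return [task for task, parts in assignment.items() if not parts]


# Fewer tasks than partitions: uneven load per task.
print(assign_partitions(8, 3))  # {0: [0, 3, 6], 1: [1, 4, 7], 2: [2, 5]}
# More tasks than partitions: the extra tasks sit idle.
print(len(idle_tasks(assign_partitions(8, 12))))  # 4
# Equal counts: one partition per task, no idle tasks.
print(len(idle_tasks(assign_partitions(8, 8))))  # 0
```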

Page 39: Open Security Operations Center - OpenSOC

Mysteriously Missing Data

Page 40: Open Security Operations Center - OpenSOC

Mysteriously Missing Data: Root Cause

§ A bug in the Kafka spout caused it to miss some partitions and lose data
§ It is now fixed, and the fix is available from the Hortonworks repository (http://repo.hortonworks.com/content/repositories/releases/org/apache/storm/storm-Kafka)

Page 41: Open Security Operations Center - OpenSOC

Storm

Page 42: Open Security Operations Center - OpenSOC

Storm

§ Every small thing counts at scale
§ Even simple string operations can slow down throughput when executed on millions of tuples

Page 43: Open Security Operations Center - OpenSOC

Storm

§ Error handling is critical
§ Poorly handled errors can lead to topology failure and eventually loss of data (or data duplication)

Page 44: Open Security Operations Center - OpenSOC

Storm

§ Tune and scale individual spouts and bolts before performance testing or tuning the entire topology
§ Write your own simple data generator spouts and no-op bolts
§ Making as many things configurable as possible helps a lot
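The idea of isolating one stage between a synthetic source and a no-op sink can be illustrated outside Storm. The sketch below is plain Python rather than Storm API code, and all names in it are hypothetical; it measures how fast a single processing function handles generated tuples so that a slow stage can be found before the whole pipeline is assembled.

```python
import time


def generate_tuples(count):
    """Synthetic source: yields simple tuples with predictable content."""
    for i in range(count):
        yield {"id": i, "src_ip": f"10.0.{(i >> 8) & 255}.{i & 255}"}


def noop_bolt(tup):
    """No-op sink: accepts a tuple and does nothing with it."""
    return None


def measure_throughput(source, bolt):
    """Run every tuple from `source` through `bolt`; return (count, tuples/sec)."""
    start = time.perf_counter()
    processed = 0
    for tup in source:
        bolt(tup)
        processed += 1
    elapsed = time.perf_counter() - start
    rate = processed / elapsed if elapsed > 0 else float("inf")
    return processed, rate


count, rate = measure_throughput(generate_tuples(100_000), noop_bolt)
print(f"{count} tuples, {rate:,.0f} tuples/sec with a no-op bolt")
```

Swapping `noop_bolt` for a real processing function and comparing the two rates gives a rough per-stage cost, which is the baseline the slide recommends establishing before tuning the full topology.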

Page 45: Open Security Operations Center - OpenSOC

Lessons Learned

§ When it comes to Hadoop… partner up
§ Separate the hype from the opportunity
§ Start small, then scale up
§ Design iteratively
§ It doesn’t work unless you have proven it at scale
§ Keep an eye on ROI

Page 46: Open Security Operations Center - OpenSOC

Looking for Community Partners
Cisco + Hortonworks + Community Support for OpenSOC

How can you contribute?
§ Technology Partner Program: contribute developers to join the Cisco and Hortonworks team

Page 47: Open Security Operations Center - OpenSOC

Thank you!

We are hiring: [email protected] [email protected]