Ory Segal, Tsvika Klein Akamai Technologies


Transcript of Ory Segal, Tsvika Klein Akamai Technologies

Page 1: Ory Segal,  Tsvika  Klein Akamai Technologies

Big Data Intelligence

Harnessing Petabytes of WAF Statistics to Analyze & Improve Web Protection in the Cloud

Ory Segal, Tsvika Klein
Akamai Technologies

Page 2: Ory Segal,  Tsvika  Klein Akamai Technologies

About Us

• Ory Segal – Principal Product Architect, Cloud Security

• Tsvika Klein – Product Manager, Cloud Security

Hosted by OWASP & the NYC Chapter

Page 3: Ory Segal,  Tsvika  Klein Akamai Technologies

Topics to Cover


Akamai & OWASP ModSecurity CRS Relationship

Security Big Data @ Akamai

Measuring WAF Accuracy @ Akamai

CRS through the Big Data Prism (Lessons Learned)

Page 4: Ory Segal,  Tsvika  Klein Akamai Technologies

But we only have 45 minutes…

And too much data to cover…

Page 5: Ory Segal,  Tsvika  Klein Akamai Technologies

Akamai & OWASP CRS

This is not an Akamai marketing presentation

Akamai has been offering its cloud-based WAF since 2009. Kona Site Defender:

– OWASP CRS (Akamai Kona Rules)
– DDoS Protection
– DNS Protection
– Bot Detection
– Site Shield / Site Cloaking

OWASP CRS was ported to Akamai MD, and does not run directly on ModSecurity

Page 6: Ory Segal,  Tsvika  Klein Akamai Technologies

SECURITY BIG DATA @ AKAMAI

Page 7: Ory Segal,  Tsvika  Klein Akamai Technologies

Akamai’s cloud platform enables secure, high-performing user experiences on any device, anywhere

Highlights:

• 100 million page views per second and 500 billion hits per day
• 734 million IP addresses seen quarterly
• 260+ terabytes of compressed daily logs
• 30% of all internet traffic

• 120,000+ servers
• 2,000+ locations
• 82 countries
• 1,100+ networks
• 750+ cities

Akamai Intelligent Platform

Page 8: Ory Segal,  Tsvika  Klein Akamai Technologies

CSI Platform Statistics

10 Terabytes of daily attack data

2 Petabytes of security data stored

45 days retention

140K concurrent connections (incoming data)

600K log lines / sec. indexed by 30 dimensions

8000 queries daily scanning terabytes of data

Page 9: Ory Segal,  Tsvika  Klein Akamai Technologies

CSI High Level Architecture

[Architecture diagram. Labeled components: Akamai Edge Servers, Log Agent, Hadoop, HBase, Yoda, Yoda Adapter, BE Applications, FE Applications]

Page 10: Ory Segal,  Tsvika  Klein Akamai Technologies

Yoda (Distributed Query Engine)

Interactive

Multiple data streams

Intuitive query language

High cardinality aggregation

Page 11: Ory Segal,  Tsvika  Klein Akamai Technologies

Security Big Data Challenge #1

Page 12: Ory Segal,  Tsvika  Klein Akamai Technologies

Security Big Data Challenge #2

Page 13: Ory Segal,  Tsvika  Klein Akamai Technologies

Sample Data App - SARA

Interactive Tool to Analyze Security Events

Page 14: Ory Segal,  Tsvika  Klein Akamai Technologies

BACK TO WAF & OWASP CRS…

Page 15: Ory Segal,  Tsvika  Klein Akamai Technologies

WAF Accuracy Lingo

• Imagine a WAF that protects against 100% of all possible attack vectors

…by blocking 100% of all HTTP requests

• Accurate WAF testing requires you to measure:
– How many real attacks got blocked (TP)
– How many valid requests were allowed through (TN)
– How much valid traffic was inappropriately blocked (FP)
– How many attacks were allowed through (FN)

Let’s talk about measuring Precision, Recall, Accuracy, MCC…

Page 16: Ory Segal,  Tsvika  Klein Akamai Technologies

Things You Need to Know

• Precision (P): % of blocked requests that were actual attacks

• Recall (R): % of attacks that were actually blocked

• Accuracy (A): % of decisions that were good decisions

• MCC*: Correlation between WAF decisions and actual nature of requests

* MCC: http://en.wikipedia.org/wiki/Matthews_correlation_coefficient
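The formulas on this slide did not survive extraction. As a reminder, the standard definitions in terms of the TP, TN, FP and FN counts are given below. Treating MCC as 0 when its denominator is zero is a common convention and is assumed here, not something the slide states.

```latex
\begin{align*}
P &= \frac{TP}{TP + FP} \\
R &= \frac{TP}{TP + FN} \\
A &= \frac{TP + TN}{TP + TN + FP + FN} \\
\mathrm{MCC} &= \frac{TP \cdot TN - FP \cdot FN}
                     {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\end{align*}
```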

Page 17: Ory Segal,  Tsvika  Klein Akamai Technologies

Let’s Look at Some Examples

A WAF’s accuracy needs to be measured both by its ability to block attacks and by its ability to allow good traffic through…

WAF Type      Requests  Valid  Attacks  Blocked  TP  TN   FP   FN  P     R    A      MCC
Real          1000      990    10       11       8   987  3    2   0.73  0.8  0.995  0.76
Off           1000      990    10       0        0   990  0    10  N/A   0    0.99   0
Always Block  1000      990    10       1000     10  0    990  0   0.01  1    0.01   0
Noisy         1000      990    10       31       8   967  23   2   0.26  0.8  0.975  0.45
Conservative  1000      990    10       2        2   990  0    8   1.00  0.2  0.992  0.45
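The figures in the examples above can be recomputed from the four counts alone. The following is a minimal sketch, assuming the standard definitions of precision, recall, accuracy and MCC; the function name `waf_metrics` and the choice to return `None` for the "N/A" precision and 0 for an undefined MCC are illustrative assumptions, not part of the original slides.

```python
import math

def waf_metrics(tp, tn, fp, fn):
    """Return (precision, recall, accuracy, mcc) for a set of WAF decisions.

    precision is None when nothing was blocked (the "N/A" case);
    mcc falls back to 0.0 when its denominator is zero (assumed convention).
    """
    blocked = tp + fp          # everything the WAF blocked
    attacks = tp + fn          # everything that really was an attack
    total = tp + tn + fp + fn  # all requests
    precision = tp / blocked if blocked else None
    recall = tp / attacks if attacks else None
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, accuracy, mcc

# (TP, TN, FP, FN) per WAF type, copied from the examples above
WAFS = {
    "Real": (8, 987, 3, 2),
    "Off": (0, 990, 0, 10),
    "Always Block": (10, 0, 990, 0),
    "Noisy": (8, 967, 23, 2),
    "Conservative": (2, 990, 0, 8),
}

for name, counts in WAFS.items():
    p, r, a, mcc = waf_metrics(*counts)
    p_text = "N/A" if p is None else f"{p:.2f}"
    print(f"{name:<13} P={p_text} R={r:.1f} A={a:.3f} MCC={mcc:.2f}")
```

Note how accuracy alone rates the "Off" WAF at 0.99, while MCC rates both "Off" and "Always Block" at 0.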

Page 18: Ory Segal,  Tsvika  Klein Akamai Technologies

Introducing:

Akamai WAF Testing Framework

Page 19: Ory Segal,  Tsvika  Klein Akamai Technologies

Akamai WAF Testing (AWT) Framework

• Ability to send both valid & attack traffic

• Easily create or add new test cases:
– 3 methods: Text files, Burp Extender, Wireshark .pcaps

• Easily import test cases from Akamai’s Big Data platform

• Configurable and can work with any WAF

• Easily define success / fail criteria

• Intuitive XML & HTML reports

• Easy debugging of FP/FN w/ Anomaly Scoring (rule comb.)
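The slides do not show how AWT is implemented. As a generic illustration of what "success / fail criteria" means for a WAF test, the sketch below labels each test case by whether it was an attack and whether the WAF blocked it, then tallies the four outcomes. The names `classify` and `tally` are hypothetical, and how "blocked" is detected (for example, a specific HTTP status code) is left to the caller.

```python
from collections import Counter

def classify(is_attack, blocked):
    """Map one test case to TP / FN / FP / TN."""
    if is_attack:
        return "TP" if blocked else "FN"
    return "FP" if blocked else "TN"

def tally(results):
    """Count outcomes over an iterable of (is_attack, blocked) pairs."""
    counts = Counter({"TP": 0, "TN": 0, "FP": 0, "FN": 0})
    counts.update(classify(is_attack, blocked) for is_attack, blocked in results)
    return dict(counts)

# Hypothetical run: 3 valid requests (one wrongly blocked), 2 attacks (one missed)
run = [(False, False), (False, False), (False, True), (True, True), (True, False)]
print(tally(run))
```

The resulting counts are exactly the inputs needed for the Precision, Recall, Accuracy and MCC measurements discussed earlier.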

Page 20: Ory Segal,  Tsvika  Klein Akamai Technologies

AWT Built-In Test Cases

To accurately assess a WAF, we collected test cases from the following sources:

Web interaction recordings of Alexa Top 100 internet sites – Commerce, Health, Consumer Electronics, Reference, Finance, …

Ported common False Positive cases from Akamai customers (Big Data)

Recorded commercial web application scanner traffic

Attacks from Akamai CSI big data platform

Havij & SQLMap attacks

Exploits from the internet (fuzzers, exploit-db, …)*

Tens of Thousands of HTTP Requests, divided 95% - 5%

Ported “valid” test cases from other tools*

Page 21: Ory Segal,  Tsvika  Klein Akamai Technologies

AWT Reports – High Level Statistics

Page 22: Ory Segal,  Tsvika  Klein Akamai Technologies

AWT Reports – Protection Statistics

Page 23: Ory Segal,  Tsvika  Klein Akamai Technologies

AWT Reports – False Positives Analysis

Page 24: Ory Segal,  Tsvika  Klein Akamai Technologies

OWASP CRS – LESSONS LEARNED

Page 25: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #1 – Risk Groups

• CRS 2.2.x uses a single anomaly score
– Visibility (granularity) issues – What really happened?

• Separate anomaly score “accounting” into smaller risk groups (attack types)
– Clear understanding of which attack took place

• Challenge:
– Requires rule mapping to risk groups
– Some rules contribute to more than 1 risk group
– Requires putting more thought into anomaly scoring – it’s not just one pile of rules/scores

XSS = 35, SQLi = 10, RFI = 0, LFI = 0, …
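Per-group score accounting can be sketched in a few lines. This is an illustration only: the rule identifiers and the `RULE_GROUPS` mapping are hypothetical placeholders, not real CRS rule IDs or Akamai's actual mapping. It shows a rule contributing to more than one risk group, as mentioned in the challenges.

```python
from collections import defaultdict

# Hypothetical mapping of rule IDs to risk groups; a rule may feed several groups.
RULE_GROUPS = {
    "rule-a": ("XSS",),
    "rule-b": ("XSS", "SQLi"),
    "rule-c": ("SQLi",),
}

def score_by_group(triggered):
    """Sum anomaly scores per risk group for (rule_id, score) pairs."""
    totals = defaultdict(int)
    for rule_id, score in triggered:
        # Unmapped rules contribute to no group in this sketch.
        for group in RULE_GROUPS.get(rule_id, ()):
            totals[group] += score
    return dict(totals)

print(score_by_group([("rule-a", 5), ("rule-b", 5), ("rule-c", 3)]))
```

Instead of one combined number, the result keeps a separate total per attack type, which is what makes it possible to say which attack took place.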

Page 26: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #2 – Multiple Thresholds

XSS Attack: <script>alert('xss')</script> => Score 30

CMDi Attack: ; /bin/sh cat /etc/passwd => Score 5

Valid XML: <book> Hello World </book> => Score 10

Different risks require different anomaly thresholds

[Figure: threshold levels 25 and 5; labels <xss>, <xml>]
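The three requests above show why a single threshold is a poor fit. In the sketch below, the per-group threshold values (25 for XSS, 5 for CMDi) and the decision to file the valid XML score under the XSS group are assumptions made for illustration; the slide gives only the scores and the two threshold values.

```python
# Hypothetical per-group thresholds, using the two values shown on the slide.
THRESHOLDS = {"XSS": 25, "CMDi": 5}

# Slide examples as (label, is_attack, per-group scores). Filing the valid XML
# score under the XSS group is an assumption made for this sketch.
REQUESTS = [
    ("XSS attack", True, {"XSS": 30}),
    ("CMDi attack", True, {"CMDi": 5}),
    ("Valid XML", False, {"XSS": 10}),
]

def blocked_single(scores, threshold):
    """Single anomaly score: block when the sum over all groups reaches the threshold."""
    return sum(scores.values()) >= threshold

def blocked_per_group(scores, thresholds):
    """Risk groups: block when any group reaches its own threshold."""
    return any(
        score >= thresholds.get(group, float("inf"))
        for group, score in scores.items()
    )

for label, is_attack, scores in REQUESTS:
    print(
        f"{label:<12} attack={is_attack} "
        f"single@25={blocked_single(scores, 25)} "
        f"single@5={blocked_single(scores, 5)} "
        f"per-group={blocked_per_group(scores, THRESHOLDS)}"
    )
```

Under these assumptions, a single threshold of 25 misses the CMDi attack, a single threshold of 5 blocks the valid XML, and the per-group thresholds handle all three requests correctly.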

Page 27: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #2

[Chart: anomaly threshold (TH) per risk group. X-axis: XSS, SQLi, CMDi, HTTP, RFI. Y-axis: 0 to 30]

Page 28: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #3 – HTTP Violations

“BLOCK HTTP PROTOCOL VIOLATIONS ?!???

THAT’S LIKE 1.21 PETABYTES OF LOGS PER DAY!!!!!”

Page 29: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #3 – HTTP Violations

• HTTP RFC Enforcement?! Good Luck!
– APIs, REST services, RSS feeds, Good Bots – most don’t adhere to HTTP RFC
– Prior to system tuning:
  • Missing Accept Header (960015): 14%
  • Missing User-Agent Header (960009): 3%

• Can’t trust HTTP violation rules on their own
– “Invalid HTTP” risk group with its own threshold
  • Blocks only seriously-damaged HTTP requests
– Build more focused tool fingerprints

• See next slide for an explanation on 960015

Page 30: Ory Segal,  Tsvika  Klein Akamai Technologies

960015 – Research into 3 hours of triggers

Which URLs trigger this rule?

85% Static Media Files

Perhaps a Unique User-Agent?

95.1K “Unique” UAs

Anything in Common?

“Android” String found in 50%

Can You Give Me Something Else?

Common: Android (50%), AppleWebKit (19%), News (21%), App (20%)
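The kind of question asked here ("what do these User-Agents have in common?") amounts to counting how many values contain each candidate token. A minimal sketch follows; the function name `token_share` and the sample User-Agent strings are made up for illustration and are not taken from the analysed traffic.

```python
from collections import Counter

def token_share(user_agents, tokens):
    """Return the fraction of user agents containing each token (case-insensitive)."""
    hits = Counter()
    for ua in user_agents:
        lowered = ua.lower()
        for token in tokens:
            if token.lower() in lowered:
                hits[token] += 1
    total = len(user_agents)
    return {token: hits[token] / total for token in tokens} if total else {}

# Made-up sample values, not taken from the analysed traffic
sample = [
    "ExampleNewsApp/1.0 (Android 4.1)",
    "Mozilla/5.0 (Linux; Android 4.1) AppleWebKit/534.30",
    "SomeFeedReader/2.3",
    "ExampleApp/3.2",
]
print(token_share(sample, ["Android", "AppleWebKit", "News", "App"]))
```

Because one User-Agent can contain several tokens, and a short token such as "App" also matches inside longer ones, the shares need not sum to 100%.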

Page 31: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #4: Cookies

YEAR: 2003

SESSID = 12f0a0193b4d93e9s92a39af;

Quite easy to spot a SQLi or XSS payload in a cookie

Page 32: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #4: Cookies

YEAR: 2013

C1state = 24~1~-1~-1~E~6~6~6~10~10~0~0~|~37A1B34A~2EBA820B~0AEBA380~130959B9~0327C30B~7617CC73~21B797A5~C6392AF5~5FE036DB~|~8A173E13~7F5D33BF~30DFEF65~|~~|~0~1~2~3~4~5|3~4~6~7~8||0~1~2|4~4~6||~|~0~0~0~0~0~0~|~0~0~0~0~0~|~~|~~|~~|~~|;

C2state = PC#1382573257902-104085.19_06#1384742638|cat#true#1383533098|session#1383533019933-203317#1383534898;

C3data = {"v":1,"rid":"1371546489873_699561","to":5,"c":"http://www.some.site/page.aspx?a=5","pv":2,"lc":{"d0":{"v":2,"s":true}},"cd":0,"sd":0,"f":1371546904751} ;

Cinfo = 1403D3394_232#scroll on "//<![CDATA[(function() { var f5_cspm = { pass_params: '1102912_0394939_19210_24253..."

Page 33: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #5: Score Spreading Across Selectors

In many FP scenarios, score spreads across “selectors”

c1 = 1384044727071|ABCD:2::|AC:1::|PSD:0:AKFJ~MOBILE^CLAK_KOL:1385149290276 [950901 - 5]

c2 = bn:Samsung|mn:GT-I9300 Galaxy S III|tb:false|mb:true|dos:Android|dosv:4.1|bos:KJSKKL|bosv:9 [981172 - 3]

c3 = PC#1383939352901-916004.20_14#1386636727|check#true#1384044787|session#1384044726390-399957#1384046587 [981231 - 3]

c4 = ”” [981318 - 2, 981242 - 5]

(Total Score: 18)

Consider an FP-reduction heuristic that reduces the total score when it is spread across selectors? There are security implications…
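The slide only raises the idea of such a heuristic and does not describe one. Purely as an illustration of what it could look like, the sketch below keeps the highest-scoring selector in full and damps the contribution of the others. The function name `spread_adjusted_score` and the `damping` parameter are hypothetical, and this is not presented as Akamai's approach.

```python
def spread_adjusted_score(selector_scores, damping=0.5):
    """Keep the highest-scoring selector in full and damp the rest.

    selector_scores maps a selector name to the sum of rule scores it triggered.
    damping is a hypothetical tuning knob: 1.0 reproduces the plain total,
    0.0 counts only the single highest-scoring selector.
    """
    if not selector_scores:
        return 0.0
    scores = sorted(selector_scores.values(), reverse=True)
    return scores[0] + damping * sum(scores[1:])

# Per-selector scores from the example above (c4 triggered two rules: 2 + 5)
example = {"c1": 5, "c2": 3, "c3": 3, "c4": 7}
print(sum(example.values()), spread_adjusted_score(example))
```

The security implication is visible in the same formula: an attacker who splits a payload across several parameters would have the combined score damped as well, so any such heuristic trades false positives against evasion risk.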

Page 34: Ory Segal,  Tsvika  Klein Akamai Technologies

CRS Issue #6: Rule Inefficiency

During our big data analysis & AWT usage, we noticed a few troubling rule issues:

– Many rules have redundancies in expressions
  • This tends to push the anomaly score up in many scenarios (“reinforcing a FP”)
  • Forces pushing the threshold much higher than really needed

– Some rules combine weak & strong signatures
  • FP-prone rules generate high score – reducing their “weight” hurts the accurate signatures in them

– Some rules seemed almost useless – e.g. 981172

Page 35: Ory Segal,  Tsvika  Klein Akamai Technologies

Summary

• Big Data:
– OWASP / ModSecurity should consider collecting anonymized trigger information
– CRS would greatly benefit from a much larger sample set

• CRS Future:
– Akamai has already contributed to the CRS project, and will continue to contribute back to the community
– We highly recommend adopting some of the major changes made @ Akamai – mainly the “risk groups” model & multiple thresholds

• WAF Testing:
– Now that the WAF industry has matured, it is time for WAF deployments to be measured for accuracy using the tools & methods mentioned here – Precision, Recall and MCC

Page 36: Ory Segal,  Tsvika  Klein Akamai Technologies

THANK YOU